Adaptive Combination of Commercial OCR Systems
Internal identifier: 001633 (Main/Merge); previous: 001632; next: 001634
Authors: Elke Wilczok [Germany]; Wolfgang Lellmann [Germany]
Source:
- Lecture Notes in Computer Science [ 0302-9743 ] ; 2004.
Abstract
Combining multiple classifiers to achieve improved recognition results has become a popular technique in recent years. As for OCR systems, most investigations focus on fusion strategies on the character level. This paper describes a flexible framework for the combination of result strings which are the common output of commercial OCR systems. By synchronizing strings according to geometrical criteria, incorrect character segmentations can be avoided, while character recognition is improved by classical combination rules like Borda Count or Plurality Vote. To reduce computing time, further expert calls are stopped as soon as the quality of a temporary combination result exceeds a given threshold. The system allows easy integration of arbitrary new OCR systems and simplifies the determination of optimal system parameters by analyzing the input data at hand. Quantitative results are shown for a two-recognizer system, while the framework allows an arbitrary number of experts.
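The combination rules named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes the result strings have already been synchronized, so that each expert contributes candidates for a single character position. The function names (`plurality_vote`, `borda_count`, `combine_adaptively`) and the use of the winner's vote share as the "quality" measure are hypothetical choices made for illustration only.

```python
from collections import Counter, defaultdict


def plurality_vote(top_choices):
    """Return the label that most experts ranked first."""
    return Counter(top_choices).most_common(1)[0][0]


def borda_count(rankings):
    """Combine best-first rankings: position i in a ranking of
    length n earns n - 1 - i points; the highest total wins."""
    scores = defaultdict(int)
    for ranking in rankings:
        n = len(ranking)
        for i, label in enumerate(ranking):
            scores[label] += n - 1 - i
    return max(scores, key=scores.get)


def combine_adaptively(experts, image, threshold):
    """Call experts one at a time and stop early.

    Assumption (not from the paper): the quality of the temporary
    result is the winner's share of the votes cast so far, and at
    least two experts must be consulted before stopping.
    """
    if not experts:
        raise ValueError("at least one expert is required")
    votes = []
    for expert in experts:
        votes.append(expert(image))
        winner = plurality_vote(votes)
        share = votes.count(winner) / len(votes)
        if len(votes) > 1 and share >= threshold:
            break  # skip the remaining expert calls
    return winner, len(votes)


# Hypothetical experts: plain callables returning one character label.
experts = [lambda img: "A", lambda img: "A", lambda img: "B"]
result = combine_adaptively(experts, image=None, threshold=0.9)
```

In this sketch the third expert is never called, because the first two already agree. A real system would need a quality measure based on the experts' confidence values rather than simple agreement.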
Url:
DOI: 10.1007/978-3-540-24642-8_8
Links toward previous steps (curation, corpus...)
- to stream Istex, to step Corpus: 000048
- to stream Istex, to step Curation: 000047
- to stream Istex, to step Checkpoint: 000E17
Links to Exploration step
ISTEX:1A79B6E54E225211DCFBB052FB02326853C9C04F
The document in XML format
<record><TEI wicri:istexFullTextTei="biblStruct"><teiHeader><fileDesc><titleStmt><title xml:lang="en">Adaptive Combination of Commercial OCR Systems</title>
<author><name sortKey="Wilczok, Elke" sort="Wilczok, Elke" uniqKey="Wilczok E" first="Elke" last="Wilczok">Elke Wilczok</name>
</author>
<author><name sortKey="Lellmann, Wolfgang" sort="Lellmann, Wolfgang" uniqKey="Lellmann W" first="Wolfgang" last="Lellmann">Wolfgang Lellmann</name>
</author>
</titleStmt>
<publicationStmt><idno type="wicri:source">ISTEX</idno>
<idno type="RBID">ISTEX:1A79B6E54E225211DCFBB052FB02326853C9C04F</idno>
<date when="2004" year="2004">2004</date>
<idno type="doi">10.1007/978-3-540-24642-8_8</idno>
<idno type="url">https://api.istex.fr/document/1A79B6E54E225211DCFBB052FB02326853C9C04F/fulltext/pdf</idno>
<idno type="wicri:Area/Istex/Corpus">000048</idno>
<idno type="wicri:Area/Istex/Curation">000047</idno>
<idno type="wicri:Area/Istex/Checkpoint">000E17</idno>
<idno type="wicri:doubleKey">0302-9743:2004:Wilczok E:adaptive:combination:of</idno>
<idno type="wicri:Area/Main/Merge">001633</idno>
</publicationStmt>
<sourceDesc><biblStruct><analytic><title level="a" type="main" xml:lang="en">Adaptive Combination of Commercial OCR Systems</title>
<author><name sortKey="Wilczok, Elke" sort="Wilczok, Elke" uniqKey="Wilczok E" first="Elke" last="Wilczok">Elke Wilczok</name>
<affiliation wicri:level="1"><country xml:lang="fr">Allemagne</country>
<wicri:regionArea>Océ Document Technologies GmbH, Constance</wicri:regionArea>
<wicri:noRegion>Constance</wicri:noRegion>
<wicri:noRegion>Constance</wicri:noRegion>
</affiliation>
<affiliation><wicri:noCountry code="no comma">E-mail: Elke.Wilczok@odt-oce.com</wicri:noCountry>
</affiliation>
</author>
<author><name sortKey="Lellmann, Wolfgang" sort="Lellmann, Wolfgang" uniqKey="Lellmann W" first="Wolfgang" last="Lellmann">Wolfgang Lellmann</name>
<affiliation wicri:level="1"><country xml:lang="fr">Allemagne</country>
<wicri:regionArea>Océ Document Technologies GmbH, Constance</wicri:regionArea>
<wicri:noRegion>Constance</wicri:noRegion>
<wicri:noRegion>Constance</wicri:noRegion>
</affiliation>
<affiliation><wicri:noCountry code="no comma">E-mail: Wolfgang.Lellmann@odt-oce.com</wicri:noCountry>
</affiliation>
</author>
</analytic>
<monogr></monogr>
<series><title level="s">Lecture Notes in Computer Science</title>
<imprint><date>2004</date>
</imprint>
<idno type="ISSN">0302-9743</idno>
<idno type="eISSN">1611-3349</idno>
<idno type="ISSN">0302-9743</idno>
</series>
<idno type="istex">1A79B6E54E225211DCFBB052FB02326853C9C04F</idno>
<idno type="DOI">10.1007/978-3-540-24642-8_8</idno>
<idno type="ChapterID">8</idno>
<idno type="ChapterID">Chap8</idno>
</biblStruct>
</sourceDesc>
<seriesStmt><idno type="ISSN">0302-9743</idno>
</seriesStmt>
</fileDesc>
<profileDesc><textClass></textClass>
<langUsage><language ident="en">en</language>
</langUsage>
</profileDesc>
</teiHeader>
<front><div type="abstract" xml:lang="en">Abstract: Combining multiple classifiers to achieve improved recognition results has become a popular technique in recent years. As for OCR systems, most investigations focus on fusion strategies on the character level. This paper describes a flexible framework for the combination of result strings which are the common output of commercial OCR systems. By synchronizing strings according to geometrical criteria, incorrect character segmentations can be avoided, while character recognition is improved by classical combination rules like Borda Count or Plurality Vote. To reduce computing time, further expert calls are stopped as soon as the quality of a temporary combination result exceeds a given threshold. The system allows easy integration of arbitrary new OCR systems and simplifies the determination of optimal system parameters by analyzing the input data at hand. Quantitative results are shown for a two-recognizer system, while the framework allows an arbitrary number of experts.</div>
</front>
</TEI>
</record>
To manipulate this document under Unix (Dilib)
EXPLOR_STEP=$WICRI_ROOT/Ticri/CIDE/explor/OcrV1/Data/Main/Merge
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 001633 | SxmlIndent | more
Or
HfdSelect -h $EXPLOR_AREA/Data/Main/Merge/biblio.hfd -nk 001633 | SxmlIndent | more
To add a link to this page within the Wicri network
{{Explor lien |wiki= Ticri/CIDE |area= OcrV1 |flux= Main |étape= Merge |type= RBID |clé= ISTEX:1A79B6E54E225211DCFBB052FB02326853C9C04F |texte= Adaptive Combination of Commercial OCR Systems }}
This area was generated with Dilib version V0.6.32.