OCR exploration server

Warning: this site is under development!
Warning: this site was generated automatically from raw corpora.
The information has therefore not been validated.

Adaptive Combination of Commercial OCR Systems

Internal identifier: 001633 (Main/Merge); previous: 001632; next: 001634

Authors: Elke Wilczok [Germany]; Wolfgang Lellmann [Germany]

Source:

RBID : ISTEX:1A79B6E54E225211DCFBB052FB02326853C9C04F

Abstract

Combining multiple classifiers to achieve improved recognition results has become a popular technique in recent years. For OCR systems, most investigations focus on fusion strategies at the character level. This paper describes a flexible framework for the combination of result strings, which are the common output of commercial OCR systems. By synchronizing strings according to geometrical criteria, incorrect character segmentations can be avoided, while character recognition is improved by classical combination rules like Borda Count or Plurality Vote. To reduce computing time, further expert calls are stopped as soon as the quality of a temporary combination result exceeds a given threshold. The system allows easy integration of arbitrary new OCR systems and simplifies the determination of optimal system parameters by analyzing the input data at hand. Quantitative results are shown for a two-recognizer system, while the framework allows an arbitrary number of experts.
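
The combination rules and the early stopping named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes the result strings are already synchronized (the geometrical synchronization is not reproduced), that each expert returns one ranked candidate list per character position, and that quality is measured by a toy agreement score. The names `combine_adaptively`, `agreement`, and the three toy experts are hypothetical.

```python
from collections import Counter, defaultdict


def plurality_vote(rankings):
    """Pick the candidate that the most experts rank first."""
    return Counter(r[0] for r in rankings).most_common(1)[0][0]


def borda_count(rankings):
    """Pick the candidate with the highest Borda-style score.

    In a ranking of length n the first place earns n points, the last 1.
    """
    scores = defaultdict(int)
    for ranking in rankings:
        for position, candidate in enumerate(ranking):
            scores[candidate] += len(ranking) - position
    return max(scores, key=scores.get)


def agreement(outputs, combined):
    """Toy quality measure (an assumption, not taken from the paper):
    share of top choices that match the combined string.
    A single expert cannot confirm itself, so it scores 0."""
    if len(outputs) < 2:
        return 0.0
    matches = sum(ranking[0] == char
                  for output in outputs
                  for ranking, char in zip(output, combined))
    return matches / sum(len(output) for output in outputs)


def combine_adaptively(experts, line_image, rule, quality, threshold):
    """Call experts one by one; stop once the temporary result is good enough."""
    outputs = []   # one entry per expert called so far
    combined = ""
    for expert in experts:
        outputs.append(expert(line_image))
        # zip(*outputs) groups the rankings of all called experts per position.
        combined = "".join(rule(list(per_position))
                           for per_position in zip(*outputs))
        if quality(outputs, combined) >= threshold:
            break
    return combined, len(outputs)


# Hypothetical experts returning fixed rankings for a three-character string.
def expert_a(_image):
    return [["O", "0"], ["C", "G"], ["R", "K"]]


def expert_b(_image):
    return [["0", "O"], ["C", "G"], ["R", "K"]]


def expert_c(_image):
    return [["O", "Q"], ["C", "G"], ["R", "B"]]


if __name__ == "__main__":
    experts = [expert_a, expert_b, expert_c]
    print(combine_adaptively(experts, None, plurality_vote, agreement, 0.8))
```

With a threshold of 0.8 the third expert is never called in this toy setup, because two experts already agree on five of six top choices. A stricter threshold forces further calls, which is the trade-off between computing time and confidence that the abstract describes.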

Url:
DOI: 10.1007/978-3-540-24642-8_8


Links to Exploration step

ISTEX:1A79B6E54E225211DCFBB052FB02326853C9C04F

The document in XML format

<record>
<TEI wicri:istexFullTextTei="biblStruct">
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en">Adaptive Combination of Commercial OCR Systems</title>
<author>
<name sortKey="Wilczok, Elke" sort="Wilczok, Elke" uniqKey="Wilczok E" first="Elke" last="Wilczok">Elke Wilczok</name>
</author>
<author>
<name sortKey="Lellmann, Wolfgang" sort="Lellmann, Wolfgang" uniqKey="Lellmann W" first="Wolfgang" last="Lellmann">Wolfgang Lellmann</name>
</author>
</titleStmt>
<publicationStmt>
<idno type="wicri:source">ISTEX</idno>
<idno type="RBID">ISTEX:1A79B6E54E225211DCFBB052FB02326853C9C04F</idno>
<date when="2004" year="2004">2004</date>
<idno type="doi">10.1007/978-3-540-24642-8_8</idno>
<idno type="url">https://api.istex.fr/document/1A79B6E54E225211DCFBB052FB02326853C9C04F/fulltext/pdf</idno>
<idno type="wicri:Area/Istex/Corpus">000048</idno>
<idno type="wicri:Area/Istex/Curation">000047</idno>
<idno type="wicri:Area/Istex/Checkpoint">000E17</idno>
<idno type="wicri:doubleKey">0302-9743:2004:Wilczok E:adaptive:combination:of</idno>
<idno type="wicri:Area/Main/Merge">001633</idno>
</publicationStmt>
<sourceDesc>
<biblStruct>
<analytic>
<title level="a" type="main" xml:lang="en">Adaptive Combination of Commercial OCR Systems</title>
<author>
<name sortKey="Wilczok, Elke" sort="Wilczok, Elke" uniqKey="Wilczok E" first="Elke" last="Wilczok">Elke Wilczok</name>
<affiliation wicri:level="1">
<country xml:lang="fr">Allemagne</country>
<wicri:regionArea>Océ Document Technologies GmbH, Constance</wicri:regionArea>
<wicri:noRegion>Constance</wicri:noRegion>
<wicri:noRegion>Constance</wicri:noRegion>
</affiliation>
<affiliation>
<wicri:noCountry code="no comma">E-mail: Elke.Wilczok@odt-oce.com</wicri:noCountry>
</affiliation>
</author>
<author>
<name sortKey="Lellmann, Wolfgang" sort="Lellmann, Wolfgang" uniqKey="Lellmann W" first="Wolfgang" last="Lellmann">Wolfgang Lellmann</name>
<affiliation wicri:level="1">
<country xml:lang="fr">Allemagne</country>
<wicri:regionArea>Océ Document Technologies GmbH, Constance</wicri:regionArea>
<wicri:noRegion>Constance</wicri:noRegion>
<wicri:noRegion>Constance</wicri:noRegion>
</affiliation>
<affiliation>
<wicri:noCountry code="no comma">E-mail: Wolfgang.Lellmann@odt-oce.com</wicri:noCountry>
</affiliation>
</author>
</analytic>
<monogr></monogr>
<series>
<title level="s">Lecture Notes in Computer Science</title>
<imprint>
<date>2004</date>
</imprint>
<idno type="ISSN">0302-9743</idno>
<idno type="eISSN">1611-3349</idno>
<idno type="ISSN">0302-9743</idno>
</series>
<idno type="istex">1A79B6E54E225211DCFBB052FB02326853C9C04F</idno>
<idno type="DOI">10.1007/978-3-540-24642-8_8</idno>
<idno type="ChapterID">8</idno>
<idno type="ChapterID">Chap8</idno>
</biblStruct>
</sourceDesc>
<seriesStmt>
<idno type="ISSN">0302-9743</idno>
</seriesStmt>
</fileDesc>
<profileDesc>
<textClass></textClass>
<langUsage>
<language ident="en">en</language>
</langUsage>
</profileDesc>
</teiHeader>
<front>
<div type="abstract" xml:lang="en">Abstract: Combining multiple classifiers to achieve improved recognition results has become a popular technique in recent years. As for OCR systems, most investigations focus on fusion strategies on the character level. This paper describes a flexible framework for the combination of result strings which are the common output of commercial OCR systems. By synchronizing strings according to geometrical criteria, incorrect character segmentations can be avoided, while character recognition is improved by classical combination rules like Borda Count or Plurality Vote. To reduce computing time, further expert calls are stopped as soon as the quality of a temporary combination result exceeds a given threshold. The system allows easy integration of arbitrary new OCR systems and simplifies the determination of optimal system parameters by analyzing the input data at hand. Quantitative results are shown for a two-recognizer system, while the framework allows an arbitrary number of experts.</div>
</front>
</TEI>
</record>
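
As an alternative to the Dilib tools, the record above could be read with Python's standard library. The following is a minimal sketch under assumptions: the record as displayed does not declare the `wicri:` prefix, so the sketch binds it to a placeholder namespace URI before parsing. `WICRI_NS` and `parse_record` are hypothetical names, and the URI is not an official one.

```python
import xml.etree.ElementTree as ET

# Placeholder URI (an assumption): the displayed record uses the wicri:
# prefix without declaring it, and an XML parser rejects unbound prefixes.
WICRI_NS = "urn:example:wicri"


def parse_record(xml_text):
    """Extract title, authors, and DOI from a <record> element."""
    # Bind the wicri: prefix on the root element so the text parses.
    patched = xml_text.replace(
        "<record>", f'<record xmlns:wicri="{WICRI_NS}">', 1)
    root = ET.fromstring(patched)
    return {
        "title": root.findtext(".//titleStmt/title"),
        "authors": [name.text
                    for name in root.findall(".//titleStmt/author/name")],
        "doi": root.findtext(".//publicationStmt/idno[@type='doi']"),
    }
```

Calling `parse_record` on the full text of the record returns a dictionary with the keys `title`, `authors`, and `doi`.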

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Ticri/CIDE/explor/OcrV1/Data/Main/Merge
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 001633 | SxmlIndent | more

Or

HfdSelect -h $EXPLOR_AREA/Data/Main/Merge/biblio.hfd -nk 001633 | SxmlIndent | more

To link to this page within the Wicri network

{{Explor lien
   |wiki=    Ticri/CIDE
   |area=    OcrV1
   |flux=    Main
   |étape=   Merge
   |type=    RBID
   |clé=     ISTEX:1A79B6E54E225211DCFBB052FB02326853C9C04F
   |texte=   Adaptive Combination of Commercial OCR Systems
}}

Wicri

This area was generated with Dilib version V0.6.32.
Data generation: Sat Nov 11 16:53:45 2017. Site generation: Mon Mar 11 23:15:16 2024