Optical character recognition : An illustrated guide to the frontier

Internal identifier: 000742 (PascalFrancis/Corpus); previous: 000741; next: 000743

Authors: G. Nagy; T. A. Nartker; S. V. Rice

Source:

SPIE proceedings series [ISSN 1017-2653]; 2000.

RBID : Pascal:01-0029148

French descriptors

Pascal (Inist)
- Reconnaissance optique caractère, Evaluation système, Erreur, Typologie, Amélioration, Difficulté tâche, Cause, ISRI (Information Science Research Institute).

English descriptors

KwdEn :
- Cause, Error, Improvement, Optical character recognition, System evaluation, Task difficulty, Typology.

Abstract

We offer a perspective on the performance of current OCR systems by illustrating and explaining actual OCR errors made by three commercial devices. After discussing briefly the character recognition abilities of humans and computers, we present illustrated examples of recognition errors. The top level of our taxonomy of the causes of errors consists of Imaging Defects, Similar Symbols, Punctuation, and Typography. The analysis of a series of "snippets" from this perspective provides insight into the strengths and weaknesses of current systems, and perhaps a road map to future progress. The examples were drawn from the large-scale tests conducted by the authors at the Information Science Research Institute of the University of Nevada, Las Vegas. By way of conclusion, we point to possible approaches for improving the accuracy of today's systems. The talk is based on our eponymous monograph, recently published in The Kluwer International Series in Engineering and Computer Science, Kluwer Academic Publishers, 1999.

Record in standard format (ISO 2709)

See the documentation on the Inist Standard format.

A01	`01`	`1`		`@0 1017-2653`
A05				`@2 3967`
A08	`01`	`1`	`ENG`	`@1 Optical character recognition : An illustrated guide to the frontier`
A09	`01`	`1`	`ENG`	`@1 Document recognition and retrieval VII : San Jose CA, 26-27 January 2000`
A11	`01`	`1`		`@1 NAGY (G.)`
A11	`02`	`1`		`@1 NARTKER (T. A.)`
A11	`03`	`1`		`@1 RICE (S. V.)`
A12	`01`	`1`		`@1 LOPRESTI (Daniel P.) @9 ed.`
A12	`02`	`1`		`@1 JIANGYING ZHOU @9 ed.`
A14	`01`			`@1 Dept. of Electrical, Computer, and Systems Engineering, Rensselaer Polytechnic Institute @2 Troy, NY 12180 @3 USA @Z 1 aut.`
A14	`02`			`@1 Dept. of Computer Science, University of Nevada @2 Las Vegas, NV 89154 @3 USA @Z 2 aut.`
A14	`03`			`@1 Comparisonics Corporation @2 Grass Valley, CA 95945 @3 USA @Z 3 aut.`
A18	`01`	`1`		`@1 International Society for Optical Engineering @2 Bellingham WA @3 USA @9 patr.`
A20				`@1 58-69`
A21				`@1 2000`
A23	`01`			`@0 ENG`
A26	`01`			`@0 0-8194-3585-6`
A43	`01`			`@1 INIST @2 21760 @5 354000090075690070`
A44				`@0 0000 @1 © 2001 INIST-CNRS. All rights reserved.`
A45				`@0 1 ref.`
A47	`01`	`1`		`@0 01-0029148`
A60				`@1 P @2 C`
A61				`@0 A`
A64	`01`	`1`		`@0 SPIE proceedings series`
A66	`01`			`@0 USA`
C01	`01`		`ENG`	@0 We offer a perspective on the performance of current OCR systems by illustrating and explaining actual OCR errors made by three commercial devices. After discussing briefly the character recognition abilities of humans and computers, we present illustrated examples of recognition errors. The top level of our taxonomy of the causes of errors consists of Imaging Defects, Similar Symbols, Punctuation, and Typography. The analysis of a series of "snippets" from this perspective provides insight into the strengths and weaknesses of current systems, and perhaps a road map to future progress. The examples were drawn from the large-scale tests conducted by the authors at the Information Science Research Institute of the University of Nevada, Las Vegas. By way of conclusion, we point to possible approaches for improving the accuracy of today's systems. The talk is based on our eponymous monograph, recently published in The Kluwer International Series in Engineering and Computer Science, Kluwer Academic Publishers, 1999.
C02	`01`	`X`		`@0 001A01G02B`
C02	`02`	`X`		`@0 205`
C03	`01`	`X`	`FRE`	`@0 Reconnaissance optique caractère @5 01`
C03	`01`	`X`	`ENG`	`@0 Optical character recognition @5 01`
C03	`01`	`X`	`SPA`	`@0 Reconocimento óptico de caracteres @5 01`
C03	`02`	`X`	`FRE`	`@0 Evaluation système @5 02`
C03	`02`	`X`	`ENG`	`@0 System evaluation @5 02`
C03	`02`	`X`	`SPA`	`@0 Evaluación sistema @5 02`
C03	`03`	`X`	`FRE`	`@0 Erreur @5 03`
C03	`03`	`X`	`ENG`	`@0 Error @5 03`
C03	`03`	`X`	`SPA`	`@0 Error @5 03`
C03	`04`	`X`	`FRE`	`@0 Typologie @5 04`
C03	`04`	`X`	`ENG`	`@0 Typology @5 04`
C03	`04`	`X`	`SPA`	`@0 Tipología @5 04`
C03	`05`	`X`	`FRE`	`@0 Amélioration @5 05`
C03	`05`	`X`	`ENG`	`@0 Improvement @5 05`
C03	`05`	`X`	`SPA`	`@0 Mejoría @5 05`
C03	`06`	`X`	`FRE`	`@0 Difficulté tâche @5 06`
C03	`06`	`X`	`ENG`	`@0 Task difficulty @5 06`
C03	`06`	`X`	`SPA`	`@0 Dificultad tarea @5 06`
C03	`07`	`X`	`FRE`	`@0 Cause @5 07`
C03	`07`	`X`	`ENG`	`@0 Cause @5 07`
C03	`07`	`X`	`SPA`	`@0 Causa @5 07`
C03	`08`	`X`	`FRE`	`@0 ISRI (Information Science Research Institute) @2 NJ @4 INC @5 27`
N21				`@1 015`

A30	`01`	`1`	`ENG`	`@1 Document recognition and retrieval. Conference @2 7 @3 San Jose CA USA @4 2000-01-26`
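
The field lines above follow a regular layout: a field tag, optional indicator and language columns, then a payload whose subfields are introduced by `@` codes. A minimal Python sketch of how one such line could be split is shown here. It assumes tab-separated columns with backtick-wrapped values, as in this dump; the function name `parse_field` is hypothetical and is not part of Dilib or of any Inist tool.

```python
import re

def parse_field(line):
    """Split one tab-separated field line into tag, indicators and subfields.

    Assumption: columns are separated by tabs and values may be wrapped in
    backticks, as in the listing above. Repeated subfield codes on one line
    would overwrite each other in this simple sketch.
    """
    cols = [c.strip().strip("`") for c in line.split("\t")]
    tag, payload = cols[0], cols[-1]
    # Middle columns hold indicators and the language code; drop empty ones.
    indicators = [c for c in cols[1:-1] if c]
    # With one capture group, re.split yields ['', code, value, code, value, ...].
    parts = re.split(r"@([0-9A-Za-z])\s+", payload)
    subfields = {code: value.strip()
                 for code, value in zip(parts[1::2], parts[2::2])}
    return tag, indicators, subfields

line = ("A14\t`01`\t\t\t`@1 Comparisonics Corporation "
        "@2 Grass Valley, CA 95945 @3 USA @Z 3 aut.`")
tag, indicators, subfields = parse_field(line)
print(tag, indicators, subfields)
```

A payload that itself contains an `@` followed by a letter or digit and a space would be split incorrectly, so this is only suitable for quick inspection of records like this one.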

Inist format (server)

NO :	PASCAL 01-0029148 INIST
ET :	Optical character recognition : An illustrated guide to the frontier
AU :	NAGY (G.); NARTKER (T. A.); RICE (S. V.); LOPRESTI (Daniel P.); JIANGYING ZHOU
AF :	Dept. of Electrical, Computer, and Systems Engineering, Rensselaer Polytechnic Institute/Troy, NY 12180/Etats-Unis (1 aut.); Dept. of Computer Science, University of Nevada/Las Vegas, NV 89154/Etats-Unis (2 aut.); Comparisonics Corporation/Grass Valley, CA 95945/Etats-Unis (3 aut.)
DT :	Publication en série; Congrès; Niveau analytique
SO :	SPIE proceedings series; ISSN 1017-2653; Etats-Unis; Da. 2000; Vol. 3967; Pp. 58-69; Bibl. 1 ref.
LA :	Anglais
EA :	We offer a perspective on the performance of current OCR systems by illustrating and explaining actual OCR errors made by three commercial devices. After discussing briefly the character recognition abilities of humans and computers, we present illustrated examples of recognition errors. The top level of our taxonomy of the causes of errors consists of Imaging Defects, Similar Symbols, Punctuation, and Typography. The analysis of a series of "snippets" from this perspective provides insight into the strengths and weaknesses of current systems, and perhaps a road map to future progress. The examples were drawn from the large-scale tests conducted by the authors at the Information Science Research Institute of the University of Nevada, Las Vegas. By way of conclusion, we point to possible approaches for improving the accuracy of today's systems. The talk is based on our eponymous monograph, recently published in The Kluwer International Series in Engineering and Computer Science, Kluwer Academic Publishers, 1999.
CC :	001A01G02B; 205
FD :	Reconnaissance optique caractère; Evaluation système; Erreur; Typologie; Amélioration; Difficulté tâche; Cause; ISRI (Information Science Research Institute)
ED :	Optical character recognition; System evaluation; Error; Typology; Improvement; Task difficulty; Cause
SD :	Reconocimento óptico de caracteres; Evaluación sistema; Error; Tipología; Mejoría; Dificultad tarea; Causa
LO :	INIST-21760.354000090075690070
ID :	01-0029148
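
The server format above is a flat list of two-letter tags, each followed by a colon and a value, with multi-valued fields separated by semicolons. The sketch below shows one way such a listing might be read into a dictionary in Python. The helper names `parse_server_record` and `split_list` are hypothetical, and the sample text is a shortened copy of three lines from this record.

```python
def parse_server_record(text):
    """Map each tag (e.g. 'ET', 'AU') to its raw value string.

    Assumption: one field per line, with the tag ending at the first colon.
    Splitting only on the first colon keeps titles that contain ' : ' intact.
    """
    record = {}
    for raw in text.splitlines():
        tag, sep, value = raw.partition(":")
        if not sep:
            continue  # skip lines without a tag separator
        record[tag.strip()] = value.strip()
    return record

def split_list(value):
    """Split a semicolon-separated field into trimmed items."""
    return [item.strip() for item in value.split(";") if item.strip()]

sample = (
    "NO :\tPASCAL 01-0029148 INIST\n"
    "ET :\tOptical character recognition : An illustrated guide to the frontier\n"
    "ED :\tOptical character recognition; System evaluation; Error\n"
)
record = parse_server_record(sample)
descriptors = split_list(record["ED"])
print(record["ET"])
print(descriptors)
```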

Links to Exploration step

Pascal:01-0029148

The document in XML format

<record><TEI><teiHeader><fileDesc><titleStmt><title xml:lang="en" level="a">Optical character recognition : An illustrated guide to the frontier</title>
<author><name sortKey="Nagy, G" sort="Nagy, G" uniqKey="Nagy G" first="G." last="Nagy">G. Nagy</name>
<affiliation><inist:fA14 i1="01"><s1>Dept. of Electrical, Computer, and Systems Engineering, Rensselaer Polytechnic Institute</s1>
<s2>Troy, NY 12180</s2>
<s3>USA</s3>
<sZ>1 aut.</sZ>
</inist:fA14>
</affiliation>
</author>
<author><name sortKey="Nartker, T A" sort="Nartker, T A" uniqKey="Nartker T" first="T. A." last="Nartker">T. A. Nartker</name>
<affiliation><inist:fA14 i1="02"><s1>Dept. of Computer Science, University of Nevada</s1>
<s2>Las Vegas, NV 89154</s2>
<s3>USA</s3>
<sZ>2 aut.</sZ>
</inist:fA14>
</affiliation>
</author>
<author><name sortKey="Rice, S V" sort="Rice, S V" uniqKey="Rice S" first="S. V." last="Rice">S. V. Rice</name>
<affiliation><inist:fA14 i1="03"><s1>Comparisonics Corporation</s1>
<s2>Grass Valley, CA 95945</s2>
<s3>USA</s3>
<sZ>3 aut.</sZ>
</inist:fA14>
</affiliation>
</author>
</titleStmt>
<publicationStmt><idno type="wicri:source">INIST</idno>
<idno type="inist">01-0029148</idno>
<date when="2000">2000</date>
<idno type="stanalyst">PASCAL 01-0029148 INIST</idno>
<idno type="RBID">Pascal:01-0029148</idno>
<idno type="wicri:Area/PascalFrancis/Corpus">000742</idno>
</publicationStmt>
<sourceDesc><biblStruct><analytic><title xml:lang="en" level="a">Optical character recognition : An illustrated guide to the frontier</title>
<author><name sortKey="Nagy, G" sort="Nagy, G" uniqKey="Nagy G" first="G." last="Nagy">G. Nagy</name>
<affiliation><inist:fA14 i1="01"><s1>Dept. of Electrical, Computer, and Systems Engineering, Rensselaer Polytechnic Institute</s1>
<s2>Troy, NY 12180</s2>
<s3>USA</s3>
<sZ>1 aut.</sZ>
</inist:fA14>
</affiliation>
</author>
<author><name sortKey="Nartker, T A" sort="Nartker, T A" uniqKey="Nartker T" first="T. A." last="Nartker">T. A. Nartker</name>
<affiliation><inist:fA14 i1="02"><s1>Dept. of Computer Science, University of Nevada</s1>
<s2>Las Vegas, NV 89154</s2>
<s3>USA</s3>
<sZ>2 aut.</sZ>
</inist:fA14>
</affiliation>
</author>
<author><name sortKey="Rice, S V" sort="Rice, S V" uniqKey="Rice S" first="S. V." last="Rice">S. V. Rice</name>
<affiliation><inist:fA14 i1="03"><s1>Comparisonics Corporation</s1>
<s2>Grass Valley, CA 95945</s2>
<s3>USA</s3>
<sZ>3 aut.</sZ>
</inist:fA14>
</affiliation>
</author>
</analytic>
<series><title level="j" type="main">SPIE proceedings series</title>
<idno type="ISSN">1017-2653</idno>
<imprint><date when="2000">2000</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
<seriesStmt><title level="j" type="main">SPIE proceedings series</title>
<idno type="ISSN">1017-2653</idno>
</seriesStmt>
</fileDesc>
<profileDesc><textClass><keywords scheme="KwdEn" xml:lang="en"><term>Cause</term>
<term>Error</term>
<term>Improvement</term>
<term>Optical character recognition</term>
<term>System evaluation</term>
<term>Task difficulty</term>
<term>Typology</term>
</keywords>
<keywords scheme="Pascal" xml:lang="fr"><term>Reconnaissance optique caractère</term>
<term>Evaluation système</term>
<term>Erreur</term>
<term>Typologie</term>
<term>Amélioration</term>
<term>Difficulté tâche</term>
<term>Cause</term>
<term>ISRI (Information Science Research Institute)</term>
</keywords>
</textClass>
</profileDesc>
</teiHeader>
<front><div type="abstract" xml:lang="en">We offer a perspective on the performance of current OCR systems by illustrating and explaining actual OCR errors made by three commercial devices. After discussing briefly the character recognition abilities of humans and computers, we present illustrated examples of recognition errors. The top level of our taxonomy of the causes of errors consists of Imaging Defects, Similar Symbols, Punctuation, and Typography. The analysis of a series of "snippets" from this perspective provides insight into the strengths and weaknesses of current systems, and perhaps a road map to future progress. The examples were drawn from the large-scale tests conducted by the authors at the Information Science Research Institute of the University of Nevada, Las Vegas. By way of conclusion, we point to possible approaches for improving the accuracy of today's systems. The talk is based on our eponymous monograph, recently published in The Kluwer International Series in Engineering and Computer Science, Kluwer Academic Publishers, 1999.</div>
</front>
</TEI>
<inist><standard h6="B"><pA><fA01 i1="01" i2="1"><s0>1017-2653</s0>
</fA01>
<fA05><s2>3967</s2>
</fA05>
<fA08 i1="01" i2="1" l="ENG"><s1>Optical character recognition : An illustrated guide to the frontier</s1>
</fA08>
<fA09 i1="01" i2="1" l="ENG"><s1>Document recognition and retrieval VII : San Jose CA, 26-27 January 2000</s1>
</fA09>
<fA11 i1="01" i2="1"><s1>NAGY (G.)</s1>
</fA11>
<fA11 i1="02" i2="1"><s1>NARTKER (T. A.)</s1>
</fA11>
<fA11 i1="03" i2="1"><s1>RICE (S. V.)</s1>
</fA11>
<fA12 i1="01" i2="1"><s1>LOPRESTI (Daniel P.)</s1>
<s9>ed.</s9>
</fA12>
<fA12 i1="02" i2="1"><s1>JIANGYING ZHOU</s1>
<s9>ed.</s9>
</fA12>
<fA14 i1="01"><s1>Dept. of Electrical, Computer, and Systems Engineering, Rensselaer Polytechnic Institute</s1>
<s2>Troy, NY 12180</s2>
<s3>USA</s3>
<sZ>1 aut.</sZ>
</fA14>
<fA14 i1="02"><s1>Dept. of Computer Science, University of Nevada</s1>
<s2>Las Vegas, NV 89154</s2>
<s3>USA</s3>
<sZ>2 aut.</sZ>
</fA14>
<fA14 i1="03"><s1>Comparisonics Corporation</s1>
<s2>Grass Valley, CA 95945</s2>
<s3>USA</s3>
<sZ>3 aut.</sZ>
</fA14>
<fA18 i1="01" i2="1"><s1>International Society for Optical Engineering</s1>
<s2>Bellingham WA</s2>
<s3>USA</s3>
<s9>patr.</s9>
</fA18>
<fA20><s1>58-69</s1>
</fA20>
<fA21><s1>2000</s1>
</fA21>
<fA23 i1="01"><s0>ENG</s0>
</fA23>
<fA26 i1="01"><s0>0-8194-3585-6</s0>
</fA26>
<fA43 i1="01"><s1>INIST</s1>
<s2>21760</s2>
<s5>354000090075690070</s5>
</fA43>
<fA44><s0>0000</s0>
<s1>© 2001 INIST-CNRS. All rights reserved.</s1>
</fA44>
<fA45><s0>1 ref.</s0>
</fA45>
<fA47 i1="01" i2="1"><s0>01-0029148</s0>
</fA47>
<fA60><s1>P</s1>
<s2>C</s2>
</fA60>
<fA61><s0>A</s0>
</fA61>
<fA64 i1="01" i2="1"><s0>SPIE proceedings series</s0>
</fA64>
<fA66 i1="01"><s0>USA</s0>
</fA66>
<fC01 i1="01" l="ENG"><s0>We offer a perspective on the performance of current OCR systems by illustrating and explaining actual OCR errors made by three commercial devices. After discussing briefly the character recognition abilities of humans and computers, we present illustrated examples of recognition errors. The top level of our taxonomy of the causes of errors consists of Imaging Defects, Similar Symbols, Punctuation, and Typography. The analysis of a series of "snippets" from this perspective provides insight into the strengths and weaknesses of current systems, and perhaps a road map to future progress. The examples were drawn from the large-scale tests conducted by the authors at the Information Science Research Institute of the University of Nevada, Las Vegas. By way of conclusion, we point to possible approaches for improving the accuracy of today's systems. The talk is based on our eponymous monograph, recently published in The Kluwer International Series in Engineering and Computer Science, Kluwer Academic Publishers, 1999.</s0>
</fC01>
<fC02 i1="01" i2="X"><s0>001A01G02B</s0>
</fC02>
<fC02 i1="02" i2="X"><s0>205</s0>
</fC02>
<fC03 i1="01" i2="X" l="FRE"><s0>Reconnaissance optique caractère</s0>
<s5>01</s5>
</fC03>
<fC03 i1="01" i2="X" l="ENG"><s0>Optical character recognition</s0>
<s5>01</s5>
</fC03>
<fC03 i1="01" i2="X" l="SPA"><s0>Reconocimento óptico de caracteres</s0>
<s5>01</s5>
</fC03>
<fC03 i1="02" i2="X" l="FRE"><s0>Evaluation système</s0>
<s5>02</s5>
</fC03>
<fC03 i1="02" i2="X" l="ENG"><s0>System evaluation</s0>
<s5>02</s5>
</fC03>
<fC03 i1="02" i2="X" l="SPA"><s0>Evaluación sistema</s0>
<s5>02</s5>
</fC03>
<fC03 i1="03" i2="X" l="FRE"><s0>Erreur</s0>
<s5>03</s5>
</fC03>
<fC03 i1="03" i2="X" l="ENG"><s0>Error</s0>
<s5>03</s5>
</fC03>
<fC03 i1="03" i2="X" l="SPA"><s0>Error</s0>
<s5>03</s5>
</fC03>
<fC03 i1="04" i2="X" l="FRE"><s0>Typologie</s0>
<s5>04</s5>
</fC03>
<fC03 i1="04" i2="X" l="ENG"><s0>Typology</s0>
<s5>04</s5>
</fC03>
<fC03 i1="04" i2="X" l="SPA"><s0>Tipología</s0>
<s5>04</s5>
</fC03>
<fC03 i1="05" i2="X" l="FRE"><s0>Amélioration</s0>
<s5>05</s5>
</fC03>
<fC03 i1="05" i2="X" l="ENG"><s0>Improvement</s0>
<s5>05</s5>
</fC03>
<fC03 i1="05" i2="X" l="SPA"><s0>Mejoría</s0>
<s5>05</s5>
</fC03>
<fC03 i1="06" i2="X" l="FRE"><s0>Difficulté tâche</s0>
<s5>06</s5>
</fC03>
<fC03 i1="06" i2="X" l="ENG"><s0>Task difficulty</s0>
<s5>06</s5>
</fC03>
<fC03 i1="06" i2="X" l="SPA"><s0>Dificultad tarea</s0>
<s5>06</s5>
</fC03>
<fC03 i1="07" i2="X" l="FRE"><s0>Cause</s0>
<s5>07</s5>
</fC03>
<fC03 i1="07" i2="X" l="ENG"><s0>Cause</s0>
<s5>07</s5>
</fC03>
<fC03 i1="07" i2="X" l="SPA"><s0>Causa</s0>
<s5>07</s5>
</fC03>
<fC03 i1="08" i2="X" l="FRE"><s0>ISRI (Information Science Research Institute)</s0>
<s2>NJ</s2>
<s4>INC</s4>
<s5>27</s5>
</fC03>
<fN21><s1>015</s1>
</fN21>
</pA>
<pR><fA30 i1="01" i2="1" l="ENG"><s1>Document recognition and retrieval. Conference</s1>
<s2>7</s2>
<s3>San Jose CA USA</s3>
<s4>2000-01-26</s4>
</fA30>
</pR>
</standard>
<server><NO>PASCAL 01-0029148 INIST</NO>
<ET>Optical character recognition : An illustrated guide to the frontier</ET>
<AU>NAGY (G.); NARTKER (T. A.); RICE (S. V.); LOPRESTI (Daniel P.); JIANGYING ZHOU</AU>
<AF>Dept. of Electrical, Computer, and Systems Engineering, Rensselaer Polytechnic Institute/Troy, NY 12180/Etats-Unis (1 aut.); Dept. of Computer Science, University of Nevada/Las Vegas, NV 89154/Etats-Unis (2 aut.); Comparisonics Corporation/Grass Valley, CA 95945/Etats-Unis (3 aut.)</AF>
<DT>Publication en série; Congrès; Niveau analytique</DT>
<SO>SPIE proceedings series; ISSN 1017-2653; Etats-Unis; Da. 2000; Vol. 3967; Pp. 58-69; Bibl. 1 ref.</SO>
<LA>Anglais</LA>
<EA>We offer a perspective on the performance of current OCR systems by illustrating and explaining actual OCR errors made by three commercial devices. After discussing briefly the character recognition abilities of humans and computers, we present illustrated examples of recognition errors. The top level of our taxonomy of the causes of errors consists of Imaging Defects, Similar Symbols, Punctuation, and Typography. The analysis of a series of "snippets" from this perspective provides insight into the strengths and weaknesses of current systems, and perhaps a road map to future progress. The examples were drawn from the large-scale tests conducted by the authors at the Information Science Research Institute of the University of Nevada, Las Vegas. By way of conclusion, we point to possible approaches for improving the accuracy of today's systems. The talk is based on our eponymous monograph, recently published in The Kluwer International Series in Engineering and Computer Science, Kluwer Academic Publishers, 1999.</EA>
<CC>001A01G02B; 205</CC>
<FD>Reconnaissance optique caractère; Evaluation système; Erreur; Typologie; Amélioration; Difficulté tâche; Cause; ISRI (Information Science Research Institute)</FD>
<ED>Optical character recognition; System evaluation; Error; Typology; Improvement; Task difficulty; Cause</ED>
<SD>Reconocimento óptico de caracteres; Evaluación sistema; Error; Tipología; Mejoría; Dificultad tarea; Causa</SD>
<LO>INIST-21760.354000090075690070</LO>
<ID>01-0029148</ID>
</server>
</inist>
</record>
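
As a rough illustration of how the XML record above could be read programmatically, the following Python sketch extracts the analytic title, the author names, and the English keywords with the standard library. It rests on two assumptions: the `inist:` prefix appears without a namespace declaration (as in this dump), so it is stripped before parsing, and the element layout matches the fragment shown here. The function names and the shortened `SAMPLE` string are hypothetical.

```python
import re
import xml.etree.ElementTree as ET

SAMPLE = """<record><TEI><teiHeader><fileDesc><sourceDesc><biblStruct><analytic>
<title xml:lang="en" level="a">Optical character recognition : An illustrated guide to the frontier</title>
<author><name first="G." last="Nagy">G. Nagy</name>
<affiliation><inist:fA14 i1="01"><s3>USA</s3></inist:fA14></affiliation></author>
</analytic></biblStruct></sourceDesc></fileDesc>
<profileDesc><textClass><keywords scheme="KwdEn" xml:lang="en">
<term>Cause</term><term>Error</term></keywords></textClass></profileDesc>
</teiHeader></TEI></record>"""

def load_record(xml_text):
    """Parse the record after removing the undeclared 'inist:' prefix.

    A namespace-aware parser rejects an unbound prefix, so the prefix is
    dropped from opening and closing tags first (a workaround, not a fix).
    """
    cleaned = re.sub(r"(</?)inist:", r"\1", xml_text)
    return ET.fromstring(cleaned)

def summarize(record):
    """Return (title, author names, English keywords) from a parsed record."""
    analytic = record.find(".//analytic")
    title = analytic.findtext("title")
    authors = [name.text for name in analytic.findall("author/name")]
    keywords = [term.text
                for term in record.findall(".//keywords[@scheme='KwdEn']/term")]
    return title, authors, keywords

title, authors, keywords = summarize(load_record(SAMPLE))
print(title, authors, keywords)
```

Reading the title from `analytic` rather than from `titleStmt` avoids picking up the same authors twice, since the record repeats them in both places.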

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Ticri/CIDE/explor/OcrV1/Data/PascalFrancis/Corpus

HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000742 | SxmlIndent | more

Or, if EXPLOR_AREA is already defined:

HfdSelect -h $EXPLOR_AREA/Data/PascalFrancis/Corpus/biblio.hfd -nk 000742 | SxmlIndent | more

To link to this page within the Wicri network

{{Explor lien
   |wiki=    Ticri/CIDE
   |area=    OcrV1
   |flux=    PascalFrancis
   |étape=   Corpus
   |type=    RBID
   |clé=     Pascal:01-0029148
   |texte=   Optical character recognition : An illustrated guide to the frontier
}}

This area was generated with Dilib version V0.6.32.
Data generation: Sat Nov 11 16:53:45 2017. Site generation: Mon Mar 11 23:15:16 2024

	OCR exploration server
	Warning: this site is under development. It was generated by automated means from raw corpora, so the information has not been validated.
