OCR exploration server

Warning: this site is under development!
Warning: this site was generated automatically from raw corpora.
The information has therefore not been validated.

Software tools and test data for research and testing of page-reading OCR systems

Internal identifier: 000454 (PascalFrancis/Corpus); previous: 000453; next: 000455

Authors: Thomas A. Nartker; Stephen V. Rice; Steven E. Lumos

Source :

RBID : Pascal:05-0361379

French descriptors

English descriptors

Abstract

We announce the availability of the UNLV/ISRI Analytic Tools for OCR Evaluation together with a large and diverse collection of scanned document images with the associated ground-truth text. This combination of tools and test data will allow anyone to conduct a meaningful test comparing the performance of competing page-reading algorithms. The value of this collection of software tools and test data is enhanced by knowledge of the past performance of several systems using exactly these tools and this data. These performance comparisons were published in previous ISRI Test Reports and are also provided. Another value is that the tools can be used to test the character accuracy of any page-reading OCR system for any language included in the Unicode standard. The paper concludes with a summary of the programs, test data, and documentation that is available and gives the URL where they can be located.
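To make the notion of character accuracy concrete, here is a minimal sketch. It is not the UNLV/ISRI tools themselves. It assumes accuracy is defined as (n - e) / n, where n is the number of ground-truth characters and e is the Levenshtein edit distance between the ground truth and the OCR output. The function names `edit_distance` and `character_accuracy` are hypothetical. Python strings are sequences of Unicode code points, so the same code applies to text in any script.

```python
def edit_distance(truth: str, ocr: str) -> int:
    """Levenshtein distance over code points (assumed error measure)."""
    # previous[j] holds the distance between the processed prefix of
    # truth and the first j characters of ocr.
    previous = list(range(len(ocr) + 1))
    for i, t in enumerate(truth, start=1):
        current = [i]
        for j, o in enumerate(ocr, start=1):
            cost = 0 if t == o else 1
            current.append(min(
                previous[j] + 1,         # character missing from ocr
                current[j - 1] + 1,      # extra character in ocr
                previous[j - 1] + cost,  # match or substitution
            ))
        previous = current
    return previous[-1]


def character_accuracy(truth: str, ocr: str) -> float:
    """Return (n - errors) / n, with n the ground-truth length."""
    if not truth:
        raise ValueError("ground truth must not be empty")
    errors = edit_distance(truth, ocr)
    return (len(truth) - errors) / len(truth)
```

For example, `character_accuracy("abcd", "abxd")` gives 0.75: one substitution against four ground-truth characters. Under this definition the value can fall below zero when the OCR output contains many spurious characters.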

Record in standard format (ISO 2709)

See the documentation on the Inist Standard format.

pA  
A01 01  1    @0 1017-2653
A05       @2 5676
A08 01  1  ENG  @1 Software tools and test data for research and testing of page-reading OCR systems
A09 01  1  ENG  @1 Document recognition and retrieval XII : San Jose CA, 19-20 January 2005
A11 01  1    @1 NARTKER (Thomas A.)
A11 02  1    @1 RICE (Stephen V.)
A11 03  1    @1 LUMOS (Steven E.)
A12 01  1    @1 SMITH (Elisa H. Barney) @9 ed.
A12 02  1    @1 TAGHVA (Kazem) @9 ed.
A14 01      @1 Information Science Research Institute (ISRI) University of Nevada, Las Vegas @2 Las Vegas, NV 89154-4021 @3 USA @Z 1 aut. @Z 2 aut. @Z 3 aut.
A18 01  1    @1 International Society for Optical Engineering @2 Bellingham WA @3 USA @9 org-cong.
A20       @1 37-47
A21       @1 2005
A23 01      @0 ENG
A26 01      @0 0-8194-5649-7
A43 01      @1 INIST @2 21760 @5 354000124499720050
A44       @0 0000 @1 © 2005 INIST-CNRS. All rights reserved.
A45       @0 4 ref.
A47 01  1    @0 05-0361379
A60       @1 P @2 C
A61       @0 A
A64 01  1    @0 SPIE proceedings series
A66 01      @0 USA
C01 01    ENG  @0 We announce the availability of the UNLV/ISRI Analytic Tools for OCR Evaluation together with a large and diverse collection of scanned document images with the associated ground-truth text. This combination of tools and test data will allow anyone to conduct a meaningful test comparing the performance of competing page-reading algorithms. The value of this collection of software tools and test data is enhanced by knowledge of the past performance of several systems using exactly these tools and this data. These performance comparisons were published in previous ISRI Test Reports and are also provided. Another value is that the tools can be used to test the character accuracy of any page-reading OCR system for any language included in the Unicode standard. The paper concludes with a summary of the programs, test data, and documentation that is available and gives the URL where they can be located.
C02 01  X    @0 001D04A05C
C03 01  X  FRE  @0 Outil logiciel @5 01
C03 01  X  ENG  @0 Software tool @5 01
C03 01  X  SPA  @0 Herramienta software @5 01
C03 02  X  FRE  @0 Appareil lecture @5 02
C03 02  X  ENG  @0 Reading device @5 02
C03 02  X  SPA  @0 Aparato lectura @5 02
C03 03  X  FRE  @0 Reconnaissance optique caractère @5 03
C03 03  X  ENG  @0 Optical character recognition @5 03
C03 03  X  SPA  @0 Reconocimiento óptico de caracteres @5 03
C03 04  X  FRE  @0 Disponibilité @5 04
C03 04  X  ENG  @0 Availability @5 04
C03 04  X  SPA  @0 Disponibilidad @5 04
C03 05  3  FRE  @0 Traitement image document @5 05
C03 05  3  ENG  @0 Document image processing @5 05
C03 06  X  FRE  @0 Evaluation performance @5 06
C03 06  X  ENG  @0 Performance evaluation @5 06
C03 06  X  SPA  @0 Evaluación prestación @5 06
C03 07  X  FRE  @0 Algorithme @5 07
C03 07  X  ENG  @0 Algorithm @5 07
C03 07  X  SPA  @0 Algoritmo @5 07
C03 08  X  FRE  @0 Collecte donnée @5 08
C03 08  X  ENG  @0 Data gathering @5 08
C03 08  X  SPA  @0 Recolección dato @5 08
C03 09  X  FRE  @0 Précision @5 09
C03 09  X  ENG  @0 Accuracy @5 09
C03 09  X  SPA  @0 Precisión @5 09
C03 10  X  FRE  @0 Vérification programme @5 10
C03 10  X  ENG  @0 Program verification @5 10
C03 10  X  SPA  @0 Verificación programa @5 10
N21       @1 248
N44 01      @1 OTO
N82       @1 OTO
pR  
A30 01  1  ENG  @1 Document recognition and retrieval. Conference @2 12 @3 San Jose CA USA @4 2005-01-19
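The display lines above follow a regular pattern: a field tag, optional indicator and language columns, then subfields introduced by `@` and a one-character code. The following is a minimal sketch of reading them. It assumes that this pattern holds for every line and that no value contains a literal `@`. `parse_field` and the term "qualifiers" are hypothetical names, not taken from the Inist documentation.

```python
import re

# One subfield: "@", a one-character code, whitespace, then the value.
SUBFIELD = re.compile(r"@(\w)\s+([^@]*)")


def parse_field(line: str):
    """Split one display line into (tag, qualifiers, subfields).

    Returns None for lines without subfields, such as the block
    markers "pA" and "pR".
    """
    head, sep, body = line.partition("@")
    tokens = head.split()
    if not sep or not tokens:
        return None
    tag, qualifiers = tokens[0], tokens[1:]
    subfields = [(code, value.strip())
                 for code, value in SUBFIELD.findall("@" + body)]
    return tag, qualifiers, subfields


# Collect the author names (tag A11, subfield 1) from a few lines.
lines = [
    "pA  ",
    "A11 01  1    @1 NARTKER (Thomas A.)",
    "A11 02  1    @1 RICE (Stephen V.)",
    "A20       @1 37-47",
]
authors = []
for line in lines:
    parsed = parse_field(line)
    if parsed and parsed[0] == "A11":
        authors.extend(v for code, v in parsed[2] if code == "1")
```

Subfields are kept as a list of pairs rather than a dictionary because a code can repeat within one field, as `@Z` does in field A14.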

Inist format (server)

NO : PASCAL 05-0361379 INIST
ET : Software tools and test data for research and testing of page-reading OCR systems
AU : NARTKER (Thomas A.); RICE (Stephen V.); LUMOS (Steven E.); SMITH (Elisa H. Barney); TAGHVA (Kazem)
AF : Information Science Research Institute (ISRI) University of Nevada, Las Vegas/Las Vegas, NV 89154-4021/Etats-Unis (1 aut., 2 aut., 3 aut.)
DT : Publication en série; Congrès; Niveau analytique
SO : SPIE proceedings series; ISSN 1017-2653; Etats-Unis; Da. 2005; Vol. 5676; Pp. 37-47; Bibl. 4 ref.
LA : Anglais
EA : We announce the availability of the UNLV/ISRI Analytic Tools for OCR Evaluation together with a large and diverse collection of scanned document images with the associated ground-truth text. This combination of tools and test data will allow anyone to conduct a meaningful test comparing the performance of competing page-reading algorithms. The value of this collection of software tools and test data is enhanced by knowledge of the past performance of several systems using exactly these tools and this data. These performance comparisons were published in previous ISRI Test Reports and are also provided. Another value is that the tools can be used to test the character accuracy of any page-reading OCR system for any language included in the Unicode standard. The paper concludes with a summary of the programs, test data, and documentation that is available and gives the URL where they can be located.
CC : 001D04A05C
FD : Outil logiciel; Appareil lecture; Reconnaissance optique caractère; Disponibilité; Traitement image document; Evaluation performance; Algorithme; Collecte donnée; Précision; Vérification programme
ED : Software tool; Reading device; Optical character recognition; Availability; Document image processing; Performance evaluation; Algorithm; Data gathering; Accuracy; Program verification
SD : Herramienta software; Aparato lectura; Reconocimiento óptico de caracteres; Disponibilidad; Evaluación prestación; Algoritmo; Recolección dato; Precisión; Verificación programa
LO : INIST-21760.354000124499720050
ID : 05-0361379
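The server format above is a flat list of `KEY : value` lines with two-letter keys. A minimal sketch of loading it into a dictionary follows, assuming one field per line and `;` as the separator inside multi-valued fields such as AU, FD, ED and SD. `parse_server_record` and `split_values` are hypothetical helper names.

```python
def parse_server_record(text: str) -> dict[str, str]:
    """Map each two-letter key to its raw value string."""
    record = {}
    for line in text.splitlines():
        # Split on the first " : " only, so colons inside values survive.
        key, sep, value = line.partition(" : ")
        if sep and len(key) == 2 and key.isupper():
            record[key] = value.strip()
    return record


def split_values(value: str) -> list[str]:
    """Split a multi-valued field on semicolons (assumed separator)."""
    return [part.strip() for part in value.split(";") if part.strip()]


sample = """NO : PASCAL 05-0361379 INIST
AU : NARTKER (Thomas A.); RICE (Stephen V.); LUMOS (Steven E.)
ED : Software tool; Reading device; Optical character recognition
ID : 05-0361379"""

record = parse_server_record(sample)
authors = split_values(record["AU"])
```

The SO field also uses semicolons, but there they separate heterogeneous parts (series, ISSN, country, date, volume, pages), so splitting it this way yields labelled fragments rather than a list of like items.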

Links to Exploration step

Pascal:05-0361379

The document in XML format

<record>
<TEI>
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en" level="a">Software tools and test data for research and testing of page-reading OCR systems</title>
<author>
<name sortKey="Nartker, Thomas A" sort="Nartker, Thomas A" uniqKey="Nartker T" first="Thomas A." last="Nartker">Thomas A. Nartker</name>
<affiliation>
<inist:fA14 i1="01">
<s1>Information Science Research Institute (ISRI) University of Nevada, Las Vegas</s1>
<s2>Las Vegas, NV 89154-4021</s2>
<s3>USA</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
</inist:fA14>
</affiliation>
</author>
<author>
<name sortKey="Rice, Stephen V" sort="Rice, Stephen V" uniqKey="Rice S" first="Stephen V." last="Rice">Stephen V. Rice</name>
<affiliation>
<inist:fA14 i1="01">
<s1>Information Science Research Institute (ISRI) University of Nevada, Las Vegas</s1>
<s2>Las Vegas, NV 89154-4021</s2>
<s3>USA</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
</inist:fA14>
</affiliation>
</author>
<author>
<name sortKey="Lumos, Steven E" sort="Lumos, Steven E" uniqKey="Lumos S" first="Steven E." last="Lumos">Steven E. Lumos</name>
<affiliation>
<inist:fA14 i1="01">
<s1>Information Science Research Institute (ISRI) University of Nevada, Las Vegas</s1>
<s2>Las Vegas, NV 89154-4021</s2>
<s3>USA</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
</inist:fA14>
</affiliation>
</author>
</titleStmt>
<publicationStmt>
<idno type="wicri:source">INIST</idno>
<idno type="inist">05-0361379</idno>
<date when="2005">2005</date>
<idno type="stanalyst">PASCAL 05-0361379 INIST</idno>
<idno type="RBID">Pascal:05-0361379</idno>
<idno type="wicri:Area/PascalFrancis/Corpus">000454</idno>
</publicationStmt>
<sourceDesc>
<biblStruct>
<analytic>
<title xml:lang="en" level="a">Software tools and test data for research and testing of page-reading OCR systems</title>
<author>
<name sortKey="Nartker, Thomas A" sort="Nartker, Thomas A" uniqKey="Nartker T" first="Thomas A." last="Nartker">Thomas A. Nartker</name>
<affiliation>
<inist:fA14 i1="01">
<s1>Information Science Research Institute (ISRI) University of Nevada, Las Vegas</s1>
<s2>Las Vegas, NV 89154-4021</s2>
<s3>USA</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
</inist:fA14>
</affiliation>
</author>
<author>
<name sortKey="Rice, Stephen V" sort="Rice, Stephen V" uniqKey="Rice S" first="Stephen V." last="Rice">Stephen V. Rice</name>
<affiliation>
<inist:fA14 i1="01">
<s1>Information Science Research Institute (ISRI) University of Nevada, Las Vegas</s1>
<s2>Las Vegas, NV 89154-4021</s2>
<s3>USA</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
</inist:fA14>
</affiliation>
</author>
<author>
<name sortKey="Lumos, Steven E" sort="Lumos, Steven E" uniqKey="Lumos S" first="Steven E." last="Lumos">Steven E. Lumos</name>
<affiliation>
<inist:fA14 i1="01">
<s1>Information Science Research Institute (ISRI) University of Nevada, Las Vegas</s1>
<s2>Las Vegas, NV 89154-4021</s2>
<s3>USA</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
</inist:fA14>
</affiliation>
</author>
</analytic>
<series>
<title level="j" type="main">SPIE proceedings series</title>
<idno type="ISSN">1017-2653</idno>
<imprint>
<date when="2005">2005</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
<seriesStmt>
<title level="j" type="main">SPIE proceedings series</title>
<idno type="ISSN">1017-2653</idno>
</seriesStmt>
</fileDesc>
<profileDesc>
<textClass>
<keywords scheme="KwdEn" xml:lang="en">
<term>Accuracy</term>
<term>Algorithm</term>
<term>Availability</term>
<term>Data gathering</term>
<term>Document image processing</term>
<term>Optical character recognition</term>
<term>Performance evaluation</term>
<term>Program verification</term>
<term>Reading device</term>
<term>Software tool</term>
</keywords>
<keywords scheme="Pascal" xml:lang="fr">
<term>Outil logiciel</term>
<term>Appareil lecture</term>
<term>Reconnaissance optique caractère</term>
<term>Disponibilité</term>
<term>Traitement image document</term>
<term>Evaluation performance</term>
<term>Algorithme</term>
<term>Collecte donnée</term>
<term>Précision</term>
<term>Vérification programme</term>
</keywords>
</textClass>
</profileDesc>
</teiHeader>
<front>
<div type="abstract" xml:lang="en">We announce the availability of the UNLV/ISRI Analytic Tools for OCR Evaluation together with a large and diverse collection of scanned document images with the associated ground-truth text. This combination of tools and test data will allow anyone to conduct a meaningful test comparing the performance of competing page-reading algorithms. The value of this collection of software tools and test data is enhanced by knowledge of the past performance of several systems using exactly these tools and this data. These performance comparisons were published in previous ISRI Test Reports and are also provided. Another value is that the tools can be used to test the character accuracy of any page-reading OCR system for any language included in the Unicode standard. The paper concludes with a summary of the programs, test data, and documentation that is available and gives the URL where they can be located.</div>
</front>
</TEI>
<inist>
<standard h6="B">
<pA>
<fA01 i1="01" i2="1">
<s0>1017-2653</s0>
</fA01>
<fA05>
<s2>5676</s2>
</fA05>
<fA08 i1="01" i2="1" l="ENG">
<s1>Software tools and test data for research and testing of page-reading OCR systems</s1>
</fA08>
<fA09 i1="01" i2="1" l="ENG">
<s1>Document recognition and retrieval XII : San Jose CA, 19-20 January 2005</s1>
</fA09>
<fA11 i1="01" i2="1">
<s1>NARTKER (Thomas A.)</s1>
</fA11>
<fA11 i1="02" i2="1">
<s1>RICE (Stephen V.)</s1>
</fA11>
<fA11 i1="03" i2="1">
<s1>LUMOS (Steven E.)</s1>
</fA11>
<fA12 i1="01" i2="1">
<s1>SMITH (Elisa H. Barney)</s1>
<s9>ed.</s9>
</fA12>
<fA12 i1="02" i2="1">
<s1>TAGHVA (Kazem)</s1>
<s9>ed.</s9>
</fA12>
<fA14 i1="01">
<s1>Information Science Research Institute (ISRI) University of Nevada, Las Vegas</s1>
<s2>Las Vegas, NV 89154-4021</s2>
<s3>USA</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
</fA14>
<fA18 i1="01" i2="1">
<s1>International Society for Optical Engineering</s1>
<s2>Bellingham WA</s2>
<s3>USA</s3>
<s9>org-cong.</s9>
</fA18>
<fA20>
<s1>37-47</s1>
</fA20>
<fA21>
<s1>2005</s1>
</fA21>
<fA23 i1="01">
<s0>ENG</s0>
</fA23>
<fA26 i1="01">
<s0>0-8194-5649-7</s0>
</fA26>
<fA43 i1="01">
<s1>INIST</s1>
<s2>21760</s2>
<s5>354000124499720050</s5>
</fA43>
<fA44>
<s0>0000</s0>
<s1>© 2005 INIST-CNRS. All rights reserved.</s1>
</fA44>
<fA45>
<s0>4 ref.</s0>
</fA45>
<fA47 i1="01" i2="1">
<s0>05-0361379</s0>
</fA47>
<fA60>
<s1>P</s1>
<s2>C</s2>
</fA60>
<fA61>
<s0>A</s0>
</fA61>
<fA64 i1="01" i2="1">
<s0>SPIE proceedings series</s0>
</fA64>
<fA66 i1="01">
<s0>USA</s0>
</fA66>
<fC01 i1="01" l="ENG">
<s0>We announce the availability of the UNLV/ISRI Analytic Tools for OCR Evaluation together with a large and diverse collection of scanned document images with the associated ground-truth text. This combination of tools and test data will allow anyone to conduct a meaningful test comparing the performance of competing page-reading algorithms. The value of this collection of software tools and test data is enhanced by knowledge of the past performance of several systems using exactly these tools and this data. These performance comparisons were published in previous ISRI Test Reports and are also provided. Another value is that the tools can be used to test the character accuracy of any page-reading OCR system for any language included in the Unicode standard. The paper concludes with a summary of the programs, test data, and documentation that is available and gives the URL where they can be located.</s0>
</fC01>
<fC02 i1="01" i2="X">
<s0>001D04A05C</s0>
</fC02>
<fC03 i1="01" i2="X" l="FRE">
<s0>Outil logiciel</s0>
<s5>01</s5>
</fC03>
<fC03 i1="01" i2="X" l="ENG">
<s0>Software tool</s0>
<s5>01</s5>
</fC03>
<fC03 i1="01" i2="X" l="SPA">
<s0>Herramienta software</s0>
<s5>01</s5>
</fC03>
<fC03 i1="02" i2="X" l="FRE">
<s0>Appareil lecture</s0>
<s5>02</s5>
</fC03>
<fC03 i1="02" i2="X" l="ENG">
<s0>Reading device</s0>
<s5>02</s5>
</fC03>
<fC03 i1="02" i2="X" l="SPA">
<s0>Aparato lectura</s0>
<s5>02</s5>
</fC03>
<fC03 i1="03" i2="X" l="FRE">
<s0>Reconnaissance optique caractère</s0>
<s5>03</s5>
</fC03>
<fC03 i1="03" i2="X" l="ENG">
<s0>Optical character recognition</s0>
<s5>03</s5>
</fC03>
<fC03 i1="03" i2="X" l="SPA">
<s0>Reconocimiento óptico de caracteres</s0>
<s5>03</s5>
</fC03>
<fC03 i1="04" i2="X" l="FRE">
<s0>Disponibilité</s0>
<s5>04</s5>
</fC03>
<fC03 i1="04" i2="X" l="ENG">
<s0>Availability</s0>
<s5>04</s5>
</fC03>
<fC03 i1="04" i2="X" l="SPA">
<s0>Disponibilidad</s0>
<s5>04</s5>
</fC03>
<fC03 i1="05" i2="3" l="FRE">
<s0>Traitement image document</s0>
<s5>05</s5>
</fC03>
<fC03 i1="05" i2="3" l="ENG">
<s0>Document image processing</s0>
<s5>05</s5>
</fC03>
<fC03 i1="06" i2="X" l="FRE">
<s0>Evaluation performance</s0>
<s5>06</s5>
</fC03>
<fC03 i1="06" i2="X" l="ENG">
<s0>Performance evaluation</s0>
<s5>06</s5>
</fC03>
<fC03 i1="06" i2="X" l="SPA">
<s0>Evaluación prestación</s0>
<s5>06</s5>
</fC03>
<fC03 i1="07" i2="X" l="FRE">
<s0>Algorithme</s0>
<s5>07</s5>
</fC03>
<fC03 i1="07" i2="X" l="ENG">
<s0>Algorithm</s0>
<s5>07</s5>
</fC03>
<fC03 i1="07" i2="X" l="SPA">
<s0>Algoritmo</s0>
<s5>07</s5>
</fC03>
<fC03 i1="08" i2="X" l="FRE">
<s0>Collecte donnée</s0>
<s5>08</s5>
</fC03>
<fC03 i1="08" i2="X" l="ENG">
<s0>Data gathering</s0>
<s5>08</s5>
</fC03>
<fC03 i1="08" i2="X" l="SPA">
<s0>Recolección dato</s0>
<s5>08</s5>
</fC03>
<fC03 i1="09" i2="X" l="FRE">
<s0>Précision</s0>
<s5>09</s5>
</fC03>
<fC03 i1="09" i2="X" l="ENG">
<s0>Accuracy</s0>
<s5>09</s5>
</fC03>
<fC03 i1="09" i2="X" l="SPA">
<s0>Precisión</s0>
<s5>09</s5>
</fC03>
<fC03 i1="10" i2="X" l="FRE">
<s0>Vérification programme</s0>
<s5>10</s5>
</fC03>
<fC03 i1="10" i2="X" l="ENG">
<s0>Program verification</s0>
<s5>10</s5>
</fC03>
<fC03 i1="10" i2="X" l="SPA">
<s0>Verificación programa</s0>
<s5>10</s5>
</fC03>
<fN21>
<s1>248</s1>
</fN21>
<fN44 i1="01">
<s1>OTO</s1>
</fN44>
<fN82>
<s1>OTO</s1>
</fN82>
</pA>
<pR>
<fA30 i1="01" i2="1" l="ENG">
<s1>Document recognition and retrieval. Conference</s1>
<s2>12</s2>
<s3>San Jose CA USA</s3>
<s4>2005-01-19</s4>
</fA30>
</pR>
</standard>
<server>
<NO>PASCAL 05-0361379 INIST</NO>
<ET>Software tools and test data for research and testing of page-reading OCR systems</ET>
<AU>NARTKER (Thomas A.); RICE (Stephen V.); LUMOS (Steven E.); SMITH (Elisa H. Barney); TAGHVA (Kazem)</AU>
<AF>Information Science Research Institute (ISRI) University of Nevada, Las Vegas/Las Vegas, NV 89154-4021/Etats-Unis (1 aut., 2 aut., 3 aut.)</AF>
<DT>Publication en série; Congrès; Niveau analytique</DT>
<SO>SPIE proceedings series; ISSN 1017-2653; Etats-Unis; Da. 2005; Vol. 5676; Pp. 37-47; Bibl. 4 ref.</SO>
<LA>Anglais</LA>
<EA>We announce the availability of the UNLV/ISRI Analytic Tools for OCR Evaluation together with a large and diverse collection of scanned document images with the associated ground-truth text. This combination of tools and test data will allow anyone to conduct a meaningful test comparing the performance of competing page-reading algorithms. The value of this collection of software tools and test data is enhanced by knowledge of the past performance of several systems using exactly these tools and this data. These performance comparisons were published in previous ISRI Test Reports and are also provided. Another value is that the tools can be used to test the character accuracy of any page-reading OCR system for any language included in the Unicode standard. The paper concludes with a summary of the programs, test data, and documentation that is available and gives the URL where they can be located.</EA>
<CC>001D04A05C</CC>
<FD>Outil logiciel; Appareil lecture; Reconnaissance optique caractère; Disponibilité; Traitement image document; Evaluation performance; Algorithme; Collecte donnée; Précision; Vérification programme</FD>
<ED>Software tool; Reading device; Optical character recognition; Availability; Document image processing; Performance evaluation; Algorithm; Data gathering; Accuracy; Program verification</ED>
<SD>Herramienta software; Aparato lectura; Reconocimiento óptico de caracteres; Disponibilidad; Evaluación prestación; Algoritmo; Recolección dato; Precisión; Verificación programa</SD>
<LO>INIST-21760.354000124499720050</LO>
<ID>05-0361379</ID>
</server>
</inist>
</record>

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Ticri/CIDE/explor/OcrV1/Data/PascalFrancis/Corpus
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000454 | SxmlIndent | more

Or

HfdSelect -h $EXPLOR_AREA/Data/PascalFrancis/Corpus/biblio.hfd -nk 000454 | SxmlIndent | more

To link to this page from within the Wicri network

{{Explor lien
   |wiki=    Ticri/CIDE
   |area=    OcrV1
   |flux=    PascalFrancis
   |étape=   Corpus
   |type=    RBID
   |clé=     Pascal:05-0361379
   |texte=   Software tools and test data for research and testing of page-reading OCR systems
}}

Wicri

This area was generated with Dilib version V0.6.32.
Data generation: Sat Nov 11 16:53:45 2017. Site generation: Mon Mar 11 23:15:16 2024