OCR exploration server

Warning: this site is under development!
Warning: this site was generated by automated means from raw corpora.
The information has therefore not been validated.

Public domain optical character recognition

Internal identifier: 000953 (PascalFrancis/Corpus); previous: 000952; next: 000954

Authors: M. D. Garris; J. L. Blue; G. T. Candela; D. L. Dimmick; J. Geist; P. J. Grother; S. A. Janet; C. L. Wilson

Source: SPIE proceedings series (ISSN 1017-2653), 1995, Vol. 2422, pp. 2-14

RBID : Pascal:97-0135005

French descriptors

English descriptors

Abstract

A public domain document processing system has been developed by the National Institute of Standards and Technology (NIST). The system is a standard reference form-based handprint recognition system for evaluating optical character recognition (OCR), and it is intended to provide a baseline of performance on an open application. The system's source code, training data, performance assessment tools, and type of forms processed are all publicly available. The system recognizes the handprint entered on Handwriting Sample Forms like the ones distributed with NIST Special Database I. From these forms, the system reads hand-printed numeric fields, upper and lowercase alphabetic fields, and unconstrained text paragraphs comprised of words from a limited-size dictionary. The modular design of the system makes it useful for component evaluation and comparison, training and testing set validation, and multiple system voting schemes. The system contains a number of significant contributions to OCR technology, including an optimized Probabilistic Neural Network (PNN) classifier that operates a factor of 20 times faster than traditional software implementations of the algorithm. The source code for the recognition system is written in C and is organized into 11 libraries. In all, there are approximately 19,000 lines of code supporting more than 550 subroutines. Source code is provided for form registration, form removal, field isolation, field segmentation, character normalization, feature extraction, character classification, and dictionary-based postprocessing. The recognition system has been successfully compiled and tested on a host of UNIX workstations including computers manufactured by Digital Equipment Corporation, Hewlett Packard, IBM, Silicon Graphics Incorporated, and Sun Microsystems. This paper gives an overview of the recognition system's software architecture, including descriptions of the various system components along with timing and accuracy statistics.
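
The abstract mentions a Probabilistic Neural Network (PNN) classifier. As an illustration only, the following is a minimal textbook-style sketch of the PNN decision rule (a Gaussian kernel density estimate per class, followed by picking the class with the highest density). It is not the optimized NIST implementation, which is written in C; the function name `pnn_classify`, the `sigma` smoothing parameter, and the toy feature vectors are all assumptions made for this example.

```python
import math


def pnn_classify(x, training, sigma=1.0):
    """Return the label whose average Gaussian kernel response at x is highest.

    training: dict mapping a class label to a non-empty list of feature vectors.
    sigma: kernel width (hypothetical default, chosen for the toy data below).
    """
    best_label, best_score = None, -1.0
    for label, prototypes in training.items():
        total = 0.0
        for p in prototypes:
            # Squared Euclidean distance between the input and one prototype.
            sq_dist = sum((a - b) ** 2 for a, b in zip(x, p))
            total += math.exp(-sq_dist / (2.0 * sigma ** 2))
        # Average over prototypes so classes with more samples are not favored.
        score = total / len(prototypes)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Toy two-class data (made up for illustration, not NIST features).
TOY_TRAINING = {
    "0": [(0.0, 0.0), (0.1, 0.2)],
    "1": [(1.0, 1.0), (0.9, 1.1)],
}

print(pnn_classify((0.05, 0.1), TOY_TRAINING))
print(pnn_classify((1.0, 0.9), TOY_TRAINING))
```

Every prototype is visited for every input, so the cost grows linearly with the size of the training set; this is presumably the kind of cost that an optimized implementation would try to reduce.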

Record in standard format (ISO 2709)

See the documentation on the Inist Standard format.

pA  
A01 01  1    @0 1017-2653
A05       @2 2422
A08 01  1  ENG  @1 Public domain optical character recognition
A09 01  1  ENG  @1 Document recognition II : San Jose CA, 6-7 February 1995
A11 01  1    @1 GARRIS (M. D.)
A11 02  1    @1 BLUE (J. L.)
A11 03  1    @1 CANDELA (G. T.)
A11 04  1    @1 DIMMICK (D. L.)
A11 05  1    @1 GEIST (J.)
A11 06  1    @1 GROTHER (P. J.)
A11 07  1    @1 JANET (S. A.)
A11 08  1    @1 WILSON (C. L.)
A12 01  1    @1 VINCENT (Luc M.) @9 ed.
A12 02  1    @1 BAIRD (Henry S.) @9 ed.
A14 01      @1 National Institute of Standards and Technology @2 Gaithersburg, Maryland 20899 @3 USA @Z 1 aut. @Z 2 aut. @Z 3 aut. @Z 4 aut. @Z 5 aut. @Z 6 aut. @Z 7 aut. @Z 8 aut.
A18 01  1    @1 International Society for Optical Engineering @2 Bellingham WA @3 USA @9 patr.
A18 02  1    @1 Society for Imaging Science and Technology @2 Springfield VA @3 USA @9 patr.
A20       @1 2-14
A21       @1 1995
A23 01      @0 ENG
A43 01      @1 INIST @2 21760 @5 354000053416650010
A44       @0 0000 @1 © 1997 INIST-CNRS. All rights reserved.
A47 01  1    @0 97-0135005
A60       @1 P @2 C
A61       @0 A
A64 01  1    @0 SPIE proceedings series
A66 01      @0 USA
C01 01    ENG  @0 A public domain document processing system has been developed by the National Institute of Standards and Technology (NIST). The system is a standard reference form-based handprint recognition system for evaluating optical character recognition (OCR), and it is intended to provide a baseline of performance on an open application. The system's source code, training data, performance assessment tools, and type of forms processed are all publicly available. The system recognizes the handprint entered on Handwriting Sample Forms like the ones distributed with NIST Special Database I. From these forms, the system reads hand-printed numeric fields, upper and lowercase alphabetic fields, and unconstrained text paragraphs comprised of words from a limited-size dictionary. The modular design of the system makes it useful for component evaluation and comparison, training and testing set validation, and multiple system voting schemes. The system contains a number of significant contributions to OCR technology, including an optimized Probabilistic Neural Network (PNN) classifier that operates a factor of 20 times faster than traditional software implementations of the algorithm. The source code for the recognition system is written in C and is organized into 11 libraries. In all, there are approximately 19,000 lines of code supporting more than 550 subroutines. Source code is provided for form registration, form removal, field isolation, field segmentation, character normalization, feature extraction, character classification, and dictionary-based postprocessing. The recognition system has been successfully compiled and tested on a host of UNIX workstations including computers manufactured by Digital Equipment Corporation, Hewlett Packard, IBM, Silicon Graphics Incorporated, and Sun Microsystems. 
This paper gives an overview of the recognition system's software architecture, including descriptions of the various system components along with timing and accuracy statistics.
C02 01  X    @0 001A01G02A
C02 02  X    @0 205
C03 01  X  FRE  @0 Reconnaissance caractère @5 04
C03 01  X  ENG  @0 Character recognition @5 04
C03 01  X  SPA  @0 Reconocimiento carácter @5 04
C03 02  X  FRE  @0 Reconnaissance forme @5 05
C03 02  X  ENG  @0 Pattern recognition @5 05
C03 02  X  GER  @0 Mustererkennung @5 05
C03 02  X  SPA  @0 Reconocimiento patrón @5 05
C03 03  X  FRE  @0 Réseau neuronal @5 06
C03 03  X  ENG  @0 Neural network @5 06
C03 03  X  SPA  @0 Red neuronal @5 06
C03 04  X  FRE  @0 Reconnaissance optique caractère @5 07
C03 04  X  ENG  @0 Optical character recognition @5 07
C03 04  X  SPA  @0 Reconocimiento óptico de caracteres @5 07
C03 05  X  FRE  @0 Document @5 11
C03 05  X  ENG  @0 Document @5 11
C03 05  X  SPA  @0 Documento @5 11
C03 06  X  FRE  @0 Ecriture @5 12
C03 06  X  ENG  @0 Hand writing @5 12
C03 06  X  SPA  @0 Escritura manual @5 12
C03 07  X  FRE  @0 Traitement document @5 13
C03 07  X  ENG  @0 Document processing @5 13
C03 07  X  SPA  @0 Tratamiento documento @5 13
C03 08  X  FRE  @0 Domaine public @4 CD @5 96
C03 08  X  ENG  @0 Public domain @4 CD @5 96
C03 09  X  FRE  @0 Texte manuscrit @4 CD @5 97
C03 09  X  ENG  @0 Handwritten text @4 CD @5 97
N21       @1 055
pR  
A30 01  1  ENG  @1 Document recognition. Conference @3 San Jose CA USA @4 1995-02-06
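
The field lines above follow a visible pattern: a field tag (for example `A11`), optional indicators, then subfields introduced by `@` and a one-character code. As a minimal sketch, assuming only that visible layout and not the official ISO 2709 or Inist specification, one line could be split into its tag and subfields as follows. The names `parse_field` and `SUBFIELD` are hypothetical, and the indicators between the tag and the first subfield are ignored.

```python
import re

# A subfield is "@", a one-character code, whitespace, then a value that runs
# up to the next "@<code> " marker or to the end of the line.
SUBFIELD = re.compile(r"@(\w)\s+(.*?)(?=\s+@\w\s|$)")


def parse_field(line):
    """Split one non-empty record line into (tag, [(subfield_code, value), ...])."""
    tag = line.split(None, 1)[0]
    subfields = [(code, value.strip()) for code, value in SUBFIELD.findall(line)]
    return tag, subfields


print(parse_field("A11 01  1    @1 GARRIS (M. D.)"))
print(parse_field("A18 01  1    @1 International Society for Optical Engineering @2 Bellingham WA @3 USA @9 patr."))
```

A value that itself contained an `@` followed by a single character and a space would be split incorrectly; a production parser would need the real format definition.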

Inist format (server)

NO : PASCAL 97-0135005 INIST
ET : Public domain optical character recognition
AU : GARRIS (M. D.); BLUE (J. L.); CANDELA (G. T.); DIMMICK (D. L.); GEIST (J.); GROTHER (P. J.); JANET (S. A.); WILSON (C. L.); VINCENT (Luc M.); BAIRD (Henry S.)
AF : National Institute of Standards and Technology/Gaithersburg, Maryland 20899/Etats-Unis (1 aut., 2 aut., 3 aut., 4 aut., 5 aut., 6 aut., 7 aut., 8 aut.)
DT : Publication en série; Congrès; Niveau analytique
SO : SPIE proceedings series; ISSN 1017-2653; Etats-Unis; Da. 1995; Vol. 2422; Pp. 2-14
LA : Anglais
EA : A public domain document processing system has been developed by the National Institute of Standards and Technology (NIST). The system is a standard reference form-based handprint recognition system for evaluating optical character recognition (OCR), and it is intended to provide a baseline of performance on an open application. The system's source code, training data, performance assessment tools, and type of forms processed are all publicly available. The system recognizes the handprint entered on Handwriting Sample Forms like the ones distributed with NIST Special Database I. From these forms, the system reads hand-printed numeric fields, upper and lowercase alphabetic fields, and unconstrained text paragraphs comprised of words from a limited-size dictionary. The modular design of the system makes it useful for component evaluation and comparison, training and testing set validation, and multiple system voting schemes. The system contains a number of significant contributions to OCR technology, including an optimized Probabilistic Neural Network (PNN) classifier that operates a factor of 20 times faster than traditional software implementations of the algorithm. The source code for the recognition system is written in C and is organized into 11 libraries. In all, there are approximately 19,000 lines of code supporting more than 550 subroutines. Source code is provided for form registration, form removal, field isolation, field segmentation, character normalization, feature extraction, character classification, and dictionary-based postprocessing. The recognition system has been successfully compiled and tested on a host of UNIX workstations including computers manufactured by Digital Equipment Corporation, Hewlett Packard, IBM, Silicon Graphics Incorporated, and Sun Microsystems. This paper gives an overview of the recognition system's software architecture, including descriptions of the various system components along with timing and accuracy statistics.
CC : 001A01G02A; 205
FD : Reconnaissance caractère; Reconnaissance forme; Réseau neuronal; Reconnaissance optique caractère; Document; Ecriture; Traitement document; Domaine public; Texte manuscrit
ED : Character recognition; Pattern recognition; Neural network; Optical character recognition; Document; Hand writing; Document processing; Public domain; Handwritten text
GD : Mustererkennung
SD : Reconocimiento carácter; Reconocimiento patrón; Red neuronal; Reconocimiento óptico de caracteres; Documento; Escritura manual; Tratamiento documento
LO : INIST-21760.354000053416650010
ID : 97-0135005
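
The server format above is a list of two-letter keys followed by ` : ` and a value, with multi-valued fields separated by semicolons. A minimal sketch of reading it into a dictionary is shown below; the helper names `parse_inist_record` and `split_list` are hypothetical, and the sample text is a shortened excerpt of the record above.

```python
def parse_inist_record(text):
    """Map each two-letter key (NO, ET, AU, ...) to its raw string value."""
    record = {}
    for line in text.splitlines():
        key, sep, value = line.partition(" : ")
        # Keep only lines of the form "XX : value"; skip anything else.
        if sep and len(key) == 2:
            record[key] = value.strip()
    return record


def split_list(value):
    """Split a semicolon-separated field into a list of trimmed items."""
    return [item.strip() for item in value.split(";") if item.strip()]


SAMPLE = """NO : PASCAL 97-0135005 INIST
ET : Public domain optical character recognition
ED : Character recognition; Pattern recognition; Neural network
ID : 97-0135005"""

record = parse_inist_record(SAMPLE)
print(record["ET"])
print(split_list(record["ED"]))
```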

Links to Exploration step

Pascal:97-0135005

The document in XML format

<record>
<TEI>
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en" level="a">Public domain optical character recognition</title>
<author>
<name sortKey="Garris, M D" sort="Garris, M D" uniqKey="Garris M" first="M. D." last="Garris">M. D. Garris</name>
<affiliation>
<inist:fA14 i1="01">
<s1>National Institute of Standards and Technology</s1>
<s2>Gaithersburg, Maryland 20899</s2>
<s3>USA</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
<sZ>4 aut.</sZ>
<sZ>5 aut.</sZ>
<sZ>6 aut.</sZ>
<sZ>7 aut.</sZ>
<sZ>8 aut.</sZ>
</inist:fA14>
</affiliation>
</author>
<author>
<name sortKey="Blue, J L" sort="Blue, J L" uniqKey="Blue J" first="J. L." last="Blue">J. L. Blue</name>
<affiliation>
<inist:fA14 i1="01">
<s1>National Institute of Standards and Technology</s1>
<s2>Gaithersburg, Maryland 20899</s2>
<s3>USA</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
<sZ>4 aut.</sZ>
<sZ>5 aut.</sZ>
<sZ>6 aut.</sZ>
<sZ>7 aut.</sZ>
<sZ>8 aut.</sZ>
</inist:fA14>
</affiliation>
</author>
<author>
<name sortKey="Candela, G T" sort="Candela, G T" uniqKey="Candela G" first="G. T." last="Candela">G. T. Candela</name>
<affiliation>
<inist:fA14 i1="01">
<s1>National Institute of Standards and Technology</s1>
<s2>Gaithersburg, Maryland 20899</s2>
<s3>USA</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
<sZ>4 aut.</sZ>
<sZ>5 aut.</sZ>
<sZ>6 aut.</sZ>
<sZ>7 aut.</sZ>
<sZ>8 aut.</sZ>
</inist:fA14>
</affiliation>
</author>
<author>
<name sortKey="Dimmick, D L" sort="Dimmick, D L" uniqKey="Dimmick D" first="D. L." last="Dimmick">D. L. Dimmick</name>
<affiliation>
<inist:fA14 i1="01">
<s1>National Institute of Standards and Technology</s1>
<s2>Gaithersburg, Maryland 20899</s2>
<s3>USA</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
<sZ>4 aut.</sZ>
<sZ>5 aut.</sZ>
<sZ>6 aut.</sZ>
<sZ>7 aut.</sZ>
<sZ>8 aut.</sZ>
</inist:fA14>
</affiliation>
</author>
<author>
<name sortKey="Geist, J" sort="Geist, J" uniqKey="Geist J" first="J." last="Geist">J. Geist</name>
<affiliation>
<inist:fA14 i1="01">
<s1>National Institute of Standards and Technology</s1>
<s2>Gaithersburg, Maryland 20899</s2>
<s3>USA</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
<sZ>4 aut.</sZ>
<sZ>5 aut.</sZ>
<sZ>6 aut.</sZ>
<sZ>7 aut.</sZ>
<sZ>8 aut.</sZ>
</inist:fA14>
</affiliation>
</author>
<author>
<name sortKey="Grother, P J" sort="Grother, P J" uniqKey="Grother P" first="P. J." last="Grother">P. J. Grother</name>
<affiliation>
<inist:fA14 i1="01">
<s1>National Institute of Standards and Technology</s1>
<s2>Gaithersburg, Maryland 20899</s2>
<s3>USA</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
<sZ>4 aut.</sZ>
<sZ>5 aut.</sZ>
<sZ>6 aut.</sZ>
<sZ>7 aut.</sZ>
<sZ>8 aut.</sZ>
</inist:fA14>
</affiliation>
</author>
<author>
<name sortKey="Janet, S A" sort="Janet, S A" uniqKey="Janet S" first="S. A." last="Janet">S. A. Janet</name>
<affiliation>
<inist:fA14 i1="01">
<s1>National Institute of Standards and Technology</s1>
<s2>Gaithersburg, Maryland 20899</s2>
<s3>USA</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
<sZ>4 aut.</sZ>
<sZ>5 aut.</sZ>
<sZ>6 aut.</sZ>
<sZ>7 aut.</sZ>
<sZ>8 aut.</sZ>
</inist:fA14>
</affiliation>
</author>
<author>
<name sortKey="Wilson, C L" sort="Wilson, C L" uniqKey="Wilson C" first="C. L." last="Wilson">C. L. Wilson</name>
<affiliation>
<inist:fA14 i1="01">
<s1>National Institute of Standards and Technology</s1>
<s2>Gaithersburg, Maryland 20899</s2>
<s3>USA</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
<sZ>4 aut.</sZ>
<sZ>5 aut.</sZ>
<sZ>6 aut.</sZ>
<sZ>7 aut.</sZ>
<sZ>8 aut.</sZ>
</inist:fA14>
</affiliation>
</author>
</titleStmt>
<publicationStmt>
<idno type="wicri:source">INIST</idno>
<idno type="inist">97-0135005</idno>
<date when="1995">1995</date>
<idno type="stanalyst">PASCAL 97-0135005 INIST</idno>
<idno type="RBID">Pascal:97-0135005</idno>
<idno type="wicri:Area/PascalFrancis/Corpus">000953</idno>
</publicationStmt>
<sourceDesc>
<biblStruct>
<analytic>
<title xml:lang="en" level="a">Public domain optical character recognition</title>
<author>
<name sortKey="Garris, M D" sort="Garris, M D" uniqKey="Garris M" first="M. D." last="Garris">M. D. Garris</name>
<affiliation>
<inist:fA14 i1="01">
<s1>National Institute of Standards and Technology</s1>
<s2>Gaithersburg, Maryland 20899</s2>
<s3>USA</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
<sZ>4 aut.</sZ>
<sZ>5 aut.</sZ>
<sZ>6 aut.</sZ>
<sZ>7 aut.</sZ>
<sZ>8 aut.</sZ>
</inist:fA14>
</affiliation>
</author>
<author>
<name sortKey="Blue, J L" sort="Blue, J L" uniqKey="Blue J" first="J. L." last="Blue">J. L. Blue</name>
<affiliation>
<inist:fA14 i1="01">
<s1>National Institute of Standards and Technology</s1>
<s2>Gaithersburg, Maryland 20899</s2>
<s3>USA</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
<sZ>4 aut.</sZ>
<sZ>5 aut.</sZ>
<sZ>6 aut.</sZ>
<sZ>7 aut.</sZ>
<sZ>8 aut.</sZ>
</inist:fA14>
</affiliation>
</author>
<author>
<name sortKey="Candela, G T" sort="Candela, G T" uniqKey="Candela G" first="G. T." last="Candela">G. T. Candela</name>
<affiliation>
<inist:fA14 i1="01">
<s1>National Institute of Standards and Technology</s1>
<s2>Gaithersburg, Maryland 20899</s2>
<s3>USA</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
<sZ>4 aut.</sZ>
<sZ>5 aut.</sZ>
<sZ>6 aut.</sZ>
<sZ>7 aut.</sZ>
<sZ>8 aut.</sZ>
</inist:fA14>
</affiliation>
</author>
<author>
<name sortKey="Dimmick, D L" sort="Dimmick, D L" uniqKey="Dimmick D" first="D. L." last="Dimmick">D. L. Dimmick</name>
<affiliation>
<inist:fA14 i1="01">
<s1>National Institute of Standards and Technology</s1>
<s2>Gaithersburg, Maryland 20899</s2>
<s3>USA</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
<sZ>4 aut.</sZ>
<sZ>5 aut.</sZ>
<sZ>6 aut.</sZ>
<sZ>7 aut.</sZ>
<sZ>8 aut.</sZ>
</inist:fA14>
</affiliation>
</author>
<author>
<name sortKey="Geist, J" sort="Geist, J" uniqKey="Geist J" first="J." last="Geist">J. Geist</name>
<affiliation>
<inist:fA14 i1="01">
<s1>National Institute of Standards and Technology</s1>
<s2>Gaithersburg, Maryland 20899</s2>
<s3>USA</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
<sZ>4 aut.</sZ>
<sZ>5 aut.</sZ>
<sZ>6 aut.</sZ>
<sZ>7 aut.</sZ>
<sZ>8 aut.</sZ>
</inist:fA14>
</affiliation>
</author>
<author>
<name sortKey="Grother, P J" sort="Grother, P J" uniqKey="Grother P" first="P. J." last="Grother">P. J. Grother</name>
<affiliation>
<inist:fA14 i1="01">
<s1>National Institute of Standards and Technology</s1>
<s2>Gaithersburg, Maryland 20899</s2>
<s3>USA</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
<sZ>4 aut.</sZ>
<sZ>5 aut.</sZ>
<sZ>6 aut.</sZ>
<sZ>7 aut.</sZ>
<sZ>8 aut.</sZ>
</inist:fA14>
</affiliation>
</author>
<author>
<name sortKey="Janet, S A" sort="Janet, S A" uniqKey="Janet S" first="S. A." last="Janet">S. A. Janet</name>
<affiliation>
<inist:fA14 i1="01">
<s1>National Institute of Standards and Technology</s1>
<s2>Gaithersburg, Maryland 20899</s2>
<s3>USA</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
<sZ>4 aut.</sZ>
<sZ>5 aut.</sZ>
<sZ>6 aut.</sZ>
<sZ>7 aut.</sZ>
<sZ>8 aut.</sZ>
</inist:fA14>
</affiliation>
</author>
<author>
<name sortKey="Wilson, C L" sort="Wilson, C L" uniqKey="Wilson C" first="C. L." last="Wilson">C. L. Wilson</name>
<affiliation>
<inist:fA14 i1="01">
<s1>National Institute of Standards and Technology</s1>
<s2>Gaithersburg, Maryland 20899</s2>
<s3>USA</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
<sZ>4 aut.</sZ>
<sZ>5 aut.</sZ>
<sZ>6 aut.</sZ>
<sZ>7 aut.</sZ>
<sZ>8 aut.</sZ>
</inist:fA14>
</affiliation>
</author>
</analytic>
<series>
<title level="j" type="main">SPIE proceedings series</title>
<idno type="ISSN">1017-2653</idno>
<imprint>
<date when="1995">1995</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
<seriesStmt>
<title level="j" type="main">SPIE proceedings series</title>
<idno type="ISSN">1017-2653</idno>
</seriesStmt>
</fileDesc>
<profileDesc>
<textClass>
<keywords scheme="KwdEn" xml:lang="en">
<term>Character recognition</term>
<term>Document</term>
<term>Document processing</term>
<term>Hand writing</term>
<term>Handwritten text</term>
<term>Neural network</term>
<term>Optical character recognition</term>
<term>Pattern recognition</term>
<term>Public domain</term>
</keywords>
<keywords scheme="Pascal" xml:lang="fr">
<term>Reconnaissance caractère</term>
<term>Reconnaissance forme</term>
<term>Réseau neuronal</term>
<term>Reconnaissance optique caractère</term>
<term>Document</term>
<term>Ecriture</term>
<term>Traitement document</term>
<term>Domaine public</term>
<term>Texte manuscrit</term>
</keywords>
</textClass>
</profileDesc>
</teiHeader>
<front>
<div type="abstract" xml:lang="en">A public domain document processing system has been developed by the National Institute of Standards and Technology (NIST). The system is a standard reference form-based handprint recognition system for evaluating optical character recognition (OCR), and it is intended to provide a baseline of performance on an open application. The system's source code, training data, performance assessment tools, and type of forms processed are all publicly available. The system recognizes the handprint entered on Handwriting Sample Forms like the ones distributed with NIST Special Database I. From these forms, the system reads hand-printed numeric fields, upper and lowercase alphabetic fields, and unconstrained text paragraphs comprised of words from a limited-size dictionary. The modular design of the system makes it useful for component evaluation and comparison, training and testing set validation, and multiple system voting schemes. The system contains a number of significant contributions to OCR technology, including an optimized Probabilistic Neural Network (PNN) classifier that operates a factor of 20 times faster than traditional software implementations of the algorithm. The source code for the recognition system is written in C and is organized into 11 libraries. In all, there are approximately 19,000 lines of code supporting more than 550 subroutines. Source code is provided for form registration, form removal, field isolation, field segmentation, character normalization, feature extraction, character classification, and dictionary-based postprocessing. The recognition system has been successfully compiled and tested on a host of UNIX workstations including computers manufactured by Digital Equipment Corporation, Hewlett Packard, IBM, Silicon Graphics Incorporated, and Sun Microsystems. 
This paper gives an overview of the recognition system's software architecture, including descriptions of the various system components along with timing and accuracy statistics.</div>
</front>
</TEI>
<inist>
<standard h6="B">
<pA>
<fA01 i1="01" i2="1">
<s0>1017-2653</s0>
</fA01>
<fA05>
<s2>2422</s2>
</fA05>
<fA08 i1="01" i2="1" l="ENG">
<s1>Public domain optical character recognition</s1>
</fA08>
<fA09 i1="01" i2="1" l="ENG">
<s1>Document recognition II : San Jose CA, 6-7 February 1995</s1>
</fA09>
<fA11 i1="01" i2="1">
<s1>GARRIS (M. D.)</s1>
</fA11>
<fA11 i1="02" i2="1">
<s1>BLUE (J. L.)</s1>
</fA11>
<fA11 i1="03" i2="1">
<s1>CANDELA (G. T.)</s1>
</fA11>
<fA11 i1="04" i2="1">
<s1>DIMMICK (D. L.)</s1>
</fA11>
<fA11 i1="05" i2="1">
<s1>GEIST (J.)</s1>
</fA11>
<fA11 i1="06" i2="1">
<s1>GROTHER (P. J.)</s1>
</fA11>
<fA11 i1="07" i2="1">
<s1>JANET (S. A.)</s1>
</fA11>
<fA11 i1="08" i2="1">
<s1>WILSON (C. L.)</s1>
</fA11>
<fA12 i1="01" i2="1">
<s1>VINCENT (Luc M.)</s1>
<s9>ed.</s9>
</fA12>
<fA12 i1="02" i2="1">
<s1>BAIRD (Henry S.)</s1>
<s9>ed.</s9>
</fA12>
<fA14 i1="01">
<s1>National Institute of Standards and Technology</s1>
<s2>Gaithersburg, Maryland 20899</s2>
<s3>USA</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
<sZ>4 aut.</sZ>
<sZ>5 aut.</sZ>
<sZ>6 aut.</sZ>
<sZ>7 aut.</sZ>
<sZ>8 aut.</sZ>
</fA14>
<fA18 i1="01" i2="1">
<s1>International Society for Optical Engineering</s1>
<s2>Bellingham WA</s2>
<s3>USA</s3>
<s9>patr.</s9>
</fA18>
<fA18 i1="02" i2="1">
<s1>Society for Imaging Science and Technology</s1>
<s2>Springfield VA</s2>
<s3>USA</s3>
<s9>patr.</s9>
</fA18>
<fA20>
<s1>2-14</s1>
</fA20>
<fA21>
<s1>1995</s1>
</fA21>
<fA23 i1="01">
<s0>ENG</s0>
</fA23>
<fA43 i1="01">
<s1>INIST</s1>
<s2>21760</s2>
<s5>354000053416650010</s5>
</fA43>
<fA44>
<s0>0000</s0>
<s1>© 1997 INIST-CNRS. All rights reserved.</s1>
</fA44>
<fA47 i1="01" i2="1">
<s0>97-0135005</s0>
</fA47>
<fA60>
<s1>P</s1>
<s2>C</s2>
</fA60>
<fA61>
<s0>A</s0>
</fA61>
<fA64 i1="01" i2="1">
<s0>SPIE proceedings series</s0>
</fA64>
<fA66 i1="01">
<s0>USA</s0>
</fA66>
<fC01 i1="01" l="ENG">
<s0>A public domain document processing system has been developed by the National Institute of Standards and Technology (NIST). The system is a standard reference form-based handprint recognition system for evaluating optical character recognition (OCR), and it is intended to provide a baseline of performance on an open application. The system's source code, training data, performance assessment tools, and type of forms processed are all publicly available. The system recognizes the handprint entered on Handwriting Sample Forms like the ones distributed with NIST Special Database I. From these forms, the system reads hand-printed numeric fields, upper and lowercase alphabetic fields, and unconstrained text paragraphs comprised of words from a limited-size dictionary. The modular design of the system makes it useful for component evaluation and comparison, training and testing set validation, and multiple system voting schemes. The system contains a number of significant contributions to OCR technology, including an optimized Probabilistic Neural Network (PNN) classifier that operates a factor of 20 times faster than traditional software implementations of the algorithm. The source code for the recognition system is written in C and is organized into 11 libraries. In all, there are approximately 19,000 lines of code supporting more than 550 subroutines. Source code is provided for form registration, form removal, field isolation, field segmentation, character normalization, feature extraction, character classification, and dictionary-based postprocessing. The recognition system has been successfully compiled and tested on a host of UNIX workstations including computers manufactured by Digital Equipment Corporation, Hewlett Packard, IBM, Silicon Graphics Incorporated, and Sun Microsystems. 
This paper gives an overview of the recognition system's software architecture, including descriptions of the various system components along with timing and accuracy statistics.</s0>
</fC01>
<fC02 i1="01" i2="X">
<s0>001A01G02A</s0>
</fC02>
<fC02 i1="02" i2="X">
<s0>205</s0>
</fC02>
<fC03 i1="01" i2="X" l="FRE">
<s0>Reconnaissance caractère</s0>
<s5>04</s5>
</fC03>
<fC03 i1="01" i2="X" l="ENG">
<s0>Character recognition</s0>
<s5>04</s5>
</fC03>
<fC03 i1="01" i2="X" l="SPA">
<s0>Reconocimiento carácter</s0>
<s5>04</s5>
</fC03>
<fC03 i1="02" i2="X" l="FRE">
<s0>Reconnaissance forme</s0>
<s5>05</s5>
</fC03>
<fC03 i1="02" i2="X" l="ENG">
<s0>Pattern recognition</s0>
<s5>05</s5>
</fC03>
<fC03 i1="02" i2="X" l="GER">
<s0>Mustererkennung</s0>
<s5>05</s5>
</fC03>
<fC03 i1="02" i2="X" l="SPA">
<s0>Reconocimiento patrón</s0>
<s5>05</s5>
</fC03>
<fC03 i1="03" i2="X" l="FRE">
<s0>Réseau neuronal</s0>
<s5>06</s5>
</fC03>
<fC03 i1="03" i2="X" l="ENG">
<s0>Neural network</s0>
<s5>06</s5>
</fC03>
<fC03 i1="03" i2="X" l="SPA">
<s0>Red neuronal</s0>
<s5>06</s5>
</fC03>
<fC03 i1="04" i2="X" l="FRE">
<s0>Reconnaissance optique caractère</s0>
<s5>07</s5>
</fC03>
<fC03 i1="04" i2="X" l="ENG">
<s0>Optical character recognition</s0>
<s5>07</s5>
</fC03>
<fC03 i1="04" i2="X" l="SPA">
<s0>Reconocimiento óptico de caracteres</s0>
<s5>07</s5>
</fC03>
<fC03 i1="05" i2="X" l="FRE">
<s0>Document</s0>
<s5>11</s5>
</fC03>
<fC03 i1="05" i2="X" l="ENG">
<s0>Document</s0>
<s5>11</s5>
</fC03>
<fC03 i1="05" i2="X" l="SPA">
<s0>Documento</s0>
<s5>11</s5>
</fC03>
<fC03 i1="06" i2="X" l="FRE">
<s0>Ecriture</s0>
<s5>12</s5>
</fC03>
<fC03 i1="06" i2="X" l="ENG">
<s0>Hand writing</s0>
<s5>12</s5>
</fC03>
<fC03 i1="06" i2="X" l="SPA">
<s0>Escritura manual</s0>
<s5>12</s5>
</fC03>
<fC03 i1="07" i2="X" l="FRE">
<s0>Traitement document</s0>
<s5>13</s5>
</fC03>
<fC03 i1="07" i2="X" l="ENG">
<s0>Document processing</s0>
<s5>13</s5>
</fC03>
<fC03 i1="07" i2="X" l="SPA">
<s0>Tratamiento documento</s0>
<s5>13</s5>
</fC03>
<fC03 i1="08" i2="X" l="FRE">
<s0>Domaine public</s0>
<s4>CD</s4>
<s5>96</s5>
</fC03>
<fC03 i1="08" i2="X" l="ENG">
<s0>Public domain</s0>
<s4>CD</s4>
<s5>96</s5>
</fC03>
<fC03 i1="09" i2="X" l="FRE">
<s0>Texte manuscrit</s0>
<s4>CD</s4>
<s5>97</s5>
</fC03>
<fC03 i1="09" i2="X" l="ENG">
<s0>Handwritten text</s0>
<s4>CD</s4>
<s5>97</s5>
</fC03>
<fN21>
<s1>055</s1>
</fN21>
</pA>
<pR>
<fA30 i1="01" i2="1" l="ENG">
<s1>Document recognition. Conference</s1>
<s3>San Jose CA USA</s3>
<s4>1995-02-06</s4>
</fA30>
</pR>
</standard>
<server>
<NO>PASCAL 97-0135005 INIST</NO>
<ET>Public domain optical character recognition</ET>
<AU>GARRIS (M. D.); BLUE (J. L.); CANDELA (G. T.); DIMMICK (D. L.); GEIST (J.); GROTHER (P. J.); JANET (S. A.); WILSON (C. L.); VINCENT (Luc M.); BAIRD (Henry S.)</AU>
<AF>National Institute of Standards and Technology/Gaithersburg, Maryland 20899/Etats-Unis (1 aut., 2 aut., 3 aut., 4 aut., 5 aut., 6 aut., 7 aut., 8 aut.)</AF>
<DT>Publication en série; Congrès; Niveau analytique</DT>
<SO>SPIE proceedings series; ISSN 1017-2653; Etats-Unis; Da. 1995; Vol. 2422; Pp. 2-14</SO>
<LA>Anglais</LA>
<EA>A public domain document processing system has been developed by the National Institute of Standards and Technology (NIST). The system is a standard reference form-based handprint recognition system for evaluating optical character recognition (OCR), and it is intended to provide a baseline of performance on an open application. The system's source code, training data, performance assessment tools, and type of forms processed are all publicly available. The system recognizes the handprint entered on Handwriting Sample Forms like the ones distributed with NIST Special Database I. From these forms, the system reads hand-printed numeric fields, upper and lowercase alphabetic fields, and unconstrained text paragraphs comprised of words from a limited-size dictionary. The modular design of the system makes it useful for component evaluation and comparison, training and testing set validation, and multiple system voting schemes. The system contains a number of significant contributions to OCR technology, including an optimized Probabilistic Neural Network (PNN) classifier that operates a factor of 20 times faster than traditional software implementations of the algorithm. The source code for the recognition system is written in C and is organized into 11 libraries. In all, there are approximately 19,000 lines of code supporting more than 550 subroutines. Source code is provided for form registration, form removal, field isolation, field segmentation, character normalization, feature extraction, character classification, and dictionary-based postprocessing. The recognition system has been successfully compiled and tested on a host of UNIX workstations including computers manufactured by Digital Equipment Corporation, Hewlett Packard, IBM, Silicon Graphics Incorporated, and Sun Microsystems. 
This paper gives an overview of the recognition system's software architecture, including descriptions of the various system components along with timing and accuracy statistics.</EA>
<CC>001A01G02A; 205</CC>
<FD>Reconnaissance caractère; Reconnaissance forme; Réseau neuronal; Reconnaissance optique caractère; Document; Ecriture; Traitement document; Domaine public; Texte manuscrit</FD>
<ED>Character recognition; Pattern recognition; Neural network; Optical character recognition; Document; Hand writing; Document processing; Public domain; Handwritten text</ED>
<GD>Mustererkennung</GD>
<SD>Reconocimiento carácter; Reconocimiento patrón; Red neuronal; Reconocimiento óptico de caracteres; Documento; Escritura manual; Tratamiento documento</SD>
<LO>INIST-21760.354000053416650010</LO>
<ID>97-0135005</ID>
</server>
</inist>
</record>
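
The `<server>` block of the XML record holds the same fields as the Inist server format, one element per field. The sketch below reads a shortened `<server>` excerpt with Python's standard `xml.etree.ElementTree`; the function name `server_fields` is hypothetical. It deliberately parses only that excerpt: the full record uses an `inist:` prefix (for example `<inist:fA14>`) with no namespace declaration shown on this page, so a namespace-aware parser would likely reject it until a declaration is supplied.

```python
import xml.etree.ElementTree as ET

# Shortened excerpt of the <server> block shown above.
SERVER_XML = """<server>
<NO>PASCAL 97-0135005 INIST</NO>
<ET>Public domain optical character recognition</ET>
<ED>Character recognition; Pattern recognition; Neural network</ED>
<ID>97-0135005</ID>
</server>"""


def server_fields(xml_text):
    """Return a dict mapping each child element name to its text content."""
    root = ET.fromstring(xml_text)
    return {child.tag: (child.text or "").strip() for child in root}


fields = server_fields(SERVER_XML)
print(fields["ET"])
print([d.strip() for d in fields["ED"].split(";")])
```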

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Ticri/CIDE/explor/OcrV1/Data/PascalFrancis/Corpus
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000953 | SxmlIndent | more

Or

HfdSelect -h $EXPLOR_AREA/Data/PascalFrancis/Corpus/biblio.hfd -nk 000953 | SxmlIndent | more

To link to this page within the Wicri network

{{Explor lien
   |wiki=    Ticri/CIDE
   |area=    OcrV1
   |flux=    PascalFrancis
   |étape=   Corpus
   |type=    RBID
   |clé=     Pascal:97-0135005
   |texte=   Public domain optical character recognition
}}

Wicri

This area was generated with Dilib version V0.6.32.
Data generation: Sat Nov 11 16:53:45 2017. Site generation: Mon Mar 11 23:15:16 2024