Hardware design of a letter-driven OCR and document processing system
Internal identifier: 002A33 (Main/Merge); previous: 002A32; next: 002A34
Authors: N. Bourbakis [United States]; N. Pereira; S. Mertoguno
Source:
- Journal of network and computer applications [1084-8045]; 1996.
French descriptors
- Pascal (Inist)
- Wicri :
- topic : Stockage, Base de données.
English descriptors
- KwdEn :
Abstract
In this paper the design of a letter-driven OCR and document processing system is presented. The system can scan, detect, extract and recognize text characters directly from a document. Instead of sending binary strings of '0s' and '1s' to the host computer's memory, as conventional scanners do (where software programs are used to recognize the characters), it sends only the ASCII code of recognized characters to the host computer. When it works as a document processing system, it saves in the main processor memory all the recognizable characters that belong to the same word, and attempts a matching process with the contents of a lexicon database. The system presented here consists of ten main parts: a focusing and zooming unit (FZ), a segmentation and text binarization unit (STB), a text sentences detection and paragraph synthesis unit (TSDPS), a raster scanner unit (RS), a horizontal and vertical projection unit (HVP), a character pre-processing circuit (CPC), a chain code generation unit (CCG), a line generator/recognizer unit (LGR), a graph generator unit (GG) and a matching processing unit (MP). Note that text characters to be recognized are scanned in through the focusing and zooming unit, and the corresponding ASCII code of each recognized character is produced by the matching processor.
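The abstract names a horizontal and vertical projection unit (HVP) without describing its logic. As a rough software illustration only, the sketch below shows the general projection-profile technique commonly used to isolate characters in a binarized image: count ink pixels per row to find text lines, then per column within each line to find character boxes. The function names, the 0/1 bitmap convention and the sample bitmap are assumptions made for this example; the paper describes a hardware design, and nothing here is taken from it.

```python
from typing import List, Tuple

# Assumed convention for this sketch: 1 = ink pixel, 0 = background.
Bitmap = List[List[int]]


def horizontal_projection(img: Bitmap) -> List[int]:
    """Count ink pixels in each row."""
    return [sum(row) for row in img]


def vertical_projection(img: Bitmap) -> List[int]:
    """Count ink pixels in each column."""
    return [sum(col) for col in zip(*img)]


def runs(profile: List[int]) -> List[Tuple[int, int]]:
    """Return half-open (start, end) ranges where the profile is non-zero."""
    spans = []
    start = None
    for i, value in enumerate(profile):
        if value > 0 and start is None:
            start = i
        elif value == 0 and start is not None:
            spans.append((start, i))
            start = None
    if start is not None:
        spans.append((start, len(profile)))
    return spans


def segment_characters(img: Bitmap) -> List[Tuple[int, int, int, int]]:
    """Return (top, bottom, left, right) boxes separated by blank rows or columns."""
    boxes = []
    for top, bottom in runs(horizontal_projection(img)):
        line = img[top:bottom]
        for left, right in runs(vertical_projection(line)):
            boxes.append((top, bottom, left, right))
    return boxes


# Hypothetical 4x7 bitmap holding two small blobs on one text line.
SAMPLE = [
    [0, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 1, 0, 0],
    [0, 1, 0, 0, 1, 1, 0],
    [0, 0, 0, 0, 0, 0, 0],
]

if __name__ == "__main__":
    print(segment_characters(SAMPLE))
```

This simple version splits only on fully blank rows and columns, so touching or overlapping characters would come out as a single box; a real system would need further processing for those cases.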
Links toward previous steps (curation, corpus...)
- to stream PascalFrancis, to step Corpus: 000988
- to stream PascalFrancis, to step Curation: 000A10
- to stream PascalFrancis, to step Checkpoint: 000946
Links to Exploration step
Pascal:96-0399157
The document in XML format
<record><TEI><teiHeader><fileDesc><titleStmt><title xml:lang="en" level="a">Hardware design of a letter-driven OCR and document processing system</title>
<author><name sortKey="Bourbakis, N" sort="Bourbakis, N" uniqKey="Bourbakis N" first="N." last="Bourbakis">N. Bourbakis</name>
<affiliation wicri:level="2"><inist:fA14 i1="01"><s1>Binghamton University, Center for Intelligence Systems &amp; AAAI Lab</s1>
<s2>Binghamton, NY 13902</s2>
<s3>USA</s3>
<sZ>1 aut.</sZ>
</inist:fA14>
<country>États-Unis</country>
<placeName><region type="state">État de New York</region>
</placeName>
</affiliation>
</author>
<author><name sortKey="Pereira, N" sort="Pereira, N" uniqKey="Pereira N" first="N." last="Pereira">N. Pereira</name>
</author>
<author><name sortKey="Mertoguno, S" sort="Mertoguno, S" uniqKey="Mertoguno S" first="S." last="Mertoguno">S. Mertoguno</name>
</author>
</titleStmt>
<publicationStmt><idno type="wicri:source">INIST</idno>
<idno type="inist">96-0399157</idno>
<date when="1996">1996</date>
<idno type="stanalyst">PASCAL 96-0399157 INIST</idno>
<idno type="RBID">Pascal:96-0399157</idno>
<idno type="wicri:Area/PascalFrancis/Corpus">000988</idno>
<idno type="wicri:Area/PascalFrancis/Curation">000A10</idno>
<idno type="wicri:Area/PascalFrancis/Checkpoint">000946</idno>
<idno type="wicri:doubleKey">1084-8045:1996:Bourbakis N:hardware:design:of</idno>
<idno type="wicri:Area/Main/Merge">002A33</idno>
</publicationStmt>
<sourceDesc><biblStruct><analytic><title xml:lang="en" level="a">Hardware design of a letter-driven OCR and document processing system</title>
<author><name sortKey="Bourbakis, N" sort="Bourbakis, N" uniqKey="Bourbakis N" first="N." last="Bourbakis">N. Bourbakis</name>
<affiliation wicri:level="2"><inist:fA14 i1="01"><s1>Binghamton University, Center for Intelligence Systems &amp; AAAI Lab</s1>
<s2>Binghamton, NY 13902</s2>
<s3>USA</s3>
<sZ>1 aut.</sZ>
</inist:fA14>
<country>États-Unis</country>
<placeName><region type="state">État de New York</region>
</placeName>
</affiliation>
</author>
<author><name sortKey="Pereira, N" sort="Pereira, N" uniqKey="Pereira N" first="N." last="Pereira">N. Pereira</name>
</author>
<author><name sortKey="Mertoguno, S" sort="Mertoguno, S" uniqKey="Mertoguno S" first="S." last="Mertoguno">S. Mertoguno</name>
</author>
</analytic>
<series><title level="j" type="main">Journal of network and computer applications</title>
<idno type="ISSN">1084-8045</idno>
<imprint><date when="1996">1996</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
<seriesStmt><title level="j" type="main">Journal of network and computer applications</title>
<idno type="ISSN">1084-8045</idno>
</seriesStmt>
</fileDesc>
<profileDesc><textClass><keywords scheme="KwdEn" xml:lang="en"><term>Character processing</term>
<term>Character recognition</term>
<term>Character string</term>
<term>Code generation</term>
<term>Database</term>
<term>Document processing</term>
<term>Manuscript character</term>
<term>OCR</term>
<term>Printed character</term>
<term>Scanner</term>
<term>Segmentation</term>
<term>Storage (materials)</term>
<term>System design</term>
</keywords>
<keywords scheme="Pascal" xml:lang="fr"><term>Reconnaissance caractère</term>
<term>Conception système</term>
<term>Caractère manuscrit</term>
<term>Caractère imprimé</term>
<term>Traitement caractère</term>
<term>Traitement document</term>
<term>Chaîne caractère</term>
<term>Scanneur</term>
<term>Segmentation</term>
<term>Stockage</term>
<term>Génération code</term>
<term>Base donnée</term>
<term>OCR</term>
</keywords>
<keywords scheme="Wicri" type="topic" xml:lang="fr"><term>Stockage</term>
<term>Base de données</term>
</keywords>
</textClass>
</profileDesc>
</teiHeader>
<front><div type="abstract" xml:lang="en">In this paper the design of a letter-driven OCR and document processing system is presented. The system can scan, detect, extract and recognize text characters directly from a document. Instead of sending binary strings of '0s' and '1s' like conventional scanners to the host computer's memory (where software programs are used to recognize the characters), it sends only the ASCII code of recognized characters to the host computer. When it works as a document processing system, it saves in the main processor memory all the recognizable characters, which belong to the same word, and attempts a matching process with the contents of lexicon database. The system presented here consists of ten main parts : a focusing and zooming unit (FZ), segmentation and text binarization unit (STB), text sentences detection and paragraph synthesis unit (TSDPS), a raster scanner unit (RS), a horizontal and vertical projection unit (HVP), a character pre-processing circuit (CPC), a chain code generation unit (CCG), a line generator/recognizer unit (LGR), a graph generator unit (GG) and a matching processing unit (MP). Note that text characters to be recognized are scanned in through the focusing and zooming unit and the corresponding ASCII code of each recognized character is produced by the matching processor.</div>
</front>
</TEI>
<affiliations><list><country><li>États-Unis</li>
</country>
<region><li>État de New York</li>
</region>
</list>
<tree><noCountry><name sortKey="Mertoguno, S" sort="Mertoguno, S" uniqKey="Mertoguno S" first="S." last="Mertoguno">S. Mertoguno</name>
<name sortKey="Pereira, N" sort="Pereira, N" uniqKey="Pereira N" first="N." last="Pereira">N. Pereira</name>
</noCountry>
<country name="États-Unis"><region name="État de New York"><name sortKey="Bourbakis, N" sort="Bourbakis, N" uniqKey="Bourbakis N" first="N." last="Bourbakis">N. Bourbakis</name>
</region>
</country>
</tree>
</affiliations>
</record>
To manipulate this document under Unix (Dilib)
EXPLOR_STEP=$WICRI_ROOT/Ticri/CIDE/explor/OcrV1/Data/Main/Merge
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 002A33 | SxmlIndent | more
Or
HfdSelect -h $EXPLOR_AREA/Data/Main/Merge/biblio.hfd -nk 002A33 | SxmlIndent | more
To add a link to this page in the Wicri network
{{Explor lien |wiki= Ticri/CIDE |area= OcrV1 |flux= Main |étape= Merge |type= RBID |clé= Pascal:96-0399157 |texte= Hardware design of a letter-driven OCR and document processing system }}
This area was generated with Dilib version V0.6.32.