OCR exploration server

Warning: this site is under development.
Warning: this site was generated automatically from raw corpora.
The information has therefore not been validated.

Hardware design of a letter-driven OCR and document processing system

Internal identifier: 002A33 (Main/Merge); previous: 002A32; next: 002A34


Authors: N. Bourbakis [United States]; N. Pereira; S. Mertoguno

Source: Journal of network and computer applications (ISSN 1084-8045), 1996

RBID : Pascal:96-0399157


Abstract

This paper presents the design of a letter-driven OCR and document processing system. The system can scan, detect, extract and recognize text characters directly from a document. Conventional scanners send binary strings of 0s and 1s to the host computer's memory, where software programs recognize the characters; this system instead sends only the ASCII codes of the recognized characters to the host computer. When it works as a document processing system, it saves in the main processor memory all the recognizable characters that belong to the same word and attempts to match them against the contents of a lexicon database. The system consists of ten main parts: a focusing and zooming unit (FZ), a segmentation and text binarization unit (STB), a text sentence detection and paragraph synthesis unit (TSDPS), a raster scanner unit (RS), a horizontal and vertical projection unit (HVP), a character pre-processing circuit (CPC), a chain code generation unit (CCG), a line generator/recognizer unit (LGR), a graph generator unit (GG) and a matching processing unit (MP). Text characters to be recognized are scanned in through the focusing and zooming unit, and the matching processor produces the ASCII code of each recognized character.


The document in XML format

<record>
<TEI>
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en" level="a">Hardware design of a letter-driven OCR and document processing system</title>
<author>
<name sortKey="Bourbakis, N" sort="Bourbakis, N" uniqKey="Bourbakis N" first="N." last="Bourbakis">N. Bourbakis</name>
<affiliation wicri:level="2">
<inist:fA14 i1="01">
<s1>Binghamton University, Center for Intelligence Systems &amp; AAAI Lab</s1>
<s2>Binghamton, NY 13902</s2>
<s3>USA</s3>
<sZ>1 aut.</sZ>
</inist:fA14>
<country>États-Unis</country>
<placeName>
<region type="state">État de New York</region>
</placeName>
</affiliation>
</author>
<author>
<name sortKey="Pereira, N" sort="Pereira, N" uniqKey="Pereira N" first="N." last="Pereira">N. Pereira</name>
</author>
<author>
<name sortKey="Mertoguno, S" sort="Mertoguno, S" uniqKey="Mertoguno S" first="S." last="Mertoguno">S. Mertoguno</name>
</author>
</titleStmt>
<publicationStmt>
<idno type="wicri:source">INIST</idno>
<idno type="inist">96-0399157</idno>
<date when="1996">1996</date>
<idno type="stanalyst">PASCAL 96-0399157 INIST</idno>
<idno type="RBID">Pascal:96-0399157</idno>
<idno type="wicri:Area/PascalFrancis/Corpus">000988</idno>
<idno type="wicri:Area/PascalFrancis/Curation">000A10</idno>
<idno type="wicri:Area/PascalFrancis/Checkpoint">000946</idno>
<idno type="wicri:doubleKey">1084-8045:1996:Bourbakis N:hardware:design:of</idno>
<idno type="wicri:Area/Main/Merge">002A33</idno>
</publicationStmt>
<sourceDesc>
<biblStruct>
<analytic>
<title xml:lang="en" level="a">Hardware design of a letter-driven OCR and document processing system</title>
<author>
<name sortKey="Bourbakis, N" sort="Bourbakis, N" uniqKey="Bourbakis N" first="N." last="Bourbakis">N. Bourbakis</name>
<affiliation wicri:level="2">
<inist:fA14 i1="01">
<s1>Binghamton University, Center for Intelligence Systems &amp; AAAI Lab</s1>
<s2>Binghamton, NY 13902</s2>
<s3>USA</s3>
<sZ>1 aut.</sZ>
</inist:fA14>
<country>États-Unis</country>
<placeName>
<region type="state">État de New York</region>
</placeName>
</affiliation>
</author>
<author>
<name sortKey="Pereira, N" sort="Pereira, N" uniqKey="Pereira N" first="N." last="Pereira">N. Pereira</name>
</author>
<author>
<name sortKey="Mertoguno, S" sort="Mertoguno, S" uniqKey="Mertoguno S" first="S." last="Mertoguno">S. Mertoguno</name>
</author>
</analytic>
<series>
<title level="j" type="main">Journal of network and computer applications</title>
<idno type="ISSN">1084-8045</idno>
<imprint>
<date when="1996">1996</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
<seriesStmt>
<title level="j" type="main">Journal of network and computer applications</title>
<idno type="ISSN">1084-8045</idno>
</seriesStmt>
</fileDesc>
<profileDesc>
<textClass>
<keywords scheme="KwdEn" xml:lang="en">
<term>Character processing</term>
<term>Character recognition</term>
<term>Character string</term>
<term>Code generation</term>
<term>Database</term>
<term>Document processing</term>
<term>Manuscript character</term>
<term>OCR</term>
<term>Printed character</term>
<term>Scanner</term>
<term>Segmentation</term>
<term>Storage (materials)</term>
<term>System design</term>
</keywords>
<keywords scheme="Pascal" xml:lang="fr">
<term>Reconnaissance caractère</term>
<term>Conception système</term>
<term>Caractère manuscrit</term>
<term>Caractère imprimé</term>
<term>Traitement caractère</term>
<term>Traitement document</term>
<term>Chaîne caractère</term>
<term>Scanneur</term>
<term>Segmentation</term>
<term>Stockage</term>
<term>Génération code</term>
<term>Base donnée</term>
<term>OCR</term>
</keywords>
<keywords scheme="Wicri" type="topic" xml:lang="fr">
<term>Stockage</term>
<term>Base de données</term>
</keywords>
</textClass>
</profileDesc>
</teiHeader>
<front>
<div type="abstract" xml:lang="en">In this paper the design of a letter-driven OCR and document processing system is presented. The system can scan, detect, extract and recognize text characters directly from a document. Instead of sending binary strings of '0s' and '1s' like conventional scanners to the host computer's memory (where software programs are used to recognize the characters), it sends only the ASCII code of recognized characters to the host computer. When it works as a document processing system, it saves in the main processor memory all the recognizable characters, which belong to the same word, and attempts a matching process with the contents of lexicon database. The system presented here consists of ten main parts : a focusing and zooming unit (FZ), segmentation and text binarization unit (STB), text sentences detection and paragraph synthesis unit (TSDPS), a raster scanner unit (RS), a horizontal and vertical projection unit (HVP), a character pre-processing circuit (CPC), a chain code generation unit (CCG), a line generator/recognizer unit (LGR), a graph generator unit (GG) and a matching processing unit (MP). Note that text characters to be recognized are scanned in through the focusing and zooming unit and the corresponding ASCII code of each recognized character is produced by the matching processor.</div>
</front>
</TEI>
<affiliations>
<list>
<country>
<li>États-Unis</li>
</country>
<region>
<li>État de New York</li>
</region>
</list>
<tree>
<noCountry>
<name sortKey="Mertoguno, S" sort="Mertoguno, S" uniqKey="Mertoguno S" first="S." last="Mertoguno">S. Mertoguno</name>
<name sortKey="Pereira, N" sort="Pereira, N" uniqKey="Pereira N" first="N." last="Pereira">N. Pereira</name>
</noCountry>
<country name="États-Unis">
<region name="État de New York">
<name sortKey="Bourbakis, N" sort="Bourbakis, N" uniqKey="Bourbakis N" first="N." last="Bourbakis">N. Bourbakis</name>
</region>
</country>
</tree>
</affiliations>
</record>
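
For readers who want to pull fields out of a record like the one above without Dilib, here is a minimal sketch using Python's standard `xml.etree.ElementTree`. It works on a trimmed copy of the record: the `wicri:` and `inist:` prefixed parts are omitted because the dump shows no namespace declarations for them, and a strict XML parser would reject undeclared prefixes. The `summarize` function and the choice of fields are assumptions made for this example, not part of the Wicri tooling.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the record shown above (title, authors, a few English
# keywords). Prefixed wicri:/inist: content is deliberately left out.
RECORD = """<record>
<TEI>
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en" level="a">Hardware design of a letter-driven OCR and document processing system</title>
<author><name first="N." last="Bourbakis">N. Bourbakis</name></author>
<author><name first="N." last="Pereira">N. Pereira</name></author>
<author><name first="S." last="Mertoguno">S. Mertoguno</name></author>
</titleStmt>
</fileDesc>
<profileDesc>
<textClass>
<keywords scheme="KwdEn" xml:lang="en">
<term>Character recognition</term>
<term>OCR</term>
</keywords>
</textClass>
</profileDesc>
</teiHeader>
</TEI>
</record>"""


def summarize(xml_text: str) -> dict:
    """Extract title, author names and English keywords from a record."""
    root = ET.fromstring(xml_text)
    # The first title under titleStmt is the article title.
    title = root.findtext(".//titleStmt/title")
    authors = [name.text for name in root.findall(".//titleStmt/author/name")]
    # Only the English keyword list (scheme="KwdEn") is selected.
    keywords = [
        term.text for term in root.findall(".//keywords[@scheme='KwdEn']/term")
    ]
    return {"title": title, "authors": authors, "keywords": keywords}


if __name__ == "__main__":
    info = summarize(RECORD)
    print(info["title"])
    print("; ".join(info["authors"]))
    print(", ".join(info["keywords"]))
```

To process the full record, the undeclared prefixes would first have to be bound to namespaces (or stripped) and any bare `&` escaped as `&amp;`; how Dilib itself handles this is not shown on this page.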

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Ticri/CIDE/explor/OcrV1/Data/Main/Merge
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 002A33 | SxmlIndent | more

Or

HfdSelect -h $EXPLOR_AREA/Data/Main/Merge/biblio.hfd -nk 002A33 | SxmlIndent | more

To link to this page within the Wicri network

{{Explor lien
   |wiki=    Ticri/CIDE
   |area=    OcrV1
   |flux=    Main
   |étape=   Merge
   |type=    RBID
   |clé=     Pascal:96-0399157
   |texte=   Hardware design of a letter-driven OCR and document processing system
}}


This area was generated with Dilib version V0.6.32.
Data generation: Sat Nov 11 16:53:45 2017. Site generation: Mon Mar 11 23:15:16 2024