Recognition of printed text under realistic conditions
Internal identifier: 003336 (Main/Merge); previous: 003335; next: 003337
Authors: T. Pavlidis [United States]
Source:
- Pattern recognition letters [ 0167-8655 ] ; 1993.
French descriptors
- Pascal (Inist) : Reconnaissance forme, Reconnaissance caractère, Classification, OCR.
- Wicri :
- topic : Classification.
English descriptors
- KwdEn : Character recognition, Classification, OCR, Pattern recognition.
Abstract
Past research in OCR has focused on the shape analysis of binarized images, quite often assuming good-quality documents and isolated characters. Such assumptions are challenged by the conditions met in practice: binarization is difficult for low-contrast documents, and characters often touch each other, not only on the sides but also between lines. After a brief review of past work, we describe current efforts to treat OCR as a signal processing problem in which the causes of noise and distortion, as well as the idealized images (definitions of typefaces), are modeled and subjected to quantitative analysis. The key idea of the analysis is that while printed text images may be binary in an ideal state, the images seen by the sensors are gray scale because of convolution distortion and other causes.
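The key idea in the abstract can be illustrated with a toy one-dimensional scanline. The sketch below is not the author's model: the three-tap blur kernel `psf`, the sample scanline, and the helper names `convolve`, `binarize`, and `count_strokes` are all assumptions chosen for illustration. It shows how an ideally binary signal becomes gray scale after convolution, and how a fixed threshold can then merge two strokes that were separate in the ideal image.

```python
def convolve(signal, kernel):
    """Same-length 1-D convolution with zero padding (kernel assumed symmetric)."""
    half = len(kernel) // 2
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for k, weight in enumerate(kernel):
            j = i + k - half
            if 0 <= j < len(signal):
                acc += weight * signal[j]
        out.append(acc)
    return out


def binarize(values, threshold):
    """Map gray levels to ink (1) or background (0) with a global threshold."""
    return [1 if v >= threshold else 0 for v in values]


def count_strokes(bits):
    """Count runs of consecutive ink pixels."""
    runs, previous = 0, 0
    for b in bits:
        if b == 1 and previous == 0:
            runs += 1
        previous = b
    return runs


# Ideal binary scanline: two strokes separated by a one-pixel gap (1 = ink).
ideal = [0, 0, 1, 1, 1, 0, 1, 1, 1, 0, 0]

# Hypothetical point spread function of the sensor (an assumption, not from the paper).
psf = [0.25, 0.5, 0.25]

# What the sensor sees: gray levels, including a partially filled gap.
observed = convolve(ideal, psf)

merged = binarize(observed, 0.5)     # the gap pixel reaches the threshold: strokes touch
separated = binarize(observed, 0.6)  # a slightly higher threshold keeps the gap

print(observed)
print(count_strokes(merged), count_strokes(separated))
```

In this toy case the outcome depends entirely on the threshold, which is one way to see why the abstract argues for analyzing the gray-scale signal rather than relying on binarization alone.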
Links toward previous steps (curation, corpus...)
- to stream PascalFrancis, to step Corpus: 000B26
- to stream PascalFrancis, to step Curation: 000876
- to stream PascalFrancis, to step Checkpoint: 000A85
Links to Exploration step
Pascal:93-0513262
The document in XML format
<record><TEI><teiHeader><fileDesc><titleStmt><title xml:lang="en" level="a">Recognition of printed text under realistic conditions</title>
<author><name sortKey="Pavlidis, T" sort="Pavlidis, T" uniqKey="Pavlidis T" first="T." last="Pavlidis">T. Pavlidis</name>
<affiliation wicri:level="1"><inist:fA14 i1="01"><s1>SUNY, dep. computer sci.</s1>
<s2>Stony Brook NY 11794-4400</s2>
<s3>USA</s3>
</inist:fA14>
<country>États-Unis</country>
<wicri:noRegion>Stony Brook NY 11794-4400</wicri:noRegion>
</affiliation>
</author>
</titleStmt>
<publicationStmt><idno type="wicri:source">INIST</idno>
<idno type="inist">93-0513262</idno>
<date when="1993">1993</date>
<idno type="stanalyst">PASCAL 93-0513262 INIST</idno>
<idno type="RBID">Pascal:93-0513262</idno>
<idno type="wicri:Area/PascalFrancis/Corpus">000B26</idno>
<idno type="wicri:Area/PascalFrancis/Curation">000876</idno>
<idno type="wicri:Area/PascalFrancis/Checkpoint">000A85</idno>
<idno type="wicri:doubleKey">0167-8655:1993:Pavlidis T:recognition:of:printed</idno>
<idno type="wicri:Area/Main/Merge">003336</idno>
</publicationStmt>
<sourceDesc><biblStruct><analytic><title xml:lang="en" level="a">Recognition of printed text under realistic conditions</title>
<author><name sortKey="Pavlidis, T" sort="Pavlidis, T" uniqKey="Pavlidis T" first="T." last="Pavlidis">T. Pavlidis</name>
<affiliation wicri:level="1"><inist:fA14 i1="01"><s1>SUNY, dep. computer sci.</s1>
<s2>Stony Brook NY 11794-4400</s2>
<s3>USA</s3>
</inist:fA14>
<country>États-Unis</country>
<wicri:noRegion>Stony Brook NY 11794-4400</wicri:noRegion>
</affiliation>
</author>
</analytic>
<series><title level="j" type="main">Pattern recognition letters</title>
<title level="j" type="abbreviated">Pattern recogn. lett.</title>
<idno type="ISSN">0167-8655</idno>
<imprint><date when="1993">1993</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
<seriesStmt><title level="j" type="main">Pattern recognition letters</title>
<title level="j" type="abbreviated">Pattern recogn. lett.</title>
<idno type="ISSN">0167-8655</idno>
</seriesStmt>
</fileDesc>
<profileDesc><textClass><keywords scheme="KwdEn" xml:lang="en"><term>Character recognition</term>
<term>Classification</term>
<term>OCR</term>
<term>Pattern recognition</term>
</keywords>
<keywords scheme="Pascal" xml:lang="fr"><term>Reconnaissance forme</term>
<term>Reconnaissance caractère</term>
<term>Classification</term>
<term>OCR</term>
</keywords>
<keywords scheme="Wicri" type="topic" xml:lang="fr"><term>Classification</term>
</keywords>
</textClass>
</profileDesc>
</teiHeader>
<front><div type="abstract" xml:lang="en">Past research in OCR has focused on the shape analysis of binarized images, quite often assuming good-quality documents and isolated characters. Such assumptions are challenged by the conditions met in practice: binarization is difficult for low-contrast documents, and characters often touch each other, not only on the sides but also between lines. After a brief review of past work, we describe current efforts to treat OCR as a signal processing problem in which the causes of noise and distortion, as well as the idealized images (definitions of typefaces), are modeled and subjected to quantitative analysis. The key idea of the analysis is that while printed text images may be binary in an ideal state, the images seen by the sensors are gray scale because of convolution distortion and other causes.</div>
</front>
</TEI>
<affiliations><list><country><li>États-Unis</li>
</country>
</list>
<tree><country name="États-Unis"><noRegion><name sortKey="Pavlidis, T" sort="Pavlidis, T" uniqKey="Pavlidis T" first="T." last="Pavlidis">T. Pavlidis</name>
</noRegion>
</country>
</tree>
</affiliations>
</record>
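As a rough illustration of how fields could be pulled out of a record like the one above without Dilib, here is a minimal Python sketch. It works on a short excerpt copied from the record and uses regular expressions rather than an XML parser, because the `wicri:` and `inist:` prefixes are not declared in the record as shown, so a namespace-aware parser would likely reject the full text. The helper name `first` and the variable names are assumptions made for this example.

```python
import re

# Short excerpt copied from the record above (not the full record).
RECORD = """<series><title level="j" type="main">Pattern recognition letters</title>
<title level="j" type="abbreviated">Pattern recogn. lett.</title>
<idno type="ISSN">0167-8655</idno>
<imprint><date when="1993">1993</date>
</imprint>
</series>
<keywords scheme="KwdEn" xml:lang="en"><term>Character recognition</term>
<term>Classification</term>
<term>OCR</term>
<term>Pattern recognition</term>
</keywords>"""


def first(pattern, text):
    """Return the first captured group for pattern, or None if there is no match."""
    match = re.search(pattern, text, re.DOTALL)
    return match.group(1) if match else None


journal = first(r'<title level="j" type="main">(.*?)</title>', RECORD)
issn = first(r'<idno type="ISSN">(.*?)</idno>', RECORD)
year = first(r'<date when="(\d{4})"', RECORD)

# Restrict the keyword search to the English (KwdEn) block.
english_block = first(r'<keywords scheme="KwdEn"[^>]*>(.*?)</keywords>', RECORD)
keywords = re.findall(r"<term>(.*?)</term>", english_block or "")

print(journal, issn, year)
print(keywords)
```

Regular expressions are fragile against changes in attribute order or whitespace, so this is only suitable for quick inspection of records in exactly this layout.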
To manipulate this document under Unix (Dilib)
EXPLOR_STEP=$WICRI_ROOT/Ticri/CIDE/explor/OcrV1/Data/Main/Merge
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 003336 | SxmlIndent | more
Or
HfdSelect -h $EXPLOR_AREA/Data/Main/Merge/biblio.hfd -nk 003336 | SxmlIndent | more
To add a link to this page within the Wicri network
{{Explor lien |wiki= Ticri/CIDE |area= OcrV1 |flux= Main |étape= Merge |type= RBID |clé= Pascal:93-0513262 |texte= Recognition of printed text under realistic conditions }}
This area was generated with Dilib version V0.6.32.