OCR exploration server

Warning: this site is under development!
Warning: this site was generated automatically from raw corpora.
The information has therefore not been validated.

Recognition of printed text under realistic conditions

Internal identifier: 003336 (Main/Merge); previous: 003335; next: 003337

Authors: T. Pavlidis [United States]

Source: Pattern recognition letters (ISSN 0167-8655), 1993

RBID : Pascal:93-0513262

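French descriptors: Reconnaissance forme, Reconnaissance caractère, Classification, OCR

English descriptors: Character recognition, Classification, OCR, Pattern recognition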

Abstract

Past research in OCR has focused on the shape analysis of binarized images, quite often assuming good-quality documents and isolated characters. Such assumptions are challenged by the conditions met in practice: binarization is difficult for low-contrast documents, characters often touch each other, not only on the sides but also between lines, etc. After a brief review of past work, we will describe current efforts to deal with OCR as a signal processing problem where the causes of noise and distortions, as well as the idealized images (definitions of typefaces), are modeled and subjected to a quantitative analysis. The key idea of the analysis is that while printed text images may be binary in an ideal state, the images seen by the sensors are gray scale because of convolution distortion and other causes.

Links toward previous steps (curation, corpus...)


Links to Exploration step

Pascal:93-0513262

The document in XML format

<record>
<TEI>
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en" level="a">Recognition of printed text under realistic conditions</title>
<author>
<name sortKey="Pavlidis, T" sort="Pavlidis, T" uniqKey="Pavlidis T" first="T." last="Pavlidis">T. Pavlidis</name>
<affiliation wicri:level="1">
<inist:fA14 i1="01">
<s1>SUNY, dep. computer sci.</s1>
<s2>Stony Brook NY 11794-4400</s2>
<s3>USA</s3>
</inist:fA14>
<country>États-Unis</country>
<wicri:noRegion>Stony Brook NY 11794-4400</wicri:noRegion>
</affiliation>
</author>
</titleStmt>
<publicationStmt>
<idno type="wicri:source">INIST</idno>
<idno type="inist">93-0513262</idno>
<date when="1993">1993</date>
<idno type="stanalyst">PASCAL 93-0513262 INIST</idno>
<idno type="RBID">Pascal:93-0513262</idno>
<idno type="wicri:Area/PascalFrancis/Corpus">000B26</idno>
<idno type="wicri:Area/PascalFrancis/Curation">000876</idno>
<idno type="wicri:Area/PascalFrancis/Checkpoint">000A85</idno>
<idno type="wicri:doubleKey">0167-8655:1993:Pavlidis T:recognition:of:printed</idno>
<idno type="wicri:Area/Main/Merge">003336</idno>
</publicationStmt>
<sourceDesc>
<biblStruct>
<analytic>
<title xml:lang="en" level="a">Recognition of printed text under realistic conditions</title>
<author>
<name sortKey="Pavlidis, T" sort="Pavlidis, T" uniqKey="Pavlidis T" first="T." last="Pavlidis">T. Pavlidis</name>
<affiliation wicri:level="1">
<inist:fA14 i1="01">
<s1>SUNY, dep. computer sci.</s1>
<s2>Stony Brook NY 11794-4400</s2>
<s3>USA</s3>
</inist:fA14>
<country>États-Unis</country>
<wicri:noRegion>Stony Brook NY 11794-4400</wicri:noRegion>
</affiliation>
</author>
</analytic>
<series>
<title level="j" type="main">Pattern recognition letters</title>
<title level="j" type="abbreviated">Pattern recogn. lett.</title>
<idno type="ISSN">0167-8655</idno>
<imprint>
<date when="1993">1993</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
<seriesStmt>
<title level="j" type="main">Pattern recognition letters</title>
<title level="j" type="abbreviated">Pattern recogn. lett.</title>
<idno type="ISSN">0167-8655</idno>
</seriesStmt>
</fileDesc>
<profileDesc>
<textClass>
<keywords scheme="KwdEn" xml:lang="en">
<term>Character recognition</term>
<term>Classification</term>
<term>OCR</term>
<term>Pattern recognition</term>
</keywords>
<keywords scheme="Pascal" xml:lang="fr">
<term>Reconnaissance forme</term>
<term>Reconnaissance caractère</term>
<term>Classification</term>
<term>OCR</term>
</keywords>
<keywords scheme="Wicri" type="topic" xml:lang="fr">
<term>Classification</term>
</keywords>
</textClass>
</profileDesc>
</teiHeader>
<front>
<div type="abstract" xml:lang="en">Past research in OCR has focused on the shape analysis of binarized images, quite often assuming good-quality documents and isolated characters. Such assumptions are challenged by the conditions met in practice: binarization is difficult for low-contrast documents, characters often touch each other, not only on the sides but also between lines, etc. After a brief review of past work, we will describe current efforts to deal with OCR as a signal processing problem where the causes of noise and distortions, as well as the idealized images (definitions of typefaces), are modeled and subjected to a quantitative analysis. The key idea of the analysis is that while printed text images may be binary in an ideal state, the images seen by the sensors are gray scale because of convolution distortion and other causes.</div>
</front>
</TEI>
<affiliations>
<list>
<country>
<li>États-Unis</li>
</country>
</list>
<tree>
<country name="États-Unis">
<noRegion>
<name sortKey="Pavlidis, T" sort="Pavlidis, T" uniqKey="Pavlidis T" first="T." last="Pavlidis">T. Pavlidis</name>
</noRegion>
</country>
</tree>
</affiliations>
</record>
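
As a rough sketch of how such a record might be read programmatically, the Python example below extracts the title, authors, and English keywords using only the standard library. The record as displayed uses the `wicri:` and `inist:` prefixes without declaring them, so a strict XML parser would reject it as is. The workaround shown here wraps the record in a hypothetical `<wrapper>` element that binds both prefixes to placeholder URIs. Those URIs, the wrapper element, and the helper names `parse_record` and `summarize` are assumptions made for illustration; they are not part of Wicri or Dilib. The embedded record is an abbreviated copy of the one above.

```python
import xml.etree.ElementTree as ET

# Abbreviated copy of the record shown above (only a few elements kept).
RECORD = """<record>
<TEI>
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en" level="a">Recognition of printed text under realistic conditions</title>
<author>
<name sortKey="Pavlidis, T" first="T." last="Pavlidis">T. Pavlidis</name>
<affiliation wicri:level="1">
<country>États-Unis</country>
</affiliation>
</author>
</titleStmt>
</fileDesc>
<profileDesc>
<textClass>
<keywords scheme="KwdEn" xml:lang="en">
<term>Character recognition</term>
<term>Classification</term>
<term>OCR</term>
<term>Pattern recognition</term>
</keywords>
</textClass>
</profileDesc>
</teiHeader>
</TEI>
</record>"""

# Placeholder namespace URIs (assumptions, not the real Wicri/INIST ones).
# They exist only so the parser accepts the undeclared prefixes.
WRAPPER = ('<wrapper xmlns:wicri="urn:example:wicri" '
           'xmlns:inist="urn:example:inist">{}</wrapper>')


def parse_record(text):
    """Parse a record after wrapping it in an element that binds the prefixes."""
    return ET.fromstring(WRAPPER.format(text)).find("record")


def summarize(record):
    """Collect the title, author names, and English keywords of a record."""
    title = record.findtext(".//titleStmt/title")
    authors = [name.text for name in record.findall(".//titleStmt/author/name")]
    keywords = [term.text
                for term in record.findall(".//keywords[@scheme='KwdEn']/term")]
    return {"title": title, "authors": authors, "keywords": keywords}


summary = summarize(parse_record(RECORD))
print(summary)
```

Wrapping rather than editing the record keeps the original text untouched; if the real namespace URIs are known, they can be substituted in `WRAPPER` without changing the rest of the sketch.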

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Ticri/CIDE/explor/OcrV1/Data/Main/Merge
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 003336 | SxmlIndent | more

Or

HfdSelect -h $EXPLOR_AREA/Data/Main/Merge/biblio.hfd -nk 003336 | SxmlIndent | more

To link to this page within the Wicri network

{{Explor lien
   |wiki=    Ticri/CIDE
   |area=    OcrV1
   |flux=    Main
   |étape=   Merge
   |type=    RBID
   |clé=     Pascal:93-0513262
   |texte=   Recognition of printed text under realistic conditions
}}

Wicri

This area was generated with Dilib version V0.6.32.
Data generation: Sat Nov 11 16:53:45 2017. Site generation: Mon Mar 11 23:15:16 2024