Machine-printed and hand-written text lines identification
Internal identifier: 001A32 (Main/Curation); previous: 001A31; next: 001A33
Authors: U. Pal [India]; Bidyut Baran Chaudhuri [India]
Source:
- Pattern Recognition Letters [ 0167-8655 ] ; 2000.
Abstract
There are many types of documents where machine-printed and hand-written texts intermixedly appear. Since the optical character recognition (OCR) methodologies for machine-printed and hand-written texts are different, to achieve optimal performance it is necessary to separate these two types of texts before feeding them to their respective OCR systems. In this paper, we present a machine-printed and hand-written text classification scheme for Bangla and Devnagari, the two most popular Indian scripts. The scheme is based on the structural and statistical features of the machine-printed and hand-written text lines. The classification scheme has an accuracy of 98.6%.
DOI: 10.1016/S0167-8655(00)00126-4
Links toward previous steps (curation, corpus...)
- to stream Istex, to step Corpus: 000601
- to stream Istex, to step Curation: 000593
- to stream Istex, to step Checkpoint: 001086
- to stream Main, to step Merge: 001B25
Links to Exploration step
ISTEX:49C2E2049B28B4ABE1CDE28ABF0CA9DF3074F0E9
The document in XML format
<record><TEI wicri:istexFullTextTei="biblStruct"><teiHeader><fileDesc><titleStmt><title>Machine-printed and hand-written text lines identification</title>
<author><name sortKey="Pal, U" sort="Pal, U" uniqKey="Pal U" first="U." last="Pal">U. Pal</name>
</author>
<author><name sortKey="Chaudhuri, B B" sort="Chaudhuri, B B" uniqKey="Chaudhuri B" first="B. B." last="Chaudhuri">Bidyut Baran Chaudhuri</name>
<affiliation><country>Inde</country>
<placeName><settlement type="city">Calcutta</settlement>
<region type="province">Bengale-Occidental</region>
</placeName>
<orgName type="lab" n="5">Institut indien de statistiques</orgName>
</affiliation>
</author>
</titleStmt>
<publicationStmt><idno type="wicri:source">ISTEX</idno>
<idno type="RBID">ISTEX:49C2E2049B28B4ABE1CDE28ABF0CA9DF3074F0E9</idno>
<date when="2001" year="2001">2001</date>
<idno type="doi">10.1016/S0167-8655(00)00126-4</idno>
<idno type="url">https://api.istex.fr/document/49C2E2049B28B4ABE1CDE28ABF0CA9DF3074F0E9/fulltext/pdf</idno>
<idno type="wicri:Area/Istex/Corpus">000601</idno>
<idno type="wicri:Area/Istex/Curation">000593</idno>
<idno type="wicri:Area/Istex/Checkpoint">001086</idno>
<idno type="wicri:doubleKey">0167-8655:2001:Pal U:machine:printed:and</idno>
<idno type="wicri:Area/Main/Merge">001B25</idno>
<idno type="wicri:Area/Main/Curation">001A32</idno>
</publicationStmt>
<sourceDesc><biblStruct><analytic><title level="a">Machine-printed and hand-written text lines identification</title>
<author><name sortKey="Pal, U" sort="Pal, U" uniqKey="Pal U" first="U." last="Pal">U. Pal</name>
<affiliation wicri:level="1"><country xml:lang="fr">Inde</country>
<wicri:regionArea>Computer Vision and Pattern Recognition Unit, Indian Statistical Institute, 203 B.T. Road, Calcutta 700 035</wicri:regionArea>
<wicri:noRegion>Calcutta 700 035</wicri:noRegion>
</affiliation>
<affiliation wicri:level="1"><country wicri:rule="url">Inde</country>
</affiliation>
</author>
<author><name sortKey="Chaudhuri, B B" sort="Chaudhuri, B B" uniqKey="Chaudhuri B" first="B. B." last="Chaudhuri">Bidyut Baran Chaudhuri</name>
<affiliation wicri:level="1"><country xml:lang="fr">Inde</country>
<wicri:regionArea>Computer Vision and Pattern Recognition Unit, Indian Statistical Institute, 203 B.T. Road, Calcutta 700 035</wicri:regionArea>
<wicri:noRegion>Calcutta 700 035</wicri:noRegion>
<placeName><settlement type="city">Calcutta</settlement>
<region type="province">Bengale-Occidental</region>
</placeName>
<orgName type="lab" n="5">Institut indien de statistiques</orgName>
</affiliation>
<affiliation wicri:level="1"><country wicri:rule="url">Inde</country>
<placeName><settlement type="city">Calcutta</settlement>
<region type="province">Bengale-Occidental</region>
</placeName>
<orgName type="lab" n="5">Institut indien de statistiques</orgName>
</affiliation>
</author>
</analytic>
<monogr></monogr>
<series><title level="j">Pattern Recognition Letters</title>
<title level="j" type="abbrev">PATREC</title>
<idno type="ISSN">0167-8655</idno>
<imprint><publisher>ELSEVIER</publisher>
<date type="published" when="2000">2000</date>
<biblScope unit="volume">22</biblScope>
<biblScope unit="issue">3–4</biblScope>
<biblScope unit="page" from="431">431</biblScope>
<biblScope unit="page" to="441">441</biblScope>
</imprint>
<idno type="ISSN">0167-8655</idno>
</series>
<idno type="istex">49C2E2049B28B4ABE1CDE28ABF0CA9DF3074F0E9</idno>
<idno type="DOI">10.1016/S0167-8655(00)00126-4</idno>
<idno type="PII">S0167-8655(00)00126-4</idno>
</biblStruct>
</sourceDesc>
<seriesStmt><idno type="ISSN">0167-8655</idno>
</seriesStmt>
</fileDesc>
<profileDesc><textClass></textClass>
<langUsage><language ident="en">en</language>
</langUsage>
</profileDesc>
</teiHeader>
<front><div type="abstract" xml:lang="en">There are many types of documents where machine-printed and hand-written texts intermixedly appear. Since the optical character recognition (OCR) methodologies for machine-printed and hand-written texts are different, to achieve optimal performance it is necessary to separate these two types of texts before feeding them to their respective OCR systems. In this paper, we present a machine-printed and hand-written text classification scheme for Bangla and Devnagari, the two most popular Indian scripts. The scheme is based on the structural and statistical features of the machine-printed and hand-written text lines. The classification scheme has an accuracy of 98.6%.</div>
</front>
</TEI>
</record>
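To read fields out of the record above with a general-purpose XML parser, note that the `wicri:` prefix is used without a namespace declaration, so a strict parser reports an unbound prefix. The following is a minimal Python sketch, not part of Dilib, that binds the prefix to a placeholder URI before parsing. The URI `urn:example:wicri`, the helper names `parse_record` and `summarize`, and the shortened `sample` record are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Assumption: placeholder namespace URI. The record uses the "wicri:" prefix
# without declaring it, so some binding must be supplied before parsing.
WICRI_NS = "urn:example:wicri"


def parse_record(xml_text):
    """Parse a <record>, declaring the wicri: prefix on the root element."""
    patched = xml_text.replace(
        "<record>", f'<record xmlns:wicri="{WICRI_NS}">', 1
    )
    return ET.fromstring(patched)


def summarize(record):
    """Collect title, DOI and author names from the parsed record."""
    analytic = record.find(".//analytic")
    return {
        "title": analytic.findtext("title"),
        "doi": record.findtext(".//publicationStmt/idno[@type='doi']"),
        "authors": [name.text for name in analytic.findall("author/name")],
    }


# Shortened copy of the record shown above, kept only for illustration.
sample = """<record><TEI wicri:istexFullTextTei="biblStruct"><teiHeader><fileDesc>
<publicationStmt><idno type="doi">10.1016/S0167-8655(00)00126-4</idno></publicationStmt>
<sourceDesc><biblStruct><analytic>
<title level="a">Machine-printed and hand-written text lines identification</title>
<author><name>U. Pal</name></author>
<author><name>Bidyut Baran Chaudhuri</name></author>
</analytic></biblStruct></sourceDesc>
</fileDesc></teiHeader></TEI></record>"""

info = summarize(parse_record(sample))
print(info)
```

The same two helpers should also work on the full record, since the `xml:` prefix used by `xml:lang` is predefined and needs no declaration.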
To manipulate this document under Unix (Dilib)
EXPLOR_STEP=$WICRI_ROOT/Ticri/CIDE/explor/OcrV1/Data/Main/Curation
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 001A32 | SxmlIndent | more
Or
HfdSelect -h $EXPLOR_AREA/Data/Main/Curation/biblio.hfd -nk 001A32 | SxmlIndent | more
To link to this page within the Wicri network
{{Explor lien |wiki= Ticri/CIDE |area= OcrV1 |flux= Main |étape= Curation |type= RBID |clé= ISTEX:49C2E2049B28B4ABE1CDE28ABF0CA9DF3074F0E9 |texte= Machine-printed and hand-written text lines identification }}
This area was generated with Dilib version V0.6.32.