Machine Recognition of Printed Kannada Text
Internal identifier: 000F98 (Istex/Checkpoint); previous: 000F97; next: 000F99
Authors: B. Vijay Kumar [India]; G. Ramakrishnan [India]
Source:
- Lecture Notes in Computer Science [0302-9743]; 2002.
Abstract
This paper presents the design of a full-fledged OCR system for printed Kannada text. Machine recognition of Kannada characters is difficult due to the similarity in shape of different characters, the complexity of the script, and non-uniqueness in the representation of diacritics. The document image is subjected to line segmentation, word segmentation and zone detection. From the zonal information, base characters, vowel modifiers and consonant conjuncts are separated. A knowledge-based approach is employed for recognizing the base characters. Various features are employed for recognizing the characters, including the coefficients of the Discrete Cosine Transform, the Discrete Wavelet Transform and the Karhunen-Loève Transform. These features are fed to different classifiers. Structural features are used in subsequent levels to discriminate between confused characters; their use increases the recognition rate from 93% to 98%. Apart from the classical pattern classification technique of nearest neighbour, Artificial Neural Network (ANN) based classifiers such as Backpropagation and Radial Basis Function (RBF) networks have also been studied. The ANN classifiers are trained in supervised mode using the transform features. The highest recognition rate of 99% is obtained with RBF, using second-level approximation coefficients of Haar wavelets as the features on presegmented base characters.
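As a rough illustration of the feature type named in the abstract, the sketch below computes second-level Haar approximation coefficients of a small binary image and classifies them with a nearest-neighbour rule. It is not the authors' implementation: the function names (`haar_approx`, `nearest_neighbour`), the 8x8 toy images and the orthonormal scaling are all assumptions made for the example, and the RBF network that gave the 99% figure is not reproduced here.

```python
import numpy as np

def haar_approx(image, levels=2):
    """Return Haar approximation (low-pass) coefficients after `levels` steps.

    Hypothetical helper: assumes both image sides are divisible by 2**levels.
    """
    coeffs = np.asarray(image, dtype=float)
    for _ in range(levels):
        rows, cols = coeffs.shape
        if rows % 2 or cols % 2:
            raise ValueError("image sides must be divisible by 2**levels")
        # Orthonormal 2-D Haar low-pass: sum each 2x2 block and divide by 2.
        coeffs = (coeffs[0::2, 0::2] + coeffs[0::2, 1::2]
                  + coeffs[1::2, 0::2] + coeffs[1::2, 1::2]) / 2.0
    return coeffs

def nearest_neighbour(train_features, train_labels, query):
    """Return the label of the training vector closest to `query` (Euclidean)."""
    distances = np.linalg.norm(train_features - query, axis=1)
    return train_labels[int(np.argmin(distances))]

# Toy 8x8 binary "glyphs" standing in for presegmented base characters.
left_half = np.zeros((8, 8))
left_half[:, :4] = 1
top_half = np.zeros((8, 8))
top_half[:4, :] = 1

train = np.stack([haar_approx(g).ravel() for g in (left_half, top_half)])
labels = ["left", "top"]

# A copy of the first glyph with one flipped pixel, used as the query.
noisy = left_half.copy()
noisy[0, 7] = 1
predicted = nearest_neighbour(train, labels, haar_approx(noisy).ravel())
print(predicted)
```

Two levels of approximation reduce each 8x8 image to a 2x2 feature array, so isolated pixel noise has little effect on the distance between feature vectors.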
DOI: 10.1007/3-540-45869-7_4
Links to Exploration step
ISTEX:3CE7016F0ECDCB6C7E187ECFA957CB5BEA325CF8
The document in XML format
<record><TEI wicri:istexFullTextTei="biblStruct"><teiHeader><fileDesc><titleStmt><title xml:lang="en">Machine Recognition of Printed Kannada Text</title>
<author><name sortKey="Vijay Kumar, B" sort="Vijay Kumar, B" uniqKey="Vijay Kumar B" first="B." last="Vijay Kumar">B. Vijay Kumar</name>
</author>
<author><name sortKey="Ramakrishnan, G" sort="Ramakrishnan, G" uniqKey="Ramakrishnan G" first="G." last="Ramakrishnan">G. Ramakrishnan</name>
</author>
</titleStmt>
<publicationStmt><idno type="wicri:source">ISTEX</idno>
<idno type="RBID">ISTEX:3CE7016F0ECDCB6C7E187ECFA957CB5BEA325CF8</idno>
<date when="2002" year="2002">2002</date>
<idno type="doi">10.1007/3-540-45869-7_4</idno>
<idno type="url">https://api.istex.fr/document/3CE7016F0ECDCB6C7E187ECFA957CB5BEA325CF8/fulltext/pdf</idno>
<idno type="wicri:Area/Istex/Corpus">000A04</idno>
<idno type="wicri:Area/Istex/Curation">000992</idno>
<idno type="wicri:Area/Istex/Checkpoint">000F98</idno>
</publicationStmt>
<sourceDesc><biblStruct><analytic><title level="a" type="main" xml:lang="en">Machine Recognition of Printed Kannada Text</title>
<author><name sortKey="Vijay Kumar, B" sort="Vijay Kumar, B" uniqKey="Vijay Kumar B" first="B." last="Vijay Kumar">B. Vijay Kumar</name>
<affiliation wicri:level="1"><country xml:lang="fr">Inde</country>
<wicri:regionArea>Department of Electrical Engineering, Indian Institute of Science, 560012, Bangalore</wicri:regionArea>
<wicri:noRegion>Bangalore</wicri:noRegion>
</affiliation>
<affiliation wicri:level="1"><country wicri:rule="url">Inde</country>
</affiliation>
</author>
<author><name sortKey="Ramakrishnan, G" sort="Ramakrishnan, G" uniqKey="Ramakrishnan G" first="G." last="Ramakrishnan">G. Ramakrishnan</name>
<affiliation wicri:level="1"><country xml:lang="fr">Inde</country>
<wicri:regionArea>Department of Electrical Engineering, Indian Institute of Science, 560012, Bangalore</wicri:regionArea>
<wicri:noRegion>Bangalore</wicri:noRegion>
</affiliation>
<affiliation wicri:level="1"><country wicri:rule="url">Inde</country>
</affiliation>
</author>
</analytic>
<monogr></monogr>
<series><title level="s">Lecture Notes in Computer Science</title>
<imprint><date>2002</date>
</imprint>
<idno type="ISSN">0302-9743</idno>
<idno type="ISSN">0302-9743</idno>
</series>
<idno type="istex">3CE7016F0ECDCB6C7E187ECFA957CB5BEA325CF8</idno>
<idno type="DOI">10.1007/3-540-45869-7_4</idno>
<idno type="ChapterID">4</idno>
<idno type="ChapterID">Chap4</idno>
</biblStruct>
</sourceDesc>
<seriesStmt><idno type="ISSN">0302-9743</idno>
</seriesStmt>
</fileDesc>
<profileDesc><textClass></textClass>
<langUsage><language ident="en">en</language>
</langUsage>
</profileDesc>
</teiHeader>
<front><div type="abstract" xml:lang="en">Abstract: This paper presents the design of a full-fledged OCR system for printed Kannada text. Machine recognition of Kannada characters is difficult due to the similarity in shape of different characters, the complexity of the script, and non-uniqueness in the representation of diacritics. The document image is subjected to line segmentation, word segmentation and zone detection. From the zonal information, base characters, vowel modifiers and consonant conjuncts are separated. A knowledge-based approach is employed for recognizing the base characters. Various features are employed for recognizing the characters, including the coefficients of the Discrete Cosine Transform, the Discrete Wavelet Transform and the Karhunen-Loève Transform. These features are fed to different classifiers. Structural features are used in subsequent levels to discriminate between confused characters; their use increases the recognition rate from 93% to 98%. Apart from the classical pattern classification technique of nearest neighbour, Artificial Neural Network (ANN) based classifiers such as Backpropagation and Radial Basis Function (RBF) networks have also been studied. The ANN classifiers are trained in supervised mode using the transform features. The highest recognition rate of 99% is obtained with RBF, using second-level approximation coefficients of Haar wavelets as the features on presegmented base characters.</div>
</front>
</TEI>
<affiliations><list><country><li>Inde</li>
</country>
</list>
<tree><country name="Inde"><noRegion><name sortKey="Vijay Kumar, B" sort="Vijay Kumar, B" uniqKey="Vijay Kumar B" first="B." last="Vijay Kumar">B. Vijay Kumar</name>
</noRegion>
<name sortKey="Ramakrishnan, G" sort="Ramakrishnan, G" uniqKey="Ramakrishnan G" first="G." last="Ramakrishnan">G. Ramakrishnan</name>
<name sortKey="Ramakrishnan, G" sort="Ramakrishnan, G" uniqKey="Ramakrishnan G" first="G." last="Ramakrishnan">G. Ramakrishnan</name>
<name sortKey="Vijay Kumar, B" sort="Vijay Kumar, B" uniqKey="Vijay Kumar B" first="B." last="Vijay Kumar">B. Vijay Kumar</name>
</country>
</tree>
</affiliations>
</record>
To manipulate this document under Unix (Dilib)
EXPLOR_STEP=$WICRI_ROOT/Ticri/CIDE/explor/OcrV1/Data/Istex/Checkpoint
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000F98 | SxmlIndent | more
Or
HfdSelect -h $EXPLOR_AREA/Data/Istex/Checkpoint/biblio.hfd -nk 000F98 | SxmlIndent | more
To link to this page within the Wicri network
{{Explor lien |wiki= Ticri/CIDE |area= OcrV1 |flux= Istex |étape= Checkpoint |type= RBID |clé= ISTEX:3CE7016F0ECDCB6C7E187ECFA957CB5BEA325CF8 |texte= Machine Recognition of Printed Kannada Text }}
This area was generated with Dilib version V0.6.32.