
Machine Recognition of Printed Kannada Text

Internal identifier: 000F98 (Istex/Checkpoint); previous: 000F97; next: 000F99

Authors: B. Vijay Kumar [India]; G. Ramakrishnan [India]

Source:

Lecture Notes in Computer Science [0302-9743]; 2002.

RBID : ISTEX:3CE7016F0ECDCB6C7E187ECFA957CB5BEA325CF8

Abstract

This paper presents the design of a full-fledged OCR system for printed Kannada text. The machine recognition of Kannada characters is difficult due to similarity in the shapes of different characters, script complexity and non-uniqueness in the representation of diacritics. The document image is subjected to line segmentation, word segmentation and zone detection. From the zonal information, base characters, vowel modifiers and consonant conjuncts are separated. A knowledge-based approach is employed for recognizing the base characters. Various features are employed for recognizing the characters. These include the coefficients of the Discrete Cosine Transform, Discrete Wavelet Transform and Karhunen-Loève Transform. These features are fed to different classifiers. Structural features are used in the subsequent levels to discriminate confused characters. The use of structural features increases the recognition rate from 93% to 98%. Apart from the classical pattern classification technique of nearest neighbour, Artificial Neural Network (ANN) based classifiers such as Back Propagation and Radial Basis Function (RBF) networks have also been studied. The ANN classifiers are trained in supervised mode using the transform features. The highest recognition rate of 99% is obtained with RBF using second-level approximation coefficients of Haar wavelets as the features on presegmented base characters.
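
The feature-plus-classifier idea in the abstract can be illustrated with a small sketch. It is not the authors' implementation: it assumes an orthonormal Haar normalisation, uses the nearest-neighbour rule (the classical technique the abstract mentions) rather than the RBF network behind the 99% figure, and runs on hypothetical 8x8 toy glyphs (`LEFT_BAR`, `TOP_BAR`) instead of Kannada characters. All function names are illustrative.

```python
def haar_approx(image):
    """One level of the 2-D Haar transform, approximation (LL) band only.

    With the orthonormal normalisation assumed here, each output value is
    the sum of a 2x2 block of the input divided by 2.
    """
    rows, cols = len(image), len(image[0])
    return [
        [(image[r][c] + image[r][c + 1]
          + image[r + 1][c] + image[r + 1][c + 1]) / 2.0
         for c in range(0, cols - 1, 2)]
        for r in range(0, rows - 1, 2)
    ]


def haar_features(image, levels=2):
    """Flattened approximation coefficients after `levels` Haar levels."""
    band = image
    for _ in range(levels):
        band = haar_approx(band)
    return [value for row in band for value in row]


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest_neighbour(features, prototypes):
    """Return the label of the prototype closest to `features`.

    `prototypes` is a list of (label, feature_vector) pairs.
    """
    return min(prototypes, key=lambda p: squared_distance(features, p[1]))[0]


# Hypothetical 8x8 binary glyphs standing in for presegmented characters.
LEFT_BAR = [[1, 1, 0, 0, 0, 0, 0, 0] for _ in range(8)]
TOP_BAR = [[1] * 8 if r < 2 else [0] * 8 for r in range(8)]

PROTOTYPES = [
    ("left_bar", haar_features(LEFT_BAR)),
    ("top_bar", haar_features(TOP_BAR)),
]

# A noisy copy of the left bar: one stray pixel in the bottom-right corner.
NOISY_LEFT = [row[:] for row in LEFT_BAR]
NOISY_LEFT[7][7] = 1

PREDICTED = nearest_neighbour(haar_features(NOISY_LEFT), PROTOTYPES)
print(PREDICTED)
```

Two Haar levels reduce each 8x8 glyph to four coefficients, so the stray pixel shifts the feature vector only slightly and the noisy glyph still matches its own prototype.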

Url:

https://api.istex.fr/document/3CE7016F0ECDCB6C7E187ECFA957CB5BEA325CF8/fulltext/pdf

DOI: 10.1007/3-540-45869-7_4

Affiliations:

India

Links toward previous steps (curation, corpus...)

to stream Istex, to step Corpus: 000A04
to stream Istex, to step Curation: 000992

Links to Exploration step

ISTEX:3CE7016F0ECDCB6C7E187ECFA957CB5BEA325CF8

The document in XML format

<record><TEI wicri:istexFullTextTei="biblStruct"><teiHeader><fileDesc><titleStmt><title xml:lang="en">Machine Recognition of Printed Kannada Text</title>
<author><name sortKey="Vijay Kumar, B" sort="Vijay Kumar, B" uniqKey="Vijay Kumar B" first="B." last="Vijay Kumar">B. Vijay Kumar</name>
</author>
<author><name sortKey="Ramakrishnan, G" sort="Ramakrishnan, G" uniqKey="Ramakrishnan G" first="G." last="Ramakrishnan">G. Ramakrishnan</name>
</author>
</titleStmt>
<publicationStmt><idno type="wicri:source">ISTEX</idno>
<idno type="RBID">ISTEX:3CE7016F0ECDCB6C7E187ECFA957CB5BEA325CF8</idno>
<date when="2002" year="2002">2002</date>
<idno type="doi">10.1007/3-540-45869-7_4</idno>
<idno type="url">https://api.istex.fr/document/3CE7016F0ECDCB6C7E187ECFA957CB5BEA325CF8/fulltext/pdf</idno>
<idno type="wicri:Area/Istex/Corpus">000A04</idno>
<idno type="wicri:Area/Istex/Curation">000992</idno>
<idno type="wicri:Area/Istex/Checkpoint">000F98</idno>
</publicationStmt>
<sourceDesc><biblStruct><analytic><title level="a" type="main" xml:lang="en">Machine Recognition of Printed Kannada Text</title>
<author><name sortKey="Vijay Kumar, B" sort="Vijay Kumar, B" uniqKey="Vijay Kumar B" first="B." last="Vijay Kumar">B. Vijay Kumar</name>
<affiliation wicri:level="1"><country xml:lang="fr">Inde</country>
<wicri:regionArea>Department of Electrical Engineering, Indian Institute of Science, 560012, Bangalore</wicri:regionArea>
<wicri:noRegion>Bangalore</wicri:noRegion>
</affiliation>
<affiliation wicri:level="1"><country wicri:rule="url">Inde</country>
</affiliation>
</author>
<author><name sortKey="Ramakrishnan, G" sort="Ramakrishnan, G" uniqKey="Ramakrishnan G" first="G." last="Ramakrishnan">G. Ramakrishnan</name>
<affiliation wicri:level="1"><country xml:lang="fr">Inde</country>
<wicri:regionArea>Department of Electrical Engineering, Indian Institute of Science, 560012, Bangalore</wicri:regionArea>
<wicri:noRegion>Bangalore</wicri:noRegion>
</affiliation>
<affiliation wicri:level="1"><country wicri:rule="url">Inde</country>
</affiliation>
</author>
</analytic>
<monogr></monogr>
<series><title level="s">Lecture Notes in Computer Science</title>
<imprint><date>2002</date>
</imprint>
<idno type="ISSN">0302-9743</idno>
<idno type="ISSN">0302-9743</idno>
</series>
<idno type="istex">3CE7016F0ECDCB6C7E187ECFA957CB5BEA325CF8</idno>
<idno type="DOI">10.1007/3-540-45869-7_4</idno>
<idno type="ChapterID">4</idno>
<idno type="ChapterID">Chap4</idno>
</biblStruct>
</sourceDesc>
<seriesStmt><idno type="ISSN">0302-9743</idno>
</seriesStmt>
</fileDesc>
<profileDesc><textClass></textClass>
<langUsage><language ident="en">en</language>
</langUsage>
</profileDesc>
</teiHeader>
<front><div type="abstract" xml:lang="en">Abstract: This paper presents the design of a full-fledged OCR system for printed Kannada text. The machine recognition of Kannada characters is difficult due to similarity in the shapes of different characters, script complexity and non-uniqueness in the representation of diacritics. The document image is subjected to line segmentation, word segmentation and zone detection. From the zonal information, base characters, vowel modifiers and consonant conjuncts are separated. A knowledge-based approach is employed for recognizing the base characters. Various features are employed for recognizing the characters. These include the coefficients of the Discrete Cosine Transform, Discrete Wavelet Transform and Karhunen-Loève Transform. These features are fed to different classifiers. Structural features are used in the subsequent levels to discriminate confused characters. The use of structural features increases the recognition rate from 93% to 98%. Apart from the classical pattern classification technique of nearest neighbour, Artificial Neural Network (ANN) based classifiers such as Back Propagation and Radial Basis Function (RBF) networks have also been studied. The ANN classifiers are trained in supervised mode using the transform features. The highest recognition rate of 99% is obtained with RBF using second-level approximation coefficients of Haar wavelets as the features on presegmented base characters.</div>
</front>
</TEI>
<affiliations><list><country><li>Inde</li>
</country>
</list>
<tree><country name="Inde"><noRegion><name sortKey="Vijay Kumar, B" sort="Vijay Kumar, B" uniqKey="Vijay Kumar B" first="B." last="Vijay Kumar">B. Vijay Kumar</name>
</noRegion>
<name sortKey="Ramakrishnan, G" sort="Ramakrishnan, G" uniqKey="Ramakrishnan G" first="G." last="Ramakrishnan">G. Ramakrishnan</name>
<name sortKey="Ramakrishnan, G" sort="Ramakrishnan, G" uniqKey="Ramakrishnan G" first="G." last="Ramakrishnan">G. Ramakrishnan</name>
<name sortKey="Vijay Kumar, B" sort="Vijay Kumar, B" uniqKey="Vijay Kumar B" first="B." last="Vijay Kumar">B. Vijay Kumar</name>
</country>
</tree>
</affiliations>
</record>
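
As a rough sketch of how the record above might be read programmatically, the Python below pulls the title, authors and DOI out of a shortened copy of it. The record uses the `wicri:` prefix without declaring it, which a strict XML parser rejects, so the sketch binds the prefix to a placeholder namespace URI before parsing. That URI (`urn:example:wicri`) and the function names are assumptions for illustration, not part of Dilib or Wicri.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the record shown above (only the fields used below).
RECORD = """<record><TEI wicri:istexFullTextTei="biblStruct"><teiHeader><fileDesc><titleStmt><title xml:lang="en">Machine Recognition of Printed Kannada Text</title>
<author><name sortKey="Vijay Kumar, B" first="B." last="Vijay Kumar">B. Vijay Kumar</name>
</author>
<author><name sortKey="Ramakrishnan, G" first="G." last="Ramakrishnan">G. Ramakrishnan</name>
</author>
</titleStmt>
<publicationStmt><idno type="wicri:source">ISTEX</idno>
<idno type="doi">10.1007/3-540-45869-7_4</idno>
</publicationStmt>
</fileDesc>
</teiHeader>
</TEI>
</record>"""

# Placeholder namespace URI: an assumption made so the parser accepts the
# undeclared "wicri:" prefix. It is not an official Wicri identifier.
WICRI_NS = "urn:example:wicri"


def parse_record(text):
    """Bind the wicri prefix on the root element, then parse the record."""
    patched = text.replace(
        "<record>", '<record xmlns:wicri="%s">' % WICRI_NS, 1
    )
    return ET.fromstring(patched)


def summarize(root):
    """Collect title, author names and DOI from the titleStmt/publicationStmt."""
    stmt = root.find(".//titleStmt")
    return {
        "title": stmt.findtext("title"),
        "authors": [name.text for name in stmt.findall("author/name")],
        "doi": root.findtext(".//idno[@type='doi']"),
    }


SUMMARY = summarize(parse_record(RECORD))
print(SUMMARY)
```

Patching the root element keeps the rest of the record untouched; the `xml:lang` attribute needs no such treatment because the `xml` prefix is predeclared.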

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Ticri/CIDE/explor/OcrV1/Data/Istex/Checkpoint

HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000F98 | SxmlIndent | more

HfdSelect -h $EXPLOR_AREA/Data/Istex/Checkpoint/biblio.hfd -nk 000F98 | SxmlIndent | more

To link to this page within the Wicri network

{{Explor lien
   |wiki=    Ticri/CIDE
   |area=    OcrV1
   |flux=    Istex
   |étape=   Checkpoint
   |type=    RBID
   |clé=     ISTEX:3CE7016F0ECDCB6C7E187ECFA957CB5BEA325CF8
   |texte=   Machine Recognition of Printed Kannada Text
}}

This area was generated with Dilib version V0.6.32.
Data generation: Sat Nov 11 16:53:45 2017. Site generation: Mon Mar 11 23:15:16 2024

	OCR exploration server
	Warning: this site is under development! Warning: this site was generated automatically from raw corpora. The information has therefore not been validated.
