OCR exploration server

Warning: this site is under development!
Warning: this site was generated automatically from raw corpora.
The information has therefore not been validated.

A Complete OCR System for Gurmukhi Script

Internal identifier: 001949 (Main/Exploration); previous: 001948; next: 001950

Authors: S. Lehal [India]; Chandan Singh [India]

Source:

RBID : ISTEX:EA77217A2541B2454D448621375D10BBB78EEA89

Abstract

Recognition of Indian language scripts is a challenging problem. Work for the development of complete OCR systems for Indian language scripts is still in infancy. Complete OCR systems have recently been developed for Devanagri and Bangla scripts. Research in the field of recognition of Gurmukhi script faces major problems mainly related to the unique characteristics of the script like connectivity of characters on the headline, characters in a word present in both horizontal and vertical directions, two or more characters in a word having intersecting minimum bounding rectangles along horizontal direction, existence of a large set of visually similar character pairs, multi-component characters, touching characters which are present even in clean documents and horizontally overlapping text segments. This paper addresses the problems in the various stages of the development of a complete OCR for Gurmukhi script and discusses potential solutions.

Url:
DOI: 10.1007/3-540-70659-3_37


Affiliations:



The document in XML format

<record>
<TEI wicri:istexFullTextTei="biblStruct">
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en">A Complete OCR System for Gurmukhi Script</title>
<author>
<name sortKey="Lehal, S" sort="Lehal, S" uniqKey="Lehal S" first="S." last="Lehal">S. Lehal</name>
</author>
<author>
<name sortKey="Singh, Chandan" sort="Singh, Chandan" uniqKey="Singh C" first="Chandan" last="Singh">Chandan Singh</name>
</author>
</titleStmt>
<publicationStmt>
<idno type="wicri:source">ISTEX</idno>
<idno type="RBID">ISTEX:EA77217A2541B2454D448621375D10BBB78EEA89</idno>
<date when="2002" year="2002">2002</date>
<idno type="doi">10.1007/3-540-70659-3_37</idno>
<idno type="url">https://api.istex.fr/document/EA77217A2541B2454D448621375D10BBB78EEA89/fulltext/pdf</idno>
<idno type="wicri:Area/Istex/Corpus">000349</idno>
<idno type="wicri:Area/Istex/Curation">000344</idno>
<idno type="wicri:Area/Istex/Checkpoint">001063</idno>
<idno type="wicri:doubleKey">0302-9743:2002:Lehal S:a:complete:ocr</idno>
<idno type="wicri:Area/Main/Merge">001A29</idno>
<idno type="wicri:Area/Main/Curation">001949</idno>
<idno type="wicri:Area/Main/Exploration">001949</idno>
</publicationStmt>
<sourceDesc>
<biblStruct>
<analytic>
<title level="a" type="main" xml:lang="en">A Complete OCR System for Gurmukhi Script</title>
<author>
<name sortKey="Lehal, S" sort="Lehal, S" uniqKey="Lehal S" first="S." last="Lehal">S. Lehal</name>
<affiliation wicri:level="1">
<country xml:lang="fr">Inde</country>
<wicri:regionArea>Department of Computer Science and Engineering, Thapar Institute of Engineering &amp; Technology, Patiala</wicri:regionArea>
<wicri:noRegion>Patiala</wicri:noRegion>
</affiliation>
</author>
<author>
<name sortKey="Singh, Chandan" sort="Singh, Chandan" uniqKey="Singh C" first="Chandan" last="Singh">Chandan Singh</name>
<affiliation wicri:level="1">
<country xml:lang="fr">Inde</country>
<wicri:regionArea>Department of Computer Science and Engineering, Punjabi University, Patiala</wicri:regionArea>
<wicri:noRegion>Patiala</wicri:noRegion>
</affiliation>
</author>
</analytic>
<monogr></monogr>
<series>
<title level="s">Lecture Notes in Computer Science</title>
<imprint>
<date>2002</date>
</imprint>
<idno type="ISSN">0302-9743</idno>
<idno type="ISSN">0302-9743</idno>
</series>
<idno type="istex">EA77217A2541B2454D448621375D10BBB78EEA89</idno>
<idno type="DOI">10.1007/3-540-70659-3_37</idno>
<idno type="ChapterID">37</idno>
<idno type="ChapterID">Chap37</idno>
</biblStruct>
</sourceDesc>
<seriesStmt>
<idno type="ISSN">0302-9743</idno>
</seriesStmt>
</fileDesc>
<profileDesc>
<textClass></textClass>
<langUsage>
<language ident="en">en</language>
</langUsage>
</profileDesc>
</teiHeader>
<front>
<div type="abstract" xml:lang="en">Abstract: Recognition of Indian language scripts is a challenging problem. Work for the development of complete OCR systems for Indian language scripts is still in infancy. Complete OCR systems have recently been developed for Devanagri and Bangla scripts. Research in the field of recognition of Gurmukhi script faces major problems mainly related to the unique characteristics of the script like connectivity of characters on the headline, characters in a word present in both horizontal and vertical directions, two or more characters in a word having intersecting minimum bounding rectangles along horizontal direction, existence of a large set of visually similar character pairs, multi-component characters, touching characters which are present even in clean documents and horizontally overlapping text segments. This paper addresses the problems in the various stages of the development of a complete OCR for Gurmukhi script and discusses potential solutions.</div>
</front>
</TEI>
<affiliations>
<list>
<country>
<li>Inde</li>
</country>
</list>
<tree>
<country name="Inde">
<noRegion>
<name sortKey="Lehal, S" sort="Lehal, S" uniqKey="Lehal S" first="S." last="Lehal">S. Lehal</name>
</noRegion>
<name sortKey="Singh, Chandan" sort="Singh, Chandan" uniqKey="Singh C" first="Chandan" last="Singh">Chandan Singh</name>
</country>
</tree>
</affiliations>
</record>
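
As a rough illustration of how a record like the one above could be read programmatically, here is a minimal Python sketch using only the standard library. It rests on several assumptions that are not part of the original page: the `wicri:` prefix is not declared in the displayed record, so the sketch binds it to a placeholder URI (`urn:example:wicri`) purely so the parser accepts it; bare ampersands are escaped before parsing; and the helper names `parse_record` and `summarize` are hypothetical. The sample is a trimmed copy of the record, not the full document.

```python
import re
import xml.etree.ElementTree as ET

# Trimmed copy of the record above; only the fields read below are kept.
SAMPLE = """<record>
<TEI wicri:istexFullTextTei="biblStruct">
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en">A Complete OCR System for Gurmukhi Script</title>
<author>
<name sortKey="Lehal, S" first="S." last="Lehal">S. Lehal</name>
</author>
<author>
<name sortKey="Singh, Chandan" first="Chandan" last="Singh">Chandan Singh</name>
</author>
</titleStmt>
<publicationStmt>
<date when="2002" year="2002">2002</date>
<idno type="doi">10.1007/3-540-70659-3_37</idno>
</publicationStmt>
</fileDesc>
</teiHeader>
</TEI>
</record>"""

# Assumption: placeholder URI, used only so the parser accepts the wicri: prefix.
WICRI_NS = "urn:example:wicri"


def parse_record(text):
    """Parse a record string after two defensive fix-ups (hypothetical helper)."""
    # Escape bare ampersands that are not already part of an entity reference.
    text = re.sub(r"&(?!(?:[A-Za-z]+|#[0-9]+|#x[0-9A-Fa-f]+);)", "&amp;", text)
    # Declare the wicri prefix on the root element so the document is namespace-valid.
    text = text.replace("<record>", f'<record xmlns:wicri="{WICRI_NS}">', 1)
    return ET.fromstring(text)


def summarize(root):
    """Collect title, authors, year and DOI from the parsed record."""
    stmt = root.find("./TEI/teiHeader/fileDesc/titleStmt")
    pub = root.find("./TEI/teiHeader/fileDesc/publicationStmt")
    return {
        "title": stmt.findtext("title"),
        "authors": [name.text for name in stmt.findall("author/name")],
        "year": pub.findtext("date"),
        "doi": pub.findtext("idno[@type='doi']"),
    }


summary = summarize(parse_record(SAMPLE))
print(summary)
```

The same two fix-ups would probably be needed for the full record, since it also uses the undeclared `wicri:` prefix in element names; whether the real Dilib tools apply a comparable normalization is not stated on this page.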

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Ticri/CIDE/explor/OcrV1/Data/Main/Exploration
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 001949 | SxmlIndent | more

Or

HfdSelect -h $EXPLOR_AREA/Data/Main/Exploration/biblio.hfd -nk 001949 | SxmlIndent | more

To link to this page within the Wicri network

{{Explor lien
   |wiki=    Ticri/CIDE
   |area=    OcrV1
   |flux=    Main
   |étape=   Exploration
   |type=    RBID
   |clé=     ISTEX:EA77217A2541B2454D448621375D10BBB78EEA89
   |texte=   A Complete OCR System for Gurmukhi Script
}}

Wicri

This area was generated with Dilib version V0.6.32.
Data generation: Sat Nov 11 16:53:45 2017. Site generation: Mon Mar 11 23:15:16 2024