Two template matching approaches to arabic, amharic and latin isolated characters recognition

Internal identifier: 001364 (Main/Exploration); previous: 001363; next: 001365

Authors: John Cowell [United Kingdom]; Fiaz Hussain [United Kingdom]

Source:

Machine graphics & vision [ 1230-0535 ] ; 2005.

RBID : Pascal:06-0200198

French descriptors

Pascal (Inist)
- Concordance forme, Appariement image, Reconnaissance caractère, Reconnaissance forme, Reconnaissance optique caractère, Texte, Chinois, Signature électronique, Similitude, Arabe, Japonais, Jeu caractère.
Wicri :
- topic : Signature électronique.

English descriptors

KwdEn :
- Arabic, Character recognition, Character set, Chinese, Digital signature, Image matching, Japanese, Optical character recognition, Pattern matching, Pattern recognition, Similarity, Text.

Abstract

With the establishment of commercial OCR systems for Latin text, recent research efforts have been directed at the design of recognition systems for non-Latin scripts, such as Japanese, Cyrillic, Chinese, Hindi, Tibetan, and in particular Arabic. The Unicode 4.0 standard supports 50 scripts that are used across the world, and many, such as Amharic (Ethiopic), have attracted virtually no attention from researchers. An extensive literature review reveals no papers which report on an OCR system for Amharic. This paper describes a normalised technique which can be used for recognition of isolated Arabic, Amharic and Latin characters. Two approaches are considered for identifying the characters by comparing them to a series of templates and using a signature template scheme. The degrees of similarity between pairs of Amharic, Arabic and typical Latin characters are presented in the confusion matrix, and the performance of the two approaches is compared for each of these three character sets.

Affiliations:

United Kingdom

Links to previous steps (curation, corpus...)

to stream PascalFrancis, to step Corpus: 000395
to stream PascalFrancis, to step Curation: 000391
to stream PascalFrancis, to step Checkpoint: 000370
to stream Main, to step Merge: 001400
to stream Main, to step Curation: 001364

The document in XML format

<record><TEI><teiHeader><fileDesc><titleStmt><title xml:lang="en" level="a">Two template matching approaches to arabic, amharic and latin isolated characters recognition</title>
<author><name sortKey="Cowell, John" sort="Cowell, John" uniqKey="Cowell J" first="John" last="Cowell">John Cowell</name>
<affiliation wicri:level="1"><inist:fA14 i1="01"><s1>Centre for Computational Intelligence, De Montfort University, The Gateway</s1>
<s2>Leicester, LE1 9BH, England</s2>
<s3>GBR</s3>
<sZ>1 aut.</sZ>
</inist:fA14>
<country>Royaume-Uni</country>
<wicri:noRegion>Leicester, LE1 9BH, England</wicri:noRegion>
</affiliation>
</author>
<author><name sortKey="Hussain, Fiaz" sort="Hussain, Fiaz" uniqKey="Hussain F" first="Fiaz" last="Hussain">Fiaz Hussain</name>
<affiliation wicri:level="1"><inist:fA14 i1="02"><s1>Dept. of Computing Information Systems, University of Luton,Park Square</s1>
<s2>Luton, LU1 3JU,England</s2>
<s3>GBR</s3>
<sZ>2 aut.</sZ>
</inist:fA14>
<country>Royaume-Uni</country>
<wicri:noRegion>Luton, LU1 3JU,England</wicri:noRegion>
</affiliation>
</author>
</titleStmt>
<publicationStmt><idno type="wicri:source">INIST</idno>
<idno type="inist">06-0200198</idno>
<date when="2005">2005</date>
<idno type="stanalyst">PASCAL 06-0200198 INIST</idno>
<idno type="RBID">Pascal:06-0200198</idno>
<idno type="wicri:Area/PascalFrancis/Corpus">000395</idno>
<idno type="wicri:Area/PascalFrancis/Curation">000391</idno>
<idno type="wicri:Area/PascalFrancis/Checkpoint">000370</idno>
<idno type="wicri:doubleKey">1230-0535:2005:Cowell J:two:template:matching</idno>
<idno type="wicri:Area/Main/Merge">001400</idno>
<idno type="wicri:Area/Main/Curation">001364</idno>
<idno type="wicri:Area/Main/Exploration">001364</idno>
</publicationStmt>
<sourceDesc><biblStruct><analytic><title xml:lang="en" level="a">Two template matching approaches to arabic, amharic and latin isolated characters recognition</title>
<author><name sortKey="Cowell, John" sort="Cowell, John" uniqKey="Cowell J" first="John" last="Cowell">John Cowell</name>
<affiliation wicri:level="1"><inist:fA14 i1="01"><s1>Centre for Computational Intelligence, De Montfort University, The Gateway</s1>
<s2>Leicester, LE1 9BH, England</s2>
<s3>GBR</s3>
<sZ>1 aut.</sZ>
</inist:fA14>
<country>Royaume-Uni</country>
<wicri:noRegion>Leicester, LE1 9BH, England</wicri:noRegion>
</affiliation>
</author>
<author><name sortKey="Hussain, Fiaz" sort="Hussain, Fiaz" uniqKey="Hussain F" first="Fiaz" last="Hussain">Fiaz Hussain</name>
<affiliation wicri:level="1"><inist:fA14 i1="02"><s1>Dept. of Computing Information Systems, University of Luton,Park Square</s1>
<s2>Luton, LU1 3JU,England</s2>
<s3>GBR</s3>
<sZ>2 aut.</sZ>
</inist:fA14>
<country>Royaume-Uni</country>
<wicri:noRegion>Luton, LU1 3JU,England</wicri:noRegion>
</affiliation>
</author>
</analytic>
<series><title level="j" type="main">Machine graphics &amp; vision</title>
<title level="j" type="abbreviated">Mach. graph. vis.</title>
<idno type="ISSN">1230-0535</idno>
<imprint><date when="2005">2005</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
<seriesStmt><title level="j" type="main">Machine graphics &amp; vision</title>
<title level="j" type="abbreviated">Mach. graph. vis.</title>
<idno type="ISSN">1230-0535</idno>
</seriesStmt>
</fileDesc>
<profileDesc><textClass><keywords scheme="KwdEn" xml:lang="en"><term>Arabic</term>
<term>Character recognition</term>
<term>Character set</term>
<term>Chinese</term>
<term>Digital signature</term>
<term>Image matching</term>
<term>Japanese</term>
<term>Optical character recognition</term>
<term>Pattern matching</term>
<term>Pattern recognition</term>
<term>Similarity</term>
<term>Text</term>
</keywords>
<keywords scheme="Pascal" xml:lang="fr"><term>Concordance forme</term>
<term>Appariement image</term>
<term>Reconnaissance caractère</term>
<term>Reconnaissance forme</term>
<term>Reconnaissance optique caractère</term>
<term>Texte</term>
<term>Chinois</term>
<term>Signature électronique</term>
<term>Similitude</term>
<term>Arabe</term>
<term>Japonais</term>
<term>Jeu caractère</term>
</keywords>
<keywords scheme="Wicri" type="topic" xml:lang="fr"><term>Signature électronique</term>
</keywords>
</textClass>
</profileDesc>
</teiHeader>
<front><div type="abstract" xml:lang="en">With the establishment of commercial OCR systems for Latin text, recent research efforts have been directed at the design of recognition systems for non-Latin scripts, such as Japanese, Cyrillic, Chinese, Hindi, Tibetan, and in particular Arabic. The Unicode 4.0 standard supports 50 scripts that are used across the world, and many, such as Amharic (Ethiopic), have attracted virtually no attention from researchers. An extensive literature review reveals no papers which report on an OCR system for Amharic. This paper describes a normalised technique which can be used for recognition of isolated Arabic, Amharic and Latin characters. Two approaches are considered for identifying the characters by comparing them to a series of templates and using a signature template scheme. The degrees of similarity between pairs of Amharic, Arabic and typical Latin characters are presented in the confusion matrix, and the performance of the two approaches is compared for each of these three character sets.</div>
</front>
</TEI>
<affiliations><list><country><li>Royaume-Uni</li>
</country>
</list>
<tree><country name="Royaume-Uni"><noRegion><name sortKey="Cowell, John" sort="Cowell, John" uniqKey="Cowell J" first="John" last="Cowell">John Cowell</name>
</noRegion>
<name sortKey="Hussain, Fiaz" sort="Hussain, Fiaz" uniqKey="Hussain F" first="Fiaz" last="Hussain">Fiaz Hussain</name>
</country>
</tree>
</affiliations>
</record>

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Ticri/CIDE/explor/OcrV1/Data/Main/Exploration

HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 001364 | SxmlIndent | more

HfdSelect -h $EXPLOR_AREA/Data/Main/Exploration/biblio.hfd -nk 001364 | SxmlIndent | more
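
Where Dilib is not available, the main fields of the XML record above can also be read with a general-purpose XML parser. The following is a minimal sketch in Python, not part of Dilib: the `summarise` function and the `RECORD` string (a trimmed copy of the record) are illustrative names, and the two `urn:example:` namespace URIs are placeholder assumptions, added only because the record uses the `wicri:` and `inist:` prefixes without declaring them.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the record above. The wicri: and inist: prefixes are not
# declared in the record itself, so placeholder namespace URIs (assumptions)
# are bound on the root element so that the parser accepts them.
RECORD = """\
<record xmlns:wicri="urn:example:wicri" xmlns:inist="urn:example:inist">
<TEI><teiHeader><fileDesc><titleStmt>
<title xml:lang="en" level="a">Two template matching approaches to arabic, amharic and latin isolated characters recognition</title>
<author><name sortKey="Cowell, John" first="John" last="Cowell">John Cowell</name>
<affiliation wicri:level="1"><country>Royaume-Uni</country></affiliation>
</author>
<author><name sortKey="Hussain, Fiaz" first="Fiaz" last="Hussain">Fiaz Hussain</name>
<affiliation wicri:level="1"><country>Royaume-Uni</country></affiliation>
</author>
</titleStmt>
<publicationStmt><idno type="RBID">Pascal:06-0200198</idno>
<date when="2005">2005</date>
</publicationStmt>
</fileDesc></teiHeader></TEI>
</record>
"""


def summarise(record_xml):
    """Return title, author names, RBID and year from a record string."""
    root = ET.fromstring(record_xml)
    stmt = root.find("./TEI/teiHeader/fileDesc/titleStmt")
    title = stmt.findtext("title")
    # One <author> element per author; the display name is the <name> text.
    authors = [author.findtext("name") for author in stmt.findall("author")]
    rbid = root.findtext(".//idno[@type='RBID']")
    year = root.find(".//publicationStmt/date").get("when")
    return {"title": title, "authors": authors, "rbid": rbid, "year": year}


summary = summarise(RECORD)
print(summary)
```

To run this against the full record rather than the trimmed copy, the same two namespace declarations would have to be added to its `<record>` element, and any bare `&` in the text would have to be written as `&amp;`.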

To link to this page within the Wicri network

{{Explor lien
   |wiki=    Ticri/CIDE
   |area=    OcrV1
   |flux=    Main
   |étape=   Exploration
   |type=    RBID
   |clé=     Pascal:06-0200198
   |texte=   Two template matching approaches to arabic, amharic and latin isolated characters recognition
}}

This area was generated with Dilib version V0.6.32.
Data generation: Sat Nov 11 16:53:45 2017. Site generation: Mon Mar 11 23:15:16 2024

OCR exploration server
Warning: this site is under development. It was generated automatically from raw corpora, so the information has not been validated.
