OCR exploration server

Warning: this site is under development!
Warning: this site was generated automatically from raw corpora.
The information has therefore not been validated.

A hidden Markov model-based character extraction method

Internal identifier: 000D48 (Main/Merge); previous: 000D47; next: 000D49

Authors: SONGTAO HUANG [Canada]; Majid Ahmadi [Canada]; M. A. Sid-Ahmed [Canada]

Source: Pattern recognition (ISSN 0031-3203), 2008

RBID : Pascal:08-0319208

French descriptors

English descriptors

Abstract

In this paper, a hidden Markov model (HMM)-based binarization algorithm is presented. This algorithm performs well for images with nonuniform background. To test the usefulness of the proposed technique, some images of composite documents of printed characters were used. These characters were extracted through the proposed binarization algorithms and used in a commercial OCR. A comparative study of various binarization techniques is also presented.
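
The abstract does not give the details of the algorithm, so the following is only a generic illustration of what HMM-based binarization can look like, not the authors' method. It is a minimal sketch that assumes a two-state HMM (0 = background, 1 = ink) over a single grey-level scanline, Gaussian emissions with fixed means, and Viterbi decoding. The function name `viterbi_binarize` and all parameter values (`means`, `std`, `stay_prob`) are hypothetical choices made for this example.

```python
import math


def viterbi_binarize(scanline, means=(200.0, 50.0), std=30.0, stay_prob=0.9):
    """Label each pixel of a 1-D scanline as background (0) or ink (1).

    Assumed model (illustrative only): two hidden states with Gaussian
    emissions centred on `means`, and a symmetric transition matrix that
    keeps the current state with probability `stay_prob`.
    """
    if not scanline:
        return []

    log_stay = math.log(stay_prob)
    log_switch = math.log(1.0 - stay_prob)

    def log_emit(value, state):
        # Gaussian log-likelihood with the constant term dropped.
        return -0.5 * ((value - means[state]) / std) ** 2

    # Initialisation: uniform prior over the two states.
    score = [math.log(0.5) + log_emit(scanline[0], s) for s in (0, 1)]
    back = []  # back[k][s] = best predecessor of state s at pixel k + 1

    # Forward pass: keep the best-scoring path into each state.
    for value in scanline[1:]:
        new_score, pointers = [], []
        for s in (0, 1):
            cands = [score[p] + (log_stay if p == s else log_switch)
                     for p in (0, 1)]
            best = 0 if cands[0] >= cands[1] else 1
            pointers.append(best)
            new_score.append(cands[best] + log_emit(value, s))
        back.append(pointers)
        score = new_score

    # Backward pass: follow the stored pointers from the best final state.
    state = 0 if score[0] >= score[1] else 1
    labels = [state]
    for pointers in reversed(back):
        state = pointers[state]
        labels.append(state)
    labels.reverse()
    return labels


clean = viterbi_binarize([200, 200, 200, 50, 50, 50, 200, 200])
noisy = viterbi_binarize([200, 200, 120, 200, 200])
```

In this sketch the transition probabilities are what separate the approach from a per-pixel threshold: the isolated value 120 in the second call is closer to the assumed ink mean than to the background mean, yet the decoded path keeps it as background because switching state twice costs more than one unlikely emission.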

Links toward previous steps (curation, corpus...)


Links to Exploration step

Pascal:08-0319208

The document in XML format

<record>
<TEI>
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en" level="a">A hidden Markov model-based character extraction method</title>
<author>
<name sortKey="Songtao Huang" sort="Songtao Huang" uniqKey="Songtao Huang" last="Songtao Huang">SONGTAO HUANG</name>
<affiliation wicri:level="1">
<inist:fA14 i1="01">
<s1>Electrical and Computer Engineering, University of Windsor, Essex Hall 401 Sunset Avenue</s1>
<s2>Windsor, Ontario, N9B 3P4</s2>
<s3>CAN</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
</inist:fA14>
<country>Canada</country>
<wicri:noRegion>Windsor, Ontario, N9B 3P4</wicri:noRegion>
</affiliation>
</author>
<author>
<name sortKey="Ahmadi, Majid" sort="Ahmadi, Majid" uniqKey="Ahmadi M" first="Majid" last="Ahmadi">Majid Ahmadi</name>
<affiliation wicri:level="1">
<inist:fA14 i1="01">
<s1>Electrical and Computer Engineering, University of Windsor, Essex Hall 401 Sunset Avenue</s1>
<s2>Windsor, Ontario, N9B 3P4</s2>
<s3>CAN</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
</inist:fA14>
<country>Canada</country>
<wicri:noRegion>Windsor, Ontario, N9B 3P4</wicri:noRegion>
</affiliation>
</author>
<author>
<name sortKey="Sid Ahmed, M A" sort="Sid Ahmed, M A" uniqKey="Sid Ahmed M" first="M. A." last="Sid-Ahmed">M. A. Sid-Ahmed</name>
<affiliation wicri:level="1">
<inist:fA14 i1="01">
<s1>Electrical and Computer Engineering, University of Windsor, Essex Hall 401 Sunset Avenue</s1>
<s2>Windsor, Ontario, N9B 3P4</s2>
<s3>CAN</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
</inist:fA14>
<country>Canada</country>
<wicri:noRegion>Windsor, Ontario, N9B 3P4</wicri:noRegion>
</affiliation>
</author>
</titleStmt>
<publicationStmt>
<idno type="wicri:source">INIST</idno>
<idno type="inist">08-0319208</idno>
<date when="2008">2008</date>
<idno type="stanalyst">PASCAL 08-0319208 INIST</idno>
<idno type="RBID">Pascal:08-0319208</idno>
<idno type="wicri:Area/PascalFrancis/Corpus">000278</idno>
<idno type="wicri:Area/PascalFrancis/Curation">000506</idno>
<idno type="wicri:Area/PascalFrancis/Checkpoint">000258</idno>
<idno type="wicri:doubleKey">0031-3203:2008:Songtao Huang:a:hidden:markov</idno>
<idno type="wicri:Area/Main/Merge">000D48</idno>
</publicationStmt>
<sourceDesc>
<biblStruct>
<analytic>
<title xml:lang="en" level="a">A hidden Markov model-based character extraction method</title>
<author>
<name sortKey="Songtao Huang" sort="Songtao Huang" uniqKey="Songtao Huang" last="Songtao Huang">SONGTAO HUANG</name>
<affiliation wicri:level="1">
<inist:fA14 i1="01">
<s1>Electrical and Computer Engineering, University of Windsor, Essex Hall 401 Sunset Avenue</s1>
<s2>Windsor, Ontario, N9B 3P4</s2>
<s3>CAN</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
</inist:fA14>
<country>Canada</country>
<wicri:noRegion>Windsor, Ontario, N9B 3P4</wicri:noRegion>
</affiliation>
</author>
<author>
<name sortKey="Ahmadi, Majid" sort="Ahmadi, Majid" uniqKey="Ahmadi M" first="Majid" last="Ahmadi">Majid Ahmadi</name>
<affiliation wicri:level="1">
<inist:fA14 i1="01">
<s1>Electrical and Computer Engineering, University of Windsor, Essex Hall 401 Sunset Avenue</s1>
<s2>Windsor, Ontario, N9B 3P4</s2>
<s3>CAN</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
</inist:fA14>
<country>Canada</country>
<wicri:noRegion>Windsor, Ontario, N9B 3P4</wicri:noRegion>
</affiliation>
</author>
<author>
<name sortKey="Sid Ahmed, M A" sort="Sid Ahmed, M A" uniqKey="Sid Ahmed M" first="M. A." last="Sid-Ahmed">M. A. Sid-Ahmed</name>
<affiliation wicri:level="1">
<inist:fA14 i1="01">
<s1>Electrical and Computer Engineering, University of Windsor, Essex Hall 401 Sunset Avenue</s1>
<s2>Windsor, Ontario, N9B 3P4</s2>
<s3>CAN</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
</inist:fA14>
<country>Canada</country>
<wicri:noRegion>Windsor, Ontario, N9B 3P4</wicri:noRegion>
</affiliation>
</author>
</analytic>
<series>
<title level="j" type="main">Pattern recognition</title>
<title level="j" type="abbreviated">Pattern recogn.</title>
<idno type="ISSN">0031-3203</idno>
<imprint>
<date when="2008">2008</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
<seriesStmt>
<title level="j" type="main">Pattern recognition</title>
<title level="j" type="abbreviated">Pattern recogn.</title>
<idno type="ISSN">0031-3203</idno>
</seriesStmt>
</fileDesc>
<profileDesc>
<textClass>
<keywords scheme="KwdEn" xml:lang="en">
<term>Algorithm</term>
<term>Background</term>
<term>Comparative study</term>
<term>Feature extraction</term>
<term>Hidden Markov models</term>
<term>Optical character recognition</term>
<term>Pattern extraction</term>
<term>Pattern recognition</term>
<term>Printed character</term>
<term>Probabilistic approach</term>
<term>Signal processing</term>
<term>Threshold detection</term>
</keywords>
<keywords scheme="Pascal" xml:lang="fr">
<term>Modèle Markov variable cachée</term>
<term>Extraction forme</term>
<term>Algorithme</term>
<term>Caractère imprimé</term>
<term>Reconnaissance optique caractère</term>
<term>Etude comparative</term>
<term>Détection seuil</term>
<term>Approche probabiliste</term>
<term>Extraction caractéristique</term>
<term>Reconnaissance forme</term>
<term>Traitement signal</term>
<term>Arrière plan</term>
</keywords>
</textClass>
</profileDesc>
</teiHeader>
<front>
<div type="abstract" xml:lang="en">In this paper, a hidden Markov model (HMM)-based binarization algorithm is presented. This algorithm performs well for images with nonuniform background. To test the usefulness of the proposed technique, some images of composite documents of printed characters were used. These characters were extracted through the proposed binarization algorithms and used in a commercial OCR. A comparative study of various binarization techniques is also presented.</div>
</front>
</TEI>
<affiliations>
<list>
<country>
<li>Canada</li>
</country>
</list>
<tree>
<country name="Canada">
<noRegion>
<name sortKey="Songtao Huang" sort="Songtao Huang" uniqKey="Songtao Huang" last="Songtao Huang">SONGTAO HUANG</name>
</noRegion>
<name sortKey="Ahmadi, Majid" sort="Ahmadi, Majid" uniqKey="Ahmadi M" first="Majid" last="Ahmadi">Majid Ahmadi</name>
<name sortKey="Sid Ahmed, M A" sort="Sid Ahmed, M A" uniqKey="Sid Ahmed M" first="M. A." last="Sid-Ahmed">M. A. Sid-Ahmed</name>
</country>
</tree>
</affiliations>
</record>

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Ticri/CIDE/explor/OcrV1/Data/Main/Merge
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000D48 | SxmlIndent | more

Or

HfdSelect -h $EXPLOR_AREA/Data/Main/Merge/biblio.hfd -nk 000D48 | SxmlIndent | more
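
Outside Dilib, a record of this shape can also be read with a general-purpose XML parser. The sketch below is an illustration under stated assumptions, not part of the Dilib toolchain: it uses Python's standard `xml.etree.ElementTree` on a trimmed excerpt of the record shown on this page. The `wicri:` and `inist:` prefixed elements are left out of the excerpt because the record as displayed does not declare those namespaces, and a strict parser would reject them. The function name `summarize_record` is hypothetical.

```python
import xml.etree.ElementTree as ET

# Trimmed excerpt of the record (assumption: prefixed wicri:/inist: elements
# removed, only two of the English keywords kept for brevity).
RECORD = """<record>
<TEI>
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en" level="a">A hidden Markov model-based character extraction method</title>
<author><name sortKey="Songtao Huang" last="Songtao Huang">SONGTAO HUANG</name></author>
<author><name sortKey="Ahmadi, Majid" first="Majid" last="Ahmadi">Majid Ahmadi</name></author>
<author><name sortKey="Sid Ahmed, M A" first="M. A." last="Sid-Ahmed">M. A. Sid-Ahmed</name></author>
</titleStmt>
<publicationStmt>
<idno type="RBID">Pascal:08-0319208</idno>
</publicationStmt>
</fileDesc>
<profileDesc>
<textClass>
<keywords scheme="KwdEn" xml:lang="en">
<term>Hidden Markov models</term>
<term>Optical character recognition</term>
</keywords>
</textClass>
</profileDesc>
</teiHeader>
</TEI>
</record>"""


def summarize_record(xml_text):
    """Return title, authors, RBID and English keywords from a record string."""
    root = ET.fromstring(xml_text)
    # Read title and authors from titleStmt only, since the full record
    # repeats them under sourceDesc.
    title_stmt = root.find("./TEI/teiHeader/fileDesc/titleStmt")
    return {
        "title": title_stmt.findtext("title"),
        "authors": [name.text for name in title_stmt.findall("author/name")],
        "rbid": root.findtext(".//idno[@type='RBID']"),
        "keywords_en": [term.text for term in
                        root.findall(".//keywords[@scheme='KwdEn']/term")],
    }


summary = summarize_record(RECORD)
```

To apply the same function to the complete record, the `wicri` and `inist` prefixes would first have to be bound to namespace URIs (or stripped), which this page does not specify.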

To link to this page within the Wicri network

{{Explor lien
   |wiki=    Ticri/CIDE
   |area=    OcrV1
   |flux=    Main
   |étape=   Merge
   |type=    RBID
   |clé=     Pascal:08-0319208
   |texte=   A hidden Markov model-based character extraction method
}}

Wicri

This area was generated with Dilib version V0.6.32.
Data generation: Sat Nov 11 16:53:45 2017. Site generation: Mon Mar 11 23:15:16 2024