OCR exploration server

Warning: this site is under development!
Warning: this site was generated automatically from raw corpora.
The information has therefore not been validated.

Non-uniform slant estimation and correction for Farsi/Arabic handwritten words

Internal identifier: 000191 (PascalFrancis/Corpus); previous: 000190; next: 000192

Authors: Majid Ziaratban; Karim Faez

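Source: International journal on document analysis and recognition (Print), ISSN 1433-2833, 2009, Vol. 12, No. 4, pp. 249-267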

RBID : Pascal:10-0182400

French descriptors

English descriptors

Abstract

Slant correction is an important part of the normalization task in OCR applications. Due to some special specifications of Farsi and Arabic manuscripts, conventional deslanting methods proposed for other languages do not work properly. In this paper, a fast method is first introduced to estimate the overall tilt of a handwritten word based on directional filters. After overall deslanting, a novel non-uniform slant estimation algorithm computes the remaining slant of each near-vertical stroke of the word, separately. Each candidate stroke is traced and its slant is calculated. A non-uniform slant correction algorithm is also proposed to reduce the remaining slants of each candidate stroke keeping the distortions of other strokes of the word at a minimum level. Thanks to the special characteristics of Farsi/Arabic scripts, slants are estimated in a specific strip of the written words. A comparison between our approach and three other prevalent methods is drawn. Experiments show that the proposed overall slant estimation method not only represents the least estimation error, but is also the fastest algorithm. The best results are achieved using the proposed overall and non-uniform deslanting methods. It is concluded that successful results can be achieved by considering the special specifications of these two languages.
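
To make the idea of overall deslanting concrete, here is a minimal Python sketch. It is not the directional-filter method of the paper, whose details this record does not give. It swaps in a generic, well-known technique instead: shear the binary word image by each candidate angle and keep the angle whose vertical projection profile is sharpest. The function names, the image representation (a list of rows of 0/1) and the candidate range of -45 to 45 degrees are all assumptions made for illustration.

```python
import math


def deslant(img, slant_deg):
    """Shear a binary image (list of rows of 0/1) so that strokes leaning
    slant_deg degrees to the right of vertical become upright."""
    h, w = len(img), len(img[0])
    t = math.tan(math.radians(slant_deg))
    pad = math.ceil(abs(t) * (h - 1))  # room for the largest row shift
    out = [[0] * (w + 2 * pad) for _ in range(h)]
    for y, row in enumerate(img):
        # Rows higher above the bottom row are shifted further back.
        shift = -round(t * (h - 1 - y))
        start = pad + shift
        out[y][start:start + w] = row
    return out


def estimate_overall_slant(img, candidates=range(-45, 46)):
    """Return the candidate angle (degrees) whose shear gives the most
    concentrated column histogram (generic heuristic, assumed here)."""
    def score(angle):
        column_sums = [sum(col) for col in zip(*deslant(img, angle))]
        return sum(c * c for c in column_sums)
    return max(candidates, key=score)
```

A single overall angle like this cannot capture strokes that lean differently within the same word, which is the situation the non-uniform estimation step described in the abstract is meant to handle.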

Record in standard format (ISO 2709)

See the documentation on the Inist Standard format.

pA  
A01 01  1    @0 1433-2833
A03   1    @0 Int. j. doc. anal. recognit. : (Print)
A05       @2 12
A06       @2 4
A08 01  1  ENG  @1 Non-uniform slant estimation and correction for Farsi/Arabic handwritten words
A11 01  1    @1 ZIARATBAN (Majid)
A11 02  1    @1 FAEZ (Karim)
A14 01      @1 Electrical Engineering Department, Amirkabir University of Technology, 424 Hafez Ave @2 15916-34311 Tehran @3 IRN @Z 1 aut. @Z 2 aut.
A20       @1 249-267
A21       @1 2009
A23 01      @0 ENG
A43 01      @1 INIST @2 26790 @5 354000171651020020
A44       @0 0000 @1 © 2010 INIST-CNRS. All rights reserved.
A45       @0 29 ref.
A47 01  1    @0 10-0182400
A60       @1 P
A61       @0 A
A64 01  1    @0 International journal on document analysis and recognition : (Print)
A66 01      @0 DEU
C01 01    ENG  @0 Slant correction is an important part of the normalization task in OCR applications. Due to some special specifications of Farsi and Arabic manuscripts, conventional deslanting methods proposed for other languages do not work properly. In this paper, a fast method is first introduced to estimate the overall tilt of a handwritten word based on directional filters. After overall deslanting, a novel non-uniform slant estimation algorithm computes the remaining slant of each near-vertical stroke of the word, separately. Each candidate stroke is traced and its slant is calculated. A non-uniform slant correction algorithm is also proposed to reduce the remaining slants of each candidate stroke keeping the distortions of other strokes of the word at a minimum level. Thanks to the special characteristics of Farsi/Arabic scripts, slants are estimated in a specific strip of the written words. A comparison between our approach and three other prevalent methods is drawn. Experiments show that the proposed overall slant estimation method not only represents the least estimation error, but is also the fastest algorithm. The best results are achieved using the proposed overall and non-uniform deslanting methods. It is concluded that successful results can be achieved by considering the special specifications of these two languages.
C02 01  X    @0 001D02C03
C02 02  X    @0 001D02B02
C03 01  X  FRE  @0 Reconnaissance optique caractère @5 06
C03 01  X  ENG  @0 Optical character recognition @5 06
C03 01  X  SPA  @0 Reconocimiento óptico de caracteres @5 06
C03 02  X  FRE  @0 Caractère manuscrit @5 07
C03 02  X  ENG  @0 Manuscript character @5 07
C03 02  X  SPA  @0 Carácter manuscrito @5 07
C03 03  X  FRE  @0 Reconnaissance caractère @5 08
C03 03  X  ENG  @0 Character recognition @5 08
C03 03  X  SPA  @0 Reconocimiento carácter @5 08
C03 04  X  FRE  @0 Langage spécification @5 09
C03 04  X  ENG  @0 Specification language @5 09
C03 04  X  SPA  @0 Lenguaje especificación @5 09
C03 05  X  FRE  @0 Arabe @5 18
C03 05  X  ENG  @0 Arabic @5 18
C03 05  X  SPA  @0 Árabe @5 18
C03 06  X  FRE  @0 Correction erreur @5 23
C03 06  X  ENG  @0 Error correction @5 23
C03 06  X  SPA  @0 Corrección error @5 23
C03 07  X  FRE  @0 Erreur estimation @5 24
C03 07  X  ENG  @0 Estimation error @5 24
C03 07  X  SPA  @0 Error estimación @5 24
C03 08  X  FRE  @0 Estimation erreur @5 25
C03 08  X  ENG  @0 Error estimation @5 25
C03 08  X  SPA  @0 Estimación error @5 25
N21       @1 123
N44 01      @1 OTO
N82       @1 OTO
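
The field listing above follows a visible pattern: a field tag, a few optional whitespace-separated tokens (apparently indicators and a language code), then one or more subfields introduced by `@` and a one-character code. The sketch below splits one such line into those parts. It is inferred only from the lines shown here, not from the Inist format documentation, and the function name `parse_field` is hypothetical. Because column positions are lost in this listing, the sketch does not try to say which token is which indicator.

```python
import re

# A subfield starts with "@", a one-character code, and a space.
SUBFIELD = re.compile(r"@(\w) ")


def parse_field(line):
    """Split one listing line into (tag, attribute tokens, subfields).

    Subfields are returned as a list of (code, value) pairs because a code
    may repeat within a field (for example two "@Z" entries).
    """
    head, *rest = SUBFIELD.split(line)
    tokens = head.split()
    tag, attrs = tokens[0], tokens[1:]
    codes = rest[0::2]
    values = [value.strip() for value in rest[1::2]]
    return tag, attrs, list(zip(codes, values))
```

For instance, `parse_field("A05       @2 12")` yields the tag `A05`, no attribute tokens, and the single subfield `("2", "12")`.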

Inist format (server)

NO : PASCAL 10-0182400 INIST
ET : Non-uniform slant estimation and correction for Farsi/Arabic handwritten words
AU : ZIARATBAN (Majid); FAEZ (Karim)
AF : Electrical Engineering Department, Amirkabir University of Technology, 424 Hafez Ave/15916-34311 Tehran/Iran (1 aut., 2 aut.)
DT : Serial publication; Analytic level
SO : International journal on document analysis and recognition : (Print); ISSN 1433-2833; Germany; Da. 2009; Vol. 12; No. 4; Pp. 249-267; Bibl. 29 ref.
LA : English
EA : Slant correction is an important part of the normalization task in OCR applications. Due to some special specifications of Farsi and Arabic manuscripts, conventional deslanting methods proposed for other languages do not work properly. In this paper, a fast method is first introduced to estimate the overall tilt of a handwritten word based on directional filters. After overall deslanting, a novel non-uniform slant estimation algorithm computes the remaining slant of each near-vertical stroke of the word, separately. Each candidate stroke is traced and its slant is calculated. A non-uniform slant correction algorithm is also proposed to reduce the remaining slants of each candidate stroke keeping the distortions of other strokes of the word at a minimum level. Thanks to the special characteristics of Farsi/Arabic scripts, slants are estimated in a specific strip of the written words. A comparison between our approach and three other prevalent methods is drawn. Experiments show that the proposed overall slant estimation method not only represents the least estimation error, but is also the fastest algorithm. The best results are achieved using the proposed overall and non-uniform deslanting methods. It is concluded that successful results can be achieved by considering the special specifications of these two languages.
CC : 001D02C03; 001D02B02
FD : Reconnaissance optique caractère; Caractère manuscrit; Reconnaissance caractère; Langage spécification; Arabe; Correction erreur; Erreur estimation; Estimation erreur
ED : Optical character recognition; Manuscript character; Character recognition; Specification language; Arabic; Error correction; Estimation error; Error estimation
SD : Reconocimiento óptico de caracteres; Carácter manuscrito; Reconocimiento carácter; Lenguaje especificación; Árabe; Corrección error; Error estimación; Estimación error
LO : INIST-26790.354000171651020020
ID : 10-0182400

Links to Exploration step

Pascal:10-0182400

The document in XML format

<record>
<TEI>
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en" level="a">Non-uniform slant estimation and correction for Farsi/Arabic handwritten words</title>
<author>
<name sortKey="Ziaratban, Majid" sort="Ziaratban, Majid" uniqKey="Ziaratban M" first="Majid" last="Ziaratban">Majid Ziaratban</name>
<affiliation>
<inist:fA14 i1="01">
<s1>Electrical Engineering Department, Amirkabir University of Technology, 424 Hafez Ave</s1>
<s2>15916-34311 Tehran</s2>
<s3>IRN</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
</inist:fA14>
</affiliation>
</author>
<author>
<name sortKey="Faez, Karim" sort="Faez, Karim" uniqKey="Faez K" first="Karim" last="Faez">Karim Faez</name>
<affiliation>
<inist:fA14 i1="01">
<s1>Electrical Engineering Department, Amirkabir University of Technology, 424 Hafez Ave</s1>
<s2>15916-34311 Tehran</s2>
<s3>IRN</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
</inist:fA14>
</affiliation>
</author>
</titleStmt>
<publicationStmt>
<idno type="wicri:source">INIST</idno>
<idno type="inist">10-0182400</idno>
<date when="2009">2009</date>
<idno type="stanalyst">PASCAL 10-0182400 INIST</idno>
<idno type="RBID">Pascal:10-0182400</idno>
<idno type="wicri:Area/PascalFrancis/Corpus">000191</idno>
</publicationStmt>
<sourceDesc>
<biblStruct>
<analytic>
<title xml:lang="en" level="a">Non-uniform slant estimation and correction for Farsi/Arabic handwritten words</title>
<author>
<name sortKey="Ziaratban, Majid" sort="Ziaratban, Majid" uniqKey="Ziaratban M" first="Majid" last="Ziaratban">Majid Ziaratban</name>
<affiliation>
<inist:fA14 i1="01">
<s1>Electrical Engineering Department, Amirkabir University of Technology, 424 Hafez Ave</s1>
<s2>15916-34311 Tehran</s2>
<s3>IRN</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
</inist:fA14>
</affiliation>
</author>
<author>
<name sortKey="Faez, Karim" sort="Faez, Karim" uniqKey="Faez K" first="Karim" last="Faez">Karim Faez</name>
<affiliation>
<inist:fA14 i1="01">
<s1>Electrical Engineering Department, Amirkabir University of Technology, 424 Hafez Ave</s1>
<s2>15916-34311 Tehran</s2>
<s3>IRN</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
</inist:fA14>
</affiliation>
</author>
</analytic>
<series>
<title level="j" type="main">International journal on document analysis and recognition : (Print)</title>
<title level="j" type="abbreviated">Int. j. doc. anal. recognit. : (Print)</title>
<idno type="ISSN">1433-2833</idno>
<imprint>
<date when="2009">2009</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
<seriesStmt>
<title level="j" type="main">International journal on document analysis and recognition : (Print)</title>
<title level="j" type="abbreviated">Int. j. doc. anal. recognit. : (Print)</title>
<idno type="ISSN">1433-2833</idno>
</seriesStmt>
</fileDesc>
<profileDesc>
<textClass>
<keywords scheme="KwdEn" xml:lang="en">
<term>Arabic</term>
<term>Character recognition</term>
<term>Error correction</term>
<term>Error estimation</term>
<term>Estimation error</term>
<term>Manuscript character</term>
<term>Optical character recognition</term>
<term>Specification language</term>
</keywords>
<keywords scheme="Pascal" xml:lang="fr">
<term>Reconnaissance optique caractère</term>
<term>Caractère manuscrit</term>
<term>Reconnaissance caractère</term>
<term>Langage spécification</term>
<term>Arabe</term>
<term>Correction erreur</term>
<term>Erreur estimation</term>
<term>Estimation erreur</term>
</keywords>
</textClass>
</profileDesc>
</teiHeader>
<front>
<div type="abstract" xml:lang="en">Slant correction is an important part of the normalization task in OCR applications. Due to some special specifications of Farsi and Arabic manuscripts, conventional deslanting methods proposed for other languages do not work properly. In this paper, a fast method is first introduced to estimate the overall tilt of a handwritten word based on directional filters. After overall deslanting, a novel non-uniform slant estimation algorithm computes the remaining slant of each near-vertical stroke of the word, separately. Each candidate stroke is traced and its slant is calculated. A non-uniform slant correction algorithm is also proposed to reduce the remaining slants of each candidate stroke keeping the distortions of other strokes of the word at a minimum level. Thanks to the special characteristics of Farsi/Arabic scripts, slants are estimated in a specific strip of the written words. A comparison between our approach and three other prevalent methods is drawn. Experiments show that the proposed overall slant estimation method not only represents the least estimation error, but is also the fastest algorithm. The best results are achieved using the proposed overall and non-uniform deslanting methods. It is concluded that successful results can be achieved by considering the special specifications of these two languages.</div>
</front>
</TEI>
<inist>
<standard h6="B">
<pA>
<fA01 i1="01" i2="1">
<s0>1433-2833</s0>
</fA01>
<fA03 i2="1">
<s0>Int. j. doc. anal. recognit. : (Print)</s0>
</fA03>
<fA05>
<s2>12</s2>
</fA05>
<fA06>
<s2>4</s2>
</fA06>
<fA08 i1="01" i2="1" l="ENG">
<s1>Non-uniform slant estimation and correction for Farsi/Arabic handwritten words</s1>
</fA08>
<fA11 i1="01" i2="1">
<s1>ZIARATBAN (Majid)</s1>
</fA11>
<fA11 i1="02" i2="1">
<s1>FAEZ (Karim)</s1>
</fA11>
<fA14 i1="01">
<s1>Electrical Engineering Department, Amirkabir University of Technology, 424 Hafez Ave</s1>
<s2>15916-34311 Tehran</s2>
<s3>IRN</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
</fA14>
<fA20>
<s1>249-267</s1>
</fA20>
<fA21>
<s1>2009</s1>
</fA21>
<fA23 i1="01">
<s0>ENG</s0>
</fA23>
<fA43 i1="01">
<s1>INIST</s1>
<s2>26790</s2>
<s5>354000171651020020</s5>
</fA43>
<fA44>
<s0>0000</s0>
<s1>© 2010 INIST-CNRS. All rights reserved.</s1>
</fA44>
<fA45>
<s0>29 ref.</s0>
</fA45>
<fA47 i1="01" i2="1">
<s0>10-0182400</s0>
</fA47>
<fA60>
<s1>P</s1>
</fA60>
<fA61>
<s0>A</s0>
</fA61>
<fA64 i1="01" i2="1">
<s0>International journal on document analysis and recognition : (Print)</s0>
</fA64>
<fA66 i1="01">
<s0>DEU</s0>
</fA66>
<fC01 i1="01" l="ENG">
<s0>Slant correction is an important part of the normalization task in OCR applications. Due to some special specifications of Farsi and Arabic manuscripts, conventional deslanting methods proposed for other languages do not work properly. In this paper, a fast method is first introduced to estimate the overall tilt of a handwritten word based on directional filters. After overall deslanting, a novel non-uniform slant estimation algorithm computes the remaining slant of each near-vertical stroke of the word, separately. Each candidate stroke is traced and its slant is calculated. A non-uniform slant correction algorithm is also proposed to reduce the remaining slants of each candidate stroke keeping the distortions of other strokes of the word at a minimum level. Thanks to the special characteristics of Farsi/Arabic scripts, slants are estimated in a specific strip of the written words. A comparison between our approach and three other prevalent methods is drawn. Experiments show that the proposed overall slant estimation method not only represents the least estimation error, but is also the fastest algorithm. The best results are achieved using the proposed overall and non-uniform deslanting methods. It is concluded that successful results can be achieved by considering the special specifications of these two languages.</s0>
</fC01>
<fC02 i1="01" i2="X">
<s0>001D02C03</s0>
</fC02>
<fC02 i1="02" i2="X">
<s0>001D02B02</s0>
</fC02>
<fC03 i1="01" i2="X" l="FRE">
<s0>Reconnaissance optique caractère</s0>
<s5>06</s5>
</fC03>
<fC03 i1="01" i2="X" l="ENG">
<s0>Optical character recognition</s0>
<s5>06</s5>
</fC03>
<fC03 i1="01" i2="X" l="SPA">
<s0>Reconocimiento óptico de caracteres</s0>
<s5>06</s5>
</fC03>
<fC03 i1="02" i2="X" l="FRE">
<s0>Caractère manuscrit</s0>
<s5>07</s5>
</fC03>
<fC03 i1="02" i2="X" l="ENG">
<s0>Manuscript character</s0>
<s5>07</s5>
</fC03>
<fC03 i1="02" i2="X" l="SPA">
<s0>Carácter manuscrito</s0>
<s5>07</s5>
</fC03>
<fC03 i1="03" i2="X" l="FRE">
<s0>Reconnaissance caractère</s0>
<s5>08</s5>
</fC03>
<fC03 i1="03" i2="X" l="ENG">
<s0>Character recognition</s0>
<s5>08</s5>
</fC03>
<fC03 i1="03" i2="X" l="SPA">
<s0>Reconocimiento carácter</s0>
<s5>08</s5>
</fC03>
<fC03 i1="04" i2="X" l="FRE">
<s0>Langage spécification</s0>
<s5>09</s5>
</fC03>
<fC03 i1="04" i2="X" l="ENG">
<s0>Specification language</s0>
<s5>09</s5>
</fC03>
<fC03 i1="04" i2="X" l="SPA">
<s0>Lenguaje especificación</s0>
<s5>09</s5>
</fC03>
<fC03 i1="05" i2="X" l="FRE">
<s0>Arabe</s0>
<s5>18</s5>
</fC03>
<fC03 i1="05" i2="X" l="ENG">
<s0>Arabic</s0>
<s5>18</s5>
</fC03>
<fC03 i1="05" i2="X" l="SPA">
<s0>Árabe</s0>
<s5>18</s5>
</fC03>
<fC03 i1="06" i2="X" l="FRE">
<s0>Correction erreur</s0>
<s5>23</s5>
</fC03>
<fC03 i1="06" i2="X" l="ENG">
<s0>Error correction</s0>
<s5>23</s5>
</fC03>
<fC03 i1="06" i2="X" l="SPA">
<s0>Corrección error</s0>
<s5>23</s5>
</fC03>
<fC03 i1="07" i2="X" l="FRE">
<s0>Erreur estimation</s0>
<s5>24</s5>
</fC03>
<fC03 i1="07" i2="X" l="ENG">
<s0>Estimation error</s0>
<s5>24</s5>
</fC03>
<fC03 i1="07" i2="X" l="SPA">
<s0>Error estimación</s0>
<s5>24</s5>
</fC03>
<fC03 i1="08" i2="X" l="FRE">
<s0>Estimation erreur</s0>
<s5>25</s5>
</fC03>
<fC03 i1="08" i2="X" l="ENG">
<s0>Error estimation</s0>
<s5>25</s5>
</fC03>
<fC03 i1="08" i2="X" l="SPA">
<s0>Estimación error</s0>
<s5>25</s5>
</fC03>
<fN21>
<s1>123</s1>
</fN21>
<fN44 i1="01">
<s1>OTO</s1>
</fN44>
<fN82>
<s1>OTO</s1>
</fN82>
</pA>
</standard>
<server>
<NO>PASCAL 10-0182400 INIST</NO>
<ET>Non-uniform slant estimation and correction for Farsi/Arabic handwritten words</ET>
<AU>ZIARATBAN (Majid); FAEZ (Karim)</AU>
<AF>Electrical Engineering Department, Amirkabir University of Technology, 424 Hafez Ave/15916-34311 Tehran/Iran (1 aut., 2 aut.)</AF>
<DT>Serial publication; Analytic level</DT>
<SO>International journal on document analysis and recognition : (Print); ISSN 1433-2833; Germany; Da. 2009; Vol. 12; No. 4; Pp. 249-267; Bibl. 29 ref.</SO>
<LA>English</LA>
<EA>Slant correction is an important part of the normalization task in OCR applications. Due to some special specifications of Farsi and Arabic manuscripts, conventional deslanting methods proposed for other languages do not work properly. In this paper, a fast method is first introduced to estimate the overall tilt of a handwritten word based on directional filters. After overall deslanting, a novel non-uniform slant estimation algorithm computes the remaining slant of each near-vertical stroke of the word, separately. Each candidate stroke is traced and its slant is calculated. A non-uniform slant correction algorithm is also proposed to reduce the remaining slants of each candidate stroke keeping the distortions of other strokes of the word at a minimum level. Thanks to the special characteristics of Farsi/Arabic scripts, slants are estimated in a specific strip of the written words. A comparison between our approach and three other prevalent methods is drawn. Experiments show that the proposed overall slant estimation method not only represents the least estimation error, but is also the fastest algorithm. The best results are achieved using the proposed overall and non-uniform deslanting methods. It is concluded that successful results can be achieved by considering the special specifications of these two languages.</EA>
<CC>001D02C03; 001D02B02</CC>
<FD>Reconnaissance optique caractère; Caractère manuscrit; Reconnaissance caractère; Langage spécification; Arabe; Correction erreur; Erreur estimation; Estimation erreur</FD>
<ED>Optical character recognition; Manuscript character; Character recognition; Specification language; Arabic; Error correction; Estimation error; Error estimation</ED>
<SD>Reconocimiento óptico de caracteres; Carácter manuscrito; Reconocimiento carácter; Lenguaje especificación; Árabe; Corrección error; Error estimación; Estimación error</SD>
<LO>INIST-26790.354000171651020020</LO>
<ID>10-0182400</ID>
</server>
</inist>
</record>
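
The record above can also be read with Python's standard library instead of Dilib. One catch: the `inist:` prefix on the affiliation elements is used without a namespace declaration, so a strict XML parser rejects the record as shown. The sketch below works around this by declaring the prefix on the root element before parsing. The namespace URI is a placeholder invented for this example, and the function names `load_record` and `summarize` are likewise hypothetical.

```python
import xml.etree.ElementTree as ET

# Placeholder URI (an assumption): the record does not declare one.
INIST_NS = "urn:example:inist"


def load_record(xml_text):
    """Parse a <record> after binding the otherwise undeclared inist: prefix."""
    patched = xml_text.replace(
        "<record>", f'<record xmlns:inist="{INIST_NS}">', 1
    )
    return ET.fromstring(patched)


def summarize(record):
    """Pull the title, authors and English keywords out of a parsed record."""
    server = record.find("inist/server")
    return {
        "title": server.findtext("ET"),
        "authors": [name.strip() for name in server.findtext("AU").split(";")],
        "keywords": [
            term.text
            for term in record.findall(".//keywords[@scheme='KwdEn']/term")
        ],
    }
```

Reading the title and authors from the `<server>` block keeps the example short; the same values also appear in the TEI header and in the `<standard>` block.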

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Ticri/CIDE/explor/OcrV1/Data/PascalFrancis/Corpus
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000191 | SxmlIndent | more

Or

HfdSelect -h $EXPLOR_AREA/Data/PascalFrancis/Corpus/biblio.hfd -nk 000191 | SxmlIndent | more

To add a link to this page within the Wicri network

{{Explor lien
   |wiki=    Ticri/CIDE
   |area=    OcrV1
   |flux=    PascalFrancis
   |étape=   Corpus
   |type=    RBID
   |clé=     Pascal:10-0182400
   |texte=   Non-uniform slant estimation and correction for Farsi/Arabic handwritten words
}}

Wicri

This area was generated with Dilib version V0.6.32.
Data generation: Sat Nov 11 16:53:45 2017. Site generation: Mon Mar 11 23:15:16 2024