Non-uniform slant estimation and correction for Farsi/Arabic handwritten words

Internal identifier: 000586 (PascalFrancis/Curation); previous: 000585; next: 000587

Authors: Majid Ziaratban [Iran]; Karim Faez [Iran]

Source:

International journal on document analysis and recognition : (Print) [ 1433-2833 ] ; 2009.

RBID : Pascal:10-0182400

French descriptors

Pascal (Inist)
- Reconnaissance optique caractère, Caractère manuscrit, Reconnaissance caractère, Langage spécification, Arabe, Correction erreur, Erreur estimation, Estimation erreur.

English descriptors

KwdEn :
- Arabic, Character recognition, Error correction, Error estimation, Estimation error, Manuscript character, Optical character recognition, Specification language.

Abstract

Slant correction is an important part of the normalization task in OCR applications. Because of some special characteristics of Farsi and Arabic manuscripts, conventional deslanting methods proposed for other languages do not work properly. In this paper, a fast method based on directional filters is first introduced to estimate the overall tilt of a handwritten word. After overall deslanting, a novel non-uniform slant estimation algorithm computes the remaining slant of each near-vertical stroke of the word separately. Each candidate stroke is traced and its slant is calculated. A non-uniform slant correction algorithm is also proposed to reduce the remaining slant of each candidate stroke while keeping the distortions of the other strokes of the word at a minimum level. Thanks to the special characteristics of Farsi/Arabic scripts, slants are estimated in a specific strip of the written words. A comparison between our approach and three other prevalent methods is drawn. Experiments show that the proposed overall slant estimation method not only yields the lowest estimation error but is also the fastest algorithm. The best results are achieved using the proposed overall and non-uniform deslanting methods. It is concluded that successful results can be achieved by considering the special characteristics of these two languages.
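The uniform ("overall") deslanting step mentioned in the abstract can be illustrated with a generic horizontal shear. This is only a minimal sketch, not the authors' algorithm: the abstract does not give the directional-filter estimator or the non-uniform correction, so the slant angle is simply taken as an input here. The function name `shear_deslant`, the binary-image layout, and the sign convention are all assumptions made for illustration.

```python
import numpy as np

def shear_deslant(img: np.ndarray, slant_deg: float) -> np.ndarray:
    """Undo a uniform slant in a binary word image by shearing rows.

    Assumptions (not taken from the paper):
      img       -- 2-D array, nonzero pixels are ink, row 0 is the top row
      slant_deg -- slant measured from the vertical, positive when strokes
                   lean right (stroke top lies to the right of its bottom)
    """
    h, w = img.shape
    t = np.tan(np.radians(slant_deg))
    # Row y lies (h - 1 - y) pixels above the bottom row, so a slanted
    # stroke is displaced horizontally by about t * (h - 1 - y) at that row.
    shifts = np.rint(t * (h - 1 - np.arange(h))).astype(int)
    pad = int(np.abs(shifts).max())
    # Widen the canvas so that no ink is pushed outside the image.
    out = np.zeros((h, w + 2 * pad), dtype=img.dtype)
    for y in range(h):
        x0 = pad - shifts[y]  # shift each row back by its displacement
        out[y, x0:x0 + w] = img[y]
    return out

# Usage: a synthetic 45-degree right-leaning stroke becomes vertical.
stroke = np.zeros((5, 5), dtype=np.uint8)
for row in range(5):
    stroke[row, 4 - row] = 1
print(shear_deslant(stroke, 45.0))
```

A single shear of this kind applies the same angle to every stroke. The non-uniform stage described in the abstract would instead estimate and correct the remaining slant per stroke, which this sketch does not attempt.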

A01	`01`	`1`		`@0 1433-2833`
A03		`1`		`@0 Int. j. doc. anal. recognit. : (Print)`
A05				`@2 12`
A06				`@2 4`
A08	`01`	`1`	`ENG`	`@1 Non-uniform slant estimation and correction for Farsi/Arabic handwritten words`
A11	`01`	`1`		`@1 ZIARATBAN (Majid)`
A11	`02`	`1`		`@1 FAEZ (Karim)`
A14	`01`			`@1 Electrical Engineering Department, Amirkabir University of Technology, 424 Hafez Ave @2 15916-34311 Tehran @3 IRN @Z 1 aut. @Z 2 aut.`
A20				`@1 249-267`
A21				`@1 2009`
A23	`01`			`@0 ENG`
A43	`01`			`@1 INIST @2 26790 @5 354000171651020020`
A44				`@0 0000 @1 © 2010 INIST-CNRS. All rights reserved.`
A45				`@0 29 ref.`
A47	`01`	`1`		`@0 10-0182400`
A60				`@1 P`
A61				`@0 A`
A64	`01`	`1`		`@0 International journal on document analysis and recognition : (Print)`
A66	`01`			`@0 DEU`
C01	`01`		`ENG`	@0 Slant correction is an important part of the normalization task in OCR applications. Due to some special specifications of Farsi and Arabic manuscripts, conventional deslanting methods proposed for other languages do not work properly. In this paper, a fast method is first introduced to estimate the overall tilt of a handwritten word based on directional filters. After overall deslanting, a novel non-uniform slant estimation algorithm computes the remaining slant of each near-vertical stroke of the word, separately. Each candidate stroke is traced and its slant is calculated. A non-uniform slant correction algorithm is also proposed to reduce the remaining slants of each candidate stroke keeping the distortions of other strokes of the word at a minimum level. Thanks to the special characteristics of Farsi/Arabic scripts, slants are estimated in a specific strip of the written words. A comparison between our approach and three other prevalent methods is drawn. Experiments show that the proposed overall slant estimation method not only represents the least estimation error, but is also the fastest algorithm. The best results are achieved using the proposed overall and non-uniform deslanting methods. It is concluded that successful results can be achieved by considering the special specifications of these two languages.
C02	`01`	`X`		`@0 001D02C03`
C02	`02`	`X`		`@0 001D02B02`
C03	`01`	`X`	`FRE`	`@0 Reconnaissance optique caractère @5 06`
C03	`01`	`X`	`ENG`	`@0 Optical character recognition @5 06`
C03	`01`	`X`	`SPA`	`@0 Reconocimiento óptico de caracteres @5 06`
C03	`02`	`X`	`FRE`	`@0 Caractère manuscrit @5 07`
C03	`02`	`X`	`ENG`	`@0 Manuscript character @5 07`
C03	`02`	`X`	`SPA`	`@0 Carácter manuscrito @5 07`
C03	`03`	`X`	`FRE`	`@0 Reconnaissance caractère @5 08`
C03	`03`	`X`	`ENG`	`@0 Character recognition @5 08`
C03	`03`	`X`	`SPA`	`@0 Reconocimiento carácter @5 08`
C03	`04`	`X`	`FRE`	`@0 Langage spécification @5 09`
C03	`04`	`X`	`ENG`	`@0 Specification language @5 09`
C03	`04`	`X`	`SPA`	`@0 Lenguaje especificación @5 09`
C03	`05`	`X`	`FRE`	`@0 Arabe @5 18`
C03	`05`	`X`	`ENG`	`@0 Arabic @5 18`
C03	`05`	`X`	`SPA`	`@0 Árabe @5 18`
C03	`06`	`X`	`FRE`	`@0 Correction erreur @5 23`
C03	`06`	`X`	`ENG`	`@0 Error correction @5 23`
C03	`06`	`X`	`SPA`	`@0 Corrección error @5 23`
C03	`07`	`X`	`FRE`	`@0 Erreur estimation @5 24`
C03	`07`	`X`	`ENG`	`@0 Estimation error @5 24`
C03	`07`	`X`	`SPA`	`@0 Error estimación @5 24`
C03	`08`	`X`	`FRE`	`@0 Estimation erreur @5 25`
C03	`08`	`X`	`ENG`	`@0 Error estimation @5 25`
C03	`08`	`X`	`SPA`	`@0 Estimación error @5 25`
N21				`@1 123`
N44	`01`			`@1 OTO`
N82				`@1 OTO`
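In the field listing above, each value is a sequence of subfields introduced by `@` and a one-character code (for example `@1 INIST @2 26790 @5 354000171651020020`). The following is a minimal sketch of how such a value could be split into (code, text) pairs. The helper name `split_subfields` is hypothetical, and the pattern is inferred only from the lines shown in this record, not from any official description of the format.

```python
import re

def split_subfields(value: str) -> list[tuple[str, str]]:
    """Split an '@x text @y text ...' string into (code, text) pairs.

    Assumption: a subfield starts with '@', one word character, then a
    space. Repeated codes (such as '@Z') are kept in order of appearance.
    """
    pairs = re.findall(r"@(\w)\s+(.*?)(?=\s+@\w\s|$)", value)
    return [(code, text.strip()) for code, text in pairs]

# Usage with the value of field A43 from the listing above.
for code, text in split_subfields("@1 INIST @2 26790 @5 354000171651020020"):
    print(code, text)
```

Returning a list of pairs rather than a dictionary keeps repeated subfields such as the two `@Z` entries of field A14.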

Links toward previous steps (curation, corpus...)

To stream PascalFrancis, to step Corpus: go to this record in the Curation step: 000191

Links to Exploration step

Pascal:10-0182400

The document in XML format

<record><TEI><teiHeader><fileDesc><titleStmt><title xml:lang="en" level="a">Non-uniform slant estimation and correction for Farsi/Arabic handwritten words</title>
<author><name sortKey="Ziaratban, Majid" sort="Ziaratban, Majid" uniqKey="Ziaratban M" first="Majid" last="Ziaratban">Majid Ziaratban</name>
<affiliation wicri:level="1"><inist:fA14 i1="01"><s1>Electrical Engineering Department, Amirkabir University of Technology, 424 Hafez Ave</s1>
<s2>15916-34311 Tehran</s2>
<s3>IRN</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
</inist:fA14>
<country>Iran</country>
</affiliation>
</author>
<author><name sortKey="Faez, Karim" sort="Faez, Karim" uniqKey="Faez K" first="Karim" last="Faez">Karim Faez</name>
<affiliation wicri:level="1"><inist:fA14 i1="01"><s1>Electrical Engineering Department, Amirkabir University of Technology, 424 Hafez Ave</s1>
<s2>15916-34311 Tehran</s2>
<s3>IRN</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
</inist:fA14>
<country>Iran</country>
</affiliation>
</author>
</titleStmt>
<publicationStmt><idno type="wicri:source">INIST</idno>
<idno type="inist">10-0182400</idno>
<date when="2009">2009</date>
<idno type="stanalyst">PASCAL 10-0182400 INIST</idno>
<idno type="RBID">Pascal:10-0182400</idno>
<idno type="wicri:Area/PascalFrancis/Corpus">000191</idno>
<idno type="wicri:Area/PascalFrancis/Curation">000586</idno>
</publicationStmt>
<sourceDesc><biblStruct><analytic><title xml:lang="en" level="a">Non-uniform slant estimation and correction for Farsi/Arabic handwritten words</title>
<author><name sortKey="Ziaratban, Majid" sort="Ziaratban, Majid" uniqKey="Ziaratban M" first="Majid" last="Ziaratban">Majid Ziaratban</name>
<affiliation wicri:level="1"><inist:fA14 i1="01"><s1>Electrical Engineering Department, Amirkabir University of Technology, 424 Hafez Ave</s1>
<s2>15916-34311 Tehran</s2>
<s3>IRN</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
</inist:fA14>
<country>Iran</country>
</affiliation>
</author>
<author><name sortKey="Faez, Karim" sort="Faez, Karim" uniqKey="Faez K" first="Karim" last="Faez">Karim Faez</name>
<affiliation wicri:level="1"><inist:fA14 i1="01"><s1>Electrical Engineering Department, Amirkabir University of Technology, 424 Hafez Ave</s1>
<s2>15916-34311 Tehran</s2>
<s3>IRN</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
</inist:fA14>
<country>Iran</country>
</affiliation>
</author>
</analytic>
<series><title level="j" type="main">International journal on document analysis and recognition : (Print)</title>
<title level="j" type="abbreviated">Int. j. doc. anal. recognit. : (Print)</title>
<idno type="ISSN">1433-2833</idno>
<imprint><date when="2009">2009</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
<seriesStmt><title level="j" type="main">International journal on document analysis and recognition : (Print)</title>
<title level="j" type="abbreviated">Int. j. doc. anal. recognit. : (Print)</title>
<idno type="ISSN">1433-2833</idno>
</seriesStmt>
</fileDesc>
<profileDesc><textClass><keywords scheme="KwdEn" xml:lang="en"><term>Arabic</term>
<term>Character recognition</term>
<term>Error correction</term>
<term>Error estimation</term>
<term>Estimation error</term>
<term>Manuscript character</term>
<term>Optical character recognition</term>
<term>Specification language</term>
</keywords>
<keywords scheme="Pascal" xml:lang="fr"><term>Reconnaissance optique caractère</term>
<term>Caractère manuscrit</term>
<term>Reconnaissance caractère</term>
<term>Langage spécification</term>
<term>Arabe</term>
<term>Correction erreur</term>
<term>Erreur estimation</term>
<term>Estimation erreur</term>
</keywords>
</textClass>
</profileDesc>
</teiHeader>
<front><div type="abstract" xml:lang="en">Slant correction is an important part of the normalization task in OCR applications. Due to some special specifications of Farsi and Arabic manuscripts, conventional deslanting methods proposed for other languages do not work properly. In this paper, a fast method is first introduced to estimate the overall tilt of a handwritten word based on directional filters. After overall deslanting, a novel non-uniform slant estimation algorithm computes the remaining slant of each near-vertical stroke of the word, separately. Each candidate stroke is traced and its slant is calculated. A non-uniform slant correction algorithm is also proposed to reduce the remaining slants of each candidate stroke keeping the distortions of other strokes of the word at a minimum level. Thanks to the special characteristics of Farsi/Arabic scripts, slants are estimated in a specific strip of the written words. A comparison between our approach and three other prevalent methods is drawn. Experiments show that the proposed overall slant estimation method not only represents the least estimation error, but is also the fastest algorithm. The best results are achieved using the proposed overall and non-uniform deslanting methods. It is concluded that successful results can be achieved by considering the special specifications of these two languages.</div>
</front>
</TEI>
<inist><standard h6="B"><pA><fA01 i1="01" i2="1"><s0>1433-2833</s0>
</fA01>
<fA03 i2="1"><s0>Int. j. doc. anal. recognit. : (Print)</s0>
</fA03>
<fA05><s2>12</s2>
</fA05>
<fA06><s2>4</s2>
</fA06>
<fA08 i1="01" i2="1" l="ENG"><s1>Non-uniform slant estimation and correction for Farsi/Arabic handwritten words</s1>
</fA08>
<fA11 i1="01" i2="1"><s1>ZIARATBAN (Majid)</s1>
</fA11>
<fA11 i1="02" i2="1"><s1>FAEZ (Karim)</s1>
</fA11>
<fA14 i1="01"><s1>Electrical Engineering Department, Amirkabir University of Technology, 424 Hafez Ave</s1>
<s2>15916-34311 Tehran</s2>
<s3>IRN</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
</fA14>
<fA20><s1>249-267</s1>
</fA20>
<fA21><s1>2009</s1>
</fA21>
<fA23 i1="01"><s0>ENG</s0>
</fA23>
<fA43 i1="01"><s1>INIST</s1>
<s2>26790</s2>
<s5>354000171651020020</s5>
</fA43>
<fA44><s0>0000</s0>
<s1>© 2010 INIST-CNRS. All rights reserved.</s1>
</fA44>
<fA45><s0>29 ref.</s0>
</fA45>
<fA47 i1="01" i2="1"><s0>10-0182400</s0>
</fA47>
<fA60><s1>P</s1>
</fA60>
<fA61><s0>A</s0>
</fA61>
<fA64 i1="01" i2="1"><s0>International journal on document analysis and recognition : (Print)</s0>
</fA64>
<fA66 i1="01"><s0>DEU</s0>
</fA66>
<fC01 i1="01" l="ENG"><s0>Slant correction is an important part of the normalization task in OCR applications. Due to some special specifications of Farsi and Arabic manuscripts, conventional deslanting methods proposed for other languages do not work properly. In this paper, a fast method is first introduced to estimate the overall tilt of a handwritten word based on directional filters. After overall deslanting, a novel non-uniform slant estimation algorithm computes the remaining slant of each near-vertical stroke of the word, separately. Each candidate stroke is traced and its slant is calculated. A non-uniform slant correction algorithm is also proposed to reduce the remaining slants of each candidate stroke keeping the distortions of other strokes of the word at a minimum level. Thanks to the special characteristics of Farsi/Arabic scripts, slants are estimated in a specific strip of the written words. A comparison between our approach and three other prevalent methods is drawn. Experiments show that the proposed overall slant estimation method not only represents the least estimation error, but is also the fastest algorithm. The best results are achieved using the proposed overall and non-uniform deslanting methods. It is concluded that successful results can be achieved by considering the special specifications of these two languages.</s0>
</fC01>
<fC02 i1="01" i2="X"><s0>001D02C03</s0>
</fC02>
<fC02 i1="02" i2="X"><s0>001D02B02</s0>
</fC02>
<fC03 i1="01" i2="X" l="FRE"><s0>Reconnaissance optique caractère</s0>
<s5>06</s5>
</fC03>
<fC03 i1="01" i2="X" l="ENG"><s0>Optical character recognition</s0>
<s5>06</s5>
</fC03>
<fC03 i1="01" i2="X" l="SPA"><s0>Reconocimiento óptico de caracteres</s0>
<s5>06</s5>
</fC03>
<fC03 i1="02" i2="X" l="FRE"><s0>Caractère manuscrit</s0>
<s5>07</s5>
</fC03>
<fC03 i1="02" i2="X" l="ENG"><s0>Manuscript character</s0>
<s5>07</s5>
</fC03>
<fC03 i1="02" i2="X" l="SPA"><s0>Carácter manuscrito</s0>
<s5>07</s5>
</fC03>
<fC03 i1="03" i2="X" l="FRE"><s0>Reconnaissance caractère</s0>
<s5>08</s5>
</fC03>
<fC03 i1="03" i2="X" l="ENG"><s0>Character recognition</s0>
<s5>08</s5>
</fC03>
<fC03 i1="03" i2="X" l="SPA"><s0>Reconocimiento carácter</s0>
<s5>08</s5>
</fC03>
<fC03 i1="04" i2="X" l="FRE"><s0>Langage spécification</s0>
<s5>09</s5>
</fC03>
<fC03 i1="04" i2="X" l="ENG"><s0>Specification language</s0>
<s5>09</s5>
</fC03>
<fC03 i1="04" i2="X" l="SPA"><s0>Lenguaje especificación</s0>
<s5>09</s5>
</fC03>
<fC03 i1="05" i2="X" l="FRE"><s0>Arabe</s0>
<s5>18</s5>
</fC03>
<fC03 i1="05" i2="X" l="ENG"><s0>Arabic</s0>
<s5>18</s5>
</fC03>
<fC03 i1="05" i2="X" l="SPA"><s0>Árabe</s0>
<s5>18</s5>
</fC03>
<fC03 i1="06" i2="X" l="FRE"><s0>Correction erreur</s0>
<s5>23</s5>
</fC03>
<fC03 i1="06" i2="X" l="ENG"><s0>Error correction</s0>
<s5>23</s5>
</fC03>
<fC03 i1="06" i2="X" l="SPA"><s0>Corrección error</s0>
<s5>23</s5>
</fC03>
<fC03 i1="07" i2="X" l="FRE"><s0>Erreur estimation</s0>
<s5>24</s5>
</fC03>
<fC03 i1="07" i2="X" l="ENG"><s0>Estimation error</s0>
<s5>24</s5>
</fC03>
<fC03 i1="07" i2="X" l="SPA"><s0>Error estimación</s0>
<s5>24</s5>
</fC03>
<fC03 i1="08" i2="X" l="FRE"><s0>Estimation erreur</s0>
<s5>25</s5>
</fC03>
<fC03 i1="08" i2="X" l="ENG"><s0>Error estimation</s0>
<s5>25</s5>
</fC03>
<fC03 i1="08" i2="X" l="SPA"><s0>Estimación error</s0>
<s5>25</s5>
</fC03>
<fN21><s1>123</s1>
</fN21>
<fN44 i1="01"><s1>OTO</s1>
</fN44>
<fN82><s1>OTO</s1>
</fN82>
</pA>
</standard>
</inist>
</record>
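
The `<inist>` part of the record above can be read with a standard XML parser. The sketch below uses Python's `xml.etree.ElementTree` on a short fragment copied from that part; the function name `summarize` and the choice of fields are illustrative assumptions. Note that the TEI part uses the `wicri:` and `inist:` prefixes without any namespace declaration visible in this dump, so a namespace-aware parser would likely reject the full record unless those prefixes are declared first.

```python
import xml.etree.ElementTree as ET

# Fragment copied from the <inist> section of the record; it contains no
# namespace prefixes, so the standard-library parser accepts it as-is.
INIST_FRAGMENT = """
<pA>
<fA08 i1="01" i2="1" l="ENG"><s1>Non-uniform slant estimation and correction for Farsi/Arabic handwritten words</s1>
</fA08>
<fA11 i1="01" i2="1"><s1>ZIARATBAN (Majid)</s1>
</fA11>
<fA11 i1="02" i2="1"><s1>FAEZ (Karim)</s1>
</fA11>
<fA20><s1>249-267</s1>
</fA20>
<fA21><s1>2009</s1>
</fA21>
</pA>
"""

def summarize(fragment: str) -> dict:
    """Collect title, authors, pages and year from an INIST <pA> block."""
    root = ET.fromstring(fragment.strip())
    return {
        "title": root.findtext("fA08/s1"),
        "authors": [a.findtext("s1") for a in root.findall("fA11")],
        "pages": root.findtext("fA20/s1"),
        "year": root.findtext("fA21/s1"),
    }

info = summarize(INIST_FRAGMENT)
print(info["authors"], info["pages"], info["year"])
```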

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Ticri/CIDE/explor/OcrV1/Data/PascalFrancis/Curation

HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000586 | SxmlIndent | more

HfdSelect -h $EXPLOR_AREA/Data/PascalFrancis/Curation/biblio.hfd -nk 000586 | SxmlIndent | more

To link to this page within the Wicri network

{{Explor lien
   |wiki=    Ticri/CIDE
   |area=    OcrV1
   |flux=    PascalFrancis
   |étape=   Curation
   |type=    RBID
   |clé=     Pascal:10-0182400
   |texte=   Non-uniform slant estimation and correction for Farsi/Arabic handwritten words
}}

This area was generated with Dilib version V0.6.32.
Data generation: Sat Nov 11 16:53:45 2017. Site generation: Mon Mar 11 23:15:16 2024

	OCR exploration server
	Warning: this site is under development! It was generated by automated means from raw corpora, so the information has not been validated.
