Offline Handwritten Arabic Character Segmentation with Probabilistic Model
Internal identifier: 001051 ( Main/Merge ); previous: 001050; next: 001052
Authors: Pingping Xiu [People's Republic of China]; Liangrui Peng [People's Republic of China]; Xiaoqing Ding [People's Republic of China]; Hua Wang [People's Republic of China]
Source:
- Lecture Notes in Computer Science [ 0302-9743 ] ; 2006.
Abstract
Research on offline handwritten Arabic character recognition has received increasing attention in recent years because of the growing need for Arabic document digitization. Variation in Arabic handwriting makes character segmentation and recognition very difficult; e.g., the sub-parts (diacritics) of an Arabic character may shift away from the main part. In this paper, a new probabilistic segmentation model is proposed. First, a contour-based over-segmentation method cuts the word image into graphemes. The graphemes are sorted into three queues: character main parts, sub-parts (diacritics) above the main parts, and sub-parts (diacritics) below the main parts. The confidence for each character is calculated by the probabilistic model, taking into account the recognizer output, the geometric confidence, and logical constraints. Then, a global optimization is conducted to find the optimal cutting path, taking the weighted average of character confidences as the objective function. Experiments on handwritten Arabic documents with various writing styles show that the proposed method is effective.
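The global optimization step in the abstract can be illustrated with a small dynamic program. This is only a sketch under stated assumptions, not the authors' implementation: the abstract does not give the weights of the weighted average, so each character is weighted here by the number of graphemes it spans; the `confidence` callback is a hypothetical stand-in for the probabilistic model (recognizer output, geometric confidence, logical constraints); and only a single queue of main-part graphemes is handled, with the diacritic queues left out. The names `best_cut_path` and `max_span` are illustrative.

```python
from typing import Callable, List, Tuple


def best_cut_path(
    n_graphemes: int,
    confidence: Callable[[int, int], float],
    max_span: int = 3,
) -> Tuple[float, List[Tuple[int, int]]]:
    """Return (weighted-average confidence, list of [start, end) spans).

    Assumption: each candidate character is weighted by its grapheme count.
    The weights then always sum to n_graphemes, so maximizing the weighted
    average is the same as maximizing sum(span_length * confidence).
    """
    if n_graphemes == 0:
        return 0.0, []

    # best[j]: best weighted sum over all segmentations of graphemes[0:j]
    best = [float("-inf")] * (n_graphemes + 1)
    # back[j]: start index of the last character in that best segmentation
    back = [0] * (n_graphemes + 1)
    best[0] = 0.0

    for end in range(1, n_graphemes + 1):
        # A character may merge at most max_span consecutive graphemes.
        for start in range(max(0, end - max_span), end):
            score = best[start] + (end - start) * confidence(start, end)
            if score > best[end]:
                best[end] = score
                back[end] = start

    # Walk the back-pointers to recover the cutting path.
    spans: List[Tuple[int, int]] = []
    end = n_graphemes
    while end > 0:
        spans.append((back[end], end))
        end = back[end]
    spans.reverse()
    return best[n_graphemes] / n_graphemes, spans
```

Called with a confidence function that scores graphemes 0 and 1 highly as one merged character, the search returns a path that joins those two graphemes and keeps the third separate. Any per-character scoring that depends only on the span can be plugged in the same way.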
DOI: 10.1007/11669487_36
Links toward previous steps (curation, corpus...)
- to stream Istex, to step Corpus: 000695
- to stream Istex, to step Curation: 000687
- to stream Istex, to step Checkpoint: 000A07
Links to Exploration step
ISTEX:E97F72643B849104A483AD5BE74808CEC4A1E1BB
The document in XML format
<record><TEI wicri:istexFullTextTei="biblStruct"><teiHeader><fileDesc><titleStmt><title xml:lang="en">Offline Handwritten Arabic Character Segmentation with Probabilistic Model</title>
<author><name sortKey="Xiu, Pingping" sort="Xiu, Pingping" uniqKey="Xiu P" first="Pingping" last="Xiu">Pingping Xiu</name>
</author>
<author><name sortKey="Peng, Liangrui" sort="Peng, Liangrui" uniqKey="Peng L" first="Liangrui" last="Peng">Liangrui Peng</name>
</author>
<author><name sortKey="Ding, Xiaoqing" sort="Ding, Xiaoqing" uniqKey="Ding X" first="Xiaoqing" last="Ding">Xiaoqing Ding</name>
</author>
<author><name sortKey="Wang, Hua" sort="Wang, Hua" uniqKey="Wang H" first="Hua" last="Wang">Hua Wang</name>
</author>
</titleStmt>
<publicationStmt><idno type="wicri:source">ISTEX</idno>
<idno type="RBID">ISTEX:E97F72643B849104A483AD5BE74808CEC4A1E1BB</idno>
<date when="2006" year="2006">2006</date>
<idno type="doi">10.1007/11669487_36</idno>
<idno type="url">https://api.istex.fr/document/E97F72643B849104A483AD5BE74808CEC4A1E1BB/fulltext/pdf</idno>
<idno type="wicri:Area/Istex/Corpus">000695</idno>
<idno type="wicri:Area/Istex/Curation">000687</idno>
<idno type="wicri:Area/Istex/Checkpoint">000A07</idno>
<idno type="wicri:doubleKey">0302-9743:2006:Xiu P:offline:handwritten:arabic</idno>
<idno type="wicri:Area/Main/Merge">001051</idno>
</publicationStmt>
<sourceDesc><biblStruct><analytic><title level="a" type="main" xml:lang="en">Offline Handwritten Arabic Character Segmentation with Probabilistic Model</title>
<author><name sortKey="Xiu, Pingping" sort="Xiu, Pingping" uniqKey="Xiu P" first="Pingping" last="Xiu">Pingping Xiu</name>
<affiliation wicri:level="3"><country xml:lang="fr">République populaire de Chine</country>
<wicri:regionArea>Dept. of Electronic Engineering, Tsinghua University, State Key Laboratory of Intelligent Technology and Systems, 100084, Beijing</wicri:regionArea>
<placeName><settlement type="city">Pékin</settlement>
</placeName>
</affiliation>
<affiliation wicri:level="1"><country wicri:rule="url">République populaire de Chine</country>
</affiliation>
</author>
<author><name sortKey="Peng, Liangrui" sort="Peng, Liangrui" uniqKey="Peng L" first="Liangrui" last="Peng">Liangrui Peng</name>
<affiliation wicri:level="3"><country xml:lang="fr">République populaire de Chine</country>
<wicri:regionArea>Dept. of Electronic Engineering, Tsinghua University, State Key Laboratory of Intelligent Technology and Systems, 100084, Beijing</wicri:regionArea>
<placeName><settlement type="city">Pékin</settlement>
</placeName>
</affiliation>
<affiliation wicri:level="1"><country wicri:rule="url">République populaire de Chine</country>
</affiliation>
</author>
<author><name sortKey="Ding, Xiaoqing" sort="Ding, Xiaoqing" uniqKey="Ding X" first="Xiaoqing" last="Ding">Xiaoqing Ding</name>
<affiliation wicri:level="3"><country xml:lang="fr">République populaire de Chine</country>
<wicri:regionArea>Dept. of Electronic Engineering, Tsinghua University, State Key Laboratory of Intelligent Technology and Systems, 100084, Beijing</wicri:regionArea>
<placeName><settlement type="city">Pékin</settlement>
</placeName>
</affiliation>
<affiliation wicri:level="1"><country wicri:rule="url">République populaire de Chine</country>
</affiliation>
</author>
<author><name sortKey="Wang, Hua" sort="Wang, Hua" uniqKey="Wang H" first="Hua" last="Wang">Hua Wang</name>
<affiliation wicri:level="3"><country xml:lang="fr">République populaire de Chine</country>
<wicri:regionArea>Dept. of Electronic Engineering, Tsinghua University, State Key Laboratory of Intelligent Technology and Systems, 100084, Beijing</wicri:regionArea>
<placeName><settlement type="city">Pékin</settlement>
</placeName>
</affiliation>
<affiliation wicri:level="1"><country wicri:rule="url">République populaire de Chine</country>
</affiliation>
</author>
</analytic>
<monogr></monogr>
<series><title level="s">Lecture Notes in Computer Science</title>
<imprint><date>2006</date>
</imprint>
<idno type="ISSN">0302-9743</idno>
<idno type="eISSN">1611-3349</idno>
<idno type="ISSN">0302-9743</idno>
</series>
<idno type="istex">E97F72643B849104A483AD5BE74808CEC4A1E1BB</idno>
<idno type="DOI">10.1007/11669487_36</idno>
<idno type="ChapterID">36</idno>
<idno type="ChapterID">Chap36</idno>
</biblStruct>
</sourceDesc>
<seriesStmt><idno type="ISSN">0302-9743</idno>
</seriesStmt>
</fileDesc>
<profileDesc><textClass></textClass>
<langUsage><language ident="en">en</language>
</langUsage>
</profileDesc>
</teiHeader>
<front><div type="abstract" xml:lang="en">Abstract: Research on offline handwritten Arabic character recognition has received increasing attention in recent years because of the growing need for Arabic document digitization. Variation in Arabic handwriting makes character segmentation and recognition very difficult; e.g., the sub-parts (diacritics) of an Arabic character may shift away from the main part. In this paper, a new probabilistic segmentation model is proposed. First, a contour-based over-segmentation method cuts the word image into graphemes. The graphemes are sorted into three queues: character main parts, sub-parts (diacritics) above the main parts, and sub-parts (diacritics) below the main parts. The confidence for each character is calculated by the probabilistic model, taking into account the recognizer output, the geometric confidence, and logical constraints. Then, a global optimization is conducted to find the optimal cutting path, taking the weighted average of character confidences as the objective function. Experiments on handwritten Arabic documents with various writing styles show that the proposed method is effective.</div>
</front>
</TEI>
</record>
To manipulate this document under Unix (Dilib)
EXPLOR_STEP=$WICRI_ROOT/Ticri/CIDE/explor/OcrV1/Data/Main/Merge
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 001051 | SxmlIndent | more
Or
HfdSelect -h $EXPLOR_AREA/Data/Main/Merge/biblio.hfd -nk 001051 | SxmlIndent | more
To link to this page within the Wicri network
{{Explor lien |wiki= Ticri/CIDE |area= OcrV1 |flux= Main |étape= Merge |type= RBID |clé= ISTEX:E97F72643B849104A483AD5BE74808CEC4A1E1BB |texte= Offline Handwritten Arabic Character Segmentation with Probabilistic Model }}
This area was generated with Dilib version V0.6.32.