Text/Graphics Separation in Maps
Internal identifier: 001926 (Main/Merge); previous: 001925; next: 001927
Authors: Ruini Cao [Singapore]; Lim Tan [Singapore]
Source:
- Lecture Notes in Computer Science [0302-9743]; 2002.
Abstract
The separation of overlapping text and graphics is a challenging problem in document image analysis. This paper proposes a specific method of detecting and extracting characters that are touching graphics. It is based on the observation that the constituent strokes of characters are usually short segments in comparison with those of graphics. It combines line continuation with the feature line width to decompose and reconstruct segments underlying the region of intersection. Experimental results showed that the proposed method improved the percentage of correctly detected text as well as the accuracy of character recognition significantly.
Url: https://api.istex.fr/document/2806202E744FA13270D3CC536B7030DC32E25B0A/fulltext/pdf
DOI: 10.1007/3-540-45868-9_14
Links to previous steps (curation, corpus...)
- to stream Istex, to step Corpus: 001203
- to stream Istex, to step Curation: 001130
- to stream Istex, to step Checkpoint: 000F60
Links to Exploration step
ISTEX:2806202E744FA13270D3CC536B7030DC32E25B0A
The document in XML format
<record><TEI wicri:istexFullTextTei="biblStruct"><teiHeader><fileDesc><titleStmt><title xml:lang="en">Text/Graphics Separation in Maps</title>
<author><name sortKey="Cao, Ruini" sort="Cao, Ruini" uniqKey="Cao R" first="Ruini" last="Cao">Ruini Cao</name>
</author>
<author><name sortKey="Tan, Lim" sort="Tan, Lim" uniqKey="Tan L" first="Lim" last="Tan">Lim Tan</name>
</author>
</titleStmt>
<publicationStmt><idno type="wicri:source">ISTEX</idno>
<idno type="RBID">ISTEX:2806202E744FA13270D3CC536B7030DC32E25B0A</idno>
<date when="2002" year="2002">2002</date>
<idno type="doi">10.1007/3-540-45868-9_14</idno>
<idno type="url">https://api.istex.fr/document/2806202E744FA13270D3CC536B7030DC32E25B0A/fulltext/pdf</idno>
<idno type="wicri:Area/Istex/Corpus">001203</idno>
<idno type="wicri:Area/Istex/Curation">001130</idno>
<idno type="wicri:Area/Istex/Checkpoint">000F60</idno>
<idno type="wicri:doubleKey">0302-9743:2002:Cao R:text:graphics:separation</idno>
<idno type="wicri:Area/Main/Merge">001926</idno>
</publicationStmt>
<sourceDesc><biblStruct><analytic><title level="a" type="main" xml:lang="en">Text/Graphics Separation in Maps</title>
<author><name sortKey="Cao, Ruini" sort="Cao, Ruini" uniqKey="Cao R" first="Ruini" last="Cao">Ruini Cao</name>
<affiliation wicri:level="4"><country xml:lang="fr">Singapour</country>
<wicri:regionArea>School of Computing, National University of Singapore, 3 Science Drive 2, 117543</wicri:regionArea>
<orgName type="university">Université nationale de Singapour</orgName>
</affiliation>
<affiliation wicri:level="1"><country wicri:rule="url">Singapour</country>
</affiliation>
</author>
<author><name sortKey="Tan, Lim" sort="Tan, Lim" uniqKey="Tan L" first="Lim" last="Tan">Lim Tan</name>
<affiliation wicri:level="4"><country xml:lang="fr">Singapour</country>
<wicri:regionArea>School of Computing, National University of Singapore, 3 Science Drive 2, 117543</wicri:regionArea>
<orgName type="university">Université nationale de Singapour</orgName>
</affiliation>
<affiliation wicri:level="1"><country wicri:rule="url">Singapour</country>
</affiliation>
</author>
</analytic>
<monogr></monogr>
<series><title level="s">Lecture Notes in Computer Science</title>
<imprint><date>2002</date>
</imprint>
<idno type="ISSN">0302-9743</idno>
<idno type="ISSN">0302-9743</idno>
</series>
<idno type="istex">2806202E744FA13270D3CC536B7030DC32E25B0A</idno>
<idno type="DOI">10.1007/3-540-45868-9_14</idno>
<idno type="ChapterID">14</idno>
<idno type="ChapterID">Chap14</idno>
</biblStruct>
</sourceDesc>
<seriesStmt><idno type="ISSN">0302-9743</idno>
</seriesStmt>
</fileDesc>
<profileDesc><textClass></textClass>
<langUsage><language ident="en">en</language>
</langUsage>
</profileDesc>
</teiHeader>
<front><div type="abstract" xml:lang="en">Abstract: The separation of overlapping text and graphics is a challenging problem in document image analysis. This paper proposes a specific method of detecting and extracting characters that are touching graphics. It is based on the observation that the constituent strokes of characters are usually short segments in comparison with those of graphics. It combines line continuation with the feature line width to decompose and reconstruct segments underlying the region of intersection. Experimental results showed that the proposed method improved the percentage of correctly detected text as well as the accuracy of character recognition significantly.</div>
</front>
</TEI>
</record>
To manipulate this document under Unix (Dilib)
EXPLOR_STEP=$WICRI_ROOT/Ticri/CIDE/explor/OcrV1/Data/Main/Merge
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 001926 | SxmlIndent | more
Or
HfdSelect -h $EXPLOR_AREA/Data/Main/Merge/biblio.hfd -nk 001926 | SxmlIndent | more
To add a link to this page in the Wicri network
{{Explor lien |wiki= Ticri/CIDE |area= OcrV1 |flux= Main |étape= Merge |type= RBID |clé= ISTEX:2806202E744FA13270D3CC536B7030DC32E25B0A |texte= Text/Graphics Separation in Maps }}
This area was generated with Dilib version V0.6.32.