Robust Frame Extraction and Removal for Processing Form Documents
Internal identifier: 000F81 (Istex/Checkpoint); previous: 000F80; next: 000F82
Authors: Daisuke Nishiwaki [United States]; Masato Hayashi [Japan]; Atsushi Sato [United States]
Source:
- Lecture Notes in Computer Science [ 0302-9743 ] ; 2002.
Abstract
Abstract: A new frame extraction and removal method for processing form documents is proposed. The method robustly extracts scanned preprinted elements such as frames and lines. It consists of a frame detection process and a frame removal process. In the frame detection process, the center coordinates of each frame are extracted using a method based on the Generalized Hough Transform. Then, using those coordinates, an inscribed rectangular image for each frame is produced. In the frame removal process, the detected frame image is removed along the outside of the rectangular edge. These processes are repeated, varying preprocessing steps such as reduction and enhancement, until the target frames are removed. The method was applied to two types of images: postal codes on mail and forms received by facsimile. In both cases, low-quality preprinting is common. For such low-quality images, conventional approaches such as model pattern matching did not work well because of local distortion. Through experiments in frame detection and removal on these images, we demonstrated that all of the frames could be successfully removed.
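The two stages named in the abstract can be illustrated with a minimal plain-Python sketch. It is not the authors' implementation: it assumes a binary image holding a single frame of known outer size, uses a rectangle outline as the Hough reference table, and omits the repeated passes with different preprocessing. All function names and parameters (`rectangle_outline`, `detect_center`, `remove_frame`, `thickness`) are hypothetical.

```python
from collections import Counter


def rectangle_outline(height, width):
    """Reference table: offsets (dy, dx) from the rectangle centre to each outline pixel."""
    cy, cx = height // 2, width // 2
    return [(y - cy, x - cx)
            for y in range(height) for x in range(width)
            if y in (0, height - 1) or x in (0, width - 1)]


def detect_center(image, height, width):
    """Hough-style voting: every foreground pixel votes for each centre that
    would place it on the outline; the cell with the most votes wins."""
    table = rectangle_outline(height, width)
    votes = Counter()
    for y, row in enumerate(image):
        for x, pixel in enumerate(row):
            if pixel:
                for dy, dx in table:
                    votes[(y - dy, x - dx)] += 1
    return votes.most_common(1)[0][0]


def remove_frame(image, center, height, width, thickness=1):
    """Return a copy in which pixels of the frame's bounding box that lie
    outside the inscribed rectangle are cleared (assumed line thickness)."""
    cy, cx = center
    top, left = cy - height // 2, cx - width // 2
    result = [row[:] for row in image]
    for y in range(max(top, 0), min(top + height, len(result))):
        for x in range(max(left, 0), min(left + width, len(result[0]))):
            inside = (top + thickness <= y < top + height - thickness
                      and left + thickness <= x < left + width - thickness)
            if not inside:
                result[y][x] = 0
    return result


def make_demo_image():
    """12x14 image: a 7x9 frame centred at (5, 7), one broken frame pixel,
    and a short vertical stroke standing in for a handwritten character."""
    image = [[0] * 14 for _ in range(12)]
    for dy, dx in rectangle_outline(7, 9):
        image[5 + dy][7 + dx] = 1
    image[2][6] = 0              # simulate a gap in the preprinted frame
    for y in (4, 5, 6):
        image[y][7] = 1          # content inside the frame
    return image


image = make_demo_image()
center = detect_center(image, 7, 9)
cleaned = remove_frame(image, center, 7, 9)
```

Because the centre is chosen by accumulated votes rather than by exact template matching, a few missing frame pixels lower the peak without moving it, which is the property that makes a voting approach attractive for degraded preprinting.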
Url:
DOI: 10.1007/3-540-45868-9_4
Affiliations:
ISTEX:B5CC639E804E78E0139A665BABE80BE757D1F1A3
The document in XML format
<record><TEI wicri:istexFullTextTei="biblStruct"><teiHeader><fileDesc><titleStmt><title xml:lang="en">Robust Frame Extraction and Removal for Processing Form Documents</title>
<author><name sortKey="Nishiwaki, Daisuke" sort="Nishiwaki, Daisuke" uniqKey="Nishiwaki D" first="Daisuke" last="Nishiwaki">Daisuke Nishiwaki</name>
</author>
<author><name sortKey="Hayashi, Masato" sort="Hayashi, Masato" uniqKey="Hayashi M" first="Masato" last="Hayashi">Masato Hayashi</name>
</author>
<author><name sortKey="Sato, Atsushi" sort="Sato, Atsushi" uniqKey="Sato A" first="Atsushi" last="Sato">Atsushi Sato</name>
</author>
</titleStmt>
<publicationStmt><idno type="wicri:source">ISTEX</idno>
<idno type="RBID">ISTEX:B5CC639E804E78E0139A665BABE80BE757D1F1A3</idno>
<date when="2002" year="2002">2002</date>
<idno type="doi">10.1007/3-540-45868-9_4</idno>
<idno type="url">https://api.istex.fr/document/B5CC639E804E78E0139A665BABE80BE757D1F1A3/fulltext/pdf</idno>
<idno type="wicri:Area/Istex/Corpus">003169</idno>
<idno type="wicri:Area/Istex/Curation">002F28</idno>
<idno type="wicri:Area/Istex/Checkpoint">000F81</idno>
</publicationStmt>
<sourceDesc><biblStruct><analytic><title level="a" type="main" xml:lang="en">Robust Frame Extraction and Removal for Processing Form Documents</title>
<author><name sortKey="Nishiwaki, Daisuke" sort="Nishiwaki, Daisuke" uniqKey="Nishiwaki D" first="Daisuke" last="Nishiwaki">Daisuke Nishiwaki</name>
<affiliation wicri:level="1"><country xml:lang="fr">États-Unis</country>
<wicri:regionArea>Multimedia Research Labs.</wicri:regionArea>
</affiliation>
</author>
<author><name sortKey="Hayashi, Masato" sort="Hayashi, Masato" uniqKey="Hayashi M" first="Masato" last="Hayashi">Masato Hayashi</name>
<affiliation wicri:level="1"><country xml:lang="fr">Japon</country>
<wicri:regionArea>Social Information Solution Division, NEC Corporation, 1-1, 4-chome Miyazaki, Miyamae-ku, Kawasaki, 216-8555, Kanagawa</wicri:regionArea>
<wicri:noRegion>Kanagawa</wicri:noRegion>
</affiliation>
</author>
<author><name sortKey="Sato, Atsushi" sort="Sato, Atsushi" uniqKey="Sato A" first="Atsushi" last="Sato">Atsushi Sato</name>
<affiliation wicri:level="1"><country xml:lang="fr">États-Unis</country>
<wicri:regionArea>Multimedia Research Labs.</wicri:regionArea>
</affiliation>
</author>
</analytic>
<monogr></monogr>
<series><title level="s">Lecture Notes in Computer Science</title>
<imprint><date>2002</date>
</imprint>
<idno type="ISSN">0302-9743</idno>
<idno type="ISSN">0302-9743</idno>
</series>
<idno type="istex">B5CC639E804E78E0139A665BABE80BE757D1F1A3</idno>
<idno type="DOI">10.1007/3-540-45868-9_4</idno>
<idno type="ChapterID">4</idno>
<idno type="ChapterID">Chap4</idno>
</biblStruct>
</sourceDesc>
<seriesStmt><idno type="ISSN">0302-9743</idno>
</seriesStmt>
</fileDesc>
<profileDesc><textClass></textClass>
<langUsage><language ident="en">en</language>
</langUsage>
</profileDesc>
</teiHeader>
<front><div type="abstract" xml:lang="en">Abstract: A new frame extraction and removal method for processing form documents is proposed. The method robustly extracts scanned preprinted elements such as frames and lines. It consists of a frame detection process and a frame removal process. In the frame detection process, the center coordinates of each frame are extracted using a method based on the Generalized Hough Transform. Then, using those coordinates, an inscribed rectangular image for each frame is produced. In the frame removal process, the detected frame image is removed along the outside of the rectangular edge. These processes are repeated, varying preprocessing steps such as reduction and enhancement, until the target frames are removed. The method was applied to two types of images: postal codes on mail and forms received by facsimile. In both cases, low-quality preprinting is common. For such low-quality images, conventional approaches such as model pattern matching did not work well because of local distortion. Through experiments in frame detection and removal on these images, we demonstrated that all of the frames could be successfully removed.</div>
</front>
</TEI>
<affiliations><list><country><li>Japon</li>
<li>États-Unis</li>
</country>
</list>
<tree><country name="États-Unis"><noRegion><name sortKey="Nishiwaki, Daisuke" sort="Nishiwaki, Daisuke" uniqKey="Nishiwaki D" first="Daisuke" last="Nishiwaki">Daisuke Nishiwaki</name>
</noRegion>
<name sortKey="Sato, Atsushi" sort="Sato, Atsushi" uniqKey="Sato A" first="Atsushi" last="Sato">Atsushi Sato</name>
</country>
<country name="Japon"><noRegion><name sortKey="Hayashi, Masato" sort="Hayashi, Masato" uniqKey="Hayashi M" first="Masato" last="Hayashi">Masato Hayashi</name>
</noRegion>
</country>
</tree>
</affiliations>
</record>
To manipulate this document under Unix (Dilib)
EXPLOR_STEP=$WICRI_ROOT/Ticri/CIDE/explor/OcrV1/Data/Istex/Checkpoint
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000F81 | SxmlIndent | more
Or
HfdSelect -h $EXPLOR_AREA/Data/Istex/Checkpoint/biblio.hfd -nk 000F81 | SxmlIndent | more
To add a link to this page in the Wicri network
{{Explor lien |wiki= Ticri/CIDE |area= OcrV1 |flux= Istex |étape= Checkpoint |type= RBID |clé= ISTEX:B5CC639E804E78E0139A665BABE80BE757D1F1A3 |texte= Robust Frame Extraction and Removal for Processing Form Documents }}
This area was generated with Dilib version V0.6.32.