OcrV1, Main, Merge, bibRecord, 001947

Robust Frame Extraction and Removal for Processing Form Documents

Internal identifier: 001947 (Main/Merge); previous: 001946; next: 001948


Authors: Daisuke Nishiwaki [United States]; Masato Hayashi [Japan]; Atsushi Sato [United States]

Source:

Lecture Notes in Computer Science [0302-9743]; 2002.

RBID : ISTEX:B5CC639E804E78E0139A665BABE80BE757D1F1A3

Abstract

A new frame extraction and removal method for processing form documents is proposed. The method robustly extracts scanned pre-printings such as frames and lines. It consists of a frame detection process and a frame removal process. In the frame detection process, the center coordinates of each frame are extracted using a Generalized Hough Transformation-based method. Then, using those coordinates, an inscribed rectangular image for each frame is produced. In the frame removal process, the detected frame image is removed along the outside of the rectangular edge. These processes are repeated, varying pre-processing steps such as reducing and enhancing, until the target frames are removed. The method was applied to two types of images: postal codes on mail and forms received by facsimile. In both cases, low-quality pre-printings are common. For such low-quality images, conventional approaches such as model pattern matching did not work well because of local distortion. Through experiments in frame detection and removal on these images, we demonstrated that all of the frames could be successfully removed.
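
The detection and removal steps described in the abstract can be illustrated with a heavily simplified sketch. It is not the authors' implementation: it assumes a binary image stored as nested lists, a single axis-aligned frame whose width and height are known in advance, and no rotation or scaling. The function names (`rectangle_r_table`, `detect_frame_center`, `remove_frame`, `make_demo_image`) are hypothetical. Each foreground pixel votes for every center it would imply if it lay on the frame outline, which is the voting idea behind a Generalized Hough Transform, and the best-supported center is then used to erase the outline while leaving the interior untouched. The repeated pre-processing passes mentioned in the abstract are not modeled.

```python
from collections import Counter


def rectangle_r_table(width, height):
    """Offsets from every outline pixel of a width x height rectangle to its center."""
    cx, cy = width // 2, height // 2
    offsets = set()
    for x in range(width):            # top and bottom edges
        offsets.add((cx - x, cy))
        offsets.add((cx - x, cy - (height - 1)))
    for y in range(height):           # left and right edges
        offsets.add((cx, cy - y))
        offsets.add((cx - (width - 1), cy - y))
    return offsets


def detect_frame_center(image, width, height):
    """Let each foreground pixel vote for candidate centers; return the top-voted one."""
    table = rectangle_r_table(width, height)
    votes = Counter()
    for y, row in enumerate(image):
        for x, pixel in enumerate(row):
            if pixel:
                for dx, dy in table:
                    votes[(x + dx, y + dy)] += 1
    return votes.most_common(1)[0][0]


def remove_frame(image, center, width, height):
    """Erase the outline of the detected frame; pixels inside and elsewhere are kept."""
    cx, cy = center
    left, top = cx - width // 2, cy - height // 2
    right, bottom = left + width - 1, top + height - 1
    cleaned = [row[:] for row in image]
    for y in range(top, bottom + 1):
        for x in range(left, right + 1):
            if x in (left, right) or y in (top, bottom):
                cleaned[y][x] = 0
    return cleaned


def make_demo_image():
    """12-column, 10-row image: a 7x5 frame at left=2, top=3, degraded, with one ink pixel."""
    image = [[0] * 12 for _ in range(10)]
    for x in range(2, 9):
        image[3][x] = image[7][x] = 1
    for y in range(3, 8):
        image[y][2] = image[y][8] = 1
    image[3][4] = image[6][8] = 0     # simulate a low-quality pre-printing
    image[5][4] = 1                   # written content inside the frame
    return image


image = make_demo_image()
center = detect_frame_center(image, 7, 5)
cleaned = remove_frame(image, center, 7, 5)
```

Because the vote is accumulated over all outline pixels, the two missing border pixels in the demo image lower the peak but do not move it, which is the kind of tolerance to local degradation that the abstract attributes to the Hough-based step.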

Url:

https://api.istex.fr/document/B5CC639E804E78E0139A665BABE80BE757D1F1A3/fulltext/pdf

DOI: 10.1007/3-540-45868-9_4

Links toward previous steps (curation, corpus...)

to stream Istex, to step Corpus: 003169
to stream Istex, to step Curation: 002F28
to stream Istex, to step Checkpoint: 000F81

Links to Exploration step

ISTEX:B5CC639E804E78E0139A665BABE80BE757D1F1A3

The document in XML format

<record><TEI wicri:istexFullTextTei="biblStruct"><teiHeader><fileDesc><titleStmt><title xml:lang="en">Robust Frame Extraction and Removal for Processing Form Documents</title>
<author><name sortKey="Nishiwaki, Daisuke" sort="Nishiwaki, Daisuke" uniqKey="Nishiwaki D" first="Daisuke" last="Nishiwaki">Daisuke Nishiwaki</name>
</author>
<author><name sortKey="Hayashi, Masato" sort="Hayashi, Masato" uniqKey="Hayashi M" first="Masato" last="Hayashi">Masato Hayashi</name>
</author>
<author><name sortKey="Sato, Atsushi" sort="Sato, Atsushi" uniqKey="Sato A" first="Atsushi" last="Sato">Atsushi Sato</name>
</author>
</titleStmt>
<publicationStmt><idno type="wicri:source">ISTEX</idno>
<idno type="RBID">ISTEX:B5CC639E804E78E0139A665BABE80BE757D1F1A3</idno>
<date when="2002" year="2002">2002</date>
<idno type="doi">10.1007/3-540-45868-9_4</idno>
<idno type="url">https://api.istex.fr/document/B5CC639E804E78E0139A665BABE80BE757D1F1A3/fulltext/pdf</idno>
<idno type="wicri:Area/Istex/Corpus">003169</idno>
<idno type="wicri:Area/Istex/Curation">002F28</idno>
<idno type="wicri:Area/Istex/Checkpoint">000F81</idno>
<idno type="wicri:doubleKey">0302-9743:2002:Nishiwaki D:robust:frame:extraction</idno>
<idno type="wicri:Area/Main/Merge">001947</idno>
</publicationStmt>
<sourceDesc><biblStruct><analytic><title level="a" type="main" xml:lang="en">Robust Frame Extraction and Removal for Processing Form Documents</title>
<author><name sortKey="Nishiwaki, Daisuke" sort="Nishiwaki, Daisuke" uniqKey="Nishiwaki D" first="Daisuke" last="Nishiwaki">Daisuke Nishiwaki</name>
<affiliation wicri:level="1"><country xml:lang="fr">États-Unis</country>
<wicri:regionArea>Multimedia Research Labs.</wicri:regionArea>
</affiliation>
</author>
<author><name sortKey="Hayashi, Masato" sort="Hayashi, Masato" uniqKey="Hayashi M" first="Masato" last="Hayashi">Masato Hayashi</name>
<affiliation wicri:level="1"><country xml:lang="fr">Japon</country>
<wicri:regionArea>Social Information Solution Division, NEC Corporation, 1-1, 4-chome Miyazaki, Miyamae-ku, Kawasaki, 216-8555, Kanagawa</wicri:regionArea>
<wicri:noRegion>Kanagawa</wicri:noRegion>
</affiliation>
</author>
<author><name sortKey="Sato, Atsushi" sort="Sato, Atsushi" uniqKey="Sato A" first="Atsushi" last="Sato">Atsushi Sato</name>
<affiliation wicri:level="1"><country xml:lang="fr">États-Unis</country>
<wicri:regionArea>Multimedia Research Labs.</wicri:regionArea>
</affiliation>
</author>
</analytic>
<monogr></monogr>
<series><title level="s">Lecture Notes in Computer Science</title>
<imprint><date>2002</date>
</imprint>
<idno type="ISSN">0302-9743</idno>
<idno type="ISSN">0302-9743</idno>
</series>
<idno type="istex">B5CC639E804E78E0139A665BABE80BE757D1F1A3</idno>
<idno type="DOI">10.1007/3-540-45868-9_4</idno>
<idno type="ChapterID">4</idno>
<idno type="ChapterID">Chap4</idno>
</biblStruct>
</sourceDesc>
<seriesStmt><idno type="ISSN">0302-9743</idno>
</seriesStmt>
</fileDesc>
<profileDesc><textClass></textClass>
<langUsage><language ident="en">en</language>
</langUsage>
</profileDesc>
</teiHeader>
<front><div type="abstract" xml:lang="en">Abstract: A new frame extraction and a removal method for processing form documents is proposed. The method robustly extracts scanned preprintings such as frames and lines. It consists of a frame detection process and a frame removal process. In the frame detection process, the center coordinates are extracted using a Generalized Hough Transformation-based method. Then, using those coordinates, an inscribed rectangular image for each frame is produced. In the frame removal process, the detected frame image is removed along the outside of the rectangular edge. These processes are repeated to remove the target frames successfully by changing some pre-processings such as reducing and enhancing. The method was applied to some types of images. They are postal codes on mail and forms received by facsimiles. In both cases, there often can be seen low quality pre-printings. For those low quality images, conventional approach such as model pattern matching was not well worked because of local distortion. Through experiments in frame detection and removal of the images, we demonstrated that all of the frames could be successfully removed.</div>
</front>
</TEI>
</record>
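
The record above uses the `wicri:` prefix without declaring a namespace for it, so a namespace-aware parser will typically reject it as written. The following is a minimal sketch, assuming Python's standard `xml.etree.ElementTree`, of one way to read the title, authors and DOI from such a record. The namespace URI `urn:example:wicri` is a placeholder invented for this example, not an official Wicri identifier, and `parse_record` and `summarize` are hypothetical helper names. The sample is a shortened copy of the record, not the full document.

```python
import xml.etree.ElementTree as ET

# Placeholder URI: an assumption made for this sketch, not an official Wicri namespace.
WICRI_NS = "urn:example:wicri"


def parse_record(xml_text):
    """Bind the undeclared wicri: prefix on the root element, then parse."""
    patched = xml_text.replace("<record>", f'<record xmlns:wicri="{WICRI_NS}">', 1)
    return ET.fromstring(patched)


def summarize(record):
    """Collect the article title, author names and DOI from a parsed record."""
    analytic = record.find(".//analytic")
    return {
        "title": analytic.findtext("title"),
        "authors": [name.text for name in analytic.findall("author/name")],
        "doi": record.findtext(".//publicationStmt/idno[@type='doi']"),
    }


# Shortened copy of the record shown above, kept small for illustration.
sample = """<record><TEI wicri:istexFullTextTei="biblStruct"><teiHeader><fileDesc>
<publicationStmt><idno type="doi">10.1007/3-540-45868-9_4</idno></publicationStmt>
<sourceDesc><biblStruct><analytic>
<title level="a" type="main" xml:lang="en">Robust Frame Extraction and Removal for Processing Form Documents</title>
<author><name>Daisuke Nishiwaki</name></author>
<author><name>Masato Hayashi</name></author>
<author><name>Atsushi Sato</name></author>
</analytic></biblStruct></sourceDesc>
</fileDesc></teiHeader></TEI></record>"""

summary = summarize(parse_record(sample))
```

The `xml:` prefix needs no such patch because it is predefined by the XML namespaces specification.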

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Ticri/CIDE/explor/OcrV1/Data/Main/Merge

HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 001947 | SxmlIndent | more

HfdSelect -h $EXPLOR_AREA/Data/Main/Merge/biblio.hfd -nk 001947 | SxmlIndent | more

To link to this page within the Wicri network

{{Explor lien
   |wiki=    Ticri/CIDE
   |area=    OcrV1
   |flux=    Main
   |étape=   Merge
   |type=    RBID
   |clé=     ISTEX:B5CC639E804E78E0139A665BABE80BE757D1F1A3
   |texte=   Robust Frame Extraction and Removal for Processing Form Documents
}}

This area was generated with Dilib version V0.6.32.
Data generation: Sat Nov 11 16:53:45 2017. Site generation: Mon Mar 11 23:15:16 2024

OCR exploration server
Warning: this site is under development. It was generated automatically from raw corpora, so the information has not been validated.
