Automatic document processing: A survey

Internal identifier: 002840 (Main/Merge); previous: 002839; next: 002841

Authors: Yuan Y. Tang [Hong Kong]; Seong-Whan Lee [South Korea]; Ching Y. Suen [South Korea]

Source:

Pattern Recognition [ 0031-3203 ] ; 1996.

RBID : ISTEX:481F02E8D4008B6F673C509E6D43CC955535D415

Abstract

Surveys of the basic concepts and underlying techniques are presented in this paper. A basic model for document processing is described. In this model, document processing can be divided into two phases: document analysis and document understanding. A document has two structures: geometric (layout) structure and logical structure. Extraction of the geometric structure from a document refers to document analysis; mapping the geometric structure into logical structure deals with document understanding. Both types of document structures and the two areas of document processing are discussed. Two categories of methods have been used in document analysis, namely, (1) hierarchical methods including top-down and bottom-up approaches, (2) non-hierarchical methods including modified fractal signature. Tree transform, formatting knowledge and description language approaches have been used in document understanding. A particular case of form document processing is discussed. Form description and form registration approaches are presented. A form processing system is also introduced. Finally, many techniques, such as skew detection, Hough transform, Gabor filters, projection, crossing counts, form definition language, etc., which have been used in these approaches are discussed.

Url:

https://api.istex.fr/document/481F02E8D4008B6F673C509E6D43CC955535D415/fulltext/pdf

DOI: 10.1016/S0031-3203(96)00044-1

Links toward previous steps (curation, corpus...)

to stream Istex, to step Corpus: 000B82
to stream Istex, to step Curation: 000B67
to stream Istex, to step Checkpoint: 001B07

Links to Exploration step

ISTEX:481F02E8D4008B6F673C509E6D43CC955535D415

The document in XML format

<record><TEI wicri:istexFullTextTei="biblStruct"><teiHeader><fileDesc><titleStmt><title>Automatic document processing: A survey</title>
<author><name sortKey="Tang, Yuan Y" sort="Tang, Yuan Y" uniqKey="Tang Y" first="Yuan Y." last="Tang">Yuan Y. Tang</name>
</author>
<author><name sortKey="Lee, Seong Whan" sort="Lee, Seong Whan" uniqKey="Lee S" first="Seong-Whan" last="Lee">Seong-Whan Lee</name>
</author>
<author><name sortKey="Suen, Ching Y" sort="Suen, Ching Y" uniqKey="Suen C" first="Ching Y." last="Suen">Ching Y. Suen</name>
</author>
</titleStmt>
<publicationStmt><idno type="wicri:source">ISTEX</idno>
<idno type="RBID">ISTEX:481F02E8D4008B6F673C509E6D43CC955535D415</idno>
<date when="1996" year="1996">1996</date>
<idno type="doi">10.1016/S0031-3203(96)00044-1</idno>
<idno type="url">https://api.istex.fr/document/481F02E8D4008B6F673C509E6D43CC955535D415/fulltext/pdf</idno>
<idno type="wicri:Area/Istex/Corpus">000B82</idno>
<idno type="wicri:Area/Istex/Curation">000B67</idno>
<idno type="wicri:Area/Istex/Checkpoint">001B07</idno>
<idno type="wicri:doubleKey">0031-3203:1996:Tang Y:automatic:document:processing</idno>
<idno type="wicri:Area/Main/Merge">002840</idno>
</publicationStmt>
<sourceDesc><biblStruct><analytic><title level="a">Automatic document processing: A survey</title>
<author><name sortKey="Tang, Yuan Y" sort="Tang, Yuan Y" uniqKey="Tang Y" first="Yuan Y." last="Tang">Yuan Y. Tang</name>
<affiliation wicri:level="1"><country wicri:rule="url">Hong Kong</country>
</affiliation>
<affiliation wicri:level="1"><country xml:lang="fr">Hong Kong</country>
<wicri:regionArea>Department of Computing Studies, Hong Kong Baptist University, Kowloon Tong, Kowloon</wicri:regionArea>
<wicri:noRegion>Kowloon</wicri:noRegion>
</affiliation>
</author>
<author><name sortKey="Lee, Seong Whan" sort="Lee, Seong Whan" uniqKey="Lee S" first="Seong-Whan" last="Lee">Seong-Whan Lee</name>
<affiliation wicri:level="1"><country xml:lang="fr">Corée du Sud</country>
<wicri:regionArea>Department of Computer Science, Korea University, 1, 5-ka, Anam-dong, Seongbuk-ku, Seoul 136–701</wicri:regionArea>
<placeName><settlement type="city">Séoul</settlement>
</placeName>
</affiliation>
</author>
<author><name sortKey="Suen, Ching Y" sort="Suen, Ching Y" uniqKey="Suen C" first="Ching Y." last="Suen">Ching Y. Suen</name>
<affiliation wicri:level="1"><country xml:lang="fr">Corée du Sud</country>
<wicri:regionArea>Department of Computer Science, Korea University, 1, 5-ka, Anam-dong, Seongbuk-ku, Seoul 136–701</wicri:regionArea>
<placeName><settlement type="city">Séoul</settlement>
</placeName>
</affiliation>
</author>
</analytic>
<monogr></monogr>
<series><title level="j">Pattern Recognition</title>
<title level="j" type="abbrev">PR</title>
<idno type="ISSN">0031-3203</idno>
<imprint><publisher>ELSEVIER</publisher>
<date type="published" when="1996">1996</date>
<biblScope unit="volume">29</biblScope>
<biblScope unit="issue">12</biblScope>
<biblScope unit="page" from="1931">1931</biblScope>
<biblScope unit="page" to="1952">1952</biblScope>
</imprint>
<idno type="ISSN">0031-3203</idno>
</series>
<idno type="istex">481F02E8D4008B6F673C509E6D43CC955535D415</idno>
<idno type="DOI">10.1016/S0031-3203(96)00044-1</idno>
<idno type="PII">S0031-3203(96)00044-1</idno>
</biblStruct>
</sourceDesc>
<seriesStmt><idno type="ISSN">0031-3203</idno>
</seriesStmt>
</fileDesc>
<profileDesc><textClass></textClass>
<langUsage><language ident="en">en</language>
</langUsage>
</profileDesc>
</teiHeader>
<front><div type="abstract" xml:lang="en">Surveys of the basic concepts and underlying techniques are presented in this paper. A basic model for document processing is described. In this model, document processing can be divided into two phases: document analysis and document understanding. A document has two structures: geometric (layout) structure and logical structure. Extraction of the geometric structure from a document refers to document analysis; mapping the geometric structure into logical structure deals with document understanding. Both types of document structures and the two areas of document processing are discussed. Two categories of methods have been used in document analysis, namely, (1) hierarchical methods including top-down and bottom-up approaches, (2) non-hierarchical methods including modified fractal signature. Tree transform, formatting knowledge and description language approaches have been used in document understanding. A particular case of form document processing is discussed. Form description and form registration approaches are presented. A form processing system is also introduced. Finally, many techniques, such as skew detection, Hough transform, Gabor filters, projection, crossing counts, form definition language, etc., which have been used in these approaches are discussed.</div>
</front>
</TEI>
</record>
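
The record above can be read with a standard XML parser once the undeclared `wicri:` prefix is bound to a namespace. The following is a minimal sketch in Python, not part of the Dilib toolchain: the function name `parse_record`, the placeholder namespace URI, and the shortened sample record are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Assumption: the record uses the "wicri:" prefix without declaring it,
# so a placeholder URI is bound here only to make the XML parseable.
WICRI_NS = "http://example.org/wicri"  # hypothetical, not the real URI


def parse_record(xml_text):
    """Return the title, author names and DOI found in a <record> string."""
    # Inject the missing namespace declaration into the root element.
    patched = xml_text.replace(
        "<record>", '<record xmlns:wicri="%s">' % WICRI_NS, 1
    )
    root = ET.fromstring(patched)
    title = root.findtext(".//titleStmt/title")
    authors = [n.text for n in root.findall(".//titleStmt/author/name")]
    doi = root.findtext(".//publicationStmt/idno[@type='doi']")
    return {"title": title, "authors": authors, "doi": doi}


# Shortened sample modelled on the record above (one author only).
sample = """<record><TEI wicri:istexFullTextTei="biblStruct"><teiHeader><fileDesc>
<titleStmt><title>Automatic document processing: A survey</title>
<author><name first="Yuan Y." last="Tang">Yuan Y. Tang</name></author>
</titleStmt>
<publicationStmt><idno type="doi">10.1016/S0031-3203(96)00044-1</idno>
</publicationStmt></fileDesc></teiHeader></TEI></record>"""

info = parse_record(sample)
print(info["title"], info["authors"], info["doi"])
```

The same function should work on the full record, since it only relies on the `titleStmt` and `publicationStmt` elements; the affiliation data under `sourceDesc` would need additional paths.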

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Ticri/CIDE/explor/OcrV1/Data/Main/Merge

HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 002840 | SxmlIndent | more

HfdSelect -h $EXPLOR_AREA/Data/Main/Merge/biblio.hfd -nk 002840 | SxmlIndent | more

To link to this page within the Wicri network

{{Explor lien
   |wiki=    Ticri/CIDE
   |area=    OcrV1
   |flux=    Main
   |étape=   Merge
   |type=    RBID
   |clé=     ISTEX:481F02E8D4008B6F673C509E6D43CC955535D415
   |texte=   Automatic document processing: A survey
}}

This area was generated with Dilib version V0.6.32.
Data generation: Sat Nov 11 16:53:45 2017. Site generation: Mon Mar 11 23:15:16 2024

	OCR exploration server
	Warning: this site is under development! It was generated automatically from raw corpora, so the information has not been validated.
