A Theoretical Foundation and a Method for Document Table Structure Extraction and Decomposition
Internal identifier: 001A21 (Main/Merge); previous: 001A20; next: 001A22
Authors: Howard Wasserman [United States]; Keitaro Yukawa [United States]; Bon Sy [United States]; Kui-Lam Kwok [United States]; Tsaiyun Phillips [United States]
Source:
- Lecture Notes in Computer Science [0302-9743]; 2002.
Abstract
The algorithm described in this paper is designed to detect potential table regions in the document, to decide whether a potential table region is, in fact, a table, and, when it is, to analyze the table structure. The decision and analysis phases of the algorithm and the resulting system are based primarily on a precise definition of table, and it is such a definition that is discussed in this paper. An adequate definition need not be complete in the sense of encompassing all possible structures that might be deemed to be tables, but it should encompass most such structures, it should include essential features of tables, and it should exclude features never or very rarely possessed by tables.
DOI: 10.1007/3-540-45869-7_34
Links toward previous steps (curation, corpus...)
- to stream Istex, to step Corpus: 003886
- to stream Istex, to step Curation: 003623
- to stream Istex, to step Checkpoint: 001055
Links to Exploration step
ISTEX:3D251A429AB161E9C3EB70E91BCB3071791FFDDA
The document in XML format
<record><TEI wicri:istexFullTextTei="biblStruct"><teiHeader><fileDesc><titleStmt><title xml:lang="en">A Theoretical Foundation and a Method for Document Table Structure Extraction and Decomposition</title>
<author><name sortKey="Wasserman, Howard" sort="Wasserman, Howard" uniqKey="Wasserman H" first="Howard" last="Wasserman">Howard Wasserman</name>
</author>
<author><name sortKey="Yukawa, Keitaro" sort="Yukawa, Keitaro" uniqKey="Yukawa K" first="Keitaro" last="Yukawa">Keitaro Yukawa</name>
</author>
<author><name sortKey="Sy, Bon" sort="Sy, Bon" uniqKey="Sy B" first="Bon" last="Sy">Bon Sy</name>
</author>
<author><name sortKey="Kwok, Kui Lam" sort="Kwok, Kui Lam" uniqKey="Kwok K" first="Kui-Lam" last="Kwok">Kui-Lam Kwok</name>
</author>
<author><name sortKey="Phillips, Tsaiyun" sort="Phillips, Tsaiyun" uniqKey="Phillips T" first="Tsaiyun" last="Phillips">Tsaiyun Phillips</name>
</author>
</titleStmt>
<publicationStmt><idno type="wicri:source">ISTEX</idno>
<idno type="RBID">ISTEX:3D251A429AB161E9C3EB70E91BCB3071791FFDDA</idno>
<date when="2002" year="2002">2002</date>
<idno type="doi">10.1007/3-540-45869-7_34</idno>
<idno type="url">https://api.istex.fr/document/3D251A429AB161E9C3EB70E91BCB3071791FFDDA/fulltext/pdf</idno>
<idno type="wicri:Area/Istex/Corpus">003886</idno>
<idno type="wicri:Area/Istex/Curation">003623</idno>
<idno type="wicri:Area/Istex/Checkpoint">001055</idno>
<idno type="wicri:doubleKey">0302-9743:2002:Wasserman H:a:theoretical:foundation</idno>
<idno type="wicri:Area/Main/Merge">001A21</idno>
</publicationStmt>
<sourceDesc><biblStruct><analytic><title level="a" type="main" xml:lang="en">A Theoretical Foundation and a Method for Document Table Structure Extraction and Decomposition</title>
<author><name sortKey="Wasserman, Howard" sort="Wasserman, Howard" uniqKey="Wasserman H" first="Howard" last="Wasserman">Howard Wasserman</name>
<affiliation wicri:level="2"><country xml:lang="fr">États-Unis</country>
<placeName><region type="state">État de New York</region>
</placeName>
<wicri:cityArea>Department of Computer Science, Queens College, the City University of New York, 65-30 Kissena Boulevard, 11367-1597, Flushing</wicri:cityArea>
</affiliation>
<affiliation wicri:level="1"><country wicri:rule="url">États-Unis</country>
</affiliation>
</author>
<author><name sortKey="Yukawa, Keitaro" sort="Yukawa, Keitaro" uniqKey="Yukawa K" first="Keitaro" last="Yukawa">Keitaro Yukawa</name>
<affiliation wicri:level="2"><country xml:lang="fr">États-Unis</country>
<placeName><region type="state">État de New York</region>
</placeName>
<wicri:cityArea>Department of Computer Science, Queens College, the City University of New York, 65-30 Kissena Boulevard, 11367-1597, Flushing</wicri:cityArea>
</affiliation>
<affiliation wicri:level="1"><country wicri:rule="url">États-Unis</country>
</affiliation>
</author>
<author><name sortKey="Sy, Bon" sort="Sy, Bon" uniqKey="Sy B" first="Bon" last="Sy">Bon Sy</name>
<affiliation wicri:level="2"><country xml:lang="fr">États-Unis</country>
<placeName><region type="state">État de New York</region>
</placeName>
<wicri:cityArea>Department of Computer Science, Queens College, the City University of New York, 65-30 Kissena Boulevard, 11367-1597, Flushing</wicri:cityArea>
</affiliation>
<affiliation wicri:level="1"><country wicri:rule="url">États-Unis</country>
</affiliation>
</author>
<author><name sortKey="Kwok, Kui Lam" sort="Kwok, Kui Lam" uniqKey="Kwok K" first="Kui-Lam" last="Kwok">Kui-Lam Kwok</name>
<affiliation wicri:level="2"><country xml:lang="fr">États-Unis</country>
<placeName><region type="state">État de New York</region>
</placeName>
<wicri:cityArea>Department of Computer Science, Queens College, the City University of New York, 65-30 Kissena Boulevard, 11367-1597, Flushing</wicri:cityArea>
</affiliation>
<affiliation wicri:level="1"><country wicri:rule="url">États-Unis</country>
</affiliation>
</author>
<author><name sortKey="Phillips, Tsaiyun" sort="Phillips, Tsaiyun" uniqKey="Phillips T" first="Tsaiyun" last="Phillips">Tsaiyun Phillips</name>
<affiliation wicri:level="2"><country xml:lang="fr">États-Unis</country>
<placeName><region type="state">État de New York</region>
</placeName>
<wicri:cityArea>Department of Computer Science, Queens College, the City University of New York, 65-30 Kissena Boulevard, 11367-1597, Flushing</wicri:cityArea>
</affiliation>
<affiliation wicri:level="1"><country wicri:rule="url">États-Unis</country>
</affiliation>
</author>
</analytic>
<monogr></monogr>
<series><title level="s">Lecture Notes in Computer Science</title>
<imprint><date>2002</date>
</imprint>
<idno type="ISSN">0302-9743</idno>
<idno type="ISSN">0302-9743</idno>
</series>
<idno type="istex">3D251A429AB161E9C3EB70E91BCB3071791FFDDA</idno>
<idno type="DOI">10.1007/3-540-45869-7_34</idno>
<idno type="ChapterID">34</idno>
<idno type="ChapterID">Chap34</idno>
</biblStruct>
</sourceDesc>
<seriesStmt><idno type="ISSN">0302-9743</idno>
</seriesStmt>
</fileDesc>
<profileDesc><textClass></textClass>
<langUsage><language ident="en">en</language>
</langUsage>
</profileDesc>
</teiHeader>
<front><div type="abstract" xml:lang="en">Abstract: The algorithm described in this paper is designed to detect potential table regions in the document, to decide whether a potential table region is, in fact, a table, and, when it is, to analyze the table structure. The decision and analysis phases of the algorithm and the resulting system are based primarily on a precise definition of table, and it is such a definition that is discussed in this paper. An adequate definition need not be complete in the sense of encompassing all possible structures that might be deemed to be tables, but it should encompass most such structures, it should include essential features of tables, and it should exclude features never or very rarely possessed by tables.</div>
</front>
</TEI>
</record>
To manipulate this document under Unix (Dilib)
EXPLOR_STEP=$WICRI_ROOT/Ticri/CIDE/explor/OcrV1/Data/Main/Merge
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 001A21 | SxmlIndent | more
Or
HfdSelect -h $EXPLOR_AREA/Data/Main/Merge/biblio.hfd -nk 001A21 | SxmlIndent | more
To add a link to this page in the Wicri network
{{Explor lien |wiki= Ticri/CIDE |area= OcrV1 |flux= Main |étape= Merge |type= RBID |clé= ISTEX:3D251A429AB161E9C3EB70E91BCB3071791FFDDA |texte= A Theoretical Foundation and a Method for Document Table Structure Extraction and Decomposition }}
This area was generated with Dilib version V0.6.32.