OCR exploration server

Warning: this site is under development!
Warning: this site was generated automatically from raw corpora.
The information has therefore not been validated.

A Theoretical Foundation and a Method for Document Table Structure Extraction and Decomposition

Internal identifier: 001A21 (Main/Merge); previous: 001A20; next: 001A22

Authors: Howard Wasserman [United States]; Keitaro Yukawa [United States]; Bon Sy [United States]; Kui-Lam Kwok [United States]; Tsaiyun Phillips [United States]

RBID : ISTEX:3D251A429AB161E9C3EB70E91BCB3071791FFDDA

Abstract

The algorithm described in this paper is designed to detect potential table regions in the document, to decide whether a potential table region is, in fact, a table, and, when it is, to analyze the table structure. The decision and analysis phases of the algorithm and the resulting system are based primarily on a precise definition of table, and it is such a definition that is discussed in this paper. An adequate definition need not be complete in the sense of encompassing all possible structures that might be deemed to be tables, but it should encompass most such structures, it should include essential features of tables, and it should exclude features never or very rarely possessed by tables.

DOI: 10.1007/3-540-45869-7_34

The document in XML format

<record>
<TEI wicri:istexFullTextTei="biblStruct">
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en">A Theoretical Foundation and a Method for Document Table Structure Extraction and Decomposition</title>
<author>
<name sortKey="Wasserman, Howard" sort="Wasserman, Howard" uniqKey="Wasserman H" first="Howard" last="Wasserman">Howard Wasserman</name>
</author>
<author>
<name sortKey="Yukawa, Keitaro" sort="Yukawa, Keitaro" uniqKey="Yukawa K" first="Keitaro" last="Yukawa">Keitaro Yukawa</name>
</author>
<author>
<name sortKey="Sy, Bon" sort="Sy, Bon" uniqKey="Sy B" first="Bon" last="Sy">Bon Sy</name>
</author>
<author>
<name sortKey="Kwok, Kui Lam" sort="Kwok, Kui Lam" uniqKey="Kwok K" first="Kui-Lam" last="Kwok">Kui-Lam Kwok</name>
</author>
<author>
<name sortKey="Phillips, Tsaiyun" sort="Phillips, Tsaiyun" uniqKey="Phillips T" first="Tsaiyun" last="Phillips">Tsaiyun Phillips</name>
</author>
</titleStmt>
<publicationStmt>
<idno type="wicri:source">ISTEX</idno>
<idno type="RBID">ISTEX:3D251A429AB161E9C3EB70E91BCB3071791FFDDA</idno>
<date when="2002" year="2002">2002</date>
<idno type="doi">10.1007/3-540-45869-7_34</idno>
<idno type="url">https://api.istex.fr/document/3D251A429AB161E9C3EB70E91BCB3071791FFDDA/fulltext/pdf</idno>
<idno type="wicri:Area/Istex/Corpus">003886</idno>
<idno type="wicri:Area/Istex/Curation">003623</idno>
<idno type="wicri:Area/Istex/Checkpoint">001055</idno>
<idno type="wicri:doubleKey">0302-9743:2002:Wasserman H:a:theoretical:foundation</idno>
<idno type="wicri:Area/Main/Merge">001A21</idno>
</publicationStmt>
<sourceDesc>
<biblStruct>
<analytic>
<title level="a" type="main" xml:lang="en">A Theoretical Foundation and a Method for Document Table Structure Extraction and Decomposition</title>
<author>
<name sortKey="Wasserman, Howard" sort="Wasserman, Howard" uniqKey="Wasserman H" first="Howard" last="Wasserman">Howard Wasserman</name>
<affiliation wicri:level="2">
<country xml:lang="fr">États-Unis</country>
<placeName>
<region type="state">État de New York</region>
</placeName>
<wicri:cityArea>Department of Computer Science, Queens College, the City University of New York, 65-30 Kissena Boulevard, 11367-1597, Flushing</wicri:cityArea>
</affiliation>
<affiliation wicri:level="1">
<country wicri:rule="url">États-Unis</country>
</affiliation>
</author>
<author>
<name sortKey="Yukawa, Keitaro" sort="Yukawa, Keitaro" uniqKey="Yukawa K" first="Keitaro" last="Yukawa">Keitaro Yukawa</name>
<affiliation wicri:level="2">
<country xml:lang="fr">États-Unis</country>
<placeName>
<region type="state">État de New York</region>
</placeName>
<wicri:cityArea>Department of Computer Science, Queens College, the City University of New York, 65-30 Kissena Boulevard, 11367-1597, Flushing</wicri:cityArea>
</affiliation>
<affiliation wicri:level="1">
<country wicri:rule="url">États-Unis</country>
</affiliation>
</author>
<author>
<name sortKey="Sy, Bon" sort="Sy, Bon" uniqKey="Sy B" first="Bon" last="Sy">Bon Sy</name>
<affiliation wicri:level="2">
<country xml:lang="fr">États-Unis</country>
<placeName>
<region type="state">État de New York</region>
</placeName>
<wicri:cityArea>Department of Computer Science, Queens College, the City University of New York, 65-30 Kissena Boulevard, 11367-1597, Flushing</wicri:cityArea>
</affiliation>
<affiliation wicri:level="1">
<country wicri:rule="url">États-Unis</country>
</affiliation>
</author>
<author>
<name sortKey="Kwok, Kui Lam" sort="Kwok, Kui Lam" uniqKey="Kwok K" first="Kui-Lam" last="Kwok">Kui-Lam Kwok</name>
<affiliation wicri:level="2">
<country xml:lang="fr">États-Unis</country>
<placeName>
<region type="state">État de New York</region>
</placeName>
<wicri:cityArea>Department of Computer Science, Queens College, the City University of New York, 65-30 Kissena Boulevard, 11367-1597, Flushing</wicri:cityArea>
</affiliation>
<affiliation wicri:level="1">
<country wicri:rule="url">États-Unis</country>
</affiliation>
</author>
<author>
<name sortKey="Phillips, Tsaiyun" sort="Phillips, Tsaiyun" uniqKey="Phillips T" first="Tsaiyun" last="Phillips">Tsaiyun Phillips</name>
<affiliation wicri:level="2">
<country xml:lang="fr">États-Unis</country>
<placeName>
<region type="state">État de New York</region>
</placeName>
<wicri:cityArea>Department of Computer Science, Queens College, the City University of New York, 65-30 Kissena Boulevard, 11367-1597, Flushing</wicri:cityArea>
</affiliation>
<affiliation wicri:level="1">
<country wicri:rule="url">États-Unis</country>
</affiliation>
</author>
</analytic>
<monogr></monogr>
<series>
<title level="s">Lecture Notes in Computer Science</title>
<imprint>
<date>2002</date>
</imprint>
<idno type="ISSN">0302-9743</idno>
</series>
<idno type="istex">3D251A429AB161E9C3EB70E91BCB3071791FFDDA</idno>
<idno type="DOI">10.1007/3-540-45869-7_34</idno>
<idno type="ChapterID">34</idno>
<idno type="ChapterID">Chap34</idno>
</biblStruct>
</sourceDesc>
<seriesStmt>
<idno type="ISSN">0302-9743</idno>
</seriesStmt>
</fileDesc>
<profileDesc>
<textClass></textClass>
<langUsage>
<language ident="en">en</language>
</langUsage>
</profileDesc>
</teiHeader>
<front>
<div type="abstract" xml:lang="en">Abstract: The algorithm described in this paper is designed to detect potential table regions in the document, to decide whether a potential table region is, in fact, a table, and, when it is, to analyze the table structure. The decision and analysis phases of the algorithm and the resulting system are based primarily on a precise definition of table, and it is such a definition that is discussed in this paper. An adequate definition need not be complete in the sense of encompassing all possible structures that might be deemed to be tables, but it should encompass most such structures, it should include essential features of tables, and it should exclude features never or very rarely possessed by tables.</div>
</front>
</TEI>
</record>
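
As displayed, the record uses the `wicri:` prefix without declaring a namespace for it, so a namespace-aware parser such as Python's `xml.etree.ElementTree` is expected to reject it. The sketch below is one possible workaround, not part of Dilib: it wraps a trimmed excerpt of the record (two of the five authors) in a hypothetical `wrapper` element that binds `wicri:` to a placeholder URI, then reads a few fields. The names `parse_record`, `wrapper`, and `WICRI_NS` are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Trimmed excerpt of the record above (only two authors kept, title omitted).
RECORD = """<record>
<TEI wicri:istexFullTextTei="biblStruct">
<teiHeader>
<fileDesc>
<titleStmt>
<author>
<name sortKey="Wasserman, Howard" sort="Wasserman, Howard" uniqKey="Wasserman H" first="Howard" last="Wasserman">Howard Wasserman</name>
</author>
<author>
<name sortKey="Yukawa, Keitaro" sort="Yukawa, Keitaro" uniqKey="Yukawa K" first="Keitaro" last="Yukawa">Keitaro Yukawa</name>
</author>
</titleStmt>
<publicationStmt>
<idno type="wicri:source">ISTEX</idno>
<idno type="RBID">ISTEX:3D251A429AB161E9C3EB70E91BCB3071791FFDDA</idno>
<date when="2002" year="2002">2002</date>
<idno type="doi">10.1007/3-540-45869-7_34</idno>
</publicationStmt>
</fileDesc>
</teiHeader>
</TEI>
</record>"""

# Assumption: placeholder URI, used only so the parser accepts the "wicri:" prefix.
WICRI_NS = "urn:example:wicri"


def parse_record(xml_text):
    """Return a few bibliographic fields from a record fragment."""
    # Bind the undeclared prefix on a hypothetical wrapper element.
    wrapped = '<wrapper xmlns:wicri="%s">%s</wrapper>' % (WICRI_NS, xml_text)
    root = ET.fromstring(wrapped)
    file_desc = root.find("./record/TEI/teiHeader/fileDesc")
    title_stmt = file_desc.find("titleStmt")
    pub_stmt = file_desc.find("publicationStmt")
    return {
        "authors": [name.text for name in title_stmt.findall("author/name")],
        "year": pub_stmt.find("date").get("when"),
        "doi": pub_stmt.findtext("idno[@type='doi']"),
        "rbid": pub_stmt.findtext("idno[@type='RBID']"),
    }


info = parse_record(RECORD)
print(info["authors"], info["year"], info["doi"])
```

Wrapping the text leaves the record itself untouched, which seems preferable to stripping the `wicri:` prefixes when the output must be compared with the Dilib data.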

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Ticri/CIDE/explor/OcrV1/Data/Main/Merge
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 001A21 | SxmlIndent | more

Or

HfdSelect -h $EXPLOR_AREA/Data/Main/Merge/biblio.hfd -nk 001A21 | SxmlIndent | more

To link to this page within the Wicri network

{{Explor lien
   |wiki=    Ticri/CIDE
   |area=    OcrV1
   |flux=    Main
   |étape=   Merge
   |type=    RBID
   |clé=     ISTEX:3D251A429AB161E9C3EB70E91BCB3071791FFDDA
   |texte=   A Theoretical Foundation and a Method for Document Table Structure Extraction and Decomposition
}}
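
When many records need such a link, the template text can be generated rather than typed. The following is a minimal sketch that reproduces the layout of the template above (three leading spaces, values aligned in one column); the function name `explor_link` and its parameter names are hypothetical and not part of Wicri or Dilib.

```python
def explor_link(wiki, area, flux, etape, type_, cle, texte):
    """Build the text of an 'Explor lien' template call."""
    fields = [
        ("wiki", wiki),
        ("area", area),
        ("flux", flux),
        ("étape", etape),
        ("type", type_),
        ("clé", cle),
        ("texte", texte),
    ]
    lines = ["{{Explor lien"]
    for key, value in fields:
        # Pad "   |key=" to 13 characters so that all values line up.
        lines.append(("   |%s=" % key).ljust(13) + value)
    lines.append("}}")
    return "\n".join(lines)


link = explor_link(
    wiki="Ticri/CIDE",
    area="OcrV1",
    flux="Main",
    etape="Merge",
    type_="RBID",
    cle="ISTEX:3D251A429AB161E9C3EB70E91BCB3071791FFDDA",
    texte="A Theoretical Foundation and a Method for Document Table "
          "Structure Extraction and Decomposition",
)
print(link)
```

The parameter keys `étape`, `clé`, and `texte` are kept in French because they are the names the template expects.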

This area was generated with Dilib version V0.6.32.
Data generation: Sat Nov 11 16:53:45 2017. Site generation: Mon Mar 11 23:15:16 2024