OCR exploration server

Learning to Detect Tables in Scanned Document Images using Line Information

Internal identifier: 000077 (Hal/Corpus); previous: 000076; next: 000078

Authors: Thotreingam Kasar; Philippine Barlas; Adam Sébastien; Clément Chatelain; Thierry Paquet

RBID : Hal:hal-00934902

Abstract

This paper presents a method to detect table regions in document images by identifying the column and row line-separators and their properties. The method employs a run-length approach to identify the horizontal and vertical lines present in the input image. From each group of intersecting horizontal and vertical lines, a set of 26 low-level features is extracted, and an SVM classifier is used to test whether the group belongs to a table. The performance of the method is evaluated on a heterogeneous corpus of French, English and Arabic documents that contain various types of table structures and is compared with that of the Tesseract OCR system.
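
As a rough illustration of the run-length idea mentioned in the abstract, the sketch below scans a binary image for long runs of foreground pixels along rows and columns. It is not the authors' implementation: the function names, the `min_length` threshold, and the list-of-lists image representation are all assumptions made for this example, and the 26 features and the SVM stage are not reproduced here.

```python
from typing import List, Tuple

# A run is (line_index, start, end_exclusive). For horizontal runs the
# line index is a row; for vertical runs it is a column.
Run = Tuple[int, int, int]


def horizontal_runs(image: List[List[int]], min_length: int) -> List[Run]:
    """Return runs of consecutive foreground pixels (non-zero) in each row
    that are at least `min_length` pixels long (hypothetical threshold)."""
    runs: List[Run] = []
    for r, row in enumerate(image):
        start = None
        for c, pixel in enumerate(row):
            if pixel and start is None:
                start = c  # a new run begins
            elif not pixel and start is not None:
                if c - start >= min_length:
                    runs.append((r, start, c))
                start = None  # the run ended
        # Close a run that reaches the right border of the image.
        if start is not None and len(row) - start >= min_length:
            runs.append((r, start, len(row)))
    return runs


def vertical_runs(image: List[List[int]], min_length: int) -> List[Run]:
    """Same scan along columns, done by transposing the image."""
    transposed = [list(col) for col in zip(*image)]
    return horizontal_runs(transposed, min_length)


# Toy 4x5 binary image: a rectangular frame with one stray pixel inside.
image = [
    [1, 1, 1, 1, 1],
    [1, 0, 0, 0, 1],
    [1, 0, 1, 0, 1],
    [1, 1, 1, 1, 1],
]
h_lines = horizontal_runs(image, min_length=4)
v_lines = vertical_runs(image, min_length=4)
```

In this toy case only the four sides of the frame survive the length threshold; the stray pixel and short fragments are discarded. A real scanned page would also need binarization, noise tolerance, and grouping of intersecting lines before any classification step.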

DOI: 10.1109/ICDAR.2013.240

Links to Exploration step

Hal:hal-00934902

The document in XML format

<record>
<TEI>
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en">Learning to Detect Tables in Scanned Document Images using Line Information</title>
<author>
<name sortKey="Kasar, Thotreingam" sort="Kasar, Thotreingam" uniqKey="Kasar T" first="Thotreingam" last="Kasar">Thotreingam Kasar</name>
<affiliation>
<hal:affiliation type="researchteam" xml:id="struct-389520" status="INCOMING">
<orgName>DOCAPP</orgName>
<desc>
<address>
<country key="FR"></country>
</address>
</desc>
<listRelation>
<relation active="#struct-23832" type="direct"></relation>
<relation active="#struct-300317" type="indirect"></relation>
<relation name="EA4108" active="#struct-300318" type="indirect"></relation>
<relation active="#struct-301288" type="indirect"></relation>
<relation active="#struct-301232" type="indirect"></relation>
</listRelation>
<tutelles>
<tutelle active="#struct-23832" type="direct">
<org type="laboratory" xml:id="struct-23832" status="VALID">
<orgName>Laboratoire d'Informatique, de Traitement de l'Information et des Systèmes</orgName>
<orgName type="acronym">LITIS</orgName>
<desc>
<address>
<addrLine>Avenue de l'Université UFR des Sciences et Techniques 76800 Saint-Etienne du Rouvray</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.litislab.eu</ref>
</desc>
<listRelation>
<relation active="#struct-300317" type="direct"></relation>
<relation name="EA4108" active="#struct-300318" type="direct"></relation>
<relation active="#struct-301288" type="direct"></relation>
<relation active="#struct-301232" type="indirect"></relation>
</listRelation>
</org>
</tutelle>
<tutelle active="#struct-300317" type="indirect">
<org type="institution" xml:id="struct-300317" status="VALID">
<orgName>Université du Havre</orgName>
<desc>
<address>
<country key="FR"></country>
</address>
</desc>
</org>
</tutelle>
<tutelle name="EA4108" active="#struct-300318" type="indirect">
<org type="institution" xml:id="struct-300318" status="VALID">
<orgName>Université de Rouen</orgName>
<desc>
<address>
<addrLine> 1 rue Thomas Becket - 76821 Mont-Saint-Aignan</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.univ-rouen.fr/</ref>
</desc>
</org>
</tutelle>
<tutelle active="#struct-301288" type="indirect">
<org type="department" xml:id="struct-301288" status="VALID">
<orgName>Institut National des Sciences Appliquées - Rouen</orgName>
<orgName type="acronym">INSA Rouen</orgName>
<desc>
<address>
<country key="FR"></country>
</address>
</desc>
<listRelation>
<relation active="#struct-301232" type="direct"></relation>
</listRelation>
</org>
</tutelle>
<tutelle active="#struct-301232" type="indirect">
<org type="institution" xml:id="struct-301232" status="VALID">
<orgName>Institut National des Sciences Appliquées</orgName>
<orgName type="acronym">INSA</orgName>
<desc>
<address>
<country key="FR"></country>
</address>
</desc>
</org>
</tutelle>
</tutelles>
</hal:affiliation>
</affiliation>
</author>
<author>
<name sortKey="Barlas, Philippine" sort="Barlas, Philippine" uniqKey="Barlas P" first="Philippine" last="Barlas">Philippine Barlas</name>
<affiliation>
<hal:affiliation type="researchteam" xml:id="struct-389520" status="INCOMING">
<orgName>DOCAPP</orgName>
<desc>
<address>
<country key="FR"></country>
</address>
</desc>
<listRelation>
<relation active="#struct-23832" type="direct"></relation>
<relation active="#struct-300317" type="indirect"></relation>
<relation name="EA4108" active="#struct-300318" type="indirect"></relation>
<relation active="#struct-301288" type="indirect"></relation>
<relation active="#struct-301232" type="indirect"></relation>
</listRelation>
<tutelles>
<tutelle active="#struct-23832" type="direct">
<org type="laboratory" xml:id="struct-23832" status="VALID">
<orgName>Laboratoire d'Informatique, de Traitement de l'Information et des Systèmes</orgName>
<orgName type="acronym">LITIS</orgName>
<desc>
<address>
<addrLine>Avenue de l'Université UFR des Sciences et Techniques 76800 Saint-Etienne du Rouvray</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.litislab.eu</ref>
</desc>
<listRelation>
<relation active="#struct-300317" type="direct"></relation>
<relation name="EA4108" active="#struct-300318" type="direct"></relation>
<relation active="#struct-301288" type="direct"></relation>
<relation active="#struct-301232" type="indirect"></relation>
</listRelation>
</org>
</tutelle>
<tutelle active="#struct-300317" type="indirect">
<org type="institution" xml:id="struct-300317" status="VALID">
<orgName>Université du Havre</orgName>
<desc>
<address>
<country key="FR"></country>
</address>
</desc>
</org>
</tutelle>
<tutelle name="EA4108" active="#struct-300318" type="indirect">
<org type="institution" xml:id="struct-300318" status="VALID">
<orgName>Université de Rouen</orgName>
<desc>
<address>
<addrLine> 1 rue Thomas Becket - 76821 Mont-Saint-Aignan</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.univ-rouen.fr/</ref>
</desc>
</org>
</tutelle>
<tutelle active="#struct-301288" type="indirect">
<org type="department" xml:id="struct-301288" status="VALID">
<orgName>Institut National des Sciences Appliquées - Rouen</orgName>
<orgName type="acronym">INSA Rouen</orgName>
<desc>
<address>
<country key="FR"></country>
</address>
</desc>
<listRelation>
<relation active="#struct-301232" type="direct"></relation>
</listRelation>
</org>
</tutelle>
<tutelle active="#struct-301232" type="indirect">
<org type="institution" xml:id="struct-301232" status="VALID">
<orgName>Institut National des Sciences Appliquées</orgName>
<orgName type="acronym">INSA</orgName>
<desc>
<address>
<country key="FR"></country>
</address>
</desc>
</org>
</tutelle>
</tutelles>
</hal:affiliation>
</affiliation>
</author>
<author>
<name sortKey="Sebastien, Adam" sort="Sebastien, Adam" uniqKey="Sebastien A" first="Adam" last="Sébastien">Adam Sébastien</name>
<affiliation>
<hal:affiliation type="researchteam" xml:id="struct-389520" status="INCOMING">
<orgName>DOCAPP</orgName>
<desc>
<address>
<country key="FR"></country>
</address>
</desc>
<listRelation>
<relation active="#struct-23832" type="direct"></relation>
<relation active="#struct-300317" type="indirect"></relation>
<relation name="EA4108" active="#struct-300318" type="indirect"></relation>
<relation active="#struct-301288" type="indirect"></relation>
<relation active="#struct-301232" type="indirect"></relation>
</listRelation>
<tutelles>
<tutelle active="#struct-23832" type="direct">
<org type="laboratory" xml:id="struct-23832" status="VALID">
<orgName>Laboratoire d'Informatique, de Traitement de l'Information et des Systèmes</orgName>
<orgName type="acronym">LITIS</orgName>
<desc>
<address>
<addrLine>Avenue de l'Université UFR des Sciences et Techniques 76800 Saint-Etienne du Rouvray</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.litislab.eu</ref>
</desc>
<listRelation>
<relation active="#struct-300317" type="direct"></relation>
<relation name="EA4108" active="#struct-300318" type="direct"></relation>
<relation active="#struct-301288" type="direct"></relation>
<relation active="#struct-301232" type="indirect"></relation>
</listRelation>
</org>
</tutelle>
<tutelle active="#struct-300317" type="indirect">
<org type="institution" xml:id="struct-300317" status="VALID">
<orgName>Université du Havre</orgName>
<desc>
<address>
<country key="FR"></country>
</address>
</desc>
</org>
</tutelle>
<tutelle name="EA4108" active="#struct-300318" type="indirect">
<org type="institution" xml:id="struct-300318" status="VALID">
<orgName>Université de Rouen</orgName>
<desc>
<address>
<addrLine> 1 rue Thomas Becket - 76821 Mont-Saint-Aignan</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.univ-rouen.fr/</ref>
</desc>
</org>
</tutelle>
<tutelle active="#struct-301288" type="indirect">
<org type="department" xml:id="struct-301288" status="VALID">
<orgName>Institut National des Sciences Appliquées - Rouen</orgName>
<orgName type="acronym">INSA Rouen</orgName>
<desc>
<address>
<country key="FR"></country>
</address>
</desc>
<listRelation>
<relation active="#struct-301232" type="direct"></relation>
</listRelation>
</org>
</tutelle>
<tutelle active="#struct-301232" type="indirect">
<org type="institution" xml:id="struct-301232" status="VALID">
<orgName>Institut National des Sciences Appliquées</orgName>
<orgName type="acronym">INSA</orgName>
<desc>
<address>
<country key="FR"></country>
</address>
</desc>
</org>
</tutelle>
</tutelles>
</hal:affiliation>
</affiliation>
</author>
<author>
<name sortKey="Chatelain, Clement" sort="Chatelain, Clement" uniqKey="Chatelain C" first="Clément" last="Chatelain">Clément Chatelain</name>
<affiliation>
<hal:affiliation type="researchteam" xml:id="struct-389520" status="INCOMING">
<orgName>DOCAPP</orgName>
<desc>
<address>
<country key="FR"></country>
</address>
</desc>
<listRelation>
<relation active="#struct-23832" type="direct"></relation>
<relation active="#struct-300317" type="indirect"></relation>
<relation name="EA4108" active="#struct-300318" type="indirect"></relation>
<relation active="#struct-301288" type="indirect"></relation>
<relation active="#struct-301232" type="indirect"></relation>
</listRelation>
<tutelles>
<tutelle active="#struct-23832" type="direct">
<org type="laboratory" xml:id="struct-23832" status="VALID">
<orgName>Laboratoire d'Informatique, de Traitement de l'Information et des Systèmes</orgName>
<orgName type="acronym">LITIS</orgName>
<desc>
<address>
<addrLine>Avenue de l'Université UFR des Sciences et Techniques 76800 Saint-Etienne du Rouvray</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.litislab.eu</ref>
</desc>
<listRelation>
<relation active="#struct-300317" type="direct"></relation>
<relation name="EA4108" active="#struct-300318" type="direct"></relation>
<relation active="#struct-301288" type="direct"></relation>
<relation active="#struct-301232" type="indirect"></relation>
</listRelation>
</org>
</tutelle>
<tutelle active="#struct-300317" type="indirect">
<org type="institution" xml:id="struct-300317" status="VALID">
<orgName>Université du Havre</orgName>
<desc>
<address>
<country key="FR"></country>
</address>
</desc>
</org>
</tutelle>
<tutelle name="EA4108" active="#struct-300318" type="indirect">
<org type="institution" xml:id="struct-300318" status="VALID">
<orgName>Université de Rouen</orgName>
<desc>
<address>
<addrLine> 1 rue Thomas Becket - 76821 Mont-Saint-Aignan</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.univ-rouen.fr/</ref>
</desc>
</org>
</tutelle>
<tutelle active="#struct-301288" type="indirect">
<org type="department" xml:id="struct-301288" status="VALID">
<orgName>Institut National des Sciences Appliquées - Rouen</orgName>
<orgName type="acronym">INSA Rouen</orgName>
<desc>
<address>
<country key="FR"></country>
</address>
</desc>
<listRelation>
<relation active="#struct-301232" type="direct"></relation>
</listRelation>
</org>
</tutelle>
<tutelle active="#struct-301232" type="indirect">
<org type="institution" xml:id="struct-301232" status="VALID">
<orgName>Institut National des Sciences Appliquées</orgName>
<orgName type="acronym">INSA</orgName>
<desc>
<address>
<country key="FR"></country>
</address>
</desc>
</org>
</tutelle>
</tutelles>
</hal:affiliation>
</affiliation>
</author>
<author>
<name sortKey="Paquet, Thierry" sort="Paquet, Thierry" uniqKey="Paquet T" first="Thierry" last="Paquet">Thierry Paquet</name>
<affiliation>
<hal:affiliation type="researchteam" xml:id="struct-389520" status="INCOMING">
<orgName>DOCAPP</orgName>
<desc>
<address>
<country key="FR"></country>
</address>
</desc>
<listRelation>
<relation active="#struct-23832" type="direct"></relation>
<relation active="#struct-300317" type="indirect"></relation>
<relation name="EA4108" active="#struct-300318" type="indirect"></relation>
<relation active="#struct-301288" type="indirect"></relation>
<relation active="#struct-301232" type="indirect"></relation>
</listRelation>
<tutelles>
<tutelle active="#struct-23832" type="direct">
<org type="laboratory" xml:id="struct-23832" status="VALID">
<orgName>Laboratoire d'Informatique, de Traitement de l'Information et des Systèmes</orgName>
<orgName type="acronym">LITIS</orgName>
<desc>
<address>
<addrLine>Avenue de l'Université UFR des Sciences et Techniques 76800 Saint-Etienne du Rouvray</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.litislab.eu</ref>
</desc>
<listRelation>
<relation active="#struct-300317" type="direct"></relation>
<relation name="EA4108" active="#struct-300318" type="direct"></relation>
<relation active="#struct-301288" type="direct"></relation>
<relation active="#struct-301232" type="indirect"></relation>
</listRelation>
</org>
</tutelle>
<tutelle active="#struct-300317" type="indirect">
<org type="institution" xml:id="struct-300317" status="VALID">
<orgName>Université du Havre</orgName>
<desc>
<address>
<country key="FR"></country>
</address>
</desc>
</org>
</tutelle>
<tutelle name="EA4108" active="#struct-300318" type="indirect">
<org type="institution" xml:id="struct-300318" status="VALID">
<orgName>Université de Rouen</orgName>
<desc>
<address>
<addrLine> 1 rue Thomas Becket - 76821 Mont-Saint-Aignan</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.univ-rouen.fr/</ref>
</desc>
</org>
</tutelle>
<tutelle active="#struct-301288" type="indirect">
<org type="department" xml:id="struct-301288" status="VALID">
<orgName>Institut National des Sciences Appliquées - Rouen</orgName>
<orgName type="acronym">INSA Rouen</orgName>
<desc>
<address>
<country key="FR"></country>
</address>
</desc>
<listRelation>
<relation active="#struct-301232" type="direct"></relation>
</listRelation>
</org>
</tutelle>
<tutelle active="#struct-301232" type="indirect">
<org type="institution" xml:id="struct-301232" status="VALID">
<orgName>Institut National des Sciences Appliquées</orgName>
<orgName type="acronym">INSA</orgName>
<desc>
<address>
<country key="FR"></country>
</address>
</desc>
</org>
</tutelle>
</tutelles>
</hal:affiliation>
</affiliation>
</author>
</titleStmt>
<publicationStmt>
<idno type="wicri:source">HAL</idno>
<idno type="RBID">Hal:hal-00934902</idno>
<idno type="halId">hal-00934902</idno>
<idno type="halUri">https://hal.archives-ouvertes.fr/hal-00934902</idno>
<idno type="url">https://hal.archives-ouvertes.fr/hal-00934902</idno>
<idno type="doi">10.1109/ICDAR.2013.240</idno>
<date when="2013-08-25">2013-08-25</date>
<idno type="wicri:Area/Hal/Corpus">000077</idno>
</publicationStmt>
<sourceDesc>
<biblStruct>
<analytic>
<title xml:lang="en">Learning to Detect Tables in Scanned Document Images using Line Information</title>
<author>
<name sortKey="Kasar, Thotreingam" sort="Kasar, Thotreingam" uniqKey="Kasar T" first="Thotreingam" last="Kasar">Thotreingam Kasar</name>
<affiliation>
<hal:affiliation type="researchteam" xml:id="struct-389520" status="INCOMING">
<orgName>DOCAPP</orgName>
<desc>
<address>
<country key="FR"></country>
</address>
</desc>
<listRelation>
<relation active="#struct-23832" type="direct"></relation>
<relation active="#struct-300317" type="indirect"></relation>
<relation name="EA4108" active="#struct-300318" type="indirect"></relation>
<relation active="#struct-301288" type="indirect"></relation>
<relation active="#struct-301232" type="indirect"></relation>
</listRelation>
<tutelles>
<tutelle active="#struct-23832" type="direct">
<org type="laboratory" xml:id="struct-23832" status="VALID">
<orgName>Laboratoire d'Informatique, de Traitement de l'Information et des Systèmes</orgName>
<orgName type="acronym">LITIS</orgName>
<desc>
<address>
<addrLine>Avenue de l'Université UFR des Sciences et Techniques 76800 Saint-Etienne du Rouvray</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.litislab.eu</ref>
</desc>
<listRelation>
<relation active="#struct-300317" type="direct"></relation>
<relation name="EA4108" active="#struct-300318" type="direct"></relation>
<relation active="#struct-301288" type="direct"></relation>
<relation active="#struct-301232" type="indirect"></relation>
</listRelation>
</org>
</tutelle>
<tutelle active="#struct-300317" type="indirect">
<org type="institution" xml:id="struct-300317" status="VALID">
<orgName>Université du Havre</orgName>
<desc>
<address>
<country key="FR"></country>
</address>
</desc>
</org>
</tutelle>
<tutelle name="EA4108" active="#struct-300318" type="indirect">
<org type="institution" xml:id="struct-300318" status="VALID">
<orgName>Université de Rouen</orgName>
<desc>
<address>
<addrLine> 1 rue Thomas Becket - 76821 Mont-Saint-Aignan</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.univ-rouen.fr/</ref>
</desc>
</org>
</tutelle>
<tutelle active="#struct-301288" type="indirect">
<org type="department" xml:id="struct-301288" status="VALID">
<orgName>Institut National des Sciences Appliquées - Rouen</orgName>
<orgName type="acronym">INSA Rouen</orgName>
<desc>
<address>
<country key="FR"></country>
</address>
</desc>
<listRelation>
<relation active="#struct-301232" type="direct"></relation>
</listRelation>
</org>
</tutelle>
<tutelle active="#struct-301232" type="indirect">
<org type="institution" xml:id="struct-301232" status="VALID">
<orgName>Institut National des Sciences Appliquées</orgName>
<orgName type="acronym">INSA</orgName>
<desc>
<address>
<country key="FR"></country>
</address>
</desc>
</org>
</tutelle>
</tutelles>
</hal:affiliation>
</affiliation>
</author>
<author>
<name sortKey="Barlas, Philippine" sort="Barlas, Philippine" uniqKey="Barlas P" first="Philippine" last="Barlas">Philippine Barlas</name>
<affiliation>
<hal:affiliation type="researchteam" xml:id="struct-389520" status="INCOMING">
<orgName>DOCAPP</orgName>
<desc>
<address>
<country key="FR"></country>
</address>
</desc>
<listRelation>
<relation active="#struct-23832" type="direct"></relation>
<relation active="#struct-300317" type="indirect"></relation>
<relation name="EA4108" active="#struct-300318" type="indirect"></relation>
<relation active="#struct-301288" type="indirect"></relation>
<relation active="#struct-301232" type="indirect"></relation>
</listRelation>
<tutelles>
<tutelle active="#struct-23832" type="direct">
<org type="laboratory" xml:id="struct-23832" status="VALID">
<orgName>Laboratoire d'Informatique, de Traitement de l'Information et des Systèmes</orgName>
<orgName type="acronym">LITIS</orgName>
<desc>
<address>
<addrLine>Avenue de l'Université UFR des Sciences et Techniques 76800 Saint-Etienne du Rouvray</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.litislab.eu</ref>
</desc>
<listRelation>
<relation active="#struct-300317" type="direct"></relation>
<relation name="EA4108" active="#struct-300318" type="direct"></relation>
<relation active="#struct-301288" type="direct"></relation>
<relation active="#struct-301232" type="indirect"></relation>
</listRelation>
</org>
</tutelle>
<tutelle active="#struct-300317" type="indirect">
<org type="institution" xml:id="struct-300317" status="VALID">
<orgName>Université du Havre</orgName>
<desc>
<address>
<country key="FR"></country>
</address>
</desc>
</org>
</tutelle>
<tutelle name="EA4108" active="#struct-300318" type="indirect">
<org type="institution" xml:id="struct-300318" status="VALID">
<orgName>Université de Rouen</orgName>
<desc>
<address>
<addrLine> 1 rue Thomas Becket - 76821 Mont-Saint-Aignan</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.univ-rouen.fr/</ref>
</desc>
</org>
</tutelle>
<tutelle active="#struct-301288" type="indirect">
<org type="department" xml:id="struct-301288" status="VALID">
<orgName>Institut National des Sciences Appliquées - Rouen</orgName>
<orgName type="acronym">INSA Rouen</orgName>
<desc>
<address>
<country key="FR"></country>
</address>
</desc>
<listRelation>
<relation active="#struct-301232" type="direct"></relation>
</listRelation>
</org>
</tutelle>
<tutelle active="#struct-301232" type="indirect">
<org type="institution" xml:id="struct-301232" status="VALID">
<orgName>Institut National des Sciences Appliquées</orgName>
<orgName type="acronym">INSA</orgName>
<desc>
<address>
<country key="FR"></country>
</address>
</desc>
</org>
</tutelle>
</tutelles>
</hal:affiliation>
</affiliation>
</author>
<author>
<name sortKey="Sebastien, Adam" sort="Sebastien, Adam" uniqKey="Sebastien A" first="Adam" last="Sébastien">Adam Sébastien</name>
<affiliation>
<hal:affiliation type="researchteam" xml:id="struct-389520" status="INCOMING">
<orgName>DOCAPP</orgName>
<desc>
<address>
<country key="FR"></country>
</address>
</desc>
<listRelation>
<relation active="#struct-23832" type="direct"></relation>
<relation active="#struct-300317" type="indirect"></relation>
<relation name="EA4108" active="#struct-300318" type="indirect"></relation>
<relation active="#struct-301288" type="indirect"></relation>
<relation active="#struct-301232" type="indirect"></relation>
</listRelation>
<tutelles>
<tutelle active="#struct-23832" type="direct">
<org type="laboratory" xml:id="struct-23832" status="VALID">
<orgName>Laboratoire d'Informatique, de Traitement de l'Information et des Systèmes</orgName>
<orgName type="acronym">LITIS</orgName>
<desc>
<address>
<addrLine>Avenue de l'Université UFR des Sciences et Techniques 76800 Saint-Etienne du Rouvray</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.litislab.eu</ref>
</desc>
<listRelation>
<relation active="#struct-300317" type="direct"></relation>
<relation name="EA4108" active="#struct-300318" type="direct"></relation>
<relation active="#struct-301288" type="direct"></relation>
<relation active="#struct-301232" type="indirect"></relation>
</listRelation>
</org>
</tutelle>
<tutelle active="#struct-300317" type="indirect">
<org type="institution" xml:id="struct-300317" status="VALID">
<orgName>Université du Havre</orgName>
<desc>
<address>
<country key="FR"></country>
</address>
</desc>
</org>
</tutelle>
<tutelle name="EA4108" active="#struct-300318" type="indirect">
<org type="institution" xml:id="struct-300318" status="VALID">
<orgName>Université de Rouen</orgName>
<desc>
<address>
<addrLine> 1 rue Thomas Becket - 76821 Mont-Saint-Aignan</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.univ-rouen.fr/</ref>
</desc>
</org>
</tutelle>
<tutelle active="#struct-301288" type="indirect">
<org type="department" xml:id="struct-301288" status="VALID">
<orgName>Institut National des Sciences Appliquées - Rouen</orgName>
<orgName type="acronym">INSA Rouen</orgName>
<desc>
<address>
<country key="FR"></country>
</address>
</desc>
<listRelation>
<relation active="#struct-301232" type="direct"></relation>
</listRelation>
</org>
</tutelle>
<tutelle active="#struct-301232" type="indirect">
<org type="institution" xml:id="struct-301232" status="VALID">
<orgName>Institut National des Sciences Appliquées</orgName>
<orgName type="acronym">INSA</orgName>
<desc>
<address>
<country key="FR"></country>
</address>
</desc>
</org>
</tutelle>
</tutelles>
</hal:affiliation>
</affiliation>
</author>
<author>
<name sortKey="Chatelain, Clement" sort="Chatelain, Clement" uniqKey="Chatelain C" first="Clément" last="Chatelain">Clément Chatelain</name>
<affiliation>
<hal:affiliation type="researchteam" xml:id="struct-389520" status="INCOMING">
<orgName>DOCAPP</orgName>
<desc>
<address>
<country key="FR"></country>
</address>
</desc>
<listRelation>
<relation active="#struct-23832" type="direct"></relation>
<relation active="#struct-300317" type="indirect"></relation>
<relation name="EA4108" active="#struct-300318" type="indirect"></relation>
<relation active="#struct-301288" type="indirect"></relation>
<relation active="#struct-301232" type="indirect"></relation>
</listRelation>
<tutelles>
<tutelle active="#struct-23832" type="direct">
<org type="laboratory" xml:id="struct-23832" status="VALID">
<orgName>Laboratoire d'Informatique, de Traitement de l'Information et des Systèmes</orgName>
<orgName type="acronym">LITIS</orgName>
<desc>
<address>
<addrLine>Avenue de l'Université UFR des Sciences et Techniques 76800 Saint-Etienne du Rouvray</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.litislab.eu</ref>
</desc>
<listRelation>
<relation active="#struct-300317" type="direct"></relation>
<relation name="EA4108" active="#struct-300318" type="direct"></relation>
<relation active="#struct-301288" type="direct"></relation>
<relation active="#struct-301232" type="indirect"></relation>
</listRelation>
</org>
</tutelle>
<tutelle active="#struct-300317" type="indirect">
<org type="institution" xml:id="struct-300317" status="VALID">
<orgName>Université du Havre</orgName>
<desc>
<address>
<country key="FR"></country>
</address>
</desc>
</org>
</tutelle>
<tutelle name="EA4108" active="#struct-300318" type="indirect">
<org type="institution" xml:id="struct-300318" status="VALID">
<orgName>Université de Rouen</orgName>
<desc>
<address>
<addrLine> 1 rue Thomas Becket - 76821 Mont-Saint-Aignan</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.univ-rouen.fr/</ref>
</desc>
</org>
</tutelle>
<tutelle active="#struct-301288" type="indirect">
<org type="department" xml:id="struct-301288" status="VALID">
<orgName>Institut National des Sciences Appliquées - Rouen</orgName>
<orgName type="acronym">INSA Rouen</orgName>
<desc>
<address>
<country key="FR"></country>
</address>
</desc>
<listRelation>
<relation active="#struct-301232" type="direct"></relation>
</listRelation>
</org>
</tutelle>
<tutelle active="#struct-301232" type="indirect">
<org type="institution" xml:id="struct-301232" status="VALID">
<orgName>Institut National des Sciences Appliquées</orgName>
<orgName type="acronym">INSA</orgName>
<desc>
<address>
<country key="FR"></country>
</address>
</desc>
</org>
</tutelle>
</tutelles>
</hal:affiliation>
</affiliation>
</author>
<author>
<name sortKey="Paquet, Thierry" sort="Paquet, Thierry" uniqKey="Paquet T" first="Thierry" last="Paquet">Thierry Paquet</name>
<affiliation>
<hal:affiliation type="researchteam" xml:id="struct-389520" status="INCOMING">
<orgName>DOCAPP</orgName>
<desc>
<address>
<country key="FR"></country>
</address>
</desc>
<listRelation>
<relation active="#struct-23832" type="direct"></relation>
<relation active="#struct-300317" type="indirect"></relation>
<relation name="EA4108" active="#struct-300318" type="indirect"></relation>
<relation active="#struct-301288" type="indirect"></relation>
<relation active="#struct-301232" type="indirect"></relation>
</listRelation>
<tutelles>
<tutelle active="#struct-23832" type="direct">
<org type="laboratory" xml:id="struct-23832" status="VALID">
<orgName>Laboratoire d'Informatique, de Traitement de l'Information et des Systèmes</orgName>
<orgName type="acronym">LITIS</orgName>
<desc>
<address>
<addrLine>Avenue de l'Université UFR des Sciences et Techniques 76800 Saint-Etienne du Rouvray</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.litislab.eu</ref>
</desc>
<listRelation>
<relation active="#struct-300317" type="direct"></relation>
<relation name="EA4108" active="#struct-300318" type="direct"></relation>
<relation active="#struct-301288" type="direct"></relation>
<relation active="#struct-301232" type="indirect"></relation>
</listRelation>
</org>
</tutelle>
<tutelle active="#struct-300317" type="indirect">
<org type="institution" xml:id="struct-300317" status="VALID">
<orgName>Université du Havre</orgName>
<desc>
<address>
<country key="FR"></country>
</address>
</desc>
</org>
</tutelle>
<tutelle name="EA4108" active="#struct-300318" type="indirect">
<org type="institution" xml:id="struct-300318" status="VALID">
<orgName>Université de Rouen</orgName>
<desc>
<address>
<addrLine> 1 rue Thomas Becket - 76821 Mont-Saint-Aignan</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.univ-rouen.fr/</ref>
</desc>
</org>
</tutelle>
<tutelle active="#struct-301288" type="indirect">
<org type="department" xml:id="struct-301288" status="VALID">
<orgName>Institut National des Sciences Appliquées - Rouen</orgName>
<orgName type="acronym">INSA Rouen</orgName>
<desc>
<address>
<country key="FR"></country>
</address>
</desc>
<listRelation>
<relation active="#struct-301232" type="direct"></relation>
</listRelation>
</org>
</tutelle>
<tutelle active="#struct-301232" type="indirect">
<org type="institution" xml:id="struct-301232" status="VALID">
<orgName>Institut National des Sciences Appliquées</orgName>
<orgName type="acronym">INSA</orgName>
<desc>
<address>
<country key="FR"></country>
</address>
</desc>
</org>
</tutelle>
</tutelles>
</hal:affiliation>
</affiliation>
</author>
</analytic>
<idno type="DOI">10.1109/ICDAR.2013.240</idno>
</biblStruct>
</sourceDesc>
</fileDesc>
<profileDesc>
<textClass>
<keywords scheme="mix" xml:lang="fr">
<term>Table detection</term>
<term>line detection</term>
</keywords>
</textClass>
</profileDesc>
</teiHeader>
<front>
<div type="abstract" xml:lang="en">This paper presents a method to detect table regions in document images by identifying the column and row line-separators and their properties. The method employs a run-length approach to identify the horizontal and vertical lines present in the input image. From each group of intersecting horizontal and vertical lines, a set of 26 low-level features are extracted and an SVM classifier is used to test if it belongs to a table or not. The performance of the method is evaluated on a heterogeneous corpus of French, English and Arabic documents that contain various types of table structures and compared with that of the Tesseract OCR system.</div>
</front>
</TEI>
<hal api="V3">
<titleStmt>
<title xml:lang="en">Learning to Detect Tables in Scanned Document Images using Line Information</title>
<author role="aut">
<persName>
<forename type="first">Thotreingam</forename>
<surname>Kasar</surname>
</persName>
<email></email>
<idno type="halauthor">944933</idno>
<affiliation ref="#struct-389520"></affiliation>
</author>
<author role="aut">
<persName>
<forename type="first">Philippine</forename>
<surname>Barlas</surname>
</persName>
<email></email>
<idno type="halauthor">944934</idno>
<affiliation ref="#struct-389520"></affiliation>
</author>
<author role="aut">
<persName>
<forename type="first">Adam</forename>
<surname>Sébastien</surname>
</persName>
<email></email>
<idno type="halauthor">974130</idno>
<affiliation ref="#struct-389520"></affiliation>
</author>
<author role="aut">
<persName>
<forename type="first">Clément</forename>
<surname>Chatelain</surname>
</persName>
<email>clement.chatelain@univ-rouen.fr</email>
<idno type="idhal">clement-chatelain</idno>
<idno type="halauthor">134991</idno>
<affiliation ref="#struct-389520"></affiliation>
</author>
<author role="aut">
<persName>
<forename type="first">Thierry</forename>
<surname>Paquet</surname>
</persName>
<email></email>
<idno type="halauthor">86898</idno>
<affiliation ref="#struct-389520"></affiliation>
</author>
<editor role="depositor">
<persName>
<forename>Sébastien</forename>
<surname>Adam</surname>
</persName>
<email>Sebastien.Adam@univ-rouen.fr</email>
</editor>
</titleStmt>
<editionStmt>
<edition n="v1" type="current">
<date type="whenSubmitted">2014-01-22 17:16:18</date>
<date type="whenWritten">2013-02-04</date>
<date type="whenModified">2014-10-28 17:58:39</date>
<date type="whenReleased">2014-01-30 08:36:13</date>
<date type="whenProduced">2013-08-25</date>
<date type="whenEndEmbargoed">2014-01-22</date>
<ref type="file" target="https://hal.archives-ouvertes.fr/hal-00934902/document">
<date notBefore="2014-01-22"></date>
</ref>
<ref type="file" subtype="author" n="1" target="https://hal.archives-ouvertes.fr/hal-00934902/file/ICDAR_table_detection.pdf">
<date notBefore="2014-01-22"></date>
</ref>
</edition>
<respStmt>
<resp>contributor</resp>
<name key="131493">
<persName>
<forename>Sébastien</forename>
<surname>Adam</surname>
</persName>
<email>Sebastien.Adam@univ-rouen.fr</email>
</name>
</respStmt>
</editionStmt>
<publicationStmt>
<distributor>CCSD</distributor>
<idno type="halId">hal-00934902</idno>
<idno type="halUri">https://hal.archives-ouvertes.fr/hal-00934902</idno>
<idno type="halBibtex">kasar:hal-00934902</idno>
<idno type="halRefHtml">International Conference on Document Analysis and Recognition, Aug 2013, Washington, United States. pp.1185 - 1189, 2014, &lt;10.1109/ICDAR.2013.240&gt;</idno>
<idno type="halRef">International Conference on Document Analysis and Recognition, Aug 2013, Washington, United States. pp.1185 - 1189, 2014, &lt;10.1109/ICDAR.2013.240&gt;</idno>
</publicationStmt>
<seriesStmt>
<idno type="stamp" n="UNIV-LEHAVRE">Université du Havre</idno>
<idno type="stamp" n="UNIV-ROUEN">Université de Rouen</idno>
<idno type="stamp" n="LITIS">Laboratoire d'Informatique, de Traitement de l'Information et des Systèmes</idno>
</seriesStmt>
<notesStmt>
<note type="audience" n="2">International</note>
<note type="invited" n="0">No</note>
<note type="popular" n="0">No</note>
<note type="peer" n="1">Yes</note>
<note type="proceedings" n="1">Yes</note>
</notesStmt>
<sourceDesc>
<biblStruct>
<analytic>
<title xml:lang="en">Learning to Detect Tables in Scanned Document Images using Line Information</title>
<author role="aut">
<persName>
<forename type="first">Thotreingam</forename>
<surname>Kasar</surname>
</persName>
<idno type="halAuthorId">944933</idno>
<affiliation ref="#struct-389520"></affiliation>
</author>
<author role="aut">
<persName>
<forename type="first">Philippine</forename>
<surname>Barlas</surname>
</persName>
<idno type="halAuthorId">944934</idno>
<affiliation ref="#struct-389520"></affiliation>
</author>
<author role="aut">
<persName>
<forename type="first">Sébastien</forename>
<surname>Adam</surname>
</persName>
<idno type="halAuthorId">974130</idno>
<affiliation ref="#struct-389520"></affiliation>
</author>
<author role="aut">
<persName>
<forename type="first">Clément</forename>
<surname>Chatelain</surname>
</persName>
<email>clement.chatelain@univ-rouen.fr</email>
<idno type="idHal">clement-chatelain</idno>
<idno type="halAuthorId">134991</idno>
<affiliation ref="#struct-389520"></affiliation>
</author>
<author role="aut">
<persName>
<forename type="first">Thierry</forename>
<surname>Paquet</surname>
</persName>
<idno type="halAuthorId">86898</idno>
<affiliation ref="#struct-389520"></affiliation>
</author>
</analytic>
<monogr>
<title level="m">Proceedings of ICDAR'13</title>
<meeting>
<title>International Conference on Document Analysis and Recognition</title>
<date type="start">2013-08-25</date>
<date type="end">2013-08-28</date>
<settlement>Washington</settlement>
<country key="US">United States</country>
</meeting>
<imprint>
<biblScope unit="pp">1185 - 1189</biblScope>
<date type="datePub">2014-08-16</date>
</imprint>
</monogr>
<idno type="doi">10.1109/ICDAR.2013.240</idno>
</biblStruct>
</sourceDesc>
<profileDesc>
<langUsage>
<language ident="en">English</language>
</langUsage>
<textClass>
<keywords scheme="author">
<term xml:lang="fr">Table detection</term>
<term xml:lang="fr">line detection</term>
</keywords>
<classCode scheme="halDomain" n="info.info-tt">Computer Science [cs]/Document and Text Processing</classCode>
<classCode scheme="halTypology" n="COMM">Conference papers</classCode>
</textClass>
<abstract xml:lang="en">This paper presents a method to detect table regions in document images by identifying the column and row line-separators and their properties. The method employs a run-length approach to identify the horizontal and vertical lines present in the input image. From each group of intersecting horizontal and vertical lines, a set of 26 low-level features is extracted, and an SVM classifier is used to test whether the group belongs to a table. The performance of the method is evaluated on a heterogeneous corpus of French, English and Arabic documents that contain various types of table structures, and is compared with that of the Tesseract OCR system.</abstract>
</profileDesc>
</hal>
</record>
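
The abstract in the record above mentions a run-length approach for finding the horizontal and vertical lines in the input image. The following is only a rough sketch of that general idea, not the authors' implementation: it assumes a binarized image given as a list of rows with 1 for foreground pixels, and the function names `horizontal_runs` and `vertical_runs` and the `min_length` threshold are hypothetical choices made for illustration. The later steps named in the abstract (grouping intersecting lines, extracting the 26 features, SVM classification) are not covered here.

```python
from typing import List, Tuple

# A run is (line_index, start, end_exclusive) along the scanned direction.
Run = Tuple[int, int, int]


def horizontal_runs(image: List[List[int]], min_length: int) -> List[Run]:
    """Return (row, start_col, end_col) for every run of consecutive
    foreground pixels (value 1) that is at least min_length pixels long."""
    runs = []
    for r, row in enumerate(image):
        start = None
        # A trailing 0 acts as a sentinel so a run touching the right edge is closed.
        for c, pixel in enumerate(list(row) + [0]):
            if pixel and start is None:
                start = c
            elif not pixel and start is not None:
                if c - start >= min_length:
                    runs.append((r, start, c))
                start = None
    return runs


def vertical_runs(image: List[List[int]], min_length: int) -> List[Run]:
    """Same scan along columns; returns (col, start_row, end_row)."""
    columns = [list(col) for col in zip(*image)]  # transpose the image
    return horizontal_runs(columns, min_length)


# Toy 4x5 image: a rectangular frame with one stray pixel inside.
image = [
    [1, 1, 1, 1, 1],
    [1, 0, 0, 0, 1],
    [1, 0, 1, 0, 1],
    [1, 1, 1, 1, 1],
]
print(horizontal_runs(image, 4))  # [(0, 0, 5), (3, 0, 5)]
print(vertical_runs(image, 4))    # [(0, 0, 4), (4, 0, 4)]
```

Short runs, such as the stray pixel or the one-pixel ends of the frame's sides seen row by row, fall below `min_length` and are ignored, which is why only the four frame edges are reported.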

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Ticri/CIDE/explor/OcrV1/Data/Hal/Corpus
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000077 | SxmlIndent | more

Or

HfdSelect -h $EXPLOR_AREA/Data/Hal/Corpus/biblio.hfd -nk 000077 | SxmlIndent | more
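
The HfdSelect commands above print the XML record shown earlier. As a minimal sketch, assuming that output has been captured as a string (the `SAMPLE` text below is a shortened, hand-copied excerpt, and `extract_fields` is a hypothetical helper name), the DOI and author surnames could be pulled out with Python's standard `re` module. Regular expressions are used instead of an XML parser because the record as displayed contains prefixed elements such as `hal:affiliation` without a visible namespace declaration, which a strict parser may reject.

```python
import re

# Shortened excerpt of the record, copied by hand for illustration.
SAMPLE = """
<analytic>
<author role="aut">
<persName>
<forename type="first">Thotreingam</forename>
<surname>Kasar</surname>
</persName>
</author>
<author role="aut">
<persName>
<forename type="first">Thierry</forename>
<surname>Paquet</surname>
</persName>
</author>
</analytic>
<idno type="doi">10.1109/ICDAR.2013.240</idno>
"""


def extract_fields(record_text):
    """Return the first DOI and all surnames found in the record text."""
    # IGNORECASE lets the pattern match both type="doi" and type="DOI".
    doi_match = re.search(
        r'<idno type="doi">([^<]+)</idno>', record_text, flags=re.IGNORECASE
    )
    surnames = re.findall(r"<surname>([^<]+)</surname>", record_text)
    return {
        "doi": doi_match.group(1) if doi_match else None,
        "surnames": surnames,
    }


fields = extract_fields(SAMPLE)
print(fields["doi"])       # 10.1109/ICDAR.2013.240
print(fields["surnames"])  # ['Kasar', 'Paquet']
```

On the full record the surname list would contain repeats, since the authors appear in more than one block, so a caller may want to deduplicate it.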

To link to this page within the Wicri network

{{Explor lien
   |wiki=    Ticri/CIDE
   |area=    OcrV1
   |flux=    Hal
   |étape=   Corpus
   |type=    RBID
   |clé=     Hal:hal-00934902
   |texte=   Learning to Detect Tables in Scanned Document Images using Line Information
}}

This area was generated with Dilib version V0.6.32.
Data generation: Sat Nov 11 16:53:45 2017. Site generation: Mon Mar 11 23:15:16 2024