OCR exploration server

Warning: this site is under development.
Warning: this site was generated automatically from raw corpora.
The information has therefore not been validated.

Labelling logical structures of document images using a dynamic perceptive neural network

Internal identifier: 000077 (PascalFrancis/Corpus); previous: 000076; next: 000078

Authors: Yves Rangoni; Abdet Belaïd; Szilárd Vajda

RBID : Pascal:12-0415345

Abstract

This paper proposes a new method for labelling the logical structures of document images. The system starts with digitised images of paper documents, performs a physical layout analysis, runs an OCR and finally exploits the OCR's outputs to find the meaning of each block of text (i.e. assigns labels like "Title", "Author", etc.). The method is an extension of our previous work where a classifier, the perceptive neural network, has been developed to be an analogy of the human perception. We introduce in this connectionist model a temporal dimension by the use of a time-delay neural network with local representation. During the recognition stage, the system performs several recognition cycles and corrections, while keeping track and reusing the previous outputs. This dynamic classifier allows then a better handling of noise and segmentation errors. The experiments have been carried out on two datasets: the public MARG containing more than 1,500 front pages of scientific papers with four zones of interest and another one composed of documents from the Siggraph 2003 conference, where 21 logical structures have been identified. The error rate on MARG is less than 2.5% and 7.3% on the Siggraph dataset.

Record in standard format (ISO 2709)

See the documentation on the Inist Standard format.

pA  
A01 01  1    @0 1433-2833
A03   1    @0 Int. j. doc. anal. recognit. : (Print)
A05       @2 15
A06       @2 1
A08 01  1  ENG  @1 Labelling logical structures of document images using a dynamic perceptive neural network
A11 01  1    @1 RANGONI (Yves)
A11 02  1    @1 BELAÏD (Abdet)
A11 03  1    @1 VAJDA (Szilárd)
A14 01      @1 Nancy 2 University, LORIA @2 Vandœuvre-Lès-Nancy @3 FRA @Z 1 aut. @Z 2 aut.
A14 02      @1 Computer Science Department @2 TU Dortmund, Dortmund @3 DEU @Z 3 aut.
A20       @1 45-55
A21       @1 2012
A23 01      @0 ENG
A43 01      @1 INIST @2 26790 @5 354000508427810030
A44       @0 0000 @1 © 2012 INIST-CNRS. All rights reserved.
A45       @0 54 ref.
A47 01  1    @0 12-0415345
A60       @1 P
A61       @0 A
A64 01  1    @0 International journal on document analysis and recognition : (Print)
A66 01      @0 DEU
C01 01    ENG  @0 This paper proposes a new method for labelling the logical structures of document images. The system starts with digitised images of paper documents, performs a physical layout analysis, runs an OCR and finally exploits the OCR's outputs to find the meaning of each block of text (i.e. assigns labels like "Title", "Author", etc.). The method is an extension of our previous work where a classifier, the perceptive neural network, has been developed to be an analogy of the human perception. We introduce in this connectionist model a temporal dimension by the use of a time-delay neural network with local representation. During the recognition stage, the system performs several recognition cycles and corrections, while keeping track and reusing the previous outputs. This dynamic classifier allows then a better handling of noise and segmentation errors. The experiments have been carried out on two datasets: the public MARG containing more than 1,500 front pages of scientific papers with four zones of interest and another one composed of documents from the Siggraph 2003 conference, where 21 logical structures have been identified. The error rate on MARG is less than 2.5% and 7.3% on the Siggraph dataset.
C02 01  X    @0 001D02C03
C02 02  X    @0 001D02C06
C02 03  X    @0 001D02B07B
C03 01  X  FRE  @0 Etiquetage @5 06
C03 01  X  ENG  @0 Labelling @5 06
C03 01  X  SPA  @0 Etiquetaje @5 06
C03 02  X  FRE  @0 Système dynamique @5 07
C03 02  X  ENG  @0 Dynamical system @5 07
C03 02  X  SPA  @0 Sistema dinámico @5 07
C03 03  X  FRE  @0 Reconnaissance caractère @5 08
C03 03  X  ENG  @0 Character recognition @5 08
C03 03  X  SPA  @0 Reconocimiento carácter @5 08
C03 04  X  FRE  @0 Reconnaissance optique caractère @5 09
C03 04  X  ENG  @0 Optical character recognition @5 09
C03 04  X  SPA  @0 Reconocimento óptico de caracteres @5 09
C03 05  X  FRE  @0 Texte @5 10
C03 05  X  ENG  @0 Text @5 10
C03 05  X  SPA  @0 Texto @5 10
C03 06  X  FRE  @0 Classification @5 11
C03 06  X  ENG  @0 Classification @5 11
C03 06  X  SPA  @0 Clasificación @5 11
C03 07  X  FRE  @0 Analyse documentaire @5 12
C03 07  X  ENG  @0 Document analysis @5 12
C03 07  X  SPA  @0 Análisis documental @5 12
C03 08  X  FRE  @0 Analyse image @5 13
C03 08  X  ENG  @0 Image analysis @5 13
C03 08  X  SPA  @0 Análisis imagen @5 13
C03 09  X  FRE  @0 Reconnaissance image @5 14
C03 09  X  ENG  @0 Image recognition @5 14
C03 09  X  SPA  @0 Reconocimiento imagen @5 14
C03 10  X  FRE  @0 Traitement image @5 15
C03 10  X  ENG  @0 Image processing @5 15
C03 10  X  SPA  @0 Procesamiento imagen @5 15
C03 11  X  FRE  @0 Structure document @5 18
C03 11  X  ENG  @0 Document structure @5 18
C03 11  X  SPA  @0 Estructura documental @5 18
C03 12  X  FRE  @0 Présentation document @5 19
C03 12  X  ENG  @0 Document layout @5 19
C03 12  X  SPA  @0 Presentación documento @5 19
C03 13  X  FRE  @0 Perception sensorielle @5 20
C03 13  X  ENG  @0 Sensorial perception @5 20
C03 13  X  SPA  @0 Percepción sensorial @5 20
C03 14  X  FRE  @0 Temps occupation @5 21
C03 14  X  ENG  @0 Occupation time @5 21
C03 14  X  SPA  @0 Tiempo ocupación @5 21
C03 15  X  FRE  @0 Taux erreur @5 22
C03 15  X  ENG  @0 Error rate @5 22
C03 15  X  SPA  @0 Indice error @5 22
C03 16  X  FRE  @0 Réseau neuronal @5 23
C03 16  X  ENG  @0 Neural network @5 23
C03 16  X  SPA  @0 Red neuronal @5 23
C03 17  X  FRE  @0 Modèle dynamique @5 24
C03 17  X  ENG  @0 Dynamic model @5 24
C03 17  X  SPA  @0 Modelo dinámico @5 24
C03 18  X  FRE  @0 Modélisation @5 25
C03 18  X  ENG  @0 Modeling @5 25
C03 18  X  SPA  @0 Modelización @5 25
C03 19  X  FRE  @0 Temps retard @5 26
C03 19  X  ENG  @0 Delay time @5 26
C03 19  X  SPA  @0 Tiempo retardo @5 26
C03 20  X  FRE  @0 Système à retard @5 27
C03 20  X  ENG  @0 Delay system @5 27
C03 20  X  SPA  @0 Sistema con retardo @5 27
C03 21  X  FRE  @0 Segmentation @5 28
C03 21  X  ENG  @0 Segmentation @5 28
C03 21  X  SPA  @0 Segmentación @5 28
C03 22  X  FRE  @0 . @4 INC @5 82
N21       @1 324
N44 01      @1 OTO
N82       @1 OTO
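
Each line of the listing above appears to follow one layout: a field tag, optional short tokens, then one or more subfields introduced by `@` and a one-character code. The following is a minimal Python sketch for reading such a line, assuming that layout holds throughout. The function name `parse_standard_line` and the label "indicators" for the tokens between the tag and the first subfield are assumptions made for illustration; they are not taken from the Inist documentation.

```python
import re

# A subfield is "@<code> <value>", where the value runs up to the next " @<code> " or the end of the line.
SUBFIELD = re.compile(r"@(\w) (.*?)(?= @\w |$)")

def parse_standard_line(line):
    """Split one listing line into (tag, indicators, subfields).

    Returns None for an empty line. 'indicators' is a hypothetical name
    for the tokens found between the field tag and the first subfield.
    """
    head, _, _ = line.partition("@")      # everything before the first subfield
    tokens = head.split()
    if not tokens:
        return None
    tag, indicators = tokens[0], tokens[1:]
    subfields = SUBFIELD.findall(line)    # list of (code, value) pairs
    return tag, indicators, subfields

# Two lines copied from the record above.
for raw in (
    "A11 01  1    @1 RANGONI (Yves)",
    "A14 01      @1 Nancy 2 University, LORIA @2 Vandœuvre-Lès-Nancy @3 FRA @Z 1 aut. @Z 2 aut.",
):
    print(parse_standard_line(raw))
```

Subfields are kept as a list of pairs rather than a dictionary because a code such as `Z` can repeat within one field.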

Inist format (server)

NO : PASCAL 12-0415345 INIST
ET : Labelling logical structures of document images using a dynamic perceptive neural network
AU : RANGONI (Yves); BELAÏD (Abdet); VAJDA (Szilárd)
AF : Nancy 2 University, LORIA/Vandœuvre-Lès-Nancy/France (1 aut., 2 aut.); Computer Science Department/TU Dortmund, Dortmund/Allemagne (3 aut.)
DT : Publication en série; Niveau analytique
SO : International journal on document analysis and recognition : (Print); ISSN 1433-2833; Allemagne; Da. 2012; Vol. 15; No. 1; Pp. 45-55; Bibl. 54 ref.
LA : Anglais
EA : This paper proposes a new method for labelling the logical structures of document images. The system starts with digitised images of paper documents, performs a physical layout analysis, runs an OCR and finally exploits the OCR's outputs to find the meaning of each block of text (i.e. assigns labels like "Title", "Author", etc.). The method is an extension of our previous work where a classifier, the perceptive neural network, has been developed to be an analogy of the human perception. We introduce in this connectionist model a temporal dimension by the use of a time-delay neural network with local representation. During the recognition stage, the system performs several recognition cycles and corrections, while keeping track and reusing the previous outputs. This dynamic classifier allows then a better handling of noise and segmentation errors. The experiments have been carried out on two datasets: the public MARG containing more than 1,500 front pages of scientific papers with four zones of interest and another one composed of documents from the Siggraph 2003 conference, where 21 logical structures have been identified. The error rate on MARG is less than 2.5% and 7.3% on the Siggraph dataset.
CC : 001D02C03; 001D02C06; 001D02B07B
FD : Etiquetage; Système dynamique; Reconnaissance caractère; Reconnaissance optique caractère; Texte; Classification; Analyse documentaire; Analyse image; Reconnaissance image; Traitement image; Structure document; Présentation document; Perception sensorielle; Temps occupation; Taux erreur; Réseau neuronal; Modèle dynamique; Modélisation; Temps retard; Système à retard; Segmentation; .
ED : Labelling; Dynamical system; Character recognition; Optical character recognition; Text; Classification; Document analysis; Image analysis; Image recognition; Image processing; Document structure; Document layout; Sensorial perception; Occupation time; Error rate; Neural network; Dynamic model; Modeling; Delay time; Delay system; Segmentation
SD : Etiquetaje; Sistema dinámico; Reconocimiento carácter; Reconocimento óptico de caracteres; Texto; Clasificación; Análisis documental; Análisis imagen; Reconocimiento imagen; Procesamiento imagen; Estructura documental; Presentación documento; Percepción sensorial; Tiempo ocupación; Indice error; Red neuronal; Modelo dinámico; Modelización; Tiempo retardo; Sistema con retardo; Segmentación
LO : INIST-26790.354000508427810030
ID : 12-0415345
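
The server format above is a flat list of `TAG : value` lines, several of which hold lists separated by semicolons. As a rough illustration, the Python sketch below loads such lines into a dictionary. The function name `parse_server_record` and the choice of which tags to treat as multi-valued are assumptions based only on how this one record looks.

```python
# Assumption: these tags hold ';'-separated lists, judging from this record alone.
MULTI_VALUED = {"AU", "AF", "CC", "FD", "ED", "SD"}

def parse_server_record(text):
    """Turn 'TAG : value' lines into a dict; multi-valued tags become lists."""
    record = {}
    for line in text.splitlines():
        # Split on the first " : " only, since values may contain " : " themselves.
        tag, sep, value = line.partition(" : ")
        tag = tag.strip()
        if not sep or len(tag) != 2 or not tag.isupper():
            continue  # skip lines that do not look like a field
        value = value.strip()
        if tag in MULTI_VALUED:
            record[tag] = [item.strip() for item in value.split(";") if item.strip()]
        else:
            record[tag] = value
    return record

# A few lines copied from the record above.
sample = """\
NO : PASCAL 12-0415345 INIST
AU : RANGONI (Yves); BELAÏD (Abdet); VAJDA (Szilárd)
SO : International journal on document analysis and recognition : (Print); ISSN 1433-2833; Allemagne; Da. 2012; Vol. 15; No. 1; Pp. 45-55; Bibl. 54 ref.
CC : 001D02C03; 001D02C06; 001D02B07B
"""
record = parse_server_record(sample)
print(record["AU"])
print(record["CC"])
```

The `SO` field is deliberately left as a single string: its parts are separated by semicolons too, but they are heterogeneous (journal title, ISSN, country, date, volume, pages), so splitting them would need field-specific rules.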

Links to Exploration step

Pascal:12-0415345

The document in XML format

<record>
<TEI>
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en" level="a">Labelling logical structures of document images using a dynamic perceptive neural network</title>
<author>
<name sortKey="Rangoni, Yves" sort="Rangoni, Yves" uniqKey="Rangoni Y" first="Yves" last="Rangoni">Yves Rangoni</name>
<affiliation>
<inist:fA14 i1="01">
<s1>Nancy 2 University, LORIA</s1>
<s2>Vandœuvre-Lès-Nancy</s2>
<s3>FRA</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
</inist:fA14>
</affiliation>
</author>
<author>
<name sortKey="Belaid, Abdet" sort="Belaid, Abdet" uniqKey="Belaid A" first="Abdet" last="Belaïd">Abdet Belaïd</name>
<affiliation>
<inist:fA14 i1="01">
<s1>Nancy 2 University, LORIA</s1>
<s2>Vandœuvre-Lès-Nancy</s2>
<s3>FRA</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
</inist:fA14>
</affiliation>
</author>
<author>
<name sortKey="Vajda, Szilard" sort="Vajda, Szilard" uniqKey="Vajda S" first="Szilárd" last="Vajda">Szilárd Vajda</name>
<affiliation>
<inist:fA14 i1="02">
<s1>Computer Science Department</s1>
<s2>TU Dortmund, Dortmund</s2>
<s3>DEU</s3>
<sZ>3 aut.</sZ>
</inist:fA14>
</affiliation>
</author>
</titleStmt>
<publicationStmt>
<idno type="wicri:source">INIST</idno>
<idno type="inist">12-0415345</idno>
<date when="2012">2012</date>
<idno type="stanalyst">PASCAL 12-0415345 INIST</idno>
<idno type="RBID">Pascal:12-0415345</idno>
<idno type="wicri:Area/PascalFrancis/Corpus">000077</idno>
</publicationStmt>
<sourceDesc>
<biblStruct>
<analytic>
<title xml:lang="en" level="a">Labelling logical structures of document images using a dynamic perceptive neural network</title>
<author>
<name sortKey="Rangoni, Yves" sort="Rangoni, Yves" uniqKey="Rangoni Y" first="Yves" last="Rangoni">Yves Rangoni</name>
<affiliation>
<inist:fA14 i1="01">
<s1>Nancy 2 University, LORIA</s1>
<s2>Vandœuvre-Lès-Nancy</s2>
<s3>FRA</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
</inist:fA14>
</affiliation>
</author>
<author>
<name sortKey="Belaid, Abdet" sort="Belaid, Abdet" uniqKey="Belaid A" first="Abdet" last="Belaïd">Abdet Belaïd</name>
<affiliation>
<inist:fA14 i1="01">
<s1>Nancy 2 University, LORIA</s1>
<s2>Vandœuvre-Lès-Nancy</s2>
<s3>FRA</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
</inist:fA14>
</affiliation>
</author>
<author>
<name sortKey="Vajda, Szilard" sort="Vajda, Szilard" uniqKey="Vajda S" first="Szilárd" last="Vajda">Szilárd Vajda</name>
<affiliation>
<inist:fA14 i1="02">
<s1>Computer Science Department</s1>
<s2>TU Dortmund, Dortmund</s2>
<s3>DEU</s3>
<sZ>3 aut.</sZ>
</inist:fA14>
</affiliation>
</author>
</analytic>
<series>
<title level="j" type="main">International journal on document analysis and recognition : (Print)</title>
<title level="j" type="abbreviated">Int. j. doc. anal. recognit. : (Print)</title>
<idno type="ISSN">1433-2833</idno>
<imprint>
<date when="2012">2012</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
<seriesStmt>
<title level="j" type="main">International journal on document analysis and recognition : (Print)</title>
<title level="j" type="abbreviated">Int. j. doc. anal. recognit. : (Print)</title>
<idno type="ISSN">1433-2833</idno>
</seriesStmt>
</fileDesc>
<profileDesc>
<textClass>
<keywords scheme="KwdEn" xml:lang="en">
<term>Character recognition</term>
<term>Classification</term>
<term>Delay system</term>
<term>Delay time</term>
<term>Document analysis</term>
<term>Document layout</term>
<term>Document structure</term>
<term>Dynamic model</term>
<term>Dynamical system</term>
<term>Error rate</term>
<term>Image analysis</term>
<term>Image processing</term>
<term>Image recognition</term>
<term>Labelling</term>
<term>Modeling</term>
<term>Neural network</term>
<term>Occupation time</term>
<term>Optical character recognition</term>
<term>Segmentation</term>
<term>Sensorial perception</term>
<term>Text</term>
</keywords>
<keywords scheme="Pascal" xml:lang="fr">
<term>Etiquetage</term>
<term>Système dynamique</term>
<term>Reconnaissance caractère</term>
<term>Reconnaissance optique caractère</term>
<term>Texte</term>
<term>Classification</term>
<term>Analyse documentaire</term>
<term>Analyse image</term>
<term>Reconnaissance image</term>
<term>Traitement image</term>
<term>Structure document</term>
<term>Présentation document</term>
<term>Perception sensorielle</term>
<term>Temps occupation</term>
<term>Taux erreur</term>
<term>Réseau neuronal</term>
<term>Modèle dynamique</term>
<term>Modélisation</term>
<term>Temps retard</term>
<term>Système à retard</term>
<term>Segmentation</term>
<term>.</term>
</keywords>
</textClass>
</profileDesc>
</teiHeader>
<front>
<div type="abstract" xml:lang="en">This paper proposes a new method for labelling the logical structures of document images. The system starts with digitised images of paper documents, performs a physical layout analysis, runs an OCR and finally exploits the OCR's outputs to find the meaning of each block of text (i.e. assigns labels like "Title", "Author", etc.). The method is an extension of our previous work where a classifier, the perceptive neural network, has been developed to be an analogy of the human perception. We introduce in this connectionist model a temporal dimension by the use of a time-delay neural network with local representation. During the recognition stage, the system performs several recognition cycles and corrections, while keeping track and reusing the previous outputs. This dynamic classifier allows then a better handling of noise and segmentation errors. The experiments have been carried out on two datasets: the public MARG containing more than 1,500 front pages of scientific papers with four zones of interest and another one composed of documents from the Siggraph 2003 conference, where 21 logical structures have been identified. The error rate on MARG is less than 2.5% and 7.3% on the Siggraph dataset.</div>
</front>
</TEI>
<inist>
<standard h6="B">
<pA>
<fA01 i1="01" i2="1">
<s0>1433-2833</s0>
</fA01>
<fA03 i2="1">
<s0>Int. j. doc. anal. recognit. : (Print)</s0>
</fA03>
<fA05>
<s2>15</s2>
</fA05>
<fA06>
<s2>1</s2>
</fA06>
<fA08 i1="01" i2="1" l="ENG">
<s1>Labelling logical structures of document images using a dynamic perceptive neural network</s1>
</fA08>
<fA11 i1="01" i2="1">
<s1>RANGONI (Yves)</s1>
</fA11>
<fA11 i1="02" i2="1">
<s1>BELAÏD (Abdet)</s1>
</fA11>
<fA11 i1="03" i2="1">
<s1>VAJDA (Szilárd)</s1>
</fA11>
<fA14 i1="01">
<s1>Nancy 2 University, LORIA</s1>
<s2>Vandœuvre-Lès-Nancy</s2>
<s3>FRA</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
</fA14>
<fA14 i1="02">
<s1>Computer Science Department</s1>
<s2>TU Dortmund, Dortmund</s2>
<s3>DEU</s3>
<sZ>3 aut.</sZ>
</fA14>
<fA20>
<s1>45-55</s1>
</fA20>
<fA21>
<s1>2012</s1>
</fA21>
<fA23 i1="01">
<s0>ENG</s0>
</fA23>
<fA43 i1="01">
<s1>INIST</s1>
<s2>26790</s2>
<s5>354000508427810030</s5>
</fA43>
<fA44>
<s0>0000</s0>
<s1>© 2012 INIST-CNRS. All rights reserved.</s1>
</fA44>
<fA45>
<s0>54 ref.</s0>
</fA45>
<fA47 i1="01" i2="1">
<s0>12-0415345</s0>
</fA47>
<fA60>
<s1>P</s1>
</fA60>
<fA61>
<s0>A</s0>
</fA61>
<fA64 i1="01" i2="1">
<s0>International journal on document analysis and recognition : (Print)</s0>
</fA64>
<fA66 i1="01">
<s0>DEU</s0>
</fA66>
<fC01 i1="01" l="ENG">
<s0>This paper proposes a new method for labelling the logical structures of document images. The system starts with digitised images of paper documents, performs a physical layout analysis, runs an OCR and finally exploits the OCR's outputs to find the meaning of each block of text (i.e. assigns labels like "Title", "Author", etc.). The method is an extension of our previous work where a classifier, the perceptive neural network, has been developed to be an analogy of the human perception. We introduce in this connectionist model a temporal dimension by the use of a time-delay neural network with local representation. During the recognition stage, the system performs several recognition cycles and corrections, while keeping track and reusing the previous outputs. This dynamic classifier allows then a better handling of noise and segmentation errors. The experiments have been carried out on two datasets: the public MARG containing more than 1,500 front pages of scientific papers with four zones of interest and another one composed of documents from the Siggraph 2003 conference, where 21 logical structures have been identified. The error rate on MARG is less than 2.5% and 7.3% on the Siggraph dataset.</s0>
</fC01>
<fC02 i1="01" i2="X">
<s0>001D02C03</s0>
</fC02>
<fC02 i1="02" i2="X">
<s0>001D02C06</s0>
</fC02>
<fC02 i1="03" i2="X">
<s0>001D02B07B</s0>
</fC02>
<fC03 i1="01" i2="X" l="FRE">
<s0>Etiquetage</s0>
<s5>06</s5>
</fC03>
<fC03 i1="01" i2="X" l="ENG">
<s0>Labelling</s0>
<s5>06</s5>
</fC03>
<fC03 i1="01" i2="X" l="SPA">
<s0>Etiquetaje</s0>
<s5>06</s5>
</fC03>
<fC03 i1="02" i2="X" l="FRE">
<s0>Système dynamique</s0>
<s5>07</s5>
</fC03>
<fC03 i1="02" i2="X" l="ENG">
<s0>Dynamical system</s0>
<s5>07</s5>
</fC03>
<fC03 i1="02" i2="X" l="SPA">
<s0>Sistema dinámico</s0>
<s5>07</s5>
</fC03>
<fC03 i1="03" i2="X" l="FRE">
<s0>Reconnaissance caractère</s0>
<s5>08</s5>
</fC03>
<fC03 i1="03" i2="X" l="ENG">
<s0>Character recognition</s0>
<s5>08</s5>
</fC03>
<fC03 i1="03" i2="X" l="SPA">
<s0>Reconocimiento carácter</s0>
<s5>08</s5>
</fC03>
<fC03 i1="04" i2="X" l="FRE">
<s0>Reconnaissance optique caractère</s0>
<s5>09</s5>
</fC03>
<fC03 i1="04" i2="X" l="ENG">
<s0>Optical character recognition</s0>
<s5>09</s5>
</fC03>
<fC03 i1="04" i2="X" l="SPA">
<s0>Reconocimento óptico de caracteres</s0>
<s5>09</s5>
</fC03>
<fC03 i1="05" i2="X" l="FRE">
<s0>Texte</s0>
<s5>10</s5>
</fC03>
<fC03 i1="05" i2="X" l="ENG">
<s0>Text</s0>
<s5>10</s5>
</fC03>
<fC03 i1="05" i2="X" l="SPA">
<s0>Texto</s0>
<s5>10</s5>
</fC03>
<fC03 i1="06" i2="X" l="FRE">
<s0>Classification</s0>
<s5>11</s5>
</fC03>
<fC03 i1="06" i2="X" l="ENG">
<s0>Classification</s0>
<s5>11</s5>
</fC03>
<fC03 i1="06" i2="X" l="SPA">
<s0>Clasificación</s0>
<s5>11</s5>
</fC03>
<fC03 i1="07" i2="X" l="FRE">
<s0>Analyse documentaire</s0>
<s5>12</s5>
</fC03>
<fC03 i1="07" i2="X" l="ENG">
<s0>Document analysis</s0>
<s5>12</s5>
</fC03>
<fC03 i1="07" i2="X" l="SPA">
<s0>Análisis documental</s0>
<s5>12</s5>
</fC03>
<fC03 i1="08" i2="X" l="FRE">
<s0>Analyse image</s0>
<s5>13</s5>
</fC03>
<fC03 i1="08" i2="X" l="ENG">
<s0>Image analysis</s0>
<s5>13</s5>
</fC03>
<fC03 i1="08" i2="X" l="SPA">
<s0>Análisis imagen</s0>
<s5>13</s5>
</fC03>
<fC03 i1="09" i2="X" l="FRE">
<s0>Reconnaissance image</s0>
<s5>14</s5>
</fC03>
<fC03 i1="09" i2="X" l="ENG">
<s0>Image recognition</s0>
<s5>14</s5>
</fC03>
<fC03 i1="09" i2="X" l="SPA">
<s0>Reconocimiento imagen</s0>
<s5>14</s5>
</fC03>
<fC03 i1="10" i2="X" l="FRE">
<s0>Traitement image</s0>
<s5>15</s5>
</fC03>
<fC03 i1="10" i2="X" l="ENG">
<s0>Image processing</s0>
<s5>15</s5>
</fC03>
<fC03 i1="10" i2="X" l="SPA">
<s0>Procesamiento imagen</s0>
<s5>15</s5>
</fC03>
<fC03 i1="11" i2="X" l="FRE">
<s0>Structure document</s0>
<s5>18</s5>
</fC03>
<fC03 i1="11" i2="X" l="ENG">
<s0>Document structure</s0>
<s5>18</s5>
</fC03>
<fC03 i1="11" i2="X" l="SPA">
<s0>Estructura documental</s0>
<s5>18</s5>
</fC03>
<fC03 i1="12" i2="X" l="FRE">
<s0>Présentation document</s0>
<s5>19</s5>
</fC03>
<fC03 i1="12" i2="X" l="ENG">
<s0>Document layout</s0>
<s5>19</s5>
</fC03>
<fC03 i1="12" i2="X" l="SPA">
<s0>Presentación documento</s0>
<s5>19</s5>
</fC03>
<fC03 i1="13" i2="X" l="FRE">
<s0>Perception sensorielle</s0>
<s5>20</s5>
</fC03>
<fC03 i1="13" i2="X" l="ENG">
<s0>Sensorial perception</s0>
<s5>20</s5>
</fC03>
<fC03 i1="13" i2="X" l="SPA">
<s0>Percepción sensorial</s0>
<s5>20</s5>
</fC03>
<fC03 i1="14" i2="X" l="FRE">
<s0>Temps occupation</s0>
<s5>21</s5>
</fC03>
<fC03 i1="14" i2="X" l="ENG">
<s0>Occupation time</s0>
<s5>21</s5>
</fC03>
<fC03 i1="14" i2="X" l="SPA">
<s0>Tiempo ocupación</s0>
<s5>21</s5>
</fC03>
<fC03 i1="15" i2="X" l="FRE">
<s0>Taux erreur</s0>
<s5>22</s5>
</fC03>
<fC03 i1="15" i2="X" l="ENG">
<s0>Error rate</s0>
<s5>22</s5>
</fC03>
<fC03 i1="15" i2="X" l="SPA">
<s0>Indice error</s0>
<s5>22</s5>
</fC03>
<fC03 i1="16" i2="X" l="FRE">
<s0>Réseau neuronal</s0>
<s5>23</s5>
</fC03>
<fC03 i1="16" i2="X" l="ENG">
<s0>Neural network</s0>
<s5>23</s5>
</fC03>
<fC03 i1="16" i2="X" l="SPA">
<s0>Red neuronal</s0>
<s5>23</s5>
</fC03>
<fC03 i1="17" i2="X" l="FRE">
<s0>Modèle dynamique</s0>
<s5>24</s5>
</fC03>
<fC03 i1="17" i2="X" l="ENG">
<s0>Dynamic model</s0>
<s5>24</s5>
</fC03>
<fC03 i1="17" i2="X" l="SPA">
<s0>Modelo dinámico</s0>
<s5>24</s5>
</fC03>
<fC03 i1="18" i2="X" l="FRE">
<s0>Modélisation</s0>
<s5>25</s5>
</fC03>
<fC03 i1="18" i2="X" l="ENG">
<s0>Modeling</s0>
<s5>25</s5>
</fC03>
<fC03 i1="18" i2="X" l="SPA">
<s0>Modelización</s0>
<s5>25</s5>
</fC03>
<fC03 i1="19" i2="X" l="FRE">
<s0>Temps retard</s0>
<s5>26</s5>
</fC03>
<fC03 i1="19" i2="X" l="ENG">
<s0>Delay time</s0>
<s5>26</s5>
</fC03>
<fC03 i1="19" i2="X" l="SPA">
<s0>Tiempo retardo</s0>
<s5>26</s5>
</fC03>
<fC03 i1="20" i2="X" l="FRE">
<s0>Système à retard</s0>
<s5>27</s5>
</fC03>
<fC03 i1="20" i2="X" l="ENG">
<s0>Delay system</s0>
<s5>27</s5>
</fC03>
<fC03 i1="20" i2="X" l="SPA">
<s0>Sistema con retardo</s0>
<s5>27</s5>
</fC03>
<fC03 i1="21" i2="X" l="FRE">
<s0>Segmentation</s0>
<s5>28</s5>
</fC03>
<fC03 i1="21" i2="X" l="ENG">
<s0>Segmentation</s0>
<s5>28</s5>
</fC03>
<fC03 i1="21" i2="X" l="SPA">
<s0>Segmentación</s0>
<s5>28</s5>
</fC03>
<fC03 i1="22" i2="X" l="FRE">
<s0>.</s0>
<s4>INC</s4>
<s5>82</s5>
</fC03>
<fN21>
<s1>324</s1>
</fN21>
<fN44 i1="01">
<s1>OTO</s1>
</fN44>
<fN82>
<s1>OTO</s1>
</fN82>
</pA>
</standard>
<server>
<NO>PASCAL 12-0415345 INIST</NO>
<ET>Labelling logical structures of document images using a dynamic perceptive neural network</ET>
<AU>RANGONI (Yves); BELAÏD (Abdet); VAJDA (Szilárd)</AU>
<AF>Nancy 2 University, LORIA/Vandœuvre-Lès-Nancy/France (1 aut., 2 aut.); Computer Science Department/TU Dortmund, Dortmund/Allemagne (3 aut.)</AF>
<DT>Publication en série; Niveau analytique</DT>
<SO>International journal on document analysis and recognition : (Print); ISSN 1433-2833; Allemagne; Da. 2012; Vol. 15; No. 1; Pp. 45-55; Bibl. 54 ref.</SO>
<LA>Anglais</LA>
<EA>This paper proposes a new method for labelling the logical structures of document images. The system starts with digitised images of paper documents, performs a physical layout analysis, runs an OCR and finally exploits the OCR's outputs to find the meaning of each block of text (i.e. assigns labels like "Title", "Author", etc.). The method is an extension of our previous work where a classifier, the perceptive neural network, has been developed to be an analogy of the human perception. We introduce in this connectionist model a temporal dimension by the use of a time-delay neural network with local representation. During the recognition stage, the system performs several recognition cycles and corrections, while keeping track and reusing the previous outputs. This dynamic classifier allows then a better handling of noise and segmentation errors. The experiments have been carried out on two datasets: the public MARG containing more than 1,500 front pages of scientific papers with four zones of interest and another one composed of documents from the Siggraph 2003 conference, where 21 logical structures have been identified. The error rate on MARG is less than 2.5% and 7.3% on the Siggraph dataset.</EA>
<CC>001D02C03; 001D02C06; 001D02B07B</CC>
<FD>Etiquetage; Système dynamique; Reconnaissance caractère; Reconnaissance optique caractère; Texte; Classification; Analyse documentaire; Analyse image; Reconnaissance image; Traitement image; Structure document; Présentation document; Perception sensorielle; Temps occupation; Taux erreur; Réseau neuronal; Modèle dynamique; Modélisation; Temps retard; Système à retard; Segmentation; .</FD>
<ED>Labelling; Dynamical system; Character recognition; Optical character recognition; Text; Classification; Document analysis; Image analysis; Image recognition; Image processing; Document structure; Document layout; Sensorial perception; Occupation time; Error rate; Neural network; Dynamic model; Modeling; Delay time; Delay system; Segmentation</ED>
<SD>Etiquetaje; Sistema dinámico; Reconocimiento carácter; Reconocimento óptico de caracteres; Texto; Clasificación; Análisis documental; Análisis imagen; Reconocimiento imagen; Procesamiento imagen; Estructura documental; Presentación documento; Percepción sensorial; Tiempo ocupación; Indice error; Red neuronal; Modelo dinámico; Modelización; Tiempo retardo; Sistema con retardo; Segmentación</SD>
<LO>INIST-26790.354000508427810030</LO>
<ID>12-0415345</ID>
</server>
</inist>
</record>

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Ticri/CIDE/explor/OcrV1/Data/PascalFrancis/Corpus
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000077 | SxmlIndent | more

Or

HfdSelect -h $EXPLOR_AREA/Data/PascalFrancis/Corpus/biblio.hfd -nk 000077 | SxmlIndent | more

To link to this page within the Wicri network

{{Explor lien
   |wiki=    Ticri/CIDE
   |area=    OcrV1
   |flux=    PascalFrancis
   |étape=   Corpus
   |type=    RBID
   |clé=     Pascal:12-0415345
   |texte=   Labelling logical structures of document images using a dynamic perceptive neural network
}}

Wicri

This area was generated with Dilib version V0.6.32.
Data generation: Sat Nov 11 16:53:45 2017. Site generation: Mon Mar 11 23:15:16 2024