Labelling logical structures of document images using a dynamic perceptive neural network

Internal identifier: 000077 (PascalFrancis/Corpus); previous: 000076; next: 000078

Authors: Yves Rangoni; Abdet Belaïd; Szilárd Vajda

Source:

International journal on document analysis and recognition : (Print) [ISSN 1433-2833]; 2012.

RBID : Pascal:12-0415345

French descriptors

Pascal (Inist)
- Etiquetage, Système dynamique, Reconnaissance caractère, Reconnaissance optique caractère, Texte, Classification, Analyse documentaire, Analyse image, Reconnaissance image, Traitement image, Structure document, Présentation document, Perception sensorielle, Temps occupation, Taux erreur, Réseau neuronal, Modèle dynamique, Modélisation, Temps retard, Système à retard, Segmentation.

English descriptors

KwdEn :
- Character recognition, Classification, Delay system, Delay time, Document analysis, Document layout, Document structure, Dynamic model, Dynamical system, Error rate, Image analysis, Image processing, Image recognition, Labelling, Modeling, Neural network, Occupation time, Optical character recognition, Segmentation, Sensorial perception, Text.

Abstract

This paper proposes a new method for labelling the logical structures of document images. The system starts with digitised images of paper documents, performs a physical layout analysis, runs OCR, and finally exploits the OCR output to find the meaning of each block of text (i.e. it assigns labels such as "Title", "Author", etc.). The method extends our previous work, in which a classifier, the perceptive neural network, was developed as an analogy of human perception. We introduce a temporal dimension into this connectionist model by using a time-delay neural network with local representation. During the recognition stage, the system performs several cycles of recognition and correction, while keeping track of and reusing the previous outputs. This dynamic classifier thus handles noise and segmentation errors better. The experiments were carried out on two datasets: the public MARG dataset, containing more than 1,500 front pages of scientific papers with four zones of interest, and a second dataset composed of documents from the Siggraph 2003 conference, in which 21 logical structures have been identified. The error rate is less than 2.5% on MARG and 7.3% on the Siggraph dataset.

Record in standard format (ISO 2709)

See the documentation on the Inist Standard format.

A01	`01`	`1`		`@0 1433-2833`
A03		`1`		`@0 Int. j. doc. anal. recognit. : (Print)`
A05				`@2 15`
A06				`@2 1`
A08	`01`	`1`	`ENG`	`@1 Labelling logical structures of document images using a dynamic perceptive neural network`
A11	`01`	`1`		`@1 RANGONI (Yves)`
A11	`02`	`1`		`@1 BELAÏD (Abdet)`
A11	`03`	`1`		`@1 VAJDA (Szilárd)`
A14	`01`			`@1 Nancy 2 University, LORIA @2 Vandœuvre-Lès-Nancy @3 FRA @Z 1 aut. @Z 2 aut.`
A14	`02`			`@1 Computer Science Department @2 TU Dortmund, Dortmund @3 DEU @Z 3 aut.`
A20				`@1 45-55`
A21				`@1 2012`
A23	`01`			`@0 ENG`
A43	`01`			`@1 INIST @2 26790 @5 354000508427810030`
A44				`@0 0000 @1 © 2012 INIST-CNRS. All rights reserved.`
A45				`@0 54 ref.`
A47	`01`	`1`		`@0 12-0415345`
A60				`@1 P`
A61				`@0 A`
A64	`01`	`1`		`@0 International journal on document analysis and recognition : (Print)`
A66	`01`			`@0 DEU`
C01	`01`		`ENG`	`@0 This paper proposes a new method for labelling the logical structures of document images. The system starts with digitised images of paper documents, performs a physical layout analysis, runs an OCR and finally exploits the OCR's outputs to find the meaning of each block of text (i.e. assigns labels like "Title", "Author", etc.). The method is an extension of our previous work where a classifier, the perceptive neural network, has been developed to be an analogy of the human perception. We introduce in this connectionist model a temporal dimension by the use of a time-delay neural network with local representation. During the recognition stage, the system performs several recognition cycles and corrections, while keeping track and reusing the previous outputs. This dynamic classifier allows then a better handling of noise and segmentation errors. The experiments have been carried out on two datasets: the public MARG containing more than 1,500 front pages of scientific papers with four zones of interest and another one composed of documents from the Siggraph 2003 conference, where 21 logical structures have been identified. The error rate on MARG is less than 2.5% and 7.3% on the Siggraph dataset.`
C02	`01`	`X`		`@0 001D02C03`
C02	`02`	`X`		`@0 001D02C06`
C02	`03`	`X`		`@0 001D02B07B`
C03	`01`	`X`	`FRE`	`@0 Etiquetage @5 06`
C03	`01`	`X`	`ENG`	`@0 Labelling @5 06`
C03	`01`	`X`	`SPA`	`@0 Etiquetaje @5 06`
C03	`02`	`X`	`FRE`	`@0 Système dynamique @5 07`
C03	`02`	`X`	`ENG`	`@0 Dynamical system @5 07`
C03	`02`	`X`	`SPA`	`@0 Sistema dinámico @5 07`
C03	`03`	`X`	`FRE`	`@0 Reconnaissance caractère @5 08`
C03	`03`	`X`	`ENG`	`@0 Character recognition @5 08`
C03	`03`	`X`	`SPA`	`@0 Reconocimiento carácter @5 08`
C03	`04`	`X`	`FRE`	`@0 Reconnaissance optique caractère @5 09`
C03	`04`	`X`	`ENG`	`@0 Optical character recognition @5 09`
C03	`04`	`X`	`SPA`	`@0 Reconocimiento óptico de caracteres @5 09`
C03	`05`	`X`	`FRE`	`@0 Texte @5 10`
C03	`05`	`X`	`ENG`	`@0 Text @5 10`
C03	`05`	`X`	`SPA`	`@0 Texto @5 10`
C03	`06`	`X`	`FRE`	`@0 Classification @5 11`
C03	`06`	`X`	`ENG`	`@0 Classification @5 11`
C03	`06`	`X`	`SPA`	`@0 Clasificación @5 11`
C03	`07`	`X`	`FRE`	`@0 Analyse documentaire @5 12`
C03	`07`	`X`	`ENG`	`@0 Document analysis @5 12`
C03	`07`	`X`	`SPA`	`@0 Análisis documental @5 12`
C03	`08`	`X`	`FRE`	`@0 Analyse image @5 13`
C03	`08`	`X`	`ENG`	`@0 Image analysis @5 13`
C03	`08`	`X`	`SPA`	`@0 Análisis imagen @5 13`
C03	`09`	`X`	`FRE`	`@0 Reconnaissance image @5 14`
C03	`09`	`X`	`ENG`	`@0 Image recognition @5 14`
C03	`09`	`X`	`SPA`	`@0 Reconocimiento imagen @5 14`
C03	`10`	`X`	`FRE`	`@0 Traitement image @5 15`
C03	`10`	`X`	`ENG`	`@0 Image processing @5 15`
C03	`10`	`X`	`SPA`	`@0 Procesamiento imagen @5 15`
C03	`11`	`X`	`FRE`	`@0 Structure document @5 18`
C03	`11`	`X`	`ENG`	`@0 Document structure @5 18`
C03	`11`	`X`	`SPA`	`@0 Estructura documental @5 18`
C03	`12`	`X`	`FRE`	`@0 Présentation document @5 19`
C03	`12`	`X`	`ENG`	`@0 Document layout @5 19`
C03	`12`	`X`	`SPA`	`@0 Presentación documento @5 19`
C03	`13`	`X`	`FRE`	`@0 Perception sensorielle @5 20`
C03	`13`	`X`	`ENG`	`@0 Sensorial perception @5 20`
C03	`13`	`X`	`SPA`	`@0 Percepción sensorial @5 20`
C03	`14`	`X`	`FRE`	`@0 Temps occupation @5 21`
C03	`14`	`X`	`ENG`	`@0 Occupation time @5 21`
C03	`14`	`X`	`SPA`	`@0 Tiempo ocupación @5 21`
C03	`15`	`X`	`FRE`	`@0 Taux erreur @5 22`
C03	`15`	`X`	`ENG`	`@0 Error rate @5 22`
C03	`15`	`X`	`SPA`	`@0 Indice error @5 22`
C03	`16`	`X`	`FRE`	`@0 Réseau neuronal @5 23`
C03	`16`	`X`	`ENG`	`@0 Neural network @5 23`
C03	`16`	`X`	`SPA`	`@0 Red neuronal @5 23`
C03	`17`	`X`	`FRE`	`@0 Modèle dynamique @5 24`
C03	`17`	`X`	`ENG`	`@0 Dynamic model @5 24`
C03	`17`	`X`	`SPA`	`@0 Modelo dinámico @5 24`
C03	`18`	`X`	`FRE`	`@0 Modélisation @5 25`
C03	`18`	`X`	`ENG`	`@0 Modeling @5 25`
C03	`18`	`X`	`SPA`	`@0 Modelización @5 25`
C03	`19`	`X`	`FRE`	`@0 Temps retard @5 26`
C03	`19`	`X`	`ENG`	`@0 Delay time @5 26`
C03	`19`	`X`	`SPA`	`@0 Tiempo retardo @5 26`
C03	`20`	`X`	`FRE`	`@0 Système à retard @5 27`
C03	`20`	`X`	`ENG`	`@0 Delay system @5 27`
C03	`20`	`X`	`SPA`	`@0 Sistema con retardo @5 27`
C03	`21`	`X`	`FRE`	`@0 Segmentation @5 28`
C03	`21`	`X`	`ENG`	`@0 Segmentation @5 28`
C03	`21`	`X`	`SPA`	`@0 Segmentación @5 28`
C03	`22`	`X`	`FRE`	`@0 . @4 INC @5 82`
N21				`@1 324`
N44	`01`			`@1 OTO`
N82				`@1 OTO`
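
The field lines above follow a regular pattern: a tag, two indicators, a language code, and a body made of `@`-prefixed subfields. The following Python sketch shows one way such a line could be split into its parts. It assumes the columns are tab-separated and the cell values are wrapped in backticks, as they appear on this page; the function name `parse_field` and the returned dictionary layout are hypothetical and are not part of Dilib or of the Inist format documentation.

```python
import re

# A subfield is "@<code> <value>", ending at the next " @<code> " or at end of line.
SUBFIELD = re.compile(r"@(\w) (.*?)(?= @\w |$)")


def parse_field(line):
    """Split one tab-separated field line into tag, indicators, language, subfields."""
    # Remove the backticks that wrap each cell on this page (assumed display convention).
    cells = [cell.strip("`") for cell in line.split("\t")]
    # Pad short lines so that unpacking five columns always works.
    cells += [""] * (5 - len(cells))
    tag, i1, i2, lang, body = cells[:5]
    return {
        "tag": tag,
        "i1": i1,
        "i2": i2,
        "lang": lang,
        "subfields": SUBFIELD.findall(body),  # list of (code, value) pairs
    }


if __name__ == "__main__":
    sample = "C03\t`01`\t`X`\t`FRE`\t`@0 Etiquetage @5 06`"
    print(parse_field(sample))
```

Keeping the subfields as an ordered list of pairs, rather than a dictionary, preserves repeated codes such as the two `@Z` entries in the first A14 line.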

Inist format (server)

NO :	PASCAL 12-0415345 INIST
ET :	Labelling logical structures of document images using a dynamic perceptive neural network
AU :	RANGONI (Yves); BELAÏD (Abdet); VAJDA (Szilárd)
AF :	Nancy 2 University, LORIA/Vandœuvre-Lès-Nancy/France (1 aut., 2 aut.); Computer Science Department/TU Dortmund, Dortmund/Germany (3 aut.)
DT :	Serial publication; Analytic level
SO :	International journal on document analysis and recognition : (Print); ISSN 1433-2833; Germany; Da. 2012; Vol. 15; No. 1; Pp. 45-55; Bibl. 54 ref.
LA :	English
EA :	This paper proposes a new method for labelling the logical structures of document images. The system starts with digitised images of paper documents, performs a physical layout analysis, runs an OCR and finally exploits the OCR's outputs to find the meaning of each block of text (i.e. assigns labels like "Title", "Author", etc.). The method is an extension of our previous work where a classifier, the perceptive neural network, has been developed to be an analogy of the human perception. We introduce in this connectionist model a temporal dimension by the use of a time-delay neural network with local representation. During the recognition stage, the system performs several recognition cycles and corrections, while keeping track and reusing the previous outputs. This dynamic classifier allows then a better handling of noise and segmentation errors. The experiments have been carried out on two datasets: the public MARG containing more than 1,500 front pages of scientific papers with four zones of interest and another one composed of documents from the Siggraph 2003 conference, where 21 logical structures have been identified. The error rate on MARG is less than 2.5% and 7.3% on the Siggraph dataset.
CC :	001D02C03; 001D02C06; 001D02B07B
FD :	Etiquetage; Système dynamique; Reconnaissance caractère; Reconnaissance optique caractère; Texte; Classification; Analyse documentaire; Analyse image; Reconnaissance image; Traitement image; Structure document; Présentation document; Perception sensorielle; Temps occupation; Taux erreur; Réseau neuronal; Modèle dynamique; Modélisation; Temps retard; Système à retard; Segmentation; .
ED :	Labelling; Dynamical system; Character recognition; Optical character recognition; Text; Classification; Document analysis; Image analysis; Image recognition; Image processing; Document structure; Document layout; Sensorial perception; Occupation time; Error rate; Neural network; Dynamic model; Modeling; Delay time; Delay system; Segmentation
SD :	Etiquetaje; Sistema dinámico; Reconocimiento carácter; Reconocimiento óptico de caracteres; Texto; Clasificación; Análisis documental; Análisis imagen; Reconocimiento imagen; Procesamiento imagen; Estructura documental; Presentación documento; Percepción sensorial; Tiempo ocupación; Indice error; Red neuronal; Modelo dinámico; Modelización; Tiempo retardo; Sistema con retardo; Segmentación
LO :	INIST-26790.354000508427810030
ID :	12-0415345
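
The server format above is a flat list of two-letter field codes followed by a colon and a value, with multi-valued fields separated by semicolons. As a minimal sketch, the Python below reads such lines into a dictionary and splits a multi-valued field into terms. The function names `parse_server_record` and `split_terms` are hypothetical, and the sample text is shortened from the record shown here. Dropping a lone "." term, like the one that ends the FD line, is a choice made for this illustration.

```python
def parse_server_record(text):
    """Map each field code (e.g. "NO", "CC") to its raw string value."""
    record = {}
    for line in text.splitlines():
        # Split at the first colon only: values such as the journal title contain colons.
        key, sep, value = line.partition(":")
        if sep:
            record[key.strip()] = value.strip()
    return record


def split_terms(value):
    """Split a semicolon-separated value, skipping empty terms and a lone '.'."""
    return [term.strip() for term in value.split(";") if term.strip(" .")]


SAMPLE = (
    "NO :\tPASCAL 12-0415345 INIST\n"
    "SO :\tInternational journal on document analysis and recognition : (Print); ISSN 1433-2833\n"
    "CC :\t001D02C03; 001D02C06; 001D02B07B\n"
    "FD :\tEtiquetage; Segmentation; ."
)

if __name__ == "__main__":
    record = parse_server_record(SAMPLE)
    print(record["NO"])
    print(split_terms(record["CC"]))
    print(split_terms(record["FD"]))
```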

Links to Exploration step

Pascal:12-0415345

The document in XML format

<record><TEI><teiHeader><fileDesc><titleStmt><title xml:lang="en" level="a">Labelling logical structures of document images using a dynamic perceptive neural network</title>
<author><name sortKey="Rangoni, Yves" sort="Rangoni, Yves" uniqKey="Rangoni Y" first="Yves" last="Rangoni">Yves Rangoni</name>
<affiliation><inist:fA14 i1="01"><s1>Nancy 2 University, LORIA</s1>
<s2>Vandœuvre-Lès-Nancy</s2>
<s3>FRA</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
</inist:fA14>
</affiliation>
</author>
<author><name sortKey="Belaid, Abdet" sort="Belaid, Abdet" uniqKey="Belaid A" first="Abdet" last="Belaïd">Abdet Belaïd</name>
<affiliation><inist:fA14 i1="01"><s1>Nancy 2 University, LORIA</s1>
<s2>Vandœuvre-Lès-Nancy</s2>
<s3>FRA</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
</inist:fA14>
</affiliation>
</author>
<author><name sortKey="Vajda, Szilard" sort="Vajda, Szilard" uniqKey="Vajda S" first="Szilárd" last="Vajda">Szilárd Vajda</name>
<affiliation><inist:fA14 i1="02"><s1>Computer Science Department</s1>
<s2>TU Dortmund, Dortmund</s2>
<s3>DEU</s3>
<sZ>3 aut.</sZ>
</inist:fA14>
</affiliation>
</author>
</titleStmt>
<publicationStmt><idno type="wicri:source">INIST</idno>
<idno type="inist">12-0415345</idno>
<date when="2012">2012</date>
<idno type="stanalyst">PASCAL 12-0415345 INIST</idno>
<idno type="RBID">Pascal:12-0415345</idno>
<idno type="wicri:Area/PascalFrancis/Corpus">000077</idno>
</publicationStmt>
<sourceDesc><biblStruct><analytic><title xml:lang="en" level="a">Labelling logical structures of document images using a dynamic perceptive neural network</title>
<author><name sortKey="Rangoni, Yves" sort="Rangoni, Yves" uniqKey="Rangoni Y" first="Yves" last="Rangoni">Yves Rangoni</name>
<affiliation><inist:fA14 i1="01"><s1>Nancy 2 University, LORIA</s1>
<s2>Vandœuvre-Lès-Nancy</s2>
<s3>FRA</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
</inist:fA14>
</affiliation>
</author>
<author><name sortKey="Belaid, Abdet" sort="Belaid, Abdet" uniqKey="Belaid A" first="Abdet" last="Belaïd">Abdet Belaïd</name>
<affiliation><inist:fA14 i1="01"><s1>Nancy 2 University, LORIA</s1>
<s2>Vandœuvre-Lès-Nancy</s2>
<s3>FRA</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
</inist:fA14>
</affiliation>
</author>
<author><name sortKey="Vajda, Szilard" sort="Vajda, Szilard" uniqKey="Vajda S" first="Szilárd" last="Vajda">Szilárd Vajda</name>
<affiliation><inist:fA14 i1="02"><s1>Computer Science Department</s1>
<s2>TU Dortmund, Dortmund</s2>
<s3>DEU</s3>
<sZ>3 aut.</sZ>
</inist:fA14>
</affiliation>
</author>
</analytic>
<series><title level="j" type="main">International journal on document analysis and recognition : (Print)</title>
<title level="j" type="abbreviated">Int. j. doc. anal. recognit. : (Print)</title>
<idno type="ISSN">1433-2833</idno>
<imprint><date when="2012">2012</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
<seriesStmt><title level="j" type="main">International journal on document analysis and recognition : (Print)</title>
<title level="j" type="abbreviated">Int. j. doc. anal. recognit. : (Print)</title>
<idno type="ISSN">1433-2833</idno>
</seriesStmt>
</fileDesc>
<profileDesc><textClass><keywords scheme="KwdEn" xml:lang="en"><term>Character recognition</term>
<term>Classification</term>
<term>Delay system</term>
<term>Delay time</term>
<term>Document analysis</term>
<term>Document layout</term>
<term>Document structure</term>
<term>Dynamic model</term>
<term>Dynamical system</term>
<term>Error rate</term>
<term>Image analysis</term>
<term>Image processing</term>
<term>Image recognition</term>
<term>Labelling</term>
<term>Modeling</term>
<term>Neural network</term>
<term>Occupation time</term>
<term>Optical character recognition</term>
<term>Segmentation</term>
<term>Sensorial perception</term>
<term>Text</term>
</keywords>
<keywords scheme="Pascal" xml:lang="fr"><term>Etiquetage</term>
<term>Système dynamique</term>
<term>Reconnaissance caractère</term>
<term>Reconnaissance optique caractère</term>
<term>Texte</term>
<term>Classification</term>
<term>Analyse documentaire</term>
<term>Analyse image</term>
<term>Reconnaissance image</term>
<term>Traitement image</term>
<term>Structure document</term>
<term>Présentation document</term>
<term>Perception sensorielle</term>
<term>Temps occupation</term>
<term>Taux erreur</term>
<term>Réseau neuronal</term>
<term>Modèle dynamique</term>
<term>Modélisation</term>
<term>Temps retard</term>
<term>Système à retard</term>
<term>Segmentation</term>
<term>.</term>
</keywords>
</textClass>
</profileDesc>
</teiHeader>
<front><div type="abstract" xml:lang="en">This paper proposes a new method for labelling the logical structures of document images. The system starts with digitised images of paper documents, performs a physical layout analysis, runs an OCR and finally exploits the OCR's outputs to find the meaning of each block of text (i.e. assigns labels like "Title", "Author", etc.). The method is an extension of our previous work where a classifier, the perceptive neural network, has been developed to be an analogy of the human perception. We introduce in this connectionist model a temporal dimension by the use of a time-delay neural network with local representation. During the recognition stage, the system performs several recognition cycles and corrections, while keeping track and reusing the previous outputs. This dynamic classifier allows then a better handling of noise and segmentation errors. The experiments have been carried out on two datasets: the public MARG containing more than 1,500 front pages of scientific papers with four zones of interest and another one composed of documents from the Siggraph 2003 conference, where 21 logical structures have been identified. The error rate on MARG is less than 2.5% and 7.3% on the Siggraph dataset.</div>
</front>
</TEI>
<inist><standard h6="B"><pA><fA01 i1="01" i2="1"><s0>1433-2833</s0>
</fA01>
<fA03 i2="1"><s0>Int. j. doc. anal. recognit. : (Print)</s0>
</fA03>
<fA05><s2>15</s2>
</fA05>
<fA06><s2>1</s2>
</fA06>
<fA08 i1="01" i2="1" l="ENG"><s1>Labelling logical structures of document images using a dynamic perceptive neural network</s1>
</fA08>
<fA11 i1="01" i2="1"><s1>RANGONI (Yves)</s1>
</fA11>
<fA11 i1="02" i2="1"><s1>BELAÏD (Abdet)</s1>
</fA11>
<fA11 i1="03" i2="1"><s1>VAJDA (Szilárd)</s1>
</fA11>
<fA14 i1="01"><s1>Nancy 2 University, LORIA</s1>
<s2>Vandœuvre-Lès-Nancy</s2>
<s3>FRA</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
</fA14>
<fA14 i1="02"><s1>Computer Science Department</s1>
<s2>TU Dortmund, Dortmund</s2>
<s3>DEU</s3>
<sZ>3 aut.</sZ>
</fA14>
<fA20><s1>45-55</s1>
</fA20>
<fA21><s1>2012</s1>
</fA21>
<fA23 i1="01"><s0>ENG</s0>
</fA23>
<fA43 i1="01"><s1>INIST</s1>
<s2>26790</s2>
<s5>354000508427810030</s5>
</fA43>
<fA44><s0>0000</s0>
<s1>© 2012 INIST-CNRS. All rights reserved.</s1>
</fA44>
<fA45><s0>54 ref.</s0>
</fA45>
<fA47 i1="01" i2="1"><s0>12-0415345</s0>
</fA47>
<fA60><s1>P</s1>
</fA60>
<fA61><s0>A</s0>
</fA61>
<fA64 i1="01" i2="1"><s0>International journal on document analysis and recognition : (Print)</s0>
</fA64>
<fA66 i1="01"><s0>DEU</s0>
</fA66>
<fC01 i1="01" l="ENG"><s0>This paper proposes a new method for labelling the logical structures of document images. The system starts with digitised images of paper documents, performs a physical layout analysis, runs an OCR and finally exploits the OCR's outputs to find the meaning of each block of text (i.e. assigns labels like "Title", "Author", etc.). The method is an extension of our previous work where a classifier, the perceptive neural network, has been developed to be an analogy of the human perception. We introduce in this connectionist model a temporal dimension by the use of a time-delay neural network with local representation. During the recognition stage, the system performs several recognition cycles and corrections, while keeping track and reusing the previous outputs. This dynamic classifier allows then a better handling of noise and segmentation errors. The experiments have been carried out on two datasets: the public MARG containing more than 1,500 front pages of scientific papers with four zones of interest and another one composed of documents from the Siggraph 2003 conference, where 21 logical structures have been identified. The error rate on MARG is less than 2.5% and 7.3% on the Siggraph dataset.</s0>
</fC01>
<fC02 i1="01" i2="X"><s0>001D02C03</s0>
</fC02>
<fC02 i1="02" i2="X"><s0>001D02C06</s0>
</fC02>
<fC02 i1="03" i2="X"><s0>001D02B07B</s0>
</fC02>
<fC03 i1="01" i2="X" l="FRE"><s0>Etiquetage</s0>
<s5>06</s5>
</fC03>
<fC03 i1="01" i2="X" l="ENG"><s0>Labelling</s0>
<s5>06</s5>
</fC03>
<fC03 i1="01" i2="X" l="SPA"><s0>Etiquetaje</s0>
<s5>06</s5>
</fC03>
<fC03 i1="02" i2="X" l="FRE"><s0>Système dynamique</s0>
<s5>07</s5>
</fC03>
<fC03 i1="02" i2="X" l="ENG"><s0>Dynamical system</s0>
<s5>07</s5>
</fC03>
<fC03 i1="02" i2="X" l="SPA"><s0>Sistema dinámico</s0>
<s5>07</s5>
</fC03>
<fC03 i1="03" i2="X" l="FRE"><s0>Reconnaissance caractère</s0>
<s5>08</s5>
</fC03>
<fC03 i1="03" i2="X" l="ENG"><s0>Character recognition</s0>
<s5>08</s5>
</fC03>
<fC03 i1="03" i2="X" l="SPA"><s0>Reconocimiento carácter</s0>
<s5>08</s5>
</fC03>
<fC03 i1="04" i2="X" l="FRE"><s0>Reconnaissance optique caractère</s0>
<s5>09</s5>
</fC03>
<fC03 i1="04" i2="X" l="ENG"><s0>Optical character recognition</s0>
<s5>09</s5>
</fC03>
<fC03 i1="04" i2="X" l="SPA"><s0>Reconocimiento óptico de caracteres</s0>
<s5>09</s5>
</fC03>
<fC03 i1="05" i2="X" l="FRE"><s0>Texte</s0>
<s5>10</s5>
</fC03>
<fC03 i1="05" i2="X" l="ENG"><s0>Text</s0>
<s5>10</s5>
</fC03>
<fC03 i1="05" i2="X" l="SPA"><s0>Texto</s0>
<s5>10</s5>
</fC03>
<fC03 i1="06" i2="X" l="FRE"><s0>Classification</s0>
<s5>11</s5>
</fC03>
<fC03 i1="06" i2="X" l="ENG"><s0>Classification</s0>
<s5>11</s5>
</fC03>
<fC03 i1="06" i2="X" l="SPA"><s0>Clasificación</s0>
<s5>11</s5>
</fC03>
<fC03 i1="07" i2="X" l="FRE"><s0>Analyse documentaire</s0>
<s5>12</s5>
</fC03>
<fC03 i1="07" i2="X" l="ENG"><s0>Document analysis</s0>
<s5>12</s5>
</fC03>
<fC03 i1="07" i2="X" l="SPA"><s0>Análisis documental</s0>
<s5>12</s5>
</fC03>
<fC03 i1="08" i2="X" l="FRE"><s0>Analyse image</s0>
<s5>13</s5>
</fC03>
<fC03 i1="08" i2="X" l="ENG"><s0>Image analysis</s0>
<s5>13</s5>
</fC03>
<fC03 i1="08" i2="X" l="SPA"><s0>Análisis imagen</s0>
<s5>13</s5>
</fC03>
<fC03 i1="09" i2="X" l="FRE"><s0>Reconnaissance image</s0>
<s5>14</s5>
</fC03>
<fC03 i1="09" i2="X" l="ENG"><s0>Image recognition</s0>
<s5>14</s5>
</fC03>
<fC03 i1="09" i2="X" l="SPA"><s0>Reconocimiento imagen</s0>
<s5>14</s5>
</fC03>
<fC03 i1="10" i2="X" l="FRE"><s0>Traitement image</s0>
<s5>15</s5>
</fC03>
<fC03 i1="10" i2="X" l="ENG"><s0>Image processing</s0>
<s5>15</s5>
</fC03>
<fC03 i1="10" i2="X" l="SPA"><s0>Procesamiento imagen</s0>
<s5>15</s5>
</fC03>
<fC03 i1="11" i2="X" l="FRE"><s0>Structure document</s0>
<s5>18</s5>
</fC03>
<fC03 i1="11" i2="X" l="ENG"><s0>Document structure</s0>
<s5>18</s5>
</fC03>
<fC03 i1="11" i2="X" l="SPA"><s0>Estructura documental</s0>
<s5>18</s5>
</fC03>
<fC03 i1="12" i2="X" l="FRE"><s0>Présentation document</s0>
<s5>19</s5>
</fC03>
<fC03 i1="12" i2="X" l="ENG"><s0>Document layout</s0>
<s5>19</s5>
</fC03>
<fC03 i1="12" i2="X" l="SPA"><s0>Presentación documento</s0>
<s5>19</s5>
</fC03>
<fC03 i1="13" i2="X" l="FRE"><s0>Perception sensorielle</s0>
<s5>20</s5>
</fC03>
<fC03 i1="13" i2="X" l="ENG"><s0>Sensorial perception</s0>
<s5>20</s5>
</fC03>
<fC03 i1="13" i2="X" l="SPA"><s0>Percepción sensorial</s0>
<s5>20</s5>
</fC03>
<fC03 i1="14" i2="X" l="FRE"><s0>Temps occupation</s0>
<s5>21</s5>
</fC03>
<fC03 i1="14" i2="X" l="ENG"><s0>Occupation time</s0>
<s5>21</s5>
</fC03>
<fC03 i1="14" i2="X" l="SPA"><s0>Tiempo ocupación</s0>
<s5>21</s5>
</fC03>
<fC03 i1="15" i2="X" l="FRE"><s0>Taux erreur</s0>
<s5>22</s5>
</fC03>
<fC03 i1="15" i2="X" l="ENG"><s0>Error rate</s0>
<s5>22</s5>
</fC03>
<fC03 i1="15" i2="X" l="SPA"><s0>Indice error</s0>
<s5>22</s5>
</fC03>
<fC03 i1="16" i2="X" l="FRE"><s0>Réseau neuronal</s0>
<s5>23</s5>
</fC03>
<fC03 i1="16" i2="X" l="ENG"><s0>Neural network</s0>
<s5>23</s5>
</fC03>
<fC03 i1="16" i2="X" l="SPA"><s0>Red neuronal</s0>
<s5>23</s5>
</fC03>
<fC03 i1="17" i2="X" l="FRE"><s0>Modèle dynamique</s0>
<s5>24</s5>
</fC03>
<fC03 i1="17" i2="X" l="ENG"><s0>Dynamic model</s0>
<s5>24</s5>
</fC03>
<fC03 i1="17" i2="X" l="SPA"><s0>Modelo dinámico</s0>
<s5>24</s5>
</fC03>
<fC03 i1="18" i2="X" l="FRE"><s0>Modélisation</s0>
<s5>25</s5>
</fC03>
<fC03 i1="18" i2="X" l="ENG"><s0>Modeling</s0>
<s5>25</s5>
</fC03>
<fC03 i1="18" i2="X" l="SPA"><s0>Modelización</s0>
<s5>25</s5>
</fC03>
<fC03 i1="19" i2="X" l="FRE"><s0>Temps retard</s0>
<s5>26</s5>
</fC03>
<fC03 i1="19" i2="X" l="ENG"><s0>Delay time</s0>
<s5>26</s5>
</fC03>
<fC03 i1="19" i2="X" l="SPA"><s0>Tiempo retardo</s0>
<s5>26</s5>
</fC03>
<fC03 i1="20" i2="X" l="FRE"><s0>Système à retard</s0>
<s5>27</s5>
</fC03>
<fC03 i1="20" i2="X" l="ENG"><s0>Delay system</s0>
<s5>27</s5>
</fC03>
<fC03 i1="20" i2="X" l="SPA"><s0>Sistema con retardo</s0>
<s5>27</s5>
</fC03>
<fC03 i1="21" i2="X" l="FRE"><s0>Segmentation</s0>
<s5>28</s5>
</fC03>
<fC03 i1="21" i2="X" l="ENG"><s0>Segmentation</s0>
<s5>28</s5>
</fC03>
<fC03 i1="21" i2="X" l="SPA"><s0>Segmentación</s0>
<s5>28</s5>
</fC03>
<fC03 i1="22" i2="X" l="FRE"><s0>.</s0>
<s4>INC</s4>
<s5>82</s5>
</fC03>
<fN21><s1>324</s1>
</fN21>
<fN44 i1="01"><s1>OTO</s1>
</fN44>
<fN82><s1>OTO</s1>
</fN82>
</pA>
</standard>
<server><NO>PASCAL 12-0415345 INIST</NO>
<ET>Labelling logical structures of document images using a dynamic perceptive neural network</ET>
<AU>RANGONI (Yves); BELAÏD (Abdet); VAJDA (Szilárd)</AU>
<AF>Nancy 2 University, LORIA/Vandœuvre-Lès-Nancy/France (1 aut., 2 aut.); Computer Science Department/TU Dortmund, Dortmund/Allemagne (3 aut.)</AF>
<DT>Publication en série; Niveau analytique</DT>
<SO>International journal on document analysis and recognition : (Print); ISSN 1433-2833; Allemagne; Da. 2012; Vol. 15; No. 1; Pp. 45-55; Bibl. 54 ref.</SO>
<LA>Anglais</LA>
<EA>This paper proposes a new method for labelling the logical structures of document images. The system starts with digitised images of paper documents, performs a physical layout analysis, runs an OCR and finally exploits the OCR's outputs to find the meaning of each block of text (i.e. assigns labels like "Title", "Author", etc.). The method is an extension of our previous work where a classifier, the perceptive neural network, has been developed to be an analogy of the human perception. We introduce in this connectionist model a temporal dimension by the use of a time-delay neural network with local representation. During the recognition stage, the system performs several recognition cycles and corrections, while keeping track and reusing the previous outputs. This dynamic classifier allows then a better handling of noise and segmentation errors. The experiments have been carried out on two datasets: the public MARG containing more than 1,500 front pages of scientific papers with four zones of interest and another one composed of documents from the Siggraph 2003 conference, where 21 logical structures have been identified. The error rate on MARG is less than 2.5% and 7.3% on the Siggraph dataset.</EA>
<CC>001D02C03; 001D02C06; 001D02B07B</CC>
<FD>Etiquetage; Système dynamique; Reconnaissance caractère; Reconnaissance optique caractère; Texte; Classification; Analyse documentaire; Analyse image; Reconnaissance image; Traitement image; Structure document; Présentation document; Perception sensorielle; Temps occupation; Taux erreur; Réseau neuronal; Modèle dynamique; Modélisation; Temps retard; Système à retard; Segmentation; .</FD>
<ED>Labelling; Dynamical system; Character recognition; Optical character recognition; Text; Classification; Document analysis; Image analysis; Image recognition; Image processing; Document structure; Document layout; Sensorial perception; Occupation time; Error rate; Neural network; Dynamic model; Modeling; Delay time; Delay system; Segmentation</ED>
<SD>Etiquetaje; Sistema dinámico; Reconocimiento carácter; Reconocimiento óptico de caracteres; Texto; Clasificación; Análisis documental; Análisis imagen; Reconocimiento imagen; Procesamiento imagen; Estructura documental; Presentación documento; Percepción sensorial; Tiempo ocupación; Indice error; Red neuronal; Modelo dinámico; Modelización; Tiempo retardo; Sistema con retardo; Segmentación</SD>
<LO>INIST-26790.354000508427810030</LO>
<ID>12-0415345</ID>
</server>
</inist>
</record>
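
In the XML record above, each descriptor is an `fC03` element whose `l` attribute gives the language and whose `s0` child holds the term. The Python sketch below, using the standard-library `xml.etree.ElementTree` module, shows one way to collect the terms for a given language. It runs on a small hand-copied fragment rather than on the full record. The full record uses the `inist:` prefix with no namespace declaration visible in this excerpt, so a namespace-aware parser would likely reject it unless a declaration is supplied. The function name `descriptors` is hypothetical.

```python
import xml.etree.ElementTree as ET

# A shortened fragment copied from the record above (no namespace prefixes involved).
FRAGMENT = """<pA>
<fC03 i1="01" i2="X" l="FRE"><s0>Etiquetage</s0><s5>06</s5></fC03>
<fC03 i1="01" i2="X" l="ENG"><s0>Labelling</s0><s5>06</s5></fC03>
<fC03 i1="02" i2="X" l="ENG"><s0>Dynamical system</s0><s5>07</s5></fC03>
</pA>"""


def descriptors(xml_text, lang):
    """Return the <s0> text of every <fC03> element whose l attribute equals lang."""
    root = ET.fromstring(xml_text)
    return [
        element.findtext("s0")
        for element in root.iter("fC03")
        if element.get("l") == lang
    ]


if __name__ == "__main__":
    print(descriptors(FRAGMENT, "ENG"))
```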

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Ticri/CIDE/explor/OcrV1/Data/PascalFrancis/Corpus

HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000077 | SxmlIndent | more

HfdSelect -h $EXPLOR_AREA/Data/PascalFrancis/Corpus/biblio.hfd -nk 000077 | SxmlIndent | more

To link to this page within the Wicri network

{{Explor lien
   |wiki=    Ticri/CIDE
   |area=    OcrV1
   |flux=    PascalFrancis
   |étape=   Corpus
   |type=    RBID
   |clé=     Pascal:12-0415345
   |texte=   Labelling logical structures of document images using a dynamic perceptive neural network
}}

This area was generated with Dilib version V0.6.32.
Data generation: Sat Nov 11 16:53:45 2017. Site generation: Mon Mar 11 23:15:16 2024

	OCR exploration server
	Warning: this site is under development! It was generated automatically from raw corpora, so the information has not been validated.
