OCR exploration server

Warning: this site is under development.
Warning: this site was generated automatically from raw corpora.
The information has therefore not been validated.

Text recognition in multimedia documents: a study of two neural-based OCRs using and avoiding character segmentation

Internal identifier: 000009 (PascalFrancis/Corpus); previous: 000008; next: 000010

Authors: Khaoula Elagouni; Christophe Garcia; Franck Mamalet; Pascale Sebillot

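Source: International journal on document analysis and recognition (Print), ISSN 1433-2833, 2014, Vol. 17, No. 1, pp. 19-31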

RBID : Pascal:14-0199549

Abstract

Text embedded in multimedia documents represents an important semantic information that helps to automatically access the content. This paper proposes two neural-based optical character recognition (OCR) systems that handle the text recognition problem in different ways. The first approach segments a text image into individual characters before recognizing them, while the second one avoids the segmentation step by integrating a multi-scale scanning scheme that allows to jointly localize and recognize characters at each position and scale. Some linguistic knowledge is also incorporated into the proposed schemes to remove errors due to recognition confusions. Both OCR systems are applied to caption texts embedded in videos and in natural scene images and provide outstanding results showing that the proposed approaches outperform the state-of-the-art methods.
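The abstract gives no implementation details, so the following is only a minimal sketch of what a multi-scale scan over a text-line image could look like: it enumerates candidate windows at several widths and horizontal positions, each of which a character classifier would then score. The function name, the scale set and the stride ratio are hypothetical and are not taken from the paper.

```python
from typing import Iterator, NamedTuple, Sequence


class Window(NamedTuple):
    x: int      # left edge of the window, in pixels
    width: int  # window width, in pixels


def multiscale_windows(
    image_width: int,
    base_width: int,
    scales: Sequence[float] = (0.5, 1.0, 1.5),  # assumed scale set
    stride_ratio: float = 0.25,                 # assumed overlap between windows
) -> Iterator[Window]:
    """Yield every window position for every scale that fits in the image."""
    for scale in scales:
        width = max(1, round(base_width * scale))
        if width > image_width:
            continue  # this scale does not fit in the image at all
        stride = max(1, round(width * stride_ratio))
        for x in range(0, image_width - width + 1, stride):
            yield Window(x, width)


if __name__ == "__main__":
    for window in multiscale_windows(image_width=16, base_width=8):
        print(window)
```

In such a scheme each window would be passed to a recognizer and the best-scoring, mutually compatible windows kept; that selection step is deliberately left out here because the record does not describe it.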

Record in standard format (ISO 2709)

See the documentation on the Inist Standard format.

pA  
A01 01  1    @0 1433-2833
A03   1    @0 Int. j. doc. anal. recognit. : (Print)
A05       @2 17
A06       @2 1
A08 01  1  ENG  @1 Text recognition in multimedia documents: a study of two neural-based OCRs using and avoiding character segmentation
A11 01  1    @1 ELAGOUNI (Khaoula)
A11 02  1    @1 GARCIA (Christophe)
A11 03  1    @1 MAMALET (Franck)
A11 04  1    @1 SEBILLOT (Pascale)
A14 01      @1 Orange Labs R&D @2 35512 Cesson-Sévigné @3 FRA @Z 1 aut. @Z 3 aut.
A14 02      @1 LIRIS/INSA de Lyon @2 69621 Villeurbanne @3 FRA @Z 2 aut.
A14 03      @1 IRISA/INSA de Rennes @2 35042 Rennes @3 FRA @Z 4 aut.
A20       @1 19-31
A21       @1 2014
A23 01      @0 ENG
A43 01      @1 INIST @2 26790 @5 354000501888130020
A44       @0 0000 @1 © 2014 INIST-CNRS. All rights reserved.
A45       @0 50 ref.
A47 01  1    @0 14-0199549
A60       @1 P
A61       @0 A
A64 01  1    @0 International journal on document analysis and recognition : (Print)
A66 01      @0 DEU
C01 01    ENG  @0 Text embedded in multimedia documents represents an important semantic information that helps to automatically access the content. This paper proposes two neural-based optical character recognition (OCR) systems that handle the text recognition problem in different ways. The first approach segments a text image into individual characters before recognizing them, while the second one avoids the segmentation step by integrating a multi-scale scanning scheme that allows to jointly localize and recognize characters at each position and scale. Some linguistic knowledge is also incorporated into the proposed schemes to remove errors due to recognition confusions. Both OCR systems are applied to caption texts embedded in videos and in natural scene images and provide outstanding results showing that the proposed approaches outperform the state-of-the-art methods.
C02 01  X    @0 001D02C04
C02 02  X    @0 001D02C03
C02 03  X    @0 001D02C06
C03 01  X  FRE  @0 Reconnaissance caractère @5 06
C03 01  X  ENG  @0 Character recognition @5 06
C03 01  X  SPA  @0 Reconocimiento carácter @5 06
C03 02  X  FRE  @0 Texte @5 07
C03 02  X  ENG  @0 Text @5 07
C03 02  X  SPA  @0 Texto @5 07
C03 03  X  FRE  @0 Reconnaissance forme @5 08
C03 03  X  ENG  @0 Pattern recognition @5 08
C03 03  X  SPA  @0 Reconocimiento patrón @5 08
C03 04  X  FRE  @0 Analyse documentaire @5 09
C03 04  X  ENG  @0 Document analysis @5 09
C03 04  X  SPA  @0 Análisis documental @5 09
C03 05  X  FRE  @0 Multimédia @5 10
C03 05  X  ENG  @0 Multimedia @5 10
C03 05  X  SPA  @0 Multimedia @5 10
C03 06  X  FRE  @0 Reconnaissance optique caractère @5 11
C03 06  X  ENG  @0 Optical character recognition @5 11
C03 06  X  SPA  @0 Reconocimento óptico de caracteres @5 11
C03 07  X  FRE  @0 Accès contenu @5 12
C03 07  X  ENG  @0 Content access @5 12
C03 07  X  SPA  @0 Acceso contenido @5 12
C03 08  X  FRE  @0 Vision ordinateur @5 13
C03 08  X  ENG  @0 Computer vision @5 13
C03 08  X  SPA  @0 Visión ordenador @5 13
C03 09  X  FRE  @0 Linguistique mathématique @5 14
C03 09  X  ENG  @0 Computational linguistics @5 14
C03 09  X  SPA  @0 Linguística matemática @5 14
C03 10  X  FRE  @0 Signal vidéo @5 15
C03 10  X  ENG  @0 Video signal @5 15
C03 10  X  SPA  @0 Señal video @5 15
C03 11  X  FRE  @0 Analyse scène @5 16
C03 11  X  ENG  @0 Scene analysis @5 16
C03 11  X  SPA  @0 Análisis escena @5 16
C03 12  X  FRE  @0 Présentation document @5 18
C03 12  X  ENG  @0 Document layout @5 18
C03 12  X  SPA  @0 Presentación documento @5 18
C03 13  X  FRE  @0 Sémantique @5 19
C03 13  X  ENG  @0 Semantics @5 19
C03 13  X  SPA  @0 Semántica @5 19
C03 14  X  FRE  @0 Balayage @5 20
C03 14  X  ENG  @0 Scanning @5 20
C03 14  X  SPA  @0 Exploración @5 20
C03 15  X  FRE  @0 Sous titrage @5 21
C03 15  X  ENG  @0 Caption @5 21
C03 15  X  SPA  @0 Subtítulo @5 21
C03 16  X  FRE  @0 Convolution @5 23
C03 16  X  ENG  @0 Convolution @5 23
C03 16  X  SPA  @0 Convolución @5 23
C03 17  X  FRE  @0 Réseau neuronal @5 24
C03 17  X  ENG  @0 Neural network @5 24
C03 17  X  SPA  @0 Red neuronal @5 24
C03 18  X  FRE  @0 Segmentation image @4 CD @5 96
C03 18  X  ENG  @0 Image segmentation @4 CD @5 96
C03 18  X  SPA  @0 Segmentación de imágenes @4 CD @5 96
C03 19  X  FRE  @0 Scène naturelle @4 CD @5 97
C03 19  X  ENG  @0 Natural scenes @4 CD @5 97
C03 19  X  SPA  @0 Escena natural @4 CD @5 97
N21       @1 251
N44 01      @1 OTO
N82       @1 OTO
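Each line of the listing above consists of a field tag, optional qualifiers (occurrence number, indicator, language code) and a series of subfields introduced by `@` plus a one-character code. As a rough illustration, assuming that layout holds for every line and that values never contain an `@` followed by a code and a space, one line can be split like this. The function name `parse_field` is hypothetical and is not part of Dilib.

```python
import re

# A subfield starts with "@", a one-character code, then whitespace.
SUBFIELD = re.compile(r"@(\w)\s+")


def parse_field(line):
    """Split one listing line into (tag, qualifiers, [(code, value), ...])."""
    head, sep, rest = line.partition("@")
    tokens = head.split()
    tag, qualifiers = tokens[0], tokens[1:]
    if not sep:
        return tag, qualifiers, []  # structural line such as "pA"
    # re.split with a capturing group yields ['', code, value, code, value, ...]
    parts = SUBFIELD.split("@" + rest)
    subfields = [
        (parts[i], parts[i + 1].strip()) for i in range(1, len(parts) - 1, 2)
    ]
    return tag, qualifiers, subfields


if __name__ == "__main__":
    line = "A14 01      @1 Orange Labs R&D @2 35512 Cesson-Sévigné @3 FRA @Z 1 aut. @Z 3 aut."
    print(parse_field(line))
```

Repeated codes such as `@Z` are kept as separate pairs rather than merged into a dictionary, since a field may legitimately carry the same subfield several times.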

Inist format (server)

NO : PASCAL 14-0199549 INIST
ET : Text recognition in multimedia documents: a study of two neural-based OCRs using and avoiding character segmentation
AU : ELAGOUNI (Khaoula); GARCIA (Christophe); MAMALET (Franck); SEBILLOT (Pascale)
AF : Orange Labs R&D/35512 Cesson-Sévigné/France (1 aut., 3 aut.); LIRIS/INSA de Lyon/69621 Villeurbanne/France (2 aut.); IRISA/INSA de Rennes/35042 Rennes/France (4 aut.)
DT : Publication en série; Niveau analytique
SO : International journal on document analysis and recognition : (Print); ISSN 1433-2833; Allemagne; Da. 2014; Vol. 17; No. 1; Pp. 19-31; Bibl. 50 ref.
LA : Anglais
EA : Text embedded in multimedia documents represents an important semantic information that helps to automatically access the content. This paper proposes two neural-based optical character recognition (OCR) systems that handle the text recognition problem in different ways. The first approach segments a text image into individual characters before recognizing them, while the second one avoids the segmentation step by integrating a multi-scale scanning scheme that allows to jointly localize and recognize characters at each position and scale. Some linguistic knowledge is also incorporated into the proposed schemes to remove errors due to recognition confusions. Both OCR systems are applied to caption texts embedded in videos and in natural scene images and provide outstanding results showing that the proposed approaches outperform the state-of-the-art methods.
CC : 001D02C04; 001D02C03; 001D02C06
FD : Reconnaissance caractère; Texte; Reconnaissance forme; Analyse documentaire; Multimédia; Reconnaissance optique caractère; Accès contenu; Vision ordinateur; Linguistique mathématique; Signal vidéo; Analyse scène; Présentation document; Sémantique; Balayage; Sous titrage; Convolution; Réseau neuronal; Segmentation image; Scène naturelle
ED : Character recognition; Text; Pattern recognition; Document analysis; Multimedia; Optical character recognition; Content access; Computer vision; Computational linguistics; Video signal; Scene analysis; Document layout; Semantics; Scanning; Caption; Convolution; Neural network; Image segmentation; Natural scenes
SD : Reconocimiento carácter; Texto; Reconocimiento patrón; Análisis documental; Multimedia; Reconocimento óptico de caracteres; Acceso contenido; Visión ordenador; Linguística matemática; Señal video; Análisis escena; Presentación documento; Semántica; Exploración; Subtítulo; Convolución; Red neuronal; Segmentación de imágenes; Escena natural
LO : INIST-26790.354000501888130020
ID : 14-0199549
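The server format above is a flat list of `TAG : value` lines in which multi-valued fields (authors, classification codes, descriptors) are separated by semicolons. A minimal sketch of reading it into a dictionary follows, assuming every tag is two uppercase letters and that the first ` : ` on a line ends the tag. The helper names are hypothetical, and the sample is an abridged copy of the record.

```python
SAMPLE = """\
NO : PASCAL 14-0199549 INIST
AU : ELAGOUNI (Khaoula); GARCIA (Christophe); MAMALET (Franck); SEBILLOT (Pascale)
SO : International journal on document analysis and recognition : (Print); ISSN 1433-2833
CC : 001D02C04; 001D02C03; 001D02C06
"""


def parse_server_record(text):
    """Map each two-letter tag to its raw value string."""
    record = {}
    for line in text.splitlines():
        # Split on the first " : " only; values such as SO contain more of them.
        tag, sep, value = line.partition(" : ")
        if sep and len(tag) == 2 and tag.isupper():
            record[tag] = value.strip()
    return record


def split_list(value):
    """Split a semicolon-separated field into trimmed items."""
    return [item.strip() for item in value.split(";") if item.strip()]


record = parse_server_record(SAMPLE)

if __name__ == "__main__":
    print(split_list(record["AU"]))
    print(split_list(record["CC"]))
```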

Links to Exploration step

Pascal:14-0199549

The document in XML format

<record>
<TEI>
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en" level="a">Text recognition in multimedia documents: a study of two neural-based OCRs using and avoiding character segmentation</title>
<author>
<name sortKey="Elagouni, Khaoula" sort="Elagouni, Khaoula" uniqKey="Elagouni K" first="Khaoula" last="Elagouni">Khaoula Elagouni</name>
<affiliation>
<inist:fA14 i1="01">
<s1>Orange Labs R&amp;D</s1>
<s2>35512 Cesson-Sévigné</s2>
<s3>FRA</s3>
<sZ>1 aut.</sZ>
<sZ>3 aut.</sZ>
</inist:fA14>
</affiliation>
</author>
<author>
<name sortKey="Garcia, Christophe" sort="Garcia, Christophe" uniqKey="Garcia C" first="Christophe" last="Garcia">Christophe Garcia</name>
<affiliation>
<inist:fA14 i1="02">
<s1>LIRIS/INSA de Lyon</s1>
<s2>69621 Villeurbanne</s2>
<s3>FRA</s3>
<sZ>2 aut.</sZ>
</inist:fA14>
</affiliation>
</author>
<author>
<name sortKey="Mamalet, Franck" sort="Mamalet, Franck" uniqKey="Mamalet F" first="Franck" last="Mamalet">Franck Mamalet</name>
<affiliation>
<inist:fA14 i1="01">
<s1>Orange Labs R&amp;D</s1>
<s2>35512 Cesson-Sévigné</s2>
<s3>FRA</s3>
<sZ>1 aut.</sZ>
<sZ>3 aut.</sZ>
</inist:fA14>
</affiliation>
</author>
<author>
<name sortKey="Sebillot, Pascale" sort="Sebillot, Pascale" uniqKey="Sebillot P" first="Pascale" last="Sebillot">Pascale Sebillot</name>
<affiliation>
<inist:fA14 i1="03">
<s1>IRISA/INSA de Rennes</s1>
<s2>35042 Rennes</s2>
<s3>FRA</s3>
<sZ>4 aut.</sZ>
</inist:fA14>
</affiliation>
</author>
</titleStmt>
<publicationStmt>
<idno type="wicri:source">INIST</idno>
<idno type="inist">14-0199549</idno>
<date when="2014">2014</date>
<idno type="stanalyst">PASCAL 14-0199549 INIST</idno>
<idno type="RBID">Pascal:14-0199549</idno>
<idno type="wicri:Area/PascalFrancis/Corpus">000009</idno>
</publicationStmt>
<sourceDesc>
<biblStruct>
<analytic>
<title xml:lang="en" level="a">Text recognition in multimedia documents: a study of two neural-based OCRs using and avoiding character segmentation</title>
<author>
<name sortKey="Elagouni, Khaoula" sort="Elagouni, Khaoula" uniqKey="Elagouni K" first="Khaoula" last="Elagouni">Khaoula Elagouni</name>
<affiliation>
<inist:fA14 i1="01">
<s1>Orange Labs R&amp;D</s1>
<s2>35512 Cesson-Sévigné</s2>
<s3>FRA</s3>
<sZ>1 aut.</sZ>
<sZ>3 aut.</sZ>
</inist:fA14>
</affiliation>
</author>
<author>
<name sortKey="Garcia, Christophe" sort="Garcia, Christophe" uniqKey="Garcia C" first="Christophe" last="Garcia">Christophe Garcia</name>
<affiliation>
<inist:fA14 i1="02">
<s1>LIRIS/INSA de Lyon</s1>
<s2>69621 Villeurbanne</s2>
<s3>FRA</s3>
<sZ>2 aut.</sZ>
</inist:fA14>
</affiliation>
</author>
<author>
<name sortKey="Mamalet, Franck" sort="Mamalet, Franck" uniqKey="Mamalet F" first="Franck" last="Mamalet">Franck Mamalet</name>
<affiliation>
<inist:fA14 i1="01">
<s1>Orange Labs R&amp;D</s1>
<s2>35512 Cesson-Sévigné</s2>
<s3>FRA</s3>
<sZ>1 aut.</sZ>
<sZ>3 aut.</sZ>
</inist:fA14>
</affiliation>
</author>
<author>
<name sortKey="Sebillot, Pascale" sort="Sebillot, Pascale" uniqKey="Sebillot P" first="Pascale" last="Sebillot">Pascale Sebillot</name>
<affiliation>
<inist:fA14 i1="03">
<s1>IRISA/INSA de Rennes</s1>
<s2>35042 Rennes</s2>
<s3>FRA</s3>
<sZ>4 aut.</sZ>
</inist:fA14>
</affiliation>
</author>
</analytic>
<series>
<title level="j" type="main">International journal on document analysis and recognition : (Print)</title>
<title level="j" type="abbreviated">Int. j. doc. anal. recognit. : (Print)</title>
<idno type="ISSN">1433-2833</idno>
<imprint>
<date when="2014">2014</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
<seriesStmt>
<title level="j" type="main">International journal on document analysis and recognition : (Print)</title>
<title level="j" type="abbreviated">Int. j. doc. anal. recognit. : (Print)</title>
<idno type="ISSN">1433-2833</idno>
</seriesStmt>
</fileDesc>
<profileDesc>
<textClass>
<keywords scheme="KwdEn" xml:lang="en">
<term>Caption</term>
<term>Character recognition</term>
<term>Computational linguistics</term>
<term>Computer vision</term>
<term>Content access</term>
<term>Convolution</term>
<term>Document analysis</term>
<term>Document layout</term>
<term>Image segmentation</term>
<term>Multimedia</term>
<term>Natural scenes</term>
<term>Neural network</term>
<term>Optical character recognition</term>
<term>Pattern recognition</term>
<term>Scanning</term>
<term>Scene analysis</term>
<term>Semantics</term>
<term>Text</term>
<term>Video signal</term>
</keywords>
<keywords scheme="Pascal" xml:lang="fr">
<term>Reconnaissance caractère</term>
<term>Texte</term>
<term>Reconnaissance forme</term>
<term>Analyse documentaire</term>
<term>Multimédia</term>
<term>Reconnaissance optique caractère</term>
<term>Accès contenu</term>
<term>Vision ordinateur</term>
<term>Linguistique mathématique</term>
<term>Signal vidéo</term>
<term>Analyse scène</term>
<term>Présentation document</term>
<term>Sémantique</term>
<term>Balayage</term>
<term>Sous titrage</term>
<term>Convolution</term>
<term>Réseau neuronal</term>
<term>Segmentation image</term>
<term>Scène naturelle</term>
</keywords>
</textClass>
</profileDesc>
</teiHeader>
<front>
<div type="abstract" xml:lang="en">Text embedded in multimedia documents represents an important semantic information that helps to automatically access the content. This paper proposes two neural-based optical character recognition (OCR) systems that handle the text recognition problem in different ways. The first approach segments a text image into individual characters before recognizing them, while the second one avoids the segmentation step by integrating a multi-scale scanning scheme that allows to jointly localize and recognize characters at each position and scale. Some linguistic knowledge is also incorporated into the proposed schemes to remove errors due to recognition confusions. Both OCR systems are applied to caption texts embedded in videos and in natural scene images and provide outstanding results showing that the proposed approaches outperform the state-of-the-art methods.</div>
</front>
</TEI>
<inist>
<standard h6="B">
<pA>
<fA01 i1="01" i2="1">
<s0>1433-2833</s0>
</fA01>
<fA03 i2="1">
<s0>Int. j. doc. anal. recognit. : (Print)</s0>
</fA03>
<fA05>
<s2>17</s2>
</fA05>
<fA06>
<s2>1</s2>
</fA06>
<fA08 i1="01" i2="1" l="ENG">
<s1>Text recognition in multimedia documents: a study of two neural-based OCRs using and avoiding character segmentation</s1>
</fA08>
<fA11 i1="01" i2="1">
<s1>ELAGOUNI (Khaoula)</s1>
</fA11>
<fA11 i1="02" i2="1">
<s1>GARCIA (Christophe)</s1>
</fA11>
<fA11 i1="03" i2="1">
<s1>MAMALET (Franck)</s1>
</fA11>
<fA11 i1="04" i2="1">
<s1>SEBILLOT (Pascale)</s1>
</fA11>
<fA14 i1="01">
<s1>Orange Labs R&amp;D</s1>
<s2>35512 Cesson-Sévigné</s2>
<s3>FRA</s3>
<sZ>1 aut.</sZ>
<sZ>3 aut.</sZ>
</fA14>
<fA14 i1="02">
<s1>LIRIS/INSA de Lyon</s1>
<s2>69621 Villeurbanne</s2>
<s3>FRA</s3>
<sZ>2 aut.</sZ>
</fA14>
<fA14 i1="03">
<s1>IRISA/INSA de Rennes</s1>
<s2>35042 Rennes</s2>
<s3>FRA</s3>
<sZ>4 aut.</sZ>
</fA14>
<fA20>
<s1>19-31</s1>
</fA20>
<fA21>
<s1>2014</s1>
</fA21>
<fA23 i1="01">
<s0>ENG</s0>
</fA23>
<fA43 i1="01">
<s1>INIST</s1>
<s2>26790</s2>
<s5>354000501888130020</s5>
</fA43>
<fA44>
<s0>0000</s0>
<s1>© 2014 INIST-CNRS. All rights reserved.</s1>
</fA44>
<fA45>
<s0>50 ref.</s0>
</fA45>
<fA47 i1="01" i2="1">
<s0>14-0199549</s0>
</fA47>
<fA60>
<s1>P</s1>
</fA60>
<fA61>
<s0>A</s0>
</fA61>
<fA64 i1="01" i2="1">
<s0>International journal on document analysis and recognition : (Print)</s0>
</fA64>
<fA66 i1="01">
<s0>DEU</s0>
</fA66>
<fC01 i1="01" l="ENG">
<s0>Text embedded in multimedia documents represents an important semantic information that helps to automatically access the content. This paper proposes two neural-based optical character recognition (OCR) systems that handle the text recognition problem in different ways. The first approach segments a text image into individual characters before recognizing them, while the second one avoids the segmentation step by integrating a multi-scale scanning scheme that allows to jointly localize and recognize characters at each position and scale. Some linguistic knowledge is also incorporated into the proposed schemes to remove errors due to recognition confusions. Both OCR systems are applied to caption texts embedded in videos and in natural scene images and provide outstanding results showing that the proposed approaches outperform the state-of-the-art methods.</s0>
</fC01>
<fC02 i1="01" i2="X">
<s0>001D02C04</s0>
</fC02>
<fC02 i1="02" i2="X">
<s0>001D02C03</s0>
</fC02>
<fC02 i1="03" i2="X">
<s0>001D02C06</s0>
</fC02>
<fC03 i1="01" i2="X" l="FRE">
<s0>Reconnaissance caractère</s0>
<s5>06</s5>
</fC03>
<fC03 i1="01" i2="X" l="ENG">
<s0>Character recognition</s0>
<s5>06</s5>
</fC03>
<fC03 i1="01" i2="X" l="SPA">
<s0>Reconocimiento carácter</s0>
<s5>06</s5>
</fC03>
<fC03 i1="02" i2="X" l="FRE">
<s0>Texte</s0>
<s5>07</s5>
</fC03>
<fC03 i1="02" i2="X" l="ENG">
<s0>Text</s0>
<s5>07</s5>
</fC03>
<fC03 i1="02" i2="X" l="SPA">
<s0>Texto</s0>
<s5>07</s5>
</fC03>
<fC03 i1="03" i2="X" l="FRE">
<s0>Reconnaissance forme</s0>
<s5>08</s5>
</fC03>
<fC03 i1="03" i2="X" l="ENG">
<s0>Pattern recognition</s0>
<s5>08</s5>
</fC03>
<fC03 i1="03" i2="X" l="SPA">
<s0>Reconocimiento patrón</s0>
<s5>08</s5>
</fC03>
<fC03 i1="04" i2="X" l="FRE">
<s0>Analyse documentaire</s0>
<s5>09</s5>
</fC03>
<fC03 i1="04" i2="X" l="ENG">
<s0>Document analysis</s0>
<s5>09</s5>
</fC03>
<fC03 i1="04" i2="X" l="SPA">
<s0>Análisis documental</s0>
<s5>09</s5>
</fC03>
<fC03 i1="05" i2="X" l="FRE">
<s0>Multimédia</s0>
<s5>10</s5>
</fC03>
<fC03 i1="05" i2="X" l="ENG">
<s0>Multimedia</s0>
<s5>10</s5>
</fC03>
<fC03 i1="05" i2="X" l="SPA">
<s0>Multimedia</s0>
<s5>10</s5>
</fC03>
<fC03 i1="06" i2="X" l="FRE">
<s0>Reconnaissance optique caractère</s0>
<s5>11</s5>
</fC03>
<fC03 i1="06" i2="X" l="ENG">
<s0>Optical character recognition</s0>
<s5>11</s5>
</fC03>
<fC03 i1="06" i2="X" l="SPA">
<s0>Reconocimento óptico de caracteres</s0>
<s5>11</s5>
</fC03>
<fC03 i1="07" i2="X" l="FRE">
<s0>Accès contenu</s0>
<s5>12</s5>
</fC03>
<fC03 i1="07" i2="X" l="ENG">
<s0>Content access</s0>
<s5>12</s5>
</fC03>
<fC03 i1="07" i2="X" l="SPA">
<s0>Acceso contenido</s0>
<s5>12</s5>
</fC03>
<fC03 i1="08" i2="X" l="FRE">
<s0>Vision ordinateur</s0>
<s5>13</s5>
</fC03>
<fC03 i1="08" i2="X" l="ENG">
<s0>Computer vision</s0>
<s5>13</s5>
</fC03>
<fC03 i1="08" i2="X" l="SPA">
<s0>Visión ordenador</s0>
<s5>13</s5>
</fC03>
<fC03 i1="09" i2="X" l="FRE">
<s0>Linguistique mathématique</s0>
<s5>14</s5>
</fC03>
<fC03 i1="09" i2="X" l="ENG">
<s0>Computational linguistics</s0>
<s5>14</s5>
</fC03>
<fC03 i1="09" i2="X" l="SPA">
<s0>Linguística matemática</s0>
<s5>14</s5>
</fC03>
<fC03 i1="10" i2="X" l="FRE">
<s0>Signal vidéo</s0>
<s5>15</s5>
</fC03>
<fC03 i1="10" i2="X" l="ENG">
<s0>Video signal</s0>
<s5>15</s5>
</fC03>
<fC03 i1="10" i2="X" l="SPA">
<s0>Señal video</s0>
<s5>15</s5>
</fC03>
<fC03 i1="11" i2="X" l="FRE">
<s0>Analyse scène</s0>
<s5>16</s5>
</fC03>
<fC03 i1="11" i2="X" l="ENG">
<s0>Scene analysis</s0>
<s5>16</s5>
</fC03>
<fC03 i1="11" i2="X" l="SPA">
<s0>Análisis escena</s0>
<s5>16</s5>
</fC03>
<fC03 i1="12" i2="X" l="FRE">
<s0>Présentation document</s0>
<s5>18</s5>
</fC03>
<fC03 i1="12" i2="X" l="ENG">
<s0>Document layout</s0>
<s5>18</s5>
</fC03>
<fC03 i1="12" i2="X" l="SPA">
<s0>Presentación documento</s0>
<s5>18</s5>
</fC03>
<fC03 i1="13" i2="X" l="FRE">
<s0>Sémantique</s0>
<s5>19</s5>
</fC03>
<fC03 i1="13" i2="X" l="ENG">
<s0>Semantics</s0>
<s5>19</s5>
</fC03>
<fC03 i1="13" i2="X" l="SPA">
<s0>Semántica</s0>
<s5>19</s5>
</fC03>
<fC03 i1="14" i2="X" l="FRE">
<s0>Balayage</s0>
<s5>20</s5>
</fC03>
<fC03 i1="14" i2="X" l="ENG">
<s0>Scanning</s0>
<s5>20</s5>
</fC03>
<fC03 i1="14" i2="X" l="SPA">
<s0>Exploración</s0>
<s5>20</s5>
</fC03>
<fC03 i1="15" i2="X" l="FRE">
<s0>Sous titrage</s0>
<s5>21</s5>
</fC03>
<fC03 i1="15" i2="X" l="ENG">
<s0>Caption</s0>
<s5>21</s5>
</fC03>
<fC03 i1="15" i2="X" l="SPA">
<s0>Subtítulo</s0>
<s5>21</s5>
</fC03>
<fC03 i1="16" i2="X" l="FRE">
<s0>Convolution</s0>
<s5>23</s5>
</fC03>
<fC03 i1="16" i2="X" l="ENG">
<s0>Convolution</s0>
<s5>23</s5>
</fC03>
<fC03 i1="16" i2="X" l="SPA">
<s0>Convolución</s0>
<s5>23</s5>
</fC03>
<fC03 i1="17" i2="X" l="FRE">
<s0>Réseau neuronal</s0>
<s5>24</s5>
</fC03>
<fC03 i1="17" i2="X" l="ENG">
<s0>Neural network</s0>
<s5>24</s5>
</fC03>
<fC03 i1="17" i2="X" l="SPA">
<s0>Red neuronal</s0>
<s5>24</s5>
</fC03>
<fC03 i1="18" i2="X" l="FRE">
<s0>Segmentation image</s0>
<s4>CD</s4>
<s5>96</s5>
</fC03>
<fC03 i1="18" i2="X" l="ENG">
<s0>Image segmentation</s0>
<s4>CD</s4>
<s5>96</s5>
</fC03>
<fC03 i1="18" i2="X" l="SPA">
<s0>Segmentación de imágenes</s0>
<s4>CD</s4>
<s5>96</s5>
</fC03>
<fC03 i1="19" i2="X" l="FRE">
<s0>Scène naturelle</s0>
<s4>CD</s4>
<s5>97</s5>
</fC03>
<fC03 i1="19" i2="X" l="ENG">
<s0>Natural scenes</s0>
<s4>CD</s4>
<s5>97</s5>
</fC03>
<fC03 i1="19" i2="X" l="SPA">
<s0>Escena natural</s0>
<s4>CD</s4>
<s5>97</s5>
</fC03>
<fN21>
<s1>251</s1>
</fN21>
<fN44 i1="01">
<s1>OTO</s1>
</fN44>
<fN82>
<s1>OTO</s1>
</fN82>
</pA>
</standard>
<server>
<NO>PASCAL 14-0199549 INIST</NO>
<ET>Text recognition in multimedia documents: a study of two neural-based OCRs using and avoiding character segmentation</ET>
<AU>ELAGOUNI (Khaoula); GARCIA (Christophe); MAMALET (Franck); SEBILLOT (Pascale)</AU>
<AF>Orange Labs R&amp;D/35512 Cesson-Sévigné/France (1 aut., 3 aut.); LIRIS/INSA de Lyon/69621 Villeurbanne/France (2 aut.); IRISA/INSA de Rennes/35042 Rennes/France (4 aut.)</AF>
<DT>Publication en série; Niveau analytique</DT>
<SO>International journal on document analysis and recognition : (Print); ISSN 1433-2833; Allemagne; Da. 2014; Vol. 17; No. 1; Pp. 19-31; Bibl. 50 ref.</SO>
<LA>Anglais</LA>
<EA>Text embedded in multimedia documents represents an important semantic information that helps to automatically access the content. This paper proposes two neural-based optical character recognition (OCR) systems that handle the text recognition problem in different ways. The first approach segments a text image into individual characters before recognizing them, while the second one avoids the segmentation step by integrating a multi-scale scanning scheme that allows to jointly localize and recognize characters at each position and scale. Some linguistic knowledge is also incorporated into the proposed schemes to remove errors due to recognition confusions. Both OCR systems are applied to caption texts embedded in videos and in natural scene images and provide outstanding results showing that the proposed approaches outperform the state-of-the-art methods.</EA>
<CC>001D02C04; 001D02C03; 001D02C06</CC>
<FD>Reconnaissance caractère; Texte; Reconnaissance forme; Analyse documentaire; Multimédia; Reconnaissance optique caractère; Accès contenu; Vision ordinateur; Linguistique mathématique; Signal vidéo; Analyse scène; Présentation document; Sémantique; Balayage; Sous titrage; Convolution; Réseau neuronal; Segmentation image; Scène naturelle</FD>
<ED>Character recognition; Text; Pattern recognition; Document analysis; Multimedia; Optical character recognition; Content access; Computer vision; Computational linguistics; Video signal; Scene analysis; Document layout; Semantics; Scanning; Caption; Convolution; Neural network; Image segmentation; Natural scenes</ED>
<SD>Reconocimiento carácter; Texto; Reconocimiento patrón; Análisis documental; Multimedia; Reconocimento óptico de caracteres; Acceso contenido; Visión ordenador; Linguística matemática; Señal video; Análisis escena; Presentación documento; Semántica; Exploración; Subtítulo; Convolución; Red neuronal; Segmentación de imágenes; Escena natural</SD>
<LO>INIST-26790.354000501888130020</LO>
<ID>14-0199549</ID>
</server>
</inist>
</record>
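As a rough illustration of reading the record programmatically, the `<server>` element can be loaded with Python's standard `xml.etree.ElementTree`. The excerpt below is an abridged copy of that element; the parser only accepts well-formed XML, so a literal `&` in a value must appear as `&amp;`. The function name `server_fields` is hypothetical.

```python
import xml.etree.ElementTree as ET

# Abridged excerpt of the <server> element of the record.
SERVER_XML = """<server>
<NO>PASCAL 14-0199549 INIST</NO>
<AU>ELAGOUNI (Khaoula); GARCIA (Christophe); MAMALET (Franck); SEBILLOT (Pascale)</AU>
<AF>Orange Labs R&amp;D/35512 Cesson-Sévigné/France (1 aut., 3 aut.)</AF>
<LA>Anglais</LA>
</server>"""


def server_fields(xml_text):
    """Return a {tag: text} mapping for the direct children of the root."""
    root = ET.fromstring(xml_text)
    return {child.tag: (child.text or "").strip() for child in root}


fields = server_fields(SERVER_XML)

if __name__ == "__main__":
    for tag, value in fields.items():
        print(tag, "=", value)
```

The parser decodes the entity, so the affiliation comes back with a plain `&`. Parsing the full `<record>` this way would also require the `inist:` prefix to be declared, which the listing above does not show.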

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Ticri/CIDE/explor/OcrV1/Data/PascalFrancis/Corpus
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000009 | SxmlIndent | more

Or

HfdSelect -h $EXPLOR_AREA/Data/PascalFrancis/Corpus/biblio.hfd -nk 000009 | SxmlIndent | more

To link to this page within the Wicri network

{{Explor lien
   |wiki=    Ticri/CIDE
   |area=    OcrV1
   |flux=    PascalFrancis
   |étape=   Corpus
   |type=    RBID
   |clé=     Pascal:14-0199549
   |texte=   Text recognition in multimedia documents: a study of two neural-based OCRs using and avoiding character segmentation
}}

This area was generated with Dilib version V0.6.32.
Data generation: Sat Nov 11 16:53:45 2017. Site generation: Mon Mar 11 23:15:16 2024