OCR exploration server

Note: this site is under development.
Note: this site was generated automatically from raw corpora.
The information has therefore not been validated.

Extraction and recognition of artificial text in multimedia documents

Internal identifier: 000552 (PascalFrancis/Corpus); previous: 000551; next: 000553

Extraction and recognition of artificial text in multimedia documents

Authors: C. Wolf; J.-M. Jolion

Source: Pattern analysis and applications, ISSN 1433-7541, 2004, Vol. 6, No. 4, pp. 309-326

RBID : Pascal:04-0205881

French descriptors

English descriptors

Abstract

The systems currently available for content-based image and video retrieval work without semantic knowledge, i.e. they use image processing methods to extract low level features of the data. The similarity obtained by these approaches does not always correspond to the similarity a human user would expect. A way to include more semantic knowledge into the indexing process is to use the text included in the images and video sequences. It is rich in information but easy to use, e.g. by key word based queries. In this paper we present an algorithm to localise artificial text in images and videos using a measure of accumulated gradients and morphological processing. The quality of the localised text is improved by robust multiple frame integration. A new technique for the binarisation of the text boxes based on a criterion maximizing local contrast is proposed. Finally, detection and OCR results for a commercial OCR are presented, justifying the choice of the binarisation technique.
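
The abstract outlines a pipeline: accumulated-gradient text localisation, morphological post-processing, robust multiple-frame integration, and a local-contrast binarisation feeding a commercial OCR. As a purely illustrative sketch of that general approach (not the authors' implementation: the function names, parameter values and the Niblack-style threshold below are assumptions), the following Python/OpenCV code localises candidate text boxes and binarises them:

import cv2
import numpy as np

def localise_text_boxes(gray, grad_thresh=40.0, min_area=150):
    """Candidate text boxes from accumulated horizontal gradients (sketch)."""
    # Text lines produce dense vertical strokes, so |dI/dx| is strong there.
    gx = cv2.Sobel(gray, cv2.CV_32F, 1, 0, ksize=3)
    # Accumulate the gradient along the text direction so that individual
    # character strokes merge into continuous horizontal segments.
    accum = cv2.boxFilter(np.abs(gx), -1, (15, 1))
    mask = (accum > grad_thresh).astype(np.uint8) * 255
    # Morphological closing glues characters of one line together;
    # opening removes isolated noise.
    close_k = cv2.getStructuringElement(cv2.MORPH_RECT, (17, 3))
    open_k = cv2.getStructuringElement(cv2.MORPH_RECT, (5, 3))
    mask = cv2.morphologyEx(mask, cv2.MORPH_CLOSE, close_k)
    mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, open_k)
    # Connected components become candidate text boxes.
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    return [cv2.boundingRect(c) for c in contours if cv2.contourArea(c) >= min_area]

def binarise_text_box(gray_box, k=-0.2, win=31):
    """Niblack-style local-contrast binarisation (illustrative stand-in only,
    not necessarily the criterion proposed in the paper)."""
    g = gray_box.astype(np.float32)
    mean = cv2.boxFilter(g, -1, (win, win))
    mean_sq = cv2.boxFilter(g * g, -1, (win, win))
    std = np.sqrt(np.maximum(mean_sq - mean * mean, 0.0))
    thresh = mean + k * std
    # Black text on white background, as commercial OCR engines expect.
    return np.where(g < thresh, 0, 255).astype(np.uint8)

In a video setting, boxes detected on several consecutive frames would be integrated (the multiple-frame integration mentioned in the abstract) before binarisation and OCR.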

Record in standard format (ISO 2709)

See the documentation on the Inist Standard format. (A minimal parsing sketch of the field lines follows the listing below.)

pA  
A01 01  2    @0 1433-7541
A05       @2 6
A06       @2 4
A08 01  1  ENG  @1 Extraction and recognition of artificial text in multimedia documents
A11 01  1    @1 WOLF (C.)
A11 02  1    @1 JOLION (J.-M.)
A14 01      @1 Lyon Research Center for Images and Intelligent Information Systems, INSA de Lyon, Bat., Verne, 20, Av. Albert Einstein @2 69621 Villeurbanne @3 FRA @Z 1 aut. @Z 2 aut.
A20       @1 309-326
A21       @1 2004
A23 01      @0 ENG
A43 01      @1 INIST @2 26865 @5 354000115017570050
A44       @0 0000 @1 © 2004 INIST-CNRS. All rights reserved.
A45       @0 34 ref.
A47 01  1    @0 04-0205881
A60       @1 P
A61       @0 A
A64 01  2    @0 Pattern analysis and applications
A66 01      @0 GBR
C01 01    ENG  @0 The systems currently available for content-based image and video retrieval work without semantic knowledge, i.e. they use image processing methods to extract low level features of the data. The similarity obtained by these approaches does not always correspond to the similarity a human user would expect. A way to include more semantic knowledge into the indexing process is to use the text included in the images and video sequences. It is rich in information but easy to use, e.g. by key word based queries. In this paper we present an algorithm to localise artificial text in images and videos using a measure of accumulated gradients and morphological processing. The quality of the localised text is improved by robust multiple frame integration. A new technique for the binarisation of the text boxes based on a criterion maximizing local contrast is proposed. Finally, detection and OCR results for a commercial OCR are presented, justifying the choice of the binarisation technique.
C02 01  X    @0 001D02C02
C03 01  X  FRE  @0 Reconnaissance caractère @5 09
C03 01  X  ENG  @0 Character recognition @5 09
C03 01  X  SPA  @0 Reconocimiento carácter @5 09
C03 02  X  FRE  @0 Multimédia @5 10
C03 02  X  ENG  @0 Multimedia @5 10
C03 02  X  SPA  @0 Multimedia @5 10
C03 03  X  FRE  @0 Signal vidéo @5 11
C03 03  X  ENG  @0 Video signal @5 11
C03 03  X  SPA  @0 Señal video @5 11
C03 04  X  FRE  @0 Traitement image @5 12
C03 04  X  ENG  @0 Image processing @5 12
C03 04  X  SPA  @0 Procesamiento imagen @5 12
C03 05  X  FRE  @0 Similitude @5 13
C03 05  X  ENG  @0 Similarity @5 13
C03 05  X  SPA  @0 Similitud @5 13
C03 06  X  FRE  @0 Indexation @5 14
C03 06  X  ENG  @0 Indexing @5 14
C03 06  X  SPA  @0 Indización @5 14
C03 07  X  FRE  @0 Texte @5 15
C03 07  X  ENG  @0 Text @5 15
C03 07  X  SPA  @0 Texto @5 15
C03 08  X  FRE  @0 Séquence image @5 16
C03 08  X  ENG  @0 Image sequence @5 16
C03 08  X  SPA  @0 Secuencia imagen @5 16
C03 09  X  FRE  @0 Utilisation information @5 17
C03 09  X  ENG  @0 Information use @5 17
C03 09  X  SPA  @0 Uso información @5 17
C03 10  3  FRE  @0 Recherche par contenu @5 18
C03 10  3  ENG  @0 Content-based retrieval @5 18
C03 11  X  FRE  @0 Sémantique @5 19
C03 11  X  ENG  @0 Semantics @5 19
C03 11  X  SPA  @0 Semántica @5 19
C03 12  X  FRE  @0 Homme @5 20
C03 12  X  ENG  @0 Human @5 20
C03 12  X  SPA  @0 Hombre @5 20
C03 13  X  FRE  @0 Choix @5 21
C03 13  X  ENG  @0 Choice @5 21
C03 13  X  SPA  @0 Elección @5 21
C03 14  X  FRE  @0 Extraction forme @5 23
C03 14  X  ENG  @0 Pattern extraction @5 23
C03 14  X  SPA  @0 Extracción forma @5 23
N21       @1 138
N82       @1 OTO
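
The field lines above follow the Inist Standard layout: a three-character tag, optional occurrence/indicator columns (plus a language code for multilingual fields), then subfields introduced by @ codes. A minimal Python sketch for splitting such a line (an assumption covering only the lines displayed here, not the full ISO 2709 specification):

import re

def parse_field(line):
    """E.g. 'C03 02  X  ENG  @0 Multimedia @5 10' -> tag, indicators, subfields."""
    head, _, rest = line.partition('@')
    parts = head.split()
    tag, indicators = parts[0], parts[1:]
    # Subfields are introduced by '@' followed by a one-character code.
    subfields = re.findall(r'@(\S)\s+([^@]*)', '@' + rest)
    return {'tag': tag,
            'indicators': indicators,
            'subfields': [(code, value.strip()) for code, value in subfields]}

print(parse_field('C03 02  X  ENG  @0 Multimedia @5 10'))
# {'tag': 'C03', 'indicators': ['02', 'X', 'ENG'],
#  'subfields': [('0', 'Multimedia'), ('5', '10')]}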

Inist format (server)

NO : PASCAL 04-0205881 INIST
ET : Extraction and recognition of artificial text in multimedia documents
AU : WOLF (C.); JOLION (J.-M.)
AF : Lyon Research Center for Images and Intelligent Information Systems, INSA de Lyon, Bat., Verne, 20, Av. Albert Einstein/69621 Villeurbanne/France (1 aut., 2 aut.)
DT : Publication en série; Niveau analytique
SO : Pattern analysis and applications; ISSN 1433-7541; Royaume-Uni; Da. 2004; Vol. 6; No. 4; Pp. 309-326; Bibl. 34 ref.
LA : Anglais
EA : The systems currently available for content-based image and video retrieval work without semantic knowledge, i.e. they use image processing methods to extract low level features of the data. The similarity obtained by these approaches does not always correspond to the similarity a human user would expect. A way to include more semantic knowledge into the indexing process is to use the text included in the images and video sequences. It is rich in information but easy to use, e.g. by key word based queries. In this paper we present an algorithm to localise artificial text in images and videos using a measure of accumulated gradients and morphological processing. The quality of the localised text is improved by robust multiple frame integration. A new technique for the binarisation of the text boxes based on a criterion maximizing local contrast is proposed. Finally, detection and OCR results for a commercial OCR are presented, justifying the choice of the binarisation technique.
CC : 001D02C02
FD : Reconnaissance caractère; Multimédia; Signal vidéo; Traitement image; Similitude; Indexation; Texte; Séquence image; Utilisation information; Recherche par contenu; Sémantique; Homme; Choix; Extraction forme
ED : Character recognition; Multimedia; Video signal; Image processing; Similarity; Indexing; Text; Image sequence; Information use; Content-based retrieval; Semantics; Human; Choice; Pattern extraction
SD : Reconocimiento carácter; Multimedia; Señal video; Procesamiento imagen; Similitud; Indización; Texto; Secuencia imagen; Uso información; Semántica; Hombre; Elección; Extracción forma
LO : INIST-26865.354000115017570050
ID : 04-0205881

Links to Exploration step

Pascal:04-0205881

The document in XML format

<record>
<TEI>
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en" level="a">Extraction and recognition of artificial text in multimedia documents</title>
<author>
<name sortKey="Wolf, C" sort="Wolf, C" uniqKey="Wolf C" first="C." last="Wolf">C. Wolf</name>
<affiliation>
<inist:fA14 i1="01">
<s1>Lyon Research Center for Images and Intelligent Information Systems, INSA de Lyon, Bat., Verne, 20, Av. Albert Einstein</s1>
<s2>69621 Villeurbanne</s2>
<s3>FRA</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
</inist:fA14>
</affiliation>
</author>
<author>
<name sortKey="Jolion, J M" sort="Jolion, J M" uniqKey="Jolion J" first="J.-M." last="Jolion">J.-M. Jolion</name>
<affiliation>
<inist:fA14 i1="01">
<s1>Lyon Research Center for Images and Intelligent Information Systems, INSA de Lyon, Bat., Verne, 20, Av. Albert Einstein</s1>
<s2>69621 Villeurbanne</s2>
<s3>FRA</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
</inist:fA14>
</affiliation>
</author>
</titleStmt>
<publicationStmt>
<idno type="wicri:source">INIST</idno>
<idno type="inist">04-0205881</idno>
<date when="2004">2004</date>
<idno type="stanalyst">PASCAL 04-0205881 INIST</idno>
<idno type="RBID">Pascal:04-0205881</idno>
<idno type="wicri:Area/PascalFrancis/Corpus">000552</idno>
</publicationStmt>
<sourceDesc>
<biblStruct>
<analytic>
<title xml:lang="en" level="a">Extraction and recognition of artificial text in multimedia documents</title>
<author>
<name sortKey="Wolf, C" sort="Wolf, C" uniqKey="Wolf C" first="C." last="Wolf">C. Wolf</name>
<affiliation>
<inist:fA14 i1="01">
<s1>Lyon Research Center for Images and Intelligent Information Systems, INSA de Lyon, Bat., Verne, 20, Av. Albert Einstein</s1>
<s2>69621 Villeurbanne</s2>
<s3>FRA</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
</inist:fA14>
</affiliation>
</author>
<author>
<name sortKey="Jolion, J M" sort="Jolion, J M" uniqKey="Jolion J" first="J.-M." last="Jolion">J.-M. Jolion</name>
<affiliation>
<inist:fA14 i1="01">
<s1>Lyon Research Center for Images and Intelligent Information Systems, INSA de Lyon, Bat., Verne, 20, Av. Albert Einstein</s1>
<s2>69621 Villeurbanne</s2>
<s3>FRA</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
</inist:fA14>
</affiliation>
</author>
</analytic>
<series>
<title level="j" type="main">Pattern analysis and applications</title>
<idno type="ISSN">1433-7541</idno>
<imprint>
<date when="2004">2004</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
<seriesStmt>
<title level="j" type="main">Pattern analysis and applications</title>
<idno type="ISSN">1433-7541</idno>
</seriesStmt>
</fileDesc>
<profileDesc>
<textClass>
<keywords scheme="KwdEn" xml:lang="en">
<term>Character recognition</term>
<term>Choice</term>
<term>Content-based retrieval</term>
<term>Human</term>
<term>Image processing</term>
<term>Image sequence</term>
<term>Indexing</term>
<term>Information use</term>
<term>Multimedia</term>
<term>Pattern extraction</term>
<term>Semantics</term>
<term>Similarity</term>
<term>Text</term>
<term>Video signal</term>
</keywords>
<keywords scheme="Pascal" xml:lang="fr">
<term>Reconnaissance caractère</term>
<term>Multimédia</term>
<term>Signal vidéo</term>
<term>Traitement image</term>
<term>Similitude</term>
<term>Indexation</term>
<term>Texte</term>
<term>Séquence image</term>
<term>Utilisation information</term>
<term>Recherche par contenu</term>
<term>Sémantique</term>
<term>Homme</term>
<term>Choix</term>
<term>Extraction forme</term>
</keywords>
</textClass>
</profileDesc>
</teiHeader>
<front>
<div type="abstract" xml:lang="en">The systems currently available for content-based image and video retrieval work without semantic knowledge, i.e. they use image processing methods to extract low level features of the data. The similarity obtained by these approaches does not always correspond to the similarity a human user would expect. A way to include more semantic knowledge into the indexing process is to use the text included in the images and video sequences. It is rich in information but easy to use, e.g. by key word based queries. In this paper we present an algorithm to localise artificial text in images and videos using a measure of accumulated gradients and morphological processing. The quality of the localised text is improved by robust multiple frame integration. A new technique for the binarisation of the text boxes based on a criterion maximizing local contrast is proposed. Finally, detection and OCR results for a commercial OCR are presented, justifying the choice of the binarisation technique.</div>
</front>
</TEI>
<inist>
<standard h6="B">
<pA>
<fA01 i1="01" i2="2">
<s0>1433-7541</s0>
</fA01>
<fA05>
<s2>6</s2>
</fA05>
<fA06>
<s2>4</s2>
</fA06>
<fA08 i1="01" i2="1" l="ENG">
<s1>Extraction and recognition of artificial text in multimedia documents</s1>
</fA08>
<fA11 i1="01" i2="1">
<s1>WOLF (C.)</s1>
</fA11>
<fA11 i1="02" i2="1">
<s1>JOLION (J.-M.)</s1>
</fA11>
<fA14 i1="01">
<s1>Lyon Research Center for Images and Intelligent Information Systems, INSA de Lyon, Bat., Verne, 20, Av. Albert Einstein</s1>
<s2>69621 Villeurbanne</s2>
<s3>FRA</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
</fA14>
<fA20>
<s1>309-326</s1>
</fA20>
<fA21>
<s1>2004</s1>
</fA21>
<fA23 i1="01">
<s0>ENG</s0>
</fA23>
<fA43 i1="01">
<s1>INIST</s1>
<s2>26865</s2>
<s5>354000115017570050</s5>
</fA43>
<fA44>
<s0>0000</s0>
<s1>© 2004 INIST-CNRS. All rights reserved.</s1>
</fA44>
<fA45>
<s0>34 ref.</s0>
</fA45>
<fA47 i1="01" i2="1">
<s0>04-0205881</s0>
</fA47>
<fA60>
<s1>P</s1>
</fA60>
<fA61>
<s0>A</s0>
</fA61>
<fA64 i1="01" i2="2">
<s0>Pattern analysis and applications</s0>
</fA64>
<fA66 i1="01">
<s0>GBR</s0>
</fA66>
<fC01 i1="01" l="ENG">
<s0>The systems currently available for content-based image and video retrieval work without semantic knowledge, i.e. they use image processing methods to extract low level features of the data. The similarity obtained by these approaches does not always correspond to the similarity a human user would expect. A way to include more semantic knowledge into the indexing process is to use the text included in the images and video sequences. It is rich in information but easy to use, e.g. by key word based queries. In this paper we present an algorithm to localise artificial text in images and videos using a measure of accumulated gradients and morphological processing. The quality of the localised text is improved by robust multiple frame integration. A new technique for the binarisation of the text boxes based on a criterion maximizing local contrast is proposed. Finally, detection and OCR results for a commercial OCR are presented, justifying the choice of the binarisation technique.</s0>
</fC01>
<fC02 i1="01" i2="X">
<s0>001D02C02</s0>
</fC02>
<fC03 i1="01" i2="X" l="FRE">
<s0>Reconnaissance caractère</s0>
<s5>09</s5>
</fC03>
<fC03 i1="01" i2="X" l="ENG">
<s0>Character recognition</s0>
<s5>09</s5>
</fC03>
<fC03 i1="01" i2="X" l="SPA">
<s0>Reconocimiento carácter</s0>
<s5>09</s5>
</fC03>
<fC03 i1="02" i2="X" l="FRE">
<s0>Multimédia</s0>
<s5>10</s5>
</fC03>
<fC03 i1="02" i2="X" l="ENG">
<s0>Multimedia</s0>
<s5>10</s5>
</fC03>
<fC03 i1="02" i2="X" l="SPA">
<s0>Multimedia</s0>
<s5>10</s5>
</fC03>
<fC03 i1="03" i2="X" l="FRE">
<s0>Signal vidéo</s0>
<s5>11</s5>
</fC03>
<fC03 i1="03" i2="X" l="ENG">
<s0>Video signal</s0>
<s5>11</s5>
</fC03>
<fC03 i1="03" i2="X" l="SPA">
<s0>Señal video</s0>
<s5>11</s5>
</fC03>
<fC03 i1="04" i2="X" l="FRE">
<s0>Traitement image</s0>
<s5>12</s5>
</fC03>
<fC03 i1="04" i2="X" l="ENG">
<s0>Image processing</s0>
<s5>12</s5>
</fC03>
<fC03 i1="04" i2="X" l="SPA">
<s0>Procesamiento imagen</s0>
<s5>12</s5>
</fC03>
<fC03 i1="05" i2="X" l="FRE">
<s0>Similitude</s0>
<s5>13</s5>
</fC03>
<fC03 i1="05" i2="X" l="ENG">
<s0>Similarity</s0>
<s5>13</s5>
</fC03>
<fC03 i1="05" i2="X" l="SPA">
<s0>Similitud</s0>
<s5>13</s5>
</fC03>
<fC03 i1="06" i2="X" l="FRE">
<s0>Indexation</s0>
<s5>14</s5>
</fC03>
<fC03 i1="06" i2="X" l="ENG">
<s0>Indexing</s0>
<s5>14</s5>
</fC03>
<fC03 i1="06" i2="X" l="SPA">
<s0>Indización</s0>
<s5>14</s5>
</fC03>
<fC03 i1="07" i2="X" l="FRE">
<s0>Texte</s0>
<s5>15</s5>
</fC03>
<fC03 i1="07" i2="X" l="ENG">
<s0>Text</s0>
<s5>15</s5>
</fC03>
<fC03 i1="07" i2="X" l="SPA">
<s0>Texto</s0>
<s5>15</s5>
</fC03>
<fC03 i1="08" i2="X" l="FRE">
<s0>Séquence image</s0>
<s5>16</s5>
</fC03>
<fC03 i1="08" i2="X" l="ENG">
<s0>Image sequence</s0>
<s5>16</s5>
</fC03>
<fC03 i1="08" i2="X" l="SPA">
<s0>Secuencia imagen</s0>
<s5>16</s5>
</fC03>
<fC03 i1="09" i2="X" l="FRE">
<s0>Utilisation information</s0>
<s5>17</s5>
</fC03>
<fC03 i1="09" i2="X" l="ENG">
<s0>Information use</s0>
<s5>17</s5>
</fC03>
<fC03 i1="09" i2="X" l="SPA">
<s0>Uso información</s0>
<s5>17</s5>
</fC03>
<fC03 i1="10" i2="3" l="FRE">
<s0>Recherche par contenu</s0>
<s5>18</s5>
</fC03>
<fC03 i1="10" i2="3" l="ENG">
<s0>Content-based retrieval</s0>
<s5>18</s5>
</fC03>
<fC03 i1="11" i2="X" l="FRE">
<s0>Sémantique</s0>
<s5>19</s5>
</fC03>
<fC03 i1="11" i2="X" l="ENG">
<s0>Semantics</s0>
<s5>19</s5>
</fC03>
<fC03 i1="11" i2="X" l="SPA">
<s0>Semántica</s0>
<s5>19</s5>
</fC03>
<fC03 i1="12" i2="X" l="FRE">
<s0>Homme</s0>
<s5>20</s5>
</fC03>
<fC03 i1="12" i2="X" l="ENG">
<s0>Human</s0>
<s5>20</s5>
</fC03>
<fC03 i1="12" i2="X" l="SPA">
<s0>Hombre</s0>
<s5>20</s5>
</fC03>
<fC03 i1="13" i2="X" l="FRE">
<s0>Choix</s0>
<s5>21</s5>
</fC03>
<fC03 i1="13" i2="X" l="ENG">
<s0>Choice</s0>
<s5>21</s5>
</fC03>
<fC03 i1="13" i2="X" l="SPA">
<s0>Elección</s0>
<s5>21</s5>
</fC03>
<fC03 i1="14" i2="X" l="FRE">
<s0>Extraction forme</s0>
<s5>23</s5>
</fC03>
<fC03 i1="14" i2="X" l="ENG">
<s0>Pattern extraction</s0>
<s5>23</s5>
</fC03>
<fC03 i1="14" i2="X" l="SPA">
<s0>Extracción forma</s0>
<s5>23</s5>
</fC03>
<fN21>
<s1>138</s1>
</fN21>
<fN82>
<s1>OTO</s1>
</fN82>
</pA>
</standard>
<server>
<NO>PASCAL 04-0205881 INIST</NO>
<ET>Extraction and recognition of artificial text in multimedia documents</ET>
<AU>WOLF (C.); JOLION (J.-M.)</AU>
<AF>Lyon Research Center for Images and Intelligent Information Systems, INSA de Lyon, Bat., Verne, 20, Av. Albert Einstein/69621 Villeurbanne/France (1 aut., 2 aut.)</AF>
<DT>Publication en série; Niveau analytique</DT>
<SO>Pattern analysis and applications; ISSN 1433-7541; Royaume-Uni; Da. 2004; Vol. 6; No. 4; Pp. 309-326; Bibl. 34 ref.</SO>
<LA>Anglais</LA>
<EA>The systems currently available for content-based image and video retrieval work without semantic knowledge, i.e. they use image processing methods to extract low level features of the data. The similarity obtained by these approaches does not always correspond to the similarity a human user would expect. A way to include more semantic knowledge into the indexing process is to use the text included in the images and video sequences. It is rich in information but easy to use, e.g. by key word based queries. In this paper we present an algorithm to localise artificial text in images and videos using a measure of accumulated gradients and morphological processing. The quality of the localised text is improved by robust multiple frame integration. A new technique for the binarisation of the text boxes based on a criterion maximizing local contrast is proposed. Finally, detection and OCR results for a commercial OCR are presented, justifying the choice of the binarisation technique.</EA>
<CC>001D02C02</CC>
<FD>Reconnaissance caractère; Multimédia; Signal vidéo; Traitement image; Similitude; Indexation; Texte; Séquence image; Utilisation information; Recherche par contenu; Sémantique; Homme; Choix; Extraction forme</FD>
<ED>Character recognition; Multimedia; Video signal; Image processing; Similarity; Indexing; Text; Image sequence; Information use; Content-based retrieval; Semantics; Human; Choice; Pattern extraction</ED>
<SD>Reconocimiento carácter; Multimedia; Señal video; Procesamiento imagen; Similitud; Indización; Texto; Secuencia imagen; Uso información; Semántica; Hombre; Elección; Extracción forma</SD>
<LO>INIST-26865.354000115017570050</LO>
<ID>04-0205881</ID>
</server>
</inist>
</record>

To work with this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Ticri/CIDE/explor/OcrV1/Data/PascalFrancis/Corpus
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000552 | SxmlIndent | more

Or

HfdSelect -h $EXPLOR_AREA/Data/PascalFrancis/Corpus/biblio.hfd -nk 000552 | SxmlIndent | more

To add a link to this page in the Wicri network

{{Explor lien
   |wiki=    Ticri/CIDE
   |area=    OcrV1
   |flux=    PascalFrancis
   |étape=   Corpus
   |type=    RBID
   |clé=     Pascal:04-0205881
   |texte=   Extraction and recognition of artificial text in multimedia documents
}}

Wicri

This area was generated with Dilib version V0.6.32.
Data generation: Sat Nov 11 16:53:45 2017. Site generation: Mon Mar 11 23:15:16 2024