
A framework for improved video text detection and recognition

Internal identifier: 000006 (PascalFrancis/Corpus); previous: 000005; next: 000007

Authors: Haojin Yang; Bernhard Quehl; Harald Sack

Source:

Multimedia tools and applications [ 1380-7501 ] ; 2014.

RBID: Pascal:14-0217177

French descriptors

Pascal (Inist)
- Signal vidéo, Reconnaissance caractère, Texte, Reconnaissance forme, Recherche information, Traitement image, Indexation, Vision ordinateur, Bibliothèque électronique, Vidéothèque, Collecticiel, Workflow, Sémantique, Processus métier, Rappel, Taux fausse alarme, Classification à vaste marge, Localisation.

English descriptors

KwdEn:
- Business process, Character recognition, Computer vision, Electronic library, False alarm rate, Groupware, Image processing, Indexing, Information retrieval, Localization, Pattern recognition, Recall, Semantics, Text, Vector support machine, Video library, Video signal, Workflow.

Abstract

Text displayed in a video is an essential part of the high-level semantic information of the video content. Video text can therefore serve as a valuable source for automated video indexing in digital video libraries. In this paper, we propose a workflow for video text detection and recognition. In the text detection stage, we have developed a fast localization-verification scheme in which an edge-based multi-scale text detector first identifies potential text candidates with a high recall rate. The detected candidate text lines are then refined using an image entropy-based filter. Finally, Stroke Width Transform (SWT)- and Support Vector Machine (SVM)-based verification procedures are applied to eliminate false alarms. For text recognition, we have developed a novel skeleton-based binarization method that separates text from complex backgrounds so that it can be processed by standard OCR (Optical Character Recognition) software. The operability and accuracy of the proposed text detection and binarization methods have been evaluated on publicly available test data sets.
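
The entropy-based refinement step named in the abstract can be illustrated with a minimal sketch. The abstract does not define the filter, so what follows is an assumption rather than the authors' method: it computes the Shannon entropy of a region's grey-level histogram and keeps only regions that reach a threshold. The function names and the min_entropy value are hypothetical.

```python
from collections import Counter
from math import log2
from typing import List, Sequence


def shannon_entropy(pixels: Sequence[int]) -> float:
    """Shannon entropy, in bits, of the grey-level histogram of a region."""
    if not pixels:
        return 0.0
    total = len(pixels)
    entropy = 0.0
    for count in Counter(pixels).values():
        p = count / total
        entropy -= p * log2(p)
    return entropy


def keep_textlike(regions: List[Sequence[int]],
                  min_entropy: float = 0.5) -> List[Sequence[int]]:
    """Keep candidate regions whose entropy reaches the threshold.

    min_entropy is a hypothetical value chosen for this example only.
    """
    return [r for r in regions if shannon_entropy(r) >= min_entropy]


flat = [128] * 64              # uniform patch: a single grey level
mixed = [0] * 32 + [255] * 32  # two equally frequent grey levels

print(shannon_entropy(flat))   # 0.0
print(shannon_entropy(mixed))  # 1.0
print(len(keep_textlike([flat, mixed])))  # 1
```

A uniform background patch carries no histogram entropy and is dropped, whereas a patch mixing stroke and background grey levels is kept. A real filter would operate on image arrays and use a threshold tuned on data.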

Record in standard format (ISO 2709)

See the documentation on the Inist Standard format.

A01	`01`	`1`		`@0 1380-7501`
A03		`1`		`@0 Multimed. tools appl.`
A05				`@2 69`
A06				`@2 1`
A08	`01`	`1`	`ENG`	`@1 A framework for improved video text detection and recognition`
A09	`01`	`1`	`ENG`	`@1 Computer Vision for Multimedia`
A11	`01`	`1`		`@1 HAOJIN YANG`
A11	`02`	`1`		`@1 QUEHL (Bernhard)`
A11	`03`	`1`		`@1 SACK (Harald)`
A12	`01`	`1`		`@1 TIAN (Jing) @9 ed.`
A12	`02`	`1`		`@1 CHEN (Li) @9 ed.`
A14	`01`			`@1 Hasso-Plattner-Institute for IT-Systems Engineering, University of Potsdam, Prof.-Dr.-Helmert Str. 2-4 @2 14467 Potsdam @3 DEU @Z 1 aut. @Z 2 aut. @Z 3 aut.`
A15	`01`			`@1 Institute for Infocomm Research @2 Singapore 138632 @3 SGP @Z 1 aut.`
A15	`02`			`@1 Wuhan University of Science and Technology @2 Wuhan 430081 @3 CHN @Z 2 aut.`
A20				`@1 217-245`
A21				`@1 2014`
A23	`01`			`@0 ENG`
A43	`01`			`@1 INIST @2 28305 @5 354000501881100100`
A44				`@0 0000 @1 © 2014 INIST-CNRS. All rights reserved.`
A45				`@0 39 ref.`
A47	`01`	`1`		`@0 14-0217177`
A60				`@1 P`
A61				`@0 A`
A64	`01`	`1`		`@0 Multimedia tools and applications`
A66	`01`			`@0 DEU`
C01	`01`		`ENG`	@0 Text displayed in a video is an essential part for the high-level semantic information of the video content. Therefore, video text can be used as a valuable source for automated video indexing in digital video libraries. In this paper, we propose a workflow for video text detection and recognition. In the text detection stage, we have developed a fast localization-verification scheme, in which an edge-based multi-scale text detector first identifies potential text candidates with high recall rate. Then, detected candidate text lines are refined by using an image entropy-based filter. Finally, Stroke Width Transform (SWT) - and Support Vector Machine (SVM)-based verification procedures are applied to eliminate the false alarms. For text recognition, we have developed a novel skeleton-based binarization method in order to separate text from complex backgrounds to make it processible for standard OCR (Optical Character Recognition) software. Operability and accuracy of proposed text detection and binarization methods have been evaluated by using publicly available test data sets.
C02	`01`	`X`		`@0 001D02C03`
C02	`02`	`X`		`@0 001D02C04`
C02	`03`	`X`		`@0 001D02B07D`
C02	`04`	`X`		`@0 001D02B07B`
C03	`01`	`X`	`FRE`	`@0 Signal vidéo @5 06`
C03	`01`	`X`	`ENG`	`@0 Video signal @5 06`
C03	`01`	`X`	`SPA`	`@0 Señal video @5 06`
C03	`02`	`X`	`FRE`	`@0 Reconnaissance caractère @5 07`
C03	`02`	`X`	`ENG`	`@0 Character recognition @5 07`
C03	`02`	`X`	`SPA`	`@0 Reconocimiento carácter @5 07`
C03	`03`	`X`	`FRE`	`@0 Texte @5 08`
C03	`03`	`X`	`ENG`	`@0 Text @5 08`
C03	`03`	`X`	`SPA`	`@0 Texto @5 08`
C03	`04`	`X`	`FRE`	`@0 Reconnaissance forme @5 09`
C03	`04`	`X`	`ENG`	`@0 Pattern recognition @5 09`
C03	`04`	`X`	`SPA`	`@0 Reconocimiento patrón @5 09`
C03	`05`	`X`	`FRE`	`@0 Recherche information @5 10`
C03	`05`	`X`	`ENG`	`@0 Information retrieval @5 10`
C03	`05`	`X`	`SPA`	`@0 Búsqueda información @5 10`
C03	`06`	`X`	`FRE`	`@0 Traitement image @5 11`
C03	`06`	`X`	`ENG`	`@0 Image processing @5 11`
C03	`06`	`X`	`SPA`	`@0 Procesamiento imagen @5 11`
C03	`07`	`X`	`FRE`	`@0 Indexation @5 12`
C03	`07`	`X`	`ENG`	`@0 Indexing @5 12`
C03	`07`	`X`	`SPA`	`@0 Indización @5 12`
C03	`08`	`X`	`FRE`	`@0 Vision ordinateur @5 13`
C03	`08`	`X`	`ENG`	`@0 Computer vision @5 13`
C03	`08`	`X`	`SPA`	`@0 Visión ordenador @5 13`
C03	`09`	`X`	`FRE`	`@0 Bibliothèque électronique @5 14`
C03	`09`	`X`	`ENG`	`@0 Electronic library @5 14`
C03	`09`	`X`	`SPA`	`@0 Biblioteca electronica @5 14`
C03	`10`	`X`	`FRE`	`@0 Vidéothèque @5 15`
C03	`10`	`X`	`ENG`	`@0 Video library @5 15`
C03	`10`	`X`	`SPA`	`@0 Videoteca @5 15`
C03	`11`	`X`	`FRE`	`@0 Collecticiel @5 16`
C03	`11`	`X`	`ENG`	`@0 Groupware @5 16`
C03	`11`	`X`	`SPA`	`@0 Groupware @5 16`
C03	`12`	`X`	`FRE`	`@0 Workflow @5 17`
C03	`12`	`X`	`ENG`	`@0 Workflow @5 17`
C03	`12`	`X`	`SPA`	`@0 Workflow @5 17`
C03	`13`	`X`	`FRE`	`@0 Sémantique @5 18`
C03	`13`	`X`	`ENG`	`@0 Semantics @5 18`
C03	`13`	`X`	`SPA`	`@0 Semántica @5 18`
C03	`14`	`X`	`FRE`	`@0 Processus métier @5 19`
C03	`14`	`X`	`ENG`	`@0 Business process @5 19`
C03	`14`	`X`	`SPA`	`@0 Proceso oficio @5 19`
C03	`15`	`X`	`FRE`	`@0 Rappel @5 20`
C03	`15`	`X`	`ENG`	`@0 Recall @5 20`
C03	`15`	`X`	`SPA`	`@0 Llamada @5 20`
C03	`16`	`X`	`FRE`	`@0 Taux fausse alarme @5 21`
C03	`16`	`X`	`ENG`	`@0 False alarm rate @5 21`
C03	`16`	`X`	`SPA`	`@0 Porcentaje falsa alarma @5 21`
C03	`17`	`X`	`FRE`	`@0 Classification à vaste marge @5 23`
C03	`17`	`X`	`ENG`	`@0 Vector support machine @5 23`
C03	`17`	`X`	`SPA`	`@0 Máquina ejemplo soporte @5 23`
C03	`18`	`X`	`FRE`	`@0 Localisation @5 41`
C03	`18`	`X`	`ENG`	`@0 Localization @5 41`
C03	`18`	`X`	`SPA`	`@0 Localización @5 41`
C03	`19`	`X`	`FRE`	`@0 . @4 INC @5 82`
N21				`@1 265`
N44	`01`			`@1 OTO`
N82				`@1 OTO`
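
The field listing above can be read programmatically. The following is a minimal sketch, assuming each line has five tab-separated columns (tag, first indicator, second indicator, language, subfields) with values optionally wrapped in backticks, and that subfields are introduced by "@" followed by a one-character code. That layout is inferred from the listing itself, not from the Inist documentation, and parse_field_line is a hypothetical helper.

```python
import re
from typing import Dict, List, Tuple


def parse_field_line(line: str) -> Dict[str, object]:
    """Split one listing line into tag, indicators, language and subfields."""
    cols = [c.strip("`") for c in line.rstrip("\n").split("\t")]
    cols += [""] * (5 - len(cols))  # pad short lines to five columns
    tag, i1, i2, lang, body = cols[:5]
    # re.split with one capture group yields [prefix, code, text, code, text, ...]
    parts = re.split(r"(?:^|\s)@(\w)\s", body)
    subfields: List[Tuple[str, str]] = [
        (parts[i], parts[i + 1].strip()) for i in range(1, len(parts) - 1, 2)
    ]
    return {"tag": tag, "i1": i1, "i2": i2, "lang": lang, "subfields": subfields}


line = "A43\t`01`\t\t\t`@1 INIST @2 28305 @5 354000501881100100`"
field = parse_field_line(line)
print(field["tag"], field["i1"])
print(field["subfields"])
```

Subfields are returned as a list of pairs rather than a dictionary because a code can repeat within one field, as @Z does in the A14 affiliation line.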

Inist format (server)

NO :	PASCAL 14-0217177 INIST
ET :	A framework for improved video text detection and recognition
AU :	HAOJIN YANG; QUEHL (Bernhard); SACK (Harald); TIAN (Jing); CHEN (Li)
AF :	Hasso-Plattner-Institute for IT-Systems Engineering, University of Potsdam, Prof.-Dr.-Helmert Str. 2-4/14467 Potsdam/Allemagne (1 aut., 2 aut., 3 aut.); Institute for Infocomm Research/Singapore 138632/Singapour (1 aut.); Wuhan University of Science and Technology/Wuhan 430081/Chine (2 aut.)
DT :	Publication en série; Niveau analytique
SO :	Multimedia tools and applications; ISSN 1380-7501; Allemagne; Da. 2014; Vol. 69; No. 1; Pp. 217-245; Bibl. 39 ref.
LA :	Anglais
EA :	Text displayed in a video is an essential part for the high-level semantic information of the video content. Therefore, video text can be used as a valuable source for automated video indexing in digital video libraries. In this paper, we propose a workflow for video text detection and recognition. In the text detection stage, we have developed a fast localization-verification scheme, in which an edge-based multi-scale text detector first identifies potential text candidates with high recall rate. Then, detected candidate text lines are refined by using an image entropy-based filter. Finally, Stroke Width Transform (SWT) - and Support Vector Machine (SVM)-based verification procedures are applied to eliminate the false alarms. For text recognition, we have developed a novel skeleton-based binarization method in order to separate text from complex backgrounds to make it processible for standard OCR (Optical Character Recognition) software. Operability and accuracy of proposed text detection and binarization methods have been evaluated by using publicly available test data sets.
CC :	001D02C03; 001D02C04; 001D02B07D; 001D02B07B
FD :	Signal vidéo; Reconnaissance caractère; Texte; Reconnaissance forme; Recherche information; Traitement image; Indexation; Vision ordinateur; Bibliothèque électronique; Vidéothèque; Collecticiel; Workflow; Sémantique; Processus métier; Rappel; Taux fausse alarme; Classification à vaste marge; Localisation; .
ED :	Video signal; Character recognition; Text; Pattern recognition; Information retrieval; Image processing; Indexing; Computer vision; Electronic library; Video library; Groupware; Workflow; Semantics; Business process; Recall; False alarm rate; Vector support machine; Localization
SD :	Señal video; Reconocimiento carácter; Texto; Reconocimiento patrón; Búsqueda información; Procesamiento imagen; Indización; Visión ordenador; Biblioteca electronica; Videoteca; Groupware; Workflow; Semántica; Proceso oficio; Llamada; Porcentaje falsa alarma; Máquina ejemplo soporte; Localización
LO :	INIST-28305.354000501881100100
ID :	14-0217177
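
The server format above is a flat list of "KEY : value" lines. As a minimal sketch, it can be loaded into a dictionary as shown below. Treating AU, CC, FD, ED and SD as semicolon-separated lists is an assumption based on how those fields look in this record, and parse_server_record is a hypothetical helper.

```python
from typing import Dict, List, Union

# Fields assumed to hold semicolon-separated lists (inferred from this record).
MULTI_VALUED = {"AU", "CC", "FD", "ED", "SD"}


def parse_server_record(text: str) -> Dict[str, Union[str, List[str]]]:
    """Parse 'KEY : value' lines into a dictionary keyed by field code."""
    record: Dict[str, Union[str, List[str]]] = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")  # split on the first colon only
        if not sep:
            continue  # skip lines without a field code
        key, value = key.strip(), value.strip()
        if key in MULTI_VALUED:
            record[key] = [v.strip() for v in value.split(";") if v.strip()]
        else:
            record[key] = value
    return record


sample = (
    "NO :\tPASCAL 14-0217177 INIST\n"
    "AU :\tHAOJIN YANG; QUEHL (Bernhard); SACK (Harald); TIAN (Jing); CHEN (Li)\n"
    "CC :\t001D02C03; 001D02C04; 001D02B07D; 001D02B07B\n"
)
record = parse_server_record(sample)
print(record["NO"])
print(record["AU"])
```

Note that the AU field mixes the three authors with the two issue editors (TIAN and CHEN), so the A11 and A12 fields of the standard format remain the more precise source for authorship.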

Links to Exploration step

Pascal:14-0217177

The document in XML format

<record><TEI><teiHeader><fileDesc><titleStmt><title xml:lang="en" level="a">A framework for improved video text detection and recognition</title>
<author><name sortKey="Haojin Yang" sort="Haojin Yang" uniqKey="Haojin Yang" last="Haojin Yang">HAOJIN YANG</name>
<affiliation><inist:fA14 i1="01"><s1>Hasso-Plattner-Institute for IT-Systems Engineering, University of Potsdam, Prof.-Dr.-Helmert Str. 2-4</s1>
<s2>14467 Potsdam</s2>
<s3>DEU</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
</inist:fA14>
</affiliation>
</author>
<author><name sortKey="Quehl, Bernhard" sort="Quehl, Bernhard" uniqKey="Quehl B" first="Bernhard" last="Quehl">Bernhard Quehl</name>
<affiliation><inist:fA14 i1="01"><s1>Hasso-Plattner-Institute for IT-Systems Engineering, University of Potsdam, Prof.-Dr.-Helmert Str. 2-4</s1>
<s2>14467 Potsdam</s2>
<s3>DEU</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
</inist:fA14>
</affiliation>
</author>
<author><name sortKey="Sack, Harald" sort="Sack, Harald" uniqKey="Sack H" first="Harald" last="Sack">Harald Sack</name>
<affiliation><inist:fA14 i1="01"><s1>Hasso-Plattner-Institute for IT-Systems Engineering, University of Potsdam, Prof.-Dr.-Helmert Str. 2-4</s1>
<s2>14467 Potsdam</s2>
<s3>DEU</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
</inist:fA14>
</affiliation>
</author>
</titleStmt>
<publicationStmt><idno type="wicri:source">INIST</idno>
<idno type="inist">14-0217177</idno>
<date when="2014">2014</date>
<idno type="stanalyst">PASCAL 14-0217177 INIST</idno>
<idno type="RBID">Pascal:14-0217177</idno>
<idno type="wicri:Area/PascalFrancis/Corpus">000006</idno>
</publicationStmt>
<sourceDesc><biblStruct><analytic><title xml:lang="en" level="a">A framework for improved video text detection and recognition</title>
<author><name sortKey="Haojin Yang" sort="Haojin Yang" uniqKey="Haojin Yang" last="Haojin Yang">HAOJIN YANG</name>
<affiliation><inist:fA14 i1="01"><s1>Hasso-Plattner-Institute for IT-Systems Engineering, University of Potsdam, Prof.-Dr.-Helmert Str. 2-4</s1>
<s2>14467 Potsdam</s2>
<s3>DEU</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
</inist:fA14>
</affiliation>
</author>
<author><name sortKey="Quehl, Bernhard" sort="Quehl, Bernhard" uniqKey="Quehl B" first="Bernhard" last="Quehl">Bernhard Quehl</name>
<affiliation><inist:fA14 i1="01"><s1>Hasso-Plattner-Institute for IT-Systems Engineering, University of Potsdam, Prof.-Dr.-Helmert Str. 2-4</s1>
<s2>14467 Potsdam</s2>
<s3>DEU</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
</inist:fA14>
</affiliation>
</author>
<author><name sortKey="Sack, Harald" sort="Sack, Harald" uniqKey="Sack H" first="Harald" last="Sack">Harald Sack</name>
<affiliation><inist:fA14 i1="01"><s1>Hasso-Plattner-Institute for IT-Systems Engineering, University of Potsdam, Prof.-Dr.-Helmert Str. 2-4</s1>
<s2>14467 Potsdam</s2>
<s3>DEU</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
</inist:fA14>
</affiliation>
</author>
</analytic>
<series><title level="j" type="main">Multimedia tools and applications</title>
<title level="j" type="abbreviated">Multimed. tools appl.</title>
<idno type="ISSN">1380-7501</idno>
<imprint><date when="2014">2014</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
<seriesStmt><title level="j" type="main">Multimedia tools and applications</title>
<title level="j" type="abbreviated">Multimed. tools appl.</title>
<idno type="ISSN">1380-7501</idno>
</seriesStmt>
</fileDesc>
<profileDesc><textClass><keywords scheme="KwdEn" xml:lang="en"><term>Business process</term>
<term>Character recognition</term>
<term>Computer vision</term>
<term>Electronic library</term>
<term>False alarm rate</term>
<term>Groupware</term>
<term>Image processing</term>
<term>Indexing</term>
<term>Information retrieval</term>
<term>Localization</term>
<term>Pattern recognition</term>
<term>Recall</term>
<term>Semantics</term>
<term>Text</term>
<term>Vector support machine</term>
<term>Video library</term>
<term>Video signal</term>
<term>Workflow</term>
</keywords>
<keywords scheme="Pascal" xml:lang="fr"><term>Signal vidéo</term>
<term>Reconnaissance caractère</term>
<term>Texte</term>
<term>Reconnaissance forme</term>
<term>Recherche information</term>
<term>Traitement image</term>
<term>Indexation</term>
<term>Vision ordinateur</term>
<term>Bibliothèque électronique</term>
<term>Vidéothèque</term>
<term>Collecticiel</term>
<term>Workflow</term>
<term>Sémantique</term>
<term>Processus métier</term>
<term>Rappel</term>
<term>Taux fausse alarme</term>
<term>Classification à vaste marge</term>
<term>Localisation</term>
<term>.</term>
</keywords>
</textClass>
</profileDesc>
</teiHeader>
<front><div type="abstract" xml:lang="en">Text displayed in a video is an essential part for the high-level semantic information of the video content. Therefore, video text can be used as a valuable source for automated video indexing in digital video libraries. In this paper, we propose a workflow for video text detection and recognition. In the text detection stage, we have developed a fast localization-verification scheme, in which an edge-based multi-scale text detector first identifies potential text candidates with high recall rate. Then, detected candidate text lines are refined by using an image entropy-based filter. Finally, Stroke Width Transform (SWT) - and Support Vector Machine (SVM)-based verification procedures are applied to eliminate the false alarms. For text recognition, we have developed a novel skeleton-based binarization method in order to separate text from complex backgrounds to make it processible for standard OCR (Optical Character Recognition) software. Operability and accuracy of proposed text detection and binarization methods have been evaluated by using publicly available test data sets.</div>
</front>
</TEI>
<inist><standard h6="B"><pA><fA01 i1="01" i2="1"><s0>1380-7501</s0>
</fA01>
<fA03 i2="1"><s0>Multimed. tools appl.</s0>
</fA03>
<fA05><s2>69</s2>
</fA05>
<fA06><s2>1</s2>
</fA06>
<fA08 i1="01" i2="1" l="ENG"><s1>A framework for improved video text detection and recognition</s1>
</fA08>
<fA09 i1="01" i2="1" l="ENG"><s1>Computer Vision for Multimedia</s1>
</fA09>
<fA11 i1="01" i2="1"><s1>HAOJIN YANG</s1>
</fA11>
<fA11 i1="02" i2="1"><s1>QUEHL (Bernhard)</s1>
</fA11>
<fA11 i1="03" i2="1"><s1>SACK (Harald)</s1>
</fA11>
<fA12 i1="01" i2="1"><s1>TIAN (Jing)</s1>
<s9>ed.</s9>
</fA12>
<fA12 i1="02" i2="1"><s1>CHEN (Li)</s1>
<s9>ed.</s9>
</fA12>
<fA14 i1="01"><s1>Hasso-Plattner-Institute for IT-Systems Engineering, University of Potsdam, Prof.-Dr.-Helmert Str. 2-4</s1>
<s2>14467 Potsdam</s2>
<s3>DEU</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
</fA14>
<fA15 i1="01"><s1>Institute for Infocomm Research</s1>
<s2>Singapore 138632</s2>
<s3>SGP</s3>
<sZ>1 aut.</sZ>
</fA15>
<fA15 i1="02"><s1>Wuhan University of Science and Technology</s1>
<s2>Wuhan 430081</s2>
<s3>CHN</s3>
<sZ>2 aut.</sZ>
</fA15>
<fA20><s1>217-245</s1>
</fA20>
<fA21><s1>2014</s1>
</fA21>
<fA23 i1="01"><s0>ENG</s0>
</fA23>
<fA43 i1="01"><s1>INIST</s1>
<s2>28305</s2>
<s5>354000501881100100</s5>
</fA43>
<fA44><s0>0000</s0>
<s1>© 2014 INIST-CNRS. All rights reserved.</s1>
</fA44>
<fA45><s0>39 ref.</s0>
</fA45>
<fA47 i1="01" i2="1"><s0>14-0217177</s0>
</fA47>
<fA60><s1>P</s1>
</fA60>
<fA61><s0>A</s0>
</fA61>
<fA64 i1="01" i2="1"><s0>Multimedia tools and applications</s0>
</fA64>
<fA66 i1="01"><s0>DEU</s0>
</fA66>
<fC01 i1="01" l="ENG"><s0>Text displayed in a video is an essential part for the high-level semantic information of the video content. Therefore, video text can be used as a valuable source for automated video indexing in digital video libraries. In this paper, we propose a workflow for video text detection and recognition. In the text detection stage, we have developed a fast localization-verification scheme, in which an edge-based multi-scale text detector first identifies potential text candidates with high recall rate. Then, detected candidate text lines are refined by using an image entropy-based filter. Finally, Stroke Width Transform (SWT) - and Support Vector Machine (SVM)-based verification procedures are applied to eliminate the false alarms. For text recognition, we have developed a novel skeleton-based binarization method in order to separate text from complex backgrounds to make it processible for standard OCR (Optical Character Recognition) software. Operability and accuracy of proposed text detection and binarization methods have been evaluated by using publicly available test data sets.</s0>
</fC01>
<fC02 i1="01" i2="X"><s0>001D02C03</s0>
</fC02>
<fC02 i1="02" i2="X"><s0>001D02C04</s0>
</fC02>
<fC02 i1="03" i2="X"><s0>001D02B07D</s0>
</fC02>
<fC02 i1="04" i2="X"><s0>001D02B07B</s0>
</fC02>
<fC03 i1="01" i2="X" l="FRE"><s0>Signal vidéo</s0>
<s5>06</s5>
</fC03>
<fC03 i1="01" i2="X" l="ENG"><s0>Video signal</s0>
<s5>06</s5>
</fC03>
<fC03 i1="01" i2="X" l="SPA"><s0>Señal video</s0>
<s5>06</s5>
</fC03>
<fC03 i1="02" i2="X" l="FRE"><s0>Reconnaissance caractère</s0>
<s5>07</s5>
</fC03>
<fC03 i1="02" i2="X" l="ENG"><s0>Character recognition</s0>
<s5>07</s5>
</fC03>
<fC03 i1="02" i2="X" l="SPA"><s0>Reconocimiento carácter</s0>
<s5>07</s5>
</fC03>
<fC03 i1="03" i2="X" l="FRE"><s0>Texte</s0>
<s5>08</s5>
</fC03>
<fC03 i1="03" i2="X" l="ENG"><s0>Text</s0>
<s5>08</s5>
</fC03>
<fC03 i1="03" i2="X" l="SPA"><s0>Texto</s0>
<s5>08</s5>
</fC03>
<fC03 i1="04" i2="X" l="FRE"><s0>Reconnaissance forme</s0>
<s5>09</s5>
</fC03>
<fC03 i1="04" i2="X" l="ENG"><s0>Pattern recognition</s0>
<s5>09</s5>
</fC03>
<fC03 i1="04" i2="X" l="SPA"><s0>Reconocimiento patrón</s0>
<s5>09</s5>
</fC03>
<fC03 i1="05" i2="X" l="FRE"><s0>Recherche information</s0>
<s5>10</s5>
</fC03>
<fC03 i1="05" i2="X" l="ENG"><s0>Information retrieval</s0>
<s5>10</s5>
</fC03>
<fC03 i1="05" i2="X" l="SPA"><s0>Búsqueda información</s0>
<s5>10</s5>
</fC03>
<fC03 i1="06" i2="X" l="FRE"><s0>Traitement image</s0>
<s5>11</s5>
</fC03>
<fC03 i1="06" i2="X" l="ENG"><s0>Image processing</s0>
<s5>11</s5>
</fC03>
<fC03 i1="06" i2="X" l="SPA"><s0>Procesamiento imagen</s0>
<s5>11</s5>
</fC03>
<fC03 i1="07" i2="X" l="FRE"><s0>Indexation</s0>
<s5>12</s5>
</fC03>
<fC03 i1="07" i2="X" l="ENG"><s0>Indexing</s0>
<s5>12</s5>
</fC03>
<fC03 i1="07" i2="X" l="SPA"><s0>Indización</s0>
<s5>12</s5>
</fC03>
<fC03 i1="08" i2="X" l="FRE"><s0>Vision ordinateur</s0>
<s5>13</s5>
</fC03>
<fC03 i1="08" i2="X" l="ENG"><s0>Computer vision</s0>
<s5>13</s5>
</fC03>
<fC03 i1="08" i2="X" l="SPA"><s0>Visión ordenador</s0>
<s5>13</s5>
</fC03>
<fC03 i1="09" i2="X" l="FRE"><s0>Bibliothèque électronique</s0>
<s5>14</s5>
</fC03>
<fC03 i1="09" i2="X" l="ENG"><s0>Electronic library</s0>
<s5>14</s5>
</fC03>
<fC03 i1="09" i2="X" l="SPA"><s0>Biblioteca electronica</s0>
<s5>14</s5>
</fC03>
<fC03 i1="10" i2="X" l="FRE"><s0>Vidéothèque</s0>
<s5>15</s5>
</fC03>
<fC03 i1="10" i2="X" l="ENG"><s0>Video library</s0>
<s5>15</s5>
</fC03>
<fC03 i1="10" i2="X" l="SPA"><s0>Videoteca</s0>
<s5>15</s5>
</fC03>
<fC03 i1="11" i2="X" l="FRE"><s0>Collecticiel</s0>
<s5>16</s5>
</fC03>
<fC03 i1="11" i2="X" l="ENG"><s0>Groupware</s0>
<s5>16</s5>
</fC03>
<fC03 i1="11" i2="X" l="SPA"><s0>Groupware</s0>
<s5>16</s5>
</fC03>
<fC03 i1="12" i2="X" l="FRE"><s0>Workflow</s0>
<s5>17</s5>
</fC03>
<fC03 i1="12" i2="X" l="ENG"><s0>Workflow</s0>
<s5>17</s5>
</fC03>
<fC03 i1="12" i2="X" l="SPA"><s0>Workflow</s0>
<s5>17</s5>
</fC03>
<fC03 i1="13" i2="X" l="FRE"><s0>Sémantique</s0>
<s5>18</s5>
</fC03>
<fC03 i1="13" i2="X" l="ENG"><s0>Semantics</s0>
<s5>18</s5>
</fC03>
<fC03 i1="13" i2="X" l="SPA"><s0>Semántica</s0>
<s5>18</s5>
</fC03>
<fC03 i1="14" i2="X" l="FRE"><s0>Processus métier</s0>
<s5>19</s5>
</fC03>
<fC03 i1="14" i2="X" l="ENG"><s0>Business process</s0>
<s5>19</s5>
</fC03>
<fC03 i1="14" i2="X" l="SPA"><s0>Proceso oficio</s0>
<s5>19</s5>
</fC03>
<fC03 i1="15" i2="X" l="FRE"><s0>Rappel</s0>
<s5>20</s5>
</fC03>
<fC03 i1="15" i2="X" l="ENG"><s0>Recall</s0>
<s5>20</s5>
</fC03>
<fC03 i1="15" i2="X" l="SPA"><s0>Llamada</s0>
<s5>20</s5>
</fC03>
<fC03 i1="16" i2="X" l="FRE"><s0>Taux fausse alarme</s0>
<s5>21</s5>
</fC03>
<fC03 i1="16" i2="X" l="ENG"><s0>False alarm rate</s0>
<s5>21</s5>
</fC03>
<fC03 i1="16" i2="X" l="SPA"><s0>Porcentaje falsa alarma</s0>
<s5>21</s5>
</fC03>
<fC03 i1="17" i2="X" l="FRE"><s0>Classification à vaste marge</s0>
<s5>23</s5>
</fC03>
<fC03 i1="17" i2="X" l="ENG"><s0>Vector support machine</s0>
<s5>23</s5>
</fC03>
<fC03 i1="17" i2="X" l="SPA"><s0>Máquina ejemplo soporte</s0>
<s5>23</s5>
</fC03>
<fC03 i1="18" i2="X" l="FRE"><s0>Localisation</s0>
<s5>41</s5>
</fC03>
<fC03 i1="18" i2="X" l="ENG"><s0>Localization</s0>
<s5>41</s5>
</fC03>
<fC03 i1="18" i2="X" l="SPA"><s0>Localización</s0>
<s5>41</s5>
</fC03>
<fC03 i1="19" i2="X" l="FRE"><s0>.</s0>
<s4>INC</s4>
<s5>82</s5>
</fC03>
<fN21><s1>265</s1>
</fN21>
<fN44 i1="01"><s1>OTO</s1>
</fN44>
<fN82><s1>OTO</s1>
</fN82>
</pA>
</standard>
<server><NO>PASCAL 14-0217177 INIST</NO>
<ET>A framework for improved video text detection and recognition</ET>
<AU>HAOJIN YANG; QUEHL (Bernhard); SACK (Harald); TIAN (Jing); CHEN (Li)</AU>
<AF>Hasso-Plattner-Institute for IT-Systems Engineering, University of Potsdam, Prof.-Dr.-Helmert Str. 2-4/14467 Potsdam/Allemagne (1 aut., 2 aut., 3 aut.); Institute for Infocomm Research/Singapore 138632/Singapour (1 aut.); Wuhan University of Science and Technology/Wuhan 430081/Chine (2 aut.)</AF>
<DT>Publication en série; Niveau analytique</DT>
<SO>Multimedia tools and applications; ISSN 1380-7501; Allemagne; Da. 2014; Vol. 69; No. 1; Pp. 217-245; Bibl. 39 ref.</SO>
<LA>Anglais</LA>
<EA>Text displayed in a video is an essential part for the high-level semantic information of the video content. Therefore, video text can be used as a valuable source for automated video indexing in digital video libraries. In this paper, we propose a workflow for video text detection and recognition. In the text detection stage, we have developed a fast localization-verification scheme, in which an edge-based multi-scale text detector first identifies potential text candidates with high recall rate. Then, detected candidate text lines are refined by using an image entropy-based filter. Finally, Stroke Width Transform (SWT) - and Support Vector Machine (SVM)-based verification procedures are applied to eliminate the false alarms. For text recognition, we have developed a novel skeleton-based binarization method in order to separate text from complex backgrounds to make it processible for standard OCR (Optical Character Recognition) software. Operability and accuracy of proposed text detection and binarization methods have been evaluated by using publicly available test data sets.</EA>
<CC>001D02C03; 001D02C04; 001D02B07D; 001D02B07B</CC>
<FD>Signal vidéo; Reconnaissance caractère; Texte; Reconnaissance forme; Recherche information; Traitement image; Indexation; Vision ordinateur; Bibliothèque électronique; Vidéothèque; Collecticiel; Workflow; Sémantique; Processus métier; Rappel; Taux fausse alarme; Classification à vaste marge; Localisation; .</FD>
<ED>Video signal; Character recognition; Text; Pattern recognition; Information retrieval; Image processing; Indexing; Computer vision; Electronic library; Video library; Groupware; Workflow; Semantics; Business process; Recall; False alarm rate; Vector support machine; Localization</ED>
<SD>Señal video; Reconocimiento carácter; Texto; Reconocimiento patrón; Búsqueda información; Procesamiento imagen; Indización; Visión ordenador; Biblioteca electronica; Videoteca; Groupware; Workflow; Semántica; Proceso oficio; Llamada; Porcentaje falsa alarma; Máquina ejemplo soporte; Localización</SD>
<LO>INIST-28305.354000501881100100</LO>
<ID>14-0217177</ID>
</server>
</inist>
</record>
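
As a minimal sketch of how the keyword lists in this XML could be extracted, the example below parses a shortened copy of the textClass element with Python's standard xml.etree.ElementTree module. It works on a fragment because the full record uses an inist: prefix with no visible namespace declaration, which a namespace-aware parser is likely to reject. The helper name terms_by_scheme is hypothetical.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List

# Shortened copy of the <textClass> element from the record above.
fragment = (
    '<textClass>'
    '<keywords scheme="KwdEn" xml:lang="en">'
    '<term>Business process</term><term>Character recognition</term>'
    '</keywords>'
    '<keywords scheme="Pascal" xml:lang="fr">'
    '<term>Signal vidéo</term>'
    '</keywords>'
    '</textClass>'
)


def terms_by_scheme(root: ET.Element) -> Dict[str, List[str]]:
    """Map each keywords scheme attribute to the list of its term texts."""
    return {
        kw.get("scheme", ""): [t.text or "" for t in kw.findall("term")]
        for kw in root.iter("keywords")
    }


terms = terms_by_scheme(ET.fromstring(fragment))
print(terms["KwdEn"])
print(terms["Pascal"])
```

To process the whole record the same way, one option is to declare the inist prefix on the root element before parsing.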

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Ticri/CIDE/explor/OcrV1/Data/PascalFrancis/Corpus

HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000006 | SxmlIndent | more

HfdSelect -h $EXPLOR_AREA/Data/PascalFrancis/Corpus/biblio.hfd -nk 000006 | SxmlIndent | more

To link to this page within the Wicri network

{{Explor lien
   |wiki=    Ticri/CIDE
   |area=    OcrV1
   |flux=    PascalFrancis
   |étape=   Corpus
   |type=    RBID
   |clé=     Pascal:14-0217177
   |texte=   A framework for improved video text detection and recognition
}}

This area was generated with Dilib version V0.6.32.
Data generation: Sat Nov 11 16:53:45 2017. Site generation: Mon Mar 11 23:15:16 2024

	OCR exploration server
	Warning: this site is under development. It was generated automatically from raw corpora, so the information has not been validated.
