A Heuristic Approach to Caption Enhancement for Effective Video OCR
Internal identifier: 000C91 (Main/Exploration); previous: 000C90; next: 000C92
Authors: Lei Xie [People's Republic of China]; Xi Tan [People's Republic of China]
Source:
- Lecture Notes in Computer Science [ 0302-9743 ] ; 2008.
Abstract
We present a heuristic approach to enhancing speech-synchronized captions for video OCR, as a preprocessing step for subsequent multimedia indexing, segmentation and retrieval tasks. To improve efficiency, we use a bi-search based caption transition detection method that relies on a simple heuristic: the same caption content usually stays on screen for a period to allow stable viewing. We propose a combination of a color mask, a changing mask and a region mask to perform caption enhancement based on the discriminative characteristics of captions and backgrounds. Elaborate enhancement of individual characters is then used to remove small background residues. OCR experiments show that our caption enhancement approach achieves a high character accuracy of 89.24%.
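The abstract does not spell out the bi-search procedure, so the following is only a minimal sketch of how such a transition detector might work, under stated assumptions. It assumes "bi-search" means binary search, that a caption stays on screen for at least `step` frames, and that a caller-supplied predicate `same(i, j)` can tell whether frames `i` and `j` show the same caption. The names `find_transition`, `detect_transitions`, `step` and `same` are hypothetical and do not come from the paper.

```python
from typing import Callable, List


def find_transition(lo: int, hi: int, same: Callable[[int, int], bool]) -> int:
    """Return the first frame in (lo, hi] whose caption differs from frame lo.

    Assumes (hypothetically) exactly one caption change between lo and hi.
    """
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if same(lo, mid):
            lo = mid  # the change lies to the right of mid
        else:
            hi = mid  # the change is at or before mid
    return hi


def detect_transitions(n_frames: int, step: int,
                       same: Callable[[int, int], bool]) -> List[int]:
    """Probe frames every `step` frames and binary-search only where captions differ."""
    transitions = []
    start = 0
    while start < n_frames - 1:
        end = min(start + step, n_frames - 1)
        if not same(start, end):
            transitions.append(find_transition(start, end, same))
        start = end
    return transitions


# Toy usage: captions are plain labels instead of image regions.
captions = ["a"] * 7 + ["b"] * 9 + ["c"] * 4
same = lambda i, j: captions[i] == captions[j]
print(detect_transitions(len(captions), 5, same))  # [7, 16]
```

In this sketch, a window whose two end frames match is skipped without further comparisons, which is where the efficiency gain would come from. The trade-off is that a caption shorter than `step` frames, or two changes inside one window, would not be detected correctly.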
DOI: 10.1007/978-3-540-87442-3_44
Links to previous steps (curation, corpus...)
- to stream Istex, to step Corpus: 000336
- to stream Istex, to step Curation: 000331
- to stream Istex, to step Checkpoint: 000758
- to stream Main, to step Merge: 000D03
- to stream Main, to step Curation: 000C91
The document in XML format
<record><TEI wicri:istexFullTextTei="biblStruct"><teiHeader><fileDesc><titleStmt><title xml:lang="en">A Heuristic Approach to Caption Enhancement for Effective Video OCR</title>
<author><name sortKey="Xie, Lei" sort="Xie, Lei" uniqKey="Xie L" first="Lei" last="Xie">Lei Xie</name>
</author>
<author><name sortKey="Tan, Xi" sort="Tan, Xi" uniqKey="Tan X" first="Xi" last="Tan">Xi Tan</name>
</author>
</titleStmt>
<publicationStmt><idno type="wicri:source">ISTEX</idno>
<idno type="RBID">ISTEX:EC58ACAF01816737F364C0B94FA269259D681E66</idno>
<date when="2008" year="2008">2008</date>
<idno type="doi">10.1007/978-3-540-87442-3_44</idno>
<idno type="url">https://api.istex.fr/document/EC58ACAF01816737F364C0B94FA269259D681E66/fulltext/pdf</idno>
<idno type="wicri:Area/Istex/Corpus">000336</idno>
<idno type="wicri:Area/Istex/Curation">000331</idno>
<idno type="wicri:Area/Istex/Checkpoint">000758</idno>
<idno type="wicri:doubleKey">0302-9743:2008:Xie L:a:heuristic:approach</idno>
<idno type="wicri:Area/Main/Merge">000D03</idno>
<idno type="wicri:Area/Main/Curation">000C91</idno>
<idno type="wicri:Area/Main/Exploration">000C91</idno>
</publicationStmt>
<sourceDesc><biblStruct><analytic><title level="a" type="main" xml:lang="en">A Heuristic Approach to Caption Enhancement for Effective Video OCR</title>
<author><name sortKey="Xie, Lei" sort="Xie, Lei" uniqKey="Xie L" first="Lei" last="Xie">Lei Xie</name>
<affiliation wicri:level="1"><country xml:lang="fr">République populaire de Chine</country>
<wicri:regionArea>School of Computer Science, Northwestern Polytechnical University, Xi’an</wicri:regionArea>
<wicri:noRegion>Xi’an</wicri:noRegion>
</affiliation>
<affiliation><wicri:noCountry code="subField">SAR</wicri:noCountry>
</affiliation>
<affiliation wicri:level="1"><country wicri:rule="url">République populaire de Chine</country>
</affiliation>
</author>
<author><name sortKey="Tan, Xi" sort="Tan, Xi" uniqKey="Tan X" first="Xi" last="Tan">Xi Tan</name>
<affiliation wicri:level="1"><country xml:lang="fr">République populaire de Chine</country>
<wicri:regionArea>School of Computer Science, Northwestern Polytechnical University, Xi’an</wicri:regionArea>
<wicri:noRegion>Xi’an</wicri:noRegion>
</affiliation>
<affiliation wicri:level="3"><country xml:lang="fr">République populaire de Chine</country>
<wicri:regionArea>Tsinghua-CUHK Joint Research Center for Media Sciences, Technologies and Systems, Shenzhen</wicri:regionArea>
<placeName><settlement type="city">Shenzhen</settlement>
<region type="province">Guangdong</region>
</placeName>
</affiliation>
</author>
</analytic>
<monogr></monogr>
<series><title level="s">Lecture Notes in Computer Science</title>
<imprint><date>2008</date>
</imprint>
<idno type="ISSN">0302-9743</idno>
<idno type="eISSN">1611-3349</idno>
<idno type="ISSN">0302-9743</idno>
</series>
<idno type="istex">EC58ACAF01816737F364C0B94FA269259D681E66</idno>
<idno type="DOI">10.1007/978-3-540-87442-3_44</idno>
<idno type="ChapterID">44</idno>
<idno type="ChapterID">Chap44</idno>
</biblStruct>
</sourceDesc>
<seriesStmt><idno type="ISSN">0302-9743</idno>
</seriesStmt>
</fileDesc>
<profileDesc><textClass></textClass>
<langUsage><language ident="en">en</language>
</langUsage>
</profileDesc>
</teiHeader>
<front><div type="abstract" xml:lang="en">Abstract: We present a heuristic approach to enhancing speech synchronized captions for video OCR, as a pre-process for subsequent tasks of multimedia indexing, segmentation and retrieval. We use a bi-search based caption transition detection method to improve efficiency, which adopts a simple heuristics that the same caption content usually lasts for a period for stable viewing. We propose a combination of color mask, changing mask and region mask to perform caption enhancement based on the discriminative characteristics of captions and backgrounds. Elaborate enhancement on individual characters is further used to remove small background residues. OCR experiments show that our caption enhancement approach brings a high character accuracy of 89.24%.</div>
</front>
</TEI>
<affiliations><list><country><li>République populaire de Chine</li>
</country>
<region><li>Guangdong</li>
</region>
<settlement><li>Shenzhen</li>
</settlement>
</list>
<tree><country name="République populaire de Chine"><noRegion><name sortKey="Xie, Lei" sort="Xie, Lei" uniqKey="Xie L" first="Lei" last="Xie">Lei Xie</name>
</noRegion>
<name sortKey="Tan, Xi" sort="Tan, Xi" uniqKey="Tan X" first="Xi" last="Tan">Xi Tan</name>
<name sortKey="Tan, Xi" sort="Tan, Xi" uniqKey="Tan X" first="Xi" last="Tan">Xi Tan</name>
<name sortKey="Xie, Lei" sort="Xie, Lei" uniqKey="Xie L" first="Lei" last="Xie">Lei Xie</name>
</country>
</tree>
</affiliations>
</record>
To manipulate this document under Unix (Dilib)
EXPLOR_STEP=$WICRI_ROOT/Ticri/CIDE/explor/OcrV1/Data/Main/Exploration
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000C91 | SxmlIndent | more
Or
HfdSelect -h $EXPLOR_AREA/Data/Main/Exploration/biblio.hfd -nk 000C91 | SxmlIndent | more
To add a link to this page in the Wicri network
{{Explor lien |wiki= Ticri/CIDE |area= OcrV1 |flux= Main |étape= Exploration |type= RBID |clé= ISTEX:EC58ACAF01816737F364C0B94FA269259D681E66 |texte= A Heuristic Approach to Caption Enhancement for Effective Video OCR }}
This area was generated with Dilib version V0.6.32.