OCR exploration server

Warning: this site is under development.
Warning: this site was generated automatically from raw corpora.
The information has therefore not been validated.

A Heuristic Approach to Caption Enhancement for Effective Video OCR

Internal identifier: 000331 (Istex/Curation); previous: 000330; next: 000332

Authors: Lei Xie [People's Republic of China]; Xi Tan [People's Republic of China]

Source :

RBID : ISTEX:EC58ACAF01816737F364C0B94FA269259D681E66

Abstract

We present a heuristic approach to enhancing speech-synchronized captions for video OCR, as a pre-processing step for the subsequent tasks of multimedia indexing, segmentation and retrieval. To improve efficiency, we use a bi-search-based caption transition detection method that relies on a simple heuristic: the same caption content usually stays on screen for a period to allow stable viewing. We propose a combination of a color mask, a changing mask and a region mask to perform caption enhancement based on the discriminative characteristics of captions and backgrounds. Elaborate enhancement of individual characters is further used to remove small background residues. OCR experiments show that our caption enhancement approach achieves a high character accuracy of 89.24%.
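The abstract does not spell out the bi-search procedure, so the following is only a minimal sketch of one plausible reading, not the authors' implementation. It assumes that every caption lasts at least `stride` frames, so a jump of `stride` frames crosses at most one transition, which can then be located by bisection. The function name `find_transitions`, the `same` predicate and the `stride` parameter are all hypothetical; in a real system `same` would compare the caption regions of two video frames rather than plain strings.

```python
from typing import Callable, List, Sequence, TypeVar

T = TypeVar("T")


def find_transitions(
    frames: Sequence[T],
    same: Callable[[T, T], bool],
    stride: int,
) -> List[int]:
    """Return every index i at which frames[i] starts a new caption.

    Assumption (hypothetical, not from the paper): each caption lasts at
    least `stride` frames, so frames[lo] and frames[lo + stride] differ
    only if exactly one transition lies between them.
    """
    if stride < 1:
        raise ValueError("stride must be >= 1")
    transitions: List[int] = []
    n = len(frames)
    lo = 0
    while lo < n - 1:
        hi = min(lo + stride, n - 1)
        if same(frames[lo], frames[hi]):
            # Same caption at both ends: skip the frames in between.
            lo = hi
            continue
        # Bisection. Invariant: frames[left] shows the caption of
        # frames[lo], frames[right] shows a different caption.
        left, right = lo, hi
        while right - left > 1:
            mid = (left + right) // 2
            if same(frames[lo], frames[mid]):
                left = mid
            else:
                right = mid
        transitions.append(right)
        lo = right
    return transitions


# Toy usage: strings stand in for per-frame caption content.
frames = ["A"] * 5 + ["B"] * 7 + ["C"] * 4
print(find_transitions(frames, lambda a, b: a == b, stride=4))  # [5, 12]
```

With this scheme, most frames inside a stable caption are never compared, which is where the efficiency gain mentioned in the abstract would come from under the stated assumption.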

Url:
DOI: 10.1007/978-3-540-87442-3_44

Links to Exploration step

ISTEX:EC58ACAF01816737F364C0B94FA269259D681E66

Curation

No country items

Lei Xie
<affiliation>
<mods:affiliation>Human-Computer Communications Laboratory, The Chinese University of Hong Kong, Hong Kong SAR</mods:affiliation>
<wicri:noCountry code="subField">SAR</wicri:noCountry>
</affiliation>
<affiliation wicri:level="1">
<mods:affiliation>E-mail: lxie@nwpu.edu.cn</mods:affiliation>
<country wicri:rule="url">République populaire de Chine</country>
</affiliation>

The document in XML format

<record>
<TEI wicri:istexFullTextTei="biblStruct">
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en">A Heuristic Approach to Caption Enhancement for Effective Video OCR</title>
<author>
<name sortKey="Xie, Lei" sort="Xie, Lei" uniqKey="Xie L" first="Lei" last="Xie">Lei Xie</name>
<affiliation wicri:level="1">
<mods:affiliation>School of Computer Science, Northwestern Polytechnical University, Xi’an, China</mods:affiliation>
<country xml:lang="fr">République populaire de Chine</country>
<wicri:regionArea>School of Computer Science, Northwestern Polytechnical University, Xi’an</wicri:regionArea>
</affiliation>
<affiliation>
<mods:affiliation>Human-Computer Communications Laboratory, The Chinese University of Hong Kong, Hong Kong SAR</mods:affiliation>
<wicri:noCountry code="subField">SAR</wicri:noCountry>
</affiliation>
<affiliation wicri:level="1">
<mods:affiliation>E-mail: lxie@nwpu.edu.cn</mods:affiliation>
<country wicri:rule="url">République populaire de Chine</country>
</affiliation>
</author>
<author>
<name sortKey="Tan, Xi" sort="Tan, Xi" uniqKey="Tan X" first="Xi" last="Tan">Xi Tan</name>
<affiliation wicri:level="1">
<mods:affiliation>School of Computer Science, Northwestern Polytechnical University, Xi’an, China</mods:affiliation>
<country xml:lang="fr">République populaire de Chine</country>
<wicri:regionArea>School of Computer Science, Northwestern Polytechnical University, Xi’an</wicri:regionArea>
</affiliation>
<affiliation wicri:level="1">
<mods:affiliation>Tsinghua-CUHK Joint Research Center for Media Sciences, Technologies and Systems,  , Shenzhen, China</mods:affiliation>
<country xml:lang="fr">République populaire de Chine</country>
<wicri:regionArea>Tsinghua-CUHK Joint Research Center for Media Sciences, Technologies and Systems,  , Shenzhen</wicri:regionArea>
</affiliation>
</author>
</titleStmt>
<publicationStmt>
<idno type="wicri:source">ISTEX</idno>
<idno type="RBID">ISTEX:EC58ACAF01816737F364C0B94FA269259D681E66</idno>
<date when="2008" year="2008">2008</date>
<idno type="doi">10.1007/978-3-540-87442-3_44</idno>
<idno type="url">https://api.istex.fr/document/EC58ACAF01816737F364C0B94FA269259D681E66/fulltext/pdf</idno>
<idno type="wicri:Area/Istex/Corpus">000336</idno>
<idno type="wicri:Area/Istex/Curation">000331</idno>
</publicationStmt>
<sourceDesc>
<biblStruct>
<analytic>
<title level="a" type="main" xml:lang="en">A Heuristic Approach to Caption Enhancement for Effective Video OCR</title>
<author>
<name sortKey="Xie, Lei" sort="Xie, Lei" uniqKey="Xie L" first="Lei" last="Xie">Lei Xie</name>
<affiliation wicri:level="1">
<mods:affiliation>School of Computer Science, Northwestern Polytechnical University, Xi’an, China</mods:affiliation>
<country xml:lang="fr">République populaire de Chine</country>
<wicri:regionArea>School of Computer Science, Northwestern Polytechnical University, Xi’an</wicri:regionArea>
</affiliation>
<affiliation>
<mods:affiliation>Human-Computer Communications Laboratory, The Chinese University of Hong Kong, Hong Kong SAR</mods:affiliation>
<wicri:noCountry code="subField">SAR</wicri:noCountry>
</affiliation>
<affiliation wicri:level="1">
<mods:affiliation>E-mail: lxie@nwpu.edu.cn</mods:affiliation>
<country wicri:rule="url">République populaire de Chine</country>
</affiliation>
</author>
<author>
<name sortKey="Tan, Xi" sort="Tan, Xi" uniqKey="Tan X" first="Xi" last="Tan">Xi Tan</name>
<affiliation wicri:level="1">
<mods:affiliation>School of Computer Science, Northwestern Polytechnical University, Xi’an, China</mods:affiliation>
<country xml:lang="fr">République populaire de Chine</country>
<wicri:regionArea>School of Computer Science, Northwestern Polytechnical University, Xi’an</wicri:regionArea>
</affiliation>
<affiliation wicri:level="1">
<mods:affiliation>Tsinghua-CUHK Joint Research Center for Media Sciences, Technologies and Systems,  , Shenzhen, China</mods:affiliation>
<country xml:lang="fr">République populaire de Chine</country>
<wicri:regionArea>Tsinghua-CUHK Joint Research Center for Media Sciences, Technologies and Systems,  , Shenzhen</wicri:regionArea>
</affiliation>
</author>
</analytic>
<monogr></monogr>
<series>
<title level="s">Lecture Notes in Computer Science</title>
<imprint>
<date>2008</date>
</imprint>
<idno type="ISSN">0302-9743</idno>
<idno type="eISSN">1611-3349</idno>
<idno type="ISSN">0302-9743</idno>
</series>
<idno type="istex">EC58ACAF01816737F364C0B94FA269259D681E66</idno>
<idno type="DOI">10.1007/978-3-540-87442-3_44</idno>
<idno type="ChapterID">44</idno>
<idno type="ChapterID">Chap44</idno>
</biblStruct>
</sourceDesc>
<seriesStmt>
<idno type="ISSN">0302-9743</idno>
</seriesStmt>
</fileDesc>
<profileDesc>
<textClass></textClass>
<langUsage>
<language ident="en">en</language>
</langUsage>
</profileDesc>
</teiHeader>
<front>
<div type="abstract" xml:lang="en">Abstract: We present a heuristic approach to enhancing speech synchronized captions for video OCR, as a pre-process for subsequent tasks of multimedia indexing, segmentation and retrieval. We use a bi-search based caption transition detection method to improve efficiency, which adopts a simple heuristics that the same caption content usually lasts for a period for stable viewing. We propose a combination of color mask, changing mask and region mask to perform caption enhancement based on the discriminative characteristics of captions and backgrounds. Elaborate enhancement on individual characters is further used to remove small background residues. OCR experiments show that our caption enhancement approach brings a high character accuracy of 89.24%.</div>
</front>
</TEI>
</record>
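As printed, the record uses the `wicri:` and `mods:` prefixes without declaring them, so a namespace-aware XML parser is likely to reject it with an unbound-prefix error. The sketch below shows one way to read the basic bibliographic fields with Python's standard library: it wraps the record in a hypothetical `<wrapper>` element that binds the two prefixes to placeholder URIs. The URIs, the wrapper element and the `summarize` helper are assumptions made for illustration and are not part of Dilib or Wicri. The embedded record is an abbreviated copy of the one above.

```python
import xml.etree.ElementTree as ET

# Abbreviated copy of the record shown above.
RECORD = """<record>
<TEI wicri:istexFullTextTei="biblStruct">
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en">A Heuristic Approach to Caption Enhancement for Effective Video OCR</title>
<author>
<name sortKey="Xie, Lei" sort="Xie, Lei" uniqKey="Xie L" first="Lei" last="Xie">Lei Xie</name>
</author>
<author>
<name sortKey="Tan, Xi" sort="Tan, Xi" uniqKey="Tan X" first="Xi" last="Tan">Xi Tan</name>
</author>
</titleStmt>
<publicationStmt>
<date when="2008" year="2008">2008</date>
<idno type="doi">10.1007/978-3-540-87442-3_44</idno>
</publicationStmt>
</fileDesc>
</teiHeader>
</TEI>
</record>"""

# Placeholder namespace URIs (assumptions): they only serve to make the
# undeclared prefixes parseable and carry no official meaning.
WRAPPER = (
    '<wrapper xmlns:wicri="urn:example:wicri" '
    'xmlns:mods="urn:example:mods">{}</wrapper>'
)


def summarize(record_xml: str) -> dict:
    """Extract title, authors, year and DOI from a record like the one above."""
    root = ET.fromstring(WRAPPER.format(record_xml))
    stmt = root.find(".//titleStmt")
    return {
        "title": stmt.findtext("title"),
        "authors": [name.text for name in stmt.findall("author/name")],
        "year": root.findtext(".//publicationStmt/date"),
        "doi": root.findtext(".//publicationStmt/idno[@type='doi']"),
    }


print(summarize(RECORD))
```

Wrapping keeps the record text itself untouched; an alternative would be to add the declarations to the `<record>` start tag, at the cost of editing the data.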

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Ticri/CIDE/explor/OcrV1/Data/Istex/Curation
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000331 | SxmlIndent | more

Or

HfdSelect -h $EXPLOR_AREA/Data/Istex/Curation/biblio.hfd -nk 000331 | SxmlIndent | more

To link to this page within the Wicri network

{{Explor lien
   |wiki=    Ticri/CIDE
   |area=    OcrV1
   |flux=    Istex
   |étape=   Curation
   |type=    RBID
   |clé=     ISTEX:EC58ACAF01816737F364C0B94FA269259D681E66
   |texte=   A Heuristic Approach to Caption Enhancement for Effective Video OCR
}}

This area was generated with Dilib version V0.6.32.
Data generation: Sat Nov 11 16:53:45 2017. Site generation: Mon Mar 11 23:15:16 2024