OCR exploration server

Warning: this site is under development!
Warning: this site was generated automatically from raw corpora.
The information has therefore not been validated.

Information filtering in Chinese document images based on templates matching and confidence measure

Internal identifier: 001687 (Main/Merge); previous: 001686; next: 001688

Authors: Jiewei Chen [People's Republic of China]; Weiran Xu [People's Republic of China]; Jun Guo [People's Republic of China]

Source:

RBID : Pascal:06-0453060

French descriptors

English descriptors

Abstract

A fast approach to keyword spotting in Chinese document images, based on multiple-template matching and a confidence measure, is presented. The system generates a keyword lexicon covering diverse fonts, along with two-stage feature vectors, before keyword searching begins. A two-stage retrieval scheme and the Boyer-Moore algorithm are proposed to accelerate the retrieval process. A distance measure between the candidate character and the templates is used to identify and rank similar templates. The performance of the new system is significantly improved compared with traditional OCR and image-based approaches. Experimental results confirmed the robustness of the proposed approach over a wide range of degradations.
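
The abstract names two building blocks without giving details: ranking templates by a distance measure, and Boyer-Moore matching to speed up retrieval. The Python sketch below illustrates both ideas under stated assumptions and is not the authors' implementation. The Euclidean distance, the function names `rank_templates` and `horspool_search`, and the use of plain character strings in place of recognized candidate characters are all hypothetical choices. The search uses the Horspool simplification of Boyer-Moore (bad-character rule only).

```python
import math

def rank_templates(candidate, templates):
    """Rank templates by distance to a candidate feature vector.

    Assumption: Euclidean distance; the abstract does not name the measure.
    Returns a list of (distance, label) pairs, nearest first.
    """
    scored = [(math.dist(candidate, vec), label) for label, vec in templates.items()]
    return sorted(scored)

def horspool_search(text, pattern):
    """Return the start indices of pattern in text (Boyer-Moore-Horspool)."""
    m, n = len(pattern), len(text)
    if m == 0 or m > n:
        return []
    # Bad-character table: shift for every pattern symbol except the last one.
    shift = {sym: m - 1 - i for i, sym in enumerate(pattern[:-1])}
    hits, pos = [], 0
    while pos <= n - m:
        if text[pos:pos + m] == pattern:
            hits.append(pos)
        # Shift according to the text symbol aligned with the pattern's end;
        # symbols absent from the table allow a full-length jump.
        pos += shift.get(text[pos + m - 1], m)
    return hits

if __name__ == "__main__":
    # Hypothetical two-dimensional feature vectors for two font templates.
    print(rank_templates([0.0, 0.0], {"font_a": [3.0, 4.0], "font_b": [1.0, 0.0]}))
    # Hypothetical recognized text line and keyword.
    print(horspool_search("文档图像中的关键词检索与关键词过滤", "关键词"))
```

The skip table is what lets the search jump several positions at once when the character under the end of the keyword cannot occur there, which is the acceleration the abstract alludes to.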

Links toward previous steps (curation, corpus...)


Links to Exploration step

Pascal:06-0453060

The document in XML format

<record>
<TEI>
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en" level="a">Information filtering in Chinese document images based on templates matching and confidence measure</title>
<author>
<name sortKey="Chen, Jiewei" sort="Chen, Jiewei" uniqKey="Chen J" first="Jiewei" last="Chen">Jiewei Chen</name>
<affiliation wicri:level="1">
<inist:fA14 i1="01">
<s1>School of Information Engineering, Beijing University of Posts and Telecommunications</s1>
<s2>Beijing 100876</s2>
<s3>CHN</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
</inist:fA14>
<country>République populaire de Chine</country>
<placeName>
<settlement type="city">Pékin</settlement>
</placeName>
</affiliation>
</author>
<author>
<name sortKey="Xu, Weiran" sort="Xu, Weiran" uniqKey="Xu W" first="Weiran" last="Xu">Weiran Xu</name>
<affiliation wicri:level="1">
<inist:fA14 i1="01">
<s1>School of Information Engineering, Beijing University of Posts and Telecommunications</s1>
<s2>Beijing 100876</s2>
<s3>CHN</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
</inist:fA14>
<country>République populaire de Chine</country>
<placeName>
<settlement type="city">Pékin</settlement>
</placeName>
</affiliation>
</author>
<author>
<name sortKey="Guo, Jun" sort="Guo, Jun" uniqKey="Guo J" first="Jun" last="Guo">Jun Guo</name>
<affiliation wicri:level="1">
<inist:fA14 i1="01">
<s1>School of Information Engineering, Beijing University of Posts and Telecommunications</s1>
<s2>Beijing 100876</s2>
<s3>CHN</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
</inist:fA14>
<country>République populaire de Chine</country>
<placeName>
<settlement type="city">Pékin</settlement>
</placeName>
</affiliation>
</author>
</titleStmt>
<publicationStmt>
<idno type="wicri:source">INIST</idno>
<idno type="inist">06-0453060</idno>
<date when="2004">2004</date>
<idno type="stanalyst">PASCAL 06-0453060 INIST</idno>
<idno type="RBID">Pascal:06-0453060</idno>
<idno type="wicri:Area/PascalFrancis/Corpus">000369</idno>
<idno type="wicri:Area/PascalFrancis/Curation">000417</idno>
<idno type="wicri:Area/PascalFrancis/Checkpoint">000476</idno>
<idno type="wicri:Area/Main/Merge">001687</idno>
</publicationStmt>
<sourceDesc>
<biblStruct>
<analytic>
<title xml:lang="en" level="a">Information filtering in Chinese document images based on templates matching and confidence measure</title>
<author>
<name sortKey="Chen, Jiewei" sort="Chen, Jiewei" uniqKey="Chen J" first="Jiewei" last="Chen">Jiewei Chen</name>
<affiliation wicri:level="1">
<inist:fA14 i1="01">
<s1>School of Information Engineering, Beijing University of Posts and Telecommunications</s1>
<s2>Beijing 100876</s2>
<s3>CHN</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
</inist:fA14>
<country>République populaire de Chine</country>
<placeName>
<settlement type="city">Pékin</settlement>
</placeName>
</affiliation>
</author>
<author>
<name sortKey="Xu, Weiran" sort="Xu, Weiran" uniqKey="Xu W" first="Weiran" last="Xu">Weiran Xu</name>
<affiliation wicri:level="1">
<inist:fA14 i1="01">
<s1>School of Information Engineering, Beijing University of Posts and Telecommunications</s1>
<s2>Beijing 100876</s2>
<s3>CHN</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
</inist:fA14>
<country>République populaire de Chine</country>
<placeName>
<settlement type="city">Pékin</settlement>
</placeName>
</affiliation>
</author>
<author>
<name sortKey="Guo, Jun" sort="Guo, Jun" uniqKey="Guo J" first="Jun" last="Guo">Jun Guo</name>
<affiliation wicri:level="1">
<inist:fA14 i1="01">
<s1>School of Information Engineering, Beijing University of Posts and Telecommunications</s1>
<s2>Beijing 100876</s2>
<s3>CHN</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
</inist:fA14>
<country>République populaire de Chine</country>
<placeName>
<settlement type="city">Pékin</settlement>
</placeName>
</affiliation>
</author>
</analytic>
</biblStruct>
</sourceDesc>
</fileDesc>
<profileDesc>
<textClass>
<keywords scheme="KwdEn" xml:lang="en">
<term>Algorithm</term>
<term>Chinese</term>
<term>Degradation</term>
<term>Distance measurement</term>
<term>Document image processing</term>
<term>Feature extraction</term>
<term>Filtering</term>
<term>Image processing</term>
<term>Keyword</term>
<term>Lexicon</term>
<term>Multiple image</term>
<term>Multistage method</term>
<term>Optical character recognition</term>
<term>Pattern matching</term>
<term>Pattern recognition</term>
<term>Performance evaluation</term>
<term>Signal processing</term>
</keywords>
<keywords scheme="Pascal" xml:lang="fr">
<term>Filtrage</term>
<term>Chinois</term>
<term>Traitement image document</term>
<term>Concordance forme</term>
<term>Mot clé</term>
<term>Image multiple</term>
<term>Lexique</term>
<term>Méthode section divisée</term>
<term>Algorithme</term>
<term>Mesure de distance</term>
<term>Evaluation performance</term>
<term>Reconnaissance optique caractère</term>
<term>Dégradation</term>
<term>Traitement image</term>
<term>Reconnaissance forme</term>
<term>Extraction caractéristique</term>
<term>Traitement signal</term>
</keywords>
</textClass>
</profileDesc>
</teiHeader>
<front>
<div type="abstract" xml:lang="en">A fast approach to keyword spotting in Chinese document images, based on multiple-template matching and a confidence measure, is presented. The system generates a keyword lexicon covering diverse fonts, along with two-stage feature vectors, before keyword searching begins. A two-stage retrieval scheme and the Boyer-Moore algorithm are proposed to accelerate the retrieval process. A distance measure between the candidate character and the templates is used to identify and rank similar templates. The performance of the new system is significantly improved compared with traditional OCR and image-based approaches. Experimental results confirmed the robustness of the proposed approach over a wide range of degradations.</div>
</front>
</TEI>
<affiliations>
<list>
<country>
<li>République populaire de Chine</li>
</country>
<settlement>
<li>Pékin</li>
</settlement>
</list>
<tree>
<country name="République populaire de Chine">
<noRegion>
<name sortKey="Chen, Jiewei" sort="Chen, Jiewei" uniqKey="Chen J" first="Jiewei" last="Chen">Jiewei Chen</name>
</noRegion>
<name sortKey="Guo, Jun" sort="Guo, Jun" uniqKey="Guo J" first="Jun" last="Guo">Jun Guo</name>
<name sortKey="Xu, Weiran" sort="Xu, Weiran" uniqKey="Xu W" first="Weiran" last="Xu">Weiran Xu</name>
</country>
</tree>
</affiliations>
</record>
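
The record above uses the `inist:` and `wicri:` prefixes without declaring them, so a strict XML parser will likely reject it as written. The following Python sketch shows one possible workaround: it binds the prefixes to placeholder namespace URIs before parsing, then extracts the title, author names, and English keywords. The URIs, the function names `parse_record` and `summarize`, and the shortened `SAMPLE` record are assumptions made for illustration and are not part of Dilib.

```python
import xml.etree.ElementTree as ET

# Placeholder namespace URIs (assumption): the record never declares them.
DUMMY_NS = 'xmlns:inist="urn:example:inist" xmlns:wicri="urn:example:wicri"'

def parse_record(xml_text):
    """Parse a <record> element after binding its undeclared prefixes."""
    patched = xml_text.replace("<record>", "<record " + DUMMY_NS + ">", 1)
    return ET.fromstring(patched)

def summarize(record):
    """Collect the title, author names, and English keywords of a record."""
    return {
        "title": record.findtext(".//titleStmt/title"),
        "authors": [n.text for n in record.findall(".//titleStmt/author/name")],
        "keywords": [t.text for t in record.findall(".//keywords[@scheme='KwdEn']/term")],
    }

# Shortened, hypothetical excerpt with the same structure as the full record.
SAMPLE = """<record>
<TEI>
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en" level="a">Information filtering in Chinese document images based on templates matching and confidence measure</title>
<author>
<name sortKey="Chen, Jiewei">Jiewei Chen</name>
<affiliation wicri:level="1">
<inist:fA14 i1="01"><s3>CHN</s3></inist:fA14>
</affiliation>
</author>
<author>
<name sortKey="Xu, Weiran">Weiran Xu</name>
</author>
</titleStmt>
</fileDesc>
<profileDesc>
<textClass>
<keywords scheme="KwdEn" xml:lang="en">
<term>Keyword</term>
<term>Lexicon</term>
</keywords>
</textClass>
</profileDesc>
</teiHeader>
</TEI>
</record>"""

if __name__ == "__main__":
    info = summarize(parse_record(SAMPLE))
    print(info["title"])
    print("; ".join(info["authors"]))
    print(", ".join(info["keywords"]))
```

The path `.//titleStmt/author/name` restricts the search to the header, which avoids counting each author a second time from the `analytic` and `affiliations` sections of the full record.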

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Ticri/CIDE/explor/OcrV1/Data/Main/Merge
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 001687 | SxmlIndent | more

Or

HfdSelect -h $EXPLOR_AREA/Data/Main/Merge/biblio.hfd -nk 001687 | SxmlIndent | more

To link to this page within the Wicri network

{{Explor lien
   |wiki=    Ticri/CIDE
   |area=    OcrV1
   |flux=    Main
   |étape=   Merge
   |type=    RBID
   |clé=     Pascal:06-0453060
   |texte=   Information filtering in Chinese document images based on templates matching and confidence measure
}}

Wicri

This area was generated with Dilib version V0.6.32.
Data generation: Sat Nov 11 16:53:45 2017. Site generation: Mon Mar 11 23:15:16 2024