Information filtering in Chinese document images based on templates matching and confidence measure
Internal identifier: 001687 (Main/Merge); previous: 001686; next: 001688
Authors: Jiewei Chen [People's Republic of China]; Weiran Xu [People's Republic of China]; Jun Guo [People's Republic of China]
French descriptors
- Pascal (Inist)
- Filtrage, Chinois, Traitement image document, Concordance forme, Mot clé, Image multiple, Lexique, Méthode section divisée, Algorithme, Mesure de distance, Evaluation performance, Reconnaissance optique caractère, Dégradation, Traitement image, Reconnaissance forme, Extraction caractéristique, Traitement signal.
English descriptors
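- KwdEn: Algorithm, Chinese, Degradation, Distance measurement, Document image processing, Feature extraction, Filtering, Image processing, Keyword, Lexicon, Multiple image, Multistage method, Optical character recognition, Pattern matching, Pattern recognition, Performance evaluation, Signal processing.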
Abstract
A fast approach to keyword spotting in Chinese document images, based on multiple-template matching and a confidence measure, is presented. The system generates a keyword lexicon covering diverse fonts, together with two-stage feature vectors, before keyword searching begins. A two-stage retrieval scheme and the Boyer-Moore algorithm are proposed to accelerate the retrieval process. A distance measure between the candidate character and the templates is used to identify and rank similar templates. The performance of the new system is significantly improved compared with traditional OCR and image-based approaches. Experimental results confirm the robustness of the proposed approach over a wide range of degradations.
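The abstract names two building blocks: ranking templates by a distance measure and accelerating the search with the Boyer-Moore algorithm. The following Python sketch illustrates both ideas under stated assumptions and is not the authors' implementation. The abstract does not specify the features or the distance, so Euclidean distance over short tuples is assumed here. The search uses the Horspool simplification of Boyer-Moore (bad-character rule only), and the confidence measure is not modeled. The names `rank_templates` and `horspool_search` are hypothetical.

```python
from math import dist  # Euclidean distance, available in Python 3.8+


def rank_templates(candidate, templates):
    """Return (label, distance) pairs sorted by increasing distance.

    `candidate` is a feature vector; `templates` maps a label to a vector.
    Euclidean distance is an assumption, not taken from the abstract.
    """
    scored = [(label, dist(candidate, vec)) for label, vec in templates.items()]
    return sorted(scored, key=lambda pair: pair[1])


def horspool_search(text, pattern):
    """Return every start index of `pattern` in `text`.

    Horspool variant of Boyer-Moore: after each alignment, shift by the
    distance from the last occurrence of the window's final symbol
    (excluding the pattern's last position) to the end of the pattern.
    """
    m, n = len(pattern), len(text)
    if m == 0 or m > n:
        return []
    # Later occurrences overwrite earlier ones, giving the smallest safe shift.
    shift = {sym: m - 1 - i for i, sym in enumerate(pattern[:-1])}
    hits, pos = [], 0
    while pos <= n - m:
        if text[pos:pos + m] == pattern:
            hits.append(pos)
        # Symbols absent from the table allow a full-length shift.
        pos += shift.get(text[pos + m - 1], m)
    return hits


# Toy data: three templates in a 2-D feature space (illustrative values only).
templates = {"A": (0.0, 0.0), "B": (3.0, 4.0), "C": (1.0, 0.0)}
ranking = rank_templates((0.9, 0.0), templates)
best_label = ranking[0][0]

# Once characters are mapped to labels, keyword search is sequence matching.
positions = horspool_search("abcabcab", "cab")
```

In this sketch, each candidate character would first be mapped to its closest template label, and the keyword would then be searched in the resulting label sequence, so that most alignments are skipped without a full comparison.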
Links toward previous steps (curation, corpus...)
- to stream PascalFrancis, to step Corpus: 000369
- to stream PascalFrancis, to step Curation: 000417
- to stream PascalFrancis, to step Checkpoint: 000476
Links to Exploration step
Pascal:06-0453060
The document in XML format
<record><TEI><teiHeader><fileDesc><titleStmt><title xml:lang="en" level="a">Information filtering in Chinese document images based on templates matching and confidence measure</title>
<author><name sortKey="Chen, Jiewei" sort="Chen, Jiewei" uniqKey="Chen J" first="Jiewei" last="Chen">Jiewei Chen</name>
<affiliation wicri:level="1"><inist:fA14 i1="01"><s1>School of Information Engineering, Beijing University of Posts and Telecommunications</s1>
<s2>Beijing 100876</s2>
<s3>CHN</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
</inist:fA14>
<country>République populaire de Chine</country>
<placeName><settlement type="city">Pékin</settlement>
</placeName>
</affiliation>
</author>
<author><name sortKey="Xu, Weiran" sort="Xu, Weiran" uniqKey="Xu W" first="Weiran" last="Xu">Weiran Xu</name>
<affiliation wicri:level="1"><inist:fA14 i1="01"><s1>School of Information Engineering, Beijing University of Posts and Telecommunications</s1>
<s2>Beijing 100876</s2>
<s3>CHN</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
</inist:fA14>
<country>République populaire de Chine</country>
<placeName><settlement type="city">Pékin</settlement>
</placeName>
</affiliation>
</author>
<author><name sortKey="Guo, Jun" sort="Guo, Jun" uniqKey="Guo J" first="Jun" last="Guo">Jun Guo</name>
<affiliation wicri:level="1"><inist:fA14 i1="01"><s1>School of Information Engineering, Beijing University of Posts and Telecommunications</s1>
<s2>Beijing 100876</s2>
<s3>CHN</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
</inist:fA14>
<country>République populaire de Chine</country>
<placeName><settlement type="city">Pékin</settlement>
</placeName>
</affiliation>
</author>
</titleStmt>
<publicationStmt><idno type="wicri:source">INIST</idno>
<idno type="inist">06-0453060</idno>
<date when="2004">2004</date>
<idno type="stanalyst">PASCAL 06-0453060 INIST</idno>
<idno type="RBID">Pascal:06-0453060</idno>
<idno type="wicri:Area/PascalFrancis/Corpus">000369</idno>
<idno type="wicri:Area/PascalFrancis/Curation">000417</idno>
<idno type="wicri:Area/PascalFrancis/Checkpoint">000476</idno>
<idno type="wicri:Area/Main/Merge">001687</idno>
</publicationStmt>
<sourceDesc><biblStruct><analytic><title xml:lang="en" level="a">Information filtering in Chinese document images based on templates matching and confidence measure</title>
<author><name sortKey="Chen, Jiewei" sort="Chen, Jiewei" uniqKey="Chen J" first="Jiewei" last="Chen">Jiewei Chen</name>
<affiliation wicri:level="1"><inist:fA14 i1="01"><s1>School of Information Engineering, Beijing University of Posts and Telecommunications</s1>
<s2>Beijing 100876</s2>
<s3>CHN</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
</inist:fA14>
<country>République populaire de Chine</country>
<placeName><settlement type="city">Pékin</settlement>
</placeName>
</affiliation>
</author>
<author><name sortKey="Xu, Weiran" sort="Xu, Weiran" uniqKey="Xu W" first="Weiran" last="Xu">Weiran Xu</name>
<affiliation wicri:level="1"><inist:fA14 i1="01"><s1>School of Information Engineering, Beijing University of Posts and Telecommunications</s1>
<s2>Beijing 100876</s2>
<s3>CHN</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
</inist:fA14>
<country>République populaire de Chine</country>
<placeName><settlement type="city">Pékin</settlement>
</placeName>
</affiliation>
</author>
<author><name sortKey="Guo, Jun" sort="Guo, Jun" uniqKey="Guo J" first="Jun" last="Guo">Jun Guo</name>
<affiliation wicri:level="1"><inist:fA14 i1="01"><s1>School of Information Engineering, Beijing University of Posts and Telecommunications</s1>
<s2>Beijing 100876</s2>
<s3>CHN</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
</inist:fA14>
<country>République populaire de Chine</country>
<placeName><settlement type="city">Pékin</settlement>
</placeName>
</affiliation>
</author>
</analytic>
</biblStruct>
</sourceDesc>
</fileDesc>
<profileDesc><textClass><keywords scheme="KwdEn" xml:lang="en"><term>Algorithm</term>
<term>Chinese</term>
<term>Degradation</term>
<term>Distance measurement</term>
<term>Document image processing</term>
<term>Feature extraction</term>
<term>Filtering</term>
<term>Image processing</term>
<term>Keyword</term>
<term>Lexicon</term>
<term>Multiple image</term>
<term>Multistage method</term>
<term>Optical character recognition</term>
<term>Pattern matching</term>
<term>Pattern recognition</term>
<term>Performance evaluation</term>
<term>Signal processing</term>
</keywords>
<keywords scheme="Pascal" xml:lang="fr"><term>Filtrage</term>
<term>Chinois</term>
<term>Traitement image document</term>
<term>Concordance forme</term>
<term>Mot clé</term>
<term>Image multiple</term>
<term>Lexique</term>
<term>Méthode section divisée</term>
<term>Algorithme</term>
<term>Mesure de distance</term>
<term>Evaluation performance</term>
<term>Reconnaissance optique caractère</term>
<term>Dégradation</term>
<term>Traitement image</term>
<term>Reconnaissance forme</term>
<term>Extraction caractéristique</term>
<term>Traitement signal</term>
</keywords>
</textClass>
</profileDesc>
</teiHeader>
<front><div type="abstract" xml:lang="en">A fast approach to keyword spotting in Chinese document images, based on multiple-template matching and a confidence measure, is presented. The system generates a keyword lexicon covering diverse fonts, together with two-stage feature vectors, before keyword searching begins. A two-stage retrieval scheme and the Boyer-Moore algorithm are proposed to accelerate the retrieval process. A distance measure between the candidate character and the templates is used to identify and rank similar templates. The performance of the new system is significantly improved compared with traditional OCR and image-based approaches. Experimental results confirm the robustness of the proposed approach over a wide range of degradations.</div>
</front>
</TEI>
<affiliations><list><country><li>République populaire de Chine</li>
</country>
<settlement><li>Pékin</li>
</settlement>
</list>
<tree><country name="République populaire de Chine"><noRegion><name sortKey="Chen, Jiewei" sort="Chen, Jiewei" uniqKey="Chen J" first="Jiewei" last="Chen">Jiewei Chen</name>
</noRegion>
<name sortKey="Guo, Jun" sort="Guo, Jun" uniqKey="Guo J" first="Jun" last="Guo">Jun Guo</name>
<name sortKey="Xu, Weiran" sort="Xu, Weiran" uniqKey="Xu W" first="Weiran" last="Xu">Weiran Xu</name>
</country>
</tree>
</affiliations>
</record>
To manipulate this document under Unix (Dilib)
EXPLOR_STEP=$WICRI_ROOT/Ticri/CIDE/explor/OcrV1/Data/Main/Merge
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 001687 | SxmlIndent | more
Or
HfdSelect -h $EXPLOR_AREA/Data/Main/Merge/biblio.hfd -nk 001687 | SxmlIndent | more
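Where the Dilib tools are not available, the record can also be read with a general-purpose XML parser. The sketch below uses Python's standard `xml.etree.ElementTree` on a trimmed copy of the record. Because the record uses the `wicri:` and `inist:` prefixes without declaring them, a wrapper element with placeholder namespace URIs is added here. Both the wrapper and the URIs are assumptions made only so that the parser accepts the input, and the function name `summarize` is hypothetical.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the record. The <wrapper> element and its namespace URIs
# are placeholders (assumptions) so that the wicri:/inist: prefixes parse.
SAMPLE = """<wrapper xmlns:wicri="urn:example:wicri" xmlns:inist="urn:example:inist">
<record><TEI><teiHeader><fileDesc><titleStmt>
<title xml:lang="en" level="a">Information filtering in Chinese document images based on templates matching and confidence measure</title>
<author><name sortKey="Chen, Jiewei" first="Jiewei" last="Chen">Jiewei Chen</name>
<affiliation wicri:level="1"><country>République populaire de Chine</country></affiliation></author>
<author><name sortKey="Xu, Weiran" first="Weiran" last="Xu">Weiran Xu</name></author>
<author><name sortKey="Guo, Jun" first="Jun" last="Guo">Jun Guo</name></author>
</titleStmt></fileDesc>
<profileDesc><textClass><keywords scheme="KwdEn" xml:lang="en"><term>Algorithm</term>
<term>Chinese</term></keywords></textClass></profileDesc>
</teiHeader></TEI></record></wrapper>"""


def summarize(xml_text):
    """Extract the title, author names, and English keywords from a record."""
    root = ET.fromstring(xml_text)
    # Restrict to <titleStmt> so authors repeated in <sourceDesc> are not doubled.
    title_stmt = root.find(".//titleStmt")
    title = title_stmt.findtext("title")
    authors = [name.text for name in title_stmt.findall("author/name")]
    keywords = [t.text for t in root.findall(".//keywords[@scheme='KwdEn']/term")]
    return {"title": title, "authors": authors, "keywords": keywords}


summary = summarize(SAMPLE)
```

Reading the authors from `titleStmt` only is a design choice in this sketch: the full record lists the same authors again under `sourceDesc`, so a document-wide search would return each name twice.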
To link to this page in the Wicri network
{{Explor lien |wiki= Ticri/CIDE |area= OcrV1 |flux= Main |étape= Merge |type= RBID |clé= Pascal:06-0453060 |texte= Information filtering in Chinese document images based on templates matching and confidence measure }}
This area was generated with Dilib version V0.6.32.