OCR exploration server

Warning: this site is under development!
Warning: this site was generated automatically from raw corpora.
The information has therefore not been validated.

Automatic performance evaluation of printed Chinese character recognition systems

Internal identifier: 001A82 (Main/Merge); previous: 001A81; next: 001A83

Authors: CHI FANG [People's Republic of China]; CHANGSONG LIU [People's Republic of China]; LIANGRUI PENG [People's Republic of China]; XIAOQING DING [People's Republic of China]

Source: International journal on document analysis and recognition (Print), 2002, ISSN 1433-2833

RBID : Pascal:02-0230355

Abstract

Performance evaluation is crucial for improving the performance of OCR systems. However, this is trivial and sophisticated work to do by hand. Therefore, we have developed an automatic performance evaluation system for a printed Chinese character recognition (PCCR) system. Our system is characterized by using real-world data as test data and automatically obtaining the performance of the PCCR system by comparing the correct text and the recognition result of the document image. In addition, our performance evaluation system also provides some evaluation of performance for the segmentation module, the classification module, and the post-processing module of the PCCR system. For this purpose, a segmentation error-tolerant character-string matching algorithm is proposed to obtain the correspondence between the correct text and the recognition result. The experiments show that our performance evaluation system is an accurate and powerful tool for studying deficiencies in the PCCR system. Although our approach is aimed at the PCCR system, the idea also can be applied to other OCR systems.
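
The comparison between the correct text and the recognition result can be illustrated with a minimal sketch. The abstract does not detail the segmentation error-tolerant matching algorithm, so the code below uses plain Levenshtein (edit-distance) alignment as a stand-in; it does not model split or merged characters. The function names `align` and `accuracy` are hypothetical and chosen only for this illustration.

```python
def align(truth: str, ocr: str):
    """Align two strings with standard edit-distance dynamic programming.

    Returns a list of (operation, truth_char, ocr_char) tuples, where
    operation is "match", "substitute", "delete" or "insert".
    This is a generic stand-in, not the algorithm proposed in the paper.
    """
    n, m = len(truth), len(ocr)
    # cost[i][j] = edit distance between truth[:i] and ocr[:j]
    cost = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        cost[i][0] = i
    for j in range(1, m + 1):
        cost[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            diag = cost[i - 1][j - 1] + (truth[i - 1] != ocr[j - 1])
            cost[i][j] = min(diag, cost[i - 1][j] + 1, cost[i][j - 1] + 1)

    # Walk back from the bottom-right corner to recover one optimal alignment.
    ops = []
    i, j = n, m
    while i > 0 or j > 0:
        if (i > 0 and j > 0
                and cost[i][j] == cost[i - 1][j - 1] + (truth[i - 1] != ocr[j - 1])):
            op = "match" if truth[i - 1] == ocr[j - 1] else "substitute"
            ops.append((op, truth[i - 1], ocr[j - 1]))
            i -= 1
            j -= 1
        elif i > 0 and cost[i][j] == cost[i - 1][j] + 1:
            # Character present in the correct text but missing from the OCR output.
            ops.append(("delete", truth[i - 1], ""))
            i -= 1
        else:
            # Character present in the OCR output but not in the correct text.
            ops.append(("insert", "", ocr[j - 1]))
            j -= 1
    ops.reverse()
    return ops


def accuracy(truth: str, ocr: str) -> float:
    """Fraction of correct-text characters that the alignment marks as matched."""
    if not truth:
        return 1.0
    matched = sum(1 for op, _, _ in align(truth, ocr) if op == "match")
    return matched / len(truth)
```

With such an alignment, each substitution, deletion or insertion can be inspected individually, which is the kind of correspondence an evaluation tool needs before attributing errors to a particular module.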

The document in XML format

<record>
<TEI>
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en" level="a">Automatic performance evaluation of printed Chinese character recognition systems</title>
<author>
<name sortKey="Chi Fang" sort="Chi Fang" uniqKey="Chi Fang" last="Chi Fang">CHI FANG</name>
<affiliation wicri:level="1">
<inist:fA14 i1="01">
<s1>State Key Laboratory of Intelligent Technology and Systems, Department of Electronic Engineering, Tsinghua University</s1>
<s2>Beijing 100084</s2>
<s3>CHN</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
<sZ>4 aut.</sZ>
</inist:fA14>
<country>République populaire de Chine</country>
<placeName>
<settlement type="city">Pékin</settlement>
</placeName>
</affiliation>
</author>
<author>
<name sortKey="Changsong Liu" sort="Changsong Liu" uniqKey="Changsong Liu" last="Changsong Liu">CHANGSONG LIU</name>
<affiliation wicri:level="1">
<inist:fA14 i1="01">
<s1>State Key Laboratory of Intelligent Technology and Systems, Department of Electronic Engineering, Tsinghua University</s1>
<s2>Beijing 100084</s2>
<s3>CHN</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
<sZ>4 aut.</sZ>
</inist:fA14>
<country>République populaire de Chine</country>
<placeName>
<settlement type="city">Pékin</settlement>
</placeName>
</affiliation>
</author>
<author>
<name sortKey="Liangrui Peng" sort="Liangrui Peng" uniqKey="Liangrui Peng" last="Liangrui Peng">LIANGRUI PENG</name>
<affiliation wicri:level="1">
<inist:fA14 i1="01">
<s1>State Key Laboratory of Intelligent Technology and Systems, Department of Electronic Engineering, Tsinghua University</s1>
<s2>Beijing 100084</s2>
<s3>CHN</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
<sZ>4 aut.</sZ>
</inist:fA14>
<country>République populaire de Chine</country>
<placeName>
<settlement type="city">Pékin</settlement>
</placeName>
</affiliation>
</author>
<author>
<name sortKey="Xiaoqing Ding" sort="Xiaoqing Ding" uniqKey="Xiaoqing Ding" last="Xiaoqing Ding">XIAOQING DING</name>
<affiliation wicri:level="1">
<inist:fA14 i1="01">
<s1>State Key Laboratory of Intelligent Technology and Systems, Department of Electronic Engineering, Tsinghua University</s1>
<s2>Beijing 100084</s2>
<s3>CHN</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
<sZ>4 aut.</sZ>
</inist:fA14>
<country>République populaire de Chine</country>
<placeName>
<settlement type="city">Pékin</settlement>
</placeName>
</affiliation>
</author>
</titleStmt>
<publicationStmt>
<idno type="wicri:source">INIST</idno>
<idno type="inist">02-0230355</idno>
<date when="2002">2002</date>
<idno type="stanalyst">PASCAL 02-0230355 INIST</idno>
<idno type="RBID">Pascal:02-0230355</idno>
<idno type="wicri:Area/PascalFrancis/Corpus">000672</idno>
<idno type="wicri:Area/PascalFrancis/Curation">000120</idno>
<idno type="wicri:Area/PascalFrancis/Checkpoint">000629</idno>
<idno type="wicri:doubleKey">1433-2833:2002:Chi Fang:automatic:performance:evaluation</idno>
<idno type="wicri:Area/Main/Merge">001A82</idno>
</publicationStmt>
<sourceDesc>
<biblStruct>
<analytic>
<title xml:lang="en" level="a">Automatic performance evaluation of printed Chinese character recognition systems</title>
<author>
<name sortKey="Chi Fang" sort="Chi Fang" uniqKey="Chi Fang" last="Chi Fang">CHI FANG</name>
<affiliation wicri:level="1">
<inist:fA14 i1="01">
<s1>State Key Laboratory of Intelligent Technology and Systems, Department of Electronic Engineering, Tsinghua University</s1>
<s2>Beijing 100084</s2>
<s3>CHN</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
<sZ>4 aut.</sZ>
</inist:fA14>
<country>République populaire de Chine</country>
<placeName>
<settlement type="city">Pékin</settlement>
</placeName>
</affiliation>
</author>
<author>
<name sortKey="Changsong Liu" sort="Changsong Liu" uniqKey="Changsong Liu" last="Changsong Liu">CHANGSONG LIU</name>
<affiliation wicri:level="1">
<inist:fA14 i1="01">
<s1>State Key Laboratory of Intelligent Technology and Systems, Department of Electronic Engineering, Tsinghua University</s1>
<s2>Beijing 100084</s2>
<s3>CHN</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
<sZ>4 aut.</sZ>
</inist:fA14>
<country>République populaire de Chine</country>
<placeName>
<settlement type="city">Pékin</settlement>
</placeName>
</affiliation>
</author>
<author>
<name sortKey="Liangrui Peng" sort="Liangrui Peng" uniqKey="Liangrui Peng" last="Liangrui Peng">LIANGRUI PENG</name>
<affiliation wicri:level="1">
<inist:fA14 i1="01">
<s1>State Key Laboratory of Intelligent Technology and Systems, Department of Electronic Engineering, Tsinghua University</s1>
<s2>Beijing 100084</s2>
<s3>CHN</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
<sZ>4 aut.</sZ>
</inist:fA14>
<country>République populaire de Chine</country>
<placeName>
<settlement type="city">Pékin</settlement>
</placeName>
</affiliation>
</author>
<author>
<name sortKey="Xiaoqing Ding" sort="Xiaoqing Ding" uniqKey="Xiaoqing Ding" last="Xiaoqing Ding">XIAOQING DING</name>
<affiliation wicri:level="1">
<inist:fA14 i1="01">
<s1>State Key Laboratory of Intelligent Technology and Systems, Department of Electronic Engineering, Tsinghua University</s1>
<s2>Beijing 100084</s2>
<s3>CHN</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
<sZ>4 aut.</sZ>
</inist:fA14>
<country>République populaire de Chine</country>
<placeName>
<settlement type="city">Pékin</settlement>
</placeName>
</affiliation>
</author>
</analytic>
<series>
<title level="j" type="main">International journal on document analysis and recognition : (Print)</title>
<title level="j" type="abbreviated">Int. j. doc. anal. recognit. : (Print)</title>
<idno type="ISSN">1433-2833</idno>
<imprint>
<date when="2002">2002</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
<seriesStmt>
<title level="j" type="main">International journal on document analysis and recognition : (Print)</title>
<title level="j" type="abbreviated">Int. j. doc. anal. recognit. : (Print)</title>
<idno type="ISSN">1433-2833</idno>
</seriesStmt>
</fileDesc>
<profileDesc>
<textClass>
<keywords scheme="KwdEn" xml:lang="en">
<term>Automatic system</term>
<term>Automatic test</term>
<term>Character recognition</term>
<term>Character string</term>
<term>Chinese</term>
<term>Image recognition</term>
<term>Image segmentation</term>
<term>Optical character recognition</term>
<term>Pattern recognition</term>
<term>Performance evaluation</term>
<term>Printed character</term>
<term>System evaluation</term>
</keywords>
<keywords scheme="Pascal" xml:lang="fr">
<term>Chaîne caractère</term>
<term>Segmentation image</term>
<term>Essai automatique</term>
<term>Evaluation performance</term>
<term>Reconnaissance forme</term>
<term>Reconnaissance caractère</term>
<term>Reconnaissance image</term>
<term>Reconnaissance optique caractère</term>
<term>Evaluation système</term>
<term>Système automatique</term>
<term>Chinois</term>
<term>Caractère imprimé</term>
</keywords>
</textClass>
</profileDesc>
</teiHeader>
<front>
<div type="abstract" xml:lang="en">Performance evaluation is crucial for improving the performance of OCR systems. However, this is trivial and sophisticated work to do by hand. Therefore, we have developed an automatic performance evaluation system for a printed Chinese character recognition (PCCR) system. Our system is characterized by using real-world data as test data and automatically obtaining the performance of the PCCR system by comparing the correct text and the recognition result of the document image. In addition, our performance evaluation system also provides some evaluation of performance for the segmentation module, the classification module, and the post-processing module of the PCCR system. For this purpose, a segmentation error-tolerant character-string matching algorithm is proposed to obtain the correspondence between the correct text and the recognition result. The experiments show that our performance evaluation system is an accurate and powerful tool for studying deficiencies in the PCCR system. Although our approach is aimed at the PCCR system, the idea also can be applied to other OCR systems.</div>
</front>
</TEI>
<affiliations>
<list>
<country>
<li>République populaire de Chine</li>
</country>
<settlement>
<li>Pékin</li>
</settlement>
</list>
<tree>
<country name="République populaire de Chine">
<noRegion>
<name sortKey="Chi Fang" sort="Chi Fang" uniqKey="Chi Fang" last="Chi Fang">CHI FANG</name>
</noRegion>
<name sortKey="Changsong Liu" sort="Changsong Liu" uniqKey="Changsong Liu" last="Changsong Liu">CHANGSONG LIU</name>
<name sortKey="Liangrui Peng" sort="Liangrui Peng" uniqKey="Liangrui Peng" last="Liangrui Peng">LIANGRUI PENG</name>
<name sortKey="Xiaoqing Ding" sort="Xiaoqing Ding" uniqKey="Xiaoqing Ding" last="Xiaoqing Ding">XIAOQING DING</name>
</country>
</tree>
</affiliations>
</record>
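
A record of this shape can be read with Python's standard `xml.etree.ElementTree`, as in the minimal sketch below. The record as displayed uses the `wicri:` and `inist:` prefixes without declaring them, so the sketch wraps the text in a hypothetical `<wrapper>` element that binds both prefixes to placeholder URIs. Those URIs, the `parse_record` helper and the shortened `RECORD` excerpt are assumptions made for illustration and are not part of Dilib.

```python
import xml.etree.ElementTree as ET

# Shortened excerpt of the record shown above (two authors only).
RECORD = """<record>
<TEI>
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en" level="a">Automatic performance evaluation of printed Chinese character recognition systems</title>
<author>
<name sortKey="Chi Fang" sort="Chi Fang" uniqKey="Chi Fang" last="Chi Fang">CHI FANG</name>
<affiliation wicri:level="1">
<country>République populaire de Chine</country>
</affiliation>
</author>
<author>
<name sortKey="Changsong Liu" sort="Changsong Liu" uniqKey="Changsong Liu" last="Changsong Liu">CHANGSONG LIU</name>
</author>
</titleStmt>
<publicationStmt>
<idno type="RBID">Pascal:02-0230355</idno>
</publicationStmt>
</fileDesc>
</teiHeader>
</TEI>
</record>"""

# Placeholder namespace URIs (assumption): the displayed record does not
# declare the wicri: and inist: prefixes, and the parser rejects unbound ones.
WRAPPER = (
    '<wrapper xmlns:wicri="urn:example:wicri" '
    'xmlns:inist="urn:example:inist">{}</wrapper>'
)


def parse_record(text: str) -> dict:
    """Extract the title, author names and RBID from one record."""
    root = ET.fromstring(WRAPPER.format(text))
    record = root.find("record")
    title_stmt = record.find("./TEI/teiHeader/fileDesc/titleStmt")
    return {
        "title": title_stmt.findtext("title"),
        "authors": [a.findtext("name") for a in title_stmt.findall("author")],
        "rbid": record.findtext(".//idno[@type='RBID']"),
    }


info = parse_record(RECORD)
print(info["title"])
print(info["authors"])
print(info["rbid"])
```

Only `titleStmt` is read for the authors because the same author list is repeated under `sourceDesc/biblStruct/analytic`; searching the whole tree would return each name more than once.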

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Ticri/CIDE/explor/OcrV1/Data/Main/Merge
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 001A82 | SxmlIndent | more

Or

HfdSelect -h $EXPLOR_AREA/Data/Main/Merge/biblio.hfd -nk 001A82 | SxmlIndent | more

To link to this page within the Wicri network

{{Explor lien
   |wiki=    Ticri/CIDE
   |area=    OcrV1
   |flux=    Main
   |étape=   Merge
   |type=    RBID
   |clé=     Pascal:02-0230355
   |texte=   Automatic performance evaluation of printed Chinese character recognition systems
}}

This area was generated with Dilib version V0.6.32.
Data generation: Sat Nov 11 16:53:45 2017. Site generation: Mon Mar 11 23:15:16 2024