Robust feature extraction for character recognition based on binary images

Internal identifier: 000338 (PascalFrancis/Corpus); previous: 000337; next: 000339

Authors: LIJUN WANG; LI ZHANG; YUXIANG XING; ZHIMING WANG; HEWEI GAO

Source:

Proceedings of SPIE, the International Society for Optical Engineering [0277-786X]; 2006.

RBID : Pascal:07-0376468

French descriptors

Pascal (Inist)
- Reconnaissance forme, Algorithme, Etude expérimentale, Image binaire, Extraction caractéristique, Reconnaissance caractère, Reconnaissance optique caractère, Transformation distance, Robustesse, Système hiérarchisé, Evaluation performance, Etats Unis, 4230S.

English descriptors

KwdEn :
- Algorithms, Binary image, Character recognition, Distance transformation, Experimental study, Feature extraction, Hierarchical systems, Optical character recognition, Pattern recognition, Performance evaluation, Robustness, USA.

Abstract

Optical Character Recognition (OCR) is a classical research field and has become one of the most successful applications in the area of pattern recognition. Feature extraction is a key step in the OCR process. This paper presents three algorithms for feature extraction based on binary images: the Lattice with Distance Transform (DTL), Stroke Density (SD) and Co-occurrence Matrix (CM). The DTL algorithm improves the robustness of the lattice feature by using the distance transform to increase the distance between the foreground and background and thus reduce the influence of stroke boundaries. The SD and CM algorithms extract robust stroke features based on the fact that humans recognize characters according to their strokes, including their length and orientation. SD reflects the quantized stroke information, including length and orientation. CM reflects the length and orientation of a contour. SD and CM together sufficiently describe strokes. Since these three groups of feature vectors complement each other in expressing characters, we integrate them and adopt a hierarchical algorithm to achieve optimal performance. Our methods are tested on the USPS (United States Postal Service) database and the Vehicle License Plate Number Pictures Database (VLNPD). Experimental results show that the methods achieve a high recognition rate at a reasonable average running time. Also, under similar conditions, we compared our results to the box method proposed by Hannmandlu [18]. Our methods demonstrated better efficiency.
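
The abstract gives only a one-sentence description of DTL, so the following is a minimal illustrative sketch and not the authors' implementation. It assumes a city-block (4-neighbour) distance transform and a lattice feature defined as the mean transformed value per grid cell. The names distance_transform and lattice_features and the sample glyph are hypothetical. The point it illustrates is that pixels deep inside a stroke receive larger values than pixels on the stroke boundary, so boundary noise contributes less to each cell.

```python
from typing import List

Image = List[List[int]]  # 1 = foreground (stroke), 0 = background


def distance_transform(img: Image) -> Image:
    """City-block distance from each foreground pixel to the nearest
    background pixel, computed with a two-pass scan. Background stays 0."""
    h, w = len(img), len(img[0])
    inf = h + w  # exceeds any city-block distance inside the image
    dist = [[inf if img[y][x] else 0 for x in range(w)] for y in range(h)]
    # Forward pass: propagate distances from the top and left neighbours.
    for y in range(h):
        for x in range(w):
            if dist[y][x]:
                if y > 0:
                    dist[y][x] = min(dist[y][x], dist[y - 1][x] + 1)
                if x > 0:
                    dist[y][x] = min(dist[y][x], dist[y][x - 1] + 1)
    # Backward pass: propagate distances from the bottom and right neighbours.
    for y in range(h - 1, -1, -1):
        for x in range(w - 1, -1, -1):
            if dist[y][x]:
                if y < h - 1:
                    dist[y][x] = min(dist[y][x], dist[y + 1][x] + 1)
                if x < w - 1:
                    dist[y][x] = min(dist[y][x], dist[y][x + 1] + 1)
    return dist


def lattice_features(img: Image, rows: int, cols: int) -> List[float]:
    """Mean distance-transform value in each lattice cell, in row-major order.
    Assumes rows <= image height and cols <= image width."""
    dist = distance_transform(img)
    h, w = len(img), len(img[0])
    feats = []
    for r in range(rows):
        y0, y1 = r * h // rows, (r + 1) * h // rows
        for c in range(cols):
            x0, x1 = c * w // cols, (c + 1) * w // cols
            cell = [dist[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            feats.append(sum(cell) / len(cell) if cell else 0.0)
    return feats


# Hypothetical 5x5 glyph: a filled 3x3 block surrounded by background.
GLYPH = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
print(distance_transform(GLYPH))
print(lattice_features(GLYPH, 2, 2))
```

A plain lattice feature would count foreground pixels per cell. Averaging the transformed values is one possible way to weight stroke interiors more heavily, chosen here only for illustration.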

Record in standard format (ISO 2709)

See the documentation on the Inist Standard format.

A01	`01`	`1`		`@0 0277-786X`
A05				`@2 6067`
A08	`01`	`1`	`ENG`	`@1 Robust feature extraction for character recognition based on binary images`
A09	`01`	`1`	`ENG`	`@1 Document recognition and retrieval XIII : 18-19 January 2006, San Jose, California, USA`
A11	`01`	`1`		`@1 LIJUN WANG`
A11	`02`	`1`		`@1 LI ZHANG`
A11	`03`	`1`		`@1 YUXIANG XING`
A11	`04`	`1`		`@1 ZHIMING WANG`
A11	`05`	`1`		`@1 HEWEI GAO`
A12	`01`	`1`		`@1 TAGHVA (Kazem) @9 ed.`
A12	`02`	`1`		`@1 LIN (Xiaofan) @9 ed.`
A14	`01`			`@1 Department of Application Science, Research Institute, Nuctech 2/F Block A, Tongfang Building, Shuangqinglu @2 Haidian District, Beijing, 100084 @3 CHN @Z 1 aut.`
A14	`02`			`@1 Department of Engineering Physics, Tsinghua University @2 100084 @3 CHN @Z 2 aut. @Z 3 aut. @Z 5 aut.`
A14	`03`			`@1 Department of Computer Science and Technology, University of Science and Technology @2 Beijing, 100084 @3 CHN @Z 4 aut.`
A18	`01`	`1`		`@1 IS&T--The Society for Imaging Science and Technology @3 USA @9 org-cong.`
A18	`02`	`1`		`@1 Society of photo-optical instrumentation engineers @3 USA @9 org-cong.`
A20				`@2 606708.1-606708.8`
A21				`@1 2006`
A23	`01`			`@0 ENG`
A26	`01`			`@0 0-8194-6107-5`
A43	`01`			`@1 INIST @2 21760 @5 354000153562240070`
A44				`@0 0000 @1 © 2007 INIST-CNRS. All rights reserved.`
A45				`@0 18 ref.`
A47	`01`	`1`		`@0 07-0376468`
A60				`@1 P @2 C`
A61				`@0 A`
A64	`01`	`1`		`@0 Proceedings of SPIE, the International Society for Optical Engineering`
A66	`01`			`@0 USA`
C01	`01`		`ENG`	@0 Optical Character Recognition (OCR) is a classical research field and has become one of the most successful applications in the area of pattern recognition. Feature extraction is a key step in the OCR process. This paper presents three algorithms for feature extraction based on binary images: the Lattice with Distance Transform (DTL), Stroke Density (SD) and Co-occurrence Matrix (CM). The DTL algorithm improves the robustness of the lattice feature by using the distance transform to increase the distance between the foreground and background and thus reduce the influence of stroke boundaries. The SD and CM algorithms extract robust stroke features based on the fact that humans recognize characters according to their strokes, including their length and orientation. SD reflects the quantized stroke information, including length and orientation. CM reflects the length and orientation of a contour. SD and CM together sufficiently describe strokes. Since these three groups of feature vectors complement each other in expressing characters, we integrate them and adopt a hierarchical algorithm to achieve optimal performance. Our methods are tested on the USPS (United States Postal Service) database and the Vehicle License Plate Number Pictures Database (VLNPD). Experimental results show that the methods achieve a high recognition rate at a reasonable average running time. Also, under similar conditions, we compared our results to the box method proposed by Hannmandlu [18]. Our methods demonstrated better efficiency.
C02	`01`	`3`		`@0 001B40B30S`
C02	`02`	`X`		`@0 001D04A05A`
C02	`03`	`X`		`@0 001D04A05D`
C03	`01`	`3`	`FRE`	`@0 Reconnaissance forme @5 03`
C03	`01`	`3`	`ENG`	`@0 Pattern recognition @5 03`
C03	`02`	`3`	`FRE`	`@0 Algorithme @5 23`
C03	`02`	`3`	`ENG`	`@0 Algorithms @5 23`
C03	`03`	`3`	`FRE`	`@0 Etude expérimentale @5 30`
C03	`03`	`3`	`ENG`	`@0 Experimental study @5 30`
C03	`04`	`X`	`FRE`	`@0 Image binaire @5 61`
C03	`04`	`X`	`ENG`	`@0 Binary image @5 61`
C03	`04`	`X`	`SPA`	`@0 Imagen binaria @5 61`
C03	`05`	`3`	`FRE`	`@0 Extraction caractéristique @5 62`
C03	`05`	`3`	`ENG`	`@0 Feature extraction @5 62`
C03	`06`	`3`	`FRE`	`@0 Reconnaissance caractère @5 63`
C03	`06`	`3`	`ENG`	`@0 Character recognition @5 63`
C03	`07`	`3`	`FRE`	`@0 Reconnaissance optique caractère @5 64`
C03	`07`	`3`	`ENG`	`@0 Optical character recognition @5 64`
C03	`08`	`X`	`FRE`	`@0 Transformation distance @5 65`
C03	`08`	`X`	`ENG`	`@0 Distance transformation @5 65`
C03	`08`	`X`	`SPA`	`@0 Transformación distancia @5 65`
C03	`09`	`X`	`FRE`	`@0 Robustesse @5 66`
C03	`09`	`X`	`ENG`	`@0 Robustness @5 66`
C03	`09`	`X`	`SPA`	`@0 Robustez @5 66`
C03	`10`	`3`	`FRE`	`@0 Système hiérarchisé @5 67`
C03	`10`	`3`	`ENG`	`@0 Hierarchical systems @5 67`
C03	`11`	`3`	`FRE`	`@0 Evaluation performance @5 68`
C03	`11`	`3`	`ENG`	`@0 Performance evaluation @5 68`
C03	`12`	`3`	`FRE`	`@0 Etats Unis @2 NG @5 69`
C03	`12`	`3`	`ENG`	`@0 USA @2 NG @5 69`
C03	`13`	`3`	`FRE`	`@0 4230S @4 INC @5 91`
N21				`@1 239`
N44	`01`			`@1 OTO`
N82				`@1 OTO`

A30	`01`	`1`	`ENG`	`@1 Document recognition and retrieval @2 13 @3 USA @4 2006`
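
The field listing above can be read mechanically. The sketch below is one hedged way to do it, assuming that columns are tab-separated, that values are wrapped in backticks, and that subfields inside the last column are introduced by `@` plus a one-character code. The names parse_line and parse_record are hypothetical and do not come from Dilib.

```python
import re
from typing import List, Tuple

# A few lines copied from the listing (tabs assumed as column separators).
SAMPLE = (
    "A11\t`01`\t`1`\t\t`@1 LIJUN WANG`\n"
    "A11\t`02`\t`1`\t\t`@1 LI ZHANG`\n"
    "A12\t`01`\t`1`\t\t`@1 TAGHVA (Kazem) @9 ed.`\n"
    "A20\t\t\t\t`@2 606708.1-606708.8`\n"
)

# "@x value" pairs; a value runs until the next " @x " marker or end of string.
SUBFIELD = re.compile(r"@(\w) (.*?)(?= @\w |$)")

Field = Tuple[str, List[Tuple[str, str]]]


def parse_line(line: str) -> Field:
    """Split one listing line into its tag and a list of (code, value) subfields."""
    cols = [c.strip("`") for c in line.split("\t")]
    tag, payload = cols[0], cols[-1]
    return tag, SUBFIELD.findall(payload)


def parse_record(text: str) -> List[Field]:
    """Parse every non-empty line of a listing."""
    return [parse_line(l) for l in text.splitlines() if l.strip()]


records = parse_record(SAMPLE)
# Author names are carried by subfield 1 of the A11 fields.
authors = [v for tag, subs in records if tag == "A11" for k, v in subs if k == "1"]
print(authors)
```

Keeping subfields as an ordered list of pairs, not a dictionary, preserves repeated codes such as the several `@Z` entries on the A14 affiliation lines.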

Inist format (server)

NO :	PASCAL 07-0376468 INIST
ET :	Robust feature extraction for character recognition based on binary images
AU :	LIJUN WANG; LI ZHANG; YUXIANG XING; ZHIMING WANG; HEWEI GAO; TAGHVA (Kazem); LIN (Xiaofan)
AF :	Department of Application Science, Research Institute, Nuctech 2/F Block A, Tongfang Building, Shuangqinglu/Haidian District, Beijing, 100084/Chine (1 aut.); Department of Engineering Physics, Tsinghua University/100084/Chine (2 aut., 3 aut., 5 aut.); Department of Computer Science and Technology, University of Science and Technology/Beijing, 100084/Chine (4 aut.)
DT :	Publication en série; Congrès; Niveau analytique
SO :	Proceedings of SPIE, the International Society for Optical Engineering; ISSN 0277-786X; Etats-Unis; Da. 2006; Vol. 6067; 606708.1-606708.8; Bibl. 18 ref.
LA :	Anglais
EA :	Optical Character Recognition (OCR) is a classical research field and has become one of the most successful applications in the area of pattern recognition. Feature extraction is a key step in the OCR process. This paper presents three algorithms for feature extraction based on binary images: the Lattice with Distance Transform (DTL), Stroke Density (SD) and Co-occurrence Matrix (CM). The DTL algorithm improves the robustness of the lattice feature by using the distance transform to increase the distance between the foreground and background and thus reduce the influence of stroke boundaries. The SD and CM algorithms extract robust stroke features based on the fact that humans recognize characters according to their strokes, including their length and orientation. SD reflects the quantized stroke information, including length and orientation. CM reflects the length and orientation of a contour. SD and CM together sufficiently describe strokes. Since these three groups of feature vectors complement each other in expressing characters, we integrate them and adopt a hierarchical algorithm to achieve optimal performance. Our methods are tested on the USPS (United States Postal Service) database and the Vehicle License Plate Number Pictures Database (VLNPD). Experimental results show that the methods achieve a high recognition rate at a reasonable average running time. Also, under similar conditions, we compared our results to the box method proposed by Hannmandlu [18]. Our methods demonstrated better efficiency.
CC :	001B40B30S; 001D04A05A; 001D04A05D
FD :	Reconnaissance forme; Algorithme; Etude expérimentale; Image binaire; Extraction caractéristique; Reconnaissance caractère; Reconnaissance optique caractère; Transformation distance; Robustesse; Système hiérarchisé; Evaluation performance; Etats Unis; 4230S
ED :	Pattern recognition; Algorithms; Experimental study; Binary image; Feature extraction; Character recognition; Optical character recognition; Distance transformation; Robustness; Hierarchical systems; Performance evaluation; USA
SD :	Imagen binaria; Transformación distancia; Robustez
LO :	INIST-21760.354000153562240070
ID :	07-0376468

Links to Exploration step

Pascal:07-0376468

The document in XML format

<record><TEI><teiHeader><fileDesc><titleStmt><title xml:lang="en" level="a">Robust feature extraction for character recognition based on binary images</title>
<author><name sortKey="Lijun Wang" sort="Lijun Wang" uniqKey="Lijun Wang" last="Lijun Wang">LIJUN WANG</name>
<affiliation><inist:fA14 i1="01"><s1>Department of Application Science, Research Institute, Nuctech 2/F Block A, Tongfang Building, Shuangqinglu</s1>
<s2>Haidian District, Beijing, 100084</s2>
<s3>CHN</s3>
<sZ>1 aut.</sZ>
</inist:fA14>
</affiliation>
</author>
<author><name sortKey="Li Zhang" sort="Li Zhang" uniqKey="Li Zhang" last="Li Zhang">LI ZHANG</name>
<affiliation><inist:fA14 i1="02"><s1>Department of Engineering Physics, Tsinghua University</s1>
<s2>100084</s2>
<s3>CHN</s3>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
<sZ>5 aut.</sZ>
</inist:fA14>
</affiliation>
</author>
<author><name sortKey="Yuxiang Xing" sort="Yuxiang Xing" uniqKey="Yuxiang Xing" last="Yuxiang Xing">YUXIANG XING</name>
<affiliation><inist:fA14 i1="02"><s1>Department of Engineering Physics, Tsinghua University</s1>
<s2>100084</s2>
<s3>CHN</s3>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
<sZ>5 aut.</sZ>
</inist:fA14>
</affiliation>
</author>
<author><name sortKey="Zhiming Wang" sort="Zhiming Wang" uniqKey="Zhiming Wang" last="Zhiming Wang">ZHIMING WANG</name>
<affiliation><inist:fA14 i1="03"><s1>Department of Computer Science and Technology, University of Science and Technology</s1>
<s2>Beijing, 100084</s2>
<s3>CHN</s3>
<sZ>4 aut.</sZ>
</inist:fA14>
</affiliation>
</author>
<author><name sortKey="Hewei Gao" sort="Hewei Gao" uniqKey="Hewei Gao" last="Hewei Gao">HEWEI GAO</name>
<affiliation><inist:fA14 i1="02"><s1>Department of Engineering Physics, Tsinghua University</s1>
<s2>100084</s2>
<s3>CHN</s3>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
<sZ>5 aut.</sZ>
</inist:fA14>
</affiliation>
</author>
</titleStmt>
<publicationStmt><idno type="wicri:source">INIST</idno>
<idno type="inist">07-0376468</idno>
<date when="2006">2006</date>
<idno type="stanalyst">PASCAL 07-0376468 INIST</idno>
<idno type="RBID">Pascal:07-0376468</idno>
<idno type="wicri:Area/PascalFrancis/Corpus">000338</idno>
</publicationStmt>
<sourceDesc><biblStruct><analytic><title xml:lang="en" level="a">Robust feature extraction for character recognition based on binary images</title>
<author><name sortKey="Lijun Wang" sort="Lijun Wang" uniqKey="Lijun Wang" last="Lijun Wang">LIJUN WANG</name>
<affiliation><inist:fA14 i1="01"><s1>Department of Application Science, Research Institute, Nuctech 2/F Block A, Tongfang Building, Shuangqinglu</s1>
<s2>Haidian District, Beijing, 100084</s2>
<s3>CHN</s3>
<sZ>1 aut.</sZ>
</inist:fA14>
</affiliation>
</author>
<author><name sortKey="Li Zhang" sort="Li Zhang" uniqKey="Li Zhang" last="Li Zhang">LI ZHANG</name>
<affiliation><inist:fA14 i1="02"><s1>Department of Engineering Physics, Tsinghua University</s1>
<s2>100084</s2>
<s3>CHN</s3>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
<sZ>5 aut.</sZ>
</inist:fA14>
</affiliation>
</author>
<author><name sortKey="Yuxiang Xing" sort="Yuxiang Xing" uniqKey="Yuxiang Xing" last="Yuxiang Xing">YUXIANG XING</name>
<affiliation><inist:fA14 i1="02"><s1>Department of Engineering Physics, Tsinghua University</s1>
<s2>100084</s2>
<s3>CHN</s3>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
<sZ>5 aut.</sZ>
</inist:fA14>
</affiliation>
</author>
<author><name sortKey="Zhiming Wang" sort="Zhiming Wang" uniqKey="Zhiming Wang" last="Zhiming Wang">ZHIMING WANG</name>
<affiliation><inist:fA14 i1="03"><s1>Department of Computer Science and Technology, University of Science and Technology</s1>
<s2>Beijing, 100084</s2>
<s3>CHN</s3>
<sZ>4 aut.</sZ>
</inist:fA14>
</affiliation>
</author>
<author><name sortKey="Hewei Gao" sort="Hewei Gao" uniqKey="Hewei Gao" last="Hewei Gao">HEWEI GAO</name>
<affiliation><inist:fA14 i1="02"><s1>Department of Engineering Physics, Tsinghua University</s1>
<s2>100084</s2>
<s3>CHN</s3>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
<sZ>5 aut.</sZ>
</inist:fA14>
</affiliation>
</author>
</analytic>
<series><title level="j" type="main">Proceedings of SPIE, the International Society for Optical Engineering</title>
<idno type="ISSN">0277-786X</idno>
<imprint><date when="2006">2006</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
<seriesStmt><title level="j" type="main">Proceedings of SPIE, the International Society for Optical Engineering</title>
<idno type="ISSN">0277-786X</idno>
</seriesStmt>
</fileDesc>
<profileDesc><textClass><keywords scheme="KwdEn" xml:lang="en"><term>Algorithms</term>
<term>Binary image</term>
<term>Character recognition</term>
<term>Distance transformation</term>
<term>Experimental study</term>
<term>Feature extraction</term>
<term>Hierarchical systems</term>
<term>Optical character recognition</term>
<term>Pattern recognition</term>
<term>Performance evaluation</term>
<term>Robustness</term>
<term>USA</term>
</keywords>
<keywords scheme="Pascal" xml:lang="fr"><term>Reconnaissance forme</term>
<term>Algorithme</term>
<term>Etude expérimentale</term>
<term>Image binaire</term>
<term>Extraction caractéristique</term>
<term>Reconnaissance caractère</term>
<term>Reconnaissance optique caractère</term>
<term>Transformation distance</term>
<term>Robustesse</term>
<term>Système hiérarchisé</term>
<term>Evaluation performance</term>
<term>Etats Unis</term>
<term>4230S</term>
</keywords>
</textClass>
</profileDesc>
</teiHeader>
<front><div type="abstract" xml:lang="en">Optical Character Recognition (OCR) is a classical research field and has become one of the most successful applications in the area of pattern recognition. Feature extraction is a key step in the OCR process. This paper presents three algorithms for feature extraction based on binary images: the Lattice with Distance Transform (DTL), Stroke Density (SD) and Co-occurrence Matrix (CM). The DTL algorithm improves the robustness of the lattice feature by using the distance transform to increase the distance between the foreground and background and thus reduce the influence of stroke boundaries. The SD and CM algorithms extract robust stroke features based on the fact that humans recognize characters according to their strokes, including their length and orientation. SD reflects the quantized stroke information, including length and orientation. CM reflects the length and orientation of a contour. SD and CM together sufficiently describe strokes. Since these three groups of feature vectors complement each other in expressing characters, we integrate them and adopt a hierarchical algorithm to achieve optimal performance. Our methods are tested on the USPS (United States Postal Service) database and the Vehicle License Plate Number Pictures Database (VLNPD). Experimental results show that the methods achieve a high recognition rate at a reasonable average running time. Also, under similar conditions, we compared our results to the box method proposed by Hannmandlu [18]. Our methods demonstrated better efficiency.</div>
</front>
</TEI>
<inist><standard h6="B"><pA><fA01 i1="01" i2="1"><s0>0277-786X</s0>
</fA01>
<fA05><s2>6067</s2>
</fA05>
<fA08 i1="01" i2="1" l="ENG"><s1>Robust feature extraction for character recognition based on binary images</s1>
</fA08>
<fA09 i1="01" i2="1" l="ENG"><s1>Document recognition and retrieval XIII : 18-19 January 2006, San Jose, California, USA</s1>
</fA09>
<fA11 i1="01" i2="1"><s1>LIJUN WANG</s1>
</fA11>
<fA11 i1="02" i2="1"><s1>LI ZHANG</s1>
</fA11>
<fA11 i1="03" i2="1"><s1>YUXIANG XING</s1>
</fA11>
<fA11 i1="04" i2="1"><s1>ZHIMING WANG</s1>
</fA11>
<fA11 i1="05" i2="1"><s1>HEWEI GAO</s1>
</fA11>
<fA12 i1="01" i2="1"><s1>TAGHVA (Kazem)</s1>
<s9>ed.</s9>
</fA12>
<fA12 i1="02" i2="1"><s1>LIN (Xiaofan)</s1>
<s9>ed.</s9>
</fA12>
<fA14 i1="01"><s1>Department of Application Science, Research Institute, Nuctech 2/F Block A, Tongfang Building, Shuangqinglu</s1>
<s2>Haidian District, Beijing, 100084</s2>
<s3>CHN</s3>
<sZ>1 aut.</sZ>
</fA14>
<fA14 i1="02"><s1>Department of Engineering Physics, Tsinghua University</s1>
<s2>100084</s2>
<s3>CHN</s3>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
<sZ>5 aut.</sZ>
</fA14>
<fA14 i1="03"><s1>Department of Computer Science and Technology, University of Science and Technology</s1>
<s2>Beijing, 100084</s2>
<s3>CHN</s3>
<sZ>4 aut.</sZ>
</fA14>
<fA18 i1="01" i2="1"><s1>IS&amp;T--The Society for Imaging Science and Technology</s1>
<s3>USA</s3>
<s9>org-cong.</s9>
</fA18>
<fA18 i1="02" i2="1"><s1>Society of photo-optical instrumentation engineers</s1>
<s3>USA</s3>
<s9>org-cong.</s9>
</fA18>
<fA20><s2>606708.1-606708.8</s2>
</fA20>
<fA21><s1>2006</s1>
</fA21>
<fA23 i1="01"><s0>ENG</s0>
</fA23>
<fA26 i1="01"><s0>0-8194-6107-5</s0>
</fA26>
<fA43 i1="01"><s1>INIST</s1>
<s2>21760</s2>
<s5>354000153562240070</s5>
</fA43>
<fA44><s0>0000</s0>
<s1>© 2007 INIST-CNRS. All rights reserved.</s1>
</fA44>
<fA45><s0>18 ref.</s0>
</fA45>
<fA47 i1="01" i2="1"><s0>07-0376468</s0>
</fA47>
<fA60><s1>P</s1>
<s2>C</s2>
</fA60>
<fA61><s0>A</s0>
</fA61>
<fA64 i1="01" i2="1"><s0>Proceedings of SPIE, the International Society for Optical Engineering</s0>
</fA64>
<fA66 i1="01"><s0>USA</s0>
</fA66>
<fC01 i1="01" l="ENG"><s0>Optical Character Recognition (OCR) is a classical research field and has become one of the most successful applications in the area of pattern recognition. Feature extraction is a key step in the OCR process. This paper presents three algorithms for feature extraction based on binary images: the Lattice with Distance Transform (DTL), Stroke Density (SD) and Co-occurrence Matrix (CM). The DTL algorithm improves the robustness of the lattice feature by using the distance transform to increase the distance between the foreground and background and thus reduce the influence of stroke boundaries. The SD and CM algorithms extract robust stroke features based on the fact that humans recognize characters according to their strokes, including their length and orientation. SD reflects the quantized stroke information, including length and orientation. CM reflects the length and orientation of a contour. SD and CM together sufficiently describe strokes. Since these three groups of feature vectors complement each other in expressing characters, we integrate them and adopt a hierarchical algorithm to achieve optimal performance. Our methods are tested on the USPS (United States Postal Service) database and the Vehicle License Plate Number Pictures Database (VLNPD). Experimental results show that the methods achieve a high recognition rate at a reasonable average running time. Also, under similar conditions, we compared our results to the box method proposed by Hannmandlu [18]. Our methods demonstrated better efficiency.</s0>
</fC01>
<fC02 i1="01" i2="3"><s0>001B40B30S</s0>
</fC02>
<fC02 i1="02" i2="X"><s0>001D04A05A</s0>
</fC02>
<fC02 i1="03" i2="X"><s0>001D04A05D</s0>
</fC02>
<fC03 i1="01" i2="3" l="FRE"><s0>Reconnaissance forme</s0>
<s5>03</s5>
</fC03>
<fC03 i1="01" i2="3" l="ENG"><s0>Pattern recognition</s0>
<s5>03</s5>
</fC03>
<fC03 i1="02" i2="3" l="FRE"><s0>Algorithme</s0>
<s5>23</s5>
</fC03>
<fC03 i1="02" i2="3" l="ENG"><s0>Algorithms</s0>
<s5>23</s5>
</fC03>
<fC03 i1="03" i2="3" l="FRE"><s0>Etude expérimentale</s0>
<s5>30</s5>
</fC03>
<fC03 i1="03" i2="3" l="ENG"><s0>Experimental study</s0>
<s5>30</s5>
</fC03>
<fC03 i1="04" i2="X" l="FRE"><s0>Image binaire</s0>
<s5>61</s5>
</fC03>
<fC03 i1="04" i2="X" l="ENG"><s0>Binary image</s0>
<s5>61</s5>
</fC03>
<fC03 i1="04" i2="X" l="SPA"><s0>Imagen binaria</s0>
<s5>61</s5>
</fC03>
<fC03 i1="05" i2="3" l="FRE"><s0>Extraction caractéristique</s0>
<s5>62</s5>
</fC03>
<fC03 i1="05" i2="3" l="ENG"><s0>Feature extraction</s0>
<s5>62</s5>
</fC03>
<fC03 i1="06" i2="3" l="FRE"><s0>Reconnaissance caractère</s0>
<s5>63</s5>
</fC03>
<fC03 i1="06" i2="3" l="ENG"><s0>Character recognition</s0>
<s5>63</s5>
</fC03>
<fC03 i1="07" i2="3" l="FRE"><s0>Reconnaissance optique caractère</s0>
<s5>64</s5>
</fC03>
<fC03 i1="07" i2="3" l="ENG"><s0>Optical character recognition</s0>
<s5>64</s5>
</fC03>
<fC03 i1="08" i2="X" l="FRE"><s0>Transformation distance</s0>
<s5>65</s5>
</fC03>
<fC03 i1="08" i2="X" l="ENG"><s0>Distance transformation</s0>
<s5>65</s5>
</fC03>
<fC03 i1="08" i2="X" l="SPA"><s0>Transformación distancia</s0>
<s5>65</s5>
</fC03>
<fC03 i1="09" i2="X" l="FRE"><s0>Robustesse</s0>
<s5>66</s5>
</fC03>
<fC03 i1="09" i2="X" l="ENG"><s0>Robustness</s0>
<s5>66</s5>
</fC03>
<fC03 i1="09" i2="X" l="SPA"><s0>Robustez</s0>
<s5>66</s5>
</fC03>
<fC03 i1="10" i2="3" l="FRE"><s0>Système hiérarchisé</s0>
<s5>67</s5>
</fC03>
<fC03 i1="10" i2="3" l="ENG"><s0>Hierarchical systems</s0>
<s5>67</s5>
</fC03>
<fC03 i1="11" i2="3" l="FRE"><s0>Evaluation performance</s0>
<s5>68</s5>
</fC03>
<fC03 i1="11" i2="3" l="ENG"><s0>Performance evaluation</s0>
<s5>68</s5>
</fC03>
<fC03 i1="12" i2="3" l="FRE"><s0>Etats Unis</s0>
<s2>NG</s2>
<s5>69</s5>
</fC03>
<fC03 i1="12" i2="3" l="ENG"><s0>USA</s0>
<s2>NG</s2>
<s5>69</s5>
</fC03>
<fC03 i1="13" i2="3" l="FRE"><s0>4230S</s0>
<s4>INC</s4>
<s5>91</s5>
</fC03>
<fN21><s1>239</s1>
</fN21>
<fN44 i1="01"><s1>OTO</s1>
</fN44>
<fN82><s1>OTO</s1>
</fN82>
</pA>
<pR><fA30 i1="01" i2="1" l="ENG"><s1>Document recognition and retrieval</s1>
<s2>13</s2>
<s3>USA</s3>
<s4>2006</s4>
</fA30>
</pR>
</standard>
<server><NO>PASCAL 07-0376468 INIST</NO>
<ET>Robust feature extraction for character recognition based on binary images</ET>
<AU>LIJUN WANG; LI ZHANG; YUXIANG XING; ZHIMING WANG; HEWEI GAO; TAGHVA (Kazem); LIN (Xiaofan)</AU>
<AF>Department of Application Science, Research Institute, Nuctech 2/F Block A, Tongfang Building, Shuangqinglu/Haidian District, Beijing, 100084/Chine (1 aut.); Department of Engineering Physics, Tsinghua University/100084/Chine (2 aut., 3 aut., 5 aut.); Department of Computer Science and Technology, University of Science and Technology/Beijing, 100084/Chine (4 aut.)</AF>
<DT>Publication en série; Congrès; Niveau analytique</DT>
<SO>Proceedings of SPIE, the International Society for Optical Engineering; ISSN 0277-786X; Etats-Unis; Da. 2006; Vol. 6067; 606708.1-606708.8; Bibl. 18 ref.</SO>
<LA>Anglais</LA>
<EA>Optical Character Recognition (OCR) is a classical research field and has become one of the most successful applications in the area of pattern recognition. Feature extraction is a key step in the OCR process. This paper presents three algorithms for feature extraction based on binary images: the Lattice with Distance Transform (DTL), Stroke Density (SD) and Co-occurrence Matrix (CM). The DTL algorithm improves the robustness of the lattice feature by using the distance transform to increase the distance between the foreground and background and thus reduce the influence of stroke boundaries. The SD and CM algorithms extract robust stroke features based on the fact that humans recognize characters according to their strokes, including their length and orientation. SD reflects the quantized stroke information, including length and orientation. CM reflects the length and orientation of a contour. SD and CM together sufficiently describe strokes. Since these three groups of feature vectors complement each other in expressing characters, we integrate them and adopt a hierarchical algorithm to achieve optimal performance. Our methods are tested on the USPS (United States Postal Service) database and the Vehicle License Plate Number Pictures Database (VLNPD). Experimental results show that the methods achieve a high recognition rate at a reasonable average running time. Also, under similar conditions, we compared our results to the box method proposed by Hannmandlu [18]. Our methods demonstrated better efficiency.</EA>
<CC>001B40B30S; 001D04A05A; 001D04A05D</CC>
<FD>Reconnaissance forme; Algorithme; Etude expérimentale; Image binaire; Extraction caractéristique; Reconnaissance caractère; Reconnaissance optique caractère; Transformation distance; Robustesse; Système hiérarchisé; Evaluation performance; Etats Unis; 4230S</FD>
<ED>Pattern recognition; Algorithms; Experimental study; Binary image; Feature extraction; Character recognition; Optical character recognition; Distance transformation; Robustness; Hierarchical systems; Performance evaluation; USA</ED>
<SD>Imagen binaria; Transformación distancia; Robustez</SD>
<LO>INIST-21760.354000153562240070</LO>
<ID>07-0376468</ID>
</server>
</inist>
</record>
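
Fields can also be pulled from the XML form with a standard parser. The sketch below uses Python's xml.etree.ElementTree on a small fragment copied from the standard block above. It deliberately avoids the full record, because the inist: prefix appears there without a visible namespace declaration and a namespace-aware parser may reject it. The variable names are illustrative only.

```python
import xml.etree.ElementTree as ET

# Fragment of the <pA> block, reduced to two authors and one descriptor pair.
FRAGMENT = """<pA>
<fA11 i1="01" i2="1"><s1>LIJUN WANG</s1>
</fA11>
<fA11 i1="02" i2="1"><s1>LI ZHANG</s1>
</fA11>
<fC03 i1="04" i2="X" l="FRE"><s0>Image binaire</s0>
<s5>61</s5>
</fC03>
<fC03 i1="04" i2="X" l="ENG"><s0>Binary image</s0>
<s5>61</s5>
</fC03>
</pA>"""

root = ET.fromstring(FRAGMENT)

# Author names live in the <s1> child of each <fA11> field.
authors = [el.findtext("s1") for el in root.iter("fA11")]

# Descriptors are <fC03> fields; the l attribute gives the language.
english_terms = [el.findtext("s0") for el in root.iter("fC03") if el.get("l") == "ENG"]

print(authors)
print(english_terms)
```

Filtering on the l attribute is what separates the English, French and Spanish variants that share the same i1 index.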

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Ticri/CIDE/explor/OcrV1/Data/PascalFrancis/Corpus

HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000338 | SxmlIndent | more

HfdSelect -h $EXPLOR_AREA/Data/PascalFrancis/Corpus/biblio.hfd -nk 000338 | SxmlIndent | more

To link to this page within the Wicri network

{{Explor lien
   |wiki=    Ticri/CIDE
   |area=    OcrV1
   |flux=    PascalFrancis
   |étape=   Corpus
   |type=    RBID
   |clé=     Pascal:07-0376468
   |texte=   Robust feature extraction for character recognition based on binary images
}}

This area was generated with Dilib version V0.6.32.
Data generation: Sat Nov 11 16:53:45 2017. Site generation: Mon Mar 11 23:15:16 2024

	Exploration server on OCR
	Warning: this site is under development! It was generated automatically from raw corpora, so the information has not been validated.
