OCR exploration server

Warning: this site is under development!
Warning: this site was generated automatically from raw corpora.
The information has therefore not been validated.

Robust feature extraction for character recognition based on binary images

Internal identifier: 000338 (PascalFrancis/Corpus); previous: 000337; next: 000339

Authors: LIJUN WANG; LI ZHANG; YUXIANG XING; ZHIMING WANG; HEWEI GAO

Source:

RBID : Pascal:07-0376468

French descriptors

English descriptors

Abstract

Optical Character Recognition (OCR) is a classical research field and has become one of the most successful applications in the area of pattern recognition. Feature extraction is a key step in the process of OCR. This paper presents three algorithms for feature extraction based on binary images: the Lattice with Distance Transform (DTL), Stroke Density (SD) and Co-occurrence Matrix (CM). The DTL algorithm improves the robustness of the lattice feature by using a distance transform to increase the distance between the foreground and the background and thus reduce the influence of stroke boundaries. The SD and CM algorithms extract robust stroke features based on the fact that humans recognize characters according to strokes, including their length and orientation. SD reflects the quantized stroke information, including the length and the orientation. CM reflects the length and orientation of a contour. SD and CM together sufficiently describe strokes. Since these three groups of feature vectors complement each other in expressing characters, we integrate them and adopt a hierarchical algorithm to achieve optimal performance. Our methods are tested on the USPS (United States Postal Service) database and the Vehicle License Plate Number Pictures Database (VLNPD). Experimental results show that the methods achieve a high recognition rate at a reasonable average running time. Also, under similar conditions, we compared our results with the box method proposed by Hannmandlu [18]. Our methods demonstrated better performance in efficiency.
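
The abstract describes DTL only in outline, so the following is a minimal sketch rather than the authors' method. It assumes a city-block (two-pass) distance transform and a lattice feature defined as the mean transformed value in each cell of a regular grid. The names `distance_transform` and `lattice_feature`, and the 6x6 example glyph, are hypothetical.

```python
# Illustrative sketch only: the record does not give the paper's exact DTL definition.

def distance_transform(img):
    """City-block distance from each foreground pixel (1) to the nearest
    background pixel (0). Pixels outside the image are NOT treated as background."""
    h, w = len(img), len(img[0])
    inf = h + w  # larger than any city-block distance inside the image
    dist = [[inf if img[y][x] else 0 for x in range(w)] for y in range(h)]
    # Forward pass: propagate distances from the top and left neighbours.
    for y in range(h):
        for x in range(w):
            if dist[y][x]:
                if y > 0:
                    dist[y][x] = min(dist[y][x], dist[y - 1][x] + 1)
                if x > 0:
                    dist[y][x] = min(dist[y][x], dist[y][x - 1] + 1)
    # Backward pass: propagate distances from the bottom and right neighbours.
    for y in range(h - 1, -1, -1):
        for x in range(w - 1, -1, -1):
            if dist[y][x]:
                if y < h - 1:
                    dist[y][x] = min(dist[y][x], dist[y + 1][x] + 1)
                if x < w - 1:
                    dist[y][x] = min(dist[y][x], dist[y][x + 1] + 1)
    return dist


def lattice_feature(img, rows, cols):
    """Mean distance-transform value in each cell of a rows x cols grid.
    Assumes rows <= image height and cols <= image width (no empty cells)."""
    dist = distance_transform(img)
    h, w = len(img), len(img[0])
    feats = []
    for r in range(rows):
        y0, y1 = r * h // rows, (r + 1) * h // rows
        for c in range(cols):
            x0, x1 = c * w // cols, (c + 1) * w // cols
            cell = [dist[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            feats.append(sum(cell) / len(cell))
    return feats


# Hypothetical 6x6 binary glyph: a filled 4x4 square on a background border.
glyph = [
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 1, 0],
    [0, 1, 1, 1, 1, 0],
    [0, 1, 1, 1, 1, 0],
    [0, 1, 1, 1, 1, 0],
    [0, 0, 0, 0, 0, 0],
]
print(lattice_feature(glyph, 2, 2))  # each of the four cells averages 5/9
```

Because interior pixels receive larger values than pixels next to the stroke boundary, small boundary perturbations change the cell averages less than they would change a plain pixel count, which is the robustness argument the abstract makes.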

Record in standard format (ISO 2709)

See the documentation on the Inist Standard format.

pA  
A01 01  1    @0 0277-786X
A05       @2 6067
A08 01  1  ENG  @1 Robust feature extraction for character recognition based on binary images
A09 01  1  ENG  @1 Document recognition and retrieval XIII : 18-19 January 2006, San Jose, California, USA
A11 01  1    @1 LIJUN WANG
A11 02  1    @1 LI ZHANG
A11 03  1    @1 YUXIANG XING
A11 04  1    @1 ZHIMING WANG
A11 05  1    @1 HEWEI GAO
A12 01  1    @1 TAGHVA (Kazem) @9 ed.
A12 02  1    @1 LIN (Xiaofan) @9 ed.
A14 01      @1 Department of Application Science, Research Institute, Nuctech 2/F Block A, Tongfang Building, Shuangqinglu @2 Haidian District, Beijing, 100084 @3 CHN @Z 1 aut.
A14 02      @1 Department of Engineering Physics, Tsinghua University @2 100084 @3 CHN @Z 2 aut. @Z 3 aut. @Z 5 aut.
A14 03      @1 Department of Computer Science and Technology, University of Science and Technology @2 Beijing, 100084 @3 CHN @Z 4 aut.
A18 01  1    @1 IS&T--The Society for Imaging Science and Technology @3 USA @9 org-cong.
A18 02  1    @1 Society of photo-optical instrumentation engineers @3 USA @9 org-cong.
A20       @2 606708.1-606708.8
A21       @1 2006
A23 01      @0 ENG
A26 01      @0 0-8194-6107-5
A43 01      @1 INIST @2 21760 @5 354000153562240070
A44       @0 0000 @1 © 2007 INIST-CNRS. All rights reserved.
A45       @0 18 ref.
A47 01  1    @0 07-0376468
A60       @1 P @2 C
A61       @0 A
A64 01  1    @0 Proceedings of SPIE, the International Society for Optical Engineering
A66 01      @0 USA
C01 01    ENG  @0 Optical Character Recognition (OCR) is a classical research field and has become one of the most successful applications in the area of pattern recognition. Feature extraction is a key step in the process of OCR. This paper presents three algorithms for feature extraction based on binary images: the Lattice with Distance Transform (DTL), Stroke Density (SD) and Co-occurrence Matrix (CM). The DTL algorithm improves the robustness of the lattice feature by using a distance transform to increase the distance between the foreground and the background and thus reduce the influence of stroke boundaries. The SD and CM algorithms extract robust stroke features based on the fact that humans recognize characters according to strokes, including their length and orientation. SD reflects the quantized stroke information, including the length and the orientation. CM reflects the length and orientation of a contour. SD and CM together sufficiently describe strokes. Since these three groups of feature vectors complement each other in expressing characters, we integrate them and adopt a hierarchical algorithm to achieve optimal performance. Our methods are tested on the USPS (United States Postal Service) database and the Vehicle License Plate Number Pictures Database (VLNPD). Experimental results show that the methods achieve a high recognition rate at a reasonable average running time. Also, under similar conditions, we compared our results with the box method proposed by Hannmandlu [18]. Our methods demonstrated better performance in efficiency.
C02 01  3    @0 001B40B30S
C02 02  X    @0 001D04A05A
C02 03  X    @0 001D04A05D
C03 01  3  FRE  @0 Reconnaissance forme @5 03
C03 01  3  ENG  @0 Pattern recognition @5 03
C03 02  3  FRE  @0 Algorithme @5 23
C03 02  3  ENG  @0 Algorithms @5 23
C03 03  3  FRE  @0 Etude expérimentale @5 30
C03 03  3  ENG  @0 Experimental study @5 30
C03 04  X  FRE  @0 Image binaire @5 61
C03 04  X  ENG  @0 Binary image @5 61
C03 04  X  SPA  @0 Imagen binaria @5 61
C03 05  3  FRE  @0 Extraction caractéristique @5 62
C03 05  3  ENG  @0 Feature extraction @5 62
C03 06  3  FRE  @0 Reconnaissance caractère @5 63
C03 06  3  ENG  @0 Character recognition @5 63
C03 07  3  FRE  @0 Reconnaissance optique caractère @5 64
C03 07  3  ENG  @0 Optical character recognition @5 64
C03 08  X  FRE  @0 Transformation distance @5 65
C03 08  X  ENG  @0 Distance transformation @5 65
C03 08  X  SPA  @0 Transformación distancia @5 65
C03 09  X  FRE  @0 Robustesse @5 66
C03 09  X  ENG  @0 Robustness @5 66
C03 09  X  SPA  @0 Robustez @5 66
C03 10  3  FRE  @0 Système hiérarchisé @5 67
C03 10  3  ENG  @0 Hierarchical systems @5 67
C03 11  3  FRE  @0 Evaluation performance @5 68
C03 11  3  ENG  @0 Performance evaluation @5 68
C03 12  3  FRE  @0 Etats Unis @2 NG @5 69
C03 12  3  ENG  @0 USA @2 NG @5 69
C03 13  3  FRE  @0 4230S @4 INC @5 91
N21       @1 239
N44 01      @1 OTO
N82       @1 OTO
pR  
A30 01  1  ENG  @1 Document recognition and retrieval @2 13 @3 USA @4 2006
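
The listing above follows a regular pattern: a field tag, optional indicators, then subfields introduced by `@` and a one-character code. As a rough illustration, and assuming only what is visible in this listing (not the full Inist specification), a line can be split as follows. The function name `parse_field` is hypothetical.

```python
import re


def parse_field(line):
    """Split one display line into (tag, indicators, [(code, value), ...]).

    Assumes the layout seen in this record: the header comes first and every
    subfield starts with '@' plus a single code character followed by a space.
    """
    # re.split with a capturing group keeps the subfield codes in the result:
    # [header, code1, value1, code2, value2, ...]
    parts = re.split(r"\s*@(\w)\s+", line.strip())
    header = parts[0].split()
    tag, indicators = header[0], header[1:]
    # Subfield codes may repeat (e.g. several @Z), so keep an ordered list of pairs.
    subfields = [(parts[i], parts[i + 1].strip()) for i in range(1, len(parts), 2)]
    return tag, indicators, subfields


tag, indicators, subfields = parse_field(
    "A14 02      @1 Department of Engineering Physics, Tsinghua University "
    "@2 100084 @3 CHN @Z 2 aut. @Z 3 aut. @Z 5 aut."
)
print(tag, indicators)
for code, value in subfields:
    print(code, value)
```

Keeping subfields as a list of pairs rather than a dictionary preserves repeated codes such as the three `@Z` author links on field A14 02.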

Inist format (server)

NO : PASCAL 07-0376468 INIST
ET : Robust feature extraction for character recognition based on binary images
AU : LIJUN WANG; LI ZHANG; YUXIANG XING; ZHIMING WANG; HEWEI GAO; TAGHVA (Kazem); LIN (Xiaofan)
AF : Department of Application Science, Research Institute, Nuctech 2/F Block A, Tongfang Building, Shuangqinglu/Haidian District, Beijing, 100084/China (1 aut.); Department of Engineering Physics, Tsinghua University/100084/China (2 aut., 3 aut., 5 aut.); Department of Computer Science and Technology, University of Science and Technology/Beijing, 100084/China (4 aut.)
DT : Serial publication; Conference; Analytic level
SO : Proceedings of SPIE, the International Society for Optical Engineering; ISSN 0277-786X; United States; Da. 2006; Vol. 6067; 606708.1-606708.8; Bibl. 18 ref.
LA : English
EA : Optical Character Recognition (OCR) is a classical research field and has become one of the most successful applications in the area of pattern recognition. Feature extraction is a key step in the process of OCR. This paper presents three algorithms for feature extraction based on binary images: the Lattice with Distance Transform (DTL), Stroke Density (SD) and Co-occurrence Matrix (CM). The DTL algorithm improves the robustness of the lattice feature by using a distance transform to increase the distance between the foreground and the background and thus reduce the influence of stroke boundaries. The SD and CM algorithms extract robust stroke features based on the fact that humans recognize characters according to strokes, including their length and orientation. SD reflects the quantized stroke information, including the length and the orientation. CM reflects the length and orientation of a contour. SD and CM together sufficiently describe strokes. Since these three groups of feature vectors complement each other in expressing characters, we integrate them and adopt a hierarchical algorithm to achieve optimal performance. Our methods are tested on the USPS (United States Postal Service) database and the Vehicle License Plate Number Pictures Database (VLNPD). Experimental results show that the methods achieve a high recognition rate at a reasonable average running time. Also, under similar conditions, we compared our results with the box method proposed by Hannmandlu [18]. Our methods demonstrated better performance in efficiency.
CC : 001B40B30S; 001D04A05A; 001D04A05D
FD : Reconnaissance forme; Algorithme; Etude expérimentale; Image binaire; Extraction caractéristique; Reconnaissance caractère; Reconnaissance optique caractère; Transformation distance; Robustesse; Système hiérarchisé; Evaluation performance; Etats Unis; 4230S
ED : Pattern recognition; Algorithms; Experimental study; Binary image; Feature extraction; Character recognition; Optical character recognition; Distance transformation; Robustness; Hierarchical systems; Performance evaluation; USA
SD : Imagen binaria; Transformación distancia; Robustez
LO : INIST-21760.354000153562240070
ID : 07-0376468
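
The server format above is a flat list of `KEY : value` lines, with several fields holding semicolon-separated lists. A small sketch of reading it is shown below; the helper names `parse_server_record` and `split_list` are hypothetical, and the choice of which fields to treat as lists is an assumption based on this record alone.

```python
def parse_server_record(text):
    """Map each 'KEY : value' line to a dictionary entry (last occurrence wins)."""
    record = {}
    for line in text.splitlines():
        # Split on the first ' : ' only, so later colons stay inside the value.
        key, sep, value = line.partition(" : ")
        if sep:
            record[key.strip()] = value.strip()
    return record


def split_list(value):
    """Split a semicolon-separated field such as AU, CC, FD, ED or SD.
    Not suitable for free-text fields (EA, AF), which may contain semicolons."""
    return [item.strip() for item in value.split(";") if item.strip()]


sample = """NO : PASCAL 07-0376468 INIST
AU : LIJUN WANG; LI ZHANG; YUXIANG XING; ZHIMING WANG; HEWEI GAO; TAGHVA (Kazem); LIN (Xiaofan)
CC : 001B40B30S; 001D04A05A; 001D04A05D
SD : Imagen binaria; Transformación distancia; Robustez"""

record = parse_server_record(sample)
print(record["NO"])
print(split_list(record["CC"]))
```

Note that the AU field in this format mixes the five authors with the two volume editors, so role information has to come from the standard-format fields A11 and A12 instead.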

Links to Exploration step

Pascal:07-0376468

The document in XML format

<record>
<TEI>
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en" level="a">Robust feature extraction for character recognition based on binary images</title>
<author>
<name sortKey="Lijun Wang" sort="Lijun Wang" uniqKey="Lijun Wang" last="Lijun Wang">LIJUN WANG</name>
<affiliation>
<inist:fA14 i1="01">
<s1>Department of Application Science, Research Institute, Nuctech 2/F Block A, Tongfang Building, Shuangqinglu</s1>
<s2>Haidian District, Beijing, 100084</s2>
<s3>CHN</s3>
<sZ>1 aut.</sZ>
</inist:fA14>
</affiliation>
</author>
<author>
<name sortKey="Li Zhang" sort="Li Zhang" uniqKey="Li Zhang" last="Li Zhang">LI ZHANG</name>
<affiliation>
<inist:fA14 i1="02">
<s1>Department of Engineering Physics, Tsinghua University</s1>
<s2>100084</s2>
<s3>CHN</s3>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
<sZ>5 aut.</sZ>
</inist:fA14>
</affiliation>
</author>
<author>
<name sortKey="Yuxiang Xing" sort="Yuxiang Xing" uniqKey="Yuxiang Xing" last="Yuxiang Xing">YUXIANG XING</name>
<affiliation>
<inist:fA14 i1="02">
<s1>Department of Engineering Physics, Tsinghua University</s1>
<s2>100084</s2>
<s3>CHN</s3>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
<sZ>5 aut.</sZ>
</inist:fA14>
</affiliation>
</author>
<author>
<name sortKey="Zhiming Wang" sort="Zhiming Wang" uniqKey="Zhiming Wang" last="Zhiming Wang">ZHIMING WANG</name>
<affiliation>
<inist:fA14 i1="03">
<s1>Department of Computer Science and Technology, University of Science and Technology</s1>
<s2>Beijing, 100084</s2>
<s3>CHN</s3>
<sZ>4 aut.</sZ>
</inist:fA14>
</affiliation>
</author>
<author>
<name sortKey="Hewei Gao" sort="Hewei Gao" uniqKey="Hewei Gao" last="Hewei Gao">HEWEI GAO</name>
<affiliation>
<inist:fA14 i1="02">
<s1>Department of Engineering Physics, Tsinghua University</s1>
<s2>100084</s2>
<s3>CHN</s3>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
<sZ>5 aut.</sZ>
</inist:fA14>
</affiliation>
</author>
</titleStmt>
<publicationStmt>
<idno type="wicri:source">INIST</idno>
<idno type="inist">07-0376468</idno>
<date when="2006">2006</date>
<idno type="stanalyst">PASCAL 07-0376468 INIST</idno>
<idno type="RBID">Pascal:07-0376468</idno>
<idno type="wicri:Area/PascalFrancis/Corpus">000338</idno>
</publicationStmt>
<sourceDesc>
<biblStruct>
<analytic>
<title xml:lang="en" level="a">Robust feature extraction for character recognition based on binary images</title>
<author>
<name sortKey="Lijun Wang" sort="Lijun Wang" uniqKey="Lijun Wang" last="Lijun Wang">LIJUN WANG</name>
<affiliation>
<inist:fA14 i1="01">
<s1>Department of Application Science, Research Institute, Nuctech 2/F Block A, Tongfang Building, Shuangqinglu</s1>
<s2>Haidian District, Beijing, 100084</s2>
<s3>CHN</s3>
<sZ>1 aut.</sZ>
</inist:fA14>
</affiliation>
</author>
<author>
<name sortKey="Li Zhang" sort="Li Zhang" uniqKey="Li Zhang" last="Li Zhang">LI ZHANG</name>
<affiliation>
<inist:fA14 i1="02">
<s1>Department of Engineering Physics, Tsinghua University</s1>
<s2>100084</s2>
<s3>CHN</s3>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
<sZ>5 aut.</sZ>
</inist:fA14>
</affiliation>
</author>
<author>
<name sortKey="Yuxiang Xing" sort="Yuxiang Xing" uniqKey="Yuxiang Xing" last="Yuxiang Xing">YUXIANG XING</name>
<affiliation>
<inist:fA14 i1="02">
<s1>Department of Engineering Physics, Tsinghua University</s1>
<s2>100084</s2>
<s3>CHN</s3>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
<sZ>5 aut.</sZ>
</inist:fA14>
</affiliation>
</author>
<author>
<name sortKey="Zhiming Wang" sort="Zhiming Wang" uniqKey="Zhiming Wang" last="Zhiming Wang">ZHIMING WANG</name>
<affiliation>
<inist:fA14 i1="03">
<s1>Department of Computer Science and Technology, University of Science and Technology</s1>
<s2>Beijing, 100084</s2>
<s3>CHN</s3>
<sZ>4 aut.</sZ>
</inist:fA14>
</affiliation>
</author>
<author>
<name sortKey="Hewei Gao" sort="Hewei Gao" uniqKey="Hewei Gao" last="Hewei Gao">HEWEI GAO</name>
<affiliation>
<inist:fA14 i1="02">
<s1>Department of Engineering Physics, Tsinghua University</s1>
<s2>100084</s2>
<s3>CHN</s3>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
<sZ>5 aut.</sZ>
</inist:fA14>
</affiliation>
</author>
</analytic>
<series>
<title level="j" type="main">Proceedings of SPIE, the International Society for Optical Engineering</title>
<idno type="ISSN">0277-786X</idno>
<imprint>
<date when="2006">2006</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
<seriesStmt>
<title level="j" type="main">Proceedings of SPIE, the International Society for Optical Engineering</title>
<idno type="ISSN">0277-786X</idno>
</seriesStmt>
</fileDesc>
<profileDesc>
<textClass>
<keywords scheme="KwdEn" xml:lang="en">
<term>Algorithms</term>
<term>Binary image</term>
<term>Character recognition</term>
<term>Distance transformation</term>
<term>Experimental study</term>
<term>Feature extraction</term>
<term>Hierarchical systems</term>
<term>Optical character recognition</term>
<term>Pattern recognition</term>
<term>Performance evaluation</term>
<term>Robustness</term>
<term>USA</term>
</keywords>
<keywords scheme="Pascal" xml:lang="fr">
<term>Reconnaissance forme</term>
<term>Algorithme</term>
<term>Etude expérimentale</term>
<term>Image binaire</term>
<term>Extraction caractéristique</term>
<term>Reconnaissance caractère</term>
<term>Reconnaissance optique caractère</term>
<term>Transformation distance</term>
<term>Robustesse</term>
<term>Système hiérarchisé</term>
<term>Evaluation performance</term>
<term>Etats Unis</term>
<term>4230S</term>
</keywords>
</textClass>
</profileDesc>
</teiHeader>
<front>
<div type="abstract" xml:lang="en">Optical Character Recognition (OCR) is a classical research field and has become one of the most successful applications in the area of pattern recognition. Feature extraction is a key step in the process of OCR. This paper presents three algorithms for feature extraction based on binary images: the Lattice with Distance Transform (DTL), Stroke Density (SD) and Co-occurrence Matrix (CM). The DTL algorithm improves the robustness of the lattice feature by using a distance transform to increase the distance between the foreground and the background and thus reduce the influence of stroke boundaries. The SD and CM algorithms extract robust stroke features based on the fact that humans recognize characters according to strokes, including their length and orientation. SD reflects the quantized stroke information, including the length and the orientation. CM reflects the length and orientation of a contour. SD and CM together sufficiently describe strokes. Since these three groups of feature vectors complement each other in expressing characters, we integrate them and adopt a hierarchical algorithm to achieve optimal performance. Our methods are tested on the USPS (United States Postal Service) database and the Vehicle License Plate Number Pictures Database (VLNPD). Experimental results show that the methods achieve a high recognition rate at a reasonable average running time. Also, under similar conditions, we compared our results with the box method proposed by Hannmandlu [18]. Our methods demonstrated better performance in efficiency.</div>
</front>
</TEI>
<inist>
<standard h6="B">
<pA>
<fA01 i1="01" i2="1">
<s0>0277-786X</s0>
</fA01>
<fA05>
<s2>6067</s2>
</fA05>
<fA08 i1="01" i2="1" l="ENG">
<s1>Robust feature extraction for character recognition based on binary images</s1>
</fA08>
<fA09 i1="01" i2="1" l="ENG">
<s1>Document recognition and retrieval XIII : 18-19 January 2006, San Jose, California, USA</s1>
</fA09>
<fA11 i1="01" i2="1">
<s1>LIJUN WANG</s1>
</fA11>
<fA11 i1="02" i2="1">
<s1>LI ZHANG</s1>
</fA11>
<fA11 i1="03" i2="1">
<s1>YUXIANG XING</s1>
</fA11>
<fA11 i1="04" i2="1">
<s1>ZHIMING WANG</s1>
</fA11>
<fA11 i1="05" i2="1">
<s1>HEWEI GAO</s1>
</fA11>
<fA12 i1="01" i2="1">
<s1>TAGHVA (Kazem)</s1>
<s9>ed.</s9>
</fA12>
<fA12 i1="02" i2="1">
<s1>LIN (Xiaofan)</s1>
<s9>ed.</s9>
</fA12>
<fA14 i1="01">
<s1>Department of Application Science, Research Institute, Nuctech 2/F Block A, Tongfang Building, Shuangqinglu</s1>
<s2>Haidian District, Beijing, 100084</s2>
<s3>CHN</s3>
<sZ>1 aut.</sZ>
</fA14>
<fA14 i1="02">
<s1>Department of Engineering Physics, Tsinghua University</s1>
<s2>100084</s2>
<s3>CHN</s3>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
<sZ>5 aut.</sZ>
</fA14>
<fA14 i1="03">
<s1>Department of Computer Science and Technology, University of Science and Technology</s1>
<s2>Beijing, 100084</s2>
<s3>CHN</s3>
<sZ>4 aut.</sZ>
</fA14>
<fA18 i1="01" i2="1">
<s1>IS&amp;T--The Society for Imaging Science and Technology</s1>
<s3>USA</s3>
<s9>org-cong.</s9>
</fA18>
<fA18 i1="02" i2="1">
<s1>Society of photo-optical instrumentation engineers</s1>
<s3>USA</s3>
<s9>org-cong.</s9>
</fA18>
<fA20>
<s2>606708.1-606708.8</s2>
</fA20>
<fA21>
<s1>2006</s1>
</fA21>
<fA23 i1="01">
<s0>ENG</s0>
</fA23>
<fA26 i1="01">
<s0>0-8194-6107-5</s0>
</fA26>
<fA43 i1="01">
<s1>INIST</s1>
<s2>21760</s2>
<s5>354000153562240070</s5>
</fA43>
<fA44>
<s0>0000</s0>
<s1>© 2007 INIST-CNRS. All rights reserved.</s1>
</fA44>
<fA45>
<s0>18 ref.</s0>
</fA45>
<fA47 i1="01" i2="1">
<s0>07-0376468</s0>
</fA47>
<fA60>
<s1>P</s1>
<s2>C</s2>
</fA60>
<fA61>
<s0>A</s0>
</fA61>
<fA64 i1="01" i2="1">
<s0>Proceedings of SPIE, the International Society for Optical Engineering</s0>
</fA64>
<fA66 i1="01">
<s0>USA</s0>
</fA66>
<fC01 i1="01" l="ENG">
<s0>Optical Character Recognition (OCR) is a classical research field and has become one of the most successful applications in the area of pattern recognition. Feature extraction is a key step in the process of OCR. This paper presents three algorithms for feature extraction based on binary images: the Lattice with Distance Transform (DTL), Stroke Density (SD) and Co-occurrence Matrix (CM). The DTL algorithm improves the robustness of the lattice feature by using a distance transform to increase the distance between the foreground and the background and thus reduce the influence of stroke boundaries. The SD and CM algorithms extract robust stroke features based on the fact that humans recognize characters according to strokes, including their length and orientation. SD reflects the quantized stroke information, including the length and the orientation. CM reflects the length and orientation of a contour. SD and CM together sufficiently describe strokes. Since these three groups of feature vectors complement each other in expressing characters, we integrate them and adopt a hierarchical algorithm to achieve optimal performance. Our methods are tested on the USPS (United States Postal Service) database and the Vehicle License Plate Number Pictures Database (VLNPD). Experimental results show that the methods achieve a high recognition rate at a reasonable average running time. Also, under similar conditions, we compared our results with the box method proposed by Hannmandlu [18]. Our methods demonstrated better performance in efficiency.</s0>
</fC01>
<fC02 i1="01" i2="3">
<s0>001B40B30S</s0>
</fC02>
<fC02 i1="02" i2="X">
<s0>001D04A05A</s0>
</fC02>
<fC02 i1="03" i2="X">
<s0>001D04A05D</s0>
</fC02>
<fC03 i1="01" i2="3" l="FRE">
<s0>Reconnaissance forme</s0>
<s5>03</s5>
</fC03>
<fC03 i1="01" i2="3" l="ENG">
<s0>Pattern recognition</s0>
<s5>03</s5>
</fC03>
<fC03 i1="02" i2="3" l="FRE">
<s0>Algorithme</s0>
<s5>23</s5>
</fC03>
<fC03 i1="02" i2="3" l="ENG">
<s0>Algorithms</s0>
<s5>23</s5>
</fC03>
<fC03 i1="03" i2="3" l="FRE">
<s0>Etude expérimentale</s0>
<s5>30</s5>
</fC03>
<fC03 i1="03" i2="3" l="ENG">
<s0>Experimental study</s0>
<s5>30</s5>
</fC03>
<fC03 i1="04" i2="X" l="FRE">
<s0>Image binaire</s0>
<s5>61</s5>
</fC03>
<fC03 i1="04" i2="X" l="ENG">
<s0>Binary image</s0>
<s5>61</s5>
</fC03>
<fC03 i1="04" i2="X" l="SPA">
<s0>Imagen binaria</s0>
<s5>61</s5>
</fC03>
<fC03 i1="05" i2="3" l="FRE">
<s0>Extraction caractéristique</s0>
<s5>62</s5>
</fC03>
<fC03 i1="05" i2="3" l="ENG">
<s0>Feature extraction</s0>
<s5>62</s5>
</fC03>
<fC03 i1="06" i2="3" l="FRE">
<s0>Reconnaissance caractère</s0>
<s5>63</s5>
</fC03>
<fC03 i1="06" i2="3" l="ENG">
<s0>Character recognition</s0>
<s5>63</s5>
</fC03>
<fC03 i1="07" i2="3" l="FRE">
<s0>Reconnaissance optique caractère</s0>
<s5>64</s5>
</fC03>
<fC03 i1="07" i2="3" l="ENG">
<s0>Optical character recognition</s0>
<s5>64</s5>
</fC03>
<fC03 i1="08" i2="X" l="FRE">
<s0>Transformation distance</s0>
<s5>65</s5>
</fC03>
<fC03 i1="08" i2="X" l="ENG">
<s0>Distance transformation</s0>
<s5>65</s5>
</fC03>
<fC03 i1="08" i2="X" l="SPA">
<s0>Transformación distancia</s0>
<s5>65</s5>
</fC03>
<fC03 i1="09" i2="X" l="FRE">
<s0>Robustesse</s0>
<s5>66</s5>
</fC03>
<fC03 i1="09" i2="X" l="ENG">
<s0>Robustness</s0>
<s5>66</s5>
</fC03>
<fC03 i1="09" i2="X" l="SPA">
<s0>Robustez</s0>
<s5>66</s5>
</fC03>
<fC03 i1="10" i2="3" l="FRE">
<s0>Système hiérarchisé</s0>
<s5>67</s5>
</fC03>
<fC03 i1="10" i2="3" l="ENG">
<s0>Hierarchical systems</s0>
<s5>67</s5>
</fC03>
<fC03 i1="11" i2="3" l="FRE">
<s0>Evaluation performance</s0>
<s5>68</s5>
</fC03>
<fC03 i1="11" i2="3" l="ENG">
<s0>Performance evaluation</s0>
<s5>68</s5>
</fC03>
<fC03 i1="12" i2="3" l="FRE">
<s0>Etats Unis</s0>
<s2>NG</s2>
<s5>69</s5>
</fC03>
<fC03 i1="12" i2="3" l="ENG">
<s0>USA</s0>
<s2>NG</s2>
<s5>69</s5>
</fC03>
<fC03 i1="13" i2="3" l="FRE">
<s0>4230S</s0>
<s4>INC</s4>
<s5>91</s5>
</fC03>
<fN21>
<s1>239</s1>
</fN21>
<fN44 i1="01">
<s1>OTO</s1>
</fN44>
<fN82>
<s1>OTO</s1>
</fN82>
</pA>
<pR>
<fA30 i1="01" i2="1" l="ENG">
<s1>Document recognition and retrieval</s1>
<s2>13</s2>
<s3>USA</s3>
<s4>2006</s4>
</fA30>
</pR>
</standard>
<server>
<NO>PASCAL 07-0376468 INIST</NO>
<ET>Robust feature extraction for character recognition based on binary images</ET>
<AU>LIJUN WANG; LI ZHANG; YUXIANG XING; ZHIMING WANG; HEWEI GAO; TAGHVA (Kazem); LIN (Xiaofan)</AU>
<AF>Department of Application Science, Research Institute, Nuctech 2/F Block A, Tongfang Building, Shuangqinglu/Haidian District, Beijing, 100084/China (1 aut.); Department of Engineering Physics, Tsinghua University/100084/China (2 aut., 3 aut., 5 aut.); Department of Computer Science and Technology, University of Science and Technology/Beijing, 100084/China (4 aut.)</AF>
<DT>Serial publication; Conference; Analytic level</DT>
<SO>Proceedings of SPIE, the International Society for Optical Engineering; ISSN 0277-786X; United States; Da. 2006; Vol. 6067; 606708.1-606708.8; Bibl. 18 ref.</SO>
<LA>English</LA>
<EA>Optical Character Recognition (OCR) is a classical research field and has become one of the most successful applications in the area of pattern recognition. Feature extraction is a key step in the process of OCR. This paper presents three algorithms for feature extraction based on binary images: the Lattice with Distance Transform (DTL), Stroke Density (SD) and Co-occurrence Matrix (CM). The DTL algorithm improves the robustness of the lattice feature by using a distance transform to increase the distance between the foreground and the background and thus reduce the influence of stroke boundaries. The SD and CM algorithms extract robust stroke features based on the fact that humans recognize characters according to strokes, including their length and orientation. SD reflects the quantized stroke information, including the length and the orientation. CM reflects the length and orientation of a contour. SD and CM together sufficiently describe strokes. Since these three groups of feature vectors complement each other in expressing characters, we integrate them and adopt a hierarchical algorithm to achieve optimal performance. Our methods are tested on the USPS (United States Postal Service) database and the Vehicle License Plate Number Pictures Database (VLNPD). Experimental results show that the methods achieve a high recognition rate at a reasonable average running time. Also, under similar conditions, we compared our results with the box method proposed by Hannmandlu [18]. Our methods demonstrated better performance in efficiency.</EA>
<CC>001B40B30S; 001D04A05A; 001D04A05D</CC>
<FD>Reconnaissance forme; Algorithme; Etude expérimentale; Image binaire; Extraction caractéristique; Reconnaissance caractère; Reconnaissance optique caractère; Transformation distance; Robustesse; Système hiérarchisé; Evaluation performance; Etats Unis; 4230S</FD>
<ED>Pattern recognition; Algorithms; Experimental study; Binary image; Feature extraction; Character recognition; Optical character recognition; Distance transformation; Robustness; Hierarchical systems; Performance evaluation; USA</ED>
<SD>Imagen binaria; Transformación distancia; Robustez</SD>
<LO>INIST-21760.354000153562240070</LO>
<ID>07-0376468</ID>
</server>
</inist>
</record>
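
As a hedged illustration of reading the record above, the sketch below extracts the title, authors and English descriptors from the `<inist>` part using Python's standard `xml.etree.ElementTree`. It works on a shortened, hand-copied fragment (`SAMPLE`) rather than the whole `<record>`, because the TEI part uses an `inist:` prefix with no namespace declaration visible here, which a strict parser may reject. The function name `summarize` is hypothetical.

```python
import xml.etree.ElementTree as ET

# Shortened fragment copied from the <inist> block above (illustration only).
SAMPLE = """<inist>
<standard h6="B">
<pA>
<fA08 i1="01" i2="1" l="ENG">
<s1>Robust feature extraction for character recognition based on binary images</s1>
</fA08>
<fA11 i1="01" i2="1">
<s1>LIJUN WANG</s1>
</fA11>
<fA11 i1="02" i2="1">
<s1>LI ZHANG</s1>
</fA11>
<fC03 i1="04" i2="X" l="FRE">
<s0>Image binaire</s0>
<s5>61</s5>
</fC03>
<fC03 i1="04" i2="X" l="ENG">
<s0>Binary image</s0>
<s5>61</s5>
</fC03>
</pA>
</standard>
</inist>"""


def summarize(xml_text):
    """Return title, authors and English descriptors from an <inist> element."""
    root = ET.fromstring(xml_text)
    fields = root.find("standard/pA")
    return {
        # fA08 holds the title, fA11 one author each, fC03 one descriptor each.
        "title": fields.findtext("fA08/s1"),
        "authors": [f.findtext("s1") for f in fields.findall("fA11")],
        # The l attribute carries the descriptor language (FRE, ENG, SPA).
        "descriptors": [f.findtext("s0") for f in fields.findall("fC03[@l='ENG']")],
    }


info = summarize(SAMPLE)
print(info["title"])
print(info["authors"])
print(info["descriptors"])
```

Filtering `fC03` on the `l` attribute is what separates the English descriptors from their French and Spanish counterparts, which share the same `i1` index.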

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Ticri/CIDE/explor/OcrV1/Data/PascalFrancis/Corpus
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000338 | SxmlIndent | more

Or

HfdSelect -h $EXPLOR_AREA/Data/PascalFrancis/Corpus/biblio.hfd -nk 000338 | SxmlIndent | more

To link to this page within the Wicri network

{{Explor lien
   |wiki=    Ticri/CIDE
   |area=    OcrV1
   |flux=    PascalFrancis
   |étape=   Corpus
   |type=    RBID
   |clé=     Pascal:07-0376468
   |texte=   Robust feature extraction for character recognition based on binary images
}}

Wicri

This area was generated with Dilib version V0.6.32.
Data generation: Sat Nov 11 16:53:45 2017. Site generation: Mon Mar 11 23:15:16 2024