OCR exploration server

Warning: this site is under development!
Warning: this site was generated automatically from raw corpora.
The information has therefore not been validated.

Improvement of numeral recognition using personal handwriting characteristics based on clustering

Internal identifier: 000610 (PascalFrancis/Checkpoint); previous: 000609; next: 000611

Authors: Y. Hotta [Japan]; S. Naoi; M. Suwa

RBID : Pascal:02-0303397

Abstract

Correctly recognizing characters with peculiarities for each writer is a difficult problem. The process of absorbing variations in individual writing by creating an individual dictionary is also difficult when a writer is not specified and the total number of writers is large. In this paper the authors propose a method to improve the results of isolated character recognition in forms in which the same writer writes many characters by taking the characteristics of the writer's writing on a form as a character distribution in a character feature space. In concrete terms, the authors first perform isolated character recognition on all characters on the same form. Then, based on the results of isolated character recognition, clustering of input character groups is performed for each character category. Clusters which are very likely to include misrecognized characters from isolated character recognition are extracted based on the results of clustering. Then character categories in the extracted cluster are automatically amended based on the distance from all clusters in other categories. In the same fashion, automatic amending is performed for rejected characters. Based on experiments to evaluate handwritten numerals on OCR forms, the authors show that the precision of numeral recognition is improved by using this approach as a form of postprocessing for isolated character recognition. © 2002 Wiley Periodicals, Inc. Syst. Comp. Jpn.
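The post-processing idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not name the clustering algorithm, the feature space, or the rule for spotting suspect clusters. The sketch assumes a greedy threshold ("leader") clustering, treats clusters with at most `max_suspect_size` members as suspect, and relabels a suspect cluster only when the nearest cluster of another category lies within `radius` and is closer than any other cluster of its own category. All function and parameter names are hypothetical, and the handling of rejected characters is omitted.

```python
from collections import defaultdict
from math import dist


def centroid(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return tuple(sum(v[k] for v in vectors) / n for k in range(len(vectors[0])))


def leader_cluster(points, radius):
    """Greedy clustering (an assumption, not the paper's method): each point
    joins the first cluster whose current centroid lies within `radius`,
    otherwise it starts a new cluster. Returns lists of indices into `points`."""
    clusters = []
    for i, p in enumerate(points):
        for members in clusters:
            if dist(p, centroid([points[j] for j in members])) <= radius:
                members.append(i)
                break
        else:
            clusters.append([i])
    return clusters


def amend_labels(features, labels, radius=1.0, max_suspect_size=1):
    """Relabel characters of one form using per-category clusters."""
    # Group the characters by the category the isolated recognizer assigned.
    by_label = defaultdict(list)
    for i, lab in enumerate(labels):
        by_label[lab].append(i)

    # Cluster each category separately; keep (label, member indices, centroid).
    clusters = []
    for lab, idxs in by_label.items():
        for local in leader_cluster([features[i] for i in idxs], radius):
            members = [idxs[j] for j in local]
            clusters.append((lab, members, centroid([features[i] for i in members])))

    amended = list(labels)
    for lab, members, cen in clusters:
        if len(members) > max_suspect_size:
            continue  # large clusters are assumed to be correctly recognized
        own = [dist(cen, c) for lab2, m, c in clusters if lab2 == lab and m is not members]
        other = [(dist(cen, c), lab2) for lab2, m, c in clusters if lab2 != lab]
        if not other:
            continue
        d_other, new_lab = min(other)
        # Amend only if another category is both near and nearer than the own one.
        if d_other <= radius and (not own or d_other < min(own)):
            for i in members:
                amended[i] = new_lab
    return amended


# Toy 2-D features: the last character looks like the writer's "1" but was read as "7".
features = [(0, 0), (0.1, 0), (0, 0.1), (5, 5), (5.1, 5), (5, 5.1), (0.05, 0.05)]
labels = ["1", "1", "1", "7", "7", "7", "7"]
print(amend_labels(features, labels))  # prints ['1', '1', '1', '7', '7', '7', '1']
```

The sketch uses only the characters found on a single form, which mirrors the abstract's point that no per-writer dictionary is prepared in advance.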


The document in XML format

<record>
<TEI>
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en" level="a">Improvement of numeral recognition using personal handwriting characteristics based on clustering</title>
<author>
<name sortKey="Hotta, Y" sort="Hotta, Y" uniqKey="Hotta Y" first="Y." last="Hotta">Y. Hotta</name>
<affiliation wicri:level="1">
<inist:fA14 i1="01">
<s1>Fujitsu Laboratories Ltd.</s1>
<s2>Kawasaki 211-8588</s2>
<s3>JPN</s3>
<sZ>1 aut.</sZ>
</inist:fA14>
<country>Japon</country>
<wicri:noRegion>Fujitsu Laboratories Ltd.</wicri:noRegion>
</affiliation>
</author>
<author>
<name sortKey="Naoi, S" sort="Naoi, S" uniqKey="Naoi S" first="S." last="Naoi">S. Naoi</name>
</author>
<author>
<name sortKey="Suwa, M" sort="Suwa, M" uniqKey="Suwa M" first="M." last="Suwa">M. Suwa</name>
</author>
</titleStmt>
<publicationStmt>
<idno type="wicri:source">INIST</idno>
<idno type="inist">02-0303397</idno>
<date when="2002">2002</date>
<idno type="stanalyst">PASCAL 02-0303397 EI</idno>
<idno type="RBID">Pascal:02-0303397</idno>
<idno type="wicri:Area/PascalFrancis/Corpus">000664</idno>
<idno type="wicri:Area/PascalFrancis/Curation">000128</idno>
<idno type="wicri:Area/PascalFrancis/Checkpoint">000610</idno>
</publicationStmt>
<sourceDesc>
<biblStruct>
<analytic>
<title xml:lang="en" level="a">Improvement of numeral recognition using personal handwriting characteristics based on clustering</title>
<author>
<name sortKey="Hotta, Y" sort="Hotta, Y" uniqKey="Hotta Y" first="Y." last="Hotta">Y. Hotta</name>
<affiliation wicri:level="1">
<inist:fA14 i1="01">
<s1>Fujitsu Laboratories Ltd.</s1>
<s2>Kawasaki 211-8588</s2>
<s3>JPN</s3>
<sZ>1 aut.</sZ>
</inist:fA14>
<country>Japon</country>
<wicri:noRegion>Fujitsu Laboratories Ltd.</wicri:noRegion>
</affiliation>
</author>
<author>
<name sortKey="Naoi, S" sort="Naoi, S" uniqKey="Naoi S" first="S." last="Naoi">S. Naoi</name>
</author>
<author>
<name sortKey="Suwa, M" sort="Suwa, M" uniqKey="Suwa M" first="M." last="Suwa">M. Suwa</name>
</author>
</analytic>
<series>
<title level="j" type="main">Systems and Computers in Japan</title>
<title level="j" type="abbreviated">Syst Comput Jpn</title>
<idno type="ISSN">0882-1666</idno>
<imprint>
<date when="2002">2002</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
<seriesStmt>
<title level="j" type="main">Systems and Computers in Japan</title>
<title level="j" type="abbreviated">Syst Comput Jpn</title>
<idno type="ISSN">0882-1666</idno>
</seriesStmt>
</fileDesc>
<profileDesc>
<textClass>
<keywords scheme="KwdEn" xml:lang="en">
<term>Correlation methods</term>
<term>Experiments</term>
<term>Glossaries</term>
<term>Number theory</term>
<term>Numeral recognition</term>
<term>Optical character recognition</term>
<term>Theory</term>
</keywords>
<keywords scheme="Pascal" xml:lang="fr">
<term>Théorie</term>
<term>Glossaire</term>
<term>Théorie nombre</term>
<term>Méthode corrélation</term>
<term>Reconnaissance optique caractère</term>
<term>Expérience</term>
</keywords>
</textClass>
</profileDesc>
</teiHeader>
<front>
<div type="abstract" xml:lang="en">Correctly recognizing characters with peculiarities for each writer is a difficult problem. The process of absorbing variations in individual writing by creating an individual dictionary is also difficult when a writer is not specified and the total number of writers is large. In this paper the authors propose a method to improve the results of isolated character recognition in forms in which the same writer writes many characters by taking the characteristics of the writer's writing on a form as a character distribution in a character feature space. In concrete terms, the authors first perform isolated character recognition on all characters on the same form. Then, based on the results of isolated character recognition, clustering of input character groups is performed for each character category. Clusters which are very likely to include misrecognized characters from isolated character recognition are extracted based on the results of clustering. Then character categories in the extracted cluster are automatically amended based on the distance from all clusters in other categories. In the same fashion, automatic amending is performed for rejected characters. Based on experiments to evaluate handwritten numerals on OCR forms, the authors show that the precision of numeral recognition is improved by using this approach as a form of postprocessing for isolated character recognition. © 2002 Wiley Periodicals, Inc. Syst. Comp. Jpn.</div>
</front>
</TEI>
<inist>
<standard h6="B">
<pA>
<fA01 i1="01" i2="1">
<s0>0882-1666</s0>
</fA01>
<fA02 i1="01">
<s0>SCJAEP</s0>
</fA02>
<fA03 i2="1">
<s0>Syst Comput Jpn</s0>
</fA03>
<fA05>
<s2>33</s2>
</fA05>
<fA06>
<s2>7</s2>
</fA06>
<fA08 i1="01" i2="1" l="ENG">
<s1>Improvement of numeral recognition using personal handwriting characteristics based on clustering</s1>
</fA08>
<fA11 i1="01" i2="1">
<s1>HOTTA (Y.)</s1>
</fA11>
<fA11 i1="02" i2="1">
<s1>NAOI (S.)</s1>
</fA11>
<fA11 i1="03" i2="1">
<s1>SUWA (M.)</s1>
</fA11>
<fA14 i1="01">
<s1>Fujitsu Laboratories Ltd.</s1>
<s2>Kawasaki 211-8588</s2>
<s3>JPN</s3>
<sZ>1 aut.</sZ>
</fA14>
<fA20>
<s1>104-113</s1>
</fA20>
<fA21>
<s1>2002</s1>
</fA21>
<fA23 i1="01">
<s0>ENG</s0>
</fA23>
<fA43 i1="01">
<s1>INIST</s1>
<s2>15508</s2>
</fA43>
<fA44>
<s0>A100</s0>
</fA44>
<fA45>
<s0>7 Refs.</s0>
</fA45>
<fA47 i1="01" i2="1">
<s0>02-0303397</s0>
</fA47>
<fA60>
<s1>P</s1>
</fA60>
<fA61>
<s0>A</s0>
</fA61>
<fA64 i1="01" i2="1">
<s0>Systems and Computers in Japan</s0>
</fA64>
<fA66 i1="01">
<s0>USA</s0>
</fA66>
<fC01 i1="01" l="ENG">
<s0>Correctly recognizing characters with peculiarities for each writer is a difficult problem. The process of absorbing variations in individual writing by creating an individual dictionary is also difficult when a writer is not specified and the total number of writers is large. In this paper the authors propose a method to improve the results of isolated character recognition in forms in which the same writer writes many characters by taking the characteristics of the writer's writing on a form as a character distribution in a character feature space. In concrete terms, the authors first perform isolated character recognition on all characters on the same form. Then, based on the results of isolated character recognition, clustering of input character groups is performed for each character category. Clusters which are very likely to include misrecognized characters from isolated character recognition are extracted based on the results of clustering. Then character categories in the extracted cluster are automatically amended based on the distance from all clusters in other categories. In the same fashion, automatic amending is performed for rejected characters. Based on experiments to evaluate handwritten numerals on OCR forms, the authors show that the precision of numeral recognition is improved by using this approach as a form of postprocessing for isolated character recognition. © 2002 Wiley Periodicals, Inc. Syst. Comp. Jpn.</s0>
</fC01>
<fC02 i1="01" i2="3">
<s0>001B40B</s0>
</fC02>
<fC02 i1="02" i2="X">
<s0>001D00C</s0>
</fC02>
<fC02 i1="03" i2="X">
<s0>001A02</s0>
</fC02>
<fC02 i1="04" i2="X">
<s0>001A02H02</s0>
</fC02>
<fC03 i1="01" i2="1" l="ENG">
<s0>Numeral recognition</s0>
<s4>INC</s4>
</fC03>
<fC03 i1="02" i2="1" l="FRE">
<s0>Théorie</s0>
</fC03>
<fC03 i1="02" i2="1" l="ENG">
<s0>Theory</s0>
</fC03>
<fC03 i1="03" i2="1" l="FRE">
<s0>Glossaire</s0>
</fC03>
<fC03 i1="03" i2="1" l="ENG">
<s0>Glossaries</s0>
</fC03>
<fC03 i1="04" i2="1" l="FRE">
<s0>Théorie nombre</s0>
</fC03>
<fC03 i1="04" i2="1" l="ENG">
<s0>Number theory</s0>
</fC03>
<fC03 i1="05" i2="1" l="FRE">
<s0>Méthode corrélation</s0>
</fC03>
<fC03 i1="05" i2="1" l="ENG">
<s0>Correlation methods</s0>
</fC03>
<fC03 i1="06" i2="1" l="FRE">
<s0>Reconnaissance optique caractère</s0>
<s3>P</s3>
</fC03>
<fC03 i1="06" i2="1" l="ENG">
<s0>Optical character recognition</s0>
<s3>P</s3>
</fC03>
<fC03 i1="07" i2="1" l="FRE">
<s0>Expérience</s0>
</fC03>
<fC03 i1="07" i2="1" l="ENG">
<s0>Experiments</s0>
</fC03>
<fN21>
<s1>175</s1>
</fN21>
</pA>
</standard>
</inist>
<affiliations>
<list>
<country>
<li>Japon</li>
</country>
</list>
<tree>
<noCountry>
<name sortKey="Naoi, S" sort="Naoi, S" uniqKey="Naoi S" first="S." last="Naoi">S. Naoi</name>
<name sortKey="Suwa, M" sort="Suwa, M" uniqKey="Suwa M" first="M." last="Suwa">M. Suwa</name>
</noCountry>
<country name="Japon">
<noRegion>
<name sortKey="Hotta, Y" sort="Hotta, Y" uniqKey="Hotta Y" first="Y." last="Hotta">Y. Hotta</name>
</noRegion>
</country>
</tree>
</affiliations>
</record>
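
The record above uses the `wicri:` and `inist:` prefixes without declaring them, so a namespace-aware XML parser will typically reject it as it stands. The following Python sketch shows one possible workaround: wrapping the text in an element that declares both prefixes before parsing. The namespace URIs, the `wrapper` element and the `parse_record` helper are placeholders invented for this example, not values defined by Wicri or Dilib, and the fragment is a trimmed copy of the record.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the record above; only the title statement is kept.
FRAGMENT = """<record>
<TEI>
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en" level="a">Improvement of numeral recognition using personal handwriting characteristics based on clustering</title>
<author>
<name sortKey="Hotta, Y" sort="Hotta, Y" uniqKey="Hotta Y" first="Y." last="Hotta">Y. Hotta</name>
<affiliation wicri:level="1">
<country>Japon</country>
</affiliation>
</author>
<author>
<name sortKey="Naoi, S" sort="Naoi, S" uniqKey="Naoi S" first="S." last="Naoi">S. Naoi</name>
</author>
<author>
<name sortKey="Suwa, M" sort="Suwa, M" uniqKey="Suwa M" first="M." last="Suwa">M. Suwa</name>
</author>
</titleStmt>
</fileDesc>
</teiHeader>
</TEI>
</record>"""


def parse_record(xml_text):
    """Parse a record that uses undeclared wicri:/inist: prefixes.

    The URIs below are placeholders (assumptions); they only make the
    prefixes resolvable so that the parser accepts the text.
    """
    wrapped = (
        '<wrapper xmlns:wicri="urn:example:wicri" '
        'xmlns:inist="urn:example:inist">' + xml_text + "</wrapper>"
    )
    return ET.fromstring(wrapped).find("record")


record = parse_record(FRAGMENT)
title = record.findtext(".//titleStmt/title")
authors = [name.text for name in record.findall(".//titleStmt/author/name")]
print(title)
print(authors)
```

This approach assumes the input has no XML declaration of its own; a declaration inside the wrapper would make the text ill-formed.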

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Ticri/CIDE/explor/OcrV1/Data/PascalFrancis/Checkpoint
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000610 | SxmlIndent | more

Or

HfdSelect -h $EXPLOR_AREA/Data/PascalFrancis/Checkpoint/biblio.hfd -nk 000610 | SxmlIndent | more

To link to this page within the Wicri network

{{Explor lien
   |wiki=    Ticri/CIDE
   |area=    OcrV1
   |flux=    PascalFrancis
   |étape=   Checkpoint
   |type=    RBID
   |clé=     Pascal:02-0303397
   |texte=   Improvement of numeral recognition using personal handwriting characteristics based on clustering
}}

This area was generated with Dilib version V0.6.32.
Data generation: Sat Nov 11 16:53:45 2017. Site generation: Mon Mar 11 23:15:16 2024