
Visual similarity analysis of Chinese characters and its uses in Japanese OCR

Internal identifier: 002B77 (Main/Exploration); previous: 002B76; next: 002B78

Authors: TAO HONG [United States]; S. W. Lam [United States]; J. J. Hull; S. N. Srihari [United States]

Source:

SPIE proceedings series [ISSN 1017-2653]; 1995.

RBID : Pascal:97-0121468

French descriptors

Pascal (Inist):
- Reconnaissance optique caractère, Chinois, Japonais, Document, Reconnaissance caractère, Reconnaissance forme, Caractère imprimé, Similarité.
Wicri:
- topic: Document.

English descriptors

KwdEn:
- Character recognition, Chinese, Document, Japanese, Optical character recognition, Pattern recognition, Printed character, Similarity.

Abstract

Traditionally, a Chinese or Japanese Optical Character Reader (OCR) has to represent each character category individually as one or more feature prototypes, or a structural description which is a composition of manually derived components such as radicals. Here we propose a new approach in which various kinds of visual similarities between different Chinese characters are analyzed automatically at the feature level. Using this method, character categories will be related to each other by training on fonts; and character images from a text page can be related to each other based on visual similarities they share. This method provides a way to interpret character images from a text page systematically, instead of a sequence of isolated character recognitions. The use of the method for postprocessing in Japanese text recognition will also be discussed.
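The abstract does not say which features or similarity measure the authors use, so the following is only a minimal sketch of the general idea of relating characters through partial visual similarity at the feature level. It assumes zoning features (ink density on a coarse grid) and cosine similarity restricted to a region of the glyph; the toy glyphs and the names `zone_features`, `region_similarity`, `GLYPH_A` and `GLYPH_B` are hypothetical and do not come from the paper.

```python
from math import sqrt

# Toy 8x8 binary glyphs built from a shared left component and two
# different right components (hypothetical data, not real characters).
LEFT = ["#..#", "#..#", "####", "#..#", "#..#", "####", "#..#", "#..#"]
RIGHT_A = ["####", "#..#", "#..#", "####", "....", "....", "....", "...."]
RIGHT_B = ["....", "....", "....", "....", ".##.", ".##.", ".##.", ".##."]
GLYPH_A = [l + r for l, r in zip(LEFT, RIGHT_A)]
GLYPH_B = [l + r for l, r in zip(LEFT, RIGHT_B)]


def zone_features(bitmap, zones=4):
    """Return a zones x zones grid of ink densities for a binary bitmap."""
    rows, cols = len(bitmap), len(bitmap[0])
    feats = []
    for zr in range(zones):
        r0, r1 = zr * rows // zones, (zr + 1) * rows // zones
        row_feats = []
        for zc in range(zones):
            c0, c1 = zc * cols // zones, (zc + 1) * cols // zones
            ink = sum(bitmap[r][c] == "#"
                      for r in range(r0, r1) for c in range(c0, c1))
            row_feats.append(ink / ((r1 - r0) * (c1 - c0)))
        feats.append(row_feats)
    return feats


def cosine(u, v):
    """Cosine similarity; two all-zero vectors are treated as identical."""
    nu = sqrt(sum(a * a for a in u))
    nv = sqrt(sum(b * b for b in v))
    if nu == 0 or nv == 0:
        return 1.0 if nu == nv else 0.0
    return sum(a * b for a, b in zip(u, v)) / (nu * nv)


def region_similarity(fa, fb, region):
    """Compare two feature grids over the left half, right half, or whole."""
    zones = len(fa)
    half = zones // 2
    col_range = {"left": range(0, half),
                 "right": range(half, zones),
                 "whole": range(zones)}[region]
    u = [fa[r][c] for r in range(zones) for c in col_range]
    v = [fb[r][c] for r in range(zones) for c in col_range]
    return cosine(u, v)


fa, fb = zone_features(GLYPH_A), zone_features(GLYPH_B)
for region in ("left", "right", "whole"):
    print(region, round(region_similarity(fa, fb, region), 3))
```

In this toy setup the two glyphs agree on their left halves and differ on their right halves, which is the kind of partial relation that could link character categories (when computed on font images) or character images on the same page. A real system would need far richer features than this sketch uses.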

Affiliations:

Links toward previous steps (curation, corpus...)

to stream PascalFrancis, to step Corpus: 000960
to stream PascalFrancis, to step Curation: 000A39
to stream PascalFrancis, to step Checkpoint: 000981
to stream Main, to step Merge: 002D34
to stream Main, to step Curation: 002B77

The document in XML format

<record><TEI><teiHeader><fileDesc><titleStmt><title xml:lang="en" level="a">Visual similarity analysis of Chinese characters and its uses in Japanese OCR</title>
<author><name sortKey="Tao Hong" sort="Tao Hong" uniqKey="Tao Hong" last="Tao Hong">TAO HONG</name>
<affiliation wicri:level="2"><inist:fA14 i1="01"><s1>Center of Excellence for Document Analysis and Recognition (CEDAR), State University of New York at Buffalo, The UB Commons, 520 Lee Entrance, Suite 202</s1>
<s2>Amherst, NY 14228-2567</s2>
<s3>USA</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>4 aut.</sZ>
</inist:fA14>
<country>États-Unis</country>
<placeName><region type="state">État de New York</region>
</placeName>
</affiliation>
</author>
<author><name sortKey="Lam, S W" sort="Lam, S W" uniqKey="Lam S" first="S. W." last="Lam">S. W. Lam</name>
<affiliation wicri:level="2"><inist:fA14 i1="01"><s1>Center of Excellence for Document Analysis and Recognition (CEDAR), State University of New York at Buffalo, The UB Commons, 520 Lee Entrance, Suite 202</s1>
<s2>Amherst, NY 14228-2567</s2>
<s3>USA</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>4 aut.</sZ>
</inist:fA14>
<country>États-Unis</country>
<placeName><region type="state">État de New York</region>
</placeName>
</affiliation>
</author>
<author><name sortKey="Hull, J J" sort="Hull, J J" uniqKey="Hull J" first="J. J." last="Hull">J. J. Hull</name>
</author>
<author><name sortKey="Srihari, S N" sort="Srihari, S N" uniqKey="Srihari S" first="S. N." last="Srihari">S. N. Srihari</name>
<affiliation wicri:level="2"><inist:fA14 i1="01"><s1>Center of Excellence for Document Analysis and Recognition (CEDAR), State University of New York at Buffalo, The UB Commons, 520 Lee Entrance, Suite 202</s1>
<s2>Amherst, NY 14228-2567</s2>
<s3>USA</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>4 aut.</sZ>
</inist:fA14>
<country>États-Unis</country>
<placeName><region type="state">État de New York</region>
</placeName>
</affiliation>
</author>
</titleStmt>
<publicationStmt><idno type="wicri:source">INIST</idno>
<idno type="inist">97-0121468</idno>
<date when="1995">1995</date>
<idno type="stanalyst">PASCAL 97-0121468 INIST</idno>
<idno type="RBID">Pascal:97-0121468</idno>
<idno type="wicri:Area/PascalFrancis/Corpus">000960</idno>
<idno type="wicri:Area/PascalFrancis/Curation">000A39</idno>
<idno type="wicri:Area/PascalFrancis/Checkpoint">000981</idno>
<idno type="wicri:doubleKey">1017-2653:1995:Tao Hong:visual:similarity:analysis</idno>
<idno type="wicri:Area/Main/Merge">002D34</idno>
<idno type="wicri:Area/Main/Curation">002B77</idno>
<idno type="wicri:Area/Main/Exploration">002B77</idno>
</publicationStmt>
<sourceDesc><biblStruct><analytic><title xml:lang="en" level="a">Visual similarity analysis of Chinese characters and its uses in Japanese OCR</title>
<author><name sortKey="Tao Hong" sort="Tao Hong" uniqKey="Tao Hong" last="Tao Hong">TAO HONG</name>
<affiliation wicri:level="2"><inist:fA14 i1="01"><s1>Center of Excellence for Document Analysis and Recognition (CEDAR), State University of New York at Buffalo, The UB Commons, 520 Lee Entrance, Suite 202</s1>
<s2>Amherst, NY 14228-2567</s2>
<s3>USA</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>4 aut.</sZ>
</inist:fA14>
<country>États-Unis</country>
<placeName><region type="state">État de New York</region>
</placeName>
</affiliation>
</author>
<author><name sortKey="Lam, S W" sort="Lam, S W" uniqKey="Lam S" first="S. W." last="Lam">S. W. Lam</name>
<affiliation wicri:level="2"><inist:fA14 i1="01"><s1>Center of Excellence for Document Analysis and Recognition (CEDAR), State University of New York at Buffalo, The UB Commons, 520 Lee Entrance, Suite 202</s1>
<s2>Amherst, NY 14228-2567</s2>
<s3>USA</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>4 aut.</sZ>
</inist:fA14>
<country>États-Unis</country>
<placeName><region type="state">État de New York</region>
</placeName>
</affiliation>
</author>
<author><name sortKey="Hull, J J" sort="Hull, J J" uniqKey="Hull J" first="J. J." last="Hull">J. J. Hull</name>
</author>
<author><name sortKey="Srihari, S N" sort="Srihari, S N" uniqKey="Srihari S" first="S. N." last="Srihari">S. N. Srihari</name>
<affiliation wicri:level="2"><inist:fA14 i1="01"><s1>Center of Excellence for Document Analysis and Recognition (CEDAR), State University of New York at Buffalo, The UB Commons, 520 Lee Entrance, Suite 202</s1>
<s2>Amherst, NY 14228-2567</s2>
<s3>USA</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>4 aut.</sZ>
</inist:fA14>
<country>États-Unis</country>
<placeName><region type="state">État de New York</region>
</placeName>
</affiliation>
</author>
</analytic>
<series><title level="j" type="main">SPIE proceedings series</title>
<idno type="ISSN">1017-2653</idno>
<imprint><date when="1995">1995</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
<seriesStmt><title level="j" type="main">SPIE proceedings series</title>
<idno type="ISSN">1017-2653</idno>
</seriesStmt>
</fileDesc>
<profileDesc><textClass><keywords scheme="KwdEn" xml:lang="en"><term>Character recognition</term>
<term>Chinese</term>
<term>Document</term>
<term>Japanese</term>
<term>Optical character recognition</term>
<term>Pattern recognition</term>
<term>Printed character</term>
<term>Similarity</term>
</keywords>
<keywords scheme="Pascal" xml:lang="fr"><term>Reconnaissance optique caractère</term>
<term>Chinois</term>
<term>Japonais</term>
<term>Document</term>
<term>Reconnaissance caractère</term>
<term>Reconnaissance forme</term>
<term>Caractère imprimé</term>
<term>Similarité</term>
</keywords>
<keywords scheme="Wicri" type="topic" xml:lang="fr"><term>Document</term>
</keywords>
</textClass>
</profileDesc>
</teiHeader>
<front><div type="abstract" xml:lang="en">Traditionally, a Chinese or Japanese Optical Character Reader (OCR) has to represent each character category individually as one or more feature prototypes, or a structural description which is a composition of manually derived components such as radicals. Here we propose a new approach in which various kinds of visual similarities between different Chinese characters are analyzed automatically at the feature level. Using this method, character categories will be related to each other by training on fonts; and character images from a text page can be related to each other based on visual similarities they share. This method provides a way to interpret character images from a text page systematically, instead of a sequence of isolated character recognitions. The use of the method for postprocessing in Japanese text recognition will also be discussed.</div>
</front>
</TEI>
<affiliations><list><country><li>États-Unis</li>
</country>
<region><li>État de New York</li>
</region>
</list>
<tree><noCountry><name sortKey="Hull, J J" sort="Hull, J J" uniqKey="Hull J" first="J. J." last="Hull">J. J. Hull</name>
</noCountry>
<country name="États-Unis"><region name="État de New York"><name sortKey="Tao Hong" sort="Tao Hong" uniqKey="Tao Hong" last="Tao Hong">TAO HONG</name>
</region>
<name sortKey="Lam, S W" sort="Lam, S W" uniqKey="Lam S" first="S. W." last="Lam">S. W. Lam</name>
<name sortKey="Srihari, S N" sort="Srihari, S N" uniqKey="Srihari S" first="S. N." last="Srihari">S. N. Srihari</name>
</country>
</tree>
</affiliations>
</record>
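As a rough illustration of how a record like the one above could be read outside Dilib, the sketch below extracts the title and the author names with Python's standard `xml.etree.ElementTree`. The record uses the `wicri:` and `inist:` prefixes without declaring them, so the sketch wraps it in a synthetic element that binds them to placeholder URIs; those URIs, the `wrapper` element, the shortened `SAMPLE` fragment and the function name `read_record` are assumptions made for this example only.

```python
import xml.etree.ElementTree as ET

# Shortened fragment in the same shape as the record above (illustrative only).
SAMPLE = """<record><TEI><teiHeader><fileDesc><titleStmt>
<title xml:lang="en" level="a">Visual similarity analysis of Chinese characters and its uses in Japanese OCR</title>
<author><name sortKey="Lam, S W" first="S. W." last="Lam">S. W. Lam</name>
<affiliation wicri:level="2"><country>États-Unis</country></affiliation>
</author>
<author><name sortKey="Hull, J J" first="J. J." last="Hull">J. J. Hull</name></author>
</titleStmt></fileDesc></teiHeader></TEI></record>"""


def read_record(record_xml):
    """Return the title and (author, country) pairs from a record string."""
    # Placeholder namespace URIs: the real ones are not given in the record.
    wrapped = (
        '<wrapper xmlns:wicri="urn:example:wicri" '
        'xmlns:inist="urn:example:inist">' + record_xml + "</wrapper>"
    )
    record = ET.fromstring(wrapped).find("record")
    title = record.findtext(".//titleStmt/title")
    authors = [
        # findtext returns None when an author has no affiliation/country.
        (author.findtext("name"), author.findtext("affiliation/country"))
        for author in record.findall(".//titleStmt/author")
    ]
    return {"title": title, "authors": authors}


info = read_record(SAMPLE)
print(info["title"])
for name, country in info["authors"]:
    print(name, "-", country)
```

Authors without an affiliation, such as J. J. Hull in this record, come back with `None` as their country rather than being dropped.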

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Ticri/CIDE/explor/OcrV1/Data/Main/Exploration

HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 002B77 | SxmlIndent | more

HfdSelect -h $EXPLOR_AREA/Data/Main/Exploration/biblio.hfd -nk 002B77 | SxmlIndent | more

To link to this page within the Wicri network

{{Explor lien
   |wiki=    Ticri/CIDE
   |area=    OcrV1
   |flux=    Main
   |étape=   Exploration
   |type=    RBID
   |clé=     Pascal:97-0121468
   |texte=   Visual similarity analysis of Chinese characters and its uses in Japanese OCR
}}

This area was generated with Dilib version V0.6.32.
Data generation: Sat Nov 11 16:53:45 2017. Site generation: Mon Mar 11 23:15:16 2024

	OCR exploration server
	Warning: this site is under development. It was generated automatically from raw corpora, so the information has not been validated.