
Multi-level post-processing for Korean character recognition using morphological analysis and linguistic evaluation

Internal identifier: 002521 (Main/Merge); previous: 002520; next: 002522

Authors: Geunbae Lee [South Korea]; Jong-Hyeok Lee [South Korea]; Jinhee Yoo [South Korea]

Source:

Pattern Recognition [ISSN 0031-3203]; 1996.

RBID : ISTEX:5BF85BB14132147049C6E647BFE729A97CE7CF10

Abstract

Most of the post-processing methods for character recognition rely on contextual information of character and word-fragment levels. However, due to linguistic characteristics of Korean, such low-level information alone is not sufficient for high-quality character-recognition applications, and we need much higher-level contextual information to improve the recognition results. This paper presents a domain independent post-processing technique that utilizes multi-level morphological, syntactic, and semantic information as well as character-level information. The proposed post-processing system performs three-level processing: candidate character-set selection, candidate eojeol (Korean word) generation through morphological analysis, and final single eojeol-sequence selection by linguistic evaluation. All the required linguistic information and probabilities are automatically acquired from a statistical corpus analysis. Experimental results demonstrate the effectiveness of our method, yielding an error correction rate of 80.46%, and improved recognition rate of 95.53% from the before-post-processing rate of 71.2% for single best-solution selection.
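The three levels named in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the Latin-alphabet data, the lexicon lookup standing in for morphological analysis, the bigram table standing in for linguistic evaluation, and all function names are assumptions made for illustration only.

```python
from itertools import product

def select_candidates(char_scores, k=2):
    """Level 1: keep the k best-scoring candidate characters per position."""
    return [sorted(pos, key=pos.get, reverse=True)[:k] for pos in char_scores]

def generate_words(candidates, lexicon):
    """Level 2: combine candidates and keep strings found in the lexicon.
    (Assumption: a lexicon lookup stands in for morphological analysis.)"""
    words = ("".join(chars) for chars in product(*candidates))
    return [w for w in words if w in lexicon]

def best_sequence(word_lattice, bigram):
    """Level 3: choose one word per slot, maximizing the product of bigram
    scores with a Viterbi-style pass. Assumes every slot is non-empty."""
    # best[w] = (score, path) for the best path ending in word w
    best = {w: (1.0, [w]) for w in word_lattice[0]}
    for slot in word_lattice[1:]:
        nxt = {}
        for w in slot:
            score, path = max(
                ((s * bigram.get((prev, w), 1e-6), p)
                 for prev, (s, p) in best.items()),
                key=lambda t: t[0],
            )
            nxt[w] = (score, path + [w])
        best = nxt
    return max(best.values(), key=lambda t: t[0])[1]

# Hypothetical recognizer output: one dict of character scores per position.
slot_1 = [{"t": 0.9, "f": 0.1}, {"h": 0.6, "b": 0.4}, {"e": 0.8, "c": 0.2}]
slot_2 = [{"c": 0.6, "e": 0.4}, {"a": 0.9, "o": 0.1}, {"t": 0.7, "r": 0.3}]
lexicon = {"the", "cat", "car", "eat", "ear"}
bigram = {("the", "car"): 0.5, ("the", "cat"): 0.3}

lattice = [generate_words(select_candidates(s), lexicon) for s in (slot_1, slot_2)]
result = best_sequence(lattice, bigram)
```

In this toy run, character scores alone would favor "cat" in the second slot, while the word-level context can overrule that choice. This is the kind of higher-level correction the abstract argues for, shown here with much simpler stand-ins than the morphological, syntactic, and semantic information the paper uses.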

URL:

https://api.istex.fr/document/5BF85BB14132147049C6E647BFE729A97CE7CF10/fulltext/pdf

DOI: 10.1016/S0031-3203(96)00156-2

Links to previous steps (curation, corpus...)

to stream Istex, to step Corpus: 001B83
to stream Istex, to step Curation: 001A71
to stream Istex, to step Checkpoint: 001841

Links to Exploration step

ISTEX:5BF85BB14132147049C6E647BFE729A97CE7CF10

The document in XML format

<record><TEI wicri:istexFullTextTei="biblStruct"><teiHeader><fileDesc><titleStmt><title>Multi-level post-processing for Korean character recognition using morphological analysis and linguistic evaluation</title>
<author><name sortKey="Lee, Geunbae" sort="Lee, Geunbae" uniqKey="Lee G" first="Geunbae" last="Lee">Geunbae Lee</name>
</author>
<author><name sortKey="Lee, Jong Hyeok" sort="Lee, Jong Hyeok" uniqKey="Lee J" first="Jong-Hyeok" last="Lee">Jong-Hyeok Lee</name>
</author>
<author><name sortKey="Yoo, Jinhee" sort="Yoo, Jinhee" uniqKey="Yoo J" first="Jinhee" last="Yoo">Jinhee Yoo</name>
</author>
</titleStmt>
<publicationStmt><idno type="wicri:source">ISTEX</idno>
<idno type="RBID">ISTEX:5BF85BB14132147049C6E647BFE729A97CE7CF10</idno>
<date when="1997" year="1997">1997</date>
<idno type="doi">10.1016/S0031-3203(96)00156-2</idno>
<idno type="url">https://api.istex.fr/document/5BF85BB14132147049C6E647BFE729A97CE7CF10/fulltext/pdf</idno>
<idno type="wicri:Area/Istex/Corpus">001B83</idno>
<idno type="wicri:Area/Istex/Curation">001A71</idno>
<idno type="wicri:Area/Istex/Checkpoint">001841</idno>
<idno type="wicri:doubleKey">0031-3203:1997:Lee G:multi:level:post</idno>
<idno type="wicri:Area/Main/Merge">002521</idno>
</publicationStmt>
<sourceDesc><biblStruct><analytic><title level="a">Multi-level post-processing for Korean character recognition using morphological analysis and linguistic evaluation</title>
<author><name sortKey="Lee, Geunbae" sort="Lee, Geunbae" uniqKey="Lee G" first="Geunbae" last="Lee">Geunbae Lee</name>
<affiliation wicri:level="1"><country xml:lang="fr">Corée du Sud</country>
<wicri:regionArea>Department of Computer Science and Engineering, Pohang University of Science and Technology, San 31, Hoja-Dong, Pohang 790-784</wicri:regionArea>
<wicri:noRegion>Pohang 790-784</wicri:noRegion>
</affiliation>
<affiliation wicri:level="1"><country wicri:rule="url">Corée du Sud</country>
</affiliation>
</author>
<author><name sortKey="Lee, Jong Hyeok" sort="Lee, Jong Hyeok" uniqKey="Lee J" first="Jong-Hyeok" last="Lee">Jong-Hyeok Lee</name>
<affiliation wicri:level="1"><country xml:lang="fr">Corée du Sud</country>
<wicri:regionArea>Department of Computer Science and Engineering, Pohang University of Science and Technology, San 31, Hoja-Dong, Pohang 790-784</wicri:regionArea>
<wicri:noRegion>Pohang 790-784</wicri:noRegion>
</affiliation>
</author>
<author><name sortKey="Yoo, Jinhee" sort="Yoo, Jinhee" uniqKey="Yoo J" first="Jinhee" last="Yoo">Jinhee Yoo</name>
<affiliation wicri:level="1"><country xml:lang="fr">Corée du Sud</country>
<wicri:regionArea>Department of Computer Science and Engineering, Pohang University of Science and Technology, San 31, Hoja-Dong, Pohang 790-784</wicri:regionArea>
<wicri:noRegion>Pohang 790-784</wicri:noRegion>
</affiliation>
</author>
</analytic>
<monogr></monogr>
<series><title level="j">Pattern Recognition</title>
<title level="j" type="abbrev">PR</title>
<idno type="ISSN">0031-3203</idno>
<imprint><publisher>ELSEVIER</publisher>
<date type="published" when="1996">1996</date>
<biblScope unit="volume">30</biblScope>
<biblScope unit="issue">8</biblScope>
<biblScope unit="page" from="1347">1347</biblScope>
<biblScope unit="page" to="1360">1360</biblScope>
</imprint>
<idno type="ISSN">0031-3203</idno>
</series>
<idno type="istex">5BF85BB14132147049C6E647BFE729A97CE7CF10</idno>
<idno type="DOI">10.1016/S0031-3203(96)00156-2</idno>
<idno type="PII">S0031-3203(96)00156-2</idno>
</biblStruct>
</sourceDesc>
<seriesStmt><idno type="ISSN">0031-3203</idno>
</seriesStmt>
</fileDesc>
<profileDesc><textClass></textClass>
<langUsage><language ident="en">en</language>
</langUsage>
</profileDesc>
</teiHeader>
<front><div type="abstract" xml:lang="en">Most of the post-processing methods for character recognition rely on contextual information of character and word-fragment levels. However, due to linguistic characteristics of Korean, such low-level information alone is not sufficient for high-quality character-recognition applications, and we need much higher-level contextual information to improve the recognition results. This paper presents a domain independent post-processing technique that utilizes multi-level morphological, syntactic, and semantic information as well as character-level information. The proposed post-processing system performs three-level processing: candidate character-set selection, candidate eojeol (Korean word) generation through morphological analysis, and final single eojeol-sequence selection by linguistic evaluation. All the required linguistic information and probabilities are automatically acquired from a statistical corpus analysis. Experimental results demonstrate the effectiveness of our method, yielding an error correction rate of 80.46%, and improved recognition rate of 95.53% from the before-post-processing rate of 71.2% for single best-solution selection.</div>
</front>
</TEI>
</record>
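As a minimal sketch of how a record like the one above might be read programmatically, the following uses Python's standard `xml.etree.ElementTree`. It works on a shortened copy of the record. Note one assumption: the record uses the `wicri:` prefix without declaring it, so a namespace-aware parser would reject it as written; the sketch binds the prefix to a placeholder URI (`urn:example:wicri`, hypothetical) before parsing. The function names are illustrative and not part of Dilib.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the record above; only a few fields are kept.
SAMPLE = """<record><TEI wicri:istexFullTextTei="biblStruct"><teiHeader><fileDesc><titleStmt><title>Multi-level post-processing for Korean character recognition using morphological analysis and linguistic evaluation</title>
<author><name sortKey="Lee, Geunbae" first="Geunbae" last="Lee">Geunbae Lee</name>
</author>
<author><name sortKey="Lee, Jong Hyeok" first="Jong-Hyeok" last="Lee">Jong-Hyeok Lee</name>
</author>
<author><name sortKey="Yoo, Jinhee" first="Jinhee" last="Yoo">Jinhee Yoo</name>
</author>
</titleStmt>
<publicationStmt><idno type="doi">10.1016/S0031-3203(96)00156-2</idno>
</publicationStmt>
</fileDesc>
</teiHeader>
</TEI>
</record>"""

# Placeholder URI (an assumption): the record never declares the wicri: prefix.
WICRI_NS = "urn:example:wicri"

def parse_record(xml_text):
    """Bind the undeclared wicri: prefix on the root element, then parse."""
    bound = xml_text.replace("<record>", '<record xmlns:wicri="%s">' % WICRI_NS, 1)
    return ET.fromstring(bound)

def summarize(root):
    """Collect title, author names, and DOI from the header of the record."""
    return {
        "title": root.findtext(".//titleStmt/title"),
        "authors": [n.text for n in root.findall(".//titleStmt/author/name")],
        "doi": root.findtext(".//publicationStmt/idno[@type='doi']"),
    }

info = summarize(parse_record(SAMPLE))
```

Binding the prefix by string replacement keeps the sketch short; a more careful tool would use whatever namespace URI the Wicri export actually intends, which the record itself does not state.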

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Ticri/CIDE/explor/OcrV1/Data/Main/Merge

HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 002521 | SxmlIndent | more

HfdSelect -h $EXPLOR_AREA/Data/Main/Merge/biblio.hfd -nk 002521 | SxmlIndent | more

To link to this page within the Wicri network

{{Explor lien
   |wiki=    Ticri/CIDE
   |area=    OcrV1
   |flux=    Main
   |étape=   Merge
   |type=    RBID
   |clé=     ISTEX:5BF85BB14132147049C6E647BFE729A97CE7CF10
   |texte=   Multi-level post-processing for Korean character recognition using morphological analysis and linguistic evaluation
}}

This area was generated with Dilib version V0.6.32.
Data generation: Sat Nov 11 16:53:45 2017. Site generation: Mon Mar 11 23:15:16 2024

OCR exploration server
Warning: this site is under development. It was generated automatically from raw corpora, so the information has not been validated.
