NMF-Based Approach to Font Classification of Printed English Alphabets for Document Image Understanding

Internal identifier: 000354 (Istex/Curation); previous: 000353; next: 000355

Authors: Woo Lee [South Korea]; Keechul Jung [South Korea]

Source:

Lecture Notes in Computer Science [ 0302-9743 ] ; 2005.

RBID : ISTEX:8D70EE7AE36F82535B88D00E2BD7ADCD1AED3109

Abstract

Abstract: This paper proposes an approach to font classification for document image understanding using non-negative matrix factorization (NMF). The basic idea is that the characteristics of each font derive from parts of its individual characters rather than from holistic textures. Spatially localized parts of the font images are extracted automatically using NMF, and these parts are used as features representing each font. The experiments investigate the distribution of the features and how appropriate they are for characterizing each font. In addition, the proposed method is compared with a method based on principal component analysis (PCA), with various distance metrics tested in the feature space. The proposed method is expected to improve the performance of optical character recognition (OCR) systems and document indexing and retrieval systems that adopt the proposed font classifier as a preprocessor.
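
The parts-based idea in the abstract can be illustrated with a minimal, dependency-free sketch. It is not the authors' implementation: the multiplicative update rules (the standard Lee and Seung form for squared error), the nearest-neighbour rule with Euclidean distance, the toy 4-pixel "images", and all function names (`nmf`, `encode`, `classify`) are assumptions chosen only for illustration.

```python
import random

def transpose(A):
    return [list(col) for col in zip(*A)]

def matmul(A, B):
    Bt = transpose(B)
    return [[sum(a * b for a, b in zip(row, col)) for col in Bt] for row in A]

def nmf(V, rank, iters=500, seed=0, eps=1e-9):
    """Approximate non-negative V (pixels x images) as W @ H, with W, H >= 0."""
    rng = random.Random(seed)
    n, m = len(V), len(V[0])
    W = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n)]
    H = [[rng.random() + 0.1 for _ in range(m)] for _ in range(rank)]
    for _ in range(iters):
        # H <- H * (W^T V) / (W^T W H), element-wise
        Wt = transpose(W)
        num, den = matmul(Wt, V), matmul(Wt, matmul(W, H))
        H = [[H[k][j] * num[k][j] / (den[k][j] + eps) for j in range(m)]
             for k in range(rank)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        Ht = transpose(H)
        num, den = matmul(V, Ht), matmul(matmul(W, H), Ht)
        W = [[W[i][k] * num[i][k] / (den[i][k] + eps) for k in range(rank)]
             for i in range(n)]
    return W, H

def encode(W, v, iters=500, eps=1e-9):
    """Non-negative coefficients h such that W @ h approximates image v (W fixed)."""
    Wt = transpose(W)
    WtW = matmul(Wt, W)
    Wtv = [sum(w * x for w, x in zip(row, v)) for row in Wt]
    h = [1.0] * len(Wt)
    for _ in range(iters):
        den = [sum(a * b for a, b in zip(row, h)) for row in WtW]
        h = [hk * nk / (dk + eps) for hk, nk, dk in zip(h, Wtv, den)]
    return h

def classify(W, train_codes, train_labels, v):
    """Label of the training image whose code is closest (Euclidean) to v's code."""
    h = encode(W, v)
    dists = [sum((a - b) ** 2 for a, b in zip(h, code)) for code in train_codes]
    return train_labels[dists.index(min(dists))]

# Toy data (assumed): each column is a 4-pixel "image";
# font A inks pixels 0-1, font B inks pixels 2-3.
V = [[1, 2, 0, 0],
     [1, 2, 0, 0],
     [0, 0, 1, 3],
     [0, 0, 1, 3]]
labels = ["A", "A", "B", "B"]
W, H = nmf(V, rank=2)
codes = transpose(H)  # one feature vector per training image
print(classify(W, codes, labels, [1.5, 1.5, 0, 0]))
print(classify(W, codes, labels, [0, 0, 2, 2]))
```

The columns of `W` play the role of the localized "parts", and each column of `H` is the feature vector of one training image. Real font images would be flattened pixel arrays, and a numerical library would replace the list-based matrix helpers.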

URL:

https://api.istex.fr/document/8D70EE7AE36F82535B88D00E2BD7ADCD1AED3109/fulltext/pdf

DOI: 10.1007/11526018_35

Links toward previous steps (curation, corpus...)

To stream Istex, to step Corpus (to go to this record in the Curation step): 000359

Links to Exploration step

ISTEX:8D70EE7AE36F82535B88D00E2BD7ADCD1AED3109

The document in XML format

<record><TEI wicri:istexFullTextTei="biblStruct:series"><teiHeader><fileDesc><titleStmt><title xml:lang="en">NMF-Based Approach to Font Classification of Printed English Alphabets for Document Image Understanding</title>
<author><name sortKey="Lee, Woo" sort="Lee, Woo" uniqKey="Lee W" first="Woo" last="Lee">Woo Lee</name>
<affiliation wicri:level="1"><mods:affiliation>Dept. of Computer Information Science, Kunsan National University, 573-701, Kunsan, Jeollabuk-do, S. Korea</mods:affiliation>
<country xml:lang="fr">Corée du Sud</country>
<wicri:regionArea>Dept. of Computer Information Science, Kunsan National University, 573-701, Kunsan, Jeollabuk-do</wicri:regionArea>
</affiliation>
<affiliation wicri:level="1"><mods:affiliation>E-mail: leecw@kunsan.ac.kr</mods:affiliation>
<country wicri:rule="url">Corée du Sud</country>
</affiliation>
</author>
<author><name sortKey="Jung, Keechul" sort="Jung, Keechul" uniqKey="Jung K" first="Keechul" last="Jung">Keechul Jung</name>
<affiliation wicri:level="1"><mods:affiliation>School of Media, College of Information Science, Soongsil University, 156-743, Seoul, S. Korea</mods:affiliation>
<country xml:lang="fr">Corée du Sud</country>
<wicri:regionArea>School of Media, College of Information Science, Soongsil University, 156-743, Seoul</wicri:regionArea>
</affiliation>
<affiliation wicri:level="1"><mods:affiliation>E-mail: kcjung@ssu.ac.kr</mods:affiliation>
<country wicri:rule="url">Corée du Sud</country>
</affiliation>
</author>
</titleStmt>
<publicationStmt><idno type="wicri:source">ISTEX</idno>
<idno type="RBID">ISTEX:8D70EE7AE36F82535B88D00E2BD7ADCD1AED3109</idno>
<date when="2005" year="2005">2005</date>
<idno type="doi">10.1007/11526018_35</idno>
<idno type="url">https://api.istex.fr/document/8D70EE7AE36F82535B88D00E2BD7ADCD1AED3109/fulltext/pdf</idno>
<idno type="wicri:Area/Istex/Corpus">000359</idno>
<idno type="wicri:Area/Istex/Curation">000354</idno>
</publicationStmt>
<sourceDesc><biblStruct><analytic><title level="a" type="main" xml:lang="en">NMF-Based Approach to Font Classification of Printed English Alphabets for Document Image Understanding</title>
<author><name sortKey="Lee, Woo" sort="Lee, Woo" uniqKey="Lee W" first="Woo" last="Lee">Woo Lee</name>
<affiliation wicri:level="1"><mods:affiliation>Dept. of Computer Information Science, Kunsan National University, 573-701, Kunsan, Jeollabuk-do, S. Korea</mods:affiliation>
<country xml:lang="fr">Corée du Sud</country>
<wicri:regionArea>Dept. of Computer Information Science, Kunsan National University, 573-701, Kunsan, Jeollabuk-do</wicri:regionArea>
</affiliation>
<affiliation wicri:level="1"><mods:affiliation>E-mail: leecw@kunsan.ac.kr</mods:affiliation>
<country wicri:rule="url">Corée du Sud</country>
</affiliation>
</author>
<author><name sortKey="Jung, Keechul" sort="Jung, Keechul" uniqKey="Jung K" first="Keechul" last="Jung">Keechul Jung</name>
<affiliation wicri:level="1"><mods:affiliation>School of Media, College of Information Science, Soongsil University, 156-743, Seoul, S. Korea</mods:affiliation>
<country xml:lang="fr">Corée du Sud</country>
<wicri:regionArea>School of Media, College of Information Science, Soongsil University, 156-743, Seoul</wicri:regionArea>
</affiliation>
<affiliation wicri:level="1"><mods:affiliation>E-mail: kcjung@ssu.ac.kr</mods:affiliation>
<country wicri:rule="url">Corée du Sud</country>
</affiliation>
</author>
</analytic>
<monogr></monogr>
<series><title level="s">Lecture Notes in Computer Science</title>
<imprint><date>2005</date>
</imprint>
<idno type="ISSN">0302-9743</idno>
<idno type="eISSN">1611-3349</idno>
<idno type="ISSN">0302-9743</idno>
</series>
<idno type="istex">8D70EE7AE36F82535B88D00E2BD7ADCD1AED3109</idno>
<idno type="DOI">10.1007/11526018_35</idno>
<idno type="ChapterID">35</idno>
<idno type="ChapterID">Chap35</idno>
</biblStruct>
</sourceDesc>
<seriesStmt><idno type="ISSN">0302-9743</idno>
</seriesStmt>
</fileDesc>
<profileDesc><textClass></textClass>
<langUsage><language ident="en">en</language>
</langUsage>
</profileDesc>
</teiHeader>
<front><div type="abstract" xml:lang="en">Abstract: This paper proposes an approach to font classification for document image understanding using non-negative matrix factorization (NMF). The basic idea is that the characteristics of each font derive from parts of its individual characters rather than from holistic textures. Spatially localized parts of the font images are extracted automatically using NMF, and these parts are used as features representing each font. The experiments investigate the distribution of the features and how appropriate they are for characterizing each font. In addition, the proposed method is compared with a method based on principal component analysis (PCA), with various distance metrics tested in the feature space. The proposed method is expected to improve the performance of optical character recognition (OCR) systems and document indexing and retrieval systems that adopt the proposed font classifier as a preprocessor.</div>
</front>
</TEI>
</record>
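
The record above can also be read outside Dilib. The following is a minimal sketch using Python's standard `xml.etree.ElementTree`. The record uses the `wicri:` and `mods:` prefixes without declaring them, so a strict XML parser rejects the text as it stands. The sketch therefore patches placeholder namespace URIs into the root tag. Those URIs, the helper name `parse_record`, and the shortened `SAMPLE` record are assumptions made for illustration, not part of the Wicri format.

```python
import xml.etree.ElementTree as ET

# Placeholder URIs (assumed): they exist only so the parser accepts the
# undeclared wicri: and mods: prefixes used in the record.
NS = {"wicri": "urn:example:wicri", "mods": "urn:example:mods"}

def parse_record(xml_text):
    """Return title, author names and DOI from a <record> string."""
    decls = " ".join(f'xmlns:{p}="{uri}"' for p, uri in NS.items())
    patched = xml_text.replace("<record>", f"<record {decls}>", 1)
    root = ET.fromstring(patched)
    analytic = root.find(".//analytic")
    return {
        "title": analytic.findtext("title"),
        "authors": [n.text for n in analytic.findall("author/name")],
        "doi": root.findtext(".//publicationStmt/idno[@type='doi']"),
    }

# Shortened copy of the record, keeping only the fields read here.
SAMPLE = """<record><TEI wicri:istexFullTextTei="biblStruct:series"><teiHeader><fileDesc>
<publicationStmt><idno type="doi">10.1007/11526018_35</idno></publicationStmt>
<sourceDesc><biblStruct><analytic><title level="a" type="main" xml:lang="en">NMF-Based Approach to Font Classification of Printed English Alphabets for Document Image Understanding</title>
<author><name sortKey="Lee, Woo">Woo Lee</name></author>
<author><name sortKey="Jung, Keechul">Keechul Jung</name></author>
</analytic></biblStruct></sourceDesc>
</fileDesc></teiHeader></TEI></record>"""

info = parse_record(SAMPLE)
print(info["title"])
print(info["authors"])
print(info["doi"])
```

Declaring the prefixes on the root element leaves the unprefixed elements (`title`, `author`, `idno`) without a namespace, so they can be addressed by their plain tag names.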

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Ticri/CIDE/explor/OcrV1/Data/Istex/Curation

HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000354 | SxmlIndent | more

HfdSelect -h $EXPLOR_AREA/Data/Istex/Curation/biblio.hfd -nk 000354 | SxmlIndent | more

To link to this page in the Wicri network

{{Explor lien
   |wiki=    Ticri/CIDE
   |area=    OcrV1
   |flux=    Istex
   |étape=   Curation
   |type=    RBID
   |clé=     ISTEX:8D70EE7AE36F82535B88D00E2BD7ADCD1AED3109
   |texte=   NMF-Based Approach to Font Classification of Printed English Alphabets for Document Image Understanding
}}

This area was generated with Dilib version V0.6.32.
Data generation: Sat Nov 11 16:53:45 2017. Site generation: Mon Mar 11 23:15:16 2024

	OCR exploration server
	Warning: this site is under development! Warning: this site was generated automatically from raw corpora. The information has therefore not been validated.
