OCR exploration server

Warning: this site is under development!
Warning: this site was generated automatically from raw corpora.
The information has therefore not been validated.

An Adaptive Binarization Technique for Low Quality Historical Documents

Internal identifier: 001630 (Main/Merge); previous: 001629; next: 001631

Authors: Basilios Gatos [Greece]; Ioannis Pratikakis [Greece]; J. Perantonis [Greece]

Source:

RBID : ISTEX:35F40709B6AD5153018FB3B4194A8283BA78EA78

Abstract

Historical document collections are a valuable resource for human history. This paper proposes a novel digital image binarization scheme for low quality historical documents allowing further content exploitation in an efficient way. The proposed scheme consists of five distinct steps: a pre-processing procedure using a low-pass Wiener filter, a rough estimation of foreground regions using Niblack’s approach, a background surface calculation by interpolating neighboring background intensities, a thresholding by combining the calculated background surface with the original image and finally a post-processing step in order to improve the quality of text regions and preserve stroke connectivity. The proposed methodology works with great success even in cases of historical manuscripts with poor quality, shadows, nonuniform illumination, low contrast, large signal-dependent noise, smear and strain. After testing the proposed method on numerous low quality historical manuscripts, it has turned out that our methodology performs better compared to current state-of-the-art adaptive thresholding techniques.

Url:
DOI: 10.1007/978-3-540-28640-0_10
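
The second of the five steps named in the abstract, the rough foreground estimate using Niblack's approach, can be illustrated with a minimal sketch. It assumes the usual formulation of Niblack's method, a per-pixel threshold T = m + k * s computed from the local mean m and standard deviation s in a w x w window. The window size, the value of k, the border handling and the function name `niblack_foreground` are illustrative assumptions and are not taken from the paper; the other four steps are not covered.

```python
from math import sqrt


def niblack_foreground(img, w=15, k=-0.2):
    """Return a mask that is True where a pixel lies below its Niblack threshold.

    img is a list of rows of grey levels, assuming dark text on a light
    background. w (window side) and k are illustrative defaults.
    """
    h, wd = len(img), len(img[0])
    r = w // 2
    mask = []
    for y in range(h):
        row = []
        for x in range(wd):
            # Collect the window around (x, y), clipped at the image borders.
            vals = [img[j][i]
                    for j in range(max(0, y - r), min(h, y + r + 1))
                    for i in range(max(0, x - r), min(wd, x + r + 1))]
            m = sum(vals) / len(vals)
            var = sum((v - m) ** 2 for v in vals) / len(vals)
            # Niblack threshold: local mean plus k times local standard deviation.
            row.append(img[y][x] < m + k * sqrt(var))
        mask.append(row)
    return mask


# Toy page: light background (200) with one dark vertical stroke (50).
img = [[50 if x == 4 else 200 for x in range(9)] for _ in range(9)]
mask = niblack_foreground(img, w=5, k=-0.2)
```

This direct version recomputes every window and is meant for reading rather than speed; on real scans the statistics would normally be computed with integral images or an image-processing library.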

Links to Exploration step

ISTEX:35F40709B6AD5153018FB3B4194A8283BA78EA78

The document in XML format

<record>
<TEI wicri:istexFullTextTei="biblStruct">
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en">An Adaptive Binarization Technique for Low Quality Historical Documents</title>
<author>
<name sortKey="Gatos, Basilios" sort="Gatos, Basilios" uniqKey="Gatos B" first="Basilios" last="Gatos">Basilios Gatos</name>
</author>
<author>
<name sortKey="Pratikakis, Ioannis" sort="Pratikakis, Ioannis" uniqKey="Pratikakis I" first="Ioannis" last="Pratikakis">Ioannis Pratikakis</name>
</author>
<author>
<name sortKey="Perantonis, J" sort="Perantonis, J" uniqKey="Perantonis J" first="J." last="Perantonis">J. Perantonis</name>
</author>
</titleStmt>
<publicationStmt>
<idno type="wicri:source">ISTEX</idno>
<idno type="RBID">ISTEX:35F40709B6AD5153018FB3B4194A8283BA78EA78</idno>
<date when="2004" year="2004">2004</date>
<idno type="doi">10.1007/978-3-540-28640-0_10</idno>
<idno type="url">https://api.istex.fr/document/35F40709B6AD5153018FB3B4194A8283BA78EA78/fulltext/pdf</idno>
<idno type="wicri:Area/Istex/Corpus">000D04</idno>
<idno type="wicri:Area/Istex/Curation">000C81</idno>
<idno type="wicri:Area/Istex/Checkpoint">000E14</idno>
<idno type="wicri:doubleKey">0302-9743:2004:Gatos B:an:adaptive:binarization</idno>
<idno type="wicri:Area/Main/Merge">001630</idno>
</publicationStmt>
<sourceDesc>
<biblStruct>
<analytic>
<title level="a" type="main" xml:lang="en">An Adaptive Binarization Technique for Low Quality Historical Documents</title>
<author>
<name sortKey="Gatos, Basilios" sort="Gatos, Basilios" uniqKey="Gatos B" first="Basilios" last="Gatos">Basilios Gatos</name>
<affiliation wicri:level="1">
<country xml:lang="fr">Grèce</country>
<wicri:regionArea>Computational Intelligence Laboratory, Institute of Informatics and Telecommunications, National Research Center “Demokritos”, 153 10, Athens</wicri:regionArea>
<wicri:noRegion>Athens</wicri:noRegion>
</affiliation>
<affiliation wicri:level="1">
<country wicri:rule="url">Grèce</country>
</affiliation>
</author>
<author>
<name sortKey="Pratikakis, Ioannis" sort="Pratikakis, Ioannis" uniqKey="Pratikakis I" first="Ioannis" last="Pratikakis">Ioannis Pratikakis</name>
<affiliation wicri:level="1">
<country xml:lang="fr">Grèce</country>
<wicri:regionArea>Computational Intelligence Laboratory, Institute of Informatics and Telecommunications, National Research Center “Demokritos”, 153 10, Athens</wicri:regionArea>
<wicri:noRegion>Athens</wicri:noRegion>
</affiliation>
<affiliation wicri:level="1">
<country wicri:rule="url">Grèce</country>
</affiliation>
</author>
<author>
<name sortKey="Perantonis, J" sort="Perantonis, J" uniqKey="Perantonis J" first="J." last="Perantonis">J. Perantonis</name>
<affiliation wicri:level="1">
<country xml:lang="fr">Grèce</country>
<wicri:regionArea>Computational Intelligence Laboratory, Institute of Informatics and Telecommunications, National Research Center “Demokritos”, 153 10, Athens</wicri:regionArea>
<wicri:noRegion>Athens</wicri:noRegion>
</affiliation>
<affiliation wicri:level="1">
<country wicri:rule="url">Grèce</country>
</affiliation>
</author>
</analytic>
<monogr></monogr>
<series>
<title level="s">Lecture Notes in Computer Science</title>
<imprint>
<date>2004</date>
</imprint>
<idno type="ISSN">0302-9743</idno>
<idno type="eISSN">1611-3349</idno>
<idno type="ISSN">0302-9743</idno>
</series>
<idno type="istex">35F40709B6AD5153018FB3B4194A8283BA78EA78</idno>
<idno type="DOI">10.1007/978-3-540-28640-0_10</idno>
<idno type="ChapterID">10</idno>
<idno type="ChapterID">Chap10</idno>
</biblStruct>
</sourceDesc>
<seriesStmt>
<idno type="ISSN">0302-9743</idno>
</seriesStmt>
</fileDesc>
<profileDesc>
<textClass></textClass>
<langUsage>
<language ident="en">en</language>
</langUsage>
</profileDesc>
</teiHeader>
<front>
<div type="abstract" xml:lang="en">Abstract: Historical document collections are a valuable resource for human history. This paper proposes a novel digital image binarization scheme for low quality historical documents allowing further content exploitation in an efficient way. The proposed scheme consists of five distinct steps: a pre-processing procedure using a low-pass Wiener filter, a rough estimation of foreground regions using Niblack’s approach, a background surface calculation by interpolating neighboring background intensities, a thresholding by combining the calculated background surface with the original image and finally a post-processing step in order to improve the quality of text regions and preserve stroke connectivity. The proposed methodology works with great success even in cases of historical manuscripts with poor quality, shadows, nonuniform illumination, low contrast, large signal- dependent noise, smear and strain. After testing the proposed method on numerous low quality historical manuscripts, it has turned out that our methodology performs better compared to current state-of-the-art adaptive thresholding techniques.</div>
</front>
</TEI>
</record>
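
As a hedged illustration of reading such a record outside Dilib, the sketch below extracts the title, authors and DOI with Python's standard `xml.etree.ElementTree`. The record uses the `wicri:` prefix without declaring it, which a strict XML parser rejects, so the sketch binds the prefix to a placeholder URI first. That URI (`urn:example:wicri`), the function name `parse_record` and the shortened sample are assumptions made for the example, not part of the Wicri tooling.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the record; only a few fields are kept.
RECORD = """<record>
<TEI wicri:istexFullTextTei="biblStruct">
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en">An Adaptive Binarization Technique for Low Quality Historical Documents</title>
<author>
<name sortKey="Gatos, Basilios" first="Basilios" last="Gatos">Basilios Gatos</name>
</author>
<author>
<name sortKey="Pratikakis, Ioannis" first="Ioannis" last="Pratikakis">Ioannis Pratikakis</name>
</author>
</titleStmt>
<publicationStmt>
<idno type="doi">10.1007/978-3-540-28640-0_10</idno>
</publicationStmt>
</fileDesc>
</teiHeader>
</TEI>
</record>"""

# Hypothetical namespace URI: the record never declares the wicri: prefix.
WICRI_NS = "urn:example:wicri"


def parse_record(text):
    """Return title, author names and DOI from a Wicri-style record string."""
    # Bind the wicri: prefix on the root so the parser accepts the fragment.
    patched = text.replace("<record>", f'<record xmlns:wicri="{WICRI_NS}">', 1)
    root = ET.fromstring(patched)
    return {
        "title": root.findtext(".//titleStmt/title"),
        "authors": [n.text for n in root.findall(".//titleStmt/author/name")],
        "doi": root.findtext(".//idno[@type='doi']"),
    }


info = parse_record(RECORD)
```

The lookups are restricted to `titleStmt` and to `idno` elements with `type="doi"` because the full record repeats the authors under `analytic` and carries the DOI a second time as `type="DOI"`.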

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Ticri/CIDE/explor/OcrV1/Data/Main/Merge
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 001630 | SxmlIndent | more

Or

HfdSelect -h $EXPLOR_AREA/Data/Main/Merge/biblio.hfd -nk 001630 | SxmlIndent | more

To link to this page within the Wicri network

{{Explor lien
   |wiki=    Ticri/CIDE
   |area=    OcrV1
   |flux=    Main
   |étape=   Merge
   |type=    RBID
   |clé=     ISTEX:35F40709B6AD5153018FB3B4194A8283BA78EA78
   |texte=   An Adaptive Binarization Technique for Low Quality Historical Documents
}}

This area was generated with Dilib version V0.6.32.
Data generation: Sat Nov 11 16:53:45 2017. Site generation: Mon Mar 11 23:15:16 2024