An Adaptive Binarization Technique for Low Quality Historical Documents
Internal identifier: 001630 (Main/Merge); previous: 001629; next: 001631
Authors: Basilios Gatos [Greece]; Ioannis Pratikakis [Greece]; J. Perantonis [Greece]
Source:
- Lecture Notes in Computer Science [0302-9743]; 2004.
Abstract
Historical document collections are a valuable resource for human history. This paper proposes a novel digital image binarization scheme for low quality historical documents allowing further content exploitation in an efficient way. The proposed scheme consists of five distinct steps: a pre-processing procedure using a low-pass Wiener filter, a rough estimation of foreground regions using Niblack’s approach, a background surface calculation by interpolating neighboring background intensities, a thresholding by combining the calculated background surface with the original image and finally a post-processing step in order to improve the quality of text regions and preserve stroke connectivity. The proposed methodology works with great success even in cases of historical manuscripts with poor quality, shadows, nonuniform illumination, low contrast, large signal-dependent noise, smear and strain. After testing the proposed method on numerous low quality historical manuscripts, it has turned out that our methodology performs better compared to current state-of-the-art adaptive thresholding techniques.
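As an illustration of the second step only (the rough foreground estimation), the following is a minimal sketch of Niblack's local thresholding rule, T = m + k·s, where m and s are the mean and standard deviation of the grey levels in a window around each pixel. The function name, the window size and the value k = -0.2 are assumptions chosen for the example; the abstract does not state the parameters used, and this is not the authors' implementation.

```python
from statistics import mean, pstdev


def niblack_foreground(image, window=3, k=-0.2):
    """Return a mask where 1 marks pixels darker than the local Niblack threshold.

    `image` is a list of rows of grey levels (0 = black, 255 = white).
    `window` and `k` are illustrative defaults, not values from the paper.
    """
    height, width = len(image), len(image[0])
    half = window // 2
    mask = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            # Collect the neighbourhood, clipped at the image borders.
            neighbours = [
                image[j][i]
                for j in range(max(0, y - half), min(height, y + half + 1))
                for i in range(max(0, x - half), min(width, x + half + 1))
            ]
            # Niblack threshold: local mean plus k times local standard deviation.
            threshold = mean(neighbours) + k * pstdev(neighbours)
            mask[y][x] = 1 if image[y][x] < threshold else 0
    return mask


# A single dark pixel on a light background is flagged as foreground.
page = [
    [200, 200, 200],
    [200, 50, 200],
    [200, 200, 200],
]
print(niblack_foreground(page))
```

In the scheme described above, a mask of this kind would only serve as a rough estimate; the background surface interpolation and final thresholding steps are not covered by this sketch.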
DOI: 10.1007/978-3-540-28640-0_10
Links to previous steps (curation, corpus...)
- to stream Istex, to step Corpus: 000D04
- to stream Istex, to step Curation: 000C81
- to stream Istex, to step Checkpoint: 000E14
Links to Exploration step
ISTEX:35F40709B6AD5153018FB3B4194A8283BA78EA78
The document in XML format
<record><TEI wicri:istexFullTextTei="biblStruct"><teiHeader><fileDesc><titleStmt><title xml:lang="en">An Adaptive Binarization Technique for Low Quality Historical Documents</title>
<author><name sortKey="Gatos, Basilios" sort="Gatos, Basilios" uniqKey="Gatos B" first="Basilios" last="Gatos">Basilios Gatos</name>
</author>
<author><name sortKey="Pratikakis, Ioannis" sort="Pratikakis, Ioannis" uniqKey="Pratikakis I" first="Ioannis" last="Pratikakis">Ioannis Pratikakis</name>
</author>
<author><name sortKey="Perantonis, J" sort="Perantonis, J" uniqKey="Perantonis J" first="J." last="Perantonis">J. Perantonis</name>
</author>
</titleStmt>
<publicationStmt><idno type="wicri:source">ISTEX</idno>
<idno type="RBID">ISTEX:35F40709B6AD5153018FB3B4194A8283BA78EA78</idno>
<date when="2004" year="2004">2004</date>
<idno type="doi">10.1007/978-3-540-28640-0_10</idno>
<idno type="url">https://api.istex.fr/document/35F40709B6AD5153018FB3B4194A8283BA78EA78/fulltext/pdf</idno>
<idno type="wicri:Area/Istex/Corpus">000D04</idno>
<idno type="wicri:Area/Istex/Curation">000C81</idno>
<idno type="wicri:Area/Istex/Checkpoint">000E14</idno>
<idno type="wicri:doubleKey">0302-9743:2004:Gatos B:an:adaptive:binarization</idno>
<idno type="wicri:Area/Main/Merge">001630</idno>
</publicationStmt>
<sourceDesc><biblStruct><analytic><title level="a" type="main" xml:lang="en">An Adaptive Binarization Technique for Low Quality Historical Documents</title>
<author><name sortKey="Gatos, Basilios" sort="Gatos, Basilios" uniqKey="Gatos B" first="Basilios" last="Gatos">Basilios Gatos</name>
<affiliation wicri:level="1"><country xml:lang="fr">Grèce</country>
<wicri:regionArea>Computational Intelligence Laboratory, Institute of Informatics and Telecommunications, National Research Center “Demokritos”, 153 10, Athens</wicri:regionArea>
<wicri:noRegion>Athens</wicri:noRegion>
</affiliation>
<affiliation wicri:level="1"><country wicri:rule="url">Grèce</country>
</affiliation>
</author>
<author><name sortKey="Pratikakis, Ioannis" sort="Pratikakis, Ioannis" uniqKey="Pratikakis I" first="Ioannis" last="Pratikakis">Ioannis Pratikakis</name>
<affiliation wicri:level="1"><country xml:lang="fr">Grèce</country>
<wicri:regionArea>Computational Intelligence Laboratory, Institute of Informatics and Telecommunications, National Research Center “Demokritos”, 153 10, Athens</wicri:regionArea>
<wicri:noRegion>Athens</wicri:noRegion>
</affiliation>
<affiliation wicri:level="1"><country wicri:rule="url">Grèce</country>
</affiliation>
</author>
<author><name sortKey="Perantonis, J" sort="Perantonis, J" uniqKey="Perantonis J" first="J." last="Perantonis">J. Perantonis</name>
<affiliation wicri:level="1"><country xml:lang="fr">Grèce</country>
<wicri:regionArea>Computational Intelligence Laboratory, Institute of Informatics and Telecommunications, National Research Center “Demokritos”, 153 10, Athens</wicri:regionArea>
<wicri:noRegion>Athens</wicri:noRegion>
</affiliation>
<affiliation wicri:level="1"><country wicri:rule="url">Grèce</country>
</affiliation>
</author>
</analytic>
<monogr></monogr>
<series><title level="s">Lecture Notes in Computer Science</title>
<imprint><date>2004</date>
</imprint>
<idno type="ISSN">0302-9743</idno>
<idno type="eISSN">1611-3349</idno>
<idno type="ISSN">0302-9743</idno>
</series>
<idno type="istex">35F40709B6AD5153018FB3B4194A8283BA78EA78</idno>
<idno type="DOI">10.1007/978-3-540-28640-0_10</idno>
<idno type="ChapterID">10</idno>
<idno type="ChapterID">Chap10</idno>
</biblStruct>
</sourceDesc>
<seriesStmt><idno type="ISSN">0302-9743</idno>
</seriesStmt>
</fileDesc>
<profileDesc><textClass></textClass>
<langUsage><language ident="en">en</language>
</langUsage>
</profileDesc>
</teiHeader>
<front><div type="abstract" xml:lang="en">Abstract: Historical document collections are a valuable resource for human history. This paper proposes a novel digital image binarization scheme for low quality historical documents allowing further content exploitation in an efficient way. The proposed scheme consists of five distinct steps: a pre-processing procedure using a low-pass Wiener filter, a rough estimation of foreground regions using Niblack’s approach, a background surface calculation by interpolating neighboring background intensities, a thresholding by combining the calculated background surface with the original image and finally a post-processing step in order to improve the quality of text regions and preserve stroke connectivity. The proposed methodology works with great success even in cases of historical manuscripts with poor quality, shadows, nonuniform illumination, low contrast, large signal-dependent noise, smear and strain. After testing the proposed method on numerous low quality historical manuscripts, it has turned out that our methodology performs better compared to current state-of-the-art adaptive thresholding techniques.</div>
</front>
</TEI>
</record>
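For readers who want to pull fields out of a record like the one above without Dilib, here is a minimal sketch using Python's standard `xml.etree.ElementTree`. The record uses the `wicri:` prefix without declaring it, which a strict XML parser rejects, so the sketch binds the prefix to a placeholder URI first. The function name `parse_record` and the URI `urn:example:wicri` are assumptions made for the example, and the embedded record is a trimmed copy kept short for illustration.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the record above; only a few fields are kept for brevity.
RECORD = """<record><TEI wicri:istexFullTextTei="biblStruct"><teiHeader><fileDesc><titleStmt>
<title xml:lang="en">An Adaptive Binarization Technique for Low Quality Historical Documents</title>
<author><name sortKey="Gatos, Basilios">Basilios Gatos</name></author>
<author><name sortKey="Pratikakis, Ioannis">Ioannis Pratikakis</name></author>
<author><name sortKey="Perantonis, J">J. Perantonis</name></author>
</titleStmt>
<publicationStmt><idno type="doi">10.1007/978-3-540-28640-0_10</idno></publicationStmt>
</fileDesc></teiHeader></TEI></record>"""


def parse_record(text, wicri_uri="urn:example:wicri"):
    """Extract title, authors and DOI from a Wicri-style record string."""
    # The wicri: prefix is undeclared in the record, so bind it to a
    # placeholder URI (an assumption) before handing the text to the parser.
    bound = text.replace("<record>", f'<record xmlns:wicri="{wicri_uri}">', 1)
    root = ET.fromstring(bound)
    title = root.findtext(".//titleStmt/title")
    authors = [name.text for name in root.findall(".//titleStmt/author/name")]
    doi = root.findtext(".//idno[@type='doi']")
    return {"title": title, "authors": authors, "doi": doi}


info = parse_record(RECORD)
print(info["title"])
print("; ".join(info["authors"]))
print(info["doi"])
```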
To manipulate this document under Unix (Dilib)
EXPLOR_STEP=$WICRI_ROOT/Ticri/CIDE/explor/OcrV1/Data/Main/Merge
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 001630 | SxmlIndent | more
Or
HfdSelect -h $EXPLOR_AREA/Data/Main/Merge/biblio.hfd -nk 001630 | SxmlIndent | more
To link to this page within the Wicri network
{{Explor lien |wiki= Ticri/CIDE |area= OcrV1 |flux= Main |étape= Merge |type= RBID |clé= ISTEX:35F40709B6AD5153018FB3B4194A8283BA78EA78 |texte= An Adaptive Binarization Technique for Low Quality Historical Documents }}
This area was generated with Dilib version V0.6.32.