Document Image De-warping Based on Detection of Distorted Text Lines
Internal identifier: 001312 (Main/Curation); previous: 001311; next: 001313
Authors: Lothar Mischke [Germany]; Wolfram Luther [Germany]
Source:
- Lecture Notes in Computer Science [ 0302-9743 ] ; 2005.
French descriptors
- Pascal (Inist)
- Wicri :
- topic : Numérisation.
English descriptors
- KwdEn :
Abstract
Image warping caused by scanning, photocopying or photographing a document is a common problem in the field of document processing and understanding. Distortion within the text documents impairs OCRability and thus strongly decreases the usability of the results. This is one of the major obstacles for automating the process of digitizing printed documents. In this paper we present a novel algorithm which is able to correct document image warping based on the detection of distorted text lines. The proposed solution is used in a recent project of digitizing old, poor quality manuscripts. The algorithm is compared to other published approaches. Experiments with various document samples and the resulting improvements of the text recognition rate achieved by a commercial OCR engine are also presented.
Url:
DOI: 10.1007/11553595_131
Links toward previous steps (curation, corpus...)
- to stream Istex, to step Corpus: Go to this record in the Curation step: 000533
- to stream Istex, to step Curation: Go to this record in the Curation step: 000526
- to stream Istex, to step Checkpoint: Go to this record in the Curation step: 000C21
- to stream Main, to step Merge: Go to this record in the Curation step: 001348
- to stream PascalFrancis, to step Corpus: Go to this record in the Curation step: 000442
- to stream PascalFrancis, to step Curation: Go to this record in the Curation step: 000345
- to stream PascalFrancis, to step Checkpoint: Go to this record in the Curation step: 000414
- to stream Main, to step Merge: Go to this record in the Curation step: 001444
Links to Exploration step
ISTEX:093C2B6E6B98D0B3A0122B40D31AFB8A26AD5ED1
The document in XML format
<record><TEI wicri:istexFullTextTei="biblStruct"><teiHeader><fileDesc><titleStmt><title xml:lang="en">Document Image De-warping Based on Detection of Distorted Text Lines</title>
<author><name sortKey="Mischke, Lothar" sort="Mischke, Lothar" uniqKey="Mischke L" first="Lothar" last="Mischke">Lothar Mischke</name>
</author>
<author><name sortKey="Luther, Wolfram" sort="Luther, Wolfram" uniqKey="Luther W" first="Wolfram" last="Luther">Wolfram Luther</name>
</author>
</titleStmt>
<publicationStmt><idno type="wicri:source">ISTEX</idno>
<idno type="RBID">ISTEX:093C2B6E6B98D0B3A0122B40D31AFB8A26AD5ED1</idno>
<date when="2005" year="2005">2005</date>
<idno type="doi">10.1007/11553595_131</idno>
<idno type="url">https://api.istex.fr/document/093C2B6E6B98D0B3A0122B40D31AFB8A26AD5ED1/fulltext/pdf</idno>
<idno type="wicri:Area/Istex/Corpus">000533</idno>
<idno type="wicri:Area/Istex/Curation">000526</idno>
<idno type="wicri:Area/Istex/Checkpoint">000C21</idno>
<idno type="wicri:doubleKey">0302-9743:2005:Mischke L:document:image:de</idno>
<idno type="wicri:Area/Main/Merge">001348</idno>
<idno type="wicri:source">INIST</idno>
<idno type="RBID">Pascal:05-0420709</idno>
<idno type="wicri:Area/PascalFrancis/Corpus">000442</idno>
<idno type="wicri:Area/PascalFrancis/Curation">000345</idno>
<idno type="wicri:Area/PascalFrancis/Checkpoint">000414</idno>
<idno type="wicri:doubleKey">0302-9743:2005:Mischke L:document:image:de</idno>
<idno type="wicri:Area/Main/Merge">001444</idno>
<idno type="wicri:Area/Main/Curation">001312</idno>
</publicationStmt>
<sourceDesc><biblStruct><analytic><title level="a" type="main" xml:lang="en">Document Image De-warping Based on Detection of Distorted Text Lines</title>
<author><name sortKey="Mischke, Lothar" sort="Mischke, Lothar" uniqKey="Mischke L" first="Lothar" last="Mischke">Lothar Mischke</name>
<affiliation wicri:level="1"><country xml:lang="fr">Allemagne</country>
<wicri:regionArea>Eduard Spranger Vocational School, Vorheider Weg 8, D-59067, Hamm</wicri:regionArea>
<wicri:noRegion>59067, Hamm</wicri:noRegion>
<wicri:noRegion>Hamm</wicri:noRegion>
</affiliation>
<affiliation wicri:level="1"><country wicri:rule="url">Allemagne</country>
</affiliation>
</author>
<author><name sortKey="Luther, Wolfram" sort="Luther, Wolfram" uniqKey="Luther W" first="Wolfram" last="Luther">Wolfram Luther</name>
<affiliation wicri:level="1"><country xml:lang="fr">Allemagne</country>
<wicri:regionArea>Institute of Computer Science and Interactive Systems, University of Duisburg–Essen, Lotharstr. 65, D-47048, Duisburg</wicri:regionArea>
<wicri:noRegion>47048, Duisburg</wicri:noRegion>
<wicri:noRegion>Duisburg</wicri:noRegion>
</affiliation>
<affiliation wicri:level="1"><country wicri:rule="url">Allemagne</country>
</affiliation>
</author>
</analytic>
<monogr></monogr>
<series><title level="s">Lecture Notes in Computer Science</title>
<imprint><date>2005</date>
</imprint>
<idno type="ISSN">0302-9743</idno>
<idno type="eISSN">1611-3349</idno>
<idno type="ISSN">0302-9743</idno>
</series>
<idno type="istex">093C2B6E6B98D0B3A0122B40D31AFB8A26AD5ED1</idno>
<idno type="DOI">10.1007/11553595_131</idno>
<idno type="ChapterID">131</idno>
<idno type="ChapterID">Chap131</idno>
</biblStruct>
</sourceDesc>
<seriesStmt><idno type="ISSN">0302-9743</idno>
</seriesStmt>
</fileDesc>
<profileDesc><textClass><keywords scheme="KwdEn" xml:lang="en"><term>Character recognition</term>
<term>Digitizing</term>
<term>Document processing</term>
<term>Image interpretation</term>
<term>Optical character recognition</term>
<term>Pattern recognition</term>
<term>Printed character</term>
<term>Printed document</term>
<term>Text</term>
<term>Usability</term>
<term>Warping</term>
</keywords>
<keywords scheme="Pascal" xml:lang="fr"><term>Caractère imprimé</term>
<term>Document imprimé</term>
<term>Gauchissement</term>
<term>Interprétation image</term>
<term>Numérisation</term>
<term>Reconnaissance caractère</term>
<term>Reconnaissance forme</term>
<term>Reconnaissance optique caractère</term>
<term>Texte</term>
<term>Traitement document</term>
<term>Utilisabilité</term>
</keywords>
<keywords scheme="Wicri" type="topic" xml:lang="fr"><term>Numérisation</term>
</keywords>
</textClass>
<langUsage><language ident="en">en</language>
</langUsage>
</profileDesc>
</teiHeader>
<front><div type="abstract" xml:lang="en">Abstract: Image warping caused by scanning, photocopying or photographing a document is a common problem in the field of document processing and understanding. Distortion within the text documents impairs OCRability and thus strongly decreases the usability of the results. This is one of the major obstacles for automating the process of digitizing printed documents. In this paper we present a novel algorithm which is able to correct document image warping based on the detection of distorted text lines. The proposed solution is used in a recent project of digitizing old, poor quality manuscripts. The algorithm is compared to other published approaches. Experiments with various document samples and the resulting improvements of the text recognition rate achieved by a commercial OCR engine are also presented.</div>
</front>
</TEI>
<double idat="0302-9743:2005:Mischke L:document:image:de"><INIST><TEI><teiHeader><fileDesc><titleStmt><title xml:lang="en" level="a">Document image de-warping based on detection of distorted text lines</title>
<author><name sortKey="Mischke, Lothar" sort="Mischke, Lothar" uniqKey="Mischke L" first="Lothar" last="Mischke">Lothar Mischke</name>
<affiliation wicri:level="1"><inist:fA14 i1="01"><s1>Eduard Spranger Vocational School, Vorheider Weg 8</s1>
<s2>59067 Hamm</s2>
<s3>DEU</s3>
<sZ>1 aut.</sZ>
</inist:fA14>
<country>Allemagne</country>
<wicri:noRegion>59067 Hamm</wicri:noRegion>
<wicri:noRegion>Vorheider Weg 8</wicri:noRegion>
<wicri:noRegion>59067 Hamm</wicri:noRegion>
</affiliation>
</author>
<author><name sortKey="Luther, Wolfram" sort="Luther, Wolfram" uniqKey="Luther W" first="Wolfram" last="Luther">Wolfram Luther</name>
<affiliation wicri:level="1"><inist:fA14 i1="02"><s1>Institute of Computer Science and Interactive Systems, University of Duisburg-Essen, Lotharstr. 65</s1>
<s2>47048 Duisburg</s2>
<s3>DEU</s3>
<sZ>2 aut.</sZ>
</inist:fA14>
<country>Allemagne</country>
<wicri:noRegion>47048 Duisburg</wicri:noRegion>
<wicri:noRegion>Lotharstr. 65</wicri:noRegion>
<wicri:noRegion>47048 Duisburg</wicri:noRegion>
</affiliation>
</author>
</titleStmt>
<publicationStmt><idno type="wicri:source">INIST</idno>
<idno type="inist">05-0420709</idno>
<date when="2005">2005</date>
<idno type="stanalyst">PASCAL 05-0420709 INIST</idno>
<idno type="RBID">Pascal:05-0420709</idno>
<idno type="wicri:Area/PascalFrancis/Corpus">000442</idno>
<idno type="wicri:Area/PascalFrancis/Curation">000345</idno>
<idno type="wicri:Area/PascalFrancis/Checkpoint">000414</idno>
<idno type="wicri:doubleKey">0302-9743:2005:Mischke L:document:image:de</idno>
<idno type="wicri:Area/Main/Merge">001444</idno>
</publicationStmt>
<sourceDesc><biblStruct><analytic><title xml:lang="en" level="a">Document image de-warping based on detection of distorted text lines</title>
<author><name sortKey="Mischke, Lothar" sort="Mischke, Lothar" uniqKey="Mischke L" first="Lothar" last="Mischke">Lothar Mischke</name>
<affiliation wicri:level="1"><inist:fA14 i1="01"><s1>Eduard Spranger Vocational School, Vorheider Weg 8</s1>
<s2>59067 Hamm</s2>
<s3>DEU</s3>
<sZ>1 aut.</sZ>
</inist:fA14>
<country>Allemagne</country>
<wicri:noRegion>59067 Hamm</wicri:noRegion>
<wicri:noRegion>Vorheider Weg 8</wicri:noRegion>
<wicri:noRegion>59067 Hamm</wicri:noRegion>
</affiliation>
</author>
<author><name sortKey="Luther, Wolfram" sort="Luther, Wolfram" uniqKey="Luther W" first="Wolfram" last="Luther">Wolfram Luther</name>
<affiliation wicri:level="1"><inist:fA14 i1="02"><s1>Institute of Computer Science and Interactive Systems, University of Duisburg-Essen, Lotharstr. 65</s1>
<s2>47048 Duisburg</s2>
<s3>DEU</s3>
<sZ>2 aut.</sZ>
</inist:fA14>
<country>Allemagne</country>
<wicri:noRegion>47048 Duisburg</wicri:noRegion>
<wicri:noRegion>Lotharstr. 65</wicri:noRegion>
<wicri:noRegion>47048 Duisburg</wicri:noRegion>
</affiliation>
</author>
</analytic>
<series><title level="j" type="main">Lecture notes in computer science</title>
<idno type="ISSN">0302-9743</idno>
<imprint><date when="2005">2005</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
<seriesStmt><title level="j" type="main">Lecture notes in computer science</title>
<idno type="ISSN">0302-9743</idno>
</seriesStmt>
</fileDesc>
<profileDesc><textClass><keywords scheme="KwdEn" xml:lang="en"><term>Character recognition</term>
<term>Digitizing</term>
<term>Document processing</term>
<term>Image interpretation</term>
<term>Optical character recognition</term>
<term>Pattern recognition</term>
<term>Printed character</term>
<term>Printed document</term>
<term>Text</term>
<term>Usability</term>
<term>Warping</term>
</keywords>
<keywords scheme="Pascal" xml:lang="fr"><term>Texte</term>
<term>Interprétation image</term>
<term>Traitement document</term>
<term>Numérisation</term>
<term>Reconnaissance caractère</term>
<term>Reconnaissance forme</term>
<term>Reconnaissance optique caractère</term>
<term>Gauchissement</term>
<term>Utilisabilité</term>
<term>Caractère imprimé</term>
<term>Document imprimé</term>
</keywords>
<keywords scheme="Wicri" type="topic" xml:lang="fr"><term>Numérisation</term>
</keywords>
</textClass>
</profileDesc>
</teiHeader>
<front><div type="abstract" xml:lang="en">Image warping caused by scanning, photocopying or photographing a document is a common problem in the field of document processing and understanding. Distortion within the text documents impairs OCRability and thus strongly decreases the usability of the results. This is one of the major obstacles for automating the process of digitizing printed documents. In this paper we present a novel algorithm which is able to correct document image warping based on the detection of distorted text lines. The proposed solution is used in a recent project of digitizing old, poor quality manuscripts. The algorithm is compared to other published approaches. Experiments with various document samples and the resulting improvements of the text recognition rate achieved by a commercial OCR engine are also presented.</div>
</front>
</TEI>
</INIST>
<ISTEX><TEI wicri:istexFullTextTei="biblStruct"><teiHeader><fileDesc><titleStmt><title xml:lang="en">Document Image De-warping Based on Detection of Distorted Text Lines</title>
<author><name sortKey="Mischke, Lothar" sort="Mischke, Lothar" uniqKey="Mischke L" first="Lothar" last="Mischke">Lothar Mischke</name>
</author>
<author><name sortKey="Luther, Wolfram" sort="Luther, Wolfram" uniqKey="Luther W" first="Wolfram" last="Luther">Wolfram Luther</name>
</author>
</titleStmt>
<publicationStmt><idno type="wicri:source">ISTEX</idno>
<idno type="RBID">ISTEX:093C2B6E6B98D0B3A0122B40D31AFB8A26AD5ED1</idno>
<date when="2005" year="2005">2005</date>
<idno type="doi">10.1007/11553595_131</idno>
<idno type="url">https://api.istex.fr/document/093C2B6E6B98D0B3A0122B40D31AFB8A26AD5ED1/fulltext/pdf</idno>
<idno type="wicri:Area/Istex/Corpus">000533</idno>
<idno type="wicri:Area/Istex/Curation">000526</idno>
<idno type="wicri:Area/Istex/Checkpoint">000C21</idno>
<idno type="wicri:doubleKey">0302-9743:2005:Mischke L:document:image:de</idno>
<idno type="wicri:Area/Main/Merge">001348</idno>
</publicationStmt>
<sourceDesc><biblStruct><analytic><title level="a" type="main" xml:lang="en">Document Image De-warping Based on Detection of Distorted Text Lines</title>
<author><name sortKey="Mischke, Lothar" sort="Mischke, Lothar" uniqKey="Mischke L" first="Lothar" last="Mischke">Lothar Mischke</name>
<affiliation wicri:level="1"><country xml:lang="fr">Allemagne</country>
<wicri:regionArea>Eduard Spranger Vocational School, Vorheider Weg 8, D-59067, Hamm</wicri:regionArea>
<wicri:noRegion>59067, Hamm</wicri:noRegion>
<wicri:noRegion>Hamm</wicri:noRegion>
</affiliation>
<affiliation wicri:level="1"><country wicri:rule="url">Allemagne</country>
</affiliation>
</author>
<author><name sortKey="Luther, Wolfram" sort="Luther, Wolfram" uniqKey="Luther W" first="Wolfram" last="Luther">Wolfram Luther</name>
<affiliation wicri:level="1"><country xml:lang="fr">Allemagne</country>
<wicri:regionArea>Institute of Computer Science and Interactive Systems, University of Duisburg–Essen, Lotharstr. 65, D-47048, Duisburg</wicri:regionArea>
<wicri:noRegion>47048, Duisburg</wicri:noRegion>
<wicri:noRegion>Duisburg</wicri:noRegion>
</affiliation>
<affiliation wicri:level="1"><country wicri:rule="url">Allemagne</country>
</affiliation>
</author>
</analytic>
<monogr></monogr>
<series><title level="s">Lecture Notes in Computer Science</title>
<imprint><date>2005</date>
</imprint>
<idno type="ISSN">0302-9743</idno>
<idno type="eISSN">1611-3349</idno>
<idno type="ISSN">0302-9743</idno>
</series>
<idno type="istex">093C2B6E6B98D0B3A0122B40D31AFB8A26AD5ED1</idno>
<idno type="DOI">10.1007/11553595_131</idno>
<idno type="ChapterID">131</idno>
<idno type="ChapterID">Chap131</idno>
</biblStruct>
</sourceDesc>
<seriesStmt><idno type="ISSN">0302-9743</idno>
</seriesStmt>
</fileDesc>
<profileDesc><textClass></textClass>
<langUsage><language ident="en">en</language>
</langUsage>
</profileDesc>
</teiHeader>
<front><div type="abstract" xml:lang="en">Abstract: Image warping caused by scanning, photocopying or photographing a document is a common problem in the field of document processing and understanding. Distortion within the text documents impairs OCRability and thus strongly decreases the usability of the results. This is one of the major obstacles for automating the process of digitizing printed documents. In this paper we present a novel algorithm which is able to correct document image warping based on the detection of distorted text lines. The proposed solution is used in a recent project of digitizing old, poor quality manuscripts. The algorithm is compared to other published approaches. Experiments with various document samples and the resulting improvements of the text recognition rate achieved by a commercial OCR engine are also presented.</div>
</front>
</TEI>
</ISTEX>
</double>
</record>
To work with this document under Unix (Dilib)
EXPLOR_STEP=$WICRI_ROOT/Ticri/CIDE/explor/OcrV1/Data/Main/Curation
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 001312 | SxmlIndent | more
Or
HfdSelect -h $EXPLOR_AREA/Data/Main/Curation/biblio.hfd -nk 001312 | SxmlIndent | more
To link to this page within the Wicri network
{{Explor lien |wiki= Ticri/CIDE |area= OcrV1 |flux= Main |étape= Curation |type= RBID |clé= ISTEX:093C2B6E6B98D0B3A0122B40D31AFB8A26AD5ED1 |texte= Document Image De-warping Based on Detection of Distorted Text Lines }}
This area was generated with Dilib version V0.6.32.