OCR exploration server

Warning: this site is under development!
Warning: this site was generated automatically from raw corpora.
The information has therefore not been validated.

Composition of a Dewarped and Enhanced Document Image From Two View Images

Internal identifier: 000558 (PascalFrancis/Curation); previous: 000557; next: 000559

Authors: HYUNG IL KOO [South Korea]; Jinho Kim [South Korea]; NAM IK CHO [South Korea]

Source: IEEE transactions on image processing [ISSN 1057-7149]; 2009, vol. 18, no. 7, pp. 1551-1562

RBID : Pascal:09-0350330

Abstract

In this paper, we propose an algorithm to compose a geometrically dewarped and visually enhanced image from two document images taken by a digital camera at different angles. Unlike conventional works that require special equipment, assumptions about the contents of books, or complicated image acquisition steps, we estimate the unfolded book or document surface from the corresponding points between the two images. For this purpose, the surface and camera matrices are estimated using structure reconstruction, 3-D projection analysis, and random sample consensus-based curve fitting with the cylindrical surface model. Because we do not need any assumptions about the contents of books, the proposed method can be applied not only to optical character recognition (OCR), but also to the high-quality digitization of pictures in documents. In addition to dewarping for a structurally better image, image mosaicking is also performed to further improve the visual quality. By finding the better parts of the images (with less out-of-focus blur and/or without specular reflections) from either view, we compose a better image by stitching and blending them. These processes are formulated as energy minimization problems that can be solved using a graph cut method. Experiments on many kinds of book and document images show that the proposed algorithm works robustly and yields visually pleasing results. Also, the OCR rate of the resulting image is comparable to that of document images from a flatbed scanner.
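
The abstract mentions random sample consensus (RANSAC)-based curve fitting. As a rough illustration only, and not the authors' implementation, the following Python sketch fits a parabola to 2-D points with RANSAC. The parabola model, the function names (lagrange_quadratic, ransac_parabola), and the parameters n_iter, tol, and seed are all assumptions made for this example; the paper's cylindrical surface model is not reproduced here.

```python
import random


def lagrange_quadratic(p0, p1, p2):
    """Return f(x) for the unique parabola through three points with distinct x."""
    (x0, y0), (x1, y1), (x2, y2) = p0, p1, p2

    def f(x):
        # Lagrange interpolation formula for three nodes.
        return (y0 * (x - x1) * (x - x2) / ((x0 - x1) * (x0 - x2))
                + y1 * (x - x0) * (x - x2) / ((x1 - x0) * (x1 - x2))
                + y2 * (x - x0) * (x - x1) / ((x2 - x0) * (x2 - x1)))

    return f


def ransac_parabola(points, n_iter=200, tol=0.05, seed=0):
    """Hypothetical helper: keep the sampled parabola with the most inliers.

    points: list of (x, y) tuples; tol: maximum vertical residual for an inlier.
    Returns (model, inliers); model is None if every sample was degenerate.
    """
    rng = random.Random(seed)
    best_model, best_inliers = None, []
    for _ in range(n_iter):
        sample = rng.sample(points, 3)
        if len({x for x, _ in sample}) < 3:
            continue  # repeated x values: no unique parabola, skip this sample
        model = lagrange_quadratic(*sample)
        inliers = [(x, y) for x, y in points if abs(model(x) - y) <= tol]
        if len(inliers) > len(best_inliers):
            best_model, best_inliers = model, inliers
    return best_model, best_inliers
```

A practical pipeline would typically refit the curve to all inliers (for example by least squares) once the best consensus set is found; that step is omitted here to keep the sketch short.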
pA  
A01 01  1    @0 1057-7149
A03   1    @0 IEEE trans. image process.
A05       @2 18
A06       @2 7
A08 01  1  ENG  @1 Composition of a Dewarped and Enhanced Document Image From Two View Images
A11 01  1    @1 HYUNG IL KOO
A11 02  1    @1 KIM (Jinho)
A11 03  1    @1 NAM IK CHO
A14 01      @1 Department of Electrical Engineering and Computer Science and the INMC, Seoul National University @2 Gwanak-gu, Seoul, 151-744 @3 KOR @Z 1 aut. @Z 3 aut.
A14 02      @1 Multimedia Laboratory, Telecommunication R&D Center, Samsung Electronics Company, Ltd @2 Suwon, Gyeonggi-do, 443-742 @3 KOR @Z 2 aut.
A20       @1 1551-1562
A21       @1 2009
A23 01      @0 ENG
A43 01      @1 INIST @2 22520 @5 354000170936010140
A44       @0 0000 @1 © 2009 INIST-CNRS. All rights reserved.
A45       @0 34 ref.
A47 01  1    @0 09-0350330
A60       @1 P
A61       @0 A
A64 01  1    @0 IEEE transactions on image processing
A66 01      @0 USA
C01 01    ENG  @0 In this paper, we propose an algorithm to compose a geometrically dewarped and visually enhanced image from two document images taken by a digital camera at different angles. Unlike the conventional works that require special equipments or assumptions on the contents of books or complicated image acquisition steps, we estimate the unfolded book or document surface from the corresponding points between two images. For this purpose, the surface and camera matrices are estimated using structure reconstruction, 3-D projection analysis, and random sample consensus-based curve fitting with the cylindrical surface model. Because we do not need any assumption on the contents of books, the proposed method can be applied not only to optical character recognition (OCR), but also to the high-quality digitization of pictures in documents. In addition to the dewarping for a structurally better image, image mosaic is also performed for further improving the visual quality. By finding better parts of images (with less out of focus blur and/or without specular reflections) from either of views, we compose a better image by stitching and blending them. These processes are formulated as energy minimization problems that can be solved using a graph cut method. Experiments on many kinds of book or document images show that the proposed algorithm robustly works and yields visually pleasing results. Also, the OCR rate of the resulting image is comparable to that of document images from a flatbed scanner.
C02 01  X    @0 001D04A05C
C02 02  X    @0 001D04A05A
C02 03  X    @0 001D04A04A2
C03 01  3  FRE  @0 Traitement image document @5 01
C03 01  3  ENG  @0 Document image processing @5 01
C03 02  X  FRE  @0 Algorithme @5 02
C03 02  X  ENG  @0 Algorithm @5 02
C03 02  X  SPA  @0 Algoritmo @5 02
C03 03  X  FRE  @0 Ajustement courbe @5 03
C03 03  X  ENG  @0 Curve fitting @5 03
C03 03  X  SPA  @0 Ajustamiento curva @5 03
C03 04  X  FRE  @0 Forme cylindrique @5 04
C03 04  X  ENG  @0 Cylindrical shape @5 04
C03 04  X  SPA  @0 Forma cilíndrica @5 04
C03 05  X  FRE  @0 Reconnaissance optique caractère @5 05
C03 05  X  ENG  @0 Optical character recognition @5 05
C03 05  X  SPA  @0 Reconocimento óptico de caracteres @5 05
C03 06  X  FRE  @0 Numérisation @5 06
C03 06  X  ENG  @0 Digitizing @5 06
C03 06  X  SPA  @0 Numerización @5 06
C03 07  X  FRE  @0 Qualité image @5 07
C03 07  X  ENG  @0 Image quality @5 07
C03 07  X  SPA  @0 Calidad imagen @5 07
C03 08  X  FRE  @0 Image floue @5 08
C03 08  X  ENG  @0 Blurred image @5 08
C03 08  X  SPA  @0 Imagen borrosa @5 08
C03 09  X  FRE  @0 Réflexion spéculaire @5 09
C03 09  X  ENG  @0 Specular reflection @5 09
C03 09  X  SPA  @0 Reflexión especular @5 09
C03 10  X  FRE  @0 Méthode graphe @5 10
C03 10  X  ENG  @0 Graph method @5 10
C03 10  X  SPA  @0 Método grafo @5 10
C03 11  X  FRE  @0 Coupe graphe @5 11
C03 11  X  ENG  @0 Graph cut @5 11
C03 11  X  SPA  @0 Corte grafo @5 11
C03 12  X  FRE  @0 Estimation robuste @5 12
C03 12  X  ENG  @0 Robust estimation @5 12
C03 12  X  SPA  @0 Estimación robusta @5 12
C03 13  X  FRE  @0 Traitement image @5 31
C03 13  X  ENG  @0 Image processing @5 31
C03 13  X  SPA  @0 Procesamiento imagen @5 31
C03 14  X  FRE  @0 Reconnaissance forme @5 32
C03 14  X  ENG  @0 Pattern recognition @5 32
C03 14  X  SPA  @0 Reconocimiento patrón @5 32
N21       @1 257
N44 01      @1 OTO
N82       @1 OTO
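
The field listing above follows a simple line layout: a field tag, optional indicators, then subfields introduced by "@" plus a one-character code. As a minimal sketch, assuming that layout holds for every line and that "@" never occurs inside a value, such a line could be split as follows in Python. The function name parse_field_line and the regular expression are illustrative assumptions, not part of Dilib or of the INIST format definition.

```python
import re

# Assumed subfield marker: "@", one word character, then whitespace.
SUBFIELD = re.compile(r"@(\w)\s+")


def parse_field_line(line):
    """Split one non-empty field line into (tag, indicators, subfields).

    Example input: 'A21       @1 2009'
    subfields is a list of (code, value) pairs in their original order.
    """
    head, sep, rest = line.partition("@")
    tokens = head.split()
    tag, indicators = tokens[0], tokens[1:]
    if not sep:
        return tag, indicators, []  # line without any subfield, e.g. 'pA'
    # re.split with one capture group yields ['', code1, value1, code2, value2, ...]
    parts = SUBFIELD.split("@" + rest)
    codes, values = parts[1::2], parts[2::2]
    return tag, indicators, [(c, v.strip()) for c, v in zip(codes, values)]
```

For instance, parse_field_line("N21       @1 257") returns ("N21", [], [("1", "257")]). Repeated codes, such as the two @Z subfields on the A14 01 line, are kept as separate pairs.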

Links to Exploration step

Pascal:09-0350330

The document in XML format

<record>
<TEI>
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en" level="a">Composition of a Dewarped and Enhanced Document Image From Two View Images</title>
<author>
<name sortKey="Hyung Il Koo" sort="Hyung Il Koo" uniqKey="Hyung Il Koo" last="Hyung Il Koo">HYUNG IL KOO</name>
<affiliation wicri:level="1">
<inist:fA14 i1="01">
<s1>Department of Electrical Engineering and Computer Science and the INMC, Seoul National University</s1>
<s2>Gwanak-gu, Seoul, 151-744</s2>
<s3>KOR</s3>
<sZ>1 aut.</sZ>
<sZ>3 aut.</sZ>
</inist:fA14>
<country>Corée du Sud</country>
</affiliation>
</author>
<author>
<name sortKey="Kim, Jinho" sort="Kim, Jinho" uniqKey="Kim J" first="Jinho" last="Kim">Jinho Kim</name>
<affiliation wicri:level="1">
<inist:fA14 i1="02">
<s1>Multimedia Laboratory, Telecommunication R&amp;D Center, Samsung Electronics Company, Ltd</s1>
<s2>Suwon, Gyeonggi-do, 443-742</s2>
<s3>KOR</s3>
<sZ>2 aut.</sZ>
</inist:fA14>
<country>Corée du Sud</country>
</affiliation>
</author>
<author>
<name sortKey="Nam Ik Cho" sort="Nam Ik Cho" uniqKey="Nam Ik Cho" last="Nam Ik Cho">NAM IK CHO</name>
<affiliation wicri:level="1">
<inist:fA14 i1="01">
<s1>Department of Electrical Engineering and Computer Science and the INMC, Seoul National University</s1>
<s2>Gwanak-gu, Seoul, 151-744</s2>
<s3>KOR</s3>
<sZ>1 aut.</sZ>
<sZ>3 aut.</sZ>
</inist:fA14>
<country>Corée du Sud</country>
</affiliation>
</author>
</titleStmt>
<publicationStmt>
<idno type="wicri:source">INIST</idno>
<idno type="inist">09-0350330</idno>
<date when="2009">2009</date>
<idno type="stanalyst">PASCAL 09-0350330 INIST</idno>
<idno type="RBID">Pascal:09-0350330</idno>
<idno type="wicri:Area/PascalFrancis/Corpus">000221</idno>
<idno type="wicri:Area/PascalFrancis/Curation">000558</idno>
</publicationStmt>
<sourceDesc>
<biblStruct>
<analytic>
<title xml:lang="en" level="a">Composition of a Dewarped and Enhanced Document Image From Two View Images</title>
<author>
<name sortKey="Hyung Il Koo" sort="Hyung Il Koo" uniqKey="Hyung Il Koo" last="Hyung Il Koo">HYUNG IL KOO</name>
<affiliation wicri:level="1">
<inist:fA14 i1="01">
<s1>Department of Electrical Engineering and Computer Science and the INMC, Seoul National University</s1>
<s2>Gwanak-gu, Seoul, 151-744</s2>
<s3>KOR</s3>
<sZ>1 aut.</sZ>
<sZ>3 aut.</sZ>
</inist:fA14>
<country>Corée du Sud</country>
</affiliation>
</author>
<author>
<name sortKey="Kim, Jinho" sort="Kim, Jinho" uniqKey="Kim J" first="Jinho" last="Kim">Jinho Kim</name>
<affiliation wicri:level="1">
<inist:fA14 i1="02">
<s1>Multimedia Laboratory, Telecommunication R&amp;D Center, Samsung Electronics Company, Ltd</s1>
<s2>Suwon, Gyeonggi-do, 443-742</s2>
<s3>KOR</s3>
<sZ>2 aut.</sZ>
</inist:fA14>
<country>Corée du Sud</country>
</affiliation>
</author>
<author>
<name sortKey="Nam Ik Cho" sort="Nam Ik Cho" uniqKey="Nam Ik Cho" last="Nam Ik Cho">NAM IK CHO</name>
<affiliation wicri:level="1">
<inist:fA14 i1="01">
<s1>Department of Electrical Engineering and Computer Science and the INMC, Seoul National University</s1>
<s2>Gwanak-gu, Seoul, 151-744</s2>
<s3>KOR</s3>
<sZ>1 aut.</sZ>
<sZ>3 aut.</sZ>
</inist:fA14>
<country>Corée du Sud</country>
</affiliation>
</author>
</analytic>
<series>
<title level="j" type="main">IEEE transactions on image processing</title>
<title level="j" type="abbreviated">IEEE trans. image process.</title>
<idno type="ISSN">1057-7149</idno>
<imprint>
<date when="2009">2009</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
<seriesStmt>
<title level="j" type="main">IEEE transactions on image processing</title>
<title level="j" type="abbreviated">IEEE trans. image process.</title>
<idno type="ISSN">1057-7149</idno>
</seriesStmt>
</fileDesc>
<profileDesc>
<textClass>
<keywords scheme="KwdEn" xml:lang="en">
<term>Algorithm</term>
<term>Blurred image</term>
<term>Curve fitting</term>
<term>Cylindrical shape</term>
<term>Digitizing</term>
<term>Document image processing</term>
<term>Graph cut</term>
<term>Graph method</term>
<term>Image processing</term>
<term>Image quality</term>
<term>Optical character recognition</term>
<term>Pattern recognition</term>
<term>Robust estimation</term>
<term>Specular reflection</term>
</keywords>
<keywords scheme="Pascal" xml:lang="fr">
<term>Traitement image document</term>
<term>Algorithme</term>
<term>Ajustement courbe</term>
<term>Forme cylindrique</term>
<term>Reconnaissance optique caractère</term>
<term>Numérisation</term>
<term>Qualité image</term>
<term>Image floue</term>
<term>Réflexion spéculaire</term>
<term>Méthode graphe</term>
<term>Coupe graphe</term>
<term>Estimation robuste</term>
<term>Traitement image</term>
<term>Reconnaissance forme</term>
</keywords>
<keywords scheme="Wicri" type="topic" xml:lang="fr">
<term>Numérisation</term>
</keywords>
</textClass>
</profileDesc>
</teiHeader>
<front>
<div type="abstract" xml:lang="en">In this paper, we propose an algorithm to compose a geometrically dewarped and visually enhanced image from two document images taken by a digital camera at different angles. Unlike the conventional works that require special equipments or assumptions on the contents of books or complicated image acquisition steps, we estimate the unfolded book or document surface from the corresponding points between two images. For this purpose, the surface and camera matrices are estimated using structure reconstruction, 3-D projection analysis, and random sample consensus-based curve fitting with the cylindrical surface model. Because we do not need any assumption on the contents of books, the proposed method can be applied not only to optical character recognition (OCR), but also to the high-quality digitization of pictures in documents. In addition to the dewarping for a structurally better image, image mosaic is also performed for further improving the visual quality. By finding better parts of images (with less out of focus blur and/or without specular reflections) from either of views, we compose a better image by stitching and blending them. These processes are formulated as energy minimization problems that can be solved using a graph cut method. Experiments on many kinds of book or document images show that the proposed algorithm robustly works and yields visually pleasing results. Also, the OCR rate of the resulting image is comparable to that of document images from a flatbed scanner.</div>
</front>
</TEI>
<inist>
<standard h6="B">
<pA>
<fA01 i1="01" i2="1">
<s0>1057-7149</s0>
</fA01>
<fA03 i2="1">
<s0>IEEE trans. image process.</s0>
</fA03>
<fA05>
<s2>18</s2>
</fA05>
<fA06>
<s2>7</s2>
</fA06>
<fA08 i1="01" i2="1" l="ENG">
<s1>Composition of a Dewarped and Enhanced Document Image From Two View Images</s1>
</fA08>
<fA11 i1="01" i2="1">
<s1>HYUNG IL KOO</s1>
</fA11>
<fA11 i1="02" i2="1">
<s1>KIM (Jinho)</s1>
</fA11>
<fA11 i1="03" i2="1">
<s1>NAM IK CHO</s1>
</fA11>
<fA14 i1="01">
<s1>Department of Electrical Engineering and Computer Science and the INMC, Seoul National University</s1>
<s2>Gwanak-gu, Seoul, 151-744</s2>
<s3>KOR</s3>
<sZ>1 aut.</sZ>
<sZ>3 aut.</sZ>
</fA14>
<fA14 i1="02">
<s1>Multimedia Laboratory, Telecommunication R&amp;D Center, Samsung Electronics Company, Ltd</s1>
<s2>Suwon, Gyeonggi-do, 443-742</s2>
<s3>KOR</s3>
<sZ>2 aut.</sZ>
</fA14>
<fA20>
<s1>1551-1562</s1>
</fA20>
<fA21>
<s1>2009</s1>
</fA21>
<fA23 i1="01">
<s0>ENG</s0>
</fA23>
<fA43 i1="01">
<s1>INIST</s1>
<s2>22520</s2>
<s5>354000170936010140</s5>
</fA43>
<fA44>
<s0>0000</s0>
<s1>© 2009 INIST-CNRS. All rights reserved.</s1>
</fA44>
<fA45>
<s0>34 ref.</s0>
</fA45>
<fA47 i1="01" i2="1">
<s0>09-0350330</s0>
</fA47>
<fA60>
<s1>P</s1>
</fA60>
<fA61>
<s0>A</s0>
</fA61>
<fA64 i1="01" i2="1">
<s0>IEEE transactions on image processing</s0>
</fA64>
<fA66 i1="01">
<s0>USA</s0>
</fA66>
<fC01 i1="01" l="ENG">
<s0>In this paper, we propose an algorithm to compose a geometrically dewarped and visually enhanced image from two document images taken by a digital camera at different angles. Unlike the conventional works that require special equipments or assumptions on the contents of books or complicated image acquisition steps, we estimate the unfolded book or document surface from the corresponding points between two images. For this purpose, the surface and camera matrices are estimated using structure reconstruction, 3-D projection analysis, and random sample consensus-based curve fitting with the cylindrical surface model. Because we do not need any assumption on the contents of books, the proposed method can be applied not only to optical character recognition (OCR), but also to the high-quality digitization of pictures in documents. In addition to the dewarping for a structurally better image, image mosaic is also performed for further improving the visual quality. By finding better parts of images (with less out of focus blur and/or without specular reflections) from either of views, we compose a better image by stitching and blending them. These processes are formulated as energy minimization problems that can be solved using a graph cut method. Experiments on many kinds of book or document images show that the proposed algorithm robustly works and yields visually pleasing results. Also, the OCR rate of the resulting image is comparable to that of document images from a flatbed scanner.</s0>
</fC01>
<fC02 i1="01" i2="X">
<s0>001D04A05C</s0>
</fC02>
<fC02 i1="02" i2="X">
<s0>001D04A05A</s0>
</fC02>
<fC02 i1="03" i2="X">
<s0>001D04A04A2</s0>
</fC02>
<fC03 i1="01" i2="3" l="FRE">
<s0>Traitement image document</s0>
<s5>01</s5>
</fC03>
<fC03 i1="01" i2="3" l="ENG">
<s0>Document image processing</s0>
<s5>01</s5>
</fC03>
<fC03 i1="02" i2="X" l="FRE">
<s0>Algorithme</s0>
<s5>02</s5>
</fC03>
<fC03 i1="02" i2="X" l="ENG">
<s0>Algorithm</s0>
<s5>02</s5>
</fC03>
<fC03 i1="02" i2="X" l="SPA">
<s0>Algoritmo</s0>
<s5>02</s5>
</fC03>
<fC03 i1="03" i2="X" l="FRE">
<s0>Ajustement courbe</s0>
<s5>03</s5>
</fC03>
<fC03 i1="03" i2="X" l="ENG">
<s0>Curve fitting</s0>
<s5>03</s5>
</fC03>
<fC03 i1="03" i2="X" l="SPA">
<s0>Ajustamiento curva</s0>
<s5>03</s5>
</fC03>
<fC03 i1="04" i2="X" l="FRE">
<s0>Forme cylindrique</s0>
<s5>04</s5>
</fC03>
<fC03 i1="04" i2="X" l="ENG">
<s0>Cylindrical shape</s0>
<s5>04</s5>
</fC03>
<fC03 i1="04" i2="X" l="SPA">
<s0>Forma cilíndrica</s0>
<s5>04</s5>
</fC03>
<fC03 i1="05" i2="X" l="FRE">
<s0>Reconnaissance optique caractère</s0>
<s5>05</s5>
</fC03>
<fC03 i1="05" i2="X" l="ENG">
<s0>Optical character recognition</s0>
<s5>05</s5>
</fC03>
<fC03 i1="05" i2="X" l="SPA">
<s0>Reconocimento óptico de caracteres</s0>
<s5>05</s5>
</fC03>
<fC03 i1="06" i2="X" l="FRE">
<s0>Numérisation</s0>
<s5>06</s5>
</fC03>
<fC03 i1="06" i2="X" l="ENG">
<s0>Digitizing</s0>
<s5>06</s5>
</fC03>
<fC03 i1="06" i2="X" l="SPA">
<s0>Numerización</s0>
<s5>06</s5>
</fC03>
<fC03 i1="07" i2="X" l="FRE">
<s0>Qualité image</s0>
<s5>07</s5>
</fC03>
<fC03 i1="07" i2="X" l="ENG">
<s0>Image quality</s0>
<s5>07</s5>
</fC03>
<fC03 i1="07" i2="X" l="SPA">
<s0>Calidad imagen</s0>
<s5>07</s5>
</fC03>
<fC03 i1="08" i2="X" l="FRE">
<s0>Image floue</s0>
<s5>08</s5>
</fC03>
<fC03 i1="08" i2="X" l="ENG">
<s0>Blurred image</s0>
<s5>08</s5>
</fC03>
<fC03 i1="08" i2="X" l="SPA">
<s0>Imagen borrosa</s0>
<s5>08</s5>
</fC03>
<fC03 i1="09" i2="X" l="FRE">
<s0>Réflexion spéculaire</s0>
<s5>09</s5>
</fC03>
<fC03 i1="09" i2="X" l="ENG">
<s0>Specular reflection</s0>
<s5>09</s5>
</fC03>
<fC03 i1="09" i2="X" l="SPA">
<s0>Reflexión especular</s0>
<s5>09</s5>
</fC03>
<fC03 i1="10" i2="X" l="FRE">
<s0>Méthode graphe</s0>
<s5>10</s5>
</fC03>
<fC03 i1="10" i2="X" l="ENG">
<s0>Graph method</s0>
<s5>10</s5>
</fC03>
<fC03 i1="10" i2="X" l="SPA">
<s0>Método grafo</s0>
<s5>10</s5>
</fC03>
<fC03 i1="11" i2="X" l="FRE">
<s0>Coupe graphe</s0>
<s5>11</s5>
</fC03>
<fC03 i1="11" i2="X" l="ENG">
<s0>Graph cut</s0>
<s5>11</s5>
</fC03>
<fC03 i1="11" i2="X" l="SPA">
<s0>Corte grafo</s0>
<s5>11</s5>
</fC03>
<fC03 i1="12" i2="X" l="FRE">
<s0>Estimation robuste</s0>
<s5>12</s5>
</fC03>
<fC03 i1="12" i2="X" l="ENG">
<s0>Robust estimation</s0>
<s5>12</s5>
</fC03>
<fC03 i1="12" i2="X" l="SPA">
<s0>Estimación robusta</s0>
<s5>12</s5>
</fC03>
<fC03 i1="13" i2="X" l="FRE">
<s0>Traitement image</s0>
<s5>31</s5>
</fC03>
<fC03 i1="13" i2="X" l="ENG">
<s0>Image processing</s0>
<s5>31</s5>
</fC03>
<fC03 i1="13" i2="X" l="SPA">
<s0>Procesamiento imagen</s0>
<s5>31</s5>
</fC03>
<fC03 i1="14" i2="X" l="FRE">
<s0>Reconnaissance forme</s0>
<s5>32</s5>
</fC03>
<fC03 i1="14" i2="X" l="ENG">
<s0>Pattern recognition</s0>
<s5>32</s5>
</fC03>
<fC03 i1="14" i2="X" l="SPA">
<s0>Reconocimiento patrón</s0>
<s5>32</s5>
</fC03>
<fN21>
<s1>257</s1>
</fN21>
<fN44 i1="01">
<s1>OTO</s1>
</fN44>
<fN82>
<s1>OTO</s1>
</fN82>
</pA>
</standard>
</inist>
</record>

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Ticri/CIDE/explor/OcrV1/Data/PascalFrancis/Curation
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000558 | SxmlIndent | more

Or

HfdSelect -h $EXPLOR_AREA/Data/PascalFrancis/Curation/biblio.hfd -nk 000558 | SxmlIndent | more

To link to this page within the Wicri network

{{Explor lien
   |wiki=    Ticri/CIDE
   |area=    OcrV1
   |flux=    PascalFrancis
   |étape=   Curation
   |type=    RBID
   |clé=     Pascal:09-0350330
   |texte=   Composition of a Dewarped and Enhanced Document Image From Two View Images
}}

Wicri

This area was generated with Dilib version V0.6.32.
Data generation: Sat Nov 11 16:53:45 2017. Site generation: Mon Mar 11 23:15:16 2024