Integrated text and line-art extraction from a topographic map
Internal identifier: 000770 (PascalFrancis/Corpus); previous: 000769; next: 000771
Authors: LUYANG LI; G. Nagy; A. Samal; S. Seth; YIHONG XU
Source:
- International journal on document analysis and recognition : (Print) [ 1433-2833 ] ; 2000.
French descriptors
- Pascal (Inist)
- Localisation, Interface utilisateur, Reconnaissance caractère, Représentation graphique, Mode conversationnel, Adaptation, Système information géographique, Reconnaissance graphique, Segmentation, Lecteur optique, Texte, Interface graphique, Etude expérimentale, Rue, Taux erreur, Traitement coopératif, Traitement forme.
English descriptors
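- KwdEn: Adaptation, Character recognition, Error rate, Experimental study, Geographic information system, Graphical interface, Graphical recognition, Graphics, Interactive mode, Localization, Optical reader, Segmentation, Street, Text, User interface.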
Abstract
Our proposed approach to text and line-art extraction requires accurately locating a text-string box and identifying external line vectors incident on the box. The results of extrapolating these vectors inside the box are passed to an experimental single-font optical character reader (OCR) program, specifically trained for the font used for street labels. In the first evaluation experiment, automated techniques are used to identify the boxes and the line vectors. In the second, more comprehensive, experiment an operator marks these using a graphical user interface. OCR results on 544 instances of overlapped street-name boxes show the following improvements due to the integrated processing: the error rate is reduced from 4.1% to 2.0% for characters and from 11.8% to 6.4% for words.
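The abstract's step of extrapolating external line vectors inside the text-string box can be illustrated with a small geometric sketch. This is not the authors' implementation: it assumes an axis-aligned box, assumes that a line continues straight through the label, and uses ordinary slab-based line/box clipping. The names `extrapolate_into_box`, `Point`, and `Box` are hypothetical.

```python
from typing import Optional, Tuple

Point = Tuple[float, float]
Box = Tuple[float, float, float, float]  # (xmin, ymin, xmax, ymax), assumed axis-aligned


def extrapolate_into_box(p0: Point, p1: Point, box: Box) -> Optional[Tuple[Point, Point]]:
    """Extend the straight line through p0 -> p1 and return the part inside `box`.

    Returns the (entry, exit) points of the extended line, or None if the
    line never crosses the box. Illustrative sketch only.
    """
    xmin, ymin, xmax, ymax = box
    x0, y0 = p0
    dx, dy = p1[0] - x0, p1[1] - y0
    if dx == 0 and dy == 0:
        return None  # a single point has no direction to extrapolate

    t_lo, t_hi = float("-inf"), float("inf")
    # Slab method: intersect the parameter ranges allowed by the x and y extents.
    for origin, delta, lo, hi in ((x0, dx, xmin, xmax), (y0, dy, ymin, ymax)):
        if delta == 0:
            # Line is parallel to this pair of edges: it must already lie between them.
            if origin < lo or origin > hi:
                return None
            continue
        t_a, t_b = (lo - origin) / delta, (hi - origin) / delta
        t_lo = max(t_lo, min(t_a, t_b))
        t_hi = min(t_hi, max(t_a, t_b))

    if t_lo > t_hi:
        return None  # the extended line misses the box
    entry = (x0 + t_lo * dx, y0 + t_lo * dy)
    exit_ = (x0 + t_hi * dx, y0 + t_hi * dy)
    return entry, exit_


# A street line arriving horizontally at the left edge of a label box.
label_box = (10.0, 10.0, 50.0, 20.0)
print(extrapolate_into_box((0.0, 15.0), (10.0, 15.0), label_box))
```

In a pipeline of the kind the abstract describes, the returned segment would mark where line-art pixels are expected inside the box, so that a character recognizer could treat them differently from text strokes. How the paper's OCR program actually uses this information is not stated in this record.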
Record in standard format (ISO 2709)
See the documentation on the Inist Standard format.
Inist format (server)
NO : | PASCAL 00-0341139 INIST |
---|---|
ET : | Integrated text and line-art extraction from a topographic map |
AU : | LUYANG LI; NAGY (G.); SAMAL (A.); SETH (S.); YIHONG XU |
AF : | Panasonic Information and Networking Laboratories/Princeton, NY/Etats-Unis (1 aut.); Rensselaer Polytechnic Institute/Troy, NY/Etats-Unis (2 aut.); University of Nebraska - Lincoln, Department of Computer Science/Lincoln, NE 68588/Etats-Unis (3 aut., 4 aut.); Hewlett-Packard Laboratories/Palo Alto, CA/Etats-Unis (5 aut.) |
DT : | Publication en série; Niveau analytique |
SO : | International journal on document analysis and recognition : (Print); ISSN 1433-2833; Allemagne; Da. 2000; Vol. 2; No. 4; Pp. 177-185; Bibl. 26 ref. |
LA : | Anglais |
EA : | Our proposed approach to text and line-art extraction requires accurately locating a text-string box and identifying external line vectors incident on the box. The results of extrapolating these vectors inside the box are passed to an experimental single-font optical character reader (OCR) program, specifically trained for the font used for street labels. In the first evaluation experiment, automated techniques are used to identify the boxes and the line vectors. In the second, more comprehensive, experiment an operator marks these using a graphical user interface. OCR results on 544 instances of overlapped street-name boxes show the following improvements due to the integrated processing: the error rate is reduced from 4.1% to 2.0% for characters and from 11.8% to 6.4% for words. |
CC : | 001D02C03; 001E01J02; 224B02 |
FD : | Localisation; Interface utilisateur; Reconnaissance caractère; Représentation graphique; Mode conversationnel; Adaptation; Système information géographique; Reconnaissance graphique; Segmentation; Lecteur optique; Texte; Interface graphique; Etude expérimentale; Rue; Taux erreur; Traitement coopératif; Traitement forme |
ED : | Localization; User interface; Character recognition; Graphics; Interactive mode; Adaptation; Geographic information system; Graphical recognition; Segmentation; Optical reader; Text; Graphical interface; Experimental study; Street; Error rate |
SD : | Localización; Interfase usuario; Reconocimiento carácter; Representación gráfica; Modo conversacional; Adaptación; Sistema información geográfica; Reconocimiento gráfico; Segmentación; Lector óptico; Texto; Interfaz grafica; Estudio experimental; Calle; Indice error |
LO : | INIST-26790.354000082591380030 |
ID : | 00-0341139 |
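
The error rates quoted in the abstract (the EA field above) can also be read as relative reductions. The following Python lines are only a worked calculation on the four published percentages; the function name `relative_reduction` is an illustrative choice and does not come from the paper.

```python
def relative_reduction(before: float, after: float) -> float:
    """Fraction of the original error rate that was eliminated."""
    if before <= 0:
        raise ValueError("the baseline error rate must be positive")
    return (before - after) / before


# Percentages as reported in the abstract.
char_reduction = relative_reduction(4.1, 2.0)
word_reduction = relative_reduction(11.8, 6.4)

print(f"characters: {char_reduction:.1%}")  # characters: 51.2%
print(f"words: {word_reduction:.1%}")       # words: 45.8%
```

In other words, on the 544 overlapped street-name boxes the integrated processing removes roughly half of the character errors and a little under half of the word errors.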
Links to Exploration step
Pascal:00-0341139
The document in XML format
<record><TEI><teiHeader><fileDesc><titleStmt><title xml:lang="en" level="a">Integrated text and line-art extraction from a topographic map</title>
<author><name sortKey="Luyang Li" sort="Luyang Li" uniqKey="Luyang Li" last="Luyang Li">LUYANG LI</name>
<affiliation><inist:fA14 i1="01"><s1>Panasonic Information and Networking Laboratories</s1>
<s2>Princeton, NY</s2>
<s3>USA</s3>
<sZ>1 aut.</sZ>
</inist:fA14>
</affiliation>
</author>
<author><name sortKey="Nagy, G" sort="Nagy, G" uniqKey="Nagy G" first="G." last="Nagy">G. Nagy</name>
<affiliation><inist:fA14 i1="02"><s1>Rensselaer Polytechnic Institute</s1>
<s2>Troy, NY</s2>
<s3>USA</s3>
<sZ>2 aut.</sZ>
</inist:fA14>
</affiliation>
</author>
<author><name sortKey="Samal, A" sort="Samal, A" uniqKey="Samal A" first="A." last="Samal">A. Samal</name>
<affiliation><inist:fA14 i1="03"><s1>University of Nebraska - Lincoln, Department of Computer Science</s1>
<s2>Lincoln, NE 68588</s2>
<s3>USA</s3>
<sZ>3 aut.</sZ>
<sZ>4 aut.</sZ>
</inist:fA14>
</affiliation>
</author>
<author><name sortKey="Seth, S" sort="Seth, S" uniqKey="Seth S" first="S." last="Seth">S. Seth</name>
<affiliation><inist:fA14 i1="03"><s1>University of Nebraska - Lincoln, Department of Computer Science</s1>
<s2>Lincoln, NE 68588</s2>
<s3>USA</s3>
<sZ>3 aut.</sZ>
<sZ>4 aut.</sZ>
</inist:fA14>
</affiliation>
</author>
<author><name sortKey="Yihong Xu" sort="Yihong Xu" uniqKey="Yihong Xu" last="Yihong Xu">YIHONG XU</name>
<affiliation><inist:fA14 i1="04"><s1>Hewlett-Packard Laboratories</s1>
<s2>Palo Alto, CA</s2>
<s3>USA</s3>
<sZ>5 aut.</sZ>
</inist:fA14>
</affiliation>
</author>
</titleStmt>
<publicationStmt><idno type="wicri:source">INIST</idno>
<idno type="inist">00-0341139</idno>
<date when="2000">2000</date>
<idno type="stanalyst">PASCAL 00-0341139 INIST</idno>
<idno type="RBID">Pascal:00-0341139</idno>
<idno type="wicri:Area/PascalFrancis/Corpus">000770</idno>
</publicationStmt>
<sourceDesc><biblStruct><analytic><title xml:lang="en" level="a">Integrated text and line-art extraction from a topographic map</title>
<author><name sortKey="Luyang Li" sort="Luyang Li" uniqKey="Luyang Li" last="Luyang Li">LUYANG LI</name>
<affiliation><inist:fA14 i1="01"><s1>Panasonic Information and Networking Laboratories</s1>
<s2>Princeton, NY</s2>
<s3>USA</s3>
<sZ>1 aut.</sZ>
</inist:fA14>
</affiliation>
</author>
<author><name sortKey="Nagy, G" sort="Nagy, G" uniqKey="Nagy G" first="G." last="Nagy">G. Nagy</name>
<affiliation><inist:fA14 i1="02"><s1>Rensselaer Polytechnic Institute</s1>
<s2>Troy, NY</s2>
<s3>USA</s3>
<sZ>2 aut.</sZ>
</inist:fA14>
</affiliation>
</author>
<author><name sortKey="Samal, A" sort="Samal, A" uniqKey="Samal A" first="A." last="Samal">A. Samal</name>
<affiliation><inist:fA14 i1="03"><s1>University of Nebraska - Lincoln, Department of Computer Science</s1>
<s2>Lincoln, NE 68588</s2>
<s3>USA</s3>
<sZ>3 aut.</sZ>
<sZ>4 aut.</sZ>
</inist:fA14>
</affiliation>
</author>
<author><name sortKey="Seth, S" sort="Seth, S" uniqKey="Seth S" first="S." last="Seth">S. Seth</name>
<affiliation><inist:fA14 i1="03"><s1>University of Nebraska - Lincoln, Department of Computer Science</s1>
<s2>Lincoln, NE 68588</s2>
<s3>USA</s3>
<sZ>3 aut.</sZ>
<sZ>4 aut.</sZ>
</inist:fA14>
</affiliation>
</author>
<author><name sortKey="Yihong Xu" sort="Yihong Xu" uniqKey="Yihong Xu" last="Yihong Xu">YIHONG XU</name>
<affiliation><inist:fA14 i1="04"><s1>Hewlett-Packard Laboratories</s1>
<s2>Palo Alto, CA</s2>
<s3>USA</s3>
<sZ>5 aut.</sZ>
</inist:fA14>
</affiliation>
</author>
</analytic>
<series><title level="j" type="main">International journal on document analysis and recognition : (Print)</title>
<title level="j" type="abbreviated">Int. j. doc. anal. recognit. : (Print)</title>
<idno type="ISSN">1433-2833</idno>
<imprint><date when="2000">2000</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
<seriesStmt><title level="j" type="main">International journal on document analysis and recognition : (Print)</title>
<title level="j" type="abbreviated">Int. j. doc. anal. recognit. : (Print)</title>
<idno type="ISSN">1433-2833</idno>
</seriesStmt>
</fileDesc>
<profileDesc><textClass><keywords scheme="KwdEn" xml:lang="en"><term>Adaptation</term>
<term>Character recognition</term>
<term>Error rate</term>
<term>Experimental study</term>
<term>Geographic information system</term>
<term>Graphical interface</term>
<term>Graphical recognition</term>
<term>Graphics</term>
<term>Interactive mode</term>
<term>Localization</term>
<term>Optical reader</term>
<term>Segmentation</term>
<term>Street</term>
<term>Text</term>
<term>User interface</term>
</keywords>
<keywords scheme="Pascal" xml:lang="fr"><term>Localisation</term>
<term>Interface utilisateur</term>
<term>Reconnaissance caractère</term>
<term>Représentation graphique</term>
<term>Mode conversationnel</term>
<term>Adaptation</term>
<term>Système information géographique</term>
<term>Reconnaissance graphique</term>
<term>Segmentation</term>
<term>Lecteur optique</term>
<term>Texte</term>
<term>Interface graphique</term>
<term>Etude expérimentale</term>
<term>Rue</term>
<term>Taux erreur</term>
<term>Traitement coopératif</term>
<term>Traitement forme</term>
</keywords>
</textClass>
</profileDesc>
</teiHeader>
<front><div type="abstract" xml:lang="en">Our proposed approach to text and line-art extraction requires accurately locating a text-string box and identifying external line vectors incident on the box. The results of extrapolating these vectors inside the box are passed to an experimental single-font optical character reader (OCR) program, specifically trained for the font used for street labels. In the first evaluation experiment, automated techniques are used to identify the boxes and the line vectors. In the second, more comprehensive, experiment an operator marks these using a graphical user interface. OCR results on 544 instances of overlapped street-name boxes show the following improvements due to the integrated processing: the error rate is reduced from 4.1% to 2.0% for characters and from 11.8% to 6.4% for words.</div>
</front>
</TEI>
<inist><standard h6="B"><pA><fA01 i1="01" i2="1"><s0>1433-2833</s0>
</fA01>
<fA03 i2="1"><s0>Int. j. doc. anal. recognit. : (Print)</s0>
</fA03>
<fA05><s2>2</s2>
</fA05>
<fA06><s2>4</s2>
</fA06>
<fA08 i1="01" i2="1" l="ENG"><s1>Integrated text and line-art extraction from a topographic map</s1>
</fA08>
<fA11 i1="01" i2="1"><s1>LUYANG LI</s1>
</fA11>
<fA11 i1="02" i2="1"><s1>NAGY (G.)</s1>
</fA11>
<fA11 i1="03" i2="1"><s1>SAMAL (A.)</s1>
</fA11>
<fA11 i1="04" i2="1"><s1>SETH (S.)</s1>
</fA11>
<fA11 i1="05" i2="1"><s1>YIHONG XU</s1>
</fA11>
<fA14 i1="01"><s1>Panasonic Information and Networking Laboratories</s1>
<s2>Princeton, NY</s2>
<s3>USA</s3>
<sZ>1 aut.</sZ>
</fA14>
<fA14 i1="02"><s1>Rensselaer Polytechnic Institute</s1>
<s2>Troy, NY</s2>
<s3>USA</s3>
<sZ>2 aut.</sZ>
</fA14>
<fA14 i1="03"><s1>University of Nebraska - Lincoln, Department of Computer Science</s1>
<s2>Lincoln, NE 68588</s2>
<s3>USA</s3>
<sZ>3 aut.</sZ>
<sZ>4 aut.</sZ>
</fA14>
<fA14 i1="04"><s1>Hewlett-Packard Laboratories</s1>
<s2>Palo Alto, CA</s2>
<s3>USA</s3>
<sZ>5 aut.</sZ>
</fA14>
<fA20><s1>177-185</s1>
</fA20>
<fA21><s1>2000</s1>
</fA21>
<fA23 i1="01"><s0>ENG</s0>
</fA23>
<fA43 i1="01"><s1>INIST</s1>
<s2>26790</s2>
<s5>354000082591380030</s5>
</fA43>
<fA44><s0>0000</s0>
<s1>© 2000 INIST-CNRS. All rights reserved.</s1>
</fA44>
<fA45><s0>26 ref.</s0>
</fA45>
<fA47 i1="01" i2="1"><s0>00-0341139</s0>
</fA47>
<fA60><s1>P</s1>
</fA60>
<fA61><s0>A</s0>
</fA61>
<fA64 i1="01" i2="1"><s0>International journal on document analysis and recognition : (Print)</s0>
</fA64>
<fA66 i1="01"><s0>DEU</s0>
</fA66>
<fC01 i1="01" l="ENG"><s0>Our proposed approach to text and line-art extraction requires accurately locating a text-string box and identifying external line vectors incident on the box. The results of extrapolating these vectors inside the box are passed to an experimental single-font optical character reader (OCR) program, specifically trained for the font used for street labels. In the first evaluation experiment, automated techniques are used to identify the boxes and the line vectors. In the second, more comprehensive, experiment an operator marks these using a graphical user interface. OCR results on 544 instances of overlapped street-name boxes show the following improvements due to the integrated processing: the error rate is reduced from 4.1% to 2.0% for characters and from 11.8% to 6.4% for words.</s0>
</fC01>
<fC02 i1="01" i2="X"><s0>001D02C03</s0>
</fC02>
<fC02 i1="02" i2="X"><s0>001E01J02</s0>
</fC02>
<fC02 i1="03" i2="2"><s0>224B02</s0>
</fC02>
<fC03 i1="01" i2="X" l="FRE"><s0>Localisation</s0>
<s5>01</s5>
</fC03>
<fC03 i1="01" i2="X" l="ENG"><s0>Localization</s0>
<s5>01</s5>
</fC03>
<fC03 i1="01" i2="X" l="SPA"><s0>Localización</s0>
<s5>01</s5>
</fC03>
<fC03 i1="02" i2="X" l="FRE"><s0>Interface utilisateur</s0>
<s5>02</s5>
</fC03>
<fC03 i1="02" i2="X" l="ENG"><s0>User interface</s0>
<s5>02</s5>
</fC03>
<fC03 i1="02" i2="X" l="SPA"><s0>Interfase usuario</s0>
<s5>02</s5>
</fC03>
<fC03 i1="03" i2="X" l="FRE"><s0>Reconnaissance caractère</s0>
<s5>03</s5>
</fC03>
<fC03 i1="03" i2="X" l="ENG"><s0>Character recognition</s0>
<s5>03</s5>
</fC03>
<fC03 i1="03" i2="X" l="SPA"><s0>Reconocimiento carácter</s0>
<s5>03</s5>
</fC03>
<fC03 i1="04" i2="X" l="FRE"><s0>Représentation graphique</s0>
<s5>04</s5>
</fC03>
<fC03 i1="04" i2="X" l="ENG"><s0>Graphics</s0>
<s5>04</s5>
</fC03>
<fC03 i1="04" i2="X" l="SPA"><s0>Representación gráfica</s0>
<s5>04</s5>
</fC03>
<fC03 i1="05" i2="X" l="FRE"><s0>Mode conversationnel</s0>
<s5>05</s5>
</fC03>
<fC03 i1="05" i2="X" l="ENG"><s0>Interactive mode</s0>
<s5>05</s5>
</fC03>
<fC03 i1="05" i2="X" l="SPA"><s0>Modo conversacional</s0>
<s5>05</s5>
</fC03>
<fC03 i1="06" i2="X" l="FRE"><s0>Adaptation</s0>
<s5>06</s5>
</fC03>
<fC03 i1="06" i2="X" l="ENG"><s0>Adaptation</s0>
<s5>06</s5>
</fC03>
<fC03 i1="06" i2="X" l="SPA"><s0>Adaptación</s0>
<s5>06</s5>
</fC03>
<fC03 i1="07" i2="X" l="FRE"><s0>Système information géographique</s0>
<s5>07</s5>
</fC03>
<fC03 i1="07" i2="X" l="ENG"><s0>Geographic information system</s0>
<s5>07</s5>
</fC03>
<fC03 i1="07" i2="X" l="SPA"><s0>Sistema información geográfica</s0>
<s5>07</s5>
</fC03>
<fC03 i1="08" i2="X" l="FRE"><s0>Reconnaissance graphique</s0>
<s5>08</s5>
</fC03>
<fC03 i1="08" i2="X" l="ENG"><s0>Graphical recognition</s0>
<s5>08</s5>
</fC03>
<fC03 i1="08" i2="X" l="SPA"><s0>Reconocimiento gráfico</s0>
<s5>08</s5>
</fC03>
<fC03 i1="09" i2="X" l="FRE"><s0>Segmentation</s0>
<s5>09</s5>
</fC03>
<fC03 i1="09" i2="X" l="ENG"><s0>Segmentation</s0>
<s5>09</s5>
</fC03>
<fC03 i1="09" i2="X" l="SPA"><s0>Segmentación</s0>
<s5>09</s5>
</fC03>
<fC03 i1="10" i2="X" l="FRE"><s0>Lecteur optique</s0>
<s5>10</s5>
</fC03>
<fC03 i1="10" i2="X" l="ENG"><s0>Optical reader</s0>
<s5>10</s5>
</fC03>
<fC03 i1="10" i2="X" l="SPA"><s0>Lector óptico</s0>
<s5>10</s5>
</fC03>
<fC03 i1="11" i2="X" l="FRE"><s0>Texte</s0>
<s5>11</s5>
</fC03>
<fC03 i1="11" i2="X" l="ENG"><s0>Text</s0>
<s5>11</s5>
</fC03>
<fC03 i1="11" i2="X" l="SPA"><s0>Texto</s0>
<s5>11</s5>
</fC03>
<fC03 i1="12" i2="X" l="FRE"><s0>Interface graphique</s0>
<s5>12</s5>
</fC03>
<fC03 i1="12" i2="X" l="ENG"><s0>Graphical interface</s0>
<s5>12</s5>
</fC03>
<fC03 i1="12" i2="X" l="SPA"><s0>Interfaz grafica</s0>
<s5>12</s5>
</fC03>
<fC03 i1="13" i2="X" l="FRE"><s0>Etude expérimentale</s0>
<s5>13</s5>
</fC03>
<fC03 i1="13" i2="X" l="ENG"><s0>Experimental study</s0>
<s5>13</s5>
</fC03>
<fC03 i1="13" i2="X" l="SPA"><s0>Estudio experimental</s0>
<s5>13</s5>
</fC03>
<fC03 i1="14" i2="X" l="FRE"><s0>Rue</s0>
<s5>14</s5>
</fC03>
<fC03 i1="14" i2="X" l="ENG"><s0>Street</s0>
<s5>14</s5>
</fC03>
<fC03 i1="14" i2="X" l="SPA"><s0>Calle</s0>
<s5>14</s5>
</fC03>
<fC03 i1="15" i2="X" l="FRE"><s0>Taux erreur</s0>
<s5>15</s5>
</fC03>
<fC03 i1="15" i2="X" l="ENG"><s0>Error rate</s0>
<s5>15</s5>
</fC03>
<fC03 i1="15" i2="X" l="SPA"><s0>Indice error</s0>
<s5>15</s5>
</fC03>
<fC03 i1="16" i2="X" l="FRE"><s0>Traitement coopératif</s0>
<s4>INC</s4>
<s5>82</s5>
</fC03>
<fC03 i1="17" i2="X" l="FRE"><s0>Traitement forme</s0>
<s4>INC</s4>
<s5>83</s5>
</fC03>
<fN21><s1>234</s1>
</fN21>
</pA>
</standard>
<server><NO>PASCAL 00-0341139 INIST</NO>
<ET>Integrated text and line-art extraction from a topographic map</ET>
<AU>LUYANG LI; NAGY (G.); SAMAL (A.); SETH (S.); YIHONG XU</AU>
<AF>Panasonic Information and Networking Laboratories/Princeton, NY/Etats-Unis (1 aut.); Rensselaer Polytechnic Institute/Troy, NY/Etats-Unis (2 aut.); University of Nebraska - Lincoln, Department of Computer Science/Lincoln, NE 68588/Etats-Unis (3 aut., 4 aut.); Hewlett-Packard Laboratories/Palo Alto, CA/Etats-Unis (5 aut.)</AF>
<DT>Publication en série; Niveau analytique</DT>
<SO>International journal on document analysis and recognition : (Print); ISSN 1433-2833; Allemagne; Da. 2000; Vol. 2; No. 4; Pp. 177-185; Bibl. 26 ref.</SO>
<LA>Anglais</LA>
<EA>Our proposed approach to text and line-art extraction requires accurately locating a text-string box and identifying external line vectors incident on the box. The results of extrapolating these vectors inside the box are passed to an experimental single-font optical character reader (OCR) program, specifically trained for the font used for street labels. In the first evaluation experiment, automated techniques are used to identify the boxes and the line vectors. In the second, more comprehensive, experiment an operator marks these using a graphical user interface. OCR results on 544 instances of overlapped street-name boxes show the following improvements due to the integrated processing: the error rate is reduced from 4.1% to 2.0% for characters and from 11.8% to 6.4% for words.</EA>
<CC>001D02C03; 001E01J02; 224B02</CC>
<FD>Localisation; Interface utilisateur; Reconnaissance caractère; Représentation graphique; Mode conversationnel; Adaptation; Système information géographique; Reconnaissance graphique; Segmentation; Lecteur optique; Texte; Interface graphique; Etude expérimentale; Rue; Taux erreur; Traitement coopératif; Traitement forme</FD>
<ED>Localization; User interface; Character recognition; Graphics; Interactive mode; Adaptation; Geographic information system; Graphical recognition; Segmentation; Optical reader; Text; Graphical interface; Experimental study; Street; Error rate</ED>
<SD>Localización; Interfase usuario; Reconocimiento carácter; Representación gráfica; Modo conversacional; Adaptación; Sistema información geográfica; Reconocimiento gráfico; Segmentación; Lector óptico; Texto; Interfaz grafica; Estudio experimental; Calle; Indice error</SD>
<LO>INIST-26790.354000082591380030</LO>
<ID>00-0341139</ID>
</server>
</inist>
</record>
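As a reading aid, the `<server>` part of the record above can be loaded into a dictionary with Python's standard `xml.etree.ElementTree`. This is a minimal sketch under stated assumptions: it works on a shortened excerpt of the `<server>` element (the `ED` list is truncated here for brevity), it treats `;` as the separator of multi-valued fields, and the names `parse_server_record` and `MULTI_VALUED` are hypothetical. The full `<record>` is not parsed because its `inist:` prefix has no namespace declaration in the text shown, which a namespace-aware parser may reject.

```python
import xml.etree.ElementTree as ET

# Shortened excerpt of the <server> element from the record above.
SERVER_XML = """<server><NO>PASCAL 00-0341139 INIST</NO>
<ET>Integrated text and line-art extraction from a topographic map</ET>
<AU>LUYANG LI; NAGY (G.); SAMAL (A.); SETH (S.); YIHONG XU</AU>
<SO>International journal on document analysis and recognition : (Print); ISSN 1433-2833; Allemagne; Da. 2000; Vol. 2; No. 4; Pp. 177-185; Bibl. 26 ref.</SO>
<ED>Localization; User interface; Character recognition</ED>
<ID>00-0341139</ID>
</server>"""

# Fields assumed to hold several ';'-separated values (an assumption of this sketch).
MULTI_VALUED = {"AU", "AF", "DT", "SO", "CC", "FD", "ED", "SD"}


def parse_server_record(xml_text: str) -> dict:
    """Map each child tag of <server> to its text, splitting multi-valued fields."""
    root = ET.fromstring(xml_text)
    record = {}
    for child in root:
        text = (child.text or "").strip()
        if child.tag in MULTI_VALUED:
            record[child.tag] = [part.strip() for part in text.split(";") if part.strip()]
        else:
            record[child.tag] = text
    return record


record = parse_server_record(SERVER_XML)
print(record["ET"])
print(record["AU"])
```

Splitting on `;` is convenient for author and descriptor lists, but it would also split any value that legitimately contains a semicolon, so it should be checked against the Inist format documentation before being relied on.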
To manipulate this document under Unix (Dilib)
EXPLOR_STEP=$WICRI_ROOT/Ticri/CIDE/explor/OcrV1/Data/PascalFrancis/Corpus
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000770 | SxmlIndent | more
Or
HfdSelect -h $EXPLOR_AREA/Data/PascalFrancis/Corpus/biblio.hfd -nk 000770 | SxmlIndent | more
To link to this page within the Wicri network
{{Explor lien |wiki= Ticri/CIDE |area= OcrV1 |flux= PascalFrancis |étape= Corpus |type= RBID |clé= Pascal:00-0341139 |texte= Integrated text and line-art extraction from a topographic map }}
This area was generated with Dilib version V0.6.32.