Address extraction using Hidden Markov Models
Internal identifier: 000463 (PascalFrancis/Corpus); previous: 000462; next: 000464
Authors: Kazem Taghva; Jeffrey Coombs; Ray Pereda; Thomas Nartker
Source: SPIE proceedings series [1017-2653]; 2005.
RBID: Pascal:05-0359645
Abstract
This paper presents the implementation and evaluation of a Hidden Markov Model to extract addresses from OCR text. Although Hidden Markov Models discover addresses with high precision and recall, this type of Information Extraction task seems to be affected negatively by the presence of OCR text.
Record in standard format (ISO 2709)
See the documentation on the Inist Standard format.
pA |
A01 | 01 | 1 | | @0 1017-2653 |
A05 | | | | @2 5676 |
A08 | 01 | 1 | ENG | @1 Address extraction using Hidden Markov Models |
A09 | 01 | 1 | ENG | @1 Document recognition and retrieval XII : San Jose CA, 19-20 January 2005 |
A11 | 01 | 1 | | @1 TAGHVA (Kazem) |
A11 | 02 | 1 | | @1 COOMBS (Jeffrey) |
A11 | 03 | 1 | | @1 PEREDA (Ray) |
A11 | 04 | 1 | | @1 NARTKER (Thomas) |
A12 | 01 | 1 | | @1 SMITH (Elisa H. Barney) @9 ed. |
A12 | 02 | 1 | | @1 TAGHVA (Kazem) @9 ed. |
A14 | 01 | | | @1 Information Science Research Institute University of Nevada, Las Vegas @2 Las Vegas, NV 89154-4021 @3 USA @Z 1 aut. @Z 2 aut. @Z 3 aut. @Z 4 aut. |
A18 | 01 | 1 | | @1 International Society for Optical Engineering @2 Bellingham WA @3 USA @9 org-cong. |
A20 | | | | @1 119-126 |
A21 | | | | @1 2005 |
A23 | 01 | | | @0 ENG |
A26 | 01 | | | @0 0-8194-5649-7 |
A43 | 01 | | | @1 INIST @2 21760 @5 354000124499720140 |
A44 | | | | @0 0000 @1 © 2005 INIST-CNRS. All rights reserved. |
A45 | | | | @0 10 ref. |
A47 | 01 | 1 | | @0 05-0359645 |
A60 | | | | @1 P @2 C |
A61 | | | | @0 A |
A64 | 01 | 1 | | @0 SPIE proceedings series |
A66 | 01 | | | @0 USA |
C01 | 01 | | ENG | @0 This paper presents the implementation and evaluation of a Hidden Markov Model to extract addresses from OCR text. Although Hidden Markov Models discover addresses with high precision and recall, this type of Information Extraction task seems to be affected negatively by the presence of OCR text. |
C02 | 01 | X | | @0 001D04A05A |
C03 | 01 | 3 | FRE | @0 Modèle Markov variable cachée @5 01 |
C03 | 01 | 3 | ENG | @0 Hidden Markov models @5 01 |
C03 | 02 | X | FRE | @0 Implémentation @5 02 |
C03 | 02 | X | ENG | @0 Implementation @5 02 |
C03 | 02 | X | SPA | @0 Implementación @5 02 |
C03 | 03 | X | FRE | @0 Reconnaissance optique caractère @5 03 |
C03 | 03 | X | ENG | @0 Optical character recognition @5 03 |
C03 | 03 | X | SPA | @0 Reconocimiento óptico de caracteres @5 03 |
C03 | 04 | X | FRE | @0 Précision élevée @5 04 |
C03 | 04 | X | ENG | @0 High precision @5 04 |
C03 | 04 | X | SPA | @0 Precisión elevada @5 04 |
C03 | 05 | X | FRE | @0 Extraction information @5 05 |
C03 | 05 | X | ENG | @0 Information extraction @5 05 |
C03 | 05 | X | SPA | @0 Extracción información @5 05 |
C03 | 06 | X | FRE | @0 Approche probabiliste @5 31 |
C03 | 06 | X | ENG | @0 Probabilistic approach @5 31 |
C03 | 06 | X | SPA | @0 Enfoque probabilista @5 31 |
N21 | | | | @1 248 |
pR |
A30 | 01 | 1 | ENG | @1 Document recognition and retrieval. Conference @2 12 @3 San Jose CA USA @4 2005-01-19 |
Inist format (server)
NO : | PASCAL 05-0359645 INIST |
ET : | Address extraction using Hidden Markov Models |
AU : | TAGHVA (Kazem); COOMBS (Jeffrey); PEREDA (Ray); NARTKER (Thomas); SMITH (Elisa H. Barney); TAGHVA (Kazem) |
AF : | Information Science Research Institute University of Nevada, Las Vegas/Las Vegas, NV 89154-4021/Etats-Unis (1 aut., 2 aut., 3 aut., 4 aut.) |
DT : | Publication en série; Congrès; Niveau analytique |
SO : | SPIE proceedings series; ISSN 1017-2653; Etats-Unis; Da. 2005; Vol. 5676; Pp. 119-126; Bibl. 10 ref. |
LA : | Anglais |
EA : | This paper presents the implementation and evaluation of a Hidden Markov Model to extract addresses from OCR text. Although Hidden Markov Models discover addresses with high precision and recall, this type of Information Extraction task seems to be affected negatively by the presence of OCR text. |
CC : | 001D04A05A |
FD : | Modèle Markov variable cachée; Implémentation; Reconnaissance optique caractère; Précision élevée; Extraction information; Approche probabiliste |
ED : | Hidden Markov models; Implementation; Optical character recognition; High precision; Information extraction; Probabilistic approach |
SD : | Implementación; Reconocimiento óptico de caracteres; Precisión elevada; Extracción información; Enfoque probabilista |
LO : | INIST-21760.354000124499720140 |
ID : | 05-0359645 |
Links to Exploration step
Pascal:05-0359645
The document in XML format
<record><TEI><teiHeader><fileDesc><titleStmt><title xml:lang="en" level="a">Address extraction using Hidden Markov Models</title>
<author><name sortKey="Taghva, Kazem" sort="Taghva, Kazem" uniqKey="Taghva K" first="Kazem" last="Taghva">Kazem Taghva</name>
<affiliation><inist:fA14 i1="01"><s1>Information Science Research Institute University of Nevada, Las Vegas</s1>
<s2>Las Vegas, NV 89154-4021</s2>
<s3>USA</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
<sZ>4 aut.</sZ>
</inist:fA14>
</affiliation>
</author>
<author><name sortKey="Coombs, Jeffrey" sort="Coombs, Jeffrey" uniqKey="Coombs J" first="Jeffrey" last="Coombs">Jeffrey Coombs</name>
<affiliation><inist:fA14 i1="01"><s1>Information Science Research Institute University of Nevada, Las Vegas</s1>
<s2>Las Vegas, NV 89154-4021</s2>
<s3>USA</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
<sZ>4 aut.</sZ>
</inist:fA14>
</affiliation>
</author>
<author><name sortKey="Pereda, Ray" sort="Pereda, Ray" uniqKey="Pereda R" first="Ray" last="Pereda">Ray Pereda</name>
<affiliation><inist:fA14 i1="01"><s1>Information Science Research Institute University of Nevada, Las Vegas</s1>
<s2>Las Vegas, NV 89154-4021</s2>
<s3>USA</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
<sZ>4 aut.</sZ>
</inist:fA14>
</affiliation>
</author>
<author><name sortKey="Nartker, Thomas" sort="Nartker, Thomas" uniqKey="Nartker T" first="Thomas" last="Nartker">Thomas Nartker</name>
<affiliation><inist:fA14 i1="01"><s1>Information Science Research Institute University of Nevada, Las Vegas</s1>
<s2>Las Vegas, NV 89154-4021</s2>
<s3>USA</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
<sZ>4 aut.</sZ>
</inist:fA14>
</affiliation>
</author>
</titleStmt>
<publicationStmt><idno type="wicri:source">INIST</idno>
<idno type="inist">05-0359645</idno>
<date when="2005">2005</date>
<idno type="stanalyst">PASCAL 05-0359645 INIST</idno>
<idno type="RBID">Pascal:05-0359645</idno>
<idno type="wicri:Area/PascalFrancis/Corpus">000463</idno>
</publicationStmt>
<sourceDesc><biblStruct><analytic><title xml:lang="en" level="a">Address extraction using Hidden Markov Models</title>
<author><name sortKey="Taghva, Kazem" sort="Taghva, Kazem" uniqKey="Taghva K" first="Kazem" last="Taghva">Kazem Taghva</name>
<affiliation><inist:fA14 i1="01"><s1>Information Science Research Institute University of Nevada, Las Vegas</s1>
<s2>Las Vegas, NV 89154-4021</s2>
<s3>USA</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
<sZ>4 aut.</sZ>
</inist:fA14>
</affiliation>
</author>
<author><name sortKey="Coombs, Jeffrey" sort="Coombs, Jeffrey" uniqKey="Coombs J" first="Jeffrey" last="Coombs">Jeffrey Coombs</name>
<affiliation><inist:fA14 i1="01"><s1>Information Science Research Institute University of Nevada, Las Vegas</s1>
<s2>Las Vegas, NV 89154-4021</s2>
<s3>USA</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
<sZ>4 aut.</sZ>
</inist:fA14>
</affiliation>
</author>
<author><name sortKey="Pereda, Ray" sort="Pereda, Ray" uniqKey="Pereda R" first="Ray" last="Pereda">Ray Pereda</name>
<affiliation><inist:fA14 i1="01"><s1>Information Science Research Institute University of Nevada, Las Vegas</s1>
<s2>Las Vegas, NV 89154-4021</s2>
<s3>USA</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
<sZ>4 aut.</sZ>
</inist:fA14>
</affiliation>
</author>
<author><name sortKey="Nartker, Thomas" sort="Nartker, Thomas" uniqKey="Nartker T" first="Thomas" last="Nartker">Thomas Nartker</name>
<affiliation><inist:fA14 i1="01"><s1>Information Science Research Institute University of Nevada, Las Vegas</s1>
<s2>Las Vegas, NV 89154-4021</s2>
<s3>USA</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
<sZ>4 aut.</sZ>
</inist:fA14>
</affiliation>
</author>
</analytic>
<series><title level="j" type="main">SPIE proceedings series</title>
<idno type="ISSN">1017-2653</idno>
<imprint><date when="2005">2005</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
<seriesStmt><title level="j" type="main">SPIE proceedings series</title>
<idno type="ISSN">1017-2653</idno>
</seriesStmt>
</fileDesc>
<profileDesc><textClass><keywords scheme="KwdEn" xml:lang="en"><term>Hidden Markov models</term>
<term>High precision</term>
<term>Implementation</term>
<term>Information extraction</term>
<term>Optical character recognition</term>
<term>Probabilistic approach</term>
</keywords>
<keywords scheme="Pascal" xml:lang="fr"><term>Modèle Markov variable cachée</term>
<term>Implémentation</term>
<term>Reconnaissance optique caractère</term>
<term>Précision élevée</term>
<term>Extraction information</term>
<term>Approche probabiliste</term>
</keywords>
</textClass>
</profileDesc>
</teiHeader>
<front><div type="abstract" xml:lang="en">This paper presents the implementation and evaluation of a Hidden Markov Model to extract addresses from OCR text. Although Hidden Markov Models discover addresses with high precision and recall, this type of Information Extraction task seems to be affected negatively by the presence of OCR text.</div>
</front>
</TEI>
<inist><standard h6="B"><pA><fA01 i1="01" i2="1"><s0>1017-2653</s0>
</fA01>
<fA05><s2>5676</s2>
</fA05>
<fA08 i1="01" i2="1" l="ENG"><s1>Address extraction using Hidden Markov Models</s1>
</fA08>
<fA09 i1="01" i2="1" l="ENG"><s1>Document recognition and retrieval XII : San Jose CA, 19-20 January 2005</s1>
</fA09>
<fA11 i1="01" i2="1"><s1>TAGHVA (Kazem)</s1>
</fA11>
<fA11 i1="02" i2="1"><s1>COOMBS (Jeffrey)</s1>
</fA11>
<fA11 i1="03" i2="1"><s1>PEREDA (Ray)</s1>
</fA11>
<fA11 i1="04" i2="1"><s1>NARTKER (Thomas)</s1>
</fA11>
<fA12 i1="01" i2="1"><s1>SMITH (Elisa H. Barney)</s1>
<s9>ed.</s9>
</fA12>
<fA12 i1="02" i2="1"><s1>TAGHVA (Kazem)</s1>
<s9>ed.</s9>
</fA12>
<fA14 i1="01"><s1>Information Science Research Institute University of Nevada, Las Vegas</s1>
<s2>Las Vegas, NV 89154-4021</s2>
<s3>USA</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
<sZ>4 aut.</sZ>
</fA14>
<fA18 i1="01" i2="1"><s1>International Society for Optical Engineering</s1>
<s2>Bellingham WA</s2>
<s3>USA</s3>
<s9>org-cong.</s9>
</fA18>
<fA20><s1>119-126</s1>
</fA20>
<fA21><s1>2005</s1>
</fA21>
<fA23 i1="01"><s0>ENG</s0>
</fA23>
<fA26 i1="01"><s0>0-8194-5649-7</s0>
</fA26>
<fA43 i1="01"><s1>INIST</s1>
<s2>21760</s2>
<s5>354000124499720140</s5>
</fA43>
<fA44><s0>0000</s0>
<s1>© 2005 INIST-CNRS. All rights reserved.</s1>
</fA44>
<fA45><s0>10 ref.</s0>
</fA45>
<fA47 i1="01" i2="1"><s0>05-0359645</s0>
</fA47>
<fA60><s1>P</s1>
<s2>C</s2>
</fA60>
<fA64 i1="01" i2="1"><s0>SPIE proceedings series</s0>
</fA64>
<fA66 i1="01"><s0>USA</s0>
</fA66>
<fC01 i1="01" l="ENG"><s0>This paper presents the implementation and evaluation of a Hidden Markov Model to extract addresses from OCR text. Although Hidden Markov Models discover addresses with high precision and recall, this type of Information Extraction task seems to be affected negatively by the presence of OCR text.</s0>
</fC01>
<fC02 i1="01" i2="X"><s0>001D04A05A</s0>
</fC02>
<fC03 i1="01" i2="3" l="FRE"><s0>Modèle Markov variable cachée</s0>
<s5>01</s5>
</fC03>
<fC03 i1="01" i2="3" l="ENG"><s0>Hidden Markov models</s0>
<s5>01</s5>
</fC03>
<fC03 i1="02" i2="X" l="FRE"><s0>Implémentation</s0>
<s5>02</s5>
</fC03>
<fC03 i1="02" i2="X" l="ENG"><s0>Implementation</s0>
<s5>02</s5>
</fC03>
<fC03 i1="02" i2="X" l="SPA"><s0>Implementación</s0>
<s5>02</s5>
</fC03>
<fC03 i1="03" i2="X" l="FRE"><s0>Reconnaissance optique caractère</s0>
<s5>03</s5>
</fC03>
<fC03 i1="03" i2="X" l="ENG"><s0>Optical character recognition</s0>
<s5>03</s5>
</fC03>
<fC03 i1="03" i2="X" l="SPA"><s0>Reconocimiento óptico de caracteres</s0>
<s5>03</s5>
</fC03>
<fC03 i1="04" i2="X" l="FRE"><s0>Précision élevée</s0>
<s5>04</s5>
</fC03>
<fC03 i1="04" i2="X" l="ENG"><s0>High precision</s0>
<s5>04</s5>
</fC03>
<fC03 i1="04" i2="X" l="SPA"><s0>Precisión elevada</s0>
<s5>04</s5>
</fC03>
<fC03 i1="05" i2="X" l="FRE"><s0>Extraction information</s0>
<s5>05</s5>
</fC03>
<fC03 i1="05" i2="X" l="ENG"><s0>Information extraction</s0>
<s5>05</s5>
</fC03>
<fC03 i1="05" i2="X" l="SPA"><s0>Extracción información</s0>
<s5>05</s5>
</fC03>
<fC03 i1="06" i2="X" l="FRE"><s0>Approche probabiliste</s0>
<s5>31</s5>
</fC03>
<fC03 i1="06" i2="X" l="ENG"><s0>Probabilistic approach</s0>
<s5>31</s5>
</fC03>
<fC03 i1="06" i2="X" l="SPA"><s0>Enfoque probabilista</s0>
<s5>31</s5>
</fC03>
<fN21><s1>248</s1>
</fN21>
</pA>
<pR><fA30 i1="01" i2="1" l="ENG"><s1>Document recognition and retrieval. Conference</s1>
<s2>12</s2>
<s3>San Jose CA USA</s3>
<s4>2005-01-19</s4>
</fA30>
</pR>
</standard>
<server><NO>PASCAL 05-0359645 INIST</NO>
<ET>Address extraction using Hidden Markov Models</ET>
<AU>TAGHVA (Kazem); COOMBS (Jeffrey); PEREDA (Ray); NARTKER (Thomas); SMITH (Elisa H. Barney); TAGHVA (Kazem)</AU>
<AF>Information Science Research Institute University of Nevada, Las Vegas/Las Vegas, NV 89154-4021/Etats-Unis (1 aut., 2 aut., 3 aut., 4 aut.)</AF>
<DT>Publication en série; Congrès; Niveau analytique</DT>
<SO>SPIE proceedings series; ISSN 1017-2653; Etats-Unis; Da. 2005; Vol. 5676; Pp. 119-126; Bibl. 10 ref.</SO>
<LA>Anglais</LA>
<EA>This paper presents the implementation and evaluation of a Hidden Markov Model to extract addresses from OCR text. Although Hidden Markov Models discover addresses with high precision and recall, this type of Information Extraction task seems to be affected negatively by the presence of OCR text.</EA>
<CC>001D04A05A</CC>
<FD>Modèle Markov variable cachée; Implémentation; Reconnaissance optique caractère; Précision élevée; Extraction information; Approche probabiliste</FD>
<ED>Hidden Markov models; Implementation; Optical character recognition; High precision; Information extraction; Probabilistic approach</ED>
<SD>Implementación; Reconocimiento óptico de caracteres; Precisión elevada; Extracción información; Enfoque probabilista</SD>
<LO>INIST-21760.354000124499720140</LO>
<ID>05-0359645</ID>
</server>
</inist>
</record>
To manipulate this document under Unix (Dilib)
EXPLOR_STEP=$WICRI_ROOT/Ticri/CIDE/explor/OcrV1/Data/PascalFrancis/Corpus
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000463 | SxmlIndent | more
Or
HfdSelect -h $EXPLOR_AREA/Data/PascalFrancis/Corpus/biblio.hfd -nk 000463 | SxmlIndent | more
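Where Dilib is not installed, the server-format part of the record can also be read with a general-purpose XML parser. The following is a minimal sketch in Python, not part of Dilib: the function name `parse_server_block` and the embedded `SERVER_XML` fragment (copied from the `<server>` element of this record) are illustrative assumptions. It parses only the `<server>` block because the full record uses an `inist:` prefix with no visible namespace declaration, which a strict parser may reject.

```python
import xml.etree.ElementTree as ET

# Fragment copied from the <server> element of this record (assumption:
# the block has been extracted from the full record beforehand).
SERVER_XML = """<server><NO>PASCAL 05-0359645 INIST</NO>
<ET>Address extraction using Hidden Markov Models</ET>
<AU>TAGHVA (Kazem); COOMBS (Jeffrey); PEREDA (Ray); NARTKER (Thomas); SMITH (Elisa H. Barney); TAGHVA (Kazem)</AU>
<SO>SPIE proceedings series; ISSN 1017-2653; Etats-Unis; Da. 2005; Vol. 5676; Pp. 119-126; Bibl. 10 ref.</SO>
</server>"""


def parse_server_block(xml_text):
    """Map each field tag to the list of its ';'-separated values."""
    root = ET.fromstring(xml_text)
    return {
        child.tag: [
            part.strip()
            for part in (child.text or "").split(";")
            if part.strip()
        ]
        for child in root
    }


fields = parse_server_block(SERVER_XML)
print(fields["ET"][0])   # title of the paper
print(fields["AU"])      # names listed in the AU field
```

In this record the `AU` field lists the two volume editors after the four authors, so the authors alone are better taken from the `fA11` fields of the standard format.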
To link to this page within the Wicri network
{{Explor lien
|wiki= Ticri/CIDE
|area= OcrV1
|flux= PascalFrancis
|étape= Corpus
|type= RBID
|clé= Pascal:05-0359645
|texte= Address extraction using Hidden Markov Models
}}
This area was generated with Dilib version V0.6.32. Data generation: Sat Nov 11 16:53:45 2017. Site generation: Mon Mar 11 23:15:16 2024