OCR exploration server

Warning: this site is under development!
Warning: this site was generated automatically from raw corpora.
The information has therefore not been validated.

Address extraction using Hidden Markov Models

Internal identifier: 000463 (PascalFrancis/Corpus); previous: 000462; next: 000464

Authors: Kazem Taghva; Jeffrey Coombs; Ray Pereda; Thomas Nartker

Source: SPIE proceedings series, ISSN 1017-2653, 2005, Vol. 5676, pp. 119-126

RBID : Pascal:05-0359645

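French descriptors

Modèle Markov variable cachée; Implémentation; Reconnaissance optique caractère; Précision élevée; Extraction information; Approche probabiliste

English descriptors

Hidden Markov models; Implementation; Optical character recognition; High precision; Information extraction; Probabilistic approach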

Abstract

This paper presents the implementation and evaluation of a Hidden Markov Model to extract addresses from OCR text. Although Hidden Markov Models discover addresses with high precision and recall, this type of Information Extraction task seems to be affected negatively by the presence of OCR text.

Record in standard format (ISO 2709)

For details, see the documentation on the Inist Standard format.

pA  
A01 01  1    @0 1017-2653
A05       @2 5676
A08 01  1  ENG  @1 Address extraction using Hidden Markov Models
A09 01  1  ENG  @1 Document recognition and retrieval XII : San Jose CA, 19-20 January 2005
A11 01  1    @1 TAGHVA (Kazem)
A11 02  1    @1 COOMBS (Jeffrey)
A11 03  1    @1 PEREDA (Ray)
A11 04  1    @1 NARTKER (Thomas)
A12 01  1    @1 SMITH (Elisa H. Barney) @9 ed.
A12 02  1    @1 TAGHVA (Kazem) @9 ed.
A14 01      @1 Information Science Research Institute University of Nevada, Las Vegas @2 Las Vegas, NV 89154-4021 @3 USA @Z 1 aut. @Z 2 aut. @Z 3 aut. @Z 4 aut.
A18 01  1    @1 International Society for Optical Engineering @2 Bellingham WA @3 USA @9 org-cong.
A20       @1 119-126
A21       @1 2005
A23 01      @0 ENG
A26 01      @0 0-8194-5649-7
A43 01      @1 INIST @2 21760 @5 354000124499720140
A44       @0 0000 @1 © 2005 INIST-CNRS. All rights reserved.
A45       @0 10 ref.
A47 01  1    @0 05-0359645
A60       @1 P @2 C
A61       @0 A
A64 01  1    @0 SPIE proceedings series
A66 01      @0 USA
C01 01    ENG  @0 This paper presents the implementation and evaluation of a Hidden Markov Model to extract addresses from OCR text. Although Hidden Markov Models discover addresses with high precision and recall, this type of Information Extraction task seems to be affected negatively by the presence of OCR text.
C02 01  X    @0 001D04A05A
C03 01  3  FRE  @0 Modèle Markov variable cachée @5 01
C03 01  3  ENG  @0 Hidden Markov models @5 01
C03 02  X  FRE  @0 Implémentation @5 02
C03 02  X  ENG  @0 Implementation @5 02
C03 02  X  SPA  @0 Implementación @5 02
C03 03  X  FRE  @0 Reconnaissance optique caractère @5 03
C03 03  X  ENG  @0 Optical character recognition @5 03
C03 03  X  SPA  @0 Reconocimento óptico de caracteres @5 03
C03 04  X  FRE  @0 Précision élevée @5 04
C03 04  X  ENG  @0 High precision @5 04
C03 04  X  SPA  @0 Precisión elevada @5 04
C03 05  X  FRE  @0 Extraction information @5 05
C03 05  X  ENG  @0 Information extraction @5 05
C03 05  X  SPA  @0 Extracción información @5 05
C03 06  X  FRE  @0 Approche probabiliste @5 31
C03 06  X  ENG  @0 Probabilistic approach @5 31
C03 06  X  SPA  @0 Enfoque probabilista @5 31
N21       @1 248
pR  
A30 01  1  ENG  @1 Document recognition and retrieval. Conference @2 12 @3 San Jose CA USA @4 2005-01-19
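
The flattened notice above can be read mechanically. The following is a minimal sketch, assuming each line holds a field tag, optional positional qualifiers (which the XML form of the record names `i1`, `i2` and `l`), and then subfields introduced by `@` plus a one-character code. The names `parse_field` and `parse_record` are hypothetical helpers for illustration, not part of Dilib or any Inist tool.

```python
import re

# Matches one subfield: "@" + one-character code, then its value up to the
# next subfield marker or the end of the line.
SUBFIELD = re.compile(r"@(\w)\s+(.*?)(?=\s+@\w\s|$)")


def parse_field(line):
    """Split one flattened record line into (tag, qualifiers, subfields)."""
    head, marker, rest = line.strip().partition("@")
    tokens = head.split()
    tag, qualifiers = tokens[0], tokens[1:]
    subfields = SUBFIELD.findall(marker + rest)
    return tag, qualifiers, subfields


def parse_record(text):
    """Parse every non-blank line of a flattened record."""
    return [parse_field(line) for line in text.splitlines() if line.strip()]


# Three lines copied from the notice above.
sample = """A11 01  1    @1 TAGHVA (Kazem)
A12 01  1    @1 SMITH (Elisa H. Barney) @9 ed.
A20       @1 119-126"""

for tag, qualifiers, subfields in parse_record(sample):
    print(tag, qualifiers, subfields)
```

Subfields are kept as a list of pairs rather than a dictionary because a code can repeat within one field, as `@Z` does in field `A14`.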

Inist format (server)

NO : PASCAL 05-0359645 INIST
ET : Address extraction using Hidden Markov Models
AU : TAGHVA (Kazem); COOMBS (Jeffrey); PEREDA (Ray); NARTKER (Thomas); SMITH (Elisa H. Barney); TAGHVA (Kazem)
AF : Information Science Research Institute University of Nevada, Las Vegas/Las Vegas, NV 89154-4021/Etats-Unis (1 aut., 2 aut., 3 aut., 4 aut.)
DT : Publication en série; Congrès; Niveau analytique
SO : SPIE proceedings series; ISSN 1017-2653; Etats-Unis; Da. 2005; Vol. 5676; Pp. 119-126; Bibl. 10 ref.
LA : Anglais
EA : This paper presents the implementation and evaluation of a Hidden Markov Model to extract addresses from OCR text. Although Hidden Markov Models discover addresses with high precision and recall, this type of Information Extraction task seems to be affected negatively by the presence of OCR text.
CC : 001D04A05A
FD : Modèle Markov variable cachée; Implémentation; Reconnaissance optique caractère; Précision élevée; Extraction information; Approche probabiliste
ED : Hidden Markov models; Implementation; Optical character recognition; High precision; Information extraction; Probabilistic approach
SD : Implementación; Reconocimento óptico de caracteres; Precisión elevada; Extracción información; Enfoque probabilista
LO : INIST-21760.354000124499720140
ID : 05-0359645
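
The server format is simpler to read than the standard notice. The sketch below assumes each line has the shape `KEY : value` and that multi-valued fields separate their items with semicolons, as the lines above suggest. The function names `parse_server_record` and `split_values` are hypothetical and used only for illustration.

```python
def parse_server_record(text):
    """Map each two-letter key of a server-format record to its raw value."""
    record = {}
    for line in text.splitlines():
        key, sep, value = line.partition(" : ")
        if sep and key.strip():
            record[key.strip()] = value.strip()
    return record


def split_values(value):
    """Split a semicolon-separated field into a list of trimmed items."""
    return [item.strip() for item in value.split(";") if item.strip()]


# A few lines copied from the record above.
sample = """NO : PASCAL 05-0359645 INIST
ET : Address extraction using Hidden Markov Models
LA : Anglais
ED : Hidden Markov models; Implementation; Optical character recognition; High precision; Information extraction; Probabilistic approach
ID : 05-0359645"""

record = parse_server_record(sample)
descriptors = split_values(record["ED"])
print(record["ET"])
print(descriptors)
```

Only the first ` : ` on a line is treated as the separator, so a value that itself contains a colon stays intact.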

Links to Exploration step

Pascal:05-0359645

The document in XML format

<record>
<TEI>
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en" level="a">Address extraction using Hidden Markov Models</title>
<author>
<name sortKey="Taghva, Kazem" sort="Taghva, Kazem" uniqKey="Taghva K" first="Kazem" last="Taghva">Kazem Taghva</name>
<affiliation>
<inist:fA14 i1="01">
<s1>Information Science Research Institute University of Nevada, Las Vegas</s1>
<s2>Las Vegas, NV 89154-4021</s2>
<s3>USA</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
<sZ>4 aut.</sZ>
</inist:fA14>
</affiliation>
</author>
<author>
<name sortKey="Coombs, Jeffrey" sort="Coombs, Jeffrey" uniqKey="Coombs J" first="Jeffrey" last="Coombs">Jeffrey Coombs</name>
<affiliation>
<inist:fA14 i1="01">
<s1>Information Science Research Institute University of Nevada, Las Vegas</s1>
<s2>Las Vegas, NV 89154-4021</s2>
<s3>USA</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
<sZ>4 aut.</sZ>
</inist:fA14>
</affiliation>
</author>
<author>
<name sortKey="Pereda, Ray" sort="Pereda, Ray" uniqKey="Pereda R" first="Ray" last="Pereda">Ray Pereda</name>
<affiliation>
<inist:fA14 i1="01">
<s1>Information Science Research Institute University of Nevada, Las Vegas</s1>
<s2>Las Vegas, NV 89154-4021</s2>
<s3>USA</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
<sZ>4 aut.</sZ>
</inist:fA14>
</affiliation>
</author>
<author>
<name sortKey="Nartker, Thomas" sort="Nartker, Thomas" uniqKey="Nartker T" first="Thomas" last="Nartker">Thomas Nartker</name>
<affiliation>
<inist:fA14 i1="01">
<s1>Information Science Research Institute University of Nevada, Las Vegas</s1>
<s2>Las Vegas, NV 89154-4021</s2>
<s3>USA</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
<sZ>4 aut.</sZ>
</inist:fA14>
</affiliation>
</author>
</titleStmt>
<publicationStmt>
<idno type="wicri:source">INIST</idno>
<idno type="inist">05-0359645</idno>
<date when="2005">2005</date>
<idno type="stanalyst">PASCAL 05-0359645 INIST</idno>
<idno type="RBID">Pascal:05-0359645</idno>
<idno type="wicri:Area/PascalFrancis/Corpus">000463</idno>
</publicationStmt>
<sourceDesc>
<biblStruct>
<analytic>
<title xml:lang="en" level="a">Address extraction using Hidden Markov Models</title>
<author>
<name sortKey="Taghva, Kazem" sort="Taghva, Kazem" uniqKey="Taghva K" first="Kazem" last="Taghva">Kazem Taghva</name>
<affiliation>
<inist:fA14 i1="01">
<s1>Information Science Research Institute University of Nevada, Las Vegas</s1>
<s2>Las Vegas, NV 89154-4021</s2>
<s3>USA</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
<sZ>4 aut.</sZ>
</inist:fA14>
</affiliation>
</author>
<author>
<name sortKey="Coombs, Jeffrey" sort="Coombs, Jeffrey" uniqKey="Coombs J" first="Jeffrey" last="Coombs">Jeffrey Coombs</name>
<affiliation>
<inist:fA14 i1="01">
<s1>Information Science Research Institute University of Nevada, Las Vegas</s1>
<s2>Las Vegas, NV 89154-4021</s2>
<s3>USA</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
<sZ>4 aut.</sZ>
</inist:fA14>
</affiliation>
</author>
<author>
<name sortKey="Pereda, Ray" sort="Pereda, Ray" uniqKey="Pereda R" first="Ray" last="Pereda">Ray Pereda</name>
<affiliation>
<inist:fA14 i1="01">
<s1>Information Science Research Institute University of Nevada, Las Vegas</s1>
<s2>Las Vegas, NV 89154-4021</s2>
<s3>USA</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
<sZ>4 aut.</sZ>
</inist:fA14>
</affiliation>
</author>
<author>
<name sortKey="Nartker, Thomas" sort="Nartker, Thomas" uniqKey="Nartker T" first="Thomas" last="Nartker">Thomas Nartker</name>
<affiliation>
<inist:fA14 i1="01">
<s1>Information Science Research Institute University of Nevada, Las Vegas</s1>
<s2>Las Vegas, NV 89154-4021</s2>
<s3>USA</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
<sZ>4 aut.</sZ>
</inist:fA14>
</affiliation>
</author>
</analytic>
<series>
<title level="j" type="main">SPIE proceedings series</title>
<idno type="ISSN">1017-2653</idno>
<imprint>
<date when="2005">2005</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
<seriesStmt>
<title level="j" type="main">SPIE proceedings series</title>
<idno type="ISSN">1017-2653</idno>
</seriesStmt>
</fileDesc>
<profileDesc>
<textClass>
<keywords scheme="KwdEn" xml:lang="en">
<term>Hidden Markov models</term>
<term>High precision</term>
<term>Implementation</term>
<term>Information extraction</term>
<term>Optical character recognition</term>
<term>Probabilistic approach</term>
</keywords>
<keywords scheme="Pascal" xml:lang="fr">
<term>Modèle Markov variable cachée</term>
<term>Implémentation</term>
<term>Reconnaissance optique caractère</term>
<term>Précision élevée</term>
<term>Extraction information</term>
<term>Approche probabiliste</term>
</keywords>
</textClass>
</profileDesc>
</teiHeader>
<front>
<div type="abstract" xml:lang="en">This paper presents the implementation and evaluation of a Hidden Markov Model to extract addresses from OCR text. Although Hidden Markov Models discover addresses with high precision and recall, this type of Information Extraction task seems to be affected negatively by the presence of OCR text.</div>
</front>
</TEI>
<inist>
<standard h6="B">
<pA>
<fA01 i1="01" i2="1">
<s0>1017-2653</s0>
</fA01>
<fA05>
<s2>5676</s2>
</fA05>
<fA08 i1="01" i2="1" l="ENG">
<s1>Address extraction using Hidden Markov Models</s1>
</fA08>
<fA09 i1="01" i2="1" l="ENG">
<s1>Document recognition and retrieval XII : San Jose CA, 19-20 January 2005</s1>
</fA09>
<fA11 i1="01" i2="1">
<s1>TAGHVA (Kazem)</s1>
</fA11>
<fA11 i1="02" i2="1">
<s1>COOMBS (Jeffrey)</s1>
</fA11>
<fA11 i1="03" i2="1">
<s1>PEREDA (Ray)</s1>
</fA11>
<fA11 i1="04" i2="1">
<s1>NARTKER (Thomas)</s1>
</fA11>
<fA12 i1="01" i2="1">
<s1>SMITH (Elisa H. Barney)</s1>
<s9>ed.</s9>
</fA12>
<fA12 i1="02" i2="1">
<s1>TAGHVA (Kazem)</s1>
<s9>ed.</s9>
</fA12>
<fA14 i1="01">
<s1>Information Science Research Institute University of Nevada, Las Vegas</s1>
<s2>Las Vegas, NV 89154-4021</s2>
<s3>USA</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
<sZ>4 aut.</sZ>
</fA14>
<fA18 i1="01" i2="1">
<s1>International Society for Optical Engineering</s1>
<s2>Bellingham WA</s2>
<s3>USA</s3>
<s9>org-cong.</s9>
</fA18>
<fA20>
<s1>119-126</s1>
</fA20>
<fA21>
<s1>2005</s1>
</fA21>
<fA23 i1="01">
<s0>ENG</s0>
</fA23>
<fA26 i1="01">
<s0>0-8194-5649-7</s0>
</fA26>
<fA43 i1="01">
<s1>INIST</s1>
<s2>21760</s2>
<s5>354000124499720140</s5>
</fA43>
<fA44>
<s0>0000</s0>
<s1>© 2005 INIST-CNRS. All rights reserved.</s1>
</fA44>
<fA45>
<s0>10 ref.</s0>
</fA45>
<fA47 i1="01" i2="1">
<s0>05-0359645</s0>
</fA47>
<fA60>
<s1>P</s1>
<s2>C</s2>
</fA60>
<fA61>
<s0>A</s0>
</fA61>
<fA64 i1="01" i2="1">
<s0>SPIE proceedings series</s0>
</fA64>
<fA66 i1="01">
<s0>USA</s0>
</fA66>
<fC01 i1="01" l="ENG">
<s0>This paper presents the implementation and evaluation of a Hidden Markov Model to extract addresses from OCR text. Although Hidden Markov Models discover addresses with high precision and recall, this type of Information Extraction task seems to be affected negatively by the presence of OCR text.</s0>
</fC01>
<fC02 i1="01" i2="X">
<s0>001D04A05A</s0>
</fC02>
<fC03 i1="01" i2="3" l="FRE">
<s0>Modèle Markov variable cachée</s0>
<s5>01</s5>
</fC03>
<fC03 i1="01" i2="3" l="ENG">
<s0>Hidden Markov models</s0>
<s5>01</s5>
</fC03>
<fC03 i1="02" i2="X" l="FRE">
<s0>Implémentation</s0>
<s5>02</s5>
</fC03>
<fC03 i1="02" i2="X" l="ENG">
<s0>Implementation</s0>
<s5>02</s5>
</fC03>
<fC03 i1="02" i2="X" l="SPA">
<s0>Implementación</s0>
<s5>02</s5>
</fC03>
<fC03 i1="03" i2="X" l="FRE">
<s0>Reconnaissance optique caractère</s0>
<s5>03</s5>
</fC03>
<fC03 i1="03" i2="X" l="ENG">
<s0>Optical character recognition</s0>
<s5>03</s5>
</fC03>
<fC03 i1="03" i2="X" l="SPA">
<s0>Reconocimento óptico de caracteres</s0>
<s5>03</s5>
</fC03>
<fC03 i1="04" i2="X" l="FRE">
<s0>Précision élevée</s0>
<s5>04</s5>
</fC03>
<fC03 i1="04" i2="X" l="ENG">
<s0>High precision</s0>
<s5>04</s5>
</fC03>
<fC03 i1="04" i2="X" l="SPA">
<s0>Precisión elevada</s0>
<s5>04</s5>
</fC03>
<fC03 i1="05" i2="X" l="FRE">
<s0>Extraction information</s0>
<s5>05</s5>
</fC03>
<fC03 i1="05" i2="X" l="ENG">
<s0>Information extraction</s0>
<s5>05</s5>
</fC03>
<fC03 i1="05" i2="X" l="SPA">
<s0>Extracción información</s0>
<s5>05</s5>
</fC03>
<fC03 i1="06" i2="X" l="FRE">
<s0>Approche probabiliste</s0>
<s5>31</s5>
</fC03>
<fC03 i1="06" i2="X" l="ENG">
<s0>Probabilistic approach</s0>
<s5>31</s5>
</fC03>
<fC03 i1="06" i2="X" l="SPA">
<s0>Enfoque probabilista</s0>
<s5>31</s5>
</fC03>
<fN21>
<s1>248</s1>
</fN21>
</pA>
<pR>
<fA30 i1="01" i2="1" l="ENG">
<s1>Document recognition and retrieval. Conference</s1>
<s2>12</s2>
<s3>San Jose CA USA</s3>
<s4>2005-01-19</s4>
</fA30>
</pR>
</standard>
<server>
<NO>PASCAL 05-0359645 INIST</NO>
<ET>Address extraction using Hidden Markov Models</ET>
<AU>TAGHVA (Kazem); COOMBS (Jeffrey); PEREDA (Ray); NARTKER (Thomas); SMITH (Elisa H. Barney); TAGHVA (Kazem)</AU>
<AF>Information Science Research Institute University of Nevada, Las Vegas/Las Vegas, NV 89154-4021/Etats-Unis (1 aut., 2 aut., 3 aut., 4 aut.)</AF>
<DT>Publication en série; Congrès; Niveau analytique</DT>
<SO>SPIE proceedings series; ISSN 1017-2653; Etats-Unis; Da. 2005; Vol. 5676; Pp. 119-126; Bibl. 10 ref.</SO>
<LA>Anglais</LA>
<EA>This paper presents the implementation and evaluation of a Hidden Markov Model to extract addresses from OCR text. Although Hidden Markov Models discover addresses with high precision and recall, this type of Information Extraction task seems to be affected negatively by the presence of OCR text.</EA>
<CC>001D04A05A</CC>
<FD>Modèle Markov variable cachée; Implémentation; Reconnaissance optique caractère; Précision élevée; Extraction information; Approche probabiliste</FD>
<ED>Hidden Markov models; Implementation; Optical character recognition; High precision; Information extraction; Probabilistic approach</ED>
<SD>Implementación; Reconocimento óptico de caracteres; Precisión elevada; Extracción información; Enfoque probabilista</SD>
<LO>INIST-21760.354000124499720140</LO>
<ID>05-0359645</ID>
</server>
</inist>
</record>
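
The XML record uses the prefix `inist:` without declaring a namespace for it, so a strict XML parser would likely reject it with an unbound-prefix error. The sketch below works around this by binding the prefix to a placeholder URI before parsing. The URI `urn:example:inist` and the helper `load_record` are assumptions made for this illustration, not values defined by Inist or Wicri, and the sample is an abridged copy of the record above.

```python
import xml.etree.ElementTree as ET

# Placeholder namespace URI (an assumption; the record declares none).
PLACEHOLDER_NS = "urn:example:inist"

# Abridged fragment of the record above, with one author only.
SAMPLE = """<record>
<TEI>
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en" level="a">Address extraction using Hidden Markov Models</title>
<author>
<name sortKey="Taghva, Kazem" first="Kazem" last="Taghva">Kazem Taghva</name>
<affiliation>
<inist:fA14 i1="01">
<s3>USA</s3>
</inist:fA14>
</affiliation>
</author>
</titleStmt>
</fileDesc>
</teiHeader>
</TEI>
</record>"""


def load_record(xml_text):
    """Bind the undeclared 'inist' prefix on the root element, then parse."""
    patched = xml_text.replace(
        "<record>", f'<record xmlns:inist="{PLACEHOLDER_NS}">', 1
    )
    return ET.fromstring(patched)


root = load_record(SAMPLE)
title = root.findtext(".//titleStmt/title")
authors = [name.text for name in root.findall(".//titleStmt/author/name")]
countries = [el.findtext("s3") for el in root.iter(f"{{{PLACEHOLDER_NS}}}fA14")]
print(title, authors, countries)
```

Patching the root tag keeps the rest of the record untouched. The same queries should apply to the full record, where each author repeats the same affiliation block.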

To manipulate this document under Unix (Dilib)

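# Directory of the Corpus exploration step (requires WICRI_ROOT to be set)
EXPLOR_STEP=$WICRI_ROOT/Ticri/CIDE/explor/OcrV1/Data/PascalFrancis/Corpus
# Fetch record 000463 from biblio.hfd, indent its XML, and page through the output
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000463 | SxmlIndent | more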

Or

HfdSelect -h $EXPLOR_AREA/Data/PascalFrancis/Corpus/biblio.hfd -nk 000463 | SxmlIndent | more

To link to this page within the Wicri network

{{Explor lien
   |wiki=    Ticri/CIDE
   |area=    OcrV1
   |flux=    PascalFrancis
   |étape=   Corpus
   |type=    RBID
   |clé=     Pascal:05-0359645
   |texte=   Address extraction using Hidden Markov Models
}}

This area was generated with Dilib version V0.6.32.
Data generation: Sat Nov 11 16:53:45 2017. Site generation: Mon Mar 11 23:15:16 2024