Date of Birth Extraction Using Precise Shallow Parsing
Internal identifier: 000162 (PascalFrancis/Corpus)
Authors: Ray Pereda; Kazem Taghva
Source: Proceedings of SPIE, the International Society for Optical Engineering [0277-786X]; 2010.
RBID : Pascal:10-0429703
Abstract
This paper presents the implementation and evaluation of a pattern-based program to extract date of birth information from OCR text. Although the program finds date of birth information with high precision and recall, this type of information extraction task seems to be negatively impacted by OCR errors.
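The abstract does not describe the patterns themselves, so the following is only a minimal sketch of what pattern-based date of birth extraction can look like. The trigger phrases, the single "Month D, YYYY" date shape, and the names `DOB_PATTERN` and `extract_dobs` are all assumptions made for illustration; they are not taken from the paper.

```python
import re
from datetime import date

# Hypothetical month lexicon; a real system would likely also accept abbreviations.
MONTHS = {
    "january": 1, "february": 2, "march": 3, "april": 4,
    "may": 5, "june": 6, "july": 7, "august": 8,
    "september": 9, "october": 10, "november": 11, "december": 12,
}

# Assumed pattern: a trigger phrase followed by a "Month D, YYYY" date.
DOB_PATTERN = re.compile(
    r"\b(?:date\s+of\s+birth|dob|born(?:\s+on)?)\b\s*[:\-]?\s*"
    r"(?P<month>[A-Za-z]+)\s+(?P<day>\d{1,2}),?\s+(?P<year>\d{4})",
    re.IGNORECASE,
)


def extract_dobs(text):
    """Return every date of birth that the assumed pattern finds in text."""
    results = []
    for match in DOB_PATTERN.finditer(text):
        month = MONTHS.get(match.group("month").lower())
        if month is None:
            # Unknown month word, e.g. an OCR-damaged "Mareh": skip the match.
            continue
        try:
            results.append(
                date(int(match.group("year")), month, int(match.group("day")))
            )
        except ValueError:
            # Impossible calendar date such as February 30: skip it.
            continue
    return results
```

With this sketch, `extract_dobs("Date of Birth: March 4, 1962")` yields one date, while the OCR-damaged variants `"Date of Birth: Mareh 4, 1962"` and `"Date of Birth: March 4, l962"` yield none. This is one simple way in which character-level OCR errors can lower the recall of an exact pattern.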
Record in standard format (ISO 2709)
See the documentation on the Inist Standard format.
pA
A01 | 01 | 1 | | @0 0277-786X |
A02 | 01 | | | @0 PSISDG |
A03 | | 1 | | @0 Proc. SPIE Int. Soc. Opt. Eng. |
A05 | | | | @2 7534 |
A08 | 01 | 1 | ENG | @1 Date of Birth Extraction Using Precise Shallow Parsing |
A09 | 01 | 1 | ENG | @1 Document recognition and retrieval XVII : 19-21 January 2010, San Jose, California, United States |
A11 | 01 | 1 | | @1 PEREDA (Ray) |
A11 | 02 | 1 | | @1 TAGHVA (Kazem) |
A12 | 01 | 1 | | @1 LIKFORMAN-SULEM (Laurence) @9 ed. |
A12 | 02 | 1 | | @1 AGAM (Gady) @9 ed. |
A14 | 01 | | | @1 Information Science Research Institute University of Nevada, Las Vegas @2 Las Vegas, NV 89154-4021 @3 USA @Z 1 aut. @Z 2 aut. |
A18 | 01 | 1 | | @1 SPIE @3 USA @9 org-cong. |
A18 | 02 | 1 | | @1 IS&T @3 USA @9 org-cong. |
A18 | 03 | 1 | | @1 Institut TELECOM @3 FRA @9 org-cong. |
A20 | | | | @2 753406.1-753406.7 |
A21 | | | | @1 2010 |
A23 | 01 | | | @0 ENG |
A25 | 01 | | | @1 SPIE @2 Bellingham WA |
A26 | 01 | | | @0 978-0-8194-7927-3 |
A43 | 01 | | | @1 INIST @2 21760 @5 354000174683810050 |
A44 | | | | @0 0000 @1 © 2010 INIST-CNRS. All rights reserved. |
A45 | | | | @0 10 ref. |
A47 | 01 | 1 | | @0 10-0429703 |
A60 | | | | @1 P @2 C |
A61 | | | | @0 A |
A64 | 01 | 1 | | @0 Proceedings of SPIE, the International Society for Optical Engineering |
A66 | 01 | | | @0 USA |
C01 | 01 | | ENG | @0 This paper presents the implementation and evaluation of a pattern-based program to extract date of birth information from OCR text. Although the program finds data of birth information with high precision and recall, this type of information extraction task seems to be negatively impacted by OCR errors. |
C02 | 01 | 3 | | @0 001B00A30C |
C02 | 02 | 3 | | @0 001B40B30S |
C02 | 03 | X | | @0 001D04A05A |
C02 | 04 | X | | @0 001D04A03 |
C03 | 01 | 3 | FRE | @0 Reconnaissance forme @5 61 |
C03 | 01 | 3 | ENG | @0 Pattern recognition @5 61 |
C03 | 02 | X | FRE | @0 Recherche documentaire @5 62 |
C03 | 02 | X | ENG | @0 Document retrieval @5 62 |
C03 | 02 | X | SPA | @0 Búsqueda documental @5 62 |
C03 | 03 | X | FRE | @0 Analyse syntaxique @5 63 |
C03 | 03 | X | ENG | @0 Syntactic analysis @5 63 |
C03 | 03 | X | SPA | @0 Análisis sintáxico @5 63 |
C03 | 04 | 3 | FRE | @0 Implémentation @5 64 |
C03 | 04 | 3 | ENG | @0 Implementation @5 64 |
C03 | 05 | 3 | FRE | @0 Reconnaissance optique caractère @5 65 |
C03 | 05 | 3 | ENG | @0 Optical character recognition @5 65 |
C03 | 06 | X | FRE | @0 Précision élevée @5 66 |
C03 | 06 | X | ENG | @0 High precision @5 66 |
C03 | 06 | X | SPA | @0 Precisión elevada @5 66 |
C03 | 07 | X | FRE | @0 Extraction information @5 67 |
C03 | 07 | X | ENG | @0 Information extraction @5 67 |
C03 | 07 | X | SPA | @0 Extracción información @5 67 |
C03 | 08 | 3 | FRE | @0 0130C @4 INC @5 83 |
C03 | 09 | 3 | FRE | @0 4230S @4 INC @5 91 |
C07 | 01 | X | FRE | @0 Traitement information @5 68 |
C07 | 01 | X | ENG | @0 Information processing @5 68 |
C07 | 01 | X | SPA | @0 Procesamiento información @5 68 |
N21 | | | | @1 277 |
N44 | 01 | | | @1 OTO |
N82 | | | | @1 OTO |
pR
A30 | 01 | 1 | ENG | @1 Document recognition and retrieval @2 17 @3 San Jose CA USA @4 2010 |
Inist format (server)
NO : | PASCAL 10-0429703 INIST |
ET : | Date of Birth Extraction Using Precise Shallow Parsing |
AU : | PEREDA (Ray); TAGHVA (Kazem); LIKFORMAN-SULEM (Laurence); AGAM (Gady) |
AF : | Information Science Research Institute University of Nevada, Las Vegas/Las Vegas, NV 89154-4021/Etats-Unis (1 aut., 2 aut.) |
DT : | Publication en série; Congrès; Niveau analytique |
SO : | Proceedings of SPIE, the International Society for Optical Engineering; ISSN 0277-786X; Coden PSISDG; Etats-Unis; Da. 2010; Vol. 7534; 753406.1-753406.7; Bibl. 10 ref. |
LA : | Anglais |
EA : | This paper presents the implementation and evaluation of a pattern-based program to extract date of birth information from OCR text. Although the program finds data of birth information with high precision and recall, this type of information extraction task seems to be negatively impacted by OCR errors. |
CC : | 001B00A30C; 001B40B30S; 001D04A05A; 001D04A03 |
FD : | Reconnaissance forme; Recherche documentaire; Analyse syntaxique; Implémentation; Reconnaissance optique caractère; Précision élevée; Extraction information; 0130C; 4230S |
FG : | Traitement information |
ED : | Pattern recognition; Document retrieval; Syntactic analysis; Implementation; Optical character recognition; High precision; Information extraction |
EG : | Information processing |
SD : | Búsqueda documental; Análisis sintáxico; Precisión elevada; Extracción información |
LO : | INIST-21760.354000174683810050 |
ID : | 10-0429703 |
The document in XML format
<record><TEI><teiHeader><fileDesc><titleStmt><title xml:lang="en" level="a">Date of Birth Extraction Using Precise Shallow Parsing</title>
<author><name sortKey="Pereda, Ray" sort="Pereda, Ray" uniqKey="Pereda R" first="Ray" last="Pereda">Ray Pereda</name>
<affiliation><inist:fA14 i1="01"><s1>Information Science Research Institute University of Nevada, Las Vegas</s1>
<s2>Las Vegas, NV 89154-4021</s2>
<s3>USA</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
</inist:fA14>
</affiliation>
</author>
<author><name sortKey="Taghva, Kazem" sort="Taghva, Kazem" uniqKey="Taghva K" first="Kazem" last="Taghva">Kazem Taghva</name>
<affiliation><inist:fA14 i1="01"><s1>Information Science Research Institute University of Nevada, Las Vegas</s1>
<s2>Las Vegas, NV 89154-4021</s2>
<s3>USA</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
</inist:fA14>
</affiliation>
</author>
</titleStmt>
<publicationStmt><idno type="wicri:source">INIST</idno>
<idno type="inist">10-0429703</idno>
<date when="2010">2010</date>
<idno type="stanalyst">PASCAL 10-0429703 INIST</idno>
<idno type="RBID">Pascal:10-0429703</idno>
<idno type="wicri:Area/PascalFrancis/Corpus">000162</idno>
</publicationStmt>
<sourceDesc><biblStruct><analytic><title xml:lang="en" level="a">Date of Birth Extraction Using Precise Shallow Parsing</title>
<author><name sortKey="Pereda, Ray" sort="Pereda, Ray" uniqKey="Pereda R" first="Ray" last="Pereda">Ray Pereda</name>
<affiliation><inist:fA14 i1="01"><s1>Information Science Research Institute University of Nevada, Las Vegas</s1>
<s2>Las Vegas, NV 89154-4021</s2>
<s3>USA</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
</inist:fA14>
</affiliation>
</author>
<author><name sortKey="Taghva, Kazem" sort="Taghva, Kazem" uniqKey="Taghva K" first="Kazem" last="Taghva">Kazem Taghva</name>
<affiliation><inist:fA14 i1="01"><s1>Information Science Research Institute University of Nevada, Las Vegas</s1>
<s2>Las Vegas, NV 89154-4021</s2>
<s3>USA</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
</inist:fA14>
</affiliation>
</author>
</analytic>
<series><title level="j" type="main">Proceedings of SPIE, the International Society for Optical Engineering</title>
<title level="j" type="abbreviated">Proc. SPIE Int. Soc. Opt. Eng.</title>
<idno type="ISSN">0277-786X</idno>
<imprint><date when="2010">2010</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
<seriesStmt><title level="j" type="main">Proceedings of SPIE, the International Society for Optical Engineering</title>
<title level="j" type="abbreviated">Proc. SPIE Int. Soc. Opt. Eng.</title>
<idno type="ISSN">0277-786X</idno>
</seriesStmt>
</fileDesc>
<profileDesc><textClass><keywords scheme="KwdEn" xml:lang="en"><term>Document retrieval</term>
<term>High precision</term>
<term>Implementation</term>
<term>Information extraction</term>
<term>Optical character recognition</term>
<term>Pattern recognition</term>
<term>Syntactic analysis</term>
</keywords>
<keywords scheme="Pascal" xml:lang="fr"><term>Reconnaissance forme</term>
<term>Recherche documentaire</term>
<term>Analyse syntaxique</term>
<term>Implémentation</term>
<term>Reconnaissance optique caractère</term>
<term>Précision élevée</term>
<term>Extraction information</term>
<term>0130C</term>
<term>4230S</term>
</keywords>
</textClass>
</profileDesc>
</teiHeader>
<front><div type="abstract" xml:lang="en">This paper presents the implementation and evaluation of a pattern-based program to extract date of birth information from OCR text. Although the program finds data of birth information with high precision and recall, this type of information extraction task seems to be negatively impacted by OCR errors.</div>
</front>
</TEI>
<inist><standard h6="B"><pA><fA01 i1="01" i2="1"><s0>0277-786X</s0>
</fA01>
<fA02 i1="01"><s0>PSISDG</s0>
</fA02>
<fA03 i2="1"><s0>Proc. SPIE Int. Soc. Opt. Eng.</s0>
</fA03>
<fA05><s2>7534</s2>
</fA05>
<fA08 i1="01" i2="1" l="ENG"><s1>Date of Birth Extraction Using Precise Shallow Parsing</s1>
</fA08>
<fA09 i1="01" i2="1" l="ENG"><s1>Document recognition and retrieval XVII : 19-21 January 2010, San Jose, California, United States</s1>
</fA09>
<fA11 i1="01" i2="1"><s1>PEREDA (Ray)</s1>
</fA11>
<fA11 i1="02" i2="1"><s1>TAGHVA (Kazem)</s1>
</fA11>
<fA12 i1="01" i2="1"><s1>LIKFORMAN-SULEM (Laurence)</s1>
<s9>ed.</s9>
</fA12>
<fA12 i1="02" i2="1"><s1>AGAM (Gady)</s1>
<s9>ed.</s9>
</fA12>
<fA14 i1="01"><s1>Information Science Research Institute University of Nevada, Las Vegas</s1>
<s2>Las Vegas, NV 89154-4021</s2>
<s3>USA</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
</fA14>
<fA18 i1="01" i2="1"><s1>SPIE</s1>
<s3>USA</s3>
<s9>org-cong.</s9>
</fA18>
<fA18 i1="02" i2="1"><s1>IS&amp;T</s1>
<s3>USA</s3>
<s9>org-cong.</s9>
</fA18>
<fA18 i1="03" i2="1"><s1>Institut TELECOM</s1>
<s3>FRA</s3>
<s9>org-cong.</s9>
</fA18>
<fA20><s2>753406.1-753406.7</s2>
</fA20>
<fA21><s1>2010</s1>
</fA21>
<fA23 i1="01"><s0>ENG</s0>
</fA23>
<fA25 i1="01"><s1>SPIE</s1>
<s2>Bellingham WA</s2>
</fA25>
<fA26 i1="01"><s0>978-0-8194-7927-3</s0>
</fA26>
<fA43 i1="01"><s1>INIST</s1>
<s2>21760</s2>
<s5>354000174683810050</s5>
</fA43>
<fA44><s0>0000</s0>
<s1>© 2010 INIST-CNRS. All rights reserved.</s1>
</fA44>
<fA45><s0>10 ref.</s0>
</fA45>
<fA47 i1="01" i2="1"><s0>10-0429703</s0>
</fA47>
<fA60><s1>P</s1>
<s2>C</s2>
</fA60>
<fA64 i1="01" i2="1"><s0>Proceedings of SPIE, the International Society for Optical Engineering</s0>
</fA64>
<fA66 i1="01"><s0>USA</s0>
</fA66>
<fC01 i1="01" l="ENG"><s0>This paper presents the implementation and evaluation of a pattern-based program to extract date of birth information from OCR text. Although the program finds data of birth information with high precision and recall, this type of information extraction task seems to be negatively impacted by OCR errors.</s0>
</fC01>
<fC02 i1="01" i2="3"><s0>001B00A30C</s0>
</fC02>
<fC02 i1="02" i2="3"><s0>001B40B30S</s0>
</fC02>
<fC02 i1="03" i2="X"><s0>001D04A05A</s0>
</fC02>
<fC02 i1="04" i2="X"><s0>001D04A03</s0>
</fC02>
<fC03 i1="01" i2="3" l="FRE"><s0>Reconnaissance forme</s0>
<s5>61</s5>
</fC03>
<fC03 i1="01" i2="3" l="ENG"><s0>Pattern recognition</s0>
<s5>61</s5>
</fC03>
<fC03 i1="02" i2="X" l="FRE"><s0>Recherche documentaire</s0>
<s5>62</s5>
</fC03>
<fC03 i1="02" i2="X" l="ENG"><s0>Document retrieval</s0>
<s5>62</s5>
</fC03>
<fC03 i1="02" i2="X" l="SPA"><s0>Búsqueda documental</s0>
<s5>62</s5>
</fC03>
<fC03 i1="03" i2="X" l="FRE"><s0>Analyse syntaxique</s0>
<s5>63</s5>
</fC03>
<fC03 i1="03" i2="X" l="ENG"><s0>Syntactic analysis</s0>
<s5>63</s5>
</fC03>
<fC03 i1="03" i2="X" l="SPA"><s0>Análisis sintáxico</s0>
<s5>63</s5>
</fC03>
<fC03 i1="04" i2="3" l="FRE"><s0>Implémentation</s0>
<s5>64</s5>
</fC03>
<fC03 i1="04" i2="3" l="ENG"><s0>Implementation</s0>
<s5>64</s5>
</fC03>
<fC03 i1="05" i2="3" l="FRE"><s0>Reconnaissance optique caractère</s0>
<s5>65</s5>
</fC03>
<fC03 i1="05" i2="3" l="ENG"><s0>Optical character recognition</s0>
<s5>65</s5>
</fC03>
<fC03 i1="06" i2="X" l="FRE"><s0>Précision élevée</s0>
<s5>66</s5>
</fC03>
<fC03 i1="06" i2="X" l="ENG"><s0>High precision</s0>
<s5>66</s5>
</fC03>
<fC03 i1="06" i2="X" l="SPA"><s0>Precisión elevada</s0>
<s5>66</s5>
</fC03>
<fC03 i1="07" i2="X" l="FRE"><s0>Extraction information</s0>
<s5>67</s5>
</fC03>
<fC03 i1="07" i2="X" l="ENG"><s0>Information extraction</s0>
<s5>67</s5>
</fC03>
<fC03 i1="07" i2="X" l="SPA"><s0>Extracción información</s0>
<s5>67</s5>
</fC03>
<fC03 i1="08" i2="3" l="FRE"><s0>0130C</s0>
<s4>INC</s4>
<s5>83</s5>
</fC03>
<fC03 i1="09" i2="3" l="FRE"><s0>4230S</s0>
<s4>INC</s4>
<s5>91</s5>
</fC03>
<fC07 i1="01" i2="X" l="FRE"><s0>Traitement information</s0>
<s5>68</s5>
</fC07>
<fC07 i1="01" i2="X" l="ENG"><s0>Information processing</s0>
<s5>68</s5>
</fC07>
<fC07 i1="01" i2="X" l="SPA"><s0>Procesamiento información</s0>
<s5>68</s5>
</fC07>
<fN21><s1>277</s1>
</fN21>
<fN44 i1="01"><s1>OTO</s1>
</fN44>
<fN82><s1>OTO</s1>
</fN82>
</pA>
<pR><fA30 i1="01" i2="1" l="ENG"><s1>Document recognition and retrieval</s1>
<s2>17</s2>
<s3>San Jose CA USA</s3>
<s4>2010</s4>
</fA30>
</pR>
</standard>
<server><NO>PASCAL 10-0429703 INIST</NO>
<ET>Date of Birth Extraction Using Precise Shallow Parsing</ET>
<AU>PEREDA (Ray); TAGHVA (Kazem); LIKFORMAN-SULEM (Laurence); AGAM (Gady)</AU>
<AF>Information Science Research Institute University of Nevada, Las Vegas/Las Vegas, NV 89154-4021/Etats-Unis (1 aut., 2 aut.)</AF>
<DT>Publication en série; Congrès; Niveau analytique</DT>
<SO>Proceedings of SPIE, the International Society for Optical Engineering; ISSN 0277-786X; Coden PSISDG; Etats-Unis; Da. 2010; Vol. 7534; 753406.1-753406.7; Bibl. 10 ref.</SO>
<LA>Anglais</LA>
<EA>This paper presents the implementation and evaluation of a pattern-based program to extract date of birth information from OCR text. Although the program finds data of birth information with high precision and recall, this type of information extraction task seems to be negatively impacted by OCR errors.</EA>
<CC>001B00A30C; 001B40B30S; 001D04A05A; 001D04A03</CC>
<FD>Reconnaissance forme; Recherche documentaire; Analyse syntaxique; Implémentation; Reconnaissance optique caractère; Précision élevée; Extraction information; 0130C; 4230S</FD>
<FG>Traitement information</FG>
<ED>Pattern recognition; Document retrieval; Syntactic analysis; Implementation; Optical character recognition; High precision; Information extraction</ED>
<EG>Information processing</EG>
<SD>Búsqueda documental; Análisis sintáxico; Precisión elevada; Extracción información</SD>
<LO>INIST-21760.354000174683810050</LO>
<ID>10-0429703</ID>
</server>
</inist>
</record>
To manipulate this document under Unix (Dilib)
EXPLOR_STEP=$WICRI_ROOT/Ticri/CIDE/explor/OcrV1/Data/PascalFrancis/Corpus
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000162 | SxmlIndent | more
Or
HfdSelect -h $EXPLOR_AREA/Data/PascalFrancis/Corpus/biblio.hfd -nk 000162 | SxmlIndent | more
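For readers without the Dilib tools, the flat `<server>` block of the record can also be read with a general-purpose XML parser. The sketch below is an assumption-laden illustration, not part of Dilib: it uses Python's standard `xml.etree.ElementTree`, works on a trimmed copy of the `<server>` block pasted into a string, and the helper name `parse_server_block` is hypothetical. It deliberately avoids the full record, because the `inist:` prefix is not declared in the XML as displayed here and a strict parser may reject it.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the <server> block from the record (only a few fields kept).
SERVER_XML = """<server><NO>PASCAL 10-0429703 INIST</NO>
<ET>Date of Birth Extraction Using Precise Shallow Parsing</ET>
<AU>PEREDA (Ray); TAGHVA (Kazem); LIKFORMAN-SULEM (Laurence); AGAM (Gady)</AU>
<ED>Pattern recognition; Document retrieval; Syntactic analysis; Implementation; Optical character recognition; High precision; Information extraction</ED>
<ID>10-0429703</ID>
</server>"""


def parse_server_block(xml_text):
    """Map each child tag of <server> to its text; split list-like fields."""
    root = ET.fromstring(xml_text)
    fields = {child.tag: (child.text or "").strip() for child in root}
    # AU and ED hold several values separated by semicolons.
    for tag in ("AU", "ED"):
        if tag in fields:
            fields[tag] = [p.strip() for p in fields[tag].split(";") if p.strip()]
    return fields


record = parse_server_block(SERVER_XML)
print(record["ET"])
print(record["AU"])
```

Note that the `AU` field of this record lists the two volume editors after the two authors, so the resulting list has four names.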
To add a link to this page in the Wicri network
{{Explor lien
|wiki= Ticri/CIDE
|area= OcrV1
|flux= PascalFrancis
|étape= Corpus
|type= RBID
|clé= Pascal:10-0429703
|texte= Date of Birth Extraction Using Precise Shallow Parsing
}}
This area was generated with Dilib version V0.6.32. Data generation: Sat Nov 11 16:53:45 2017. Site generation: Mon Mar 11 23:15:16 2024