OCR exploration server

Warning: this site is under development!
Warning: this site was generated automatically from raw corpora.
The information has therefore not been validated.

Document recognition and XML generation of tabular form discharge summaries for analogous case search system.

Internal identifier: 000057 (PubMed/Corpus); previous: 000056; next: 000058

Authors: H. Kawanaka; T. Sumida; K. Yamamoto; T. Shinogi; S. Tsuruoka

Source: Methods of information in medicine [0026-1270]; 2007.

RBID : pubmed:18066422

English descriptors

Abstract

This paper discusses and develops a document image recognition, keyword extraction and automatic XML generation system to search analogous cases from paper-based documents. In this paper, we propose the document structure recognition method and automatic XML generation method for the tabular form discharge summary documents. This paper also develops the prototype system using the proposed method. Evaluation experiments using actual documents are done to discuss the effectiveness of the developed system.

PubMed: 18066422

The document in XML format

<record>
<TEI>
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en">Document recognition and XML generation of tabular form discharge summaries for analogous case search system.</title>
<author>
<name sortKey="Kawanaka, H" sort="Kawanaka, H" uniqKey="Kawanaka H" first="H" last="Kawanaka">H. Kawanaka</name>
<affiliation>
<nlm:affiliation>Graduate School of Engineering, Mie University, 1577 Kurima-Machiya, Tsu, Mie 514-8507, Japan. kawanaka@elec.mie-u.ac.jp</nlm:affiliation>
</affiliation>
</author>
<author>
<name sortKey="Sumida, T" sort="Sumida, T" uniqKey="Sumida T" first="T" last="Sumida">T. Sumida</name>
</author>
<author>
<name sortKey="Yamamoto, K" sort="Yamamoto, K" uniqKey="Yamamoto K" first="K" last="Yamamoto">K. Yamamoto</name>
</author>
<author>
<name sortKey="Shinogi, T" sort="Shinogi, T" uniqKey="Shinogi T" first="T" last="Shinogi">T. Shinogi</name>
</author>
<author>
<name sortKey="Tsuruoka, S" sort="Tsuruoka, S" uniqKey="Tsuruoka S" first="S" last="Tsuruoka">S. Tsuruoka</name>
</author>
</titleStmt>
<publicationStmt>
<idno type="wicri:source">PubMed</idno>
<date when="2007">2007</date>
<idno type="RBID">pubmed:18066422</idno>
<idno type="pmid">18066422</idno>
<idno type="wicri:Area/PubMed/Corpus">000057</idno>
</publicationStmt>
<sourceDesc>
<biblStruct>
<analytic>
<title xml:lang="en">Document recognition and XML generation of tabular form discharge summaries for analogous case search system.</title>
<author>
<name sortKey="Kawanaka, H" sort="Kawanaka, H" uniqKey="Kawanaka H" first="H" last="Kawanaka">H. Kawanaka</name>
<affiliation>
<nlm:affiliation>Graduate School of Engineering, Mie University, 1577 Kurima-Machiya, Tsu, Mie 514-8507, Japan. kawanaka@elec.mie-u.ac.jp</nlm:affiliation>
</affiliation>
</author>
<author>
<name sortKey="Sumida, T" sort="Sumida, T" uniqKey="Sumida T" first="T" last="Sumida">T. Sumida</name>
</author>
<author>
<name sortKey="Yamamoto, K" sort="Yamamoto, K" uniqKey="Yamamoto K" first="K" last="Yamamoto">K. Yamamoto</name>
</author>
<author>
<name sortKey="Shinogi, T" sort="Shinogi, T" uniqKey="Shinogi T" first="T" last="Shinogi">T. Shinogi</name>
</author>
<author>
<name sortKey="Tsuruoka, S" sort="Tsuruoka, S" uniqKey="Tsuruoka S" first="S" last="Tsuruoka">S. Tsuruoka</name>
</author>
</analytic>
<series>
<title level="j">Methods of information in medicine</title>
<idno type="ISSN">0026-1270</idno>
<imprint>
<date when="2007" type="published">2007</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
</fileDesc>
<profileDesc>
<textClass>
<keywords scheme="KwdEn" xml:lang="en">
<term>Documentation</term>
<term>Hospital Information Systems (organization &amp; administration)</term>
<term>Humans</term>
<term>Image Interpretation, Computer-Assisted</term>
<term>Japan</term>
<term>Medical Informatics</term>
<term>Medical Records Systems, Computerized</term>
<term>Optical Storage Devices</term>
<term>Patient Discharge</term>
<term>Program Evaluation</term>
<term>Programming Languages</term>
<term>Software</term>
</keywords>
<keywords scheme="MESH" type="geographic" xml:lang="en">
<term>Japan</term>
</keywords>
<keywords scheme="MESH" qualifier="organization &amp; administration" xml:lang="en">
<term>Hospital Information Systems</term>
</keywords>
<keywords scheme="MESH" xml:lang="en">
<term>Documentation</term>
<term>Humans</term>
<term>Image Interpretation, Computer-Assisted</term>
<term>Medical Informatics</term>
<term>Medical Records Systems, Computerized</term>
<term>Optical Storage Devices</term>
<term>Patient Discharge</term>
<term>Program Evaluation</term>
<term>Programming Languages</term>
<term>Software</term>
</keywords>
</textClass>
</profileDesc>
</teiHeader>
<front>
<div type="abstract" xml:lang="en">This paper discusses and develops a document image recognition, keyword extraction and automatic XML generation system to search analogous cases from paper-based documents. In this paper, we propose the document structure recognition method and automatic XML generation method for the tabular form discharge summary documents. This paper also develops the prototype system using the proposed method. Evaluation experiments using actual documents are done to discuss the effectiveness of the developed system.</div>
</front>
</TEI>
<pubmed>
<MedlineCitation Owner="NLM" Status="MEDLINE">
<PMID Version="1">18066422</PMID>
<DateCreated>
<Year>2007</Year>
<Month>12</Month>
<Day>10</Day>
</DateCreated>
<DateCompleted>
<Year>2008</Year>
<Month>02</Month>
<Day>12</Day>
</DateCompleted>
<Article PubModel="Print">
<Journal>
<ISSN IssnType="Print">0026-1270</ISSN>
<JournalIssue CitedMedium="Print">
<Volume>46</Volume>
<Issue>6</Issue>
<PubDate>
<Year>2007</Year>
</PubDate>
</JournalIssue>
<Title>Methods of information in medicine</Title>
<ISOAbbreviation>Methods Inf Med</ISOAbbreviation>
</Journal>
<ArticleTitle>Document recognition and XML generation of tabular form discharge summaries for analogous case search system.</ArticleTitle>
<Pagination>
<MedlinePgn>700-8</MedlinePgn>
</Pagination>
<Abstract>
<AbstractText Label="OBJECTIVES" NlmCategory="OBJECTIVE">This paper discusses and develops a document image recognition, keyword extraction and automatic XML generation system to search analogous cases from paper-based documents. In this paper, we propose the document structure recognition method and automatic XML generation method for the tabular form discharge summary documents. This paper also develops the prototype system using the proposed method. Evaluation experiments using actual documents are done to discuss the effectiveness of the developed system.</AbstractText>
<AbstractText Label="METHODS" NlmCategory="METHODS">The developed system consists of the following methods. Paper-based summary documents are scanned by a scanner using 300 dpi first. Noise and tilt of the image are reduced by pre-processing, and the table structures are identified. Characters in the table are recognized and converted to text data by the OCR engine. XML documents are automatically generated using obtained results.</AbstractText>
<AbstractText Label="RESULTS" NlmCategory="RESULTS">In this paper, patient discharge summary documents archived at Mie University Hospital were used. The results show that XML documents can be automatically generated when standard tabular form documents are input into the developed system. In this experiment, it takes about 20 seconds to generate an XML document using the general personal computer. This paper also compares the developed system with a commercial product to discuss the effectiveness of the present system. Experimental results also show that the accuracy of table structure recognition is high and it can be used in a practical situation.</AbstractText>
<AbstractText Label="CONCLUSIONS" NlmCategory="CONCLUSIONS">This paper showed the effectiveness of the proposed method to recognize the tabular form document images to generate XML documents.</AbstractText>
</Abstract>
<AuthorList CompleteYN="Y">
<Author ValidYN="Y">
<LastName>Kawanaka</LastName>
<ForeName>H</ForeName>
<Initials>H</Initials>
<AffiliationInfo>
<Affiliation>Graduate School of Engineering, Mie University, 1577 Kurima-Machiya, Tsu, Mie 514-8507, Japan. kawanaka@elec.mie-u.ac.jp</Affiliation>
</AffiliationInfo>
</Author>
<Author ValidYN="Y">
<LastName>Sumida</LastName>
<ForeName>T</ForeName>
<Initials>T</Initials>
</Author>
<Author ValidYN="Y">
<LastName>Yamamoto</LastName>
<ForeName>K</ForeName>
<Initials>K</Initials>
</Author>
<Author ValidYN="Y">
<LastName>Shinogi</LastName>
<ForeName>T</ForeName>
<Initials>T</Initials>
</Author>
<Author ValidYN="Y">
<LastName>Tsuruoka</LastName>
<ForeName>S</ForeName>
<Initials>S</Initials>
</Author>
</AuthorList>
<Language>eng</Language>
<PublicationTypeList>
<PublicationType UI="D016428">Journal Article</PublicationType>
</PublicationTypeList>
</Article>
<MedlineJournalInfo>
<Country>Germany</Country>
<MedlineTA>Methods Inf Med</MedlineTA>
<NlmUniqueID>0210453</NlmUniqueID>
<ISSNLinking>0026-1270</ISSNLinking>
</MedlineJournalInfo>
<CitationSubset>IM</CitationSubset>
<MeshHeadingList>
<MeshHeading>
<DescriptorName MajorTopicYN="Y" UI="D004282">Documentation</DescriptorName>
</MeshHeading>
<MeshHeading>
<DescriptorName MajorTopicYN="N" UI="D006751">Hospital Information Systems</DescriptorName>
<QualifierName MajorTopicYN="N" UI="Q000458">organization &amp; administration</QualifierName>
</MeshHeading>
<MeshHeading>
<DescriptorName MajorTopicYN="N" UI="D006801">Humans</DescriptorName>
</MeshHeading>
<MeshHeading>
<DescriptorName MajorTopicYN="Y" UI="D007090">Image Interpretation, Computer-Assisted</DescriptorName>
</MeshHeading>
<MeshHeading>
<DescriptorName MajorTopicYN="N" Type="Geographic" UI="D007564">Japan</DescriptorName>
</MeshHeading>
<MeshHeading>
<DescriptorName MajorTopicYN="N" UI="D008490">Medical Informatics</DescriptorName>
</MeshHeading>
<MeshHeading>
<DescriptorName MajorTopicYN="N" UI="D016347">Medical Records Systems, Computerized</DescriptorName>
</MeshHeading>
<MeshHeading>
<DescriptorName MajorTopicYN="N" UI="D016249">Optical Storage Devices</DescriptorName>
</MeshHeading>
<MeshHeading>
<DescriptorName MajorTopicYN="N" UI="D010351">Patient Discharge</DescriptorName>
</MeshHeading>
<MeshHeading>
<DescriptorName MajorTopicYN="N" UI="D015397">Program Evaluation</DescriptorName>
</MeshHeading>
<MeshHeading>
<DescriptorName MajorTopicYN="Y" UI="D011381">Programming Languages</DescriptorName>
</MeshHeading>
<MeshHeading>
<DescriptorName MajorTopicYN="Y" UI="D012984">Software</DescriptorName>
</MeshHeading>
</MeshHeadingList>
</MedlineCitation>
<PubmedData>
<History>
<PubMedPubDate PubStatus="pubmed">
<Year>2007</Year>
<Month>12</Month>
<Day>11</Day>
<Hour>9</Hour>
<Minute>0</Minute>
</PubMedPubDate>
<PubMedPubDate PubStatus="medline">
<Year>2008</Year>
<Month>2</Month>
<Day>13</Day>
<Hour>9</Hour>
<Minute>0</Minute>
</PubMedPubDate>
<PubMedPubDate PubStatus="entrez">
<Year>2007</Year>
<Month>12</Month>
<Day>11</Day>
<Hour>9</Hour>
<Minute>0</Minute>
</PubMedPubDate>
</History>
<PublicationStatus>ppublish</PublicationStatus>
<ArticleIdList>
<ArticleId IdType="pii">07060700</ArticleId>
<ArticleId IdType="pubmed">18066422</ArticleId>
</ArticleIdList>
</PubmedData>
</pubmed>
</record>
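
As an illustration of how the `<pubmed>` part of such a record might be read programmatically, here is a minimal Python sketch using the standard library's `xml.etree.ElementTree`. It is not part of Dilib. The `SAMPLE` string is a trimmed excerpt of the record above, and the function name `summarize_record` is hypothetical. The sketch assumes the input is well-formed XML, so a literal `&` must be written as `&amp;`; the `<TEI>` part uses an `nlm:` prefix with no namespace declaration in this record, so it would likely need extra handling.

```python
import xml.etree.ElementTree as ET

# Trimmed excerpt of the <pubmed> element shown above (two of the five
# authors and two of the MeSH headings), used only for illustration.
SAMPLE = """<pubmed>
<MedlineCitation Owner="NLM" Status="MEDLINE">
<PMID Version="1">18066422</PMID>
<Article PubModel="Print">
<ArticleTitle>Document recognition and XML generation of tabular form discharge summaries for analogous case search system.</ArticleTitle>
<AuthorList CompleteYN="Y">
<Author ValidYN="Y"><LastName>Kawanaka</LastName><ForeName>H</ForeName><Initials>H</Initials></Author>
<Author ValidYN="Y"><LastName>Sumida</LastName><ForeName>T</ForeName><Initials>T</Initials></Author>
</AuthorList>
</Article>
<MeshHeadingList>
<MeshHeading>
<DescriptorName MajorTopicYN="Y" UI="D004282">Documentation</DescriptorName>
</MeshHeading>
<MeshHeading>
<DescriptorName MajorTopicYN="N" UI="D006751">Hospital Information Systems</DescriptorName>
<QualifierName MajorTopicYN="N" UI="Q000458">organization &amp; administration</QualifierName>
</MeshHeading>
</MeshHeadingList>
</MedlineCitation>
</pubmed>"""


def summarize_record(xml_text):
    """Return the PMID, title, authors and major MeSH topics of one <pubmed> element."""
    root = ET.fromstring(xml_text)
    citation = root.find("MedlineCitation")
    # Build "ForeName LastName" strings in document order.
    authors = [
        f"{author.findtext('ForeName')} {author.findtext('LastName')}"
        for author in citation.iter("Author")
    ]
    # Keep only the descriptors flagged as major topics.
    major_topics = [
        descriptor.text
        for descriptor in citation.iter("DescriptorName")
        if descriptor.get("MajorTopicYN") == "Y"
    ]
    return {
        "pmid": citation.findtext("PMID"),
        "title": citation.findtext("Article/ArticleTitle"),
        "authors": authors,
        "major_topics": major_topics,
    }


summary = summarize_record(SAMPLE)
print(summary["pmid"], summary["authors"], summary["major_topics"])
```

The same function could be applied to the output of the Dilib commands below, provided the `<pubmed>` element is extracted from the record first.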

To work with this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Ticri/CIDE/explor/OcrV1/Data/PubMed/Corpus
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000057 | SxmlIndent | more

Or

HfdSelect -h $EXPLOR_AREA/Data/PubMed/Corpus/biblio.hfd -nk 000057 | SxmlIndent | more

To link to this page within the Wicri network

{{Explor lien
   |wiki=    Ticri/CIDE
   |area=    OcrV1
   |flux=    PubMed
   |étape=   Corpus
   |type=    RBID
   |clé=     pubmed:18066422
   |texte=   Document recognition and XML generation of tabular form discharge summaries for analogous case search system.
}}

To generate wiki pages

HfdIndexSelect -h $EXPLOR_AREA/Data/PubMed/Corpus/RBID.i   -Sk "pubmed:18066422" \
       | HfdSelect -Kh $EXPLOR_AREA/Data/PubMed/Corpus/biblio.hfd   \
       | NlmPubMed2Wicri -a OcrV1 

Wicri

This area was generated with Dilib version V0.6.32.
Data generation: Sat Nov 11 16:53:45 2017. Site generation: Mon Mar 11 23:15:16 2024