Document Reverse Engineering: From Paper to XML
Internal identifier: 001993 (Main/Merge); previous: 001992; next: 001994
Authors: Kyong-Ho Lee [United States]; Yoon-Chul Choy [South Korea]; Sung-Bae Cho [South Korea]; Xiao Tang [United States]; Victor Mccrary [United States]
Source:
- Lecture Notes in Computer Science [ 0302-9743 ] ; 2002.
Abstract
Since XML has the advantage of embedding logical structure information into documents, it is widely used as the universal format for structured documents on the Web. This makes it attractive to convert paper-based documents with logical hierarchy into XML representations automatically. Document image analysis and understanding [1] consists of two phases: geometric and logical structure analysis. Because the two phases take different kinds of data as input, it may not be desirable to apply the same method to them. Targeting technical journal documents with multiple pages, we present a hybridization of knowledge-based and syntactic methods for geometric and logical structure analysis of document images.
DOI: 10.1007/3-540-45869-7_53
Links toward previous steps (curation, corpus...)
- to stream Istex, to step Corpus: 003B19
- to stream Istex, to step Curation: 003853
- to stream Istex, to step Checkpoint: 001027
Links to Exploration step
ISTEX:09C513E24DF93766F77EC8FA412D79E76CCDC8F0
The document in XML format
<record><TEI wicri:istexFullTextTei="biblStruct"><teiHeader><fileDesc><titleStmt><title xml:lang="en">Document Reverse Engineering: From Paper to XML</title>
<author><name sortKey="Lee, Kyong Ho" sort="Lee, Kyong Ho" uniqKey="Lee K" first="Kyong-Ho" last="Lee">Kyong-Ho Lee</name>
</author>
<author><name sortKey="Choy, Yoon Chul" sort="Choy, Yoon Chul" uniqKey="Choy Y" first="Yoon-Chul" last="Choy">Yoon-Chul Choy</name>
</author>
<author><name sortKey="Cho, Sung Bae" sort="Cho, Sung Bae" uniqKey="Cho S" first="Sung-Bae" last="Cho">Sung-Bae Cho</name>
</author>
<author><name sortKey="Tang, Xiao" sort="Tang, Xiao" uniqKey="Tang X" first="Xiao" last="Tang">Xiao Tang</name>
</author>
<author><name sortKey="Mccrary, Victor" sort="Mccrary, Victor" uniqKey="Mccrary V" first="Victor" last="Mccrary">Victor Mccrary</name>
</author>
</titleStmt>
<publicationStmt><idno type="wicri:source">ISTEX</idno>
<idno type="RBID">ISTEX:09C513E24DF93766F77EC8FA412D79E76CCDC8F0</idno>
<date when="2002" year="2002">2002</date>
<idno type="doi">10.1007/3-540-45869-7_53</idno>
<idno type="url">https://api.istex.fr/document/09C513E24DF93766F77EC8FA412D79E76CCDC8F0/fulltext/pdf</idno>
<idno type="wicri:Area/Istex/Corpus">003B19</idno>
<idno type="wicri:Area/Istex/Curation">003853</idno>
<idno type="wicri:Area/Istex/Checkpoint">001027</idno>
<idno type="wicri:doubleKey">0302-9743:2002:Lee K:document:reverse:engineering</idno>
<idno type="wicri:Area/Main/Merge">001993</idno>
</publicationStmt>
<sourceDesc><biblStruct><analytic><title level="a" type="main" xml:lang="en">Document Reverse Engineering: From Paper to XML</title>
<author><name sortKey="Lee, Kyong Ho" sort="Lee, Kyong Ho" uniqKey="Lee K" first="Kyong-Ho" last="Lee">Kyong-Ho Lee</name>
<affiliation wicri:level="2"><country xml:lang="fr">États-Unis</country>
<wicri:regionArea>National Institute of Standards and Technology, 100 Bureau Drive, 20889, Gaithersburg, MD</wicri:regionArea>
<placeName><region type="state">Maryland</region>
</placeName>
</affiliation>
<affiliation><wicri:noCountry code="no comma">E-mail: kyongho@nist.gov</wicri:noCountry>
</affiliation>
</author>
<author><name sortKey="Choy, Yoon Chul" sort="Choy, Yoon Chul" uniqKey="Choy Y" first="Yoon-Chul" last="Choy">Yoon-Chul Choy</name>
<affiliation wicri:level="3"><country xml:lang="fr">Corée du Sud</country>
<wicri:regionArea>Dept. Computer Science, Yonsei Univ., 134 Shinchon-dong, 120-749, Seodaemun-ku, Seoul</wicri:regionArea>
<placeName><settlement type="city">Séoul</settlement>
</placeName>
</affiliation>
<affiliation wicri:level="1"><country wicri:rule="url">Corée du Sud</country>
</affiliation>
</author>
<author><name sortKey="Cho, Sung Bae" sort="Cho, Sung Bae" uniqKey="Cho S" first="Sung-Bae" last="Cho">Sung-Bae Cho</name>
<affiliation wicri:level="3"><country xml:lang="fr">Corée du Sud</country>
<wicri:regionArea>Dept. Computer Science, Yonsei Univ., 134 Shinchon-dong, 120-749, Seodaemun-ku, Seoul</wicri:regionArea>
<placeName><settlement type="city">Séoul</settlement>
</placeName>
</affiliation>
<affiliation wicri:level="1"><country wicri:rule="url">Corée du Sud</country>
</affiliation>
</author>
<author><name sortKey="Tang, Xiao" sort="Tang, Xiao" uniqKey="Tang X" first="Xiao" last="Tang">Xiao Tang</name>
<affiliation wicri:level="2"><country xml:lang="fr">États-Unis</country>
<wicri:regionArea>National Institute of Standards and Technology, 100 Bureau Drive, 20889, Gaithersburg, MD</wicri:regionArea>
<placeName><region type="state">Maryland</region>
</placeName>
</affiliation>
<affiliation><wicri:noCountry code="no comma">E-mail: xiao.tang@nist.gov</wicri:noCountry>
</affiliation>
</author>
<author><name sortKey="Mccrary, Victor" sort="Mccrary, Victor" uniqKey="Mccrary V" first="Victor" last="Mccrary">Victor Mccrary</name>
<affiliation wicri:level="2"><country xml:lang="fr">États-Unis</country>
<wicri:regionArea>National Institute of Standards and Technology, 100 Bureau Drive, 20889, Gaithersburg, MD</wicri:regionArea>
<placeName><region type="state">Maryland</region>
</placeName>
</affiliation>
<affiliation><wicri:noCountry code="no comma">E-mail: victor.mccrary@nist.gov</wicri:noCountry>
</affiliation>
</author>
</analytic>
<monogr></monogr>
<series><title level="s">Lecture Notes in Computer Science</title>
<imprint><date>2002</date>
</imprint>
<idno type="ISSN">0302-9743</idno>
</series>
<idno type="istex">09C513E24DF93766F77EC8FA412D79E76CCDC8F0</idno>
<idno type="DOI">10.1007/3-540-45869-7_53</idno>
<idno type="ChapterID">53</idno>
<idno type="ChapterID">Chap53</idno>
</biblStruct>
</sourceDesc>
<seriesStmt><idno type="ISSN">0302-9743</idno>
</seriesStmt>
</fileDesc>
<profileDesc><textClass></textClass>
<langUsage><language ident="en">en</language>
</langUsage>
</profileDesc>
</teiHeader>
<front><div type="abstract" xml:lang="en">Abstract: Since XML has the advantage of embedding logical structure information into documents, it is widely used as the universal format for structured documents on the Web. This makes it attractive to convert paper-based documents with logical hierarchy into XML representations automatically. Document image analysis and understanding [1] consists of two phases: geometric and logical structure analysis. Because the two phases take different kinds of data as input, it may not be desirable to apply the same method to them. Targeting technical journal document with multiple pages, we present a hybridization of knowledge-based and syntactic methods for geometric and logical structure analysis of document images.</div>
</front>
</TEI>
</record>
To manipulate this document under Unix (Dilib)
EXPLOR_STEP=$WICRI_ROOT/Ticri/CIDE/explor/OcrV1/Data/Main/Merge
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 001993 | SxmlIndent | more
Or
HfdSelect -h $EXPLOR_AREA/Data/Main/Merge/biblio.hfd -nk 001993 | SxmlIndent | more
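Once a record has been selected with the commands above, its fields can also be read programmatically. The following is a minimal Python sketch, not part of Dilib, under two assumptions: that the selected output is a single `<record>` element shaped like the one shown on this page, and that the `wicri:` prefix is undeclared in that output (as it is in the record above), so a strict XML parser rejects it unless a namespace is declared first. The namespace URI `urn:example:wicri`, the function names `parse_record` and `summarize`, and the trimmed `SAMPLE` are all illustrative placeholders.

```python
import xml.etree.ElementTree as ET

# Placeholder namespace URI (an assumption, not an official Wicri URI).
# The record uses the wicri: prefix without declaring it, which a strict
# parser reports as an unbound prefix.
WICRI_NS = "urn:example:wicri"


def parse_record(xml_text: str) -> ET.Element:
    """Parse a <record>, declaring the wicri: prefix on the root element first."""
    patched = xml_text.replace(
        "<record>", f'<record xmlns:wicri="{WICRI_NS}">', 1
    )
    return ET.fromstring(patched)


def summarize(record: ET.Element) -> dict:
    """Collect the title, author names, and DOI from the record header."""
    title_stmt = record.find(".//titleStmt")
    pub_stmt = record.find(".//publicationStmt")
    return {
        "title": title_stmt.findtext("title"),
        "authors": [name.text for name in title_stmt.findall("author/name")],
        "doi": pub_stmt.findtext("idno[@type='doi']"),
    }


# Trimmed sample with the same element layout as the record on this page.
SAMPLE = """<record><TEI wicri:istexFullTextTei="biblStruct"><teiHeader><fileDesc>
<titleStmt><title xml:lang="en">Document Reverse Engineering: From Paper to XML</title>
<author><name>Kyong-Ho Lee</name></author>
<author><name>Yoon-Chul Choy</name></author>
</titleStmt>
<publicationStmt><idno type="doi">10.1007/3-540-45869-7_53</idno></publicationStmt>
</fileDesc></teiHeader></TEI></record>"""

if __name__ == "__main__":
    print(summarize(parse_record(SAMPLE)))
```

In practice the text passed to `parse_record` would be the output of `HfdSelect` captured from a pipe or file; the string replacement is a deliberately simple workaround and assumes the root tag is written exactly as `<record>`.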
To link to this page within the Wicri network
{{Explor lien |wiki= Ticri/CIDE |area= OcrV1 |flux= Main |étape= Merge |type= RBID |clé= ISTEX:09C513E24DF93766F77EC8FA412D79E76CCDC8F0 |texte= Document Reverse Engineering: From Paper to XML }}
This area was generated with Dilib version V0.6.32.