ScienceTreks an autonomous digital library system
Internal identifier: 001115 (Main/Exploration); previous: 001114; next: 001116
Authors: A. R. D. Prasad; Alexander Ivanyukovich [Italy]; Maurizio Marchese [Italy]; Fausto Giunchiglia [Italy]
Source:
- Online Information Review [1468-4527]; 2008-08-08.
Abstract
Purpose: The purpose of this paper is to provide support for automation of the annotation process of large corpora of digital content.
Design/methodology/approach: The paper presents and discusses an information extraction pipeline from digital document acquisition to information extraction, processing and management. An overall architecture that supports such an extraction pipeline is detailed and discussed.
Findings: The proposed pipeline is implemented in a working prototype of an autonomous digital library (ADL) system called ScienceTreks that: supports a broad range of methods for document acquisition; does not rely on any external information sources and is solely based on the existing information in the document itself and in the overall set in a given digital archive; and provides application programming interfaces (API) to support easy integration of external systems and tools in the existing pipeline.
Practical implications: The proposed ADL system can be used in automating end-to-end information retrieval and processing, supporting the control and elimination of error-prone human intervention in the process.
Originality/value: High-quality automatic metadata extraction is a crucial step in the move from linguistic entities to logical entities, relation information and logical relations, and therefore to the semantic level of digital library usability. This in turn creates the opportunity for value-added services within existing and future semantic-enabled digital library systems.
DOI: 10.1108/14684520810897368
Links toward previous steps (curation, corpus...)
- to stream Istex, to step Corpus: 001B89
- to stream Istex, to step Curation: 001A72
- to stream Istex, to step Checkpoint: 000529
- to stream Main, to step Merge: 001217
- to stream Main, to step Curation: 001115
The document in XML format
<record><TEI wicri:istexFullTextTei="biblStruct"><teiHeader><fileDesc><titleStmt><title xml:lang="en">ScienceTreks an autonomous digital library system</title>
<author wicri:is="90%"><name sortKey="Prasad, A R D" sort="Prasad, A R D" uniqKey="Prasad A" first="A. R. D." last="Prasad">A. R. D. Prasad</name>
</author>
<author><name sortKey="Ivanyukovich, Alexander" sort="Ivanyukovich, Alexander" uniqKey="Ivanyukovich A" first="Alexander" last="Ivanyukovich">Alexander Ivanyukovich</name>
</author>
<author><name sortKey="Marchese, Maurizio" sort="Marchese, Maurizio" uniqKey="Marchese M" first="Maurizio" last="Marchese">Maurizio Marchese</name>
</author>
<author><name sortKey="Giunchiglia, Fausto" sort="Giunchiglia, Fausto" uniqKey="Giunchiglia F" first="Fausto" last="Giunchiglia">Fausto Giunchiglia</name>
</author>
</titleStmt>
<publicationStmt><idno type="wicri:source">ISTEX</idno>
<idno type="RBID">ISTEX:8DB9DDC074E1353C99D633F07A936EE6057B2B99</idno>
<date when="2008" year="2008">2008</date>
<idno type="doi">10.1108/14684520810897368</idno>
<idno type="url">https://api.istex.fr/document/8DB9DDC074E1353C99D633F07A936EE6057B2B99/fulltext/pdf</idno>
<idno type="wicri:Area/Istex/Corpus">001B89</idno>
<idno type="wicri:explorRef" wicri:stream="Istex" wicri:step="Corpus" wicri:corpus="ISTEX">001B89</idno>
<idno type="wicri:Area/Istex/Curation">001A72</idno>
<idno type="wicri:Area/Istex/Checkpoint">000529</idno>
<idno type="wicri:explorRef" wicri:stream="Istex" wicri:step="Checkpoint">000529</idno>
<idno type="wicri:doubleKey">1468-4527:2008:Prasad A:sciencetreks:an:autonomous</idno>
<idno type="wicri:Area/Main/Merge">001217</idno>
<idno type="wicri:Area/Main/Curation">001115</idno>
<idno type="wicri:Area/Main/Exploration">001115</idno>
</publicationStmt>
<sourceDesc><biblStruct><analytic><title level="a" type="main" xml:lang="en">ScienceTreks an autonomous digital library system</title>
<author wicri:is="90%"><name sortKey="Prasad, A R D" sort="Prasad, A R D" uniqKey="Prasad A" first="A. R. D." last="Prasad">A. R. D. Prasad</name>
</author>
<author><name sortKey="Ivanyukovich, Alexander" sort="Ivanyukovich, Alexander" uniqKey="Ivanyukovich A" first="Alexander" last="Ivanyukovich">Alexander Ivanyukovich</name>
<affiliation wicri:level="1"><country xml:lang="fr">Italie</country>
<wicri:regionArea>Department of Information and Communication Technology, University of Trento, Trento</wicri:regionArea>
<wicri:noRegion>Trento</wicri:noRegion>
</affiliation>
</author>
<author><name sortKey="Marchese, Maurizio" sort="Marchese, Maurizio" uniqKey="Marchese M" first="Maurizio" last="Marchese">Maurizio Marchese</name>
<affiliation wicri:level="1"><country xml:lang="fr">Italie</country>
<wicri:regionArea>Department of Information and Communication Technology, University of Trento, Trento</wicri:regionArea>
<wicri:noRegion>Trento</wicri:noRegion>
</affiliation>
</author>
<author><name sortKey="Giunchiglia, Fausto" sort="Giunchiglia, Fausto" uniqKey="Giunchiglia F" first="Fausto" last="Giunchiglia">Fausto Giunchiglia</name>
<affiliation wicri:level="1"><country xml:lang="fr">Italie</country>
<wicri:regionArea>Department of Information and Communication Technology, University of Trento, Trento</wicri:regionArea>
<wicri:noRegion>Trento</wicri:noRegion>
</affiliation>
</author>
</analytic>
<monogr></monogr>
<series><title level="j">Online Information Review</title>
<idno type="ISSN">1468-4527</idno>
<imprint><publisher>Emerald Group Publishing Limited</publisher>
<date type="published" when="2008-08-08">2008-08-08</date>
<biblScope unit="volume">32</biblScope>
<biblScope unit="issue">4</biblScope>
<biblScope unit="page" from="488">488</biblScope>
<biblScope unit="page" to="499">499</biblScope>
</imprint>
<idno type="ISSN">1468-4527</idno>
</series>
<idno type="istex">8DB9DDC074E1353C99D633F07A936EE6057B2B99</idno>
<idno type="DOI">10.1108/14684520810897368</idno>
<idno type="filenameID">2640320403</idno>
<idno type="original-pdf">2640320403.pdf</idno>
<idno type="href">14684520810897368.pdf</idno>
</biblStruct>
</sourceDesc>
<seriesStmt><idno type="ISSN">1468-4527</idno>
</seriesStmt>
</fileDesc>
<profileDesc><textClass></textClass>
<langUsage><language ident="en">en</language>
</langUsage>
</profileDesc>
</teiHeader>
<front><div type="abstract">Purpose: The purpose of this paper is to provide support for automation of the annotation process of large corpora of digital content. Design/methodology/approach: The paper presents and discusses an information extraction pipeline from digital document acquisition to information extraction, processing and management. An overall architecture that supports such an extraction pipeline is detailed and discussed. Findings: The proposed pipeline is implemented in a working prototype of an autonomous digital library (ADL) system called ScienceTreks that: supports a broad range of methods for document acquisition; does not rely on any external information sources and is solely based on the existing information in the document itself and in the overall set in a given digital archive; and provides application programming interfaces (API) to support easy integration of external systems and tools in the existing pipeline. Practical implications: The proposed ADL system can be used in automating end-to-end information retrieval and processing, supporting the control and elimination of error-prone human intervention in the process. Originality/value: High-quality automatic metadata extraction is a crucial step in the move from linguistic entities to logical entities, relation information and logical relations, and therefore to the semantic level of digital library usability. This in turn creates the opportunity for value-added services within existing and future semantic-enabled digital library systems.</div>
</front>
</TEI>
<affiliations><list><country><li>Italie</li>
</country>
</list>
<tree><noCountry><name sortKey="Prasad, A R D" sort="Prasad, A R D" uniqKey="Prasad A" first="A. R. D." last="Prasad">A. R. D. Prasad</name>
</noCountry>
<country name="Italie"><noRegion><name sortKey="Ivanyukovich, Alexander" sort="Ivanyukovich, Alexander" uniqKey="Ivanyukovich A" first="Alexander" last="Ivanyukovich">Alexander Ivanyukovich</name>
</noRegion>
<name sortKey="Giunchiglia, Fausto" sort="Giunchiglia, Fausto" uniqKey="Giunchiglia F" first="Fausto" last="Giunchiglia">Fausto Giunchiglia</name>
<name sortKey="Marchese, Maurizio" sort="Marchese, Maurizio" uniqKey="Marchese M" first="Maurizio" last="Marchese">Maurizio Marchese</name>
</country>
</tree>
</affiliations>
</record>
To manipulate this document under Unix (Dilib)
EXPLOR_STEP=$WICRI_ROOT/Wicri/Rhénanie/explor/UnivTrevesV1/Data/Main/Exploration
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 001115 | SxmlIndent | more
Or
HfdSelect -h $EXPLOR_AREA/Data/Main/Exploration/biblio.hfd -nk 001115 | SxmlIndent | more
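Where Dilib is not installed, a saved copy of the record can also be inspected with general-purpose tools. The following is a minimal Python sketch, not part of Dilib, that reads a trimmed excerpt of the XML record and pulls out the title, author names and DOI. The function names `parse_record` and `summarize` are hypothetical. The record as shown uses a `wicri:` attribute prefix without declaring it, which a namespace-aware parser would normally reject, so the sketch binds the prefix to a placeholder URI (`urn:example:wicri`); that URI is an assumption made only so the text parses.

```python
import xml.etree.ElementTree as ET

# Placeholder namespace URI (assumption): the record shown on this page does
# not declare the "wicri:" prefix, so we bind it ourselves before parsing.
WICRI_NS = "urn:example:wicri"


def parse_record(xml_text: str) -> ET.Element:
    """Parse a <record> element after binding the undeclared wicri: prefix."""
    patched = xml_text.replace(
        "<record>", f'<record xmlns:wicri="{WICRI_NS}">', 1
    )
    return ET.fromstring(patched)


def summarize(record: ET.Element) -> dict:
    """Collect the title, author names and DOI from the header of a record."""
    title_stmt = record.find(".//titleStmt")
    authors = [name.text for name in title_stmt.findall("author/name")]
    return {
        "title": title_stmt.findtext("title"),
        "authors": authors,
        "doi": record.findtext(".//publicationStmt/idno[@type='doi']"),
    }


# Trimmed excerpt of the record; only the elements used above are kept.
SAMPLE = """<record><TEI wicri:istexFullTextTei="biblStruct"><teiHeader><fileDesc><titleStmt><title xml:lang="en">ScienceTreks an autonomous digital library system</title>
<author wicri:is="90%"><name first="A. R. D." last="Prasad">A. R. D. Prasad</name>
</author>
<author><name first="Fausto" last="Giunchiglia">Fausto Giunchiglia</name>
</author>
</titleStmt>
<publicationStmt><idno type="doi">10.1108/14684520810897368</idno>
</publicationStmt>
</fileDesc></teiHeader></TEI></record>"""

if __name__ == "__main__":
    print(summarize(parse_record(SAMPLE)))
```

Only the `titleStmt` block is read for authors, so names repeated elsewhere in the full record (for example under `sourceDesc` or `affiliations`) would not be counted twice.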
To link to this page within the Wicri network
{{Explor lien |wiki= Wicri/Rhénanie |area= UnivTrevesV1 |flux= Main |étape= Exploration |type= RBID |clé= ISTEX:8DB9DDC074E1353C99D633F07A936EE6057B2B99 |texte= ScienceTreks an autonomous digital library system }}
This area was generated with Dilib version V0.6.31.