A proposal for annotation, semantic similarity and classification of textual documents
Internal identifier: 005624 (Main/Curation); previous: 005623; next: 005625
Authors: Emmanuel Nauer [France]; Amedeo Napoli [France]
Source:
- Lecture notes in computer science
French descriptors
- Pascal (Inist)
- Wicri :
- topic : Intelligence artificielle, Classification.
English descriptors
- KwdEn :
Abstract
In this paper, we present an approach for classifying documents based on the notion of semantic similarity and the effective representation of the content of the documents. The content of a document is annotated and the resulting annotation is represented by a labeled tree whose nodes and edges are represented by concepts lying within a domain ontology. A reasoning process may be carried out on annotation trees, allowing documents to be compared with one another for classification or information retrieval purposes. An algorithm for classifying documents with respect to semantic similarity and a discussion conclude the paper.
Links toward previous steps (curation, corpus...)
- to stream PascalFrancis, to step Corpus: 000348
- to stream PascalFrancis, to step Curation: 000677
- to stream PascalFrancis, to step Checkpoint: 000414
- to stream Main, to step Merge: 005819
Links to Exploration step
Pascal:08-0032186
The document in XML format
<record><TEI><teiHeader><fileDesc><titleStmt><title xml:lang="en" level="a">A proposal for annotation, semantic similarity and classification of textual documents</title>
<author><name sortKey="Nauer, Emmanuel" sort="Nauer, Emmanuel" uniqKey="Nauer E" first="Emmanuel" last="Nauer">Emmanuel Nauer</name>
<affiliation wicri:level="3"><inist:fA14 i1="01"><s1>LORIA -UMR 7503 Bâtiment B, B.P. 239</s1>
<s2>54506 Vandœuvre-lès-Nancy</s2>
<s3>FRA</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
</inist:fA14>
<country>France</country>
<placeName><region type="region" nuts="2">Grand Est</region>
<region type="old region" nuts="2">Lorraine (région)</region>
<settlement type="city">Vandœuvre-lès-Nancy</settlement>
</placeName>
</affiliation>
</author>
<author><name sortKey="Napoli, Amedeo" sort="Napoli, Amedeo" uniqKey="Napoli A" first="Amedeo" last="Napoli">Amedeo Napoli</name>
<affiliation wicri:level="3"><inist:fA14 i1="01"><s1>LORIA -UMR 7503 Bâtiment B, B.P. 239</s1>
<s2>54506 Vandœuvre-lès-Nancy</s2>
<s3>FRA</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
</inist:fA14>
<country>France</country>
<placeName><region type="region" nuts="2">Grand Est</region>
<region type="old region" nuts="2">Lorraine (région)</region>
<settlement type="city">Vandœuvre-lès-Nancy</settlement>
</placeName>
</affiliation>
</author>
</titleStmt>
<publicationStmt><idno type="wicri:source">INIST</idno>
<idno type="inist">08-0032186</idno>
<date when="2006">2006</date>
<idno type="stanalyst">PASCAL 08-0032186 INIST</idno>
<idno type="RBID">Pascal:08-0032186</idno>
<idno type="wicri:Area/PascalFrancis/Corpus">000348</idno>
<idno type="wicri:Area/PascalFrancis/Curation">000677</idno>
<idno type="wicri:Area/PascalFrancis/Checkpoint">000414</idno>
<idno type="wicri:explorRef" wicri:stream="PascalFrancis" wicri:step="Checkpoint">000414</idno>
<idno type="wicri:Area/Main/Merge">005819</idno>
<idno type="wicri:Area/Main/Curation">005624</idno>
</publicationStmt>
<sourceDesc><biblStruct><analytic><title xml:lang="en" level="a">A proposal for annotation, semantic similarity and classification of textual documents</title>
<author><name sortKey="Nauer, Emmanuel" sort="Nauer, Emmanuel" uniqKey="Nauer E" first="Emmanuel" last="Nauer">Emmanuel Nauer</name>
<affiliation wicri:level="3"><inist:fA14 i1="01"><s1>LORIA -UMR 7503 Bâtiment B, B.P. 239</s1>
<s2>54506 Vandœuvre-lès-Nancy</s2>
<s3>FRA</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
</inist:fA14>
<country>France</country>
<placeName><region type="region" nuts="2">Grand Est</region>
<region type="old region" nuts="2">Lorraine (région)</region>
<settlement type="city">Vandœuvre-lès-Nancy</settlement>
</placeName>
</affiliation>
</author>
<author><name sortKey="Napoli, Amedeo" sort="Napoli, Amedeo" uniqKey="Napoli A" first="Amedeo" last="Napoli">Amedeo Napoli</name>
<affiliation wicri:level="3"><inist:fA14 i1="01"><s1>LORIA -UMR 7503 Bâtiment B, B.P. 239</s1>
<s2>54506 Vandœuvre-lès-Nancy</s2>
<s3>FRA</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
</inist:fA14>
<country>France</country>
<placeName><region type="region" nuts="2">Grand Est</region>
<region type="old region" nuts="2">Lorraine (région)</region>
<settlement type="city">Vandœuvre-lès-Nancy</settlement>
</placeName>
</affiliation>
</author>
</analytic>
<series><title level="j" type="main">Lecture notes in computer science</title>
</series>
</biblStruct>
</sourceDesc>
<seriesStmt><title level="j" type="main">Lecture notes in computer science</title>
</seriesStmt>
</fileDesc>
<profileDesc><textClass><keywords scheme="KwdEn" xml:lang="en"><term>Annotation</term>
<term>Artificial intelligence</term>
<term>Classification</term>
<term>Content analysis</term>
<term>Information retrieval</term>
<term>Lying</term>
<term>Ontology</term>
<term>Semantics</term>
<term>Similarity</term>
<term>Text</term>
</keywords>
<keywords scheme="Pascal" xml:lang="fr"><term>Intelligence artificielle</term>
<term>Similitude</term>
<term>Classification</term>
<term>Texte</term>
<term>Analyse contenu</term>
<term>Ontologie</term>
<term>Recherche information</term>
<term>Annotation</term>
<term>Sémantique</term>
<term>Mensonge</term>
</keywords>
<keywords scheme="Wicri" type="topic" xml:lang="fr"><term>Intelligence artificielle</term>
<term>Classification</term>
</keywords>
</textClass>
</profileDesc>
</teiHeader>
<front><div type="abstract" xml:lang="en">In this paper, we present an approach for classifying documents based on the notion of a semantic similarity and the effective representation of the content of the documents. The content of a document is annotated and the resulting annotation is represented by a labeled tree whose nodes and edges are represented by concepts lying within a domain ontology. A reasoning process may be carried out on annotation trees, allowing the comparison of documents between each others, for classification or information retrieval purposes. An algorithm for classifying documents with respect to semantic similarity and a discussion conclude the paper.</div>
</front>
</TEI>
</record>
To manipulate this document under Unix (Dilib)
EXPLOR_STEP=$WICRI_ROOT/Wicri/Lorraine/explor/InforLorV4/Data/Main/Curation
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 005624 | SxmlIndent | more
Or
HfdSelect -h $EXPLOR_AREA/Data/Main/Curation/biblio.hfd -nk 005624 | SxmlIndent | more
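The XML record shown earlier uses the `wicri:` and `inist:` prefixes without declaring them, so a strict XML parser may reject it as written. The following is a minimal Python sketch of one way to read the title, authors and English keywords from such a record, assuming the text produced by the commands above has the same shape as the record on this page. The namespace URIs are placeholders and the names `parse_record`, `summarize` and `SAMPLE` are hypothetical; none of them come from Dilib.

```python
import xml.etree.ElementTree as ET

# Placeholder namespace URIs (an assumption): the record does not declare
# the wicri: and inist: prefixes, so we bind them to dummy values.
NS_DECL = 'xmlns:wicri="urn:example:wicri" xmlns:inist="urn:example:inist"'

# Shortened copy of the record on this page, used only for illustration.
SAMPLE = """<record><TEI><teiHeader><fileDesc><titleStmt>
<title xml:lang="en" level="a">A proposal for annotation, semantic similarity and classification of textual documents</title>
<author><name>Emmanuel Nauer</name>
<affiliation wicri:level="3"><country>France</country></affiliation></author>
<author><name>Amedeo Napoli</name>
<affiliation wicri:level="3"><country>France</country></affiliation></author>
</titleStmt></fileDesc>
<profileDesc><textClass><keywords scheme="KwdEn" xml:lang="en">
<term>Annotation</term><term>Ontology</term>
</keywords></textClass></profileDesc>
</teiHeader></TEI></record>"""


def parse_record(xml_text):
    """Bind the undeclared prefixes on the root element, then parse."""
    patched = xml_text.replace("<record>", "<record " + NS_DECL + ">", 1)
    return ET.fromstring(patched)


def summarize(record):
    """Collect the title, author names and English keywords."""
    return {
        "title": record.findtext(".//titleStmt/title"),
        "authors": [n.text for n in record.findall(".//titleStmt/author/name")],
        "keywords": [
            t.text for t in record.findall(".//keywords[@scheme='KwdEn']/term")
        ],
    }


if __name__ == "__main__":
    print(summarize(parse_record(SAMPLE)))
```

Binding the prefixes on the root element leaves the record content unchanged; only the parser's view of the prefixed attribute names differs.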
To add a link to this page in the Wicri network
{{Explor lien |wiki= Wicri/Lorraine |area= InforLorV4 |flux= Main |étape= Curation |type= RBID |clé= Pascal:08-0032186 |texte= A proposal for annotation, semantic similarity and classification of textual documents }}
This area was generated with Dilib version V0.6.33.