Exploration server for computer science research in Lorraine

Warning: this site is under development.
Warning: this site was generated automatically from raw corpora.
The information has therefore not been validated.

A proposal for annotation, semantic similarity and classification of textual documents

Internal identifier: 005624 (Main/Exploration); previous: 005623; next: 005625

Authors: Emmanuel Nauer [France]; Amedeo Napoli [France]

Source: Lecture notes in computer science, 2006

RBID : Pascal:08-0032186

French descriptors

English descriptors

Abstract

In this paper, we present an approach for classifying documents based on the notion of semantic similarity and the effective representation of the content of the documents. The content of a document is annotated and the resulting annotation is represented by a labeled tree whose nodes and edges are represented by concepts lying within a domain ontology. A reasoning process may be carried out on annotation trees, allowing documents to be compared with one another for classification or information retrieval purposes. An algorithm for classifying documents with respect to semantic similarity and a discussion conclude the paper.


Affiliations:


The document in XML format

<record>
<TEI>
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en" level="a">A proposal for annotation, semantic similarity and classification of textual documents</title>
<author>
<name sortKey="Nauer, Emmanuel" sort="Nauer, Emmanuel" uniqKey="Nauer E" first="Emmanuel" last="Nauer">Emmanuel Nauer</name>
<affiliation wicri:level="3">
<inist:fA14 i1="01">
<s1>LORIA -UMR 7503 Bâtiment B, B.P. 239</s1>
<s2>54506 Vandœuvre-lès-Nancy</s2>
<s3>FRA</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
</inist:fA14>
<country>France</country>
<placeName>
<region type="region" nuts="2">Grand Est</region>
<region type="old region" nuts="2">Lorraine (région)</region>
<settlement type="city">Vandœuvre-lès-Nancy</settlement>
</placeName>
</affiliation>
</author>
<author>
<name sortKey="Napoli, Amedeo" sort="Napoli, Amedeo" uniqKey="Napoli A" first="Amedeo" last="Napoli">Amedeo Napoli</name>
<affiliation wicri:level="3">
<inist:fA14 i1="01">
<s1>LORIA -UMR 7503 Bâtiment B, B.P. 239</s1>
<s2>54506 Vandœuvre-lès-Nancy</s2>
<s3>FRA</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
</inist:fA14>
<country>France</country>
<placeName>
<region type="region" nuts="2">Grand Est</region>
<region type="old region" nuts="2">Lorraine (région)</region>
<settlement type="city">Vandœuvre-lès-Nancy</settlement>
</placeName>
</affiliation>
</author>
</titleStmt>
<publicationStmt>
<idno type="wicri:source">INIST</idno>
<idno type="inist">08-0032186</idno>
<date when="2006">2006</date>
<idno type="stanalyst">PASCAL 08-0032186 INIST</idno>
<idno type="RBID">Pascal:08-0032186</idno>
<idno type="wicri:Area/PascalFrancis/Corpus">000348</idno>
<idno type="wicri:Area/PascalFrancis/Curation">000677</idno>
<idno type="wicri:Area/PascalFrancis/Checkpoint">000414</idno>
<idno type="wicri:explorRef" wicri:stream="PascalFrancis" wicri:step="Checkpoint">000414</idno>
<idno type="wicri:Area/Main/Merge">005819</idno>
<idno type="wicri:Area/Main/Curation">005624</idno>
<idno type="wicri:Area/Main/Exploration">005624</idno>
</publicationStmt>
<sourceDesc>
<biblStruct>
<analytic>
<title xml:lang="en" level="a">A proposal for annotation, semantic similarity and classification of textual documents</title>
<author>
<name sortKey="Nauer, Emmanuel" sort="Nauer, Emmanuel" uniqKey="Nauer E" first="Emmanuel" last="Nauer">Emmanuel Nauer</name>
<affiliation wicri:level="3">
<inist:fA14 i1="01">
<s1>LORIA -UMR 7503 Bâtiment B, B.P. 239</s1>
<s2>54506 Vandœuvre-lès-Nancy</s2>
<s3>FRA</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
</inist:fA14>
<country>France</country>
<placeName>
<region type="region" nuts="2">Grand Est</region>
<region type="old region" nuts="2">Lorraine (région)</region>
<settlement type="city">Vandœuvre-lès-Nancy</settlement>
</placeName>
</affiliation>
</author>
<author>
<name sortKey="Napoli, Amedeo" sort="Napoli, Amedeo" uniqKey="Napoli A" first="Amedeo" last="Napoli">Amedeo Napoli</name>
<affiliation wicri:level="3">
<inist:fA14 i1="01">
<s1>LORIA -UMR 7503 Bâtiment B, B.P. 239</s1>
<s2>54506 Vandœuvre-lès-Nancy</s2>
<s3>FRA</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
</inist:fA14>
<country>France</country>
<placeName>
<region type="region" nuts="2">Grand Est</region>
<region type="old region" nuts="2">Lorraine (région)</region>
<settlement type="city">Vandœuvre-lès-Nancy</settlement>
</placeName>
</affiliation>
</author>
</analytic>
<series>
<title level="j" type="main">Lecture notes in computer science</title>
</series>
</biblStruct>
</sourceDesc>
<seriesStmt>
<title level="j" type="main">Lecture notes in computer science</title>
</seriesStmt>
</fileDesc>
<profileDesc>
<textClass>
<keywords scheme="KwdEn" xml:lang="en">
<term>Annotation</term>
<term>Artificial intelligence</term>
<term>Classification</term>
<term>Content analysis</term>
<term>Information retrieval</term>
<term>Lying</term>
<term>Ontology</term>
<term>Semantics</term>
<term>Similarity</term>
<term>Text</term>
</keywords>
<keywords scheme="Pascal" xml:lang="fr">
<term>Intelligence artificielle</term>
<term>Similitude</term>
<term>Classification</term>
<term>Texte</term>
<term>Analyse contenu</term>
<term>Ontologie</term>
<term>Recherche information</term>
<term>Annotation</term>
<term>Sémantique</term>
<term>Mensonge</term>
</keywords>
<keywords scheme="Wicri" type="topic" xml:lang="fr">
<term>Intelligence artificielle</term>
<term>Classification</term>
</keywords>
</textClass>
</profileDesc>
</teiHeader>
<front>
<div type="abstract" xml:lang="en">In this paper, we present an approach for classifying documents based on the notion of semantic similarity and the effective representation of the content of the documents. The content of a document is annotated and the resulting annotation is represented by a labeled tree whose nodes and edges are represented by concepts lying within a domain ontology. A reasoning process may be carried out on annotation trees, allowing documents to be compared with one another for classification or information retrieval purposes. An algorithm for classifying documents with respect to semantic similarity and a discussion conclude the paper.</div>
</front>
</TEI>
<affiliations>
<list>
<country>
<li>France</li>
</country>
<region>
<li>Grand Est</li>
<li>Lorraine (région)</li>
</region>
<settlement>
<li>Vandœuvre-lès-Nancy</li>
</settlement>
</list>
<tree>
<country name="France">
<region name="Grand Est">
<name sortKey="Nauer, Emmanuel" sort="Nauer, Emmanuel" uniqKey="Nauer E" first="Emmanuel" last="Nauer">Emmanuel Nauer</name>
</region>
<name sortKey="Napoli, Amedeo" sort="Napoli, Amedeo" uniqKey="Napoli A" first="Amedeo" last="Napoli">Amedeo Napoli</name>
</country>
</tree>
</affiliations>
</record>

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Wicri/Lorraine/explor/InforLorV4/Data/Main/Exploration
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 005624 | SxmlIndent | more

Or

HfdSelect -h $EXPLOR_AREA/Data/Main/Exploration/biblio.hfd -nk 005624 | SxmlIndent | more
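
Where Dilib is not installed, a record saved as text can also be read with a general-purpose XML library. The sketch below is only an illustration in Python and is not part of Dilib. It assumes the record is available as a string. Because the record uses the `wicri:` and `inist:` prefixes without declaring them, a strict parser rejects it, so the sketch binds both prefixes to placeholder URIs (`urn:example:wicri` and `urn:example:inist`, both hypothetical) before parsing. The helper names `parse_record` and `summarize` are likewise invented for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URIs: the record uses the wicri: and inist: prefixes
# without declaring them, so a strict parser rejects it until they are bound.
ASSUMED_NS = (
    ' xmlns:wicri="urn:example:wicri"'
    ' xmlns:inist="urn:example:inist"'
)


def parse_record(xml_text):
    """Parse one <record> string after binding the undeclared prefixes."""
    patched = xml_text.replace("<record>", "<record" + ASSUMED_NS + ">", 1)
    return ET.fromstring(patched)


def summarize(record):
    """Return the title, author names and English keywords of a record."""
    title_stmt = record.find("./TEI/teiHeader/fileDesc/titleStmt")
    title = title_stmt.findtext("title")
    authors = [name.text for name in title_stmt.findall("author/name")]
    keywords = [
        term.text
        for term in record.findall(".//keywords[@scheme='KwdEn']/term")
    ]
    return {"title": title, "authors": authors, "keywords": keywords}


# Abridged copy of the record shown on this page, shortened for the example.
SAMPLE = """<record>
<TEI>
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en" level="a">A proposal for annotation, semantic similarity and classification of textual documents</title>
<author>
<name sortKey="Nauer, Emmanuel">Emmanuel Nauer</name>
<affiliation wicri:level="3">
<inist:fA14 i1="01"><s3>FRA</s3></inist:fA14>
</affiliation>
</author>
<author>
<name sortKey="Napoli, Amedeo">Amedeo Napoli</name>
</author>
</titleStmt>
</fileDesc>
<profileDesc>
<textClass>
<keywords scheme="KwdEn" xml:lang="en">
<term>Annotation</term>
<term>Ontology</term>
</keywords>
</textClass>
</profileDesc>
</teiHeader>
</TEI>
</record>"""

if __name__ == "__main__":
    info = summarize(parse_record(SAMPLE))
    print(info["title"])
    print("; ".join(info["authors"]))
    print(", ".join(info["keywords"]))
```

Patching the root tag keeps the rest of the record untouched. If the real namespace URIs used by the Wicri tools are known, they should replace the placeholders.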

To link to this page within the Wicri network

{{Explor lien
   |wiki=    Wicri/Lorraine
   |area=    InforLorV4
   |flux=    Main
   |étape=   Exploration
   |type=    RBID
   |clé=     Pascal:08-0032186
   |texte=   A proposal for annotation, semantic similarity and classification of textual documents
}}

This area was generated with Dilib version V0.6.33.
Data generation: Mon Jun 10 21:56:28 2019. Site generation: Fri Feb 25 15:29:27 2022