Serveur d'exploration sur la recherche en informatique en Lorraine

Attention, ce site est en cours de développement !
Attention, site généré par des moyens informatiques à partir de corpus bruts.
Les informations ne sont donc pas validées.

Learning Subgraph Patterns from text for Extracting Disease–Symptom Relationships

Identifieur interne : 002E58 ( Hal/Curation ); précédent : 002E57; suivant : 002E59

Learning Subgraph Patterns from text for Extracting Disease–Symptom Relationships

Auteurs : Mohsen Hassan [France] ; Adrien Coulet [France] ; Yannick Toussaint [France]

Source :

RBID : Hal:hal-01095595

English descriptors

Abstract

To some extent, texts can be represented in the form of graphs, such as dependency graphs in which nodes represent words and edges represent grammatical dependencies between words. Graph representation of texts is an interesting alternative to string representation because it provides an additional level of abstraction over the syntax that is sometime easier to compute. In this paper, we study the use of graph mining methods on texts represented as dependency graphs, for extracting relationships between pairs of annotated entities. We propose a three step approach that includes (1) the transformation of texts in a collection of dependency graphs; (2) the selection of frequent subgraphs, named hereafter patterns, on the basis of positive sentences; and (3) the extraction of relationships by searching for occurrences of patterns in novel sentences. Our method has been experimented by extracting disease–symptom relationships from a corpus of 51,292 PubMed abstracts (428,491 sentences)related to 50 rare diseases. The extraction of correct disease–symptom relationships has been evaluated on 565 sentences, showing a precision of 0.91 and a recall of 0.49 (F-Meaure is 0.63). These preliminary experiments show the feasibility of extracting good quality relationships using frequent subgraph mining.

Url:

Links toward previous steps (curation, corpus...)


Links to Exploration step

Hal:hal-01095595

Le document en format XML

<record>
<TEI>
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en">Learning Subgraph Patterns from text for Extracting Disease–Symptom Relationships</title>
<author>
<name sortKey="Hassan, Mohsen" sort="Hassan, Mohsen" uniqKey="Hassan M" first="Mohsen" last="Hassan">Mohsen Hassan</name>
<affiliation wicri:level="1">
<hal:affiliation type="researchteam" xml:id="struct-205125" status="VALID">
<idno type="RNSR">200818291B</idno>
<orgName>Knowledge representation, reasonning</orgName>
<orgName type="acronym">ORPAILLEUR</orgName>
<desc>
<address>
<country key="FR"></country>
</address>
<ref type="url">http://www.inria.fr/equipes/orpailleur</ref>
</desc>
<listRelation>
<relation active="#struct-129671" type="direct"></relation>
<relation active="#struct-300009" type="indirect"></relation>
<relation active="#struct-423086" type="direct"></relation>
<relation active="#struct-206040" type="indirect"></relation>
<relation active="#struct-413289" type="indirect"></relation>
<relation name="UMR7503" active="#struct-441569" type="indirect"></relation>
</listRelation>
<tutelles>
<tutelle active="#struct-129671" type="direct">
<org type="laboratory" xml:id="struct-129671" status="VALID">
<idno type="RNSR">198618246Y</idno>
<orgName>INRIA Nancy - Grand Est</orgName>
<desc>
<address>
<addrLine>615 rue du Jardin Botanique 54600 Villers-lès-Nancy</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.inria.fr/nancy</ref>
</desc>
<listRelation>
<relation active="#struct-300009" type="direct"></relation>
</listRelation>
</org>
</tutelle>
<tutelle active="#struct-300009" type="indirect">
<org type="institution" xml:id="struct-300009" status="VALID">
<orgName>Institut National de Recherche en Informatique et en Automatique</orgName>
<orgName type="acronym">Inria</orgName>
<desc>
<address>
<addrLine>Domaine de VoluceauRocquencourt - BP 10578153 Le Chesnay Cedex</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.inria.fr/en/</ref>
</desc>
</org>
</tutelle>
<tutelle active="#struct-423086" type="direct">
<org type="department" xml:id="struct-423086" status="VALID">
<orgName>Department of Natural Language Processing & Knowledge Discovery</orgName>
<orgName type="acronym">LORIA - NLPKD</orgName>
<desc>
<address>
<country key="FR"></country>
</address>
<ref type="url">http://www.loria.fr/la-recherche-en/departements/Knowledge-and-Language-Management</ref>
</desc>
<listRelation>
<relation active="#struct-206040" type="direct"></relation>
<relation active="#struct-300009" type="indirect"></relation>
<relation active="#struct-413289" type="indirect"></relation>
<relation name="UMR7503" active="#struct-441569" type="indirect"></relation>
</listRelation>
</org>
</tutelle>
<tutelle active="#struct-206040" type="indirect">
<org type="laboratory" xml:id="struct-206040" status="VALID">
<idno type="IdRef">067077927</idno>
<idno type="RNSR">198912571S</idno>
<idno type="IdUnivLorraine">[UL]RSI--</idno>
<orgName>Laboratoire Lorrain de Recherche en Informatique et ses Applications</orgName>
<orgName type="acronym">LORIA</orgName>
<date type="start">2012-01-01</date>
<desc>
<address>
<addrLine>Campus Scientifique BP 239 54506 Vandoeuvre-lès-Nancy Cedex</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.loria.fr</ref>
</desc>
<listRelation>
<relation active="#struct-300009" type="direct"></relation>
<relation active="#struct-413289" type="direct"></relation>
<relation name="UMR7503" active="#struct-441569" type="direct"></relation>
</listRelation>
</org>
</tutelle>
<tutelle active="#struct-413289" type="indirect">
<org type="institution" xml:id="struct-413289" status="VALID">
<idno type="IdRef">157040569</idno>
<idno type="IdUnivLorraine">[UL]100--</idno>
<orgName>Université de Lorraine</orgName>
<orgName type="acronym">UL</orgName>
<date type="start">2012-01-01</date>
<desc>
<address>
<addrLine>34 cours Léopold - CS 25233 - 54052 Nancy cedex</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.univ-lorraine.fr/</ref>
</desc>
</org>
</tutelle>
<tutelle name="UMR7503" active="#struct-441569" type="indirect">
<org type="institution" xml:id="struct-441569" status="VALID">
<idno type="ISNI">0000000122597504</idno>
<idno type="IdRef">02636817X</idno>
<orgName>Centre National de la Recherche Scientifique</orgName>
<orgName type="acronym">CNRS</orgName>
<date type="start">1939-10-19</date>
<desc>
<address>
<country key="FR"></country>
</address>
<ref type="url">http://www.cnrs.fr/</ref>
</desc>
</org>
</tutelle>
</tutelles>
</hal:affiliation>
<country>France</country>
<placeName>
<settlement type="city">Nancy</settlement>
<settlement type="city">Metz</settlement>
<region type="region" nuts="2">Grand Est</region>
<region type="old region" nuts="2">Lorraine (région)</region>
</placeName>
<orgName type="university">Université de Lorraine</orgName>
</affiliation>
</author>
<author>
<name sortKey="Coulet, Adrien" sort="Coulet, Adrien" uniqKey="Coulet A" first="Adrien" last="Coulet">Adrien Coulet</name>
<affiliation wicri:level="1">
<hal:affiliation type="researchteam" xml:id="struct-205125" status="VALID">
<idno type="RNSR">200818291B</idno>
<orgName>Knowledge representation, reasonning</orgName>
<orgName type="acronym">ORPAILLEUR</orgName>
<desc>
<address>
<country key="FR"></country>
</address>
<ref type="url">http://www.inria.fr/equipes/orpailleur</ref>
</desc>
<listRelation>
<relation active="#struct-129671" type="direct"></relation>
<relation active="#struct-300009" type="indirect"></relation>
<relation active="#struct-423086" type="direct"></relation>
<relation active="#struct-206040" type="indirect"></relation>
<relation active="#struct-413289" type="indirect"></relation>
<relation name="UMR7503" active="#struct-441569" type="indirect"></relation>
</listRelation>
<tutelles>
<tutelle active="#struct-129671" type="direct">
<org type="laboratory" xml:id="struct-129671" status="VALID">
<idno type="RNSR">198618246Y</idno>
<orgName>INRIA Nancy - Grand Est</orgName>
<desc>
<address>
<addrLine>615 rue du Jardin Botanique 54600 Villers-lès-Nancy</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.inria.fr/nancy</ref>
</desc>
<listRelation>
<relation active="#struct-300009" type="direct"></relation>
</listRelation>
</org>
</tutelle>
<tutelle active="#struct-300009" type="indirect">
<org type="institution" xml:id="struct-300009" status="VALID">
<orgName>Institut National de Recherche en Informatique et en Automatique</orgName>
<orgName type="acronym">Inria</orgName>
<desc>
<address>
<addrLine>Domaine de VoluceauRocquencourt - BP 10578153 Le Chesnay Cedex</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.inria.fr/en/</ref>
</desc>
</org>
</tutelle>
<tutelle active="#struct-423086" type="direct">
<org type="department" xml:id="struct-423086" status="VALID">
<orgName>Department of Natural Language Processing & Knowledge Discovery</orgName>
<orgName type="acronym">LORIA - NLPKD</orgName>
<desc>
<address>
<country key="FR"></country>
</address>
<ref type="url">http://www.loria.fr/la-recherche-en/departements/Knowledge-and-Language-Management</ref>
</desc>
<listRelation>
<relation active="#struct-206040" type="direct"></relation>
<relation active="#struct-300009" type="indirect"></relation>
<relation active="#struct-413289" type="indirect"></relation>
<relation name="UMR7503" active="#struct-441569" type="indirect"></relation>
</listRelation>
</org>
</tutelle>
<tutelle active="#struct-206040" type="indirect">
<org type="laboratory" xml:id="struct-206040" status="VALID">
<idno type="IdRef">067077927</idno>
<idno type="RNSR">198912571S</idno>
<idno type="IdUnivLorraine">[UL]RSI--</idno>
<orgName>Laboratoire Lorrain de Recherche en Informatique et ses Applications</orgName>
<orgName type="acronym">LORIA</orgName>
<date type="start">2012-01-01</date>
<desc>
<address>
<addrLine>Campus Scientifique BP 239 54506 Vandoeuvre-lès-Nancy Cedex</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.loria.fr</ref>
</desc>
<listRelation>
<relation active="#struct-300009" type="direct"></relation>
<relation active="#struct-413289" type="direct"></relation>
<relation name="UMR7503" active="#struct-441569" type="direct"></relation>
</listRelation>
</org>
</tutelle>
<tutelle active="#struct-413289" type="indirect">
<org type="institution" xml:id="struct-413289" status="VALID">
<idno type="IdRef">157040569</idno>
<idno type="IdUnivLorraine">[UL]100--</idno>
<orgName>Université de Lorraine</orgName>
<orgName type="acronym">UL</orgName>
<date type="start">2012-01-01</date>
<desc>
<address>
<addrLine>34 cours Léopold - CS 25233 - 54052 Nancy cedex</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.univ-lorraine.fr/</ref>
</desc>
</org>
</tutelle>
<tutelle name="UMR7503" active="#struct-441569" type="indirect">
<org type="institution" xml:id="struct-441569" status="VALID">
<idno type="ISNI">0000000122597504</idno>
<idno type="IdRef">02636817X</idno>
<orgName>Centre National de la Recherche Scientifique</orgName>
<orgName type="acronym">CNRS</orgName>
<date type="start">1939-10-19</date>
<desc>
<address>
<country key="FR"></country>
</address>
<ref type="url">http://www.cnrs.fr/</ref>
</desc>
</org>
</tutelle>
</tutelles>
</hal:affiliation>
<country>France</country>
<placeName>
<settlement type="city">Nancy</settlement>
<settlement type="city">Metz</settlement>
<region type="region" nuts="2">Grand Est</region>
<region type="old region" nuts="2">Lorraine (région)</region>
</placeName>
<orgName type="university">Université de Lorraine</orgName>
</affiliation>
</author>
<author>
<name sortKey="Toussaint, Yannick" sort="Toussaint, Yannick" uniqKey="Toussaint Y" first="Yannick" last="Toussaint">Yannick Toussaint</name>
<affiliation wicri:level="1">
<hal:affiliation type="researchteam" xml:id="struct-205125" status="VALID">
<idno type="RNSR">200818291B</idno>
<orgName>Knowledge representation, reasonning</orgName>
<orgName type="acronym">ORPAILLEUR</orgName>
<desc>
<address>
<country key="FR"></country>
</address>
<ref type="url">http://www.inria.fr/equipes/orpailleur</ref>
</desc>
<listRelation>
<relation active="#struct-129671" type="direct"></relation>
<relation active="#struct-300009" type="indirect"></relation>
<relation active="#struct-423086" type="direct"></relation>
<relation active="#struct-206040" type="indirect"></relation>
<relation active="#struct-413289" type="indirect"></relation>
<relation name="UMR7503" active="#struct-441569" type="indirect"></relation>
</listRelation>
<tutelles>
<tutelle active="#struct-129671" type="direct">
<org type="laboratory" xml:id="struct-129671" status="VALID">
<idno type="RNSR">198618246Y</idno>
<orgName>INRIA Nancy - Grand Est</orgName>
<desc>
<address>
<addrLine>615 rue du Jardin Botanique 54600 Villers-lès-Nancy</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.inria.fr/nancy</ref>
</desc>
<listRelation>
<relation active="#struct-300009" type="direct"></relation>
</listRelation>
</org>
</tutelle>
<tutelle active="#struct-300009" type="indirect">
<org type="institution" xml:id="struct-300009" status="VALID">
<orgName>Institut National de Recherche en Informatique et en Automatique</orgName>
<orgName type="acronym">Inria</orgName>
<desc>
<address>
<addrLine>Domaine de VoluceauRocquencourt - BP 10578153 Le Chesnay Cedex</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.inria.fr/en/</ref>
</desc>
</org>
</tutelle>
<tutelle active="#struct-423086" type="direct">
<org type="department" xml:id="struct-423086" status="VALID">
<orgName>Department of Natural Language Processing & Knowledge Discovery</orgName>
<orgName type="acronym">LORIA - NLPKD</orgName>
<desc>
<address>
<country key="FR"></country>
</address>
<ref type="url">http://www.loria.fr/la-recherche-en/departements/Knowledge-and-Language-Management</ref>
</desc>
<listRelation>
<relation active="#struct-206040" type="direct"></relation>
<relation active="#struct-300009" type="indirect"></relation>
<relation active="#struct-413289" type="indirect"></relation>
<relation name="UMR7503" active="#struct-441569" type="indirect"></relation>
</listRelation>
</org>
</tutelle>
<tutelle active="#struct-206040" type="indirect">
<org type="laboratory" xml:id="struct-206040" status="VALID">
<idno type="IdRef">067077927</idno>
<idno type="RNSR">198912571S</idno>
<idno type="IdUnivLorraine">[UL]RSI--</idno>
<orgName>Laboratoire Lorrain de Recherche en Informatique et ses Applications</orgName>
<orgName type="acronym">LORIA</orgName>
<date type="start">2012-01-01</date>
<desc>
<address>
<addrLine>Campus Scientifique BP 239 54506 Vandoeuvre-lès-Nancy Cedex</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.loria.fr</ref>
</desc>
<listRelation>
<relation active="#struct-300009" type="direct"></relation>
<relation active="#struct-413289" type="direct"></relation>
<relation name="UMR7503" active="#struct-441569" type="direct"></relation>
</listRelation>
</org>
</tutelle>
<tutelle active="#struct-413289" type="indirect">
<org type="institution" xml:id="struct-413289" status="VALID">
<idno type="IdRef">157040569</idno>
<idno type="IdUnivLorraine">[UL]100--</idno>
<orgName>Université de Lorraine</orgName>
<orgName type="acronym">UL</orgName>
<date type="start">2012-01-01</date>
<desc>
<address>
<addrLine>34 cours Léopold - CS 25233 - 54052 Nancy cedex</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.univ-lorraine.fr/</ref>
</desc>
</org>
</tutelle>
<tutelle name="UMR7503" active="#struct-441569" type="indirect">
<org type="institution" xml:id="struct-441569" status="VALID">
<idno type="ISNI">0000000122597504</idno>
<idno type="IdRef">02636817X</idno>
<orgName>Centre National de la Recherche Scientifique</orgName>
<orgName type="acronym">CNRS</orgName>
<date type="start">1939-10-19</date>
<desc>
<address>
<country key="FR"></country>
</address>
<ref type="url">http://www.cnrs.fr/</ref>
</desc>
</org>
</tutelle>
</tutelles>
</hal:affiliation>
<country>France</country>
<placeName>
<settlement type="city">Nancy</settlement>
<settlement type="city">Metz</settlement>
<region type="region" nuts="2">Grand Est</region>
<region type="old region" nuts="2">Lorraine (région)</region>
</placeName>
<orgName type="university">Université de Lorraine</orgName>
</affiliation>
</author>
</titleStmt>
<publicationStmt>
<idno type="wicri:source">HAL</idno>
<idno type="RBID">Hal:hal-01095595</idno>
<idno type="halId">hal-01095595</idno>
<idno type="halUri">https://hal.inria.fr/hal-01095595</idno>
<idno type="url">https://hal.inria.fr/hal-01095595</idno>
<date when="2014-09-15">2014-09-15</date>
<idno type="wicri:Area/Hal/Corpus">002E58</idno>
<idno type="wicri:Area/Hal/Curation">002E58</idno>
</publicationStmt>
<sourceDesc>
<biblStruct>
<analytic>
<title xml:lang="en">Learning Subgraph Patterns from text for Extracting Disease–Symptom Relationships</title>
<author>
<name sortKey="Hassan, Mohsen" sort="Hassan, Mohsen" uniqKey="Hassan M" first="Mohsen" last="Hassan">Mohsen Hassan</name>
<affiliation wicri:level="1">
<hal:affiliation type="researchteam" xml:id="struct-205125" status="VALID">
<idno type="RNSR">200818291B</idno>
<orgName>Knowledge representation, reasonning</orgName>
<orgName type="acronym">ORPAILLEUR</orgName>
<desc>
<address>
<country key="FR"></country>
</address>
<ref type="url">http://www.inria.fr/equipes/orpailleur</ref>
</desc>
<listRelation>
<relation active="#struct-129671" type="direct"></relation>
<relation active="#struct-300009" type="indirect"></relation>
<relation active="#struct-423086" type="direct"></relation>
<relation active="#struct-206040" type="indirect"></relation>
<relation active="#struct-413289" type="indirect"></relation>
<relation name="UMR7503" active="#struct-441569" type="indirect"></relation>
</listRelation>
<tutelles>
<tutelle active="#struct-129671" type="direct">
<org type="laboratory" xml:id="struct-129671" status="VALID">
<idno type="RNSR">198618246Y</idno>
<orgName>INRIA Nancy - Grand Est</orgName>
<desc>
<address>
<addrLine>615 rue du Jardin Botanique 54600 Villers-lès-Nancy</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.inria.fr/nancy</ref>
</desc>
<listRelation>
<relation active="#struct-300009" type="direct"></relation>
</listRelation>
</org>
</tutelle>
<tutelle active="#struct-300009" type="indirect">
<org type="institution" xml:id="struct-300009" status="VALID">
<orgName>Institut National de Recherche en Informatique et en Automatique</orgName>
<orgName type="acronym">Inria</orgName>
<desc>
<address>
<addrLine>Domaine de VoluceauRocquencourt - BP 10578153 Le Chesnay Cedex</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.inria.fr/en/</ref>
</desc>
</org>
</tutelle>
<tutelle active="#struct-423086" type="direct">
<org type="department" xml:id="struct-423086" status="VALID">
<orgName>Department of Natural Language Processing & Knowledge Discovery</orgName>
<orgName type="acronym">LORIA - NLPKD</orgName>
<desc>
<address>
<country key="FR"></country>
</address>
<ref type="url">http://www.loria.fr/la-recherche-en/departements/Knowledge-and-Language-Management</ref>
</desc>
<listRelation>
<relation active="#struct-206040" type="direct"></relation>
<relation active="#struct-300009" type="indirect"></relation>
<relation active="#struct-413289" type="indirect"></relation>
<relation name="UMR7503" active="#struct-441569" type="indirect"></relation>
</listRelation>
</org>
</tutelle>
<tutelle active="#struct-206040" type="indirect">
<org type="laboratory" xml:id="struct-206040" status="VALID">
<idno type="IdRef">067077927</idno>
<idno type="RNSR">198912571S</idno>
<idno type="IdUnivLorraine">[UL]RSI--</idno>
<orgName>Laboratoire Lorrain de Recherche en Informatique et ses Applications</orgName>
<orgName type="acronym">LORIA</orgName>
<date type="start">2012-01-01</date>
<desc>
<address>
<addrLine>Campus Scientifique BP 239 54506 Vandoeuvre-lès-Nancy Cedex</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.loria.fr</ref>
</desc>
<listRelation>
<relation active="#struct-300009" type="direct"></relation>
<relation active="#struct-413289" type="direct"></relation>
<relation name="UMR7503" active="#struct-441569" type="direct"></relation>
</listRelation>
</org>
</tutelle>
<tutelle active="#struct-413289" type="indirect">
<org type="institution" xml:id="struct-413289" status="VALID">
<idno type="IdRef">157040569</idno>
<idno type="IdUnivLorraine">[UL]100--</idno>
<orgName>Université de Lorraine</orgName>
<orgName type="acronym">UL</orgName>
<date type="start">2012-01-01</date>
<desc>
<address>
<addrLine>34 cours Léopold - CS 25233 - 54052 Nancy cedex</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.univ-lorraine.fr/</ref>
</desc>
</org>
</tutelle>
<tutelle name="UMR7503" active="#struct-441569" type="indirect">
<org type="institution" xml:id="struct-441569" status="VALID">
<idno type="ISNI">0000000122597504</idno>
<idno type="IdRef">02636817X</idno>
<orgName>Centre National de la Recherche Scientifique</orgName>
<orgName type="acronym">CNRS</orgName>
<date type="start">1939-10-19</date>
<desc>
<address>
<country key="FR"></country>
</address>
<ref type="url">http://www.cnrs.fr/</ref>
</desc>
</org>
</tutelle>
</tutelles>
</hal:affiliation>
<country>France</country>
<placeName>
<settlement type="city">Nancy</settlement>
<settlement type="city">Metz</settlement>
<region type="region" nuts="2">Grand Est</region>
<region type="old region" nuts="2">Lorraine (région)</region>
</placeName>
<orgName type="university">Université de Lorraine</orgName>
</affiliation>
</author>
<author>
<name sortKey="Coulet, Adrien" sort="Coulet, Adrien" uniqKey="Coulet A" first="Adrien" last="Coulet">Adrien Coulet</name>
<affiliation wicri:level="1">
<hal:affiliation type="researchteam" xml:id="struct-205125" status="VALID">
<idno type="RNSR">200818291B</idno>
<orgName>Knowledge representation, reasonning</orgName>
<orgName type="acronym">ORPAILLEUR</orgName>
<desc>
<address>
<country key="FR"></country>
</address>
<ref type="url">http://www.inria.fr/equipes/orpailleur</ref>
</desc>
<listRelation>
<relation active="#struct-129671" type="direct"></relation>
<relation active="#struct-300009" type="indirect"></relation>
<relation active="#struct-423086" type="direct"></relation>
<relation active="#struct-206040" type="indirect"></relation>
<relation active="#struct-413289" type="indirect"></relation>
<relation name="UMR7503" active="#struct-441569" type="indirect"></relation>
</listRelation>
<tutelles>
<tutelle active="#struct-129671" type="direct">
<org type="laboratory" xml:id="struct-129671" status="VALID">
<idno type="RNSR">198618246Y</idno>
<orgName>INRIA Nancy - Grand Est</orgName>
<desc>
<address>
<addrLine>615 rue du Jardin Botanique 54600 Villers-lès-Nancy</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.inria.fr/nancy</ref>
</desc>
<listRelation>
<relation active="#struct-300009" type="direct"></relation>
</listRelation>
</org>
</tutelle>
<tutelle active="#struct-300009" type="indirect">
<org type="institution" xml:id="struct-300009" status="VALID">
<orgName>Institut National de Recherche en Informatique et en Automatique</orgName>
<orgName type="acronym">Inria</orgName>
<desc>
<address>
<addrLine>Domaine de VoluceauRocquencourt - BP 10578153 Le Chesnay Cedex</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.inria.fr/en/</ref>
</desc>
</org>
</tutelle>
<tutelle active="#struct-423086" type="direct">
<org type="department" xml:id="struct-423086" status="VALID">
<orgName>Department of Natural Language Processing & Knowledge Discovery</orgName>
<orgName type="acronym">LORIA - NLPKD</orgName>
<desc>
<address>
<country key="FR"></country>
</address>
<ref type="url">http://www.loria.fr/la-recherche-en/departements/Knowledge-and-Language-Management</ref>
</desc>
<listRelation>
<relation active="#struct-206040" type="direct"></relation>
<relation active="#struct-300009" type="indirect"></relation>
<relation active="#struct-413289" type="indirect"></relation>
<relation name="UMR7503" active="#struct-441569" type="indirect"></relation>
</listRelation>
</org>
</tutelle>
<tutelle active="#struct-206040" type="indirect">
<org type="laboratory" xml:id="struct-206040" status="VALID">
<idno type="IdRef">067077927</idno>
<idno type="RNSR">198912571S</idno>
<idno type="IdUnivLorraine">[UL]RSI--</idno>
<orgName>Laboratoire Lorrain de Recherche en Informatique et ses Applications</orgName>
<orgName type="acronym">LORIA</orgName>
<date type="start">2012-01-01</date>
<desc>
<address>
<addrLine>Campus Scientifique BP 239 54506 Vandoeuvre-lès-Nancy Cedex</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.loria.fr</ref>
</desc>
<listRelation>
<relation active="#struct-300009" type="direct"></relation>
<relation active="#struct-413289" type="direct"></relation>
<relation name="UMR7503" active="#struct-441569" type="direct"></relation>
</listRelation>
</org>
</tutelle>
<tutelle active="#struct-413289" type="indirect">
<org type="institution" xml:id="struct-413289" status="VALID">
<idno type="IdRef">157040569</idno>
<idno type="IdUnivLorraine">[UL]100--</idno>
<orgName>Université de Lorraine</orgName>
<orgName type="acronym">UL</orgName>
<date type="start">2012-01-01</date>
<desc>
<address>
<addrLine>34 cours Léopold - CS 25233 - 54052 Nancy cedex</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.univ-lorraine.fr/</ref>
</desc>
</org>
</tutelle>
<tutelle name="UMR7503" active="#struct-441569" type="indirect">
<org type="institution" xml:id="struct-441569" status="VALID">
<idno type="ISNI">0000000122597504</idno>
<idno type="IdRef">02636817X</idno>
<orgName>Centre National de la Recherche Scientifique</orgName>
<orgName type="acronym">CNRS</orgName>
<date type="start">1939-10-19</date>
<desc>
<address>
<country key="FR"></country>
</address>
<ref type="url">http://www.cnrs.fr/</ref>
</desc>
</org>
</tutelle>
</tutelles>
</hal:affiliation>
<country>France</country>
<placeName>
<settlement type="city">Nancy</settlement>
<settlement type="city">Metz</settlement>
<region type="region" nuts="2">Grand Est</region>
<region type="old region" nuts="2">Lorraine (région)</region>
</placeName>
<orgName type="university">Université de Lorraine</orgName>
</affiliation>
</author>
<author>
<name sortKey="Toussaint, Yannick" sort="Toussaint, Yannick" uniqKey="Toussaint Y" first="Yannick" last="Toussaint">Yannick Toussaint</name>
<affiliation wicri:level="1">
<hal:affiliation type="researchteam" xml:id="struct-205125" status="VALID">
<idno type="RNSR">200818291B</idno>
<orgName>Knowledge representation, reasonning</orgName>
<orgName type="acronym">ORPAILLEUR</orgName>
<desc>
<address>
<country key="FR"></country>
</address>
<ref type="url">http://www.inria.fr/equipes/orpailleur</ref>
</desc>
<listRelation>
<relation active="#struct-129671" type="direct"></relation>
<relation active="#struct-300009" type="indirect"></relation>
<relation active="#struct-423086" type="direct"></relation>
<relation active="#struct-206040" type="indirect"></relation>
<relation active="#struct-413289" type="indirect"></relation>
<relation name="UMR7503" active="#struct-441569" type="indirect"></relation>
</listRelation>
<tutelles>
<tutelle active="#struct-129671" type="direct">
<org type="laboratory" xml:id="struct-129671" status="VALID">
<idno type="RNSR">198618246Y</idno>
<orgName>INRIA Nancy - Grand Est</orgName>
<desc>
<address>
<addrLine>615 rue du Jardin Botanique 54600 Villers-lès-Nancy</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.inria.fr/nancy</ref>
</desc>
<listRelation>
<relation active="#struct-300009" type="direct"></relation>
</listRelation>
</org>
</tutelle>
<tutelle active="#struct-300009" type="indirect">
<org type="institution" xml:id="struct-300009" status="VALID">
<orgName>Institut National de Recherche en Informatique et en Automatique</orgName>
<orgName type="acronym">Inria</orgName>
<desc>
<address>
<addrLine>Domaine de VoluceauRocquencourt - BP 10578153 Le Chesnay Cedex</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.inria.fr/en/</ref>
</desc>
</org>
</tutelle>
<tutelle active="#struct-423086" type="direct">
<org type="department" xml:id="struct-423086" status="VALID">
<orgName>Department of Natural Language Processing & Knowledge Discovery</orgName>
<orgName type="acronym">LORIA - NLPKD</orgName>
<desc>
<address>
<country key="FR"></country>
</address>
<ref type="url">http://www.loria.fr/la-recherche-en/departements/Knowledge-and-Language-Management</ref>
</desc>
<listRelation>
<relation active="#struct-206040" type="direct"></relation>
<relation active="#struct-300009" type="indirect"></relation>
<relation active="#struct-413289" type="indirect"></relation>
<relation name="UMR7503" active="#struct-441569" type="indirect"></relation>
</listRelation>
</org>
</tutelle>
<tutelle active="#struct-206040" type="indirect">
<org type="laboratory" xml:id="struct-206040" status="VALID">
<idno type="IdRef">067077927</idno>
<idno type="RNSR">198912571S</idno>
<idno type="IdUnivLorraine">[UL]RSI--</idno>
<orgName>Laboratoire Lorrain de Recherche en Informatique et ses Applications</orgName>
<orgName type="acronym">LORIA</orgName>
<date type="start">2012-01-01</date>
<desc>
<address>
<addrLine>Campus Scientifique BP 239 54506 Vandoeuvre-lès-Nancy Cedex</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.loria.fr</ref>
</desc>
<listRelation>
<relation active="#struct-300009" type="direct"></relation>
<relation active="#struct-413289" type="direct"></relation>
<relation name="UMR7503" active="#struct-441569" type="direct"></relation>
</listRelation>
</org>
</tutelle>
<tutelle active="#struct-413289" type="indirect">
<org type="institution" xml:id="struct-413289" status="VALID">
<idno type="IdRef">157040569</idno>
<idno type="IdUnivLorraine">[UL]100--</idno>
<orgName>Université de Lorraine</orgName>
<orgName type="acronym">UL</orgName>
<date type="start">2012-01-01</date>
<desc>
<address>
<addrLine>34 cours Léopold - CS 25233 - 54052 Nancy cedex</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.univ-lorraine.fr/</ref>
</desc>
</org>
</tutelle>
<tutelle name="UMR7503" active="#struct-441569" type="indirect">
<org type="institution" xml:id="struct-441569" status="VALID">
<idno type="ISNI">0000000122597504</idno>
<idno type="IdRef">02636817X</idno>
<orgName>Centre National de la Recherche Scientifique</orgName>
<orgName type="acronym">CNRS</orgName>
<date type="start">1939-10-19</date>
<desc>
<address>
<country key="FR"></country>
</address>
<ref type="url">http://www.cnrs.fr/</ref>
</desc>
</org>
</tutelle>
</tutelles>
</hal:affiliation>
<country>France</country>
<placeName>
<settlement type="city">Nancy</settlement>
<settlement type="city">Metz</settlement>
<region type="region" nuts="2">Grand Est</region>
<region type="old region" nuts="2">Lorraine (région)</region>
</placeName>
<orgName type="university">Université de Lorraine</orgName>
</affiliation>
</author>
</analytic>
</biblStruct>
</sourceDesc>
</fileDesc>
<profileDesc>
<textClass>
<keywords scheme="mix" xml:lang="en">
<term>Biomedical informatics</term>
<term>Graph Mining</term>
<term>Relation Extraction</term>
<term>Text Mining</term>
</keywords>
</textClass>
</profileDesc>
</teiHeader>
<front>
<div type="abstract" xml:lang="en">To some extent, texts can be represented in the form of graphs, such as dependency graphs in which nodes represent words and edges represent grammatical dependencies between words. Graph representation of texts is an interesting alternative to string representation because it provides an additional level of abstraction over the syntax that is sometime easier to compute. In this paper, we study the use of graph mining methods on texts represented as dependency graphs, for extracting relationships between pairs of annotated entities. We propose a three step approach that includes (1) the transformation of texts in a collection of dependency graphs; (2) the selection of frequent subgraphs, named hereafter patterns, on the basis of positive sentences; and (3) the extraction of relationships by searching for occurrences of patterns in novel sentences. Our method has been experimented by extracting disease–symptom relationships from a corpus of 51,292 PubMed abstracts (428,491 sentences)related to 50 rare diseases. The extraction of correct disease–symptom relationships has been evaluated on 565 sentences, showing a precision of 0.91 and a recall of 0.49 (F-Meaure is 0.63). These preliminary experiments show the feasibility of extracting good quality relationships using frequent subgraph mining.</div>
</front>
</TEI>
<hal api="V3">
<titleStmt>
<title xml:lang="en">Learning Subgraph Patterns from text for Extracting Disease–Symptom Relationships</title>
<author role="aut">
<persName>
<forename type="first">Mohsen</forename>
<surname>Hassan</surname>
</persName>
<email></email>
<idno type="halauthor">1108679</idno>
<affiliation ref="#struct-205125"></affiliation>
</author>
<author role="aut">
<persName>
<forename type="first">Adrien</forename>
<surname>Coulet</surname>
</persName>
<email>adrien.coulet@loria.fr</email>
<idno type="halauthor">1108680</idno>
<affiliation ref="#struct-205125"></affiliation>
</author>
<author role="aut">
<persName>
<forename type="first">Yannick</forename>
<surname>Toussaint</surname>
</persName>
<email>Yannick.Toussaint@loria.fr</email>
<idno type="halauthor">371532</idno>
<affiliation ref="#struct-205125"></affiliation>
</author>
<editor role="depositor">
<persName>
<forename>Adrien</forename>
<surname>Coulet</surname>
</persName>
<email>adrien.coulet@loria.fr</email>
</editor>
</titleStmt>
<editionStmt>
<edition n="v1" type="current">
<date type="whenSubmitted">2014-12-15 20:09:26</date>
<date type="whenWritten">2014-06-27</date>
<date type="whenModified">2015-09-22 01:12:25</date>
<date type="whenReleased">2014-12-19 14:39:50</date>
<date type="whenProduced">2014-09-15</date>
<date type="whenEndEmbargoed">2014-12-15</date>
<ref type="file" target="https://hal.inria.fr/hal-01095595/document">
<date notBefore="2014-12-15"></date>
</ref>
<ref type="file" subtype="author" n="1" target="https://hal.inria.fr/hal-01095595/file/paper6.pdf">
<date notBefore="2014-12-15"></date>
</ref>
</edition>
<respStmt>
<resp>contributor</resp>
<name key="110882">
<persName>
<forename>Adrien</forename>
<surname>Coulet</surname>
</persName>
<email>adrien.coulet@loria.fr</email>
</name>
</respStmt>
</editionStmt>
<publicationStmt>
<distributor>CCSD</distributor>
<idno type="halId">hal-01095595</idno>
<idno type="halUri">https://hal.inria.fr/hal-01095595</idno>
<idno type="halBibtex">hassan:hal-01095595</idno>
<idno type="halRefHtml">Peggy Cellier; Thierry Charnois; Andreas Hotho; Stan Matwin; Marie-Francine Moens; Yannick Toussaint. 1st International Workshop on Interactions between Data Mining and Natural Language Processing, Sep 2014, Nancy, France. ceur-ws, 1202, 2014, <http://ceur-ws.org/Vol-1202/></idno>
<idno type="halRef">Peggy Cellier; Thierry Charnois; Andreas Hotho; Stan Matwin; Marie-Francine Moens; Yannick Toussaint. 1st International Workshop on Interactions between Data Mining and Natural Language Processing, Sep 2014, Nancy, France. ceur-ws, 1202, 2014, <http://ceur-ws.org/Vol-1202/></idno>
</publicationStmt>
<seriesStmt>
<idno type="stamp" n="CNRS">CNRS - Centre national de la recherche scientifique</idno>
<idno type="stamp" n="INRIA">INRIA - Institut National de Recherche en Informatique et en Automatique</idno>
<idno type="stamp" n="INRIA-LORRAINE">INRIA Nancy - Grand Est</idno>
<idno type="stamp" n="LORIA2">Publications du LORIA</idno>
<idno type="stamp" n="INRIA-NANCY-GRAND-EST">INRIA Nancy - Grand Est</idno>
<idno type="stamp" n="LORIA-TALC" p="LORIA">Traitement automatique des langues et des connaissances</idno>
<idno type="stamp" n="UNIV-LORRAINE">Université de Lorraine</idno>
<idno type="stamp" n="LORIA">LORIA - Laboratoire Lorrain de Recherche en Informatique et ses Applications</idno>
<idno type="stamp" n="INRIA_TEST">INRIA - Institut National de Recherche en Informatique et en Automatique</idno>
</seriesStmt>
<notesStmt>
<note type="audience" n="2">International</note>
<note type="invited" n="0">No</note>
<note type="popular" n="0">No</note>
<note type="peer" n="1">Yes</note>
<note type="proceedings" n="1">Yes</note>
</notesStmt>
<sourceDesc>
<biblStruct>
<analytic>
<title xml:lang="en">Learning Subgraph Patterns from text for Extracting Disease–Symptom Relationships</title>
<author role="aut">
<persName>
<forename type="first">Mohsen</forename>
<surname>Hassan</surname>
</persName>
<idno type="halAuthorId">1108679</idno>
<affiliation ref="#struct-205125"></affiliation>
</author>
<author role="aut">
<persName>
<forename type="first">Adrien</forename>
<surname>Coulet</surname>
</persName>
<email>adrien.coulet@loria.fr</email>
<idno type="halAuthorId">1108680</idno>
<affiliation ref="#struct-205125"></affiliation>
</author>
<author role="aut">
<persName>
<forename type="first">Yannick</forename>
<surname>Toussaint</surname>
</persName>
<email>Yannick.Toussaint@loria.fr</email>
<idno type="halAuthorId">371532</idno>
<affiliation ref="#struct-205125"></affiliation>
</author>
</analytic>
<monogr>
<meeting>
<title>1st International Workshop on Interactions between Data Mining and Natural Language Processing</title>
<date type="start">2014-09-15</date>
<settlement>Nancy</settlement>
<country key="FR">France</country>
</meeting>
<editor>Peggy Cellier</editor>
<editor>Thierry Charnois</editor>
<editor>Andreas Hotho</editor>
<editor>Stan Matwin</editor>
<editor>Marie-Francine Moens</editor>
<editor>Yannick Toussaint</editor>
<imprint>
<publisher>ceur-ws</publisher>
<biblScope unit="volume">1202</biblScope>
<date type="datePub">2014-09-15</date>
</imprint>
</monogr>
<ref type="publisher">http://ceur-ws.org/Vol-1202/</ref>
</biblStruct>
</sourceDesc>
<profileDesc>
<langUsage>
<language ident="en">English</language>
</langUsage>
<textClass>
<keywords scheme="author">
<term xml:lang="en">Relation Extraction</term>
<term xml:lang="en">Text Mining</term>
<term xml:lang="en">Biomedical informatics</term>
<term xml:lang="en">Graph Mining</term>
</keywords>
<classCode scheme="halDomain" n="info.info-lg">Computer Science [cs]/Machine Learning [cs.LG]</classCode>
<classCode scheme="halDomain" n="info.info-cl">Computer Science [cs]/Computation and Language [cs.CL]</classCode>
<classCode scheme="halDomain" n="info.info-bi">Computer Science [cs]/Bioinformatics [q-bio.QM]</classCode>
<classCode scheme="halTypology" n="COMM">Conference papers</classCode>
</textClass>
<abstract xml:lang="en">To some extent, texts can be represented in the form of graphs, such as dependency graphs in which nodes represent words and edges represent grammatical dependencies between words. Graph representation of texts is an interesting alternative to string representation because it provides an additional level of abstraction over the syntax that is sometime easier to compute. In this paper, we study the use of graph mining methods on texts represented as dependency graphs, for extracting relationships between pairs of annotated entities. We propose a three step approach that includes (1) the transformation of texts in a collection of dependency graphs; (2) the selection of frequent subgraphs, named hereafter patterns, on the basis of positive sentences; and (3) the extraction of relationships by searching for occurrences of patterns in novel sentences. Our method has been experimented by extracting disease–symptom relationships from a corpus of 51,292 PubMed abstracts (428,491 sentences)related to 50 rare diseases. The extraction of correct disease–symptom relationships has been evaluated on 565 sentences, showing a precision of 0.91 and a recall of 0.49 (F-Meaure is 0.63). These preliminary experiments show the feasibility of extracting good quality relationships using frequent subgraph mining.</abstract>
</profileDesc>
</hal>
</record>

Pour manipuler ce document sous Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Wicri/Lorraine/explor/InforLorV4/Data/Hal/Curation
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 002E58 | SxmlIndent | more

Ou

HfdSelect -h $EXPLOR_AREA/Data/Hal/Curation/biblio.hfd -nk 002E58 | SxmlIndent | more

Pour mettre un lien sur cette page dans le réseau Wicri

{{Explor lien
   |wiki=    Wicri/Lorraine
   |area=    InforLorV4
   |flux=    Hal
   |étape=   Curation
   |type=    RBID
   |clé=     Hal:hal-01095595
   |texte=   Learning Subgraph Patterns from text for Extracting Disease–Symptom Relationships
}}

Wicri

This area was generated with Dilib version V0.6.33.
Data generation: Mon Jun 10 21:56:28 2019. Site generation: Fri Feb 25 15:29:27 2022