Exploration server for computer science research in Lorraine

Warning: this site is under development.
Warning: this site was generated automatically from raw corpora.
The information has therefore not been validated.

Ontology-guided data preparation for discovering genotype-phenotype relationships

Internal identifier: 000081 (Pmc/Checkpoint); previous: 000080; next: 000082

Authors: Adrien Coulet [France]; Malika Smaïl-Tabbone [France]; Pascale Benlian [France]; Amedeo Napoli [France]; Marie-Dominique Devignes [France]

RBID : PMC:2367630

Abstract

Background

Complexity and amount of post-genomic data constitute two major factors limiting the application of Knowledge Discovery in Databases (KDD) methods in life sciences. Bio-ontologies may nowadays play key roles in knowledge discovery in life science providing semantics to data and to extracted units, by taking advantage of the progress of Semantic Web technologies concerning the understanding and availability of tools for knowledge representation, extraction, and reasoning.

Results

This paper presents a method that exploits bio-ontologies for guiding data selection within the preparation step of the KDD process. We propose three scenarios in which domain knowledge and ontology elements such as subsumption, properties, class descriptions, are taken into account for data selection, before the data mining step. Each of these scenarios is illustrated within a case-study relative to the search of genotype-phenotype relationships in a familial hypercholesterolemia dataset. The guiding of data selection based on domain knowledge is analysed and shows a direct influence on the volume and significance of the data mining results.

Conclusions

The method proposed in this paper is an efficient alternative to numerical methods for data selection based on domain knowledge. In turn, the results of this study may be reused in ontology modelling and data integration.


DOI: 10.1186/1471-2105-9-S4-S3
PubMed: 18460176
PubMed Central: 2367630


The document in XML format

<record>
<TEI>
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en">Ontology-guided data preparation for discovering genotype-phenotype relationships</title>
<author>
<name sortKey="Coulet, Adrien" sort="Coulet, Adrien" uniqKey="Coulet A" first="Adrien" last="Coulet">Adrien Coulet</name>
<affiliation wicri:level="3">
<nlm:aff id="I1">KIKA Medical, Paris, F-75012, France</nlm:aff>
<country xml:lang="fr">France</country>
<wicri:regionArea>KIKA Medical, Paris, F-75012</wicri:regionArea>
<placeName>
<region type="region" nuts="2">Île-de-France</region>
</placeName>
</affiliation>
<affiliation wicri:level="3">
<nlm:aff id="I2">LORIA (UMR 7503 CNRS-INPL-INRIA-Nancy2-UHP), Vandoeuvre-lès-Nancy, F- 54506, France</nlm:aff>
<country xml:lang="fr">France</country>
<wicri:regionArea>LORIA (UMR 7503 CNRS-INPL-INRIA-Nancy2-UHP), Vandoeuvre-lès-Nancy, F- 54506</wicri:regionArea>
<placeName>
<region type="region" nuts="2">Grand Est</region>
<region type="old region" nuts="2">Lorraine (région)</region>
</placeName>
</affiliation>
</author>
<author>
<name sortKey="Smail Tabbone, Malika" sort="Smail Tabbone, Malika" uniqKey="Smail Tabbone M" first="Malika" last="Smaïl-Tabbone">Malika Smaïl-Tabbone</name>
<affiliation wicri:level="3">
<nlm:aff id="I2">LORIA (UMR 7503 CNRS-INPL-INRIA-Nancy2-UHP), Vandoeuvre-lès-Nancy, F- 54506, France</nlm:aff>
<country xml:lang="fr">France</country>
<wicri:regionArea>LORIA (UMR 7503 CNRS-INPL-INRIA-Nancy2-UHP), Vandoeuvre-lès-Nancy, F- 54506</wicri:regionArea>
<placeName>
<region type="region" nuts="2">Grand Est</region>
<region type="old region" nuts="2">Lorraine (région)</region>
</placeName>
</affiliation>
</author>
<author>
<name sortKey="Benlian, Pascale" sort="Benlian, Pascale" uniqKey="Benlian P" first="Pascale" last="Benlian">Pascale Benlian</name>
<affiliation wicri:level="3">
<nlm:aff id="I3">Université Pierre et Marie Curie - Paris6, INSERM UMRS 538 Biochimie-Biologie Moléculaire, Paris, F-75571, France</nlm:aff>
<country xml:lang="fr">France</country>
<wicri:regionArea>Université Pierre et Marie Curie - Paris6, INSERM UMRS 538 Biochimie-Biologie Moléculaire, Paris, F-75571</wicri:regionArea>
<placeName>
<region type="region" nuts="2">Île-de-France</region>
</placeName>
</affiliation>
</author>
<author>
<name sortKey="Napoli, Amedeo" sort="Napoli, Amedeo" uniqKey="Napoli A" first="Amedeo" last="Napoli">Amedeo Napoli</name>
<affiliation wicri:level="3">
<nlm:aff id="I2">LORIA (UMR 7503 CNRS-INPL-INRIA-Nancy2-UHP), Vandoeuvre-lès-Nancy, F- 54506, France</nlm:aff>
<country xml:lang="fr">France</country>
<wicri:regionArea>LORIA (UMR 7503 CNRS-INPL-INRIA-Nancy2-UHP), Vandoeuvre-lès-Nancy, F- 54506</wicri:regionArea>
<placeName>
<region type="region" nuts="2">Grand Est</region>
<region type="old region" nuts="2">Lorraine (région)</region>
</placeName>
</affiliation>
</author>
<author>
<name sortKey="Devignes, Marie Dominique" sort="Devignes, Marie Dominique" uniqKey="Devignes M" first="Marie-Dominique" last="Devignes">Marie-Dominique Devignes</name>
<affiliation wicri:level="3">
<nlm:aff id="I2">LORIA (UMR 7503 CNRS-INPL-INRIA-Nancy2-UHP), Vandoeuvre-lès-Nancy, F- 54506, France</nlm:aff>
<country xml:lang="fr">France</country>
<wicri:regionArea>LORIA (UMR 7503 CNRS-INPL-INRIA-Nancy2-UHP), Vandoeuvre-lès-Nancy, F- 54506</wicri:regionArea>
<placeName>
<region type="region" nuts="2">Grand Est</region>
<region type="old region" nuts="2">Lorraine (région)</region>
</placeName>
</affiliation>
</author>
</titleStmt>
<publicationStmt>
<idno type="wicri:source">PMC</idno>
<idno type="pmid">18460176</idno>
<idno type="pmc">2367630</idno>
<idno type="url">http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2367630</idno>
<idno type="RBID">PMC:2367630</idno>
<idno type="doi">10.1186/1471-2105-9-S4-S3</idno>
<date when="2008">2008</date>
<idno type="wicri:Area/Pmc/Corpus">000022</idno>
<idno type="wicri:explorRef" wicri:stream="Pmc" wicri:step="Corpus" wicri:corpus="PMC">000022</idno>
<idno type="wicri:Area/Pmc/Curation">000022</idno>
<idno type="wicri:explorRef" wicri:stream="Pmc" wicri:step="Curation">000022</idno>
<idno type="wicri:Area/Pmc/Checkpoint">000081</idno>
<idno type="wicri:explorRef" wicri:stream="Pmc" wicri:step="Checkpoint">000081</idno>
</publicationStmt>
<sourceDesc>
<biblStruct>
<analytic>
<title xml:lang="en" level="a" type="main">Ontology-guided data preparation for discovering genotype-phenotype relationships</title>
<author>
<name sortKey="Coulet, Adrien" sort="Coulet, Adrien" uniqKey="Coulet A" first="Adrien" last="Coulet">Adrien Coulet</name>
<affiliation wicri:level="3">
<nlm:aff id="I1">KIKA Medical, Paris, F-75012, France</nlm:aff>
<country xml:lang="fr">France</country>
<wicri:regionArea>KIKA Medical, Paris, F-75012</wicri:regionArea>
<wicri:noRegion>75012</wicri:noRegion>
<placeName>
<region type="region" nuts="2">Île-de-France</region>
</placeName>
</affiliation>
<affiliation wicri:level="3">
<nlm:aff id="I2">LORIA (UMR 7503 CNRS-INPL-INRIA-Nancy2-UHP), Vandoeuvre-lès-Nancy, F- 54506, France</nlm:aff>
<country xml:lang="fr">France</country>
<wicri:regionArea>LORIA (UMR 7503 CNRS-INPL-INRIA-Nancy2-UHP), Vandoeuvre-lès-Nancy, F- 54506</wicri:regionArea>
<placeName>
<region type="region" nuts="2">Grand Est</region>
<region type="old region" nuts="2">Lorraine (région)</region>
</placeName>
</affiliation>
</author>
<author>
<name sortKey="Smail Tabbone, Malika" sort="Smail Tabbone, Malika" uniqKey="Smail Tabbone M" first="Malika" last="Smaïl-Tabbone">Malika Smaïl-Tabbone</name>
<affiliation wicri:level="3">
<nlm:aff id="I2">LORIA (UMR 7503 CNRS-INPL-INRIA-Nancy2-UHP), Vandoeuvre-lès-Nancy, F- 54506, France</nlm:aff>
<country xml:lang="fr">France</country>
<wicri:regionArea>LORIA (UMR 7503 CNRS-INPL-INRIA-Nancy2-UHP), Vandoeuvre-lès-Nancy, F- 54506</wicri:regionArea>
<placeName>
<region type="region" nuts="2">Grand Est</region>
<region type="old region" nuts="2">Lorraine (région)</region>
</placeName>
</affiliation>
</author>
<author>
<name sortKey="Benlian, Pascale" sort="Benlian, Pascale" uniqKey="Benlian P" first="Pascale" last="Benlian">Pascale Benlian</name>
<affiliation wicri:level="3">
<nlm:aff id="I3">Université Pierre et Marie Curie - Paris6, INSERM UMRS 538 Biochimie-Biologie Moléculaire, Paris, F-75571, France</nlm:aff>
<country xml:lang="fr">France</country>
<wicri:regionArea>Université Pierre et Marie Curie - Paris6, INSERM UMRS 538 Biochimie-Biologie Moléculaire, Paris, F-75571</wicri:regionArea>
<placeName>
<region type="region" nuts="2">Île-de-France</region>
</placeName>
</affiliation>
</author>
<author>
<name sortKey="Napoli, Amedeo" sort="Napoli, Amedeo" uniqKey="Napoli A" first="Amedeo" last="Napoli">Amedeo Napoli</name>
<affiliation wicri:level="3">
<nlm:aff id="I2">LORIA (UMR 7503 CNRS-INPL-INRIA-Nancy2-UHP), Vandoeuvre-lès-Nancy, F- 54506, France</nlm:aff>
<country xml:lang="fr">France</country>
<wicri:regionArea>LORIA (UMR 7503 CNRS-INPL-INRIA-Nancy2-UHP), Vandoeuvre-lès-Nancy, F- 54506</wicri:regionArea>
<placeName>
<region type="region" nuts="2">Grand Est</region>
<region type="old region" nuts="2">Lorraine (région)</region>
</placeName>
</affiliation>
</author>
<author>
<name sortKey="Devignes, Marie Dominique" sort="Devignes, Marie Dominique" uniqKey="Devignes M" first="Marie-Dominique" last="Devignes">Marie-Dominique Devignes</name>
<affiliation wicri:level="3">
<nlm:aff id="I2">LORIA (UMR 7503 CNRS-INPL-INRIA-Nancy2-UHP), Vandoeuvre-lès-Nancy, F- 54506, France</nlm:aff>
<country xml:lang="fr">France</country>
<wicri:regionArea>LORIA (UMR 7503 CNRS-INPL-INRIA-Nancy2-UHP), Vandoeuvre-lès-Nancy, F- 54506</wicri:regionArea>
<placeName>
<region type="region" nuts="2">Grand Est</region>
<region type="old region" nuts="2">Lorraine (région)</region>
</placeName>
</affiliation>
</author>
</analytic>
<series>
<title level="j">BMC Bioinformatics</title>
<idno type="eISSN">1471-2105</idno>
<imprint>
<date when="2008">2008</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
</fileDesc>
<profileDesc>
<textClass></textClass>
</profileDesc>
</teiHeader>
<front>
<div type="abstract" xml:lang="en">
<sec>
<title>Background</title>
<p>Complexity and amount of post-genomic data constitute two major factors limiting the application of Knowledge Discovery in Databases (KDD) methods in life sciences. Bio-ontologies may nowadays play key roles in knowledge discovery in life science providing semantics to data and to extracted units, by taking advantage of the progress of Semantic Web technologies concerning the understanding and availability of tools for knowledge representation, extraction, and reasoning.</p>
</sec>
<sec>
<title>Results</title>
<p>This paper presents a method that exploits bio-ontologies for guiding data selection within the preparation step of the KDD process. We propose three scenarios in which domain knowledge and ontology elements such as subsumption, properties, class descriptions, are taken into account for data selection, before the data mining step. Each of these scenarios is illustrated within a case-study relative to the search of genotype-phenotype relationships in a familial hypercholesterolemia dataset. The guiding of data selection based on domain knowledge is analysed and shows a direct influence on the volume and significance of the data mining results.</p>
</sec>
<sec>
<title>Conclusions</title>
<p>The method proposed in this paper is an efficient alternative to numerical methods for data selection based on domain knowledge. In turn, the results of this study may be reused in ontology modelling and data integration.</p>
</sec>
</div>
</front>
</TEI>
<pmc article-type="research-article">
<pmc-dir>properties open_access</pmc-dir>
<front>
<journal-meta>
<journal-id journal-id-type="nlm-ta">BMC Bioinformatics</journal-id>
<journal-title>BMC Bioinformatics</journal-title>
<issn pub-type="epub">1471-2105</issn>
<publisher>
<publisher-name>BioMed Central</publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="pmid">18460176</article-id>
<article-id pub-id-type="pmc">2367630</article-id>
<article-id pub-id-type="publisher-id">1471-2105-9-S4-S3</article-id>
<article-id pub-id-type="doi">10.1186/1471-2105-9-S4-S3</article-id>
<article-categories>
<subj-group subj-group-type="heading">
<subject>Research</subject>
</subj-group>
</article-categories>
<title-group>
<article-title>Ontology-guided data preparation for discovering genotype-phenotype relationships</article-title>
</title-group>
<contrib-group>
<contrib id="A1" corresp="yes" contrib-type="author">
<name>
<surname>Coulet</surname>
<given-names>Adrien</given-names>
</name>
<xref ref-type="aff" rid="I1">1</xref>
<xref ref-type="aff" rid="I2">2</xref>
<email>adrien.coulet@loria.fr</email>
</contrib>
<contrib id="A2" contrib-type="author">
<name>
<surname>Smaïl-Tabbone</surname>
<given-names>Malika</given-names>
</name>
<xref ref-type="aff" rid="I2">2</xref>
<email>malika.smail@loria.fr</email>
</contrib>
<contrib id="A3" contrib-type="author">
<name>
<surname>Benlian</surname>
<given-names>Pascale</given-names>
</name>
<xref ref-type="aff" rid="I3">3</xref>
<email>pascale.benlian@sat.ap-hop-paris.fr</email>
</contrib>
<contrib id="A4" contrib-type="author">
<name>
<surname>Napoli</surname>
<given-names>Amedeo</given-names>
</name>
<xref ref-type="aff" rid="I2">2</xref>
<email>amedeo.napoli@loria.fr</email>
</contrib>
<contrib id="A5" contrib-type="author">
<name>
<surname>Devignes</surname>
<given-names>Marie-Dominique</given-names>
</name>
<xref ref-type="aff" rid="I2">2</xref>
<email>marie-dominique.devignes@loria.fr</email>
</contrib>
</contrib-group>
<aff id="I1">
<label>1</label>
KIKA Medical, Paris, F-75012, France</aff>
<aff id="I2">
<label>2</label>
LORIA (UMR 7503 CNRS-INPL-INRIA-Nancy2-UHP), Vandoeuvre-lès-Nancy, F- 54506, France</aff>
<aff id="I3">
<label>3</label>
Université Pierre et Marie Curie - Paris6, INSERM UMRS 538 Biochimie-Biologie Moléculaire, Paris, F-75571, France</aff>
<pub-date pub-type="collection">
<year>2008</year>
</pub-date>
<pub-date pub-type="epub">
<day>25</day>
<month>4</month>
<year>2008</year>
</pub-date>
<volume>9</volume>
<issue>Suppl 4</issue>
<supplement>
<named-content content-type="supplement-title">A Semantic Web for Bioinformatics: Goals, Tools, Systems, Applications</named-content>
<named-content content-type="supplement-editor">Paolo Romano, Michael Schroeder, Nicola Cannata and Roberto Marangoni</named-content>
</supplement>
<fpage>S3</fpage>
<lpage>S3</lpage>
<ext-link ext-link-type="uri" xlink:href="http://www.biomedcentral.com/1471-2105/9/S4/S3"></ext-link>
<permissions>
<copyright-statement>Copyright © 2008 Coulet et al.; licensee BioMed Central Ltd.</copyright-statement>
<copyright-year>2008</copyright-year>
<copyright-holder>Coulet et al.; licensee BioMed Central Ltd.</copyright-holder>
<license license-type="open-access" xlink:href="http://creativecommons.org/licenses/by/2.0">
<p>This is an open access article distributed under the terms of the Creative Commons Attribution License (
<ext-link ext-link-type="uri" xlink:href="http://creativecommons.org/licenses/by/2.0"></ext-link>
), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</p>
<pmc-comment> Coulet Adrien adrien.coulet@loria.fr Ontology-guided data preparation for discovering genotype-phenotype relationships 2008BMC Bioinformatics 9(Suppl 4): S3-. (2008)1471-2105(2008)9:Suppl 4urn:ISSN:1471-2105</pmc-comment>
</license>
</permissions>
<abstract>
<sec>
<title>Background</title>
<p>Complexity and amount of post-genomic data constitute two major factors limiting the application of Knowledge Discovery in Databases (KDD) methods in life sciences. Bio-ontologies may nowadays play key roles in knowledge discovery in life science providing semantics to data and to extracted units, by taking advantage of the progress of Semantic Web technologies concerning the understanding and availability of tools for knowledge representation, extraction, and reasoning.</p>
</sec>
<sec>
<title>Results</title>
<p>This paper presents a method that exploits bio-ontologies for guiding data selection within the preparation step of the KDD process. We propose three scenarios in which domain knowledge and ontology elements such as subsumption, properties, class descriptions, are taken into account for data selection, before the data mining step. Each of these scenarios is illustrated within a case-study relative to the search of genotype-phenotype relationships in a familial hypercholesterolemia dataset. The guiding of data selection based on domain knowledge is analysed and shows a direct influence on the volume and significance of the data mining results.</p>
</sec>
<sec>
<title>Conclusions</title>
<p>The method proposed in this paper is an efficient alternative to numerical methods for data selection based on domain knowledge. In turn, the results of this study may be reused in ontology modelling and data integration.</p>
</sec>
</abstract>
<conference>
<conf-date>12–15 June 2007</conf-date>
<conf-name>Seventh International Workshop on Network Tools and Applications in Biology (NETTAB 2007)</conf-name>
<conf-loc>Pisa, Italy</conf-loc>
</conference>
</article-meta>
</front>
</pmc>
<affiliations>
<list>
<country>
<li>France</li>
</country>
<region>
<li>Grand Est</li>
<li>Lorraine (région)</li>
<li>Île-de-France</li>
</region>
</list>
<tree>
<country name="France">
<region name="Île-de-France">
<name sortKey="Coulet, Adrien" sort="Coulet, Adrien" uniqKey="Coulet A" first="Adrien" last="Coulet">Adrien Coulet</name>
</region>
<name sortKey="Benlian, Pascale" sort="Benlian, Pascale" uniqKey="Benlian P" first="Pascale" last="Benlian">Pascale Benlian</name>
<name sortKey="Coulet, Adrien" sort="Coulet, Adrien" uniqKey="Coulet A" first="Adrien" last="Coulet">Adrien Coulet</name>
<name sortKey="Devignes, Marie Dominique" sort="Devignes, Marie Dominique" uniqKey="Devignes M" first="Marie-Dominique" last="Devignes">Marie-Dominique Devignes</name>
<name sortKey="Napoli, Amedeo" sort="Napoli, Amedeo" uniqKey="Napoli A" first="Amedeo" last="Napoli">Amedeo Napoli</name>
<name sortKey="Smail Tabbone, Malika" sort="Smail Tabbone, Malika" uniqKey="Smail Tabbone M" first="Malika" last="Smaïl-Tabbone">Malika Smaïl-Tabbone</name>
</country>
</tree>
</affiliations>
</record>

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Wicri/Lorraine/explor/InforLorV4/Data/Pmc/Checkpoint
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000081 | SxmlIndent | more

Or

HfdSelect -h $EXPLOR_AREA/Data/Pmc/Checkpoint/biblio.hfd -nk 000081 | SxmlIndent | more
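Once a record has been extracted with the commands above, its identifiers can be read with a standard XML parser. The following is a minimal sketch in Python, not part of Dilib. It assumes a hand-copied excerpt of the record's `<publicationStmt>` block that keeps only the prefix-free `<idno>` elements: the full record uses `wicri:`, `nlm:` and `xlink:` prefixes that are not declared inside the record, so a strict parser would likely reject it until those prefixes are bound. The function name `read_identifiers` is hypothetical.

```python
import xml.etree.ElementTree as ET

# Excerpt copied from the record's <publicationStmt>. Only elements without
# namespace prefixes are kept (an assumption made so that a strict XML
# parser accepts the fragment as-is).
FRAGMENT = """
<publicationStmt>
  <idno type="pmid">18460176</idno>
  <idno type="pmc">2367630</idno>
  <idno type="doi">10.1186/1471-2105-9-S4-S3</idno>
  <date when="2008">2008</date>
</publicationStmt>
"""


def read_identifiers(xml_text):
    """Return a dict mapping each <idno> type attribute to its text value."""
    root = ET.fromstring(xml_text.strip())
    return {idno.get("type"): idno.text for idno in root.iter("idno")}


identifiers = read_identifiers(FRAGMENT)
year = ET.fromstring(FRAGMENT.strip()).find("date").get("when")
print(identifiers["doi"], year)
```

To work on the whole record instead of an excerpt, one option would be to wrap it in a root element that declares the missing prefixes; the namespace URIs to use are not given on this page.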

To link to this page within the Wicri network

{{Explor lien
   |wiki=    Wicri/Lorraine
   |area=    InforLorV4
   |flux=    Pmc
   |étape=   Checkpoint
   |type=    RBID
   |clé=     PMC:2367630
   |texte=   Ontology-guided data preparation for discovering genotype-phenotype relationships
}}
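When many records need such a link, the template call can be produced by a small script. The sketch below is an illustration only: the function `explor_link` and its argument names are hypothetical, and it assumes that the amount of whitespace after each `=` is not significant to the template. The parameter names themselves (`wiki`, `area`, `flux`, `étape`, `type`, `clé`, `texte`) are taken from the template call shown above.

```python
def explor_link(wiki, area, flux, etape, type_, cle, texte):
    """Build an {{Explor lien}} template call as a multi-line string."""
    # Parameter names are kept exactly as they appear in the wiki template.
    fields = [
        ("wiki", wiki),
        ("area", area),
        ("flux", flux),
        ("étape", etape),
        ("type", type_),
        ("clé", cle),
        ("texte", texte),
    ]
    lines = ["{{Explor lien"]
    for name, value in fields:
        lines.append(f"   |{name}= {value}")
    lines.append("}}")
    return "\n".join(lines)


link = explor_link(
    wiki="Wicri/Lorraine",
    area="InforLorV4",
    flux="Pmc",
    etape="Checkpoint",
    type_="RBID",
    cle="PMC:2367630",
    texte="Ontology-guided data preparation for discovering "
          "genotype-phenotype relationships",
)
print(link)
```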

To generate wiki pages

HfdIndexSelect -h $EXPLOR_AREA/Data/Pmc/Checkpoint/RBID.i   -Sk "pubmed:18460176" \
       | HfdSelect -Kh $EXPLOR_AREA/Data/Pmc/Checkpoint/biblio.hfd   \
       | NlmPubMed2Wicri -a InforLorV4 

Wicri

This area was generated with Dilib version V0.6.33.
Data generation: Mon Jun 10 21:56:28 2019. Site generation: Fri Feb 25 15:29:27 2022