Text-mining assisted regulatory annotation
Internal identifier: 000239 (Pmc/Curation); previous: 000238; next: 000240
Authors: Stein Aerts [Belgium]; Maximilian Haeussler [France]; Steven Van Vooren [Belgium]; Obi L. Griffith [Canada]; Paco Hulpiau [Belgium]; Steven JM Jones [Canada]; Stephen B. Montgomery [United Kingdom]; Casey M. Bergman [United Kingdom]
Source:
- Genome Biology [1465-6906]; 2008.
Abstract
Text-mining technologies can be integrated with genome annotation systems, increasing the availability of annotated cis-regulatory data.
Url: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2374703
DOI: 10.1186/gb-2008-9-2-r31
PubMed: 18271954
PubMed Central: 2374703
Links toward previous steps (curation, corpus...)
- to stream Pmc, to step Corpus: to go to this record in the Curation step: 000239
Links to Exploration step
PMC:2374703
The document in XML format
<record><TEI><teiHeader><fileDesc><titleStmt><title xml:lang="en">Text-mining assisted regulatory annotation</title>
<author><name sortKey="Aerts, Stein" sort="Aerts, Stein" uniqKey="Aerts S" first="Stein" last="Aerts">Stein Aerts</name>
<affiliation wicri:level="1"><nlm:aff id="I1">Laboratory of Neurogenetics, Department of Molecular and Developmental Genetics, VIB, Leuven, B-3000, Belgium</nlm:aff>
<country xml:lang="fr">Belgique</country>
<wicri:regionArea>Laboratory of Neurogenetics, Department of Molecular and Developmental Genetics, VIB, Leuven, B-3000</wicri:regionArea>
</affiliation>
<affiliation wicri:level="1"><nlm:aff id="I2">Department of Human Genetics, Katholieke Universiteit Leuven School of Medicine, Herestraat, Leuven, B-3000, Belgium</nlm:aff>
<country xml:lang="fr">Belgique</country>
<wicri:regionArea>Department of Human Genetics, Katholieke Universiteit Leuven School of Medicine, Herestraat, Leuven, B-3000</wicri:regionArea>
</affiliation>
</author>
<author><name sortKey="Haeussler, Maximilian" sort="Haeussler, Maximilian" uniqKey="Haeussler M" first="Maximilian" last="Haeussler">Maximilian Haeussler</name>
<affiliation wicri:level="1"><nlm:aff id="I3">Institut de Neurosciences A Fessard, Centre National de la Recherche Scientifique, Gif-sur-Yvette, 91 198, France</nlm:aff>
<country xml:lang="fr">France</country>
<wicri:regionArea>Institut de Neurosciences A Fessard, Centre National de la Recherche Scientifique, Gif-sur-Yvette, 91 198</wicri:regionArea>
</affiliation>
</author>
<author><name sortKey="Van Vooren, Steven" sort="Van Vooren, Steven" uniqKey="Van Vooren S" first="Steven" last="Van Vooren">Steven Van Vooren</name>
<affiliation wicri:level="1"><nlm:aff id="I4">Department of Electrical Engineering, Katholieke Universiteit Leuven, Heverlee, B-3001, Belgium</nlm:aff>
<country xml:lang="fr">Belgique</country>
<wicri:regionArea>Department of Electrical Engineering, Katholieke Universiteit Leuven, Heverlee, B-3001</wicri:regionArea>
</affiliation>
</author>
<author><name sortKey="Griffith, Obi L" sort="Griffith, Obi L" uniqKey="Griffith O" first="Obi L" last="Griffith">Obi L. Griffith</name>
<affiliation wicri:level="1"><nlm:aff id="I5">Canada's Michael Smith Genome Sciences Centre, British Columbia Cancer Agency, Vancouver, V5Z 4E6, Canada</nlm:aff>
<country xml:lang="fr">Canada</country>
<wicri:regionArea>Canada's Michael Smith Genome Sciences Centre, British Columbia Cancer Agency, Vancouver, V5Z 4E6</wicri:regionArea>
</affiliation>
</author>
<author><name sortKey="Hulpiau, Paco" sort="Hulpiau, Paco" uniqKey="Hulpiau P" first="Paco" last="Hulpiau">Paco Hulpiau</name>
<affiliation wicri:level="1"><nlm:aff id="I6">VIB Department for Molecular Biomedical Research, Ghent University, Ghent, 9052, Belgium</nlm:aff>
<country xml:lang="fr">Belgique</country>
<wicri:regionArea>VIB Department for Molecular Biomedical Research, Ghent University, Ghent, 9052</wicri:regionArea>
</affiliation>
</author>
<author><name sortKey="Jones, Steven Jm" sort="Jones, Steven Jm" uniqKey="Jones S" first="Steven Jm" last="Jones">Steven Jm Jones</name>
<affiliation wicri:level="1"><nlm:aff id="I5">Canada's Michael Smith Genome Sciences Centre, British Columbia Cancer Agency, Vancouver, V5Z 4E6, Canada</nlm:aff>
<country xml:lang="fr">Canada</country>
<wicri:regionArea>Canada's Michael Smith Genome Sciences Centre, British Columbia Cancer Agency, Vancouver, V5Z 4E6</wicri:regionArea>
</affiliation>
</author>
<author><name sortKey="Montgomery, Stephen B" sort="Montgomery, Stephen B" uniqKey="Montgomery S" first="Stephen B" last="Montgomery">Stephen B. Montgomery</name>
<affiliation wicri:level="1"><nlm:aff id="I7">Wellcome Trust Sanger Institute, Hinxton, CB10 1SA, UK</nlm:aff>
<country xml:lang="fr">Royaume-Uni</country>
<wicri:regionArea>Wellcome Trust Sanger Institute, Hinxton, CB10 1SA</wicri:regionArea>
</affiliation>
</author>
<author><name sortKey="Bergman, Casey M" sort="Bergman, Casey M" uniqKey="Bergman C" first="Casey M" last="Bergman">Casey M. Bergman</name>
<affiliation wicri:level="1"><nlm:aff id="I8">Faculty of Life Sciences, University of Manchester, Oxford Road, Manchester, M13 9PT, UK</nlm:aff>
<country xml:lang="fr">Royaume-Uni</country>
<wicri:regionArea>Faculty of Life Sciences, University of Manchester, Oxford Road, Manchester, M13 9PT</wicri:regionArea>
</affiliation>
</author>
</titleStmt>
<publicationStmt><idno type="wicri:source">PMC</idno>
<idno type="pmid">18271954</idno>
<idno type="pmc">2374703</idno>
<idno type="url">http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2374703</idno>
<idno type="RBID">PMC:2374703</idno>
<idno type="doi">10.1186/gb-2008-9-2-r31</idno>
<date when="2008">2008</date>
<idno type="wicri:Area/Pmc/Corpus">000239</idno>
<idno type="wicri:Area/Pmc/Curation">000239</idno>
</publicationStmt>
<sourceDesc><biblStruct><analytic><title xml:lang="en" level="a" type="main">Text-mining assisted regulatory annotation</title>
<author><name sortKey="Aerts, Stein" sort="Aerts, Stein" uniqKey="Aerts S" first="Stein" last="Aerts">Stein Aerts</name>
<affiliation wicri:level="1"><nlm:aff id="I1">Laboratory of Neurogenetics, Department of Molecular and Developmental Genetics, VIB, Leuven, B-3000, Belgium</nlm:aff>
<country xml:lang="fr">Belgique</country>
<wicri:regionArea>Laboratory of Neurogenetics, Department of Molecular and Developmental Genetics, VIB, Leuven, B-3000</wicri:regionArea>
</affiliation>
<affiliation wicri:level="1"><nlm:aff id="I2">Department of Human Genetics, Katholieke Universiteit Leuven School of Medicine, Herestraat, Leuven, B-3000, Belgium</nlm:aff>
<country xml:lang="fr">Belgique</country>
<wicri:regionArea>Department of Human Genetics, Katholieke Universiteit Leuven School of Medicine, Herestraat, Leuven, B-3000</wicri:regionArea>
</affiliation>
</author>
<author><name sortKey="Haeussler, Maximilian" sort="Haeussler, Maximilian" uniqKey="Haeussler M" first="Maximilian" last="Haeussler">Maximilian Haeussler</name>
<affiliation wicri:level="1"><nlm:aff id="I3">Institut de Neurosciences A Fessard, Centre National de la Recherche Scientifique, Gif-sur-Yvette, 91 198, France</nlm:aff>
<country xml:lang="fr">France</country>
<wicri:regionArea>Institut de Neurosciences A Fessard, Centre National de la Recherche Scientifique, Gif-sur-Yvette, 91 198</wicri:regionArea>
</affiliation>
</author>
<author><name sortKey="Van Vooren, Steven" sort="Van Vooren, Steven" uniqKey="Van Vooren S" first="Steven" last="Van Vooren">Steven Van Vooren</name>
<affiliation wicri:level="1"><nlm:aff id="I4">Department of Electrical Engineering, Katholieke Universiteit Leuven, Heverlee, B-3001, Belgium</nlm:aff>
<country xml:lang="fr">Belgique</country>
<wicri:regionArea>Department of Electrical Engineering, Katholieke Universiteit Leuven, Heverlee, B-3001</wicri:regionArea>
</affiliation>
</author>
<author><name sortKey="Griffith, Obi L" sort="Griffith, Obi L" uniqKey="Griffith O" first="Obi L" last="Griffith">Obi L. Griffith</name>
<affiliation wicri:level="1"><nlm:aff id="I5">Canada's Michael Smith Genome Sciences Centre, British Columbia Cancer Agency, Vancouver, V5Z 4E6, Canada</nlm:aff>
<country xml:lang="fr">Canada</country>
<wicri:regionArea>Canada's Michael Smith Genome Sciences Centre, British Columbia Cancer Agency, Vancouver, V5Z 4E6</wicri:regionArea>
</affiliation>
</author>
<author><name sortKey="Hulpiau, Paco" sort="Hulpiau, Paco" uniqKey="Hulpiau P" first="Paco" last="Hulpiau">Paco Hulpiau</name>
<affiliation wicri:level="1"><nlm:aff id="I6">VIB Department for Molecular Biomedical Research, Ghent University, Ghent, 9052, Belgium</nlm:aff>
<country xml:lang="fr">Belgique</country>
<wicri:regionArea>VIB Department for Molecular Biomedical Research, Ghent University, Ghent, 9052</wicri:regionArea>
</affiliation>
</author>
<author><name sortKey="Jones, Steven Jm" sort="Jones, Steven Jm" uniqKey="Jones S" first="Steven Jm" last="Jones">Steven Jm Jones</name>
<affiliation wicri:level="1"><nlm:aff id="I5">Canada's Michael Smith Genome Sciences Centre, British Columbia Cancer Agency, Vancouver, V5Z 4E6, Canada</nlm:aff>
<country xml:lang="fr">Canada</country>
<wicri:regionArea>Canada's Michael Smith Genome Sciences Centre, British Columbia Cancer Agency, Vancouver, V5Z 4E6</wicri:regionArea>
</affiliation>
</author>
<author><name sortKey="Montgomery, Stephen B" sort="Montgomery, Stephen B" uniqKey="Montgomery S" first="Stephen B" last="Montgomery">Stephen B. Montgomery</name>
<affiliation wicri:level="1"><nlm:aff id="I7">Wellcome Trust Sanger Institute, Hinxton, CB10 1SA, UK</nlm:aff>
<country xml:lang="fr">Royaume-Uni</country>
<wicri:regionArea>Wellcome Trust Sanger Institute, Hinxton, CB10 1SA</wicri:regionArea>
</affiliation>
</author>
<author><name sortKey="Bergman, Casey M" sort="Bergman, Casey M" uniqKey="Bergman C" first="Casey M" last="Bergman">Casey M. Bergman</name>
<affiliation wicri:level="1"><nlm:aff id="I8">Faculty of Life Sciences, University of Manchester, Oxford Road, Manchester, M13 9PT, UK</nlm:aff>
<country xml:lang="fr">Royaume-Uni</country>
<wicri:regionArea>Faculty of Life Sciences, University of Manchester, Oxford Road, Manchester, M13 9PT</wicri:regionArea>
</affiliation>
</author>
</analytic>
<series><title level="j">Genome Biology</title>
<idno type="ISSN">1465-6906</idno>
<idno type="eISSN">1465-6914</idno>
<imprint><date when="2008">2008</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
</fileDesc>
<profileDesc><textClass></textClass>
</profileDesc>
</teiHeader>
<front><div type="abstract" xml:lang="en"><p>Text-mining technologies can be integrated with genome annotation systems, increasing the availability of annotated <italic>cis</italic>
-regulatory data.</p>
</div>
</front>
</TEI>
<pmc article-type="research-article"><pmc-dir>properties open_access</pmc-dir>
<front><journal-meta><journal-id journal-id-type="nlm-ta">Genome Biol</journal-id>
<journal-title>Genome Biology</journal-title>
<issn pub-type="ppub">1465-6906</issn>
<issn pub-type="epub">1465-6914</issn>
<publisher><publisher-name>BioMed Central</publisher-name>
</publisher>
</journal-meta>
<article-meta><article-id pub-id-type="pmid">18271954</article-id>
<article-id pub-id-type="pmc">2374703</article-id>
<article-id pub-id-type="publisher-id">gb-2008-9-2-r31</article-id>
<article-id pub-id-type="doi">10.1186/gb-2008-9-2-r31</article-id>
<article-categories><subj-group subj-group-type="heading"><subject>Research</subject>
</subj-group>
</article-categories>
<title-group><article-title>Text-mining assisted regulatory annotation</article-title>
</title-group>
<contrib-group><contrib id="A1" corresp="yes" contrib-type="author"><name><surname>Aerts</surname>
<given-names>Stein</given-names>
</name>
<xref ref-type="aff" rid="I1">1</xref>
<xref ref-type="aff" rid="I2">2</xref>
<email>stein.aerts@med.kuleuven.be</email>
</contrib>
<contrib id="A2" contrib-type="author"><name><surname>Haeussler</surname>
<given-names>Maximilian</given-names>
</name>
<xref ref-type="aff" rid="I3">3</xref>
<email>maximilianh@gmail.com</email>
</contrib>
<contrib id="A3" contrib-type="author"><name><surname>van Vooren</surname>
<given-names>Steven</given-names>
</name>
<xref ref-type="aff" rid="I4">4</xref>
<email>Steven.VanVooren@esat.kuleuven.ac.be</email>
</contrib>
<contrib id="A4" contrib-type="author"><name><surname>Griffith</surname>
<given-names>Obi L</given-names>
</name>
<xref ref-type="aff" rid="I5">5</xref>
<email>obig@bcgsc.ca</email>
</contrib>
<contrib id="A5" contrib-type="author"><name><surname>Hulpiau</surname>
<given-names>Paco</given-names>
</name>
<xref ref-type="aff" rid="I6">6</xref>
<email>paco.hulpiau@dmbr.ugent.be</email>
</contrib>
<contrib id="A6" contrib-type="author"><name><surname>Jones</surname>
<given-names>Steven JM</given-names>
</name>
<xref ref-type="aff" rid="I5">5</xref>
<email>sjones@bcgsc.ca</email>
</contrib>
<contrib id="A7" contrib-type="author"><name><surname>Montgomery</surname>
<given-names>Stephen B</given-names>
</name>
<xref ref-type="aff" rid="I7">7</xref>
<email>sm8@sanger.ac.uk</email>
</contrib>
<contrib id="A8" corresp="yes" contrib-type="author"><name><surname>Bergman</surname>
<given-names>Casey M</given-names>
</name>
<xref ref-type="aff" rid="I8">8</xref>
<email>casey.bergman@manchester.ac.uk</email>
</contrib>
<contrib id="A9" contrib-type="author"><collab>The Open Regulatory Annotation Consortium</collab>
<email>oreganno@noaddress.com</email>
</contrib>
</contrib-group>
<aff id="I1"><label>1</label>
Laboratory of Neurogenetics, Department of Molecular and Developmental Genetics, VIB, Leuven, B-3000, Belgium</aff>
<aff id="I2"><label>2</label>
Department of Human Genetics, Katholieke Universiteit Leuven School of Medicine, Herestraat, Leuven, B-3000, Belgium</aff>
<aff id="I3"><label>3</label>
Institut de Neurosciences A Fessard, Centre National de la Recherche Scientifique, Gif-sur-Yvette, 91 198, France</aff>
<aff id="I4"><label>4</label>
Department of Electrical Engineering, Katholieke Universiteit Leuven, Heverlee, B-3001, Belgium</aff>
<aff id="I5"><label>5</label>
Canada's Michael Smith Genome Sciences Centre, British Columbia Cancer Agency, Vancouver, V5Z 4E6, Canada</aff>
<aff id="I6"><label>6</label>
VIB Department for Molecular Biomedical Research, Ghent University, Ghent, 9052, Belgium</aff>
<aff id="I7"><label>7</label>
Wellcome Trust Sanger Institute, Hinxton, CB10 1SA, UK</aff>
<aff id="I8"><label>8</label>
Faculty of Life Sciences, University of Manchester, Oxford Road, Manchester, M13 9PT, UK</aff>
<pub-date pub-type="ppub"><year>2008</year>
</pub-date>
<pub-date pub-type="epub"><day>13</day>
<month>2</month>
<year>2008</year>
</pub-date>
<volume>9</volume>
<issue>2</issue>
<fpage>R31</fpage>
<lpage>R31</lpage>
<ext-link ext-link-type="uri" xlink:href="http://genomebiology.com/2008/9/2/R31"></ext-link>
<history><date date-type="received"><day>2</day>
<month>10</month>
<year>2007</year>
</date>
<date date-type="rev-recd"><day>21</day>
<month>12</month>
<year>2007</year>
</date>
<date date-type="accepted"><day>13</day>
<month>2</month>
<year>2008</year>
</date>
</history>
<permissions><copyright-statement>Copyright © 2008 Aerts et al.; licensee BioMed Central Ltd.</copyright-statement>
<copyright-year>2008</copyright-year>
<copyright-holder>Aerts et al.; licensee BioMed Central Ltd.</copyright-holder>
<license license-type="open-access" xlink:href="http://creativecommons.org/licenses/by/2.0"><p>This is an open access article distributed under the terms of the Creative Commons Attribution License (<ext-link ext-link-type="uri" xlink:href="http://creativecommons.org/licenses/by/2.0"></ext-link>
), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</p>
<pmc-comment>
Aerts
Stein
stein.aerts@med.kuleuven.be
Text-mining assisted regulatory annotation
2008 Genome Biology 9(2): R31-. (2008) 1465-6906(2008)9:2 urn:ISSN:1465-6906 </pmc-comment>
</license>
</permissions>
<abstract abstract-type="short"><p>Text-mining technologies can be integrated with genome annotation systems, increasing the availability of annotated <italic>cis</italic>
-regulatory data.</p>
</abstract>
<abstract><sec><title>Background</title>
<p>Decoding transcriptional regulatory networks and the genomic <italic>cis</italic>
-regulatory logic implemented in their control nodes is a fundamental challenge in genome biology. High-throughput computational and experimental analyses of regulatory networks and sequences rely heavily on positive control data from prior small-scale experiments, but the vast majority of previously discovered regulatory data remains locked in the biomedical literature.</p>
</sec>
<sec><title>Results</title>
<p>We develop text-mining strategies to identify relevant publications and extract sequence information to assist the regulatory annotation process. Using a vector space model to identify Medline abstracts from papers likely to have high <italic>cis</italic>
-regulatory content, we demonstrate that document relevance ranking can assist the curation of transcriptional regulatory networks and estimate that, minimally, 30,000 papers harbor unannotated <italic>cis</italic>
-regulatory data. In addition, we show that DNA sequences can be extracted from primary text with high <italic>cis</italic>
-regulatory content and mapped to genome sequences as a means of identifying the location, organism and target gene information that is critical to the <italic>cis</italic>
-regulatory annotation process.</p>
</sec>
<sec><title>Conclusion</title>
<p>Our results demonstrate that text-mining technologies can be successfully integrated with genome annotation systems, thereby increasing the availability of annotated <italic>cis</italic>
-regulatory data needed to catalyze advances in the field of gene regulation.</p>
</sec>
</abstract>
</article-meta>
</front>
</pmc>
</record>
To work with this document under Unix (Dilib)
EXPLOR_STEP=$WICRI_ROOT/Wicri/Belgique/explor/OpenAccessBelV2/Data/Pmc/Curation
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000239 | SxmlIndent | more
Or
HfdSelect -h $EXPLOR_AREA/Data/Pmc/Curation/biblio.hfd -nk 000239 | SxmlIndent | more
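Once the record has been printed with the commands above, its identifiers can be read with an ordinary XML parser. The following is a minimal sketch in Python, not part of Dilib: the `SAMPLE` string is a hand-copied excerpt of the `<article-meta>` block from this record, and the helper names `extract_ids` and `extract_title` are hypothetical. It assumes the fragment uses no undeclared namespace prefixes; the full record uses prefixes such as `wicri:` and `nlm:`, which a strict parser may reject unless matching namespace declarations are supplied.

```python
import xml.etree.ElementTree as ET

# Hypothetical excerpt of the <article-meta> block from this record.
# In practice the text would come from the HfdSelect output.
SAMPLE = """<article-meta>
<article-id pub-id-type="pmid">18271954</article-id>
<article-id pub-id-type="pmc">2374703</article-id>
<article-id pub-id-type="doi">10.1186/gb-2008-9-2-r31</article-id>
<title-group><article-title>Text-mining assisted regulatory annotation</article-title></title-group>
</article-meta>"""


def extract_ids(xml_text):
    """Map each pub-id-type attribute to the identifier it labels."""
    root = ET.fromstring(xml_text)
    return {
        el.get("pub-id-type"): (el.text or "").strip()
        for el in root.iter("article-id")
    }


def extract_title(xml_text):
    """Return the article title, or None if the element is absent."""
    root = ET.fromstring(xml_text)
    return root.findtext(".//article-title")


if __name__ == "__main__":
    print(extract_ids(SAMPLE))
    print(extract_title(SAMPLE))
```

Keying the result on `pub-id-type` keeps the PubMed, PMC and DOI identifiers separate, so the same lookup works whichever identifier a downstream step needs.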
To add a link to this page in the Wicri network
{{Explor lien |wiki= Wicri/Belgique |area= OpenAccessBelV2 |flux= Pmc |étape= Curation |type= RBID |clé= PMC:2374703 |texte= Text-mining assisted regulatory annotation }}
To generate wiki pages
HfdIndexSelect -h $EXPLOR_AREA/Data/Pmc/Curation/RBID.i -Sk "pubmed:18271954" \
  | HfdSelect -Kh $EXPLOR_AREA/Data/Pmc/Curation/biblio.hfd \
  | NlmPubMed2Wicri -a OpenAccessBelV2
This area was generated with Dilib version V0.6.25.