Exploration server for open access in Belgium

Warning: this site is under development!
Warning: this site was generated automatically from raw corpora.
The information has therefore not been validated.

Large-Scale Event Extraction from Literature with Multi-Level Gene Normalization

Internal identifier: 000265 (Pmc/Checkpoint); previous: 000264; next: 000266

Authors: Sofie Van Landeghem [Belgium]; Jari Björne [Finland]; Chih-Hsuan Wei [United States, Taiwan]; Kai Hakala [Finland]; Sampo Pyysalo [United Kingdom]; Sophia Ananiadou [United Kingdom]; Hung-Yu Kao [Taiwan]; Zhiyong Lu [United States]; Tapio Salakoski [Finland]; Yves Van De Peer [Belgium]; Filip Ginter [Finland]

RBID : PMC:3629104

Abstract

Text mining for the life sciences aims to aid database curation, knowledge summarization and information retrieval through the automated processing of biomedical texts. To provide comprehensive coverage and enable full integration with existing biomolecular database records, it is crucial that text mining tools scale up to millions of articles and that their analyses can be unambiguously linked to information recorded in resources such as UniProt, KEGG, BioGRID and NCBI databases. In this study, we investigate how fully automated text mining of complex biomolecular events can be augmented with a normalization strategy that identifies biological concepts in text, mapping them to identifiers at varying levels of granularity, ranging from canonicalized symbols to unique genes and proteins and broad gene families. To this end, we have combined two state-of-the-art text mining components, previously evaluated on two community-wide challenges, and have extended and improved upon these methods by exploiting their complementary nature. Using these systems, we perform normalization and event extraction to create a large-scale resource that is publicly available, unique in semantic scope, and covers all 21.9 million PubMed abstracts and 460 thousand PubMed Central open access full-text articles. This dataset contains 40 million biomolecular events involving 76 million gene/protein mentions, linked to 122 thousand distinct genes from 5032 species across the full taxonomic tree. Detailed evaluations and analyses reveal promising results for application of this data in database and pathway curation efforts. The main software components used in this study are released under an open-source license. Further, the resulting dataset is freely accessible through a novel API, providing programmatic and customized access (http://www.evexdb.org/api/v001/).
Finally, to allow for large-scale bioinformatic analyses, the entire resource is available for bulk download from http://evexdb.org/download/, under the Creative Commons – Attribution – Share Alike (CC BY-SA) license.
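
The multi-level normalization idea in the abstract (canonicalized symbol, then unique gene, then gene family) can be illustrated with a minimal sketch. This is not the authors' implementation: the lookup tables, the identifiers such as GENE:1 and FAM:foo, the species labels, and the function names are all hypothetical placeholders chosen for illustration, and the canonicalization rule (lowercasing and dropping non-alphanumeric characters) is an assumed simplification.

```python
import re
from typing import NamedTuple, Optional


class Normalized(NamedTuple):
    canonical: str            # coarsest level: canonicalized symbol
    gene_id: Optional[str]    # finer level: unique gene (needs a species)
    family: Optional[str]     # broad level: gene family of that gene


# Hypothetical toy tables; the identifiers are placeholders, not database records.
GENE_IDS = {
    ("foo1", "species_a"): "GENE:1",
    ("foo1", "species_b"): "GENE:2",
}
FAMILIES = {"GENE:1": "FAM:foo", "GENE:2": "FAM:foo"}


def canonicalize(mention: str) -> str:
    """Lowercase the mention and drop every character outside a-z and 0-9."""
    return re.sub(r"[^a-z0-9]", "", mention.lower())


def normalize(mention: str, species: Optional[str] = None) -> Normalized:
    """Map a text mention to identifiers at three levels of granularity."""
    canonical = canonicalize(mention)
    # A unique gene can only be resolved when the species is known.
    gene_id = GENE_IDS.get((canonical, species)) if species else None
    family = FAMILIES.get(gene_id) if gene_id else None
    return Normalized(canonical, gene_id, family)
```

In this sketch, spelling variants such as "Foo-1" and "FOO 1" share one canonical symbol even when no species is known, whereas a gene identifier is assigned only once the species disambiguates the mention.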


DOI: 10.1371/journal.pone.0055814
PubMed: 23613707
PubMed Central: 3629104


The document in XML format

<record>
<TEI>
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en">Large-Scale Event Extraction from Literature with Multi-Level Gene Normalization</title>
<author>
<name sortKey="Van Landeghem, Sofie" sort="Van Landeghem, Sofie" uniqKey="Van Landeghem S" first="Sofie" last="Van Landeghem">Sofie Van Landeghem</name>
<affiliation wicri:level="1">
<nlm:aff id="aff1">
<addr-line>Department of Plant Systems Biology, VIB, Gent, Belgium</addr-line>
</nlm:aff>
<country xml:lang="fr">Belgique</country>
<wicri:regionArea>Department of Plant Systems Biology, VIB, Gent</wicri:regionArea>
<wicri:noRegion>Gent</wicri:noRegion>
</affiliation>
<affiliation wicri:level="1">
<nlm:aff id="aff2">
<addr-line>Department of Plant Biotechnology and Bioinformatics, Ghent University, Gent, Belgium</addr-line>
</nlm:aff>
<country xml:lang="fr">Belgique</country>
<wicri:regionArea>Department of Plant Biotechnology and Bioinformatics, Ghent University, Gent</wicri:regionArea>
<wicri:noRegion>Gent</wicri:noRegion>
</affiliation>
</author>
<author>
<name sortKey="Bjorne, Jari" sort="Bjorne, Jari" uniqKey="Bjorne J" first="Jari" last="Björne">Jari Björne</name>
<affiliation wicri:level="3">
<nlm:aff id="aff3">
<addr-line>Turku Centre for Computer Science, Turku, Finland</addr-line>
</nlm:aff>
<country xml:lang="fr">Finlande</country>
<wicri:regionArea>Turku Centre for Computer Science, Turku</wicri:regionArea>
<placeName>
<settlement type="city">Turku</settlement>
<region type="région" nuts="2">Finlande occidentale</region>
</placeName>
</affiliation>
<affiliation wicri:level="4">
<nlm:aff id="aff4">
<addr-line>Department of Information Technology, University of Turku, Finland</addr-line>
</nlm:aff>
<country xml:lang="fr">Finlande</country>
<wicri:regionArea>Department of Information Technology, University of Turku</wicri:regionArea>
<placeName>
<settlement type="city">Turku</settlement>
<region type="région" nuts="2">Finlande occidentale</region>
</placeName>
<orgName type="university">Université de Turku</orgName>
</affiliation>
</author>
<author>
<name sortKey="Wei, Chih Hsuan" sort="Wei, Chih Hsuan" uniqKey="Wei C" first="Chih-Hsuan" last="Wei">Chih-Hsuan Wei</name>
<affiliation wicri:level="2">
<nlm:aff id="aff5">
<addr-line>National Center for Biotechnology Information, Bethesda, Maryland, United States of America</addr-line>
</nlm:aff>
<country xml:lang="fr">États-Unis</country>
<wicri:regionArea>National Center for Biotechnology Information, Bethesda, Maryland</wicri:regionArea>
<placeName>
<region type="state">Maryland</region>
</placeName>
</affiliation>
<affiliation wicri:level="1">
<nlm:aff id="aff6">
<addr-line>Department of Computer Science and Information Engineering, National Cheng Kung University, Tainan, Taiwan</addr-line>
</nlm:aff>
<country xml:lang="fr">Taïwan</country>
<wicri:regionArea>Department of Computer Science and Information Engineering, National Cheng Kung University, Tainan</wicri:regionArea>
<wicri:noRegion>Tainan</wicri:noRegion>
</affiliation>
</author>
<author>
<name sortKey="Hakala, Kai" sort="Hakala, Kai" uniqKey="Hakala K" first="Kai" last="Hakala">Kai Hakala</name>
<affiliation wicri:level="4">
<nlm:aff id="aff4">
<addr-line>Department of Information Technology, University of Turku, Finland</addr-line>
</nlm:aff>
<country xml:lang="fr">Finlande</country>
<wicri:regionArea>Department of Information Technology, University of Turku</wicri:regionArea>
<placeName>
<settlement type="city">Turku</settlement>
<region type="région" nuts="2">Finlande occidentale</region>
</placeName>
<orgName type="university">Université de Turku</orgName>
</affiliation>
</author>
<author>
<name sortKey="Pyysalo, Sampo" sort="Pyysalo, Sampo" uniqKey="Pyysalo S" first="Sampo" last="Pyysalo">Sampo Pyysalo</name>
<affiliation wicri:level="4">
<nlm:aff id="aff7">
<addr-line>National Centre for Text Mining, School of Computer Science, University of Manchester, Manchester, United Kingdom</addr-line>
</nlm:aff>
<country xml:lang="fr">Royaume-Uni</country>
<wicri:regionArea>National Centre for Text Mining, School of Computer Science, University of Manchester, Manchester</wicri:regionArea>
<placeName>
<settlement type="city">Manchester</settlement>
<region type="country">Angleterre</region>
<region type="région" nuts="1">Grand Manchester</region>
</placeName>
<orgName type="university">Université de Manchester</orgName>
</affiliation>
</author>
<author>
<name sortKey="Ananiadou, Sophia" sort="Ananiadou, Sophia" uniqKey="Ananiadou S" first="Sophia" last="Ananiadou">Sophia Ananiadou</name>
<affiliation wicri:level="4">
<nlm:aff id="aff7">
<addr-line>National Centre for Text Mining, School of Computer Science, University of Manchester, Manchester, United Kingdom</addr-line>
</nlm:aff>
<country xml:lang="fr">Royaume-Uni</country>
<wicri:regionArea>National Centre for Text Mining, School of Computer Science, University of Manchester, Manchester</wicri:regionArea>
<placeName>
<settlement type="city">Manchester</settlement>
<region type="country">Angleterre</region>
<region type="région" nuts="1">Grand Manchester</region>
</placeName>
<orgName type="university">Université de Manchester</orgName>
</affiliation>
</author>
<author>
<name sortKey="Kao, Hung Yu" sort="Kao, Hung Yu" uniqKey="Kao H" first="Hung-Yu" last="Kao">Hung-Yu Kao</name>
<affiliation wicri:level="1">
<nlm:aff id="aff6">
<addr-line>Department of Computer Science and Information Engineering, National Cheng Kung University, Tainan, Taiwan</addr-line>
</nlm:aff>
<country xml:lang="fr">Taïwan</country>
<wicri:regionArea>Department of Computer Science and Information Engineering, National Cheng Kung University, Tainan</wicri:regionArea>
<wicri:noRegion>Tainan</wicri:noRegion>
</affiliation>
</author>
<author>
<name sortKey="Lu, Zhiyong" sort="Lu, Zhiyong" uniqKey="Lu Z" first="Zhiyong" last="Lu">Zhiyong Lu</name>
<affiliation wicri:level="2">
<nlm:aff id="aff5">
<addr-line>National Center for Biotechnology Information, Bethesda, Maryland, United States of America</addr-line>
</nlm:aff>
<country xml:lang="fr">États-Unis</country>
<wicri:regionArea>National Center for Biotechnology Information, Bethesda, Maryland</wicri:regionArea>
<placeName>
<region type="state">Maryland</region>
</placeName>
</affiliation>
</author>
<author>
<name sortKey="Salakoski, Tapio" sort="Salakoski, Tapio" uniqKey="Salakoski T" first="Tapio" last="Salakoski">Tapio Salakoski</name>
<affiliation wicri:level="3">
<nlm:aff id="aff3">
<addr-line>Turku Centre for Computer Science, Turku, Finland</addr-line>
</nlm:aff>
<country xml:lang="fr">Finlande</country>
<wicri:regionArea>Turku Centre for Computer Science, Turku</wicri:regionArea>
<placeName>
<settlement type="city">Turku</settlement>
<region type="région" nuts="2">Finlande occidentale</region>
</placeName>
</affiliation>
<affiliation wicri:level="4">
<nlm:aff id="aff4">
<addr-line>Department of Information Technology, University of Turku, Finland</addr-line>
</nlm:aff>
<country xml:lang="fr">Finlande</country>
<wicri:regionArea>Department of Information Technology, University of Turku</wicri:regionArea>
<placeName>
<settlement type="city">Turku</settlement>
<region type="région" nuts="2">Finlande occidentale</region>
</placeName>
<orgName type="university">Université de Turku</orgName>
</affiliation>
</author>
<author>
<name sortKey="Van De Peer, Yves" sort="Van De Peer, Yves" uniqKey="Van De Peer Y" first="Yves" last="Van De Peer">Yves Van De Peer</name>
<affiliation wicri:level="1">
<nlm:aff id="aff1">
<addr-line>Department of Plant Systems Biology, VIB, Gent, Belgium</addr-line>
</nlm:aff>
<country xml:lang="fr">Belgique</country>
<wicri:regionArea>Department of Plant Systems Biology, VIB, Gent</wicri:regionArea>
<wicri:noRegion>Gent</wicri:noRegion>
</affiliation>
<affiliation wicri:level="1">
<nlm:aff id="aff2">
<addr-line>Department of Plant Biotechnology and Bioinformatics, Ghent University, Gent, Belgium</addr-line>
</nlm:aff>
<country xml:lang="fr">Belgique</country>
<wicri:regionArea>Department of Plant Biotechnology and Bioinformatics, Ghent University, Gent</wicri:regionArea>
<wicri:noRegion>Gent</wicri:noRegion>
</affiliation>
</author>
<author>
<name sortKey="Ginter, Filip" sort="Ginter, Filip" uniqKey="Ginter F" first="Filip" last="Ginter">Filip Ginter</name>
<affiliation wicri:level="4">
<nlm:aff id="aff4">
<addr-line>Department of Information Technology, University of Turku, Finland</addr-line>
</nlm:aff>
<country xml:lang="fr">Finlande</country>
<wicri:regionArea>Department of Information Technology, University of Turku</wicri:regionArea>
<placeName>
<settlement type="city">Turku</settlement>
<region type="région" nuts="2">Finlande occidentale</region>
</placeName>
<orgName type="university">Université de Turku</orgName>
</affiliation>
</author>
</titleStmt>
<publicationStmt>
<idno type="wicri:source">PMC</idno>
<idno type="pmid">23613707</idno>
<idno type="pmc">3629104</idno>
<idno type="url">http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3629104</idno>
<idno type="RBID">PMC:3629104</idno>
<idno type="doi">10.1371/journal.pone.0055814</idno>
<date when="2013">2013</date>
<idno type="wicri:Area/Pmc/Corpus">000416</idno>
<idno type="wicri:Area/Pmc/Curation">000416</idno>
<idno type="wicri:Area/Pmc/Checkpoint">000265</idno>
</publicationStmt>
<sourceDesc>
<biblStruct>
<analytic>
<title xml:lang="en" level="a" type="main">Large-Scale Event Extraction from Literature with Multi-Level Gene Normalization</title>
<author>
<name sortKey="Van Landeghem, Sofie" sort="Van Landeghem, Sofie" uniqKey="Van Landeghem S" first="Sofie" last="Van Landeghem">Sofie Van Landeghem</name>
<affiliation wicri:level="1">
<nlm:aff id="aff1">
<addr-line>Department of Plant Systems Biology, VIB, Gent, Belgium</addr-line>
</nlm:aff>
<country xml:lang="fr">Belgique</country>
<wicri:regionArea>Department of Plant Systems Biology, VIB, Gent</wicri:regionArea>
<wicri:noRegion>Gent</wicri:noRegion>
</affiliation>
<affiliation wicri:level="1">
<nlm:aff id="aff2">
<addr-line>Department of Plant Biotechnology and Bioinformatics, Ghent University, Gent, Belgium</addr-line>
</nlm:aff>
<country xml:lang="fr">Belgique</country>
<wicri:regionArea>Department of Plant Biotechnology and Bioinformatics, Ghent University, Gent</wicri:regionArea>
<wicri:noRegion>Gent</wicri:noRegion>
</affiliation>
</author>
<author>
<name sortKey="Bjorne, Jari" sort="Bjorne, Jari" uniqKey="Bjorne J" first="Jari" last="Björne">Jari Björne</name>
<affiliation wicri:level="3">
<nlm:aff id="aff3">
<addr-line>Turku Centre for Computer Science, Turku, Finland</addr-line>
</nlm:aff>
<country xml:lang="fr">Finlande</country>
<wicri:regionArea>Turku Centre for Computer Science, Turku</wicri:regionArea>
<placeName>
<settlement type="city">Turku</settlement>
<region type="région" nuts="2">Finlande occidentale</region>
</placeName>
</affiliation>
<affiliation wicri:level="4">
<nlm:aff id="aff4">
<addr-line>Department of Information Technology, University of Turku, Finland</addr-line>
</nlm:aff>
<country xml:lang="fr">Finlande</country>
<wicri:regionArea>Department of Information Technology, University of Turku</wicri:regionArea>
<placeName>
<settlement type="city">Turku</settlement>
<region type="région" nuts="2">Finlande occidentale</region>
</placeName>
<orgName type="university">Université de Turku</orgName>
</affiliation>
</author>
<author>
<name sortKey="Wei, Chih Hsuan" sort="Wei, Chih Hsuan" uniqKey="Wei C" first="Chih-Hsuan" last="Wei">Chih-Hsuan Wei</name>
<affiliation wicri:level="2">
<nlm:aff id="aff5">
<addr-line>National Center for Biotechnology Information, Bethesda, Maryland, United States of America</addr-line>
</nlm:aff>
<country xml:lang="fr">États-Unis</country>
<wicri:regionArea>National Center for Biotechnology Information, Bethesda, Maryland</wicri:regionArea>
<placeName>
<region type="state">Maryland</region>
</placeName>
</affiliation>
<affiliation wicri:level="1">
<nlm:aff id="aff6">
<addr-line>Department of Computer Science and Information Engineering, National Cheng Kung University, Tainan, Taiwan</addr-line>
</nlm:aff>
<country xml:lang="fr">Taïwan</country>
<wicri:regionArea>Department of Computer Science and Information Engineering, National Cheng Kung University, Tainan</wicri:regionArea>
<wicri:noRegion>Tainan</wicri:noRegion>
</affiliation>
</author>
<author>
<name sortKey="Hakala, Kai" sort="Hakala, Kai" uniqKey="Hakala K" first="Kai" last="Hakala">Kai Hakala</name>
<affiliation wicri:level="4">
<nlm:aff id="aff4">
<addr-line>Department of Information Technology, University of Turku, Finland</addr-line>
</nlm:aff>
<country xml:lang="fr">Finlande</country>
<wicri:regionArea>Department of Information Technology, University of Turku</wicri:regionArea>
<placeName>
<settlement type="city">Turku</settlement>
<region type="région" nuts="2">Finlande occidentale</region>
</placeName>
<orgName type="university">Université de Turku</orgName>
</affiliation>
</author>
<author>
<name sortKey="Pyysalo, Sampo" sort="Pyysalo, Sampo" uniqKey="Pyysalo S" first="Sampo" last="Pyysalo">Sampo Pyysalo</name>
<affiliation wicri:level="4">
<nlm:aff id="aff7">
<addr-line>National Centre for Text Mining, School of Computer Science, University of Manchester, Manchester, United Kingdom</addr-line>
</nlm:aff>
<country xml:lang="fr">Royaume-Uni</country>
<wicri:regionArea>National Centre for Text Mining, School of Computer Science, University of Manchester, Manchester</wicri:regionArea>
<placeName>
<settlement type="city">Manchester</settlement>
<region type="country">Angleterre</region>
<region type="région" nuts="1">Grand Manchester</region>
</placeName>
<orgName type="university">Université de Manchester</orgName>
</affiliation>
</author>
<author>
<name sortKey="Ananiadou, Sophia" sort="Ananiadou, Sophia" uniqKey="Ananiadou S" first="Sophia" last="Ananiadou">Sophia Ananiadou</name>
<affiliation wicri:level="4">
<nlm:aff id="aff7">
<addr-line>National Centre for Text Mining, School of Computer Science, University of Manchester, Manchester, United Kingdom</addr-line>
</nlm:aff>
<country xml:lang="fr">Royaume-Uni</country>
<wicri:regionArea>National Centre for Text Mining, School of Computer Science, University of Manchester, Manchester</wicri:regionArea>
<placeName>
<settlement type="city">Manchester</settlement>
<region type="country">Angleterre</region>
<region type="région" nuts="1">Grand Manchester</region>
</placeName>
<orgName type="university">Université de Manchester</orgName>
</affiliation>
</author>
<author>
<name sortKey="Kao, Hung Yu" sort="Kao, Hung Yu" uniqKey="Kao H" first="Hung-Yu" last="Kao">Hung-Yu Kao</name>
<affiliation wicri:level="1">
<nlm:aff id="aff6">
<addr-line>Department of Computer Science and Information Engineering, National Cheng Kung University, Tainan, Taiwan</addr-line>
</nlm:aff>
<country xml:lang="fr">Taïwan</country>
<wicri:regionArea>Department of Computer Science and Information Engineering, National Cheng Kung University, Tainan</wicri:regionArea>
<wicri:noRegion>Tainan</wicri:noRegion>
</affiliation>
</author>
<author>
<name sortKey="Lu, Zhiyong" sort="Lu, Zhiyong" uniqKey="Lu Z" first="Zhiyong" last="Lu">Zhiyong Lu</name>
<affiliation wicri:level="2">
<nlm:aff id="aff5">
<addr-line>National Center for Biotechnology Information, Bethesda, Maryland, United States of America</addr-line>
</nlm:aff>
<country xml:lang="fr">États-Unis</country>
<wicri:regionArea>National Center for Biotechnology Information, Bethesda, Maryland</wicri:regionArea>
<placeName>
<region type="state">Maryland</region>
</placeName>
</affiliation>
</author>
<author>
<name sortKey="Salakoski, Tapio" sort="Salakoski, Tapio" uniqKey="Salakoski T" first="Tapio" last="Salakoski">Tapio Salakoski</name>
<affiliation wicri:level="3">
<nlm:aff id="aff3">
<addr-line>Turku Centre for Computer Science, Turku, Finland</addr-line>
</nlm:aff>
<country xml:lang="fr">Finlande</country>
<wicri:regionArea>Turku Centre for Computer Science, Turku</wicri:regionArea>
<placeName>
<settlement type="city">Turku</settlement>
<region type="région" nuts="2">Finlande occidentale</region>
</placeName>
</affiliation>
<affiliation wicri:level="4">
<nlm:aff id="aff4">
<addr-line>Department of Information Technology, University of Turku, Finland</addr-line>
</nlm:aff>
<country xml:lang="fr">Finlande</country>
<wicri:regionArea>Department of Information Technology, University of Turku</wicri:regionArea>
<placeName>
<settlement type="city">Turku</settlement>
<region type="région" nuts="2">Finlande occidentale</region>
</placeName>
<orgName type="university">Université de Turku</orgName>
</affiliation>
</author>
<author>
<name sortKey="Van De Peer, Yves" sort="Van De Peer, Yves" uniqKey="Van De Peer Y" first="Yves" last="Van De Peer">Yves Van De Peer</name>
<affiliation wicri:level="1">
<nlm:aff id="aff1">
<addr-line>Department of Plant Systems Biology, VIB, Gent, Belgium</addr-line>
</nlm:aff>
<country xml:lang="fr">Belgique</country>
<wicri:regionArea>Department of Plant Systems Biology, VIB, Gent</wicri:regionArea>
<wicri:noRegion>Gent</wicri:noRegion>
</affiliation>
<affiliation wicri:level="1">
<nlm:aff id="aff2">
<addr-line>Department of Plant Biotechnology and Bioinformatics, Ghent University, Gent, Belgium</addr-line>
</nlm:aff>
<country xml:lang="fr">Belgique</country>
<wicri:regionArea>Department of Plant Biotechnology and Bioinformatics, Ghent University, Gent</wicri:regionArea>
<wicri:noRegion>Gent</wicri:noRegion>
</affiliation>
</author>
<author>
<name sortKey="Ginter, Filip" sort="Ginter, Filip" uniqKey="Ginter F" first="Filip" last="Ginter">Filip Ginter</name>
<affiliation wicri:level="4">
<nlm:aff id="aff4">
<addr-line>Department of Information Technology, University of Turku, Finland</addr-line>
</nlm:aff>
<country xml:lang="fr">Finlande</country>
<wicri:regionArea>Department of Information Technology, University of Turku</wicri:regionArea>
<placeName>
<settlement type="city">Turku</settlement>
<region type="région" nuts="2">Finlande occidentale</region>
</placeName>
<orgName type="university">Université de Turku</orgName>
</affiliation>
</author>
</analytic>
<series>
<title level="j">PLoS ONE</title>
<idno type="eISSN">1932-6203</idno>
<imprint>
<date when="2013">2013</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
</fileDesc>
<profileDesc>
<textClass></textClass>
</profileDesc>
</teiHeader>
<front>
<div type="abstract" xml:lang="en">
<p>Text mining for the life sciences aims to aid database curation, knowledge summarization and information retrieval through the automated processing of biomedical texts. To provide comprehensive coverage and enable full integration with existing biomolecular database records, it is crucial that text mining tools scale up to millions of articles and that their analyses can be unambiguously linked to information recorded in resources such as UniProt, KEGG, BioGRID and NCBI databases. In this study, we investigate how fully automated text mining of complex biomolecular events can be augmented with a normalization strategy that identifies biological concepts in text, mapping them to identifiers at varying levels of granularity, ranging from canonicalized symbols to unique genes and proteins and broad gene families. To this end, we have combined two state-of-the-art text mining components, previously evaluated on two community-wide challenges, and have extended and improved upon these methods by exploiting their complementary nature. Using these systems, we perform normalization and event extraction to create a large-scale resource that is publicly available, unique in semantic scope, and covers all 21.9 million PubMed abstracts and 460 thousand PubMed Central open access full-text articles. This dataset contains 40 million biomolecular events involving 76 million gene/protein mentions, linked to 122 thousand distinct genes from 5032 species across the full taxonomic tree. Detailed evaluations and analyses reveal promising results for application of this data in database and pathway curation efforts. The main software components used in this study are released under an open-source license. Further, the resulting dataset is freely accessible through a novel API, providing programmatic and customized access (
<ext-link ext-link-type="uri" xlink:href="http://www.evexdb.org/api/v001/">http://www.evexdb.org/api/v001/</ext-link>
). Finally, to allow for large-scale bioinformatic analyses, the entire resource is available for bulk download from
<ext-link ext-link-type="uri" xlink:href="http://evexdb.org/download/">http://evexdb.org/download/</ext-link>
, under the Creative Commons – Attribution – Share Alike (CC BY-SA) license.</p>
</div>
</front>
<back>
<div1 type="bibliography">
<listBibl>
<biblStruct>
<analytic>
<author>
<name sortKey="Szklarczyk, D" uniqKey="Szklarczyk D">D Szklarczyk</name>
</author>
<author>
<name sortKey="Franceschini, A" uniqKey="Franceschini A">A Franceschini</name>
</author>
<author>
<name sortKey="Kuhn, M" uniqKey="Kuhn M">M Kuhn</name>
</author>
<author>
<name sortKey="Simonovic, M" uniqKey="Simonovic M">M Simonovic</name>
</author>
<author>
<name sortKey="Roth, A" uniqKey="Roth A">A Roth</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Stark, C" uniqKey="Stark C">C Stark</name>
</author>
<author>
<name sortKey="Breitkreutz, Bj" uniqKey="Breitkreutz B">BJ Breitkreutz</name>
</author>
<author>
<name sortKey="Chatr Aryamontri, A" uniqKey="Chatr Aryamontri A">A Chatr-aryamontri</name>
</author>
<author>
<name sortKey="Boucher, L" uniqKey="Boucher L">L Boucher</name>
</author>
<author>
<name sortKey="Oughtred, R" uniqKey="Oughtred R">R Oughtred</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Ongenaert, M" uniqKey="Ongenaert M">M Ongenaert</name>
</author>
<author>
<name sortKey="Van Neste, L" uniqKey="Van Neste L">L Van Neste</name>
</author>
<author>
<name sortKey="De Meyer, T" uniqKey="De Meyer T">T De Meyer</name>
</author>
<author>
<name sortKey="Menschaert, G" uniqKey="Menschaert G">G Menschaert</name>
</author>
<author>
<name sortKey="Bekaert, S" uniqKey="Bekaert S">S Bekaert</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Haibe Kains, B" uniqKey="Haibe Kains B">B Haibe-Kains</name>
</author>
<author>
<name sortKey="Olsen, C" uniqKey="Olsen C">C Olsen</name>
</author>
<author>
<name sortKey="Djebbari, A" uniqKey="Djebbari A">A Djebbari</name>
</author>
<author>
<name sortKey="Bontempi, G" uniqKey="Bontempi G">G Bontempi</name>
</author>
<author>
<name sortKey="Correll, M" uniqKey="Correll M">M Correll</name>
</author>
</analytic>
</biblStruct>
<biblStruct></biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Rebholz Schuhmann, D" uniqKey="Rebholz Schuhmann D">D Rebholz-Schuhmann</name>
</author>
<author>
<name sortKey="Kirsch, H" uniqKey="Kirsch H">H Kirsch</name>
</author>
<author>
<name sortKey="Arregui, M" uniqKey="Arregui M">M Arregui</name>
</author>
<author>
<name sortKey="Gaudan, S" uniqKey="Gaudan S">S Gaudan</name>
</author>
<author>
<name sortKey="Riethoven, M" uniqKey="Riethoven M">M Riethoven</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Hearst, Ma" uniqKey="Hearst M">MA Hearst</name>
</author>
<author>
<name sortKey="Divoli, A" uniqKey="Divoli A">A Divoli</name>
</author>
<author>
<name sortKey="Guturu, H" uniqKey="Guturu H">H Guturu</name>
</author>
<author>
<name sortKey="Ksikes, A" uniqKey="Ksikes A">A Ksikes</name>
</author>
<author>
<name sortKey="Nakov, P" uniqKey="Nakov P">P Nakov</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Ohta, T" uniqKey="Ohta T">T Ohta</name>
</author>
<author>
<name sortKey="Matsuzaki, T" uniqKey="Matsuzaki T">T Matsuzaki</name>
</author>
<author>
<name sortKey="Okazaki, N" uniqKey="Okazaki N">N Okazaki</name>
</author>
<author>
<name sortKey="Miwa, M" uniqKey="Miwa M">M Miwa</name>
</author>
<author>
<name sortKey="Saetre, R" uniqKey="Saetre R">R Saetre</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Agarwal, S" uniqKey="Agarwal S">S Agarwal</name>
</author>
<author>
<name sortKey="Yu, H" uniqKey="Yu H">H Yu</name>
</author>
<author>
<name sortKey="Kohane, I" uniqKey="Kohane I">I Kohane</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Van Landeghem, S" uniqKey="Van Landeghem S">S Van Landeghem</name>
</author>
<author>
<name sortKey="Ginter, F" uniqKey="Ginter F">F Ginter</name>
</author>
<author>
<name sortKey="Van De Peer, Y" uniqKey="Van De Peer Y">Y Van de Peer</name>
</author>
<author>
<name sortKey="Salakoski, T" uniqKey="Salakoski T">T Salakoski</name>
</author>
</analytic>
</biblStruct>
<biblStruct></biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Hucka, M" uniqKey="Hucka M">M Hucka</name>
</author>
<author>
<name sortKey="Finney, A" uniqKey="Finney A">A Finney</name>
</author>
<author>
<name sortKey="Sauro, H" uniqKey="Sauro H">H Sauro</name>
</author>
<author>
<name sortKey="Bolouri, H" uniqKey="Bolouri H">H Bolouri</name>
</author>
<author>
<name sortKey="Doyle, J" uniqKey="Doyle J">J Doyle</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Demir, E" uniqKey="Demir E">E Demir</name>
</author>
<author>
<name sortKey="Cary, M" uniqKey="Cary M">M Cary</name>
</author>
<author>
<name sortKey="Paley, S" uniqKey="Paley S">S Paley</name>
</author>
<author>
<name sortKey="Fukuda, K" uniqKey="Fukuda K">K Fukuda</name>
</author>
<author>
<name sortKey="Lemer, C" uniqKey="Lemer C">C Lemer</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Ananiadou, S" uniqKey="Ananiadou S">S Ananiadou</name>
</author>
<author>
<name sortKey="Pyysalo, S" uniqKey="Pyysalo S">S Pyysalo</name>
</author>
<author>
<name sortKey="Tsujii, J" uniqKey="Tsujii J">J Tsujii</name>
</author>
<author>
<name sortKey="Kell, Db" uniqKey="Kell D">DB Kell</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Kim, J" uniqKey="Kim J">J Kim</name>
</author>
<author>
<name sortKey="Ohta, T" uniqKey="Ohta T">T Ohta</name>
</author>
<author>
<name sortKey="Pyysalo, S" uniqKey="Pyysalo S">S Pyysalo</name>
</author>
<author>
<name sortKey="Kano, Y" uniqKey="Kano Y">Y Kano</name>
</author>
<author>
<name sortKey="Tsujii, J" uniqKey="Tsujii J">J Tsujii</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Kim, Jd" uniqKey="Kim J">JD Kim</name>
</author>
<author>
<name sortKey="Nguyen, N" uniqKey="Nguyen N">N Nguyen</name>
</author>
<author>
<name sortKey="Wang, Y" uniqKey="Wang Y">Y Wang</name>
</author>
<author>
<name sortKey="Tsujii, J" uniqKey="Tsujii J">J Tsujii</name>
</author>
<author>
<name sortKey="Takagi, T" uniqKey="Takagi T">T Takagi</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Pyysalo, S" uniqKey="Pyysalo S">S Pyysalo</name>
</author>
<author>
<name sortKey="Ohta, T" uniqKey="Ohta T">T Ohta</name>
</author>
<author>
<name sortKey="Rak, R" uniqKey="Rak R">R Rak</name>
</author>
<author>
<name sortKey="Sullivan, D" uniqKey="Sullivan D">D Sullivan</name>
</author>
<author>
<name sortKey="Mao, C" uniqKey="Mao C">C Mao</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Chen, L" uniqKey="Chen L">L Chen</name>
</author>
<author>
<name sortKey="Liu, H" uniqKey="Liu H">H Liu</name>
</author>
<author>
<name sortKey="Friedman, C" uniqKey="Friedman C">C Friedman</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Sayers, Ew" uniqKey="Sayers E">EW Sayers</name>
</author>
<author>
<name sortKey="Barrett, T" uniqKey="Barrett T">T Barrett</name>
</author>
<author>
<name sortKey="Benson, Da" uniqKey="Benson D">DA Benson</name>
</author>
<author>
<name sortKey="Bolton, E" uniqKey="Bolton E">E Bolton</name>
</author>
<author>
<name sortKey="Bryant, Sh" uniqKey="Bryant S">SH Bryant</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="The Uniprot, Consortium" uniqKey="The Uniprot C">The UniProt Consortium</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Kanehisa, M" uniqKey="Kanehisa M">M Kanehisa</name>
</author>
<author>
<name sortKey="Goto, S" uniqKey="Goto S">S Goto</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Rose, Pw" uniqKey="Rose P">PW Rose</name>
</author>
<author>
<name sortKey="Beran, B" uniqKey="Beran B">B Beran</name>
</author>
<author>
<name sortKey="Bi, C" uniqKey="Bi C">C Bi</name>
</author>
<author>
<name sortKey="Bluhm, Wf" uniqKey="Bluhm W">WF Bluhm</name>
</author>
<author>
<name sortKey="Dimitropoulos, D" uniqKey="Dimitropoulos D">D Dimitropoulos</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Hirschman, L" uniqKey="Hirschman L">L Hirschman</name>
</author>
<author>
<name sortKey="Yeh, A" uniqKey="Yeh A">A Yeh</name>
</author>
<author>
<name sortKey="Blaschke, C" uniqKey="Blaschke C">C Blaschke</name>
</author>
<author>
<name sortKey="Valencia, A" uniqKey="Valencia A">A Valencia</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Krallinger, M" uniqKey="Krallinger M">M Krallinger</name>
</author>
<author>
<name sortKey="Morgan, A" uniqKey="Morgan A">A Morgan</name>
</author>
<author>
<name sortKey="Smith, L" uniqKey="Smith L">L Smith</name>
</author>
<author>
<name sortKey="Leitner, F" uniqKey="Leitner F">F Leitner</name>
</author>
<author>
<name sortKey="Tanabe, L" uniqKey="Tanabe L">L Tanabe</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Leitner, F" uniqKey="Leitner F">F Leitner</name>
</author>
<author>
<name sortKey="Mardis, S" uniqKey="Mardis S">S Mardis</name>
</author>
<author>
<name sortKey="Krallinger, M" uniqKey="Krallinger M">M Krallinger</name>
</author>
<author>
<name sortKey="Cesareni, G" uniqKey="Cesareni G">G Cesareni</name>
</author>
<author>
<name sortKey="Hirschman, L" uniqKey="Hirschman L">L Hirschman</name>
</author>
</analytic>
</biblStruct>
<biblStruct></biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Stenetorp, P" uniqKey="Stenetorp P">P Stenetorp</name>
</author>
<author>
<name sortKey="Topi, G" uniqKey="Topi G">G Topić</name>
</author>
<author>
<name sortKey="Pyysalo, S" uniqKey="Pyysalo S">S Pyysalo</name>
</author>
<author>
<name sortKey="Ohta, T" uniqKey="Ohta T">T Ohta</name>
</author>
<author>
<name sortKey="Kim, Jd" uniqKey="Kim J">JD Kim</name>
</author>
</analytic>
</biblStruct>
<biblStruct></biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Bjorne, J" uniqKey="Bjorne J">J Björne</name>
</author>
<author>
<name sortKey="Ginter, F" uniqKey="Ginter F">F Ginter</name>
</author>
<author>
<name sortKey="Salakoski, T" uniqKey="Salakoski T">T Salakoski</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Kazama, J" uniqKey="Kazama J">J Kazama</name>
</author>
<author>
<name sortKey="Tsujii, J" uniqKey="Tsujii J">J Tsujii</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Charniak, E" uniqKey="Charniak E">E Charniak</name>
</author>
<author>
<name sortKey="Johnson, M" uniqKey="Johnson M">M Johnson</name>
</author>
</analytic>
</biblStruct>
<biblStruct></biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="De Marneffe, Mc" uniqKey="De Marneffe M">MC de Marneffe</name>
</author>
<author>
<name sortKey="Maccartney, B" uniqKey="Maccartney B">B MacCartney</name>
</author>
<author>
<name sortKey="Manning, C" uniqKey="Manning C">C Manning</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Bjorne, J" uniqKey="Bjorne J">J Björne</name>
</author>
<author>
<name sortKey="Ginter, F" uniqKey="Ginter F">F Ginter</name>
</author>
<author>
<name sortKey="Pyysalo, S" uniqKey="Pyysalo S">S Pyysalo</name>
</author>
<author>
<name sortKey="Tsujii, J" uniqKey="Tsujii J">J Tsujii</name>
</author>
<author>
<name sortKey="Salakoski, T" uniqKey="Salakoski T">T Salakoski</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Bjorne, J" uniqKey="Bjorne J">J Björne</name>
</author>
<author>
<name sortKey="Van Landeghem, S" uniqKey="Van Landeghem S">S Van Landeghem</name>
</author>
<author>
<name sortKey="Pyysalo, S" uniqKey="Pyysalo S">S Pyysalo</name>
</author>
<author>
<name sortKey="Ohta, T" uniqKey="Ohta T">T Ohta</name>
</author>
<author>
<name sortKey="Ginter, F" uniqKey="Ginter F">F Ginter</name>
</author>
</analytic>
</biblStruct>
<biblStruct></biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Flicek, P" uniqKey="Flicek P">P Flicek</name>
</author>
<author>
<name sortKey="Amode, Mr" uniqKey="Amode M">MR Amode</name>
</author>
<author>
<name sortKey="Barrell, D" uniqKey="Barrell D">D Barrell</name>
</author>
<author>
<name sortKey="Beal, K" uniqKey="Beal K">K Beal</name>
</author>
<author>
<name sortKey="Brent, S" uniqKey="Brent S">S Brent</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Kersey, Pj" uniqKey="Kersey P">PJ Kersey</name>
</author>
<author>
<name sortKey="Lawson, D" uniqKey="Lawson D">D Lawson</name>
</author>
<author>
<name sortKey="Birney, E" uniqKey="Birney E">E Birney</name>
</author>
<author>
<name sortKey="Derwent, Ps" uniqKey="Derwent P">PS Derwent</name>
</author>
<author>
<name sortKey="Haimel, M" uniqKey="Haimel M">M Haimel</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Hsu, Cn" uniqKey="Hsu C">CN Hsu</name>
</author>
<author>
<name sortKey="Chang, Ym" uniqKey="Chang Y">YM Chang</name>
</author>
<author>
<name sortKey="Kuo, Cj" uniqKey="Kuo C">CJ Kuo</name>
</author>
<author>
<name sortKey="Lin, Ys" uniqKey="Lin Y">YS Lin</name>
</author>
<author>
<name sortKey="Huang, Hs" uniqKey="Huang H">HS Huang</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Wei, Ch" uniqKey="Wei C">CH Wei</name>
</author>
<author>
<name sortKey="Kao, Hy" uniqKey="Kao H">HY Kao</name>
</author>
</analytic>
</biblStruct>
<biblStruct></biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Wei, Ch" uniqKey="Wei C">CH Wei</name>
</author>
<author>
<name sortKey="Kao, Hy" uniqKey="Kao H">HY Kao</name>
</author>
<author>
<name sortKey="Lu, Z" uniqKey="Lu Z">Z Lu</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Cohen, Kb" uniqKey="Cohen K">KB Cohen</name>
</author>
<author>
<name sortKey="Johnson, H" uniqKey="Johnson H">H Johnson</name>
</author>
<author>
<name sortKey="Verspoor, K" uniqKey="Verspoor K">K Verspoor</name>
</author>
<author>
<name sortKey="Roeder, C" uniqKey="Roeder C">C Roeder</name>
</author>
<author>
<name sortKey="Hunter, L" uniqKey="Hunter L">L Hunter</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Blake, C" uniqKey="Blake C">C Blake</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Kim, Jd" uniqKey="Kim J">JD Kim</name>
</author>
<author>
<name sortKey="Pyysalo, S" uniqKey="Pyysalo S">S Pyysalo</name>
</author>
<author>
<name sortKey="Ohta, T" uniqKey="Ohta T">T Ohta</name>
</author>
<author>
<name sortKey="Bossy, R" uniqKey="Bossy R">R Bossy</name>
</author>
<author>
<name sortKey="Nguyen, N" uniqKey="Nguyen N">N Nguyen</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Miwa, M" uniqKey="Miwa M">M Miwa</name>
</author>
<author>
<name sortKey="Thompson, P" uniqKey="Thompson P">P Thompson</name>
</author>
<author>
<name sortKey="Ananiadou, S" uniqKey="Ananiadou S">S Ananiadou</name>
</author>
</analytic>
</biblStruct>
<biblStruct></biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Bada, M" uniqKey="Bada M">M Bada</name>
</author>
<author>
<name sortKey="Eckert, M" uniqKey="Eckert M">M Eckert</name>
</author>
<author>
<name sortKey="Evans, D" uniqKey="Evans D">D Evans</name>
</author>
<author>
<name sortKey="Garcia, K" uniqKey="Garcia K">K Garcia</name>
</author>
<author>
<name sortKey="Shipley, K" uniqKey="Shipley K">K Shipley</name>
</author>
</analytic>
</biblStruct>
<biblStruct></biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Ohta, T" uniqKey="Ohta T">T Ohta</name>
</author>
<author>
<name sortKey="Pyysalo, S" uniqKey="Pyysalo S">S Pyysalo</name>
</author>
<author>
<name sortKey="Tsujii, J" uniqKey="Tsujii J">J Tsujii</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Letunic, I" uniqKey="Letunic I">I Letunic</name>
</author>
<author>
<name sortKey="Bork, P" uniqKey="Bork P">P Bork</name>
</author>
</analytic>
</biblStruct>
</listBibl>
</div1>
</back>
</TEI>
<pmc article-type="research-article">
<pmc-dir>properties open_access</pmc-dir>
<front>
<journal-meta>
<journal-id journal-id-type="nlm-ta">PLoS One</journal-id>
<journal-id journal-id-type="iso-abbrev">PLoS ONE</journal-id>
<journal-id journal-id-type="publisher-id">plos</journal-id>
<journal-id journal-id-type="pmc">plosone</journal-id>
<journal-title-group>
<journal-title>PLoS ONE</journal-title>
</journal-title-group>
<issn pub-type="epub">1932-6203</issn>
<publisher>
<publisher-name>Public Library of Science</publisher-name>
<publisher-loc>San Francisco, USA</publisher-loc>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="pmid">23613707</article-id>
<article-id pub-id-type="pmc">3629104</article-id>
<article-id pub-id-type="publisher-id">PONE-D-12-34514</article-id>
<article-id pub-id-type="doi">10.1371/journal.pone.0055814</article-id>
<article-categories>
<subj-group subj-group-type="heading">
<subject>Research Article</subject>
</subj-group>
<subj-group subj-group-type="Discipline-v2">
<subject>Biology</subject>
<subj-group>
<subject>Computational Biology</subject>
<subj-group>
<subject>Biological Data Management</subject>
<subject>Natural Language Processing</subject>
<subject>Text Mining</subject>
</subj-group>
</subj-group>
</subj-group>
<subj-group subj-group-type="Discipline-v2">
<subject>Computer Science</subject>
<subj-group>
<subject>Algorithms</subject>
</subj-group>
<subj-group>
<subject>Information Technology</subject>
<subj-group>
<subject>Databases</subject>
</subj-group>
</subj-group>
<subj-group>
<subject>Software Engineering</subject>
<subj-group>
<subject>Software Tools</subject>
</subj-group>
</subj-group>
</subj-group>
<subj-group subj-group-type="Discipline-v2">
<subject>Engineering</subject>
<subj-group>
<subject>Signal Processing</subject>
<subj-group>
<subject>Data Mining</subject>
</subj-group>
</subj-group>
<subj-group>
<subject>Software Engineering</subject>
<subj-group>
<subject>Software Tools</subject>
</subj-group>
</subj-group>
</subj-group>
<subj-group subj-group-type="Discipline-v2">
<subject>Mathematics</subject>
<subj-group>
<subject>Applied Mathematics</subject>
<subj-group>
<subject>Algorithms</subject>
</subj-group>
</subj-group>
</subj-group>
</article-categories>
<title-group>
<article-title>Large-Scale Event Extraction from Literature with Multi-Level Gene Normalization</article-title>
<alt-title alt-title-type="running-head">Event Extraction with Multi-Level Normalization</alt-title>
</title-group>
<contrib-group>
<contrib contrib-type="author" equal-contrib="yes">
<name>
<surname>Van Landeghem</surname>
<given-names>Sofie</given-names>
</name>
<xref ref-type="aff" rid="aff1">
<sup>1</sup>
</xref>
<xref ref-type="aff" rid="aff2">
<sup>2</sup>
</xref>
</contrib>
<contrib contrib-type="author" equal-contrib="yes">
<name>
<surname>Björne</surname>
<given-names>Jari</given-names>
</name>
<xref ref-type="aff" rid="aff3">
<sup>3</sup>
</xref>
<xref ref-type="aff" rid="aff4">
<sup>4</sup>
</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Wei</surname>
<given-names>Chih-Hsuan</given-names>
</name>
<xref ref-type="aff" rid="aff5">
<sup>5</sup>
</xref>
<xref ref-type="aff" rid="aff6">
<sup>6</sup>
</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Hakala</surname>
<given-names>Kai</given-names>
</name>
<xref ref-type="aff" rid="aff4">
<sup>4</sup>
</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Pyysalo</surname>
<given-names>Sampo</given-names>
</name>
<xref ref-type="aff" rid="aff7">
<sup>7</sup>
</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Ananiadou</surname>
<given-names>Sophia</given-names>
</name>
<xref ref-type="aff" rid="aff7">
<sup>7</sup>
</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Kao</surname>
<given-names>Hung-Yu</given-names>
</name>
<xref ref-type="aff" rid="aff6">
<sup>6</sup>
</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Lu</surname>
<given-names>Zhiyong</given-names>
</name>
<xref ref-type="aff" rid="aff5">
<sup>5</sup>
</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Salakoski</surname>
<given-names>Tapio</given-names>
</name>
<xref ref-type="aff" rid="aff3">
<sup>3</sup>
</xref>
<xref ref-type="aff" rid="aff4">
<sup>4</sup>
</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Van de Peer</surname>
<given-names>Yves</given-names>
</name>
<xref ref-type="aff" rid="aff1">
<sup>1</sup>
</xref>
<xref ref-type="aff" rid="aff2">
<sup>2</sup>
</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Ginter</surname>
<given-names>Filip</given-names>
</name>
<xref ref-type="aff" rid="aff4">
<sup>4</sup>
</xref>
<xref ref-type="corresp" rid="cor1">
<sup>*</sup>
</xref>
</contrib>
</contrib-group>
<aff id="aff1">
<label>1</label>
<addr-line>Department of Plant Systems Biology, VIB, Gent, Belgium</addr-line>
</aff>
<aff id="aff2">
<label>2</label>
<addr-line>Department of Plant Biotechnology and Bioinformatics, Ghent University, Gent, Belgium</addr-line>
</aff>
<aff id="aff3">
<label>3</label>
<addr-line>Turku Centre for Computer Science, Turku, Finland</addr-line>
</aff>
<aff id="aff4">
<label>4</label>
<addr-line>Department of Information Technology, University of Turku, Finland</addr-line>
</aff>
<aff id="aff5">
<label>5</label>
<addr-line>National Center for Biotechnology Information, Bethesda, Maryland, United States of America</addr-line>
</aff>
<aff id="aff6">
<label>6</label>
<addr-line>Department of Computer Science and Information Engineering, National Cheng Kung University, Tainan, Taiwan</addr-line>
</aff>
<aff id="aff7">
<label>7</label>
<addr-line>National Centre for Text Mining, School of Computer Science, University of Manchester, Manchester, United Kingdom</addr-line>
</aff>
<contrib-group>
<contrib contrib-type="editor">
<name>
<surname>Aerts</surname>
<given-names>Stein</given-names>
</name>
<role>Editor</role>
<xref ref-type="aff" rid="edit1"></xref>
</contrib>
</contrib-group>
<aff id="edit1">
<addr-line>University of Leuven, Belgium</addr-line>
</aff>
<author-notes>
<corresp id="cor1">* E-mail:
<email>ginter@cs.utu.fi</email>
</corresp>
<fn fn-type="conflict">
<p>
<bold>Competing Interests: </bold>
The authors have declared that no competing interests exist.</p>
</fn>
<fn fn-type="con">
<p>Conceived and designed the experiments: SVL JB SP SA HYK ZL TS YVdP FG. Performed the experiments: SVL JB CHW KH FG. Analyzed the data: SVL JB CHW SP YVdP FG. Wrote the paper: SVL JB SP FG.</p>
</fn>
</author-notes>
<pub-date pub-type="collection">
<year>2013</year>
</pub-date>
<pub-date pub-type="epub">
<day>17</day>
<month>4</month>
<year>2013</year>
</pub-date>
<volume>8</volume>
<issue>4</issue>
<elocation-id>e55814</elocation-id>
<history>
<date date-type="received">
<day>2</day>
<month>11</month>
<year>2012</year>
</date>
<date date-type="accepted">
<day>2</day>
<month>1</month>
<year>2013</year>
</date>
</history>
<permissions>
<copyright-year>2013</copyright-year>
<license>
<license-p>This is an open-access article, free of all copyright, and may be freely reproduced, distributed, transmitted, modified, built upon, or otherwise used by anyone for any lawful purpose. The work is made available under the Creative Commons CC0 public domain dedication.</license-p>
</license>
</permissions>
<abstract>
<p>Text mining for the life sciences aims to aid database curation, knowledge summarization and information retrieval through the automated processing of biomedical texts. To provide comprehensive coverage and enable full integration with existing biomolecular database records, it is crucial that text mining tools scale up to millions of articles and that their analyses can be unambiguously linked to information recorded in resources such as UniProt, KEGG, BioGRID and NCBI databases. In this study, we investigate how fully automated text mining of complex biomolecular events can be augmented with a normalization strategy that identifies biological concepts in text, mapping them to identifiers at varying levels of granularity, ranging from canonicalized symbols to unique genes and proteins and broad gene families. To this end, we have combined two state-of-the-art text mining components, previously evaluated on two community-wide challenges, and have extended and improved upon these methods by exploiting their complementary nature. Using these systems, we perform normalization and event extraction to create a large-scale resource that is publicly available, unique in semantic scope, and covers all 21.9 million PubMed abstracts and 460 thousand PubMed Central open access full-text articles. This dataset contains 40 million biomolecular events involving 76 million gene/protein mentions, linked to 122 thousand distinct genes from 5032 species across the full taxonomic tree. Detailed evaluations and analyses reveal promising results for application of this data in database and pathway curation efforts. The main software components used in this study are released under an open-source license. Further, the resulting dataset is freely accessible through a novel API, providing programmatic and customized access (
<ext-link ext-link-type="uri" xlink:href="http://www.evexdb.org/api/v001/">http://www.evexdb.org/api/v001/</ext-link>
). Finally, to allow for large-scale bioinformatic analyses, the entire resource is available for bulk download from
<ext-link ext-link-type="uri" xlink:href="http://evexdb.org/download/">http://evexdb.org/download/</ext-link>
, under the Creative Commons – Attribution – Share Alike (CC BY-SA) license.</p>
</abstract>
<funding-group>
<funding-statement>This work was supported by the Research Foundation Flanders (
<ext-link ext-link-type="uri" xlink:href="http://www.fwo.be/">http://www.fwo.be/</ext-link>
); the Intramural Research Program of the National Institutes of Health, the National Library of Medicine (
<ext-link ext-link-type="uri" xlink:href="http://irp.nih.gov/">http://irp.nih.gov/</ext-link>
); the Academy of Finland (
<ext-link ext-link-type="uri" xlink:href="http://www.aka.fi">http://www.aka.fi</ext-link>
); and the UK Biotechnology and Biological Sciences Research Council (
<ext-link ext-link-type="uri" xlink:href="http://www.bbsrc.ac.uk">http://www.bbsrc.ac.uk</ext-link>
). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.</funding-statement>
</funding-group>
<counts>
<page-count count="12"></page-count>
</counts>
</article-meta>
</front>
</pmc>
<affiliations>
<list>
<country>
<li>Belgique</li>
<li>Finlande</li>
<li>Royaume-Uni</li>
<li>Taïwan</li>
<li>États-Unis</li>
</country>
<region>
<li>Angleterre</li>
<li>Finlande occidentale</li>
<li>Grand Manchester</li>
<li>Maryland</li>
</region>
<settlement>
<li>Manchester</li>
<li>Turku</li>
</settlement>
<orgName>
<li>Université de Manchester</li>
<li>Université de Turku</li>
</orgName>
</list>
<tree>
<country name="Belgique">
<noRegion>
<name sortKey="Van Landeghem, Sofie" sort="Van Landeghem, Sofie" uniqKey="Van Landeghem S" first="Sofie" last="Van Landeghem">Sofie Van Landeghem</name>
</noRegion>
<name sortKey="Van De Peer, Yves" sort="Van De Peer, Yves" uniqKey="Van De Peer Y" first="Yves" last="Van De Peer">Yves Van De Peer</name>
<name sortKey="Van De Peer, Yves" sort="Van De Peer, Yves" uniqKey="Van De Peer Y" first="Yves" last="Van De Peer">Yves Van De Peer</name>
<name sortKey="Van Landeghem, Sofie" sort="Van Landeghem, Sofie" uniqKey="Van Landeghem S" first="Sofie" last="Van Landeghem">Sofie Van Landeghem</name>
</country>
<country name="Finlande">
<region name="Finlande occidentale">
<name sortKey="Bjorne, Jari" sort="Bjorne, Jari" uniqKey="Bjorne J" first="Jari" last="Björne">Jari Björne</name>
</region>
<name sortKey="Bjorne, Jari" sort="Bjorne, Jari" uniqKey="Bjorne J" first="Jari" last="Björne">Jari Björne</name>
<name sortKey="Ginter, Filip" sort="Ginter, Filip" uniqKey="Ginter F" first="Filip" last="Ginter">Filip Ginter</name>
<name sortKey="Hakala, Kai" sort="Hakala, Kai" uniqKey="Hakala K" first="Kai" last="Hakala">Kai Hakala</name>
<name sortKey="Salakoski, Tapio" sort="Salakoski, Tapio" uniqKey="Salakoski T" first="Tapio" last="Salakoski">Tapio Salakoski</name>
<name sortKey="Salakoski, Tapio" sort="Salakoski, Tapio" uniqKey="Salakoski T" first="Tapio" last="Salakoski">Tapio Salakoski</name>
</country>
<country name="États-Unis">
<region name="Maryland">
<name sortKey="Wei, Chih Hsuan" sort="Wei, Chih Hsuan" uniqKey="Wei C" first="Chih-Hsuan" last="Wei">Chih-Hsuan Wei</name>
</region>
<name sortKey="Lu, Zhiyong" sort="Lu, Zhiyong" uniqKey="Lu Z" first="Zhiyong" last="Lu">Zhiyong Lu</name>
</country>
<country name="Taïwan">
<noRegion>
<name sortKey="Wei, Chih Hsuan" sort="Wei, Chih Hsuan" uniqKey="Wei C" first="Chih-Hsuan" last="Wei">Chih-Hsuan Wei</name>
</noRegion>
<name sortKey="Kao, Hung Yu" sort="Kao, Hung Yu" uniqKey="Kao H" first="Hung-Yu" last="Kao">Hung-Yu Kao</name>
</country>
<country name="Royaume-Uni">
<region name="Angleterre">
<name sortKey="Pyysalo, Sampo" sort="Pyysalo, Sampo" uniqKey="Pyysalo S" first="Sampo" last="Pyysalo">Sampo Pyysalo</name>
</region>
<name sortKey="Ananiadou, Sophia" sort="Ananiadou, Sophia" uniqKey="Ananiadou S" first="Sophia" last="Ananiadou">Sophia Ananiadou</name>
</country>
</tree>
</affiliations>
</record>

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Wicri/Belgique/explor/OpenAccessBelV2/Data/Pmc/Checkpoint
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000265 | SxmlIndent | more

Or

HfdSelect -h $EXPLOR_AREA/Data/Pmc/Checkpoint/biblio.hfd -nk 000265 | SxmlIndent | more
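
Once the record has been printed by one of the commands above, its identifiers can be pulled out with any XML parser. The following is only a minimal sketch in Python, assuming the record has been saved as text. The `SAMPLE` string and the `article_ids` helper are hypothetical names introduced for illustration and are not part of Dilib. The excerpt reproduces only the `<article-id>` elements of this record; the full record also carries `xlink:href` attributes, so a strict parser may require that namespace prefix to be declared first.

```python
import xml.etree.ElementTree as ET

# Hypothetical excerpt standing in for the record printed by HfdSelect;
# only some <article-id> elements of this record are reproduced here.
SAMPLE = """<pmc article-type="research-article">
<front>
<article-meta>
<article-id pub-id-type="pmid">23613707</article-id>
<article-id pub-id-type="pmc">3629104</article-id>
<article-id pub-id-type="doi">10.1371/journal.pone.0055814</article-id>
</article-meta>
</front>
</pmc>"""


def article_ids(xml_text):
    """Return a dict mapping each pub-id-type to its identifier text."""
    root = ET.fromstring(xml_text)
    return {
        node.get("pub-id-type"): (node.text or "").strip()
        for node in root.iter("article-id")
    }


if __name__ == "__main__":
    # Print one "type<TAB>value" line per identifier found in the excerpt.
    for id_type, value in article_ids(SAMPLE).items():
        print(f"{id_type}\t{value}")
```

The same helper could be applied to the `<pmc>` element of any record of this kind, provided the input is well-formed XML.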

To add a link to this page within the Wicri network

{{Explor lien
   |wiki=    Wicri/Belgique
   |area=    OpenAccessBelV2
   |flux=    Pmc
   |étape=   Checkpoint
   |type=    RBID
   |clé=     PMC:3629104
   |texte=   Large-Scale Event Extraction from Literature with Multi-Level Gene Normalization
}}

To generate wiki pages

HfdIndexSelect -h $EXPLOR_AREA/Data/Pmc/Checkpoint/RBID.i   -Sk "pubmed:23613707" \
       | HfdSelect -Kh $EXPLOR_AREA/Data/Pmc/Checkpoint/biblio.hfd   \
       | NlmPubMed2Wicri -a OpenAccessBelV2 

Wicri

This area was generated with Dilib version V0.6.25.
Data generation: Thu Dec 1 00:43:49 2016. Site generation: Wed Mar 6 14:51:30 2024