SGML exploration server

Building a glaucoma interaction network using a text mining approach

Internal identifier: 000015 (Pmc/Corpus)

Authors: Maha Soliman; Olfa Nasraoui; Nigel G. F. Cooper

Source: BioData Mining (2016)

RBID : PMC:4857381

Abstract

Background

The volume of biomedical literature and its underlying knowledge base is rapidly expanding, making it beyond the ability of a single human being to read through all the literature. Several automated methods have been developed to help make sense of this dilemma. The present study reports on the results of a text mining approach to extract gene interactions from the data warehouse of published experimental results which are then used to benchmark an interaction network associated with glaucoma. To the best of our knowledge, there is, as yet, no glaucoma interaction network derived solely from text mining approaches. The presence of such a network could provide a useful summative knowledge base to complement other forms of clinical information related to this disease.

Results

A glaucoma corpus was constructed from PubMed Central and a text mining approach was applied to extract genes and their relations from this corpus. The extracted relations between genes were checked using reference interaction databases and classified generally as known or new relations. The extracted genes and relations were then used to construct a glaucoma interaction network. Analysis of the resulting network indicated that it bears the characteristics of a small world interaction network. Our analysis showed the presence of seven glaucoma linked genes that defined the network modularity. A web-based system for browsing and visualizing the extracted glaucoma related interaction networks is made available at http://neurogene.spd.louisville.edu/GlaucomaINViewer/Form1.aspx.
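
The classification and network-analysis steps described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the gene names (GENE_A to GENE_D), the reference set, and all function names are hypothetical, and the use of average clustering coefficient and average shortest path length as small-world indicators is an assumption, since the abstract does not say which measures were used.

```python
from collections import deque
from itertools import combinations


def classify_relations(extracted, reference):
    """Split extracted gene pairs into known and new relations.

    Pairs are compared as unordered sets, so (A, B) matches (B, A).
    """
    ref = {frozenset(pair) for pair in reference}
    known = [p for p in extracted if frozenset(p) in ref]
    new = [p for p in extracted if frozenset(p) not in ref]
    return known, new


def build_graph(pairs):
    """Build an undirected adjacency map from gene pairs."""
    graph = {}
    for a, b in pairs:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def average_clustering(graph):
    """Mean local clustering coefficient (nodes with degree < 2 count as 0)."""
    total = 0.0
    for nbrs in graph.values():
        k = len(nbrs)
        if k < 2:
            continue
        links = sum(1 for u, v in combinations(nbrs, 2) if v in graph[u])
        total += 2.0 * links / (k * (k - 1))
    return total / len(graph)


def average_path_length(graph):
    """Mean breadth-first-search distance over all reachable node pairs."""
    total, count = 0, 0
    for source in graph:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in graph[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
        count += len(dist) - 1
    return total / count if count else 0.0


# Hypothetical toy data: placeholder gene names, not results from the study.
extracted = [("GENE_A", "GENE_B"), ("GENE_B", "GENE_C"),
             ("GENE_A", "GENE_C"), ("GENE_C", "GENE_D")]
reference = [("GENE_B", "GENE_A"), ("GENE_C", "GENE_D")]

known, new = classify_relations(extracted, reference)
graph = build_graph(extracted)
clustering = average_clustering(graph)
path_length = average_path_length(graph)
```

A network is usually called small-world when its clustering is high relative to a random graph of the same size while its average path length stays short, so in practice both values would be compared against a random baseline rather than read in isolation.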

Conclusions

This study has reported the first version of a glaucoma interaction network using a text mining approach. The power of such an approach is in its ability to cover a wide range of glaucoma related studies published over many years. Hence, a bigger picture of the disease can be established. To the best of our knowledge, this is the first glaucoma interaction network to summarize the known literature. The major findings were a set of relations that could not be found in existing interaction databases and that were found to be new, in addition to a smaller subnetwork consisting of interconnected clusters of seven glaucoma genes. Future improvements can be applied towards obtaining a better version of this network.

Electronic supplementary material

The online version of this article (doi:10.1186/s13040-016-0096-2) contains supplementary material, which is available to authorized users.


Url: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4857381
DOI: 10.1186/s13040-016-0096-2
PubMed: 27152122
PubMed Central: 4857381

Links to Exploration step

PMC:4857381

The document in XML format

<record>
<TEI>
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en">Building a glaucoma interaction network using a text mining approach</title>
<author>
<name sortKey="Soliman, Maha" sort="Soliman, Maha" uniqKey="Soliman M" first="Maha" last="Soliman">Maha Soliman</name>
<affiliation>
<nlm:aff id="Aff1">Department of Anatomical Sciences and Neurobiology, University of Louisville, School of Medicine, Louisville, KY USA</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Nasraoui, Olfa" sort="Nasraoui, Olfa" uniqKey="Nasraoui O" first="Olfa" last="Nasraoui">Olfa Nasraoui</name>
<affiliation>
<nlm:aff id="Aff2">Knowledge Discovery &amp; Web Mining Lab, Department of Computer Engineering &amp; Computer Science, University of Louisville, J.B Speed School of Engineering, Louisville, KY USA</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Cooper, Nigel G F" sort="Cooper, Nigel G F" uniqKey="Cooper N" first="Nigel G. F." last="Cooper">Nigel G. F. Cooper</name>
<affiliation>
<nlm:aff id="Aff1">Department of Anatomical Sciences and Neurobiology, University of Louisville, School of Medicine, Louisville, KY USA</nlm:aff>
</affiliation>
</author>
</titleStmt>
<publicationStmt>
<idno type="wicri:source">PMC</idno>
<idno type="pmid">27152122</idno>
<idno type="pmc">4857381</idno>
<idno type="url">http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4857381</idno>
<idno type="RBID">PMC:4857381</idno>
<idno type="doi">10.1186/s13040-016-0096-2</idno>
<date when="2016">2016</date>
<idno type="wicri:Area/Pmc/Corpus">000015</idno>
<idno type="wicri:explorRef" wicri:stream="Pmc" wicri:step="Corpus" wicri:corpus="PMC">000015</idno>
</publicationStmt>
<sourceDesc>
<biblStruct>
<analytic>
<title xml:lang="en" level="a" type="main">Building a glaucoma interaction network using a text mining approach</title>
<author>
<name sortKey="Soliman, Maha" sort="Soliman, Maha" uniqKey="Soliman M" first="Maha" last="Soliman">Maha Soliman</name>
<affiliation>
<nlm:aff id="Aff1">Department of Anatomical Sciences and Neurobiology, University of Louisville, School of Medicine, Louisville, KY USA</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Nasraoui, Olfa" sort="Nasraoui, Olfa" uniqKey="Nasraoui O" first="Olfa" last="Nasraoui">Olfa Nasraoui</name>
<affiliation>
<nlm:aff id="Aff2">Knowledge Discovery &amp; Web Mining Lab, Department of Computer Engineering &amp; Computer Science, University of Louisville, J.B Speed School of Engineering, Louisville, KY USA</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Cooper, Nigel G F" sort="Cooper, Nigel G F" uniqKey="Cooper N" first="Nigel G. F." last="Cooper">Nigel G. F. Cooper</name>
<affiliation>
<nlm:aff id="Aff1">Department of Anatomical Sciences and Neurobiology, University of Louisville, School of Medicine, Louisville, KY USA</nlm:aff>
</affiliation>
</author>
</analytic>
<series>
<title level="j">BioData Mining</title>
<idno type="eISSN">1756-0381</idno>
<imprint>
<date when="2016">2016</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
</fileDesc>
<profileDesc>
<textClass></textClass>
</profileDesc>
</teiHeader>
<front>
<div type="abstract" xml:lang="en">
<sec>
<title>Background</title>
<p>The volume of biomedical literature and its underlying knowledge base is rapidly expanding, making it beyond the ability of a single human being to read through all the literature. Several automated methods have been developed to help make sense of this dilemma. The present study reports on the results of a text mining approach to extract gene interactions from the data warehouse of published experimental results which are then used to benchmark an interaction network associated with glaucoma. To the best of our knowledge, there is, as yet, no glaucoma interaction network derived solely from text mining approaches. The presence of such a network could provide a useful summative knowledge base to complement other forms of clinical information related to this disease.</p>
</sec>
<sec>
<title>Results</title>
<p>A glaucoma corpus was constructed from PubMed Central and a text mining approach was applied to extract genes and their relations from this corpus. The extracted relations between genes were checked using reference interaction databases and classified generally as known or new relations. The extracted genes and relations were then used to construct a glaucoma interaction network. Analysis of the resulting network indicated that it bears the characteristics of a small world interaction network. Our analysis showed the presence of seven glaucoma linked genes that defined the network modularity. A web-based system for browsing and visualizing the extracted glaucoma related interaction networks is made available at
<ext-link ext-link-type="uri" xlink:href="http://neurogene.spd.louisville.edu/GlaucomaINViewer/Form1.aspx">http://neurogene.spd.louisville.edu/GlaucomaINViewer/Form1.aspx</ext-link>
.</p>
</sec>
<sec>
<title>Conclusions</title>
<p>This study has reported the first version of a glaucoma interaction network using a text mining approach. The power of such an approach is in its ability to cover a wide range of glaucoma related studies published over many years. Hence, a bigger picture of the disease can be established. To the best of our knowledge, this is the first glaucoma interaction network to summarize the known literature. The major findings were a set of relations that could not be found in existing interaction databases and that were found to be new, in addition to a smaller subnetwork consisting of interconnected clusters of seven glaucoma genes. Future improvements can be applied towards obtaining a better version of this network.</p>
</sec>
<sec>
<title>Electronic supplementary material</title>
<p>The online version of this article (doi:10.1186/s13040-016-0096-2) contains supplementary material, which is available to authorized users.</p>
</sec>
</div>
</front>
<back>
<div1 type="bibliography">
<listBibl>
<biblStruct></biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Swanson, Dr" uniqKey="Swanson D">DR Swanson</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Srinivasan, P" uniqKey="Srinivasan P">P Srinivasan</name>
</author>
<author>
<name sortKey="Libbus, B" uniqKey="Libbus B">B Libbus</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Wren, Jd" uniqKey="Wren J">JD Wren</name>
</author>
<author>
<name sortKey="Bekeredjian, R" uniqKey="Bekeredjian R">R Bekeredjian</name>
</author>
<author>
<name sortKey="Stewart, Ja" uniqKey="Stewart J">JA Stewart</name>
</author>
<author>
<name sortKey="Shohet, Rv" uniqKey="Shohet R">RV Shohet</name>
</author>
<author>
<name sortKey="Garner, Hr" uniqKey="Garner H">HR Garner</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Chen, H" uniqKey="Chen H">H Chen</name>
</author>
<author>
<name sortKey="Sharp, Bm" uniqKey="Sharp B">BM Sharp</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Van Der Eijk, Cc" uniqKey="Van Der Eijk C">CC van der Eijk</name>
</author>
<author>
<name sortKey="Van Mulligen, Em" uniqKey="Van Mulligen E">EM van Mulligen</name>
</author>
<author>
<name sortKey="Kors, Ja" uniqKey="Kors J">JA Kors</name>
</author>
<author>
<name sortKey="Mons, B" uniqKey="Mons B">B Mons</name>
</author>
<author>
<name sortKey="Van Den Berg, J" uniqKey="Van Den Berg J">J van den Berg</name>
</author>
</analytic>
</biblStruct>
<biblStruct></biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Abulaish, M" uniqKey="Abulaish M">M Abulaish</name>
</author>
<author>
<name sortKey="Dey, L" uniqKey="Dey L">L Dey</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="He, M" uniqKey="He M">M He</name>
</author>
<author>
<name sortKey="Wang, Y" uniqKey="Wang Y">Y Wang</name>
</author>
<author>
<name sortKey="Li, W" uniqKey="Li W">W Li</name>
</author>
</analytic>
</biblStruct>
<biblStruct></biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Yang, Y" uniqKey="Yang Y">Y Yang</name>
</author>
<author>
<name sortKey="Wang, Y" uniqKey="Wang Y">Y Wang</name>
</author>
<author>
<name sortKey="Zhou, K" uniqKey="Zhou K">K Zhou</name>
</author>
<author>
<name sortKey="Hong, A" uniqKey="Hong A">A Hong</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Malhotra, A" uniqKey="Malhotra A">A Malhotra</name>
</author>
<author>
<name sortKey="Younesi, E" uniqKey="Younesi E">E Younesi</name>
</author>
<author>
<name sortKey="Bagewadi, S" uniqKey="Bagewadi S">S Bagewadi</name>
</author>
<author>
<name sortKey="Hofmann Apitius, M" uniqKey="Hofmann Apitius M">M Hofmann-Apitius</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Quan, C" uniqKey="Quan C">C Quan</name>
</author>
<author>
<name sortKey="Ren, F" uniqKey="Ren F">F Ren</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Ozgur, A" uniqKey="Ozgur A">A Ozgur</name>
</author>
<author>
<name sortKey="Vu, T" uniqKey="Vu T">T Vu</name>
</author>
<author>
<name sortKey="Erkan, G" uniqKey="Erkan G">G Erkan</name>
</author>
<author>
<name sortKey="Radev, Dr" uniqKey="Radev D">DR Radev</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Wu, X" uniqKey="Wu X">X Wu</name>
</author>
<author>
<name sortKey="Chen, L" uniqKey="Chen L">L Chen</name>
</author>
<author>
<name sortKey="Wang, X" uniqKey="Wang X">X Wang</name>
</author>
</analytic>
</biblStruct>
<biblStruct></biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Skusa, A" uniqKey="Skusa A">A Skusa</name>
</author>
<author>
<name sortKey="Ruegg, A" uniqKey="Ruegg A">A Rüegg</name>
</author>
<author>
<name sortKey="Kohler, J" uniqKey="Kohler J">J Köhler</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Nguyen, N" uniqKey="Nguyen N">N Nguyen</name>
</author>
<author>
<name sortKey="Miwa, M" uniqKey="Miwa M">M Miwa</name>
</author>
<author>
<name sortKey="Tsuruoka, Y" uniqKey="Tsuruoka Y">Y Tsuruoka</name>
</author>
<author>
<name sortKey="Tojo, S" uniqKey="Tojo S">S Tojo</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Etzioni, O" uniqKey="Etzioni O">O Etzioni</name>
</author>
<author>
<name sortKey="Banko, M" uniqKey="Banko M">M Banko</name>
</author>
<author>
<name sortKey="Soderland, S" uniqKey="Soderland S">S Soderland</name>
</author>
<author>
<name sortKey="Weld, Ds" uniqKey="Weld D">DS Weld</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Rinaldi, F" uniqKey="Rinaldi F">F Rinaldi</name>
</author>
<author>
<name sortKey="Clematide, S" uniqKey="Clematide S">S Clematide</name>
</author>
<author>
<name sortKey="Marques, H" uniqKey="Marques H">H Marques</name>
</author>
<author>
<name sortKey="Ellendorff, T" uniqKey="Ellendorff T">T Ellendorff</name>
</author>
<author>
<name sortKey="Romacker, M" uniqKey="Romacker M">M Romacker</name>
</author>
<author>
<name sortKey="Rodriguez Esteban, R" uniqKey="Rodriguez Esteban R">R Rodriguez-Esteban</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Jelier, R" uniqKey="Jelier R">R Jelier</name>
</author>
<author>
<name sortKey="Schuemie, Mj" uniqKey="Schuemie M">MJ Schuemie</name>
</author>
<author>
<name sortKey="Veldhoven, A" uniqKey="Veldhoven A">A Veldhoven</name>
</author>
<author>
<name sortKey="Dorssers, Lc" uniqKey="Dorssers L">LC Dorssers</name>
</author>
<author>
<name sortKey="Jenster, G" uniqKey="Jenster G">G Jenster</name>
</author>
<author>
<name sortKey="Kors, Ja" uniqKey="Kors J">JA Kors</name>
</author>
</analytic>
</biblStruct>
<biblStruct></biblStruct>
<biblStruct></biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Kingman, S" uniqKey="Kingman S">S Kingman</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Beidoe, G" uniqKey="Beidoe G">G Beidoe</name>
</author>
<author>
<name sortKey="Mousa, Sa" uniqKey="Mousa S">SA Mousa</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Hu, T" uniqKey="Hu T">T Hu</name>
</author>
<author>
<name sortKey="Darabos, C" uniqKey="Darabos C">C Darabos</name>
</author>
<author>
<name sortKey="Cricco Me, Ke" uniqKey="Cricco Me K">KE Cricco Me</name>
</author>
<author>
<name sortKey="Moore, Jh" uniqKey="Moore J">JH Moore</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Basu, K" uniqKey="Basu K">K Basu</name>
</author>
<author>
<name sortKey="Sen, A" uniqKey="Sen A">A Sen</name>
</author>
<author>
<name sortKey="Ray, K" uniqKey="Ray K">K Ray</name>
</author>
<author>
<name sortKey="Ghosh, I" uniqKey="Ghosh I">I Ghosh</name>
</author>
<author>
<name sortKey="Datta, K" uniqKey="Datta K">K Datta</name>
</author>
<author>
<name sortKey="Mukhopadhyay, A" uniqKey="Mukhopadhyay A">A Mukhopadhyay</name>
</author>
</analytic>
</biblStruct>
<biblStruct></biblStruct>
<biblStruct></biblStruct>
<biblStruct></biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Mooney, Rj" uniqKey="Mooney R">RJ Mooney</name>
</author>
<author>
<name sortKey="Bunescu, R" uniqKey="Bunescu R">R Bunescu</name>
</author>
</analytic>
</biblStruct>
<biblStruct></biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Pyysalo, S" uniqKey="Pyysalo S">S Pyysalo</name>
</author>
<author>
<name sortKey="Ohta, T" uniqKey="Ohta T">T Ohta</name>
</author>
<author>
<name sortKey="Tsujii, J" uniqKey="Tsujii J">J Tsujii</name>
</author>
</analytic>
</biblStruct>
<biblStruct></biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Tanabe, L" uniqKey="Tanabe L">L Tanabe</name>
</author>
<author>
<name sortKey="Xie, N" uniqKey="Xie N">N Xie</name>
</author>
<author>
<name sortKey="Thom, Lh" uniqKey="Thom L">LH Thom</name>
</author>
<author>
<name sortKey="Matten, W" uniqKey="Matten W">W Matten</name>
</author>
<author>
<name sortKey="Wilbur, Wj" uniqKey="Wilbur W">WJ Wilbur</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Kim, Jd" uniqKey="Kim J">JD Kim</name>
</author>
<author>
<name sortKey="Ohta, T" uniqKey="Ohta T">T Ohta</name>
</author>
<author>
<name sortKey="Tsujii, J" uniqKey="Tsujii J">J Tsujii</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Krallinger, M" uniqKey="Krallinger M">M Krallinger</name>
</author>
<author>
<name sortKey="Leitner, F" uniqKey="Leitner F">F Leitner</name>
</author>
<author>
<name sortKey="Valencia, A" uniqKey="Valencia A">A Valencia</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Hamosh, A" uniqKey="Hamosh A">A Hamosh</name>
</author>
<author>
<name sortKey="Scott, Af" uniqKey="Scott A">AF Scott</name>
</author>
<author>
<name sortKey="Amberger, Js" uniqKey="Amberger J">JS Amberger</name>
</author>
<author>
<name sortKey="Bocchini, Ca" uniqKey="Bocchini C">CA Bocchini</name>
</author>
<author>
<name sortKey="Mckusick, Va" uniqKey="Mckusick V">VA McKusick</name>
</author>
</analytic>
</biblStruct>
<biblStruct></biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Bauer Mehren, A" uniqKey="Bauer Mehren A">A Bauer-Mehren</name>
</author>
<author>
<name sortKey="Bundschus, M" uniqKey="Bundschus M">M Bundschus</name>
</author>
<author>
<name sortKey="Rautschka, M" uniqKey="Rautschka M">M Rautschka</name>
</author>
<author>
<name sortKey="Mayer, Ma" uniqKey="Mayer M">MA Mayer</name>
</author>
<author>
<name sortKey="Sanz, F" uniqKey="Sanz F">F Sanz</name>
</author>
<author>
<name sortKey="Furlong, Li" uniqKey="Furlong L">LI Furlong</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Bauer Mehren, A" uniqKey="Bauer Mehren A">A Bauer-Mehren</name>
</author>
<author>
<name sortKey="Rautschka, M" uniqKey="Rautschka M">M Rautschka</name>
</author>
<author>
<name sortKey="Sanz, F" uniqKey="Sanz F">F Sanz</name>
</author>
<author>
<name sortKey="Furlong, Li" uniqKey="Furlong L">LI Furlong</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Gray, Ka" uniqKey="Gray K">KA Gray</name>
</author>
<author>
<name sortKey="Yates, B" uniqKey="Yates B">B Yates</name>
</author>
<author>
<name sortKey="Seal, Rl" uniqKey="Seal R">RL Seal</name>
</author>
<author>
<name sortKey="Wright, Mw" uniqKey="Wright M">MW Wright</name>
</author>
<author>
<name sortKey="Bruford, Ea" uniqKey="Bruford E">EA Bruford</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Fader, A" uniqKey="Fader A">A Fader</name>
</author>
<author>
<name sortKey="Soderland, S" uniqKey="Soderland S">S Soderland</name>
</author>
<author>
<name sortKey="Etzioni, O" uniqKey="Etzioni O">O Etzioni</name>
</author>
</analytic>
</biblStruct>
<biblStruct></biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Stark, C" uniqKey="Stark C">C Stark</name>
</author>
<author>
<name sortKey="Breitkreutz, B J" uniqKey="Breitkreutz B">B-J Breitkreutz</name>
</author>
<author>
<name sortKey="Reguly, T" uniqKey="Reguly T">T Reguly</name>
</author>
<author>
<name sortKey="Boucher, L" uniqKey="Boucher L">L Boucher</name>
</author>
<author>
<name sortKey="Breitkreutz, A" uniqKey="Breitkreutz A">A Breitkreutz</name>
</author>
<author>
<name sortKey="Tyers, M" uniqKey="Tyers M">M Tyers</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Rebhan, M" uniqKey="Rebhan M">M Rebhan</name>
</author>
<author>
<name sortKey="Chalifa Caspi, V" uniqKey="Chalifa Caspi V">V Chalifa-Caspi</name>
</author>
<author>
<name sortKey="Prilusky, J" uniqKey="Prilusky J">J Prilusky</name>
</author>
<author>
<name sortKey="Lancet, D" uniqKey="Lancet D">D Lancet</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Bastian, M" uniqKey="Bastian M">M Bastian</name>
</author>
<author>
<name sortKey="Heymann, S" uniqKey="Heymann S">S Heymann</name>
</author>
<author>
<name sortKey="Jacomy, M" uniqKey="Jacomy M">M Jacomy</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Shannon, P" uniqKey="Shannon P">P Shannon</name>
</author>
<author>
<name sortKey="Markiel, A" uniqKey="Markiel A">A Markiel</name>
</author>
<author>
<name sortKey="Ozier, O" uniqKey="Ozier O">O Ozier</name>
</author>
<author>
<name sortKey="Baliga, Ns" uniqKey="Baliga N">NS Baliga</name>
</author>
<author>
<name sortKey="Wang, Jt" uniqKey="Wang J">JT Wang</name>
</author>
<author>
<name sortKey="Ramage, D" uniqKey="Ramage D">D Ramage</name>
</author>
<author>
<name sortKey="Amin, N" uniqKey="Amin N">N Amin</name>
</author>
<author>
<name sortKey="Schwikowski, B" uniqKey="Schwikowski B">B Schwikowski</name>
</author>
<author>
<name sortKey="Ideker, T" uniqKey="Ideker T">T Ideker</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Mi, H" uniqKey="Mi H">H Mi</name>
</author>
<author>
<name sortKey="Muruganujan, A" uniqKey="Muruganujan A">A Muruganujan</name>
</author>
<author>
<name sortKey="Casagrande, Jt" uniqKey="Casagrande J">JT Casagrande</name>
</author>
<author>
<name sortKey="Thomas, Pd" uniqKey="Thomas P">PD Thomas</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Da Huang, W" uniqKey="Da Huang W">W da Huang</name>
</author>
<author>
<name sortKey="Sherman, Bt" uniqKey="Sherman B">BT Sherman</name>
</author>
<author>
<name sortKey="Lempicki, Ra" uniqKey="Lempicki R">RA Lempicki</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Da Huang, W" uniqKey="Da Huang W">W da Huang</name>
</author>
<author>
<name sortKey="Sherman, Bt" uniqKey="Sherman B">BT Sherman</name>
</author>
<author>
<name sortKey="Lempicki, Ra" uniqKey="Lempicki R">RA Lempicki</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Carmona Saez, P" uniqKey="Carmona Saez P">P Carmona-Saez</name>
</author>
<author>
<name sortKey="Chagoyen, M" uniqKey="Chagoyen M">M Chagoyen</name>
</author>
<author>
<name sortKey="Tirado, F" uniqKey="Tirado F">F Tirado</name>
</author>
<author>
<name sortKey="Carazo, Jm" uniqKey="Carazo J">JM Carazo</name>
</author>
<author>
<name sortKey="Pascual Montano, A" uniqKey="Pascual Montano A">A Pascual-Montano</name>
</author>
</analytic>
</biblStruct>
<biblStruct></biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Tabas Madrid, D" uniqKey="Tabas Madrid D">D Tabas-Madrid</name>
</author>
<author>
<name sortKey="Nogales Cadenas, R" uniqKey="Nogales Cadenas R">R Nogales-Cadenas</name>
</author>
<author>
<name sortKey="Pascual Montano, A" uniqKey="Pascual Montano A">A Pascual-Montano</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Rokicki, W" uniqKey="Rokicki W">W Rokicki</name>
</author>
<author>
<name sortKey="Dorecka, M" uniqKey="Dorecka M">M Dorecka</name>
</author>
<author>
<name sortKey="Romaniuk, W" uniqKey="Romaniuk W">W Romaniuk</name>
</author>
</analytic>
</biblStruct>
<biblStruct></biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Villarreal, G" uniqKey="Villarreal G">G Villarreal</name>
</author>
<author>
<name sortKey="Chatterjee, A" uniqKey="Chatterjee A">A Chatterjee</name>
</author>
<author>
<name sortKey="Oh, Ss" uniqKey="Oh S">SS Oh</name>
</author>
<author>
<name sortKey="Oh, Dj" uniqKey="Oh D">DJ Oh</name>
</author>
<author>
<name sortKey="Kang, Mh" uniqKey="Kang M">MH Kang</name>
</author>
<author>
<name sortKey="Rhee, Dj" uniqKey="Rhee D">DJ Rhee</name>
</author>
</analytic>
</biblStruct>
<biblStruct></biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Barabasi, A L" uniqKey="Barabasi A">A-L Barabási</name>
</author>
<author>
<name sortKey="Albert, R" uniqKey="Albert R">R Albert</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Ravasz, E" uniqKey="Ravasz E">E Ravasz</name>
</author>
<author>
<name sortKey="Somera, Al" uniqKey="Somera A">AL Somera</name>
</author>
<author>
<name sortKey="Mongru, Da" uniqKey="Mongru D">DA Mongru</name>
</author>
<author>
<name sortKey="Oltvai, Zn" uniqKey="Oltvai Z">ZN Oltvai</name>
</author>
<author>
<name sortKey="Barabasi, A L" uniqKey="Barabasi A">A-L Barabási</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Yook, Sh" uniqKey="Yook S">SH Yook</name>
</author>
<author>
<name sortKey="Oltvai, Zn" uniqKey="Oltvai Z">ZN Oltvai</name>
</author>
<author>
<name sortKey="Barabasi, Al" uniqKey="Barabasi A">AL Barabási</name>
</author>
</analytic>
</biblStruct>
<biblStruct></biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Chtioui, S" uniqKey="Chtioui S">S Chtioui</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Ekbal, A" uniqKey="Ekbal A">A Ekbal</name>
</author>
<author>
<name sortKey="Saha, S" uniqKey="Saha S">S Saha</name>
</author>
<author>
<name sortKey="Sikdar, Uk" uniqKey="Sikdar U">UK Sikdar</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Blondel, Vd" uniqKey="Blondel V">VD Blondel</name>
</author>
<author>
<name sortKey="Guillaume, Jl" uniqKey="Guillaume J">JL Guillaume</name>
</author>
<author>
<name sortKey="Lambiotte, R" uniqKey="Lambiotte R">R Lambiotte</name>
</author>
<author>
<name sortKey="Lefebvre, E" uniqKey="Lefebvre E">E Lefebvre</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Lambiotte, R" uniqKey="Lambiotte R">R Lambiotte</name>
</author>
<author>
<name sortKey="Delvenne, Jc" uniqKey="Delvenne J">JC Delvenne</name>
</author>
<author>
<name sortKey="Barahona, M" uniqKey="Barahona M">M Barahona</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Pyysalo, S" uniqKey="Pyysalo S">S Pyysalo</name>
</author>
<author>
<name sortKey="Ohta, T" uniqKey="Ohta T">T Ohta</name>
</author>
<author>
<name sortKey="Kim, J D" uniqKey="Kim J">J-D Kim</name>
</author>
<author>
<name sortKey="Tsujii, J" uniqKey="Tsujii J">J Tsujii</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="De Marneffe, M C" uniqKey="De Marneffe M">M-C De Marneffe</name>
</author>
<author>
<name sortKey="Maccartney, B" uniqKey="Maccartney B">B MacCartney</name>
</author>
<author>
<name sortKey="Manning, Cd" uniqKey="Manning C">CD Manning</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Nakatake, S" uniqKey="Nakatake S">S Nakatake</name>
</author>
<author>
<name sortKey="Yoshida, S" uniqKey="Yoshida S">S Yoshida</name>
</author>
<author>
<name sortKey="Nakao, S" uniqKey="Nakao S">S Nakao</name>
</author>
<author>
<name sortKey="Arita, R" uniqKey="Arita R">R Arita</name>
</author>
<author>
<name sortKey="Yasuda, M" uniqKey="Yasuda M">M Yasuda</name>
</author>
<author>
<name sortKey="Kita, T" uniqKey="Kita T">T Kita</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Wang, Dy" uniqKey="Wang D">DY Wang</name>
</author>
<author>
<name sortKey="Ray, A" uniqKey="Ray A">A Ray</name>
</author>
<author>
<name sortKey="Rodgers, K" uniqKey="Rodgers K">K Rodgers</name>
</author>
<author>
<name sortKey="Ergorul, C" uniqKey="Ergorul C">C Ergorul</name>
</author>
<author>
<name sortKey="Hyman, Bt" uniqKey="Hyman B">BT Hyman</name>
</author>
<author>
<name sortKey="Huang, W" uniqKey="Huang W">W Huang</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Stewart, Mw" uniqKey="Stewart M">MW Stewart</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Wecker, T" uniqKey="Wecker T">T Wecker</name>
</author>
<author>
<name sortKey="Han, H" uniqKey="Han H">H Han</name>
</author>
<author>
<name sortKey="Borner, J" uniqKey="Borner J">J Borner</name>
</author>
<author>
<name sortKey="Grehn, F" uniqKey="Grehn F">F Grehn</name>
</author>
<author>
<name sortKey="Schlunck, G" uniqKey="Schlunck G">G Schlunck</name>
</author>
</analytic>
</biblStruct>
<biblStruct></biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Izzotti, A" uniqKey="Izzotti A">A Izzotti</name>
</author>
<author>
<name sortKey="Longobardi, M" uniqKey="Longobardi M">M Longobardi</name>
</author>
<author>
<name sortKey="Cartiglia, C" uniqKey="Cartiglia C">C Cartiglia</name>
</author>
<author>
<name sortKey="Sacca, Sc" uniqKey="Sacca S">SC Sacca</name>
</author>
</analytic>
</biblStruct>
</listBibl>
</div1>
</back>
</TEI>
<pmc article-type="research-article">
<pmc-dir>properties open_access</pmc-dir>
<front>
<journal-meta>
<journal-id journal-id-type="nlm-ta">BioData Min</journal-id>
<journal-id journal-id-type="iso-abbrev">BioData Min</journal-id>
<journal-title-group>
<journal-title>BioData Mining</journal-title>
</journal-title-group>
<issn pub-type="epub">1756-0381</issn>
<publisher>
<publisher-name>BioMed Central</publisher-name>
<publisher-loc>London</publisher-loc>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="pmid">27152122</article-id>
<article-id pub-id-type="pmc">4857381</article-id>
<article-id pub-id-type="publisher-id">96</article-id>
<article-id pub-id-type="doi">10.1186/s13040-016-0096-2</article-id>
<article-categories>
<subj-group subj-group-type="heading">
<subject>Research</subject>
</subj-group>
</article-categories>
<title-group>
<article-title>Building a glaucoma interaction network using a text mining approach</article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author" corresp="yes">
<name>
<surname>Soliman</surname>
<given-names>Maha</given-names>
</name>
<address>
<email>maha.soliman@louisville.edu</email>
</address>
<xref ref-type="aff" rid="Aff1"></xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Nasraoui</surname>
<given-names>Olfa</given-names>
</name>
<xref ref-type="aff" rid="Aff2"></xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Cooper</surname>
<given-names>Nigel G. F.</given-names>
</name>
<xref ref-type="aff" rid="Aff1"></xref>
</contrib>
<aff id="Aff1">
<label></label>
Department of Anatomical Sciences and Neurobiology, University of Louisville, School of Medicine, Louisville, KY USA</aff>
<aff id="Aff2">
<label></label>
Knowledge Discovery &amp; Web Mining Lab, Department of Computer Engineering &amp; Computer Science, University of Louisville, J.B Speed School of Engineering, Louisville, KY USA</aff>
</contrib-group>
<pub-date pub-type="epub">
<day>5</day>
<month>5</month>
<year>2016</year>
</pub-date>
<pub-date pub-type="pmc-release">
<day>5</day>
<month>5</month>
<year>2016</year>
</pub-date>
<pub-date pub-type="collection">
<year>2016</year>
</pub-date>
<volume>9</volume>
<elocation-id>17</elocation-id>
<history>
<date date-type="received">
<day>9</day>
<month>10</month>
<year>2015</year>
</date>
<date date-type="accepted">
<day>23</day>
<month>4</month>
<year>2016</year>
</date>
</history>
<permissions>
<copyright-statement>© Soliman et al. 2016</copyright-statement>
<license license-type="OpenAccess">
<license-p>
<bold>Open Access</bold>
This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (
<ext-link ext-link-type="uri" xlink:href="http://creativecommons.org/licenses/by/4.0/">http://creativecommons.org/licenses/by/4.0/</ext-link>
), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (
<ext-link ext-link-type="uri" xlink:href="http://creativecommons.org/publicdomain/zero/1.0/">http://creativecommons.org/publicdomain/zero/1.0/</ext-link>
) applies to the data made available in this article, unless otherwise stated.</license-p>
</license>
</permissions>
<abstract id="Abs1">
<sec>
<title>Background</title>
<p>The volume of biomedical literature and its underlying knowledge base is rapidly expanding, making it beyond the ability of a single human being to read through all the literature. Several automated methods have been developed to help make sense of this dilemma. The present study reports on the results of a text mining approach to extract gene interactions from the data warehouse of published experimental results which are then used to benchmark an interaction network associated with glaucoma. To the best of our knowledge, there is, as yet, no glaucoma interaction network derived solely from text mining approaches. The presence of such a network could provide a useful summative knowledge base to complement other forms of clinical information related to this disease.</p>
</sec>
<sec>
<title>Results</title>
<p>A glaucoma corpus was constructed from PubMed Central and a text mining approach was applied to extract genes and their relations from this corpus. The extracted relations between genes were checked using reference interaction databases and classified generally as known or new relations. The extracted genes and relations were then used to construct a glaucoma interaction network. Analysis of the resulting network indicated that it bears the characteristics of a small world interaction network. Our analysis showed the presence of seven glaucoma linked genes that defined the network modularity. A web-based system for browsing and visualizing the extracted glaucoma related interaction networks is made available at
<ext-link ext-link-type="uri" xlink:href="http://neurogene.spd.louisville.edu/GlaucomaINViewer/Form1.aspx">http://neurogene.spd.louisville.edu/GlaucomaINViewer/Form1.aspx</ext-link>
.</p>
</sec>
<sec>
<title>Conclusions</title>
<p>This study has reported the first version of a glaucoma interaction network using a text mining approach. The power of such an approach is in its ability to cover a wide range of glaucoma related studies published over many years. Hence, a bigger picture of the disease can be established. To the best of our knowledge, this is the first glaucoma interaction network to summarize the known literature. The major findings were a set of relations that could not be found in existing interaction databases and that were found to be new, in addition to a smaller subnetwork consisting of interconnected clusters of seven glaucoma genes. Future improvements can be applied towards obtaining a better version of this network.</p>
</sec>
<sec>
<title>Electronic supplementary material</title>
<p>The online version of this article (doi:10.1186/s13040-016-0096-2) contains supplementary material, which is available to authorized users.</p>
</sec>
</abstract>
<kwd-group xml:lang="en">
<title>Keywords</title>
<kwd>Text mining</kwd>
<kwd>Interaction network</kwd>
<kwd>Glaucoma</kwd>
<kwd>Relation extraction</kwd>
</kwd-group>
<funding-group>
<award-group>
<funding-source>
<institution-wrap>
<institution-id institution-id-type="FundRef">http://dx.doi.org/10.13039/100000053</institution-id>
<institution>National Eye Institute</institution>
</institution-wrap>
</funding-source>
<award-id>R01EY017594</award-id>
<principal-award-recipient>
<name>
<surname>Cooper</surname>
<given-names>Nigel G. F.</given-names>
</name>
</principal-award-recipient>
</award-group>
</funding-group>
<custom-meta-group>
<custom-meta>
<meta-name>issue-copyright-statement</meta-name>
<meta-value>© The Author(s) 2016</meta-value>
</custom-meta>
</custom-meta-group>
</article-meta>
</front>
<body>
<sec id="Sec1">
<title>Background</title>
<p>Extracting biological networks related to specific diseases or conditions from the scientific literature is an emerging problem that may be solved with the aid of text mining approaches. Biological networks are important features used for modelling, analysis and simulation of biological systems [
<xref ref-type="bibr" rid="CR1">1</xref>
], and for the development of hypotheses from data-sets [
<xref ref-type="bibr" rid="CR2">2</xref>
–
<xref ref-type="bibr" rid="CR6">6</xref>
]. In general, the inference of an interaction network from text can be divided into three subtasks: 1) determination of the source of the text to be searched, 2) identification of the entities to be extracted (genes, proteins, metabolites, diseases), and 3) inference of potential relationships between the selected entities. Once these subtasks are resolved, the entities and their relationships can be mapped to the nodes and edges of a biological network. Subtasks two and three are both amenable to text mining methods.</p>
<p>As for the first subtask, the source of text to be mined can be abstracts or full text articles in collections of scientific publications. While the use of abstracts would be more advantageous due to their concise information content [
<xref ref-type="bibr" rid="CR7">7</xref>
–
<xref ref-type="bibr" rid="CR9">9</xref>
], an increasing number of text mining approaches make use of full text journals [
<xref ref-type="bibr" rid="CR10">10</xref>
]. However, full text publications pose technical challenges because they come in different formats (PDF, HTML) and their substructure is not uniform across journals. In terms of the second subtask, there are many examples in the literature in which text mining approaches have been used to infer a relationship between biomarker genes and diseases or disorders, including, for example, insulin-resistance [
<xref ref-type="bibr" rid="CR11">11</xref>
], Alzheimer disease [
<xref ref-type="bibr" rid="CR12">12</xref>
], breast cancer [
<xref ref-type="bibr" rid="CR13">13</xref>
], prostate cancer [
<xref ref-type="bibr" rid="CR14">14</xref>
], and respiratory disease [
<xref ref-type="bibr" rid="CR15">15</xref>
]. Therefore, it is possible to develop putative associations between biomarkers and glaucoma with a text mining approach. The third subtask is to develop a relation extraction (RE) process to reliably infer binary relationships between the entities identified in subtask two. The relationships sought depend on the type of entities involved. For example, if an entity is a transcription factor, then textual terms that reflect regulation (up-regulate, down-regulate, etc.) can be sought in the relation extraction process. If an entity is a protein, then textual terms that reflect activation or binding are sought in the relation extraction process [
<xref ref-type="bibr" rid="CR16">16</xref>
,
<xref ref-type="bibr" rid="CR17">17</xref>
]. RE can be a closed or an open process. It is closed when a set of relations is determined a priori, such as “activate”, “up-regulate”, and “express”, and the extractor predicts one of this finite, fixed set of relations. It is open when no relations are specified in advance [
<xref ref-type="bibr" rid="CR18">18</xref>
]. For example, an open RE system that runs over the sentence “
<italic>HSPA6</italic>
is a potential target gene of
<italic>FOXC1</italic>
”, will list the following binary relation:
<disp-formula id="Equa">
<alternatives>
<tex-math id="M1">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$ \left( HSPA 6,\ \mathrm{is}\ \mathrm{a}\ \mathrm{target}\ \mathrm{gene}, FOXC1\right) $$\end{document}</tex-math>
<mml:math id="M2">
<mml:mfenced close=")" open="(">
<mml:mrow>
<mml:mi mathvariant="italic">HSPA</mml:mi>
<mml:mn mathvariant="italic">6</mml:mn>
<mml:mo>,</mml:mo>
<mml:mspace width="0.25em"></mml:mspace>
<mml:mi mathvariant="normal">is</mml:mi>
<mml:mspace width="0.25em"></mml:mspace>
<mml:mi mathvariant="normal">a</mml:mi>
<mml:mspace width="0.25em"></mml:mspace>
<mml:mi mathvariant="normal">target</mml:mi>
<mml:mspace width="0.25em"></mml:mspace>
<mml:mi mathvariant="normal">gene</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi mathvariant="italic">FOXC</mml:mi>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:mfenced>
</mml:math>
<graphic xlink:href="13040_2016_96_Article_Equa.gif" position="anchor"></graphic>
</alternatives>
</disp-formula>
</p>
<p>On the other hand, if a closed RE is used, this relation will not be extracted unless the relation “target” was included in the set of relations determined a priori. In general, a closed RE is useful when extracting relations from scientific literature, while an open RE is suitable when extracting relations from the web [
<xref ref-type="bibr" rid="CR19">19</xref>
].</p>
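<p>The difference between open and closed RE can be illustrated with a minimal Python sketch. It is not the extractor used in this study: the Triple type, the function names, and the crude substring test used to match relation terms are assumptions made purely for illustration, and the closed vocabulary simply reuses the example terms given above.</p>
<preformat preformat-type="code">
```python
from typing import NamedTuple, Optional


class Triple(NamedTuple):
    """A binary relation in (subject, relation, object) form."""
    subject: str
    relation: str
    obj: str


# Hypothetical closed vocabulary, reusing the example terms from the text.
CLOSED_RELATIONS = {"activate", "up-regulate", "express"}


def open_re(triple: Triple) -> Triple:
    """Open RE keeps whatever relation phrase was found in the sentence."""
    return triple


def closed_re(triple: Triple, vocabulary: set) -> Optional[Triple]:
    """Closed RE keeps a triple only if its relation matches the fixed vocabulary.

    Matching is a plain substring test, which is an illustrative assumption.
    """
    if any(term in triple.relation for term in vocabulary):
        return triple
    return None


candidate = Triple("HSPA6", "is a target gene", "FOXC1")
print(open_re(candidate))                                   # kept as extracted
print(closed_re(candidate, CLOSED_RELATIONS))               # dropped
print(closed_re(candidate, CLOSED_RELATIONS | {"target"}))  # kept once "target" is allowed
```
</preformat>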
<p>Text mining services have evolved rapidly to become an important component of inference pipelines. The next generation of text mining approaches has to deal with the construction of complete text mining systems that aid the inference of interactions or associations between biological entities. OntoGene [
<xref ref-type="bibr" rid="CR20">20</xref>
], Anni [
<xref ref-type="bibr" rid="CR21">21</xref>
], RLIMS-P [
<xref ref-type="bibr" rid="CR22">22</xref>
], and CRAB [
<xref ref-type="bibr" rid="CR23">23</xref>
] are examples of such next generation systems. In terms of usage, OntoGene is considered the most integrative because it allows the detection of entities and relationships from selected categories of entities, such as proteins, genes, drugs, diseases, and chemicals. Anni, on the other hand, has the advantage of introducing an ontology-based interface to MEDLINE, and it is capable of retrieving documents for several classes of biomedical concepts. RLIMS-P and CRAB 2.0, in contrast, are topic-specific approaches: RLIMS-P targets protein phosphorylation and CRAB 2.0 targets cancer risk assessment.</p>
<p>The goal of this study is to initiate the development of a glaucoma interaction network with the aid of text mining the open access scientific literature housed in PubMed Central (PMC). According to the Glaucoma Research Foundation (GRF), glaucoma is the second leading cause of blindness [
<xref ref-type="bibr" rid="CR24">24</xref>
]. It is an invisible disease and gradually steals sight without warning. Generally, it cannot be cured, but it can be controlled [
<xref ref-type="bibr" rid="CR25">25</xref>
]. Some reported glaucoma interaction networks were based on genome wide association studies (GWAS) [
<xref ref-type="bibr" rid="CR26">26</xref>
,
<xref ref-type="bibr" rid="CR27">27</xref>
] while others focused on interaction networks from genome wide expression studies (GWES) [
<xref ref-type="bibr" rid="CR28">28</xref>
,
<xref ref-type="bibr" rid="CR29">29</xref>
] but none have yet been based solely on text mining of the vast swath of PMC literature, where all types of glaucoma studies are covered. Such a network is expected to have a wider coverage than prior efforts because it will not be inferred from a particular type of study but rather from all types of studies related to glaucoma.</p>
</sec>
<sec id="Sec2">
<title>Methods</title>
<p>Text mining enables the discovery of useful knowledge from unstructured or semi-structured text [
<xref ref-type="bibr" rid="CR30">30</xref>
,
<xref ref-type="bibr" rid="CR31">31</xref>
] which fits the goal of this study. Figure 
<xref rid="Fig1" ref-type="fig">1</xref>
is the flow diagram that shows how the results in this study are generated. The text mining pipeline (Fig. 
<xref rid="Fig2" ref-type="fig">2</xref>
), which was used in step 3 of the flow diagram, takes as input each article that contains information to be extracted. The article is first segmented into its constituent sentences using a segmenter. Each sentence is then split into its constituent words, called tokens, using a tokenizer. Subsequently, part-of-speech (POS) tagging is applied to each token to identify the role of each word within the sentence. Named entity recognition (NER) is then used to identify the target entities, which are gene names. Next, a relation extraction (RE) routine is applied to extract the relations present within each sentence. The relations are then validated, where possible, against an existing reference knowledge base. Finally, entities and relations are translated into an interaction network. The main tasks in our methodology are:
<fig id="Fig1">
<label>Fig. 1</label>
<caption>
<p>The workflow pipeline followed to build the glaucoma interaction network. Step 1: PubMed Central is queried for glaucoma related articles. Step 2: all glaucoma articles are collected and a glaucoma collection is constructed. Step 3: each document in the resulting collection is processed using the text mining pipeline detailed in Fig. 
<xref rid="Fig2" ref-type="fig">2</xref>
and a set of relations is obtained. Step 4: relations are stored into a database and filtered using SQL queries. Step 5: Filtered relations are subjected to manual inspection to identify meaningful relations worthy of validation. Step 6: inspected relations are then validated and evaluated against external reference databases. Step 7: validated relations are mapped to nodes and edges to form a potential glaucoma network. Step 8: network analysis of the resulting network is performed. The left panel contains external databases needed by each step of the workflow. See Table 
<xref rid="Tab1" ref-type="table">1</xref>
for the definitions of BD and BO</p>
</caption>
<graphic xlink:href="13040_2016_96_Fig1_HTML" id="MO1"></graphic>
</fig>
<fig id="Fig2">
<label>Fig. 2</label>
<caption>
<p>The Text Mining Pipeline. The text mining pipeline that corresponds to step 3 in Fig. 
<xref rid="Fig1" ref-type="fig">1</xref>
. First, the segmenter module segments each article into its constituent sentences denoted s
<sub>1</sub>
to s
<sub>n</sub>
. Second, the sentence tokenizer module tokenizes each sentence into a bag of words denoted w
<sub>1</sub>
to w
<sub>n</sub>
. Third, the part-of-speech (POS) module identifies the role of each word in a sentence. Fourth, the named entity recognition (NER) module extracts gene mentions E
<sub>1</sub>
, E
<sub>2</sub>
, E
<sub>n</sub>
from the words of the sentence. Finally, the relation extraction (RE) module extracts relations R
<sub>1</sub>
, R
<sub>2</sub>
, R
<sub>n</sub>
from the words of the sentence. The output interaction from applying this sequence of modules is in the form: “Es, Rs, Es” and is saved in a database of interactions</p>
</caption>
<graphic xlink:href="13040_2016_96_Fig2_HTML" id="MO2"></graphic>
</fig>
</p>
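<p>As a rough illustration of the first stages of the pipeline in Fig. <xref rid="Fig2" ref-type="fig">2</xref>, the following Python sketch segments a text into sentences, tokenizes each sentence, and tags gene mentions. It is a toy stand-in under stated assumptions, not the LingPipe-based implementation used in this study: the splitting rules are naive regular expressions, the small gene lexicon is hypothetical and replaces a trained NER model, and the POS tagging and RE stages are omitted.</p>
<preformat preformat-type="code">
```python
import re

# Hypothetical gene lexicon used in place of a trained NER model.
GENE_LEXICON = {"MYOC", "OPTN", "HSPA6", "FOXC1"}


def segment(article: str) -> list:
    """Split an article into sentences at '.', '!' or '?' (a naive rule)."""
    return [s.strip() for s in re.findall(r"[^.!?]+[.!?]", article)]


def tokenize(sentence: str) -> list:
    """Split a sentence into word tokens, keeping digits and internal hyphens."""
    return re.findall(r"[A-Za-z0-9-]+", sentence)


def tag_genes(tokens: list) -> list:
    """Return the tokens that appear in the gene lexicon, in sentence order."""
    return [t for t in tokens if t in GENE_LEXICON]


article = "HSPA6 is a potential target gene of FOXC1. Glaucoma damages the optic nerve."
for sentence in segment(article):
    tokens = tokenize(sentence)
    print(sentence, tag_genes(tokens))
```
</preformat>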
<sec id="Sec3">
<title>Text selection and retrieval</title>
<p>Unlike PubMed, all articles in PubMed Central (PMC) are full text and open access. This makes PMC a suitable repository for mining full text articles. We used a PubMed medical subject headings (MeSH) terms query to collect all possible glaucoma related articles. PMC Open Access was queried for eight key terms related to glaucoma: “open-angle glaucoma”, “angle-closure glaucoma”, “secondary glaucoma”, “congenital glaucoma”, “hyper glaucoma”, “neovascular glaucoma”, “pigmentary dispersion glaucoma” and “open access”. The resulting data set formed a corpus of 8,660 full length articles ready for mining. Articles were downloaded from PMC Open Access through the PMC OAI service [
<xref ref-type="bibr" rid="CR32">32</xref>
].</p>
</sec>
<sec id="Sec4">
<title>Entity selection and extraction</title>
<p>This study targets the extraction of gene associations which have been previously linked to glaucoma in the open access literature. Our target entities, broadly speaking, are “gene/gene products”. In our approach, we did not make any distinction between mentions of gene, mRNA, or protein in the text. For simplicity, we refer to gene/gene products as “gene”. An association can be a direct protein-protein interaction (PPI), whether predicted or found experimentally, a biomolecular event such as expression or localization, and/or a static relation. Our definition of association is a loose biological one that covers any relation that holds between genes or related entities and that is of biological, biomedical or health-related interest, without necessarily implying change [
<xref ref-type="bibr" rid="CR33">33</xref>
,
<xref ref-type="bibr" rid="CR42">42</xref>
]. It is for this reason that we have opted for an open RE strategy.</p>
<p>Our glaucoma corpus was segmented into 1,398,475 sentences with the LingPipe sentence segmenter [
<xref ref-type="bibr" rid="CR34">34</xref>
]. Genes within sentences were annotated using the LingPipe taggers CharLmHmmChunker and TokenShapeChunker. The performance of any tagger can be evaluated by testing the tagger on an annotated corpus. GenTag [
<xref ref-type="bibr" rid="CR35">35</xref>
], and GENIA [
<xref ref-type="bibr" rid="CR36">36</xref>
] are well known annotated biomedical corpora for evaluating the performance of taggers. CharLmHmmChunker is trained on GenTag while TokenShapeChunker is trained on GENIA. GenTag is the more generic of the two, whereas GENIA has annotations for 36 biomedical named entities and therefore provides a broader classification. Our motivation for using both taggers is to maximize the number of extracted genes [
<xref ref-type="bibr" rid="CR37">37</xref>
]. Both taggers accept full length articles as text files and output annotated files in which gene mentions are marked up in Standard Generalized Mark-up Language (SGML). The SGML output uses XML-style tags to describe each mentioned gene, but the user needs to specify an encoding for both input and output files, as well as the desired type of input/output files. For our particular study, we used the “UTF-8” encoding and plain text format for our input/output files.</p>
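<p>One simple way to combine the output of two taggers so as to maximize the number of gene mentions is to take the union of their annotated spans. The Python sketch below shows this under stated assumptions: the representation of a mention as a (start, end, text) tuple, the function name, and the example annotations are all hypothetical and do not reflect the LingPipe output format or the exact merging procedure used in this study.</p>
<preformat preformat-type="code">
```python
def merge_mentions(mentions_a: list, mentions_b: list) -> list:
    """Union of (start, end, text) spans from two taggers, sorted by position.

    Identical spans reported by both taggers are kept only once.
    """
    return sorted(set(mentions_a) | set(mentions_b))


sentence = "OPTN interacts with MYOC and WDR36."
# Hypothetical annotations from two taggers (character offsets into `sentence`).
tagger_a = [(0, 4, "OPTN"), (20, 24, "MYOC")]
tagger_b = [(20, 24, "MYOC"), (29, 34, "WDR36")]

merged = merge_mentions(tagger_a, tagger_b)
for start, end, text in merged:
    print(start, end, text)
```
</preformat>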
</sec>
<sec id="Sec5">
<title>Benchmarking genes</title>
<p>A total of 305 glaucoma benchmark genes (BG) were used in this study. Of this number, 155 come from the Online Mendelian Inheritance in Man database, OMIM® [
<xref ref-type="bibr" rid="CR38">38</xref>
] (BO), and 180 come from the Disease Gene Network database DisGeNET release 2.1.0 (July 2014) [
<xref ref-type="bibr" rid="CR39">39</xref>
–
<xref ref-type="bibr" rid="CR41">41</xref>
] (BD). There were 30 benchmark genes (BC) common to both OMIM and DisGeNET databases (Table 
<xref rid="Tab1" ref-type="table">1</xref>
) indicating their likely importance to glaucoma. The union of the OMIM and DisGeNET genes was used as the set of benchmark genes for our intended glaucoma interaction network (Additional File
<xref rid="MOESM3" ref-type="media">3</xref>
). Table 
<xref rid="Tab1" ref-type="table">1</xref>
lists the benchmark gene types and their abbreviations. Any gene in the literature that was co-listed in one sentence with one of these BG was considered putatively associated with it. Sentences that contained only one gene were filtered out of the tagged sentences, to focus our search on sentences with two or more genes, at least one of which was a BG. Sentences that did not contain a BG were excluded. The purpose of this filtering step was to ensure that each retained sentence contained genes potentially interacting with some BG. The next task was to capture associations between the BG and other non-benchmark genes (NBG), thus constructing a glaucoma interaction network that captures potentially novel relations. The output of this step is a list of associated genes. Some genes were mentioned by gene name, gene synonym, or previous gene symbol, and all of these aliases were mapped to their HUGO approved gene symbol [
<xref ref-type="bibr" rid="CR42">42</xref>
].
<table-wrap id="Tab1">
<label>Table 1</label>
<caption>
<p>Glaucoma benchmark and non-benchmark genes used in building the network</p>
</caption>
<table frame="hsides" rules="groups">
<thead>
<tr>
<th>Abbreviation</th>
<th>Definition</th>
<th>Number</th>
<th>Percent</th>
</tr>
</thead>
<tbody>
<tr>
<td>BO</td>
<td>Benchmark glaucoma genes from OMIM database queried with “Glaucoma”</td>
<td>155</td>
<td>51 %</td>
</tr>
<tr>
<td>BD</td>
<td>Benchmark glaucoma genes from DisGeNET database queried with “Glaucoma”</td>
<td>180</td>
<td>59 %</td>
</tr>
<tr>
<td>BC</td>
<td>Benchmark glaucoma genes from the intersection of OMIM and DisGeNET databases</td>
<td>30 (BO∩BD)</td>
<td>10 %</td>
</tr>
<tr>
<td>BG</td>
<td>Benchmark glaucoma genes from union of BO and BD</td>
<td>305 (BO⋃BD)</td>
<td>100 %</td>
</tr>
<tr>
<td>NBG</td>
<td>Non-benchmark genes from PubMed Central</td>
<td>150</td>
<td>N/A</td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<p>For simplicity, benchmark genes used to build the interaction network are abbreviated as BG. If BG are obtained from OMIM, then we call them BO. If BG are obtained from DisGeNET, then we call them BD. Benchmark genes, common to OMIM and DisGeNET, are called BC. Genes that are not benchmark genes are called NBG. The definition, number and percentages of all benchmark genes are listed in columns 2 to 4</p>
</table-wrap-foot>
</table-wrap>
</p>
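<p>The benchmark counts and the sentence filter can be checked with a short Python sketch. The size of the union follows from inclusion-exclusion (155 plus 180 minus the 30 common genes gives 305), and a sentence is retained only if it mentions at least two genes, one of which is a benchmark gene. The function names and the toy benchmark set are hypothetical, and counting distinct gene symbols rather than raw mentions is an assumption.</p>
<preformat preformat-type="code">
```python
def union_size(n_bo: int, n_bd: int, n_common: int) -> int:
    """Size of the union of two gene sets by inclusion-exclusion."""
    return n_bo + n_bd - n_common


def keep_sentence(genes: list, benchmark: set) -> bool:
    """Keep a sentence with two or more distinct genes, at least one a benchmark gene."""
    unique = set(genes)
    return len(unique) >= 2 and not unique.isdisjoint(benchmark)


# Counts reported in Table 1: BO = 155, BD = 180, BC = 30.
print(union_size(155, 180, 30))

# Toy benchmark set, for illustration only.
toy_benchmark = {"MYOC", "OPTN"}
print(keep_sentence(["MYOC", "TBK1"], toy_benchmark))  # two genes, one benchmark
print(keep_sentence(["MYOC"], toy_benchmark))          # only one gene
print(keep_sentence(["TP53", "ELN"], toy_benchmark))   # no benchmark gene
```
</preformat>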
</sec>
<sec id="Sec6">
<title>Relation extraction</title>
<p>Sentences that contain putative pairs were subjected to the open source relation extractor ReVerb [
<xref ref-type="bibr" rid="CR43">43</xref>
] to extract binary relationships between gene mentions. ReVerb parses each sentence and identifies its main verb, then identifies the subject and object of the sentence. It outputs triplets of “E, Rel, E”, where E is an entity and Rel is a relationship (the main verb of the sentence). With each extracted relation, ReVerb also outputs a confidence score that reflects how certain it is of the extraction. Application of ReVerb identified 33,339 binary relations. Extracted relations were verified using the interaction databases GeneMANIA [
<xref ref-type="bibr" rid="CR44">44</xref>
] and the Biological General Repository for Interaction Datasets database (BioGRID release 3.4.129) [
<xref ref-type="bibr" rid="CR45">45</xref>
]. If the reference databases could not recognize a particular gene in a relation, the gene’s aliases were first retrieved from GeneCards [
<xref ref-type="bibr" rid="CR46">46</xref>
] and the relation was then verified using GeneMANIA or BioGRID.</p>
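<p>The verification step with its alias fallback might be sketched in Python as follows. This is an illustration under stated assumptions rather than the procedure actually run: the reference edges and the alias table are small hand-written stand-ins, not data retrieved from GeneMANIA, BioGRID, or GeneCards, and the function names and result labels do not correspond to any API of those resources.</p>
<preformat preformat-type="code">
```python
# Hypothetical stand-ins for the reference interaction databases and the alias source.
REFERENCE_EDGES = {frozenset({"OPTN", "MYOC"}), frozenset({"OPTN", "TBK1"})}
REFERENCE_GENES = {gene for edge in REFERENCE_EDGES for gene in edge}
ALIASES = {"GLC1A": ["MYOC"], "FIP2": ["OPTN"]}


def resolve(gene: str):
    """Return a symbol the reference recognizes, trying aliases if needed."""
    if gene in REFERENCE_GENES:
        return gene
    for alias in ALIASES.get(gene, []):
        if alias in REFERENCE_GENES:
            return alias
    return None


def verify(gene1: str, gene2: str) -> str:
    """Label an extracted gene pair against the reference edges."""
    a, b = resolve(gene1), resolve(gene2)
    if a is None or b is None:
        return "unverified"          # a gene could not be recognized at all
    if frozenset({a, b}) in REFERENCE_EDGES:
        return "known"               # a direct link is reported
    return "not in reference"        # both genes recognized, no direct link


print(verify("OPTN", "MYOC"))
print(verify("GLC1A", "OPTN"))   # resolved through the alias table
print(verify("MYOC", "TBK1"))
print(verify("ECD", "MYOC"))
```
</preformat>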
</sec>
<sec id="Sec7">
<title>Network construction</title>
<p>Extracted entities and relations were manually inspected and mapped to nodes and edges. The Gephi open source graph visualization software tool [
<xref ref-type="bibr" rid="CR47">47</xref>
] was used to develop a graphic representation of the extracted interaction network (Fig. 
<xref rid="Fig7" ref-type="fig">7</xref>
). Analysis of the generated network was carried out with the Cytoscape network analyzer [
<xref ref-type="bibr" rid="CR48">48</xref>
]. Enrichment analysis for the extracted genes was conducted through the PANTHER classification system version 10.0 (release May 2015) [
<xref ref-type="bibr" rid="CR49">49</xref>
], as well as the Database for Annotation, Visualization and Integrated Discovery (DAVID) [
<xref ref-type="bibr" rid="CR50">50</xref>
,
<xref ref-type="bibr" rid="CR51">51</xref>
], and the gene annotations co-occurrence discovery database (GeneCodis) [
<xref ref-type="bibr" rid="CR52">52</xref>
–
<xref ref-type="bibr" rid="CR54">54</xref>
].</p>
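<p>Mapping triplets to nodes and edges can be illustrated with a minimal Python sketch that builds an adjacency structure and derives node degrees. It is only a sketch under stated assumptions: the network in this study was built and analyzed with Gephi and Cytoscape, whereas the function name, the example triplets, and the treatment of edges as undirected and unweighted are hypothetical choices made here.</p>
<preformat preformat-type="code">
```python
from collections import defaultdict


def build_network(triples: list) -> dict:
    """Map (gene1, relation, gene2) triplets to an undirected adjacency dict.

    Repeated pairs collapse into one edge and self-loops are skipped.
    """
    adjacency = defaultdict(set)
    for gene1, _relation, gene2 in triples:
        if gene1 == gene2:
            continue
        adjacency[gene1].add(gene2)
        adjacency[gene2].add(gene1)
    return dict(adjacency)


# Hypothetical triplets, for illustration only.
triples = [
    ("OPTN", "interacts with", "MYOC"),
    ("OPTN", "binds", "TBK1"),
    ("MYOC", "interacts with", "OPTN"),  # same pair again, reversed
]

network = build_network(triples)
degree = {gene: len(neighbours) for gene, neighbours in network.items()}
print(degree)
```
</preformat>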
</sec>
</sec>
<sec id="Sec8">
<title>Results</title>
<p>The output from ReVerb may contain incorrect triplets. Therefore, all triplets were saved into a database and subjected to a filtering process, in which a query was constructed to extract the triplets that contained any biological entity name. Filtering the ReVerb relations resulted in a total of 550 triplets of “E, Rel, E”, where E is an entity (gene), and Rel is a verb associating the two entities. Some of the 550 filtered relations involved “POAG” (Primary Open Angle Glaucoma), while others involved “XFS” (Exfoliation syndrome), a developmental variant of glaucoma (Table 
<xref rid="Tab2" ref-type="table">2</xref>
). The relations included known relations, new relations, disconnected relations, redundant relations, misinterpreted relations, and unverified relations. A known relation is a previously published relation, for example, the relation between
<italic>OPTN</italic>
and
<italic>MYOC</italic>
. A relation is defined as new when no direct link between its entities is reported by GeneMANIA or BioGRID. If an indirect link can be established between relation entities through an intervening gene(s), then it is evidence for the possibility of the relation. If no indirect link can be established between relation entities, then it is a disconnected relation, in other words, a relation involving nodes that are currently considered to be disconnected. A redundant relation is a known or new relation, but is repeated many times. A misinterpreted relation is a relation involving an acronym that is identical to a gene symbol, for example
<italic>ECD</italic>
is an acronym for endothelial cell density, but was captured as the gene symbol for the ecdysoneless homolog gene. An unverified relation is a known or new relation involving a gene that is not identified by HUGO, GeneCards, or GeneMANIA. Filtering out redundant and misinterpreted relations resulted in a total of 257 unique triplets (“E, Rel, E”) that include 74 genes from the combined DisGeNET and OMIM databases (BG), 17 of which were common (BC) to both databases (BO, BD), and 150 related genes (NBG) uncovered from the PubMed Central literature database. In terms of the classification of the extracted relations (Fig.
<xref rid="Fig3" ref-type="fig">3</xref>
), 76 were previously known relations, 149 were new relations, 21 were unverified and yet interpretable relations (Table 
<xref rid="Tab3" ref-type="table">3</xref>
) and 11 relations involved disconnected nodes, whose linkage could not be confirmed at this time (Table 
<xref rid="Tab4" ref-type="table">4</xref>
) and yet some contextual evidence (Column 5 in Table 
<xref rid="Tab4" ref-type="table">4</xref>
) may suggest some plausible linkage. The 550 and the 257 relations can be found in Additional files
<xref rid="MOESM1" ref-type="media">1</xref>
and
<xref rid="MOESM2" ref-type="media">2</xref>
respectively.
<table-wrap id="Tab2">
<label>Table 2</label>
<caption>
<p>Genes related to Primary Open Angle Glaucoma (POAG) and Exfoliation syndrome (XFS)</p>
</caption>
<table frame="hsides" rules="groups">
<thead>
<tr>
<th>Gene</th>
<th>Disease</th>
<th>Confidence</th>
<th>Support</th>
</tr>
</thead>
<tbody>
<tr>
<td>
<italic>MYOC</italic>
</td>
<td>POAG</td>
<td char="." align="char">0.98</td>
<td>30</td>
</tr>
<tr>
<td>
<italic>LOXL1</italic>
</td>
<td>XFS</td>
<td char="." align="char">0.98</td>
<td>12</td>
</tr>
<tr>
<td>
<italic>TG</italic>
</td>
<td>XFS</td>
<td char="." align="char">0.98</td>
<td>1</td>
</tr>
<tr>
<td>
<italic>CYP1B1</italic>
</td>
<td>POAG</td>
<td char="." align="char">0.97</td>
<td>12</td>
</tr>
<tr>
<td>
<italic>GSTT1</italic>
</td>
<td>POAG</td>
<td char="." align="char">0.97</td>
<td>4</td>
</tr>
<tr>
<td>
<italic>CAV1</italic>
</td>
<td>POAG</td>
<td char="." align="char">0.97</td>
<td>2</td>
</tr>
<tr>
<td>
<italic>SPARC</italic>
</td>
<td>POAG</td>
<td char="." align="char">0.96</td>
<td>2</td>
</tr>
<tr>
<td>
<italic>CPE</italic>
</td>
<td>POAG</td>
<td char="." align="char">0.96</td>
<td>1</td>
</tr>
<tr>
<td>
<italic>APOE</italic>
</td>
<td>POAG</td>
<td char="." align="char">0.94</td>
<td>7</td>
</tr>
<tr>
<td>
<italic>CDKN2B-AS1</italic>
</td>
<td>POAG</td>
<td char="." align="char">0.94</td>
<td>3</td>
</tr>
<tr>
<td>
<italic>OPTN</italic>
</td>
<td>POAG</td>
<td char="." align="char">0.93</td>
<td>17</td>
</tr>
<tr>
<td>
<italic>NOS3</italic>
</td>
<td>POAG</td>
<td char="." align="char">0.93</td>
<td>5</td>
</tr>
<tr>
<td>
<italic>WDR36</italic>
</td>
<td>POAG</td>
<td char="." align="char">0.92</td>
<td>13</td>
</tr>
<tr>
<td>
<italic>GLC1A</italic>
</td>
<td>POAG</td>
<td char="." align="char">0.92</td>
<td>2</td>
</tr>
<tr>
<td>
<italic>GLC1N</italic>
</td>
<td>POAG</td>
<td char="." align="char">0.92</td>
<td>2</td>
</tr>
<tr>
<td>
<italic>GSTM1</italic>
</td>
<td>POAG</td>
<td char="." align="char">0.91</td>
<td>4</td>
</tr>
<tr>
<td>
<italic>PDIA5</italic>
</td>
<td>POAG</td>
<td char="." align="char">0.89</td>
<td>2</td>
</tr>
<tr>
<td>
<italic>GC</italic>
</td>
<td>XFS</td>
<td char="." align="char">0.88</td>
<td>1</td>
</tr>
<tr>
<td>
<italic>T</italic>
</td>
<td>XFS</td>
<td char="." align="char">0.88</td>
<td>2</td>
</tr>
<tr>
<td>
<italic>TTR</italic>
</td>
<td>POAG</td>
<td char="." align="char">0.87</td>
<td>2</td>
</tr>
<tr>
<td>
<italic>LOXL1</italic>
</td>
<td>POAG</td>
<td char="." align="char">0.86</td>
<td>3</td>
</tr>
<tr>
<td>
<italic>CDKN2B</italic>
</td>
<td>POAG</td>
<td char="." align="char">0.86</td>
<td>2</td>
</tr>
<tr>
<td>
<italic>SIX1</italic>
</td>
<td>POAG</td>
<td char="." align="char">0.85</td>
<td>2</td>
</tr>
<tr>
<td>
<italic>NTF4</italic>
</td>
<td>POAG</td>
<td char="." align="char">0.83</td>
<td>4</td>
</tr>
<tr>
<td>
<italic>CNTNAP2</italic>
</td>
<td>XFS</td>
<td char="." align="char">0.83</td>
<td>1</td>
</tr>
<tr>
<td>
<italic>GLC3A</italic>
</td>
<td>POAG</td>
<td char="." align="char">0.82</td>
<td>2</td>
</tr>
<tr>
<td>
<italic>OPA1</italic>
</td>
<td>POAG</td>
<td char="." align="char">0.81</td>
<td>2</td>
</tr>
<tr>
<td>
<italic>TBK1</italic>
</td>
<td>POAG</td>
<td char="." align="char">0.78</td>
<td>2</td>
</tr>
<tr>
<td>
<italic>MMP1</italic>
</td>
<td>XFS</td>
<td char="." align="char">0.67</td>
<td>1</td>
</tr>
<tr>
<td>
<italic>MMP3</italic>
</td>
<td>XFS</td>
<td char="." align="char">0.67</td>
<td>1</td>
</tr>
<tr>
<td>
<italic>TP53</italic>
</td>
<td>POAG</td>
<td char="." align="char">0.66</td>
<td>1</td>
</tr>
<tr>
<td>
<bold>
<italic>ELN</italic>
</bold>
</td>
<td>
<bold>XFS</bold>
</td>
<td char="." align="char">
<bold>0.24</bold>
</td>
<td>
<bold>1</bold>
</td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<p>The gene and its related disease are listed under the “Gene” and “Disease” columns respectively. The confidence column is the maximum of all confidence values reported by ReVerb for the same relation, extracted from multiple articles. Relations with low confidence are bolded. The support column is the count of articles listing the same gene relation</p>
</table-wrap-foot>
</table-wrap>
<fig id="Fig3">
<label>Fig. 3</label>
<caption>
<p>Illustration of the four types of extracted relations found by GeneMANIA in the glaucoma corpus. The total number of relations extracted by the workflow was 257, distributed into 76 known, 149 new, 11 disconnected, and 21 unverified relations. Each type of relation is represented by a picture below it. A known relation is illustrated by three circles directly linked to each other, where a circle represents a gene. A new relation is illustrated by a dotted line between the blue and black genes, because an indirect path could be established from the blue to the black gene through the red gene. An unverified relation is illustrated by a question mark in the black gene and a dotted line between the blue and black genes. A disconnected relation is illustrated by a black gene disconnected from the rest of the connected genes</p>
</caption>
<graphic xlink:href="13040_2016_96_Fig3_HTML" id="MO3"></graphic>
</fig>
<table-wrap id="Tab3">
<label>Table 3</label>
<caption>
<p>Twenty-one extracted relations with unverified links from GeneMANIA</p>
</caption>
<table frame="hsides" rules="groups">
<thead>
<tr>
<th>Gene1</th>
<th>Gene2</th>
<th>Confidence</th>
<th>Unverified node</th>
<th>PMC Excerpt</th>
<th>PMCID/Year</th>
<th>Remark</th>
</tr>
</thead>
<tbody>
<tr>
<td>
<italic>CDKN2B-AS1</italic>
</td>
<td>
<italic>CDKN2B</italic>
</td>
<td char="." align="char">0.93</td>
<td>
<italic>CDKN2B-AS1</italic>
</td>
<td>CDKN2B-AS1 has been shown to be involved in the regulation of CDKN2B, CDKN2A and ARF expression.</td>
<td>PMC4132588/2014</td>
<td>
<italic>CDKN2B-AS1</italic>
is a
<italic>CDKN2B</italic>
antisense. GeneMANIA does not recognize antisense transcripts.</td>
</tr>
<tr>
<td>
<italic>CDKN2B-AS1</italic>
</td>
<td>
<italic>CDKN2A</italic>
</td>
<td char="." align="char">0.93</td>
<td>
<italic>CDKN2B-AS1</italic>
</td>
<td>CDKN2B-AS1 has been shown to be involved in the regulation of CDKN2B, CDKN2A and ARF expression.</td>
<td>PMC4132588/2014</td>
<td>CDKN2B-AS1 is CDKN2B antisense.
<break></break>
GeneMANIA does not recognize antisense transcripts</td>
</tr>
<tr>
<td>
<italic>CDKN2B-AS1</italic>
</td>
<td>
<italic>ARF</italic>
</td>
<td char="." align="char">0.93</td>
<td>
<italic>CDKN2B-AS1</italic>
</td>
<td>CDKN2B-AS1 has been shown to be involved in the regulation of CDKN2B, CDKN2A and ARF expression.</td>
<td>PMC4132588/2014</td>
<td>CDKN2B-AS1 is CDKN2B antisense.
<break></break>
GeneMANIA does not recognize antisense transcripts</td>
</tr>
<tr>
<td>
<italic>CDKN2BAS</italic>
</td>
<td>
<italic>CDKN2A</italic>
</td>
<td char="." align="char">0.92</td>
<td>
<italic>CDKN2BAS</italic>
</td>
<td>CDKN2BAS also regulates the expression of CDKN2A, a gene previously shown to be down-regulated in other neurodegenerative disorders, including Alzheimer’s disease, suggesting that regulation of CDKN2A expression by CDKN2BAS could also contribute to degeneration of the optic nerve in glaucoma.</td>
<td>PMC3343074/2012</td>
<td>CDKN2BAS is CDKN2B antisense. GeneMANIA does not recognize antisense transcripts</td>
</tr>
<tr>
<td>
<italic>CNTF</italic>
</td>
<td>
<italic>LIFRβ</italic>
</td>
<td char="." align="char">0.90</td>
<td>
<italic>LIFRβ</italic>
</td>
<td>In mouse, human OSM activates the heterodimer of LIF receptor β (LIFRβ) and gp130, like CNTF.</td>
<td>PMC4171539/2014</td>
<td>LIFRB is a mouse gene that GeneMANIA did not recognize</td>
</tr>
<tr>
<td>
<italic>miR410</italic>
</td>
<td>
<italic>VEGFA</italic>
</td>
<td char="." align="char">0.9</td>
<td>
<italic>miR410</italic>
</td>
<td>Protein levels of VEGFA were also down-regulated with miR410 overexpression and up-regulated with miR-410 interference.</td>
<td>PMC400246/2014</td>
<td>GeneMANIA does not recognize microRNAs.</td>
</tr>
<tr>
<td>
<italic>STAT1</italic>
</td>
<td>
<italic>ANRIL</italic>
</td>
<td char="." align="char">0.89</td>
<td>
<italic>STAT1</italic>
</td>
<td>The binding of STAT1 induces the expression of ANRIL, and represses CDKN2B in endothelial cells.</td>
<td>PMC3565320/2013</td>
<td>GeneMANIA does not recognize locus
<italic>ANRIL</italic>
</td>
</tr>
<tr>
<td>
<italic>siPITX2</italic>
</td>
<td>
<italic>DKK1</italic>
</td>
<td char="." align="char">0.83</td>
<td>
<italic>siPITX2</italic>
</td>
<td>DKK1 and KCNJ2 which were shown to be affected by PITX2 siRNAs by real time PCR experiments were each previously reported in one study.</td>
<td>PMC2654047/2009</td>
<td>
<italic>siPITX2</italic>
is short interfering
<italic>PITX2</italic>
. GeneMANIA does not recognize short interfering RNAs.</td>
</tr>
<tr>
<td>
<italic>siPITX2</italic>
</td>
<td>
<italic>KCNJ2</italic>
</td>
<td char="." align="char">0.83</td>
<td>
<italic>siPITX2</italic>
</td>
<td>DKK1 and KCNJ2 which were shown to be affected by PITX2 siRNAs by real time PCR experiments were each previously reported in one study.</td>
<td>PMC2654047/2009</td>
<td>
<italic>siPITX2</italic>
is short interfering
<italic>PITX2</italic>
. GeneMANIA does not recognize short interfering RNAs.</td>
</tr>
<tr>
<td>
<italic>XCPE1</italic>
</td>
<td>
<italic>LTBP2</italic>
</td>
<td char="." align="char">0.82</td>
<td>
<italic>XCPE1</italic>
</td>
<td>LTBP2 was predicted to be regulated by KLF4 (at 10 promoters), SP1 (at eight promoters), GATA4 and TEAD (at five promoters) and XCPE1 (at four promoters) was associated with LTBP2.</td>
<td>PMC4019825/2014</td>
<td>
<italic>XCPE1</italic>
is X gene core promoter element 1 (DNA element). GeneMANIA does not recognize
<italic>XCPE1</italic>
</td>
</tr>
<tr>
<td>
<italic>GLC3A</italic>
</td>
<td>
<italic>GLC3B</italic>
</td>
<td char="." align="char">0.78</td>
<td>
<italic>GLC3B</italic>
</td>
<td>To narrow down the potential candidate CNVs (genes) and match the identified CNVs to target regions and/or genes, we first focused on known chromosomal loci for PCG, namely GLC3A (2p2-p21), which harbors CYP1B1, GLC3B (1p36.2-p36.1), and GLC3C (14q23).</td>
<td>PMC3250374/2011</td>
<td>GeneMANIA does not recognize gene locus</td>
</tr>
<tr>
<td>
<italic>GLC3A</italic>
</td>
<td>
<italic>GLC3C</italic>
</td>
<td char="." align="char">0.78</td>
<td>
<italic>GLC3C</italic>
</td>
<td>To narrow down the potential candidate CNVs (genes) and match the identified CNVs to target regions and/or genes, we first focused on known chromosomal loci for PCG, namely GLC3A (2p2-p21), which harbors CYP1B1, GLC3B (1p36.2-p36.1), and GLC3C (14q23).</td>
<td>PMC3250374/2011</td>
<td>GeneMANIA does not recognize gene locus</td>
</tr>
<tr>
<td>
<italic>E50K</italic>
</td>
<td>
<italic>TBK1</italic>
</td>
<td char="." align="char">0.74</td>
<td>
<italic>E50K</italic>
</td>
<td>Recently, it was found that E50K mutant strongly interacted with TBK1, which evoked intracellular insolubility of OPTN, leading to improper OPTN transition from the endoplasmic reticulum to the Golgi body.</td>
<td>PMC4077773/2014</td>
<td>GeneMANIA recognizes
<italic>OPTN</italic>
not its mutated form.
<italic>E50K</italic>
is a mutation in the
<italic>OPTN</italic>
gene</td>
</tr>
<tr>
<td>
<italic>DCDC4</italic>
</td>
<td>
<italic>PAX6</italic>
</td>
<td char="." align="char">0.74</td>
<td>
<italic>DCDC4</italic>
</td>
<td>The 3′ deletion identified in family 86 contained ELP4 and DCD4, which are located downstream of PAX6.</td>
<td>PMC3044699/2011</td>
<td>
<italic>DCDC4</italic>
(doublecortin domain containing 4) is not found in HUGO</td>
</tr>
<tr>
<td>
<italic>MTMR2</italic>
</td>
<td>
<italic>NEFL</italic>
</td>
<td char="." align="char">0.60</td>
<td>
<italic>NEFL</italic>
</td>
<td>However, catalytically inactive CMT disease-related MTMR2 mutants lead to NEFL assembly defects and to pathologies similar to the one caused by NEFL mutations, suggesting that MTMR2 and NEFL may function in a common pathway in the development and maintenance of peripheral axons.</td>
<td>PMC3514635/2012</td>
<td>GeneMANIA does not recognize
<italic>NEFL</italic>
</td>
</tr>
<tr>
<td>
<italic>TTRV30M</italic>
</td>
<td>
<italic>EPO</italic>
</td>
<td char="." align="char">0.50</td>
<td>
<italic>TTRV30M</italic>
</td>
<td>It has been suggested that inhibition of EPO production could be caused by the toxicity of prefibrillar aggregates of TTR V30M.</td>
<td>PMC4087117/2014</td>
<td>GeneMANIA recognizes
<italic>TTR</italic>
rather than its mutated form
<italic>V30M</italic>
.
<italic>V30M</italic>
is a point mutation within
<italic>TTR</italic>
</td>
</tr>
<tr>
<td>
<bold>
<italic>BDNF-AS</italic>
</bold>
</td>
<td>
<bold>
<italic>EZH2</italic>
</bold>
</td>
<td char="." align="char">
<bold>0.40</bold>
</td>
<td>
<italic>BDNF-AS</italic>
</td>
<td>Further characterization of BDNF-AS indicates that BDNF-AS recruits EZH2 and the PRC2 complex to the BDNF promoter to repress BDNF transcription through H3K27me3 histone modifications.</td>
<td>PMC4047558/2014</td>
<td>
<italic>BDNF-AS</italic>
is
<italic>BDNF</italic>
antisense. GeneMANIA does not recognize antisense transcripts</td>
</tr>
<tr>
<td>
<bold>
<italic>BDNF-AS</italic>
</bold>
</td>
<td>
<bold>
<italic>PRC2</italic>
</bold>
</td>
<td char="." align="char">
<bold>0.40</bold>
</td>
<td>
<italic>BDNF-AS</italic>
</td>
<td>Further characterization of BDNF-AS indicates that BDNF-AS recruits EZH2 and the PRC2 complex to the BDNF promoter to repress BDNF transcription through H3K27me3 histone modifications.</td>
<td>PMC4047558/2014</td>
<td>
<italic>BDNF-AS</italic>
is
<italic>BDNF</italic>
antisense. GeneMANIA does not recognize antisense transcripts</td>
</tr>
<tr>
<td>
<bold>
<italic>BDNF-AS</italic>
</bold>
</td>
<td>
<bold>
<italic>BDNF</italic>
</bold>
</td>
<td char="." align="char">
<bold>0.40</bold>
</td>
<td>
<italic>BDNF-AS</italic>
</td>
<td>Further characterization of BDNF-AS indicates that BDNF-AS recruits EZH2 and the PRC2 complex to the BDNF promoter to repress BDNF transcription through H3K27me3 histone modifications.</td>
<td>PMC4047558/2014</td>
<td>
<italic>BDNF-AS</italic>
is
<italic>BDNF</italic>
antisense. GeneMANIA does not recognize antisense transcripts</td>
</tr>
<tr>
<td>
<bold>
<italic>siCSTA</italic>
</bold>
</td>
<td>
<bold>
<italic>MYOC</italic>
</bold>
</td>
<td char="." align="char">
<bold>0.35</bold>
</td>
<td>
<italic>siCSTA</italic>
</td>
<td>It would be interesting to investigate whether the application of an inhibitor to CSTA, such as its siRNA, could restore the normal MYOC processing and affect the outcome of the disease.</td>
<td>PMC3352898/2012</td>
<td>
<italic>siCSTA</italic>
is short interfering
<italic>CSTA</italic>
. GeneMANIA does not cover short interfering RNAs.</td>
</tr>
<tr>
<td>
<bold>
<italic>Glu50Lys</italic>
</bold>
</td>
<td>
<bold>
<italic>OPTN</italic>
</bold>
</td>
<td char="." align="char">
<bold>0.3</bold>
</td>
<td>
<italic>Glu50Lys</italic>
</td>
<td>More, recently, Minegishi and coworkers reported that the over-expression of a glaucoma causing-mutation in OPTN, Glu50Lys, produces an accumulation of insoluble OPTN protein that can be blocked with chemical inhibition of TBK1 activity in HEK293 cells.</td>
<td>PMC4038935/2014</td>
<td>
<italic>Glu50Lys</italic>
is a mutation in the
<italic>OPTN</italic>
gene</td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<p>The genes in each extracted relation are listed under the “Gene1” and “Gene2” columns, respectively. A measure of confidence, reported by ReVerb, is listed under the “Confidence” column, and relations with low confidence (below 0.5) are bolded. The unverified node is listed under the “Unverified node” column. The associated text that relates the two genes is listed under the “PMC Excerpt” column. Some genes were identified by their synonyms found in either GeneCards or GeneMANIA. The PMCID of the original article, coupled with the year of publication, is given under the “PMCID/Year” column. Important remarks and gene synonyms may be listed under the “Remark” column</p>
</table-wrap-foot>
</table-wrap>
<table-wrap id="Tab4">
<label>Table 4</label>
<caption>
<p>Eleven extracted relations with disconnected gene nodes from GeneMANIA</p>
</caption>
<table frame="hsides" rules="groups">
<thead>
<tr>
<th>Gene1</th>
<th>Gene2</th>
<th>Confidence</th>
<th>Disconnected node</th>
<th>PMC Excerpt</th>
<th>PMCID/Year</th>
<th>Remark</th>
</tr>
</thead>
<tbody>
<tr>
<td>
<italic>DCDC1</italic>
</td>
<td>
<italic>PAX6</italic>
</td>
<td char="." align="char">0.96</td>
<td>
<italic>DCDC1</italic>
</td>
<td>ELP4 and DCDC1 are located downstream of PAX6.</td>
<td>PMC2375324/2008</td>
<td></td>
</tr>
<tr>
<td>
<italic>ALB</italic>
</td>
<td>
<italic>ELP4</italic>
</td>
<td char="." align="char">0.93</td>
<td>
<italic>ELP4</italic>
</td>
<td>ALB was used to normalize ELP4 and PAX6 values for the detection of the relative copy number of the deletion region.</td>
<td>PMC3859656/2013</td>
<td></td>
</tr>
<tr>
<td>
<italic>ATOH7</italic>
</td>
<td>
<italic>FBN1</italic>
</td>
<td char="." align="char">0.88</td>
<td>
<italic>ATOH7</italic>
</td>
<td>We found 10 candidate POAG genes that were highly expressed in both the CPE and NPE (AKAP13, C1QBP, CHSY1, COL8A2, CYP1B1, FBN1, IBTK, MFN2, TMCO1, and TMEM248), three genes that were expressed significantly higher in the CPE (CDH1, CDKN2B, and SIX1), and six genes that were expressed significantly higher in the NPE (ATOH7, CYP1B1, FBN1, MYOC, PAX6, and SIX6).</td>
<td>PMC3909915/2014</td>
<td></td>
</tr>
<tr>
<td>
<italic>FBN1</italic>
</td>
<td>
<italic>TMEM248</italic>
</td>
<td char="." align="char">0.88</td>
<td>
<italic>TMEM248</italic>
</td>
<td>We found 10 candidate POAG genes that were highly expressed in both the CPE and NPE (AKAP13, C1QBP, CHSY1, COL8A2, CYP1B1, FBN1, IBTK, MFN2, TMCO1, and TMEM248), three genes that were expressed significantly higher in the CPE (CDH1, CDKN2B, and SIX1), and six genes that were expressed significantly higher in the NPE (ATOH7, CYP1B1, FBN1, MYOC, PAX6, and SIX6).</td>
<td>PMC3909915/2014</td>
<td></td>
</tr>
<tr>
<td>
<italic>GSK3B</italic>
</td>
<td>
<italic>MTHFR</italic>
</td>
<td char="." align="char">0.85</td>
<td>
<italic>MTHFR</italic>
</td>
<td>For example, GSK3B has a direct connection with IL4 and a secondary connection with MTHFR.</td>
<td>PMC2653647/2009</td>
<td></td>
</tr>
<tr>
<td>
<italic>GAPDH</italic>
</td>
<td>
<italic>VSX1</italic>
</td>
<td char="." align="char">0.85</td>
<td>
<italic>VSX1</italic>
</td>
<td>Each bar represents the relative expression of VSX1 normalized to GAPDH in a different tissue/age; mean ± SD (Sc: sclera, Co: cornea, Ir: iris, CB: ciliary body, Len: lens, Cho:</td>
<td>PMC2267740/2008</td>
<td></td>
</tr>
<tr>
<td>
<italic>GLS2</italic>
</td>
<td>
<italic>HMGB1</italic>
</td>
<td char="." align="char">0.80</td>
<td>
<italic>GLS2</italic>
</td>
<td>the HMGB1 inhibitor GA attenuated diabetes-induced upregulation of HMGB1 and downregulation of BDNF</td>
<td>PMC3671668/2013</td>
<td>
<italic>GLS2</italic>
is a synonym of
<italic>GA</italic>
</td>
</tr>
<tr>
<td>
<italic>SHH</italic>
</td>
<td>
<italic>ATOH7</italic>
</td>
<td char="." align="char">0.78</td>
<td>
<italic>ATOH7</italic>
</td>
<td>Thus the SHH and GDF11 regulate ATOH7, which in turn regulates Brn3b.</td>
<td>PMC2883590/2010</td>
<td></td>
</tr>
<tr>
<td>
<bold>
<italic>LMX1B</italic>
</bold>
</td>
<td>
<bold>
<italic>COL3A1</italic>
</bold>
</td>
<td char="." align="char">
<bold>0.45</bold>
</td>
<td>
<bold>
<italic>LMX1B</italic>
</bold>
</td>
<td>Recent immunohistological studies in NPS patients with severe glomerular disease suggest a possible regulation of type III collagen by LMX1B, while the homozygous</td>
<td>PMC2669506/2007</td>
<td>
<italic>COL3A1</italic>
is a synonym of Type_III_collagen</td>
</tr>
<tr>
<td>
<bold>
<italic>NPS</italic>
</bold>
</td>
<td>
<bold>
<italic>PAX6</italic>
</bold>
</td>
<td char="." align="char">
<bold>0.05</bold>
</td>
<td>
<bold>
<italic>NPS</italic>
</bold>
</td>
<td>Research has demonstrated that retinal neurons and RGCs are mainly comprised of anteriorized NPS that express PAX6 and OTX2.</td>
<td>PMC3747054/2013</td>
<td></td>
</tr>
<tr>
<td>
<bold>
<italic>NPS</italic>
</bold>
</td>
<td>
<bold>
<italic>OTX2</italic>
</bold>
</td>
<td char="." align="char">
<bold>0.05</bold>
</td>
<td>
<bold>
<italic>NPS</italic>
</bold>
</td>
<td>Research has demonstrated that retinal neurons and RGCs are mainly comprised of anteriorized NPS that express PAX6 and OTX2</td>
<td>PMC3747054/2013</td>
<td></td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<p>The genes in each extracted relation are listed under the “Gene1” and “Gene2” columns, respectively. A measure of confidence, reported by ReVerb, is listed under the “Confidence” column, and relations with low confidence (below 0.5) are bolded. The disconnected node in the relation is listed under the “Disconnected node” column. The associated text that relates the two genes is listed under the “PMC Excerpt” column. Some genes were identified by their synonyms found in either GeneCards or GeneMANIA. The PMCID of the original article, coupled with the year of publication, is given under the “PMCID/Year” column. Important remarks and gene synonyms may be listed under the “Remark” column</p>
</table-wrap-foot>
</table-wrap>
</p>
</sec>
<sec id="Sec9">
<title>Analysis and validation</title>
<p>The associations between the pair of entities within each of the 257 extracted triplets (E,Rel,E) were validated against both the GeneMANIA database and BioGRID. BioGRID agreed with GeneMANIA on only 24 of the previously known relations. Unlike GeneMANIA, BioGRID does not consider the entire gene network around a pair of genes to identify indirect relations. Therefore, all relations except these 24 known ones are new according to BioGRID. Most of the 21 unverified relations were due to entity symbols that GeneMANIA did not recognize at the time of writing, such as the antisense of a gene (
<italic>BDNF</italic>
-
<italic>AS</italic>
,
<italic>CDKN2B</italic>
-
<italic>AS</italic>
) or small interfering RNA for a particular gene (
<italic>siPITX2</italic>
,
<italic>siCSTA</italic>
), microRNA, general protein family name (M-opsin), and gene variants or mutations (
<italic>OPTN</italic>
variants:
<italic>Glu50Lys</italic>
or
<italic>E50K</italic>
). However, contextual evidence (text) from the PMC papers (col. 7 in Table 
<xref rid="Tab4" ref-type="table">4</xref>
) provides some support for these relations, based on the experiments reported in the mined literature. A summary of the different extracted relations and their percentages is listed in Table 
<xref rid="Tab5" ref-type="table">5</xref>
and the top fifty most frequent relations are depicted in Fig. 
<xref rid="Fig4" ref-type="fig">4</xref>
.
<table-wrap id="Tab5">
<label>Table 5</label>
<caption>
<p>Percentages of extracted relations</p>
</caption>
<table frame="hsides" rules="groups">
<thead>
<tr>
<th>Finding Type</th>
<th>Description</th>
<th>Percentage</th>
</tr>
</thead>
<tbody>
<tr>
<td>Known</td>
<td>Verified</td>
<td>76/257 ~ 30 %</td>
</tr>
<tr>
<td>New</td>
<td>Can be verified via one or more indirect paths from the known network</td>
<td>149/257 ~ 58 %</td>
</tr>
<tr>
<td>Disconnected</td>
<td>Potential discovery that can be verified by lab experiment in the future</td>
<td>11/257 ~ 4 %</td>
</tr>
<tr>
<td>Unverified</td>
<td>Gene symbols could not be found in GeneMANIA, HUGO or GeneCards</td>
<td>21/257 ~ 8 %</td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<p>The total number of unique and valid relations is 257, which are classified into known, new, disconnected, and unverified relations. The description and percentage of each class are given under the “Description” and “Percentage” columns</p>
</table-wrap-foot>
</table-wrap>
<fig id="Fig4">
<label>Fig. 4</label>
<caption>
<p>The top 50 gene pair occurrences in our filtered glaucoma corpus. The occurrence frequency of a pair is calculated as the number of articles that list this pair in their content. Multiple occurrences of a pair within one article are counted as one occurrence</p>
</caption>
<graphic xlink:href="13040_2016_96_Fig4_HTML" id="MO4"></graphic>
</fig>
</p>
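<p>The four-way classification summarized in Table 5 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the toy reference network, and the placeholder symbols (A to D, X) are hypothetical, and the breadth-first search is an assumed stand-in for the indirect-path lookup that a reference database such as GeneMANIA performs. Only the four counts at the end are taken from Table 5.</p>
<preformat>
```python
from collections import deque


def build_reference(edges, isolated=()):
    """Build an undirected adjacency map from (gene, gene) pairs.

    `isolated` lists symbols that are recognized but have no links
    (hypothetical helper, for illustration only).
    """
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    for gene in isolated:
        graph.setdefault(gene, set())
    return graph


def classify_relation(gene1, gene2, reference):
    """Return 'known', 'new', 'disconnected', or 'unverified' for a pair.

    Assumption: a symbol is "recognized" if it is a key of `reference`.
    """
    if gene1 not in reference or gene2 not in reference:
        return "unverified"      # symbol unknown to the reference database
    if gene2 in reference[gene1]:
        return "known"           # direct link already in the reference
    seen = {gene1}
    queue = deque([gene1])
    while queue:                 # breadth-first search for an indirect path
        node = queue.popleft()
        for neighbor in reference[node]:
            if neighbor == gene2:
                return "new"     # reachable only through other genes
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return "disconnected"        # recognized, but no path exists


# Toy reference network with placeholder symbols (not real gene data).
reference = build_reference([("A", "B"), ("B", "C")], isolated=["D"])
labels = {
    pair: classify_relation(pair[0], pair[1], reference)
    for pair in [("A", "B"), ("A", "C"), ("A", "D"), ("A", "X")]
}

# Counts reported in Table 5, converted to rounded percentages.
counts = {"known": 76, "new": 149, "disconnected": 11, "unverified": 21}
total = sum(counts.values())
percentages = {kind: round(100 * n / total) for kind, n in counts.items()}
```
</preformat>
<p>With the toy network, the pairs (A, B), (A, C), (A, D), and (A, X) fall into the known, new, disconnected, and unverified classes, respectively, and the rounded percentages reproduce those listed in Table 5.</p>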
<p>As mentioned in the Results section, the extracted network included 150 non-benchmark genes (NBG) related to the 74 benchmark genes (BG). The 150 NBG were subjected to enrichment analysis through the PANTHER, DAVID, and GeneCodis databases. We excluded the 74 BG from the functional analysis step to avoid intentionally enriching the results with biological processes and pathways that are already known to be related to glaucoma. PANTHER ranked apoptosis at the top of all biological processes associated with those genes (Fig. 
<xref rid="Fig5" ref-type="fig">5</xref>
), which is in line with the evidence that retinal ganglion cell death is a hallmark of glaucoma [
<xref ref-type="bibr" rid="CR55">55</xref>
]. The most enriched biological processes, with their associated false discovery rates (FDRs) and enrichment scores as reported by PANTHER and DAVID clustering, are listed in Table 
<xref rid="Tab6" ref-type="table">6</xref>
. Furthermore, PANTHER identified the gonadotropin-releasing hormone receptor (GnRHR) pathway (involving 8.1 % of the total genes on average) and the Wnt signalling pathway (involving 4.5 % of the total genes on average) as having the highest gene associations. Interestingly, several Wnt signalling target genes were recently identified as potential players in glaucoma pathogenesis [
<xref ref-type="bibr" rid="CR56">56</xref>
,
<xref ref-type="bibr" rid="CR57">57</xref>
]. The GnRHR pathway has been proposed to control central nervous system physiology and pathophysiology, modulating cognitive changes associated with aging and age-related neurodegenerative disorders [
<xref ref-type="bibr" rid="CR58">58</xref>
]. Combined pathway analysis by PANTHER and GeneCodis is shown with supporting literature (Fig. 
<xref rid="Fig6" ref-type="fig">6</xref>
and Table 
<xref rid="Tab7" ref-type="table">7</xref>
).
<fig id="Fig5">
<label>Fig. 5</label>
<caption>
<p>Biological processes associated with the extracted non-benchmark genes. A pie chart, generated with the aid of PANTHER, listing the biological processes associated with the 150 extracted non-benchmark genes</p>
</caption>
<graphic xlink:href="13040_2016_96_Fig5_HTML" id="MO5"></graphic>
</fig>
<table-wrap id="Tab6">
<label>Table 6</label>
<caption>
<p>Functional analysis of the 150 extracted non-benchmark genes</p>
</caption>
<table frame="hsides" rules="groups">
<thead>
<tr>
<th>Biological Process</th>
<th>Gene Count</th>
<th>Corrected
<italic>P</italic>
-value</th>
</tr>
</thead>
<tbody>
<tr>
<td>Regulation of apoptosis**</td>
<td>25</td>
<td>2.73E-06</td>
</tr>
<tr>
<td>Inflammatory Response*</td>
<td>12</td>
<td>0.002</td>
</tr>
<tr>
<td>Immune Response**</td>
<td>17</td>
<td>0.004</td>
</tr>
<tr>
<td>Regulation of response to stimulus**</td>
<td>9</td>
<td>0.01</td>
</tr>
<tr>
<td>Defense Response*</td>
<td>15</td>
<td>0.01</td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<p>Biological processes reported by DAVID are suffixed by * and are listed with their gene count and corrected
<italic>p</italic>
-value. Biological processes that are common to both PANTHER and DAVID are suffixed by ** and are listed with their gene count and corrected
<italic>p</italic>
-value, obtained from DAVID</p>
</table-wrap-foot>
</table-wrap>
<fig id="Fig6">
<label>Fig. 6</label>
<caption>
<p>Pathways associated with the extracted non-benchmark genes. Common pathways reported with the aid of PANTHER and GeneCodis for the 150 extracted non-benchmark genes</p>
</caption>
<graphic xlink:href="13040_2016_96_Fig6_HTML" id="MO6"></graphic>
</fig>
<table-wrap id="Tab7">
<label>Table 7</label>
<caption>
<p>Pathway analysis of the 150 extracted NBG</p>
</caption>
<table frame="hsides" rules="groups">
<thead>
<tr>
<th>Pathway name</th>
<th>Count of genes in pathway</th>
<th>FDR</th>
<th>% of genes in pathway</th>
<th>Supporting References</th>
</tr>
</thead>
<tbody>
<tr>
<td>Gonadotropin releasing hormone receptor pathway
<sup>b</sup>
</td>
<td>9</td>
<td></td>
<td>8.1</td>
<td>[
<xref ref-type="bibr" rid="CR58">58</xref>
]</td>
</tr>
<tr>
<td>Interleukin signaling pathway
<sup>b</sup>
</td>
<td>6</td>
<td></td>
<td>5.4</td>
<td>[
<xref ref-type="bibr" rid="CR69">69</xref>
]</td>
</tr>
<tr>
<td>Wnt signalling pathway
<sup>a</sup>
</td>
<td>5</td>
<td>0.006</td>
<td>4.2</td>
<td>[
<xref ref-type="bibr" rid="CR56">56</xref>
,
<xref ref-type="bibr" rid="CR57">57</xref>
]</td>
</tr>
<tr>
<td>Jak-STAT signaling pathway
<sup>a</sup>
</td>
<td>5</td>
<td>0.001</td>
<td>1.8</td>
<td>[
<xref ref-type="bibr" rid="CR70">70</xref>
]</td>
</tr>
<tr>
<td>PDGF signaling pathway
<sup>b</sup>
</td>
<td>5</td>
<td></td>
<td>4.5</td>
<td>[
<xref ref-type="bibr" rid="CR71">71</xref>
]</td>
</tr>
<tr>
<td>TGF-beta signaling pathway
<sup>a</sup>
</td>
<td>4</td>
<td>0.01</td>
<td>3.6</td>
<td>[
<xref ref-type="bibr" rid="CR72">72</xref>
]</td>
</tr>
<tr>
<td>Apoptosis signaling pathway
<sup>b</sup>
</td>
<td>2</td>
<td></td>
<td>1.8</td>
<td>[
<xref ref-type="bibr" rid="CR73">73</xref>
,
<xref ref-type="bibr" rid="CR74">74</xref>
]</td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<p>Common pathways, reported by both GeneCodis and PANTHER, are suffixed by
<sup>a</sup>
and the associated false discovery rate (FDR) from GeneCodis is reported. Pathways, reported by PANTHER, are suffixed by
<sup>b</sup>
. The percentage of total genes in the pathway is reported along with supporting references that link glaucoma to the pathway</p>
</table-wrap-foot>
</table-wrap>
</p>
<p>Our results are expected to be comprehensive, while only partially overlapping with other studies of glaucoma interaction networks. For example, our network shares only 5 and 29 genes with two previous studies [
<xref ref-type="bibr" rid="CR28">28</xref>
,
<xref ref-type="bibr" rid="CR29">29</xref>
], respectively. This suggests that interaction networks built with text mining approaches can be quite comprehensive, because they incorporate and integrate information from all types of studies. Our enrichment analysis also agreed with enrichments previously reported in glaucoma studies [
<xref ref-type="bibr" rid="CR29">29</xref>
], such as apoptosis and induction of apoptosis as underlying biological processes, and pathways such as the
<italic>PDGF</italic>
signaling pathway, Ras pathway, and apoptosis signaling pathway.</p>
<sec id="Sec10">
<title>Network features</title>
<p>The resulting graph is a scale-free network that follows the Barabási–Albert (BA) network model [
<xref ref-type="bibr" rid="CR59">59</xref>
]. A scale-free network is one whose node degrees follow a power-law distribution; in the BA model this arises because the probability of linking to a given node is proportional to the number of existing links,
<italic>k</italic>
, that the node already has. Our glaucoma network (Fig. 
<xref rid="Fig7" ref-type="fig">7</xref>
) consists of 224 nodes and 255 edges. Network analysis shows that the network has a diameter of 13 and a path length distribution as shown in Fig. 
<xref rid="Fig8" ref-type="fig">8</xref>
. The diameter and the path length distribution are quantitative measures of how well connected a network is, whereas the clustering coefficient describes how clustered it is. The network diameter is the longest of the shortest paths between all possible pairs of nodes in the network, while the path length distribution summarizes the number of steps along the shortest paths connecting all possible pairs of network nodes. The network has a relatively low clustering coefficient of 0.11, a property that appears to characterize most metabolic networks and protein interaction networks [
<xref ref-type="bibr" rid="CR60">60</xref>
,
<xref ref-type="bibr" rid="CR61">61</xref>
], indicating that low degree nodes tend to belong to highly connected neighborhoods, whereas high degree nodes tend to have neighbors that are less connected to each other. The node degree is the number of in-links and out-links for a particular node in the network. The network node degree distribution follows a power law (Fig. 
<xref rid="Fig9" ref-type="fig">9</xref>
), another property of scale free networks. Table 
<xref rid="Tab8" ref-type="table">8</xref>
lists the ten nodes with the highest degrees, which indicate hub entities in the network. To conclude, the current version of the extracted glaucoma interaction network is small but informative. Future versions of the network are expected to evolve closer to a small world network as more links between nodes are added.
<fig id="Fig7">
<label>Fig. 7</label>
<caption>
<p>Extracted glaucoma network. The glaucoma network is laid out with different node sizes. The node size reflects the degree of a gene, where the degree is the total number of in-degree and out-degree links. The nodes colored in cyan belong to the BC. The known relations are colored in black. The new extracted relations are colored in blue. The relations with disconnected nodes are colored in green. The relations with unverified nodes are colored in red</p>
</caption>
<graphic xlink:href="13040_2016_96_Fig7_HTML" id="MO7"></graphic>
</fig>
<fig id="Fig8">
<label>Fig. 8</label>
<caption>
<p>Glaucoma network path distribution by the Cytoscape network analyser</p>
</caption>
<graphic xlink:href="13040_2016_96_Fig8_HTML" id="MO8"></graphic>
</fig>
<fig id="Fig9">
<label>Fig. 9</label>
<caption>
<p>Glaucoma network node degree distribution. The glaucoma node degree distribution, generated by the Cytoscape network analyser, follows a power law fitted to the form
<italic>y</italic>
 = 137.67
<italic>x</italic>
<sup>−1.99</sup>
</p>
</caption>
<graphic xlink:href="13040_2016_96_Fig9_HTML" id="MO9"></graphic>
</fig>
<table-wrap id="Tab8">
<label>Table 8</label>
<caption>
<p>Genes (nodes) with the top 10 degrees in the extracted glaucoma interaction network</p>
</caption>
<table frame="hsides" rules="groups">
<thead>
<tr>
<th>Gene (node)</th>
<th>Degree</th>
</tr>
</thead>
<tbody>
<tr>
<td>
<italic>CYP1B1</italic>
</td>
<td>17</td>
</tr>
<tr>
<td>
<italic>FBN1</italic>
</td>
<td>14</td>
</tr>
<tr>
<td>
<italic>PAX6</italic>
</td>
<td>13</td>
</tr>
<tr>
<td>
<italic>MYOC</italic>
</td>
<td>11</td>
</tr>
<tr>
<td>
<italic>MFN2</italic>
</td>
<td>10</td>
</tr>
<tr>
<td>
<italic>OPTN</italic>
</td>
<td>9</td>
</tr>
<tr>
<td>
<italic>CKM</italic>
</td>
<td>9</td>
</tr>
<tr>
<td>
<italic>AKAP13</italic>
</td>
<td>9</td>
</tr>
<tr>
<td>
<italic>IBTK</italic>
</td>
<td>9</td>
</tr>
<tr>
<td>
<italic>TMCO1</italic>
</td>
<td>9</td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<p>The degree column represents the total number of a node’s ingoing and outgoing links. Note that
<italic>CYP1B1</italic>
heads the list with a total of 17 links</p>
</table-wrap-foot>
</table-wrap>
</p>
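<p>The measures discussed in this section (node degree, network diameter, and clustering coefficient) can be illustrated on a toy graph. The sketch below is plain Python written for illustration only; it is not the Cytoscape network analyser used for the reported values. The five-node graph and all function names are hypothetical, edges are treated as undirected, and the diameter is taken as the longest shortest path found within connected components.</p>
<preformat>
```python
from collections import deque
from itertools import combinations


def build_graph(edges):
    """Build an undirected adjacency map from (node, node) pairs."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def bfs_distances(graph, source):
    """Shortest path length (in steps) from `source` to each reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in graph[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist


def diameter(graph):
    """Longest of the shortest paths over all reachable node pairs."""
    return max(max(bfs_distances(graph, n).values()) for n in graph)


def clustering(graph, node):
    """Fraction of a node's neighbor pairs that are themselves linked."""
    neighbors = graph[node]
    k = len(neighbors)
    if k in (0, 1):
        return 0.0  # no neighbor pairs to compare
    links = sum(1 for u, v in combinations(neighbors, 2) if v in graph[u])
    return 2.0 * links / (k * (k - 1))


def average_clustering(graph):
    """Mean of the per-node clustering coefficients."""
    return sum(clustering(graph, n) for n in graph) / len(graph)


# Toy graph: a triangle (A, B, C) with a two-step tail (C-D-E).
toy = build_graph([("A", "B"), ("B", "C"), ("A", "C"), ("C", "D"), ("D", "E")])
degrees = {node: len(neighbors) for node, neighbors in toy.items()}
toy_diameter = diameter(toy)
toy_clustering = average_clustering(toy)
```
</preformat>
<p>In this toy graph, node C has the highest degree (3), the diameter is 3 (from A or B to E), and the average clustering coefficient is 7/15, because only the triangle contributes linked neighbor pairs.</p>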
</sec>
<sec id="Sec11">
<title>Performance evaluation</title>
<p>As described in the “
<xref rid="Sec2" ref-type="sec">Methods</xref>
” section, our text mining pipeline consists of three steps: 1) text retrieval, 2) entity extraction, and 3) relation extraction, each of which has a different associated level of performance. Text retrieval performance is evaluated based on the retrieval of relevant documents. Entity recognition performance is evaluated by whether most, if not all, genes are captured from the collection of glaucoma documents. Relation extraction performance is evaluated by the extraction of relevant relations. Performance evaluation is usually based on the precision (P), recall (R), and F1-score metrics. P is defined as the proportion of retrieved instances that are relevant, while R is the proportion of relevant instances that were retrieved. The F1-score is the harmonic mean of precision and recall. These metrics are given in Eq. (1):
<disp-formula id="Equb">
<alternatives>
<tex-math id="M3">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$ P=\frac{\text{\# of relevant retrieved instances}}{\text{\# of retrieved instances}},\quad R=\frac{\text{\# of relevant retrieved instances}}{\text{\# of relevant instances}},\quad F1=\frac{2\cdot P\cdot R}{P+R}\qquad (1) $$\end{document}</tex-math>
<mml:math id="M4">
<mml:mi>P</mml:mi>
<mml:mo mathvariant="italic">=</mml:mo>
<mml:mfrac>
<mml:mrow>
<mml:mo mathvariant="italic">#</mml:mo>
<mml:mspace width="0.5em"></mml:mspace>
<mml:mi mathvariant="italic">of</mml:mi>
<mml:mspace width="0.5em"></mml:mspace>
<mml:mi mathvariant="italic">relevant</mml:mi>
<mml:mspace width="0.5em"></mml:mspace>
<mml:mi mathvariant="italic">retrieved</mml:mi>
<mml:mspace width="0.5em"></mml:mspace>
<mml:mi mathvariant="italic">instances</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mo mathvariant="italic">#</mml:mo>
<mml:mspace width="0.5em"></mml:mspace>
<mml:mi mathvariant="italic">of</mml:mi>
<mml:mspace width="0.5em"></mml:mspace>
<mml:mi mathvariant="italic">retrieved</mml:mi>
<mml:mspace width="0.5em"></mml:mspace>
<mml:mi mathvariant="italic">instances</mml:mi>
</mml:mrow>
</mml:mfrac>
<mml:mo mathvariant="italic">,</mml:mo>
<mml:mspace width="0.5em"></mml:mspace>
<mml:mi>R</mml:mi>
<mml:mo mathvariant="italic">=</mml:mo>
<mml:mfrac>
<mml:mrow>
<mml:mo mathvariant="italic">#</mml:mo>
<mml:mspace width="0.5em"></mml:mspace>
<mml:mi mathvariant="italic">of</mml:mi>
<mml:mspace width="0.5em"></mml:mspace>
<mml:mi mathvariant="italic">relevant</mml:mi>
<mml:mspace width="0.5em"></mml:mspace>
<mml:mi mathvariant="italic">retrieved</mml:mi>
<mml:mspace width="0.5em"></mml:mspace>
<mml:mi mathvariant="italic">instances</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mo mathvariant="italic">#</mml:mo>
<mml:mspace width="0.5em"></mml:mspace>
<mml:mi mathvariant="italic">of</mml:mi>
<mml:mspace width="0.5em"></mml:mspace>
<mml:mi mathvariant="italic">relevant</mml:mi>
<mml:mspace width="0.5em"></mml:mspace>
<mml:mi mathvariant="italic">instances</mml:mi>
</mml:mrow>
</mml:mfrac>
<mml:mo mathvariant="italic">,</mml:mo>
<mml:mspace width="0.5em"></mml:mspace>
<mml:mi>F</mml:mi>
<mml:mn>1</mml:mn>
<mml:mo mathvariant="italic">=</mml:mo>
<mml:mfrac>
<mml:mrow>
<mml:mn mathvariant="italic">2</mml:mn>
<mml:mo mathvariant="italic">*</mml:mo>
<mml:mi>P</mml:mi>
<mml:mo mathvariant="italic">*</mml:mo>
<mml:mi>R</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>P</mml:mi>
<mml:mo mathvariant="italic">+</mml:mo>
<mml:mi>R</mml:mi>
</mml:mrow>
</mml:mfrac>
<mml:mtext>(1)</mml:mtext>
</mml:math>
<graphic xlink:href="13040_2016_96_Article_Equb.gif" position="anchor"></graphic>
</alternatives>
</disp-formula>
</p>
<p>The article counts and performance metrics for the text retrieval step are listed in Table 
<xref rid="Tab9" ref-type="table">9</xref>
and Table 
<xref rid="Tab10" ref-type="table">10</xref>
. For the entity extraction step, the GENIA tagger targets a broader domain. Hence, it can be expected to tag more varied entities (including localization, cell type, DNA, etc.), but possibly fewer genes/proteins than the GenTag tagger, which is focused on genes and proteins. Indeed, in our study, the GENIA tagger tagged 2410 genes while GenTag tagged 3422 genes. Table 
<xref rid="Tab11" ref-type="table">11</xref>
lists the performance measures, reported in [
<xref ref-type="bibr" rid="CR62">62</xref>
] for GENIA and the average performance measures, reported in [
<xref ref-type="bibr" rid="CR63">63</xref>
] and [
<xref ref-type="bibr" rid="CR64">64</xref>
] for GenTag.
<table-wrap id="Tab9">
<label>Table 9</label>
<caption>
<p>Distribution of articles in the text retrieval step, depending on their accessibility and relevance</p>
</caption>
<table frame="hsides" rules="groups">
<thead>
<tr>
<th></th>
<th>Relevant</th>
<th>Not Relevant</th>
<th>Total</th>
</tr>
</thead>
<tbody>
<tr>
<td>Retrieved open access articles</td>
<td>7425</td>
<td>1235</td>
<td>8660</td>
</tr>
<tr>
<td>Restricted access (not Retrieved)</td>
<td>22733</td>
<td>unknown</td>
<td>__</td>
</tr>
<tr>
<td>Total</td>
<td>31393</td>
<td>__</td>
<td>__</td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<p>Relevant articles are those that contain at least one occurrence of the word “glaucoma” in their text. The portion of restricted access articles that are not relevant is unknown to us at the time of writing this article</p>
</table-wrap-foot>
</table-wrap>
<table-wrap id="Tab10">
<label>Table 10</label>
<caption>
<p>Evaluation metrics for the retrieval step</p>
</caption>
<table frame="hsides" rules="groups">
<thead>
<tr>
<th>Metric</th>
<th>Value</th>
</tr>
</thead>
<tbody>
<tr>
<td>Precision</td>
<td>7425/8660 = 85 %</td>
</tr>
<tr>
<td>Recall</td>
<td>7425/31393 = 23 %</td>
</tr>
<tr>
<td>F1</td>
<td>36 %</td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<p>Evaluation metrics are computed based on Table 
<xref rid="Tab9" ref-type="table">9</xref>
. Note that recall is limited by the number of open access articles at this time</p>
</table-wrap-foot>
</table-wrap>
<table-wrap id="Tab11">
<label>Table 11</label>
<caption>
<p>Performance measures of the LingPipe NER taggers used</p>
</caption>
<table frame="hsides" rules="groups">
<thead>
<tr>
<th>Tagger</th>
<th>Entity Type</th>
<th>Recall (%)</th>
<th>Precision (%)</th>
<th>F-score (%)</th>
</tr>
</thead>
<tbody>
<tr>
<td rowspan="6">GENIA</td>
<td>Protein</td>
<td>81.41</td>
<td>65.82</td>
<td char="." align="char">72.79</td>
</tr>
<tr>
<td>DNA</td>
<td>66.76</td>
<td>65.64</td>
<td char="." align="char">66.2</td>
</tr>
<tr>
<td>RNA</td>
<td>68.64</td>
<td>60.45</td>
<td char="." align="char">64.29</td>
</tr>
<tr>
<td>Cell Line</td>
<td>59.6</td>
<td>56.12</td>
<td char="." align="char">57.81</td>
</tr>
<tr>
<td>Cell Type</td>
<td>70.54</td>
<td>78.51</td>
<td char="." align="char">74.31</td>
</tr>
<tr>
<td>Overall</td>
<td>75.78</td>
<td>67.45</td>
<td char="." align="char">71.37</td>
</tr>
<tr>
<td>GENTAG</td>
<td>Gene/Protein</td>
<td>79</td>
<td>88</td>
<td char="." align="char">70.8</td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<p>Reported measures for the GENIA tagger are based on the GENIA performance web site [
<xref ref-type="bibr" rid="CR62">62</xref>
], while the performance measures for the GENTAG tagger are the averages of the measures reported in [
<xref ref-type="bibr" rid="CR63">63</xref>
,
<xref ref-type="bibr" rid="CR64">64</xref>
]</p>
</table-wrap-foot>
</table-wrap>
</p>
<p>Because the relation extraction step depends on ReVerb, we report ReVerb’s performance from [
<xref ref-type="bibr" rid="CR43">43</xref>
], namely 65 % precision and 52 % recall. The F1 score associated with the relation extraction step is therefore estimated at 58 %.</p>
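<p>As an illustration of Eq. (1), the following minimal Python sketch recomputes the retrieval metrics from the counts in Table 9 and the F1 score implied by the ReVerb precision and recall quoted above. The function names are ours and are not part of the pipeline. Note that the unrounded values come out slightly higher than the whole-number percentages listed in Table 10.</p>

```python
def precision_recall_f1(relevant_retrieved, retrieved, relevant):
    """Return (P, R, F1) as fractions, following Eq. (1)."""
    p = relevant_retrieved / retrieved
    r = relevant_retrieved / relevant
    return p, r, f1_from(p, r)


def f1_from(p, r):
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    return 2 * p * r / (p + r) if (p + r) else 0.0


# Counts taken from Table 9: 7425 relevant retrieved articles,
# 8660 retrieved articles, 31393 relevant articles in total.
p, r, f1 = precision_recall_f1(7425, 8660, 31393)
print(f"retrieval: P={p:.3f} R={r:.3f} F1={f1:.3f}")

# ReVerb precision and recall as quoted in the text (65 % and 52 %).
reverb_f1 = f1_from(0.65, 0.52)
print(f"ReVerb:    F1={reverb_f1:.3f}")
```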
</sec>
</sec>
<sec id="Sec12">
<title>Discussion</title>
<p>While we have described an expansion of the known network of glaucoma related genes, we were surprised that fewer than a quarter of the genes extracted from DisGeNet and OMIM combined were connected to our network at this time (74/305 = 24 % BG). Community detection with Gephi’s Louvain modularity maximization algorithm [
<xref ref-type="bibr" rid="CR65">65</xref>
] partitioned the network into five distinct modular clusters (Fig. 
<xref rid="Fig10" ref-type="fig">10</xref>
). The Louvain modularity maximization algorithm compares the density of links inside clusters with the density of links between clusters and uses a resolution measure [
<xref ref-type="bibr" rid="CR66">66</xref>
] that measures the flows of probabilities in the network. The resulting five clusters formed a strongly connected subnetwork that is 41 % of the size of the original network (96 nodes and 148 edges), with only the giant influential components (nodes with high connectivity) of the network. Examination of the clusters showed that each has one or more of the BC genes, making a total of 7 BC. Almost the same ratio is observed with the clusters: fewer than a quarter of the 30 genes present in both the OMIM and DisGeNET databases (7/30 = 23 %) are connected to the clusters. As to the BC genes, the green cluster has
<italic>CYP1B1</italic>
and
<italic>MYOC</italic>
, the purple cluster has
<italic>OPTN</italic>
,
<italic>TBK1</italic>
, and
<italic>TNF</italic>
, the red, yellow, and blue clusters have
<italic>OPA1</italic>
,
<italic>FOXC1</italic>
, and
<italic>CKM</italic>
, respectively. Their representation here supports the notion that the 30 BC are the most highly ranked among all of the BG. Table 
<xref rid="Tab12" ref-type="table">12</xref>
profiles the different properties of each of the five clusters and Fig. 
<xref rid="Fig11" ref-type="fig">11</xref>
depicts the clusters and their sizes.
<fig id="Fig10">
<label>Fig. 10</label>
<caption>
<p>A smaller glaucoma interconnected subnetwork resulting from applying the modularity algorithm in Gephi on the original glaucoma network. The glaucoma network in Fig. 
<xref rid="Fig7" ref-type="fig">7</xref>
was subjected to the Gephi modularity clustering algorithm to identify communities and classes within the network. Five distinct classes colored in green, purple, red, yellow, and blue respectively, can be seen</p>
</caption>
<graphic xlink:href="13040_2016_96_Fig10_HTML" id="MO10"></graphic>
</fig>
<table-wrap id="Tab12">
<label>Table 12</label>
<caption>
<p>Clusters extracted from the giant components in the glaucoma network and their associated profiles</p>
</caption>
<table frame="hsides" rules="groups">
<thead>
<tr>
<th>Cluster</th>
<th># Nodes</th>
<th>BG</th>
<th>NBG</th>
<th>Node with highest degree</th>
<th>Known relations</th>
<th>New relations</th>
<th>Unverified relations</th>
<th>Disconnected relations</th>
</tr>
</thead>
<tbody>
<tr>
<td>Green</td>
<td>36</td>
<td>6</td>
<td>11</td>
<td>
<italic>CYP1B1</italic>
 = 17</td>
<td>10</td>
<td>14</td>
<td>8</td>
<td>4</td>
</tr>
<tr>
<td>Purple</td>
<td>23</td>
<td>1</td>
<td>9</td>
<td>
<italic>OPTN</italic>
 = 10</td>
<td>7</td>
<td>15</td>
<td>0</td>
<td>1</td>
</tr>
<tr>
<td>Red</td>
<td>15</td>
<td>0</td>
<td>5</td>
<td>
<italic>OPA1</italic>
 = 5</td>
<td>2</td>
<td>12</td>
<td>0</td>
<td>1</td>
</tr>
<tr>
<td>Yellow</td>
<td>13</td>
<td>2</td>
<td>5</td>
<td>
<italic>FOXC1</italic>
 = 7</td>
<td>2</td>
<td>11</td>
<td>0</td>
<td>0</td>
</tr>
<tr>
<td>Blue</td>
<td>9</td>
<td>2</td>
<td>7</td>
<td>
<italic>CKM</italic>
 = 9</td>
<td>4</td>
<td>5</td>
<td>0</td>
<td>0</td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<p>The giant components in the glaucoma network depicted in Fig. 
<xref rid="Fig7" ref-type="fig">7</xref>
are partitioned into five clusters. Clusters are listed in descending order of the number of nodes in each cluster. Cluster properties include the number of BG, the number of NBG, the node with the highest degree, and the number of relations of each type contained within the cluster</p>
</table-wrap-foot>
</table-wrap>
<fig id="Fig11">
<label>Fig. 11</label>
<caption>
<p>Modularity community classes and associated node sizes. The modularity classes are listed on the X axis while the number of nodes is on the Y axis. The highest number of nodes is 36 in modularity class 3, while the lowest is 9 in modularity class 0. The value of modularity, before and after applying the resolution, is listed on the top left of the figure. A resolution value of 9.0 was used with the modularity algorithm to obtain dense, well separated classes</p>
</caption>
<graphic xlink:href="13040_2016_96_Fig11_HTML" id="MO11"></graphic>
</fig>
</p>
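<p>To make the quantity behind this clustering concrete, the following minimal Python sketch computes the standard Newman modularity Q of a given partition and lists the edges that run between communities. It is an illustration only: it does not perform the Louvain search itself, it does not reproduce Gephi's resolution parameter, and the six-node network and its partition are hypothetical toy data rather than the glaucoma network.</p>

```python
from collections import defaultdict


def modularity(edges, community_of):
    """Newman modularity Q of an undirected, unweighted graph.

    Q = sum over communities c of  L_c / m  -  (d_c / (2 m)) ** 2
    where m is the number of edges, L_c the number of edges inside c,
    and d_c the total degree of the nodes in c.
    """
    m = len(edges)
    inside = defaultdict(int)   # L_c: edges with both endpoints in c
    degree = defaultdict(int)   # d_c: summed degree of the nodes in c
    for u, v in edges:
        degree[community_of[u]] += 1
        degree[community_of[v]] += 1
        if community_of[u] == community_of[v]:
            inside[community_of[u]] += 1
    return sum(inside[c] / m - (degree[c] / (2 * m)) ** 2 for c in degree)


def inter_community_edges(edges, community_of):
    """Edges whose endpoints fall in different communities."""
    return [(u, v) for u, v in edges if community_of[u] != community_of[v]]


# Hypothetical toy network: two triangles joined by a single edge.
edges = [("a", "b"), ("b", "c"), ("a", "c"),
         ("d", "e"), ("e", "f"), ("d", "f"),
         ("c", "d")]
community_of = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}

q = modularity(edges, community_of)
inter = inter_community_edges(edges, community_of)
print(f"Q = {q:.3f}, inter-community edges = {inter}")
```

<p>In this toy case the single edge between the two triangles plays the same role as the gene-gene associations that connect clusters in the network: removing it would leave the communities disconnected from each other.</p>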
<p>The text mining approach adopted in this study relies heavily on natural language processing (NLP) methods. In this study, we reported the first version of a glaucoma interaction network, with the intention of reporting refined versions as improvements in the text mining pipeline become available. For example, more specificity could likely be added to the results if a better tailored tagger were used. We relied on taggers that were trained on general biological texts that are not specific to glaucoma. Therefore, it is expected that not all entities will be captured from our article collection, and an in-house tagger trained on literature related to eye diseases and disorders would likely improve our outcome. Additionally, we note that the currently available glaucoma corpus is relatively small compared to corpora associated with other diseases such as prostate cancer or breast cancer. Since the number of extracted relations is proportional to the size of the corpus, it is desirable to increase the corpus size to discover more relations. There are several ways to increase the size of the available glaucoma corpus. For example, PubMed abstracts could be added to the current corpus, or only PubMed abstracts could be considered instead of PMC full text articles. Both options may significantly impact our future results.</p>
<p>Perhaps the most sought-after improvement, after enlarging the body of literature, would be to reconsider the relation extraction step. ReVerb is designed for open relation extraction and has not been adapted for closed relation extraction, in which the target verbs are known a priori. However, considering our small corpus, it would have negatively affected our extracted relations if we had been confined to a closed set of predetermined verbs [
<xref ref-type="bibr" rid="CR67">67</xref>
]. Another difficulty faced by ReVerb is handling complex sentence structures. Although many authors use a simple subject-verb-object structure to describe a relationship between two genes, it is not rare for authors to use more complex structures such as conjunctive sentences. The latter bear either multiple verb-based relationships or a single verb that describes a many-to-one or one-to-many relationship in a single sentence. Because of its shallow syntactic analysis, ReVerb’s maximum recall is limited, and it therefore misses most conjunctive sentences. A better, but probably more time-consuming, alternative is to use an NLP parser such as the Stanford parser [
<xref ref-type="bibr" rid="CR68">68</xref>
] to parse target sentences and then search the parse tree to capture the verb patterns that were missed.</p>
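<p>The one-to-many case described above can be illustrated with a small Python sketch that expands a single conjunctive sentence into one subject-verb-object triple per object. This is a toy regular-expression heuristic written for illustration only; it is neither ReVerb's method nor a substitute for a full syntactic parser, it does not handle the many-to-one case, and the gene names and verb list are hypothetical placeholders.</p>

```python
import re


def expand_conjunction(sentence, verbs):
    """Split 'A verb B, C and D' into one (subject, verb, object) triple per object.

    Returns an empty list when the sentence does not start with
    'single-token subject + one of the given verbs'.
    """
    verb_alternatives = "|".join(map(re.escape, verbs))
    pattern = re.compile(r"^(\w+)\s+(" + verb_alternatives + r")\s+(.+?)\.?$")
    match = pattern.match(sentence.strip())
    if match is None:
        return []
    subject, verb, objects = match.groups()
    # Objects are separated by commas and/or the word 'and'.
    parts = re.split(r"\s*,\s*(?:and\s+)?|\s+and\s+", objects)
    return [(subject, verb, obj) for obj in parts if obj]


# Placeholder sentence: one verb relating a subject to three objects.
triples = expand_conjunction("GENE1 activates GENE2, GENE3 and GENE4.",
                             ["activates", "interacts with"])
for triple in triples:
    print(triple)
```

<p>A parser-based approach would replace the regular expression with a traversal of the parse tree, which is what allows it to cope with sentence structures that a surface pattern of this kind cannot match.</p>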
</sec>
<sec id="Sec13">
<title>Conclusions</title>
<p>In this study, we have constructed a glaucoma interaction network using a text mining approach applied to open access PMC based literature. Our findings revealed 149 potential new relations. These newly discovered relationships link 74 benchmark genes (BG) present in the 2 databases, DisGeNet and OMIM, with 150 non-benchmark genes (NBG) present in the PubMed Central database, in the form of a small world interaction network. These findings include 21 unverified relations and 11 disconnected relations, which could be verified in the lab. The constructed network contains five distinct gene clusters in association with 7 BC. The 5 clusters are interconnected through 4 gene-gene associations which include:
<italic>OPA1-MFN2</italic>
,
<italic>PITX2-PAX6</italic>
,
<italic>MYOC-CKM</italic>
and
<italic>MYOC-OPTN</italic>
. Thus the larger network is only possible because of these 4 bridges. It is important to note that 2 of these 4 gene-gene bridges,
<italic>OPA1-MFN2</italic>
and
<italic>MYOC-OPTN</italic>
, were discovered through this text mining approach, which has linked genes in the DisGeNet and OMIM databases with the PubMed Central literature. Finally, we have discussed several important issues with text mining approaches which could aid future iterations of disease-based gene-interaction networks.</p>
</sec>
</body>
<back>
<app-group>
<app id="App1">
<sec id="Sec14">
<title>Additional files</title>
<p>
<media position="anchor" xlink:href="13040_2016_96_MOESM1_ESM.xlsx" id="MOESM1">
<label>Additional file 1:</label>
<caption>
<p>Filtered Extracted Relations. (XLSX 58 kb)</p>
</caption>
</media>
<media position="anchor" xlink:href="13040_2016_96_MOESM2_ESM.xlsx" id="MOESM2">
<label>Additional file 2:</label>
<caption>
<p>Unique Extracted Relations. (XLSX 32 kb)</p>
</caption>
</media>
<media position="anchor" xlink:href="13040_2016_96_MOESM3_ESM.xlsx" id="MOESM3">
<label>Additional file 3:</label>
<caption>
<p>Glaucoma Benchmark Genes (XLSX 38 kb)</p>
</caption>
</media>
</p>
</sec>
</app>
</app-group>
<fn-group>
<fn>
<p>
<bold>Competing interests</bold>
</p>
<p>The authors declare that they have no competing interests.</p>
</fn>
<fn>
<p>
<bold>Authors’ contributions</bold>
</p>
<p>MS initiated, designed and implemented the study and drafted the manuscript. ON oversaw the text mining approach and the result validation, and revised the manuscript. NGFC coordinated the study, provided biological interpretation and revisions to the manuscript drafts. All authors read and approved the final manuscript.</p>
</fn>
</fn-group>
<ack>
<title>Acknowledgements</title>
<p>We would like to thank Dr. Eric Rouchka for his valuable comments for improving the manuscript.</p>
<p>This work was supported in part by grants from the National Eye Institute R01EY017594 and the National Institute of General Medical Sciences P20 GM103436.</p>
</ack>
<ref-list id="Bib1">
<title>References</title>
<ref id="CR1">
<label>1.</label>
<mixed-citation publication-type="other">Christopher R, Dhiman A, Fox J, Gendelman R, Haberitcher T, Kagle D, Spizz G, Khalil IG, Hill C. Data-driven computer simulation of human cancer cell. Ann N Y Acad Sci. 2004;1020:132–53.</mixed-citation>
</ref>
<ref id="CR2">
<label>2.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Swanson</surname>
<given-names>DR</given-names>
</name>
</person-group>
<article-title>Fish oil, Raynaud’s syndrome, and undiscovered public knowledge</article-title>
<source>Perspect Biol Med</source>
<year>1986</year>
<volume>30</volume>
<issue>1</issue>
<fpage>7</fpage>
<lpage>18</lpage>
<pub-id pub-id-type="doi">10.1353/pbm.1986.0087</pub-id>
<pub-id pub-id-type="pmid">3797213</pub-id>
</element-citation>
</ref>
<ref id="CR3">
<label>3.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Srinivasan</surname>
<given-names>P</given-names>
</name>
<name>
<surname>Libbus</surname>
<given-names>B</given-names>
</name>
</person-group>
<article-title>Mining MEDLINE for implicit links between dietary substances and diseases</article-title>
<source>Bioinformatics</source>
<year>2004</year>
<volume>20</volume>
<issue>Suppl 1</issue>
<fpage>i290</fpage>
<lpage>296</lpage>
<pub-id pub-id-type="doi">10.1093/bioinformatics/bth914</pub-id>
<pub-id pub-id-type="pmid">15262811</pub-id>
</element-citation>
</ref>
<ref id="CR4">
<label>4.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Wren</surname>
<given-names>JD</given-names>
</name>
<name>
<surname>Bekeredjian</surname>
<given-names>R</given-names>
</name>
<name>
<surname>Stewart</surname>
<given-names>JA</given-names>
</name>
<name>
<surname>Shohet</surname>
<given-names>RV</given-names>
</name>
<name>
<surname>Garner</surname>
<given-names>HR</given-names>
</name>
</person-group>
<article-title>Knowledge discovery by automated identification and ranking of implicit relationships</article-title>
<source>Bioinformatics</source>
<year>2004</year>
<volume>20</volume>
<issue>3</issue>
<fpage>389</fpage>
<lpage>398</lpage>
<pub-id pub-id-type="doi">10.1093/bioinformatics/btg421</pub-id>
<pub-id pub-id-type="pmid">14960466</pub-id>
</element-citation>
</ref>
<ref id="CR5">
<label>5.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Chen</surname>
<given-names>H</given-names>
</name>
<name>
<surname>Sharp</surname>
<given-names>BM</given-names>
</name>
</person-group>
<article-title>Content-rich biological network constructed by mining PubMed abstracts</article-title>
<source>BMC Bioinformatics</source>
<year>2004</year>
<volume>5</volume>
<fpage>147</fpage>
<pub-id pub-id-type="doi">10.1186/1471-2105-5-147</pub-id>
<pub-id pub-id-type="pmid">15473905</pub-id>
</element-citation>
</ref>
<ref id="CR6">
<label>6.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>van der Eijk</surname>
<given-names>CC</given-names>
</name>
<name>
<surname>van Mulligen</surname>
<given-names>EM</given-names>
</name>
<name>
<surname>Kors</surname>
<given-names>JA</given-names>
</name>
<name>
<surname>Mons</surname>
<given-names>B</given-names>
</name>
<name>
<surname>van den Berg</surname>
<given-names>J</given-names>
</name>
</person-group>
<article-title>Constructing an associative concept space for literature‐based discovery</article-title>
<source>J Am Society Information Science Technology</source>
<year>2004</year>
<volume>55</volume>
<issue>5</issue>
<fpage>436</fpage>
<lpage>444</lpage>
<pub-id pub-id-type="doi">10.1002/asi.10392</pub-id>
</element-citation>
</ref>
<ref id="CR7">
<label>7.</label>
<mixed-citation publication-type="other">Zaremba S, Ramos-Santacruz M, Hampton T, Shetty P, Fedorko J, Whitmore J, Greene JM, Perna NT, Glasner JD, Plunkett 3rd G, et al. Text-mining of PubMed abstracts by natural language processing to create a public knowledge base on molecular mechanisms of bacterial enteropathogens. BMC Bioinformatics. 2009;10:177.</mixed-citation>
</ref>
<ref id="CR8">
<label>8.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Abulaish</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Dey</surname>
<given-names>L</given-names>
</name>
</person-group>
<article-title>Biological relation extraction and query answering from medline abstracts using ontology-based text mining</article-title>
<source>Data Knowledge Engineering</source>
<year>2007</year>
<volume>61</volume>
<issue>2</issue>
<fpage>228</fpage>
<lpage>262</lpage>
<pub-id pub-id-type="doi">10.1016/j.datak.2006.06.007</pub-id>
</element-citation>
</ref>
<ref id="CR9">
<label>9.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>He</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Wang</surname>
<given-names>Y</given-names>
</name>
<name>
<surname>Li</surname>
<given-names>W</given-names>
</name>
</person-group>
<article-title>PPI finder: a mining tool for human protein-protein interactions</article-title>
<source>PLoS One</source>
<year>2009</year>
<volume>4</volume>
<issue>2</issue>
<fpage>e4554</fpage>
<pub-id pub-id-type="doi">10.1371/journal.pone.0004554</pub-id>
<pub-id pub-id-type="pmid">19234603</pub-id>
</element-citation>
</ref>
<ref id="CR10">
<label>10.</label>
<mixed-citation publication-type="other">Tudor CO, Ross KE, Li G, Vijay-Shanker K, Wu CH, Arighi CN. Construction of phosphorylation interaction networks by text mining of full-length articles using the eFIP system. Database. 2015;2015:bav020.</mixed-citation>
</ref>
<ref id="CR11">
<label>11.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Yang</surname>
<given-names>Y</given-names>
</name>
<name>
<surname>Wang</surname>
<given-names>Y</given-names>
</name>
<name>
<surname>Zhou</surname>
<given-names>K</given-names>
</name>
<name>
<surname>Hong</surname>
<given-names>A</given-names>
</name>
</person-group>
<article-title>Constructing regulatory networks to identify biomarkers for insulin resistance</article-title>
<source>Gene</source>
<year>2014</year>
<volume>539</volume>
<issue>1</issue>
<fpage>68</fpage>
<lpage>74</lpage>
<pub-id pub-id-type="doi">10.1016/j.gene.2014.01.061</pub-id>
<pub-id pub-id-type="pmid">24512691</pub-id>
</element-citation>
</ref>
<ref id="CR12">
<label>12.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Malhotra</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Younesi</surname>
<given-names>E</given-names>
</name>
<name>
<surname>Bagewadi</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Hofmann-Apitius</surname>
<given-names>M</given-names>
</name>
</person-group>
<article-title>Linking hypothetical knowledge patterns to disease molecular signatures for biomarker discovery in Alzheimer’s disease</article-title>
<source>Genome Med</source>
<year>2014</year>
<volume>6</volume>
<issue>11</issue>
<fpage>97</fpage>
<pub-id pub-id-type="pmid">25484918</pub-id>
</element-citation>
</ref>
<ref id="CR13">
<label>13.</label>
<element-citation publication-type="book">
<person-group person-group-type="author">
<name>
<surname>Quan</surname>
<given-names>C</given-names>
</name>
<name>
<surname>Ren</surname>
<given-names>F</given-names>
</name>
</person-group>
<article-title>Gene–disease association extraction by text mining and network analysis</article-title>
<source>Proceedings of the 5th International Workshop on Health Text Mining and Information Analysis (Louhi)@ EACL</source>
<year>2014</year>
<fpage>54</fpage>
<lpage>63</lpage>
</element-citation>
</ref>
<ref id="CR14">
<label>14.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Ozgur</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Vu</surname>
<given-names>T</given-names>
</name>
<name>
<surname>Erkan</surname>
<given-names>G</given-names>
</name>
<name>
<surname>Radev</surname>
<given-names>DR</given-names>
</name>
</person-group>
<article-title>Identifying gene-disease associations using centrality on a literature mined gene-interaction network</article-title>
<source>Bioinformatics</source>
<year>2008</year>
<volume>24</volume>
<issue>13</issue>
<fpage>i277</fpage>
<lpage>285</lpage>
<pub-id pub-id-type="doi">10.1093/bioinformatics/btn182</pub-id>
<pub-id pub-id-type="pmid">18586725</pub-id>
</element-citation>
</ref>
<ref id="CR15">
<label>15.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Wu</surname>
<given-names>X</given-names>
</name>
<name>
<surname>Chen</surname>
<given-names>L</given-names>
</name>
<name>
<surname>Wang</surname>
<given-names>X</given-names>
</name>
</person-group>
<article-title>Network biomarkers, interaction networks and dynamical network biomarkers in respiratory diseases</article-title>
<source>Clin Transl Med</source>
<year>2014</year>
<volume>3</volume>
<fpage>16</fpage>
<pub-id pub-id-type="doi">10.1186/2001-1326-3-16</pub-id>
<pub-id pub-id-type="pmid">24995123</pub-id>
</element-citation>
</ref>
<ref id="CR16">
<label>16.</label>
<mixed-citation publication-type="other">Smith B, Ceusters W, Klagges B, Kohler J, Kumar A, Lomax J, Mungall C, Neuhaus F, Rector AL, Rosse C. Relations in biomedical ontologies. Genome Biol. 2005;6(5):R46.</mixed-citation>
</ref>
<ref id="CR17">
<label>17.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Skusa</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Rüegg</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Köhler</surname>
<given-names>J</given-names>
</name>
</person-group>
<article-title>Extraction of biological interaction networks from scientific literature</article-title>
<source>Brief Bioinform</source>
<year>2005</year>
<volume>6</volume>
<issue>3</issue>
<fpage>263</fpage>
<lpage>276</lpage>
<pub-id pub-id-type="doi">10.1093/bib/6.3.263</pub-id>
<pub-id pub-id-type="pmid">16212774</pub-id>
</element-citation>
</ref>
<ref id="CR18">
<label>18.</label>
<element-citation publication-type="book">
<person-group person-group-type="author">
<name>
<surname>Nguyen</surname>
<given-names>N</given-names>
</name>
<name>
<surname>Miwa</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Tsuruoka</surname>
<given-names>Y</given-names>
</name>
<name>
<surname>Tojo</surname>
<given-names>S</given-names>
</name>
</person-group>
<article-title>Open information extraction from biomedical literature using predicate-argument structure patterns</article-title>
<source>Proceedings of The 5th International Symposium on Languages in Biology and Medicine</source>
<year>2013</year>
<fpage>51</fpage>
<lpage>55</lpage>
</element-citation>
</ref>
<ref id="CR19">
<label>19.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Etzioni</surname>
<given-names>O</given-names>
</name>
<name>
<surname>Banko</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Soderland</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Weld</surname>
<given-names>DS</given-names>
</name>
</person-group>
<article-title>Open information extraction from the web</article-title>
<source>Communications ACM</source>
<year>2008</year>
<volume>51</volume>
<issue>12</issue>
<fpage>68</fpage>
<lpage>74</lpage>
<pub-id pub-id-type="doi">10.1145/1409360.1409378</pub-id>
</element-citation>
</ref>
<ref id="CR20">
<label>20.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Rinaldi</surname>
<given-names>F</given-names>
</name>
<name>
<surname>Clematide</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Marques</surname>
<given-names>H</given-names>
</name>
<name>
<surname>Ellendorff</surname>
<given-names>T</given-names>
</name>
<name>
<surname>Romacker</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Rodriguez-Esteban</surname>
<given-names>R</given-names>
</name>
</person-group>
<article-title>OntoGene web services for biomedical text mining</article-title>
<source>BMC Bioinformatics</source>
<year>2014</year>
<volume>15</volume>
<issue>14</issue>
<fpage>S6</fpage>
<pub-id pub-id-type="doi">10.1186/1471-2105-15-S14-S6</pub-id>
<pub-id pub-id-type="pmid">25472638</pub-id>
</element-citation>
</ref>
<ref id="CR21">
<label>21.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Jelier</surname>
<given-names>R</given-names>
</name>
<name>
<surname>Schuemie</surname>
<given-names>MJ</given-names>
</name>
<name>
<surname>Veldhoven</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Dorssers</surname>
<given-names>LC</given-names>
</name>
<name>
<surname>Jenster</surname>
<given-names>G</given-names>
</name>
<name>
<surname>Kors</surname>
<given-names>JA</given-names>
</name>
</person-group>
<article-title>Anni 2.0: a multipurpose text-mining tool for the life sciences</article-title>
<source>Genome Biol</source>
<year>2008</year>
<volume>9</volume>
<issue>6</issue>
<fpage>R96</fpage>
<pub-id pub-id-type="doi">10.1186/gb-2008-9-6-r96</pub-id>
<pub-id pub-id-type="pmid">18549479</pub-id>
</element-citation>
</ref>
<ref id="CR22">
<label>22.</label>
<mixed-citation publication-type="other">Torii M, Li G, Li Z, Oughtred R, Diella F, Celen I, Arighi CN, Huang H, Vijay-Shanker K, Wu CH. RLIMS-P: an online text-mining tool for literature-based extraction of protein phosphorylation information. Database. 2014;2014:bau081.</mixed-citation>
</ref>
<ref id="CR23">
<label>23.</label>
<mixed-citation publication-type="other">Guo Y, Séaghdha DO, Silins I, Sun L, Högberg J, Stenius U, Korhonen A. CRAB 2.0: A text mining tool for supporting literature review in chemical cancer risk assessment. COLING. 2014;2014:76.</mixed-citation>
</ref>
<ref id="CR24">
<label>24.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Kingman</surname>
<given-names>S</given-names>
</name>
</person-group>
<article-title>Glaucoma is second leading cause of blindness globally</article-title>
<source>Bull World Health Organ</source>
<year>2004</year>
<volume>82</volume>
<issue>11</issue>
<fpage>887</fpage>
<lpage>888</lpage>
<pub-id pub-id-type="pmid">15640929</pub-id>
</element-citation>
</ref>
<ref id="CR25">
<label>25.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Beidoe</surname>
<given-names>G</given-names>
</name>
<name>
<surname>Mousa</surname>
<given-names>SA</given-names>
</name>
</person-group>
<article-title>Current primary open-angle glaucoma treatments and future directions</article-title>
<source>Clin Ophthalmol</source>
<year>2012</year>
<volume>6</volume>
<fpage>1699</fpage>
<lpage>1707</lpage>
<pub-id pub-id-type="pmid">23118520</pub-id>
</element-citation>
</ref>
<ref id="CR26">
<label>26.</label>
<element-citation publication-type="book">
<person-group person-group-type="author">
<name>
<surname>Hu</surname>
<given-names>T</given-names>
</name>
<name>
<surname>Darabos</surname>
<given-names>C</given-names>
</name>
<name>
<surname>Cricco Me</surname>
<given-names>KE</given-names>
</name>
<name>
<surname>Moore</surname>
<given-names>JH</given-names>
</name>
</person-group>
<article-title>Genome-wide genetic interaction analysis of glaucoma using expert knowledge derived from human phenotype networks</article-title>
<source>Pacific Symposium on Biocomputing</source>
<year>2014</year>
<fpage>207</fpage>
<lpage>218</lpage>
</element-citation>
</ref>
<ref id="CR27">
<label>27.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Basu</surname>
<given-names>K</given-names>
</name>
<name>
<surname>Sen</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Ray</surname>
<given-names>K</given-names>
</name>
<name>
<surname>Ghosh</surname>
<given-names>I</given-names>
</name>
<name>
<surname>Datta</surname>
<given-names>K</given-names>
</name>
<name>
<surname>Mukhopadhyay</surname>
<given-names>A</given-names>
</name>
</person-group>
<article-title>Genetic association and gene-gene interaction of HAS2, HABP1 and HYAL3 implicate hyaluronan metabolic genes in glaucomatous neurodegeneration</article-title>
<source>Dis Markers</source>
<year>2012</year>
<volume>33</volume>
<issue>3</issue>
<fpage>145</fpage>
<lpage>154</lpage>
<pub-id pub-id-type="doi">10.1155/2012/390539</pub-id>
<pub-id pub-id-type="pmid">22960332</pub-id>
</element-citation>
</ref>
<ref id="CR28">
<label>28.</label>
<mixed-citation publication-type="other">Colak D, Morales J, Bosley TM, Al-Bakheet A, AlYounes B, Kaya N, Abu-Amero KK. Genome-Wide Expression Profiling of Patients with Primary Open Angle Glaucoma. Invest Ophthalmol Vis Sci. 2012;53(9):5899–904.</mixed-citation>
</ref>
<ref id="CR29">
<label>29.</label>
<mixed-citation publication-type="other">Nikolskaya T, Nikolsky Y, Serebryiskaya T, Zvereva S, Sviridov E, Dezso Z, Rahkmatulin E, Brennan RJ, Yankovsky N, Bhattacharya SK. Network analysis of human glaucomatous optic nerve head astrocytes. BMC Med Genomics. 2009;2(1):24.</mixed-citation>
</ref>
<ref id="CR30">
<label>30.</label>
<mixed-citation publication-type="other">Feldman R, Sanger J. The Text Mining Handbook: Advanced Approaches in Analyzing Unstructured Data. New York, NY, USA: Cambridge University Press; 2006.</mixed-citation>
</ref>
<ref id="CR31">
<label>31.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Mooney</surname>
<given-names>RJ</given-names>
</name>
<name>
<surname>Bunescu</surname>
<given-names>R</given-names>
</name>
</person-group>
<article-title>Mining knowledge from text using information extraction</article-title>
<source>ACM SIGKDD Explorations Newsletter</source>
<year>2005</year>
<volume>7</volume>
<issue>1</issue>
<fpage>3</fpage>
<lpage>10</lpage>
<pub-id pub-id-type="doi">10.1145/1089815.1089817</pub-id>
</element-citation>
</ref>
<ref id="CR32">
<label>32.</label>
<mixed-citation publication-type="other">The PMC Open Access Subset [
<ext-link ext-link-type="uri" xlink:href="http://www.ncbi.nlm.nih.gov/pmc/tools/openftlist/">http://www.ncbi.nlm.nih.gov/pmc/tools/openftlist/</ext-link>
]. Accessed 25 Mar 2015.</mixed-citation>
</ref>
<ref id="CR33">
<label>33.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Pyysalo</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Ohta</surname>
<given-names>T</given-names>
</name>
<name>
<surname>Tsujii</surname>
<given-names>J</given-names>
</name>
</person-group>
<article-title>An analysis of gene/protein associations at PubMed scale</article-title>
<source>J Biomed Semantics</source>
<year>2011</year>
<volume>2</volume>
<issue>5</issue>
<fpage>S5</fpage>
<pub-id pub-id-type="doi">10.1186/2041-1480-2-S5-S5</pub-id>
<pub-id pub-id-type="pmid">22166173</pub-id>
</element-citation>
</ref>
<ref id="CR34">
<label>34.</label>
<mixed-citation publication-type="other">Baldwin B, Carpenter B. LingPipe. 2003. Available from World Wide Web:
<ext-link ext-link-type="uri" xlink:href="http://alias-i.com/lingpipe/">http://alias-i.com/lingpipe/</ext-link>
. Accessed 25 Mar 2015.</mixed-citation>
</ref>
<ref id="CR35">
<label>35.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Tanabe</surname>
<given-names>L</given-names>
</name>
<name>
<surname>Xie</surname>
<given-names>N</given-names>
</name>
<name>
<surname>Thom</surname>
<given-names>LH</given-names>
</name>
<name>
<surname>Matten</surname>
<given-names>W</given-names>
</name>
<name>
<surname>Wilbur</surname>
<given-names>WJ</given-names>
</name>
</person-group>
<article-title>GENETAG: a tagged corpus for gene/protein named entity recognition</article-title>
<source>BMC Bioinformatics</source>
<year>2005</year>
<volume>6</volume>
<issue>1</issue>
<fpage>S3</fpage>
<pub-id pub-id-type="doi">10.1186/1471-2105-6-S1-S3</pub-id>
<pub-id pub-id-type="pmid">15960837</pub-id>
</element-citation>
</ref>
<ref id="CR36">
<label>36.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Kim</surname>
<given-names>JD</given-names>
</name>
<name>
<surname>Ohta</surname>
<given-names>T</given-names>
</name>
<name>
<surname>Tsujii</surname>
<given-names>J</given-names>
</name>
</person-group>
<article-title>Corpus annotation for mining biomedical events from literature</article-title>
<source>BMC Bioinformatics</source>
<year>2008</year>
<volume>9</volume>
<fpage>10</fpage>
<pub-id pub-id-type="doi">10.1186/1471-2105-9-10</pub-id>
<pub-id pub-id-type="pmid">18182099</pub-id>
</element-citation>
</ref>
<ref id="CR37">
<label>37.</label>
<element-citation publication-type="book">
<person-group person-group-type="author">
<name>
<surname>Krallinger</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Leitner</surname>
<given-names>F</given-names>
</name>
<name>
<surname>Valencia</surname>
<given-names>A</given-names>
</name>
</person-group>
<article-title>Assessment of the second BioCreative PPI task: automatic extraction of protein-protein interactions</article-title>
<source>Proceedings of the second biocreative challenge evaluation workshop</source>
<year>2007</year>
<fpage>41</fpage>
<lpage>54</lpage>
</element-citation>
</ref>
<ref id="CR38">
<label>38.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Hamosh</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Scott</surname>
<given-names>AF</given-names>
</name>
<name>
<surname>Amberger</surname>
<given-names>JS</given-names>
</name>
<name>
<surname>Bocchini</surname>
<given-names>CA</given-names>
</name>
<name>
<surname>McKusick</surname>
<given-names>VA</given-names>
</name>
</person-group>
<article-title>Online Mendelian Inheritance in Man (OMIM), a knowledgebase of human genes and genetic disorders</article-title>
<source>Nucleic Acids Res</source>
<year>2005</year>
<volume>33</volume>
<issue>suppl 1</issue>
<fpage>D514</fpage>
<lpage>D517</lpage>
<pub-id pub-id-type="pmid">15608251</pub-id>
</element-citation>
</ref>
<ref id="CR39">
<label>39.</label>
<mixed-citation publication-type="other">Pinero J, Queralt-Rosinach N, Bravo A, Deu-Pons J, Bauer-Mehren A, Baron M, Sanz F, Furlong LI. DisGeNET: a discovery platform for the dynamical exploration of human diseases and their genes. Database (Oxford). 2015;2015:bav028.</mixed-citation>
</ref>
<ref id="CR40">
<label>40.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Bauer-Mehren</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Bundschus</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Rautschka</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Mayer</surname>
<given-names>MA</given-names>
</name>
<name>
<surname>Sanz</surname>
<given-names>F</given-names>
</name>
<name>
<surname>Furlong</surname>
<given-names>LI</given-names>
</name>
</person-group>
<article-title>Gene-disease network analysis reveals functional modules in mendelian, complex and environmental diseases</article-title>
<source>PLoS One</source>
<year>2011</year>
<volume>6</volume>
<issue>6</issue>
<fpage>e20284</fpage>
<pub-id pub-id-type="doi">10.1371/journal.pone.0020284</pub-id>
<pub-id pub-id-type="pmid">21695124</pub-id>
</element-citation>
</ref>
<ref id="CR41">
<label>41.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Bauer-Mehren</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Rautschka</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Sanz</surname>
<given-names>F</given-names>
</name>
<name>
<surname>Furlong</surname>
<given-names>LI</given-names>
</name>
</person-group>
<article-title>DisGeNET: a Cytoscape plugin to visualize, integrate, search and analyze gene-disease networks</article-title>
<source>Bioinformatics</source>
<year>2010</year>
<volume>26</volume>
<issue>22</issue>
<fpage>2924</fpage>
<lpage>2926</lpage>
<pub-id pub-id-type="doi">10.1093/bioinformatics/btq538</pub-id>
<pub-id pub-id-type="pmid">20861032</pub-id>
</element-citation>
</ref>
<ref id="CR42">
<label>42.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Gray</surname>
<given-names>KA</given-names>
</name>
<name>
<surname>Yates</surname>
<given-names>B</given-names>
</name>
<name>
<surname>Seal</surname>
<given-names>RL</given-names>
</name>
<name>
<surname>Wright</surname>
<given-names>MW</given-names>
</name>
<name>
<surname>Bruford</surname>
<given-names>EA</given-names>
</name>
</person-group>
<article-title>Genenames.org: the HGNC resources in 2015</article-title>
<source>Nucleic Acids Res</source>
<year>2015</year>
<volume>43</volume>
<issue>D1</issue>
<fpage>D1079</fpage>
<lpage>D1085</lpage>
<pub-id pub-id-type="doi">10.1093/nar/gku1071</pub-id>
<pub-id pub-id-type="pmid">25361968</pub-id>
</element-citation>
</ref>
<ref id="CR43">
<label>43.</label>
<element-citation publication-type="book">
<person-group person-group-type="author">
<name>
<surname>Fader</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Soderland</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Etzioni</surname>
<given-names>O</given-names>
</name>
</person-group>
<article-title>Identifying relations for open information extraction</article-title>
<source>Proceedings of the Conference on Empirical Methods in Natural Language Processing</source>
<year>2011</year>
<fpage>1535</fpage>
<lpage>1545</lpage>
</element-citation>
</ref>
<ref id="CR44">
<label>44.</label>
<mixed-citation publication-type="other">Warde-Farley D, Donaldson SL, Comes O, Zuberi K, Badrawi R, Chao P, Franz M, Grouios C, Kazi F, Lopes CT, et al. The GeneMANIA prediction server: biological network integration for gene prioritization and predicting gene function. Nucleic Acids Res. 2010;38(Web Server issue):W214–220.</mixed-citation>
</ref>
<ref id="CR45">
<label>45.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Stark</surname>
<given-names>C</given-names>
</name>
<name>
<surname>Breitkreutz</surname>
<given-names>B-J</given-names>
</name>
<name>
<surname>Reguly</surname>
<given-names>T</given-names>
</name>
<name>
<surname>Boucher</surname>
<given-names>L</given-names>
</name>
<name>
<surname>Breitkreutz</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Tyers</surname>
<given-names>M</given-names>
</name>
</person-group>
<article-title>BioGRID: a general repository for interaction datasets</article-title>
<source>Nucleic Acids Res</source>
<year>2006</year>
<volume>34</volume>
<issue>suppl 1</issue>
<fpage>D535</fpage>
<lpage>D539</lpage>
<pub-id pub-id-type="doi">10.1093/nar/gkj109</pub-id>
<pub-id pub-id-type="pmid">16381927</pub-id>
</element-citation>
</ref>
<ref id="CR46">
<label>46.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Rebhan</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Chalifa-Caspi</surname>
<given-names>V</given-names>
</name>
<name>
<surname>Prilusky</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Lancet</surname>
<given-names>D</given-names>
</name>
</person-group>
<article-title>GeneCards: integrating information about genes, proteins and diseases</article-title>
<source>Trends Genet</source>
<year>1997</year>
<volume>13</volume>
<issue>4</issue>
<fpage>163</fpage>
<pub-id pub-id-type="doi">10.1016/S0168-9525(97)01103-7</pub-id>
<pub-id pub-id-type="pmid">9097728</pub-id>
</element-citation>
</ref>
<ref id="CR47">
<label>47.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Bastian</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Heymann</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Jacomy</surname>
<given-names>M</given-names>
</name>
</person-group>
<article-title>Gephi: an open source software for exploring and manipulating networks</article-title>
<source>ICWSM</source>
<year>2009</year>
<volume>8</volume>
<fpage>361</fpage>
<lpage>362</lpage>
</element-citation>
</ref>
<ref id="CR48">
<label>48.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Shannon</surname>
<given-names>P</given-names>
</name>
<name>
<surname>Markiel</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Ozier</surname>
<given-names>O</given-names>
</name>
<name>
<surname>Baliga</surname>
<given-names>NS</given-names>
</name>
<name>
<surname>Wang</surname>
<given-names>JT</given-names>
</name>
<name>
<surname>Ramage</surname>
<given-names>D</given-names>
</name>
<name>
<surname>Amin</surname>
<given-names>N</given-names>
</name>
<name>
<surname>Schwikowski</surname>
<given-names>B</given-names>
</name>
<name>
<surname>Ideker</surname>
<given-names>T</given-names>
</name>
</person-group>
<article-title>Cytoscape: a software environment for integrated models of biomolecular interaction networks</article-title>
<source>Genome Res</source>
<year>2003</year>
<volume>13</volume>
<issue>11</issue>
<fpage>2498</fpage>
<lpage>2504</lpage>
<pub-id pub-id-type="doi">10.1101/gr.1239303</pub-id>
<pub-id pub-id-type="pmid">14597658</pub-id>
</element-citation>
</ref>
<ref id="CR49">
<label>49.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Mi</surname>
<given-names>H</given-names>
</name>
<name>
<surname>Muruganujan</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Casagrande</surname>
<given-names>JT</given-names>
</name>
<name>
<surname>Thomas</surname>
<given-names>PD</given-names>
</name>
</person-group>
<article-title>Large-scale gene function analysis with the PANTHER classification system</article-title>
<source>Nat Protoc</source>
<year>2013</year>
<volume>8</volume>
<issue>8</issue>
<fpage>1551</fpage>
<lpage>1566</lpage>
<pub-id pub-id-type="doi">10.1038/nprot.2013.092</pub-id>
<pub-id pub-id-type="pmid">23868073</pub-id>
</element-citation>
</ref>
<ref id="CR50">
<label>50.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Huang</surname>
<given-names>DW</given-names>
</name>
<name>
<surname>Sherman</surname>
<given-names>BT</given-names>
</name>
<name>
<surname>Lempicki</surname>
<given-names>RA</given-names>
</name>
</person-group>
<article-title>Systematic and integrative analysis of large gene lists using DAVID bioinformatics resources</article-title>
<source>Nat Protoc</source>
<year>2009</year>
<volume>4</volume>
<issue>1</issue>
<fpage>44</fpage>
<lpage>57</lpage>
<pub-id pub-id-type="doi">10.1038/nprot.2008.211</pub-id>
<pub-id pub-id-type="pmid">19131956</pub-id>
</element-citation>
</ref>
<ref id="CR51">
<label>51.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Huang</surname>
<given-names>DW</given-names>
</name>
<name>
<surname>Sherman</surname>
<given-names>BT</given-names>
</name>
<name>
<surname>Lempicki</surname>
<given-names>RA</given-names>
</name>
</person-group>
<article-title>Bioinformatics enrichment tools: paths toward the comprehensive functional analysis of large gene lists</article-title>
<source>Nucleic Acids Res</source>
<year>2009</year>
<volume>37</volume>
<issue>1</issue>
<fpage>1</fpage>
<lpage>13</lpage>
<pub-id pub-id-type="doi">10.1093/nar/gkn923</pub-id>
<pub-id pub-id-type="pmid">19033363</pub-id>
</element-citation>
</ref>
<ref id="CR52">
<label>52.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Carmona-Saez</surname>
<given-names>P</given-names>
</name>
<name>
<surname>Chagoyen</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Tirado</surname>
<given-names>F</given-names>
</name>
<name>
<surname>Carazo</surname>
<given-names>JM</given-names>
</name>
<name>
<surname>Pascual-Montano</surname>
<given-names>A</given-names>
</name>
</person-group>
<article-title>GENECODIS: a web-based tool for finding significant concurrent annotations in gene lists</article-title>
<source>Genome Biol</source>
<year>2007</year>
<volume>8</volume>
<issue>1</issue>
<fpage>R3</fpage>
<pub-id pub-id-type="doi">10.1186/gb-2007-8-1-r3</pub-id>
<pub-id pub-id-type="pmid">17204154</pub-id>
</element-citation>
</ref>
<ref id="CR53">
<label>53.</label>
<mixed-citation publication-type="other">Nogales-Cadenas R, Carmona-Saez P, Vazquez M, Vicente C, Yang X, Tirado F, Carazo JM, Pascual-Montano A. GeneCodis: interpreting gene lists through enrichment analysis and integration of diverse biological information. Nucleic Acids Res. 2009;37 suppl 2:W317–22.</mixed-citation>
</ref>
<ref id="CR54">
<label>54.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Tabas-Madrid</surname>
<given-names>D</given-names>
</name>
<name>
<surname>Nogales-Cadenas</surname>
<given-names>R</given-names>
</name>
<name>
<surname>Pascual-Montano</surname>
<given-names>A</given-names>
</name>
</person-group>
<article-title>GeneCodis3: a non-redundant and modular enrichment analysis tool for functional genomics</article-title>
<source>Nucleic Acids Res</source>
<year>2012</year>
<volume>40</volume>
<issue>W1</issue>
<fpage>W478</fpage>
<lpage>W483</lpage>
<pub-id pub-id-type="doi">10.1093/nar/gks402</pub-id>
<pub-id pub-id-type="pmid">22573175</pub-id>
</element-citation>
</ref>
<ref id="CR55">
<label>55.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Rokicki</surname>
<given-names>W</given-names>
</name>
<name>
<surname>Dorecka</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Romaniuk</surname>
<given-names>W</given-names>
</name>
</person-group>
<article-title>Retinal ganglion cells death in glaucoma--mechanism and potential treatment. Part I</article-title>
<source>Klin Oczna</source>
<year>2006</year>
<volume>109</volume>
<issue>7–9</issue>
<fpage>349</fpage>
<lpage>352</lpage>
<pub-id pub-id-type="pmid">18260296</pub-id>
</element-citation>
</ref>
<ref id="CR56">
<label>56.</label>
<mixed-citation publication-type="other">Wang WH, McNatt LG, Pang IH, Millar JC, Hellberg PE, Hellberg MH, Steely HT, Rubin JS, Fingert JH, Sheffield VC, et al. Increased expression of the WNT antagonist sFRP-1 in glaucoma elevates intraocular pressure. J Clin Invest. 2008;118(3):1056–64.</mixed-citation>
</ref>
<ref id="CR57">
<label>57.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Villarreal</surname>
<given-names>G</given-names>
<suffix>Jr</suffix>
</name>
<name>
<surname>Chatterjee</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Oh</surname>
<given-names>SS</given-names>
</name>
<name>
<surname>Oh</surname>
<given-names>DJ</given-names>
</name>
<name>
<surname>Kang</surname>
<given-names>MH</given-names>
</name>
<name>
<surname>Rhee</surname>
<given-names>DJ</given-names>
</name>
</person-group>
<article-title>Canonical wnt signaling regulates extracellular matrix expression in the trabecular meshwork</article-title>
<source>Invest Ophthalmol Vis Sci</source>
<year>2014</year>
<volume>55</volume>
<issue>11</issue>
<fpage>7433</fpage>
<lpage>7440</lpage>
<pub-id pub-id-type="doi">10.1167/iovs.13-12652</pub-id>
<pub-id pub-id-type="pmid">25352117</pub-id>
</element-citation>
</ref>
<ref id="CR58">
<label>58.</label>
<mixed-citation publication-type="other">Wang L, Chadwick W, Park SS, Zhou Y, Silver N, Martin B, Maudsley S. Gonadotropin-releasing hormone receptor system: modulatory role in aging and neurodegeneration. CNS Neurol Disord Drug Targets. 2010;9(5):651–60.</mixed-citation>
</ref>
<ref id="CR59">
<label>59.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Barabási</surname>
<given-names>A-L</given-names>
</name>
<name>
<surname>Albert</surname>
<given-names>R</given-names>
</name>
</person-group>
<article-title>Emergence of scaling in random networks</article-title>
<source>Science</source>
<year>1999</year>
<volume>286</volume>
<issue>5439</issue>
<fpage>509</fpage>
<lpage>512</lpage>
<pub-id pub-id-type="doi">10.1126/science.286.5439.509</pub-id>
<pub-id pub-id-type="pmid">10521342</pub-id>
</element-citation>
</ref>
<ref id="CR60">
<label>60.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Ravasz</surname>
<given-names>E</given-names>
</name>
<name>
<surname>Somera</surname>
<given-names>AL</given-names>
</name>
<name>
<surname>Mongru</surname>
<given-names>DA</given-names>
</name>
<name>
<surname>Oltvai</surname>
<given-names>ZN</given-names>
</name>
<name>
<surname>Barabási</surname>
<given-names>A-L</given-names>
</name>
</person-group>
<article-title>Hierarchical organization of modularity in metabolic networks</article-title>
<source>Science</source>
<year>2002</year>
<volume>297</volume>
<issue>5586</issue>
<fpage>1551</fpage>
<lpage>1555</lpage>
<pub-id pub-id-type="doi">10.1126/science.1073374</pub-id>
<pub-id pub-id-type="pmid">12202830</pub-id>
</element-citation>
</ref>
<ref id="CR61">
<label>61.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Yook</surname>
<given-names>SH</given-names>
</name>
<name>
<surname>Oltvai</surname>
<given-names>ZN</given-names>
</name>
<name>
<surname>Barabási</surname>
<given-names>AL</given-names>
</name>
</person-group>
<article-title>Functional and topological characterization of protein interaction networks</article-title>
<source>Proteomics</source>
<year>2004</year>
<volume>4</volume>
<issue>4</issue>
<fpage>928</fpage>
<lpage>942</lpage>
<pub-id pub-id-type="doi">10.1002/pmic.200300636</pub-id>
<pub-id pub-id-type="pmid">15048975</pub-id>
</element-citation>
</ref>
<ref id="CR62">
<label>62.</label>
<mixed-citation publication-type="other">GENIA Tagger: part-of-speech tagging, shallow parsing, and named entity recognition for biomedical text [
<ext-link ext-link-type="uri" xlink:href="http://www.nactem.ac.uk/tsujii/GENIA/tagger/">http://www.nactem.ac.uk/tsujii/GENIA/tagger/</ext-link>
]. Accessed 25 Mar 2015.</mixed-citation>
</ref>
<ref id="CR63">
<label>63.</label>
<element-citation publication-type="book">
<person-group person-group-type="author">
<name>
<surname>Chtioui</surname>
<given-names>S</given-names>
</name>
</person-group>
<source>Evaluation of gene/protein name recognition Programs</source>
<year>2008</year>
<publisher-loc>Geneva</publisher-loc>
<publisher-name>Masters in Proteomics and Bioinformatics, University of Geneva</publisher-name>
</element-citation>
</ref>
<ref id="CR64">
<label>64.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Ekbal</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Saha</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Sikdar</surname>
<given-names>UK</given-names>
</name>
</person-group>
<article-title>Biomedical named entity extraction: some issues of corpus compatibilities</article-title>
<source>Springerplus</source>
<year>2013</year>
<volume>2</volume>
<fpage>601</fpage>
<pub-id pub-id-type="doi">10.1186/2193-1801-2-601</pub-id>
<pub-id pub-id-type="pmid">24294548</pub-id>
</element-citation>
</ref>
<ref id="CR65">
<label>65.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Blondel</surname>
<given-names>VD</given-names>
</name>
<name>
<surname>Guillaume</surname>
<given-names>JL</given-names>
</name>
<name>
<surname>Lambiotte</surname>
<given-names>R</given-names>
</name>
<name>
<surname>Lefebvre</surname>
<given-names>E</given-names>
</name>
</person-group>
<article-title>Fast unfolding of communities in large networks</article-title>
<source>J Statistical Mechanics</source>
<year>2008</year>
<volume>2008</volume>
<issue>10</issue>
<fpage>P10008</fpage>
<pub-id pub-id-type="doi">10.1088/1742-5468/2008/10/P10008</pub-id>
</element-citation>
</ref>
<ref id="CR66">
<label>66.</label>
<element-citation publication-type="book">
<person-group person-group-type="author">
<name>
<surname>Lambiotte</surname>
<given-names>R</given-names>
</name>
<name>
<surname>Delvenne</surname>
<given-names>JC</given-names>
</name>
<name>
<surname>Barahona</surname>
<given-names>M</given-names>
</name>
</person-group>
<source>Laplacian dynamics and multiscale modular structure in networks. arXiv preprint arXiv:0812.1770</source>
<year>2008</year>
</element-citation>
</ref>
<ref id="CR67">
<label>67.</label>
<element-citation publication-type="book">
<person-group person-group-type="author">
<name>
<surname>Pyysalo</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Ohta</surname>
<given-names>T</given-names>
</name>
<name>
<surname>Kim</surname>
<given-names>J-D</given-names>
</name>
<name>
<surname>Tsujii</surname>
<given-names>J</given-names>
</name>
</person-group>
<source>Static relations: a piece in the biomedical information extraction puzzle. In: Proceedings of the Workshop on Current Trends in Biomedical Natural Language Processing</source>
<year>2009</year>
<fpage>1</fpage>
<lpage>9</lpage>
</element-citation>
</ref>
<ref id="CR68">
<label>68.</label>
<element-citation publication-type="book">
<person-group person-group-type="author">
<name>
<surname>De Marneffe</surname>
<given-names>M-C</given-names>
</name>
<name>
<surname>MacCartney</surname>
<given-names>B</given-names>
</name>
<name>
<surname>Manning</surname>
<given-names>CD</given-names>
</name>
</person-group>
<article-title>Generating typed dependency parses from phrase structure parses</article-title>
<source>Proceedings of LREC</source>
<year>2006</year>
<fpage>449</fpage>
<lpage>454</lpage>
</element-citation>
</ref>
<ref id="CR69">
<label>69.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Nakatake</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Yoshida</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Nakao</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Arita</surname>
<given-names>R</given-names>
</name>
<name>
<surname>Yasuda</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Kita</surname>
<given-names>T</given-names>
</name>
</person-group>
<article-title>Hyphema is a risk factor for failure of trabeculectomy in neovascular glaucoma: a retrospective analysis</article-title>
<source>BMC Ophthalmol</source>
<year>2014</year>
<volume>14</volume>
<issue>1</issue>
<fpage>55</fpage>
<pub-id pub-id-type="doi">10.1186/1471-2415-14-55</pub-id>
<pub-id pub-id-type="pmid">24766841</pub-id>
</element-citation>
</ref>
<ref id="CR70">
<label>70.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Wang</surname>
<given-names>DY</given-names>
</name>
<name>
<surname>Ray</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Rodgers</surname>
<given-names>K</given-names>
</name>
<name>
<surname>Ergorul</surname>
<given-names>C</given-names>
</name>
<name>
<surname>Hyman</surname>
<given-names>BT</given-names>
</name>
<name>
<surname>Huang</surname>
<given-names>W</given-names>
</name>
</person-group>
<article-title>Global gene expression changes in rat retinal ganglion cells in experimental glaucoma</article-title>
<source>Invest Ophthalmol Vis Sci</source>
<year>2010</year>
<volume>51</volume>
<issue>8</issue>
<fpage>4084</fpage>
<lpage>4095</lpage>
<pub-id pub-id-type="doi">10.1167/iovs.09-4864</pub-id>
<pub-id pub-id-type="pmid">20335623</pub-id>
</element-citation>
</ref>
<ref id="CR71">
<label>71.</label>
<element-citation publication-type="book">
<person-group person-group-type="author">
<name>
<surname>Stewart</surname>
<given-names>MW</given-names>
</name>
</person-group>
<source>PDGF: ophthalmology’s next great target</source>
<year>2013</year>
</element-citation>
</ref>
<ref id="CR72">
<label>72.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Wecker</surname>
<given-names>T</given-names>
</name>
<name>
<surname>Han</surname>
<given-names>H</given-names>
</name>
<name>
<surname>Borner</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Grehn</surname>
<given-names>F</given-names>
</name>
<name>
<surname>Schlunck</surname>
<given-names>G</given-names>
</name>
</person-group>
<article-title>Effects of TGF-beta2 on cadherins and beta-catenin in human trabecular meshwork cells</article-title>
<source>Invest Ophthalmol Vis Sci</source>
<year>2013</year>
<volume>54</volume>
<issue>10</issue>
<fpage>6456</fpage>
<lpage>6462</lpage>
<pub-id pub-id-type="doi">10.1167/iovs.13-12669</pub-id>
<pub-id pub-id-type="pmid">24003087</pub-id>
</element-citation>
</ref>
<ref id="CR73">
<label>73.</label>
<mixed-citation publication-type="other">Ayub H, Micheal S, Akhtar F, Khan MI, Bashir S, Waheed NK, Ali M, Schoenmaker-Koller FE, Shafique S, Qamar R, den Hollander AI. Association of a Polymorphism in the BIRC6 Gene with Pseudoexfoliative Glaucoma. PLoS One. 2014;9(8):e105023.</mixed-citation>
</ref>
<ref id="CR74">
<label>74.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Izzotti</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Longobardi</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Cartiglia</surname>
<given-names>C</given-names>
</name>
<name>
<surname>Sacca</surname>
<given-names>SC</given-names>
</name>
</person-group>
<article-title>Mitochondrial damage in the trabecular meshwork occurs only in primary open-angle glaucoma and in pseudoexfoliative glaucoma</article-title>
<source>PLoS One</source>
<year>2011</year>
<volume>6</volume>
<issue>1</issue>
<fpage>e14567</fpage>
<pub-id pub-id-type="doi">10.1371/journal.pone.0014567</pub-id>
<pub-id pub-id-type="pmid">21283745</pub-id>
</element-citation>
</ref>
</ref-list>
</back>
</pmc>
</record>

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Wicri/Informatique/explor/SgmlV1/Data/Pmc/Corpus
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000015 | SxmlIndent | more

Or

HfdSelect -h $EXPLOR_AREA/Data/Pmc/Corpus/biblio.hfd -nk 000015 | SxmlIndent | more

To link to this page within the Wicri network

{{Explor lien
   |wiki=    Wicri/Informatique
   |area=    SgmlV1
   |flux=    Pmc
   |étape=   Corpus
   |type=    RBID
   |clé=     PMC:4857381
   |texte=   Building a glaucoma interaction network using a text mining approach
}}

To generate wiki pages

HfdIndexSelect -h $EXPLOR_AREA/Data/Pmc/Corpus/RBID.i   -Sk "pubmed:27152122" \
       | HfdSelect -Kh $EXPLOR_AREA/Data/Pmc/Corpus/biblio.hfd   \
       | NlmPubMed2Wicri -a SgmlV1 

Wicri

This area was generated with Dilib version V0.6.33.
Data generation: Mon Jul 1 14:26:08 2019. Site generation: Wed Apr 28 21:40:44 2021