<record><TEI><teiHeader><fileDesc><titleStmt><title xml:lang="en">Integrative relational machine-learning for understanding drug side-effect profiles</title>
<author><name sortKey="Bresso, Emmanuel" sort="Bresso, Emmanuel" uniqKey="Bresso E" first="Emmanuel" last="Bresso">Emmanuel Bresso</name>
<affiliation><nlm:aff id="I1">Université de Lorraine, LORIA, UMR 7503, Vandoeuvre-lès-Nancy, 54506, France</nlm:aff>
</affiliation>
<affiliation><nlm:aff id="I2">INRIA, Villers-lès-Nancy, 54600, France</nlm:aff>
</affiliation>
<affiliation><nlm:aff id="I3">Harmonic Pharma, Espace Transfert INRIA NGE, Villers-lès-Nancy, 54600, France</nlm:aff>
</affiliation>
</author>
<author><name sortKey="Grisoni, Renaud" sort="Grisoni, Renaud" uniqKey="Grisoni R" first="Renaud" last="Grisoni">Renaud Grisoni</name>
<affiliation><nlm:aff id="I2">INRIA, Villers-lès-Nancy, 54600, France</nlm:aff>
</affiliation>
</author>
<author><name sortKey="Marchetti, Gino" sort="Marchetti, Gino" uniqKey="Marchetti G" first="Gino" last="Marchetti">Gino Marchetti</name>
<affiliation><nlm:aff id="I2">INRIA, Villers-lès-Nancy, 54600, France</nlm:aff>
</affiliation>
<affiliation><nlm:aff id="I4">CNRS, LORIA, UMR 7503, Vandoeuvre-lès-Nancy, 54506, France</nlm:aff>
</affiliation>
</author>
<author><name sortKey="Karaboga, Arnaud Sinan" sort="Karaboga, Arnaud Sinan" uniqKey="Karaboga A" first="Arnaud Sinan" last="Karaboga">Arnaud Sinan Karaboga</name>
<affiliation><nlm:aff id="I3">Harmonic Pharma, Espace Transfert INRIA NGE, Villers-lès-Nancy, 54600, France</nlm:aff>
</affiliation>
</author>
<author><name sortKey="Devignes, Marie Dominique" sort="Devignes, Marie Dominique" uniqKey="Devignes M" first="Marie-Dominique" last="Devignes">Marie-Dominique Devignes</name>
<affiliation><nlm:aff id="I2">INRIA, Villers-lès-Nancy, 54600, France</nlm:aff>
</affiliation>
<affiliation><nlm:aff id="I4">CNRS, LORIA, UMR 7503, Vandoeuvre-lès-Nancy, 54506, France</nlm:aff>
</affiliation>
</author>
<author><name sortKey="Smail Tabbone, Malika" sort="Smail Tabbone, Malika" uniqKey="Smail Tabbone M" first="Malika" last="Smaïl-Tabbone">Malika Smaïl-Tabbone</name>
<affiliation><nlm:aff id="I1">Université de Lorraine, LORIA, UMR 7503, Vandoeuvre-lès-Nancy, 54506, France</nlm:aff>
</affiliation>
<affiliation><nlm:aff id="I2">INRIA, Villers-lès-Nancy, 54600, France</nlm:aff>
</affiliation>
</author>
</titleStmt>
<publicationStmt><idno type="wicri:source">PMC</idno>
<idno type="pmid">23802887</idno>
<idno type="pmc">3710241</idno>
<idno type="url">http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3710241</idno>
<idno type="RBID">PMC:3710241</idno>
<idno type="doi">10.1186/1471-2105-14-207</idno>
<date when="2013">2013</date>
<idno type="wicri:Area/Pmc/Corpus">000052</idno>
<idno type="wicri:explorRef" wicri:stream="Pmc" wicri:step="Corpus" wicri:corpus="PMC">000052</idno>
</publicationStmt>
<sourceDesc><biblStruct><analytic><title xml:lang="en" level="a" type="main">Integrative relational machine-learning for understanding drug side-effect profiles</title>
<author><name sortKey="Bresso, Emmanuel" sort="Bresso, Emmanuel" uniqKey="Bresso E" first="Emmanuel" last="Bresso">Emmanuel Bresso</name>
<affiliation><nlm:aff id="I1">Université de Lorraine, LORIA, UMR 7503, Vandoeuvre-lès-Nancy, 54506, France</nlm:aff>
</affiliation>
<affiliation><nlm:aff id="I2">INRIA, Villers-lès-Nancy, 54600, France</nlm:aff>
</affiliation>
<affiliation><nlm:aff id="I3">Harmonic Pharma, Espace Transfert INRIA NGE, Villers-lès-Nancy, 54600, France</nlm:aff>
</affiliation>
</author>
<author><name sortKey="Grisoni, Renaud" sort="Grisoni, Renaud" uniqKey="Grisoni R" first="Renaud" last="Grisoni">Renaud Grisoni</name>
<affiliation><nlm:aff id="I2">INRIA, Villers-lès-Nancy, 54600, France</nlm:aff>
</affiliation>
</author>
<author><name sortKey="Marchetti, Gino" sort="Marchetti, Gino" uniqKey="Marchetti G" first="Gino" last="Marchetti">Gino Marchetti</name>
<affiliation><nlm:aff id="I2">INRIA, Villers-lès-Nancy, 54600, France</nlm:aff>
</affiliation>
<affiliation><nlm:aff id="I4">CNRS, LORIA, UMR 7503, Vandoeuvre-lès-Nancy, 54506, France</nlm:aff>
</affiliation>
</author>
<author><name sortKey="Karaboga, Arnaud Sinan" sort="Karaboga, Arnaud Sinan" uniqKey="Karaboga A" first="Arnaud Sinan" last="Karaboga">Arnaud Sinan Karaboga</name>
<affiliation><nlm:aff id="I3">Harmonic Pharma, Espace Transfert INRIA NGE, Villers-lès-Nancy, 54600, France</nlm:aff>
</affiliation>
</author>
<author><name sortKey="Devignes, Marie Dominique" sort="Devignes, Marie Dominique" uniqKey="Devignes M" first="Marie-Dominique" last="Devignes">Marie-Dominique Devignes</name>
<affiliation><nlm:aff id="I2">INRIA, Villers-lès-Nancy, 54600, France</nlm:aff>
</affiliation>
<affiliation><nlm:aff id="I4">CNRS, LORIA, UMR 7503, Vandoeuvre-lès-Nancy, 54506, France</nlm:aff>
</affiliation>
</author>
<author><name sortKey="Smail Tabbone, Malika" sort="Smail Tabbone, Malika" uniqKey="Smail Tabbone M" first="Malika" last="Smaïl-Tabbone">Malika Smaïl-Tabbone</name>
<affiliation><nlm:aff id="I1">Université de Lorraine, LORIA, UMR 7503, Vandoeuvre-lès-Nancy, 54506, France</nlm:aff>
</affiliation>
<affiliation><nlm:aff id="I2">INRIA, Villers-lès-Nancy, 54600, France</nlm:aff>
</affiliation>
</author>
</analytic>
<series><title level="j">BMC Bioinformatics</title>
<idno type="eISSN">1471-2105</idno>
<imprint><date when="2013">2013</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
</fileDesc>
<profileDesc><textClass></textClass>
</profileDesc>
</teiHeader>
<front><div type="abstract" xml:lang="en"><sec><title>Background</title>
<p>Drug side effects represent a common reason for stopping drug development during clinical trials. Improving our ability to understand drug side effects is necessary to reduce attrition rates during drug development as well as the risk of discovering novel side effects in available drugs. Today, most investigations deal with isolated side effects and overlook their possible redundancy and frequent co-occurrence.</p>
</sec>
<sec><title>Results</title>
<p>In this work, drug annotations are collected from the SIDER and DrugBank databases. Terms describing individual side effects reported in SIDER are clustered with a semantic similarity measure into term clusters (TCs). Maximal frequent itemsets are extracted from the resulting drug × TC binary table, leading to the identification of what we call side-effect profiles (SEPs). A SEP is defined as the longest combination of TCs that is shared by a significant number of drugs. Frequent SEPs are explored on the basis of integrated drug and target descriptors using two machine-learning methods: decision trees and inductive logic programming. Although both methods yield explicit models, the inductive-logic-programming method performs relational learning and is able to exploit not only drug properties but also background knowledge. Learning efficiency is evaluated by cross-validation and direct testing with new molecules. Comparison of the two machine-learning methods shows that the inductive-logic-programming method displays greater sensitivity than decision trees and successfully exploits background knowledge such as functional annotations and pathways of drug targets, thereby producing rich and expressive rules. All models and theories are available on a dedicated web site.</p>
</sec>
<sec><title>Conclusions</title>
<p>Side-effect profiles covering a significant number of drugs have been extracted from a drug × side-effect association table. Background knowledge concerning both the chemical and biological spaces has been integrated and combined with a relational learning method to discover rules that explicitly characterize drug-SEP associations. These rules are successfully used for predicting SEPs associated with new drugs.</p>
</sec>
</div>
</front>
<back><div1 type="bibliography"><listBibl><biblStruct></biblStruct>
<biblStruct><analytic><author><name sortKey="Derumeaux, G" uniqKey="Derumeaux G">G Derumeaux</name>
</author>
<author><name sortKey="Ernande, L" uniqKey="Ernande L">L Ernande</name>
</author>
<author><name sortKey="Serusclat, A" uniqKey="Serusclat A">A Serusclat</name>
</author>
<author><name sortKey="Servan, E" uniqKey="Servan E">E Servan</name>
</author>
<author><name sortKey="Bruckert, E" uniqKey="Bruckert E">E Bruckert</name>
</author>
<author><name sortKey="Rousset, H" uniqKey="Rousset H">H Rousset</name>
</author>
<author><name sortKey="Senn, S" uniqKey="Senn S">S Senn</name>
</author>
<author><name sortKey="Van Gaal, L" uniqKey="Van Gaal L">L Van Gaal</name>
</author>
<author><name sortKey="Picandet, B" uniqKey="Picandet B">B Picandet</name>
</author>
<author><name sortKey="Gavini, F" uniqKey="Gavini F">F Gavini</name>
</author>
<author><name sortKey="Moulin, P" uniqKey="Moulin P">P Moulin</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Kuhn, M" uniqKey="Kuhn M">M Kuhn</name>
</author>
<author><name sortKey="Campillos, M" uniqKey="Campillos M">M Campillos</name>
</author>
<author><name sortKey="Letunic, I" uniqKey="Letunic I">I Letunic</name>
</author>
<author><name sortKey="Jensen, Lj" uniqKey="Jensen L">LJ Jensen</name>
</author>
<author><name sortKey="Bork, P" uniqKey="Bork P">P Bork</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Campillos, M" uniqKey="Campillos M">M Campillos</name>
</author>
<author><name sortKey="Kuhn, M" uniqKey="Kuhn M">M Kuhn</name>
</author>
<author><name sortKey="Gavin, Ac" uniqKey="Gavin A">AC Gavin</name>
</author>
<author><name sortKey="Jensen, Lj" uniqKey="Jensen L">LJ Jensen</name>
</author>
<author><name sortKey="Bork, P" uniqKey="Bork P">P Bork</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Takarabe, M" uniqKey="Takarabe M">M Takarabe</name>
</author>
<author><name sortKey="Kotera, M" uniqKey="Kotera M">M Kotera</name>
</author>
<author><name sortKey="Nishimura, Y" uniqKey="Nishimura Y">Y Nishimura</name>
</author>
<author><name sortKey="Goto, S" uniqKey="Goto S">S Goto</name>
</author>
<author><name sortKey="Yamanishi, Y" uniqKey="Yamanishi Y">Y Yamanishi</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Yang, L" uniqKey="Yang L">L Yang</name>
</author>
<author><name sortKey="Agarwal, P" uniqKey="Agarwal P">P Agarwal</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Scheiber, J" uniqKey="Scheiber J">J Scheiber</name>
</author>
<author><name sortKey="Jenkins, Jl" uniqKey="Jenkins J">JL Jenkins</name>
</author>
<author><name sortKey="Sukuru, Sc" uniqKey="Sukuru S">SC Sukuru</name>
</author>
<author><name sortKey="Bender, A" uniqKey="Bender A">A Bender</name>
</author>
<author><name sortKey="Mikhailov, D" uniqKey="Mikhailov D">D Mikhailov</name>
</author>
<author><name sortKey="Milik, M" uniqKey="Milik M">M Milik</name>
</author>
<author><name sortKey="Azzaoui, K" uniqKey="Azzaoui K">K Azzaoui</name>
</author>
<author><name sortKey="Whitebread, S" uniqKey="Whitebread S">S Whitebread</name>
</author>
<author><name sortKey="Hamon, J" uniqKey="Hamon J">J Hamon</name>
</author>
<author><name sortKey="Urban, L" uniqKey="Urban L">L Urban</name>
</author>
<author><name sortKey="Glick, M" uniqKey="Glick M">M Glick</name>
</author>
<author><name sortKey="Davies, Jw" uniqKey="Davies J">JW Davies</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Lee, S" uniqKey="Lee S">S Lee</name>
</author>
<author><name sortKey="Lee, Kh" uniqKey="Lee K">KH Lee</name>
</author>
<author><name sortKey="Song, M" uniqKey="Song M">M Song</name>
</author>
<author><name sortKey="Lee, D" uniqKey="Lee D">D Lee</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Yamanishi, Y" uniqKey="Yamanishi Y">Y Yamanishi</name>
</author>
<author><name sortKey="Pauwels, E" uniqKey="Pauwels E">E Pauwels</name>
</author>
<author><name sortKey="Kotera, M" uniqKey="Kotera M">M Kotera</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Benabderrahmane, S" uniqKey="Benabderrahmane S">S Benabderrahmane</name>
</author>
<author><name sortKey="Smail Tabbone, M" uniqKey="Smail Tabbone M">M Smail-Tabbone</name>
</author>
<author><name sortKey="Poch, O" uniqKey="Poch O">O Poch</name>
</author>
<author><name sortKey="Napoli, A" uniqKey="Napoli A">A Napoli</name>
</author>
<author><name sortKey="Devignes, Md" uniqKey="Devignes M">MD Devignes</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Knox, C" uniqKey="Knox C">C Knox</name>
</author>
<author><name sortKey="Law, V" uniqKey="Law V">V Law</name>
</author>
<author><name sortKey="Jewison, T" uniqKey="Jewison T">T Jewison</name>
</author>
<author><name sortKey="Liu, P" uniqKey="Liu P">P Liu</name>
</author>
<author><name sortKey="Ly, S" uniqKey="Ly S">S Ly</name>
</author>
<author><name sortKey="Frolkis, A" uniqKey="Frolkis A">A Frolkis</name>
</author>
<author><name sortKey="Pon, A" uniqKey="Pon A">A Pon</name>
</author>
<author><name sortKey="Banco, K" uniqKey="Banco K">K Banco</name>
</author>
<author><name sortKey="Mak, C" uniqKey="Mak C">C Mak</name>
</author>
<author><name sortKey="Neveu, V" uniqKey="Neveu V">V Neveu</name>
</author>
<author><name sortKey="Djoumbou, Y" uniqKey="Djoumbou Y">Y Djoumbou</name>
</author>
<author><name sortKey="Eisner, R" uniqKey="Eisner R">R Eisner</name>
</author>
<author><name sortKey="Guo, Ac" uniqKey="Guo A">AC Guo</name>
</author>
<author><name sortKey="Wishart, Ds" uniqKey="Wishart D">DS Wishart</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="O Oyle, Nm" uniqKey="O Oyle N">NM O’Boyle</name>
</author>
<author><name sortKey="Banck, M" uniqKey="Banck M">M Banck</name>
</author>
<author><name sortKey="James, Ca" uniqKey="James C">CA James</name>
</author>
<author><name sortKey="Morley, C" uniqKey="Morley C">C Morley</name>
</author>
<author><name sortKey="Vandermeersch, T" uniqKey="Vandermeersch T">T Vandermeersch</name>
</author>
<author><name sortKey="Hutchison, Gr" uniqKey="Hutchison G">GR Hutchison</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Ritchie, Dw" uniqKey="Ritchie D">DW Ritchie</name>
</author>
<author><name sortKey="Kemp, Gjl" uniqKey="Kemp G">GJL Kemp</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Cai, W" uniqKey="Cai W">W Cai</name>
</author>
<author><name sortKey="Xu, J" uniqKey="Xu J">J Xu</name>
</author>
<author><name sortKey="Shao, X" uniqKey="Shao X">X Shao</name>
</author>
<author><name sortKey="Leroux, V" uniqKey="Leroux V">V Leroux</name>
</author>
<author><name sortKey="Beautrait, A" uniqKey="Beautrait A">A Beautrait</name>
</author>
<author><name sortKey="Maigret, B" uniqKey="Maigret B">B Maigret</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Ward, Jh" uniqKey="Ward J">JH Ward</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Kelley, La" uniqKey="Kelley L">LA Kelley</name>
</author>
<author><name sortKey="Gardner, Sp" uniqKey="Gardner S">SP Gardner</name>
</author>
<author><name sortKey="Sutcliffe, Mj" uniqKey="Sutcliffe M">MJ Sutcliffe</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Berman, Hm" uniqKey="Berman H">HM Berman</name>
</author>
<author><name sortKey="Westbrook, J" uniqKey="Westbrook J">J Westbrook</name>
</author>
<author><name sortKey="Feng, Z" uniqKey="Feng Z">Z Feng</name>
</author>
<author><name sortKey="Gilliland, G" uniqKey="Gilliland G">G Gilliland</name>
</author>
<author><name sortKey="Bhat, Tn" uniqKey="Bhat T">TN Bhat</name>
</author>
<author><name sortKey="Weissig, H" uniqKey="Weissig H">H Weissig</name>
</author>
<author><name sortKey="Shindyalov, In" uniqKey="Shindyalov I">IN Shindyalov</name>
</author>
<author><name sortKey="Bourne, Pe" uniqKey="Bourne P">PE Bourne</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Kerrien, S" uniqKey="Kerrien S">S Kerrien</name>
</author>
<author><name sortKey="Aranda, B" uniqKey="Aranda B">B Aranda</name>
</author>
<author><name sortKey="Breuza, L" uniqKey="Breuza L">L Breuza</name>
</author>
<author><name sortKey="Bridge, A" uniqKey="Bridge A">A Bridge</name>
</author>
<author><name sortKey="Broackes Carter, F" uniqKey="Broackes Carter F">F Broackes-Carter</name>
</author>
<author><name sortKey="Chen, C" uniqKey="Chen C">C Chen</name>
</author>
<author><name sortKey="Duesbury, M" uniqKey="Duesbury M">M Duesbury</name>
</author>
<author><name sortKey="Dumousseau, M" uniqKey="Dumousseau M">M Dumousseau</name>
</author>
<author><name sortKey="Feuermann, M" uniqKey="Feuermann M">M Feuermann</name>
</author>
<author><name sortKey="Hinz, U" uniqKey="Hinz U">U Hinz</name>
</author>
<author><name sortKey="Jandrasits, C" uniqKey="Jandrasits C">C Jandrasits</name>
</author>
<author><name sortKey="Jimenez, Rc" uniqKey="Jimenez R">RC Jimenez</name>
</author>
<author><name sortKey="Khadake, J" uniqKey="Khadake J">J Khadake</name>
</author>
<author><name sortKey="Mahadevan, U" uniqKey="Mahadevan U">U Mahadevan</name>
</author>
<author><name sortKey="Masson, P" uniqKey="Masson P">P Masson</name>
</author>
<author><name sortKey="Pedruzzi, I" uniqKey="Pedruzzi I">I Pedruzzi</name>
</author>
<author><name sortKey="Pfeiffenberger, E" uniqKey="Pfeiffenberger E">E Pfeiffenberger</name>
</author>
<author><name sortKey="Porras, P" uniqKey="Porras P">P Porras</name>
</author>
<author><name sortKey="Raghunath, A" uniqKey="Raghunath A">A Raghunath</name>
</author>
<author><name sortKey="Roechert, B" uniqKey="Roechert B">B Roechert</name>
</author>
<author><name sortKey="Orchard, S" uniqKey="Orchard S">S Orchard</name>
</author>
<author><name sortKey="Hermjakob, H" uniqKey="Hermjakob H">H Hermjakob</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Kanehisa, M" uniqKey="Kanehisa M">M Kanehisa</name>
</author>
<author><name sortKey="Goto, S" uniqKey="Goto S">S Goto</name>
</author>
<author><name sortKey="Sato, Y" uniqKey="Sato Y">Y Sato</name>
</author>
<author><name sortKey="Furumichi, M" uniqKey="Furumichi M">M Furumichi</name>
</author>
<author><name sortKey="Tanabe, M" uniqKey="Tanabe M">M Tanabe</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Schaefer, Cf" uniqKey="Schaefer C">CF Schaefer</name>
</author>
<author><name sortKey="Anthony, K" uniqKey="Anthony K">K Anthony</name>
</author>
<author><name sortKey="Krupa, S" uniqKey="Krupa S">S Krupa</name>
</author>
<author><name sortKey="Buchoff, J" uniqKey="Buchoff J">J Buchoff</name>
</author>
<author><name sortKey="Day, M" uniqKey="Day M">M Day</name>
</author>
<author><name sortKey="Hannay, T" uniqKey="Hannay T">T Hannay</name>
</author>
<author><name sortKey="Buetow, Kh" uniqKey="Buetow K">KH Buetow</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Binns, D" uniqKey="Binns D">D Binns</name>
</author>
<author><name sortKey="Dimmer, E" uniqKey="Dimmer E">E Dimmer</name>
</author>
<author><name sortKey="Huntley, R" uniqKey="Huntley R">R Huntley</name>
</author>
<author><name sortKey="Barrell, D" uniqKey="Barrell D">D Barrell</name>
</author>
<author><name sortKey="O Onovan, C" uniqKey="O Onovan C">C O’Donovan</name>
</author>
<author><name sortKey="Apweiler, R" uniqKey="Apweiler R">R Apweiler</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Hunter, S" uniqKey="Hunter S">S Hunter</name>
</author>
<author><name sortKey="Jones, P" uniqKey="Jones P">P Jones</name>
</author>
<author><name sortKey="Mitchell, A" uniqKey="Mitchell A">A Mitchell</name>
</author>
<author><name sortKey="Apweiler, R" uniqKey="Apweiler R">R Apweiler</name>
</author>
<author><name sortKey="Attwood, Tk" uniqKey="Attwood T">TK Attwood</name>
</author>
<author><name sortKey="Bateman, A" uniqKey="Bateman A">A Bateman</name>
</author>
<author><name sortKey="Bernard, T" uniqKey="Bernard T">T Bernard</name>
</author>
<author><name sortKey="Binns, D" uniqKey="Binns D">D Binns</name>
</author>
<author><name sortKey="Bork, P" uniqKey="Bork P">P Bork</name>
</author>
<author><name sortKey="Burge, S" uniqKey="Burge S">S Burge</name>
</author>
<author><name sortKey="De Castro, E" uniqKey="De Castro E">E de Castro</name>
</author>
<author><name sortKey="Coggill, P" uniqKey="Coggill P">P Coggill</name>
</author>
<author><name sortKey="Corbett, M" uniqKey="Corbett M">M Corbett</name>
</author>
<author><name sortKey="Das, U" uniqKey="Das U">U Das</name>
</author>
<author><name sortKey="Daugherty, L" uniqKey="Daugherty L">L Daugherty</name>
</author>
<author><name sortKey="Duquenne, L" uniqKey="Duquenne L">L Duquenne</name>
</author>
<author><name sortKey="Finn, Rd" uniqKey="Finn R">RD Finn</name>
</author>
<author><name sortKey="Fraser, M" uniqKey="Fraser M">M Fraser</name>
</author>
<author><name sortKey="Gough, J" uniqKey="Gough J">J Gough</name>
</author>
<author><name sortKey="Haft, D" uniqKey="Haft D">D Haft</name>
</author>
<author><name sortKey="Hulo, N" uniqKey="Hulo N">N Hulo</name>
</author>
<author><name sortKey="Kahn, D" uniqKey="Kahn D">D Kahn</name>
</author>
<author><name sortKey="Kelly, E" uniqKey="Kelly E">E Kelly</name>
</author>
<author><name sortKey="Letunic, I" uniqKey="Letunic I">I Letunic</name>
</author>
<author><name sortKey="Lonsdale, D" uniqKey="Lonsdale D">D Lonsdale</name>
</author>
<author><name sortKey="Lopez, R" uniqKey="Lopez R">R Lopez</name>
</author>
<author><name sortKey="Madera, M" uniqKey="Madera M">M Madera</name>
</author>
<author><name sortKey="Maslen, J" uniqKey="Maslen J">J Maslen</name>
</author>
<author><name sortKey="Mcanulla, C" uniqKey="Mcanulla C">C McAnulla</name>
</author>
<author><name sortKey="Mcdowall, J" uniqKey="Mcdowall J">J McDowall</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Bresso, E" uniqKey="Bresso E">E Bresso</name>
</author>
<author><name sortKey="Benabderrahmane, S" uniqKey="Benabderrahmane S">S Benabderrahmane</name>
</author>
<author><name sortKey="Smail Tabbone, M" uniqKey="Smail Tabbone M">M Smail-Tabbone</name>
</author>
<author><name sortKey="Marchetti, G" uniqKey="Marchetti G">G Marchetti</name>
</author>
<author><name sortKey="Karaboga, As" uniqKey="Karaboga A">AS Karaboga</name>
</author>
<author><name sortKey="Souchet, M" uniqKey="Souchet M">M Souchet</name>
</author>
<author><name sortKey="Napoli, A" uniqKey="Napoli A">A Napoli</name>
</author>
<author><name sortKey="Devignes, Md" uniqKey="Devignes M">MD Devignes</name>
</author>
</analytic>
</biblStruct>
<biblStruct></biblStruct>
<biblStruct><analytic><author><name sortKey="Szathmary, L" uniqKey="Szathmary L">L Szathmary</name>
</author>
</analytic>
</biblStruct>
<biblStruct></biblStruct>
<biblStruct><analytic><author><name sortKey="Hall, M" uniqKey="Hall M">M Hall</name>
</author>
<author><name sortKey="Frank, E" uniqKey="Frank E">E Frank</name>
</author>
<author><name sortKey="Holmes, G" uniqKey="Holmes G">G Holmes</name>
</author>
<author><name sortKey="Pfahringer, B" uniqKey="Pfahringer B">B Pfahringer</name>
</author>
<author><name sortKey="Reutemann, P" uniqKey="Reutemann P">P Reutemann</name>
</author>
<author><name sortKey="Witten, Ih" uniqKey="Witten I">IH Witten</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Muggleton, S" uniqKey="Muggleton S">S Muggleton</name>
</author>
<author><name sortKey="Srinivasan, A" uniqKey="Srinivasan A">A Srinivasan</name>
</author>
<author><name sortKey="King, Rd" uniqKey="King R">RD King</name>
</author>
<author><name sortKey="Sternberg, Mje" uniqKey="Sternberg M">MJE Sternberg</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Page, D" uniqKey="Page D">D Page</name>
</author>
<author><name sortKey="Craven, M" uniqKey="Craven M">M Craven</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Santos, Jc" uniqKey="Santos J">JC Santos</name>
</author>
<author><name sortKey="Nassif, H" uniqKey="Nassif H">H Nassif</name>
</author>
<author><name sortKey="Page, D" uniqKey="Page D">D Page</name>
</author>
<author><name sortKey="Muggleton, Sh" uniqKey="Muggleton S">SH Muggleton</name>
</author>
<author><name sortKey="Sternberg, Mj" uniqKey="Sternberg M">MJ Sternberg</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Muggleton, S" uniqKey="Muggleton S">S Muggleton</name>
</author>
</analytic>
</biblStruct>
<biblStruct></biblStruct>
<biblStruct><analytic><author><name sortKey="Bresso, E" uniqKey="Bresso E">E Bresso</name>
</author>
<author><name sortKey="Grisoni, R" uniqKey="Grisoni R">R Grisoni</name>
</author>
<author><name sortKey="Devignes, Md" uniqKey="Devignes M">MD Devignes</name>
</author>
<author><name sortKey="Napoli, A" uniqKey="Napoli A">A Napoli</name>
</author>
<author><name sortKey="Smail Tabbone, M" uniqKey="Smail Tabbone M">M Smail-Tabbone</name>
</author>
</analytic>
</biblStruct>
<biblStruct></biblStruct>
<biblStruct><analytic><author><name sortKey="Napoli, A" uniqKey="Napoli A">A Napoli</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Dolcino, M" uniqKey="Dolcino M">M Dolcino</name>
</author>
<author><name sortKey="Cozzani, E" uniqKey="Cozzani E">E Cozzani</name>
</author>
<author><name sortKey="Riva, S" uniqKey="Riva S">S Riva</name>
</author>
<author><name sortKey="Parodi, A" uniqKey="Parodi A">A Parodi</name>
</author>
<author><name sortKey="Tinazzi, E" uniqKey="Tinazzi E">E Tinazzi</name>
</author>
<author><name sortKey="Lunardi, C" uniqKey="Lunardi C">C Lunardi</name>
</author>
<author><name sortKey="Puccetti, A" uniqKey="Puccetti A">A Puccetti</name>
</author>
</analytic>
</biblStruct>
</listBibl>
</div1>
</back>
</TEI>
<pmc article-type="research-article" xml:lang="en"><pmc-dir>properties open_access</pmc-dir>
  <front><journal-meta><journal-id journal-id-type="nlm-ta">BMC Bioinformatics</journal-id>
<journal-id journal-id-type="iso-abbrev">BMC Bioinformatics</journal-id>
<journal-title-group><journal-title>BMC Bioinformatics</journal-title>
</journal-title-group>
<issn pub-type="epub">1471-2105</issn>
<publisher><publisher-name>BioMed Central</publisher-name>
</publisher>
</journal-meta>
<article-meta><article-id pub-id-type="pmid">23802887</article-id>
<article-id pub-id-type="pmc">3710241</article-id>
<article-id pub-id-type="publisher-id">1471-2105-14-207</article-id>
<article-id pub-id-type="doi">10.1186/1471-2105-14-207</article-id>
<article-categories><subj-group subj-group-type="heading"><subject>Research Article</subject>
</subj-group>
</article-categories>
<title-group><article-title>Integrative relational machine-learning for understanding drug side-effect profiles</article-title>
</title-group>
<contrib-group><contrib contrib-type="author" corresp="yes" id="A1"><name><surname>Bresso</surname>
<given-names>Emmanuel</given-names>
</name>
<xref ref-type="aff" rid="I1">1</xref>
<xref ref-type="aff" rid="I2">2</xref>
<xref ref-type="aff" rid="I3">3</xref>
<email>emmanuel.bresso@loria.fr</email>
</contrib>
<contrib contrib-type="author" id="A2"><name><surname>Grisoni</surname>
<given-names>Renaud</given-names>
</name>
<xref ref-type="aff" rid="I2">2</xref>
<email>renaud.grisoni@gmail.com</email>
</contrib>
<contrib contrib-type="author" id="A3"><name><surname>Marchetti</surname>
<given-names>Gino</given-names>
</name>
<xref ref-type="aff" rid="I2">2</xref>
<xref ref-type="aff" rid="I4">4</xref>
<email>gino.marchetti@loria.fr</email>
</contrib>
<contrib contrib-type="author" id="A4"><name><surname>Karaboga</surname>
<given-names>Arnaud Sinan</given-names>
</name>
<xref ref-type="aff" rid="I3">3</xref>
<email>karaboga@harmonicpharma.com</email>
</contrib>
<contrib contrib-type="author" id="A5"><name><surname>Souchet</surname>
</name>
<xref ref-type="aff" rid="I3">3</xref>
<email>souchet@harmonicpharma.com</email>
</contrib>
<contrib contrib-type="author" id="A6"><name><surname>Devignes</surname>
<given-names>Marie-Dominique</given-names>
</name>
<xref ref-type="aff" rid="I2">2</xref>
<xref ref-type="aff" rid="I4">4</xref>
<email>marie-dominique.devignes@loria.fr</email>
</contrib>
<contrib contrib-type="author" corresp="yes" id="A7"><name><surname>Smaïl-Tabbone</surname>
<given-names>Malika</given-names>
</name>
<xref ref-type="aff" rid="I1">1</xref>
<xref ref-type="aff" rid="I2">2</xref>
<email>malika.smail@loria.fr</email>
</contrib>
</contrib-group>
<aff id="I1"><label>1</label>
Université de Lorraine, LORIA, UMR 7503, Vandoeuvre-lès-Nancy, 54506, France</aff>
<aff id="I2"><label>2</label>
INRIA, Villers-lès-Nancy, 54600, France</aff>
<aff id="I3"><label>3</label>
Harmonic Pharma, Espace Transfert INRIA NGE, Villers-lès-Nancy, 54600, France</aff>
<aff id="I4"><label>4</label>
CNRS, LORIA, UMR 7503, Vandoeuvre-lès-Nancy, 54506, France</aff>
<pub-date pub-type="collection"><year>2013</year>
</pub-date>
<pub-date pub-type="epub"><day>26</day>
<month>6</month>
<year>2013</year>
</pub-date>
<volume>14</volume>
<fpage>207</fpage>
<lpage>207</lpage>
<history><date date-type="received"><day>7</day>
<month>2</month>
<year>2013</year>
</date>
<date date-type="accepted"><day>21</day>
<month>6</month>
<year>2013</year>
</date>
</history>
<permissions><copyright-statement>Copyright © 2013 Bresso et al.; licensee BioMed Central Ltd.</copyright-statement>
<copyright-year>2013</copyright-year>
<copyright-holder>Bresso et al.; licensee BioMed Central Ltd.</copyright-holder>
<license license-type="open-access" xlink:href="http://creativecommons.org/licenses/by/2.0"><license-p>This is an Open Access article distributed under the terms of the Creative Commons Attribution License (<ext-link ext-link-type="uri" xlink:href="http://creativecommons.org/licenses/by/2.0">http://creativecommons.org/licenses/by/2.0</ext-link>), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</license-p>
</license>
</permissions>
<self-uri xlink:href="http://www.biomedcentral.com/1471-2105/14/207"></self-uri>
<abstract><sec><title>Background</title>
<p>Drug side effects represent a common reason for stopping drug development during clinical trials. Improving our ability to understand drug side effects is necessary to reduce attrition rates during drug development as well as the risk of discovering novel side effects in available drugs. Today, most investigations deal with isolated side effects and overlook their possible redundancy and frequent co-occurrence.</p>
</sec>
<sec><title>Results</title>
<p>In this work, drug annotations are collected from the SIDER and DrugBank databases. Terms describing individual side effects reported in SIDER are clustered with a semantic similarity measure into term clusters (TCs). Maximal frequent itemsets are extracted from the resulting drug × TC binary table, leading to the identification of what we call side-effect profiles (SEPs). A SEP is defined as the longest combination of TCs that is shared by a significant number of drugs. Frequent SEPs are explored on the basis of integrated drug and target descriptors using two machine-learning methods: decision trees and inductive logic programming. Although both methods yield explicit models, the inductive-logic-programming method performs relational learning and is able to exploit not only drug properties but also background knowledge. Learning efficiency is evaluated by cross-validation and direct testing with new molecules. Comparison of the two machine-learning methods shows that the inductive-logic-programming method displays greater sensitivity than decision trees and successfully exploits background knowledge such as functional annotations and pathways of drug targets, thereby producing rich and expressive rules. All models and theories are available on a dedicated web site.</p>
</sec>
<sec><title>Conclusions</title>
<p>Side-effect profiles covering a significant number of drugs have been extracted from a drug × side-effect association table. Integration of background knowledge concerning both chemical and biological spaces has been combined with a relational learning method for discovering rules which explicitly characterize drug-SEP associations. These rules are successfully used for predicting SEPs associated with new drugs.</p>
</sec>
</abstract>
<kwd-group><kwd>Relational machine learning</kwd>
<kwd>Data integration</kwd>
<kwd>Drug discovery</kwd>
<kwd>Data mining</kwd>
<kwd>Drug side-effects</kwd>
</kwd-group>
</article-meta>
</front>
<body><sec><title>Background</title>
<p>Side effects are unwanted responses to drug treatment. Some side effects are adverse, while others are more tolerable. Many side effects are detected during clinical trials, and adverse side effects are often responsible for the high attrition rate of drug candidates. For example, in 2008, the French Department of Industry estimated that only 1 drug out of 250 was approved by the FDA [<xref ref-type="bibr" rid="B1">1</xref>
]. Besides toxicity, it is undesirable to prescribe drugs with side effects such as nausea or headache over long periods. Moreover, not all side effects are detected during clinical trials. For example, the cardiotoxicity of benfluorex was only recently highlighted [<xref ref-type="bibr" rid="B2">2</xref>
] even though benfluorex was approved in the 1970s. Thus, early recognition of side effects is an important issue for drug development and safety.</p>
<p>To support side effect exploration, two main resources reporting their association with drugs have been developed. The FDA Adverse Event Reporting System (FAERS) stores the observed side effects reported directly by health care professionals and consumers. The SIDER database stores side-effect information mentioned on drug package inserts [<xref ref-type="bibr" rid="B3">3</xref>
].</p>
<p>Two groups of studies have been conducted on side effects. On the one hand, side-effect information has been exploited for drug repositioning. For example, Campillos et al. [<xref ref-type="bibr" rid="B4">4</xref>
] used a corpus-based side-effect similarity approach to show that pairs of drugs sharing similar side effects can have common targets. Thus, they use side-effect similarity to predict new targets for a drug. In a similar spirit, Takarabe et al. [<xref ref-type="bibr" rid="B5">5</xref>
] used FAERS to define pharmacological drug-drug similarity and to predict unknown drug-target interactions from the integration of the pharmacological similarity and genomic sequence similarity of target proteins. At the disease level, Yang and Agarwal [<xref ref-type="bibr" rid="B6">6</xref>
] proposed an approach based on the hypothesis that drugs sharing side effects could be indicated for the same disease. Drug side-effect associations and drug-disease relationships were used to develop a systematic drug repositioning method and to suggest, for instance, an antidiabetic effect for drugs causing porphyria.</p>
<p>On the other hand, other studies focus on understanding how side effects occur. As described above, relationships may exist between side effects and drug targets. Moreover, the link between chemical structure and side effects was shown by Scheiber et al. [<xref ref-type="bibr" rid="B7">7</xref>
]. From a more mechanistic point of view, Lee et al. [<xref ref-type="bibr" rid="B8">8</xref>
] showed that side effects can be correlated with the biological processes in which the drug targets are involved. For instance, they showed that nausea is correlated to an up-regulation of the deaminase activity. A very recent paper aims at predicting the side-effect profiles of molecules based on their chemical structures (defining the chemical space) and the information of their target proteins (defining the biological space) [<xref ref-type="bibr" rid="B9">9</xref>
]. The so-called side-effect profile of a molecule is simply defined as its binary fingerprint with respect to the side-effect terms. However, such earlier studies have several limitations. For example, (i) they consider only individual side effects, and ignore the fact that often more than one side effect is associated with a drug, (ii) the biological space is over-simplified, and (iii) the resulting prediction models are “black boxes” which do not provide any explicit and reusable knowledge.</p>
<p>Here, we study in a systematic way drug side-effect associations, and we propose a method for identifying and characterizing side-effect profiles (SEPs) shared by several drugs.</p>
<p>Our approach is composed of five main steps, as illustrated in Figure <xref ref-type="fig" rid="F1">1</xref>
. The first step (Figure <xref ref-type="fig" rid="F1">1</xref>
A) consists of grouping the terms used for side effects in SIDER using a semantic similarity measure in order to build Term Clusters (TC) corresponding to groups of semantically related SEs [<xref ref-type="bibr" rid="B10">10</xref>
]. In parallel, drugs from SIDER are mapped to DrugBank in order to retrieve information about drugs themselves and their targets (Figure <xref ref-type="fig" rid="F1">1</xref>
B). Then, TCs and drugs are associated in order to represent each drug by a side-effect fingerprint (Figure <xref ref-type="fig" rid="F1">1</xref>
C). SEPs are extracted as maximal frequent itemsets from side effect fingerprints (Figure <xref ref-type="fig" rid="F1">1</xref>
D). The aim is then to characterize each SEP in terms of drug and target properties. This can be addressed as a supervised classification task. Two machine-learning methods are chosen for this task: Decision Trees (DTs) and Inductive Logic Programming (ILP) (Figure <xref ref-type="fig" rid="F1">1</xref>
E). These two methods provide easily readable results which can then be exploited for understanding SEPs. Decision trees use a single table as input in which each row corresponds to a drug and each column to a drug descriptor. Inductive Logic Programming uses relational descriptors to learn a first-order-logic concept definition from observations. Relational descriptors encoding characteristics of both drugs and their targets are retrieved from our “NetworkDB” integrated database, which is built from several data sources including DrugBank, UniProt, KEGG, and GO. The models obtained for a set of selected SEPs with these two machine-learning methods are then evaluated by cross-validation and tested directly with new drugs. Finally, some elements are provided for model interpretation.</p>
<fig id="F1" position="float"><label>Figure 1</label>
<caption><p><bold>Overview of our approach for characterizing drug-SEP associations.</bold>
 Terms used for describing side effects in SIDER DB are grouped using a semantic similarity measure in order to build Term Clusters or TCs <bold>(A)</bold>
. Drugs are mapped to DrugBank in order to retrieve information about drugs themselves and their targets <bold>(B)</bold>
. TCs are associated to drugs to represent each drug by a side-effect fingerprint <bold>(C)</bold>
. SEPs are extracted as maximal frequent itemsets from side effect fingerprints <bold>(D)</bold>
. Two machine-learning methods are used to characterize each SEP in terms of drug and target properties <bold>(E)</bold>
.</p>
</caption>
<graphic xlink:href="1471-2105-14-207-1"></graphic>
</fig>
</sec>
<sec sec-type="methods"><title>Methods</title>
<sec><title>The NetworkDB resource</title>
<p>NetworkDB is a relational database which integrates data about molecules and their targets. These data are collected from various public data sources mentioned in the following sections. Figure <xref ref-type="fig" rid="F2">2</xref>
 shows the conceptual model of the database.</p>
<fig id="F2" position="float"><label>Figure 2</label>
<caption><p><bold>NetworkDB conceptual model.</bold>
 In this entity-relationship schema, entities are in boxes and relationships in ellipses.</p>
</caption>
<graphic xlink:href="1471-2105-14-207-2"></graphic>
</fig>
<sec><title>Chemical space: drugs and their properties</title>
<p>The SIDER database contains drug side-effect relationships [<xref ref-type="bibr" rid="B3">3</xref>
]. DrugBank is used to collect data such as categories and targets [<xref ref-type="bibr" rid="B11">11</xref>
]. The join between SIDER and DrugBank is based on the PubChem Compound identifier given by SIDER and DrugBank. A total of 554 drugs from SIDER are referenced in DrugBank v3.0.</p>
<p>Each drug is described by its category and the set of clusters it belongs to. Various structural representations and associated similarity measures were used to cluster drugs. The first similarity measure is based on the SMILES representation. The SMILES codes are converted with the Open Babel program into fingerprints, which allow linear and ring substructures to be identified [<xref ref-type="bibr" rid="B12">12</xref>
]. Then, the structural similarity between two molecular fingerprints is calculated using the Tanimoto measure. In addition, we calculated three other similarity scores using a spherical harmonics representation of molecules. This parametric representation of macromolecular surfaces was originally proposed and applied by Ritchie and Kemp [<xref ref-type="bibr" rid="B13">13</xref>
] and Cai et al. [<xref ref-type="bibr" rid="B14">14</xref>
]. The proprietary program HPCC (Harmonic Pharma) supports three variants of the spherical harmonic representation. HPCCgeo uses spherical harmonic coefficients (shape information) to calculate similarity between drugs, HPCCchem is based on chemical properties mapped on the spherical harmonic representation, and HPCCcombo combines shape and chemical information. Ward’s method is used to perform four hierarchical clusterings of drugs [<xref ref-type="bibr" rid="B15">15</xref>
]. The optimal number of clusters is determined by the method of Kelley et al. [<xref ref-type="bibr" rid="B16">16</xref>
]. Thus, 60 clusters are obtained with Tanimoto, 53 with HPCCgeo, 21 with HPCCchem and 34 with HPCCcombo measures.</p>
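<p>As an illustration of the Tanimoto measure mentioned above, the following minimal sketch computes the coefficient between two binary fingerprints. It assumes that each fingerprint is given as the set of positions of its bits that are set; the function name and the toy fingerprints are hypothetical and do not come from Open Babel or from the study data.</p>
<preformat>
```python
def tanimoto(bits_a, bits_b):
    """Tanimoto coefficient between two fingerprints given as sets of 'on' bit positions."""
    a, b = set(bits_a), set(bits_b)
    union = a | b
    if not union:
        # Convention chosen for this sketch: two empty fingerprints score 0.
        return 0.0
    return len(a.intersection(b)) / len(union)


# Toy fingerprints (assumed values, for illustration only).
fp1 = {1, 4, 7, 9}
fp2 = {1, 4, 8}
print(tanimoto(fp1, fp2))  # 2 shared bits out of 5 distinct bits
```
</preformat>
<p>A distance usable for hierarchical clustering can then be taken as one minus this coefficient, although the exact distance fed to Ward's method is not specified here.</p>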
<p>Drug categories are retrieved from DrugBank. These categories are mapped on the descendants of three MeSH concepts, namely “Molecular Mechanisms of Pharmacological Action” (<italic>D27.505.519</italic>
), “Physiological Effects of Drugs” (<italic>D27.505.696</italic>
) and “Therapeutic Uses” (<italic>D27.505.954</italic>
).</p>
</sec>
<sec><title>Biological space: proteins and their properties</title>
<p>Drug targets are extracted from both DrugBank and PDB [<xref ref-type="bibr" rid="B17">17</xref>
]. The outer join between PDB and DrugBank (retaining all DrugBank targets) is based on SMILES code identity. Drug targets are associated with their UniProt accession numbers. Thus, 768 targets are collected, representing an average of four targets per drug. Then, target annotations are retrieved from different databases. Protein-protein interactions are retrieved from the IntAct database [<xref ref-type="bibr" rid="B18">18</xref>
] and 5959 interactions were collected which correspond to 2827 new proteins. For all the proteins (drug targets and their interactants), 1403 pathway names are extracted from the KEGG database and the Pathway Interaction Database which integrates data from NCI-Nature, BioCarta and Reactome [<xref ref-type="bibr" rid="B19">19</xref>
,<xref ref-type="bibr" rid="B20">20</xref>
]. For the same proteins, GO terms are also collected from QuickGO database [<xref ref-type="bibr" rid="B21">21</xref>
]. Thus, 6494 GO terms annotating the 3595 proteins are stored in NetworkDB. Moreover, the “is_a” and “part_of” relationships between GO terms are stored in NetworkDB. Finally, 4650 protein domains associated with the targets and their interactants are retrieved from InterPro [<xref ref-type="bibr" rid="B22">22</xref>
].</p>
</sec>
</sec>
<sec><title>Grouping side-effect terms into term clusters</title>
<p>Side effects are extracted from SIDER. As shown previously [<xref ref-type="bibr" rid="B23">23</xref>
], the use of all terms describing side effects in SIDER (about 1500) impairs the execution of data mining programs and produces numerous and redundant patterns which are inappropriate for expert interpretation. As SIDER side-effect terms belong to the Medical Dictionary for Regulatory Activities (MedDRA) [<xref ref-type="bibr" rid="B24">24</xref>
], a semantic similarity between these terms can be calculated based on the structure of MedDRA [<xref ref-type="bibr" rid="B10">10</xref>
]. Next, a hierarchical clustering method is applied to obtain 112 Term Clusters (TCs) which are then validated by experts [<xref ref-type="bibr" rid="B23">23</xref>
]. For instance, the TC named 65_Dermatitis is the 65th TC and has Dermatitis as its representative term.</p>
</sec>
<sec><title>Datasets</title>
<sec><title>Association of drugs with side effects</title>
<p>The association between drugs and TCs is an important step for the characterization of drugs sharing side effects. As the TC size varies from 2 to 59 terms, it seems consistent to use a heuristic procedure depending on the TC size. Let <italic>k</italic>
<sub><italic>i</italic>
</sub>
 be the number of terms in <italic>T</italic>
<italic>C</italic>
<sub><italic>i</italic>
</sub>
 and <italic>n</italic>
<sub><italic>i</italic>
</sub>
 be the minimal number of side effects required for assigning <italic>T</italic>
<italic>C</italic>
<sub><italic>i</italic>
</sub>
 to a drug. Considering <italic>n</italic>
<sub><italic>i</italic>
</sub>
 = 1 for any <italic>T</italic>
<italic>C</italic>
<sub><italic>i</italic>
</sub>
 results in a very loose association yielding a very dense binary table hampering further computation, whereas considering <italic>n</italic>
<sub><italic>i</italic>
</sub>
 = <italic>k</italic>
<sub><italic>i</italic>
</sub>
 for any <italic>T</italic>
<italic>C</italic>
<sub><italic>i</italic>
</sub>
 results in a very stringent association which might skip over important drug side effects. In fact a trade-off between these two extreme solutions is required. Grouping the <italic>k</italic>
<sub><italic>i</italic>
</sub>
 values into intervals of five, with the last interval extending from 21 to 59, allows us to set up a simple association procedure in which <italic>n</italic>
<sub><italic>i</italic>
</sub>
 ranges from 1 to 5. The resulting association between drugs and TCs is shown in Figure <xref ref-type="fig" rid="F3">3</xref>
 where each row represents the side-effect binary fingerprint associated with a drug. This binary table (drug × TC) is then used to discover interesting side-effect profiles defined here as the longest combinations of TCs shared by significant sets of drugs.</p>
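<p>The association heuristic can be sketched as follows. This is only one possible reading of the procedure: it assumes that TC sizes are binned as 1-5, 6-10, 11-15, 16-20 and 21-59, and that these bins map to a minimal number of matching terms of 1, 2, 3, 4 and 5 respectively. The function names and the toy cluster contents are hypothetical.</p>
<preformat>
```python
def min_terms_required(k):
    """Minimal number of matching terms n_i for a term cluster of size k.

    Assumed binning: 1-5 gives 1, 6-10 gives 2, 11-15 gives 3,
    16-20 gives 4, and 21 or more gives 5.
    """
    if k >= 21:
        return 5
    return (k - 1) // 5 + 1


def fingerprint(drug_terms, clusters):
    """Binary fingerprint: cluster name -> 1 if enough of its terms annotate the drug."""
    result = {}
    for name, terms in clusters.items():
        hits = len(set(terms).intersection(drug_terms))
        result[name] = int(hits >= min_terms_required(len(terms)))
    return result


# Toy term clusters (invented contents, for illustration only).
clusters = {
    "TC_small": ["rash", "itch"],
    "TC_mid": ["a", "b", "c", "d", "e", "f", "g"],
}
print(fingerprint({"rash", "a"}, clusters))
```
</preformat>
<p>In this toy case the drug is assigned TC_small (one matching term suffices for a cluster of two terms) but not TC_mid (a cluster of seven terms requires two matching terms under the assumed binning).</p>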
<fig id="F3" position="float"><label>Figure 3</label>
<caption><p><bold>Drug side-effect binary table.</bold>
 This table is presented as a heatmap (produced with R) where rows and columns are grouped by distribution similarity. Each row represents the side-effect fingerprint of a drug and each column is a side-effect term cluster.</p>
</caption>
<graphic xlink:href="1471-2105-14-207-3"></graphic>
</fig>
</sec>
<sec><title>Single-table datasets</title>
<p>Single table datasets designed for DT learning represent each drug by an attribute-value vector. Four types of descriptors retrieved from NetworkDB are used to generate these attributes: the first is the class information, <italic>i.e.</italic>
 the studied SEP; the second comprises drug categories; the third lists all drug targets, with three attributes per target referring to the type of action of the drug (activation, inhibition and other); and the fourth concerns clusters of similar drugs according to the four similarity measures described above. Because of target and category multiplicity, the total dimension of this dataset varies between 741 and 924 depending on the SEP.</p>
</sec>
<sec><title>Relational datasets</title>
<p>Relational datasets designed for Inductive Logic Programming (ILP) consist of a set of tables extracted from NetworkDB describing drug properties and background knowledge. Drug properties are the same as in the single-table dataset, <italic>i.e.</italic>
 categories, targets and clusters. Background knowledge includes GO annotations, domain composition, interactants and pathways of each drug target. Relationships between GO terms constitute an additional table.</p>
</sec>
</sec>
<sec><title>Data mining</title>
<sec><title>Maximal frequent itemsets</title>
<p>In a binary table (object × attribute), a frequent itemset is a group of attributes shared by a number of objects greater than a threshold support. A frequent itemset is a maximal frequent itemset (MFI) if none of its proper supersets is frequent [<xref ref-type="bibr" rid="B25">25</xref>
]. It follows that the combined attributes of two distinct maximal frequent itemsets (MFIs) cannot be shared by a number of objects greater than the threshold support. In our case, MFIs are the largest combinations of TCs shared by a number of drugs greater than 100. This threshold was chosen as a trade-off between high values yielding short MFIs limited to one or two TCs and low values yielding numerous MFIs covering only a few molecules. MFIs are extracted from the binary table (Figure <xref ref-type="fig" rid="F3">3</xref>
) using the Coron program [<xref ref-type="bibr" rid="B26">26</xref>
] after excluding TCs which cover more than 50% of the molecules.</p>
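<p>The definition of maximal frequent itemsets can be made concrete with the brute-force sketch below. It is not the algorithm implemented in Coron; it simply enumerates itemsets level by level on a toy table and keeps the frequent ones that have no frequent proper superset. The sketch assumes that "frequent" means a support greater than or equal to the threshold, and all names and data are hypothetical.</p>
<preformat>
```python
from itertools import combinations


def maximal_frequent_itemsets(table, min_support):
    """table: dict object -> set of attributes. Returns the MFIs as frozensets."""
    attributes = sorted(set().union(*table.values()))

    def support(itemset):
        # Number of objects whose attribute set contains the whole itemset.
        return sum(1 for attrs in table.values() if itemset.issubset(attrs))

    frequent = []
    for size in range(1, len(attributes) + 1):
        level = [
            frozenset(combo)
            for combo in combinations(attributes, size)
            if support(frozenset(combo)) >= min_support
        ]
        if not level:
            break  # anti-monotonicity: no larger itemset can be frequent
        frequent.extend(level)
    # Keep only the itemsets that have no frequent proper superset.
    return [
        s for s in frequent
        if not any(s != t and s.issubset(t) for t in frequent)
    ]


# Toy drug x TC table (invented data).
table = {
    "d1": {"A", "B", "C"},
    "d2": {"A", "B"},
    "d3": {"A", "B", "C"},
    "d4": {"C", "D"},
    "d5": {"C", "D"},
}
mfis = maximal_frequent_itemsets(table, min_support=2)
print(sorted(sorted(m) for m in mfis))
```
</preformat>
<p>With a minimal support of 2, the toy table yields two MFIs, {A, B, C} and {C, D}. They share the attribute C, but no object carries all four attributes together. This exhaustive enumeration is exponential in the number of attributes and is only meant for small examples.</p>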
</sec>
<sec><title>Decision trees</title>
<p>Decision tree (DT) construction is a machine-learning method which uses an (object × attribute) table to classify objects. Results given by this method are easily readable. Decision trees are built here with the J48 implementation of the C4.5 tree learner in the Weka toolbox using single-table datasets converted into the ARFF format [<xref ref-type="bibr" rid="B27">27</xref>
]. We use the default parameters except for two of them: we use <italic>minNumObj</italic>
 = 5 and <italic>binarySplits</italic>
 = <italic>true</italic>
.</p>
</sec>
<sec><title>Inductive Logic Programming (ILP)</title>
<p>ILP is a machine-learning method which uses relational data as input and has been successfully applied to various areas including bioinformatics [<xref ref-type="bibr" rid="B28">28</xref>
-<xref ref-type="bibr" rid="B30">30</xref>
]. It allows us to learn a concept definition from observations, i.e., a set of positive examples (E+) and a set of negative examples (E-), and background knowledge (B) [<xref ref-type="bibr" rid="B31">31</xref>
]. The ILP experiments produce theories as sets of first-order logic rules. They were conducted here with the Aleph program [<xref ref-type="bibr" rid="B32">32</xref>
]. Many parameters can be tuned for theory construction. The three main parameters are the <italic>min-pos</italic>
, the <italic>noise</italic>
 and the <italic>induce-type</italic>
. The <italic>min-pos</italic>
 parameter is the minimal number of positive examples that a rule must cover. The <italic>noise</italic>
 corresponds to the maximal number of negative examples that an acceptable rule may cover (in our case, one is never sure that a drug does not have a given side effect). The third parameter is <italic>induce-type</italic>
 which directs theory construction. When this parameter is set to <italic>induce-cover</italic>
, overlapping rules are produced (<italic>i.e.</italic>
, a drug can be covered by several rules). Based on previous experience [<xref ref-type="bibr" rid="B33">33</xref>
], we used the following settings: <italic>min</italic>
-<italic>pos</italic>
 = 5, <italic>noise</italic>
 = 1 and <italic>induce</italic>
-<italic>type</italic>
 = <italic>induce</italic>
-<italic>cover</italic>
.</p>
</sec>
</sec>
<sec><title>Model evaluation</title>
<sec><title>Cross-validation</title>
<p>Both ILP theories and decision trees are evaluated with 10 runs of a 10-fold stratified cross-validation. DT cross-validation is performed with the Weka experimenter interface. For ILP, we took advantage of our recent integration of Aleph into the KNIME platform [<xref ref-type="bibr" rid="B34">34</xref>
]. KNIME cross-validation meta-node is adapted for theory evaluation. An example is predicted as positive if it is covered by at least one rule. Each cross-validation assay yields a confusion matrix counting true and false positives, as well as true and false negatives. Each assay is then evaluated by the calculation of accuracy (ratio of correctly classified instances), specificity (true negative rate) and sensitivity (true positive rate).</p>
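<p>The evaluation measures can be sketched as follows, assuming that each rule of a theory is available as a boolean function over a drug description and that an example is predicted as positive when at least one rule covers it. The two rules and the toy drugs below are invented for illustration and are not rules learned in this study. The sketch also assumes that both classes are present, so that no denominator is zero.</p>
<preformat>
```python
def covered(example, rules):
    """An example is predicted positive if at least one rule covers it."""
    return any(rule(example) for rule in rules)


def evaluate(actual, predicted):
    """Accuracy, sensitivity and specificity from two lists of booleans."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a and p)
    tn = sum(1 for a, p in pairs if not a and not p)
    fp = sum(1 for a, p in pairs if not a and p)
    fn = sum(1 for a, p in pairs if a and not p)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
    }


# Hypothetical rules over toy drug descriptions (not taken from the study).
rules = [
    lambda drug: "kinase" in drug["target_domains"],
    lambda drug: drug["category"] == "antineoplastic",
]
drugs = [
    {"category": "antineoplastic", "target_domains": {"kinase"}, "label": True},
    {"category": "analgesic", "target_domains": {"kinase"}, "label": True},
    {"category": "analgesic", "target_domains": {"gpcr"}, "label": True},
    {"category": "antineoplastic", "target_domains": set(), "label": False},
    {"category": "analgesic", "target_domains": {"gpcr"}, "label": False},
    {"category": "analgesic", "target_domains": set(), "label": False},
]
actual = [d["label"] for d in drugs]
predicted = [covered(d, rules) for d in drugs]
scores = evaluate(actual, predicted)
print(scores)
```
</preformat>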
</sec>
<sec><title>Direct test</title>
<p>Theories and decision trees are also evaluated by direct test. Drugs used for testing are those present in SIDER 2 and DrugBank (v3.0) but not present in SIDER. For these drugs all descriptors are retrieved and stored in NetworkDB. Furthermore, the reports of FAERS from 2004 to 2011 were imported as a database and used as an external information source for checking the false positives predicted by our models. We consider that a molecule is associated with a SEP in FAERS if for each TC of the SEP there is at least one report stating that the molecule is the primary suspect of an observed side effect belonging to the TC. Our checking procedure is only an anticipation, as it relies on the fact that updating the package insert of a drug (stored in SIDER) requires that a sufficient number of adverse-effect incidents occur (especially for new drugs).</p>
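<p>The FAERS checking criterion can be sketched as below. The sketch assumes that reports are available as (drug, role, side-effect term) tuples and that the role of a primary suspect is encoded by the label "primary_suspect"; this encoding, the drug names and the term memberships of the two TCs are hypothetical.</p>
<preformat>
```python
def associated_in_faers(drug, sep_clusters, cluster_terms, reports):
    """True if every TC of the SEP has at least one primary-suspect report for the drug.

    reports: iterable of (drug, role, side_effect_term) tuples (assumed layout).
    cluster_terms: dict TC name -> set of side-effect terms in that TC.
    """
    observed = {
        term
        for reported_drug, role, term in reports
        if reported_drug == drug and role == "primary_suspect"
    }
    return all(observed.intersection(cluster_terms[tc]) for tc in sep_clusters)


# Invented term memberships and reports, for illustration only.
cluster_terms = {
    "99_Headache": {"headache", "migraine"},
    "110_Shock": {"shock", "anaphylactic shock"},
}
reports = [
    ("drugX", "primary_suspect", "migraine"),
    ("drugX", "concomitant", "shock"),
    ("drugY", "primary_suspect", "headache"),
    ("drugY", "primary_suspect", "shock"),
]
sep = ["99_Headache", "110_Shock"]
print(associated_in_faers("drugX", sep, cluster_terms, reports))
print(associated_in_faers("drugY", sep, cluster_terms, reports))
```
</preformat>
<p>In this toy example drugX is not associated with the SEP, because its only report for 110_Shock does not list it as primary suspect, whereas drugY is associated with it.</p>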
</sec>
</sec>
</sec>
<sec><title>Results and discussion</title>
<sec><title>Overall distribution of side effects</title>
<p>A drug is associated with a TC (group of semantically related side effects) if it is annotated by a minimum number of side effects of this TC (see Methods). The resulting binary table is shown in Figure <xref ref-type="fig" rid="F3">3</xref>
, where each row represents the side-effect fingerprint of one of the 554 drugs considered here, and each column represents one of the 112 TCs. In this representation, drugs and TCs have been grouped by distribution similarity. On the right part of the figure, we can see TCs associated with a limited number of drugs, whereas highly represented TCs are on the left. In the same way, drug fingerprints involving few TCs are on the top of Figure <xref ref-type="fig" rid="F3">3</xref>
 and drugs with a high number of TCs are on the lower part. Zooming in on adjacent columns reveals that some TCs seem to be frequently associated with the same drugs, for example the pair TC 39_Stevens-Johnson_syndrome and TC 100_Erythema_multiforme.</p>
<p>However, apart from providing a general idea about the complexity of TC association with drugs, this visualization cannot be exploited easily. More precise information can be retrieved by querying NetworkDB. For example, the maximal number of TCs associated with a drug is 89 for ropinirole (an anti-Parkinson agent). Conversely, 18 drugs are associated with only one TC. For instance, bretylium (an anti-hypertensive agent) is only associated with TC 110_Shock. From the TC point of view, the number of drugs associated with a TC ranges from 1 to 410. The 13 TCs covering more than 50% of the molecules are excluded in the rest of the study.</p>
</sec>
<sec><title>Side-effect profiles</title>
<p>The overall intuition provided by Figure <xref ref-type="fig" rid="F3">3</xref>
 is that groups of TCs shared by drugs exist and should be extracted. In fact, extracting patterns from such a binary table is the purpose of itemset search algorithms [<xref ref-type="bibr" rid="B35">35</xref>
]. We thus perform MFI extraction and we define side-effect profiles (SEPs) as maximal groups of TCs covering at least 20% of the drug set (110 drugs). The resulting 26 SEPs are listed in Table <xref ref-type="table" rid="T1">1</xref>
. Regarding length, 3 SEPs have only one TC, 13 combine 2 TCs, 9 combine 3 TCs, and only one combines 4 TCs. These 26 SEPs concern 372 molecules (67% of the drug set) and involve 18 distinct TCs of which the most frequent are 99_Headache and 90_Feeling_abnormal which appear 8 times each, whereas 7 TCs appear in only one SEP. These 26 most frequent SEPs are considered in the rest of the study. By construction, although two SEPs can have common TCs, they cannot cover more than 100 molecules in common.</p>
<table-wrap position="float" id="T1"><label>Table 1</label>
<caption><p><bold>Maximal frequent itemsets covering 20% of drugs (support) extracted from the drug</bold>
<bold><italic>×</italic>
</bold>
<bold>TC table</bold>
</p>
</caption>
<table frame="hsides" rules="groups" border="1"><colgroup><col align="left"></col>
<col align="center"></col>
<col align="center"></col>
<col align="center"></col>
</colgroup>
<thead valign="top"><tr><th align="left"><bold>SEP</bold>
</th>
<th align="left"><bold>Profile composition</bold>
</th>
<th align="center"><bold>Support</bold>
</th>
<th align="center"><bold>Avg overlap</bold>
</th>
</tr>
</thead>
<tbody valign="top"><tr><td align="left" valign="bottom">SEP_1<hr></hr>
</td>
<td align="left" valign="bottom">41_Leukopenia, 90_Feeling_abnormal, 99_Headache<hr></hr>
</td>
<td align="center" valign="bottom">123<hr></hr>
</td>
<td align="center" valign="bottom">69<hr></hr>
</td>
</tr>
<tr><td align="left" valign="bottom">SEP_2<hr></hr>
</td>
<td align="left" valign="bottom">90_Feeling_abnormal, 99_Headache, 110_Shock<hr></hr>
</td>
<td align="center" valign="bottom">123<hr></hr>
</td>
<td align="center" valign="bottom">73<hr></hr>
</td>
</tr>
<tr><td align="left" valign="bottom">SEP_3<hr></hr>
</td>
<td align="left" valign="bottom">58_Gout<hr></hr>
</td>
<td align="center" valign="bottom">120<hr></hr>
</td>
<td align="center" valign="bottom">60<hr></hr>
</td>
</tr>
<tr><td align="left" valign="bottom">SEP_4<hr></hr>
</td>
<td align="left" valign="bottom">70_Pneumonia, 99_Headache<hr></hr>
</td>
<td align="center" valign="bottom">117<hr></hr>
</td>
<td align="center" valign="bottom">71<hr></hr>
</td>
</tr>
<tr><td align="left" valign="bottom">SEP_5<hr></hr>
</td>
<td align="left" valign="bottom">110_Shock, 111_Infection<hr></hr>
</td>
<td align="center" valign="bottom">117<hr></hr>
</td>
<td align="center" valign="bottom">68<hr></hr>
</td>
</tr>
<tr><td align="left" valign="bottom">SEP_6<hr></hr>
</td>
<td align="left" valign="bottom">76_Asthma, 90_Feeling_abnormal, 99_Headache<hr></hr>
</td>
<td align="center" valign="bottom">117<hr></hr>
</td>
<td align="center" valign="bottom">68<hr></hr>
</td>
</tr>
<tr><td align="left" valign="bottom">SEP_7<hr></hr>
</td>
<td align="left" valign="bottom">65_Dermatitis<hr></hr>
</td>
<td align="center" valign="bottom">116<hr></hr>
</td>
<td align="center" valign="bottom">53<hr></hr>
</td>
</tr>
<tr><td align="left" valign="bottom">SEP_8<hr></hr>
</td>
<td align="left" valign="bottom">2_Haemorrhage, 76_Asthma<hr></hr>
</td>
<td align="center" valign="bottom">115<hr></hr>
</td>
<td align="center" valign="bottom">65<hr></hr>
</td>
</tr>
<tr><td align="left" valign="bottom">SEP_9<hr></hr>
</td>
<td align="left" valign="bottom">41_Leukopenia, 76_Asthma<hr></hr>
</td>
<td align="center" valign="bottom">115<hr></hr>
</td>
<td align="center" valign="bottom">62<hr></hr>
</td>
</tr>
<tr><td align="left" valign="bottom">SEP_10<hr></hr>
</td>
<td align="left" valign="bottom">48_Rhinitis, 99_Headache, 111_Infection<hr></hr>
</td>
<td align="center" valign="bottom">115<hr></hr>
</td>
<td align="center" valign="bottom">69<hr></hr>
</td>
</tr>
<tr><td align="left" valign="bottom">SEP_11<hr></hr>
</td>
<td align="left" valign="bottom">41_Leukopenia, 110_Shock<hr></hr>
</td>
<td align="center" valign="bottom">114<hr></hr>
</td>
<td align="center" valign="bottom">66<hr></hr>
</td>
</tr>
<tr><td align="left" valign="bottom">SEP_12<hr></hr>
</td>
<td align="left" valign="bottom">39_Stevens-Johnson_syndrome, 41_Leukopenia, 100_Erythema_multiforme<hr></hr>
</td>
<td align="center" valign="bottom">114<hr></hr>
</td>
<td align="center" valign="bottom">52<hr></hr>
</td>
</tr>
<tr><td align="left" valign="bottom">SEP_13<hr></hr>
</td>
<td align="left" valign="bottom">41_Leukopenia, 48_Rhinitis<hr></hr>
</td>
<td align="center" valign="bottom">113<hr></hr>
</td>
<td align="center" valign="bottom">67<hr></hr>
</td>
</tr>
<tr><td align="left" valign="bottom">SEP_14<hr></hr>
</td>
<td align="left" valign="bottom">99_Headache, 100_Erythema_multiforme<hr></hr>
</td>
<td align="center" valign="bottom">113<hr></hr>
</td>
<td align="center" valign="bottom">56<hr></hr>
</td>
</tr>
<tr><td align="left" valign="bottom">SEP_15<hr></hr>
</td>
<td align="left" valign="bottom">31_Lymphadenopathy<hr></hr>
</td>
<td align="center" valign="bottom">112<hr></hr>
</td>
<td align="center" valign="bottom">59<hr></hr>
</td>
</tr>
<tr><td align="left" valign="bottom">SEP_16<hr></hr>
</td>
<td align="left" valign="bottom">70_Pneumonia, 90_Feeling_abnormal<hr></hr>
</td>
<td align="center" valign="bottom">112<hr></hr>
</td>
<td align="center" valign="bottom">71<hr></hr>
</td>
</tr>
<tr><td align="left" valign="bottom">SEP_17<hr></hr>
</td>
<td align="left" valign="bottom">41_Leukopenia, 70_Pneumonia<hr></hr>
</td>
<td align="center" valign="bottom">112<hr></hr>
</td>
<td align="center" valign="bottom">64<hr></hr>
</td>
</tr>
<tr><td align="left" valign="bottom">SEP_18<hr></hr>
</td>
<td align="left" valign="bottom">76_Asthma, 111_Infection<hr></hr>
</td>
<td align="center" valign="bottom">112<hr></hr>
</td>
<td align="center" valign="bottom">64<hr></hr>
</td>
</tr>
<tr><td align="left" valign="bottom">SEP_19<hr></hr>
</td>
<td align="left" valign="bottom">80_Jaundice, 100_Erythema_multiforme<hr></hr>
</td>
<td align="center" valign="bottom">112<hr></hr>
</td>
<td align="center" valign="bottom">45<hr></hr>
</td>
</tr>
<tr><td align="left" valign="bottom">SEP_20<hr></hr>
</td>
<td align="left" valign="bottom">41_Leukopenia, 111_Infection<hr></hr>
</td>
<td align="center" valign="bottom">111<hr></hr>
</td>
<td align="center" valign="bottom">63<hr></hr>
</td>
</tr>
<tr><td align="left" valign="bottom">SEP_21<hr></hr>
</td>
<td align="left" valign="bottom">8_Haematuria, 90_Feeling_abnormal, 99_Headache<hr></hr>
</td>
<td align="center" valign="bottom">111<hr></hr>
</td>
<td align="center" valign="bottom">68<hr></hr>
</td>
</tr>
<tr><td align="left" valign="bottom">SEP_22<hr></hr>
</td>
<td align="left" valign="bottom">13_Pyrexia, 33_Musculoskeletal_discomfort, 48_Rhinitis, 99_Headache<hr></hr>
</td>
<td align="center" valign="bottom">111<hr></hr>
</td>
<td align="center" valign="bottom">69<hr></hr>
</td>
</tr>
<tr><td align="left" valign="bottom">SEP_23<hr></hr>
</td>
<td align="left" valign="bottom">13_Pyrexia, 70_Pneumonia<hr></hr>
</td>
<td align="center" valign="bottom">110<hr></hr>
</td>
<td align="center" valign="bottom">69<hr></hr>
</td>
</tr>
<tr><td align="left" valign="bottom">SEP_24<hr></hr>
</td>
<td align="left" valign="bottom">48_Rhinitis, 90_Feeling_abnormal, 110_Shock<hr></hr>
</td>
<td align="center" valign="bottom">110<hr></hr>
</td>
<td align="center" valign="bottom">70<hr></hr>
</td>
</tr>
<tr><td align="left" valign="bottom">SEP_25<hr></hr>
</td>
<td align="left" valign="bottom">13_Pyrexia, 90_Feeling_abnormal, 110_Shock<hr></hr>
</td>
<td align="center" valign="bottom">110<hr></hr>
</td>
<td align="center" valign="bottom">70<hr></hr>
</td>
</tr>
<tr><td align="left">SEP_26</td>
<td align="left">48_Rhinitis, 90_Feeling_abnormal, 111_Infection</td>
<td align="center">110</td>
<td align="center">69</td>
</tr>
</tbody>
</table>
<table-wrap-foot><p>Avg overlap: average of overlap size between the SEP and other SEPs.</p>
</table-wrap-foot>
</table-wrap>
</sec>
<sec><title>Characterization of frequent SEPs</title>
<p>Our hypothesis is that a SEP shared by a large number of drugs can be explained in terms of drug properties and background knowledge. Thus, two machine-learning methods, decision trees and ILP, are applied on the drugs associated with each SEP. For both methods, the positive examples are taken to be all the drugs associated with a SEP, and those drugs that are not associated with any of the TCs composing the SEP are taken as negative examples. Negative examples represent 60% of the learning set.</p>
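<p>The construction of the learning sets can be sketched as follows, assuming that each drug fingerprint is given as the set of TCs assigned to the drug. Drugs carrying every TC of the SEP are taken as positive examples, drugs carrying none of them as negative examples, and drugs carrying only some of them are left out. The function name and the toy fingerprints are hypothetical.</p>
<preformat>
```python
def build_examples(fingerprints, sep):
    """fingerprints: dict drug -> set of TCs; sep: collection of TCs forming the profile."""
    sep = set(sep)
    positives = sorted(d for d, tcs in fingerprints.items() if sep.issubset(tcs))
    negatives = sorted(d for d, tcs in fingerprints.items() if sep.isdisjoint(tcs))
    return positives, negatives


# Toy fingerprints (invented data).
fingerprints = {
    "d1": {"41_Leukopenia", "90_Feeling_abnormal", "99_Headache"},
    "d2": {"99_Headache"},
    "d3": {"58_Gout"},
    "d4": set(),
}
positives, negatives = build_examples(
    fingerprints, {"90_Feeling_abnormal", "99_Headache"}
)
print(positives, negatives)
```
</preformat>
<p>Here d1 is the only positive example, d3 and d4 are negative examples, and d2 is excluded because it carries only one of the two TCs of the profile.</p>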
<p>For each profile, classification efficiency is evaluated by 10 × 10 cross-validation in terms of accuracy (Acc), specificity (Spec) and sensitivity (Sens). The results presented in Table <xref ref-type="table" rid="T2">2</xref>
 show that for both methods, the generated models are good classifiers, with an average accuracy of 67% for DTs and 65% for ILP. For 23/26 SEPs, accuracy is higher with DTs than with ILP, mostly reflecting the higher specificity values obtained with DTs. Conversely, sensitivity is higher with ILP than with DTs for every SEP except SEP_17, where ILP sensitivity is 0.01 lower than DT sensitivity (0.51 versus 0.52). Thus, ILP provides more sensitive theories whereas DTs provide more specific models. Sensitivity is probably more important than specificity for drug development, as it is for medical diagnosis: low sensitivity means that some SEPs can be missed although they are truly associated with the tested drug. Thus, ILP theories display attractive qualities for SEP prediction. Five SEPs (1, 3, 12, 15, and 19) are particularly well characterized with ILP, with sensitivity values of at least 60%. The amount and quality of available data may explain the differences in results observed between SEPs. It should be noted that comparison with other reported methods is difficult because we aim to characterize and predict SEPs rather than isolated side effects. The closest study is that of Yamanishi et al. [<xref ref-type="bibr" rid="B9">9</xref>
], whose objective is to predict isolated side effects using multi-class statistical methods. These authors therefore do not produce comparable accuracy values.</p>
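<p>For reference, the three measures can be computed from confusion-matrix counts as in the following sketch. The counts are hypothetical and were chosen only to reproduce the 60% share of negative examples; the function name is illustrative and does not come from the original study.</p>

```python
def classification_metrics(tp, fp, tn, fn):
    """Return (accuracy, specificity, sensitivity) from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    specificity = tn / (tn + fp)  # fraction of negatives correctly rejected
    sensitivity = tp / (tp + fn)  # fraction of positives correctly recovered
    return accuracy, specificity, sensitivity


# Hypothetical counts: 100 examples, of which 60 are negatives and 40 positives.
acc, spec, sens = classification_metrics(tp=24, fp=12, tn=48, fn=16)
print(round(acc, 2), round(spec, 2), round(sens, 2))

# On a single test set, accuracy is the class-weighted mean of the two rates.
weighted = 0.6 * spec + 0.4 * sens
print(round(weighted, 2))
```

<p>Because negatives make up 60% of each learning set, specificity weighs more heavily than sensitivity in accuracy, which is consistent with DTs reaching a higher accuracy despite their lower sensitivity. The identity holds exactly for a single confusion matrix and only approximately for values averaged over cross-validation folds.</p>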
<table-wrap position="float" id="T2"><label>Table 2</label>
<caption><p>Evaluation of learning results by 10 × 10 stratified cross-validation of DT and ILP programs</p>
</caption>
<table frame="hsides" rules="groups" border="1"><colgroup><col align="left"></col>
<col align="center"></col>
<col align="center"></col>
<col align="center"></col>
<col align="center"></col>
<col align="center"></col>
<col align="center"></col>
</colgroup>
<thead valign="top"><tr><th align="left" valign="bottom"><bold>SEP</bold>
<hr></hr>
</th>
<th colspan="3" align="center" valign="bottom"><bold>DT</bold>
<hr></hr>
</th>
<th colspan="3" align="center" valign="bottom"><bold>ILP</bold>
<hr></hr>
</th>
</tr>
<tr><th align="left"> </th>
<th align="center"><bold>Acc</bold>
</th>
<th align="center"><bold>Spec</bold>
</th>
<th align="center"><bold>Sens</bold>
</th>
<th align="center"><bold>Acc</bold>
</th>
<th align="center"><bold>Spec</bold>
</th>
<th align="center"><bold>Sens</bold>
</th>
</tr>
</thead>
<tbody valign="top"><tr><td align="left" valign="bottom">SEP_1<hr></hr>
</td>
<td align="center" valign="bottom">0.65<hr></hr>
</td>
<td align="center" valign="bottom">0.86<hr></hr>
</td>
<td align="center" valign="bottom">0.39<hr></hr>
</td>
<td align="center" valign="bottom">0.61<hr></hr>
</td>
<td align="center" valign="bottom">0.63<hr></hr>
</td>
<td align="center" valign="bottom">0.6<hr></hr>
</td>
</tr>
<tr><td align="left" valign="bottom">SEP_2<hr></hr>
</td>
<td align="center" valign="bottom">0.69<hr></hr>
</td>
<td align="center" valign="bottom">0.88<hr></hr>
</td>
<td align="center" valign="bottom">0.4<hr></hr>
</td>
<td align="center" valign="bottom">0.63<hr></hr>
</td>
<td align="center" valign="bottom">0.69<hr></hr>
</td>
<td align="center" valign="bottom">0.54<hr></hr>
</td>
</tr>
<tr><td align="left" valign="bottom">SEP_3<hr></hr>
</td>
<td align="center" valign="bottom">0.71<hr></hr>
</td>
<td align="center" valign="bottom">0.88<hr></hr>
</td>
<td align="center" valign="bottom">0.47<hr></hr>
</td>
<td align="center" valign="bottom">0.71<hr></hr>
</td>
<td align="center" valign="bottom">0.77<hr></hr>
</td>
<td align="center" valign="bottom">0.63<hr></hr>
</td>
</tr>
<tr><td align="left" valign="bottom">SEP_4<hr></hr>
</td>
<td align="center" valign="bottom">0.66<hr></hr>
</td>
<td align="center" valign="bottom">0.89<hr></hr>
</td>
<td align="center" valign="bottom">0.32<hr></hr>
</td>
<td align="center" valign="bottom">0.62<hr></hr>
</td>
<td align="center" valign="bottom">0.7<hr></hr>
</td>
<td align="center" valign="bottom">0.51<hr></hr>
</td>
</tr>
<tr><td align="left" valign="bottom">SEP_5<hr></hr>
</td>
<td align="center" valign="bottom">0.68<hr></hr>
</td>
<td align="center" valign="bottom">0.88<hr></hr>
</td>
<td align="center" valign="bottom">0.38<hr></hr>
</td>
<td align="center" valign="bottom">0.64<hr></hr>
</td>
<td align="center" valign="bottom">0.7<hr></hr>
</td>
<td align="center" valign="bottom">0.54<hr></hr>
</td>
</tr>
<tr><td align="left" valign="bottom">SEP_6<hr></hr>
</td>
<td align="center" valign="bottom">0.68<hr></hr>
</td>
<td align="center" valign="bottom">0.87<hr></hr>
</td>
<td align="center" valign="bottom">0.39<hr></hr>
</td>
<td align="center" valign="bottom">0.61<hr></hr>
</td>
<td align="center" valign="bottom">0.69<hr></hr>
</td>
<td align="center" valign="bottom">0.49<hr></hr>
</td>
</tr>
<tr><td align="left" valign="bottom">SEP_7<hr></hr>
</td>
<td align="center" valign="bottom">0.65<hr></hr>
</td>
<td align="center" valign="bottom">0.86<hr></hr>
</td>
<td align="center" valign="bottom">0.32<hr></hr>
</td>
<td align="center" valign="bottom">0.6<hr></hr>
</td>
<td align="center" valign="bottom">0.67<hr></hr>
</td>
<td align="center" valign="bottom">0.49<hr></hr>
</td>
</tr>
<tr><td align="left" valign="bottom">SEP_8<hr></hr>
</td>
<td align="center" valign="bottom">0.7<hr></hr>
</td>
<td align="center" valign="bottom">0.87<hr></hr>
</td>
<td align="center" valign="bottom">0.44<hr></hr>
</td>
<td align="center" valign="bottom">0.67<hr></hr>
</td>
<td align="center" valign="bottom">0.73<hr></hr>
</td>
<td align="center" valign="bottom">0.57<hr></hr>
</td>
</tr>
<tr><td align="left" valign="bottom">SEP_9<hr></hr>
</td>
<td align="center" valign="bottom">0.69<hr></hr>
</td>
<td align="center" valign="bottom">0.84<hr></hr>
</td>
<td align="center" valign="bottom">0.46<hr></hr>
</td>
<td align="center" valign="bottom">0.69<hr></hr>
</td>
<td align="center" valign="bottom">0.75<hr></hr>
</td>
<td align="center" valign="bottom">0.59<hr></hr>
</td>
</tr>
<tr><td align="left" valign="bottom">SEP_10<hr></hr>
</td>
<td align="center" valign="bottom">0.7<hr></hr>
</td>
<td align="center" valign="bottom">0.89<hr></hr>
</td>
<td align="center" valign="bottom">0.4<hr></hr>
</td>
<td align="center" valign="bottom">0.65<hr></hr>
</td>
<td align="center" valign="bottom">0.76<hr></hr>
</td>
<td align="center" valign="bottom">0.47<hr></hr>
</td>
</tr>
<tr><td align="left" valign="bottom">SEP_11<hr></hr>
</td>
<td align="center" valign="bottom">0.7<hr></hr>
</td>
<td align="center" valign="bottom">0.88<hr></hr>
</td>
<td align="center" valign="bottom">0.44<hr></hr>
</td>
<td align="center" valign="bottom">0.7<hr></hr>
</td>
<td align="center" valign="bottom">0.82<hr></hr>
</td>
<td align="center" valign="bottom">0.45<hr></hr>
</td>
</tr>
<tr><td align="left" valign="bottom">SEP_12<hr></hr>
</td>
<td align="center" valign="bottom">0.71<hr></hr>
</td>
<td align="center" valign="bottom">0.88<hr></hr>
</td>
<td align="center" valign="bottom">0.45<hr></hr>
</td>
<td align="center" valign="bottom">0.7<hr></hr>
</td>
<td align="center" valign="bottom">0.76<hr></hr>
</td>
<td align="center" valign="bottom">0.61<hr></hr>
</td>
</tr>
<tr><td align="left" valign="bottom">SEP_13<hr></hr>
</td>
<td align="center" valign="bottom">0.67<hr></hr>
</td>
<td align="center" valign="bottom">0.88<hr></hr>
</td>
<td align="center" valign="bottom">0.35<hr></hr>
</td>
<td align="center" valign="bottom">0.66<hr></hr>
</td>
<td align="center" valign="bottom">0.74<hr></hr>
</td>
<td align="center" valign="bottom">0.54<hr></hr>
</td>
</tr>
<tr><td align="left" valign="bottom">SEP_14<hr></hr>
</td>
<td align="center" valign="bottom">0.69<hr></hr>
</td>
<td align="center" valign="bottom">0.89<hr></hr>
</td>
<td align="center" valign="bottom">0.39<hr></hr>
</td>
<td align="center" valign="bottom">0.63<hr></hr>
</td>
<td align="center" valign="bottom">0.71<hr></hr>
</td>
<td align="center" valign="bottom">0.51<hr></hr>
</td>
</tr>
<tr><td align="left" valign="bottom">SEP_15<hr></hr>
</td>
<td align="center" valign="bottom">0.71<hr></hr>
</td>
<td align="center" valign="bottom">0.9<hr></hr>
</td>
<td align="center" valign="bottom">0.43<hr></hr>
</td>
<td align="center" valign="bottom">0.69<hr></hr>
</td>
<td align="center" valign="bottom">0.76<hr></hr>
</td>
<td align="center" valign="bottom">0.6<hr></hr>
</td>
</tr>
<tr><td align="left" valign="bottom">SEP_16<hr></hr>
</td>
<td align="center" valign="bottom">0.69<hr></hr>
</td>
<td align="center" valign="bottom">0.89<hr></hr>
</td>
<td align="center" valign="bottom">0.39<hr></hr>
</td>
<td align="center" valign="bottom">0.66<hr></hr>
</td>
<td align="center" valign="bottom">0.72<hr></hr>
</td>
<td align="center" valign="bottom">0.57<hr></hr>
</td>
</tr>
<tr><td align="left" valign="bottom">SEP_17<hr></hr>
</td>
<td align="center" valign="bottom">0.74<hr></hr>
</td>
<td align="center" valign="bottom">0.89<hr></hr>
</td>
<td align="center" valign="bottom">0.52<hr></hr>
</td>
<td align="center" valign="bottom">0.65<hr></hr>
</td>
<td align="center" valign="bottom">0.74<hr></hr>
</td>
<td align="center" valign="bottom">0.51<hr></hr>
</td>
</tr>
<tr><td align="left" valign="bottom">SEP_18<hr></hr>
</td>
<td align="center" valign="bottom">0.65<hr></hr>
</td>
<td align="center" valign="bottom">0.87<hr></hr>
</td>
<td align="center" valign="bottom">0.34<hr></hr>
</td>
<td align="center" valign="bottom">0.61<hr></hr>
</td>
<td align="center" valign="bottom">0.69<hr></hr>
</td>
<td align="center" valign="bottom">0.5<hr></hr>
</td>
</tr>
<tr><td align="left" valign="bottom">SEP_19<hr></hr>
</td>
<td align="center" valign="bottom">0.74<hr></hr>
</td>
<td align="center" valign="bottom">0.91<hr></hr>
</td>
<td align="center" valign="bottom">0.47<hr></hr>
</td>
<td align="center" valign="bottom">0.72<hr></hr>
</td>
<td align="center" valign="bottom">0.77<hr></hr>
</td>
<td align="center" valign="bottom">0.64<hr></hr>
</td>
</tr>
<tr><td align="left" valign="bottom">SEP_20<hr></hr>
</td>
<td align="center" valign="bottom">0.71<hr></hr>
</td>
<td align="center" valign="bottom">0.89<hr></hr>
</td>
<td align="center" valign="bottom">0.44<hr></hr>
</td>
<td align="center" valign="bottom">0.64<hr></hr>
</td>
<td align="center" valign="bottom">0.73<hr></hr>
</td>
<td align="center" valign="bottom">0.51<hr></hr>
</td>
</tr>
<tr><td align="left" valign="bottom">SEP_21<hr></hr>
</td>
<td align="center" valign="bottom">0.72<hr></hr>
</td>
<td align="center" valign="bottom">0.9<hr></hr>
</td>
<td align="center" valign="bottom">0.46<hr></hr>
</td>
<td align="center" valign="bottom">0.64<hr></hr>
</td>
<td align="center" valign="bottom">0.72<hr></hr>
</td>
<td align="center" valign="bottom">0.54<hr></hr>
</td>
</tr>
<tr><td align="left" valign="bottom">SEP_22<hr></hr>
</td>
<td align="center" valign="bottom">0.65<hr></hr>
</td>
<td align="center" valign="bottom">0.88<hr></hr>
</td>
<td align="center" valign="bottom">0.32<hr></hr>
</td>
<td align="center" valign="bottom">0.61<hr></hr>
</td>
<td align="center" valign="bottom">0.69<hr></hr>
</td>
<td align="center" valign="bottom">0.48<hr></hr>
</td>
</tr>
<tr><td align="left" valign="bottom">SEP_23<hr></hr>
</td>
<td align="center" valign="bottom">0.71<hr></hr>
</td>
<td align="center" valign="bottom">0.89<hr></hr>
</td>
<td align="center" valign="bottom">0.43<hr></hr>
</td>
<td align="center" valign="bottom">0.63<hr></hr>
</td>
<td align="center" valign="bottom">0.7<hr></hr>
</td>
<td align="center" valign="bottom">0.51<hr></hr>
</td>
</tr>
<tr><td align="left" valign="bottom">SEP_24<hr></hr>
</td>
<td align="center" valign="bottom">0.68<hr></hr>
</td>
<td align="center" valign="bottom">0.87<hr></hr>
</td>
<td align="center" valign="bottom">0.4<hr></hr>
</td>
<td align="center" valign="bottom">0.62<hr></hr>
</td>
<td align="center" valign="bottom">0.71<hr></hr>
</td>
<td align="center" valign="bottom">0.5<hr></hr>
</td>
</tr>
<tr><td align="left" valign="bottom">SEP_25<hr></hr>
</td>
<td align="center" valign="bottom">0.71<hr></hr>
</td>
<td align="center" valign="bottom">0.9<hr></hr>
</td>
<td align="center" valign="bottom">0.43<hr></hr>
</td>
<td align="center" valign="bottom">0.65<hr></hr>
</td>
<td align="center" valign="bottom">0.72<hr></hr>
</td>
<td align="center" valign="bottom">0.56<hr></hr>
</td>
</tr>
<tr><td align="left" valign="bottom">SEP_26<hr></hr>
</td>
<td align="center" valign="bottom">0.69<hr></hr>
</td>
<td align="center" valign="bottom">0.88<hr></hr>
</td>
<td align="center" valign="bottom">0.4<hr></hr>
</td>
<td align="center" valign="bottom">0.62<hr></hr>
</td>
<td align="center" valign="bottom">0.69<hr></hr>
</td>
<td align="center" valign="bottom">0.52<hr></hr>
</td>
</tr>
<tr><td align="left">Average</td>
<td align="center">0.67</td>
<td align="center">0.83</td>
<td align="center">0.43</td>
<td align="center">0.65</td>
<td align="center">0.72</td>
<td align="center">0.54</td>
</tr>
</tbody>
</table>
<table-wrap-foot><p>Acc: accuracy, Spec: specificity, Sens: sensitivity.</p>
</table-wrap-foot>
</table-wrap>
<p>Table <xref ref-type="table" rid="T3">3</xref>
 shows the results obtained with the set of test molecules. Among the novel drugs present in SIDER 2, only 20 are associated with at least one of the 26 studied SEPs. These drugs were tested with the decision trees and ILP theories obtained for each SEP. The total number of drugs in the test set that are associated with each SEP is indicated (column Positives) and compared with the number of true positives (TP columns) obtained on the test set using either the DT model or the ILP theory for this SEP. The prediction results are clearly better with ILP theories than with DTs: 22 true positives (covering 16 SEPs) were detected with ILP theories, whereas only 9 true positives (covering 8 SEPs) were detected with DTs. The number of false positives is also reported for each SEP and each model (FP columns). The checking procedure was applied to the false positives, and the number of molecules confirmed according to FAERS is reported (FAERS columns). In this way, 33 molecules were recovered for ILP theories versus 37 for DTs, raising the total number of probable true positives to 55 for ILP and 46 for DTs. Nevertheless, as the variability in cross-validation results suggests, many positive molecules still escape prediction with both DTs and ILP theories, especially for three SEPs: SEP_2, SEP_7, and SEP_21.</p>
<table-wrap position="float" id="T3"><label>Table 3</label>
<caption><p>Direct testing results with 20 new molecules</p>
</caption>
<table frame="hsides" rules="groups" border="1"><colgroup><col align="left"></col>
<col align="center"></col>
<col align="center"></col>
<col align="center"></col>
<col align="center"></col>
<col align="center"></col>
<col align="center"></col>
<col align="center"></col>
</colgroup>
<thead valign="top"><tr><th align="left" valign="bottom"><bold>SEP</bold>
<hr></hr>
</th>
<th align="center" valign="bottom"><bold>Positives</bold>
<hr></hr>
</th>
<th colspan="3" align="center" valign="bottom"><bold>DT</bold>
<hr></hr>
</th>
<th colspan="3" align="center" valign="bottom"><bold>ILP</bold>
<hr></hr>
</th>
</tr>
<tr><th align="left"> </th>
<th align="left"> </th>
<th align="center"><bold>TP</bold>
</th>
<th align="center"><bold>FP</bold>
</th>
<th align="center"><bold>FAERS</bold>
</th>
<th align="center"><bold>TP</bold>
</th>
<th align="center"><bold>FP</bold>
</th>
<th align="center"><bold>FAERS</bold>
</th>
</tr>
</thead>
<tbody valign="top"><tr><td align="left" valign="bottom">SEP_1<hr></hr>
</td>
<td align="center" valign="bottom">4<hr></hr>
</td>
<td align="center" valign="bottom">0<hr></hr>
</td>
<td align="center" valign="bottom">5<hr></hr>
</td>
<td align="center" valign="bottom">1<hr></hr>
</td>
<td align="center" valign="bottom">2<hr></hr>
</td>
<td align="center" valign="bottom">3<hr></hr>
</td>
<td align="center" valign="bottom">1<hr></hr>
</td>
</tr>
<tr><td align="left" valign="bottom">SEP_2<hr></hr>
</td>
<td align="center" valign="bottom">11<hr></hr>
</td>
<td align="center" valign="bottom">1<hr></hr>
</td>
<td align="center" valign="bottom">2<hr></hr>
</td>
<td align="center" valign="bottom">1<hr></hr>
</td>
<td align="center" valign="bottom">0<hr></hr>
</td>
<td align="center" valign="bottom">1<hr></hr>
</td>
<td align="center" valign="bottom">1<hr></hr>
</td>
</tr>
<tr><td align="left" valign="bottom">SEP_3<hr></hr>
</td>
<td align="center" valign="bottom">2<hr></hr>
</td>
<td align="center" valign="bottom">0<hr></hr>
</td>
<td align="center" valign="bottom">3<hr></hr>
</td>
<td align="center" valign="bottom">1<hr></hr>
</td>
<td align="center" valign="bottom">0<hr></hr>
</td>
<td align="center" valign="bottom">5<hr></hr>
</td>
<td align="center" valign="bottom">1<hr></hr>
</td>
</tr>
<tr><td align="left" valign="bottom">SEP_4<hr></hr>
</td>
<td align="center" valign="bottom">3<hr></hr>
</td>
<td align="center" valign="bottom">0<hr></hr>
</td>
<td align="center" valign="bottom">3<hr></hr>
</td>
<td align="center" valign="bottom">1<hr></hr>
</td>
<td align="center" valign="bottom">1<hr></hr>
</td>
<td align="center" valign="bottom">2<hr></hr>
</td>
<td align="center" valign="bottom">1<hr></hr>
</td>
</tr>
<tr><td align="left" valign="bottom">SEP_5<hr></hr>
</td>
<td align="center" valign="bottom">5<hr></hr>
</td>
<td align="center" valign="bottom">0<hr></hr>
</td>
<td align="center" valign="bottom">2<hr></hr>
</td>
<td align="center" valign="bottom">1<hr></hr>
</td>
<td align="center" valign="bottom">1<hr></hr>
</td>
<td align="center" valign="bottom">2<hr></hr>
</td>
<td align="center" valign="bottom">1<hr></hr>
</td>
</tr>
<tr><td align="left" valign="bottom">SEP_6<hr></hr>
</td>
<td align="center" valign="bottom">5<hr></hr>
</td>
<td align="center" valign="bottom">1<hr></hr>
</td>
<td align="center" valign="bottom">5<hr></hr>
</td>
<td align="center" valign="bottom">3<hr></hr>
</td>
<td align="center" valign="bottom">2<hr></hr>
</td>
<td align="center" valign="bottom">3<hr></hr>
</td>
<td align="center" valign="bottom">1<hr></hr>
</td>
</tr>
<tr><td align="left" valign="bottom">SEP_7<hr></hr>
</td>
<td align="center" valign="bottom">15<hr></hr>
</td>
<td align="center" valign="bottom">2<hr></hr>
</td>
<td align="center" valign="bottom">2<hr></hr>
</td>
<td align="center" valign="bottom">1<hr></hr>
</td>
<td align="center" valign="bottom">2<hr></hr>
</td>
<td align="center" valign="bottom">1<hr></hr>
</td>
<td align="center" valign="bottom">1<hr></hr>
</td>
</tr>
<tr><td align="left" valign="bottom">SEP_8<hr></hr>
</td>
<td align="center" valign="bottom">4<hr></hr>
</td>
<td align="center" valign="bottom">1<hr></hr>
</td>
<td align="center" valign="bottom">1<hr></hr>
</td>
<td align="center" valign="bottom">1<hr></hr>
</td>
<td align="center" valign="bottom">1<hr></hr>
</td>
<td align="center" valign="bottom">3<hr></hr>
</td>
<td align="center" valign="bottom">1<hr></hr>
</td>
</tr>
<tr><td align="left" valign="bottom">SEP_9<hr></hr>
</td>
<td align="center" valign="bottom">3<hr></hr>
</td>
<td align="center" valign="bottom">0<hr></hr>
</td>
<td align="center" valign="bottom">3<hr></hr>
</td>
<td align="center" valign="bottom">2<hr></hr>
</td>
<td align="center" valign="bottom">0<hr></hr>
</td>
<td align="center" valign="bottom">3<hr></hr>
</td>
<td align="center" valign="bottom">1<hr></hr>
</td>
</tr>
<tr><td align="left" valign="bottom">SEP_10<hr></hr>
</td>
<td align="center" valign="bottom">5<hr></hr>
</td>
<td align="center" valign="bottom">0<hr></hr>
</td>
<td align="center" valign="bottom">1<hr></hr>
</td>
<td align="center" valign="bottom">0<hr></hr>
</td>
<td align="center" valign="bottom">0<hr></hr>
</td>
<td align="center" valign="bottom">3<hr></hr>
</td>
<td align="center" valign="bottom">1<hr></hr>
</td>
</tr>
<tr><td align="left" valign="bottom">SEP_11<hr></hr>
</td>
<td align="center" valign="bottom">4<hr></hr>
</td>
<td align="center" valign="bottom">1<hr></hr>
</td>
<td align="center" valign="bottom">3<hr></hr>
</td>
<td align="center" valign="bottom">2<hr></hr>
</td>
<td align="center" valign="bottom">1<hr></hr>
</td>
<td align="center" valign="bottom">3<hr></hr>
</td>
<td align="center" valign="bottom">1<hr></hr>
</td>
</tr>
<tr><td align="left" valign="bottom">SEP_12<hr></hr>
</td>
<td align="center" valign="bottom">0<hr></hr>
</td>
<td align="center" valign="bottom">0<hr></hr>
</td>
<td align="center" valign="bottom">5<hr></hr>
</td>
<td align="center" valign="bottom">1<hr></hr>
</td>
<td align="center" valign="bottom">0<hr></hr>
</td>
<td align="center" valign="bottom">4<hr></hr>
</td>
<td align="center" valign="bottom">1<hr></hr>
</td>
</tr>
<tr><td align="left" valign="bottom">SEP_13<hr></hr>
</td>
<td align="center" valign="bottom">4<hr></hr>
</td>
<td align="center" valign="bottom">0<hr></hr>
</td>
<td align="center" valign="bottom">6<hr></hr>
</td>
<td align="center" valign="bottom">2<hr></hr>
</td>
<td align="center" valign="bottom">1<hr></hr>
</td>
<td align="center" valign="bottom">4<hr></hr>
</td>
<td align="center" valign="bottom">1<hr></hr>
</td>
</tr>
<tr><td align="left" valign="bottom">SEP_14<hr></hr>
</td>
<td align="center" valign="bottom">4<hr></hr>
</td>
<td align="center" valign="bottom">1<hr></hr>
</td>
<td align="center" valign="bottom">0<hr></hr>
</td>
<td align="center" valign="bottom">0<hr></hr>
</td>
<td align="center" valign="bottom">1<hr></hr>
</td>
<td align="center" valign="bottom">5<hr></hr>
</td>
<td align="center" valign="bottom">2<hr></hr>
</td>
</tr>
<tr><td align="left" valign="bottom">SEP_15<hr></hr>
</td>
<td align="center" valign="bottom">1<hr></hr>
</td>
<td align="center" valign="bottom">0<hr></hr>
</td>
<td align="center" valign="bottom">2<hr></hr>
</td>
<td align="center" valign="bottom">1<hr></hr>
</td>
<td align="center" valign="bottom">0<hr></hr>
</td>
<td align="center" valign="bottom">5<hr></hr>
</td>
<td align="center" valign="bottom">2<hr></hr>
</td>
</tr>
<tr><td align="left" valign="bottom">SEP_16<hr></hr>
</td>
<td align="center" valign="bottom">3<hr></hr>
</td>
<td align="center" valign="bottom">0<hr></hr>
</td>
<td align="center" valign="bottom">6<hr></hr>
</td>
<td align="center" valign="bottom">2<hr></hr>
</td>
<td align="center" valign="bottom">2<hr></hr>
</td>
<td align="center" valign="bottom">7<hr></hr>
</td>
<td align="center" valign="bottom">3<hr></hr>
</td>
</tr>
<tr><td align="left" valign="bottom">SEP_17<hr></hr>
</td>
<td align="center" valign="bottom">1<hr></hr>
</td>
<td align="center" valign="bottom">0<hr></hr>
</td>
<td align="center" valign="bottom">5<hr></hr>
</td>
<td align="center" valign="bottom">3<hr></hr>
</td>
<td align="center" valign="bottom">0<hr></hr>
</td>
<td align="center" valign="bottom">2<hr></hr>
</td>
<td align="center" valign="bottom">1<hr></hr>
</td>
</tr>
<tr><td align="left" valign="bottom">SEP_18<hr></hr>
</td>
<td align="center" valign="bottom">2<hr></hr>
</td>
<td align="center" valign="bottom">0<hr></hr>
</td>
<td align="center" valign="bottom">3<hr></hr>
</td>
<td align="center" valign="bottom">1<hr></hr>
</td>
<td align="center" valign="bottom">0<hr></hr>
</td>
<td align="center" valign="bottom">2<hr></hr>
</td>
<td align="center" valign="bottom">1<hr></hr>
</td>
</tr>
<tr><td align="left" valign="bottom">SEP_19<hr></hr>
</td>
<td align="center" valign="bottom">1<hr></hr>
</td>
<td align="center" valign="bottom">0<hr></hr>
</td>
<td align="center" valign="bottom">4<hr></hr>
</td>
<td align="center" valign="bottom">2<hr></hr>
</td>
<td align="center" valign="bottom">0<hr></hr>
</td>
<td align="center" valign="bottom">5<hr></hr>
</td>
<td align="center" valign="bottom">2<hr></hr>
</td>
</tr>
<tr><td align="left" valign="bottom">SEP_20<hr></hr>
</td>
<td align="center" valign="bottom">3<hr></hr>
</td>
<td align="center" valign="bottom">0<hr></hr>
</td>
<td align="center" valign="bottom">3<hr></hr>
</td>
<td align="center" valign="bottom">1<hr></hr>
</td>
<td align="center" valign="bottom">1<hr></hr>
</td>
<td align="center" valign="bottom">4<hr></hr>
</td>
<td align="center" valign="bottom">1<hr></hr>
</td>
</tr>
<tr><td align="left" valign="bottom">SEP_21<hr></hr>
</td>
<td align="center" valign="bottom">8<hr></hr>
</td>
<td align="center" valign="bottom">1<hr></hr>
</td>
<td align="center" valign="bottom">2<hr></hr>
</td>
<td align="center" valign="bottom">1<hr></hr>
</td>
<td align="center" valign="bottom">1<hr></hr>
</td>
<td align="center" valign="bottom">2<hr></hr>
</td>
<td align="center" valign="bottom">1<hr></hr>
</td>
</tr>
<tr><td align="left" valign="bottom">SEP_22<hr></hr>
</td>
<td align="center" valign="bottom">5<hr></hr>
</td>
<td align="center" valign="bottom">0<hr></hr>
</td>
<td align="center" valign="bottom">3<hr></hr>
</td>
<td align="center" valign="bottom">1<hr></hr>
</td>
<td align="center" valign="bottom">1<hr></hr>
</td>
<td align="center" valign="bottom">5<hr></hr>
</td>
<td align="center" valign="bottom">1<hr></hr>
</td>
</tr>
<tr><td align="left" valign="bottom">SEP_23<hr></hr>
</td>
<td align="center" valign="bottom">3<hr></hr>
</td>
<td align="center" valign="bottom">0<hr></hr>
</td>
<td align="center" valign="bottom">3<hr></hr>
</td>
<td align="center" valign="bottom">1<hr></hr>
</td>
<td align="center" valign="bottom">1<hr></hr>
</td>
<td align="center" valign="bottom">4<hr></hr>
</td>
<td align="center" valign="bottom">1<hr></hr>
</td>
</tr>
<tr><td align="left" valign="bottom">SEP_24<hr></hr>
</td>
<td align="center" valign="bottom">5<hr></hr>
</td>
<td align="center" valign="bottom">1<hr></hr>
</td>
<td align="center" valign="bottom">4<hr></hr>
</td>
<td align="center" valign="bottom">2<hr></hr>
</td>
<td align="center" valign="bottom">2<hr></hr>
</td>
<td align="center" valign="bottom">5<hr></hr>
</td>
<td align="center" valign="bottom">2<hr></hr>
</td>
</tr>
<tr><td align="left" valign="bottom">SEP_25<hr></hr>
</td>
<td align="center" valign="bottom">8<hr></hr>
</td>
<td align="center" valign="bottom">0<hr></hr>
</td>
<td align="center" valign="bottom">3<hr></hr>
</td>
<td align="center" valign="bottom">2<hr></hr>
</td>
<td align="center" valign="bottom">2<hr></hr>
</td>
<td align="center" valign="bottom">5<hr></hr>
</td>
<td align="center" valign="bottom">2<hr></hr>
</td>
</tr>
<tr><td align="left">SEP_26</td>
<td align="center">4</td>
<td align="center">0</td>
<td align="center">6</td>
<td align="center">3</td>
<td align="center">0</td>
<td align="center">3</td>
<td align="center">1</td>
</tr>
</tbody>
</table>
<table-wrap-foot><p>Positives: number of positive examples in the test set according to SIDER. TP/FP: number of predicted true/false positives. FAERS: number of false-positive molecules confirmed by FAERS data.</p>
</table-wrap-foot>
</table-wrap>
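<p>The totals quoted in the text can be recomputed from the TP and FAERS columns of Table 3, as in the short sketch below. The lists transcribe the table row by row (SEP_1 to SEP_26); the function and variable names are illustrative only.</p>

```python
# TP and FAERS columns of Table 3, one value per SEP (SEP_1 ... SEP_26).
dt_tp     = [0, 1, 0, 0, 0, 1, 2, 1, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0]
dt_faers  = [1, 1, 1, 1, 1, 3, 1, 1, 2, 0, 2, 1, 2, 0, 1, 2, 3, 1, 2, 1, 1, 1, 1, 2, 2, 3]
ilp_tp    = [2, 0, 0, 1, 1, 2, 2, 1, 0, 0, 1, 0, 1, 1, 0, 2, 0, 0, 0, 1, 1, 1, 1, 2, 2, 0]
ilp_faers = [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 3, 1, 1, 2, 1, 1, 1, 1, 2, 2, 1]


def summarize(tp, faers):
    """Aggregate one method's columns into the totals discussed in the text."""
    return {
        "true_positives": sum(tp),
        "seps_with_tp": sum(1 for v in tp if v != 0),
        "faers_confirmed": sum(faers),
        # Probable true positives = SIDER true positives + FAERS-confirmed false positives.
        "probable_true_positives": sum(tp) + sum(faers),
    }


dt_summary = summarize(dt_tp, dt_faers)
ilp_summary = summarize(ilp_tp, ilp_faers)
print("DT ", dt_summary)
print("ILP", ilp_summary)
```

<p>The sketch gives 9 true positives over 8 SEPs and 46 probable true positives for DTs, and 22 true positives over 16 SEPs and 55 probable true positives for ILP, in agreement with the figures reported above.</p>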
</sec>
<sec><title>Interpretation of decision trees and theories</title>
<p>Quantitative characteristics of DT models and ILP theories for the 26 selected SEPs are presented in Table <xref ref-type="table" rid="T4">4</xref>
 (the decision trees and ILP theories are available at <ext-link ext-link-type="uri" xlink:href="http://plateforme-mbi.loria.fr/side-effect-profiles">http://plateforme-mbi.loria.fr/side-effect-profiles</ext-link>
). The first observation concerns model coverage. On average, 83% of the drugs are covered by at least one rule in an ILP theory, whereas DT models cover on average only 58% of the drugs composing the learning set. The second observation is that almost all descriptor types are used in each DT model or ILP theory. The most represented descriptors are drug categories and clusters for DTs, and drug targets and GO terms for ILP theories. This illustrates the importance of using background knowledge about drug targets and GO semantic relationships for the characterization of SEPs.</p>
<table-wrap position="float" id="T4"><label>Table 4</label>
<caption><p>Quantitative characteristics of DT models and ILP theories</p>
</caption>
<table frame="hsides" rules="groups" border="1"><colgroup><col align="left"></col>
<col align="center"></col>
<col align="center"></col>
<col align="center"></col>
<col align="center"></col>
</colgroup>
<thead valign="top"><tr><th align="left" valign="bottom"> <hr></hr>
</th>
<th colspan="2" align="center" valign="bottom"><bold>DT (# nodes per model)</bold>
<hr></hr>
</th>
<th colspan="2" align="center" valign="bottom"><bold>ILP (# rules per theory)</bold>
<hr></hr>
</th>
</tr>
<tr><th align="left"> </th>
<th align="center"><bold>Avg (min-max)</bold>
</th>
<th align="center"><bold>% total</bold>
</th>
<th align="center"><bold>Avg (min-max)</bold>
</th>
<th align="center"><bold>% total</bold>
</th>
</tr>
</thead>
<tbody valign="top"><tr><td align="left" valign="bottom">Model coverage (%)<hr></hr>
</td>
<td align="center" valign="bottom">58 (32–67)<hr></hr>
</td>
<td align="center" valign="bottom">-<hr></hr>
</td>
<td align="center" valign="bottom">83 (77–88)<hr></hr>
</td>
<td align="center" valign="bottom">-<hr></hr>
</td>
</tr>
<tr><td align="left" valign="bottom">Model size<hr></hr>
</td>
<td align="center" valign="bottom">11 (6–15)<hr></hr>
</td>
<td align="center" valign="bottom">-<hr></hr>
</td>
<td align="center" valign="bottom">33 (16–40)<hr></hr>
</td>
<td align="center" valign="bottom">-<hr></hr>
</td>
</tr>
<tr><td align="left" valign="bottom"><bold>Drug descriptors</bold>
<hr></hr>
</td>
<td align="center" valign="bottom"> <hr></hr>
</td>
<td align="center" valign="bottom"> <hr></hr>
</td>
<td align="center" valign="bottom"> <hr></hr>
</td>
<td align="center" valign="bottom"> <hr></hr>
</td>
</tr>
<tr><td align="left" valign="bottom">Categories<hr></hr>
</td>
<td align="center" valign="bottom">4 (1–7)<hr></hr>
</td>
<td align="center" valign="bottom">34<hr></hr>
</td>
<td align="center" valign="bottom">6 (2–13)<hr></hr>
</td>
<td align="center" valign="bottom">19<hr></hr>
</td>
</tr>
<tr><td align="left" valign="bottom">Targets<hr></hr>
</td>
<td align="center" valign="bottom">3 (0–5)<hr></hr>
</td>
<td align="center" valign="bottom">26<hr></hr>
</td>
<td align="center" valign="bottom">30 (23–39)<hr></hr>
</td>
<td align="center" valign="bottom">90<hr></hr>
</td>
</tr>
<tr><td align="left" valign="bottom">Clusters<hr></hr>
</td>
<td align="center" valign="bottom">4 (1–9)<hr></hr>
</td>
<td align="center" valign="bottom">40<hr></hr>
</td>
<td align="center" valign="bottom">9 (4–14)<hr></hr>
</td>
<td align="center" valign="bottom">27<hr></hr>
</td>
</tr>
<tr><td align="left" valign="bottom"><bold>Target descriptors</bold>
<hr></hr>
</td>
<td align="center" valign="bottom"> <hr></hr>
</td>
<td align="center" valign="bottom"> <hr></hr>
</td>
<td align="center" valign="bottom"> <hr></hr>
</td>
<td align="center" valign="bottom"> <hr></hr>
</td>
</tr>
<tr><td align="left" valign="bottom">GO terms<hr></hr>
</td>
<td align="center" valign="bottom">NA<hr></hr>
</td>
<td align="center" valign="bottom">NA<hr></hr>
</td>
<td align="center" valign="bottom">24 (16–31)<hr></hr>
</td>
<td align="center" valign="bottom">73<hr></hr>
</td>
</tr>
<tr><td align="left" valign="bottom">Domains<hr></hr>
</td>
<td align="center" valign="bottom">NA<hr></hr>
</td>
<td align="center" valign="bottom">NA<hr></hr>
</td>
<td align="center" valign="bottom">1 (0–2)<hr></hr>
</td>
<td align="center" valign="bottom">1<hr></hr>
</td>
</tr>
<tr><td align="left" valign="bottom">Interactions<hr></hr>
</td>
<td align="center" valign="bottom">NA<hr></hr>
</td>
<td align="center" valign="bottom">NA<hr></hr>
</td>
<td align="center" valign="bottom">8 (2–16)<hr></hr>
</td>
<td align="center" valign="bottom">24<hr></hr>
</td>
</tr>
<tr><td align="left" valign="bottom">Pathways<hr></hr>
</td>
<td align="center" valign="bottom">NA<hr></hr>
</td>
<td align="center" valign="bottom">NA<hr></hr>
</td>
<td align="center" valign="bottom">4 (1–8)<hr></hr>
</td>
<td align="center" valign="bottom">12<hr></hr>
</td>
</tr>
<tr><td align="left"><bold>GO relationships</bold>
</td>
<td align="center">NA</td>
<td align="center">NA</td>
<td align="center">6 (3–9)</td>
<td align="center">19</td>
</tr>
</tbody>
</table>
<table-wrap-foot><p>Model coverage is the percentage of positive examples covered, averaged over the 26 DT models and 26 ILP theories. Avg: average. Model size corresponds to the average number of nodes in a DT model or of rules in an ILP theory. Occurrence of each type of descriptor is estimated by counting the number of nodes (or rules, respectively) involving them (NA: not applicable).</p>
</table-wrap-foot>
</table-wrap>
<p>It is worth noting that some rules contained in the theories were confirmed using peer-reviewed literature. For example, in the SEP_7 (65_Dermatitis) theory, rule 11 states that a drug is associated with this SEP if its target interacts with a protein belonging to both the KEGG pathway “Focal adhesion” and the PID pathway “Signaling events mediated by focal adhesion kinase” (Table <xref ref-type="table" rid="T5">5</xref>
). By searching the list of genes implicated in dermatitis [<xref ref-type="bibr" rid="B36">36</xref>
] and comparing them with the two pathways, we extracted seven genes (<italic>THBS1</italic>
, <italic>COL1A2</italic>
, <italic>COL3A1</italic>
, <italic>COL4A1</italic>
, <italic>COL5A</italic>
, <italic>ITGB4</italic>
 and <italic>LAMA5</italic>
) dysregulated in dermatitis which belong to the KEGG pathway “Focal adhesion”. In the same way, two genes (<italic>BDKRB2</italic>
 and <italic>PTGFR</italic>
) are known to be dysregulated in dermatitis and belong to the “Neuroactive ligand-receptor interaction” KEGG pathway mentioned in rule 14. Finally, for rule 16, we verified that the gene <italic>ERBB3</italic>
, which belongs to the “Endocytosis” KEGG pathway, is indeed downregulated in dermatitis.</p>
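<p>The literature check described above amounts to intersecting a disease gene list with pathway membership lists. The following minimal sketch illustrates this with the genes named in the text only; the pathway sets are deliberately partial, and the placeholder <monospace>GENE_X</monospace> is a hypothetical stand-in for the pathway members that are not dysregulated in dermatitis. It is not the procedure actually used by the authors.</p>
<preformat><![CDATA[
```python
# Genes reported as dysregulated in dermatitis; only those named in the text.
DERMATITIS_GENES = {
    "THBS1", "COL1A2", "COL3A1", "COL4A1", "COL5A", "ITGB4", "LAMA5",
    "BDKRB2", "PTGFR", "ERBB3",
}

# Partial pathway membership: the genes named in the text plus one
# hypothetical placeholder ("GENE_X") standing in for all other members.
PATHWAY_GENES = {
    "Focal adhesion": {
        "THBS1", "COL1A2", "COL3A1", "COL4A1", "COL5A", "ITGB4", "LAMA5",
        "GENE_X",
    },
    "Neuroactive ligand-receptor interaction": {"BDKRB2", "PTGFR", "GENE_X"},
    "Endocytosis": {"ERBB3", "GENE_X"},
}


def pathway_overlap(disease_genes, pathways):
    """Return, for each pathway, the sorted disease genes that belong to it."""
    return {
        name: sorted(disease_genes & members)
        for name, members in pathways.items()
    }


overlap = pathway_overlap(DERMATITIS_GENES, PATHWAY_GENES)
for name, genes in overlap.items():
    print(f"{name}: {len(genes)} gene(s): {', '.join(genes)}")
```
]]></preformat>
<p>With complete pathway gene sets, the same intersection could be used to screen every pathway mentioned in a theory against a disease gene list.</p>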
<table-wrap position="float" id="T5"><label>Table 5</label>
<caption><p>Theory obtained for 65_Dermatitis SEP (SEP_7)</p>
</caption>
<table frame="hsides" rules="groups" border="1"><colgroup><col align="left"></col>
<col align="center"></col>
<col align="center"></col>
<col align="center"></col>
</colgroup>
<thead valign="top"><tr><th align="left"><bold>Rule #</bold>
</th>
<th align="left"><bold>Condition part of the rule</bold>
</th>
<th align="center"><bold>P</bold>
</th>
<th align="center"><bold>N</bold>
</th>
</tr>
</thead>
<tbody valign="top"><tr><td align="left" valign="bottom">3<hr></hr>
</td>
<td align="justify" valign="bottom">drug_has_target(A,B,inhibitor), goterm(B,’cellular response to insulin stimulus’)<hr></hr>
</td>
<td align="center" valign="bottom">15<hr></hr>
</td>
<td align="center" valign="bottom">1<hr></hr>
</td>
</tr>
<tr><td align="left" valign="bottom">18<hr></hr>
</td>
<td align="justify" valign="bottom">drug_has_target(A,B,inhibitor), goterm(B,C), go_relation(C,part_of,go:21543)<hr></hr>
</td>
<td align="center" valign="bottom">13<hr></hr>
</td>
<td align="center" valign="bottom">1<hr></hr>
</td>
</tr>
<tr><td align="left" valign="bottom">1<hr></hr>
</td>
<td align="justify" valign="bottom">drug_has_target(A,B,activator), interact(B,C), goterm(C,’central nervous system development’)<hr></hr>
</td>
<td align="center" valign="bottom">12<hr></hr>
</td>
<td align="center" valign="bottom">1<hr></hr>
</td>
</tr>
<tr><td align="left" valign="bottom">30<hr></hr>
</td>
<td align="justify" valign="bottom">drug_has_target(A,B,inhibitor), interact(B,C), pathway(C,’BCR signaling pathway’,pid), drug_cluster(A,’17_quinine’,hpcc)<hr></hr>
</td>
<td align="center" valign="bottom">12<hr></hr>
</td>
<td align="center" valign="bottom">0<hr></hr>
</td>
</tr>
<tr><td align="left" valign="bottom">24<hr></hr>
</td>
<td align="justify" valign="bottom">drug_has_target(A,B,inhibitor), interact(B,C), goterm(C,’translation’), interact(C,D)<hr></hr>
</td>
<td align="center" valign="bottom">10<hr></hr>
</td>
<td align="center" valign="bottom">1<hr></hr>
</td>
</tr>
<tr><td align="left" valign="bottom">20<hr></hr>
</td>
<td align="justify" valign="bottom">drug_has_target(A,B,inhibitor), interact(B,C), pathway(C,’BCR signaling pathway’,pid), pathway(C,’EPO signaling pathway’,pid)<hr></hr>
</td>
<td align="center" valign="bottom">9<hr></hr>
</td>
<td align="center" valign="bottom">1<hr></hr>
</td>
</tr>
<tr><td align="left" valign="bottom">25<hr></hr>
</td>
<td align="justify" valign="bottom">drug_has_target(A,B,activator), goterm(B,’lipid binding’), goterm(B,’ligand-dependent nuclear receptor activity’)<hr></hr>
</td>
<td align="center" valign="bottom">9<hr></hr>
</td>
<td align="center" valign="bottom">1<hr></hr>
</td>
</tr>
<tr><td align="left" valign="bottom">35<hr></hr>
</td>
<td align="justify" valign="bottom">drug_has_target(A,B,activator), interact(B,C), goterm(C,’identical protein binding’), goterm(C,’DNA binding’)<hr></hr>
</td>
<td align="center" valign="bottom">9<hr></hr>
</td>
<td align="center" valign="bottom">1<hr></hr>
</td>
</tr>
<tr><td align="left" valign="bottom">6<hr></hr>
</td>
<td align="justify" valign="bottom">drug_has_target(A,B,inhibitor), goterm(B,’protein homodimerization activity’), drug_cluster(A,’16_gliclazide’,hpcc)<hr></hr>
</td>
<td align="center" valign="bottom">8<hr></hr>
</td>
<td align="center" valign="bottom">0<hr></hr>
</td>
</tr>
<tr><td align="left" valign="bottom">8<hr></hr>
</td>
<td align="justify" valign="bottom">drug_has_target(A,B,activator), interact(B,C), interact(C,’Serine/threonine-protein phosphatase 2A 55 kDa regulatory subunit B beta isoform’)<hr></hr>
</td>
<td align="center" valign="bottom">8<hr></hr>
</td>
<td align="center" valign="bottom">1<hr></hr>
</td>
</tr>
<tr><td align="left" valign="bottom">15<hr></hr>
</td>
<td align="justify" valign="bottom">drug_has_target(A,B,inhibitor), goterm(B,’response to ethanol’), goterm(B,’signal transduction’)<hr></hr>
</td>
<td align="center" valign="bottom">8<hr></hr>
</td>
<td align="center" valign="bottom">1<hr></hr>
</td>
</tr>
<tr><td align="left" valign="bottom">19<hr></hr>
</td>
<td align="justify" valign="bottom">drug_has_target(A,B,inhibitor), goterm(B,C), go_relation(C,is_a,go:8227), drug_cluster(A,’16_Flavoxate’,hpcombo)<hr></hr>
</td>
<td align="center" valign="bottom">8<hr></hr>
</td>
<td align="center" valign="bottom">0<hr></hr>
</td>
</tr>
<tr><td align="left" valign="bottom">31<hr></hr>
</td>
<td align="justify" valign="bottom">drug_has_target(A,B,inhibitor), interact(B,C), interact(C,’Dedicator of cytokinesis protein 1’)<hr></hr>
</td>
<td align="center" valign="bottom">8<hr></hr>
</td>
<td align="center" valign="bottom">0<hr></hr>
</td>
</tr>
<tr><td align="left" valign="bottom">5<hr></hr>
</td>
<td align="justify" valign="bottom">drug_has_target(A,B,activator), goterm(B,’receptor activity’), interact(B,C), goterm(C,’mitosis’)<hr></hr>
</td>
<td align="center" valign="bottom">7<hr></hr>
</td>
<td align="center" valign="bottom">1<hr></hr>
</td>
</tr>
<tr><td align="left" valign="bottom">10<hr></hr>
</td>
<td align="justify" valign="bottom">drug_has_target(A,B,inhibitor), goterm(B,C), go_relation(C,is_a,’cation channel activity’), goterm(B,’serotonin receptor activity’)<hr></hr>
</td>
<td align="center" valign="bottom">7<hr></hr>
</td>
<td align="center" valign="bottom">1<hr></hr>
</td>
</tr>
<tr><td align="left" valign="bottom"><bold>14</bold>
<hr></hr>
</td>
<td align="justify" valign="bottom"><bold>drug_has_target(A,B,activator), pathway(B,’Neuroactive ligand-receptor interaction’,kegg), goterm(B,’transcription, DNA-dependent’), goterm(B,’signal transduction’)</bold>
<hr></hr>
</td>
<td align="center" valign="bottom"><bold>7</bold>
<hr></hr>
</td>
<td align="center" valign="bottom"><bold>0</bold>
<hr></hr>
</td>
</tr>
<tr><td align="left" valign="bottom"><bold>16</bold>
<hr></hr>
</td>
<td align="justify" valign="bottom"><bold>drug_has_target(A,B,inhibitor), pathway(B,’Endocytosis’,kegg)</bold>
<hr></hr>
</td>
<td align="center" valign="bottom"><bold>7</bold>
<hr></hr>
</td>
<td align="center" valign="bottom"><bold>0</bold>
<hr></hr>
</td>
</tr>
<tr><td align="left" valign="bottom">21<hr></hr>
</td>
<td align="justify" valign="bottom">drug_has_target(A,B,activator), interact(B,C), interact(C,’RNA polymerase-associated protein CTR9 homolog’)<hr></hr>
</td>
<td align="center" valign="bottom">7<hr></hr>
</td>
<td align="center" valign="bottom">1<hr></hr>
</td>
</tr>
<tr><td align="left" valign="bottom">22<hr></hr>
</td>
<td align="justify" valign="bottom">drug_has_target(A,B,inhibitor), pathway(B,’Role of Calcineurin-dependent NFAT signaling in lymphocytes’,pid), goterm(B,’signal transduction’)<hr></hr>
</td>
<td align="center" valign="bottom">7<hr></hr>
</td>
<td align="center" valign="bottom">1<hr></hr>
</td>
</tr>
<tr><td align="left" valign="bottom">23<hr></hr>
</td>
<td align="justify" valign="bottom">drug_has_target(A,B,inhibitor), interact(B,C), domain(C,’ Protein synthesis factor, GTP-binding’)<hr></hr>
</td>
<td align="center" valign="bottom">7<hr></hr>
</td>
<td align="center" valign="bottom">1<hr></hr>
</td>
</tr>
<tr><td align="left" valign="bottom">28<hr></hr>
</td>
<td align="justify" valign="bottom">drug_cluster(A,’7_marinol’,hpcombo)<hr></hr>
</td>
<td align="center" valign="bottom">7<hr></hr>
</td>
<td align="center" valign="bottom">1<hr></hr>
</td>
</tr>
<tr><td align="left" valign="bottom">7<hr></hr>
</td>
<td align="justify" valign="bottom">category(A,’Topoisomerase Inhibitors’), drug_has_target(A,B,inhibitor), goterm(B,’transferase activity’)<hr></hr>
</td>
<td align="center" valign="bottom">6<hr></hr>
</td>
<td align="center" valign="bottom">1<hr></hr>
</td>
</tr>
<tr><td align="left" valign="bottom">12<hr></hr>
</td>
<td align="justify" valign="bottom">drug_cluster(A,’29_norfloxacin’,hpcf)<hr></hr>
</td>
<td align="center" valign="bottom">6<hr></hr>
</td>
<td align="center" valign="bottom">1<hr></hr>
</td>
</tr>
<tr><td align="left" valign="bottom">17<hr></hr>
</td>
<td align="justify" valign="bottom">category(A,’Cyclooxygenase 2 Inhibitors’), drug_cluster(A,’2_estazolam’,hpcc)<hr></hr>
</td>
<td align="center" valign="bottom">6<hr></hr>
</td>
<td align="center" valign="bottom">0<hr></hr>
</td>
</tr>
<tr><td align="left" valign="bottom">32<hr></hr>
</td>
<td align="justify" valign="bottom">drug_has_target(A,B,activator), goterm(B,’inflammatory response’), goterm(B,’protein binding’)<hr></hr>
</td>
<td align="center" valign="bottom">6<hr></hr>
</td>
<td align="center" valign="bottom">0<hr></hr>
</td>
</tr>
<tr><td align="left" valign="bottom">2<hr></hr>
</td>
<td align="justify" valign="bottom">category(A,’Serotonin Uptake Inhibitors’)<hr></hr>
</td>
<td align="center" valign="bottom">5<hr></hr>
</td>
<td align="center" valign="bottom">0<hr></hr>
</td>
</tr>
<tr><td align="left" valign="bottom">4<hr></hr>
</td>
<td align="justify" valign="bottom">drug_has_target(A,B,inhibitor), goterm(B,’synapse assembly’), drug_cluster(A,’14_fentanyl’,hpcombo)<hr></hr>
</td>
<td align="center" valign="bottom">5<hr></hr>
</td>
<td align="center" valign="bottom">1<hr></hr>
</td>
</tr>
<tr><td align="left" valign="bottom">9<hr></hr>
</td>
<td align="justify" valign="bottom">drug_has_target(A,B,activator), goterm(B,’protein heterodimerization activity’), goterm(B,’cell-cell signaling’)<hr></hr>
</td>
<td align="center" valign="bottom">5<hr></hr>
</td>
<td align="center" valign="bottom">1<hr></hr>
</td>
</tr>
<tr><td align="left" valign="bottom"><bold>11</bold>
<hr></hr>
</td>
<td align="justify" valign="bottom"><bold>drug_has_target(A,B,other), interact(B,C), pathway(C,’Focal adhesion’,kegg), pathway(C,’Signaling events mediated by focal adhesion kinase’,pid)</bold>
<hr></hr>
</td>
<td align="center" valign="bottom"><bold>5</bold>
<hr></hr>
</td>
<td align="center" valign="bottom"><bold>0</bold>
<hr></hr>
</td>
</tr>
<tr><td align="left" valign="bottom">13<hr></hr>
</td>
<td align="justify" valign="bottom">category(A,’HIV Protease Inhibitors’), drug_has_target(A,B,inhibitor), goterm(B,C), go_relation(C,is_a,D), go_relation(D,is_a,’catalytic activity’)<hr></hr>
</td>
<td align="center" valign="bottom">5<hr></hr>
</td>
<td align="center" valign="bottom">1<hr></hr>
</td>
</tr>
<tr><td align="left" valign="bottom">26<hr></hr>
</td>
<td align="justify" valign="bottom">drug_has_target(A,B,inhibitor), goterm(B,’heart development’)<hr></hr>
</td>
<td align="center" valign="bottom">5<hr></hr>
</td>
<td align="center" valign="bottom">1<hr></hr>
</td>
</tr>
<tr><td align="left" valign="bottom">27<hr></hr>
</td>
<td align="justify" valign="bottom">drug_has_target(A,B,inhibitor), goterm(B,C), go_relation(C,is_a,go:65008), drug_cluster(A,’55_thiothixene’,tanimoto)<hr></hr>
</td>
<td align="center" valign="bottom">5<hr></hr>
</td>
<td align="center" valign="bottom">0<hr></hr>
</td>
</tr>
<tr><td align="left" valign="bottom">29<hr></hr>
</td>
<td align="justify" valign="bottom">category(A,’HIV Protease Inhibitors’), drug_has_target(A,B,inhibitor), goterm(B,’oxidation reduction’)<hr></hr>
</td>
<td align="center" valign="bottom">5<hr></hr>
</td>
<td align="center" valign="bottom">0<hr></hr>
</td>
</tr>
<tr><td align="left" valign="bottom">33<hr></hr>
</td>
<td align="justify" valign="bottom">drug_has_target(A,B,other), goterm(B,C), go_relation(C,is_a,go:51240)<hr></hr>
</td>
<td align="center" valign="bottom">5<hr></hr>
</td>
<td align="center" valign="bottom">1<hr></hr>
</td>
</tr>
<tr><td align="left">34</td>
<td align="justify">drug_has_target(A,B,inhibitor), goterm(B,C), go_relation(C,is_a,’binding’), drug_cluster(A,’27_quinine’,hpcombo)</td>
<td align="center">5</td>
<td align="center">1</td>
</tr>
</tbody>
</table>
<table-wrap-foot><p>The condition parts of the 35 rules contained in the SEP_7 theory are given with the number of positive (P) and negative (N) examples covered. The 3 rules confirmed using peer-reviewed literature are in bold. Rules are ordered by number of positive examples covered. The 8 predicates are defined as follows: drug_has_target(A,B,inhibitor/activator): drug A inhibits/activates protein B; goterm(B,G): protein B is annotated by GO term G; go_relation(G1,R,G2): the relationship between GO terms G1 and G2 is R; interact(B,C): protein B interacts with protein C; pathway(B,P,S): protein B is involved in pathway P, as recorded in source S (kegg or pid); drug_cluster(A,K,M): drug A is a member of cluster K obtained using method M; category(A,T): drug A belongs to category T; domain(B,D): protein B contains domain D.</p>
</table-wrap-foot>
</table-wrap>
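<p>To make the reading of Table 5 concrete, the sketch below evaluates the condition part of rule 16 against a toy fact base and counts the positive (P) and negative (N) examples it covers. All drug and protein identifiers are hypothetical, and the Python encoding of the predicates is an assumption made for illustration; the theories themselves were induced by an ILP system, not by this code.</p>
<preformat><![CDATA[
```python
# Toy fact base; every drug and protein identifier below is hypothetical.
DRUG_HAS_TARGET = {            # (drug, protein, action)
    ("drug_a", "prot_1", "inhibitor"),
    ("drug_b", "prot_2", "activator"),
    ("drug_c", "prot_3", "inhibitor"),
}
PATHWAY = {                    # (protein, pathway name, source)
    ("prot_1", "Endocytosis", "kegg"),
    ("prot_2", "Endocytosis", "kegg"),
    ("prot_3", "Focal adhesion", "kegg"),
}
POSITIVE = {"drug_a", "drug_b"}   # drugs labelled with the SEP
NEGATIVE = {"drug_c"}             # drugs labelled without it


def rule_16(drug):
    """drug_has_target(A,B,inhibitor), pathway(B,'Endocytosis',kegg)."""
    return any(
        (protein, "Endocytosis", "kegg") in PATHWAY
        for d, protein, action in DRUG_HAS_TARGET
        if d == drug and action == "inhibitor"
    )


def coverage(rule, positives, negatives):
    """Count the positive (P) and negative (N) examples covered by a rule."""
    p = sum(1 for drug in positives if rule(drug))
    n = sum(1 for drug in negatives if rule(drug))
    return p, n


p, n = coverage(rule_16, POSITIVE, NEGATIVE)
print(f"rule 16 covers P={p}, N={n}")
```
]]></preformat>
<p>In this toy setting only drug_a is covered: drug_b targets an Endocytosis protein but as an activator, and drug_c inhibits a protein outside the pathway.</p>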
<p>Finally, from a more global point of view, the drugs can be represented according to the rules they satisfy, resulting in a drug × rule binary table. This table constitutes a kind of abstraction, based on extracted knowledge, of the initial drug × TC binary table (Figure <xref ref-type="fig" rid="F3">3</xref>
). Interestingly, this new representation leads to improved clustering results for the drug set (not shown) and could be further exploited for prediction studies of particular SEPs.</p>
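<p>A drug × rule binary table of this kind could be assembled as sketched below. The rule coverage sets and drug names are hypothetical, and the Jaccard similarity is only one possible choice of measure for a subsequent clustering step; the measure used for the clustering mentioned above is not stated here.</p>
<preformat><![CDATA[
```python
# Hypothetical rule-satisfaction results: rule id -> set of drugs covered.
COVERED_BY = {
    "rule_11": {"drug_a", "drug_c"},
    "rule_14": {"drug_b"},
    "rule_16": {"drug_a", "drug_b"},
}
DRUGS = ["drug_a", "drug_b", "drug_c", "drug_d"]


def drug_rule_table(drugs, covered_by):
    """Build a binary table: one row per drug, one 0/1 column per rule."""
    rules = sorted(covered_by)
    rows = {d: [int(d in covered_by[r]) for r in rules] for d in drugs}
    return rules, rows


def jaccard(u, v):
    """Jaccard similarity of two binary vectors (0.0 when both are empty)."""
    both = sum(1 for a, b in zip(u, v) if a and b)
    either = sum(1 for a, b in zip(u, v) if a or b)
    return both / either if either else 0.0


rules, table = drug_rule_table(DRUGS, COVERED_BY)
print("drug   ", " ".join(rules))
for drug, row in table.items():
    print(drug, row)
print("similarity(drug_a, drug_c) =", jaccard(table["drug_a"], table["drug_c"]))
```
]]></preformat>
<p>Each row is a fixed-length fingerprint, so any clustering method that accepts binary vectors or a pairwise similarity matrix can be applied to it.</p>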
</sec>
</sec>
<sec sec-type="conclusions"><title>Conclusions</title>
<p>Our study proposes an integrative machine-learning approach for predicting side-effect profiles (SEPs) and understanding their mechanisms. We integrate drug characteristics and background knowledge, such as functional annotations, interactions and pathways, in a relational database. An extensive learning set is built by associating drugs with clusters of side effects (TCs) according to SIDER information. Our first contribution consists of extracting SEPs from this complex table of fingerprints as the longest groups of TCs shared by more than one hundred drugs. We also set up two machine-learning methods, namely decision trees (DT) and inductive logic programming (ILP), to learn which combinations of properties of drugs and their targets lead to a given SEP. After evaluating the learned models, our general observation is that ILP models have a higher sensitivity than DT models. Because higher sensitivity means fewer false negatives, ILP theories miss fewer of the drugs that display a given SEP than decision trees do. This was confirmed on a small test set, including a checking procedure using FAERS as an external and complementary information source. More sophisticated prediction procedures that integrate FAERS and rely on selected rules can be designed. This should improve the prediction accuracy, at least for specific SEPs with good-quality data. The results obtained with ILP also show that background knowledge is well exploited during rule induction. Thus, in addition to targets, chemical structure and biological process annotation already studied by other groups [<xref ref-type="bibr" rid="B4">4</xref>
,<xref ref-type="bibr" rid="B7">7</xref>
,<xref ref-type="bibr" rid="B8">8</xref>
], we show that information about pathways, protein-protein interactions and, to a lesser extent, protein domains also plays an important role in side-effect characterization. Further experiments may include other types of background knowledge such as clinical data and/or polymorphisms.</p>
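<p>The link between sensitivity and false negatives can be made explicit with the standard definitions of sensitivity, specificity and accuracy. The sketch below uses hypothetical confusion counts, not results from this study, to show how a model with fewer false negatives obtains a higher sensitivity even when its specificity is lower.</p>
<preformat><![CDATA[
```python
def confusion_metrics(tp, fp, tn, fn):
    """Return (sensitivity, specificity, accuracy) from confusion counts."""
    def ratio(num, den):
        # Guard against empty classes instead of dividing by zero.
        return num / den if den else float("nan")

    sensitivity = ratio(tp, tp + fn)          # TP / (TP + FN)
    specificity = ratio(tn, tn + fp)          # TN / (TN + FP)
    accuracy = ratio(tp + tn, tp + fp + tn + fn)
    return sensitivity, specificity, accuracy


# Hypothetical counts for two models evaluated on the same 100 drugs
# (50 positive and 50 negative examples).
dt_like = confusion_metrics(tp=30, fp=10, tn=40, fn=20)
ilp_like = confusion_metrics(tp=45, fp=20, tn=30, fn=5)

for name, (sens, spec, acc) in (("DT-like", dt_like), ("ILP-like", ilp_like)):
    print(f"{name}: Sens={sens:.2f} Spec={spec:.2f} Acc={acc:.2f}")
```
]]></preformat>
<p>Sensitivity depends only on the positive examples, which is why it can rise while specificity falls.</p>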
<p>In our approach, we characterize SEPs rather than individual TCs. Because drugs are frequently associated with more than one TC, studying each TC separately implicitly assumes that side effects occur independently of one another. This is likely a simplified view of side-effect occurrence, and the existence of SEPs shared by more than 20% of the drug set strongly suggests that side effects are correlated. Moreover, our approach can be applied to any user-defined SEP or TC of interest.</p>
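<p>The notion of a SEP as a longest group of TCs shared by many drugs corresponds to a maximal frequent itemset (MFI) over the drug × TC table. The brute-force sketch below illustrates the idea on a toy table; the drug names, TC identifiers and support threshold are hypothetical, and exhaustive enumeration is only workable for very small item sets, so a dedicated MFI mining algorithm would be needed on real data.</p>
<preformat><![CDATA[
```python
from itertools import combinations

# Hypothetical drug -> set of term clusters (TCs); identifiers are illustrative.
DRUG_TCS = {
    "drug_a": {"TC1", "TC2", "TC3"},
    "drug_b": {"TC1", "TC2", "TC3"},
    "drug_c": {"TC1", "TC2"},
    "drug_d": {"TC4"},
}


def support(itemset, transactions):
    """Number of drugs whose TC set contains every TC of the itemset."""
    return sum(1 for items in transactions.values() if itemset <= items)


def maximal_frequent_itemsets(transactions, min_support):
    """Enumerate frequent itemsets, then keep those with no frequent superset."""
    items = sorted(set().union(*transactions.values()))
    frequent = [
        frozenset(candidate)
        for size in range(1, len(items) + 1)
        for candidate in combinations(items, size)
        if support(frozenset(candidate), transactions) >= min_support
    ]
    # "s < other" tests for a proper subset, i.e. a strictly larger itemset.
    return [s for s in frequent if not any(s < other for other in frequent)]


for threshold in (2, 3):
    profiles = maximal_frequent_itemsets(DRUG_TCS, min_support=threshold)
    print(threshold, [sorted(p) for p in profiles])
```
]]></preformat>
<p>Raising the support threshold shortens the profiles that survive, which reflects the trade-off between profile length and the number of drugs sharing it.</p>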
<p>We believe that our approach represents a valuable methodology for understanding and predicting side-effect profiles. Our results suggest that the first-order logic theories can already be used during the drug discovery process to anticipate the occurrence of side effects at an early stage and thus decrease the attrition rate.</p>
</sec>
<sec><title>Availability of supporting data</title>
<p>All decision trees and ILP theories are available at <ext-link ext-link-type="uri" xlink:href="http://plateforme-mbi.loria.fr/side-effect-profiles">http://plateforme-mbi.loria.fr/side-effect-profiles</ext-link>
.</p>
</sec>
<sec><title>Abbreviations</title>
<p>Acc: Accuracy; DT: Decision tree; ILP: Inductive logic programming; MFI: Maximal frequent itemset; Sens: Sensitivity; SE: Side effect; SEP: Side effect profile; Spec: Specificity.</p>
</sec>
<sec><title>Competing interests</title>
<p>The authors declare that they have no competing interests.</p>
</sec>
<sec><title>Authors’ contributions</title>
<p>EB participated in the conception and design of the study and in the acquisition of data, and carried out the machine-learning experiments. RG designed and developed programs for automating machine-learning experiments and cross-validations. GM carried out the clustering experiments on molecules. ASK and MS participated in the conception of the study and in the interpretation and critical analysis of the results. MDD and MST conceived the study, carried out its design and coordination, and helped EB to draft the manuscript. All authors read and approved the final manuscript.</p>
</sec>
</body>
<back><sec><title>Acknowledgements</title>
<p>EB benefited from a CIFRE contract (ANRT) including the Harmonic Pharma Company. RG was funded by Inria. GM was funded by CNRS via the BioProLor project.</p>
<p>Thanks to Dave Ritchie and Anisah Ghoorah for their careful reading of the paper.</p>
</sec>
<ref-list><ref id="B1"><mixed-citation publication-type="other"><article-title>U.S. Food and Drug Administration</article-title>
<comment>[<ext-link ext-link-type="uri" xlink:href="http://www.fda.gov">http://www.fda.gov</ext-link>
]</comment>
</mixed-citation>
</ref>
<ref id="B2"><mixed-citation publication-type="journal"><name><surname>Derumeaux</surname>
<given-names>G</given-names>
</name>
<name><surname>Ernande</surname>
<given-names>L</given-names>
</name>
<name><surname>Serusclat</surname>
<given-names>A</given-names>
</name>
<name><surname>Servan</surname>
<given-names>E</given-names>
</name>
<name><surname>Bruckert</surname>
<given-names>E</given-names>
</name>
<name><surname>Rousset</surname>
<given-names>H</given-names>
</name>
<name><surname>Senn</surname>
<given-names>S</given-names>
</name>
<name><surname>Van Gaal</surname>
<given-names>L</given-names>
</name>
<name><surname>Picandet</surname>
<given-names>B</given-names>
</name>
<name><surname>Gavini</surname>
<given-names>F</given-names>
</name>
<name><surname>Moulin</surname>
<given-names>P</given-names>
</name>
<article-title>Echocardiographic evidence for valvular toxicity of benfluorex: a double-blind randomised trial in patients with type 2 diabetes mellitus</article-title>
<source>PLoS ONE</source>
<year>2012</year>
<volume>7</volume>
<issue>6</issue>
<fpage>e38273</fpage>
<pub-id pub-id-type="doi">10.1371/journal.pone.0038273</pub-id>
<pub-id pub-id-type="pmid">22723853</pub-id>
</mixed-citation>
</ref>
<ref id="B3"><mixed-citation publication-type="journal"><name><surname>Kuhn</surname>
<given-names>M</given-names>
</name>
<name><surname>Campillos</surname>
<given-names>M</given-names>
</name>
<name><surname>Letunic</surname>
<given-names>I</given-names>
</name>
<name><surname>Jensen</surname>
<given-names>LJ</given-names>
</name>
<name><surname>Bork</surname>
<given-names>P</given-names>
</name>
<article-title>A side effect resource to capture phenotypic effects of drugs</article-title>
<source>Mol Syst Biol</source>
<year>2010</year>
<volume>6</volume>
<fpage>343</fpage>
<pub-id pub-id-type="pmid">20087340</pub-id>
</mixed-citation>
</ref>
<ref id="B4"><mixed-citation publication-type="journal"><name><surname>Campillos</surname>
<given-names>M</given-names>
</name>
<name><surname>Kuhn</surname>
<given-names>M</given-names>
</name>
<name><surname>Gavin</surname>
<given-names>AC</given-names>
</name>
<name><surname>Jensen</surname>
<given-names>LJ</given-names>
</name>
<name><surname>Bork</surname>
<given-names>P</given-names>
</name>
<article-title>Drug target identification using side-effect similarity</article-title>
<source>Science</source>
<year>2008</year>
<volume>321</volume>
<issue>5886</issue>
<fpage>263</fpage>
<lpage>266</lpage>
<pub-id pub-id-type="doi">10.1126/science.1158140</pub-id>
<pub-id pub-id-type="pmid">18621671</pub-id>
</mixed-citation>
</ref>
<ref id="B5"><mixed-citation publication-type="journal"><name><surname>Takarabe</surname>
<given-names>M</given-names>
</name>
<name><surname>Kotera</surname>
<given-names>M</given-names>
</name>
<name><surname>Nishimura</surname>
<given-names>Y</given-names>
</name>
<name><surname>Goto</surname>
<given-names>S</given-names>
</name>
<name><surname>Yamanishi</surname>
<given-names>Y</given-names>
</name>
<article-title>Drug target prediction using adverse event report systems: a pharmacogenomic approach</article-title>
<source>Bioinformatics</source>
<year>2012</year>
<volume>28</volume>
<issue>18</issue>
<fpage>i611</fpage>
<pub-id pub-id-type="doi">10.1093/bioinformatics/bts413</pub-id>
<pub-id pub-id-type="pmid">22962489</pub-id>
</mixed-citation>
</ref>
<ref id="B6"><mixed-citation publication-type="journal"><name><surname>Yang</surname>
<given-names>L</given-names>
</name>
<name><surname>Agarwal</surname>
<given-names>P</given-names>
</name>
<article-title>Systematic drug repositioning based on clinical side-effects</article-title>
<source>PLoS ONE</source>
<year>2011</year>
<volume>6</volume>
<issue>12</issue>
<fpage>e28025</fpage>
<pub-id pub-id-type="doi">10.1371/journal.pone.0028025</pub-id>
<pub-id pub-id-type="pmid">22205936</pub-id>
</mixed-citation>
</ref>
<ref id="B7"><mixed-citation publication-type="journal"><name><surname>Scheiber</surname>
<given-names>J</given-names>
</name>
<name><surname>Jenkins</surname>
<given-names>JL</given-names>
</name>
<name><surname>Sukuru</surname>
<given-names>SC</given-names>
</name>
<name><surname>Bender</surname>
<given-names>A</given-names>
</name>
<name><surname>Mikhailov</surname>
<given-names>D</given-names>
</name>
<name><surname>Milik</surname>
<given-names>M</given-names>
</name>
<name><surname>Azzaoui</surname>
<given-names>K</given-names>
</name>
<name><surname>Whitebread</surname>
<given-names>S</given-names>
</name>
<name><surname>Hamon</surname>
<given-names>J</given-names>
</name>
<name><surname>Urban</surname>
<given-names>L</given-names>
</name>
<name><surname>Glick</surname>
<given-names>M</given-names>
</name>
<name><surname>Davies</surname>
<given-names>JW</given-names>
</name>
<article-title>Mapping adverse drug reactions in chemical space</article-title>
<source>J Med Chem</source>
<year>2009</year>
<volume>52</volume>
<issue>9</issue>
<fpage>3103</fpage>
<lpage>3107</lpage>
<pub-id pub-id-type="doi">10.1021/jm801546k</pub-id>
<pub-id pub-id-type="pmid">19378990</pub-id>
</mixed-citation>
</ref>
<ref id="B8"><mixed-citation publication-type="journal"><name><surname>Lee</surname>
<given-names>S</given-names>
</name>
<name><surname>Lee</surname>
<given-names>KH</given-names>
</name>
<name><surname>Song</surname>
<given-names>M</given-names>
</name>
<name><surname>Lee</surname>
<given-names>D</given-names>
</name>
<article-title>Building the process-drug-side effect network to discover the relationship between biological processes and side effects</article-title>
<source>BMC Bioinformatics</source>
<year>2011</year>
<volume>12</volume>
<issue>Suppl 2</issue>
<fpage>S2</fpage>
<pub-id pub-id-type="doi">10.1186/1471-2105-12-S2-S2</pub-id>
<pub-id pub-id-type="pmid">21489221</pub-id>
</mixed-citation>
</ref>
<ref id="B9"><mixed-citation publication-type="journal"><name><surname>Yamanishi</surname>
<given-names>Y</given-names>
</name>
<name><surname>Pauwels</surname>
<given-names>E</given-names>
</name>
<name><surname>Kotera</surname>
<given-names>M</given-names>
</name>
<article-title>Drug side-effect prediction based on the integration of chemical and biological spaces</article-title>
<source>J Chem Inf Model</source>
<year>2012</year>
<volume>52</volume>
<issue>12</issue>
<fpage>3284</fpage>
<lpage>3292</lpage>
<pub-id pub-id-type="doi">10.1021/ci2005548</pub-id>
<pub-id pub-id-type="pmid">23157436</pub-id>
</mixed-citation>
</ref>
<ref id="B10"><mixed-citation publication-type="journal"><name><surname>Benabderrahmane</surname>
<given-names>S</given-names>
</name>
<name><surname>Smail-Tabbone</surname>
<given-names>M</given-names>
</name>
<name><surname>Poch</surname>
<given-names>O</given-names>
</name>
<name><surname>Napoli</surname>
<given-names>A</given-names>
</name>
<name><surname>Devignes</surname>
<given-names>MD</given-names>
</name>
<article-title>IntelliGO: a new vector-based semantic similarity measure including annotation origin</article-title>
<source>BMC Bioinformatics</source>
<year>2010</year>
<volume>11</volume>
<fpage>588</fpage>
<pub-id pub-id-type="doi">10.1186/1471-2105-11-588</pub-id>
<pub-id pub-id-type="pmid">21122125</pub-id>
</mixed-citation>
</ref>
<ref id="B11"><mixed-citation publication-type="journal"><name><surname>Knox</surname>
<given-names>C</given-names>
</name>
<name><surname>Law</surname>
<given-names>V</given-names>
</name>
<name><surname>Jewison</surname>
<given-names>T</given-names>
</name>
<name><surname>Liu</surname>
<given-names>P</given-names>
</name>
<name><surname>Ly</surname>
<given-names>S</given-names>
</name>
<name><surname>Frolkis</surname>
<given-names>A</given-names>
</name>
<name><surname>Pon</surname>
<given-names>A</given-names>
</name>
<name><surname>Banco</surname>
<given-names>K</given-names>
</name>
<name><surname>Mak</surname>
<given-names>C</given-names>
</name>
<name><surname>Neveu</surname>
<given-names>V</given-names>
</name>
<name><surname>Djoumbou</surname>
<given-names>Y</given-names>
</name>
<name><surname>Eisner</surname>
<given-names>R</given-names>
</name>
<name><surname>Guo</surname>
<given-names>AC</given-names>
</name>
<name><surname>Wishart</surname>
<given-names>DS</given-names>
</name>
<article-title>DrugBank 3.0: a comprehensive resource for ’omics’ research on drugs</article-title>
<source>Nucleic Acids Res</source>
<year>2011</year>
<volume>39</volume>
<issue>Database issue</issue>
<fpage>D1035</fpage>
<lpage>D1041</lpage>
<pub-id pub-id-type="pmid">21059682</pub-id>
</mixed-citation>
</ref>
<ref id="B12"><mixed-citation publication-type="journal"><name><surname>O’Boyle</surname>
<given-names>NM</given-names>
</name>
<name><surname>Banck</surname>
<given-names>M</given-names>
</name>
<name><surname>James</surname>
<given-names>CA</given-names>
</name>
<name><surname>Morley</surname>
<given-names>C</given-names>
</name>
<name><surname>Vandermeersch</surname>
<given-names>T</given-names>
</name>
<name><surname>Hutchison</surname>
<given-names>GR</given-names>
</name>
<article-title>Open Babel: An open chemical toolbox</article-title>
<source>J Cheminform</source>
<year>2011</year>
<volume>3</volume>
<fpage>33</fpage>
<pub-id pub-id-type="doi">10.1186/1758-2946-3-33</pub-id>
<pub-id pub-id-type="pmid">21982300</pub-id>
</mixed-citation>
</ref>
<ref id="B13"><mixed-citation publication-type="journal"><name><surname>Ritchie</surname>
<given-names>DW</given-names>
</name>
<name><surname>Kemp</surname>
<given-names>GJL</given-names>
</name>
<article-title>Fast computation, rotation, and comparison of low resolution spherical harmonic molecular surfaces</article-title>
<source>J Comput Chem</source>
<year>1999</year>
<volume>20</volume>
<issue>4</issue>
<fpage>383</fpage>
<lpage>395</lpage>
<pub-id pub-id-type="doi">10.1002/(SICI)1096-987X(199903)20:4&lt;383::AID-JCC1&gt;3.0.CO;2-M</pub-id>
</mixed-citation>
</ref>
<ref id="B14"><mixed-citation publication-type="journal"><name><surname>Cai</surname>
<given-names>W</given-names>
</name>
<name><surname>Xu</surname>
<given-names>J</given-names>
</name>
<name><surname>Shao</surname>
<given-names>X</given-names>
</name>
<name><surname>Leroux</surname>
<given-names>V</given-names>
</name>
<name><surname>Beautrait</surname>
<given-names>A</given-names>
</name>
<name><surname>Maigret</surname>
<given-names>B</given-names>
</name>
<article-title>SHEF: a vHTS geometrical filter using coefficients of spherical harmonic molecular surfaces</article-title>
<source>J Mol Model</source>
<year>2008</year>
<volume>14</volume>
<issue>5</issue>
<fpage>393</fpage>
<lpage>401</lpage>
<pub-id pub-id-type="doi">10.1007/s00894-008-0286-z</pub-id>
<pub-id pub-id-type="pmid">18330602</pub-id>
</mixed-citation>
</ref>
<ref id="B15"><mixed-citation publication-type="journal"><name><surname>Ward</surname>
<given-names>JH</given-names>
</name>
<article-title>Hierarchical grouping to optimize an objective function</article-title>
<source>J Am Stat Assoc</source>
<year>1963</year>
<volume>58</volume>
<issue>301</issue>
<fpage>236</fpage>
<lpage>244</lpage>
<pub-id pub-id-type="doi">10.1080/01621459.1963.10500845</pub-id>
</mixed-citation>
</ref>
<ref id="B16"><mixed-citation publication-type="journal"><name><surname>Kelley</surname>
<given-names>LA</given-names>
</name>
<name><surname>Gardner</surname>
<given-names>SP</given-names>
</name>
<name><surname>Sutcliffe</surname>
<given-names>MJ</given-names>
</name>
<article-title>An automated approach for clustering an ensemble of NMR-derived protein structures into conformationally related subfamilies</article-title>
<source>Protein Eng</source>
<year>1996</year>
<volume>9</volume>
<issue>11</issue>
<fpage>1063</fpage>
<lpage>1065</lpage>
<pub-id pub-id-type="doi">10.1093/protein/9.11.1063</pub-id>
<pub-id pub-id-type="pmid">8961360</pub-id>
</mixed-citation>
</ref>
<ref id="B17"><mixed-citation publication-type="journal"><name><surname>Berman</surname>
<given-names>HM</given-names>
</name>
<name><surname>Westbrook</surname>
<given-names>J</given-names>
</name>
<name><surname>Feng</surname>
<given-names>Z</given-names>
</name>
<name><surname>Gilliland</surname>
<given-names>G</given-names>
</name>
<name><surname>Bhat</surname>
<given-names>TN</given-names>
</name>
<name><surname>Weissig</surname>
<given-names>H</given-names>
</name>
<name><surname>Shindyalov</surname>
<given-names>IN</given-names>
</name>
<name><surname>Bourne</surname>
<given-names>PE</given-names>
</name>
<article-title>The Protein Data Bank</article-title>
<source>Nucleic Acids Res</source>
<year>2000</year>
<volume>28</volume>
<fpage>235</fpage>
<lpage>242</lpage>
<pub-id pub-id-type="doi">10.1093/nar/28.1.235</pub-id>
<pub-id pub-id-type="pmid">10592235</pub-id>
</mixed-citation>
</ref>
<ref id="B18"><mixed-citation publication-type="journal"><name><surname>Kerrien</surname>
<given-names>S</given-names>
</name>
<name><surname>Aranda</surname>
<given-names>B</given-names>
</name>
<name><surname>Breuza</surname>
<given-names>L</given-names>
</name>
<name><surname>Bridge</surname>
<given-names>A</given-names>
</name>
<name><surname>Broackes-Carter</surname>
<given-names>F</given-names>
</name>
<name><surname>Chen</surname>
<given-names>C</given-names>
</name>
<name><surname>Duesbury</surname>
<given-names>M</given-names>
</name>
<name><surname>Dumousseau</surname>
<given-names>M</given-names>
</name>
<name><surname>Feuermann</surname>
<given-names>M</given-names>
</name>
<name><surname>Hinz</surname>
<given-names>U</given-names>
</name>
<name><surname>Jandrasits</surname>
<given-names>C</given-names>
</name>
<name><surname>Jimenez</surname>
<given-names>RC</given-names>
</name>
<name><surname>Khadake</surname>
<given-names>J</given-names>
</name>
<name><surname>Mahadevan</surname>
<given-names>U</given-names>
</name>
<name><surname>Masson</surname>
<given-names>P</given-names>
</name>
<name><surname>Pedruzzi</surname>
<given-names>I</given-names>
</name>
<name><surname>Pfeiffenberger</surname>
<given-names>E</given-names>
</name>
<name><surname>Porras</surname>
<given-names>P</given-names>
</name>
<name><surname>Raghunath</surname>
<given-names>A</given-names>
</name>
<name><surname>Roechert</surname>
<given-names>B</given-names>
</name>
<name><surname>Orchard</surname>
<given-names>S</given-names>
</name>
<name><surname>Hermjakob</surname>
<given-names>H</given-names>
</name>
<article-title>The IntAct molecular interaction database in 2012</article-title>
<source>Nucleic Acids Res</source>
<year>2012</year>
<volume>40</volume>
<issue>Database issue</issue>
<fpage>D841</fpage>
<lpage>D846</lpage>
<pub-id pub-id-type="pmid">22121220</pub-id>
</mixed-citation>
</ref>
<ref id="B19"><mixed-citation publication-type="journal"><name><surname>Kanehisa</surname>
<given-names>M</given-names>
</name>
<name><surname>Goto</surname>
<given-names>S</given-names>
</name>
<name><surname>Sato</surname>
<given-names>Y</given-names>
</name>
<name><surname>Furumichi</surname>
<given-names>M</given-names>
</name>
<name><surname>Tanabe</surname>
<given-names>M</given-names>
</name>
<article-title>KEGG for integration and interpretation of large-scale molecular data sets</article-title>
<source>Nucleic Acids Res</source>
<year>2012</year>
<volume>40</volume>
<issue>Database issue</issue>
<fpage>D109</fpage>
<lpage>D114</lpage>
<pub-id pub-id-type="pmid">22080510</pub-id>
</mixed-citation>
</ref>
<ref id="B20"><mixed-citation publication-type="journal"><name><surname>Schaefer</surname>
<given-names>CF</given-names>
</name>
<name><surname>Anthony</surname>
<given-names>K</given-names>
</name>
<name><surname>Krupa</surname>
<given-names>S</given-names>
</name>
<name><surname>Buchoff</surname>
<given-names>J</given-names>
</name>
<name><surname>Day</surname>
<given-names>M</given-names>
</name>
<name><surname>Hannay</surname>
<given-names>T</given-names>
</name>
<name><surname>Buetow</surname>
<given-names>KH</given-names>
</name>
<article-title>PID: the Pathway Interaction Database</article-title>
<source>Nucleic Acids Res</source>
<year>2009</year>
<volume>37</volume>
<issue>Database issue</issue>
<fpage>D674</fpage>
<lpage>D679</lpage>
<pub-id pub-id-type="pmid">18832364</pub-id>
</mixed-citation>
</ref>
<ref id="B21"><mixed-citation publication-type="journal"><name><surname>Binns</surname>
<given-names>D</given-names>
</name>
<name><surname>Dimmer</surname>
<given-names>E</given-names>
</name>
<name><surname>Huntley</surname>
<given-names>R</given-names>
</name>
<name><surname>Barrell</surname>
<given-names>D</given-names>
</name>
<name><surname>O’Donovan</surname>
<given-names>C</given-names>
</name>
<name><surname>Apweiler</surname>
<given-names>R</given-names>
</name>
<article-title>QuickGO: a web-based tool for Gene Ontology searching</article-title>
<source>Bioinformatics</source>
<year>2009</year>
<volume>25</volume>
<issue>22</issue>
<fpage>3045</fpage>
<lpage>3046</lpage>
<pub-id pub-id-type="doi">10.1093/bioinformatics/btp536</pub-id>
<pub-id pub-id-type="pmid">19744993</pub-id>
</mixed-citation>
</ref>
<ref id="B22"><mixed-citation publication-type="journal"><name><surname>Hunter</surname>
<given-names>S</given-names>
</name>
<name><surname>Jones</surname>
<given-names>P</given-names>
</name>
<name><surname>Mitchell</surname>
<given-names>A</given-names>
</name>
<name><surname>Apweiler</surname>
<given-names>R</given-names>
</name>
<name><surname>Attwood</surname>
<given-names>TK</given-names>
</name>
<name><surname>Bateman</surname>
<given-names>A</given-names>
</name>
<name><surname>Bernard</surname>
<given-names>T</given-names>
</name>
<name><surname>Binns</surname>
<given-names>D</given-names>
</name>
<name><surname>Bork</surname>
<given-names>P</given-names>
</name>
<name><surname>Burge</surname>
<given-names>S</given-names>
</name>
<name><surname>de Castro</surname>
<given-names>E</given-names>
</name>
<name><surname>Coggill</surname>
<given-names>P</given-names>
</name>
<name><surname>Corbett</surname>
<given-names>M</given-names>
</name>
<name><surname>Das</surname>
<given-names>U</given-names>
</name>
<name><surname>Daugherty</surname>
<given-names>L</given-names>
</name>
<name><surname>Duquenne</surname>
<given-names>L</given-names>
</name>
<name><surname>Finn</surname>
<given-names>RD</given-names>
</name>
<name><surname>Fraser</surname>
<given-names>M</given-names>
</name>
<name><surname>Gough</surname>
<given-names>J</given-names>
</name>
<name><surname>Haft</surname>
<given-names>D</given-names>
</name>
<name><surname>Hulo</surname>
<given-names>N</given-names>
</name>
<name><surname>Kahn</surname>
<given-names>D</given-names>
</name>
<name><surname>Kelly</surname>
<given-names>E</given-names>
</name>
<name><surname>Letunic</surname>
<given-names>I</given-names>
</name>
<name><surname>Lonsdale</surname>
<given-names>D</given-names>
</name>
<name><surname>Lopez</surname>
<given-names>R</given-names>
</name>
<name><surname>Madera</surname>
<given-names>M</given-names>
</name>
<name><surname>Maslen</surname>
<given-names>J</given-names>
</name>
<name><surname>McAnulla</surname>
<given-names>C</given-names>
</name>
<name><surname>McDowall</surname>
<given-names>J</given-names>
</name>
<etal></etal>
<article-title>InterPro in 2011: new developments in the family and domain prediction database</article-title>
<source>Nucleic Acids Res</source>
<year>2012</year>
<volume>40</volume>
<issue>Database issue</issue>
<fpage>D306</fpage>
<lpage>D312</lpage>
<pub-id pub-id-type="pmid">22096229</pub-id>
</mixed-citation>
</ref>
<ref id="B23"><mixed-citation publication-type="book"><name><surname>Bresso</surname>
<given-names>E</given-names>
</name>
<name><surname>Benabderrahmane</surname>
<given-names>S</given-names>
</name>
<name><surname>Smail-Tabbone</surname>
<given-names>M</given-names>
</name>
<name><surname>Marchetti</surname>
<given-names>G</given-names>
</name>
<name><surname>Karaboga</surname>
<given-names>AS</given-names>
</name>
<name><surname>Souchet</surname>
<given-names>M</given-names>
</name>
<name><surname>Napoli</surname>
<given-names>A</given-names>
</name>
<name><surname>Devignes</surname>
<given-names>MD</given-names>
</name>
<article-title>Use of domain knowledge for dimension reduction - application to mining of drug side effects</article-title>
<source>Proceedings of the International Conference on Knowledge Discovery and Information Retrieval</source>
<year>2011</year>
<publisher-name>SciTePress Digital Library</publisher-name>
<fpage>271</fpage>
<lpage>276</lpage>
</mixed-citation>
</ref>
<ref id="B24"><mixed-citation publication-type="other"><article-title>Medical Dictionary for Regulatory Activities</article-title>
<comment>[<ext-link ext-link-type="uri" xlink:href="http://www.meddramsso.com">http://www.meddramsso.com</ext-link>
]</comment>
</mixed-citation>
</ref>
<ref id="B25"><mixed-citation publication-type="other"><name><surname>Szathmary</surname>
<given-names>L</given-names>
</name>
<article-title>Symbolic data mining methods with the Coron platform</article-title>
<source>PhD Thesis in Computer Science,</source>
<comment>Univ. Henri Poincaré – Nancy 1, France, 2006</comment>
</mixed-citation>
</ref>
<ref id="B26"><mixed-citation publication-type="other"><article-title>Coron</article-title>
<comment>[<ext-link ext-link-type="uri" xlink:href="http://coron.loria.fr">http://coron.loria.fr</ext-link>
]</comment>
</mixed-citation>
</ref>
<ref id="B27"><mixed-citation publication-type="journal"><name><surname>Hall</surname>
<given-names>M</given-names>
</name>
<name><surname>Frank</surname>
<given-names>E</given-names>
</name>
<name><surname>Holmes</surname>
<given-names>G</given-names>
</name>
<name><surname>Pfahringer</surname>
<given-names>B</given-names>
</name>
<name><surname>Reutemann</surname>
<given-names>P</given-names>
</name>
<name><surname>Witten</surname>
<given-names>IH</given-names>
</name>
<article-title>The WEKA data mining software: an update</article-title>
<source>SIGKDD Explorations</source>
<year>2009</year>
<volume>11</volume>
<fpage>10</fpage>
<lpage>18</lpage>
<pub-id pub-id-type="doi">10.1145/1656274.1656278</pub-id>
</mixed-citation>
</ref>
<ref id="B28"><mixed-citation publication-type="book"><name><surname>Muggleton</surname>
<given-names>S</given-names>
</name>
<name><surname>Srinivasan</surname>
<given-names>A</given-names>
</name>
<name><surname>King</surname>
<given-names>RD</given-names>
</name>
<name><surname>Sternberg</surname>
<given-names>MJE</given-names>
</name>
<person-group person-group-type="editor">Arikawa S, Motoda H</person-group>
<article-title>Biochemical knowledge discovery using inductive logic programming</article-title>
<source>Discovery Science, Volume 1532 of Lecture Notes in Computer Science</source>
<year>1998</year>
<publisher-name>Berlin Heidelberg: Springer</publisher-name>
<fpage>326</fpage>
<lpage>341</lpage>
</mixed-citation>
</ref>
<ref id="B29"><mixed-citation publication-type="journal"><name><surname>Page</surname>
<given-names>D</given-names>
</name>
<name><surname>Craven</surname>
<given-names>M</given-names>
</name>
<article-title>Biological applications of multi-relational data mining</article-title>
<source>SIGKDD Explorations</source>
<year>2003</year>
<volume>5</volume>
<fpage>69</fpage>
<lpage>79</lpage>
</mixed-citation>
</ref>
<ref id="B30"><mixed-citation publication-type="journal"><name><surname>Santos</surname>
<given-names>JC</given-names>
</name>
<name><surname>Nassif</surname>
<given-names>H</given-names>
</name>
<name><surname>Page</surname>
<given-names>D</given-names>
</name>
<name><surname>Muggleton</surname>
<given-names>SH</given-names>
</name>
<name><surname>Sternberg</surname>
<given-names>MJ</given-names>
</name>
<article-title>Automated identification of protein-ligand interaction features using inductive logic programming: A hexose binding case study</article-title>
<source>BMC Bioinformatics</source>
<year>2012</year>
<volume>13</volume>
<fpage>162</fpage>
<pub-id pub-id-type="doi">10.1186/1471-2105-13-162</pub-id>
<pub-id pub-id-type="pmid">22783946</pub-id>
</mixed-citation>
</ref>
<ref id="B31"><mixed-citation publication-type="journal"><name><surname>Muggleton</surname>
<given-names>S</given-names>
</name>
<article-title>Inductive logic programming</article-title>
<source>New Generat Comput</source>
<year>1991</year>
<volume>8</volume>
<issue>4</issue>
<fpage>295</fpage>
<lpage>318</lpage>
<pub-id pub-id-type="doi">10.1007/BF03037089</pub-id>
</mixed-citation>
</ref>
<ref id="B32"><mixed-citation publication-type="other"><article-title>The Aleph Manual</article-title>
<comment>[<ext-link ext-link-type="uri" xlink:href="http://www.cs.ox.ac.uk/activities/machlearn/Aleph/aleph.html">http://www.cs.ox.ac.uk/activities/machlearn/Aleph/aleph.html</ext-link>
]</comment>
</mixed-citation>
</ref>
<ref id="B33"><mixed-citation publication-type="book"><name><surname>Bresso</surname>
<given-names>E</given-names>
</name>
<name><surname>Grisoni</surname>
<given-names>R</given-names>
</name>
<name><surname>Devignes</surname>
<given-names>MD</given-names>
</name>
<name><surname>Napoli</surname>
<given-names>A</given-names>
</name>
<name><surname>Smail-Tabbone</surname>
<given-names>M</given-names>
</name>
<article-title>Formal concept analysis for the interpretation of relational learning applied on 3D protein-binding sites</article-title>
<source>Proceedings of the International Conference on Knowledge Discovery and Information Retrieval</source>
<year>2012</year>
<publisher-name>SciTePress Digital Library</publisher-name>
<fpage>111</fpage>
<lpage>120</lpage>
</mixed-citation>
</ref>
<ref id="B34"><mixed-citation publication-type="other"><article-title>KNIME</article-title>
<comment>[<ext-link ext-link-type="uri" xlink:href="http://www.knime.org">http://www.knime.org</ext-link>
]</comment>
</mixed-citation>
</ref>
<ref id="B35"><mixed-citation publication-type="book"><name><surname>Napoli</surname>
<given-names>A</given-names>
</name>
<person-group person-group-type="editor">Cohen H, Lefebvre C</person-group>
<article-title>A smooth introduction to symbolic methods for knowledge discovery</article-title>
<source>Handbook of Categorization in Cognitive Science</source>
<year>2005</year>
<publisher-name>Amsterdam: Elsevier</publisher-name>
<fpage>913</fpage>
<lpage>933</lpage>
</mixed-citation>
</ref>
<ref id="B36"><mixed-citation publication-type="journal"><name><surname>Dolcino</surname>
<given-names>M</given-names>
</name>
<name><surname>Cozzani</surname>
<given-names>E</given-names>
</name>
<name><surname>Riva</surname>
<given-names>S</given-names>
</name>
<name><surname>Parodi</surname>
<given-names>A</given-names>
</name>
<name><surname>Tinazzi</surname>
<given-names>E</given-names>
</name>
<name><surname>Lunardi</surname>
<given-names>C</given-names>
</name>
<name><surname>Puccetti</surname>
<given-names>A</given-names>
</name>
<article-title>Gene expression profiling in dermatitis herpetiformis skin lesions</article-title>
<source>Clin Dev Immunol</source>
<year>2012</year>
<volume>2012</volume>
<fpage>198956</fpage>
<pub-id pub-id-type="pmid">22991566</pub-id>
</mixed-citation>
</ref>
</ref-list>
</back>
</pmc>
</record>

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Wicri/Lorraine/explor/InforLorV4/Data/Pmc/Corpus

HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 0000520 | SxmlIndent | more

HfdSelect -h $EXPLOR_AREA/Data/Pmc/Corpus/biblio.hfd -nk 0000520 | SxmlIndent | more

To link to this page from within the Wicri network

{{Explor lien
   |wiki=    Wicri/Lorraine
   |area=    InforLorV4
   |flux=    Pmc
   |étape=   Corpus
   |type=    RBID
   |clé=     
   |texte=   
}}

This area was generated with Dilib version V0.6.33.
Data generation: Mon Jun 10 21:56:28 2019. Site generation: Fri Feb 25 15:29:27 2022

	Exploration server for computer science research in Lorraine
	Warning: this site is under development! It was generated automatically from raw corpora, so the information has not been validated.
