The folded k-spectrum kernel: A machine learning approach to detecting transcription factor binding sites with gapped nucleotide dependencies
Authors: Abdulkadir Elmas [United States]; Xiaodong Wang [United States]; Jacqueline M. Dresch [United States]
Source: PLoS ONE [1932-6203]; 2017.
Abstract
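Understanding the molecular machinery involved in transcriptional regulation is central to improving our knowledge of an organism’s development, disease, and evolution. The building blocks of this complex molecular machinery are an organism’s genomic DNA sequence and transcription factor proteins. Despite the vast amount of sequence data now available for many model organisms, predicting where transcription factors bind, often referred to as ‘motif detection’, is still incredibly challenging. In this study, we develop a novel bioinformatic approach to binding site prediction. We do this by extending pre-existing SVM approaches in an unbiased way to include all possible gapped k-mers, representing different combinations of complex nucleotide dependencies within binding sites. We show the advantages of this new approach when compared to existing SVM approaches, through a rigorous set of cross-validation experiments. We also demonstrate the effectiveness of our new approach by reporting on its improved performance on a set of 127 genomic regions known to regulate gene expression along the anterior-posterior axis in early Drosophila embryos.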
DOI: 10.1371/journal.pone.0185570
PubMed: 28982128
PubMed Central: 5628859
PMC:5628859
The document in XML format
<record><TEI><teiHeader><fileDesc><titleStmt><title xml:lang="en">The folded <italic>k</italic>-spectrum kernel: A machine learning approach to detecting transcription factor binding sites with gapped nucleotide dependencies</title>
<author><name sortKey="Elmas, Abdulkadir" sort="Elmas, Abdulkadir" uniqKey="Elmas A" first="Abdulkadir" last="Elmas">Abdulkadir Elmas</name>
<affiliation wicri:level="1"><nlm:aff id="aff001"><addr-line>Department of Electrical Engineering, Columbia University, New York, NY, United States of America</addr-line>
</nlm:aff>
<country xml:lang="fr">États-Unis</country>
<wicri:regionArea>Department of Electrical Engineering, Columbia University, New York, NY</wicri:regionArea>
</affiliation>
</author>
<author><name sortKey="Wang, Xiaodong" sort="Wang, Xiaodong" uniqKey="Wang X" first="Xiaodong" last="Wang">Xiaodong Wang</name>
<affiliation wicri:level="1"><nlm:aff id="aff001"><addr-line>Department of Electrical Engineering, Columbia University, New York, NY, United States of America</addr-line>
</nlm:aff>
<country xml:lang="fr">États-Unis</country>
<wicri:regionArea>Department of Electrical Engineering, Columbia University, New York, NY</wicri:regionArea>
</affiliation>
</author>
<author><name sortKey="Dresch, Jacqueline M" sort="Dresch, Jacqueline M" uniqKey="Dresch J" first="Jacqueline M." last="Dresch">Jacqueline M. Dresch</name>
<affiliation wicri:level="1"><nlm:aff id="aff002"><addr-line>Department of Mathematics and Computer Science, Clark University, Worcester, MA, United States of America</addr-line>
</nlm:aff>
<country xml:lang="fr">États-Unis</country>
<wicri:regionArea>Department of Mathematics and Computer Science, Clark University, Worcester, MA</wicri:regionArea>
</affiliation>
</author>
</titleStmt>
<publicationStmt><idno type="wicri:source">PMC</idno>
<idno type="pmid">28982128</idno>
<idno type="pmc">5628859</idno>
<idno type="url">http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5628859</idno>
<idno type="RBID">PMC:5628859</idno>
<idno type="doi">10.1371/journal.pone.0185570</idno>
<date when="2017">2017</date>
<idno type="wicri:Area/Pmc/Corpus">001035</idno>
<idno type="wicri:explorRef" wicri:stream="Pmc" wicri:step="Corpus" wicri:corpus="PMC">001035</idno>
<idno type="wicri:Area/Pmc/Curation">001035</idno>
<idno type="wicri:explorRef" wicri:stream="Pmc" wicri:step="Curation">001035</idno>
</publicationStmt>
<sourceDesc><biblStruct><analytic><title xml:lang="en" level="a" type="main">The folded <italic>k</italic>-spectrum kernel: A machine learning approach to detecting transcription factor binding sites with gapped nucleotide dependencies</title>
<author><name sortKey="Elmas, Abdulkadir" sort="Elmas, Abdulkadir" uniqKey="Elmas A" first="Abdulkadir" last="Elmas">Abdulkadir Elmas</name>
<affiliation wicri:level="1"><nlm:aff id="aff001"><addr-line>Department of Electrical Engineering, Columbia University, New York, NY, United States of America</addr-line>
</nlm:aff>
<country xml:lang="fr">États-Unis</country>
<wicri:regionArea>Department of Electrical Engineering, Columbia University, New York, NY</wicri:regionArea>
</affiliation>
</author>
<author><name sortKey="Wang, Xiaodong" sort="Wang, Xiaodong" uniqKey="Wang X" first="Xiaodong" last="Wang">Xiaodong Wang</name>
<affiliation wicri:level="1"><nlm:aff id="aff001"><addr-line>Department of Electrical Engineering, Columbia University, New York, NY, United States of America</addr-line>
</nlm:aff>
<country xml:lang="fr">États-Unis</country>
<wicri:regionArea>Department of Electrical Engineering, Columbia University, New York, NY</wicri:regionArea>
</affiliation>
</author>
<author><name sortKey="Dresch, Jacqueline M" sort="Dresch, Jacqueline M" uniqKey="Dresch J" first="Jacqueline M." last="Dresch">Jacqueline M. Dresch</name>
<affiliation wicri:level="1"><nlm:aff id="aff002"><addr-line>Department of Mathematics and Computer Science, Clark University, Worcester, MA, United States of America</addr-line>
</nlm:aff>
<country xml:lang="fr">États-Unis</country>
<wicri:regionArea>Department of Mathematics and Computer Science, Clark University, Worcester, MA</wicri:regionArea>
</affiliation>
</author>
</analytic>
<series><title level="j">PLoS ONE</title>
<idno type="eISSN">1932-6203</idno>
<imprint><date when="2017">2017</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
</fileDesc>
<profileDesc><textClass></textClass>
</profileDesc>
</teiHeader>
<front><div type="abstract" xml:lang="en"><p>Understanding the molecular machinery involved in transcriptional regulation is central to improving our knowledge of an organism’s development, disease, and evolution. The building blocks of this complex molecular machinery are an organism’s genomic DNA sequence and transcription factor proteins. Despite the vast amount of sequence data now available for many model organisms, predicting where transcription factors bind, often referred to as ‘motif detection’, is still incredibly challenging. In this study, we develop a novel bioinformatic approach to binding site prediction. We do this by extending pre-existing SVM approaches in an unbiased way to include all possible gapped <italic>k</italic>-mers, representing different combinations of complex nucleotide dependencies within binding sites. We show the advantages of this new approach when compared to existing SVM approaches, through a rigorous set of cross-validation experiments. We also demonstrate the effectiveness of our new approach by reporting on its improved performance on a set of 127 genomic regions known to regulate gene expression along the anterior-posterior axis in early <italic>Drosophila</italic> embryos.</p>
</div>
</front>
<back><div1 type="bibliography"><listBibl><biblStruct><analytic><author><name sortKey="Borok, M" uniqKey="Borok M">M Borok</name>
</author>
<author><name sortKey="Tran, D" uniqKey="Tran D">D Tran</name>
</author>
<author><name sortKey="Ho, M" uniqKey="Ho M">M Ho</name>
</author>
<author><name sortKey="Ra, D" uniqKey="Ra D">D RA</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Starr, M" uniqKey="Starr M">M Starr</name>
</author>
<author><name sortKey="Ho, M" uniqKey="Ho M">M Ho</name>
</author>
<author><name sortKey="Gunther, E" uniqKey="Gunther E">E Gunther</name>
</author>
<author><name sortKey="Tu, Y" uniqKey="Tu Y">Y Tu</name>
</author>
<author><name sortKey="Shur, A" uniqKey="Shur A">A Shur</name>
</author>
<author><name sortKey="Goetz, S" uniqKey="Goetz S">S Goetz</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Crocker, J" uniqKey="Crocker J">J Crocker</name>
</author>
<author><name sortKey="Tamori, Y" uniqKey="Tamori Y">Y Tamori</name>
</author>
<author><name sortKey="Erives, A" uniqKey="Erives A">A Erives</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Martinez, Ca" uniqKey="Martinez C">CA Martinez</name>
</author>
<author><name sortKey="Barr, K" uniqKey="Barr K">K Barr</name>
</author>
<author><name sortKey="Kim, Ar" uniqKey="Kim A">AR Kim</name>
</author>
<author><name sortKey="Reinitz, J" uniqKey="Reinitz J">J Reinitz</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Waterman, Ms" uniqKey="Waterman M">MS Waterman</name>
</author>
<author><name sortKey="Joyce, J" uniqKey="Joyce J">J Joyce</name>
</author>
<author><name sortKey="Eggert, M" uniqKey="Eggert M">M Eggert</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Altschul, Sf" uniqKey="Altschul S">SF Altschul</name>
</author>
<author><name sortKey="Gish, W" uniqKey="Gish W">W Gish</name>
</author>
<author><name sortKey="Miller, W" uniqKey="Miller W">W Miller</name>
</author>
<author><name sortKey="Myers, Ew" uniqKey="Myers E">EW Myers</name>
</author>
<author><name sortKey="Lipman, Dj" uniqKey="Lipman D">DJ Lipman</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Bairoch, A" uniqKey="Bairoch A">A Bairoch</name>
</author>
<author><name sortKey="Bucher, P" uniqKey="Bucher P">P Bucher</name>
</author>
<author><name sortKey="Hofmann, K" uniqKey="Hofmann K">K Hofmann</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Attwood, Tk" uniqKey="Attwood T">TK Attwood</name>
</author>
<author><name sortKey="Beck, Me" uniqKey="Beck M">ME Beck</name>
</author>
<author><name sortKey="Flower, Dr" uniqKey="Flower D">DR Flower</name>
</author>
<author><name sortKey="Scordis, P" uniqKey="Scordis P">P Scordis</name>
</author>
<author><name sortKey="Selley, Jn" uniqKey="Selley J">JN Selley</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Krogh, A" uniqKey="Krogh A">A Krogh</name>
</author>
<author><name sortKey="Brown, M" uniqKey="Brown M">M Brown</name>
</author>
<author><name sortKey="Mian, I" uniqKey="Mian I">I Mian</name>
</author>
<author><name sortKey="Sjolander, K" uniqKey="Sjolander K">K Sjolander</name>
</author>
<author><name sortKey="Haussler, D" uniqKey="Haussler D">D Haussler</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Eddy, Sr" uniqKey="Eddy S">SR Eddy</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Zellers, Rg" uniqKey="Zellers R">RG Zellers</name>
</author>
<author><name sortKey="Drewell, Ra" uniqKey="Drewell R">RA Drewell</name>
</author>
<author><name sortKey="Dresch, Jm" uniqKey="Dresch J">JM Dresch</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Dresch, Jm" uniqKey="Dresch J">JM Dresch</name>
</author>
<author><name sortKey="Zellers, Rg" uniqKey="Zellers R">RG Zellers</name>
</author>
<author><name sortKey="Bork, Dk" uniqKey="Bork D">DK Bork</name>
</author>
<author><name sortKey="Drewell, Ra" uniqKey="Drewell R">RA Drewell</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Stormo, G" uniqKey="Stormo G">G Stormo</name>
</author>
<author><name sortKey="Schneider, Td" uniqKey="Schneider T">TD Schneider</name>
</author>
<author><name sortKey="Gold, L" uniqKey="Gold L">L Gold</name>
</author>
<author><name sortKey="Ehrenfeucht, A" uniqKey="Ehrenfeucht A">A Ehrenfeucht</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Staden, R" uniqKey="Staden R">R Staden</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Berg, Og" uniqKey="Berg O">OG Berg</name>
</author>
<author><name sortKey="Von Hippel, Ph" uniqKey="Von Hippel P">PH von Hippel</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Bailey, Tl" uniqKey="Bailey T">TL Bailey</name>
</author>
<author><name sortKey="Gribskov, M" uniqKey="Gribskov M">M Gribskov</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Hertz, Gz" uniqKey="Hertz G">GZ Hertz</name>
</author>
<author><name sortKey="Stormo, Gd" uniqKey="Stormo G">GD Stormo</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Man, Tk" uniqKey="Man T">TK Man</name>
</author>
<author><name sortKey="Stormo, Gd" uniqKey="Stormo G">GD Stormo</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Benos, Pv" uniqKey="Benos P">PV Benos</name>
</author>
<author><name sortKey="Lapedes, As" uniqKey="Lapedes A">AS Lapedes</name>
</author>
<author><name sortKey="Stormo, Gd" uniqKey="Stormo G">GD Stormo</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Lassig, M" uniqKey="Lassig M">M Lassig</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Badis, G" uniqKey="Badis G">G Badis</name>
</author>
<author><name sortKey="Berger, Mf" uniqKey="Berger M">MF Berger</name>
</author>
<author><name sortKey="Philippakis, Aa" uniqKey="Philippakis A">AA Philippakis</name>
</author>
<author><name sortKey="Talukder, S" uniqKey="Talukder S">S Talukder</name>
</author>
<author><name sortKey="Gehrke, Ar" uniqKey="Gehrke A">AR Gehrke</name>
</author>
<author><name sortKey="Jaeger, Sa" uniqKey="Jaeger S">SA Jaeger</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Siddharthan, R" uniqKey="Siddharthan R">R Siddharthan</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Annala, M" uniqKey="Annala M">M Annala</name>
</author>
<author><name sortKey="Laurila, K" uniqKey="Laurila K">K Laurila</name>
</author>
<author><name sortKey="Lahdesmaki, H" uniqKey="Lahdesmaki H">H Lahdesmaki</name>
</author>
<author><name sortKey="Nykter, M" uniqKey="Nykter M">M Nykter</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Cristianini, N" uniqKey="Cristianini N">N Cristianini</name>
</author>
<author><name sortKey="Shawe Taylor, J" uniqKey="Shawe Taylor J">J Shawe-Taylor</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Vapnik, V" uniqKey="Vapnik V">V Vapnik</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Jaakkola, T" uniqKey="Jaakkola T">T Jaakkola</name>
</author>
<author><name sortKey="Diekhans, M" uniqKey="Diekhans M">M Diekhans</name>
</author>
<author><name sortKey="Haussler, D" uniqKey="Haussler D">D Haussler</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Murzin, Ag" uniqKey="Murzin A">AG Murzin</name>
</author>
<author><name sortKey="Brenner, Se" uniqKey="Brenner S">SE Brenner</name>
</author>
<author><name sortKey="Hubbard, T" uniqKey="Hubbard T">T Hubbard</name>
</author>
<author><name sortKey="Chothia, C" uniqKey="Chothia C">C Chothia</name>
</author>
</analytic>
</biblStruct>
<biblStruct></biblStruct>
<biblStruct><analytic><author><name sortKey="Leslie, C" uniqKey="Leslie C">C Leslie</name>
</author>
<author><name sortKey="Eskin, E" uniqKey="Eskin E">E Eskin</name>
</author>
<author><name sortKey="Noble, Ws" uniqKey="Noble W">WS Noble</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Leslie, C" uniqKey="Leslie C">C Leslie</name>
</author>
<author><name sortKey="Eskin, E" uniqKey="Eskin E">E Eskin</name>
</author>
<author><name sortKey="Weston, J" uniqKey="Weston J">J Weston</name>
</author>
<author><name sortKey="Noble, Ws" uniqKey="Noble W">WS Noble</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Leslie, C" uniqKey="Leslie C">C Leslie</name>
</author>
<author><name sortKey="Eskin, E" uniqKey="Eskin E">E Eskin</name>
</author>
<author><name sortKey="Weston, J" uniqKey="Weston J">J Weston</name>
</author>
<author><name sortKey="Noble, Ws" uniqKey="Noble W">WS Noble</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Mathelier, A" uniqKey="Mathelier A">A Mathelier</name>
</author>
<author><name sortKey="Wasserman, Ww" uniqKey="Wasserman W">WW Wasserman</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Magbanua, Jp" uniqKey="Magbanua J">JP Magbanua</name>
</author>
<author><name sortKey="Runneburger, E" uniqKey="Runneburger E">E Runneburger</name>
</author>
<author><name sortKey="Russell, S" uniqKey="Russell S">S Russell</name>
</author>
<author><name sortKey="White, R" uniqKey="White R">R White</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Ghandi, M" uniqKey="Ghandi M">M Ghandi</name>
</author>
<author><name sortKey="Lee, D" uniqKey="Lee D">D Lee</name>
</author>
<author><name sortKey="Mohammad Noori, M" uniqKey="Mohammad Noori M">M Mohammad-Noori</name>
</author>
<author><name sortKey="Beer, Ma" uniqKey="Beer M">MA Beer</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Liu, B" uniqKey="Liu B">B Liu</name>
</author>
<author><name sortKey="Liu, F" uniqKey="Liu F">F Liu</name>
</author>
<author><name sortKey="Fang, L" uniqKey="Fang L">L Fang</name>
</author>
<author><name sortKey="Wang, X" uniqKey="Wang X">X Wang</name>
</author>
<author><name sortKey="Chou, Kc" uniqKey="Chou K">KC Chou</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Lee, D" uniqKey="Lee D">D Lee</name>
</author>
<author><name sortKey="Karchin, R" uniqKey="Karchin R">R Karchin</name>
</author>
<author><name sortKey="Beer, M" uniqKey="Beer M">M Beer</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Erwin, Gd" uniqKey="Erwin G">GD Erwin</name>
</author>
<author><name sortKey="Oksenberg, N" uniqKey="Oksenberg N">N Oksenberg</name>
</author>
<author><name sortKey="Truty, Rm" uniqKey="Truty R">RM Truty</name>
</author>
<author><name sortKey="Kostka, D" uniqKey="Kostka D">D Kostka</name>
</author>
<author><name sortKey="Murphy, Kk" uniqKey="Murphy K">KK Murphy</name>
</author>
<author><name sortKey="Ahituv, N" uniqKey="Ahituv N">N Ahituv</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Liu, B" uniqKey="Liu B">B Liu</name>
</author>
<author><name sortKey="Fang, L" uniqKey="Fang L">L Fang</name>
</author>
<author><name sortKey="Wang, S" uniqKey="Wang S">S Wang</name>
</author>
<author><name sortKey="Wang, X" uniqKey="Wang X">X Wang</name>
</author>
<author><name sortKey="Li, H" uniqKey="Li H">H Li</name>
</author>
<author><name sortKey="Chou, Kc" uniqKey="Chou K">KC Chou</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Liu, B" uniqKey="Liu B">B Liu</name>
</author>
<author><name sortKey="Liu, F" uniqKey="Liu F">F Liu</name>
</author>
<author><name sortKey="Wang, X" uniqKey="Wang X">X Wang</name>
</author>
<author><name sortKey="Chen, J" uniqKey="Chen J">J Chen</name>
</author>
<author><name sortKey="Fang, L" uniqKey="Fang L">L Fang</name>
</author>
<author><name sortKey="Chou, Kc" uniqKey="Chou K">KC Chou</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Liu, B" uniqKey="Liu B">B Liu</name>
</author>
<author><name sortKey="Xu, J" uniqKey="Xu J">J Xu</name>
</author>
<author><name sortKey="Lan, X" uniqKey="Lan X">X Lan</name>
</author>
<author><name sortKey="Xu, R" uniqKey="Xu R">R Xu</name>
</author>
<author><name sortKey="Zhou, J" uniqKey="Zhou J">J Zhou</name>
</author>
<author><name sortKey="Wang, X" uniqKey="Wang X">X Wang</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Zhang, H" uniqKey="Zhang H">H Zhang</name>
</author>
<author><name sortKey="Zhu, L" uniqKey="Zhu L">L Zhu</name>
</author>
<author><name sortKey="Huang, Ds" uniqKey="Huang D">DS Huang</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Zhu, L" uniqKey="Zhu L">L Zhu</name>
</author>
<author><name sortKey="Zhang, H" uniqKey="Zhang H">H Zhang</name>
</author>
<author><name sortKey="Huang, Ds" uniqKey="Huang D">DS Huang</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Stringham, Jl" uniqKey="Stringham J">JL Stringham</name>
</author>
<author><name sortKey="Brown, As" uniqKey="Brown A">AS Brown</name>
</author>
<author><name sortKey="Drewell, Ra" uniqKey="Drewell R">RA Drewell</name>
</author>
<author><name sortKey="Dresch, Jm" uniqKey="Dresch J">JM Dresch</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Gallo, S" uniqKey="Gallo S">S Gallo</name>
</author>
<author><name sortKey="Gerrard, D" uniqKey="Gerrard D">D Gerrard</name>
</author>
<author><name sortKey="Miner, D" uniqKey="Miner D">D Miner</name>
</author>
<author><name sortKey="Simich, M" uniqKey="Simich M">M Simich</name>
</author>
<author><name sortKey="Des Soye, B" uniqKey="Des Soye B">B Des Soye</name>
</author>
<author><name sortKey="Bergman, C" uniqKey="Bergman C">C Bergman</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Batista, Geapa" uniqKey="Batista G">GEAPA Batista</name>
</author>
<author><name sortKey="Prati, Rc" uniqKey="Prati R">RC Prati</name>
</author>
<author><name sortKey="Monard, Mc" uniqKey="Monard M">MC Monard</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Chawla, Nv" uniqKey="Chawla N">NV Chawla</name>
</author>
<author><name sortKey="Bowyer, Kw" uniqKey="Bowyer K">KW Bowyer</name>
</author>
<author><name sortKey="Hall, Lo" uniqKey="Hall L">LO Hall</name>
</author>
<author><name sortKey="Kegelmeyer, Wp" uniqKey="Kegelmeyer W">WP Kegelmeyer</name>
</author>
</analytic>
</biblStruct>
<biblStruct></biblStruct>
<biblStruct><analytic><author><name sortKey="Brown, Mp" uniqKey="Brown M">MP Brown</name>
</author>
<author><name sortKey="Grundy, Wn" uniqKey="Grundy W">WN Grundy</name>
</author>
<author><name sortKey="Lin, D" uniqKey="Lin D">D Lin</name>
</author>
<author><name sortKey="Cristianini, N" uniqKey="Cristianini N">N Cristianini</name>
</author>
<author><name sortKey="Sugnet, Cw" uniqKey="Sugnet C">CW Sugnet</name>
</author>
<author><name sortKey="Furey, Ts" uniqKey="Furey T">TS Furey</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Joachims, T" uniqKey="Joachims T">T Joachims</name>
</author>
</analytic>
</biblStruct>
<biblStruct></biblStruct>
<biblStruct></biblStruct>
<biblStruct></biblStruct>
<biblStruct></biblStruct>
<biblStruct><analytic><author><name sortKey="Bailey, Tl" uniqKey="Bailey T">TL Bailey</name>
</author>
<author><name sortKey="Elkan, C" uniqKey="Elkan C">C Elkan</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Wasserman, Ww" uniqKey="Wasserman W">WW Wasserman</name>
</author>
<author><name sortKey="Sandelin, A" uniqKey="Sandelin A">A Sandelin</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Turatsinze, J" uniqKey="Turatsinze J">J Turatsinze</name>
</author>
<author><name sortKey="Thomas Chollier, M" uniqKey="Thomas Chollier M">M Thomas-Chollier</name>
</author>
<author><name sortKey="Defrance, M" uniqKey="Defrance M">M Defrance</name>
</author>
<author><name sortKey="Van Helden, J" uniqKey="Van Helden J">J van Helden</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Hardison, R" uniqKey="Hardison R">R Hardison</name>
</author>
<author><name sortKey="Taylor, J" uniqKey="Taylor J">J Taylor</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Weirauch, Mt" uniqKey="Weirauch M">MT Weirauch</name>
</author>
<author><name sortKey="Cote, A" uniqKey="Cote A">A Cote</name>
</author>
<author><name sortKey="Norel, R" uniqKey="Norel R">R Norel</name>
</author>
<author><name sortKey="Annala, M" uniqKey="Annala M">M Annala</name>
</author>
<author><name sortKey="Zhao, Y" uniqKey="Zhao Y">Y Zhao</name>
</author>
<author><name sortKey="Riley, Tr" uniqKey="Riley T">TR Riley</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Mathelier, A" uniqKey="Mathelier A">A Mathelier</name>
</author>
<author><name sortKey="Zhao, X" uniqKey="Zhao X">X Zhao</name>
</author>
<author><name sortKey="Zhang, A" uniqKey="Zhang A">A Zhang</name>
</author>
<author><name sortKey="Parcy, F" uniqKey="Parcy F">F Parcy</name>
</author>
<author><name sortKey="Worsley Hunt, R" uniqKey="Worsley Hunt R">R Worsley-Hunt</name>
</author>
<author><name sortKey="Arenillas, D" uniqKey="Arenillas D">D Arenillas</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Gupta, S" uniqKey="Gupta S">S Gupta</name>
</author>
<author><name sortKey="Stamatoyannopoulos, J" uniqKey="Stamatoyannopoulos J">J Stamatoyannopoulos</name>
</author>
<author><name sortKey="Bailey, T" uniqKey="Bailey T">T Bailey</name>
</author>
<author><name sortKey="Noble, W" uniqKey="Noble W">W Noble</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Mcquilton, P" uniqKey="Mcquilton P">P McQuilton</name>
</author>
<author><name sortKey="Pierre, Ses" uniqKey="Pierre S">SES Pierre</name>
</author>
<author><name sortKey="Thurmond, J" uniqKey="Thurmond J">J Thurmond</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Noyes, Mb" uniqKey="Noyes M">MB Noyes</name>
</author>
<author><name sortKey="Meng, X" uniqKey="Meng X">X Meng</name>
</author>
<author><name sortKey="Wakabayashi, A" uniqKey="Wakabayashi A">A Wakabayashi</name>
</author>
<author><name sortKey="Sinha, S" uniqKey="Sinha S">S Sinha</name>
</author>
<author><name sortKey="Brodsky, Mh" uniqKey="Brodsky M">MH Brodsky</name>
</author>
<author><name sortKey="Wolfe, Sa" uniqKey="Wolfe S">SA Wolfe</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Desplan, C" uniqKey="Desplan C">C Desplan</name>
</author>
<author><name sortKey="Theis, J" uniqKey="Theis J">J Theis</name>
</author>
<author><name sortKey="O Arrell, P" uniqKey="O Arrell P">P O’Farrell</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Gehring, Wj" uniqKey="Gehring W">WJ Gehring</name>
</author>
<author><name sortKey="Qian, Yq" uniqKey="Qian Y">YQ Qian</name>
</author>
<author><name sortKey="Billeter, M" uniqKey="Billeter M">M Billeter</name>
</author>
<author><name sortKey="Furukubo Tokunaga, K" uniqKey="Furukubo Tokunaga K">K Furukubo-Tokunaga</name>
</author>
<author><name sortKey="Schier, Af" uniqKey="Schier A">AF Schier</name>
</author>
<author><name sortKey="Resendez Perez, D" uniqKey="Resendez Perez D">D Resendez-Perez</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Baird Titus, J" uniqKey="Baird Titus J">J Baird-Titus</name>
</author>
<author><name sortKey="Clark Baldwin, K" uniqKey="Clark Baldwin K">K Clark-Baldwin</name>
</author>
<author><name sortKey="Dave, V" uniqKey="Dave V">V Dave</name>
</author>
<author><name sortKey="Caperelli, C" uniqKey="Caperelli C">C Caperelli</name>
</author>
<author><name sortKey="Ma, J" uniqKey="Ma J">J Ma</name>
</author>
<author><name sortKey="Rance, M" uniqKey="Rance M">M Rance</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Zhou, T" uniqKey="Zhou T">T Zhou</name>
</author>
<author><name sortKey="Yang, L" uniqKey="Yang L">L Yang</name>
</author>
<author><name sortKey="Lu, Y" uniqKey="Lu Y">Y Lu</name>
</author>
<author><name sortKey="Dror, I" uniqKey="Dror I">I Dror</name>
</author>
<author><name sortKey="Dantas Machado, A" uniqKey="Dantas Machado A">A Dantas Machado</name>
</author>
<author><name sortKey="Ghane, T" uniqKey="Ghane T">T Ghane</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Yang, L" uniqKey="Yang L">L Yang</name>
</author>
<author><name sortKey="Zhou, T" uniqKey="Zhou T">T Zhou</name>
</author>
<author><name sortKey="Dror, I" uniqKey="Dror I">I Dror</name>
</author>
<author><name sortKey="Mathelier, A" uniqKey="Mathelier A">A Mathelier</name>
</author>
<author><name sortKey="Wasserman, Ww" uniqKey="Wasserman W">WW Wasserman</name>
</author>
<author><name sortKey="Gordan, R" uniqKey="Gordan R">R Gordan</name>
</author>
</analytic>
</biblStruct>
</listBibl>
</div1>
</back>
</TEI>
<pmc article-type="research-article"><pmc-dir>properties open_access</pmc-dir>
<front><journal-meta><journal-id journal-id-type="nlm-ta">PLoS One</journal-id>
<journal-id journal-id-type="iso-abbrev">PLoS ONE</journal-id>
<journal-id journal-id-type="publisher-id">plos</journal-id>
<journal-id journal-id-type="pmc">plosone</journal-id>
<journal-title-group><journal-title>PLoS ONE</journal-title>
</journal-title-group>
<issn pub-type="epub">1932-6203</issn>
<publisher><publisher-name>Public Library of Science</publisher-name>
<publisher-loc>San Francisco, CA USA</publisher-loc>
</publisher>
</journal-meta>
<article-meta><article-id pub-id-type="pmid">28982128</article-id>
<article-id pub-id-type="pmc">5628859</article-id>
<article-id pub-id-type="publisher-id">PONE-D-17-26105</article-id>
<article-id pub-id-type="doi">10.1371/journal.pone.0185570</article-id>
<article-categories><subj-group subj-group-type="heading"><subject>Research Article</subject>
</subj-group>
<subj-group subj-group-type="Discipline-v3"><subject>Research and Analysis Methods</subject>
<subj-group><subject>Database and Informatics Methods</subject>
<subj-group><subject>Bioinformatics</subject>
<subj-group><subject>Sequence Analysis</subject>
<subj-group><subject>Sequence Motif Analysis</subject>
</subj-group>
</subj-group>
</subj-group>
</subj-group>
</subj-group>
<subj-group subj-group-type="Discipline-v3"><subject>Research and Analysis Methods</subject>
<subj-group><subject>Database and Informatics Methods</subject>
<subj-group><subject>Biological Databases</subject>
<subj-group><subject>Sequence Databases</subject>
</subj-group>
</subj-group>
</subj-group>
</subj-group>
<subj-group subj-group-type="Discipline-v3"><subject>Research and Analysis Methods</subject>
<subj-group><subject>Database and Informatics Methods</subject>
<subj-group><subject>Bioinformatics</subject>
<subj-group><subject>Sequence Analysis</subject>
<subj-group><subject>Sequence Databases</subject>
</subj-group>
</subj-group>
</subj-group>
</subj-group>
</subj-group>
<subj-group subj-group-type="Discipline-v3"><subject>Computer and Information Sciences</subject>
<subj-group><subject>Artificial Intelligence</subject>
<subj-group><subject>Machine Learning</subject>
<subj-group><subject>Support Vector Machines</subject>
</subj-group>
</subj-group>
</subj-group>
</subj-group>
<subj-group subj-group-type="Discipline-v3"><subject>Biology and Life Sciences</subject>
<subj-group><subject>Molecular Biology</subject>
<subj-group><subject>Molecular Biology Techniques</subject>
<subj-group><subject>Sequencing Techniques</subject>
<subj-group><subject>Nucleotide Sequencing</subject>
</subj-group>
</subj-group>
</subj-group>
</subj-group>
</subj-group>
<subj-group subj-group-type="Discipline-v3"><subject>Research and Analysis Methods</subject>
<subj-group><subject>Molecular Biology Techniques</subject>
<subj-group><subject>Sequencing Techniques</subject>
<subj-group><subject>Nucleotide Sequencing</subject>
</subj-group>
</subj-group>
</subj-group>
</subj-group>
<subj-group subj-group-type="Discipline-v3"><subject>Research and Analysis Methods</subject>
<subj-group><subject>Database and Informatics Methods</subject>
<subj-group><subject>Bioinformatics</subject>
<subj-group><subject>Sequence Analysis</subject>
</subj-group>
</subj-group>
</subj-group>
</subj-group>
<subj-group subj-group-type="Discipline-v3"><subject>Biology and Life Sciences</subject>
<subj-group><subject>Genetics</subject>
<subj-group><subject>Gene Expression</subject>
<subj-group><subject>Gene Regulation</subject>
</subj-group>
</subj-group>
</subj-group>
</subj-group>
<subj-group subj-group-type="Discipline-v3"><subject>Biology and Life Sciences</subject>
<subj-group><subject>Genetics</subject>
<subj-group><subject>Gene Expression</subject>
</subj-group>
</subj-group>
</subj-group>
<subj-group subj-group-type="Discipline-v3"><subject>Biology and Life Sciences</subject>
<subj-group><subject>Biochemistry</subject>
<subj-group><subject>Proteins</subject>
<subj-group><subject>DNA-binding proteins</subject>
<subj-group><subject>Transcription Factors</subject>
</subj-group>
</subj-group>
</subj-group>
</subj-group>
</subj-group>
<subj-group subj-group-type="Discipline-v3"><subject>Biology and Life Sciences</subject>
<subj-group><subject>Genetics</subject>
<subj-group><subject>Gene Expression</subject>
<subj-group><subject>Gene Regulation</subject>
<subj-group><subject>Transcription Factors</subject>
</subj-group>
</subj-group>
</subj-group>
</subj-group>
</subj-group>
<subj-group subj-group-type="Discipline-v3"><subject>Biology and Life Sciences</subject>
<subj-group><subject>Biochemistry</subject>
<subj-group><subject>Proteins</subject>
<subj-group><subject>Regulatory Proteins</subject>
<subj-group><subject>Transcription Factors</subject>
</subj-group>
</subj-group>
</subj-group>
</subj-group>
</subj-group>
</article-categories>
<title-group><article-title>The folded <italic>k</italic>-spectrum kernel: A machine learning approach to detecting transcription factor binding sites with gapped nucleotide dependencies</article-title>
<alt-title alt-title-type="running-head">The folded <italic>k</italic>-spectrum kernel for detecting transcription factor binding sites</alt-title>
</title-group>
<contrib-group><contrib contrib-type="author"><name><surname>Elmas</surname>
<given-names>Abdulkadir</given-names>
</name>
<role content-type="http://credit.casrai.org/">Conceptualization</role>
<role content-type="http://credit.casrai.org/">Data curation</role>
<role content-type="http://credit.casrai.org/">Formal analysis</role>
<role content-type="http://credit.casrai.org/">Methodology</role>
<role content-type="http://credit.casrai.org/">Resources</role>
<role content-type="http://credit.casrai.org/">Software</role>
<role content-type="http://credit.casrai.org/">Validation</role>
<role content-type="http://credit.casrai.org/">Visualization</role>
<role content-type="http://credit.casrai.org/">Writing – original draft</role>
<role content-type="http://credit.casrai.org/">Writing – review &amp; editing</role>
<xref ref-type="aff" rid="aff001"><sup>1</sup>
</xref>
</contrib>
<contrib contrib-type="author"><name><surname>Wang</surname>
<given-names>Xiaodong</given-names>
</name>
<role content-type="http://credit.casrai.org/">Conceptualization</role>
<role content-type="http://credit.casrai.org/">Investigation</role>
<role content-type="http://credit.casrai.org/">Project administration</role>
<role content-type="http://credit.casrai.org/">Supervision</role>
<role content-type="http://credit.casrai.org/">Validation</role>
<role content-type="http://credit.casrai.org/">Writing – original draft</role>
<role content-type="http://credit.casrai.org/">Writing – review &amp; editing</role>
<xref ref-type="aff" rid="aff001"><sup>1</sup>
</xref>
</contrib>
<contrib contrib-type="author"><contrib-id authenticated="true" contrib-id-type="orcid">http://orcid.org/0000-0001-7626-4959</contrib-id>
<name><surname>Dresch</surname>
<given-names>Jacqueline M.</given-names>
</name>
<role content-type="http://credit.casrai.org/">Conceptualization</role>
<role content-type="http://credit.casrai.org/">Funding acquisition</role>
<role content-type="http://credit.casrai.org/">Investigation</role>
<role content-type="http://credit.casrai.org/">Methodology</role>
<role content-type="http://credit.casrai.org/">Project administration</role>
<role content-type="http://credit.casrai.org/">Supervision</role>
<role content-type="http://credit.casrai.org/">Validation</role>
<role content-type="http://credit.casrai.org/">Writing – original draft</role>
<role content-type="http://credit.casrai.org/">Writing – review &amp; editing</role>
<xref ref-type="aff" rid="aff002"><sup>2</sup>
</xref>
<xref ref-type="corresp" rid="cor001">*</xref>
</contrib>
</contrib-group>
<aff id="aff001"><label>1</label>
<addr-line>Department of Electrical Engineering, Columbia University, New York, NY, United States of America</addr-line>
</aff>
<aff id="aff002"><label>2</label>
<addr-line>Department of Mathematics and Computer Science, Clark University, Worcester, MA, United States of America</addr-line>
</aff>
<contrib-group><contrib contrib-type="editor"><name><surname>Liu</surname>
<given-names>Bin</given-names>
</name>
<role>Editor</role>
<xref ref-type="aff" rid="edit1"></xref>
</contrib>
</contrib-group>
<aff id="edit1"><addr-line>Harbin Institute of Technology Shenzhen Graduate School, CHINA</addr-line>
</aff>
<author-notes><fn fn-type="COI-statement" id="coi001"><p><bold>Competing Interests: </bold>
The authors have declared that no competing interests exist.</p>
</fn>
<corresp id="cor001">* E-mail: <email>jdresch@clarku.edu</email>
</corresp>
</author-notes>
<pub-date pub-type="collection"><year>2017</year>
</pub-date>
<pub-date pub-type="epub"><day>5</day>
<month>10</month>
<year>2017</year>
</pub-date>
<volume>12</volume>
<issue>10</issue>
<elocation-id>e0185570</elocation-id>
<history><date date-type="received"><day>11</day>
<month>7</month>
<year>2017</year>
</date>
<date date-type="accepted"><day>14</day>
<month>9</month>
<year>2017</year>
</date>
</history>
<permissions><copyright-statement>© 2017 Elmas et al</copyright-statement>
<copyright-year>2017</copyright-year>
<copyright-holder>Elmas et al</copyright-holder>
<license xlink:href="http://creativecommons.org/licenses/by/4.0/"><license-p>This is an open access article distributed under the terms of the <ext-link ext-link-type="uri" xlink:href="http://creativecommons.org/licenses/by/4.0/">Creative Commons Attribution License</ext-link>, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.</license-p>
</license>
</permissions>
<self-uri content-type="pdf" xlink:href="pone.0185570.pdf"></self-uri>
<abstract><p>Understanding the molecular machinery involved in transcriptional regulation is central to improving our knowledge of an organism’s development, disease, and evolution. The building blocks of this complex molecular machinery are an organism’s genomic DNA sequence and transcription factor proteins. Despite the vast amount of sequence data now available for many model organisms, predicting where transcription factors bind, often referred to as ‘motif detection’, is still incredibly challenging. In this study, we develop a novel bioinformatic approach to binding site prediction. We do this by extending pre-existing SVM approaches in an unbiased way to include all possible gapped <italic>k</italic>-mers, representing different combinations of complex nucleotide dependencies within binding sites. We show the advantages of this new approach when compared to existing SVM approaches, through a rigorous set of cross-validation experiments. We also demonstrate the effectiveness of our new approach by reporting on its improved performance on a set of 127 genomic regions known to regulate gene expression along the anterior-posterior axis in early <italic>Drosophila</italic> embryos.</p>
</abstract>
<funding-group><award-group id="award001"><funding-source><institution>National Institutes of Health (US)</institution>
</funding-source>
<award-id>GM110571</award-id>
<principal-award-recipient><contrib-id authenticated="true" contrib-id-type="orcid">http://orcid.org/0000-0001-7626-4959</contrib-id>
<name><surname>Dresch</surname>
<given-names>Jacqueline M.</given-names>
</name>
</principal-award-recipient>
</award-group>
<funding-statement>This work has been supported by a National Institutes of Health (GM110571) grant (<ext-link ext-link-type="uri" xlink:href="https://www.nih.gov">https://www.nih.gov</ext-link>) to JMD. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.</funding-statement>
</funding-group>
<counts><fig-count count="6"></fig-count>
<table-count count="0"></table-count>
<page-count count="22"></page-count>
</counts>
<custom-meta-group><custom-meta id="data-availability"><meta-name>Data Availability</meta-name>
<meta-value>All relevant data are within the paper and its Supporting Information files.</meta-value>
</custom-meta>
</custom-meta-group>
</article-meta>
<notes><title>Data Availability</title>
<p>All relevant data are within the paper and its Supporting Information files.</p>
</notes>
</front>
</pmc>
</record>
To manipulate this document under Unix (Dilib)
EXPLOR_STEP=$WICRI_ROOT/Sante/explor/MersV1/Data/Pmc/Curation
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 001035 | SxmlIndent | more
Or
HfdSelect -h $EXPLOR_AREA/Data/Pmc/Curation/biblio.hfd -nk 001035 | SxmlIndent | more
To add a link to this page within the Wicri network
{{Explor lien |wiki= Sante |area= MersV1 |flux= Pmc |étape= Curation |type= RBID |clé= PMC:5628859 |texte= The folded k-spectrum kernel: A machine learning approach to detecting transcription factor binding sites with gapped nucleotide dependencies }}
To generate wiki pages
HfdIndexSelect -h $EXPLOR_AREA/Data/Pmc/Curation/RBID.i -Sk "pubmed:28982128" \
  | HfdSelect -Kh $EXPLOR_AREA/Data/Pmc/Curation/biblio.hfd \
  | NlmPubMed2Wicri -a MersV1
This area was generated with Dilib version V0.6.33.