MERS exploration server

Warning: this site is under development!
Warning: this site was generated automatically from raw corpora.
The information has therefore not been validated.

Learning “graph-mer” Motifs that Predict Gene Expression Trajectories in Development

Internal identifier: 001356 (Pmc/Checkpoint); previous: 001355; next: 001357

Authors: Xuejing Li [United States]; Casandra Panea [United States]; Chris H. Wiggins [United States]; Valerie Reinke [United States]; Christina Leslie [United States]

Source: PLoS Computational Biology, 2010, volume 6, issue 4, e1000761

RBID : PMC:2861633

Abstract

A key problem in understanding transcriptional regulatory networks is deciphering what cis regulatory logic is encoded in gene promoter sequences and how this sequence information maps to expression. A typical computational approach to this problem involves clustering genes by their expression profiles and then searching for overrepresented motifs in the promoter sequences of genes in a cluster. However, genes with similar expression profiles may be controlled by distinct regulatory programs. Moreover, if many gene expression profiles in a data set are highly correlated, as in the case of whole organism developmental time series, it may be difficult to resolve fine-grained clusters in the first place. We present a predictive framework for modeling the natural flow of information, from promoter sequence to expression, to learn cis regulatory motifs and characterize gene expression patterns in developmental time courses. We introduce a cluster-free algorithm based on a graph-regularized version of partial least squares (PLS) regression to learn sequence patterns—represented by graphs of k-mers, or “graph-mers”—that predict gene expression trajectories. Applying the approach to wildtype germline development in Caenorhabditis elegans, we found that the first and second latent PLS factors mapped to expression profiles for oocyte and sperm genes, respectively. We extracted both known and novel motifs from the graph-mers associated to these germline-specific patterns, including novel CG-rich motifs specific to oocyte genes. We found evidence supporting the functional relevance of these putative regulatory elements through analysis of positional bias, motif conservation and in situ gene expression. This study demonstrates that our regression model can learn biologically meaningful latent structure and identify potentially functional motifs from subtle developmental time course expression data.
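The regression idea in the abstract, mapping promoter k-mer content to expression trajectories through latent PLS factors, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the toy promoters and expression values are invented, the function names are hypothetical, and the code uses plain PLS (the first factor is taken from the leading singular vectors of the centred cross-product matrix) with no graph regularization, so it is not the authors' algorithm.

```python
from itertools import product

import numpy as np


def kmer_counts(seqs, k):
    """Count every DNA k-mer in each sequence (rows: sequences, columns: k-mers)."""
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    index = {km: j for j, km in enumerate(kmers)}
    X = np.zeros((len(seqs), len(kmers)))
    for i, s in enumerate(seqs):
        for start in range(len(s) - k + 1):
            j = index.get(s[start:start + k])
            if j is not None:  # skip windows containing non-ACGT characters
                X[i, j] += 1
    return X, kmers


def first_pls_factor(X, Y):
    """First PLS factor: leading singular vectors of the centred matrix Xc^T Yc."""
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    U, _, Vt = np.linalg.svd(Xc.T @ Yc, full_matrices=False)
    w, c = U[:, 0], Vt[0]   # k-mer weights and time-course loadings
    t = Xc @ w              # latent score of each gene on this factor
    return w, c, t


# Toy data, invented for illustration: two CG-rich and two AT-rich promoters,
# with rising and falling expression over three time points respectively.
promoters = ["ACGCGCGT", "TTCGCGAA", "ATATATAT", "TTTTAAAA"]
expression = np.array([[0.0, 1.0, 2.0],
                       [0.0, 1.0, 2.0],
                       [2.0, 1.0, 0.0],
                       [2.0, 1.0, 0.0]])

X, kmers = kmer_counts(promoters, k=2)
w, c, t = first_pls_factor(X, expression)
top = kmers[int(np.argmax(np.abs(w)))]
print("highest-weight k-mer:", top)
print("gene scores on factor 1:", np.round(t, 3))
```

The sign of a singular vector is arbitrary, so only the relative signs of the gene scores are meaningful. A graph-regularized variant would add a penalty that encourages sequence-similar k-mers to receive similar weights; the exact form used in the paper is not given in this record and is not reproduced here.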


Url: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2861633
DOI: 10.1371/journal.pcbi.1000761
PubMed: 20454681
PubMed Central: 2861633


The document in XML format

<record>
<TEI>
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en">Learning “graph-mer” Motifs that Predict Gene Expression Trajectories in Development</title>
<author>
<name sortKey="Li, Xuejing" sort="Li, Xuejing" uniqKey="Li X" first="Xuejing" last="Li">Xuejing Li</name>
<affiliation wicri:level="4">
<nlm:aff id="aff1">
<addr-line>Department of Physics, Columbia University, New York, New York, United States of America</addr-line>
</nlm:aff>
<country xml:lang="fr">États-Unis</country>
<wicri:regionArea>Department of Physics, Columbia University, New York, New York</wicri:regionArea>
<placeName>
<region type="state">État de New York</region>
<settlement type="city">New York</settlement>
</placeName>
<orgName type="university">Université Columbia</orgName>
</affiliation>
</author>
<author>
<name sortKey="Panea, Casandra" sort="Panea, Casandra" uniqKey="Panea C" first="Casandra" last="Panea">Casandra Panea</name>
<affiliation wicri:level="2">
<nlm:aff id="aff2">
<addr-line>Department of Genetics, Yale University, New Haven, Connecticut, United States of America</addr-line>
</nlm:aff>
<country xml:lang="fr">États-Unis</country>
<wicri:regionArea>Department of Genetics, Yale University, New Haven, Connecticut</wicri:regionArea>
<placeName>
<region type="state">Connecticut</region>
</placeName>
</affiliation>
</author>
<author>
<name sortKey="Wiggins, Chris H" sort="Wiggins, Chris H" uniqKey="Wiggins C" first="Chris H." last="Wiggins">Chris H. Wiggins</name>
<affiliation wicri:level="4">
<nlm:aff id="aff3">
<addr-line>Department of Applied Physics and Applied Mathematics, Columbia University, New York, New York, United States of America</addr-line>
</nlm:aff>
<country xml:lang="fr">États-Unis</country>
<wicri:regionArea>Department of Applied Physics and Applied Mathematics, Columbia University, New York, New York</wicri:regionArea>
<placeName>
<region type="state">État de New York</region>
<settlement type="city">New York</settlement>
</placeName>
<orgName type="university">Université Columbia</orgName>
</affiliation>
</author>
<author>
<name sortKey="Reinke, Valerie" sort="Reinke, Valerie" uniqKey="Reinke V" first="Valerie" last="Reinke">Valerie Reinke</name>
<affiliation wicri:level="2">
<nlm:aff id="aff2">
<addr-line>Department of Genetics, Yale University, New Haven, Connecticut, United States of America</addr-line>
</nlm:aff>
<country xml:lang="fr">États-Unis</country>
<wicri:regionArea>Department of Genetics, Yale University, New Haven, Connecticut</wicri:regionArea>
<placeName>
<region type="state">Connecticut</region>
</placeName>
</affiliation>
</author>
<author>
<name sortKey="Leslie, Christina" sort="Leslie, Christina" uniqKey="Leslie C" first="Christina" last="Leslie">Christina Leslie</name>
<affiliation wicri:level="2">
<nlm:aff id="aff4">
<addr-line>Computational Biology Program, Sloan-Kettering Institute, New York, New York, United States of America</addr-line>
</nlm:aff>
<country xml:lang="fr">États-Unis</country>
<wicri:regionArea>Computational Biology Program, Sloan-Kettering Institute, New York, New York</wicri:regionArea>
<placeName>
<region type="state">État de New York</region>
</placeName>
</affiliation>
</author>
</titleStmt>
<publicationStmt>
<idno type="wicri:source">PMC</idno>
<idno type="pmid">20454681</idno>
<idno type="pmc">2861633</idno>
<idno type="url">http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2861633</idno>
<idno type="RBID">PMC:2861633</idno>
<idno type="doi">10.1371/journal.pcbi.1000761</idno>
<date when="2010">2010</date>
<idno type="wicri:Area/Pmc/Corpus">000F91</idno>
<idno type="wicri:explorRef" wicri:stream="Pmc" wicri:step="Corpus" wicri:corpus="PMC">000F91</idno>
<idno type="wicri:Area/Pmc/Curation">000F91</idno>
<idno type="wicri:explorRef" wicri:stream="Pmc" wicri:step="Curation">000F91</idno>
<idno type="wicri:Area/Pmc/Checkpoint">001356</idno>
<idno type="wicri:explorRef" wicri:stream="Pmc" wicri:step="Checkpoint">001356</idno>
</publicationStmt>
<sourceDesc>
<biblStruct>
<analytic>
<title xml:lang="en" level="a" type="main">Learning “graph-mer” Motifs that Predict Gene Expression Trajectories in Development</title>
<author>
<name sortKey="Li, Xuejing" sort="Li, Xuejing" uniqKey="Li X" first="Xuejing" last="Li">Xuejing Li</name>
<affiliation wicri:level="4">
<nlm:aff id="aff1">
<addr-line>Department of Physics, Columbia University, New York, New York, United States of America</addr-line>
</nlm:aff>
<country xml:lang="fr">États-Unis</country>
<wicri:regionArea>Department of Physics, Columbia University, New York, New York</wicri:regionArea>
<placeName>
<region type="state">État de New York</region>
<settlement type="city">New York</settlement>
</placeName>
<orgName type="university">Université Columbia</orgName>
</affiliation>
</author>
<author>
<name sortKey="Panea, Casandra" sort="Panea, Casandra" uniqKey="Panea C" first="Casandra" last="Panea">Casandra Panea</name>
<affiliation wicri:level="2">
<nlm:aff id="aff2">
<addr-line>Department of Genetics, Yale University, New Haven, Connecticut, United States of America</addr-line>
</nlm:aff>
<country xml:lang="fr">États-Unis</country>
<wicri:regionArea>Department of Genetics, Yale University, New Haven, Connecticut</wicri:regionArea>
<placeName>
<region type="state">Connecticut</region>
</placeName>
</affiliation>
</author>
<author>
<name sortKey="Wiggins, Chris H" sort="Wiggins, Chris H" uniqKey="Wiggins C" first="Chris H." last="Wiggins">Chris H. Wiggins</name>
<affiliation wicri:level="4">
<nlm:aff id="aff3">
<addr-line>Department of Applied Physics and Applied Mathematics, Columbia University, New York, New York, United States of America</addr-line>
</nlm:aff>
<country xml:lang="fr">États-Unis</country>
<wicri:regionArea>Department of Applied Physics and Applied Mathematics, Columbia University, New York, New York</wicri:regionArea>
<placeName>
<region type="state">État de New York</region>
<settlement type="city">New York</settlement>
</placeName>
<orgName type="university">Université Columbia</orgName>
</affiliation>
</author>
<author>
<name sortKey="Reinke, Valerie" sort="Reinke, Valerie" uniqKey="Reinke V" first="Valerie" last="Reinke">Valerie Reinke</name>
<affiliation wicri:level="2">
<nlm:aff id="aff2">
<addr-line>Department of Genetics, Yale University, New Haven, Connecticut, United States of America</addr-line>
</nlm:aff>
<country xml:lang="fr">États-Unis</country>
<wicri:regionArea>Department of Genetics, Yale University, New Haven, Connecticut</wicri:regionArea>
<placeName>
<region type="state">Connecticut</region>
</placeName>
</affiliation>
</author>
<author>
<name sortKey="Leslie, Christina" sort="Leslie, Christina" uniqKey="Leslie C" first="Christina" last="Leslie">Christina Leslie</name>
<affiliation wicri:level="2">
<nlm:aff id="aff4">
<addr-line>Computational Biology Program, Sloan-Kettering Institute, New York, New York, United States of America</addr-line>
</nlm:aff>
<country xml:lang="fr">États-Unis</country>
<wicri:regionArea>Computational Biology Program, Sloan-Kettering Institute, New York, New York</wicri:regionArea>
<placeName>
<region type="state">État de New York</region>
</placeName>
</affiliation>
</author>
</analytic>
<series>
<title level="j">PLoS Computational Biology</title>
<idno type="ISSN">1553-734X</idno>
<idno type="eISSN">1553-7358</idno>
<imprint>
<date when="2010">2010</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
</fileDesc>
<profileDesc>
<textClass></textClass>
</profileDesc>
</teiHeader>
<front>
<div type="abstract" xml:lang="en">
<p>A key problem in understanding transcriptional regulatory networks is deciphering what
<italic>cis</italic>
regulatory logic is encoded in gene promoter sequences and how this sequence information maps to expression. A typical computational approach to this problem involves clustering genes by their expression profiles and then searching for overrepresented motifs in the promoter sequences of genes in a cluster. However, genes with similar expression profiles may be controlled by distinct regulatory programs. Moreover, if many gene expression profiles in a data set are highly correlated, as in the case of whole organism developmental time series, it may be difficult to resolve fine-grained clusters in the first place. We present a predictive framework for modeling the natural flow of information, from promoter sequence to expression, to learn
<italic>cis</italic>
regulatory motifs and characterize gene expression patterns in developmental time courses. We introduce a cluster-free algorithm based on a graph-regularized version of partial least squares (PLS) regression to learn sequence patterns—represented by graphs of
<italic>k</italic>
-mers, or “graph-mers”—that predict gene expression trajectories. Applying the approach to wildtype germline development in
<italic>Caenorhabditis elegans</italic>
, we found that the first and second latent PLS factors mapped to expression profiles for oocyte and sperm genes, respectively. We extracted both known and novel motifs from the graph-mers associated to these germline-specific patterns, including novel CG-rich motifs specific to oocyte genes. We found evidence supporting the functional relevance of these putative regulatory elements through analysis of positional bias, motif conservation and
<italic>in situ</italic>
gene expression. This study demonstrates that our regression model can learn biologically meaningful latent structure and identify potentially functional motifs from subtle developmental time course expression data.</p>
</div>
</front>
</TEI>
<pmc article-type="research-article">
<pmc-dir>properties open_access</pmc-dir>
<front>
<journal-meta>
<journal-id journal-id-type="nlm-ta">PLoS Comput Biol</journal-id>
<journal-id journal-id-type="iso-abbrev">PLoS Comput. Biol</journal-id>
<journal-id journal-id-type="publisher-id">plos</journal-id>
<journal-id journal-id-type="pmc">ploscomp</journal-id>
<journal-title-group>
<journal-title>PLoS Computational Biology</journal-title>
</journal-title-group>
<issn pub-type="ppub">1553-734X</issn>
<issn pub-type="epub">1553-7358</issn>
<publisher>
<publisher-name>Public Library of Science</publisher-name>
<publisher-loc>San Francisco, USA</publisher-loc>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="pmid">20454681</article-id>
<article-id pub-id-type="pmc">2861633</article-id>
<article-id pub-id-type="publisher-id">09-PLCB-RA-0689R3</article-id>
<article-id pub-id-type="doi">10.1371/journal.pcbi.1000761</article-id>
<article-categories>
<subj-group subj-group-type="heading">
<subject>Research Article</subject>
</subj-group>
<subj-group subj-group-type="Discipline">
<subject>Genetics and Genomics/Bioinformatics</subject>
<subject>Molecular Biology/Bioinformatics</subject>
</subj-group>
</article-categories>
<title-group>
<article-title>Learning “graph-mer” Motifs that Predict Gene Expression Trajectories in Development</article-title>
<alt-title alt-title-type="running-head">Learning Motifs that Predict Gene Expression</alt-title>
</title-group>
<contrib-group>
<contrib contrib-type="author">
<name>
<surname>Li</surname>
<given-names>Xuejing</given-names>
</name>
<xref ref-type="aff" rid="aff1">
<sup>1</sup>
</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Panea</surname>
<given-names>Casandra</given-names>
</name>
<xref ref-type="aff" rid="aff2">
<sup>2</sup>
</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Wiggins</surname>
<given-names>Chris H.</given-names>
</name>
<xref ref-type="aff" rid="aff3">
<sup>3</sup>
</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Reinke</surname>
<given-names>Valerie</given-names>
</name>
<xref ref-type="aff" rid="aff2">
<sup>2</sup>
</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Leslie</surname>
<given-names>Christina</given-names>
</name>
<xref ref-type="aff" rid="aff4">
<sup>4</sup>
</xref>
<xref ref-type="corresp" rid="cor1">
<sup>*</sup>
</xref>
</contrib>
</contrib-group>
<aff id="aff1">
<label>1</label>
<addr-line>Department of Physics, Columbia University, New York, New York, United States of America</addr-line>
</aff>
<aff id="aff2">
<label>2</label>
<addr-line>Department of Genetics, Yale University, New Haven, Connecticut, United States of America</addr-line>
</aff>
<aff id="aff3">
<label>3</label>
<addr-line>Department of Applied Physics and Applied Mathematics, Columbia University, New York, New York, United States of America</addr-line>
</aff>
<aff id="aff4">
<label>4</label>
<addr-line>Computational Biology Program, Sloan-Kettering Institute, New York, New York, United States of America</addr-line>
</aff>
<contrib-group>
<contrib contrib-type="editor">
<name>
<surname>Regev</surname>
<given-names>Aviv</given-names>
</name>
<role>Editor</role>
<xref ref-type="aff" rid="edit1"></xref>
</contrib>
</contrib-group>
<aff id="edit1">Broad Institute of MIT and Harvard, United States of America</aff>
<author-notes>
<corresp id="cor1">* E-mail:
<email>cleslie@cbio.mskcc.org</email>
</corresp>
<fn fn-type="con">
<p>Conceived and designed the experiments: XL CP CHW VR CL. Performed the experiments: XL CP. Analyzed the data: XL CP CHW VR CL. Contributed reagents/materials/analysis tools: XL CP CHW VR CL. Wrote the paper: XL VR CL.</p>
</fn>
</author-notes>
<pub-date pub-type="collection">
<month>4</month>
<year>2010</year>
</pub-date>
<pub-date pub-type="epub">
<day>29</day>
<month>4</month>
<year>2010</year>
</pub-date>
<volume>6</volume>
<issue>4</issue>
<elocation-id>e1000761</elocation-id>
<history>
<date date-type="received">
<day>22</day>
<month>6</month>
<year>2009</year>
</date>
<date date-type="accepted">
<day>24</day>
<month>3</month>
<year>2010</year>
</date>
</history>
<permissions>
<copyright-statement>Li et al.</copyright-statement>
<copyright-year>2010</copyright-year>
<license xlink:href="http://creativecommons.org/licenses/by/4.0/">
<license-p>This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.</license-p>
</license>
</permissions>
<abstract>
<p>A key problem in understanding transcriptional regulatory networks is deciphering what
<italic>cis</italic>
regulatory logic is encoded in gene promoter sequences and how this sequence information maps to expression. A typical computational approach to this problem involves clustering genes by their expression profiles and then searching for overrepresented motifs in the promoter sequences of genes in a cluster. However, genes with similar expression profiles may be controlled by distinct regulatory programs. Moreover, if many gene expression profiles in a data set are highly correlated, as in the case of whole organism developmental time series, it may be difficult to resolve fine-grained clusters in the first place. We present a predictive framework for modeling the natural flow of information, from promoter sequence to expression, to learn
<italic>cis</italic>
regulatory motifs and characterize gene expression patterns in developmental time courses. We introduce a cluster-free algorithm based on a graph-regularized version of partial least squares (PLS) regression to learn sequence patterns—represented by graphs of
<italic>k</italic>
-mers, or “graph-mers”—that predict gene expression trajectories. Applying the approach to wildtype germline development in
<italic>Caenorhabditis elegans</italic>
, we found that the first and second latent PLS factors mapped to expression profiles for oocyte and sperm genes, respectively. We extracted both known and novel motifs from the graph-mers associated to these germline-specific patterns, including novel CG-rich motifs specific to oocyte genes. We found evidence supporting the functional relevance of these putative regulatory elements through analysis of positional bias, motif conservation and
<italic>in situ</italic>
gene expression. This study demonstrates that our regression model can learn biologically meaningful latent structure and identify potentially functional motifs from subtle developmental time course expression data.</p>
</abstract>
<abstract abstract-type="summary">
<title>Author Summary</title>
<p>A major challenge in functional genomics is to decipher the gene regulatory networks operating in multi-cellular organisms, such as the nematode
<italic>C. elegans</italic>
. The expression level of a gene is controlled, to a great extent, by regulatory proteins called transcription factors that bind short motifs in the gene's promoter (regulatory region in the non-coding DNA). In a temporal regulatory process, for example in development, the “regulatory logic” of DNA motifs in the promoter largely determines the gene's expression trajectory, as the gene responds over time to changing concentrations of the transcription factors that control it. This study addresses the problem of learning DNA motifs that predict temporal expression profiles, using genomewide expression data from developmental time series in
<italic>C. elegans</italic>
. We developed a novel algorithm based on techniques from multivariate regression that sets up a correspondence between sequence patterns and expression trajectories. Sequence motifs are represented as graphs of sequence-similar
<italic>k</italic>
-length subsequences called “graph-mers”. By applying the method to germline development in
<italic>C. elegans</italic>
, we found both known and novel DNA motifs associated with oocyte and sperm genes.</p>
</abstract>
<counts>
<page-count count="13"></page-count>
</counts>
</article-meta>
</front>
</pmc>
<affiliations>
<list>
<country>
<li>États-Unis</li>
</country>
<region>
<li>Connecticut</li>
<li>État de New York</li>
</region>
<settlement>
<li>New York</li>
</settlement>
<orgName>
<li>Université Columbia</li>
</orgName>
</list>
<tree>
<country name="États-Unis">
<region name="État de New York">
<name sortKey="Li, Xuejing" sort="Li, Xuejing" uniqKey="Li X" first="Xuejing" last="Li">Xuejing Li</name>
</region>
<name sortKey="Leslie, Christina" sort="Leslie, Christina" uniqKey="Leslie C" first="Christina" last="Leslie">Christina Leslie</name>
<name sortKey="Panea, Casandra" sort="Panea, Casandra" uniqKey="Panea C" first="Casandra" last="Panea">Casandra Panea</name>
<name sortKey="Reinke, Valerie" sort="Reinke, Valerie" uniqKey="Reinke V" first="Valerie" last="Reinke">Valerie Reinke</name>
<name sortKey="Wiggins, Chris H" sort="Wiggins, Chris H" uniqKey="Wiggins C" first="Chris H." last="Wiggins">Chris H. Wiggins</name>
</country>
</tree>
</affiliations>
</record>
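The record above uses the prefixes wicri:, nlm: and xlink: without declaring them in the listing, so a namespace-aware XML parser is likely to reject it as written. The sketch below shows one way to read the identifiers anyway with Python's standard library, by binding the prefixes on the root element before parsing. The wicri and nlm namespace URIs are placeholders invented for this example, the helper names are hypothetical, and the sample is a shortened excerpt of the record rather than the full document.

```python
import xml.etree.ElementTree as ET

# Assumption: the wicri and nlm URIs are placeholders chosen only so that the
# prefixes bind; the record itself does not declare any namespaces.
ASSUMED_NS = {
    "wicri": "urn:example:wicri",
    "nlm": "urn:example:nlm",
    "xlink": "http://www.w3.org/1999/xlink",
}


def parse_record(xml_text):
    """Declare the assumed namespaces on <record>, then parse the text."""
    decls = " ".join(f'xmlns:{p}="{u}"' for p, u in ASSUMED_NS.items())
    patched = xml_text.replace("<record>", f"<record {decls}>", 1)
    return ET.fromstring(patched)


def identifiers(root):
    """Map each idno type to the first value found for it."""
    ids = {}
    for idno in root.iter("idno"):
        ids.setdefault(idno.get("type"), idno.text)
    return ids


# Shortened excerpt of the record, kept only to make the example runnable.
SAMPLE = """<record>
<TEI>
<teiHeader>
<fileDesc>
<publicationStmt>
<idno type="pmid">20454681</idno>
<idno type="doi">10.1371/journal.pcbi.1000761</idno>
<idno type="wicri:explorRef" wicri:stream="Pmc" wicri:step="Checkpoint">001356</idno>
</publicationStmt>
</fileDesc>
</teiHeader>
</TEI>
</record>"""

ids = identifiers(parse_record(SAMPLE))
for kind, value in ids.items():
    print(kind, value)
```

Patching the root tag as a string is a pragmatic shortcut for a dump like this one; if the original file does carry namespace declarations, parsing it directly is preferable.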

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Sante/explor/MersV1/Data/Pmc/Checkpoint
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 001356 | SxmlIndent | more

Or

HfdSelect -h $EXPLOR_AREA/Data/Pmc/Checkpoint/biblio.hfd -nk 001356 | SxmlIndent | more

To link to this page within the Wicri network

{{Explor lien
   |wiki=    Sante
   |area=    MersV1
   |flux=    Pmc
   |étape=   Checkpoint
   |type=    RBID
   |clé=     PMC:2861633
   |texte=   Learning “graph-mer” Motifs that Predict Gene Expression Trajectories in Development
}}

To generate wiki pages

HfdIndexSelect -h $EXPLOR_AREA/Data/Pmc/Checkpoint/RBID.i   -Sk "pubmed:20454681" \
       | HfdSelect -Kh $EXPLOR_AREA/Data/Pmc/Checkpoint/biblio.hfd   \
       | NlmPubMed2Wicri -a MersV1 

Wicri

This area was generated with Dilib version V0.6.33.
Data generation: Mon Apr 20 23:26:43 2020. Site generation: Sat Mar 27 09:06:09 2021