MERS Exploration Server

Warning: this site is under development!
Warning: this site was generated automatically from raw corpora.
The information has therefore not been validated.

Internal identifier: 000C909 (Pmc/Corpus); previous: 000C908; next: 000C910



The document in XML format

<record>
<TEI>
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en">Overlapping ETS and CRE Motifs (
<sup>G</sup>
/
<sub>C</sub>
CGGAAGTGACGTCA) Preferentially Bound by GABPα and CREB Proteins</title>
<author>
<name sortKey="Chatterjee, Raghunath" sort="Chatterjee, Raghunath" uniqKey="Chatterjee R" first="Raghunath" last="Chatterjee">Raghunath Chatterjee</name>
<affiliation>
<nlm:aff id="aff1">Laboratory of Metabolism, National Cancer Institute, National Institutes of Health, Bethesda, Maryland 20892</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Zhao, Jianfei" sort="Zhao, Jianfei" uniqKey="Zhao J" first="Jianfei" last="Zhao">Jianfei Zhao</name>
<affiliation>
<nlm:aff id="aff1">Laboratory of Metabolism, National Cancer Institute, National Institutes of Health, Bethesda, Maryland 20892</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="He, Ximiao" sort="He, Ximiao" uniqKey="He X" first="Ximiao" last="He">Ximiao He</name>
<affiliation>
<nlm:aff id="aff1">Laboratory of Metabolism, National Cancer Institute, National Institutes of Health, Bethesda, Maryland 20892</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Shlyakhtenko, Andrey" sort="Shlyakhtenko, Andrey" uniqKey="Shlyakhtenko A" first="Andrey" last="Shlyakhtenko">Andrey Shlyakhtenko</name>
<affiliation>
<nlm:aff id="aff1">Laboratory of Metabolism, National Cancer Institute, National Institutes of Health, Bethesda, Maryland 20892</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Mann, Ishminder" sort="Mann, Ishminder" uniqKey="Mann I" first="Ishminder" last="Mann">Ishminder Mann</name>
<affiliation>
<nlm:aff id="aff1">Laboratory of Metabolism, National Cancer Institute, National Institutes of Health, Bethesda, Maryland 20892</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Waterfall, Joshua J" sort="Waterfall, Joshua J" uniqKey="Waterfall J" first="Joshua J." last="Waterfall">Joshua J. Waterfall</name>
<affiliation>
<nlm:aff id="aff2">Genetics Branch, National Cancer Institute, National Institutes of Health, Bethesda, Maryland 20892</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Meltzer, Paul" sort="Meltzer, Paul" uniqKey="Meltzer P" first="Paul" last="Meltzer">Paul Meltzer</name>
<affiliation>
<nlm:aff id="aff2">Genetics Branch, National Cancer Institute, National Institutes of Health, Bethesda, Maryland 20892</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Sathyanarayana, B K" sort="Sathyanarayana, B K" uniqKey="Sathyanarayana B" first="B. K." last="Sathyanarayana">B. K. Sathyanarayana</name>
<affiliation>
<nlm:aff id="aff3">Laboratory of Molecular Biology, National Cancer Institute, National Institutes of Health, Bethesda, Maryland 20892</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Fitzgerald, Peter C" sort="Fitzgerald, Peter C" uniqKey="Fitzgerald P" first="Peter C." last="Fitzgerald">Peter C. FitzGerald</name>
<affiliation>
<nlm:aff id="aff4">Genome Analysis Unit, National Cancer Institute, National Institutes of Health, Bethesda, Maryland 20892</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Vinson, Charles" sort="Vinson, Charles" uniqKey="Vinson C" first="Charles" last="Vinson">Charles Vinson</name>
<affiliation>
<nlm:aff id="aff1">Laboratory of Metabolism, National Cancer Institute, National Institutes of Health, Bethesda, Maryland 20892</nlm:aff>
</affiliation>
</author>
</titleStmt>
<publicationStmt>
<idno type="wicri:source">PMC</idno>
<idno type="pmid">23050235</idno>
<idno type="pmc">3464117</idno>
<idno type="url">http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3464117</idno>
<idno type="RBID">PMC:3464117</idno>
<idno type="doi">10.1534/g3.112.004002</idno>
<date when="2012">2012</date>
<idno type="wicri:Area/Pmc/Corpus">000C90</idno>
<idno type="wicri:explorRef" wicri:stream="Pmc" wicri:step="Corpus" wicri:corpus="PMC">000C90</idno>
</publicationStmt>
<sourceDesc>
<biblStruct>
<analytic>
<title xml:lang="en" level="a" type="main">Overlapping ETS and CRE Motifs (
<sup>G</sup>
/
<sub>C</sub>
CGGAAGTGACGTCA) Preferentially Bound by GABPα and CREB Proteins</title>
<author>
<name sortKey="Chatterjee, Raghunath" sort="Chatterjee, Raghunath" uniqKey="Chatterjee R" first="Raghunath" last="Chatterjee">Raghunath Chatterjee</name>
<affiliation>
<nlm:aff id="aff1">Laboratory of Metabolism, National Cancer Institute, National Institutes of Health, Bethesda, Maryland 20892</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Zhao, Jianfei" sort="Zhao, Jianfei" uniqKey="Zhao J" first="Jianfei" last="Zhao">Jianfei Zhao</name>
<affiliation>
<nlm:aff id="aff1">Laboratory of Metabolism, National Cancer Institute, National Institutes of Health, Bethesda, Maryland 20892</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="He, Ximiao" sort="He, Ximiao" uniqKey="He X" first="Ximiao" last="He">Ximiao He</name>
<affiliation>
<nlm:aff id="aff1">Laboratory of Metabolism, National Cancer Institute, National Institutes of Health, Bethesda, Maryland 20892</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Shlyakhtenko, Andrey" sort="Shlyakhtenko, Andrey" uniqKey="Shlyakhtenko A" first="Andrey" last="Shlyakhtenko">Andrey Shlyakhtenko</name>
<affiliation>
<nlm:aff id="aff1">Laboratory of Metabolism, National Cancer Institute, National Institutes of Health, Bethesda, Maryland 20892</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Mann, Ishminder" sort="Mann, Ishminder" uniqKey="Mann I" first="Ishminder" last="Mann">Ishminder Mann</name>
<affiliation>
<nlm:aff id="aff1">Laboratory of Metabolism, National Cancer Institute, National Institutes of Health, Bethesda, Maryland 20892</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Waterfall, Joshua J" sort="Waterfall, Joshua J" uniqKey="Waterfall J" first="Joshua J." last="Waterfall">Joshua J. Waterfall</name>
<affiliation>
<nlm:aff id="aff2">Genetics Branch, National Cancer Institute, National Institutes of Health, Bethesda, Maryland 20892</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Meltzer, Paul" sort="Meltzer, Paul" uniqKey="Meltzer P" first="Paul" last="Meltzer">Paul Meltzer</name>
<affiliation>
<nlm:aff id="aff2">Genetics Branch, National Cancer Institute, National Institutes of Health, Bethesda, Maryland 20892</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Sathyanarayana, B K" sort="Sathyanarayana, B K" uniqKey="Sathyanarayana B" first="B. K." last="Sathyanarayana">B. K. Sathyanarayana</name>
<affiliation>
<nlm:aff id="aff3">Laboratory of Molecular Biology, National Cancer Institute, National Institutes of Health, Bethesda, Maryland 20892</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Fitzgerald, Peter C" sort="Fitzgerald, Peter C" uniqKey="Fitzgerald P" first="Peter C." last="Fitzgerald">Peter C. FitzGerald</name>
<affiliation>
<nlm:aff id="aff4">Genome Analysis Unit, National Cancer Institute, National Institutes of Health, Bethesda, Maryland 20892</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Vinson, Charles" sort="Vinson, Charles" uniqKey="Vinson C" first="Charles" last="Vinson">Charles Vinson</name>
<affiliation>
<nlm:aff id="aff1">Laboratory of Metabolism, National Cancer Institute, National Institutes of Health, Bethesda, Maryland 20892</nlm:aff>
</affiliation>
</author>
</analytic>
<series>
<title level="j">G3: Genes|Genomes|Genetics</title>
<idno type="eISSN">2160-1836</idno>
<imprint>
<date when="2012">2012</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
</fileDesc>
<profileDesc>
<textClass></textClass>
</profileDesc>
</teiHeader>
<front>
<div type="abstract" xml:lang="en">
<p>Previously, we identified 8-bp-long DNA sequences (8-mers) that localize in human proximal promoters and grouped them into known transcription factor binding sites (TFBS). We now examine split 8-mers consisting of two 4-mers separated by 1 to 30 bp (X
<sub>4</sub>
-N
<sub>1-30</sub>
-X
<sub>4</sub>
) to identify pairs of TFBS that localize in proximal promoters at a precise distance. These include two overlapping TFBS: the ETS⇔ETS motif (
<sup>C</sup>
/
<sub>G</sub>
CCGGAA
<bold>G</bold>
CGGAA) and the ETS⇔CRE motif (
<sup>C</sup>
/
<sub>G</sub>
CGGAA
<bold>GTG</bold>
ACGTCAC). The nucleotides in bold are part of both TFBS. Molecular modeling shows that the ETS⇔CRE motif can be bound simultaneously by both the ETS and the B-ZIP domains without protein-protein clashes. The electrophoretic mobility shift assay (EMSA) shows that the ETS protein GABPα and the B-ZIP protein CREB preferentially bind to the ETS⇔CRE motif only when the two TFBS overlap precisely. In contrast, the ETS domain of ETV5 and CREB interfere with each other for binding the ETS⇔CRE. The 11-mer (CGGAA
<bold>GTG</bold>
ACG), the conserved part of the ETS⇔CRE motif, occurs 226 times in the human genome and 83% are in known regulatory regions.
<italic>In vivo</italic>
GABPα and CREB ChIP-seq peaks identified the ETS⇔CRE as the most enriched motif occurring in promoters of genes involved in mRNA processing, cellular catabolic processes, and stress response, suggesting that a specific class of genes is regulated by this composite motif.</p>
</div>
</front>
<back>
<div1 type="bibliography">
<listBibl>
<biblStruct>
<analytic>
<author>
<name sortKey="Ahn, S" uniqKey="Ahn S">S. Ahn</name>
</author>
<author>
<name sortKey="Olive, M" uniqKey="Olive M">M. Olive</name>
</author>
<author>
<name sortKey="Aggarwal, S" uniqKey="Aggarwal S">S. Aggarwal</name>
</author>
<author>
<name sortKey="Krylov, D" uniqKey="Krylov D">D. Krylov</name>
</author>
<author>
<name sortKey="Ginty, D D" uniqKey="Ginty D">D. D. Ginty</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Badis, G" uniqKey="Badis G">G. Badis</name>
</author>
<author>
<name sortKey="Berger, M F" uniqKey="Berger M">M. F. Berger</name>
</author>
<author>
<name sortKey="Philippakis, A A" uniqKey="Philippakis A">A. A. Philippakis</name>
</author>
<author>
<name sortKey="Talukder, S" uniqKey="Talukder S">S. Talukder</name>
</author>
<author>
<name sortKey="Gehrke, A R" uniqKey="Gehrke A">A. R. Gehrke</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Batchelor, A H" uniqKey="Batchelor A">A. H. Batchelor</name>
</author>
<author>
<name sortKey="Piper, D E" uniqKey="Piper D">D. E. Piper</name>
</author>
<author>
<name sortKey="De La Brousse, F C" uniqKey="De La Brousse F">F. C. de la Brousse</name>
</author>
<author>
<name sortKey="Mcknight, S L" uniqKey="Mcknight S">S. L. McKnight</name>
</author>
<author>
<name sortKey="Wolberger, C" uniqKey="Wolberger C">C. Wolberger</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Biddie, S C" uniqKey="Biddie S">S. C. Biddie</name>
</author>
<author>
<name sortKey="John, S" uniqKey="John S">S. John</name>
</author>
<author>
<name sortKey="Sabo, P J" uniqKey="Sabo P">P. J. Sabo</name>
</author>
<author>
<name sortKey="Thurman, R E" uniqKey="Thurman R">R. E. Thurman</name>
</author>
<author>
<name sortKey="Johnson, T A" uniqKey="Johnson T">T. A. Johnson</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Bina, M" uniqKey="Bina M">M. Bina</name>
</author>
<author>
<name sortKey="Wyss, P" uniqKey="Wyss P">P. Wyss</name>
</author>
<author>
<name sortKey="Ren, W" uniqKey="Ren W">W. Ren</name>
</author>
<author>
<name sortKey="Szpankowski, W" uniqKey="Szpankowski W">W. Szpankowski</name>
</author>
<author>
<name sortKey="Thomas, E" uniqKey="Thomas E">E. Thomas</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Bird, A" uniqKey="Bird A">A. Bird</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Boehlk, S" uniqKey="Boehlk S">S. Boehlk</name>
</author>
<author>
<name sortKey="Fessele, S" uniqKey="Fessele S">S. Fessele</name>
</author>
<author>
<name sortKey="Mojaat, A" uniqKey="Mojaat A">A. Mojaat</name>
</author>
<author>
<name sortKey="Miyamoto, N G" uniqKey="Miyamoto N">N. G. Miyamoto</name>
</author>
<author>
<name sortKey="Werner, T" uniqKey="Werner T">T. Werner</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Carninci, P" uniqKey="Carninci P">P. Carninci</name>
</author>
<author>
<name sortKey="Sandelin, A" uniqKey="Sandelin A">A. Sandelin</name>
</author>
<author>
<name sortKey="Lenhard, B" uniqKey="Lenhard B">B. Lenhard</name>
</author>
<author>
<name sortKey="Katayama, S" uniqKey="Katayama S">S. Katayama</name>
</author>
<author>
<name sortKey="Shimokawa, K" uniqKey="Shimokawa K">K. Shimokawa</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="De Val, S" uniqKey="De Val S">S. De Val</name>
</author>
<author>
<name sortKey="Chi, N C" uniqKey="Chi N">N. C. Chi</name>
</author>
<author>
<name sortKey="Meadows, S M" uniqKey="Meadows S">S. M. Meadows</name>
</author>
<author>
<name sortKey="Minovitsky, S" uniqKey="Minovitsky S">S. Minovitsky</name>
</author>
<author>
<name sortKey="Anderson, J P" uniqKey="Anderson J">J. P. Anderson</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Farnham, P J" uniqKey="Farnham P">P. J. Farnham</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Fitzgerald, P C" uniqKey="Fitzgerald P">P. C. FitzGerald</name>
</author>
<author>
<name sortKey="Shlyakhtenko, A" uniqKey="Shlyakhtenko A">A. Shlyakhtenko</name>
</author>
<author>
<name sortKey="Mir, A A" uniqKey="Mir A">A. A. Mir</name>
</author>
<author>
<name sortKey="Vinson, C" uniqKey="Vinson C">C. Vinson</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Fitzgerald, P C" uniqKey="Fitzgerald P">P. C. FitzGerald</name>
</author>
<author>
<name sortKey="Sturgill, D" uniqKey="Sturgill D">D. Sturgill</name>
</author>
<author>
<name sortKey="Shlyakhtenko, A" uniqKey="Shlyakhtenko A">A. Shlyakhtenko</name>
</author>
<author>
<name sortKey="Oliver, B" uniqKey="Oliver B">B. Oliver</name>
</author>
<author>
<name sortKey="Vinson, C" uniqKey="Vinson C">C. Vinson</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Frith, M C" uniqKey="Frith M">M. C. Frith</name>
</author>
<author>
<name sortKey="Spouge, J L" uniqKey="Spouge J">J. L. Spouge</name>
</author>
<author>
<name sortKey="Hansen, U" uniqKey="Hansen U">U. Hansen</name>
</author>
<author>
<name sortKey="Weng, Z" uniqKey="Weng Z">Z. Weng</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Garvie, C W" uniqKey="Garvie C">C. W. Garvie</name>
</author>
<author>
<name sortKey="Hagman, J" uniqKey="Hagman J">J. Hagman</name>
</author>
<author>
<name sortKey="Wolberger, C" uniqKey="Wolberger C">C. Wolberger</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Graves, B J" uniqKey="Graves B">B. J. Graves</name>
</author>
<author>
<name sortKey="Petersen, J M" uniqKey="Petersen J">J. M. Petersen</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Hollenhorst, P C" uniqKey="Hollenhorst P">P. C. Hollenhorst</name>
</author>
<author>
<name sortKey="Ferris, M W" uniqKey="Ferris M">M. W. Ferris</name>
</author>
<author>
<name sortKey="Hull, M A" uniqKey="Hull M">M. A. Hull</name>
</author>
<author>
<name sortKey="Chae, H" uniqKey="Chae H">H. Chae</name>
</author>
<author>
<name sortKey="Kim, S" uniqKey="Kim S">S. Kim</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Hollenhorst, P C" uniqKey="Hollenhorst P">P. C. Hollenhorst</name>
</author>
<author>
<name sortKey="Mcintosh, L P" uniqKey="Mcintosh L">L. P. McIntosh</name>
</author>
<author>
<name sortKey="Graves, B J" uniqKey="Graves B">B. J. Graves</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Iguchi Ariga, S M" uniqKey="Iguchi Ariga S">S. M. Iguchi-Ariga</name>
</author>
<author>
<name sortKey="Schaffner, W" uniqKey="Schaffner W">W. Schaffner</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Ji, H" uniqKey="Ji H">H. Ji</name>
</author>
<author>
<name sortKey="Jiang, H" uniqKey="Jiang H">H. Jiang</name>
</author>
<author>
<name sortKey="Ma, W" uniqKey="Ma W">W. Ma</name>
</author>
<author>
<name sortKey="Johnson, D S" uniqKey="Johnson D">D. S. Johnson</name>
</author>
<author>
<name sortKey="Myers, R M" uniqKey="Myers R">R. M. Myers</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Johnson, D S" uniqKey="Johnson D">D. S. Johnson</name>
</author>
<author>
<name sortKey="Mortazavi, A" uniqKey="Mortazavi A">A. Mortazavi</name>
</author>
<author>
<name sortKey="Myers, R M" uniqKey="Myers R">R. M. Myers</name>
</author>
<author>
<name sortKey="Wold, B" uniqKey="Wold B">B. Wold</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Johnson, P F" uniqKey="Johnson P">P. F. Johnson</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Kaplan, N" uniqKey="Kaplan N">N. Kaplan</name>
</author>
<author>
<name sortKey="Hughes, T R" uniqKey="Hughes T">T. R. Hughes</name>
</author>
<author>
<name sortKey="Lieb, J D" uniqKey="Lieb J">J. D. Lieb</name>
</author>
<author>
<name sortKey="Widom, J" uniqKey="Widom J">J. Widom</name>
</author>
<author>
<name sortKey="Segal, E" uniqKey="Segal E">E. Segal</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Kel, A E" uniqKey="Kel A">A. E. Kel</name>
</author>
<author>
<name sortKey="Gossling, E" uniqKey="Gossling E">E. Gossling</name>
</author>
<author>
<name sortKey="Reuter, I" uniqKey="Reuter I">I. Reuter</name>
</author>
<author>
<name sortKey="Cheremushkin, E" uniqKey="Cheremushkin E">E. Cheremushkin</name>
</author>
<author>
<name sortKey="Kel Margoulis, O V" uniqKey="Kel Margoulis O">O. V. Kel-Margoulis</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Kharchenko, P V" uniqKey="Kharchenko P">P. V. Kharchenko</name>
</author>
<author>
<name sortKey="Tolstorukov, M Y" uniqKey="Tolstorukov M">M. Y. Tolstorukov</name>
</author>
<author>
<name sortKey="Park, P J" uniqKey="Park P">P. J. Park</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Lagrange, T" uniqKey="Lagrange T">T. Lagrange</name>
</author>
<author>
<name sortKey="Kapanidis, A N" uniqKey="Kapanidis A">A. N. Kapanidis</name>
</author>
<author>
<name sortKey="Tang, H" uniqKey="Tang H">H. Tang</name>
</author>
<author>
<name sortKey="Reinberg, D" uniqKey="Reinberg D">D. Reinberg</name>
</author>
<author>
<name sortKey="Ebright, R H" uniqKey="Ebright R">R. H. Ebright</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Machanick, P" uniqKey="Machanick P">P. Machanick</name>
</author>
<author>
<name sortKey="Bailey, T L" uniqKey="Bailey T">T. L. Bailey</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Marino Ramirez, L" uniqKey="Marino Ramirez L">L. Marino-Ramirez</name>
</author>
<author>
<name sortKey="Spouge, J L" uniqKey="Spouge J">J. L. Spouge</name>
</author>
<author>
<name sortKey="Kanga, G C" uniqKey="Kanga G">G. C. Kanga</name>
</author>
<author>
<name sortKey="Landsman, D" uniqKey="Landsman D">D. Landsman</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Martianov, I" uniqKey="Martianov I">I. Martianov</name>
</author>
<author>
<name sortKey="Choukrallah, M A" uniqKey="Choukrallah M">M. A. Choukrallah</name>
</author>
<author>
<name sortKey="Krebs, A" uniqKey="Krebs A">A. Krebs</name>
</author>
<author>
<name sortKey="Ye, T" uniqKey="Ye T">T. Ye</name>
</author>
<author>
<name sortKey="Legras, S" uniqKey="Legras S">S. Legras</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Matys, V" uniqKey="Matys V">V. Matys</name>
</author>
<author>
<name sortKey="Kel Margoulis, O V" uniqKey="Kel Margoulis O">O. V. Kel-Margoulis</name>
</author>
<author>
<name sortKey="Fricke, E" uniqKey="Fricke E">E. Fricke</name>
</author>
<author>
<name sortKey="Liebich, I" uniqKey="Liebich I">I. Liebich</name>
</author>
<author>
<name sortKey="Land, S" uniqKey="Land S">S. Land</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Mirny, L A" uniqKey="Mirny L">L. A. Mirny</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Oh, Y M" uniqKey="Oh Y">Y. M. Oh</name>
</author>
<author>
<name sortKey="Kim, J K" uniqKey="Kim J">J. K. Kim</name>
</author>
<author>
<name sortKey="Choi, S" uniqKey="Choi S">S. Choi</name>
</author>
<author>
<name sortKey="Yoo, J Y" uniqKey="Yoo J">J. Y. Yoo</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Ohler, U" uniqKey="Ohler U">U. Ohler</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Pachkov, M" uniqKey="Pachkov M">M. Pachkov</name>
</author>
<author>
<name sortKey="Erb, I" uniqKey="Erb I">I. Erb</name>
</author>
<author>
<name sortKey="Molina, N" uniqKey="Molina N">N. Molina</name>
</author>
<author>
<name sortKey="Van Nimwegen, E" uniqKey="Van Nimwegen E">E. van Nimwegen</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Panne, D" uniqKey="Panne D">D. Panne</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Panne, D" uniqKey="Panne D">D. Panne</name>
</author>
<author>
<name sortKey="Maniatis, T" uniqKey="Maniatis T">T. Maniatis</name>
</author>
<author>
<name sortKey="Harrison, S C" uniqKey="Harrison S">S. C. Harrison</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Panne, D" uniqKey="Panne D">D. Panne</name>
</author>
<author>
<name sortKey="Mcwhirter, S M" uniqKey="Mcwhirter S">S. M. McWhirter</name>
</author>
<author>
<name sortKey="Maniatis, T" uniqKey="Maniatis T">T. Maniatis</name>
</author>
<author>
<name sortKey="Harrison, S C" uniqKey="Harrison S">S. C. Harrison</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Pettersen, E F" uniqKey="Pettersen E">E. F. Pettersen</name>
</author>
<author>
<name sortKey="Goddard, T D" uniqKey="Goddard T">T. D. Goddard</name>
</author>
<author>
<name sortKey="Huang, C C" uniqKey="Huang C">C. C. Huang</name>
</author>
<author>
<name sortKey="Couch, G S" uniqKey="Couch G">G. S. Couch</name>
</author>
<author>
<name sortKey="Greenblatt, D M" uniqKey="Greenblatt D">D. M. Greenblatt</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Polach, K J" uniqKey="Polach K">K. J. Polach</name>
</author>
<author>
<name sortKey="Widom, J" uniqKey="Widom J">J. Widom</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Pollard, K S" uniqKey="Pollard K">K. S. Pollard</name>
</author>
<author>
<name sortKey="Hubisz, M J" uniqKey="Hubisz M">M. J. Hubisz</name>
</author>
<author>
<name sortKey="Rosenbloom, K R" uniqKey="Rosenbloom K">K. R. Rosenbloom</name>
</author>
<author>
<name sortKey="Siepel, A" uniqKey="Siepel A">A. Siepel</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Portales Casamar, E" uniqKey="Portales Casamar E">E. Portales-Casamar</name>
</author>
<author>
<name sortKey="Thongjuea, S" uniqKey="Thongjuea S">S. Thongjuea</name>
</author>
<author>
<name sortKey="Kwon, A T" uniqKey="Kwon A">A. T. Kwon</name>
</author>
<author>
<name sortKey="Arenillas, D" uniqKey="Arenillas D">D. Arenillas</name>
</author>
<author>
<name sortKey="Zhao, X" uniqKey="Zhao X">X. Zhao</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Rishi, V" uniqKey="Rishi V">V. Rishi</name>
</author>
<author>
<name sortKey="Bhattacharya, P" uniqKey="Bhattacharya P">P. Bhattacharya</name>
</author>
<author>
<name sortKey="Chatterjee, R" uniqKey="Chatterjee R">R. Chatterjee</name>
</author>
<author>
<name sortKey="Rozenberg, J" uniqKey="Rozenberg J">J. Rozenberg</name>
</author>
<author>
<name sortKey="Zhao, J" uniqKey="Zhao J">J. Zhao</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Rozenberg, J M" uniqKey="Rozenberg J">J. M. Rozenberg</name>
</author>
<author>
<name sortKey="Shlyakhtenko, A" uniqKey="Shlyakhtenko A">A. Shlyakhtenko</name>
</author>
<author>
<name sortKey="Glass, K" uniqKey="Glass K">K. Glass</name>
</author>
<author>
<name sortKey="Rishi, V" uniqKey="Rishi V">V. Rishi</name>
</author>
<author>
<name sortKey="Myakishev, M V" uniqKey="Myakishev M">M. V. Myakishev</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Sabo, P J" uniqKey="Sabo P">P. J. Sabo</name>
</author>
<author>
<name sortKey="Humbert, R" uniqKey="Humbert R">R. Humbert</name>
</author>
<author>
<name sortKey="Hawrylycz, M" uniqKey="Hawrylycz M">M. Hawrylycz</name>
</author>
<author>
<name sortKey="Wallace, J C" uniqKey="Wallace J">J. C. Wallace</name>
</author>
<author>
<name sortKey="Dorschner, M O" uniqKey="Dorschner M">M. O. Dorschner</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Sawada, J" uniqKey="Sawada J">J. Sawada</name>
</author>
<author>
<name sortKey="Simizu, N" uniqKey="Simizu N">N. Simizu</name>
</author>
<author>
<name sortKey="Suzuki, F" uniqKey="Suzuki F">F. Suzuki</name>
</author>
<author>
<name sortKey="Sawa, C" uniqKey="Sawa C">C. Sawa</name>
</author>
<author>
<name sortKey="Goto, M" uniqKey="Goto M">M. Goto</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Schumacher, M A" uniqKey="Schumacher M">M. A. Schumacher</name>
</author>
<author>
<name sortKey="Goodman, R H" uniqKey="Goodman R">R. H. Goodman</name>
</author>
<author>
<name sortKey="Brennan, R G" uniqKey="Brennan R">R. G. Brennan</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Smale, S T" uniqKey="Smale S">S. T. Smale</name>
</author>
<author>
<name sortKey="Kadonaga, J T" uniqKey="Kadonaga J">J. T. Kadonaga</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Thomas Chollier, M" uniqKey="Thomas Chollier M">M. Thomas-Chollier</name>
</author>
<author>
<name sortKey="Herrmann, C" uniqKey="Herrmann C">C. Herrmann</name>
</author>
<author>
<name sortKey="Defrance, M" uniqKey="Defrance M">M. Defrance</name>
</author>
<author>
<name sortKey="Sand, O" uniqKey="Sand O">O. Sand</name>
</author>
<author>
<name sortKey="Thieffry, D" uniqKey="Thieffry D">D. Thieffry</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Umezawa, A" uniqKey="Umezawa A">A. Umezawa</name>
</author>
<author>
<name sortKey="Yamamoto, H" uniqKey="Yamamoto H">H. Yamamoto</name>
</author>
<author>
<name sortKey="Rhodes, K" uniqKey="Rhodes K">K. Rhodes</name>
</author>
<author>
<name sortKey="Klemsz, M J" uniqKey="Klemsz M">M. J. Klemsz</name>
</author>
<author>
<name sortKey="Maki, R A" uniqKey="Maki R">R. A. Maki</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Valouev, A" uniqKey="Valouev A">A. Valouev</name>
</author>
<author>
<name sortKey="Johnson, D S" uniqKey="Johnson D">D. S. Johnson</name>
</author>
<author>
<name sortKey="Sundquist, A" uniqKey="Sundquist A">A. Sundquist</name>
</author>
<author>
<name sortKey="Medina, C" uniqKey="Medina C">C. Medina</name>
</author>
<author>
<name sortKey="Anton, E" uniqKey="Anton E">E. Anton</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Vinson, C" uniqKey="Vinson C">C. Vinson</name>
</author>
<author>
<name sortKey="Chatterjee, R" uniqKey="Chatterjee R">R. Chatterjee</name>
</author>
<author>
<name sortKey="Fitzgerald, P" uniqKey="Fitzgerald P">P. Fitzgerald</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Wei, G H" uniqKey="Wei G">G. H. Wei</name>
</author>
<author>
<name sortKey="Badis, G" uniqKey="Badis G">G. Badis</name>
</author>
<author>
<name sortKey="Berger, M F" uniqKey="Berger M">M. F. Berger</name>
</author>
<author>
<name sortKey="Kivioja, T" uniqKey="Kivioja T">T. Kivioja</name>
</author>
<author>
<name sortKey="Palin, K" uniqKey="Palin K">K. Palin</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Whitington, T" uniqKey="Whitington T">T. Whitington</name>
</author>
<author>
<name sortKey="Frith, M C" uniqKey="Frith M">M. C. Frith</name>
</author>
<author>
<name sortKey="Johnson, J" uniqKey="Johnson J">J. Johnson</name>
</author>
<author>
<name sortKey="Bailey, T L" uniqKey="Bailey T">T. L. Bailey</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Wunderlich, Z" uniqKey="Wunderlich Z">Z. Wunderlich</name>
</author>
<author>
<name sortKey="Mirny, L A" uniqKey="Mirny L">L. A. Mirny</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Xie, X" uniqKey="Xie X">X. Xie</name>
</author>
<author>
<name sortKey="Lu, J" uniqKey="Lu J">J. Lu</name>
</author>
<author>
<name sortKey="Kulbokas, E J" uniqKey="Kulbokas E">E. J. Kulbokas</name>
</author>
<author>
<name sortKey="Golub, T R" uniqKey="Golub T">T. R. Golub</name>
</author>
<author>
<name sortKey="Mootha, V" uniqKey="Mootha V">V. Mootha</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Zhang, X" uniqKey="Zhang X">X. Zhang</name>
</author>
<author>
<name sortKey="Odom, D T" uniqKey="Odom D">D. T. Odom</name>
</author>
<author>
<name sortKey="Koo, S H" uniqKey="Koo S">S. H. Koo</name>
</author>
<author>
<name sortKey="Conkright, M D" uniqKey="Conkright M">M. D. Conkright</name>
</author>
<author>
<name sortKey="Canettieri, G" uniqKey="Canettieri G">G. Canettieri</name>
</author>
</analytic>
</biblStruct>
</listBibl>
</div1>
</back>
</TEI>
<pmc article-type="research-article">
<pmc-dir>properties open_access</pmc-dir>
<front>
<journal-meta>
<journal-id journal-id-type="nlm-ta">G3 (Bethesda)</journal-id>
<journal-id journal-id-type="iso-abbrev">Genetics</journal-id>
<journal-id journal-id-type="hwp">ggg</journal-id>
<journal-id journal-id-type="pmc">ggg</journal-id>
<journal-id journal-id-type="publisher-id">ggg</journal-id>
<journal-title-group>
<journal-title>G3: Genes|Genomes|Genetics</journal-title>
</journal-title-group>
<issn pub-type="epub">2160-1836</issn>
<publisher>
<publisher-name>Genetics Society of America</publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="pmid">23050235</article-id>
<article-id pub-id-type="pmc">3464117</article-id>
<article-id pub-id-type="publisher-id">GGG_004002</article-id>
<article-id pub-id-type="doi">10.1534/g3.112.004002</article-id>
<article-categories>
<subj-group subj-group-type="heading">
<subject>Investigations</subject>
</subj-group>
</article-categories>
<title-group>
<article-title>Overlapping ETS and CRE Motifs (
<sup>G</sup>
/
<sub>C</sub>
CGGAAGTGACGTCA) Preferentially Bound by GABPα and CREB Proteins</article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author">
<name>
<surname>Chatterjee</surname>
<given-names>Raghunath</given-names>
</name>
<xref ref-type="aff" rid="aff1">*</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Zhao</surname>
<given-names>Jianfei</given-names>
</name>
<xref ref-type="aff" rid="aff1">*</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>He</surname>
<given-names>Ximiao</given-names>
</name>
<xref ref-type="aff" rid="aff1">*</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Shlyakhtenko</surname>
<given-names>Andrey</given-names>
</name>
<xref ref-type="aff" rid="aff1">*</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Mann</surname>
<given-names>Ishminder</given-names>
</name>
<xref ref-type="aff" rid="aff1">*</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Waterfall</surname>
<given-names>Joshua J.</given-names>
</name>
<xref ref-type="aff" rid="aff2">
<sup>†</sup>
</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Meltzer</surname>
<given-names>Paul</given-names>
</name>
<xref ref-type="aff" rid="aff2">
<sup>†</sup>
</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Sathyanarayana</surname>
<given-names>B. K.</given-names>
</name>
<xref ref-type="aff" rid="aff3">
<sup>‡</sup>
</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>FitzGerald</surname>
<given-names>Peter C.</given-names>
</name>
<xref ref-type="aff" rid="aff4">
<sup>§</sup>
</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Vinson</surname>
<given-names>Charles</given-names>
</name>
<xref ref-type="aff" rid="aff1">*</xref>
<xref ref-type="corresp" rid="cor1">
<sup>1</sup>
</xref>
</contrib>
<aff id="aff1">
<label>*</label>
Laboratory of Metabolism, National Cancer Institute, National Institutes of Health, Bethesda, Maryland 20892</aff>
<aff id="aff2">
<label>†</label>
Genetics Branch, National Cancer Institute, National Institutes of Health, Bethesda, Maryland 20892</aff>
<aff id="aff3">
<label>‡</label>
Laboratory of Molecular Biology, National Cancer Institute, National Institutes of Health, Bethesda, Maryland 20892</aff>
<aff id="aff4">
<label>§</label>
Genome Analysis Unit, National Cancer Institute, National Institutes of Health, Bethesda, Maryland 20892</aff>
</contrib-group>
<author-notes>
<fn>
<p>Supporting information is available online at
<ext-link ext-link-type="uri" xlink:href="http://www.g3journal.org/lookup/suppl/doi:10.1534/g3.112.004002/-/DC1">http://www.g3journal.org/lookup/suppl/doi:10.1534/g3.112.004002/-/DC1</ext-link>
</p>
</fn>
<corresp id="cor1">
<label>1</label>
Corresponding author: 9000 Rockville Pike, Bldg. 37, Rm. 3128, Bethesda, MD 20892. E-mail:
<email>Vinsonc@mail.nih.gov</email>
</corresp>
</author-notes>
<pub-date pub-type="epub">
<day>1</day>
<month>10</month>
<year>2012</year>
</pub-date>
<pub-date pub-type="collection">
<month>10</month>
<year>2012</year>
</pub-date>
<volume>2</volume>
<issue>10</issue>
<fpage>1243</fpage>
<lpage>1256</lpage>
<history>
<date date-type="received">
<day>07</day>
<month>6</month>
<year>2012</year>
</date>
<date date-type="accepted">
<day>19</day>
<month>8</month>
<year>2012</year>
</date>
</history>
<permissions>
<copyright-statement>Copyright © 2012 Chatterjee
<italic>et al.</italic>
</copyright-statement>
<copyright-year>2012</copyright-year>
<license license-type="open-access" xlink:href="http://creativecommons.org/licenses/by/3.0/">
<license-p>This is an open-access article distributed under the terms of the Creative Commons Attribution Unported License (
<ext-link ext-link-type="uri" xlink:href="http://creativecommons.org/licenses/by/3.0/">http://creativecommons.org/licenses/by/3.0/</ext-link>
), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</license-p>
</license>
</permissions>
<self-uri xlink:title="pdf" xlink:type="simple" xlink:href="1243.pdf"></self-uri>
<abstract>
<p>Previously, we identified 8-bps long DNA sequences (8-mers) that localize in human proximal promoters and grouped them into known transcription factor binding sites (TFBS). We now examine split 8-mers consisting of two 4-mers separated by 1-bp to 30-bps (X
<sub>4</sub>
-N
<sub>1-30</sub>
-X
<sub>4</sub>
) to identify pairs of TFBS that localize in proximal promoters at a precise distance. These include two overlapping TFBS: the ETS⇔ETS motif (
<sup>C</sup>
/
<sub>G</sub>
CCGGAA
<bold>G</bold>
CGGAA) and the ETS⇔CRE motif (
<sup>C</sup>
/
<sub>G</sub>
CGGAA
<bold>GTG</bold>
ACGTCAC). The nucleotides in bold are part of both TFBS. Molecular modeling shows that the ETS⇔CRE motif can be bound simultaneously by both the ETS and the B-ZIP domains without protein-protein clashes. The electrophoretic mobility shift assay (EMSA) shows that the ETS protein GABPα and the B-ZIP protein CREB preferentially bind to the ETS⇔CRE motif only when the two TFBS overlap precisely. In contrast, the ETS domain of ETV5 and CREB interfere with each other for binding the ETS⇔CRE. The 11-mer (CGGAA
<bold>GTG</bold>
ACG), the conserved part of the ETS⇔CRE motif, occurs 226 times in the human genome and 83% are in known regulatory regions.
<italic>In vivo</italic>
GABPα and CREB ChIP-seq peaks identified the ETS⇔CRE as the most enriched motif occurring in promoters of genes involved in mRNA processing, cellular catabolic processes, and stress response, suggesting that a specific class of genes is regulated by this composite motif.</p>
</abstract>
<kwd-group>
<kwd>proximal promoters</kwd>
<kwd>transcription factor binding sites</kwd>
<kwd>co-localization</kwd>
<kwd>transcriptional start site</kwd>
<kwd>EMSA</kwd>
</kwd-group>
<custom-meta-group>
<custom-meta>
<meta-name> DJS Export </meta-name>
<meta-value>v1</meta-value>
</custom-meta>
</custom-meta-group>
</article-meta>
</front>
<body>
<p>Gene expression is controlled by many genetic and epigenetic elements in a highly coordinated manner, but the DNA sequence of the genome is the ultimate arbiter. Specific DNA sequences in both proximal promoters and more distant regions are bound by sequence-specific DNA binding proteins that regulate gene expression (
<xref rid="bib45" ref-type="bibr">Smale & Kadonaga 2003</xref>
;
<xref rid="bib9" ref-type="bibr">Farnham 2009</xref>
). Additionally, CpG islands (regions of 300-bps to 3000-bps containing a high frequency of the CG dinucleotide) are frequently located at or near mammalian promoters (
<xref rid="bib5" ref-type="bibr">Bird 2011</xref>
). Many experimental (
<xref rid="bib7" ref-type="bibr">Carninci
<italic>et al.</italic>
2006</xref>
;
<xref rid="bib19" ref-type="bibr">Johnson
<italic>et al.</italic>
2007</xref>
) and computational methods have been employed to identify biologically relevant transcription factor binding sites (TFBS). The computational methods typically examine DNA sequence enrichment near a biologically defined regulatory region like the transcriptional start site (TSS) (
<xref rid="bib12" ref-type="bibr">Frith
<italic>et al.</italic>
2002</xref>
;
<xref rid="bib31" ref-type="bibr">Ohler
<italic>et al.</italic>
2002</xref>
;
<xref rid="bib22" ref-type="bibr">Kel
<italic>et al.</italic>
2003</xref>
;
<xref rid="bib4" ref-type="bibr">Bina
<italic>et al.</italic>
2004</xref>
;
<xref rid="bib10" ref-type="bibr">FitzGerald
<italic>et al.</italic>
2004</xref>
;
<xref rid="bib26" ref-type="bibr">Marino-Ramirez
<italic>et al.</italic>
2004</xref>
;
<xref rid="bib28" ref-type="bibr">Matys
<italic>et al.</italic>
2006</xref>
;
<xref rid="bib32" ref-type="bibr">Pachkov
<italic>et al.</italic>
2007</xref>
;
<xref rid="bib18" ref-type="bibr">Ji
<italic>et al.</italic>
2008</xref>
;
<xref rid="bib23" ref-type="bibr">Kharchenko
<italic>et al.</italic>
2008</xref>
;
<xref rid="bib39" ref-type="bibr">Portales-Casamar
<italic>et al.</italic>
2010</xref>
;
<xref rid="bib30" ref-type="bibr">Oh
<italic>et al.</italic>
2011</xref>
;
<xref rid="bib49" ref-type="bibr">Vinson
<italic>et al.</italic>
2011</xref>
). Comparison of related mammals has also identified many conserved DNA motifs in promoters, suggesting that they may be TFBS, whereas 3′UTRs contain conserved sequences thought to be microRNA target sites (
<xref rid="bib53" ref-type="bibr">Xie
<italic>et al.</italic>
2005</xref>
).</p>
<p>In an earlier study, we identified 8-bps long DNA sequences (8-mers) that are localized in human proximal promoters (
<xref rid="bib10" ref-type="bibr">FitzGerald
<italic>et al.</italic>
2004</xref>
) and Drosophila promoters (
<xref rid="bib11" ref-type="bibr">FitzGerald
<italic>et al.</italic>
2006</xref>
), and we presented evidence that motifs near the TSS are biologically functional. In human promoters, these sequences were grouped into known TFBS, including SP1, CCAAT, ETS, E-Box, CRE, Box A, NRF1, and TATA. Analysis of promoters using DNA sequence conservation among related mammals greatly enhanced the identification of regulatory motifs (
<xref rid="bib53" ref-type="bibr">Xie
<italic>et al.</italic>
2005</xref>
).</p>
<p>To identify additional biologically important DNA sequences in human proximal promoters, we analyzed the distribution of discontinuous 8-mers, also called split 8-mers (
<xref rid="bib49" ref-type="bibr">Vinson
<italic>et al.</italic>
2011</xref>
). Each split 8-mer is composed of two 4-mers separated by 1-bp to 30-bps. If each 4-mer represents a part of a TFBS, this calculation would identify pairs of TFBS that co-occur in the same proximal promoter as observed in other mammalian promoters (
<xref rid="bib10" ref-type="bibr">FitzGerald
<italic>et al.</italic>
2004</xref>
). Split 8-mer enrichment in promoters declines with increasing distance between the two 4-mers. In contrast, Drosophila contains many split 8-mers in which the 4-mers are separated by 20-bps to 30-bps that localize in promoters (
<xref rid="bib49" ref-type="bibr">Vinson
<italic>et al.</italic>
2011</xref>
).</p>
<p>This article examines the split 8-mers that localize in human promoters. We extended our previous work with split 8-mers in human promoters (
<xref rid="bib49" ref-type="bibr">Vinson
<italic>et al.</italic>
2011</xref>
) by evaluating whether the split 8-mers that localize in promoters have a preferred distance between the two 4-mers. This analysis identified an ETS motif overlapping with a CRE motif (ETS⇔CRE) that localizes in proximal promoters. DNA binding experiments show that GABPα and CREB preferentially bind the two TFBS when they overlap and produce the ETS⇔CRE motif enriched in proximal promoters.</p>
<sec sec-type="materials|methods" id="s1">
<title>Materials and Methods</title>
<sec id="s2">
<title>Dataset generation</title>
<p>From the University of California, Santa Cruz Genome Bioinformatics website (
<ext-link ext-link-type="uri" xlink:href="http://genome.ucsc.edu/">http://genome.ucsc.edu/</ext-link>
), we obtained the DNA sequence data for RefSeq genes in the Golden Path Human Genome Assembly with annotated TSS, representing sequences from –1,000 bp to +500 bp relative to the TSS. The initial dataset contained 26,431 promoters. The set was further processed to improve the relevance and validity of the analysis using the following criteria. First, for promoters with 100% identical sequences, only one copy was kept (5483 promoters were removed). Second, promoters containing runs of unknown nucleotides (N) of at least 150 bps were removed (8 promoters). Third, promoters with duplicated RefSeq numbers were removed (411 promoters). Fourth, of the remaining 20,529 promoters, 18,451 were determined to have unique sequences, whereas 2078 promoters had duplicated sequences shared among themselves. Among these 2078 promoter sequences, 68 had more than 10 overlapping duplicated regions of at least 250 bps with other promoter sequences and were deleted from the analysis. One thousand five hundred thirty-five (1535) promoter sequences contained nearly identical sequences among themselves and comprised 701 unique groups (pairs in most cases); only the 701 “representative” promoters were kept for the analysis. An additional 475 promoters were kept for the analysis, although they did have some mixed overlapping sequences. This allowed us to retain only 1176 (701 + 475) of these 2078 promoters. Fifth, two thousand four hundred eighty-four (2484) promoters had the start of the coding sequence (translational start site) within 30-bps of the TSS, and these promoters were excluded from the following analysis. Finally, a set of 17,143 promoters (18,451 + 1,176 − 2,484) was obtained and considered for the analysis.</p>
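<p>The bookkeeping in this paragraph can be checked with a few lines of arithmetic. The following is a minimal sketch in Python that uses only the counts quoted above; the variable names are illustrative and are not taken from the original analysis.</p>
<preformat>
```python
# Promoter counts quoted in the text; variable names are illustrative only.
initial = 26431
removed_identical = 5483         # first criterion: 100% identical sequences
removed_unknown_n = 8            # second criterion: N runs of at least 150 bps
removed_duplicate_refseq = 411   # third criterion: duplicated RefSeq numbers

remaining = initial - removed_identical - removed_unknown_n - removed_duplicate_refseq
unique = 18451                   # promoters with unique sequences
shared = remaining - unique      # promoters sharing duplicated sequence

deleted = 68                     # more than 10 overlapping duplicated regions
grouped = 1535                   # collapsed into representative promoters
representatives = 701
kept_mixed = 475                 # kept despite mixed overlapping sequences
retained = representatives + kept_mixed

excluded_near_cds = 2484         # fifth criterion: coding start within 30 bps of TSS
final = unique + retained - excluded_near_cds

print(remaining, shared, retained, final)
```
</preformat>
<p>Under these counts the intermediate totals are 20,529 remaining promoters, 2078 with shared sequence, and 1176 retained, giving the final set of 17,143.</p>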
</sec>
<sec id="s3">
<title>Analysis of split 8-mers distributions</title>
<p>There are 4
<sup>8</sup>
discontinuous non-degenerate 8-mers (X
<sub>4</sub>
-N
<italic>
<sub>k</sub>
</italic>
-X
<sub>4</sub>
; N denotes any arbitrary nucleotides and
<italic>k</italic>
denotes spacing between two 4-mers), and of these,
<italic>ξ</italic>
4
<sup>4</sup>
are palindromes and
<inline-formula>
<mml:math id="me1">
<mml:mrow>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:msup>
<mml:mn>4</mml:mn>
<mml:mn>8</mml:mn>
</mml:msup>
<mml:mo>−</mml:mo>
<mml:mi>ξ</mml:mi>
<mml:msup>
<mml:mn>4</mml:mn>
<mml:mn>4</mml:mn>
</mml:msup>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:math>
</inline-formula>
are non-palindromes, where each sequence and its complement is represented and
<italic>ξ</italic>
= 1 if
<italic>k</italic>
is even and 0 if odd. Thus, the number of 8-mers can be reduced to
<inline-formula>
<mml:math id="me2">
<mml:mrow>
<mml:mrow>
<mml:mrow>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:msup>
<mml:mn>4</mml:mn>
<mml:mn>8</mml:mn>
</mml:msup>
<mml:mo>−</mml:mo>
<mml:mi>ξ</mml:mi>
<mml:msup>
<mml:mn>4</mml:mn>
<mml:mn>4</mml:mn>
</mml:msup>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mo>/</mml:mo>
<mml:mn>2</mml:mn>
</mml:mrow>
<mml:mo>+</mml:mo>
<mml:mi>ξ</mml:mi>
<mml:msup>
<mml:mn>4</mml:mn>
<mml:mn>4</mml:mn>
</mml:msup>
<mml:mo>=</mml:mo>
<mml:msup>
<mml:mn>4</mml:mn>
<mml:mn>4</mml:mn>
</mml:msup>
<mml:mfrac>
<mml:mrow>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:msup>
<mml:mn>4</mml:mn>
<mml:mn>4</mml:mn>
</mml:msup>
<mml:mo>+</mml:mo>
<mml:mi>ξ</mml:mi>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mn>2</mml:mn>
</mml:mfrac>
</mml:mrow>
</mml:math>
</inline-formula>
. These 32,896 (even
<italic>k</italic>
) or 32,768 (odd
<italic>k</italic>
) 8-mers were generated automatically by a custom program. The promoter set was searched for each of them, and the final distributions were generated. To analyze the data, we divided the 1500-bps region into 75 bins of 20-bps each, numbered from bin 1 [–1000 bp; –981 bp] to bin 75 [+481 bp; +500 bp]. We determined the number of times the first nucleotide of a studied DNA sequence (or the last of its complement) occurred within each 20-bps bin. To detect and quantify non-uniform distributions (localization) and the probability of non-uniformity of split 8-mers, we determined the localization factor (LF) and
<italic>P</italic>
-value as described previously (
<xref rid="bib10" ref-type="bibr">FitzGerald
<italic>et al.</italic>
2004</xref>
;
<xref rid="bib49" ref-type="bibr">Vinson
<italic>et al.</italic>
2011</xref>
).</p>
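<p>The counting formula and the binning scheme above can be sketched as follows. This is a minimal illustration in Python, not the custom program used in the study. The function names are hypothetical, and the absence of a position 0 is an assumption inferred from the stated bin boundaries (bin 1 starting at –1000 and bin 75 ending at +500).</p>
<preformat>
```python
BIN_WIDTH = 20
UPSTREAM = 1000     # positions -1000 .. -1
DOWNSTREAM = 500    # positions +1 .. +500 (assumed: no position 0)
N_BINS = (UPSTREAM + DOWNSTREAM) // BIN_WIDTH


def reduced_count(k):
    """Evaluate the closed-form count of split 8-mers for spacing k."""
    xi = 1 if k % 2 == 0 else 0          # xi = 1 if k is even, 0 if odd
    return 4**4 * (4**4 + xi) // 2


def bin_index(pos):
    """Map a TSS-relative position to its bin number (1 to 75)."""
    if pos == 0 or pos not in range(-UPSTREAM, DOWNSTREAM + 1):
        raise ValueError("position must lie in -1000..-1 or +1..+500")
    # Shift so that -1000 maps to offset 0 and +1 follows -1 directly.
    offset = pos + UPSTREAM - (1 if pos > 0 else 0)
    return offset // BIN_WIDTH + 1


def bin_counts(positions):
    """Count motif start positions per bin."""
    counts = [0] * N_BINS
    for pos in positions:
        counts[bin_index(pos) - 1] += 1
    return counts


print(reduced_count(2), reduced_count(3), bin_index(-1000), bin_index(500))
```
</preformat>
<p>The per-bin counts returned by the hypothetical helper bin_counts would be the input to the LF and P-value calculations, which are defined in the cited earlier work and are not reproduced here.</p>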
</sec>
<sec id="s4">
<title>Molecular modeling</title>
<p>The molecular model of the ETS and CREB dimer interacting with a single chain DNA with a specific base pair sequence of CCGGAAGTGACGTCA was built by using two PDB structures, the ETS-1 protein bound to an ETS site (PDB ID: 1K79) (
<xref rid="bib13" ref-type="bibr">Garvie
<italic>et al.</italic>
2001</xref>
) and the CREB dimer bound to the CRE (PDB ID: 1DH3) (
<xref rid="bib44" ref-type="bibr">Schumacher
<italic>et al.</italic>
2000</xref>
). The 10 nucleotides (shown underlined) of the E chain of the DNA (TAGTG
<underline>CCGGAA</underline>
<bold>
<underline>ATG</underline>
</bold>
<underline>T</underline>
) of 1K79 were aligned to the 10 nucleotides (shown underlined) in the B chain of the DNA (
<underline>CCTTGG</underline>
<bold>
<underline>CTG</underline>
</bold>
<underline>A</underline>
CGTCAGCCAAG) of 1DH3, using Chimera visualization software (
<xref rid="bib36" ref-type="bibr">Pettersen
<italic>et al.</italic>
2004</xref>
). This alignment also results in the nucleotides ATG (shown in bold) of 1K79 aligning with the nucleotides CTG (shown in bold) of 1DH3. The ETS-1 protein and the complementary strand (F chain) of DNA of 1K79 were carried along with the E chain of its DNA during this alignment. From these aligned structures, the first 10 nucleotides (CCTTGGCTGA) and their base pairs in the complementary chain of the 1DH3 structure were deleted. The remaining chains containing the nucleotides TAGTGCCGGAAATGT of 1K79 and the nucleotides CGTCAGCCAAG of 1DH3 were covalently linked to one another using Chimera software to form one long chain of DNA with the sequence TAGTG
<underline>CCGGAA</underline>
<bold>
<underline>A</underline>
</bold>
<underline>TG</underline>
<bold>
<underline>T</underline>
</bold>
<underline>CGTCA</underline>
GCCAAG. Similarly, its complementary DNA chain was also built. The 12
<sup>th</sup>
and 15
<sup>th</sup>
bases in this long chain (shown in bold) were mutated to G and A bases, respectively, and the final complex containing this long DNA and the ETS and CRE was subjected to an energy minimization using the Discovery Studio (Accelrys Software) molecular modeling software.</p>
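<p>The sequence bookkeeping of this construction (joining the two DNA fragments and mutating the 12th and 15th bases) can be verified at the string level. The sketch below is in Python and covers only the sequences, not the structural modeling done in Chimera or Discovery Studio; the helper name mutate is hypothetical.</p>
<preformat>
```python
ETS_FRAGMENT = "TAGTGCCGGAAATGT"   # E chain of 1K79, as quoted in the text
CRE_FRAGMENT = "CGTCAGCCAAG"       # remainder of the 1DH3 B chain after deletion
TARGET = "CCGGAAGTGACGTCA"         # ETS and CRE sequence stated for the model


def mutate(seq, position, base):
    """Replace the base at a 1-based position."""
    return seq[:position - 1] + base + seq[position:]


joined = ETS_FRAGMENT + CRE_FRAGMENT
final = mutate(mutate(joined, 12, "G"), 15, "A")

print(joined)
print(final)
print(TARGET in final)
```
</preformat>
<p>With these inputs, the mutated chain reads TAGTGCCGGAAGTGACGTCAGCCAAG, which contains the modeled sequence CCGGAAGTGACGTCA.</p>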
</sec>
<sec id="s5">
<title>Electrophoretic mobility shift assay (EMSA)</title>
<p>EMSA was performed essentially as described previously (
<xref rid="bib40" ref-type="bibr">Rishi
<italic>et al.</italic>
2010</xref>
). GABPα and CREB proteins were
<italic>in vitro</italic>
translated using PURExpress
<italic>In Vitro</italic>
Protein Synthesis Kit (New England Biolabs, USA) according to the manufacturer's instructions. The T7 expression plasmids containing the DNA binding domain of GABPα (
<xref rid="bib55" ref-type="bibr">Badis
<italic>et al.</italic>
2009</xref>
) or the B-ZIP domain of CREB (
<xref rid="bib1" ref-type="bibr">Ahn
<italic>et al.</italic>
1998</xref>
) were used as the template DNA. GABPα has a GST tag at the N-terminus. Protein concentrations were estimated by Western blot, using purified GST-CREB or CREB of known concentration as standards.
<italic>In vitro</italic>
translated proteins were mixed with 7 pM
<sup>32</sup>
P end-labeled double-stranded oligonucleotides containing variants of ETS and CREB binding sites in the gel shift buffer (0.5 mg/ml BSA, 10% glycerol, 2.5 mM DTT, 12.5 mM K
<sub>2</sub>
HPO
<sub>4</sub>
-KH
<sub>2</sub>
PO
<sub>4</sub>
, pH 7.4, 0.25 mM EDTA). The final volume of the reaction was adjusted to 20 µl. For regular EMSA, the reactions were incubated at 37° for 20 min, followed by cooling at room temperature for 5 min before loading. For supershift experiments, the reactions were first incubated at 37° for 20 min without antibodies. Antibodies (catalog # sc-186, sc-459, or sc-2027, Santa Cruz Biotechnology, USA) were then added, and the reactions were incubated on ice for 30 min, followed by incubation at room temperature for 15 min before loading. 10 µl samples were resolved on 7.5% PAGE at 150 V for 1.5 hr in the 1x TBE buffer (25 mM Tris-boric acid, 0.5 mM EDTA). Sequences of oligonucleotides used for EMSA experiments are listed in
<xref ref-type="table" rid="t1">Table 1</xref>
. For EMSA using ETV5 and CREB, we used purified proteins containing the DNA binding domain of ETV5 or the B-ZIP domain of CREB.</p>
<table-wrap id="t1" position="float">
<label>Table 1</label>
<caption>
<title>DNA probe sequences for EMSA (binding sites underlined)</title>
</caption>
<table frame="hsides" rules="groups">
<col width="32.74%" span="1"></col>
<col width="67.26%" span="1"></col>
<thead>
<tr>
<th valign="top" align="center" scope="col" rowspan="1" colspan="1">Probe</th>
<th valign="top" align="center" scope="col" rowspan="1" colspan="1">Sequence (5′ to 3′)</th>
</tr>
</thead>
<tbody>
<tr>
<td valign="top" align="left" scope="row" rowspan="1" colspan="1">ETS⇔CRE</td>
<td valign="top" align="center" rowspan="1" colspan="1">
<monospace>GTCAGTCAGA
<underline>CCGGAAGTGACGTCA</underline>
TATCGGTCAG</monospace>
</td>
</tr>
<tr>
<td valign="top" align="left" scope="row" rowspan="1" colspan="1">ETS-1−CRE</td>
<td valign="top" align="center" rowspan="1" colspan="1">
<monospace>GTCAGTCAGA
<underline>CCGGAATGACGTCA</underline>
TATCGGTCAGT</monospace>
</td>
</tr>
<tr>
<td valign="top" align="left" scope="row" rowspan="1" colspan="1">ETS+1−CRE</td>
<td valign="top" align="center" rowspan="1" colspan="1">
<monospace>TCAGTCAGA
<underline>CCGGAAGTTGACGTCA</underline>
TATCGGTCAG</monospace>
</td>
</tr>
<tr>
<td valign="top" align="left" scope="row" rowspan="1" colspan="1">ETS+2−CRE</td>
<td valign="top" align="center" rowspan="1" colspan="1">
<monospace>TCAGTCAGA
<underline>CCGGAAGTGTGACGTCA</underline>
TATCGGTCA</monospace>
</td>
</tr>
<tr>
<td valign="top" align="left" scope="row" rowspan="1" colspan="1">ETS+3−CRE</td>
<td valign="top" align="center" rowspan="1" colspan="1">
<monospace>CAGTCAGA
<underline>CCGGAAGTGGTGACGTCA</underline>
TATCGGTCA</monospace>
</td>
</tr>
<tr>
<td valign="top" align="left" scope="row" rowspan="1" colspan="1">ETSm−CRE</td>
<td valign="top" align="center" rowspan="1" colspan="1">
<monospace>GTCAGTCAGA
<underline>GGCCAAGTGACGTCA</underline>
TATCGGTCAG</monospace>
</td>
</tr>
<tr>
<td valign="top" align="left" scope="row" rowspan="1" colspan="1">ETS−CREm</td>
<td valign="top" align="center" rowspan="1" colspan="1">
<monospace>GTCAGTCAGA
<underline>CCGGAAGTGTGCACA</underline>
TATCGGTCAG</monospace>
</td>
</tr>
<tr>
<td valign="top" align="left" scope="row" rowspan="1" colspan="1">ETSm−CREm</td>
<td valign="top" align="center" rowspan="1" colspan="1">
<monospace>GTCAGTCAGA
<underline>GGCCAAGTGTGCACA</underline>
TATCGGTCAG</monospace>
</td>
</tr>
</tbody>
</table>
</table-wrap>
</sec>
<sec id="s6">
<title>Motif enrichment using ChIP-seq peaks</title>
<p>For motif analysis, we used 6442 published GABPα ChIP-seq peaks from the human Jurkat cell line (
<xref rid="bib48" ref-type="bibr">Valouev
<italic>et al.</italic>
2008</xref>
) and 3998 CREB ChIP-seq peaks from mouse GC1 cells (
<xref rid="bib27" ref-type="bibr">Martianov
<italic>et al.</italic>
2010</xref>
). For motif detection, we used MEME (
<xref rid="bib25" ref-type="bibr">Machanick & Bailey 2011</xref>
) and the peak-motifs package of the Regulatory Sequence Analysis Tools (RSAT) (
<xref rid="bib46" ref-type="bibr">Thomas-Chollier
<italic>et al.</italic>
2011</xref>
). Two thousand eight hundred thirty-four (2834) CREB-binding promoters, obtained from ChIP-chip data on human HEK293T cells at three time points (
<xref rid="bib54" ref-type="bibr">Zhang
<italic>et al.</italic>
2005</xref>
), were mapped to the human genome (hg18), which yielded 2384 successfully mapped promoters bound by CREB. For
<italic>de novo</italic>
motif prediction, we used 1463 common binding regions of human CREB ChIP-chip and GABPα ChIP-seq data.</p>
</sec>
<sec id="s7">
<title>PhyloP conservation</title>
<p>Base-by-base PhyloP scores (
<italic>P</italic>
-values for conservation or acceleration, based on an alignment and a model of neutral evolution among 36 mammalian genomes) (
<xref rid="bib38" ref-type="bibr">Pollard
<italic>et al.</italic>
2010</xref>
) were downloaded from the UCSC database (
<ext-link ext-link-type="uri" xlink:href="http://genome.ucsc.edu/">http://genome.ucsc.edu/</ext-link>
). PhyloP scores for each nucleotide in the motif, including 15-bps upstream and 15-bps downstream of each occurrence in the genome, were averaged for all occurrences of each motif.</p>
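<p>The averaging step can be sketched as a position-wise mean over equally sized score windows, one window per motif occurrence. This is a minimal illustration in Python; the function names are hypothetical, and the scores in the usage line are made-up toy values rather than PhyloP data.</p>
<preformat>
```python
FLANK = 15   # bps included on each side of the motif, as stated in the text


def window_width(motif_length):
    """Width of one score window: the motif plus both flanks."""
    return motif_length + 2 * FLANK


def mean_profile(windows):
    """Average scores position by position across all occurrences."""
    if not windows:
        raise ValueError("at least one occurrence is required")
    width = len(windows[0])
    if any(len(window) != width for window in windows):
        raise ValueError("all windows must have the same width")
    return [sum(column) / len(windows) for column in zip(*windows)]


# Toy values for illustration only (three positions, two occurrences).
print(mean_profile([[1.0, 2.0, 3.0], [3.0, 2.0, 1.0]]))
print(window_width(11))
```
</preformat>
<p>For an 11-mer, each window would span 41 positions under this scheme.</p>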
</sec>
<sec id="s8">
<title>Gene Ontology analysis</title>
<p>Gene Ontology (GO) analysis was performed using DAVID (
<ext-link ext-link-type="uri" xlink:href="http://david.abcc.ncifcrf.gov/">http://david.abcc.ncifcrf.gov/</ext-link>
). GO terms with
<italic>P</italic>
-values < 0.01 were considered significantly enriched. Additionally, Benjamini-Hochberg corrected
<italic>P</italic>
-values < 0.01 were considered for the analysis with
<italic>in vivo</italic>
ChIP data.</p>
</sec>
</sec>
<sec sec-type="results" id="s9">
<title>Results</title>
<sec id="s10">
<title>Split 8-mers that localize in human proximal promoters</title>
<p>We aligned human promoters relative to the TSS and determined the distribution of split 8-mers in the promoter region. The split 8-mers consist of two 4-mers separated by 1-bp to 30-bps (X
<sub>4</sub>
-N
<sub>1-30</sub>
-X
<sub>4</sub>
). We considered the promoter region from −1000-bps to +500-bps relative to the TSS and divided the 1500-bp region into 75 bins of 20-bps each. We used a human DNA promoter sequence set obtained from UCSC and removed promoters containing repetitive sequences, resulting in a set of 17,143 promoter sequences (see
<italic>Materials and Methods</italic>
). The distribution of each split 8-mer in promoters was determined and a measure of non-uniform distribution termed “localization factor” (LF) was calculated (
<xref rid="bib49" ref-type="bibr">Vinson
<italic>et al.</italic>
2011</xref>
). The statistical significance of the non-random distribution of LF was determined by calculating a probability value (
<italic>P</italic>
-value) for each split 8-mer.</p>
<p>Many continuous 8-mers (X
<sub>4</sub>
-N
<sub>0</sub>
-X
<sub>4</sub>
) are enriched in proximal promoters (−120-bps to the TSS) (
<ext-link ext-link-type="uri" xlink:href="http://www.g3journal.org/lookup/suppl/doi:10.1534/g3.112.004002/-/DC1/004002SI.pdf">supporting information</ext-link>
,
<ext-link ext-link-type="uri" xlink:href="http://www.g3journal.org/lookup/suppl/doi:10.1534/g3.112.004002/-/DC1/FigureS1.pdf">Figure S1</ext-link>
, A and B, and
<ext-link ext-link-type="uri" xlink:href="http://www.g3journal.org/lookup/suppl/doi:10.1534/g3.112.004002/-/DC1/TableS1.pdf">Table S1</ext-link>
) (
<xref rid="bib10" ref-type="bibr">FitzGerald
<italic>et al.</italic>
2004</xref>
,
<xref rid="bib11" ref-type="bibr">2006</xref>
;
<xref rid="bib53" ref-type="bibr">Xie
<italic>et al.</italic>
2005</xref>
;
<xref rid="bib49" ref-type="bibr">Vinson
<italic>et al.</italic>
2011</xref>
). In contrast, fewer split 8-mers with an insert length of 4-bps (X
<sub>4</sub>
-N
<sub>4</sub>
-X
<sub>4</sub>
) localize in proximal promoters (
<xref rid="bib10" ref-type="bibr">FitzGerald
<italic>et al.</italic>
2004</xref>
;
<xref rid="bib49" ref-type="bibr">Vinson
<italic>et al.</italic>
2011</xref>
) (
<xref ref-type="fig" rid="fig1">Figure 1, A and B</xref>
). As insert length increases, preferential localization of split 8-mers in the proximal promoter decreases for both CG- and non-CG split 8-mers and is much more pronounced for the non-CG 8-mers (
<xref ref-type="fig" rid="fig1">Figure 1, C and D</xref>
).</p>
<fig id="fig1" fig-type="figure" position="float">
<label>Figure 1 </label>
<caption>
<p>(A and B) LF and probability for split 8-mers with a 4-bp insert (X
<sub>4</sub>
-N
<sub>4</sub>
-X
<sub>4</sub>
). (C and D) For each 8-mer (X
<sub>4</sub>
-N
<sub>0-30</sub>
-X
<sub>4</sub>
), we determine which insert length produced the largest LF and plot that value in the column representing that insert length. (C) LF for the 12,547 continuous 8-mers and 10,951 split 8-mers containing the CG dinucleotide. We plot that –log
<italic>P</italic>
-value at the insert length with the highest LF. (D) Same as (C) but for all non-CG containing 8-mers, the 20,349 continuous 8-mers, and 21,945 split 8-mers with insert length from 1-bp to 30-bps.</p>
</caption>
<graphic xlink:href="1243f1"></graphic>
</fig>
<p>The most localizing split 8-mer sequences with an insert length of 1-bps and 2-bps both represent the CRE motif (
<xref ref-type="fig" rid="fig1">Figure 1D</xref>
and
<ext-link ext-link-type="uri" xlink:href="http://www.g3journal.org/lookup/suppl/doi:10.1534/g3.112.004002/-/DC1/TableS1.pdf">Table S1</ext-link>
), suggesting that the CRE is 10-bps long (GTGACGTCAC). The most localizing sequences with a 3-bps or a 4-bps insert are each a CG-rich 4-mer followed by TATA (CCGG-N
<sub>3</sub>
-TATA and GCCG-N
<sub>4</sub>
-TATA), sequences previously shown to function in proximal promoters (
<xref rid="bib24" ref-type="bibr">Lagrange
<italic>et al.</italic>
1998</xref>
). These split 8-mers are not strand specific, indicating that the CG-rich 4-mer can be either before or after the strand-specific TATAA (
<xref rid="bib10" ref-type="bibr">FitzGerald
<italic>et al.</italic>
2004</xref>
). Virtually all the localizing split 8-mers with an insert length of 5-bps or more contain the CG dinucleotide (
<xref ref-type="fig" rid="fig1">Figure 1, C and D</xref>
). The 20 most localizing split 8-mers with insert length of 0-bps, 2-bps, 4-bps, and 5-bps to 30-bps are presented in
<ext-link ext-link-type="uri" xlink:href="http://www.g3journal.org/lookup/suppl/doi:10.1534/g3.112.004002/-/DC1/TableS1.pdf">Table S1</ext-link>
.</p>
</sec>
<sec id="s11">
<title>Split 8-mers that localize in promoters at a unique insert length</title>
<p>The split 8-mers that localize in proximal promoters were grouped into three classes (
<ext-link ext-link-type="uri" xlink:href="http://www.g3journal.org/lookup/suppl/doi:10.1534/g3.112.004002/-/DC1/TableS1.pdf">Table S1</ext-link>
): (i) split 8-mers with a short insert length of 1-bps or 2-bps representing a single TFBS (
<ext-link ext-link-type="uri" xlink:href="http://www.g3journal.org/lookup/suppl/doi:10.1534/g3.112.004002/-/DC1/FigureS2.pdf">Figure S2</ext-link>
, A–D); (ii) split 8-mers that localize in proximal promoters at many insert lengths representing co-localizing TFBS, each represented by a single 4-mer (
<ext-link ext-link-type="uri" xlink:href="http://www.g3journal.org/lookup/suppl/doi:10.1534/g3.112.004002/-/DC1/FigureS2.pdf">Figure S2</ext-link>
, E–H); and (iii) split 8-mers that localize in proximal promoters at a specific insert length. These include CGGA-N
<sub>4</sub>
-ACGT, which represents an ETS motif and a CRE motif, and unidentified sequences;
<italic>e.g.</italic>
GGGA-N
<sub>2</sub>
-TGTA (
<ext-link ext-link-type="uri" xlink:href="http://www.g3journal.org/lookup/suppl/doi:10.1534/g3.112.004002/-/DC1/FigureS2.pdf">Figure S2</ext-link>
, I and J).</p>
<p>To identify split 8-mers that localize in proximal promoters at only a precise insert length, the max LF for all split 8-mers with insert lengths from 0-bps to 30-bps (X
<sub>4</sub>
-N
<sub>0-30</sub>
-X
<sub>4</sub>
) was determined and compared with the ratio of max LF to the second highest LF (
<xref ref-type="fig" rid="fig2">Figure 2, A and B</xref>
). A ratio of max LF to the second highest LF close to 1 indicates that a split 8-mer localizes at various insert lengths, whereas a higher ratio indicates that it localizes at a precise insert length. Both kinds of sequences are observed for 8-mers with a high LF. To identify the insert length that produces the precisely positioned pairs of 4-mers, we examined each insert length. Continuous 8-mers (X
<sub>4</sub>
-N
<sub>0</sub>
-X
<sub>4</sub>
) have many sequences with a high LF and a large ratio [LF(Max)/LF(Max-1)]. These sequences are the previously described TFBS that localize in proximal promoters (
<xref rid="bib10" ref-type="bibr">FitzGerald
<italic>et al.</italic>
2004</xref>
). The two 4-mers (TGAC and GTCA) that create the CRE (TGACGTCA) motif preferentially localize in promoters when the insert length is 0-bps (
<xref ref-type="fig" rid="fig2">Figure 2D</xref>
and
<ext-link ext-link-type="uri" xlink:href="http://www.g3journal.org/lookup/suppl/doi:10.1534/g3.112.004002/-/DC1/FigureS2.pdf">Figure S2</ext-link>
, A and B). Similar results were obtained for the ETS motif (
<ext-link ext-link-type="uri" xlink:href="http://www.g3journal.org/lookup/suppl/doi:10.1534/g3.112.004002/-/DC1/FigureS2.pdf">Figure S2</ext-link>
, C and D). When we examined split 8-mers with an insert length of 2-bps, fewer 8-mers had both a high LF and ratio (
<xref ref-type="fig" rid="fig2">Figure 2, E and F</xref>
). These include GTGA-N
<sub>2</sub>
-TCAC, representing the CRE; CGGA-N
<sub>2</sub>
-TGAC, representing overlapping ETS and CRE TFBS (ETS⇔CRE) (CGGA
<italic>AG</italic>
TGAC); and GGAA-N
<sub>2</sub>
-GGAA, representing an ETS motif overlapping with a second ETS motif (ETS⇔ETS) (GGAA
<italic>GC</italic>
GGAA) (
<ext-link ext-link-type="uri" xlink:href="http://www.g3journal.org/lookup/suppl/doi:10.1534/g3.112.004002/-/DC1/TableS1.pdf">Table S1</ext-link>
and
<xref ref-type="fig" rid="fig2">Figure 2</xref>
). A systematic analysis of the human promoters using comparative genomics for the detection of regulatory motifs also identified an unannotated motif GGAANCGGAANY (
<xref rid="bib53" ref-type="bibr">Xie
<italic>et al.</italic>
2005</xref>
), which is essentially the ETS⇔ETS motif. An insert length of 4-bps produced even fewer precisely localized sequences (
<xref ref-type="fig" rid="fig2">Figure 2, G and H</xref>
). Insert lengths of 5-bps to 30-bps identified many 8-mers with a high LF but a low ratio, indicating that the two 4-mers co-occur in promoters at many insert lengths (
<xref ref-type="fig" rid="fig2">Figure 2, I and J</xref>
).</p>
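<p>The max LF and ratio criterion described above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the function name max_lf_ratio and the two LF profiles are hypothetical, the numbers are invented for the example, and the LF values are assumed to have been computed beforehand for each insert length from 0-bps to 30-bps.</p>

```python
def max_lf_ratio(lf_by_insert):
    """Return (insert length with max LF, max LF, max LF / second-highest LF).

    lf_by_insert maps an insert length (0..30) to a precomputed LF value.
    """
    if len(lf_by_insert) in (0, 1):
        raise ValueError("need LF values for at least two insert lengths")
    # Rank insert lengths from highest to lowest LF.
    ranked = sorted(lf_by_insert.items(), key=lambda kv: kv[1], reverse=True)
    (best_len, best_lf), (_, second_lf) = ranked[0], ranked[1]
    # Guard against division by zero when the runner-up LF is 0.
    ratio = best_lf / second_lf if second_lf > 0 else float("inf")
    return best_len, best_lf, ratio


# Hypothetical LF profiles (made-up numbers, not data from the paper).
precise = {n: 1.0 for n in range(31)}
precise[2] = 8.0   # localized at a single insert length
diffuse = {n: 4.0 for n in range(31)}
diffuse[7] = 4.4   # high LF at every insert length

print(max_lf_ratio(precise))  # (2, 8.0, 8.0)
print(max_lf_ratio(diffuse))  # ratio close to 1
```

<p>In this toy input, the first profile has a high LF and a high ratio (a precise insert length), whereas the second has a high LF and a ratio near 1 (co-occurrence at many insert lengths).</p>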
<fig id="fig2" fig-type="figure" position="float">
<label>Figure 2 </label>
<caption>
<p>Identification of split 8-mers that localize in promoters only at a unique insert length. (A) The maximum LF for all 8-mers (X
<sub>4</sub>
-N
<sub>0-30</sub>
-X
<sub>4</sub>
) is plotted on the horizontal axis
<italic>vs.</italic>
maximum LF for 8-mers with an insert length from 0-bps to 30-bps [LF(Max)] divided by the second highest LF [LF(Max-1)]. The points at the top right of the plot represent 8-mers that localize in promoters only at one insert length. (B) Probability (P) (
<italic>P</italic>
= 10
<sup>–x</sup>
) of the LF being non-random. To identify the insert length that produces unique localization in promoters, the horizontal axis shows the probability of LF for split 8-mers for specific insert lengths. (C and D) Localization of continuous 8-mers (X
<sub>4</sub>
-N
<sub>0</sub>
-X
<sub>4</sub>
) in proximal promoters only when the insert length is 0-bps. (E and F) Localization of split 8-mers with insert length of 2-bps (X
<sub>4</sub>
-N
<sub>2</sub>
-X
<sub>4</sub>
) in proximal promoters only when the insert length is 2-bps. (G and H) Localization of split 8-mers with insert length of 4-bps (X
<sub>4</sub>
-N
<sub>4</sub>
-X
<sub>4</sub>
) in proximal promoters only when the insert length is 4-bps. (I and J) Unique localization of split 8-mers with insert length ranging from 5-bps to 30-bps (X
<sub>4</sub>
-N
<sub>5-30</sub>
-X
<sub>4</sub>
).</p>
</caption>
<graphic xlink:href="1243f2"></graphic>
</fig>
<p>This analysis identified many split 8-mers with distinctive distributions; we focused our analysis on the overlapping ETS and CRE motifs. The distribution of the ETS⇔CRE motif split 8-mer CGGA-N
<sub>4</sub>
-ACGT shows localization in proximal promoters (
<xref ref-type="fig" rid="fig3">Figure 3A</xref>
). The split 8-mer CGGA-N
<sub>0-30</sub>
-ACGT preferentially localizes in proximal promoters when separated by 4-bps, with the continuous 12-mer CGGAA
<bold>GTG</bold>
ACGT being the most strongly localized and the most abundant (
<xref ref-type="fig" rid="fig3">Figure 3, A and B</xref>
). More modest localization is observed at insert lengths of 20-bps and 22-bps, which we have not evaluated. The 12-mer contains both the ETS motif (CGGAA
<bold>GTG</bold>
) and the CRE motif (
<bold>GTG</bold>
ACGT). The
<bold>GTG</bold>
trinucleotide is common to both the ETS and CRE motifs. These TFBS overlap to produce the ETS⇔CRE motif. The full ETS⇔CRE motif would be the two 16-mers
<sup>C</sup>
/
<sub>G</sub>
CGGAA
<bold>GTG</bold>
ACGTCAC, which together occur five times in the human genome (
<xref ref-type="table" rid="t2">Table 2</xref>
). There are more than 4×10
<sup>9</sup>
possible 16-mers, and thus each 16-mer would be expected to occur by chance only about once in a vertebrate genome of ∼3×10
<sup>9</sup>
bps.</p>
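<p>The chance expectation quoted above can be checked with a short calculation. This is a rough sketch that assumes, as a simplification, that all four bases are equally frequent and independent at every position and that a single strand is counted; real genomes deviate from both assumptions.</p>

```python
K = 16
GENOME_BP = 3e9  # approximate genome size quoted in the text

# Number of distinct DNA sequences of length 16.
possible_kmers = 4 ** K

# Expected occurrences of one specific 16-mer on one strand,
# assuming uniform and independent base composition (a simplification).
expected_per_16mer = GENOME_BP / possible_kmers

print(possible_kmers)                # 4294967296
print(f"{expected_per_16mer:.2f}")   # 0.70
```

<p>Under these assumptions each 16-mer is expected about 0.7 times per strand, which is consistent with the statement that a given 16-mer should occur only about once by chance.</p>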
<fig id="fig3" fig-type="figure" position="float">
<label>Figure 3 </label>
<caption>
<p>The ETS⇔CRE motif. (A) Distribution of the split 8-mer CGGA-N
<sub>4</sub>
-ACGT and the 12-mer CGGAA
<bold>GTG</bold>
ACGT in human promoters. (B) LF for CGGA-N
<sub>4</sub>
-ACGT from insert size of 0-bps to 30-bps. (C) Space-filling model of ETS and CREB proteins binding to ETS⇔CRE (GCGGAA
<bold>GTG</bold>
ACGTCA). Note the 3-bp overlap of the two TFBS. (D and E) Ribbon presentation of ETS and CREB proteins binding to ETS⇔CRE motif from the side and top relative to DNA.</p>
</caption>
<graphic xlink:href="1243f3"></graphic>
</fig>
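<p>The split 8-mer notation X4-Nk-X4 used in Figure 3 can be illustrated with a small pattern search. This is a minimal sketch and not the authors' pipeline: the function name split_8mer_insert_counts and the toy sequence are hypothetical, only one strand is scanned, and the function returns raw counts per insert length rather than an LF.</p>

```python
import re
from collections import Counter


def split_8mer_insert_counts(seq, left="CGGA", right="ACGT", max_insert=30):
    """Count occurrences of left-N(k)-right in seq for k = 0..max_insert."""
    seq = seq.upper()
    counts = Counter()
    for n in range(max_insert + 1):
        # A lookahead is used so that overlapping occurrences are all counted.
        pattern = re.compile("(?=" + left + "[ACGT]{" + str(n) + "}" + right + ")")
        counts[n] = sum(1 for _ in pattern.finditer(seq))
    return counts


# Toy sequence containing the continuous 12-mer CGGAAGTGACGT once.
toy = "TTGCGGAAGTGACGTCATT"
counts = split_8mer_insert_counts(toy)
print({n: c for n, c in counts.items() if c})  # {4: 1}
```

<p>In the toy sequence the two 4-mers CGGA and ACGT are separated by exactly 4-bps, so only the count for an insert length of 4 is non-zero.</p>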
<table-wrap id="t2" position="float">
<label>Table 2</label>
<caption>
<title>Occurrence of specific motifs in human genome, promoters, CpG islands, and housekeeping DHS regions</title>
</caption>
<table frame="hsides" rules="groups">
<col width="9.51%" span="1"></col>
<col width="6.29%" span="1"></col>
<col width="19.14%" span="1"></col>
<col width="8.78%" span="1"></col>
<col width="10.22%" span="1"></col>
<col width="8.78%" span="1"></col>
<col width="9.5%" span="1"></col>
<col width="8.78%" span="1"></col>
<col width="9.5%" span="1"></col>
<col width="9.5%" span="1"></col>
<thead>
<tr>
<th valign="top" align="center" scope="col" rowspan="1" colspan="1"></th>
<th valign="top" align="center" scope="col" rowspan="1" colspan="1"></th>
<th valign="top" align="center" scope="col" rowspan="1" colspan="1"></th>
<th valign="top" align="center" scope="col" rowspan="1" colspan="1">Whole Genome</th>
<th valign="top" align="center" scope="col" rowspan="1" colspan="1">Promoter</th>
<th valign="top" align="center" scope="col" rowspan="1" colspan="1">Proximal Promoter</th>
<th valign="top" align="center" scope="col" rowspan="1" colspan="1">CpG Islands</th>
<th valign="top" align="center" scope="col" rowspan="1" colspan="1">Housekeeping DHS</th>
<th valign="top" align="center" scope="col" rowspan="1" colspan="1">All DHS</th>
<th valign="top" align="center" scope="col" rowspan="1" colspan="1">Tissue-specific DHS</th>
</tr>
<tr>
<th valign="top" align="left" scope="col" rowspan="1" colspan="1">Motifs</th>
<th valign="top" align="left" scope="col" rowspan="1" colspan="1">N-mers</th>
<th valign="top" align="left" scope="col" rowspan="1" colspan="1">DNA Sequence</th>
<th valign="top" align="center" scope="col" rowspan="1" colspan="1"># Unmasked (100%)</th>
<th valign="top" align="center" scope="col" rowspan="1" colspan="1">(−1000…500) (0.8%)</th>
<th valign="top" align="center" scope="col" rowspan="1" colspan="1">(−200…60) (0.1%)</th>
<th valign="top" align="center" scope="col" rowspan="1" colspan="1">(0.7%)</th>
<th valign="top" align="center" scope="col" rowspan="1" colspan="1">(0.2%)</th>
<th valign="top" align="center" scope="col" rowspan="1" colspan="1">(8.9%)</th>
<th valign="top" align="center" scope="col" rowspan="1" colspan="1">(8.7%)</th>
</tr>
</thead>
<tbody>
<tr>
<td valign="top" align="left" scope="row" rowspan="1" colspan="1">CRE</td>
<td valign="top" align="center" rowspan="1" colspan="1">8-mer</td>
<td valign="top" align="center" rowspan="1" colspan="1">
<monospace>TGACGTCA</monospace>
</td>
<td valign="top" align="char" char="." rowspan="1" colspan="1">10,355</td>
<td valign="top" align="center" rowspan="1" colspan="1">713 (7%)</td>
<td valign="top" align="char" char="(" rowspan="1" colspan="1">431 (4%)</td>
<td valign="top" align="char" char="(" rowspan="1" colspan="1">757 (7%)</td>
<td valign="top" align="char" char="(" rowspan="1" colspan="1">458 (4%)</td>
<td valign="top" align="char" char="(" rowspan="1" colspan="1">3,110 (30%)</td>
<td valign="top" align="char" char="(" rowspan="1" colspan="1">2,652 (26%)</td>
</tr>
<tr>
<td valign="top" align="left" scope="row" rowspan="1" colspan="1">CRE</td>
<td valign="top" align="center" rowspan="1" colspan="1">9-mer</td>
<td valign="top" align="center" rowspan="1" colspan="1">
<monospace>GTGACGTCA</monospace>
</td>
<td valign="top" align="char" char="." rowspan="1" colspan="1">4,301</td>
<td valign="top" align="center" rowspan="1" colspan="1">654 (15%)</td>
<td valign="top" align="char" char="(" rowspan="1" colspan="1">427 (10%)</td>
<td valign="top" align="char" char="(" rowspan="1" colspan="1">772 (18%)</td>
<td valign="top" align="char" char="(" rowspan="1" colspan="1">449 (10%)</td>
<td valign="top" align="char" char="(" rowspan="1" colspan="1">1,890 (44%)</td>
<td valign="top" align="char" char="(" rowspan="1" colspan="1">1,441 (34%)</td>
</tr>
<tr>
<td valign="top" align="left" scope="row" rowspan="1" colspan="1">CRE</td>
<td valign="top" align="center" rowspan="1" colspan="1">10-mer</td>
<td valign="top" align="center" rowspan="1" colspan="1">
<monospace>GTGACGTCAC</monospace>
</td>
<td valign="top" align="char" char="." rowspan="1" colspan="1">644</td>
<td valign="top" align="center" rowspan="1" colspan="1">167 (26%)</td>
<td valign="top" align="char" char="(" rowspan="1" colspan="1">116 (18%)</td>
<td valign="top" align="char" char="(" rowspan="1" colspan="1">217 (34%)</td>
<td valign="top" align="char" char="(" rowspan="1" colspan="1">117 (18%)</td>
<td valign="top" align="char" char="(" rowspan="1" colspan="1">356 (55%)</td>
<td valign="top" align="char" char="(" rowspan="1" colspan="1">239 (37%)</td>
</tr>
<tr>
<td valign="top" align="left" scope="row" rowspan="1" colspan="1">ETS</td>
<td valign="top" align="center" rowspan="1" colspan="1">8-mer</td>
<td valign="top" align="center" rowspan="1" colspan="1">
<monospace>CCGGAAGT</monospace>
</td>
<td valign="top" align="char" char="." rowspan="1" colspan="1">13,975</td>
<td valign="top" align="center" rowspan="1" colspan="1">1,654 (12%)</td>
<td valign="top" align="char" char="(" rowspan="1" colspan="1">1,030 (7%)</td>
<td valign="top" align="char" char="(" rowspan="1" colspan="1">1,611 (12%)</td>
<td valign="top" align="char" char="(" rowspan="1" colspan="1">1,136 (8%)</td>
<td valign="top" align="char" char="(" rowspan="1" colspan="1">4,384 (31%)</td>
<td valign="top" align="char" char="(" rowspan="1" colspan="1">3,248 (23%)</td>
</tr>
<tr>
<td valign="top" align="left" scope="row" rowspan="1" colspan="1">ETS</td>
<td valign="top" align="center" rowspan="1" colspan="1">8-mer</td>
<td valign="top" align="center" rowspan="1" colspan="1">
<monospace>CGGAAGTG</monospace>
</td>
<td valign="top" align="char" char="." rowspan="1" colspan="1">16,846</td>
<td valign="top" align="center" rowspan="1" colspan="1">1,631 (10%)</td>
<td valign="top" align="char" char="(" rowspan="1" colspan="1">980 (6%)</td>
<td valign="top" align="char" char="(" rowspan="1" colspan="1">1,761 (10%)</td>
<td valign="top" align="char" char="(" rowspan="1" colspan="1">1,073 (6%)</td>
<td valign="top" align="char" char="(" rowspan="1" colspan="1">5,068 (30%)</td>
<td valign="top" align="char" char="(" rowspan="1" colspan="1">3,997 (24%)</td>
</tr>
<tr>
<td valign="top" align="left" scope="row" rowspan="1" colspan="1">ETS</td>
<td valign="top" align="center" rowspan="1" colspan="1">9-mer</td>
<td valign="top" align="center" rowspan="1" colspan="1">
<monospace>CCGGAAGTG</monospace>
</td>
<td valign="top" align="char" char="." rowspan="1" colspan="1">4,913</td>
<td valign="top" align="center" rowspan="1" colspan="1">868 (18%)</td>
<td valign="top" align="char" char="(" rowspan="1" colspan="1">568 (12%)</td>
<td valign="top" align="char" char="(" rowspan="1" colspan="1">852 (17%)</td>
<td valign="top" align="char" char="(" rowspan="1" colspan="1">587 (12%)</td>
<td valign="top" align="char" char="(" rowspan="1" colspan="1">1,887 (38%)</td>
<td valign="top" align="char" char="(" rowspan="1" colspan="1">1,300 (26%)</td>
</tr>
<tr>
<td valign="top" align="left" scope="row" rowspan="1" colspan="1">ETS</td>
<td valign="top" align="center" rowspan="1" colspan="1">9-mer</td>
<td valign="top" align="center" rowspan="1" colspan="1">
<monospace>GCGGAAGTG</monospace>
</td>
<td valign="top" align="char" char="." rowspan="1" colspan="1">4,010</td>
<td valign="top" align="center" rowspan="1" colspan="1">469 (12%)</td>
<td valign="top" align="char" char="(" rowspan="1" colspan="1">282 (7%)</td>
<td valign="top" align="char" char="(" rowspan="1" colspan="1">563 (14%)</td>
<td valign="top" align="char" char="(" rowspan="1" colspan="1">316 (8%)</td>
<td valign="top" align="char" char="(" rowspan="1" colspan="1">1,370 (34%)</td>
<td valign="top" align="char" char="(" rowspan="1" colspan="1">1,054 (26%)</td>
</tr>
<tr>
<td valign="top" align="left" scope="row" rowspan="1" colspan="1">ETS</td>
<td valign="top" align="center" rowspan="1" colspan="1">9-mer</td>
<td valign="top" align="center" rowspan="1" colspan="1">
<monospace>CGGAAGTGA</monospace>
</td>
<td valign="top" align="char" char="." rowspan="1" colspan="1">4,675</td>
<td valign="top" align="center" rowspan="1" colspan="1">465 (10%)</td>
<td valign="top" align="char" char="(" rowspan="1" colspan="1">298 (6%)</td>
<td valign="top" align="char" char="(" rowspan="1" colspan="1">446 (10%)</td>
<td valign="top" align="char" char="(" rowspan="1" colspan="1">343 (7%)</td>
<td valign="top" align="char" char="(" rowspan="1" colspan="1">1,456 (31%)</td>
<td valign="top" align="char" char="(" rowspan="1" colspan="1">1,113 (24%)</td>
</tr>
<tr>
<td valign="top" align="left" scope="row" rowspan="1" colspan="1">ETS</td>
<td valign="top" align="center" rowspan="1" colspan="1">10-mer</td>
<td valign="top" align="center" rowspan="1" colspan="1">
<monospace>CGGAAGTGAC</monospace>
</td>
<td valign="top" align="char" char="." rowspan="1" colspan="1">1,030</td>
<td valign="top" align="center" rowspan="1" colspan="1">227 (22%)</td>
<td valign="top" align="char" char="(" rowspan="1" colspan="1">162 (16%)</td>
<td valign="top" align="char" char="(" rowspan="1" colspan="1">227 (22%)</td>
<td valign="top" align="char" char="(" rowspan="1" colspan="1">180 (17%)</td>
<td valign="top" align="center" rowspan="1" colspan="1">458 (44%)</td>
<td valign="top" align="char" char="(" rowspan="1" colspan="1">278 (27%)</td>
</tr>
<tr>
<td valign="top" align="left" scope="row" rowspan="1" colspan="1">ETS⇔CRE</td>
<td valign="top" align="center" rowspan="1" colspan="1">11-mer</td>
<td valign="top" align="center" rowspan="1" colspan="1">
<monospace>CGGAAGTGACG</monospace>
</td>
<td valign="top" align="char" char="." rowspan="1" colspan="1">226</td>
<td valign="top" align="center" rowspan="1" colspan="1">157 (69%)</td>
<td valign="top" align="char" char="(" rowspan="1" colspan="1">124 (55%)</td>
<td valign="top" align="char" char="(" rowspan="1" colspan="1">164 (73%)</td>
<td valign="top" align="char" char="(" rowspan="1" colspan="1">134 (59%)</td>
<td valign="top" align="center" rowspan="1" colspan="1">186 (82%)</td>
<td valign="top" align="char" char="(" rowspan="1" colspan="1">52 (23%)</td>
</tr>
<tr>
<td valign="top" align="left" scope="row" rowspan="1" colspan="1">ETS⇔?</td>
<td valign="top" align="center" rowspan="1" colspan="1">11-mer</td>
<td valign="top" align="center" rowspan="1" colspan="1">
<monospace>CGGAAGTGACA</monospace>
</td>
<td valign="top" align="char" char="." rowspan="1" colspan="1">335</td>
<td valign="top" align="center" rowspan="1" colspan="1">13 (4%)</td>
<td valign="top" align="char" char="(" rowspan="1" colspan="1">9 (3%)</td>
<td valign="top" align="char" char="(" rowspan="1" colspan="1">12 (4%)</td>
<td valign="top" align="char" char="(" rowspan="1" colspan="1">9 (3%)</td>
<td valign="top" align="center" rowspan="1" colspan="1">88 (26%)</td>
<td valign="top" align="char" char="(" rowspan="1" colspan="1">79 (24%)</td>
</tr>
<tr>
<td valign="top" align="left" scope="row" rowspan="1" colspan="1">ETS⇔?</td>
<td valign="top" align="center" rowspan="1" colspan="1">11-mer</td>
<td valign="top" align="center" rowspan="1" colspan="1">
<monospace>CGGAAGTGACC</monospace>
</td>
<td valign="top" align="char" char="." rowspan="1" colspan="1">197</td>
<td valign="top" align="center" rowspan="1" colspan="1">21 (11%)</td>
<td valign="top" align="char" char="(" rowspan="1" colspan="1">7 (4%)</td>
<td valign="top" align="char" char="(" rowspan="1" colspan="1">23 (12%)</td>
<td valign="top" align="char" char="(" rowspan="1" colspan="1">18 (9%)</td>
<td valign="top" align="center" rowspan="1" colspan="1">71 (36%)</td>
<td valign="top" align="char" char="(" rowspan="1" colspan="1">53 (27%)</td>
</tr>
<tr>
<td valign="top" align="left" scope="row" rowspan="1" colspan="1">ETS⇔AP1</td>
<td valign="top" align="center" rowspan="1" colspan="1">11-mer</td>
<td valign="top" align="center" rowspan="1" colspan="1">
<monospace>CGGAAGTGACT</monospace>
</td>
<td valign="top" align="char" char="." rowspan="1" colspan="1">267</td>
<td valign="top" align="center" rowspan="1" colspan="1">36 (13%)</td>
<td valign="top" align="char" char="(" rowspan="1" colspan="1">12 (4%)</td>
<td valign="top" align="char" char="(" rowspan="1" colspan="1">28 (10%)</td>
<td valign="top" align="char" char="(" rowspan="1" colspan="1">19 (7%)</td>
<td valign="top" align="center" rowspan="1" colspan="1">111 (42%)</td>
<td valign="top" align="char" char="(" rowspan="1" colspan="1">92 (34%)</td>
</tr>
<tr>
<td valign="top" align="left" scope="row" rowspan="1" colspan="1">ETS⇔AP1</td>
<td valign="top" align="center" rowspan="1" colspan="1">11-mer</td>
<td valign="top" align="center" rowspan="1" colspan="1">
<monospace>CGGAAGTGAGT</monospace>
</td>
<td valign="top" align="char" char="." rowspan="1" colspan="1">250</td>
<td valign="top" align="center" rowspan="1" colspan="1">20 (8%)</td>
<td valign="top" align="char" char="(" rowspan="1" colspan="1">11 (4%)</td>
<td valign="top" align="char" char="(" rowspan="1" colspan="1">19 (8%)</td>
<td valign="top" align="char" char="(" rowspan="1" colspan="1">12 (5%)</td>
<td valign="top" align="center" rowspan="1" colspan="1">91 (36%)</td>
<td valign="top" align="char" char="(" rowspan="1" colspan="1">79 (32%)</td>
</tr>
<tr>
<td valign="top" align="left" scope="row" rowspan="1" colspan="1">ETS⇔CRE</td>
<td valign="top" align="center" rowspan="1" colspan="1">12-mer</td>
<td valign="top" align="center" rowspan="1" colspan="1">
<monospace>CGGAAGTGACGT</monospace>
</td>
<td valign="top" align="char" char="." rowspan="1" colspan="1">93</td>
<td valign="top" align="center" rowspan="1" colspan="1">70 (75%)</td>
<td valign="top" align="char" char="(" rowspan="1" colspan="1">53 (57%)</td>
<td valign="top" align="char" char="(" rowspan="1" colspan="1">71 (76%)</td>
<td valign="top" align="char" char="(" rowspan="1" colspan="1">60 (65%)</td>
<td valign="top" align="center" rowspan="1" colspan="1">84 (90%)</td>
<td valign="top" align="char" char="(" rowspan="1" colspan="1">24 (26%)</td>
</tr>
<tr>
<td valign="top" align="left" scope="row" rowspan="1" colspan="1">ETS⇔CRE</td>
<td valign="top" align="center" rowspan="1" colspan="1">12-mer</td>
<td valign="top" align="center" rowspan="1" colspan="1">
<monospace>CGGAAGTGACGC</monospace>
</td>
<td valign="top" align="char" char="." rowspan="1" colspan="1">81</td>
<td valign="top" align="center" rowspan="1" colspan="1">62 (77%)</td>
<td valign="top" align="char" char="(" rowspan="1" colspan="1">53 (65%)</td>
<td valign="top" align="char" char="(" rowspan="1" colspan="1">67 (83%)</td>
<td valign="top" align="char" char="(" rowspan="1" colspan="1">53 (65%)</td>
<td valign="top" align="center" rowspan="1" colspan="1">68 (84%)</td>
<td valign="top" align="char" char="(" rowspan="1" colspan="1">15 (19%)</td>
</tr>
<tr>
<td valign="top" align="left" scope="row" rowspan="1" colspan="1">ETS⇔CRE</td>
<td valign="top" align="center" rowspan="1" colspan="1">13-mer</td>
<td valign="top" align="center" rowspan="1" colspan="1">
<monospace>CGGAAGTGACGTC</monospace>
</td>
<td valign="top" align="char" char="." rowspan="1" colspan="1">33</td>
<td valign="top" align="center" rowspan="1" colspan="1">23 (70%)</td>
<td valign="top" align="char" char="(" rowspan="1" colspan="1">17 (52%)</td>
<td valign="top" align="char" char="(" rowspan="1" colspan="1">25 (76%)</td>
<td valign="top" align="char" char="(" rowspan="1" colspan="1">19 (58%)</td>
<td valign="top" align="center" rowspan="1" colspan="1">29 (88%)</td>
<td valign="top" align="char" char="(" rowspan="1" colspan="1">10 (30%)</td>
</tr>
<tr>
<td valign="top" align="left" scope="row" rowspan="1" colspan="1">ETS⇔CRE</td>
<td valign="top" align="center" rowspan="1" colspan="1">13-mer</td>
<td valign="top" align="center" rowspan="1" colspan="1">
<monospace>CCGGAAGTGACGT</monospace>
</td>
<td valign="top" align="char" char="." rowspan="1" colspan="1">35</td>
<td valign="top" align="center" rowspan="1" colspan="1">26 (74%)</td>
<td valign="top" align="char" char="(" rowspan="1" colspan="1">17 (49%)</td>
<td valign="top" align="char" char="(" rowspan="1" colspan="1">27 (77%)</td>
<td valign="top" align="char" char="(" rowspan="1" colspan="1">22 (63%)</td>
<td valign="top" align="center" rowspan="1" colspan="1">34 (97%)</td>
<td valign="top" align="char" char="(" rowspan="1" colspan="1">12 (34%)</td>
</tr>
<tr>
<td valign="top" align="left" scope="row" rowspan="1" colspan="1">ETS⇔CRE</td>
<td valign="top" align="center" rowspan="1" colspan="1">13-mer</td>
<td valign="top" align="center" rowspan="1" colspan="1">
<monospace>GCGGAAGTGACGT</monospace>
</td>
<td valign="top" align="char" char="." rowspan="1" colspan="1">32</td>
<td valign="top" align="center" rowspan="1" colspan="1">28 (88%)</td>
<td valign="top" align="char" char="(" rowspan="1" colspan="1">25 (78%)</td>
<td valign="top" align="char" char="(" rowspan="1" colspan="1">25 (78%)</td>
<td valign="top" align="char" char="(" rowspan="1" colspan="1">24 (75%)</td>
<td valign="top" align="center" rowspan="1" colspan="1">29 (91%)</td>
<td valign="top" align="char" char="(" rowspan="1" colspan="1">5 (16%)</td>
</tr>
<tr>
<td valign="top" align="left" scope="row" rowspan="1" colspan="1">ETS⇔CRE</td>
<td valign="top" align="center" rowspan="1" colspan="1">13-mer</td>
<td valign="top" align="center" rowspan="1" colspan="1">
<monospace>CCGGAAGTGACGC</monospace>
</td>
<td valign="top" align="char" char="." rowspan="1" colspan="1">52</td>
<td valign="top" align="center" rowspan="1" colspan="1">42 (81%)</td>
<td valign="top" align="char" char="(" rowspan="1" colspan="1">36 (69%)</td>
<td valign="top" align="char" char="(" rowspan="1" colspan="1">44 (85%)</td>
<td valign="top" align="char" char="(" rowspan="1" colspan="1">32 (62%)</td>
<td valign="top" align="center" rowspan="1" colspan="1">46 (88%)</td>
<td valign="top" align="char" char="(" rowspan="1" colspan="1">14 (27%)</td>
</tr>
<tr>
<td valign="top" align="left" scope="row" rowspan="1" colspan="1">ETS⇔CRE</td>
<td valign="top" align="center" rowspan="1" colspan="1">13-mer</td>
<td valign="top" align="center" rowspan="1" colspan="1">
<monospace>GCGGAAGTGACGC</monospace>
</td>
<td valign="top" align="char" char="." rowspan="1" colspan="1">19</td>
<td valign="top" align="center" rowspan="1" colspan="1">15 (79%)</td>
<td valign="top" align="char" char="(" rowspan="1" colspan="1">12 (63%)</td>
<td valign="top" align="char" char="(" rowspan="1" colspan="1">14 (74%)</td>
<td valign="top" align="char" char="(" rowspan="1" colspan="1">14 (74%)</td>
<td valign="top" align="center" rowspan="1" colspan="1">15 (79%)</td>
<td valign="top" align="char" char="(" rowspan="1" colspan="1">1 (5%)</td>
</tr>
<tr>
<td valign="top" align="left" scope="row" rowspan="1" colspan="1">ETS⇔AP1</td>
<td valign="top" align="center" rowspan="1" colspan="1">13-mer</td>
<td valign="top" align="center" rowspan="1" colspan="1">
<monospace>CGGAAGTGACTCA</monospace>
</td>
<td valign="top" align="char" char="." rowspan="1" colspan="1">17</td>
<td valign="top" align="center" rowspan="1" colspan="1">3 (18%)</td>
<td valign="top" align="center" rowspan="1" colspan="1">0 (0%)</td>
<td valign="top" align="center" rowspan="1" colspan="1">0 (0%)</td>
<td valign="top" align="center" rowspan="1" colspan="1">0 (0%)</td>
<td valign="top" align="center" rowspan="1" colspan="1">13 (76%)</td>
<td valign="top" align="char" char="(" rowspan="1" colspan="1">13 (76%)</td>
</tr>
<tr>
<td valign="top" align="left" scope="row" rowspan="1" colspan="1">ETS⇔AP1</td>
<td valign="top" align="center" rowspan="1" colspan="1">13-mer</td>
<td valign="top" align="center" rowspan="1" colspan="1">
<monospace>CGGAAGTGAGTCA</monospace>
</td>
<td valign="top" align="char" char="." rowspan="1" colspan="1">22</td>
<td valign="top" align="center" rowspan="1" colspan="1">0 (0%)</td>
<td valign="top" align="center" rowspan="1" colspan="1">0 (0%)</td>
<td valign="top" align="center" rowspan="1" colspan="1">0 (0%)</td>
<td valign="top" align="center" rowspan="1" colspan="1">0 (0%)</td>
<td valign="top" align="center" rowspan="1" colspan="1">16 (73%)</td>
<td valign="top" align="char" char="(" rowspan="1" colspan="1">16 (73%)</td>
</tr>
<tr>
<td valign="top" align="left" scope="row" rowspan="1" colspan="1">ETS⇔CRE</td>
<td valign="top" align="center" rowspan="1" colspan="1">14-mer</td>
<td valign="top" align="center" rowspan="1" colspan="1">
<monospace>CGGAAGTGACGTCA</monospace>
</td>
<td valign="top" align="char" char="." rowspan="1" colspan="1">18</td>
<td valign="top" align="center" rowspan="1" colspan="1">13 (72%)</td>
<td valign="top" align="char" char="(" rowspan="1" colspan="1">11 (61%)</td>
<td valign="top" align="char" char="(" rowspan="1" colspan="1">15 (83%)</td>
<td valign="top" align="char" char="(" rowspan="1" colspan="1">12 (67%)</td>
<td valign="top" align="center" rowspan="1" colspan="1">18 (100%)</td>
<td valign="top" align="char" char="(" rowspan="1" colspan="1">6 (33%)</td>
</tr>
<tr>
<td valign="top" align="left" scope="row" rowspan="1" colspan="1">ETS⇔CRE</td>
<td valign="top" align="center" rowspan="1" colspan="1">15-mer</td>
<td valign="top" align="center" rowspan="1" colspan="1">
<monospace>CGGAAGTGACGTCAC</monospace>
</td>
<td valign="top" align="char" char="." rowspan="1" colspan="1">7</td>
<td valign="top" align="center" rowspan="1" colspan="1">5 (71%)</td>
<td valign="top" align="char" char="(" rowspan="1" colspan="1">4 (57%)</td>
<td valign="top" align="char" char="(" rowspan="1" colspan="1">6 (86%)</td>
<td valign="top" align="char" char="(" rowspan="1" colspan="1">4 (57%)</td>
<td valign="top" align="center" rowspan="1" colspan="1">7 (100%)</td>
<td valign="top" align="char" char="(" rowspan="1" colspan="1">3 (43%)</td>
</tr>
<tr>
<td valign="top" align="left" scope="row" rowspan="1" colspan="1">ETS⇔CRE</td>
<td valign="top" align="center" rowspan="1" colspan="1">15-mer</td>
<td valign="top" align="center" rowspan="1" colspan="1">
<monospace>CCGGAAGTGACGTCA</monospace>
</td>
<td valign="top" align="char" char="." rowspan="1" colspan="1">7</td>
<td valign="top" align="center" rowspan="1" colspan="1">4 (57%)</td>
<td valign="top" align="char" char="(" rowspan="1" colspan="1">2 (29%)</td>
<td valign="top" align="char" char="(" rowspan="1" colspan="1">4 (57%)</td>
<td valign="top" align="char" char="(" rowspan="1" colspan="1">3 (43%)</td>
<td valign="top" align="center" rowspan="1" colspan="1">7 (100%)</td>
<td valign="top" align="char" char="(" rowspan="1" colspan="1">4 (57%)</td>
</tr>
<tr>
<td valign="top" align="left" scope="row" rowspan="1" colspan="1">ETS⇔CRE</td>
<td valign="top" align="center" rowspan="1" colspan="1">15-mer</td>
<td valign="top" align="center" rowspan="1" colspan="1">
<monospace>GCGGAAGTGACGTCA</monospace>
</td>
<td valign="top" align="char" char="." rowspan="1" colspan="1">8</td>
<td valign="top" align="center" rowspan="1" colspan="1">7 (88%)</td>
<td valign="top" align="char" char="(" rowspan="1" colspan="1">7 (88%)</td>
<td valign="top" align="char" char="(" rowspan="1" colspan="1">8 (100%)</td>
<td valign="top" align="char" char="(" rowspan="1" colspan="1">7 (88%)</td>
<td valign="top" align="center" rowspan="1" colspan="1">8 (100%)</td>
<td valign="top" align="char" char="(" rowspan="1" colspan="1">1 (13%)</td>
</tr>
<tr>
<td valign="top" align="left" scope="row" rowspan="1" colspan="1">ETS⇔CRE</td>
<td valign="top" align="center" rowspan="1" colspan="1">16-mer</td>
<td valign="top" align="center" rowspan="1" colspan="1">
<monospace>CCGGAAGTGACGTCAC</monospace>
</td>
<td valign="top" align="char" char="." rowspan="1" colspan="1">3</td>
<td valign="top" align="center" rowspan="1" colspan="1">2 (67%)</td>
<td valign="top" align="char" char="(" rowspan="1" colspan="1">1 (33%)</td>
<td valign="top" align="char" char="(" rowspan="1" colspan="1">2 (67%)</td>
<td valign="top" align="char" char="(" rowspan="1" colspan="1">1 (33%)</td>
<td valign="top" align="center" rowspan="1" colspan="1">3 (100%)</td>
<td valign="top" align="char" char="(" rowspan="1" colspan="1">2 (67%)</td>
</tr>
<tr>
<td valign="top" align="left" scope="row" rowspan="1" colspan="1">ETS⇔CRE</td>
<td valign="top" align="center" rowspan="1" colspan="1">16-mer</td>
<td valign="top" align="center" rowspan="1" colspan="1">
<monospace>GCGGAAGTGACGTCAC</monospace>
</td>
<td valign="top" align="char" char="." rowspan="1" colspan="1">2</td>
<td valign="top" align="center" rowspan="1" colspan="1">2 (100%)</td>
<td valign="top" align="char" char="(" rowspan="1" colspan="1">2 (100%)</td>
<td valign="top" align="char" char="(" rowspan="1" colspan="1">2 (100%)</td>
<td valign="top" align="char" char="(" rowspan="1" colspan="1">2 (100%)</td>
<td valign="top" align="center" rowspan="1" colspan="1">2 (100%)</td>
<td valign="top" align="center" rowspan="1" colspan="1">0 (0%)</td>
</tr>
<tr>
<td valign="top" align="left" scope="row" rowspan="1" colspan="1">N
<sub>1</sub>
CGN
<sub>7</sub>
CG</td>
<td valign="top" align="center" rowspan="1" colspan="1">12-mer</td>
<td valign="top" align="center" rowspan="1" colspan="1">
<monospace>ACGCACACACCG</monospace>
</td>
<td valign="top" align="char" char="." rowspan="1" colspan="1">45</td>
<td valign="top" align="center" rowspan="1" colspan="1">5 (11%)</td>
<td valign="top" align="char" char="(" rowspan="1" colspan="1">1 (2%)</td>
<td valign="top" align="char" char="(" rowspan="1" colspan="1">3 (7%)</td>
<td valign="top" align="char" char="(" rowspan="1" colspan="1">1 (2%)</td>
<td valign="top" align="center" rowspan="1" colspan="1">19 (42%)</td>
<td valign="top" align="char" char="(" rowspan="1" colspan="1">18 (40%)</td>
</tr>
<tr>
<td valign="top" align="left" scope="row" rowspan="1" colspan="1">N
<sub>2</sub>
CGN
<sub>7</sub>
CG</td>
<td valign="top" align="center" rowspan="1" colspan="1">13-mer</td>
<td valign="top" align="center" rowspan="1" colspan="1">
<monospace>CACGCACACACCG</monospace>
</td>
<td valign="top" align="char" char="." rowspan="1" colspan="1">27</td>
<td valign="top" align="center" rowspan="1" colspan="1">2 (7%)</td>
<td valign="top" align="char" char="(" rowspan="1" colspan="1">1 (4%)</td>
<td valign="top" align="char" char="(" rowspan="1" colspan="1">1 (4%)</td>
<td valign="top" align="center" rowspan="1" colspan="1">0 (0%)</td>
<td valign="top" align="center" rowspan="1" colspan="1">9 (33%)</td>
<td valign="top" align="char" char="(" rowspan="1" colspan="1">9 (33%)</td>
</tr>
</tbody>
</table>
</table-wrap>
<p>Two versions of the ETS motif that localize in proximal promoters differ only in the first nucleotide: the more common CCGGAA and the rarer GCGGAA (
<ext-link ext-link-type="uri" xlink:href="http://www.g3journal.org/lookup/suppl/doi:10.1534/g3.112.004002/-/DC1/FigureS3.pdf">Figure S3</ext-link>
A) (
<xref rid="bib10" ref-type="bibr">FitzGerald
<italic>et al.</italic>
2004</xref>
). DNA binding specificities of the 27 human ETS family members identify three proteins (SPI1, SPIB, and SPIC) that preferentially bind the rarer ETS motif (
<xref rid="bib21" ref-type="bibr">Kaplan
<italic>et al.</italic>
2010</xref>
). The rarer GCGGAA ETS motif is enriched compared with the CCGGAA motif in the ETS⇔CRE motif (
<ext-link ext-link-type="uri" xlink:href="http://www.g3journal.org/lookup/suppl/doi:10.1534/g3.112.004002/-/DC1/FigureS3.pdf">Figure S3</ext-link>
B).</p>
</sec>
<sec id="s12">
<title>Molecular model of the ETS⇔CRE motif bound by ETS and CREB proteins</title>
<p>To evaluate the potential for simultaneous binding of three proteins (ETS monomer and CREB dimer) to the ETS⇔CRE motif, we built a molecular model using PDB files of the ETS1 protein bound to an ETS site (PDB ID: 1K79) (
<xref rid="bib13" ref-type="bibr">Garvie
<italic>et al.</italic>
2001</xref>
) and the CREB dimer bound to the CRE (PDB ID: 1DH3) (
<xref rid="bib44" ref-type="bibr">Schumacher
<italic>et al.</italic>
2000</xref>
). The two structures were aligned computationally after superimposing 10 DNA bases on each strand of DNA. The combined structure did not produce protein clashes, suggesting that both proteins could potentially bind the ETS⇔CRE motif simultaneously (
<xref ref-type="fig" rid="fig3">Figure 3, C–E</xref>
). The
<bold>GTG</bold>
trinucleotide, which is common to both the ETS and CRE motifs, interacts with both proteins in the model. The ETS domain, a winged helix-turn-helix protein fold, interacts with the major groove using an α-helix to bind the core GGAA of the motif. It also crosses the phosphate backbone and interacts with the minor groove of the
<bold>GTG</bold>
trinucleotide (
<xref rid="bib16" ref-type="bibr">Hollenhorst
<italic>et al.</italic>
2011b</xref>
). The CREB dimer interacts with the
<bold>GTG</bold>
trinucleotide in the major groove and does not cross the DNA backbone.</p>
</sec>
<sec id="s13">
<title>The ETS protein GABPα and the B-ZIP protein CREB preferentially bind to ETS⇔CRE</title>
<p>EMSA was used to investigate whether ETS and B-ZIP proteins could simultaneously bind the ETS⇔CRE motif (
<xref ref-type="table" rid="t1">Table 1</xref>
). In the EMSA experiments, we used the B-ZIP protein CREB to bind the CRE motif and the ETS proteins GABPα or ETV5 to bind the ETS motif (
<xref ref-type="fig" rid="fig4">Figure 4</xref>
). Eight DNA probes were examined. Three DNA probes contained mutations in either or both motifs that abolished protein binding to the expected TFBS (
<xref ref-type="fig" rid="fig4">Figure 4A</xref>
). Five DNA probes examined the spacing between the two motifs; one probe has a 1-bp deletion and three DNA probes have an insertion of 1 bp, 2 bp, or 3 bp between the ETS and CRE motifs. CREB bound well at 10 nM (
<xref rid="bib1" ref-type="bibr">Ahn
<italic>et al.</italic>
1998</xref>
), whereas GABPα binding was weaker, being detectable at 200 nM. When GABPα and CREB were mixed, GABPα binding was enhanced only on the DNA probe containing the ETS⇔CRE motif (compare lane 17 with lane 9 of
<xref ref-type="fig" rid="fig4">Figure 4A</xref>
). None of the deletion or insertion probes formed the CREB|GABPα|DNA complex (lanes 18–24,
<xref ref-type="fig" rid="fig4">Figure 4A</xref>
). Supershift experiments demonstrated that both GABPα and CREB proteins were present in the complex formed only on the ETS⇔CRE motif containing DNA probe (
<xref ref-type="fig" rid="fig4">Figure 4A</xref>
), suggesting that this specific overlap of three base pairs between ETS and CRE motifs is important for binding by both GABPα and CREB. Importantly, the ETV5 member of the ETS family did not form the CREB|ETV5|DNA complex; only the CREB|DNA or ETV5|DNA complexes formed (
<xref ref-type="fig" rid="fig4">Figure 4B</xref>
). A dose-response EMSA showed that binding of one protein precluded binding of the other. Even when we saturated the probes with higher concentrations of ETV5 or CREB proteins, no CREB|ETV5|DNA complex was observed.</p>
<fig id="fig4" fig-type="figure" position="float">
<label>Figure 4 </label>
<caption>
<p>EMSA showing preferential DNA binding of the ETS protein GABPα and B-ZIP protein CREB to the ETS⇔CRE sequence (GCGGAA
<bold>GTG</bold>
ACGTCA). (A) Left panel: The DNA binding domain of GABPα with N-terminal GST tag and the B-ZIP domain of CREB were
<italic>in vitro</italic>
translated alone or together, and subjected to EMSA with eight DNA probes (
<xref ref-type="table" rid="t1">Table 1</xref>
). Lanes 1–8, 3 nM CREB; lanes 9–16, 200 nM GABPα; lanes 17–24, 3 nM CREB and 200 nM GABPα. Right panel: Supershift experiment demonstrates that the indicated CREB-GABPα-DNA complex contains both CREB and GABPα. Lanes 1, 4, and 5,
<italic>in vitro</italic>
translated 3 nM CREB and 200 nM GABPα; lane 2, no protein; lane 3,
<italic>in vitro</italic>
translation without protein-encoding DNA. *GABPα-DNA complex;
<sup></sup>
CREB-GABPα-DNA complex. (B) A dose response EMSA of the ETS protein ETV5 and B-ZIP protein CREB binding to the ETS⇔CRE sequence (GCGGAA
<bold>GTG</bold>
ACGTCA). Increasing concentrations of ETV5 (1.3, 4, 12.5, 40, and 125 nM) or CREB (1, 3, 10, 30, and 100 nM) alone show dose-responsive binding (lanes 2–6 and lanes 7–11) to the ETS⇔CRE motif. Increasing concentrations of ETV5 with fixed concentrations of CREB show that both proteins cannot simultaneously bind to the ETS⇔CRE motif. (C) Enriched motifs generated using the peak-motifs package of Regulatory Sequence Analysis Tools (RSAT). For
<italic>de novo</italic>
motif detection, we used all 6442 human GABPα ChIP-seq peaks (
<xref rid="bib48" ref-type="bibr">Valouev
<italic>et al.</italic>
2008</xref>
) and all 3998 mouse CREB ChIP-seq peaks (
<xref rid="bib27" ref-type="bibr">Martianov
<italic>et al.</italic>
2010</xref>
) as input sequences. In CREB ChIP-seq peaks, the most enriched motif is the canonical CRE, and the ETS⇔CRE motif is among the other significantly enriched motifs. In GABPα ChIP-seq peaks, the ETS motif is the primary enriched motif, and ETS⇔ETS is among the other enriched motifs.
<italic>De novo</italic>
motif detection using all 2953 ETS motif–containing regions predicted ETS⇔CRE as the best-enriched motif.
<italic>De novo</italic>
motif detection using the 1453 regions commonly bound by CREB and GABPα predicted ETS⇔CRE as the best-enriched motif. ETS⇔CRE⇔ETS is one of the other enriched motifs in these regions. The number of sites below each motif indicates the number of peaks that have at least one predicted motif.</p>
</caption>
<graphic xlink:href="1243f4"></graphic>
</fig>
</sec>
<sec id="s14">
<title>Motif detection in CREB and GABPα ChIP-seq peaks</title>
<p>We examined published ChIP-seq data sets for GABPα (
<xref rid="bib48" ref-type="bibr">Valouev
<italic>et al.</italic>
2008</xref>
) in humans and CREB in mouse (
<xref rid="bib27" ref-type="bibr">Martianov
<italic>et al.</italic>
2010</xref>
) to determine whether the ETS⇔CRE motif is enriched in the ChIP-seq peaks. The peak-motifs package (
<xref rid="bib46" ref-type="bibr">Thomas-Chollier
<italic>et al.</italic>
2011</xref>
) of RSAT was used to evaluate the enriched motifs in these ChIP-seq regions. Using all CREB peak regions, peak-motifs identified the overlapping ETS⇔CRE motif, which is more enriched than the canonical CRE motif (
<xref ref-type="fig" rid="fig4">Figure 4C</xref>
and
<xref ref-type="table" rid="t3">Table 3</xref>
). When we used only the GABPα ChIP-seq peaks for
<italic>de novo</italic>
motif detection, we identified the canonical ETS and the ETS⇔ETS motif, but not the ETS⇔CRE motif. However, when we examined only the 2953 peaks that contain the canonical ETS motif, the ETS⇔CRE motif was detected as the best-enriched motif (
<xref ref-type="fig" rid="fig4">Figure 4C</xref>
).</p>
<table-wrap id="t3" position="float">
<label>Table 3</label>
<caption>
<title>Enrichment of ETS, CRE and ETS⇔CRE motifs in CREB and GABPα ChIP-seq peaks</title>
</caption>
<table frame="hsides" rules="groups">
<col width="12.04%" span="1"></col>
<col width="8.87%" span="1"></col>
<col width="24.98%" span="1"></col>
<col width="10.47%" span="1"></col>
<col width="14.92%" span="1"></col>
<col width="13.52%" span="1"></col>
<col width="15.2%" span="1"></col>
<thead>
<tr>
<th valign="top" align="center" scope="col" rowspan="1" colspan="1">Motifs</th>
<th valign="top" align="center" scope="col" rowspan="1" colspan="1">N-mers</th>
<th valign="top" align="center" scope="col" rowspan="1" colspan="1">DNA Sequence</th>
<th valign="top" align="center" scope="col" rowspan="1" colspan="1">Mouse Whole Genome (100%)</th>
<th valign="top" align="center" scope="col" rowspan="1" colspan="1">CREB ChIP-seq Peaks (%)</th>
<th valign="top" align="center" scope="col" rowspan="1" colspan="1">Human Whole Genome (100%)</th>
<th valign="top" align="center" scope="col" rowspan="1" colspan="1">GABPα ChIP-seq Peaks (%)</th>
</tr>
</thead>
<tbody>
<tr>
<td valign="top" align="left" scope="row" rowspan="1" colspan="1">ETS</td>
<td valign="top" align="center" rowspan="1" colspan="1">8-mer</td>
<td valign="top" align="center" rowspan="1" colspan="1">
<monospace>CCGGAAGT</monospace>
</td>
<td valign="top" align="char" char="." rowspan="1" colspan="1">16,346</td>
<td valign="top" align="center" rowspan="1" colspan="1">652 (4%)</td>
<td valign="top" align="char" char="." rowspan="1" colspan="1">14,031</td>
<td valign="top" align="center" rowspan="1" colspan="1">1459 (10%)</td>
</tr>
<tr>
<td valign="top" align="left" scope="row" rowspan="1" colspan="1">CRE</td>
<td valign="top" align="center" rowspan="1" colspan="1">8-mer</td>
<td valign="top" align="center" rowspan="1" colspan="1">
<monospace>TGACGTCA</monospace>
</td>
<td valign="top" align="char" char="." rowspan="1" colspan="1">14,297</td>
<td valign="top" align="center" rowspan="1" colspan="1">591 (4%)</td>
<td valign="top" align="char" char="." rowspan="1" colspan="1">10,389</td>
<td valign="top" align="center" rowspan="1" colspan="1">180 (2%)</td>
</tr>
<tr>
<td valign="top" align="left" scope="row" rowspan="1" colspan="1">ETS⇔CRE</td>
<td valign="top" align="center" rowspan="1" colspan="1">11-mer</td>
<td valign="top" align="center" rowspan="1" colspan="1">
<monospace>CGGAAGTGACG</monospace>
</td>
<td valign="top" align="char" char="." rowspan="1" colspan="1">238</td>
<td valign="top" align="center" rowspan="1" colspan="1">118 (50%)</td>
<td valign="top" align="char" char="." rowspan="1" colspan="1">226</td>
<td valign="top" align="center" rowspan="1" colspan="1">179 (79%)</td>
</tr>
<tr>
<td valign="top" align="left" scope="row" rowspan="1" colspan="1">ETS⇔CRE</td>
<td valign="top" align="center" rowspan="1" colspan="1">12-mer</td>
<td valign="top" align="center" rowspan="1" colspan="1">
<monospace>CGGAAGTGACGT</monospace>
</td>
<td valign="top" align="char" char="." rowspan="1" colspan="1">89</td>
<td valign="top" align="center" rowspan="1" colspan="1">51 (57%)</td>
<td valign="top" align="char" char="." rowspan="1" colspan="1">93</td>
<td valign="top" align="center" rowspan="1" colspan="1">80 (86%)</td>
</tr>
<tr>
<td valign="top" align="left" scope="row" rowspan="1" colspan="1">ETS⇔CRE</td>
<td valign="top" align="center" rowspan="1" colspan="1">12-mer</td>
<td valign="top" align="center" rowspan="1" colspan="1">
<monospace>CGGAAGTGACGC</monospace>
</td>
<td valign="top" align="char" char="." rowspan="1" colspan="1">82</td>
<td valign="top" align="center" rowspan="1" colspan="1">48 (59%)</td>
<td valign="top" align="char" char="." rowspan="1" colspan="1">81</td>
<td valign="top" align="center" rowspan="1" colspan="1">67 (83%)</td>
</tr>
<tr>
<td valign="top" align="left" scope="row" rowspan="1" colspan="1">ETS⇔CRE</td>
<td valign="top" align="center" rowspan="1" colspan="1">13-mer</td>
<td valign="top" align="center" rowspan="1" colspan="1">
<monospace>CCGGAAGTGACGT</monospace>
</td>
<td valign="top" align="char" char="." rowspan="1" colspan="1">45</td>
<td valign="top" align="center" rowspan="1" colspan="1">25 (56%)</td>
<td valign="top" align="char" char="." rowspan="1" colspan="1">35</td>
<td valign="top" align="center" rowspan="1" colspan="1">35 (100%)</td>
</tr>
<tr>
<td valign="top" align="left" scope="row" rowspan="1" colspan="1">ETS⇔CRE</td>
<td valign="top" align="center" rowspan="1" colspan="1">13-mer</td>
<td valign="top" align="center" rowspan="1" colspan="1">
<monospace>GCGGAAGTGACGT</monospace>
</td>
<td valign="top" align="char" char="." rowspan="1" colspan="1">21</td>
<td valign="top" align="center" rowspan="1" colspan="1">15 (71%)</td>
<td valign="top" align="char" char="." rowspan="1" colspan="1">32</td>
<td valign="top" align="center" rowspan="1" colspan="1">28 (88%)</td>
</tr>
<tr>
<td valign="top" align="left" scope="row" rowspan="1" colspan="1">ETS⇔CRE</td>
<td valign="top" align="center" rowspan="1" colspan="1">13-mer</td>
<td valign="top" align="center" rowspan="1" colspan="1">
<monospace>CCGGAAGTGACGC</monospace>
</td>
<td valign="top" align="char" char="." rowspan="1" colspan="1">42</td>
<td valign="top" align="center" rowspan="1" colspan="1">27 (64%)</td>
<td valign="top" align="char" char="." rowspan="1" colspan="1">52</td>
<td valign="top" align="center" rowspan="1" colspan="1">42 (81%)</td>
</tr>
<tr>
<td valign="top" align="left" scope="row" rowspan="1" colspan="1">ETS⇔CRE</td>
<td valign="top" align="center" rowspan="1" colspan="1">13-mer</td>
<td valign="top" align="center" rowspan="1" colspan="1">
<monospace>GCGGAAGTGACGC</monospace>
</td>
<td valign="top" align="char" char="." rowspan="1" colspan="1">28</td>
<td valign="top" align="center" rowspan="1" colspan="1">21 (75%)</td>
<td valign="top" align="char" char="." rowspan="1" colspan="1">19</td>
<td valign="top" align="center" rowspan="1" colspan="1">17 (90%)</td>
</tr>
<tr>
<td valign="top" align="left" scope="row" rowspan="1" colspan="1">ETS⇔CRE</td>
<td valign="top" align="center" rowspan="1" colspan="1">14-mer</td>
<td valign="top" align="center" rowspan="1" colspan="1">
<monospace>CGGAAGTGACGTCA</monospace>
</td>
<td valign="top" align="char" char="." rowspan="1" colspan="1">12</td>
<td valign="top" align="center" rowspan="1" colspan="1">8 (67%)</td>
<td valign="top" align="char" char="." rowspan="1" colspan="1">7</td>
<td valign="top" align="center" rowspan="1" colspan="1">5 (71%)</td>
</tr>
</tbody>
</table>
</table-wrap>
<p>An additional analysis used the GABPα ChIP-seq data already described from the human Jurkat cell line and CREB ChIP-chip data from human HEK293T cells (
<xref rid="bib54" ref-type="bibr">Zhang
<italic>et al.</italic>
2005</xref>
). One thousand four hundred sixty-three (1463) peaks are common between CREB and GABPα binding sites.
<italic>De novo</italic>
motif detection on these regions with peak-motifs identified the ETS⇔CRE motif as the best-enriched motif (
<xref ref-type="fig" rid="fig4">Figure 4C</xref>
). Interestingly, among the other enriched motifs, we observed a palindromic ETS⇔CRE⇔ETS motif, in which the second ETS canonical motif is in the opposite strand (
<xref ref-type="fig" rid="fig4">Figure 4C</xref>
), suggesting that coordinated regulation of gene expression by ETS and CREB proteins is biologically significant. The promoters with ETS⇔CRE, obtained from the regions commonly bound by CREB and GABPα, are significantly enriched for the GO terms of proteolysis involved in macromolecule catabolic process, RNA processing, and cellular response to stress (
<xref ref-type="table" rid="t4">Table 4</xref>
). However, the MEME-ChIP package (
<xref rid="bib25" ref-type="bibr">Machanick &amp; Bailey 2011</xref>
) of the MEME Suite failed to detect the ETS⇔CRE motif as an enriched motif in any data set.</p>
<table-wrap id="t4" position="float">
<label>Table 4</label>
<caption>
<title>Enriched GO terms of genes commonly bound by CREB and GABPα with ETS⇔CRE motifs</title>
</caption>
<table frame="hsides" rules="groups">
<col width="16.73%" span="1"></col>
<col width="39.04%" span="1"></col>
<col width="10.1%" span="1"></col>
<col width="13.88%" span="1"></col>
<col width="20.25%" span="1"></col>
<thead>
<tr>
<th valign="top" align="center" scope="col" rowspan="1" colspan="1">Term</th>
<th valign="top" align="center" scope="col" rowspan="1" colspan="1">Name</th>
<th valign="top" align="center" scope="col" rowspan="1" colspan="1">Count</th>
<th valign="top" align="center" scope="col" rowspan="1" colspan="1">
<italic>P</italic>
</th>
<th valign="top" align="center" scope="col" rowspan="1" colspan="1">Corrected
<italic>P</italic>
(Benjamini)</th>
</tr>
</thead>
<tbody>
<tr>
<td valign="top" align="left" scope="row" rowspan="1" colspan="1">GO:0006396</td>
<td valign="top" align="center" rowspan="1" colspan="1">RNA processing</td>
<td valign="top" align="center" rowspan="1" colspan="1">32</td>
<td valign="top" align="center" rowspan="1" colspan="1">7.5E-09</td>
<td valign="top" align="center" rowspan="1" colspan="1">9.8E-06</td>
</tr>
<tr>
<td valign="top" align="left" scope="row" rowspan="1" colspan="1">GO:0009057</td>
<td valign="top" align="center" rowspan="1" colspan="1">Macromolecule catabolic process</td>
<td valign="top" align="center" rowspan="1" colspan="1">31</td>
<td valign="top" align="center" rowspan="1" colspan="1">4.2E-05</td>
<td valign="top" align="center" rowspan="1" colspan="1">7.9E-03</td>
</tr>
<tr>
<td valign="top" align="left" scope="row" rowspan="1" colspan="1">GO:0033554</td>
<td valign="top" align="center" rowspan="1" colspan="1">Cellular response to stress</td>
<td valign="top" align="center" rowspan="1" colspan="1">25</td>
<td valign="top" align="center" rowspan="1" colspan="1">5.7E-05</td>
<td valign="top" align="center" rowspan="1" colspan="1">9.4E-03</td>
</tr>
</tbody>
</table>
</table-wrap>
</sec>
<sec id="s15">
<title>Length of ETS⇔CRE motif</title>
<p>Two strategies were used to evaluate the length of the ETS⇔CRE motif: (i) enrichment in 8000 housekeeping DNase I hypersensitive sites (DHS) (
<xref rid="bib42" ref-type="bibr">Sabo
<italic>et al.</italic>
2004</xref>
) and (ii) conservation in mammalian genomes.</p>
<p>We extended the ETS motif 8-mer CGGAAGTG toward the CRE (
<xref ref-type="fig" rid="fig5">Figure 5A</xref>
) and counted the occurrences in the genome and known regulatory regions, including annotated promoters, proximal promoters, CpG islands, housekeeping DHS, and all DHS identified in 45 cell types (
<xref rid="bib42" ref-type="bibr">Sabo
<italic>et al.</italic>
2004</xref>
) (
<ext-link ext-link-type="uri" xlink:href="http://www.g3journal.org/lookup/suppl/doi:10.1534/g3.112.004002/-/DC1/TableS2.pdf">Table S2</ext-link>
). The housekeeping DHS are defined as the DNase hypersensitive regions that are present in all 45 cell types (
<xref rid="bib42" ref-type="bibr">Sabo
<italic>et al.</italic>
2004</xref>
). The ETS 8-mer CGGAAGTG occurs 16,846 times in the genome and 6% of them are in housekeeping DHS. Similar results were observed when the motif is extended to the 9-mer (CGGAAGTGA) and 10-mer (CGGAAGTGAC). A transition occurs with the 11-mer (CGGAAGTGACG), with 60% occurring in housekeeping DHS and 83% occurring in known regulatory regions (
<ext-link ext-link-type="uri" xlink:href="http://www.g3journal.org/lookup/suppl/doi:10.1534/g3.112.004002/-/DC1/TableS2.pdf">Table S2</ext-link>
). The 11-mer contains two CG dinucleotides, which are rare outside of regulatory regions.</p>
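<p>The extension analysis described above can be illustrated with a minimal sketch. It assumes the genome is available as a plain string and that regulatory regions are given as half-open (start, end) intervals; the function names, the toy sequence, and the interval are hypothetical and are not taken from the original analysis pipeline. A motif and its reverse complement are counted as the same site.</p>

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(comp[base] for base in reversed(seq.upper()))


def count_both_strands(sequence, motif):
    """Count start positions where the motif occurs on either strand."""
    sequence = sequence.upper()
    motif = motif.upper()
    # A set keeps palindromic motifs from being counted twice per position.
    targets = {motif, reverse_complement(motif)}
    k = len(motif)
    return sum(
        1 for i in range(len(sequence) - k + 1) if sequence[i:i + k] in targets
    )


def inside_and_total(sequence, regions, motif):
    """Return (occurrences fully inside the regions, occurrences overall)."""
    total = count_both_strands(sequence, motif)
    inside = sum(count_both_strands(sequence[s:e], motif) for s, e in regions)
    return inside, total


# Toy sequence (an assumption for illustration): one forward 11-mer,
# one reverse-complement 11-mer, and one ETS 8-mer lacking the second CG.
TOY = "TTTTCGGAAGTGACGAAAACGTCACTTCCGTTTTCGGAAGTGAAA"
FULL = "CGGAAGTGACG"

# Extend the ETS 8-mer one base at a time toward the CRE.
extension_counts = {
    FULL[:k]: count_both_strands(TOY, FULL[:k]) for k in range(8, 12)
}
```

<p>On real data, the same loop would be run over each chromosome and the interval list would hold housekeeping DHS coordinates; the ratio of the two returned counts corresponds to the localization percentages discussed in the text.</p>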
<fig id="fig5" fig-type="figure" position="float">
<label>Figure 5 </label>
<caption>
<p>(A) Preferential localization in housekeeping DHS compared with the genome for different lengths of ETS⇔CRE sequences. The ETS (CGGAAGTG) and CRE (AAGTGACG) 8-mers were lengthened in the direction of the indicated arrows, and for each bp extension, preferential localization in housekeeping DHS is calculated. A jump in localization of ETS (CGGAAGTG) occurs when the second CG dinucleotide is included, which creates the 11-mer CGGAAGTGACG. The ETS 8-mer CGGAAGTG occurs 16,846 times in the genome and 1073 times in housekeeping DHS, a ratio of ∼6%. The ratio in housekeeping DHS of each 8-mer (CGGAAGTN) with a different final nucleotide is shown as a colored dot (G = yellow, A = green, T = red, C = blue). The ETS 9-mer CGGAAGTGA occurs 343 times in housekeeping DHS, with an enrichment in housekeeping DHS similar to that of the 8-mer. When the sequence is extended to the 11-mer CGGAAGTGACG, enrichment in housekeeping DHS jumps to 60%. If the final G in the 11-mer is changed to any of the three other nucleotides, enrichment in housekeeping DHS is only 10%. When the ETS⇔CRE motif is extended to a 12-mer and beyond, enrichment in housekeeping DHS remains constant. When the ETS⇔CRE motif is extended from the CRE side toward the ETS side, a jump in localization in housekeeping DHS occurs when the AAGTGACG 8-mer is extended to the CGGAAGTGACG 11-mer. (B) Conservation (phyloP score) in 30 mammals for the CRE 8-mer. (C) phyloP score for the ETS 8-mer. (D) phyloP score for the ETS⇔CRE 11-mer.</p>
</caption>
<graphic xlink:href="1243f5"></graphic>
</fig>
<p>Note that the 11-mer CGGAAGTGACT can represent the overlap of an ETS motif and an AP1 motif (TGA
<sup>C</sup>
/
<sub>G</sub>
TCA) to create the ETS⇔AP1 motif. The ETS⇔AP1 motif may be cooperatively bound by an ETS protein and B-ZIP proteins that bind the AP1 motif. This sequence does not occur in housekeeping DHS, but it is enriched in tissue-specific DHS (
<xref ref-type="table" rid="t2">Table 2</xref>
) as observed previously (
<xref rid="bib16" ref-type="bibr">Hollenhorst
<italic>et al.</italic>
2011b</xref>
). When the motif is extended to a 12-mer, localization in housekeeping DHS does not increase but the occurrence decreases, indicating that the 11-mer is the core of longer and diverse ETS⇔CRE motifs (
<xref ref-type="fig" rid="fig5">Figure 5A</xref>
).</p>
<p>When the motif is extended from the CRE side toward the ETS motif, we again observe that localization in housekeeping DHS jumps to its maximal value when the motif is extended to the second CG and forms the 11-mer CGGAA
<bold>GTG</bold>
ACG. This suggests that the 226 ETS⇔CRE 11-mers in the genome contain different versions of the longer ETS⇔CRE 16-mers that may have distinct functions when they are bound by different combinations of ETS and B-ZIP family members.</p>
</sec>
<sec id="s16">
<title>Conservation of the ETS⇔CRE motif in mammals</title>
<p>The conservation of the ETS⇔CRE motif was examined in 36 mammalian genomes (
<xref rid="bib38" ref-type="bibr">Pollard
<italic>et al.</italic>
2010</xref>
). Initially, we examined the PhyloP signature for the ETS (CGGAAGTG) and CRE (TGACGTCA) 8-mers. Both PhyloP signatures show conservation (
<xref ref-type="fig" rid="fig5">Figure 5, B and C</xref>
), except for the CG dinucleotide, which has negative PhyloP values. We presume this simply reflects the chemical deamination of the C in the CG dinucleotide when it is methylated, a well-known hypermutable process that is not directly modeled in PhyloP. In contrast, in the ETS⇔CRE 11-mer (CGGAA
<bold>GTG</bold>
ACG), all nucleotides, including both CG, are “highly” conserved, having scores four times larger than those of either the ETS or CRE motif (
<xref ref-type="fig" rid="fig5">Figure 5D</xref>
). Conservation extends 1-bp beyond the CG on the ETS (5′) side of the motif to either a C or G, which is known to affect DNA binding of ETS family members (
<xref rid="bib50" ref-type="bibr">Wei
<italic>et al.</italic>
2010</xref>
). Beyond the CG on the CRE (3′) side of the ETS⇔CRE motif, the 4-bp region (TCAC), which is the second half of the CRE motif, is not conserved. Provocatively, these nucleotides have negative PhyloP values; because this region contains no CG dinucleotide subject to deamination, this suggests that the sequences bound by the second monomer of the B-ZIP dimer in this context are evolving faster than neutral (
<xref rid="bib38" ref-type="bibr">Pollard
<italic>et al.</italic>
2010</xref>
).</p>
</sec>
<sec id="s17">
<title>1-bp variants of the ETS⇔CRE 11-mer</title>
<p>We examined whether 1-bp variants of the ETS⇔CRE 11-mer are also enriched in housekeeping DHS (
<xref ref-type="fig" rid="fig6">Figure 6, A–D</xref>
). Of the 147 occurrences of the most abundant 1-bp variant (CGGAAGTG
<italic>G</italic>
CG), 51 (35%) are in housekeeping DHS. Two additional variants (CGGA
<italic>C</italic>
GTGACG and CGGAAGTG
<italic>C</italic>
CG) are abundant and enriched in housekeeping promoters, suggesting that they may also be functional. The GGA in the core of the ETS motif is critical for the sequence-specific binding (
<xref rid="bib14" ref-type="bibr">Graves &amp; Petersen 1998</xref>
) and shows very little variability in housekeeping DHS, suggesting that there are virtually no occurrences of the crippled ETS⇔CRE motif in regulatory regions. In the genome, all 1-bp variants that do not disrupt the CG are less abundant than the ETS⇔CRE 11-mer. In contrast, 1-bp variants that do disrupt either of the two CG are typically more abundant than the ETS⇔CRE, highlighting the profound effect of the CG dinucleotide on the occurrence of a DNA sequence in the genome. A molecular model of the ETS⇔CRE 16-mer bound by ETS and CREB is color-coded to visualize each nucleotide (
<xref ref-type="fig" rid="fig6">Figure 6E</xref>
). Potentially, the abundant 1-bp nucleotide variants of the ETS⇔CRE motif in housekeeping promoters are bound by different combinations of ETS and B-ZIP proteins.</p>
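<p>The set of 1-bp variants examined here can be enumerated with a short sketch. It assumes that a variant "disrupts" a CG when the substituted position lies inside one of the CG dinucleotides of the consensus 11-mer; the function names are hypothetical and serve only as illustration.</p>

```python
CONSENSUS = "CGGAAGTGACG"


def one_bp_variants(motif):
    """Return (position, variant) pairs for every single-base substitution."""
    motif = motif.upper()
    variants = []
    for i, ref in enumerate(motif):
        for alt in "ACGT":
            if alt != ref:
                variants.append((i, motif[:i] + alt + motif[i + 1:]))
    return variants


def cg_positions(motif):
    """Return the set of indices covered by CG dinucleotides in the motif."""
    covered = set()
    for i in range(len(motif) - 1):
        if motif.startswith("CG", i):
            covered.update((i, i + 1))
    return covered


def split_by_cg_disruption(motif):
    """Split variants into those that keep both CG and those that break one."""
    protected = cg_positions(motif.upper())
    preserved, disrupted = [], []
    for pos, variant in one_bp_variants(motif):
        (disrupted if pos in protected else preserved).append(variant)
    return preserved, disrupted


preserved, disrupted = split_by_cg_disruption(CONSENSUS)
```

<p>Each variant in the two lists could then be counted in the genome and in housekeeping DHS to compare abundance between CG-preserving and CG-disrupting substitutions.</p>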
<fig id="fig6" fig-type="figure" position="float">
<label>Figure 6 </label>
<caption>
<p>(A) Occurrences of the ETS⇔CRE 11-mer CGGAA
<bold>GTG</bold>
ACG and all 1-bp variants in housekeeping DHS
<italic>vs.</italic>
enrichment in housekeeping DHS compared with the genome. (B) Histogram showing abundance of the ETS⇔CRE 11-mer CGGAA
<bold>GTG</bold>
ACG and 1-bp variants in housekeeping DHS. (C) Table showing occurrences of 1-bp variants of the ETS⇔CRE 11-mer CGGAAGTGACG in the genome and housekeeping DHS. The numbers highlighted in yellow correspond to the consensus ETS⇔CRE 11-mer. (D) Graphical presentation of the 1-bp variants of the ETS⇔CRE 11-mer in housekeeping DHS. (E) Molecular model of the ETS⇔CRE motif with the nucleotides in color to highlight which parts of the structure are conserved and variable.</p>
</caption>
<graphic xlink:href="1243f6"></graphic>
</fig>
</sec>
<sec id="s18">
<title>Four abundant ETS⇔CRE 13-mers
<bold>(
<sup>C</sup>
/
<sub>G</sub>
CGGAA</bold>
GTG
<bold>ACG
<sup>T</sup>
/
<sub>C</sub>
)</bold>
</title>
<p>The abundance of longer versions of the ETS⇔CRE 11-mer in the genome and regulatory regions was evaluated (
<ext-link ext-link-type="uri" xlink:href="http://www.g3journal.org/lookup/suppl/doi:10.1534/g3.112.004002/-/DC1/FigureS3.pdf">Figure S3</ext-link>
C). We initially focused on 16-mers, the potential length of the ETS⇔CRE motif. The 226 11-mers in the genome occur within 171 different 16-mers, and the most abundant 16-mer (CCGGAAGTGACGCGAG) occurs seven times. The canonical motif CCGGAA
<bold>GTG</bold>
ACGTCAC occurs three times in the genome. The alignment of ETS⇔CRE 11-mers, including surrounding DNA sequences, identified four abundant ETS⇔CRE 13-mers (
<sup>C</sup>
/
<sub>G</sub>
CGGAA
<bold>GTG</bold>
ACG
<sup>T</sup>
/
<sub>C</sub>
) (
<ext-link ext-link-type="uri" xlink:href="http://www.g3journal.org/lookup/suppl/doi:10.1534/g3.112.004002/-/DC1/FigureS3.pdf">Figure S3</ext-link>
C), representing 70% of all ETS⇔CRE 11-mers (
<xref ref-type="fig" rid="fig7">Figure 7, A and B</xref>
). Each 13-mer correlated with different GO terms, suggesting distinct functions (
<ext-link ext-link-type="uri" xlink:href="http://www.g3journal.org/lookup/suppl/doi:10.1534/g3.112.004002/-/DC1/TableS3.pdf">Table S3</ext-link>
). The nucleotide before the CG in the ETS motif is either G or C, and these are known to be bound by different ETS family members (
<xref rid="bib50" ref-type="bibr">Wei
<italic>et al.</italic>
2010</xref>
). The nucleotide after the central CG in the CRE is typically a pyrimidine (T or C); these are 5-fold more abundant than the purines G and A (
<xref ref-type="table" rid="t2">Table 2</xref>
). The T and C in this position are optimal for binding the B-ZIP proteins CREB and C/EBP, respectively (
<xref rid="bib20" ref-type="bibr">Johnson 1993</xref>
). Each of the four ETS⇔CRE 13-mers is expected to be optimally bound by a specific combination of ETS monomers and B-ZIP dimers.</p>
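<p>The tally of flanking nucleotides around each 11-mer can be sketched as follows. This is a simplified illustration that scans only the forward strand of a plain sequence string and extends each 11-mer by one base on each side; the function name and the toy sequence are assumptions, not part of the original analysis.</p>

```python
from collections import Counter

CORE = "CGGAAGTGACG"


def flanked_words(sequence, core=CORE, left=1, right=1):
    """Count the words formed by each core occurrence plus its flanks."""
    sequence = sequence.upper()
    counts = Counter()
    start = sequence.find(core)
    while start != -1:
        lo = start - left
        hi = start + len(core) + right
        # Skip occurrences whose flanks would run off either end.
        if lo >= 0 and len(sequence) >= hi:
            counts[sequence[lo:hi]] += 1
        start = sequence.find(core, start + 1)
    return counts


# Toy sequence (an assumption) containing three flanked 11-mers.
TOY_13 = "ACCGGAAGTGACGTAA" + "GCGGAAGTGACGCTT" + "CCGGAAGTGACGTCAC"
thirteen_mers = flanked_words(TOY_13)
```

<p>Setting left and right to larger values would yield the longer words, such as the 16-mers, discussed above.</p>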
<fig id="fig7" fig-type="figure" position="float">
<label>Figure 7 </label>
<caption>
<p>(A) Abundance of 4 ETS⇔CRE 13-mers (
<sup>C</sup>
/
<sub>G</sub>
CGGAA
<bold>GTG</bold>
ACG
<sup>T</sup>
/
<sub>C</sub>
) and 1-bp variants in housekeeping DHS
<italic>vs.</italic>
percentage of occurrences in housekeeping DHS compared with the genome. All N-CG-N
<sub>7</sub>
-CG-N 13-mers are shown. The four abundant ETS⇔CRE 13-mers (
<sup>C</sup>
/
<sub>G</sub>
CGGAA
<bold>GTG</bold>
ACG
<sup>T</sup>
/
<sub>C</sub>
) are shown in red. (B) Histogram of occurrences of the ETS⇔CRE 13-mers
<sup>C</sup>
/
<sub>G</sub>
CGGAA
<bold>GTG</bold>
ACG
<sup>T</sup>
/
<sub>C</sub>
and all 1-bp variants in housekeeping DHS. (C) Pie chart representation of the occurrence of the dinucleotides at the end of the CRE motif TGACGTNN, which occurs 2046 times in proximal promoters (−200 bp to +60 bp). (D) Pie chart representation of the occurrence of the dinucleotides at the end of the ETS⇔CRE motif CGGAAGTGACGTNN. (E) Preferential occurrence in promoters compared to the genome for all pairs of CG separated by 0 bp to 9 bp [CG-
<sub>(0-9)</sub>
-CG]. (F) Same as (E), but sequences with an internal CG are excluded. The one sequence that is abundant primarily in promoters is the ETS⇔CRE motif.</p>
</caption>
<graphic xlink:href="1243f7"></graphic>
</fig>
<p>The dinucleotides following the CRE 6-mer TGACGT-N
<sub>2</sub>
in proximal promoters are enriched only for CA dinucleotide, which produces the canonical CRE 8-mer TGACGTCA (
<xref ref-type="fig" rid="fig7">Figure 7C</xref>
). In contrast, the dinucleotides following the ETS⇔CRE 12-mer CGGAAGTGACGT are also enriched for AN dinucleotides, suggesting that the CRE and the ETS⇔CRE motifs in promoters are bound by different B-ZIP proteins (
<xref ref-type="fig" rid="fig7">Figure 7D</xref>
).</p>
</sec>
<sec id="s19">
<title>Localization of pairs of CG in DHS</title>
<p>In the ETS⇔CRE motif, the two CG are separated by 7-bps (CG-N
<sub>7</sub>
-CG). To identify whether additional pairs of CG preferentially occur in promoters, we counted in the whole genome the occurrence of sequences containing a pair of CG separated by 0-bps to 9-bps (CG-N
<sub>0-9</sub>
-CG) and determined what fraction are in housekeeping DHS. The ETS⇔CRE motif stands out among all other sequences containing pairs of CG, being abundant and primarily in promoters (
<xref ref-type="fig" rid="fig7">Figure 7, E and F</xref>
).</p>
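<p>A minimal sketch of the CG-pair scan follows. It assumes a plain sequence string, scans a single strand, and uses a regular-expression lookahead so that overlapping pairs are all reported; the option to skip words with an internal CG mirrors the filter described for Figure 7F. The function name and parameters are hypothetical.</p>

```python
import re
from collections import Counter


def cg_pairs(sequence, max_gap=9, skip_internal_cg=False):
    """Count words of the form CG-N(gap)-CG for gap = 0..max_gap."""
    sequence = sequence.upper()
    counts = Counter()
    for gap in range(max_gap + 1):
        # The lookahead consumes no characters, so overlapping
        # matches at successive positions are all found.
        pattern = re.compile("(?=(CG[ACGT]{%d}CG))" % gap)
        for match in pattern.finditer(sequence):
            word = match.group(1)
            spacer = word[2:-2]
            if skip_internal_cg and "CG" in spacer:
                continue
            counts[word] += 1
    return counts


ets_cre_example = cg_pairs("TTCGGAAGTGACGTT")
all_pairs = cg_pairs("CGACGTTCG")
no_internal = cg_pairs("CGACGTTCG", skip_internal_cg=True)
```

<p>Counting each word once in the whole genome and once within housekeeping DHS gives the fraction plotted for every CG pair; a both-strand version would additionally merge each word with its reverse complement.</p>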
</sec>
<sec id="s20">
<title>CG methylation status of the ETS⇔CRE motif in two mouse primary cells</title>
<p>Methylation of the CG dinucleotide in canonical ETS and CRE motifs inhibits binding of both ETS and CREB proteins (
<xref rid="bib17" ref-type="bibr">Iguchi-Ariga &amp; Schaffner 1989</xref>
;
<xref rid="bib47" ref-type="bibr">Umezawa
<italic>et al.</italic>
1997</xref>
;
<xref rid="bib41" ref-type="bibr">Rozenberg
<italic>et al.</italic>
2008</xref>
). An important feature of the ETS⇔CRE motif is the presence of two CG that can be methylated. We used two mouse methylomes at 100X coverage for newborn mouse dermal fibroblasts and 45X coverage for primary keratinocytes. The four ETS⇔CRE 13-mers have different methylation properties (
<ext-link ext-link-type="uri" xlink:href="http://www.g3journal.org/lookup/suppl/doi:10.1534/g3.112.004002/-/DC1/TableS4.pdf">Table S4</ext-link>
,
<ext-link ext-link-type="uri" xlink:href="http://www.g3journal.org/lookup/suppl/doi:10.1534/g3.112.004002/-/DC1/FigureS4.pdf">Figure S4</ext-link>
,
<ext-link ext-link-type="uri" xlink:href="http://www.g3journal.org/lookup/suppl/doi:10.1534/g3.112.004002/-/DC1/FigureS5.pdf">Figure S5</ext-link>
, and
<ext-link ext-link-type="uri" xlink:href="http://www.g3journal.org/lookup/suppl/doi:10.1534/g3.112.004002/-/DC1/FigureS6.pdf">Figure S6</ext-link>
). All 21 occurrences of the GCGGAAGTGACGT 13-mer are unmethylated on both CG dinucleotides in dermal fibroblasts and keratinocytes, suggesting that they are in functional regions of the genome. Of the 45 occurrences of the more abundant 13-mer CCGGAAGTGACGT, 33 are unmethylated in both cells (
<ext-link ext-link-type="uri" xlink:href="http://www.g3journal.org/lookup/suppl/doi:10.1534/g3.112.004002/-/DC1/FigureS4.pdf">Figure S4</ext-link>
C,
<ext-link ext-link-type="uri" xlink:href="http://www.g3journal.org/lookup/suppl/doi:10.1534/g3.112.004002/-/DC1/FigureS5.pdf">Figure S5</ext-link>
A, and
<ext-link ext-link-type="uri" xlink:href="http://www.g3journal.org/lookup/suppl/doi:10.1534/g3.112.004002/-/DC1/FigureS6.pdf">Figure S6</ext-link>
A).</p>
<p>Not all 13-mers with two CG dinucleotides separated by 7 bp are unmethylated. Only 10% of the occurrences of CACGCACACACCG are unmethylated (
<ext-link ext-link-type="uri" xlink:href="http://www.g3journal.org/lookup/suppl/doi:10.1534/g3.112.004002/-/DC1/FigureS4.pdf">Figure S4</ext-link>
G,
<xref ref-type="fig" rid="fig5">Figures 5E</xref>
and
<xref ref-type="fig" rid="fig6">6E</xref>
). Comparing the two methylomes for these motifs shows that unmethylated 13-mer motifs are common and are generally unmethylated in both cell types (
<ext-link ext-link-type="uri" xlink:href="http://www.g3journal.org/lookup/suppl/doi:10.1534/g3.112.004002/-/DC1/FigureS6.pdf">Figure S6</ext-link>
, A–D) and that these unmethylated ETS⇔CRE sequences are mainly enriched in promoters (
<ext-link ext-link-type="uri" xlink:href="http://www.g3journal.org/lookup/suppl/doi:10.1534/g3.112.004002/-/DC1/TableS4.pdf">Table S4</ext-link>
), lending support to the suggestion that every occurrence of an unmethylated version of the ETS⇔CRE motif is biologically important.</p>
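<p>As an illustration only, and not part of the analysis reported here, the following Python sketch locates the CG dinucleotides in the 13-mers discussed above and counts the bases between them. The function names are hypothetical; the sequences are those quoted in the text.</p>
<preformat>
```python
def cg_positions(kmer):
    """Return the 0-based start index of every CG dinucleotide in kmer."""
    kmer = kmer.upper()
    return [i for i in range(len(kmer) - 1) if kmer[i:i + 2] == "CG"]


def cg_gap(kmer):
    """Return the number of bases between the two CG dinucleotides.

    Returns None when the k-mer does not contain exactly two CG dinucleotides.
    """
    starts = cg_positions(kmer)
    if len(starts) != 2:
        return None
    # The first CG occupies starts[0] and starts[0] + 1.
    return starts[1] - (starts[0] + 2)


# 13-mers quoted in the text: two ETS⇔CRE 13-mers and one control 13-mer.
for kmer in ("GCGGAAGTGACGT", "CCGGAAGTGACGT", "CACGCACACACCG"):
    print(kmer, cg_positions(kmer), cg_gap(kmer))
```
</preformat>
<p>All three 13-mers carry two CG dinucleotides separated by 7 bp, so the spacing alone does not distinguish the largely unmethylated ETS⇔CRE 13-mers from the largely methylated CACGCACACACCG.</p>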
</sec>
</sec>
<sec sec-type="discussion" id="s21">
<title>Discussion</title>
<p>We determined the distribution in human promoters of split DNA 8-mers consisting of a pair of 4-mers separated by 1 bp to 30 bp. A striking result is that few split 8-mers with an insert length of 5 bp or greater (X
<sub>4</sub>
-N
<sub>5-30</sub>
-X
<sub>4</sub>
) localize in proximal promoters. This is in sharp contrast to Drosophila promoters, in which many split 8-mers with a 20-bp to 30-bp insert length (X
<sub>4</sub>
-N
<sub>20-30</sub>
-X
<sub>4</sub>
) localize in proximal promoters (
<xref rid="bib49" ref-type="bibr">Vinson
<italic>et al.</italic>
2011</xref>
). We examined split 8-mers in human promoters and identified pairs of 4-mers that localized at a specific insert length and not others. This article focused on the ETS motif (
<sup>C</sup>
/
<sub>G</sub>
CGGAA
<bold>GTG</bold>
) precisely overlapping with a CRE motif (
<bold>GTG</bold>
ACGTCAC) to create a composite site, the ETS⇔CRE motif (
<sup>C</sup>
/
<sub>G</sub>
CGGAA
<bold>GTG</bold>
ACGTCAC). The trinucleotide
<bold>GTG</bold>
is shared by the two TFBS, being the 3′ end of the ETS motif and the 5′ end of the palindromic CRE motif. Molecular modeling using X-ray structures of ETS and B-ZIP proteins binding the ETS⇔CRE motif suggests that the ETS monomer and B-ZIP dimer can bind the overlapping TFBS without any protein-protein clashes. Rather than competing to bind the ETS⇔CRE motif, the ETS protein GABPα and the B-ZIP protein CREB preferentially bind it only when the
<bold>GTG</bold>
trinucleotide overlaps. In contrast, the ETS protein ETV5 competes with CREB to bind the ETS⇔CRE motif.
<italic>De novo</italic>
enriched motif detection using the
<italic>in vivo</italic>
CREB and GABPα ChIP-seq binding regions identified the ETS⇔CRE motif along with the canonical CRE and ETS motifs, suggesting an
<italic>in vivo</italic>
function for the motif. Additionally, the conservation of the ETS⇔CRE motif points to its biological function (
<xref rid="bib53" ref-type="bibr">Xie
<italic>et al.</italic>
2005</xref>
;
<xref rid="bib38" ref-type="bibr">Pollard
<italic>et al.</italic>
2010</xref>
).</p>
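<p>The construction of the composite site can be made concrete with a short Python sketch. It is illustrative only: the function name is hypothetical, and the two ETS variants are written out from the (C/G)CGGAAGTG consensus given above.</p>
<preformat>
```python
def merge_overlapping(left, right):
    """Join two motifs on the longest suffix of left that is a prefix of right.

    Returns the merged sequence and the length of the shared overlap.
    """
    for k in range(min(len(left), len(right)), 0, -1):
        if left.endswith(right[:k]):
            return left + right[k:], k
    return left + right, 0


CRE = "GTGACGTCAC"  # CRE motif as written in the text

for ets in ("CCGGAAGTG", "GCGGAAGTG"):
    composite, overlap = merge_overlapping(ets, CRE)
    print(ets, "+", CRE, "->", composite,
          "overlap:", overlap, "length:", len(composite))
```
</preformat>
<p>In both cases the only shared segment is the GTG trinucleotide, and the merged sequence is a 16-mer whose first 13 bases are the 13-mers examined in the methylation analysis.</p>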
<p>The ETS domain has been shown to interact with several different DNA binding proteins to bind sequences containing chimeric aspects of each TFBS (
<xref rid="bib16" ref-type="bibr">Hollenhorst
<italic>et al.</italic>
2011b</xref>
). The ETS protein GABPα initially was observed interacting with GABPβ to bind a chimeric sequence (
<xref rid="bib2" ref-type="bibr">Batchelor
<italic>et al.</italic>
1998</xref>
). ETS was subsequently shown to interact with additional proteins. The forkhead proteins interact at the 5′ end of the ETS motif (
<xref rid="bib8" ref-type="bibr">De Val
<italic>et al.</italic>
2008</xref>
), whereas SRF, PAX, and potentially CREB interact at the 3′ end of the ETS motif (
<xref rid="bib16" ref-type="bibr">Hollenhorst
<italic>et al.</italic>
2011b</xref>
). Several of these interactions have been identified by examining tissue-specific enhancer sequences (
<xref rid="bib16" ref-type="bibr">Hollenhorst
<italic>et al.</italic>
2011b</xref>
). The cytokine, RANTES (regulated upon activation, normal T cell expressed) is induced by LPS through binding in promoters by ATF and Jun proteins to a composite site containing non-overlapping ETS and CRE motifs (
<xref rid="bib6" ref-type="bibr">Boehlk
<italic>et al.</italic>
2000</xref>
).</p>
<p>ETS and CRE motifs co-occur in proximal promoters (
<xref rid="bib10" ref-type="bibr">FitzGerald
<italic>et al.</italic>
2004</xref>
). Cooperative DNA binding by GABPα and CREB to adjacent ETS and CRE sites separated by various distances up to 15 bp has been reported (
<xref rid="bib43" ref-type="bibr">Sawada
<italic>et al.</italic>
1999</xref>
). The cooperative binding was mapped to the non-DNA-binding region of GABPα, suggesting that cooperativity is mediated by protein-protein interactions. These investigators did not observe that the two motifs needed to be precisely aligned relative to each other for cooperative binding. These results are in sharp contrast to what we observed: the precise overlap produces enhanced GABPα and CREB binding, suggesting that the cooperative binding we observed between the ETS and CREB DNA binding domains is distinct from the cooperative binding observed when full-length proteins are examined. ETS and CRE motifs at spacings other than that of the ETS⇔CRE motif may be preferentially bound by different combinations of ETS and B-ZIP proteins and may have specific functions in regulating gene expression. Oncogenic ETS family members in prostate cancer localize at ETS⇔AP1 motifs that have the same overlap (
<xref rid="bib15" ref-type="bibr">Hollenhorst
<italic>et al.</italic>
2011a</xref>
) observed in the ETS⇔CRE motif. The AP1 or TRE 7-mer (TGA
<sup>C</sup>
/
<sub>G</sub>
TCA) corresponds to the CRE with a 1-bp deletion at its center, which disrupts the CG dinucleotide. Recently, the ETS and CRE motifs were observed to co-occur in ChIP-seq data sets with a spacing of 1 bp to 2 bp (
<xref rid="bib51" ref-type="bibr">Whitington
<italic>et al.</italic>
2011</xref>
), whereas we highlight the ETS⇔CRE motif at a precise spacing with unique biochemical properties.</p>
<p>Overlapping protein binding is observed in the enhanceosome where the ATF-2/c-Jun heterodimer binds to the same DNA base pairs as the IRF-3 protein. Again, there are no protein-protein interactions (
<xref rid="bib34" ref-type="bibr">Panne
<italic>et al.</italic>
2004</xref>
,
<xref rid="bib35" ref-type="bibr">2007</xref>
;
<xref rid="bib33" ref-type="bibr">Panne 2008</xref>
); instead, it appears that the cooperative binding of these three polypeptides is via allosteric changes to the DNA. This is similar to what may occur when GABPα and CREB preferentially bind the ETS⇔CRE motif.</p>
<p>Recently, it was suggested that a fundamental difference between prokaryotic and eukaryotic systems is that eukaryotic systems have short TFBS that proteins do not recognize with sufficient specificity to bind to cognate sites exclusively (
<xref rid="bib52" ref-type="bibr">Wunderlich & Mirny 2009</xref>
) and need to cooperate with other TF to displace a nucleosome and become functional (
<xref rid="bib37" ref-type="bibr">Polach & Widom 1996</xref>
;
<xref rid="bib29" ref-type="bibr">Mirny 2010</xref>
). The overlap of two TFBS as observed in the ETS⇔CRE motif creates long DNA sequences that are generally rare in mammalian genomes and could thus function like a prokaryotic system in which each occurrence is functional.</p>
<p>An alternative method to create specificity in vertebrate genomes is to have two TFBS that only need to be within 150 bp of each other and function together because they compete with nucleosomes for binding (
<xref rid="bib37" ref-type="bibr">Polach & Widom 1996</xref>
;
<xref rid="bib29" ref-type="bibr">Mirny 2010</xref>
;
<xref rid="bib3" ref-type="bibr">Biddie
<italic>et al.</italic>
2011</xref>
). It appears that both mechanisms operate in mammalian genomes. An advantage of the overlapping TFBS is that it allows for cooperative binding between specific members of each TF family, thus increasing specificity. This is absent in the model of two TF independently binding to DNA to displace a nucleosome. The nucleosome displacement mechanism allows different TF to act cooperatively, and it allows selection of which family member is functioning.</p>
<p>We have taken a DNA-centric perspective to evaluate which DNA sequences are important, eschewing the common practice, embodied in the use of position weight matrices (PWM), of averaging two or more DNA sequences to create a logo or hybrid sequence. An inherent difficulty with the DNA-centric perspective is deciding the length of the DNA sequence to examine. An upper bound on the length is reached when a sequence becomes unique in the genome, instead of having thousands of occurrences of which only a subset is functional. Vertebrate genomes are not big enough to accommodate all possible 16-mers. The ETS⇔CRE 16-mer is long enough that random occurrences are not expected. Here, we have taken the approach that different sequences should not be averaged because this could obscure details concerning longer sequences having a distinct function. For example, the ETS⇔CRE 13-mers GCGGAAGTGACGT and CCGGAAGTGACGT enrich for distinct GO terms in addition to having distinct methylation properties. Closer examination of proximal promoters may identify additional examples of pairs of DNA sequences that are constrained relative to each other, as we observed for the ETS⇔CRE motif. The identification of these sequences will be essential as we deconvolute the genome into functional units.</p>
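<p>The statement about 16-mers can be checked with simple arithmetic, sketched here in Python. The genome length used is an assumed round figure for a mammalian haploid genome, not a value taken from this article, and the count considers one strand only; it does not model base composition or CG depletion.</p>
<preformat>
```python
K = 16
distinct_kmers = 4 ** K  # number of possible DNA 16-mers

# Assumption: an approximate haploid genome length in bp (illustrative only).
ASSUMED_GENOME_BP = 3_100_000_000

# Number of 16-mer start positions on one strand of a genome of that length.
windows = ASSUMED_GENOME_BP - K + 1

print(distinct_kmers)             # 4294967296
print(windows >= distinct_kmers)  # False
```
</preformat>
<p>Under this assumption there are fewer 16-mer positions on one strand than there are distinct 16-mers, so not every possible 16-mer can be present.</p>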
</sec>
<sec sec-type="supplementary-material">
<title>Supplementary Material</title>
<supplementary-material id="PMC_1" content-type="local-data">
<caption>
<title>Supporting Information</title>
</caption>
<media mimetype="text" mime-subtype="html" xlink:href="supp_2_10_1243__index.html"></media>
<media xlink:role="associated-file" mimetype="application" mime-subtype="pdf" xlink:href="supp_2.10.1243_004002SI.pdf"></media>
<media xlink:role="associated-file" mimetype="application" mime-subtype="pdf" xlink:href="supp_2.10.1243_TableS4.pdf"></media>
<media xlink:role="associated-file" mimetype="application" mime-subtype="pdf" xlink:href="supp_2.10.1243_FigureS1.pdf"></media>
<media xlink:role="associated-file" mimetype="application" mime-subtype="pdf" xlink:href="supp_2.10.1243_FigureS2.pdf"></media>
<media xlink:role="associated-file" mimetype="application" mime-subtype="pdf" xlink:href="supp_2.10.1243_FigureS3.pdf"></media>
<media xlink:role="associated-file" mimetype="application" mime-subtype="pdf" xlink:href="supp_2.10.1243_FigureS4.pdf"></media>
<media xlink:role="associated-file" mimetype="application" mime-subtype="pdf" xlink:href="supp_2.10.1243_FigureS5.pdf"></media>
<media xlink:role="associated-file" mimetype="application" mime-subtype="pdf" xlink:href="supp_2.10.1243_FigureS6.pdf"></media>
<media xlink:role="associated-file" mimetype="application" mime-subtype="pdf" xlink:href="supp_2.10.1243_TableS1.pdf"></media>
<media xlink:role="associated-file" mimetype="application" mime-subtype="pdf" xlink:href="supp_2.10.1243_TableS2.pdf"></media>
<media xlink:role="associated-file" mimetype="application" mime-subtype="pdf" xlink:href="supp_2.10.1243_TableS3.pdf"></media>
</supplementary-material>
</sec>
</body>
<back>
<ack>
<title>Acknowledgments</title>
<p>We thank B. K. Lee of the National Cancer Institute, National Institutes of Health (NIH), for his thoughtful comments, and we acknowledge the high-performance computational capabilities of the Helix & Biowulf Systems at the NIH, Bethesda, MD (
<ext-link ext-link-type="uri" xlink:href="http://helix.nih.gov">http://helix.nih.gov</ext-link>
).</p>
</ack>
<fn-group>
<fn>
<p>Communicating editor: T. R. Hughes</p>
</fn>
</fn-group>
<ref-list>
<title>Literature Cited</title>
<ref id="bib1">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Ahn</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Olive</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Aggarwal</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Krylov</surname>
<given-names>D.</given-names>
</name>
<name>
<surname>Ginty</surname>
<given-names>D. D.</given-names>
</name>
<etal></etal>
</person-group>
,
<year>1998</year>
<article-title>A dominant-negative inhibitor of CREB reveals that it is a general mediator of stimulus-dependent transcription of c-fos</article-title>
.
<source>Mol. Cell. Biol.</source>
<volume>18</volume>
:
<fpage>967</fpage>
<lpage>977</lpage>
<pub-id pub-id-type="pmid">9447994</pub-id>
</mixed-citation>
</ref>
<ref id="bib55">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Badis</surname>
<given-names>G.</given-names>
</name>
<name>
<surname>Berger</surname>
<given-names>M. F.</given-names>
</name>
<name>
<surname>Philippakis</surname>
<given-names>A. A.</given-names>
</name>
<name>
<surname>Talukder</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Gehrke</surname>
<given-names>A. R.</given-names>
</name>
<etal></etal>
</person-group>
,
<year>2009</year>
<article-title>Diversity and complexity in DNA recognition by transcription factors</article-title>
.
<source>Science</source>
<volume>324</volume>
:
<fpage>1720</fpage>
<lpage>1723</lpage>
<pub-id pub-id-type="pmid">19443739</pub-id>
</mixed-citation>
</ref>
<ref id="bib2">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Batchelor</surname>
<given-names>A. H.</given-names>
</name>
<name>
<surname>Piper</surname>
<given-names>D. E.</given-names>
</name>
<name>
<surname>de la Brousse</surname>
<given-names>F. C.</given-names>
</name>
<name>
<surname>McKnight</surname>
<given-names>S. L.</given-names>
</name>
<name>
<surname>Wolberger</surname>
<given-names>C.</given-names>
</name>
</person-group>
,
<year>1998</year>
<article-title>The structure of GABPalpha/beta: an ETS domain-ankyrin repeat heterodimer bound to DNA</article-title>
.
<source>Science</source>
<volume>279</volume>
:
<fpage>1037</fpage>
<lpage>1041</lpage>
<pub-id pub-id-type="pmid">9461436</pub-id>
</mixed-citation>
</ref>
<ref id="bib3">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Biddie</surname>
<given-names>S. C.</given-names>
</name>
<name>
<surname>John</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Sabo</surname>
<given-names>P. J.</given-names>
</name>
<name>
<surname>Thurman</surname>
<given-names>R. E.</given-names>
</name>
<name>
<surname>Johnson</surname>
<given-names>T. A.</given-names>
</name>
<etal></etal>
</person-group>
,
<year>2011</year>
<article-title>Transcription factor AP1 potentiates chromatin accessibility and glucocorticoid receptor binding</article-title>
.
<source>Mol. Cell</source>
<volume>43</volume>
:
<fpage>145</fpage>
<lpage>155</lpage>
<pub-id pub-id-type="pmid">21726817</pub-id>
</mixed-citation>
</ref>
<ref id="bib4">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Bina</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Wyss</surname>
<given-names>P.</given-names>
</name>
<name>
<surname>Ren</surname>
<given-names>W.</given-names>
</name>
<name>
<surname>Szpankowski</surname>
<given-names>W.</given-names>
</name>
<name>
<surname>Thomas</surname>
<given-names>E.</given-names>
</name>
<etal></etal>
</person-group>
,
<year>2004</year>
<article-title>Exploring the characteristics of sequence elements in proximal promoters of human genes</article-title>
.
<source>Genomics</source>
<volume>84</volume>
:
<fpage>929</fpage>
<lpage>940</lpage>
<pub-id pub-id-type="pmid">15533710</pub-id>
</mixed-citation>
</ref>
<ref id="bib5">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Bird</surname>
<given-names>A.</given-names>
</name>
</person-group>
,
<year>2011</year>
<article-title>The dinucleotide CG as a genomic signalling module</article-title>
.
<source>J. Mol. Biol.</source>
<volume>409</volume>
:
<fpage>47</fpage>
<lpage>53</lpage>
<pub-id pub-id-type="pmid">21295585</pub-id>
</mixed-citation>
</ref>
<ref id="bib6">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Boehlk</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Fessele</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Mojaat</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>Miyamoto</surname>
<given-names>N. G.</given-names>
</name>
<name>
<surname>Werner</surname>
<given-names>T.</given-names>
</name>
<etal></etal>
</person-group>
,
<year>2000</year>
<article-title>ATF and Jun transcription factors, acting through an Ets/CRE promoter module, mediate lipopolysaccharide inducibility of the chemokine RANTES in monocytic Mono Mac 6 cells</article-title>
.
<source>Eur. J. Immunol.</source>
<volume>30</volume>
:
<fpage>1102</fpage>
<lpage>1112</lpage>
<pub-id pub-id-type="pmid">10760799</pub-id>
</mixed-citation>
</ref>
<ref id="bib7">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Carninci</surname>
<given-names>P.</given-names>
</name>
<name>
<surname>Sandelin</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>Lenhard</surname>
<given-names>B.</given-names>
</name>
<name>
<surname>Katayama</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Shimokawa</surname>
<given-names>K.</given-names>
</name>
<etal></etal>
</person-group>
,
<year>2006</year>
<article-title>Genome-wide analysis of mammalian promoter architecture and evolution</article-title>
.
<source>Nat. Genet.</source>
<volume>38</volume>
:
<fpage>626</fpage>
<lpage>635</lpage>
<pub-id pub-id-type="pmid">16645617</pub-id>
</mixed-citation>
</ref>
<ref id="bib8">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>De Val</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Chi</surname>
<given-names>N. C.</given-names>
</name>
<name>
<surname>Meadows</surname>
<given-names>S. M.</given-names>
</name>
<name>
<surname>Minovitsky</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Anderson</surname>
<given-names>J. P.</given-names>
</name>
<etal></etal>
</person-group>
,
<year>2008</year>
<article-title>Combinatorial regulation of endothelial gene expression by ets and forkhead transcription factors</article-title>
.
<source>Cell</source>
<volume>135</volume>
:
<fpage>1053</fpage>
<lpage>1064</lpage>
<pub-id pub-id-type="pmid">19070576</pub-id>
</mixed-citation>
</ref>
<ref id="bib9">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Farnham</surname>
<given-names>P. J.</given-names>
</name>
</person-group>
,
<year>2009</year>
<article-title>Insights from genomic profiling of transcription factors</article-title>
.
<source>Nat. Rev. Genet.</source>
<volume>10</volume>
:
<fpage>605</fpage>
<lpage>616</lpage>
<pub-id pub-id-type="pmid">19668247</pub-id>
</mixed-citation>
</ref>
<ref id="bib10">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>FitzGerald</surname>
<given-names>P. C.</given-names>
</name>
<name>
<surname>Shlyakhtenko</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>Mir</surname>
<given-names>A. A.</given-names>
</name>
<name>
<surname>Vinson</surname>
<given-names>C.</given-names>
</name>
</person-group>
,
<year>2004</year>
<article-title>Clustering of DNA sequences in human promoters</article-title>
.
<source>Genome Res.</source>
<volume>14</volume>
:
<fpage>1562</fpage>
<lpage>1574</lpage>
<pub-id pub-id-type="pmid">15256515</pub-id>
</mixed-citation>
</ref>
<ref id="bib11">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>FitzGerald</surname>
<given-names>P. C.</given-names>
</name>
<name>
<surname>Sturgill</surname>
<given-names>D.</given-names>
</name>
<name>
<surname>Shlyakhtenko</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>Oliver</surname>
<given-names>B.</given-names>
</name>
<name>
<surname>Vinson</surname>
<given-names>C.</given-names>
</name>
</person-group>
,
<year>2006</year>
<article-title>Comparative genomics of Drosophila and human core promoters</article-title>
.
<source>Genome Biol.</source>
<volume>7</volume>
:
<fpage>R53</fpage>
<pub-id pub-id-type="pmid">16827941</pub-id>
</mixed-citation>
</ref>
<ref id="bib12">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Frith</surname>
<given-names>M. C.</given-names>
</name>
<name>
<surname>Spouge</surname>
<given-names>J. L.</given-names>
</name>
<name>
<surname>Hansen</surname>
<given-names>U.</given-names>
</name>
<name>
<surname>Weng</surname>
<given-names>Z.</given-names>
</name>
</person-group>
,
<year>2002</year>
<article-title>Statistical significance of clusters of motifs represented by position specific scoring matrices in nucleotide sequences</article-title>
.
<source>Nucleic Acids Res.</source>
<volume>30</volume>
:
<fpage>3214</fpage>
<lpage>3224</lpage>
<pub-id pub-id-type="pmid">12136103</pub-id>
</mixed-citation>
</ref>
<ref id="bib13">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Garvie</surname>
<given-names>C. W.</given-names>
</name>
<name>
<surname>Hagman</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Wolberger</surname>
<given-names>C.</given-names>
</name>
</person-group>
,
<year>2001</year>
<article-title>Structural studies of Ets-1/Pax5 complex formation on DNA</article-title>
.
<source>Mol. Cell</source>
<volume>8</volume>
:
<fpage>1267</fpage>
<lpage>1276</lpage>
<pub-id pub-id-type="pmid">11779502</pub-id>
</mixed-citation>
</ref>
<ref id="bib14">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Graves</surname>
<given-names>B. J.</given-names>
</name>
<name>
<surname>Petersen</surname>
<given-names>J. M.</given-names>
</name>
</person-group>
,
<year>1998</year>
<article-title>Specificity within the ets family of transcription factors</article-title>
.
<source>Adv. Cancer Res.</source>
<volume>75</volume>
:
<fpage>1</fpage>
<lpage>55</lpage>
<pub-id pub-id-type="pmid">9709806</pub-id>
</mixed-citation>
</ref>
<ref id="bib15">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Hollenhorst</surname>
<given-names>P. C.</given-names>
</name>
<name>
<surname>Ferris</surname>
<given-names>M. W.</given-names>
</name>
<name>
<surname>Hull</surname>
<given-names>M. A.</given-names>
</name>
<name>
<surname>Chae</surname>
<given-names>H.</given-names>
</name>
<name>
<surname>Kim</surname>
<given-names>S.</given-names>
</name>
<etal></etal>
</person-group>
,
<year>2011</year>
<comment>a</comment>
<article-title>Oncogenic ETS proteins mimic activated RAS/MAPK signaling in prostate cells</article-title>
.
<source>Genes Dev.</source>
<volume>25</volume>
:
<fpage>2147</fpage>
<lpage>2157</lpage>
<pub-id pub-id-type="pmid">22012618</pub-id>
</mixed-citation>
</ref>
<ref id="bib16">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Hollenhorst</surname>
<given-names>P. C.</given-names>
</name>
<name>
<surname>McIntosh</surname>
<given-names>L. P.</given-names>
</name>
<name>
<surname>Graves</surname>
<given-names>B. J.</given-names>
</name>
</person-group>
,
<year>2011</year>
<comment>b</comment>
<article-title>Genomic and biochemical insights into the specificity of ETS transcription factors</article-title>
.
<source>Annu. Rev. Biochem.</source>
<volume>80</volume>
:
<fpage>437</fpage>
<lpage>471</lpage>
<pub-id pub-id-type="pmid">21548782</pub-id>
</mixed-citation>
</ref>
<ref id="bib17">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Iguchi-Ariga</surname>
<given-names>S. M.</given-names>
</name>
<name>
<surname>Schaffner</surname>
<given-names>W.</given-names>
</name>
</person-group>
,
<year>1989</year>
<article-title>CpG methylation of the cAMP-responsive enhancer/promoter sequence TGACGTCA abolishes specific factor binding as well as transcriptional activation</article-title>
.
<source>Genes Dev.</source>
<volume>3</volume>
:
<fpage>612</fpage>
<lpage>619</lpage>
<pub-id pub-id-type="pmid">2545524</pub-id>
</mixed-citation>
</ref>
<ref id="bib18">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Ji</surname>
<given-names>H.</given-names>
</name>
<name>
<surname>Jiang</surname>
<given-names>H.</given-names>
</name>
<name>
<surname>Ma</surname>
<given-names>W.</given-names>
</name>
<name>
<surname>Johnson</surname>
<given-names>D. S.</given-names>
</name>
<name>
<surname>Myers</surname>
<given-names>R. M.</given-names>
</name>
<etal></etal>
</person-group>
,
<year>2008</year>
<article-title>An integrated software system for analyzing ChIP-chip and ChIP-seq data</article-title>
.
<source>Nat. Biotechnol.</source>
<volume>26</volume>
:
<fpage>1293</fpage>
<lpage>1300</lpage>
<pub-id pub-id-type="pmid">18978777</pub-id>
</mixed-citation>
</ref>
<ref id="bib19">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Johnson</surname>
<given-names>D. S.</given-names>
</name>
<name>
<surname>Mortazavi</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>Myers</surname>
<given-names>R. M.</given-names>
</name>
<name>
<surname>Wold</surname>
<given-names>B.</given-names>
</name>
</person-group>
,
<year>2007</year>
<article-title>Genome-wide mapping of in vivo protein-DNA interactions</article-title>
.
<source>Science</source>
<volume>316</volume>
:
<fpage>1497</fpage>
<lpage>1502</lpage>
<pub-id pub-id-type="pmid">17540862</pub-id>
</mixed-citation>
</ref>
<ref id="bib20">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Johnson</surname>
<given-names>P. F.</given-names>
</name>
</person-group>
,
<year>1993</year>
<article-title>Identification of C/EBP basic region residues involved in DNA sequence recognition and half-site spacing preference</article-title>
.
<source>Mol. Cell. Biol.</source>
<volume>13</volume>
:
<fpage>6919</fpage>
<lpage>6930</lpage>
<pub-id pub-id-type="pmid">8413284</pub-id>
</mixed-citation>
</ref>
<ref id="bib21">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Kaplan</surname>
<given-names>N.</given-names>
</name>
<name>
<surname>Hughes</surname>
<given-names>T. R.</given-names>
</name>
<name>
<surname>Lieb</surname>
<given-names>J. D.</given-names>
</name>
<name>
<surname>Widom</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Segal</surname>
<given-names>E.</given-names>
</name>
</person-group>
,
<year>2010</year>
<article-title>Contribution of histone sequence preferences to nucleosome organization: proposed definitions and methodology</article-title>
.
<source>Genome Biol.</source>
<volume>11</volume>
:
<fpage>140</fpage>
<pub-id pub-id-type="pmid">21118582</pub-id>
</mixed-citation>
</ref>
<ref id="bib22">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Kel</surname>
<given-names>A. E.</given-names>
</name>
<name>
<surname>Gossling</surname>
<given-names>E.</given-names>
</name>
<name>
<surname>Reuter</surname>
<given-names>I.</given-names>
</name>
<name>
<surname>Cheremushkin</surname>
<given-names>E.</given-names>
</name>
<name>
<surname>Kel-Margoulis</surname>
<given-names>O. V.</given-names>
</name>
<etal></etal>
</person-group>
,
<year>2003</year>
<article-title>MATCH: a tool for searching transcription factor binding sites in DNA sequences</article-title>
.
<source>Nucleic Acids Res.</source>
<volume>31</volume>
:
<fpage>3576</fpage>
<lpage>3579</lpage>
<pub-id pub-id-type="pmid">12824369</pub-id>
</mixed-citation>
</ref>
<ref id="bib23">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Kharchenko</surname>
<given-names>P. V.</given-names>
</name>
<name>
<surname>Tolstorukov</surname>
<given-names>M. Y.</given-names>
</name>
<name>
<surname>Park</surname>
<given-names>P. J.</given-names>
</name>
</person-group>
,
<year>2008</year>
<article-title>Design and analysis of ChIP-seq experiments for DNA-binding proteins</article-title>
.
<source>Nat. Biotechnol.</source>
<volume>26</volume>
:
<fpage>1351</fpage>
<lpage>1359</lpage>
<pub-id pub-id-type="pmid">19029915</pub-id>
</mixed-citation>
</ref>
<ref id="bib24">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Lagrange</surname>
<given-names>T.</given-names>
</name>
<name>
<surname>Kapanidis</surname>
<given-names>A. N.</given-names>
</name>
<name>
<surname>Tang</surname>
<given-names>H.</given-names>
</name>
<name>
<surname>Reinberg</surname>
<given-names>D.</given-names>
</name>
<name>
<surname>Ebright</surname>
<given-names>R. H.</given-names>
</name>
</person-group>
,
<year>1998</year>
<article-title>New core promoter element in RNA polymerase II-dependent transcription: sequence-specific DNA binding by transcription factor IIB</article-title>
.
<source>Genes Dev.</source>
<volume>12</volume>
:
<fpage>34</fpage>
<lpage>44</lpage>
<pub-id pub-id-type="pmid">9420329</pub-id>
</mixed-citation>
</ref>
<ref id="bib25">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Machanick</surname>
<given-names>P.</given-names>
</name>
<name>
<surname>Bailey</surname>
<given-names>T. L.</given-names>
</name>
</person-group>
,
<year>2011</year>
<article-title>MEME-ChIP: motif analysis of large DNA datasets</article-title>
.
<source>Bioinformatics</source>
<volume>27</volume>
:
<fpage>1696</fpage>
<lpage>1697</lpage>
<pub-id pub-id-type="pmid">21486936</pub-id>
</mixed-citation>
</ref>
<ref id="bib26">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Marino-Ramirez</surname>
<given-names>L.</given-names>
</name>
<name>
<surname>Spouge</surname>
<given-names>J. L.</given-names>
</name>
<name>
<surname>Kanga</surname>
<given-names>G. C.</given-names>
</name>
<name>
<surname>Landsman</surname>
<given-names>D.</given-names>
</name>
</person-group>
,
<year>2004</year>
<article-title>Statistical analysis of over-represented words in human promoter sequences</article-title>
.
<source>Nucleic Acids Res.</source>
<volume>32</volume>
:
<fpage>949</fpage>
<lpage>958</lpage>
<pub-id pub-id-type="pmid">14963262</pub-id>
</mixed-citation>
</ref>
<ref id="bib27">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Martianov</surname>
<given-names>I.</given-names>
</name>
<name>
<surname>Choukrallah</surname>
<given-names>M. A.</given-names>
</name>
<name>
<surname>Krebs</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>Ye</surname>
<given-names>T.</given-names>
</name>
<name>
<surname>Legras</surname>
<given-names>S.</given-names>
</name>
<etal></etal>
</person-group>
,
<year>2010</year>
<article-title>Cell-specific occupancy of an extended repertoire of CREM and CREB binding loci in male germ cells</article-title>
.
<source>BMC Genomics</source>
<volume>11</volume>
:
<fpage>530</fpage>
<pub-id pub-id-type="pmid">20920259</pub-id>
</mixed-citation>
</ref>
<ref id="bib28">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Matys</surname>
<given-names>V.</given-names>
</name>
<name>
<surname>Kel-Margoulis</surname>
<given-names>O. V.</given-names>
</name>
<name>
<surname>Fricke</surname>
<given-names>E.</given-names>
</name>
<name>
<surname>Liebich</surname>
<given-names>I.</given-names>
</name>
<name>
<surname>Land</surname>
<given-names>S.</given-names>
</name>
<etal></etal>
</person-group>
,
<year>2006</year>
<article-title>TRANSFAC and its module TRANSCompel: transcriptional gene regulation in eukaryotes</article-title>
.
<source>Nucleic Acids Res.</source>
<volume>34</volume>
:
<fpage>D108</fpage>
<lpage>D110</lpage>
<pub-id pub-id-type="pmid">16381825</pub-id>
</mixed-citation>
</ref>
<ref id="bib29">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Mirny</surname>
<given-names>L. A.</given-names>
</name>
</person-group>
,
<year>2010</year>
<article-title>Nucleosome-mediated cooperativity between transcription factors</article-title>
.
<source>Proc. Natl. Acad. Sci. USA</source>
<volume>107</volume>
:
<fpage>22534</fpage>
<lpage>22539</lpage>
<pub-id pub-id-type="pmid">21149679</pub-id>
</mixed-citation>
</ref>
<ref id="bib30">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Oh</surname>
<given-names>Y. M.</given-names>
</name>
<name>
<surname>Kim</surname>
<given-names>J. K.</given-names>
</name>
<name>
<surname>Choi</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Yoo</surname>
<given-names>J. Y.</given-names>
</name>
</person-group>
,
<year>2011</year>
<article-title>Identification of co-occurring transcription factor binding sites from DNA sequence using clustered position weight matrices</article-title>
.
<source>Nucleic Acids Res.</source>
<volume>40</volume>
:
<fpage>e38</fpage>
<pub-id pub-id-type="pmid">22187154</pub-id>
</mixed-citation>
</ref>
<ref id="bib31">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Ohler</surname>
<given-names>U.</given-names>
</name>
<name>
<surname>Liao</surname>
<given-names>G. C.</given-names>
</name>
<name>
<surname>Niemann</surname>
<given-names>H.</given-names>
</name>
<name>
<surname>Rubin</surname>
<given-names>G. M.</given-names>
</name>
</person-group>
,
<year>2002</year>
<article-title>Computational analysis of core promoters in the Drosophila genome</article-title>
.
<source>Genome Biol.</source>
<volume>3</volume>
: RESEARCH0087.</mixed-citation>
</ref>
<ref id="bib32">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Pachkov</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Erb</surname>
<given-names>I.</given-names>
</name>
<name>
<surname>Molina</surname>
<given-names>N.</given-names>
</name>
<name>
<surname>van Nimwegen</surname>
<given-names>E.</given-names>
</name>
</person-group>
,
<year>2007</year>
<article-title>SwissRegulon: a database of genome-wide annotations of regulatory sites</article-title>
.
<source>Nucleic Acids Res.</source>
<volume>35</volume>
:
<fpage>D127</fpage>
<lpage>D131</lpage>
<pub-id pub-id-type="pmid">17130146</pub-id>
</mixed-citation>
</ref>
<ref id="bib33">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Panne</surname>
<given-names>D.</given-names>
</name>
</person-group>
,
<year>2008</year>
<article-title>The enhanceosome</article-title>
.
<source>Curr. Opin. Struct. Biol.</source>
<volume>18</volume>
:
<fpage>236</fpage>
<lpage>242</lpage>
<pub-id pub-id-type="pmid">18206362</pub-id>
</mixed-citation>
</ref>
<ref id="bib34">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Panne</surname>
<given-names>D.</given-names>
</name>
<name>
<surname>Maniatis</surname>
<given-names>T.</given-names>
</name>
<name>
<surname>Harrison</surname>
<given-names>S. C.</given-names>
</name>
</person-group>
,
<year>2004</year>
<article-title>Crystal structure of ATF-2/c-Jun and IRF-3 bound to the interferon-beta enhancer</article-title>
.
<source>EMBO J.</source>
<volume>23</volume>
:
<fpage>4384</fpage>
<lpage>4393</lpage>
<pub-id pub-id-type="pmid">15510218</pub-id>
</mixed-citation>
</ref>
<ref id="bib35">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Panne</surname>
<given-names>D.</given-names>
</name>
<name>
<surname>McWhirter</surname>
<given-names>S. M.</given-names>
</name>
<name>
<surname>Maniatis</surname>
<given-names>T.</given-names>
</name>
<name>
<surname>Harrison</surname>
<given-names>S. C.</given-names>
</name>
</person-group>
,
<year>2007</year>
<article-title>Interferon regulatory factor 3 is regulated by a dual phosphorylation-dependent switch</article-title>
.
<source>J. Biol. Chem.</source>
<volume>282</volume>
:
<fpage>22816</fpage>
<lpage>22822</lpage>
<pub-id pub-id-type="pmid">17526488</pub-id>
</mixed-citation>
</ref>
<ref id="bib36">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Pettersen</surname>
<given-names>E. F.</given-names>
</name>
<name>
<surname>Goddard</surname>
<given-names>T. D.</given-names>
</name>
<name>
<surname>Huang</surname>
<given-names>C. C.</given-names>
</name>
<name>
<surname>Couch</surname>
<given-names>G. S.</given-names>
</name>
<name>
<surname>Greenblatt</surname>
<given-names>D. M.</given-names>
</name>
<etal></etal>
</person-group>
,
<year>2004</year>
<article-title>UCSF Chimera–a visualization system for exploratory research and analysis</article-title>
.
<source>J. Comput. Chem.</source>
<volume>25</volume>
:
<fpage>1605</fpage>
<lpage>1612</lpage>
<pub-id pub-id-type="pmid">15264254</pub-id>
</mixed-citation>
</ref>
<ref id="bib37">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Polach</surname>
<given-names>K. J.</given-names>
</name>
<name>
<surname>Widom</surname>
<given-names>J.</given-names>
</name>
</person-group>
,
<year>1996</year>
<article-title>A model for the cooperative binding of eukaryotic regulatory proteins to nucleosomal target sites</article-title>
.
<source>J. Mol. Biol.</source>
<volume>258</volume>
:
<fpage>800</fpage>
<lpage>812</lpage>
<pub-id pub-id-type="pmid">8637011</pub-id>
</mixed-citation>
</ref>
<ref id="bib38">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Pollard</surname>
<given-names>K. S.</given-names>
</name>
<name>
<surname>Hubisz</surname>
<given-names>M. J.</given-names>
</name>
<name>
<surname>Rosenbloom</surname>
<given-names>K. R.</given-names>
</name>
<name>
<surname>Siepel</surname>
<given-names>A.</given-names>
</name>
</person-group>
,
<year>2010</year>
<article-title>Detection of nonneutral substitution rates on mammalian phylogenies</article-title>
.
<source>Genome Res.</source>
<volume>20</volume>
:
<fpage>110</fpage>
<lpage>121</lpage>
<pub-id pub-id-type="pmid">19858363</pub-id>
</mixed-citation>
</ref>
<ref id="bib39">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Portales-Casamar</surname>
<given-names>E.</given-names>
</name>
<name>
<surname>Thongjuea</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Kwon</surname>
<given-names>A. T.</given-names>
</name>
<name>
<surname>Arenillas</surname>
<given-names>D.</given-names>
</name>
<name>
<surname>Zhao</surname>
<given-names>X.</given-names>
</name>
<etal></etal>
</person-group>
,
<year>2010</year>
<article-title>JASPAR 2010: the greatly expanded open-access database of transcription factor binding profiles</article-title>
.
<source>Nucleic Acids Res.</source>
<volume>38</volume>
:
<fpage>D105</fpage>
<lpage>D110</lpage>
<pub-id pub-id-type="pmid">19906716</pub-id>
</mixed-citation>
</ref>
<ref id="bib40">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Rishi</surname>
<given-names>V.</given-names>
</name>
<name>
<surname>Bhattacharya</surname>
<given-names>P.</given-names>
</name>
<name>
<surname>Chatterjee</surname>
<given-names>R.</given-names>
</name>
<name>
<surname>Rozenberg</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Zhao</surname>
<given-names>J.</given-names>
</name>
<etal></etal>
</person-group>
,
<year>2010</year>
<article-title>CpG methylation of half-CRE sequences creates C/EBPalpha binding sites that activate some tissue-specific genes</article-title>
.
<source>Proc. Natl. Acad. Sci. USA</source>
<volume>107</volume>
:
<fpage>20311</fpage>
<lpage>20316</lpage>
<pub-id pub-id-type="pmid">21059933</pub-id>
</mixed-citation>
</ref>
<ref id="bib41">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Rozenberg</surname>
<given-names>J. M.</given-names>
</name>
<name>
<surname>Shlyakhtenko</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>Glass</surname>
<given-names>K.</given-names>
</name>
<name>
<surname>Rishi</surname>
<given-names>V.</given-names>
</name>
<name>
<surname>Myakishev</surname>
<given-names>M. V.</given-names>
</name>
<etal></etal>
</person-group>
,
<year>2008</year>
<article-title>All and only CpG containing sequences are enriched in promoters abundantly bound by RNA polymerase II in multiple tissues</article-title>
.
<source>BMC Genomics</source>
<volume>9</volume>
:
<fpage>67</fpage>
<pub-id pub-id-type="pmid">18252004</pub-id>
</mixed-citation>
</ref>
<ref id="bib42">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Sabo</surname>
<given-names>P. J.</given-names>
</name>
<name>
<surname>Humbert</surname>
<given-names>R.</given-names>
</name>
<name>
<surname>Hawrylycz</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Wallace</surname>
<given-names>J. C.</given-names>
</name>
<name>
<surname>Dorschner</surname>
<given-names>M. O.</given-names>
</name>
<etal></etal>
</person-group>
,
<year>2004</year>
<article-title>Genome-wide identification of DNaseI hypersensitive sites using active chromatin sequence libraries</article-title>
.
<source>Proc. Natl. Acad. Sci. USA</source>
<volume>101</volume>
:
<fpage>4537</fpage>
<lpage>4542</lpage>
<pub-id pub-id-type="pmid">15070753</pub-id>
</mixed-citation>
</ref>
<ref id="bib43">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Sawada</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Simizu</surname>
<given-names>N.</given-names>
</name>
<name>
<surname>Suzuki</surname>
<given-names>F.</given-names>
</name>
<name>
<surname>Sawa</surname>
<given-names>C.</given-names>
</name>
<name>
<surname>Goto</surname>
<given-names>M.</given-names>
</name>
<etal></etal>
</person-group>
,
<year>1999</year>
<article-title>Synergistic transcriptional activation by hGABP and select members of the activation transcription factor/cAMP response element-binding protein family</article-title>
.
<source>J. Biol. Chem.</source>
<volume>274</volume>
:
<fpage>35475</fpage>
<lpage>35482</lpage>
<pub-id pub-id-type="pmid">10585419</pub-id>
</mixed-citation>
</ref>
<ref id="bib44">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Schumacher</surname>
<given-names>M. A.</given-names>
</name>
<name>
<surname>Goodman</surname>
<given-names>R. H.</given-names>
</name>
<name>
<surname>Brennan</surname>
<given-names>R. G.</given-names>
</name>
</person-group>
,
<year>2000</year>
<article-title>The structure of a CREB bZIP.somatostatin CRE complex reveals the basis for selective dimerization and divalent cation-enhanced DNA binding</article-title>
.
<source>J. Biol. Chem.</source>
<volume>275</volume>
:
<fpage>35242</fpage>
<lpage>35247</lpage>
<pub-id pub-id-type="pmid">10952992</pub-id>
</mixed-citation>
</ref>
<ref id="bib45">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Smale</surname>
<given-names>S. T.</given-names>
</name>
<name>
<surname>Kadonaga</surname>
<given-names>J. T.</given-names>
</name>
</person-group>
,
<year>2003</year>
<article-title>The RNA polymerase II core promoter</article-title>
.
<source>Annu. Rev. Biochem.</source>
<volume>72</volume>
:
<fpage>449</fpage>
<lpage>479</lpage>
<pub-id pub-id-type="pmid">12651739</pub-id>
</mixed-citation>
</ref>
<ref id="bib46">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Thomas-Chollier</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Herrmann</surname>
<given-names>C.</given-names>
</name>
<name>
<surname>Defrance</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Sand</surname>
<given-names>O.</given-names>
</name>
<name>
<surname>Thieffry</surname>
<given-names>D.</given-names>
</name>
<etal></etal>
</person-group>
,
<year>2011</year>
<article-title>RSAT peak-motifs: motif analysis in full-size ChIP-seq datasets</article-title>
.
<source>Nucleic Acids Res.</source>
<volume>40</volume>
:
<fpage>e31</fpage>
<pub-id pub-id-type="pmid">22156162</pub-id>
</mixed-citation>
</ref>
<ref id="bib47">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Umezawa</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>Yamamoto</surname>
<given-names>H.</given-names>
</name>
<name>
<surname>Rhodes</surname>
<given-names>K.</given-names>
</name>
<name>
<surname>Klemsz</surname>
<given-names>M. J.</given-names>
</name>
<name>
<surname>Maki</surname>
<given-names>R. A.</given-names>
</name>
<etal></etal>
</person-group>
,
<year>1997</year>
<article-title>Methylation of an ETS site in the intron enhancer of the keratin 18 gene participates in tissue-specific repression</article-title>
.
<source>Mol. Cell. Biol.</source>
<volume>17</volume>
:
<fpage>4885</fpage>
<lpage>4894</lpage>
<pub-id pub-id-type="pmid">9271368</pub-id>
</mixed-citation>
</ref>
<ref id="bib48">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Valouev</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>Johnson</surname>
<given-names>D. S.</given-names>
</name>
<name>
<surname>Sundquist</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>Medina</surname>
<given-names>C.</given-names>
</name>
<name>
<surname>Anton</surname>
<given-names>E.</given-names>
</name>
<etal></etal>
</person-group>
,
<year>2008</year>
<article-title>Genome-wide analysis of transcription factor binding sites based on ChIP-Seq data</article-title>
.
<source>Nat. Methods</source>
<volume>5</volume>
:
<fpage>829</fpage>
<lpage>834</lpage>
<pub-id pub-id-type="pmid">19160518</pub-id>
</mixed-citation>
</ref>
<ref id="bib49">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Vinson</surname>
<given-names>C.</given-names>
</name>
<name>
<surname>Chatterjee</surname>
<given-names>R.</given-names>
</name>
<name>
<surname>Fitzgerald</surname>
<given-names>P.</given-names>
</name>
</person-group>
,
<year>2011</year>
<article-title>Transcription factor binding sites and other features in human and Drosophila proximal promoters</article-title>
.
<source>Subcell. Biochem.</source>
<volume>52</volume>
:
<fpage>205</fpage>
<lpage>222</lpage>
<pub-id pub-id-type="pmid">21557085</pub-id>
</mixed-citation>
</ref>
<ref id="bib50">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Wei</surname>
<given-names>G. H.</given-names>
</name>
<name>
<surname>Badis</surname>
<given-names>G.</given-names>
</name>
<name>
<surname>Berger</surname>
<given-names>M. F.</given-names>
</name>
<name>
<surname>Kivioja</surname>
<given-names>T.</given-names>
</name>
<name>
<surname>Palin</surname>
<given-names>K.</given-names>
</name>
<etal></etal>
</person-group>
,
<year>2010</year>
<article-title>Genome-wide analysis of ETS-family DNA-binding in vitro and in vivo</article-title>
.
<source>EMBO J.</source>
<volume>29</volume>
:
<fpage>2147</fpage>
<lpage>2160</lpage>
<pub-id pub-id-type="pmid">20517297</pub-id>
</mixed-citation>
</ref>
<ref id="bib51">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Whitington</surname>
<given-names>T.</given-names>
</name>
<name>
<surname>Frith</surname>
<given-names>M. C.</given-names>
</name>
<name>
<surname>Johnson</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Bailey</surname>
<given-names>T. L.</given-names>
</name>
</person-group>
,
<year>2011</year>
<article-title>Inferring transcription factor complexes from ChIP-seq data</article-title>
.
<source>Nucleic Acids Res.</source>
<volume>39</volume>
:
<fpage>e98</fpage>
<pub-id pub-id-type="pmid">21602262</pub-id>
</mixed-citation>
</ref>
<ref id="bib52">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Wunderlich</surname>
<given-names>Z.</given-names>
</name>
<name>
<surname>Mirny</surname>
<given-names>L. A.</given-names>
</name>
</person-group>
,
<year>2009</year>
<article-title>Different gene regulation strategies revealed by analysis of binding motifs</article-title>
.
<source>Trends Genet.</source>
<volume>25</volume>
:
<fpage>434</fpage>
<lpage>440</lpage>
<pub-id pub-id-type="pmid">19815308</pub-id>
</mixed-citation>
</ref>
<ref id="bib53">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Xie</surname>
<given-names>X.</given-names>
</name>
<name>
<surname>Lu</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Kulbokas</surname>
<given-names>E. J.</given-names>
</name>
<name>
<surname>Golub</surname>
<given-names>T. R.</given-names>
</name>
<name>
<surname>Mootha</surname>
<given-names>V.</given-names>
</name>
<etal></etal>
</person-group>
,
<year>2005</year>
<article-title>Systematic discovery of regulatory motifs in human promoters and 3′ UTRs by comparison of several mammals</article-title>
.
<source>Nature</source>
<volume>434</volume>
:
<fpage>338</fpage>
<lpage>345</lpage>
<pub-id pub-id-type="pmid">15735639</pub-id>
</mixed-citation>
</ref>
<ref id="bib54">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Zhang</surname>
<given-names>X.</given-names>
</name>
<name>
<surname>Odom</surname>
<given-names>D. T.</given-names>
</name>
<name>
<surname>Koo</surname>
<given-names>S. H.</given-names>
</name>
<name>
<surname>Conkright</surname>
<given-names>M. D.</given-names>
</name>
<name>
<surname>Canettieri</surname>
<given-names>G.</given-names>
</name>
<etal></etal>
</person-group>
,
<year>2005</year>
<article-title>Genome-wide analysis of cAMP-response element binding protein occupancy, phosphorylation, and target gene activation in human tissues</article-title>
.
<source>Proc. Natl. Acad. Sci. USA</source>
<volume>102</volume>
:
<fpage>4459</fpage>
<lpage>4464</lpage>
<pub-id pub-id-type="pmid">15753290</pub-id>
</mixed-citation>
</ref>
</ref-list>
</back>
</pmc>
</record>

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Sante/explor/MersV1/Data/Pmc/Corpus
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000C909 | SxmlIndent | more

Or

HfdSelect -h $EXPLOR_AREA/Data/Pmc/Corpus/biblio.hfd -nk 000C909 | SxmlIndent | more

To link to this page within the Wicri network

{{Explor lien
   |wiki=    Sante
   |area=    MersV1
   |flux=    Pmc
   |étape=   Corpus
   |type=    RBID
   |clé=     
   |texte=   
}}


This area was generated with Dilib version V0.6.33.
Data generation: Mon Apr 20 23:26:43 2020. Site generation: Sat Mar 27 09:06:09 2021