MERS Exploration Server

Warning: this site is under development!
Warning: this site was generated automatically from raw corpora.
The information has therefore not been validated.

Internal identifier: 0005469 (Pmc/Corpus); previous: 0005468; next: 0005470


The document in XML format

<record>
<TEI>
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en">All and only CpG containing sequences are enriched in promoters abundantly bound by RNA polymerase II in multiple tissues</title>
<author>
<name sortKey="Rozenberg, Julian M" sort="Rozenberg, Julian M" uniqKey="Rozenberg J" first="Julian M" last="Rozenberg">Julian M. Rozenberg</name>
<affiliation>
<nlm:aff id="I1">Laboratory of Metabolism, National Cancer Institute, Bethesda, MD 20892 USA</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Shlyakhtenko, Andrey" sort="Shlyakhtenko, Andrey" uniqKey="Shlyakhtenko A" first="Andrey" last="Shlyakhtenko">Andrey Shlyakhtenko</name>
<affiliation>
<nlm:aff id="I1">Laboratory of Metabolism, National Cancer Institute, Bethesda, MD 20892 USA</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Glass, Kimberly" sort="Glass, Kimberly" uniqKey="Glass K" first="Kimberly" last="Glass">Kimberly Glass</name>
<affiliation>
<nlm:aff id="I2">Physics Department, University of Maryland, College Park, MD 20742, USA</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Rishi, Vikas" sort="Rishi, Vikas" uniqKey="Rishi V" first="Vikas" last="Rishi">Vikas Rishi</name>
<affiliation>
<nlm:aff id="I1">Laboratory of Metabolism, National Cancer Institute, Bethesda, MD 20892 USA</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Myakishev, Maxim V" sort="Myakishev, Maxim V" uniqKey="Myakishev M" first="Maxim V" last="Myakishev">Maxim V. Myakishev</name>
<affiliation>
<nlm:aff id="I1">Laboratory of Metabolism, National Cancer Institute, Bethesda, MD 20892 USA</nlm:aff>
</affiliation>
<affiliation>
<nlm:aff id="I3">Department of Dermatology University of Rochester School of Medicine, Rochester, NY 14642, USA</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Fitzgerald, Peter C" sort="Fitzgerald, Peter C" uniqKey="Fitzgerald P" first="Peter C" last="Fitzgerald">Peter C. Fitzgerald</name>
<affiliation>
<nlm:aff id="I4">Genome Analysis Unit, National Cancer Institute, Bethesda, MD 20892 USA</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Vinson, Charles" sort="Vinson, Charles" uniqKey="Vinson C" first="Charles" last="Vinson">Charles Vinson</name>
<affiliation>
<nlm:aff id="I1">Laboratory of Metabolism, National Cancer Institute, Bethesda, MD 20892 USA</nlm:aff>
</affiliation>
</author>
</titleStmt>
<publicationStmt>
<idno type="wicri:source">PMC</idno>
<idno type="pmid">18252004</idno>
<idno type="pmc">2267717</idno>
<idno type="url">http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2267717</idno>
<idno type="RBID">PMC:2267717</idno>
<idno type="doi">10.1186/1471-2164-9-67</idno>
<date when="2008">2008</date>
<idno type="wicri:Area/Pmc/Corpus">000546</idno>
<idno type="wicri:explorRef" wicri:stream="Pmc" wicri:step="Corpus" wicri:corpus="PMC">000546</idno>
</publicationStmt>
<sourceDesc>
<biblStruct>
<analytic>
<title xml:lang="en" level="a" type="main">All and only CpG containing sequences are enriched in promoters abundantly bound by RNA polymerase II in multiple tissues</title>
<author>
<name sortKey="Rozenberg, Julian M" sort="Rozenberg, Julian M" uniqKey="Rozenberg J" first="Julian M" last="Rozenberg">Julian M. Rozenberg</name>
<affiliation>
<nlm:aff id="I1">Laboratory of Metabolism, National Cancer Institute, Bethesda, MD 20892 USA</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Shlyakhtenko, Andrey" sort="Shlyakhtenko, Andrey" uniqKey="Shlyakhtenko A" first="Andrey" last="Shlyakhtenko">Andrey Shlyakhtenko</name>
<affiliation>
<nlm:aff id="I1">Laboratory of Metabolism, National Cancer Institute, Bethesda, MD 20892 USA</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Glass, Kimberly" sort="Glass, Kimberly" uniqKey="Glass K" first="Kimberly" last="Glass">Kimberly Glass</name>
<affiliation>
<nlm:aff id="I2">Physics Department, University of Maryland, College Park, MD 20742, USA</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Rishi, Vikas" sort="Rishi, Vikas" uniqKey="Rishi V" first="Vikas" last="Rishi">Vikas Rishi</name>
<affiliation>
<nlm:aff id="I1">Laboratory of Metabolism, National Cancer Institute, Bethesda, MD 20892 USA</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Myakishev, Maxim V" sort="Myakishev, Maxim V" uniqKey="Myakishev M" first="Maxim V" last="Myakishev">Maxim V. Myakishev</name>
<affiliation>
<nlm:aff id="I1">Laboratory of Metabolism, National Cancer Institute, Bethesda, MD 20892 USA</nlm:aff>
</affiliation>
<affiliation>
<nlm:aff id="I3">Department of Dermatology University of Rochester School of Medicine, Rochester, NY 14642, USA</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Fitzgerald, Peter C" sort="Fitzgerald, Peter C" uniqKey="Fitzgerald P" first="Peter C" last="Fitzgerald">Peter C. Fitzgerald</name>
<affiliation>
<nlm:aff id="I4">Genome Analysis Unit, National Cancer Institute, Bethesda, MD 20892 USA</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Vinson, Charles" sort="Vinson, Charles" uniqKey="Vinson C" first="Charles" last="Vinson">Charles Vinson</name>
<affiliation>
<nlm:aff id="I1">Laboratory of Metabolism, National Cancer Institute, Bethesda, MD 20892 USA</nlm:aff>
</affiliation>
</author>
</analytic>
<series>
<title level="j">BMC Genomics</title>
<idno type="eISSN">1471-2164</idno>
<imprint>
<date when="2008">2008</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
</fileDesc>
<profileDesc>
<textClass></textClass>
</profileDesc>
</teiHeader>
<front>
<div type="abstract" xml:lang="en">
<sec>
<title>Background</title>
<p>The promoters of housekeeping genes are well-bound by RNA polymerase II (RNAP) in different tissues. Although the promoters of these genes are known to contain CpG islands, the specific DNA sequences that are associated with high RNAP binding to housekeeping promoters have not been described.</p>
</sec>
<sec>
<title>Results</title>
<p>ChIP-chip experiments from three mouse tissues, liver, heart ventricles, and primary keratinocytes, indicate that 94% of promoters have similar RNAP binding, ranging from well-bound to poorly-bound in all tissues. Using all 8-base pair long sequences as a test set, we have identified the DNA sequences that are enriched in promoters of housekeeping genes, focusing on those DNA sequences which are preferentially localized in the proximal promoter. We observe a bimodal distribution. Virtually all sequences enriched in promoters with high RNAP binding values contain a CpG dinucleotide. These results suggest that only transcription factor binding sites (TFBS) that contain the CpG dinucleotide are involved in RNAP binding to housekeeping promoters while TFBS that do not contain a CpG are involved in regulated promoter activity. Abundant 8-mers that are preferentially localized in the proximal promoters and exhibit the best enrichment in RNAP-bound promoters are all variants of six known CpG-containing TFBS: ETS, NRF-1, BoxA, SP1, CRE, and E-Box. The frequency of these six DNA motifs can predict housekeeping promoters as accurately as the presence of a CpG island, suggesting that they are the structural elements critical for CpG island function. Experimental EMSA results demonstrate that methylation of the CpG in the ETS, NRF-1, and SP1 motifs prevents DNA binding in nuclear extracts in both keratinocytes and liver.</p>
</sec>
<sec>
<title>Conclusion</title>
<p>In general, TFBS that do not contain a CpG are involved in regulated gene expression while TFBS that contain a CpG are involved in constitutive gene expression with some CpG containing sequences also involved in inducible and tissue specific gene regulation. These TFBS are not bound when the CpG is methylated. Unmethylated CpG dinucleotides in the TFBS in CpG islands allow the transcription factors to find their binding sites which occur only in promoters, in turn localizing RNAP to promoters.</p>
</sec>
</div>
</front>
<back>
<div1 type="bibliography">
<listBibl>
<biblStruct></biblStruct>
<biblStruct></biblStruct>
<biblStruct></biblStruct>
<biblStruct></biblStruct>
<biblStruct></biblStruct>
<biblStruct></biblStruct>
<biblStruct></biblStruct>
<biblStruct></biblStruct>
<biblStruct></biblStruct>
<biblStruct></biblStruct>
<biblStruct></biblStruct>
<biblStruct></biblStruct>
<biblStruct></biblStruct>
<biblStruct></biblStruct>
<biblStruct></biblStruct>
<biblStruct></biblStruct>
<biblStruct></biblStruct>
<biblStruct></biblStruct>
<biblStruct></biblStruct>
<biblStruct></biblStruct>
<biblStruct></biblStruct>
<biblStruct></biblStruct>
<biblStruct></biblStruct>
<biblStruct></biblStruct>
<biblStruct></biblStruct>
<biblStruct></biblStruct>
<biblStruct></biblStruct>
<biblStruct></biblStruct>
<biblStruct></biblStruct>
<biblStruct></biblStruct>
<biblStruct></biblStruct>
<biblStruct></biblStruct>
<biblStruct></biblStruct>
<biblStruct></biblStruct>
<biblStruct></biblStruct>
<biblStruct></biblStruct>
<biblStruct></biblStruct>
<biblStruct></biblStruct>
<biblStruct></biblStruct>
<biblStruct></biblStruct>
<biblStruct></biblStruct>
<biblStruct></biblStruct>
<biblStruct></biblStruct>
<biblStruct></biblStruct>
<biblStruct></biblStruct>
<biblStruct></biblStruct>
<biblStruct></biblStruct>
<biblStruct></biblStruct>
<biblStruct></biblStruct>
<biblStruct></biblStruct>
<biblStruct></biblStruct>
</listBibl>
</div1>
</back>
</TEI>
<pmc article-type="research-article">
<pmc-dir>properties open_access</pmc-dir>
<front>
<journal-meta>
<journal-id journal-id-type="nlm-ta">BMC Genomics</journal-id>
<journal-title>BMC Genomics</journal-title>
<issn pub-type="epub">1471-2164</issn>
<publisher>
<publisher-name>BioMed Central</publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="pmid">18252004</article-id>
<article-id pub-id-type="pmc">2267717</article-id>
<article-id pub-id-type="publisher-id">1471-2164-9-67</article-id>
<article-id pub-id-type="doi">10.1186/1471-2164-9-67</article-id>
<article-categories>
<subj-group subj-group-type="heading">
<subject>Research Article</subject>
</subj-group>
</article-categories>
<title-group>
<article-title>All and only CpG containing sequences are enriched in promoters abundantly bound by RNA polymerase II in multiple tissues</article-title>
</title-group>
<contrib-group>
<contrib id="A1" contrib-type="author">
<name>
<surname>Rozenberg</surname>
<given-names>Julian M</given-names>
</name>
<xref ref-type="aff" rid="I1">1</xref>
<email>rozenbej@mail.nih.gov</email>
</contrib>
<contrib id="A2" contrib-type="author">
<name>
<surname>Shlyakhtenko</surname>
<given-names>Andrey</given-names>
</name>
<xref ref-type="aff" rid="I1">1</xref>
<email>shlyakha@mail.nih.gov</email>
</contrib>
<contrib id="A3" contrib-type="author">
<name>
<surname>Glass</surname>
<given-names>Kimberly</given-names>
</name>
<xref ref-type="aff" rid="I2">2</xref>
<email>kg234f@nih.gov</email>
</contrib>
<contrib id="A4" contrib-type="author">
<name>
<surname>Rishi</surname>
<given-names>Vikas</given-names>
</name>
<xref ref-type="aff" rid="I1">1</xref>
<email>rishiv@mail.nih.gov</email>
</contrib>
<contrib id="A5" contrib-type="author">
<name>
<surname>Myakishev</surname>
<given-names>Maxim V</given-names>
</name>
<xref ref-type="aff" rid="I1">1</xref>
<xref ref-type="aff" rid="I3">3</xref>
<email>Max_Myakishev@urmc.rochester.edu</email>
</contrib>
<contrib id="A6" contrib-type="author">
<name>
<surname>FitzGerald</surname>
<given-names>Peter C</given-names>
</name>
<xref ref-type="aff" rid="I4">4</xref>
<email>FitzgePe@mail.nih.gov</email>
</contrib>
<contrib id="A7" corresp="yes" contrib-type="author">
<name>
<surname>Vinson</surname>
<given-names>Charles</given-names>
</name>
<xref ref-type="aff" rid="I1">1</xref>
<email>Vinsonc@mail.nih.gov</email>
</contrib>
</contrib-group>
<aff id="I1">
<label>1</label>
Laboratory of Metabolism, National Cancer Institute, Bethesda, MD 20892 USA</aff>
<aff id="I2">
<label>2</label>
Physics Department, University of Maryland, College Park, MD 20742, USA</aff>
<aff id="I3">
<label>3</label>
Department of Dermatology University of Rochester School of Medicine, Rochester, NY 14642, USA</aff>
<aff id="I4">
<label>4</label>
Genome Analysis Unit, National Cancer Institute, Bethesda, MD 20892 USA</aff>
<pub-date pub-type="collection">
<year>2008</year>
</pub-date>
<pub-date pub-type="epub">
<day>5</day>
<month>2</month>
<year>2008</year>
</pub-date>
<volume>9</volume>
<fpage>67</fpage>
<lpage>67</lpage>
<ext-link ext-link-type="uri" xlink:href="http://www.biomedcentral.com/1471-2164/9/67"></ext-link>
<history>
<date date-type="received">
<day>8</day>
<month>10</month>
<year>2007</year>
</date>
<date date-type="accepted">
<day>5</day>
<month>2</month>
<year>2008</year>
</date>
</history>
<permissions>
<copyright-statement>Copyright © 2008 Rozenberg et al; licensee BioMed Central Ltd.</copyright-statement>
<copyright-year>2008</copyright-year>
<copyright-holder>Rozenberg et al; licensee BioMed Central Ltd.</copyright-holder>
<license license-type="open-access" xlink:href="http://creativecommons.org/licenses/by/2.0">
<p>This is an Open Access article distributed under the terms of the Creative Commons Attribution License (
<ext-link ext-link-type="uri" xlink:href="http://creativecommons.org/licenses/by/2.0"></ext-link>
), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</p>
<pmc-comment> Rozenberg M Julian rozenbej@mail.nih.gov All and only CpG containing sequences are enriched in promoters abundantly bound by RNA polymerase II in multiple tissues 2008BMC Genomics 9(1): 67-. (2008)1471-2164(2008)9:1&lt;67&gt;urn:ISSN:1471-2164</pmc-comment>
</license>
</permissions>
<abstract>
<sec>
<title>Background</title>
<p>The promoters of housekeeping genes are well-bound by RNA polymerase II (RNAP) in different tissues. Although the promoters of these genes are known to contain CpG islands, the specific DNA sequences that are associated with high RNAP binding to housekeeping promoters have not been described.</p>
</sec>
<sec>
<title>Results</title>
<p>ChIP-chip experiments from three mouse tissues, liver, heart ventricles, and primary keratinocytes, indicate that 94% of promoters have similar RNAP binding, ranging from well-bound to poorly-bound in all tissues. Using all 8-base pair long sequences as a test set, we have identified the DNA sequences that are enriched in promoters of housekeeping genes, focusing on those DNA sequences which are preferentially localized in the proximal promoter. We observe a bimodal distribution. Virtually all sequences enriched in promoters with high RNAP binding values contain a CpG dinucleotide. These results suggest that only transcription factor binding sites (TFBS) that contain the CpG dinucleotide are involved in RNAP binding to housekeeping promoters while TFBS that do not contain a CpG are involved in regulated promoter activity. Abundant 8-mers that are preferentially localized in the proximal promoters and exhibit the best enrichment in RNAP-bound promoters are all variants of six known CpG-containing TFBS: ETS, NRF-1, BoxA, SP1, CRE, and E-Box. The frequency of these six DNA motifs can predict housekeeping promoters as accurately as the presence of a CpG island, suggesting that they are the structural elements critical for CpG island function. Experimental EMSA results demonstrate that methylation of the CpG in the ETS, NRF-1, and SP1 motifs prevents DNA binding in nuclear extracts in both keratinocytes and liver.</p>
</sec>
<sec>
<title>Conclusion</title>
<p>In general, TFBS that do not contain a CpG are involved in regulated gene expression while TFBS that contain a CpG are involved in constitutive gene expression with some CpG containing sequences also involved in inducible and tissue specific gene regulation. These TFBS are not bound when the CpG is methylated. Unmethylated CpG dinucleotides in the TFBS in CpG islands allow the transcription factors to find their binding sites which occur only in promoters, in turn localizing RNAP to promoters.</p>
</sec>
</abstract>
</article-meta>
</front>
<body>
<sec>
<title>Background</title>
<p>The promoter region of genes is typically divided into two regions: the core or basal promoter region and the proximal promoter. The core promoter region stretches from around -50 bp to +20 bp and is the location in the promoter where the pre-initiation complex forms and the general transcriptional machinery assembles, including RNA polymerase II (RNAP). The proximal promoter extends from -200 bp to the transcriptional start site (TSS) and contains transcription factor binding sites (TFBS) that are critical for the recruitment of RNA polymerase II (RNAP) to DNA [
<xref ref-type="bibr" rid="B2">2</xref>
-
<xref ref-type="bibr" rid="B4">4</xref>
]. In mammalian genomes, the CpG dinucleotide occurs at 20% of the expected frequency [
<xref ref-type="bibr" rid="B5">5</xref>
] and is typically methylated both in cell culture and animal tissues [
<xref ref-type="bibr" rid="B6">6</xref>
,
<xref ref-type="bibr" rid="B7">7</xref>
]. The exception is in CpG islands. CpG islands are defined as regions in the DNA at least 200 bp long where C+G comprise more than 50% of the nucleotides and CpG dinucleotides occur at greater than 60% of the expected frequency (this represents roughly 8 or more CpGs in 200 bp) [
<xref ref-type="bibr" rid="B8">8</xref>
]. The presence of CpG islands is associated with gene regulatory regions [
<xref ref-type="bibr" rid="B9">9</xref>
] and in the promoters of genes generally correlates with binding by RNA polymerase II (RNAP) [
<xref ref-type="bibr" rid="B9">9</xref>
]. Promoters of housekeeping genes are constitutively bound by RNAP in all tissues, while regulated promoters, either tissue-specific or inducible, are selectively bound by RNAP only in certain tissues or contexts, respectively [
<xref ref-type="bibr" rid="B2">2</xref>
].</p>
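<p>The CpG island criteria quoted above can be sketched as a simple check on one DNA window. This is only an illustration: the function name and its parameters are hypothetical, and the observed/expected CpG ratio is computed with the commonly used formula (number of CpG times window length, divided by number of C times number of G), which is an assumption rather than something stated in this article.</p>
<preformat>
```python
def is_cpg_island(seq, min_len=200, min_gc=0.5, min_obs_exp=0.6):
    """Check one DNA window against the three CpG island criteria.

    Hypothetical helper for illustration; thresholds follow the definition
    in the text (length of at least 200 bp, C+G above 50%, CpG above 60%
    of the expected frequency).
    """
    seq = seq.upper()
    n = len(seq)
    if min_len > n:
        return False
    c = seq.count("C")
    g = seq.count("G")
    if c == 0 or g == 0:
        return False  # no CpG is possible without both C and G
    cpg = seq.count("CG")
    gc_fraction = (c + g) / n
    # Assumed formula: observed CpG count over the count expected from
    # the C and G content of the window.
    obs_exp = cpg * n / (c * g)
    return gc_fraction > min_gc and obs_exp > min_obs_exp
```
</preformat>
<p>For example, under these assumptions a 200 bp window made only of alternating C and G passes the check, whereas a window of 100 C followed by 100 G fails because it contains a single CpG.</p>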
<p>Three advances allow us to interrogate the genome-wide properties of promoters. First is the availability of complete genomic sequences. Second is the determination of full-length cDNAs that can identify the TSS and proximal promoter [
<xref ref-type="bibr" rid="B10">10</xref>
]. Third is the determination of the chromatin architecture of the genome by the identification of hypersensitive sites [
<xref ref-type="bibr" rid="B11">11</xref>
,
<xref ref-type="bibr" rid="B12">12</xref>
] or the location of particular proteins or their modified forms using chromatin immunoprecipitation followed by microarray analysis (ChIP-chip) [
<xref ref-type="bibr" rid="B13">13</xref>
]. Although ChIP-chip experiments have identified the location of RNAP and components of the preinitiation complex in particular tissues [
<xref ref-type="bibr" rid="B9">9</xref>
,
<xref ref-type="bibr" rid="B14">14</xref>
], these experiments have not been done systematically over a range of tissues.</p>
<p>We show that all and only CpG containing DNA sequences are associated with RNAP binding to the same promoter in multiple tissues. Many DNA sequences are more abundant near the TSS than elsewhere [
<xref ref-type="bibr" rid="B15">15</xref>
-
<xref ref-type="bibr" rid="B18">18</xref>
] and the six most abundant CpG containing sequences that are localized in proximal promoters are known TFBS and can predict RNAP binding to housekeeping promoters as accurately as the presence of CpG islands.</p>
</sec>
<sec>
<title>Results and discussion</title>
<sec>
<title>Binding of RNAP and H3K9me2 to mouse promoters in keratinocytes, liver, and heart ventricles</title>
<p>To gain insight into the DNA sequence properties of housekeeping promoters, we analyzed RNAP binding to promoters in three mouse tissues: primary skin keratinocytes, liver, and heart ventricles. Using ChIP-chip experiments [
<xref ref-type="bibr" rid="B19">19</xref>
], we determined the genomic localization of initiating (hypo-phosphorylated) RNAP [
<xref ref-type="bibr" rid="B20">20</xref>
,
<xref ref-type="bibr" rid="B21">21</xref>
] in all three tissues (Figure
<xref ref-type="fig" rid="F1">1A–C</xref>
). DNA from the RNAP ChIP analysis was amplified and hybridized to Nimblegen mouse promoter microarrays containing 15 probes spanning from -1,000 bp to +500 bp (see methods). Signal intensities were averaged for each promoter to produce a number representing binding at each promoter. This produced a graded binding of RNAP to promoter regions as has been previously observed [
<xref ref-type="bibr" rid="B9">9</xref>
,
<xref ref-type="bibr" rid="B14">14</xref>
,
<xref ref-type="bibr" rid="B22">22</xref>
]. Raw data for these ChIP-chip experiments can be found at the Vinson laboratory Web site [
<xref ref-type="bibr" rid="B1">1</xref>
]. We limited the following analysis of DNA sequence properties to the set of 14,790 promoters that contains neither similar/duplicated sequences nor a poorly annotated transcriptional start site (TSS).</p>
<fig position="float" id="F1">
<label>Figure 1</label>
<caption>
<p>
<bold>A-C) </bold>
<bold>RNAP</bold>
<bold>binding to 14,790 promoters from ChIP-chip data in different mouse tissues with each spot representing a single promoter</bold>
.
<bold>A) </bold>
keratinocytes versus heart ventricles (R = +0.76).
<bold>B) </bold>
keratinocytes versus liver (R = +0.73).
<bold>C) </bold>
heart ventricle versus liver (R = +0.76).
<bold>D-F) </bold>
RNAP binding to the 13,861 promoters with similar RNAP binding values in heart, liver and keratinocytes.</p>
</caption>
<graphic xlink:href="1471-2164-9-67-1"></graphic>
</fig>
<p>To identify promoters that had similar RNAP binding values in all three tissues, we excluded genes where RNAP binding between any pair of tissues was significantly different. This excluded 534 tissue-specific (356 in liver, 131 in heart, and 47 in keratinocytes) promoters, and 395 with high RNAP binding in two of the three tissues. The remaining 13,861 promoters (94%) have similar RNAP binding in all three tissues, some being well bound by RNAP and others having little RNAP at the promoter (Figure
<xref ref-type="fig" rid="F1">1D–F</xref>
). For each of these 13,861 promoters, termed common RNAP promoters, RNAP binding values from the three tissues were normalized and averaged, producing a single number representing RNAP binding to a promoter across the three tissues.</p>
<p>To investigate the DNA sequence properties of the 13,861 common promoters (-1,000 bp to +500 bp) and determine potential transcription factor binding sites (TFBS) that are responsible for RNAP binding, we analyzed the occurrences of 8 bp-long DNA sequences (8-mers) in common RNAP promoters. 8-mers were chosen because their length is similar to that of known TFBS. 8-mers were counted on the sense and anti-sense strands because, with the exception of TATA [
<xref ref-type="bibr" rid="B23">23</xref>
], 8-mers are not restricted to a single strand. Of all 32,896 8-mers (38% contain CpG), we extensively characterized the 12,208 most abundant 8-mers (see materials and methods), of which only 20% contained a CpG, highlighting that the CpG dinucleotide is underrepresented even in promoter regions [
<xref ref-type="bibr" rid="B15">15</xref>
].</p>
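<p>The figure of 32,896 8-mers follows from treating an 8-mer and its reverse complement as the same sequence: (4^8 + 4^4) / 2 = 32,896, where 4^4 = 256 is the number of 8-mers that are their own reverse complement. A minimal sketch of such strand-independent counting is given here; the function names are hypothetical, and the choice to represent each pair by its alphabetically smaller member is an assumption made for illustration, not the authors' documented implementation.</p>
<preformat>
```python
from itertools import product

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(kmer):
    """Return the reverse complement of an uppercase DNA string."""
    return kmer.translate(COMPLEMENT)[::-1]


def canonical(kmer):
    """Represent a k-mer and its reverse complement by one shared key."""
    return min(kmer, reverse_complement(kmer))


def all_canonical_kmers(k=8):
    """Enumerate every distinct strand-independent k-mer."""
    return {canonical("".join(p)) for p in product("ACGT", repeat=k)}


def count_kmers(seq, k=8):
    """Count strand-independent k-mers in one promoter sequence."""
    counts = {}
    seq = seq.upper()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if set(kmer) - set("ACGT"):
            continue  # skip windows containing N or other ambiguity codes
        key = canonical(kmer)
        counts[key] = counts.get(key, 0) + 1
    return counts
```
</preformat>
<p>With this representation, an occurrence read from either strand increments the same counter, so a promoter only needs to be scanned once.</p>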
</sec>
<sec>
<title>All 8-mers enriched in promoters well bound by RNAP in multiple tissues contain a CpG dinucleotide</title>
<p>To measure 8-mer enrichment in promoters commonly bound by RNAP, we calculated a term, "8-mer-association-with-RNAP", for all 8-mers. For each 8-mer, we first identify all the promoters that contain it and average the RNAP binding values of those promoters; this average is then normalized by dividing by the average RNAP binding value of all common promoters. A histogram of these values has a bimodal distribution: 20% of 8-mers are associated with high RNAP binding to common RNAP promoters (Figure
<xref ref-type="fig" rid="F2">2</xref>
). This result suggests that the graded binding of RNAP to promoters is caused by a combination of 8-mers, some of which favor RNAP binding and others which do not. The region of the promoter (-1,000 bp to +500 bp) critical for the observed bimodal distribution extends from -600 bp to +400 bp (see Additional file
<xref ref-type="supplementary-material" rid="S1">1</xref>
). Strikingly, nearly all the 8-mers that are associated with RNAP binding contain the CpG dinucleotide while virtually none of the remaining 8-mers contain a CpG. In contrast to the CpG dinucleotide, the other dinucleotides did not exclusively occur in either part of the bimodal distribution (Additional file
<xref ref-type="supplementary-material" rid="S2">2</xref>
). A spreadsheet containing the 8-mer-association-with-RNAP for all 8-mers is included in the supplementary material (Additional file
<xref ref-type="supplementary-material" rid="S5">5</xref>
).</p>
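<p>The calculation described in this section can be sketched as follows. The function name and the input layout (two dictionaries keyed by a promoter identifier) are hypothetical choices for illustration; the sketch also assumes that a promoter contributes once to an 8-mer's average regardless of how many copies of the 8-mer it contains, and that the overall mean binding value is non-zero.</p>
<preformat>
```python
def kmer_association(promoter_kmers, binding):
    """Compute a normalized association value for every 8-mer.

    promoter_kmers: dict mapping promoter id to the set of 8-mers it contains.
    binding: dict mapping promoter id to its averaged RNAP binding value.
    Returns a dict mapping each 8-mer to (mean binding of the promoters
    containing it) divided by (mean binding of all promoters).
    """
    overall_mean = sum(binding.values()) / len(binding)
    totals = {}
    n_promoters = {}
    for pid, kmers in promoter_kmers.items():
        for kmer in kmers:
            totals[kmer] = totals.get(kmer, 0.0) + binding[pid]
            n_promoters[kmer] = n_promoters.get(kmer, 0) + 1
    return {
        kmer: (totals[kmer] / n_promoters[kmer]) / overall_mean
        for kmer in totals
    }
```
</preformat>
<p>Values above 1 then indicate 8-mers found preferentially in promoters with above-average RNAP binding, and values below 1 indicate the opposite.</p>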
<fig position="float" id="F2">
<label>Figure 2</label>
<caption>
<p>8-mer-association-with-RNAP for abundant 8-mers calculated for 13,861 common promoters between -1,000 bp and +500 bp. 8-mers that contain a CpG are noted in black.</p>
</caption>
<graphic xlink:href="1471-2164-9-67-2"></graphic>
</fig>
<p>To evaluate if other types of promoters have a different enrichment of 8-mers, we examined the transcriptionally inactive genes marked by a post-translationally modified form of histone 3, H3K9me2 (lysine 9 containing a dimethyl group) [
<xref ref-type="bibr" rid="B24">24</xref>
,
<xref ref-type="bibr" rid="B25">25</xref>
]. In keratinocytes, ChIP-chip identification of H3K9me2 genomic localization negatively correlated with RNAP (correlation coefficient, R = -0.50) (Figure
<xref ref-type="fig" rid="F3">3A</xref>
). The 8-mer-association-with-H3K9me2 also had a bimodal distribution with the CpG containing 8-mers associating the least with H3K9me2 binding (Figure
<xref ref-type="fig" rid="F3">3B</xref>
). As anticipated (comparing Figure
<xref ref-type="fig" rid="F2">2</xref>
and
<xref ref-type="fig" rid="F3">3B</xref>
), practically all the 8-mers most associated with common RNAP binding are also least associated with H3K9me2 binding (Figure
<xref ref-type="fig" rid="F3">3C</xref>
). Similar results were obtained when all 8-mers were examined (Additional file
<xref ref-type="supplementary-material" rid="S3">3A–E</xref>
).</p>
<fig position="float" id="F3">
<label>Figure 3</label>
<caption>
<p>
<bold>A) </bold>
<bold>Binding of</bold>
<bold>RNAP</bold>
<bold>vs. H3K9me2 (R = -0.50) in mouse tissue culture keratinocytes.</bold>
<bold>B) </bold>
8-mer-association-with-H3K9me2 for 12,208 abundant 8-mers, calculated for 14,790 promoters between -1,000 bp and +500 bp; CpG containing 8-mers are noted in black.
<bold>C-E) </bold>
8-mer-association-with-RNAP vs. 8-mer-association-with-H3K9me2.
<bold>C) </bold>
All 8-mers. The association-with-RNAP and the association-with-H3K9me2 for the core promoter elements at their unique position in promoters is presented for TATA (TATAWAAR), INR (YYANWYY) and DPE (RGWYV).
<bold>D) </bold>
8-mers without a CpG.
<bold>E) </bold>
8-mers with a CpG.</p>
</caption>
<graphic xlink:href="1471-2164-9-67-3"></graphic>
</fig>
<p>The 8-mers with and without a CpG were plotted separately to highlight the few 8-mers that are exceptions to the general conclusion that only CpG containing sequences are associated with RNAP binding to promoters (Figure
<xref ref-type="fig" rid="F3">3D–E</xref>
). The most notable exception is the GACCAATC 8-mer, a CCAAT sequence that is enriched in housekeeping promoters.</p>
<p>Previous work indicated that ~50% of human promoters bound by RNAP contain the INR and DPE consensus sequences between -200 bp and +200 bp [
<xref ref-type="bibr" rid="B9">9</xref>
]. To see if these non-CpG-containing sequences were also exceptions to our general conclusion, we calculated the association-with-RNAP and association-with-H3K9me2 for TATA, INR and DPE in the set of promoters with similar RNAP binding values in the three tissues we have examined. This was accomplished by averaging the binding values of those promoters that contained the consensus sequence at the expected position [
<xref ref-type="bibr" rid="B3">3</xref>
]. In mouse, the consensus TATA is uniquely positioned in only 1.8% of promoters and has a very high association-with-H3K9me2 binding to promoters. The INR was uniquely positioned in only 9% of promoters and is associated with H3K9me2 bound promoters. DPE is not uniquely positioned in promoters, but occurs in 19% of promoters at the expected location and is also associated with H3K9me2 binding (Figure
<xref ref-type="fig" rid="F3">3C</xref>
). This suggests that TATA, INR and the DPE are not important for RNAP binding to promoters in multiple tissues. Presumably these sequences are important for tissue-specific gene expression.</p>
</sec>
<sec>
<title>CpG sequences are also associated with mRNA expression</title>
<p>We examined whether RNAP binding to the promoter correlates with mRNA expression levels in the genes whose promoters are bound similarly by RNAP in the three tissues examined. mRNA expression data for heart ventricle was obtained [
<xref ref-type="bibr" rid="B26">26</xref>
] and compared to RNAP binding levels for the 4,522 promoters that share a common identifier (Figure
<xref ref-type="fig" rid="F4">4A</xref>
). We calculated the 8-mer-association-with-mRNA-expression and found that the same 8-mers associated with RNAP binding to promoters are also associated with mRNA expression (Figure
<xref ref-type="fig" rid="F4">4B</xref>
). Thus, CpG-containing 8-mers are most enriched in promoters that have the highest RNAP binding and mRNA expression.</p>
<fig position="float" id="F4">
<label>Figure 4</label>
<caption>
<p>
<bold>A) </bold>
<bold>RNAP</bold>
<bold>binding to promoters vs. mRNA expression for 4,522 promoters with common identifiers.</bold>
<bold>B) </bold>
8-mer-association-with-RNAP vs. 8-mer-association-with-mRNA-expression for abundant 8-mers calculated using the 4,522 promoters graphed in (A). CpG-containing 8-mers are noted in black.</p>
</caption>
<graphic xlink:href="1471-2164-9-67-4"></graphic>
</fig>
</sec>
<sec>
<title>Sequences most enriched in tissue-specific promoters do not contain a CpG</title>
<p>The DNA sequence properties of tissue-specific promoters that were well bound by RNAP in only one tissue were compared with those of housekeeping promoters well bound by RNAP in all three tissues. The abundant 8-mers most enriched in the 356 liver-specific promoters do not contain CpG and differ from those associated with RNAP binding in all three tissues (Figure
<xref ref-type="fig" rid="F5">5</xref>
, Additional file
<xref ref-type="supplementary-material" rid="S3">3F</xref>
). As expected, binding sites for the liver-specific transcription factor HNF4 are enriched in the liver-specific promoters. The fact that TATA sequences are also enriched in the liver-specific promoters is consistent with suggestions that TATA is a marker for tissue-specific, not constitutive, gene expression [
<xref ref-type="bibr" rid="B15">15</xref>
,
<xref ref-type="bibr" rid="B27">27</xref>
]. Some CpG containing 8-mers are enriched in the liver-specific genes, indicating that, in addition to their housekeeping function, these sequences also mediate tissue-specific gene expression. This has been well documented for the CRE (TGACGTCA) [
<xref ref-type="bibr" rid="B28">28</xref>
,
<xref ref-type="bibr" rid="B29">29</xref>
].</p>
<fig position="float" id="F5">
<label>Figure 5</label>
<caption>
<p>
<bold>8-mer-association-with-</bold>
<bold>RNAP</bold>
<bold>vs. 8-mer enrichment in 356 liver specific promoters for abundant 8-mers.</bold>
Highlighted 8-mers contain TATA sequences (STable 1 in Additional file
<xref ref-type="supplementary-material" rid="S4">4</xref>
) and the liver specific HNF4 binding sites (8-mers containing TGACCT). The CpG containing 8-mers are plotted in black.</p>
</caption>
<graphic xlink:href="1471-2164-9-67-5"></graphic>
</fig>
</sec>
<sec>
<title>Non-random distribution of 8-mers in promoters</title>
<p>If the 8-mers that associate with RNAP binding are TFBS, they may be localized in the proximal promoter as has been observed in human [
<xref ref-type="bibr" rid="B15">15</xref>
,
<xref ref-type="bibr" rid="B16">16</xref>
] and Drosophila promoters [
<xref ref-type="bibr" rid="B23">23</xref>
]. We thus determined the "Clustering Factor" (CF, a measure of non-random distribution between -1,000 bp and +500 bp) [
<xref ref-type="bibr" rid="B15">15</xref>
,
<xref ref-type="bibr" rid="B23">23</xref>
] for abundant 8-mers in promoters, and compared it to 8-mer-association-with-RNAP. Some 8-mers were preferentially localized near the TSS (Figure
<xref ref-type="fig" rid="F6">6A–B</xref>
). The 8-mers most associated with promoters commonly bound by RNAP had a high CF (Figure
<xref ref-type="fig" rid="F6">6C</xref>
, Additional file
<xref ref-type="supplementary-material" rid="S3">3G</xref>
). However, there was also a class of 8-mers with high CFs but low 8-mer-association-with-RNAP values that may represent TFBS involved in regulated gene expression.</p>
<fig position="float" id="F6">
<label>Figure 6</label>
<caption>
<p>
<bold>A) </bold>
<bold>A measure of non-random distribution termed a Clustering Factor (CF) is plotted in the most populated bin for 8-mers with at least 20 members in the most populated 20 bp bin (abundant 8-mers).</bold>
Note the dots between -100 bp and the TSS with large CF values representing 8-mers that are more abundant near the TSS than elsewhere.
<bold>B) </bold>
A probability term P for the 8-mers in (A). A P value of 24 means that the distribution of the 8-mer has a less than 10
<sup>-24 </sup>
chance of being random.
<bold>C) </bold>
Non-random distribution of 8-mers (Clustering Factor) vs. 8-mer-association-with-RNAP for abundant 8-mers.</p>
</caption>
<graphic xlink:href="1471-2164-9-67-6"></graphic>
</fig>
<p>The 120 8-mers with the statistically highest CF (Figure
<xref ref-type="fig" rid="F6">6B</xref>
) that localize upstream of the TSS could be manually grouped into ten consensus motifs that are known TFBS: ETS, NRF-1, E-Box, BoxA, CRE, SP1, KLF, CCAAT, TATA, and CRE-T (STable 1 in Additional file
<xref ref-type="supplementary-material" rid="S4">4</xref>
), six of which contain a CpG dinucleotide (ETS, NRF-1, E-Box, BoxA, CRE, and SP1). A similar analysis has identified that these ten motifs localize to the proximal promoter in human promoters [
<xref ref-type="bibr" rid="B15">15</xref>
]. The six motifs that contain a CpG in the consensus motif (ETS, NRF-1, E-Box, BoxA, CRE, and SP1) always positively correlated with each other in the proximal promoter, exceeding expectations by up to two fold (STable 2A in Additional file
<xref ref-type="supplementary-material" rid="S4">4</xref>
), were enriched in the 20% of promoters best bound by RNAP in all three tissues (STable 2B in Additional file
<xref ref-type="supplementary-material" rid="S4">4</xref>
), and were underrepresented in H3K9me2 marked promoters (STable 2C in Additional file
<xref ref-type="supplementary-material" rid="S4">4</xref>
). ETS, NRF-1, and BoxA correlate the best with RNAP binding to promoters in multiple tissues (STable 2B in Additional file
<xref ref-type="supplementary-material" rid="S4">4</xref>
). Of the ten identified motifs, only TATA and CRE-T were enriched in the 20% of promoters best marked by H3K9me2 in keratinocytes (STable 2C in Additional file
<xref ref-type="supplementary-material" rid="S4">4</xref>
). To see if these TFBS play some specific role in mRNA expression or RNAP binding, we calculated the association-with-mRNA-expression and association-with-RNAP for the consensus sequences of these TFBS (Table
<xref ref-type="table" rid="T1">1</xref>
). As expected, the CpG-containing TFBS have high association values for both mRNA expression and RNAP binding.</p>
<table-wrap position="float" id="T1">
<label>Table 1</label>
<caption>
<p>Association of the 10 localized motifs with RNAP binding and mRNA expression.</p>
</caption>
<table frame="hsides" rules="groups">
<thead>
<tr>
<td align="center">
<bold>Motif</bold>
</td>
<td align="center">
<bold>Sequence</bold>
</td>
<td align="center">
<bold>8-mer-association-with-RNAP</bold>
</td>
<td align="center">
<bold>8-mer-association-with-mRNA expression</bold>
</td>
</tr>
</thead>
<tbody>
<tr>
<td align="left">BoxA</td>
<td align="left">TCTCGCGA</td>
<td align="center">1.30</td>
<td align="center">2.50</td>
</tr>
<tr>
<td align="left">NRF-1</td>
<td align="left">GCGVTGCG</td>
<td align="center">1.24</td>
<td align="center">2.44</td>
</tr>
<tr>
<td align="left">ETS</td>
<td align="left">VCCGGAARY</td>
<td align="center">1.21</td>
<td align="center">2.39</td>
</tr>
<tr>
<td align="left">CRE</td>
<td align="left">TGACGTCA</td>
<td align="center">1.19</td>
<td align="center">2.32</td>
</tr>
<tr>
<td align="left">SP-1</td>
<td align="left">CCCCGCCC</td>
<td align="center">1.14</td>
<td align="center">2.38</td>
</tr>
<tr>
<td align="left">E-Box</td>
<td align="left">YCACGTGA</td>
<td align="center">1.10</td>
<td align="center">2.28</td>
</tr>
<tr>
<td align="left">CCAAT</td>
<td align="left">RRCCAATSR</td>
<td align="center">1.04</td>
<td align="center">2.27</td>
</tr>
<tr>
<td align="left">KLF</td>
<td align="left">CCCCTCCC</td>
<td align="center">1.04</td>
<td align="center">2.28</td>
</tr>
<tr>
<td align="left">TATA</td>
<td align="left">TATAAAD</td>
<td align="center">0.96</td>
<td align="center">2.22</td>
</tr>
<tr>
<td align="left">CRE-T</td>
<td align="left">TGATGTCA</td>
<td align="center">0.90</td>
<td align="center">2.17</td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<p>Column one contains the name of the motif; column two contains the DNA sequence of the motif; column three is the 8-mer-association-with-RNAP for promoters (-1,000 bp to +500 bp) commonly bound by RNAP in the three tissues examined ranked in order by association; column four is 8-mer-association-with-mRNA-expression.</p>
</table-wrap-foot>
</table-wrap>
</sec>
<sec>
<title>CpG islands can be defined by two or more of the six CpG containing TFBS</title>
<p>Previous work has suggested that housekeeping genes can be defined by the presence of a CpG island in the promoter region [
<xref ref-type="bibr" rid="B8">8</xref>
], but the DNA sequence properties of CpG islands have not been described. We evaluated whether the presence of the six CpG consensus motifs in proximal promoters (-200 bp to the TSS) predicts RNAP binding to promoters commonly bound by RNAP and compared these results with the occurrence of a CpG island between -200 bp and the TSS (Figure
<xref ref-type="fig" rid="F7">7A</xref>
). The results demonstrate that the presence of any two of these motifs recapitulates the discrimination based on the presence of a CpG island with regard to RNAP binding to common promoters. To compare these two measures, we grouped promoters into ten equal-size groups of increasing RNAP binding. 80% of promoters in the group best bound by RNAP contain a CpG island and a similar number contain two or more of the six motifs (Figure
<xref ref-type="fig" rid="F7">7A</xref>
). Similarly, only 5% of promoters with the lowest RNAP binding values are CpG islands, and only about 5% have two or more motifs (Figure
<xref ref-type="fig" rid="F7">7A</xref>
). The presence of three or more of these motifs produced a lower positive hit rate in the best bound group (48%) but occurred in only 1% of promoters not bound by RNAP. Therefore, our analysis suggests that CpG islands have predictive value in defining housekeeping genes because of the presence of these six TFBS motifs. These six motifs do not account for all CpGs in CpG islands. Some of the other CpGs are known TFBS but the function of the rest remains unclear. They could be sequences that persist because they are protected from methylation and ultimate destruction or they could be involved in the higher-level regulatory processes that have been proposed for CpG islands [
<xref ref-type="bibr" rid="B30">30</xref>
]. In contrast to promoters well bound by RNAP in multiple tissues, only 20% of tissue specific proximal promoters are CpG islands and similarly only 20% contain two or more of these six motifs. This indicates that these six motifs correlate with promoters that are bound by RNAP in multiple tissues and not tissue specific promoters (Figure
<xref ref-type="fig" rid="F7">7B</xref>
).</p>
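<p>The grouping into ten equal-size groups can be sketched as follows. This is a minimal illustration, not the code used for Figure 7: it assumes each promoter has been reduced to a pair (average RNAP binding, feature present), where the feature is either a CpG island or two or more of the six motifs. The function name and the record layout are hypothetical.</p>
<preformat>
```python
def fraction_by_group(records, n_groups=10):
    """Sort promoters by binding and report, for each equal-size group,
    the fraction of promoters that carry the feature.

    records: list of (binding_value, has_feature) pairs. The list must
    hold at least n_groups entries so that no group is empty.
    """
    ordered = sorted(records, key=lambda rec: rec[0])
    n = len(ordered)
    fractions = []
    for g in range(n_groups):
        # Integer arithmetic keeps group sizes within one of each other.
        group = ordered[g * n // n_groups:(g + 1) * n // n_groups]
        hits = sum(1 for _, has_feature in group if has_feature)
        fractions.append(hits / len(group))
    return fractions
```
</preformat>
<p>The first entry of the returned list corresponds to the group with the lowest RNAP binding and the last entry to the group best bound by RNAP.</p>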
<fig position="float" id="F7">
<label>Figure 7</label>
<caption>
<p>
<bold>A) </bold>
<bold>Fraction of promoters that contain particular sequences between -200 bp and TSS: 1) CpG island, 2) two or more of six CpG containing motifs (SP1: CCCGCC, CCGCCC, CGCCCC; ETS: CCGGAA, GCGGAA; NRF-1:CGCATGCG, CGCGTGCG, CGCCTGCG; BoxA: TCTCGCG, CTCGCGA; CRE: ACGTCA; E-Box: CACGTG), 3) three or more of the six motifs.</bold>
<bold>B) </bold>
Fraction of promoters that contain particular motifs: top 20% of common RNAP promoters (Const), liver specific (LS), heart ventricle specific (HS), and keratinocyte specific (KS) promoters. Average RNAP binding for each class is presented.</p>
</caption>
<graphic xlink:href="1471-2164-9-67-7"></graphic>
</fig>
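<p>Counting the six CpG-containing motifs in a proximal promoter can be sketched as below, using the motif variants listed in the legend of Figure 7. Two points are assumptions of this sketch and not statements about the original analysis: only the strand supplied is searched, and "two or more motifs" is read as two or more distinct motif families, not two occurrences of the same family.</p>
<preformat>
```python
# Motif variants as listed in the Figure 7 legend.
MOTIFS = {
    "SP1": ("CCCGCC", "CCGCCC", "CGCCCC"),
    "ETS": ("CCGGAA", "GCGGAA"),
    "NRF-1": ("CGCATGCG", "CGCGTGCG", "CGCCTGCG"),
    "BoxA": ("TCTCGCG", "CTCGCGA"),
    "CRE": ("ACGTCA",),
    "E-Box": ("CACGTG",),
}


def motifs_present(sequence):
    """Return the set of motif families with at least one variant
    found in the sequence (supplied strand only; an assumption)."""
    seq = sequence.upper()
    return {
        name
        for name, variants in MOTIFS.items()
        if any(variant in seq for variant in variants)
    }


def has_two_or_more(sequence):
    """True if two or more distinct motif families occur."""
    return len(motifs_present(sequence)) >= 2
```
</preformat>
<p>The sequence passed in would be the region from -200 bp to the TSS of one promoter.</p>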
</sec>
<sec>
<title>Nuclear extracts do not bind TFBS with a methylated CpG</title>
<p>Methylation of CpG dinucleotides in CpG islands inhibits promoter activity and occurs in many cancers where the oncogenic event is the transcriptional suppression of tumor suppressor genes [
<xref ref-type="bibr" rid="B30">30</xref>
]. One simple explanation is that CpG methylation inhibits TFs from binding their TFBS resulting in promoter inactivity. A more prevalent, but not mutually exclusive view suggests that a more active mechanism is functioning in which methyl binding proteins bind methylated CpGs to facilitate chromatin mediated occlusion of the promoter [
<xref ref-type="bibr" rid="B30">30</xref>
,
<xref ref-type="bibr" rid="B31">31</xref>
]. The effect of CpG methylation on the function of five of the six CpG containing TFBS (DNA binding and/or transcriptional potential) that localize in the proximal promoter has been described. The one exception is BoxA, for which the effect of CpG methylation on DNA binding has not been reported in the literature. In general, methylation inhibits the activity of CpG containing TFBS [
<xref ref-type="bibr" rid="B32">32</xref>
]. CpG methylation is reported to inhibit the function of a CRE [
<xref ref-type="bibr" rid="B33">33</xref>
], ETS [
<xref ref-type="bibr" rid="B34">34</xref>
], NRF-1 [
<xref ref-type="bibr" rid="B35">35</xref>
], and E-Box [
<xref ref-type="bibr" rid="B36">36</xref>
]. Other CpG containing motifs are also inhibited via methylation including AP2 [
<xref ref-type="bibr" rid="B37">37</xref>
] and CTCF [
<xref ref-type="bibr" rid="B38">38</xref>
]. Methylation of the CpG in the SP1 motif, the most abundant CpG containing motif, is reported to either not affect DNA binding [
<xref ref-type="bibr" rid="B39">39</xref>
-
<xref ref-type="bibr" rid="B41">41</xref>
], affect binding when a cytosine flanking the CpG is methylated [
<xref ref-type="bibr" rid="B41">41</xref>
,
<xref ref-type="bibr" rid="B42">42</xref>
] or inhibit binding [
<xref ref-type="bibr" rid="B43">43</xref>
].</p>
<p>We observe that CpG methylation of a canonical SP1, ETS, or NRF-1 site abolishes DNA binding of nuclear extracts isolated from either liver or primary keratinocytes (Figure
<xref ref-type="fig" rid="F8">8</xref>
). When both DNA strands of a canonical SP1 site are methylated, nuclear extract binding is abolished. For ETS, methylation of one DNA strand is sufficient to abolish DNA binding, while for NRF-1, methylation of both CpGs in the canonical site on either strand is sufficient to abolish binding. As a control, we show that the methylated SP1 oligonucleotides could bind to the non-specific prokaryotic protein HU. Reexamination of previous reports indicates that SP1 methylation causes a modest decrease in SP1 binding that our experimental system is able to demonstrate more dramatically.</p>
<fig position="float" id="F8">
<label>Figure 8</label>
<caption>
<p>EMSA using keratinocyte and liver nuclear extracts and pure HU protein with 28 bp double stranded oligonucleotides containing on the sense strand a canonical SP1 (GGGGCGGG), ETS (CCGGAA), and NRF-1 (GCGVTGCG) site where the cytosine in the CpG is non methylated (-/-), hemi-methylated (-/+), hemi-methylated (+/-), or methylated (+/+).</p>
</caption>
<graphic xlink:href="1471-2164-9-67-8"></graphic>
</fig>
</sec>
</sec>
<sec>
<title>Conclusion</title>
<p>We identified promoters that are bound similarly by RNAP in multiple tissues and determined the association between the presence of 8-mers in these promoters and the extent of RNAP binding to the promoter. Looking at RNAP binding to housekeeping promoters, we observed a bimodal distribution: only 8-mers with the CpG dinucleotide are in the class of sequences most associated with RNAP binding and only 8-mers without a CpG are in the class least associated with RNAP binding. An implication of this observation is that knowing if a TFBS contains a CpG reveals aspects of its biological function. If the TFBS contains a CpG, it is involved in constitutive gene expression and if the TFBS does not contain a CpG, it is involved in regulated gene expression. This insight will help identify potential functions for transcription factors when their TFBS is identified. Additionally, if a transcription factor shows degeneracy in its TFBS [
<xref ref-type="bibr" rid="B44">44</xref>
,
<xref ref-type="bibr" rid="B45">45</xref>
], binding to a CpG sequence and a similar sequence without a CpG, it suggests that this transcription factor is involved in both constitutive and regulated gene expression. This is observed for the CRE and CRE-T sequences, two sequences that are localized in the proximal promoter and differ by a single base: CRE contains a CpG (TGACGTCA) while CRE-T does not (TGA
<bold>T</bold>
GTCA). The CREB protein binds both sequences well (data not shown) but the two sequences correlate very differently with RNAP binding suggesting that the CREB transcription factor can regulate either constitutive gene expression by binding the CRE sequence or regulated gene expression by binding the CRE-T sequence.</p>
<p>In vertebrates CpG dinucleotides are rare and usually are methylated on the cytosine but do occur at close to the expected frequency in clusters called CpG islands where the CpGs remain unmethylated [
<xref ref-type="bibr" rid="B30">30</xref>
,
<xref ref-type="bibr" rid="B46">46</xref>
]. These CpG islands often occur in promoters of housekeeping genes [
<xref ref-type="bibr" rid="B8">8</xref>
,
<xref ref-type="bibr" rid="B9">9</xref>
]. We show that the presence of two or more of any of the six CpG containing TFBS (SP1, ETS, NRF-1, CRE, E-Box, and BoxA) in the proximal promoter can predict RNAP binding to housekeeping promoters as accurately as the presence of a CpG island in the proximal promoter.</p>
<p>Methylation of the CpG in the TFBS has been found to inhibit the DNA binding for five of the six TFBS that are abundant and localize in proximal promoters suggesting this may be a general result for CpG containing TFBS. Methylation dependent inhibition of transcription factor binding to DNA has two implications. First, the transcription factors that are critical for the activation of housekeeping genes solve the problem of finding their TFBS in the genome by only binding to unmethylated TFBS. Since most CpGs in the genome are methylated, the only places these transcription factors can bind are in the unmethylated CpG islands in promoters. Second, the pathological methylation of CpG dinucleotides in CpG islands, as occurs in many cancers [
<xref ref-type="bibr" rid="B30">30</xref>
], would prevent these abundant transcription factors from binding their TFBS thus causing the promoters to become inactive. This could be a critical initial step that subsequently allows CpG methyl binding proteins to bind to methylated CpGs and actively repress a promoter [
<xref ref-type="bibr" rid="B31">31</xref>
].</p>
</sec>
<sec sec-type="methods">
<title>Methods</title>
<sec>
<title>Promoter annotation</title>
<p>Mouse (
<italic>Mus musculus</italic>
) annotation data and genomic DNA sequences for the region -1,000 bp to +500 bp, relative to the annotated transcription start site (TSS), were downloaded from the UCSC Genome Browser site (
<italic>version mm5, May 2004</italic>
). This dataset contains the putative promoter regions of 26,000 genes that are represented on the MM5 minimum promoter mouse Nimblegen ChIP-chip array. However, since the TSS for many of these genes is poorly annotated (e.g. the TSS is the same as the translation start), we refined this dataset to include only those genes where the distance from the TSS to the translation start (ATG) was greater than 30 nucleotides. This reduced the total number of putative promoter regions to 15,180. We further reduced this number by excluding promoters with gaps greater than 200 bp, and the blastclust program was used to confirm that this dataset did not contain multiple copies of the same DNA sequences, resulting in 14,790 promoters.</p>
<p>The 14,790 analyzed promoters are a biased subset of the 26,000 promoters on the ChIP-chip array. The annotated promoters are enriched 1.3 fold for the 20% of promoters best bound by RNAP and depleted by 2 fold for H3K9me2 bound promoters. This could reflect that the H3K9me2 genes are not universally expressed and full-length cDNA data does not exist for them, preventing identification of a TSS.</p>
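<p>The two sequence-level filters described above can be sketched as follows. This is an illustration under stated assumptions: the function name and its arguments are hypothetical, genomic coordinates are taken as plain integers, and assembly gaps are assumed to appear as runs of N in the downloaded sequence. The blastclust redundancy check is not reproduced.</p>
<preformat>
```python
import re


def keep_promoter(tss, atg, sequence, min_distance=30, max_gap=200):
    """Apply the annotation filters to one putative promoter.

    tss, atg: coordinates of the transcription start site and of the
    translation start (same coordinate system).
    sequence: the -1,000 bp to +500 bp region as a string.
    """
    # Keep only genes whose TSS-to-ATG distance exceeds min_distance.
    if not abs(atg - tss) > min_distance:
        return False
    # Exclude promoters with a gap (run of N) longer than max_gap.
    gap_lengths = (len(run) for run in re.findall(r"N+", sequence.upper()))
    longest_gap = max(gap_lengths, default=0)
    return not longest_gap > max_gap
```
</preformat>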
</sec>
<sec>
<title>Clustering Factor (CF) calculation</title>
<p>To determine if a DNA sequence has a non-random distribution (i.e. clustered), we used an automated method of detecting and quantifying peak height as described previously [
<xref ref-type="bibr" rid="B15">15</xref>
]. Abundant 8-mers contained 20 or more members in a 20 base pair window in the 14,790 examined promoters.</p>
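<p>The abundance criterion can be sketched as below. The sketch covers only the "20 or more members in a 20 base pair window" test; the peak-height quantification behind the Clustering Factor is described in the cited work and is not reproduced here. It assumes that all promoter sequences are aligned so that the same string index corresponds to the same position relative to the TSS, that windows are fixed non-overlapping bins, and that overlapping occurrences are counted; the function names are hypothetical.</p>
<preformat>
```python
from collections import Counter


def bin_counts(promoters, kmer, bin_size=20):
    """Count occurrences of kmer per positional bin, pooled over all
    promoters. Bin 0 covers string indices 0 to bin_size - 1."""
    counts = Counter()
    for seq in promoters:
        start = seq.find(kmer)
        while start != -1:
            counts[start // bin_size] += 1
            # Step by one so overlapping occurrences are also counted.
            start = seq.find(kmer, start + 1)
    return counts


def is_abundant(promoters, kmer, min_members=20, bin_size=20):
    """True if the most populated bin holds at least min_members."""
    counts = bin_counts(promoters, kmer, bin_size)
    return max(counts.values(), default=0) >= min_members
```
</preformat>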
</sec>
<sec>
<title>Cultures of primary keratinocytes</title>
<p>Primary keratinocytes were isolated from newborn FVB mice epidermis [
<xref ref-type="bibr" rid="B47">47</xref>
]. Primary keratinocytes were seeded at a density of 0.6 pelt or 5 × 10
<sup>6 </sup>
cells per 100-mm dish in Ca
<sup>+2 </sup>
and Mg
<sup>+2 </sup>
free EMEM (Cambrex Bio Science Walkersville, Inc), supplemented with 8% Chelex (Bio-Rad, Richmond, CA) treated FBS (Atlanta Biologicals, Inc), 0.2 mM Ca<sup>+2</sup> and antibiotic-antimycotic. After 20 h, cultures were washed with PBS and switched to the same medium containing 0.05 mM Ca
<sup>+2</sup>
. After three days cells were used for ChIP.</p>
</sec>
<sec>
<title>Liver and heart samples</title>
<p>Tissues from 5 adult FVB mice were frozen and ground to a fine powder in liquid nitrogen. After nitrogen evaporation, samples were moved into a 50 ml conical tube, 10 ml of 1% formaldehyde in PBS was added, and samples were incubated for 10 minutes at 37°C with vortexing. 125 mM glycine was added for 5 minutes, and cells were washed once in PBS with 1 mM PMSF, dounced in lysis buffer (5 mM PIPES pH 8.0, 85 mM KCl, 0.5% NP40, 1 mM NF, 1 mM NaVa, Roche protease inhibitor cocktail) and re-suspended in 200
<italic>μ</italic>
l nuclear lysis buffer (50 mM Tris-Cl pH 8.1, 10 mM EDTA, 1% SDS, protease and phosphatase inhibitors as above). DNA was sheared by sonication to yield fragments from 300 to 3,000 bp. Samples were centrifuged and supernatants were diluted 6-fold (0.01% SDS, 1.1% Triton X-100, 1.2 mM EDTA, 16.7 mM Tris-Cl pH 8.1, 167 mM NaCl) and used for ChIP.</p>
</sec>
<sec>
<title>Chromatin immunoprecipitation</title>
<p>Chromatin immunoprecipitation (ChIP) was performed using antibodies against RNAP from Covance (8WG16), which recognizes the unphosphorylated form of RNAP; H3K9me2 from Upstate (07–441); CREB, using a mixture of antibodies from Santa Cruz (sc-186) and Upstate (06–863); and c-Jun from Santa Cruz (sc-1694). The ChIP protocol was from P. Farnham [
<xref ref-type="bibr" rid="B19">19</xref>
,
<xref ref-type="bibr" rid="B48">48</xref>
]. For immunoprecipitation, we used protein G agarose beads (Invitrogen). Starting with 2 × 10
<sup>6 </sup>
cells, we typically isolate 1 ng of ChIP DNA for RNAP and 5 ng for histone H3K9me2.</p>
</sec>
<sec>
<title>ChIP DNA amplification and hybridization</title>
<p>The protocol for random DNA amplification [
<xref ref-type="bibr" rid="B49">49</xref>
,
<xref ref-type="bibr" rid="B50">50</xref>
] was adapted from the DeRisi lab. We used primers conjugated with Cy3 or Cy5. After amplification, 10–15 μg of DNA was purified using the Qiagen PCR Purification Kit, concentrated by isopropanol precipitation and dried for 5 min under vacuum. DNA was dissolved in 3
<italic>μ</italic>
l water, mixed with Component A and Hybridization buffer (Nimblegen) according to the manufacturer's instructions. Amplified ChIP DNA was hybridized to Nimblegen MM5 min Mouse promoter microarrays containing 400,000 oligos interrogating 26,000 promoters. Arrays were washed in 0.2% SDS, 0.2% SSC at 45°C for 15 sec, in the same buffer at room temperature for 2 min, in 0.2% SSC for one minute, and in 0.05% SSC for 15 sec. Arrays were dried by centrifugation and scanned using an Axon 4000B scanner. Images were processed with NIMBLESCAN (Nimblegen) using default settings. The average enrichment of the fifteen spots representing one promoter was used as a measure of "binding" for a protein. We averaged binding of RNAP and H3K9me2 from two independent hybridizations for each tissue using independent biological samples. Correlation coefficients for keratinocyte replicates were: RNAP – 0.79, H3K9me2 – 0.67; and for RNAP ChIPs from liver samples: 0.86; heart samples: 0.83.</p>
</sec>
<sec>
<title>Electrophoretic Mobility Shift Assay (EMSA)</title>
<p>The following PAGE-purified 28-base-pair oligonucleotides (sense strands shown) and their complementary strands were purchased from Sigma-Genosys (USA).</p>
<p>SP-1: GTCAGTCA
<underline>GGGGG(C/C
<sup>m</sup>
)GGGG</underline>
CATCGGTCAG</p>
<p>ETS: GTCAGTCAGA
<underline>C(C/C
<sup>m</sup>
)GGAAGT</underline>
TATCGGTCAG</p>
<p>NRF-1: GTCAGTCAGA
<underline>(C/C
<sup>m</sup>
)GCCTG(C/C
<sup>m</sup>
)G</underline>
TATCGGTCAG</p>
<p>A single consensus binding site for each transcription factor containing either nonmethylated (C) or methylated cytosine (C
<sup>m</sup>
) (1 methylated cytosine in SP-1 and ETS and 2 in NRF-1) is underlined. Sense strands of non-methylated and methylated oligos were end labeled with (
<italic>γ</italic>
<sup>32</sup>
P) ATP (5000 mCi/mmol; MP Biomedical) using T4 PNK enzyme (New England Biolabs). Equimolar labeled sense and complementary cold anti-sense oligos were annealed by heating the mixture in annealing buffer to 65°C for 15 minutes and snap cooling it on ice for 2 minutes, followed by incubation at room temperature for 15 min. Annealing resulted in four types of labeled double stranded oligos (1 non-methylated, 2 hemi-methylated and 1 fully methylated) and these were used for EMSA.</p>
<p>Nuclear extract was prepared from mouse liver and cultured mouse primary keratinocytes [
<xref ref-type="bibr" rid="B51">51</xref>
]. In 20
<italic>μ</italic>
l of reaction sample, 7 pg of labeled oligonucleotide (50,000 cpm) was added to 5
<italic>μ</italic>
g of nuclear extract, and incubated in binding buffer (10 mM HEPES, 80 mM KCl, 0.05 mM EDTA, 6% glycerol, 1 mM DTT and 1 mM MgCl
<sub>2</sub>
) at 37°C for 20 min. Samples were separated on a 5% native PAGE gel in 0.25 × TBE at 150 V for 1.5 hrs. Gels were dried and exposed for autoradiography. For EMSA involving E. coli HU protein, a kind gift from Shankar Adhya, 30 nM of HPLC purified recombinant HU was incubated in binding buffer (25 mM Tris-HCl pH 8.0, 50 mM KCl, 0.5 mM EDTA, 2.5 mM DTT, 1
<italic>μ</italic>
g BSA) with 7 pg of labeled double stranded oligo in a total volume of 20
<italic>μ</italic>
l, and the complex was separated on a 7.5% native PAGE gel (0.25 × TBE, 150 V, 1.5 hrs), dried and autoradiographed.</p>
</sec>
<sec>
<title>8-mer-association-with-binding</title>
<p>To find the "8-mer-association-with-binding" (
<italic>b</italic>
<sub>8</sub>
), we averaged the binding values of the promoters (
<italic>b</italic>
<sub>
<italic>p</italic>
</sub>
) whose sequence contained that 8-mer and divided by the average of the binding values of all promoters in the set (
<inline-formula>
<mml:math id="M1" name="1471-2164-9-67-i1" overflow="scroll">
<mml:semantics>
<mml:mrow>
<mml:mover accent="true">
<mml:mrow>
<mml:msub>
<mml:mi>b</mml:mi>
<mml:mi>p</mml:mi>
</mml:msub>
</mml:mrow>
<mml:mo stretchy="true">¯</mml:mo>
</mml:mover>
</mml:mrow>
</mml:semantics>
</mml:math>
</inline-formula>
).</p>
<p>
<disp-formula>
<mml:math id="M2" name="1471-2164-9-67-i2" overflow="scroll">
<mml:semantics>
<mml:mrow>
<mml:msub>
<mml:mi>b</mml:mi>
<mml:mn>8</mml:mn>
</mml:msub>
<mml:mo>=</mml:mo>
<mml:mfrac>
<mml:mrow>
<mml:mstyle displaystyle="true">
<mml:munder>
<mml:mo>∑</mml:mo>
<mml:mi>p</mml:mi>
</mml:munder>
<mml:mrow>
<mml:msub>
<mml:mo>∂</mml:mo>
<mml:mrow>
<mml:mn>8</mml:mn>
<mml:mi>p</mml:mi>
</mml:mrow>
</mml:msub>
<mml:msub>
<mml:mi>b</mml:mi>
<mml:mi>p</mml:mi>
</mml:msub>
</mml:mrow>
</mml:mstyle>
</mml:mrow>
<mml:mrow>
<mml:mover accent="true">
<mml:mrow>
<mml:msub>
<mml:mi>b</mml:mi>
<mml:mi>p</mml:mi>
</mml:msub>
</mml:mrow>
<mml:mo stretchy="true">¯</mml:mo>
</mml:mover>
<mml:mstyle displaystyle="true">
<mml:munder>
<mml:mo>∑</mml:mo>
<mml:mi>p</mml:mi>
</mml:munder>
<mml:mrow>
<mml:msub>
<mml:mo>∂</mml:mo>
<mml:mrow>
<mml:mn>8</mml:mn>
<mml:mi>p</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:mstyle>
</mml:mrow>
</mml:mfrac>
</mml:mrow>
</mml:semantics>
</mml:math>
</disp-formula>
</p>
<p>Where
<italic>p </italic>
is the promoter in question. ∂
<sub>8
<italic>p </italic>
</sub>
is equal to one if the 8-mer occurs in the promoter sequence and zero otherwise. Summing over
<italic>p </italic>
implies summing over all the promoters in the set in question.</p>
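<p>The definition above reduces to the mean binding of the promoters that contain the 8-mer divided by the mean binding of all promoters in the set. A minimal sketch follows; the function name and the (sequence, binding value) record layout are hypothetical, and only the strand supplied is searched, which is an assumption of the sketch.</p>
<preformat>
```python
from statistics import mean


def kmer_association(promoters, kmer):
    """Association of one 8-mer with binding.

    promoters: list of (sequence, binding_value) pairs.
    Returns the mean binding of promoters containing kmer divided by
    the mean binding of all promoters, or None if kmer never occurs.
    """
    overall_mean = mean(binding for _, binding in promoters)
    # The indicator term equals one when the 8-mer occurs in the promoter.
    with_kmer = [binding for seq, binding in promoters if kmer in seq]
    if not with_kmer:
        return None
    return mean(with_kmer) / overall_mean
```
</preformat>
<p>A value above one indicates that promoters containing the 8-mer are bound better than average, and a value below one indicates the opposite.</p>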
</sec>
<sec>
<title>Promoters with similar RNAP binding</title>
<p>In order to identify promoters with similar RNAP binding in two tissues, we rotated the data so that the best-fit line was the 45-degree line through the origin. The two-dimensional rotation matrix is:</p>
<p>
<disp-formula>
<mml:math id="M3" name="1471-2164-9-67-i3" overflow="scroll">
<mml:semantics>
<mml:mrow>
<mml:mrow>
<mml:mo>|</mml:mo>
<mml:mrow>
<mml:mtable>
<mml:mtr>
<mml:mtd>
<mml:mrow>
<mml:mi>cos</mml:mi>
<mml:mo></mml:mo>
<mml:mi>θ</mml:mi>
</mml:mrow>
</mml:mtd>
<mml:mtd>
<mml:mrow>
<mml:mi>sin</mml:mi>
<mml:mo></mml:mo>
<mml:mi>θ</mml:mi>
</mml:mrow>
</mml:mtd>
</mml:mtr>
<mml:mtr>
<mml:mtd>
<mml:mrow>
<mml:mo>−</mml:mo>
<mml:mi>sin</mml:mi>
<mml:mo></mml:mo>
<mml:mi>θ</mml:mi>
</mml:mrow>
</mml:mtd>
<mml:mtd>
<mml:mrow>
<mml:mi>cos</mml:mi>
<mml:mo></mml:mo>
<mml:mi>θ</mml:mi>
</mml:mrow>
</mml:mtd>
</mml:mtr>
</mml:mtable>
</mml:mrow>
<mml:mo>|</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:semantics>
</mml:math>
</disp-formula>
</p>
<p>where
<italic>θ </italic>
is the angle by which we rotated the coordinates in the two-dimensional plane. For a given pair of data sets, this angle can be determined by subtracting the angle of the best-fit line from 45 degrees. For each data point, the rotated values are calculated by applying the rotation matrix to the original data point. The line can be forced through the origin by subtracting the vertical intercept of the best-fit line from the vertical data before the rotation. The new "rotated binding values" are then determined by operating on the original binding values:</p>
<p>
<disp-formula>
<mml:math id="M4" name="1471-2164-9-67-i4" overflow="scroll">
<mml:semantics>
<mml:mrow>
<mml:mrow>
<mml:mo>|</mml:mo>
<mml:mrow>
<mml:mtable>
<mml:mtr>
<mml:mtd>
<mml:mrow>
<mml:msubsup>
<mml:mi>b</mml:mi>
<mml:mi>A</mml:mi>
<mml:mrow>
<mml:mi>r</mml:mi>
<mml:mi>o</mml:mi>
<mml:mi>t</mml:mi>
<mml:mi>a</mml:mi>
<mml:mi>t</mml:mi>
<mml:mi>e</mml:mi>
<mml:mi>d</mml:mi>
</mml:mrow>
</mml:msubsup>
</mml:mrow>
</mml:mtd>
</mml:mtr>
<mml:mtr>
<mml:mtd>
<mml:mrow>
<mml:msubsup>
<mml:mi>b</mml:mi>
<mml:mi>B</mml:mi>
<mml:mrow>
<mml:mi>r</mml:mi>
<mml:mi>o</mml:mi>
<mml:mi>t</mml:mi>
<mml:mi>a</mml:mi>
<mml:mi>t</mml:mi>
<mml:mi>e</mml:mi>
<mml:mi>d</mml:mi>
</mml:mrow>
</mml:msubsup>
</mml:mrow>
</mml:mtd>
</mml:mtr>
</mml:mtable>
</mml:mrow>
<mml:mo>|</mml:mo>
</mml:mrow>
<mml:mo>=</mml:mo>
<mml:mrow>
<mml:mo>|</mml:mo>
<mml:mrow>
<mml:mtable>
<mml:mtr>
<mml:mtd>
<mml:mrow>
<mml:mi>cos</mml:mi>
<mml:mo></mml:mo>
<mml:mi>θ</mml:mi>
</mml:mrow>
</mml:mtd>
<mml:mtd>
<mml:mrow>
<mml:mi>sin</mml:mi>
<mml:mo></mml:mo>
<mml:mi>θ</mml:mi>
</mml:mrow>
</mml:mtd>
</mml:mtr>
<mml:mtr>
<mml:mtd>
<mml:mrow>
<mml:mo>−</mml:mo>
<mml:mi>sin</mml:mi>
<mml:mo></mml:mo>
<mml:mi>θ</mml:mi>
</mml:mrow>
</mml:mtd>
<mml:mtd>
<mml:mrow>
<mml:mi>cos</mml:mi>
<mml:mo></mml:mo>
<mml:mi>θ</mml:mi>
</mml:mrow>
</mml:mtd>
</mml:mtr>
</mml:mtable>
</mml:mrow>
<mml:mo>|</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:mo>|</mml:mo>
<mml:mrow>
<mml:mtable>
<mml:mtr>
<mml:mtd>
<mml:mrow>
<mml:msub>
<mml:mi>b</mml:mi>
<mml:mi>A</mml:mi>
</mml:msub>
</mml:mrow>
</mml:mtd>
</mml:mtr>
<mml:mtr>
<mml:mtd>
<mml:mrow>
<mml:msub>
<mml:mi>b</mml:mi>
<mml:mi>B</mml:mi>
</mml:msub>
</mml:mrow>
</mml:mtd>
</mml:mtr>
</mml:mtable>
</mml:mrow>
<mml:mo>|</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:semantics>
</mml:math>
</disp-formula>
</p>
<p>To ensure that the rotation was robust and not heavily influenced by outliers in the data set, we temporarily removed data more than one standard deviation from the original best-fit line. If the best-fit line of the rotated data still maintained its 45-degree angle within some small error range, we concluded the data were successfully rotated. If not, we repeated the procedure using the new rotated values and only those points within one standard deviation of the best-fit line to determine the new rotation angle and intercept adjustment. This was repeated until the best-fit line did not change significantly upon removal of data points more than one standard deviation from the 45-degree line.</p>
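<p>A single pass of the rotation for one pair of tissues can be sketched as follows. Several choices are assumptions of this sketch: an ordinary least-squares fit is used for the best-fit line, the points themselves are rotated counter-clockwise by 45 degrees minus the angle of the fitted line (a sign convention chosen so that the fitted line lands on the diagonal), and the iterative removal of points more than one standard deviation from the line is not reproduced. The function names are hypothetical.</p>
<preformat>
```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def rotate_to_diagonal(xs, ys):
    """Shift and rotate the data so the best-fit line becomes the
    45-degree line through the origin."""
    slope, intercept = fit_line(xs, ys)
    # Rotation angle: 45 degrees minus the angle of the best-fit line.
    theta = math.radians(45.0) - math.atan(slope)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    # Force the line through the origin before rotating.
    shifted = [y - intercept for y in ys]
    new_x = [cos_t * x - sin_t * y for x, y in zip(xs, shifted)]
    new_y = [sin_t * x + cos_t * y for x, y in zip(xs, shifted)]
    return new_x, new_y
```
</preformat>
<p>For data lying exactly on a line, the two returned lists are equal point by point; for real data, the scatter about the diagonal reflects differences in binding between the two tissues.</p>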
<p>In our case we had RNAP binding values for three distinct tissues: primary mouse keratinocytes, heart ventricle, and liver. We knew that the results are similar in all three tissues, with the exception of genes involved with tissue-specific expression in those tissues. We rotated the data by pairs in the method described above. This took several iterations since the rotation of one pair might affect the values of another pair. The end result was new "rotated binding values" for the promoters in each of the three tissues. These values were then averaged to produce the "Average RNAP binding" of that promoter in all three tissues.</p>
</sec>
<sec>
<title>Determining Tissue Specific Promoters</title>
<p>Promoters that were more than two standard deviations off the 45-degree best-fit line (as determined above) in any of the three pairs of data (liver-heart, liver-keratinocytes, and heart-keratinocytes) were considered "tissue-specific" (not commonly bound). Of our original set of 14,790 promoters, 929 were not commonly bound by RNAP in all three tissues, leaving 13,861 promoters that were commonly bound in all three tissues. Of the 929 promoters that were not commonly bound by RNAP, tissue-specific promoters were selected based on the following criteria using the raw RNAP binding values:</p>
<p>356 liver-specific promoters: L > 1.5 × H, L > 1.5 × K, H < 1.5 (raw RNAP binding value), K < 1.5</p>
<p>131 heart-specific promoters: H > 1.3 × L, H > 1.3 × K, L < 1.5, K < 1.5</p>
<p>47 keratinocyte-specific promoters: K > 1.5 × L, K > 1.5 × H, H < 1.5, L < 1.5</p>
<p>Here L is the RNAP binding value in liver, H the value in heart, and K the value in keratinocytes.</p>
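<p>The selection criteria listed above can be written as a short sketch. It is illustrative only and assumes positive raw binding values; the names FOLD, MAX_OTHER and tissue_specific are hypothetical. The fold thresholds are those given in the text (1.3 for heart, 1.5 for liver and keratinocytes), and the rule is meant to be applied only to promoters already flagged as not commonly bound.</p>
<preformat>
```python
# Fold thresholds as listed in the text; heart uses 1.3 rather than 1.5.
FOLD = {"liver": 1.5, "heart": 1.3, "keratinocytes": 1.5}
MAX_OTHER = 1.5  # upper bound on raw binding in the two other tissues

def tissue_specific(values):
    """Return the tissue whose criteria are met, or None.

    `values` maps each tissue name to its raw RNAP binding value.
    """
    for tissue, fold in FOLD.items():
        others = [v for name, v in values.items() if name != tissue]
        # Binding must exceed fold times each other tissue, and both
        # other tissues must stay below MAX_OTHER.
        if all(values[tissue] > fold * v and MAX_OTHER > v for v in others):
            return tissue
    return None
```
</preformat>
<p>For example, a promoter with raw values of 3.0 in liver, 1.0 in heart and 1.2 in keratinocytes would be classified as liver-specific under these criteria, whereas one with 3.0, 2.0 and 1.0 would not be assigned to any tissue.</p>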
</sec>
</sec>
<sec>
<title>Authors' contributions</title>
<p>JMR performed the ChIP-chip experiments and helped with data analysis; AS, KG, and PCF helped with data analysis; VR performed the SP1 gel shift; MVM helped with the ChIP-chip experiments; and all authors helped with manuscript preparation. All authors read and approved the final manuscript.</p>
</sec>
<sec sec-type="supplementary-material">
<title>Supplementary Material</title>
<supplementary-material content-type="local-data" id="S1">
<caption>
<title>Additional file 1</title>
<p>The region of the promoter critical for the bimodal distribution of the 8-mer-association-with-RNAP. Histogram of the 8-mer-association-with-RNAP between -1,000 bp and +500 bp and in 200 bp increments from -1,200 bp to +1,000 bp for abundant 8-mers in the common RNAP promoters. 8-mers that contain a CpG are noted in black.</p>
</caption>
<media xlink:href="1471-2164-9-67-S1.ppt" mimetype="application" mime-subtype="vnd.ms-powerpoint">
<caption>
<p>Click here for file</p>
</caption>
</media>
</supplementary-material>
<supplementary-material content-type="local-data" id="S2">
<caption>
<title>Additional file 2</title>
<p>Distribution of the 8-mer-association-with-RNAP for 8-mers containing a particular dinucleotide. Histograms of the 8-mer-association-with-RNAP between -1,000 bp and +500 bp for abundant and all 8-mers, with 8-mers containing each of the 10 dinucleotides noted in black.</p>
</caption>
<media xlink:href="1471-2164-9-67-S2.ppt" mimetype="application" mime-subtype="vnd.ms-powerpoint">
<caption>
<p>Click here for file</p>
</caption>
</media>
</supplementary-material>
<supplementary-material content-type="local-data" id="S5">
<caption>
<title>Additional file 5</title>
<p>8-mer-association-with-RNAP for all 8-mers. Spreadsheet containing the 8-mer-association-with-RNAP for all 8-mers.</p>
</caption>
<media xlink:href="1471-2164-9-67-S5.xls" mimetype="application" mime-subtype="vnd.ms-excel">
<caption>
<p>Click here for file</p>
</caption>
</media>
</supplementary-material>
<supplementary-material content-type="local-data" id="S3">
<caption>
<title>Additional file 3</title>
<p>Data presented in Figures
<xref ref-type="fig" rid="F2">2</xref>
,
<xref ref-type="fig" rid="F3">3(D–E)</xref>
,
<xref ref-type="fig" rid="F5">5</xref>
and
<xref ref-type="fig" rid="F6">6C</xref>
for all 8-mers. The data presented in Figures
<xref ref-type="fig" rid="F2">2</xref>
,
<xref ref-type="fig" rid="F3">3(D–E)</xref>
,
<xref ref-type="fig" rid="F5">5</xref>
and
<xref ref-type="fig" rid="F6">6C</xref>
are shown here for all 8-mers. Histograms and scatter plots for 8-mer-association-with-RNAP vs. 8-mer-association-with-H3K9me2, enrichment of 8-mers in the 356 liver-specific promoters vs. 8-mer-association-with-RNAP, and clustering factor vs. 8-mer-association-with-RNAP.</p>
</caption>
<media xlink:href="1471-2164-9-67-S3.ppt" mimetype="application" mime-subtype="vnd.ms-powerpoint">
<caption>
<p>Click here for file</p>
</caption>
</media>
</supplementary-material>
<supplementary-material content-type="local-data" id="S4">
<caption>
<title>Additional file 4</title>
<p>Supplementary tables. Table 1 shows the 120 sequences with the statistically most non-random distributions, placed into 10 groups. Table 2 shows the co-occurrence of the 10 proximal promoter motifs between -200 bp and the TSS in 14,790 mouse promoters, the top 20% of common RNAP promoters, and the top 20% of promoters best bound by H3K9me2.</p>
</caption>
<media xlink:href="1471-2164-9-67-S4.ppt" mimetype="application" mime-subtype="vnd.ms-powerpoint">
<caption>
<p>Click here for file</p>
</caption>
</media>
</supplementary-material>
</sec>
</body>
<back>
<ack>
<sec>
<title>Acknowledgements</title>
<p>We thank Brian Oliver and Dave Leven for comments on an earlier version of this manuscript, Yatrik Shah for the cross-linked liver sample and Shankar Adhya for the purified HU protein.</p>
</sec>
</ack>
<ref-list>
<ref id="B1">
<citation citation-type="other">
<article-title>Web site of the Charles Vinson laboratory</article-title>
<ext-link ext-link-type="uri" xlink:href="http://home.ccr.cancer.gov/metabolism/vinson/vinsonccr.htm"></ext-link>
</citation>
</ref>
<ref id="B2">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Smale</surname>
<given-names>ST</given-names>
</name>
<name>
<surname>Kadonaga</surname>
<given-names>JT</given-names>
</name>
</person-group>
<article-title>The RNA polymerase II core promoter</article-title>
<source>Annu Rev Biochem</source>
<year>2003</year>
<volume>72</volume>
<fpage>449</fpage>
<lpage>479</lpage>
<pub-id pub-id-type="pmid">12651739</pub-id>
<pub-id pub-id-type="doi">10.1146/annurev.biochem.72.121801.161520</pub-id>
</citation>
</ref>
<ref id="B3">
<citation citation-type="other">
<person-group person-group-type="author">
<name>
<surname>Maston</surname>
<given-names>GA</given-names>
</name>
<name>
<surname>Evans</surname>
<given-names>SK</given-names>
</name>
<name>
<surname>Green</surname>
<given-names>MR</given-names>
</name>
</person-group>
<article-title>Transcriptional Regulatory Elements in the Human Genome</article-title>
<source>Annu Rev Genomics Hum Genet</source>
<year>2006</year>
</citation>
</ref>
<ref id="B4">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Heintzman</surname>
<given-names>ND</given-names>
</name>
<name>
<surname>Ren</surname>
<given-names>B</given-names>
</name>
</person-group>
<article-title>The gateway to transcription: identifying, characterizing and understanding promoters in the eukaryotic genome</article-title>
<source>Cell Mol Life Sci</source>
<year>2007</year>
<volume>64</volume>
<fpage>386</fpage>
<lpage>400</lpage>
<pub-id pub-id-type="pmid">17171231</pub-id>
<pub-id pub-id-type="doi">10.1007/s00018-006-6295-0</pub-id>
</citation>
</ref>
<ref id="B5">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Swartz</surname>
<given-names>MN</given-names>
</name>
<name>
<surname>Trautner</surname>
<given-names>TA</given-names>
</name>
<name>
<surname>Kornberg</surname>
<given-names>A</given-names>
</name>
</person-group>
<article-title>Enzymatic synthesis of deoxyribonucleic acid. XI. Further studies on nearest neighbor base sequences in deoxyribonucleic acids</article-title>
<source>J Biol Chem</source>
<year>1962</year>
<volume>237</volume>
<fpage>1961</fpage>
<lpage>1967</lpage>
<pub-id pub-id-type="pmid">13918810</pub-id>
</citation>
</ref>
<ref id="B6">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Bird</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Taggart</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Frommer</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Miller</surname>
<given-names>OJ</given-names>
</name>
<name>
<surname>Macleod</surname>
<given-names>D</given-names>
</name>
</person-group>
<article-title>A fraction of the mouse genome that is derived from islands of nonmethylated, CpG-rich DNA</article-title>
<source>Cell</source>
<year>1985</year>
<volume>40</volume>
<fpage>91</fpage>
<lpage>99</lpage>
<pub-id pub-id-type="pmid">2981636</pub-id>
<pub-id pub-id-type="doi">10.1016/0092-8674(85)90312-5</pub-id>
</citation>
</ref>
<ref id="B7">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Bird</surname>
<given-names>A</given-names>
</name>
</person-group>
<article-title>DNA methylation patterns and epigenetic memory</article-title>
<source>Genes Dev</source>
<year>2002</year>
<volume>16</volume>
<fpage>6</fpage>
<lpage>21</lpage>
<pub-id pub-id-type="pmid">11782440</pub-id>
<pub-id pub-id-type="doi">10.1101/gad.947102</pub-id>
</citation>
</ref>
<ref id="B8">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Gardiner-Garden</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Frommer</surname>
<given-names>M</given-names>
</name>
</person-group>
<article-title>CpG islands in vertebrate genomes</article-title>
<source>J Mol Biol</source>
<year>1987</year>
<volume>196</volume>
<fpage>261</fpage>
<lpage>282</lpage>
<pub-id pub-id-type="pmid">3656447</pub-id>
<pub-id pub-id-type="doi">10.1016/0022-2836(87)90689-9</pub-id>
</citation>
</ref>
<ref id="B9">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Kim</surname>
<given-names>TH</given-names>
</name>
<name>
<surname>Barrera</surname>
<given-names>LO</given-names>
</name>
<name>
<surname>Zheng</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Qu</surname>
<given-names>C</given-names>
</name>
<name>
<surname>Singer</surname>
<given-names>MA</given-names>
</name>
<name>
<surname>Richmond</surname>
<given-names>TA</given-names>
</name>
<name>
<surname>Wu</surname>
<given-names>Y</given-names>
</name>
<name>
<surname>Green</surname>
<given-names>RD</given-names>
</name>
<name>
<surname>Ren</surname>
<given-names>B</given-names>
</name>
</person-group>
<article-title>A high-resolution map of active promoters in the human genome</article-title>
<source>Nature</source>
<year>2005</year>
<volume>436</volume>
<fpage>876</fpage>
<lpage>880</lpage>
<pub-id pub-id-type="pmid">15988478</pub-id>
<pub-id pub-id-type="doi">10.1038/nature03877</pub-id>
</citation>
</ref>
<ref id="B10">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Carninci</surname>
<given-names>P</given-names>
</name>
<name>
<surname>Sandelin</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Lenhard</surname>
<given-names>B</given-names>
</name>
<name>
<surname>Katayama</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Shimokawa</surname>
<given-names>K</given-names>
</name>
<name>
<surname>Ponjavic</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Semple</surname>
<given-names>CA</given-names>
</name>
<name>
<surname>Taylor</surname>
<given-names>MS</given-names>
</name>
<name>
<surname>Engstrom</surname>
<given-names>PG</given-names>
</name>
<name>
<surname>Frith</surname>
<given-names>MC</given-names>
</name>
<etal></etal>
</person-group>
<article-title>Genome-wide analysis of mammalian promoter architecture and evolution</article-title>
<source>Nat Genet</source>
<year>2006</year>
<volume>38</volume>
<fpage>626</fpage>
<lpage>635</lpage>
<pub-id pub-id-type="pmid">16645617</pub-id>
<pub-id pub-id-type="doi">10.1038/ng1789</pub-id>
</citation>
</ref>
<ref id="B11">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Sabo</surname>
<given-names>PJ</given-names>
</name>
<name>
<surname>Humbert</surname>
<given-names>R</given-names>
</name>
<name>
<surname>Hawrylycz</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Wallace</surname>
<given-names>JC</given-names>
</name>
<name>
<surname>Dorschner</surname>
<given-names>MO</given-names>
</name>
<name>
<surname>McArthur</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Stamatoyannopoulos</surname>
<given-names>JA</given-names>
</name>
</person-group>
<article-title>Genome-wide identification of DNaseI hypersensitive sites using active chromatin sequence libraries</article-title>
<source>Proc Natl Acad Sci USA</source>
<year>2004</year>
<volume>101</volume>
<fpage>4537</fpage>
<lpage>4542</lpage>
<pub-id pub-id-type="pmid">15070753</pub-id>
<pub-id pub-id-type="doi">10.1073/pnas.0400678101</pub-id>
</citation>
</ref>
<ref id="B12">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Crawford</surname>
<given-names>GE</given-names>
</name>
<name>
<surname>Holt</surname>
<given-names>IE</given-names>
</name>
<name>
<surname>Whittle</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Webb</surname>
<given-names>BD</given-names>
</name>
<name>
<surname>Tai</surname>
<given-names>D</given-names>
</name>
<name>
<surname>Davis</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Margulies</surname>
<given-names>EH</given-names>
</name>
<name>
<surname>Chen</surname>
<given-names>Y</given-names>
</name>
<name>
<surname>Bernat</surname>
<given-names>JA</given-names>
</name>
<name>
<surname>Ginsburg</surname>
<given-names>D</given-names>
</name>
<etal></etal>
</person-group>
<article-title>Genome-wide mapping of DNase hypersensitive sites using massively parallel signature sequencing (MPSS)</article-title>
<source>Genome Res</source>
<year>2006</year>
<volume>16</volume>
<fpage>123</fpage>
<lpage>131</lpage>
<pub-id pub-id-type="pmid">16344561</pub-id>
<pub-id pub-id-type="doi">10.1101/gr.4074106</pub-id>
</citation>
</ref>
<ref id="B13">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Ren</surname>
<given-names>B</given-names>
</name>
<name>
<surname>Robert</surname>
<given-names>F</given-names>
</name>
<name>
<surname>Wyrick</surname>
<given-names>JJ</given-names>
</name>
<name>
<surname>Aparicio</surname>
<given-names>O</given-names>
</name>
<name>
<surname>Jennings</surname>
<given-names>EG</given-names>
</name>
<name>
<surname>Simon</surname>
<given-names>I</given-names>
</name>
<name>
<surname>Zeitlinger</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Schreiber</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Hannett</surname>
<given-names>N</given-names>
</name>
<name>
<surname>Kanin</surname>
<given-names>E</given-names>
</name>
<etal></etal>
</person-group>
<article-title>Genome-wide location and function of DNA binding proteins</article-title>
<source>Science</source>
<year>2000</year>
<volume>290</volume>
<fpage>2306</fpage>
<lpage>2309</lpage>
<pub-id pub-id-type="pmid">11125145</pub-id>
<pub-id pub-id-type="doi">10.1126/science.290.5500.2306</pub-id>
</citation>
</ref>
<ref id="B14">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Barski</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Cuddapah</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Cui</surname>
<given-names>K</given-names>
</name>
<name>
<surname>Roh</surname>
<given-names>TY</given-names>
</name>
<name>
<surname>Schones</surname>
<given-names>DE</given-names>
</name>
<name>
<surname>Wang</surname>
<given-names>Z</given-names>
</name>
<name>
<surname>Wei</surname>
<given-names>G</given-names>
</name>
<name>
<surname>Chepelev</surname>
<given-names>I</given-names>
</name>
<name>
<surname>Zhao</surname>
<given-names>K</given-names>
</name>
</person-group>
<article-title>High-resolution profiling of histone methylations in the human genome</article-title>
<source>Cell</source>
<year>2007</year>
<volume>129</volume>
<fpage>823</fpage>
<lpage>837</lpage>
<pub-id pub-id-type="pmid">17512414</pub-id>
<pub-id pub-id-type="doi">10.1016/j.cell.2007.05.009</pub-id>
</citation>
</ref>
<ref id="B15">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>FitzGerald</surname>
<given-names>PC</given-names>
</name>
<name>
<surname>Shlyakhtenko</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Mir</surname>
<given-names>AA</given-names>
</name>
<name>
<surname>Vinson</surname>
<given-names>C</given-names>
</name>
</person-group>
<article-title>Clustering of DNA sequences in human promoters</article-title>
<source>Genome Res</source>
<year>2004</year>
<volume>14</volume>
<fpage>1562</fpage>
<lpage>1574</lpage>
<pub-id pub-id-type="pmid">15256515</pub-id>
<pub-id pub-id-type="doi">10.1101/gr.1953904</pub-id>
</citation>
</ref>
<ref id="B16">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Bina</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Wyss</surname>
<given-names>P</given-names>
</name>
<name>
<surname>Ren</surname>
<given-names>W</given-names>
</name>
<name>
<surname>Szpankowski</surname>
<given-names>W</given-names>
</name>
<name>
<surname>Thomas</surname>
<given-names>E</given-names>
</name>
<name>
<surname>Randhawa</surname>
<given-names>R</given-names>
</name>
<name>
<surname>Reddy</surname>
<given-names>S</given-names>
</name>
<name>
<surname>John</surname>
<given-names>PM</given-names>
</name>
<name>
<surname>Pares-Matos</surname>
<given-names>EI</given-names>
</name>
<name>
<surname>Stein</surname>
<given-names>A</given-names>
</name>
<etal></etal>
</person-group>
<article-title>Exploring the characteristics of sequence elements in proximal promoters of human genes</article-title>
<source>Genomics</source>
<year>2004</year>
<volume>84</volume>
<fpage>929</fpage>
<lpage>940</lpage>
<pub-id pub-id-type="pmid">15533710</pub-id>
<pub-id pub-id-type="doi">10.1016/j.ygeno.2004.08.013</pub-id>
</citation>
</ref>
<ref id="B17">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Marino-Ramirez</surname>
<given-names>L</given-names>
</name>
<name>
<surname>Spouge</surname>
<given-names>JL</given-names>
</name>
<name>
<surname>Kanga</surname>
<given-names>GC</given-names>
</name>
<name>
<surname>Landsman</surname>
<given-names>D</given-names>
</name>
</person-group>
<article-title>Statistical analysis of over-represented words in human promoter sequences</article-title>
<source>Nucleic Acids Res</source>
<year>2004</year>
<volume>32</volume>
<fpage>949</fpage>
<lpage>958</lpage>
<pub-id pub-id-type="pmid">14963262</pub-id>
<pub-id pub-id-type="doi">10.1093/nar/gkh246</pub-id>
</citation>
</ref>
<ref id="B18">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Xie</surname>
<given-names>X</given-names>
</name>
<name>
<surname>Lu</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Kulbokas</surname>
<given-names>EJ</given-names>
</name>
<name>
<surname>Golub</surname>
<given-names>TR</given-names>
</name>
<name>
<surname>Mootha</surname>
<given-names>V</given-names>
</name>
<name>
<surname>Lindblad-Toh</surname>
<given-names>K</given-names>
</name>
<name>
<surname>Lander</surname>
<given-names>ES</given-names>
</name>
<name>
<surname>Kellis</surname>
<given-names>M</given-names>
</name>
</person-group>
<article-title>Systematic discovery of regulatory motifs in human promoters and 3' UTRs by comparison of several mammals</article-title>
<source>Nature</source>
<year>2005</year>
<volume>434</volume>
<fpage>338</fpage>
<lpage>345</lpage>
<pub-id pub-id-type="pmid">15735639</pub-id>
<pub-id pub-id-type="doi">10.1038/nature03441</pub-id>
</citation>
</ref>
<ref id="B19">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Weinmann</surname>
<given-names>AS</given-names>
</name>
<name>
<surname>Farnham</surname>
<given-names>PJ</given-names>
</name>
</person-group>
<article-title>Identification of unknown target genes of human transcription factors using chromatin immunoprecipitation</article-title>
<source>Methods</source>
<year>2002</year>
<volume>26</volume>
<fpage>37</fpage>
<lpage>47</lpage>
<pub-id pub-id-type="pmid">12054903</pub-id>
<pub-id pub-id-type="doi">10.1016/S1046-2023(02)00006-3</pub-id>
</citation>
</ref>
<ref id="B20">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Ptashne</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Gann</surname>
<given-names>A</given-names>
</name>
</person-group>
<article-title>Transcriptional activation by recruitment</article-title>
<source>Nature</source>
<year>1997</year>
<volume>386</volume>
<fpage>569</fpage>
<lpage>577</lpage>
<pub-id pub-id-type="pmid">9121580</pub-id>
<pub-id pub-id-type="doi">10.1038/386569a0</pub-id>
</citation>
</ref>
<ref id="B21">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Schubeler</surname>
<given-names>D</given-names>
</name>
<name>
<surname>MacAlpine</surname>
<given-names>DM</given-names>
</name>
<name>
<surname>Scalzo</surname>
<given-names>D</given-names>
</name>
<name>
<surname>Wirbelauer</surname>
<given-names>C</given-names>
</name>
<name>
<surname>Kooperberg</surname>
<given-names>C</given-names>
</name>
<name>
<surname>van Leeuwen</surname>
<given-names>F</given-names>
</name>
<name>
<surname>Gottschling</surname>
<given-names>DE</given-names>
</name>
<name>
<surname>O'Neill</surname>
<given-names>LP</given-names>
</name>
<name>
<surname>Turner</surname>
<given-names>BM</given-names>
</name>
<name>
<surname>Delrow</surname>
<given-names>J</given-names>
</name>
<etal></etal>
</person-group>
<article-title>The histone modification pattern of active genes revealed through genome-wide chromatin analysis of a higher eukaryote</article-title>
<source>Genes Dev</source>
<year>2004</year>
<volume>18</volume>
<fpage>1263</fpage>
<lpage>1271</lpage>
<pub-id pub-id-type="pmid">15175259</pub-id>
<pub-id pub-id-type="doi">10.1101/gad.1198204</pub-id>
</citation>
</ref>
<ref id="B22">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Guenther</surname>
<given-names>MG</given-names>
</name>
<name>
<surname>Levine</surname>
<given-names>SS</given-names>
</name>
<name>
<surname>Boyer</surname>
<given-names>LA</given-names>
</name>
<name>
<surname>Jaenisch</surname>
<given-names>R</given-names>
</name>
<name>
<surname>Young</surname>
<given-names>RA</given-names>
</name>
</person-group>
<article-title>A chromatin landmark and transcription initiation at most promoters in human cells</article-title>
<source>Cell</source>
<year>2007</year>
<volume>130</volume>
<fpage>77</fpage>
<lpage>88</lpage>
<pub-id pub-id-type="pmid">17632057</pub-id>
<pub-id pub-id-type="doi">10.1016/j.cell.2007.05.042</pub-id>
</citation>
</ref>
<ref id="B23">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>FitzGerald</surname>
<given-names>PC</given-names>
</name>
<name>
<surname>Sturgill</surname>
<given-names>D</given-names>
</name>
<name>
<surname>Shlyakhtenko</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Oliver</surname>
<given-names>B</given-names>
</name>
<name>
<surname>Vinson</surname>
<given-names>C</given-names>
</name>
</person-group>
<article-title>Comparative genomics of Drosophila and human core promoters</article-title>
<source>Genome Biol</source>
<year>2006</year>
<volume>7</volume>
<fpage>R53</fpage>
<pub-id pub-id-type="pmid">16827941</pub-id>
<pub-id pub-id-type="doi">10.1186/gb-2006-7-7-r53</pub-id>
</citation>
</ref>
<ref id="B24">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Noma</surname>
<given-names>K</given-names>
</name>
<name>
<surname>Allis</surname>
<given-names>CD</given-names>
</name>
<name>
<surname>Grewal</surname>
<given-names>SI</given-names>
</name>
</person-group>
<article-title>Transitions in distinct histone H3 methylation patterns at the heterochromatin domain boundaries</article-title>
<source>Science</source>
<year>2001</year>
<volume>293</volume>
<fpage>1150</fpage>
<lpage>1155</lpage>
<pub-id pub-id-type="pmid">11498594</pub-id>
<pub-id pub-id-type="doi">10.1126/science.1064150</pub-id>
</citation>
</ref>
<ref id="B25">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Litt</surname>
<given-names>MD</given-names>
</name>
<name>
<surname>Simpson</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Gaszner</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Allis</surname>
<given-names>CD</given-names>
</name>
<name>
<surname>Felsenfeld</surname>
<given-names>G</given-names>
</name>
</person-group>
<article-title>Correlation between histone lysine methylation and developmental changes at the chicken beta-globin locus</article-title>
<source>Science</source>
<year>2001</year>
<volume>293</volume>
<fpage>2453</fpage>
<lpage>2455</lpage>
<pub-id pub-id-type="pmid">11498546</pub-id>
<pub-id pub-id-type="doi">10.1126/science.1064413</pub-id>
</citation>
</ref>
<ref id="B26">
<citation citation-type="other">
<article-title>GNF Genome Informatics Applications & Datasets</article-title>
<ext-link ext-link-type="uri" xlink:href="http://wombat.gnf.org"></ext-link>
</citation>
</ref>
<ref id="B27">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Bajic</surname>
<given-names>VB</given-names>
</name>
<name>
<surname>Tan</surname>
<given-names>SL</given-names>
</name>
<name>
<surname>Christoffels</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Schonbach</surname>
<given-names>C</given-names>
</name>
<name>
<surname>Lipovich</surname>
<given-names>L</given-names>
</name>
<name>
<surname>Yang</surname>
<given-names>L</given-names>
</name>
<name>
<surname>Hofmann</surname>
<given-names>O</given-names>
</name>
<name>
<surname>Kruger</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Hide</surname>
<given-names>W</given-names>
</name>
<name>
<surname>Kai</surname>
<given-names>C</given-names>
</name>
<etal></etal>
</person-group>
<article-title>Mice and men: their promoter properties</article-title>
<source>PLoS Genet</source>
<year>2006</year>
<volume>2</volume>
<fpage>e54</fpage>
<pub-id pub-id-type="pmid">16683032</pub-id>
<pub-id pub-id-type="doi">10.1371/journal.pgen.0020054</pub-id>
</citation>
</ref>
<ref id="B28">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Lonze</surname>
<given-names>BE</given-names>
</name>
<name>
<surname>Ginty</surname>
<given-names>DD</given-names>
</name>
</person-group>
<article-title>Function and regulation of CREB family transcription factors in the nervous system</article-title>
<source>Neuron</source>
<year>2002</year>
<volume>35</volume>
<fpage>605</fpage>
<lpage>623</lpage>
<pub-id pub-id-type="pmid">12194863</pub-id>
<pub-id pub-id-type="doi">10.1016/S0896-6273(02)00828-0</pub-id>
</citation>
</ref>
<ref id="B29">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Impey</surname>
<given-names>S</given-names>
</name>
<name>
<surname>McCorkle</surname>
<given-names>SR</given-names>
</name>
<name>
<surname>Cha-Molstad</surname>
<given-names>H</given-names>
</name>
<name>
<surname>Dwyer</surname>
<given-names>JM</given-names>
</name>
<name>
<surname>Yochum</surname>
<given-names>GS</given-names>
</name>
<name>
<surname>Boss</surname>
<given-names>JM</given-names>
</name>
<name>
<surname>McWeeney</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Dunn</surname>
<given-names>JJ</given-names>
</name>
<name>
<surname>Mandel</surname>
<given-names>G</given-names>
</name>
<name>
<surname>Goodman</surname>
<given-names>RH</given-names>
</name>
</person-group>
<article-title>Defining the CREB regulon: a genome-wide analysis of transcription factor regulatory regions</article-title>
<source>Cell</source>
<year>2004</year>
<volume>119</volume>
<fpage>1041</fpage>
<lpage>1054</lpage>
<pub-id pub-id-type="pmid">15620361</pub-id>
</citation>
</ref>
<ref id="B30">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Jones</surname>
<given-names>PA</given-names>
</name>
<name>
<surname>Baylin</surname>
<given-names>SB</given-names>
</name>
</person-group>
<article-title>The epigenomics of cancer</article-title>
<source>Cell</source>
<year>2007</year>
<volume>128</volume>
<fpage>683</fpage>
<lpage>692</lpage>
<pub-id pub-id-type="pmid">17320506</pub-id>
<pub-id pub-id-type="doi">10.1016/j.cell.2007.01.029</pub-id>
</citation>
</ref>
<ref id="B31">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Bird</surname>
<given-names>AP</given-names>
</name>
<name>
<surname>Wolffe</surname>
<given-names>AP</given-names>
</name>
</person-group>
<article-title>Methylation-induced repression – belts, braces, and chromatin</article-title>
<source>Cell</source>
<year>1999</year>
<volume>99</volume>
<fpage>451</fpage>
<lpage>454</lpage>
<pub-id pub-id-type="pmid">10589672</pub-id>
<pub-id pub-id-type="doi">10.1016/S0092-8674(00)81532-9</pub-id>
</citation>
</ref>
<ref id="B32">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Tate</surname>
<given-names>PH</given-names>
</name>
<name>
<surname>Bird</surname>
<given-names>AP</given-names>
</name>
</person-group>
<article-title>Effects of DNA methylation on DNA-binding proteins and gene expression</article-title>
<source>Curr Opin Genet Dev</source>
<year>1993</year>
<volume>3</volume>
<fpage>226</fpage>
<lpage>231</lpage>
<pub-id pub-id-type="pmid">8504247</pub-id>
<pub-id pub-id-type="doi">10.1016/0959-437X(93)90027-M</pub-id>
</citation>
</ref>
<ref id="B33">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Weih</surname>
<given-names>F</given-names>
</name>
<name>
<surname>Nitsch</surname>
<given-names>D</given-names>
</name>
<name>
<surname>Reik</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Schutz</surname>
<given-names>G</given-names>
</name>
<name>
<surname>Becker</surname>
<given-names>PB</given-names>
</name>
</person-group>
<article-title>Analysis of CpG methylation and genomic footprinting at the tyrosine aminotransferase gene: DNA methylation alone is not sufficient to prevent protein binding in vivo</article-title>
<source>EMBO J</source>
<year>1991</year>
<volume>10</volume>
<fpage>2559</fpage>
<lpage>2567</lpage>
<pub-id pub-id-type="pmid">1714382</pub-id>
</citation>
</ref>
<ref id="B34">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Gaston</surname>
<given-names>K</given-names>
</name>
<name>
<surname>Fried</surname>
<given-names>M</given-names>
</name>
</person-group>
<article-title>CpG methylation and the binding of YY1 and ETS proteins to the Surf-1/Surf-2 bidirectional promoter</article-title>
<source>Gene</source>
<year>1995</year>
<volume>157</volume>
<fpage>257</fpage>
<lpage>259</lpage>
<pub-id pub-id-type="pmid">7607503</pub-id>
<pub-id pub-id-type="doi">10.1016/0378-1119(95)00120-U</pub-id>
</citation>
</ref>
<ref id="B35">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Choi</surname>
<given-names>YS</given-names>
</name>
<name>
<surname>Kim</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Kyu Lee</surname>
<given-names>H</given-names>
</name>
<name>
<surname>Lee</surname>
<given-names>KU</given-names>
</name>
<name>
<surname>Pak</surname>
<given-names>YK</given-names>
</name>
</person-group>
<article-title>In vitro methylation of nuclear respiratory factor-1 binding site suppresses the promoter activity of mitochondrial transcription factor A</article-title>
<source>Biochem Biophys Res Commun</source>
<year>2004</year>
<volume>314</volume>
<fpage>118</fpage>
<lpage>122</lpage>
<pub-id pub-id-type="pmid">14715254</pub-id>
<pub-id pub-id-type="doi">10.1016/j.bbrc.2003.12.065</pub-id>
</citation>
</ref>
<ref id="B36">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Prendergast</surname>
<given-names>GC</given-names>
</name>
<name>
<surname>Ziff</surname>
<given-names>EB</given-names>
</name>
</person-group>
<article-title>Methylation-sensitive sequence-specific DNA binding by the c-Myc basic region</article-title>
<source>Science</source>
<year>1991</year>
<volume>251</volume>
<fpage>186</fpage>
<lpage>189</lpage>
<pub-id pub-id-type="pmid">1987636</pub-id>
<pub-id pub-id-type="doi">10.1126/science.1987636</pub-id>
</citation>
</ref>
<ref id="B37">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Comb</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Goodman</surname>
<given-names>HM</given-names>
</name>
</person-group>
<article-title>CpG methylation inhibits proenkephalin gene expression and binding of the transcription factor AP-2</article-title>
<source>Nucleic Acids Res</source>
<year>1990</year>
<volume>18</volume>
<fpage>3975</fpage>
<lpage>3982</lpage>
<pub-id pub-id-type="pmid">1695733</pub-id>
<pub-id pub-id-type="doi">10.1093/nar/18.13.3975</pub-id>
</citation>
</ref>
<ref id="B38">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Bell</surname>
<given-names>AC</given-names>
</name>
<name>
<surname>Felsenfeld</surname>
<given-names>G</given-names>
</name>
</person-group>
<article-title>Methylation of a CTCF-dependent boundary controls imprinted expression of the Igf2 gene</article-title>
<source>Nature</source>
<year>2000</year>
<volume>405</volume>
<fpage>482</fpage>
<lpage>485</lpage>
<pub-id pub-id-type="pmid">10839546</pub-id>
<pub-id pub-id-type="doi">10.1038/35013100</pub-id>
</citation>
</ref>
<ref id="B39">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Harrington</surname>
<given-names>MA</given-names>
</name>
<name>
<surname>Jones</surname>
<given-names>PA</given-names>
</name>
<name>
<surname>Imagawa</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Karin</surname>
<given-names>M</given-names>
</name>
</person-group>
<article-title>Cytosine methylation does not affect binding of transcription factor Sp1</article-title>
<source>Proc Natl Acad Sci USA</source>
<year>1988</year>
<volume>85</volume>
<fpage>2066</fpage>
<lpage>2070</lpage>
<pub-id pub-id-type="pmid">3281160</pub-id>
<pub-id pub-id-type="doi">10.1073/pnas.85.7.2066</pub-id>
</citation>
</ref>
<ref id="B40">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Holler</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Westin</surname>
<given-names>G</given-names>
</name>
<name>
<surname>Jiricny</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Schaffner</surname>
<given-names>W</given-names>
</name>
</person-group>
<article-title>Sp1 transcription factor binds DNA and activates transcription even when the binding site is CpG methylated</article-title>
<source>Genes Dev</source>
<year>1988</year>
<volume>2</volume>
<fpage>1127</fpage>
<lpage>1135</lpage>
<pub-id pub-id-type="pmid">3056778</pub-id>
<pub-id pub-id-type="doi">10.1101/gad.2.9.1127</pub-id>
</citation>
</ref>
<ref id="B41">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Clark</surname>
<given-names>SJ</given-names>
</name>
<name>
<surname>Harrison</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Molloy</surname>
<given-names>PL</given-names>
</name>
</person-group>
<article-title>Sp1 binding is inhibited by (m)Cp(m)CpG methylation</article-title>
<source>Gene</source>
<year>1997</year>
<volume>195</volume>
<fpage>67</fpage>
<lpage>71</lpage>
<pub-id pub-id-type="pmid">9300822</pub-id>
<pub-id pub-id-type="doi">10.1016/S0378-1119(97)00164-9</pub-id>
</citation>
</ref>
<ref id="B42">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Zhu</surname>
<given-names>WG</given-names>
</name>
<name>
<surname>Srinivasan</surname>
<given-names>K</given-names>
</name>
<name>
<surname>Dai</surname>
<given-names>Z</given-names>
</name>
<name>
<surname>Duan</surname>
<given-names>W</given-names>
</name>
<name>
<surname>Druhan</surname>
<given-names>LJ</given-names>
</name>
<name>
<surname>Ding</surname>
<given-names>H</given-names>
</name>
<name>
<surname>Yee</surname>
<given-names>L</given-names>
</name>
<name>
<surname>Villalona-Calero</surname>
<given-names>MA</given-names>
</name>
<name>
<surname>Plass</surname>
<given-names>C</given-names>
</name>
<name>
<surname>Otterson</surname>
<given-names>GA</given-names>
</name>
</person-group>
<article-title>Methylation of adjacent CpG sites affects Sp1/Sp3 binding and activity in the p21(Cip1) promoter</article-title>
<source>Mol Cell Biol</source>
<year>2003</year>
<volume>23</volume>
<fpage>4056</fpage>
<lpage>4065</lpage>
<pub-id pub-id-type="pmid">12773551</pub-id>
<pub-id pub-id-type="doi">10.1128/MCB.23.12.4056-4065.2003</pub-id>
</citation>
</ref>
<ref id="B43">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Mancini</surname>
<given-names>DN</given-names>
</name>
<name>
<surname>Singh</surname>
<given-names>SM</given-names>
</name>
<name>
<surname>Archer</surname>
<given-names>TK</given-names>
</name>
<name>
<surname>Rodenhiser</surname>
<given-names>DI</given-names>
</name>
</person-group>
<article-title>Site-specific DNA methylation in the neurofibromatosis (NF1) promoter interferes with binding of CREB and SP1 transcription factors</article-title>
<source>Oncogene</source>
<year>1999</year>
<volume>18</volume>
<fpage>4108</fpage>
<lpage>4119</lpage>
<pub-id pub-id-type="pmid">10435592</pub-id>
<pub-id pub-id-type="doi">10.1038/sj.onc.1202764</pub-id>
</citation>
</ref>
<ref id="B44">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Berger</surname>
<given-names>MF</given-names>
</name>
<name>
<surname>Philippakis</surname>
<given-names>AA</given-names>
</name>
<name>
<surname>Qureshi</surname>
<given-names>AM</given-names>
</name>
<name>
<surname>He</surname>
<given-names>FS</given-names>
</name>
<name>
<surname>Estep</surname>
<given-names>PW</given-names>
<suffix>3rd</suffix>
</name>
<name>
<surname>Bulyk</surname>
<given-names>ML</given-names>
</name>
</person-group>
<article-title>Compact, universal DNA microarrays to comprehensively determine transcription-factor binding site specificities</article-title>
<source>Nat Biotechnol</source>
<year>2006</year>
<volume>24</volume>
<fpage>1429</fpage>
<lpage>1435</lpage>
<pub-id pub-id-type="pmid">16998473</pub-id>
<pub-id pub-id-type="doi">10.1038/nbt1246</pub-id>
</citation>
</ref>
<ref id="B45">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Puckett</surname>
<given-names>JW</given-names>
</name>
<name>
<surname>Muzikar</surname>
<given-names>KA</given-names>
</name>
<name>
<surname>Tietjen</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Warren</surname>
<given-names>CL</given-names>
</name>
<name>
<surname>Ansari</surname>
<given-names>AZ</given-names>
</name>
<name>
<surname>Dervan</surname>
<given-names>PB</given-names>
</name>
</person-group>
<article-title>Quantitative Microarray Profiling of DNA-Binding Molecules</article-title>
<source>J Am Chem Soc</source>
<year>2007</year>
<volume>129</volume>
<fpage>12310</fpage>
<lpage>12319</lpage>
<pub-id pub-id-type="pmid">17880081</pub-id>
<pub-id pub-id-type="doi">10.1021/ja0744899</pub-id>
</citation>
</ref>
<ref id="B46">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Bird</surname>
<given-names>AP</given-names>
</name>
</person-group>
<article-title>CpG-rich islands and the function of DNA methylation</article-title>
<source>Nature</source>
<year>1986</year>
<volume>321</volume>
<fpage>209</fpage>
<lpage>213</lpage>
<pub-id pub-id-type="pmid">2423876</pub-id>
<pub-id pub-id-type="doi">10.1038/321209a0</pub-id>
</citation>
</ref>
<ref id="B47">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Dlugosz</surname>
<given-names>AA</given-names>
</name>
<name>
<surname>Glick</surname>
<given-names>AB</given-names>
</name>
<name>
<surname>Tennenbaum</surname>
<given-names>T</given-names>
</name>
<name>
<surname>Weinberg</surname>
<given-names>WC</given-names>
</name>
<name>
<surname>Yuspa</surname>
<given-names>SH</given-names>
</name>
</person-group>
<article-title>Isolation and utilization of epidermal keratinocytes for oncogene research</article-title>
<source>Methods Enzymol</source>
<year>1995</year>
<volume>254</volume>
<fpage>3</fpage>
<lpage>20</lpage>
<pub-id pub-id-type="pmid">8531694</pub-id>
</citation>
</ref>
<ref id="B48">
<citation citation-type="other">
<article-title>The Farnham laboratory</article-title>
<ext-link ext-link-type="uri" xlink:href="http://www.genomecenter.ucdavis.edu/farnham/"></ext-link>
</citation>
</ref>
<ref id="B49">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Lippman</surname>
<given-names>Z</given-names>
</name>
<name>
<surname>Gendrel</surname>
<given-names>AV</given-names>
</name>
<name>
<surname>Colot</surname>
<given-names>V</given-names>
</name>
<name>
<surname>Martienssen</surname>
<given-names>R</given-names>
</name>
</person-group>
<article-title>Profiling DNA methylation patterns using genomic tiling microarrays</article-title>
<source>Nat Methods</source>
<year>2005</year>
<volume>2</volume>
<fpage>219</fpage>
<lpage>224</lpage>
<pub-id pub-id-type="pmid">16163803</pub-id>
<pub-id pub-id-type="doi">10.1038/nmeth0305-219</pub-id>
</citation>
</ref>
<ref id="B50">
<citation citation-type="other">
<article-title>Round A/B/C Random Amplification of DNA Protocol</article-title>
<ext-link ext-link-type="uri" xlink:href="http://cat.ucsf.edu/pdfs/22_Round_A_B_C_protocol.pdf"></ext-link>
</citation>
</ref>
<ref id="B51">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Gerdes</surname>
<given-names>MJ</given-names>
</name>
<name>
<surname>Myakishev</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Frost</surname>
<given-names>NA</given-names>
</name>
<name>
<surname>Rishi</surname>
<given-names>V</given-names>
</name>
<name>
<surname>Moitra</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Acharya</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Levy</surname>
<given-names>MR</given-names>
</name>
<name>
<surname>Park</surname>
<given-names>SW</given-names>
</name>
<name>
<surname>Glick</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Yuspa</surname>
<given-names>SH</given-names>
</name>
<etal></etal>
</person-group>
<article-title>Activator protein-1 activity regulates epithelial tumor cell identity</article-title>
<source>Cancer Res</source>
<year>2006</year>
<volume>66</volume>
<fpage>7578</fpage>
<lpage>7588</lpage>
<pub-id pub-id-type="pmid">16885357</pub-id>
<pub-id pub-id-type="doi">10.1158/0008-5472.CAN-06-1247</pub-id>
</citation>
</ref>
</ref-list>
</back>
</pmc>
</record>

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Sante/explor/MersV1/Data/Pmc/Corpus
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 0005469 | SxmlIndent | more

Or

HfdSelect -h $EXPLOR_AREA/Data/Pmc/Corpus/biblio.hfd -nk 0005469 | SxmlIndent | more
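
Once the record has been extracted with one of the commands above, its reference list can be summarized with a short script. The following is a minimal sketch in Python, not part of Dilib: the function name `summarize_refs` and the embedded `SAMPLE` string are hypothetical, and the sketch assumes the XML is well-formed. The page itself flags a probable XML problem with this record, so a strict parser may reject the full record until that is repaired.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: one <ref> entry copied from the record's reference list.
SAMPLE = """<ref-list>
<ref id="B46">
<citation citation-type="journal">
<person-group person-group-type="author">
<name><surname>Bird</surname><given-names>AP</given-names></name>
</person-group>
<article-title>CpG-rich islands and the function of DNA methylation</article-title>
<source>Nature</source>
<year>1986</year>
<pub-id pub-id-type="pmid">2423876</pub-id>
</citation>
</ref>
</ref-list>"""


def summarize_refs(xml_text):
    """Return one dict per <ref> with its id, first author, year, source and PMID."""
    root = ET.fromstring(xml_text)
    rows = []
    for ref in root.iter("ref"):
        cit = ref.find("citation")
        if cit is None:
            continue  # skip entries without a <citation> child
        surnames = [s.text for s in cit.iter("surname") if s.text]
        rows.append({
            "id": ref.get("id"),
            "first_author": surnames[0] if surnames else None,
            "year": cit.findtext("year"),
            "source": cit.findtext("source"),
            # Entries without a PMID (e.g. web links) yield None here.
            "pmid": cit.findtext("pub-id[@pub-id-type='pmid']"),
        })
    return rows


if __name__ == "__main__":
    for row in summarize_refs(SAMPLE):
        print(row)
```

To use it on real output, one could redirect the `HfdSelect` output to a file and pass that file's contents to `summarize_refs` in place of `SAMPLE`.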

To link to this page within the Wicri network

{{Explor lien
   |wiki=    Sante
   |area=    MersV1
   |flux=    Pmc
   |étape=   Corpus
   |type=    RBID
   |clé=     
   |texte=   
}}


This area was generated with Dilib version V0.6.33.
Data generation: Mon Apr 20 23:26:43 2020. Site generation: Sat Mar 27 09:06:09 2021