MERS Exploration Server

Warning: this site is under development!
Warning: this site was generated automatically from raw corpora.
The information has therefore not been validated.

The disruptive positions in human G-quadruplex motifs are less polymorphic and more conserved than their neutral counterparts

Internal identifier: 000F44 (Pmc/Corpus); previous: 000F43; next: 000F45

Authors: Sigve Nakken; Torbjørn Rognes; Eivind Hovig

Source:

RBID: PMC:2761265

Abstract

Specific guanine-rich sequence motifs in the human genome have considerable potential to form four-stranded structures known as G-quadruplexes or G4 DNA. The enrichment of these motifs in key chromosomal regions has suggested a functional role for the G-quadruplex structure in genomic regulation. In this work, we have examined the spectrum of nucleotide substitutions in G4 motifs, and related this spectrum to G4 prevalence. Data collected from the large repository of human SNPs indicates that the core feature of G-quadruplex motifs, 5′-GGG-3′, exhibits specific mutational patterns that preserve the potential for G4 formation. In particular, we find a genome-wide pattern in which sites that disrupt the guanine triplets are more conserved and less polymorphic than their neutral counterparts. This also holds when considering non-CpG sites only. However, the low level of polymorphisms in guanine tracts is not only confined to G4 motifs. A complete mapping of DNA three-mers at guanine polymorphisms indicated that short guanine tracts are the most under-represented sequence context at polymorphic sites. Furthermore, we provide evidence for a strand bias upstream of human genes. Here, a significantly lower rate of G4-disruptive SNPs on the non-template strand supports a higher relative influence of G4 formation on this strand during transcription.
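
As a rough illustration of the motif definition used in the article (at least four runs of at least three guanines), the sketch below scans a sequence with a regular expression and checks whether a single substitution removes the match. The loop-length bounds of 1 to 7 nucleotides, the function names `find_g4_motifs` and `is_disruptive`, and the match-based notion of "disruptive" are assumptions made for this example; they are not taken from this record and should not be read as the authors' pipeline.

```python
import re

# Assumed pattern: four runs of >= 3 guanines separated by loops of 1-7 nt.
# The loop bounds are an assumption for this sketch, not stated in the record.
G4_PATTERN = re.compile(r"G{3,}(?:[ACGT]{1,7}?G{3,}){3}")


def find_g4_motifs(seq):
    """Return (start, end, motif) for each non-overlapping match on the given strand."""
    seq = seq.upper()
    return [(m.start(), m.end(), m.group()) for m in G4_PATTERN.finditer(seq)]


def is_disruptive(motif, pos, alt):
    """Return True if substituting `alt` at index `pos` leaves no G4 match in `motif`.

    This is a hypothetical, simplified criterion: a site counts as disruptive
    when the mutated motif no longer matches the assumed pattern.
    """
    mutated = motif[:pos] + alt + motif[pos + 1:]
    return G4_PATTERN.search(mutated) is None
```

For instance, `find_g4_motifs("TTAGGGTTAGGGTTAGGGTTAGGG")` would report a single motif spanning the four GGG runs of the telomeric repeat. Under this criterion, changing the middle G of the first run is disruptive, whereas changing a loop base is not. The sketch only scans the strand it is given; the complementary strand would need a separate pass on the reverse complement.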


Url:
DOI: 10.1093/nar/gkp590
PubMed: 19617376
PubMed Central: 2761265

Links to Exploration step

PMC:2761265

The document in XML format

<record>
<TEI>
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en">The disruptive positions in human G-quadruplex motifs are less polymorphic and more conserved than their neutral counterparts</title>
<author>
<name sortKey="Nakken, Sigve" sort="Nakken, Sigve" uniqKey="Nakken S" first="Sigve" last="Nakken">Sigve Nakken</name>
<affiliation>
<nlm:aff id="AFF1">Centre for Molecular Biology and Neuroscience, Institute of Medical Microbiology, Oslo University Hospital, Rikshospitalet, NO-0027, Oslo,</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Rognes, Torbj Rn" sort="Rognes, Torbj Rn" uniqKey="Rognes T" first="Torbj Rn" last="Rognes">Torbjørn Rognes</name>
<affiliation>
<nlm:aff id="AFF1">Centre for Molecular Biology and Neuroscience, Institute of Medical Microbiology, Oslo University Hospital, Rikshospitalet, NO-0027, Oslo,</nlm:aff>
</affiliation>
<affiliation>
<nlm:aff id="AFF1">Department of Informatics, University of Oslo, PO Box 1080 Blindern, NO-0316, Oslo,</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Hovig, Eivind" sort="Hovig, Eivind" uniqKey="Hovig E" first="Eivind" last="Hovig">Eivind Hovig</name>
<affiliation>
<nlm:aff id="AFF1">Department of Informatics, University of Oslo, PO Box 1080 Blindern, NO-0316, Oslo,</nlm:aff>
</affiliation>
<affiliation>
<nlm:aff wicri:cut=" and" id="AFF1">Department of Tumor Biology, Institute for Cancer Research</nlm:aff>
</affiliation>
<affiliation>
<nlm:aff id="AFF1">Department of Medical Informatics, Oslo University Hospital, Norwegian Radium Hospital, Montebello, NO-0310, Oslo, Norway</nlm:aff>
</affiliation>
</author>
</titleStmt>
<publicationStmt>
<idno type="wicri:source">PMC</idno>
<idno type="pmid">19617376</idno>
<idno type="pmc">2761265</idno>
<idno type="url">http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2761265</idno>
<idno type="RBID">PMC:2761265</idno>
<idno type="doi">10.1093/nar/gkp590</idno>
<date when="2009">2009</date>
<idno type="wicri:Area/Pmc/Corpus">000F44</idno>
<idno type="wicri:explorRef" wicri:stream="Pmc" wicri:step="Corpus" wicri:corpus="PMC">000F44</idno>
</publicationStmt>
<sourceDesc>
<biblStruct>
<analytic>
<title xml:lang="en" level="a" type="main">The disruptive positions in human G-quadruplex motifs are less polymorphic and more conserved than their neutral counterparts</title>
<author>
<name sortKey="Nakken, Sigve" sort="Nakken, Sigve" uniqKey="Nakken S" first="Sigve" last="Nakken">Sigve Nakken</name>
<affiliation>
<nlm:aff id="AFF1">Centre for Molecular Biology and Neuroscience, Institute of Medical Microbiology, Oslo University Hospital, Rikshospitalet, NO-0027, Oslo,</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Rognes, Torbj Rn" sort="Rognes, Torbj Rn" uniqKey="Rognes T" first="Torbj Rn" last="Rognes">Torbjørn Rognes</name>
<affiliation>
<nlm:aff id="AFF1">Centre for Molecular Biology and Neuroscience, Institute of Medical Microbiology, Oslo University Hospital, Rikshospitalet, NO-0027, Oslo,</nlm:aff>
</affiliation>
<affiliation>
<nlm:aff id="AFF1">Department of Informatics, University of Oslo, PO Box 1080 Blindern, NO-0316, Oslo,</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Hovig, Eivind" sort="Hovig, Eivind" uniqKey="Hovig E" first="Eivind" last="Hovig">Eivind Hovig</name>
<affiliation>
<nlm:aff id="AFF1">Department of Informatics, University of Oslo, PO Box 1080 Blindern, NO-0316, Oslo,</nlm:aff>
</affiliation>
<affiliation>
<nlm:aff wicri:cut=" and" id="AFF1">Department of Tumor Biology, Institute for Cancer Research</nlm:aff>
</affiliation>
<affiliation>
<nlm:aff id="AFF1">Department of Medical Informatics, Oslo University Hospital, Norwegian Radium Hospital, Montebello, NO-0310, Oslo, Norway</nlm:aff>
</affiliation>
</author>
</analytic>
<series>
<title level="j">Nucleic Acids Research</title>
<idno type="ISSN">0305-1048</idno>
<idno type="eISSN">1362-4962</idno>
<imprint>
<date when="2009">2009</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
</fileDesc>
<profileDesc>
<textClass></textClass>
</profileDesc>
</teiHeader>
<front>
<div type="abstract" xml:lang="en">
<p>Specific guanine-rich sequence motifs in the human genome have considerable potential to form four-stranded structures known as G-quadruplexes or G4 DNA. The enrichment of these motifs in key chromosomal regions has suggested a functional role for the G-quadruplex structure in genomic regulation. In this work, we have examined the spectrum of nucleotide substitutions in G4 motifs, and related this spectrum to G4 prevalence. Data collected from the large repository of human SNPs indicates that the core feature of G-quadruplex motifs, 5′-GGG-3′, exhibits specific mutational patterns that preserve the potential for G4 formation. In particular, we find a genome-wide pattern in which sites that disrupt the guanine triplets are more conserved and less polymorphic than their neutral counterparts. This also holds when considering non-CpG sites only. However, the low level of polymorphisms in guanine tracts is not only confined to G4 motifs. A complete mapping of DNA three-mers at guanine polymorphisms indicated that short guanine tracts are the most under-represented sequence context at polymorphic sites. Furthermore, we provide evidence for a strand bias upstream of human genes. Here, a significantly lower rate of G4-disruptive SNPs on the non-template strand supports a higher relative influence of G4 formation on this strand during transcription.</p>
</div>
</front>
<back>
<div1 type="bibliography">
<listBibl>
<biblStruct>
<analytic>
<author>
<name sortKey="Sen, D" uniqKey="Sen D">D Sen</name>
</author>
<author>
<name sortKey="Gilbert, W" uniqKey="Gilbert W">W Gilbert</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Gellert, M" uniqKey="Gellert M">M Gellert</name>
</author>
<author>
<name sortKey="Lipsett, Mn" uniqKey="Lipsett M">MN Lipsett</name>
</author>
<author>
<name sortKey="Davies, Dr" uniqKey="Davies D">DR Davies</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Hazel, P" uniqKey="Hazel P">P Hazel</name>
</author>
<author>
<name sortKey="Huppert, J" uniqKey="Huppert J">J Huppert</name>
</author>
<author>
<name sortKey="Balasubramanian, S" uniqKey="Balasubramanian S">S Balasubramanian</name>
</author>
<author>
<name sortKey="Neidle, S" uniqKey="Neidle S">S Neidle</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Risitano, A" uniqKey="Risitano A">A Risitano</name>
</author>
<author>
<name sortKey="Fox, Kr" uniqKey="Fox K">KR Fox</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Burge, S" uniqKey="Burge S">S Burge</name>
</author>
<author>
<name sortKey="Hazel, P" uniqKey="Hazel P">P Hazel</name>
</author>
<author>
<name sortKey="Todd, Ak" uniqKey="Todd A">AK Todd</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Rachwal, Pa" uniqKey="Rachwal P">PA Rachwal</name>
</author>
<author>
<name sortKey="Findlow, Is" uniqKey="Findlow I">IS Findlow</name>
</author>
<author>
<name sortKey="Werner, Jm" uniqKey="Werner J">JM Werner</name>
</author>
<author>
<name sortKey="Brown, T" uniqKey="Brown T">T Brown</name>
</author>
<author>
<name sortKey="Fox, Kr" uniqKey="Fox K">KR Fox</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Sundquist, Wi" uniqKey="Sundquist W">WI Sundquist</name>
</author>
<author>
<name sortKey="Klug, A" uniqKey="Klug A">A Klug</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Williamson, Jr" uniqKey="Williamson J">JR Williamson</name>
</author>
<author>
<name sortKey="Raghuraman, Mk" uniqKey="Raghuraman M">MK Raghuraman</name>
</author>
<author>
<name sortKey="Cech, Tr" uniqKey="Cech T">TR Cech</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Schaffitzel, C" uniqKey="Schaffitzel C">C Schaffitzel</name>
</author>
<author>
<name sortKey="Berger, I" uniqKey="Berger I">I Berger</name>
</author>
<author>
<name sortKey="Postberg, J" uniqKey="Postberg J">J Postberg</name>
</author>
<author>
<name sortKey="Hanes, J" uniqKey="Hanes J">J Hanes</name>
</author>
<author>
<name sortKey="Lipps, Hj" uniqKey="Lipps H">HJ Lipps</name>
</author>
<author>
<name sortKey="Pluckthun, A" uniqKey="Pluckthun A">A Pluckthun</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Duquette, Ml" uniqKey="Duquette M">ML Duquette</name>
</author>
<author>
<name sortKey="Handa, P" uniqKey="Handa P">P Handa</name>
</author>
<author>
<name sortKey="Vincent, Ja" uniqKey="Vincent J">JA Vincent</name>
</author>
<author>
<name sortKey="Taylor, Af" uniqKey="Taylor A">AF Taylor</name>
</author>
<author>
<name sortKey="Maizels, N" uniqKey="Maizels N">N Maizels</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Paeschke, K" uniqKey="Paeschke K">K Paeschke</name>
</author>
<author>
<name sortKey="Simonsson, T" uniqKey="Simonsson T">T Simonsson</name>
</author>
<author>
<name sortKey="Postberg, J" uniqKey="Postberg J">J Postberg</name>
</author>
<author>
<name sortKey="Rhodes, D" uniqKey="Rhodes D">D Rhodes</name>
</author>
<author>
<name sortKey="Lipps, Hj" uniqKey="Lipps H">HJ Lipps</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Bachrati, Cz" uniqKey="Bachrati C">CZ Bachrati</name>
</author>
<author>
<name sortKey="Hickson, Id" uniqKey="Hickson I">ID Hickson</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Sun, H" uniqKey="Sun H">H Sun</name>
</author>
<author>
<name sortKey="Karow, Jk" uniqKey="Karow J">JK Karow</name>
</author>
<author>
<name sortKey="Hickson, Id" uniqKey="Hickson I">ID Hickson</name>
</author>
<author>
<name sortKey="Maizels, N" uniqKey="Maizels N">N Maizels</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Wu, Y" uniqKey="Wu Y">Y Wu</name>
</author>
<author>
<name sortKey="Shin Ya, K" uniqKey="Shin Ya K">K Shin-ya</name>
</author>
<author>
<name sortKey="Brosh, Rm" uniqKey="Brosh R">RM Brosh</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Fry, M" uniqKey="Fry M">M Fry</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Huppert, Jl" uniqKey="Huppert J">JL Huppert</name>
</author>
<author>
<name sortKey="Balasubramanian, S" uniqKey="Balasubramanian S">S Balasubramanian</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Todd, Ak" uniqKey="Todd A">AK Todd</name>
</author>
<author>
<name sortKey="Johnston, M" uniqKey="Johnston M">M Johnston</name>
</author>
<author>
<name sortKey="Neidle, S" uniqKey="Neidle S">S Neidle</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Kikin, O" uniqKey="Kikin O">O Kikin</name>
</author>
<author>
<name sortKey="D Antonio, L" uniqKey="D Antonio L">L D'Antonio</name>
</author>
<author>
<name sortKey="Bagga, Ps" uniqKey="Bagga P">PS Bagga</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Hanakahi, La" uniqKey="Hanakahi L">LA Hanakahi</name>
</author>
<author>
<name sortKey="Sun, H" uniqKey="Sun H">H Sun</name>
</author>
<author>
<name sortKey="Maizels, N" uniqKey="Maizels N">N Maizels</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Dempsey, La" uniqKey="Dempsey L">LA Dempsey</name>
</author>
<author>
<name sortKey="Sun, H" uniqKey="Sun H">H Sun</name>
</author>
<author>
<name sortKey="Hanakahi, La" uniqKey="Hanakahi L">LA Hanakahi</name>
</author>
<author>
<name sortKey="Maizels, N" uniqKey="Maizels N">N Maizels</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Wang, Y" uniqKey="Wang Y">Y Wang</name>
</author>
<author>
<name sortKey="Patel, Dj" uniqKey="Patel D">DJ Patel</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Huppert, Jl" uniqKey="Huppert J">JL Huppert</name>
</author>
<author>
<name sortKey="Balasubramanian, S" uniqKey="Balasubramanian S">S Balasubramanian</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Du, Z" uniqKey="Du Z">Z Du</name>
</author>
<author>
<name sortKey="Zhao, Y" uniqKey="Zhao Y">Y Zhao</name>
</author>
<author>
<name sortKey="Li, N" uniqKey="Li N">N Li</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Siddiqui Jain, A" uniqKey="Siddiqui Jain A">A Siddiqui-Jain</name>
</author>
<author>
<name sortKey="Grand, Cl" uniqKey="Grand C">CL Grand</name>
</author>
<author>
<name sortKey="Bearss, Dj" uniqKey="Bearss D">DJ Bearss</name>
</author>
<author>
<name sortKey="Hurley, Lh" uniqKey="Hurley L">LH Hurley</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Simonsson, T" uniqKey="Simonsson T">T Simonsson</name>
</author>
<author>
<name sortKey="Pecinka, P" uniqKey="Pecinka P">P Pecinka</name>
</author>
<author>
<name sortKey="Kubista, M" uniqKey="Kubista M">M Kubista</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Fernando, H" uniqKey="Fernando H">H Fernando</name>
</author>
<author>
<name sortKey="Reszka, Ap" uniqKey="Reszka A">AP Reszka</name>
</author>
<author>
<name sortKey="Huppert, J" uniqKey="Huppert J">J Huppert</name>
</author>
<author>
<name sortKey="Ladame, S" uniqKey="Ladame S">S Ladame</name>
</author>
<author>
<name sortKey="Rankin, S" uniqKey="Rankin S">S Rankin</name>
</author>
<author>
<name sortKey="Venkitaraman, Ar" uniqKey="Venkitaraman A">AR Venkitaraman</name>
</author>
<author>
<name sortKey="Neidle, S" uniqKey="Neidle S">S Neidle</name>
</author>
<author>
<name sortKey="Balasubramanian, S" uniqKey="Balasubramanian S">S Balasubramanian</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Yafe, A" uniqKey="Yafe A">A Yafe</name>
</author>
<author>
<name sortKey="Etzioni, S" uniqKey="Etzioni S">S Etzioni</name>
</author>
<author>
<name sortKey="Weisman Shomer, P" uniqKey="Weisman Shomer P">P Weisman-Shomer</name>
</author>
<author>
<name sortKey="Fry, M" uniqKey="Fry M">M Fry</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Zhao, Y" uniqKey="Zhao Y">Y Zhao</name>
</author>
<author>
<name sortKey="Du, Z" uniqKey="Du Z">Z Du</name>
</author>
<author>
<name sortKey="Li, N" uniqKey="Li N">N Li</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Bugaut, A" uniqKey="Bugaut A">A Bugaut</name>
</author>
<author>
<name sortKey="Balasubramanian, S" uniqKey="Balasubramanian S">S Balasubramanian</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Kumar, N" uniqKey="Kumar N">N Kumar</name>
</author>
<author>
<name sortKey="Sahoo, B" uniqKey="Sahoo B">B Sahoo</name>
</author>
<author>
<name sortKey="Varun, Ka" uniqKey="Varun K">KA Varun</name>
</author>
<author>
<name sortKey="Maiti, S" uniqKey="Maiti S">S Maiti</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Lee, Jy" uniqKey="Lee J">JY Lee</name>
</author>
<author>
<name sortKey="Kim, Ds" uniqKey="Kim D">DS Kim</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Gros, J" uniqKey="Gros J">J Gros</name>
</author>
<author>
<name sortKey="Rosu, F" uniqKey="Rosu F">F Rosu</name>
</author>
<author>
<name sortKey="Amrane, S" uniqKey="Amrane S">S Amrane</name>
</author>
<author>
<name sortKey="De Cian, A" uniqKey="De Cian A">A De Cian</name>
</author>
<author>
<name sortKey="Gabelica, V" uniqKey="Gabelica V">V Gabelica</name>
</author>
<author>
<name sortKey="Lacroix, L" uniqKey="Lacroix L">L Lacroix</name>
</author>
<author>
<name sortKey="Mergny, Jl" uniqKey="Mergny J">JL Mergny</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Blake, Rd" uniqKey="Blake R">RD Blake</name>
</author>
<author>
<name sortKey="Hess, St" uniqKey="Hess S">ST Hess</name>
</author>
<author>
<name sortKey="Nicholson Tuell, J" uniqKey="Nicholson Tuell J">J Nicholson-Tuell</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Fryxell, Kj" uniqKey="Fryxell K">KJ Fryxell</name>
</author>
<author>
<name sortKey="Moon, Wj" uniqKey="Moon W">WJ Moon</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Hodgkinson, A" uniqKey="Hodgkinson A">A Hodgkinson</name>
</author>
<author>
<name sortKey="Ladoukakis, E" uniqKey="Ladoukakis E">E Ladoukakis</name>
</author>
<author>
<name sortKey="Eyre Walker, A" uniqKey="Eyre Walker A">A Eyre-Walker</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Krawczak, M" uniqKey="Krawczak M">M Krawczak</name>
</author>
<author>
<name sortKey="Ball, Ev" uniqKey="Ball E">EV Ball</name>
</author>
<author>
<name sortKey="Cooper, Dn" uniqKey="Cooper D">DN Cooper</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Sherry, St" uniqKey="Sherry S">ST Sherry</name>
</author>
<author>
<name sortKey="Ward, Mh" uniqKey="Ward M">MH Ward</name>
</author>
<author>
<name sortKey="Kholodov, M" uniqKey="Kholodov M">M Kholodov</name>
</author>
<author>
<name sortKey="Baker, J" uniqKey="Baker J">J Baker</name>
</author>
<author>
<name sortKey="Phan, L" uniqKey="Phan L">L Phan</name>
</author>
<author>
<name sortKey="Smigielski, Em" uniqKey="Smigielski E">EM Smigielski</name>
</author>
<author>
<name sortKey="Sirotkin, K" uniqKey="Sirotkin K">K Sirotkin</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Karolchik, D" uniqKey="Karolchik D">D Karolchik</name>
</author>
<author>
<name sortKey="Baertsch, R" uniqKey="Baertsch R">R Baertsch</name>
</author>
<author>
<name sortKey="Diekhans, M" uniqKey="Diekhans M">M Diekhans</name>
</author>
<author>
<name sortKey="Furey, Ts" uniqKey="Furey T">TS Furey</name>
</author>
<author>
<name sortKey="Hinrichs, A" uniqKey="Hinrichs A">A Hinrichs</name>
</author>
<author>
<name sortKey="Lu, Yt" uniqKey="Lu Y">YT Lu</name>
</author>
<author>
<name sortKey="Roskin, Km" uniqKey="Roskin K">KM Roskin</name>
</author>
<author>
<name sortKey="Schwartz, M" uniqKey="Schwartz M">M Schwartz</name>
</author>
<author>
<name sortKey="Sugnet, Cw" uniqKey="Sugnet C">CW Sugnet</name>
</author>
<author>
<name sortKey="Thomas, Dj" uniqKey="Thomas D">DJ Thomas</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Eddy, J" uniqKey="Eddy J">J Eddy</name>
</author>
<author>
<name sortKey="Maizels, N" uniqKey="Maizels N">N Maizels</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Eddy, J" uniqKey="Eddy J">J Eddy</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Tomso, Dj" uniqKey="Tomso D">DJ Tomso</name>
</author>
<author>
<name sortKey="Bell, Da" uniqKey="Bell D">DA Bell</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Eddy, J" uniqKey="Eddy J">J Eddy</name>
</author>
<author>
<name sortKey="Maizels, N" uniqKey="Maizels N">N Maizels</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Huppert, Jl" uniqKey="Huppert J">JL Huppert</name>
</author>
<author>
<name sortKey="Bugaut, A" uniqKey="Bugaut A">A Bugaut</name>
</author>
<author>
<name sortKey="Kumari, S" uniqKey="Kumari S">S Kumari</name>
</author>
<author>
<name sortKey="Balasubramanian, S" uniqKey="Balasubramanian S">S Balasubramanian</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Todd, Ak" uniqKey="Todd A">AK Todd</name>
</author>
<author>
<name sortKey="Neidle, S" uniqKey="Neidle S">S Neidle</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Verma, A" uniqKey="Verma A">A Verma</name>
</author>
<author>
<name sortKey="Halder, K" uniqKey="Halder K">K Halder</name>
</author>
<author>
<name sortKey="Halder, R" uniqKey="Halder R">R Halder</name>
</author>
<author>
<name sortKey="Yadav, Vk" uniqKey="Yadav V">VK Yadav</name>
</author>
<author>
<name sortKey="Rawal, P" uniqKey="Rawal P">P Rawal</name>
</author>
<author>
<name sortKey="Thakur, Rk" uniqKey="Thakur R">RK Thakur</name>
</author>
<author>
<name sortKey="Mohd, F" uniqKey="Mohd F">F Mohd</name>
</author>
<author>
<name sortKey="Sharma, A" uniqKey="Sharma A">A Sharma</name>
</author>
<author>
<name sortKey="Chowdhury, S" uniqKey="Chowdhury S">S Chowdhury</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Bird, Ap" uniqKey="Bird A">AP Bird</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Sakumi, K" uniqKey="Sakumi K">K Sakumi</name>
</author>
<author>
<name sortKey="Furuichi, M" uniqKey="Furuichi M">M Furuichi</name>
</author>
<author>
<name sortKey="Tsuzuki, T" uniqKey="Tsuzuki T">T Tsuzuki</name>
</author>
<author>
<name sortKey="Kakuma, T" uniqKey="Kakuma T">T Kakuma</name>
</author>
<author>
<name sortKey="Kawabata, S" uniqKey="Kawabata S">S Kawabata</name>
</author>
<author>
<name sortKey="Maki, H" uniqKey="Maki H">H Maki</name>
</author>
<author>
<name sortKey="Sekiguchi, M" uniqKey="Sekiguchi M">M Sekiguchi</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Bjoras, M" uniqKey="Bjoras M">M Bjoras</name>
</author>
<author>
<name sortKey="Luna, L" uniqKey="Luna L">L Luna</name>
</author>
<author>
<name sortKey="Johnsen, B" uniqKey="Johnsen B">B Johnsen</name>
</author>
<author>
<name sortKey="Hoff, E" uniqKey="Hoff E">E Hoff</name>
</author>
<author>
<name sortKey="Haug, T" uniqKey="Haug T">T Haug</name>
</author>
<author>
<name sortKey="Rognes, T" uniqKey="Rognes T">T Rognes</name>
</author>
<author>
<name sortKey="Seeberg, E" uniqKey="Seeberg E">E Seeberg</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Nash, Hm" uniqKey="Nash H">HM Nash</name>
</author>
<author>
<name sortKey="Bruner, Sd" uniqKey="Bruner S">SD Bruner</name>
</author>
<author>
<name sortKey="Scharer, Od" uniqKey="Scharer O">OD Scharer</name>
</author>
<author>
<name sortKey="Kawate, T" uniqKey="Kawate T">T Kawate</name>
</author>
<author>
<name sortKey="Addona, Ta" uniqKey="Addona T">TA Addona</name>
</author>
<author>
<name sortKey="Spooner, E" uniqKey="Spooner E">E Spooner</name>
</author>
<author>
<name sortKey="Lane, Ws" uniqKey="Lane W">WS Lane</name>
</author>
<author>
<name sortKey="Verdine, Gl" uniqKey="Verdine G">GL Verdine</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Slupska, Mm" uniqKey="Slupska M">MM Slupska</name>
</author>
<author>
<name sortKey="Baikalov, C" uniqKey="Baikalov C">C Baikalov</name>
</author>
<author>
<name sortKey="Luther, Wm" uniqKey="Luther W">WM Luther</name>
</author>
<author>
<name sortKey="Chiang, Jh" uniqKey="Chiang J">JH Chiang</name>
</author>
<author>
<name sortKey="Wei, Yf" uniqKey="Wei Y">YF Wei</name>
</author>
<author>
<name sortKey="Miller, Jh" uniqKey="Miller J">JH Miller</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Mcculloch, Sd" uniqKey="Mcculloch S">SD McCulloch</name>
</author>
<author>
<name sortKey="Kokoska, Rj" uniqKey="Kokoska R">RJ Kokoska</name>
</author>
<author>
<name sortKey="Garg, P" uniqKey="Garg P">P Garg</name>
</author>
<author>
<name sortKey="Burgers, Pm" uniqKey="Burgers P">PM Burgers</name>
</author>
<author>
<name sortKey="Kunkel, Ta" uniqKey="Kunkel T">TA Kunkel</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Kresnak, Mt" uniqKey="Kresnak M">MT Kresnak</name>
</author>
<author>
<name sortKey="Davidson, Rl" uniqKey="Davidson R">RL Davidson</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Platzer, M" uniqKey="Platzer M">M Platzer</name>
</author>
<author>
<name sortKey="Hiller, M" uniqKey="Hiller M">M Hiller</name>
</author>
<author>
<name sortKey="Szafranski, K" uniqKey="Szafranski K">K Szafranski</name>
</author>
<author>
<name sortKey="Jahn, N" uniqKey="Jahn N">N Jahn</name>
</author>
<author>
<name sortKey="Hampe, J" uniqKey="Hampe J">J Hampe</name>
</author>
<author>
<name sortKey="Schreiber, S" uniqKey="Schreiber S">S Schreiber</name>
</author>
<author>
<name sortKey="Backofen, R" uniqKey="Backofen R">R Backofen</name>
</author>
<author>
<name sortKey="Huse, K" uniqKey="Huse K">K Huse</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Siva, N" uniqKey="Siva N">N Siva</name>
</author>
</analytic>
</biblStruct>
</listBibl>
</div1>
</back>
</TEI>
<pmc article-type="research-article">
<pmc-dir>properties open_access</pmc-dir>
<front>
<journal-meta>
<journal-id journal-id-type="nlm-ta">Nucleic Acids Res</journal-id>
<journal-id journal-id-type="iso-abbrev">Nucleic Acids Res</journal-id>
<journal-id journal-id-type="publisher-id">nar</journal-id>
<journal-id journal-id-type="hwp">nar</journal-id>
<journal-title-group>
<journal-title>Nucleic Acids Research</journal-title>
</journal-title-group>
<issn pub-type="ppub">0305-1048</issn>
<issn pub-type="epub">1362-4962</issn>
<publisher>
<publisher-name>Oxford University Press</publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="pmid">19617376</article-id>
<article-id pub-id-type="pmc">2761265</article-id>
<article-id pub-id-type="doi">10.1093/nar/gkp590</article-id>
<article-id pub-id-type="publisher-id">gkp590</article-id>
<article-categories>
<subj-group subj-group-type="heading">
<subject>Genomics</subject>
</subj-group>
</article-categories>
<title-group>
<article-title>The disruptive positions in human G-quadruplex motifs are less polymorphic and more conserved than their neutral counterparts</article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author">
<name>
<surname>Nakken</surname>
<given-names>Sigve</given-names>
</name>
<xref ref-type="aff" rid="AFF1">
<sup>1</sup>
</xref>
<xref ref-type="corresp" rid="COR1">*</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Rognes</surname>
<given-names>Torbjørn</given-names>
</name>
<xref ref-type="aff" rid="AFF1">
<sup>1</sup>
</xref>
<xref ref-type="aff" rid="AFF1">
<sup>2</sup>
</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Hovig</surname>
<given-names>Eivind</given-names>
</name>
<xref ref-type="aff" rid="AFF1">
<sup>2</sup>
</xref>
<xref ref-type="aff" rid="AFF1">
<sup>3</sup>
</xref>
<xref ref-type="aff" rid="AFF1">
<sup>4</sup>
</xref>
</contrib>
</contrib-group>
<aff id="AFF1">
<sup>1</sup>
Centre for Molecular Biology and Neuroscience, Institute of Medical Microbiology, Oslo University Hospital, Rikshospitalet, NO-0027, Oslo,
<sup>2</sup>
Department of Informatics, University of Oslo, PO Box 1080 Blindern, NO-0316, Oslo,
<sup>3</sup>
Department of Tumor Biology, Institute for Cancer Research and
<sup>4</sup>
Department of Medical Informatics, Oslo University Hospital, Norwegian Radium Hospital, Montebello, NO-0310, Oslo, Norway</aff>
<author-notes>
<corresp id="COR1">*To whom correspondence should be addressed. Tel:
<phone>+47 22 84 47 86</phone>
; Fax:
<fax>+47 22 84 47 82</fax>
; Email:
<email>sigve.nakken@medisin.uio.no</email>
</corresp>
</author-notes>
<pub-date pub-type="ppub">
<month>9</month>
<year>2009</year>
</pub-date>
<pub-date pub-type="epub">
<day>17</day>
<month>7</month>
<year>2009</year>
</pub-date>
<pub-date pub-type="pmc-release">
<day>17</day>
<month>7</month>
<year>2009</year>
</pub-date>
<pmc-comment> PMC Release delay is 0 months and 0 days and was based on the . </pmc-comment>
<volume>37</volume>
<issue>17</issue>
<fpage>5749</fpage>
<lpage>5756</lpage>
<history>
<date date-type="received">
<day>18</day>
<month>5</month>
<year>2009</year>
</date>
<date date-type="rev-recd">
<day>25</day>
<month>6</month>
<year>2009</year>
</date>
<date date-type="accepted">
<day>26</day>
<month>6</month>
<year>2009</year>
</date>
</history>
<permissions>
<copyright-statement>© 2009 The Author(s)</copyright-statement>
<copyright-year>2009</copyright-year>
<license license-type="creative-commons" xlink:href="http://creativecommons.org/licenses/by-nc/2.0/uk/">
<license-p>
<pmc-comment>CREATIVE COMMONS</pmc-comment>
This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (
<ext-link ext-link-type="uri" xlink:href="http://creativecommons.org/licenses/by-nc/2.0/uk/">http://creativecommons.org/licenses/by-nc/2.0/uk/</ext-link>
) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.</license-p>
</license>
</permissions>
<abstract>
<p>Specific guanine-rich sequence motifs in the human genome have considerable potential to form four-stranded structures known as G-quadruplexes or G4 DNA. The enrichment of these motifs in key chromosomal regions has suggested a functional role for the G-quadruplex structure in genomic regulation. In this work, we have examined the spectrum of nucleotide substitutions in G4 motifs, and related this spectrum to G4 prevalence. Data collected from the large repository of human SNPs indicates that the core feature of G-quadruplex motifs, 5′-GGG-3′, exhibits specific mutational patterns that preserve the potential for G4 formation. In particular, we find a genome-wide pattern in which sites that disrupt the guanine triplets are more conserved and less polymorphic than their neutral counterparts. This also holds when considering non-CpG sites only. However, the low level of polymorphisms in guanine tracts is not only confined to G4 motifs. A complete mapping of DNA three-mers at guanine polymorphisms indicated that short guanine tracts are the most under-represented sequence context at polymorphic sites. Furthermore, we provide evidence for a strand bias upstream of human genes. Here, a significantly lower rate of G4-disruptive SNPs on the non-template strand supports a higher relative influence of G4 formation on this strand during transcription.</p>
</abstract>
</article-meta>
</front>
<body>
<sec>
<title>INTRODUCTION</title>
<p>Human genomic DNA usually exists in the double-stranded conformation, but during denaturation, single strands containing tandemly repeated sequences can assemble into higher order DNA structures. In repetitive and guanine-rich sequences of the genome, single-stranded DNA can adopt four-stranded structures known as G-quadruplexes or G4 DNA (
<xref ref-type="bibr" rid="B1">1</xref>
). The G-quadruplex comprises a stack of G-tetrads, which are planar arrays of four guanines connected by Hoogsteen hydrogen bonds (
<xref ref-type="bibr" rid="B2">2</xref>
). G-quadruplexes are rapidly stabilized in the presence of monovalent cations, and their folding topology is influenced by the length and composition of short-sequence loops that link the stacked G-tetrads together (
<xref ref-type="bibr" rid="B3 B4 B5 B6">3–6</xref>
). The first
<italic>in vitro</italic>
observations of G-quadruplex formation came from the single-stranded overhang at human telomeres (
<xref ref-type="bibr" rid="B7">7</xref>
,
<xref ref-type="bibr" rid="B8">8</xref>
), a sequence characterized by tandem repeats of TTAGGG. This finding was later followed by studies that demonstrated the existence of G-quadruplexes
<italic>in vivo</italic>
(
<xref ref-type="bibr" rid="B9 B10 B11">9–11</xref>
). The hypothesized role of G-quadruplex formation in living cells has received further support from the recognition of conserved factors that selectively bind and unwind G4 (
<xref ref-type="bibr" rid="B12 B13 B14 B15">12–15</xref>
). However, the relative impact of G-quadruplex formation in the context of gene regulation and genome stability is still unclear.</p>
<p>Computational algorithms have been used to scan the human genome for the G4 consensus motif, which is a sequence containing at least four runs of at least three guanines (G-tracts) (
<xref ref-type="bibr" rid="B16 B17 B18">16–18</xref>
). These scans have identified enrichment in a number of chromosomal regions of biological importance, including the ribosomal DNA (
<xref ref-type="bibr" rid="B19">19</xref>
), the immunoglobulin heavy chain switch regions (
<xref ref-type="bibr" rid="B20">20</xref>
), telomeres (
<xref ref-type="bibr" rid="B21">21</xref>
) and transcriptional regulatory regions (
<xref ref-type="bibr" rid="B22">22</xref>
,
<xref ref-type="bibr" rid="B23">23</xref>
). With respect to gene transcription, different modes of G4-mediated regulation have been proposed. In one scenario, the formation of G4 is thought to increase the rate of transcription by preventing renaturation of double-stranded DNA (
<xref ref-type="bibr" rid="B23">23</xref>
). Others, however, have shown experimentally that small compounds can stabilize a promoter G-quadruplex and thereby decrease the expression rate (
<xref ref-type="bibr" rid="B24">24</xref>
). The idea that G-quadruplexes may act as regulators of gene expression has been strengthened by multiple observations of G-quadruplex formation in human promoters, including the proto-oncogenes c-MYC (
<xref ref-type="bibr" rid="B24">24</xref>
,
<xref ref-type="bibr" rid="B25">25</xref>
) and c-KIT (
<xref ref-type="bibr" rid="B26">26</xref>
), as well as muscle-specific genes (
<xref ref-type="bibr" rid="B27">27</xref>
). Moreover, G4 motifs appear to be enriched in the promoters of other warm-blooded animals (
<xref ref-type="bibr" rid="B28">28</xref>
). Within motifs, there is a considerable preference for single-nucleotide loops between the consecutive guanine runs, and this is also characteristic of the experimentally derived structures that are most stable (
<xref ref-type="bibr" rid="B22">22</xref>
,
<xref ref-type="bibr" rid="B29">29</xref>
,
<xref ref-type="bibr" rid="B30">30</xref>
). The latter studies showed how a correlation between common sequence features of G4 motifs and observations
<italic>in vitro</italic>
might aid the interpretation of G4 prevalence. An important set of data that remains to be explored in this respect is the spectrum of common nucleotide polymorphisms in G4 motifs, and how this spectrum relates to findings from recent kinetic and spectroscopic studies of mutated G4 (
<xref ref-type="bibr" rid="B31">31</xref>
,
<xref ref-type="bibr" rid="B32">32</xref>
). The studies of single-base mutated G-quadruplexes have demonstrated a strong relationship between quadruplex stability and the mutation position, with the central guanines of G-tracts being most critical for stable quadruplex folds. Thus, if the G-quadruplexes exhibit biological activity in genomic regions, one would expect to see a relatively lower rate of polymorphic bases at critical sites of the G4 motif, as a consequence of negative selection. Taking into account the non-randomness of point mutagenesis, in which both base composition and DNA sequence contexts influence substitution rates (
<xref ref-type="bibr" rid="B33 B34 B35 B36">33–36</xref>
), it is therefore of importance to see how the different sites in G4 motifs relate to known genetic variation in the form of human single nucleotide polymorphisms (SNPs). The collection of DNA polymorphisms in G4 motifs also represents an additional dimension in the identification of genomic regions undergoing G4 selection. In particular, the relative rate of G4-disruptive SNPs could indicate the extent of selection for the G-quadruplex structure in different genomic regions.</p>
<p>Here, we report a genome-wide analysis of SNPs in human G-quadruplex motifs, with an emphasis on their occurrence in gene and regulatory sequences. We have used a large collection of validated SNPs from dbSNP as our data source of nucleotide substitutions (
<xref ref-type="bibr" rid="B37">37</xref>
). Overall, the results demonstrate a non-random pattern of nucleotide polymorphism in G-quadruplex motifs. In particular, we show that the internal sites of guanine runs are well protected from polymorphisms in the human genome, indicating a relationship between sequence-dependent mutagenesis of guanine and the prevalence of guanine tracts.</p>
</sec>
<sec sec-type="materials|methods">
<title>MATERIALS AND METHODS</title>
<sec>
<title>SNP data</title>
<p>dbSNP (build 129, released on 18 April 2008) was downloaded in XML format from ftp://ftp.ncbi.nlm.nih.gov/snp/. We included SNPs that (i) were biallelic, (ii) had been uniquely mapped to the human genome with an alignment accuracy of at least 99%, (iii) had been validated by at least one of NCBI’s validation criteria (that is, ‘by-frequency’, ‘byCluster’, ‘by2Hit2Allele’ or ‘byOtherPop’) and (iv) if genotyped by the HapMap project, had a minor allele frequency of at least 1% in at least one of the sampled populations. A total of 5 717 575 SNPs satisfied the criteria above.</p>
</sec>
<sec>
<title>Sequence and annotation data</title>
<p>We used the
<italic>quadparser</italic>
algorithm to retrieve all sequences in the human genome (NCBI build 36.3) capable of forming a G-quadruplex, identified by the sequence motif G
<sub>3+</sub>
N
<sub>1–7</sub>
G
<sub>3+</sub>
N
<sub>1–7</sub>
G
<sub>3+</sub>
N
<sub>1–7</sub>
G
<sub>3+</sub>
, where G is guanine and N is any nucleotide (
<xref ref-type="bibr" rid="B16">16</xref>
). This simple consensus was inferred after several biophysical experiments had investigated the sequence basis for stable quadruplex folds (
<xref ref-type="bibr" rid="B3">3</xref>
,
<xref ref-type="bibr" rid="B4">4</xref>
), and represents the most common approach to map the grand total of potential G-quadruplex forming sequences. From the
<italic>quadparser</italic>
output, we extracted each putative G-quadruplex motif, regardless of any potential overlap with a neighboring motif [this corresponds to the ‘un-restricted’ set of G4 motifs, as defined by Todd
<italic>et al.</italic>
(
<xref ref-type="bibr" rid="B17">17</xref>
)]. Motifs with guanine tracts of length greater than six were excluded. The choice of overlapping motifs allowed us to evaluate the context and effect of a SNP for each individual putative G-quadruplex-forming structure. We only considered SNPs that mapped to G4 motifs present in the reference genome; SNPs that potentially introduced new G4 motifs were not analyzed.</p>
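<p>The motif search described above can be illustrated with a minimal Python sketch. It is not the quadparser implementation: the regular expression is an assumed rendering of the stated consensus, the function name find_g4_motifs is hypothetical, only the G-rich strand is scanned, and the exclusion of motifs with guanine tracts longer than six is omitted. The lookahead reports at most one motif per start position, which allows overlapping hits but does not enumerate every alternative motif of the un-restricted set.</p>
<preformat>
```python
import re

# Assumed rendering of the consensus G3+ N1-7 G3+ N1-7 G3+ N1-7 G3+.
# Wrapping the pattern in a lookahead lets motifs that start at
# different positions overlap each other.
G4_LOOKAHEAD = re.compile(r"(?=((?:G{3,}[ACGT]{1,7}){3}G{3,}))")


def find_g4_motifs(sequence):
    """Return (start, end, motif) for each start position that matches.

    Coordinates are 0-based and end-exclusive. Only the given strand is
    scanned; C-rich motifs on the opposite strand would require scanning
    the reverse complement as well.
    """
    sequence = sequence.upper()
    hits = []
    for match in G4_LOOKAHEAD.finditer(sequence):
        hits.append((match.start(1), match.end(1), match.group(1)))
    return hits


if __name__ == "__main__":
    for hit in find_g4_motifs("TTGGGAGGGTGGGCGGGTT"):
        print(hit)
```
</preformat>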
<p>The genomic coordinates of 24 243 protein-coding RefSeq genes were downloaded from
<ext-link ext-link-type="ftp" xlink:href="ftp://ftp.ncbi.nih.gov/refseq">ftp://ftp.ncbi.nih.gov/refseq</ext-link>
(NCBI build 36.3) and used for the annotation of G4 motifs. CpG islands and 28-way vertebrate MultiZ alignments were obtained from the UCSC genome browser (
<xref ref-type="bibr" rid="B38">38</xref>
), available at
<ext-link ext-link-type="uri" xlink:href="http://genome.ucsc.edu">http://genome.ucsc.edu</ext-link>
. Motifs located in four defined genomic regions were subsequently analyzed: 5′ gene regions, 3′ gene regions, the first gene intron and intergenic regions. In order to target regulatory G4 sequences involved in gene transcription, we set the limits of the 5′ region of genes to 2-kb upstream of the transcription start site (TSS) and 1-kb downstream of the TSS. Only non-coding sequences (i.e. UTR) were targeted downstream of the TSS (
<xref ref-type="fig" rid="F1">Figure 1</xref>
a), since coding sequences exhibit a significant depletion of G4 (
<xref ref-type="bibr" rid="B39">39</xref>
). We are aware that downstream of the TSS, the 5′ region will encompass G4 motifs that could be involved in both transcription and RNA processing. Ideally, one should thus evaluate the upstream and downstream regions of the TSS separately. However, having limited our analysis to the transcriptional aspect of G4, we considered it appropriate to combine the contributions by pre-transcription regulatory G4 (upstream of the TSS) and transcription regulatory G4 (downstream of the TSS). The 3′ end of genes was defined in the same manner as the 5′ end, encompassing 1-kb within 3′ UTR and 2-kb downstream of the transcription stop site. We included an analysis of G4 in the first intron (restricted to the first thousand bp), since this genomic region has shown a particular enrichment of G4 (
<xref ref-type="bibr" rid="B40">40</xref>
). Last, for control purposes, we included G4 motifs located in intergenic regions of the human genome.
<fig id="F1" position="float">
<label>Figure 1.</label>
<caption>
<p>(
<bold>a</bold>
) A simplified illustration of a human gene, showing how the gene 5′ and gene 3′ regions were defined. (
<bold>b</bold>
) An example of a G4 sequence motif. The G4-disruptive sites are in grey colour, while the G4-neutral sites are in black. The underlined guanines are guanines within tracts that, when mutated, will not disrupt the G4 consensus.</p>
</caption>
<graphic xlink:href="gkp590f1"></graphic>
</fig>
</p>
<p>Genomic G4 motifs that were found within high-copy repeats (as identified by RepeatMasker and Tandem Repeats Finder) were excluded from the analysis. There were several reasons for this decision. First of all, in the genomic regions of interest (regulatory sequences), the frequency of G4 within unique sequence is nearly twice that of G4 within repeats. Second, reliable (i.e. validated) SNPs are under-represented in repeats; whereas 51.1% of all reference SNPs in dbSNP are mapped to repeats, only 45.1% of the validated SNPs are located within repeats. Third, in the vertebrate MultiZ alignments, we noted that the availability of reliable alignments for G4 in repeats was poor compared to unique G4.</p>
</sec>
<sec>
<title>Non-G4 control sequences</title>
<p>In a search for characteristic patterns of substitutions in the G-rich G4 motifs, we established a set of non-G4 control sequences. The selected non-G4 sequences had the same high GC content as the G4 sequences, but did not match the G4 consensus. This approach enabled us to target differences between G4 and non-G4 unrelated to CpG dinucleotides, since the rate of the most common substitution at CpG dinucleotides (i.e. transition caused by spontaneous hydrolytic deamination of 5-methylcytosine) is dependent on GC content (
<xref ref-type="bibr" rid="B34">34</xref>
,
<xref ref-type="bibr" rid="B41">41</xref>
).</p>
<p>We next provide a short description of the stepwise procedure. For each genomic region analyzed, we created a large library of non-G4 sequence fragments (length 20–28 bp; average length of G4 motifs) that originated either outside or within CpG islands. All fragments were subsequently binned according to GC content. We randomly picked sequence fragments within each bin, the number of fragments being dictated by the probability distribution of G4 motifs with respect to GC content and CpG islands. The SNP density in the total collection of non-G4 fragments was then calculated. This procedure was repeated fifty times for each genomic region and averaged.</p>
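<p>As an illustration of the binning and sampling step, the following Python sketch draws one control fragment per G4 motif from the bin with matching GC content. It is a simplified sketch under stated assumptions, not the procedure actually used: the function names, the 5% bin width and the fixed random seed are hypothetical, fragments are drawn with replacement, and the stratification by CpG island membership and the fifty-fold repetition are left out.</p>
<preformat>
```python
import random
from collections import defaultdict


def gc_bin(sequence, bin_percent=5):
    """Assign a sequence to a GC-content bin using integer arithmetic."""
    sequence = sequence.upper()
    gc_count = sequence.count("G") + sequence.count("C")
    gc_percent = 100 * gc_count // len(sequence)
    return gc_percent // bin_percent


def sample_gc_matched(g4_motifs, candidates, bin_percent=5, seed=0):
    """Pick one non-G4 fragment per G4 motif from the same GC bin.

    Motifs whose bin holds no candidate fragment are skipped, so the
    result can be shorter than the list of motifs.
    """
    rng = random.Random(seed)
    bins = defaultdict(list)
    for fragment in candidates:
        bins[gc_bin(fragment, bin_percent)].append(fragment)
    picked = []
    for motif in g4_motifs:
        pool = bins.get(gc_bin(motif, bin_percent), [])
        if pool:
            picked.append(rng.choice(pool))
    return picked
```
</preformat>
<p>Repeating the draw with different seeds and averaging the resulting SNP densities would mirror the repeated sampling described above.</p>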
</sec>
</sec>
<sec>
<title>RESULTS AND DISCUSSION</title>
<p>Previous studies have demonstrated the importance of computational analyses for the understanding of G4 enrichment in vertebrate genomes (
<xref ref-type="bibr" rid="B16">16</xref>
,
<xref ref-type="bibr" rid="B17">17</xref>
,
<xref ref-type="bibr" rid="B22">22</xref>
,
<xref ref-type="bibr" rid="B23">23</xref>
,
<xref ref-type="bibr" rid="B28">28</xref>
,
<xref ref-type="bibr" rid="B40">40</xref>
,
<xref ref-type="bibr" rid="B42 B43 B44 B45">42–45</xref>
). In this work, we investigate G4 prevalence from a single nucleotide substitution perspective.</p>
<p>We calculated the density of SNPs in G4 motifs by querying the dbSNP database at the locations of 282 501 motifs in non-repetitive regions of the human genome. Due to the overlapping nature of many G4 motifs (and also some overlapping gene annotations), a number of SNPs were counted more than once in the overall count of SNPs. We checked that this approach did not influence our findings by performing an alternative analysis allowing only one count per SNP in non-overlapping motifs (data not shown). The strandedness of G4 motifs was ignored at this point, and we thus combined the total G4 formation potential involved in either DNA replication or gene transcription.</p>
<p>A total of 10 794 validated SNPs mapped to G4 motifs in the human genome, with an overall density of 1.97 SNPs/kb. With an estimated density of 2.00 SNPs/kb in the genomic background, it was apparent that the level of polymorphism in G4 motifs reflected the genome average. This finding seemed intuitively somewhat unexpected, considering the 2-fold enrichment of hypermutable CpG dinucleotides in G4 compared to the genomic background (
<xref ref-type="table" rid="T1">Table 1</xref>
). However, there are two important characteristics of G4 motifs that impose a relatively lower rate of SNPs at CpGs in these sequences. The first feature is the high GC content of G4, since 5-methylcytosine deamination rates are inversely correlated with local GC content (
<xref ref-type="bibr" rid="B34">34</xref>
). Second, there is an extensive overlap between G4 and CpG islands, that is genomic regions in which the cytosines of CpG dinucleotides preferentially remain unmethylated (
<xref ref-type="bibr" rid="B45">45</xref>
,
<xref ref-type="bibr" rid="B46">46</xref>
). Specifically, the coverage density of G4 inside CpG islands was several-fold higher than outside islands (
<xref ref-type="table" rid="T1">Table 1</xref>
). The latter observation implies that many G4 CpGs inevitably appear unmethylated in the genome, and this will likely reduce their overall mutagenic potential.
<table-wrap id="T1" position="float">
<label>Table 1.</label>
<caption>
<p>Density of SNPs and CpG dinucleotides in G4 motifs</p>
</caption>
<table frame="hsides" rules="groups">
<thead align="left">
<tr>
<th rowspan="1" colspan="1"></th>
<th rowspan="1" colspan="1">Number of G4 motifs</th>
<th rowspan="1" colspan="1">Number of SNPs
<xref ref-type="table-fn" rid="TF1">
<sup>a</sup>
</xref>
</th>
<th rowspan="1" colspan="1">SNPs/CpG
<xref ref-type="table-fn" rid="TF2">
<sup>b</sup>
</xref>
</th>
<th rowspan="1" colspan="1">CpG island coverage
<xref ref-type="table-fn" rid="TF3">
<sup>c</sup>
</xref>
</th>
<th rowspan="1" colspan="1">CpG/kb
<xref ref-type="table-fn" rid="TF4">
<sup>d</sup>
</xref>
</th>
</tr>
</thead>
<tbody align="left">
<tr>
<td rowspan="1" colspan="1">Genome</td>
<td rowspan="1" colspan="1">282 501</td>
<td rowspan="1" colspan="1"></td>
<td rowspan="1" colspan="1"></td>
<td rowspan="1" colspan="1"></td>
<td rowspan="1" colspan="1"></td>
</tr>
<tr>
<td rowspan="1" colspan="1">First introns</td>
<td rowspan="1" colspan="1">17 926 (0.33 Mb)</td>
<td rowspan="1" colspan="1">555 (441)</td>
<td rowspan="1" colspan="1">0.00413 (0.014)</td>
<td rowspan="1" colspan="1">0.093 (0.014)</td>
<td rowspan="1" colspan="1">58.4 (28.7)</td>
</tr>
<tr>
<td rowspan="1" colspan="1">Gene 5′</td>
<td rowspan="1" colspan="1">31 694 (0.55 Mb)</td>
<td rowspan="1" colspan="1">1157 (874)</td>
<td rowspan="1" colspan="1">0.0052 (0.012)</td>
<td rowspan="1" colspan="1">0.044 (0.010)</td>
<td rowspan="1" colspan="1">57.7 (34.9)</td>
</tr>
<tr>
<td rowspan="1" colspan="1">Gene 3′</td>
<td rowspan="1" colspan="1">17 458 (0.30 Mb)</td>
<td rowspan="1" colspan="1">906 (639)</td>
<td rowspan="1" colspan="1">0.0190 (0.038)</td>
<td rowspan="1" colspan="1">0.048 (0.008)</td>
<td rowspan="1" colspan="1">29.5 (13.6)</td>
</tr>
<tr>
<td rowspan="1" colspan="1">Intergenic</td>
<td rowspan="1" colspan="1">103 911 (2.01 Mb)</td>
<td rowspan="1" colspan="1">5001 (4096)</td>
<td rowspan="1" colspan="1">0.023 (0.064)</td>
<td rowspan="1" colspan="1">0.036 (0.002)</td>
<td rowspan="1" colspan="1">22.6 (7.6)</td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<fn id="TF1">
<p>
<sup>a</sup>
Total number of SNPs that map to G4 motifs. The number of unique (non-redundant) SNPs is given in parentheses.</p>
</fn>
<fn id="TF2">
<p>
<sup>b</sup>
The density estimate of SNPs at G4-CpGs included only C/T and A/G SNPs, since the majority of substitutions occurring at the hypermutable CpG are methylation-dependent transitions. A similar density estimate of SNPs at CpGs in the genomic background is given in parentheses.</p>
</fn>
<fn id="TF3">
<p>
<sup>c</sup>
Coverage is defined as the fraction of island bases covered by G4 bases. Coverage of G4 outside CpG islands is given in parentheses.</p>
</fn>
<fn id="TF4">
<p>
<sup>d</sup>
CpG density in genomic background is given in parentheses.</p>
</fn>
</table-wrap-foot>
</table-wrap>
</p>
<p>We next sought to identify mutational patterns of G4 motifs that were not related to CpG. To do so, we compared them with a set of randomly picked non-G4 sequences that matched the GC distribution of G4 (see ‘Materials and Methods’ section). Sampling non-G4 sequences in this manner enabled us to target non-CpG types of pattern in G4, since the mutational characteristics of CpG were approximately equalized between G4 and non-G4. We observed that the SNP density in G4 was consistently lower than in non-G4 sequences, although to a varying extent in the different genomic regions (
<xref ref-type="fig" rid="F2">Figure 2</xref>
). Since the primary sequence difference between G4 and the random non-G4 fragments was the density of guanine triplets, we hypothesized a suppression of nucleotide polymorphisms in the G4 tetrad regions (i.e. guanine triplets), and that this phenomenon would influence the relatively low rate of G4 SNPs.
<fig id="F2" position="float">
<label>Figure 2.</label>
<caption>
<p>SNP density in G4 sequences versus randomly picked non-G4 sequences. The set of non-G4 sequences were drawn such that their GC-richness was equivalent to that of G4.</p>
</caption>
<graphic xlink:href="gkp590f2"></graphic>
</fig>
</p>
<sec>
<title>Critical sites of G4 motifs display low levels of polymorphism</title>
<p>We next investigated whether loop and tetrad (i.e. G-tracts) regions of G4 motifs are subject to different mutational pressures. The two distinct G4 regions are important for quadruplex formation and stability, the G-tracts that make up the tetrad planes being critical for formation and folding (
<xref ref-type="bibr" rid="B32">32</xref>
). It is worth noting that, in the G-tracts of G4 motifs, not all substitutions of guanine will disrupt the potential to form a quadruplex structure. For example, if a motif contains a run of four guanines, substitutions at either end of the run will not disrupt the required triplet and could therefore, in principle, preserve the quadruplex-forming potential. On the basis of this reasoning, we classified each position in G4 motifs as either ‘G4-disruptive’ or ‘G4-neutral’ (
<xref ref-type="fig" rid="F1">Figure 1</xref>
b). In all genomic regions analyzed, we found a significantly lower rate of SNPs in G4-disruptive positions relative to the G4-neutral positions (
<xref ref-type="fig" rid="F3">Figure 3</xref>
). However, since hypermutable CpGs are more frequent at neutral positions than disruptive positions by a factor of nearly three, we performed an additional analysis where CpG sites were masked (
<xref ref-type="table" rid="T2">Table 2</xref>
). The difference in SNP density between neutral and disruptive G4 positions decreased when considering non-CpG sites only, though disruptive sites still displayed a significantly lower level of sequence polymorphism. We elaborated on this finding with comparative genomics data, assessing the level of sequence conservation within the two classes of G4 sites. This was accomplished by constructing a four-species multiple sequence alignment (human, monkey, dog and mouse) of G4 motifs from the 28-way vertebrate MultiZ alignments. The disruptive sites of CpG-masked G4 motifs showed consistently higher levels of mammalian sequence conservation than non-disruptive sites (
<xref ref-type="fig" rid="F4">Figure 4</xref>
).
<fig id="F3" position="float">
<label>Figure 3.</label>
<caption>
<p>SNP density in G4-disruptive sites versus G4-neutral sites (see
<xref ref-type="fig" rid="F1">Figure 1</xref>
b for a definition of G4-disruptive and G4-neutral).</p>
</caption>
<graphic xlink:href="gkp590f3"></graphic>
</fig>
<table-wrap id="T2" position="float">
<label>Table 2.</label>
<caption>
<p>Density of SNPs in disruptive and neutral sites of G4 sequence motifs</p>
</caption>
<table frame="hsides" rules="groups">
<thead align="left">
<tr>
<th rowspan="1" colspan="1"></th>
<th colspan="2" align="center" rowspan="1">G4-neutral
<hr></hr>
</th>
<th rowspan="1" colspan="1"></th>
<th colspan="2" align="center" rowspan="1">G4-disruptive
<hr></hr>
</th>
</tr>
<tr>
<th rowspan="1" colspan="1"></th>
<th rowspan="1" colspan="1">CpGs/kb</th>
<th rowspan="1" colspan="1">SNPs/kb
<xref ref-type="table-fn" rid="TF5">
<sup>a</sup>
</xref>
</th>
<th align="center" rowspan="1" colspan="1">
<italic>P</italic>
<xref ref-type="table-fn" rid="TF6">
<sup>b</sup>
</xref>
</th>
<th rowspan="1" colspan="1">CpGs/kb</th>
<th rowspan="1" colspan="1">SNPs/kb
<xref ref-type="table-fn" rid="TF5">
<sup>a</sup>
</xref>
</th>
</tr>
</thead>
<tbody align="left">
<tr>
<td rowspan="1" colspan="1">First introns</td>
<td rowspan="1" colspan="1">166.5</td>
<td rowspan="1" colspan="1">1.52 (1.38)</td>
<td rowspan="1" colspan="1">&lt;0.00001</td>
<td rowspan="1" colspan="1">73.1</td>
<td rowspan="1" colspan="1">0.90 (0.91)</td>
</tr>
<tr>
<td rowspan="1" colspan="1">Gene 5′</td>
<td rowspan="1" colspan="1">166.1</td>
<td rowspan="1" colspan="1">1.68 (1.48)</td>
<td rowspan="1" colspan="1">&lt;0.05</td>
<td rowspan="1" colspan="1">71.4</td>
<td rowspan="1" colspan="1">1.28 (1.29)</td>
</tr>
<tr>
<td rowspan="1" colspan="1">Gene 3′</td>
<td rowspan="1" colspan="1">83.9</td>
<td rowspan="1" colspan="1">2.37 (1.72)</td>
<td rowspan="1" colspan="1">&lt;0.05</td>
<td rowspan="1" colspan="1">36.1</td>
<td rowspan="1" colspan="1">1.63 (1.45)</td>
</tr>
<tr>
<td rowspan="1" colspan="1">Intergenic</td>
<td rowspan="1" colspan="1">65.0</td>
<td rowspan="1" colspan="1">2.28 (1.65)</td>
<td rowspan="1" colspan="1">&lt;0.001</td>
<td rowspan="1" colspan="1">25.6</td>
<td rowspan="1" colspan="1">1.62 (1.46)</td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<fn id="TF5">
<p>
<sup>a</sup>
Density of SNPs in non-CpG sites are given in parentheses.</p>
</fn>
<fn id="TF6">
<p>
<sup>b</sup>
Difference in SNP density between G4-disruptive and G4-neutral sites (non-CpG) by Chi-squared analysis.</p>
</fn>
</table-wrap-foot>
</table-wrap>
<fig id="F4" position="float">
<label>Figure 4.</label>
<caption>
<p>Sequence conservation in G4-disruptive sites versus G4-neutral sites. Shown is the fraction of conserved (i.e. all bases identical) sites at G4-disruptive and G4-neutral sites, as extracted from MultiZ sequence alignments of human G4 with monkey (rheMac2), dog (canFam2) and mouse (mm8). Only non-CpG sites were probed for conservation.</p>
</caption>
<graphic xlink:href="gkp590f4"></graphic>
</fig>
</p>
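<p>The classification of motif positions can be sketched in Python as follows. This is an illustrative brute-force reading of the definition, not necessarily the procedure used in the analysis: a position is labelled G4-disruptive when some single-base substitution leaves no match to the consensus anywhere in the motif, and G4-neutral otherwise. The regular expression and the function name classify_sites are assumptions.</p>
<preformat>
```python
import re

# Assumed rendering of the consensus G3+ N1-7 G3+ N1-7 G3+ N1-7 G3+.
G4_CONSENSUS = re.compile(r"(?:G{3,}[ACGT]{1,7}){3}G{3,}")


def classify_sites(motif):
    """Label each position of a G4 motif as 'disruptive' or 'neutral'."""
    motif = motif.upper()
    labels = []
    for i, ref in enumerate(motif):
        disrupted = False
        for alt in "ACGT":
            if alt == ref:
                continue
            mutated = motif[:i] + alt + motif[i + 1:]
            # The site is disruptive if the mutated motif no longer
            # contains any match to the consensus.
            if G4_CONSENSUS.search(mutated) is None:
                disrupted = True
                break
        labels.append("disruptive" if disrupted else "neutral")
    return labels


if __name__ == "__main__":
    example = "GGGGAGGGTGGGCGGG"
    for base, label in zip(example, classify_sites(example)):
        print(base, label)
```
</preformat>
<p>For the example motif, whose first tract contains four guanines, the two outer guanines of that tract and all loop bases come out as neutral, while the inner guanines and every guanine of the three-base tracts come out as disruptive.</p>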
<p>The evident conservation and suppressed level of polymorphism at G4-disruptive sites could, intuitively, be interpreted as evidence that the G-quadruplex consensus sequence is under functional constraint in the genome. The basic rationale for this argument comes from two recent studies of mutated G-quadruplexes, which demonstrated that their conformational dynamics strongly depend on the position of the mutated guanine (
<xref ref-type="bibr" rid="B31">31</xref>
,
<xref ref-type="bibr" rid="B32">32</xref>
). In an analysis that applied single-molecule FRET spectroscopy on telomeric G4 motifs, the G-quadruplex was severely destabilized when a central guanine was substituted with thymine. Substitutions at the end of a guanine tract also produced less stable structures, though with a far less dramatic effect than the central ones (
<xref ref-type="bibr" rid="B31">31</xref>
). In accordance with these data, we observed a tendency in which the critical guanines of human G4 motifs are less polymorphic than their neutral counterparts. However, we found that this characteristic feature of G4 motifs occurred genome-wide, in a strand-independent manner, and also among G4 motifs in intergenic regions. These latter observations suggested that the phenomenon occurs as an effect of intrinsic mutation or DNA repair mechanisms rather than as a consequence of selection for the G4 consensus.</p>
</sec>
<sec>
<title>General under-representation of SNPs in guanine tracts</title>
<p>The distribution of SNPs in G4 motifs revealed that nucleotide polymorphisms in G4 DNA would more likely alter the loop conformation than the quadruplex-forming potential. We next asked whether this pattern of guanine substitutions occurs genome-wide, rather than being restricted to the G-tracts of G4 motifs. More specifically, we estimated the relative over-representation of each DNA three-mer at polymorphic guanines by comparing its frequency at polymorphic sites versus non-polymorphic sites, adopting the approach used by Tomso and co-workers (
<xref ref-type="bibr" rid="B41">41</xref>
). For each polymorphic site, two centered three-mers were recorded, one for each allele. Importantly, since the SNP data do not indicate on which strand the original mutational event occurred, we cannot distinguish between a context and its reverse complementary context. We thus ignored strandedness and pooled reverse complementary three-mers together. We confirmed previous observations that CpG-containing three-mers are the most over-represented sequence contexts at human SNPs (
<xref ref-type="bibr" rid="B36">36</xref>
,
<xref ref-type="bibr" rid="B41">41</xref>
). At the opposite end, we observed that a guanine surrounded by other guanines (i.e. 5′-G
<underline>G</underline>
G-3′/5′-C
<underline>C</underline>
C-3′, polymorphic site underlined), is among the DNA sequence contexts that are most under-represented at polymorphic sites (
<xref ref-type="fig" rid="F5">Figure 5</xref>
). In fact, it was the most under-represented sequence context among polymorphisms within first introns, at the 5′ end of genes, and at the 3′ end of genes. Our data thus indicate that SNPs with the highest probability of disrupting G-tracts represent the most under-represented SNP context in regulatory gene sequences. We also noted that for sequence contexts at both ends of G-tracts, which for three-mers constitute the 5′-N
<underline>G</underline>
G-3′/5′-C
<underline>C</underline>
N-3′ and 5′-G
<underline>G</underline>
N-3′/5′-N
<underline>C</underline>
C-3′ contexts, the frequencies of polymorphisms were generally low. An exception was the 5′-G
<underline>G</underline>
T-3′/5′-A
<underline>C</underline>
C-3′ context (and the CpG-containing 5′-C
<underline>G</underline>
G-3′/5′-C
<underline>C</underline>
G-3′, not shown in
<xref ref-type="fig" rid="F5">Figure 5</xref>
).
<fig id="F5" position="float">
<label>Figure 5.</label>
<caption>
<p>The ratio of DNA three-mers at polymorphic to non-polymorphic sites. Only non-CpG three-mers have been plotted, and each three-mer ratio constitutes the combined ratio of the forward and reverse complementary context. Only SNPs that were proven polymorphic by the HapMap project were used in the calculation.</p>
</caption>
<graphic xlink:href="gkp590f5"></graphic>
</fig>
</p>
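<p>The pooling of reverse complementary three-mers and the comparison of polymorphic with non-polymorphic sites can be sketched in Python as below. The sketch assumes that the reported ratio is the relative frequency of a pooled context among polymorphic sites divided by its relative frequency among non-polymorphic sites; this normalization, the function names, and the choice of the alphabetically smaller three-mer as the pooled label are illustrative assumptions.</p>
<preformat>
```python
from collections import Counter

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def canonical_threemer(kmer):
    """Pool a three-mer with its reverse complement under one label."""
    kmer = kmer.upper()
    reverse_complement = kmer.translate(COMPLEMENT)[::-1]
    return min(kmer, reverse_complement)


def context_ratios(polymorphic, non_polymorphic):
    """Ratio of pooled three-mer frequencies at polymorphic sites to
    their frequencies at non-polymorphic sites.

    Both arguments are iterables of centered three-mers. Contexts that
    never occur at non-polymorphic sites are skipped.
    """
    poly = Counter(canonical_threemer(k) for k in polymorphic)
    background = Counter(canonical_threemer(k) for k in non_polymorphic)
    poly_total = sum(poly.values())
    background_total = sum(background.values())
    ratios = {}
    for context, count in poly.items():
        if background[context] == 0:
            continue
        poly_freq = count / poly_total
        background_freq = background[context] / background_total
        ratios[context] = poly_freq / background_freq
    return ratios
```
</preformat>
<p>Under this convention, a ratio below 1 marks a context that is under-represented at polymorphic sites, and GGG and CCC are pooled under the label CCC.</p>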
<p>Which biological mechanisms could underlie the low rate of polymorphisms inside guanine/cytosine tracts? The phenomenon was not only evident in regulatory regions, but also appeared to occur in intergenic regions, where the modulation of mutational output by natural selection is believed to be weaker. The latter suggests that the observed pattern of SNPs reflects a context-dependency in the mechanisms underlying human mutation. The mutational input to polymorphisms in DNA is considered to be base damage or incorporation of incorrect bases by polymerases during replication, followed by no or error-prone DNA repair. Both the frequency of damage and the efficiency and fidelity of DNA replication and repair probably depend on the sequence context. A very significant source of mutations is deamination of 5-methylcytosine (5mC) in CpG dinucleotides. An important additional source of mutations is absent or error-prone repair of 7,8-dihydro-8-oxo-guanine (8-oxoG) in the DNA. This lesion may be caused by UV radiation or oxidative damage to guanine. Several DNA repair systems target this type of damage, including base excision repair and mismatch repair, but they are not perfect. The damage may occur either to guanines in the nucleotide pool or directly to the guanines in the DNA. In the former case, 8-oxoG may subsequently be incorporated into the DNA unless degraded by the NUDT1 hydrolase (
<xref ref-type="bibr" rid="B47">47</xref>
). If 8-oxoG in the DNA is not removed by the OGG1 glycosylase (
<xref ref-type="bibr" rid="B48">48</xref>
,
<xref ref-type="bibr" rid="B49">49</xref>
), subsequent replication may lead to an adenine being incorrectly incorporated opposite the 8-oxoG instead of a cytosine. If the adenine is not removed by the MUTYH glycosylase (
<xref ref-type="bibr" rid="B50">50</xref>
) before the next round of replication, this process may result in a G:C to T:A transversion. McCulloch
<italic>et al.</italic>
(
<xref ref-type="bibr" rid="B51">51</xref>
) have recently studied the efficiency and fidelity of 8-oxoG bypass by DNA polymerases, and their work may indicate a slight dependency on the sequence context for the human polymerase η. Further work is necessary to determine, in detail, the context dependency of polymerases and whether this can be a basis for sequence-dependent mutation rates.</p>
<p>Imbalance in the nucleotide precursor pool represents another potential source of mutations. In a mammalian model system that induced thymidine mutations by pool perturbation, it was shown that guanine residues flanked on their 3′ side by other guanine residues are severalfold less mutable than guanine residues flanked on their 3′ side by a different base (
<xref ref-type="bibr" rid="B52">52</xref>
). The underlying mechanism for this pattern was not examined. The authors do, however, argue that differential repair of misincorporated thymidines could be involved. Nonetheless, it is intriguing to see how well these patterns of induced mutations fit with the spectrum we observed for guanine SNPs.</p>
<p>Could systematic DNA-sequencing errors among the polymorphisms collected from dbSNP account for the observed pattern? It has been shown that a few sequence contexts are particularly prone to sequencing errors (one of them being C(A/Y)C), and that these are over-represented among non-validated SNPs (
<xref ref-type="bibr" rid="B53">53</xref>
). However, our strategy to pick SNPs from dbSNP was designed in a conservative manner (see ‘Materials and Methods’ section), thereby excluding the majority of false-positive SNPs. Also, we imposed even stricter requirements in the analysis of SNP three-mers, in which we only considered SNPs that were proven polymorphic by HapMap genotyping.</p>
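<p>A tally of three-mer contexts at guanine polymorphisms, of the kind referred to above, can be sketched as follows. This is a minimal illustration and not the pipeline used in the study: the function names, the 0-based coordinates and the choice to reverse-complement cytosine-centred contexts so that the guanine is always central are assumptions made here.</p>

```python
from collections import Counter

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def guanine_three_mers(reference, snp_positions):
    """Count three-mers centred on polymorphic guanines (hypothetical helper).

    reference      upper-case reference sequence (string)
    snp_positions  0-based, non-negative positions of SNPs in the reference
    """
    counts = Counter()
    for pos in snp_positions:
        # Keep only positions that have a flanking base on both sides.
        if pos >= 1 and len(reference) - 2 >= pos:
            three_mer = reference[pos - 1:pos + 2]
            if three_mer[1] == "C":
                # A cytosine SNP is a guanine SNP on the opposite strand.
                three_mer = reverse_complement(three_mer)
            if three_mer[1] == "G":
                counts[three_mer] += 1
    return counts


# Toy sequence and positions, invented purely for illustration.
print(guanine_three_mers("AGGGTCCCA", [1, 2, 4, 6]))
```

<p>Comparing such counts with the genome-wide frequency of each three-mer would indicate which contexts are under-represented at polymorphic sites.</p>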
</sec>
<sec>
<title>A G4 strand bias for disruptive SNPs</title>
<p>In the previous analyses of SNPs in G4 motifs, we considered general G4 formation potential during DNA denaturation, thereby ignoring the strand orientation of motifs. If we regard G4-regulated gene transcription as a separate process, the potential for regulation lies primarily within motifs on the non-template strand, where G4 motifs are significantly enriched relative to the template strand (
<xref ref-type="bibr" rid="B40">40</xref>
,
<xref ref-type="bibr" rid="B42">42</xref>
). We therefore undertook an additional analysis of SNPs in G4 motifs that incorporated the strandedness of motifs. The extent of G4 strand bias was defined as the ratio of SNP density (non-CpG) in G4 on the non-template strand to the SNP density in G4 on the template strand, where a ratio of 1 implies no strand bias. Interestingly, we observed a marked strand bias for G4-disruptive SNPs in regulatory sequences, while negligible biases were observed among the neutral G4 SNPs (
<xref ref-type="fig" rid="F6">Figure 6</xref>
). For disruptive SNPs, it was evident that their density in G4 motifs on the non-template strand was lower than on the template strand. This bias was significant at the 5′ end of genes (
<italic>P</italic>
&lt; 0.02, χ
<sup>2</sup>
= 5.77, df = 1) and at the 3′ end of genes (
<italic>P</italic>
&lt; 0.02, χ
<sup>2</sup>
= 5.82, df = 1). The result was not an artefact of the overlapping G4 motifs (and SNPs), since the count of unique SNPs in non-overlapping G4 motifs also produced significant strand biases at a significance level of 0.05 (data not shown). To validate the observation at the 5′ end, and to test whether the result was a mere consequence of a general suppression of polymorphisms in guanine tracts on the non-template strand, we carried out a similar analysis with a related sequence element, the SP1 transcription factor binding site (5′-GGGCGG-3′) (
<xref ref-type="bibr" rid="B44">44</xref>
). More specifically, we asked whether there is a strand bias (with respect to SP1) for nucleotide polymorphisms that disrupt the SP1 motif at positions 2 or 3 (two non-CpG sites). The level of SP1 disruption did not differ significantly between the two strands at the 5′ end (
<italic>P</italic>
= 0.855, χ
<sup>2</sup>
= 0.03, df = 1), although the set of polymorphisms that mapped to the SP1 motif was considerably smaller than the G4 set (524 SP1 polymorphisms versus 1157 G4 polymorphisms).
<fig id="F6" position="float">
<label>Figure 6.</label>
<caption>
<p>The ratio of SNP density (non-CpG) in non-template motifs to the SNP density in template motifs. The dashed line marks a ratio of 1, i.e. equal SNP rates on both strands (no strand bias).</p>
</caption>
<graphic xlink:href="gkp590f6"></graphic>
</fig>
</p>
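<p>The strand bias ratio defined above, and a test of the kind reported for it, can be sketched in a few lines. This is an illustration under stated assumptions rather than the procedure used in the study: the function names are hypothetical, the test is assumed to be a Pearson chi-square on a 2 x 2 table of polymorphic versus non-polymorphic sites by strand without continuity correction, and the counts are invented.</p>

```python
import math


def strand_bias_ratio(snps_nontemplate, sites_nontemplate, snps_template, sites_template):
    """SNP density on the non-template strand divided by that on the template strand."""
    density_nontemplate = snps_nontemplate / sites_nontemplate
    density_template = snps_template / sites_template
    return density_nontemplate / density_template


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    numerator = n * (a * d - b * c) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return numerator / denominator


def p_value_df1(chi2):
    """Upper-tail probability of a chi-square statistic with one degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2.0))


# Hypothetical counts, chosen only to exercise the functions (not data from the study).
snps_nt, sites_nt = 10, 30  # non-template strand: polymorphic sites, all sites
snps_t, sites_t = 20, 30    # template strand: polymorphic sites, all sites

ratio = strand_bias_ratio(snps_nt, sites_nt, snps_t, sites_t)
chi2 = chi_square_2x2(snps_nt, sites_nt - snps_nt, snps_t, sites_t - snps_t)
print(ratio, chi2, p_value_df1(chi2))
```

<p>With one degree of freedom, <monospace>p_value_df1(5.77)</monospace> and <monospace>p_value_df1(5.82)</monospace> both evaluate to roughly 0.016, in line with the P values below 0.02 reported above for the 5′ and 3′ ends of genes.</p>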
<p>The low rate of human SNPs in G4-disruptive positions on the non-template strand supports a higher relative importance for this strand in G4-mediated gene regulation. When present on this strand downstream of the TSS, the G-quadruplex may form as part of the pre-mRNA and/or potentially the mRNA, and it may thus provide multiple targets for regulation (
<xref ref-type="bibr" rid="B40">40</xref>
). The formation of G4 on the template strand would on the other hand hinder the progression of the RNA polymerase, and is therefore less desirable (
<xref ref-type="bibr" rid="B23">23</xref>
). We also showed that another G-rich element, the SP1 transcription factor binding site, did not display any strand bias with respect to disruptive SNPs at the 5′ end. The pattern thus appears to support a specific biological importance for G4 motifs on the non-template strand at the 5′ end of genes.</p>
</sec>
</sec>
<sec sec-type="conclusions">
<title>CONCLUSION</title>
<p>The recent genome-wide scans of G4 motifs in the human genome have identified enrichment in gene regulatory sequences, and the same tendency has been shown when searching the genomes of chimpanzee, rat and mouse (
<xref ref-type="bibr" rid="B16">16</xref>
,
<xref ref-type="bibr" rid="B17">17</xref>
,
<xref ref-type="bibr" rid="B45">45</xref>
). The prevalence of G4 motifs upstream of mammalian genes has been interpreted as a sign of selection for G4, and consequently implicated the G-quadruplex structure as a potential mechanism for regulating gene expression (
<xref ref-type="bibr" rid="B23">23</xref>
,
<xref ref-type="bibr" rid="B39">39</xref>
). On the basis of sequence data only, it is nonetheless impossible to determine the extent of quadruplex formation
<italic>in vivo</italic>
, although it seems most likely that only a low percentage of the G4 motifs will adopt structures during denaturation.</p>
<p>Here, a close examination of the context-dependent pattern of guanine polymorphisms has provided an additional perspective on G4 prevalence. It shows how sequence mutagenesis could impact the evolution of guanine tracts, the key component in G4 motifs. Although significant patterns emerged, our results are limited by the small number of SNPs (approximately 11 000) that map to G4 motifs in the human genome. Following next-generation sequencing and collaborative efforts such as the 1000 Genomes Project (
<xref ref-type="bibr" rid="B54">54</xref>
), more data should become available for studying the nature of G4 sequence polymorphism. An interesting extension of our analysis, which requires more validated SNPs, is to relate the directionality of each SNP (i.e. by determining the ancestral and derived allele) to G4 evolution. Nevertheless, our current findings warrant a closer examination of the relationship between G4 and other factors that might constrain the nearest-neighbour sequence patterns in DNA, an example being the physical requirements for the dense packing of DNA around nucleosomes.</p>
</sec>
<sec>
<title>FUNDING</title>
<p>Research Council of Norway. Funding for open access charge: the EU FP7 contract 223367.</p>
<p>
<italic>Conflict of interest statement</italic>
. None declared.</p>
</sec>
</body>
<back>
<ref-list>
<title>REFERENCES</title>
<ref id="B1">
<label>1</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Sen</surname>
<given-names>D</given-names>
</name>
<name>
<surname>Gilbert</surname>
<given-names>W</given-names>
</name>
</person-group>
<article-title>Formation of parallel four-stranded complexes by guanine-rich motifs in DNA and its implications for meiosis</article-title>
<source>Nature</source>
<year>1988</year>
<volume>334</volume>
<fpage>364</fpage>
<lpage>366</lpage>
<pub-id pub-id-type="pmid">3393228</pub-id>
</element-citation>
</ref>
<ref id="B2">
<label>2</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Gellert</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Lipsett</surname>
<given-names>MN</given-names>
</name>
<name>
<surname>Davies</surname>
<given-names>DR</given-names>
</name>
</person-group>
<article-title>Helix formation by guanylic acid</article-title>
<source>Proc. Natl Acad. Sci. USA</source>
<year>1962</year>
<volume>48</volume>
<fpage>2013</fpage>
<lpage>2018</lpage>
<pub-id pub-id-type="pmid">13947099</pub-id>
</element-citation>
</ref>
<ref id="B3">
<label>3</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Hazel</surname>
<given-names>P</given-names>
</name>
<name>
<surname>Huppert</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Balasubramanian</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Neidle</surname>
<given-names>S</given-names>
</name>
</person-group>
<article-title>Loop-length-dependent folding of G-quadruplexes</article-title>
<source>J. Am. Chem. Soc.</source>
<year>2004</year>
<volume>126</volume>
<fpage>16405</fpage>
<lpage>16415</lpage>
<pub-id pub-id-type="pmid">15600342</pub-id>
</element-citation>
</ref>
<ref id="B4">
<label>4</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Risitano</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Fox</surname>
<given-names>KR</given-names>
</name>
</person-group>
<article-title>Influence of loop size on the stability of intramolecular DNA quadruplexes</article-title>
<source>Nucleic Acids Res.</source>
<year>2004</year>
<volume>32</volume>
<fpage>2598</fpage>
<lpage>2606</lpage>
<pub-id pub-id-type="pmid">15141030</pub-id>
</element-citation>
</ref>
<ref id="B5">
<label>5</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Burge</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Hazel</surname>
<given-names>P</given-names>
</name>
<name>
<surname>Todd</surname>
<given-names>AK</given-names>
</name>
</person-group>
<article-title>Quadruplex DNA: sequence, topology and structure</article-title>
<source>Nucleic Acids Res.</source>
<year>2006</year>
<volume>34</volume>
<fpage>5402</fpage>
<lpage>5415</lpage>
<pub-id pub-id-type="pmid">17012276</pub-id>
</element-citation>
</ref>
<ref id="B6">
<label>6</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Rachwal</surname>
<given-names>PA</given-names>
</name>
<name>
<surname>Findlow</surname>
<given-names>IS</given-names>
</name>
<name>
<surname>Werner</surname>
<given-names>JM</given-names>
</name>
<name>
<surname>Brown</surname>
<given-names>T</given-names>
</name>
<name>
<surname>Fox</surname>
<given-names>KR</given-names>
</name>
</person-group>
<article-title>Intramolecular DNA quadruplexes with different arrangements of short and long loops</article-title>
<source>Nucleic Acids Res.</source>
<year>2007</year>
<volume>35</volume>
<fpage>4214</fpage>
<lpage>4222</lpage>
<pub-id pub-id-type="pmid">17576685</pub-id>
</element-citation>
</ref>
<ref id="B7">
<label>7</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Sundquist</surname>
<given-names>WI</given-names>
</name>
<name>
<surname>Klug</surname>
<given-names>A</given-names>
</name>
</person-group>
<article-title>Telomeric DNA dimerizes by formation of guanine tetrads between hairpin loops</article-title>
<source>Nature</source>
<year>1989</year>
<volume>342</volume>
<fpage>825</fpage>
<lpage>829</lpage>
<pub-id pub-id-type="pmid">2601741</pub-id>
</element-citation>
</ref>
<ref id="B8">
<label>8</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Williamson</surname>
<given-names>JR</given-names>
</name>
<name>
<surname>Raghuraman</surname>
<given-names>MK</given-names>
</name>
<name>
<surname>Cech</surname>
<given-names>TR</given-names>
</name>
</person-group>
<article-title>Monovalent cation-induced structure of telomeric DNA: the G-quartet model</article-title>
<source>Cell</source>
<year>1989</year>
<volume>59</volume>
<fpage>871</fpage>
<lpage>880</lpage>
<pub-id pub-id-type="pmid">2590943</pub-id>
</element-citation>
</ref>
<ref id="B9">
<label>9</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Schaffitzel</surname>
<given-names>C</given-names>
</name>
<name>
<surname>Berger</surname>
<given-names>I</given-names>
</name>
<name>
<surname>Postberg</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Hanes</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Lipps</surname>
<given-names>HJ</given-names>
</name>
<name>
<surname>Pluckthun</surname>
<given-names>A</given-names>
</name>
</person-group>
<article-title>In vitro generated antibodies specific for telomeric guanine-quadruplex DNA react with Stylonychia lemnae macronuclei</article-title>
<source>Proc. Natl Acad. Sci. USA</source>
<year>2001</year>
<volume>98</volume>
<fpage>8572</fpage>
<lpage>8577</lpage>
<pub-id pub-id-type="pmid">11438689</pub-id>
</element-citation>
</ref>
<ref id="B10">
<label>10</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Duquette</surname>
<given-names>ML</given-names>
</name>
<name>
<surname>Handa</surname>
<given-names>P</given-names>
</name>
<name>
<surname>Vincent</surname>
<given-names>JA</given-names>
</name>
<name>
<surname>Taylor</surname>
<given-names>AF</given-names>
</name>
<name>
<surname>Maizels</surname>
<given-names>N</given-names>
</name>
</person-group>
<article-title>Intracellular transcription of G-rich DNAs induces formation of G-loops, novel structures containing G4 DNA</article-title>
<source>Genes Dev.</source>
<year>2004</year>
<volume>18</volume>
<fpage>1618</fpage>
<lpage>1629</lpage>
<pub-id pub-id-type="pmid">15231739</pub-id>
</element-citation>
</ref>
<ref id="B11">
<label>11</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Paeschke</surname>
<given-names>K</given-names>
</name>
<name>
<surname>Simonsson</surname>
<given-names>T</given-names>
</name>
<name>
<surname>Postberg</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Rhodes</surname>
<given-names>D</given-names>
</name>
<name>
<surname>Lipps</surname>
<given-names>HJ</given-names>
</name>
</person-group>
<article-title>Telomere end-binding proteins control the formation of G-quadruplex DNA structures
<italic>in vivo</italic>
</article-title>
<source>Nat. Struct. Mol. Biol.</source>
<year>2005</year>
<volume>12</volume>
<fpage>847</fpage>
<lpage>854</lpage>
<pub-id pub-id-type="pmid">16142245</pub-id>
</element-citation>
</ref>
<ref id="B12">
<label>12</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Bachrati</surname>
<given-names>CZ</given-names>
</name>
<name>
<surname>Hickson</surname>
<given-names>ID</given-names>
</name>
</person-group>
<article-title>Analysis of the DNA unwinding activity of RecQ family helicases</article-title>
<source>Methods Enzymol.</source>
<year>2006</year>
<volume>409</volume>
<fpage>86</fpage>
<lpage>100</lpage>
<pub-id pub-id-type="pmid">16793396</pub-id>
</element-citation>
</ref>
<ref id="B13">
<label>13</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Sun</surname>
<given-names>H</given-names>
</name>
<name>
<surname>Karow</surname>
<given-names>JK</given-names>
</name>
<name>
<surname>Hickson</surname>
<given-names>ID</given-names>
</name>
<name>
<surname>Maizels</surname>
<given-names>N</given-names>
</name>
</person-group>
<article-title>The Bloom's syndrome helicase unwinds G4 DNA</article-title>
<source>J. Biol. Chem.</source>
<year>1998</year>
<volume>273</volume>
<fpage>27587</fpage>
<lpage>27592</lpage>
<pub-id pub-id-type="pmid">9765292</pub-id>
</element-citation>
</ref>
<ref id="B14">
<label>14</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Wu</surname>
<given-names>Y</given-names>
</name>
<name>
<surname>Shin-ya</surname>
<given-names>K</given-names>
</name>
<name>
<surname>Brosh</surname>
<given-names>RM</given-names>
<suffix>Jr</suffix>
</name>
</person-group>
<article-title>FANCJ helicase defective in Fanconia anemia and breast cancer unwinds G-quadruplex DNA to defend genomic stability</article-title>
<source>Mol. Cell Biol.</source>
<year>2008</year>
<volume>28</volume>
<fpage>4116</fpage>
<lpage>4128</lpage>
<pub-id pub-id-type="pmid">18426915</pub-id>
</element-citation>
</ref>
<ref id="B15">
<label>15</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Fry</surname>
<given-names>M</given-names>
</name>
</person-group>
<article-title>Tetraplex DNA and its interacting proteins</article-title>
<source>Front. Biosci.</source>
<year>2007</year>
<volume>12</volume>
<fpage>4336</fpage>
<lpage>4351</lpage>
<pub-id pub-id-type="pmid">17485378</pub-id>
</element-citation>
</ref>
<ref id="B16">
<label>16</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Huppert</surname>
<given-names>JL</given-names>
</name>
<name>
<surname>Balasubramanian</surname>
<given-names>S</given-names>
</name>
</person-group>
<article-title>Prevalence of quadruplexes in the human genome</article-title>
<source>Nucleic Acids Res.</source>
<year>2005</year>
<volume>33</volume>
<fpage>2908</fpage>
<lpage>2916</lpage>
<pub-id pub-id-type="pmid">15914667</pub-id>
</element-citation>
</ref>
<ref id="B17">
<label>17</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Todd</surname>
<given-names>AK</given-names>
</name>
<name>
<surname>Johnston</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Neidle</surname>
<given-names>S</given-names>
</name>
</person-group>
<article-title>Highly prevalent putative quadruplex sequence motifs in human DNA</article-title>
<source>Nucleic Acids Res.</source>
<year>2005</year>
<volume>33</volume>
<fpage>2901</fpage>
<lpage>2907</lpage>
<pub-id pub-id-type="pmid">15914666</pub-id>
</element-citation>
</ref>
<ref id="B18">
<label>18</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Kikin</surname>
<given-names>O</given-names>
</name>
<name>
<surname>D'Antonio</surname>
<given-names>L</given-names>
</name>
<name>
<surname>Bagga</surname>
<given-names>PS</given-names>
</name>
</person-group>
<article-title>QGRS Mapper: a web-based server for predicting G-quadruplexes in nucleotide sequences</article-title>
<source>Nucleic Acids Res.</source>
<year>2006</year>
<volume>34</volume>
<fpage>W676</fpage>
<lpage>W682</lpage>
<pub-id pub-id-type="pmid">16845096</pub-id>
</element-citation>
</ref>
<ref id="B19">
<label>19</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Hanakahi</surname>
<given-names>LA</given-names>
</name>
<name>
<surname>Sun</surname>
<given-names>H</given-names>
</name>
<name>
<surname>Maizels</surname>
<given-names>N</given-names>
</name>
</person-group>
<article-title>High affinity interactions of nucleolin with G-G-paired rDNA</article-title>
<source>J. Biol. Chem.</source>
<year>1999</year>
<volume>274</volume>
<fpage>15908</fpage>
<lpage>15912</lpage>
<pub-id pub-id-type="pmid">10336496</pub-id>
</element-citation>
</ref>
<ref id="B20">
<label>20</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Dempsey</surname>
<given-names>LA</given-names>
</name>
<name>
<surname>Sun</surname>
<given-names>H</given-names>
</name>
<name>
<surname>Hanakahi</surname>
<given-names>LA</given-names>
</name>
<name>
<surname>Maizels</surname>
<given-names>N</given-names>
</name>
</person-group>
<article-title>G4 DNA binding by LR1 and its subunits, nucleolin and hnRNP D, A role for G-G pairing in immunoglobulin switch recombination</article-title>
<source>J. Biol. Chem.</source>
<year>1999</year>
<volume>274</volume>
<fpage>1066</fpage>
<lpage>1071</lpage>
<pub-id pub-id-type="pmid">9873052</pub-id>
</element-citation>
</ref>
<ref id="B21">
<label>21</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Wang</surname>
<given-names>Y</given-names>
</name>
<name>
<surname>Patel</surname>
<given-names>DJ</given-names>
</name>
</person-group>
<article-title>Solution structure of the human telomeric repeat d[AG3(T2AG3)3] G-tetraplex</article-title>
<source>Structure</source>
<year>1993</year>
<volume>1</volume>
<fpage>263</fpage>
<lpage>282</lpage>
<pub-id pub-id-type="pmid">8081740</pub-id>
</element-citation>
</ref>
<ref id="B22">
<label>22</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Huppert</surname>
<given-names>JL</given-names>
</name>
<name>
<surname>Balasubramanian</surname>
<given-names>S</given-names>
</name>
</person-group>
<article-title>G-quadruplexes in promoters throughout the human genome</article-title>
<source>Nucleic Acids Res.</source>
<year>2007</year>
<volume>35</volume>
<fpage>406</fpage>
<lpage>413</lpage>
<pub-id pub-id-type="pmid">17169996</pub-id>
</element-citation>
</ref>
<ref id="B23">
<label>23</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Du</surname>
<given-names>Z</given-names>
</name>
<name>
<surname>Zhao</surname>
<given-names>Y</given-names>
</name>
<name>
<surname>Li</surname>
<given-names>N</given-names>
</name>
</person-group>
<article-title>Genome-wide analysis reveals regulatory role of G4 DNA in gene transcription</article-title>
<source>Genome Res.</source>
<year>2008</year>
<volume>18</volume>
<fpage>233</fpage>
<lpage>241</lpage>
<pub-id pub-id-type="pmid">18096746</pub-id>
</element-citation>
</ref>
<ref id="B24">
<label>24</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Siddiqui-Jain</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Grand</surname>
<given-names>CL</given-names>
</name>
<name>
<surname>Bearss</surname>
<given-names>DJ</given-names>
</name>
<name>
<surname>Hurley</surname>
<given-names>LH</given-names>
</name>
</person-group>
<article-title>Direct evidence for a G-quadruplex in a promoter region and its targeting with a small molecule to repress c-MYC transcription</article-title>
<source>Proc. Natl Acad. Sci. USA</source>
<year>2002</year>
<volume>99</volume>
<fpage>11593</fpage>
<lpage>11598</lpage>
<pub-id pub-id-type="pmid">12195017</pub-id>
</element-citation>
</ref>
<ref id="B25">
<label>25</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Simonsson</surname>
<given-names>T</given-names>
</name>
<name>
<surname>Pecinka</surname>
<given-names>P</given-names>
</name>
<name>
<surname>Kubista</surname>
<given-names>M</given-names>
</name>
</person-group>
<article-title>DNA tetraplex formation in the control region of c-myc</article-title>
<source>Nucleic Acids Res.</source>
<year>1998</year>
<volume>26</volume>
<fpage>1167</fpage>
<lpage>1172</lpage>
<pub-id pub-id-type="pmid">9469822</pub-id>
</element-citation>
</ref>
<ref id="B26">
<label>26</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Fernando</surname>
<given-names>H</given-names>
</name>
<name>
<surname>Reszka</surname>
<given-names>AP</given-names>
</name>
<name>
<surname>Huppert</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Ladame</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Rankin</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Venkitaraman</surname>
<given-names>AR</given-names>
</name>
<name>
<surname>Neidle</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Balasubramanian</surname>
<given-names>S</given-names>
</name>
</person-group>
<article-title>A conserved quadruplex motif located in a transcription activation site of the human c-kit oncogene</article-title>
<source>Biochemistry</source>
<year>2006</year>
<volume>45</volume>
<fpage>7854</fpage>
<lpage>7860</lpage>
<pub-id pub-id-type="pmid">16784237</pub-id>
</element-citation>
</ref>
<ref id="B27">
<label>27</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Yafe</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Etzioni</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Weisman-Shomer</surname>
<given-names>P</given-names>
</name>
<name>
<surname>Fry</surname>
<given-names>M</given-names>
</name>
</person-group>
<article-title>Formation and properties of hairpin and tetraplex structures of guanine-rich regulatory sequences of muscle-specific genes</article-title>
<source>Nucleic Acids Res.</source>
<year>2005</year>
<volume>33</volume>
<fpage>2887</fpage>
<lpage>2900</lpage>
<pub-id pub-id-type="pmid">15908587</pub-id>
</element-citation>
</ref>
<ref id="B28">
<label>28</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Zhao</surname>
<given-names>Y</given-names>
</name>
<name>
<surname>Du</surname>
<given-names>Z</given-names>
</name>
<name>
<surname>Li</surname>
<given-names>N</given-names>
</name>
</person-group>
<article-title>Extensive selection for the enrichment of G4 DNA motifs in transcriptional regulatory regions of warm blooded animals</article-title>
<source>FEBS Lett.</source>
<year>2007</year>
<volume>581</volume>
<fpage>1951</fpage>
<lpage>1956</lpage>
<pub-id pub-id-type="pmid">17462634</pub-id>
</element-citation>
</ref>
<ref id="B29">
<label>29</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Bugaut</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Balasubramanian</surname>
<given-names>S</given-names>
</name>
</person-group>
<article-title>A sequence-independent study of the influence of short loop lengths on the stability and topology of intramolecular DNA G-quadruplexes</article-title>
<source>Biochemistry</source>
<year>2008</year>
<volume>47</volume>
<fpage>689</fpage>
<lpage>697</lpage>
<pub-id pub-id-type="pmid">18092816</pub-id>
</element-citation>
</ref>
<ref id="B30">
<label>30</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Kumar</surname>
<given-names>N</given-names>
</name>
<name>
<surname>Sahoo</surname>
<given-names>B</given-names>
</name>
<name>
<surname>Varun</surname>
<given-names>KA</given-names>
</name>
<name>
<surname>Maiti</surname>
<given-names>S</given-names>
</name>
</person-group>
<article-title>Effect of loop length variation on quadruplex-Watson Crick duplex competition</article-title>
<source>Nucleic Acids Res.</source>
<year>2008</year>
<volume>36</volume>
<fpage>4433</fpage>
<lpage>4442</lpage>
<pub-id pub-id-type="pmid">18599514</pub-id>
</element-citation>
</ref>
<ref id="B31">
<label>31</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Lee</surname>
<given-names>JY</given-names>
</name>
<name>
<surname>Kim</surname>
<given-names>DS</given-names>
</name>
</person-group>
<article-title>Dramatic effect of single-base mutation on the conformational dynamics of human telomeric G-quadruplex</article-title>
<source>Nucleic Acids Res.</source>
<year>2009</year>
<volume>37</volume>
<fpage>3625</fpage>
<lpage>3634</lpage>
<pub-id pub-id-type="pmid">19359361</pub-id>
</element-citation>
</ref>
<ref id="B32">
<label>32</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Gros</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Rosu</surname>
<given-names>F</given-names>
</name>
<name>
<surname>Amrane</surname>
<given-names>S</given-names>
</name>
<name>
<surname>De Cian</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Gabelica</surname>
<given-names>V</given-names>
</name>
<name>
<surname>Lacroix</surname>
<given-names>L</given-names>
</name>
<name>
<surname>Mergny</surname>
<given-names>JL</given-names>
</name>
</person-group>
<article-title>Guanines are a quartet's best friend: impact of base substitutions on the kinetics and stability of tetramolecular quadruplexes</article-title>
<source>Nucleic Acids Res.</source>
<year>2007</year>
<volume>35</volume>
<fpage>3064</fpage>
<lpage>3075</lpage>
<pub-id pub-id-type="pmid">17452368</pub-id>
</element-citation>
</ref>
<ref id="B33">
<label>33</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Blake</surname>
<given-names>RD</given-names>
</name>
<name>
<surname>Hess</surname>
<given-names>ST</given-names>
</name>
<name>
<surname>Nicholson-Tuell</surname>
<given-names>J</given-names>
</name>
</person-group>
<article-title>The influence of nearest neighbors on the rate and pattern of spontaneous point mutations</article-title>
<source>J. Mol. Evol.</source>
<year>1992</year>
<volume>34</volume>
<fpage>189</fpage>
<lpage>200</lpage>
<pub-id pub-id-type="pmid">1588594</pub-id>
</element-citation>
</ref>
<ref id="B34">
<label>34</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Fryxell</surname>
<given-names>KJ</given-names>
</name>
<name>
<surname>Moon</surname>
<given-names>WJ</given-names>
</name>
</person-group>
<article-title>CpG mutation rates in the human genome are highly dependent on local GC content</article-title>
<source>Mol. Biol. Evol.</source>
<year>2005</year>
<volume>22</volume>
<fpage>650</fpage>
<lpage>658</lpage>
<pub-id pub-id-type="pmid">15537806</pub-id>
</element-citation>
</ref>
<ref id="B35">
<label>35</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Hodgkinson</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Ladoukakis</surname>
<given-names>E</given-names>
</name>
<name>
<surname>Eyre-Walker</surname>
<given-names>A</given-names>
</name>
</person-group>
<article-title>Cryptic variation in the human mutation rate</article-title>
<source>PLoS Biol.</source>
<year>2009</year>
<volume>7</volume>
<fpage>e27</fpage>
</element-citation>
</ref>
<ref id="B36">
<label>36</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Krawczak</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Ball</surname>
<given-names>EV</given-names>
</name>
<name>
<surname>Cooper</surname>
<given-names>DN</given-names>
</name>
</person-group>
<article-title>Neighboring-nucleotide effects on the rates of germ-line single-base-pair substitution in human genes</article-title>
<source>Am. J. Hum. Genet.</source>
<year>1998</year>
<volume>63</volume>
<fpage>474</fpage>
<lpage>488</lpage>
<pub-id pub-id-type="pmid">9683596</pub-id>
</element-citation>
</ref>
<ref id="B37">
<label>37</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Sherry</surname>
<given-names>ST</given-names>
</name>
<name>
<surname>Ward</surname>
<given-names>MH</given-names>
</name>
<name>
<surname>Kholodov</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Baker</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Phan</surname>
<given-names>L</given-names>
</name>
<name>
<surname>Smigielski</surname>
<given-names>EM</given-names>
</name>
<name>
<surname>Sirotkin</surname>
<given-names>K</given-names>
</name>
</person-group>
<article-title>dbSNP: the NCBI database of genetic variation</article-title>
<source>Nucleic Acids Res.</source>
<year>2001</year>
<volume>29</volume>
<fpage>308</fpage>
<lpage>311</lpage>
<pub-id pub-id-type="pmid">11125122</pub-id>
</element-citation>
</ref>
<ref id="B38">
<label>38</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Karolchik</surname>
<given-names>D</given-names>
</name>
<name>
<surname>Baertsch</surname>
<given-names>R</given-names>
</name>
<name>
<surname>Diekhans</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Furey</surname>
<given-names>TS</given-names>
</name>
<name>
<surname>Hinrichs</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Lu</surname>
<given-names>YT</given-names>
</name>
<name>
<surname>Roskin</surname>
<given-names>KM</given-names>
</name>
<name>
<surname>Schwartz</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Sugnet</surname>
<given-names>CW</given-names>
</name>
<name>
<surname>Thomas</surname>
<given-names>DJ</given-names>
</name>
<etal></etal>
</person-group>
<article-title>The UCSC Genome Browser Database</article-title>
<source>Nucleic Acids Res.</source>
<year>2003</year>
<volume>31</volume>
<fpage>51</fpage>
<lpage>54</lpage>
<pub-id pub-id-type="pmid">12519945</pub-id>
</element-citation>
</ref>
<ref id="B39">
<label>39</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Eddy</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Maizels</surname>
<given-names>N</given-names>
</name>
</person-group>
<article-title>Selection for the G4 DNA motif at the 5′ end of human genes</article-title>
<source>Mol. Carcinog.</source>
<year>2009</year>
<volume>48</volume>
<fpage>319</fpage>
<lpage>325</lpage>
<pub-id pub-id-type="pmid">19306310</pub-id>
</element-citation>
</ref>
<ref id="B40">
<label>40</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Eddy</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Maizels</surname>
<given-names>N</given-names>
</name>
</person-group>
<article-title>Conserved elements with potential to form polymorphic G-quadruplex structures in the first intron of human genes</article-title>
<source>Nucleic Acids Res.</source>
<year>2007</year>
<volume>36</volume>
<fpage>1321</fpage>
<lpage>1333</lpage>
<pub-id pub-id-type="pmid">18187510</pub-id>
</element-citation>
</ref>
<ref id="B41">
<label>41</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Tomso</surname>
<given-names>DJ</given-names>
</name>
<name>
<surname>Bell</surname>
<given-names>DA</given-names>
</name>
</person-group>
<article-title>Sequence context at human single nucleotide polymorphisms: overrepresentation of CpG dinucleotide at polymorphic sites and suppression of variation in CpG islands</article-title>
<source>J. Mol. Biol.</source>
<year>2003</year>
<volume>327</volume>
<fpage>303</fpage>
<lpage>308</lpage>
<pub-id pub-id-type="pmid">12628237</pub-id>
</element-citation>
</ref>
<ref id="B42">
<label>42</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Eddy</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Maizels</surname>
<given-names>N</given-names>
</name>
</person-group>
<article-title>Gene function correlates with potential for G4 DNA formation in the human genome</article-title>
<source>Nucleic Acids Res.</source>
<year>2006</year>
<volume>34</volume>
<fpage>3887</fpage>
<lpage>3896</lpage>
<pub-id pub-id-type="pmid">16914419</pub-id>
</element-citation>
</ref>
<ref id="B43">
<label>43</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Huppert</surname>
<given-names>JL</given-names>
</name>
<name>
<surname>Bugaut</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Kumari</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Balasubramanian</surname>
<given-names>S</given-names>
</name>
</person-group>
<article-title>G-quadruplexes: the beginning and end of UTRs</article-title>
<source>Nucleic Acids Res.</source>
<year>2008</year>
<volume>36</volume>
<fpage>6260</fpage>
<lpage>6268</lpage>
<pub-id pub-id-type="pmid">18832370</pub-id>
</element-citation>
</ref>
<ref id="B44">
<label>44</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Todd</surname>
<given-names>AK</given-names>
</name>
<name>
<surname>Neidle</surname>
<given-names>S</given-names>
</name>
</person-group>
<article-title>The relationship of potential G-quadruplex sequences in cis-upstream regions of the human genome to SP1-binding elements</article-title>
<source>Nucleic Acids Res.</source>
<year>2008</year>
<volume>36</volume>
<fpage>2700</fpage>
<lpage>2704</lpage>
<pub-id pub-id-type="pmid">18353860</pub-id>
</element-citation>
</ref>
<ref id="B45">
<label>45</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Verma</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Halder</surname>
<given-names>K</given-names>
</name>
<name>
<surname>Halder</surname>
<given-names>R</given-names>
</name>
<name>
<surname>Yadav</surname>
<given-names>VK</given-names>
</name>
<name>
<surname>Rawal</surname>
<given-names>P</given-names>
</name>
<name>
<surname>Thakur</surname>
<given-names>RK</given-names>
</name>
<name>
<surname>Mohd</surname>
<given-names>F</given-names>
</name>
<name>
<surname>Sharma</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Chowdhury</surname>
<given-names>S</given-names>
</name>
</person-group>
<article-title>Genome-wide computational and expression analyses reveal G-quadruplex DNA motifs as conserved cis-regulatory elements in human and related species</article-title>
<source>J. Med. Chem.</source>
<year>2008</year>
<volume>51</volume>
<fpage>5641</fpage>
<lpage>5649</lpage>
<pub-id pub-id-type="pmid">18767830</pub-id>
</element-citation>
</ref>
<ref id="B46">
<label>46</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Bird</surname>
<given-names>AP</given-names>
</name>
</person-group>
<article-title>CpG-rich islands and the function of DNA methylation</article-title>
<source>Nature</source>
<year>1986</year>
<volume>321</volume>
<fpage>209</fpage>
<lpage>213</lpage>
<pub-id pub-id-type="pmid">2423876</pub-id>
</element-citation>
</ref>
<ref id="B47">
<label>47</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Sakumi</surname>
<given-names>K</given-names>
</name>
<name>
<surname>Furuichi</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Tsuzuki</surname>
<given-names>T</given-names>
</name>
<name>
<surname>Kakuma</surname>
<given-names>T</given-names>
</name>
<name>
<surname>Kawabata</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Maki</surname>
<given-names>H</given-names>
</name>
<name>
<surname>Sekiguchi</surname>
<given-names>M</given-names>
</name>
</person-group>
<article-title>Cloning and expression of cDNA for a human enzyme that hydrolyzes 8-oxo-dGTP, a mutagenic substrate for DNA synthesis</article-title>
<source>J. Biol. Chem.</source>
<year>1993</year>
<volume>268</volume>
<fpage>23524</fpage>
<lpage>23530</lpage>
<pub-id pub-id-type="pmid">8226881</pub-id>
</element-citation>
</ref>
<ref id="B48">
<label>48</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Bjoras</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Luna</surname>
<given-names>L</given-names>
</name>
<name>
<surname>Johnsen</surname>
<given-names>B</given-names>
</name>
<name>
<surname>Hoff</surname>
<given-names>E</given-names>
</name>
<name>
<surname>Haug</surname>
<given-names>T</given-names>
</name>
<name>
<surname>Rognes</surname>
<given-names>T</given-names>
</name>
<name>
<surname>Seeberg</surname>
<given-names>E</given-names>
</name>
</person-group>
<article-title>Opposite base-dependent reactions of a human base excision repair enzyme on DNA containing 7,8-dihydro-8-oxoguanine and abasic sites</article-title>
<source>EMBO J.</source>
<year>1997</year>
<volume>16</volume>
<fpage>6314</fpage>
<lpage>6322</lpage>
<pub-id pub-id-type="pmid">9321410</pub-id>
</element-citation>
</ref>
<ref id="B49">
<label>49</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Nash</surname>
<given-names>HM</given-names>
</name>
<name>
<surname>Bruner</surname>
<given-names>SD</given-names>
</name>
<name>
<surname>Scharer</surname>
<given-names>OD</given-names>
</name>
<name>
<surname>Kawate</surname>
<given-names>T</given-names>
</name>
<name>
<surname>Addona</surname>
<given-names>TA</given-names>
</name>
<name>
<surname>Spooner</surname>
<given-names>E</given-names>
</name>
<name>
<surname>Lane</surname>
<given-names>WS</given-names>
</name>
<name>
<surname>Verdine</surname>
<given-names>GL</given-names>
</name>
</person-group>
<article-title>Cloning of a yeast 8-oxoguanine DNA glycosylase reveals the existence of a base-excision DNA-repair protein superfamily</article-title>
<source>Curr. Biol.</source>
<year>1996</year>
<volume>6</volume>
<fpage>968</fpage>
<lpage>980</lpage>
<pub-id pub-id-type="pmid">8805338</pub-id>
</element-citation>
</ref>
<ref id="B50">
<label>50</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Slupska</surname>
<given-names>MM</given-names>
</name>
<name>
<surname>Baikalov</surname>
<given-names>C</given-names>
</name>
<name>
<surname>Luther</surname>
<given-names>WM</given-names>
</name>
<name>
<surname>Chiang</surname>
<given-names>JH</given-names>
</name>
<name>
<surname>Wei</surname>
<given-names>YF</given-names>
</name>
<name>
<surname>Miller</surname>
<given-names>JH</given-names>
</name>
</person-group>
<article-title>Cloning and sequencing a human homolog (hMYH) of the Escherichia coli mutY gene whose function is required for the repair of oxidative DNA damage</article-title>
<source>J. Bacteriol.</source>
<year>1996</year>
<volume>178</volume>
<fpage>3885</fpage>
<lpage>3892</lpage>
<pub-id pub-id-type="pmid">8682794</pub-id>
</element-citation>
</ref>
<ref id="B51">
<label>51</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>McCulloch</surname>
<given-names>SD</given-names>
</name>
<name>
<surname>Kokoska</surname>
<given-names>RJ</given-names>
</name>
<name>
<surname>Garg</surname>
<given-names>P</given-names>
</name>
<name>
<surname>Burgers</surname>
<given-names>PM</given-names>
</name>
<name>
<surname>Kunkel</surname>
<given-names>TA</given-names>
</name>
</person-group>
<article-title>The efficiency and fidelity of 8-oxo-guanine bypass by DNA polymerases δ and η</article-title>
<source>Nucleic Acids Res.</source>
<year>2009</year>
<volume>37</volume>
<fpage>2830</fpage>
<lpage>2840</lpage>
<pub-id pub-id-type="pmid">19282446</pub-id>
</element-citation>
</ref>
<ref id="B52">
<label>52</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Kresnak</surname>
<given-names>MT</given-names>
</name>
<name>
<surname>Davidson</surname>
<given-names>RL</given-names>
</name>
</person-group>
<article-title>Thymidine-induced mutations in mammalian cells: sequence specificity and implications for mutagenesis in vivo</article-title>
<source>Proc. Natl Acad. Sci. USA</source>
<year>1992</year>
<volume>89</volume>
<fpage>2829</fpage>
<lpage>2833</lpage>
<pub-id pub-id-type="pmid">1557389</pub-id>
</element-citation>
</ref>
<ref id="B53">
<label>53</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Platzer</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Hiller</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Szafranski</surname>
<given-names>K</given-names>
</name>
<name>
<surname>Jahn</surname>
<given-names>N</given-names>
</name>
<name>
<surname>Hampe</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Schreiber</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Backofen</surname>
<given-names>R</given-names>
</name>
<name>
<surname>Huse</surname>
<given-names>K</given-names>
</name>
</person-group>
<article-title>Sequencing errors or SNPs at splice-acceptor guanines in dbSNP?</article-title>
<source>Nat. Biotechnol.</source>
<year>2006</year>
<volume>24</volume>
<fpage>1068</fpage>
<lpage>1070</lpage>
<pub-id pub-id-type="pmid">16964207</pub-id>
</element-citation>
</ref>
<ref id="B54">
<label>54</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Siva</surname>
<given-names>N</given-names>
</name>
</person-group>
<article-title>1000 Genomes project</article-title>
<source>Nat. Biotechnol.</source>
<year>2008</year>
<volume>26</volume>
<fpage>256</fpage>
<pub-id pub-id-type="pmid">18327223</pub-id>
</element-citation>
</ref>
</ref-list>
</back>
</pmc>
</record>

To work with this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Sante/explor/MersV1/Data/Pmc/Corpus
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000F44 | SxmlIndent | more

Or

HfdSelect -h $EXPLOR_AREA/Data/Pmc/Corpus/biblio.hfd -nk 000F44 | SxmlIndent | more

To link to this page from within the Wicri network

{{Explor lien
   |wiki=    Sante
   |area=    MersV1
   |flux=    Pmc
   |étape=   Corpus
   |type=    RBID
   |clé=     PMC:2761265
   |texte=   The disruptive positions in human G-quadruplex motifs are less polymorphic and more conserved than their neutral counterparts
}}

To generate wiki pages

HfdIndexSelect -h $EXPLOR_AREA/Data/Pmc/Corpus/RBID.i   -Sk "pubmed:19617376" \
       | HfdSelect -Kh $EXPLOR_AREA/Data/Pmc/Corpus/biblio.hfd   \
       | NlmPubMed2Wicri -a MersV1 

Wicri

This area was generated with Dilib version V0.6.33.
Data generation: Mon Apr 20 23:26:43 2020. Site generation: Sat Mar 27 09:06:09 2021