MERS exploration server

Warning: this site is under development.
Warning: this site was generated automatically from raw corpora.
The information has therefore not been validated.

Clustering of DNA sequences in human promoters.

Internal identifier: 003075 (Main/Exploration); previous: 003074; next: 003076

Authors: Peter C. Fitzgerald [United States]; Andrey Shlyakhtenko; Alain A. Mir; Charles Vinson

Source: Genome research, 2004 (ISSN 1088-9051)

RBID : pubmed:15256515

Abstract

We have determined the distribution of each of the 65,536 DNA sequences that are eight bases long (8-mer) in a set of 13,010 human genomic promoter sequences aligned relative to the putative transcription start site (TSS). A limited number of 8-mers have peaks in their distribution (cluster), and most cluster within 100 bp of the TSS. The 156 DNA sequences exhibiting the greatest statistically significant clustering near the TSS can be placed into nine groups of related sequences. Each group is defined by a consensus sequence, and seven of these consensus sequences are known binding sites for the transcription factors (TFs) SP1, NF-Y, ETS, CREB, TBP, USF, and NRF-1. One sequence, which we named Clus1, is not a known TF binding site. The ninth sequence group is composed of the strand-specific Kozak sequence that clusters downstream of the TSS. An examination of the co-occurrence of these TF consensus sequences indicates a positive correlation for most of them except for sequences bound by TBP (the TATA box). Human mRNA expression data from 29 tissues indicate that the ETS, NRF-1, and Clus1 sequences that cluster are predominantly found in the promoters of housekeeping genes (e.g., ribosomal genes). In contrast, TATA is more abundant in the promoters of tissue-specific genes. This analysis identified eight DNA sequences in 5082 promoters that we suggest are important for regulating gene expression.
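The positional counting described in the abstract can be illustrated with a minimal Python sketch. It assumes each promoter is a plain string and that every sequence is aligned so the TSS sits at the same index; the function names, the `tss_index` parameter, and the toy sequences are hypothetical and not taken from the paper. The "peak" here is simply the most frequent offset, which stands in for, and is not, the authors' statistical clustering measure. With k = 8 there are 4^8 = 65,536 possible sequences, matching the number quoted above.

```python
from collections import defaultdict


def kmer_position_counts(promoters, tss_index, k=8):
    """Count, for every k-mer, how often it starts at each offset from the TSS.

    promoters: iterable of DNA strings, all aligned so that position
               `tss_index` (0-based) is the putative TSS (an assumption).
    Returns a mapping kmer -> {offset: count}; negative offsets are upstream.
    """
    counts = defaultdict(lambda: defaultdict(int))
    for seq in promoters:
        seq = seq.upper()
        for start in range(len(seq) - k + 1):
            kmer = seq[start:start + k]
            # Skip windows containing ambiguous bases such as N.
            if set(kmer) <= set("ACGT"):
                counts[kmer][start - tss_index] += 1
    return counts


def peak_offset(position_counts):
    """Return (offset, count) for the most frequent offset.

    Ties are broken by taking the smallest offset. This is a raw maximum,
    not a significance test.
    """
    return min(position_counts.items(), key=lambda item: (-item[1], item[0]))


# Toy data: two made-up 15-base "promoters" with the TSS assumed at index 10.
toy_promoters = ["CCTATAAAAGGGCGC", "GGTATAAAAGCCGTA"]
toy_counts = kmer_position_counts(toy_promoters, tss_index=10)
print(peak_offset(toy_counts["TATAAAAG"]))
```

On real data, one would run this over all promoter sequences and then apply a statistical test to decide which 8-mers cluster significantly; that step is deliberately left out here because the abstract does not specify it.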

DOI: 10.1101/gr.1953904
PubMed: 15256515


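Affiliations: Peter C. Fitzgerald, Genome Analysis Unit, National Cancer Institute, National Institutes of Health, Bethesda, Maryland 20892, USA.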



The document in XML format

<record>
<TEI>
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en">Clustering of DNA sequences in human promoters.</title>
<author>
<name sortKey="Fitzgerald, Peter C" sort="Fitzgerald, Peter C" uniqKey="Fitzgerald P" first="Peter C" last="Fitzgerald">Peter C. Fitzgerald</name>
<affiliation wicri:level="1">
<nlm:affiliation>Genome Analysis Unit, National Cancer Institute, National Institutes of Health, Bethesda, Maryland 20892, USA.</nlm:affiliation>
<country xml:lang="fr">États-Unis</country>
<wicri:regionArea>Genome Analysis Unit, National Cancer Institute, National Institutes of Health, Bethesda, Maryland 20892</wicri:regionArea>
<wicri:noRegion>Maryland 20892</wicri:noRegion>
</affiliation>
</author>
<author>
<name sortKey="Shlyakhtenko, Andrey" sort="Shlyakhtenko, Andrey" uniqKey="Shlyakhtenko A" first="Andrey" last="Shlyakhtenko">Andrey Shlyakhtenko</name>
</author>
<author>
<name sortKey="Mir, Alain A" sort="Mir, Alain A" uniqKey="Mir A" first="Alain A" last="Mir">Alain A. Mir</name>
</author>
<author>
<name sortKey="Vinson, Charles" sort="Vinson, Charles" uniqKey="Vinson C" first="Charles" last="Vinson">Charles Vinson</name>
</author>
</titleStmt>
<publicationStmt>
<idno type="wicri:source">PubMed</idno>
<date when="2004">2004</date>
<idno type="RBID">pubmed:15256515</idno>
<idno type="pmid">15256515</idno>
<idno type="doi">10.1101/gr.1953904</idno>
<idno type="wicri:Area/PubMed/Corpus">002381</idno>
<idno type="wicri:explorRef" wicri:stream="PubMed" wicri:step="Corpus" wicri:corpus="PubMed">002381</idno>
<idno type="wicri:Area/PubMed/Curation">002381</idno>
<idno type="wicri:explorRef" wicri:stream="PubMed" wicri:step="Curation">002381</idno>
<idno type="wicri:Area/PubMed/Checkpoint">002280</idno>
<idno type="wicri:explorRef" wicri:stream="Checkpoint" wicri:step="PubMed">002280</idno>
<idno type="wicri:Area/Ncbi/Merge">000290</idno>
<idno type="wicri:Area/Ncbi/Curation">000290</idno>
<idno type="wicri:Area/Ncbi/Checkpoint">000290</idno>
<idno type="wicri:doubleKey">1088-9051:2004:Fitzgerald P:clustering:of:dna</idno>
<idno type="wicri:Area/Main/Merge">003107</idno>
<idno type="wicri:Area/Main/Curation">003075</idno>
<idno type="wicri:Area/Main/Exploration">003075</idno>
</publicationStmt>
<sourceDesc>
<biblStruct>
<analytic>
<title xml:lang="en">Clustering of DNA sequences in human promoters.</title>
<author>
<name sortKey="Fitzgerald, Peter C" sort="Fitzgerald, Peter C" uniqKey="Fitzgerald P" first="Peter C" last="Fitzgerald">Peter C. Fitzgerald</name>
<affiliation wicri:level="1">
<nlm:affiliation>Genome Analysis Unit, National Cancer Institute, National Institutes of Health, Bethesda, Maryland 20892, USA.</nlm:affiliation>
<country xml:lang="fr">États-Unis</country>
<wicri:regionArea>Genome Analysis Unit, National Cancer Institute, National Institutes of Health, Bethesda, Maryland 20892</wicri:regionArea>
<wicri:noRegion>Maryland 20892</wicri:noRegion>
</affiliation>
</author>
<author>
<name sortKey="Shlyakhtenko, Andrey" sort="Shlyakhtenko, Andrey" uniqKey="Shlyakhtenko A" first="Andrey" last="Shlyakhtenko">Andrey Shlyakhtenko</name>
</author>
<author>
<name sortKey="Mir, Alain A" sort="Mir, Alain A" uniqKey="Mir A" first="Alain A" last="Mir">Alain A. Mir</name>
</author>
<author>
<name sortKey="Vinson, Charles" sort="Vinson, Charles" uniqKey="Vinson C" first="Charles" last="Vinson">Charles Vinson</name>
</author>
</analytic>
<series>
<title level="j">Genome research</title>
<idno type="ISSN">1088-9051</idno>
<imprint>
<date when="2004" type="published">2004</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
</fileDesc>
<profileDesc>
<textClass>
<keywords scheme="KwdEn" xml:lang="en">
<term>Base Sequence</term>
<term>Cluster Analysis</term>
<term>Computational Biology (methods)</term>
<term>Consensus Sequence</term>
<term>Humans</term>
<term>Models, Genetic</term>
<term>Molecular Sequence Data</term>
<term>Promoter Regions, Genetic</term>
<term>Transcription Initiation Site</term>
</keywords>
<keywords scheme="KwdFr" xml:lang="fr">
<term>Analyse de regroupements</term>
<term>Biologie informatique ()</term>
<term>Données de séquences moléculaires</term>
<term>Humains</term>
<term>Modèles génétiques</term>
<term>Régions promotrices (génétique)</term>
<term>Site d'initiation de la transcription</term>
<term>Séquence consensus</term>
<term>Séquence nucléotidique</term>
</keywords>
<keywords scheme="MESH" qualifier="methods" xml:lang="en">
<term>Computational Biology</term>
</keywords>
<keywords scheme="MESH" xml:lang="en">
<term>Base Sequence</term>
<term>Cluster Analysis</term>
<term>Consensus Sequence</term>
<term>Humans</term>
<term>Models, Genetic</term>
<term>Molecular Sequence Data</term>
<term>Promoter Regions, Genetic</term>
<term>Transcription Initiation Site</term>
</keywords>
<keywords scheme="MESH" xml:lang="fr">
<term>Analyse de regroupements</term>
<term>Biologie informatique</term>
<term>Données de séquences moléculaires</term>
<term>Humains</term>
<term>Modèles génétiques</term>
<term>Régions promotrices (génétique)</term>
<term>Site d'initiation de la transcription</term>
<term>Séquence consensus</term>
<term>Séquence nucléotidique</term>
</keywords>
</textClass>
</profileDesc>
</teiHeader>
<front>
<div type="abstract" xml:lang="en">We have determined the distribution of each of the 65,536 DNA sequences that are eight bases long (8-mer) in a set of 13,010 human genomic promoter sequences aligned relative to the putative transcription start site (TSS). A limited number of 8-mers have peaks in their distribution (cluster), and most cluster within 100 bp of the TSS. The 156 DNA sequences exhibiting the greatest statistically significant clustering near the TSS can be placed into nine groups of related sequences. Each group is defined by a consensus sequence, and seven of these consensus sequences are known binding sites for the transcription factors (TFs) SP1, NF-Y, ETS, CREB, TBP, USF, and NRF-1. One sequence, which we named Clus1, is not a known TF binding site. The ninth sequence group is composed of the strand-specific Kozak sequence that clusters downstream of the TSS. An examination of the co-occurrence of these TF consensus sequences indicates a positive correlation for most of them except for sequences bound by TBP (the TATA box). Human mRNA expression data from 29 tissues indicate that the ETS, NRF-1, and Clus1 sequences that cluster are predominantly found in the promoters of housekeeping genes (e.g., ribosomal genes). In contrast, TATA is more abundant in the promoters of tissue-specific genes. This analysis identified eight DNA sequences in 5082 promoters that we suggest are important for regulating gene expression.</div>
</front>
</TEI>
<affiliations>
<list>
<country>
<li>États-Unis</li>
</country>
</list>
<tree>
<noCountry>
<name sortKey="Mir, Alain A" sort="Mir, Alain A" uniqKey="Mir A" first="Alain A" last="Mir">Alain A. Mir</name>
<name sortKey="Shlyakhtenko, Andrey" sort="Shlyakhtenko, Andrey" uniqKey="Shlyakhtenko A" first="Andrey" last="Shlyakhtenko">Andrey Shlyakhtenko</name>
<name sortKey="Vinson, Charles" sort="Vinson, Charles" uniqKey="Vinson C" first="Charles" last="Vinson">Charles Vinson</name>
</noCountry>
<country name="États-Unis">
<noRegion>
<name sortKey="Fitzgerald, Peter C" sort="Fitzgerald, Peter C" uniqKey="Fitzgerald P" first="Peter C" last="Fitzgerald">Peter C. Fitzgerald</name>
</noRegion>
</country>
</tree>
</affiliations>
</record>

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Sante/explor/MersV1/Data/Main/Exploration
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 003075 | SxmlIndent | more

Or

HfdSelect -h $EXPLOR_AREA/Data/Main/Exploration/biblio.hfd -nk 003075 | SxmlIndent | more

To link to this page within the Wicri network

{{Explor lien
   |wiki=    Sante
   |area=    MersV1
   |flux=    Main
   |étape=   Exploration
   |type=    RBID
   |clé=     pubmed:15256515
   |texte=   Clustering of DNA sequences in human promoters.
}}

To generate wiki pages

HfdIndexSelect -h $EXPLOR_AREA/Data/Main/Exploration/RBID.i   -Sk "pubmed:15256515" \
       | HfdSelect -Kh $EXPLOR_AREA/Data/Main/Exploration/biblio.hfd   \
       | NlmPubMed2Wicri -a MersV1 

This area was generated with Dilib version V0.6.33.
Data generation: Mon Apr 20 23:26:43 2020. Site generation: Sat Mar 27 09:06:09 2021