MERS exploration server

Warning: this site is under development!
Warning: this site was generated automatically from raw corpora.
The information has therefore not been validated.

WordCluster: detecting clusters of DNA words and genomic elements

Internal identifier: 001340 (Pmc/Checkpoint); previous: 001339; next: 001341

Authors: Michael Hackenberg [Spain]; Pedro Carpena [Spain, United States]; Pedro Bernaola-Galván [Spain]; Guillermo Barturen [Spain]; Ángel M. Alganza [Spain]; José L. Oliver [Spain]

RBID : PMC:3037320

Abstract

Background

Many k-mers (or DNA words) and genomic elements are known to be spatially clustered in the genome. Well-established examples are genes, TFBSs, CpG dinucleotides, microRNA genes and ultra-conserved non-coding regions. Currently, no algorithm exists to find these clusters in a statistically comprehensible way. The detection of clustering often relies on densities and sliding-window approaches or arbitrarily chosen distance thresholds.

Results

We introduce here an algorithm to detect clusters of DNA words (k-mers), or any other genomic element, based on the distance between consecutive copies and an assigned statistical significance. We implemented the method as a web server connected to a MySQL backend, which also determines the co-localization with gene annotations. We demonstrate the usefulness of this approach by detecting the clusters of CAG/CTG (cytosine contexts that can be methylated in undifferentiated cells), showing that the degree of methylation varies drastically between the inside and the outside of the clusters. As another example, we used WordCluster to search for statistically significant clusters of olfactory receptor (OR) genes in the human genome.

Conclusions

WordCluster seems to predict biologically meaningful clusters of DNA words (k-mers) and genomic entities. The implementation of the method as a web server is available at http://bioinfo2.ugr.es/wordCluster/wordCluster.php, including additional features such as the detection of co-localization with gene regions and an annotation enrichment tool for functional analysis of overlapped genes.
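
The idea summarized in the Results paragraph (grouping copies by the distance between consecutive occurrences, then attaching a significance value to each group) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify how the distance threshold is chosen or which statistic is used, so the `max_dist` parameter, the binomial tail used as a p-value, and all function names below are assumptions made only for illustration.

```python
import math
from typing import List, Tuple


def find_words(seq: str, word: str) -> List[int]:
    """Return the 0-based start positions of all (possibly overlapping) copies of word."""
    positions = []
    start = seq.find(word)
    while start != -1:
        positions.append(start)
        start = seq.find(word, start + 1)
    return positions


def group_by_distance(positions: List[int], max_dist: int) -> List[List[int]]:
    """Chain consecutive copies whose start-to-start distance is <= max_dist."""
    if not positions:
        return []
    clusters = [[positions[0]]]
    for prev, pos in zip(positions, positions[1:]):
        if pos - prev <= max_dist:
            clusters[-1].append(pos)
        else:
            clusters.append([pos])
    return clusters


def binomial_tail(k_min: int, n: int, p: float) -> float:
    """P(X >= k_min) for X ~ Binomial(n, p), with each term computed in log space."""
    if k_min <= 0:
        return 1.0
    total = 0.0
    for k in range(k_min, n + 1):
        log_pmf = (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
                   + k * math.log(p) + (n - k) * math.log1p(-p))
        total += math.exp(log_pmf)
    return min(total, 1.0)


def significant_clusters(seq: str, word: str, max_dist: int,
                         alpha: float = 0.05,
                         min_copies: int = 2) -> List[Tuple[int, int, int, float]]:
    """Return (start, end, copies, p_value) for each cluster with p_value <= alpha.

    Assumed null model (hypothetical, chosen for simplicity): every position
    starts a copy independently with probability p = copies / sequence length.
    """
    positions = find_words(seq, word)
    if not positions:
        return []
    p = min(max(len(positions) / len(seq), 1e-12), 1 - 1e-12)
    results = []
    for cluster in group_by_distance(positions, max_dist):
        copies = len(cluster)
        if copies < min_copies:
            continue
        span = cluster[-1] - cluster[0]  # positions following the first copy
        p_value = binomial_tail(copies - 1, span, p)
        if p_value <= alpha:
            results.append((cluster[0], cluster[-1] + len(word), copies, p_value))
    return results


if __name__ == "__main__":
    # Toy sequence: five adjacent CAG copies, then one isolated copy.
    toy = "CAG" * 5 + "T" * 300 + "CAG" + "T" * 300
    for start, end, copies, p_value in significant_clusters(toy, "CAG", max_dist=10):
        print(start, end, copies, f"{p_value:.2e}")
```

In this toy input the five adjacent CAG copies form one cluster and the isolated copy is discarded because it falls below `min_copies`. For real genomic data the threshold and the null model would need to follow the published method rather than these placeholder choices.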


DOI: 10.1186/1748-7188-6-2
PubMed: 21261981
PubMed Central: 3037320


The document in XML format

<record>
<TEI>
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en">
<italic>WordCluster</italic>
: detecting clusters of DNA words and genomic elements</title>
<author>
<name sortKey="Hackenberg, Michael" sort="Hackenberg, Michael" uniqKey="Hackenberg M" first="Michael" last="Hackenberg">Michael Hackenberg</name>
<affiliation wicri:level="4">
<nlm:aff id="I1">Dpto. de Genética, Facultad de Ciencias, Universidad de Granada, Campus de Fuentenueva s/n, 18071-Granada & Lab. de Bioinformática, Centro de Investigación Biomédica, PTS, Avda. del Conocimiento s/n, 18100-Granada, Spain</nlm:aff>
<country xml:lang="fr">Espagne</country>
<wicri:regionArea>Dpto. de Genética, Facultad de Ciencias, Universidad de Granada, Campus de Fuentenueva s/n, 18071-Granada & Lab. de Bioinformática, Centro de Investigación Biomédica, PTS, Avda. del Conocimiento s/n, 18100-Granada</wicri:regionArea>
<placeName>
<region nuts="2" type="communauté">Andalousie</region>
<settlement type="city">Grenade (Espagne)</settlement>
</placeName>
<orgName type="university">Université de Grenade</orgName>
</affiliation>
</author>
<author>
<name sortKey="Carpena, Pedro" sort="Carpena, Pedro" uniqKey="Carpena P" first="Pedro" last="Carpena">Pedro Carpena</name>
<affiliation wicri:level="2">
<nlm:aff id="I2">Dpto. de Física Aplicada II, E.T.S.I. de Telecomunicación, Universidad de Málaga 29071-Malaga, Spain</nlm:aff>
<country xml:lang="fr">Espagne</country>
<wicri:regionArea>Dpto. de Física Aplicada II, E.T.S.I. de Telecomunicación, Universidad de Málaga 29071-Malaga</wicri:regionArea>
<placeName>
<region nuts="2" type="communauté">Andalousie</region>
</placeName>
</affiliation>
<affiliation wicri:level="2">
<nlm:aff id="I3">Division of Sleep Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, MA 02115, USA</nlm:aff>
<country xml:lang="fr">États-Unis</country>
<wicri:regionArea>Division of Sleep Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, MA 02115</wicri:regionArea>
<placeName>
<region type="state">Massachusetts</region>
</placeName>
</affiliation>
</author>
<author>
<name sortKey="Bernaola Galvan, Pedro" sort="Bernaola Galvan, Pedro" uniqKey="Bernaola Galvan P" first="Pedro" last="Bernaola-Galván">Pedro Bernaola-Galván</name>
<affiliation wicri:level="2">
<nlm:aff id="I2">Dpto. de Física Aplicada II, E.T.S.I. de Telecomunicación, Universidad de Málaga 29071-Malaga, Spain</nlm:aff>
<country xml:lang="fr">Espagne</country>
<wicri:regionArea>Dpto. de Física Aplicada II, E.T.S.I. de Telecomunicación, Universidad de Málaga 29071-Malaga</wicri:regionArea>
<placeName>
<region nuts="2" type="communauté">Andalousie</region>
</placeName>
</affiliation>
</author>
<author>
<name sortKey="Barturen, Guillermo" sort="Barturen, Guillermo" uniqKey="Barturen G" first="Guillermo" last="Barturen">Guillermo Barturen</name>
<affiliation wicri:level="4">
<nlm:aff id="I1">Dpto. de Genética, Facultad de Ciencias, Universidad de Granada, Campus de Fuentenueva s/n, 18071-Granada & Lab. de Bioinformática, Centro de Investigación Biomédica, PTS, Avda. del Conocimiento s/n, 18100-Granada, Spain</nlm:aff>
<country xml:lang="fr">Espagne</country>
<wicri:regionArea>Dpto. de Genética, Facultad de Ciencias, Universidad de Granada, Campus de Fuentenueva s/n, 18071-Granada & Lab. de Bioinformática, Centro de Investigación Biomédica, PTS, Avda. del Conocimiento s/n, 18100-Granada</wicri:regionArea>
<placeName>
<region nuts="2" type="communauté">Andalousie</region>
<settlement type="city">Grenade (Espagne)</settlement>
</placeName>
<orgName type="university">Université de Grenade</orgName>
</affiliation>
</author>
<author>
<name sortKey="Alganza, Angel M" sort="Alganza, Angel M" uniqKey="Alganza A" first="Ángel M" last="Alganza">Ángel M. Alganza</name>
<affiliation wicri:level="4">
<nlm:aff id="I1">Dpto. de Genética, Facultad de Ciencias, Universidad de Granada, Campus de Fuentenueva s/n, 18071-Granada & Lab. de Bioinformática, Centro de Investigación Biomédica, PTS, Avda. del Conocimiento s/n, 18100-Granada, Spain</nlm:aff>
<country xml:lang="fr">Espagne</country>
<wicri:regionArea>Dpto. de Genética, Facultad de Ciencias, Universidad de Granada, Campus de Fuentenueva s/n, 18071-Granada & Lab. de Bioinformática, Centro de Investigación Biomédica, PTS, Avda. del Conocimiento s/n, 18100-Granada</wicri:regionArea>
<placeName>
<region nuts="2" type="communauté">Andalousie</region>
<settlement type="city">Grenade (Espagne)</settlement>
</placeName>
<orgName type="university">Université de Grenade</orgName>
</affiliation>
</author>
<author>
<name sortKey="Oliver, Jose L" sort="Oliver, Jose L" uniqKey="Oliver J" first="José L" last="Oliver">José L. Oliver</name>
<affiliation wicri:level="4">
<nlm:aff id="I1">Dpto. de Genética, Facultad de Ciencias, Universidad de Granada, Campus de Fuentenueva s/n, 18071-Granada & Lab. de Bioinformática, Centro de Investigación Biomédica, PTS, Avda. del Conocimiento s/n, 18100-Granada, Spain</nlm:aff>
<country xml:lang="fr">Espagne</country>
<wicri:regionArea>Dpto. de Genética, Facultad de Ciencias, Universidad de Granada, Campus de Fuentenueva s/n, 18071-Granada & Lab. de Bioinformática, Centro de Investigación Biomédica, PTS, Avda. del Conocimiento s/n, 18100-Granada</wicri:regionArea>
<placeName>
<region nuts="2" type="communauté">Andalousie</region>
<settlement type="city">Grenade (Espagne)</settlement>
</placeName>
<orgName type="university">Université de Grenade</orgName>
</affiliation>
</author>
</titleStmt>
<publicationStmt>
<idno type="wicri:source">PMC</idno>
<idno type="pmid">21261981</idno>
<idno type="pmc">3037320</idno>
<idno type="url">http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3037320</idno>
<idno type="RBID">PMC:3037320</idno>
<idno type="doi">10.1186/1748-7188-6-2</idno>
<date when="2011">2011</date>
<idno type="wicri:Area/Pmc/Corpus">000A32</idno>
<idno type="wicri:explorRef" wicri:stream="Pmc" wicri:step="Corpus" wicri:corpus="PMC">000A32</idno>
<idno type="wicri:Area/Pmc/Curation">000A32</idno>
<idno type="wicri:explorRef" wicri:stream="Pmc" wicri:step="Curation">000A32</idno>
<idno type="wicri:Area/Pmc/Checkpoint">001340</idno>
<idno type="wicri:explorRef" wicri:stream="Pmc" wicri:step="Checkpoint">001340</idno>
</publicationStmt>
<sourceDesc>
<biblStruct>
<analytic>
<title xml:lang="en" level="a" type="main">
<italic>WordCluster</italic>
: detecting clusters of DNA words and genomic elements</title>
<author>
<name sortKey="Hackenberg, Michael" sort="Hackenberg, Michael" uniqKey="Hackenberg M" first="Michael" last="Hackenberg">Michael Hackenberg</name>
<affiliation wicri:level="4">
<nlm:aff id="I1">Dpto. de Genética, Facultad de Ciencias, Universidad de Granada, Campus de Fuentenueva s/n, 18071-Granada & Lab. de Bioinformática, Centro de Investigación Biomédica, PTS, Avda. del Conocimiento s/n, 18100-Granada, Spain</nlm:aff>
<country xml:lang="fr">Espagne</country>
<wicri:regionArea>Dpto. de Genética, Facultad de Ciencias, Universidad de Granada, Campus de Fuentenueva s/n, 18071-Granada & Lab. de Bioinformática, Centro de Investigación Biomédica, PTS, Avda. del Conocimiento s/n, 18100-Granada</wicri:regionArea>
<placeName>
<region nuts="2" type="communauté">Andalousie</region>
<settlement type="city">Grenade (Espagne)</settlement>
</placeName>
<orgName type="university">Université de Grenade</orgName>
</affiliation>
</author>
<author>
<name sortKey="Carpena, Pedro" sort="Carpena, Pedro" uniqKey="Carpena P" first="Pedro" last="Carpena">Pedro Carpena</name>
<affiliation wicri:level="2">
<nlm:aff id="I2">Dpto. de Física Aplicada II, E.T.S.I. de Telecomunicación, Universidad de Málaga 29071-Malaga, Spain</nlm:aff>
<country xml:lang="fr">Espagne</country>
<wicri:regionArea>Dpto. de Física Aplicada II, E.T.S.I. de Telecomunicación, Universidad de Málaga 29071-Malaga</wicri:regionArea>
<placeName>
<region nuts="2" type="communauté">Andalousie</region>
</placeName>
</affiliation>
<affiliation wicri:level="2">
<nlm:aff id="I3">Division of Sleep Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, MA 02115, USA</nlm:aff>
<country xml:lang="fr">États-Unis</country>
<wicri:regionArea>Division of Sleep Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, MA 02115</wicri:regionArea>
<placeName>
<region type="state">Massachusetts</region>
</placeName>
</affiliation>
</author>
<author>
<name sortKey="Bernaola Galvan, Pedro" sort="Bernaola Galvan, Pedro" uniqKey="Bernaola Galvan P" first="Pedro" last="Bernaola-Galván">Pedro Bernaola-Galván</name>
<affiliation wicri:level="2">
<nlm:aff id="I2">Dpto. de Física Aplicada II, E.T.S.I. de Telecomunicación, Universidad de Málaga 29071-Malaga, Spain</nlm:aff>
<country xml:lang="fr">Espagne</country>
<wicri:regionArea>Dpto. de Física Aplicada II, E.T.S.I. de Telecomunicación, Universidad de Málaga 29071-Malaga</wicri:regionArea>
<placeName>
<region nuts="2" type="communauté">Andalousie</region>
</placeName>
</affiliation>
</author>
<author>
<name sortKey="Barturen, Guillermo" sort="Barturen, Guillermo" uniqKey="Barturen G" first="Guillermo" last="Barturen">Guillermo Barturen</name>
<affiliation wicri:level="4">
<nlm:aff id="I1">Dpto. de Genética, Facultad de Ciencias, Universidad de Granada, Campus de Fuentenueva s/n, 18071-Granada & Lab. de Bioinformática, Centro de Investigación Biomédica, PTS, Avda. del Conocimiento s/n, 18100-Granada, Spain</nlm:aff>
<country xml:lang="fr">Espagne</country>
<wicri:regionArea>Dpto. de Genética, Facultad de Ciencias, Universidad de Granada, Campus de Fuentenueva s/n, 18071-Granada & Lab. de Bioinformática, Centro de Investigación Biomédica, PTS, Avda. del Conocimiento s/n, 18100-Granada</wicri:regionArea>
<placeName>
<region nuts="2" type="communauté">Andalousie</region>
<settlement type="city">Grenade (Espagne)</settlement>
</placeName>
<orgName type="university">Université de Grenade</orgName>
</affiliation>
</author>
<author>
<name sortKey="Alganza, Angel M" sort="Alganza, Angel M" uniqKey="Alganza A" first="Ángel M" last="Alganza">Ángel M. Alganza</name>
<affiliation wicri:level="4">
<nlm:aff id="I1">Dpto. de Genética, Facultad de Ciencias, Universidad de Granada, Campus de Fuentenueva s/n, 18071-Granada & Lab. de Bioinformática, Centro de Investigación Biomédica, PTS, Avda. del Conocimiento s/n, 18100-Granada, Spain</nlm:aff>
<country xml:lang="fr">Espagne</country>
<wicri:regionArea>Dpto. de Genética, Facultad de Ciencias, Universidad de Granada, Campus de Fuentenueva s/n, 18071-Granada & Lab. de Bioinformática, Centro de Investigación Biomédica, PTS, Avda. del Conocimiento s/n, 18100-Granada</wicri:regionArea>
<placeName>
<region nuts="2" type="communauté">Andalousie</region>
<settlement type="city">Grenade (Espagne)</settlement>
</placeName>
<orgName type="university">Université de Grenade</orgName>
</affiliation>
</author>
<author>
<name sortKey="Oliver, Jose L" sort="Oliver, Jose L" uniqKey="Oliver J" first="José L" last="Oliver">José L. Oliver</name>
<affiliation wicri:level="4">
<nlm:aff id="I1">Dpto. de Genética, Facultad de Ciencias, Universidad de Granada, Campus de Fuentenueva s/n, 18071-Granada & Lab. de Bioinformática, Centro de Investigación Biomédica, PTS, Avda. del Conocimiento s/n, 18100-Granada, Spain</nlm:aff>
<country xml:lang="fr">Espagne</country>
<wicri:regionArea>Dpto. de Genética, Facultad de Ciencias, Universidad de Granada, Campus de Fuentenueva s/n, 18071-Granada & Lab. de Bioinformática, Centro de Investigación Biomédica, PTS, Avda. del Conocimiento s/n, 18100-Granada</wicri:regionArea>
<placeName>
<region nuts="2" type="communauté">Andalousie</region>
<settlement type="city">Grenade (Espagne)</settlement>
</placeName>
<orgName type="university">Université de Grenade</orgName>
</affiliation>
</author>
</analytic>
<series>
<title level="j">Algorithms for Molecular Biology : AMB</title>
<idno type="eISSN">1748-7188</idno>
<imprint>
<date when="2011">2011</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
</fileDesc>
<profileDesc>
<textClass></textClass>
</profileDesc>
</teiHeader>
<front>
<div type="abstract" xml:lang="en">
<sec>
<title>Background</title>
<p>Many
<italic>k-</italic>
mers (or DNA words) and genomic elements are known to be spatially clustered in the genome. Well-established examples are genes, TFBSs, CpG dinucleotides, microRNA genes and ultra-conserved non-coding regions. Currently, no algorithm exists to find these clusters in a statistically comprehensible way. The detection of clustering often relies on densities and sliding-window approaches or arbitrarily chosen distance thresholds.</p>
</sec>
<sec>
<title>Results</title>
<p>We introduce here an algorithm to detect clusters of DNA words (
<italic>k-</italic>
mers), or any other genomic element, based on the distance between consecutive copies and an assigned statistical significance. We implemented the method as a web server connected to a MySQL backend, which also determines the co-localization with gene annotations. We demonstrate the usefulness of this approach by detecting the clusters of CAG/CTG (cytosine contexts that can be methylated in undifferentiated cells), showing that the degree of methylation varies drastically between the inside and the outside of the clusters. As another example, we used
<italic>WordCluster </italic>
to search for statistically significant clusters of olfactory receptor (OR) genes in the human genome.</p>
</sec>
<sec>
<title>Conclusions</title>
<p>
<italic>WordCluster </italic>
seems to predict biologically meaningful clusters of DNA words (
<italic>k-</italic>
mers) and genomic entities. The implementation of the method into a web server is available at
<ext-link ext-link-type="uri" xlink:href="http://bioinfo2.ugr.es/wordCluster/wordCluster.php">http://bioinfo2.ugr.es/wordCluster/wordCluster.php</ext-link>
including additional features like the detection of co-localization with gene regions or the annotation enrichment tool for functional analysis of overlapped genes.</p>
</sec>
</div>
</front>
<back>
<div1 type="bibliography">
<listBibl>
<biblStruct>
<analytic>
<author>
<name sortKey="Durand, D" uniqKey="Durand D">D Durand</name>
</author>
<author>
<name sortKey="Sankoff, D" uniqKey="Sankoff D">D Sankoff</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Gardiner Garden, M" uniqKey="Gardiner Garden M">M Gardiner-Garden</name>
</author>
<author>
<name sortKey="Frommer, M" uniqKey="Frommer M">M Frommer</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Makeev, Vj" uniqKey="Makeev V">VJ Makeev</name>
</author>
<author>
<name sortKey="Lifanov, Ap" uniqKey="Lifanov A">AP Lifanov</name>
</author>
<author>
<name sortKey="Nazina, Ag" uniqKey="Nazina A">AG Nazina</name>
</author>
<author>
<name sortKey="Papatsenko, Da" uniqKey="Papatsenko D">DA Papatsenko</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Sandelin, A" uniqKey="Sandelin A">A Sandelin</name>
</author>
<author>
<name sortKey="Bailey, P" uniqKey="Bailey P">P Bailey</name>
</author>
<author>
<name sortKey="Bruce, S" uniqKey="Bruce S">S Bruce</name>
</author>
<author>
<name sortKey="Engstrom, Pg" uniqKey="Engstrom P">PG Engstrom</name>
</author>
<author>
<name sortKey="Klos, Jm" uniqKey="Klos J">JM Klos</name>
</author>
<author>
<name sortKey="Wasserman, Ww" uniqKey="Wasserman W">WW Wasserman</name>
</author>
<author>
<name sortKey="Ericson, J" uniqKey="Ericson J">J Ericson</name>
</author>
<author>
<name sortKey="Lenhard, B" uniqKey="Lenhard B">B Lenhard</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Carpena, P" uniqKey="Carpena P">P Carpena</name>
</author>
<author>
<name sortKey="Bernaola Galvan, P" uniqKey="Bernaola Galvan P">P Bernaola-Galván</name>
</author>
<author>
<name sortKey="Hackenberg, M" uniqKey="Hackenberg M">M Hackenberg</name>
</author>
<author>
<name sortKey="Coronado, Av" uniqKey="Coronado A">AV Coronado</name>
</author>
<author>
<name sortKey="Oliver, Jl" uniqKey="Oliver J">JL Oliver</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Giardine, B" uniqKey="Giardine B">B Giardine</name>
</author>
<author>
<name sortKey="Riemer, C" uniqKey="Riemer C">C Riemer</name>
</author>
<author>
<name sortKey="Hardison, Rc" uniqKey="Hardison R">RC Hardison</name>
</author>
<author>
<name sortKey="Burhans, R" uniqKey="Burhans R">R Burhans</name>
</author>
<author>
<name sortKey="Elnitski, L" uniqKey="Elnitski L">L Elnitski</name>
</author>
<author>
<name sortKey="Shah, P" uniqKey="Shah P">P Shah</name>
</author>
<author>
<name sortKey="Zhang, Y" uniqKey="Zhang Y">Y Zhang</name>
</author>
<author>
<name sortKey="Blankenberg, D" uniqKey="Blankenberg D">D Blankenberg</name>
</author>
<author>
<name sortKey="Albert, I" uniqKey="Albert I">I Albert</name>
</author>
<author>
<name sortKey="Taylor, J" uniqKey="Taylor J">J Taylor</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Hackenberg, M" uniqKey="Hackenberg M">M Hackenberg</name>
</author>
<author>
<name sortKey="Previti, C" uniqKey="Previti C">C Previti</name>
</author>
<author>
<name sortKey="Luque Escamilla, Pl" uniqKey="Luque Escamilla P">PL Luque-Escamilla</name>
</author>
<author>
<name sortKey="Carpena, P" uniqKey="Carpena P">P Carpena</name>
</author>
<author>
<name sortKey="Martinez Aroza, J" uniqKey="Martinez Aroza J">J Martínez-Aroza</name>
</author>
<author>
<name sortKey="Oliver, Jl" uniqKey="Oliver J">JL Oliver</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Karolchik, D" uniqKey="Karolchik D">D Karolchik</name>
</author>
<author>
<name sortKey="Kuhn, Rm" uniqKey="Kuhn R">RM Kuhn</name>
</author>
<author>
<name sortKey="Baertsch, R" uniqKey="Baertsch R">R Baertsch</name>
</author>
<author>
<name sortKey="Barber, Gp" uniqKey="Barber G">GP Barber</name>
</author>
<author>
<name sortKey="Clawson, H" uniqKey="Clawson H">H Clawson</name>
</author>
<author>
<name sortKey="Diekhans, M" uniqKey="Diekhans M">M Diekhans</name>
</author>
<author>
<name sortKey="Giardine, B" uniqKey="Giardine B">B Giardine</name>
</author>
<author>
<name sortKey="Harte, Ra" uniqKey="Harte R">RA Harte</name>
</author>
<author>
<name sortKey="Hinrichs, As" uniqKey="Hinrichs A">AS Hinrichs</name>
</author>
<author>
<name sortKey="Hsu, F" uniqKey="Hsu F">F Hsu</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Quinlan, Ar" uniqKey="Quinlan A">AR Quinlan</name>
</author>
<author>
<name sortKey="Hall, Im" uniqKey="Hall I">IM Hall</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Ashburner, M" uniqKey="Ashburner M">M Ashburner</name>
</author>
<author>
<name sortKey="Ball, Ca" uniqKey="Ball C">CA Ball</name>
</author>
<author>
<name sortKey="Blake, Ja" uniqKey="Blake J">JA Blake</name>
</author>
<author>
<name sortKey="Botstein, D" uniqKey="Botstein D">D Botstein</name>
</author>
<author>
<name sortKey="Butler, H" uniqKey="Butler H">H Butler</name>
</author>
<author>
<name sortKey="Cherry, Jm" uniqKey="Cherry J">JM Cherry</name>
</author>
<author>
<name sortKey="Davis, Ap" uniqKey="Davis A">AP Davis</name>
</author>
<author>
<name sortKey="Dolinski, K" uniqKey="Dolinski K">K Dolinski</name>
</author>
<author>
<name sortKey="Dwight, Ss" uniqKey="Dwight S">SS Dwight</name>
</author>
<author>
<name sortKey="Eppig, Jt" uniqKey="Eppig J">JT Eppig</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Hackenberg, M" uniqKey="Hackenberg M">M Hackenberg</name>
</author>
<author>
<name sortKey="Matthiesen, R" uniqKey="Matthiesen R">R Matthiesen</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Hackenberg, M" uniqKey="Hackenberg M">M Hackenberg</name>
</author>
<author>
<name sortKey="Matthiesen, R" uniqKey="Matthiesen R">R Matthiesen</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Pruitt, Kd" uniqKey="Pruitt K">KD Pruitt</name>
</author>
<author>
<name sortKey="Tatusova, T" uniqKey="Tatusova T">T Tatusova</name>
</author>
<author>
<name sortKey="Maglott, Dr" uniqKey="Maglott D">DR Maglott</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Hubbard, Tj" uniqKey="Hubbard T">TJ Hubbard</name>
</author>
<author>
<name sortKey="Aken, Bl" uniqKey="Aken B">BL Aken</name>
</author>
<author>
<name sortKey="Ayling, S" uniqKey="Ayling S">S Ayling</name>
</author>
<author>
<name sortKey="Ballester, B" uniqKey="Ballester B">B Ballester</name>
</author>
<author>
<name sortKey="Beal, K" uniqKey="Beal K">K Beal</name>
</author>
<author>
<name sortKey="Bragin, E" uniqKey="Bragin E">E Bragin</name>
</author>
<author>
<name sortKey="Brent, S" uniqKey="Brent S">S Brent</name>
</author>
<author>
<name sortKey="Chen, Y" uniqKey="Chen Y">Y Chen</name>
</author>
<author>
<name sortKey="Clapham, P" uniqKey="Clapham P">P Clapham</name>
</author>
<author>
<name sortKey="Clarke, L" uniqKey="Clarke L">L Clarke</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Hackenberg, M" uniqKey="Hackenberg M">M Hackenberg</name>
</author>
<author>
<name sortKey="Barturen, G" uniqKey="Barturen G">G Barturen</name>
</author>
<author>
<name sortKey="Carpena, P" uniqKey="Carpena P">P Carpena</name>
</author>
<author>
<name sortKey="Luque Escamilla, Pl" uniqKey="Luque Escamilla P">PL Luque-Escamilla</name>
</author>
<author>
<name sortKey="Previti, C" uniqKey="Previti C">C Previti</name>
</author>
<author>
<name sortKey="Oliver, Jl" uniqKey="Oliver J">JL Oliver</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Siepel, A" uniqKey="Siepel A">A Siepel</name>
</author>
<author>
<name sortKey="Bejerano, G" uniqKey="Bejerano G">G Bejerano</name>
</author>
<author>
<name sortKey="Pedersen, Js" uniqKey="Pedersen J">JS Pedersen</name>
</author>
<author>
<name sortKey="Hinrichs, As" uniqKey="Hinrichs A">AS Hinrichs</name>
</author>
<author>
<name sortKey="Hou, M" uniqKey="Hou M">M Hou</name>
</author>
<author>
<name sortKey="Rosenbloom, K" uniqKey="Rosenbloom K">K Rosenbloom</name>
</author>
<author>
<name sortKey="Clawson, H" uniqKey="Clawson H">H Clawson</name>
</author>
<author>
<name sortKey="Spieth, J" uniqKey="Spieth J">J Spieth</name>
</author>
<author>
<name sortKey="Hillier, Lw" uniqKey="Hillier L">LW Hillier</name>
</author>
<author>
<name sortKey="Richards, S" uniqKey="Richards S">S Richards</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Lister, R" uniqKey="Lister R">R Lister</name>
</author>
<author>
<name sortKey="Pelizzola, M" uniqKey="Pelizzola M">M Pelizzola</name>
</author>
<author>
<name sortKey="Dowen, Rh" uniqKey="Dowen R">RH Dowen</name>
</author>
<author>
<name sortKey="Hawkins, Rd" uniqKey="Hawkins R">RD Hawkins</name>
</author>
<author>
<name sortKey="Hon, G" uniqKey="Hon G">G Hon</name>
</author>
<author>
<name sortKey="Tonti Filippini, J" uniqKey="Tonti Filippini J">J Tonti-Filippini</name>
</author>
<author>
<name sortKey="Nery, Jr" uniqKey="Nery J">JR Nery</name>
</author>
<author>
<name sortKey="Lee, L" uniqKey="Lee L">L Lee</name>
</author>
<author>
<name sortKey="Ye, Z" uniqKey="Ye Z">Z Ye</name>
</author>
<author>
<name sortKey="Ngo, Qm" uniqKey="Ngo Q">QM Ngo</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Aloni, R" uniqKey="Aloni R">R Aloni</name>
</author>
<author>
<name sortKey="Olender, T" uniqKey="Olender T">T Olender</name>
</author>
<author>
<name sortKey="Lancet, D" uniqKey="Lancet D">D Lancet</name>
</author>
</analytic>
</biblStruct>
<biblStruct></biblStruct>
</listBibl>
</div1>
</back>
</TEI>
<pmc article-type="research-article">
<pmc-dir>properties open_access</pmc-dir>
<front>
<journal-meta>
<journal-id journal-id-type="nlm-ta">Algorithms Mol Biol</journal-id>
<journal-title-group>
<journal-title>Algorithms for Molecular Biology : AMB</journal-title>
</journal-title-group>
<issn pub-type="epub">1748-7188</issn>
<publisher>
<publisher-name>BioMed Central</publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="pmid">21261981</article-id>
<article-id pub-id-type="pmc">3037320</article-id>
<article-id pub-id-type="publisher-id">1748-7188-6-2</article-id>
<article-id pub-id-type="doi">10.1186/1748-7188-6-2</article-id>
<article-categories>
<subj-group subj-group-type="heading">
<subject>Software Article</subject>
</subj-group>
</article-categories>
<title-group>
<article-title>
<italic>WordCluster</italic>
: detecting clusters of DNA words and genomic elements</article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author" corresp="yes" id="A1">
<name>
<surname>Hackenberg</surname>
<given-names>Michael</given-names>
</name>
<xref ref-type="aff" rid="I1">1</xref>
<email>mlhack@gmail.com</email>
</contrib>
<contrib contrib-type="author" id="A2">
<name>
<surname>Carpena</surname>
<given-names>Pedro</given-names>
</name>
<xref ref-type="aff" rid="I2">2</xref>
<xref ref-type="aff" rid="I3">3</xref>
<email>pcarpena@ctima.uma.es</email>
</contrib>
<contrib contrib-type="author" id="A3">
<name>
<surname>Bernaola-Galván</surname>
<given-names>Pedro</given-names>
</name>
<xref ref-type="aff" rid="I2">2</xref>
<email>rick@uma.es</email>
</contrib>
<contrib contrib-type="author" id="A4">
<name>
<surname>Barturen</surname>
<given-names>Guillermo</given-names>
</name>
<xref ref-type="aff" rid="I1">1</xref>
<email>bartg01@gmail.com</email>
</contrib>
<contrib contrib-type="author" id="A5">
<name>
<surname>Alganza</surname>
<given-names>Ángel M</given-names>
</name>
<xref ref-type="aff" rid="I1">1</xref>
<email>ama@ugr.es</email>
</contrib>
<contrib contrib-type="author" corresp="yes" id="A6">
<name>
<surname>Oliver</surname>
<given-names>José L</given-names>
</name>
<xref ref-type="aff" rid="I1">1</xref>
<email>oliver@ugr.es</email>
</contrib>
</contrib-group>
<aff id="I1">
<label>1</label>
Dpto. de Genética, Facultad de Ciencias, Universidad de Granada, Campus de Fuentenueva s/n, 18071-Granada & Lab. de Bioinformática, Centro de Investigación Biomédica, PTS, Avda. del Conocimiento s/n, 18100-Granada, Spain</aff>
<aff id="I2">
<label>2</label>
Dpto. de Física Aplicada II, E.T.S.I. de Telecomunicación, Universidad de Málaga 29071-Malaga, Spain</aff>
<aff id="I3">
<label>3</label>
Division of Sleep Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, MA 02115, USA</aff>
<pub-date pub-type="collection">
<year>2011</year>
</pub-date>
<pub-date pub-type="epub">
<day>24</day>
<month>1</month>
<year>2011</year>
</pub-date>
<volume>6</volume>
<fpage>2</fpage>
<lpage>2</lpage>
<history>
<date date-type="received">
<day>30</day>
<month>8</month>
<year>2010</year>
</date>
<date date-type="accepted">
<day>24</day>
<month>1</month>
<year>2011</year>
</date>
</history>
<permissions>
<copyright-statement>Copyright ©2011 Hackenberg et al; licensee BioMed Central Ltd.</copyright-statement>
<copyright-year>2011</copyright-year>
<copyright-holder>Hackenberg et al; licensee BioMed Central Ltd.</copyright-holder>
<license license-type="open-access" xlink:href="http://creativecommons.org/licenses/by/2.0">
<license-p>This is an Open Access article distributed under the terms of the Creative Commons Attribution License (
<ext-link ext-link-type="uri" xlink:href="http://creativecommons.org/licenses/by/2.0">http://creativecommons.org/licenses/by/2.0</ext-link>
), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</license-p>
</license>
</permissions>
<self-uri xlink:href="http://www.almob.org/content/6/1/2"></self-uri>
<abstract>
<sec>
<title>Background</title>
<p>Many
<italic>k-</italic>
mers (or DNA words) and genomic elements are known to be spatially clustered in the genome. Well-established examples are genes, TFBSs, CpG dinucleotides, microRNA genes and ultra-conserved non-coding regions. Currently, no algorithm exists to find these clusters in a statistically comprehensible way. The detection of clustering often relies on densities and sliding-window approaches or arbitrarily chosen distance thresholds.</p>
</sec>
<sec>
<title>Results</title>
<p>We introduce here an algorithm to detect clusters of DNA words (
<italic>k-</italic>
mers), or any other genomic element, based on the distance between consecutive copies and an assigned statistical significance. We implemented the method as a web server connected to a MySQL backend, which also determines the co-localization with gene annotations. We demonstrate the usefulness of this approach by detecting the clusters of CAG/CTG (cytosine contexts that can be methylated in undifferentiated cells), showing that the degree of methylation varies drastically between the inside and the outside of the clusters. As another example, we used
<italic>WordCluster </italic>
to search for statistically significant clusters of olfactory receptor (OR) genes in the human genome.</p>
</sec>
<sec>
<title>Conclusions</title>
<p>
<italic>WordCluster </italic>
seems to predict biologically meaningful clusters of DNA words (
<italic>k-</italic>
mers) and genomic entities. The implementation of the method into a web server is available at
<ext-link ext-link-type="uri" xlink:href="http://bioinfo2.ugr.es/wordCluster/wordCluster.php">http://bioinfo2.ugr.es/wordCluster/wordCluster.php</ext-link>
including additional features like the detection of co-localization with gene regions or the annotation enrichment tool for functional analysis of overlapped genes.</p>
</sec>
</abstract>
</article-meta>
</front>
</pmc>
<affiliations>
<list>
<country>
<li>Espagne</li>
<li>États-Unis</li>
</country>
<region>
<li>Andalousie</li>
<li>Massachusetts</li>
</region>
<settlement>
<li>Grenade (Espagne)</li>
</settlement>
<orgName>
<li>Université de Grenade</li>
</orgName>
</list>
<tree>
<country name="Espagne">
<region name="Andalousie">
<name sortKey="Hackenberg, Michael" sort="Hackenberg, Michael" uniqKey="Hackenberg M" first="Michael" last="Hackenberg">Michael Hackenberg</name>
</region>
<name sortKey="Alganza, Angel M" sort="Alganza, Angel M" uniqKey="Alganza A" first="Ángel M" last="Alganza">Ángel M. Alganza</name>
<name sortKey="Barturen, Guillermo" sort="Barturen, Guillermo" uniqKey="Barturen G" first="Guillermo" last="Barturen">Guillermo Barturen</name>
<name sortKey="Bernaola Galvan, Pedro" sort="Bernaola Galvan, Pedro" uniqKey="Bernaola Galvan P" first="Pedro" last="Bernaola-Galván">Pedro Bernaola-Galván</name>
<name sortKey="Carpena, Pedro" sort="Carpena, Pedro" uniqKey="Carpena P" first="Pedro" last="Carpena">Pedro Carpena</name>
<name sortKey="Oliver, Jose L" sort="Oliver, Jose L" uniqKey="Oliver J" first="José L" last="Oliver">José L. Oliver</name>
</country>
<country name="États-Unis">
<region name="Massachusetts">
<name sortKey="Carpena, Pedro" sort="Carpena, Pedro" uniqKey="Carpena P" first="Pedro" last="Carpena">Pedro Carpena</name>
</region>
</country>
</tree>
</affiliations>
</record>

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Sante/explor/MersV1/Data/Pmc/Checkpoint
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 001340 | SxmlIndent | more

Or

HfdSelect -h $EXPLOR_AREA/Data/Pmc/Checkpoint/biblio.hfd -nk 001340 | SxmlIndent | more

To link to this page within the Wicri network

{{Explor lien
   |wiki=    Sante
   |area=    MersV1
   |flux=    Pmc
   |étape=   Checkpoint
   |type=    RBID
   |clé=     PMC:3037320
   |texte=   WordCluster: detecting clusters of DNA words and genomic elements
}}

To generate wiki pages

HfdIndexSelect -h $EXPLOR_AREA/Data/Pmc/Checkpoint/RBID.i   -Sk "pubmed:21261981" \
       | HfdSelect -Kh $EXPLOR_AREA/Data/Pmc/Checkpoint/biblio.hfd   \
       | NlmPubMed2Wicri -a MersV1 

Wicri

This area was generated with Dilib version V0.6.33.
Data generation: Mon Apr 20 23:26:43 2020. Site generation: Sat Mar 27 09:06:09 2021