WordCluster: detecting clusters of DNA words and genomic elements
Internal identifier: 000A32 (Pmc/Curation); previous: 000A31; next: 000A33
Authors: Michael Hackenberg [Spain]; Pedro Carpena [Spain, United States]; Pedro Bernaola-Galván [Spain]; Guillermo Barturen [Spain]; Ángel M. Alganza [Spain]; José L. Oliver [Spain]
Source:
- Algorithms for Molecular Biology : AMB [1748-7188]; 2011.
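Abstract
Many k-mers (or DNA words) and genomic elements are known to be spatially clustered in the genome. Well-established examples are genes, TFBSs, CpG dinucleotides, microRNA genes and ultra-conserved non-coding regions. Currently, no algorithm exists to find these clusters in a statistically comprehensible way. The detection of clustering often relies on densities and sliding-window approaches or arbitrarily chosen distance thresholds.
We introduce here an algorithm to detect clusters of DNA words (k-mers), or any other genomic element, based on the distance between consecutive copies and an assigned statistical significance. We implemented the method as a web server connected to a MySQL backend, which also determines the co-localization with gene annotations. We demonstrate the usefulness of this approach by detecting the clusters of CAG/CTG (cytosine contexts that can be methylated in undifferentiated cells), showing that the degree of methylation varies drastically between the inside and outside of the clusters. As another example, we used WordCluster to search for statistically significant clusters of olfactory receptor (OR) genes in the human genome.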
Url: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3037320
DOI: 10.1186/1748-7188-6-2
PubMed: 21261981
PubMed Central: 3037320
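The abstract describes clusters defined by the distance between consecutive copies of a word, each with an assigned statistical significance. The following is a minimal Python sketch of that idea under stated assumptions: the function names (`find_clusters`, `binomial_tail`, `cluster_p_value`), the fixed `max_gap` threshold and the binomial-tail p-value are illustrative choices made here, not the published WordCluster procedure, which this record does not specify.

```python
from math import comb


def find_clusters(positions, max_gap, min_copies=2):
    """Group positions into runs of consecutive copies whose gaps are <= max_gap.

    `max_gap` is an assumed, user-supplied distance threshold.
    """
    positions = sorted(positions)
    clusters = []
    current = positions[:1]
    for prev, pos in zip(positions, positions[1:]):
        if pos - prev <= max_gap:
            current.append(pos)          # still inside the same run
        else:
            if len(current) >= min_copies:
                clusters.append(current)
            current = [pos]              # start a new run
    if len(current) >= min_copies:
        clusters.append(current)
    return clusters


def binomial_tail(n_trials, k_min, p):
    """Return P(X >= k_min) for X ~ Binomial(n_trials, p). Toy-scale only."""
    return sum(
        comb(n_trials, k) * p**k * (1 - p) ** (n_trials - k)
        for k in range(k_min, n_trials + 1)
    )


def cluster_p_value(cluster, n_total, genome_length):
    """Assumed null model: copies placed independently with probability
    n_total / genome_length per position. The p-value is the chance that at
    least len(cluster) - 1 further copies fall within the cluster span that
    follows its first copy."""
    span = cluster[-1] - cluster[0]
    p = n_total / genome_length
    return binomial_tail(span, len(cluster) - 1, p)


if __name__ == "__main__":
    # Hypothetical word start positions on a 1000 bp toy sequence.
    hits = [10, 12, 15, 100, 400, 403, 405, 409, 900]
    for cluster in find_clusters(hits, max_gap=5):
        print(cluster, cluster_p_value(cluster, len(hits), 1000))
```

In practice the threshold would be derived from the observed distance distribution rather than fixed by hand, and clusters would be kept only if their p-value falls below a chosen cutoff; both steps are left out of this sketch.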
Links toward previous steps (curation, corpus...)
- to stream Pmc, to step Corpus: To go to this record in the Curation step: 000A32
Links to Exploration step
PMC:3037320
The document in XML format
<record><TEI><teiHeader><fileDesc><titleStmt><title xml:lang="en"><italic>WordCluster</italic>: detecting clusters of DNA words and genomic elements</title>
<author><name sortKey="Hackenberg, Michael" sort="Hackenberg, Michael" uniqKey="Hackenberg M" first="Michael" last="Hackenberg">Michael Hackenberg</name>
<affiliation wicri:level="1"><nlm:aff id="I1">Dpto. de Genética, Facultad de Ciencias, Universidad de Granada, Campus de Fuentenueva s/n, 18071-Granada &amp; Lab. de Bioinformática, Centro de Investigación Biomédica, PTS, Avda. del Conocimiento s/n, 18100-Granada, Spain</nlm:aff>
<country xml:lang="fr">Espagne</country>
<wicri:regionArea>Dpto. de Genética, Facultad de Ciencias, Universidad de Granada, Campus de Fuentenueva s/n, 18071-Granada &amp; Lab. de Bioinformática, Centro de Investigación Biomédica, PTS, Avda. del Conocimiento s/n, 18100-Granada</wicri:regionArea>
</affiliation>
</author>
<author><name sortKey="Carpena, Pedro" sort="Carpena, Pedro" uniqKey="Carpena P" first="Pedro" last="Carpena">Pedro Carpena</name>
<affiliation wicri:level="1"><nlm:aff id="I2">Dpto. de Física Aplicada II, E.T.S.I. de Telecomunicación, Universidad de Málaga 29071-Malaga, Spain</nlm:aff>
<country xml:lang="fr">Espagne</country>
<wicri:regionArea>Dpto. de Física Aplicada II, E.T.S.I. de Telecomunicación, Universidad de Málaga 29071-Malaga</wicri:regionArea>
</affiliation>
<affiliation wicri:level="1"><nlm:aff id="I3">Division of Sleep Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, MA 02115, USA</nlm:aff>
<country xml:lang="fr">États-Unis</country>
<wicri:regionArea>Division of Sleep Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, MA 02115</wicri:regionArea>
</affiliation>
</author>
<author><name sortKey="Bernaola Galvan, Pedro" sort="Bernaola Galvan, Pedro" uniqKey="Bernaola Galvan P" first="Pedro" last="Bernaola-Galván">Pedro Bernaola-Galván</name>
<affiliation wicri:level="1"><nlm:aff id="I2">Dpto. de Física Aplicada II, E.T.S.I. de Telecomunicación, Universidad de Málaga 29071-Malaga, Spain</nlm:aff>
<country xml:lang="fr">Espagne</country>
<wicri:regionArea>Dpto. de Física Aplicada II, E.T.S.I. de Telecomunicación, Universidad de Málaga 29071-Malaga</wicri:regionArea>
</affiliation>
</author>
<author><name sortKey="Barturen, Guillermo" sort="Barturen, Guillermo" uniqKey="Barturen G" first="Guillermo" last="Barturen">Guillermo Barturen</name>
<affiliation wicri:level="1"><nlm:aff id="I1">Dpto. de Genética, Facultad de Ciencias, Universidad de Granada, Campus de Fuentenueva s/n, 18071-Granada &amp; Lab. de Bioinformática, Centro de Investigación Biomédica, PTS, Avda. del Conocimiento s/n, 18100-Granada, Spain</nlm:aff>
<country xml:lang="fr">Espagne</country>
<wicri:regionArea>Dpto. de Genética, Facultad de Ciencias, Universidad de Granada, Campus de Fuentenueva s/n, 18071-Granada &amp; Lab. de Bioinformática, Centro de Investigación Biomédica, PTS, Avda. del Conocimiento s/n, 18100-Granada</wicri:regionArea>
</affiliation>
</author>
<author><name sortKey="Alganza, Angel M" sort="Alganza, Angel M" uniqKey="Alganza A" first="Ángel M" last="Alganza">Ángel M. Alganza</name>
<affiliation wicri:level="1"><nlm:aff id="I1">Dpto. de Genética, Facultad de Ciencias, Universidad de Granada, Campus de Fuentenueva s/n, 18071-Granada &amp; Lab. de Bioinformática, Centro de Investigación Biomédica, PTS, Avda. del Conocimiento s/n, 18100-Granada, Spain</nlm:aff>
<country xml:lang="fr">Espagne</country>
<wicri:regionArea>Dpto. de Genética, Facultad de Ciencias, Universidad de Granada, Campus de Fuentenueva s/n, 18071-Granada &amp; Lab. de Bioinformática, Centro de Investigación Biomédica, PTS, Avda. del Conocimiento s/n, 18100-Granada</wicri:regionArea>
</affiliation>
</author>
<author><name sortKey="Oliver, Jose L" sort="Oliver, Jose L" uniqKey="Oliver J" first="José L" last="Oliver">José L. Oliver</name>
<affiliation wicri:level="1"><nlm:aff id="I1">Dpto. de Genética, Facultad de Ciencias, Universidad de Granada, Campus de Fuentenueva s/n, 18071-Granada &amp; Lab. de Bioinformática, Centro de Investigación Biomédica, PTS, Avda. del Conocimiento s/n, 18100-Granada, Spain</nlm:aff>
<country xml:lang="fr">Espagne</country>
<wicri:regionArea>Dpto. de Genética, Facultad de Ciencias, Universidad de Granada, Campus de Fuentenueva s/n, 18071-Granada &amp; Lab. de Bioinformática, Centro de Investigación Biomédica, PTS, Avda. del Conocimiento s/n, 18100-Granada</wicri:regionArea>
</affiliation>
</author>
</titleStmt>
<publicationStmt><idno type="wicri:source">PMC</idno>
<idno type="pmid">21261981</idno>
<idno type="pmc">3037320</idno>
<idno type="url">http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3037320</idno>
<idno type="RBID">PMC:3037320</idno>
<idno type="doi">10.1186/1748-7188-6-2</idno>
<date when="2011">2011</date>
<idno type="wicri:Area/Pmc/Corpus">000A32</idno>
<idno type="wicri:explorRef" wicri:stream="Pmc" wicri:step="Corpus" wicri:corpus="PMC">000A32</idno>
<idno type="wicri:Area/Pmc/Curation">000A32</idno>
<idno type="wicri:explorRef" wicri:stream="Pmc" wicri:step="Curation">000A32</idno>
</publicationStmt>
<sourceDesc><biblStruct><analytic><title xml:lang="en" level="a" type="main"><italic>WordCluster</italic>: detecting clusters of DNA words and genomic elements</title>
<author><name sortKey="Hackenberg, Michael" sort="Hackenberg, Michael" uniqKey="Hackenberg M" first="Michael" last="Hackenberg">Michael Hackenberg</name>
<affiliation wicri:level="1"><nlm:aff id="I1">Dpto. de Genética, Facultad de Ciencias, Universidad de Granada, Campus de Fuentenueva s/n, 18071-Granada &amp; Lab. de Bioinformática, Centro de Investigación Biomédica, PTS, Avda. del Conocimiento s/n, 18100-Granada, Spain</nlm:aff>
<country xml:lang="fr">Espagne</country>
<wicri:regionArea>Dpto. de Genética, Facultad de Ciencias, Universidad de Granada, Campus de Fuentenueva s/n, 18071-Granada &amp; Lab. de Bioinformática, Centro de Investigación Biomédica, PTS, Avda. del Conocimiento s/n, 18100-Granada</wicri:regionArea>
</affiliation>
</author>
<author><name sortKey="Carpena, Pedro" sort="Carpena, Pedro" uniqKey="Carpena P" first="Pedro" last="Carpena">Pedro Carpena</name>
<affiliation wicri:level="1"><nlm:aff id="I2">Dpto. de Física Aplicada II, E.T.S.I. de Telecomunicación, Universidad de Málaga 29071-Malaga, Spain</nlm:aff>
<country xml:lang="fr">Espagne</country>
<wicri:regionArea>Dpto. de Física Aplicada II, E.T.S.I. de Telecomunicación, Universidad de Málaga 29071-Malaga</wicri:regionArea>
</affiliation>
<affiliation wicri:level="1"><nlm:aff id="I3">Division of Sleep Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, MA 02115, USA</nlm:aff>
<country xml:lang="fr">États-Unis</country>
<wicri:regionArea>Division of Sleep Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, MA 02115</wicri:regionArea>
</affiliation>
</author>
<author><name sortKey="Bernaola Galvan, Pedro" sort="Bernaola Galvan, Pedro" uniqKey="Bernaola Galvan P" first="Pedro" last="Bernaola-Galván">Pedro Bernaola-Galván</name>
<affiliation wicri:level="1"><nlm:aff id="I2">Dpto. de Física Aplicada II, E.T.S.I. de Telecomunicación, Universidad de Málaga 29071-Malaga, Spain</nlm:aff>
<country xml:lang="fr">Espagne</country>
<wicri:regionArea>Dpto. de Física Aplicada II, E.T.S.I. de Telecomunicación, Universidad de Málaga 29071-Malaga</wicri:regionArea>
</affiliation>
</author>
<author><name sortKey="Barturen, Guillermo" sort="Barturen, Guillermo" uniqKey="Barturen G" first="Guillermo" last="Barturen">Guillermo Barturen</name>
<affiliation wicri:level="1"><nlm:aff id="I1">Dpto. de Genética, Facultad de Ciencias, Universidad de Granada, Campus de Fuentenueva s/n, 18071-Granada &amp; Lab. de Bioinformática, Centro de Investigación Biomédica, PTS, Avda. del Conocimiento s/n, 18100-Granada, Spain</nlm:aff>
<country xml:lang="fr">Espagne</country>
<wicri:regionArea>Dpto. de Genética, Facultad de Ciencias, Universidad de Granada, Campus de Fuentenueva s/n, 18071-Granada &amp; Lab. de Bioinformática, Centro de Investigación Biomédica, PTS, Avda. del Conocimiento s/n, 18100-Granada</wicri:regionArea>
</affiliation>
</author>
<author><name sortKey="Alganza, Angel M" sort="Alganza, Angel M" uniqKey="Alganza A" first="Ángel M" last="Alganza">Ángel M. Alganza</name>
<affiliation wicri:level="1"><nlm:aff id="I1">Dpto. de Genética, Facultad de Ciencias, Universidad de Granada, Campus de Fuentenueva s/n, 18071-Granada &amp; Lab. de Bioinformática, Centro de Investigación Biomédica, PTS, Avda. del Conocimiento s/n, 18100-Granada, Spain</nlm:aff>
<country xml:lang="fr">Espagne</country>
<wicri:regionArea>Dpto. de Genética, Facultad de Ciencias, Universidad de Granada, Campus de Fuentenueva s/n, 18071-Granada &amp; Lab. de Bioinformática, Centro de Investigación Biomédica, PTS, Avda. del Conocimiento s/n, 18100-Granada</wicri:regionArea>
</affiliation>
</author>
<author><name sortKey="Oliver, Jose L" sort="Oliver, Jose L" uniqKey="Oliver J" first="José L" last="Oliver">José L. Oliver</name>
<affiliation wicri:level="1"><nlm:aff id="I1">Dpto. de Genética, Facultad de Ciencias, Universidad de Granada, Campus de Fuentenueva s/n, 18071-Granada &amp; Lab. de Bioinformática, Centro de Investigación Biomédica, PTS, Avda. del Conocimiento s/n, 18100-Granada, Spain</nlm:aff>
<country xml:lang="fr">Espagne</country>
<wicri:regionArea>Dpto. de Genética, Facultad de Ciencias, Universidad de Granada, Campus de Fuentenueva s/n, 18071-Granada &amp; Lab. de Bioinformática, Centro de Investigación Biomédica, PTS, Avda. del Conocimiento s/n, 18100-Granada</wicri:regionArea>
</affiliation>
</author>
</analytic>
<series><title level="j">Algorithms for Molecular Biology : AMB</title>
<idno type="eISSN">1748-7188</idno>
<imprint><date when="2011">2011</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
</fileDesc>
<profileDesc><textClass></textClass>
</profileDesc>
</teiHeader>
<front><div type="abstract" xml:lang="en"><sec><title>Background</title>
<p>Many <italic>k-</italic>mers (or DNA words) and genomic elements are known to be spatially clustered in the genome. Well-established examples are genes, TFBSs, CpG dinucleotides, microRNA genes and ultra-conserved non-coding regions. Currently, no algorithm exists to find these clusters in a statistically comprehensible way. The detection of clustering often relies on densities and sliding-window approaches or arbitrarily chosen distance thresholds.</p>
</sec>
<sec><title>Results</title>
<p>We introduce here an algorithm to detect clusters of DNA words (<italic>k-</italic>mers), or any other genomic element, based on the distance between consecutive copies and an assigned statistical significance. We implemented the method as a web server connected to a MySQL backend, which also determines the co-localization with gene annotations. We demonstrate the usefulness of this approach by detecting the clusters of CAG/CTG (cytosine contexts that can be methylated in undifferentiated cells), showing that the degree of methylation varies drastically between the inside and outside of the clusters. As another example, we used <italic>WordCluster</italic> to search for statistically significant clusters of olfactory receptor (OR) genes in the human genome.</p>
</sec>
<sec><title>Conclusions</title>
<p><italic>WordCluster</italic> seems to predict biologically meaningful clusters of DNA words (<italic>k-</italic>mers) and genomic entities. The web server implementation of the method is available at <ext-link ext-link-type="uri" xlink:href="http://bioinfo2.ugr.es/wordCluster/wordCluster.php">http://bioinfo2.ugr.es/wordCluster/wordCluster.php</ext-link>, including additional features such as the detection of co-localization with gene regions and an annotation enrichment tool for functional analysis of overlapped genes.</p>
</sec>
</div>
</front>
</TEI>
<pmc article-type="research-article"><pmc-dir>properties open_access</pmc-dir>
<front><journal-meta><journal-id journal-id-type="nlm-ta">Algorithms Mol Biol</journal-id>
<journal-title-group><journal-title>Algorithms for Molecular Biology : AMB</journal-title>
</journal-title-group>
<issn pub-type="epub">1748-7188</issn>
<publisher><publisher-name>BioMed Central</publisher-name>
</publisher>
</journal-meta>
<article-meta><article-id pub-id-type="pmid">21261981</article-id>
<article-id pub-id-type="pmc">3037320</article-id>
<article-id pub-id-type="publisher-id">1748-7188-6-2</article-id>
<article-id pub-id-type="doi">10.1186/1748-7188-6-2</article-id>
<article-categories><subj-group subj-group-type="heading"><subject>Software Article</subject>
</subj-group>
</article-categories>
<title-group><article-title><italic>WordCluster</italic>: detecting clusters of DNA words and genomic elements</article-title>
</title-group>
<contrib-group><contrib contrib-type="author" corresp="yes" id="A1"><name><surname>Hackenberg</surname>
<given-names>Michael</given-names>
</name>
<xref ref-type="aff" rid="I1">1</xref>
<email>mlhack@gmail.com</email>
</contrib>
<contrib contrib-type="author" id="A2"><name><surname>Carpena</surname>
<given-names>Pedro</given-names>
</name>
<xref ref-type="aff" rid="I2">2</xref>
<xref ref-type="aff" rid="I3">3</xref>
<email>pcarpena@ctima.uma.es</email>
</contrib>
<contrib contrib-type="author" id="A3"><name><surname>Bernaola-Galván</surname>
<given-names>Pedro</given-names>
</name>
<xref ref-type="aff" rid="I2">2</xref>
<email>rick@uma.es</email>
</contrib>
<contrib contrib-type="author" id="A4"><name><surname>Barturen</surname>
<given-names>Guillermo</given-names>
</name>
<xref ref-type="aff" rid="I1">1</xref>
<email>bartg01@gmail.com</email>
</contrib>
<contrib contrib-type="author" id="A5"><name><surname>Alganza</surname>
<given-names>Ángel M</given-names>
</name>
<xref ref-type="aff" rid="I1">1</xref>
<email>ama@ugr.es</email>
</contrib>
<contrib contrib-type="author" corresp="yes" id="A6"><name><surname>Oliver</surname>
<given-names>José L</given-names>
</name>
<xref ref-type="aff" rid="I1">1</xref>
<email>oliver@ugr.es</email>
</contrib>
</contrib-group>
<aff id="I1"><label>1</label>
Dpto. de Genética, Facultad de Ciencias, Universidad de Granada, Campus de Fuentenueva s/n, 18071-Granada &amp; Lab. de Bioinformática, Centro de Investigación Biomédica, PTS, Avda. del Conocimiento s/n, 18100-Granada, Spain</aff>
<aff id="I2"><label>2</label>
Dpto. de Física Aplicada II, E.T.S.I. de Telecomunicación, Universidad de Málaga 29071-Malaga, Spain</aff>
<aff id="I3"><label>3</label>
Division of Sleep Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, MA 02115, USA</aff>
<pub-date pub-type="collection"><year>2011</year>
</pub-date>
<pub-date pub-type="epub"><day>24</day>
<month>1</month>
<year>2011</year>
</pub-date>
<volume>6</volume>
<fpage>2</fpage>
<lpage>2</lpage>
<history><date date-type="received"><day>30</day>
<month>8</month>
<year>2010</year>
</date>
<date date-type="accepted"><day>24</day>
<month>1</month>
<year>2011</year>
</date>
</history>
<permissions><copyright-statement>Copyright ©2011 Hackenberg et al; licensee BioMed Central Ltd.</copyright-statement>
<copyright-year>2011</copyright-year>
<copyright-holder>Hackenberg et al; licensee BioMed Central Ltd.</copyright-holder>
<license license-type="open-access" xlink:href="http://creativecommons.org/licenses/by/2.0"><license-p>This is an Open Access article distributed under the terms of the Creative Commons Attribution License (<ext-link ext-link-type="uri" xlink:href="http://creativecommons.org/licenses/by/2.0">http://creativecommons.org/licenses/by/2.0</ext-link>), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</license-p>
</license>
</permissions>
<self-uri xlink:href="http://www.almob.org/content/6/1/2"></self-uri>
<abstract><sec><title>Background</title>
<p>Many <italic>k-</italic>mers (or DNA words) and genomic elements are known to be spatially clustered in the genome. Well-established examples are genes, TFBSs, CpG dinucleotides, microRNA genes and ultra-conserved non-coding regions. Currently, no algorithm exists to find these clusters in a statistically comprehensible way. The detection of clustering often relies on densities and sliding-window approaches or arbitrarily chosen distance thresholds.</p>
</sec>
<sec><title>Results</title>
<p>We introduce here an algorithm to detect clusters of DNA words (<italic>k-</italic>mers), or any other genomic element, based on the distance between consecutive copies and an assigned statistical significance. We implemented the method as a web server connected to a MySQL backend, which also determines the co-localization with gene annotations. We demonstrate the usefulness of this approach by detecting the clusters of CAG/CTG (cytosine contexts that can be methylated in undifferentiated cells), showing that the degree of methylation varies drastically between the inside and outside of the clusters. As another example, we used <italic>WordCluster</italic> to search for statistically significant clusters of olfactory receptor (OR) genes in the human genome.</p>
</sec>
<sec><title>Conclusions</title>
<p><italic>WordCluster</italic> seems to predict biologically meaningful clusters of DNA words (<italic>k-</italic>mers) and genomic entities. The web server implementation of the method is available at <ext-link ext-link-type="uri" xlink:href="http://bioinfo2.ugr.es/wordCluster/wordCluster.php">http://bioinfo2.ugr.es/wordCluster/wordCluster.php</ext-link>, including additional features such as the detection of co-localization with gene regions and an annotation enrichment tool for functional analysis of overlapped genes.</p>
</sec>
</abstract>
</article-meta>
</front>
</pmc>
</record>
To manipulate this document under Unix (Dilib)
EXPLOR_STEP=$WICRI_ROOT/Sante/explor/MersV1/Data/Pmc/Curation
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000A32 | SxmlIndent | more
Or
HfdSelect -h $EXPLOR_AREA/Data/Pmc/Curation/biblio.hfd -nk 000A32 | SxmlIndent | more
To add a link to this page in the Wicri network
{{Explor lien |wiki= Sante |area= MersV1 |flux= Pmc |étape= Curation |type= RBID |clé= PMC:3037320 |texte= WordCluster: detecting clusters of DNA words and genomic elements }}
To generate wiki pages
HfdIndexSelect -h $EXPLOR_AREA/Data/Pmc/Curation/RBID.i -Sk "pubmed:21261981" \
  | HfdSelect -Kh $EXPLOR_AREA/Data/Pmc/Curation/biblio.hfd \
  | NlmPubMed2Wicri -a MersV1
This area was generated with Dilib version V0.6.33.