MERS exploration server

Warning: this site is under development.
Warning: this site was generated automatically from raw corpora.
The information has therefore not been validated.

Comparative analysis using K-mer and K-flank patterns provides evidence for CpG island sequence evolution in mammalian genomes

Internal identifier: 000F51 (Pmc/Curation); previous: 000F50; next: 000F52

Authors: Heejoon Chae [United States]; Jinwoo Park [South Korea]; Seong-Whan Lee [South Korea]; Kenneth P. Nephew [United States]; Sun Kim [South Korea]

RBID: PMC:3643570

Abstract

CpG islands are GC-rich regions often located in the 5′ end of genes and normally protected from cytosine methylation in mammals. The important role of CpG islands in gene transcription strongly suggests evolutionary conservation in the mammalian genome. However, as CpG dinucleotides are over-represented in CpG islands, comparative CpG island analysis using conventional sequence analysis techniques remains a major challenge in the epigenetics field. In this study, we conducted a comparative analysis of all CpG island sequences in 10 mammalian genomes. As sequence similarity methods and character composition techniques such as information theory are particularly difficult to conduct, we used exact patterns in CpG island sequences and single character discrepancies to identify differences in CpG island sequences. First, by calculating genome distance based on rank correlation tests, we show that k-mer and k-flank patterns around CpG sites can be used to correctly reconstruct the phylogeny of 10 mammalian genomes. Further, we used various machine learning algorithms to demonstrate that CpG island sequences can be characterized using k-mers. In addition, by testing a human model on the nine different mammalian genomes, we provide the first evidence that k-mer signatures are consistent with evolutionary history.
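
The abstract's idea of a genome distance built from rank correlation of k-mer patterns can be sketched roughly as follows. This is an illustrative sketch, not the authors' implementation: the function names, the use of Kendall's tau-a, the `1 - tau` distance, the definition of a k-flank as the k bases on each side of a CG dinucleotide, and the toy sequences are all assumptions made here for the example.

```python
from collections import Counter
from itertools import combinations, product

BASES = "ACGT"


def kmer_counts(sequences, k):
    """Count overlapping k-mers made only of A, C, G, T across all sequences."""
    counts = Counter()
    for seq in sequences:
        seq = seq.upper()
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            if all(base in BASES for base in kmer):  # skip windows with N etc.
                counts[kmer] += 1
    return counts


def kflank_counts(sequences, k):
    """Count (left, right) flanks of length k around each CG dinucleotide.

    ASSUMPTION: this flank definition is illustrative, not taken from the paper.
    """
    counts = Counter()
    for seq in sequences:
        seq = seq.upper()
        # i indexes the C of a CG site that has k bases available on both sides.
        for i in range(k, len(seq) - k - 1):
            if seq[i:i + 2] == "CG":
                counts[(seq[i - k:i], seq[i + 2:i + 2 + k])] += 1
    return counts


def kendall_tau(x, y):
    """Kendall tau-a between two equal-length numeric lists (O(n^2) pairs)."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of equal length >= 2")
    concordant = discordant = 0
    for i, j in combinations(range(len(x)), 2):
        sign = (x[i] - x[j]) * (y[i] - y[j])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    n_pairs = len(x) * (len(x) - 1) // 2
    return (concordant - discordant) / n_pairs


def kmer_distance(counts_a, counts_b, k):
    """ASSUMED distance: 1 - tau over the counts of all 4**k possible k-mers."""
    kmers = ["".join(p) for p in product(BASES, repeat=k)]
    x = [counts_a[m] for m in kmers]
    y = [counts_b[m] for m in kmers]
    return 1.0 - kendall_tau(x, y)


if __name__ == "__main__":
    # Toy sequences, invented for illustration only (not real CpG islands).
    genome_a = ["ACGCGTACGGCGCGATCG", "GGCGCGCCATCGCG"]
    genome_b = ["ACGCGTACGGCGCTATCG", "GGCGCGACATCGCG"]
    k = 2
    d = kmer_distance(kmer_counts(genome_a, k), kmer_counts(genome_b, k), k)
    print(f"distance (1 - tau) for k={k}: {d:.3f}")
```

A pairwise matrix of such distances over several genomes could then be passed to a tree-building method such as neighbor joining. Note that tau-a applies no tie correction, so profiles with many equal counts yield a deflated correlation; a tie-corrected variant such as `scipy.stats.kendalltau` may be preferable on real data.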


DOI: 10.1093/nar/gkt144
PubMed: 23519616
PubMed Central: 3643570

Links to Exploration step

PMC:3643570

The document in XML format

<record>
<TEI>
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en">Comparative analysis using K-mer and K-flank patterns provides evidence for CpG island sequence evolution in mammalian genomes</title>
<author>
<name sortKey="Chae, Heejoon" sort="Chae, Heejoon" uniqKey="Chae H" first="Heejoon" last="Chae">Heejoon Chae</name>
<affiliation wicri:level="1">
<nlm:aff id="gkt144-AFF1">Department of Computer Science, School of Informatics and Computing, Indiana University, Bloomington, IN, USA,</nlm:aff>
<country xml:lang="fr">États-Unis</country>
<wicri:regionArea>Department of Computer Science, School of Informatics and Computing, Indiana University, Bloomington, IN</wicri:regionArea>
</affiliation>
</author>
<author>
<name sortKey="Park, Jinwoo" sort="Park, Jinwoo" uniqKey="Park J" first="Jinwoo" last="Park">Jinwoo Park</name>
<affiliation wicri:level="1">
<nlm:aff id="gkt144-AFF1">Department of Computer Science and Engineering, Bioinformatics Institute, Seoul National University, Seoul, Korea,</nlm:aff>
<country xml:lang="fr">Corée du Sud</country>
<wicri:regionArea>Department of Computer Science and Engineering, Bioinformatics Institute, Seoul National University, Seoul</wicri:regionArea>
</affiliation>
<affiliation wicri:level="1">
<nlm:aff id="gkt144-AFF1">Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul, Korea,</nlm:aff>
<country xml:lang="fr">Corée du Sud</country>
<wicri:regionArea>Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul</wicri:regionArea>
</affiliation>
</author>
<author>
<name sortKey="Lee, Seong Whan" sort="Lee, Seong Whan" uniqKey="Lee S" first="Seong-Whan" last="Lee">Seong-Whan Lee</name>
<affiliation wicri:level="1">
<nlm:aff wicri:cut=" and" id="gkt144-AFF1">Department of Brain and Cognitive Engineering, Korea University, Seoul, Korea</nlm:aff>
<country xml:lang="fr">Corée du Sud</country>
<wicri:regionArea>Department of Brain and Cognitive Engineering, Korea University, Seoul</wicri:regionArea>
</affiliation>
</author>
<author>
<name sortKey="Nephew, Kenneth P" sort="Nephew, Kenneth P" uniqKey="Nephew K" first="Kenneth P." last="Nephew">Kenneth P. Nephew</name>
<affiliation wicri:level="1">
<nlm:aff id="gkt144-AFF1">Medical Sciences Program, Indiana University School of Medicine, Indiana University, Bloomington, IN, USA</nlm:aff>
<country xml:lang="fr">États-Unis</country>
<wicri:regionArea>Medical Sciences Program, Indiana University School of Medicine, Indiana University, Bloomington, IN</wicri:regionArea>
</affiliation>
</author>
<author>
<name sortKey="Kim, Sun" sort="Kim, Sun" uniqKey="Kim S" first="Sun" last="Kim">Sun Kim</name>
<affiliation wicri:level="1">
<nlm:aff id="gkt144-AFF1">Department of Computer Science and Engineering, Bioinformatics Institute, Seoul National University, Seoul, Korea,</nlm:aff>
<country xml:lang="fr">Corée du Sud</country>
<wicri:regionArea>Department of Computer Science and Engineering, Bioinformatics Institute, Seoul National University, Seoul</wicri:regionArea>
</affiliation>
<affiliation wicri:level="1">
<nlm:aff id="gkt144-AFF1">Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul, Korea,</nlm:aff>
<country xml:lang="fr">Corée du Sud</country>
<wicri:regionArea>Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul</wicri:regionArea>
</affiliation>
</author>
</titleStmt>
<publicationStmt>
<idno type="wicri:source">PMC</idno>
<idno type="pmid">23519616</idno>
<idno type="pmc">3643570</idno>
<idno type="url">http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3643570</idno>
<idno type="RBID">PMC:3643570</idno>
<idno type="doi">10.1093/nar/gkt144</idno>
<date when="2013">2013</date>
<idno type="wicri:Area/Pmc/Corpus">000F51</idno>
<idno type="wicri:explorRef" wicri:stream="Pmc" wicri:step="Corpus" wicri:corpus="PMC">000F51</idno>
<idno type="wicri:Area/Pmc/Curation">000F51</idno>
<idno type="wicri:explorRef" wicri:stream="Pmc" wicri:step="Curation">000F51</idno>
</publicationStmt>
<sourceDesc>
<biblStruct>
<analytic>
<title xml:lang="en" level="a" type="main">Comparative analysis using K-mer and K-flank patterns provides evidence for CpG island sequence evolution in mammalian genomes</title>
<author>
<name sortKey="Chae, Heejoon" sort="Chae, Heejoon" uniqKey="Chae H" first="Heejoon" last="Chae">Heejoon Chae</name>
<affiliation wicri:level="1">
<nlm:aff id="gkt144-AFF1">Department of Computer Science, School of Informatics and Computing, Indiana University, Bloomington, IN, USA,</nlm:aff>
<country xml:lang="fr">États-Unis</country>
<wicri:regionArea>Department of Computer Science, School of Informatics and Computing, Indiana University, Bloomington, IN</wicri:regionArea>
</affiliation>
</author>
<author>
<name sortKey="Park, Jinwoo" sort="Park, Jinwoo" uniqKey="Park J" first="Jinwoo" last="Park">Jinwoo Park</name>
<affiliation wicri:level="1">
<nlm:aff id="gkt144-AFF1">Department of Computer Science and Engineering, Bioinformatics Institute, Seoul National University, Seoul, Korea,</nlm:aff>
<country xml:lang="fr">Corée du Sud</country>
<wicri:regionArea>Department of Computer Science and Engineering, Bioinformatics Institute, Seoul National University, Seoul</wicri:regionArea>
</affiliation>
<affiliation wicri:level="1">
<nlm:aff id="gkt144-AFF1">Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul, Korea,</nlm:aff>
<country xml:lang="fr">Corée du Sud</country>
<wicri:regionArea>Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul</wicri:regionArea>
</affiliation>
</author>
<author>
<name sortKey="Lee, Seong Whan" sort="Lee, Seong Whan" uniqKey="Lee S" first="Seong-Whan" last="Lee">Seong-Whan Lee</name>
<affiliation wicri:level="1">
<nlm:aff wicri:cut=" and" id="gkt144-AFF1">Department of Brain and Cognitive Engineering, Korea University, Seoul, Korea</nlm:aff>
<country xml:lang="fr">Corée du Sud</country>
<wicri:regionArea>Department of Brain and Cognitive Engineering, Korea University, Seoul</wicri:regionArea>
</affiliation>
</author>
<author>
<name sortKey="Nephew, Kenneth P" sort="Nephew, Kenneth P" uniqKey="Nephew K" first="Kenneth P." last="Nephew">Kenneth P. Nephew</name>
<affiliation wicri:level="1">
<nlm:aff id="gkt144-AFF1">Medical Sciences Program, Indiana University School of Medicine, Indiana University, Bloomington, IN, USA</nlm:aff>
<country xml:lang="fr">États-Unis</country>
<wicri:regionArea>Medical Sciences Program, Indiana University School of Medicine, Indiana University, Bloomington, IN</wicri:regionArea>
</affiliation>
</author>
<author>
<name sortKey="Kim, Sun" sort="Kim, Sun" uniqKey="Kim S" first="Sun" last="Kim">Sun Kim</name>
<affiliation wicri:level="1">
<nlm:aff id="gkt144-AFF1">Department of Computer Science and Engineering, Bioinformatics Institute, Seoul National University, Seoul, Korea,</nlm:aff>
<country xml:lang="fr">Corée du Sud</country>
<wicri:regionArea>Department of Computer Science and Engineering, Bioinformatics Institute, Seoul National University, Seoul</wicri:regionArea>
</affiliation>
<affiliation wicri:level="1">
<nlm:aff id="gkt144-AFF1">Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul, Korea,</nlm:aff>
<country xml:lang="fr">Corée du Sud</country>
<wicri:regionArea>Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul</wicri:regionArea>
</affiliation>
</author>
</analytic>
<series>
<title level="j">Nucleic Acids Research</title>
<idno type="ISSN">0305-1048</idno>
<idno type="eISSN">1362-4962</idno>
<imprint>
<date when="2013">2013</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
</fileDesc>
<profileDesc>
<textClass></textClass>
</profileDesc>
</teiHeader>
<front>
<div type="abstract" xml:lang="en">
<p>CpG islands are GC-rich regions often located in the 5′ end of genes and normally protected from cytosine methylation in mammals. The important role of CpG islands in gene transcription strongly suggests evolutionary conservation in the mammalian genome. However, as CpG dinucleotides are over-represented in CpG islands, comparative CpG island analysis using conventional sequence analysis techniques remains a major challenge in the epigenetics field. In this study, we conducted a comparative analysis of all CpG island sequences in 10 mammalian genomes. As sequence similarity methods and character composition techniques such as information theory are particularly difficult to conduct, we used exact patterns in CpG island sequences and single character discrepancies to identify differences in CpG island sequences. First, by calculating genome distance based on rank correlation tests, we show that k-mer and k-flank patterns around CpG sites can be used to correctly reconstruct the phylogeny of 10 mammalian genomes. Further, we used various machine learning algorithms to demonstrate that CpG islands sequences can be characterized using k-mers. In addition, by testing a human model on the nine different mammalian genomes, we provide the first evidence that k-mer signatures are consistent with evolutionary history.</p>
</div>
</front>
<back>
<div1 type="bibliography">
<listBibl>
<biblStruct>
<analytic>
<author>
<name sortKey="Jabbari, K" uniqKey="Jabbari K">K Jabbari</name>
</author>
<author>
<name sortKey="Bernardi, G" uniqKey="Bernardi G">G Bernardi</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Jones, Pa" uniqKey="Jones P">PA Jones</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Bell, Cg" uniqKey="Bell C">CG Bell</name>
</author>
<author>
<name sortKey="Wilson, Ga" uniqKey="Wilson G">GA Wilson</name>
</author>
<author>
<name sortKey="Butcher, Lm" uniqKey="Butcher L">LM Butcher</name>
</author>
<author>
<name sortKey="Roos, C" uniqKey="Roos C">C Roos</name>
</author>
<author>
<name sortKey="Walter, L" uniqKey="Walter L">L Walter</name>
</author>
<author>
<name sortKey="Beck, S" uniqKey="Beck S">S Beck</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Portela, A" uniqKey="Portela A">A Portela</name>
</author>
<author>
<name sortKey="Esteller, M" uniqKey="Esteller M">M Esteller</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Feinberg, Ap" uniqKey="Feinberg A">AP Feinberg</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Burge, C" uniqKey="Burge C">C Burge</name>
</author>
<author>
<name sortKey="Campbell, Am" uniqKey="Campbell A">AM Campbell</name>
</author>
<author>
<name sortKey="Karlin, S" uniqKey="Karlin S">S Karlin</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Scarano, E" uniqKey="Scarano E">E Scarano</name>
</author>
<author>
<name sortKey="Iaccarino, M" uniqKey="Iaccarino M">M Iaccarino</name>
</author>
<author>
<name sortKey="Grippo, P" uniqKey="Grippo P">P Grippo</name>
</author>
<author>
<name sortKey="Parisi, E" uniqKey="Parisi E">E Parisi</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Bock, C" uniqKey="Bock C">C Bock</name>
</author>
<author>
<name sortKey="Paulsen, M" uniqKey="Paulsen M">M Paulsen</name>
</author>
<author>
<name sortKey="Tierling, S" uniqKey="Tierling S">S Tierling</name>
</author>
<author>
<name sortKey="Mikeska, T" uniqKey="Mikeska T">T Mikeska</name>
</author>
<author>
<name sortKey="Lengauer, T" uniqKey="Lengauer T">T Lengauer</name>
</author>
<author>
<name sortKey="Walter, J" uniqKey="Walter J">J Walter</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Fatemi, M" uniqKey="Fatemi M">M Fatemi</name>
</author>
<author>
<name sortKey="Pao, Mm" uniqKey="Pao M">MM Pao</name>
</author>
<author>
<name sortKey="Jeong, S" uniqKey="Jeong S">S Jeong</name>
</author>
<author>
<name sortKey="Gal Yam, En" uniqKey="Gal Yam E">EN Gal-Yam</name>
</author>
<author>
<name sortKey="Egger, G" uniqKey="Egger G">G Egger</name>
</author>
<author>
<name sortKey="Weisenberger, Dj" uniqKey="Weisenberger D">DJ Weisenberger</name>
</author>
<author>
<name sortKey="Jones, Pa" uniqKey="Jones P">PA Jones</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Larsen, F" uniqKey="Larsen F">F Larsen</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Saxonov, S" uniqKey="Saxonov S">S Saxonov</name>
</author>
<author>
<name sortKey="Berg, P" uniqKey="Berg P">P Berg</name>
</author>
<author>
<name sortKey="Brutlag, Dl" uniqKey="Brutlag D">DL Brutlag</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Antequera, F" uniqKey="Antequera F">F Antequera</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Sharif, J" uniqKey="Sharif J">J Sharif</name>
</author>
<author>
<name sortKey="Endo, Ta" uniqKey="Endo T">TA Endo</name>
</author>
<author>
<name sortKey="Toyoda, T" uniqKey="Toyoda T">T Toyoda</name>
</author>
<author>
<name sortKey="Koseki, H" uniqKey="Koseki H">H Koseki</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Yan, Q" uniqKey="Yan Q">Q Yan</name>
</author>
<author>
<name sortKey="Masson, R" uniqKey="Masson R">R Masson</name>
</author>
<author>
<name sortKey="Ren, Y" uniqKey="Ren Y">Y Ren</name>
</author>
<author>
<name sortKey="Rosati, B" uniqKey="Rosati B">B Rosati</name>
</author>
<author>
<name sortKey="Mckinnon, D" uniqKey="Mckinnon D">D McKinnon</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Gardiner Garden, M" uniqKey="Gardiner Garden M">M Gardiner-Garden</name>
</author>
<author>
<name sortKey="Frommer, M" uniqKey="Frommer M">M Frommer</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Takai, D" uniqKey="Takai D">D Takai</name>
</author>
<author>
<name sortKey="Jones, Pa" uniqKey="Jones P">PA Jones</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Bock, C" uniqKey="Bock C">C Bock</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Wu, H" uniqKey="Wu H">H Wu</name>
</author>
<author>
<name sortKey="Caffo, B" uniqKey="Caffo B">B Caffo</name>
</author>
<author>
<name sortKey="Jaffee, Ha" uniqKey="Jaffee H">HA Jaffee</name>
</author>
<author>
<name sortKey="Irizarry, Ra" uniqKey="Irizarry R">RA Irizarry</name>
</author>
<author>
<name sortKey="Feinberg, Ap" uniqKey="Feinberg A">AP Feinberg</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Feuerbach, L" uniqKey="Feuerbach L">L Feuerbach</name>
</author>
<author>
<name sortKey="Halachev, K" uniqKey="Halachev K">K Halachev</name>
</author>
<author>
<name sortKey="Assenov, Y" uniqKey="Assenov Y">Y Assenov</name>
</author>
<author>
<name sortKey="Mller, F" uniqKey="Mller F">F Müller</name>
</author>
<author>
<name sortKey="Bock, C" uniqKey="Bock C">C Bock</name>
</author>
<author>
<name sortKey="Lengauer, T" uniqKey="Lengauer T">T Lengauer</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Cohen, Nm" uniqKey="Cohen N">NM Cohen</name>
</author>
<author>
<name sortKey="Kenigsberg, E" uniqKey="Kenigsberg E">E Kenigsberg</name>
</author>
<author>
<name sortKey="Tanay, A" uniqKey="Tanay A">A Tanay</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Nussinov, R" uniqKey="Nussinov R">R Nussinov</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Saitou, N" uniqKey="Saitou N">N Saitou</name>
</author>
<author>
<name sortKey="Nei, M" uniqKey="Nei M">M Nei</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Kendall, Mg" uniqKey="Kendall M">MG Kendall</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Kendall, Mg" uniqKey="Kendall M">MG Kendall</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Fredsiund, J" uniqKey="Fredsiund J">J Fredslund</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Witten, Lh" uniqKey="Witten L">IH Witten</name>
</author>
<author>
<name sortKey="Frank, E" uniqKey="Frank E">E Frank</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Miele, V" uniqKey="Miele V">V Miele</name>
</author>
<author>
<name sortKey="Bourguignon, Py" uniqKey="Bourguignon P">PY Bourguignon</name>
</author>
<author>
<name sortKey="Robelin, D" uniqKey="Robelin D">D Robelin</name>
</author>
<author>
<name sortKey="Nuel, G" uniqKey="Nuel G">G Nuel</name>
</author>
<author>
<name sortKey="Richard, H" uniqKey="Richard H">H Richard</name>
</author>
</analytic>
</biblStruct>
</listBibl>
</div1>
</back>
</TEI>
<pmc article-type="research-article">
<pmc-dir>properties open_access</pmc-dir>
<front>
<journal-meta>
<journal-id journal-id-type="nlm-ta">Nucleic Acids Res</journal-id>
<journal-id journal-id-type="iso-abbrev">Nucleic Acids Res</journal-id>
<journal-id journal-id-type="publisher-id">nar</journal-id>
<journal-id journal-id-type="hwp">nar</journal-id>
<journal-title-group>
<journal-title>Nucleic Acids Research</journal-title>
</journal-title-group>
<issn pub-type="ppub">0305-1048</issn>
<issn pub-type="epub">1362-4962</issn>
<publisher>
<publisher-name>Oxford University Press</publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="pmid">23519616</article-id>
<article-id pub-id-type="pmc">3643570</article-id>
<article-id pub-id-type="doi">10.1093/nar/gkt144</article-id>
<article-id pub-id-type="publisher-id">gkt144</article-id>
<article-categories>
<subj-group subj-group-type="heading">
<subject>Gene Regulation, Chromatin and Epigenetics</subject>
</subj-group>
</article-categories>
<title-group>
<article-title>Comparative analysis using K-mer and K-flank patterns provides evidence for CpG island sequence evolution in mammalian genomes</article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author">
<name>
<surname>Chae</surname>
<given-names>Heejoon</given-names>
</name>
<xref ref-type="aff" rid="gkt144-AFF1">
<sup>1</sup>
</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Park</surname>
<given-names>Jinwoo</given-names>
</name>
<xref ref-type="aff" rid="gkt144-AFF1">
<sup>2</sup>
</xref>
<xref ref-type="aff" rid="gkt144-AFF1">
<sup>3</sup>
</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Lee</surname>
<given-names>Seong-Whan</given-names>
</name>
<xref ref-type="aff" rid="gkt144-AFF1">
<sup>4</sup>
</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Nephew</surname>
<given-names>Kenneth P.</given-names>
</name>
<xref ref-type="aff" rid="gkt144-AFF1">
<sup>5</sup>
</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Kim</surname>
<given-names>Sun</given-names>
</name>
<xref ref-type="aff" rid="gkt144-AFF1">
<sup>2</sup>
</xref>
<xref ref-type="aff" rid="gkt144-AFF1">
<sup>3</sup>
</xref>
<xref ref-type="corresp" rid="gkt144-COR1">*</xref>
</contrib>
</contrib-group>
<aff id="gkt144-AFF1">
<sup>1</sup>
Department of Computer Science, School of Informatics and Computing, Indiana University, Bloomington, IN, USA,
<sup>2</sup>
Department of Computer Science and Engineering, Bioinformatics Institute, Seoul National University, Seoul, Korea,
<sup>3</sup>
Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul, Korea,
<sup>4</sup>
Department of Brain and Cognitive Engineering, Korea University, Seoul, Korea and
<sup>5</sup>
Medical Sciences Program, Indiana University School of Medicine, Indiana University, Bloomington, IN, USA</aff>
<author-notes>
<corresp id="gkt144-COR1">*To whom correspondence should be addressed. Tel:
<phone>+82 2 880 7280</phone>
; Fax:
<fax>+82 2 886 7589</fax>
; Email:
<email>sunkim.bioinfo@snu.ac.kr</email>
</corresp>
</author-notes>
<pub-date pub-type="ppub">
<month>5</month>
<year>2013</year>
</pub-date>
<pub-date pub-type="epub">
<day>20</day>
<month>3</month>
<year>2013</year>
</pub-date>
<pub-date pub-type="pmc-release">
<day>20</day>
<month>3</month>
<year>2013</year>
</pub-date>
<pmc-comment> PMC Release delay is 0 months and 0 days and was based on the . </pmc-comment>
<volume>41</volume>
<issue>9</issue>
<fpage>4783</fpage>
<lpage>4791</lpage>
<history>
<date date-type="received">
<day>27</day>
<month>10</month>
<year>2012</year>
</date>
<date date-type="rev-recd">
<day>24</day>
<month>1</month>
<year>2013</year>
</date>
<date date-type="accepted">
<day>13</day>
<month>2</month>
<year>2013</year>
</date>
</history>
<permissions>
<copyright-statement>© The Author(s) 2013. Published by Oxford University Press.</copyright-statement>
<copyright-year>2013</copyright-year>
<license license-type="creative-commons" xlink:href="http://creativecommons.org/licenses/by-nc/3.0/">
<license-p>
<pmc-comment>CREATIVE COMMONS</pmc-comment>
This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (
<ext-link ext-link-type="uri" xlink:href="http://creativecommons.org/licenses/by-nc/3.0/">http://creativecommons.org/licenses/by-nc/3.0/</ext-link>
), which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.</license-p>
</license>
</permissions>
<abstract>
<p>CpG islands are GC-rich regions often located in the 5′ end of genes and normally protected from cytosine methylation in mammals. The important role of CpG islands in gene transcription strongly suggests evolutionary conservation in the mammalian genome. However, as CpG dinucleotides are over-represented in CpG islands, comparative CpG island analysis using conventional sequence analysis techniques remains a major challenge in the epigenetics field. In this study, we conducted a comparative analysis of all CpG island sequences in 10 mammalian genomes. As sequence similarity methods and character composition techniques such as information theory are particularly difficult to conduct, we used exact patterns in CpG island sequences and single character discrepancies to identify differences in CpG island sequences. First, by calculating genome distance based on rank correlation tests, we show that k-mer and k-flank patterns around CpG sites can be used to correctly reconstruct the phylogeny of 10 mammalian genomes. Further, we used various machine learning algorithms to demonstrate that CpG islands sequences can be characterized using k-mers. In addition, by testing a human model on the nine different mammalian genomes, we provide the first evidence that k-mer signatures are consistent with evolutionary history.</p>
</abstract>
<counts>
<page-count count="9"></page-count>
</counts>
</article-meta>
</front>
</pmc>
</record>

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Sante/explor/MersV1/Data/Pmc/Curation
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000F51 | SxmlIndent | more

Or

HfdSelect -h $EXPLOR_AREA/Data/Pmc/Curation/biblio.hfd -nk 000F51 | SxmlIndent | more

To link to this page within the Wicri network

{{Explor lien
   |wiki=    Sante
   |area=    MersV1
   |flux=    Pmc
   |étape=   Curation
   |type=    RBID
   |clé=     PMC:3643570
   |texte=   Comparative analysis using K-mer and K-flank patterns provides evidence for CpG island sequence evolution in mammalian genomes
}}

To generate wiki pages

HfdIndexSelect -h $EXPLOR_AREA/Data/Pmc/Curation/RBID.i   -Sk "pubmed:23519616" \
       | HfdSelect -Kh $EXPLOR_AREA/Data/Pmc/Curation/biblio.hfd   \
       | NlmPubMed2Wicri -a MersV1 

Wicri

This area was generated with Dilib version V0.6.33.
Data generation: Mon Apr 20 23:26:43 2020. Site generation: Sat Mar 27 09:06:09 2021