MERS Exploration Server

Warning: this site is under development!
Warning: this site was generated automatically from raw corpora.
The information has therefore not been validated.

K-mer Content, Correlation, and Position Analysis of Genome DNA Sequences for the Identification of Function and Evolutionary Features

Internal identifier: 000C92 (Pmc/Curation); previous: 000C91; next: 000C93

Authors: Aaron Sievers; Katharina Bosiek; Marc Bisch; Chris Dreessen; Jascha Riedel; Patrick Froß; Michael Hausmann; Georg Hildenbrand [Germany]

RBID : PMC:5406869

Abstract

In genome analysis, k-mer-based comparison methods have become standard tools. However, even though they are able to deliver reliable results, other algorithms seem to work better in some cases. To improve k-mer-based DNA sequence analysis and comparison, we successfully checked whether adding positional resolution is beneficial for finding and/or comparing interesting organizational structures. A simple but efficient algorithm for extracting and saving local k-mer spectra (frequency distribution of k-mers) was developed and used. The results were analyzed by including positional information based on visualizations as genomic maps and by applying basic vector correlation methods. This analysis was concentrated on small word lengths (1 ≤ k ≤ 4) on relatively small viral genomes of Papillomaviridae and Herpesviridae, while also checking its usability for larger sequences, namely human chromosome 2 and the homologous chromosomes (2A, 2B) of a chimpanzee. Using this alignment-free analysis, several regions with specific characteristics in Papillomaviridae and Herpesviridae formerly identified by independent, mostly alignment-based methods, were confirmed. Correlations between the k-mer content and several genes in these genomes have been found, showing similarities between classified and unclassified viruses, which may be potentially useful for further taxonomic research. Furthermore, unknown k-mer correlations in the genomes of Human Herpesviruses (HHVs), which are probably of major biological function, are found and described. Using the chromosomes of a chimpanzee and human that are currently known, identities between the species on every analyzed chromosome were reproduced. This demonstrates the feasibility of our approach for large data sets of complex genomes. 
Based on these results, we suggest k-mer analysis with positional resolution as a method for closing a gap between the effectiveness of alignment-based methods (like NCBI BLAST) and the high pace of standard k-mer analysis.
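
The abstract's idea of local k-mer spectra with positional resolution can be illustrated with a minimal sketch. This is not the authors' implementation: the fixed-size sliding window, the `window` and `step` parameters, the function names, and the choice of the Pearson coefficient as the "basic vector correlation method" are all assumptions made for illustration. The sketch also assumes an upper-case sequence and simply ignores words containing symbols other than A, C, G, and T.

```python
from collections import Counter
from itertools import product
from math import sqrt


def kmer_spectrum(seq, k):
    """Relative frequency of each k-mer over A, C, G, T (overlapping words)."""
    words = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    # Words containing other symbols (e.g. N) are left out of the total.
    total = sum(counts[w] for w in words)
    return [counts[w] / total if total else 0.0 for w in words]


def local_spectra(seq, k, window, step):
    """Yield (start position, spectrum) for each window along the sequence.

    `window` and `step` are hypothetical parameters, not taken from the paper.
    """
    for start in range(0, len(seq) - window + 1, step):
        yield start, kmer_spectrum(seq[start:start + window], k)


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    # A constant vector has no defined correlation; return 0.0 in that case.
    return cov / (sx * sy) if sx and sy else 0.0
```

For example, `list(local_spectra(seq, k=2, window=8, step=8))` returns one spectrum per non-overlapping 8-base window, and `pearson` applied to two of those spectra gives a similarity score between the corresponding regions. Keeping the start position next to each spectrum is what allows the results to be drawn as a genomic map.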


DOI: 10.3390/genes8040122
PubMed: 28422050
PubMed Central: 5406869


Links to Exploration step

PMC:5406869

The document in XML format

<record>
<TEI>
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en">
<italic>K</italic>
-mer Content, Correlation, and Position Analysis of Genome DNA Sequences for the Identification of Function and Evolutionary Features</title>
<author>
<name sortKey="Sievers, Aaron" sort="Sievers, Aaron" uniqKey="Sievers A" first="Aaron" last="Sievers">Aaron Sievers</name>
<affiliation>
<nlm:aff id="af1-genes-08-00122">Kirchhoff-Institute for Physics, Heidelberg University, INF 227, 69117 Heidelberg, Germany;
<email>Sievers_Aaron@web.de</email>
(A.S.);
<email>KatharinaBosiek@gmx.de</email>
(K.B.);
<email>MarcBisch@gmx.de</email>
(M.B.);
<email>chrisdreessen@yahoo.de</email>
(C.D.);
<email>jaschelite@googlemail.com</email>
(J.R.);
<email>Fross@stud.uni-heidelberg.de</email>
(P.F.);
<email>hausmann@kip.uni-heidelberg.de</email>
(M.H.)</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Bosiek, Katharina" sort="Bosiek, Katharina" uniqKey="Bosiek K" first="Katharina" last="Bosiek">Katharina Bosiek</name>
<affiliation>
<nlm:aff id="af1-genes-08-00122">Kirchhoff-Institute for Physics, Heidelberg University, INF 227, 69117 Heidelberg, Germany;
<email>Sievers_Aaron@web.de</email>
(A.S.);
<email>KatharinaBosiek@gmx.de</email>
(K.B.);
<email>MarcBisch@gmx.de</email>
(M.B.);
<email>chrisdreessen@yahoo.de</email>
(C.D.);
<email>jaschelite@googlemail.com</email>
(J.R.);
<email>Fross@stud.uni-heidelberg.de</email>
(P.F.);
<email>hausmann@kip.uni-heidelberg.de</email>
(M.H.)</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Bisch, Marc" sort="Bisch, Marc" uniqKey="Bisch M" first="Marc" last="Bisch">Marc Bisch</name>
<affiliation>
<nlm:aff id="af1-genes-08-00122">Kirchhoff-Institute for Physics, Heidelberg University, INF 227, 69117 Heidelberg, Germany;
<email>Sievers_Aaron@web.de</email>
(A.S.);
<email>KatharinaBosiek@gmx.de</email>
(K.B.);
<email>MarcBisch@gmx.de</email>
(M.B.);
<email>chrisdreessen@yahoo.de</email>
(C.D.);
<email>jaschelite@googlemail.com</email>
(J.R.);
<email>Fross@stud.uni-heidelberg.de</email>
(P.F.);
<email>hausmann@kip.uni-heidelberg.de</email>
(M.H.)</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Dreessen, Chris" sort="Dreessen, Chris" uniqKey="Dreessen C" first="Chris" last="Dreessen">Chris Dreessen</name>
<affiliation>
<nlm:aff id="af1-genes-08-00122">Kirchhoff-Institute for Physics, Heidelberg University, INF 227, 69117 Heidelberg, Germany;
<email>Sievers_Aaron@web.de</email>
(A.S.);
<email>KatharinaBosiek@gmx.de</email>
(K.B.);
<email>MarcBisch@gmx.de</email>
(M.B.);
<email>chrisdreessen@yahoo.de</email>
(C.D.);
<email>jaschelite@googlemail.com</email>
(J.R.);
<email>Fross@stud.uni-heidelberg.de</email>
(P.F.);
<email>hausmann@kip.uni-heidelberg.de</email>
(M.H.)</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Riedel, Jascha" sort="Riedel, Jascha" uniqKey="Riedel J" first="Jascha" last="Riedel">Jascha Riedel</name>
<affiliation>
<nlm:aff id="af1-genes-08-00122">Kirchhoff-Institute for Physics, Heidelberg University, INF 227, 69117 Heidelberg, Germany;
<email>Sievers_Aaron@web.de</email>
(A.S.);
<email>KatharinaBosiek@gmx.de</email>
(K.B.);
<email>MarcBisch@gmx.de</email>
(M.B.);
<email>chrisdreessen@yahoo.de</email>
(C.D.);
<email>jaschelite@googlemail.com</email>
(J.R.);
<email>Fross@stud.uni-heidelberg.de</email>
(P.F.);
<email>hausmann@kip.uni-heidelberg.de</email>
(M.H.)</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Fro, Patrick" sort="Fro, Patrick" uniqKey="Fro P" first="Patrick" last="Fro">Patrick Froß</name>
<affiliation>
<nlm:aff id="af1-genes-08-00122">Kirchhoff-Institute for Physics, Heidelberg University, INF 227, 69117 Heidelberg, Germany;
<email>Sievers_Aaron@web.de</email>
(A.S.);
<email>KatharinaBosiek@gmx.de</email>
(K.B.);
<email>MarcBisch@gmx.de</email>
(M.B.);
<email>chrisdreessen@yahoo.de</email>
(C.D.);
<email>jaschelite@googlemail.com</email>
(J.R.);
<email>Fross@stud.uni-heidelberg.de</email>
(P.F.);
<email>hausmann@kip.uni-heidelberg.de</email>
(M.H.)</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Hausmann, Michael" sort="Hausmann, Michael" uniqKey="Hausmann M" first="Michael" last="Hausmann">Michael Hausmann</name>
<affiliation>
<nlm:aff id="af1-genes-08-00122">Kirchhoff-Institute for Physics, Heidelberg University, INF 227, 69117 Heidelberg, Germany;
<email>Sievers_Aaron@web.de</email>
(A.S.);
<email>KatharinaBosiek@gmx.de</email>
(K.B.);
<email>MarcBisch@gmx.de</email>
(M.B.);
<email>chrisdreessen@yahoo.de</email>
(C.D.);
<email>jaschelite@googlemail.com</email>
(J.R.);
<email>Fross@stud.uni-heidelberg.de</email>
(P.F.);
<email>hausmann@kip.uni-heidelberg.de</email>
(M.H.)</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Hildenbrand, Georg" sort="Hildenbrand, Georg" uniqKey="Hildenbrand G" first="Georg" last="Hildenbrand">Georg Hildenbrand</name>
<affiliation>
<nlm:aff id="af1-genes-08-00122">Kirchhoff-Institute for Physics, Heidelberg University, INF 227, 69117 Heidelberg, Germany;
<email>Sievers_Aaron@web.de</email>
(A.S.);
<email>KatharinaBosiek@gmx.de</email>
(K.B.);
<email>MarcBisch@gmx.de</email>
(M.B.);
<email>chrisdreessen@yahoo.de</email>
(C.D.);
<email>jaschelite@googlemail.com</email>
(J.R.);
<email>Fross@stud.uni-heidelberg.de</email>
(P.F.);
<email>hausmann@kip.uni-heidelberg.de</email>
(M.H.)</nlm:aff>
</affiliation>
<affiliation wicri:level="1">
<nlm:aff id="af2-genes-08-00122">Department of Radiation Oncology, Universitätsmedizin Mannheim, Medical Faculty Mannheim, Heidelberg University, Theodor-Kutzer-Ufer 1-3, 68167 Mannheim, Germany</nlm:aff>
<country xml:lang="fr">Allemagne</country>
<wicri:regionArea>Department of Radiation Oncology, Universitätsmedizin Mannheim, Medical Faculty Mannheim, Heidelberg University, Theodor-Kutzer-Ufer 1-3, 68167 Mannheim</wicri:regionArea>
</affiliation>
</author>
</titleStmt>
<publicationStmt>
<idno type="wicri:source">PMC</idno>
<idno type="pmid">28422050</idno>
<idno type="pmc">5406869</idno>
<idno type="url">http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5406869</idno>
<idno type="RBID">PMC:5406869</idno>
<idno type="doi">10.3390/genes8040122</idno>
<date when="2017">2017</date>
<idno type="wicri:Area/Pmc/Corpus">000C92</idno>
<idno type="wicri:explorRef" wicri:stream="Pmc" wicri:step="Corpus" wicri:corpus="PMC">000C92</idno>
<idno type="wicri:Area/Pmc/Curation">000C92</idno>
<idno type="wicri:explorRef" wicri:stream="Pmc" wicri:step="Curation">000C92</idno>
</publicationStmt>
<sourceDesc>
<biblStruct>
<analytic>
<title xml:lang="en" level="a" type="main">
<italic>K</italic>
-mer Content, Correlation, and Position Analysis of Genome DNA Sequences for the Identification of Function and Evolutionary Features</title>
<author>
<name sortKey="Sievers, Aaron" sort="Sievers, Aaron" uniqKey="Sievers A" first="Aaron" last="Sievers">Aaron Sievers</name>
<affiliation>
<nlm:aff id="af1-genes-08-00122">Kirchhoff-Institute for Physics, Heidelberg University, INF 227, 69117 Heidelberg, Germany;
<email>Sievers_Aaron@web.de</email>
(A.S.);
<email>KatharinaBosiek@gmx.de</email>
(K.B.);
<email>MarcBisch@gmx.de</email>
(M.B.);
<email>chrisdreessen@yahoo.de</email>
(C.D.);
<email>jaschelite@googlemail.com</email>
(J.R.);
<email>Fross@stud.uni-heidelberg.de</email>
(P.F.);
<email>hausmann@kip.uni-heidelberg.de</email>
(M.H.)</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Bosiek, Katharina" sort="Bosiek, Katharina" uniqKey="Bosiek K" first="Katharina" last="Bosiek">Katharina Bosiek</name>
<affiliation>
<nlm:aff id="af1-genes-08-00122">Kirchhoff-Institute for Physics, Heidelberg University, INF 227, 69117 Heidelberg, Germany;
<email>Sievers_Aaron@web.de</email>
(A.S.);
<email>KatharinaBosiek@gmx.de</email>
(K.B.);
<email>MarcBisch@gmx.de</email>
(M.B.);
<email>chrisdreessen@yahoo.de</email>
(C.D.);
<email>jaschelite@googlemail.com</email>
(J.R.);
<email>Fross@stud.uni-heidelberg.de</email>
(P.F.);
<email>hausmann@kip.uni-heidelberg.de</email>
(M.H.)</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Bisch, Marc" sort="Bisch, Marc" uniqKey="Bisch M" first="Marc" last="Bisch">Marc Bisch</name>
<affiliation>
<nlm:aff id="af1-genes-08-00122">Kirchhoff-Institute for Physics, Heidelberg University, INF 227, 69117 Heidelberg, Germany;
<email>Sievers_Aaron@web.de</email>
(A.S.);
<email>KatharinaBosiek@gmx.de</email>
(K.B.);
<email>MarcBisch@gmx.de</email>
(M.B.);
<email>chrisdreessen@yahoo.de</email>
(C.D.);
<email>jaschelite@googlemail.com</email>
(J.R.);
<email>Fross@stud.uni-heidelberg.de</email>
(P.F.);
<email>hausmann@kip.uni-heidelberg.de</email>
(M.H.)</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Dreessen, Chris" sort="Dreessen, Chris" uniqKey="Dreessen C" first="Chris" last="Dreessen">Chris Dreessen</name>
<affiliation>
<nlm:aff id="af1-genes-08-00122">Kirchhoff-Institute for Physics, Heidelberg University, INF 227, 69117 Heidelberg, Germany;
<email>Sievers_Aaron@web.de</email>
(A.S.);
<email>KatharinaBosiek@gmx.de</email>
(K.B.);
<email>MarcBisch@gmx.de</email>
(M.B.);
<email>chrisdreessen@yahoo.de</email>
(C.D.);
<email>jaschelite@googlemail.com</email>
(J.R.);
<email>Fross@stud.uni-heidelberg.de</email>
(P.F.);
<email>hausmann@kip.uni-heidelberg.de</email>
(M.H.)</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Riedel, Jascha" sort="Riedel, Jascha" uniqKey="Riedel J" first="Jascha" last="Riedel">Jascha Riedel</name>
<affiliation>
<nlm:aff id="af1-genes-08-00122">Kirchhoff-Institute for Physics, Heidelberg University, INF 227, 69117 Heidelberg, Germany;
<email>Sievers_Aaron@web.de</email>
(A.S.);
<email>KatharinaBosiek@gmx.de</email>
(K.B.);
<email>MarcBisch@gmx.de</email>
(M.B.);
<email>chrisdreessen@yahoo.de</email>
(C.D.);
<email>jaschelite@googlemail.com</email>
(J.R.);
<email>Fross@stud.uni-heidelberg.de</email>
(P.F.);
<email>hausmann@kip.uni-heidelberg.de</email>
(M.H.)</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Fro, Patrick" sort="Fro, Patrick" uniqKey="Fro P" first="Patrick" last="Fro">Patrick Froß</name>
<affiliation>
<nlm:aff id="af1-genes-08-00122">Kirchhoff-Institute for Physics, Heidelberg University, INF 227, 69117 Heidelberg, Germany;
<email>Sievers_Aaron@web.de</email>
(A.S.);
<email>KatharinaBosiek@gmx.de</email>
(K.B.);
<email>MarcBisch@gmx.de</email>
(M.B.);
<email>chrisdreessen@yahoo.de</email>
(C.D.);
<email>jaschelite@googlemail.com</email>
(J.R.);
<email>Fross@stud.uni-heidelberg.de</email>
(P.F.);
<email>hausmann@kip.uni-heidelberg.de</email>
(M.H.)</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Hausmann, Michael" sort="Hausmann, Michael" uniqKey="Hausmann M" first="Michael" last="Hausmann">Michael Hausmann</name>
<affiliation>
<nlm:aff id="af1-genes-08-00122">Kirchhoff-Institute for Physics, Heidelberg University, INF 227, 69117 Heidelberg, Germany;
<email>Sievers_Aaron@web.de</email>
(A.S.);
<email>KatharinaBosiek@gmx.de</email>
(K.B.);
<email>MarcBisch@gmx.de</email>
(M.B.);
<email>chrisdreessen@yahoo.de</email>
(C.D.);
<email>jaschelite@googlemail.com</email>
(J.R.);
<email>Fross@stud.uni-heidelberg.de</email>
(P.F.);
<email>hausmann@kip.uni-heidelberg.de</email>
(M.H.)</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Hildenbrand, Georg" sort="Hildenbrand, Georg" uniqKey="Hildenbrand G" first="Georg" last="Hildenbrand">Georg Hildenbrand</name>
<affiliation>
<nlm:aff id="af1-genes-08-00122">Kirchhoff-Institute for Physics, Heidelberg University, INF 227, 69117 Heidelberg, Germany;
<email>Sievers_Aaron@web.de</email>
(A.S.);
<email>KatharinaBosiek@gmx.de</email>
(K.B.);
<email>MarcBisch@gmx.de</email>
(M.B.);
<email>chrisdreessen@yahoo.de</email>
(C.D.);
<email>jaschelite@googlemail.com</email>
(J.R.);
<email>Fross@stud.uni-heidelberg.de</email>
(P.F.);
<email>hausmann@kip.uni-heidelberg.de</email>
(M.H.)</nlm:aff>
</affiliation>
<affiliation wicri:level="1">
<nlm:aff id="af2-genes-08-00122">Department of Radiation Oncology, Universitätsmedizin Mannheim, Medical Faculty Mannheim, Heidelberg University, Theodor-Kutzer-Ufer 1-3, 68167 Mannheim, Germany</nlm:aff>
<country xml:lang="fr">Allemagne</country>
<wicri:regionArea>Department of Radiation Oncology, Universitätsmedizin Mannheim, Medical Faculty Mannheim, Heidelberg University, Theodor-Kutzer-Ufer 1-3, 68167 Mannheim</wicri:regionArea>
</affiliation>
</author>
</analytic>
<series>
<title level="j">Genes</title>
<idno type="eISSN">2073-4425</idno>
<imprint>
<date when="2017">2017</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
</fileDesc>
<profileDesc>
<textClass></textClass>
</profileDesc>
</teiHeader>
<front>
<div type="abstract" xml:lang="en">
<p>In genome analysis,
<italic>k-mer</italic>
-based comparison methods have become standard tools. However, even though they are able to deliver reliable results, other algorithms seem to work better in some cases. To improve
<italic>k</italic>
-mer-based DNA sequence analysis and comparison, we successfully checked whether adding positional resolution is beneficial for finding and/or comparing interesting organizational structures. A simple but efficient algorithm for extracting and saving local
<italic>k</italic>
-mer spectra (frequency distribution of
<italic>k</italic>
-mers) was developed and used. The results were analyzed by including positional information based on visualizations as genomic maps and by applying basic vector correlation methods. This analysis was concentrated on small word lengths (1 ≤
<italic>k</italic>
≤ 4) on relatively small viral genomes of
<italic>Papillomaviridae</italic>
and
<italic>Herpesviridae</italic>
, while also checking its usability for larger sequences, namely human chromosome 2 and the homologous chromosomes (2A, 2B) of a chimpanzee. Using this alignment-free analysis, several regions with specific characteristics in
<italic>Papillomaviridae</italic>
and
<italic>Herpesviridae</italic>
formerly identified by independent, mostly alignment-based methods, were confirmed. Correlations between the
<italic>k</italic>
-mer content and several genes in these genomes have been found, showing similarities between classified and unclassified viruses, which may be potentially useful for further taxonomic research. Furthermore, unknown
<italic>k</italic>
-mer correlations in the genomes of Human Herpesviruses (HHVs), which are probably of major biological function, are found and described. Using the chromosomes of a chimpanzee and human that are currently known, identities between the species on every analyzed chromosome were reproduced. This demonstrates the feasibility of our approach for large data sets of complex genomes. Based on these results, we suggest
<italic>k</italic>
-mer analysis with positional resolution as a method for closing a gap between the effectiveness of alignment-based methods (like NCBI BLAST) and the high pace of standard
<italic>k</italic>
-mer analysis.</p>
</div>
</front>
<back>
<div1 type="bibliography">
<listBibl>
<biblStruct>
<analytic>
<author>
<name sortKey="Altschul, S F" uniqKey="Altschul S">S.F. Altschul</name>
</author>
<author>
<name sortKey="Gish, W" uniqKey="Gish W">W. Gish</name>
</author>
<author>
<name sortKey="Miller, W" uniqKey="Miller W">W. Miller</name>
</author>
<author>
<name sortKey="Myers, E W" uniqKey="Myers E">E.W. Myers</name>
</author>
<author>
<name sortKey="Lipman, D J" uniqKey="Lipman D">D.J. Lipman</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Chan, C X" uniqKey="Chan C">C.X. Chan</name>
</author>
<author>
<name sortKey="Ragan, M A" uniqKey="Ragan M">M.A. Ragan</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Alsop, E B" uniqKey="Alsop E">E.B. Alsop</name>
</author>
<author>
<name sortKey="Raymond, J" uniqKey="Raymond J">J. Raymond</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Brendel, V" uniqKey="Brendel V">V. Brendel</name>
</author>
<author>
<name sortKey="Beckmann, J S" uniqKey="Beckmann J">J.S. Beckmann</name>
</author>
<author>
<name sortKey="Trifonov, E N" uniqKey="Trifonov E">E.N. Trifonov</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Zhou, F" uniqKey="Zhou F">F. Zhou</name>
</author>
<author>
<name sortKey="Olman, V" uniqKey="Olman V">V. Olman</name>
</author>
<author>
<name sortKey="Xu, Y" uniqKey="Xu Y">Y. Xu</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Bultrini, E" uniqKey="Bultrini E">E. Bultrini</name>
</author>
<author>
<name sortKey="Pizzi, E" uniqKey="Pizzi E">E. Pizzi</name>
</author>
<author>
<name sortKey="Del Giudice, P" uniqKey="Del Giudice P">P. Del Giudice</name>
</author>
<author>
<name sortKey="Frontali, C" uniqKey="Frontali C">C. Frontali</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Pizzi, E" uniqKey="Pizzi E">E. Pizzi</name>
</author>
<author>
<name sortKey="Frontali, C" uniqKey="Frontali C">C. Frontali</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Hacker, J" uniqKey="Hacker J">J. Hacker</name>
</author>
<author>
<name sortKey="Kaper, J B" uniqKey="Kaper J">J.B. Kaper</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Navarre, W W" uniqKey="Navarre W">W.W. Navarre</name>
</author>
<author>
<name sortKey="Porwollik, S" uniqKey="Porwollik S">S. Porwollik</name>
</author>
<author>
<name sortKey="Wang, Y" uniqKey="Wang Y">Y. Wang</name>
</author>
<author>
<name sortKey="Mcclelland, M" uniqKey="Mcclelland M">M. McClelland</name>
</author>
<author>
<name sortKey="Rosen, H" uniqKey="Rosen H">H. Rosen</name>
</author>
<author>
<name sortKey="Libby, S J" uniqKey="Libby S">S.J. Libby</name>
</author>
<author>
<name sortKey="Fang, F C" uniqKey="Fang F">F.C. Fang</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Pizzi, E" uniqKey="Pizzi E">E. Pizzi</name>
</author>
<author>
<name sortKey="Frontali, C" uniqKey="Frontali C">C. Frontali</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Pozzoli, U" uniqKey="Pozzoli U">U. Pozzoli</name>
</author>
<author>
<name sortKey="Menozzi, G" uniqKey="Menozzi G">G. Menozzi</name>
</author>
<author>
<name sortKey="Fumagalli, M" uniqKey="Fumagalli M">M. Fumagalli</name>
</author>
<author>
<name sortKey="Cereda, M" uniqKey="Cereda M">M. Cereda</name>
</author>
<author>
<name sortKey="Comi, G P" uniqKey="Comi G">G.P. Comi</name>
</author>
<author>
<name sortKey="Cagliani, R" uniqKey="Cagliani R">R. Cagliani</name>
</author>
<author>
<name sortKey="Bresolin, N" uniqKey="Bresolin N">N. Bresolin</name>
</author>
<author>
<name sortKey="Sironi, M" uniqKey="Sironi M">M. Sironi</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Chae, H" uniqKey="Chae H">H. Chae</name>
</author>
<author>
<name sortKey="Jinwoo, P" uniqKey="Jinwoo P">P. Jinwoo</name>
</author>
<author>
<name sortKey="Seong Whan, L" uniqKey="Seong Whan L">L. Seong-Whan</name>
</author>
<author>
<name sortKey="Kenneth, P N" uniqKey="Kenneth P">P.N. Kenneth</name>
</author>
<author>
<name sortKey="Sun, K" uniqKey="Sun K">K. Sun</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Benson, D A" uniqKey="Benson D">D.A. Benson</name>
</author>
<author>
<name sortKey="Ilene, K M" uniqKey="Ilene K">K.M. Ilene</name>
</author>
<author>
<name sortKey="Lipman, D J" uniqKey="Lipman D">D.J. Lipman</name>
</author>
<author>
<name sortKey="Ostell, J" uniqKey="Ostell J">J. Ostell</name>
</author>
<author>
<name sortKey="Wheeler, D L" uniqKey="Wheeler D">D.L. Wheeler</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Pearson, K" uniqKey="Pearson K">K. Pearson</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Marcais, G" uniqKey="Marcais G">G. Marçais</name>
</author>
<author>
<name sortKey="Kingsford, C" uniqKey="Kingsford C">C. Kingsford</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Karlin, S" uniqKey="Karlin S">S. Karlin</name>
</author>
<author>
<name sortKey="Mrazek, J" uniqKey="Mrazek J">J. Mrázek</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Hunter, J D" uniqKey="Hunter J">J.D. Hunter</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Acland, A" uniqKey="Acland A">A. Acland</name>
</author>
<author>
<name sortKey="Agarwala, R" uniqKey="Agarwala R">R. Agarwala</name>
</author>
<author>
<name sortKey="Barrett, T" uniqKey="Barrett T">T. Barrett</name>
</author>
<author>
<name sortKey="Beck, J" uniqKey="Beck J">J. Beck</name>
</author>
<author>
<name sortKey="Benson, D A" uniqKey="Benson D">D.A. Benson</name>
</author>
<author>
<name sortKey="Bollin, C" uniqKey="Bollin C">C. Bollin</name>
</author>
<author>
<name sortKey="Bolton, E" uniqKey="Bolton E">E. Bolton</name>
</author>
<author>
<name sortKey="Bryant, S H" uniqKey="Bryant S">S.H. Bryant</name>
</author>
<author>
<name sortKey="Canese, K" uniqKey="Canese K">K. Canese</name>
</author>
<author>
<name sortKey="Church, D M" uniqKey="Church D">D.M. Church</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Zheng, Z M" uniqKey="Zheng Z">Z.M. Zheng</name>
</author>
<author>
<name sortKey="Baker, C C" uniqKey="Baker C">C.C. Baker</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Davison, A J" uniqKey="Davison A">A.J. Davison</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Elson, D" uniqKey="Elson D">D. Elson</name>
</author>
<author>
<name sortKey="Chargaff, E" uniqKey="Chargaff E">E. Chargaff</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Dominguez, G" uniqKey="Dominguez G">G. Dominguez</name>
</author>
<author>
<name sortKey="Dambaugh, T R" uniqKey="Dambaugh T">T.R. Dambaugh</name>
</author>
<author>
<name sortKey="Stamey, F R" uniqKey="Stamey F">F.R. Stamey</name>
</author>
<author>
<name sortKey="Dewhurst, S N" uniqKey="Dewhurst S">S.N. Dewhurst</name>
</author>
<author>
<name sortKey="Inoue, S" uniqKey="Inoue S">S. Inoue</name>
</author>
<author>
<name sortKey="Pellett, P E" uniqKey="Pellett P">P.E. Pellett</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Dolan, A" uniqKey="Dolan A">A. Dolan</name>
</author>
<author>
<name sortKey="Addison, C" uniqKey="Addison C">C. Addison</name>
</author>
<author>
<name sortKey="Gatherer, D" uniqKey="Gatherer D">D. Gatherer</name>
</author>
<author>
<name sortKey="Davison, A J" uniqKey="Davison A">A.J. Davison</name>
</author>
<author>
<name sortKey="Mcgeoch, D J" uniqKey="Mcgeoch D">D.J. McGeoch</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Megaw, A G" uniqKey="Megaw A">A.G. Megaw</name>
</author>
<author>
<name sortKey="Rapaport, D" uniqKey="Rapaport D">D. Rapaport</name>
</author>
<author>
<name sortKey="Avidor, B" uniqKey="Avidor B">B. Avidor</name>
</author>
<author>
<name sortKey="Frenkel, N" uniqKey="Frenkel N">N. Frenkel</name>
</author>
<author>
<name sortKey="Davison, A J" uniqKey="Davison A">A.J. Davison</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Yunis, J J" uniqKey="Yunis J">J.J. Yunis</name>
</author>
<author>
<name sortKey="Sawyer, J R" uniqKey="Sawyer J">J.R. Sawyer</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Pratas, D" uniqKey="Pratas D">D. Pratas</name>
</author>
<author>
<name sortKey="Silva, R M" uniqKey="Silva R">R.M. Silva</name>
</author>
<author>
<name sortKey="Pinho, A J" uniqKey="Pinho A">A.J. Pinho</name>
</author>
<author>
<name sortKey="Ferreira, P J S G" uniqKey="Ferreira P">P.J.S.G. Ferreira</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Winzeler, E A" uniqKey="Winzeler E">E.A. Winzeler</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Hoelzer, K" uniqKey="Hoelzer K">K. Hoelzer</name>
</author>
<author>
<name sortKey="Shackelton, L A" uniqKey="Shackelton L">L.A. Shackelton</name>
</author>
<author>
<name sortKey="Parrish, C R" uniqKey="Parrish C">C.R. Parrish</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Clay, O" uniqKey="Clay O">O. Clay</name>
</author>
<author>
<name sortKey="Caccio, S" uniqKey="Caccio S">S. Caccio</name>
</author>
<author>
<name sortKey="Zoubak, S" uniqKey="Zoubak S">S. Zoubak</name>
</author>
<author>
<name sortKey="Mouchiroud, D" uniqKey="Mouchiroud D">D. Mouchiroud</name>
</author>
<author>
<name sortKey="Bernardi, G" uniqKey="Bernardi G">G. Bernardi</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Duret, L" uniqKey="Duret L">L. Duret</name>
</author>
<author>
<name sortKey="Mouchiroud, D" uniqKey="Mouchiroud D">D. Mouchiroud</name>
</author>
<author>
<name sortKey="Gautier, C" uniqKey="Gautier C">C. Gautier</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Fullerton, S M" uniqKey="Fullerton S">S.M. Fullerton</name>
</author>
<author>
<name sortKey="Carvalho, A B" uniqKey="Carvalho A">A.B. Carvalho</name>
</author>
<author>
<name sortKey="Clark, A G" uniqKey="Clark A">A.G. Clark</name>
</author>
</analytic>
</biblStruct>
</listBibl>
</div1>
</back>
</TEI>
<pmc article-type="research-article">
<pmc-dir>properties open_access</pmc-dir>
<front>
<journal-meta>
<journal-id journal-id-type="nlm-ta">Genes (Basel)</journal-id>
<journal-id journal-id-type="iso-abbrev">Genes (Basel)</journal-id>
<journal-id journal-id-type="publisher-id">genes</journal-id>
<journal-title-group>
<journal-title>Genes</journal-title>
</journal-title-group>
<issn pub-type="epub">2073-4425</issn>
<publisher>
<publisher-name>MDPI</publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="pmid">28422050</article-id>
<article-id pub-id-type="pmc">5406869</article-id>
<article-id pub-id-type="doi">10.3390/genes8040122</article-id>
<article-id pub-id-type="publisher-id">genes-08-00122</article-id>
<article-categories>
<subj-group subj-group-type="heading">
<subject>Article</subject>
</subj-group>
</article-categories>
<title-group>
<article-title>
<italic>K</italic>
-mer Content, Correlation, and Position Analysis of Genome DNA Sequences for the Identification of Function and Evolutionary Features</article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author">
<name>
<surname>Sievers</surname>
<given-names>Aaron</given-names>
</name>
<xref ref-type="aff" rid="af1-genes-08-00122">1</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Bosiek</surname>
<given-names>Katharina</given-names>
</name>
<xref ref-type="aff" rid="af1-genes-08-00122">1</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Bisch</surname>
<given-names>Marc</given-names>
</name>
<xref ref-type="aff" rid="af1-genes-08-00122">1</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Dreessen</surname>
<given-names>Chris</given-names>
</name>
<xref ref-type="aff" rid="af1-genes-08-00122">1</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Riedel</surname>
<given-names>Jascha</given-names>
</name>
<xref ref-type="aff" rid="af1-genes-08-00122">1</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Froß</surname>
<given-names>Patrick</given-names>
</name>
<xref ref-type="aff" rid="af1-genes-08-00122">1</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Hausmann</surname>
<given-names>Michael</given-names>
</name>
<xref ref-type="aff" rid="af1-genes-08-00122">1</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Hildenbrand</surname>
<given-names>Georg</given-names>
</name>
<xref ref-type="aff" rid="af1-genes-08-00122">1</xref>
<xref ref-type="aff" rid="af2-genes-08-00122">2</xref>
<xref rid="c1-genes-08-00122" ref-type="corresp">*</xref>
</contrib>
</contrib-group>
<contrib-group>
<contrib contrib-type="editor">
<name>
<surname>Corominas</surname>
<given-names>Montserrat</given-names>
</name>
<role>Academic Editor</role>
</contrib>
</contrib-group>
<aff id="af1-genes-08-00122">
<label>1</label>
Kirchhoff-Institute for Physics, Heidelberg University, INF 227, 69117 Heidelberg, Germany;
<email>Sievers_Aaron@web.de</email>
(A.S.);
<email>KatharinaBosiek@gmx.de</email>
(K.B.);
<email>MarcBisch@gmx.de</email>
(M.B.);
<email>chrisdreessen@yahoo.de</email>
(C.D.);
<email>jaschelite@googlemail.com</email>
(J.R.);
<email>Fross@stud.uni-heidelberg.de</email>
(P.F.);
<email>hausmann@kip.uni-heidelberg.de</email>
(M.H.)</aff>
<aff id="af2-genes-08-00122">
<label>2</label>
Department of Radiation Oncology, Universitätsmedizin Mannheim, Medical Faculty Mannheim, Heidelberg University, Theodor-Kutzer-Ufer 1-3, 68167 Mannheim, Germany</aff>
<author-notes>
<corresp id="c1-genes-08-00122">
<label>*</label>
Correspondence:
<email>hilden@kip.uni-heidelberg.de</email>
; Tel.: +49-151-559-63919</corresp>
</author-notes>
<pub-date pub-type="epub">
<day>19</day>
<month>4</month>
<year>2017</year>
</pub-date>
<pub-date pub-type="collection">
<month>4</month>
<year>2017</year>
</pub-date>
<volume>8</volume>
<issue>4</issue>
<elocation-id>122</elocation-id>
<history>
<date date-type="received">
<day>10</day>
<month>2</month>
<year>2017</year>
</date>
<date date-type="accepted">
<day>04</day>
<month>4</month>
<year>2017</year>
</date>
</history>
<permissions>
<copyright-statement>© 2017 by the authors.</copyright-statement>
<copyright-year>2017</copyright-year>
<license license-type="open-access">
<license-p>Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (
<ext-link ext-link-type="uri" xlink:href="http://creativecommons.org/licenses/by/4.0/">http://creativecommons.org/licenses/by/4.0/</ext-link>
).</license-p>
</license>
</permissions>
<abstract>
<p>In genome analysis,
<italic>k</italic>
-mer-based comparison methods have become standard tools. However, even though they are able to deliver reliable results, other algorithms seem to work better in some cases. To improve
<italic>k</italic>
-mer-based DNA sequence analysis and comparison, we successfully checked whether adding positional resolution is beneficial for finding and/or comparing interesting organizational structures. A simple but efficient algorithm for extracting and saving local
<italic>k</italic>
-mer spectra (frequency distribution of
<italic>k</italic>
-mers) was developed and used. The results were analyzed by including positional information based on visualizations as genomic maps and by applying basic vector correlation methods. This analysis was concentrated on small word lengths (1 ≤
<italic>k</italic>
≤ 4) on relatively small viral genomes of
<italic>Papillomaviridae</italic>
and
<italic>Herpesviridae</italic>
, while also checking its usability for larger sequences, namely human chromosome 2 and the homologous chromosomes (2A, 2B) of a chimpanzee. Using this alignment-free analysis, several regions with specific characteristics in
<italic>Papillomaviridae</italic>
and
<italic>Herpesviridae</italic>
formerly identified by independent, mostly alignment-based methods, were confirmed. Correlations between the
<italic>k</italic>
-mer content and several genes in these genomes have been found, showing similarities between classified and unclassified viruses, which may be useful for further taxonomic research. Furthermore, previously unknown
<italic>k</italic>
-mer correlations in the genomes of Human Herpesviruses (HHVs), which probably have a major biological function, were found and described. Using the currently known chromosome sequences of chimpanzee and human, identities between the species on every analyzed chromosome were reproduced. This demonstrates the feasibility of our approach for large data sets of complex genomes. Based on these results, we suggest
<italic>k</italic>
-mer analysis with positional resolution as a method for closing a gap between the effectiveness of alignment-based methods (like NCBI BLAST) and the high pace of standard
<italic>k</italic>
-mer analysis.</p>
</abstract>
<kwd-group>
<kwd>
<italic>k</italic>
-mer</kwd>
<kwd>
<italic>k</italic>
-mer analysis</kwd>
<kwd>sequence analysis</kwd>
<kwd>alignment-free</kwd>
<kwd>positional features</kwd>
</kwd-group>
</article-meta>
</front>
<floats-group>
<fig id="genes-08-00122-f001" orientation="portrait" position="float">
<label>Figure 1</label>
<caption>
<p>Map of Human Papillomaviruses (HPVs) for
<italic>k</italic>
= 1. From left to right: HPV4, 5, 7, 9, 49, 92, 96, 136, 140, 154, 178. A bin width of 100 bp was used. Genes were represented by colored bars at the right side of the linear representation of the circular HPV genomes (E1 red, E2 blue, E4 green, E5 yellow, E6 orange, E7 purple, L1 magenta, L2 grey). The orange boxes indicate the boundaries of the three regions with different
<italic>k</italic>
-mer structures (the regions above and below the middle region in the box do not have their own boxes for easier readability).</p>
</caption>
<graphic xlink:href="genes-08-00122-g001"></graphic>
</fig>
<fig id="genes-08-00122-f002" orientation="portrait" position="float">
<label>Figure 2</label>
<caption>
<p>Histogram of monomer contents of different regions relative to the monomer content of the whole genome of HPV4. The top region (
<bold>red</bold>
) is associated with the genes E1, E6, and E7. The central region (
<bold>blue</bold>
) is associated with E2, E4, and E5. The E region (
<bold>purple</bold>
) is the region covered by all early genes. The bottom/L region (
<bold>magenta</bold>
) is associated with the late genes (L1, L2). The two NC regions (
<bold>green</bold>
) on the left and right are the non-coding regions at the top and bottom of the linear representation used in
<xref ref-type="fig" rid="genes-08-00122-f001">Figure 1</xref>
, respectively.</p>
</caption>
<graphic xlink:href="genes-08-00122-g002"></graphic>
</fig>
<fig id="genes-08-00122-f003" orientation="portrait" position="float">
<label>Figure 3</label>
<caption>
<p>Correlation heatmaps between HPV4 and 5 for
<italic>k</italic>
= 1 (
<bold>A</bold>
) and
<italic>k</italic>
= 4 (
<bold>B</bold>
) (bin width of 100 bp). The colored bars at the edges indicate the locations of the top region (
<bold>red</bold>
), central region (
<bold>blue</bold>
), bottom region (
<bold>magenta</bold>
), and NC region (
<bold>green</bold>
) according to gene annotation borders (not by
<italic>k</italic>
-mer content).</p>
</caption>
<graphic xlink:href="genes-08-00122-g003"></graphic>
</fig>
<fig id="genes-08-00122-f004" orientation="portrait" position="float">
<label>Figure 4</label>
<caption>
<p>Correlation heatmaps of different HPV types for different
<italic>k</italic>
values (
<italic>k</italic>
= 1 in
<bold>A</bold>
+
<bold>C</bold>
,
<italic>k</italic>
= 4 in
<bold>B</bold>
+
<bold>D</bold>
), with a bin width of 100 bp. The colored bars at the edges indicate the positions of the top region (red), central region (blue), bottom region (magenta), and NC region (green) according to the borders of associated genes (not by
<italic>k</italic>
-mer content). The peculiar structure of the NC region of HPV 7 is highlighted with a green border in (
<bold>A</bold>
,
<italic>k</italic>
= 1). The linear structures between HPV4 and HPV136 are considered as “strong” between HPV7 and HPV4, and are “weak” between HPV7 and HPV136.</p>
</caption>
<graphic xlink:href="genes-08-00122-g004"></graphic>
</fig>
<fig id="genes-08-00122-f005" orientation="portrait" position="float">
<label>Figure 5</label>
<caption>
<p>Heatmap Summary Images of HPV Regions. Shown are the summaries of some heatmaps (
<italic>k</italic>
= 1) of the three regions defined by different
<italic>k</italic>
-mer contents for all of the HPV types analyzed (positions of regions can be found in
<xref ref-type="app" rid="app1-genes-08-00122">Table S1</xref>
). (
<bold>A</bold>
) Good correlation within the first region between all HPV types; (
<bold>B</bold>
) Poor correlation (values around zero or lower) between the first and second region; (
<bold>C</bold>
) Relatively good correlation in the second region for most values; (
<bold>D</bold>
) Good correlation amongst all of the third regions of HPV. The data points are equally spaced and sorted numerically, therefore no additional information is provided by their horizontal alignment. The labels next to the data points indicate the corresponding HPV types whose regions were correlated.</p>
</caption>
<graphic xlink:href="genes-08-00122-g005"></graphic>
</fig>
<fig id="genes-08-00122-f006" orientation="portrait" position="float">
<label>Figure 6</label>
<caption>
<p>Map of Human Herpesvirus (HHV) genomes for
<italic>k</italic>
= 1, with a bin width of 500 bp. From left to right: HHV1, 2, 3, 5, 6A, 6B, 7, 4 type 1, 4 type 2, 8. Genes associated with low conserved regions on HHV6A, 6B, 4 type 1, and 4 type 2 were visualized with gray bars at the side of the genomes. At the right bottom corner, a small region around the EBNA genes for both HHV4 species is shown to illustrate small differences in the local
<italic>k</italic>
-mer structure.</p>
</caption>
<graphic xlink:href="genes-08-00122-g006"></graphic>
</fig>
<fig id="genes-08-00122-f007" orientation="portrait" position="float">
<label>Figure 7</label>
<caption>
<p>Map of HHV genomes for relative
<italic>k</italic>
= 2, with a bin width of 500 bp. From left to right: HHV1, 2, 3, 5, 6A, 6B, 7, 4 type 1, 4 type 2, 8. Peculiar patterns associated with high C or C/G content for
<italic>k</italic>
= 1 are marked orange, and the iterative structure at the top of 6A and 6B is marked green.</p>
</caption>
<graphic xlink:href="genes-08-00122-g007"></graphic>
</fig>
<fig id="genes-08-00122-f008" orientation="portrait" position="float">
<label>Figure 8</label>
<caption>
<p>Correlation heatmap between HHV4 type 1 and HHV4 type 2 (
<italic>k</italic>
= 1 and bin width of 500 bp). The gray bars indicate genes associated with regions of low conservation derived with alignment methods. Genes with an * were not found for the sequences used in our analysis. Therefore, their positions are only approximated by using the data from [
<xref rid="B23-genes-08-00122" ref-type="bibr">23</xref>
].</p>
</caption>
<graphic xlink:href="genes-08-00122-g008"></graphic>
</fig>
<fig id="genes-08-00122-f009" orientation="portrait" position="float">
<label>Figure 9</label>
<caption>
<p>Correlation heatmap between HHV6A and 6B for
<italic>k</italic>
= 1 (
<bold>A</bold>
) and relative
<italic>k</italic>
= 2 (
<bold>B</bold>
) (bin width 500 bp). Genes are represented by grey bars at the borders. Regions with a low identity score in [
<xref rid="B22-genes-08-00122" ref-type="bibr">22</xref>
] are marked orange, extremely low values are red, high identity scores are blue, and extremely high values are green.</p>
</caption>
<graphic xlink:href="genes-08-00122-g009"></graphic>
</fig>
<fig id="genes-08-00122-f010" orientation="portrait" position="float">
<label>Figure 10</label>
<caption>
<p>Correlation heatmap between HHV6A with HHV7 (
<bold>A</bold>
) and HHV6B with HHV7 (
<bold>B</bold>
) based on
<italic>k</italic>
= 1 (bin width 500 bp).</p>
</caption>
<graphic xlink:href="genes-08-00122-g010"></graphic>
</fig>
<fig id="genes-08-00122-f011" orientation="portrait" position="float">
<label>Figure 11</label>
<caption>
<p>Correlation heatmap between HHV7 with itself (
<bold>A</bold>
) and HHV6A with HHV6B (
<bold>B</bold>
) based on relative <italic>k</italic> = 4 (bin width 500 bp). Beginning and ending regions are highlighted with green boxes.</p>
</caption>
<graphic xlink:href="genes-08-00122-g011"></graphic>
</fig>
<fig id="genes-08-00122-f012" orientation="portrait" position="float">
<label>Figure 12</label>
<caption>
<p>Correlation heatmap between
<italic>Homo sapiens</italic>
chromosome 2 (HSc2) and
<italic>Pan troglodytes</italic>
chromosome 2A (PTc2A) and 2B. Based on
<italic>k</italic>
= 4 (bin width 2.5 Mbp) with a highly increased threshold (see scale on the right). The colored boxes indicate regions of association between the chromosomes.</p>
</caption>
<graphic xlink:href="genes-08-00122-g012"></graphic>
</fig>
<table-wrap id="genes-08-00122-t001" orientation="portrait" position="float">
<object-id pub-id-type="pii">genes-08-00122-t001_Table 1</object-id>
<label>Table 1</label>
<caption>
<p>Accession numbers of DNA sequences used.</p>
</caption>
<table frame="hsides" rules="groups">
<thead>
<tr>
<th align="center" valign="middle" style="border-top:solid thin;border-bottom:solid thin" rowspan="1" colspan="1">Species</th>
<th align="center" valign="middle" style="border-top:solid thin;border-bottom:solid thin" rowspan="1" colspan="1">Accession Number</th>
</tr>
</thead>
<tbody>
<tr>
<td align="center" valign="middle" rowspan="1" colspan="1">Human Papillomavirus 4 (HPV4)</td>
<td align="center" valign="middle" rowspan="1" colspan="1">NC_001457.1</td>
</tr>
<tr>
<td align="center" valign="middle" rowspan="1" colspan="1">Human Papillomavirus 5 (HPV5)</td>
<td align="center" valign="middle" rowspan="1" colspan="1">NC_001531.1</td>
</tr>
<tr>
<td align="center" valign="middle" rowspan="1" colspan="1">Human Papillomavirus 7 (HPV7)</td>
<td align="center" valign="middle" rowspan="1" colspan="1">NC_001595.1</td>
</tr>
<tr>
<td align="center" valign="middle" rowspan="1" colspan="1">Human Papillomavirus 9 (HPV9)</td>
<td align="center" valign="middle" rowspan="1" colspan="1">NC_001596.1</td>
</tr>
<tr>
<td align="center" valign="middle" rowspan="1" colspan="1">Human Papillomavirus 49 (HPV49)</td>
<td align="center" valign="middle" rowspan="1" colspan="1">NC_001591.1</td>
</tr>
<tr>
<td align="center" valign="middle" rowspan="1" colspan="1">Human Papillomavirus 92 (HPV92)</td>
<td align="center" valign="middle" rowspan="1" colspan="1">NC_004500.1</td>
</tr>
<tr>
<td align="center" valign="middle" rowspan="1" colspan="1">Human Papillomavirus 96 (HPV96)</td>
<td align="center" valign="middle" rowspan="1" colspan="1">NC_005134.2</td>
</tr>
<tr>
<td align="center" valign="middle" rowspan="1" colspan="1">Human Papillomavirus 136 (HPV136)</td>
<td align="center" valign="middle" rowspan="1" colspan="1">NC_017994.1</td>
</tr>
<tr>
<td align="center" valign="middle" rowspan="1" colspan="1">Human Papillomavirus 140 (HPV140)</td>
<td align="center" valign="middle" rowspan="1" colspan="1">NC_017996.1</td>
</tr>
<tr>
<td align="center" valign="middle" rowspan="1" colspan="1">Human Papillomavirus 154 (HPV154)</td>
<td align="center" valign="middle" rowspan="1" colspan="1">NC_021483.1</td>
</tr>
<tr>
<td align="center" valign="middle" rowspan="1" colspan="1">Human Papillomavirus 178 (HPV178)</td>
<td align="center" valign="middle" rowspan="1" colspan="1">NC_023891.1</td>
</tr>
<tr>
<td align="center" valign="middle" rowspan="1" colspan="1">Human Herpesvirus 1 (HHV1)</td>
<td align="center" valign="middle" rowspan="1" colspan="1">NC_001806.2</td>
</tr>
<tr>
<td align="center" valign="middle" rowspan="1" colspan="1">Human Herpesvirus 2 (HHV2)</td>
<td align="center" valign="middle" rowspan="1" colspan="1">NC_001798.2</td>
</tr>
<tr>
<td align="center" valign="middle" rowspan="1" colspan="1">Human Herpesvirus 3 (HHV3)</td>
<td align="center" valign="middle" rowspan="1" colspan="1">NC_001348.1</td>
</tr>
<tr>
<td align="center" valign="middle" rowspan="1" colspan="1">Human Herpesvirus 4 type 1 (HHV4 type 1)</td>
<td align="center" valign="middle" rowspan="1" colspan="1">NC_007605.1</td>
</tr>
<tr>
<td align="center" valign="middle" rowspan="1" colspan="1">Human Herpesvirus 4 type 2 (HHV4 type 2)</td>
<td align="center" valign="middle" rowspan="1" colspan="1">NC_009334.1</td>
</tr>
<tr>
<td align="center" valign="middle" rowspan="1" colspan="1">Human Herpesvirus 5 (HHV5)</td>
<td align="center" valign="middle" rowspan="1" colspan="1">NC_006273.2</td>
</tr>
<tr>
<td align="center" valign="middle" rowspan="1" colspan="1">Human Herpesvirus 6A (HHV6A)</td>
<td align="center" valign="middle" rowspan="1" colspan="1">NC_001664.2</td>
</tr>
<tr>
<td align="center" valign="middle" rowspan="1" colspan="1">Human Herpesvirus 6B (HHV6B)</td>
<td align="center" valign="middle" rowspan="1" colspan="1">NC_000898.1</td>
</tr>
<tr>
<td align="center" valign="middle" rowspan="1" colspan="1">Human Herpesvirus 7 (HHV7)</td>
<td align="center" valign="middle" rowspan="1" colspan="1">NC_001716.2</td>
</tr>
<tr>
<td align="center" valign="middle" rowspan="1" colspan="1">Human Herpesvirus 8 (HHV8)</td>
<td align="center" valign="middle" rowspan="1" colspan="1">NC_009333.1</td>
</tr>
<tr>
<td align="center" valign="middle" rowspan="1" colspan="1">
<italic>Homo sapiens</italic>
Chromosome 2</td>
<td align="center" valign="middle" rowspan="1" colspan="1">NC_000002.12</td>
</tr>
<tr>
<td align="center" valign="middle" rowspan="1" colspan="1">
<italic>Pan troglodytes</italic>
Chromosome 2A</td>
<td align="center" valign="middle" rowspan="1" colspan="1">NC_006469.3</td>
</tr>
<tr>
<td align="center" valign="middle" style="border-bottom:solid thin" rowspan="1" colspan="1">
<italic>Pan troglodytes</italic>
Chromosome 2B</td>
<td align="center" valign="middle" style="border-bottom:solid thin" rowspan="1" colspan="1">NC_006470.3</td>
</tr>
</tbody>
</table>
</table-wrap>
</floats-group>
</pmc>
</record>
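
The abstract of this record describes extracting local k-mer spectra (the frequency distribution of k-mers per bin) and comparing them with basic vector correlation methods. The following is only a minimal sketch of that idea, not the authors' implementation. It assumes non-overlapping bins, k-mers counted strictly inside each bin, a dropped trailing partial bin, and Pearson correlation as the vector correlation measure. The function names `local_kmer_spectra`, `pearson`, and `correlation_matrix` are hypothetical.

```python
from itertools import product
from math import sqrt


def local_kmer_spectra(seq, k, bin_width):
    """Return one normalized k-mer frequency vector per non-overlapping bin.

    Assumption: words containing characters other than A, C, G, T are skipped,
    and a trailing bin shorter than bin_width is dropped.
    """
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    index = {kmer: i for i, kmer in enumerate(kmers)}
    spectra = []
    for start in range(0, len(seq) - bin_width + 1, bin_width):
        window = seq[start:start + bin_width]
        counts = [0] * len(kmers)
        for i in range(len(window) - k + 1):
            j = index.get(window[i:i + k])
            if j is not None:
                counts[j] += 1
        total = sum(counts)
        spectra.append([c / total if total else 0.0 for c in counts])
    return spectra


def pearson(x, y):
    """Pearson correlation of two equal-length vectors (0.0 if one is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y) if norm_x and norm_y else 0.0


def correlation_matrix(spectra_a, spectra_b):
    """Correlate every bin of one sequence with every bin of another."""
    return [[pearson(a, b) for b in spectra_b] for a in spectra_a]
```

Under these assumptions, a call such as `correlation_matrix(local_kmer_spectra(seq_a, 1, 100), local_kmer_spectra(seq_b, 1, 100))` yields a bin-by-bin matrix of the kind that could be drawn as a correlation heatmap. The bin width of 100 bp matches the HPV figure captions, while `seq_a` and `seq_b` are placeholder sequence strings.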

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Sante/explor/MersV1/Data/Pmc/Curation
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000C92 | SxmlIndent | more

Or

HfdSelect -h $EXPLOR_AREA/Data/Pmc/Curation/biblio.hfd -nk 000C92 | SxmlIndent | more

To add a link to this page within the Wicri network

{{Explor lien
   |wiki=    Sante
   |area=    MersV1
   |flux=    Pmc
   |étape=   Curation
   |type=    RBID
   |clé=     PMC:5406869
   |texte=   K-mer Content, Correlation, and Position Analysis of Genome DNA Sequences for the Identification of Function and Evolutionary Features
}}

To generate wiki pages

HfdIndexSelect -h $EXPLOR_AREA/Data/Pmc/Curation/RBID.i   -Sk "pubmed:28422050" \
       | HfdSelect -Kh $EXPLOR_AREA/Data/Pmc/Curation/biblio.hfd   \
       | NlmPubMed2Wicri -a MersV1 

Wicri

This area was generated with Dilib version V0.6.33.
Data generation: Mon Apr 20 23:26:43 2020. Site generation: Sat Mar 27 09:06:09 2021