K-mer-Based Motif Analysis in Insect Species across Anopheles, Drosophila, and Glossina Genera and Its Application to Species Classification
Internal identifier: 000510 (Main/Exploration); previous: 000509; next: 000511
Authors: Matyas Cserhati [United States]; Peng Xiao [United States]; Chittibabu Guda [United States]
Source:
- Computational and Mathematical Methods in Medicine [ 1748-670X ] ; 2019.
Abstract
Short k-mer sequences from DNA are both conserved and diverged across species owing to their functional significance in speciation, which enables their use in many species classification algorithms. In the present study, we developed a methodology to analyze the DNA k-mers of whole genome, 5′ UTR, intron, and 3′ UTR regions from 58 insect species belonging to three genera of Diptera that include Anopheles, Drosophila, and Glossina.
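To make the idea of comparing species by k-mer content concrete, here is a minimal Python sketch. It is not the authors' scoring or normalization scheme, which this record does not specify. It only counts overlapping k-mers in a sequence and computes a Pearson correlation between two k-mer count profiles, in the spirit of the whole-genome k-mer content correlation mentioned in the full abstract of the record. The function names `kmer_counts` and `pearson`, and the choice to correlate over the union of observed k-mers rather than all 4^k possible k-mers, are assumptions made for illustration.

```python
from collections import Counter
from math import sqrt


def kmer_counts(seq, k):
    """Count overlapping k-mers, skipping windows with non-ACGT characters."""
    seq = seq.upper()
    counts = Counter()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if set(kmer) <= set("ACGT"):
            counts[kmer] += 1
    return counts


def pearson(counts_a, counts_b):
    """Pearson correlation of two k-mer profiles.

    Assumption: the profiles are compared over the union of k-mers
    observed in either input, with missing k-mers counted as 0.
    Returns 0.0 when the correlation is undefined (no k-mers, or a
    profile with zero variance).
    """
    keys = sorted(set(counts_a) | set(counts_b))
    n = len(keys)
    if n == 0:
        return 0.0
    xs = [counts_a.get(key, 0) for key in keys]
    ys = [counts_b.get(key, 0) for key in keys]
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


if __name__ == "__main__":
    # Toy sequences only; real input would be genome-scale FASTA data.
    profile_a = kmer_counts("ACGTACGTTTGACCA", 3)
    profile_b = kmer_counts("ACGTACGATTGACCT", 3)
    print(pearson(profile_a, profile_b))
```

In practice one would run such a comparison for every pair of species to obtain an all-to-all similarity matrix. The k-mer lengths of 7 to 9 bp named in the abstract would replace the toy value of 3 used here.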
DOI: 10.1155/2019/4259479
PubMed: 31827584
PubMed Central: 6881769
Links to previous steps (curation, corpus...)
- to stream Pmc, to step Corpus: 000B95
- to stream Pmc, to step Curation: 000B95
- to stream Pmc, to step Checkpoint: 000263
- to stream PubMed, to step Corpus: 000334
- to stream PubMed, to step Curation: 000334
- to stream PubMed, to step Checkpoint: 000505
- to stream Ncbi, to step Merge: 002430
- to stream Ncbi, to step Curation: 002430
- to stream Ncbi, to step Checkpoint: 002430
- to stream Main, to step Merge: 000513
- to stream Main, to step Curation: 000510
The document in XML format
<record><TEI><teiHeader><fileDesc><titleStmt><title xml:lang="en">K-mer-Based Motif Analysis in Insect Species across <italic>Anopheles</italic>
, <italic>Drosophila</italic>
, and <italic>Glossina</italic>
Genera and Its Application to Species Classification</title>
<author><name sortKey="Cserhati, Matyas" sort="Cserhati, Matyas" uniqKey="Cserhati M" first="Matyas" last="Cserhati">Matyas Cserhati</name>
<affiliation wicri:level="2"><nlm:aff id="I1">Department of Genetics, Cell Biology &amp; Anatomy, University of Nebraska Medical Center, Omaha, NE 68198, USA</nlm:aff>
<country xml:lang="fr">États-Unis</country>
<wicri:regionArea>Department of Genetics, Cell Biology &amp; Anatomy, University of Nebraska Medical Center, Omaha, NE 68198</wicri:regionArea>
<placeName><region type="state">Nebraska</region>
</placeName>
</affiliation>
</author>
<author><name sortKey="Xiao, Peng" sort="Xiao, Peng" uniqKey="Xiao P" first="Peng" last="Xiao">Peng Xiao</name>
<affiliation wicri:level="2"><nlm:aff id="I1">Department of Genetics, Cell Biology &amp; Anatomy, University of Nebraska Medical Center, Omaha, NE 68198, USA</nlm:aff>
<country xml:lang="fr">États-Unis</country>
<wicri:regionArea>Department of Genetics, Cell Biology &amp; Anatomy, University of Nebraska Medical Center, Omaha, NE 68198</wicri:regionArea>
<placeName><region type="state">Nebraska</region>
</placeName>
</affiliation>
</author>
<author><name sortKey="Guda, Chittibabu" sort="Guda, Chittibabu" uniqKey="Guda C" first="Chittibabu" last="Guda">Chittibabu Guda</name>
<affiliation wicri:level="2"><nlm:aff id="I1">Department of Genetics, Cell Biology &amp; Anatomy, University of Nebraska Medical Center, Omaha, NE 68198, USA</nlm:aff>
<country xml:lang="fr">États-Unis</country>
<wicri:regionArea>Department of Genetics, Cell Biology &amp; Anatomy, University of Nebraska Medical Center, Omaha, NE 68198</wicri:regionArea>
<placeName><region type="state">Nebraska</region>
</placeName>
</affiliation>
</author>
</titleStmt>
<publicationStmt><idno type="wicri:source">PMC</idno>
<idno type="pmid">31827584</idno>
<idno type="pmc">6881769</idno>
<idno type="url">http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6881769</idno>
<idno type="RBID">PMC:6881769</idno>
<idno type="doi">10.1155/2019/4259479</idno>
<date when="2019">2019</date>
<idno type="wicri:Area/Pmc/Corpus">000B95</idno>
<idno type="wicri:explorRef" wicri:stream="Pmc" wicri:step="Corpus" wicri:corpus="PMC">000B95</idno>
<idno type="wicri:Area/Pmc/Curation">000B95</idno>
<idno type="wicri:explorRef" wicri:stream="Pmc" wicri:step="Curation">000B95</idno>
<idno type="wicri:Area/Pmc/Checkpoint">000263</idno>
<idno type="wicri:explorRef" wicri:stream="Pmc" wicri:step="Checkpoint">000263</idno>
<idno type="wicri:source">PubMed</idno>
<idno type="RBID">pubmed:31827584</idno>
<idno type="wicri:Area/PubMed/Corpus">000334</idno>
<idno type="wicri:explorRef" wicri:stream="PubMed" wicri:step="Corpus" wicri:corpus="PubMed">000334</idno>
<idno type="wicri:Area/PubMed/Curation">000334</idno>
<idno type="wicri:explorRef" wicri:stream="PubMed" wicri:step="Curation">000334</idno>
<idno type="wicri:Area/PubMed/Checkpoint">000505</idno>
<idno type="wicri:explorRef" wicri:stream="PubMed" wicri:step="Checkpoint">000505</idno>
<idno type="wicri:Area/Ncbi/Merge">002430</idno>
<idno type="wicri:Area/Ncbi/Curation">002430</idno>
<idno type="wicri:Area/Ncbi/Checkpoint">002430</idno>
<idno type="wicri:doubleKey">1748-670X:2019:Cserhati M:k:mer:based</idno>
<idno type="wicri:Area/Main/Merge">000513</idno>
<idno type="wicri:Area/Main/Curation">000510</idno>
<idno type="wicri:Area/Main/Exploration">000510</idno>
</publicationStmt>
<sourceDesc><biblStruct><analytic><title xml:lang="en" level="a" type="main">K-mer-Based Motif Analysis in Insect Species across <italic>Anopheles</italic>
, <italic>Drosophila</italic>
, and <italic>Glossina</italic>
Genera and Its Application to Species Classification</title>
<author><name sortKey="Cserhati, Matyas" sort="Cserhati, Matyas" uniqKey="Cserhati M" first="Matyas" last="Cserhati">Matyas Cserhati</name>
<affiliation wicri:level="2"><nlm:aff id="I1">Department of Genetics, Cell Biology &amp; Anatomy, University of Nebraska Medical Center, Omaha, NE 68198, USA</nlm:aff>
<country xml:lang="fr">États-Unis</country>
<wicri:regionArea>Department of Genetics, Cell Biology &amp; Anatomy, University of Nebraska Medical Center, Omaha, NE 68198</wicri:regionArea>
<placeName><region type="state">Nebraska</region>
</placeName>
</affiliation>
</author>
<author><name sortKey="Xiao, Peng" sort="Xiao, Peng" uniqKey="Xiao P" first="Peng" last="Xiao">Peng Xiao</name>
<affiliation wicri:level="2"><nlm:aff id="I1">Department of Genetics, Cell Biology &amp; Anatomy, University of Nebraska Medical Center, Omaha, NE 68198, USA</nlm:aff>
<country xml:lang="fr">États-Unis</country>
<wicri:regionArea>Department of Genetics, Cell Biology &amp; Anatomy, University of Nebraska Medical Center, Omaha, NE 68198</wicri:regionArea>
<placeName><region type="state">Nebraska</region>
</placeName>
</affiliation>
</author>
<author><name sortKey="Guda, Chittibabu" sort="Guda, Chittibabu" uniqKey="Guda C" first="Chittibabu" last="Guda">Chittibabu Guda</name>
<affiliation wicri:level="2"><nlm:aff id="I1">Department of Genetics, Cell Biology &amp; Anatomy, University of Nebraska Medical Center, Omaha, NE 68198, USA</nlm:aff>
<country xml:lang="fr">États-Unis</country>
<wicri:regionArea>Department of Genetics, Cell Biology &amp; Anatomy, University of Nebraska Medical Center, Omaha, NE 68198</wicri:regionArea>
<placeName><region type="state">Nebraska</region>
</placeName>
</affiliation>
</author>
</analytic>
<series><title level="j">Computational and Mathematical Methods in Medicine</title>
<idno type="ISSN">1748-670X</idno>
<idno type="eISSN">1748-6718</idno>
<imprint><date when="2019">2019</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
</fileDesc>
<profileDesc><textClass></textClass>
</profileDesc>
</teiHeader>
<front><div type="abstract" xml:lang="en"><p>Short k-mer sequences from DNA are both conserved and diverged across species owing to their functional significance in speciation, which enables their use in many species classification algorithms. In the present study, we developed a methodology to analyze the DNA k-mers of whole genome, 5′ UTR, intron, and 3′ UTR regions from 58 insect species belonging to three genera of <italic>Diptera</italic>
that include <italic>Anopheles</italic>
, <italic>Drosophila</italic>
, and <italic>Glossina</italic>
. We developed an improved algorithm to predict and score k-mers based on a scheme that normalizes k-mer scores in different genomic subregions. This algorithm takes advantage of the information content of the whole genome as opposed to other algorithms or studies that analyze only a small group of genes. Our algorithm uses k-mers of lengths 7–9 bp for the whole genome, 5′ and 3′ UTR regions as well as the intronic regions. Taxonomical relationships based on the whole-genome k-mer signatures showed that species of the three genera clustered together quite visibly. We also improved the scoring and filtering of these k-mers for accurate species identification. The whole-genome k-mer content correlation algorithm showed that species within a single genus correlated tightly with each other as compared to other genera. The genomes of two <italic>Aedes</italic>
and one <italic>Culex</italic>
species were also analyzed to demonstrate how newly sequenced species can be classified using the algorithm. Furthermore, working with several dozen species has enabled us to assign a whole-genome k-mer signature for each of the 58 Dipteran species by making all-to-all pairwise comparison of the k-mer content. These signatures were used to compare the similarity between species and to identify clusters of species displaying similar signatures.</p>
</div>
</front>
</TEI>
<affiliations><list><country><li>États-Unis</li>
</country>
<region><li>Nebraska</li>
</region>
</list>
<tree><country name="États-Unis"><region name="Nebraska"><name sortKey="Cserhati, Matyas" sort="Cserhati, Matyas" uniqKey="Cserhati M" first="Matyas" last="Cserhati">Matyas Cserhati</name>
</region>
<name sortKey="Guda, Chittibabu" sort="Guda, Chittibabu" uniqKey="Guda C" first="Chittibabu" last="Guda">Chittibabu Guda</name>
<name sortKey="Xiao, Peng" sort="Xiao, Peng" uniqKey="Xiao P" first="Peng" last="Xiao">Peng Xiao</name>
</country>
</tree>
</affiliations>
</record>
To manipulate this document under Unix (Dilib)
EXPLOR_STEP=$WICRI_ROOT/Sante/explor/MersV1/Data/Main/Exploration
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000510 | SxmlIndent | more
Or
HfdSelect -h $EXPLOR_AREA/Data/Main/Exploration/biblio.hfd -nk 000510 | SxmlIndent | more
To add a link to this page within the Wicri network
{{Explor lien |wiki= Sante |area= MersV1 |flux= Main |étape= Exploration |type= RBID |clé= PMC:6881769 |texte= K-mer-Based Motif Analysis in Insect Species across Anopheles, Drosophila, and Glossina Genera and Its Application to Species Classification }}
To generate wiki pages
HfdIndexSelect -h $EXPLOR_AREA/Data/Main/Exploration/RBID.i -Sk "pubmed:31827584" \
    | HfdSelect -Kh $EXPLOR_AREA/Data/Main/Exploration/biblio.hfd \
    | NlmPubMed2Wicri -a MersV1
This area was generated with Dilib version V0.6.33.