MERS exploration server

Identifying Group-Specific Sequences for Microbial Communities Using Long k-mer Sequence Signatures

Authors: Ying Wang [People's Republic of China]; Lei Fu [People's Republic of China]; Jie Ren [United States]; Zhaoxia Yu [United States]; Ting Chen [United States, People's Republic of China]; Fengzhu Sun [United States, People's Republic of China]

RBID : PMC:5943621

Abstract

Comparing metagenomic samples is crucial for understanding microbial communities. For different groups of microbial communities, such as human gut metagenomic samples from patients with a certain disease and healthy controls, identifying group-specific sequences offers essential information for potential biomarker discovery. A sequence that is present, or rich, in one group, but absent, or scarce, in another group is considered “group-specific” in our study. Our main purpose is to discover group-specific sequence regions between control and case groups as disease-associated markers. We developed a long k-mer (k ≥ 30 bps)-based computational pipeline to detect group-specific sequences at strain resolution free from reference sequences, sequence alignments, and metagenome-wide de novo assembly. We called our method MetaGO: Group-specific oligonucleotide analysis for metagenomic samples. An open-source pipeline on Apache Spark was developed with parallel computing. We applied MetaGO to one simulated and three real metagenomic datasets to evaluate the discriminative capability of identified group-specific markers. In the simulated dataset, 99.11% of group-specific logical 40-mers covered 98.89% disease-specific regions from the disease-associated strain. In addition, 97.90% of group-specific numerical 40-mers covered 99.61 and 96.39% of differentially abundant genome and regions between two groups, respectively. For a large-scale metagenomic liver cirrhosis (LC)-associated dataset, we identified 37,647 group-specific 40-mer features. Any one of the features can predict disease status of the training samples with the average of sensitivity and specificity higher than 0.8. The random forests classification using the top 10 group-specific features yielded a higher AUC (from ∼0.8 to ∼0.9) than that of previous studies. All group-specific 40-mers were present in LC patients, but not healthy controls. All the assembled 11 LC-specific sequences can be mapped to two strains of Veillonella parvula: UTDB1-3 and DSM2008. The experiments on the other two real datasets related to Inflammatory Bowel Disease and Type 2 Diabetes in Women consistently demonstrated that MetaGO achieved better prediction accuracy with fewer features compared to previous studies. The experiments showed that MetaGO is a powerful tool for identifying group-specific k-mers, which would be clinically applicable for disease prediction. MetaGO is available at https://github.com/VVsmileyx/MetaGO.
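
To make the idea of a "logical" (presence/absence) group-specific k-mer concrete, here is a minimal Python sketch. It is not the MetaGO implementation, which the abstract describes as an Apache Spark pipeline. It is only a toy illustration under stated assumptions: every function name is hypothetical, reads are plain strings, reverse complements are ignored, only k-mers enriched in the case group are scored, and k is set to 4 instead of 40 so that the example stays readable. Each k-mer is scored by the average of sensitivity (fraction of case samples containing it) and specificity (fraction of control samples lacking it), and kept when that average exceeds 0.8, the threshold quoted in the abstract.

```python
from collections import defaultdict


def kmers(read, k):
    """Return the set of k-mers in one read, skipping any with non-ACGT symbols."""
    read = read.upper()
    return {
        read[i:i + k]
        for i in range(len(read) - k + 1)
        if set(read[i:i + k]) <= set("ACGT")
    }


def sample_kmers(reads, k):
    """Return the set of k-mers present anywhere in one sample (a list of reads)."""
    found = set()
    for read in reads:
        found |= kmers(read, k)
    return found


def group_specific_kmers(case_samples, control_samples, k, threshold=0.8):
    """Score case-enriched k-mers by (sensitivity + specificity) / 2.

    Hypothetical helper: returns a dict mapping each k-mer whose score
    exceeds `threshold` to that score.
    """
    case_sets = [sample_kmers(s, k) for s in case_samples]
    control_sets = [sample_kmers(s, k) for s in control_samples]

    # Count in how many samples of each group every k-mer is present.
    case_counts = defaultdict(int)
    control_counts = defaultdict(int)
    for kmer_set in case_sets:
        for kmer in kmer_set:
            case_counts[kmer] += 1
    for kmer_set in control_sets:
        for kmer in kmer_set:
            control_counts[kmer] += 1

    selected = {}
    for kmer, n_case in case_counts.items():
        sensitivity = n_case / len(case_sets)
        specificity = 1 - control_counts.get(kmer, 0) / len(control_sets)
        score = (sensitivity + specificity) / 2
        if score > threshold:
            selected[kmer] = score
    return selected


# Toy data: three case samples and three control samples, one read each.
cases = [["ACGTAC"], ["TACGTA"], ["ACGTTT"]]
controls = [["GGGGCC"], ["GGGCCC"], ["ACGGGG"]]
markers = group_specific_kmers(cases, controls, k=4)
print(sorted(markers))
```

A "numerical" feature in the abstract's sense would replace the presence sets with per-sample k-mer counts and compare abundances between groups; that variant is not shown here.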


DOI: 10.3389/fmicb.2018.00872
PubMed: 29774017
PubMed Central: 5943621

The document in XML format

<record>
<TEI>
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en">Identifying
<italic>Group-Specific</italic>
Sequences for Microbial Communities Using Long
<italic>k</italic>
-mer Sequence Signatures</title>
<author>
<name sortKey="Wang, Ying" sort="Wang, Ying" uniqKey="Wang Y" first="Ying" last="Wang">Ying Wang</name>
<affiliation wicri:level="1">
<nlm:aff id="aff1">
<institution>Department of Automation, Xiamen University</institution>
,
<addr-line>Xiamen</addr-line>
,
<country>China</country>
</nlm:aff>
<country xml:lang="fr">République populaire de Chine</country>
<wicri:regionArea># see nlm:aff country strict</wicri:regionArea>
</affiliation>
</author>
<author>
<name sortKey="Fu, Lei" sort="Fu, Lei" uniqKey="Fu L" first="Lei" last="Fu">Lei Fu</name>
<affiliation wicri:level="1">
<nlm:aff id="aff1">
<institution>Department of Automation, Xiamen University</institution>
,
<addr-line>Xiamen</addr-line>
,
<country>China</country>
</nlm:aff>
<country xml:lang="fr">République populaire de Chine</country>
<wicri:regionArea># see nlm:aff country strict</wicri:regionArea>
</affiliation>
</author>
<author>
<name sortKey="Ren, Jie" sort="Ren, Jie" uniqKey="Ren J" first="Jie" last="Ren">Jie Ren</name>
<affiliation wicri:level="1">
<nlm:aff id="aff2">
<institution>Molecular and Computational Biology Program, University of Southern California, Los Angeles</institution>
,
<addr-line>CA</addr-line>
,
<country>United States</country>
</nlm:aff>
<country xml:lang="fr">États-Unis</country>
<wicri:regionArea># see nlm:aff country strict</wicri:regionArea>
</affiliation>
</author>
<author>
<name sortKey="Yu, Zhaoxia" sort="Yu, Zhaoxia" uniqKey="Yu Z" first="Zhaoxia" last="Yu">Zhaoxia Yu</name>
<affiliation wicri:level="1">
<nlm:aff id="aff3">
<institution>Department of Statistics, University of California, Irvine</institution>
,
<addr-line>Irvine, CA</addr-line>
,
<country>United States</country>
</nlm:aff>
<country xml:lang="fr">États-Unis</country>
<wicri:regionArea># see nlm:aff country strict</wicri:regionArea>
</affiliation>
</author>
<author>
<name sortKey="Chen, Ting" sort="Chen, Ting" uniqKey="Chen T" first="Ting" last="Chen">Ting Chen</name>
<affiliation wicri:level="1">
<nlm:aff id="aff2">
<institution>Molecular and Computational Biology Program, University of Southern California, Los Angeles</institution>
,
<addr-line>CA</addr-line>
,
<country>United States</country>
</nlm:aff>
<country xml:lang="fr">États-Unis</country>
<wicri:regionArea># see nlm:aff country strict</wicri:regionArea>
</affiliation>
<affiliation wicri:level="1">
<nlm:aff id="aff4">
<institution>Bioinformatics Division, Tsinghua National Laboratory of Information Science and Technology, Tsinghua University</institution>
,
<addr-line>Beijing</addr-line>
,
<country>China</country>
</nlm:aff>
<country xml:lang="fr">République populaire de Chine</country>
<wicri:regionArea># see nlm:aff country strict</wicri:regionArea>
</affiliation>
<affiliation wicri:level="1">
<nlm:aff id="aff5">
<institution>Department of Computer Science and Technology, Tsinghua University</institution>
,
<addr-line>Beijing</addr-line>
,
<country>China</country>
</nlm:aff>
<country xml:lang="fr">République populaire de Chine</country>
<wicri:regionArea># see nlm:aff country strict</wicri:regionArea>
</affiliation>
</author>
<author>
<name sortKey="Sun, Fengzhu" sort="Sun, Fengzhu" uniqKey="Sun F" first="Fengzhu" last="Sun">Fengzhu Sun</name>
<affiliation wicri:level="1">
<nlm:aff id="aff2">
<institution>Molecular and Computational Biology Program, University of Southern California, Los Angeles</institution>
,
<addr-line>CA</addr-line>
,
<country>United States</country>
</nlm:aff>
<country xml:lang="fr">États-Unis</country>
<wicri:regionArea># see nlm:aff country strict</wicri:regionArea>
</affiliation>
<affiliation wicri:level="1">
<nlm:aff id="aff6">
<institution>Center for Computational Systems Biology, Fudan University</institution>
,
<addr-line>Shanghai</addr-line>
,
<country>China</country>
</nlm:aff>
<country xml:lang="fr">République populaire de Chine</country>
<wicri:regionArea># see nlm:aff country strict</wicri:regionArea>
</affiliation>
</author>
</titleStmt>
<publicationStmt>
<idno type="wicri:source">PMC</idno>
<idno type="pmid">29774017</idno>
<idno type="pmc">5943621</idno>
<idno type="url">http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5943621</idno>
<idno type="RBID">PMC:5943621</idno>
<idno type="doi">10.3389/fmicb.2018.00872</idno>
<date when="2018">2018</date>
<idno type="wicri:Area/Pmc/Corpus">000C80</idno>
<idno type="wicri:explorRef" wicri:stream="Pmc" wicri:step="Corpus" wicri:corpus="PMC">000C80</idno>
<idno type="wicri:Area/Pmc/Curation">000C80</idno>
<idno type="wicri:explorRef" wicri:stream="Pmc" wicri:step="Curation">000C80</idno>
</publicationStmt>
<sourceDesc>
<biblStruct>
<analytic>
<title xml:lang="en" level="a" type="main">Identifying
<italic>Group-Specific</italic>
Sequences for Microbial Communities Using Long
<italic>k</italic>
-mer Sequence Signatures</title>
<author>
<name sortKey="Wang, Ying" sort="Wang, Ying" uniqKey="Wang Y" first="Ying" last="Wang">Ying Wang</name>
<affiliation wicri:level="1">
<nlm:aff id="aff1">
<institution>Department of Automation, Xiamen University</institution>
,
<addr-line>Xiamen</addr-line>
,
<country>China</country>
</nlm:aff>
<country xml:lang="fr">République populaire de Chine</country>
<wicri:regionArea># see nlm:aff country strict</wicri:regionArea>
</affiliation>
</author>
<author>
<name sortKey="Fu, Lei" sort="Fu, Lei" uniqKey="Fu L" first="Lei" last="Fu">Lei Fu</name>
<affiliation wicri:level="1">
<nlm:aff id="aff1">
<institution>Department of Automation, Xiamen University</institution>
,
<addr-line>Xiamen</addr-line>
,
<country>China</country>
</nlm:aff>
<country xml:lang="fr">République populaire de Chine</country>
<wicri:regionArea># see nlm:aff country strict</wicri:regionArea>
</affiliation>
</author>
<author>
<name sortKey="Ren, Jie" sort="Ren, Jie" uniqKey="Ren J" first="Jie" last="Ren">Jie Ren</name>
<affiliation wicri:level="1">
<nlm:aff id="aff2">
<institution>Molecular and Computational Biology Program, University of Southern California, Los Angeles</institution>
,
<addr-line>CA</addr-line>
,
<country>United States</country>
</nlm:aff>
<country xml:lang="fr">États-Unis</country>
<wicri:regionArea># see nlm:aff country strict</wicri:regionArea>
</affiliation>
</author>
<author>
<name sortKey="Yu, Zhaoxia" sort="Yu, Zhaoxia" uniqKey="Yu Z" first="Zhaoxia" last="Yu">Zhaoxia Yu</name>
<affiliation wicri:level="1">
<nlm:aff id="aff3">
<institution>Department of Statistics, University of California, Irvine</institution>
,
<addr-line>Irvine, CA</addr-line>
,
<country>United States</country>
</nlm:aff>
<country xml:lang="fr">États-Unis</country>
<wicri:regionArea># see nlm:aff country strict</wicri:regionArea>
</affiliation>
</author>
<author>
<name sortKey="Chen, Ting" sort="Chen, Ting" uniqKey="Chen T" first="Ting" last="Chen">Ting Chen</name>
<affiliation wicri:level="1">
<nlm:aff id="aff2">
<institution>Molecular and Computational Biology Program, University of Southern California, Los Angeles</institution>
,
<addr-line>CA</addr-line>
,
<country>United States</country>
</nlm:aff>
<country xml:lang="fr">États-Unis</country>
<wicri:regionArea># see nlm:aff country strict</wicri:regionArea>
</affiliation>
<affiliation wicri:level="1">
<nlm:aff id="aff4">
<institution>Bioinformatics Division, Tsinghua National Laboratory of Information Science and Technology, Tsinghua University</institution>
,
<addr-line>Beijing</addr-line>
,
<country>China</country>
</nlm:aff>
<country xml:lang="fr">République populaire de Chine</country>
<wicri:regionArea># see nlm:aff country strict</wicri:regionArea>
</affiliation>
<affiliation wicri:level="1">
<nlm:aff id="aff5">
<institution>Department of Computer Science and Technology, Tsinghua University</institution>
,
<addr-line>Beijing</addr-line>
,
<country>China</country>
</nlm:aff>
<country xml:lang="fr">République populaire de Chine</country>
<wicri:regionArea># see nlm:aff country strict</wicri:regionArea>
</affiliation>
</author>
<author>
<name sortKey="Sun, Fengzhu" sort="Sun, Fengzhu" uniqKey="Sun F" first="Fengzhu" last="Sun">Fengzhu Sun</name>
<affiliation wicri:level="1">
<nlm:aff id="aff2">
<institution>Molecular and Computational Biology Program, University of Southern California, Los Angeles</institution>
,
<addr-line>CA</addr-line>
,
<country>United States</country>
</nlm:aff>
<country xml:lang="fr">États-Unis</country>
<wicri:regionArea># see nlm:aff country strict</wicri:regionArea>
</affiliation>
<affiliation wicri:level="1">
<nlm:aff id="aff6">
<institution>Center for Computational Systems Biology, Fudan University</institution>
,
<addr-line>Shanghai</addr-line>
,
<country>China</country>
</nlm:aff>
<country xml:lang="fr">République populaire de Chine</country>
<wicri:regionArea># see nlm:aff country strict</wicri:regionArea>
</affiliation>
</author>
</analytic>
<series>
<title level="j">Frontiers in Microbiology</title>
<idno type="eISSN">1664-302X</idno>
<imprint>
<date when="2018">2018</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
</fileDesc>
<profileDesc>
<textClass></textClass>
</profileDesc>
</teiHeader>
<front>
<div type="abstract" xml:lang="en">
<p>Comparing metagenomic samples is crucial for understanding microbial communities. For different groups of microbial communities, such as human gut metagenomic samples from patients with a certain disease and healthy controls, identifying
<italic>group-specific</italic>
sequences offers essential information for potential biomarker discovery. A sequence that is present, or rich, in one group, but absent, or scarce, in another group is considered “
<italic>group-specific</italic>
” in our study. Our main purpose is to discover
<italic>group-specific</italic>
sequence regions between control and case groups as disease-associated markers. We developed a long
<italic>k</italic>
-mer (
<italic>k</italic>
≥ 30 bps)-based computational pipeline to detect
<italic>group-specific</italic>
sequences at strain resolution free from reference sequences, sequence alignments, and metagenome-wide
<italic>de novo</italic>
assembly. We called our method MetaGO:
<italic>Group-specific</italic>
oligonucleotide analysis for metagenomic samples. An open-source pipeline on
<italic>Apache Spark</italic>
was developed with parallel computing. We applied MetaGO to one simulated and three real metagenomic datasets to evaluate the discriminative capability of identified
<italic>group-specific</italic>
markers. In the simulated dataset, 99.11% of
<italic>group-specific</italic>
logical
<italic>40</italic>
-mers covered 98.89%
<italic>disease-specific</italic>
regions from the disease-associated strain. In addition, 97.90% of
<italic>group-specific</italic>
numerical
<italic>40</italic>
-mers covered 99.61 and 96.39% of differentially abundant genome and regions between two groups, respectively. For a large-scale metagenomic liver cirrhosis (LC)-associated dataset, we identified 37,647
<italic>group-specific 40-</italic>
mer features. Any one of the features can predict disease status of the training samples with the average of sensitivity and specificity higher than 0.8. The random forests classification using the top 10
<italic>group-specific</italic>
features yielded a higher AUC (from ∼0.8 to ∼0.9) than that of previous studies. All
<italic>group-specific 40-</italic>
mers were present in LC patients, but not healthy controls. All the assembled 11
<italic>LC-specific</italic>
sequences can be mapped to two strains of
<italic>Veillonella parvula</italic>
: UTDB1-3 and DSM2008. The experiments on the other two real datasets related to Inflammatory Bowel Disease and Type 2 Diabetes in Women consistently demonstrated that MetaGO achieved better prediction accuracy with fewer features compared to previous studies. The experiments showed that MetaGO is a powerful tool for identifying
<italic>group-specific k</italic>
-mers, which would be clinically applicable for disease prediction. MetaGO is available at
<ext-link ext-link-type="uri" xlink:href="https://github.com/VVsmileyx/MetaGO">https://github.com/VVsmileyx/MetaGO</ext-link>
.</p>
</div>
</front>
<back>
<div1 type="bibliography">
<listBibl>
<biblStruct>
<analytic>
<author>
<name sortKey="Alneberg, J" uniqKey="Alneberg J">J. Alneberg</name>
</author>
<author>
<name sortKey="Bjarnason, B S" uniqKey="Bjarnason B">B. S. Bjarnason</name>
</author>
<author>
<name sortKey="De Bruijn, I" uniqKey="De Bruijn I">I. De Bruijn</name>
</author>
<author>
<name sortKey="Schirmer, M" uniqKey="Schirmer M">M. Schirmer</name>
</author>
<author>
<name sortKey="Quick, J" uniqKey="Quick J">J. Quick</name>
</author>
<author>
<name sortKey="Ijaz, U Z" uniqKey="Ijaz U">U. Z. Ijaz</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Altschul, S F" uniqKey="Altschul S">S. F. Altschul</name>
</author>
<author>
<name sortKey="Madden, T L" uniqKey="Madden T">T. L. Madden</name>
</author>
<author>
<name sortKey="Schaffer, A A" uniqKey="Schaffer A">A. A. Schäffer</name>
</author>
<author>
<name sortKey="Zhang, J" uniqKey="Zhang J">J. Zhang</name>
</author>
<author>
<name sortKey="Zhang, Z" uniqKey="Zhang Z">Z. Zhang</name>
</author>
<author>
<name sortKey="Miller, W" uniqKey="Miller W">W. Miller</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Benoit, G" uniqKey="Benoit G">G. Benoit</name>
</author>
<author>
<name sortKey="Peterlongo, P" uniqKey="Peterlongo P">P. Peterlongo</name>
</author>
<author>
<name sortKey="Mariadassou, M" uniqKey="Mariadassou M">M. Mariadassou</name>
</author>
<author>
<name sortKey="Drezen, E" uniqKey="Drezen E">E. Drezen</name>
</author>
<author>
<name sortKey="Schbath, S" uniqKey="Schbath S">S. Schbath</name>
</author>
<author>
<name sortKey="Lavenier, D" uniqKey="Lavenier D">D. Lavenier</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Breiman, L" uniqKey="Breiman L">L. Breiman</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Costello, E K" uniqKey="Costello E">E. K. Costello</name>
</author>
<author>
<name sortKey="Lauber, C L" uniqKey="Lauber C">C. L. Lauber</name>
</author>
<author>
<name sortKey="Hamady, M" uniqKey="Hamady M">M. Hamady</name>
</author>
<author>
<name sortKey="Fierer, N" uniqKey="Fierer N">N. Fierer</name>
</author>
<author>
<name sortKey="Gordon, J I" uniqKey="Gordon J">J. I. Gordon</name>
</author>
<author>
<name sortKey="Knight, R" uniqKey="Knight R">R. Knight</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Cui, H" uniqKey="Cui H">H. Cui</name>
</author>
<author>
<name sortKey="Zhang, X" uniqKey="Zhang X">X. Zhang</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Feng, Q" uniqKey="Feng Q">Q. Feng</name>
</author>
<author>
<name sortKey="Liang, S" uniqKey="Liang S">S. Liang</name>
</author>
<author>
<name sortKey="Jia, H" uniqKey="Jia H">H. Jia</name>
</author>
<author>
<name sortKey="Stadlmayr, A" uniqKey="Stadlmayr A">A. Stadlmayr</name>
</author>
<author>
<name sortKey="Tang, L" uniqKey="Tang L">L. Tang</name>
</author>
<author>
<name sortKey="Lan, Z" uniqKey="Lan Z">Z. Lan</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Fofanov, Y" uniqKey="Fofanov Y">Y. Fofanov</name>
</author>
<author>
<name sortKey="Luo, Y" uniqKey="Luo Y">Y. Luo</name>
</author>
<author>
<name sortKey="Katili, C" uniqKey="Katili C">C. Katili</name>
</author>
<author>
<name sortKey="Wang, J" uniqKey="Wang J">J. Wang</name>
</author>
<author>
<name sortKey="Belosludtsev, Y" uniqKey="Belosludtsev Y">Y. Belosludtsev</name>
</author>
<author>
<name sortKey="Powdrill, T" uniqKey="Powdrill T">T. Powdrill</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Grabherr, M G" uniqKey="Grabherr M">M. G. Grabherr</name>
</author>
<author>
<name sortKey="Haas, B J" uniqKey="Haas B">B. J. Haas</name>
</author>
<author>
<name sortKey="Yassour, M" uniqKey="Yassour M">M. Yassour</name>
</author>
<author>
<name sortKey="Levin, J Z" uniqKey="Levin J">J. Z. Levin</name>
</author>
<author>
<name sortKey="Thompson, D A" uniqKey="Thompson D">D. A. Thompson</name>
</author>
<author>
<name sortKey="Amit, I" uniqKey="Amit I">I. Amit</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Han, W" uniqKey="Han W">W. Han</name>
</author>
<author>
<name sortKey="Wang, M" uniqKey="Wang M">M. Wang</name>
</author>
<author>
<name sortKey="Ye, Y" uniqKey="Ye Y">Y. Ye</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Huang, X" uniqKey="Huang X">X. Huang</name>
</author>
<author>
<name sortKey="Madan, A" uniqKey="Madan A">A. Madan</name>
</author>
</analytic>
</biblStruct>
<biblStruct></biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Jiang, B" uniqKey="Jiang B">B. Jiang</name>
</author>
<author>
<name sortKey="Song, K" uniqKey="Song K">K. Song</name>
</author>
<author>
<name sortKey="Ren, J" uniqKey="Ren J">J. Ren</name>
</author>
<author>
<name sortKey="Deng, M" uniqKey="Deng M">M. Deng</name>
</author>
<author>
<name sortKey="Sun, F" uniqKey="Sun F">F. Sun</name>
</author>
<author>
<name sortKey="Zhang, X" uniqKey="Zhang X">X. Zhang</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Jiang, R" uniqKey="Jiang R">R. Jiang</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Karlsson, F H" uniqKey="Karlsson F">F. H. Karlsson</name>
</author>
<author>
<name sortKey="Tremaroli, V" uniqKey="Tremaroli V">V. Tremaroli</name>
</author>
<author>
<name sortKey="Nookaew, I" uniqKey="Nookaew I">I. Nookaew</name>
</author>
<author>
<name sortKey="Bergstrom, G" uniqKey="Bergstrom G">G. Bergström</name>
</author>
<author>
<name sortKey="Behre, C J" uniqKey="Behre C">C. J. Behre</name>
</author>
<author>
<name sortKey="Fagerberg, B" uniqKey="Fagerberg B">B. Fagerberg</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Kunin, V" uniqKey="Kunin V">V. Kunin</name>
</author>
<author>
<name sortKey="Copeland, A" uniqKey="Copeland A">A. Copeland</name>
</author>
<author>
<name sortKey="Lapidus, A" uniqKey="Lapidus A">A. Lapidus</name>
</author>
<author>
<name sortKey="Mavromatis, K" uniqKey="Mavromatis K">K. Mavromatis</name>
</author>
<author>
<name sortKey="Hugenholtz, P" uniqKey="Hugenholtz P">P. Hugenholtz</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Le, V V" uniqKey="Le V">V. V. Le</name>
</author>
<author>
<name sortKey="Lang, T V" uniqKey="Lang T">T. V. Lang</name>
</author>
<author>
<name sortKey="Le, T B" uniqKey="Le T">T. B. Le</name>
</author>
<author>
<name sortKey="Hoai, T V" uniqKey="Hoai T">T. V. Hoai</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Li, D" uniqKey="Li D">D. Li</name>
</author>
<author>
<name sortKey="Liu, C M" uniqKey="Liu C">C.-M. Liu</name>
</author>
<author>
<name sortKey="Luo, R" uniqKey="Luo R">R. Luo</name>
</author>
<author>
<name sortKey="Sadakane, K" uniqKey="Sadakane K">K. Sadakane</name>
</author>
<author>
<name sortKey="Lam, T W" uniqKey="Lam T">T.-W. Lam</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Li, R" uniqKey="Li R">R. Li</name>
</author>
<author>
<name sortKey="Zhu, H" uniqKey="Zhu H">H. Zhu</name>
</author>
<author>
<name sortKey="Ruan, J" uniqKey="Ruan J">J. Ruan</name>
</author>
<author>
<name sortKey="Qian, W" uniqKey="Qian W">W. Qian</name>
</author>
<author>
<name sortKey="Fang, X" uniqKey="Fang X">X. Fang</name>
</author>
<author>
<name sortKey="Shi, Z" uniqKey="Shi Z">Z. Shi</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Liao, W" uniqKey="Liao W">W. Liao</name>
</author>
<author>
<name sortKey="Ren, J" uniqKey="Ren J">J. Ren</name>
</author>
<author>
<name sortKey="Wang, K" uniqKey="Wang K">K. Wang</name>
</author>
<author>
<name sortKey="Wang, S" uniqKey="Wang S">S. Wang</name>
</author>
<author>
<name sortKey="Zeng, F" uniqKey="Zeng F">F. Zeng</name>
</author>
<author>
<name sortKey="Wang, Y" uniqKey="Wang Y">Y. Wang</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Lozupone, C A" uniqKey="Lozupone C">C. A. Lozupone</name>
</author>
<author>
<name sortKey="Stombaugh, J" uniqKey="Stombaugh J">J. Stombaugh</name>
</author>
<author>
<name sortKey="Gonzalez, A" uniqKey="Gonzalez A">A. Gonzalez</name>
</author>
<author>
<name sortKey="Ackermann, G" uniqKey="Ackermann G">G. Ackermann</name>
</author>
<author>
<name sortKey="Wendel, D" uniqKey="Wendel D">D. Wendel</name>
</author>
<author>
<name sortKey="Vazquez Baeza, Y" uniqKey="Vazquez Baeza Y">Y. Vázquez-Baeza</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Lu, Y Y" uniqKey="Lu Y">Y. Y. Lu</name>
</author>
<author>
<name sortKey="Chen, T" uniqKey="Chen T">T. Chen</name>
</author>
<author>
<name sortKey="Fuhrman, J A" uniqKey="Fuhrman J">J. A. Fuhrman</name>
</author>
<author>
<name sortKey="Sun, F" uniqKey="Sun F">F. Sun</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Marcais, G" uniqKey="Marcais G">G. Marçais</name>
</author>
<author>
<name sortKey="Kingsford, C" uniqKey="Kingsford C">C. Kingsford</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Nielsen, H B" uniqKey="Nielsen H">H. B. Nielsen</name>
</author>
<author>
<name sortKey="Almeida, M" uniqKey="Almeida M">M. Almeida</name>
</author>
<author>
<name sortKey="Juncker, A S" uniqKey="Juncker A">A. S. Juncker</name>
</author>
<author>
<name sortKey="Rasmussen, S" uniqKey="Rasmussen S">S. Rasmussen</name>
</author>
<author>
<name sortKey="Li, J" uniqKey="Li J">J. Li</name>
</author>
<author>
<name sortKey="Sunagawa, S" uniqKey="Sunagawa S">S. Sunagawa</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Papudeshi, B" uniqKey="Papudeshi B">B. Papudeshi</name>
</author>
<author>
<name sortKey="Haggerty, J M" uniqKey="Haggerty J">J. M. Haggerty</name>
</author>
<author>
<name sortKey="Doane, M" uniqKey="Doane M">M. Doane</name>
</author>
<author>
<name sortKey="Morris, M M" uniqKey="Morris M">M. M. Morris</name>
</author>
<author>
<name sortKey="Walsh, K" uniqKey="Walsh K">K. Walsh</name>
</author>
<author>
<name sortKey="Beattie, D T" uniqKey="Beattie D">D. T. Beattie</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Pasolli, E" uniqKey="Pasolli E">E. Pasolli</name>
</author>
<author>
<name sortKey="Schiffer, L" uniqKey="Schiffer L">L. Schiffer</name>
</author>
<author>
<name sortKey="Manghi, P" uniqKey="Manghi P">P. Manghi</name>
</author>
<author>
<name sortKey="Renson, A" uniqKey="Renson A">A. Renson</name>
</author>
<author>
<name sortKey="Obenchain, V" uniqKey="Obenchain V">V. Obenchain</name>
</author>
<author>
<name sortKey="Truong, D T" uniqKey="Truong D">D. T. Truong</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Pasolli, E" uniqKey="Pasolli E">E. Pasolli</name>
</author>
<author>
<name sortKey="Truong, D T" uniqKey="Truong D">D. T. Truong</name>
</author>
<author>
<name sortKey="Malik, F" uniqKey="Malik F">F. Malik</name>
</author>
<author>
<name sortKey="Waldron, L" uniqKey="Waldron L">L. Waldron</name>
</author>
<author>
<name sortKey="Segata, N" uniqKey="Segata N">N. Segata</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Paulus, W" uniqKey="Paulus W">W. Paulus</name>
</author>
<author>
<name sortKey="Jellinger, K" uniqKey="Jellinger K">K. Jellinger</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Qin, J" uniqKey="Qin J">J. Qin</name>
</author>
<author>
<name sortKey="Li, R" uniqKey="Li R">R. Li</name>
</author>
<author>
<name sortKey="Raes, J" uniqKey="Raes J">J. Raes</name>
</author>
<author>
<name sortKey="Arumugam, M" uniqKey="Arumugam M">M. Arumugam</name>
</author>
<author>
<name sortKey="Burgdorf, K S" uniqKey="Burgdorf K">K. S. Burgdorf</name>
</author>
<author>
<name sortKey="Manichanh, C" uniqKey="Manichanh C">C. Manichanh</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Qin, J" uniqKey="Qin J">J. Qin</name>
</author>
<author>
<name sortKey="Li, Y" uniqKey="Li Y">Y. Li</name>
</author>
<author>
<name sortKey="Cai, Z" uniqKey="Cai Z">Z. Cai</name>
</author>
<author>
<name sortKey="Li, S" uniqKey="Li S">S. Li</name>
</author>
<author>
<name sortKey="Zhu, J" uniqKey="Zhu J">J. Zhu</name>
</author>
<author>
<name sortKey="Zhang, F" uniqKey="Zhang F">F. Zhang</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Qin, N" uniqKey="Qin N">N. Qin</name>
</author>
<author>
<name sortKey="Yang, F" uniqKey="Yang F">F. Yang</name>
</author>
<author>
<name sortKey="Li, A" uniqKey="Li A">A. Li</name>
</author>
<author>
<name sortKey="Prifti, E" uniqKey="Prifti E">E. Prifti</name>
</author>
<author>
<name sortKey="Chen, Y" uniqKey="Chen Y">Y. Chen</name>
</author>
<author>
<name sortKey="Shao, L" uniqKey="Shao L">L. Shao</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Quast, C" uniqKey="Quast C">C. Quast</name>
</author>
<author>
<name sortKey="Pruesse, E" uniqKey="Pruesse E">E. Pruesse</name>
</author>
<author>
<name sortKey="Yilmaz, P" uniqKey="Yilmaz P">P. Yilmaz</name>
</author>
<author>
<name sortKey="Gerken, J" uniqKey="Gerken J">J. Gerken</name>
</author>
<author>
<name sortKey="Schweer, T" uniqKey="Schweer T">T. Schweer</name>
</author>
<author>
<name sortKey="Yarza, P" uniqKey="Yarza P">P. Yarza</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Ren, J" uniqKey="Ren J">J. Ren</name>
</author>
<author>
<name sortKey="Ahlgren, N A" uniqKey="Ahlgren N">N. A. Ahlgren</name>
</author>
<author>
<name sortKey="Lu, Y Y" uniqKey="Lu Y">Y. Y. Lu</name>
</author>
<author>
<name sortKey="Fuhrman, J A" uniqKey="Fuhrman J">J. A. Fuhrman</name>
</author>
<author>
<name sortKey="Sun, F" uniqKey="Sun F">F. Sun</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Richter, D C" uniqKey="Richter D">D. C. Richter</name>
</author>
<author>
<name sortKey="Ott, F" uniqKey="Ott F">F. Ott</name>
</author>
<author>
<name sortKey="Auch, A F" uniqKey="Auch A">A. F. Auch</name>
</author>
<author>
<name sortKey="Schmid, R" uniqKey="Schmid R">R. Schmid</name>
</author>
<author>
<name sortKey="Huson, D H" uniqKey="Huson D">D. H. Huson</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Rizk, G" uniqKey="Rizk G">G. Rizk</name>
</author>
<author>
<name sortKey="Lavenier, D" uniqKey="Lavenier D">D. Lavenier</name>
</author>
<author>
<name sortKey="Chikhi, R" uniqKey="Chikhi R">R. Chikhi</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Sangwan, N" uniqKey="Sangwan N">N. Sangwan</name>
</author>
<author>
<name sortKey="Xia, F" uniqKey="Xia F">F. Xia</name>
</author>
<author>
<name sortKey="Gilbert, J A" uniqKey="Gilbert J">J. A. Gilbert</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Sczyrba, A" uniqKey="Sczyrba A">A. Sczyrba</name>
</author>
<author>
<name sortKey="Hofmann, P" uniqKey="Hofmann P">P. Hofmann</name>
</author>
<author>
<name sortKey="Belmann, P" uniqKey="Belmann P">P. Belmann</name>
</author>
<author>
<name sortKey="Koslicki, D" uniqKey="Koslicki D">D. Koslicki</name>
</author>
<author>
<name sortKey="Janssen, S" uniqKey="Janssen S">S. Janssen</name>
</author>
<author>
<name sortKey="Droge, J" uniqKey="Droge J">J. Dröge</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Segata, N" uniqKey="Segata N">N. Segata</name>
</author>
<author>
<name sortKey="Izard, J" uniqKey="Izard J">J. Izard</name>
</author>
<author>
<name sortKey="Waldron, L" uniqKey="Waldron L">L. Waldron</name>
</author>
<author>
<name sortKey="Gevers, D" uniqKey="Gevers D">D. Gevers</name>
</author>
<author>
<name sortKey="Miropolsky, L" uniqKey="Miropolsky L">L. Miropolsky</name>
</author>
<author>
<name sortKey="Garrett, W S" uniqKey="Garrett W">W. S. Garrett</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Wang, Y" uniqKey="Wang Y">Y. Wang</name>
</author>
<author>
<name sortKey="Lei, X" uniqKey="Lei X">X. Lei</name>
</author>
<author>
<name sortKey="Wang, S" uniqKey="Wang S">S. Wang</name>
</author>
<author>
<name sortKey="Wang, Z" uniqKey="Wang Z">Z. Wang</name>
</author>
<author>
<name sortKey="Song, N" uniqKey="Song N">N. Song</name>
</author>
<author>
<name sortKey="Zeng, F" uniqKey="Zeng F">F. Zeng</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Wang, Y" uniqKey="Wang Y">Y. Wang</name>
</author>
<author>
<name sortKey="Liu, L" uniqKey="Liu L">L. Liu</name>
</author>
<author>
<name sortKey="Chen, L" uniqKey="Chen L">L. Chen</name>
</author>
<author>
<name sortKey="Chen, T" uniqKey="Chen T">T. Chen</name>
</author>
<author>
<name sortKey="Sun, F" uniqKey="Sun F">F. Sun</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Wang, Y" uniqKey="Wang Y">Y. Wang</name>
</author>
<author>
<name sortKey="Wang, K" uniqKey="Wang K">K. Wang</name>
</author>
<author>
<name sortKey="Lu, Y Y" uniqKey="Lu Y">Y. Y. Lu</name>
</author>
<author>
<name sortKey="Sun, F" uniqKey="Sun F">F. Sun</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Wen, C" uniqKey="Wen C">C. Wen</name>
</author>
<author>
<name sortKey="Zheng, Z" uniqKey="Zheng Z">Z. Zheng</name>
</author>
<author>
<name sortKey="Shao, T" uniqKey="Shao T">T. Shao</name>
</author>
<author>
<name sortKey="Lin, L" uniqKey="Lin L">L. Lin</name>
</author>
<author>
<name sortKey="Xie, Z" uniqKey="Xie Z">Z. Xie</name>
</author>
<author>
<name sortKey="Chatelier, E L" uniqKey="Chatelier E">E. L. Chatelier</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="White, J R" uniqKey="White J">J. R. White</name>
</author>
<author>
<name sortKey="Nagarajan, N" uniqKey="Nagarajan N">N. Nagarajan</name>
</author>
<author>
<name sortKey="Pop, M" uniqKey="Pop M">M. Pop</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Wiest, R" uniqKey="Wiest R">R. Wiest</name>
</author>
<author>
<name sortKey="Lawson, M" uniqKey="Lawson M">M. Lawson</name>
</author>
<author>
<name sortKey="Geuking, M" uniqKey="Geuking M">M. Geuking</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Wu, Y W" uniqKey="Wu Y">Y.-W. Wu</name>
</author>
<author>
<name sortKey="Simmons, B A" uniqKey="Simmons B">B. A. Simmons</name>
</author>
<author>
<name sortKey="Singer, S W" uniqKey="Singer S">S. W. Singer</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Xing, X" uniqKey="Xing X">X. Xing</name>
</author>
<author>
<name sortKey="Liu, J S" uniqKey="Liu J">J. S. Liu</name>
</author>
<author>
<name sortKey="Zhong, W" uniqKey="Zhong W">W. Zhong</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Yatsunenko, T" uniqKey="Yatsunenko T">T. Yatsunenko</name>
</author>
<author>
<name sortKey="Rey, F E" uniqKey="Rey F">F. E. Rey</name>
</author>
<author>
<name sortKey="Manary, M J" uniqKey="Manary M">M. J. Manary</name>
</author>
<author>
<name sortKey="Trehan, I" uniqKey="Trehan I">I. Trehan</name>
</author>
<author>
<name sortKey="Dominguez Bello, M G" uniqKey="Dominguez Bello M">M. G. Dominguez-Bello</name>
</author>
<author>
<name sortKey="Contreras, M" uniqKey="Contreras M">M. Contreras</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Zaharia, M" uniqKey="Zaharia M">M. Zaharia</name>
</author>
<author>
<name sortKey="Chowdhury, M" uniqKey="Chowdhury M">M. Chowdhury</name>
</author>
<author>
<name sortKey="Franklin, M J" uniqKey="Franklin M">M. J. Franklin</name>
</author>
<author>
<name sortKey="Shenker, S" uniqKey="Shenker S">S. Shenker</name>
</author>
<author>
<name sortKey="Stoica, I" uniqKey="Stoica I">I. Stoica</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Zhang, X" uniqKey="Zhang X">X. Zhang</name>
</author>
<author>
<name sortKey="Lu, X" uniqKey="Lu X">X. Lu</name>
</author>
<author>
<name sortKey="Shi, Q" uniqKey="Shi Q">Q. Shi</name>
</author>
<author>
<name sortKey="Xu, X Q" uniqKey="Xu X">X. Q. Xu</name>
</author>
<author>
<name sortKey="Leung, H C" uniqKey="Leung H">H. C. Leung</name>
</author>
<author>
<name sortKey="Harris, L N" uniqKey="Harris L">L. N. Harris</name>
</author>
</analytic>
</biblStruct>
</listBibl>
</div1>
</back>
</TEI>
<pmc article-type="research-article">
<pmc-dir>properties open_access</pmc-dir>
<front>
<journal-meta>
<journal-id journal-id-type="nlm-ta">Front Microbiol</journal-id>
<journal-id journal-id-type="iso-abbrev">Front Microbiol</journal-id>
<journal-id journal-id-type="publisher-id">Front. Microbiol.</journal-id>
<journal-title-group>
<journal-title>Frontiers in Microbiology</journal-title>
</journal-title-group>
<issn pub-type="epub">1664-302X</issn>
<publisher>
<publisher-name>Frontiers Media S.A.</publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="pmid">29774017</article-id>
<article-id pub-id-type="pmc">5943621</article-id>
<article-id pub-id-type="doi">10.3389/fmicb.2018.00872</article-id>
<article-categories>
<subj-group subj-group-type="heading">
<subject>Microbiology</subject>
<subj-group>
<subject>Methods</subject>
</subj-group>
</subj-group>
</article-categories>
<title-group>
<article-title>Identifying
<italic>Group-Specific</italic>
Sequences for Microbial Communities Using Long
<italic>k</italic>
-mer Sequence Signatures</article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author">
<name>
<surname>Wang</surname>
<given-names>Ying</given-names>
</name>
<xref ref-type="aff" rid="aff1">
<sup>1</sup>
</xref>
<xref ref-type="corresp" rid="c001">
<sup>*</sup>
</xref>
<uri xlink:type="simple" xlink:href="http://loop.frontiersin.org/people/452253/overview"></uri>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Fu</surname>
<given-names>Lei</given-names>
</name>
<xref ref-type="aff" rid="aff1">
<sup>1</sup>
</xref>
<uri xlink:type="simple" xlink:href="http://loop.frontiersin.org/people/554628/overview"></uri>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Ren</surname>
<given-names>Jie</given-names>
</name>
<xref ref-type="aff" rid="aff2">
<sup>2</sup>
</xref>
<uri xlink:type="simple" xlink:href="http://loop.frontiersin.org/people/547468/overview"></uri>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Yu</surname>
<given-names>Zhaoxia</given-names>
</name>
<xref ref-type="aff" rid="aff3">
<sup>3</sup>
</xref>
<uri xlink:type="simple" xlink:href="http://loop.frontiersin.org/people/34164/overview"></uri>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Chen</surname>
<given-names>Ting</given-names>
</name>
<xref ref-type="aff" rid="aff2">
<sup>2</sup>
</xref>
<xref ref-type="aff" rid="aff4">
<sup>4</sup>
</xref>
<xref ref-type="aff" rid="aff5">
<sup>5</sup>
</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Sun</surname>
<given-names>Fengzhu</given-names>
</name>
<xref ref-type="aff" rid="aff2">
<sup>2</sup>
</xref>
<xref ref-type="aff" rid="aff6">
<sup>6</sup>
</xref>
<xref ref-type="corresp" rid="c001">
<sup>*</sup>
</xref>
<uri xlink:type="simple" xlink:href="http://loop.frontiersin.org/people/47840/overview"></uri>
</contrib>
</contrib-group>
<aff id="aff1">
<sup>1</sup>
<institution>Department of Automation, Xiamen University</institution>
,
<addr-line>Xiamen</addr-line>
,
<country>China</country>
</aff>
<aff id="aff2">
<sup>2</sup>
<institution>Molecular and Computational Biology Program, University of Southern California, Los Angeles</institution>
,
<addr-line>CA</addr-line>
,
<country>United States</country>
</aff>
<aff id="aff3">
<sup>3</sup>
<institution>Department of Statistics, University of California, Irvine</institution>
,
<addr-line>Irvine, CA</addr-line>
,
<country>United States</country>
</aff>
<aff id="aff4">
<sup>4</sup>
<institution>Bioinformatics Division, Tsinghua National Laboratory of Information Science and Technology, Tsinghua University</institution>
,
<addr-line>Beijing</addr-line>
,
<country>China</country>
</aff>
<aff id="aff5">
<sup>5</sup>
<institution>Department of Computer Science and Technology, Tsinghua University</institution>
,
<addr-line>Beijing</addr-line>
,
<country>China</country>
</aff>
<aff id="aff6">
<sup>6</sup>
<institution>Center for Computational Systems Biology, Fudan University</institution>
,
<addr-line>Shanghai</addr-line>
,
<country>China</country>
</aff>
<author-notes>
<fn fn-type="edited-by">
<p>Edited by: Jessica Galloway-Pena, The University of Texas MD Anderson Cancer Center, United States</p>
</fn>
<fn fn-type="edited-by">
<p>Reviewed by: Wenxuan Zhong, University of Georgia, United States; Jonathan Badger, National Cancer Institute (NCI), United States</p>
</fn>
<corresp id="c001">*Correspondence: Ying Wang,
<email>wangying@xmu.edu.cn</email>
Fengzhu Sun,
<email>fsun@usc.edu</email>
;
<email>fsun@dornsife.usc.edu</email>
</corresp>
<fn fn-type="other" id="fn002">
<p>This article was submitted to Systems Microbiology, a section of the journal Frontiers in Microbiology</p>
</fn>
</author-notes>
<pub-date pub-type="epub">
<day>03</day>
<month>5</month>
<year>2018</year>
</pub-date>
<pub-date pub-type="collection">
<year>2018</year>
</pub-date>
<volume>9</volume>
<elocation-id>872</elocation-id>
<history>
<date date-type="received">
<day>15</day>
<month>11</month>
<year>2017</year>
</date>
<date date-type="accepted">
<day>16</day>
<month>4</month>
<year>2018</year>
</date>
</history>
<permissions>
<copyright-statement>Copyright © 2018 Wang, Fu, Ren, Yu, Chen and Sun.</copyright-statement>
<copyright-year>2018</copyright-year>
<copyright-holder>Wang, Fu, Ren, Yu, Chen and Sun</copyright-holder>
<license xlink:href="http://creativecommons.org/licenses/by/4.0/">
<license-p>This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.</license-p>
</license>
</permissions>
<abstract>
<p>Comparing metagenomic samples is crucial for understanding microbial communities. For different groups of microbial communities, such as human gut metagenomic samples from patients with a certain disease and healthy controls, identifying
<italic>group-specific</italic>
sequences offers essential information for potential biomarker discovery. A sequence that is present, or rich, in one group, but absent, or scarce, in another group is considered “
<italic>group-specific</italic>
” in our study. Our main purpose is to discover
<italic>group-specific</italic>
sequence regions between control and case groups as disease-associated markers. We developed a long
<italic>k</italic>
-mer (
<italic>k</italic>
≥ 30 bps)-based computational pipeline to detect
<italic>group-specific</italic>
sequences at strain resolution free from reference sequences, sequence alignments, and metagenome-wide
<italic>de novo</italic>
assembly. We called our method MetaGO:
<italic>Group-specific</italic>
oligonucleotide analysis for metagenomic samples. An open-source pipeline on
<italic>Apache Spark</italic>
was developed with parallel computing. We applied MetaGO to one simulated and three real metagenomic datasets to evaluate the discriminative capability of identified
<italic>group-specific</italic>
markers. In the simulated dataset, 99.11% of
<italic>group-specific</italic>
logical
<italic>40</italic>
-mers covered 98.89%
<italic>disease-specific</italic>
regions from the disease-associated strain. In addition, 97.90% of
<italic>group-specific</italic>
numerical
<italic>40</italic>
-mers covered 99.61% and 96.39% of the differentially abundant genome and regions between the two groups, respectively. For a large-scale metagenomic liver cirrhosis (LC)-associated dataset, we identified 37,647
<italic>group-specific 40-</italic>
mer features. Any one of the features can predict disease status of the training samples with the average of sensitivity and specificity higher than 0.8. The random forests classification using the top 10
<italic>group-specific</italic>
features yielded a higher AUC (from ∼0.8 to ∼0.9) than that of previous studies. All
<italic>group-specific 40-</italic>
mers were present in LC patients, but not healthy controls. All the assembled 11
<italic>LC-specific</italic>
sequences can be mapped to two strains of
<italic>Veillonella parvula</italic>
: UTDB1-3 and DSM2008. The experiments on the other two real datasets related to Inflammatory Bowel Disease and Type 2 Diabetes in Women consistently demonstrated that MetaGO achieved better prediction accuracy with fewer features compared to previous studies. The experiments showed that MetaGO is a powerful tool for identifying
<italic>group-specific k</italic>
-mers, which would be clinically applicable for disease prediction. MetaGO is available at
<ext-link ext-link-type="uri" xlink:href="https://github.com/VVsmileyx/MetaGO">https://github.com/VVsmileyx/MetaGO</ext-link>
.</p>
</abstract>
<kwd-group>
<kwd>long
<italic>k</italic>
-mer</kwd>
<kwd>classification</kwd>
<kwd>
<italic>group-specific</italic>
sequence</kwd>
<kwd>metagenomics</kwd>
<kwd>microbial community</kwd>
<kwd>disease prediction</kwd>
</kwd-group>
<funding-group>
<award-group>
<funding-source id="cn001">National Natural Science Foundation of China
<named-content content-type="fundref-id">10.13039/501100001809</named-content>
</funding-source>
<award-id rid="cn001">61673324</award-id>
<award-id rid="cn001">61561146396</award-id>
</award-group>
<award-group>
<funding-source id="cn002">National Science Foundation
<named-content content-type="fundref-id">10.13039/100000001</named-content>
</funding-source>
<award-id rid="cn002">DMS-1518001</award-id>
</award-group>
<award-group>
<funding-source id="cn003">Foundation for the National Institutes of Health
<named-content content-type="fundref-id">10.13039/100000009</named-content>
</funding-source>
<award-id rid="cn003">R01GM120624</award-id>
</award-group>
<award-group>
<funding-source id="cn004">Natural Science Foundation of Fujian Province
<named-content content-type="fundref-id">10.13039/501100003392</named-content>
</funding-source>
<award-id rid="cn004">2016J01316</award-id>
</award-group>
<award-group>
<funding-source id="cn005">China Scholarship Council
<named-content content-type="fundref-id">10.13039/501100004543</named-content>
</funding-source>
<award-id rid="cn005">201606315011</award-id>
</award-group>
</funding-group>
<counts>
<fig-count count="6"></fig-count>
<table-count count="4"></table-count>
<equation-count count="0"></equation-count>
<ref-count count="49"></ref-count>
<page-count count="18"></page-count>
<word-count count="0"></word-count>
</counts>
</article-meta>
</front>
</pmc>
</record>
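
The abstract in the record above defines a group-specific sequence as one that is present in one group but absent in the other. A minimal sketch of that presence/absence ("logical") idea follows. It is not the MetaGO pipeline, which runs on Apache Spark and whose actual filtering criteria are not reproduced here. The function names, the toy reads, the strict "present in every case, absent from every control" rule, and the small k are all assumptions made for illustration; reverse complements are ignored.

```python
from typing import Iterable, List, Set


def kmers(read: str, k: int) -> Set[str]:
    """Return the set of all length-k substrings of one read."""
    return {read[i:i + k] for i in range(len(read) - k + 1)}


def sample_kmers(reads: Iterable[str], k: int) -> Set[str]:
    """Union of k-mers over all reads of one sample (presence/absence only)."""
    found: Set[str] = set()
    for read in reads:
        found |= kmers(read, k)
    return found


def group_specific(cases: List[List[str]],
                   controls: List[List[str]],
                   k: int) -> Set[str]:
    """k-mers present in every case sample and absent from every control.

    Hypothetical strict rule, chosen only to keep the sketch short.
    """
    case_sets = [sample_kmers(sample, k) for sample in cases]
    control_union: Set[str] = set()
    for sample in controls:
        control_union |= sample_kmers(sample, k)
    shared = set.intersection(*case_sets) if case_sets else set()
    return shared - control_union


# Toy data (hypothetical). The abstract uses k >= 30; k = 4 keeps this readable.
cases = [["ACGTTGCA"], ["TTACGTTG"]]
controls = [["GGGTTGCA"], ["CCCCGTTG"]]
result = group_specific(cases, controls, k=4)
print(sorted(result))
```

On real data the k-mer sets would be far too large for in-memory Python sets, which is presumably one reason the authors chose a distributed framework.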

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Sante/explor/MersV1/Data/Pmc/Curation
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000C80 | SxmlIndent | more

Or

HfdSelect -h $EXPLOR_AREA/Data/Pmc/Curation/biblio.hfd -nk 000C80 | SxmlIndent | more

To link to this page within the Wicri network

{{Explor lien
   |wiki=    Sante
   |area=    MersV1
   |flux=    Pmc
   |étape=   Curation
   |type=    RBID
   |clé=     PMC:5943621
   |texte=   Identifying Group-Specific Sequences for Microbial Communities Using Long k-mer Sequence Signatures
}}

To generate wiki pages

HfdIndexSelect -h $EXPLOR_AREA/Data/Pmc/Curation/RBID.i   -Sk "pubmed:29774017" \
       | HfdSelect -Kh $EXPLOR_AREA/Data/Pmc/Curation/biblio.hfd   \
       | NlmPubMed2Wicri -a MersV1 

Wicri

This area was generated with Dilib version V0.6.33.
Data generation: Mon Apr 20 23:26:43 2020. Site generation: Sat Mar 27 09:06:09 2021