Exploration server on relations between France and Australia

Warning: this site is under development!
Warning: this site was generated automatically from raw corpora.
The information has therefore not been validated.

Internal identifier: 002947 (Pmc/Corpus); previous: 0029469; next: 0029480

The document in XML format

<record>
<TEI>
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en">MixMC: A Multivariate Statistical Framework to Gain Insight into Microbial Communities</title>
<author>
<name sortKey="Le Cao, Kim Anh" sort="Le Cao, Kim Anh" uniqKey="Le Cao K" first="Kim-Anh" last="Lê Cao">Kim-Anh Lê Cao</name>
<affiliation>
<nlm:aff id="aff001">
<addr-line>The University of Queensland Diamantina Institute, The University of Queensland, Translational Research Institute, Brisbane, QLD, Australia</addr-line>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Costello, Mary Ellen" sort="Costello, Mary Ellen" uniqKey="Costello M" first="Mary-Ellen" last="Costello">Mary-Ellen Costello</name>
<affiliation>
<nlm:aff id="aff001">
<addr-line>The University of Queensland Diamantina Institute, The University of Queensland, Translational Research Institute, Brisbane, QLD, Australia</addr-line>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Lakis, Vanessa Anne" sort="Lakis, Vanessa Anne" uniqKey="Lakis V" first="Vanessa Anne" last="Lakis">Vanessa Anne Lakis</name>
<affiliation>
<nlm:aff id="aff001">
<addr-line>The University of Queensland Diamantina Institute, The University of Queensland, Translational Research Institute, Brisbane, QLD, Australia</addr-line>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Bartolo, Francois" sort="Bartolo, Francois" uniqKey="Bartolo F" first="François" last="Bartolo">François Bartolo</name>
<affiliation>
<nlm:aff id="aff002">
<addr-line>Institut de Mathématiques de Toulouse, UMR CNRS 5219 INSA Université de Toulouse, Toulouse, France</addr-line>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Chua, Xin Yi" sort="Chua, Xin Yi" uniqKey="Chua X" first="Xin-Yi" last="Chua">Xin-Yi Chua</name>
<affiliation>
<nlm:aff id="aff003">
<addr-line>Queensland Facility for Advanced Bioinformatics, The Institute for Molecular Bioscience, Brisbane, QLD, Australia</addr-line>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Brazeilles, Remi" sort="Brazeilles, Remi" uniqKey="Brazeilles R" first="Rémi" last="Brazeilles">Rémi Brazeilles</name>
<affiliation>
<nlm:aff id="aff004">
<addr-line>Danone Nutricia Research, Palaiseau Cedex, France</addr-line>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Rondeau, Pascale" sort="Rondeau, Pascale" uniqKey="Rondeau P" first="Pascale" last="Rondeau">Pascale Rondeau</name>
<affiliation>
<nlm:aff id="aff004">
<addr-line>Danone Nutricia Research, Palaiseau Cedex, France</addr-line>
</nlm:aff>
</affiliation>
</author>
</titleStmt>
<publicationStmt>
<idno type="wicri:source">PMC</idno>
<idno type="pmid">27513472</idno>
<idno type="pmc">4981383</idno>
<idno type="url">http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4981383</idno>
<idno type="RBID">PMC:4981383</idno>
<idno type="doi">10.1371/journal.pone.0160169</idno>
<date when="2016">2016</date>
<idno type="wicri:Area/Pmc/Corpus">002947</idno>
<idno type="wicri:explorRef" wicri:stream="Pmc" wicri:step="Corpus" wicri:corpus="PMC">002947</idno>
</publicationStmt>
<sourceDesc>
<biblStruct>
<analytic>
<title xml:lang="en" level="a" type="main">MixMC: A Multivariate Statistical Framework to Gain Insight into Microbial Communities</title>
<author>
<name sortKey="Le Cao, Kim Anh" sort="Le Cao, Kim Anh" uniqKey="Le Cao K" first="Kim-Anh" last="Lê Cao">Kim-Anh Lê Cao</name>
<affiliation>
<nlm:aff id="aff001">
<addr-line>The University of Queensland Diamantina Institute, The University of Queensland, Translational Research Institute, Brisbane, QLD, Australia</addr-line>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Costello, Mary Ellen" sort="Costello, Mary Ellen" uniqKey="Costello M" first="Mary-Ellen" last="Costello">Mary-Ellen Costello</name>
<affiliation>
<nlm:aff id="aff001">
<addr-line>The University of Queensland Diamantina Institute, The University of Queensland, Translational Research Institute, Brisbane, QLD, Australia</addr-line>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Lakis, Vanessa Anne" sort="Lakis, Vanessa Anne" uniqKey="Lakis V" first="Vanessa Anne" last="Lakis">Vanessa Anne Lakis</name>
<affiliation>
<nlm:aff id="aff001">
<addr-line>The University of Queensland Diamantina Institute, The University of Queensland, Translational Research Institute, Brisbane, QLD, Australia</addr-line>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Bartolo, Francois" sort="Bartolo, Francois" uniqKey="Bartolo F" first="François" last="Bartolo">François Bartolo</name>
<affiliation>
<nlm:aff id="aff002">
<addr-line>Institut de Mathématiques de Toulouse, UMR CNRS 5219 INSA Université de Toulouse, Toulouse, France</addr-line>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Chua, Xin Yi" sort="Chua, Xin Yi" uniqKey="Chua X" first="Xin-Yi" last="Chua">Xin-Yi Chua</name>
<affiliation>
<nlm:aff id="aff003">
<addr-line>Queensland Facility for Advanced Bioinformatics, The Institute for Molecular Bioscience, Brisbane, QLD, Australia</addr-line>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Brazeilles, Remi" sort="Brazeilles, Remi" uniqKey="Brazeilles R" first="Rémi" last="Brazeilles">Rémi Brazeilles</name>
<affiliation>
<nlm:aff id="aff004">
<addr-line>Danone Nutricia Research, Palaiseau Cedex, France</addr-line>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Rondeau, Pascale" sort="Rondeau, Pascale" uniqKey="Rondeau P" first="Pascale" last="Rondeau">Pascale Rondeau</name>
<affiliation>
<nlm:aff id="aff004">
<addr-line>Danone Nutricia Research, Palaiseau Cedex, France</addr-line>
</nlm:aff>
</affiliation>
</author>
</analytic>
<series>
<title level="j">PLoS ONE</title>
<idno type="eISSN">1932-6203</idno>
<imprint>
<date when="2016">2016</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
</fileDesc>
<profileDesc>
<textClass></textClass>
</profileDesc>
</teiHeader>
<front>
<div type="abstract" xml:lang="en">
<p>Culture-independent techniques, such as shotgun metagenomics and 16S rRNA amplicon sequencing, have dramatically changed the way we can examine microbial communities. Recently, changes in microbial community structure and dynamics have been associated with a growing list of human diseases. The identification and comparison of bacteria driving those changes requires the development of sound statistical tools, especially if microbial biomarkers are to be used in a clinical setting. We present
<monospace>mixMC</monospace>
, a novel multivariate data analysis framework for metagenomic biomarker discovery.
<monospace>mixMC</monospace>
accounts for the compositional nature of 16S data and enables detection of subtle differences when high inter-subject variability is present due to microbial sampling performed repeatedly on the same subjects, but in multiple habitats. Through data dimension reduction the multivariate methods provide insightful graphical visualisations to characterise each type of environment in a detailed manner. We applied
<monospace>mixMC</monospace>
to 16S microbiome studies focusing on multiple body sites in healthy individuals, compared our results with existing statistical tools and illustrated the added value of using multivariate methodologies to fully characterise and compare microbial communities.</p>
</div>
</front>
<back>
<div1 type="bibliography">
<listBibl>
<biblStruct>
<analytic>
<author>
<name sortKey="Clarridge, J E" uniqKey="Clarridge J">J.E. Clarridge</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Huse, S M" uniqKey="Huse S">S.M. Huse</name>
</author>
<author>
<name sortKey="Welch, D M" uniqKey="Welch D">D.M. Welch</name>
</author>
<author>
<name sortKey="Morrison, H G" uniqKey="Morrison H">H.G. Morrison</name>
</author>
<author>
<name sortKey="Sogin, M L" uniqKey="Sogin M">M.L. Sogin</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Turnbaugh, P J" uniqKey="Turnbaugh P">P.J. Turnbaugh</name>
</author>
<author>
<name sortKey="B Ckhed, F" uniqKey="B Ckhed F">F. Bäckhed</name>
</author>
<author>
<name sortKey="Fulton, L" uniqKey="Fulton L">L. Fulton</name>
</author>
<author>
<name sortKey="Gordon, J I" uniqKey="Gordon J">J.I. Gordon</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Turnbaugh, P J" uniqKey="Turnbaugh P">P.J. Turnbaugh</name>
</author>
<author>
<name sortKey="Hamady, M" uniqKey="Hamady M">M. Hamady</name>
</author>
<author>
<name sortKey="Yatsunenko, T" uniqKey="Yatsunenko T">T. Yatsunenko</name>
</author>
<author>
<name sortKey="Cantarel, B L" uniqKey="Cantarel B">B.L. Cantarel</name>
</author>
<author>
<name sortKey="Duncan, A" uniqKey="Duncan A">A. Duncan</name>
</author>
<author>
<name sortKey="Ley" uniqKey="Ley">Ley</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Duncan, S H" uniqKey="Duncan S">S.H. Duncan</name>
</author>
<author>
<name sortKey="Lobley, G" uniqKey="Lobley G">G. Lobley</name>
</author>
<author>
<name sortKey="Holtrop, G" uniqKey="Holtrop G">G. Holtrop</name>
</author>
<author>
<name sortKey="Ince, J" uniqKey="Ince J">J. Ince</name>
</author>
<author>
<name sortKey="Johnstone, A" uniqKey="Johnstone A">A. Johnstone</name>
</author>
<author>
<name sortKey="Louis, P" uniqKey="Louis P">P. Louis</name>
</author>
<author>
<name sortKey="Flint, H" uniqKey="Flint H">H. Flint</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Gevers, D" uniqKey="Gevers D">D. Gevers</name>
</author>
<author>
<name sortKey="Kugathasan, S" uniqKey="Kugathasan S">S. Kugathasan</name>
</author>
<author>
<name sortKey="Denson, L A" uniqKey="Denson L">L.A. Denson</name>
</author>
<author>
<name sortKey="V Zquez Baeza, Y" uniqKey="V Zquez Baeza Y">Y. Vázquez-Baeza</name>
</author>
<author>
<name sortKey="Van Treuren, W" uniqKey="Van Treuren W">W. Van Treuren</name>
</author>
<author>
<name sortKey="Ren" uniqKey="Ren">Ren</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Costello, M E" uniqKey="Costello M">M.-E. Costello</name>
</author>
<author>
<name sortKey="Ciccia, F" uniqKey="Ciccia F">F. Ciccia</name>
</author>
<author>
<name sortKey="Willner, D" uniqKey="Willner D">D. Willner</name>
</author>
<author>
<name sortKey="Warrington, N" uniqKey="Warrington N">N. Warrington</name>
</author>
<author>
<name sortKey="Robinson, P C" uniqKey="Robinson P">P.C. Robinson</name>
</author>
<author>
<name sortKey="Gardiner, B" uniqKey="Gardiner B">B. Gardiner</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="White, J R" uniqKey="White J">J.R. White</name>
</author>
<author>
<name sortKey="Nagarajan, N" uniqKey="Nagarajan N">N. Nagarajan</name>
</author>
<author>
<name sortKey="Pop, M" uniqKey="Pop M">M. Pop</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Paulson, J N" uniqKey="Paulson J">J.N. Paulson</name>
</author>
<author>
<name sortKey="Stine, O C" uniqKey="Stine O">O.C. Stine</name>
</author>
<author>
<name sortKey="Bravo, H C" uniqKey="Bravo H">H.C. Bravo</name>
</author>
<author>
<name sortKey="Pop, M" uniqKey="Pop M">M. Pop</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Aitchison, J" uniqKey="Aitchison J">J. Aitchison</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Lovell, D" uniqKey="Lovell D">D. Lovell</name>
</author>
<author>
<name sortKey="Pawlowsky Glahn, V" uniqKey="Pawlowsky Glahn V">V. Pawlowsky-Glahn</name>
</author>
<author>
<name sortKey="Egozcue, J J" uniqKey="Egozcue J">J.J. Egozcue</name>
</author>
<author>
<name sortKey="Marguerat, S" uniqKey="Marguerat S">S. Marguerat</name>
</author>
<author>
<name sortKey="B Hler, J" uniqKey="B Hler J">J. Bähler</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Ban, Y" uniqKey="Ban Y">Y. Ban</name>
</author>
<author>
<name sortKey="An, L" uniqKey="An L">L. An</name>
</author>
<author>
<name sortKey="Jiang, H" uniqKey="Jiang H">H. Jiang</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Kurtz, Z D" uniqKey="Kurtz Z">Z.D. Kurtz</name>
</author>
<author>
<name sortKey="Mueller, C L" uniqKey="Mueller C">C.L. Mueller</name>
</author>
<author>
<name sortKey="Miraldi, E R" uniqKey="Miraldi E">E.R. Miraldi</name>
</author>
<author>
<name sortKey="Littman, D R" uniqKey="Littman D">D.R. Littman</name>
</author>
<author>
<name sortKey="Blaser, M J" uniqKey="Blaser M">M.J. Blaser</name>
</author>
<author>
<name sortKey="Bonneau, R A" uniqKey="Bonneau R">R.A. Bonneau</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Mandal, S" uniqKey="Mandal S">S. Mandal</name>
</author>
<author>
<name sortKey="Van Treuren, W" uniqKey="Van Treuren W">W. Van Treuren</name>
</author>
<author>
<name sortKey="White, R A" uniqKey="White R">R.A. White</name>
</author>
<author>
<name sortKey="Eggesbo, M" uniqKey="Eggesbo M">M. Eggesbo</name>
</author>
<author>
<name sortKey="Knight, R" uniqKey="Knight R">R. Knight</name>
</author>
<author>
<name sortKey="Peddada, S D" uniqKey="Peddada S">S.D. Peddada</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Fernandes, A D" uniqKey="Fernandes A">A.D. Fernandes</name>
</author>
<author>
<name sortKey="Reid, J N" uniqKey="Reid J">J.N. Reid</name>
</author>
<author>
<name sortKey="Macklaim, J M" uniqKey="Macklaim J">J.M. Macklaim</name>
</author>
<author>
<name sortKey="Mcmurrough, T A" uniqKey="Mcmurrough T">T.A. McMurrough</name>
</author>
<author>
<name sortKey="Edgell, D R" uniqKey="Edgell D">D.R. Edgell</name>
</author>
<author>
<name sortKey="Gloor, G B" uniqKey="Gloor G">G.B. Gloor</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Kalivodov, A" uniqKey="Kalivodov A">A. Kalivodová</name>
</author>
<author>
<name sortKey="Hron, K" uniqKey="Hron K">K. Hron</name>
</author>
<author>
<name sortKey="Filzmoser, P" uniqKey="Filzmoser P">P. Filzmoser</name>
</author>
<author>
<name sortKey="Najdekr, L" uniqKey="Najdekr L">L. Najdekr</name>
</author>
<author>
<name sortKey="Janeckov, H" uniqKey="Janeckov H">H. Janečková</name>
</author>
<author>
<name sortKey="Adam, T" uniqKey="Adam T">T. Adam</name>
</author>
</analytic>
</biblStruct>
<biblStruct></biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Bray, J R" uniqKey="Bray J">J.R. Bray</name>
</author>
<author>
<name sortKey="Curtis, J T" uniqKey="Curtis J">J.T. Curtis</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Lozupone, C" uniqKey="Lozupone C">C. Lozupone</name>
</author>
<author>
<name sortKey="Knight, R" uniqKey="Knight R">R. Knight</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Lozupone, C A" uniqKey="Lozupone C">C.A. Lozupone</name>
</author>
<author>
<name sortKey="Hamady, M" uniqKey="Hamady M">M. Hamady</name>
</author>
<author>
<name sortKey="Kelley, S T" uniqKey="Kelley S">S.T. Kelley</name>
</author>
<author>
<name sortKey="Knight, R" uniqKey="Knight R">R. Knight</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Doledec, S" uniqKey="Doledec S">S. Dolédec</name>
</author>
<author>
<name sortKey="Chessel, D" uniqKey="Chessel D">D. Chessel</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Segata, N" uniqKey="Segata N">N. Segata</name>
</author>
<author>
<name sortKey="Izard, J" uniqKey="Izard J">J. Izard</name>
</author>
<author>
<name sortKey="Waldron, L" uniqKey="Waldron L">L. Waldron</name>
</author>
<author>
<name sortKey="Gevers, D" uniqKey="Gevers D">D. Gevers</name>
</author>
<author>
<name sortKey="Miropolsky, L" uniqKey="Miropolsky L">L. Miropolsky</name>
</author>
<author>
<name sortKey="Garrett, W S" uniqKey="Garrett W">W.S. Garrett</name>
</author>
</analytic>
</biblStruct>
<biblStruct></biblStruct>
<biblStruct></biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Koren, O" uniqKey="Koren O">O. Koren</name>
</author>
<author>
<name sortKey="Spor, A" uniqKey="Spor A">A. Spor</name>
</author>
<author>
<name sortKey="Felin, J" uniqKey="Felin J">J. Felin</name>
</author>
<author>
<name sortKey="Fak, F" uniqKey="Fak F">F. Fak</name>
</author>
<author>
<name sortKey="Stombaugh, J" uniqKey="Stombaugh J">J. Stombaugh</name>
</author>
<author>
<name sortKey="Tremaroli, V" uniqKey="Tremaroli V">V. Tremaroli</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Caporaso, J G" uniqKey="Caporaso J">J.G. Caporaso</name>
</author>
<author>
<name sortKey="Kuczynski, J" uniqKey="Kuczynski J">J. Kuczynski</name>
</author>
<author>
<name sortKey="Stombaugh, J" uniqKey="Stombaugh J">J. Stombaugh</name>
</author>
<author>
<name sortKey="Bittinger, K" uniqKey="Bittinger K">K. Bittinger</name>
</author>
<author>
<name sortKey="Bushman, F D" uniqKey="Bushman F">F.D. Bushman</name>
</author>
<author>
<name sortKey="Costello, E K" uniqKey="Costello E">E.K. Costello</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Bokulich, N A" uniqKey="Bokulich N">N.A. Bokulich</name>
</author>
<author>
<name sortKey="Subramanian, S" uniqKey="Subramanian S">S. Subramanian</name>
</author>
<author>
<name sortKey="Faith, J J" uniqKey="Faith J">J.J. Faith</name>
</author>
<author>
<name sortKey="Gevers, D" uniqKey="Gevers D">D. Gevers</name>
</author>
<author>
<name sortKey="Gordon, J I" uniqKey="Gordon J">J.I. Gordon</name>
</author>
<author>
<name sortKey="Knight, R" uniqKey="Knight R">R. Knight</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Kunin, V" uniqKey="Kunin V">V Kunin</name>
</author>
<author>
<name sortKey="Engelbrektson, A" uniqKey="Engelbrektson A">A Engelbrektson</name>
</author>
<author>
<name sortKey="Ochman, H" uniqKey="Ochman H">H Ochman</name>
</author>
<author>
<name sortKey="Hugenholtz, P" uniqKey="Hugenholtz P">P Hugenholtz</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Knights, D" uniqKey="Knights D">D. Knights</name>
</author>
<author>
<name sortKey="Parfrey, L W" uniqKey="Parfrey L">L.W. Parfrey</name>
</author>
<author>
<name sortKey="Zaneveld, J" uniqKey="Zaneveld J">J. Zaneveld</name>
</author>
<author>
<name sortKey="Lozupone, C" uniqKey="Lozupone C">C. Lozupone</name>
</author>
<author>
<name sortKey="Knight, R" uniqKey="Knight R">R. Knight</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Arumugam, M" uniqKey="Arumugam M">M. Arumugam</name>
</author>
<author>
<name sortKey="Raes, J" uniqKey="Raes J">J. Raes</name>
</author>
<author>
<name sortKey="Pelletier, E" uniqKey="Pelletier E">E. Pelletier</name>
</author>
<author>
<name sortKey="Le Paslier, D" uniqKey="Le Paslier D">D. Le Paslier</name>
</author>
<author>
<name sortKey="Yamada, T" uniqKey="Yamada T">T. Yamada</name>
</author>
<author>
<name sortKey="Mende, D R" uniqKey="Mende D">D.R. Mende</name>
</author>
</analytic>
</biblStruct>
<biblStruct></biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Filzmoser, P" uniqKey="Filzmoser P">P. Filzmoser</name>
</author>
<author>
<name sortKey="Hron, K" uniqKey="Hron K">K. Hron</name>
</author>
<author>
<name sortKey="Reimann, C" uniqKey="Reimann C">C. Reimann</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Templ, M" uniqKey="Templ M">M. Templ</name>
</author>
<author>
<name sortKey="Hron, K" uniqKey="Hron K">K. Hron</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Westerhuis, J A" uniqKey="Westerhuis J">J.A. Westerhuis</name>
</author>
<author>
<name sortKey="Van Velzen, E J" uniqKey="Van Velzen E">E.J. van Velzen</name>
</author>
<author>
<name sortKey="Hoefsloot, H C" uniqKey="Hoefsloot H">H.C. Hoefsloot</name>
</author>
<author>
<name sortKey="Smilde, A K" uniqKey="Smilde A">A.K. Smilde</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Liquet, B" uniqKey="Liquet B">B. Liquet</name>
</author>
<author>
<name sortKey="Le Cao, K A" uniqKey="Le Cao K">K.-A. Lê Cao</name>
</author>
<author>
<name sortKey="Hocini, H" uniqKey="Hocini H">H. Hocini</name>
</author>
<author>
<name sortKey="Thiebaut, R" uniqKey="Thiebaut R">R. Thiébaut</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Straube, J" uniqKey="Straube J">J. Straube</name>
</author>
<author>
<name sortKey="Gorse, A D" uniqKey="Gorse A">A.-D. Gorse</name>
</author>
<author>
<name sortKey="Huang, B E" uniqKey="Huang B">B.E. Huang</name>
</author>
<author>
<name sortKey="Le Cao, K A" uniqKey="Le Cao K">K.-A. Lê Cao</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Le Cao, K A" uniqKey="Le Cao K">K.-A. Lê Cao</name>
</author>
<author>
<name sortKey="Boitard, S" uniqKey="Boitard S">S. Boitard</name>
</author>
<author>
<name sortKey="Besse, P" uniqKey="Besse P">P. Besse</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Wold, S" uniqKey="Wold S">S. Wold</name>
</author>
<author>
<name sortKey="Sjostrom, M" uniqKey="Sjostrom M">M. Sjöström</name>
</author>
<author>
<name sortKey="Eriksson, L" uniqKey="Eriksson L">L. Eriksson</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Tibshirani, R" uniqKey="Tibshirani R">R. Tibshirani</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Asnicar, F" uniqKey="Asnicar F">F. Asnicar</name>
</author>
<author>
<name sortKey="Weingart, G" uniqKey="Weingart G">G. Weingart</name>
</author>
<author>
<name sortKey="Tickle, T" uniqKey="Tickle T">T. Tickle</name>
</author>
<author>
<name sortKey="Huttenhower, C" uniqKey="Huttenhower C">C. Huttenhower</name>
</author>
<author>
<name sortKey="Segata, N" uniqKey="Segata N">N. Segata</name>
</author>
</analytic>
</biblStruct>
<biblStruct></biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Benjamini, Y" uniqKey="Benjamini Y">Y. Benjamini</name>
</author>
<author>
<name sortKey="Hochberg, Y" uniqKey="Hochberg Y">Y. Hochberg</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Anders, S" uniqKey="Anders S">S. Anders</name>
</author>
<author>
<name sortKey="Huber, W" uniqKey="Huber W">W. Huber</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Mcmurdie, P J" uniqKey="Mcmurdie P">P.J. McMurdie</name>
</author>
<author>
<name sortKey="Holmes, S" uniqKey="Holmes S">S. Holmes</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Love, M I" uniqKey="Love M">M.I. Love</name>
</author>
<author>
<name sortKey="Huber, W" uniqKey="Huber W">W. Huber</name>
</author>
<author>
<name sortKey="Anders, S" uniqKey="Anders S">S. Anders</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Li, K" uniqKey="Li K">K. Li</name>
</author>
<author>
<name sortKey="Bihan, M" uniqKey="Bihan M">M. Bihan</name>
</author>
<author>
<name sortKey="Yooseph, S" uniqKey="Yooseph S">S. Yooseph</name>
</author>
<author>
<name sortKey="Methee, B A" uniqKey="Methee B">B.A. Methée</name>
</author>
</analytic>
</biblStruct>
<biblStruct></biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="He, X" uniqKey="He X">X. He</name>
</author>
<author>
<name sortKey="Mclean, J S" uniqKey="Mclean J">J.S. McLean</name>
</author>
<author>
<name sortKey="Edlund, A" uniqKey="Edlund A">A. Edlund</name>
</author>
<author>
<name sortKey="Yooseph, S" uniqKey="Yooseph S">S. Yooseph</name>
</author>
<author>
<name sortKey="Hall, A P" uniqKey="Hall A">A.P. Hall</name>
</author>
<author>
<name sortKey="Liu, S Y" uniqKey="Liu S">S.-Y. Liu</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Warton, D I" uniqKey="Warton D">D.I. Warton</name>
</author>
<author>
<name sortKey="Wright, S T" uniqKey="Wright S">S.T. Wright</name>
</author>
<author>
<name sortKey="Wang, Y" uniqKey="Wang Y">Y. Wang</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Franzosa, E A" uniqKey="Franzosa E">E.A. Franzosa</name>
</author>
<author>
<name sortKey="Morgan, X C" uniqKey="Morgan X">X.C. Morgan</name>
</author>
<author>
<name sortKey="Segata, N" uniqKey="Segata N">N. Segata</name>
</author>
<author>
<name sortKey="Waldron, L" uniqKey="Waldron L">L. Waldron</name>
</author>
<author>
<name sortKey="Reyes, J" uniqKey="Reyes J">J. Reyes</name>
</author>
<author>
<name sortKey="Earl, A M" uniqKey="Earl A">A.M. Earl</name>
</author>
</analytic>
</biblStruct>
<biblStruct></biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Gonz Lez, I" uniqKey="Gonz Lez I">I. González</name>
</author>
<author>
<name sortKey="Le Cao, K A" uniqKey="Le Cao K">K.-A. Lê Cao</name>
</author>
<author>
<name sortKey="Davis, M J" uniqKey="Davis M">M.J. Davis</name>
</author>
<author>
<name sortKey="Dejean, S" uniqKey="Dejean S">S. Déjean</name>
</author>
</analytic>
</biblStruct>
</listBibl>
</div1>
</back>
</TEI>
<pmc article-type="research-article">
<pmc-dir>properties open_access</pmc-dir>
<front>
<journal-meta>
<journal-id journal-id-type="nlm-ta">PLoS One</journal-id>
<journal-id journal-id-type="iso-abbrev">PLoS ONE</journal-id>
<journal-id journal-id-type="publisher-id">plos</journal-id>
<journal-id journal-id-type="pmc">plosone</journal-id>
<journal-title-group>
<journal-title>PLoS ONE</journal-title>
</journal-title-group>
<issn pub-type="epub">1932-6203</issn>
<publisher>
<publisher-name>Public Library of Science</publisher-name>
<publisher-loc>San Francisco, CA USA</publisher-loc>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="pmid">27513472</article-id>
<article-id pub-id-type="pmc">4981383</article-id>
<article-id pub-id-type="publisher-id">PONE-D-16-08651</article-id>
<article-id pub-id-type="doi">10.1371/journal.pone.0160169</article-id>
<article-categories>
<subj-group subj-group-type="heading">
<subject>Research Article</subject>
</subj-group>
<subj-group subj-group-type="Discipline-v3">
<subject>Biology and Life Sciences</subject>
<subj-group>
<subject>Microbiology</subject>
<subj-group>
<subject>Medical Microbiology</subject>
<subj-group>
<subject>Microbiome</subject>
</subj-group>
</subj-group>
</subj-group>
</subj-group>
<subj-group subj-group-type="Discipline-v3">
<subject>Biology and Life Sciences</subject>
<subj-group>
<subject>Genetics</subject>
<subj-group>
<subject>Genomics</subject>
<subj-group>
<subject>Microbial Genomics</subject>
<subj-group>
<subject>Microbiome</subject>
</subj-group>
</subj-group>
</subj-group>
</subj-group>
</subj-group>
<subj-group subj-group-type="Discipline-v3">
<subject>Biology and Life Sciences</subject>
<subj-group>
<subject>Microbiology</subject>
<subj-group>
<subject>Microbial Genomics</subject>
<subj-group>
<subject>Microbiome</subject>
</subj-group>
</subj-group>
</subj-group>
</subj-group>
<subj-group subj-group-type="Discipline-v3">
<subject>Biology and Life Sciences</subject>
<subj-group>
<subject>Organisms</subject>
<subj-group>
<subject>Bacteria</subject>
<subj-group>
<subject>Streptococcus</subject>
</subj-group>
</subj-group>
</subj-group>
</subj-group>
<subj-group subj-group-type="Discipline-v3">
<subject>Biology and Life Sciences</subject>
<subj-group>
<subject>Microbiology</subject>
<subj-group>
<subject>Medical Microbiology</subject>
<subj-group>
<subject>Microbial Pathogens</subject>
<subj-group>
<subject>Bacterial Pathogens</subject>
<subj-group>
<subject>Streptococcus</subject>
</subj-group>
</subj-group>
</subj-group>
</subj-group>
</subj-group>
</subj-group>
<subj-group subj-group-type="Discipline-v3">
<subject>Medicine and Health Sciences</subject>
<subj-group>
<subject>Pathology and Laboratory Medicine</subject>
<subj-group>
<subject>Pathogens</subject>
<subj-group>
<subject>Microbial Pathogens</subject>
<subj-group>
<subject>Bacterial Pathogens</subject>
<subj-group>
<subject>Streptococcus</subject>
</subj-group>
</subj-group>
</subj-group>
</subj-group>
</subj-group>
</subj-group>
<subj-group subj-group-type="Discipline-v3">
<subject>Biology and Life Sciences</subject>
<subj-group>
<subject>Organisms</subject>
<subj-group>
<subject>Bacteria</subject>
</subj-group>
</subj-group>
</subj-group>
<subj-group subj-group-type="Discipline-v3">
<subject>Research and Analysis Methods</subject>
<subj-group>
<subject>Research Design</subject>
<subj-group>
<subject>Experimental Design</subject>
</subj-group>
</subj-group>
</subj-group>
<subj-group subj-group-type="Discipline-v3">
<subject>Physical Sciences</subject>
<subj-group>
<subject>Mathematics</subject>
<subj-group>
<subject>Statistics (Mathematics)</subject>
<subj-group>
<subject>Statistical Data</subject>
</subj-group>
</subj-group>
</subj-group>
</subj-group>
<subj-group subj-group-type="Discipline-v3">
<subject>Biology and Life Sciences</subject>
<subj-group>
<subject>Anatomy</subject>
<subj-group>
<subject>Body Fluids</subject>
<subj-group>
<subject>Saliva</subject>
</subj-group>
</subj-group>
</subj-group>
</subj-group>
<subj-group subj-group-type="Discipline-v3">
<subject>Medicine and Health Sciences</subject>
<subj-group>
<subject>Anatomy</subject>
<subj-group>
<subject>Body Fluids</subject>
<subj-group>
<subject>Saliva</subject>
</subj-group>
</subj-group>
</subj-group>
</subj-group>
<subj-group subj-group-type="Discipline-v3">
<subject>Biology and Life Sciences</subject>
<subj-group>
<subject>Physiology</subject>
<subj-group>
<subject>Body Fluids</subject>
<subj-group>
<subject>Saliva</subject>
</subj-group>
</subj-group>
</subj-group>
</subj-group>
<subj-group subj-group-type="Discipline-v3">
<subject>Medicine and Health Sciences</subject>
<subj-group>
<subject>Physiology</subject>
<subj-group>
<subject>Body Fluids</subject>
<subj-group>
<subject>Saliva</subject>
</subj-group>
</subj-group>
</subj-group>
</subj-group>
<subj-group subj-group-type="Discipline-v3">
<subject>Biology and Life Sciences</subject>
<subj-group>
<subject>Organisms</subject>
<subj-group>
<subject>Bacteria</subject>
<subj-group>
<subject>Burkholderia</subject>
</subj-group>
</subj-group>
</subj-group>
</subj-group>
<subj-group subj-group-type="Discipline-v3">
<subject>Biology and Life Sciences</subject>
<subj-group>
<subject>Microbiology</subject>
<subj-group>
<subject>Medical Microbiology</subject>
<subj-group>
<subject>Microbial Pathogens</subject>
<subj-group>
<subject>Bacterial Pathogens</subject>
<subj-group>
<subject>Burkholderia</subject>
</subj-group>
</subj-group>
</subj-group>
</subj-group>
</subj-group>
</subj-group>
<subj-group subj-group-type="Discipline-v3">
<subject>Medicine and Health Sciences</subject>
<subj-group>
<subject>Pathology and Laboratory Medicine</subject>
<subj-group>
<subject>Pathogens</subject>
<subj-group>
<subject>Microbial Pathogens</subject>
<subj-group>
<subject>Bacterial Pathogens</subject>
<subj-group>
<subject>Burkholderia</subject>
</subj-group>
</subj-group>
</subj-group>
</subj-group>
</subj-group>
</subj-group>
<subj-group subj-group-type="Discipline-v3">
<subject>Research and Analysis Methods</subject>
<subj-group>
<subject>Mathematical and Statistical Techniques</subject>
<subj-group>
<subject>Statistical Methods</subject>
<subj-group>
<subject>Multivariate Analysis</subject>
</subj-group>
</subj-group>
</subj-group>
</subj-group>
<subj-group subj-group-type="Discipline-v3">
<subject>Physical Sciences</subject>
<subj-group>
<subject>Mathematics</subject>
<subj-group>
<subject>Statistics (Mathematics)</subject>
<subj-group>
<subject>Statistical Methods</subject>
<subj-group>
<subject>Multivariate Analysis</subject>
</subj-group>
</subj-group>
</subj-group>
</subj-group>
</subj-group>
</article-categories>
<title-group>
<article-title>MixMC: A Multivariate Statistical Framework to Gain Insight into Microbial Communities</article-title>
<alt-title alt-title-type="running-head">A Multivariate Statistical Framework to Gain Insight into Microbial Communities</alt-title>
</title-group>
<contrib-group>
<contrib contrib-type="author" equal-contrib="yes">
<contrib-id authenticated="true" contrib-id-type="orcid">http://orcid.org/0000-0003-3923-1116</contrib-id>
<name>
<surname>Lê Cao</surname>
<given-names>Kim-Anh</given-names>
</name>
<xref ref-type="aff" rid="aff001">
<sup>1</sup>
</xref>
<xref ref-type="corresp" rid="cor001">*</xref>
</contrib>
<contrib contrib-type="author" equal-contrib="yes">
<name>
<surname>Costello</surname>
<given-names>Mary-Ellen</given-names>
</name>
<xref ref-type="aff" rid="aff001">
<sup>1</sup>
</xref>
<xref ref-type="author-notes" rid="currentaff001">
<sup>¤</sup>
</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Lakis</surname>
<given-names>Vanessa Anne</given-names>
</name>
<xref ref-type="aff" rid="aff001">
<sup>1</sup>
</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Bartolo</surname>
<given-names>François</given-names>
</name>
<xref ref-type="aff" rid="aff002">
<sup>2</sup>
</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Chua</surname>
<given-names>Xin-Yi</given-names>
</name>
<xref ref-type="aff" rid="aff003">
<sup>3</sup>
</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Brazeilles</surname>
<given-names>Rémi</given-names>
</name>
<xref ref-type="aff" rid="aff004">
<sup>4</sup>
</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Rondeau</surname>
<given-names>Pascale</given-names>
</name>
<xref ref-type="aff" rid="aff004">
<sup>4</sup>
</xref>
</contrib>
</contrib-group>
<aff id="aff001">
<label>1</label>
<addr-line>The University of Queensland Diamantina Institute, The University of Queensland, Translational Research Institute, Brisbane, QLD, Australia</addr-line>
</aff>
<aff id="aff002">
<label>2</label>
<addr-line>Institut de Mathématiques de Toulouse, UMR CNRS 5219 INSA Université de Toulouse, Toulouse, France</addr-line>
</aff>
<aff id="aff003">
<label>3</label>
<addr-line>Queensland Facility for Advanced Bioinformatics, The Institute for Molecular Bioscience, Brisbane, QLD, Australia</addr-line>
</aff>
<aff id="aff004">
<label>4</label>
<addr-line>Danone Nutricia Research, Palaiseau Cedex, France</addr-line>
</aff>
<contrib-group>
<contrib contrib-type="editor">
<name>
<surname>Moreno-Hagelsieb</surname>
<given-names>Gabriel</given-names>
</name>
<role>Editor</role>
<xref ref-type="aff" rid="edit1"></xref>
</contrib>
</contrib-group>
<aff id="edit1">
<addr-line>Wilfrid Laurier University, CANADA</addr-line>
</aff>
<author-notes>
<fn fn-type="conflict" id="coi001">
<p>
<bold>Competing Interests: </bold>
The authors confirm that there is no competing interest or financial disclosure to Danone Nutricia Research. This does not alter the authors’ adherence to PLOS ONE policies on sharing data and materials.</p>
</fn>
<fn fn-type="con">
<p>
<list list-type="simple">
<list-item>
<p>
<bold>Conceived and designed the experiments</bold>
: KALC RB.</p>
</list-item>
<list-item>
<p>
<bold>Analyzed the data</bold>
: KALC MEC VAL FB XYC RB.</p>
</list-item>
<list-item>
<p>
<bold>Wrote the paper</bold>
: KALC MEC.</p>
</list-item>
<list-item>
<p>
<bold>Participated in the design of the study</bold>
: PR.</p>
</list-item>
</list>
</p>
</fn>
<fn fn-type="current-aff" id="currentaff001">
<label>¤</label>
<p>Current address: Queensland University of Technology, Translational Research Institute, Brisbane, QLD 4102, Australia</p>
</fn>
<corresp id="cor001">* E-mail:
<email>k.lecao@uq.edu.au</email>
</corresp>
</author-notes>
<pub-date pub-type="collection">
<year>2016</year>
</pub-date>
<pub-date pub-type="epub">
<day>11</day>
<month>8</month>
<year>2016</year>
</pub-date>
<volume>11</volume>
<issue>8</issue>
<elocation-id>e0160169</elocation-id>
<history>
<date date-type="received">
<day>29</day>
<month>2</month>
<year>2016</year>
</date>
<date date-type="accepted">
<day>14</day>
<month>7</month>
<year>2016</year>
</date>
</history>
<permissions>
<copyright-statement>© 2016 Lê Cao et al</copyright-statement>
<copyright-year>2016</copyright-year>
<copyright-holder>Lê Cao et al</copyright-holder>
<license xlink:href="http://creativecommons.org/licenses/by/4.0/">
<license-p>This is an open access article distributed under the terms of the
<ext-link ext-link-type="uri" xlink:href="http://creativecommons.org/licenses/by/4.0/">Creative Commons Attribution License</ext-link>
, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.</license-p>
</license>
</permissions>
<self-uri content-type="pdf" xlink:href="pone.0160169.pdf"></self-uri>
<abstract>
<p>Culture-independent techniques, such as shotgun metagenomics and 16S rRNA amplicon sequencing, have dramatically changed the way we can examine microbial communities. Recently, changes in microbial community structure and dynamics have been associated with a growing list of human diseases. The identification and comparison of bacteria driving those changes requires the development of sound statistical tools, especially if microbial biomarkers are to be used in a clinical setting. We present
<monospace>mixMC</monospace>
, a novel multivariate data analysis framework for metagenomic biomarker discovery.
<monospace>mixMC</monospace>
accounts for the compositional nature of 16S data and enables detection of subtle differences when high inter-subject variability is present due to microbial sampling performed repeatedly on the same subjects, but in multiple habitats. Through data dimension reduction the multivariate methods provide insightful graphical visualisations to characterise each type of environment in a detailed manner. We applied
<monospace>mixMC</monospace>
to 16S microbiome studies focusing on multiple body sites in healthy individuals, compared our results with existing statistical tools and illustrated the added value of using multivariate methodologies to fully characterise and compare microbial communities.</p>
</abstract>
<funding-group>
<award-group id="award001">
<funding-source>
<institution-wrap>
<institution-id institution-id-type="funder-id">http://dx.doi.org/10.13039/501100000925</institution-id>
<institution>National Health and Medical Research Council</institution>
</institution-wrap>
</funding-source>
<award-id>APP1087415</award-id>
<principal-award-recipient>
<contrib-id authenticated="true" contrib-id-type="orcid">http://orcid.org/0000-0003-3923-1116</contrib-id>
<name>
<surname>Lê Cao</surname>
<given-names>Kim-Anh</given-names>
</name>
</principal-award-recipient>
</award-group>
<award-group id="award002">
<funding-source>
<institution-wrap>
<institution-id institution-id-type="funder-id">http://dx.doi.org/10.13039/501100000947</institution-id>
<institution>Australian Cancer Research Foundation</institution>
</institution-wrap>
</funding-source>
<award-id>Diamantina Individualised Oncology Care Centre</award-id>
<principal-award-recipient>
<contrib-id authenticated="true" contrib-id-type="orcid">http://orcid.org/0000-0003-3923-1116</contrib-id>
<name>
<surname>Lê Cao</surname>
<given-names>Kim-Anh</given-names>
</name>
</principal-award-recipient>
</award-group>
<funding-statement>KALC was supported in part by the Australian Cancer Research Foundation (ACRF) for the Diamantina Individualised Oncology Care Centre at The University of Queensland Diamantina Institute and the National Health and Medical Research Council (NHMRC) Career Development fellowship (APP1087415). FB was supported by the Agence Nationale de la Recherche (ANR) for the SYNTHACS project (ANR-10-BTBR-05-02). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript. The authors confirm that there is no competing interest or financial disclosure to Danone Nutricia Research. This does not alter the authors’ adherence to PLOS ONE policies on sharing data and materials.</funding-statement>
</funding-group>
<counts>
<fig-count count="6"></fig-count>
<table-count count="1"></table-count>
<page-count count="21"></page-count>
</counts>
<custom-meta-group>
<custom-meta id="data-availability">
<meta-name>Data Availability</meta-name>
<meta-value>Data from the Human Microbiome Project are available from
<ext-link ext-link-type="uri" xlink:href="http://hmpdacc.org/HMQCP/all/">http://hmpdacc.org/HMQCP/all/</ext-link>
. The processed data analysed in this study are available from our website
<ext-link ext-link-type="uri" xlink:href="http://www.mixOmics.org/mixMC">www.mixOmics.org/mixMC</ext-link>
. Data from the Koren study were downloaded from the Qiita database
<ext-link ext-link-type="uri" xlink:href="http://qiita.microbio.me/study/description/349">http://qiita.microbio.me/study/description/349</ext-link>
. The processed data analysed in this study are available from our website
<ext-link ext-link-type="uri" xlink:href="http://www.mixOmics.org/mixMC">www.mixOmics.org/mixMC</ext-link>
.</meta-value>
</custom-meta>
</custom-meta-group>
</article-meta>
<notes>
<title>Data Availability</title>
<p>Data from the Human Microbiome Project are available from
<ext-link ext-link-type="uri" xlink:href="http://hmpdacc.org/HMQCP/all/">http://hmpdacc.org/HMQCP/all/</ext-link>
. The processed data analysed in this study are available from our website
<ext-link ext-link-type="uri" xlink:href="http://www.mixOmics.org/mixMC">www.mixOmics.org/mixMC</ext-link>
. Data from the Koren study were downloaded from the Qiita database
<ext-link ext-link-type="uri" xlink:href="http://qiita.microbio.me/study/description/349">http://qiita.microbio.me/study/description/349</ext-link>
. The processed data analysed in this study are available from our website
<ext-link ext-link-type="uri" xlink:href="http://www.mixOmics.org/mixMC">www.mixOmics.org/mixMC</ext-link>
.</p>
</notes>
</front>
<body>
<sec sec-type="intro" id="sec001">
<title>Introduction</title>
<p>The human gut microbiome contains a dynamic and vast array of microbes that are essential to health and provide important metabolic capabilities. Until recently, studying these complex communities has been difficult and generally limited to classical phenotypic techniques [
<xref rid="pone.0160169.ref001" ref-type="bibr">1</xref>
,
<xref rid="pone.0160169.ref002" ref-type="bibr">2</xref>
]. With the improvement of high-throughput sequencing technology, the ability to profile complex microbial communities without the need to individually culture organisms has increased dramatically. These sequencing methods range from RNA sequencing (RNA-seq) and chromatin immunoprecipitation sequencing (ChIP-seq) to metagenomic and 16S rRNA gene amplification analysis of microbial populations. 16S rRNA sequencing in particular has substantially changed our understanding of phylogeny and microbial diversity, and is quickly becoming a staple for profiling microbial communities and their abundances from soil to humans. With this sequencing technique, hypervariable regions within the gene are amplified, sequenced, and clustered into operational taxonomic units (OTUs). Representative sequences from each cluster are then aligned against a database of previously characterised 16S ribosomal DNA reference sequences to assign taxonomy and identify bacteria of interest. As alterations in microbiomes have been associated with a range of diseases including obesity [
<xref rid="pone.0160169.ref003" ref-type="bibr">3</xref>
–
<xref rid="pone.0160169.ref005" ref-type="bibr">5</xref>
], Crohn’s disease [
<xref rid="pone.0160169.ref006" ref-type="bibr">6</xref>
] or ankylosing spondylitis [
<xref rid="pone.0160169.ref007" ref-type="bibr">7</xref>
], it is essential that these data are analysed appropriately given the impact on human health and disease treatment outcomes [
<xref rid="pone.0160169.ref008" ref-type="bibr">8</xref>
].</p>
<p>A number of statistical analysis tools have been proposed to examine differences between microbial communities as well as to identify features that are key to driving the differences. Those methods were developed to accommodate the specific
<italic>sparse</italic>
nature of microbiome data. White
<italic>et al</italic>
. proposed Metastats, which uses a non-parametric t-test based on permutation, or Fisher’s exact test when data are sparsely sampled [
<xref rid="pone.0160169.ref008" ref-type="bibr">8</xref>
]. Their approach was a first step towards identifying organisms whose differential abundance correlated with disease. Paulson
<italic>et al</italic>
. developed a zero-inflated Gaussian (ZIG) distribution mixture model to account for biases due to undersampling of the microbial community [
<xref rid="pone.0160169.ref009" ref-type="bibr">9</xref>
].</p>
<p>The other characteristic of microbiome data is their underlying
<italic>compositional</italic>
structure. Due to varying sampling/sequencing depths between samples from high-throughput sequencing, each OTU count is converted into relative abundance (proportion) in each sample. This intuitive pre-processing step results in compositional data which reside in a simplex sample space rather than in Euclidean space [
<xref rid="pone.0160169.ref010" ref-type="bibr">10</xref>
]. As a consequence, conventional statistical methods including correlation coefficients or univariate methods may lead to spurious results as the independence assumption between predictor variables is not met [
<xref rid="pone.0160169.ref011" ref-type="bibr">11</xref>
–
<xref rid="pone.0160169.ref013" ref-type="bibr">13</xref>
]. A growing list of references advocate against the use of such methods for microbiome compositional data [
<xref rid="pone.0160169.ref014" ref-type="bibr">14</xref>
,
<xref rid="pone.0160169.ref015" ref-type="bibr">15</xref>
]. One solution, proposed by Aitchison, is to transform compositional data into Euclidean space using the centered log ratio transformation (CLR) before applying standard univariate or multivariate methods [
<xref rid="pone.0160169.ref010" ref-type="bibr">10</xref>
,
<xref rid="pone.0160169.ref014" ref-type="bibr">14</xref>
,
<xref rid="pone.0160169.ref016" ref-type="bibr">16</xref>
].</p>
<p>Another important aspect to consider when analysing microbiome data is that microbial communities modulate and influence biological pathways as a whole. Therefore, univariate statistical approaches that test each OTU feature individually, disregarding interactions or correlations between features, may provide limited insight into the microbiome. One could instead consider multivariate methods, as they analyse the entire set of OTUs at once. So far, most multivariate approaches are solely used to visualise diversity patterns, such as unsupervised Principal Coordinate Analysis (PCoA [
<xref rid="pone.0160169.ref017" ref-type="bibr">17</xref>
]) based on sample-wise distance/dissimilarity matrices to scale for species abundance (e.g. Bray-Curtis [
<xref rid="pone.0160169.ref018" ref-type="bibr">18</xref>
], unweighted [
<xref rid="pone.0160169.ref019" ref-type="bibr">19</xref>
] or weighted Unifrac [
<xref rid="pone.0160169.ref020" ref-type="bibr">20</xref>
] distances), or supervised between-class analysis [
<xref rid="pone.0160169.ref021" ref-type="bibr">21</xref>
] to segregate sample groups. However, those multivariate approaches limit our understanding as they do not indicate which key species discriminate the sample groups, with the exception of ALDEx2 [
<xref rid="pone.0160169.ref015" ref-type="bibr">15</xref>
], and LEfSe [
<xref rid="pone.0160169.ref022" ref-type="bibr">22</xref>
]. However, those two methods still rely on univariate tests (Welch’s t-test or the Wilcoxon rank test) as a first step to assess the significance of each OTU.</p>
<p>Finally, the other critical issue we address in this study is high inter-subject variability [
<xref rid="pone.0160169.ref004" ref-type="bibr">4</xref>
], which is often reduced with an appropriate experimental repeated-measures design where each subject acts as its own control. Thus, microbial sampling is performed repeatedly on the same subjects over different habitats. While such experimental design has been widely adopted by community profiling studies such as the Human Microbiome Project (HMP, [
<xref rid="pone.0160169.ref023" ref-type="bibr">23</xref>
,
<xref rid="pone.0160169.ref024" ref-type="bibr">24</xref>
]) to define a ‘healthy’ microbiome community by characterising different body sites in the same subjects, very few statistical methods have taken advantage of this design and accommodated inter-subject variability.</p>
<p>We introduce
<monospace>mixMC</monospace>
, a multivariate analysis framework for 16S data to identify OTU features discriminating multiple groups of samples.
<monospace>mixMC</monospace>
addresses the limitations of existing multivariate methods for microbiome studies and proposes unique analytical capabilities: it handles compositional and sparse data, repeated-measures experiments and multiclass problems; it highlights important discriminative features; and it provides interpretable graphical outputs to better understand the contribution of microbial communities to each habitat. We applied
<monospace>mixMC</monospace>
to multiple body site studies in healthy individuals from HMP and the study from Koren
<italic>et al</italic>
. [
<xref rid="pone.0160169.ref025" ref-type="bibr">25</xref>
], compared our results with existing univariate statistical approaches and provided thorough interpretations of the microbial communities unraveled using our multivariate analyses.</p>
</sec>
<sec sec-type="materials|methods" id="sec002">
<title>Material and Methods</title>
<p>We analysed publicly available 16S data from the NIH Human Microbiome Project and cross-compared our results with the microbiome study from Koren
<italic>et al</italic>
. [
<xref rid="pone.0160169.ref025" ref-type="bibr">25</xref>
]. The data were processed by the open-source bioinformatics software QIIME [
<xref rid="pone.0160169.ref026" ref-type="bibr">26</xref>
] for the 16S variable region 1–3. We first describe the processing and normalisation steps, then the statistical methods applied in this study, as summarised in
<xref ref-type="fig" rid="pone.0160169.g001">Fig 1A</xref>
.</p>
<fig id="pone.0160169.g001" orientation="portrait" position="float">
<object-id pub-id-type="doi">10.1371/journal.pone.0160169.g001</object-id>
<label>Fig 1</label>
<caption>
<title>Comparison between multivariate and univariate statistical analysis frameworks for 16S microbiome data.</title>
<p>
<bold>(A)</bold>
Multivariate
<monospace>mixMC</monospace>
framework including processing/normalisation, optional repeated measures design, unsupervised and supervised analyses,
<bold>(B)</bold>
Univariate framework, including normalisation and optional repeated measures design analysis.</p>
</caption>
<graphic xlink:href="pone.0160169.g001"></graphic>
</fig>
<sec id="sec003">
<title>Data processing and normalisation</title>
<p>One of the characteristics of 16S data is their sparse nature and the differences in sequencing depth, which makes preprocessing and normalisation steps crucial when the aim is to characterise and differentiate microbial communities. Since this study focuses only on beta diversity and differences in abundance between sample groups, we do not recommend using a rarefaction step prior to the
<monospace>mixMC</monospace>
analyses.</p>
<sec id="sec004">
<title>Prefiltering</title>
<p>Bokulich
<italic>et al</italic>
. demonstrated that strict quality filtering of reads greatly improves measures for microbial community profiling [
<xref rid="pone.0160169.ref027" ref-type="bibr">27</xref>
]. After removing samples with a very low number of total OTU counts (less than 10), we removed OTUs with proportional counts across all samples below 0.01%. While this may appear drastic, this prefiltering step can counteract sequencing errors, estimated to be 1/1000 in Illumina MiSeq for example [
<xref rid="pone.0160169.ref002" ref-type="bibr">2</xref>
,
<xref rid="pone.0160169.ref028" ref-type="bibr">28</xref>
]. The prefiltering step avoids spurious results in the downstream statistical analysis. The proposed threshold is the default value in QIIME that was also used in other microbiome studies (e.g. [
<xref rid="pone.0160169.ref029" ref-type="bibr">29</xref>
,
<xref rid="pone.0160169.ref030" ref-type="bibr">30</xref>
]).</p>
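<p>The two prefiltering steps above can be sketched in a few lines of Python. This is a minimal illustration rather than the QIIME or
<monospace>mixMC</monospace>
implementation: the function name
<monospace>prefilter</monospace>
and its arguments are hypothetical, and it assumes that ‘proportional counts across all samples’ means the share of one OTU in the grand total of all retained counts.</p>
<preformat><![CDATA[
```python
def prefilter(counts, min_sample_total=10, min_otu_percent=0.01):
    """Hypothetical helper. counts: list of samples, each a list of raw OTU counts.

    Returns (kept_sample_indices, kept_otu_indices, filtered_counts).
    """
    # Step 1: drop samples whose total OTU count is below the minimum.
    kept_samples = [i for i, row in enumerate(counts)
                    if sum(row) >= min_sample_total]
    rows = [counts[i] for i in kept_samples]

    # Step 2: drop OTUs whose share of all retained counts is below the
    # threshold (expressed in percent; assumed reading of the text).
    grand_total = sum(sum(row) for row in rows)
    n_otus = len(counts[0]) if counts else 0
    kept_otus = []
    for j in range(n_otus):
        otu_total = sum(row[j] for row in rows)
        if grand_total > 0 and 100.0 * otu_total / grand_total >= min_otu_percent:
            kept_otus.append(j)

    filtered = [[row[j] for j in kept_otus] for row in rows]
    return kept_samples, kept_otus, filtered
```
]]></preformat>
<p>Filtering samples first and OTUs second matters here, because the OTU proportions are computed only over the samples that survive the first step.</p>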
</sec>
<sec id="sec005">
<title>Normalisation</title>
<p>Normalisation must address the issues of sparse counts and differences in sequencing depth and needs to be carefully chosen as this step can strongly affect the downstream statistical results [
<xref rid="pone.0160169.ref009" ref-type="bibr">9</xref>
]. So far, two types of normalisations have been proposed for microbiome studies.</p>
<p>The commonly used Total Sum Scaling normalisation (TSS) divides each OTU count by the total number of counts in each individual sample to account for uneven sequencing depths across samples. However, since TSS reflects relative information (i.e. proportions), the resulting normalised data reside in a simplex rather than a Euclidean space, which may lead to spurious discoveries if standard statistical methods are applied [
<xref rid="pone.0160169.ref010" ref-type="bibr">10</xref>
]. One solution is to project TSS data into Euclidean space using log ratio transformations. The Centered Log Ratio transformation (CLR) has recently been applied in several compositional data studies [
<xref rid="pone.0160169.ref014" ref-type="bibr">14</xref>
–
<xref rid="pone.0160169.ref016" ref-type="bibr">16</xref>
]. Let
<bold>
<italic>x</italic>
</bold>
= (
<italic>x</italic>
<sub>1</sub>
, ⋯,
<italic>x</italic>
<sub>p</sub>
)′ denote a composition on the
<italic>p</italic>
TSS normalised OTU counts, then the CLR transformation is defined as
<disp-formula id="pone.0160169.e001">
<alternatives>
<graphic xlink:href="pone.0160169.e001.jpg" id="pone.0160169.e001g" mimetype="image" position="anchor" orientation="portrait"></graphic>
<mml:math id="M1">
<mml:mtable displaystyle="true">
<mml:mtr>
<mml:mtd columnalign="right">
<mml:mrow>
<mml:mi mathvariant="bold-italic">y</mml:mi>
<mml:mo>=</mml:mo>
<mml:msup>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:msub>
<mml:mi>y</mml:mi>
<mml:mn>1</mml:mn>
</mml:msub>
<mml:mo>,</mml:mo>
<mml:mo>⋯</mml:mo>
<mml:mo>,</mml:mo>
<mml:msub>
<mml:mi>y</mml:mi>
<mml:mi>p</mml:mi>
</mml:msub>
<mml:mo>)</mml:mo>
</mml:mrow>
<mml:mo>′</mml:mo>
</mml:msup>
<mml:mo>=</mml:mo>
<mml:msup>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mi>log</mml:mi>
<mml:mfrac>
<mml:msub>
<mml:mi>x</mml:mi>
<mml:mn>1</mml:mn>
</mml:msub>
<mml:mroot>
<mml:mrow>
<mml:msubsup>
<mml:mo>∏</mml:mo>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>=</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mi>p</mml:mi>
</mml:msubsup>
<mml:msub>
<mml:mi>x</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
</mml:mrow>
<mml:mi>p</mml:mi>
</mml:mroot>
</mml:mfrac>
<mml:mo>,</mml:mo>
<mml:mo>⋯</mml:mo>
<mml:mo>,</mml:mo>
<mml:mi>log</mml:mi>
<mml:mfrac>
<mml:msub>
<mml:mi>x</mml:mi>
<mml:mi>p</mml:mi>
</mml:msub>
<mml:mroot>
<mml:mrow>
<mml:msubsup>
<mml:mo>∏</mml:mo>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>=</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mi>p</mml:mi>
</mml:msubsup>
<mml:msub>
<mml:mi>x</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
</mml:mrow>
<mml:mi>p</mml:mi>
</mml:mroot>
</mml:mfrac>
<mml:mo>)</mml:mo>
</mml:mrow>
<mml:mo>′</mml:mo>
</mml:msup>
<mml:mo>.</mml:mo>
</mml:mrow>
</mml:mtd>
</mml:mtr>
</mml:mtable>
</mml:math>
</alternatives>
</disp-formula>
</p>
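<p>As an illustration of the definition above, the following Python sketch applies TSS and then CLR to the counts of one sample. The function names are hypothetical, and the offset added before scaling is an assumption made here so that zero counts do not produce an undefined logarithm; the text above does not specify how zeros are handled.</p>
<preformat><![CDATA[
```python
import math


def tss(counts, offset=1.0):
    """Total Sum Scaling: convert the counts of one sample to proportions.

    The offset is an assumption of this sketch: it is added to every count
    so that no proportion is exactly zero before taking logarithms.
    """
    shifted = [c + offset for c in counts]
    total = sum(shifted)
    return [c / total for c in shifted]


def clr(proportions):
    """Centered log ratio: log of each part divided by the geometric mean."""
    logs = [math.log(x) for x in proportions]
    mean_log = sum(logs) / len(logs)  # log of the geometric mean
    return [value - mean_log for value in logs]


# Example: one sample with three OTUs, one of them unobserved.
y = clr(tss([10, 0, 90]))
```
]]></preformat>
<p>By construction the CLR values of a sample sum to zero, and the result is unchanged if all parts are multiplied by the same constant, which is why the transformation removes the dependence on sequencing depth.</p>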
<p>Alternatively, the Cumulative Sum Scaling normalisation (CSS, [
<xref rid="pone.0160169.ref009" ref-type="bibr">9</xref>
]) was developed to prevent TSS bias in differential abundance analysis with sparse counts. CSS can be considered an extension of the quantile normalisation approach: raw counts in each sample are scaled by the cumulative sum of counts up to a percentile determined using a data-driven approach, so that the scaling relies on counts that are relatively invariant across samples. CSS therefore partially accounts for compositional data. We applied CSS on the log transformed counts using the
<monospace>metagenomeSeq</monospace>
package [
<xref rid="pone.0160169.ref031" ref-type="bibr">31</xref>
].</p>
</sec>
</sec>
<sec id="sec006">
<title>Methods</title>
<p>The main objective of our study is to extend and apply multivariate statistical analysis methods for microbiome compositional data. The
<monospace>mixMC</monospace>
framework (
<xref ref-type="fig" rid="pone.0160169.g001">Fig 1</xref>
) includes unsupervised analyses to visualise diversity patterns with Principal Component Analysis (PCA) and supervised analyses to identify indicator species or determinant microbiota members characterising differences between habitats or body sites (sparse Partial Least Square Discriminant Analysis, sPLS-DA). In addition, our framework addresses a commonly encountered experimental design in microbiome studies called
<italic>repeated-measures design</italic>
, where microbial sampling is performed on the same individuals but in different body sites to detect differences between habitats. This design poses the analytical challenge of discerning subtle differences
<italic>between</italic>
body sites from the large variation between individuals
<italic>within</italic>
each body site.</p>
<sec id="sec007">
<title>Unsupervised multivariate analysis</title>
<p>PCA variants, such as Principal Coordinate Analysis (PCoA, [
<xref rid="pone.0160169.ref017" ref-type="bibr">17</xref>
]) allow for dimension reduction of the data and visualisation of diversity patterns in microbiome studies. PCoA is commonly applied to non-Euclidean sample-wise dissimilarity matrices (e.g. Bray-Curtis [
<xref rid="pone.0160169.ref018" ref-type="bibr">18</xref>
]) or phylogenetic distances between sets of taxa in a phylogenetic tree (weighted or unweighted Unifrac distance, [
<xref rid="pone.0160169.ref019" ref-type="bibr">19</xref>
,
<xref rid="pone.0160169.ref020" ref-type="bibr">20</xref>
]). Alternatively, and to avoid spurious results arising from compositional data, PCA can be applied on log ratio compositional data using either CLR transformation, or Isometric Log Ratio transformation (ILR, [
<xref rid="pone.0160169.ref032" ref-type="bibr">32</xref>
], described in
<xref ref-type="supplementary-material" rid="pone.0160169.s001">S1 Text</xref>
). In
<monospace>mixMC</monospace>
we applied PCA on ILR transformed data using customised R scripts from the
<monospace>robCompositions</monospace>
package [
<xref rid="pone.0160169.ref033" ref-type="bibr">33</xref>
].</p>
</sec>
<sec id="sec008">
<title>Multilevel variance decomposition</title>
<p>One way to account for repeated measurements designs is to separate body site variation (termed ‘
<italic>within variation</italic>
’) from individual variation (termed ‘
<italic>between subject variation</italic>
’) via variance decomposition. In univariate analyses, this step corresponds to repeated measures ANOVA (also called within-subjects ANOVA). In multivariate analysis we refer to it as the ‘multilevel approach’ [
<xref rid="pone.0160169.ref034" ref-type="bibr">34</xref>
]. The within subject variation is obtained by calculating the net differences between repeated observations (i.e. between each body site within each individual). Since the within subject variation assesses the difference in the body sites within each subject and disregards the possibly large individual variation, the within variation can then be used as input data in the subsequent multivariate statistical analysis [
<xref rid="pone.0160169.ref035" ref-type="bibr">35</xref>
]. In
<monospace>mixMC</monospace>
, the multilevel variance decomposition is applied on the log ratio transformed data described above, prior to the multivariate analyses (
<xref ref-type="fig" rid="pone.0160169.g001">Fig 1A</xref>
). Note that the variance decomposition in the multilevel approach does not take into account the correlation structure or order between measurements and is not appropriate for a time course experiment where the objective is to examine the effect of time in a study (see for example applications of linear mixed model splines for those specific cases [
<xref rid="pone.0160169.ref031" ref-type="bibr">31</xref>
,
<xref rid="pone.0160169.ref036" ref-type="bibr">36</xref>
]).</p>
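<p>For a single repeated factor such as body site, one common way to obtain the within-subject variation is to subtract from each sample the mean profile of the subject it belongs to. The Python sketch below illustrates that idea under this assumption; the function name is hypothetical and the code is not taken from the
<monospace>mixOmics</monospace>
package.</p>
<preformat><![CDATA[
```python
def within_variation(data, subjects):
    """Hypothetical helper for a one-factor repeated-measures design.

    data: list of samples, each a list of log ratio transformed values.
    subjects: subject identifier for each sample, in the same order.
    Returns each sample minus the mean profile of its own subject.
    """
    sums, sizes = {}, {}
    for row, subject in zip(data, subjects):
        acc = sums.setdefault(subject, [0.0] * len(row))
        for j, value in enumerate(row):
            acc[j] += value
        sizes[subject] = sizes.get(subject, 0) + 1

    return [
        [value - sums[subject][j] / sizes[subject] for j, value in enumerate(row)]
        for row, subject in zip(data, subjects)
    ]


# Two subjects (A, B), each sampled at two body sites, two OTUs.
data = [[1.0, 4.0], [3.0, 8.0], [10.0, 0.0], [20.0, 2.0]]
within = within_variation(data, ["A", "A", "B", "B"])
```
]]></preformat>
<p>In this toy example subject B has much larger values than subject A for the first OTU, yet after centring both subjects show the same direction of change between their two samples, which is the signal the subsequent multivariate analysis works on.</p>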
</sec>
<sec id="sec009">
<title>Supervised multivariate analysis</title>
<p>The multivariate approach sparse Partial Least Squares Discriminant Analysis (sPLS-DA, [
<xref rid="pone.0160169.ref037" ref-type="bibr">37</xref>
]) is an extension of the PLS algorithm from Wold
<italic>et al</italic>
. [
<xref rid="pone.0160169.ref038" ref-type="bibr">38</xref>
] to perform feature selection with multilevel decomposition [
<xref rid="pone.0160169.ref035" ref-type="bibr">35</xref>
]. In
<monospace>mixMC</monospace>
we further extended the multilevel sPLS-DA for microbiome data using either CSS normalised data, or TSS+CLR data.</p>
<p>
<italic>Principle of PLS-DA</italic>
. PLS-Discriminant Analysis is a multivariate regression model which maximises the covariance between linear combinations of the OTU counts and the outcome (a dummy matrix indicating the body site of each sample). Covariance maximisation is achieved in a sequential manner via the use of latent component scores. Each component is a linear combination of OTU counts and characterises a particular source of co-variation between the OTU and the body sites. As a consequence, the final number of components summarising most of the information from the data must be specified. The sparse version of PLS-DA, sPLS-DA uses Lasso penalisations [
<xref rid="pone.0160169.ref039" ref-type="bibr">39</xref>
] to select the most discriminative features in the PLS-DA model. The penalisation is applied componentwise and the resulting selected features reflect the particular source of covariance in the data highlighted by each PLS component.</p>
<p>
<italic>Parameters and performance evaluation</italic>
. The number of features to select per component must be specified in sPLS-DA and is usually optimised using cross-validation. In this study we used 10-fold cross-validation repeated 100 times. For each candidate number of features selected by sPLS-DA, the classification error rate resulting from the cross-validation process was recorded, and the lowest error rate indicated the optimal number of features to select on each component. This procedure concurrently indicated the optimal number of components for the sPLS-DA model. Once those parameters were chosen, the final sPLS-DA model was run on the entire data set to obtain the final list of discriminative OTUs for each component.</p>
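<p>The tuning loop described above can be sketched as follows. The code only generates the repeated cross-validation splits and picks the candidate with the lowest mean error; fitting the sPLS-DA model on each split is left out. All names are hypothetical, the folds are not stratified by body site, and the error rates in the example are made-up numbers used purely to show the selection rule.</p>
<preformat><![CDATA[
```python
import random


def repeated_kfold(n_samples, n_folds=10, n_repeats=100, seed=0):
    """Yield (repeat, fold, train_indices, test_indices) for repeated k-fold CV."""
    rng = random.Random(seed)
    for repeat in range(n_repeats):
        order = list(range(n_samples))
        rng.shuffle(order)  # a fresh random partition for every repeat
        for fold in range(n_folds):
            test = sorted(order[fold::n_folds])
            held_out = set(test)
            train = [i for i in range(n_samples) if i not in held_out]
            yield repeat, fold, train, test


def best_setting(error_rates):
    """error_rates maps a candidate number of features to a list of
    cross-validated error rates; return the candidate with the lowest mean."""
    return min(error_rates, key=lambda k: sum(error_rates[k]) / len(error_rates[k]))


# Illustrative error rates only (not results from the study).
choice = best_setting({5: [0.25, 0.5], 10: [0.125, 0.125], 20: [0.25, 0.25]})
```
]]></preformat>
<p>Repeating the random partition many times reduces the dependence of the estimated error rate on any single split, which matters when the number of samples is small.</p>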
<p>
<italic>Graphical and numerical outputs</italic>
. We further characterised each selected OTU by calculating its median normalised count in each body site. An OTU was defined as ‘contributing to a body site’ if the median count in that specific body site was higher than in any other body site. We graphically represented the contribution of each selected OTU with a barplot in which the length of each OTU bar corresponds to the importance of the feature in the multivariate model (i.e. the multivariate regression coefficient, with either a positive or negative sign, for that particular feature on each component). Bars are ranked by decreasing importance starting from the bottom, with colours matching the contributing body site. The contribution plot can display the bacterial taxonomy at any specified level; here we chose the family level. We also used circular representations of taxonomic trees using the GraPhlAn software tool [
<xref rid="pone.0160169.ref040" ref-type="bibr">40</xref>
] to complement the contribution plot with taxonomy information. In this plot the background colour indicates the body sites where the OTU is most abundant, the node size represents the median OTU count in that body site and the node colour indicates a negative (black) or positive (yellow) weight from the sPLS-DA regression coefficient. Other insightful outputs include sample representation where each individual is projected onto the sPLS-DA components, the list of OTU features selected on each component, the cross-validation error rate per component and the number of features contributing to each body site for each component.</p>
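<p>The ‘contributing body site’ rule above can be illustrated with a short sketch. The function name and the example counts are made up for illustration; returning no site when medians tie is an assumption of this sketch, not a documented behaviour of the authors' code.</p>

```python
from statistics import median


def contributing_site(counts_by_site):
    """Return the body site with the strictly highest median count, or None on a tie."""
    medians = {site: median(values) for site, values in counts_by_site.items()}
    top = max(medians.values())
    winners = [site for site, value in medians.items() if value == top]
    return winners[0] if len(winners) == 1 else None


# Made-up normalised counts for a single OTU across three body sites.
otu_counts = {
    "Stool": [0.0, 0.1, 0.0],
    "Subgingival plaque": [2.3, 1.9, 2.8],
    "Antecubital fossa": [0.4, 0.0, 0.6],
}
print(contributing_site(otu_counts))  # prints: Subgingival plaque
```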
<p>The multilevel sPLS-DA framework is implemented in the R package
<monospace>mixOmics</monospace>
[
<xref rid="pone.0160169.ref041" ref-type="bibr">41</xref>
] using multilevel decomposition [
<xref rid="pone.0160169.ref035" ref-type="bibr">35</xref>
]. The cladogram was generated using the GraPhlAn Python code [
<xref rid="pone.0160169.ref040" ref-type="bibr">40</xref>
]. R code and tutorials are available on our website
<ext-link ext-link-type="uri" xlink:href="http://www.mixOmics.org/mixMC">www.mixOmics.org/mixMC</ext-link>
.</p>
</sec>
<sec id="sec010">
<title>Univariate analysis</title>
<p>Unlike multivariate methods, univariate methods test each OTU for differential abundance between body sites. P-values obtained were adjusted for multiple testing using the False Discovery Rate (FDR, [
<xref rid="pone.0160169.ref042" ref-type="bibr">42</xref>
]) at the 5% significance level. We considered two univariate approaches able to analyse repeated-measures experiments (
<xref ref-type="fig" rid="pone.0160169.g001">Fig 1B</xref>
).</p>
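<p>As an illustration of the multiple-testing adjustment, the sketch below computes FDR-adjusted p-values, assuming the Benjamini-Hochberg step-up procedure. The p-values are made up and the function name is hypothetical.</p>

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Made-up p-values for five OTUs; keep those with adjusted p-value at most 5%.
p = [0.001, 0.008, 0.039, 0.041, 0.60]
q = benjamini_hochberg(p)
significant = [i for i, value in enumerate(q) if 0.05 >= value]
```

<p>With these toy values only the first two OTUs remain significant after adjustment, although four raw p-values are below 0.05.</p>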
<p>DESeq2 was developed for RNA sequencing read count data where the mean and variance of the negative binomial distribution are estimated for each feature [
<xref rid="pone.0160169.ref043" ref-type="bibr">43</xref>
]. OTU counts are normalised internally to the method with respect to a library size factor estimation; however, this normalisation does not address the issue of compositional data. For microbiome data analysis DESeq2 has served as a basis of comparison to novel methodological developments [
<xref rid="pone.0160169.ref009" ref-type="bibr">9</xref>
,
<xref rid="pone.0160169.ref015" ref-type="bibr">15</xref>
,
<xref rid="pone.0160169.ref044" ref-type="bibr">44</xref>
]. We used mean dispersion estimates models as implemented in the R package
<monospace>DESeq2</monospace>
[
<xref rid="pone.0160169.ref045" ref-type="bibr">45</xref>
].</p>
<p>ZIG [
<xref rid="pone.0160169.ref009" ref-type="bibr">9</xref>
] is a mixture model with a Zero-Inflated Gaussian distribution to account for the varying depths of coverage that are typical of microbial community under-sampling. In the ZIG model, OTU counts are first CSS normalised and then log transformed (R package
<monospace>metagenomeSeq</monospace>
[
<xref rid="pone.0160169.ref031" ref-type="bibr">31</xref>
]).</p>
</sec>
</sec>
<sec id="sec011">
<title>Case studies</title>
<sec id="sec012">
<title>HMP case studies</title>
<p>We analysed subsets of the NIH HMP 16S data downloaded from
<ext-link ext-link-type="uri" xlink:href="http://hmpdacc.org/HMQCP/all/">http://hmpdacc.org/HMQCP/all/</ext-link>
for the V1–3 variable region. The original data contained 43 146 OTU counts for 2 911 samples measured from 18 different body sites. We focused on the first visit of each healthy individual and further divided the data into two data subsets. For both data sets a preliminary exploratory PCoA confirmed that there was no confounding covariate effect due to run center or gender (see
<xref ref-type="supplementary-material" rid="pone.0160169.s007">S1 Fig</xref>
).</p>
<p>
<italic>Most diverse body sites dataset</italic>
. Understanding microbial community diversity across body habitats is fundamental to the study of the human microbiome. In their extensive statistical analysis of the HMP data, Li
<italic>et al</italic>
. quantified intra-sample diversity using the Shannon index. Based on their results we chose the three most diverse habitats according to all genera-based and OTU-based taxonomic units [
<xref rid="pone.0160169.ref046" ref-type="bibr">46</xref>
], namely Subgingival plaque (Oral), Antecubital fossa (Skin) and Stool sampled from 54 unique healthy individuals for a total of 162 samples. The prefiltered dataset included 1 674 OTU counts (
<xref ref-type="supplementary-material" rid="pone.0160169.s002">S1 Table</xref>
).</p>
<p>
<italic>Oral body sites dataset</italic>
. While many published analyses have focused on the main microbial habitats (gut, oral cavity, skin and vagina from the [
<xref rid="pone.0160169.ref024" ref-type="bibr">24</xref>
,
<xref rid="pone.0160169.ref047" ref-type="bibr">47</xref>
]), little has been done to comprehensively characterise multiple sites within a single habitat. In this data set we solely considered samples from the oral cavity, which has been found to be as diverse as the stool microbiome [
<xref rid="pone.0160169.ref046" ref-type="bibr">46</xref>
]. The nine oral sites were Attached Keratinising Gingiva, Buccal Mucosa, Hard Palate, Palatine Tonsils, Saliva, Subgingival Plaque, Supragingival Plaque, Throat and Tongue Dorsum. After prefiltering, the data included 1 562 OTU for 73 unique healthy individuals and a total of 657 samples (
<xref ref-type="supplementary-material" rid="pone.0160169.s002">S1 Table</xref>
).</p>
</sec>
<sec id="sec013">
<title>Koren dataset</title>
<p>Koren and colleagues examined the link between oral, gut and plaque microbial communities in patients with atherosclerosis and controls [
<xref rid="pone.0160169.ref025" ref-type="bibr">25</xref>
]. We compared our results on the HMP most diverse dataset to the healthy individuals from this dataset. This study contained partially repeated measures from multiple sites, including 15 unique patients sampled from saliva and stool, and 13 unique patients sampled only from arterial plaque. The data were downloaded from the QIITA database (
<ext-link ext-link-type="uri" xlink:href="http://qiita.microbio.me/study/description/349">http://qiita.microbio.me/study/description/349</ext-link>
) and included 5 138 OTU. After prefiltering, the data included 973 OTU for 43 samples.</p>
</sec>
</sec>
</sec>
<sec sec-type="results" id="sec014">
<title>Results</title>
<sec id="sec015">
<title>Unsupervised analyses on Most Diverse body sites dataset</title>
<p>We applied unsupervised analyses (PCoA, or PCA on ILR-transformed data) to visualise diversity patterns between microbial communities, then compared different types of normalisations (TSS-ILR, CSS) followed by a multilevel variance decomposition for repeated measures.</p>
<p>A PCoA performed on the filtered OTU raw counts (with no normalisation) showed that the unweighted Unifrac distance could highlight diversity patterns between each body site better than weighted Unifrac (
<xref ref-type="fig" rid="pone.0160169.g002">Fig 2</xref>
). As this study focuses on the most diverse body sites, the presence or absence of microbial communities is expected to drive the differences between body sites more than the relative abundance usually highlighted by weighted Unifrac. Applying PCoA on the unfiltered count data led to similar interpretation (
<xref ref-type="supplementary-material" rid="pone.0160169.s008">S2 Fig</xref>
), but we observed a lower amount of explained variance for the first and second coordinates as more ‘noisy’ OTU were present in the data (unweighted Unifrac: 11.28% and 8.95% for the unfiltered data vs. 17.37% and 14.48% for the filtered data).</p>
<fig id="pone.0160169.g002" orientation="portrait" position="float">
<object-id pub-id-type="doi">10.1371/journal.pone.0160169.g002</object-id>
<label>Fig 2</label>
<caption>
<title>Most diverse data, PCoA sample plots.</title>
<p>Sample plot on the first two coordinates with
<bold>(a)</bold>
weighted Unifrac
<bold>(b)</bold>
unweighted Unifrac calculated on the filtered OTU count table (based on 1 674 OTU).</p>
</caption>
<graphic xlink:href="pone.0160169.g002"></graphic>
</fig>
<p>We then compared the different normalisation strategies, including the multilevel variance decomposition using PCA. The normalisations TSS, TSS + ILR, CSS seemed to cluster the body sites similarly (
<xref ref-type="fig" rid="pone.0160169.g003">Fig 3(a), 3(c) and 3(e)</xref>
). The multilevel decomposition led to a smaller variability within body sites and a greater variability between body sites (
<xref ref-type="fig" rid="pone.0160169.g003">Fig 3(b), 3(d) and 3(f)</xref>
), and consequently increased the amount of variance explained. Using TSS+ILR or CSS also increased the explained variance (44.6% for the first two components with TSS+ILR, 33.5% with CSS).</p>
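<p>The idea behind the multilevel decomposition can be sketched as below, assuming a single repeated-measures factor where the within-subject part of the data is obtained by subtracting each individual's mean profile from that individual's samples. The function name and the toy values are hypothetical, and this is not the mixOmics code.</p>

```python
from collections import defaultdict


def within_subject_deviation(rows, subjects):
    """Subtract each subject's mean profile from that subject's samples."""
    n_features = len(rows[0])
    sums = defaultdict(lambda: [0.0] * n_features)
    counts = defaultdict(int)
    for row, subject in zip(rows, subjects):
        counts[subject] += 1
        for j, value in enumerate(row):
            sums[subject][j] += value
    return [
        [value - sums[subject][j] / counts[subject] for j, value in enumerate(row)]
        for row, subject in zip(rows, subjects)
    ]


# Two subjects (A, B), each sampled at two body sites, two features per sample.
rows = [[1.0, 4.0], [3.0, 0.0], [10.0, 2.0], [14.0, 6.0]]
subjects = ["A", "A", "B", "B"]
within = within_subject_deviation(rows, subjects)
```

<p>In this toy example subject B has much higher values overall, yet after the subtraction both subjects are centred on zero, so only the differences between body sites within an individual remain.</p>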
<fig id="pone.0160169.g003" orientation="portrait" position="float">
<object-id pub-id-type="doi">10.1371/journal.pone.0160169.g003</object-id>
<label>Fig 3</label>
<caption>
<title>Most diverse data, PCA sample plots.</title>
<p>
<bold>(a)</bold>
TSS and
<bold>(b)</bold>
TSS multilevel OTU log counts,
<bold>(c)</bold>
TSS-ILR and
<bold>(d)</bold>
TSS-ILR multilevel normalised log counts,
<bold>(e)</bold>
CSS and
<bold>(f)</bold>
CSS multilevel log counts.</p>
</caption>
<graphic xlink:href="pone.0160169.g003"></graphic>
</fig>
<p>This preliminary exploration indicated that the abundance of microbial communities could characterise each body site quite clearly, and that the multilevel decomposition enabled better separation of the body site clusters, in particular when applied to the TSS+ILR or CSS normalised data.</p>
</sec>
<sec id="sec016">
<title>Supervised analysis on Most Diverse body sites dataset</title>
<p>We applied multilevel sPLS-DA to identify a microbiome signature characterising each body site and compared the different normalisation strategies (TSS+CLR or CSS) in our multivariate method to DESeq2 and ZIG univariate methods.</p>
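<p>For readers unfamiliar with the TSS+CLR normalisation, the following minimal sketch applies Total Sum Scaling followed by a centred log-ratio transformation to one sample. The pseudo-count offset of 1 used to avoid taking the log of zero is an assumption of this sketch, and the function names are hypothetical.</p>

```python
import math


def tss(counts, offset=1.0):
    """Total Sum Scaling: turn one sample's counts into proportions."""
    shifted = [c + offset for c in counts]  # offset avoids log(0) later on
    total = sum(shifted)
    return [value / total for value in shifted]


def clr(proportions):
    """Centred log-ratio: log of each part minus the mean of the logs."""
    logs = [math.log(p) for p in proportions]
    centre = sum(logs) / len(logs)
    return [value - centre for value in logs]


# Made-up OTU counts for one sample.
sample = [120, 0, 30, 850]
transformed = clr(tss(sample))
```

<p>The CLR values of a sample sum to zero, and multiplying all proportions by a constant leaves them unchanged, which is why the transformation is suited to compositional data.</p>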
<sec id="sec017">
<title>Impact of normalisation to identify discriminative features with sPLS-DA</title>
<p>The sPLS-DA classification performance was similar for TSS+CLR and CSS normalised data. The lowest classification error rate was obtained for two components (0.7% for TSS+CLR and 0.3% for CSS,
<xref ref-type="supplementary-material" rid="pone.0160169.s005">S4 Table</xref>
). Both normalisations consistently misclassified antecubital fossa on the first component but correctly classified the two other body sites, and the addition of the second component enabled a better classification of all body sites (
<xref ref-type="fig" rid="pone.0160169.g004">Fig 4</xref>
). The number of OTUs selected with sPLS-DA was 160 with TSS+CLR and 130 with CSS. We next assessed the contribution of the OTUs selected on each component (
<xref ref-type="supplementary-material" rid="pone.0160169.s006">S5 Table</xref>
). We found that both normalisations identified similar bacterial families. Component 1 characterised the subgingival plaque with
<italic>Micrococcaceae, Neisseriaceae, Streptococcaceae, Flavobacteriaceae</italic>
and
<italic>Campylobacteraceae</italic>
. CSS also identified the
<italic>Burkholderiaceae</italic>
family. Component 2 characterised stool and antecubital fossa. For antecubital fossa, TSS+CLR identified
<italic>Propionibacteriaceae, Staphylococcaceae and Corynebacteriaceae</italic>
while CSS also identified
<italic>Propionibacteriaceae, Staphylococcaceae</italic>
but failed to identify
<italic>Corynebacteriaceae</italic>
. Bacterial families characterising stool included
<italic>Bacteroides, Ruminococcaceae, Lachnospiraceae, Rikenellaceae</italic>
and
<italic>Porphyromonadaceae</italic>
. Across the three body sites, we found that both normalisations led to very similar families of bacteria: 5 families for component 1 and 10 (TSS+CLR) or 8 (CSS) for component 2, with a difference of 1 or 2 families on each component between TSS+CLR and CSS (see
<xref ref-type="supplementary-material" rid="pone.0160169.s006">S5 Table</xref>
). Interestingly, we observed that increasing the number of selected OTU did not add more relevant bacterial families. Rather, it was the proportion of OTU belonging to each family that varied (
<xref ref-type="fig" rid="pone.0160169.g004">Fig 4(d)</xref>
).</p>
<fig id="pone.0160169.g004" orientation="portrait" position="float">
<object-id pub-id-type="doi">10.1371/journal.pone.0160169.g004</object-id>
<label>Fig 4</label>
<caption>
<title>Most diverse TSS+CLR data, sPLS-DA sample, contribution and cladogram plots.</title>
<p>
<bold>(a)</bold>
sample plot on the first two components with 95% confidence level ellipse plots,
<bold>(b)</bold>
and
<bold>(c)</bold>
represent the contribution of each OTU feature selected on the first (10 OTUs) and second component (120 OTUs), with OTU contribution ranked from bottom (important) to top. Colours indicate body site in which the OTU is most abundant.
<bold>(d)</bold>
Cladogram generated from the sPLS-DA result using GraPhlAn.</p>
</caption>
<graphic xlink:href="pone.0160169.g004"></graphic>
</fig>
</sec>
<sec id="sec018">
<title>Comparisons with no multilevel approach</title>
<p>To understand the impact and benefits of the proposed multilevel approach, we examined the OTU selected by sPLS-DA on either the TSS+CLR or CSS normalised counts without multilevel transformation. The classification error rate was substantially greater than with the previous multilevel analysis (6% for TSS+CLR and 3% for CSS for two components), with a larger number of OTU selected (400 OTU selected for TSS+CLR and 240 for CSS). With the TSS+CLR normalisation, we identified similar families characterising subgingival plaque on the first component, including
<italic>Burkholderiaceae, Fusobacteriaceae, Gemellaceae, Veillonellaceae</italic>
. The families selected on the second component characterised antecubital fossa similarly to the multilevel approach; however, a notable omission was the entire
<italic>Ruminococcus</italic>
family, which characterised stool in the multilevel approach. Overall, we found that the multivariate analysis ignoring the repeated-measures design tended to identify differential features driving the overall signature and disregarded subtleties between microbial communities in environments sampled on the same individuals.</p>
</sec>
<sec id="sec019">
<title>Comparison with univariate analysis</title>
<p>While the number of OTUs declared as differentially abundant was similar between DESeq2 and ZIG (
<xref ref-type="supplementary-material" rid="pone.0160169.s003">S2 Table</xref>
), we observed strong differences at both OTU and family levels (
<xref ref-type="supplementary-material" rid="pone.0160169.s009">S3 Fig</xref>
). Interestingly, the sPLS-DA selections were all included in the ZIG and DESeq2 selections. DESeq2 identified relevant features that were common to sPLS-DA selections, such as
<italic>Propionibacteriaceae, Staphylococcaceae</italic>
and
<italic>Corynebacteriaceae</italic>
with the addition of
<italic>Burkholderiaceae</italic>
as a defining feature characterising Antecubital fossa. It also characterised the Subgingival plaque microbial community with OTUs from
<italic>Streptococcaceae, Neisseriaceae, Gemellaceae</italic>
and
<italic>Micrococcaceae</italic>
families, also identified in sPLS-DA. However, DESeq2 was poor at characterising Stool. Indeed, very few bacterial families, including
<italic>Bacteroides</italic>
and
<italic>Lachnospiraceae</italic>
were identified. Such low bacterial diversity was consistent neither with the sPLS-DA results nor with the literature. Similar to DESeq2 and sPLS-DA, ZIG identified features of the Antecubital fossa with OTU belonging to
<italic>Propionibacteriaceae, Staphylococcaceae, Burkholderiaceae</italic>
and
<italic>Corynebacteriaceae</italic>
. Like DESeq2, ZIG described the Subgingival plaque microbiome with OTU belonging to
<italic>Streptococcaceae, Neisseriaceae, Micrococcaceae</italic>
and
<italic>Gemellaceae</italic>
. However, ZIG also identified OTUs belonging to
<italic>Fusobacteriaceae, Burkholderiaceae, Flavobacteriaceae, Campylobacteraceae, Veillonellaceae</italic>
and
<italic>Actinomycetaceae</italic>
. In contrast to DESeq2, ZIG identified and described the Stool microbiome well, with OTU belonging to the families of
<italic>Bacteroides, Porphyromonadaceae, Rikenellaceae, Lachnospiraceae</italic>
and
<italic>Ruminococcaceae</italic>
. One reason to explain the differences between the two univariate methods might be that DESeq2 does not adequately model sparse counts.</p>
</sec>
</sec>
<sec id="sec020">
<title>Analysis of the oral body site dataset with
<monospace>mixMC</monospace>
</title>
<p>Similar to the Most Diverse data set, unsupervised data analyses showed that unweighted Unifrac better discriminated the different body sites (plaque, gingiva) compared to weighted Unifrac in the PCoA sample plots (
<xref ref-type="supplementary-material" rid="pone.0160169.s010">S4(a) and S4(b) Fig</xref>
). TSS+ILR explained greater variance (21.35% on the first component) than CSS (13.63%), with better-separated body site clusters (
<xref ref-type="supplementary-material" rid="pone.0160169.s010">S4(c) and S4(e) Fig</xref>
). The explained variance further increased with a multilevel variance decomposition (25.37% vs. 18.22%,
<xref ref-type="supplementary-material" rid="pone.0160169.s010">S4(d) and S4(f) Fig</xref>
).</p>
<sec id="sec021">
<title>sPLS-DA performance and choice of parameters</title>
<p>We observed similar classification performances between sPLS-DA on either TSS+CLR or CSS, with a slightly lower classification error rate for TSS+CLR (
<xref ref-type="supplementary-material" rid="pone.0160169.s011">S5 Fig</xref>
,
<xref ref-type="table" rid="pone.0160169.t001">Table 1</xref>
). The final sPLS-DA model included 8 components that led to optimal performance, with a classification error rate that substantially decreased from 78% (component 1) to 26% for TSS+CLR and 30% for CSS (component 8). The classification error rate remained relatively high as similar body sites were consistently misclassified across components, as described in
<xref ref-type="table" rid="pone.0160169.t001">Table 1</xref>
. For example, Tonsils had the highest classification error rate as no OTU was able to characterise this particular body site (
<xref ref-type="table" rid="pone.0160169.t001">Table 1</xref>
). We observed that the TSS+CLR normalisation was better at characterising tonsil and plaque (component 1), buccal mucosa (component 2) and gingiva (component 3) than the CSS normalisation. The CSS normalisation also led to a substantial number of ties (equal median counts) when assessing the body site contribution of the selected OTU (not shown). Therefore, the detailed analysis that follows solely focuses on a multilevel sPLS-DA model with TSS+CLR normalisation.</p>
<table-wrap id="pone.0160169.t001" orientation="portrait" position="float">
<object-id pub-id-type="doi">10.1371/journal.pone.0160169.t001</object-id>
<label>Table 1</label>
<caption>
<title>Oral data.</title>
<p>Top: Number of selected features at the OTU (family) level and mean classification error rate per component. Bottom: Number of features at the OTU (family) level contributing to each body site for each sPLS-DA component. Note that we may observe some overlap between families across the different body sites.</p>
</caption>
<alternatives>
<graphic id="pone.0160169.t001g" xlink:href="pone.0160169.t001"></graphic>
<table frame="box" rules="all" border="0">
<colgroup span="1">
<col align="left" valign="middle" span="1"></col>
<col align="left" valign="middle" span="1"></col>
<col align="left" valign="middle" span="1"></col>
<col align="left" valign="middle" span="1"></col>
<col align="left" valign="middle" span="1"></col>
<col align="left" valign="middle" span="1"></col>
<col align="left" valign="middle" span="1"></col>
<col align="left" valign="middle" span="1"></col>
<col align="left" valign="middle" span="1"></col>
</colgroup>
<thead>
<tr>
<th align="right" rowspan="1" colspan="1"></th>
<th align="right" rowspan="1" colspan="1">Comp 1</th>
<th align="right" rowspan="1" colspan="1">Comp 2</th>
<th align="right" rowspan="1" colspan="1">Comp 3</th>
<th align="right" rowspan="1" colspan="1">Comp 4</th>
<th align="right" rowspan="1" colspan="1">Comp 5</th>
<th align="right" rowspan="1" colspan="1">Comp 6</th>
<th align="right" rowspan="1" colspan="1">Comp 7</th>
<th align="right" rowspan="1" colspan="1">Comp 8</th>
</tr>
</thead>
<tbody>
<tr>
<td align="right" rowspan="1" colspan="1"># features selected</td>
<td align="right" rowspan="1" colspan="1">60 (13)</td>
<td align="right" rowspan="1" colspan="1">40 (2)</td>
<td align="right" rowspan="1" colspan="1">190 (18)</td>
<td align="right" rowspan="1" colspan="1">200 (14)</td>
<td align="right" rowspan="1" colspan="1">40 (8)</td>
<td align="right" rowspan="1" colspan="1">200 (26)</td>
<td align="right" rowspan="1" colspan="1">180 (23)</td>
<td align="right" rowspan="1" colspan="1">190 (22)</td>
</tr>
<tr>
<td align="right" rowspan="1" colspan="1">mean classification error rate</td>
<td align="right" rowspan="1" colspan="1">0.778</td>
<td align="right" rowspan="1" colspan="1">0.584</td>
<td align="right" rowspan="1" colspan="1">0.501</td>
<td align="right" rowspan="1" colspan="1">0.410</td>
<td align="right" rowspan="1" colspan="1">0.336</td>
<td align="right" rowspan="1" colspan="1">0.316</td>
<td align="right" rowspan="1" colspan="1">0.279</td>
<td align="right" rowspan="1" colspan="1">0.262</td>
</tr>
<tr>
<td align="right" style="border-bottom:thick" rowspan="1" colspan="1">sd classification error rate</td>
<td align="right" style="border-bottom:thick" rowspan="1" colspan="1">0.000</td>
<td align="right" style="border-bottom:thick" rowspan="1" colspan="1">0.002</td>
<td align="right" style="border-bottom:thick" rowspan="1" colspan="1">0.003</td>
<td align="right" style="border-bottom:thick" rowspan="1" colspan="1">0.003</td>
<td align="right" style="border-bottom:thick" rowspan="1" colspan="1">0.003</td>
<td align="right" style="border-bottom:thick" rowspan="1" colspan="1">0.005</td>
<td align="right" style="border-bottom:thick" rowspan="1" colspan="1">0.004</td>
<td align="right" style="border-bottom:thick" rowspan="1" colspan="1">0.004</td>
</tr>
<tr>
<td align="right" rowspan="1" colspan="1">Attached Keratinized gingiva</td>
<td align="right" rowspan="1" colspan="1">0</td>
<td align="right" rowspan="1" colspan="1">35 (2)</td>
<td align="right" rowspan="1" colspan="1">123 (12)</td>
<td align="right" rowspan="1" colspan="1">9 (6)</td>
<td align="right" rowspan="1" colspan="1">1 (1)</td>
<td align="right" rowspan="1" colspan="1">73 (16)</td>
<td align="right" rowspan="1" colspan="1">34 (11)</td>
<td align="right" rowspan="1" colspan="1">47 (15)</td>
</tr>
<tr>
<td align="right" rowspan="1" colspan="1">Buccal mucosa</td>
<td align="right" rowspan="1" colspan="1">0</td>
<td align="right" rowspan="1" colspan="1">5 (1)</td>
<td align="right" rowspan="1" colspan="1">4 (1)</td>
<td align="right" rowspan="1" colspan="1">1 (1)</td>
<td align="right" rowspan="1" colspan="1">0</td>
<td align="right" rowspan="1" colspan="1">31 (4)</td>
<td align="right" rowspan="1" colspan="1">3 (1)</td>
<td align="right" rowspan="1" colspan="1">3 (1)</td>
</tr>
<tr>
<td align="right" rowspan="1" colspan="1">Hard palate</td>
<td align="right" rowspan="1" colspan="1">2 (1)</td>
<td align="right" rowspan="1" colspan="1">0</td>
<td align="right" rowspan="1" colspan="1">1 (1)</td>
<td align="right" rowspan="1" colspan="1">3 (1)</td>
<td align="right" rowspan="1" colspan="1">0</td>
<td align="right" rowspan="1" colspan="1">3 (2)</td>
<td align="right" rowspan="1" colspan="1">5 (3)</td>
<td align="right" rowspan="1" colspan="1">9 (3)</td>
</tr>
<tr>
<td align="right" rowspan="1" colspan="1">Palatine Tonsils</td>
<td align="right" rowspan="1" colspan="1">1 (1)</td>
<td align="right" rowspan="1" colspan="1">0</td>
<td align="right" rowspan="1" colspan="1">0</td>
<td align="right" rowspan="1" colspan="1">5 (3)</td>
<td align="right" rowspan="1" colspan="1">0</td>
<td align="right" rowspan="1" colspan="1">2 (2)</td>
<td align="right" rowspan="1" colspan="1">4 (2)</td>
<td align="right" rowspan="1" colspan="1">6 (4)</td>
</tr>
<tr>
<td align="right" rowspan="1" colspan="1">Saliva</td>
<td align="right" rowspan="1" colspan="1">5 (3)</td>
<td align="right" rowspan="1" colspan="1">0</td>
<td align="right" rowspan="1" colspan="1">2 (2)</td>
<td align="right" rowspan="1" colspan="1">28 (5)</td>
<td align="right" rowspan="1" colspan="1">0</td>
<td align="right" rowspan="1" colspan="1">4 (2)</td>
<td align="right" rowspan="1" colspan="1">11 (5)</td>
<td align="right" rowspan="1" colspan="1">7 (2)</td>
</tr>
<tr>
<td align="right" rowspan="1" colspan="1">Subgingival plaque</td>
<td align="right" rowspan="1" colspan="1">0</td>
<td align="right" rowspan="1" colspan="1">0</td>
<td align="right" rowspan="1" colspan="1">7 (7)</td>
<td align="right" rowspan="1" colspan="1">15 (5)</td>
<td align="right" rowspan="1" colspan="1">39 (7)</td>
<td align="right" rowspan="1" colspan="1">14 (11)</td>
<td align="right" rowspan="1" colspan="1">6 (5)</td>
<td align="right" rowspan="1" colspan="1">21 (10)</td>
</tr>
<tr>
<td align="right" rowspan="1" colspan="1">Supragingival plaque</td>
<td align="right" rowspan="1" colspan="1">11 (4)</td>
<td align="right" rowspan="1" colspan="1">0</td>
<td align="right" rowspan="1" colspan="1">53 (8)</td>
<td align="right" rowspan="1" colspan="1">23 (6)</td>
<td align="right" rowspan="1" colspan="1">0</td>
<td align="right" rowspan="1" colspan="1">31 (9)</td>
<td align="right" rowspan="1" colspan="1">15 (8)</td>
<td align="right" rowspan="1" colspan="1">31 (6)</td>
</tr>
<tr>
<td align="right" rowspan="1" colspan="1">Throat</td>
<td align="right" rowspan="1" colspan="1">11 (5)</td>
<td align="right" rowspan="1" colspan="1">0</td>
<td align="right" rowspan="1" colspan="1">0</td>
<td align="right" rowspan="1" colspan="1">16 (4)</td>
<td align="right" rowspan="1" colspan="1">0</td>
<td align="right" rowspan="1" colspan="1">5 (3)</td>
<td align="right" rowspan="1" colspan="1">42 (5)</td>
<td align="right" rowspan="1" colspan="1">9 (4)</td>
</tr>
<tr>
<td align="right" rowspan="1" colspan="1">Tongue dorsum</td>
<td align="right" rowspan="1" colspan="1">30 (9)</td>
<td align="right" rowspan="1" colspan="1">0</td>
<td align="right" rowspan="1" colspan="1">0</td>
<td align="right" rowspan="1" colspan="1">100 (8)</td>
<td align="right" rowspan="1" colspan="1">0</td>
<td align="right" rowspan="1" colspan="1">37 (11)</td>
<td align="right" rowspan="1" colspan="1">60 (13)</td>
<td align="right" rowspan="1" colspan="1">57 (12)</td>
</tr>
</tbody>
</table>
</alternatives>
</table-wrap>
</sec>
<sec id="sec022">
<title>Body sites characterisation</title>
<p>We mainly focused on the first three sPLS-DA components for our interpretation (
<xref ref-type="fig" rid="pone.0160169.g005">Fig 5</xref>
and
<xref ref-type="supplementary-material" rid="pone.0160169.s012">S6 Fig</xref>
for the remaining 5 components). Each component seemed to characterise specific subsets of the body sites. For example, component 1 discriminated sub- and supragingival plaque from the other body sites, component 2 clustered attached keratinised gingiva and buccal mucosa, but with no clear-cut separation (
<xref ref-type="fig" rid="pone.0160169.g005">Fig 5(a)</xref>
), while component 3 seemed to separate attached keratinised gingiva from the others (
<xref ref-type="fig" rid="pone.0160169.g005">Fig 5(b)</xref>
). Similar conclusions could be drawn for the other components (
<xref ref-type="supplementary-material" rid="pone.0160169.s012">S6 Fig</xref>
). The interpretation of these sample plots can be subjective; however, they reflect the close anatomical proximity of the different sample sites in the mouth, such as the tongue coming in contact with the hard palate, teeth, saliva and gums.</p>
<fig id="pone.0160169.g005" orientation="portrait" position="float">
<object-id pub-id-type="doi">10.1371/journal.pone.0160169.g005</object-id>
<label>Fig 5</label>
<caption>
<title>Oral data, sPLS-DA sample plot for the different components.</title>
<p>
<bold>(a)</bold>
Component 1 vs. Component 2,
<bold>(b)</bold>
Component 2 vs. Component 3, using 95% confidence ellipses.</p>
</caption>
<graphic xlink:href="pone.0160169.g005"></graphic>
</fig>
</sec>
<sec id="sec023">
<title>Features contribution</title>
<p>
<xref ref-type="table" rid="pone.0160169.t001">Table 1</xref>
details the number of features contributing to each oral site per component. Those outputs combined with the interpretation from the sample plots in
<xref ref-type="fig" rid="pone.0160169.g005">Fig 5</xref>
enable better insight into the bacteria contributing to contiguous body sites. In some cases we observed similar contributions of microbial communities in neighbouring body sites; for example, Throat and Tongue appeared to be characterised by the same families of bacteria. The closeness of those selected bacteria in terms of their taxonomy can be visualised in the cladogram in
<xref ref-type="fig" rid="pone.0160169.g006">Fig 6(d)</xref>
. We examined the ability of sPLS-DA to highlight subtle differences and characterise different sites in close proximity within the oral microbiome. We reviewed the relevant families selected on the first three sPLS-DA components, which appeared to characterise particular body sites (
<xref ref-type="table" rid="pone.0160169.t001">Table 1</xref>
,
<xref ref-type="supplementary-material" rid="pone.0160169.s014">S1 File</xref>
).</p>
<fig id="pone.0160169.g006" orientation="portrait" position="float">
<object-id pub-id-type="doi">10.1371/journal.pone.0160169.g006</object-id>
<label>Fig 6</label>
<caption>
<title>Oral data, contribution and cladogram plots of the features selected for each sPLS-DA component.</title>
<p>
<bold>(a)</bold>
Component 1,
<bold>(b)</bold>
Component 2,
<bold>(c)</bold>
Component 3. In
<bold>(c)</bold>
only the top 150 OTU are represented.
<bold>(d)</bold>
Cladogram generated from the sPLS-DA results for components 1 and 2 using GraPhlAn.</p>
</caption>
<graphic xlink:href="pone.0160169.g006"></graphic>
</fig>
<p>The bacterial families selected on component 1 strongly characterised hard palate (members of the
<italic>Streptococcaceae</italic>
family), saliva (
<italic>Prevotellaceae</italic>
,
<italic>Lachnospiraceae</italic>
as well as the phylum TM7 recently described in [
<xref rid="pone.0160169.ref048" ref-type="bibr">48</xref>
] and found to be prevalent in the oral cavity), supragingival plaque as well as throat and tongue. The throat microbiome was characterised by
<italic>Prevotellaceae, Lachnospiraceae, Veillonellaceae, Streptococcaceae</italic>
and
<italic>Erysipelotrichaceae</italic>
. The tongue was found to be more diverse, with eight families of bacteria characterising the site. These include the order
<italic>Clostridiales</italic>
and the families
<italic>Coriobacteriaceae, Gemellaceae, Carnobacteriaceae, Lachnospiraceae, Prevotellaceae, Micrococcaceae, Streptococcaceae</italic>
and
<italic>Veillonellaceae</italic>
. Component 2 separated attached keratinized gingiva from buccal mucosa with the families
<italic>Gemellaceae</italic>
and
<italic>Streptococcaceae</italic>
. Component 3 discriminated multiple sites, in particular attached keratinized gingiva (
<italic>Prevotellaceae, Porphyromonadaceae, Flavobacteriaceae, Carnobacteriaceae, Streptococcaceae, Fusobacteriaceae, Campylobacteraceae, Pasteurellaceae, Neisseriaceae, Moraxellaceae</italic>
and TM7), buccal mucosa and hard palate (
<italic>Streptococcaceae</italic>
for both). Interestingly, component 3 discriminated subgingival plaque (
<italic>Burkholderiaceae, Flavobacteriaceae, Gemellaceae, Micrococcaceae, Neisseriaceae, Prevotellaceae</italic>
and
<italic>Streptococcaceae</italic>
) from supragingival plaque (
<italic>Actinomycetaceae, Burkholderiaceae, Flavobacteriaceae, Fusobacteriaceae, Micrococcaceae, Neisseriaceae</italic>
and
<italic>Streptococcaceae</italic>
) with some overlap between the families.</p>
<p>The analysis of the Oral dataset using our
<monospace>mixMC</monospace>
framework identified relevant bacteria families characterising subtle differences in the oral environment as well as deciphering particular characteristics in each body site.</p>
</sec>
</sec>
<sec id="sec024">
<title>Comparison with the Koren data set</title>
<p>To further validate the relevance of our multivariate method to discriminate and identify microbial features describing microbial communities, we applied our sPLS-DA to the study from Koren
<italic>et al</italic>
. [
<xref rid="pone.0160169.ref025" ref-type="bibr">25</xref>
]. Since the dataset only contained partially repeated measures from multiple sites (individual patients sampled for plaque were not sampled at other body sites), we applied a non-multilevel sPLS-DA to the TSS+CLR data, resulting in a selection of 30 and 100 OTU on components 1 and 2, respectively (
<xref ref-type="supplementary-material" rid="pone.0160169.s013">S7 Fig</xref>
,
<xref ref-type="supplementary-material" rid="pone.0160169.s014">S1 File</xref>
). sPLS-DA clearly discriminated the three body sites: saliva, plaque and stool. Component 1 best characterised stool, identifying families of bacteria such as
<italic>Lachnospiraceae, Ruminococcaceae and Bacteroides</italic>
, similar to what was observed in the HMP dataset. Component 2 best discriminated arterial plaque and saliva. Arterial plaque was characterised by families including
<italic>Burkholderiaceae, Propionibacteriaceae, Pseudomonadaceae and Staphylococcaceae</italic>
, which was consistent with what the authors referred to as the ‘core microbiome’ for arterial plaque samples. Our analysis also identified
<italic>Alcaligenaceae, Enterobacteriaceae, Moraxellaceae</italic>
and
<italic>Comamonadaceae</italic>
as bacterial families describing arterial plaque. Saliva was also characterised on component 2 by the same families of bacteria as those reported by Koren
<italic>et al</italic>
. and those found in our microbiome signature in the HMP data set.</p>
<p>Our comparative analysis demonstrates that sPLS-DA not only produced reliable and consistent results across different sequencing platforms and datasets but was also able to identify key members of the microbial community characterising in particular saliva, plaque and stool.</p>
</sec>
</sec>
<sec sec-type="conclusions" id="sec025">
<title>Discussion</title>
<p>Traditionally, unsupervised dimension reduction multivariate approaches for microbiome data such as PCoA use pairwise distances or dissimilarities calculated on count data to scale microbial community abundances. However, the output of such methods is limited to the visualisation of patterns in the data. Our
<monospace>mixMC</monospace>
framework did not propose such distances for various reasons. From a theoretical point of view and as discussed by Warton
<italic>et al</italic>
. [
<xref rid="pone.0160169.ref049" ref-type="bibr">49</xref>
], distance-based analyses make implicit assumptions about the mean-variance relationship in count data that may not hold, which can produce misleading results. From a practical point of view, a multivariate projection-based method applied to an
<italic>n</italic>
×
<italic>n</italic>
similarity matrix does not enable identification of bacteria driving differences between habitats. We therefore proposed to directly handle abundance data to achieve that goal.</p>
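<p>To illustrate the practical point above, the following Python sketch computes a Bray-Curtis dissimilarity matrix on an invented count table. Bray-Curtis is only one example of a pairwise dissimilarity, and the function name <monospace>bray_curtis</monospace> and the toy counts are assumptions made for illustration, not part of mixMC. The result has one row and one column per sample, so the OTU columns are no longer visible to any downstream projection.</p>
<preformat>
```python
import numpy as np

def bray_curtis(counts):
    """Pairwise Bray-Curtis dissimilarity between samples (rows)."""
    counts = np.asarray(counts, dtype=float)
    n = counts.shape[0]
    dist = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            # Sum of absolute differences over the sum of all counts.
            num = np.abs(counts[i] - counts[j]).sum()
            den = (counts[i] + counts[j]).sum()
            dist[i, j] = num / den
    return dist

# Invented table: 3 samples (rows) x 3 OTUs (columns).
counts = np.array([[10, 0, 5],
                   [0, 10, 5],
                   [5, 5, 5]])

dist = bray_curtis(counts)
# dist is n x n (samples by samples); the OTU dimension has been collapsed.
```
</preformat>
<p>Because each entry of the matrix summarises all OTUs at once, no loading or weight can be traced back to an individual OTU, which is the reason the abundance table is analysed directly.</p>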
<p>In our study, we compared two normalisation techniques for 16S OTU count data. TSS normalisation is a popular approach to accommodate varying sampling and sequencing depth [
<xref rid="pone.0160169.ref008" ref-type="bibr">8</xref>
,
<xref rid="pone.0160169.ref022" ref-type="bibr">22</xref>
], but with the disadvantage of producing compositional data that may lead to spurious results when applying traditional statistical methods [
<xref rid="pone.0160169.ref013" ref-type="bibr">13</xref>
,
<xref rid="pone.0160169.ref015" ref-type="bibr">15</xref>
]. Transforming compositional data with log ratios such as the Isometric Log Ratio (ILR) or Centered Log Ratio (CLR) transformation circumvents this issue [
<xref rid="pone.0160169.ref010" ref-type="bibr">10</xref>
,
<xref rid="pone.0160169.ref032" ref-type="bibr">32</xref>
]. Our
<monospace>mixMC</monospace>
framework includes those transformations to visualise diversity patterns (PCA) or to perform discriminant analysis and identify indicator species explaining abundance differences between habitats (sPLS-DA). We applied the ILR transformation for PCA, as proposed by [
<xref rid="pone.0160169.ref016" ref-type="bibr">16</xref>
,
<xref rid="pone.0160169.ref032" ref-type="bibr">32</xref>
] to overcome the CLR limitation that may lead to singular covariance matrices. For sPLS-DA however, the feature selection process requires an
<italic>n</italic>
×
<italic>p</italic>
input matrix in order to identify indicator species and we therefore applied the one-to-one CLR transformation. We showed that sPLS-DA delivered relevant results in our three case studies using TSS+CLR transformed data. CSS normalisation was proposed by Paulson
<italic>et al</italic>
. to account for sparse counts [
<xref rid="pone.0160169.ref009" ref-type="bibr">9</xref>
]. In the Most Diverse case study we showed that both TSS and CSS normalisations identified the same bacteria families. In the more complex Oral case study we observed differences as TSS+CLR led to the identification of a greater number of families than CSS. We must therefore keep in mind that normalisation is data-specific and needs to be carefully chosen prior to statistical analysis.</p>
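<p>As a minimal sketch of the TSS+CLR step discussed above, the Python code below scales an invented OTU count table to proportions and applies the CLR transformation. The offset of 1 added before scaling, the function names and the toy counts are assumptions for illustration; the mixOmics implementation may differ. The final check shows why the CLR covariance matrix is singular: every CLR-transformed sample sums to zero, so the all-ones vector lies in the null space of the covariance matrix.</p>
<preformat>
```python
import numpy as np

def tss(counts, offset=1.0):
    """Total Sum Scaling: turn each sample (row) into proportions.

    `offset` is a hypothetical pseudo-count that avoids log(0) later on.
    """
    shifted = np.asarray(counts, dtype=float) + offset
    return shifted / shifted.sum(axis=1, keepdims=True)

def clr(proportions):
    """Centered Log Ratio: log of each part minus the row mean of the logs."""
    logs = np.log(proportions)
    return logs - logs.mean(axis=1, keepdims=True)

# Invented table: 6 samples (rows) x 4 OTUs (columns).
counts = np.array([[10, 0, 5, 85],
                   [200, 20, 30, 750],
                   [3, 1, 0, 16],
                   [40, 40, 10, 10],
                   [0, 7, 2, 91],
                   [15, 5, 60, 20]])

props = tss(counts)          # compositional data, each row sums to 1
clr_data = clr(props)        # same n x p shape, each row sums to 0

# Covariance between OTUs (columns) of the CLR data.
cov = np.cov(clr_data, rowvar=False)
# The all-ones vector is mapped to (numerically) zero: cov is singular.
null_check = cov @ np.ones(counts.shape[1])
```
</preformat>
<p>The CLR output keeps one column per OTU, which is what the sPLS-DA feature selection needs, whereas the ILR transformation trades that one-to-one mapping for a non-singular covariance matrix.</p>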
<p>Our
<monospace>mixMC</monospace>
framework handles repeated-measures designs with a multilevel variance decomposition. This additional step can also be seen as a scaling transformation that extracts subtle differences between body sites or habitats within the same individuals. We anticipate that such experimental designs will become widely adopted in microbiome studies. However, our framework is not restricted to repeated-measures designs and can be used more generally to compare phenotypes or disease outcomes.</p>
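<p>The multilevel step can be sketched, under simplifying assumptions, as removing each individual's mean profile so that only within-individual variation remains. The Python code below handles a single repeated factor; the function name <monospace>within_subject_deviation</monospace> and the data are hypothetical, and this is not the mixOmics code.</p>
<preformat>
```python
import numpy as np

def within_subject_deviation(data, subject_ids):
    """Subtract each individual's mean profile from that individual's samples."""
    data = np.asarray(data, dtype=float)
    subject_ids = np.asarray(subject_ids)
    within = np.empty_like(data)
    for subject in np.unique(subject_ids):
        rows = subject_ids == subject          # samples from this individual
        within[rows] = data[rows] - data[rows].mean(axis=0)
    return within

# Invented example: 2 individuals, each sampled at 2 body sites, 3 features.
data = np.array([[5.0, 1.0, 0.0],
                 [7.0, 3.0, 2.0],
                 [50.0, 10.0, 20.0],
                 [52.0, 12.0, 22.0]])
subject_ids = ["A", "A", "B", "B"]

within = within_subject_deviation(data, subject_ids)
```
</preformat>
<p>In this toy example the two individuals differ greatly in overall abundance, yet their within-individual deviations are identical, so the body-site effect becomes comparable across individuals.</p>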
<p>
<monospace>mixMC</monospace>
offers more extensive analytical features than univariate methods, including insightful graphical outputs for data interpretation. We found that the overall structure of the signatures obtained with univariate and multivariate approaches was similar at the family level. However, dimension reduction multivariate approaches provide intuitive plots and numerical outputs for a better understanding of the discriminative ability of the OTU features identified.</p>
<p>Our study aligns well with recent studies that investigated the link between gut and oral microbial communities [
<xref rid="pone.0160169.ref025" ref-type="bibr">25</xref>
,
<xref rid="pone.0160169.ref050" ref-type="bibr">50</xref>
]. Franzosa
<italic>et al</italic>
. identified a subset of abundant oral microbes surviving transit to the gut that were linked with disease markers of atherosclerosis such as cholesterol [
<xref rid="pone.0160169.ref050" ref-type="bibr">50</xref>
]. From our detailed analyses, we reached similar conclusions identifying bacteria such as
<italic>Fusobacterium</italic>
,
<italic>Propionibacterium, Veillonella</italic>
in both the oral body sites from both HMP data sets (including plaque, tongue and gingiva) and stool microbiomes as underlined by Koren
<italic>et al</italic>
. [
<xref rid="pone.0160169.ref025" ref-type="bibr">25</xref>
]. Our comparative study with the Koren data set demonstrated that sPLS-DA was able to identify a microbiome signature consistent across different individual cohorts and sequencing platforms. The microbiome signatures we identified from the most diverse HMP data set and the Koren data set further demonstrated that microbial communities cannot be considered discrete environments, but are, in fact, fluid environments.</p>
</sec>
<sec sec-type="conclusions" id="sec026">
<title>Conclusions</title>
<p>
<monospace>mixMC</monospace>
is a statistical analysis framework enabling holistic understanding of microbial communities. In this study, we demonstrated the advantages of using multivariate methodologies for the statistical analysis of 16S compositional data, to summarise and reduce the dimension of possibly large data sets; to obtain a better understanding of the microbial communities through insightful graphical outputs; and to highlight features characterising and discriminating different environments. While our study has particularly focused on repeated-measures designs, the multivariate approach that we propose is not restricted to such designs only. Similar analyses can be performed on non-repeated designs to highlight relevant microbial features.</p>
<p>The multivariate approach sPLS-DA is a specific case of a larger family of projection-based multivariate approaches, some of which also allow integration of different types of data. Our proposed analysis framework therefore paves the way towards a ‘microbiome systems biology’ approach by integrating large-scale multi-‘omics data such as metatranscriptomics, metabolomics or metaproteomics currently being collected by the integrative HMP project [
<xref rid="pone.0160169.ref051" ref-type="bibr">51</xref>
], thereby improving our understanding of the biomolecular activities and regulatory systems of human microbiota.</p>
<sec id="sec027">
<title>Availability of supporting data</title>
<p>The data sets supporting the results of this article are available from the NIH Human Microbiome Project
<ext-link ext-link-type="uri" xlink:href="http://hmpdacc.org/HMQCP/all/">http://hmpdacc.org/HMQCP/all/</ext-link>
in raw data format, and in processed format on our website
<ext-link ext-link-type="uri" xlink:href="http://www.mixOmics.org/mixMC">www.mixOmics.org/mixMC</ext-link>
. R functions are available in our mixOmics package [
<xref rid="pone.0160169.ref041" ref-type="bibr">41</xref>
,
<xref rid="pone.0160169.ref052" ref-type="bibr">52</xref>
]. R scripts and a full tutorial to reproduce the results from the proposed framework are also available on our website.</p>
</sec>
</sec>
<sec sec-type="supplementary-material" id="sec028">
<title>Supporting Information</title>
<supplementary-material content-type="local-data" id="pone.0160169.s001">
<label>S1 Text</label>
<caption>
<title>Isometric Log Ratio transformation.</title>
<p>(PDF)</p>
</caption>
<media xlink:href="pone.0160169.s001.pdf">
<caption>
<p>Click here for additional data file.</p>
</caption>
</media>
</supplementary-material>
<supplementary-material content-type="local-data" id="pone.0160169.s002">
<label>S1 Table</label>
<caption>
<title>Description of the two HMP data sets through preprocessing steps.</title>
<p>(PDF)</p>
</caption>
<media xlink:href="pone.0160169.s002.pdf">
<caption>
<p>Click here for additional data file.</p>
</caption>
</media>
</supplementary-material>
<supplementary-material content-type="local-data" id="pone.0160169.s003">
<label>S2 Table</label>
<caption>
<title>Most diverse data, number of features selected by the different univariate and multivariate approaches at the OTU or family level.</title>
<p>The OTU selection is based on either 5% significance level (adjusted FDR p-values) for DESeq2 and ZIG or the best classification performance with mean error rate across 10-fold cross-validation repeated 100 times (standard deviation) for sPLS-DA with two components.</p>
<p>(PDF)</p>
</caption>
<media xlink:href="pone.0160169.s003.pdf">
<caption>
<p>Click here for additional data file.</p>
</caption>
</media>
</supplementary-material>
<supplementary-material content-type="local-data" id="pone.0160169.s004">
<label>S3 Table</label>
<caption>
<title>Oral data, performance of sPLS-DA per component and body site (TSS+CLR data).</title>
<p>The mean classification error rate across 10-fold cross validation performed 100 times is indicated.</p>
<p>(PDF)</p>
</caption>
<media xlink:href="pone.0160169.s004.pdf">
<caption>
<p>Click here for additional data file.</p>
</caption>
</media>
</supplementary-material>
<supplementary-material content-type="local-data" id="pone.0160169.s005">
<label>S4 Table</label>
<caption>
<title>Most diverse data, performance of sPLS-DA per body site.</title>
<p>Componentwise 100*10-fold cross-validation classification error rate for sPLS-DA applied to either TSS+CLR or CSS normalised counts with respect to each body site class leading to the optimal microbiome signature.</p>
<p>(PDF)</p>
</caption>
<media xlink:href="pone.0160169.s005.pdf">
<caption>
<p>Click here for additional data file.</p>
</caption>
</media>
</supplementary-material>
<supplementary-material content-type="local-data" id="pone.0160169.s006">
<label>S5 Table</label>
<caption>
<title>Most diverse data, number of features contributing to each body site for each sPLS-DA component.</title>
<p>The sPLS-DA model was applied to either TSS+CLR or CSS normalised counts. Contribution is defined as the body site for which the maximum median normalised OTU abundance is achieved at the OTU (family) level.</p>
<p>(PDF)</p>
</caption>
<media xlink:href="pone.0160169.s006.pdf">
<caption>
<p>Click here for additional data file.</p>
</caption>
</media>
</supplementary-material>
<supplementary-material content-type="local-data" id="pone.0160169.s007">
<label>S1 Fig</label>
<caption>
<title>Oral data, PCoA sample plots with colours indicating gender or run centres.</title>
<p>Sample plot on the first two coordinates with colours indicating gender in
<bold>(a)</bold>
weighted Unifrac or
<bold>(b)</bold>
unweighted Unifrac, or run centers in
<bold>(c)</bold>
weighted Unifrac or
<bold>(d)</bold>
unweighted Unifrac calculated on the filtered OTU count table.</p>
<p>(TIF)</p>
</caption>
<media xlink:href="pone.0160169.s007.tif">
<caption>
<p>Click here for additional data file.</p>
</caption>
</media>
</supplementary-material>
<supplementary-material content-type="local-data" id="pone.0160169.s008">
<label>S2 Fig</label>
<caption>
<title>Most diverse data, PCoA sample plots.</title>
<p>Sample plot on the first two coordinates with
<bold>(a)</bold>
weighted Unifrac
<bold>(b)</bold>
unweighted Unifrac calculated on the unfiltered OTU count table (based on 43,146 OTU).</p>
<p>(TIF)</p>
</caption>
<media xlink:href="pone.0160169.s008.tif">
<caption>
<p>Click here for additional data file.</p>
</caption>
</media>
</supplementary-material>
<supplementary-material content-type="local-data" id="pone.0160169.s009">
<label>S3 Fig</label>
<caption>
<title>Most diverse data, comparison between univariate OTU selections and multivariate sPLS-DA selection.</title>
<p>Comparison of the most differentially abundant features identified by DESeq2 and ZIG (FDR ≤ 0.05) and the most discriminative features identified by TSS+CLR with sPLS-DA or CSS with sPLS-DA (lowest mean classification error rate achieved when performing 100 * 10-fold cross-validation).
<bold>(a)</bold>
: selection size at OTU level,
<bold>(b)</bold>
: at the family level.</p>
<p>(TIF)</p>
</caption>
<media xlink:href="pone.0160169.s009.tif">
<caption>
<p>Click here for additional data file.</p>
</caption>
</media>
</supplementary-material>
<supplementary-material content-type="local-data" id="pone.0160169.s010">
<label>S4 Fig</label>
<caption>
<title>Oral data, PCoA and PCA sample plots.</title>
<p>Sample plot on the first two coordinates with
<bold>(a)</bold>
weighted Unifrac
<bold>(b)</bold>
unweighted Unifrac calculated on the filtered OTU count table and on the first components for
<bold>(c)</bold>
TSS+ILR and
<bold>(d)</bold>
TSS+ILR multilevel normalised OTU counts, and
<bold>(e)</bold>
CSS and
<bold>(f)</bold>
CSS multilevel normalised OTU counts.</p>
<p>(TIF)</p>
</caption>
<media xlink:href="pone.0160169.s010.tif">
<caption>
<p>Click here for additional data file.</p>
</caption>
</media>
</supplementary-material>
<supplementary-material content-type="local-data" id="pone.0160169.s011">
<label>S5 Fig</label>
<caption>
<title>Oral data, sPLS-DA performance.</title>
<p>Mean classification performance using 100 * 10-fold cross-validation. Each component is based on an optimal selection of OTU features that leads to the best classification performance. The sPLS-DA classifier was applied on
<bold>(a)</bold>
TSS+CLR or
<bold>(b)</bold>
CSS normalised data.</p>
<p>(TIF)</p>
</caption>
<media xlink:href="pone.0160169.s011.tif">
<caption>
<p>Click here for additional data file.</p>
</caption>
</media>
</supplementary-material>
<supplementary-material content-type="local-data" id="pone.0160169.s012">
<label>S6 Fig</label>
<caption>
<title>Oral data, sPLS-DA sample representation for the different components of the model.</title>
<p>
<bold>(d)</bold>
Component 4 vs Component 5,
<bold>(e)</bold>
Component 5 vs Component 6,
<bold>(f)</bold>
Component 6 vs Component 7,
<bold>(g)</bold>
Component 7 vs Component 8.</p>
<p>(TIF)</p>
</caption>
<media xlink:href="pone.0160169.s012.tif">
<caption>
<p>Click here for additional data file.</p>
</caption>
</media>
</supplementary-material>
<supplementary-material content-type="local-data" id="pone.0160169.s013">
<label>S7 Fig</label>
<caption>
<title>Koren data.</title>
<p>Sample plot on the first two components with
<bold>(a)</bold>
PCA
<bold>(b)</bold>
sPLS-DA on selected OTU. Contribution plots on the
<bold>(c)</bold>
first component (30 OTU selected) and
<bold>(d)</bold>
on the second component (100 OTU selected).</p>
<p>(TIF)</p>
</caption>
<media xlink:href="pone.0160169.s013.tif">
<caption>
<p>Click here for additional data file.</p>
</caption>
</media>
</supplementary-material>
<supplementary-material content-type="local-data" id="pone.0160169.s014">
<label>S1 File</label>
<caption>
<title>Diverse, Oral and Koren TSS+CLR data: selected OTU.</title>
<p>Contribution of selected OTU for each sPLS-DA component.</p>
<p>(ZIP)</p>
</caption>
<media xlink:href="pone.0160169.s014.zip">
<caption>
<p>Click here for additional data file.</p>
</caption>
</media>
</supplementary-material>
</sec>
</body>
<back>
<ack>
<p>KALC was supported in part by the Australian Cancer Research Foundation (ACRF) for the Diamantina Individualised Oncology Care Centre at The University of Queensland Diamantina Institute and the National Health and Medical Research Council (NHMRC) Career Development fellowship (APP1087415). FB was supported by the Agence Nationale de la Recherche (ANR) for the SYNTHACS project (ANR-10-BTBR-05-02). The authors would like to thank Christian Cherveaux (Danone Nutricia Research) for fruitful discussions in the early stages of the project.</p>
</ack>
<ref-list>
<title>References</title>
<ref id="pone.0160169.ref001">
<label>1</label>
<mixed-citation publication-type="journal">
<name>
<surname>Clarridge</surname>
<given-names>J.E.</given-names>
</name>
:
<article-title>Impact of 16S rRNA gene sequence analysis for identification of bacteria on clinical microbiology and infectious diseases</article-title>
.
<source>Clinical microbiology reviews</source>
<volume>17</volume>
(
<issue>4</issue>
),
<fpage>840</fpage>
<lpage>862</lpage>
(
<year>2004</year>
)
<comment>doi:
<ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.1128/CMR.17.4.840-862.2004">10.1128/CMR.17.4.840-862.2004</ext-link>
</comment>
<pub-id pub-id-type="pmid">15489351</pub-id>
</mixed-citation>
</ref>
<ref id="pone.0160169.ref002">
<label>2</label>
<mixed-citation publication-type="journal">
<name>
<surname>Huse</surname>
<given-names>S.M.</given-names>
</name>
,
<name>
<surname>Welch</surname>
<given-names>D.M.</given-names>
</name>
,
<name>
<surname>Morrison</surname>
<given-names>H.G.</given-names>
</name>
,
<name>
<surname>Sogin</surname>
<given-names>M.L.</given-names>
</name>
:
<article-title>Ironing out the wrinkles in the rare biosphere through improved OTU clustering</article-title>
.
<source>Environmental microbiology</source>
<volume>12</volume>
(
<issue>7</issue>
),
<fpage>1889</fpage>
<lpage>1898</lpage>
(
<year>2010</year>
)
<comment>doi:
<ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.1111/j.1462-2920.2010.02193.x">10.1111/j.1462-2920.2010.02193.x</ext-link>
</comment>
<pub-id pub-id-type="pmid">20236171</pub-id>
</mixed-citation>
</ref>
<ref id="pone.0160169.ref003">
<label>3</label>
<mixed-citation publication-type="journal">
<name>
<surname>Turnbaugh</surname>
<given-names>P.J.</given-names>
</name>
,
<name>
<surname>Bäckhed</surname>
<given-names>F.</given-names>
</name>
,
<name>
<surname>Fulton</surname>
<given-names>L.</given-names>
</name>
,
<name>
<surname>Gordon</surname>
<given-names>J.I.</given-names>
</name>
:
<article-title>Diet-induced obesity is linked to marked but reversible alterations in the mouse distal gut microbiome</article-title>
.
<source>Cell host &amp; microbe</source>
<volume>3</volume>
(
<issue>4</issue>
),
<fpage>213</fpage>
<lpage>223</lpage>
(
<year>2008</year>
)
<comment>doi:
<ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.1016/j.chom.2008.02.015">10.1016/j.chom.2008.02.015</ext-link>
</comment>
<pub-id pub-id-type="pmid">18407065</pub-id>
</mixed-citation>
</ref>
<ref id="pone.0160169.ref004">
<label>4</label>
<mixed-citation publication-type="journal">
<name>
<surname>Turnbaugh</surname>
<given-names>P.J.</given-names>
</name>
,
<name>
<surname>Hamady</surname>
<given-names>M.</given-names>
</name>
,
<name>
<surname>Yatsunenko</surname>
<given-names>T.</given-names>
</name>
,
<name>
<surname>Cantarel</surname>
<given-names>B.L.</given-names>
</name>
,
<name>
<surname>Duncan</surname>
<given-names>A.</given-names>
</name>
,
<name>
<surname>Ley</surname>
</name>
,
<etal>et al</etal>
:
<article-title>A core gut microbiome in obese and lean twins</article-title>
.
<source>Nature</source>
<volume>457</volume>
(
<issue>7228</issue>
),
<fpage>480</fpage>
<lpage>484</lpage>
(
<year>2009</year>
)
<comment>doi:
<ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.1038/nature07540">10.1038/nature07540</ext-link>
</comment>
<pub-id pub-id-type="pmid">19043404</pub-id>
</mixed-citation>
</ref>
<ref id="pone.0160169.ref005">
<label>5</label>
<mixed-citation publication-type="journal">
<name>
<surname>Duncan</surname>
<given-names>S.H.</given-names>
</name>
,
<name>
<surname>Lobley</surname>
<given-names>G.</given-names>
</name>
,
<name>
<surname>Holtrop</surname>
<given-names>G.</given-names>
</name>
,
<name>
<surname>Ince</surname>
<given-names>J.</given-names>
</name>
,
<name>
<surname>Johnstone</surname>
<given-names>A.</given-names>
</name>
,
<name>
<surname>Louis</surname>
<given-names>P.</given-names>
</name>
,
<name>
<surname>Flint</surname>
<given-names>H.</given-names>
</name>
:
<article-title>Human colonic microbiota associated with diet, obesity and weight loss</article-title>
.
<source>International journal of obesity</source>
<volume>32</volume>
(
<issue>11</issue>
),
<fpage>1720</fpage>
<lpage>1724</lpage>
(
<year>2008</year>
)
<comment>doi:
<ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.1038/ijo.2008.155">10.1038/ijo.2008.155</ext-link>
</comment>
<pub-id pub-id-type="pmid">18779823</pub-id>
</mixed-citation>
</ref>
<ref id="pone.0160169.ref006">
<label>6</label>
<mixed-citation publication-type="journal">
<name>
<surname>Gevers</surname>
<given-names>D.</given-names>
</name>
,
<name>
<surname>Kugathasan</surname>
<given-names>S.</given-names>
</name>
,
<name>
<surname>Denson</surname>
<given-names>L.A.</given-names>
</name>
,
<name>
<surname>Vázquez-Baeza</surname>
<given-names>Y.</given-names>
</name>
,
<name>
<surname>Van Treuren</surname>
<given-names>W.</given-names>
</name>
,
<name>
<surname>Ren</surname>
</name>
,
<etal>et al</etal>
:
<article-title>The treatment-naive microbiome in new-onset Crohn’s disease</article-title>
.
<source>Cell host &amp; microbe</source>
<volume>15</volume>
(
<issue>3</issue>
),
<fpage>382</fpage>
<lpage>392</lpage>
(
<year>2014</year>
)
<comment>doi:
<ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.1016/j.chom.2014.02.005">10.1016/j.chom.2014.02.005</ext-link>
</comment>
<pub-id pub-id-type="pmid">24629344</pub-id>
</mixed-citation>
</ref>
<ref id="pone.0160169.ref007">
<label>7</label>
<mixed-citation publication-type="journal">
<name>
<surname>Costello</surname>
<given-names>M.-E.</given-names>
</name>
,
<name>
<surname>Ciccia</surname>
<given-names>F.</given-names>
</name>
,
<name>
<surname>Willner</surname>
<given-names>D.</given-names>
</name>
,
<name>
<surname>Warrington</surname>
<given-names>N.</given-names>
</name>
,
<name>
<surname>Robinson</surname>
<given-names>P.C.</given-names>
</name>
,
<name>
<surname>Gardiner</surname>
<given-names>B.</given-names>
</name>
,
<etal>et al</etal>
:
<article-title>Intestinal dysbiosis in ankylosing spondylitis</article-title>
.
<source>Arthritis &amp; Rheumatology</source>
<volume>67</volume>
(
<issue>3</issue>
),
<fpage>686</fpage>
<lpage>691</lpage>
(
<year>2015</year>
)
<comment>doi:
<ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.1002/art.38967">10.1002/art.38967</ext-link>
</comment>
</mixed-citation>
</ref>
<ref id="pone.0160169.ref008">
<label>8</label>
<mixed-citation publication-type="journal">
<name>
<surname>White</surname>
<given-names>J.R.</given-names>
</name>
,
<name>
<surname>Nagarajan</surname>
<given-names>N.</given-names>
</name>
,
<name>
<surname>Pop</surname>
<given-names>M.</given-names>
</name>
:
<article-title>Statistical methods for detecting differentially abundant features in clinical metagenomic samples</article-title>
.
<source>PLoS computational biology</source>
<volume>5</volume>
(
<issue>4</issue>
),
<fpage>e1000352</fpage>
(
<year>2009</year>
)
<comment>doi:
<ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.1371/journal.pcbi.1000352">10.1371/journal.pcbi.1000352</ext-link>
</comment>
<pub-id pub-id-type="pmid">19360128</pub-id>
</mixed-citation>
</ref>
<ref id="pone.0160169.ref009">
<label>9</label>
<mixed-citation publication-type="journal">
<name>
<surname>Paulson</surname>
<given-names>J.N.</given-names>
</name>
,
<name>
<surname>Stine</surname>
<given-names>O.C.</given-names>
</name>
,
<name>
<surname>Bravo</surname>
<given-names>H.C.</given-names>
</name>
,
<name>
<surname>Pop</surname>
<given-names>M.</given-names>
</name>
:
<article-title>Differential abundance analysis for microbial marker-gene surveys</article-title>
.
<source>Nature methods</source>
<volume>10</volume>
(
<issue>12</issue>
),
<fpage>1200</fpage>
<lpage>1202</lpage>
(
<year>2013</year>
)
<comment>doi:
<ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.1038/nmeth.2658">10.1038/nmeth.2658</ext-link>
</comment>
<pub-id pub-id-type="pmid">24076764</pub-id>
</mixed-citation>
</ref>
<ref id="pone.0160169.ref010">
<label>10</label>
<mixed-citation publication-type="journal">
<name>
<surname>Aitchison</surname>
<given-names>J.</given-names>
</name>
:
<article-title>The statistical analysis of compositional data</article-title>
.
<source>Journal of the Royal Statistical Society. Series B (Methodological)</source>
,
<fpage>139</fpage>
<lpage>177</lpage>
(
<year>1982</year>
)</mixed-citation>
</ref>
<ref id="pone.0160169.ref011">
<label>11</label>
<mixed-citation publication-type="journal">
<name>
<surname>Lovell</surname>
<given-names>D.</given-names>
</name>
,
<name>
<surname>Pawlowsky-Glahn</surname>
<given-names>V.</given-names>
</name>
,
<name>
<surname>Egozcue</surname>
<given-names>J.J.</given-names>
</name>
,
<name>
<surname>Marguerat</surname>
<given-names>S.</given-names>
</name>
,
<name>
<surname>Bähler</surname>
<given-names>J.</given-names>
</name>
:
<article-title>Proportionality: a valid alternative to correlation for relative data</article-title>
.
<source>PLoS computational biology</source>
<volume>11</volume>
(
<issue>3</issue>
),
<fpage>e1004075</fpage>
(
<year>2015</year>
)
<comment>doi:
<ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.1371/journal.pcbi.1004075">10.1371/journal.pcbi.1004075</ext-link>
</comment>
<pub-id pub-id-type="pmid">25775355</pub-id>
</mixed-citation>
</ref>
<ref id="pone.0160169.ref012">
<label>12</label>
<mixed-citation publication-type="journal">
<name>
<surname>Ban</surname>
<given-names>Y.</given-names>
</name>
,
<name>
<surname>An</surname>
<given-names>L.</given-names>
</name>
,
<name>
<surname>Jiang</surname>
<given-names>H.</given-names>
</name>
:
<article-title>Investigating microbial co-occurrence patterns based on metagenomic compositional data</article-title>
.
<source>Bioinformatics</source>
<volume>31</volume>
(
<issue>20</issue>
),
<fpage>3322</fpage>
<lpage>3329</lpage>
(
<year>2015</year>
)
<comment>doi:
<ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.1093/bioinformatics/btv364">10.1093/bioinformatics/btv364</ext-link>
</comment>
<pub-id pub-id-type="pmid">26079350</pub-id>
</mixed-citation>
</ref>
<ref id="pone.0160169.ref013">
<label>13</label>
<mixed-citation publication-type="journal">
<name>
<surname>Kurtz</surname>
<given-names>Z.D.</given-names>
</name>
,
<name>
<surname>Mueller</surname>
<given-names>C.L.</given-names>
</name>
,
<name>
<surname>Miraldi</surname>
<given-names>E.R.</given-names>
</name>
,
<name>
<surname>Littman</surname>
<given-names>D.R.</given-names>
</name>
,
<name>
<surname>Blaser</surname>
<given-names>M.J.</given-names>
</name>
,
<name>
<surname>Bonneau</surname>
<given-names>R.A.</given-names>
</name>
:
<article-title>Sparse and compositionally robust inference of microbial ecological networks</article-title>
.
<source>PLoS Comput Biol</source>
<volume>11</volume>
(
<issue>5</issue>
),
<fpage>e1004226</fpage>
(
<year>2015</year>
).
<comment>doi:
<ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.1371/journal.pcbi.1004226">10.1371/journal.pcbi.1004226</ext-link>
</comment>
<pub-id pub-id-type="pmid">25950956</pub-id>
</mixed-citation>
</ref>
<ref id="pone.0160169.ref014">
<label>14</label>
<mixed-citation publication-type="journal">
<name>
<surname>Mandal</surname>
<given-names>S.</given-names>
</name>
,
<name>
<surname>Van Treuren</surname>
<given-names>W.</given-names>
</name>
,
<name>
<surname>White</surname>
<given-names>R.A.</given-names>
</name>
,
<name>
<surname>Eggesbo</surname>
<given-names>M.</given-names>
</name>
,
<name>
<surname>Knight</surname>
<given-names>R.</given-names>
</name>
,
<name>
<surname>Peddada</surname>
<given-names>S.D.</given-names>
</name>
:
<article-title>Analysis of composition of microbiomes: a novel method for studying microbial composition</article-title>
.
<source>Microbial Ecology in Health and Disease</source>
<volume>26</volume>
(
<year>2015</year>
)</mixed-citation>
</ref>
<ref id="pone.0160169.ref015">
<label>15</label>
<mixed-citation publication-type="journal">
<name>
<surname>Fernandes</surname>
<given-names>A.D.</given-names>
</name>
,
<name>
<surname>Reid</surname>
<given-names>J.N.</given-names>
</name>
,
<name>
<surname>Macklaim</surname>
<given-names>J.M.</given-names>
</name>
,
<name>
<surname>McMurrough</surname>
<given-names>T.A.</given-names>
</name>
,
<name>
<surname>Edgell</surname>
<given-names>D.R.</given-names>
</name>
,
<name>
<surname>Gloor</surname>
<given-names>G.B.</given-names>
</name>
:
<article-title>Unifying the analysis of high-throughput sequencing datasets: characterizing RNA-seq, 16S rRNA gene sequencing and selective growth experiments by compositional data analysis</article-title>
.
<source>Microbiome</source>
<volume>2</volume>
(
<issue>1</issue>
),
<fpage>1</fpage>
<lpage>13</lpage>
(
<year>2014</year>
)
<comment>doi:
<ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.1186/2049-2618-2-15">10.1186/2049-2618-2-15</ext-link>
</comment>
<pub-id pub-id-type="pmid">24468033</pub-id>
</mixed-citation>
</ref>
<ref id="pone.0160169.ref016">
<label>16</label>
<mixed-citation publication-type="journal">
<name>
<surname>Kalivodová</surname>
<given-names>A.</given-names>
</name>
,
<name>
<surname>Hron</surname>
<given-names>K.</given-names>
</name>
,
<name>
<surname>Filzmoser</surname>
<given-names>P.</given-names>
</name>
,
<name>
<surname>Najdekr</surname>
<given-names>L.</given-names>
</name>
,
<name>
<surname>Janečková</surname>
<given-names>H.</given-names>
</name>
,
<name>
<surname>Adam</surname>
<given-names>T.</given-names>
</name>
:
<article-title>PLS-DA for compositional data with application to metabolomics</article-title>
.
<source>Journal of Chemometrics</source>
<volume>29</volume>
(
<issue>1</issue>
),
<fpage>21</fpage>
<lpage>28</lpage>
(
<year>2015</year>
)
<comment>doi:
<ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.1002/cem.2657">10.1002/cem.2657</ext-link>
</comment>
</mixed-citation>
</ref>
<ref id="pone.0160169.ref017">
<label>17</label>
<mixed-citation publication-type="other">Gower J.C.: Principal coordinates analysis. Wiley StatsRef: Statistics Reference Online (1998)</mixed-citation>
</ref>
<ref id="pone.0160169.ref018">
<label>18</label>
<mixed-citation publication-type="journal">
<name>
<surname>Bray</surname>
<given-names>J.R.</given-names>
</name>
,
<name>
<surname>Curtis</surname>
<given-names>J.T.</given-names>
</name>
:
<article-title>An ordination of the upland forest communities of southern Wisconsin</article-title>
.
<source>Ecological monographs</source>
<volume>27</volume>
(
<issue>4</issue>
),
<fpage>325</fpage>
<lpage>349</lpage>
(
<year>1957</year>
)
<comment>doi:
<ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.2307/1942268">10.2307/1942268</ext-link>
</comment>
</mixed-citation>
</ref>
<ref id="pone.0160169.ref019">
<label>19</label>
<mixed-citation publication-type="journal">
<name>
<surname>Lozupone</surname>
<given-names>C.</given-names>
</name>
,
<name>
<surname>Knight</surname>
<given-names>R.</given-names>
</name>
:
<article-title>UniFrac: a new phylogenetic method for comparing microbial communities</article-title>
.
<source>Applied and environmental microbiology</source>
<volume>71</volume>
(
<issue>12</issue>
),
<fpage>8228</fpage>
<lpage>8235</lpage>
(
<year>2005</year>
)
<comment>doi:
<ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.1128/AEM.71.12.8228-8235.2005">10.1128/AEM.71.12.8228-8235.2005</ext-link>
</comment>
<pub-id pub-id-type="pmid">16332807</pub-id>
</mixed-citation>
</ref>
<ref id="pone.0160169.ref020">
<label>20</label>
<mixed-citation publication-type="journal">
<name>
<surname>Lozupone</surname>
<given-names>C.A.</given-names>
</name>
,
<name>
<surname>Hamady</surname>
<given-names>M.</given-names>
</name>
,
<name>
<surname>Kelley</surname>
<given-names>S.T.</given-names>
</name>
,
<name>
<surname>Knight</surname>
<given-names>R.</given-names>
</name>
:
<article-title>Quantitative and qualitative
<italic>β</italic>
diversity measures lead to different insights into factors that structure microbial communities</article-title>
.
<source>Applied and environmental microbiology</source>
<volume>73</volume>
(
<issue>5</issue>
),
<fpage>1576</fpage>
<lpage>1585</lpage>
(
<year>2007</year>
)
<comment>doi:
<ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.1128/AEM.01996-06">10.1128/AEM.01996-06</ext-link>
</comment>
<pub-id pub-id-type="pmid">17220268</pub-id>
</mixed-citation>
</ref>
<ref id="pone.0160169.ref021">
<label>21</label>
<mixed-citation publication-type="journal">
<name>
<surname>Dolédec</surname>
<given-names>S.</given-names>
</name>
,
<name>
<surname>Chessel</surname>
<given-names>D.</given-names>
</name>
:
<article-title>Rythmes saisonniers et composantes stationnelles en milieu aquatique. i: Description d’un plan d’observation complet par projection de variables</article-title>
.
<source>Acta oecologica. Oecologia generalis</source>
<volume>8</volume>
(
<issue>3</issue>
),
<fpage>403</fpage>
<lpage>426</lpage>
(
<year>1987</year>
)</mixed-citation>
</ref>
<ref id="pone.0160169.ref022">
<label>22</label>
<mixed-citation publication-type="journal">
<name>
<surname>Segata</surname>
<given-names>N.</given-names>
</name>
,
<name>
<surname>Izard</surname>
<given-names>J.</given-names>
</name>
,
<name>
<surname>Waldron</surname>
<given-names>L.</given-names>
</name>
,
<name>
<surname>Gevers</surname>
<given-names>D.</given-names>
</name>
,
<name>
<surname>Miropolsky</surname>
<given-names>L.</given-names>
</name>
,
<name>
<surname>Garrett</surname>
<given-names>W.S.</given-names>
</name>
,
<etal>et al</etal>
:
<article-title>Metagenomic biomarker discovery and explanation</article-title>
.
<source>Genome Biol</source>
<volume>12</volume>
(
<issue>6</issue>
),
<fpage>60</fpage>
(
<year>2011</year>
)
<comment>doi:
<ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.1186/gb-2011-12-6-r60">10.1186/gb-2011-12-6-r60</ext-link>
</comment>
</mixed-citation>
</ref>
<ref id="pone.0160169.ref023">
<label>23</label>
<mixed-citation publication-type="journal">
<collab>Human Microbiome Project Consortium</collab>
:
<article-title>A framework for human microbiome research</article-title>
.
<source>Nature</source>
<volume>486</volume>
(
<issue>7402</issue>
),
<fpage>215</fpage>
<lpage>221</lpage>
(
<year>2012</year>
)
<comment>doi:
<ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.1038/nature11209">10.1038/nature11209</ext-link>
</comment>
<pub-id pub-id-type="pmid">22699610</pub-id>
</mixed-citation>
</ref>
<ref id="pone.0160169.ref024">
<label>24</label>
<mixed-citation publication-type="journal">
<collab>Human Microbiome Project Consortium</collab>
:
<article-title>Structure, function and diversity of the healthy human microbiome</article-title>
.
<source>Nature</source>
<volume>486</volume>
(
<issue>7402</issue>
),
<fpage>207</fpage>
<lpage>214</lpage>
(
<year>2012</year>
)
<comment>doi:
<ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.1038/nature11234">10.1038/nature11234</ext-link>
</comment>
<pub-id pub-id-type="pmid">22699609</pub-id>
</mixed-citation>
</ref>
<ref id="pone.0160169.ref025">
<label>25</label>
<mixed-citation publication-type="journal">
<name>
<surname>Koren</surname>
<given-names>O.</given-names>
</name>
,
<name>
<surname>Spor</surname>
<given-names>A.</given-names>
</name>
,
<name>
<surname>Felin</surname>
<given-names>J.</given-names>
</name>
,
<name>
<surname>Fak</surname>
<given-names>F.</given-names>
</name>
,
<name>
<surname>Stombaugh</surname>
<given-names>J.</given-names>
</name>
,
<name>
<surname>Tremaroli</surname>
<given-names>V.</given-names>
</name>
,
<etal>et al</etal>
:
<article-title>Human oral, gut, and plaque microbiota in patients with atherosclerosis</article-title>
.
<source>Proceedings of the National Academy of Sciences</source>
<volume>108</volume>
(
<issue>Supplement 1</issue>
),
<fpage>4592</fpage>
<lpage>4598</lpage>
(
<year>2011</year>
)
<comment>doi:
<ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.1073/pnas.1011383107">10.1073/pnas.1011383107</ext-link>
</comment>
</mixed-citation>
</ref>
<ref id="pone.0160169.ref026">
<label>26</label>
<mixed-citation publication-type="journal">
<name>
<surname>Caporaso</surname>
<given-names>J.G.</given-names>
</name>
,
<name>
<surname>Kuczynski</surname>
<given-names>J.</given-names>
</name>
,
<name>
<surname>Stombaugh</surname>
<given-names>J.</given-names>
</name>
,
<name>
<surname>Bittinger</surname>
<given-names>K.</given-names>
</name>
,
<name>
<surname>Bushman</surname>
<given-names>F.D.</given-names>
</name>
,
<name>
<surname>Costello</surname>
<given-names>E.K.</given-names>
</name>
,
<etal>et al</etal>
:
<article-title>QIIME allows analysis of high-throughput community sequencing data</article-title>
.
<source>Nature methods</source>
<volume>7</volume>
(
<issue>5</issue>
),
<fpage>335</fpage>
<lpage>336</lpage>
(
<year>2010</year>
)
<comment>doi:
<ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.1038/nmeth.f.303">10.1038/nmeth.f.303</ext-link>
</comment>
<pub-id pub-id-type="pmid">20383131</pub-id>
</mixed-citation>
</ref>
<ref id="pone.0160169.ref027">
<label>27</label>
<mixed-citation publication-type="journal">
<name>
<surname>Bokulich</surname>
<given-names>N.A.</given-names>
</name>
,
<name>
<surname>Subramanian</surname>
<given-names>S.</given-names>
</name>
,
<name>
<surname>Faith</surname>
<given-names>J.J.</given-names>
</name>
,
<name>
<surname>Gevers</surname>
<given-names>D.</given-names>
</name>
,
<name>
<surname>Gordon</surname>
<given-names>J.I.</given-names>
</name>
,
<name>
<surname>Knight</surname>
<given-names>R.</given-names>
</name>
,
<etal>et al</etal>
:
<article-title>Quality-filtering vastly improves diversity estimates from Illumina amplicon sequencing</article-title>
.
<source>Nature methods</source>
<volume>10</volume>
(
<issue>1</issue>
),
<fpage>57</fpage>
<lpage>59</lpage>
(
<year>2013</year>
)
<comment>doi:
<ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.1038/nmeth.2276">10.1038/nmeth.2276</ext-link>
</comment>
<pub-id pub-id-type="pmid">23202435</pub-id>
</mixed-citation>
</ref>
<ref id="pone.0160169.ref028">
<label>28</label>
<mixed-citation publication-type="journal">
<name>
<surname>Kunin</surname>
<given-names>V</given-names>
</name>
,
<name>
<surname>Engelbrektson</surname>
<given-names>A</given-names>
</name>
,
<name>
<surname>Ochman</surname>
<given-names>H</given-names>
</name>
,
<name>
<surname>Hugenholtz</surname>
<given-names>P</given-names>
</name>
.
<article-title>Wrinkles in the rare biosphere: pyrosequencing errors can lead to artificial inflation of diversity estimates</article-title>
.
<source>Environmental Microbiology</source>
<volume>12</volume>
(
<issue>1</issue>
):
<fpage>118</fpage>
<lpage>23</lpage>
(
<year>2010</year>
)
<comment>doi:
<ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.1111/j.1462-2920.2009.02051.x">10.1111/j.1462-2920.2009.02051.x</ext-link>
</comment>
<pub-id pub-id-type="pmid">19725865</pub-id>
</mixed-citation>
</ref>
<ref id="pone.0160169.ref029">
<label>29</label>
<mixed-citation publication-type="journal">
<name>
<surname>Knights</surname>
<given-names>D.</given-names>
</name>
,
<name>
<surname>Parfrey</surname>
<given-names>L.W.</given-names>
</name>
,
<name>
<surname>Zaneveld</surname>
<given-names>J.</given-names>
</name>
,
<name>
<surname>Lozupone</surname>
<given-names>C.</given-names>
</name>
,
<name>
<surname>Knight</surname>
<given-names>R.</given-names>
</name>
:
<article-title>Human-associated microbial signatures: examining their predictive value</article-title>
.
<source>Cell host &amp; microbe</source>
<volume>10</volume>
(
<issue>4</issue>
),
<fpage>292</fpage>
<lpage>296</lpage>
(
<year>2011</year>
)
<comment>doi:
<ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.1016/j.chom.2011.09.003">10.1016/j.chom.2011.09.003</ext-link>
</comment>
<pub-id pub-id-type="pmid">22018228</pub-id>
</mixed-citation>
</ref>
<ref id="pone.0160169.ref030">
<label>30</label>
<mixed-citation publication-type="journal">
<name>
<surname>Arumugam</surname>
<given-names>M.</given-names>
</name>
,
<name>
<surname>Raes</surname>
<given-names>J.</given-names>
</name>
,
<name>
<surname>Pelletier</surname>
<given-names>E.</given-names>
</name>
,
<name>
<surname>Le Paslier</surname>
<given-names>D.</given-names>
</name>
,
<name>
<surname>Yamada</surname>
<given-names>T.</given-names>
</name>
,
<name>
<surname>Mende</surname>
<given-names>D.R.</given-names>
</name>
,
<etal>et al</etal>
:
<article-title>Enterotypes of the human gut microbiome</article-title>
.
<source>Nature</source>
<volume>473</volume>
(
<issue>7346</issue>
),
<fpage>174</fpage>
<lpage>180</lpage>
(
<year>2011</year>
)
<comment>doi:
<ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.1038/nature09944">10.1038/nature09944</ext-link>
</comment>
<pub-id pub-id-type="pmid">21508958</pub-id>
</mixed-citation>
</ref>
<ref id="pone.0160169.ref031">
<label>31</label>
<mixed-citation publication-type="other">Paulson, J.N., Pop, M., Bravo, H.C.: metagenomeSeq: Statistical analysis for sparse high-throughput sequencing. Bioconductor package: 1.6.0. (2015).
<comment>
<ext-link ext-link-type="uri" xlink:href="http://cbcb.umd.edu/software/metagenomeSeq">http://cbcb.umd.edu/software/metagenomeSeq</ext-link>
</comment>
</mixed-citation>
</ref>
<ref id="pone.0160169.ref032">
<label>32</label>
<mixed-citation publication-type="journal">
<name>
<surname>Filzmoser</surname>
<given-names>P.</given-names>
</name>
,
<name>
<surname>Hron</surname>
<given-names>K.</given-names>
</name>
,
<name>
<surname>Reimann</surname>
<given-names>C.</given-names>
</name>
:
<article-title>Principal component analysis for compositional data with outliers</article-title>
.
<source>Environmetrics</source>
<volume>20</volume>
(
<issue>6</issue>
),
<fpage>621</fpage>
<lpage>632</lpage>
(
<year>2009</year>
)
<comment>doi:
<ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.1002/env.966">10.1002/env.966</ext-link>
</comment>
</mixed-citation>
</ref>
<ref id="pone.0160169.ref033">
<label>33</label>
<mixed-citation publication-type="book">
<name>
<surname>Templ</surname>
<given-names>M.</given-names>
</name>
,
<name>
<surname>Hron</surname>
<given-names>K.</given-names>
</name>
,
<name>
<surname>Filzmoser</surname>
<given-names>P.</given-names>
</name>
:
<source>robCompositions: an R-package for robust statistical analysis of compositional data</source>
.
<publisher-name>John Wiley and Sons</publisher-name>
(
<year>2011</year>
)</mixed-citation>
</ref>
<ref id="pone.0160169.ref034">
<label>34</label>
<mixed-citation publication-type="journal">
<name>
<surname>Westerhuis</surname>
<given-names>J.A.</given-names>
</name>
,
<name>
<surname>van Velzen</surname>
<given-names>E.J.</given-names>
</name>
,
<name>
<surname>Hoefsloot</surname>
<given-names>H.C.</given-names>
</name>
,
<name>
<surname>Smilde</surname>
<given-names>A.K.</given-names>
</name>
:
<article-title>Multivariate paired data analysis: multilevel PLSDA versus OPLSDA</article-title>
.
<source>Metabolomics</source>
<volume>6</volume>
(
<issue>1</issue>
),
<fpage>119</fpage>
<lpage>128</lpage>
(
<year>2010</year>
)
<comment>doi:
<ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.1007/s11306-009-0185-z">10.1007/s11306-009-0185-z</ext-link>
</comment>
<pub-id pub-id-type="pmid">20339442</pub-id>
</mixed-citation>
</ref>
<ref id="pone.0160169.ref035">
<label>35</label>
<mixed-citation publication-type="journal">
<name>
<surname>Liquet</surname>
<given-names>B.</given-names>
</name>
,
<name>
<surname>Lê Cao</surname>
<given-names>K.-A.</given-names>
</name>
,
<name>
<surname>Hocini</surname>
<given-names>H.</given-names>
</name>
,
<name>
<surname>Thiébaut</surname>
<given-names>R.</given-names>
</name>
:
<article-title>A novel approach for biomarker selection and the integration of repeated measures experiments from two assays</article-title>
.
<source>BMC bioinformatics</source>
<volume>13</volume>
(
<issue>1</issue>
),
<fpage>325</fpage>
(
<year>2012</year>
)
<comment>doi:
<ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.1186/1471-2105-13-325">10.1186/1471-2105-13-325</ext-link>
</comment>
<pub-id pub-id-type="pmid">23216942</pub-id>
</mixed-citation>
</ref>
<ref id="pone.0160169.ref036">
<label>36</label>
<mixed-citation publication-type="journal">
<name>
<surname>Straube</surname>
<given-names>J.</given-names>
</name>
,
<name>
<surname>Gorse</surname>
<given-names>A.-D.</given-names>
</name>
,
<name>
<surname>Huang</surname>
<given-names>B.E.</given-names>
</name>
,
<name>
<surname>Lê Cao</surname>
<given-names>K.-A.</given-names>
</name>
:
<article-title>A linear mixed model spline framework for analysing time course ‘omics’ data</article-title>
.
<source>PLoS ONE</source>
<volume>10</volume>
(
<issue>8</issue>
),
<fpage>e0134540</fpage>
(
<year>2015</year>
)
<comment>doi:
<ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.1371/journal.pone.0134540">10.1371/journal.pone.0134540</ext-link>
</comment>
<pub-id pub-id-type="pmid">26313144</pub-id>
</mixed-citation>
</ref>
<ref id="pone.0160169.ref037">
<label>37</label>
<mixed-citation publication-type="journal">
<name>
<surname>Lê Cao</surname>
<given-names>K.-A.</given-names>
</name>
,
<name>
<surname>Boitard</surname>
<given-names>S.</given-names>
</name>
,
<name>
<surname>Besse</surname>
<given-names>P.</given-names>
</name>
:
<article-title>Sparse PLS Discriminant Analysis: biologically relevant feature selection and graphical displays for multiclass problems</article-title>
.
<source>BMC bioinformatics</source>
<volume>12</volume>
(
<issue>1</issue>
),
<fpage>253</fpage>
(
<year>2011</year>
)
<comment>doi:
<ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.1186/1471-2105-12-253">10.1186/1471-2105-12-253</ext-link>
</comment>
<pub-id pub-id-type="pmid">21693065</pub-id>
</mixed-citation>
</ref>
<ref id="pone.0160169.ref038">
<label>38</label>
<mixed-citation publication-type="journal">
<name>
<surname>Wold</surname>
<given-names>S.</given-names>
</name>
,
<name>
<surname>Sjöström</surname>
<given-names>M.</given-names>
</name>
,
<name>
<surname>Eriksson</surname>
<given-names>L.</given-names>
</name>
:
<article-title>PLS-regression: a basic tool of chemometrics</article-title>
.
<source>Chemometrics and intelligent laboratory systems</source>
<volume>58</volume>
(
<issue>2</issue>
),
<fpage>109</fpage>
<lpage>130</lpage>
(
<year>2001</year>
)
<comment>doi:
<ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.1016/S0169-7439(01)00155-1">10.1016/S0169-7439(01)00155-1</ext-link>
</comment>
</mixed-citation>
</ref>
<ref id="pone.0160169.ref039">
<label>39</label>
<mixed-citation publication-type="journal">
<name>
<surname>Tibshirani</surname>
<given-names>R.</given-names>
</name>
:
<article-title>Regression shrinkage and selection via the lasso</article-title>
.
<source>Journal of the Royal Statistical Society. Series B (Methodological)</source>
,
<fpage>267</fpage>
<lpage>288</lpage>
(
<year>1996</year>
)</mixed-citation>
</ref>
<ref id="pone.0160169.ref040">
<label>40</label>
<mixed-citation publication-type="journal">
<name>
<surname>Asnicar</surname>
<given-names>F.</given-names>
</name>
,
<name>
<surname>Weingart</surname>
<given-names>G.</given-names>
</name>
,
<name>
<surname>Tickle</surname>
<given-names>T.</given-names>
</name>
,
<name>
<surname>Huttenhower</surname>
<given-names>C.</given-names>
</name>
,
<name>
<surname>Segata</surname>
<given-names>N.</given-names>
</name>
:
<article-title>Compact graphical representation of phylogenetic data and metadata with GraPhlAn</article-title>
.
<source>PeerJ</source>
(
<year>2015</year>
).
<comment>doi:
<ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.7717/peerj.1029">10.7717/peerj.1029</ext-link>
</comment>
<pub-id pub-id-type="pmid">26157614</pub-id>
</mixed-citation>
</ref>
<ref id="pone.0160169.ref041">
<label>41</label>
<mixed-citation publication-type="other">Lê Cao, K.-A., Rohart, F., Gautier, B., Bartolo, F., González, I., Déjean, S.: mixOmics: Omics Data Integration Project. R package version 6.0.0 (2016).
<comment>
<ext-link ext-link-type="uri" xlink:href="https://CRAN.R-project.org/package=mixOmics">https://CRAN.R-project.org/package=mixOmics</ext-link>
</comment>
</mixed-citation>
</ref>
<ref id="pone.0160169.ref042">
<label>42</label>
<mixed-citation publication-type="journal">
<name>
<surname>Benjamini</surname>
<given-names>Y.</given-names>
</name>
,
<name>
<surname>Hochberg</surname>
<given-names>Y.</given-names>
</name>
:
<article-title>Controlling the false discovery rate: a practical and powerful approach to multiple testing</article-title>
.
<source>Journal of the Royal Statistical Society. Series B (Methodological)</source>
,
<fpage>289</fpage>
<lpage>300</lpage>
(
<year>1995</year>
)</mixed-citation>
</ref>
<ref id="pone.0160169.ref043">
<label>43</label>
<mixed-citation publication-type="journal">
<name>
<surname>Anders</surname>
<given-names>S.</given-names>
</name>
,
<name>
<surname>Huber</surname>
<given-names>W.</given-names>
</name>
:
<article-title>Differential expression analysis for sequence count data</article-title>
.
<source>Genome Biol</source>
<volume>11</volume>
(
<issue>10</issue>
),
<fpage>106</fpage>
(
<year>2010</year>
)
<comment>doi:
<ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.1186/gb-2010-11-10-r106">10.1186/gb-2010-11-10-r106</ext-link>
</comment>
<pub-id pub-id-type="pmid">20236492</pub-id>
</mixed-citation>
</ref>
<ref id="pone.0160169.ref044">
<label>44</label>
<mixed-citation publication-type="journal">
<name>
<surname>McMurdie</surname>
<given-names>P.J.</given-names>
</name>
,
<name>
<surname>Holmes</surname>
<given-names>S.</given-names>
</name>
:
<article-title>Waste not, want not: why rarefying microbiome data is inadmissible</article-title>
.
<source>PLoS Computational Biology</source>
<volume>10</volume>
(
<issue>4</issue>
),
<fpage>e1003531</fpage>
(
<year>2014</year>
)
<comment>doi:
<ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.1371/journal.pcbi.1003531">10.1371/journal.pcbi.1003531</ext-link>
</comment>
<pub-id pub-id-type="pmid">24699258</pub-id>
</mixed-citation>
</ref>
<ref id="pone.0160169.ref045">
<label>45</label>
<mixed-citation publication-type="journal">
<name>
<surname>Love</surname>
<given-names>M.I.</given-names>
</name>
,
<name>
<surname>Huber</surname>
<given-names>W.</given-names>
</name>
,
<name>
<surname>Anders</surname>
<given-names>S.</given-names>
</name>
:
<article-title>Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2</article-title>
.
<source>Genome biology</source>
<volume>15</volume>
(
<issue>12</issue>
),
<fpage>550</fpage>
(
<year>2014</year>
)
<comment>doi:
<ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.1186/s13059-014-0550-8">10.1186/s13059-014-0550-8</ext-link>
</comment>
<pub-id pub-id-type="pmid">25516281</pub-id>
</mixed-citation>
</ref>
<ref id="pone.0160169.ref046">
<label>46</label>
<mixed-citation publication-type="journal">
<name>
<surname>Li</surname>
<given-names>K.</given-names>
</name>
,
<name>
<surname>Bihan</surname>
<given-names>M.</given-names>
</name>
,
<name>
<surname>Yooseph</surname>
<given-names>S.</given-names>
</name>
,
<name>
<surname>Methé</surname>
<given-names>B.A.</given-names>
</name>
:
<article-title>Analyses of the microbial diversity across the human microbiome</article-title>
.
<source>PLoS ONE</source>
<volume>7</volume>
(
<issue>6</issue>
),
<fpage>e32118</fpage>
(
<year>2012</year>
)
<comment>doi:
<ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.1371/journal.pone.0032118">10.1371/journal.pone.0032118</ext-link>
</comment>
<pub-id pub-id-type="pmid">22719823</pub-id>
</mixed-citation>
</ref>
<ref id="pone.0160169.ref047">
<label>47</label>
<mixed-citation publication-type="journal">
<collab>Human Microbiome Project Consortium</collab>
:
<article-title>Evaluation of 16S rDNA-based community profiling for human microbiome research</article-title>
.
<source>PLoS ONE</source>
<volume>7</volume>
(
<issue>6</issue>
),
<fpage>e39315</fpage>
(
<year>2012</year>
)
<comment>doi:
<ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.1371/journal.pone.0039315">10.1371/journal.pone.0039315</ext-link>
</comment>
<pub-id pub-id-type="pmid">22720093</pub-id>
</mixed-citation>
</ref>
<ref id="pone.0160169.ref048">
<label>48</label>
<mixed-citation publication-type="journal">
<name>
<surname>He</surname>
<given-names>X.</given-names>
</name>
,
<name>
<surname>McLean</surname>
<given-names>J.S.</given-names>
</name>
,
<name>
<surname>Edlund</surname>
<given-names>A.</given-names>
</name>
,
<name>
<surname>Yooseph</surname>
<given-names>S.</given-names>
</name>
,
<name>
<surname>Hall</surname>
<given-names>A.P.</given-names>
</name>
,
<name>
<surname>Liu</surname>
<given-names>S.-Y.</given-names>
</name>
,
<etal>et al</etal>
:
<article-title>Cultivation of a human-associated TM7 phylotype reveals a reduced genome and epibiotic parasitic lifestyle</article-title>
.
<source>Proceedings of the National Academy of Sciences</source>
<volume>112</volume>
(
<issue>1</issue>
),
<fpage>244</fpage>
<lpage>249</lpage>
(
<year>2015</year>
)
<comment>doi:
<ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.1073/pnas.1419038112">10.1073/pnas.1419038112</ext-link>
</comment>
</mixed-citation>
</ref>
<ref id="pone.0160169.ref049">
<label>49</label>
<mixed-citation publication-type="journal">
<name>
<surname>Warton</surname>
<given-names>D.I.</given-names>
</name>
,
<name>
<surname>Wright</surname>
<given-names>S.T.</given-names>
</name>
,
<name>
<surname>Wang</surname>
<given-names>Y.</given-names>
</name>
:
<article-title>Distance-based multivariate analyses confound location and dispersion effects</article-title>
.
<source>Methods in Ecology and Evolution</source>
<volume>3</volume>
(
<issue>1</issue>
),
<fpage>89</fpage>
<lpage>101</lpage>
(
<year>2012</year>
)
<comment>doi:
<ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.1111/j.2041-210X.2011.00127.x">10.1111/j.2041-210X.2011.00127.x</ext-link>
</comment>
</mixed-citation>
</ref>
<ref id="pone.0160169.ref050">
<label>50</label>
<mixed-citation publication-type="journal">
<name>
<surname>Franzosa</surname>
<given-names>E.A.</given-names>
</name>
,
<name>
<surname>Morgan</surname>
<given-names>X.C.</given-names>
</name>
,
<name>
<surname>Segata</surname>
<given-names>N.</given-names>
</name>
,
<name>
<surname>Waldron</surname>
<given-names>L.</given-names>
</name>
,
<name>
<surname>Reyes</surname>
<given-names>J.</given-names>
</name>
,
<name>
<surname>Earl</surname>
<given-names>A.M.</given-names>
</name>
,
<etal>et al</etal>
:
<article-title>Relating the metatranscriptome and metagenome of the human gut</article-title>
.
<source>Proceedings of the National Academy of Sciences</source>
<volume>111</volume>
(
<issue>22</issue>
),
<fpage>2329</fpage>
<lpage>2338</lpage>
(
<year>2014</year>
)
<comment>doi:
<ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.1073/pnas.1319284111">10.1073/pnas.1319284111</ext-link>
</comment>
</mixed-citation>
</ref>
<ref id="pone.0160169.ref051">
<label>51</label>
<mixed-citation publication-type="journal">
<collab>Integrative HMP (iHMP) Research Network Consortium</collab>
:
<article-title>The integrative human microbiome project: Dynamic analysis of microbiome-host omics profiles during periods of human health and disease</article-title>
.
<source>Cell host &amp; microbe</source>
<volume>16</volume>
(
<issue>3</issue>
),
<fpage>276</fpage>
(
<year>2014</year>
)
<comment>doi:
<ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.1016/j.chom.2014.08.014">10.1016/j.chom.2014.08.014</ext-link>
</comment>
<pub-id pub-id-type="pmid">25211071</pub-id>
</mixed-citation>
</ref>
<ref id="pone.0160169.ref052">
<label>52</label>
<mixed-citation publication-type="journal">
<name>
<surname>González</surname>
<given-names>I.</given-names>
</name>
,
<name>
<surname>Lê Cao</surname>
<given-names>K.-A.</given-names>
</name>
,
<name>
<surname>Davis</surname>
<given-names>M.J.</given-names>
</name>
,
<name>
<surname>Déjean</surname>
<given-names>S.</given-names>
</name>
:
<article-title>Visualising associations between paired ‘omics’ data sets</article-title>
.
<source>BioData Mining</source>
<volume>5</volume>
(
<issue>1</issue>
):
<fpage>19</fpage>
(
<year>2013</year>
).</mixed-citation>
</ref>
</ref-list>
</back>
</pmc>
</record>

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Wicri/Asie/explor/AustralieFrV1/Data/Pmc/Corpus
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 002947  | SxmlIndent | more

Or

HfdSelect -h $EXPLOR_AREA/Data/Pmc/Corpus/biblio.hfd -nk 002947  | SxmlIndent | more

To link to this page within the Wicri network

{{Explor lien
   |wiki=    Wicri/Asie
   |area=    AustralieFrV1
   |flux=    Pmc
   |étape=   Corpus
   |type=    RBID
   |clé=     
   |texte=   
}}

Wicri

This area was generated with Dilib version V0.6.33.
Data generation: Tue Dec 5 10:43:12 2017. Site generation: Tue Mar 5 14:07:20 2024