Cyberinfrastructure exploration server

Warning: this site is under development!
Warning: this site was generated automatically from raw corpora.
The information has therefore not been validated.

Reconstruction of Ribosomal RNA Genes from Metagenomic Data

Internal identifier: 000606 (Pmc/Corpus); previous: 000605; next: 000607

Authors: Lu Fan; Kerensa McElroy; Torsten Thomas

Source :

RBID : PMC:3384625

Abstract

Direct sequencing of environmental DNA (metagenomics) has great potential for describing the 16S rRNA gene diversity of microbial communities. However, current approaches using this 16S rRNA gene information to describe community diversity suffer from low taxonomic resolution or chimera problems. Here we describe a new strategy that involves stringent assembly and data filtering to reconstruct full-length 16S rRNA genes from metagenomic pyrosequencing data. Simulations showed that reconstructed 16S rRNA genes provided a true picture of the community diversity, had minimal rates of chimera formation and gave taxonomic resolution down to genus level. The strategy was furthermore compared to PCR-based methods to determine the microbial diversity in two marine sponges. This showed that about 30% of the abundant phylotypes reconstructed from metagenomic data failed to be amplified by PCR. Our approach is readily applicable to existing metagenomic datasets and is expected to lead to the discovery of new microbial phylotypes.


Url:
DOI: 10.1371/journal.pone.0039948
PubMed: 22761935
PubMed Central: 3384625

Links to Exploration step

PMC:3384625

The document in XML format

<record>
<TEI>
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en">Reconstruction of Ribosomal RNA Genes from Metagenomic Data</title>
<author>
<name sortKey="Fan, Lu" sort="Fan, Lu" uniqKey="Fan L" first="Lu" last="Fan">Lu Fan</name>
<affiliation>
<nlm:aff id="aff1"></nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Mcelroy, Kerensa" sort="Mcelroy, Kerensa" uniqKey="Mcelroy K" first="Kerensa" last="Mcelroy">Kerensa McElroy</name>
<affiliation>
<nlm:aff id="aff1"></nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Thomas, Torsten" sort="Thomas, Torsten" uniqKey="Thomas T" first="Torsten" last="Thomas">Torsten Thomas</name>
<affiliation>
<nlm:aff id="aff1"></nlm:aff>
</affiliation>
</author>
</titleStmt>
<publicationStmt>
<idno type="wicri:source">PMC</idno>
<idno type="pmid">22761935</idno>
<idno type="pmc">3384625</idno>
<idno type="url">http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3384625</idno>
<idno type="RBID">PMC:3384625</idno>
<idno type="doi">10.1371/journal.pone.0039948</idno>
<date when="2012">2012</date>
<idno type="wicri:Area/Pmc/Corpus">000606</idno>
</publicationStmt>
<sourceDesc>
<biblStruct>
<analytic>
<title xml:lang="en" level="a" type="main">Reconstruction of Ribosomal RNA Genes from Metagenomic Data</title>
<author>
<name sortKey="Fan, Lu" sort="Fan, Lu" uniqKey="Fan L" first="Lu" last="Fan">Lu Fan</name>
<affiliation>
<nlm:aff id="aff1"></nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Mcelroy, Kerensa" sort="Mcelroy, Kerensa" uniqKey="Mcelroy K" first="Kerensa" last="Mcelroy">Kerensa McElroy</name>
<affiliation>
<nlm:aff id="aff1"></nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Thomas, Torsten" sort="Thomas, Torsten" uniqKey="Thomas T" first="Torsten" last="Thomas">Torsten Thomas</name>
<affiliation>
<nlm:aff id="aff1"></nlm:aff>
</affiliation>
</author>
</analytic>
<series>
<title level="j">PLoS ONE</title>
<idno type="eISSN">1932-6203</idno>
<imprint>
<date when="2012">2012</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
</fileDesc>
<profileDesc>
<textClass></textClass>
</profileDesc>
</teiHeader>
<front>
<div type="abstract" xml:lang="en">
<p>Direct sequencing of environmental DNA (metagenomics) has great potential for describing the 16S rRNA gene diversity of microbial communities. However, current approaches using this 16S rRNA gene information to describe community diversity suffer from low taxonomic resolution or chimera problems. Here we describe a new strategy that involves stringent assembly and data filtering to reconstruct full-length 16S rRNA genes from metagenomic pyrosequencing data. Simulations showed that reconstructed 16S rRNA genes provided a true picture of the community diversity, had minimal rates of chimera formation and gave taxonomic resolution down to genus level. The strategy was furthermore compared to PCR-based methods to determine the microbial diversity in two marine sponges. This showed that about 30% of the abundant phylotypes reconstructed from metagenomic data failed to be amplified by PCR. Our approach is readily applicable to existing metagenomic datasets and is expected to lead to the discovery of new microbial phylotypes.</p>
</div>
</front>
<back>
<div1 type="bibliography">
<listBibl>
<biblStruct>
<analytic>
<author>
<name sortKey="Pace, Nr" uniqKey="Pace N">NR Pace</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Tringe, Sg" uniqKey="Tringe S">SG Tringe</name>
</author>
<author>
<name sortKey="Hugenholtz, P" uniqKey="Hugenholtz P">P Hugenholtz</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Hong, S" uniqKey="Hong S">S Hong</name>
</author>
<author>
<name sortKey="Bunge, J" uniqKey="Bunge J">J Bunge</name>
</author>
<author>
<name sortKey="Leslin, C" uniqKey="Leslin C">C Leslin</name>
</author>
<author>
<name sortKey="Jeon, S" uniqKey="Jeon S">S Jeon</name>
</author>
<author>
<name sortKey="Epstein, Ss" uniqKey="Epstein S">SS Epstein</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Venter, Jc" uniqKey="Venter J">JC Venter</name>
</author>
<author>
<name sortKey="Remington, K" uniqKey="Remington K">K Remington</name>
</author>
<author>
<name sortKey="Heidelberg, Jf" uniqKey="Heidelberg J">JF Heidelberg</name>
</author>
<author>
<name sortKey="Halpern, Al" uniqKey="Halpern A">AL Halpern</name>
</author>
<author>
<name sortKey="Rusch, D" uniqKey="Rusch D">D Rusch</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Haas, Bj" uniqKey="Haas B">BJ Haas</name>
</author>
<author>
<name sortKey="Gevers, D" uniqKey="Gevers D">D Gevers</name>
</author>
<author>
<name sortKey="Earl, Am" uniqKey="Earl A">AM Earl</name>
</author>
<author>
<name sortKey="Feldgarden, M" uniqKey="Feldgarden M">M Feldgarden</name>
</author>
<author>
<name sortKey="Ward, Dv" uniqKey="Ward D">DV Ward</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Huson, Dh" uniqKey="Huson D">DH Huson</name>
</author>
<author>
<name sortKey="Auch, Af" uniqKey="Auch A">AF Auch</name>
</author>
<author>
<name sortKey="Qi, J" uniqKey="Qi J">J Qi</name>
</author>
<author>
<name sortKey="Schuster, Sc" uniqKey="Schuster S">SC Schuster</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Stark, M" uniqKey="Stark M">M Stark</name>
</author>
<author>
<name sortKey="Berger, Sa" uniqKey="Berger S">SA Berger</name>
</author>
<author>
<name sortKey="Stamatakis, A" uniqKey="Stamatakis A">A Stamatakis</name>
</author>
<author>
<name sortKey="Von Mering, C" uniqKey="Von Mering C">C von Mering</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Wu, M" uniqKey="Wu M">M Wu</name>
</author>
<author>
<name sortKey="Eisen, Ja" uniqKey="Eisen J">JA Eisen</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Liu, B" uniqKey="Liu B">B Liu</name>
</author>
<author>
<name sortKey="Gibbons, T" uniqKey="Gibbons T">T Gibbons</name>
</author>
<author>
<name sortKey="Ghodsi, M" uniqKey="Ghodsi M">M Ghodsi</name>
</author>
<author>
<name sortKey="Treangen, T" uniqKey="Treangen T">T Treangen</name>
</author>
<author>
<name sortKey="Pop, M" uniqKey="Pop M">M Pop</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Teeling, H" uniqKey="Teeling H">H Teeling</name>
</author>
<author>
<name sortKey="Waldmann, J" uniqKey="Waldmann J">J Waldmann</name>
</author>
<author>
<name sortKey="Lombardot, T" uniqKey="Lombardot T">T Lombardot</name>
</author>
<author>
<name sortKey="Bauer, M" uniqKey="Bauer M">M Bauer</name>
</author>
<author>
<name sortKey="Glockner, Fo" uniqKey="Glockner F">FO Glöckner</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Mchardy, Ac" uniqKey="Mchardy A">AC McHardy</name>
</author>
<author>
<name sortKey="Martin, Hg" uniqKey="Martin H">HG Martín</name>
</author>
<author>
<name sortKey="Tsirigos, A" uniqKey="Tsirigos A">A Tsirigos</name>
</author>
<author>
<name sortKey="Hugenholtz, P" uniqKey="Hugenholtz P">P Hugenholtz</name>
</author>
<author>
<name sortKey="Rigoutsos, I" uniqKey="Rigoutsos I">I Rigoutsos</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Brady, A" uniqKey="Brady A">A Brady</name>
</author>
<author>
<name sortKey="Salzberg, Sl" uniqKey="Salzberg S">SL Salzberg</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Saeed, I" uniqKey="Saeed I">I Saeed</name>
</author>
<author>
<name sortKey="Tang, S L" uniqKey="Tang S">S-L Tang</name>
</author>
<author>
<name sortKey="Halgamuge, Sk" uniqKey="Halgamuge S">SK Halgamuge</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Wu, D" uniqKey="Wu D">D Wu</name>
</author>
<author>
<name sortKey="Hugenholtz, P" uniqKey="Hugenholtz P">P Hugenholtz</name>
</author>
<author>
<name sortKey="Mavromatis, K" uniqKey="Mavromatis K">K Mavromatis</name>
</author>
<author>
<name sortKey="Pukall, R" uniqKey="Pukall R">R Pukall</name>
</author>
<author>
<name sortKey="Dalin, E" uniqKey="Dalin E">E Dalin</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Schloss, Pd" uniqKey="Schloss P">PD Schloss</name>
</author>
<author>
<name sortKey="Handelsman, J" uniqKey="Handelsman J">J Handelsman</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Pedr S Ali, C" uniqKey="Pedr S Ali C">C Pedrós-Alió</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Sharpton, Tj" uniqKey="Sharpton T">TJ Sharpton</name>
</author>
<author>
<name sortKey="Riesenfeld, Sj" uniqKey="Riesenfeld S">SJ Riesenfeld</name>
</author>
<author>
<name sortKey="Kembel, Sw" uniqKey="Kembel S">SW Kembel</name>
</author>
<author>
<name sortKey="Ladau, J" uniqKey="Ladau J">J Ladau</name>
</author>
<author>
<name sortKey="O Dwyer, Jp" uniqKey="O Dwyer J">JP O'Dwyer</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Rusch, Db" uniqKey="Rusch D">DB Rusch</name>
</author>
<author>
<name sortKey="Halpern, Al" uniqKey="Halpern A">AL Halpern</name>
</author>
<author>
<name sortKey="Sutton, G" uniqKey="Sutton G">G Sutton</name>
</author>
<author>
<name sortKey="Heidelberg, Kb" uniqKey="Heidelberg K">KB Heidelberg</name>
</author>
<author>
<name sortKey="Williamson, S" uniqKey="Williamson S">S Williamson</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Miller, Cs" uniqKey="Miller C">CS Miller</name>
</author>
<author>
<name sortKey="Baker, Bj" uniqKey="Baker B">BJ Baker</name>
</author>
<author>
<name sortKey="Thomas, Bc" uniqKey="Thomas B">BC Thomas</name>
</author>
<author>
<name sortKey="Singer, Sw" uniqKey="Singer S">SW Singer</name>
</author>
<author>
<name sortKey="Banfield, Jf" uniqKey="Banfield J">JF Banfield</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Schloss, Pd" uniqKey="Schloss P">PD Schloss</name>
</author>
<author>
<name sortKey="Handelsman, J" uniqKey="Handelsman J">J Handelsman</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Margulies, M" uniqKey="Margulies M">M Margulies</name>
</author>
<author>
<name sortKey="Egholm, M" uniqKey="Egholm M">M Egholm</name>
</author>
<author>
<name sortKey="Altman, We" uniqKey="Altman W">WE Altman</name>
</author>
<author>
<name sortKey="Attiya, S" uniqKey="Attiya S">S Attiya</name>
</author>
<author>
<name sortKey="Bader, Js" uniqKey="Bader J">JS Bader</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Mavromatis, K" uniqKey="Mavromatis K">K Mavromatis</name>
</author>
<author>
<name sortKey="Ivanova, N" uniqKey="Ivanova N">N Ivanova</name>
</author>
<author>
<name sortKey="Barry, K" uniqKey="Barry K">K Barry</name>
</author>
<author>
<name sortKey="Shapiro, H" uniqKey="Shapiro H">H Shapiro</name>
</author>
<author>
<name sortKey="Goltsman, E" uniqKey="Goltsman E">E Goltsman</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Mcelroy, Ke" uniqKey="Mcelroy K">KE McElroy</name>
</author>
<author>
<name sortKey="Luciani, F" uniqKey="Luciani F">F Luciani</name>
</author>
<author>
<name sortKey="Thomas, T" uniqKey="Thomas T">T Thomas</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Fan, L" uniqKey="Fan L">L Fan</name>
</author>
<author>
<name sortKey="Reynolds, D" uniqKey="Reynolds D">D Reynolds</name>
</author>
<author>
<name sortKey="Liu, M" uniqKey="Liu M">M Liu</name>
</author>
<author>
<name sortKey="Stark, M" uniqKey="Stark M">M Stark</name>
</author>
<author>
<name sortKey="Kjelleberg, S" uniqKey="Kjelleberg S">S Kjelleberg</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Schmieder, R" uniqKey="Schmieder R">R Schmieder</name>
</author>
<author>
<name sortKey="Edwards, R" uniqKey="Edwards R">R Edwards</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Bengtsson, J" uniqKey="Bengtsson J">J Bengtsson</name>
</author>
<author>
<name sortKey="Eriksson, Km" uniqKey="Eriksson K">KM Eriksson</name>
</author>
<author>
<name sortKey="Hartmann, M" uniqKey="Hartmann M">M Hartmann</name>
</author>
<author>
<name sortKey="Wang, Z" uniqKey="Wang Z">Z Wang</name>
</author>
<author>
<name sortKey="Shenoy, Bd" uniqKey="Shenoy B">BD Shenoy</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Radax, R" uniqKey="Radax R">R Radax</name>
</author>
<author>
<name sortKey="Rattei, T" uniqKey="Rattei T">T Rattei</name>
</author>
<author>
<name sortKey="Lanzen, A" uniqKey="Lanzen A">A Lanzen</name>
</author>
<author>
<name sortKey="Bayer, C" uniqKey="Bayer C">C Bayer</name>
</author>
<author>
<name sortKey="Rapp, Ht" uniqKey="Rapp H">HT Rapp</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Pruesse, E" uniqKey="Pruesse E">E Pruesse</name>
</author>
<author>
<name sortKey="Quast, C" uniqKey="Quast C">C Quast</name>
</author>
<author>
<name sortKey="Knittel, K" uniqKey="Knittel K">K Knittel</name>
</author>
<author>
<name sortKey="Fuchs, Bm" uniqKey="Fuchs B">BM Fuchs</name>
</author>
<author>
<name sortKey="Ludwig, W" uniqKey="Ludwig W">W Ludwig</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Dowd, Se" uniqKey="Dowd S">SE Dowd</name>
</author>
<author>
<name sortKey="Callaway, Tr" uniqKey="Callaway T">TR Callaway</name>
</author>
<author>
<name sortKey="Wolcott, Rd" uniqKey="Wolcott R">RD Wolcott</name>
</author>
<author>
<name sortKey="Sun, Y" uniqKey="Sun Y">Y Sun</name>
</author>
<author>
<name sortKey="Mckeehan, T" uniqKey="Mckeehan T">T McKeehan</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Schloss Gevers, Pda" uniqKey="Schloss Gevers P">PDA Schloss, Gevers</name>
</author>
<author>
<name sortKey="Westcott, Da" uniqKey="Westcott D">DA Westcott</name>
</author>
<author>
<name sortKey="Sarah, S" uniqKey="Sarah S">S Sarah</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Caporaso, Jg" uniqKey="Caporaso J">JG Caporaso</name>
</author>
<author>
<name sortKey="Kuczynski, J" uniqKey="Kuczynski J">J Kuczynski</name>
</author>
<author>
<name sortKey="Stombaugh, J" uniqKey="Stombaugh J">J Stombaugh</name>
</author>
<author>
<name sortKey="Bittinger, K" uniqKey="Bittinger K">K Bittinger</name>
</author>
<author>
<name sortKey="Bushman, Fd" uniqKey="Bushman F">FD Bushman</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Wang, Q" uniqKey="Wang Q">Q Wang</name>
</author>
<author>
<name sortKey="Garrity, Gm" uniqKey="Garrity G">GM Garrity</name>
</author>
<author>
<name sortKey="Tiedje, Jm" uniqKey="Tiedje J">JM Tiedje</name>
</author>
<author>
<name sortKey="Cole, Jr" uniqKey="Cole J">JR Cole</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Mcdonald, D" uniqKey="Mcdonald D">D McDonald</name>
</author>
<author>
<name sortKey="Price, Mn" uniqKey="Price M">MN Price</name>
</author>
<author>
<name sortKey="Goodrich, J" uniqKey="Goodrich J">J Goodrich</name>
</author>
<author>
<name sortKey="Nawrocki, Ep" uniqKey="Nawrocki E">EP Nawrocki</name>
</author>
<author>
<name sortKey="Desantis, Tz" uniqKey="Desantis T">TZ Desantis</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Stamatakis, A" uniqKey="Stamatakis A">A Stamatakis</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Talavera, G" uniqKey="Talavera G">G Talavera</name>
</author>
<author>
<name sortKey="Castresana, J" uniqKey="Castresana J">J Castresana</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Edgar, Rc" uniqKey="Edgar R">RC Edgar</name>
</author>
<author>
<name sortKey="Haas, Bj" uniqKey="Haas B">BJ Haas</name>
</author>
<author>
<name sortKey="Clemente, Jc" uniqKey="Clemente J">JC Clemente</name>
</author>
<author>
<name sortKey="Quince, C" uniqKey="Quince C">C Quince</name>
</author>
<author>
<name sortKey="Knight, R" uniqKey="Knight R">R Knight</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Ashelford, Ke" uniqKey="Ashelford K">KE Ashelford</name>
</author>
<author>
<name sortKey="Chuzhanova, Na" uniqKey="Chuzhanova N">NA Chuzhanova</name>
</author>
<author>
<name sortKey="Fry, Jc" uniqKey="Fry J">JC Fry</name>
</author>
<author>
<name sortKey="Jones, Aj" uniqKey="Jones A">AJ Jones</name>
</author>
<author>
<name sortKey="Weightman, Aj" uniqKey="Weightman A">AJ Weightman</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Huber, T" uniqKey="Huber T">T Huber</name>
</author>
<author>
<name sortKey="Faulkner, G" uniqKey="Faulkner G">G Faulkner</name>
</author>
<author>
<name sortKey="Hugenholtz, P" uniqKey="Hugenholtz P">P Hugenholtz</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Ashelford, Ke" uniqKey="Ashelford K">KE Ashelford</name>
</author>
<author>
<name sortKey="Chuzhanova, Na" uniqKey="Chuzhanova N">NA Chuzhanova</name>
</author>
<author>
<name sortKey="Fry, Jc" uniqKey="Fry J">JC Fry</name>
</author>
<author>
<name sortKey="Jones, Aj" uniqKey="Jones A">AJ Jones</name>
</author>
<author>
<name sortKey="Weightman, Aj" uniqKey="Weightman A">AJ Weightman</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Quince, C" uniqKey="Quince C">C Quince</name>
</author>
<author>
<name sortKey="Lanzen, A" uniqKey="Lanzen A">A Lanzén</name>
</author>
<author>
<name sortKey="Curtis, Tp" uniqKey="Curtis T">TP Curtis</name>
</author>
<author>
<name sortKey="Davenport, Rj" uniqKey="Davenport R">RJ Davenport</name>
</author>
<author>
<name sortKey="Hall, N" uniqKey="Hall N">N Hall</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Taylor, Mw" uniqKey="Taylor M">MW Taylor</name>
</author>
<author>
<name sortKey="Radax, R" uniqKey="Radax R">R Radax</name>
</author>
<author>
<name sortKey="Steger, D" uniqKey="Steger D">D Steger</name>
</author>
<author>
<name sortKey="Wagner, M" uniqKey="Wagner M">M Wagner</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Schmitt, S" uniqKey="Schmitt S">S Schmitt</name>
</author>
<author>
<name sortKey="Tsai, P" uniqKey="Tsai P">P Tsai</name>
</author>
<author>
<name sortKey="Bell, J" uniqKey="Bell J">J Bell</name>
</author>
<author>
<name sortKey="Fromont, J" uniqKey="Fromont J">J Fromont</name>
</author>
<author>
<name sortKey="Ilan, M" uniqKey="Ilan M">M Ilan</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Sogin, Ml" uniqKey="Sogin M">ML Sogin</name>
</author>
<author>
<name sortKey="Morrison, Hg" uniqKey="Morrison H">HG Morrison</name>
</author>
<author>
<name sortKey="Huber, Ja" uniqKey="Huber J">JA Huber</name>
</author>
<author>
<name sortKey="Mark Welch, D" uniqKey="Mark Welch D">D Mark Welch</name>
</author>
<author>
<name sortKey="Huse, Sm" uniqKey="Huse S">SM Huse</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Quince, C" uniqKey="Quince C">C Quince</name>
</author>
<author>
<name sortKey="Lanzen, A" uniqKey="Lanzen A">A Lanzen</name>
</author>
<author>
<name sortKey="Davenport, Rj" uniqKey="Davenport R">RJ Davenport</name>
</author>
<author>
<name sortKey="Turnbaugh, Pj" uniqKey="Turnbaugh P">PJ Turnbaugh</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Thomas, T" uniqKey="Thomas T">T Thomas</name>
</author>
<author>
<name sortKey="Rusch, D" uniqKey="Rusch D">D Rusch</name>
</author>
<author>
<name sortKey="Demaere, Mz" uniqKey="Demaere M">MZ Demaere</name>
</author>
<author>
<name sortKey="Yung, Py" uniqKey="Yung P">PY Yung</name>
</author>
<author>
<name sortKey="Lewis, M" uniqKey="Lewis M">M Lewis</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Liu, M" uniqKey="Liu M">M Liu</name>
</author>
<author>
<name sortKey="Fan, L" uniqKey="Fan L">L Fan</name>
</author>
<author>
<name sortKey="Zhong, L" uniqKey="Zhong L">L Zhong</name>
</author>
<author>
<name sortKey="Kjelleberg, S" uniqKey="Kjelleberg S">S Kjelleberg</name>
</author>
<author>
<name sortKey="Thomas, T" uniqKey="Thomas T">T Thomas</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Taylor, Mw" uniqKey="Taylor M">MW Taylor</name>
</author>
<author>
<name sortKey="Schupp, Pj" uniqKey="Schupp P">PJ Schupp</name>
</author>
<author>
<name sortKey="Dahllof, I" uniqKey="Dahllof I">I Dahllöf</name>
</author>
<author>
<name sortKey="Kjelleberg, S" uniqKey="Kjelleberg S">S Kjelleberg</name>
</author>
<author>
<name sortKey="Steinberg, Pd" uniqKey="Steinberg P">PD Steinberg</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Taylor, Mw" uniqKey="Taylor M">MW Taylor</name>
</author>
<author>
<name sortKey="Schupp, Pj" uniqKey="Schupp P">PJ Schupp</name>
</author>
<author>
<name sortKey="De Nys, R" uniqKey="De Nys R">R de Nys</name>
</author>
<author>
<name sortKey="Kjelleberg, S" uniqKey="Kjelleberg S">S Kjelleberg</name>
</author>
<author>
<name sortKey="Steinberg, Pd" uniqKey="Steinberg P">PD Steinberg</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Yung, Py" uniqKey="Yung P">PY Yung</name>
</author>
<author>
<name sortKey="Burke, C" uniqKey="Burke C">C Burke</name>
</author>
<author>
<name sortKey="Lewis, M" uniqKey="Lewis M">M Lewis</name>
</author>
<author>
<name sortKey="Egan, S" uniqKey="Egan S">S Egan</name>
</author>
<author>
<name sortKey="Kjelleberg, S" uniqKey="Kjelleberg S">S Kjelleberg</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Simon, C" uniqKey="Simon C">C Simon</name>
</author>
<author>
<name sortKey="Daniel, R" uniqKey="Daniel R">R Daniel</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Thomas, T" uniqKey="Thomas T">T Thomas</name>
</author>
<author>
<name sortKey="Gilbert, J" uniqKey="Gilbert J">J Gilbert</name>
</author>
<author>
<name sortKey="Meyer, F" uniqKey="Meyer F">F Meyer</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Peterson, J" uniqKey="Peterson J">J Peterson</name>
</author>
<author>
<name sortKey="Garges, S" uniqKey="Garges S">S Garges</name>
</author>
<author>
<name sortKey="Giovanni, M" uniqKey="Giovanni M">M Giovanni</name>
</author>
<author>
<name sortKey="Mcinnes, P" uniqKey="Mcinnes P">P McInnes</name>
</author>
<author>
<name sortKey="Wang, L" uniqKey="Wang L">L Wang</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Siqueira, Jf" uniqKey="Siqueira J">JF Siqueira</name>
</author>
<author>
<name sortKey="Fouad, Af" uniqKey="Fouad A">AF Fouad</name>
</author>
<author>
<name sortKey="Rocas, In" uniqKey="Rocas I">IN Rôças</name>
</author>
</analytic>
</biblStruct>
</listBibl>
</div1>
</back>
</TEI>
<pmc article-type="research-article">
<pmc-dir>properties open_access</pmc-dir>
<front>
<journal-meta>
<journal-id journal-id-type="nlm-ta">PLoS One</journal-id>
<journal-id journal-id-type="iso-abbrev">PLoS ONE</journal-id>
<journal-id journal-id-type="publisher-id">plos</journal-id>
<journal-id journal-id-type="pmc">plosone</journal-id>
<journal-title-group>
<journal-title>PLoS ONE</journal-title>
</journal-title-group>
<issn pub-type="epub">1932-6203</issn>
<publisher>
<publisher-name>Public Library of Science</publisher-name>
<publisher-loc>San Francisco, USA</publisher-loc>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="pmid">22761935</article-id>
<article-id pub-id-type="pmc">3384625</article-id>
<article-id pub-id-type="publisher-id">PONE-D-12-14610</article-id>
<article-id pub-id-type="doi">10.1371/journal.pone.0039948</article-id>
<article-categories>
<subj-group subj-group-type="heading">
<subject>Research Article</subject>
</subj-group>
<subj-group subj-group-type="Discipline-v2">
<subject>Biology</subject>
<subj-group>
<subject>Computational Biology</subject>
<subj-group>
<subject>Sequence Analysis</subject>
</subj-group>
</subj-group>
<subj-group>
<subject>Ecology</subject>
<subj-group>
<subject>Microbial Ecology</subject>
</subj-group>
</subj-group>
<subj-group>
<subject>Evolutionary Biology</subject>
<subj-group>
<subject>Evolutionary Systematics</subject>
<subj-group>
<subject>Taxonomy</subject>
<subj-group>
<subject>Microbial Taxonomy</subject>
</subj-group>
</subj-group>
</subj-group>
</subj-group>
<subj-group>
<subject>Genomics</subject>
<subj-group>
<subject>Metagenomics</subject>
</subj-group>
</subj-group>
<subj-group>
<subject>Microbiology</subject>
<subj-group>
<subject>Microbial Ecology</subject>
</subj-group>
</subj-group>
</subj-group>
</article-categories>
<title-group>
<article-title>Reconstruction of Ribosomal RNA Genes from Metagenomic Data</article-title>
<alt-title alt-title-type="running-head">16S rRNA Genes in Metagenomes</alt-title>
</title-group>
<contrib-group>
<contrib contrib-type="author">
<name>
<surname>Fan</surname>
<given-names>Lu</given-names>
</name>
<xref ref-type="aff" rid="aff1"></xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>McElroy</surname>
<given-names>Kerensa</given-names>
</name>
<xref ref-type="aff" rid="aff1"></xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Thomas</surname>
<given-names>Torsten</given-names>
</name>
<xref ref-type="aff" rid="aff1"></xref>
<xref ref-type="corresp" rid="cor1">
<sup>*</sup>
</xref>
</contrib>
</contrib-group>
<aff id="aff1">
<addr-line>School of Biotechnology and Biomolecular Sciences and Centre for Marine Bio-Innovation, University of New South Wales, Sydney, New South Wales, Australia</addr-line>
</aff>
<contrib-group>
<contrib contrib-type="editor">
<name>
<surname>Rodriguez-Valera</surname>
<given-names>Francisco</given-names>
</name>
<role>Editor</role>
<xref ref-type="aff" rid="edit1"></xref>
</contrib>
</contrib-group>
<aff id="edit1">Universidad Miguel Hernandez, Spain</aff>
<author-notes>
<corresp id="cor1">* E-mail:
<email>t.thomas@unsw.edu.au</email>
</corresp>
<fn fn-type="con">
<p>Conceived and designed the experiments: LF TT. Performed the experiments: LF. Analyzed the data: LF TT. Contributed reagents/materials/analysis tools: LF KM. Wrote the paper: LF KM TT.</p>
</fn>
</author-notes>
<pub-date pub-type="collection">
<year>2012</year>
</pub-date>
<pub-date pub-type="epub">
<day>27</day>
<month>6</month>
<year>2012</year>
</pub-date>
<volume>7</volume>
<issue>6</issue>
<elocation-id>e39948</elocation-id>
<history>
<date date-type="received">
<day>22</day>
<month>5</month>
<year>2012</year>
</date>
<date date-type="accepted">
<day>29</day>
<month>5</month>
<year>2012</year>
</date>
</history>
<permissions>
<copyright-statement>Fan et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.</copyright-statement>
<copyright-year>2012</copyright-year>
</permissions>
<abstract>
<p>Direct sequencing of environmental DNA (metagenomics) has great potential for describing the 16S rRNA gene diversity of microbial communities. However, current approaches using this 16S rRNA gene information to describe community diversity suffer from low taxonomic resolution or chimera problems. Here we describe a new strategy that involves stringent assembly and data filtering to reconstruct full-length 16S rRNA genes from metagenomic pyrosequencing data. Simulations showed that reconstructed 16S rRNA genes provided a true picture of the community diversity, had minimal rates of chimera formation and gave taxonomic resolution down to genus level. The strategy was furthermore compared to PCR-based methods to determine the microbial diversity in two marine sponges. This showed that about 30% of the abundant phylotypes reconstructed from metagenomic data failed to be amplified by PCR. Our approach is readily applicable to existing metagenomic datasets and is expected to lead to the discovery of new microbial phylotypes.</p>
</abstract>
<counts>
<page-count count="9"></page-count>
</counts>
</article-meta>
</front>
<body>
<sec id="s1">
<title>Introduction</title>
<p>Microorganisms are vital components of our planet's ecosystems. PCR amplification and sequencing of 16S ribosomal RNA (16S rRNA) genes directly from environmental samples has over the last two decades revealed an astonishing amount of new microbial diversity
<xref ref-type="bibr" rid="pone.0039948-Pace1">[1]</xref>
,
<xref ref-type="bibr" rid="pone.0039948-Tringe1">[2]</xref>
. However, because the ‘universal’ primers used in PCR are designed from already known groups of organisms, a skewed picture of community composition is likely to be obtained, especially for environmental samples containing divergent microbial lineages
<xref ref-type="bibr" rid="pone.0039948-Hong1">[3]</xref>
.</p>
<p>Direct sequencing of total environmental DNA (metagenomics) has the potential to assess the true diversity of the environment without primer bias
<xref ref-type="bibr" rid="pone.0039948-Venter1">[4]</xref>
,
<xref ref-type="bibr" rid="pone.0039948-Haas1">[5]</xref>
. Metagenomic sequences can be assigned to taxa by comparing them with reference genomes, on the basis of either sequence similarity
<xref ref-type="bibr" rid="pone.0039948-Huson1">[6]</xref>
–
<xref ref-type="bibr" rid="pone.0039948-Liu1">[9]</xref>
or genomic composition
<xref ref-type="bibr" rid="pone.0039948-Teeling1">[10]</xref>
–
<xref ref-type="bibr" rid="pone.0039948-Saeed1">[13]</xref>
. However, these types of assignments are only informative when the genomes of closely related taxa are present in the reference set. As reference genomes are only available for a limited part of the phylogenetic tree of life
<xref ref-type="bibr" rid="pone.0039948-Wu2">[14]</xref>
, these taxonomic predictions are generally of low resolution (e.g. phylum or order) and hence often give only an unsatisfactory description of community composition.</p>
<p>In contrast, several comprehensive databases exist for the 16S rRNA gene that provide detailed phylogenetic trees
<xref ref-type="bibr" rid="pone.0039948-Schloss1">[15]</xref>
and allow for taxonomic resolution down to the species level
<xref ref-type="bibr" rid="pone.0039948-PedrsAli1">[16]</xref>
. Shotgun metagenomic datasets obviously also contain fragmented 16S rRNA genes and these have been directly assigned to taxa through BLAST-based comparisons
<xref ref-type="bibr" rid="pone.0039948-Venter1">[4]</xref>
or phylogenetic distance-based clustering
<xref ref-type="bibr" rid="pone.0039948-Sharpton1">[17]</xref>
. However, because metagenomic reads are short and randomly sampled, they may not cover the phylogenetically most informative regions of the 16S rRNA gene, which diminishes the efficiency of taxonomic assignments. Sequence assembly can potentially increase the length of the 16S rRNA gene sequences recovered
<xref ref-type="bibr" rid="pone.0039948-Rusch1">[18]</xref>
, but low sequence coverage may limit assembly success for 16S rRNA genes and low-stringency assemblies may result in chimeric sequences
<xref ref-type="bibr" rid="pone.0039948-Miller1">[19]</xref>
,
<xref ref-type="bibr" rid="pone.0039948-Schloss2">[20]</xref>
. The recently released EMIRGE software uses iterative mapping of short Illumina reads against reference sequences to reconstruct 16S rRNA genes
<xref ref-type="bibr" rid="pone.0039948-Miller1">[19]</xref>
. Although this approach is accurate down to single-nucleotide differences, its potential to avoid chimeras is strongly dependent on the quality of the reference database. Further, EMIRGE's algorithm is currently not designed for pyrosequencing reads, which contain high rates of insertion and deletion errors (e.g. in homopolymers)
<xref ref-type="bibr" rid="pone.0039948-Margulies1">[21]</xref>
. There is thus a need for an approach that reconstructs 16S rRNA genes with high accuracy from pyrosequencing data.</p>
<p>In the present study, we describe a strategy to reconstruct nearly full-length 16S rRNA sequences from metagenomic pyrosequencing data. Through simulation of communities with different diversities, we developed a process of stringent assembly and data filtering that generates 16S rRNA contigs with minimal chimera rates. We then applied our process to assess the microbial symbiont communities of two marine sponge species and compared the outcome to PCR-based assessments of the community structure (pyro-tag-sequencing). We show that about 30% of the abundant phylotypes reconstructed from metagenomic reads failed to be amplified by PCR, which is most likely due to primer mismatches.</p>
</sec>
<sec sec-type="materials|methods" id="s2">
<title>Materials and Methods</title>
<sec id="s2a">
<title>Simulated metagenomes and metagenomic samples</title>
<p>Ninety completed genomes (76 bacteria and 14 archaea) were selected as references and combined using established profiles of community diversity with high (HC), medium (MC) and low (LC) complexity
<xref ref-type="bibr" rid="pone.0039948-Mavromatis1">[22]</xref>
(
<xref ref-type="supplementary-material" rid="pone.0039948.s001">Table S1</xref>
). Genomic sequences, 16S rRNA gene sequences and gene copy number per genome were obtained from the Integrated Microbial Genomes website (
<ext-link ext-link-type="uri" xlink:href="http://img.jgi.doe.gov/cgi-bin/w/main.cgi">http://img.jgi.doe.gov/cgi-bin/w/main.cgi</ext-link>
). Heterogeneous 16S rRNA genes within a genome were considered separately. For each metagenome complexity, three read datasets (1,000,000 reads each, 350 nt) were simulated using empirically derived and context-based error models (GemSIM software
<xref ref-type="bibr" rid="pone.0039948-McElroy1">[23]</xref>
).</p>
<p>Three environmental DNA samples for each of the two sponges
<italic>Cymbastela concentrica</italic>
and
<italic>C. coralliophila</italic>
were obtained as described in ref.
<xref ref-type="bibr" rid="pone.0039948-Fan1">[24]</xref>
. Shotgun pyrosequencing (454 Titanium) was conducted at the J. Craig Venter Institute, Rockville, USA, and the resulting average read length corresponded to that of the simulated datasets above. The shotgun sequencing data are available through the Community Cyberinfrastructure for Advanced Microbial Ecology Research and Analysis website (
<ext-link ext-link-type="uri" xlink:href="http://camera.calit2.net/">http://camera.calit2.net/</ext-link>
) under project accession ‘CAM_PROJ_BotanyBay’.</p>
</sec>
<sec id="s2b">
<title>Reconstruction of 16S rRNA gene sequences</title>
<p>The metagenomic reads of the simulated communities and the sponge microbial communities were pre-processed with PrinSeq
<xref ref-type="bibr" rid="pone.0039948-Schmieder1">[25]</xref>
using the settings ‘(“minlen”:“60”,“maxlen”:“700”,“minqualm”:“20”,“nsmaxp”:“1”,“complval”:“50”, “noniupac”:“true”,“derep0”:“true”,“derep1”:“true”,“complmethod”:“2”,“trimtails”:“6”,“trimns”:“1”,“trimscore”:“15”,“trimwindow”:“2”,“trimstep”:“1”,“tailsite”:“1”,“trimsite”:“3”,“trimtype”:“2”,“trimrule”:“1”)’. Metaxa (version 1.0.2)
<xref ref-type="bibr" rid="pone.0039948-Bengtsson1">[26]</xref>
was then used to identify reads containing 16S rRNA sequences. Reads (>300 nt) from triplicates were then pooled and assembled with the GS De Novo Assembler 2.3 (454 Life Sciences, Branford, CT) using the ‘cDNA’ option, which is optimized for the uneven and high coverage typically expected in RNA assemblies. Default settings were used except for ‘overlap identity’, which was set to 99%. Additionally, ‘reads limited to one contig’ and ‘extending low depth overlaps’ were selected. The 99% cut-off was chosen to allow overlap of reads with a 1% error, which is typically seen towards the end of pyrosequencing reads
<xref ref-type="bibr" rid="pone.0039948-McElroy1">[23]</xref>
. Lower stringency (e.g. 97% as used by Radax
<italic>et al.</italic>
during the assembly of 16S rRNA genes
<xref ref-type="bibr" rid="pone.0039948-Radax1">[27]</xref>
) resulted in unacceptable rates of chimera formation (data not shown). After aligning contigs to the SILVA 1.08 database by SINA
<xref ref-type="bibr" rid="pone.0039948-Pruesse1">[28]</xref>
, flanking regions that were not part of the 16S rRNA gene sequences were removed. The resulting contigs were then examined for chimerism. If a contig comprised reads from more than one strain and any of these strains had less than 99% sequence identity to the other strains, it was considered a chimera.</p>
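<p>The chimera criterion for the simulated data can be illustrated with a minimal Python sketch. It is not the code used in this study: the function name, the strain labels and the identity table are hypothetical, and "any of these strains" is read here as "any pair of contributing strains", which is an assumption.</p>
<preformat><![CDATA[
```python
from itertools import combinations


def is_chimera(read_strains, identity, threshold=0.99):
    """Flag a contig whose reads come from strains below `threshold` identity.

    read_strains: source strain label of every read in the contig
                  (known for simulated reads).
    identity:     dict mapping frozenset({strain_a, strain_b}) to the pairwise
                  16S rRNA gene identity as a fraction (hypothetical input).
    """
    strains = set(read_strains)
    if len(strains) < 2:
        # All reads come from a single strain, so the contig cannot be chimeric.
        return False
    # Assumption: one pair of strains below the threshold is enough.
    return any(identity[frozenset(pair)] < threshold
               for pair in combinations(sorted(strains), 2))


# Hypothetical pairwise identities between three strains.
identity = {
    frozenset({"A", "B"}): 0.995,
    frozenset({"A", "C"}): 0.93,
    frozenset({"B", "C"}): 0.94,
}
print(is_chimera(["A", "A", "B"], identity))  # strains A and B are 99.5% identical
print(is_chimera(["A", "C", "A"], identity))  # strains A and C are 93% identical
```
]]></preformat>
<p>In this sketch a contig mixing strains A and B is not flagged, whereas a single read from strain C is enough to flag a contig otherwise built from strain A.</p>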
</sec>
<sec id="s2c">
<title>Pyrosequencing of 16S rRNA genes amplified by PCR</title>
<p>Amplification of the 16S rRNA gene was performed on the same DNA sample as used for shotgun sequencing. Primers 28F ‘GAGTTTGATCNTGGCTCAG’ and 519R ‘GTNTTACNGCGGCKGCTG’ were used for amplification of the variable regions V1-3. PCR and subsequent sequencing are described in Dowd
<italic>et al.</italic>
2008
<xref ref-type="bibr" rid="pone.0039948-Dowd1">[29]</xref>
and were performed at the Research and Testing Laboratory (Lubbock, USA). Trace data were deposited in the NCBI Sequence Read Archive under the project accession SRP011939.</p>
<p>Analysis of the 16S rRNA tag-sequencing data was performed using Mothur v1.23.1
<xref ref-type="bibr" rid="pone.0039948-Schloss3">[30]</xref>
. Specifically, ‘shhh.flows’ was used for de-noising, ‘trim.seqs (pdiffs = 2, bdiffs = 1, maxhomop = 8, minlength = 200)’ was used for barcode removal and quality filtering, SINA was used for sequence alignment with the SILVA 1.08 database
<xref ref-type="bibr" rid="pone.0039948-Pruesse1">[28]</xref>
, ‘screen.seqs(start = 1048, minlength = 245)’ and ‘filter.seqs (vertical = T, trump = .)’ were used for alignment quality filtering, ‘pre.cluster(diffs = 2)’ was used for further error reduction, ‘chimera.uchime’ was used for
<italic>de novo</italic>
removal of chimeric reads, and Metaxa (version 1.0.2)
<xref ref-type="bibr" rid="pone.0039948-Bengtsson1">[26]</xref>
was used to remove mitochondrial and chloroplast sequences.</p>
</sec>
<sec id="s2d">
<title>Operational taxonomic unit (OTU) analysis</title>
<p>For simulated data, filtered 16S rRNA contigs (with coverage of more than 10 reads and length greater than 700 nt; see below) and 16S rRNA reads not in contigs were pooled with the 16S rRNA sequences of the reference genomes used for simulation. Redundancy within these pools was removed with CD-HIT (99% identity cut-off). PhylOTU
<xref ref-type="bibr" rid="pone.0039948-Sharpton1">[17]</xref>
was then used to generate OTUs with phylogenetic distance cut-offs of 0.01, 0.03 and 0.05. OTUs containing both reference sequences and simulated shotgun sequences (filtered contigs or reads) were assigned as ‘recovered’. OTUs containing only reference sequences were termed ‘missed’, while those containing only shotgun sequences were assigned as ‘artificial’. OTU coverage was defined as the number of reads contained in each OTU. For the sponge samples, filtered 16S rRNA contigs (with coverage of more than 10 reads and length greater than 700 nt) and 16S rRNA reads not in contigs were pooled with PCR-amplified tag-sequences and then processed as above to generate OTUs. Diversity analysis was performed with QIIME
<xref ref-type="bibr" rid="pone.0039948-Caporaso1">[31]</xref>
and phylogenetic distance-based rarefaction was based on the tree of non-redundant sequences generated during the PhylOTU process.</p>
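<p>The three OTU categories defined above can be expressed as a short Python sketch. The data structure (a dictionary of OTU members tagged by source) and the function name are hypothetical and do not reflect the output format of PhylOTU.</p>
<preformat><![CDATA[
```python
def classify_otus(otus):
    """Label each OTU as 'recovered', 'missed' or 'artificial'.

    otus: dict mapping an OTU id to a list of (sequence_id, source) tuples,
          where source is "reference" or "shotgun" (hypothetical layout).
    """
    labels = {}
    for otu_id, members in otus.items():
        sources = {source for _, source in members}
        if sources == {"reference", "shotgun"}:
            labels[otu_id] = "recovered"   # reference and shotgun sequences
        elif sources == {"reference"}:
            labels[otu_id] = "missed"      # reference sequences only
        elif sources == {"shotgun"}:
            labels[otu_id] = "artificial"  # shotgun sequences only
        else:
            raise ValueError(f"OTU {otu_id!r} is empty or has an unknown source")
    return labels


example = {
    "otu1": [("ref_A", "reference"), ("contig_7", "shotgun")],
    "otu2": [("ref_B", "reference")],
    "otu3": [("read_42", "shotgun")],
}
print(classify_otus(example))
```
]]></preformat>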
</sec>
<sec id="s2e">
<title>Taxonomic classification and phylogenetic analysis</title>
<p>16S rRNA classification was performed with the RDP Classifier 2.3
<xref ref-type="bibr" rid="pone.0039948-Wang1">[32]</xref>
, except for the classification of the abundant OTUs in sponge samples, which was performed with the Greengenes Classifier (March 6, 2012)
<xref ref-type="bibr" rid="pone.0039948-McDonald1">[33]</xref>
followed by manual examination. Single-copy gene based analysis was performed using MLTreeMap (version 2.05, ‘minimal sequence length after Gblocks’ set to 35)
<xref ref-type="bibr" rid="pone.0039948-Stark1">[7]</xref>
. For phylogenetic analysis, Maximum-Likelihood trees of the 16S rRNA gene contigs were constructed using RAxML
<xref ref-type="bibr" rid="pone.0039948-Stamatakis1">[34]</xref>
after alignment by SINA and removal of ambiguous positions by Gblocks (−t = d −b4 = 5 −b5 = h)
<xref ref-type="bibr" rid="pone.0039948-Talavera1">[35]</xref>
.</p>
</sec>
</sec>
<sec id="s3">
<title>Results</title>
<sec id="s3a">
<title>16S rRNA gene assembly with minimal chimera formation</title>
<p>As chimera formation was a major issue in previous assembly approaches
<xref ref-type="bibr" rid="pone.0039948-Rusch1">[18]</xref>
,
<xref ref-type="bibr" rid="pone.0039948-Miller1">[19]</xref>
,
<xref ref-type="bibr" rid="pone.0039948-Radax1">[27]</xref>
, we first examined the occurrence of chimeric 16S rRNA contigs in our assembly strategy on simulated datasets (see
<xref ref-type="supplementary-material" rid="pone.0039948.s002">Materials and Methods</xref>
). In total, 9,931 reads (0.11%) containing 16S rRNA gene information were detected among 8,997,875 shotgun reads after quality filtering (
<xref ref-type="table" rid="pone-0039948-t001">Table 1</xref>
). After applying our assembly strategy, we recovered between 125 and 130 contigs containing full or partial 16S rRNA genes (
<xref ref-type="table" rid="pone-0039948-t001">Table 1</xref>
).</p>
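<p>The totals quoted above follow directly from the per-sample counts in Table 1, as the following short Python check illustrates. The variable names are ours; the numbers are copied from the table.</p>
<preformat><![CDATA[
```python
# Per-sample counts from Table 1, in the order HC-A..C, MC-A..C, LC-A..C.
reads_after_filtering = [999913, 999909, 999912,
                         999703, 999775, 999769,
                         999603, 999606, 999685]
rrna_reads = [1303, 1353, 1376,
              984, 1112, 1153,
              874, 916, 860]

total_reads = sum(reads_after_filtering)
total_rrna = sum(rrna_reads)
percent_rrna = 100 * total_rrna / total_reads

print(total_reads, total_rrna, round(percent_rrna, 2))  # 8997875 9931 0.11
```
]]></preformat>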
<table-wrap id="pone-0039948-t001" position="float">
<object-id pub-id-type="doi">10.1371/journal.pone.0039948.t001</object-id>
<label>Table 1</label>
<caption>
<title>Reads, 16S rRNA contigs, OTUs and chimera examination of the simulated communities.</title>
</caption>
<alternatives>
<graphic id="pone-0039948-t001-1" xlink:href="pone.0039948.t001"></graphic>
<table frame="hsides" rules="groups">
<colgroup span="1">
<col align="left" span="1"></col>
<col align="center" span="1"></col>
<col align="center" span="1"></col>
<col align="center" span="1"></col>
<col align="center" span="1"></col>
<col align="center" span="1"></col>
<col align="center" span="1"></col>
<col align="center" span="1"></col>
<col align="center" span="1"></col>
<col align="center" span="1"></col>
</colgroup>
<thead>
<tr>
<td align="left" rowspan="1" colspan="1">Sample</td>
<td align="left" rowspan="1" colspan="1">HC–A</td>
<td align="left" rowspan="1" colspan="1">HC–B</td>
<td align="left" rowspan="1" colspan="1">HC–C</td>
<td align="left" rowspan="1" colspan="1">MC–A</td>
<td align="left" rowspan="1" colspan="1">MC–B</td>
<td align="left" rowspan="1" colspan="1">MC–C</td>
<td align="left" rowspan="1" colspan="1">LC–A</td>
<td align="left" rowspan="1" colspan="1">LC–B</td>
<td align="left" rowspan="1" colspan="1">LC–C</td>
</tr>
</thead>
<tbody>
<tr>
<td align="left" rowspan="1" colspan="1">
<bold>Reads after quality filtering</bold>
</td>
<td align="left" rowspan="1" colspan="1">999913</td>
<td align="left" rowspan="1" colspan="1">999909</td>
<td align="left" rowspan="1" colspan="1">999912</td>
<td align="left" rowspan="1" colspan="1">999703</td>
<td align="left" rowspan="1" colspan="1">999775</td>
<td align="left" rowspan="1" colspan="1">999769</td>
<td align="left" rowspan="1" colspan="1">999603</td>
<td align="left" rowspan="1" colspan="1">999606</td>
<td align="left" rowspan="1" colspan="1">999685</td>
</tr>
<tr>
<td align="left" rowspan="1" colspan="1">
<bold>16S rRNA gene – containing reads</bold>
</td>
<td align="left" rowspan="1" colspan="1">1303</td>
<td align="left" rowspan="1" colspan="1">1353</td>
<td align="left" rowspan="1" colspan="1">1376</td>
<td align="left" rowspan="1" colspan="1">984</td>
<td align="left" rowspan="1" colspan="1">1112</td>
<td align="left" rowspan="1" colspan="1">1153</td>
<td align="left" rowspan="1" colspan="1">874</td>
<td align="left" rowspan="1" colspan="1">916</td>
<td align="left" rowspan="1" colspan="1">860</td>
</tr>
<tr>
<td align="left" rowspan="1" colspan="1">
<bold>16S rRNA contigs >350 nt (chimera, chimera containing >1 contaminating read)</bold>
</td>
<td align="left" rowspan="1" colspan="1">130 (3, 1)</td>
<td align="left" rowspan="1" colspan="1"></td>
<td align="left" rowspan="1" colspan="1"></td>
<td align="left" rowspan="1" colspan="1">126 (7, 1)</td>
<td align="left" rowspan="1" colspan="1"></td>
<td align="left" rowspan="1" colspan="1"></td>
<td align="left" rowspan="1" colspan="1">125 (4, 3)</td>
<td align="left" rowspan="1" colspan="1"></td>
<td align="left" rowspan="1" colspan="1"></td>
</tr>
<tr>
<td align="left" rowspan="1" colspan="1">
<bold>Reads in 16S rRNA contigs >350 nt (chimera, chimera containing >1 contaminating read)</bold>
</td>
<td align="left" rowspan="1" colspan="1">3733 (85, 15)</td>
<td align="left" rowspan="1" colspan="1"></td>
<td align="left" rowspan="1" colspan="1"></td>
<td align="left" rowspan="1" colspan="1">3005 (365, 8)</td>
<td align="left" rowspan="1" colspan="1"></td>
<td align="left" rowspan="1" colspan="1"></td>
<td align="left" rowspan="1" colspan="1">2386 (374, 150)</td>
<td align="left" rowspan="1" colspan="1"></td>
<td align="left" rowspan="1" colspan="1"></td>
</tr>
<tr>
<td align="left" rowspan="1" colspan="1">
<bold>Filtered 16S rRNA contigs (chimera, chimera containing >1 contaminating read)</bold>
</td>
<td align="left" rowspan="1" colspan="1">73 (0, 0)</td>
<td align="left" rowspan="1" colspan="1"></td>
<td align="left" rowspan="1" colspan="1"></td>
<td align="left" rowspan="1" colspan="1">53 (3, 0)</td>
<td align="left" rowspan="1" colspan="1"></td>
<td align="left" rowspan="1" colspan="1"></td>
<td align="left" rowspan="1" colspan="1">54 (3, 2)</td>
<td align="left" rowspan="1" colspan="1"></td>
<td align="left" rowspan="1" colspan="1"></td>
</tr>
<tr>
<td align="left" rowspan="1" colspan="1">
<bold>Reads in filtered 16S rRNA contigs (chimera, chimera containing >1 contaminating read)</bold>
</td>
<td align="left" rowspan="1" colspan="1">3257 (0, 0)</td>
<td align="left" rowspan="1" colspan="1"></td>
<td align="left" rowspan="1" colspan="1"></td>
<td align="left" rowspan="1" colspan="1">2610 (330, 0)</td>
<td align="left" rowspan="1" colspan="1"></td>
<td align="left" rowspan="1" colspan="1"></td>
<td align="left" rowspan="1" colspan="1">2004 (364, 140)</td>
<td align="left" rowspan="1" colspan="1"></td>
<td align="left" rowspan="1" colspan="1"></td>
</tr>
<tr>
<td align="left" rowspan="1" colspan="1">
<bold>Length of filtered 16S rRNA contigs (min, max, mean) (nt)</bold>
</td>
<td align="left" rowspan="1" colspan="1">458, 1548, 1262</td>
<td align="left" rowspan="1" colspan="1"></td>
<td align="left" rowspan="1" colspan="1"></td>
<td align="left" rowspan="1" colspan="1">574, 1529, 1127</td>
<td align="left" rowspan="1" colspan="1"></td>
<td align="left" rowspan="1" colspan="1"></td>
<td align="left" rowspan="1" colspan="1">515, 1532, 1174</td>
<td align="left" rowspan="1" colspan="1"></td>
<td align="left" rowspan="1" colspan="1"></td>
</tr>
<tr>
<td align="left" rowspan="1" colspan="1">
<bold>Recovered, missed, artificial OTUs (0.01)</bold>
</td>
<td align="left" rowspan="1" colspan="1">81, 0, 0</td>
<td align="left" rowspan="1" colspan="1">81, 0, 0</td>
<td align="left" rowspan="1" colspan="1">81, 0, 0</td>
<td align="left" rowspan="1" colspan="1">75, 1, 1</td>
<td align="left" rowspan="1" colspan="1">77, 1, 1</td>
<td align="left" rowspan="1" colspan="1">77, 1, 1</td>
<td align="left" rowspan="1" colspan="1">80, 0, 0</td>
<td align="left" rowspan="1" colspan="1">79, 0, 0</td>
<td align="left" rowspan="1" colspan="1">80, 0, 0</td>
</tr>
<tr>
<td align="left" rowspan="1" colspan="1">
<bold>Reads in recovered, missed, artificial OTUs (0.01)</bold>
</td>
<td align="left" rowspan="1" colspan="1">1303, 0, 0</td>
<td align="left" rowspan="1" colspan="1">1353, 0, 0</td>
<td align="left" rowspan="1" colspan="1">1376, 0, 0</td>
<td align="left" rowspan="1" colspan="1">978, 2, 4</td>
<td align="left" rowspan="1" colspan="1">1106, 2, 2</td>
<td align="left" rowspan="1" colspan="1">1148, 4, 4</td>
<td align="left" rowspan="1" colspan="1">870, 0, 0</td>
<td align="left" rowspan="1" colspan="1">915, 0, 0</td>
<td align="left" rowspan="1" colspan="1">857, 0, 0</td>
</tr>
<tr>
<td align="left" rowspan="1" colspan="1">
<bold>Recovered, missed, artificial OTUs (0.03)</bold>
</td>
<td align="left" rowspan="1" colspan="1">74, 0, 0</td>
<td align="left" rowspan="1" colspan="1">74, 0, 0</td>
<td align="left" rowspan="1" colspan="1">74, 0, 0</td>
<td align="left" rowspan="1" colspan="1">69, 0, 0</td>
<td align="left" rowspan="1" colspan="1">72, 0, 0</td>
<td align="left" rowspan="1" colspan="1">72, 0, 0</td>
<td align="left" rowspan="1" colspan="1">72, 0, 0</td>
<td align="left" rowspan="1" colspan="1">71, 0, 0</td>
<td align="left" rowspan="1" colspan="1">72, 0, 0</td>
</tr>
<tr>
<td align="left" rowspan="1" colspan="1">
<bold>Reads in recovered, missed, artificial OTUs (0.03)</bold>
</td>
<td align="left" rowspan="1" colspan="1">1303, 0, 0</td>
<td align="left" rowspan="1" colspan="1">1353, 0, 0</td>
<td align="left" rowspan="1" colspan="1">1376, 0, 0</td>
<td align="left" rowspan="1" colspan="1">982, 0, 0</td>
<td align="left" rowspan="1" colspan="1">1108, 0, 0</td>
<td align="left" rowspan="1" colspan="1">1150, 0, 0</td>
<td align="left" rowspan="1" colspan="1">870, 0, 0</td>
<td align="left" rowspan="1" colspan="1">915, 0, 0</td>
<td align="left" rowspan="1" colspan="1">857, 0, 0</td>
</tr>
<tr>
<td align="left" rowspan="1" colspan="1">
<bold>Recovered, missed, artificial OTUs (0.05)</bold>
</td>
<td align="left" rowspan="1" colspan="1">52, 0, 0</td>
<td align="left" rowspan="1" colspan="1">53, 0, 0</td>
<td align="left" rowspan="1" colspan="1">52, 0, 0</td>
<td align="left" rowspan="1" colspan="1">49, 0, 0</td>
<td align="left" rowspan="1" colspan="1">50, 0, 0</td>
<td align="left" rowspan="1" colspan="1">49, 0, 0</td>
<td align="left" rowspan="1" colspan="1">49, 0, 0</td>
<td align="left" rowspan="1" colspan="1">48, 0, 0</td>
<td align="left" rowspan="1" colspan="1">48, 0, 0</td>
</tr>
<tr>
<td align="left" rowspan="1" colspan="1">
<bold>Reads in recovered, missed, artificial OTUs (0.05)</bold>
</td>
<td align="left" rowspan="1" colspan="1">1303, 0, 0</td>
<td align="left" rowspan="1" colspan="1">1353, 0, 0</td>
<td align="left" rowspan="1" colspan="1">1376, 0, 0</td>
<td align="left" rowspan="1" colspan="1">982, 0, 0</td>
<td align="left" rowspan="1" colspan="1">1108, 0, 0</td>
<td align="left" rowspan="1" colspan="1">1150, 0, 0</td>
<td align="left" rowspan="1" colspan="1">870, 0, 0</td>
<td align="left" rowspan="1" colspan="1">915, 0, 0</td>
<td align="left" rowspan="1" colspan="1">857, 0, 0</td>
</tr>
</tbody>
</table>
</alternatives>
</table-wrap>
<p>16S rRNA contigs larger than 350 nt were plotted by their length and read coverage (
<xref ref-type="fig" rid="pone-0039948-g001">Figure 1</xref>
). Fourteen chimeric contigs (3.6%) were detected among the 381 contigs generated from the nine datasets (solid circles and triangles in
<xref ref-type="fig" rid="pone-0039948-g001">Figure 1</xref>
). Four of these contigs could be readily detected using UChime
<xref ref-type="bibr" rid="pone.0039948-Edgar1">[36]</xref>
(arrows in
<xref ref-type="fig" rid="pone-0039948-g001">Figure 1</xref>
). Eight chimeras contained only one ‘contaminating’ read (solid circles in
<xref ref-type="fig" rid="pone-0039948-g001">Figure 1</xref>
), and these reads mostly aligned to highly conserved regions of the 16S rRNA gene (data not shown). To examine whether these chimeras would affect the accuracy of community structure prediction, we generated OTUs with different phylogenetic distance cut-offs (0.01, 0.03 and 0.05). In nearly all cases, all reference OTUs were recovered and no artificial OTUs were generated. The only exception was for MC communities at the 0.01 OTU level, where one artificial OTU was generated and one OTU present in the reference was missed (
<xref ref-type="table" rid="pone-0039948-t001">Table 1</xref>
). This result shows that our assembly strategy effectively recovers the true microbial community structure, particularly for OTU groupings at phylogenetic distances of 0.03 or greater.</p>
<fig id="pone-0039948-g001" position="float">
<object-id pub-id-type="doi">10.1371/journal.pone.0039948.g001</object-id>
<label>Figure 1</label>
<caption>
<title>16S rRNA gene contigs and chimeric contigs for simulated datasets.</title>
<p>Open circle: non-chimeric contigs; solid circle: chimeric contigs containing one contaminating read; solid triangles: chimeric contigs containing more than one contaminating read. Arrow: chimera detected by UChime. (A) HC. (B) MC. (C) LC.</p>
</caption>
<graphic xlink:href="pone.0039948.g001"></graphic>
</fig>
<p>With the aim of recovering long 16S rRNA sequences for phylogenetic analysis and to minimize the effects of potential chimeric assembly, we filtered contigs for length of greater than 700 nt and for a coverage of more than 10 reads (
<xref ref-type="fig" rid="pone-0039948-g001">Figure 1</xref>
). In addition, we used UChime for chimera removal. Sequences flanking the 16S rRNA gene were removed. This resulted in 180 contigs (mean length: 1,174–1,262 nt) across the nine samples, with only two (1.1%) of them containing more than one contaminating read (
<xref ref-type="table" rid="pone-0039948-t001">Table 1</xref>
). This value is below the chimeric amplification rate generally reported for PCR-based assessment of 16S rRNA gene diversity (5 to 45%)
<xref ref-type="bibr" rid="pone.0039948-Haas1">[5]</xref>
,
<xref ref-type="bibr" rid="pone.0039948-Ashelford1">[37]</xref>
–
<xref ref-type="bibr" rid="pone.0039948-Quince1">[40]</xref>
.</p>
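<p>The length and coverage filter described above can be sketched in Python as follows. The <monospace>Contig</monospace> record and the example values are hypothetical; the thresholds (more than 700 nt, more than 10 reads) are those stated in the text, and the additional UChime step is not reproduced here.</p>
<preformat><![CDATA[
```python
from dataclasses import dataclass


@dataclass
class Contig:
    name: str
    length: int   # contig length in nt
    n_reads: int  # coverage, counted as the number of assembled reads


def filter_contigs(contigs, min_length=700, min_reads=10):
    """Keep contigs longer than `min_length` nt and built from more than `min_reads` reads."""
    return [c for c in contigs
            if c.length > min_length and c.n_reads > min_reads]


# Hypothetical contigs, for illustration only.
contigs = [
    Contig("c1", 1262, 45),  # passes both thresholds
    Contig("c2", 650, 30),   # too short
    Contig("c3", 900, 10),   # coverage not above 10 reads
    Contig("c4", 701, 11),   # just above both thresholds
]
print([c.name for c in filter_contigs(contigs)])  # ['c1', 'c4']
```
]]></preformat>
<p>Both comparisons are strict, which follows the wording "greater than 700 nt" and "more than 10 reads".</p>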
</sec>
<sec id="s3b">
<title>Assembly of 16S rRNA sequences improves taxonomic classification</title>
<p>On the assumption that longer 16S rRNA gene sequences can improve the taxonomic description of a community, we compared the proportion of reads before and after assembly that could be confidently assigned using the RDP Classifier (80% confidence). Although all strains in the simulated datasets are deposited in the RDP database, classification success declined steadily towards lower taxonomic ranks, with only 60–70% of unassembled reads being assigned at the genus level. In contrast, assembled data showed generally higher classification success, and more than 80% could be confidently assigned at the genus level (
<xref ref-type="fig" rid="pone-0039948-g002">Figure 2</xref>
). This shows that 16S rRNA gene assembly clearly benefits taxonomic classification; it is also expected to improve phylogenetic analysis (see below).</p>
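<p>The comparison above amounts to counting, for each taxonomic rank, the fraction of sequences whose assignment reaches the 80% confidence threshold. A minimal Python sketch is given below; the input layout (one dictionary per sequence, mapping a rank to a taxon and a confidence value) is hypothetical and is not the native output format of the RDP Classifier.</p>
<preformat><![CDATA[
```python
def fraction_classified(assignments, rank, min_confidence=0.8):
    """Fraction of sequences assigned at `rank` with confidence >= `min_confidence`.

    assignments: list of dicts, one per sequence, mapping a rank name to a
                 (taxon, confidence) tuple (hypothetical layout).
    """
    if not assignments:
        return 0.0
    confident = sum(1 for a in assignments
                    if rank in a and a[rank][1] >= min_confidence)
    return confident / len(assignments)


# Hypothetical assignments for three sequences.
assignments = [
    {"phylum": ("Proteobacteria", 1.00), "genus": ("Vibrio", 0.95)},
    {"phylum": ("Proteobacteria", 0.99), "genus": ("Vibrio", 0.42)},
    {"phylum": ("Firmicutes", 0.85)},
]
for rank in ("phylum", "genus"):
    print(rank, fraction_classified(assignments, rank))
```
]]></preformat>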
<fig id="pone-0039948-g002" position="float">
<object-id pub-id-type="doi">10.1371/journal.pone.0039948.g002</object-id>
<label>Figure 2</label>
<caption>
<title>Taxonomic classification of assembled and unassembled shotgun 16S rRNA gene reads for simulated datasets.</title>
<p>(A) HC. (B) MC. (C) LC.</p>
</caption>
<graphic xlink:href="pone.0039948.g002"></graphic>
</fig>
</sec>
<sec id="s3c">
<title>16S rRNA gene reconstruction reveals community diversity that is missed by PCR-based approaches</title>
<p>Sponges (phylum Porifera) host complex communities of microbial symbionts, which are essential for the host's function
<xref ref-type="bibr" rid="pone.0039948-Taylor1">[41]</xref>
. Over the last decade substantial efforts have been made to describe the phylogenetic diversity and biogeography of sponge-associated microorganisms
<xref ref-type="bibr" rid="pone.0039948-Taylor1">[41]</xref>
,
<xref ref-type="bibr" rid="pone.0039948-Schmitt1">[42]</xref>
. However, the vast majority of sponge microbiome surveys are based on PCR-amplification of the 16S rRNA gene. Only recently has one study generated 16S rRNA contigs from a shotgun-sequenced transcriptome of a sponge microbial community
<xref ref-type="bibr" rid="pone.0039948-Radax1">[27]</xref>
. However, this study generated relatively short contigs (729 nt on average) despite extremely high sequencing coverage (66,743 reads containing 16S rRNA gene sequences) and the loose stringency during assembly could have created many chimeras (see above)
<xref ref-type="bibr" rid="pone.0039948-Radax1">[27]</xref>
.</p>
<p>To evaluate the phylogenetic diversity generated by our 16S rRNA gene reconstruction method, we analyzed six shotgun metagenomes from the two sponges
<italic>C. concentrica</italic>
and
<italic>C. coralliophila</italic>
. From 5,322,385 quality-filtered pyrosequencing reads, we identified 1,942 reads containing 16S rRNA genes (0.04%) and generated 25 filtered contigs (
<xref ref-type="table" rid="pone-0039948-t002">Table 2</xref>
). The majority of contigs were full or near-full length (
<xref ref-type="table" rid="pone-0039948-t002">Table 2</xref>
). Community composition of the six sponge DNA samples was also assessed by PCR-amplifying and pyrosequencing the variable region V1-3 of the 16S rRNA gene (pyro-tag-sequencing). In total, 22,392 16S rRNA gene sequences were retained after quality filtering, 1,366 of which were unique after pre-clustering (
<xref ref-type="supplementary-material" rid="pone.0039948.s002">Materials and Methods</xref>
,
<xref ref-type="table" rid="pone-0039948-t003">Table 3</xref>
).</p>
<table-wrap id="pone-0039948-t002" position="float">
<object-id pub-id-type="doi">10.1371/journal.pone.0039948.t002</object-id>
<label>Table 2</label>
<caption>
<title>The sponge metagenomic datasets.</title>
</caption>
<alternatives>
<graphic id="pone-0039948-t002-2" xlink:href="pone.0039948.t002"></graphic>
<table frame="hsides" rules="groups">
<colgroup span="1">
<col align="left" span="1"></col>
<col align="center" span="1"></col>
<col align="center" span="1"></col>
<col align="center" span="1"></col>
<col align="center" span="1"></col>
<col align="center" span="1"></col>
<col align="center" span="1"></col>
</colgroup>
<thead>
<tr>
<td align="left" rowspan="1" colspan="1">Sample</td>
<td align="left" rowspan="1" colspan="1">Cyr–A shotgun</td>
<td align="left" rowspan="1" colspan="1">Cyr–B shotgun</td>
<td align="left" rowspan="1" colspan="1">Cyr–C shotgun</td>
<td align="left" rowspan="1" colspan="1">Cyn–A shotgun</td>
<td align="left" rowspan="1" colspan="1">Cyn–B shotgun</td>
<td align="left" rowspan="1" colspan="1">Cyn–C shotgun</td>
</tr>
</thead>
<tbody>
<tr>
<td align="left" rowspan="1" colspan="1">
<bold>Sponge host</bold>
</td>
<td colspan="3" align="left" rowspan="1">
<italic>C. coralliophila</italic>
</td>
<td colspan="3" align="left" rowspan="1">
<italic>C. concentrica</italic>
</td>
</tr>
<tr>
<td align="left" rowspan="1" colspan="1">
<bold>Raw reads</bold>
</td>
<td align="left" rowspan="1" colspan="1">897408</td>
<td align="left" rowspan="1" colspan="1">971976</td>
<td align="left" rowspan="1" colspan="1">888127</td>
<td align="left" rowspan="1" colspan="1">678263</td>
<td align="left" rowspan="1" colspan="1">1169872</td>
<td align="left" rowspan="1" colspan="1">1323699</td>
</tr>
<tr>
<td align="left" rowspan="1" colspan="1">
<bold>Average read size (nt)</bold>
</td>
<td align="left" rowspan="1" colspan="1">387.6</td>
<td align="left" rowspan="1" colspan="1">353.2</td>
<td align="left" rowspan="1" colspan="1">276.8</td>
<td align="left" rowspan="1" colspan="1">358.0</td>
<td align="left" rowspan="1" colspan="1">408.1</td>
<td align="left" rowspan="1" colspan="1">392.8</td>
</tr>
<tr>
<td align="left" rowspan="1" colspan="1">
<bold>Reads after quality filtering</bold>
</td>
<td align="left" rowspan="1" colspan="1">859525</td>
<td align="left" rowspan="1" colspan="1">898161</td>
<td align="left" rowspan="1" colspan="1">788662</td>
<td align="left" rowspan="1" colspan="1">660869</td>
<td align="left" rowspan="1" colspan="1">1004075</td>
<td align="left" rowspan="1" colspan="1">1111093</td>
</tr>
<tr>
<td align="left" rowspan="1" colspan="1">
<bold>16S rRNA gene – containing reads</bold>
</td>
<td align="left" rowspan="1" colspan="1">282</td>
<td align="left" rowspan="1" colspan="1">385</td>
<td align="left" rowspan="1" colspan="1">95</td>
<td align="left" rowspan="1" colspan="1">237</td>
<td align="left" rowspan="1" colspan="1">530</td>
<td align="left" rowspan="1" colspan="1">413</td>
</tr>
<tr>
<td align="left" rowspan="1" colspan="1">
<bold>16S rRNA gene contigs >350 nt (reads)</bold>
</td>
<td colspan="3" align="left" rowspan="1">48 (557)</td>
<td colspan="3" align="left" rowspan="1">66 (908)</td>
</tr>
<tr>
<td align="left" rowspan="1" colspan="1">
<bold>Filtered 16S rRNA gene contigs (reads)</bold>
</td>
<td colspan="3" align="left" rowspan="1">13 (445)</td>
<td colspan="3" align="left" rowspan="1">12 (727)</td>
</tr>
<tr>
<td align="left" rowspan="1" colspan="1">
<bold>Length of filtered 16S rRNA gene contigs (min, max, mean) (nt)</bold>
</td>
<td colspan="3" align="left" rowspan="1">1218, 1535, 1418</td>
<td colspan="3" align="left" rowspan="1">493, 1517, 1251</td>
</tr>
</tbody>
</table>
</alternatives>
</table-wrap>
<table-wrap id="pone-0039948-t003" position="float">
<object-id pub-id-type="doi">10.1371/journal.pone.0039948.t003</object-id>
<label>Table 3</label>
<caption>
<title>The sponge tag-sequencing data sets.</title>
</caption>
<alternatives>
<graphic id="pone-0039948-t003-3" xlink:href="pone.0039948.t003"></graphic>
<table frame="hsides" rules="groups">
<colgroup span="1">
<col align="left" span="1"></col>
<col align="center" span="1"></col>
<col align="center" span="1"></col>
<col align="center" span="1"></col>
<col align="center" span="1"></col>
<col align="center" span="1"></col>
<col align="center" span="1"></col>
</colgroup>
<thead>
<tr>
<td align="left" rowspan="1" colspan="1">Sample</td>
<td align="left" rowspan="1" colspan="1">Cyr–A PCR</td>
<td align="left" rowspan="1" colspan="1">Cyr–B PCR</td>
<td align="left" rowspan="1" colspan="1">Cyr–C PCR</td>
<td align="left" rowspan="1" colspan="1">Cyn–A PCR</td>
<td align="left" rowspan="1" colspan="1">Cyn–B PCR</td>
<td align="left" rowspan="1" colspan="1">Cyn–C PCR</td>
</tr>
</thead>
<tbody>
<tr>
<td align="left" rowspan="1" colspan="1">
<bold>Sponge host</bold>
</td>
<td colspan="3" align="left" rowspan="1">
<italic>C. coralliophila</italic>
</td>
<td colspan="3" align="left" rowspan="1">
<italic>C. concentrica</italic>
</td>
</tr>
<tr>
<td align="left" rowspan="1" colspan="1">
<bold>Raw reads</bold>
</td>
<td align="left" rowspan="1" colspan="1">5989</td>
<td align="left" rowspan="1" colspan="1">7895</td>
<td align="left" rowspan="1" colspan="1">13961</td>
<td align="left" rowspan="1" colspan="1">8257</td>
<td align="left" rowspan="1" colspan="1">5284</td>
<td align="left" rowspan="1" colspan="1">12509</td>
</tr>
<tr>
<td align="left" rowspan="1" colspan="1">
<bold>Average read size (nt)</bold>
</td>
<td align="left" rowspan="1" colspan="1">301.1</td>
<td align="left" rowspan="1" colspan="1">302.5</td>
<td align="left" rowspan="1" colspan="1">305.7</td>
<td align="left" rowspan="1" colspan="1">306.8</td>
<td align="left" rowspan="1" colspan="1">317.2</td>
<td align="left" rowspan="1" colspan="1">314.1</td>
</tr>
<tr>
<td align="left" rowspan="1" colspan="1">
<bold>Reads after quality filtering</bold>
</td>
<td align="left" rowspan="1" colspan="1">2342</td>
<td align="left" rowspan="1" colspan="1">3038</td>
<td align="left" rowspan="1" colspan="1">4988</td>
<td align="left" rowspan="1" colspan="1">3754</td>
<td align="left" rowspan="1" colspan="1">2140</td>
<td align="left" rowspan="1" colspan="1">6130</td>
</tr>
<tr>
<td align="left" rowspan="1" colspan="1">
<bold>Unique sequences</bold>
</td>
<td align="left" rowspan="1" colspan="1">212</td>
<td align="left" rowspan="1" colspan="1">179</td>
<td align="left" rowspan="1" colspan="1">311</td>
<td align="left" rowspan="1" colspan="1">265</td>
<td align="left" rowspan="1" colspan="1">155</td>
<td align="left" rowspan="1" colspan="1">244</td>
</tr>
<tr>
<td align="left" rowspan="1" colspan="1">
<bold>Average size of unique sequences (nt)</bold>
</td>
<td align="left" rowspan="1" colspan="1">269.8</td>
<td align="left" rowspan="1" colspan="1">268.9</td>
<td align="left" rowspan="1" colspan="1">272.2</td>
<td align="left" rowspan="1" colspan="1">267.2</td>
<td align="left" rowspan="1" colspan="1">271</td>
<td align="left" rowspan="1" colspan="1">269.2</td>
</tr>
</tbody>
</table>
</alternatives>
</table-wrap>
<p>We first compared community composition derived from the pyro-tag-sequencing data, the shotgun reads with and without assembly and single-copy genes (
<xref ref-type="supplementary-material" rid="pone.0039948.s002">Materials and Methods</xref>
) at the phylum level (
<xref ref-type="fig" rid="pone-0039948-g003">Figure 3</xref>
). In general, more phyla were detected in the shotgun sequencing reads than in the pyro-tag-sequencing data. Specifically, the PCR-based approach using the 28F/519R primer set recovered predominantly phylotypes belonging to Cyanobacteria and Proteobacteria, while the shotgun data also contained sequences from Actinobacteria, Nitrospira, Chloroflexi, and Verrucomicrobia (
<xref ref-type="fig" rid="pone-0039948-g003">Figure 3A, B</xref>
). This may be due not only to potential primer bias (see below), but also to the short sequences (∼250 nt after quality processing) (
<xref ref-type="supplementary-material" rid="pone.0039948.s002">Materials and Methods</xref>
,
<xref ref-type="table" rid="pone-0039948-t003">Table 3</xref>
) that are difficult to classify. The presence of these ‘missed’ phyla (e.g. Chloroflexi) was also confirmed by a single-copy-gene-based search (
<xref ref-type="fig" rid="pone-0039948-g003">Figure 3D</xref>
). However, this single-copy gene approach also failed to detect some taxa (e.g. Nitrospira and Verrucomicrobia), which is likely due to the low number of reference genomes available for these phyla. Overall, these results show that 16S rRNA gene analysis from metagenomic datasets has a superior capacity to detect a broad range of phylogenetic diversity.</p>
<fig id="pone-0039948-g003" position="float">
<object-id pub-id-type="doi">10.1371/journal.pone.0039948.g003</object-id>
<label>Figure 3</label>
<caption>
<title>Phylum-level classification of the sponge pyro-tag-sequencing and shotgun sequencing datasets.</title>
<p>(A) 16S rRNA gene PCR approach. (B) Unassembled shotgun 16S rRNA gene reads. (C) Assembled shotgun 16S rRNA gene reads. (D) Single-copy gene analysis.</p>
</caption>
<graphic xlink:href="pone.0039948.g003"></graphic>
</fig>
<p>We then compared the pyro-tag-sequencing data and the 16S rRNA gene reconstruction approach by generating OTUs at different phylogenetic distance cut-offs (
<xref ref-type="supplementary-material" rid="pone.0039948.s002">Materials and Methods</xref>
). In general, the PCR-based approach produced more OTUs than the metagenome-based approach, except at the 0.05 OTU-level for
<italic>C. concentrica</italic>
(
<xref ref-type="fig" rid="pone-0039948-g004">Figure 4</xref>
). This is obviously because of the much higher sequencing depth for the 16S rRNA gene in the pyro-tag samples (
<xref ref-type="table" rid="pone-0039948-t002">Table 2</xref>
,
<xref ref-type="table" rid="pone-0039948-t003">3</xref>
). A relatively low number of OTUs was shared between the two approaches. However, the OTUs unique to the PCR-based approach represent only a low proportion (2.5–8.3%) of all pyro-tag reads at OTU levels of 0.03 and 0.05. This result shows that the majority of pyro-tag reads come from phylotypes that are also contained in the metagenomic dataset and that the unique OTUs of the PCR-based approach either constitute low-abundance phylotypes (e.g. are part of the rare biosphere)
<xref ref-type="bibr" rid="pone.0039948-Sogin1">[43]</xref>
or are undetected chimeras
<xref ref-type="bibr" rid="pone.0039948-Quince2">[44]</xref>
. In contrast, a high proportion of reads (∼30%) belong to unique OTUs generated from the 16S rRNA gene reconstruction, which indicates that they come from abundant organisms that were missed by PCR-based approaches. The different levels of diversity recovered by the PCR analysis and the metagenomic reconstruction are also reflected in rarefaction plots (
<xref ref-type="supplementary-material" rid="pone.0039948.s001">Figure S1</xref>
). Although the sampling depths of the shotgun samples were relatively low, the trends in their rarefaction plots, compared to those of the PCR samples, clearly suggest a higher community diversity.</p>
<fig id="pone-0039948-g004" position="float">
<object-id pub-id-type="doi">10.1371/journal.pone.0039948.g004</object-id>
<label>Figure 4</label>
<caption>
<title>Shared and unique OTUs of the PCR-based and shotgun-based sponge datasets.</title>
<p>Circle sizes are proportional to OTU number. (A) 0.01 phylogenetic distance OTU. (B) 0.03 phylogenetic distance OTU. (C) 0.05 phylogenetic distance OTU.</p>
</caption>
<graphic xlink:href="pone.0039948.g004"></graphic>
</fig>
</sec>
<sec id="s3d">
<title>Primer bias can explain the lack of OTU detection</title>
<p>To further investigate how PCR-amplification failed to detect certain groups of bacteria (see above), we taxonomically classified the most abundant 0.01-level OTUs (>2% in any of the 12 samples) (
<xref ref-type="fig" rid="pone-0039948-g005">Figure 5</xref>
). OTUs assigned to the bacterial groups of
<italic>Robiginitomaculum</italic>
,
<italic>Phyllobacteriaceae</italic>
_4, OCS116,
<italic>Rhodobacteraceae</italic>
,
<italic>Rhodospirillaceae</italic>
,
<italic>Acinetobacter</italic>
,
<italic>Oceanospirillaceae, Thiotrichaceae, Vibrionaceae</italic>
, PAUC26f, Sva0996 and
<italic>Verrucomicrobiaceae</italic>
were consistently missed or poorly recovered by PCR. Among them, eight 16S rRNA gene contigs belonging to seven 0.01 OTUs (i.e.
<italic>Robiginitomaculum</italic>
,
<italic>Rhodobacteraceae</italic>
,
<italic>Acinetobacter</italic>
,
<italic>Oceanospirillaceae</italic>
, PAUC26f, Sva0996, and
<italic>Verrucomicrobiaceae</italic>
, including two contigs belonging to Sva0996) covered the entire V1-3 region of the 16S rRNA gene. Alignment of these eight contigs to the degenerate primers 28F/519R showed that seven of them had mismatches (to either one or both primers) (asterisks in
<xref ref-type="fig" rid="pone-0039948-g005">Figure 5</xref>
). This suggests that primer bias is one of the major causes for the PCR-based approach missing certain OTUs (
<xref ref-type="fig" rid="pone-0039948-g004">Figure 4</xref>
).</p>
<fig id="pone-0039948-g005" position="float">
<object-id pub-id-type="doi">10.1371/journal.pone.0039948.g005</object-id>
<label>Figure 5</label>
<caption>
<title>Abundance and primer mismatches in the top OTUs at the 0.01 phylogenetic distance level for the sponge datasets.</title>
<p>Asterisk, primer-mismatch event.</p>
</caption>
<graphic xlink:href="pone.0039948.g005"></graphic>
</fig>
</sec>
<sec id="s3e">
<title>Phylogenetic analysis of the novel 16S rRNA sequences detected by the shotgun approach</title>
<p>To examine how many of the 25 16S rRNA gene contigs reconstructed from shotgun sequencing data have so far not been detected by PCR-based approaches in these two sponges, we performed searches against the NCBI nt database (7 April 2012) and the full-length 16S rRNA genes (primers 27F and 1492R) previously amplified from
<italic>C. concentrica</italic>
by Thomas
<italic>et al.</italic>
<xref ref-type="bibr" rid="pone.0039948-Thomas1">[45]</xref>
. Any match with a BlastN identity of >99% was considered an amplicon counterpart of the contig. While none of the 13 contigs from
<italic>C. coralliophila</italic>
found amplicon counterparts, 10 of the 12 contigs from
<italic>C. concentrica</italic>
had been previously detected (
<xref ref-type="supplementary-material" rid="pone.0039948.s004">Table S2</xref>
).</p>
<p>Among the 15 undetected sequences, ten were amplified by the primers used in the present study (
<xref ref-type="fig" rid="pone-0039948-g005">Figure 5</xref>
). Of the five remaining contigs, the archaeon
<italic>Nitrosopumilus</italic>
has previously been detected in the functional metaproteogenomic study of
<italic>C. concentrica</italic>
<xref ref-type="bibr" rid="pone.0039948-Liu2">[46]</xref>
. The four bacterial contigs were classified as Sva0996,
<italic>Rhodobacteraceae</italic>
, BD2-11 and
<italic>Oceanospirillaceae</italic>
(
<xref ref-type="supplementary-material" rid="pone.0039948.s004">Table S2</xref>
) and then further phylogenetically analyzed (
<xref ref-type="supplementary-material" rid="pone.0039948.s002">Figure S2</xref>
). The Acidimicrobiales- and the Gemmatimonadetes-phylotypes are part of sponge/coral specific clades in the Sva0996 group and the BD2-11 group, respectively (Figure S2B, C). The
<italic>Rhodobacteraceae</italic>
-phylotype branches distantly from the most closely related free-living neighbors (Figure S2A). The
<italic>Oceanospirillaceae</italic>
-phylotype has a closely related free-living strain (
<xref ref-type="supplementary-material" rid="pone.0039948.s002">Figure S2D</xref>
). This phylotype in the sponge
<italic>C. concentrica</italic>
has been consistently missed by PCR-based approaches despite current and previous extensive sequencing efforts using different protocols and primers
<xref ref-type="bibr" rid="pone.0039948-Thomas1">[45]</xref>
,
<xref ref-type="bibr" rid="pone.0039948-Taylor2">[47]</xref>
–
<xref ref-type="bibr" rid="pone.0039948-Yung1">[49]</xref>
.</p>
</sec>
</sec>
<sec id="s4">
<title>Discussion</title>
<p>In the present study, we describe how stringent assemblies and filtering can recover nearly full-length 16S rRNA gene sequences from metagenomic pyrosequencing datasets. Through simulation of communities with various complexities, we show that chimera formation is minimal and will not affect the prediction of community composition. These properties make the described approach readily applicable to existing and future metagenomic datasets. Advances in next-generation sequencing technology have in recent years led to a surge of metagenomic studies, and thousands of datasets are currently available
<xref ref-type="bibr" rid="pone.0039948-Simon1">[50]</xref>
,
<xref ref-type="bibr" rid="pone.0039948-Thomas2">[51]</xref>
. Our approach should thus prove useful in defining the phylogenetic diversity and community composition harbored in these metagenomic resources. We also expect that it will lead to the discovery of new phylotypes that have previously eluded PCR-based detection; our analysis of sponge symbiont communities has provided examples of this.</p>
<p>Pyro-tag-sequencing has become a standard approach for defining community composition and has thus been extensively applied in, for example, the Human Microbiome Project
<xref ref-type="bibr" rid="pone.0039948-Peterson1">[52]</xref>
and clinical diagnosis
<xref ref-type="bibr" rid="pone.0039948-Siqueira1">[53]</xref>
. We show here that PCR can have a substantial impact on the assessment of communities in terms of diversity, composition and abundance. It might therefore be worthwhile to benchmark primer choice against 16S rRNA genes reconstructed from metagenomic data before establishing routine assays based on PCR methods.</p>
</sec>
<sec sec-type="supplementary-material" id="s5">
<title>Supporting Information</title>
<supplementary-material content-type="local-data" id="pone.0039948.s001">
<label>Figure S1</label>
<caption>
<p>
<bold>Rarefaction plots for the sponge datasets.</bold>
Data are based on an OTU distance of 0.01 (A), 0.03 (B), and 0.05 (C), and on phylogenetic distance (D). The plots on the right are enlargements of the dashed boxes on the diagrams to the left.</p>
<p>(TIFF)</p>
</caption>
<media xlink:href="pone.0039948.s001.tiff" mimetype="image" mime-subtype="tiff">
<caption>
<p>Click here for additional data file.</p>
</caption>
</media>
</supplementary-material>
<supplementary-material content-type="local-data" id="pone.0039948.s002">
<label>Figure S2</label>
<caption>
<p>
<bold>Phylogenetic analysis of the 16S rRNA gene sequences missed by PCR.</bold>
Percentage bootstrapping values (1,000 replications) greater than 50% are shown. Sponge-derived sequences are shown in bold. Pentagram-marked sequences are from the present study. (A) The
<italic>Rhodobacteraceae</italic>
bacterium in the family
<italic>Rhodobacteraceae</italic>
, with tree rooted to
<italic>Leisingera methylohalidivorans</italic>
[AY005463]. (B) The
<italic>Acidimicrobiales</italic>
bacterium in the clade Sva0996, with tree rooted to
<italic>Iamia majanohamensis</italic>
[AB360448]. (C) The
<italic>Gemmatimonadetes</italic>
(class) bacterium in the clade BD2-11, with tree rooted to
<italic>Gemmatimonas aurantiaca</italic>
[AP009153]. (D) The
<italic>Oceanospirillaceae</italic>
bacterium in the family
<italic>Oceanospirillaceae</italic>
, with tree rooted to
<italic>Comamonas composti</italic>
[EF015884].</p>
<p>(PNG)</p>
</caption>
<media xlink:href="pone.0039948.s002.png" mimetype="image" mime-subtype="png">
<caption>
<p>Click here for additional data file.</p>
</caption>
</media>
</supplementary-material>
<supplementary-material content-type="local-data" id="pone.0039948.s003">
<label>Table S1</label>
<caption>
<p>
<bold>Simulated datasets.</bold>
</p>
<p>(DOCX)</p>
</caption>
<media xlink:href="pone.0039948.s003.docx" mimetype="application" mime-subtype="msword">
<caption>
<p>Click here for additional data file.</p>
</caption>
</media>
</supplementary-material>
<supplementary-material content-type="local-data" id="pone.0039948.s004">
<label>Table S2</label>
<caption>
<p>
<bold>16S rRNA gene contigs generated from sponge metagenomic samples.</bold>
</p>
<p>(DOCX)</p>
</caption>
<media xlink:href="pone.0039948.s004.docx" mimetype="application" mime-subtype="msword">
<caption>
<p>Click here for additional data file.</p>
</caption>
</media>
</supplementary-material>
</sec>
</body>
<back>
<ack>
<p>We acknowledge the J. Craig Venter Institute's Joint Technology Center under the leadership of Yu-Hui Rogers and the assistance of Matt Lewis for producing the sequencing data.</p>
</ack>
<fn-group>
<fn fn-type="conflict">
<p>
<bold>Competing Interests: </bold>
The authors have declared that no competing interests exist.</p>
</fn>
<fn fn-type="financial-disclosure">
<p>
<bold>Funding: </bold>
This work was funded by the Australian Research Council, the Gordon and Betty Moore Foundation and the Centre for Marine Bio-Innovation. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.</p>
</fn>
</fn-group>
<ref-list>
<title>References</title>
<ref id="pone.0039948-Pace1">
<label>1</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Pace</surname>
<given-names>NR</given-names>
</name>
</person-group>
<year>1997</year>
<article-title>A molecular view of microbial diversity and the biosphere.</article-title>
<source>Science</source>
<volume>276</volume>
<fpage>734</fpage>
<lpage>740</lpage>
<pub-id pub-id-type="pmid">9115194</pub-id>
</element-citation>
</ref>
<ref id="pone.0039948-Tringe1">
<label>2</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Tringe</surname>
<given-names>SG</given-names>
</name>
<name>
<surname>Hugenholtz</surname>
<given-names>P</given-names>
</name>
</person-group>
<year>2008</year>
<article-title>A renaissance for the pioneering 16S rRNA gene.</article-title>
<source>Curr Opin Microbiol</source>
<volume>11</volume>
<fpage>442</fpage>
<lpage>446</lpage>
</element-citation>
</ref>
<ref id="pone.0039948-Hong1">
<label>3</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Hong</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Bunge</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Leslin</surname>
<given-names>C</given-names>
</name>
<name>
<surname>Jeon</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Epstein</surname>
<given-names>SS</given-names>
</name>
</person-group>
<year>2009</year>
<article-title>Polymerase chain reaction primers miss half of rRNA microbial diversity.</article-title>
<source>ISME J</source>
<volume>3</volume>
<fpage>1365</fpage>
<lpage>1373</lpage>
<pub-id pub-id-type="pmid">19693101</pub-id>
</element-citation>
</ref>
<ref id="pone.0039948-Venter1">
<label>4</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Venter</surname>
<given-names>JC</given-names>
</name>
<name>
<surname>Remington</surname>
<given-names>K</given-names>
</name>
<name>
<surname>Heidelberg</surname>
<given-names>JF</given-names>
</name>
<name>
<surname>Halpern</surname>
<given-names>AL</given-names>
</name>
<name>
<surname>Rusch</surname>
<given-names>D</given-names>
</name>
<etal></etal>
</person-group>
<year>2004</year>
<article-title>Environmental genome shotgun sequencing of the Sargasso Sea.</article-title>
<source>Science</source>
<volume>304</volume>
<fpage>66</fpage>
<lpage>74</lpage>
<pub-id pub-id-type="pmid">15001713</pub-id>
</element-citation>
</ref>
<ref id="pone.0039948-Haas1">
<label>5</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Haas</surname>
<given-names>BJ</given-names>
</name>
<name>
<surname>Gevers</surname>
<given-names>D</given-names>
</name>
<name>
<surname>Earl</surname>
<given-names>AM</given-names>
</name>
<name>
<surname>Feldgarden</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Ward</surname>
<given-names>DV</given-names>
</name>
<etal></etal>
</person-group>
<year>2011</year>
<article-title>Chimeric 16S rRNA sequence formation and detection in Sanger and 454-pyrosequenced PCR amplicons.</article-title>
<source>Genome Res</source>
<volume>21</volume>
<fpage>494</fpage>
<lpage>504</lpage>
<pub-id pub-id-type="pmid">21212162</pub-id>
</element-citation>
</ref>
<ref id="pone.0039948-Huson1">
<label>6</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Huson</surname>
<given-names>DH</given-names>
</name>
<name>
<surname>Auch</surname>
<given-names>AF</given-names>
</name>
<name>
<surname>Qi</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Schuster</surname>
<given-names>SC</given-names>
</name>
</person-group>
<year>2007</year>
<article-title>MEGAN analysis of metagenomic data.</article-title>
<source>Genome Res</source>
<volume>17</volume>
<fpage>377</fpage>
<lpage>386</lpage>
<pub-id pub-id-type="pmid">17255551</pub-id>
</element-citation>
</ref>
<ref id="pone.0039948-Stark1">
<label>7</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Stark</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Berger</surname>
<given-names>SA</given-names>
</name>
<name>
<surname>Stamatakis</surname>
<given-names>A</given-names>
</name>
<name>
<surname>von Mering</surname>
<given-names>C</given-names>
</name>
</person-group>
<year>2010</year>
<article-title>MLTreeMap – accurate Maximum Likelihood placement of environmental DNA sequences into taxonomic and functional reference phylogenies.</article-title>
<source>BMC Genomics</source>
<volume>11</volume>
<fpage>461</fpage>
<pub-id pub-id-type="pmid">20687950</pub-id>
</element-citation>
</ref>
<ref id="pone.0039948-Wu1">
<label>8</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Wu</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Eisen</surname>
<given-names>JA</given-names>
</name>
</person-group>
<year>2008</year>
<article-title>A simple, fast, and accurate method of phylogenomic inference.</article-title>
<source>Genome Biol</source>
<volume>9</volume>
<fpage>R151</fpage>
<pub-id pub-id-type="pmid">18851752</pub-id>
</element-citation>
</ref>
<ref id="pone.0039948-Liu1">
<label>9</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Liu</surname>
<given-names>B</given-names>
</name>
<name>
<surname>Gibbons</surname>
<given-names>T</given-names>
</name>
<name>
<surname>Ghodsi</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Treangen</surname>
<given-names>T</given-names>
</name>
<name>
<surname>Pop</surname>
<given-names>M</given-names>
</name>
</person-group>
<year>2011</year>
<article-title>Accurate and fast estimation of taxonomic profiles from metagenomic shotgun sequences.</article-title>
<source>BMC Genomics</source>
<volume>12</volume>
<fpage>S4</fpage>
<pub-id pub-id-type="pmid">21989143</pub-id>
</element-citation>
</ref>
<ref id="pone.0039948-Teeling1">
<label>10</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Teeling</surname>
<given-names>H</given-names>
</name>
<name>
<surname>Waldmann</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Lombardot</surname>
<given-names>T</given-names>
</name>
<name>
<surname>Bauer</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Glöckner</surname>
<given-names>FO</given-names>
</name>
</person-group>
<year>2004</year>
<article-title>TETRA: a web-service and a stand-alone program for the analysis and comparison of tetranucleotide usage patterns in DNA sequences.</article-title>
<source>BMC Bioinformatics</source>
<volume>5</volume>
<fpage>163</fpage>
<pub-id pub-id-type="pmid">15507136</pub-id>
</element-citation>
</ref>
<ref id="pone.0039948-McHardy1">
<label>11</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>McHardy</surname>
<given-names>AC</given-names>
</name>
<name>
<surname>Martín</surname>
<given-names>HG</given-names>
</name>
<name>
<surname>Tsirigos</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Hugenholtz</surname>
<given-names>P</given-names>
</name>
<name>
<surname>Rigoutsos</surname>
<given-names>I</given-names>
</name>
</person-group>
<year>2007</year>
<article-title>Accurate phylogenetic classification of variable-length DNA fragments.</article-title>
<source>Nat Methods</source>
<volume>4</volume>
<fpage>63</fpage>
<lpage>72</lpage>
<pub-id pub-id-type="pmid">17179938</pub-id>
</element-citation>
</ref>
<ref id="pone.0039948-Brady1">
<label>12</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Brady</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Salzberg</surname>
<given-names>SL</given-names>
</name>
</person-group>
<year>2009</year>
<article-title>Phymm and PhymmBL: metagenomic phylogenetic classification with interpolated Markov models.</article-title>
<source>Nat Methods</source>
<volume>6</volume>
<fpage>673</fpage>
<lpage>676</lpage>
<pub-id pub-id-type="pmid">19648916</pub-id>
</element-citation>
</ref>
<ref id="pone.0039948-Saeed1">
<label>13</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Saeed</surname>
<given-names>I</given-names>
</name>
<name>
<surname>Tang</surname>
<given-names>S-L</given-names>
</name>
<name>
<surname>Halgamuge</surname>
<given-names>SK</given-names>
</name>
</person-group>
<year>2011</year>
<article-title>Unsupervised discovery of microbial population structure within metagenomes using nucleotide base composition.</article-title>
<source>Nucleic Acids Res</source>
<volume>40</volume>
<fpage>e34</fpage>
<pub-id pub-id-type="pmid">22180538</pub-id>
</element-citation>
</ref>
<ref id="pone.0039948-Wu2">
<label>14</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Wu</surname>
<given-names>D</given-names>
</name>
<name>
<surname>Hugenholtz</surname>
<given-names>P</given-names>
</name>
<name>
<surname>Mavromatis</surname>
<given-names>K</given-names>
</name>
<name>
<surname>Pukall</surname>
<given-names>R</given-names>
</name>
<name>
<surname>Dalin</surname>
<given-names>E</given-names>
</name>
<etal></etal>
</person-group>
<year>2009</year>
<article-title>A phylogeny-driven genomic encyclopaedia of Bacteria and Archaea.</article-title>
<source>Nature</source>
<volume>462</volume>
<fpage>1056</fpage>
<lpage>1060</lpage>
<pub-id pub-id-type="pmid">20033048</pub-id>
</element-citation>
</ref>
<ref id="pone.0039948-Schloss1">
<label>15</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Schloss</surname>
<given-names>PD</given-names>
</name>
<name>
<surname>Handelsman</surname>
<given-names>J</given-names>
</name>
</person-group>
<year>2004</year>
<article-title>Status of the microbial census.</article-title>
<source>Microbiol Mol Biol Rev</source>
<volume>68</volume>
<fpage>686</fpage>
<lpage>691</lpage>
</element-citation>
</ref>
<ref id="pone.0039948-PedrsAli1">
<label>16</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Pedrós-Alió</surname>
<given-names>C</given-names>
</name>
</person-group>
<year>2006</year>
<article-title>Marine microbial diversity: can it be determined?</article-title>
<source>Trends Microbiol</source>
<volume>14</volume>
<fpage>257</fpage>
<lpage>263</lpage>
<pub-id pub-id-type="pmid">16679014</pub-id>
</element-citation>
</ref>
<ref id="pone.0039948-Sharpton1">
<label>17</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Sharpton</surname>
<given-names>TJ</given-names>
</name>
<name>
<surname>Riesenfeld</surname>
<given-names>SJ</given-names>
</name>
<name>
<surname>Kembel</surname>
<given-names>SW</given-names>
</name>
<name>
<surname>Ladau</surname>
<given-names>J</given-names>
</name>
<name>
<surname>O'Dwyer</surname>
<given-names>JP</given-names>
</name>
<etal></etal>
</person-group>
<year>2011</year>
<article-title>PhylOTU: A High-Throughput Procedure Quantifies Microbial Community Diversity and Resolves Novel Taxa from Metagenomic Data.</article-title>
<source>PLoS Comput Biol</source>
<volume>7</volume>
<fpage>e1001061</fpage>
</element-citation>
</ref>
<ref id="pone.0039948-Rusch1">
<label>18</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Rusch</surname>
<given-names>DB</given-names>
</name>
<name>
<surname>Halpern</surname>
<given-names>AL</given-names>
</name>
<name>
<surname>Sutton</surname>
<given-names>G</given-names>
</name>
<name>
<surname>Heidelberg</surname>
<given-names>KB</given-names>
</name>
<name>
<surname>Williamson</surname>
<given-names>S</given-names>
</name>
<etal></etal>
</person-group>
<year>2007</year>
<article-title>The Sorcerer II Global Ocean Sampling expedition: northwest Atlantic through eastern tropical Pacific.</article-title>
<source>PLoS Biol</source>
<volume>5</volume>
<fpage>e77</fpage>
</element-citation>
</ref>
<ref id="pone.0039948-Miller1">
<label>19</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Miller</surname>
<given-names>CS</given-names>
</name>
<name>
<surname>Baker</surname>
<given-names>BJ</given-names>
</name>
<name>
<surname>Thomas</surname>
<given-names>BC</given-names>
</name>
<name>
<surname>Singer</surname>
<given-names>SW</given-names>
</name>
<name>
<surname>Banfield</surname>
<given-names>JF</given-names>
</name>
</person-group>
<year>2011</year>
<article-title>EMIRGE: Reconstruction of full length ribosomal genes from microbial community short read sequencing data.</article-title>
<source>Genome Biol</source>
<volume>12</volume>
<fpage>R44</fpage>
<pub-id pub-id-type="pmid">21595876</pub-id>
</element-citation>
</ref>
<ref id="pone.0039948-Schloss2">
<label>20</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Schloss</surname>
<given-names>PD</given-names>
</name>
<name>
<surname>Handelsman</surname>
<given-names>J</given-names>
</name>
</person-group>
<year>2005</year>
<article-title>Metagenomics for studying unculturable microorganisms: cutting the Gordian knot.</article-title>
<source>Genome Biol</source>
<volume>6</volume>
<fpage>229</fpage>
<pub-id pub-id-type="pmid">16086859</pub-id>
</element-citation>
</ref>
<ref id="pone.0039948-Margulies1">
<label>21</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Margulies</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Egholm</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Altman</surname>
<given-names>WE</given-names>
</name>
<name>
<surname>Attiya</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Bader</surname>
<given-names>JS</given-names>
</name>
<etal></etal>
</person-group>
<year>2005</year>
<article-title>Genome sequencing in microfabricated high-density picolitre reactors.</article-title>
<source>Nature</source>
<volume>437</volume>
<fpage>376</fpage>
<lpage>380</lpage>
<pub-id pub-id-type="pmid">16056220</pub-id>
</element-citation>
</ref>
<ref id="pone.0039948-Mavromatis1">
<label>22</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Mavromatis</surname>
<given-names>K</given-names>
</name>
<name>
<surname>Ivanova</surname>
<given-names>N</given-names>
</name>
<name>
<surname>Barry</surname>
<given-names>K</given-names>
</name>
<name>
<surname>Shapiro</surname>
<given-names>H</given-names>
</name>
<name>
<surname>Goltsman</surname>
<given-names>E</given-names>
</name>
<etal></etal>
</person-group>
<year>2007</year>
<article-title>Use of simulated data sets to evaluate the fidelity of metagenomic processing methods.</article-title>
<source>Nat Methods</source>
<volume>4</volume>
<fpage>495</fpage>
<lpage>500</lpage>
<pub-id pub-id-type="pmid">17468765</pub-id>
</element-citation>
</ref>
<ref id="pone.0039948-McElroy1">
<label>23</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>McElroy</surname>
<given-names>KE</given-names>
</name>
<name>
<surname>Luciani</surname>
<given-names>F</given-names>
</name>
<name>
<surname>Thomas</surname>
<given-names>T</given-names>
</name>
</person-group>
<year>2012</year>
<article-title>GemSIM: General, Error-Model based SIMulator of next-generation sequencing data.</article-title>
<source>BMC Genomics</source>
<volume>13</volume>
<fpage>74</fpage>
<pub-id pub-id-type="pmid">22336055</pub-id>
</element-citation>
</ref>
<ref id="pone.0039948-Fan1">
<label>24</label>
<element-citation publication-type="other">
<person-group person-group-type="author">
<name>
<surname>Fan</surname>
<given-names>L</given-names>
</name>
<name>
<surname>Reynolds</surname>
<given-names>D</given-names>
</name>
<name>
<surname>Liu</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Stark</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Kjelleberg</surname>
<given-names>S</given-names>
</name>
<etal></etal>
</person-group>
<year>2012</year>
<article-title>Functional equivalence and evolutionary convergence in complex communities of microbial sponge symbionts.</article-title>
<source>Proc Natl Acad Sci U S A</source>
<comment>(In Press).</comment>
</element-citation>
</ref>
<ref id="pone.0039948-Schmieder1">
<label>25</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Schmieder</surname>
<given-names>R</given-names>
</name>
<name>
<surname>Edwards</surname>
<given-names>R</given-names>
</name>
</person-group>
<year>2011</year>
<article-title>Quality control and preprocessing of metagenomic datasets.</article-title>
<source>Bioinformatics</source>
<volume>27</volume>
<fpage>863</fpage>
<lpage>864</lpage>
<pub-id pub-id-type="pmid">21278185</pub-id>
</element-citation>
</ref>
<ref id="pone.0039948-Bengtsson1">
<label>26</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Bengtsson</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Eriksson</surname>
<given-names>KM</given-names>
</name>
<name>
<surname>Hartmann</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Wang</surname>
<given-names>Z</given-names>
</name>
<name>
<surname>Shenoy</surname>
<given-names>BD</given-names>
</name>
<etal></etal>
</person-group>
<year>2011</year>
<article-title>Metaxa: a software tool for automated detection and discrimination among ribosomal small subunit (12S/16S/18S) sequences of archaea, bacteria, eukaryotes, mitochondria, and chloroplasts in metagenomes and environmental sequencing datasets.</article-title>
<source>Antonie Van Leeuwenhoek</source>
<volume>100</volume>
<fpage>471</fpage>
<lpage>475</lpage>
<pub-id pub-id-type="pmid">21674231</pub-id>
</element-citation>
</ref>
<ref id="pone.0039948-Radax1">
<label>27</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Radax</surname>
<given-names>R</given-names>
</name>
<name>
<surname>Rattei</surname>
<given-names>T</given-names>
</name>
<name>
<surname>Lanzen</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Bayer</surname>
<given-names>C</given-names>
</name>
<name>
<surname>Rapp</surname>
<given-names>HT</given-names>
</name>
<etal></etal>
</person-group>
<year>2012</year>
<article-title>Metatranscriptomics of the marine sponge Geodia barretti: tackling phylogeny and function of its microbial community.</article-title>
<source>Environ Microbiol</source>
<volume>14</volume>
<fpage>1308</fpage>
<lpage>1324</lpage>
<pub-id pub-id-type="pmid">22364353</pub-id>
</element-citation>
</ref>
<ref id="pone.0039948-Pruesse1">
<label>28</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Pruesse</surname>
<given-names>E</given-names>
</name>
<name>
<surname>Quast</surname>
<given-names>C</given-names>
</name>
<name>
<surname>Knittel</surname>
<given-names>K</given-names>
</name>
<name>
<surname>Fuchs</surname>
<given-names>BM</given-names>
</name>
<name>
<surname>Ludwig</surname>
<given-names>W</given-names>
</name>
<etal></etal>
</person-group>
<year>2007</year>
<article-title>SILVA: a comprehensive online resource for quality checked and aligned ribosomal RNA sequence data compatible with ARB.</article-title>
<source>Nucleic Acids Res</source>
<volume>35</volume>
<fpage>7188</fpage>
<lpage>7196</lpage>
<pub-id pub-id-type="pmid">17947321</pub-id>
</element-citation>
</ref>
<ref id="pone.0039948-Dowd1">
<label>29</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Dowd</surname>
<given-names>SE</given-names>
</name>
<name>
<surname>Callaway</surname>
<given-names>TR</given-names>
</name>
<name>
<surname>Wolcott</surname>
<given-names>RD</given-names>
</name>
<name>
<surname>Sun</surname>
<given-names>Y</given-names>
</name>
<name>
<surname>McKeehan</surname>
<given-names>T</given-names>
</name>
<etal></etal>
</person-group>
<year>2008</year>
<article-title>Evaluation of the bacterial diversity in the feces of cattle using 16S rDNA bacterial tag-encoded FLX amplicon pyrosequencing (bTEFAP).</article-title>
<source>BMC Microbiol</source>
<volume>8</volume>
<fpage>125</fpage>
<pub-id pub-id-type="pmid">18652685</pub-id>
</element-citation>
</ref>
<ref id="pone.0039948-Schloss3">
<label>30</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Schloss</surname>
<given-names>PD</given-names>
</name>
<name>
<surname>Gevers</surname>
<given-names>D</given-names>
</name>
<name>
<surname>Westcott</surname>
<given-names>SL</given-names>
</name>
</person-group>
<year>2011</year>
<article-title>Reducing the Effects of PCR Amplification and Sequencing Artifacts on 16S rRNA-Based Studies.</article-title>
<source>PLoS One</source>
<volume>6</volume>
<fpage>e27310</fpage>
<pub-id pub-id-type="pmid">22194782</pub-id>
</element-citation>
</ref>
<ref id="pone.0039948-Caporaso1">
<label>31</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Caporaso</surname>
<given-names>JG</given-names>
</name>
<name>
<surname>Kuczynski</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Stombaugh</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Bittinger</surname>
<given-names>K</given-names>
</name>
<name>
<surname>Bushman</surname>
<given-names>FD</given-names>
</name>
<etal></etal>
</person-group>
<year>2010</year>
<article-title>QIIME allows analysis of high-throughput community sequencing data.</article-title>
<source>Nat Methods</source>
<volume>7</volume>
<fpage>335</fpage>
<lpage>336</lpage>
<pub-id pub-id-type="pmid">20383131</pub-id>
</element-citation>
</ref>
<ref id="pone.0039948-Wang1">
<label>32</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Wang</surname>
<given-names>Q</given-names>
</name>
<name>
<surname>Garrity</surname>
<given-names>GM</given-names>
</name>
<name>
<surname>Tiedje</surname>
<given-names>JM</given-names>
</name>
<name>
<surname>Cole</surname>
<given-names>JR</given-names>
</name>
</person-group>
<year>2007</year>
<article-title>Naive Bayesian classifier for rapid assignment of rRNA sequences into the new bacterial taxonomy.</article-title>
<source>Appl Environ Microbiol</source>
<volume>73</volume>
<fpage>5261</fpage>
<lpage>5267</lpage>
<pub-id pub-id-type="pmid">17586664</pub-id>
</element-citation>
</ref>
<ref id="pone.0039948-McDonald1">
<label>33</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>McDonald</surname>
<given-names>D</given-names>
</name>
<name>
<surname>Price</surname>
<given-names>MN</given-names>
</name>
<name>
<surname>Goodrich</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Nawrocki</surname>
<given-names>EP</given-names>
</name>
<name>
<surname>DeSantis</surname>
<given-names>TZ</given-names>
</name>
<etal></etal>
</person-group>
<year>2011</year>
<article-title>An improved Greengenes taxonomy with explicit ranks for ecological and evolutionary analyses of bacteria and archaea.</article-title>
<source>ISME J</source>
<volume>6</volume>
<fpage>610</fpage>
<lpage>618</lpage>
<pub-id pub-id-type="pmid">22134646</pub-id>
</element-citation>
</ref>
<ref id="pone.0039948-Stamatakis1">
<label>34</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Stamatakis</surname>
<given-names>A</given-names>
</name>
</person-group>
<year>2006</year>
<article-title>RAxML-VI-HPC: maximum likelihood-based phylogenetic analyses with thousands of taxa and mixed models.</article-title>
<source>Bioinformatics</source>
<volume>22</volume>
<fpage>2688</fpage>
<lpage>2690</lpage>
<pub-id pub-id-type="pmid">16928733</pub-id>
</element-citation>
</ref>
<ref id="pone.0039948-Talavera1">
<label>35</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Talavera</surname>
<given-names>G</given-names>
</name>
<name>
<surname>Castresana</surname>
<given-names>J</given-names>
</name>
</person-group>
<year>2007</year>
<article-title>Improvement of phylogenies after removing divergent and ambiguously aligned blocks from protein sequence alignments.</article-title>
<source>Syst Biol</source>
<volume>56</volume>
<fpage>564</fpage>
<lpage>577</lpage>
</element-citation>
</ref>
<ref id="pone.0039948-Edgar1">
<label>36</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Edgar</surname>
<given-names>RC</given-names>
</name>
<name>
<surname>Haas</surname>
<given-names>BJ</given-names>
</name>
<name>
<surname>Clemente</surname>
<given-names>JC</given-names>
</name>
<name>
<surname>Quince</surname>
<given-names>C</given-names>
</name>
<name>
<surname>Knight</surname>
<given-names>R</given-names>
</name>
</person-group>
<year>2011</year>
<article-title>UCHIME improves sensitivity and speed of chimera detection.</article-title>
<source>Bioinformatics</source>
<volume>27</volume>
<fpage>2194</fpage>
<lpage>2200</lpage>
<pub-id pub-id-type="pmid">21700674</pub-id>
</element-citation>
</ref>
<ref id="pone.0039948-Ashelford1">
<label>37</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Ashelford</surname>
<given-names>KE</given-names>
</name>
<name>
<surname>Chuzhanova</surname>
<given-names>NA</given-names>
</name>
<name>
<surname>Fry</surname>
<given-names>JC</given-names>
</name>
<name>
<surname>Jones</surname>
<given-names>AJ</given-names>
</name>
<name>
<surname>Weightman</surname>
<given-names>AJ</given-names>
</name>
</person-group>
<year>2005</year>
<article-title>At least 1 in 20 16S rRNA sequence records currently held in public repositories is estimated to contain substantial anomalies.</article-title>
<source>Appl Environ Microbiol</source>
<volume>71</volume>
<fpage>7724</fpage>
<lpage>7736</lpage>
<pub-id pub-id-type="pmid">16332745</pub-id>
</element-citation>
</ref>
<ref id="pone.0039948-Huber1">
<label>38</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Huber</surname>
<given-names>T</given-names>
</name>
<name>
<surname>Faulkner</surname>
<given-names>G</given-names>
</name>
<name>
<surname>Hugenholtz</surname>
<given-names>P</given-names>
</name>
</person-group>
<year>2004</year>
<article-title>Bellerophon: a program to detect chimeric sequences in multiple sequence alignments.</article-title>
<source>Bioinformatics</source>
<volume>20</volume>
<fpage>2317</fpage>
<lpage>2319</lpage>
<pub-id pub-id-type="pmid">15073015</pub-id>
</element-citation>
</ref>
<ref id="pone.0039948-Ashelford2">
<label>39</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Ashelford</surname>
<given-names>KE</given-names>
</name>
<name>
<surname>Chuzhanova</surname>
<given-names>NA</given-names>
</name>
<name>
<surname>Fry</surname>
<given-names>JC</given-names>
</name>
<name>
<surname>Jones</surname>
<given-names>AJ</given-names>
</name>
<name>
<surname>Weightman</surname>
<given-names>AJ</given-names>
</name>
</person-group>
<year>2006</year>
<article-title>New screening software shows that most recent large 16S rRNA gene clone libraries contain chimeras.</article-title>
<source>Appl Environ Microbiol</source>
<volume>72</volume>
<fpage>5734</fpage>
<lpage>5741</lpage>
<pub-id pub-id-type="pmid">16957188</pub-id>
</element-citation>
</ref>
<ref id="pone.0039948-Quince1">
<label>40</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Quince</surname>
<given-names>C</given-names>
</name>
<name>
<surname>Lanzén</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Curtis</surname>
<given-names>TP</given-names>
</name>
<name>
<surname>Davenport</surname>
<given-names>RJ</given-names>
</name>
<name>
<surname>Hall</surname>
<given-names>N</given-names>
</name>
<etal></etal>
</person-group>
<year>2009</year>
<article-title>Accurate determination of microbial diversity from 454 pyrosequencing data.</article-title>
<source>Nat Methods</source>
<volume>6</volume>
<fpage>639</fpage>
<lpage>641</lpage>
<pub-id pub-id-type="pmid">19668203</pub-id>
</element-citation>
</ref>
<ref id="pone.0039948-Taylor1">
<label>41</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Taylor</surname>
<given-names>MW</given-names>
</name>
<name>
<surname>Radax</surname>
<given-names>R</given-names>
</name>
<name>
<surname>Steger</surname>
<given-names>D</given-names>
</name>
<name>
<surname>Wagner</surname>
<given-names>M</given-names>
</name>
</person-group>
<year>2007</year>
<article-title>Sponge-associated microorganisms: evolution, ecology, and biotechnological potential.</article-title>
<source>Microbiol Mol Biol Rev</source>
<volume>71</volume>
<fpage>295</fpage>
<lpage>347</lpage>
</element-citation>
</ref>
<ref id="pone.0039948-Schmitt1">
<label>42</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Schmitt</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Tsai</surname>
<given-names>P</given-names>
</name>
<name>
<surname>Bell</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Fromont</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Ilan</surname>
<given-names>M</given-names>
</name>
<etal></etal>
</person-group>
<year>2012</year>
<article-title>Assessing the complex sponge microbiota: core, variable and species-specific bacterial communities in marine sponges.</article-title>
<source>ISME J</source>
<volume>6</volume>
<fpage>564</fpage>
<lpage>576</lpage>
<pub-id pub-id-type="pmid">21993395</pub-id>
</element-citation>
</ref>
<ref id="pone.0039948-Sogin1">
<label>43</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Sogin</surname>
<given-names>ML</given-names>
</name>
<name>
<surname>Morrison</surname>
<given-names>HG</given-names>
</name>
<name>
<surname>Huber</surname>
<given-names>JA</given-names>
</name>
<name>
<surname>Mark Welch</surname>
<given-names>D</given-names>
</name>
<name>
<surname>Huse</surname>
<given-names>SM</given-names>
</name>
<etal></etal>
</person-group>
<year>2006</year>
<article-title>Microbial diversity in the deep sea and the underexplored “rare biosphere”</article-title>
<source>Proc Natl Acad Sci U S A</source>
<volume>103</volume>
<fpage>12115</fpage>
<lpage>12120</lpage>
</element-citation>
</ref>
<ref id="pone.0039948-Quince2">
<label>44</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Quince</surname>
<given-names>C</given-names>
</name>
<name>
<surname>Lanzen</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Davenport</surname>
<given-names>RJ</given-names>
</name>
<name>
<surname>Turnbaugh</surname>
<given-names>PJ</given-names>
</name>
</person-group>
<year>2011</year>
<article-title>Removing noise from pyrosequenced amplicons.</article-title>
<source>BMC Bioinformatics</source>
<volume>12</volume>
<fpage>38</fpage>
<pub-id pub-id-type="pmid">21276213</pub-id>
</element-citation>
</ref>
<ref id="pone.0039948-Thomas1">
<label>45</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Thomas</surname>
<given-names>T</given-names>
</name>
<name>
<surname>Rusch</surname>
<given-names>D</given-names>
</name>
<name>
<surname>DeMaere</surname>
<given-names>MZ</given-names>
</name>
<name>
<surname>Yung</surname>
<given-names>PY</given-names>
</name>
<name>
<surname>Lewis</surname>
<given-names>M</given-names>
</name>
<etal></etal>
</person-group>
<year>2010</year>
<article-title>Functional genomic signatures of sponge bacteria reveal unique and shared features of symbiosis.</article-title>
<source>ISME J</source>
<volume>4</volume>
<fpage>1557</fpage>
<lpage>1567</lpage>
<pub-id pub-id-type="pmid">20520651</pub-id>
</element-citation>
</ref>
<ref id="pone.0039948-Liu2">
<label>46</label>
<element-citation publication-type="other">
<person-group person-group-type="author">
<name>
<surname>Liu</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Fan</surname>
<given-names>L</given-names>
</name>
<name>
<surname>Zhong</surname>
<given-names>L</given-names>
</name>
<name>
<surname>Kjelleberg</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Thomas</surname>
<given-names>T</given-names>
</name>
</person-group>
<year>2012</year>
<article-title>Metaproteogenomic analysis of a community of sponge symbionts.</article-title>
<source>ISME J</source>
<comment>(doi:10.1038/ismej.2012.1).</comment>
</element-citation>
</ref>
<ref id="pone.0039948-Taylor2">
<label>47</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Taylor</surname>
<given-names>MW</given-names>
</name>
<name>
<surname>Schupp</surname>
<given-names>PJ</given-names>
</name>
<name>
<surname>Dahllöf</surname>
<given-names>I</given-names>
</name>
<name>
<surname>Kjelleberg</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Steinberg</surname>
<given-names>PD</given-names>
</name>
</person-group>
<year>2004</year>
<article-title>Host specificity in marine sponge-associated bacteria, and potential implications for marine microbial diversity.</article-title>
<source>Environ Microbiol</source>
<volume>6</volume>
<fpage>121</fpage>
<lpage>130</lpage>
<pub-id pub-id-type="pmid">14756877</pub-id>
</element-citation>
</ref>
<ref id="pone.0039948-Taylor3">
<label>48</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Taylor</surname>
<given-names>MW</given-names>
</name>
<name>
<surname>Schupp</surname>
<given-names>PJ</given-names>
</name>
<name>
<surname>de Nys</surname>
<given-names>R</given-names>
</name>
<name>
<surname>Kjelleberg</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Steinberg</surname>
<given-names>PD</given-names>
</name>
</person-group>
<year>2005</year>
<article-title>Biogeography of bacteria associated with the marine sponge Cymbastela concentrica.</article-title>
<source>Environ Microbiol</source>
<volume>7</volume>
<fpage>419</fpage>
<lpage>433</lpage>
<pub-id pub-id-type="pmid">15683402</pub-id>
</element-citation>
</ref>
<ref id="pone.0039948-Yung1">
<label>49</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Yung</surname>
<given-names>PY</given-names>
</name>
<name>
<surname>Burke</surname>
<given-names>C</given-names>
</name>
<name>
<surname>Lewis</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Egan</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Kjelleberg</surname>
<given-names>S</given-names>
</name>
<etal></etal>
</person-group>
<year>2009</year>
<article-title>Phylogenetic screening of a bacterial, metagenomic library using homing endonuclease restriction and marker insertion.</article-title>
<source>Nucleic Acids Res</source>
<volume>37</volume>
<fpage>e144</fpage>
<pub-id pub-id-type="pmid">19767618</pub-id>
</element-citation>
</ref>
<ref id="pone.0039948-Simon1">
<label>50</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Simon</surname>
<given-names>C</given-names>
</name>
<name>
<surname>Daniel</surname>
<given-names>R</given-names>
</name>
</person-group>
<year>2011</year>
<article-title>Metagenomic analyses: past and future trends.</article-title>
<source>Appl Environ Microbiol</source>
<volume>77</volume>
<fpage>1153</fpage>
<lpage>1161</lpage>
<pub-id pub-id-type="pmid">21169428</pub-id>
</element-citation>
</ref>
<ref id="pone.0039948-Thomas2">
<label>51</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Thomas</surname>
<given-names>T</given-names>
</name>
<name>
<surname>Gilbert</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Meyer</surname>
<given-names>F</given-names>
</name>
</person-group>
<year>2012</year>
<article-title>Metagenomics - a guide from sampling to data analysis.</article-title>
<source>Microbial Informatics and Experimentation</source>
<volume>2</volume>
<fpage>3</fpage>
<pub-id pub-id-type="pmid">22587947</pub-id>
</element-citation>
</ref>
<ref id="pone.0039948-Peterson1">
<label>52</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Peterson</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Garges</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Giovanni</surname>
<given-names>M</given-names>
</name>
<name>
<surname>McInnes</surname>
<given-names>P</given-names>
</name>
<name>
<surname>Wang</surname>
<given-names>L</given-names>
</name>
<etal></etal>
</person-group>
<year>2009</year>
<article-title>The NIH Human Microbiome Project.</article-title>
<source>Genome Res</source>
<volume>19</volume>
<fpage>2317</fpage>
<lpage>2323</lpage>
<pub-id pub-id-type="pmid">19819907</pub-id>
</element-citation>
</ref>
<ref id="pone.0039948-Siqueira1">
<label>53</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Siqueira</surname>
<given-names>JF</given-names>
</name>
<name>
<surname>Fouad</surname>
<given-names>AF</given-names>
</name>
<name>
<surname>Rôças</surname>
<given-names>IN</given-names>
</name>
</person-group>
<year>2012</year>
<article-title>Pyrosequencing as a tool for better understanding of human microbiomes.</article-title>
<source>J Oral Microbiol</source>
<volume>4</volume>
<fpage>10743</fpage>
</element-citation>
</ref>
</ref-list>
</back>
</pmc>
</record>

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Ticri/CIDE/explor/CyberinfraV1/Data/Pmc/Corpus
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000606 | SxmlIndent | more

Or

HfdSelect -h $EXPLOR_AREA/Data/Pmc/Corpus/biblio.hfd -nk 000606 | SxmlIndent | more

To link to this page within the Wicri network

{{Explor lien
   |wiki=    Ticri/CIDE
   |area=    CyberinfraV1
   |flux=    Pmc
   |étape=   Corpus
   |type=    RBID
   |clé=     PMC:3384625
   |texte=   Reconstruction of Ribosomal RNA Genes from Metagenomic Data
}}

To generate wiki pages

HfdIndexSelect -h $EXPLOR_AREA/Data/Pmc/Corpus/RBID.i   -Sk "pubmed:22761935" \
       | HfdSelect -Kh $EXPLOR_AREA/Data/Pmc/Corpus/biblio.hfd   \
       | NlmPubMed2Wicri -a CyberinfraV1 

Wicri

This area was generated with Dilib version V0.6.25.
Data generation: Thu Oct 27 09:30:58 2016. Site generation: Sun Mar 10 23:08:40 2024