Exploration server on relations between France and Australia


Internal identifier: 0015930 (Pmc/Corpus); previous: 0015929; next: 0015931


The document in XML format

<record>
<TEI>
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en">Sequencing platform and library preparation choices impact viral metagenomes</title>
<author>
<name sortKey="Solonenko, Sergei A" sort="Solonenko, Sergei A" uniqKey="Solonenko S" first="Sergei A" last="Solonenko">Sergei A. Solonenko</name>
<affiliation>
<nlm:aff id="I1">Department of Ecology and Evolutionary Biology, University of Arizona, Tucson, AZ, USA</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Ignacio Espinoza, J Cesar" sort="Ignacio Espinoza, J Cesar" uniqKey="Ignacio Espinoza J" first="J César" last="Ignacio-Espinoza">J César Ignacio-Espinoza</name>
<affiliation>
<nlm:aff id="I2">Department of Molecular and Cellular Biology, University of Arizona, Tucson, AZ, USA</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Alberti, Adriana" sort="Alberti, Adriana" uniqKey="Alberti A" first="Adriana" last="Alberti">Adriana Alberti</name>
<affiliation>
<nlm:aff id="I3">CEA, DSV, IG, Genoscope, 2 rue Gaston Crémieux CP5706, Evry, Cedex, 91057, France</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Cruaud, Corinne" sort="Cruaud, Corinne" uniqKey="Cruaud C" first="Corinne" last="Cruaud">Corinne Cruaud</name>
<affiliation>
<nlm:aff id="I3">CEA, DSV, IG, Genoscope, 2 rue Gaston Crémieux CP5706, Evry, Cedex, 91057, France</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Hallam, Steven" sort="Hallam, Steven" uniqKey="Hallam S" first="Steven" last="Hallam">Steven Hallam</name>
<affiliation>
<nlm:aff id="I4">Department of Microbiology and Immunology, University of British Columbia, Vancouver, BC, Canada</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Konstantinidis, Kostas" sort="Konstantinidis, Kostas" uniqKey="Konstantinidis K" first="Kostas" last="Konstantinidis">Kostas Konstantinidis</name>
<affiliation>
<nlm:aff id="I5">Department of Civil and Environmental Engineering, Georgia Institute of Technology, Atlanta, GA, USA</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Tyson, Gene" sort="Tyson, Gene" uniqKey="Tyson G" first="Gene" last="Tyson">Gene Tyson</name>
<affiliation>
<nlm:aff id="I6">Australian Center for Ecogenomics, University of Queensland, Brisbane, QLD, Australia</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Wincker, Patrick" sort="Wincker, Patrick" uniqKey="Wincker P" first="Patrick" last="Wincker">Patrick Wincker</name>
<affiliation>
<nlm:aff id="I3">CEA, DSV, IG, Genoscope, 2 rue Gaston Crémieux CP5706, Evry, Cedex, 91057, France</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Sullivan, Matthew B" sort="Sullivan, Matthew B" uniqKey="Sullivan M" first="Matthew B" last="Sullivan">Matthew B. Sullivan</name>
<affiliation>
<nlm:aff id="I1">Department of Ecology and Evolutionary Biology, University of Arizona, Tucson, AZ, USA</nlm:aff>
</affiliation>
<affiliation>
<nlm:aff id="I2">Department of Molecular and Cellular Biology, University of Arizona, Tucson, AZ, USA</nlm:aff>
</affiliation>
</author>
</titleStmt>
<publicationStmt>
<idno type="wicri:source">PMC</idno>
<idno type="pmid">23663384</idno>
<idno type="pmc">3655917</idno>
<idno type="url">http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3655917</idno>
<idno type="RBID">PMC:3655917</idno>
<idno type="doi">10.1186/1471-2164-14-320</idno>
<date when="2013">2013</date>
<idno type="wicri:Area/Pmc/Corpus">001593</idno>
<idno type="wicri:explorRef" wicri:stream="Pmc" wicri:step="Corpus" wicri:corpus="PMC">001593</idno>
</publicationStmt>
<sourceDesc>
<biblStruct>
<analytic>
<title xml:lang="en" level="a" type="main">Sequencing platform and library preparation choices impact viral metagenomes</title>
<author>
<name sortKey="Solonenko, Sergei A" sort="Solonenko, Sergei A" uniqKey="Solonenko S" first="Sergei A" last="Solonenko">Sergei A. Solonenko</name>
<affiliation>
<nlm:aff id="I1">Department of Ecology and Evolutionary Biology, University of Arizona, Tucson, AZ, USA</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Ignacio Espinoza, J Cesar" sort="Ignacio Espinoza, J Cesar" uniqKey="Ignacio Espinoza J" first="J César" last="Ignacio-Espinoza">J César Ignacio-Espinoza</name>
<affiliation>
<nlm:aff id="I2">Department of Molecular and Cellular Biology, University of Arizona, Tucson, AZ, USA</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Alberti, Adriana" sort="Alberti, Adriana" uniqKey="Alberti A" first="Adriana" last="Alberti">Adriana Alberti</name>
<affiliation>
<nlm:aff id="I3">CEA, DSV, IG, Genoscope, 2 rue Gaston Crémieux CP5706, Evry, Cedex, 91057, France</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Cruaud, Corinne" sort="Cruaud, Corinne" uniqKey="Cruaud C" first="Corinne" last="Cruaud">Corinne Cruaud</name>
<affiliation>
<nlm:aff id="I3">CEA, DSV, IG, Genoscope, 2 rue Gaston Crémieux CP5706, Evry, Cedex, 91057, France</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Hallam, Steven" sort="Hallam, Steven" uniqKey="Hallam S" first="Steven" last="Hallam">Steven Hallam</name>
<affiliation>
<nlm:aff id="I4">Department of Microbiology and Immunology, University of British Columbia, Vancouver, BC, Canada</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Konstantinidis, Kostas" sort="Konstantinidis, Kostas" uniqKey="Konstantinidis K" first="Kostas" last="Konstantinidis">Kostas Konstantinidis</name>
<affiliation>
<nlm:aff id="I5">Department of Civil and Environmental Engineering, Georgia Institute of Technology, Atlanta, GA, USA</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Tyson, Gene" sort="Tyson, Gene" uniqKey="Tyson G" first="Gene" last="Tyson">Gene Tyson</name>
<affiliation>
<nlm:aff id="I6">Australian Center for Ecogenomics, University of Queensland, Brisbane, QLD, Australia</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Wincker, Patrick" sort="Wincker, Patrick" uniqKey="Wincker P" first="Patrick" last="Wincker">Patrick Wincker</name>
<affiliation>
<nlm:aff id="I3">CEA, DSV, IG, Genoscope, 2 rue Gaston Crémieux CP5706, Evry, Cedex, 91057, France</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Sullivan, Matthew B" sort="Sullivan, Matthew B" uniqKey="Sullivan M" first="Matthew B" last="Sullivan">Matthew B. Sullivan</name>
<affiliation>
<nlm:aff id="I1">Department of Ecology and Evolutionary Biology, University of Arizona, Tucson, AZ, USA</nlm:aff>
</affiliation>
<affiliation>
<nlm:aff id="I2">Department of Molecular and Cellular Biology, University of Arizona, Tucson, AZ, USA</nlm:aff>
</affiliation>
</author>
</analytic>
<series>
<title level="j">BMC Genomics</title>
<idno type="eISSN">1471-2164</idno>
<imprint>
<date when="2013">2013</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
</fileDesc>
<profileDesc>
<textClass></textClass>
</profileDesc>
</teiHeader>
<front>
<div type="abstract" xml:lang="en">
<sec>
<title>Background</title>
<p>Microbes drive the biogeochemistry that fuels the planet. Microbial viruses modulate their hosts directly through mortality and horizontal gene transfer, and indirectly by re-programming host metabolisms during infection. However, our ability to study these virus-host interactions is limited by methods that are low-throughput and heavily reliant upon the subset of organisms that are in culture. One way forward is to use culture-independent metagenomic approaches, but these novel methods are rarely rigorously tested, especially for studies of environmental viruses, air microbiomes, extreme environment microbiology and other areas with constrained sample amounts. Here we perform replicated experiments to evaluate Roche 454, Illumina HiSeq, and Ion Torrent PGM sequencing and library preparation protocols on virus metagenomes generated from as little as 10 pg of DNA.</p>
</sec>
<sec>
<title>Results</title>
<p>Using %G + C content to compare metagenomes, we find that (i) metagenomes are highly replicable, (ii) some treatment effects are minimal, e.g., sequencing technology choice has 6-fold less impact than varying input DNA amount, and (iii) when restricted to a limited DNA concentration (&lt;1 μg), changing the amount of amplification produces little variation. These trends were also observed when examining the metagenomes for gene function and assembly performance, although the latter was more closely aligned with sequencing effort and read length than with the preparation steps tested. Among Illumina library preparation options, transposon-based libraries diverged from all others, and adaptor ligation was a critical step for optimizing sequencing yields.</p>
</sec>
<sec>
<title>Conclusions</title>
<p>These data guide researchers in generating systematic, comparative datasets to understand complex ecosystems, and suggest that neither varied amplification nor sequencing platforms will deter such efforts.</p>
</sec>
</div>
</front>
</TEI>
<pmc article-type="research-article" xml:lang="en">
<pmc-dir>properties open_access</pmc-dir>
<front>
<journal-meta>
<journal-id journal-id-type="nlm-ta">BMC Genomics</journal-id>
<journal-id journal-id-type="iso-abbrev">BMC Genomics</journal-id>
<journal-title-group>
<journal-title>BMC Genomics</journal-title>
</journal-title-group>
<issn pub-type="epub">1471-2164</issn>
<publisher>
<publisher-name>BioMed Central</publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="pmid">23663384</article-id>
<article-id pub-id-type="pmc">3655917</article-id>
<article-id pub-id-type="publisher-id">1471-2164-14-320</article-id>
<article-id pub-id-type="doi">10.1186/1471-2164-14-320</article-id>
<article-categories>
<subj-group subj-group-type="heading">
<subject>Research Article</subject>
</subj-group>
</article-categories>
<title-group>
<article-title>Sequencing platform and library preparation choices impact viral metagenomes</article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author" id="A1">
<name>
<surname>Solonenko</surname>
<given-names>Sergei A</given-names>
</name>
<xref ref-type="aff" rid="I1">1</xref>
<email>ssolonen@email.arizona.edu</email>
</contrib>
<contrib contrib-type="author" id="A2">
<name>
<surname>Ignacio-Espinoza</surname>
<given-names>J César</given-names>
</name>
<xref ref-type="aff" rid="I2">2</xref>
<email>ignacioe@email.arizona.edu</email>
</contrib>
<contrib contrib-type="author" id="A3">
<name>
<surname>Alberti</surname>
<given-names>Adriana</given-names>
</name>
<xref ref-type="aff" rid="I3">3</xref>
<email>aalberti@genoscope.cns.fr</email>
</contrib>
<contrib contrib-type="author" id="A4">
<name>
<surname>Cruaud</surname>
<given-names>Corinne</given-names>
</name>
<xref ref-type="aff" rid="I3">3</xref>
<email>ccruaud@genoscope.cns.fr</email>
</contrib>
<contrib contrib-type="author" id="A5">
<name>
<surname>Hallam</surname>
<given-names>Steven</given-names>
</name>
<xref ref-type="aff" rid="I4">4</xref>
<email>shallam@interchange.ubc.ca</email>
</contrib>
<contrib contrib-type="author" id="A6">
<name>
<surname>Konstantinidis</surname>
<given-names>Kostas</given-names>
</name>
<xref ref-type="aff" rid="I5">5</xref>
<email>kostas.konstantinidis@ce.gatech.edu</email>
</contrib>
<contrib contrib-type="author" id="A7">
<name>
<surname>Tyson</surname>
<given-names>Gene</given-names>
</name>
<xref ref-type="aff" rid="I6">6</xref>
<email>g.tyson@awmc.uq.edu.au</email>
</contrib>
<contrib contrib-type="author" id="A8">
<name>
<surname>Wincker</surname>
<given-names>Patrick</given-names>
</name>
<xref ref-type="aff" rid="I3">3</xref>
<email>pwincker@genoscope.cns.fr</email>
</contrib>
<contrib contrib-type="author" corresp="yes" id="A9">
<name>
<surname>Sullivan</surname>
<given-names>Matthew B</given-names>
</name>
<xref ref-type="aff" rid="I1">1</xref>
<xref ref-type="aff" rid="I2">2</xref>
<email>mbsulli@email.arizona.edu</email>
</contrib>
</contrib-group>
<aff id="I1">
<label>1</label>
Department of Ecology and Evolutionary Biology, University of Arizona, Tucson, AZ, USA</aff>
<aff id="I2">
<label>2</label>
Department of Molecular and Cellular Biology, University of Arizona, Tucson, AZ, USA</aff>
<aff id="I3">
<label>3</label>
CEA, DSV, IG, Genoscope, 2 rue Gaston Crémieux CP5706, Evry, Cedex, 91057, France</aff>
<aff id="I4">
<label>4</label>
Department of Microbiology and Immunology, University of British Columbia, Vancouver, BC, Canada</aff>
<aff id="I5">
<label>5</label>
Department of Civil and Environmental Engineering, Georgia Institute of Technology, Atlanta, GA, USA</aff>
<aff id="I6">
<label>6</label>
Australian Center for Ecogenomics, University of Queensland, Brisbane, QLD, Australia</aff>
<pub-date pub-type="collection">
<year>2013</year>
</pub-date>
<pub-date pub-type="epub">
<day>10</day>
<month>5</month>
<year>2013</year>
</pub-date>
<volume>14</volume>
<fpage>320</fpage>
<lpage>320</lpage>
<history>
<date date-type="received">
<day>4</day>
<month>2</month>
<year>2013</year>
</date>
<date date-type="accepted">
<day>2</day>
<month>5</month>
<year>2013</year>
</date>
</history>
<permissions>
<copyright-statement>Copyright © 2013 Solonenko et al.; licensee BioMed Central Ltd.</copyright-statement>
<copyright-year>2013</copyright-year>
<copyright-holder>Solonenko et al.; licensee BioMed Central Ltd.</copyright-holder>
<license license-type="open-access" xlink:href="http://creativecommons.org/licenses/by/2.0">
<license-p>This is an Open Access article distributed under the terms of the Creative Commons Attribution License (
<ext-link ext-link-type="uri" xlink:href="http://creativecommons.org/licenses/by/2.0">http://creativecommons.org/licenses/by/2.0</ext-link>
), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</license-p>
</license>
</permissions>
<self-uri xlink:href="http://www.biomedcentral.com/1471-2164/14/320"></self-uri>
<abstract>
<sec>
<title>Background</title>
<p>Microbes drive the biogeochemistry that fuels the planet. Microbial viruses modulate their hosts directly through mortality and horizontal gene transfer, and indirectly by re-programming host metabolisms during infection. However, our ability to study these virus-host interactions is limited by methods that are low-throughput and heavily reliant upon the subset of organisms that are in culture. One way forward is to use culture-independent metagenomic approaches, but these novel methods are rarely rigorously tested, especially for studies of environmental viruses, air microbiomes, extreme environment microbiology and other areas with constrained sample amounts. Here we perform replicated experiments to evaluate Roche 454, Illumina HiSeq, and Ion Torrent PGM sequencing and library preparation protocols on virus metagenomes generated from as little as 10 pg of DNA.</p>
</sec>
<sec>
<title>Results</title>
<p>Using %G + C content to compare metagenomes, we find that (i) metagenomes are highly replicable, (ii) some treatment effects are minimal, e.g., sequencing technology choice has 6-fold less impact than varying input DNA amount, and (iii) when restricted to a limited DNA concentration (&lt;1 μg), changing the amount of amplification produces little variation. These trends were also observed when examining the metagenomes for gene function and assembly performance, although the latter more closely aligned to sequencing effort and read length than to the preparation steps tested. Among Illumina library preparation options, transposon-based libraries diverged from all others and adaptor ligation was a critical step for optimizing sequencing yields.</p>
</sec>
<sec>
<title>Conclusions</title>
<p>These data guide researchers in generating systematic, comparative datasets to understand complex ecosystems, and suggest that neither varied amplification nor sequencing platforms will deter such efforts.</p>
</sec>
</abstract>
</article-meta>
</front>
<body>
<sec>
<title>Background</title>
<p>Advances in sequencing technologies have revolutionized the life sciences. For example, ecology and evolution can now be examined across the tree of life [
<xref ref-type="bibr" rid="B1">1</xref>
], and at resolutions ranging from broad analyses (e.g., BGI’s 10,000 Microbial Genomes Project,
<ext-link ext-link-type="uri" xlink:href="http://ldl.genomics.cn/page/M-research.jsp">http://ldl.genomics.cn/page/M-research.jsp</ext-link>
) to focused investigation of population structure within particular species [
<xref ref-type="bibr" rid="B2">2</xref>
]. These analyses, however, center on genomes as the unit of interest and represent a “bottom-up approach” to exploring the diversity of life [
<xref ref-type="bibr" rid="B3">3</xref>
].</p>
<p>Concurrently, metagenomics provides a “top-down approach” for studying complex microbial assemblages in nature [
<xref ref-type="bibr" rid="B3">3</xref>
]. Recent reviews cover next generation sequencing applications [
<xref ref-type="bibr" rid="B4">4</xref>
-
<xref ref-type="bibr" rid="B6">6</xref>
], but rarely acknowledge the factors that generate quantitative data needed for metagenomics. For example, sequence quality evaluated across benchtop systems did not consider library preparation [
<xref ref-type="bibr" rid="B7">7</xref>
], and recommendations of amplification-free protocols that require >2 μg of DNA to minimize biases [
<xref ref-type="bibr" rid="B8">8</xref>
] are not meaningful for DNA-limited applications. There are also numerous sequencing platform options, though microbial metagenomes generated across commonly-used sequencing platforms only minimally differ in taxonomic distributions or contig assembly quality [
<xref ref-type="bibr" rid="B9">9</xref>
].</p>
<p>Some fields, such as viral ecology or microbial ecology of permafrost soils or the atmosphere, are routinely DNA-limited (&lt;1 ng) and thus require optimization and quantitation assessment at each step in the metagenomic sample-to-sequence pipeline [
<xref ref-type="bibr" rid="B10">10</xref>
]. Towards this end, empirical data are now available to guide researchers in concentrating and purifying viruses [
<xref ref-type="bibr" rid="B11">11</xref>
,
<xref ref-type="bibr" rid="B12">12</xref>
] prior to DNA extraction. Once DNA is extracted, small yields require amplification to obtain enough material for sequencing. While whole genome amplification was an attractive option, it is now documented to result in non-quantitative metagenomes due to both stochastic [
<xref ref-type="bibr" rid="B13">13</xref>
] and systematic biases [
<xref ref-type="bibr" rid="B14">14</xref>
]. In contrast, linker-amplification-based libraries [
<xref ref-type="bibr" rid="B15">15</xref>
-
<xref ref-type="bibr" rid="B17">17</xref>
] provide a nearly quantitative alternative, even from sub-nanogram amounts of DNA [
<xref ref-type="bibr" rid="B15">15</xref>
]. Together these advances allowed the compilation of the first large-scale, systematically prepared comparative metagenomic dataset for quantitative viral ecology [
<xref ref-type="bibr" rid="B18">18</xref>
] with new tools and analytical platforms now emerging to handle such datasets [
<xref ref-type="bibr" rid="B19">19</xref>
,
<xref ref-type="bibr" rid="B20">20</xref>
]. Beyond viral ecology, these studies provide a roadmap for generating quantitative metagenomic datasets from any low (&lt;100 ng) input DNA samples.</p>
<p>Here we expand upon these efforts to focus on the final steps in viral metagenomic sequencing (overview in Figure 
<xref ref-type="fig" rid="F1">1</xref>
, and sequencing statistics summarized in Table 
<xref ref-type="table" rid="T1">1</xref>
). The first experiment evaluates the effects of co-varied input DNA and amplification cycle amounts, as well as sequencing platform choice, on the resulting metagenomes. These data were derived from DNA extracted from a 1,080 L Biosphere 2 Ocean viral concentrate and included small-insert metagenomes prepared from varied low-input DNA amounts (10 pg–100 ng) and amplification conditions for commonly used sequencing platforms (Illumina HiSeq2000, herein ‘Illumina’ and Roche 454 Titanium, herein ‘454’). Additionally, these low-input samples were complemented by standard input DNA (≥1,000 ng) small-insert metagenomes to compare three sequencing platforms (Illumina, 454, Ion Torrent) and limited large-insert clone library Sanger end-sequencing (8,000 ng fosmid library). The second experiment focuses on Illumina sequencing only. Here, viral DNA derived from two separate ocean samples (Tara Oceans [
<xref ref-type="bibr" rid="B21">21</xref>
] stations 41 and 109) was used to examine the effect of amplification conditions (e.g., cycle number) and input DNA amount independently, as well as to compare standard Illumina libraries to transposon-based Nextera libraries [
<xref ref-type="bibr" rid="B22">22</xref>
].</p>
<fig id="F1" position="float">
<label>Figure 1</label>
<caption>
<p>
<bold>Experimental design overview.</bold>
Library preparation treatments were done at varying levels of replication, as indicated by the numbers (1 to 3) next to each treatment. The number of amplification cycles (see y axis) includes those necessary to generate enough DNA for library preparation, but does not include the emPCR (454, Ion Torrent) or bridge (Illumina) amplification cycles used to build large enough populations of reads for nucleotide sequencing signal detection.</p>
</caption>
<graphic xlink:href="1471-2164-14-320-1"></graphic>
</fig>
<table-wrap position="float" id="T1">
<label>Table 1</label>
<caption>
<p>Summary statistics for all metagenomic libraries used in analysis</p>
</caption>
<table frame="hsides" rules="groups" border="1">
<colgroup>
<col align="left"></col>
<col align="left"></col>
<col align="left"></col>
<col align="left"></col>
<col align="left"></col>
<col align="left"></col>
<col align="left"></col>
<col align="left"></col>
<col align="left"></col>
<col align="left"></col>
</colgroup>
<thead valign="top">
<tr>
<th align="left"> </th>
<th align="left">
<bold>DNA source</bold>
</th>
<th align="left">
<bold>Technology</bold>
</th>
<th align="left">
<bold>Starting DNA (ng)</bold>
</th>
<th align="left">
<bold>Library amplification (# cycles)</bold>
</th>
<th align="left">
<bold>Replicates</bold>
</th>
<th align="left">
<bold>Raw reads (millions)</bold>
</th>
<th align="left">
<bold>Raw quality +/-SD (PHRED)</bold>
</th>
<th align="left">
<bold>Raw length (bp)</bold>
</th>
<th align="left">
<bold>Failed QC +/- SD (%)</bold>
</th>
</tr>
</thead>
<tbody>
<tr>
<td rowspan="9" align="left" valign="top">
<bold>
<italic>Experiment 1</italic>
</bold>
<hr></hr>
</td>
<td rowspan="9" align="left" valign="top">Biosphere 2 Ocean
<hr></hr>
</td>
<td rowspan="4" align="left" valign="top">Illumina HiSeq 2000
<hr></hr>
</td>
<td align="left" valign="bottom">1,000
<hr></hr>
</td>
<td align="left" valign="bottom">14
<hr></hr>
</td>
<td align="left" valign="bottom">2
<hr></hr>
</td>
<td align="left" valign="bottom">65.5, 51.8
<hr></hr>
</td>
<td align="left" valign="bottom">34.2 +/- 0.0
<hr></hr>
</td>
<td align="left" valign="bottom">100 PE
<hr></hr>
</td>
<td align="left" valign="bottom">29.9 +/- 0.5
<hr></hr>
</td>
</tr>
<tr>
<td align="left" valign="bottom">100
<hr></hr>
</td>
<td align="left" valign="bottom">14
<hr></hr>
</td>
<td align="left" valign="bottom">2
<hr></hr>
</td>
<td align="left" valign="bottom">6.7, 0.3
<hr></hr>
</td>
<td align="left" valign="bottom">33.8 +/- 0.2
<hr></hr>
</td>
<td align="left" valign="bottom">100 PE
<hr></hr>
</td>
<td align="left" valign="bottom">28.3 +/- 0.2 *
<hr></hr>
</td>
</tr>
<tr>
<td align="left" valign="bottom">10
<hr></hr>
</td>
<td align="left" valign="bottom">18
<hr></hr>
</td>
<td align="left" valign="bottom">2
<hr></hr>
</td>
<td align="left" valign="bottom">2.5, 0
<hr></hr>
</td>
<td align="left" valign="bottom">32
<hr></hr>
</td>
<td align="left" valign="bottom">100 PE
<hr></hr>
</td>
<td align="left" valign="bottom">31.9 *
<hr></hr>
</td>
</tr>
<tr>
<td align="left" valign="bottom">1
<hr></hr>
</td>
<td align="left" valign="bottom">18
<hr></hr>
</td>
<td align="left" valign="bottom">2
<hr></hr>
</td>
<td align="left" valign="bottom">0, 0
<hr></hr>
</td>
<td align="left" valign="bottom">0
<hr></hr>
</td>
<td align="left" valign="bottom">0
<hr></hr>
</td>
<td align="left" valign="bottom">0
<hr></hr>
</td>
</tr>
<tr>
<td rowspan="3" align="left" valign="top">Roche 454 GS FLX
<hr></hr>
</td>
<td align="left" valign="bottom">1,500
<hr></hr>
</td>
<td align="left" valign="bottom">0
<hr></hr>
</td>
<td align="left" valign="bottom">2
<hr></hr>
</td>
<td align="left" valign="bottom">0.30, 0.38
<hr></hr>
</td>
<td align="left" valign="bottom">32.5 +/- 0.7
<hr></hr>
</td>
<td align="left" valign="bottom">408 +/- 11
<hr></hr>
</td>
<td align="left" valign="bottom">15.4 +/- 0.4
<hr></hr>
</td>
</tr>
<tr>
<td align="left" valign="bottom">10
<hr></hr>
</td>
<td align="left" valign="bottom">15 (LA)
<hr></hr>
</td>
<td align="left" valign="bottom">3
<hr></hr>
</td>
<td rowspan="2" align="left" valign="top">0.91, 0.90, 0.85
<hr></hr>
</td>
<td rowspan="2" align="left" valign="top">32.8 +/- 0.8
<hr></hr>
</td>
<td rowspan="2" align="left" valign="top">377 +/- 15
<hr></hr>
</td>
<td rowspan="2" align="left" valign="top">31.5 +/- 4.0
<hr></hr>
</td>
</tr>
<tr>
<td align="left" valign="bottom">0.01
<hr></hr>
</td>
<td align="left" valign="bottom">25 (LA)
<hr></hr>
</td>
<td align="left" valign="bottom">3
<hr></hr>
</td>
</tr>
<tr>
<td align="left" valign="bottom">Ion Torrent PGM 316 chip
<hr></hr>
</td>
<td align="left" valign="bottom">1,000
<hr></hr>
</td>
<td align="left" valign="bottom">5
<hr></hr>
</td>
<td align="left" valign="bottom">2
<hr></hr>
</td>
<td align="left" valign="bottom">2.3, 2.4
<hr></hr>
</td>
<td align="left" valign="bottom">16.3 +/- 0.2
<hr></hr>
</td>
<td align="left" valign="bottom">105 +/- 5
<hr></hr>
</td>
<td align="left" valign="bottom">40.3 +/- 7.6
<hr></hr>
</td>
</tr>
<tr>
<td align="left" valign="bottom">ABI 3730xl
<hr></hr>
</td>
<td align="left" valign="bottom">8,000
<hr></hr>
</td>
<td align="left" valign="bottom">0
<hr></hr>
</td>
<td align="left" valign="bottom">1
<hr></hr>
</td>
<td align="left" valign="bottom">0.7
<hr></hr>
</td>
<td align="left" valign="bottom">44.6
<hr></hr>
</td>
<td align="left" valign="bottom">603
<hr></hr>
</td>
<td align="left" valign="bottom">7.9
<hr></hr>
</td>
</tr>
<tr>
<td rowspan="8" align="left" valign="top">
<bold>
<italic>Experiment 2</italic>
</bold>
</td>
<td rowspan="4" align="left" valign="top">Tara Oceans Station 41
<hr></hr>
</td>
<td rowspan="4" align="left" valign="top">Illumina HiSeq 2000
<hr></hr>
</td>
<td align="left" valign="bottom">10
<hr></hr>
</td>
<td align="left" valign="bottom">9 (N)
<hr></hr>
</td>
<td align="left" valign="bottom">1
<hr></hr>
</td>
<td align="left" valign="bottom">20.3
<hr></hr>
</td>
<td align="left" valign="bottom">34.8
<hr></hr>
</td>
<td align="left" valign="bottom">101 PE
<hr></hr>
</td>
<td align="left" valign="bottom">36.3
<hr></hr>
</td>
</tr>
<tr>
<td align="left" valign="bottom">10
<hr></hr>
</td>
<td align="left" valign="bottom">12
<hr></hr>
</td>
<td align="left" valign="bottom">2
<hr></hr>
</td>
<td align="left" valign="bottom">18.6, 31.3
<hr></hr>
</td>
<td align="left" valign="bottom">34.2 +/- 0.2
<hr></hr>
</td>
<td align="left" valign="bottom">101 PE
<hr></hr>
</td>
<td align="left" valign="bottom">36.2 +/- 0.9
<hr></hr>
</td>
</tr>
<tr>
<td align="left" valign="bottom">10
<hr></hr>
</td>
<td align="left" valign="bottom">15
<hr></hr>
</td>
<td align="left" valign="bottom">1
<hr></hr>
</td>
<td align="left" valign="bottom">15.4
<hr></hr>
</td>
<td align="left" valign="bottom">34.3
<hr></hr>
</td>
<td align="left" valign="bottom">101 PE
<hr></hr>
</td>
<td align="left" valign="bottom">35.7
<hr></hr>
</td>
</tr>
<tr>
<td align="left" valign="bottom">100
<hr></hr>
</td>
<td align="left" valign="bottom">12
<hr></hr>
</td>
<td align="left" valign="bottom">1
<hr></hr>
</td>
<td align="left" valign="bottom">17.7
<hr></hr>
</td>
<td align="left" valign="bottom">34.6
<hr></hr>
</td>
<td align="left" valign="bottom">101 PE
<hr></hr>
</td>
<td align="left" valign="bottom">35.0
<hr></hr>
</td>
</tr>
<tr>
<td rowspan="4" align="left" valign="top">Tara Oceans Station 109</td>
<td rowspan="4" align="left" valign="top">Illumina HiSeq 2000</td>
<td align="left" valign="bottom">10
<hr></hr>
</td>
<td align="left" valign="bottom">9 (N)
<hr></hr>
</td>
<td align="left" valign="bottom">1
<hr></hr>
</td>
<td align="left" valign="bottom">2.6
<hr></hr>
</td>
<td align="left" valign="bottom">34.9
<hr></hr>
</td>
<td align="left" valign="bottom">101 PE
<hr></hr>
</td>
<td align="left" valign="bottom">35.4
<hr></hr>
</td>
</tr>
<tr>
<td align="left" valign="bottom">10
<hr></hr>
</td>
<td align="left" valign="bottom">12
<hr></hr>
</td>
<td align="left" valign="bottom">1
<hr></hr>
</td>
<td align="left" valign="bottom">20.4
<hr></hr>
</td>
<td align="left" valign="bottom">34.9
<hr></hr>
</td>
<td align="left" valign="bottom">101 PE
<hr></hr>
</td>
<td align="left" valign="bottom">34.3
<hr></hr>
</td>
</tr>
<tr>
<td align="left" valign="bottom">10
<hr></hr>
</td>
<td align="left" valign="bottom">15
<hr></hr>
</td>
<td align="left" valign="bottom">2
<hr></hr>
</td>
<td align="left" valign="bottom">28.6, 16.2
<hr></hr>
</td>
<td align="left" valign="bottom">34.4 +/- 0.5
<hr></hr>
</td>
<td align="left" valign="bottom">101 PE
<hr></hr>
</td>
<td align="left" valign="bottom">33.6 +/- 0.6
<hr></hr>
</td>
</tr>
<tr>
<td align="left">100</td>
<td align="left">12</td>
<td align="left">1</td>
<td align="left">16.7</td>
<td align="left">34.8</td>
<td align="left">101 PE</td>
<td align="left">34.3</td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<p>Starting DNA refers to the amount of pre-size selection DNA used in library construction; Library amplification abbreviations are LA = linker amplification and N = Nextera; Raw quality scores reported are PHRED scores; Raw length ‘PE’ denotes paired end reads. * denotes that the successful 10 ng library and one of the 100 ng libraries had an additional 40% of QC-passed reads lost due to removal of TruSeq adaptor sequence contaminants.</p>
</table-wrap-foot>
</table-wrap>
</sec>
<sec sec-type="results">
<title>Results</title>
<sec>
<title>Experiment 1: The impact of input DNA, amplification, and sequencing platform on metagenomes</title>
<sec>
<title>Library success varies by sequencing protocol</title>
<p>As expected, the fosmid library and all 6 libraries made from ≥1,000 ng DNA were successful in generating sufficient DNA for sequencing regardless of sequencing platform (Table 
<xref ref-type="table" rid="T1">1</xref>
). Additionally, low DNA input libraries for 454 (linker-amplified [
<xref ref-type="bibr" rid="B15">15</xref>
] to obtain sufficient genetic material) were all successful, with the highest read yields per ng of input DNA of any method (Additional file
<xref ref-type="supplementary-material" rid="S1">1</xref>
: Figure S1).</p>
<p>In contrast, Experiment 1 Illumina libraries constructed from low starting DNA amounts were less successful (Table 
<xref ref-type="table" rid="T1">1</xref>
). Specifically, 3 of 6 libraries, one 10 ng and both 1 ng libraries, failed library construction, even with the addition of carrier DNA and adaptor concentration adjustment to increase ligation efficiencies. The remaining 3 low input DNA libraries, one 10 ng and two 100 ng, were sequenced, but yielded fewer and more variable numbers of reads, and two of them contained abundant adaptor sequence (see * in Table 
<xref ref-type="table" rid="T1">1</xref>
).</p>
</sec>
<sec>
<title>%G + C content variation within treatments is minimal</title>
<p>The replicates’ read %G + C distributions were correlated using the Pearson product–moment correlation coefficient (Pearson’s r). There is little variation in %G + C across replicate libraries from any 454, Illumina, or Ion Torrent sequencing data – replicates have pairwise correlation values from 0.99 to 1 and cluster together >94% of the time (Figure 
<xref ref-type="fig" rid="F2">2</xref>
). This indicates that, at least for the range of %G + C in this Biosphere 2 Ocean (B2O) sample, intra-replicate variation is minimal and therefore there is high power to detect statistically significant differences across treatments.</p>
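<p>The replicate comparison described above can be sketched as follows. This is a minimal illustration and not the authors' pipeline: it assumes reads are available as plain strings, bins read %G + C into 50 equal-width bins (the bin count reported for the %G + C analysis), and uses hypothetical function names and toy reads.</p>
<preformat>
```python
import math


def gc_histogram(reads, bins=50):
    """Return the fraction of reads in each equal-width %G+C bin."""
    counts = [0] * bins
    for read in reads:
        seq = read.upper()
        if not seq:
            continue  # skip empty reads
        gc = (seq.count("G") + seq.count("C")) / len(seq)
        # A read that is 100% G+C would index past the end, so clamp it.
        counts[min(int(gc * bins), bins - 1)] += 1
    total = sum(counts)
    return [c / total for c in counts] if total else counts


def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length sequences.

    Raises ZeroDivisionError if either input has zero variance.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Toy replicates (hypothetical): same %G+C profile, different sequences.
replicate_a = ["ATGC", "GGCC", "ATAT", "ATGG"]
replicate_b = ["ATGC", "GGCC", "ATAT", "TTGC"]
r = pearson_r(gc_histogram(replicate_a), gc_histogram(replicate_b))
print(round(r, 3))
```
</preformat>
<p>In this toy case the two histograms are identical, so r is 1 up to floating-point rounding; real replicates would be compared pairwise in the same way before clustering.</p>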
<fig id="F2" position="float">
<label>Figure 2</label>
<caption>
<p>
<bold>%G + C and duplication plots for Experiment 1 metagenomes.</bold>
Heatmap coloring indicates the relative pairwise correlations (Pearson’s r) in the %G + C distributions (red-to-yellow) and duplicates (blue-to-green) where red and blue colors indicate the lowest levels of correlation, while white represents highly correlated data. The %G + C distribution correlations were UPGMA clustered with 100 bootstrap runs to indicate statistical support (only >60% support shown). Abbreviations are as follows: “Tech” is sequencing technology represented by 4 (454), T (Ion Torrent), I (Illumina), S (Sanger); “Pair” is the forward or reverse paired end sequence data; “Rep” is the arbitrarily labeled replicate ranging from two (
<bold>A</bold>
and
<bold>B</bold>
) to three (
<bold>A</bold>
,
<bold>B</bold>
, or
<bold>C</bold>
); “ng” is the nanograms of input DNA from which the viral metagenome was derived. The most reliable estimate of the true %G + C distribution is provided by the unamplified 454 metagenomes. Relative to these, fosmid end sequences generated using Sanger sequencing were the most shifted toward high %G + C, while problematic &lt;1,000 ng input DNA metagenomes were less shifted toward high %G + C, and reliable 1,000 ng Illumina metagenomes were only slightly shifted toward high %G + C.</p>
</caption>
<graphic xlink:href="1471-2164-14-320-2"></graphic>
</fig>
</sec>
<sec>
<title>Input DNA amount, decision to amplify impact %G + C content</title>
<p>Hierarchical clustering of sample %G + C distribution correlations shows consistent differences. First, all ≥1,000 ng metagenomes cluster together 100% of the time (Figure 
<xref ref-type="fig" rid="F2">2</xref>
). Of the treatments tested, input DNA most strongly impacts the resulting metagenomes, with ≥1,000 ng next-generation metagenomes clearly separated from the rest. Among these ≥1,000 ng samples, Illumina metagenomes have higher %G + C than 454 and Ion Torrent metagenomes (see Additional file
<xref ref-type="supplementary-material" rid="S1">1</xref>
: Figure S2 for example %G + C distribution plots), but differences between sequencing platforms are much less than differences between DNA inputs, with UPGMA branch length distances of 0.02 and 0.16, respectively (Figure 
<xref ref-type="fig" rid="F2">2</xref>
). Although sampling was limited, the largest shift towards higher %G + C sequences (Pearson’s r &lt;0.8) was in the fosmid library relative to the unamplified libraries (Figure 
<xref ref-type="fig" rid="F2">2</xref>
, Additional file
<xref ref-type="supplementary-material" rid="S1">1</xref>
: Figure S3).</p>
<p>Among the &lt;1,000 ng metagenomes, there are minimal differences between platforms and the only supported relationship, with bootstraps greater than the intra-replicate 94% value, was the clustering of Illumina 100 ng samples with Illumina 10 ng samples (Figure 
<xref ref-type="fig" rid="F2">2</xref>
). This implies that, among amplified metagenomes, the degree of amplification and sequencing platform choice only minimally impact the resulting metagenomes. The fact that these diversely prepared metagenomes were nearly indistinguishable by %G + C distribution metrics (Pearson’s r values >0.99, Figure 
<xref ref-type="fig" rid="F2">2</xref>
) is promising for comparability of amplified metagenomes across sequencing platforms.</p>
</sec>
<sec>
<title>Duplicate reads uncorrelated with any single variable</title>
<p>Duplicates in metagenomes are derived from either naturally occurring duplicates in genomes and communities, or artificial duplicates generated during 454’s emPCR step or at some unknown point in Illumina preparations that is inconsistent across replicate libraries [
<xref ref-type="bibr" rid="B23">23</xref>
,
<xref ref-type="bibr" rid="B24">24</xref>
].</p>
<p>Here, hierarchical clustering of duplicate frequencies (Figure 
<xref ref-type="fig" rid="F2">2</xref>
) and raw duplicate distributions, normalized to metagenome size (Additional file
<xref ref-type="supplementary-material" rid="S1">1</xref>
: Figure S3), suggest a pattern of three duplication groups. The first, composed of unamplified 454 and 10 ng Illumina metagenomes, contains intermediate levels of duplication (14.6 to 42.7%) and few high-frequency (>10-fold) duplicate reads (0.06 to 5.1%). The second cluster, composed of most Illumina metagenomes, has an intermediate level of duplication (27.1 to 37.3%), but also an excess of highly duplicated reads (10.4 to 15.6%). The third includes the amplified 454 metagenomes, both Ion Torrent metagenomes, and the poorly amplified 100 ng Illumina library, all of which have few duplicate reads (0.9 to 16.6%) and very few high-frequency duplicate reads (0.0005 to 0.9%) (Additional file
<xref ref-type="supplementary-material" rid="S1">1</xref>
: Figure S4). However, these deep internal nodes lacked support, with bootstraps less than the intra-replicate 90% value, and duplication frequencies do not obviously correlate to any single metagenome category (e.g., technology, amplification, DNA amount, or paired end).</p>
<p>Some duplicate sequences may be real. For example, one 100 bp sequence is overrepresented in multiple libraries including 1,000 ng Illumina (0.14% of the total reads), Ion Torrent (0.006%), and unamplified 454 (0.36%) libraries. Artificial duplicate frequency correlations (see Online Methods) match overall duplicate frequencies for all treatments except a single 10 ng, poorly-amplified, adaptor-containing Illumina library (Additional file
<xref ref-type="supplementary-material" rid="S1">1</xref>
: Figure S5-7), where 40% of the reads were predominantly artificial, high frequency duplicates (Additional file
<xref ref-type="supplementary-material" rid="S1">1</xref>
: Figure S8 and S9).</p>
</sec>
<sec>
<title>Gene function and read assembly parallel %G + C findings</title>
<p>To evaluate variations in gene function, metagenomic reads were compared to an expansive database of marine virus protein sequences (>456K protein clusters derived from over 6M reads from 32 diverse pelagic ocean virus communities [
<xref ref-type="bibr" rid="B18">18</xref>
]). As is common for viral metagenomes (reviewed in ref. [
<xref ref-type="bibr" rid="B18">18</xref>
]), only 3–7% of the reads mapped to protein clusters without self-clustering. However, the resulting gene frequency patterns were well-supported and mirror patterns observed in the above %G + C analyses (Figure 
<xref ref-type="fig" rid="F3">3</xref>
A). Replicate metagenomes were most similar (pairwise r-values >0.95), while the biggest difference was between metagenomes prepared from ≥1,000 ng of starting DNA and those prepared from 100 ng or less (r-values &lt;0.8). Within these two large clusters, sequencing technology choice contributed additional, but minor, divergences (r-values 0.8–0.9). Notably, these protein cluster pairwise r-values are lower than those for either %G + C or duplicate frequency. This likely reflects increased analytical resolution, as 1,500 protein clusters were correlated per metagenome in the function analysis, while only 50 or 10 bins were resolved in the %G + C and duplicate analyses, respectively.</p>
<fig id="F3" position="float">
<label>Figure 3</label>
<caption>
<p>
<bold>Protein cluster functional analysis and assembly statistics for Experiment 1 metagenomes.</bold>
Metagenomic reads were mapped to POV protein clusters (see text) and hit frequencies were used to produce pairwise correlation heat maps. Details as described in Figure 
<xref ref-type="fig" rid="F2">2</xref>
, including bootstrap analysis of statistical support for correlations across metagenomes. Assembly performance of each sample across the dataset was evaluated using metrics of n50 and maximum contig size, as well as the number of reads and base pairs that were assembled. Note that inferior assembly performance was restricted to samples with reduced read yields. Lastly, the Newbler assembler yielded larger contigs and smaller total assemblies when compared to Velvet assembly of the same Ion Torrent dataset.</p>
</caption>
<graphic xlink:href="1471-2164-14-320-3"></graphic>
</fig>
<p>Finally, assembly experiments (see Methods, Figure 
<xref ref-type="fig" rid="F3">3</xref>
B) revealed that total assembly size correlated positively with the number of reads used in assembly. In contrast, the maximum and N50 contig sizes were relatively insensitive to increased read numbers in the larger datasets. This was true for both k-mer and overlap-based assembly algorithms (see Methods).</p>
</sec>
</sec>
<sec>
<title>Experiment 2: The independent effects of input DNA and library amplification on Illumina-sequenced metagenomes</title>
<sec>
<title>Low input DNA library success improved with optimization</title>
<p>In contrast to Experiment 1, all 10 Experiment 2 Illumina libraries (eight 10ng and two 100ng libraries) were successful. Replicate libraries did not cluster together consistently, but this reflected the extremely minimal variance across the replicates rather than poor replication (Figure 
<xref ref-type="fig" rid="F4">4</xref>
, note reduced axis scales relative to Figure 
<xref ref-type="fig" rid="F2">2</xref>
).</p>
<fig id="F4" position="float">
<label>Figure 4</label>
<caption>
<p>
<bold>%G + C and duplication plots for Experiment 2 metagenomes.</bold>
Details as described in Figure 
<xref ref-type="fig" rid="F2">2</xref>
, including bootstrap analysis of statistical support for correlations across metagenomes. Only UPGMA clustering bootstrap support values >60% are shown.</p>
</caption>
<graphic xlink:href="1471-2164-14-320-4"></graphic>
</fig>
</sec>
<sec>
<title>Transposon-based library preparation slightly impacts %G + C</title>
<p>In both Tara Oceans station 41 and 109 datasets, the amount of input DNA (10 or 100 ng) and amplification (12 or 15 cycles) resulted in less variation than was observed in replicate library preparations (Figure 
<xref ref-type="fig" rid="F4">4</xref>
). The only exception was transposon-based libraries, which diverged from the relatively invariant standard Illumina libraries. For all samples, duplicate frequencies varied as much between as within treatments (Figure 
<xref ref-type="fig" rid="F4">4</xref>
) and much less duplication was observed in Experiment 2 than in Experiment 1. The dendrogram topology observed in pairwise %G + C analyses was recovered in analyses of function (Figure 
<xref ref-type="fig" rid="F5">5</xref>
A), but not assembly (Figure 
<xref ref-type="fig" rid="F5">5</xref>
B), where the transposon-based treatment for the Station 109 sample produced many fewer reads than other metagenomes.</p>
<fig id="F5" position="float">
<label>Figure 5</label>
<caption>
<p>
<bold>Protein cluster functional analysis and assembly statistics for Illumina-sequenced Experiment 2 metagenomes.</bold>
Note that one metagenome from Station 109 DNA yielded significantly fewer reads and thus had a lower total assembly size. Details as described in Figure 
<xref ref-type="fig" rid="F3">3</xref>
, including bootstrap analysis of statistical support for correlations across metagenomes.</p>
</caption>
<graphic xlink:href="1471-2164-14-320-5"></graphic>
</fig>
</sec>
</sec>
</sec>
<sec sec-type="discussion">
<title>Discussion</title>
<p>Replication is fundamental to rigorous experimental design [
<xref ref-type="bibr" rid="B25">25</xref>
], but it is only now becoming financially possible for metagenomic studies [
<xref ref-type="bibr" rid="B26">26</xref>
,
<xref ref-type="bibr" rid="B27">27</xref>
]. Here we examined replicate metagenomes across varied DNA input amounts, library preparation procedures, and sequencing platforms.</p>
<sec>
<title>Low input DNA library success depends on adaptor ligation</title>
<p>While all ≥1,000 ng DNA libraries were successful, environmental samples, particularly for viruses, routinely yield <1ng of DNA [
<xref ref-type="bibr" rid="B15">15</xref>
]. Libraries constructed from ≤100 ng DNA were successful using the linker-amplification protocol for 454 [
<xref ref-type="bibr" rid="B15">15</xref>
], but Illumina libraries failed or were low-quality in Experiment 1, though not in Experiment 2. The two experiments used separate protocols, both optimized for recovery from column purification steps [
<xref ref-type="bibr" rid="B28">28</xref>
], but employed different template:adaptor ratios in ligation [
<xref ref-type="bibr" rid="B29">29</xref>
]. Specifically, Experiment 1 used 170:1, while Experiment 2 used 22:1 for 10ng starting DNA. Thus low DNA libraries require adjusted adaptor:template ratios during ligation (see Genoscope protocol for guidelines).</p>
</sec>
<sec>
<title>Presence of library amplification drives bias</title>
<p>Two amplification reactions are common in metagenomic sample preparations. The first, library amplification, increases input DNA to balance library preparation losses from purification, size selection, and quality titrations [
<xref ref-type="bibr" rid="B8">8</xref>
]. This adaptor-mediated amplification step is used for limiting DNA for 454 (15—25 cycles [
<xref ref-type="bibr" rid="B15">15</xref>
]), but is routinely employed in Ion Torrent (5 cycles) and Illumina (12—16 cycles) to enrich for correctly ligated adaptors. This step can alter overall library %G + C [
<xref ref-type="bibr" rid="B15">15</xref>
,
<xref ref-type="bibr" rid="B17">17</xref>
,
<xref ref-type="bibr" rid="B30">30</xref>
]. The second amplification step is specific to the sequencing technology (e.g., emPCR in 454 or Ion Torrent, bridge amplification in Illumina) and used for improving signal detection. This step should not alter overall library %G + C, but can artificially over-represent sequences [
<xref ref-type="bibr" rid="B23">23</xref>
,
<xref ref-type="bibr" rid="B24">24</xref>
].</p>
<p>In this study, two libraries received no library amplification: unamplified 454 and fosmid libraries. Fosmids had elevated %G + C, which is ascribed to a cloning bias [
<xref ref-type="bibr" rid="B26">26</xref>
]. Among the remaining libraries, we expected a low %G + C shift due to the adaptor-mediated amplification step, commonly attributed to inhibitory effects of high %G + C DNA secondary structures, either during library amplification [
<xref ref-type="bibr" rid="B30">30</xref>
] or downstream emPCR [
<xref ref-type="bibr" rid="B31">31</xref>
]. However, these trends were not observed: in Experiment 1, the 454 unamplified and amplified Illumina 1,000 ng libraries correlated well with one another (r-values > 0.99), but poorly (r-values < 0.9) with the amplified (18 cycles) 10ng Illumina libraries. This difference appears to be driven by reduced low %G + C reads relative to the ≥1,000 ng libraries, which suggests that low input DNA libraries may be more sensitive to loss of low %G + C reads either during gel extraction heat steps [
<xref ref-type="bibr" rid="B32">32</xref>
] or preferential fragmentation through heating [
<xref ref-type="bibr" rid="B33">33</xref>
]. A possible improvement over gel extraction is Sage Science’s Pippin Prep (tested with 65ng of DNA, see Figure 
<xref ref-type="fig" rid="F2">2</xref>
B in ref. [
<xref ref-type="bibr" rid="B15">15</xref>
]), which avoids heating. Heat during fragmentation is avoidable with Covaris acoustic shearing. Both techniques also minimize contamination, which is crucial for DNA-limited libraries.</p>
<p>While amplified ≤100 ng metagenomes displayed different %G + C distributions from ≥1,000 ng metagenomes, the amount of amplification only minimally impacts the resulting metagenomes. This was true in Experiment 1, where starting DNA amount and amplification cycling co-varied, as well as Experiment 2, where these parameters were independent. Fragment competition resulting from cycling conditions is thought to select for higher %G + C and shorter fragments, thus linker-mediated amplification protocols employ tight sizing conditions and %G + C optimized PCR conditions [
<xref ref-type="bibr" rid="B30">30</xref>
]. Such careful library construction can produce minimally biased (<1.5-fold %G + C variation) viral metagenomes from sub-nanogram amounts of DNA [
<xref ref-type="bibr" rid="B10">10</xref>
,
<xref ref-type="bibr" rid="B15">15</xref>
]. The %G + C patterns observed in the current larger-scale study were also paralleled in functional analyses (protein cluster mapping) and assembly performance. This suggests that systematically prepared linker-amplified metagenomes derived from variable input DNA amounts are quantitatively comparable.</p>
<p>Some caution is warranted for high-throughput transposon-based library preparation options like Nextera. Specifically, Experiment 2 revealed that standard libraries prepared from limiting DNA and under varied conditions were relatively invariant, whereas the transposon-based protocol led to divergent %G + C and protein cluster profiles for metagenomes from both stations. While these deviations were statistically significant (90% bootstrap clustering in Figures 
<xref ref-type="fig" rid="F4">4</xref>
and
<xref ref-type="fig" rid="F5">5</xref>
), they were minor in magnitude relative to other treatment effects observed here. Such a %G + C bias in Nextera library preps is not entirely surprising as previous work demonstrated reduced coverage in both high and low %G + C regions of virus genomes [
<xref ref-type="bibr" rid="B34">34</xref>
], presumably due to non-random transposition. Evaluation of new transposition methods should be considered if their eventual products require strictly unbiased representation of input DNA.</p>
<p>Finally, while not investigated here, polymerases used in amplification can alter metagenomes. Phi29 polymerase, for example, leads to stochastic and systematic biases that can impact resulting coverage [
<xref ref-type="bibr" rid="B13">13</xref>
], while some high-fidelity polymerases (e.g., TAKARA) enrich for rare sequences and others (e.g., PfuTurbo) do not [
<xref ref-type="bibr" rid="B11">11</xref>
,
<xref ref-type="bibr" rid="B15">15</xref>
]. In Experiment 1, the ≥1,000 ng libraries only minimally differed from each other despite the fact that they employ different polymerases across sequencing platforms. These polymerase-specific effects would depend on protocol particulars (e.g., PCR cycler settings and additives) [
<xref ref-type="bibr" rid="B17">17</xref>
,
<xref ref-type="bibr" rid="B30">30</xref>
] and the underlying %G + C distribution (particularly for <20% or >80% G + C fragments) of the DNA to be amplified. Future work to determine the impact of polymerase choice empirically on metagenomes derived from a wider range of %G + C than those employed here would be informative.</p>
</sec>
<sec>
<title>Duplicates vary by input DNA, amplification, technology</title>
<p>Duplicated reads are problematic in quantitative applications as they can be real or artificial [
<xref ref-type="bibr" rid="B23">23</xref>
,
<xref ref-type="bibr" rid="B24">24</xref>
,
<xref ref-type="bibr" rid="B35">35</xref>
,
<xref ref-type="bibr" rid="B36">36</xref>
]. Here, Experiment 1’s true distribution of duplicates is presumably represented by the first cluster (includes unamplified 454 libraries), except the artificial duplicates discussed below. By comparison, metagenomes from the second cluster contained highly duplicated artificial reads that reduced library complexity during amplification. The last cluster, which included amplified 454, as well as one Illumina and two Ion Torrent metagenomes, had low levels of duplication. For the 454 libraries, this could be due to the diversifying effects of the linker amplification process [
<xref ref-type="bibr" rid="B15">15</xref>
], but it is harder to explain this trend in the Ion Torrent metagenomes or find a process that ties low library amplification in the 100ng Illumina metagenome to lower duplication levels. Artificial duplicates in Illumina libraries were only an issue in the problematic 10ng library, where 40% of the reads were high-frequency, predominantly artificial duplicates. Further study is required to determine mechanisms that generate artificial duplicates in Illumina data.</p>
</sec>
<sec>
<title>Sequencing technologies produce comparable output</title>
<p>While the metagenomes here were derived from three very different ocean viral communities, the range of %G + C was not extreme. Given that, sequencing technology is not a major factor impacting ocean viral metagenomes, which is consistent with previous microbial metagenomic studies [
<xref ref-type="bibr" rid="B9">9</xref>
]. However, read length can influence many downstream applications, from assembly efforts to functional identification of genes [
<xref ref-type="bibr" rid="B37">37</xref>
,
<xref ref-type="bibr" rid="B38">38</xref>
]. Of widely used next-generation technologies, 454 currently has the longest read length of 800bp, with paired-end Illumina capable of 250+ bp [
<xref ref-type="bibr" rid="B7">7</xref>
]. However, emerging nanopore technologies are likely to be truly transformative [
<xref ref-type="bibr" rid="B39">39</xref>
]. Details are not yet public, but these technologies promise longer reads, direct observation of fragment sequences, and minimal library preparation enabling low input DNA applications.</p>
</sec>
</sec>
<sec sec-type="conclusions">
<title>Conclusions</title>
<p>As we strive for systematic and quantitative analyses of complex environments, a thorough understanding of empirically documented biases in methods is critical. Here we demonstrate that while sequencing platform choice and degree of amplification have little impact on resulting metagenomes, the presence of amplification and starting DNA amounts do influence library success and composition. Our findings are critical both for the interpretation of systematic comparisons of DNA-limited community metagenomes and for novel methods of studying virus-host interactions [
<xref ref-type="bibr" rid="B40">40</xref>
-
<xref ref-type="bibr" rid="B42">42</xref>
] that generate small amounts of DNA. Notably, however, high replicability observed here might have been aided by diluting the initial concentrated DNA sample, and potential inhibitors, to obtain ‘low input DNA’ samples. Consideration should be made of the impact of inhibitors on low input DNA samples, particularly when amplification steps are needed for sample preparation.</p>
<p>Given current findings, unamplified libraries are best when DNA is not limiting (>2 μg) [
<xref ref-type="bibr" rid="B43">43</xref>
] while sequencing platform choice minimally impacts quantitative representation in the resulting metagenomes. When DNA is limiting, as in viral community samples or microbial communities of permafrost soils or air samples, specific recommendations for quantitative metagenomics are as follows. Low input DNA (1—100 ng) libraries can utilize either a linker-amplified protocol [
<xref ref-type="bibr" rid="B15">15</xref>
] optimized for the appropriate sequencing technology of choice [
<xref ref-type="bibr" rid="B10">10</xref>
] or, for Illumina sequencing, standard library preparations where adaptor:template ratios are carefully controlled. For samples with ultra-low DNA yields (<1 ng), it is best not to risk failure in standard library preparations and to use instead a sequencing technology optimized linker-amplified protocol. Future research directions include developing a mechanistic understanding of the non-intuitive, but replicable differences in linker-amplified metagenomes, as well as improving understanding of polymerase impacts and developing empirical datasets for a broader range of %G + C samples.</p>
</sec>
<sec sec-type="methods">
<title>Methods</title>
<sec>
<title>Source DNAs and sample preparation details</title>
<sec>
<title>Experimental protocol availability</title>
<p>All detailed protocols are listed by name, and are documented and available at
<ext-link ext-link-type="uri" xlink:href="http://eebweb.arizona.edu/Faculty/mbsulli/protocols.htm">http://eebweb.arizona.edu/Faculty/mbsulli/protocols.htm</ext-link>
<italic>.</italic>
</p>
<p>Briefly, FeCl-precipitated viral concentrates were obtained from 0.2μm filtered seawater collected from the man-made Biosphere 2 Ocean in December 2010, as well as Stations 41 (Indian Ocean, 14°34.572 N 70°1 E, deep chlorophyll maximum) and 109 (south Pacific Ocean, 1°58.286 N 84°26.772 W, deep chlorophyll maximum) of the Tara Oceans expedition on March 30
<sup>th</sup>
, 2010, and May 12
<sup>th</sup>
, 2011, respectively. The viral concentrate from the former was purified using both CsCl and DNase, while only DNase was used for the latter.</p>
</sec>
<sec>
<title>DNA Source for B2O metagenomes (Biosphere 2 Ocean)</title>
<p>The B2 Ocean environment is host to a stable microbial community, as measurements of microbial phyletic frequencies are consistent across samples taken a year apart (Additional file
<xref ref-type="supplementary-material" rid="S2">2</xref>
). FeCl precipitation [
<xref ref-type="bibr" rid="B12">12</xref>
] was used to concentrate viruses from 1,080L of 0.2 μm filtered seawater, which were then DNase I treated [
<xref ref-type="bibr" rid="B11">11</xref>
] to remove free DNA, cesium chloride purified to remove microbial contaminants (dsDNA viral band was pulled 1.4—1.52 g/ml [
<xref ref-type="bibr" rid="B11">11</xref>
]), and further concentrated to 4 mL using an Amicon 30KDa filter. The final yield was 1.26 × 10
<sup>12</sup>
SYBR-stained virus particles. DNA was extracted using the Wizard Prep DNA Purification system (Promega, cat# A7211 and A7181).</p>
</sec>
<sec>
<title>DNA Source for TARA metagenomes</title>
<p>20—60L seawater was collected and filtered for two TARA Oceans [
<xref ref-type="bibr" rid="B21">21</xref>
] stations using the protocol described above. These samples yielded 690 ng (station 41) and 950 ng (station 109) of DNA, using the Wizard Prep DNA Purification system. Starting DNA amounts of 10 and 100 ng were used in Illumina sequencing library construction as described in the Genoscope protocol (Genoscope Illumina protocol).</p>
</sec>
<sec>
<title>454 Library Prep (Sullivan lab)</title>
<p>The linker amplification protocol was used to generate amplicon libraries for 454 sequencing, as well as amplification-free libraries, as previously described [
<xref ref-type="bibr" rid="B15">15</xref>
]. Briefly, genomic DNA was Covaris-sheared, unidirectionally ligated to an adaptor, and amplified with adaptor-specific primers for 15 to 25 cycles, depending on the starting DNA amount (the amount of cycling and its relationship to input DNA were documented previously [
<xref ref-type="bibr" rid="B15">15</xref>
]). Following the addition of barcodes, sequencing libraries were ligated to 454-specific adaptors.</p>
</sec>
<sec>
<title>Fosmid Library Prep (Hallam lab)</title>
<p>8μg of B2O viral DNA was used in large-insert fosmid library construction using the Epicentre CopyControl Fosmid Library Production Kit (CCFOS110) as previously described [
<xref ref-type="bibr" rid="B44">44</xref>
]. A total of seventeen 384-well plates of clones were picked, and 384 fosmids were Sanger-sequenced bi-directionally.</p>
</sec>
<sec>
<title>Ion Torrent Library Prep (University of Arizona Genomics Core)</title>
<p>2μg of B2O viral DNA was used for sequencing library preparation following the Ion Fragment Library Kit User Guide (Rev July 2011), loaded onto beads, emPCR-ed, then sequenced using the 316 chip on the Ion Torrent PGM.</p>
</sec>
<sec>
<title>Illumina Library Prep for B2O metagenomes (Emory Genomics Core)</title>
<p>DNA samples were Covaris-sheared and size-selected to 300—600bp using SPRI Size Selection chemistry, enrichment amplified using Phusion DNA polymerase according to starting amount of DNA (14—18 cycles), and paired end sequenced. Two libraries starting with 1ng of DNA failed to amplify to sufficient amounts, even with the use of a carrier DNA protocol (Emory carrier DNA protocol). One 10 ng library experienced the same problem, and was not sequenced. The libraries were multiplexed on two sequencing lanes, with one replicate of each starting amount library present together on each lane.</p>
</sec>
<sec>
<title>Illumina Library Prep for TARA metagenomes (Genoscope)</title>
<p>DNA samples were Covaris-sheared and size selected to 160—180bp, amplified according to starting amount of DNA (9—15 cycles) and paired-end sequenced. Several modifications of the standard Illumina protocol [
<xref ref-type="bibr" rid="B32">32</xref>
] were introduced in order to minimize losses of ultra-low DNA amounts. The low-fragment-size shearing settings, coupled with Ampure beads to remove very short fragments, ensured the recovery of appropriately sized fragments without the need for gel sizing. The Pfx Platinum polymerase was used to increase amplification efficiency and thus decrease the number of total library amplification cycles. During ligation, proper adaptor ratios were chosen to correspond to 2—3 fold more adaptor ends than fragment ends in the working ligation reaction (Genoscope Illumina protocol). Transposon-based Nextera libraries were prepared per manufacturer’s instructions using the Illumina compatible Nextera DNA Sample Prep Kit (Epicentre Biotechnologies, cat#GA09115).</p>
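<p>As an illustration only, the "2–3 fold more adaptor ends than fragment ends" guideline can be converted into molar amounts. The sketch below is not taken from the Genoscope protocol: it assumes the common approximation of 650 g/mol per base pair of double-stranded DNA, two ligatable ends per sheared fragment, and one ligatable end per adaptor. The function names and default values are hypothetical.</p>

```python
# Assumption: average mass of one base pair of dsDNA, in g/mol.
AVG_BP_MASS = 650.0


def dsdna_pmol(mass_ng, length_bp):
    """Convert a mass of dsDNA (ng) of a given mean length (bp) to pmol of molecules."""
    return mass_ng * 1000.0 / (length_bp * AVG_BP_MASS)


def adaptor_pmol_needed(template_ng, fragment_bp, fold_excess=2.5, ends_per_adaptor=1):
    """Estimate pmol of adaptor giving `fold_excess` adaptor ends per fragment end.

    Assumes each sheared fragment contributes two ligatable ends and each
    adaptor contributes `ends_per_adaptor` ligatable ends (hypothetical default).
    """
    fragment_ends = 2.0 * dsdna_pmol(template_ng, fragment_bp)
    return fold_excess * fragment_ends / ends_per_adaptor


# Example: 10 ng of template sheared to a mean size of 170 bp.
print(round(adaptor_pmol_needed(10, 170), 4))
```

<p>Under these assumptions, the required adaptor amount scales linearly with input mass and inversely with fragment length, which is why a ratio tuned for microgram inputs can be far off for 10 ng libraries.</p>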
</sec>
</sec>
<sec>
<title>Bioinformatics methods</title>
<sec>
<title>Script availability</title>
<p>All custom scripts are listed by name and available at
<ext-link ext-link-type="uri" xlink:href="http://code.google.com/p/tmpl/">http://code.google.com/p/tmpl/</ext-link>
<italic>.</italic>
</p>
</sec>
<sec>
<title>Sequencing data</title>
<p>All metagenomic sequences are publicly available through the CAMERA portal at
<ext-link ext-link-type="uri" xlink:href="http://camera.calit2.net/">http://camera.calit2.net/</ext-link>
[CAMERA
<bold>:</bold>
CAM_P_00001027]. 454 and Ion Torrent data, provided by UAGC, were delivered in .sff format and converted for downstream processing to FASTA and QUAL formats using sffinfo (roche454 v2.6) and then to FASTQ format using BioPerl 1.6.1. B2 Ocean Illumina data, by Emory Genomics Core, and TARA Oceans Illumina data, by Genoscope, were provided in FASTQ format. Each library was examined for raw quality using FastQC (v0.9, downloaded Aug 2012 from
<ext-link ext-link-type="uri" xlink:href="http://www.bioinformatics.babraham.ac.uk/projects/fastqc/">http://www.bioinformatics.babraham.ac.uk/projects/fastqc/</ext-link>
) and Fastx_Toolkit (v0.0.13 downloaded Feb 2010 from
<ext-link ext-link-type="uri" xlink:href="http://hannonlab.cshl.edu/fastx_toolkit/">http://hannonlab.cshl.edu/fastx_toolkit/</ext-link>
). The FastQC report was the source of duplication data used in the figures. Adapter sequences were detected in two metagenomes (I1A18N10 and I1A14N100) through the overrepresented sequences functionality of FastQC. The fastx_toolkit utility ‘fastx_clipper’ was used with the -C option to remove all reads matching the above adapter motif from the forward paired-end reads, removing approximately 40% of the reads that passed QC in each of these libraries.</p>
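<p>The effect of discarding adapter-containing reads can be sketched as follows. This is a simplified stand-in, not a reimplementation of fastx_clipper: it uses exact substring matching rather than alignment, assumes four-line FASTQ records, and uses a placeholder adapter string because the actual motif is not given in the text.</p>

```python
import io


def read_fastq(handle):
    """Yield (header, sequence, quality) tuples from a four-line-per-record FASTQ stream."""
    while True:
        header = handle.readline().rstrip("\n")
        if not header:
            return
        seq = handle.readline().rstrip("\n")
        handle.readline()  # skip the '+' separator line
        qual = handle.readline().rstrip("\n")
        yield header, seq, qual


def drop_adapter_reads(records, adapter):
    """Discard every read containing the adapter; return (kept_records, n_dropped)."""
    kept, dropped = [], 0
    for header, seq, qual in records:
        if adapter in seq:
            dropped += 1
        else:
            kept.append((header, seq, qual))
    return kept, dropped


# Placeholder adapter motif (hypothetical, for illustration only).
PLACEHOLDER_ADAPTER = "ACGTACGTACGT"

demo = io.StringIO(
    "@r1\nACGTACGTACGTTT\n+\nIIIIIIIIIIIIII\n"
    "@r2\nGGGGCCCC\n+\nIIIIIIII\n"
)
kept, dropped = drop_adapter_reads(read_fastq(demo), PLACEHOLDER_ADAPTER)
print(len(kept), dropped)
```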
</sec>
<sec>
<title>Quality control</title>
<p>Next, procedures for quality control were established to remove suspect sequence data, either by filtering whole reads or trimming reads in accordance with known sequencing technology artifacts. For 454 and Ion Torrent data, whole-read filtering was used, as is common for metagenomics [
<xref ref-type="bibr" rid="B11">11</xref>
,
<xref ref-type="bibr" rid="B15">15</xref>
,
<xref ref-type="bibr" rid="B45">45</xref>
,
<xref ref-type="bibr" rid="B46">46</xref>
] (Additional file
<xref ref-type="supplementary-material" rid="S1">1</xref>
: Figure S10). In contrast, because Illumina errors are localized to particular parts of a read [
<xref ref-type="bibr" rid="B47">47</xref>
,
<xref ref-type="bibr" rid="B48">48</xref>
], these data were trimmed using a threshold predicted quality score to remove suspect regions of the read at both the 3’ and 5’ ends using DynamicTrim.pl, part of the SolexaQA package [
<xref ref-type="bibr" rid="B49">49</xref>
] (Additional file
<xref ref-type="supplementary-material" rid="S1">1</xref>
: Figure S11). After QC steps, 69—85% of the 454 reads remained, compared to 60% for Ion Torrent and 63—74% for Illumina (Table 
<xref ref-type="table" rid="T1">1</xref>
). The fastx_toolkit software package was also used to remove Illumina reads under 50bp, while the 454 and Ion Torrent reads were cleaned using a custom pipeline [
<xref ref-type="bibr" rid="B18">18</xref>
]. This processing ensured that the data analyzed would be analogous to that used for metagenomic inference. FastQC and Fastx_Toolkit were used to check the QC process of each metagenome.</p>
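<p>The Illumina trimming and length filtering steps can be approximated as below. This is a minimal sketch in the spirit of DynamicTrim.pl, which keeps the longest contiguous stretch of a read whose quality scores pass a threshold; it is not the SolexaQA code itself. The Phred+33 encoding and the threshold of 20 are assumptions, while the 50 bp minimum length follows the text.</p>

```python
def longest_high_quality_segment(quality, threshold, offset=33):
    """Return (start, end) of the longest run of bases with Phred score >= threshold.

    `end` is exclusive. When two runs tie in length, the earlier one is kept.
    """
    best_start, best_end = 0, 0
    run_start = None
    for i, char in enumerate(quality):
        if ord(char) - offset >= threshold:
            if run_start is None:
                run_start = i
            if i + 1 - run_start > best_end - best_start:
                best_start, best_end = run_start, i + 1
        else:
            run_start = None
    return best_start, best_end


def trim_read(seq, quality, threshold=20, min_length=50):
    """Trim a read to its best segment; return None if the result is too short."""
    start, end = longest_high_quality_segment(quality, threshold)
    if end - start >= min_length:
        return seq[start:end], quality[start:end]
    return None


# Toy example with a shortened minimum length so the result is visible.
print(trim_read("ACGTACGTA", "##IIII#II", threshold=20, min_length=4))
```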
</sec>
<sec>
<title>%G + C analytics</title>
<p>The mean read %G + C was chosen as the focus of our analysis, rather than the %G + C of sequence subsets of a read or the larger genome regions from which the read fragment originated, since mean fragment %G + C is the best predictor of GC bias [
<xref ref-type="bibr" rid="B50">50</xref>
]. QC-ed reads were processed using the BioPerl 1.6.1 script bp_gc_calc.pl to obtain average %G + C values for each read. Given the large read length differences across these libraries (90bp to 350bp), only the first 50bp of each read were used in all %G + C distribution analyses to match the shortest QC-ed Illumina data, while normalizing for read length. Reads were truncated to 50bp using fastx_toolkit and processed with bp_gc_calc.pl. Phage metagenomic reads were cut into non-overlapping 50bp fragments using a bash script and also processed with bp_gc_calc.pl.</p>
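<p>For readers without BioPerl, the per-read calculation can be sketched as follows. This is an illustrative equivalent, not the bp_gc_calc.pl script: ignoring non-ACGT characters and skipping reads shorter than the window are our assumptions, and the function names are hypothetical.</p>

```python
def gc_percent(seq):
    """Return %G+C over unambiguous bases, or None if there are none."""
    seq = seq.upper()
    counted = sum(seq.count(base) for base in "ACGT")
    if counted == 0:
        return None
    return 100.0 * (seq.count("G") + seq.count("C")) / counted


def first_window_gc(reads, window=50):
    """%G+C of the first `window` bases of each read that is long enough."""
    values = (gc_percent(read[:window]) for read in reads if len(read) >= window)
    return [value for value in values if value is not None]


def split_nonoverlapping(seq, window=50):
    """Cut a sequence into consecutive, non-overlapping fragments of `window` bp."""
    return [seq[i:i + window] for i in range(0, len(seq) - window + 1, window)]


reads = ["G" * 50 + "A" * 50, "AT"]
print(first_window_gc(reads))
```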
</sec>
<sec>
<title>Statistical analysis and figures</title>
<p>R 2.14.1 (
<ext-link ext-link-type="uri" xlink:href="http://www.R-project.org/">http://www.R-project.org/</ext-link>
) was used to run a custom script, 0.02gc.R, which calculated frequencies of reads in 2% G + C bins for each metagenome. Pearson’s r pairwise correlation values were calculated using the cov() function, and heatmap figures were generated using the heatmap.2() function found in the gplots library (
<ext-link ext-link-type="uri" xlink:href="http://CRAN.R-project.org/package=gplots">http://CRAN.R-project.org/package=gplots</ext-link>
). Lastly, bootstrapped UPGMA clustering values for each node were obtained using the pvclust() function in the pvclust library (
<ext-link ext-link-type="uri" xlink:href="http://CRAN.R-project.org/package=pvclust">http://CRAN.R-project.org/package=pvclust</ext-link>
), with pairwise distances calculated from Pearson’s correlation values and hierarchical clustering done using the “average” method.</p>
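<p>The binning and correlation steps can be illustrated in a few lines. The sketch below is not the 0.02gc.R script: it shows 2% bins (50 bins in total), Pearson’s r between two binned profiles, and a distance of 1 − r, which is one common way to derive distances from correlations and is an assumption here. Function names are hypothetical.</p>

```python
import math


def gc_bin_frequencies(gc_values, bin_width=2.0):
    """Relative frequency of reads in each %G+C bin of the given width."""
    if not gc_values:
        raise ValueError("no %G+C values supplied")
    n_bins = int(100 / bin_width)
    counts = [0] * n_bins
    for value in gc_values:
        index = min(int(value // bin_width), n_bins - 1)  # 100% falls in the last bin
        counts[index] += 1
    return [count / len(gc_values) for count in counts]


def pearson_r(x, y):
    """Pearson's correlation coefficient between two equal-length vectors."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def correlation_distance(x, y):
    """Distance used for clustering in this sketch: 1 - r (an assumption)."""
    return 1.0 - pearson_r(x, y)


profile_a = gc_bin_frequencies([30.0, 31.5, 42.0, 44.1, 60.0])
profile_b = gc_bin_frequencies([30.5, 33.0, 42.5, 47.0, 61.0])
print(round(pearson_r(profile_a, profile_b), 3))
```

<p>The resulting distance matrix could then be passed to any average-linkage (UPGMA) clustering routine.</p>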
</sec>
<sec>
<title>Duplicate analyses</title>
<p>Duplication levels were assessed in raw reads by counting the occurrence of duplicates only in the starting 50bp of each read using the FastQC duplication level utility output, normalized to total metagenome size to reflect relative frequencies. Artificial duplicates were defined as those with identical starts and >95% identity throughout the read, and were detected using CD-HIT-454 [
<xref ref-type="bibr" rid="B51">51</xref>
] and CD-HIT-DUP [
<xref ref-type="bibr" rid="B52">52</xref>
] with default parameters.</p>
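<p>The two duplicate measures can be sketched as follows. Neither function reproduces FastQC or CD-HIT: the first tallies how often the first 50 bp of a read recurs and pools counts of ten or more into one level (mirroring the ten bins mentioned in the Results, an assumption about the binning), and the second applies the stated definition of an artificial duplicate using ungapped, position-by-position identity. The length of the "identical start" is a hypothetical parameter.</p>

```python
from collections import Counter


def duplicate_level_fractions(reads, prefix_len=50, max_level=10):
    """Fraction of reads at each duplication level, judged on the first `prefix_len` bp."""
    prefix_counts = Counter(r[:prefix_len] for r in reads if len(r) >= prefix_len)
    total = sum(prefix_counts.values())
    if total == 0:
        return {}
    levels = Counter()
    for count in prefix_counts.values():
        levels[min(count, max_level)] += count
    return {level: n / total for level, n in sorted(levels.items())}


def looks_like_artificial_duplicate(read_a, read_b, start_len=3, min_identity=0.95):
    """True if two reads share an identical start and exceed `min_identity` overall.

    Identity is computed without gaps over the length of the shorter read.
    """
    if read_a[:start_len] != read_b[:start_len]:
        return False
    overlap = min(len(read_a), len(read_b))
    if overlap == 0:
        return False
    matches = sum(1 for a, b in zip(read_a, read_b) if a == b)
    return matches / overlap > min_identity


reads = ["A" * 50, "A" * 50 + "C", "C" * 50, "G" * 10]
print(duplicate_level_fractions(reads))
```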
</sec>
<sec>
<title>Protein cluster analyses</title>
<p>Functional differences within and between metagenomes were assessed in Experiment 1 by mapping metagenomic reads to the Pacific Ocean Virome database [
<xref ref-type="bibr" rid="B18">18</xref>
]. The hit frequencies of the 1,500 protein clusters that were most abundant across all metagenomes were then used to obtain pairwise correlation values. A range of 3—7% of the metagenomic reads mapped to these POV PCs, while the ‘top 1,500 PCs’ subsample represented >99% of the data that mapped. Because the Experiment 1 dataset represented a large diversity of read lengths, greatly impacting inference capacity [
<xref ref-type="bibr" rid="B38">38</xref>
], the dataset was normalized to assess sequencing platform biases rather than read length impacts as follows: (i) the longer Ion Torrent and 454 reads were trimmed to 100bp, and (ii) only reads ≥100 bp were used from Illumina data.</p>
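<p>Selecting the most abundant protein clusters and converting hits into comparable frequency vectors can be sketched as below. Ranking clusters by summed raw hits across metagenomes is one reading of "most abundant across all metagenomes" and is an assumption; the data structure and function names are hypothetical.</p>

```python
from collections import Counter


def top_clusters(hits_by_metagenome, n_top=1500):
    """Return the `n_top` cluster IDs with the most hits summed over all metagenomes."""
    totals = Counter()
    for hits in hits_by_metagenome.values():
        totals.update(hits)
    return [cluster for cluster, _ in totals.most_common(n_top)]


def frequency_vectors(hits_by_metagenome, clusters):
    """Per-metagenome relative hit frequencies over a fixed list of clusters."""
    vectors = {}
    for name, hits in hits_by_metagenome.items():
        total = sum(hits.get(c, 0) for c in clusters)
        if total == 0:
            vectors[name] = [0.0] * len(clusters)
        else:
            vectors[name] = [hits.get(c, 0) / total for c in clusters]
    return vectors


hits = {
    "m1": {"PC1": 8, "PC2": 2, "PC3": 1},
    "m2": {"PC1": 4, "PC3": 5},
}
selected = top_clusters(hits, n_top=2)
print(selected, frequency_vectors(hits, selected))
```

<p>Using the same ordered cluster list for every metagenome keeps the vectors aligned, so they can be correlated pairwise.</p>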
</sec>
<sec>
<title>Assembly analyses</title>
<p>The short reads derived from Illumina and Ion Torrent data were assembled using Velvet v 1.2.03 [
<xref ref-type="bibr" rid="B53">53</xref>
] using default parameters across a range of kmer sizes (23, 27, 31bp), but only 31-mer data are reported as kmer size did not impact assemblies. The longer 454 reads were assembled using GS De Novo Assembler v2.6 (
<ext-link ext-link-type="uri" xlink:href="http://my454.com/products/analysis-software/index.asp">http://my454.com/products/analysis-software/index.asp</ext-link>
) with default parameters.</p>
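<p>For reference, the assembly metrics reported in the figures can be computed from a list of contig lengths as sketched below. This uses the standard definition of N50 and is independent of the assembler; it is an illustration rather than the script used in this study.</p>

```python
def n50(contig_lengths):
    """Return the N50: the contig length L such that contigs of length >= L
    together contain at least half of the total assembled bases."""
    if not contig_lengths:
        raise ValueError("no contigs supplied")
    ordered = sorted(contig_lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


def assembly_summary(contig_lengths):
    """Collect N50, maximum contig size, and total assembly size in bp."""
    return {
        "n50": n50(contig_lengths),
        "max_contig": max(contig_lengths),
        "total_bp": sum(contig_lengths),
    }


print(assembly_summary([100, 200, 300, 400, 500]))
```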
</sec>
</sec>
</sec>
<sec>
<title>Competing interests</title>
<p>The authors declare that they have no competing interests.</p>
</sec>
<sec>
<title>Authors’ contributions</title>
<p>SAS and MBS conceived the project and designed the experiments with contributions from AA, SH, KK, GT and PW. AA and CC performed experiments. SAS and JCIE collected and analyzed the results. SAS, MBS, SH, KK, GT wrote the manuscript. All authors read and approved the final manuscript.</p>
</sec>
<sec sec-type="supplementary-material">
<title>Supplementary Material</title>
<supplementary-material content-type="local-data" id="S1">
<caption>
<title>Additional file 1: Figures S1-S11</title>
<p>A log-log plot of all B2 Ocean metagenome read yields per starting DNA amount (
<bold>Figure S1</bold>
). % G + C histogram of several ‘problematic’ and ‘reliable’ libraries, and GC distribution of full dsDNA bacteriophage genomes for reference (
<bold>Figure S2</bold>
). %G + C distribution differences between whole-read mean % G + C in unamplified 454 metagenome, in green, and Sanger-sequenced fosmid library, in blue, shows a shift toward high %G + C in the fosmid library (
<bold>Figure S3</bold>
). Duplicate frequencies in Experiment 1 metagenomes (
<bold>Figure S4</bold>
). Heatmap of Pearson’s r pairwise correlation values for artificial duplicate frequencies, as detected using CD-HIT-454 for 454 and Ion Torrent data and CD-HIT-DUP for Illumina data (
<bold>Figure S5</bold>
). CD-HIT-454 artificial duplicate frequencies in Experiment 1 metagenomes generated using 454 and Ion Torrent sequencing (
<bold>Figure S6</bold>
). Duplicate frequency minus artificial duplicate frequency for Experiment 1 CD-HIT-454-processed metagenomes (
<bold>Figure S7</bold>
). CD-HIT-DUP artificial duplicate frequencies in Experiment 1 Illumina metagenomes (
<bold>Figure S8</bold>
). Duplicate frequency minus artificial duplicate frequency for Experiment 1 CD-HIT-DUP-processed metagenomes (
<bold>Figure S9</bold>
). Ion Torrent QC length distribution (
<bold>Figure S10</bold>
). Methods for Trimming Illumina Reads (
<bold>Figure S11</bold>
). </p>
</caption>
<media xlink:href="1471-2164-14-320-S1.docx">
<caption>
<p>Click here for file</p>
</caption>
</media>
</supplementary-material>
<supplementary-material content-type="local-data" id="S2">
<caption>
<title>Additional file 2</title>
<p>
<bold>Pyrotag data for microbial composition of Biosphere 2 Ocean in Nov 2008 and Sep 2009.</bold>
The Biosphere 2 Ocean was the source of the DNA sample used in Experiment 1 metagenomes. The distribution of microbial phyla in the B2 Ocean community appears stable across two samples taken a year apart. </p>
</caption>
<media xlink:href="1471-2164-14-320-S2.xlsx">
<caption>
<p>Click here for file</p>
</caption>
</media>
</supplementary-material>
</sec>
</body>
<back>
<sec>
<title>Acknowledgements</title>
<p>We thank B. Poulos, N. Solonenko, A. Gregory, and C. Decker for technical assistance, as well as TMPL members and two anonymous reviewers for comments on the manuscript. Funding for this particular study was provided by BIO5, Biosphere 2 and the Gordon and Betty Moore Foundation to MBS. We thank the coordinators and members of the Tara Oceans consortium (
<ext-link ext-link-type="uri" xlink:href="http://www.embl.de/tara_oceans/start/">http://www.embl.de/tara_oceans/start/</ext-link>
) for organizing sampling and data analysis. We are grateful for the commitment of the following people and sponsors who made this singular expedition possible: CNRS, EMBL, Genoscope/CEA, VIB, Stazione Zoologica Anton Dohrn, UNIMIB, ANR (projects POSEIDON/ANR-09-BLAN-0348, BIOMARKS/ANR-08-BDVA-003, PROMETHEUS/ANR-09-GENM-031, and TARA-GIRUS/ANR-09-PCS-GENM-218), EU FP7 (MicroB3/No.287589), FWO, BIO5, Biosphere 2, agnès b., the Veolia Environment Foundation, Region Bretagne, World Courier, Illumina, Cap L’Orient, the EDF Foundation EDF Diversiterre, FRB, the Prince Albert II de Monaco Foundation, Etienne Bourgois, the Tara schooner and its captain and crew. Tara Oceans would not exist without continuous support from 23 institutes (
<ext-link ext-link-type="uri" xlink:href="http://oceans.taraexpeditions.org">http://oceans.taraexpeditions.org</ext-link>
). This article is contribution number 0005 of the Tara Oceans Expedition 2009–2012.</p>
</sec>
<ref-list>
<ref id="B1">
<mixed-citation publication-type="journal">
<name>
<surname>Chaffron</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Rehrauer</surname>
<given-names>H</given-names>
</name>
<name>
<surname>Pernthaler</surname>
<given-names>J</given-names>
</name>
<name>
<surname>von Mering</surname>
<given-names>C</given-names>
</name>
<article-title>A global network of coexisting microbes from environmental and whole-genome sequence data</article-title>
<source>Genome Res</source>
<year>2010</year>
<volume>20</volume>
<fpage>947</fpage>
<lpage>959</lpage>
<pub-id pub-id-type="doi">10.1101/gr.104521.109</pub-id>
<pub-id pub-id-type="pmid">20458099</pub-id>
</mixed-citation>
</ref>
<ref id="B2">
<mixed-citation publication-type="journal">
<name>
<surname>Shapiro</surname>
<given-names>BJ</given-names>
</name>
<name>
<surname>Friedman</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Cordero</surname>
<given-names>OX</given-names>
</name>
<name>
<surname>Preheim</surname>
<given-names>SP</given-names>
</name>
<name>
<surname>Timberlake</surname>
<given-names>SC</given-names>
</name>
<name>
<surname>Szabo</surname>
<given-names>G</given-names>
</name>
<name>
<surname>Polz</surname>
<given-names>MF</given-names>
</name>
<name>
<surname>Alm</surname>
<given-names>EJ</given-names>
</name>
<article-title>Population genomics of early events in the ecological differentiation of bacteria</article-title>
<source>Science</source>
<year>2012</year>
<volume>336</volume>
<fpage>48</fpage>
<lpage>51</lpage>
<pub-id pub-id-type="doi">10.1126/science.1218198</pub-id>
<pub-id pub-id-type="pmid">22491847</pub-id>
</mixed-citation>
</ref>
<ref id="B3">
<mixed-citation publication-type="other">
<name>
<surname>Handelsman</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Tiedje</surname>
<given-names>JM</given-names>
</name>
<name>
<surname>Alvarez-Cohen</surname>
<given-names>L</given-names>
</name>
<name>
<surname>Ashburner</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Cann</surname>
<given-names>IKO</given-names>
</name>
<name>
<surname>DeLong</surname>
<given-names>EF</given-names>
</name>
<name>
<surname>Doolittle</surname>
<given-names>WF</given-names>
</name>
<name>
<surname>Fraser-Liggett</surname>
<given-names>CM</given-names>
</name>
<name>
<surname>Godzik</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Gordon</surname>
<given-names>JI</given-names>
</name>
<article-title>New Science of Metagenomics: Revealing the Secrets of Our Microbial Planet</article-title>
<source>Nat Res Council Report</source>
<year>2007</year>
<fpage>13</fpage>
</mixed-citation>
</ref>
<ref id="B4">
<mixed-citation publication-type="journal">
<name>
<surname>Glenn</surname>
<given-names>TC</given-names>
</name>
<article-title>Field guide to next-generation DNA sequencers</article-title>
<source>Mol Ecol Resour</source>
<year>2011</year>
<volume>11</volume>
<fpage>759</fpage>
<lpage>769</lpage>
<pub-id pub-id-type="doi">10.1111/j.1755-0998.2011.03024.x</pub-id>
<pub-id pub-id-type="pmid">21592312</pub-id>
</mixed-citation>
</ref>
<ref id="B5">
<mixed-citation publication-type="journal">
<name>
<surname>Kircher</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Kelso</surname>
<given-names>J</given-names>
</name>
<article-title>High-throughput DNA sequencing–concepts and limitations</article-title>
<source>BioEssays</source>
<year>2010</year>
<volume>32</volume>
<fpage>524</fpage>
<lpage>536</lpage>
<pub-id pub-id-type="doi">10.1002/bies.200900181</pub-id>
</mixed-citation>
</ref>
<ref id="B6">
<mixed-citation publication-type="journal">
<name>
<surname>Metzker</surname>
<given-names>ML</given-names>
</name>
<article-title>Sequencing technologies - the next generation</article-title>
<source>Nat Rev Genet</source>
<year>2010</year>
<volume>11</volume>
<fpage>31</fpage>
<lpage>46</lpage>
<pub-id pub-id-type="doi">10.1038/nrg2626</pub-id>
<pub-id pub-id-type="pmid">19997069</pub-id>
</mixed-citation>
</ref>
<ref id="B7">
<mixed-citation publication-type="journal">
<name>
<surname>Loman</surname>
<given-names>NJ</given-names>
</name>
<name>
<surname>Misra</surname>
<given-names>RV</given-names>
</name>
<name>
<surname>Dallman</surname>
<given-names>TJ</given-names>
</name>
<name>
<surname>Constantinidou</surname>
<given-names>C</given-names>
</name>
<name>
<surname>Gharbia</surname>
<given-names>SE</given-names>
</name>
<name>
<surname>Wain</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Pallen</surname>
<given-names>MJ</given-names>
</name>
<article-title>Performance comparison of benchtop high-throughput sequencing platforms</article-title>
<source>Nat Biotechnol</source>
<year>2012</year>
<volume>30</volume>
<fpage>434</fpage>
<lpage>439</lpage>
<pub-id pub-id-type="doi">10.1038/nbt.2198</pub-id>
<pub-id pub-id-type="pmid">22522955</pub-id>
</mixed-citation>
</ref>
<ref id="B8">
<mixed-citation publication-type="journal">
<name>
<surname>Linnarsson</surname>
<given-names>S</given-names>
</name>
<article-title>Recent advances in DNA sequencing methods - general principles of sample preparation</article-title>
<source>Exp Cell Res</source>
<year>2010</year>
<volume>316</volume>
<fpage>1339</fpage>
<lpage>1343</lpage>
<pub-id pub-id-type="doi">10.1016/j.yexcr.2010.02.036</pub-id>
<pub-id pub-id-type="pmid">20211618</pub-id>
</mixed-citation>
</ref>
<ref id="B9">
<mixed-citation publication-type="journal">
<name>
<surname>Luo</surname>
<given-names>C</given-names>
</name>
<name>
<surname>Tsementzi</surname>
<given-names>D</given-names>
</name>
<name>
<surname>Kyrpides</surname>
<given-names>N</given-names>
</name>
<name>
<surname>Read</surname>
<given-names>T</given-names>
</name>
<name>
<surname>Konstantinidis</surname>
<given-names>KT</given-names>
</name>
<article-title>Direct comparisons of Illumina vs. Roche 454 sequencing technologies on the same microbial community DNA sample</article-title>
<source>PLoS One</source>
<year>2012</year>
<volume>7</volume>
<fpage>e30087</fpage>
<pub-id pub-id-type="doi">10.1371/journal.pone.0030087</pub-id>
<pub-id pub-id-type="pmid">22347999</pub-id>
</mixed-citation>
</ref>
<ref id="B10">
<mixed-citation publication-type="journal">
<name>
<surname>Duhaime</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Sullivan</surname>
<given-names>MB</given-names>
</name>
<article-title>Ocean viruses: Rigorously evaluating the metagenomic sample-to-sequence pipeline</article-title>
<source>Virology</source>
<year>2012</year>
<volume>434</volume>
<fpage>181</fpage>
<lpage>186</lpage>
<pub-id pub-id-type="doi">10.1016/j.virol.2012.09.036</pub-id>
<pub-id pub-id-type="pmid">23084423</pub-id>
</mixed-citation>
</ref>
<ref id="B11">
<mixed-citation publication-type="book">
<name>
<surname>Hurwitz</surname>
<given-names>BH</given-names>
</name>
<name>
<surname>Deng</surname>
<given-names>L</given-names>
</name>
<name>
<surname>Poulos</surname>
<given-names>B</given-names>
</name>
<name>
<surname>Sullivan</surname>
<given-names>MB</given-names>
</name>
<article-title>Evaluation of methods to concentrate and purify wild ocean virus communities through comparative, replicated metagenomics</article-title>
<source>Environ Microbiol</source>
<year>2012</year>
<pub-id pub-id-type="doi">10.1111/j.1462-2920.2012.02836.x</pub-id>
</mixed-citation>
</ref>
<ref id="B12">
<mixed-citation publication-type="journal">
<name>
<surname>John</surname>
<given-names>SG</given-names>
</name>
<name>
<surname>Mendez</surname>
<given-names>CB</given-names>
</name>
<name>
<surname>Deng</surname>
<given-names>L</given-names>
</name>
<name>
<surname>Poulos</surname>
<given-names>B</given-names>
</name>
<name>
<surname>Kauffman</surname>
<given-names>AKM</given-names>
</name>
<name>
<surname>Kern</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Brum</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Polz</surname>
<given-names>MF</given-names>
</name>
<name>
<surname>Boyle</surname>
<given-names>EA</given-names>
</name>
<name>
<surname>Sullivan</surname>
<given-names>MB</given-names>
</name>
<article-title>A simple and efficient method for concentration of ocean viruses by chemical flocculation</article-title>
<source>Environ Microbiol Rep</source>
<year>2011</year>
<volume>3</volume>
<fpage>195</fpage>
<lpage>202</lpage>
<pub-id pub-id-type="doi">10.1111/j.1758-2229.2010.00208.x</pub-id>
<pub-id pub-id-type="pmid">21572525</pub-id>
</mixed-citation>
</ref>
<ref id="B13">
<mixed-citation publication-type="journal">
<name>
<surname>Yilmaz</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Allgaier</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Hugenholtz</surname>
<given-names>P</given-names>
</name>
<article-title>Multiple displacement amplification compromises quantitative analysis of metagenomes</article-title>
<source>Nat Methods</source>
<year>2010</year>
<volume>7</volume>
<fpage>943</fpage>
<lpage>944</lpage>
<pub-id pub-id-type="doi">10.1038/nmeth1210-943</pub-id>
<pub-id pub-id-type="pmid">21116242</pub-id>
</mixed-citation>
</ref>
<ref id="B14">
<mixed-citation publication-type="journal">
<name>
<surname>Kim</surname>
<given-names>KH</given-names>
</name>
<name>
<surname>Bae</surname>
<given-names>JW</given-names>
</name>
<article-title>Amplification methods bias metagenomic libraries of uncultured single-stranded and double-stranded DNA viruses</article-title>
<source>Appl Environ Microbiol</source>
<year>2011</year>
<volume>77</volume>
<fpage>7663</fpage>
<lpage>7668</lpage>
<pub-id pub-id-type="doi">10.1128/AEM.00289-11</pub-id>
<pub-id pub-id-type="pmid">21926223</pub-id>
</mixed-citation>
</ref>
<ref id="B15">
<mixed-citation publication-type="journal">
<name>
<surname>Duhaime</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Deng</surname>
<given-names>L</given-names>
</name>
<name>
<surname>Poulos</surname>
<given-names>B</given-names>
</name>
<name>
<surname>Sullivan</surname>
<given-names>MB</given-names>
</name>
<article-title>Towards quantitative metagenomics of wild viruses and other ultra-low concentration DNA samples: a rigorous assessment and optimization of the linker amplification method</article-title>
<source>Environ Microbiol</source>
<year>2012</year>
<volume>14</volume>
<fpage>2526</fpage>
<lpage>2537</lpage>
<pub-id pub-id-type="doi">10.1111/j.1462-2920.2012.02791.x</pub-id>
<pub-id pub-id-type="pmid">22713159</pub-id>
</mixed-citation>
</ref>
<ref id="B16">
<mixed-citation publication-type="journal">
<name>
<surname>Hoeijmakers</surname>
<given-names>WA</given-names>
</name>
<name>
<surname>Bartfai</surname>
<given-names>R</given-names>
</name>
<name>
<surname>Francoijs</surname>
<given-names>KJ</given-names>
</name>
<name>
<surname>Stunnenberg</surname>
<given-names>HG</given-names>
</name>
<article-title>Linear amplification for deep sequencing</article-title>
<source>Nat Protoc</source>
<year>2011</year>
<volume>6</volume>
<fpage>1026</fpage>
<lpage>1036</lpage>
<pub-id pub-id-type="doi">10.1038/nprot.2011.345</pub-id>
<pub-id pub-id-type="pmid">21720315</pub-id>
</mixed-citation>
</ref>
<ref id="B17">
<mixed-citation publication-type="journal">
<name>
<surname>Oyola</surname>
<given-names>SO</given-names>
</name>
<name>
<surname>Otto</surname>
<given-names>TD</given-names>
</name>
<name>
<surname>Gu</surname>
<given-names>Y</given-names>
</name>
<name>
<surname>Maslen</surname>
<given-names>G</given-names>
</name>
<name>
<surname>Manske</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Campino</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Turner</surname>
<given-names>DJ</given-names>
</name>
<name>
<surname>MacInnis</surname>
<given-names>B</given-names>
</name>
<name>
<surname>Kwiatkowski</surname>
<given-names>DP</given-names>
</name>
<name>
<surname>Swerdlow</surname>
<given-names>HP</given-names>
</name>
<name>
<surname>Quail</surname>
<given-names>MA</given-names>
</name>
<article-title>Optimizing Illumina next-generation sequencing library preparation for extremely AT-biased genomes</article-title>
<source>BMC Genomics</source>
<year>2012</year>
<volume>13</volume>
<fpage>1</fpage>
<pub-id pub-id-type="doi">10.1186/1471-2164-13-1</pub-id>
<pub-id pub-id-type="pmid">22214261</pub-id>
</mixed-citation>
</ref>
<ref id="B18">
<mixed-citation publication-type="other">
<name>
<surname>Hurwitz</surname>
<given-names>BH</given-names>
</name>
<name>
<surname>Sullivan</surname>
<given-names>MB</given-names>
</name>
<article-title>The Pacific Ocean Virome (POV): a marine viral metagenomic dataset and associated protein clusters for quantitative viral ecology</article-title>
<source>PLoS One</source>
<year>2012</year>
<comment>submitted</comment>
</mixed-citation>
</ref>
<ref id="B19">
<mixed-citation publication-type="journal">
<name>
<surname>Roux</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Faubladier</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Mahul</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Paulhe</surname>
<given-names>N</given-names>
</name>
<name>
<surname>Bernard</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Debroas</surname>
<given-names>D</given-names>
</name>
<name>
<surname>Enault</surname>
<given-names>F</given-names>
</name>
<article-title>Metavir: a web server dedicated to virome analysis</article-title>
<source>Bioinformatics</source>
<year>2011</year>
<volume>27</volume>
<fpage>3074</fpage>
<lpage>3075</lpage>
<pub-id pub-id-type="doi">10.1093/bioinformatics/btr519</pub-id>
<pub-id pub-id-type="pmid">21911332</pub-id>
</mixed-citation>
</ref>
<ref id="B20">
<mixed-citation publication-type="journal">
<name>
<surname>Wommack</surname>
<given-names>KE</given-names>
</name>
<name>
<surname>Polson</surname>
<given-names>SW</given-names>
</name>
<name>
<surname>Bhavsar</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Srinivasiah</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Jamindar</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Dumas</surname>
<given-names>M</given-names>
</name>
<article-title>VIROME: a standard operating procedure for classification of viral metagenome sequences</article-title>
<source>Stand Genomic Sci</source>
<year>2011</year>
<volume>4</volume>
<fpage>427</fpage>
<lpage>439</lpage>
</mixed-citation>
</ref>
<ref id="B21">
<mixed-citation publication-type="journal">
<name>
<surname>Karsenti</surname>
<given-names>E</given-names>
</name>
<name>
<surname>Acinas</surname>
<given-names>SG</given-names>
</name>
<name>
<surname>Bork</surname>
<given-names>P</given-names>
</name>
<name>
<surname>Bowler</surname>
<given-names>C</given-names>
</name>
<name>
<surname>De Vargas</surname>
<given-names>C</given-names>
</name>
<name>
<surname>Raes</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Sullivan</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Arendt</surname>
<given-names>D</given-names>
</name>
<name>
<surname>Benzoni</surname>
<given-names>F</given-names>
</name>
<name>
<surname>Claverie</surname>
<given-names>JM</given-names>
</name>
<article-title>A holistic approach to marine eco-systems biology</article-title>
<source>PLoS Biol</source>
<year>2011</year>
<volume>9</volume>
<fpage>e1001177</fpage>
<pub-id pub-id-type="doi">10.1371/journal.pbio.1001177</pub-id>
<pub-id pub-id-type="pmid">22028628</pub-id>
</mixed-citation>
</ref>
<ref id="B22">
<mixed-citation publication-type="journal">
<name>
<surname>Adey</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Morrison</surname>
<given-names>HG</given-names>
</name>
<name>
<surname>Asan</surname>
</name>
<name>
<surname>Xun</surname>
<given-names>X</given-names>
</name>
<name>
<surname>Kitzman</surname>
<given-names>JO</given-names>
</name>
<name>
<surname>Turner</surname>
<given-names>EH</given-names>
</name>
<name>
<surname>Stackhouse</surname>
<given-names>B</given-names>
</name>
<name>
<surname>MacKenzie</surname>
<given-names>AP</given-names>
</name>
<name>
<surname>Caruccio</surname>
<given-names>NC</given-names>
</name>
<name>
<surname>Zhang</surname>
<given-names>X</given-names>
</name>
<name>
<surname>Shendure</surname>
<given-names>J</given-names>
</name>
<article-title>Rapid, low-input, low-bias construction of shotgun fragment libraries by high-density in vitro transposition</article-title>
<source>Genome Biol</source>
<year>2010</year>
<volume>11</volume>
<fpage>R119</fpage>
<pub-id pub-id-type="doi">10.1186/gb-2010-11-12-r119</pub-id>
<pub-id pub-id-type="pmid">21143862</pub-id>
</mixed-citation>
</ref>
<ref id="B23">
<mixed-citation publication-type="journal">
<name>
<surname>Dong</surname>
<given-names>H</given-names>
</name>
<name>
<surname>Chen</surname>
<given-names>Y</given-names>
</name>
<name>
<surname>Shen</surname>
<given-names>Y</given-names>
</name>
<name>
<surname>Wang</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Zhao</surname>
<given-names>G</given-names>
</name>
<name>
<surname>Jin</surname>
<given-names>W</given-names>
</name>
<article-title>Artificial duplicate reads in sequencing data of 454 Genome Sequencer FLX System</article-title>
<source>Acta Biochim Biophys Sin</source>
<year>2011</year>
<volume>43</volume>
<fpage>496</fpage>
<lpage>500</lpage>
<pub-id pub-id-type="doi">10.1093/abbs/gmr030</pub-id>
<pub-id pub-id-type="pmid">21543404</pub-id>
</mixed-citation>
</ref>
<ref id="B24">
<mixed-citation publication-type="journal">
<name>
<surname>Kozarewa</surname>
<given-names>I</given-names>
</name>
<name>
<surname>Ning</surname>
<given-names>Z</given-names>
</name>
<name>
<surname>Quail</surname>
<given-names>MA</given-names>
</name>
<name>
<surname>Sanders</surname>
<given-names>MJ</given-names>
</name>
<name>
<surname>Berriman</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Turner</surname>
<given-names>DJ</given-names>
</name>
<article-title>Amplification-free Illumina sequencing-library preparation facilitates improved mapping and assembly of (G + C)-biased genomes</article-title>
<source>Nat Methods</source>
<year>2009</year>
<volume>6</volume>
<fpage>291</fpage>
<lpage>295</lpage>
<pub-id pub-id-type="doi">10.1038/nmeth.1311</pub-id>
<pub-id pub-id-type="pmid">19287394</pub-id>
</mixed-citation>
</ref>
<ref id="B25">
<mixed-citation publication-type="journal">
<name>
<surname>Hurlbert</surname>
<given-names>SH</given-names>
</name>
<article-title>Pseudoreplication and the design of ecological field experiments</article-title>
<source>Ecol Monogr</source>
<year>1984</year>
<volume>54</volume>
<fpage>187</fpage>
<lpage>211</lpage>
<pub-id pub-id-type="doi">10.2307/1942661</pub-id>
</mixed-citation>
</ref>
<ref id="B26">
<mixed-citation publication-type="journal">
<name>
<surname>Danhorn</surname>
<given-names>T</given-names>
</name>
<name>
<surname>Young</surname>
<given-names>CR</given-names>
</name>
<name>
<surname>DeLong</surname>
<given-names>EF</given-names>
</name>
<article-title>Comparison of large-insert, small-insert and pyrosequencing libraries for metagenomic analysis</article-title>
<source>ISME J</source>
<year>2012</year>
<volume>6</volume>
<fpage>2056</fpage>
<lpage>2066</lpage>
<pub-id pub-id-type="doi">10.1038/ismej.2012.35</pub-id>
<pub-id pub-id-type="pmid">22534608</pub-id>
</mixed-citation>
</ref>
<ref id="B27">
<mixed-citation publication-type="journal">
<name>
<surname>Knight</surname>
<given-names>R</given-names>
</name>
<name>
<surname>Jansson</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Field</surname>
<given-names>D</given-names>
</name>
<name>
<surname>Fierer</surname>
<given-names>N</given-names>
</name>
<name>
<surname>Desai</surname>
<given-names>N</given-names>
</name>
<name>
<surname>Fuhrman</surname>
<given-names>JA</given-names>
</name>
<name>
<surname>Hugenholtz</surname>
<given-names>P</given-names>
</name>
<name>
<surname>van der Lelie</surname>
<given-names>D</given-names>
</name>
<name>
<surname>Meyer</surname>
<given-names>F</given-names>
</name>
<name>
<surname>Stevens</surname>
<given-names>R</given-names>
</name>
<article-title>Unlocking the potential of metagenomics through replicated experimental design</article-title>
<source>Nat Biotechnol</source>
<year>2012</year>
<volume>30</volume>
<fpage>513</fpage>
<lpage>520</lpage>
<pub-id pub-id-type="doi">10.1038/nbt.2235</pub-id>
<pub-id pub-id-type="pmid">22678395</pub-id>
</mixed-citation>
</ref>
<ref id="B28">
<mixed-citation publication-type="journal">
<name>
<surname>Kishore</surname>
<given-names>R</given-names>
</name>
<name>
<surname>Reef Hardy</surname>
<given-names>W</given-names>
</name>
<name>
<surname>Anderson</surname>
<given-names>VJ</given-names>
</name>
<name>
<surname>Sanchez</surname>
<given-names>NA</given-names>
</name>
<name>
<surname>Buoncristiani</surname>
<given-names>MR</given-names>
</name>
<article-title>Optimization of DNA extraction from low-yield and degraded samples using the BioRobot EZ1 and BioRobot M48</article-title>
<source>J Forensic Sci</source>
<year>2006</year>
<volume>51</volume>
<fpage>1055</fpage>
<lpage>1061</lpage>
<pub-id pub-id-type="doi">10.1111/j.1556-4029.2006.00204.x</pub-id>
<pub-id pub-id-type="pmid">17018081</pub-id>
</mixed-citation>
</ref>
<ref id="B29">
<mixed-citation publication-type="book">
<name>
<surname>Sambrook</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Fritsch</surname>
<given-names>EF</given-names>
</name>
<name>
<surname>Maniatis</surname>
<given-names>T</given-names>
</name>
<source>Molecular Cloning, a laboratory manual</source>
<year>1989</year>
<publisher-name>Cold Spring Harbor Laboratory Press</publisher-name>
</mixed-citation>
</ref>
<ref id="B30">
<mixed-citation publication-type="journal">
<name>
<surname>Aird</surname>
<given-names>D</given-names>
</name>
<name>
<surname>Ross</surname>
<given-names>MG</given-names>
</name>
<name>
<surname>Chen</surname>
<given-names>WS</given-names>
</name>
<name>
<surname>Danielsson</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Fennell</surname>
<given-names>T</given-names>
</name>
<name>
<surname>Russ</surname>
<given-names>C</given-names>
</name>
<name>
<surname>Jaffe</surname>
<given-names>DB</given-names>
</name>
<name>
<surname>Nusbaum</surname>
<given-names>C</given-names>
</name>
<name>
<surname>Gnirke</surname>
<given-names>A</given-names>
</name>
<article-title>Analyzing and minimizing PCR amplification bias in Illumina sequencing libraries</article-title>
<source>Genome Biol</source>
<year>2011</year>
<volume>12</volume>
<fpage>R18</fpage>
<pub-id pub-id-type="doi">10.1186/gb-2011-12-2-r18</pub-id>
<pub-id pub-id-type="pmid">21338519</pub-id>
</mixed-citation>
</ref>
<ref id="B31">
<mixed-citation publication-type="journal">
<name>
<surname>Schwientek</surname>
<given-names>P</given-names>
</name>
<name>
<surname>Szczepanowski</surname>
<given-names>R</given-names>
</name>
<name>
<surname>Ruckert</surname>
<given-names>C</given-names>
</name>
<name>
<surname>Stoye</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Puhler</surname>
<given-names>A</given-names>
</name>
<article-title>Sequencing of high G + C microbial genomes using the ultrafast pyrosequencing technology</article-title>
<source>J Biotechnol</source>
<year>2011</year>
<volume>155</volume>
<fpage>68</fpage>
<lpage>77</lpage>
<pub-id pub-id-type="doi">10.1016/j.jbiotec.2011.04.010</pub-id>
<pub-id pub-id-type="pmid">21536083</pub-id>
</mixed-citation>
</ref>
<ref id="B32">
<mixed-citation publication-type="journal">
<name>
<surname>Quail</surname>
<given-names>MA</given-names>
</name>
<name>
<surname>Kozarewa</surname>
<given-names>I</given-names>
</name>
<name>
<surname>Smith</surname>
<given-names>F</given-names>
</name>
<name>
<surname>Scally</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Stephens</surname>
<given-names>PJ</given-names>
</name>
<name>
<surname>Durbin</surname>
<given-names>R</given-names>
</name>
<name>
<surname>Swerdlow</surname>
<given-names>H</given-names>
</name>
<name>
<surname>Turner</surname>
<given-names>DJ</given-names>
</name>
<article-title>A large genome center's improvements to the Illumina sequencing system</article-title>
<source>Nat Methods</source>
<year>2008</year>
<volume>5</volume>
<fpage>1005</fpage>
<lpage>1010</lpage>
<pub-id pub-id-type="doi">10.1038/nmeth.1270</pub-id>
<pub-id pub-id-type="pmid">19034268</pub-id>
</mixed-citation>
</ref>
<ref id="B33">
<mixed-citation publication-type="book">
<name>
<surname>Quail</surname>
<given-names>MA</given-names>
</name>
<article-title>DNA: Mechanical breakage</article-title>
<source>Encyclopedia of Life Sciences</source>
<year>2010</year>
<publisher-name>Chichester: John Wiley &amp; Sons, Ltd</publisher-name>
</mixed-citation>
</ref>
<ref id="B34">
<mixed-citation publication-type="journal">
<name>
<surname>Marine</surname>
<given-names>R</given-names>
</name>
<name>
<surname>Polson</surname>
<given-names>SW</given-names>
</name>
<name>
<surname>Ravel</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Hatfull</surname>
<given-names>G</given-names>
</name>
<name>
<surname>Russell</surname>
<given-names>D</given-names>
</name>
<name>
<surname>Sullivan</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Syed</surname>
<given-names>F</given-names>
</name>
<name>
<surname>Dumas</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Wommack</surname>
<given-names>KE</given-names>
</name>
<article-title>Evaluation of a transposase protocol for rapid generation of shotgun high-throughput sequencing libraries from nanogram quantities of DNA</article-title>
<source>Appl Environ Microbiol</source>
<year>2011</year>
<volume>77</volume>
<fpage>8071</fpage>
<lpage>8079</lpage>
<pub-id pub-id-type="doi">10.1128/AEM.05610-11</pub-id>
<pub-id pub-id-type="pmid">21948828</pub-id>
</mixed-citation>
</ref>
<ref id="B35">
<mixed-citation publication-type="journal">
<name>
<surname>Gomez-Alvarez</surname>
<given-names>V</given-names>
</name>
<name>
<surname>Teal</surname>
<given-names>TK</given-names>
</name>
<name>
<surname>Schmidt</surname>
<given-names>TM</given-names>
</name>
<article-title>Systematic artifacts in metagenomes from complex microbial communities</article-title>
<source>ISME J</source>
<year>2009</year>
<volume>3</volume>
<fpage>1314</fpage>
<lpage>1317</lpage>
<pub-id pub-id-type="doi">10.1038/ismej.2009.72</pub-id>
<pub-id pub-id-type="pmid">19587772</pub-id>
</mixed-citation>
</ref>
<ref id="B36">
<mixed-citation publication-type="journal">
<name>
<surname>Jerome</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Noirot</surname>
<given-names>C</given-names>
</name>
<name>
<surname>Klopp</surname>
<given-names>C</given-names>
</name>
<article-title>Assessment of replicate bias in 454 pyrosequencing and a multi-purpose read-filtering tool</article-title>
<source>BMC Res Notes</source>
<year>2011</year>
<volume>4</volume>
<fpage>149</fpage>
<pub-id pub-id-type="doi">10.1186/1756-0500-4-149</pub-id>
<pub-id pub-id-type="pmid">21615897</pub-id>
</mixed-citation>
</ref>
<ref id="B37">
<mixed-citation publication-type="journal">
<name>
<surname>Kristensen</surname>
<given-names>DM</given-names>
</name>
<name>
<surname>Mushegian</surname>
<given-names>AR</given-names>
</name>
<name>
<surname>Dolja</surname>
<given-names>VV</given-names>
</name>
<name>
<surname>Koonin</surname>
<given-names>EV</given-names>
</name>
<article-title>New dimensions of the virus world discovered through metagenomics</article-title>
<source>Trends Microbiol</source>
<year>2010</year>
<volume>18</volume>
<fpage>11</fpage>
<lpage>19</lpage>
<pub-id pub-id-type="doi">10.1016/j.tim.2009.11.003</pub-id>
<pub-id pub-id-type="pmid">19942437</pub-id>
</mixed-citation>
</ref>
<ref id="B38">
<mixed-citation publication-type="journal">
<name>
<surname>Wommack</surname>
<given-names>KE</given-names>
</name>
<name>
<surname>Bhavsar</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Ravel</surname>
<given-names>J</given-names>
</name>
<article-title>Metagenomics: read length matters</article-title>
<source>Appl Environ Microbiol</source>
<year>2008</year>
<volume>74</volume>
<fpage>1453</fpage>
<lpage>1463</lpage>
<pub-id pub-id-type="doi">10.1128/AEM.02181-07</pub-id>
<pub-id pub-id-type="pmid">18192407</pub-id>
</mixed-citation>
</ref>
<ref id="B39">
<mixed-citation publication-type="journal">
<name>
<surname>Wanunu</surname>
<given-names>M</given-names>
</name>
<article-title>Nanopores: A journey towards DNA sequencing</article-title>
<source>Phys Life Rev</source>
<year>2012</year>
<volume>9</volume>
<fpage>125</fpage>
<lpage>158</lpage>
<pub-id pub-id-type="doi">10.1016/j.plrev.2012.05.010</pub-id>
<pub-id pub-id-type="pmid">22658507</pub-id>
</mixed-citation>
</ref>
<ref id="B40">
<mixed-citation publication-type="other">
<name>
<surname>Allers</surname>
<given-names>E</given-names>
</name>
<name>
<surname>Moraru</surname>
<given-names>C</given-names>
</name>
<name>
<surname>Duhaime</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Beneze</surname>
<given-names>E</given-names>
</name>
<name>
<surname>Solonenko</surname>
<given-names>N</given-names>
</name>
<name>
<surname>Barrero-Canosa</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Amann</surname>
<given-names>R</given-names>
</name>
<name>
<surname>Sullivan</surname>
<given-names>MB</given-names>
</name>
<article-title>Single-cell and population level viral infection dynamics revealed by phageFISH, a method to visualize intracellular and free viruses</article-title>
<source>Environ Microbiol</source>
<year>2013</year>
<comment>in press</comment>
</mixed-citation>
</ref>
<ref id="B41">
<mixed-citation publication-type="journal">
<name>
<surname>Deng</surname>
<given-names>L</given-names>
</name>
<name>
<surname>Gregory</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Yilmaz</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Poulos</surname>
<given-names>BT</given-names>
</name>
<name>
<surname>Hugenholtz</surname>
<given-names>P</given-names>
</name>
<name>
<surname>Sullivan</surname>
<given-names>MB</given-names>
</name>
<article-title>Contrasting strategies of viruses that infect photo- and heterotrophic bacteria revealed by viral-tagging</article-title>
<source>mBio</source>
<year>2012</year>
<volume>3</volume>
<fpage>e00373-12</fpage>
<pub-id pub-id-type="pmid">23111870</pub-id>
</mixed-citation>
</ref>
<ref id="B42">
<mixed-citation publication-type="journal">
<name>
<surname>Tadmor</surname>
<given-names>AD</given-names>
</name>
<name>
<surname>Ottesen</surname>
<given-names>EA</given-names>
</name>
<name>
<surname>Leadbetter</surname>
<given-names>JR</given-names>
</name>
<name>
<surname>Phillips</surname>
<given-names>R</given-names>
</name>
<article-title>Probing individual environmental bacteria for viruses by using microfluidic digital PCR</article-title>
<source>Science</source>
<year>2011</year>
<volume>333</volume>
<fpage>58</fpage>
<lpage>62</lpage>
<pub-id pub-id-type="doi">10.1126/science.1200758</pub-id>
<pub-id pub-id-type="pmid">21719670</pub-id>
</mixed-citation>
</ref>
<ref id="B43">
<mixed-citation publication-type="journal">
<name>
<surname>Luo</surname>
<given-names>C</given-names>
</name>
<name>
<surname>Tsementzi</surname>
<given-names>D</given-names>
</name>
<name>
<surname>Kyrpides</surname>
<given-names>NC</given-names>
</name>
<name>
<surname>Konstantinidis</surname>
<given-names>KT</given-names>
</name>
<article-title>Individual genome assembly from complex community short-read metagenomic datasets</article-title>
<source>ISME J</source>
<year>2012</year>
<volume>6</volume>
<fpage>898</fpage>
<lpage>901</lpage>
<pub-id pub-id-type="doi">10.1038/ismej.2011.147</pub-id>
<pub-id pub-id-type="pmid">22030673</pub-id>
</mixed-citation>
</ref>
<ref id="B44">
<mixed-citation publication-type="other">
<name>
<surname>Taupp</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Lee</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Hawley</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Yang</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Hallam</surname>
<given-names>SJ</given-names>
</name>
<article-title>Large insert environmental genomic library production</article-title>
<source>J Visualized Exp: JoVE</source>
<year>2009</year>
<pub-id pub-id-type="doi">10.3791/1387</pub-id>
</mixed-citation>
</ref>
<ref id="B45">
<mixed-citation publication-type="journal">
<name>
<surname>Huse</surname>
<given-names>SM</given-names>
</name>
<name>
<surname>Dethlefsen</surname>
<given-names>L</given-names>
</name>
<name>
<surname>Huber</surname>
<given-names>JA</given-names>
</name>
<name>
<surname>Welch</surname>
<given-names>DM</given-names>
</name>
<name>
<surname>Relman</surname>
<given-names>DA</given-names>
</name>
<name>
<surname>Sogin</surname>
<given-names>ML</given-names>
</name>
<article-title>Exploring microbial diversity and taxonomy using SSU rRNA hypervariable tag sequencing</article-title>
<source>PLoS Genet</source>
<year>2008</year>
<volume>4</volume>
<fpage>e1000255</fpage>
<pub-id pub-id-type="doi">10.1371/journal.pgen.1000255</pub-id>
<pub-id pub-id-type="pmid">19023400</pub-id>
</mixed-citation>
</ref>
<ref id="B46">
<mixed-citation publication-type="journal">
<name>
<surname>Rothberg</surname>
<given-names>JM</given-names>
</name>
<name>
<surname>Hinz</surname>
<given-names>W</given-names>
</name>
<name>
<surname>Rearick</surname>
<given-names>TM</given-names>
</name>
<name>
<surname>Schultz</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Mileski</surname>
<given-names>W</given-names>
</name>
<name>
<surname>Davey</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Leamon</surname>
<given-names>JH</given-names>
</name>
<name>
<surname>Johnson</surname>
<given-names>K</given-names>
</name>
<name>
<surname>Milgrew</surname>
<given-names>MJ</given-names>
</name>
<name>
<surname>Edwards</surname>
<given-names>M</given-names>
</name>
<article-title>An integrated semiconductor device enabling non-optical genome sequencing</article-title>
<source>Nature</source>
<year>2011</year>
<volume>475</volume>
<fpage>348</fpage>
<lpage>352</lpage>
<pub-id pub-id-type="doi">10.1038/nature10242</pub-id>
<pub-id pub-id-type="pmid">21776081</pub-id>
</mixed-citation>
</ref>
<ref id="B47">
<mixed-citation publication-type="journal">
<name>
<surname>Dohm</surname>
<given-names>JC</given-names>
</name>
<name>
<surname>Lottaz</surname>
<given-names>C</given-names>
</name>
<name>
<surname>Borodina</surname>
<given-names>T</given-names>
</name>
<name>
<surname>Himmelbauer</surname>
<given-names>H</given-names>
</name>
<article-title>Substantial biases in ultra-short read data sets from high-throughput DNA sequencing</article-title>
<source>Nucleic Acids Res</source>
<year>2008</year>
<volume>36</volume>
<fpage>e105</fpage>
<pub-id pub-id-type="doi">10.1093/nar/gkn425</pub-id>
<pub-id pub-id-type="pmid">18660515</pub-id>
</mixed-citation>
</ref>
<ref id="B48">
<mixed-citation publication-type="journal">
<name>
<surname>Minoche</surname>
<given-names>AE</given-names>
</name>
<name>
<surname>Dohm</surname>
<given-names>JC</given-names>
</name>
<name>
<surname>Himmelbauer</surname>
<given-names>H</given-names>
</name>
<article-title>Evaluation of genomic high-throughput sequencing data generated on Illumina HiSeq and genome analyzer systems</article-title>
<source>Genome Biol</source>
<year>2011</year>
<volume>12</volume>
<fpage>R112</fpage>
<pub-id pub-id-type="doi">10.1186/gb-2011-12-11-r112</pub-id>
<pub-id pub-id-type="pmid">22067484</pub-id>
</mixed-citation>
</ref>
<ref id="B49">
<mixed-citation publication-type="journal">
<name>
<surname>Cock</surname>
<given-names>PJ</given-names>
</name>
<name>
<surname>Fields</surname>
<given-names>CJ</given-names>
</name>
<name>
<surname>Goto</surname>
<given-names>N</given-names>
</name>
<name>
<surname>Heuer</surname>
<given-names>ML</given-names>
</name>
<name>
<surname>Rice</surname>
<given-names>PM</given-names>
</name>
<article-title>The Sanger FASTQ file format for sequences with quality scores, and the Solexa/Illumina FASTQ variants</article-title>
<source>Nucleic Acids Res</source>
<year>2010</year>
<volume>38</volume>
<fpage>1767</fpage>
<lpage>1771</lpage>
<pub-id pub-id-type="doi">10.1093/nar/gkp1137</pub-id>
<pub-id pub-id-type="pmid">20015970</pub-id>
</mixed-citation>
</ref>
<ref id="B50">
<mixed-citation publication-type="journal">
<name>
<surname>Benjamini</surname>
<given-names>Y</given-names>
</name>
<name>
<surname>Speed</surname>
<given-names>TP</given-names>
</name>
<article-title>Summarizing and correcting the GC content bias in high-throughput sequencing</article-title>
<source>Nucleic Acids Res</source>
<year>2012</year>
<volume>40</volume>
<fpage>e72</fpage>
<pub-id pub-id-type="doi">10.1093/nar/gks001</pub-id>
<pub-id pub-id-type="pmid">22323520</pub-id>
</mixed-citation>
</ref>
<ref id="B51">
<mixed-citation publication-type="journal">
<name>
<surname>Niu</surname>
<given-names>B</given-names>
</name>
<name>
<surname>Fu</surname>
<given-names>L</given-names>
</name>
<name>
<surname>Sun</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Li</surname>
<given-names>W</given-names>
</name>
<article-title>Artificial and natural duplicates in pyrosequencing reads of metagenomic data</article-title>
<source>BMC Bioinformatics</source>
<year>2010</year>
<volume>11</volume>
<fpage>187</fpage>
<pub-id pub-id-type="doi">10.1186/1471-2105-11-187</pub-id>
<pub-id pub-id-type="pmid">20388221</pub-id>
</mixed-citation>
</ref>
<ref id="B52">
<mixed-citation publication-type="journal">
<name>
<surname>Li</surname>
<given-names>W</given-names>
</name>
<name>
<surname>Fu</surname>
<given-names>L</given-names>
</name>
<name>
<surname>Niu</surname>
<given-names>B</given-names>
</name>
<name>
<surname>Wu</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Wooley</surname>
<given-names>J</given-names>
</name>
<article-title>Ultrafast clustering algorithms for metagenomic sequence analysis</article-title>
<source>Brief Bioinform</source>
<year>2012</year>
<volume>13</volume>
<fpage>656</fpage>
<pub-id pub-id-type="doi">10.1093/bib/bbs035</pub-id>
<pub-id pub-id-type="pmid">22772836</pub-id>
</mixed-citation>
</ref>
<ref id="B53">
<mixed-citation publication-type="journal">
<name>
<surname>Zerbino</surname>
<given-names>DR</given-names>
</name>
<name>
<surname>McEwen</surname>
<given-names>GK</given-names>
</name>
<name>
<surname>Margulies</surname>
<given-names>EH</given-names>
</name>
<name>
<surname>Birney</surname>
<given-names>E</given-names>
</name>
<article-title>Pebble and rock band: heuristic resolution of repeats and scaffolding in the velvet short-read de novo assembler</article-title>
<source>PLoS One</source>
<year>2009</year>
<volume>4</volume>
<fpage>e8407</fpage>
<pub-id pub-id-type="doi">10.1371/journal.pone.0008407</pub-id>
<pub-id pub-id-type="pmid">20027311</pub-id>
</mixed-citation>
</ref>
</ref-list>
</back>
</pmc>
</record>

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Wicri/Asie/explor/AustralieFrV1/Data/Pmc/Corpus
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 0015930 | SxmlIndent | more

Or

HfdSelect -h $EXPLOR_AREA/Data/Pmc/Corpus/biblio.hfd -nk 0015930 | SxmlIndent | more

To link to this page within the Wicri network

{{Explor lien
   |wiki=    Wicri/Asie
   |area=    AustralieFrV1
   |flux=    Pmc
   |étape=   Corpus
   |type=    RBID
   |clé=     
   |texte=   
}}

This area was generated with Dilib version V0.6.33.
Data generation: Tue Dec 5 10:43:12 2017. Site generation: Tue Mar 5 14:07:20 2024