Cyberinfrastructure Exploration Server

Warning: this site is under development.
Warning: this site was generated automatically from raw corpora.
The information has therefore not been validated.

Signal Processing for Metagenomics: Extracting Information from the Soup

Internal identifier: 000515 (Pmc/Corpus); previous: 000514; next: 000516

Authors: Gail L. Rosen; Bahrad A. Sokhansanj; Robi Polikar; Mary Ann Bruns; Jacob Russell; Elaine Garbarine; Steve Essinger; Non Yok

Source: Current Genomics, 2009

RBID: PMC:2808676

Abstract

Traditionally, studies in microbial genomics have focused on single genomes from cultured species, which restricts them to the small percentage of species that can be cultured outside their natural environment. Fortunately, recent advances in high-throughput sequencing and computational analysis have ushered in the new field of metagenomics, which aims to decode the genomes of microbes from natural communities without the need for cultivation. Although metagenomic studies have provided a great deal of insight into bacterial diversity and coding capacity, several computational challenges remain because of the massive size and complexity of metagenomic sequence data. This paper reviews current tools and techniques that address challenges in 1) genomic fragment annotation, 2) phylogenetic reconstruction, 3) functional classification of samples, and 4) interpretation of complementary metaproteomics and metametabolomics data. It also surveys important applications of metagenomic studies, including microbial forensics and the roles of microbial communities in shaping human health and soil ecology.


URL: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2808676
DOI: 10.2174/138920209789208255
PubMed: 20436876
PubMed Central: 2808676

The document in XML format

<record>
<TEI>
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en">Signal Processing for Metagenomics: Extracting Information from the Soup</title>
<author>
<name sortKey="Rosen, Gail L" sort="Rosen, Gail L" uniqKey="Rosen G" first="Gail L." last="Rosen">Gail L. Rosen</name>
<affiliation>
<nlm:aff id="aff1">Electrical and Computer Engineering Department, Drexel University, Philadelphia, PA, USA</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Sokhansanj, Bahrad A" sort="Sokhansanj, Bahrad A" uniqKey="Sokhansanj B" first="Bahrad A." last="Sokhansanj">Bahrad A. Sokhansanj</name>
<affiliation>
<nlm:aff id="aff2">School of Biomedical Engineering, Science, and Health Systems, Drexel University, Philadelphia, PA, USA</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Polikar, Robi" sort="Polikar, Robi" uniqKey="Polikar R" first="Robi" last="Polikar">Robi Polikar</name>
<affiliation>
<nlm:aff id="aff3">Electrical and Computer Engineering Department, Rowan University, Glassboro, NJ, USA</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Bruns, Mary Ann" sort="Bruns, Mary Ann" uniqKey="Bruns M" first="Mary Ann" last="Bruns">Mary Ann Bruns</name>
<affiliation>
<nlm:aff id="aff4">Soil Science/Microbial Ecology, Pennsylvania State University, University Park, PA, USA</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Russell, Jacob" sort="Russell, Jacob" uniqKey="Russell J" first="Jacob" last="Russell">Jacob Russell</name>
<affiliation>
<nlm:aff id="aff5">Biology Department, Drexel University, Philadelphia, PA, USA</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Garbarine, Elaine" sort="Garbarine, Elaine" uniqKey="Garbarine E" first="Elaine" last="Garbarine">Elaine Garbarine</name>
<affiliation>
<nlm:aff id="aff1">Electrical and Computer Engineering Department, Drexel University, Philadelphia, PA, USA</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Essinger, Steve" sort="Essinger, Steve" uniqKey="Essinger S" first="Steve" last="Essinger">Steve Essinger</name>
<affiliation>
<nlm:aff id="aff1">Electrical and Computer Engineering Department, Drexel University, Philadelphia, PA, USA</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Yok, Non" sort="Yok, Non" uniqKey="Yok N" first="Non" last="Yok">Non Yok</name>
<affiliation>
<nlm:aff id="aff1">Electrical and Computer Engineering Department, Drexel University, Philadelphia, PA, USA</nlm:aff>
</affiliation>
</author>
</titleStmt>
<publicationStmt>
<idno type="wicri:source">PMC</idno>
<idno type="pmid">20436876</idno>
<idno type="pmc">2808676</idno>
<idno type="url">http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2808676</idno>
<idno type="RBID">PMC:2808676</idno>
<idno type="doi">10.2174/138920209789208255</idno>
<date when="2009">2009</date>
<idno type="wicri:Area/Pmc/Corpus">000515</idno>
</publicationStmt>
<sourceDesc>
<biblStruct>
<analytic>
<title xml:lang="en" level="a" type="main">Signal Processing for Metagenomics: Extracting Information from the Soup</title>
<author>
<name sortKey="Rosen, Gail L" sort="Rosen, Gail L" uniqKey="Rosen G" first="Gail L." last="Rosen">Gail L. Rosen</name>
<affiliation>
<nlm:aff id="aff1">Electrical and Computer Engineering Department, Drexel University, Philadelphia, PA, USA</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Sokhansanj, Bahrad A" sort="Sokhansanj, Bahrad A" uniqKey="Sokhansanj B" first="Bahrad A." last="Sokhansanj">Bahrad A. Sokhansanj</name>
<affiliation>
<nlm:aff id="aff2">School of Biomedical Engineering, Science, and Health Systems, Drexel University, Philadelphia, PA, USA</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Polikar, Robi" sort="Polikar, Robi" uniqKey="Polikar R" first="Robi" last="Polikar">Robi Polikar</name>
<affiliation>
<nlm:aff id="aff3">Electrical and Computer Engineering Department, Rowan University, Glassboro, NJ, USA</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Bruns, Mary Ann" sort="Bruns, Mary Ann" uniqKey="Bruns M" first="Mary Ann" last="Bruns">Mary Ann Bruns</name>
<affiliation>
<nlm:aff id="aff4">Soil Science/Microbial Ecology, Pennsylvania State University, University Park, PA, USA</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Russell, Jacob" sort="Russell, Jacob" uniqKey="Russell J" first="Jacob" last="Russell">Jacob Russell</name>
<affiliation>
<nlm:aff id="aff5">Biology Department, Drexel University, Philadelphia, PA, USA</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Garbarine, Elaine" sort="Garbarine, Elaine" uniqKey="Garbarine E" first="Elaine" last="Garbarine">Elaine Garbarine</name>
<affiliation>
<nlm:aff id="aff1">Electrical and Computer Engineering Department, Drexel University, Philadelphia, PA, USA</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Essinger, Steve" sort="Essinger, Steve" uniqKey="Essinger S" first="Steve" last="Essinger">Steve Essinger</name>
<affiliation>
<nlm:aff id="aff1">Electrical and Computer Engineering Department, Drexel University, Philadelphia, PA, USA</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Yok, Non" sort="Yok, Non" uniqKey="Yok N" first="Non" last="Yok">Non Yok</name>
<affiliation>
<nlm:aff id="aff1">Electrical and Computer Engineering Department, Drexel University, Philadelphia, PA, USA</nlm:aff>
</affiliation>
</author>
</analytic>
<series>
<title level="j">Current Genomics</title>
<idno type="ISSN">1389-2029</idno>
<idno type="eISSN">1875-5488</idno>
<imprint>
<date when="2009">2009</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
</fileDesc>
<profileDesc>
<textClass></textClass>
</profileDesc>
</teiHeader>
<front>
<div type="abstract" xml:lang="en">
<p>Traditionally, studies in microbial genomics have focused on single genomes from cultured species, which restricts them to the small percentage of species that can be cultured outside their natural environment. Fortunately, recent advances in high-throughput sequencing and computational analysis have ushered in the new field of metagenomics, which aims to decode the genomes of microbes from natural communities without the need for cultivation. Although metagenomic studies have provided a great deal of insight into bacterial diversity and coding capacity, several computational challenges remain because of the massive size and complexity of metagenomic sequence data. This paper reviews current tools and techniques that address challenges in 1) genomic fragment annotation, 2) phylogenetic reconstruction, 3) functional classification of samples, and 4) interpretation of complementary metaproteomics and metametabolomics data. It also surveys important applications of metagenomic studies, including microbial forensics and the roles of microbial communities in shaping human health and soil ecology.</p>
</div>
</front>
<back>
<div1 type="bibliography">
<listBibl>
<biblStruct>
<analytic>
<author>
<name sortKey="Handelsman, J" uniqKey="Handelsman J">J Handelsman</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Swanson, Ms" uniqKey="Swanson M">MS Swanson</name>
</author>
<author>
<name sortKey="Hammer, Bk" uniqKey="Hammer B">BK Hammer</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Han, Y W" uniqKey="Han Y">Y. W Han</name>
</author>
<author>
<name sortKey="Shen, T" uniqKey="Shen T">T Shen</name>
</author>
<author>
<name sortKey="Chung, P" uniqKey="Chung P">P Chung</name>
</author>
<author>
<name sortKey="Buhimschi, I A" uniqKey="Buhimschi I">I. A Buhimschi</name>
</author>
<author>
<name sortKey="Buhimschi, C S" uniqKey="Buhimschi C">C. S Buhimschi</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Aas, J A" uniqKey="Aas J">J. A Aas</name>
</author>
<author>
<name sortKey="Paster, B J" uniqKey="Paster B">B. J Paster</name>
</author>
<author>
<name sortKey="Stokes, L N" uniqKey="Stokes L">L. N Stokes</name>
</author>
<author>
<name sortKey="Olsen, I" uniqKey="Olsen I">I Olsen</name>
</author>
<author>
<name sortKey="Dewhirst, F E" uniqKey="Dewhirst F">F. E Dewhirst</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Amann, R I" uniqKey="Amann R">R. I Amann</name>
</author>
<author>
<name sortKey="Ludwig, W" uniqKey="Ludwig W">W Ludwig</name>
</author>
<author>
<name sortKey="Schleifer, K H" uniqKey="Schleifer K">K. H Schleifer</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Mardis, E R" uniqKey="Mardis E">E. R Mardis</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Pop, M" uniqKey="Pop M">M Pop</name>
</author>
<author>
<name sortKey="Salzberg, S L" uniqKey="Salzberg S">S. L Salzberg</name>
</author>
</analytic>
</biblStruct>
<biblStruct></biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Bohannon, J" uniqKey="Bohannon J">J Bohannon</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Lukashin, A V" uniqKey="Lukashin A">A. V Lukashin</name>
</author>
<author>
<name sortKey="Borodovsky, M" uniqKey="Borodovsky M">M Borodovsky</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Ley, Re" uniqKey="Ley R">RE Ley</name>
</author>
<author>
<name sortKey="Turnbaugh, P" uniqKey="Turnbaugh P">P Turnbaugh</name>
</author>
<author>
<name sortKey="Klein, S" uniqKey="Klein S">S Klein</name>
</author>
<author>
<name sortKey="Gordon, Ji" uniqKey="Gordon J">JI Gordon</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Turnbaugh, P J" uniqKey="Turnbaugh P">P. J Turnbaugh</name>
</author>
<author>
<name sortKey="Ley, R E" uniqKey="Ley R">R. E Ley</name>
</author>
<author>
<name sortKey="Hamady, M" uniqKey="Hamady M">M Hamady</name>
</author>
<author>
<name sortKey="Fraser Liggett, C M" uniqKey="Fraser Liggett C">C. M Fraser-Liggett</name>
</author>
<author>
<name sortKey="Knight, R" uniqKey="Knight R">R Knight</name>
</author>
<author>
<name sortKey="Gordon, J I" uniqKey="Gordon J">J. I Gordon</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Gill, S R" uniqKey="Gill S">S. R Gill</name>
</author>
<author>
<name sortKey="Pop, M" uniqKey="Pop M">M Pop</name>
</author>
<author>
<name sortKey="Deboy, R T" uniqKey="Deboy R">R. T DeBoy</name>
</author>
<author>
<name sortKey="Eckburg, P B" uniqKey="Eckburg P">P. B Eckburg</name>
</author>
<author>
<name sortKey="Turnbaugh, P J" uniqKey="Turnbaugh P">P. J Turnbaugh</name>
</author>
<author>
<name sortKey="Samuel, B S" uniqKey="Samuel B">B. S Samuel</name>
</author>
<author>
<name sortKey="Gordon, J I" uniqKey="Gordon J">J. I Gordon</name>
</author>
<author>
<name sortKey="Relman, D A" uniqKey="Relman D">D. A Relman</name>
</author>
<author>
<name sortKey="Fraser Liggett, C M" uniqKey="Fraser Liggett C">C. M Fraser-Liggett</name>
</author>
<author>
<name sortKey="Nelson, K E" uniqKey="Nelson K">K. E Nelson</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Kurokawa, K" uniqKey="Kurokawa K">K Kurokawa</name>
</author>
<author>
<name sortKey="Itoh, T" uniqKey="Itoh T">T Itoh</name>
</author>
<author>
<name sortKey="Kuwahara, T" uniqKey="Kuwahara T">T Kuwahara</name>
</author>
<author>
<name sortKey="Oshima, K" uniqKey="Oshima K">K Oshima</name>
</author>
<author>
<name sortKey="Toh, H" uniqKey="Toh H">H Toh</name>
</author>
<author>
<name sortKey="Toyoda, A" uniqKey="Toyoda A">A Toyoda</name>
</author>
<author>
<name sortKey="Takami, H" uniqKey="Takami H">H Takami</name>
</author>
<author>
<name sortKey="Morita, H" uniqKey="Morita H">H Morita</name>
</author>
<author>
<name sortKey="Sharma, V K" uniqKey="Sharma V">V. K Sharma</name>
</author>
<author>
<name sortKey="Srivastava, T P" uniqKey="Srivastava T">T. P Srivastava</name>
</author>
<author>
<name sortKey="Taylor, T D" uniqKey="Taylor T">T. D Taylor</name>
</author>
<author>
<name sortKey="Noguchi, H" uniqKey="Noguchi H">H Noguchi</name>
</author>
<author>
<name sortKey="Mori, H" uniqKey="Mori H">H Mori</name>
</author>
<author>
<name sortKey="Ogura, Y" uniqKey="Ogura Y">Y Ogura</name>
</author>
<author>
<name sortKey="Ehrlich, D S" uniqKey="Ehrlich D">D. S Ehrlich</name>
</author>
<author>
<name sortKey="Itoh, K" uniqKey="Itoh K">K Itoh</name>
</author>
<author>
<name sortKey="Takagi, T" uniqKey="Takagi T">T Takagi</name>
</author>
<author>
<name sortKey="Sakaki, Y" uniqKey="Sakaki Y">Y Sakaki</name>
</author>
<author>
<name sortKey="Hayashi, T" uniqKey="Hayashi T">T Hayashi</name>
</author>
<author>
<name sortKey="Hattori, M" uniqKey="Hattori M">M Hattori</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Frank, Dn" uniqKey="Frank D">DN Frank</name>
</author>
<author>
<name sortKey="Pace, N R" uniqKey="Pace N">N. R Pace</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Andersson, A" uniqKey="Andersson A">A Andersson</name>
</author>
<author>
<name sortKey="Lindberg, M" uniqKey="Lindberg M">M Lindberg</name>
</author>
<author>
<name sortKey="Jakobsson, H" uniqKey="Jakobsson H">H Jakobsson</name>
</author>
<author>
<name sortKey="Backhed, F" uniqKey="Backhed F">F Backhed</name>
</author>
<author>
<name sortKey="Nyren, P" uniqKey="Nyren P">P Nyren</name>
</author>
<author>
<name sortKey="Engstrand, L" uniqKey="Engstrand L">L Engstrand</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Corby, P M" uniqKey="Corby P">P. M Corby</name>
</author>
<author>
<name sortKey="Lyons Weiler, J" uniqKey="Lyons Weiler J">J Lyons-Weiler</name>
</author>
<author>
<name sortKey="Bretz, W A" uniqKey="Bretz W">W. A Bretz</name>
</author>
<author>
<name sortKey="Hart, T C" uniqKey="Hart T">T. C Hart</name>
</author>
<author>
<name sortKey="Aas, J A" uniqKey="Aas J">J. A Aas</name>
</author>
<author>
<name sortKey="Boumenna, T" uniqKey="Boumenna T">T Boumenna</name>
</author>
<author>
<name sortKey="Goss, J" uniqKey="Goss J">J Goss</name>
</author>
<author>
<name sortKey="Corby, A L" uniqKey="Corby A">A. L Corby</name>
</author>
<author>
<name sortKey="Junior, H M" uniqKey="Junior H">H. M Junior</name>
</author>
<author>
<name sortKey="Weyant, R J" uniqKey="Weyant R">R. J Weyant</name>
</author>
<author>
<name sortKey="Paster, B J" uniqKey="Paster B">B. J Paster</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Faveri, M" uniqKey="Faveri M">M Faveri</name>
</author>
<author>
<name sortKey="Mayer, M P" uniqKey="Mayer M">M. P Mayer</name>
</author>
<author>
<name sortKey="Feres, M" uniqKey="Feres M">M Feres</name>
</author>
<author>
<name sortKey="De Figueiredo, L C" uniqKey="De Figueiredo L">L. C de Figueiredo</name>
</author>
<author>
<name sortKey="Dewhirst, F E" uniqKey="Dewhirst F">F. E Dewhirst</name>
</author>
<author>
<name sortKey="Paster, B J" uniqKey="Paster B">B. J Paster</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Grice, E A" uniqKey="Grice E">E. A Grice</name>
</author>
<author>
<name sortKey="Kong, H H" uniqKey="Kong H">H. H Kong</name>
</author>
<author>
<name sortKey="Renaud, G" uniqKey="Renaud G">G Renaud</name>
</author>
<author>
<name sortKey="Young, A C" uniqKey="Young A">A. C Young</name>
</author>
<author>
<name sortKey="Program, Nc S" uniqKey="Program N">NC. S Program</name>
</author>
<author>
<name sortKey="Bouffard, G G" uniqKey="Bouffard G">G. G Bouffard</name>
</author>
<author>
<name sortKey="Blakesley, R W" uniqKey="Blakesley R">R. W Blakesley</name>
</author>
<author>
<name sortKey="Wolfsberg, T G" uniqKey="Wolfsberg T">T. G Wolfsberg</name>
</author>
<author>
<name sortKey="Turner, M L" uniqKey="Turner M">M. L Turner</name>
</author>
<author>
<name sortKey="Segre, J A" uniqKey="Segre J">J. A Segre</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Sundquist, A" uniqKey="Sundquist A">A Sundquist</name>
</author>
<author>
<name sortKey="Bigdeli, S" uniqKey="Bigdeli S">S Bigdeli</name>
</author>
<author>
<name sortKey="Jalili, R" uniqKey="Jalili R">R Jalili</name>
</author>
<author>
<name sortKey="Druzin, M L" uniqKey="Druzin M">M. L Druzin</name>
</author>
<author>
<name sortKey="Waller, S" uniqKey="Waller S">S Waller</name>
</author>
<author>
<name sortKey="Pullen, K M" uniqKey="Pullen K">K. M Pullen</name>
</author>
<author>
<name sortKey="El Sayed, Y" uniqKey="El Sayed Y">Y El-Sayed</name>
</author>
<author>
<name sortKey="Taslimi, M M" uniqKey="Taslimi M">M. M Taslimi</name>
</author>
<author>
<name sortKey="Batzoglou, S" uniqKey="Batzoglou S">S Batzoglou</name>
</author>
<author>
<name sortKey="Ronaghi, M" uniqKey="Ronaghi M">M Ronaghi</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Noordin, K" uniqKey="Noordin K">K Noordin</name>
</author>
<author>
<name sortKey="Kamin, S" uniqKey="Kamin S">S Kamin</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Fierer, N" uniqKey="Fierer N">N Fierer</name>
</author>
<author>
<name sortKey="Breitbart, M" uniqKey="Breitbart M">M Breitbart</name>
</author>
<author>
<name sortKey="Nulton, J" uniqKey="Nulton J">J Nulton</name>
</author>
<author>
<name sortKey="Salamon, P" uniqKey="Salamon P">P Salamon</name>
</author>
<author>
<name sortKey="Lozupone, C" uniqKey="Lozupone C">C Lozupone</name>
</author>
<author>
<name sortKey="Jones, R" uniqKey="Jones R">R Jones</name>
</author>
<author>
<name sortKey="Robeson, M" uniqKey="Robeson M">M Robeson</name>
</author>
<author>
<name sortKey="Edwards, R A" uniqKey="Edwards R">R. A Edwards</name>
</author>
<author>
<name sortKey="Felts, B" uniqKey="Felts B">B Felts</name>
</author>
<author>
<name sortKey="Rayhawk, S" uniqKey="Rayhawk S">S Rayhawk</name>
</author>
<author>
<name sortKey="Knight, R" uniqKey="Knight R">R Knight</name>
</author>
<author>
<name sortKey="Rohwer, F" uniqKey="Rohwer F">F Rohwer</name>
</author>
<author>
<name sortKey="Jackson, R B" uniqKey="Jackson R">R. B Jackson</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Tringe, S G" uniqKey="Tringe S">S. G Tringe</name>
</author>
<author>
<name sortKey="Von Mering, C" uniqKey="Von Mering C">C von Mering</name>
</author>
<author>
<name sortKey="Kobayashi, A" uniqKey="Kobayashi A">A Kobayashi</name>
</author>
<author>
<name sortKey="Salamov, A A" uniqKey="Salamov A">A. A Salamov</name>
</author>
<author>
<name sortKey="Chen, K" uniqKey="Chen K">K Chen</name>
</author>
<author>
<name sortKey="Chang, H W" uniqKey="Chang H">H. W Chang</name>
</author>
<author>
<name sortKey="Podar, M" uniqKey="Podar M">M Podar</name>
</author>
<author>
<name sortKey="Short, J M" uniqKey="Short J">J. M Short</name>
</author>
<author>
<name sortKey="Mathur, E J" uniqKey="Mathur E">E. J Mathur</name>
</author>
<author>
<name sortKey="Detter, J C" uniqKey="Detter J">J. C Detter</name>
</author>
<author>
<name sortKey="Bork, P" uniqKey="Bork P">P Bork</name>
</author>
<author>
<name sortKey="Hugenholtz, P" uniqKey="Hugenholtz P">P Hugenholtz</name>
</author>
<author>
<name sortKey="Rubin, E M" uniqKey="Rubin E">E. M Rubin</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Nielsen, M N" uniqKey="Nielsen M">M. N Nielsen</name>
</author>
<author>
<name sortKey="Winding, A" uniqKey="Winding A">A Winding</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Van Elsas, J D" uniqKey="Van Elsas J">J. D van Elsas</name>
</author>
<author>
<name sortKey="Speksnijder, A J" uniqKey="Speksnijder A">A. J Speksnijder</name>
</author>
<author>
<name sortKey="Van Overbeek, L S" uniqKey="Van Overbeek L">L. S van Overbeek</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Eyers, L" uniqKey="Eyers L">L Eyers</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Demaneche, S" uniqKey="Demaneche S">S Demaneche</name>
</author>
<author>
<name sortKey="David, M M" uniqKey="David M">M. M David</name>
</author>
<author>
<name sortKey="Navarro, E" uniqKey="Navarro E">E Navarro</name>
</author>
<author>
<name sortKey="Simonet, P" uniqKey="Simonet P">P Simonet</name>
</author>
<author>
<name sortKey="Vogel, T M" uniqKey="Vogel T">T. M Vogel</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Fitch, J P" uniqKey="Fitch J">J. P Fitch</name>
</author>
<author>
<name sortKey="Raber, E" uniqKey="Raber E">E Raber</name>
</author>
<author>
<name sortKey="Imbro, D R" uniqKey="Imbro D">D. R Imbro</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Enserink, M" uniqKey="Enserink M">M Enserink</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Enserink, M" uniqKey="Enserink M">M Enserink</name>
</author>
<author>
<name sortKey="Bhattacharjee, Y" uniqKey="Bhattacharjee Y">Y Bhattacharjee</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Blow, M J" uniqKey="Blow M">M. J Blow</name>
</author>
<author>
<name sortKey="Zhang, T" uniqKey="Zhang T">T Zhang</name>
</author>
<author>
<name sortKey="Woyke, T" uniqKey="Woyke T">T Woyke</name>
</author>
<author>
<name sortKey="Speller, C F" uniqKey="Speller C">C. F Speller</name>
</author>
<author>
<name sortKey="Krivoshapkin, A" uniqKey="Krivoshapkin A">A Krivoshapkin</name>
</author>
<author>
<name sortKey="Yang, D Y" uniqKey="Yang D">D. Y Yang</name>
</author>
<author>
<name sortKey="Derevianko, A" uniqKey="Derevianko A">A Derevianko</name>
</author>
<author>
<name sortKey="Rubin, E M" uniqKey="Rubin E">E. M Rubin</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Ho, S Y W" uniqKey="Ho S">S. Y. W Ho</name>
</author>
<author>
<name sortKey="Heupink, T H" uniqKey="Heupink T">T. H Heupink</name>
</author>
<author>
<name sortKey="Rambaut, A" uniqKey="Rambaut A">A Rambaut</name>
</author>
<author>
<name sortKey="Shapiro, B" uniqKey="Shapiro B">B Shapiro</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Margulies, M" uniqKey="Margulies M">M Margulies</name>
</author>
<author>
<name sortKey="Egholm, M" uniqKey="Egholm M">M Egholm</name>
</author>
<author>
<name sortKey="Altman, W E" uniqKey="Altman W">W. E Altman</name>
</author>
<author>
<name sortKey="Attiya, S" uniqKey="Attiya S">S Attiya</name>
</author>
<author>
<name sortKey="Bader, J S" uniqKey="Bader J">J. S Bader</name>
</author>
<author>
<name sortKey="Bemben, L A" uniqKey="Bemben L">L. A Bemben</name>
</author>
<author>
<name sortKey="Berka, J" uniqKey="Berka J">J Berka</name>
</author>
<author>
<name sortKey="Braverman, M S" uniqKey="Braverman M">M. S Braverman</name>
</author>
<author>
<name sortKey="Chen, Y J" uniqKey="Chen Y">Y. J Chen</name>
</author>
<author>
<name sortKey="Chen, Z" uniqKey="Chen Z">Z Chen</name>
</author>
<author>
<name sortKey="Dewell, S B" uniqKey="Dewell S">S. B Dewell</name>
</author>
<author>
<name sortKey="Du, L" uniqKey="Du L">L Du</name>
</author>
<author>
<name sortKey="Fierro, J M" uniqKey="Fierro J">J. M Fierro</name>
</author>
<author>
<name sortKey="Gomes, X V" uniqKey="Gomes X">X. V Gomes</name>
</author>
<author>
<name sortKey="Godwin, B C" uniqKey="Godwin B">B. C Godwin</name>
</author>
<author>
<name sortKey="He, W" uniqKey="He W">W He</name>
</author>
<author>
<name sortKey="Helgesen, S" uniqKey="Helgesen S">S Helgesen</name>
</author>
<author>
<name sortKey="Ho, C H" uniqKey="Ho C">C. H Ho</name>
</author>
<author>
<name sortKey="Irzyk, G P" uniqKey="Irzyk G">G. P Irzyk</name>
</author>
<author>
<name sortKey="Jando, S C" uniqKey="Jando S">S. C Jando</name>
</author>
<author>
<name sortKey="Alenquer, M L" uniqKey="Alenquer M">M. L Alenquer</name>
</author>
<author>
<name sortKey="Jarvie, T P" uniqKey="Jarvie T">T. P Jarvie</name>
</author>
<author>
<name sortKey="Jirage, K B" uniqKey="Jirage K">K. B Jirage</name>
</author>
<author>
<name sortKey="Kim, J B" uniqKey="Kim J">J. B Kim</name>
</author>
<author>
<name sortKey="Knight, J R" uniqKey="Knight J">J. R Knight</name>
</author>
<author>
<name sortKey="Lanza, J R" uniqKey="Lanza J">J. R Lanza</name>
</author>
<author>
<name sortKey="Leamon, J H" uniqKey="Leamon J">J. H Leamon</name>
</author>
<author>
<name sortKey="Lefkowitz, S M" uniqKey="Lefkowitz S">S. M Lefkowitz</name>
</author>
<author>
<name sortKey="Lei, M" uniqKey="Lei M">M Lei</name>
</author>
<author>
<name sortKey="Li, J" uniqKey="Li J">J Li</name>
</author>
<author>
<name sortKey="Lohman, K L" uniqKey="Lohman K">K. L Lohman</name>
</author>
<author>
<name sortKey="Lu, H" uniqKey="Lu H">H Lu</name>
</author>
<author>
<name sortKey="Makhijani, V B" uniqKey="Makhijani V">V. B Makhijani</name>
</author>
<author>
<name sortKey="Mcdade, K E" uniqKey="Mcdade K">K. E McDade</name>
</author>
<author>
<name sortKey="Mckenna, M P" uniqKey="Mckenna M">M. P McKenna</name>
</author>
<author>
<name sortKey="Myers, E W" uniqKey="Myers E">E. W Myers</name>
</author>
<author>
<name sortKey="Nickerson, E" uniqKey="Nickerson E">E Nickerson</name>
</author>
<author>
<name sortKey="Nobile, J R" uniqKey="Nobile J">J. R Nobile</name>
</author>
<author>
<name sortKey="Plant, R" uniqKey="Plant R">R Plant</name>
</author>
<author>
<name sortKey="Puc, B P" uniqKey="Puc B">B. P Puc</name>
</author>
<author>
<name sortKey="Ronan, M T" uniqKey="Ronan M">M. T Ronan</name>
</author>
<author>
<name sortKey="Roth, G T" uniqKey="Roth G">G. T Roth</name>
</author>
<author>
<name sortKey="Sarkis, G J" uniqKey="Sarkis G">G. J Sarkis</name>
</author>
<author>
<name sortKey="Simons, J F" uniqKey="Simons J">J. F Simons</name>
</author>
<author>
<name sortKey="Simpson, J W" uniqKey="Simpson J">J. W Simpson</name>
</author>
<author>
<name sortKey="Srinivasan, M" uniqKey="Srinivasan M">M Srinivasan</name>
</author>
<author>
<name sortKey="Tartaro, K R" uniqKey="Tartaro K">K. R Tartaro</name>
</author>
<author>
<name sortKey="Tomasz, A" uniqKey="Tomasz A">A Tomasz</name>
</author>
<author>
<name sortKey="Vogt, K A" uniqKey="Vogt K">K. A Vogt</name>
</author>
<author>
<name sortKey="Volkmer, G A" uniqKey="Volkmer G">G. A Volkmer</name>
</author>
<author>
<name sortKey="Wang, Sh" uniqKey="Wang S">SH Wang</name>
</author>
<author>
<name sortKey="Wang, Y" uniqKey="Wang Y">Y Wang</name>
</author>
<author>
<name sortKey="Weiner, M P" uniqKey="Weiner M">M. P Weiner</name>
</author>
<author>
<name sortKey="Yu, P" uniqKey="Yu P">P Yu</name>
</author>
<author>
<name sortKey="Begley, R F" uniqKey="Begley R">R. F Begley</name>
</author>
<author>
<name sortKey="Rothberg, J M" uniqKey="Rothberg J">J. M Rothberg</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Poinar, H N" uniqKey="Poinar H">H. N Poinar</name>
</author>
<author>
<name sortKey="Schwarz, C" uniqKey="Schwarz C">C Schwarz</name>
</author>
<author>
<name sortKey="Qi, J" uniqKey="Qi J">J Qi</name>
</author>
<author>
<name sortKey="Shapiro, B" uniqKey="Shapiro B">B Shapiro</name>
</author>
<author>
<name sortKey="Macphee, R D E" uniqKey="Macphee R">R. D. E MacPhee</name>
</author>
<author>
<name sortKey="Buigues, B" uniqKey="Buigues B">B Buigues</name>
</author>
<author>
<name sortKey="Tikhonov, A" uniqKey="Tikhonov A">A Tikhonov</name>
</author>
<author>
<name sortKey="Huson, D H" uniqKey="Huson D">D. H Huson</name>
</author>
<author>
<name sortKey="Tomsho, L P" uniqKey="Tomsho L">L. P Tomsho</name>
</author>
<author>
<name sortKey="Auch, A" uniqKey="Auch A">A Auch</name>
</author>
<author>
<name sortKey="Rampp, M" uniqKey="Rampp M">M Rampp</name>
</author>
<author>
<name sortKey="Miller, W" uniqKey="Miller W">W Miller</name>
</author>
<author>
<name sortKey="Schuster, S C" uniqKey="Schuster S">S. C Schuster</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Noonan, Jp" uniqKey="Noonan J">JP Noonan</name>
</author>
<author>
<name sortKey="Coop, G" uniqKey="Coop G">G Coop</name>
</author>
<author>
<name sortKey="Kudaravalli, S" uniqKey="Kudaravalli S">S Kudaravalli</name>
</author>
<author>
<name sortKey="Smith, D" uniqKey="Smith D">D Smith</name>
</author>
<author>
<name sortKey="Krause, J" uniqKey="Krause J">J Krause</name>
</author>
<author>
<name sortKey="Alessi, J" uniqKey="Alessi J">J Alessi</name>
</author>
<author>
<name sortKey="Chen, F" uniqKey="Chen F">F Chen</name>
</author>
<author>
<name sortKey="Platt, D" uniqKey="Platt D">D Platt</name>
</author>
<author>
<name sortKey="Paabo, S" uniqKey="Paabo S">S Paabo</name>
</author>
<author>
<name sortKey="Pritchard, Jk" uniqKey="Pritchard J">JK Pritchard</name>
</author>
<author>
<name sortKey="Rubin, Em" uniqKey="Rubin E">EM Rubin</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Tringe, Sg" uniqKey="Tringe S">SG Tringe</name>
</author>
<author>
<name sortKey="Zhang, T" uniqKey="Zhang T">T Zhang</name>
</author>
<author>
<name sortKey="Liu, X" uniqKey="Liu X">X Liu</name>
</author>
<author>
<name sortKey="Yu, Y" uniqKey="Yu Y">Y Yu</name>
</author>
<author>
<name sortKey="Lee, Wh" uniqKey="Lee W">WH Lee</name>
</author>
<author>
<name sortKey="Yap, J" uniqKey="Yap J">J Yap</name>
</author>
<author>
<name sortKey="Yao, F" uniqKey="Yao F">F Yao</name>
</author>
<author>
<name sortKey="Suan, St" uniqKey="Suan S">ST Suan</name>
</author>
<author>
<name sortKey="Ing, Sk" uniqKey="Ing S">SK Ing</name>
</author>
<author>
<name sortKey="Haynes, M" uniqKey="Haynes M">M Haynes</name>
</author>
<author>
<name sortKey="Rohwer, F" uniqKey="Rohwer F">F Rohwer</name>
</author>
<author>
<name sortKey="Wei, Cl" uniqKey="Wei C">CL Wei</name>
</author>
<author>
<name sortKey="Tan, P" uniqKey="Tan P">P Tan</name>
</author>
<author>
<name sortKey="Bristow, J" uniqKey="Bristow J">J Bristow</name>
</author>
<author>
<name sortKey="Rubin, Em" uniqKey="Rubin E">EM Rubin</name>
</author>
<author>
<name sortKey="Ruan, Y" uniqKey="Ruan Y">Y Ruan</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Bruns, Ma" uniqKey="Bruns M">MA Bruns</name>
</author>
<author>
<name sortKey="Scow, Km" uniqKey="Scow K">KM Scow</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Heath, Le" uniqKey="Heath L">LE Heath</name>
</author>
<author>
<name sortKey="Saunders, Va" uniqKey="Saunders V">VA Saunders</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Sanger, F" uniqKey="Sanger F">F Sanger</name>
</author>
<author>
<name sortKey="Nicklen, S" uniqKey="Nicklen S">S Nicklen</name>
</author>
<author>
<name sortKey="Coulson, Ar" uniqKey="Coulson A">AR Coulson</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Hugenholtz, P" uniqKey="Hugenholtz P">P Hugenholtz</name>
</author>
<author>
<name sortKey="Tyson, Gw" uniqKey="Tyson G">GW Tyson</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Altschul, Sf" uniqKey="Altschul S">SF Altschul</name>
</author>
<author>
<name sortKey="Gish, W" uniqKey="Gish W">W Gish</name>
</author>
<author>
<name sortKey="Miller, W" uniqKey="Miller W">W Miller</name>
</author>
<author>
<name sortKey="Myers, Ew" uniqKey="Myers E">EW Myers</name>
</author>
<author>
<name sortKey="Lipman, Dj" uniqKey="Lipman D">DJ Lipman</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Wommack, Ke" uniqKey="Wommack K">KE Wommack</name>
</author>
<author>
<name sortKey="Bhavsar, J" uniqKey="Bhavsar J">J Bhavsar</name>
</author>
<author>
<name sortKey="Ravel, J" uniqKey="Ravel J">J Ravel</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Garcia Martinez, J" uniqKey="Garcia Martinez J">J Garcia-Martinez</name>
</author>
<author>
<name sortKey="Acinas, Sg" uniqKey="Acinas S">SG Acinas</name>
</author>
<author>
<name sortKey="Anton, Ai" uniqKey="Anton A">AI Anton</name>
</author>
<author>
<name sortKey="Rodriguez Valera, F" uniqKey="Rodriguez Valera F">F Rodriguez-Valera</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Macrae, A" uniqKey="Macrae A">A Macrae</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Harris, Ka" uniqKey="Harris K">KA Harris</name>
</author>
<author>
<name sortKey="Hartley, Jc" uniqKey="Hartley J">JC Hartley</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Wang, Q" uniqKey="Wang Q">Q Wang</name>
</author>
<author>
<name sortKey="Garrity, G" uniqKey="Garrity G">G Garrity</name>
</author>
<author>
<name sortKey="Tiedje, Jm" uniqKey="Tiedje J">JM Tiedje</name>
</author>
<author>
<name sortKey="Cole, Jr" uniqKey="Cole J">JR Cole</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Liu, Z" uniqKey="Liu Z">Z Liu</name>
</author>
<author>
<name sortKey="Desantis, Tz" uniqKey="Desantis T">TZ DeSantis</name>
</author>
<author>
<name sortKey="Andersen, Gl" uniqKey="Andersen G">GL Andersen</name>
</author>
<author>
<name sortKey="Knight, R" uniqKey="Knight R">R Knight</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Peplies, J" uniqKey="Peplies J">J Peplies</name>
</author>
<author>
<name sortKey="Glockner, Fo" uniqKey="Glockner F">FO Glockner</name>
</author>
<author>
<name sortKey="Amann, R" uniqKey="Amann R">R Amann</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Treimo, J" uniqKey="Treimo J">J Treimo</name>
</author>
<author>
<name sortKey="Vegarud, G" uniqKey="Vegarud G">G Vegarud</name>
</author>
<author>
<name sortKey="Langsrud, T" uniqKey="Langsrud T">T Langsrud</name>
</author>
<author>
<name sortKey="Marki, S" uniqKey="Marki S">S Marki</name>
</author>
<author>
<name sortKey="Rudi, K" uniqKey="Rudi K">K Rudi</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Loy, A" uniqKey="Loy A">A Loy</name>
</author>
<author>
<name sortKey="Schulz, C" uniqKey="Schulz C">C Schulz</name>
</author>
<author>
<name sortKey="Lucker, S" uniqKey="Lucker S">S Lucker</name>
</author>
<author>
<name sortKey="Schopfer Wends, A" uniqKey="Schopfer Wends A">A Schopfer-Wendels</name>
</author>
<author>
<name sortKey="Stoecker, K" uniqKey="Stoecker K">K Stoecker</name>
</author>
<author>
<name sortKey="Baranyi, C" uniqKey="Baranyi C">C Baranyi</name>
</author>
<author>
<name sortKey="Lehner, A" uniqKey="Lehner A">A Lehner</name>
</author>
<author>
<name sortKey="Wagner, M" uniqKey="Wagner M">M Wagner</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Maron, P A" uniqKey="Maron P">P-A Maron</name>
</author>
<author>
<name sortKey="Ranjard, L" uniqKey="Ranjard L">L Ranjard</name>
</author>
<author>
<name sortKey="Mougel, C" uniqKey="Mougel C">C Mougel</name>
</author>
<author>
<name sortKey="Lemanceau, P" uniqKey="Lemanceau P">P Lemanceau</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Schulze, Wx" uniqKey="Schulze W">WX Schulze</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Kan, J" uniqKey="Kan J">J Kan</name>
</author>
<author>
<name sortKey="Hanson, Te" uniqKey="Hanson T">TE Hanson</name>
</author>
<author>
<name sortKey="Ginter, Jm" uniqKey="Ginter J">JM Ginter</name>
</author>
<author>
<name sortKey="Wang, K" uniqKey="Wang K">K Wang</name>
</author>
<author>
<name sortKey="Chen, F" uniqKey="Chen F">F Chen</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Lacerda, Cmr" uniqKey="Lacerda C">CMR Lacerda</name>
</author>
<author>
<name sortKey="Choe, Lh" uniqKey="Choe L">LH Choe</name>
</author>
<author>
<name sortKey="Reardon, Kf" uniqKey="Reardon K">KF Reardon</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Benndorf, D" uniqKey="Benndorf D">D Benndorf</name>
</author>
<author>
<name sortKey="Balcke, Gu" uniqKey="Balcke G">GU Balcke</name>
</author>
<author>
<name sortKey="Harms, H" uniqKey="Harms H">H Harms</name>
</author>
<author>
<name sortKey="Von Bergen, M" uniqKey="Von Bergen M">M von Bergen</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Raes, J" uniqKey="Raes J">J Raes</name>
</author>
<author>
<name sortKey="Foerstner, Ku" uniqKey="Foerstner K">KU Foerstner</name>
</author>
<author>
<name sortKey="Bork, P" uniqKey="Bork P">P Bork</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Eisen, Ja" uniqKey="Eisen J">JA Eisen</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Valdivia Granda, W" uniqKey="Valdivia Granda W">W Valdivia-Granda</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Rusch, Db" uniqKey="Rusch D">DB Rusch</name>
</author>
<author>
<name sortKey="Halpern, Al" uniqKey="Halpern A">AL Halpern</name>
</author>
<author>
<name sortKey="Sutton, G" uniqKey="Sutton G">G Sutton</name>
</author>
<author>
<name sortKey="Heidelberg, Kb" uniqKey="Heidelberg K">KB Heidelberg</name>
</author>
<author>
<name sortKey="Williamson, S" uniqKey="Williamson S">S Williamson</name>
</author>
<author>
<name sortKey="Yooseph, S" uniqKey="Yooseph S">S Yooseph</name>
</author>
<author>
<name sortKey="Wu, D" uniqKey="Wu D">D Wu</name>
</author>
<author>
<name sortKey="Eisen, Ja" uniqKey="Eisen J">JA Eisen</name>
</author>
<author>
<name sortKey="Hoffman, Jm" uniqKey="Hoffman J">JM Hoffman</name>
</author>
<author>
<name sortKey="Remington, K" uniqKey="Remington K">K Remington</name>
</author>
<author>
<name sortKey="Beeson, K" uniqKey="Beeson K">K Beeson</name>
</author>
<author>
<name sortKey="Tran, B" uniqKey="Tran B">B Tran</name>
</author>
<author>
<name sortKey="Smith, H" uniqKey="Smith H">H Smith</name>
</author>
<author>
<name sortKey="Baden Tillson, H" uniqKey="Baden Tillson H">H Baden-Tillson</name>
</author>
<author>
<name sortKey="Stewart, C" uniqKey="Stewart C">C Stewart</name>
</author>
<author>
<name sortKey="Thorpe, J" uniqKey="Thorpe J">J Thorpe</name>
</author>
<author>
<name sortKey="Freeman, J" uniqKey="Freeman J">J Freeman</name>
</author>
<author>
<name sortKey="Andrews Pfannkoch, C" uniqKey="Andrews Pfannkoch C">C Andrews-Pfannkoch</name>
</author>
<author>
<name sortKey="Venter, Je" uniqKey="Venter J">JE Venter</name>
</author>
<author>
<name sortKey="Li, K" uniqKey="Li K">K Li</name>
</author>
<author>
<name sortKey="Kravitz, S" uniqKey="Kravitz S">S Kravitz</name>
</author>
<author>
<name sortKey="Heidelberg, Jf" uniqKey="Heidelberg J">JF Heidelberg</name>
</author>
<author>
<name sortKey="Utterback, T" uniqKey="Utterback T">T Utterback</name>
</author>
<author>
<name sortKey="Rogers, Y H" uniqKey="Rogers Y">Y-H Rogers</name>
</author>
<author>
<name sortKey="Falcon, Li" uniqKey="Falcon L">LI Falcon</name>
</author>
<author>
<name sortKey="Souza, V" uniqKey="Souza V">V Souza</name>
</author>
<author>
<name sortKey="Bonilla Rosso, G" uniqKey="Bonilla Rosso G">G Bonilla-Rosso</name>
</author>
<author>
<name sortKey="Eguiarte, Le" uniqKey="Eguiarte L">LE Eguiarte</name>
</author>
<author>
<name sortKey="Karl, Dm" uniqKey="Karl D">DM Karl</name>
</author>
<author>
<name sortKey="Sathyendranath, S" uniqKey="Sathyendranath S">S Sathyendranath</name>
</author>
<author>
<name sortKey="Platt, T" uniqKey="Platt T">T Platt</name>
</author>
<author>
<name sortKey="Bermingham, E" uniqKey="Bermingham E">E Bermingham</name>
</author>
<author>
<name sortKey="Gallardo, V" uniqKey="Gallardo V">V Gallardo</name>
</author>
<author>
<name sortKey="Tamayo Castillo, G" uniqKey="Tamayo Castillo G">G Tamayo-Castillo</name>
</author>
<author>
<name sortKey="Ferrari, Mr" uniqKey="Ferrari M">MR Ferrari</name>
</author>
<author>
<name sortKey="Strausberg, Rl" uniqKey="Strausberg R">RL Strausberg</name>
</author>
<author>
<name sortKey="Nealson, K" uniqKey="Nealson K">K Nealson</name>
</author>
<author>
<name sortKey="Friedman, R" uniqKey="Friedman R">R Friedman</name>
</author>
<author>
<name sortKey="Frazier, M" uniqKey="Frazier M">M Frazier</name>
</author>
<author>
<name sortKey="Venter, Jc" uniqKey="Venter J">JC Venter</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Rosen, Gl" uniqKey="Rosen G">GL Rosen</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Gianoulis, Ta" uniqKey="Gianoulis T">TA Gianoulis</name>
</author>
<author>
<name sortKey="Raes, J" uniqKey="Raes J">J Raes</name>
</author>
<author>
<name sortKey="Patel, Pv" uniqKey="Patel P">PV Patel</name>
</author>
<author>
<name sortKey="Bjornson, R" uniqKey="Bjornson R">R Bjornson</name>
</author>
<author>
<name sortKey="Korbel, Jo" uniqKey="Korbel J">JO Korbel</name>
</author>
<author>
<name sortKey="Letunic, I" uniqKey="Letunic I">I Letunic</name>
</author>
<author>
<name sortKey="Yamada, T" uniqKey="Yamada T">T Yamada</name>
</author>
<author>
<name sortKey="Paccanaro, A" uniqKey="Paccanaro A">A Paccanaro</name>
</author>
<author>
<name sortKey="Jensen, Lj" uniqKey="Jensen L">LJ Jensen</name>
</author>
<author>
<name sortKey="Snyder, M" uniqKey="Snyder M">M Snyder</name>
</author>
<author>
<name sortKey="Bork, P" uniqKey="Bork P">P Bork</name>
</author>
<author>
<name sortKey="Gerstein, Mb" uniqKey="Gerstein M">MB Gerstein</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Mchardy, Ac" uniqKey="Mchardy A">AC McHardy</name>
</author>
<author>
<name sortKey="Rigoutsos, I" uniqKey="Rigoutsos I">I Rigoutsos</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Mchardy, Ac" uniqKey="Mchardy A">AC McHardy</name>
</author>
<author>
<name sortKey="Martin, Hg" uniqKey="Martin H">HG Martin</name>
</author>
<author>
<name sortKey="Tsirigos, A" uniqKey="Tsirigos A">A Tsirigos</name>
</author>
<author>
<name sortKey="Hugenholtz, P" uniqKey="Hugenholtz P">P Hugenholtz</name>
</author>
<author>
<name sortKey="Rigoutsos, I" uniqKey="Rigoutsos I">I Rigoutsos</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Rosen, Gl" uniqKey="Rosen G">GL Rosen</name>
</author>
<author>
<name sortKey="Garbarine, Em" uniqKey="Garbarine E">EM Garbarine</name>
</author>
<author>
<name sortKey="Caseiro, Da" uniqKey="Caseiro D">DA Caseiro</name>
</author>
<author>
<name sortKey="Polikar, R" uniqKey="Polikar R">R Polikar</name>
</author>
<author>
<name sortKey="Sokhansanj, Ba" uniqKey="Sokhansanj B">BA Sokhansanj</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Curtis, Tp" uniqKey="Curtis T">TP Curtis</name>
</author>
<author>
<name sortKey="Sloan, Wt" uniqKey="Sloan W">WT Sloan</name>
</author>
<author>
<name sortKey="Scannell, Jw" uniqKey="Scannell J">JW Scannell</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Huson, De" uniqKey="Huson D">DE Huson</name>
</author>
<author>
<name sortKey="Auch, Af" uniqKey="Auch A">AF Auch</name>
</author>
<author>
<name sortKey="Qi, J" uniqKey="Qi J">J Qi</name>
</author>
<author>
<name sortKey="Schuster, Sc" uniqKey="Schuster S">SC Schuster</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Sandberg, R" uniqKey="Sandberg R">R Sandberg</name>
</author>
<author>
<name sortKey="Winberg, G" uniqKey="Winberg G">G Winberg</name>
</author>
<author>
<name sortKey="Branden, Ci" uniqKey="Branden C">CI Branden</name>
</author>
<author>
<name sortKey="Kaske, A" uniqKey="Kaske A">A Kaske</name>
</author>
<author>
<name sortKey="Ernberg, I" uniqKey="Ernberg I">I Ernberg</name>
</author>
<author>
<name sortKey="Coster, J" uniqKey="Coster J">J Coster</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Koski, Lb" uniqKey="Koski L">LB Koski</name>
</author>
<author>
<name sortKey="Golding, Gb" uniqKey="Golding G">GB Golding</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Venter, Jc" uniqKey="Venter J">JC Venter</name>
</author>
<author>
<name sortKey="Remington, K" uniqKey="Remington K">K Remington</name>
</author>
<author>
<name sortKey="Heidelberg, Jf" uniqKey="Heidelberg J">JF Heidelberg</name>
</author>
<author>
<name sortKey="Halpern, Al" uniqKey="Halpern A">AL Halpern</name>
</author>
<author>
<name sortKey="Rusch, D" uniqKey="Rusch D">D Rusch</name>
</author>
<author>
<name sortKey="Eisen, Ja" uniqKey="Eisen J">JA Eisen</name>
</author>
<author>
<name sortKey="Wu, D" uniqKey="Wu D">D Wu</name>
</author>
<author>
<name sortKey="Paulsen, I" uniqKey="Paulsen I">I Paulsen</name>
</author>
<author>
<name sortKey="Nelson, Ke" uniqKey="Nelson K">KE Nelson</name>
</author>
<author>
<name sortKey="Nelson, W" uniqKey="Nelson W">W Nelson</name>
</author>
<author>
<name sortKey="Fouts, De" uniqKey="Fouts D">DE Fouts</name>
</author>
<author>
<name sortKey="Levy, S" uniqKey="Levy S">S Levy</name>
</author>
<author>
<name sortKey="Knap, Ah" uniqKey="Knap A">AH Knap</name>
</author>
<author>
<name sortKey="Lomas, Mw" uniqKey="Lomas M">MW Lomas</name>
</author>
<author>
<name sortKey="Nealson, K" uniqKey="Nealson K">K Nealson</name>
</author>
<author>
<name sortKey="White, O" uniqKey="White O">O White</name>
</author>
<author>
<name sortKey="Peterson, J" uniqKey="Peterson J">J Peterson</name>
</author>
<author>
<name sortKey="Hoffman, J" uniqKey="Hoffman J">J Hoffman</name>
</author>
<author>
<name sortKey="Parsons, R" uniqKey="Parsons R">R Parsons</name>
</author>
<author>
<name sortKey="Baden Tillson, H" uniqKey="Baden Tillson H">H Baden-Tillson</name>
</author>
<author>
<name sortKey="Pfannkoch, C" uniqKey="Pfannkoch C">C Pfannkoch</name>
</author>
<author>
<name sortKey="Rogers, Y H" uniqKey="Rogers Y">Y-H Rogers</name>
</author>
<author>
<name sortKey="Smith, Ho" uniqKey="Smith H">HO Smith</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Havre, Sl" uniqKey="Havre S">SL Havre</name>
</author>
<author>
<name sortKey="Webb Robertson, B J" uniqKey="Webb Robertson B">B-J Webb-Robertson</name>
</author>
<author>
<name sortKey="Shah, A" uniqKey="Shah A">A Shah</name>
</author>
<author>
<name sortKey="Posse, C" uniqKey="Posse C">C Posse</name>
</author>
<author>
<name sortKey="Gopalan, B" uniqKey="Gopalan B">B Gopalan</name>
</author>
<author>
<name sortKey="Brockman, Fj" uniqKey="Brockman F">FJ Brockman</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Neph, S" uniqKey="Neph S">S Neph</name>
</author>
<author>
<name sortKey="Tompa, M" uniqKey="Tompa M">M Tompa</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Pignatelli, M" uniqKey="Pignatelli M">M Pignatelli</name>
</author>
<author>
<name sortKey="Aparicio, G" uniqKey="Aparicio G">G Aparicio</name>
</author>
<author>
<name sortKey="Blanquer, I" uniqKey="Blanquer I">I Blanquer</name>
</author>
<author>
<name sortKey="Hernandez, V" uniqKey="Hernandez V">V Hernandez</name>
</author>
<author>
<name sortKey="Moya, A" uniqKey="Moya A">A Moya</name>
</author>
<author>
<name sortKey="Tamames, J" uniqKey="Tamames J">J Tamames</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Tress, Ml" uniqKey="Tress M">ML Tress</name>
</author>
<author>
<name sortKey="Cozzetto, D" uniqKey="Cozzetto D">D Cozzetto</name>
</author>
<author>
<name sortKey="Tramontano, A" uniqKey="Tramontano A">A Tramontano</name>
</author>
<author>
<name sortKey="Valencia, A" uniqKey="Valencia A">A Valencia</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Manichanh, C" uniqKey="Manichanh C">C Manichanh</name>
</author>
<author>
<name sortKey="Chapple, Ce" uniqKey="Chapple C">CE Chapple</name>
</author>
<author>
<name sortKey="Frangeul, L" uniqKey="Frangeul L">L Frangeul</name>
</author>
<author>
<name sortKey="Gloux, K" uniqKey="Gloux K">K Gloux</name>
</author>
<author>
<name sortKey="Guigo, R" uniqKey="Guigo R">R Guigo</name>
</author>
<author>
<name sortKey="Dore, J" uniqKey="Dore J">J Dore</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Mavromatis, K" uniqKey="Mavromatis K">K Mavromatis</name>
</author>
<author>
<name sortKey="Ivanova, N" uniqKey="Ivanova N">N Ivanova</name>
</author>
<author>
<name sortKey="Barry, K" uniqKey="Barry K">K Barry</name>
</author>
<author>
<name sortKey="Shapiro, H" uniqKey="Shapiro H">H Shapiro</name>
</author>
<author>
<name sortKey="Goltsman, E" uniqKey="Goltsman E">E Goltsman</name>
</author>
<author>
<name sortKey="Mchardy, Ac" uniqKey="Mchardy A">AC McHardy</name>
</author>
<author>
<name sortKey="Rigoutsos, I" uniqKey="Rigoutsos I">I Rigoutsos</name>
</author>
<author>
<name sortKey="Salamov, A" uniqKey="Salamov A">A Salamov</name>
</author>
<author>
<name sortKey="Korzeniewski, F" uniqKey="Korzeniewski F">F Korzeniewski</name>
</author>
<author>
<name sortKey="Land, M" uniqKey="Land M">M Land</name>
</author>
<author>
<name sortKey="Lapidus, A" uniqKey="Lapidus A">A Lapidus</name>
</author>
<author>
<name sortKey="Grigoriev, I" uniqKey="Grigoriev I">I Grigoriev</name>
</author>
<author>
<name sortKey="Richardson, P" uniqKey="Richardson P">P Richardson</name>
</author>
<author>
<name sortKey="Hugenholtz, P" uniqKey="Hugenholtz P">P Hugenholtz</name>
</author>
<author>
<name sortKey="Kyripides, Nc" uniqKey="Kyripides N">NC Kyrpides</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Karlin, S" uniqKey="Karlin S">S Karlin</name>
</author>
<author>
<name sortKey="Burge, C" uniqKey="Burge C">C Burge</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Karlin, S" uniqKey="Karlin S">S Karlin</name>
</author>
<author>
<name sortKey="Mrazek, J" uniqKey="Mrazek J">J Mrazek</name>
</author>
<author>
<name sortKey="Campbell, Am" uniqKey="Campbell A">AM Campbell</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Nakashima, H" uniqKey="Nakashima H">H Nakashima</name>
</author>
<author>
<name sortKey="Ota, M" uniqKey="Ota M">M Ota</name>
</author>
<author>
<name sortKey="Nishikawa, K" uniqKey="Nishikawa K">K Nishikawa</name>
</author>
<author>
<name sortKey="Ooi, T" uniqKey="Ooi T">T Ooi</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Deschavanne, Pj" uniqKey="Deschavanne P">PJ Deschavanne</name>
</author>
<author>
<name sortKey="Giron, A" uniqKey="Giron A">A Giron</name>
</author>
<author>
<name sortKey="Vilain, J" uniqKey="Vilain J">J Vilain</name>
</author>
<author>
<name sortKey="Fagot, G" uniqKey="Fagot G">G Fagot</name>
</author>
<author>
<name sortKey="Fertil, B" uniqKey="Fertil B">B Fertil</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Abe, T" uniqKey="Abe T">T Abe</name>
</author>
<author>
<name sortKey="Kanaya, S" uniqKey="Kanaya S">S Kanaya</name>
</author>
<author>
<name sortKey="Kinouchi, M" uniqKey="Kinouchi M">M Kinouchi</name>
</author>
<author>
<name sortKey="Ichiba, Y" uniqKey="Ichiba Y">Y Ichiba</name>
</author>
<author>
<name sortKey="Kozuki, T" uniqKey="Kozuki T">T Kozuki</name>
</author>
<author>
<name sortKey="Ikemura, T" uniqKey="Ikemura T">T Ikemura</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Pride, Dt" uniqKey="Pride D">DT Pride</name>
</author>
<author>
<name sortKey="Meinersmann, Rj" uniqKey="Meinersmann R">RJ Meinersmann</name>
</author>
<author>
<name sortKey="Wassenaar, Tm" uniqKey="Wassenaar T">TM Wassenaar</name>
</author>
<author>
<name sortKey="Blaser, Mj" uniqKey="Blaser M">MJ Blaser</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Teeling, H" uniqKey="Teeling H">H Teeling</name>
</author>
<author>
<name sortKey="Waldmann, J" uniqKey="Waldmann J">J Waldmann</name>
</author>
<author>
<name sortKey="Lombardot, T" uniqKey="Lombardot T">T Lombardot</name>
</author>
<author>
<name sortKey="Bauer, M" uniqKey="Bauer M">M Bauer</name>
</author>
<author>
<name sortKey="Glockner, Fo" uniqKey="Glockner F">FO Glockner</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Abe, T" uniqKey="Abe T">T Abe</name>
</author>
<author>
<name sortKey="Sugawara, H" uniqKey="Sugawara H">H Sugawara</name>
</author>
<author>
<name sortKey="Kinouchi, M" uniqKey="Kinouchi M">M Kinouchi</name>
</author>
<author>
<name sortKey="Kanaya, S" uniqKey="Kanaya S">S Kanaya</name>
</author>
<author>
<name sortKey="Ikemura, T" uniqKey="Ikemura T">T Ikemura</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Fertil, B" uniqKey="Fertil B">B Fertil</name>
</author>
<author>
<name sortKey="Massin, M" uniqKey="Massin M">M Massin</name>
</author>
<author>
<name sortKey="Lespinats, S" uniqKey="Lespinats S">S Lespinats</name>
</author>
<author>
<name sortKey="Devic, C" uniqKey="Devic C">C Devic</name>
</author>
<author>
<name sortKey="Dumee, P" uniqKey="Dumee P">P Dumee</name>
</author>
<author>
<name sortKey="Giron, A" uniqKey="Giron A">A Giron</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Akhtar, M" uniqKey="Akhtar M">M Akhtar</name>
</author>
<author>
<name sortKey="Epps, J" uniqKey="Epps J">J Epps</name>
</author>
<author>
<name sortKey="Ambikairajah, E" uniqKey="Ambikairajah E">E Ambikairajah</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Garbarine, E" uniqKey="Garbarine E">E Garbarine</name>
</author>
<author>
<name sortKey="Rosen, G" uniqKey="Rosen G">G Rosen</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Gadia, V" uniqKey="Gadia V">V Gadia</name>
</author>
<author>
<name sortKey="Rosen, Gl" uniqKey="Rosen G">GL Rosen</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Chen, K" uniqKey="Chen K">K Chen</name>
</author>
<author>
<name sortKey="Pachter, L" uniqKey="Pachter L">L Pachter</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Chan, C K" uniqKey="Chan C">C-K Chan</name>
</author>
<author>
<name sortKey="Hsu, Al" uniqKey="Hsu A">AL Hsu</name>
</author>
<author>
<name sortKey="Tang, Sl" uniqKey="Tang S">SL Tang</name>
</author>
<author>
<name sortKey="Halgamuge, Sk" uniqKey="Halgamuge S">SK Halgamuge</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Chatterji, S" uniqKey="Chatterji S">S Chatterji</name>
</author>
<author>
<name sortKey="Yamazaki, I" uniqKey="Yamazaki I">I Yamazaki</name>
</author>
<author>
<name sortKey="Bai, Z" uniqKey="Bai Z">Z Bai</name>
</author>
<author>
<name sortKey="Eisen, Ja" uniqKey="Eisen J">JA Eisen</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Nasser, S" uniqKey="Nasser S">S Nasser</name>
</author>
<author>
<name sortKey="Breland, A" uniqKey="Breland A">A Breland</name>
</author>
<author>
<name sortKey="Harris, F C" uniqKey="Harris F">FC Harris</name>
</author>
<author>
<name sortKey="Nicolescu, M" uniqKey="Nicolescu M">M Nicolescu</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Li, W" uniqKey="Li W">W Li</name>
</author>
<author>
<name sortKey="Wooley, Jc" uniqKey="Wooley J">JC Wooley</name>
</author>
<author>
<name sortKey="Godzik, A" uniqKey="Godzik A">A Godzik</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Harrison, Cj" uniqKey="Harrison C">CJ Harrison</name>
</author>
<author>
<name sortKey="Langdale, Ja" uniqKey="Langdale J">JA Langdale</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Nye, Tmw" uniqKey="Nye T">TMW Nye</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Gevers, D" uniqKey="Gevers D">D Gevers</name>
</author>
<author>
<name sortKey="Cohan, Fm" uniqKey="Cohan F">FM Cohan</name>
</author>
<author>
<name sortKey="Lawrence, Jg" uniqKey="Lawrence J">JG Lawrence</name>
</author>
<author>
<name sortKey="Spratt, Bg" uniqKey="Spratt B">BG Spratt</name>
</author>
<author>
<name sortKey="Coenye, T" uniqKey="Coenye T">T Coenye</name>
</author>
<author>
<name sortKey="Feil, Ej" uniqKey="Feil E">EJ Feil</name>
</author>
<author>
<name sortKey="Stackebrandt, E" uniqKey="Stackebrandt E">E Stackebrandt</name>
</author>
<author>
<name sortKey="Van De Peer, Y" uniqKey="Van De Peer Y">Y Van de Peer</name>
</author>
<author>
<name sortKey="Vandamme, P" uniqKey="Vandamme P">P Vandamme</name>
</author>
<author>
<name sortKey="Thompson, Fl" uniqKey="Thompson F">FL Thompson</name>
</author>
<author>
<name sortKey="Swings, J" uniqKey="Swings J">J Swings</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Hagstrom, A" uniqKey="Hagstrom A">A Hagstrom</name>
</author>
<author>
<name sortKey="Pommier, T" uniqKey="Pommier T">T Pommier</name>
</author>
<author>
<name sortKey="Rohwer, F" uniqKey="Rohwer F">F Rohwer</name>
</author>
<author>
<name sortKey="Simu, K" uniqKey="Simu K">K Simu</name>
</author>
<author>
<name sortKey="Stolte, W" uniqKey="Stolte W">W Stolte</name>
</author>
<author>
<name sortKey="Svensson, D" uniqKey="Svensson D">D Svensson</name>
</author>
<author>
<name sortKey="Zweifel, Ul" uniqKey="Zweifel U">UL Zweifel</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Hazen, Rm" uniqKey="Hazen R">RM Hazen</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Krause, L" uniqKey="Krause L">L Krause</name>
</author>
<author>
<name sortKey="Diaz, Nn" uniqKey="Diaz N">NN Diaz</name>
</author>
<author>
<name sortKey="Goesmann, A" uniqKey="Goesmann A">A Goesmann</name>
</author>
<author>
<name sortKey="Kelley, S" uniqKey="Kelley S">S Kelley</name>
</author>
<author>
<name sortKey="Nattkemper, Tw" uniqKey="Nattkemper T">TW Nattkemper</name>
</author>
<author>
<name sortKey="Rohwer, F" uniqKey="Rohwer F">F Rohwer</name>
</author>
<author>
<name sortKey="Edwards, Ra" uniqKey="Edwards R">RA Edwards</name>
</author>
<author>
<name sortKey="Stoye, J" uniqKey="Stoye J">J Stoye</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Thompson, Jd" uniqKey="Thompson J">JD Thompson</name>
</author>
<author>
<name sortKey="Plewniak, F" uniqKey="Plewniak F">F Plewniak</name>
</author>
<author>
<name sortKey="Poch, O" uniqKey="Poch O">O Poch</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Higgins, Dg" uniqKey="Higgins D">DG Higgins</name>
</author>
<author>
<name sortKey="Sharp, Pm" uniqKey="Sharp P">PM Sharp</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Tamura, K" uniqKey="Tamura K">K Tamura</name>
</author>
<author>
<name sortKey="Dudley, J" uniqKey="Dudley J">J Dudley</name>
</author>
<author>
<name sortKey="Nei, M" uniqKey="Nei M">M Nei</name>
</author>
<author>
<name sortKey="Kumar, S" uniqKey="Kumar S">S Kumar</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Swofford, Dl" uniqKey="Swofford D">DL Swofford</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Huelsenbeck, J" uniqKey="Huelsenbeck J">J Huelsenbeck</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Felsenstein, J" uniqKey="Felsenstein J">J Felsenstein</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Lozupone, C" uniqKey="Lozupone C">C Lozupone</name>
</author>
<author>
<name sortKey="Hamady, M" uniqKey="Hamady M">M Hamady</name>
</author>
<author>
<name sortKey="Knight, R" uniqKey="Knight R">R Knight</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Jeanmougin, F" uniqKey="Jeanmougin F">F Jeanmougin</name>
</author>
<author>
<name sortKey="Thompson, Jd" uniqKey="Thompson J">JD Thompson</name>
</author>
<author>
<name sortKey="Gouy, M" uniqKey="Gouy M">M Gouy</name>
</author>
<author>
<name sortKey="Higgins, Dg" uniqKey="Higgins D">DG Higgins</name>
</author>
<author>
<name sortKey="Gibson, Tj" uniqKey="Gibson T">TJ Gibson</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Sneath, Pha" uniqKey="Sneath P">PHA Sneath</name>
</author>
<author>
<name sortKey="Sokal, Rr" uniqKey="Sokal R">RR Sokal</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Gaut, Bs" uniqKey="Gaut B">BS Gaut</name>
</author>
<author>
<name sortKey="Lewis, Po" uniqKey="Lewis P">PO Lewis</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Yang, Z" uniqKey="Yang Z">Z Yang</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Hobolth, A" uniqKey="Hobolth A">A Hobolth</name>
</author>
<author>
<name sortKey="Yoshida, R" uniqKey="Yoshida R">R Yoshida</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Wang, Q" uniqKey="Wang Q">Q Wang</name>
</author>
<author>
<name sortKey="Salter, La" uniqKey="Salter L">LA Salter</name>
</author>
<author>
<name sortKey="Pearl, Dk" uniqKey="Pearl D">DK Pearl</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Jukes, Th" uniqKey="Jukes T">TH Jukes</name>
</author>
<author>
<name sortKey="Cantor, C" uniqKey="Cantor C">C Cantor</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Ripplinger, J" uniqKey="Ripplinger J">J Ripplinger</name>
</author>
<author>
<name sortKey="Sullivan, J" uniqKey="Sullivan J">J Sullivan</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Saitou, N" uniqKey="Saitou N">N Saitou</name>
</author>
<author>
<name sortKey="Nei, M" uniqKey="Nei M">M Nei</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Martin, Ap" uniqKey="Martin A">AP Martin</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Sokal, R" uniqKey="Sokal R">R Sokal</name>
</author>
<author>
<name sortKey="Rohlf, F" uniqKey="Rohlf F">F Rohlf</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Fitch, Jp" uniqKey="Fitch J">JP Fitch</name>
</author>
<author>
<name sortKey="Sokhansanj, B" uniqKey="Sokhansanj B">B Sokhansanj</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Sheikh, M A" uniqKey="Sheikh M">M A Sheikh</name>
</author>
<author>
<name sortKey="Milenkov, O" uniqKey="Milenkov O">O Milenkovic</name>
</author>
<author>
<name sortKey="Baraniuk, R G" uniqKey="Baraniuk R">R G Baraniuk</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Gingell, T" uniqKey="Gingell T">T Gingell</name>
</author>
<author>
<name sortKey="Lewis, C" uniqKey="Lewis C">C Lewis</name>
</author>
<author>
<name sortKey="Kowahl, N" uniqKey="Kowahl N">N Kowahl</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Yok, N" uniqKey="Yok N">N Yok</name>
</author>
<author>
<name sortKey="Rosen, Gl" uniqKey="Rosen G">GL Rosen</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Vikalo, H" uniqKey="Vikalo H">H Vikalo</name>
</author>
<author>
<name sortKey="Parvresh, F" uniqKey="Parvresh F">F Parvaresh</name>
</author>
<author>
<name sortKey="Misra, S" uniqKey="Misra S">S Misra</name>
</author>
<author>
<name sortKey="Hassibi, B" uniqKey="Hassibi B">B Hassibi</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Schliep, A" uniqKey="Schliep A">A Schliep</name>
</author>
<author>
<name sortKey="Torney, D" uniqKey="Torney D">D Torney</name>
</author>
<author>
<name sortKey="Rahmann, S" uniqKey="Rahmann S">S Rahmann</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Jones, Bv" uniqKey="Jones B">BV Jones</name>
</author>
<author>
<name sortKey="Begley, M" uniqKey="Begley M">M Begley</name>
</author>
<author>
<name sortKey="Hill, C" uniqKey="Hill C">C Hill</name>
</author>
<author>
<name sortKey="Gahan, Cgm" uniqKey="Gahan C">CGM Gahan</name>
</author>
<author>
<name sortKey="Marchesi, Jr" uniqKey="Marchesi J">JR Marchesi</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Elshahed, Ms" uniqKey="Elshahed M">MS Elshahed</name>
</author>
<author>
<name sortKey="Youssef, Nh" uniqKey="Youssef N">NH Youssef</name>
</author>
<author>
<name sortKey="Spain, Am" uniqKey="Spain A">AM Spain</name>
</author>
<author>
<name sortKey="Sheik, C" uniqKey="Sheik C">C Sheik</name>
</author>
<author>
<name sortKey="Najar, Fz" uniqKey="Najar F">FZ Najar</name>
</author>
<author>
<name sortKey="Sukharnikov, L O" uniqKey="Sukharnikov L">L O Sukharnikov</name>
</author>
<author>
<name sortKey="Roe, Ba" uniqKey="Roe B">BA Roe</name>
</author>
<author>
<name sortKey="Davis, Jp" uniqKey="Davis J">JP Davis</name>
</author>
<author>
<name sortKey="Schloss, Pd" uniqKey="Schloss P">PD Schloss</name>
</author>
<author>
<name sortKey="Bailey, Vl" uniqKey="Bailey V">VL Bailey</name>
</author>
<author>
<name sortKey="Krumholz, Lr" uniqKey="Krumholz L">LR Krumholz</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Fierer, N" uniqKey="Fierer N">N Fierer</name>
</author>
<author>
<name sortKey="Jackson, Rb" uniqKey="Jackson R">RB Jackson</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Allison, Sd" uniqKey="Allison S">SD Allison</name>
</author>
<author>
<name sortKey="Martiny, Jbh" uniqKey="Martiny J">JBH Martiny</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Delong, Ef" uniqKey="Delong E">EF DeLong</name>
</author>
<author>
<name sortKey="Preston, Cm" uniqKey="Preston C">CM Preston</name>
</author>
<author>
<name sortKey="Mincer, T" uniqKey="Mincer T">T Mincer</name>
</author>
<author>
<name sortKey="Rich, V" uniqKey="Rich V">V Rich</name>
</author>
<author>
<name sortKey="Hallam, Sj" uniqKey="Hallam S">SJ Hallam</name>
</author>
<author>
<name sortKey="Frigaard, N U" uniqKey="Frigaard N">N-U Frigaard</name>
</author>
<author>
<name sortKey="Martinez, A" uniqKey="Martinez A">A Martinez</name>
</author>
<author>
<name sortKey="Sullivan, Mb" uniqKey="Sullivan M">MB Sullivan</name>
</author>
<author>
<name sortKey="Edwards, R" uniqKey="Edwards R">R Edwards</name>
</author>
<author>
<name sortKey="Rodriguez Brito, B" uniqKey="Rodriguez Brito B">B Rodriguez-Brito</name>
</author>
<author>
<name sortKey="Chisholm, Sw" uniqKey="Chisholm S">SW Chisholm</name>
</author>
<author>
<name sortKey="Karl, Dm" uniqKey="Karl D">DM Karl</name>
</author>
</analytic>
</biblStruct>
<biblStruct></biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Kulp, D" uniqKey="Kulp D">D Kulp</name>
</author>
<author>
<name sortKey="Haussler, D" uniqKey="Haussler D">D Haussler</name>
</author>
<author>
<name sortKey="Reese, Mg" uniqKey="Reese M">MG Reese</name>
</author>
<author>
<name sortKey="Eeckman, Fh" uniqKey="Eeckman F">FH Eeckman</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Burge, C" uniqKey="Burge C">C Burge</name>
</author>
<author>
<name sortKey="Karlin, S" uniqKey="Karlin S">S Karlin</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Salzberg, Sl" uniqKey="Salzberg S">SL Salzberg</name>
</author>
<author>
<name sortKey="Delcher, Al" uniqKey="Delcher A">AL Delcher</name>
</author>
<author>
<name sortKey="Kasif, S" uniqKey="Kasif S">S Kasif</name>
</author>
<author>
<name sortKey="White, O" uniqKey="White O">O White</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Noguchi, H" uniqKey="Noguchi H">H Noguchi</name>
</author>
<author>
<name sortKey="Park, J" uniqKey="Park J">J Park</name>
</author>
<author>
<name sortKey="Takagi, T" uniqKey="Takagi T">T Takagi</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Benson, Da" uniqKey="Benson D">DA Benson</name>
</author>
<author>
<name sortKey="Karsch Mizrachi, I" uniqKey="Karsch Mizrachi I">I Karsch-Mizrachi</name>
</author>
<author>
<name sortKey="Lipman, Dj" uniqKey="Lipman D">DJ Lipman</name>
</author>
<author>
<name sortKey="Ostell, J" uniqKey="Ostell J">J Ostell</name>
</author>
<author>
<name sortKey="Rapp, Ba" uniqKey="Rapp B">BA Rapp</name>
</author>
<author>
<name sortKey="Wheeler, Dl" uniqKey="Wheeler D">DL Wheeler</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Harrington, Ed" uniqKey="Harrington E">ED Harrington</name>
</author>
<author>
<name sortKey="Singh, Ah" uniqKey="Singh A">AH Singh</name>
</author>
<author>
<name sortKey="Doerks, T" uniqKey="Doerks T">T Doerks</name>
</author>
<author>
<name sortKey="Letunic, I" uniqKey="Letunic I">I Letunic</name>
</author>
<author>
<name sortKey="Von Mering, C" uniqKey="Von Mering C">C von Mering</name>
</author>
<author>
<name sortKey="Jensen, Lj" uniqKey="Jensen L">LJ Jensen</name>
</author>
<author>
<name sortKey="Raes, J" uniqKey="Raes J">J Raes</name>
</author>
<author>
<name sortKey="Bork, P" uniqKey="Bork P">P Bork</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Kanehisa, M" uniqKey="Kanehisa M">M Kanehisa</name>
</author>
</analytic>
</biblStruct>
<biblStruct></biblStruct>
<biblStruct></biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Letunic, I" uniqKey="Letunic I">I Letunic</name>
</author>
<author>
<name sortKey="Copley, Rr" uniqKey="Copley R">RR Copley</name>
</author>
<author>
<name sortKey="Schmidt, S" uniqKey="Schmidt S">S Schmidt</name>
</author>
<author>
<name sortKey="Ciccarelli, Fd" uniqKey="Ciccarelli F">FD Ciccarelli</name>
</author>
<author>
<name sortKey="Doerks, T" uniqKey="Doerks T">T Doerks</name>
</author>
<author>
<name sortKey="Schultz, J" uniqKey="Schultz J">J Schultz</name>
</author>
<author>
<name sortKey="Ponting, Cp" uniqKey="Ponting C">CP Ponting</name>
</author>
<author>
<name sortKey="Bork, P" uniqKey="Bork P">P Bork</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Finn, Rd" uniqKey="Finn R">RD Finn</name>
</author>
<author>
<name sortKey="Tate, J" uniqKey="Tate J">J Tate</name>
</author>
<author>
<name sortKey="Mistry, J" uniqKey="Mistry J">J Mistry</name>
</author>
<author>
<name sortKey="Coggill, Pc" uniqKey="Coggill P">PC Coggill</name>
</author>
<author>
<name sortKey="Sammut, Sj" uniqKey="Sammut S">SJ Sammut</name>
</author>
<author>
<name sortKey="Hotz, H R" uniqKey="Hotz H">H-R Hotz</name>
</author>
<author>
<name sortKey="Ceric, G" uniqKey="Ceric G">G Ceric</name>
</author>
<author>
<name sortKey="Forslund, K" uniqKey="Forslund K">K Forslund</name>
</author>
<author>
<name sortKey="Eddy, Sr" uniqKey="Eddy S">SR Eddy</name>
</author>
<author>
<name sortKey="Sonnhammer, Ell" uniqKey="Sonnhammer E">ELL Sonnhammer</name>
</author>
<author>
<name sortKey="Bateman, A" uniqKey="Bateman A">A Bateman</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Yooseph, S" uniqKey="Yooseph S">S Yooseph</name>
</author>
<author>
<name sortKey="Li, W" uniqKey="Li W">W Li</name>
</author>
<author>
<name sortKey="Sutton, G" uniqKey="Sutton G">G Sutton</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Hoff, Kj" uniqKey="Hoff K">KJ Hoff</name>
</author>
<author>
<name sortKey="Tech, M" uniqKey="Tech M">M Tech</name>
</author>
<author>
<name sortKey="Lingner, T" uniqKey="Lingner T">T Lingner</name>
</author>
<author>
<name sortKey="Daniel, R" uniqKey="Daniel R">R Daniel</name>
</author>
<author>
<name sortKey="Morgenstern, B" uniqKey="Morgenstern B">B Morgenstern</name>
</author>
<author>
<name sortKey="Meinicke, P" uniqKey="Meinicke P">P Meinicke</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Dinsdale, Ea" uniqKey="Dinsdale E">EA Dinsdale</name>
</author>
<author>
<name sortKey="Edwards, Ra" uniqKey="Edwards R">RA Edwards</name>
</author>
<author>
<name sortKey="Hall, D" uniqKey="Hall D">D Hall</name>
</author>
<author>
<name sortKey="Angly, F" uniqKey="Angly F">F Angly</name>
</author>
<author>
<name sortKey="Breitbart, M" uniqKey="Breitbart M">M Breitbart</name>
</author>
<author>
<name sortKey="Brulc, Jm" uniqKey="Brulc J">JM Brulc</name>
</author>
<author>
<name sortKey="Furlan, M" uniqKey="Furlan M">M Furlan</name>
</author>
<author>
<name sortKey="Desnues, C" uniqKey="Desnues C">C Desnues</name>
</author>
<author>
<name sortKey="Haynes, M" uniqKey="Haynes M">M Haynes</name>
</author>
<author>
<name sortKey="Li, L" uniqKey="Li L">L Li</name>
</author>
<author>
<name sortKey="Mcdaniel, L" uniqKey="Mcdaniel L">L McDaniel</name>
</author>
<author>
<name sortKey="Moran, Ma" uniqKey="Moran M">MA Moran</name>
</author>
<author>
<name sortKey="Nelson, Ke" uniqKey="Nelson K">KE Nelson</name>
</author>
<author>
<name sortKey="Nilsson, C" uniqKey="Nilsson C">C Nilsson</name>
</author>
<author>
<name sortKey="Olson, R" uniqKey="Olson R">R Olson</name>
</author>
<author>
<name sortKey="Paul, J" uniqKey="Paul J">J Paul</name>
</author>
<author>
<name sortKey="Rodriguez Brito, B" uniqKey="Rodriguez Brito B">B Rodriguez-Brito</name>
</author>
<author>
<name sortKey="Ruan, Y" uniqKey="Ruan Y">Y Ruan</name>
</author>
<author>
<name sortKey="Swan, Bk" uniqKey="Swan B">BK Swan</name>
</author>
<author>
<name sortKey="Stevens, R" uniqKey="Stevens R">R Stevens</name>
</author>
<author>
<name sortKey="Valentine, Dl" uniqKey="Valentine D">DL Valentine</name>
</author>
<author>
<name sortKey="Thurber, Rv" uniqKey="Thurber R">RV Thurber</name>
</author>
<author>
<name sortKey="Wegley, L" uniqKey="Wegley L">L Wegley</name>
</author>
<author>
<name sortKey="White, Ba" uniqKey="White B">BA White</name>
</author>
<author>
<name sortKey="Rohwer, F" uniqKey="Rohwer F">F Rohwer</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Lozupone, Ca" uniqKey="Lozupone C">CA Lozupone</name>
</author>
<author>
<name sortKey="Knight, R" uniqKey="Knight R">R Knight</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Krause, L" uniqKey="Krause L">L Krause</name>
</author>
<author>
<name sortKey="Diaz, Nn" uniqKey="Diaz N">NN Diaz</name>
</author>
<author>
<name sortKey="Bartels, D" uniqKey="Bartels D">D Bartels</name>
</author>
<author>
<name sortKey="Edwards, Ra" uniqKey="Edwards R">RA Edwards</name>
</author>
<author>
<name sortKey="Puhler, A" uniqKey="Puhler A">A Puhler</name>
</author>
<author>
<name sortKey="Rohwer, F" uniqKey="Rohwer F">F Rohwer</name>
</author>
<author>
<name sortKey="Meyer, F" uniqKey="Meyer F">F Meyer</name>
</author>
<author>
<name sortKey="Stoye, J" uniqKey="Stoye J">J Stoye</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Mcgrath, Kc" uniqKey="Mcgrath K">KC McGrath</name>
</author>
<author>
<name sortKey="Thomas Hall, Sr" uniqKey="Thomas Hall S">SR Thomas-Hall</name>
</author>
<author>
<name sortKey="Cheng, Ct" uniqKey="Cheng C">CT Cheng</name>
</author>
<author>
<name sortKey="Leo, L" uniqKey="Leo L">L Leo</name>
</author>
<author>
<name sortKey="Alexa, A" uniqKey="Alexa A">A Alexa</name>
</author>
<author>
<name sortKey="Schmidt, S" uniqKey="Schmidt S">S Schmidt</name>
</author>
<author>
<name sortKey="Schenk, Pm" uniqKey="Schenk P">PM Schenk</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Scholten, Jcm" uniqKey="Scholten J">JCM Scholten</name>
</author>
<author>
<name sortKey="Culley, De" uniqKey="Culley D">DE Culley</name>
</author>
<author>
<name sortKey="Nie, L" uniqKey="Nie L">L Nie</name>
</author>
<author>
<name sortKey="Munn, Kj" uniqKey="Munn K">KJ Munn</name>
</author>
<author>
<name sortKey="Chow, L" uniqKey="Chow L">L Chow</name>
</author>
<author>
<name sortKey="Brockman, Fj" uniqKey="Brockman F">FJ Brockman</name>
</author>
<author>
<name sortKey="Zhang, W" uniqKey="Zhang W">W Zhang</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="He, Z" uniqKey="He Z">Z He</name>
</author>
<author>
<name sortKey="Gentry, Tj" uniqKey="Gentry T">TJ Gentry</name>
</author>
<author>
<name sortKey="Schadt, Cw" uniqKey="Schadt C">CW Schadt</name>
</author>
<author>
<name sortKey="Wu, L" uniqKey="Wu L">L Wu</name>
</author>
<author>
<name sortKey="Liebich, J" uniqKey="Liebich J">J Liebich</name>
</author>
<author>
<name sortKey="Chong, Sc" uniqKey="Chong S">SC Chong</name>
</author>
<author>
<name sortKey="Huang, Z" uniqKey="Huang Z">Z Huang</name>
</author>
<author>
<name sortKey="Wu, W" uniqKey="Wu W">W Wu</name>
</author>
<author>
<name sortKey="Gu, B" uniqKey="Gu B">B Gu</name>
</author>
<author>
<name sortKey="Jardin, P" uniqKey="Jardin P">P Jardine</name>
</author>
<author>
<name sortKey="Criddle, C" uniqKey="Criddle C">C Criddle</name>
</author>
<author>
<name sortKey="Zhou, J" uniqKey="Zhou J">J Zhou</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Rhee, Sk" uniqKey="Rhee S">SK Rhee</name>
</author>
<author>
<name sortKey="Liu, X" uniqKey="Liu X">X Liu</name>
</author>
<author>
<name sortKey="Wu, L" uniqKey="Wu L">L Wu</name>
</author>
<author>
<name sortKey="Chong, Sc" uniqKey="Chong S">SC Chong</name>
</author>
<author>
<name sortKey="Wan, X" uniqKey="Wan X">X Wan</name>
</author>
<author>
<name sortKey="Zhou, J" uniqKey="Zhou J">J Zhou</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Yergeau, E" uniqKey="Yergeau E">E Yergeau</name>
</author>
<author>
<name sortKey="Kang, S" uniqKey="Kang S">S Kang</name>
</author>
<author>
<name sortKey="He, Z" uniqKey="He Z">Z He</name>
</author>
<author>
<name sortKey="Zhou, J" uniqKey="Zhou J">J Zhou</name>
</author>
<author>
<name sortKey="Kowalchuk, Ga" uniqKey="Kowalchuk G">GA Kowalchuk</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Gilbert, Ja" uniqKey="Gilbert J">JA Gilbert</name>
</author>
<author>
<name sortKey="Field, D" uniqKey="Field D">D Field</name>
</author>
<author>
<name sortKey="Huang, Y" uniqKey="Huang Y">Y Huang</name>
</author>
<author>
<name sortKey="Edwards, R" uniqKey="Edwards R">R Edwards</name>
</author>
<author>
<name sortKey="Li, W" uniqKey="Li W">W Li</name>
</author>
<author>
<name sortKey="Gilna, P" uniqKey="Gilna P">P Gilna</name>
</author>
<author>
<name sortKey="Joint, I" uniqKey="Joint I">I Joint</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Frias Lopez, J" uniqKey="Frias Lopez J">J Frias-Lopez</name>
</author>
<author>
<name sortKey="Shi, Y" uniqKey="Shi Y">Y Shi</name>
</author>
<author>
<name sortKey="Tyson, Gw" uniqKey="Tyson G">GW Tyson</name>
</author>
<author>
<name sortKey="Coleman, Ml" uniqKey="Coleman M">ML Coleman</name>
</author>
<author>
<name sortKey="Schuster, Sc" uniqKey="Schuster S">SC Schuster</name>
</author>
<author>
<name sortKey="Chisholdm, Sw" uniqKey="Chisholdm S">SW Chisholm</name>
</author>
<author>
<name sortKey="Delong, Ef" uniqKey="Delong E">EF DeLong</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Klevecz, Rr" uniqKey="Klevecz R">RR Klevecz</name>
</author>
<author>
<name sortKey="Li, Cm" uniqKey="Li C">CM Li</name>
</author>
<author>
<name sortKey="Bolen, Jl" uniqKey="Bolen J">JL Bolen</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Alter, O" uniqKey="Alter O">O Alter</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Alter, O" uniqKey="Alter O">O Alter</name>
</author>
<author>
<name sortKey="Brown, Po" uniqKey="Brown P">PO Brown</name>
</author>
<author>
<name sortKey="Botstein, D" uniqKey="Botstein D">D Botstein</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Boutros, Pc" uniqKey="Boutros P">PC Boutros</name>
</author>
<author>
<name sortKey="Okey, Ab" uniqKey="Okey A">AB Okey</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Valafar, F" uniqKey="Valafar F">F Valafar</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Wilmes, P" uniqKey="Wilmes P">P Wilmes</name>
</author>
<author>
<name sortKey="Bond, Pl" uniqKey="Bond P">PL Bond</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Ram, Rj" uniqKey="Ram R">RJ Ram</name>
</author>
<author>
<name sortKey="Vanberkmoes, Nc" uniqKey="Vanberkmoes N">NC VerBerkmoes</name>
</author>
<author>
<name sortKey="Thelen, Mp" uniqKey="Thelen M">MP Thelen</name>
</author>
<author>
<name sortKey="Tyson, Gw" uniqKey="Tyson G">GW Tyson</name>
</author>
<author>
<name sortKey="Baker, Bj" uniqKey="Baker B">BJ Baker</name>
</author>
<author>
<name sortKey="Blake, Rc" uniqKey="Blake R">RC Blake</name>
</author>
<author>
<name sortKey="Shah, M" uniqKey="Shah M">M Shah</name>
</author>
<author>
<name sortKey="Hettich, Rl" uniqKey="Hettich R">RL Hettich</name>
</author>
<author>
<name sortKey="Banfield, Jf" uniqKey="Banfield J">JF Banfield</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Wilmes, P" uniqKey="Wilmes P">P Wilmes</name>
</author>
<author>
<name sortKey="Wexler, M" uniqKey="Wexler M">M Wexler</name>
</author>
<author>
<name sortKey="Bond, Pl" uniqKey="Bond P">PL Bond</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Denef, V J" uniqKey="Denef V">V J Denef</name>
</author>
<author>
<name sortKey="Verberkmoes, Nc" uniqKey="Verberkmoes N">NC VerBerkmoes</name>
</author>
<author>
<name sortKey="Shah, Mb" uniqKey="Shah M">MB Shah</name>
</author>
<author>
<name sortKey="Abraham, P" uniqKey="Abraham P">P Abraham</name>
</author>
<author>
<name sortKey="Lefsrud, M" uniqKey="Lefsrud M">M Lefsrud</name>
</author>
<author>
<name sortKey="Hettich, Rl" uniqKey="Hettich R">RL Hettich</name>
</author>
<author>
<name sortKey="Banfield, Jf" uniqKey="Banfield J">JF Banfield</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Zucht, Hd" uniqKey="Zucht H">HD Zucht</name>
</author>
<author>
<name sortKey="Lamerz, J" uniqKey="Lamerz J">J Lamerz</name>
</author>
<author>
<name sortKey="Khamenia, V" uniqKey="Khamenia V">V Khamenia</name>
</author>
<author>
<name sortKey="Schiller, C" uniqKey="Schiller C">C Schiller</name>
</author>
<author>
<name sortKey="Appel, A" uniqKey="Appel A">A Appel</name>
</author>
<author>
<name sortKey="Tammen, H" uniqKey="Tammen H">H Tammen</name>
</author>
<author>
<name sortKey="Crameri, R" uniqKey="Crameri R">R Crameri</name>
</author>
<author>
<name sortKey="Selle, H" uniqKey="Selle H">H Selle</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Ressom, Hw" uniqKey="Ressom H">HW Ressom</name>
</author>
<author>
<name sortKey="Varghese, Rs" uniqKey="Varghese R">RS Varghese</name>
</author>
<author>
<name sortKey="Zhang, Z" uniqKey="Zhang Z">Z Zhang</name>
</author>
<author>
<name sortKey="Xuan, J" uniqKey="Xuan J">J Xuan</name>
</author>
<author>
<name sortKey="Clarke, R" uniqKey="Clarke R">R Clarke</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Levner, I" uniqKey="Levner I">I Levner</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Zhang, X" uniqKey="Zhang X">X Zhang</name>
</author>
<author>
<name sortKey="Lu, X" uniqKey="Lu X">X Lu</name>
</author>
<author>
<name sortKey="Shi, Q" uniqKey="Shi Q">Q Shi</name>
</author>
<author>
<name sortKey="Zu, Xq" uniqKey="Zu X">XQ Xu</name>
</author>
<author>
<name sortKey="Leung, Hc" uniqKey="Leung H">HC Leung</name>
</author>
<author>
<name sortKey="Harris, Ln" uniqKey="Harris L">LN Harris</name>
</author>
<author>
<name sortKey="Iglehart, Jd" uniqKey="Iglehart J">JD Iglehart</name>
</author>
<author>
<name sortKey="Miron, A" uniqKey="Miron A">A Miron</name>
</author>
<author>
<name sortKey="Liu, Js" uniqKey="Liu J">JS Liu</name>
</author>
<author>
<name sortKey="Wong, Wh" uniqKey="Wong W">WH Wong</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Bensmail, H" uniqKey="Bensmail H">H Bensmail</name>
</author>
<author>
<name sortKey="Golek, J" uniqKey="Golek J">J Golek</name>
</author>
<author>
<name sortKey="Moody, Mm" uniqKey="Moody M">MM Moody</name>
</author>
<author>
<name sortKey="Semmes, Jo" uniqKey="Semmes J">JO Semmes</name>
</author>
<author>
<name sortKey="Haoudi, A" uniqKey="Haoudi A">A Haoudi</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Baria, A" uniqKey="Baria A">A Barla</name>
</author>
<author>
<name sortKey="Jurman, G" uniqKey="Jurman G">G Jurman</name>
</author>
<author>
<name sortKey="Riccadonna, S" uniqKey="Riccadonna S">S Riccadonna</name>
</author>
<author>
<name sortKey="Merler, S" uniqKey="Merler S">S Merler</name>
</author>
<author>
<name sortKey="Chierici, M" uniqKey="Chierici M">M Chierici</name>
</author>
<author>
<name sortKey="Furianello, C" uniqKey="Furianello C">C Furlanello</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Coen, M" uniqKey="Coen M">M Coen</name>
</author>
<author>
<name sortKey="Holmes, E" uniqKey="Holmes E">E Holmes</name>
</author>
<author>
<name sortKey="Lindon, Jc" uniqKey="Lindon J">JC Lindon</name>
</author>
<author>
<name sortKey="Nicholson, Jk" uniqKey="Nicholson J">JK Nicholson</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Kiefer, P" uniqKey="Kiefer P">P Kiefer</name>
</author>
<author>
<name sortKey="Portais, Jc" uniqKey="Portais J">JC Portais</name>
</author>
<author>
<name sortKey="Vorholt, Ja" uniqKey="Vorholt J">JA Vorholt</name>
</author>
</analytic>
</biblStruct>
<biblStruct></biblStruct>
<biblStruct></biblStruct>
<biblStruct></biblStruct>
<biblStruct></biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Markowitz, Vm" uniqKey="Markowitz V">VM Markowitz</name>
</author>
<author>
<name sortKey="Ivanova, Nn" uniqKey="Ivanova N">NN Ivanova</name>
</author>
<author>
<name sortKey="Szeto, E" uniqKey="Szeto E">E Szeto</name>
</author>
<author>
<name sortKey="Palaniappan, K" uniqKey="Palaniappan K">K Palaniappan</name>
</author>
<author>
<name sortKey="Chu, K" uniqKey="Chu K">K Chu</name>
</author>
<author>
<name sortKey="Dalevi, D" uniqKey="Dalevi D">D Dalevi</name>
</author>
<author>
<name sortKey="Chen, I Ma" uniqKey="Chen I">I-MA Chen</name>
</author>
<author>
<name sortKey="Grechkin, Y" uniqKey="Grechkin Y">Y Grechkin</name>
</author>
<author>
<name sortKey="Dubchak, I" uniqKey="Dubchak I">I Dubchak</name>
</author>
<author>
<name sortKey="Anderson, I" uniqKey="Anderson I">I Anderson</name>
</author>
<author>
<name sortKey="Lykidis, A" uniqKey="Lykidis A">A Lykidis</name>
</author>
<author>
<name sortKey="Mavromatis, K" uniqKey="Mavromatis K">K Mavromatis</name>
</author>
<author>
<name sortKey="Hugenholtz, P" uniqKey="Hugenholtz P">P Hugenholtz</name>
</author>
<author>
<name sortKey="Kyripides, Nc" uniqKey="Kyripides N">NC Kyrpides</name>
</author>
</analytic>
</biblStruct>
<biblStruct></biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Marcy, Y" uniqKey="Marcy Y">Y Marcy</name>
</author>
<author>
<name sortKey="Ouverney, C" uniqKey="Ouverney C">C Ouverney</name>
</author>
<author>
<name sortKey="Bik, Em" uniqKey="Bik E">EM Bik</name>
</author>
<author>
<name sortKey="Losekann, T" uniqKey="Losekann T">T Losekann</name>
</author>
<author>
<name sortKey="Ivanova, N" uniqKey="Ivanova N">N Ivanova</name>
</author>
<author>
<name sortKey="Martin, Hg" uniqKey="Martin H">HG Martin</name>
</author>
<author>
<name sortKey="Szeto, E" uniqKey="Szeto E">E Szeto</name>
</author>
<author>
<name sortKey="Platt, D" uniqKey="Platt D">D Platt</name>
</author>
<author>
<name sortKey="Hugenholtz, P" uniqKey="Hugenholtz P">P Hugenholtz</name>
</author>
<author>
<name sortKey="Relman, Da" uniqKey="Relman D">DA Relman</name>
</author>
<author>
<name sortKey="Quake, Sr" uniqKey="Quake S">SR Quake</name>
</author>
</analytic>
</biblStruct>
<biblStruct></biblStruct>
<biblStruct></biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Seshadri, R" uniqKey="Seshadri R">R Seshadri</name>
</author>
<author>
<name sortKey="Kravitz, Sa" uniqKey="Kravitz S">SA Kravitz</name>
</author>
<author>
<name sortKey="Smarr, L" uniqKey="Smarr L">L Smarr</name>
</author>
<author>
<name sortKey="Gilna, P" uniqKey="Gilna P">P Gilna</name>
</author>
<author>
<name sortKey="Frazier, M" uniqKey="Frazier M">M Frazier</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Meyer, F" uniqKey="Meyer F">F Meyer</name>
</author>
<author>
<name sortKey="Paarmann, D" uniqKey="Paarmann D">D Paarmann</name>
</author>
<author>
<name sortKey="D Souza, M" uniqKey="D Souza M">M D'Souza</name>
</author>
<author>
<name sortKey="Olson, R" uniqKey="Olson R">R Olson</name>
</author>
<author>
<name sortKey="Glass, Em" uniqKey="Glass E">EM Glass</name>
</author>
<author>
<name sortKey="Kubal, M" uniqKey="Kubal M">M Kubal</name>
</author>
<author>
<name sortKey="Paczian, T" uniqKey="Paczian T">T Paczian</name>
</author>
<author>
<name sortKey="Rodriguez, A" uniqKey="Rodriguez A">A Rodriguez</name>
</author>
<author>
<name sortKey="Stevens, R" uniqKey="Stevens R">R Stevens</name>
</author>
<author>
<name sortKey="Wilke, A" uniqKey="Wilke A">A Wilke</name>
</author>
<author>
<name sortKey="Wilkening, J" uniqKey="Wilkening J">J Wilkening</name>
</author>
<author>
<name sortKey="Edwards, Ra" uniqKey="Edwards R">RA Edwards</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Garrity, Gm" uniqKey="Garrity G">GM Garrity</name>
</author>
<author>
<name sortKey="Field, D" uniqKey="Field D">D Field</name>
</author>
<author>
<name sortKey="Kyripides, N" uniqKey="Kyripides N">N Kyrpides</name>
</author>
<author>
<name sortKey="Hirschman, L" uniqKey="Hirschman L">L Hirschman</name>
</author>
<author>
<name sortKey="Sansone, Sa" uniqKey="Sansone S">SA Sansone</name>
</author>
<author>
<name sortKey="Angiuoli, S" uniqKey="Angiuoli S">S Angiuoli</name>
</author>
<author>
<name sortKey="Cole, Jr" uniqKey="Cole J">JR Cole</name>
</author>
<author>
<name sortKey="Glockner, Fo" uniqKey="Glockner F">FO Glockner</name>
</author>
<author>
<name sortKey="Kolker, E" uniqKey="Kolker E">E Kolker</name>
</author>
<author>
<name sortKey="Kowalchuk, G" uniqKey="Kowalchuk G">G Kowalchuk</name>
</author>
<author>
<name sortKey="Moran, Ma" uniqKey="Moran M">MA Moran</name>
</author>
<author>
<name sortKey="Ussery, D" uniqKey="Ussery D">D Ussery</name>
</author>
<author>
<name sortKey="White, O" uniqKey="White O">O White</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Delong, Ef" uniqKey="Delong E">EF DeLong</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Richter, Dc" uniqKey="Richter D">DC Richter</name>
</author>
<author>
<name sortKey="Ott, F" uniqKey="Ott F">F Ott</name>
</author>
<author>
<name sortKey="Auch, Af" uniqKey="Auch A">AF Auch</name>
</author>
<author>
<name sortKey="Schmid, R" uniqKey="Schmid R">R Schmid</name>
</author>
<author>
<name sortKey="Huson, Dh" uniqKey="Huson D">DH Huson</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Turnbaugh, Pj" uniqKey="Turnbaugh P">PJ Turnbaugh</name>
</author>
<author>
<name sortKey="Ley, Re" uniqKey="Ley R">RE Ley</name>
</author>
<author>
<name sortKey="Mahowald, Ma" uniqKey="Mahowald M">MA Mahowald</name>
</author>
<author>
<name sortKey="Magrini, V" uniqKey="Magrini V">V Magrini</name>
</author>
<author>
<name sortKey="Mardis, Er" uniqKey="Mardis E">ER Mardis</name>
</author>
<author>
<name sortKey="Gordon, Ji" uniqKey="Gordon J">JI Gordon</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Ley, Re" uniqKey="Ley R">RE Ley</name>
</author>
<author>
<name sortKey="Backhed, F" uniqKey="Backhed F">F Backhed</name>
</author>
<author>
<name sortKey="Turnbaugh, P" uniqKey="Turnbaugh P">P Turnbaugh</name>
</author>
<author>
<name sortKey="Lozupone, Ca" uniqKey="Lozupone C">CA Lozupone</name>
</author>
<author>
<name sortKey="Knight, Rd" uniqKey="Knight R">RD Knight</name>
</author>
<author>
<name sortKey="Gordon, Ji" uniqKey="Gordon J">JI Gordon</name>
</author>
</analytic>
</biblStruct>
</listBibl>
</div1>
</back>
</TEI>
<pmc article-type="research-article">
<pmc-dir>properties open_access</pmc-dir>
<front>
<journal-meta>
<journal-id journal-id-type="nlm-ta">Curr Genomics</journal-id>
<journal-id journal-id-type="publisher-id">CG</journal-id>
<journal-title-group>
<journal-title>Current Genomics</journal-title>
</journal-title-group>
<issn pub-type="ppub">1389-2029</issn>
<issn pub-type="epub">1875-5488</issn>
<publisher>
<publisher-name>Bentham Science Publishers Ltd.</publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="pmid">20436876</article-id>
<article-id pub-id-type="pmc">2808676</article-id>
<article-id pub-id-type="publisher-id">CG-10-493</article-id>
<article-id pub-id-type="doi">10.2174/138920209789208255</article-id>
<article-categories>
<subj-group subj-group-type="heading">
<subject>Article</subject>
</subj-group>
</article-categories>
<title-group>
<article-title>Signal Processing for Metagenomics: Extracting Information from the Soup</article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author">
<name>
<surname>Rosen</surname>
<given-names>Gail L.</given-names>
</name>
<xref ref-type="corresp" rid="cor1">*</xref>
<xref ref-type="aff" rid="aff1">1</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Sokhansanj</surname>
<given-names>Bahrad A.</given-names>
</name>
<xref ref-type="aff" rid="aff2">2</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Polikar</surname>
<given-names>Robi</given-names>
</name>
<xref ref-type="aff" rid="aff3">3</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Bruns</surname>
<given-names>Mary Ann</given-names>
</name>
<xref ref-type="aff" rid="aff4">4</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Russell</surname>
<given-names>Jacob</given-names>
</name>
<xref ref-type="aff" rid="aff5">5</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Garbarine</surname>
<given-names>Elaine</given-names>
</name>
<xref ref-type="aff" rid="aff1">1</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Essinger</surname>
<given-names>Steve</given-names>
</name>
<xref ref-type="aff" rid="aff1">1</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Yok</surname>
<given-names>Non</given-names>
</name>
<xref ref-type="aff" rid="aff1">1</xref>
</contrib>
</contrib-group>
<aff id="aff1">
<label>1</label>
Electrical and Computer Engineering Department, Drexel University, Philadelphia, PA, USA</aff>
<aff id="aff2">
<label>2</label>
School of Biomedical Engineering, Science, and Health Systems, Drexel University, Philadelphia, PA, USA</aff>
<aff id="aff3">
<label>3</label>
Electrical and Computer Engineering Department, Rowan University, Glassboro, NJ, USA</aff>
<aff id="aff4">
<label>4</label>
Soil Science/Microbial Ecology, Pennsylvania State University, University Park, PA, USA</aff>
<aff id="aff5">
<label>5</label>
Biology Department, Drexel University, Philadelphia, PA, USA</aff>
<author-notes>
<corresp id="cor1">
<label>*</label>
Address correspondence to this author at the Electrical and Computer Engineering Department, Drexel University, Philadelphia, PA 19104, USA; E-mail:
<email xlink:href="gailr@ece.drexel.edu">gailr@ece.drexel.edu</email>
</corresp>
</author-notes>
<pub-date pub-type="ppub">
<month>11</month>
<year>2009</year>
</pub-date>
<volume>10</volume>
<issue>7</issue>
<fpage>493</fpage>
<lpage>510</lpage>
<history>
<date date-type="received">
<day>04</day>
<month>12</month>
<year>2008</year>
</date>
<date date-type="rev-recd">
<day>31</day>
<month>3</month>
<year>2009</year>
</date>
<date date-type="accepted">
<day>25</day>
<month>4</month>
<year>2009</year>
</date>
</history>
<permissions>
<copyright-statement>©2009 Bentham Science Publishers Ltd.</copyright-statement>
<copyright-year>2009</copyright-year>
<license license-type="open-access" xlink:href="http://creativecommons.org/licenses/by/2.5/">
<license-p>This is an open access article distributed under the terms of the Creative Commons Attribution License (
<uri xlink:type="simple" xlink:href="http://creativecommons.org/licenses/by/2.5/">http://creativecommons.org/licenses/by/2.5/</uri>
), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</license-p>
</license>
</permissions>
<abstract>
<p>Traditionally, studies in microbial genomics have focused on single genomes from cultured species, thereby limiting their focus to the small percentage of species that can be cultured outside their natural environment. Fortunately, recent advances in high-throughput sequencing and computational analyses have ushered in the new field of metagenomics, which aims to decode the genomes of microbes from natural communities without the need for cultivation. Although metagenomic studies have yielded a great deal of insight into bacterial diversity and coding capacity, several computational challenges remain due to the massive size and complexity of metagenomic sequence data. This paper reviews current tools and techniques that address challenges in 1) genomic fragment annotation, 2) phylogenetic reconstruction, 3) functional classification of samples, and 4) interpreting complementary metaproteomics and metametabolomics data. Also surveyed are important applications of metagenomic studies, including microbial forensics and the roles of microbial communities in shaping human health and soil ecology.</p>
</abstract>
</article-meta>
</front>
<body>
<sec sec-type="intro">
<label>1.</label>
<title>INTRODUCTION</title>
<p>Currently, the complete genome of an organism is obtained through 1) isolating and culturing the organism to obtain sufficient DNA mass, 2) extracting and amplifying DNA, 3) sequencing the genomes, 4) assembling them, and 5) finally annotating genes and regulatory elements. This process breaks down at the first step for organisms that cannot be cultured. Given that >99% of microbes cannot be cultivated in isolation [
<xref ref-type="bibr" rid="R1">1</xref>
], this traditional approach has vastly constrained our ability to study microbial genomes. New approaches propose to start at step 2 and sequence as much as possible of the DNA present in a sample, but such sequencing is slow with classical methods.</p>
<p>PCR-based techniques that can identify ribosomal RNA show what species are present in a sample. However, isolation and culturing of an individual species has conventionally been required to obtain its genome sequence. One of the most compelling advantages of metagenomics is that it avoids the need to isolate and culture individual organisms. When people think of cultivating microbes, they typically imagine bacteria growing on an agar dish. A number of bacterial species do indeed grow easily in such cultures, such as
<italic>Escherichia coli</italic>
. Not coincidentally, such bacteria are the best studied and were the first to be sequenced. However, the vast majority of bacteria do not grow so readily. Bacteria often require specific growth conditions that are either difficult to achieve in a laboratory or unknown altogether. For example, <italic>Legionella pneumophila</italic>, the bacterium that causes Legionnaires' disease, was not cultured until 6 months after the original outbreak of the disease, despite an intense effort by CDC scientists [
<xref ref-type="bibr" rid="R2">2</xref>
]. A recent study suggested that over 60% of the bacterial species found in the amniotic fluid of women with preterm births were from uncultured or difficult-to-culture species [
<xref ref-type="bibr" rid="R3">3</xref>
]. Culture-independent techniques have found that half or more of the bacteria in the human mouth are uncultured species [
<xref ref-type="bibr" rid="R4">4</xref>
]. Overall, past work has shown that perhaps 85% or more of total bacterial diversity consists of uncultured species [
<xref ref-type="bibr" rid="R5">5</xref>
]. Metagenomics provides the only way to obtain gene sequences for these otherwise hidden organisms.</p>
<p>Fortunately, the recent advent and application of high throughput next generation sequencing methods have enabled a large increase in productivity [
<xref ref-type="bibr" rid="R6">6</xref>
,
<xref ref-type="bibr" rid="R7"> 7</xref>
]. This allows the decoding and assembly of multiple genomes from the many species in a community. This is the field of metagenomics, in which scientists must think on a broad scale [
<xref ref-type="bibr" rid="R8">8</xref>
,
<xref ref-type="bibr" rid="R9"> 9</xref>
], shifting their focus from “How does one organism work?” to “Who all is here and what are they doing?”</p>
<p>This shift is not the only challenge facing biologists in the emerging era of metagenomics. The increased complexity of the data poses challenges in assembling, annotating, and classifying genomic fragments from multiple organisms. Further complications stem from the short length of the sequence fragments typically obtained with next-generation sequencing methods, which makes them harder to assemble, annotate, and classify. Novel computational methods are therefore needed to address these issues and to handle the massive amounts of sequence data that have become available through recent technological advances.</p>
<p>Signal processing and machine learning disciplines are well-equipped to solve problems where background noise, clutter, and jamming signals are commonplace. Hidden Markov models (HMMs), originally popularized for speech processing, have been used for over a decade for gene recognition [
<xref ref-type="bibr" rid="R10">10</xref>
], and many techniques used in speech and text mining can now be applied to biology. Metagenomics allows the classification of millions of organisms and their genes, including the identification of particular community differences and markers. Supervised and unsupervised machine learning methods, linear classifiers, and advanced Bayesian techniques all promise to advance rapid annotation and comparison of samples. In this paper, we survey the potential and utility of new methods in metagenomics, which are already revolutionizing the field of bioinformatics. In doing so, we emphasize how these approaches allow us to identify the taxa from which sequenced fragments originate. Furthermore, we highlight how tools for functional annotation have shed light on the coding capacities of natural bacterial communities, focusing on the potentially harmful or beneficial consequences of these microbes from a human perspective.</p>
</sec>
<sec>
<label>2.</label>
<title>EMERGING BIOLOGICAL STUDIES IN METAGENOMICS</title>
<p>It is important to highlight the biological objectives of metagenomic studies. In this section, some of the more exciting and potentially useful applications are reviewed.</p>
<sec>
<label>2.1.</label>
<title>Human Health</title>
<p>In the human gastrointestinal tract, microbes outnumber human cells by 10 to 1, and approximately 100 trillion live in the gut alone [
<xref ref-type="bibr" rid="R1">1</xref>
]. Microbes symbiotically perform functions that humans have not evolved, including the extraction of calories from otherwise indigestible components of our diet, and the synthesis of essential vitamins and amino acids. It has been hypothesized that an imbalance in microbial health can cause obesity [
<xref ref-type="bibr" rid="R11">11</xref>
], and methods are needed to determine which microbes and/or metabolic processes contribute to a microbial community's behavior.</p>
<p>The National Institutes of Health has extended an initiative, entitled the Human Microbiome Project, to examine microbes associated with the health of several areas of the human body [
<xref ref-type="bibr" rid="R12">12</xref>
]. These include: 1) our gastro-intestinal (GI) tract [
<xref ref-type="bibr" rid="R11">11</xref>
,
<xref ref-type="bibr" rid="R13"> 13</xref>
-
<xref ref-type="bibr" rid="R16">16</xref>
], 2) the oral cavity [
<xref ref-type="bibr" rid="R17">17</xref>
,
<xref ref-type="bibr" rid="R18"> 18</xref>
], 3) the nasal cavity/lung, 4) skin [
<xref ref-type="bibr" rid="R19">19</xref>
], and 5) genital regions [
<xref ref-type="bibr" rid="R20">20</xref>
]. GI illnesses and tooth decay have been loosely linked to a build-up of “bad” bacteria, such as those that cause cavities [
<xref ref-type="bibr" rid="R17">17</xref>
], but the make-up of these bacterial communities needs extensive study. The taxonomic and functional characteristics of these microbes can then be used to decipher the mechanisms behind potentially harmful or beneficial activities of human bacterial associates. The results of metagenomic analyses may contribute, for example, to improving the formula and use of mouthwash [
<xref ref-type="bibr" rid="R21">21</xref>
].</p>
</sec>
<sec>
<label>2.2.</label>
<title>Soil Fertility</title>
<p>Microbial soil communities are highly diverse [
<xref ref-type="bibr" rid="R22">22</xref>
], consisting of many undescribed bacterial lineages [
<xref ref-type="bibr" rid="R23">23</xref>
]. It has been shown that some soils are more capable than others of supporting growth of healthy plants, and that many desirable soil properties are correlated with microbial composition in the soil [
<xref ref-type="bibr" rid="R24">24</xref>
]. Soil microbial communities have been implicated in the suppression of plant pathogens [
<xref ref-type="bibr" rid="R25">25</xref>
], and breakdown of pollutants [
<xref ref-type="bibr" rid="R26">26</xref>
], which favor agricultural productivity. It is hypothesized that degraded soils with low microbiological diversity suffer from an imbalance of nutrients and cannot suppress plant pathogens [
<xref ref-type="bibr" rid="R24">24</xref>
]. This suggests that humans could stimulate soil microbial processes that assist plant growth by replenishing nutrients favoring beneficial microorganisms. Greater knowledge is needed of how agricultural management practices induce shifts in soil microbial community composition and function [
<xref ref-type="bibr" rid="R27">27</xref>
]. Metagenomic studies could lead to understanding how changes in soil microbial communities influence long-term agricultural sustainability.</p>
</sec>
<sec>
<label>2.3.</label>
<title>Forensics</title>
<p>The anthrax scare of 2001 highlighted the need for microbial forensics. The <italic>Bacillus anthracis</italic> spores found in the mailed envelopes were related to the Ames strain, commonly used in research in over 20 laboratories [
<xref ref-type="bibr" rid="R28">28</xref>
,
<xref ref-type="bibr" rid="R29"> 29</xref>
]. Since the Ames strain was created, unique point mutations arose separately in distinct populations grown in separate labs. Because the anthrax-laden envelopes contained billions of spores, many of these envelopes harbored mutations that further distinguished them from existing lab populations. Since scientists did not initially know where these mutations had occurred, elucidating the origins of this anthrax strain required a large amount of genome-wide sequencing and analyses to generate sufficient data for evolutionary reconstruction [
<xref ref-type="bibr" rid="R29">29</xref>
]. Metagenomic techniques were crucial in capturing the diversity of mutations within the envelope samples [
<xref ref-type="bibr" rid="R30">30</xref>
].</p>
<p>Recent applications of metagenomics to studies of ancient DNA [
<xref ref-type="bibr" rid="R31">31</xref>
,
<xref ref-type="bibr" rid="R32"> 32</xref>
] may benefit the field of forensic science. For example, to study the genome of the extinct woolly mammoth, DNA was extracted from well-preserved mammoth remains and sequenced using the Roche/454 method of pyrosequencing [
<xref ref-type="bibr" rid="R33">33</xref>
]. Although a considerable proportion of sequence reads came from the genomes of other organisms, approximately 50% were closely related to the elephant genome, suggesting that the authors had successfully sequenced mammoth DNA from 28,000-year-old remains [
<xref ref-type="bibr" rid="R34">34</xref>
]. A similar approach has also been used to study the genomes of extinct Neanderthals [
<xref ref-type="bibr" rid="R35">35</xref>
], and may be applied to the study of human remains or environmental samples from crime scenes. Such a technique can offer the opportunity to identify victims, to detect DNA from a suspect, or to match the microbial profiles from samples at the crime scene with those observed in association with an identified suspect. These methods may also enable detection of air-borne pathogens within indoor facilities [
<xref ref-type="bibr" rid="R36">36</xref>
] or soil in outdoor environments [
<xref ref-type="bibr" rid="R37">37</xref>
,
<xref ref-type="bibr" rid="R38"> 38</xref>
], an area of special concern in efforts to prevent bioterrorism [
<xref ref-type="bibr" rid="R28">28</xref>
].</p>
</sec>
</sec>
<sec>
<label>3.</label>
<title>METAGENOMIC TECHNOLOGIES</title>
<p>The first step of any metagenomics study is to acquire the data, whether DNA sequences, specific genes, mRNA, or proteins. This first step is fundamental to the process and is the foundation on which further analysis and comparison rest. Any technological limitation in the first step must be compensated for in subsequent analysis.</p>
<sec>
<label>3.1.</label>
<title>DNA Sequencing</title>
<p>Traditionally, DNA has been sequenced using a chain-termination method developed by Fred Sanger
<italic>et al</italic>
. [
<xref ref-type="bibr" rid="R39">39</xref>
]. This method revolutionized genomics by making it possible to read (that is, identify the nucleotide bases of) complete genes. Since then, the method has been refined, and it now produces an average read length of 750 base pairs (bp). However, the process requires several steps and, with current instrumentation, can only process 96 reads at a time, which makes it extremely slow and costly [
<xref ref-type="bibr" rid="R6">6</xref>
,
<xref ref-type="bibr" rid="R40"> 40</xref>
]. Recently, next-generation sequencing technologies have emerged that can process millions of sequence reads in parallel, requiring only one or two instrument runs to complete an experiment. But this massively parallel approach comes at a price: most next-generation technologies produce sequence reads much shorter than 750 bp.</p>
<p>For example, a Roche 454 pyrosequencer can obtain 400,000 reads, each with an average length of 250 bp (a total of 100 megabases per 7-hour run) [
<xref ref-type="bibr" rid="R6">6</xref>
]. Illumina sequencing-by-synthesis, on the other hand, can deliver 36 million reads with an average length of 35 bp (a total of about 1.3 gigabases per 4-day run) [
<xref ref-type="bibr" rid="R6">6</xref>
]. In the end, the throughput per hour is similar (roughly 14 megabases per hour for the 454 instrument and 13 for Illumina), but the pyrosequencing method yields longer reads. Longer reads are likelier to yield uniquely identifiable sequences that are easier to BLAST [
<xref ref-type="bibr" rid="R41">41</xref>
] or to string-match to a database [
<xref ref-type="bibr" rid="R7">7</xref>
]. Because short reads miss some homologs found only in longer reads, doubt has been cast on the feasibility of short-read technologies [
<xref ref-type="bibr" rid="R42">42</xref>
]. It is therefore of current interest to show that computational techniques can overcome the poor resolution of short reads in metagenomics.</p>
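<p>The throughput comparison in this section can be checked with a few lines of arithmetic. The following is a minimal sketch in Python that uses only the read counts, read lengths, and run times quoted above; the dictionary and function names are illustrative assumptions and do not correspond to any vendor software.</p>
<preformat>
```python
# Figures quoted in the text for the two platforms (run time in hours).
PLATFORMS = {
    "Roche 454": {"reads": 400_000, "read_len_bp": 250, "run_hours": 7},
    "Illumina": {"reads": 36_000_000, "read_len_bp": 35, "run_hours": 4 * 24},
}


def throughput(reads, read_len_bp, run_hours):
    """Return (megabases per run, megabases per hour)."""
    total_mb = reads * read_len_bp / 1e6
    return total_mb, total_mb / run_hours


for name, spec in PLATFORMS.items():
    total_mb, mb_per_hour = throughput(**spec)
    print(f"{name}: {total_mb:.0f} Mb per run, {mb_per_hour:.1f} Mb/h")
```
</preformat>
<p>Under these figures the two platforms deliver a comparable number of bases per hour, so the practical difference lies in read length rather than raw output.</p>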
</sec>
<sec>
<label>3.2.</label>
<title>16S rRNA Detection</title>
<p>Instead of sequencing the DNA of an entire sample, which can be costly with traditional sequencing, a common approach is to restrict sequencing to taxonomically informative genome segments, such as those coding for highly conserved ribosomal RNAs. The 16S and 18S rRNA genes, with respective lengths of 1500 bp for prokaryotes [
<xref ref-type="bibr" rid="R23">23</xref>
] and 2800 bp for eukaryotes, encode RNAs destined for small subunits in ribosomes, the essential and universal sites in all cells where messenger RNAs are translated into proteins. Because these genes are so critical for proper cell function, they are highly conserved and reflect genetic variation among all life forms over evolutionary time. Sequence variations in these genes thus signify fundamental differences among phyla/divisions/genera/species. To obtain these sequences from complex mixtures of genomes, classical polymerase chain reaction (PCR) is used with primers complementary to the highly conserved regions of 16S rRNA [
<xref ref-type="bibr" rid="R43">43</xref>
-
<xref ref-type="bibr" rid="R45">45</xref>
]. Searchable databases for phylogenetic placement of new sequences are available, such as GenBank and RDP [
<xref ref-type="bibr" rid="R46">46</xref>
], while other models are based on shorter portions (500-bp or 400-bp) of 16S rRNA genes that are neither highly conserved nor hypervariable and that have been used to distinguish various genera and species [
<xref ref-type="bibr" rid="R47">47</xref>
]. Recently, organism detection has moved to microarrays composed of 16S probes, which do not require long amplification steps [
<xref ref-type="bibr" rid="R48">48</xref>
-
<xref ref-type="bibr" rid="R50">50</xref>
].</p>
</sec>
<sec>
<label>3.3.</label>
<title>Metaproteomic Technologies</title>
<p>In addition to meta
<italic>genomics</italic>
, other “omics” approaches hold great promise for deciphering complex mixtures. One emerging area is that of metaproteomics. Traditionally, scientists have been able to separate proteins from complex mixtures of cellular extracts using 2-D gel electrophoresis [
<xref ref-type="bibr" rid="R51">51</xref>
]. In the 1990s, mass spectrometry enabled rapid and highly sensitive protein identification [
<xref ref-type="bibr" rid="R51">51</xref>
]. In Schulze
<italic>et al</italic>
. [
<xref ref-type="bibr" rid="R52">52</xref>
], a mass spectrometry (MS) method was introduced to analyze the protein complement of water containing organic matter from four different environments. Subsequent studies have used variants of MS approaches [
<xref ref-type="bibr" rid="R53">53</xref>
-
<xref ref-type="bibr" rid="R55">55</xref>
]. Although this article focuses on metagenomics, metaproteomics is discussed briefly in section 6.</p>
</sec>
</sec>
<sec>
<label>4.</label>
<title>GENOME-CENTRIC METAGENOMICS</title>
<p>Microbial community classification and comparison may at first appear a daunting challenge. Yet the problems are not too different from those of traditional signal processing applications. As in many such applications, speech recognition among them, the first step starts with a vast amount of data. If the problem were posed as “Given a set of acoustic waves from speech, decipher the words being said,” the solution would seem distant at first. After decades of research on acoustic theory and speech processing, however, there is a rich theory describing how to segment the data and extract features, followed by clustering and classification. A similar approach can be extended to metagenomics. Fig. (
<bold>
<xref ref-type="fig" rid="F1">1</xref>
</bold>
) illustrates the parallel between speech processing and metagenomics.</p>
<p>Metagenomics in its infancy has focused on two of three fundamental questions: “Who is here?” and “How much of each is here?” [
<xref ref-type="bibr" rid="R1">1</xref>
,
<xref ref-type="bibr" rid="R56"> 56</xref>
-
<xref ref-type="bibr" rid="R58">58</xref>
]. A third, emerging question, “What are they doing?”, is addressed in sections 5 and 6. In early metagenomics projects, such as the Venter Institute's Sargasso Sea project and Sorcerer II Global Ocean Expedition, 2 million and 7.7 million sequence reads were collected, respectively [
<xref ref-type="bibr" rid="R59">59</xref>
].</p>
<p>Even answering the “Who is here?” question is complicated when the sample is a mixture of organisms. Because biologists have traditionally cultured a single organism, this question did not previously arise. In single-genome analysis, DNA reads are all considered to be from the same genome, where each read can be matched to the
<bold>one</bold>
reference genome, and can therefore be thought of as contigs (contiguous fragments) that form a scaffold. In an environmental sample, however, there are multitudes of genomes from a diversity of organisms, and the amount of each organism varies. Also, each DNA read can be from hundreds of
<italic>known</italic>
or millions of
<italic>unknown</italic>
genomes. A given environmental sample will have hundreds of thousands of organisms corresponding to billions, if not trillions, of base pairs, and some organisms may compose only 0.01% of the sample. For example, it is known that pathogenic bacteria are present in our bodies at all times, but they compete with healthy bacteria and are present in such small amounts that their effect on our overall health is negligible. Usually, health problems arise when the ratio of “bad” to “good” bacteria increases. So one major question is: if we gather a sample from the human gut, and a majority of the bacteria are probiotic
<italic>E. coli</italic>
, how can we detect the few that are pathogenic? The nearly 10 million reads from the Venter expeditions only scratch the surface of the diversity in the sea.</p>
<p>In signal processing, we usually think of capturing information in time: if a signal changes quickly (is high-frequency), we need a higher sampling rate to detect it. In metagenomics, the corresponding sampling (or sequencing) question is how well one wants to detect the “infrequent” signals/organisms. If one wanted to detect the five most abundant organisms in a sample, it would probably be acceptable to undersample the environment because of the high redundancy of abundant organisms; compressive sensing techniques would be valuable here. But if the objective is to determine ALL organisms present, infinite sampling would most likely be needed. Biologists have stated that metagenomic samples can only be sampled and never fully characterized [
<xref ref-type="bibr" rid="R1">1</xref>
], and, given prior knowledge of low diversity, it has been hypothesized that some low-complexity environmental samples would need to be oversampled 10× to obtain decent coverage of their diversity [
<xref ref-type="bibr" rid="R1">1</xref>
,
<xref ref-type="bibr" rid="R42"> 42</xref>
]. But generalizing this mathematically across different environments is still an open problem, and metagenomics still needs its own Nyquist theorem.</p>
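<p>The sampling-depth question can be made concrete with a simple back-of-the-envelope model. The sketch below is an illustration only, not a method from the cited literature: it assumes that every read is drawn independently and in proportion to organism abundance, and that a single read is enough to count as a detection. Real samples violate both assumptions, since reads must also be classified correctly. The function names are hypothetical.</p>
<preformat>
```python
import math


def detection_probability(abundance, n_reads):
    """Chance that at least one of n_reads comes from an organism that
    makes up the fraction `abundance` of the sample (independent draws)."""
    return 1.0 - (1.0 - abundance) ** n_reads


def reads_needed(abundance, confidence):
    """Smallest read count whose detection probability reaches `confidence`."""
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - abundance))


rare = 1e-4  # an organism making up 0.01% of the sample, as in the text
print(reads_needed(rare, 0.95))
print(round(detection_probability(rare, 10_000), 3))
```
</preformat>
<p>Even under these optimistic assumptions, the required number of reads grows roughly in inverse proportion to the abundance of the rarest organism of interest, which is one way to see why detecting all organisms in a sample is impractical.</p>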
<p>To make the metagenomics problem more concrete, we can formulate the data types associated with it. For example, it is well known that DNA is composed of a discrete, finite alphabet, {
<italic>A,T,C,G</italic>
} [
<xref ref-type="bibr" rid="R60">60</xref>
], and therefore different discrete, word-like features can be formed. However, continuous-valued features can also be generated from such data, such as the probability/frequency profiles of different
<italic>N</italic>
-mers. The “gene” is another fundamental unit: the presence of a gene can be used as a discrete feature, and its frequency as a continuous one.</p>
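<p>As an illustration of the continuous-valued features described above, the following sketch computes the relative frequency of every possible N-mer in a DNA fragment. It is a minimal example rather than the feature extraction used by any of the surveyed tools; the function name and the choice to skip windows containing ambiguous bases are assumptions made for the example.</p>
<preformat>
```python
from collections import Counter
from itertools import product


def nmer_profile(sequence, n):
    """Relative frequency of every possible N-mer over the alphabet ACGT."""
    sequence = sequence.upper()
    windows = [sequence[i:i + n] for i in range(len(sequence) - n + 1)]
    # Assumption: windows with ambiguous bases (for example N) are skipped.
    counts = Counter(w for w in windows if set(w).issubset("ACGT"))
    total = sum(counts.values())
    all_nmers = ("".join(p) for p in product("ACGT", repeat=n))
    return {m: (counts[m] / total if total else 0.0) for m in all_nmers}


profile = nmer_profile("ATGCGATACGCTTGA", 2)
print(len(profile))  # 16 possible 2-mers
print(profile["CG"])
```
</preformat>
<p>The resulting dictionary has one entry per possible N-mer, so it can be read as a fixed-length feature vector regardless of the length of the fragment.</p>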
<p>The computational objectives associated with the “Who?”, “How much?”, and “What are they doing?” problems can be broken down into different categories. For the “Who?” question, a current problem is taxon recognition: classifying reads into hierarchical classes, from the top-level Kingdom, through the mid-level Order, down to the specific strain. The difficulty with increasing resolution is that biological definitions become quite arbitrary and nonlinear at the genome level. Some biologists are considering more genome-based definitions of taxa. The “How much?” problem is associated with the “depth” of sampling and with obtaining statistical confidence in the read classifications. For example, given a particular classification error rate, can we still say that the number of reads classified reflects the true abundance of a taxon in a sample? The emerging “What are they doing?” question has computational objectives on several levels. Can individual genes be recognized from reads? Such genes indicate the potential function of a sample. Once these genes are recognized, are they associated with pathways [
<xref ref-type="bibr" rid="R61">61</xref>
]? Further questions are which secondary structures are predicted and which genes are actually expressed in a sample, which leads into metaproteomics and transcriptomics.</p>
<p>For the “Which taxa and how much?” problem, there are vast amounts of unlabeled test data, but very little labeled data is available to “train” on. The genome fragment classification problem can therefore be approached with a) supervised
<italic>vs</italic>
. b) unsupervised methods [
<xref ref-type="bibr" rid="R62">62</xref>
].</p>
<p>The computational objective in this problem can be formulated in the following way: given a feature vector <bold>x</bold> = [
<italic>x</italic>
<sub>1</sub>
,
<italic>x</italic>
<sub>2</sub>
,...,
<italic>x
<sub>N</sub>
</italic>
], obtained from the raw sequenced DNA through some feature extraction approach, the learner
<italic>L</italic>
is trained to recognize the presence of one or more genomes in the set
<italic>G = {g
<sub>1</sub>
,g
<sub>2</sub>
,…,g
<sub>M</sub>
}</italic>
. In a supervised problem, the applicable label for each
<bold>x</bold>
is available to
<italic>L</italic>
, whereas in an unsupervised problem
<italic>L</italic>
is simply asked to determine the clusterings within the data. Since the learner is not guided by the labels of existing training data, unsupervised clustering is often a much harder problem. Returning to the speaker identification problem: having prelabeled data from, say, 10 speakers, and asking the classifier to recognize each speaker based on the prelabeled data would be the supervised problem, whereas providing all the data to an algorithm without labels, and telling it to cluster the data into as many distinct categories as it finds, would be the clustering problem.</p>
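<p>The distinction between the two settings can be sketched with toy data. In the example below, which is an illustration and not any of the published methods, the supervised learner is a nearest-centroid classifier that receives labels, while the unsupervised learner is plain k-means that receives only the feature vectors and the number of clusters. The two-dimensional vectors, the labels g1 and g2, and all function names are assumptions made for the example.</p>
<preformat>
```python
import math


def centroid(points):
    """Component-wise mean of a list of equal-length vectors."""
    return [sum(col) / len(points) for col in zip(*points)]


def nearest(x, centers):
    """Index of the center closest to x in Euclidean distance."""
    return min(range(len(centers)), key=lambda k: math.dist(x, centers[k]))


def train_supervised(vectors, labels):
    """Supervised case: labels are given, so build one centroid per label."""
    names = sorted(set(labels))
    centers = [
        centroid([v for v, lab in zip(vectors, labels) if lab == g])
        for g in names
    ]
    return names, centers


def cluster_unsupervised(vectors, m, iterations=10):
    """Unsupervised case: no labels; k-means seeded with the first m vectors."""
    centers = [list(v) for v in vectors[:m]]
    for _ in range(iterations):
        assign = [nearest(v, centers) for v in vectors]
        for k in range(m):
            members = [v for v, a in zip(vectors, assign) if a == k]
            if members:
                centers[k] = centroid(members)
    return [nearest(v, centers) for v in vectors]


# Toy 2-D feature vectors standing in for per-read features.
X = [[0.1, 0.9], [0.2, 0.8], [0.9, 0.1], [0.8, 0.2]]
y = ["g1", "g1", "g2", "g2"]

names, centers = train_supervised(X, y)
print(names[nearest([0.15, 0.85], centers)])  # a label from the training set
print(cluster_unsupervised(X, 2))             # anonymous cluster indices
```
</preformat>
<p>The supervised learner returns a genome label, whereas the clustering routine can only return anonymous group indices that still have to be interpreted.</p>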
<p>The limited availability of training data is also closely tied to the dimensionality of the data. When working with HMMs for recognizing genes, which are only 1000-2000 bp in length, researchers rarely venture past 5-mer feature sizes, but for whole-genome analysis, much greater feature sizes are needed [
<xref ref-type="bibr" rid="R63">63</xref>
,
<xref ref-type="bibr" rid="R64"> 64</xref>
]. This poses huge computational problems for pattern recognition algorithms. For example, if one were to use the
<italic>N</italic>
-mer frequency profiles as features, the length of the feature vector would grow exponentially with
<italic>N</italic>
. While most classifiers can handle feature vectors with hundreds or even thousands of elements, when the feature length reaches hundreds of thousands or tens of millions (4
<sup>9</sup>
= 262,144, 4
<sup>12</sup>
= 16,777,216, etc.), most popular classifiers become infeasible. Classifiers that must solve complex optimization problems, such as multilayer perceptrons (MLPs), other neural networks, and SVMs, are nearly impossible to train at feature sizes such as 4
<sup>9</sup>
, while simpler classifiers such as k-nearest neighbor, and even dimensionality reduction approaches such as PCA, also become infeasible (PCA would mean working with a 4
<sup>12</sup>
by 4
<sup>12</sup>
matrix).</p>
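<p>The growth in feature dimensionality is easy to quantify. The short sketch below tabulates the number of N-mer features for a four-letter alphabet and the memory that a dense square matrix of that dimension would occupy; the assumption of 8-byte floating-point entries and the function names are choices made for the illustration.</p>
<preformat>
```python
def nmer_feature_count(n, alphabet_size=4):
    """Number of distinct N-mers, which is the feature-vector length."""
    return alphabet_size ** n


def square_matrix_bytes(dim, bytes_per_value=8):
    """Memory for a dense dim-by-dim matrix (assumes 8-byte floats)."""
    return dim * dim * bytes_per_value


for n in (5, 9, 12):
    dim = nmer_feature_count(n)
    tib = square_matrix_bytes(dim) / 2 ** 40
    print(f"N={n}: {dim:,} features, dense {dim:,} x {dim:,} matrix = {tib:,.4f} TiB")
```
</preformat>
<p>Under the 8-byte assumption, a dense matrix for 12-mers would require on the order of thousands of tebibytes, which illustrates why covariance-based methods cannot be applied directly at this feature size.</p>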
<p>The problem is further complicated because, unlike a standard classification problem, where
<italic>L</italic>
chooses only one element of
<italic>G</italic>
, more than one element of
<italic>G</italic>
may be chosen in the metagenomics problem. This can happen because multiple DNA reads may belong to different strains, or to closely related
<italic>G</italic>
. Also, in the case of horizontally transferred genes, similar sequences can occur in unrelated
<italic>G</italic>
.</p>
<sec>
<label>4.1.</label>
<title>Supervised Taxonomic Classification</title>
<p>Supervised classification methods have traditionally been more popular, since unsupervised methods rely on intrinsic, possibly false, assumptions about the data. The disadvantage of supervised methods is the lack of sufficient data for training. Only a fraction of species diversity is represented in current databases, and the true extent of diversity has been described as unknowable because it is in constant change [
<xref ref-type="bibr" rid="R65">65</xref>
], making supervised approaches difficult to apply. However, as our knowledge of genomes expands, supervised methods hold promise for learning from the data that will become available.</p>
<p>The methods reviewed in this section are summarized in the following table:</p>
<p>
<table-wrap id="TB1" position="anchor">
<table frame="border" rules="all" width="100%">
<thead>
<tr>
<th rowspan="1" colspan="1">Features</th>
<th rowspan="1" colspan="1">Classifier</th>
<th rowspan="1" colspan="1">Published Method</th>
</tr>
</thead>
<tbody>
<tr>
<td align="center" rowspan="1" colspan="1">Homology-based</td>
<td align="center" rowspan="1" colspan="1">Nearest-Neighbor</td>
<td align="center" rowspan="1" colspan="1">BLAST [
<xref ref-type="bibr" rid="R41">41</xref>
]</td>
</tr>
<tr>
<td rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1">Nearest-Neighbor and Lowest Common Ancestor</td>
<td align="center" rowspan="1" colspan="1">MEGAN [
<xref ref-type="bibr" rid="R66">66</xref>
]</td>
</tr>
<tr>
<td align="center" rowspan="1" colspan="1">Composition-based</td>
<td align="center" rowspan="1" colspan="1">Naïve Bayesian </td>
<td align="center" rowspan="1" colspan="1">Sandberg
<italic>et al</italic>
. [
<xref ref-type="bibr" rid="R67">67</xref>
]</td>
</tr>
<tr>
<td rowspan="1" colspan="1"></td>
<td rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1">RDP classifier (16S sequences only) [
<xref ref-type="bibr" rid="R46">46</xref>
]</td>
</tr>
<tr>
<td rowspan="1" colspan="1"></td>
<td rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1">Rosen
<italic>et al</italic>
. [
<xref ref-type="bibr" rid="R64">64</xref>
]</td>
</tr>
<tr>
<td rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1">Support Vector Machines</td>
<td align="center" rowspan="1" colspan="1">PhyloPythia [
<xref ref-type="bibr" rid="R63">63</xref>
]</td>
</tr>
</tbody>
</table>
</table-wrap>
</p>
<sec>
<label>4.1.1.</label>
<title>Homology-Based Approaches</title>
<p>Many current approaches align sequenced fragments to known genomes using homology [
<xref ref-type="bibr" rid="R16">16</xref>
,
<xref ref-type="bibr" rid="R42"> 42</xref>
,
<xref ref-type="bibr" rid="R66"> 66</xref>
,
<xref ref-type="bibr" rid="R68"> 68</xref>
-
<xref ref-type="bibr" rid="R72">72</xref>
]. As mentioned in Section 3.1, DNA is fragmented during sequencing so that the sequencer can “read” (or call the bases of) a relatively short length of DNA. Usually, the shorter the fragment, the less time it takes to sequence, which is what drives next-generation technology. Short reads are generally not unique and thus yield ambiguous classifications, which has cast doubt on their applicability to metagenomics [
<xref ref-type="bibr" rid="R42">42</xref>
,
<xref ref-type="bibr" rid="R68"> 68</xref>
,
<xref ref-type="bibr" rid="R72"> 72</xref>
]. Therefore, an important aspect of evaluating any sequence classification method is to assess how it performs on these short reads.</p>
<p>When the Venter Institute first shotgun-sequenced fragments from the Sargasso Sea, the natural first step was to BLAST these sequences against the comprehensive GenBank database [
<xref ref-type="bibr" rid="R69">69</xref>
,
<xref ref-type="bibr" rid="R73"> 73</xref>
]. However, the closest BLAST hit is often not the nearest neighbor [
<xref ref-type="bibr" rid="R68">68</xref>
]. Yet most metagenomic analyses rely on BLAST without questioning its results [
<xref ref-type="bibr" rid="R16">16</xref>
,
<xref ref-type="bibr" rid="R66"> 66</xref>
,
<xref ref-type="bibr" rid="R70"> 70</xref>
]. Only recently have researchers begun to analyze and compare the performance of BLAST on metagenomic datasets [
<xref ref-type="bibr" rid="R42">42</xref>
,
<xref ref-type="bibr" rid="R74"> 74</xref>
]. Simply classifying genomic fragments based on the best BLAST hit will yield reliable results only if close relatives are available for comparison. While the recently published MEGAN software relies on BLAST for analysis, it attempts to address this problem by classifying DNA fragments with a lowest common ancestor (LCA) algorithm [
<xref ref-type="bibr" rid="R66">66</xref>
]. LCA allows a fragment to be assigned to a higher branch in the tree rather than to its nearest neighbor. Mavromatis
<italic>et al</italic>
. [
<xref ref-type="bibr" rid="R75">75</xref>
] show that homology-based approaches have lower specificity and hence are not very accurate. However, it has been shown that BLASTing all random sequence reads (RSRs) in a sample gives comparable performance and can be faster and cheaper than extracting 16S sequences alone [
<xref ref-type="bibr" rid="R74">74</xref>
].</p>
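<p>The lowest common ancestor idea can be sketched in a few lines. The example below is a minimal illustration and not MEGAN's actual implementation: it assumes that the significant BLAST hits for a read have already been mapped to root-to-leaf lineages, and the function name and the abbreviated toy lineages are hypothetical.</p>
<preformat>
```python
def lowest_common_ancestor(lineages):
    """Return the deepest taxon shared by every lineage, or None if none is shared.

    Each lineage is a list of taxon names ordered from root to leaf.
    """
    if not lineages:
        return None
    shared = None
    # Walk down all lineages in parallel until they disagree.
    for ranks in zip(*lineages):
        if len(set(ranks)) != 1:
            break
        shared = ranks[0]
    return shared


if __name__ == "__main__":
    # Toy lineages with intermediate ranks omitted for brevity.
    hits = [
        ["Bacteria", "Proteobacteria", "Enterobacteriaceae", "Escherichia"],
        ["Bacteria", "Proteobacteria", "Enterobacteriaceae", "Salmonella"],
    ]
    print(lowest_common_ancestor(hits))
```
</preformat>
<p>A read whose hits all fall within one genus is assigned to that genus, whereas a read with hits in several families is pushed up to the rank at which those lineages converge.</p>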
<p>A particularly relevant analysis demonstrates the drawbacks of using BLAST to identify short reads from next-generation technology. For most metagenomic datasets to date, significant BLAST hits account for only 35% of the sample [
<xref ref-type="bibr" rid="R42">42</xref>
]. Wommack
<italic>et al</italic>
. [
<xref ref-type="bibr" rid="R42">42</xref>
] take long-read metagenomic samples and randomly choose a shorter read within each longer one. The performance of BLAST nucleotide annotation is compared with that of BLAST for protein function classification using Clusters of Orthologous Groups (COGs). Short reads retrieve up to 11% of the sample with correct and significant BLAST hits. They find that short reads tend to miss distantly related sequences and a significant number of the homologs found with long reads. Therefore, improving taxonomic and functional classification of short reads (less than 400bp) remains an open problem.</p>
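<p>The short-read simulation described above, in which a shorter read is drawn at random from within a longer one, might be sketched as follows. The function name, the seeded random generator and the toy sequence are assumptions for illustration; the original study's exact sampling procedure is not reproduced here.</p>
<preformat>
```python
import random


def sample_subread(long_read: str, length: int, rng: random.Random) -> str:
    """Return a uniformly positioned substring of the given length."""
    if length > len(long_read):
        raise ValueError("requested sub-read is longer than the read")
    start = rng.randrange(len(long_read) - length + 1)
    return long_read[start:start + length]


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the sketch is repeatable
    long_read = "ACGTTGCAAGGCTTAACCGGTTAA"
    print(sample_subread(long_read, 8, rng))
```
</preformat>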
</sec>
<sec>
<label>4.1.2.</label>
<title>Composition-Based Approaches</title>
<p>Besides homology, there are many sequence-composition based approaches [
<xref ref-type="bibr" rid="R46">46</xref>
,
<xref ref-type="bibr" rid="R63"> 63</xref>
,
<xref ref-type="bibr" rid="R64"> 64</xref>
,
<xref ref-type="bibr" rid="R67"> 67</xref>
,
<xref ref-type="bibr" rid="R76"> 76</xref>
-
<xref ref-type="bibr" rid="R84">84</xref>
]. Compositional approaches use features of length-
<italic>N</italic>
motifs, or
<italic>N </italic>
mers, and usually build models based on how frequently each motif occurs. Intrinsic compositional structure has been instrumental in gene recognition through Markov models [
<xref ref-type="bibr" rid="R10">10</xref>
] and in tandem repeat detection [
<xref ref-type="bibr" rid="R60">60</xref>
,
<xref ref-type="bibr" rid="R85"> 85</xref>
]. In [
<xref ref-type="bibr" rid="R76">76</xref>
-
<xref ref-type="bibr" rid="R78">78</xref>
,
<xref ref-type="bibr" rid="R80"> 80</xref>
-
<xref ref-type="bibr" rid="R84">84</xref>
], evolutionary and classification methods are based on di-, tri-, and tetra-nucleotide compositions, which soon led researchers to look at longer oligomers for genomic signatures [
<xref ref-type="bibr" rid="R79">79</xref>
]. Wang
<italic>et al</italic>
. [
<xref ref-type="bibr" rid="R46">46</xref>
] use a naive Bayes classifier with 8 mers (
<italic>N</italic>
mers of length 8) for 16S recognition. Researchers have investigated frequencies over a range of oligomer sizes, beginning with the pioneering work and first naive Bayes implementation of Sandberg
<italic>et al</italic>
. [
<xref ref-type="bibr" rid="R67">67</xref>
]. McHardy
<italic>et al</italic>
. [
<xref ref-type="bibr" rid="R63">63</xref>
] found that 5mer and 6mer signatures worked best for support vector machine (SVM) classification, but concluded that accurate classification occurs only for read lengths ≥ 1000bp. Sandberg
<italic>et al</italic>
. were able to obtain over 85% genome-accuracy performance for 400bp fragments using 9mers on a dataset of 28 species. Rosen
<italic>et al</italic>
. [
<xref ref-type="bibr" rid="R64">64</xref>
] took this further, showing that the method can achieve 88% accuracy for 500bp fragments and, more impressively, 76% strain-level accuracy for 25bp fragments.</p>
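<p>A composition-based naive Bayes classifier of the kind discussed above can be sketched as follows. This is a minimal illustration rather than the implementation of any of the cited methods: it assumes a uniform prior over genomes and add-one smoothing of N-mer counts, and the function names and toy “genomes” are hypothetical.</p>
<preformat>
```python
import math
from collections import Counter


def kmers(seq, n):
    """Yield every overlapping N-mer of seq."""
    for i in range(len(seq) - n + 1):
        yield seq[i:i + n]


def train(genomes, n):
    """Map each genome name to (log-probabilities of seen N-mers, log-probability of an unseen N-mer)."""
    models = {}
    for name, seq in genomes.items():
        counts = Counter(kmers(seq, n))
        total = sum(counts.values()) + 4 ** n  # add-one smoothing (assumed)
        log_probs = {k: math.log((c + 1) / total) for k, c in counts.items()}
        models[name] = (log_probs, math.log(1 / total))
    return models


def classify(read, models, n):
    """Return the genome whose N-mer model gives the read the highest log-likelihood."""
    def score(model):
        log_probs, unseen = model
        return sum(log_probs.get(k, unseen) for k in kmers(read, n))
    return max(models, key=lambda name: score(models[name]))


if __name__ == "__main__":
    genomes = {"at_rich": "ATATATATTTAAATATAT", "gc_rich": "GCGCGGCCGCGCCCGG"}
    models = train(genomes, 3)
    print(classify("ATATTTAA", models, 3))
```
</preformat>
<p>Working in log space avoids numerical underflow when the per-motif probabilities of a read are multiplied together.</p>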
<p>Wang
<italic>et al</italic>
. [
<xref ref-type="bibr" rid="R46">46</xref>
] show reasonable classification of 16S rRNA sequences, while Rosen
<italic>et al</italic>
.'s [
<xref ref-type="bibr" rid="R64">64</xref>
] technique can use any fragment and performs reasonably on short sequence reads. Because Manichanh
<italic>et al</italic>
. [
<xref ref-type="bibr" rid="R74">74</xref>
] show that RSR-based classification has advantages over 16S-based classification, Rosen
<italic>et al</italic>
.'s approach is attractive, especially since it achieves 76% accuracy for all 25bp reads at the strain level. Wang
<italic>et al</italic>
. verify that, with 16S rRNA sequences, one can obtain 83.2% accuracy (200bp fragments) and 51.5% (50bp) at the genus level
<italic>via </italic>
a leave-one-out cross-validation (CV) test. For comparison, Rosen
<italic>et al</italic>
.'s Naïve Bayes classifier (NBC) achieves 95% accuracy for 100bp fragments and 90% accuracy for 25bp fragments at the species level.</p>
<p>A direct comparison of NBC with BLAST for 25bp fragments is shown in the table:</p>
<p>
<table-wrap id="TB2" position="anchor">
<table frame="border" rules="all" width="100%">
<thead>
<tr>
<th rowspan="1" colspan="1">Taxonomic-level Accuracy</th>
<th rowspan="1" colspan="1">BLAST </th>
<th rowspan="1" colspan="1">NBC</th>
</tr>
</thead>
<tbody>
<tr>
<td align="center" rowspan="1" colspan="1">Strain (635 genome training data only)</td>
<td align="center" rowspan="1" colspan="1">66%</td>
<td align="center" rowspan="1" colspan="1">76%</td>
</tr>
<tr>
<td align="center" rowspan="1" colspan="1">Species (77 strains, 5-fold CV)</td>
<td align="center" rowspan="1" colspan="1">89.2% ± 1.9%</td>
<td align="center" rowspan="1" colspan="1">90.2% ± 1.2%</td>
</tr>
<tr>
<td align="center" rowspan="1" colspan="1">Genera (216 strains, 5-fold CV)</td>
<td align="center" rowspan="1" colspan="1">86.0% ± 3.5%</td>
<td align="center" rowspan="1" colspan="1">66.3% ± 6.3%</td>
</tr>
</tbody>
</table>
</table-wrap>
</p>
<p>The 635 completely sequenced microbial genomes, as of Feb. 2008, are still an incomplete representation of extant diversity, even though microbial sequencing projects are growing exponentially. Metagenomic data will produce a significant set of sequences that cannot be assigned to any known taxon, and the question arises of how to estimate the number of unknown species. Huson
<italic>et al</italic>
. show that anywhere between 10% and 90% of all reads may fail to produce any hits [
<xref ref-type="bibr" rid="R66">66</xref>
].</p>
</sec>
</sec>
<sec>
<label>4.2.</label>
<title>Unsupervised Taxonomic Classification</title>
<p>Unsupervised techniques are usually based on a clustering method, although information-theoretic and text-mining measures have been used [
<xref ref-type="bibr" rid="R86">86</xref>
,
<xref ref-type="bibr" rid="R87"> 87</xref>
]. Because BLAST can identify only a fraction of the reads in metagenomic data, clustering has been a natural next step [
<xref ref-type="bibr" rid="R88">88</xref>
]. It has been recognized that supervised methods may be insufficient to represent all the extremely diverse microbial genomes. Recently, new methods have emerged to expand the power of unsupervised clustering [
<xref ref-type="bibr" rid="R89">89</xref>
-
<xref ref-type="bibr" rid="R92">92</xref>
]. Chan
<italic>et al</italic>
. [
<xref ref-type="bibr" rid="R89">89</xref>
] use self-organizing maps (SOM) and growing SOM (GSOM), which group items based on an adaptive filter learning model, to cluster 1kb to 10kb sequences. Another promising technique is CompostBin, which clusters reads by applying principal component analysis to their 6mer feature vectors (4096 features) and then iteratively segmenting the data with a semi-supervised algorithm. On low-complexity datasets (2-6 genomes per metagenomic sample), the highest error rate was 10%. This approach must now be validated on complex mixtures. In Nasser
<italic>et al</italic>
. [
<xref ref-type="bibr" rid="R91">91</xref>
], a fuzzy k-means clustering method uses GC content and Markov-chain features of different orders to separate two different organisms and genera; it obtains 99% accuracy but still needs to be tested on more complex mixtures. Another promising technique by Li
<italic>et al</italic>
. uses similarity-based clustering to form groups that are then matched to known ORFs. A consensus sequence is then chosen to represent each family in order to filter out non-protein-coding ORFs [
<xref ref-type="bibr" rid="R92">92</xref>
]. In this study, 33,000 protein clusters were predicted from 17.4 million ORFs, and 20% of the predicted ORFs were previously unknown and might represent novel protein families. While unsupervised clustering techniques remain relatively uncharted territory, these methods hold promise for discovering new organisms and genes in metagenomics datasets.</p>
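<p>As a rough illustration of fuzzy clustering on a compositional feature, the sketch below runs a fuzzy 2-means on the GC content of reads. It is not the method of any cited study: it uses GC content alone (no Markov-chain features), fixes the number of clusters at two, and initializes the centers at the minimum and maximum values; all names and default parameters are assumptions.</p>
<preformat>
```python
def gc_content(seq):
    """Fraction of G and C bases in a sequence."""
    return sum(base in "GC" for base in seq) / len(seq)


def fuzzy_two_means(values, fuzziness=2.0, iterations=50):
    """Fuzzy 2-means on scalar values; returns (centers, memberships)."""
    centers = [min(values), max(values)]
    power = 2.0 / (fuzziness - 1.0)
    memberships = []
    for _ in range(iterations):
        # Membership of each value in each cluster (each row sums to 1).
        memberships = []
        for x in values:
            dists = [abs(x - c) + 1e-12 for c in centers]  # avoid division by zero
            row = [1.0 / sum((d / other) ** power for other in dists)
                   for d in dists]
            memberships.append(row)
        # Each center becomes the membership-weighted mean of the values.
        for j in range(2):
            weights = [row[j] ** fuzziness for row in memberships]
            centers[j] = sum(w * x for w, x in zip(weights, values)) / sum(weights)
    return centers, memberships


if __name__ == "__main__":
    reads = ["ATATATATGC", "ATTTAAGCAT", "GCGCGCATGC", "GGCCGGCCAT"]
    centers, memberships = fuzzy_two_means([gc_content(r) for r in reads])
    print(centers)
```
</preformat>
<p>Unlike hard k-means, each read receives a graded membership in both clusters, which is useful when compositional signatures overlap.</p>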
</sec>
<sec>
<label>4.3.</label>
<title>Methods for Constructing Environmental Community Trees</title>
<p>Each environmental community has a different phylogenetic composition, and there are many different methods for constructing its phylogenetic tree [
<xref ref-type="bibr" rid="R93">93</xref>
]. Generally, each tree-construction method will lead to a different conclusion about the taxonomy of the organisms under study, even though nature has a single ground truth for that taxonomy. Researchers may therefore apply several tree-construction models to a given data set and attempt to arrive at a consensus for the environment under study from the resulting phylogenetic trees [
<xref ref-type="bibr" rid="R94">94</xref>
]. Therefore, when performing a comparative metagenomic analysis, we are motivated to construct a phylogenetic tree for each environment.</p>
<p>Most phylogenetic reconstruction is based on small subunit 16S rRNA sequences. Operational taxonomic units (OTUs) at the species level are distinguished when the sequences differ by more than 3% [
<xref ref-type="bibr" rid="R95">95</xref>
], whereas a genus-level OTU should not have more than 7% sequence variation [
<xref ref-type="bibr" rid="R96">96</xref>
]. Over 200,000 16S rRNA sequences have been collected over the years and are being used to construct a universal tree [
<xref ref-type="bibr" rid="R97">97</xref>
]. Although extracting and comparing 16S rRNA sequences is the standard way to classify a sample's contents, it is not without its problems. If PCR (polymerase chain reaction) is used, not all rRNA genes amplify equally well with the same “universal” primers. Also, multiple, nonidentical copies of the gene exist in various organisms and may lead to overrepresentation of those species.</p>
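<p>The OTU thresholds quoted above can be applied as in the following minimal sketch. It assumes that the two 16S sequences have already been aligned to the same length and contain no gaps; the function names are hypothetical.</p>
<preformat>
```python
def percent_difference(a: str, b: str) -> float:
    """Percentage of mismatched positions between two pre-aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    mismatches = sum(x != y for x, y in zip(a, b))
    return 100.0 * mismatches / len(a)


def shared_otu_level(a: str, b: str) -> str:
    """Lowest OTU level shared by two sequences under the 3% and 7% cutoffs."""
    diff = percent_difference(a, b)
    if 3.0 >= diff:
        return "species"
    if 7.0 >= diff:
        return "genus"
    return "above genus"


if __name__ == "__main__":
    reference = "A" * 100
    variant = "C" * 5 + "A" * 95  # 5% divergent toy sequence
    print(shared_otu_level(reference, variant))
```
</preformat>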
<p>Accurate taxonomic studies at the family and phylum levels are now within grasp using next-generation sequencing technology [
<xref ref-type="bibr" rid="R98">98</xref>
]. While this technology cannot yet read the 500 bp of 16S rRNA sequence generally accepted as necessary for genus and species studies, there is a 400 bp model on the horizon [
<xref ref-type="bibr" rid="R47">47</xref>
]. Also, devices that are capable of sequencing the entire 16S rRNA gene may be available in the near future [
<xref ref-type="bibr" rid="R33">33</xref>
].</p>
<p>Regardless of the sequencing technology used, taxonomists can begin classifying an organism using various statistical analysis tools. Numerous researchers have developed software tools both for aligning sequences and for building phylogenetic (evolutionary) trees, all of which can be used for taxonomic purposes. Many of these have been incorporated into software packages or released as source code and are offered online. Some are proprietary and are available for purchase; however, the vast majority are available for free.</p>
<p>Often, a researcher needs to compare genetic information from two different organisms. Currently, a common technique is to align the two sequences before any phylogeny is inferred. The purpose of aligning two primary sequences of DNA, RNA or protein is to find regions of similarity between them that may indicate a structural or evolutionary relationship [
<xref ref-type="bibr" rid="R99">99</xref>
]. Once a relationship has been determined, an evolutionary tree may be constructed.</p>
<p>The software packages highlighted in this section are:</p>
<p>
<table-wrap id="TB3" position="anchor">
<table frame="border" rules="all" width="100%">
<thead>
<tr>
<th rowspan="1" colspan="1">Purpose</th>
<th rowspan="1" colspan="1">Tool </th>
<th rowspan="1" colspan="1">Algorithm </th>
<th rowspan="1" colspan="1">Access </th>
<th rowspan="1" colspan="1">Cost </th>
<th rowspan="1" colspan="1">Website</th>
</tr>
</thead>
<tbody>
<tr>
<td align="center" rowspan="1" colspan="1">Sequence Alignment </td>
<td align="center" rowspan="1" colspan="1">BLAST [
<xref ref-type="bibr" rid="R41">41</xref>
]</td>
<td align="center" rowspan="1" colspan="1">Local alignment; similar to Smith-Waterman</td>
<td align="center" rowspan="1" colspan="1">Server; Executable</td>
<td align="center" rowspan="1" colspan="1">Free</td>
<td align="center" rowspan="1" colspan="1">
<uri xlink:type="simple" xlink:href="http://blast.ncbi.nlm.nih.gov/Blast.cgi">*http://blast.ncbi.nlm.nih.gov/Blast.cgi</uri>
<break></break>
<uri xlink:type="simple" xlink:href="http://www.ncbi.nlm.nih.gov/blast/download.shtml">*http://www.ncbi.nlm.nih.gov/blast/download.shtml</uri>
</td>
</tr>
<tr>
<td rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1">Clustal [
<xref ref-type="bibr" rid="R100">100</xref>
]</td>
<td align="center" rowspan="1" colspan="1">Global alignment; distance matrix, neighbor-joining</td>
<td align="center" rowspan="1" colspan="1">Server; Executable</td>
<td align="center" rowspan="1" colspan="1">Free</td>
<td align="center" rowspan="1" colspan="1">
<uri xlink:type="simple" xlink:href="http://www.ebi.ac.uk/clustalw/">*http://www.ebi.ac.uk/clustalw/</uri>
<break></break>
<uri xlink:type="simple" xlink:href="ftp://ftp.ebi.ac.uk/pub/software/clustalw2/">*ftp://ftp.ebi.ac.uk/pub/software/clustalw2/</uri>
</td>
</tr>
<tr>
<td align="center" rowspan="1" colspan="1">Phylogeny Inference</td>
<td align="center" rowspan="1" colspan="1">MEGA [
<xref ref-type="bibr" rid="R101">101</xref>
]</td>
<td align="center" rowspan="1" colspan="1">Graphical Clustal; Parsimony, neighbor-joining, UPGMA</td>
<td align="center" rowspan="1" colspan="1">Executable</td>
<td align="center" rowspan="1" colspan="1">Free</td>
<td align="center" rowspan="1" colspan="1">
<uri xlink:type="simple" xlink:href="http://www.megasoftware.net">http://www.megasoftware.net</uri>
</td>
</tr>
<tr>
<td rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1">PAUP* [
<xref ref-type="bibr" rid="R102">102</xref>
]</td>
<td align="center" rowspan="1" colspan="1">Maximum Parsimony</td>
<td align="center" rowspan="1" colspan="1">Executable</td>
<td align="center" rowspan="1" colspan="1">$100</td>
<td align="center" rowspan="1" colspan="1">
<uri xlink:type="simple" xlink:href="http://paup.csit.fsu.edu/downl.html">http://paup.csit.fsu.edu/downl.html</uri>
</td>
</tr>
<tr>
<td rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1">MrBayes [
<xref ref-type="bibr" rid="R103">103</xref>
]</td>
<td align="center" rowspan="1" colspan="1">Bayesian inference</td>
<td align="center" rowspan="1" colspan="1">Executable</td>
<td align="center" rowspan="1" colspan="1">Free</td>
<td align="center" rowspan="1" colspan="1">
<uri xlink:type="simple" xlink:href="http://mrbayes.csit.fsu.edu">http://mrbayes.csit.fsu.edu</uri>
</td>
</tr>
<tr>
<td rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1">Phylip [
<xref ref-type="bibr" rid="R104">104</xref>
]</td>
<td align="center" rowspan="1" colspan="1">Parsimony, distance matrix, bootstrapping, maximum likelihood</td>
<td align="center" rowspan="1" colspan="1">Executable</td>
<td align="center" rowspan="1" colspan="1">Free</td>
<td align="center" rowspan="1" colspan="1">
<uri xlink:type="simple" xlink:href="http://evolution.genetics.washington.edu/phylip.html">http://evolution.genetics.washington.edu/phylip.html</uri>
</td>
</tr>
<tr>
<td rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1">UniFrac [
<xref ref-type="bibr" rid="R105">105</xref>
] </td>
<td align="center" rowspan="1" colspan="1">UniFrac distance metric; P-test </td>
<td align="center" rowspan="1" colspan="1">Server </td>
<td align="center" rowspan="1" colspan="1">Free </td>
<td align="center" rowspan="1" colspan="1">
<uri xlink:type="simple" xlink:href="http://bmf.colorado.edu/unifrac">http://bmf.colorado.edu/unifrac</uri>
</td>
</tr>
</tbody>
</table>
</table-wrap>
</p>
<sec>
<label>4.3.1.</label>
<title>Sequence Alignment</title>
<p>In addition to pairwise alignment methods such as Smith-Waterman and BLAST [
<xref ref-type="bibr" rid="R41">41</xref>
], multiple alignment methods can be used to compare several sequences at a time and to construct phylogenetic trees. The tradeoff is between speed and accuracy: global alignment generally takes longer than local alignment but is more accurate. Unlike BLAST, which uses local alignment, Clustal [
<xref ref-type="bibr" rid="R100">100</xref>
] performs sequence alignment globally, which may be more accurate. However, Clustal should not be used when the input sequences do not share common ancestry. Such comparisons are better suited to BLAST, since BLAST compares sequences against known databases. The Clustal algorithm first aligns the query sequences that are most closely related to one another in order to build a representative profile of the family of sequences [
<xref ref-type="bibr" rid="R106">106</xref>
]. The basic alignment algorithm, which uses dynamic programming, consists of three main stages: a) all pairs of sequences are aligned separately in order to calculate a distance matrix giving the divergence of each pair of sequences; b) a guide tree is calculated from the distance matrix, typically using the neighbor-joining method; and c) the sequences are progressively aligned according to the branching order in the guide tree.</p>
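<p>Stage a) of the procedure above can be sketched with a simple dynamic-programming alignment. The example below is a simplification and not Clustal's scoring scheme: it uses unit-cost edit distance as a stand-in for pairwise alignment divergence, and it only reports the least divergent pair rather than building a full neighbor-joining guide tree. All function names are hypothetical.</p>
<preformat>
```python
def edit_distance(a, b):
    """Unit-cost global alignment (Levenshtein) distance via dynamic programming."""
    previous = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        current = [i]
        for j, y in enumerate(b, start=1):
            current.append(min(previous[j] + 1,             # x aligned to a gap
                               current[j - 1] + 1,          # y aligned to a gap
                               previous[j - 1] + (x != y)))  # match or mismatch
        previous = current
    return previous[-1]


def distance_matrix(seqs):
    """Pairwise divergence: edit distance normalized by the longer sequence."""
    n = len(seqs)
    return [[edit_distance(seqs[i], seqs[j]) / max(len(seqs[i]), len(seqs[j]))
             for j in range(n)] for i in range(n)]


def closest_pair(matrix):
    """Indices of the least divergent pair of distinct sequences."""
    pairs = [(matrix[i][j], i, j)
             for i in range(len(matrix)) for j in range(i + 1, len(matrix))]
    _, i, j = min(pairs)
    return i, j


if __name__ == "__main__":
    seqs = ["ACGTACGT", "ACGTACGA", "TTTTGGGG"]
    print(closest_pair(distance_matrix(seqs)))
```
</preformat>
<p>In a progressive aligner, the least divergent sequences are aligned first, and more distant sequences are added in the order given by the guide tree.</p>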
</sec>
<sec>
<label>4.3.2.</label>
<title>Inferring Phylogenies</title>
<p>Generally, a phylogenetic tree is created for taxonomic purposes. Each organism is represented by a node on this evolutionary tree, and descendants can be traced back to a common ancestor. To build a tree, a researcher first needs a file of aligned sequences, such as the output of an alignment method. This file is then passed to one of the software packages developed for inferring phylogenies, which generates the evolutionary tree. The most frequently cited phylogeny packages include PAUP* [
<xref ref-type="bibr" rid="R102">102</xref>
], MrBayes [
<xref ref-type="bibr" rid="R103">103</xref>
], Phylip [
<xref ref-type="bibr" rid="R104">104</xref>
], and MEGA [
<xref ref-type="bibr" rid="R101">101</xref>
]. A new tool that builds and compares trees from metagenomics datasets is UniFrac [
<xref ref-type="bibr" rid="R105">105</xref>
].</p>
<p>Parsimony is the classical, non-parametric statistical method for building trees. Both PAUP* and Phylip implement it. Parsimony searches for minimum-length trees, i.e. trees that require the least evolutionary change to explain the set of aligned sequences. Several methods are used as alternatives to parsimony, such as the clustering methods neighbor-joining and UPGMA, and Bayesian inference, all of which are in widespread use [
<xref ref-type="bibr" rid="R107">107</xref>
]. MrBayes's use of Bayesian inference allows the user to analyze heterogeneous data sets consisting of morphological data, nucleotides and proteins in a single analysis. Phylip also offers maximum likelihood methods and bootstrapping to assign confidence levels to the tree. It is difficult to compare these algorithms because taxonomy is constantly changing and each algorithm is typically applied to a different dataset.</p>
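<p>The notion of a minimum-length tree can be made concrete with Fitch's small-parsimony algorithm, which counts the minimum number of character changes that a fixed tree topology requires. The sketch below only scores given topologies and does not search tree space as the cited packages do; the nested-tuple tree encoding and the function names are assumptions for illustration.</p>
<preformat>
```python
def fitch(tree, leaf_states):
    """Return (possible ancestral states, minimum changes) for one alignment column.

    tree is either a leaf name (str) or a (left, right) tuple of subtrees.
    """
    if isinstance(tree, str):
        return {leaf_states[tree]}, 0
    left_set, left_changes = fitch(tree[0], leaf_states)
    right_set, right_changes = fitch(tree[1], leaf_states)
    common = left_set.intersection(right_set)
    if common:
        return common, left_changes + right_changes
    # No shared state: at least one extra change is needed at this node.
    return left_set.union(right_set), left_changes + right_changes + 1


def parsimony_score(tree, alignment):
    """Sum of Fitch changes over all columns of an alignment (name to sequence)."""
    length = len(next(iter(alignment.values())))
    return sum(fitch(tree, {name: seq[col] for name, seq in alignment.items()})[1]
               for col in range(length))


if __name__ == "__main__":
    alignment = {"A": "AAG", "B": "AAG", "C": "ACT", "D": "ACT"}
    print(parsimony_score((("A", "B"), ("C", "D")), alignment))
    print(parsimony_score((("A", "C"), ("B", "D")), alignment))
```
</preformat>
<p>Under parsimony, the topology with the lower score is preferred, since it explains the aligned sequences with fewer evolutionary changes.</p>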
<p>Maximum likelihood (ML) methods are also well established for phylogenetic tree reconstruction [
<xref ref-type="bibr" rid="R108">108</xref>
-
<xref ref-type="bibr" rid="R110">110</xref>
]. The objective is to maximize the likelihood of the mutation rates between different sequences while simultaneously estimating the tree topology [
<xref ref-type="bibr" rid="R111">111</xref>
]. The evolution between the sequences may be modeled by a discrete-state continuous-time Markov process on a phylogenetic tree. The substitution matrix determines the Markov process. This matrix may be estimated using the expectation maximization algorithm described in [
<xref ref-type="bibr" rid="R110">110</xref>
]. Alternatively, a predefined substitution model such as Jukes-Cantor may be chosen [
<xref ref-type="bibr" rid="R112">112</xref>
]. The ML method is advantageous in that it provides robustness against incorrect parameter selection in the underlying substitution model [
<xref ref-type="bibr" rid="R111">111</xref>
]. However, model selection is a critical component in a ML phylogenetic analysis and should be carefully considered as the resulting phylogenetic tree could change depending on the model [
<xref ref-type="bibr" rid="R111">111</xref>
,
<xref ref-type="bibr" rid="R113"> 113</xref>
]. For large data sets it is computationally expensive to search for the ML phylogenetic tree. Therefore, additional methods such as neighbor-joining are employed to expedite the analysis [
<xref ref-type="bibr" rid="R110">110</xref>
,
<xref ref-type="bibr" rid="R114"> 114</xref>
].</p>
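<p>As a small worked example of a substitution model, the Jukes-Cantor model converts the observed proportion <italic>p</italic> of differing sites between two aligned sequences into an estimated number of substitutions per site, d = -(3/4) ln(1 - 4p/3). The sketch below assumes gap-free sequences of equal length; the function name is hypothetical.</p>
<preformat>
```python
import math


def jukes_cantor_distance(a: str, b: str) -> float:
    """Estimated substitutions per site under the Jukes-Cantor model."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    p = sum(x != y for x, y in zip(a, b)) / len(a)
    if p >= 0.75:
        return math.inf  # the correction is undefined at saturation
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


if __name__ == "__main__":
    a = "ACGTACGTAC"
    b = "ACGTACGTAT"  # one difference in ten sites
    print(jukes_cantor_distance(a, b))
```
</preformat>
<p>The corrected distance is always at least as large as the observed proportion of differences, because it accounts for multiple substitutions at the same site. Distances of this kind are a typical input to neighbor-joining.</p>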
<p>There are tools available that enable researchers to compare multiple environmental community trees in a phylogenetic context. UniFrac was developed to analyze significant differences between these multiple environments [
<xref ref-type="bibr" rid="R105">105</xref>
]. To accomplish this, it implements the UniFrac significance test and the ubiquitous statistical P-test [
<xref ref-type="bibr" rid="R115">115</xref>
]. Once a researcher has found that there may be a significant difference between two or more environments, they can perform a lineage-specific analysis, which is also integrated in UniFrac. Using the G-test, a method similar to the chi-squared test for goodness of fit, the tool determines whether particular lineages within a global phylogenetic tree (consisting of all the environments in the comparative analysis) are enriched in sequences from a particular environment [
<xref ref-type="bibr" rid="R116">116</xref>
]. Thus, environments may be clustered according to the lineages they contain. With UniFrac, it has been shown that humans living in different geographic locations have distinct gut microbiomes.</p>
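<p>The unweighted UniFrac distance is the fraction of branch length in a shared tree that leads to sequences from only one of the two environments being compared. The following minimal sketch assumes the tree has already been flattened into a list of branches, each annotated with the set of environments found among its descendants; this representation and the function name are assumptions and do not reflect the UniFrac tool's interface.</p>
<preformat>
```python
def unifrac(branches, env_a, env_b):
    """Unweighted UniFrac distance between two environments.

    branches is a list of (length, environments) pairs, where environments is
    the set of environments with at least one sequence below that branch.
    """
    unique = 0.0
    total = 0.0
    for length, environments in branches:
        has_a = env_a in environments
        has_b = env_b in environments
        if has_a or has_b:
            total += length
            if has_a != has_b:
                unique += length  # branch leads to only one environment
    return unique / total if total else 0.0


if __name__ == "__main__":
    branches = [(1.0, {"soil"}), (2.0, {"gut"}), (1.0, {"soil", "gut"})]
    print(unifrac(branches, "soil", "gut"))
```
</preformat>
<p>A value of 0 means that the two environments share every branch, and a value of 1 means that they share none.</p>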
</sec>
</sec>
<sec>
<label>4.4.</label>
<title>Microarrays for Organism Detection</title>
<p>Microarrays, DNA chips composed of spots (wells that contain probes), are printed with DNA probes that hybridize with complementary DNA sequences [
<xref ref-type="bibr" rid="R117">117</xref>
]. The probes are short and are designed to uniquely identify target DNA/RNA sequences. A common use is the detection of mRNA to measure gene expression. Recently, however, this technology has been extended to organism detection in a given environment, e.g. air, soil or water [
<xref ref-type="bibr" rid="R118">118</xref>
-
<xref ref-type="bibr" rid="R121">121</xref>
]. The traditional caveat of microarrays is cross-hybridization, but it is hypothesized that grouping and compressed sensing methods can minimize the effect of this biochemical phenomenon and even leverage information from it [
<xref ref-type="bibr" rid="R118">118</xref>
]. Currently, a large number of probes (and therefore spots) is needed to detect a vast number of organisms. Therefore, the goal of group-testing and compressed sensing microarrays (CSM) is to reduce the number of spots needed, and hence the cost of these devices.</p>
<p>Group testing design was extended by Schliep
<italic>et al</italic>
. [
<xref ref-type="bibr" rid="R122">122</xref>
] and applied so that each target is covered by a certain number of probes, allowing several targets to be identified simultaneously while using a reasonably small total number of probes. In group testing, a potential group is specified by a probe that hybridizes to a set of target sequences. That is, a potential target group exists only if there is a probe that binds to all of the sequences in the group, and exclusively to those. Probe selection for group testing is achieved by an algorithm known as SEPARATE, developed by Schliep
<italic>et al</italic>
., which avoids cross-hybridization between targets. This method has its disadvantages. For instance, Schliep
<italic>et al</italic>
. reported that, for 19 of the 679 sequences chosen, they were unable to find any suitable oligos, demonstrating that the algorithm may fail to find suitable probes. Therefore, microarray target detection can still be improved.</p>
<p>In recent years, compressed sensing from signal processing has promised to overcome the lack of satisfactory probes in group testing by using fewer probes for organism identification. The essential idea of compressive sensing (or sampling) is that an inherently sparse signal can be recovered from far fewer measurements than the Shannon-Nyquist sampling theorem would typically require. Current CSM (compressed sensing microarray) designs focus on: 1) sensing organisms through unique DNA pattern identifiers, rather than single DNA sequences per organism [
<xref ref-type="bibr" rid="R118">118</xref>
], 2) leveraging cross-hybridization properties of DNA sequences as useful side information for genetic identification [
<xref ref-type="bibr" rid="R118">118</xref>
,
<xref ref-type="bibr" rid="R120"> 120</xref>
], and 3) using multiple probes per spot so that the number of spots is significantly fewer than the number of organisms [
<xref ref-type="bibr" rid="R121">121</xref>
].</p>
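<p>A toy example shows why sparsity lets the number of spots be smaller than the number of organisms. The sketch below is not a real CSM decoder: it assumes noise-free, additive spot intensities and recovers the sample by exhaustively searching for the smallest set of organisms whose combined signature matches the readings. The signatures, names and the restriction to at most one organism present are illustrative assumptions.</p>
<preformat>
```python
from itertools import combinations


def measure(signatures, present):
    """Spot readings: each spot sums the affinities of the organisms present."""
    spots = len(next(iter(signatures.values())))
    return [sum(signatures[name][s] for name in present) for s in range(spots)]


def recover(signatures, readings, max_present):
    """Smallest set of organisms whose combined signature matches the readings."""
    names = sorted(signatures)
    for size in range(max_present + 1):
        for subset in combinations(names, size):
            if measure(signatures, subset) == readings:
                return set(subset)
    return None  # no sufficiently sparse explanation found


if __name__ == "__main__":
    # Seven hypothetical organisms distinguished by only three spots.
    signatures = {
        "org1": [0, 0, 1], "org2": [0, 1, 0], "org3": [0, 1, 1],
        "org4": [1, 0, 0], "org5": [1, 0, 1], "org6": [1, 1, 0],
        "org7": [1, 1, 1],
    }
    readings = measure(signatures, ["org5"])
    print(recover(signatures, readings, max_present=1))
```
</preformat>
<p>With these binary signatures the recovery is unambiguous only when at most one organism is present; handling denser samples requires more spots or better-designed affinity patterns, which is the design problem that the cited CSM work addresses.</p>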
<p>The compressive sensing DNA microarray is a type of group testing. In CSMs, however, organisms are grouped according to their DNA sequence similarity. Such groupings are obtained from the Clusters of Orthologous Groups (COGs) database, which organizes prokaryotes and unicellular eukaryotes into groups according to the similarity of their protein sequences [
<xref ref-type="bibr" rid="R118">118</xref>
]. Sheikh
<italic>et al</italic>
. [
<xref ref-type="bibr" rid="R118">118</xref>
] extracted probe candidates from the shortest genes in a group of organisms, which restricts the search space and does not necessarily yield the optimal probe candidates. Yok
<italic>et al</italic>
. [
<xref ref-type="bibr" rid="R120">120</xref>
] have introduced an alternative compressive sensing probe-picking algorithm, which considers all possible hybridization affinities and chooses the best group identifier probe among all probe candidates from all the members of a group [
<xref ref-type="bibr" rid="R120">120</xref>
].</p>
</sec>
</sec>
<sec>
<label>5.</label>
<title>GENE-CENTRIC METAGENOMICS: FUNCTIONAL CLASSIFICATION OF SAMPLES</title>
<p>Beyond asking “who” and “how many,” the next question is “What are they (the microbial communities) doing?” By using high-resolution community-wide genomic information, we can describe the composition, function, and emergent properties of integrated microbial communities more accurately. Such analyses might distinguish the characteristics associated with environmentally-robust bacterial communities from those that allow pathogens in certain habitats.</p>
<p>In fact, several recent gene-centric studies have focused on comparative metagenomics to investigate whether distinct commonalities and/or differences can be observed in microbial communities that can be attributed to their habitat or physical environment. The consensus of these studies is that there is a strong correlation between communities and the habitat in which they live, whether the environment is soil, marine or the human gut. The seminal work of Tringe
<italic>et al</italic>
. (2005) [
<xref ref-type="bibr" rid="R23">23</xref>
], for example, compared samples from agricultural soil, deep-sea whale-fall carcasses, the Sargasso Sea and acid mine drainage environments. Using a clustering-based approach, they showed that profiles of the microbial communities from each environment clustered with those of others from the same type of environment, and concluded that “functional profile of a community is influenced by its environment.” Similar comparative analyses have also shown the existence of “functional anchors in complex microbial communities” of the human gut [
<xref ref-type="bibr" rid="R123">123</xref>
], or that while some rare members of the soil bacterial community were closely related to abundant taxonomic groups, a significant portion of the “rare biosphere showed evolutionarily distinct lineages at various taxonomic cutoffs” [
<xref ref-type="bibr" rid="R124">124</xref>
]. Fierer
<italic>et al</italic>
. [
<xref ref-type="bibr" rid="R22">22</xref>
,
<xref ref-type="bibr" rid="R125"> 125</xref>
] compared the diversity, richness and evenness of four major microbial taxa (bacteria, archaea, fungi, and viruses) in prairie, desert, and rainforest soils, concluding that all communities display local as well as global diversity. The same group also showed that bacterial diversity was unrelated to physical features (such as temperature) that typically predict plant and animal diversity; however, the diversity and richness of soil bacterial communities do differ by ecosystem type. Allison
<italic>et al</italic>
. investigated whether microbial community composition is resistant, resilient, or functionally redundant in response to different environmental disturbances (and concluded that it is not) [
<xref ref-type="bibr" rid="R126">126</xref>
]. On the other hand, Kurokawa
<italic>et al. </italic>
showed that gut microbiota from unweaned infants were simple with a higher variation in taxonomic and gene composition, while those from adults and weaned children were more complex with a higher functional uniformity regardless of age or sex [
<xref ref-type="bibr" rid="R14">14</xref>
]. DeLong
<italic>et al</italic>
. compared microbial communities from the ocean's surface to near-seafloor depths and found “vertical zonation of taxonomic groups,” suggesting “depth-variable community trends in carbon and energy metabolism,” among other interactions [
<xref ref-type="bibr" rid="R127">127</xref>
].</p>
<p>While the aforementioned studies established that there is a relationship between the functions of communities and their habitats, a separate line of work has tried to determine exactly what those functions are. An important first step in discerning function is to find the regions of DNA that encode proteins. Early gene finding methods focused on finding Open Reading Frames (ORFs) in DNA sequence. An ORF is generally defined as a sequence of DNA that begins with a start codon and ends with one of the stop codons. Many methods have been developed for locating ORFs within a DNA sequence, including simply locating start and stop codons, as in the NCBI ORF finder tool [
<xref ref-type="bibr" rid="R128">128</xref>
]. This simple method, however, only gives us ORFs but does not indicate which regions actually encode proteins. Methods such as GENIE [
<xref ref-type="bibr" rid="R129">129</xref>
], GENSCAN [
<xref ref-type="bibr" rid="R130">130</xref>
], GeneMark [
<xref ref-type="bibr" rid="R10">10</xref>
], and GLIMMER [
<xref ref-type="bibr" rid="R131">131</xref>
] not only look for regions with start and stop codons but also predict whether the region in question is likely to actually encode a protein. GENIE uses a generalized HMM to give a gene model of a DNA sequence [
<xref ref-type="bibr" rid="R129">129</xref>
].</p>
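<p>As a minimal sketch of the simple start-to-stop scan described above, the following Python function reports candidate ORFs in the three forward reading frames. The names find_orfs and min_codons are illustrative assumptions, only ATG is treated as a start codon, and the reverse strand is ignored; this is not the implementation of the NCBI ORF finder or of any tool cited here.</p>
<preformat>
```python
# Illustrative assumptions: ATG as the only start codon, the three
# standard stop codons, forward strand only.
START_CODONS = {"ATG"}
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq, min_codons=2):
    """Return (start, end, frame) tuples for forward-strand ORFs.

    start is the 0-based index of the start codon, end is exclusive and
    includes the stop codon, frame is 0, 1 or 2.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None  # index of the first start codon seen since the last stop
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon in START_CODONS:
                start = i
            elif start is not None and codon in STOP_CODONS:
                end = i + 3
                if (end - start) // 3 >= min_codons:
                    orfs.append((start, end, frame))
                start = None
    return orfs
```
</preformat>
<p>For example, find_orfs("CCATGAAATAGGG") returns [(2, 11, 2)], i.e. the ORF ATGAAATAG in the third forward frame. As noted above, such a scan only locates candidate ORFs and says nothing about whether they actually encode proteins.</p>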
<p>GeneMark [
<xref ref-type="bibr" rid="R10">10</xref>
] and GLIMMER [
<xref ref-type="bibr" rid="R131">131</xref>
] can be used to predict protein coding regions in prokaryotic organisms. GeneMark scores coding regions by creating an HMM with 9 hidden states. GLIMMER, on the other hand, improves on GeneMark by using interpolated Markov models (IMMs) with varying orders (instead of the fixed 5th order HMM used by GeneMark) [
<xref ref-type="bibr" rid="R131">131</xref>
]. Specifically, GLIMMER uses models ranging from 1st through 8th order and combines three periodic-nonhomogeneous Markov models in the IMM to predict protein coding regions. In metagenomic samples, however, most bacteria and their genes have not been previously sequenced, so little training data is available for these training-reliant methods. Thus, a set of new methods must be developed in order to perform gene finding on previously uncultured environmental samples.</p>
<sec>
<label>5.1.</label>
<title>Towards Functional Metagenomics</title>
<sec>
<label>5.1.1.</label>
<title>MetaGene [
<xref ref-type="bibr" rid="R132">132</xref>
] </title>
<p>MetaGene is a utility that seeks to make use of existing packages on the web to analyze predicted gene features. MetaGene uses a large set of prokaryotic genes in GenBank [
<xref ref-type="bibr" rid="R133">133</xref>
] to create a training set, and runs in two stages. First, all ORFs are extracted from the data and are scored according to their base compositions and lengths. Partial ORFs are only extracted if they encompass the entire sequence being analyzed, or if they appear at the very end of a sequence. The second stage uses these scores, as well as the distances of neighboring ORFs, to find an optimal combination of ORFs. MetaGene computes its scores using log-odds ratios on such features as di-codon frequency, ORF length distributions, distance distributions from an annotated start codon to the nearest start codon, and frequencies of orientations and orientation-dependent distances of neighboring ORFs [
<xref ref-type="bibr" rid="R132">132</xref>
]. MetaGene was first tested on whole bacterial genomes and compared to GeneMark, which, unlike MetaGene, uses GC% to estimate codon frequencies and distance distributions; MetaGene performed comparably on the bacterial and archaeal genomes analyzed in the test. However, although MetaGene performs well on long shotgun sequences, no performance analysis is shown for shorter reads, and there has been no significant investigation of hypothetical gene regions identified by GeneMark. Therefore, the feasibility of this approach for finding novel genes is currently unknown.</p>
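<p>The log-odds scoring idea can be illustrated with di-codon frequencies alone. The sketch below is a toy under stated assumptions: the function names, the frequency tables passed in, and the floor value used for unseen di-codons are all hypothetical, and MetaGene itself combines several further features (ORF length, start-codon distances, neighbor orientation) that are not modeled here.</p>
<preformat>
```python
import math


def dicodons(orf):
    """Split an in-frame sequence into pairs of adjacent codons."""
    usable = len(orf) - len(orf) % 3  # drop any trailing partial codon
    codons = [orf[i:i + 3] for i in range(0, usable, 3)]
    return [a + b for a, b in zip(codons, codons[1:])]


def log_odds_score(orf, coding_freq, background_freq, floor=1e-6):
    """Sum log(P(dicodon | coding) / P(dicodon | background)) over the ORF.

    coding_freq and background_freq are dicts mapping 6-letter di-codons
    to probabilities; floor is an assumed pseudo-probability for
    di-codons missing from a table.
    """
    score = 0.0
    for d in dicodons(orf.upper()):
        p_coding = coding_freq.get(d, floor)
        p_background = background_freq.get(d, floor)
        score += math.log(p_coding / p_background)
    return score
```
</preformat>
<p>A positive score means the di-codon composition of the ORF is more typical of the coding table than of the background table; in practice both tables would have to be estimated from annotated training genomes.</p>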
</sec>
<sec>
<label>5.1.2.</label>
<title>Harrington et al. [
<xref ref-type="bibr" rid="R134">134</xref>
]</title>
<p>While MetaGene shows promising results when known genes are used as a training set, it only evaluates regions based on simple criteria and it has no ability to predict function. Harrington
<italic>et al</italic>
. propose an approach that analyzes ORFs to infer function from the proteins these regions code for [
<xref ref-type="bibr" rid="R134">134</xref>
]. Harrington
<italic>et al</italic>
.'s method was evaluated on GenBank as well as other functional databases such as KEGG [
<xref ref-type="bibr" rid="R135">135</xref>
], COG [
<xref ref-type="bibr" rid="R136">136</xref>
], UniRef [
<xref ref-type="bibr" rid="R137">137</xref>
], SMART [
<xref ref-type="bibr" rid="R138">138</xref>
], and Pfam [
<xref ref-type="bibr" rid="R139">139</xref>
]. Specifically, Harrington
<italic>et al</italic>
. use these databases to find gene regions inside environmental samples with high similarity to, or in the same domain or gene neighborhood as, existing protein sequences. The approach allows categorizing the ORFs as being in the domain of known proteins even though many of the bacteria in these environmental samples have never been cultured. This means that ORF regions with little or no similarity to known sequences may be inferred as being in the same family or domain as a group of known proteins. By using a combination of functional and sequence similarity along with genomic neighborhood, Harrington
<italic>et al</italic>
. were able to infer function for 76% of the ORFs found in four different environmental samples. Prior to this study, function was only predicted for 27%-48% of the ORFs in three different whale fall carcasses [
<xref ref-type="bibr" rid="R134">134</xref>
]. It should be noted, however, that this method has only been demonstrated to work on longer sequence reads.</p>
</sec>
<sec>
<label>5.1.3.</label>
<title>Yooseph's Incremental Clustering [
<xref ref-type="bibr" rid="R140">140</xref>
]</title>
<p>Clustering approaches can also find gene regions and identify their functions. One such method uses known protein families and sequences as inputs to identify protein coding regions, and cluster the data based on their function [
<xref ref-type="bibr" rid="R140">140</xref>
]. This method was compared to MetaGene, and it was found that a large portion of the identified regions overlapped. Of those regions that did not overlap, only 4% of the MetaGene predictions had matches to Pfam models, as opposed to 21% with the clustering method. Yooseph's method was also shown to have high specificity, though its sensitivity in detecting a gene depends on the representation of existing protein clusters in the organisms' taxonomic neighbors.</p>
</sec>
<sec>
<label>5.1.4.</label>
<title>Hoff et al. [
<xref ref-type="bibr" rid="R141">141</xref>
]</title>
<p>Many of the aforementioned methods have difficulties dealing with the shorter fragment lengths produced by pyrosequencing. To address this issue, Hoff
<italic>et al</italic>
. developed a two-stage machine learning approach to gene prediction and analyzed its performance for fragments ranging in size from 100 bp to 2000 bp. First, linear discriminants are used to extract features from identified ORFs. Incomplete ORFs are permitted, as many ORFs could be fragmented due to pyrosequencing. The features extracted are monocodon and dicodon usage, translation initiation sites, ORF sequence length, and GC content. In stage 2, these features are used to build a multilayer perceptron (MLP) neural network for binary ORF classification (coding or non-coding). The trained MLP then determines the final coding candidates. The authors note that their results are similar to those of MetaGene, and conclude that their method's high prediction specificity complements MetaGene's high sensitivity. Therefore, they recommend a combination of the two methods for gene finding in metagenomic samples [
<xref ref-type="bibr" rid="R141">141</xref>
].</p>
<p>The method's benefit is that it directly addresses relatively short fragments. It does not, however, attempt to infer the function of any of the predicted genes or to group those genes based on their potential to have the same function. This could potentially be addressed by combining this approach with Harrington's [
<xref ref-type="bibr" rid="R134">134</xref>
].</p>
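<p>Some of the per-ORF features listed above (ORF length, GC content, and monocodon usage) are simple to compute. The following sketch uses hypothetical function names and omits the dicodon and translation initiation site features as well as the linear discriminant and neural network stages; it is an illustration only, not the authors' code.</p>
<preformat>
```python
from collections import Counter


def gc_content(seq):
    """Fraction of G and C bases in seq (0.0 for an empty sequence)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def codon_usage(orf):
    """Relative frequency of each codon in an in-frame sequence."""
    orf = orf.upper()
    usable = len(orf) - len(orf) % 3  # ignore a trailing partial codon
    codons = [orf[i:i + 3] for i in range(0, usable, 3)]
    counts = Counter(codons)
    return {codon: n / len(codons) for codon, n in counts.items()}


def orf_features(orf):
    """Bundle a few simple features of a (possibly incomplete) ORF."""
    return {
        "length": len(orf),
        "gc": gc_content(orf),
        "codon_usage": codon_usage(orf),
    }
```
</preformat>
<p>A feature dictionary of this kind would still need to be converted into a fixed-length numeric vector before it could be passed to a classifier such as an MLP.</p>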
</sec>
<sec>
<label>5.1.5.</label>
<title>Dinsdale et al. [
<xref ref-type="bibr" rid="R142">142</xref>
]</title>
<p>Dinsdale
<italic>et al. </italic>
looked at the possibility that different environments may have different metabolic profiles [
<xref ref-type="bibr" rid="R142">142</xref>
], which was tested using canonical discriminant analysis (CDA). Also known as multiple discriminant analysis or discriminant factor analysis, CDA seeks to classify cases into three or more categories using dummy categorical variables as predictors. The authors wished to find metabolic functions (the variables in CDA) that would distinguish different organisms. Samples were sequenced using pyrosequencing and were compared to functional genes in the SEED platform (
<uri xlink:type="simple" xlink:href="http://www.theseed.org">http://www.theseed.org</uri>
) using BLASTX with an E-value below 0.0001. In order to perform the CDA, the sequences were grouped according to their SEED classification. CDA builds a model for each membership in each group and calculates a discriminant value for each metagenomic fragment (sample). CDA is advantageous because it can identify which variables best separate the groups, analyze those variables only, and discard the rest. The CDA was performed on 15 million sequences from 45 microbiomes and 42 viromes. Most of the variance between the different environments (79.8% of the combined microbiome and 69.9% of the virome) was explained in this analysis, showing that metagenomes are highly predictive of metabolic potential within an ecosystem. In contrast, a recent analysis of 16S rRNA genes from multiple environments only explained about 10% of the variance [
<xref ref-type="bibr" rid="R143">143</xref>
], which suggests that taxonomy alone is not sufficient and that metabolic function is also needed to distinguish different ecosystems.</p>
</sec>
<sec>
<label>5.1.6.</label>
<title>Krause et al. [
<xref ref-type="bibr" rid="R144">144</xref>
]</title>
<p>In order to overcome the short-read limitation of next-generation sequencing, Krause
<italic>et al</italic>
. follow a four-stage approach. First, a BLAST search divides the sequence into six reading frames. BLAST searches are conducted at the amino acid level, where each hit is associated with a specific reading frame in the contig. BLAST hits are filtered to retain those indicating the presence of a coding sequence. In stage two, combined scores are calculated which indicate the coding potential of each nucleotide in a contig. The sequence of each reading frame is compared with all the database matches generated by the preceding BLAST search. The number of synonymous substitutions for each match is used as a positive score, with non-synonymous substitutions counting as negative scores. The scores for each position and reading frame are stored in a matrix, giving a position-specific score that the contig is coding (or non-coding) in one of the six reading frames. In stage three, this matrix is used within a dynamic programming based optimization algorithm to find an optimal path. Finally, in stage four, postprocessing combines predictions from previous steps and identifies frame shifts. This algorithm is computationally expensive due to the dynamic programming, but it achieves good success and is able to quickly process the large number of sequences generated by 454 pyrosequencing.</p>
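<p>The six reading frames referred to above (three on each strand) can be enumerated with a short sketch such as the following. It assumes the standard genetic code, and the function names are hypothetical; it only illustrates the six-frame translation step and does not reproduce the scoring or dynamic programming stages of the method of Krause and colleagues.</p>
<preformat>
```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order; "*" marks a stop.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def reverse_complement(seq):
    """Reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(seq):
    """Translate codon by codon; unknown codons become "X"."""
    seq = seq.upper()
    usable = len(seq) - len(seq) % 3  # drop a trailing partial codon
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, usable, 3)
    )


def six_frame_translations(seq):
    """Map frame labels "+1".."+3" and "-1".."-3" to protein strings."""
    rc = reverse_complement(seq)
    frames = {}
    for offset in range(3):
        frames["+" + str(offset + 1)] = translate(seq[offset:])
        frames["-" + str(offset + 1)] = translate(rc[offset:])
    return frames
```
</preformat>
<p>For the toy input "ATGGCCTAA", frame +1 translates to "MA*" and frame -1 to "LGH". Each translated frame could then be searched against a protein database, which is the role BLAST plays in the first stage described above.</p>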
</sec>
</sec>
</sec>
<sec>
<label>6.</label>
<title>BIOMOLECULAR DYNAMICS IN MICROBIAL COMMUNITIES</title>
<p>The main thrust of our review is the analysis of DNA sequence data. However, characterizing the organisms and genes present in a metagenomic sample only tells us the “parts list” of the organisms within the microbial community. Under different environmental conditions and stresses -- such as the presence of toxins or changing nutrient levels -- different parts will be expressed as needed for the organisms within the community to adapt and grow. Furthermore, while sequences that are identified as hypothetical genes based on homology analysis may be found within a metagenome sequence, they may contain mutations or be otherwise non-functional within the microbes that are present in the community. Thus, after sequencing the DNA of a microbial community, we need to understand how the community behaves by identifying what genes are expressed and produce proteins that perform cellular functions. To do so, biological researchers are taking advantage of “post-genome” technologies [
<xref ref-type="bibr" rid="R117">117</xref>
] that were initially developed to analyze the molecular behavior at the level of mRNA molecules transcribed from genes, proteins that are translated from mRNA, and other molecules that are significant for cellular functions. While our review emphasizes signal processing methods applied to metagenome data, we will briefly discuss new applications of technologies to elucidate the dynamics of biomolecular networks that respond to environmental changes: specifically, changing the expression of genes, the level of proteins that are produced, and the levels of metabolites (small molecules) that change with the activity of metabolic pathways within microbial cells.</p>
<sec>
<label>6.1.</label>
<title>Metatranscriptomics</title>
<p>Functional genomics is the high-throughput generation of data for the expression of genes in cells. Gene expression is the transcription of DNA to produce mRNA, which goes on to form the template for protein generation. There has been substantial work done on developing platforms to measure mRNA levels expressed from the whole genome in cells of single organisms. These techniques can be applied to multiple organisms in a community as reviewed in [
<xref ref-type="bibr" rid="R145">145</xref>
], but with an increase in the necessary complexity. One approach is to extend microarrays, which typically have oligonucleotide probes that can identify the presence of mRNA expressed from each gene of a genome. This can be done by developing a microarray that has probes for genes from multiple genomes, such as was done in [
<xref ref-type="bibr" rid="R146">146</xref>
] for the study of 4 microbial species cultured together. However, this strategy requires knowing
<italic>a priori</italic>
what organisms will be present in a sample or else selecting only a few organisms within a community to study. As an alternative, a microarray can be developed to analyze genes within a set of functional pathways, such as those involved in contaminant degradation [
<xref ref-type="bibr" rid="R147">147</xref>
]. In this strategy, microarrays are designed with probes that recognize regions of these genes that are highly conserved between species [
<xref ref-type="bibr" rid="R148">148</xref>
]. Consequently, the expression of genes with these functions can be detected from many different organisms (including unknown organisms). This kind of microarray was recently used to compare gene expression in samples from different ecological niches of Antarctic soil [
<xref ref-type="bibr" rid="R149">149</xref>
].</p>
<p>In general, the microarray platform is limited by the cost of adding more probes, as well as the potential for cross-hybridization noise when trying to differentiate between the expression of genes with highly similar sequences. Another strategy is to apply the high-throughput DNA sequencing technologies employed for metagenomics studies, such as pyrosequencing. The mRNA expressed by a microbial community can be isolated and chemically copied to form a complementary DNA strand, which can then be sequenced. This approach has been recently used to analyze gene expression in oceanic samples [
<xref ref-type="bibr" rid="R150">150</xref>
,
<xref ref-type="bibr" rid="R151"> 151</xref>
]. Notably, at least 99.9% of the RNA was found to be mRNA expressed from genes, as opposed to ribosomal RNA. Furthermore, both studies found many more genes in the mRNA complement than in a simultaneous sequencing of the DNA isolated from the sample, including approximately 50% of previously unknown genes found by [
<xref ref-type="bibr" rid="R151">151</xref>
].</p>
<p>Like metagenomic DNA sequences, functional metagenomic mRNA data sets represent a large-scale analysis problem. Previous studies have demonstrated the efficacy of signal processing methods for the analysis of gene expression data for single organisms, as reviewed in [
<xref ref-type="bibr" rid="R152">152</xref>
,
<xref ref-type="bibr" rid="R153"> 153</xref>
]. These methods include singular value decomposition for identifying groups of genes that are expressed under different stimuli [
<xref ref-type="bibr" rid="R154">154</xref>
], unsupervised clustering methods [
<xref ref-type="bibr" rid="R155">155</xref>
], and other pattern recognition methods reviewed in [
<xref ref-type="bibr" rid="R156">156</xref>
]. The analysis and interpretation of gene expression data is still an area of ongoing research. It is reasonable to expect that metagenomic samples will pose new challenges, since many more genes are present in data sets, e.g., 330 million base pairs and potentially 10
<sup>5</sup>
genes found by [
<xref ref-type="bibr" rid="R150">150</xref>
].</p>
</sec>
<sec>
<label>6.2.</label>
<title>Metaproteomics</title>
<p>While the mRNA expression of genes drives changes in protein levels under different environmental conditions and stimuli, protein expression dynamics are further regulated by different rates of degradation, post-translational modifications, etc. that cannot be measured with functional metagenomics. The high-throughput measurement of protein expression within a microbial community is called
<italic>metaproteomics</italic>
, and has been reviewed in [
<xref ref-type="bibr" rid="R51">51</xref>
,
<xref ref-type="bibr" rid="R157"> 157</xref>
]. One of the initial studies, which used mass spectrometry (MS)-based proteomics along with metagenomic DNA sequencing, studied a low complexity biofilm from underground mine sites [
<xref ref-type="bibr" rid="R158">158</xref>
]. Further examples of MS-based metaproteomics include the analysis of samples from chlorobenzene-contaminated sites [
<xref ref-type="bibr" rid="R55">55</xref>
], studying uncontaminated soil samples cultured in the presence of cadmium to measure the temporal response of a community to a controlled stimulus [
<xref ref-type="bibr" rid="R54">54</xref>
], and the analysis of a bioreactor used to optimize sludges for phosphorus removal [
<xref ref-type="bibr" rid="R159">159</xref>
]. Besides studying biomolecular dynamics, metaproteomics can also be used to complement the identification of genes and genomes within a community, through directly sequencing peptides (protein fragments) found in samples in an initial MS analysis. This was integrated with DNA sequencing to characterize previously unknown proteins in [
<xref ref-type="bibr" rid="R55">55</xref>
], as well as to distinguish between the expression of proteins from related organisms that differed by as little as a single amino acid in [
<xref ref-type="bibr" rid="R160">160</xref>
] -- a difference so small that sequence analysis would be unable to distinguish the genes that code for them.</p>
<p>As with functional genomics, signal processing methods are critical for the analysis of metaproteomic data. Unlike gene expression data, proteomics data does not cleanly identify the levels of individual proteins. Rather, the mass spectrum of protein fragments is obtained, and peaks are correlated with a database to identify individual proteins. Clustering and other statistical signal processing approaches to this problem are reviewed in [
<xref ref-type="bibr" rid="R161">161</xref>
,
<xref ref-type="bibr" rid="R162"> 162</xref>
]. A specific analysis of statistical classification, including various methods based on univariate statistics and principal component analysis, has been reported on representative data sets [
<xref ref-type="bibr" rid="R163">163</xref>
]. Other work has described the use of support vector machines for protein identification and classification [
<xref ref-type="bibr" rid="R164">164</xref>
], as well as the use of FFT for data noise reduction followed by Bayesian clustering on reconstructed data sets to identify proteomic differences between samples [
<xref ref-type="bibr" rid="R165">165</xref>
]. Machine learning methods for proteomics are reviewed in [
<xref ref-type="bibr" rid="R166">166</xref>
], including the application of peak clustering and wavelet-based methods for mass spectrum pre-processing, and the use of classifier methods for identifying proteins that change under different conditions.</p>
</sec>
<sec>
<label>6.3.</label>
<title>Meta-Metabolomics</title>
<p>The principal activity of a microbial cell is to metabolize nutrients and generate the energy required to survive and grow. The enzymatic reactions for metabolism are structured in metabolic pathways and networks within a cell. Metabolism in a microbial community is interactive -- the products of metabolism from one species may enhance or inhibit metabolic pathways in other species. In a community hosted by a multicellular organism, such as the microbial community in the human gut, metabolic pathways within bacterial cells may also interact with pathways within host cells. Changes in the activity of metabolic pathways are reflected by changes in the levels of small molecules that are the substrates and intermediates of enzymatic pathways. The levels of many metabolites can be measured simultaneously through nuclear magnetic resonance (NMR) spectroscopy, reviewed in [
<xref ref-type="bibr" rid="R167">167</xref>
] or by liquid chromatography separation followed by mass spectrometry to identify metabolites by their masses and charge levels, reviewed in [
<xref ref-type="bibr" rid="R168">168</xref>
]. Notably, these
<italic>metabolomic</italic>
(also known as
<italic>metabonomic</italic>
in some literature) technologies are inherently “meta-metabolomic” -- measurements of metabolites in a sample from mammalian blood or urine, for example, will reflect the contributions of both the host metabolic pathways as well as those of microbial communities colonizing it.</p>
</sec>
</sec>
<sec>
<label>7.</label>
<title>METAGENOMICS DATABASES, TOOLS, AND BENCHMARKING</title>
<p>One of the first extensive metagenomics datasets was published in 2004 by the Craig Venter Institute; it comprises approximately 2 million reads, averaging 818 bp per read, sampled at 7 different sites in the Sargasso Sea [
<xref ref-type="bibr" rid="R69">69</xref>
,
<xref ref-type="bibr" rid="R169"> 169</xref>
]. Analysis of the Sargasso Sea data countered the traditional view that the salty Sargasso Sea is nutrient-poor and showed that reads aligned to a diversity of life.</p>
<p>Subsequently, many projects have been sequenced and are publicly available (see Fig.
<bold>
<xref ref-type="fig" rid="F2">2</xref>
</bold>
for a history). After the Human Gut Microbiome dataset [
<xref ref-type="bibr" rid="R170">170</xref>
] was released in 2006, the NIH (National Institutes of Health) made the human microbiome a part of its roadmap initiatives in 2007 [
<xref ref-type="bibr" rid="R12">12</xref>
,
<xref ref-type="bibr" rid="R171"> 171</xref>
]. By 2007, the Department of Energy's Joint Genome Institute (DOE/JGI) had sequenced about 50% of the metagenomics projects, including various soil microbiomes, human, mouse, and termite gut samples, and also airborne samples [
<xref ref-type="bibr" rid="R172">172</xref>
,
<xref ref-type="bibr" rid="R173"> 173</xref>
]. San Diego State University's SCUMS (SDSU Center for Universal Microbial Sequencing) contains samples from coral reefs, Soudan mine, human lungs, etc. [
<xref ref-type="bibr" rid="R174">174</xref>
]. In 2007, microbes were isolated from the human mouth that come from a previously unknown phylum, TM7 [
<xref ref-type="bibr" rid="R175">175</xref>
]. Because of horizontal gene transfer and possible contamination, some of the genes aligned to the Leptotrichia species. Thus, while it was intended as a single cell genome sequencing project, the result is considered a metagenomic dataset [
<xref ref-type="bibr" rid="R176">176</xref>
].</p>
<p>Some of the online databases provide their own tools for analysis. Two such online services are CAMERA (Community Cyberinfrastructure for Advanced Marine Microbial Ecology Research and Analysis) [
<xref ref-type="bibr" rid="R177">177</xref>
,
<xref ref-type="bibr" rid="R178"> 178</xref>
] and the MG-RAST (Meta Genome Rapid Annotation using Subsystem Technology) [
<xref ref-type="bibr" rid="R179">179</xref>
] server. Many of CAMERA's tools are visualizations of the BLAST hits of the reads. The tools included in MG-RAST are annotation, phylogeny, metabolic reconstruction and visual comparison tools.</p>
<p>With the vast amount of data becoming available and published, researchers are calling for a standardization process to register new projects, tools, and other publications [
<xref ref-type="bibr" rid="R180">180</xref>
]. There is also contamination present in some of the metagenomics datasets such as in the Sargasso Sea dataset [
<xref ref-type="bibr" rid="R181">181</xref>
]. Also, metagenomic datasets contain many unknown phyla, genera, and species. If a standardized metagenomics dataset is designed to simulate training and test data, computational tools can use such a dataset to benchmark and compare their performance for known and unknown organisms. The first such attempt at simulating metagenomic data has been released and is called MetaSim [
<xref ref-type="bibr" rid="R182">182</xref>
].</p>
</sec>
<sec>
<label>8.</label>
<title>FUTURE APPLICATIONS</title>
<p>As metagenomic approaches become more feasible and cost-effective, we stand to gain a large amount of sequence data from previously uncultured and uncharacterized microbes. The expected influx of these data will undoubtedly shed a great deal of light on bacterial phylogeny, enabling us to study the evolution of many novel lineages that live in complex communities within previously understudied environments. Two applications of interest, health diagnosis and food security, are presented in this section.</p>
<sec>
<label>8.1.</label>
<title>Correlation of Metagenome to Function for Obesity</title>
<p>As metagenomics and metaproteomics advance, the pivotal process in the field will be to merge the two and infer collective function from the interactions of multitudes of microbial species. One important example applies to human health in a recent study by Turnbaugh and colleagues [
<xref ref-type="bibr" rid="R183">183</xref>
]. Using a combination of 454 and Sanger sequencing, the authors sequenced the metagenomes of lean and obese mouse littermates. After performing a functional annotation of the sequenced fragments, genes were classified into distinct functional categories. The relative abundances of sequences from these categories were then compared between lean and obese siblings to identify differences in the genomic signatures of their distal gut communities. Strikingly, their analyses illustrated that gut microbes from obese mice were enriched for genes encoding enzymes that metabolize “indigestible” polysaccharides. Combined with experimental evidence from caloric measurements of mouse feces, this indicated that the gut bacteria of obese mice are better able to extract energy from their hosts’ diets, providing a plausible means by which bacteria could promote obesity. Accordingly, Turnbaugh and colleagues demonstrated that the addition of “obese” microbial communities to germ-free mice did indeed lead to an increase in body fat.</p>
<p>Several observations reveal that these findings have direct implications for obesity in human populations. First, analyses of 16S rRNA sequences reveal that bacteria from the phylum Firmicutes are more abundant in the guts of both obese mice and humans compared to the guts of their lean conspecific counterparts [
<xref ref-type="bibr" rid="R11">11</xref>
,
<xref ref-type="bibr" rid="R184"> 184</xref>
]. Second, and conversely, bacteria from the phylum Bacteroidetes were less abundant in the guts of obese mice and humans compared to the guts of lean individuals [
<xref ref-type="bibr" rid="R11">11</xref>
,
<xref ref-type="bibr" rid="R184"> 184</xref>
]. Third, and most importantly, human weight loss was correlated with a concomitant decrease in Firmicute bacteria and a corresponding increase in the proportion of “healthy” Bacteroidetes [
<xref ref-type="bibr" rid="R11">11</xref>
]. Combined, these findings implicate bacteria as playing a direct role in human obesity, identifying novel targets in the fight against this growing epidemic.</p>
</sec>
<sec>
<label>8.2.</label>
<title>Food Security</title>
<p>An example of a future linkage between metagenomics and function is soil microbial community assessment for agricultural decision making and food security. The presence in soils of specific plant pathogens, pests, growth inhibitors, and nutrient imbalances can interfere to unknown degrees with the production of desired crops. The absence in soils of specific plant symbionts or root associates, on the other hand, can also limit crop productivity. Soil metagenomics offers the means to diagnose functional capabilities of microbial communities for optimizing agricultural production on arable lands, the supply of which is becoming more limited in the face of a rapidly growing global population. Unbeknownst to us today, soils may not be providing optimal yields due to the lack of microbial assemblages needed for improved plant growth or disease resistance, despite provision of adequate fertilizers and appropriate cultivation practices. Moreover, current agricultural practices, such as fertilization with animal manures or municipal biosolids, may foster the establishment of soil microbial communities that pose food safety threats by serving as reservoirs for emerging pathogens or by facilitating exchange of antibiotic resistance genes among microorganisms [
<xref ref-type="bibr" rid="R27">27</xref>
]. Thus, insights from linking metagenomics and function can help improve the safety and sustainability of our food supply.</p>
<p>Greater understanding of microbial communities and the factors that drive their composition will be key to engineering better human health, food security, and environmental quality. Although still at an early stage, these findings highlight the utility of metagenomics in studies of human disease, soil productivity, and ecosystem services, and they reveal a new-found ability to elucidate and compare the genomic signatures of natural bacterial communities.</p>
</sec>
</sec>
</body>
<back>
<ref-list>
<title>REFERENCES</title>
<ref id="R1">
<label>1</label>
<element-citation publication-type="book">
<person-group person-group-type="author">
<name>
<surname>Handelsman</surname>
<given-names>J</given-names>
</name>
</person-group>
<source>Committee on metagenomics: challenges and functional applications</source>
<year>2007</year>
<publisher-name>The National Academies Press</publisher-name>
</element-citation>
</ref>
<ref id="R2">
<label>2</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Swanson</surname>
<given-names>M. S</given-names>
</name>
<name>
<surname>Hammer</surname>
<given-names>B. K</given-names>
</name>
</person-group>
<article-title>Legionella pneumophila pathogenesis: a fateful journey from amoebae to macrophages</article-title>
<source>Annu. Rev. Microbiol</source>
<year>2000</year>
<volume>54</volume>
<fpage>567</fpage>
<lpage>613</lpage>
<pub-id pub-id-type="pmid">11018138</pub-id>
</element-citation>
</ref>
<ref id="R3">
<label>3</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Han</surname>
<given-names>Y. W</given-names>
</name>
<name>
<surname>Shen</surname>
<given-names>T</given-names>
</name>
<name>
<surname>Chung</surname>
<given-names>P</given-names>
</name>
<name>
<surname>Buhimschi</surname>
<given-names>I. A</given-names>
</name>
<name>
<surname>Buhimschi</surname>
<given-names>C. S</given-names>
</name>
</person-group>
<article-title>Uncultivated bacteria as etiologic agents of intra-amniotic inflammation leading to preterm birth</article-title>
<source>J. Clin. Microbiol</source>
<year>2009</year>
<volume>47</volume>
<fpage>38</fpage>
<lpage>47</lpage>
<pub-id pub-id-type="pmid">18971361</pub-id>
</element-citation>
</ref>
<ref id="R4">
<label>4</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Aas</surname>
<given-names>J. A</given-names>
</name>
<name>
<surname>Paster</surname>
<given-names>B. J</given-names>
</name>
<name>
<surname>Stokes</surname>
<given-names>L. N</given-names>
</name>
<name>
<surname>Olsen</surname>
<given-names>I</given-names>
</name>
<name>
<surname>Dewhirst</surname>
<given-names>F. E</given-names>
</name>
</person-group>
<article-title>Defining the normal bacterial flora of the oral cavity</article-title>
<source>J. Clin. Microbiol</source>
<year>2005</year>
<volume>43</volume>
<fpage>5721</fpage>
<lpage>5732</lpage>
<pub-id pub-id-type="pmid">16272510</pub-id>
</element-citation>
</ref>
<ref id="R5">
<label>5</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Amann</surname>
<given-names>R. I</given-names>
</name>
<name>
<surname>Ludwig</surname>
<given-names>W</given-names>
</name>
<name>
<surname>Schleifer</surname>
<given-names>K. H</given-names>
</name>
</person-group>
<article-title>Phylogenetic identification and in situ detection of individual microbial cells without cultivation</article-title>
<source>Microbiol. Rev</source>
<year>1995</year>
<volume>59</volume>
<fpage>143</fpage>
<lpage>169</lpage>
<pub-id pub-id-type="pmid">7535888</pub-id>
</element-citation>
</ref>
<ref id="R6">
<label>6</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Mardis</surname>
<given-names>E. R</given-names>
</name>
</person-group>
<article-title>The impact of next-generation sequencing technology on genetics</article-title>
<source>Trends Genet</source>
<year>2008</year>
<volume>24</volume>
<issue>3</issue>
<fpage>133</fpage>
<lpage>141</lpage>
</element-citation>
</ref>
<ref id="R7">
<label>7</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Pop</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Salzberg</surname>
<given-names>S. L</given-names>
</name>
</person-group>
<article-title>Bioinformatics challenges of new sequencing technology</article-title>
<source>Trends Genet</source>
<year>2008</year>
<volume>24</volume>
<issue>3</issue>
<fpage>142</fpage>
<lpage>149</lpage>
<pub-id pub-id-type="pmid">18262676</pub-id>
</element-citation>
</ref>
<ref id="R8">
<label>8</label>
<element-citation publication-type="journal">
<article-title>Sequencing the microbial soup</article-title>
<source>Nat. Struct. Mol. Biol</source>
<year>2008</year>
<volume>15</volume>
<issue>2</issue>
<fpage>177</fpage>
<lpage>182</lpage>
<pub-id pub-id-type="pmid">18204466</pub-id>
</element-citation>
</ref>
<ref id="R9">
<label>9</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Bohannon</surname>
<given-names>J</given-names>
</name>
</person-group>
<article-title>Microbial ecology. Confusing kinships</article-title>
<source>Science</source>
<year>2008</year>
<volume>320</volume>
<issue>5879</issue>
<fpage>1031</fpage>
<lpage>1033</lpage>
<pub-id pub-id-type="pmid">18497286</pub-id>
</element-citation>
</ref>
<ref id="R10">
<label>10</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Lukashin</surname>
<given-names>A. V</given-names>
</name>
<name>
<surname>Borodovsky</surname>
<given-names>M</given-names>
</name>
</person-group>
<article-title>GeneMark.hmm: new solutions for gene finding</article-title>
<source>Nucleic Acids Res</source>
<year>1998</year>
<volume>26</volume>
<issue>4</issue>
<fpage>1107</fpage>
<lpage>1115</lpage>
<pub-id pub-id-type="pmid">9461475</pub-id>
</element-citation>
</ref>
<ref id="R11">
<label>11</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Ley</surname>
<given-names>R. E</given-names>
</name>
<name>
<surname>Turnbaugh</surname>
<given-names>P</given-names>
</name>
<name>
<surname>Klein</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Gordon</surname>
<given-names>J. I</given-names>
</name>
</person-group>
<article-title>Microbial ecology: Human gut microbes associated with obesity</article-title>
<source>Nature</source>
<year>2006</year>
<volume>444</volume>
<fpage>1022</fpage>
<lpage>1023</lpage>
<pub-id pub-id-type="pmid">17183309</pub-id>
</element-citation>
</ref>
<ref id="R12">
<label>12</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Turnbaugh</surname>
<given-names>P. J</given-names>
</name>
<name>
<surname>Ley</surname>
<given-names>R. E</given-names>
</name>
<name>
<surname>Hamady</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Fraser-Liggett</surname>
<given-names>C. M</given-names>
</name>
<name>
<surname>Knight</surname>
<given-names>R</given-names>
</name>
<name>
<surname>Gordon</surname>
<given-names>J. I</given-names>
</name>
</person-group>
<article-title>The human microbiome project</article-title>
<source>Nature</source>
<year>2007</year>
<volume>449</volume>
<fpage>804</fpage>
<lpage>810</lpage>
<pub-id pub-id-type="pmid">17943116</pub-id>
</element-citation>
</ref>
<ref id="R13">
<label>13</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Gill</surname>
<given-names>S. R</given-names>
</name>
<name>
<surname>Pop</surname>
<given-names>M</given-names>
</name>
<name>
<surname>DeBoy</surname>
<given-names>R. T</given-names>
</name>
<name>
<surname>Eckburg</surname>
<given-names>P. B</given-names>
</name>
<name>
<surname>Turnbaugh</surname>
<given-names>P. J</given-names>
</name>
<name>
<surname>Samuel</surname>
<given-names>B. S</given-names>
</name>
<name>
<surname>Gordon</surname>
<given-names>J. I</given-names>
</name>
<name>
<surname>Relman</surname>
<given-names>D. A</given-names>
</name>
<name>
<surname>Fraser-Liggett</surname>
<given-names>C. M</given-names>
</name>
<name>
<surname>Nelson</surname>
<given-names>K. E</given-names>
</name>
</person-group>
<article-title>Metagenomic analysis of the human distal gut microbiome</article-title>
<source>Science</source>
<year>2006</year>
<volume>312</volume>
<issue>5778</issue>
<fpage>1355</fpage>
<lpage>1359</lpage>
<pub-id pub-id-type="pmid">16741115</pub-id>
</element-citation>
</ref>
<ref id="R14">
<label>14</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Kurokawa</surname>
<given-names>K</given-names>
</name>
<name>
<surname>Itoh</surname>
<given-names>T</given-names>
</name>
<name>
<surname>Kuwahara</surname>
<given-names>T</given-names>
</name>
<name>
<surname>Oshima</surname>
<given-names>K</given-names>
</name>
<name>
<surname>Toh</surname>
<given-names>H</given-names>
</name>
<name>
<surname>Toyoda</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Takami</surname>
<given-names>H</given-names>
</name>
<name>
<surname>Morita</surname>
<given-names>H</given-names>
</name>
<name>
<surname>Sharma</surname>
<given-names>V. K</given-names>
</name>
<name>
<surname>Srivastava</surname>
<given-names>T. P</given-names>
</name>
<name>
<surname>Taylor</surname>
<given-names>T. D</given-names>
</name>
<name>
<surname>Noguchi</surname>
<given-names>H</given-names>
</name>
<name>
<surname>Mori</surname>
<given-names>H</given-names>
</name>
<name>
<surname>Ogura</surname>
<given-names>Y</given-names>
</name>
<name>
<surname>Ehrlich</surname>
<given-names>D. S</given-names>
</name>
<name>
<surname>Itoh</surname>
<given-names>K</given-names>
</name>
<name>
<surname>Takagi</surname>
<given-names>T</given-names>
</name>
<name>
<surname>Sakaki</surname>
<given-names>Y</given-names>
</name>
<name>
<surname>Hayashi</surname>
<given-names>T</given-names>
</name>
<name>
<surname>Hattori</surname>
<given-names>M</given-names>
</name>
</person-group>
<article-title>Comparative metagenomics revealed commonly enriched gene sets in human gut microbiomes</article-title>
<source>DNA Res</source>
<year>2007</year>
<volume>14</volume>
<issue>4</issue>
<fpage>169</fpage>
<lpage>181</lpage>
<pub-id pub-id-type="pmid">17916580</pub-id>
</element-citation>
</ref>
<ref id="R15">
<label>15</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Frank</surname>
<given-names>D. N</given-names>
</name>
<name>
<surname>Pace</surname>
<given-names>N. R</given-names>
</name>
</person-group>
<article-title>Gastrointestinal microbiology enters the metagenomics era</article-title>
<source>Curr. Opin. Gastroenterol</source>
<year>2008</year>
<volume>24</volume>
<issue>1</issue>
<fpage>4</fpage>
<lpage>10</lpage>
<pub-id pub-id-type="pmid">18043225</pub-id>
</element-citation>
</ref>
<ref id="R16">
<label>16</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Andersson</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Lindberg</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Jakobsson</surname>
<given-names>H</given-names>
</name>
<name>
<surname>Backhed</surname>
<given-names>F</given-names>
</name>
<name>
<surname>Nyren</surname>
<given-names>P</given-names>
</name>
<name>
<surname>Engstrand</surname>
<given-names>L</given-names>
</name>
</person-group>
<article-title>Comparative analysis of human gut microbiota by barcoded pyrosequencing</article-title>
<source>PLoS ONE</source>
<year>2008</year>
<volume>3</volume>
<issue>7</issue>
<fpage>e2836</fpage>
<pub-id pub-id-type="pmid">18665274</pub-id>
</element-citation>
</ref>
<ref id="R17">
<label>17</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Corby</surname>
<given-names>P. M</given-names>
</name>
<name>
<surname>Lyons-Weiler</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Bretz</surname>
<given-names>W. A</given-names>
</name>
<name>
<surname>Hart</surname>
<given-names>T. C</given-names>
</name>
<name>
<surname>Aas</surname>
<given-names>J. A</given-names>
</name>
<name>
<surname>Boumenna</surname>
<given-names>T</given-names>
</name>
<name>
<surname>Goss</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Corby</surname>
<given-names>A. L</given-names>
</name>
<name>
<surname>Junior</surname>
<given-names>H. M</given-names>
</name>
<name>
<surname>Weyant</surname>
<given-names>R. J</given-names>
</name>
<name>
<surname>Paster</surname>
<given-names>B. J</given-names>
</name>
</person-group>
<article-title>Microbial risk indicators of early childhood caries</article-title>
<source>J. Clin. Microbiol</source>
<year>2005</year>
<volume>43</volume>
<issue>11</issue>
<fpage>5753</fpage>
<lpage>5759</lpage>
<pub-id pub-id-type="pmid">16272513</pub-id>
</element-citation>
</ref>
<ref id="R18">
<label>18</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Faveri</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Mayer</surname>
<given-names>M. P</given-names>
</name>
<name>
<surname>Feres</surname>
<given-names>M</given-names>
</name>
<name>
<surname>de Figueiredo</surname>
<given-names>L. C</given-names>
</name>
<name>
<surname>Dewhirst</surname>
<given-names>F. E</given-names>
</name>
<name>
<surname>Paster</surname>
<given-names>B. J</given-names>
</name>
</person-group>
<article-title>Microbiological diversity of generalized aggressive periodontitis by 16S rRNA clonal analysis</article-title>
<source>Oral Microbiol. Immunol</source>
<year>2008</year>
<volume>23</volume>
<issue>2</issue>
<fpage>112</fpage>
<lpage>118</lpage>
<pub-id pub-id-type="pmid">18279178</pub-id>
</element-citation>
</ref>
<ref id="R19">
<label>19</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Grice</surname>
<given-names>E. A</given-names>
</name>
<name>
<surname>Kong</surname>
<given-names>H. H</given-names>
</name>
<name>
<surname>Renaud</surname>
<given-names>G</given-names>
</name>
<name>
<surname>Young</surname>
<given-names>A. C</given-names>
</name>
<collab>NISC Comparative Sequencing Program</collab>
<name>
<surname>Bouffard</surname>
<given-names>G. G</given-names>
</name>
<name>
<surname>Blakesley</surname>
<given-names>R. W</given-names>
</name>
<name>
<surname>Wolfsberg</surname>
<given-names>T. G</given-names>
</name>
<name>
<surname>Turner</surname>
<given-names>M. L</given-names>
</name>
<name>
<surname>Segre</surname>
<given-names>J. A</given-names>
</name>
</person-group>
<article-title>A diversity profile of the human skin microbiota</article-title>
<source>Genome Res</source>
<year>2008</year>
<volume>18</volume>
<fpage>1043</fpage>
<lpage>1050</lpage>
<pub-id pub-id-type="pmid">18502944</pub-id>
</element-citation>
</ref>
<ref id="R20">
<label>20</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Sundquist</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Bigdeli</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Jalili</surname>
<given-names>R</given-names>
</name>
<name>
<surname>Druzin</surname>
<given-names>M. L</given-names>
</name>
<name>
<surname>Waller</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Pullen</surname>
<given-names>K. M</given-names>
</name>
<name>
<surname>El-Sayed</surname>
<given-names>Y</given-names>
</name>
<name>
<surname>Taslimi</surname>
<given-names>M. M</given-names>
</name>
<name>
<surname>Batzoglou</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Ronaghi</surname>
<given-names>M</given-names>
</name>
</person-group>
<article-title>Bacterial flora-typing with targeted, chip-based pyrosequencing</article-title>
<source>BMC Microbiol</source>
<year>2007</year>
<volume>7</volume>
<fpage>108</fpage>
<pub-id pub-id-type="pmid">18047683</pub-id>
</element-citation>
</ref>
<ref id="R21">
<label>21</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Noordin</surname>
<given-names>K</given-names>
</name>
<name>
<surname>Kamin</surname>
<given-names>S</given-names>
</name>
</person-group>
<article-title>The effect of probiotic mouthrinse on plaque and gingival inflammation</article-title>
<source>Ann. Dent</source>
<year>2007</year>
<volume>14</volume>
<issue>1</issue>
</element-citation>
</ref>
<ref id="R22">
<label>22</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Fierer</surname>
<given-names>N</given-names>
</name>
<name>
<surname>Breitbart</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Nulton</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Salamon</surname>
<given-names>P</given-names>
</name>
<name>
<surname>Lozupone</surname>
<given-names>C</given-names>
</name>
<name>
<surname>Jones</surname>
<given-names>R</given-names>
</name>
<name>
<surname>Robeson</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Edwards</surname>
<given-names>R. A</given-names>
</name>
<name>
<surname>Felts</surname>
<given-names>B</given-names>
</name>
<name>
<surname>Rayhawk</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Knight</surname>
<given-names>R</given-names>
</name>
<name>
<surname>Rohwer</surname>
<given-names>F</given-names>
</name>
<name>
<surname>Jackson</surname>
<given-names>R. B</given-names>
</name>
</person-group>
<article-title>Metagenomic and small-subunit rRNA analyses reveal the genetic diversity of bacteria, archaea, fungi, and viruses in soil</article-title>
<source>Appl. Environ. Microbiol</source>
<year>2007</year>
<volume>73</volume>
<issue>21</issue>
<fpage>7059</fpage>
<lpage>7066</lpage>
<pub-id pub-id-type="pmid">17827313</pub-id>
</element-citation>
</ref>
<ref id="R23">
<label>23</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Tringe</surname>
<given-names>S. G</given-names>
</name>
<name>
<surname>von Mering</surname>
<given-names>C</given-names>
</name>
<name>
<surname>Kobayashi</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Salamov</surname>
<given-names>A. A</given-names>
</name>
<name>
<surname>Chen</surname>
<given-names>K</given-names>
</name>
<name>
<surname>Chang</surname>
<given-names>H. W</given-names>
</name>
<name>
<surname>Podar</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Short</surname>
<given-names>J. M</given-names>
</name>
<name>
<surname>Mathur</surname>
<given-names>E. J</given-names>
</name>
<name>
<surname>Detter</surname>
<given-names>J. C</given-names>
</name>
<name>
<surname>Bork</surname>
<given-names>P</given-names>
</name>
<name>
<surname>Hugenholtz</surname>
<given-names>P</given-names>
</name>
<name>
<surname>Rubin</surname>
<given-names>E. M</given-names>
</name>
</person-group>
<article-title>Comparative metagenomics of microbial communities</article-title>
<source>Science</source>
<year>2005</year>
<volume>308</volume>
<issue>5721</issue>
<fpage>554</fpage>
<lpage>557</lpage>
<pub-id pub-id-type="pmid">15845853</pub-id>
</element-citation>
</ref>
<ref id="R24">
<label>24</label>
<element-citation publication-type="book">
<person-group person-group-type="author">
<name>
<surname>Nielsen</surname>
<given-names>M. N</given-names>
</name>
<name>
<surname>Winding</surname>
<given-names>A</given-names>
</name>
</person-group>
<source>Microorganisms as indicators of soil health; 388</source>
<year>2002</year>
<publisher-loc>Denmark</publisher-loc>
<publisher-name>National Environmental Research Institute</publisher-name>
</element-citation>
</ref>
<ref id="R25">
<label>25</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>van Elsas</surname>
<given-names>J. D</given-names>
</name>
<name>
<surname>Speksnijder</surname>
<given-names>A. J</given-names>
</name>
<name>
<surname>van Overbeek</surname>
<given-names>L. S</given-names>
</name>
</person-group>
<article-title>A procedure for the metagenomic exploration of disease-suppressive soils</article-title>
<source>J. Microbiol. Methods</source>
<year>2008</year>
<volume>75</volume>
<fpage>515</fpage>
<lpage>522</lpage>
</element-citation>
</ref>
<ref id="R26">
<label>26</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Eyers</surname>
<given-names>L</given-names>
</name>
</person-group>
<article-title>Environmental genomics: exploring the unmined richness of microbes to degrade xenobiotics</article-title>
<source>Appl. Microbiol. Biotechnol</source>
<year>2004</year>
<volume>66</volume>
<fpage>123</fpage>
<lpage>130</lpage>
</element-citation>
</ref>
<ref id="R27">
<label>27</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Demaneche</surname>
<given-names>S</given-names>
</name>
<name>
<surname>David</surname>
<given-names>M. M</given-names>
</name>
<name>
<surname>Navarro</surname>
<given-names>E</given-names>
</name>
<name>
<surname>Simonet</surname>
<given-names>P</given-names>
</name>
<name>
<surname>Vogel</surname>
<given-names>T. M</given-names>
</name>
</person-group>
<article-title>Evaluation of functional gene enrichment in a soil metagenomic clone library</article-title>
<source>J. Microbiol. Methods</source>
<year>2009</year>
<volume>76</volume>
<fpage>105</fpage>
<lpage>107</lpage>
</element-citation>
</ref>
<ref id="R28">
<label>28</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Fitch</surname>
<given-names>J. P</given-names>
</name>
<name>
<surname>Raber</surname>
<given-names>E</given-names>
</name>
<name>
<surname>Imbro</surname>
<given-names>D. R</given-names>
</name>
</person-group>
<article-title>Technology challenges in responding to biological or chemical attacks in the civilian sector</article-title>
<source>Science</source>
<year>2003</year>
<volume>302</volume>
<issue>5649</issue>
<fpage>1350</fpage>
<lpage>1354</lpage>
<pub-id pub-id-type="pmid">14631029</pub-id>
</element-citation>
</ref>
<ref id="R29">
<label>29</label>
<element-citation publication-type="other">
<person-group person-group-type="author">
<name>
<surname>Enserink</surname>
<given-names>M</given-names>
</name>
</person-group>
<article-title>The Anthrax Case: From Spores to a Suspect</article-title>
<source>ScienceNOW Daily News</source>
<year>2008</year>
</element-citation>
</ref>
<ref id="R30">
<label>30</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Enserink</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Bhattacharjee</surname>
<given-names>Y</given-names>
</name>
</person-group>
<article-title>Scientists seek answers, ponder future after anthrax case suicide</article-title>
<source>Science</source>
<year>2008</year>
<volume>321</volume>
<issue>5890</issue>
<fpage>754</fpage>
<lpage>755</lpage>
<pub-id pub-id-type="pmid">18687926</pub-id>
</element-citation>
</ref>
<ref id="R31">
<label>31</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Blow</surname>
<given-names>M. J</given-names>
</name>
<name>
<surname>Zhang</surname>
<given-names>T</given-names>
</name>
<name>
<surname>Woyke</surname>
<given-names>T</given-names>
</name>
<name>
<surname>Speller</surname>
<given-names>C. F</given-names>
</name>
<name>
<surname>Krivoshapkin</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Yang</surname>
<given-names>D. Y</given-names>
</name>
<name>
<surname>Derevianko</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Rubin</surname>
<given-names>E. M</given-names>
</name>
</person-group>
<article-title>Identification of ancient remains through genomic sequencing</article-title>
<source>Genome Res</source>
<year>2008</year>
<volume>18</volume>
<fpage>1347</fpage>
<lpage>1353</lpage>
<pub-id pub-id-type="pmid">18426903</pub-id>
</element-citation>
</ref>
<ref id="R32">
<label>32</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Ho</surname>
<given-names>S. Y. W</given-names>
</name>
<name>
<surname>Heupink</surname>
<given-names>T. H</given-names>
</name>
<name>
<surname>Rambaut</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Shapiro</surname>
<given-names>B</given-names>
</name>
</person-group>
<article-title>Bayesian estimation of sequence damage in ancient DNA</article-title>
<source>Mol. Biol. Evol</source>
<year>2007</year>
<volume>24</volume>
<issue>6</issue>
<fpage>1416</fpage>
<lpage>1422</lpage>
<pub-id pub-id-type="pmid">17395598</pub-id>
</element-citation>
</ref>
<ref id="R33">
<label>33</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Margulies</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Egholm</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Altman</surname>
<given-names>W. E</given-names>
</name>
<name>
<surname>Attiya</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Bader</surname>
<given-names>J. S</given-names>
</name>
<name>
<surname>Bemben</surname>
<given-names>L. A</given-names>
</name>
<name>
<surname>Berka</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Braverman</surname>
<given-names>M. S</given-names>
</name>
<name>
<surname>Chen</surname>
<given-names>Y. J</given-names>
</name>
<name>
<surname>Chen</surname>
<given-names>Z</given-names>
</name>
<name>
<surname>Dewell</surname>
<given-names>S. B</given-names>
</name>
<name>
<surname>Du</surname>
<given-names>L</given-names>
</name>
<name>
<surname>Fierro</surname>
<given-names>J. M</given-names>
</name>
<name>
<surname>Gomes</surname>
<given-names>X. V</given-names>
</name>
<name>
<surname>Godwin</surname>
<given-names>B. C</given-names>
</name>
<name>
<surname>He</surname>
<given-names>W</given-names>
</name>
<name>
<surname>Helgesen</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Ho</surname>
<given-names>C. H</given-names>
</name>
<name>
<surname>Irzyk</surname>
<given-names>G. P</given-names>
</name>
<name>
<surname>Jando</surname>
<given-names>S. C</given-names>
</name>
<name>
<surname>Alenquer</surname>
<given-names>M. L</given-names>
</name>
<name>
<surname>Jarvie</surname>
<given-names>T. P</given-names>
</name>
<name>
<surname>Jirage</surname>
<given-names>K. B</given-names>
</name>
<name>
<surname>Kim</surname>
<given-names>J. B</given-names>
</name>
<name>
<surname>Knight</surname>
<given-names>J. R</given-names>
</name>
<name>
<surname>Lanza</surname>
<given-names>J. R</given-names>
</name>
<name>
<surname>Leamon</surname>
<given-names>J. H</given-names>
</name>
<name>
<surname>Lefkowitz</surname>
<given-names>S. M</given-names>
</name>
<name>
<surname>Lei</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Li</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Lohman</surname>
<given-names>K. L</given-names>
</name>
<name>
<surname>Lu</surname>
<given-names>H</given-names>
</name>
<name>
<surname>Makhijani</surname>
<given-names>V. B</given-names>
</name>
<name>
<surname>McDade</surname>
<given-names>K. E</given-names>
</name>
<name>
<surname>McKenna</surname>
<given-names>M. P</given-names>
</name>
<name>
<surname>Myers</surname>
<given-names>E. W</given-names>
</name>
<name>
<surname>Nickerson</surname>
<given-names>E</given-names>
</name>
<name>
<surname>Nobile</surname>
<given-names>J. R</given-names>
</name>
<name>
<surname>Plant</surname>
<given-names>R</given-names>
</name>
<name>
<surname>Puc</surname>
<given-names>B. P</given-names>
</name>
<name>
<surname>Ronan</surname>
<given-names>M. T</given-names>
</name>
<name>
<surname>Roth</surname>
<given-names>G. T</given-names>
</name>
<name>
<surname>Sarkis</surname>
<given-names>G. J</given-names>
</name>
<name>
<surname>Simons</surname>
<given-names>J. F</given-names>
</name>
<name>
<surname>Simpson</surname>
<given-names>J. W</given-names>
</name>
<name>
<surname>Srinivasan</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Tartaro</surname>
<given-names>K. R</given-names>
</name>
<name>
<surname>Tomasz</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Vogt</surname>
<given-names>K. A</given-names>
</name>
<name>
<surname>Volkmer</surname>
<given-names>G. A</given-names>
</name>
<name>
<surname>Wang</surname>
<given-names>S. H</given-names>
</name>
<name>
<surname>Wang</surname>
<given-names>Y</given-names>
</name>
<name>
<surname>Weiner</surname>
<given-names>M. P</given-names>
</name>
<name>
<surname>Yu</surname>
<given-names>P</given-names>
</name>
<name>
<surname>Begley</surname>
<given-names>R. F</given-names>
</name>
<name>
<surname>Rothberg</surname>
<given-names>J. M</given-names>
</name>
</person-group>
<article-title>Genome sequencing in microfabricated high-density picolitre reactors</article-title>
<source>Nature</source>
<year>2005</year>
<volume>437</volume>
<fpage>376</fpage>
<lpage>380</lpage>
<pub-id pub-id-type="pmid">16056220</pub-id>
</element-citation>
</ref>
<ref id="R34">
<label>34</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Poinar</surname>
<given-names>H. N</given-names>
</name>
<name>
<surname>Schwarz</surname>
<given-names>C</given-names>
</name>
<name>
<surname>Qi</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Shapiro</surname>
<given-names>B</given-names>
</name>
<name>
<surname>MacPhee</surname>
<given-names>R. D. E</given-names>
</name>
<name>
<surname>Buigues</surname>
<given-names>B</given-names>
</name>
<name>
<surname>Tikhonov</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Huson</surname>
<given-names>D. H</given-names>
</name>
<name>
<surname>Tomsho</surname>
<given-names>L. P</given-names>
</name>
<name>
<surname>Auch</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Rampp</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Miller</surname>
<given-names>W</given-names>
</name>
<name>
<surname>Schuster</surname>
<given-names>S. C</given-names>
</name>
</person-group>
<article-title>Metagenomics to paleogenomics: large-scale sequencing of mammoth DNA</article-title>
<source>Science</source>
<year>2005</year>
<volume>311</volume>
<issue>5759</issue>
<fpage>392</fpage>
<lpage>394</lpage>
<pub-id pub-id-type="pmid">16368896</pub-id>
</element-citation>
</ref>
<ref id="R35">
<label>35</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Noonan</surname>
<given-names>J. P</given-names>
</name>
<name>
<surname>Coop</surname>
<given-names>G</given-names>
</name>
<name>
<surname>Kudaravalli</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Smith</surname>
<given-names>D</given-names>
</name>
<name>
<surname>Krause</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Alessi</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Chen</surname>
<given-names>F</given-names>
</name>
<name>
<surname>Platt</surname>
<given-names>D</given-names>
</name>
<name>
<surname>Paabo</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Pritchard</surname>
<given-names>JK</given-names>
</name>
<name>
<surname>Rubin</surname>
<given-names>EM</given-names>
</name>
</person-group>
<article-title>Sequencing and analysis of Neanderthal genomic DNA</article-title>
<source>Science</source>
<year>2006</year>
<volume>314</volume>
<issue>5802</issue>
<fpage>1113</fpage>
<lpage>1118</lpage>
<pub-id pub-id-type="pmid">17110569</pub-id>
</element-citation>
</ref>
<ref id="R36">
<label>36</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Tringe</surname>
<given-names>SG</given-names>
</name>
<name>
<surname>Zhang</surname>
<given-names>T</given-names>
</name>
<name>
<surname>Liu</surname>
<given-names>X</given-names>
</name>
<name>
<surname>Yu</surname>
<given-names>Y</given-names>
</name>
<name>
<surname>Lee</surname>
<given-names>WH</given-names>
</name>
<name>
<surname>Yap</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Yao</surname>
<given-names>F</given-names>
</name>
<name>
<surname>Suan</surname>
<given-names>ST</given-names>
</name>
<name>
<surname>Ing</surname>
<given-names>SK</given-names>
</name>
<name>
<surname>Haynes</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Rohwer</surname>
<given-names>F</given-names>
</name>
<name>
<surname>Wei</surname>
<given-names>CL</given-names>
</name>
<name>
<surname>Tan</surname>
<given-names>P</given-names>
</name>
<name>
<surname>Bristow</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Rubin</surname>
<given-names>EM</given-names>
</name>
<name>
<surname>Ruan</surname>
<given-names>Y</given-names>
</name>
</person-group>
<article-title>The airborne metagenome in an indoor urban environment</article-title>
<source>PLoS ONE</source>
<year>2008</year>
<volume>3</volume>
<issue>4</issue>
<fpage>e1862</fpage>
<pub-id pub-id-type="pmid">18382653</pub-id>
</element-citation>
</ref>
<ref id="R37">
<label>37</label>
<element-citation publication-type="book">
<person-group person-group-type="author">
<name>
<surname>Bruns</surname>
<given-names>MA</given-names>
</name>
<name>
<surname>Scow</surname>
<given-names>KM</given-names>
</name>
</person-group>
<source>DNA fingerprinting as a means to identify sources of soil-derived dust: problems and potential</source>
<year>1999</year>
<publisher-loc>Boca Raton FL</publisher-loc>
<publisher-name>CRC Press</publisher-name>
</element-citation>
</ref>
<ref id="R38">
<label>38</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Heath</surname>
<given-names>LE</given-names>
</name>
<name>
<surname>Saunders</surname>
<given-names>VA</given-names>
</name>
</person-group>
<article-title>Assessing the potential of bacterial DNA profiling for forensic soil comparisons</article-title>
<source>J. Forensic Sci</source>
<year>2006</year>
<volume>51</volume>
<issue>5</issue>
<fpage>1062</fpage>
<lpage>1068</lpage>
<pub-id pub-id-type="pmid">17018082</pub-id>
</element-citation>
</ref>
<ref id="R39">
<label>39</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Sanger</surname>
<given-names>F</given-names>
</name>
<name>
<surname>Nicklen</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Coulson</surname>
<given-names>AR</given-names>
</name>
</person-group>
<article-title>DNA sequencing with chain-terminating inhibitors</article-title>
<source>Proc. Natl. Acad. Sci. USA</source>
<year>1977</year>
<volume>74</volume>
<issue>12</issue>
<fpage>5463</fpage>
<lpage>5467</lpage>
<pub-id pub-id-type="pmid">271968</pub-id>
</element-citation>
</ref>
<ref id="R40">
<label>40</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Hugenholtz</surname>
<given-names>P</given-names>
</name>
<name>
<surname>Tyson</surname>
<given-names>GW</given-names>
</name>
</person-group>
<article-title>Microbiology: Metagenomics</article-title>
<source>Nature</source>
<year>2008</year>
<volume>455</volume>
<fpage>481</fpage>
<lpage>483</lpage>
<pub-id pub-id-type="pmid">18818648</pub-id>
</element-citation>
</ref>
<ref id="R41">
<label>41</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Altschul</surname>
<given-names>SF</given-names>
</name>
<name>
<surname>Gish</surname>
<given-names>W</given-names>
</name>
<name>
<surname>Miller</surname>
<given-names>W</given-names>
</name>
<name>
<surname>Myers</surname>
<given-names>EW</given-names>
</name>
<name>
<surname>Lipman</surname>
<given-names>DJ</given-names>
</name>
</person-group>
<article-title>Basic local alignment search tool</article-title>
<source>J. Mol. Biol</source>
<year>1990</year>
<volume>215</volume>
<fpage>403</fpage>
<lpage>410</lpage>
<pub-id pub-id-type="pmid">2231712</pub-id>
</element-citation>
</ref>
<ref id="R42">
<label>42</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Wommack</surname>
<given-names>KE</given-names>
</name>
<name>
<surname>Bhavsar</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Ravel</surname>
<given-names>J</given-names>
</name>
</person-group>
<article-title>Metagenomics: Read length matters</article-title>
<source>Appl. Environ. Microbiol</source>
<year>2008</year>
<volume>74</volume>
<issue>5</issue>
<fpage>1453</fpage>
<lpage>1463</lpage>
<pub-id pub-id-type="pmid">18192407</pub-id>
</element-citation>
</ref>
<ref id="R43">
<label>43</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Garcia-Martinez</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Acinas</surname>
<given-names>SG</given-names>
</name>
<name>
<surname>Anton</surname>
<given-names>AI</given-names>
</name>
<name>
<surname>Rodriguez-Valera</surname>
<given-names>F</given-names>
</name>
</person-group>
<article-title>Use of the 16S-23S ribosomal genes spacer region in studies of prokaryotic diversity</article-title>
<source>J. Microbiol. Meth</source>
<year>1999</year>
<volume>36</volume>
<issue>1-2</issue>
<fpage>55</fpage>
<lpage>64</lpage>
</element-citation>
</ref>
<ref id="R44">
<label>44</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Macrae</surname>
<given-names>A</given-names>
</name>
</person-group>
<article-title>The use of 16S rDNA methods in soil microbiology</article-title>
<source>Braz. J. Microbiol</source>
<year>2000</year>
<volume>31</volume>
<fpage>77</fpage>
<lpage>82</lpage>
</element-citation>
</ref>
<ref id="R45">
<label>45</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Harris</surname>
<given-names>KA</given-names>
</name>
<name>
<surname>Hartley</surname>
<given-names>JC</given-names>
</name>
</person-group>
<article-title>Development of broad-range 16S rDNA PCR for use in the routine diagnostic clinical microbiology service</article-title>
<source>J. Med. Microbiol</source>
<year>2003</year>
<volume>52</volume>
<fpage>685</fpage>
<lpage>691</lpage>
<pub-id pub-id-type="pmid">12867563</pub-id>
</element-citation>
</ref>
<ref id="R46">
<label>46</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Wang</surname>
<given-names>Q</given-names>
</name>
<name>
<surname>Garrity</surname>
<given-names>G</given-names>
</name>
<name>
<surname>Tiedje</surname>
<given-names>JM</given-names>
</name>
<name>
<surname>Cole</surname>
<given-names>JR</given-names>
</name>
</person-group>
<article-title>Naive Bayesian classifier for rapid assignment of rRNA sequences into the new bacterial taxonomy</article-title>
<source>Appl. Environ. Microbiol</source>
<year>2007</year>
<volume>73</volume>
<issue>16</issue>
<fpage>5261</fpage>
<lpage>5267</lpage>
<pub-id pub-id-type="pmid">17586664</pub-id>
</element-citation>
</ref>
<ref id="R47">
<label>47</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Liu</surname>
<given-names>Z</given-names>
</name>
<name>
<surname>DeSantis</surname>
<given-names>TZ</given-names>
</name>
<name>
<surname>Andersen</surname>
<given-names>GL</given-names>
</name>
<name>
<surname>Knight</surname>
<given-names>R</given-names>
</name>
</person-group>
<article-title>Accurate taxonomy assignments from 16S rRNA sequences produced by highly parallel pyrosequencers</article-title>
<source>Nucleic Acids Res</source>
<year>2008</year>
<volume>36</volume>
<issue>18</issue>
<fpage>e120</fpage>
<pub-id pub-id-type="pmid">18723574</pub-id>
</element-citation>
</ref>
<ref id="R48">
<label>48</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Peplies</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Glockner</surname>
<given-names>FO</given-names>
</name>
<name>
<surname>Amann</surname>
<given-names>R</given-names>
</name>
</person-group>
<article-title>Optimization strategies for DNA microarray-based detection of bacteria with 16S rRNA-targeting oligonucleotide probes</article-title>
<source>Appl. Environ. Microbiol</source>
<year>2003</year>
<volume>69</volume>
<issue>3</issue>
<fpage>1397</fpage>
<lpage>1407</lpage>
<pub-id pub-id-type="pmid">12620822</pub-id>
</element-citation>
</ref>
<ref id="R49">
<label>49</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Treimo</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Vegarud</surname>
<given-names>G</given-names>
</name>
<name>
<surname>Langsrud</surname>
<given-names>T</given-names>
</name>
<name>
<surname>Marki</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Rudi</surname>
<given-names>K</given-names>
</name>
</person-group>
<article-title>Total bacterial and species-specific 16S rDNA micro-array quantification of complex samples</article-title>
<source>J. Appl. Microbiol</source>
<year>2006</year>
<volume>100</volume>
<issue>5</issue>
<fpage>985</fpage>
<lpage>998</lpage>
<pub-id pub-id-type="pmid">16629999</pub-id>
</element-citation>
</ref>
<ref id="R50">
<label>50</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Loy</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Schulz</surname>
<given-names>C</given-names>
</name>
<name>
<surname>Lucker</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Schopfer-Wendels</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Stoecker</surname>
<given-names>K</given-names>
</name>
<name>
<surname>Baranyi</surname>
<given-names>C</given-names>
</name>
<name>
<surname>Lehner</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Wagner</surname>
<given-names>M</given-names>
</name>
</person-group>
<article-title>16S rRNA gene-based oligonucleotide microarray for environmental monitoring of the betaproteobacterial order "Rhodocyclales"</article-title>
<source>Appl. Environ. Microbiol</source>
<year>2005</year>
<volume>71</volume>
<issue>3</issue>
<fpage>1373</fpage>
<lpage>1386</lpage>
<pub-id pub-id-type="pmid">15746340</pub-id>
</element-citation>
</ref>
<ref id="R51">
<label>51</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Maron</surname>
<given-names>P-A</given-names>
</name>
<name>
<surname>Ranjard</surname>
<given-names>L</given-names>
</name>
<name>
<surname>Mougel</surname>
<given-names>C</given-names>
</name>
<name>
<surname>Lemanceau</surname>
<given-names>P</given-names>
</name>
</person-group>
<article-title>Metaproteomics: A new approach for studying functional microbial ecology</article-title>
<source>Microb. Ecol</source>
<year>2007</year>
<volume>53</volume>
<issue>3</issue>
<fpage>486</fpage>
<lpage>493</lpage>
<pub-id pub-id-type="pmid">17431707</pub-id>
</element-citation>
</ref>
<ref id="R52">
<label>52</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Schulze</surname>
<given-names>WX</given-names>
</name>
</person-group>
<article-title>A proteomic fingerprint of dissolved organic carbon and of soil particles</article-title>
<source>Oecologia</source>
<year>2005</year>
<volume>142</volume>
<fpage>335</fpage>
<lpage>343</lpage>
<pub-id pub-id-type="pmid">15449171</pub-id>
</element-citation>
</ref>
<ref id="R53">
<label>53</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Kan</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Hanson</surname>
<given-names>TE</given-names>
</name>
<name>
<surname>Ginter</surname>
<given-names>JM</given-names>
</name>
<name>
<surname>Wang</surname>
<given-names>K</given-names>
</name>
<name>
<surname>Chen</surname>
<given-names>F</given-names>
</name>
</person-group>
<article-title>Metaproteomic analysis of Chesapeake Bay microbial communities</article-title>
<source>Saline Syst</source>
<year>2005</year>
<volume>1</volume>
<fpage>7</fpage>
<pub-id pub-id-type="pmid">16176596</pub-id>
</element-citation>
</ref>
<ref id="R54">
<label>54</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Lacerda</surname>
<given-names>CMR</given-names>
</name>
<name>
<surname>Choe</surname>
<given-names>LH</given-names>
</name>
<name>
<surname>Reardon</surname>
<given-names>KF</given-names>
</name>
</person-group>
<article-title>Metaproteomic analysis of bacterial community response to cadmium exposure</article-title>
<source>J. Proteome Res</source>
<year>2007</year>
<volume>6</volume>
<fpage>1145</fpage>
<lpage>1152</lpage>
<pub-id pub-id-type="pmid">17284062</pub-id>
</element-citation>
</ref>
<ref id="R55">
<label>55</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Benndorf</surname>
<given-names>D</given-names>
</name>
<name>
<surname>Balcke</surname>
<given-names>GU</given-names>
</name>
<name>
<surname>Harms</surname>
<given-names>H</given-names>
</name>
<name>
<surname>von Bergen</surname>
<given-names>M</given-names>
</name>
</person-group>
<article-title>Functional metaproteome analysis of protein extracts from contaminated soil and groundwater</article-title>
<source>ISME J</source>
<year>2007</year>
<volume>1</volume>
<issue>3</issue>
<fpage>224</fpage>
<lpage>234</lpage>
<pub-id pub-id-type="pmid">18043633</pub-id>
</element-citation>
</ref>
<ref id="R56">
<label>56</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Raes</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Foerstner</surname>
<given-names>KU</given-names>
</name>
<name>
<surname>Bork</surname>
<given-names>P</given-names>
</name>
</person-group>
<article-title>Get the most out of your metagenome: computational analysis of environmental sequence data</article-title>
<source>Curr. Opin. Microbiol</source>
<year>2007</year>
<volume>10</volume>
<issue>5</issue>
<fpage>490</fpage>
<lpage>498</lpage>
<pub-id pub-id-type="pmid">17936679</pub-id>
</element-citation>
</ref>
<ref id="R57">
<label>57</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Eisen</surname>
<given-names>JA</given-names>
</name>
</person-group>
<article-title>Environmental shotgun sequencing: its potential and challenges for studying the hidden world of microbes</article-title>
<source>PLoS Biol</source>
<year>2007</year>
<volume>5</volume>
<issue>3</issue>
<fpage>e82</fpage>
<pub-id pub-id-type="pmid">17355177</pub-id>
</element-citation>
</ref>
<ref id="R58">
<label>58</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Valdivia-Granda</surname>
<given-names>W</given-names>
</name>
</person-group>
<article-title>The next meta-challenge for Bioinformatics</article-title>
<source>Bioinformation</source>
<year>2008</year>
<volume>2</volume>
<issue>8</issue>
<fpage>358</fpage>
<lpage>362</lpage>
<pub-id pub-id-type="pmid">18685725</pub-id>
</element-citation>
</ref>
<ref id="R59">
<label>59</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Rusch</surname>
<given-names>DB</given-names>
</name>
<name>
<surname>Halpern</surname>
<given-names>AL</given-names>
</name>
<name>
<surname>Sutton</surname>
<given-names>G</given-names>
</name>
<name>
<surname>Heidelberg</surname>
<given-names>KB</given-names>
</name>
<name>
<surname>Williamson</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Yooseph</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Wu</surname>
<given-names>D</given-names>
</name>
<name>
<surname>Eisen</surname>
<given-names>JA</given-names>
</name>
<name>
<surname>Hoffman</surname>
<given-names>JM</given-names>
</name>
<name>
<surname>Remington</surname>
<given-names>K</given-names>
</name>
<name>
<surname>Beeson</surname>
<given-names>K</given-names>
</name>
<name>
<surname>Tran</surname>
<given-names>B</given-names>
</name>
<name>
<surname>Smith</surname>
<given-names>H</given-names>
</name>
<name>
<surname>Baden-Tillson</surname>
<given-names>H</given-names>
</name>
<name>
<surname>Stewart</surname>
<given-names>C</given-names>
</name>
<name>
<surname>Thorpe</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Freeman</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Andrews-Pfannkoch</surname>
<given-names>C</given-names>
</name>
<name>
<surname>Venter</surname>
<given-names>JE</given-names>
</name>
<name>
<surname>Li</surname>
<given-names>K</given-names>
</name>
<name>
<surname>Kravitz</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Heidelberg</surname>
<given-names>JF</given-names>
</name>
<name>
<surname>Utterback</surname>
<given-names>T</given-names>
</name>
<name>
<surname>Rogers</surname>
<given-names>Y-H</given-names>
</name>
<name>
<surname>Falcon</surname>
<given-names>LI</given-names>
</name>
<name>
<surname>Souza</surname>
<given-names>V</given-names>
</name>
<name>
<surname>Bonilla-Rosso</surname>
<given-names>G</given-names>
</name>
<name>
<surname>Eguiarte</surname>
<given-names>LE</given-names>
</name>
<name>
<surname>Karl</surname>
<given-names>DM</given-names>
</name>
<name>
<surname>Sathyendranath</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Platt</surname>
<given-names>T</given-names>
</name>
<name>
<surname>Bermingham</surname>
<given-names>E</given-names>
</name>
<name>
<surname>Gallardo</surname>
<given-names>V</given-names>
</name>
<name>
<surname>Tamayo-Castillo</surname>
<given-names>G</given-names>
</name>
<name>
<surname>Ferrari</surname>
<given-names>MR</given-names>
</name>
<name>
<surname>Strausberg</surname>
<given-names>RL</given-names>
</name>
<name>
<surname>Nealson</surname>
<given-names>K</given-names>
</name>
<name>
<surname>Friedman</surname>
<given-names>R</given-names>
</name>
<name>
<surname>Frazier</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Venter</surname>
<given-names>JC</given-names>
</name>
</person-group>
<article-title>The Sorcerer II Global Ocean Sampling expedition: northwest Atlantic through eastern tropical Pacific</article-title>
<source>PLoS Biol</source>
<year>2007</year>
<volume>5</volume>
<issue>3</issue>
<fpage>e77</fpage>
<pub-id pub-id-type="pmid">17355176</pub-id>
</element-citation>
</ref>
<ref id="R60">
<label>60</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Rosen</surname>
<given-names>GL</given-names>
</name>
</person-group>
<article-title>Examining coding structure and redundancy in DNA</article-title>
<source>IEEE Eng. Med. Biol. Mag</source>
<year>2006</year>
<volume>25</volume>
<issue>1</issue>
<fpage>62</fpage>
<lpage>68</lpage>
<pub-id pub-id-type="pmid">16485393</pub-id>
</element-citation>
</ref>
<ref id="R61">
<label>61</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Gianoulis</surname>
<given-names>TA</given-names>
</name>
<name>
<surname>Raes</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Patel</surname>
<given-names>PV</given-names>
</name>
<name>
<surname>Bjornson</surname>
<given-names>R</given-names>
</name>
<name>
<surname>Korbel</surname>
<given-names>JO</given-names>
</name>
<name>
<surname>Letunic</surname>
<given-names>I</given-names>
</name>
<name>
<surname>Yamada</surname>
<given-names>T</given-names>
</name>
<name>
<surname>Paccanaro</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Jensen</surname>
<given-names>LJ</given-names>
</name>
<name>
<surname>Snyder</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Bork</surname>
<given-names>P</given-names>
</name>
<name>
<surname>Gerstein</surname>
<given-names>MB</given-names>
</name>
</person-group>
<article-title>Quantifying environmental adaptation of metabolic pathways in metagenomics</article-title>
<source>Proc. Natl. Acad. Sci. USA</source>
<year>2009</year>
<volume>106</volume>
<fpage>1374</fpage>
<lpage>1379</lpage>
<pub-id pub-id-type="pmid">19164758</pub-id>
</element-citation>
</ref>
<ref id="R62">
<label>62</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>McHardy</surname>
<given-names>AC</given-names>
</name>
<name>
<surname>Rigoutsos</surname>
<given-names>I</given-names>
</name>
</person-group>
<article-title>What's in the mix: phylogenetic classification of metagenome sequence samples</article-title>
<source>Curr. Opin. Microbiol</source>
<year>2007</year>
<volume>10</volume>
<issue>5</issue>
<fpage>499</fpage>
<lpage>503</lpage>
<pub-id pub-id-type="pmid">17933580</pub-id>
</element-citation>
</ref>
<ref id="R63">
<label>63</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>McHardy</surname>
<given-names>AC</given-names>
</name>
<name>
<surname>Martin</surname>
<given-names>HG</given-names>
</name>
<name>
<surname>Tsirigos</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Hugenholtz</surname>
<given-names>P</given-names>
</name>
<name>
<surname>Rigoutsos</surname>
<given-names>I</given-names>
</name>
</person-group>
<article-title>Accurate phylogenetic classification of variable-length DNA fragments</article-title>
<source>Nat. Meth</source>
<year>2007</year>
<volume>4</volume>
<fpage>63</fpage>
<lpage>72</lpage>
</element-citation>
</ref>
<ref id="R64">
<label>64</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Rosen</surname>
<given-names>GL</given-names>
</name>
<name>
<surname>Garbarine</surname>
<given-names>EM</given-names>
</name>
<name>
<surname>Caseiro</surname>
<given-names>DA</given-names>
</name>
<name>
<surname>Polikar</surname>
<given-names>R</given-names>
</name>
<name>
<surname>Sokhansanj</surname>
<given-names>BA</given-names>
</name>
</person-group>
<article-title>Metagenome fragment classification using N-mer frequency profiles</article-title>
<source>Adv. Bioinform</source>
<year>2008</year>
<volume>2008</volume>
<fpage>12</fpage>
</element-citation>
</ref>
<ref id="R65">
<label>65</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Curtis</surname>
<given-names>TP</given-names>
</name>
<name>
<surname>Sloan</surname>
<given-names>WT</given-names>
</name>
<name>
<surname>Scannell</surname>
<given-names>JW</given-names>
</name>
</person-group>
<article-title>Estimating prokaryotic diversity and its limits</article-title>
<source>Proc. Natl. Acad. Sci. USA</source>
<year>2002</year>
<volume>99</volume>
<issue>16</issue>
<fpage>10494</fpage>
<lpage>10499</lpage>
<pub-id pub-id-type="pmid">12097644</pub-id>
</element-citation>
</ref>
<ref id="R66">
<label>66</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Huson</surname>
<given-names>DH</given-names>
</name>
<name>
<surname>Auch</surname>
<given-names>AF</given-names>
</name>
<name>
<surname>Qi</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Schuster</surname>
<given-names>SC</given-names>
</name>
</person-group>
<article-title>MEGAN analysis of metagenomic data</article-title>
<source>Genome Res</source>
<year>2007</year>
<volume>17</volume>
<issue>3</issue>
<fpage>377</fpage>
<lpage>386</lpage>
<pub-id pub-id-type="pmid">17255551</pub-id>
</element-citation>
</ref>
<ref id="R67">
<label>67</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Sandberg</surname>
<given-names>R</given-names>
</name>
<name>
<surname>Winberg</surname>
<given-names>G</given-names>
</name>
<name>
<surname>Branden</surname>
<given-names>CI</given-names>
</name>
<name>
<surname>Kaske</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Ernberg</surname>
<given-names>I</given-names>
</name>
<name>
<surname>Coster</surname>
<given-names>J</given-names>
</name>
</person-group>
<article-title>Capturing whole-genome characteristics in short sequences using a naïve Bayesian classifier</article-title>
<source>Genome Res</source>
<year>2001</year>
<volume>11</volume>
<issue>8</issue>
<fpage>1404</fpage>
<lpage>1409</lpage>
<pub-id pub-id-type="pmid">11483581</pub-id>
</element-citation>
</ref>
<ref id="R68">
<label>68</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Koski</surname>
<given-names>LB</given-names>
</name>
<name>
<surname>Golding</surname>
<given-names>GB</given-names>
</name>
</person-group>
<article-title>The closest BLAST hit is often not the nearest neighbor</article-title>
<source>J. Mol. Evol</source>
<year>2001</year>
<volume>52</volume>
<issue>6</issue>
<fpage>540</fpage>
<lpage>542</lpage>
<pub-id pub-id-type="pmid">11443357</pub-id>
</element-citation>
</ref>
<ref id="R69">
<label>69</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Venter</surname>
<given-names>JC</given-names>
</name>
<name>
<surname>Remington</surname>
<given-names>K</given-names>
</name>
<name>
<surname>Heidelberg</surname>
<given-names>JF</given-names>
</name>
<name>
<surname>Halpern</surname>
<given-names>AL</given-names>
</name>
<name>
<surname>Rusch</surname>
<given-names>D</given-names>
</name>
<name>
<surname>Eisen</surname>
<given-names>JA</given-names>
</name>
<name>
<surname>Wu</surname>
<given-names>D</given-names>
</name>
<name>
<surname>Paulsen</surname>
<given-names>I</given-names>
</name>
<name>
<surname>Nelson</surname>
<given-names>KE</given-names>
</name>
<name>
<surname>Nelson</surname>
<given-names>W</given-names>
</name>
<name>
<surname>Fouts</surname>
<given-names>DE</given-names>
</name>
<name>
<surname>Levy</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Knap</surname>
<given-names>AH</given-names>
</name>
<name>
<surname>Lomas</surname>
<given-names>MW</given-names>
</name>
<name>
<surname>Nealson</surname>
<given-names>K</given-names>
</name>
<name>
<surname>White</surname>
<given-names>O</given-names>
</name>
<name>
<surname>Peterson</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Hoffman</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Parsons</surname>
<given-names>R</given-names>
</name>
<name>
<surname>Baden-Tillson</surname>
<given-names>H</given-names>
</name>
<name>
<surname>Pfannkoch</surname>
<given-names>C</given-names>
</name>
<name>
<surname>Rogers</surname>
<given-names>Y-H</given-names>
</name>
<name>
<surname>Smith</surname>
<given-names>HO</given-names>
</name>
</person-group>
<article-title>Environmental genome shotgun sequencing of the Sargasso Sea</article-title>
<source>Science</source>
<year>2004</year>
<volume>304</volume>
<issue>5667</issue>
<fpage>66</fpage>
<lpage>74</lpage>
<pub-id pub-id-type="pmid">15001713</pub-id>
</element-citation>
</ref>
<ref id="R70">
<label>70</label>
<element-citation publication-type="confproc">
<person-group person-group-type="author">
<name>
<surname>Havre</surname>
<given-names>SL</given-names>
</name>
<name>
<surname>Webb-Robertson</surname>
<given-names>B-J</given-names>
</name>
<name>
<surname>Shah</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Posse</surname>
<given-names>C</given-names>
</name>
<name>
<surname>Gopalan</surname>
<given-names>B</given-names>
</name>
<name>
<surname>Brockman</surname>
<given-names>FJ</given-names>
</name>
</person-group>
<article-title>Bioinformatic insights from metagenomics through visualization</article-title>
<year>2005</year>
<conf-name>IEEE Comp. Sys. Bioinform. Conf</conf-name>
<fpage>341</fpage>
<lpage>350</lpage>
</element-citation>
</ref>
<ref id="R71">
<label>71</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Neph</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Tompa</surname>
<given-names>M</given-names>
</name>
</person-group>
<article-title>MicroFootPrinter: a tool for phylogenetic footprinting in prokaryotic genomes</article-title>
<source>Nucleic Acids Res</source>
<year>2006</year>
<volume>34</volume>
<fpage>W366</fpage>
<lpage>W368</lpage>
</element-citation>
</ref>
<ref id="R72">
<label>72</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Pignatelli</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Aparicio</surname>
<given-names>G</given-names>
</name>
<name>
<surname>Blanquer</surname>
<given-names>I</given-names>
</name>
<name>
<surname>Hernandez</surname>
<given-names>V</given-names>
</name>
<name>
<surname>Moya</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Tamames</surname>
<given-names>J</given-names>
</name>
</person-group>
<article-title>Metagenomics reveals our incomplete knowledge of global diversity</article-title>
<source>Bioinformatics</source>
<year>2008</year>
<volume>24</volume>
<issue>18</issue>
<fpage>2124</fpage>
<lpage>2125</lpage>
<pub-id pub-id-type="pmid">18625611</pub-id>
</element-citation>
</ref>
<ref id="R73">
<label>73</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Tress</surname>
<given-names>ML</given-names>
</name>
<name>
<surname>Cozzetto</surname>
<given-names>D</given-names>
</name>
<name>
<surname>Tramontano</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Valencia</surname>
<given-names>A</given-names>
</name>
</person-group>
<article-title>An analysis of the Sargasso Sea resource and the consequences for database composition</article-title>
<source>BMC Bioinformatics</source>
<year>2006</year>
<volume>7</volume>
<fpage>213</fpage>
<pub-id pub-id-type="pmid">16623953</pub-id>
</element-citation>
</ref>
<ref id="R74">
<label>74</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Manichanh</surname>
<given-names>C</given-names>
</name>
<name>
<surname>Chapple</surname>
<given-names>CE</given-names>
</name>
<name>
<surname>Frangeul</surname>
<given-names>L</given-names>
</name>
<name>
<surname>Gloux</surname>
<given-names>K</given-names>
</name>
<name>
<surname>Guigo</surname>
<given-names>R</given-names>
</name>
<name>
<surname>Dore</surname>
<given-names>J</given-names>
</name>
</person-group>
<article-title>A comparison of random sequence reads versus 16S rDNA sequences for estimating the biodiversity of a metagenomic library</article-title>
<source>Nucleic Acids Res</source>
<year>2008</year>
<volume>36</volume>
<issue>16</issue>
<fpage>5180</fpage>
<lpage>5188</lpage>
<pub-id pub-id-type="pmid">18682527</pub-id>
</element-citation>
</ref>
<ref id="R75">
<label>75</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Mavromatis</surname>
<given-names>K</given-names>
</name>
<name>
<surname>Ivanova</surname>
<given-names>N</given-names>
</name>
<name>
<surname>Barry</surname>
<given-names>K</given-names>
</name>
<name>
<surname>Shapiro</surname>
<given-names>H</given-names>
</name>
<name>
<surname>Goltsman</surname>
<given-names>E</given-names>
</name>
<name>
<surname>McHardy</surname>
<given-names>AC</given-names>
</name>
<name>
<surname>Rigoutsos</surname>
<given-names>I</given-names>
</name>
<name>
<surname>Salamov</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Korzeniewski</surname>
<given-names>F</given-names>
</name>
<name>
<surname>Land</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Lapidus</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Grigoriev</surname>
<given-names>I</given-names>
</name>
<name>
<surname>Richardson</surname>
<given-names>P</given-names>
</name>
<name>
<surname>Hugenholtz</surname>
<given-names>P</given-names>
</name>
<name>
<surname>Kyrpides</surname>
<given-names>NC</given-names>
</name>
</person-group>
<article-title>Use of simulated data sets to evaluate the fidelity of metagenomic processing methods</article-title>
<source>Nat. Meth</source>
<year>2007</year>
<volume>4</volume>
<fpage>495</fpage>
<lpage>500</lpage>
</element-citation>
</ref>
<ref id="R76">
<label>76</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Karlin</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Burge</surname>
<given-names>C</given-names>
</name>
</person-group>
<article-title>Dinucleotide relative abundance extremes: a genomic signature</article-title>
<source>Trends Genet</source>
<year>1995</year>
<volume>11</volume>
<fpage>283</fpage>
<lpage>290</lpage>
</element-citation>
</ref>
<ref id="R77">
<label>77</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Karlin</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Mrazek</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Campbell</surname>
<given-names>AM</given-names>
</name>
</person-group>
<article-title>Compositional biases of bacterial genomes and evolutionary implications</article-title>
<source>J. Bacteriol</source>
<year>1997</year>
<volume>179</volume>
<fpage>3899</fpage>
<lpage>3913</lpage>
<pub-id pub-id-type="pmid">9190805</pub-id>
</element-citation>
</ref>
<ref id="R78">
<label>78</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Nakashima</surname>
<given-names>H</given-names>
</name>
<name>
<surname>Ota</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Nishikawa</surname>
<given-names>K</given-names>
</name>
<name>
<surname>Ooi</surname>
<given-names>T</given-names>
</name>
</person-group>
<article-title>Genes from nine genomes are separated into their organisms in the dinucleotide composition space</article-title>
<source>DNA Res</source>
<year>1998</year>
<volume>5</volume>
<fpage>251</fpage>
<lpage>259</lpage>
<pub-id pub-id-type="pmid">9872449</pub-id>
</element-citation>
</ref>
<ref id="R79">
<label>79</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Deschavanne</surname>
<given-names>PJ</given-names>
</name>
<name>
<surname>Giron</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Vilain</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Fagot</surname>
<given-names>G</given-names>
</name>
<name>
<surname>Fertil</surname>
<given-names>B</given-names>
</name>
</person-group>
<article-title>Genomic signature: characterization and classification of species assessed by chaos game representation of sequences</article-title>
<source>Mol. Biol. Evol</source>
<year>1999</year>
<volume>16</volume>
<fpage>1391</fpage>
<lpage>1399</lpage>
<pub-id pub-id-type="pmid">10563018</pub-id>
</element-citation>
</ref>
<ref id="R80">
<label>80</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Abe</surname>
<given-names>T</given-names>
</name>
<name>
<surname>Kanaya</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Kinouchi</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Ichiba</surname>
<given-names>Y</given-names>
</name>
<name>
<surname>Kozuki</surname>
<given-names>T</given-names>
</name>
<name>
<surname>Ikemura</surname>
<given-names>T</given-names>
</name>
</person-group>
<article-title>Informatics for unveiling hidden genome signatures</article-title>
<source>Genome Res</source>
<year>2003</year>
<volume>13</volume>
<fpage>693</fpage>
<lpage>702</lpage>
<pub-id pub-id-type="pmid">12671005</pub-id>
</element-citation>
</ref>
<ref id="R81">
<label>81</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Pride</surname>
<given-names>DT</given-names>
</name>
<name>
<surname>Meinersmann</surname>
<given-names>RJ</given-names>
</name>
<name>
<surname>Wassenaar</surname>
<given-names>TM</given-names>
</name>
<name>
<surname>Blaser</surname>
<given-names>MJ</given-names>
</name>
</person-group>
<article-title>Evolutionary implications of microbial genome tetranucleotide frequency biases</article-title>
<source>Genome Res</source>
<year>2003</year>
<volume>13</volume>
<fpage>145</fpage>
<lpage>158</lpage>
<pub-id pub-id-type="pmid">12566393</pub-id>
</element-citation>
</ref>
<ref id="R82">
<label>82</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Teeling</surname>
<given-names>H</given-names>
</name>
<name>
<surname>Waldmann</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Lombardot</surname>
<given-names>T</given-names>
</name>
<name>
<surname>Bauer</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Glockner</surname>
<given-names>FO</given-names>
</name>
</person-group>
<article-title>TETRA: a web-service and a stand-alone program for the analysis and comparison of tetranucleotide usage patterns in DNA sequences</article-title>
<source>BMC Bioinformatics</source>
<year>2004</year>
<volume>5</volume>
<fpage>163</fpage>
<pub-id pub-id-type="pmid">15507136</pub-id>
</element-citation>
</ref>
<ref id="R83">
<label>83</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Abe</surname>
<given-names>T</given-names>
</name>
<name>
<surname>Sugawara</surname>
<given-names>H</given-names>
</name>
<name>
<surname>Kinouchi</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Kanaya</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Ikemura</surname>
<given-names>T</given-names>
</name>
</person-group>
<article-title>Novel phylogenetic studies of genomic sequence fragments derived from uncultured microbe mixtures in environmental and clinical samples</article-title>
<source>DNA Res</source>
<year>2005</year>
<volume>12</volume>
<fpage>281</fpage>
<lpage>290</lpage>
<pub-id pub-id-type="pmid">16769690</pub-id>
</element-citation>
</ref>
<ref id="R84">
<label>84</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Fertil</surname>
<given-names>B</given-names>
</name>
<name>
<surname>Massin</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Lespinats</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Devic</surname>
<given-names>C</given-names>
</name>
<name>
<surname>Dumee</surname>
<given-names>P</given-names>
</name>
<name>
<surname>Giron</surname>
<given-names>A</given-names>
</name>
</person-group>
<article-title>GENSTYLE: exploration and analysis of DNA sequences with genomic signature</article-title>
<source>Nucleic Acids Res</source>
<year>2005</year>
<volume>33</volume>
<fpage>512</fpage>
<lpage>515</lpage>
</element-citation>
</ref>
<ref id="R85">
<label>85</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Akhtar</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Epps</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Ambikairajah</surname>
<given-names>E</given-names>
</name>
</person-group>
<article-title>Signal processing in sequence analysis: advances in eukaryotic gene prediction</article-title>
<source>IEEE Sel. Top. Sig. Proc</source>
<year>2008</year>
<volume>2</volume>
<issue>3</issue>
<fpage>310</fpage>
<lpage>321</lpage>
</element-citation>
</ref>
<ref id="R86">
<label>86</label>
<element-citation publication-type="confproc">
<person-group person-group-type="author">
<name>
<surname>Garbarine</surname>
<given-names>E</given-names>
</name>
<name>
<surname>Rosen</surname>
<given-names>G</given-names>
</name>
</person-group>
<article-title>An Information-theoretic method of microarray probe design for genome classification</article-title>
<year>2008</year>
<volume>2008</volume>
<conf-name>IEEE Eng. Med. Bio. Conf</conf-name>
<conf-date>2008</conf-date>
<conf-loc>Vancouver, Canada</conf-loc>
<fpage>3779</fpage>
<lpage>3782</lpage>
</element-citation>
</ref>
<ref id="R87">
<label>87</label>
<element-citation publication-type="confproc">
<person-group person-group-type="author">
<name>
<surname>Gadia</surname>
<given-names>V</given-names>
</name>
<name>
<surname>Rosen</surname>
<given-names>GL</given-names>
</name>
</person-group>
<article-title>A Text-Mining Approach for Classification of Genomic Fragments</article-title>
<year>2008</year>
<conf-name>IEEE Int. Workshop Biomed. Health Inform</conf-name>
</element-citation>
</ref>
<ref id="R88">
<label>88</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Chen</surname>
<given-names>K</given-names>
</name>
<name>
<surname>Pachter</surname>
<given-names>L</given-names>
</name>
</person-group>
<article-title>Bioinformatics for whole-genome shotgun sequencing of microbial communities</article-title>
<source>PLoS Comput. Biol</source>
<year>2005</year>
<volume>1</volume>
<issue>2</issue>
<fpage>106</fpage>
<lpage>112</lpage>
<pub-id pub-id-type="pmid">16110337</pub-id>
</element-citation>
</ref>
<ref id="R89">
<label>89</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Chan</surname>
<given-names>C-K</given-names>
</name>
<name>
<surname>Hsu</surname>
<given-names>AL</given-names>
</name>
<name>
<surname>Tang</surname>
<given-names>SL</given-names>
</name>
<name>
<surname>Halgamuge</surname>
<given-names>SK</given-names>
</name>
</person-group>
<article-title>Using growing self-organising maps to improve the binning process in environmental whole-genome shotgun sequencing</article-title>
<source>J. Biomed. Biotechnol</source>
<year>2008</year>
<volume>2008</volume>
<fpage>10</fpage>
</element-citation>
</ref>
<ref id="R90">
<label>90</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Chatterji</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Yamazaki</surname>
<given-names>I</given-names>
</name>
<name>
<surname>Bai</surname>
<given-names>Z</given-names>
</name>
<name>
<surname>Eisen</surname>
<given-names>JA</given-names>
</name>
</person-group>
<article-title>CompostBin: a DNA composition-based algorithm for binning environmental shotgun reads</article-title>
<source>Springer-Verlag Lect. Notes Comput. Sci</source>
<year>2008</year>
<volume>4955</volume>
<fpage>17</fpage>
<lpage>28</lpage>
</element-citation>
</ref>
<ref id="R91">
<label>91</label>
<element-citation publication-type="confproc">
<person-group person-group-type="author">
<name>
<surname>Nasser</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Breland</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Harris</surname>
<given-names>FC</given-names>
</name>
<name>
<surname>Nicolescu</surname>
<given-names>M</given-names>
</name>
</person-group>
<article-title>A Fuzzy Classifier to taxonomically group DNA fragments within a metagenome</article-title>
<source>IEEE Ann. Meeting Fuzzy Info. Proc. Soc</source>
<year>2008</year>
<volume>2008</volume>
<fpage>1</fpage>
<lpage>6</lpage>
</element-citation>
</ref>
<ref id="R92">
<label>92</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Li</surname>
<given-names>W</given-names>
</name>
<name>
<surname>Wooley</surname>
<given-names>JC</given-names>
</name>
<name>
<surname>Godzik</surname>
<given-names>A</given-names>
</name>
</person-group>
<article-title>Probing metagenomics by rapid cluster analysis of very large datasets</article-title>
<source>PLoS ONE</source>
<year>2008</year>
<volume>3</volume>
<issue>10</issue>
<fpage>e3375</fpage>
<pub-id pub-id-type="pmid">18846219</pub-id>
</element-citation>
</ref>
<ref id="R93">
<label>93</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Harrison</surname>
<given-names>CJ</given-names>
</name>
<name>
<surname>Langdale</surname>
<given-names>JA</given-names>
</name>
</person-group>
<article-title>A step by step guide to phylogeny reconstruction</article-title>
<source>Plant J</source>
<year>2006</year>
<volume>45</volume>
<issue>4</issue>
<fpage>561</fpage>
<lpage>572</lpage>
<pub-id pub-id-type="pmid">16441349</pub-id>
</element-citation>
</ref>
<ref id="R94">
<label>94</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Nye</surname>
<given-names>TMW</given-names>
</name>
</person-group>
<article-title>Trees of trees: an approach to comparing multiple alternative phylogenies</article-title>
<source>Syst. Biol</source>
<year>2008</year>
<volume>57</volume>
<issue>5</issue>
<fpage>785</fpage>
<lpage>794</lpage>
<pub-id pub-id-type="pmid">18853364</pub-id>
</element-citation>
</ref>
<ref id="R95">
<label>95</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Gevers</surname>
<given-names>D</given-names>
</name>
<name>
<surname>Cohan</surname>
<given-names>FM</given-names>
</name>
<name>
<surname>Lawrence</surname>
<given-names>JG</given-names>
</name>
<name>
<surname>Spratt</surname>
<given-names>BG</given-names>
</name>
<name>
<surname>Coenye</surname>
<given-names>T</given-names>
</name>
<name>
<surname>Feil</surname>
<given-names>EJ</given-names>
</name>
<name>
<surname>Stackebrandt</surname>
<given-names>E</given-names>
</name>
<name>
<surname>Van de Peer</surname>
<given-names>Y</given-names>
</name>
<name>
<surname>Vandamme</surname>
<given-names>P</given-names>
</name>
<name>
<surname>Thompson</surname>
<given-names>FL</given-names>
</name>
<name>
<surname>Swings</surname>
<given-names>J</given-names>
</name>
</person-group>
<article-title>Opinion: re-evaluating prokaryotic species</article-title>
<source>Nat. Rev. Microbiol</source>
<year>2005</year>
<volume>3</volume>
<issue>9</issue>
<fpage>733</fpage>
<lpage>739</lpage>
<pub-id pub-id-type="pmid">16138101</pub-id>
</element-citation>
</ref>
<ref id="R96">
<label>96</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Hagstrom</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Pommier</surname>
<given-names>T</given-names>
</name>
<name>
<surname>Rohwer</surname>
<given-names>F</given-names>
</name>
<name>
<surname>Simu</surname>
<given-names>K</given-names>
</name>
<name>
<surname>Stolte</surname>
<given-names>W</given-names>
</name>
<name>
<surname>Svensson</surname>
<given-names>D</given-names>
</name>
<name>
<surname>Zweifel</surname>
<given-names>UL</given-names>
</name>
</person-group>
<article-title>Use of 16S ribosomal DNA for delineation of marine bacterioplankton species</article-title>
<source>Appl. Environ. Microbiol</source>
<year>2002</year>
<volume>68</volume>
<issue>7</issue>
<fpage>3628</fpage>
<lpage>3633</lpage>
<pub-id pub-id-type="pmid">12089052</pub-id>
</element-citation>
</ref>
<ref id="R97">
<label>97</label>
<element-citation publication-type="book">
<person-group person-group-type="author">
<name>
<surname>Hazen</surname>
<given-names>RM</given-names>
</name>
</person-group>
<source>Genesis: the scientific quest for life's origin</source>
<year>2005</year>
<publisher-name>Joseph Henry Press</publisher-name>
</element-citation>
</ref>
<ref id="R98">
<label>98</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Krause</surname>
<given-names>L</given-names>
</name>
<name>
<surname>Diaz</surname>
<given-names>NN</given-names>
</name>
<name>
<surname>Goesmann</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Kelley</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Nattkemper</surname>
<given-names>TW</given-names>
</name>
<name>
<surname>Rohwer</surname>
<given-names>F</given-names>
</name>
<name>
<surname>Edwards</surname>
<given-names>RA</given-names>
</name>
<name>
<surname>Stoye</surname>
<given-names>J</given-names>
</name>
</person-group>
<article-title>Phylogenetic classification of short environmental DNA fragments</article-title>
<source>Nucleic Acids Res</source>
<year>2008</year>
<volume>36</volume>
<issue>7</issue>
<fpage>2230</fpage>
<lpage>2239</lpage>
<pub-id pub-id-type="pmid">18285365</pub-id>
</element-citation>
</ref>
<ref id="R99">
<label>99</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Thompson</surname>
<given-names>JD</given-names>
</name>
<name>
<surname>Plewniak</surname>
<given-names>F</given-names>
</name>
<name>
<surname>Poch</surname>
<given-names>O</given-names>
</name>
</person-group>
<article-title>A comprehensive comparison of multiple sequence alignment programs</article-title>
<source>Nucleic Acids Res</source>
<year>1999</year>
<volume>27</volume>
<issue>13</issue>
<fpage>2682</fpage>
<lpage>2690</lpage>
<pub-id pub-id-type="pmid">10373585</pub-id>
</element-citation>
</ref>
<ref id="R100">
<label>100</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Higgins</surname>
<given-names>DG</given-names>
</name>
<name>
<surname>Sharp</surname>
<given-names>PM</given-names>
</name>
</person-group>
<article-title>Clustal: a package for performing multiple sequence alignment on a microcomputer</article-title>
<source>Gene</source>
<year>1988</year>
<volume>73</volume>
<issue>1</issue>
<fpage>237</fpage>
<lpage>244</lpage>
<pub-id pub-id-type="pmid">3243435</pub-id>
</element-citation>
</ref>
<ref id="R101">
<label>101</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Tamura</surname>
<given-names>K</given-names>
</name>
<name>
<surname>Dudley</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Nei</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Kumar</surname>
<given-names>S</given-names>
</name>
</person-group>
<article-title>MEGA4: molecular evolutionary genetics analysis</article-title>
<source>Mol. Biol. Evol</source>
<year>2007</year>
<volume>24</volume>
<issue>8</issue>
<fpage>1596</fpage>
<lpage>1599</lpage>
<pub-id pub-id-type="pmid">17488738</pub-id>
</element-citation>
</ref>
<ref id="R102">
<label>102</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Swofford</surname>
<given-names>DL</given-names>
</name>
</person-group>
<article-title>PAUP: phylogenetic analysis using parsimony, Version 3.1</article-title>
<source>Illin. Nat. Hist. Surv</source>
<year>1991</year>
</element-citation>
</ref>
<ref id="R103">
<label>103</label>
<element-citation publication-type="webpage">
<person-group person-group-type="author">
<name>
<surname>Huelsenbeck</surname>
<given-names>J</given-names>
</name>
</person-group>
<article-title>MrBayes Manual</article-title>
<source>
<uri xlink:type="simple" xlink:href="http://mrbayes.csit.fsu.edu/manual.php">http://mrbayes.csit.fsu.edu/manual.php</uri>
</source>
<year>2008</year>
</element-citation>
</ref>
<ref id="R104">
<label>104</label>
<element-citation publication-type="webpage">
<person-group person-group-type="author">
<name>
<surname>Felsenstein</surname>
<given-names>J</given-names>
</name>
</person-group>
<article-title>PHYLIP (PHYLogeny Inference Package)</article-title>
<source>
<uri xlink:type="simple" xlink:href="http://evolution.genetics.washington.edu/phylip.html">http://evolution.genetics.washington.edu/phylip.html</uri>
</source>
<year>2008</year>
</element-citation>
</ref>
<ref id="R105">
<label>105</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Lozupone</surname>
<given-names>C</given-names>
</name>
<name>
<surname>Hamady</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Knight</surname>
<given-names>R</given-names>
</name>
</person-group>
<article-title>UniFrac - An online tool for comparing microbial community diversity in a phylogenetic context</article-title>
<source>BMC Bioinformatics</source>
<year>2006</year>
<volume>7</volume>
<fpage>371</fpage>
<pub-id pub-id-type="pmid">16893466</pub-id>
</element-citation>
</ref>
<ref id="R106">
<label>106</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Jeanmougin</surname>
<given-names>F</given-names>
</name>
<name>
<surname>Thompson</surname>
<given-names>JD</given-names>
</name>
<name>
<surname>Gouy</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Higgins</surname>
<given-names>DG</given-names>
</name>
<name>
<surname>Gibson</surname>
<given-names>TJ</given-names>
</name>
</person-group>
<article-title>Multiple sequence alignment with Clustal X</article-title>
<source>Trends Biochem. Sci</source>
<year>1998</year>
<volume>23</volume>
<issue>10</issue>
<fpage>403</fpage>
<lpage>405</lpage>
<pub-id pub-id-type="pmid">9810230</pub-id>
</element-citation>
</ref>
<ref id="R107">
<label>107</label>
<element-citation publication-type="book">
<person-group person-group-type="author">
<name>
<surname>Sneath</surname>
<given-names>PHA</given-names>
</name>
<name>
<surname>Sokal</surname>
<given-names>RR</given-names>
</name>
</person-group>
<source>Numerical taxonomy</source>
<year>1973</year>
<publisher-loc>San Francisco CA</publisher-loc>
<publisher-name>W.H. Freeman and Company</publisher-name>
<fpage>230</fpage>
<lpage>234</lpage>
</element-citation>
</ref>
<ref id="R108">
<label>108</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Gaut</surname>
<given-names>BS</given-names>
</name>
<name>
<surname>Lewis</surname>
<given-names>PO</given-names>
</name>
</person-group>
<article-title>Success of maximum likelihood phylogeny inference in the four-taxon case</article-title>
<source>Mol. Biol. Evol</source>
<year>1995</year>
<volume>12</volume>
<issue>1</issue>
<fpage>152</fpage>
<lpage>162</lpage>
<pub-id pub-id-type="pmid">7877489</pub-id>
</element-citation>
</ref>
<ref id="R109">
<label>109</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Yang</surname>
<given-names>Z</given-names>
</name>
</person-group>
<article-title>PAML 4: phylogenetic analysis by maximum likelihood</article-title>
<source>Mol. Biol. Evol</source>
<year>2007</year>
<volume>24</volume>
<issue>8</issue>
<fpage>1586</fpage>
<lpage>1591</lpage>
<pub-id pub-id-type="pmid">17483113</pub-id>
</element-citation>
</ref>
<ref id="R110">
<label>110</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Hobolth</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Yoshida</surname>
<given-names>R</given-names>
</name>
</person-group>
<article-title>Maximum likelihood estimation of phylogenetic tree and substitution rates via generalized neighbor-joining and the EM algorithm</article-title>
<source>Algebr. Biol</source>
<year>2005</year>
<volume>2005</volume>
<fpage>41</fpage>
<lpage>50</lpage>
</element-citation>
</ref>
<ref id="R111">
<label>111</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Wang</surname>
<given-names>Q</given-names>
</name>
<name>
<surname>Salter</surname>
<given-names>LA</given-names>
</name>
<name>
<surname>Pearl</surname>
<given-names>DK</given-names>
</name>
</person-group>
<article-title>Estimation of evolutionary parameters with phylogenetic trees</article-title>
<source>J. Mol. Evol</source>
<year>2002</year>
<volume>55</volume>
<issue>6</issue>
<fpage>684</fpage>
<lpage>695</lpage>
<pub-id pub-id-type="pmid">12486527</pub-id>
</element-citation>
</ref>
<ref id="R112">
<label>112</label>
<element-citation publication-type="book">
<person-group person-group-type="author">
<name>
<surname>Jukes</surname>
<given-names>TH</given-names>
</name>
<name>
<surname>Cantor</surname>
<given-names>C</given-names>
</name>
</person-group>
<source>Evolution of protein molecules. In: Mammalian protein metabolism</source>
<year>1969</year>
<publisher-name>Academic Press</publisher-name>
<fpage>21</fpage>
<lpage>32</lpage>
</element-citation>
</ref>
<ref id="R113">
<label>113</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Ripplinger</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Sullivan</surname>
<given-names>J</given-names>
</name>
</person-group>
<article-title>Does choice in model selection affect maximum likelihood analysis?</article-title>
<source>Syst. Biol</source>
<year>2008</year>
<volume>57</volume>
<issue>1</issue>
<fpage>76</fpage>
<lpage>85</lpage>
<pub-id pub-id-type="pmid">18275003</pub-id>
</element-citation>
</ref>
<ref id="R114">
<label>114</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Saitou</surname>
<given-names>N</given-names>
</name>
<name>
<surname>Nei</surname>
<given-names>M</given-names>
</name>
</person-group>
<article-title>The neighbor-joining method: a new method for reconstructing phylogenetic trees</article-title>
<source>Mol. Biol. Evol</source>
<year>1987</year>
<volume>4</volume>
<fpage>406</fpage>
<lpage>425</lpage>
<pub-id pub-id-type="pmid">3447015</pub-id>
</element-citation>
</ref>
<ref id="R115">
<label>115</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Martin</surname>
<given-names>AP</given-names>
</name>
</person-group>
<article-title>Phylogenetic approaches for describing and comparing the diversity of microbial communities</article-title>
<source>Appl. Environ. Microbiol</source>
<year>2002</year>
<volume>68</volume>
<issue>8</issue>
<fpage>3673</fpage>
<lpage>3682</lpage>
<pub-id pub-id-type="pmid">12147459</pub-id>
</element-citation>
</ref>
<ref id="R116">
<label>116</label>
<element-citation publication-type="book">
<person-group person-group-type="author">
<name>
<surname>Sokal</surname>
<given-names>R</given-names>
</name>
<name>
<surname>Rohlf</surname>
<given-names>F</given-names>
</name>
</person-group>
<source>Biometry: the principles and practice of statistics in biological research</source>
<year>1995</year>
<publisher-loc>New York, NY</publisher-loc>
<publisher-name>W.H. Freeman and Co</publisher-name>
</element-citation>
</ref>
<ref id="R117">
<label>117</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Fitch</surname>
<given-names>JP</given-names>
</name>
<name>
<surname>Sokhansanj</surname>
<given-names>B</given-names>
</name>
</person-group>
<article-title>Genomic engineering: moving beyond DNA sequence to function</article-title>
<source>Proc. IEEE</source>
<year>2000</year>
<volume>88</volume>
<fpage>1949</fpage>
<lpage>1971</lpage>
</element-citation>
</ref>
<ref id="R118">
<label>118</label>
<element-citation publication-type="confproc">
<person-group person-group-type="author">
<name>
<surname>Sheikh</surname>
<given-names>MA</given-names>
</name>
<name>
<surname>Milenkovic</surname>
<given-names>O</given-names>
</name>
<name>
<surname>Baraniuk</surname>
<given-names>RG</given-names>
</name>
</person-group>
<article-title>Designing compressive sensing DNA microarrays</article-title>
<year>2007</year>
<conf-name>IEEE Workshop Comp. Adv. Multi-Sensor Adapt. Proc. (CAMSAP)</conf-name>
<fpage>141</fpage>
<lpage>144</lpage>
</element-citation>
</ref>
<ref id="R119">
<label>119</label>
<element-citation publication-type="confproc">
<person-group person-group-type="author">
<name>
<surname>Gingell</surname>
<given-names>T</given-names>
</name>
<name>
<surname>Lewis</surname>
<given-names>C</given-names>
</name>
<name>
<surname>Kowahl</surname>
<given-names>N</given-names>
</name>
</person-group>
<article-title>Automated microarray organism detection with a non-Gaussian maximum likelihood model</article-title>
<year>2007</year>
<conf-name>IEEE Workshop Stat. Sig. Proc</conf-name>
</element-citation>
</ref>
<ref id="R120">
<label>120</label>
<element-citation publication-type="confproc">
<person-group person-group-type="author">
<name>
<surname>Yok</surname>
<given-names>N</given-names>
</name>
<name>
<surname>Rosen</surname>
<given-names>GL</given-names>
</name>
</person-group>
<article-title>An iterative approach to probe-design for compressive sensing microarrays</article-title>
<year>2008</year>
<conf-name>IEEE Intl. Workshop Syst. Biol. Med</conf-name>
</element-citation>
</ref>
<ref id="R121">
<label>121</label>
<element-citation publication-type="confproc">
<person-group person-group-type="author">
<name>
<surname>Vikalo</surname>
<given-names>H</given-names>
</name>
<name>
<surname>Parvaresh</surname>
<given-names>F</given-names>
</name>
<name>
<surname>Misra</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Hassibi</surname>
<given-names>B</given-names>
</name>
</person-group>
<article-title>Recovering sparse signals using sparse measurement matrices in compressed DNA microarrays</article-title>
<source>IEEE J. Select. Topics Signal Processing</source>
<year>2008</year>
<volume>2</volume>
<issue>3</issue>
<fpage>275</fpage>
<lpage>285</lpage>
</element-citation>
</ref>
<ref id="R122">
<label>122</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Schliep</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Torney</surname>
<given-names>D</given-names>
</name>
<name>
<surname>Rahmann</surname>
<given-names>S</given-names>
</name>
</person-group>
<article-title>Group testing with DNA chips: generating designs and decoding experiments</article-title>
<source>Comput. Soc. Bioinform. Conf</source>
<year>2003</year>
<volume>2003</volume>
<fpage>84</fpage>
</element-citation>
</ref>
<ref id="R123">
<label>123</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Jones</surname>
<given-names>BV</given-names>
</name>
<name>
<surname>Begley</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Hill</surname>
<given-names>C</given-names>
</name>
<name>
<surname>Gahan</surname>
<given-names>CGM</given-names>
</name>
<name>
<surname>Marchesi</surname>
<given-names>JR</given-names>
</name>
</person-group>
<article-title>Functional and comparative metagenomic analysis of bile salt hydrolase activity in the human gut microbiome</article-title>
<source>Proc. Natl. Acad. Sci</source>
<year>2008</year>
<volume>105</volume>
<issue>36</issue>
<fpage>13580</fpage>
<lpage>13585</lpage>
<pub-id pub-id-type="pmid">18757757</pub-id>
</element-citation>
</ref>
<ref id="R124">
<label>124</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Elshahed</surname>
<given-names>MS</given-names>
</name>
<name>
<surname>Youssef</surname>
<given-names>NH</given-names>
</name>
<name>
<surname>Spain</surname>
<given-names>AM</given-names>
</name>
<name>
<surname>Sheik</surname>
<given-names>C</given-names>
</name>
<name>
<surname>Najar</surname>
<given-names>FZ</given-names>
</name>
<name>
<surname>Sukharnikov</surname>
<given-names>LO</given-names>
</name>
<name>
<surname>Roe</surname>
<given-names>BA</given-names>
</name>
<name>
<surname>Davis</surname>
<given-names>JP</given-names>
</name>
<name>
<surname>Schloss</surname>
<given-names>PD</given-names>
</name>
<name>
<surname>Bailey</surname>
<given-names>VL</given-names>
</name>
<name>
<surname>Krumholz</surname>
<given-names>LR</given-names>
</name>
</person-group>
<article-title>Novelty and uniqueness patterns of rare members of the soil biosphere</article-title>
<source>Appl. Environ. Microbiol</source>
<year>2008</year>
<volume>74</volume>
<issue>17</issue>
<fpage>5422</fpage>
<lpage>5428</lpage>
<pub-id pub-id-type="pmid">18606799</pub-id>
</element-citation>
</ref>
<ref id="R125">
<label>125</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Fierer</surname>
<given-names>N</given-names>
</name>
<name>
<surname>Jackson</surname>
<given-names>RB</given-names>
</name>
</person-group>
<article-title>The diversity and biogeography of soil bacterial communities</article-title>
<source>Proc. Natl. Acad. Sci</source>
<year>2006</year>
<volume>103</volume>
<issue>3</issue>
<fpage>626</fpage>
<lpage>631</lpage>
<pub-id pub-id-type="pmid">16407148</pub-id>
</element-citation>
</ref>
<ref id="R126">
<label>126</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Allison</surname>
<given-names>SD</given-names>
</name>
<name>
<surname>Martiny</surname>
<given-names>JBH</given-names>
</name>
</person-group>
<article-title>Resistance, resilience, and redundancy in microbial communities</article-title>
<source>Proc. Natl. Acad. Sci</source>
<year>2008</year>
<volume>105</volume>
<fpage>11512</fpage>
<lpage>11519</lpage>
<pub-id pub-id-type="pmid">18695234</pub-id>
</element-citation>
</ref>
<ref id="R127">
<label>127</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>DeLong</surname>
<given-names>EF</given-names>
</name>
<name>
<surname>Preston</surname>
<given-names>CM</given-names>
</name>
<name>
<surname>Mincer</surname>
<given-names>T</given-names>
</name>
<name>
<surname>Rich</surname>
<given-names>V</given-names>
</name>
<name>
<surname>Hallam</surname>
<given-names>SJ</given-names>
</name>
<name>
<surname>Frigaard</surname>
<given-names>N-U</given-names>
</name>
<name>
<surname>Martinez</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Sullivan</surname>
<given-names>MB</given-names>
</name>
<name>
<surname>Edwards</surname>
<given-names>R</given-names>
</name>
<name>
<surname>Rodriguez Brito</surname>
<given-names>B</given-names>
</name>
<name>
<surname>Chisholm</surname>
<given-names>SW</given-names>
</name>
<name>
<surname>Karl</surname>
<given-names>DM</given-names>
</name>
</person-group>
<article-title>Community genomics among stratified microbial assemblages in the ocean's interior</article-title>
<source>Science</source>
<year>2006</year>
<volume>311</volume>
<issue>5760</issue>
<fpage>496</fpage>
<lpage>503</lpage>
</element-citation>
</ref>
<ref id="R128">
<label>128</label>
<element-citation publication-type="webpage">
<source>
<uri xlink:type="simple" xlink:href="www.ncbi.nlm.nih.gov/projects/gorf/">www.ncbi.nlm.nih.gov/projects/gorf/</uri>
</source>
<year>2008</year>
</element-citation>
</ref>
<ref id="R129">
<label>129</label>
<element-citation publication-type="book">
<person-group person-group-type="author">
<name>
<surname>Kulp</surname>
<given-names>D</given-names>
</name>
<name>
<surname>Haussler</surname>
<given-names>D</given-names>
</name>
<name>
<surname>Reese</surname>
<given-names>MG</given-names>
</name>
<name>
<surname>Eeckman</surname>
<given-names>FH</given-names>
</name>
</person-group>
<article-title>A generalized hidden Markov model for the recognition of human genes in DNA</article-title>
<source>ISMB</source>
<year>1996</year>
<volume>4</volume>
<publisher-loc>St. Louis, MO</publisher-loc>
<publisher-name>AAAI/MIT Press</publisher-name>
<fpage>134</fpage>
<lpage>142</lpage>
</element-citation>
</ref>
<ref id="R130">
<label>130</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Burge</surname>
<given-names>C</given-names>
</name>
<name>
<surname>Karlin</surname>
<given-names>S</given-names>
</name>
</person-group>
<article-title>Prediction of complete gene structures in human genomic DNA</article-title>
<source>J. Mol. Biol</source>
<year>1997</year>
<volume>268</volume>
<issue>1</issue>
<fpage>78</fpage>
<lpage>94</lpage>
<pub-id pub-id-type="pmid">9149143</pub-id>
</element-citation>
</ref>
<ref id="R131">
<label>131</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Salzberg</surname>
<given-names>SL</given-names>
</name>
<name>
<surname>Delcher</surname>
<given-names>AL</given-names>
</name>
<name>
<surname>Kasif</surname>
<given-names>S</given-names>
</name>
<name>
<surname>White</surname>
<given-names>O</given-names>
</name>
</person-group>
<article-title>Microbial gene identification using interpolated Markov models</article-title>
<source>Nucleic Acids Res</source>
<year>1998</year>
<volume>26</volume>
<issue>2</issue>
<fpage>544</fpage>
<lpage>548</lpage>
<pub-id pub-id-type="pmid">9421513</pub-id>
</element-citation>
</ref>
<ref id="R132">
<label>132</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Noguchi</surname>
<given-names>H</given-names>
</name>
<name>
<surname>Park</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Takagi</surname>
<given-names>T</given-names>
</name>
</person-group>
<article-title>MetaGene: prokaryotic gene finding from environmental genome shotgun sequences</article-title>
<source>Nucleic Acids Res</source>
<year>2006</year>
<volume>34</volume>
<issue>19</issue>
<fpage>5623</fpage>
<lpage>5630</lpage>
<pub-id pub-id-type="pmid">17028096</pub-id>
</element-citation>
</ref>
<ref id="R133">
<label>133</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Benson</surname>
<given-names>DA</given-names>
</name>
<name>
<surname>Karsch-Mizrachi</surname>
<given-names>I</given-names>
</name>
<name>
<surname>Lipman</surname>
<given-names>DJ</given-names>
</name>
<name>
<surname>Ostell</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Rapp</surname>
<given-names>BA</given-names>
</name>
<name>
<surname>Wheeler</surname>
<given-names>DL</given-names>
</name>
</person-group>
<article-title>GenBank</article-title>
<source>Nucleic Acids Res</source>
<year>2008</year>
<volume>36</volume>
<fpage>25</fpage>
<lpage>30</lpage>
</element-citation>
</ref>
<ref id="R134">
<label>134</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Harrington</surname>
<given-names>ED</given-names>
</name>
<name>
<surname>Singh</surname>
<given-names>AH</given-names>
</name>
<name>
<surname>Doerks</surname>
<given-names>T</given-names>
</name>
<name>
<surname>Letunic</surname>
<given-names>I</given-names>
</name>
<name>
<surname>von Mering</surname>
<given-names>C</given-names>
</name>
<name>
<surname>Jensen</surname>
<given-names>LJ</given-names>
</name>
<name>
<surname>Raes</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Bork</surname>
<given-names>P</given-names>
</name>
</person-group>
<article-title>Quantitative assessment of protein function prediction from metagenomics shotgun sequences</article-title>
<source>Proc. Natl. Acad. Sci</source>
<year>2007</year>
<volume>104</volume>
<issue>35</issue>
<fpage>13913</fpage>
<lpage>13918</lpage>
<pub-id pub-id-type="pmid">17717083</pub-id>
</element-citation>
</ref>
<ref id="R135">
<label>135</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Kanehisa</surname>
<given-names>M</given-names>
</name>
</person-group>
<article-title>The KEGG database</article-title>
<source>Novartis Found. Symp</source>
<year>2002</year>
<comment>247, 91-101, 101-103, 119-128, 244-252</comment>
</element-citation>
</ref>
<ref id="R136">
<label>136</label>
<element-citation publication-type="webpage">
<source>
<uri xlink:type="simple" xlink:href="http://ncbi.nih.gov/COG">http://ncbi.nih.gov/COG</uri>
</source>
<year>2009</year>
</element-citation>
</ref>
<ref id="R137">
<label>137</label>
<element-citation publication-type="webpage">
<article-title>The UniProt Consortium; Nucleic Acids Research, advance access published November 27, 2007</article-title>
<source>The Universal Protein Resource (UniProt)
<uri xlink:type="simple" xlink:href="http://www.ebi.ac.uk/uniref/">http://www.ebi.ac.uk/uniref/</uri>
</source>
<year>2007</year>
</element-citation>
</ref>
<ref id="R138">
<label>138</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Letunic</surname>
<given-names>I</given-names>
</name>
<name>
<surname>Copley</surname>
<given-names>RR</given-names>
</name>
<name>
<surname>Schmidt</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Ciccarelli</surname>
<given-names>FD</given-names>
</name>
<name>
<surname>Doerks</surname>
<given-names>T</given-names>
</name>
<name>
<surname>Schultz</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Ponting</surname>
<given-names>CP</given-names>
</name>
<name>
<surname>Bork</surname>
<given-names>P</given-names>
</name>
</person-group>
<article-title>SMART 4.0: towards genomic data integration</article-title>
<source>Nucleic Acids Res</source>
<year>2004</year>
<volume>32</volume>
<issue>1</issue>
<fpage>142</fpage>
<lpage>144</lpage>
</element-citation>
</ref>
<ref id="R139">
<label>139</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Finn</surname>
<given-names>RD</given-names>
</name>
<name>
<surname>Tate</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Mistry</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Coggill</surname>
<given-names>PC</given-names>
</name>
<name>
<surname>Sammut</surname>
<given-names>SJ</given-names>
</name>
<name>
<surname>Hotz</surname>
<given-names>H-R</given-names>
</name>
<name>
<surname>Ceric</surname>
<given-names>G</given-names>
</name>
<name>
<surname>Forslund</surname>
<given-names>K</given-names>
</name>
<name>
<surname>Eddy</surname>
<given-names>SR</given-names>
</name>
<name>
<surname>Sonnhammer</surname>
<given-names>ELL</given-names>
</name>
<name>
<surname>Bateman</surname>
<given-names>A</given-names>
</name>
</person-group>
<article-title>The Pfam protein families database</article-title>
<source>Nucleic Acids Res</source>
<year>2008</year>
<volume>36</volume>
<fpage>281</fpage>
<lpage>288</lpage>
</element-citation>
</ref>
<ref id="R140">
<label>140</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Yooseph</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Li</surname>
<given-names>W</given-names>
</name>
<name>
<surname>Sutton</surname>
<given-names>G</given-names>
</name>
</person-group>
<article-title>Gene identification and protein classification in microbial metagenomic sequence data via incremental clustering</article-title>
<source>BMC Bioinformatics</source>
<year>2008</year>
<volume>9</volume>
<fpage>182</fpage>
<pub-id pub-id-type="pmid">18402669</pub-id>
</element-citation>
</ref>
<ref id="R141">
<label>141</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Hoff</surname>
<given-names>KJ</given-names>
</name>
<name>
<surname>Tech</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Lingner</surname>
<given-names>T</given-names>
</name>
<name>
<surname>Daniel</surname>
<given-names>R</given-names>
</name>
<name>
<surname>Morgenstern</surname>
<given-names>B</given-names>
</name>
<name>
<surname>Meinicke</surname>
<given-names>P</given-names>
</name>
</person-group>
<article-title>Gene prediction in metagenomic fragments: a large scale machine learning approach</article-title>
<source>BMC Bioinformatics</source>
<year>2008</year>
<volume>9</volume>
<fpage>217</fpage>
<pub-id pub-id-type="pmid">18442389</pub-id>
</element-citation>
</ref>
<ref id="R142">
<label>142</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Dinsdale</surname>
<given-names>EA</given-names>
</name>
<name>
<surname>Edwards</surname>
<given-names>RA</given-names>
</name>
<name>
<surname>Hall</surname>
<given-names>D</given-names>
</name>
<name>
<surname>Angly</surname>
<given-names>F</given-names>
</name>
<name>
<surname>Breitbart</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Brulc</surname>
<given-names>JM</given-names>
</name>
<name>
<surname>Furlan</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Desnues</surname>
<given-names>C</given-names>
</name>
<name>
<surname>Haynes</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Li</surname>
<given-names>L</given-names>
</name>
<name>
<surname>McDaniel</surname>
<given-names>L</given-names>
</name>
<name>
<surname>Moran</surname>
<given-names>MA</given-names>
</name>
<name>
<surname>Nelson</surname>
<given-names>KE</given-names>
</name>
<name>
<surname>Nilsson</surname>
<given-names>C</given-names>
</name>
<name>
<surname>Olson</surname>
<given-names>R</given-names>
</name>
<name>
<surname>Paul</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Rodriguez Brito</surname>
<given-names>B</given-names>
</name>
<name>
<surname>Ruan</surname>
<given-names>Y</given-names>
</name>
<name>
<surname>Swan</surname>
<given-names>BK</given-names>
</name>
<name>
<surname>Stevens</surname>
<given-names>R</given-names>
</name>
<name>
<surname>Valentine</surname>
<given-names>DL</given-names>
</name>
<name>
<surname>Thurber</surname>
<given-names>RV</given-names>
</name>
<name>
<surname>Wegley</surname>
<given-names>L</given-names>
</name>
<name>
<surname>White</surname>
<given-names>BA</given-names>
</name>
<name>
<surname>Rohwer</surname>
<given-names>F</given-names>
</name>
</person-group>
<article-title>Functional metagenomic profiling of nine biomes</article-title>
<source>Nature</source>
<year>2008</year>
<volume>452</volume>
<issue>7187</issue>
<fpage>629</fpage>
<lpage>632</lpage>
<pub-id pub-id-type="pmid">18337718</pub-id>
</element-citation>
</ref>
<ref id="R143">
<label>143</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Lozupone</surname>
<given-names>CA</given-names>
</name>
<name>
<surname>Knight</surname>
<given-names>R</given-names>
</name>
</person-group>
<article-title>Global patterns in bacterial diversity</article-title>
<source>Proc. Natl. Acad. Sci</source>
<year>2007</year>
<volume>104</volume>
<issue>27</issue>
<fpage>11436</fpage>
<lpage>11440</lpage>
<pub-id pub-id-type="pmid">17592124</pub-id>
</element-citation>
</ref>
<ref id="R144">
<label>144</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Krause</surname>
<given-names>L</given-names>
</name>
<name>
<surname>Diaz</surname>
<given-names>NN</given-names>
</name>
<name>
<surname>Bartels</surname>
<given-names>D</given-names>
</name>
<name>
<surname>Edwards</surname>
<given-names>RA</given-names>
</name>
<name>
<surname>Puhler</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Rohwer</surname>
<given-names>F</given-names>
</name>
<name>
<surname>Meyer</surname>
<given-names>F</given-names>
</name>
<name>
<surname>Stoye</surname>
<given-names>J</given-names>
</name>
</person-group>
<article-title>Finding novel genes in bacterial communities isolated from the environment</article-title>
<source>Bioinformatics</source>
<year>2006</year>
<volume>22</volume>
<issue>14</issue>
<fpage>281</fpage>
<lpage>289</lpage>
</element-citation>
</ref>
<ref id="R145">
<label>145</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>McGrath</surname>
<given-names>KC</given-names>
</name>
<name>
<surname>Thomas-Hall</surname>
<given-names>SR</given-names>
</name>
<name>
<surname>Cheng</surname>
<given-names>CT</given-names>
</name>
<name>
<surname>Leo</surname>
<given-names>L</given-names>
</name>
<name>
<surname>Alexa</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Schmidt</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Schenk</surname>
<given-names>PM</given-names>
</name>
</person-group>
<article-title>Isolation and analysis of mRNA from environmental microbial communities</article-title>
<source>J. Microbiol. Meth</source>
<year>2008</year>
<volume>75</volume>
<fpage>172</fpage>
<lpage>176</lpage>
</element-citation>
</ref>
<ref id="R146">
<label>146</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Scholten</surname>
<given-names>JCM</given-names>
</name>
<name>
<surname>Culley</surname>
<given-names>DE</given-names>
</name>
<name>
<surname>Nie</surname>
<given-names>L</given-names>
</name>
<name>
<surname>Munn</surname>
<given-names>KJ</given-names>
</name>
<name>
<surname>Chow</surname>
<given-names>L</given-names>
</name>
<name>
<surname>Brockman</surname>
<given-names>FJ</given-names>
</name>
<name>
<surname>Zhang</surname>
<given-names>W</given-names>
</name>
</person-group>
<article-title>Development and assessment of whole-genome oligonucleotide microarrays to analyze an anaerobic microbial community and its responses to oxidative stress</article-title>
<source>Biochem. Biophys. Res. Commun</source>
<year>2007</year>
<volume>358</volume>
<fpage>571</fpage>
<lpage>577</lpage>
<pub-id pub-id-type="pmid">17498652</pub-id>
</element-citation>
</ref>
<ref id="R147">
<label>147</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>He</surname>
<given-names>Z</given-names>
</name>
<name>
<surname>Gentry</surname>
<given-names>TJ</given-names>
</name>
<name>
<surname>Schadt</surname>
<given-names>CW</given-names>
</name>
<name>
<surname>Wu</surname>
<given-names>L</given-names>
</name>
<name>
<surname>Liebich</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Chong</surname>
<given-names>SC</given-names>
</name>
<name>
<surname>Huang</surname>
<given-names>Z</given-names>
</name>
<name>
<surname>Wu</surname>
<given-names>W</given-names>
</name>
<name>
<surname>Gu</surname>
<given-names>B</given-names>
</name>
<name>
<surname>Jardine</surname>
<given-names>P</given-names>
</name>
<name>
<surname>Criddle</surname>
<given-names>C</given-names>
</name>
<name>
<surname>Zhou</surname>
<given-names>J</given-names>
</name>
</person-group>
<article-title>GeoChip: a comprehensive microarray for investigating biogeochemical, ecological and environmental processes</article-title>
<source>ISME J</source>
<year>2007</year>
<volume>1</volume>
<fpage>67</fpage>
<lpage>77</lpage>
<pub-id pub-id-type="pmid">18043615</pub-id>
</element-citation>
</ref>
<ref id="R148">
<label>148</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Rhee</surname>
<given-names>SK</given-names>
</name>
<name>
<surname>Liu</surname>
<given-names>X</given-names>
</name>
<name>
<surname>Wu</surname>
<given-names>L</given-names>
</name>
<name>
<surname>Chong</surname>
<given-names>SC</given-names>
</name>
<name>
<surname>Wan</surname>
<given-names>X</given-names>
</name>
<name>
<surname>Zhou</surname>
<given-names>J</given-names>
</name>
</person-group>
<article-title>Detection of genes involved in biodegradation and biotransformation in microbial communities by using 50-mer oligonucleotide microarrays</article-title>
<source>Appl. Environ. Microbiol</source>
<year>2004</year>
<volume>70</volume>
<fpage>4303</fpage>
<lpage>4317</lpage>
<pub-id pub-id-type="pmid">15240314</pub-id>
</element-citation>
</ref>
<ref id="R149">
<label>149</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Yergeau</surname>
<given-names>E</given-names>
</name>
<name>
<surname>Kang</surname>
<given-names>S</given-names>
</name>
<name>
<surname>He</surname>
<given-names>Z</given-names>
</name>
<name>
<surname>Zhou</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Kowalchuk</surname>
<given-names>GA</given-names>
</name>
</person-group>
<article-title>Functional microarray analysis of nitrogen and carbon cycling genes across an Antarctic latitudinal transect</article-title>
<source>ISME J</source>
<year>2007</year>
<volume>1</volume>
<fpage>163</fpage>
<lpage>179</lpage>
<pub-id pub-id-type="pmid">18043626</pub-id>
</element-citation>
</ref>
<ref id="R150">
<label>150</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Gilbert</surname>
<given-names>JA</given-names>
</name>
<name>
<surname>Field</surname>
<given-names>D</given-names>
</name>
<name>
<surname>Huang</surname>
<given-names>Y</given-names>
</name>
<name>
<surname>Edwards</surname>
<given-names>R</given-names>
</name>
<name>
<surname>Li</surname>
<given-names>W</given-names>
</name>
<name>
<surname>Gilna</surname>
<given-names>P</given-names>
</name>
<name>
<surname>Joint</surname>
<given-names>I</given-names>
</name>
</person-group>
<article-title>Detection of large numbers of novel sequences in the meta-transcriptomes of complex marine microbial communities</article-title>
<source>PLoS ONE</source>
<year>2008</year>
<volume>3</volume>
<fpage>e3042</fpage>
<pub-id pub-id-type="pmid">18725995</pub-id>
</element-citation>
</ref>
<ref id="R151">
<label>151</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Frias-Lopez</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Shi</surname>
<given-names>Y</given-names>
</name>
<name>
<surname>Tyson</surname>
<given-names>GW</given-names>
</name>
<name>
<surname>Coleman</surname>
<given-names>ML</given-names>
</name>
<name>
<surname>Schuster</surname>
<given-names>SC</given-names>
</name>
<name>
<surname>Chisholm</surname>
<given-names>SW</given-names>
</name>
<name>
<surname>DeLong</surname>
<given-names>EF</given-names>
</name>
</person-group>
<article-title>Microbial community gene expression in ocean surface waters</article-title>
<source>Proc. Natl. Acad. Sci. USA</source>
<year>2008</year>
<volume>105</volume>
<fpage>3805</fpage>
<lpage>3810</lpage>
<pub-id pub-id-type="pmid">18316740</pub-id>
</element-citation>
</ref>
<ref id="R152">
<label>152</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Klevecz</surname>
<given-names>RR</given-names>
</name>
<name>
<surname>Li</surname>
<given-names>CM</given-names>
</name>
<name>
<surname>Bolen</surname>
<given-names>JL</given-names>
</name>
</person-group>
<article-title>Signal processing and the design of microarray time-series experiments</article-title>
<source>Meth. Mol. Biol</source>
<year>2007</year>
<volume>377</volume>
<fpage>75</fpage>
<lpage>94</lpage>
</element-citation>
</ref>
<ref id="R153">
<label>153</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Alter</surname>
<given-names>O</given-names>
</name>
</person-group>
<article-title>Genomic signal processing: from matrix algebra to genetic networks</article-title>
<source>Meth. Mol. Biol</source>
<year>2007</year>
<volume>377</volume>
<fpage>17</fpage>
<lpage>60</lpage>
</element-citation>
</ref>
<ref id="R154">
<label>154</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Alter</surname>
<given-names>O</given-names>
</name>
<name>
<surname>Brown</surname>
<given-names>PO</given-names>
</name>
<name>
<surname>Botstein</surname>
<given-names>D</given-names>
</name>
</person-group>
<article-title>Generalized singular value decomposition for comparative analysis of genome-scale expression data sets of two different organisms</article-title>
<source>Proc. Natl. Acad. Sci. USA</source>
<year>2003</year>
<volume>100</volume>
<fpage>3351</fpage>
<lpage>3356</lpage>
<pub-id pub-id-type="pmid">12631705</pub-id>
</element-citation>
</ref>
<ref id="R155">
<label>155</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Boutros</surname>
<given-names>PC</given-names>
</name>
<name>
<surname>Okey</surname>
<given-names>AB</given-names>
</name>
</person-group>
<article-title>Unsupervised pattern recognition: an introduction to the whys and wherefores of clustering microarray data</article-title>
<source>Brief. Bioinform</source>
<year>2005</year>
<volume>6</volume>
<fpage>331</fpage>
<lpage>343</lpage>
<pub-id pub-id-type="pmid">16420732</pub-id>
</element-citation>
</ref>
<ref id="R156">
<label>156</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Valafar</surname>
<given-names>F</given-names>
</name>
</person-group>
<article-title>Pattern recognition techniques in microarray data analysis: a survey</article-title>
<source>Ann. NY Acad. Sci</source>
<year>2002</year>
<volume>980</volume>
<fpage>41</fpage>
<lpage>64</lpage>
<pub-id pub-id-type="pmid">12594081</pub-id>
</element-citation>
</ref>
<ref id="R157">
<label>157</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Wilmes</surname>
<given-names>P</given-names>
</name>
<name>
<surname>Bond</surname>
<given-names>PL</given-names>
</name>
</person-group>
<article-title>Metaproteomics: studying functional gene expression in microbial ecosystems</article-title>
<source>Trends Microbiol</source>
<year>2006</year>
<volume>14</volume>
<fpage>92</fpage>
<lpage>97</lpage>
<pub-id pub-id-type="pmid">16406790</pub-id>
</element-citation>
</ref>
<ref id="R158">
<label>158</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Ram</surname>
<given-names>RJ</given-names>
</name>
<name>
<surname>VerBerkmoes</surname>
<given-names>NC</given-names>
</name>
<name>
<surname>Thelen</surname>
<given-names>MP</given-names>
</name>
<name>
<surname>Tyson</surname>
<given-names>GW</given-names>
</name>
<name>
<surname>Baker</surname>
<given-names>BJ</given-names>
</name>
<name>
<surname>Blake</surname>
<given-names>RC</given-names>
</name>
<name>
<surname>Shah</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Hettich</surname>
<given-names>RL</given-names>
</name>
<name>
<surname>Banfield</surname>
<given-names>JF</given-names>
</name>
</person-group>
<article-title>Community proteomics of a natural microbial biofilm</article-title>
<source>Science</source>
<year>2005</year>
<volume>308</volume>
<fpage>1915</fpage>
<lpage>1920</lpage>
<pub-id pub-id-type="pmid">15879173</pub-id>
</element-citation>
</ref>
<ref id="R159">
<label>159</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Wilmes</surname>
<given-names>P</given-names>
</name>
<name>
<surname>Wexler</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Bond</surname>
<given-names>PL</given-names>
</name>
</person-group>
<article-title>Metaproteomics provides functional insight into activated sludge wastewater treatment</article-title>
<source>PLoS ONE</source>
<year>2008</year>
<volume>3</volume>
<fpage>e1778</fpage>
<pub-id pub-id-type="pmid">18392150</pub-id>
</element-citation>
</ref>
<ref id="R160">
<label>160</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Denef</surname>
<given-names>VJ</given-names>
</name>
<name>
<surname>VerBerkmoes</surname>
<given-names>NC</given-names>
</name>
<name>
<surname>Shah</surname>
<given-names>MB</given-names>
</name>
<name>
<surname>Abraham</surname>
<given-names>P</given-names>
</name>
<name>
<surname>Lefsrud</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Hettich</surname>
<given-names>RL</given-names>
</name>
<name>
<surname>Banfield</surname>
<given-names>JF</given-names>
</name>
</person-group>
<article-title>Proteomics-inferred genome typing (PIGT) demonstrates inter-population recombination as a strategy for environmental adaptation</article-title>
<source>Environ. Microbiol</source>
<year>2008</year>
<comment>(In Press)</comment>
</element-citation>
</ref>
<ref id="R161">
<label>161</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Zucht</surname>
<given-names>HD</given-names>
</name>
<name>
<surname>Lamerz</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Khamenia</surname>
<given-names>V</given-names>
</name>
<name>
<surname>Schiller</surname>
<given-names>C</given-names>
</name>
<name>
<surname>Appel</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Tammen</surname>
<given-names>H</given-names>
</name>
<name>
<surname>Crameri</surname>
<given-names>R</given-names>
</name>
<name>
<surname>Selle</surname>
<given-names>H</given-names>
</name>
</person-group>
<article-title>Datamining methodology for LC-MALDI-MS based peptide profiling</article-title>
<source>Comb. Chem. High Throughput Screen</source>
<year>2005</year>
<volume>8</volume>
<fpage>717</fpage>
<lpage>723</lpage>
</element-citation>
</ref>
<ref id="R162">
<label>162</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Ressom</surname>
<given-names>HW</given-names>
</name>
<name>
<surname>Varghese</surname>
<given-names>RS</given-names>
</name>
<name>
<surname>Zhang</surname>
<given-names>Z</given-names>
</name>
<name>
<surname>Xuan</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Clarke</surname>
<given-names>R</given-names>
</name>
</person-group>
<article-title>Classification algorithms for phenotype prediction in genomics and proteomics</article-title>
<source>Front Biosci</source>
<year>2008</year>
<volume>13</volume>
<fpage>691</fpage>
<lpage>708</lpage>
<pub-id pub-id-type="pmid">17981580</pub-id>
</element-citation>
</ref>
<ref id="R163">
<label>163</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Levner</surname>
<given-names>I</given-names>
</name>
</person-group>
<article-title>Feature selection and nearest centroid classification for protein mass spectrometry</article-title>
<source>BMC Bioinformatics</source>
<year>2005</year>
<volume>6</volume>
<fpage>68</fpage>
</element-citation>
</ref>
<ref id="R164">
<label>164</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Zhang</surname>
<given-names>X</given-names>
</name>
<name>
<surname>Lu</surname>
<given-names>X</given-names>
</name>
<name>
<surname>Shi</surname>
<given-names>Q</given-names>
</name>
<name>
<surname>Zu</surname>
<given-names>XQ</given-names>
</name>
<name>
<surname>Leung</surname>
<given-names>HC</given-names>
</name>
<name>
<surname>Harris</surname>
<given-names>LN</given-names>
</name>
<name>
<surname>Iglehart</surname>
<given-names>JD</given-names>
</name>
<name>
<surname>Miron</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Liu</surname>
<given-names>JS</given-names>
</name>
<name>
<surname>Wong</surname>
<given-names>WH</given-names>
</name>
</person-group>
<article-title>Recursive SVM feature selection and sample classification for mass-spectrometry and microarray data</article-title>
<source>BMC Bioinform</source>
<year>2006</year>
<volume>7</volume>
<fpage>197</fpage>
</element-citation>
</ref>
<ref id="R165">
<label>165</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Bensmail</surname>
<given-names>H</given-names>
</name>
<name>
<surname>Golek</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Moody</surname>
<given-names>MM</given-names>
</name>
<name>
<surname>Semmes</surname>
<given-names>JO</given-names>
</name>
<name>
<surname>Haoudi</surname>
<given-names>A</given-names>
</name>
</person-group>
<article-title>A novel approach for clustering proteomics data using Bayesian fast Fourier transform</article-title>
<source>Bioinformatics</source>
<year>2005</year>
<volume>21</volume>
<fpage>2210</fpage>
<lpage>2224</lpage>
<pub-id pub-id-type="pmid">15769836</pub-id>
</element-citation>
</ref>
<ref id="R166">
<label>166</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Barla</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Jurman</surname>
<given-names>G</given-names>
</name>
<name>
<surname>Riccadonna</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Merler</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Chierici</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Furlanello</surname>
<given-names>C</given-names>
</name>
</person-group>
<article-title>Machine learning methods for predictive proteomics</article-title>
<source>Brief. Bioinform</source>
<year>2008</year>
<volume>9</volume>
<fpage>119</fpage>
<lpage>128</lpage>
<pub-id pub-id-type="pmid">18310105</pub-id>
</element-citation>
</ref>
<ref id="R167">
<label>167</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Coen</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Holmes</surname>
<given-names>E</given-names>
</name>
<name>
<surname>Lindon</surname>
<given-names>JC</given-names>
</name>
<name>
<surname>Nicholson</surname>
<given-names>JK</given-names>
</name>
</person-group>
<article-title>NMR-based metabolic profiling and metabonomic approaches to problems in molecular toxicology</article-title>
<source>Chem. Res. Toxicol</source>
<year>2008</year>
<volume>21</volume>
<fpage>9</fpage>
<lpage>27</lpage>
<pub-id pub-id-type="pmid">18171018</pub-id>
</element-citation>
</ref>
<ref id="R168">
<label>168</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Kiefer</surname>
<given-names>P</given-names>
</name>
<name>
<surname>Portais</surname>
<given-names>JC</given-names>
</name>
<name>
<surname>Vorholt</surname>
<given-names>JA</given-names>
</name>
</person-group>
<article-title>Quantitative metabolome analysis using liquid chromatography-high-resolution mass spectrometry</article-title>
<source>Anal. Biochem</source>
<year>2008</year>
<volume>382</volume>
<fpage>94</fpage>
<lpage>100</lpage>
<pub-id pub-id-type="pmid">18694716</pub-id>
</element-citation>
</ref>
<ref id="R169">
<label>169</label>
<element-citation publication-type="webpage">
<article-title>Venter Institute's Sargasso Sea Set</article-title>
<source>
<uri xlink:type="simple" xlink:href="https://research.venterinstitute.org/sargasso/">https://research.venterinstitute.org/sargasso/</uri>
</source>
<year>2008</year>
</element-citation>
</ref>
<ref id="R170">
<label>170</label>
<element-citation publication-type="webpage">
<article-title>Human Gut Microbiome Initiative (HGMI)</article-title>
<source>
<uri xlink:type="simple" xlink:href="http://genome.wustl.edu/hgm/HGM_frontpage.cgi">http://genome.wustl.edu/hgm/HGM_frontpage.cgi</uri>
</source>
<year>2008</year>
</element-citation>
</ref>
<ref id="R171">
<label>171</label>
<element-citation publication-type="webpage">
<source>
<uri xlink:type="simple" xlink:href="http://hmp.nih.gov/">http://hmp.nih.gov/</uri>
</source>
<year>2009</year>
</element-citation>
</ref>
<ref id="R172">
<label>172</label>
<element-citation publication-type="webpage">
<source>
<uri xlink:type="simple" xlink:href="http://img.jgi.doe.gov/m">http://img.jgi.doe.gov/m</uri>
</source>
<year>2009</year>
</element-citation>
</ref>
<ref id="R173">
<label>173</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Markowitz</surname>
<given-names>VM</given-names>
</name>
<name>
<surname>Ivanova</surname>
<given-names>NN</given-names>
</name>
<name>
<surname>Szeto</surname>
<given-names>E</given-names>
</name>
<name>
<surname>Palaniappan</surname>
<given-names>K</given-names>
</name>
<name>
<surname>Chu</surname>
<given-names>K</given-names>
</name>
<name>
<surname>Dalevi</surname>
<given-names>D</given-names>
</name>
<name>
<surname>Chen</surname>
<given-names>I-MA</given-names>
</name>
<name>
<surname>Grechkin</surname>
<given-names>Y</given-names>
</name>
<name>
<surname>Dubchak</surname>
<given-names>I</given-names>
</name>
<name>
<surname>Anderson</surname>
<given-names>I</given-names>
</name>
<name>
<surname>Lykidis</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Mavromatis</surname>
<given-names>K</given-names>
</name>
<name>
<surname>Hugenholtz</surname>
<given-names>P</given-names>
</name>
<name>
<surname>Kyrpides</surname>
<given-names>NC</given-names>
</name>
</person-group>
<article-title>IMG/M: a data management and analysis system for metagenomes</article-title>
<source>Nucleic Acids Res</source>
<year>2007</year>
<volume>36</volume>
<fpage>534</fpage>
<lpage>538</lpage>
</element-citation>
</ref>
<ref id="R174">
<label>174</label>
<element-citation publication-type="webpage">
<article-title>SDSU Center for Universal Microbial Sequencing</article-title>
<source>
<uri xlink:type="simple" xlink:href="http://scums.sdsu.edu/">http://scums.sdsu.edu/</uri>
</source>
<year>2008</year>
</element-citation>
</ref>
<ref id="R175">
<label>175</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Marcy</surname>
<given-names>Y</given-names>
</name>
<name>
<surname>Ouverney</surname>
<given-names>C</given-names>
</name>
<name>
<surname>Bik</surname>
<given-names>EM</given-names>
</name>
<name>
<surname>Losekann</surname>
<given-names>T</given-names>
</name>
<name>
<surname>Ivanova</surname>
<given-names>N</given-names>
</name>
<name>
<surname>Martin</surname>
<given-names>HG</given-names>
</name>
<name>
<surname>Szeto</surname>
<given-names>E</given-names>
</name>
<name>
<surname>Platt</surname>
<given-names>D</given-names>
</name>
<name>
<surname>Hugenholtz</surname>
<given-names>P</given-names>
</name>
<name>
<surname>Relman</surname>
<given-names>DA</given-names>
</name>
<name>
<surname>Quake</surname>
<given-names>SR</given-names>
</name>
</person-group>
<article-title>Dissecting biological “dark matter” with single-cell genetic analysis of rare and uncultivated TM7 microbes from the human mouth</article-title>
<source>Proc. Natl. Acad. Sci. USA</source>
<year>2007</year>
<volume>104</volume>
<issue>29</issue>
<fpage>11889</fpage>
<lpage>11894</lpage>
<pub-id pub-id-type="pmid">17620602</pub-id>
</element-citation>
</ref>
<ref id="R176">
<label>176</label>
<element-citation publication-type="webpage">
<source>
<uri xlink:type="simple" xlink:href="http://www.homd.org">http://www.homd.org</uri>
</source>
<year>2008</year>
</element-citation>
</ref>
<ref id="R177">
<label>177</label>
<element-citation publication-type="webpage">
<source>
<uri xlink:type="simple" xlink:href="http://camera.calit2.net">http://camera.calit2.net</uri>
</source>
<year>2008</year>
</element-citation>
</ref>
<ref id="R178">
<label>178</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Seshadri</surname>
<given-names>R</given-names>
</name>
<name>
<surname>Kravitz</surname>
<given-names>SA</given-names>
</name>
<name>
<surname>Smarr</surname>
<given-names>L</given-names>
</name>
<name>
<surname>Gilna</surname>
<given-names>P</given-names>
</name>
<name>
<surname>Frazier</surname>
<given-names>M</given-names>
</name>
</person-group>
<article-title>CAMERA: A community resource for metagenomics</article-title>
<source>PLoS Biol</source>
<year>2007</year>
<volume>5</volume>
<issue>3</issue>
<fpage>e75</fpage>
<pub-id pub-id-type="pmid">17355175</pub-id>
</element-citation>
</ref>
<ref id="R179">
<label>179</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Meyer</surname>
<given-names>F</given-names>
</name>
<name>
<surname>Paarmann</surname>
<given-names>D</given-names>
</name>
<name>
<surname>D'Souza</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Olson</surname>
<given-names>R</given-names>
</name>
<name>
<surname>Glass</surname>
<given-names>EM</given-names>
</name>
<name>
<surname>Kubal</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Paczian</surname>
<given-names>T</given-names>
</name>
<name>
<surname>Rodriguez</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Stevens</surname>
<given-names>R</given-names>
</name>
<name>
<surname>Wilke</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Wilkening</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Edwards</surname>
<given-names>RA</given-names>
</name>
</person-group>
<article-title>The metagenomics RAST server - a public resource for the automatic phylogenetic and functional analysis of metagenomes</article-title>
<source>BMC Bioinform</source>
<year>2008</year>
<volume>9</volume>
<fpage>386</fpage>
</element-citation>
</ref>
<ref id="R180">
<label>180</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Garrity</surname>
<given-names>GM</given-names>
</name>
<name>
<surname>Field</surname>
<given-names>D</given-names>
</name>
<name>
<surname>Kyrpides</surname>
<given-names>N</given-names>
</name>
<name>
<surname>Hirschman</surname>
<given-names>L</given-names>
</name>
<name>
<surname>Sansone</surname>
<given-names>SA</given-names>
</name>
<name>
<surname>Angiuoli</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Cole</surname>
<given-names>JR</given-names>
</name>
<name>
<surname>Glockner</surname>
<given-names>FO</given-names>
</name>
<name>
<surname>Kolker</surname>
<given-names>E</given-names>
</name>
<name>
<surname>Kowalchuk</surname>
<given-names>G</given-names>
</name>
<name>
<surname>Moran</surname>
<given-names>MA</given-names>
</name>
<name>
<surname>Ussery</surname>
<given-names>D</given-names>
</name>
<name>
<surname>White</surname>
<given-names>O</given-names>
</name>
</person-group>
<article-title>Toward a standards-compliant genomic and metagenomic publication record</article-title>
<source>OMICS: J. Integr. Biol</source>
<year>2008</year>
<volume>12</volume>
<issue>2</issue>
<fpage>157</fpage>
<lpage>160</lpage>
</element-citation>
</ref>
<ref id="R181">
<label>181</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>DeLong</surname>
<given-names>EF</given-names>
</name>
</person-group>
<article-title>Microbial community genomics in the ocean</article-title>
<source>Nat. Rev. Microbiol</source>
<year>2005</year>
<volume>3</volume>
<fpage>459</fpage>
<lpage>469</lpage>
<pub-id pub-id-type="pmid">15886695</pub-id>
</element-citation>
</ref>
<ref id="R182">
<label>182</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Richter</surname>
<given-names>DC</given-names>
</name>
<name>
<surname>Ott</surname>
<given-names>F</given-names>
</name>
<name>
<surname>Auch</surname>
<given-names>AF</given-names>
</name>
<name>
<surname>Schmid</surname>
<given-names>R</given-names>
</name>
<name>
<surname>Huson</surname>
<given-names>DH</given-names>
</name>
</person-group>
<article-title>MetaSim: a sequencing simulator for genomics and metagenomics</article-title>
<source>PLoS ONE</source>
<year>2008</year>
<comment>
<italic>doi:10.1371/journal.pone.0003373</italic>
</comment>
</element-citation>
</ref>
<ref id="R183">
<label>183</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Turnbaugh</surname>
<given-names>PJ</given-names>
</name>
<name>
<surname>Ley</surname>
<given-names>RE</given-names>
</name>
<name>
<surname>Mahowald</surname>
<given-names>MA</given-names>
</name>
<name>
<surname>Magrini</surname>
<given-names>V</given-names>
</name>
<name>
<surname>Mardis</surname>
<given-names>ER</given-names>
</name>
<name>
<surname>Gordon</surname>
<given-names>JI</given-names>
</name>
</person-group>
<article-title>An obesity-associated gut microbiome with increased capacity for energy harvest</article-title>
<source>Nature</source>
<year>2006</year>
<volume>444</volume>
<issue>7122</issue>
<fpage>1027</fpage>
<lpage>1031</lpage>
<pub-id pub-id-type="pmid">17183312</pub-id>
</element-citation>
</ref>
<ref id="R184">
<label>184</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Ley</surname>
<given-names>RE</given-names>
</name>
<name>
<surname>Backhed</surname>
<given-names>F</given-names>
</name>
<name>
<surname>Turnbaugh</surname>
<given-names>P</given-names>
</name>
<name>
<surname>Lozupone</surname>
<given-names>CA</given-names>
</name>
<name>
<surname>Knight</surname>
<given-names>RD</given-names>
</name>
<name>
<surname>Gordon</surname>
<given-names>JI</given-names>
</name>
</person-group>
<article-title>Obesity alters gut microbial ecology</article-title>
<source>Proc. Natl. Acad. Sci. USA</source>
<year>2005</year>
<volume>102</volume>
<issue>31</issue>
<fpage>11070</fpage>
<lpage>11075</lpage>
<pub-id pub-id-type="pmid">16033867</pub-id>
</element-citation>
</ref>
</ref-list>
</back>
<floats-group>
<fig id="F1" position="float">
<label>Fig. (1)</label>
<caption>
<p>Comparison of the speech classification problem to the DNA classification problem.</p>
</caption>
<graphic xlink:href="CG-10-493_F1"></graphic>
</fig>
<fig id="F2" position="float">
<label>Fig. (2)</label>
<caption>
<p>The first metagenomics dataset was shotgun sequenced,
<italic>via</italic>
the Sanger method, in 2003. Since then, pyrosequencing has come into use to obtain cheaper, highly parallel reads. The timeline illustrates some of the metagenomics datasets sequenced to date; it is a subset of all completed projects [
<xref ref-type="bibr" rid="R40">40</xref>
].</p>
</caption>
<graphic xlink:href="CG-10-493_F2"></graphic>
</fig>
</floats-group>
</pmc>
</record>

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Ticri/CIDE/explor/CyberinfraV1/Data/Pmc/Corpus
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000515 | SxmlIndent | more

Or

HfdSelect -h $EXPLOR_AREA/Data/Pmc/Corpus/biblio.hfd -nk 000515 | SxmlIndent | more

To link to this page within the Wicri network

{{Explor lien
   |wiki=    Ticri/CIDE
   |area=    CyberinfraV1
   |flux=    Pmc
   |étape=   Corpus
   |type=    RBID
   |clé=     PMC:2808676
   |texte=   Signal Processing for Metagenomics: Extracting Information from the Soup
}}

To generate wiki pages

HfdIndexSelect -h $EXPLOR_AREA/Data/Pmc/Corpus/RBID.i   -Sk "pubmed:20436876" \
       | HfdSelect -Kh $EXPLOR_AREA/Data/Pmc/Corpus/biblio.hfd   \
       | NlmPubMed2Wicri -a CyberinfraV1 

Wicri

This area was generated with Dilib version V0.6.25.
Data generation: Thu Oct 27 09:30:58 2016. Site generation: Sun Mar 10 23:08:40 2024