Cyberinfrastructure exploration server

Warning: this site is under development!
Warning: this site was generated automatically from raw corpora.
The information has therefore not been validated.

Internal identifier: 0005069 (Pmc/Corpus); previous: 0005068; next: 0005070

The document in XML format

<record>
<TEI>
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en">Functional assignment of metagenomic data: challenges and applications</title>
<author>
<name sortKey="Prakash, Tulika" sort="Prakash, Tulika" uniqKey="Prakash T" first="Tulika" last="Prakash">Tulika Prakash</name>
</author>
<author>
<name sortKey="Taylor, Todd D" sort="Taylor, Todd D" uniqKey="Taylor T" first="Todd D." last="Taylor">Todd D. Taylor</name>
</author>
</titleStmt>
<publicationStmt>
<idno type="wicri:source">PMC</idno>
<idno type="pmid">22772835</idno>
<idno type="pmc">3504928</idno>
<idno type="url">http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3504928</idno>
<idno type="RBID">PMC:3504928</idno>
<idno type="doi">10.1093/bib/bbs033</idno>
<date when="2012">2012</date>
<idno type="wicri:Area/Pmc/Corpus">000506</idno>
</publicationStmt>
<sourceDesc>
<biblStruct>
<analytic>
<title xml:lang="en" level="a" type="main">Functional assignment of metagenomic data: challenges and applications</title>
<author>
<name sortKey="Prakash, Tulika" sort="Prakash, Tulika" uniqKey="Prakash T" first="Tulika" last="Prakash">Tulika Prakash</name>
</author>
<author>
<name sortKey="Taylor, Todd D" sort="Taylor, Todd D" uniqKey="Taylor T" first="Todd D." last="Taylor">Todd D. Taylor</name>
</author>
</analytic>
<series>
<title level="j">Briefings in Bioinformatics</title>
<idno type="ISSN">1467-5463</idno>
<idno type="eISSN">1477-4054</idno>
<imprint>
<date when="2012">2012</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
</fileDesc>
<profileDesc>
<textClass></textClass>
</profileDesc>
</teiHeader>
<front>
<div type="abstract" xml:lang="en">
<p>Metagenomic sequencing provides a unique opportunity to explore Earth’s limitless environments harboring scores of yet unknown and mostly unculturable microbes and other organisms. Functional analysis of the metagenomic data plays a central role in projects aiming to explore the most essential questions in microbiology, namely ‘In a given environment, among the microbes present, what are they doing, and how are they doing it?’ Toward this goal, several large-scale metagenomic projects have recently been conducted or are currently underway. Functional analysis of metagenomic data mainly suffers from the vast amount of data generated in these projects. The sheer amount of data requires much computational time and storage space. These problems are compounded by other factors potentially affecting the functional analysis, including sample preparation, sequencing method and average genome size of the metagenomic samples. In addition, the read-lengths generated during sequencing influence sequence assembly, gene prediction and subsequently the functional analysis. The level of confidence for functional predictions increases with increasing read-length. Usually, the most reliable functional annotations for metagenomic sequences are achieved using homology-based approaches against publicly available reference sequence databases. Here, we present an overview of the current state of functional analysis of metagenomic sequence data, bottlenecks frequently encountered and possible solutions in light of currently available resources and tools. Finally, we provide some examples of applications from recent metagenomic studies which have been successfully conducted in spite of the known difficulties.</p>
</div>
</front>
<back>
<div1 type="bibliography">
<listBibl>
<biblStruct>
<analytic>
<author>
<name sortKey="Pace, Nr" uniqKey="Pace N">NR Pace</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Tringe, Sg" uniqKey="Tringe S">SG Tringe</name>
</author>
<author>
<name sortKey="Rubin, Em" uniqKey="Rubin E">EM Rubin</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Kunin, V" uniqKey="Kunin V">V Kunin</name>
</author>
<author>
<name sortKey="Copeland, A" uniqKey="Copeland A">A Copeland</name>
</author>
<author>
<name sortKey="Lapidus, A" uniqKey="Lapidus A">A Lapidus</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Wooley, Jc" uniqKey="Wooley J">JC Wooley</name>
</author>
<author>
<name sortKey="Godzik, A" uniqKey="Godzik A">A Godzik</name>
</author>
<author>
<name sortKey="Friedberg, I" uniqKey="Friedberg I">I Friedberg</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Batzoglou, S" uniqKey="Batzoglou S">S Batzoglou</name>
</author>
<author>
<name sortKey="Jaffe, Db" uniqKey="Jaffe D">DB Jaffe</name>
</author>
<author>
<name sortKey="Stanley, K" uniqKey="Stanley K">K Stanley</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Aparicio, S" uniqKey="Aparicio S">S Aparicio</name>
</author>
<author>
<name sortKey="Chapman, J" uniqKey="Chapman J">J Chapman</name>
</author>
<author>
<name sortKey="Stupka, E" uniqKey="Stupka E">E Stupka</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Myers, Ew" uniqKey="Myers E">EW Myers</name>
</author>
<author>
<name sortKey="Sutton, Gg" uniqKey="Sutton G">GG Sutton</name>
</author>
<author>
<name sortKey="Delcher, Al" uniqKey="Delcher A">AL Delcher</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Zerbino, Dr" uniqKey="Zerbino D">DR Zerbino</name>
</author>
<author>
<name sortKey="Birney, E" uniqKey="Birney E">E Birney</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Li, R" uniqKey="Li R">R Li</name>
</author>
<author>
<name sortKey="Zhu, H" uniqKey="Zhu H">H Zhu</name>
</author>
<author>
<name sortKey="Ruan, J" uniqKey="Ruan J">J Ruan</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Pevzner, Pa" uniqKey="Pevzner P">PA Pevzner</name>
</author>
<author>
<name sortKey="Tang, H" uniqKey="Tang H">H Tang</name>
</author>
<author>
<name sortKey="Waterman, Ms" uniqKey="Waterman M">MS Waterman</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Ye, Y" uniqKey="Ye Y">Y Ye</name>
</author>
<author>
<name sortKey="Tang, H" uniqKey="Tang H">H Tang</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Peng, Y" uniqKey="Peng Y">Y Peng</name>
</author>
<author>
<name sortKey="Leung, Hc" uniqKey="Leung H">HC Leung</name>
</author>
<author>
<name sortKey="Yiu, Sm" uniqKey="Yiu S">SM Yiu</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Noguchi, H" uniqKey="Noguchi H">H Noguchi</name>
</author>
<author>
<name sortKey="Park, J" uniqKey="Park J">J Park</name>
</author>
<author>
<name sortKey="Takagi, T" uniqKey="Takagi T">T Takagi</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Zhu, W" uniqKey="Zhu W">W Zhu</name>
</author>
<author>
<name sortKey="Lomsadze, A" uniqKey="Lomsadze A">A Lomsadze</name>
</author>
<author>
<name sortKey="Borodovsky, M" uniqKey="Borodovsky M">M Borodovsky</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Rho, M" uniqKey="Rho M">M Rho</name>
</author>
<author>
<name sortKey="Tang, H" uniqKey="Tang H">H Tang</name>
</author>
<author>
<name sortKey="Ye, Y" uniqKey="Ye Y">Y Ye</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Delcher, Al" uniqKey="Delcher A">AL Delcher</name>
</author>
<author>
<name sortKey="Harmon, D" uniqKey="Harmon D">D Harmon</name>
</author>
<author>
<name sortKey="Kasif, S" uniqKey="Kasif S">S Kasif</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Altschul, Sf" uniqKey="Altschul S">SF Altschul</name>
</author>
<author>
<name sortKey="Gish, W" uniqKey="Gish W">W Gish</name>
</author>
<author>
<name sortKey="Miller, W" uniqKey="Miller W">W Miller</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Lowe, Tm" uniqKey="Lowe T">TM Lowe</name>
</author>
<author>
<name sortKey="Eddy, Sr" uniqKey="Eddy S">SR Eddy</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Sharma, Vk" uniqKey="Sharma V">VK Sharma</name>
</author>
<author>
<name sortKey="Kumar, N" uniqKey="Kumar N">N Kumar</name>
</author>
<author>
<name sortKey="Prakash, T" uniqKey="Prakash T">T Prakash</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Huson, Dh" uniqKey="Huson D">DH Huson</name>
</author>
<author>
<name sortKey="Auch, Af" uniqKey="Auch A">AF Auch</name>
</author>
<author>
<name sortKey="Qi, J" uniqKey="Qi J">J Qi</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Gerlach, W" uniqKey="Gerlach W">W Gerlach</name>
</author>
<author>
<name sortKey="Junemann, S" uniqKey="Junemann S">S Junemann</name>
</author>
<author>
<name sortKey="Tille, F" uniqKey="Tille F">F Tille</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Mchardy, Ac" uniqKey="Mchardy A">AC McHardy</name>
</author>
<author>
<name sortKey="Martin, Hg" uniqKey="Martin H">HG Martin</name>
</author>
<author>
<name sortKey="Tsirigos, A" uniqKey="Tsirigos A">A Tsirigos</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Teeling, H" uniqKey="Teeling H">H Teeling</name>
</author>
<author>
<name sortKey="Waldmann, J" uniqKey="Waldmann J">J Waldmann</name>
</author>
<author>
<name sortKey="Lombardot, T" uniqKey="Lombardot T">T Lombardot</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Rosen, Gl" uniqKey="Rosen G">GL Rosen</name>
</author>
<author>
<name sortKey="Reichenberger, Er" uniqKey="Reichenberger E">ER Reichenberger</name>
</author>
<author>
<name sortKey="Rosenfeld, Am" uniqKey="Rosenfeld A">AM Rosenfeld</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Diaz, Nn" uniqKey="Diaz N">NN Diaz</name>
</author>
<author>
<name sortKey="Krause, L" uniqKey="Krause L">L Krause</name>
</author>
<author>
<name sortKey="Goesmann, A" uniqKey="Goesmann A">A Goesmann</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Markowitz, Vm" uniqKey="Markowitz V">VM Markowitz</name>
</author>
<author>
<name sortKey="Chen, Im" uniqKey="Chen I">IM Chen</name>
</author>
<author>
<name sortKey="Chu, K" uniqKey="Chu K">K Chu</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Goll, J" uniqKey="Goll J">J Goll</name>
</author>
<author>
<name sortKey="Rusch, Db" uniqKey="Rusch D">DB Rusch</name>
</author>
<author>
<name sortKey="Tanenbaum, Dm" uniqKey="Tanenbaum D">DM Tanenbaum</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Sun, S" uniqKey="Sun S">S Sun</name>
</author>
<author>
<name sortKey="Chen, J" uniqKey="Chen J">J Chen</name>
</author>
<author>
<name sortKey="Li, W" uniqKey="Li W">W Li</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Li, W" uniqKey="Li W">W Li</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Glass, Em" uniqKey="Glass E">EM Glass</name>
</author>
<author>
<name sortKey="Wilkening, J" uniqKey="Wilkening J">J Wilkening</name>
</author>
<author>
<name sortKey="Wilke, A" uniqKey="Wilke A">A Wilke</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Arumugam, M" uniqKey="Arumugam M">M Arumugam</name>
</author>
<author>
<name sortKey="Harrington, Ed" uniqKey="Harrington E">ED Harrington</name>
</author>
<author>
<name sortKey="Foerstner, Ku" uniqKey="Foerstner K">KU Foerstner</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Huson, Dh" uniqKey="Huson D">DH Huson</name>
</author>
<author>
<name sortKey="Mitra, S" uniqKey="Mitra S">S Mitra</name>
</author>
<author>
<name sortKey="Ruscheweyh, Hj" uniqKey="Ruscheweyh H">HJ Ruscheweyh</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Lingner, T" uniqKey="Lingner T">T Lingner</name>
</author>
<author>
<name sortKey="Asshauer, Kp" uniqKey="Asshauer K">KP Asshauer</name>
</author>
<author>
<name sortKey="Schreiber, F" uniqKey="Schreiber F">F Schreiber</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Wu, S" uniqKey="Wu S">S Wu</name>
</author>
<author>
<name sortKey="Zhu, Z" uniqKey="Zhu Z">Z Zhu</name>
</author>
<author>
<name sortKey="Fu, L" uniqKey="Fu L">L Fu</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Mende, Dr" uniqKey="Mende D">DR Mende</name>
</author>
<author>
<name sortKey="Waller, As" uniqKey="Waller A">AS Waller</name>
</author>
<author>
<name sortKey="Sunagawa, S" uniqKey="Sunagawa S">S Sunagawa</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Raes, J" uniqKey="Raes J">J Raes</name>
</author>
<author>
<name sortKey="Foerstner, Ku" uniqKey="Foerstner K">KU Foerstner</name>
</author>
<author>
<name sortKey="Bork, P" uniqKey="Bork P">P Bork</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Pignatelli, M" uniqKey="Pignatelli M">M Pignatelli</name>
</author>
<author>
<name sortKey="Moya, A" uniqKey="Moya A">A Moya</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Yok, Ng" uniqKey="Yok N">NG Yok</name>
</author>
<author>
<name sortKey="Rosen, Gl" uniqKey="Rosen G">GL Rosen</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Kurokawa, K" uniqKey="Kurokawa K">K Kurokawa</name>
</author>
<author>
<name sortKey="Itoh, T" uniqKey="Itoh T">T Itoh</name>
</author>
<author>
<name sortKey="Kuwahara, T" uniqKey="Kuwahara T">T Kuwahara</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Ivanov, Ii" uniqKey="Ivanov I">II Ivanov</name>
</author>
<author>
<name sortKey="Atarashi, K" uniqKey="Atarashi K">K Atarashi</name>
</author>
<author>
<name sortKey="Manel, N" uniqKey="Manel N">N Manel</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Prakash, T" uniqKey="Prakash T">T Prakash</name>
</author>
<author>
<name sortKey="Oshima, K" uniqKey="Oshima K">K Oshima</name>
</author>
<author>
<name sortKey="Morita, H" uniqKey="Morita H">H Morita</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Ventura, M" uniqKey="Ventura M">M Ventura</name>
</author>
<author>
<name sortKey="O Onnell Motherway, M" uniqKey="O Onnell Motherway M">M O’Connell-Motherway</name>
</author>
<author>
<name sortKey="Leahy, S" uniqKey="Leahy S">S Leahy</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Richter, Dc" uniqKey="Richter D">DC Richter</name>
</author>
<author>
<name sortKey="Ott, F" uniqKey="Ott F">F Ott</name>
</author>
<author>
<name sortKey="Auch, Af" uniqKey="Auch A">AF Auch</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Kanehisa, M" uniqKey="Kanehisa M">M Kanehisa</name>
</author>
<author>
<name sortKey="Goto, S" uniqKey="Goto S">S Goto</name>
</author>
<author>
<name sortKey="Sato, Y" uniqKey="Sato Y">Y Sato</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Ashburner, M" uniqKey="Ashburner M">M Ashburner</name>
</author>
<author>
<name sortKey="Ball, Ca" uniqKey="Ball C">CA Ball</name>
</author>
<author>
<name sortKey="Blake, Ja" uniqKey="Blake J">JA Blake</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Overbeek, R" uniqKey="Overbeek R">R Overbeek</name>
</author>
<author>
<name sortKey="Begley, T" uniqKey="Begley T">T Begley</name>
</author>
<author>
<name sortKey="Butler, Rm" uniqKey="Butler R">RM Butler</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Sayers, Ew" uniqKey="Sayers E">EW Sayers</name>
</author>
<author>
<name sortKey="Barrett, T" uniqKey="Barrett T">T Barrett</name>
</author>
<author>
<name sortKey="Benson, Da" uniqKey="Benson D">DA Benson</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Letunic, I" uniqKey="Letunic I">I Letunic</name>
</author>
<author>
<name sortKey="Doerks, T" uniqKey="Doerks T">T Doerks</name>
</author>
<author>
<name sortKey="Bork, P" uniqKey="Bork P">P Bork</name>
</author>
</analytic>
</biblStruct>
<biblStruct></biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Tatusov, Rl" uniqKey="Tatusov R">RL Tatusov</name>
</author>
<author>
<name sortKey="Koonin, Ev" uniqKey="Koonin E">EV Koonin</name>
</author>
<author>
<name sortKey="Lipman, Dj" uniqKey="Lipman D">DJ Lipman</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Powell, S" uniqKey="Powell S">S Powell</name>
</author>
<author>
<name sortKey="Szklarczyk, D" uniqKey="Szklarczyk D">D Szklarczyk</name>
</author>
<author>
<name sortKey="Trachana, K" uniqKey="Trachana K">K Trachana</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Punta, M" uniqKey="Punta M">M Punta</name>
</author>
<author>
<name sortKey="Coggill, Pc" uniqKey="Coggill P">PC Coggill</name>
</author>
<author>
<name sortKey="Eberhardt, Ry" uniqKey="Eberhardt R">RY Eberhardt</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Selengut, Jd" uniqKey="Selengut J">JD Selengut</name>
</author>
<author>
<name sortKey="Haft, Dh" uniqKey="Haft D">DH Haft</name>
</author>
<author>
<name sortKey="Davidsen, T" uniqKey="Davidsen T">T Davidsen</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Kent, Wj" uniqKey="Kent W">WJ Kent</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Finn, Rd" uniqKey="Finn R">RD Finn</name>
</author>
<author>
<name sortKey="Clements, J" uniqKey="Clements J">J Clements</name>
</author>
<author>
<name sortKey="Eddy, Sr" uniqKey="Eddy S">SR Eddy</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Brenner, Se" uniqKey="Brenner S">SE Brenner</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Tatusov, Rl" uniqKey="Tatusov R">RL Tatusov</name>
</author>
<author>
<name sortKey="Fedorova, Nd" uniqKey="Fedorova N">ND Fedorova</name>
</author>
<author>
<name sortKey="Jackson, Jd" uniqKey="Jackson J">JD Jackson</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Garcia, Mh" uniqKey="Garcia M">MH Garcia</name>
</author>
<author>
<name sortKey="Ivanova, N" uniqKey="Ivanova N">N Ivanova</name>
</author>
<author>
<name sortKey="Kunin, V" uniqKey="Kunin V">V Kunin</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Berg Miller, Me" uniqKey="Berg Miller M">ME Berg Miller</name>
</author>
<author>
<name sortKey="Yeoman, Cj" uniqKey="Yeoman C">CJ Yeoman</name>
</author>
<author>
<name sortKey="Chia, N" uniqKey="Chia N">N Chia</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Delong, Ef" uniqKey="Delong E">EF DeLong</name>
</author>
<author>
<name sortKey="Preston, Cm" uniqKey="Preston C">CM Preston</name>
</author>
<author>
<name sortKey="Mincer, T" uniqKey="Mincer T">T Mincer</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Tringe, Sg" uniqKey="Tringe S">SG Tringe</name>
</author>
<author>
<name sortKey="Von, Mc" uniqKey="Von M">MC von</name>
</author>
<author>
<name sortKey="Kobayashi, A" uniqKey="Kobayashi A">A Kobayashi</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Tyson, Gw" uniqKey="Tyson G">GW Tyson</name>
</author>
<author>
<name sortKey="Chapman, J" uniqKey="Chapman J">J Chapman</name>
</author>
<author>
<name sortKey="Hugenholtz, P" uniqKey="Hugenholtz P">P Hugenholtz</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Gill, Sr" uniqKey="Gill S">SR Gill</name>
</author>
<author>
<name sortKey="Pop, M" uniqKey="Pop M">M Pop</name>
</author>
<author>
<name sortKey="Deboy, Rt" uniqKey="Deboy R">RT Deboy</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Sigrist, Cj" uniqKey="Sigrist C">CJ Sigrist</name>
</author>
<author>
<name sortKey="Cerutti, L" uniqKey="Cerutti L">L Cerutti</name>
</author>
<author>
<name sortKey="De, Ce" uniqKey="De C">CE de</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Attwood, Tk" uniqKey="Attwood T">TK Attwood</name>
</author>
<author>
<name sortKey="Bradley, P" uniqKey="Bradley P">P Bradley</name>
</author>
<author>
<name sortKey="Flower, Dr" uniqKey="Flower D">DR Flower</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Hunter, S" uniqKey="Hunter S">S Hunter</name>
</author>
<author>
<name sortKey="Jones, P" uniqKey="Jones P">P Jones</name>
</author>
<author>
<name sortKey="Mitchell, A" uniqKey="Mitchell A">A Mitchell</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Lee, D" uniqKey="Lee D">D Lee</name>
</author>
<author>
<name sortKey="Redfern, O" uniqKey="Redfern O">O Redfern</name>
</author>
<author>
<name sortKey="Orengo, C" uniqKey="Orengo C">C Orengo</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Dandekar, T" uniqKey="Dandekar T">T Dandekar</name>
</author>
<author>
<name sortKey="Snel, B" uniqKey="Snel B">B Snel</name>
</author>
<author>
<name sortKey="Huynen, M" uniqKey="Huynen M">M Huynen</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Overbeek, R" uniqKey="Overbeek R">R Overbeek</name>
</author>
<author>
<name sortKey="Fonstein, M" uniqKey="Fonstein M">M Fonstein</name>
</author>
<author>
<name sortKey="D Ouza, M" uniqKey="D Ouza M">M D’Souza</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Enright, Aj" uniqKey="Enright A">AJ Enright</name>
</author>
<author>
<name sortKey="Iliopoulos, I" uniqKey="Iliopoulos I">I Iliopoulos</name>
</author>
<author>
<name sortKey="Kyrpides, Nc" uniqKey="Kyrpides N">NC Kyrpides</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Marcotte, Em" uniqKey="Marcotte E">EM Marcotte</name>
</author>
<author>
<name sortKey="Pellegrini, M" uniqKey="Pellegrini M">M Pellegrini</name>
</author>
<author>
<name sortKey="Ng, Hl" uniqKey="Ng H">HL Ng</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Pellegrini, M" uniqKey="Pellegrini M">M Pellegrini</name>
</author>
<author>
<name sortKey="Marcotte, Em" uniqKey="Marcotte E">EM Marcotte</name>
</author>
<author>
<name sortKey="Thompson, Mj" uniqKey="Thompson M">MJ Thompson</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Marcotte, Em" uniqKey="Marcotte E">EM Marcotte</name>
</author>
<author>
<name sortKey="Pellegrini, M" uniqKey="Pellegrini M">M Pellegrini</name>
</author>
<author>
<name sortKey="Thompson, Mj" uniqKey="Thompson M">MJ Thompson</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Harrington, Ed" uniqKey="Harrington E">ED Harrington</name>
</author>
<author>
<name sortKey="Singh, Ah" uniqKey="Singh A">AH Singh</name>
</author>
<author>
<name sortKey="Doerks, T" uniqKey="Doerks T">T Doerks</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Sachdeva, G" uniqKey="Sachdeva G">G Sachdeva</name>
</author>
<author>
<name sortKey="Kumar, K" uniqKey="Kumar K">K Kumar</name>
</author>
<author>
<name sortKey="Jain, P" uniqKey="Jain P">P Jain</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Turnbaugh, Pj" uniqKey="Turnbaugh P">PJ Turnbaugh</name>
</author>
<author>
<name sortKey="Ley, Re" uniqKey="Ley R">RE Ley</name>
</author>
<author>
<name sortKey="Mahowald, Ma" uniqKey="Mahowald M">MA Mahowald</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Turnbaugh, Pj" uniqKey="Turnbaugh P">PJ Turnbaugh</name>
</author>
<author>
<name sortKey="Hamady, M" uniqKey="Hamady M">M Hamady</name>
</author>
<author>
<name sortKey="Yatsunenko, T" uniqKey="Yatsunenko T">T Yatsunenko</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Mackelprang, R" uniqKey="Mackelprang R">R Mackelprang</name>
</author>
<author>
<name sortKey="Waldrop, Mp" uniqKey="Waldrop M">MP Waldrop</name>
</author>
<author>
<name sortKey="Deangelis, Km" uniqKey="Deangelis K">KM DeAngelis</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Brulc, Jm" uniqKey="Brulc J">JM Brulc</name>
</author>
<author>
<name sortKey="Antonopoulos, Da" uniqKey="Antonopoulos D">DA Antonopoulos</name>
</author>
<author>
<name sortKey="Miller, Me" uniqKey="Miller M">ME Miller</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Willner, D" uniqKey="Willner D">D Willner</name>
</author>
<author>
<name sortKey="Furlan, M" uniqKey="Furlan M">M Furlan</name>
</author>
<author>
<name sortKey="Haynes, M" uniqKey="Haynes M">M Haynes</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Xu, L" uniqKey="Xu L">L Xu</name>
</author>
<author>
<name sortKey="Chen, H" uniqKey="Chen H">H Chen</name>
</author>
<author>
<name sortKey="Hu, X" uniqKey="Hu X">X Hu</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Yilmaz, S" uniqKey="Yilmaz S">S Yilmaz</name>
</author>
<author>
<name sortKey="Singh, Ak" uniqKey="Singh A">AK Singh</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Peterson, J" uniqKey="Peterson J">J Peterson</name>
</author>
<author>
<name sortKey="Garges, S" uniqKey="Garges S">S Garges</name>
</author>
<author>
<name sortKey="Giovanni, M" uniqKey="Giovanni M">M Giovanni</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Raes, J" uniqKey="Raes J">J Raes</name>
</author>
<author>
<name sortKey="Korbel, Jo" uniqKey="Korbel J">JO Korbel</name>
</author>
<author>
<name sortKey="Lercher, Mj" uniqKey="Lercher M">MJ Lercher</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Van, Ne" uniqKey="Van N">NE van</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Raes, J" uniqKey="Raes J">J Raes</name>
</author>
<author>
<name sortKey="Harrington, Ed" uniqKey="Harrington E">ED Harrington</name>
</author>
<author>
<name sortKey="Singh, Ah" uniqKey="Singh A">AH Singh</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Beszteri, B" uniqKey="Beszteri B">B Beszteri</name>
</author>
<author>
<name sortKey="Temperton, B" uniqKey="Temperton B">B Temperton</name>
</author>
<author>
<name sortKey="Frickenhaus, S" uniqKey="Frickenhaus S">S Frickenhaus</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Gomez Alvarez, V" uniqKey="Gomez Alvarez V">V Gomez-Alvarez</name>
</author>
<author>
<name sortKey="Teal, Tk" uniqKey="Teal T">TK Teal</name>
</author>
<author>
<name sortKey="Schmidt, Tm" uniqKey="Schmidt T">TM Schmidt</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Peltola, H" uniqKey="Peltola H">H Peltola</name>
</author>
<author>
<name sortKey="Soderlund, H" uniqKey="Soderlund H">H Soderlund</name>
</author>
<author>
<name sortKey="Ukkonen, E" uniqKey="Ukkonen E">E Ukkonen</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Guan, X" uniqKey="Guan X">X Guan</name>
</author>
<author>
<name sortKey="Uberbacher, Ec" uniqKey="Uberbacher E">EC Uberbacher</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Brown, Np" uniqKey="Brown N">NP Brown</name>
</author>
<author>
<name sortKey="Sander, C" uniqKey="Sander C">C Sander</name>
</author>
<author>
<name sortKey="Bork, P" uniqKey="Bork P">P Bork</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Halperin, E" uniqKey="Halperin E">E Halperin</name>
</author>
<author>
<name sortKey="Faigler, S" uniqKey="Faigler S">S Faigler</name>
</author>
<author>
<name sortKey="Gill More, R" uniqKey="Gill More R">R Gill-More</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Zhang, Y" uniqKey="Zhang Y">Y Zhang</name>
</author>
<author>
<name sortKey="Sun, Y" uniqKey="Sun Y">Y Sun</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Venter, Jc" uniqKey="Venter J">JC Venter</name>
</author>
<author>
<name sortKey="Remington, K" uniqKey="Remington K">K Remington</name>
</author>
<author>
<name sortKey="Heidelberg, Jf" uniqKey="Heidelberg J">JF Heidelberg</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Johnston, Aw" uniqKey="Johnston A">AW Johnston</name>
</author>
<author>
<name sortKey="Li, Y" uniqKey="Li Y">Y Li</name>
</author>
<author>
<name sortKey="Ogilvie, L" uniqKey="Ogilvie L">L Ogilvie</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Remington, Ka" uniqKey="Remington K">KA Remington</name>
</author>
<author>
<name sortKey="Heidelberg, K" uniqKey="Heidelberg K">K Heidelberg</name>
</author>
<author>
<name sortKey="Venter, Jc" uniqKey="Venter J">JC Venter</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Qin, J" uniqKey="Qin J">J Qin</name>
</author>
<author>
<name sortKey="Li, R" uniqKey="Li R">R Li</name>
</author>
<author>
<name sortKey="Raes, J" uniqKey="Raes J">J Raes</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Brown, Ct" uniqKey="Brown C">CT Brown</name>
</author>
<author>
<name sortKey="Vis Richardson, Ag" uniqKey="Vis Richardson A">AG vis-Richardson</name>
</author>
<author>
<name sortKey="Giongo, A" uniqKey="Giongo A">A Giongo</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Belda Ferre, P" uniqKey="Belda Ferre P">P Belda-Ferre</name>
</author>
<author>
<name sortKey="Alcaraz, Ld" uniqKey="Alcaraz L">LD Alcaraz</name>
</author>
<author>
<name sortKey="Cabrera Rubio, R" uniqKey="Cabrera Rubio R">R Cabrera-Rubio</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Sharma, Vk" uniqKey="Sharma V">VK Sharma</name>
</author>
<author>
<name sortKey="Kumar, N" uniqKey="Kumar N">N Kumar</name>
</author>
<author>
<name sortKey="Prakash, T" uniqKey="Prakash T">T Prakash</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Tasse, L" uniqKey="Tasse L">L Tasse</name>
</author>
<author>
<name sortKey="Bercovici, J" uniqKey="Bercovici J">J Bercovici</name>
</author>
<author>
<name sortKey="Pizzut Serin, S" uniqKey="Pizzut Serin S">S Pizzut-Serin</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Belda Ferre, P" uniqKey="Belda Ferre P">P Belda-Ferre</name>
</author>
<author>
<name sortKey="Cabrera Rubio, R" uniqKey="Cabrera Rubio R">R Cabrera-Rubio</name>
</author>
<author>
<name sortKey="Moya, A" uniqKey="Moya A">A Moya</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Turnbaugh, Pj" uniqKey="Turnbaugh P">PJ Turnbaugh</name>
</author>
<author>
<name sortKey="Quince, C" uniqKey="Quince C">C Quince</name>
</author>
<author>
<name sortKey="Faith, Jj" uniqKey="Faith J">JJ Faith</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Gosalbes, Mj" uniqKey="Gosalbes M">MJ Gosalbes</name>
</author>
<author>
<name sortKey="Durban, A" uniqKey="Durban A">A Durban</name>
</author>
<author>
<name sortKey="Pignatelli, M" uniqKey="Pignatelli M">M Pignatelli</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Taverna, Dm" uniqKey="Taverna D">DM Taverna</name>
</author>
<author>
<name sortKey="Goldstein, Ra" uniqKey="Goldstein R">RA Goldstein</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Klaassens, Es" uniqKey="Klaassens E">ES Klaassens</name>
</author>
<author>
<name sortKey="De Vos, Wm" uniqKey="De Vos W">WM de Vos</name>
</author>
<author>
<name sortKey="Vaughan, Ee" uniqKey="Vaughan E">EE Vaughan</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Verberkmoes, Nc" uniqKey="Verberkmoes N">NC Verberkmoes</name>
</author>
<author>
<name sortKey="Russell, Al" uniqKey="Russell A">AL Russell</name>
</author>
<author>
<name sortKey="Shah, M" uniqKey="Shah M">M Shah</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Li, X" uniqKey="Li X">X Li</name>
</author>
<author>
<name sortKey="Leblanc, J" uniqKey="Leblanc J">J LeBlanc</name>
</author>
<author>
<name sortKey="Truong, A" uniqKey="Truong A">A Truong</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Kolmeder, Ca" uniqKey="Kolmeder C">CA Kolmeder</name>
</author>
<author>
<name sortKey="De, Bm" uniqKey="De B">BM de</name>
</author>
<author>
<name sortKey="Nikkila, J" uniqKey="Nikkila J">J Nikkila</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Kaddurah Daouk, R" uniqKey="Kaddurah Daouk R">R Kaddurah-Daouk</name>
</author>
<author>
<name sortKey="Kristal, Bs" uniqKey="Kristal B">BS Kristal</name>
</author>
<author>
<name sortKey="Weinshilboum, Rm" uniqKey="Weinshilboum R">RM Weinshilboum</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Saito, K" uniqKey="Saito K">K Saito</name>
</author>
<author>
<name sortKey="Matsuda, F" uniqKey="Matsuda F">F Matsuda</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Claus, Sp" uniqKey="Claus S">SP Claus</name>
</author>
<author>
<name sortKey="Tsang, Tm" uniqKey="Tsang T">TM Tsang</name>
</author>
<author>
<name sortKey="Wang, Y" uniqKey="Wang Y">Y Wang</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Fukuda, S" uniqKey="Fukuda S">S Fukuda</name>
</author>
<author>
<name sortKey="Nakanishi, Y" uniqKey="Nakanishi Y">Y Nakanishi</name>
</author>
<author>
<name sortKey="Chikayama, E" uniqKey="Chikayama E">E Chikayama</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Han, J" uniqKey="Han J">J Han</name>
</author>
<author>
<name sortKey="Antunes, Lc" uniqKey="Antunes L">LC Antunes</name>
</author>
<author>
<name sortKey="Finlay, Bb" uniqKey="Finlay B">BB Finlay</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Claus, Sp" uniqKey="Claus S">SP Claus</name>
</author>
<author>
<name sortKey="Ellero, Sl" uniqKey="Ellero S">SL Ellero</name>
</author>
<author>
<name sortKey="Berger, B" uniqKey="Berger B">B Berger</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Fukuda, S" uniqKey="Fukuda S">S Fukuda</name>
</author>
<author>
<name sortKey="Toh, H" uniqKey="Toh H">H Toh</name>
</author>
<author>
<name sortKey="Hase, K" uniqKey="Hase K">K Hase</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Nakanishi, Y" uniqKey="Nakanishi Y">Y Nakanishi</name>
</author>
<author>
<name sortKey="Fukuda, S" uniqKey="Fukuda S">S Fukuda</name>
</author>
<author>
<name sortKey="Chikayama, E" uniqKey="Chikayama E">E Chikayama</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Schubotz, F" uniqKey="Schubotz F">F Schubotz</name>
</author>
<author>
<name sortKey="Wakeham, Sg" uniqKey="Wakeham S">SG Wakeham</name>
</author>
<author>
<name sortKey="Lipp, Js" uniqKey="Lipp J">JS Lipp</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Pitcher, A" uniqKey="Pitcher A">A Pitcher</name>
</author>
<author>
<name sortKey="Hopmans, Ec" uniqKey="Hopmans E">EC Hopmans</name>
</author>
<author>
<name sortKey="Mosier, Ac" uniqKey="Mosier A">AC Mosier</name>
</author>
</analytic>
</biblStruct>
</listBibl>
</div1>
</back>
</TEI>
<pmc article-type="research-article">
<pmc-dir>properties open_access</pmc-dir>
<front>
<journal-meta>
<journal-id journal-id-type="nlm-ta">Brief Bioinform</journal-id>
<journal-id journal-id-type="iso-abbrev">Brief. Bioinformatics</journal-id>
<journal-id journal-id-type="publisher-id">bib</journal-id>
<journal-id journal-id-type="hwp">bib</journal-id>
<journal-title-group>
<journal-title>Briefings in Bioinformatics</journal-title>
</journal-title-group>
<issn pub-type="ppub">1467-5463</issn>
<issn pub-type="epub">1477-4054</issn>
<publisher>
<publisher-name>Oxford University Press</publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="pmid">22772835</article-id>
<article-id pub-id-type="pmc">3504928</article-id>
<article-id pub-id-type="doi">10.1093/bib/bbs033</article-id>
<article-id pub-id-type="publisher-id">bbs033</article-id>
<article-categories>
<subj-group subj-group-type="heading">
<subject>Papers</subject>
</subj-group>
</article-categories>
<title-group>
<article-title>Functional assignment of metagenomic data: challenges and applications</article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author">
<name>
<surname>Prakash</surname>
<given-names>Tulika</given-names>
</name>
<xref ref-type="bio" rid="d34e36">*</xref>
</contrib>
<contrib contrib-type="author" corresp="yes">
<name>
<surname>Taylor</surname>
<given-names>Todd D.</given-names>
</name>
<xref ref-type="bio" rid="d34e47">*</xref>
</contrib>
</contrib-group>
<author-notes>
<corresp>Corresponding author. Todd D. Taylor, Laboratory for MetaSystems Research, Quantitative Biology Center, Riken, Yokohama, Kanagawa 230-0045, Japan. Tel.:
<phone>+81-45-503-9285</phone>
; Fax:
<fax>+81-45-503-9176</fax>
; E-mail:
<email>taylor@riken.jp</email>
</corresp>
</author-notes>
<pub-date pub-type="ppub">
<month>11</month>
<year>2012</year>
</pub-date>
<pub-date pub-type="epub">
<day>6</day>
<month>7</month>
<year>2012</year>
</pub-date>
<pub-date pub-type="pmc-release">
<day>6</day>
<month>7</month>
<year>2012</year>
</pub-date>
<volume>13</volume>
<issue>6</issue>
<issue-title>Special Issue: Bioinformatics approaches and tools for metagenomic analysis</issue-title>
<fpage>711</fpage>
<lpage>727</lpage>
<history>
<date date-type="received">
<day>5</day>
<month>3</month>
<year>2012</year>
</date>
<date date-type="accepted">
<day>26</day>
<month>5</month>
<year>2012</year>
</date>
</history>
<permissions>
<copyright-statement>© The Author(s) 2012. Published by Oxford University Press.</copyright-statement>
<copyright-year>2012</copyright-year>
<license license-type="creative-commons" xlink:href="http://creativecommons.org/licenses/by-nc/3.0">
<license-p>
<pmc-comment>CREATIVE COMMONS</pmc-comment>
This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (
<ext-link ext-link-type="uri" xlink:href="http://creativecommons.org/licenses/by-nc/3.0">http://creativecommons.org/licenses/by-nc/3.0</ext-link>
), which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.</license-p>
</license>
</permissions>
<abstract>
<p>Metagenomic sequencing provides a unique opportunity to explore Earth’s limitless environments harboring scores of as yet unknown and mostly unculturable microbes and other organisms. Functional analysis of the metagenomic data plays a central role in projects aiming to explore the most essential questions in microbiology, namely ‘In a given environment, among the microbes present, what are they doing, and how are they doing it?’ Toward this goal, several large-scale metagenomic projects have recently been conducted or are currently underway. Functional analysis of metagenomic data mainly suffers from the vast amount of data generated in these projects. The sheer amount of data requires substantial computational time and storage space. These problems are compounded by other factors potentially affecting the functional analysis, including sample preparation, sequencing method and average genome size of the metagenomic samples. In addition, the read-lengths generated during sequencing influence sequence assembly, gene prediction and subsequently the functional analysis. The level of confidence for functional predictions increases with increasing read-length. Usually, the most reliable functional annotations for metagenomic sequences are achieved using homology-based approaches against publicly available reference sequence databases. Here, we present an overview of the current state of functional analysis of metagenomic sequence data, bottlenecks frequently encountered and possible solutions in light of currently available resources and tools. Finally, we provide some examples of applications from recent metagenomic studies which have been successfully conducted in spite of the known difficulties.</p>
</abstract>
<kwd-group>
<kwd>functional annotation</kwd>
<kwd>metagenomics</kwd>
<kwd>bioinformatics</kwd>
<kwd>next-generation sequencing</kwd>
<kwd>pathway-mapping</kwd>
<kwd>comparative analysis</kwd>
</kwd-group>
<counts>
<page-count count="17"></page-count>
</counts>
</article-meta>
</front>
<body>
<sec>
<title>INTRODUCTION</title>
<p>The microbial world shows vast diversity, and microbes inhabit almost every niche on the planet. Many of them have been shown to be important members of their given ecosystems and to play crucial roles in various environmental and host-associated biological processes. However, due to their general unculturability (it is believed that only a small percentage of bacteria in nature can be cultured [
<xref ref-type="bibr" rid="bbs033-B1">1</xref>
]), up until just a few years ago it was practically impossible to sequence and analyze them in greater detail. As a result, a large fraction of microbes remains poorly characterized and unstudied, and the means by which they exert beneficial or other effects in different environments remain largely unknown.</p>
<p>The recent culture-independent technology to study microbes inhabiting different environments, termed metagenomics [
<xref ref-type="bibr" rid="bbs033-B2">2</xref>
], has opened new avenues for answering questions commonly asked in microbiology, such as ‘Which species inhabit a given environment?’ and ‘What are these microbes doing and how are they doing it?’ The basic steps involved in a typical metagenomic project to estimate the number of species and the functional repertoire of an environment include DNA or RNA sequencing using next-generation sequencers (such as Illumina and Roche 454), sequence assembly, gene prediction, functional and metabolic analysis, taxonomic binning and comparative analysis of the sequence data using specialized bioinformatics methods and tools (
<xref ref-type="fig" rid="bbs033-F1">Figure 1</xref>
,
<xref ref-type="table" rid="bbs033-T1">Tables 1</xref>
and
<xref ref-type="table" rid="bbs033-T2">2</xref>
). However, each stage of the analysis suffers heavily due to inherent problems of the metagenomic data generated, including incomplete coverage, massive volumes of raw sequence data produced by the next-generation sequencers, generally short read-lengths, species abundance and diversity and so on [
<xref ref-type="bibr" rid="bbs033-B3">3</xref>
,
<xref ref-type="bibr" rid="bbs033-B4">4</xref>
].
<fig id="bbs033-F1" position="float">
<label>Figure 1:</label>
<caption>
<p>Flow chart for the analysis of a metagenome from sequencing to functional annotation. Only the basic flow of data is shown up to the gene prediction step. For the context-based annotation approach, only the gene neighborhood method has been implemented thus far on metagenomic data sets, although in principle other approaches which have been used for whole genome analysis can also be implemented and tested. *: A list of tools commonly used for these processes is provided in
<xref ref-type="table" rid="bbs033-T1">Table 1</xref>
.
<xref ref-type="table" rid="bbs033-T3">Table 3</xref>
provides a list of some of the additional functional analyses that can be performed on the metagenomic sequences.</p>
</caption>
<graphic xlink:href="bbs033f1"></graphic>
</fig>
<table-wrap id="bbs033-T1" position="float">
<label>Table 1:</label>
<caption>
<p>List of commonly used tools for sequence assembly, protein coding gene prediction, RNA gene prediction and phylogenetic classification steps of metagenomic data analysis</p>
</caption>
<table frame="hsides" rules="groups">
<thead align="left">
<tr>
<th rowspan="1" colspan="1">Process</th>
<th rowspan="1" colspan="1">Tools</th>
<th rowspan="1" colspan="1">URL/References</th>
</tr>
</thead>
<tbody align="left">
<tr>
<td rowspan="11" colspan="1">Sequence assembly</td>
<td rowspan="1" colspan="1">Phrap</td>
<td rowspan="1" colspan="1">
<ext-link ext-link-type="uri" xlink:href="http://www.phrap.org/">http://www.phrap.org/</ext-link>
</td>
</tr>
<tr>
<td rowspan="1" colspan="1">Forge</td>
<td rowspan="1" colspan="1">
<ext-link ext-link-type="uri" xlink:href="http://combiol.org/forge/">http://combiol.org/forge/</ext-link>
</td>
</tr>
<tr>
<td rowspan="1" colspan="1">Arachne</td>
<td rowspan="1" colspan="1">[
<xref ref-type="bibr" rid="bbs033-B5">5</xref>
]</td>
</tr>
<tr>
<td rowspan="1" colspan="1">JAZZ</td>
<td rowspan="1" colspan="1">[
<xref ref-type="bibr" rid="bbs033-B6">6</xref>
]</td>
</tr>
<tr>
<td rowspan="1" colspan="1">Celera</td>
<td rowspan="1" colspan="1">[
<xref ref-type="bibr" rid="bbs033-B7">7</xref>
]</td>
</tr>
<tr>
<td rowspan="1" colspan="1">Velvet</td>
<td rowspan="1" colspan="1">[
<xref ref-type="bibr" rid="bbs033-B8">8</xref>
]</td>
</tr>
<tr>
<td rowspan="1" colspan="1">Newbler</td>
<td rowspan="1" colspan="1">454 Life Sciences</td>
</tr>
<tr>
<td rowspan="1" colspan="1">SOAPdenovo</td>
<td rowspan="1" colspan="1">[
<xref ref-type="bibr" rid="bbs033-B9">9</xref>
]</td>
</tr>
<tr>
<td rowspan="1" colspan="1">EULER</td>
<td rowspan="1" colspan="1">[
<xref ref-type="bibr" rid="bbs033-B10">10</xref>
]</td>
</tr>
<tr>
<td rowspan="1" colspan="1">ORFome assembly</td>
<td rowspan="1" colspan="1">[
<xref ref-type="bibr" rid="bbs033-B11">11</xref>
]</td>
</tr>
<tr>
<td rowspan="1" colspan="1">IDBA-UD</td>
<td rowspan="1" colspan="1">[
<xref ref-type="bibr" rid="bbs033-B12">12</xref>
]</td>
</tr>
<tr>
<td rowspan="7" colspan="1">Gene prediction</td>
<td rowspan="1" colspan="1">Metagene</td>
<td rowspan="1" colspan="1">[
<xref ref-type="bibr" rid="bbs033-B13">13</xref>
]</td>
</tr>
<tr>
<td rowspan="1" colspan="1">GeneMark</td>
<td rowspan="1" colspan="1">[
<xref ref-type="bibr" rid="bbs033-B14">14</xref>
]</td>
</tr>
<tr>
<td rowspan="1" colspan="1">ORF-Finder</td>
<td rowspan="1" colspan="1">
<ext-link ext-link-type="uri" xlink:href="http://www.ncbi.nlm.nih.gov/projects/gorf/">http://www.ncbi.nlm.nih.gov/projects/gorf/</ext-link>
</td>
</tr>
<tr>
<td rowspan="1" colspan="1">FragGeneScan</td>
<td rowspan="1" colspan="1">[
<xref ref-type="bibr" rid="bbs033-B15">15</xref>
]</td>
</tr>
<tr>
<td rowspan="1" colspan="1">fgenesB</td>
<td rowspan="1" colspan="1">
<ext-link ext-link-type="uri" xlink:href="http://www.softberry.com">http://www.softberry.com</ext-link>
</td>
</tr>
<tr>
<td rowspan="1" colspan="1">GLIMMER</td>
<td rowspan="1" colspan="1">[
<xref ref-type="bibr" rid="bbs033-B16">16</xref>
]</td>
</tr>
<tr>
<td rowspan="1" colspan="1">BLAST</td>
<td rowspan="1" colspan="1">[
<xref ref-type="bibr" rid="bbs033-B17">17</xref>
]</td>
</tr>
<tr>
<td rowspan="2" colspan="1">RNA gene prediction</td>
<td rowspan="1" colspan="1">tRNAscan-SE</td>
<td rowspan="1" colspan="1">[
<xref ref-type="bibr" rid="bbs033-B18">18</xref>
]</td>
</tr>
<tr>
<td rowspan="1" colspan="1">Similarity-based searches for rRNA in reference databases</td>
<td rowspan="1" colspan="1"></td>
</tr>
<tr>
<td rowspan="7" colspan="1">Taxonomic binning</td>
<td rowspan="1" colspan="1">MetaBin</td>
<td rowspan="1" colspan="1">[
<xref ref-type="bibr" rid="bbs033-B19">19</xref>
]</td>
</tr>
<tr>
<td rowspan="1" colspan="1">MEGAN</td>
<td rowspan="1" colspan="1">[
<xref ref-type="bibr" rid="bbs033-B20">20</xref>
]</td>
</tr>
<tr>
<td rowspan="1" colspan="1">WebCARMA</td>
<td rowspan="1" colspan="1">[
<xref ref-type="bibr" rid="bbs033-B21">21</xref>
]</td>
</tr>
<tr>
<td rowspan="1" colspan="1">PhyloPythia</td>
<td rowspan="1" colspan="1">[
<xref ref-type="bibr" rid="bbs033-B22">22</xref>
]</td>
</tr>
<tr>
<td rowspan="1" colspan="1">TETRA</td>
<td rowspan="1" colspan="1">[
<xref ref-type="bibr" rid="bbs033-B23">23</xref>
]</td>
</tr>
<tr>
<td rowspan="1" colspan="1">NBC</td>
<td rowspan="1" colspan="1">[
<xref ref-type="bibr" rid="bbs033-B24">24</xref>
]</td>
</tr>
<tr>
<td rowspan="1" colspan="1">TACOA</td>
<td rowspan="1" colspan="1">[
<xref ref-type="bibr" rid="bbs033-B25">25</xref>
]</td>
</tr>
</tbody>
</table>
</table-wrap>
<table-wrap id="bbs033-T2" position="float">
<label>Table 2:</label>
<caption>
<p>Current list of commonly used publicly available pipelines for the functional annotation of metagenomic data sets</p>
</caption>
<table frame="hsides" rules="groups">
<thead align="left">
<tr>
<th rowspan="1" colspan="1">Pipeline/tools</th>
<th rowspan="1" colspan="1">IMG/M</th>
<th rowspan="1" colspan="1">METAREP</th>
<th rowspan="1" colspan="1">CAMERA</th>
<th rowspan="1" colspan="1">RAMMCAP</th>
<th rowspan="1" colspan="1">MG-RAST</th>
<th rowspan="1" colspan="1">Smash community</th>
<th rowspan="1" colspan="1">MEGAN4</th>
<th rowspan="1" colspan="1">CoMet</th>
<th rowspan="1" colspan="1">WebMGA</th>
</tr>
<tr>
<th colspan="10" rowspan="1">Functional analysis</th>
</tr>
</thead>
<tbody align="left">
<tr>
<td colspan="10" rowspan="1">    Homology-based</td>
</tr>
<tr>
<td rowspan="1" colspan="1">        Known sequence</td>
<td rowspan="1" colspan="1">NCBI (NR), SMART, UniProt</td>
<td rowspan="1" colspan="1">NCBI (NR), UniProt</td>
<td rowspan="1" colspan="1">NCBI (NR)</td>
<td rowspan="1" colspan="1"></td>
<td rowspan="1" colspan="1">NCBI (NR), SMART, UniProt</td>
<td rowspan="1" colspan="1">SMART, UniProt</td>
<td rowspan="1" colspan="1">NCBI (NR)</td>
<td rowspan="1" colspan="1"></td>
<td rowspan="1" colspan="1">NCBI (NR)</td>
</tr>
<tr>
<td rowspan="1" colspan="1">        Metagenomic data sets</td>
<td rowspan="1" colspan="1">IMG/M</td>
<td rowspan="1" colspan="1"></td>
<td rowspan="1" colspan="1"></td>
<td rowspan="1" colspan="1"></td>
<td rowspan="1" colspan="1">IMG/M</td>
<td rowspan="1" colspan="1"></td>
<td rowspan="1" colspan="1"></td>
<td rowspan="1" colspan="1"></td>
<td rowspan="1" colspan="1"></td>
</tr>
<tr>
<td rowspan="1" colspan="1">        Orthologous groups</td>
<td rowspan="1" colspan="1">COGs</td>
<td rowspan="1" colspan="1"></td>
<td rowspan="1" colspan="1">COGs</td>
<td rowspan="1" colspan="1">COGs</td>
<td rowspan="1" colspan="1">COGs, eggNOGs</td>
<td rowspan="1" colspan="1">COGs, eggNOGs</td>
<td rowspan="1" colspan="1"></td>
<td rowspan="1" colspan="1"></td>
<td rowspan="1" colspan="1">COGs</td>
</tr>
<tr>
<td rowspan="1" colspan="1">        Protein families</td>
<td rowspan="1" colspan="1">Pfam, TIGRfam</td>
<td rowspan="1" colspan="1">Pfam, TIGRfam</td>
<td rowspan="1" colspan="1">Pfam, TIGRfam</td>
<td rowspan="1" colspan="1">Pfam, TIGRfam</td>
<td rowspan="1" colspan="1">FIGfams</td>
<td rowspan="1" colspan="1">Pfam</td>
<td rowspan="1" colspan="1"></td>
<td rowspan="1" colspan="1">Pfam</td>
<td rowspan="1" colspan="1">Pfam, TIGRfam</td>
</tr>
<tr>
<td rowspan="1" colspan="1">        Ontology</td>
<td rowspan="1" colspan="1">GO</td>
<td rowspan="1" colspan="1">GO</td>
<td rowspan="1" colspan="1">GO</td>
<td rowspan="1" colspan="1">GO</td>
<td rowspan="1" colspan="1">GO</td>
<td rowspan="1" colspan="1"></td>
<td rowspan="1" colspan="1"></td>
<td rowspan="1" colspan="1">GO</td>
<td rowspan="1" colspan="1">GO</td>
</tr>
<tr>
<td rowspan="1" colspan="1">        Enzymes, pathways and subsystems</td>
<td rowspan="1" colspan="1">KEGG, SEED</td>
<td rowspan="1" colspan="1">PRIAM</td>
<td rowspan="1" colspan="1">KEGG, SEED</td>
<td rowspan="1" colspan="1"></td>
<td rowspan="1" colspan="1">KEGG, SEED</td>
<td rowspan="1" colspan="1">KEGG</td>
<td rowspan="1" colspan="1">KEGG, SEED</td>
<td rowspan="1" colspan="1"></td>
<td rowspan="1" colspan="1">KEGG</td>
</tr>
<tr>
<td rowspan="1" colspan="1">        Protein interactions</td>
<td rowspan="1" colspan="1"></td>
<td rowspan="1" colspan="1"></td>
<td rowspan="1" colspan="1"></td>
<td rowspan="1" colspan="1"></td>
<td rowspan="1" colspan="1">STRING</td>
<td rowspan="1" colspan="1">STRING</td>
<td rowspan="1" colspan="1"></td>
<td rowspan="1" colspan="1"></td>
<td rowspan="1" colspan="1"></td>
</tr>
<tr>
<td colspan="10" rowspan="1">    Motif- and pattern-based</td>
</tr>
<tr>
<td rowspan="1" colspan="1">        Database</td>
<td rowspan="1" colspan="1">InterPro</td>
<td rowspan="1" colspan="1"></td>
<td rowspan="1" colspan="1"></td>
<td rowspan="1" colspan="1"></td>
<td rowspan="1" colspan="1"></td>
<td rowspan="1" colspan="1"></td>
<td rowspan="1" colspan="1"></td>
<td rowspan="1" colspan="1"></td>
<td rowspan="1" colspan="1"></td>
</tr>
<tr>
<td colspan="10" rowspan="1">    Context-based</td>
</tr>
<tr>
<td rowspan="1" colspan="1">        Approach</td>
<td rowspan="1" colspan="1">Gene neighborhood</td>
<td rowspan="1" colspan="1"></td>
<td rowspan="1" colspan="1"></td>
<td rowspan="1" colspan="1"></td>
<td rowspan="1" colspan="1"></td>
<td rowspan="1" colspan="1">Gene neighborhood</td>
<td rowspan="1" colspan="1"></td>
<td rowspan="1" colspan="1"></td>
<td rowspan="1" colspan="1"></td>
</tr>
<tr>
<td colspan="10" rowspan="1">    Other functional analysis</td>
</tr>
<tr>
<td rowspan="1" colspan="1">        Types of predictions</td>
<td rowspan="1" colspan="1">CRISPRs, enzymes, transporter classes</td>
<td rowspan="1" colspan="1">Enzymes, transmembrane helices, lipoprotein motifs</td>
<td rowspan="1" colspan="1"></td>
<td rowspan="1" colspan="1"></td>
<td rowspan="1" colspan="1"></td>
<td rowspan="1" colspan="1">Protein networks</td>
<td rowspan="1" colspan="1"></td>
<td rowspan="1" colspan="1"></td>
<td rowspan="1" colspan="1"></td>
</tr>
<tr>
<td rowspan="1" colspan="1">        URL</td>
<td rowspan="1" colspan="1">
<ext-link ext-link-type="uri" xlink:href="http://img.jgi.doe.gov/m/doc/uiMap.html">http://img.jgi.doe.gov/m/doc/uiMap.html</ext-link>
</td>
<td rowspan="1" colspan="1">
<ext-link ext-link-type="uri" xlink:href="http://www.jcvi.org/metarep/">http://www.jcvi.org/metarep/</ext-link>
</td>
<td rowspan="1" colspan="1">
<ext-link ext-link-type="uri" xlink:href="http://camera.calit2.net/">http://camera.calit2.net/</ext-link>
</td>
<td rowspan="1" colspan="1"></td>
<td rowspan="1" colspan="1">
<ext-link ext-link-type="uri" xlink:href="http://metagenomics.nmpdr.org/">http://metagenomics.nmpdr.org/</ext-link>
</td>
<td rowspan="1" colspan="1">
<ext-link ext-link-type="uri" xlink:href="http://www.bork.embl.de/software/smash/">http://www.bork.embl.de/software/smash/</ext-link>
</td>
<td rowspan="1" colspan="1">
<ext-link ext-link-type="uri" xlink:href="http://ab.inf.uni-tuebingen.de/software/megan/">http://ab.inf.uni-tuebingen.de/software/megan/</ext-link>
</td>
<td rowspan="1" colspan="1">
<ext-link ext-link-type="uri" xlink:href="http://comet.gobics.de/">http://comet.gobics.de/</ext-link>
</td>
<td rowspan="1" colspan="1">
<ext-link ext-link-type="uri" xlink:href="http://weizhong-lab.ucsd.edu/metagenomic-analysis/">http://weizhong-lab.ucsd.edu/metagenomic-analysis/</ext-link>
</td>
</tr>
<tr>
<td rowspan="1" colspan="1">        References</td>
<td rowspan="1" colspan="1">[
<xref ref-type="bibr" rid="bbs033-B26">26</xref>
]</td>
<td rowspan="1" colspan="1">[
<xref ref-type="bibr" rid="bbs033-B27">27</xref>
]</td>
<td rowspan="1" colspan="1">[
<xref ref-type="bibr" rid="bbs033-B28">28</xref>
]</td>
<td rowspan="1" colspan="1">[
<xref ref-type="bibr" rid="bbs033-B29">29</xref>
]</td>
<td rowspan="1" colspan="1">[
<xref ref-type="bibr" rid="bbs033-B30">30</xref>
]</td>
<td rowspan="1" colspan="1">[
<xref ref-type="bibr" rid="bbs033-B31">31</xref>
]</td>
<td rowspan="1" colspan="1">[
<xref ref-type="bibr" rid="bbs033-B32">32</xref>
]</td>
<td rowspan="1" colspan="1">[
<xref ref-type="bibr" rid="bbs033-B33">33</xref>
]</td>
<td rowspan="1" colspan="1">[
<xref ref-type="bibr" rid="bbs033-B34">34</xref>
]</td>
</tr>
</tbody>
</table>
</table-wrap>
</p>
<p>These problems also adversely affect the downstream functional analysis process. For example, because of their shorter read-length, pyrosequencing- or Illumina-derived reads yield a poorer overall functional composition than longer Sanger reads [
<xref ref-type="bibr" rid="bbs033-B35">35</xref>
]. Additionally, for very complex communities, partial or poor assemblies are obtained due to incomplete coverage, resulting in many short contigs and unassembled sequences. This leads to the prediction of a large number of small, fragmented genes which may not exhibit any matches in the reference sequence databases, or match with very low significance [
<xref ref-type="bibr" rid="bbs033-B36">36</xref>
]. Although sequence assembly and gene prediction tools specifically developed for metagenomic data sets offer some advantages over similar tools developed for more complete genome sequences, surprisingly, no such ‘metagenome specific’ tools have yet been developed for functional analysis. Thus, appropriate tools and parameters must be selected from the current repertoire to achieve comprehensive and biologically meaningful functional analysis of metagenomic data sets. The steps for sequence assembly and gene prediction of metagenomic data sets are compared in several recent comprehensive reviews [
<xref ref-type="bibr" rid="bbs033-B3">3</xref>
,
<xref ref-type="bibr" rid="bbs033-B4">4</xref>
,
<xref ref-type="bibr" rid="bbs033-B37">37</xref>
,
<xref ref-type="bibr" rid="bbs033-B38">38</xref>
].</p>
<p>The scope of this review is to comprehensively discuss the prime objectives, methods and problems for functional and metabolic analysis of metagenomic sequence data, and to propose some solutions for the latter. Toward this, we first try to familiarize the reader with the aims of functional metagenomic analysis and the most commonly adopted publicly available tools and resources to achieve them. Next, we discuss how the problems arising from metagenomic sequencing affect this process, and we suggest various strategies for addressing some of these issues under the present scenario. Lastly, we demonstrate that, despite these issues, metagenomic functional analysis can still be reliably used to address globally important environmental and biological questions.</p>
</sec>
<sec>
<title>OBJECTIVES OF FUNCTIONAL METAGENOMIC ANALYSIS STUDIES</title>
<p>Interestingly, the same microbial communities sampled at different times or from different hosts can vary significantly. For example, the gut microbiomes of 13 healthy Japanese individuals were quite different, yet they still shared many microbes [
<xref ref-type="bibr" rid="bbs033-B39">39</xref>
]. Also, the community members for any given environment commonly play different roles. For example, in the human gut microbiome, segmented filamentous bacteria are known to play important roles in maintaining intestinal immunity [
<xref ref-type="bibr" rid="bbs033-B40">40</xref>
,
<xref ref-type="bibr" rid="bbs033-B41">41</xref>
], whereas bifidobacteria are known to utilize complex carbohydrates and thereby exert beneficial effects on human health [
<xref ref-type="bibr" rid="bbs033-B42">42</xref>
]. Thus, there are two broad objectives of functional analysis in metagenomic studies: the first is to determine the functional and metabolic repertoires of the different community members that enable them to exert different effects, and the second is to identify the variations, if any, within the functional compositions of the different communities, e.g. those found between healthy and diseased individuals that may be related to the cause of the disease. To determine the functional content of the member species of a microbiome, the coding and functional capacity for all (or at least the dominant) members should be comprehensively analyzed. Alternatively, if the goal of the study is to analyze and contrast the functional and metabolic capacities of different communities, then the functional and metabolic pathway profiles for the communities need to be generated and compared.</p>
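<p>As a minimal sketch of the second objective, the following Python snippet converts per-category hit counts for two communities into relative abundances and reports the log2 ratio for each functional category. The category labels, the counts and the pseudocount are hypothetical values chosen purely for illustration and are not taken from any study cited here; a real comparison would also require appropriate normalization and statistical testing.</p>
<preformat>
```python
import math


def relative_abundance(counts):
    """Convert raw hit counts per functional category into fractions of the total."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("profile contains no hits")
    return {category: n / total for category, n in counts.items()}


def log2_ratio(profile_a, profile_b, pseudocount=1e-6):
    """Log2 ratio (a over b) of relative abundances for every category in either profile.

    The pseudocount (an illustrative assumption) avoids division by zero when a
    category is absent from one of the two profiles.
    """
    categories = sorted(set(profile_a) | set(profile_b))
    return {
        c: math.log2(
            (profile_a.get(c, 0.0) + pseudocount) / (profile_b.get(c, 0.0) + pseudocount)
        )
        for c in categories
    }


# Hypothetical numbers of reads assigned to each functional category.
healthy = {"carbohydrate metabolism": 400, "amino acid metabolism": 300, "cell motility": 300}
diseased = {"carbohydrate metabolism": 200, "amino acid metabolism": 300, "cell motility": 500}

ratios = log2_ratio(relative_abundance(healthy), relative_abundance(diseased))
for category, value in ratios.items():
    print(f"{category}: {value:+.2f}")
```
</preformat>
<p>Positive values indicate categories that are relatively enriched in the first profile, and negative values indicate categories enriched in the second.</p>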
</sec>
<sec>
<title>PUBLICLY AVAILABLE RESOURCES AND TOOLS FOR FUNCTIONAL ANNOTATION OF METAGENOMIC DATA</title>
<p>The development of dedicated tools for functional annotation and analysis of metagenomic data sets lags far behind the rate at which the data are being generated. Recently, several web-based and locally installable pipelines have been developed for the analysis of metagenomic data sets.
<xref ref-type="table" rid="bbs033-T2">Table 2</xref>
provides a list of a few well-known representative pipelines and compares the functional analysis capacity of each. Almost all of these pipelines provide integrated platforms for the functional prediction of metagenomic sequences using multiple tools and databases, which are also commonly used for the analysis of whole genome sequences. Most of the pipelines offer sufficient resources for the functional analysis of user data. However, to account for the inherent problems associated with the metagenomic data sets, it is highly recommended to evaluate the computational workflow and parameters for any given project. This can be achieved by using simulated sequencing reads generated by MetaSim [
<xref ref-type="bibr" rid="bbs033-B43">43</xref>
] to assess and compare different tools before actually using them on full data sets. The analysis time of any pipeline typically depends on the size of the data sets and, in the case of web-based servers, on the load of requests already submitted by other users. Web-based servers such as CAMERA [
<xref ref-type="bibr" rid="bbs033-B28">28</xref>
], MG-RAST [
<xref ref-type="bibr" rid="bbs033-B30">30</xref>
] and IMG/M [
<xref ref-type="bibr" rid="bbs033-B26">26</xref>
] host pre-computed results for most published metagenomes that enable users to perform comparative analysis with their own data sets. In most cases, the computed data can be visualized in the form of simple plots. However, KEGG [
<xref ref-type="bibr" rid="bbs033-B44">44</xref>
] pathway maps and abundance profiles can also be obtained using the IMG/M and MG-RAST servers.</p>
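<p>The evaluation on simulated reads recommended above can be sketched as follows, assuming that each simulated read carries a known functional label inherited from its source genome and that each workflow either assigns a label to a read or leaves it unannotated. The read identifiers, the labels and the two workflow outputs are hypothetical placeholders, and the read simulator itself is not invoked here.</p>
<preformat>
```python
def score_predictions(truth, predicted):
    """Return (precision, recall) of predicted labels against the known labels.

    truth:     maps each simulated read id to its true functional label
    predicted: maps read ids to labels assigned by a workflow; reads the
               workflow could not annotate are simply absent
    """
    correct = sum(1 for read, label in predicted.items() if truth.get(read) == label)
    precision = correct / len(predicted) if predicted else 0.0
    recall = correct / len(truth) if truth else 0.0
    return precision, recall


# Hypothetical ground truth for four simulated reads.
truth = {"r1": "func_A", "r2": "func_B", "r3": "func_C", "r4": "func_D"}

# Hypothetical outputs of two annotation workflows run on the same reads.
workflow_outputs = {
    "workflow_1": {"r1": "func_A", "r2": "func_B", "r3": "func_X"},
    "workflow_2": {"r1": "func_A", "r2": "func_B"},
}

scores = {name: score_predictions(truth, calls) for name, calls in workflow_outputs.items()}
for name, (precision, recall) in scores.items():
    print(f"{name}: precision={precision:.2f} recall={recall:.2f}")
```
</preformat>
<p>Comparing such scores across tools and parameter settings gives a rough indication of which workflow is better suited to a given read-length and community complexity before it is applied to the full data set.</p>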
</sec>
<sec>
<title>STRATEGIES COMMONLY ADOPTED BY THE PIPELINES FOR THE FUNCTIONAL ANALYSIS OF METAGENOMIC DATA</title>
<p>Protein function is a very broad term, as function can be predicted at several different levels. For example, the Gene Ontology database [
<xref ref-type="bibr" rid="bbs033-B45">45</xref>
] adopts three broad domains for classifying gene products viz., the cellular location of the protein, the overall biological process it takes part in and the molecular function of the protein. On the other hand, the subsystem-based classification approach adopted by the SEED database [
<xref ref-type="bibr" rid="bbs033-B46">46</xref>
] relies mainly on the grouping of functional roles into subsystems by curation experts. The defined subsystems may be thought of as a generalization of the term ‘pathway’. Similarly, the KEGG database [
<xref ref-type="bibr" rid="bbs033-B44">44</xref>
] is a resource of pathway maps built from both genomic and chemical information of the biological systems. However, such specific functional assignment may be lacking for completely novel proteins or for those which share very weak homology with known proteins, both of which are abundant in metagenomic data sets. For such proteins, even minimal function-related information can be useful, and may be the only available clue to their function.</p>
<p>As shown in
<xref ref-type="fig" rid="bbs033-F1">Figure 1</xref>
and
<xref ref-type="table" rid="bbs033-T2">Table 2</xref>
, the basic tools that are implemented in almost all of the available pipelines for functional analysis of metagenomic data are the same as those which are commonly used for whole genome studies and are well known. However, their performance in the metagenomic context has yet to be evaluated and reviewed. Thus, in the current review, we have divided these tools into four categories based on their inherent approach. In the following sections, we review each approach in the context of its application to metagenomic data analysis, keeping in mind the associated problems of the data itself.</p>
<sec>
<title>Homology-based approach</title>
<p>As shown in
<xref ref-type="table" rid="bbs033-T2">Table 2</xref>
, the ‘simplest’ and most common approach adopted by all of the available pipelines for functional prediction is by comparison of the predicted query proteins to existing resources of reference protein sequences, including NCBI NR [
<xref ref-type="bibr" rid="bbs033-B47">47</xref>
], SMART [
<xref ref-type="bibr" rid="bbs033-B48">48</xref>
] and UniProt/UniRef [
<xref ref-type="bibr" rid="bbs033-B49">49</xref>
]. The IMG/M [
<xref ref-type="bibr" rid="bbs033-B26">26</xref>
] and MG-RAST [
<xref ref-type="bibr" rid="bbs033-B30">30</xref>
] servers also search the publicly available metagenomic data sets for homologs of the query sequences. The databases of clusters of orthologous groups (COGs) [
<xref ref-type="bibr" rid="bbs033-B50">50</xref>
], non-supervised orthologous groups (NOGs) [
<xref ref-type="bibr" rid="bbs033-B51">51</xref>
], protein families and domains including Pfam [
<xref ref-type="bibr" rid="bbs033-B52">52</xref>
] and TIGRFAM [
<xref ref-type="bibr" rid="bbs033-B53">53</xref>
], etc. are used by several pipelines to infer functional categories or to identify families and domains embedded in the query proteins. In some cases, similarities to genes found in the GO database are further explored to infer hierarchical annotations. Pathway and subsystem information for the query proteins is inferred by searching for homologs in the KEGG and SEED databases, respectively, by almost all of the pipelines.</p>
<p>For these searches, different variants of BLAST [
<xref ref-type="bibr" rid="bbs033-B17">17</xref>
] are the most preferred algorithms, including BLASTX, BLASTP, RPS-BLAST, etc. For less sensitive, but faster, searches BLAT [
<xref ref-type="bibr" rid="bbs033-B54">54</xref>
] may also be used, as in the case of the MG-RAST server. Additionally, almost all of the pipelines use more sensitive profile- and pattern-based search methods, in which sequence profiles generated from alignments of protein families in the Pfam or TIGRFAM databases are searched using the hidden Markov model-based algorithm, HMMER [
<xref ref-type="bibr" rid="bbs033-B55">55</xref>
]. For all these methods, best hits are identified based on statistical calculations and annotation information is directly applied to the query proteins.</p>
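<p>The best-hit step described above can be sketched as follows. This is a minimal illustration, not the procedure of any particular pipeline: it assumes hits are available as tab-separated rows with the twelve default columns of BLAST+ tabular output, and the E-value cut-off, the function names and the lookup table of subject annotations are all hypothetical choices made for the example.</p>
<preformat><![CDATA[
```python
import csv
import io

# Default column order of BLAST+ tabular output (-outfmt 6).
FIELDS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
          "qstart", "qend", "sstart", "send", "evalue", "bitscore"]


def best_hits(tabular_text, max_evalue=1e-5):
    """Return {query: (subject, evalue, bitscore)} for the top-scoring hit.

    Hits with an E-value above max_evalue (an assumed cut-off) are ignored.
    """
    best = {}
    reader = csv.reader(io.StringIO(tabular_text), delimiter="\t")
    for row in reader:
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        hit = dict(zip(FIELDS, row))
        evalue = float(hit["evalue"])
        bits = float(hit["bitscore"])
        if evalue > max_evalue:
            continue
        current = best.get(hit["qseqid"])
        if current is None or bits > current[2]:
            best[hit["qseqid"]] = (hit["sseqid"], evalue, bits)
    return best


def transfer_annotation(hits, subject_function):
    """Copy the function of each best-hit subject onto its query.

    subject_function is a hypothetical {subject_id: description} table.
    """
    return {query: subject_function.get(subject, "hypothetical protein")
            for query, (subject, _evalue, _bits) in hits.items()}


# Invented example rows: read1 has two hits, read2 only a weak one.
EXAMPLE = "\n".join([
    "read1\tprotA\t92.0\t100\t8\t0\t1\t100\t1\t100\t1e-40\t150.0",
    "read1\tprotB\t60.0\t90\t36\t0\t1\t90\t5\t94\t1e-12\t70.0",
    "read2\tprotC\t35.0\t40\t26\t0\t1\t40\t10\t49\t0.5\t25.0",
])
```
]]></preformat>
<p>Because the annotation is copied directly from the best-scoring subject, any error in that subject's annotation is inherited by the query, which is one way the database propagation errors discussed below can arise.</p>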
<p>Homology-based approaches mainly suffer from the long computation time required to search for homologs for each of the sequences within the typically massive metagenomic data sets. Additionally, BLAST-based functional predictions have been estimated to include 13–15% database propagation errors [
<xref ref-type="bibr" rid="bbs033-B56">56</xref>
]. Moreover, to detect a true match, the reference database being searched needs to contain at least one homolog of the query sequence. Finally, the fragmentary nature of shotgun-generated metagenomic data, which leads to partial proteins, negatively impacts homology-based function prediction. This is discussed in more detail below.</p>
<p>The extent to which metagenomic functional annotation has been achieved using different databases is demonstrated in
<xref ref-type="fig" rid="bbs033-F2">Figures 2</xref>
and
<xref ref-type="fig" rid="bbs033-F3">3</xref>
. The highest fraction of metagenomic sequences was annotated using the NCBI RefSeq database, which is a comprehensive collection of non-redundant well-annotated protein sequences. On the other hand, only a small fraction of sequences could be annotated using the Swiss-Prot database, which harbors manually annotated and reviewed protein sequences. The number of proteins annotated using the COGs database was slightly lower than with RefSeq. Among the protein family and profile databases, more predictions were made using Pfam than using the TIGRFAM database. This could mainly be due to the greater number of protein families included in the Pfam database (13 672 in Pfam 26.0 release) than in the TIGRFAM database (4209 in TIGRFAM 12.0 release). The annotation using KEGG metabolic pathways is relatively low mainly due to the inherent problems of the metagenomic data sets, as discussed below. The SEED system of classification performs similarly to that of KEGG, although the number of predictions is slightly lower.
<fig id="bbs033-F2" position="float">
<label>Figure 2:</label>
<caption>
<p>Distribution of metagenomic sequence matches in the Swiss-Prot, RefSeq, KEGG and SEED databases at various
<italic>E</italic>
-value cut-offs. Smaller sequences match at lower confidence (higher
<italic>E</italic>
-values; lighter colors) or do not match at all in the databases. More sequences match with higher confidence (lower
<italic>E</italic>
-values; darker colors) as the sequence length used for the analysis increases. Pre-computed data for the metagenomes shown was derived from the MG-RAST server.</p>
</caption>
<graphic xlink:href="bbs033f2"></graphic>
</fig>
<fig id="bbs033-F3" position="float">
<label>Figure 3:</label>
<caption>
<p>Status of functional prediction of protein-coding genes from different metagenomic data sets and representatives of completely sequenced genomes. The overall functional prediction bars represent the fraction of protein-coding genes that map to at least one of four databases: clusters of orthologous groups (COGs), Pfam, TIGRFAM and KEGG pathways. For comparative purposes, the functional annotation status for the well-studied model microbial genome,
<italic>E. coli</italic>
K12-W3110, the smallest microbial genome,
<italic>M. genitalium</italic>
, and the human genome are also shown. The data for this graph was derived from the IMG/M database. It should be noted that for uniform comparison, the prokaryotic COGs version was also used for
<italic>Homo sapiens</italic>
. The number of matches to eukaryotic COGs (KOG database [
<xref ref-type="bibr" rid="bbs033-B57">57</xref>
]) may be higher for
<italic>H. sapiens</italic>
. The numbers next to the bars represent the total number of predicted protein-coding genes in each data set using the IMG/M annotation pipeline. For the Sludge [
<xref ref-type="bibr" rid="bbs033-B58">58</xref>
] community, data from only the Phrap assembly, a widely used program for DNA sequence assembly, was used. Except for the Cow Rumen Viral community [
<xref ref-type="bibr" rid="bbs033-B59">59</xref>
], which was sequenced using the 454 platform (average read-length > 300 bp), all other metagenomes were sequenced using the Sanger method (average read-length > 1000 bp). The following additional data sets were used: Ocean [
<xref ref-type="bibr" rid="bbs033-B60">60</xref>
], Soil [
<xref ref-type="bibr" rid="bbs033-B61">61</xref>
], Acid Mine Drainage [
<xref ref-type="bibr" rid="bbs033-B62">62</xref>
], Human Gut [
<xref ref-type="bibr" rid="bbs033-B63">63</xref>
].</p>
</caption>
<graphic xlink:href="bbs033f3"></graphic>
</fig>
</p>
</sec>
<sec>
<title>Motif- or pattern-based approach</title>
<p>The partial proteins generated from short contigs and unassembled sequences which arise due to short read-lengths or complex environments generally exhibit very poor similarities using homology-based approaches (
<xref ref-type="fig" rid="bbs033-F2">Figure 2</xref>
). Additionally, some proteins, despite sharing a common function, are more diverse at the sequence level. The overall sequence similarity of such proteins is usually lower than the thresholds used for homology-based functional prediction; however, they still share one or more common sequence or structural patterns or motifs necessary to maintain their structure and function. Currently, databases like PROSITE [
<xref ref-type="bibr" rid="bbs033-B64">64</xref>
] and PRINTS [
<xref ref-type="bibr" rid="bbs033-B65">65</xref>
] present a reliable repository of such patterns or motifs against which the query metagenomic sequences may be searched either independently or through the integrated InterPro database [
<xref ref-type="bibr" rid="bbs033-B66">66</xref>
]. Currently, only the IMG/M server incorporates the InterPro database. However, a general problem with motif-based annotation is that short sequence matches typically show low statistical significance and false-positive rates can be high [
<xref ref-type="bibr" rid="bbs033-B67">67</xref>
]. Nevertheless, given the amount of novelty inherent in metagenomic data sets, it is recommended to run motif-based analysis in parallel with other functional prediction approaches.</p>
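<p>To make the idea of a pattern search concrete, the sketch below translates a subset of the PROSITE pattern syntax into a regular expression and scans a protein fragment with it. It is an illustrative assumption, not the search engine used by PROSITE or InterPro: only ‘x’, bracketed and braced residue sets and repetition counts are handled, the function name is hypothetical, and the example peptide is invented. The pattern shown is the widely cited P-loop (ATP/GTP-binding site motif A) consensus.</p>
<preformat><![CDATA[
```python
import re

# One PROSITE element: a residue, x, [allowed] or {forbidden}, plus an
# optional repetition such as (4) or (2,4).
_ELEMENT = re.compile(
    r"(\[[A-Z]+\]|\{[A-Z]+\}|[A-Zx])(?:\((\d+)(?:,(\d+))?\))?")


def prosite_to_regex(pattern):
    """Translate a subset of PROSITE pattern syntax into a regex string.

    Terminal anchors and other rarely used syntax are not handled.
    """
    parts = []
    for element in pattern.rstrip(".").split("-"):
        match = _ELEMENT.fullmatch(element)
        if match is None:
            raise ValueError(f"unsupported PROSITE element: {element!r}")
        core, low, high = match.groups()
        if core == "x":
            core = "."                        # any residue
        elif core.startswith("{"):
            core = "[^" + core[1:-1] + "]"    # any residue except these
        if low is not None:
            core += "{" + low + ("," + high if high else "") + "}"
        parts.append(core)
    return "".join(parts)


P_LOOP = "[AG]-x(4)-G-K-[ST]"
P_LOOP_REGEX = prosite_to_regex(P_LOOP)

# Invented partial protein, as might be predicted from a short read.
FRAGMENT = "MSLAGPSGSGKSTLL"
P_LOOP_MATCH = re.search(P_LOOP_REGEX, FRAGMENT)
```
]]></preformat>
<p>A pattern this short fixes only a handful of positions, so it can match unrelated sequences by chance, which illustrates why motif hits on their own tend to carry low statistical significance.</p>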
</sec>
<sec>
<title>Context-based annotation</title>
<p>Metagenomic data sets contain a large number of novel sequences which share no homology with known sequences and thus remain unannotated by the previous two approaches. To overcome these limitations, gene context-based approaches may also be used. A few examples from single genome annotation projects include genomic neighborhood [
<xref ref-type="bibr" rid="bbs033-B68">68</xref>
,
<xref ref-type="bibr" rid="bbs033-B69">69</xref>
], gene fusion [
<xref ref-type="bibr" rid="bbs033-B70">70</xref>
,
<xref ref-type="bibr" rid="bbs033-B71">71</xref>
], phylogenetic profiling [
<xref ref-type="bibr" rid="bbs033-B72">72</xref>
] and gene co-expression analysis [
<xref ref-type="bibr" rid="bbs033-B73">73</xref>
]. Among these, only the genomic neighborhood approach has been implemented in the case of metagenomics. In 2007, Harrington
<italic>et al</italic>
. [
<xref ref-type="bibr" rid="bbs033-B74">74</xref>
] applied a combination of homology-based searches and customized gene neighborhood methods to four metagenomic data sets derived from a variety of complex environments. Whereas BLAST-based methods alone annotated 70% of the sequences, their combined method inferred specific functions for 76% and non-specific functions for 83% of the sequences. However, due to the paucity of complete genomes in metagenomic data sets and the lack of knowledge about the true species origin of the sequences, this approach has its limitations. These problems may be ameliorated by increasing the sequencing depth and by improving the taxonomic assignment of the sequences. Additionally, better assemblies resulting in longer contigs will also improve the efficiency of context-based annotation methods. Currently, only IMG/M and SmashCommunity [
<xref ref-type="bibr" rid="bbs033-B31">31</xref>
] can be used to view predicted genes in the genomic neighborhood context.</p>
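<p>The genomic neighborhood idea can be illustrated with a toy sketch. This is not the customized method of Harrington et al.; it assumes only that genes on a contig are described by coordinates and strand, and it reports, for each unannotated gene, the functions of directly adjacent same-strand genes within an arbitrary distance. The record layout, the 300 bp threshold and all gene names are hypothetical.</p>
<preformat><![CDATA[
```python
from collections import namedtuple

# function is None for genes that homology searches left unannotated.
Gene = namedtuple("Gene", "gene_id contig start end strand function")


def neighborhood_hints(genes, max_gap=300):
    """Map each unannotated gene to the functions of its annotated,
    same-strand, directly adjacent neighbours within max_gap bp."""
    by_contig = {}
    for gene in genes:
        by_contig.setdefault(gene.contig, []).append(gene)

    hints = {}
    for contig_genes in by_contig.values():
        contig_genes.sort(key=lambda g: g.start)
        for i, gene in enumerate(contig_genes):
            if gene.function is not None:
                continue
            found = []
            for j in (i - 1, i + 1):
                if not 0 <= j < len(contig_genes):
                    continue  # no neighbour on this side of the contig
                other = contig_genes[j]
                # Intergenic distance: later start minus earlier end.
                gap = max(other.start, gene.start) - min(other.end, gene.end)
                if (other.strand == gene.strand and other.function
                        and gap <= max_gap):
                    found.append(other.function)
            hints[gene.gene_id] = found
    return hints


# Invented example: g2 sits 50 bp downstream of an annotated gene.
EXAMPLE_GENES = [
    Gene("g1", "c1", 100, 900, "+", "histidine kinase"),
    Gene("g2", "c1", 950, 1700, "+", None),
    Gene("g3", "c1", 1750, 2400, "-", "transposase"),
    Gene("g4", "c2", 10, 500, "+", None),
]
```
]]></preformat>
<p>In this example the singleton-like contig c2 yields no hint at all, which mirrors the limitation noted above: context-based inference depends on contigs long enough to carry several genes.</p>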
</sec>
<sec>
<title>Other types of functional prediction</title>
<p>Lastly, the putative roles of the metagenomic sequences can also be inferred by running more specific analyses using dedicated tools that target prediction of carbohydrate active enzymes, glycosyl hydrolases, protein localizations, lipoproteins, adhesins, secretory proteins, transporters, CRISPRs (Clustered Regularly Interspaced Short Palindromic Repeats), insertion sequences, virulence factors, etc. A list of a few representative tools for such analysis is given in
<xref ref-type="table" rid="bbs033-T3">Table 3</xref>
. It should be noted that the list is not comprehensive, and that a discussion about all the tools for the above-mentioned purpose is beyond the scope of this review.
<table-wrap id="bbs033-T3" position="float">
<label>Table 3:</label>
<caption>
<p>List of commonly used available resources for functional analysis (other than homology-, motif- and context-based) that can be performed on metagenomic data sets</p>
</caption>
<table frame="hsides" rules="groups">
<thead align="left">
<tr>
<th rowspan="1" colspan="1">Type of prediction</th>
<th rowspan="1" colspan="1">Resource name</th>
<th rowspan="1" colspan="1">URL</th>
</tr>
</thead>
<tbody align="left">
<tr>
<td rowspan="1" colspan="1">Carbohydrate-active enzymes</td>
<td rowspan="1" colspan="1">CAZy</td>
<td rowspan="1" colspan="1">
<ext-link ext-link-type="uri" xlink:href="http://www.cazy.org/">http://www.cazy.org/</ext-link>
</td>
</tr>
<tr>
<td rowspan="1" colspan="1">Glycosyl hydrolases</td>
<td rowspan="1" colspan="1">GAS</td>
<td rowspan="1" colspan="1">
<ext-link ext-link-type="uri" xlink:href="http://csbl.bmb.uga.edu/~ffzhou/GASdb/">http://csbl.bmb.uga.edu/∼ffzhou/GASdb/</ext-link>
</td>
</tr>
<tr>
<td rowspan="4" colspan="1">Protein localization</td>
<td rowspan="1" colspan="1">PSORT</td>
<td rowspan="1" colspan="1">
<ext-link ext-link-type="uri" xlink:href="http://psort.hgc.jp/">http://psort.hgc.jp/</ext-link>
</td>
</tr>
<tr>
<td rowspan="1" colspan="1">Cell-PLoc</td>
<td rowspan="1" colspan="1">
<ext-link ext-link-type="uri" xlink:href="http://www.csbio.sjtu.edu.cn/bioinf/Cell-PLoc/">http://www.csbio.sjtu.edu.cn/bioinf/Cell-PLoc/</ext-link>
</td>
</tr>
<tr>
<td rowspan="1" colspan="1">CELLO</td>
<td rowspan="1" colspan="1">
<ext-link ext-link-type="uri" xlink:href="http://cello.life.nctu.edu.tw/">http://cello.life.nctu.edu.tw/</ext-link>
</td>
</tr>
<tr>
<td rowspan="1" colspan="1">PA-SUB</td>
<td rowspan="1" colspan="1">
<ext-link ext-link-type="uri" xlink:href="http://webdocs.cs.ualberta.ca/~bioinfo/PA/Sub/index.html">http://webdocs.cs.ualberta.ca/∼bioinfo/PA/Sub/index.html</ext-link>
</td>
</tr>
<tr>
<td rowspan="4" colspan="1">Membrane proteins</td>
<td rowspan="1" colspan="1">DAS</td>
<td rowspan="1" colspan="1">
<ext-link ext-link-type="uri" xlink:href="http://www.sbc.su.se/~miklos/DAS/">http://www.sbc.su.se/∼miklos/DAS/</ext-link>
</td>
</tr>
<tr>
<td rowspan="1" colspan="1">HMMTOP</td>
<td rowspan="1" colspan="1">
<ext-link ext-link-type="uri" xlink:href="http://www.enzim.hu/hmmtop/html/submit.html">http://www.enzim.hu/hmmtop/html/submit.html</ext-link>
</td>
</tr>
<tr>
<td rowspan="1" colspan="1">HMM-TM</td>
<td rowspan="1" colspan="1">
<ext-link ext-link-type="uri" xlink:href="http://bioinformatics.biol.uoa.gr/HMM-TM/index.jsp">http://bioinformatics.biol.uoa.gr/HMM-TM/index.jsp</ext-link>
</td>
</tr>
<tr>
<td rowspan="1" colspan="1">TMB-Comp</td>
<td rowspan="1" colspan="1">
<ext-link ext-link-type="uri" xlink:href="http://bmbpcu36.leeds.ac.uk/~andy/betaBarrel/TMB_Hunt_2/TMB_Comp.cgi">http://bmbpcu36.leeds.ac.uk/∼andy/betaBarrel/TMB_Hunt_2/TMB_Comp.cgi</ext-link>
</td>
</tr>
<tr>
<td rowspan="5" colspan="1">Lipoproteins</td>
<td rowspan="1" colspan="1">DOLOP</td>
<td rowspan="1" colspan="1">
<ext-link ext-link-type="uri" xlink:href="http://www.mrc-lmb.cam.ac.uk/genomes/dolop/dolop.htm">http://www.mrc-lmb.cam.ac.uk/genomes/dolop/dolop.htm</ext-link>
</td>
</tr>
<tr>
<td rowspan="1" colspan="1">LIPO</td>
<td rowspan="1" colspan="1">
<ext-link ext-link-type="uri" xlink:href="http://services.cbu.uib.no/tools/lipo">http://services.cbu.uib.no/tools/lipo</ext-link>
</td>
</tr>
<tr>
<td rowspan="1" colspan="1">SignalP</td>
<td rowspan="1" colspan="1">
<ext-link ext-link-type="uri" xlink:href="http://www.cbs.dtu.dk/services/SignalP/">http://www.cbs.dtu.dk/services/SignalP/</ext-link>
</td>
</tr>
<tr>
<td rowspan="1" colspan="1">LipoP</td>
<td rowspan="1" colspan="1">
<ext-link ext-link-type="uri" xlink:href="http://www.cbs.dtu.dk/services/LipoP/">http://www.cbs.dtu.dk/services/LipoP/</ext-link>
</td>
</tr>
<tr>
<td rowspan="1" colspan="1">PRED-LIPO</td>
<td rowspan="1" colspan="1">
<ext-link ext-link-type="uri" xlink:href="http://bioinformatics.biol.uoa.gr/PRED-LIPO/input.jsp">http://bioinformatics.biol.uoa.gr/PRED-LIPO/input.jsp</ext-link>
</td>
</tr>
<tr>
<td rowspan="4" colspan="1">Secretory proteins (signal peptide Type I)</td>
<td rowspan="1" colspan="1">Tatfind</td>
<td rowspan="1" colspan="1">
<ext-link ext-link-type="uri" xlink:href="http://signalfind.org/tatfind.html">http://signalfind.org/tatfind.html</ext-link>
</td>
</tr>
<tr>
<td rowspan="1" colspan="1">TatP</td>
<td rowspan="1" colspan="1">
<ext-link ext-link-type="uri" xlink:href="http://www.cbs.dtu.dk/services/TatP/">http://www.cbs.dtu.dk/services/TatP/</ext-link>
</td>
</tr>
<tr>
<td rowspan="1" colspan="1">SignalP</td>
<td rowspan="1" colspan="1">
<ext-link ext-link-type="uri" xlink:href="http://www.cbs.dtu.dk/services/SignalP/">http://www.cbs.dtu.dk/services/SignalP/</ext-link>
</td>
</tr>
<tr>
<td rowspan="1" colspan="1">PrediSi</td>
<td rowspan="1" colspan="1">
<ext-link ext-link-type="uri" xlink:href="http://www.predisi.de/index.html">http://www.predisi.de/index.html</ext-link>
</td>
</tr>
<tr>
<td rowspan="1" colspan="1">Adhesins</td>
<td rowspan="1" colspan="1">SPAAN</td>
<td rowspan="1" colspan="1">Sachdeva
<italic>et al</italic>
. 2004 [
<xref ref-type="bibr" rid="bbs033-B75">75</xref>
]</td>
</tr>
<tr>
<td rowspan="3" colspan="1">Transporters</td>
<td rowspan="1" colspan="1">TransportTP</td>
<td rowspan="1" colspan="1">
<ext-link ext-link-type="uri" xlink:href="http://bioinfo3.noble.org/transporter/">http://bioinfo3.noble.org/transporter/</ext-link>
</td>
</tr>
<tr>
<td rowspan="1" colspan="1">TransAAP</td>
<td rowspan="1" colspan="1">
<ext-link ext-link-type="uri" xlink:href="http://www.membranetransport.org/transaap/TransAAP_login.html">http://www.membranetransport.org/transaap/TransAAP_login.html</ext-link>
</td>
</tr>
<tr>
<td rowspan="1" colspan="1">TCDB</td>
<td rowspan="1" colspan="1">
<ext-link ext-link-type="uri" xlink:href="http://www.tcdb.org/">http://www.tcdb.org/</ext-link>
</td>
</tr>
<tr>
<td rowspan="1" colspan="1">Insertion sequences</td>
<td rowspan="1" colspan="1">ISsaga</td>
<td rowspan="1" colspan="1">
<ext-link ext-link-type="uri" xlink:href="http://issaga.biotoul.fr/ISsaga/issaga_index.php">http://issaga.biotoul.fr/ISsaga/issaga_index.php</ext-link>
</td>
</tr>
<tr>
<td rowspan="2" colspan="1">CRISPRs</td>
<td rowspan="1" colspan="1">PILER</td>
<td rowspan="1" colspan="1">
<ext-link ext-link-type="uri" xlink:href="http://www.drive5.com/pilercr/">http://www.drive5.com/pilercr/</ext-link>
</td>
</tr>
<tr>
<td rowspan="1" colspan="1">CRISPRfinder</td>
<td rowspan="1" colspan="1">
<ext-link ext-link-type="uri" xlink:href="http://crispr.u-psud.fr/Server/">http://crispr.u-psud.fr/Server/</ext-link>
</td>
</tr>
<tr>
<td rowspan="2" colspan="1">Repeats</td>
<td rowspan="1" colspan="1">Tandem Repeats Finder</td>
<td rowspan="1" colspan="1">
<ext-link ext-link-type="uri" xlink:href="http://tandem.bu.edu/trf/trf.html">http://tandem.bu.edu/trf/trf.html</ext-link>
</td>
</tr>
<tr>
<td rowspan="1" colspan="1">EMBOSS</td>
<td rowspan="1" colspan="1">
<ext-link ext-link-type="uri" xlink:href="http://emboss.sourceforge.net/">http://emboss.sourceforge.net/</ext-link>
</td>
</tr>
<tr>
<td rowspan="2" colspan="1">Virulence factors</td>
<td rowspan="1" colspan="1">VFDB</td>
<td rowspan="1" colspan="1">
<ext-link ext-link-type="uri" xlink:href="http://www.mgc.ac.cn/VFs/">http://www.mgc.ac.cn/VFs/</ext-link>
</td>
</tr>
<tr>
<td rowspan="1" colspan="1">MvirDB</td>
<td rowspan="1" colspan="1">
<ext-link ext-link-type="uri" xlink:href="http://predictioncenter.llnl.gov/">http://predictioncenter.llnl.gov/</ext-link>
</td>
</tr>
</tbody>
</table>
</table-wrap>
</p>
</sec>
</sec>
<sec>
<title>GENE-CENTRIC ANALYSIS OF METAGENOMIC DATA SETS</title>
<p>To explore the effect of environment on the functional and metabolic contents of different communities, comparative functional analysis may be performed on the total gene-content of the communities, i.e. gene-centric analysis. For this purpose, functional profiles can be compared and contrasted across different metagenomic data sets to look for functional characteristics responsible for community differences. Normally two levels of comparison are performed, viz., comparison of abundance of functional families and pathways, and estimation of statistical parameters to ensure that the observed differences in abundance are not merely chance occurrences. Different types of abundance profiles may be generated and compared using, for example, COGs functional categories, Pfam functional families, KEGG metabolic pathways, or SEED subsystems. However, before comparing the metagenomes, proper normalizations of the data sets should be performed to account for the data-associated problems, such as partial genes and effective genome sizes (discussed later). Heat-maps are commonly used to visualize the differences in communities with respect to the above-mentioned functional or metabolic profiles (for example [
<xref ref-type="bibr" rid="bbs033-B60">60</xref>
,
<xref ref-type="bibr" rid="bbs033-B61">61</xref>
,
<xref ref-type="bibr" rid="bbs033-B76 bbs033-B77 bbs033-B78">76–78</xref>
]). In addition, statistical methods, such as principal component analysis (PCA) and multidimensional scaling (MDS), may be used to reveal which factors most affect the observed data (for example [
<xref ref-type="bibr" rid="bbs033-B79">79</xref>
,
<xref ref-type="bibr" rid="bbs033-B80">80</xref>
]). The common approaches and limitations of the gene-centric analysis are discussed and reviewed by Kunin
<italic>et al</italic>
. [
<xref ref-type="bibr" rid="bbs033-B3">3</xref>
].</p>
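<p>The two levels of comparison described above, abundance and statistical support, can be sketched in a few lines. The example below is only one simple possibility and is not taken from any of the cited studies: it converts raw hit counts to relative abundances and applies a pooled two-proportion z statistic to one category. The category names and counts are invented, and a real analysis would also need correction for multiple testing across categories.</p>
<preformat><![CDATA[
```python
import math


def relative_abundance(counts):
    """Convert raw hit counts per functional category to fractions."""
    total = sum(counts.values())
    return {category: n / total for category, n in counts.items()}


def two_proportion_z(count_a, total_a, count_b, total_b):
    """z statistic for the difference between two proportions,
    using the pooled estimate of the variance."""
    p_a = count_a / total_a
    p_b = count_b / total_b
    pooled = (count_a + count_b) / (total_a + total_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / total_a + 1 / total_b))
    return (p_a - p_b) / se


# Invented counts of annotated genes per category in two metagenomes.
SAMPLE_A = {"Carbohydrate metabolism": 300, "Motility": 50, "Other": 650}
SAMPLE_B = {"Carbohydrate metabolism": 150, "Motility": 100, "Other": 750}

PROFILE_A = relative_abundance(SAMPLE_A)
PROFILE_B = relative_abundance(SAMPLE_B)
Z_CARBOHYDRATE = two_proportion_z(
    SAMPLE_A["Carbohydrate metabolism"], sum(SAMPLE_A.values()),
    SAMPLE_B["Carbohydrate metabolism"], sum(SAMPLE_B.values()))
```
]]></preformat>
<p>A large absolute z value suggests that the difference in relative abundance is unlikely to be a chance occurrence, provided the profiles have first been normalized as discussed in the text.</p>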
</sec>
<sec>
<title>PROBLEMS ASSOCIATED WITH FUNCTIONAL ANALYSIS OF METAGENOMIC DATA</title>
<p>The analysis and annotation of metagenomic data sets differ from that of whole genome studies mainly because the former is a complex mixture of sequences from multiple species. Even draft quality bacterial whole genome sequences represent most of the chromosomes, except for a few of the more complex regions that include repeats, insertion sequences, tRNAs, rRNAs, etc. When sequence coverage is sufficient, the assemblies obtained usually result in very long contigs with few gaps. The efficiency of gene prediction algorithms on such long contigs is quite high and most of the full-length coding DNA sequences (CDSs) can be predicted with high confidence. Functional prediction analysis can next be applied to obtain the functional repertoire of the genome. The functionally annotated CDSs can then be viewed in the context of metabolic pathways to predict the metabolic capabilities of the species under study.</p>
<p>A metagenome can be viewed as a collection of several whole genomes. To fully understand an environment, in principle, draft quality whole genome sequences for every member should be achieved by complete DNA sequencing. However, in spite of the availability of high throughput second-generation sequencers, this is still a very expensive and daunting task. What can be best captured from a metagenomic sample is a mixture of fragmented sequences from the community members, and mostly from dominant members of the environment. When the sequencing depth is sufficient, and by the use of sequence assemblers developed specifically for metagenomic data (
<xref ref-type="table" rid="bbs033-T1">Table 1</xref>
), draft quality assemblies for some of the member species may be achieved; e.g. a draft methanogen genome was recently assembled from a permafrost microbial community [
<xref ref-type="bibr" rid="bbs033-B78">78</xref>
]. However, this still did not suffice for completely understanding the environment, as the assemblies for many other members remained poor due to the inherent complexity of the environments and lower sequencing coverage for these genomes. Thus, for most metagenomic studies, we are left with only enormous volumes of fragmented sequences (a mixture of short contigs and singletons) from multiple species on which to perform analysis. In the case of contigs, gene predictions will be more accurate, whereas the predicted genes from singletons will almost always be partial in spite of using gene prediction tools specifically developed for metagenomic data (
<xref ref-type="table" rid="bbs033-T1">Table 1</xref>
), unless very long read-lengths were obtained during sequencing. This is mainly because the typical average read-lengths generated by next-generation sequencers providing deeper coverage, including Illumina, are still smaller (up to 300 bp for paired-end reads) than the average size of the typical prokaryotic protein coding gene (∼1000 bp [
<xref ref-type="bibr" rid="bbs033-B81">81</xref>
]). The 454 pyrosequencing platform can be an alternative technology due to the longer average read-lengths it can generate (up to 700 bp for 454 GS FLX+ pyrosequencer,
<ext-link ext-link-type="uri" xlink:href="http://454.com/downloads/GSFLXApplicationFlyer_FINALv2.pdf">http://454.com/downloads/GSFLXApplicationFlyer_FINALv2.pdf</ext-link>
), but it is not the preferred choice mainly due to its lower coverage and higher cost as compared to Illumina sequencing.</p>
<p>To obtain the most complete information of the functional repertoire for any metagenome it is recommended to use the genes predicted from both the contigs and the singletons, even though many of the predicted CDSs are partial. In general, short query lengths negatively impact homology-based functional prediction as they may decrease the significance of pairwise similarities due to added noise. This is clearly evident from
<xref ref-type="fig" rid="bbs033-F2">Figure 2</xref>
, which shows that there are no matches for sequences of length ∼100 bp for the ‘Cow Rumen’ metagenome [
<xref ref-type="bibr" rid="bbs033-B79">79</xref>
] in the lower and more significant
<italic>E</italic>
-value bins (
<italic>E</italic>
-value < 1
<italic>e</italic>
 − 10). On the other hand, as sequence length increases, the
<italic>E</italic>
-value bins with lower values become more populated, as in the case of the ‘Human Gut Japanese’ [
<xref ref-type="bibr" rid="bbs033-B39">39</xref>
] data set. Additionally, for short sequence lengths, homology-based approaches have limited sensitivity. For example, only ∼25% of the ‘Cow Rumen’ sequences could be annotated using GenBank, whereas >75% of the ‘Human Gut Japanese’ sequences could be annotated using the same database with the same parameters (
<xref ref-type="fig" rid="bbs033-F2">Figure 2</xref>
). These problems may be ameliorated to some extent by increasing sequencing depth or read-length so that better assemblies and gene predictions can be obtained.</p>
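<p>The link between query length and achievable significance follows from the standard Karlin-Altschul relation, E = m × n × 2^(−S′), where m and n are the query and database lengths and S′ is the bit score. The sketch below uses this relation under simplifying assumptions: raw rather than effective lengths are used, and the database size of 10^9 residues is an arbitrary illustrative value, not a property of any database named in the text.</p>
<preformat><![CDATA[
```python
import math


def expected_hits(bit_score, query_len, db_len):
    """Karlin-Altschul expectation: E = m * n * 2**(-S'), S' in bits."""
    return query_len * db_len * 2.0 ** (-bit_score)


def bits_required(evalue, query_len, db_len):
    """Bit score needed to reach a given E-value in this search space."""
    return math.log2(query_len * db_len / evalue)


DB_RESIDUES = 1e9  # assumed database size, for illustration only

# A ~100 bp read encodes about 33 residues; a ~1000 bp gene about 333.
for residues in (33, 333):
    bits = bits_required(1e-10, residues, DB_RESIDUES)
    print(f"{residues} aa query: {bits:.1f} bits needed "
          f"({bits / residues:.2f} bits per residue)")
```
]]></preformat>
<p>Under these assumptions the short query must contribute roughly ten times more score per residue than the full-length one to reach the same E-value, which leaves little room for mismatches and is consistent with the empty low E-value bins seen for ∼100 bp sequences.</p>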
<p>Another problem in metagenomic functional analysis stems from the lack of knowledge of the species of origin of the sequences. Although phylogenetic classification and binning methods specific to metagenomic sequences may be able to classify 40–93% of the reads [
<xref ref-type="bibr" rid="bbs033-B19">19</xref>
] at the genus level, depending on the novelty of the data set, at the species level this percentage is expected to decrease. This indicates that at least 7–60% of the sequences still remain unclassified due to the limitations of the available tools and the paucity of reference genomes in the public databases. Thus, in spite of gaining some functional information, due to the absence of specific species information, it is extremely difficult to put together many functionally annotated metagenomic sequences in the context of their actual metabolic pathways. Additionally, because most of the metagenomic sequences will be derived from the dominant species, the complete functional and metabolic repertoire of the less abundant members cannot be obtained. Other techniques complementary to metagenomics, such as single cell genomics [
<xref ref-type="bibr" rid="bbs033-B82">82</xref>
], may help in overcoming this problem by providing access to the genomic DNA from unculturable microbes. However, even single cell genomics has many challenges remaining [
<xref ref-type="bibr" rid="bbs033-B82">82</xref>
]. Nevertheless, if the objective of the metagenomic study is only to analyze the overall metabolic capacity of the entire community, then putting the sequences in the context of their individual genomes of origin may not pose a serious problem.</p>
<p>Given that metagenomic studies are aimed at exploring complex environments harboring many yet uncultured and unknown microbes, the data sets are expected to possess a large number of novel sequences. As shown in
<xref ref-type="fig" rid="bbs033-F3">Figure 3</xref>
, the overall functional annotation achieved in the case of some example bacterial metagenomes is 50–75%, with the remaining sequences being unannotated. Even for ‘complete’ genomes, functional annotation is not complete. In the most studied model organism,
<italic>Escherichia coli</italic>
K12-W3110, and the smallest studied genome,
<italic>Mycoplasma genitalium</italic>
, both of which are considered ‘simpler’ systems, the overall functional annotation remains ∼90%. And, in a more complex system viz., the human genome, only ∼82% of the predicted proteins are currently annotated. For the even more complex human gut metagenome, this number decreases to ∼75%. Interestingly, while ocean and soil are also considered as ‘complex metagenomes’ on the scale of the human gut microbiome, only ∼50–55% of the sequences in these communities can be annotated. This difference in level of annotation could be due to a bias in the number of human-associated microbial genomes that have thus far been sequenced and are included in the reference sequence databases. To deal with the novelty of metagenomic data, reference genome sequencing efforts should be initiated for other environments as has been done under the Human Microbiome Project [
<xref ref-type="bibr" rid="bbs033-B83">83</xref>
], which plans to sequence a large number of reference genomes from different body sites for the human microbiome.</p>
<p>While the functional annotation of bacterial metagenomes is at a reasonable level and is gradually improving, the situation for viral metagenomes, or viromes, lags far behind. The extent of virome annotation for cow rumen [
<xref ref-type="bibr" rid="bbs033-B59">59</xref>
] and human lung [
<xref ref-type="bibr" rid="bbs033-B80">80</xref>
] drops to as low as 13–15% (
<xref ref-type="fig" rid="bbs033-F4">Figure 4</xref>
) in comparison to bacterial annotation (cow rumen: 32%) for similar environments. The average metagenomic read-length used for the human lung virome was only 84 bp. One might argue that this reduction in the percentage of functional annotation may be due to the short read-length, which is known to affect the extent and confidence level of the functional prediction process, as discussed earlier. But, surprisingly, the percentage of functional annotation for the cow rumen virome is also low (15%), despite using a longer read-length (>300 bp). Thus, this reduction in the extent of functional prediction for viromes could be mainly due to the limited number of completely sequenced viral species in the reference databases.
<fig id="bbs033-F4" position="float">
<label>Figure 4:</label>
<caption>
<p>Status of functional prediction for viral metagenomes. The bars for the Cow Rumen viral metagenome data set represent the percentage of genes predicted from assembled contigs, while those for the Human Lung viral metagenome data set [
<xref ref-type="bibr" rid="bbs033-B80">80</xref>
] represent the percentage of raw reads.</p>
</caption>
<graphic xlink:href="bbs033f4"></graphic>
</fig>
</p>
<p>The genome sizes of the individual microbial members of a community can vary greatly. It is known that larger genomes harbor a smaller relative fraction of universal and housekeeping genes, and thus contain a large number of novel genes [
<xref ref-type="bibr" rid="bbs033-B84">84</xref>
,
<xref ref-type="bibr" rid="bbs033-B85">85</xref>
]. Indeed, a weakly significant positive correlation was found between the effective genome size and the potential for carrying novel genes [
<xref ref-type="bibr" rid="bbs033-B86">86</xref>
]. Therefore, the average genome size in an environmental sample could also affect the comparative functional analysis of the metagenome. Recently, Beszteri
<italic>et al</italic>
. [
<xref ref-type="bibr" rid="bbs033-B87">87</xref>
] demonstrated how, among metagenomic samples, the differences in relative gene abundance, which are often used to interpret habitat-specific adaptations, are biased by the average genome size of the communities sampled. Thus, before arriving at biological conclusions from functional analysis of metagenomic data sets, the latter should be normalized to account for their different average genome sizes.</p>
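<p>The normalization recommended above can be illustrated with a minimal sketch. It assumes that an estimate of the average genome size is already available for each sample and expresses raw hit counts per genome equivalent. The function name, its parameters and the choice of genome equivalents as the unit are illustrative assumptions, not the procedure of the cited study.</p>
<preformat>
```python
def normalize_by_genome_size(gene_counts, total_bases, avg_genome_size):
    """Convert raw per-function hit counts to hits per genome equivalent.

    gene_counts     : dict mapping a function label to its raw hit count
    total_bases     : total number of sequenced bases in the sample
    avg_genome_size : estimated average genome size of the community, in bp
                      (assumed to be supplied by the caller)
    """
    if not (total_bases > 0 and avg_genome_size > 0):
        raise ValueError("total_bases and avg_genome_size must be positive")
    # Number of average-sized genomes the sequenced bases correspond to.
    genome_equivalents = total_bases / avg_genome_size
    return {
        function: count / genome_equivalents
        for function, count in gene_counts.items()
    }
```
</preformat>
<p>Under this sketch, two samples with the same raw count and the same sequencing depth receive different normalized values if their average genome sizes differ: the sample with the larger average genome size contains fewer genome equivalents and therefore yields a higher per-genome value.</p>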
<p>Apart from the aforementioned problems, the analysis of metagenomic data sets can also be influenced by the sequencing technology used. For example, 454 pyrosequencing technology produces 11–35% artificial replicates, comprising both identical reads (duplicates) and reads that begin at the same position but vary in length or contain sequencing discrepancies, which lead to biased functional annotations [
<xref ref-type="bibr" rid="bbs033-B88">88</xref>
]. Replicates were also observed in an Illumina sequenced permafrost microbial community analysis [
<xref ref-type="bibr" rid="bbs033-B78">78</xref>
]. Thus, the metagenomic reads should be de-replicated before in-depth functional analysis is performed. Both 454 pyrosequencing and the more recent Ion Torrent sequencing technologies are known to introduce frameshift errors in the reads, mostly due to homopolymer runs. Almost none of the available bioinformatics tools for functional annotation of metagenomic sequences are capable of handling such errors, although several specialized tools for frameshift detection are currently available [
<xref ref-type="bibr" rid="bbs033-B89 bbs033-B90 bbs033-B91 bbs033-B92 bbs033-B93">89–93</xref>
] in the public domain and should be used for more in-depth functional analysis. In some cases, the protocols used for sample preparation, particularly the use of filters or other sample selection methods, can also lead to inappropriate biological interpretations. For example, in the first Sargasso Sea data set [
<xref ref-type="bibr" rid="bbs033-B94">94</xref>
], some nitrogen-fixing genes were found to be lacking [
<xref ref-type="bibr" rid="bbs033-B95">95</xref>
]. However, the lack of these genes was later attributed to the absence of their main contributors, cyanobacteria, which were likely removed during the filtration step [
<xref ref-type="bibr" rid="bbs033-B96">96</xref>
].</p>
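<p>The de-replication step recommended above can be sketched as follows. This simplified example treats reads that share the same leading bases as replicates and keeps the longest read of each group. The function name and the prefix length are hypothetical choices made for illustration. Replicates with sequencing discrepancies inside the prefix are not detected by this sketch, and genuinely independent reads that happen to share a prefix would be collapsed.</p>
<preformat>
```python
def dereplicate(reads, prefix_len=20):
    """Keep one representative per group of reads sharing the same prefix.

    reads      : iterable of (read_id, sequence) pairs
    prefix_len : number of leading bases used as the replicate key
                 (an assumed value, not taken from any published tool)
    """
    best = {}    # prefix -> (read_id, sequence) of the longest read seen
    order = []   # prefixes in first-seen order, for deterministic output
    for read_id, seq in reads:
        key = seq[:prefix_len].upper()
        if key not in best:
            best[key] = (read_id, seq)
            order.append(key)
        elif len(seq) > len(best[key][1]):
            # Same start, longer read: prefer the longer representative.
            best[key] = (read_id, seq)
    return [best[key] for key in order]
```
</preformat>
<p>Exact duplicates are removed as a special case, because identical reads necessarily share the same prefix.</p>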
</sec>
<sec>
<title>APPLICATIONS OF METAGENOMIC FUNCTIONAL ANALYSIS</title>
<p>Despite the challenges for metagenomic functional analysis, many studies exploring different environments are being conducted with varying degrees of success. The application of metagenomic functional analysis is an extremely important and versatile subject, and, given the scope of the current review, it is impossible to discuss it comprehensively here. Therefore, to exemplify the successful use of metagenomic functional analysis to address some biologically and environmentally important issues, a few recent example studies are presented in the following sections. For a discussion of other studies of major interest, we recommend the comprehensive review by Wooley
<italic>et al</italic>
. [
<xref ref-type="bibr" rid="bbs033-B4">4</xref>
].</p>
<sec>
<title>Comparative metagenomic-based studies</title>
<p>Recently, in a large-scale metagenomic analysis of 124 European individuals, a catalogue of over 3.3 million human gut microbial genes was created [
<xref ref-type="bibr" rid="bbs033-B97">97</xref>
]. This led to the identification of bacterial functions that are necessary for a bacterium to thrive in the gut context, and of those functions involved in homeostasis of the entire ecosystem. This catalogue not only provides a good resource for annotating new human gut-related metagenomes and for comparative analysis, but also enables future studies to discover associations between the microbial genes and human phenotypes. In another study, the gut metagenomes of four healthy individuals were compared to those of individuals with autoimmune disorders, including type I diabetes [
<xref ref-type="bibr" rid="bbs033-B98">98</xref>
]. This analysis suggested that increased adhesion and flagella synthesis in diseased individuals may be involved in triggering the type I diabetes-associated autoimmune response. Recently, a comparison between the human gut environment and the oral cavity was made by comparing the two metagenomes, and clear distinctions in the functional capacities of the two niches were observed [
<xref ref-type="bibr" rid="bbs033-B99">99</xref>
]. In the same study, another comparison between oral metagenomes from supragingival dental plaque and cavities of healthy and diseased individuals, respectively, suggested that the dental plaque of healthy individuals (those who have never suffered from caries) may be a genetic reservoir for novel anticaries compounds and probiotics, which are live microorganisms thought to be beneficial to the host organism.</p>
<p>Metagenomics studies to date have not only aimed at exploring human health-related issues, but have also attempted to address various environmental issues. Global warming resulting from the emission of greenhouse gases is a major concern worldwide. Rising global temperatures cause permafrost, a vast reservoir of natural carbon, to thaw, resulting in microbial degradation of organic matter and emission of more greenhouse gases. Comparative metagenomics of permafrost was recently applied to both the frozen and thawed states to analyze the shifts in microbial and functional composition [
<xref ref-type="bibr" rid="bbs033-B78">78</xref>
]. Multiple genes involved in carbon and nitrogen cycling were found to shift rapidly during thaw. From this study, important insights about the microbial species and functional components involved in greenhouse gas emissions may be obtained.</p>
</sec>
<sec>
<title>Metagenomic data-mining-based studies</title>
<p>The natural diversity and abundance of metagenomic data are enormous. Over 300 independent metagenomic projects have already been completed or are underway. This provides a great opportunity for in-depth mining of metagenomic data and exploration of novel gene candidates useful in a variety of scenarios. For example, metagenomic data sets from 10 diverse sources were used to identify several novel candidates for commercially useful enzymes (CUEs) [
<xref ref-type="bibr" rid="bbs033-B100">100</xref>
]. A catalogue of 510 CUEs was prepared using literature search followed by manual curation, and then the catalogue was used to find homologues in the metagenomic data sets. High-throughput functional metagenomic screening may be used to look for the presence of CUEs and other specific enzymes of interest in the metagenomes [
<xref ref-type="bibr" rid="bbs033-B101">101</xref>
]. In another study, the recruitment of genomes from pathogens against the metagenomes of healthy individuals containing commensal strains of the same species was used to identify the genomic regions of individual bacterial isolates missing in the metagenomes [
<xref ref-type="bibr" rid="bbs033-B102">102</xref>
]. These regions are referred to as metagenomic islands and are found to harbor several virulence-related genes specific to the pathogenic strain.</p>
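<p>The idea of locating genomic regions that are missing from a metagenome can be sketched as a scan over per-base recruitment coverage. The example assumes that a coverage vector for the isolate genome has already been computed from the recruited reads. The function name, the minimum region length and the depth threshold are illustrative assumptions and do not reproduce the method of the cited study.</p>
<preformat>
```python
def find_uncovered_regions(coverage, min_len=3, max_depth=0):
    """Return half-open (start, end) intervals with depth at most max_depth.

    coverage  : list of per-base recruitment depths along the isolate genome
    min_len   : shortest run of low-coverage positions worth reporting
    max_depth : highest depth still counted as "missing" (assumed threshold)
    """
    regions = []
    start = None  # start index of the current low-coverage run, if any
    for pos, depth in enumerate(coverage):
        if depth > max_depth:
            # A covered position closes any open low-coverage run.
            if start is not None and pos - start >= min_len:
                regions.append((start, pos))
            start = None
        elif start is None:
            start = pos
    # Close a run that extends to the end of the genome.
    if start is not None and len(coverage) - start >= min_len:
        regions.append((start, len(coverage)))
    return regions
```
</preformat>
<p>In practice the minimum length would be far larger than in this toy setting, and the candidate regions would still need to be inspected for gene content before being interpreted as islands.</p>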
</sec>
</sec>
<sec sec-type="conclusions">
<title>CONCLUSIONS</title>
<p>Metagenomic sequencing provides a unique opportunity to explore yet unknown environments in great detail. Functional analysis of the metagenomic data plays a central role in such studies by providing important clues about functional and metabolic diversity, as well as variation. While metagenomic studies continue to suffer from certain caveats that make the downstream data analysis a challenging task for bioinformaticians, the gradual improvement in metagenomic technologies and development of tools and resources that account for the known problems will relieve some of the burdens. For example, the use of next-generation sequencers producing longer read-lengths (>300 bp) will usually lead to better sequence coverage. This can then be followed by the use of sequence assembly and gene prediction tools and parameters specifically developed for metagenomic sequences which will further help in improving assembly and gene prediction efficiency, respectively, and will result in a greater number of complete predicted proteins. Better functional assignments for metagenomic data sets can be obtained by using more complete proteins. However, while comparing the abundance profiles of functions between communities, the frequencies of the functions should not be masked by the assembly, and the read depths of the contigs should be accounted for. Another common problem that is usually encountered in metagenomic data functional analysis is the long computational time that is required for BLAST-based homology searches for orthologs. The use of alternative search algorithms, such as BLAT, can provide analysis results in shorter times; however, the loss of sensitivity by BLAT-based searches should be taken into account when analyzing the results. Alternatively, profile-based search methods using the HMMER algorithm may also be used whenever pre-computed sequence profiles are available. 
Certain issues, including large volumes of metagenomic sequence data, large storage requirements for the analyzed data, and the typically large number of unknown sequences in the metagenomic data, still pose serious challenges for its analysis. Therefore, there is a great need for the development of new, faster, more sensitive tools and more thorough resources dedicated to the functional analysis of metagenomic data sets. Also, it is strongly advised that, when analyzing the data, one should be aware of any additional factors that can influence the functional analysis, including sample preparation, sequencing method and diversity of the environments. Proper calibrations, normalizations and statistical tests for significance should always be performed in order to arrive at the most reliable conclusions.</p>
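<p>The point made above about accounting for the read depths of contigs when comparing functional abundance profiles can be sketched as follows. Each predicted gene contributes the mean read depth of its contig, rather than a count of one, to the total for its function. The function name and the input layout are hypothetical, and the mean depth per contig is assumed to have been computed beforehand by mapping reads back to the assembly.</p>
<preformat>
```python
from collections import defaultdict


def depth_weighted_profile(genes, contig_depth):
    """Return relative functional abundances weighted by contig read depth.

    genes        : iterable of (contig_id, function_label) pairs,
                   one pair per predicted gene
    contig_depth : dict mapping contig_id to its mean read depth
    """
    totals = defaultdict(float)
    for contig_id, function in genes:
        # A gene on a deeply covered contig represents many source reads.
        totals[function] += contig_depth[contig_id]
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}
    return {
        function: depth / grand_total
        for function, depth in totals.items()
    }
```
</preformat>
<p>Counting genes alone would give a function found on two shallow contigs more weight than a function found on one deeply covered contig; weighting by depth reverses that ordering when the single contig recruits more reads.</p>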
<p>DNA sequence-based metagenomic functional analysis is limited in that it only provides information about the functional content of an environment. Thus, it may be complemented by other independent approaches that help to gain further insights about the more dynamic aspects of a given community. For example, a few metatranscriptomic projects have been undertaken to address which genes are actually being expressed in different environments and to what extent [
<xref ref-type="bibr" rid="bbs033-B103">103</xref>
,
<xref ref-type="bibr" rid="bbs033-B104">104</xref>
]. Given that proteins are much more stable than mRNAs [
<xref ref-type="bibr" rid="bbs033-B105">105</xref>
], a proteome-based analysis is expected to provide a more accurate view of the functionality of a given environment. Toward this, a few metaproteomic studies have been conducted to explore which protein products are formed and how they are involved in the cross-talk within the environment under different conditions [
<xref ref-type="bibr" rid="bbs033-B106 bbs033-B107 bbs033-B108 bbs033-B109">106–109</xref>
]. The metabolome, which represents the complete set of small molecules in an organism, can influence gene expression and protein function. Therefore, metabolomics also plays a key role in understanding cellular systems and decoding the functions of genes [
<xref ref-type="bibr" rid="bbs033-B110">110</xref>
,
<xref ref-type="bibr" rid="bbs033-B111">111</xref>
]. A few metabolomic analyses have been conducted to determine which metabolites are produced by the metabolic pathways that are active in a given community and to study host-microbe interactions [
<xref ref-type="bibr" rid="bbs033-B112 bbs033-B113 bbs033-B114 bbs033-B115 bbs033-B116 bbs033-B117">112–117</xref>
]. Another alternative to the DNA-based studies used for determining microbial community composition, metalipidomics, is being implemented mainly to identify the living microbial cells in an environment [
<xref ref-type="bibr" rid="bbs033-B118">118</xref>
]. Intact polar lipids (IPLs), which are the basic building blocks of biomembranes, are ubiquitous in nature and have several characteristics that make them useful as proxies for living microbial cells. To date, metalipidomic studies have not been directly used for the functional analysis of environments. However, studies seeking to identify microbes of specific functional interest may be conducted, as has been done for ammonia-oxidizing microbes from marine and estuarine sediments [
<xref ref-type="bibr" rid="bbs033-B119">119</xref>
]. The functional component of the environment may then be extensively analyzed using different approaches to gain more insights about the cross-talk taking place in that environment. Thus, the application of metalipidomics to study host-associated microbial composition and functional analysis, while not yet explored, appears promising.</p>
<p>
<boxed-text id="bbs033-BOX1" position="float">
<caption>
<title>KEY POINTS</title>
</caption>
<p>
<list list-type="bullet">
<list-item>
<p>Read-lengths generated during metagenomic sequencing influence assembly, gene prediction and eventually functional analysis. The enormous volume of sequence data, which leads to long computational times and massive storage requirements, also impedes metagenomic functional prediction.</p>
</list-item>
<list-item>
<p>Factors that potentially influence functional analysis of metagenomic data, including sample preparation, sequencing method and average genome size, should be considered prior to analysis.</p>
</list-item>
<list-item>
<p>A higher fraction of metagenomic sequences are annotated using BLAST against data-rich reference sequence databases such as NCBI NR as compared to SwissProt, COGs, KEGG, etc.</p>
</list-item>
<list-item>
<p>Integrated methods using more than one approach can improve the efficiency and reliability of functional predictions.</p>
</list-item>
<list-item>
<p>DNA-sequence-based metagenomic functional analysis should be complemented with other types of approaches, such as metatranscriptomics, metaproteomics, metabolomics and metalipidomics, to gain better insights into the dynamics of a community.</p>
</list-item>
</list>
</p>
</boxed-text>
</p>
</sec>
<sec>
<title>FUNDING</title>
<p>This work was supported by the operational expenditure fund of
<funding-source>RIKEN</funding-source>
.</p>
</sec>
</body>
<back>
<bio id="d34e36">
<p>
<bold>Tulika Prakash</bold>
is a Research Scientist in the Laboratory for MetaSystems Research at RIKEN, Japan. With specialization in functional and metabolic analysis of prokaryotic genomes, her current research involves bioinformatic analysis of genomes and metagenomes.</p>
</bio>
<bio id="d34e47">
<p>
<bold>Todd Taylor</bold>
is Team Leader of the Laboratory for MetaSystems Research at RIKEN, Japan. A former member of the International Human Genome Sequencing Consortium, his current research involves bioinformatic approaches to metagenomics.</p>
</bio>
<ref-list>
<title>References</title>
<ref id="bbs033-B1">
<label>1</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Pace</surname>
<given-names>NR</given-names>
</name>
</person-group>
<article-title>A molecular view of microbial diversity and the biosphere</article-title>
<source>Science</source>
<year>1997</year>
<volume>276</volume>
<fpage>734</fpage>
<lpage>40</lpage>
<pub-id pub-id-type="pmid">9115194</pub-id>
</element-citation>
</ref>
<ref id="bbs033-B2">
<label>2</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Tringe</surname>
<given-names>SG</given-names>
</name>
<name>
<surname>Rubin</surname>
<given-names>EM</given-names>
</name>
</person-group>
<article-title>Metagenomics: DNA sequencing of environmental samples</article-title>
<source>Nat Rev Genet</source>
<year>2005</year>
<volume>6</volume>
<fpage>805</fpage>
<lpage>14</lpage>
<pub-id pub-id-type="pmid">16304596</pub-id>
</element-citation>
</ref>
<ref id="bbs033-B3">
<label>3</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Kunin</surname>
<given-names>V</given-names>
</name>
<name>
<surname>Copeland</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Lapidus</surname>
<given-names>A</given-names>
</name>
<etal></etal>
</person-group>
<article-title>A bioinformatician’s guide to metagenomics</article-title>
<source>Microbiol Mol Biol Rev</source>
<year>2008</year>
<volume>72</volume>
<fpage>557</fpage>
<lpage>78</lpage>
<pub-id pub-id-type="pmid">19052320</pub-id>
</element-citation>
</ref>
<ref id="bbs033-B4">
<label>4</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Wooley</surname>
<given-names>JC</given-names>
</name>
<name>
<surname>Godzik</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Friedberg</surname>
<given-names>I</given-names>
</name>
</person-group>
<article-title>A primer on metagenomics</article-title>
<source>PLoS Comput Biol</source>
<year>2010</year>
<volume>6</volume>
<fpage>e1000667</fpage>
<lpage>79</lpage>
<pub-id pub-id-type="pmid">20195499</pub-id>
</element-citation>
</ref>
<ref id="bbs033-B5">
<label>5</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Batzoglou</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Jaffe</surname>
<given-names>DB</given-names>
</name>
<name>
<surname>Stanley</surname>
<given-names>K</given-names>
</name>
<etal></etal>
</person-group>
<article-title>ARACHNE: a whole-genome shotgun assembler</article-title>
<source>Genome Res</source>
<year>2002</year>
<volume>12</volume>
<fpage>177</fpage>
<lpage>89</lpage>
<pub-id pub-id-type="pmid">11779843</pub-id>
</element-citation>
</ref>
<ref id="bbs033-B6">
<label>6</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Aparicio</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Chapman</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Stupka</surname>
<given-names>E</given-names>
</name>
<etal></etal>
</person-group>
<article-title>Whole-genome shotgun assembly and analysis of the genome of Fugu rubripes</article-title>
<source>Science</source>
<year>2002</year>
<volume>297</volume>
<fpage>1301</fpage>
<lpage>10</lpage>
<pub-id pub-id-type="pmid">12142439</pub-id>
</element-citation>
</ref>
<ref id="bbs033-B7">
<label>7</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Myers</surname>
<given-names>EW</given-names>
</name>
<name>
<surname>Sutton</surname>
<given-names>GG</given-names>
</name>
<name>
<surname>Delcher</surname>
<given-names>AL</given-names>
</name>
<etal></etal>
</person-group>
<article-title>A whole-genome assembly of Drosophila</article-title>
<source>Science</source>
<year>2000</year>
<volume>287</volume>
<fpage>2196</fpage>
<lpage>204</lpage>
<pub-id pub-id-type="pmid">10731133</pub-id>
</element-citation>
</ref>
<ref id="bbs033-B8">
<label>8</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Zerbino</surname>
<given-names>DR</given-names>
</name>
<name>
<surname>Birney</surname>
<given-names>E</given-names>
</name>
</person-group>
<article-title>Velvet: algorithms for de novo short read assembly using de Bruijn graphs</article-title>
<source>Genome Res</source>
<year>2008</year>
<volume>18</volume>
<fpage>821</fpage>
<lpage>9</lpage>
<pub-id pub-id-type="pmid">18349386</pub-id>
</element-citation>
</ref>
<ref id="bbs033-B9">
<label>9</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Li</surname>
<given-names>R</given-names>
</name>
<name>
<surname>Zhu</surname>
<given-names>H</given-names>
</name>
<name>
<surname>Ruan</surname>
<given-names>J</given-names>
</name>
<etal></etal>
</person-group>
<article-title>De novo assembly of human genomes with massively parallel short read sequencing</article-title>
<source>Genome Res</source>
<year>2010</year>
<volume>20</volume>
<fpage>265</fpage>
<lpage>72</lpage>
<pub-id pub-id-type="pmid">20019144</pub-id>
</element-citation>
</ref>
<ref id="bbs033-B10">
<label>10</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Pevzner</surname>
<given-names>PA</given-names>
</name>
<name>
<surname>Tang</surname>
<given-names>H</given-names>
</name>
<name>
<surname>Waterman</surname>
<given-names>MS</given-names>
</name>
</person-group>
<article-title>An Eulerian path approach to DNA fragment assembly</article-title>
<source>Proc Natl Acad Sci USA</source>
<year>2001</year>
<volume>98</volume>
<fpage>9748</fpage>
<lpage>53</lpage>
<pub-id pub-id-type="pmid">11504945</pub-id>
</element-citation>
</ref>
<ref id="bbs033-B11">
<label>11</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Ye</surname>
<given-names>Y</given-names>
</name>
<name>
<surname>Tang</surname>
<given-names>H</given-names>
</name>
</person-group>
<article-title>An ORFome assembly approach to metagenomics sequences analysis</article-title>
<source>J Bioinform Comput Biol</source>
<year>2009</year>
<volume>7</volume>
<fpage>455</fpage>
<lpage>71</lpage>
<pub-id pub-id-type="pmid">19507285</pub-id>
</element-citation>
</ref>
<ref id="bbs033-B12">
<label>12</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Peng</surname>
<given-names>Y</given-names>
</name>
<name>
<surname>Leung</surname>
<given-names>HC</given-names>
</name>
<name>
<surname>Yiu</surname>
<given-names>SM</given-names>
</name>
<etal></etal>
</person-group>
<article-title>IDBA-UD: a de novo assembler for single-cell and metagenomic sequencing data with highly uneven depth</article-title>
<source>Bioinformatics</source>
<year>2012</year>
</element-citation>
</ref>
<ref id="bbs033-B13">
<label>13</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Noguchi</surname>
<given-names>H</given-names>
</name>
<name>
<surname>Park</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Takagi</surname>
<given-names>T</given-names>
</name>
</person-group>
<article-title>MetaGene: prokaryotic gene finding from environmental genome shotgun sequences</article-title>
<source>Nucleic Acids Res</source>
<year>2006</year>
<volume>34</volume>
<fpage>5623</fpage>
<lpage>30</lpage>
<pub-id pub-id-type="pmid">17028096</pub-id>
</element-citation>
</ref>
<ref id="bbs033-B14">
<label>14</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Zhu</surname>
<given-names>W</given-names>
</name>
<name>
<surname>Lomsadze</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Borodovsky</surname>
<given-names>M</given-names>
</name>
</person-group>
<article-title>Ab initio gene identification in metagenomic sequences</article-title>
<source>Nucleic Acids Res</source>
<year>2010</year>
<volume>38</volume>
<fpage>e132</fpage>
<lpage>46</lpage>
<pub-id pub-id-type="pmid">20403810</pub-id>
</element-citation>
</ref>
<ref id="bbs033-B15">
<label>15</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Rho</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Tang</surname>
<given-names>H</given-names>
</name>
<name>
<surname>Ye</surname>
<given-names>Y</given-names>
</name>
</person-group>
<article-title>FragGeneScan: predicting genes in short and error-prone reads</article-title>
<source>Nucleic Acids Res</source>
<year>2010</year>
<volume>38</volume>
<fpage>e191</fpage>
<lpage>202</lpage>
<pub-id pub-id-type="pmid">20805240</pub-id>
</element-citation>
</ref>
<ref id="bbs033-B16">
<label>16</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Delcher</surname>
<given-names>AL</given-names>
</name>
<name>
<surname>Harmon</surname>
<given-names>D</given-names>
</name>
<name>
<surname>Kasif</surname>
<given-names>S</given-names>
</name>
<etal></etal>
</person-group>
<article-title>Improved microbial gene identification with GLIMMER</article-title>
<source>Nucleic Acids Res</source>
<year>1999</year>
<volume>27</volume>
<fpage>4636</fpage>
<lpage>41</lpage>
<pub-id pub-id-type="pmid">10556321</pub-id>
</element-citation>
</ref>
<ref id="bbs033-B17">
<label>17</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Altschul</surname>
<given-names>SF</given-names>
</name>
<name>
<surname>Gish</surname>
<given-names>W</given-names>
</name>
<name>
<surname>Miller</surname>
<given-names>W</given-names>
</name>
<etal></etal>
</person-group>
<article-title>Basic local alignment search tool</article-title>
<source>J Mol Biol</source>
<year>1990</year>
<volume>215</volume>
<fpage>403</fpage>
<lpage>10</lpage>
<pub-id pub-id-type="pmid">2231712</pub-id>
</element-citation>
</ref>
<ref id="bbs033-B18">
<label>18</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Lowe</surname>
<given-names>TM</given-names>
</name>
<name>
<surname>Eddy</surname>
<given-names>SR</given-names>
</name>
</person-group>
<article-title>tRNAscan-SE: a program for improved detection of transfer RNA genes in genomic sequence</article-title>
<source>Nucleic Acids Res</source>
<year>1997</year>
<volume>25</volume>
<fpage>955</fpage>
<lpage>64</lpage>
<pub-id pub-id-type="pmid">9023104</pub-id>
</element-citation>
</ref>
<ref id="bbs033-B19">
<label>19</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Sharma</surname>
<given-names>VK</given-names>
</name>
<name>
<surname>Kumar</surname>
<given-names>N</given-names>
</name>
<name>
<surname>Prakash</surname>
<given-names>T</given-names>
</name>
<etal></etal>
</person-group>
<article-title>Fast and accurate taxonomic assignments of metagenomic sequences using MetaBin</article-title>
<source>PLoS One</source>
<year>2012</year>
<volume>7</volume>
<fpage>e34030</fpage>
<lpage>8</lpage>
<pub-id pub-id-type="pmid">22496776</pub-id>
</element-citation>
</ref>
<ref id="bbs033-B20">
<label>20</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Huson</surname>
<given-names>DH</given-names>
</name>
<name>
<surname>Auch</surname>
<given-names>AF</given-names>
</name>
<name>
<surname>Qi</surname>
<given-names>J</given-names>
</name>
<etal></etal>
</person-group>
<article-title>MEGAN analysis of metagenomic data</article-title>
<source>Genome Res</source>
<year>2007</year>
<volume>17</volume>
<fpage>377</fpage>
<lpage>86</lpage>
<pub-id pub-id-type="pmid">17255551</pub-id>
</element-citation>
</ref>
<ref id="bbs033-B21">
<label>21</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Gerlach</surname>
<given-names>W</given-names>
</name>
<name>
<surname>Junemann</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Tille</surname>
<given-names>F</given-names>
</name>
<etal></etal>
</person-group>
<article-title>WebCARMA: a web application for the functional and taxonomic classification of unassembled metagenomic reads</article-title>
<source>BMC Bioinformatics</source>
<year>2009</year>
<volume>10</volume>
<fpage>430</fpage>
<lpage>9</lpage>
<pub-id pub-id-type="pmid">20021646</pub-id>
</element-citation>
</ref>
<ref id="bbs033-B22">
<label>22</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>McHardy</surname>
<given-names>AC</given-names>
</name>
<name>
<surname>Martin</surname>
<given-names>HG</given-names>
</name>
<name>
<surname>Tsirigos</surname>
<given-names>A</given-names>
</name>
<etal></etal>
</person-group>
<article-title>Accurate phylogenetic classification of variable-length DNA fragments</article-title>
<source>Nat Methods</source>
<year>2007</year>
<volume>4</volume>
<fpage>63</fpage>
<lpage>72</lpage>
<pub-id pub-id-type="pmid">17179938</pub-id>
</element-citation>
</ref>
<ref id="bbs033-B23">
<label>23</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Teeling</surname>
<given-names>H</given-names>
</name>
<name>
<surname>Waldmann</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Lombardot</surname>
<given-names>T</given-names>
</name>
<etal></etal>
</person-group>
<article-title>TETRA: a web-service and a stand-alone program for the analysis and comparison of tetranucleotide usage patterns in DNA sequences</article-title>
<source>BMC Bioinformatics</source>
<year>2004</year>
<volume>5</volume>
<fpage>163</fpage>
<lpage>9</lpage>
<pub-id pub-id-type="pmid">15507136</pub-id>
</element-citation>
</ref>
<ref id="bbs033-B24">
<label>24</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Rosen</surname>
<given-names>GL</given-names>
</name>
<name>
<surname>Reichenberger</surname>
<given-names>ER</given-names>
</name>
<name>
<surname>Rosenfeld</surname>
<given-names>AM</given-names>
</name>
</person-group>
<article-title>NBC: the Naive Bayes Classification tool webserver for taxonomic classification of metagenomic reads</article-title>
<source>Bioinformatics</source>
<year>2011</year>
<volume>27</volume>
<fpage>127</fpage>
<lpage>9</lpage>
<pub-id pub-id-type="pmid">21062764</pub-id>
</element-citation>
</ref>
<ref id="bbs033-B25">
<label>25</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Diaz</surname>
<given-names>NN</given-names>
</name>
<name>
<surname>Krause</surname>
<given-names>L</given-names>
</name>
<name>
<surname>Goesmann</surname>
<given-names>A</given-names>
</name>
<etal></etal>
</person-group>
<article-title>TACOA: taxonomic classification of environmental genomic fragments using a kernelized nearest neighbor approach</article-title>
<source>BMC Bioinformatics</source>
<year>2009</year>
<volume>10</volume>
<fpage>56</fpage>
<lpage>71</lpage>
<pub-id pub-id-type="pmid">19210774</pub-id>
</element-citation>
</ref>
<ref id="bbs033-B26">
<label>26</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Markowitz</surname>
<given-names>VM</given-names>
</name>
<name>
<surname>Chen</surname>
<given-names>IM</given-names>
</name>
<name>
<surname>Chu</surname>
<given-names>K</given-names>
</name>
<etal></etal>
</person-group>
<article-title>IMG/M: the integrated metagenome data management and comparative analysis system</article-title>
<source>Nucleic Acids Res</source>
<year>2012</year>
<volume>40</volume>
<fpage>D123</fpage>
<lpage>9</lpage>
<pub-id pub-id-type="pmid">22086953</pub-id>
</element-citation>
</ref>
<ref id="bbs033-B27">
<label>27</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Goll</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Rusch</surname>
<given-names>DB</given-names>
</name>
<name>
<surname>Tanenbaum</surname>
<given-names>DM</given-names>
</name>
<etal></etal>
</person-group>
<article-title>METAREP: JCVI metagenomics reports–an open source tool for high-performance comparative metagenomics</article-title>
<source>Bioinformatics</source>
<year>2010</year>
<volume>26</volume>
<fpage>2631</fpage>
<lpage>2</lpage>
<pub-id pub-id-type="pmid">20798169</pub-id>
</element-citation>
</ref>
<ref id="bbs033-B28">
<label>28</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Sun</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Chen</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Li</surname>
<given-names>W</given-names>
</name>
<etal></etal>
</person-group>
<article-title>Community cyberinfrastructure for advanced microbial ecology research and analysis: the CAMERA resource</article-title>
<source>Nucleic Acids Res</source>
<year>2011</year>
<volume>39</volume>
<fpage>D546</fpage>
<lpage>51</lpage>
<pub-id pub-id-type="pmid">21045053</pub-id>
</element-citation>
</ref>
<ref id="bbs033-B29">
<label>29</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Li</surname>
<given-names>W</given-names>
</name>
</person-group>
<article-title>Analysis and comparison of very large metagenomes with fast clustering and functional annotation</article-title>
<source>BMC Bioinformatics</source>
<year>2009</year>
<volume>10</volume>
<fpage>359</fpage>
<lpage>67</lpage>
<pub-id pub-id-type="pmid">19863816</pub-id>
</element-citation>
</ref>
<ref id="bbs033-B30">
<label>30</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Glass</surname>
<given-names>EM</given-names>
</name>
<name>
<surname>Wilkening</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Wilke</surname>
<given-names>A</given-names>
</name>
<etal></etal>
</person-group>
<article-title>Using the metagenomics RAST server (MG-RAST) for analyzing shotgun metagenomes</article-title>
<source>Cold Spring Harb Protoc</source>
<year>2010</year>
<volume>2010</volume>
</element-citation>
</ref>
<ref id="bbs033-B31">
<label>31</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Arumugam</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Harrington</surname>
<given-names>ED</given-names>
</name>
<name>
<surname>Foerstner</surname>
<given-names>KU</given-names>
</name>
<etal></etal>
</person-group>
<article-title>SmashCommunity: a metagenomic annotation and analysis tool</article-title>
<source>Bioinformatics</source>
<year>2010</year>
<volume>26</volume>
<fpage>2977</fpage>
<lpage>8</lpage>
<pub-id pub-id-type="pmid">20959381</pub-id>
</element-citation>
</ref>
<ref id="bbs033-B32">
<label>32</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Huson</surname>
<given-names>DH</given-names>
</name>
<name>
<surname>Mitra</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Ruscheweyh</surname>
<given-names>HJ</given-names>
</name>
<etal></etal>
</person-group>
<article-title>Integrative analysis of environmental sequences using MEGAN4</article-title>
<source>Genome Res</source>
<year>2011</year>
<volume>21</volume>
<fpage>1552</fpage>
<lpage>60</lpage>
<pub-id pub-id-type="pmid">21690186</pub-id>
</element-citation>
</ref>
<ref id="bbs033-B33">
<label>33</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Lingner</surname>
<given-names>T</given-names>
</name>
<name>
<surname>Asshauer</surname>
<given-names>KP</given-names>
</name>
<name>
<surname>Schreiber</surname>
<given-names>F</given-names>
</name>
<etal></etal>
</person-group>
<article-title>CoMet–a web server for comparative functional profiling of metagenomes</article-title>
<source>Nucleic Acids Res</source>
<year>2011</year>
<volume>39</volume>
<fpage>W518</fpage>
<lpage>23</lpage>
<pub-id pub-id-type="pmid">21622656</pub-id>
</element-citation>
</ref>
<ref id="bbs033-B34">
<label>34</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Wu</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Zhu</surname>
<given-names>Z</given-names>
</name>
<name>
<surname>Fu</surname>
<given-names>L</given-names>
</name>
<etal></etal>
</person-group>
<article-title>WebMGA: a customizable web server for fast metagenomic sequence analysis</article-title>
<source>BMC Genomics</source>
<year>2011</year>
<volume>12</volume>
<fpage>444</fpage>
<lpage>52</lpage>
<pub-id pub-id-type="pmid">21899761</pub-id>
</element-citation>
</ref>
<ref id="bbs033-B35">
<label>35</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Mende</surname>
<given-names>DR</given-names>
</name>
<name>
<surname>Waller</surname>
<given-names>AS</given-names>
</name>
<name>
<surname>Sunagawa</surname>
<given-names>S</given-names>
</name>
<etal></etal>
</person-group>
<article-title>Assessment of metagenomic assembly using simulated next generation sequencing data</article-title>
<source>PLoS One</source>
<year>2012</year>
<volume>7</volume>
<fpage>e31386</fpage>
<lpage>96</lpage>
<pub-id pub-id-type="pmid">22384016</pub-id>
</element-citation>
</ref>
<ref id="bbs033-B36">
<label>36</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Raes</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Foerstner</surname>
<given-names>KU</given-names>
</name>
<name>
<surname>Bork</surname>
<given-names>P</given-names>
</name>
</person-group>
<article-title>Get the most out of your metagenome: computational analysis of environmental sequence data</article-title>
<source>Curr Opin Microbiol</source>
<year>2007</year>
<volume>10</volume>
<fpage>490</fpage>
<lpage>98</lpage>
<pub-id pub-id-type="pmid">17936679</pub-id>
</element-citation>
</ref>
<ref id="bbs033-B37">
<label>37</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Pignatelli</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Moya</surname>
<given-names>A</given-names>
</name>
</person-group>
<article-title>Evaluating the fidelity of de novo short read metagenomic assembly using simulated data</article-title>
<source>PLoS One</source>
<year>2011</year>
<volume>6</volume>
<fpage>e19984</fpage>
<lpage>92</lpage>
<pub-id pub-id-type="pmid">21625384</pub-id>
</element-citation>
</ref>
<ref id="bbs033-B38">
<label>38</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Yok</surname>
<given-names>NG</given-names>
</name>
<name>
<surname>Rosen</surname>
<given-names>GL</given-names>
</name>
</person-group>
<article-title>Combining gene prediction methods to improve metagenomic gene annotation</article-title>
<source>BMC Bioinformatics</source>
<year>2011</year>
<volume>12</volume>
<fpage>20</fpage>
<lpage>31</lpage>
<pub-id pub-id-type="pmid">21232129</pub-id>
</element-citation>
</ref>
<ref id="bbs033-B39">
<label>39</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Kurokawa</surname>
<given-names>K</given-names>
</name>
<name>
<surname>Itoh</surname>
<given-names>T</given-names>
</name>
<name>
<surname>Kuwahara</surname>
<given-names>T</given-names>
</name>
<etal></etal>
</person-group>
<article-title>Comparative metagenomics revealed commonly enriched gene sets in human gut microbiomes</article-title>
<source>DNA Res</source>
<year>2007</year>
<volume>14</volume>
<fpage>169</fpage>
<lpage>81</lpage>
<pub-id pub-id-type="pmid">17916580</pub-id>
</element-citation>
</ref>
<ref id="bbs033-B40">
<label>40</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Ivanov</surname>
<given-names>II</given-names>
</name>
<name>
<surname>Atarashi</surname>
<given-names>K</given-names>
</name>
<name>
<surname>Manel</surname>
<given-names>N</given-names>
</name>
<etal></etal>
</person-group>
<article-title>Induction of intestinal Th17 cells by segmented filamentous bacteria</article-title>
<source>Cell</source>
<year>2009</year>
<volume>139</volume>
<fpage>485</fpage>
<lpage>98</lpage>
<pub-id pub-id-type="pmid">19836068</pub-id>
</element-citation>
</ref>
<ref id="bbs033-B41">
<label>41</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Prakash</surname>
<given-names>T</given-names>
</name>
<name>
<surname>Oshima</surname>
<given-names>K</given-names>
</name>
<name>
<surname>Morita</surname>
<given-names>H</given-names>
</name>
<etal></etal>
</person-group>
<article-title>Complete genome sequences of rat and mouse segmented filamentous bacteria, a potent inducer of Th17 cell differentiation</article-title>
<source>Cell Host Microbe</source>
<year>2011</year>
<volume>10</volume>
<fpage>273</fpage>
<lpage>84</lpage>
<pub-id pub-id-type="pmid">21925114</pub-id>
</element-citation>
</ref>
<ref id="bbs033-B42">
<label>42</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Ventura</surname>
<given-names>M</given-names>
</name>
<name>
<surname>O’Connell-Motherway</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Leahy</surname>
<given-names>S</given-names>
</name>
<etal></etal>
</person-group>
<article-title>From bacterial genome to functionality; case bifidobacteria</article-title>
<source>Int J Food Microbiol</source>
<year>2007</year>
<volume>120</volume>
<fpage>2</fpage>
<lpage>12</lpage>
<pub-id pub-id-type="pmid">17629975</pub-id>
</element-citation>
</ref>
<ref id="bbs033-B43">
<label>43</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Richter</surname>
<given-names>DC</given-names>
</name>
<name>
<surname>Ott</surname>
<given-names>F</given-names>
</name>
<name>
<surname>Auch</surname>
<given-names>AF</given-names>
</name>
<etal></etal>
</person-group>
<article-title>MetaSim: a sequencing simulator for genomics and metagenomics</article-title>
<source>PLoS One</source>
<year>2008</year>
<volume>3</volume>
<fpage>e3373</fpage>
<lpage>84</lpage>
<pub-id pub-id-type="pmid">18841204</pub-id>
</element-citation>
</ref>
<ref id="bbs033-B44">
<label>44</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Kanehisa</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Goto</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Sato</surname>
<given-names>Y</given-names>
</name>
<etal></etal>
</person-group>
<article-title>KEGG for integration and interpretation of large-scale molecular data sets</article-title>
<source>Nucleic Acids Res</source>
<year>2012</year>
<volume>40</volume>
<fpage>D109</fpage>
<lpage>14</lpage>
<pub-id pub-id-type="pmid">22080510</pub-id>
</element-citation>
</ref>
<ref id="bbs033-B45">
<label>45</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Ashburner</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Ball</surname>
<given-names>CA</given-names>
</name>
<name>
<surname>Blake</surname>
<given-names>JA</given-names>
</name>
<etal></etal>
</person-group>
<article-title>Gene ontology: tool for the unification of biology. The Gene Ontology Consortium</article-title>
<source>Nat Genet</source>
<year>2000</year>
<volume>25</volume>
<fpage>25</fpage>
<lpage>9</lpage>
<pub-id pub-id-type="pmid">10802651</pub-id>
</element-citation>
</ref>
<ref id="bbs033-B46">
<label>46</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Overbeek</surname>
<given-names>R</given-names>
</name>
<name>
<surname>Begley</surname>
<given-names>T</given-names>
</name>
<name>
<surname>Butler</surname>
<given-names>RM</given-names>
</name>
<etal></etal>
</person-group>
<article-title>The subsystems approach to genome annotation and its use in the project to annotate 1000 genomes</article-title>
<source>Nucleic Acids Res</source>
<year>2005</year>
<volume>33</volume>
<fpage>5691</fpage>
<lpage>702</lpage>
<pub-id pub-id-type="pmid">16214803</pub-id>
</element-citation>
</ref>
<ref id="bbs033-B47">
<label>47</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Sayers</surname>
<given-names>EW</given-names>
</name>
<name>
<surname>Barrett</surname>
<given-names>T</given-names>
</name>
<name>
<surname>Benson</surname>
<given-names>DA</given-names>
</name>
<etal></etal>
</person-group>
<article-title>Database resources of the National Center for Biotechnology Information</article-title>
<source>Nucleic Acids Res</source>
<year>2012</year>
<volume>40</volume>
<fpage>D13</fpage>
<lpage>25</lpage>
<pub-id pub-id-type="pmid">22140104</pub-id>
</element-citation>
</ref>
<ref id="bbs033-B48">
<label>48</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Letunic</surname>
<given-names>I</given-names>
</name>
<name>
<surname>Doerks</surname>
<given-names>T</given-names>
</name>
<name>
<surname>Bork</surname>
<given-names>P</given-names>
</name>
</person-group>
<article-title>SMART 7: recent updates to the protein domain annotation resource</article-title>
<source>Nucleic Acids Res</source>
<year>2012</year>
<volume>40</volume>
<fpage>D302</fpage>
<lpage>5</lpage>
<pub-id pub-id-type="pmid">22053084</pub-id>
</element-citation>
</ref>
<ref id="bbs033-B49">
<label>49</label>
<element-citation publication-type="journal">
<collab>UniProt Consortium</collab>
<article-title>Reorganizing the protein space at the Universal Protein Resource (UniProt)</article-title>
<source>Nucleic Acids Res</source>
<year>2012</year>
<volume>40</volume>
<fpage>D71</fpage>
<lpage>5</lpage>
<pub-id pub-id-type="pmid">22102590</pub-id>
</element-citation>
</ref>
<ref id="bbs033-B50">
<label>50</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Tatusov</surname>
<given-names>RL</given-names>
</name>
<name>
<surname>Koonin</surname>
<given-names>EV</given-names>
</name>
<name>
<surname>Lipman</surname>
<given-names>DJ</given-names>
</name>
</person-group>
<article-title>A genomic perspective on protein families</article-title>
<source>Science</source>
<year>1997</year>
<volume>278</volume>
<fpage>631</fpage>
<lpage>7</lpage>
<pub-id pub-id-type="pmid">9381173</pub-id>
</element-citation>
</ref>
<ref id="bbs033-B51">
<label>51</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Powell</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Szklarczyk</surname>
<given-names>D</given-names>
</name>
<name>
<surname>Trachana</surname>
<given-names>K</given-names>
</name>
<etal></etal>
</person-group>
<article-title>eggNOG v3.0: orthologous groups covering 1133 organisms at 41 different taxonomic ranges</article-title>
<source>Nucleic Acids Res</source>
<year>2012</year>
<volume>40</volume>
<fpage>D284</fpage>
<lpage>9</lpage>
<pub-id pub-id-type="pmid">22096231</pub-id>
</element-citation>
</ref>
<ref id="bbs033-B52">
<label>52</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Punta</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Coggill</surname>
<given-names>PC</given-names>
</name>
<name>
<surname>Eberhardt</surname>
<given-names>RY</given-names>
</name>
<etal></etal>
</person-group>
<article-title>The Pfam protein families database</article-title>
<source>Nucleic Acids Res</source>
<year>2012</year>
<volume>40</volume>
<fpage>D290</fpage>
<lpage>301</lpage>
<pub-id pub-id-type="pmid">22127870</pub-id>
</element-citation>
</ref>
<ref id="bbs033-B53">
<label>53</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Selengut</surname>
<given-names>JD</given-names>
</name>
<name>
<surname>Haft</surname>
<given-names>DH</given-names>
</name>
<name>
<surname>Davidsen</surname>
<given-names>T</given-names>
</name>
<etal></etal>
</person-group>
<article-title>TIGRFAMs and Genome Properties: tools for the assignment of molecular function and biological process in prokaryotic genomes</article-title>
<source>Nucleic Acids Res</source>
<year>2007</year>
<volume>35</volume>
<fpage>D260</fpage>
<lpage>4</lpage>
<pub-id pub-id-type="pmid">17151080</pub-id>
</element-citation>
</ref>
<ref id="bbs033-B54">
<label>54</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Kent</surname>
<given-names>WJ</given-names>
</name>
</person-group>
<article-title>BLAT–the BLAST-like alignment tool</article-title>
<source>Genome Res</source>
<year>2002</year>
<volume>12</volume>
<fpage>656</fpage>
<lpage>64</lpage>
<pub-id pub-id-type="pmid">11932250</pub-id>
</element-citation>
</ref>
<ref id="bbs033-B55">
<label>55</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Finn</surname>
<given-names>RD</given-names>
</name>
<name>
<surname>Clements</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Eddy</surname>
<given-names>SR</given-names>
</name>
</person-group>
<article-title>HMMER web server: interactive sequence similarity searching</article-title>
<source>Nucleic Acids Res</source>
<year>2011</year>
<volume>39</volume>
<fpage>W29</fpage>
<lpage>37</lpage>
<pub-id pub-id-type="pmid">21593126</pub-id>
</element-citation>
</ref>
<ref id="bbs033-B56">
<label>56</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Brenner</surname>
<given-names>SE</given-names>
</name>
</person-group>
<article-title>Errors in genome annotation</article-title>
<source>Trends Genet</source>
<year>1999</year>
<volume>15</volume>
<fpage>132</fpage>
<lpage>3</lpage>
<pub-id pub-id-type="pmid">10203816</pub-id>
</element-citation>
</ref>
<ref id="bbs033-B57">
<label>57</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Tatusov</surname>
<given-names>RL</given-names>
</name>
<name>
<surname>Fedorova</surname>
<given-names>ND</given-names>
</name>
<name>
<surname>Jackson</surname>
<given-names>JD</given-names>
</name>
<etal></etal>
</person-group>
<article-title>The COG database: an updated version includes eukaryotes</article-title>
<source>BMC Bioinformatics</source>
<year>2003</year>
<volume>4</volume>
<fpage>41</fpage>
<lpage>54</lpage>
<pub-id pub-id-type="pmid">12969510</pub-id>
</element-citation>
</ref>
<ref id="bbs033-B58">
<label>58</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Garcia</surname>
<given-names>MH</given-names>
</name>
<name>
<surname>Ivanova</surname>
<given-names>N</given-names>
</name>
<name>
<surname>Kunin</surname>
<given-names>V</given-names>
</name>
<etal></etal>
</person-group>
<article-title>Metagenomic analysis of two enhanced biological phosphorus removal (EBPR) sludge communities</article-title>
<source>Nat Biotechnol</source>
<year>2006</year>
<volume>24</volume>
<fpage>1263</fpage>
<lpage>9</lpage>
<pub-id pub-id-type="pmid">16998472</pub-id>
</element-citation>
</ref>
<ref id="bbs033-B59">
<label>59</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Berg Miller</surname>
<given-names>ME</given-names>
</name>
<name>
<surname>Yeoman</surname>
<given-names>CJ</given-names>
</name>
<name>
<surname>Chia</surname>
<given-names>N</given-names>
</name>
<etal></etal>
</person-group>
<article-title>Phage-bacteria relationships and CRISPR elements revealed by a metagenomic survey of the rumen microbiome</article-title>
<source>Environ Microbiol</source>
<year>2012</year>
<volume>14</volume>
<fpage>207</fpage>
<lpage>27</lpage>
<pub-id pub-id-type="pmid">22004549</pub-id>
</element-citation>
</ref>
<ref id="bbs033-B60">
<label>60</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>DeLong</surname>
<given-names>EF</given-names>
</name>
<name>
<surname>Preston</surname>
<given-names>CM</given-names>
</name>
<name>
<surname>Mincer</surname>
<given-names>T</given-names>
</name>
<etal></etal>
</person-group>
<article-title>Community genomics among stratified microbial assemblages in the ocean’s interior</article-title>
<source>Science</source>
<year>2006</year>
<volume>311</volume>
<fpage>496</fpage>
<lpage>503</lpage>
<pub-id pub-id-type="pmid">16439655</pub-id>
</element-citation>
</ref>
<ref id="bbs033-B61">
<label>61</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Tringe</surname>
<given-names>SG</given-names>
</name>
<name>
<surname>von Mering</surname>
<given-names>C</given-names>
</name>
<name>
<surname>Kobayashi</surname>
<given-names>A</given-names>
</name>
<etal></etal>
</person-group>
<article-title>Comparative metagenomics of microbial communities</article-title>
<source>Science</source>
<year>2005</year>
<volume>308</volume>
<fpage>554</fpage>
<lpage>7</lpage>
<pub-id pub-id-type="pmid">15845853</pub-id>
</element-citation>
</ref>
<ref id="bbs033-B62">
<label>62</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Tyson</surname>
<given-names>GW</given-names>
</name>
<name>
<surname>Chapman</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Hugenholtz</surname>
<given-names>P</given-names>
</name>
<etal></etal>
</person-group>
<article-title>Community structure and metabolism through reconstruction of microbial genomes from the environment</article-title>
<source>Nature</source>
<year>2004</year>
<volume>428</volume>
<fpage>37</fpage>
<lpage>43</lpage>
<pub-id pub-id-type="pmid">14961025</pub-id>
</element-citation>
</ref>
<ref id="bbs033-B63">
<label>63</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Gill</surname>
<given-names>SR</given-names>
</name>
<name>
<surname>Pop</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Deboy</surname>
<given-names>RT</given-names>
</name>
<etal></etal>
</person-group>
<article-title>Metagenomic analysis of the human distal gut microbiome</article-title>
<source>Science</source>
<year>2006</year>
<volume>312</volume>
<fpage>1355</fpage>
<lpage>9</lpage>
<pub-id pub-id-type="pmid">16741115</pub-id>
</element-citation>
</ref>
<ref id="bbs033-B64">
<label>64</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Sigrist</surname>
<given-names>CJ</given-names>
</name>
<name>
<surname>Cerutti</surname>
<given-names>L</given-names>
</name>
<name>
<surname>de Castro</surname>
<given-names>E</given-names>
</name>
<etal></etal>
</person-group>
<article-title>PROSITE, a protein domain database for functional characterization and annotation</article-title>
<source>Nucleic Acids Res</source>
<year>2010</year>
<volume>38</volume>
<fpage>D161</fpage>
<lpage>6</lpage>
<pub-id pub-id-type="pmid">19858104</pub-id>
</element-citation>
</ref>
<ref id="bbs033-B65">
<label>65</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Attwood</surname>
<given-names>TK</given-names>
</name>
<name>
<surname>Bradley</surname>
<given-names>P</given-names>
</name>
<name>
<surname>Flower</surname>
<given-names>DR</given-names>
</name>
<etal></etal>
</person-group>
<article-title>PRINTS and its automatic supplement, prePRINTS</article-title>
<source>Nucleic Acids Res</source>
<year>2003</year>
<volume>31</volume>
<fpage>400</fpage>
<lpage>2</lpage>
<pub-id pub-id-type="pmid">12520033</pub-id>
</element-citation>
</ref>
<ref id="bbs033-B66">
<label>66</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Hunter</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Jones</surname>
<given-names>P</given-names>
</name>
<name>
<surname>Mitchell</surname>
<given-names>A</given-names>
</name>
<etal></etal>
</person-group>
<article-title>InterPro in 2011: new developments in the family and domain prediction database</article-title>
<source>Nucleic Acids Res</source>
<year>2012</year>
<volume>40</volume>
<fpage>D306</fpage>
<lpage>12</lpage>
<pub-id pub-id-type="pmid">22096229</pub-id>
</element-citation>
</ref>
<ref id="bbs033-B67">
<label>67</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Lee</surname>
<given-names>D</given-names>
</name>
<name>
<surname>Redfern</surname>
<given-names>O</given-names>
</name>
<name>
<surname>Orengo</surname>
<given-names>C</given-names>
</name>
</person-group>
<article-title>Predicting protein function from sequence and structure</article-title>
<source>Nat Rev Mol Cell Biol</source>
<year>2007</year>
<volume>8</volume>
<fpage>995</fpage>
<lpage>1005</lpage>
<pub-id pub-id-type="pmid">18037900</pub-id>
</element-citation>
</ref>
<ref id="bbs033-B68">
<label>68</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Dandekar</surname>
<given-names>T</given-names>
</name>
<name>
<surname>Snel</surname>
<given-names>B</given-names>
</name>
<name>
<surname>Huynen</surname>
<given-names>M</given-names>
</name>
<etal></etal>
</person-group>
<article-title>Conservation of gene order: a fingerprint of proteins that physically interact</article-title>
<source>Trends Biochem Sci</source>
<year>1998</year>
<volume>23</volume>
<fpage>324</fpage>
<lpage>8</lpage>
<pub-id pub-id-type="pmid">9787636</pub-id>
</element-citation>
</ref>
<ref id="bbs033-B69">
<label>69</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Overbeek</surname>
<given-names>R</given-names>
</name>
<name>
<surname>Fonstein</surname>
<given-names>M</given-names>
</name>
<name>
<surname>D’Souza</surname>
<given-names>M</given-names>
</name>
<etal></etal>
</person-group>
<article-title>The use of gene clusters to infer functional coupling</article-title>
<source>Proc Natl Acad Sci USA</source>
<year>1999</year>
<volume>96</volume>
<fpage>2896</fpage>
<lpage>901</lpage>
<pub-id pub-id-type="pmid">10077608</pub-id>
</element-citation>
</ref>
<ref id="bbs033-B70">
<label>70</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Enright</surname>
<given-names>AJ</given-names>
</name>
<name>
<surname>Iliopoulos</surname>
<given-names>I</given-names>
</name>
<name>
<surname>Kyrpides</surname>
<given-names>NC</given-names>
</name>
<etal></etal>
</person-group>
<article-title>Protein interaction maps for complete genomes based on gene fusion events</article-title>
<source>Nature</source>
<year>1999</year>
<volume>402</volume>
<fpage>86</fpage>
<lpage>90</lpage>
<pub-id pub-id-type="pmid">10573422</pub-id>
</element-citation>
</ref>
<ref id="bbs033-B71">
<label>71</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Marcotte</surname>
<given-names>EM</given-names>
</name>
<name>
<surname>Pellegrini</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Ng</surname>
<given-names>HL</given-names>
</name>
<etal></etal>
</person-group>
<article-title>Detecting protein function and protein-protein interactions from genome sequences</article-title>
<source>Science</source>
<year>1999</year>
<volume>285</volume>
<fpage>751</fpage>
<lpage>3</lpage>
<pub-id pub-id-type="pmid">10427000</pub-id>
</element-citation>
</ref>
<ref id="bbs033-B72">
<label>72</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Pellegrini</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Marcotte</surname>
<given-names>EM</given-names>
</name>
<name>
<surname>Thompson</surname>
<given-names>MJ</given-names>
</name>
<etal></etal>
</person-group>
<article-title>Assigning protein functions by comparative genome analysis: protein phylogenetic profiles</article-title>
<source>Proc Natl Acad Sci USA</source>
<year>1999</year>
<volume>96</volume>
<fpage>4285</fpage>
<lpage>8</lpage>
<pub-id pub-id-type="pmid">10200254</pub-id>
</element-citation>
</ref>
<ref id="bbs033-B73">
<label>73</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Marcotte</surname>
<given-names>EM</given-names>
</name>
<name>
<surname>Pellegrini</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Thompson</surname>
<given-names>MJ</given-names>
</name>
<etal></etal>
</person-group>
<article-title>A combined algorithm for genome-wide prediction of protein function</article-title>
<source>Nature</source>
<year>1999</year>
<volume>402</volume>
<fpage>83</fpage>
<lpage>6</lpage>
<pub-id pub-id-type="pmid">10573421</pub-id>
</element-citation>
</ref>
<ref id="bbs033-B74">
<label>74</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Harrington</surname>
<given-names>ED</given-names>
</name>
<name>
<surname>Singh</surname>
<given-names>AH</given-names>
</name>
<name>
<surname>Doerks</surname>
<given-names>T</given-names>
</name>
<etal></etal>
</person-group>
<article-title>Quantitative assessment of protein function prediction from metagenomics shotgun sequences</article-title>
<source>Proc Natl Acad Sci USA</source>
<year>2007</year>
<volume>104</volume>
<fpage>13913</fpage>
<lpage>8</lpage>
<pub-id pub-id-type="pmid">17717083</pub-id>
</element-citation>
</ref>
<ref id="bbs033-B75">
<label>75</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Sachdeva</surname>
<given-names>G</given-names>
</name>
<name>
<surname>Kumar</surname>
<given-names>K</given-names>
</name>
<name>
<surname>Jain</surname>
<given-names>P</given-names>
</name>
<etal></etal>
</person-group>
<article-title>SPAAN: a software program for prediction of adhesins and adhesin-like proteins using neural networks</article-title>
<source>Bioinformatics</source>
<year>2005</year>
<volume>21</volume>
<fpage>483</fpage>
<lpage>91</lpage>
<pub-id pub-id-type="pmid">15374866</pub-id>
</element-citation>
</ref>
<ref id="bbs033-B76">
<label>76</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Turnbaugh</surname>
<given-names>PJ</given-names>
</name>
<name>
<surname>Ley</surname>
<given-names>RE</given-names>
</name>
<name>
<surname>Mahowald</surname>
<given-names>MA</given-names>
</name>
<etal></etal>
</person-group>
<article-title>An obesity-associated gut microbiome with increased capacity for energy harvest</article-title>
<source>Nature</source>
<year>2006</year>
<volume>444</volume>
<fpage>1027</fpage>
<lpage>31</lpage>
<pub-id pub-id-type="pmid">17183312</pub-id>
</element-citation>
</ref>
<ref id="bbs033-B77">
<label>77</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Turnbaugh</surname>
<given-names>PJ</given-names>
</name>
<name>
<surname>Hamady</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Yatsunenko</surname>
<given-names>T</given-names>
</name>
<etal></etal>
</person-group>
<article-title>A core gut microbiome in obese and lean twins</article-title>
<source>Nature</source>
<year>2009</year>
<volume>457</volume>
<fpage>480</fpage>
<lpage>4</lpage>
<pub-id pub-id-type="pmid">19043404</pub-id>
</element-citation>
</ref>
<ref id="bbs033-B78">
<label>78</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Mackelprang</surname>
<given-names>R</given-names>
</name>
<name>
<surname>Waldrop</surname>
<given-names>MP</given-names>
</name>
<name>
<surname>DeAngelis</surname>
<given-names>KM</given-names>
</name>
<etal></etal>
</person-group>
<article-title>Metagenomic analysis of a permafrost microbial community reveals a rapid response to thaw</article-title>
<source>Nature</source>
<year>2011</year>
<volume>480</volume>
<fpage>368</fpage>
<lpage>71</lpage>
<pub-id pub-id-type="pmid">22056985</pub-id>
</element-citation>
</ref>
<ref id="bbs033-B79">
<label>79</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Brulc</surname>
<given-names>JM</given-names>
</name>
<name>
<surname>Antonopoulos</surname>
<given-names>DA</given-names>
</name>
<name>
<surname>Miller</surname>
<given-names>ME</given-names>
</name>
<etal></etal>
</person-group>
<article-title>Gene-centric metagenomics of the fiber-adherent bovine rumen microbiome reveals forage specific glycoside hydrolases</article-title>
<source>Proc Natl Acad Sci USA</source>
<year>2009</year>
<volume>106</volume>
<fpage>1948</fpage>
<lpage>53</lpage>
<pub-id pub-id-type="pmid">19181843</pub-id>
</element-citation>
</ref>
<ref id="bbs033-B80">
<label>80</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Willner</surname>
<given-names>D</given-names>
</name>
<name>
<surname>Furlan</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Haynes</surname>
<given-names>M</given-names>
</name>
<etal></etal>
</person-group>
<article-title>Metagenomic analysis of respiratory tract DNA viral communities in cystic fibrosis and non-cystic fibrosis individuals</article-title>
<source>PLoS One</source>
<year>2009</year>
<volume>4</volume>
<fpage>e7370</fpage>
<lpage>81</lpage>
<pub-id pub-id-type="pmid">19816605</pub-id>
</element-citation>
</ref>
<ref id="bbs033-B81">
<label>81</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Xu</surname>
<given-names>L</given-names>
</name>
<name>
<surname>Chen</surname>
<given-names>H</given-names>
</name>
<name>
<surname>Hu</surname>
<given-names>X</given-names>
</name>
<etal></etal>
</person-group>
<article-title>Average gene length is highly conserved in prokaryotes and eukaryotes and diverges only between the two kingdoms</article-title>
<source>Mol Biol Evol</source>
<year>2006</year>
<volume>23</volume>
<fpage>1107</fpage>
<lpage>8</lpage>
<pub-id pub-id-type="pmid">16611645</pub-id>
</element-citation>
</ref>
<ref id="bbs033-B82">
<label>82</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Yilmaz</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Singh</surname>
<given-names>AK</given-names>
</name>
</person-group>
<article-title>Single cell genome sequencing</article-title>
<source>Curr Opin Biotechnol</source>
<year>2011</year>
<volume>23</volume>
<fpage>1</fpage>
<lpage>7</lpage>
</element-citation>
</ref>
<ref id="bbs033-B83">
<label>83</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Peterson</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Garges</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Giovanni</surname>
<given-names>M</given-names>
</name>
<etal></etal>
</person-group>
<article-title>The NIH Human Microbiome Project</article-title>
<source>Genome Res</source>
<year>2009</year>
<volume>19</volume>
<fpage>2317</fpage>
<lpage>23</lpage>
<pub-id pub-id-type="pmid">19819907</pub-id>
</element-citation>
</ref>
<ref id="bbs033-B84">
<label>84</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Raes</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Korbel</surname>
<given-names>JO</given-names>
</name>
<name>
<surname>Lercher</surname>
<given-names>MJ</given-names>
</name>
<etal></etal>
</person-group>
<article-title>Prediction of effective genome size in metagenomic samples</article-title>
<source>Genome Biol</source>
<year>2007</year>
<volume>8</volume>
<fpage>R10</fpage>
<lpage>20</lpage>
<pub-id pub-id-type="pmid">17224063</pub-id>
</element-citation>
</ref>
<ref id="bbs033-B85">
<label>85</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>van Nimwegen</surname>
<given-names>E</given-names>
</name>
</person-group>
<article-title>Scaling laws in the functional content of genomes</article-title>
<source>Trends Genet</source>
<year>2003</year>
<volume>19</volume>
<fpage>479</fpage>
<lpage>84</lpage>
<pub-id pub-id-type="pmid">12957540</pub-id>
</element-citation>
</ref>
<ref id="bbs033-B86">
<label>86</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Raes</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Harrington</surname>
<given-names>ED</given-names>
</name>
<name>
<surname>Singh</surname>
<given-names>AH</given-names>
</name>
<etal></etal>
</person-group>
<article-title>Protein function space: viewing the limits or limited by our view?</article-title>
<source>Curr Opin Struct Biol</source>
<year>2007</year>
<volume>17</volume>
<fpage>362</fpage>
<lpage>9</lpage>
<pub-id pub-id-type="pmid">17574832</pub-id>
</element-citation>
</ref>
<ref id="bbs033-B87">
<label>87</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Beszteri</surname>
<given-names>B</given-names>
</name>
<name>
<surname>Temperton</surname>
<given-names>B</given-names>
</name>
<name>
<surname>Frickenhaus</surname>
<given-names>S</given-names>
</name>
<etal></etal>
</person-group>
<article-title>Average genome size: a potential source of bias in comparative metagenomics</article-title>
<source>ISME J</source>
<year>2010</year>
<volume>4</volume>
<fpage>1075</fpage>
<lpage>7</lpage>
<pub-id pub-id-type="pmid">20336158</pub-id>
</element-citation>
</ref>
<ref id="bbs033-B88">
<label>88</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Gomez-Alvarez</surname>
<given-names>V</given-names>
</name>
<name>
<surname>Teal</surname>
<given-names>TK</given-names>
</name>
<name>
<surname>Schmidt</surname>
<given-names>TM</given-names>
</name>
</person-group>
<article-title>Systematic artifacts in metagenomes from complex microbial communities</article-title>
<source>ISME J</source>
<year>2009</year>
<volume>3</volume>
<fpage>1314</fpage>
<lpage>7</lpage>
<pub-id pub-id-type="pmid">19587772</pub-id>
</element-citation>
</ref>
<ref id="bbs033-B89">
<label>89</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Peltola</surname>
<given-names>H</given-names>
</name>
<name>
<surname>Soderlund</surname>
<given-names>H</given-names>
</name>
<name>
<surname>Ukkonen</surname>
<given-names>E</given-names>
</name>
</person-group>
<article-title>Algorithms for the search of amino acid patterns in nucleic acid sequences</article-title>
<source>Nucleic Acids Res</source>
<year>1986</year>
<volume>14</volume>
<fpage>99</fpage>
<lpage>107</lpage>
<pub-id pub-id-type="pmid">3753796</pub-id>
</element-citation>
</ref>
<ref id="bbs033-B90">
<label>90</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Guan</surname>
<given-names>X</given-names>
</name>
<name>
<surname>Uberbacher</surname>
<given-names>EC</given-names>
</name>
</person-group>
<article-title>Alignments of DNA and protein sequences containing frameshift errors</article-title>
<source>Comput Appl Biosci</source>
<year>1996</year>
<volume>12</volume>
<fpage>31</fpage>
<lpage>40</lpage>
<pub-id pub-id-type="pmid">8670617</pub-id>
</element-citation>
</ref>
<ref id="bbs033-B91">
<label>91</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Brown</surname>
<given-names>NP</given-names>
</name>
<name>
<surname>Sander</surname>
<given-names>C</given-names>
</name>
<name>
<surname>Bork</surname>
<given-names>P</given-names>
</name>
</person-group>
<article-title>Frame: detection of genomic sequencing errors</article-title>
<source>Bioinformatics</source>
<year>1998</year>
<volume>14</volume>
<fpage>367</fpage>
<lpage>71</lpage>
<pub-id pub-id-type="pmid">9632832</pub-id>
</element-citation>
</ref>
<ref id="bbs033-B92">
<label>92</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Halperin</surname>
<given-names>E</given-names>
</name>
<name>
<surname>Faigler</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Gill-More</surname>
<given-names>R</given-names>
</name>
</person-group>
<article-title>FramePlus: aligning DNA to protein sequences</article-title>
<source>Bioinformatics</source>
<year>1999</year>
<volume>15</volume>
<fpage>867</fpage>
<lpage>73</lpage>
<pub-id pub-id-type="pmid">10743553</pub-id>
</element-citation>
</ref>
<ref id="bbs033-B93">
<label>93</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Zhang</surname>
<given-names>Y</given-names>
</name>
<name>
<surname>Sun</surname>
<given-names>Y</given-names>
</name>
</person-group>
<article-title>HMM-FRAME: accurate protein domain classification for metagenomic sequences containing frameshift errors</article-title>
<source>BMC Bioinformatics</source>
<year>2011</year>
<volume>12</volume>
<fpage>198</fpage>
<lpage>207</lpage>
<pub-id pub-id-type="pmid">21609463</pub-id>
</element-citation>
</ref>
<ref id="bbs033-B94">
<label>94</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Venter</surname>
<given-names>JC</given-names>
</name>
<name>
<surname>Remington</surname>
<given-names>K</given-names>
</name>
<name>
<surname>Heidelberg</surname>
<given-names>JF</given-names>
</name>
<etal></etal>
</person-group>
<article-title>Environmental genome shotgun sequencing of the Sargasso Sea</article-title>
<source>Science</source>
<year>2004</year>
<volume>304</volume>
<fpage>66</fpage>
<lpage>74</lpage>
<pub-id pub-id-type="pmid">15001713</pub-id>
</element-citation>
</ref>
<ref id="bbs033-B95">
<label>95</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Johnston</surname>
<given-names>AW</given-names>
</name>
<name>
<surname>Li</surname>
<given-names>Y</given-names>
</name>
<name>
<surname>Ogilvie</surname>
<given-names>L</given-names>
</name>
</person-group>
<article-title>Metagenomic marine nitrogen fixation–feast or famine?</article-title>
<source>Trends Microbiol</source>
<year>2005</year>
<volume>13</volume>
<fpage>416</fpage>
<lpage>20</lpage>
<pub-id pub-id-type="pmid">16043354</pub-id>
</element-citation>
</ref>
<ref id="bbs033-B96">
<label>96</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Remington</surname>
<given-names>KA</given-names>
</name>
<name>
<surname>Heidelberg</surname>
<given-names>K</given-names>
</name>
<name>
<surname>Venter</surname>
<given-names>JC</given-names>
</name>
</person-group>
<article-title>Taking metagenomic studies in context</article-title>
<source>Trends Microbiol</source>
<year>2005</year>
<volume>13</volume>
<fpage>404</fpage>
<pub-id pub-id-type="pmid">16039858</pub-id>
</element-citation>
</ref>
<ref id="bbs033-B97">
<label>97</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Qin</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Li</surname>
<given-names>R</given-names>
</name>
<name>
<surname>Raes</surname>
<given-names>J</given-names>
</name>
<etal></etal>
</person-group>
<article-title>A human gut microbial gene catalogue established by metagenomic sequencing</article-title>
<source>Nature</source>
<year>2010</year>
<volume>464</volume>
<fpage>59</fpage>
<lpage>65</lpage>
<pub-id pub-id-type="pmid">20203603</pub-id>
</element-citation>
</ref>
<ref id="bbs033-B98">
<label>98</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Brown</surname>
<given-names>CT</given-names>
</name>
<name>
<surname>Davis-Richardson</surname>
<given-names>AG</given-names>
</name>
<name>
<surname>Giongo</surname>
<given-names>A</given-names>
</name>
<etal></etal>
</person-group>
<article-title>Gut microbiome metagenomics analysis suggests a functional model for the development of autoimmunity for type 1 diabetes</article-title>
<source>PLoS One</source>
<year>2011</year>
<volume>6</volume>
<fpage>e25792</fpage>
<lpage>800</lpage>
<pub-id pub-id-type="pmid">22043294</pub-id>
</element-citation>
</ref>
<ref id="bbs033-B99">
<label>99</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Belda-Ferre</surname>
<given-names>P</given-names>
</name>
<name>
<surname>Alcaraz</surname>
<given-names>LD</given-names>
</name>
<name>
<surname>Cabrera-Rubio</surname>
<given-names>R</given-names>
</name>
<etal></etal>
</person-group>
<article-title>The oral metagenome in health and disease</article-title>
<source>ISME J</source>
<year>2012</year>
<volume>6</volume>
<fpage>46</fpage>
<lpage>56</lpage>
<pub-id pub-id-type="pmid">21716308</pub-id>
</element-citation>
</ref>
<ref id="bbs033-B100">
<label>100</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Sharma</surname>
<given-names>VK</given-names>
</name>
<name>
<surname>Kumar</surname>
<given-names>N</given-names>
</name>
<name>
<surname>Prakash</surname>
<given-names>T</given-names>
</name>
<etal></etal>
</person-group>
<article-title>MetaBioME: a database to explore commercially useful enzymes in metagenomic datasets</article-title>
<source>Nucleic Acids Res</source>
<year>2010</year>
<volume>38</volume>
<fpage>D468</fpage>
<lpage>72</lpage>
<pub-id pub-id-type="pmid">19906710</pub-id>
</element-citation>
</ref>
<ref id="bbs033-B101">
<label>101</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Tasse</surname>
<given-names>L</given-names>
</name>
<name>
<surname>Bercovici</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Pizzut-Serin</surname>
<given-names>S</given-names>
</name>
<etal></etal>
</person-group>
<article-title>Functional metagenomics to mine the human gut microbiome for dietary fiber catabolic enzymes</article-title>
<source>Genome Res</source>
<year>2010</year>
<volume>20</volume>
<fpage>1605</fpage>
<lpage>12</lpage>
<pub-id pub-id-type="pmid">20841432</pub-id>
</element-citation>
</ref>
<ref id="bbs033-B102">
<label>102</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Belda-Ferre</surname>
<given-names>P</given-names>
</name>
<name>
<surname>Cabrera-Rubio</surname>
<given-names>R</given-names>
</name>
<name>
<surname>Moya</surname>
<given-names>A</given-names>
</name>
<etal></etal>
</person-group>
<article-title>Mining virulence genes using metagenomics</article-title>
<source>PLoS One</source>
<year>2011</year>
<volume>6</volume>
<fpage>e24975</fpage>
<lpage>80</lpage>
<pub-id pub-id-type="pmid">22039404</pub-id>
</element-citation>
</ref>
<ref id="bbs033-B103">
<label>103</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Turnbaugh</surname>
<given-names>PJ</given-names>
</name>
<name>
<surname>Quince</surname>
<given-names>C</given-names>
</name>
<name>
<surname>Faith</surname>
<given-names>JJ</given-names>
</name>
<etal></etal>
</person-group>
<article-title>Organismal, genetic, and transcriptional variation in the deeply sequenced gut microbiomes of identical twins</article-title>
<source>Proc Natl Acad Sci USA</source>
<year>2010</year>
<volume>107</volume>
<fpage>7503</fpage>
<lpage>8</lpage>
<pub-id pub-id-type="pmid">20363958</pub-id>
</element-citation>
</ref>
<ref id="bbs033-B104">
<label>104</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Gosalbes</surname>
<given-names>MJ</given-names>
</name>
<name>
<surname>Durban</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Pignatelli</surname>
<given-names>M</given-names>
</name>
<etal></etal>
</person-group>
<article-title>Metatranscriptomic approach to analyze the functional human gut microbiota</article-title>
<source>PLoS One</source>
<year>2011</year>
<volume>6</volume>
<fpage>e17447</fpage>
<lpage>55</lpage>
<pub-id pub-id-type="pmid">21408168</pub-id>
</element-citation>
</ref>
<ref id="bbs033-B105">
<label>105</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Taverna</surname>
<given-names>DM</given-names>
</name>
<name>
<surname>Goldstein</surname>
<given-names>RA</given-names>
</name>
</person-group>
<article-title>Why are proteins marginally stable?</article-title>
<source>Proteins</source>
<year>2002</year>
<volume>46</volume>
<fpage>105</fpage>
<lpage>9</lpage>
<pub-id pub-id-type="pmid">11746707</pub-id>
</element-citation>
</ref>
<ref id="bbs033-B106">
<label>106</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Klaassens</surname>
<given-names>ES</given-names>
</name>
<name>
<surname>de Vos</surname>
<given-names>WM</given-names>
</name>
<name>
<surname>Vaughan</surname>
<given-names>EE</given-names>
</name>
</person-group>
<article-title>Metaproteomics approach to study the functionality of the microbiota in the human infant gastrointestinal tract</article-title>
<source>Appl Environ Microbiol</source>
<year>2007</year>
<volume>73</volume>
<fpage>1388</fpage>
<lpage>92</lpage>
<pub-id pub-id-type="pmid">17158612</pub-id>
</element-citation>
</ref>
<ref id="bbs033-B107">
<label>107</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Verberkmoes</surname>
<given-names>NC</given-names>
</name>
<name>
<surname>Russell</surname>
<given-names>AL</given-names>
</name>
<name>
<surname>Shah</surname>
<given-names>M</given-names>
</name>
<etal></etal>
</person-group>
<article-title>Shotgun metaproteomics of the human distal gut microbiota</article-title>
<source>ISME J</source>
<year>2009</year>
<volume>3</volume>
<fpage>179</fpage>
<lpage>89</lpage>
<pub-id pub-id-type="pmid">18971961</pub-id>
</element-citation>
</ref>
<ref id="bbs033-B108">
<label>108</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Li</surname>
<given-names>X</given-names>
</name>
<name>
<surname>LeBlanc</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Truong</surname>
<given-names>A</given-names>
</name>
<etal></etal>
</person-group>
<article-title>A metaproteomic approach to study human-microbial ecosystems at the mucosal luminal interface</article-title>
<source>PLoS One</source>
<year>2011</year>
<volume>6</volume>
<fpage>e26542</fpage>
<lpage>55</lpage>
<pub-id pub-id-type="pmid">22132074</pub-id>
</element-citation>
</ref>
<ref id="bbs033-B109">
<label>109</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Kolmeder</surname>
<given-names>CA</given-names>
</name>
<name>
<surname>de Been</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Nikkila</surname>
<given-names>J</given-names>
</name>
<etal></etal>
</person-group>
<article-title>Comparative metaproteomics and diversity analysis of human intestinal microbiota testifies for its temporal stability and expression of core functions</article-title>
<source>PLoS One</source>
<year>2012</year>
<volume>7</volume>
<fpage>e29913</fpage>
<lpage>26</lpage>
<pub-id pub-id-type="pmid">22279554</pub-id>
</element-citation>
</ref>
<ref id="bbs033-B110">
<label>110</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Kaddurah-Daouk</surname>
<given-names>R</given-names>
</name>
<name>
<surname>Kristal</surname>
<given-names>BS</given-names>
</name>
<name>
<surname>Weinshilboum</surname>
<given-names>RM</given-names>
</name>
</person-group>
<article-title>Metabolomics: a global biochemical approach to drug response and disease</article-title>
<source>Annu Rev Pharmacol Toxicol</source>
<year>2008</year>
<volume>48</volume>
<fpage>653</fpage>
<lpage>83</lpage>
<pub-id pub-id-type="pmid">18184107</pub-id>
</element-citation>
</ref>
<ref id="bbs033-B111">
<label>111</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Saito</surname>
<given-names>K</given-names>
</name>
<name>
<surname>Matsuda</surname>
<given-names>F</given-names>
</name>
</person-group>
<article-title>Metabolomics for functional genomics, systems biology, and biotechnology</article-title>
<source>Annu Rev Plant Biol</source>
<year>2010</year>
<volume>61</volume>
<fpage>463</fpage>
<lpage>89</lpage>
<pub-id pub-id-type="pmid">19152489</pub-id>
</element-citation>
</ref>
<ref id="bbs033-B112">
<label>112</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Claus</surname>
<given-names>SP</given-names>
</name>
<name>
<surname>Tsang</surname>
<given-names>TM</given-names>
</name>
<name>
<surname>Wang</surname>
<given-names>Y</given-names>
</name>
<etal></etal>
</person-group>
<article-title>Systemic multicompartmental effects of the gut microbiome on mouse metabolic phenotypes</article-title>
<source>Mol Syst Biol</source>
<year>2008</year>
<volume>4</volume>
<fpage>219</fpage>
<pub-id pub-id-type="pmid">18854818</pub-id>
</element-citation>
</ref>
<ref id="bbs033-B113">
<label>113</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Fukuda</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Nakanishi</surname>
<given-names>Y</given-names>
</name>
<name>
<surname>Chikayama</surname>
<given-names>E</given-names>
</name>
<etal></etal>
</person-group>
<article-title>Evaluation and characterization of bacterial metabolic dynamics with a novel profiling technique, real-time metabolotyping</article-title>
<source>PLoS One</source>
<year>2009</year>
<volume>4</volume>
<fpage>e4893</fpage>
<lpage>902</lpage>
<pub-id pub-id-type="pmid">19287504</pub-id>
</element-citation>
</ref>
<ref id="bbs033-B114">
<label>114</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Han</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Antunes</surname>
<given-names>LC</given-names>
</name>
<name>
<surname>Finlay</surname>
<given-names>BB</given-names>
</name>
<etal></etal>
</person-group>
<article-title>Metabolomics: towards understanding host-microbe interactions</article-title>
<source>Future Microbiol</source>
<year>2010</year>
<volume>5</volume>
<fpage>153</fpage>
<lpage>61</lpage>
<pub-id pub-id-type="pmid">20143941</pub-id>
</element-citation>
</ref>
<ref id="bbs033-B115">
<label>115</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Claus</surname>
<given-names>SP</given-names>
</name>
<name>
<surname>Ellero</surname>
<given-names>SL</given-names>
</name>
<name>
<surname>Berger</surname>
<given-names>B</given-names>
</name>
<etal></etal>
</person-group>
<article-title>Colonization-induced host-gut microbial metabolic interaction</article-title>
<source>MBIO</source>
<year>2011</year>
<volume>2</volume>
<fpage>e00271</fpage>
<lpage>10</lpage>
<pub-id pub-id-type="pmid">21363910</pub-id>
</element-citation>
</ref>
<ref id="bbs033-B116">
<label>116</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Fukuda</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Toh</surname>
<given-names>H</given-names>
</name>
<name>
<surname>Hase</surname>
<given-names>K</given-names>
</name>
<etal></etal>
</person-group>
<article-title>Bifidobacteria can protect from enteropathogenic infection through production of acetate</article-title>
<source>Nature</source>
<year>2011</year>
<volume>469</volume>
<fpage>543</fpage>
<lpage>7</lpage>
<pub-id pub-id-type="pmid">21270894</pub-id>
</element-citation>
</ref>
<ref id="bbs033-B117">
<label>117</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Nakanishi</surname>
<given-names>Y</given-names>
</name>
<name>
<surname>Fukuda</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Chikayama</surname>
<given-names>E</given-names>
</name>
<etal></etal>
</person-group>
<article-title>Dynamic omics approach identifies nutrition-mediated microbial interactions</article-title>
<source>J Proteome Res</source>
<year>2011</year>
<volume>10</volume>
<fpage>824</fpage>
<lpage>36</lpage>
<pub-id pub-id-type="pmid">21058740</pub-id>
</element-citation>
</ref>
<ref id="bbs033-B118">
<label>118</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Schubotz</surname>
<given-names>F</given-names>
</name>
<name>
<surname>Wakeham</surname>
<given-names>SG</given-names>
</name>
<name>
<surname>Lipp</surname>
<given-names>JS</given-names>
</name>
<etal></etal>
</person-group>
<article-title>Detection of microbial biomass by intact polar membrane lipid analysis in the water column and surface sediments of the Black Sea</article-title>
<source>Environ Microbiol</source>
<year>2009</year>
<volume>11</volume>
<fpage>2720</fpage>
<lpage>34</lpage>
<pub-id pub-id-type="pmid">19624710</pub-id>
</element-citation>
</ref>
<ref id="bbs033-B119">
<label>119</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Pitcher</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Hopmans</surname>
<given-names>EC</given-names>
</name>
<name>
<surname>Mosier</surname>
<given-names>AC</given-names>
</name>
<etal></etal>
</person-group>
<article-title>Core and intact polar glycerol dibiphytanyl glycerol tetraether lipids of ammonia-oxidizing archaea enriched from marine and estuarine sediments</article-title>
<source>Appl Environ Microbiol</source>
<year>2011</year>
<volume>77</volume>
<fpage>3468</fpage>
<lpage>77</lpage>
<pub-id pub-id-type="pmid">21441324</pub-id>
</element-citation>
</ref>
</ref-list>
</back>
</pmc>
</record>

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Ticri/CIDE/explor/CyberinfraV1/Data/Pmc/Corpus
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 0005069 | SxmlIndent | more

Or

HfdSelect -h $EXPLOR_AREA/Data/Pmc/Corpus/biblio.hfd -nk 0005069 | SxmlIndent | more

To add a link to this page in the Wicri network

{{Explor lien
   |wiki=    Ticri/CIDE
   |area=    CyberinfraV1
   |flux=    Pmc
   |étape=   Corpus
   |type=    RBID
   |clé=     
   |texte=   
}}

Wicri

This area was generated with Dilib version V0.6.25.
Data generation: Thu Oct 27 09:30:58 2016. Site generation: Sun Mar 10 23:08:40 2024