Cyberinfrastructure Exploration Server

Warning: this site is under development!
Warning: this site was generated automatically from raw corpora.
The information has therefore not been validated.

COGNIZER: A Framework for Functional Annotation of Metagenomic Datasets

Internal identifier: 000082 (Pmc/Corpus); previous: 000081; next: 000083

Authors: Tungadri Bose; Mohammed Monzoorul Haque; CVSK Reddy; Sharmila S. Mande

Source: PLoS ONE (2015)

RBID : PMC:4641738

Abstract

Background

Recent advances in sequencing technologies have resulted in an unprecedented increase in the number of metagenomes being sequenced worldwide. Given their volume, functional annotation of metagenomic sequence datasets requires specialized computational tools and techniques. Despite their high accuracy, existing stand-alone functional annotation tools require end-users to perform compute-intensive homology searches of metagenomic datasets against multiple databases prior to functional analysis. Although web-based functional annotation servers address, to some extent, the problem of availability of compute resources, uploading and analyzing huge volumes of sequence data on a shared public web service has its own set of limitations. In this study, we present COGNIZER, a comprehensive stand-alone annotation framework that enables end-users to functionally annotate the sequences constituting metagenomic datasets. The COGNIZER framework provides multiple workflow options. A subset of these options employs a novel directed-search strategy that reduces the overall compute requirements for end-users. The COGNIZER framework also includes a cross-mapping database that enables end-users to simultaneously derive or infer KEGG, Pfam, GO, and SEED subsystem information from the COG annotations.

Results

Validation experiments performed with real-world metagenomes and metatranscriptomes, generated using diverse sequencing technologies, indicate that the novel directed-search strategy employed in COGNIZER reduces the compute requirements without significant loss in annotation accuracy. A comparison of COGNIZER's results with pre-computed benchmark values indicates the reliability of the cross-mapping database employed in COGNIZER.

Conclusion

The COGNIZER framework is capable of comprehensively annotating any metagenomic or metatranscriptomic dataset from varied sequencing platforms in functional terms. Multiple search options in COGNIZER provide end-users the flexibility of choosing a homology search protocol based on available compute resources. The cross-mapping database in COGNIZER is of high utility since it enables end-users to directly infer/derive KEGG, Pfam, GO, and SEED subsystem annotations from COG categorizations. Furthermore, availability of COGNIZER as a stand-alone scalable implementation is expected to make it a valuable annotation tool in the field of metagenomic research.
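
To illustrate the idea of deriving other annotations from COG assignments, the following is a minimal Python sketch of a cross-mapping lookup. It is not COGNIZER's implementation: the table layout, the function name cross_map, and every identifier in the table (COG_A, K_1, PF_1, and so on) are hypothetical placeholders chosen for illustration, and the real database format may differ.

```python
from collections import defaultdict

# Hypothetical cross-mapping table: COG ID -> annotations in other schemes.
# All identifiers are illustrative placeholders, not entries from COGNIZER.
CROSS_MAP = {
    "COG_A": {"KEGG": ["K_1"], "Pfam": ["PF_1", "PF_2"],
              "GO": ["GO_1"], "SEED": ["Subsystem_1"]},
    "COG_B": {"KEGG": ["K_2"], "Pfam": ["PF_3"],
              "GO": [], "SEED": ["Subsystem_1"]},
}


def cross_map(read_to_cog, table=CROSS_MAP):
    """Count derived annotations per scheme from per-read COG assignments.

    Reads whose COG is absent from the table contribute nothing.
    """
    counts = defaultdict(lambda: defaultdict(int))
    for cog in read_to_cog.values():
        for scheme, terms in table.get(cog, {}).items():
            for term in terms:
                counts[scheme][term] += 1
    # Convert nested defaultdicts to plain dicts for easier inspection.
    return {scheme: dict(terms) for scheme, terms in counts.items()}


if __name__ == "__main__":
    # Hypothetical per-read COG assignments from a homology search.
    reads = {"r1": "COG_A", "r2": "COG_B", "r3": "COG_A", "r4": "COG_X"}
    for scheme, terms in sorted(cross_map(reads).items()):
        print(scheme, terms)
```

In this sketch a single homology search against COG is enough to populate all four annotation schemes, which is the saving the cross-mapping database is described as providing; how ambiguous or one-to-many mappings are resolved is left open here.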

Availability and Implementation

A Linux implementation of COGNIZER is freely available for download from the following links: http://metagenomics.atc.tcs.com/cognizer, https://metagenomics.atc.tcs.com/function/cognizer.


DOI: 10.1371/journal.pone.0142102
PubMed: 26561344
PubMed Central: 4641738

Links to Exploration step

PMC:4641738

The document in XML format

<record>
<TEI>
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en">COGNIZER: A Framework for Functional Annotation of Metagenomic Datasets</title>
<author>
<name sortKey="Bose, Tungadri" sort="Bose, Tungadri" uniqKey="Bose T" first="Tungadri" last="Bose">Tungadri Bose</name>
<affiliation>
<nlm:aff id="aff001"></nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Haque, Mohammed Monzoorul" sort="Haque, Mohammed Monzoorul" uniqKey="Haque M" first="Mohammed Monzoorul" last="Haque">Mohammed Monzoorul Haque</name>
<affiliation>
<nlm:aff id="aff001"></nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Reddy, Cvsk" sort="Reddy, Cvsk" uniqKey="Reddy C" first="Cvsk" last="Reddy">Cvsk Reddy</name>
<affiliation>
<nlm:aff id="aff001"></nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Mande, Sharmila S" sort="Mande, Sharmila S" uniqKey="Mande S" first="Sharmila S." last="Mande">Sharmila S. Mande</name>
<affiliation>
<nlm:aff id="aff001"></nlm:aff>
</affiliation>
</author>
</titleStmt>
<publicationStmt>
<idno type="wicri:source">PMC</idno>
<idno type="pmid">26561344</idno>
<idno type="pmc">4641738</idno>
<idno type="url">http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4641738</idno>
<idno type="RBID">PMC:4641738</idno>
<idno type="doi">10.1371/journal.pone.0142102</idno>
<date when="2015">2015</date>
<idno type="wicri:Area/Pmc/Corpus">000082</idno>
</publicationStmt>
<sourceDesc>
<biblStruct>
<analytic>
<title xml:lang="en" level="a" type="main">COGNIZER: A Framework for Functional Annotation of Metagenomic Datasets</title>
<author>
<name sortKey="Bose, Tungadri" sort="Bose, Tungadri" uniqKey="Bose T" first="Tungadri" last="Bose">Tungadri Bose</name>
<affiliation>
<nlm:aff id="aff001"></nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Haque, Mohammed Monzoorul" sort="Haque, Mohammed Monzoorul" uniqKey="Haque M" first="Mohammed Monzoorul" last="Haque">Mohammed Monzoorul Haque</name>
<affiliation>
<nlm:aff id="aff001"></nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Reddy, Cvsk" sort="Reddy, Cvsk" uniqKey="Reddy C" first="Cvsk" last="Reddy">Cvsk Reddy</name>
<affiliation>
<nlm:aff id="aff001"></nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Mande, Sharmila S" sort="Mande, Sharmila S" uniqKey="Mande S" first="Sharmila S." last="Mande">Sharmila S. Mande</name>
<affiliation>
<nlm:aff id="aff001"></nlm:aff>
</affiliation>
</author>
</analytic>
<series>
<title level="j">PLoS ONE</title>
<idno type="eISSN">1932-6203</idno>
<imprint>
<date when="2015">2015</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
</fileDesc>
<profileDesc>
<textClass></textClass>
</profileDesc>
</teiHeader>
<front>
<div type="abstract" xml:lang="en">
<sec id="sec001">
<title>Background</title>
<p>Recent advances in sequencing technologies have resulted in an unprecedented increase in the number of metagenomes being sequenced worldwide. Given their volume, functional annotation of metagenomic sequence datasets requires specialized computational tools and techniques. Despite their high accuracy, existing stand-alone functional annotation tools require end-users to perform compute-intensive homology searches of metagenomic datasets against multiple databases prior to functional analysis. Although web-based functional annotation servers address, to some extent, the problem of availability of compute resources, uploading and analyzing huge volumes of sequence data on a shared public web service has its own set of limitations. In this study, we present COGNIZER, a comprehensive stand-alone annotation framework that enables end-users to functionally annotate the sequences constituting metagenomic datasets. The COGNIZER framework provides multiple workflow options. A subset of these options employs a novel directed-search strategy that reduces the overall compute requirements for end-users. The COGNIZER framework also includes a cross-mapping database that enables end-users to simultaneously derive or infer KEGG, Pfam, GO, and SEED subsystem information from the COG annotations.</p>
</sec>
<sec id="sec002">
<title>Results</title>
<p>Validation experiments performed with real-world metagenomes and metatranscriptomes, generated using diverse sequencing technologies, indicate that the novel directed-search strategy employed in COGNIZER reduces the compute requirements without significant loss in annotation accuracy. A comparison of COGNIZER's results with pre-computed benchmark values indicates the reliability of the cross-mapping database employed in COGNIZER.</p>
</sec>
<sec id="sec003">
<title>Conclusion</title>
<p>The COGNIZER framework is capable of comprehensively annotating any metagenomic or metatranscriptomic dataset from varied sequencing platforms in functional terms. Multiple search options in COGNIZER provide end-users the flexibility of choosing a homology search protocol based on available compute resources. The cross-mapping database in COGNIZER is of high utility since it enables end-users to directly infer/derive KEGG, Pfam, GO, and SEED subsystem annotations from COG categorizations. Furthermore, availability of COGNIZER as a stand-alone scalable implementation is expected to make it a valuable annotation tool in the field of metagenomic research.</p>
</sec>
<sec id="sec004">
<title>Availability and Implementation</title>
<p>A Linux implementation of COGNIZER is freely available for download from the following links:
<ext-link ext-link-type="uri" xlink:href="http://metagenomics.atc.tcs.com/cognizer">http://metagenomics.atc.tcs.com/cognizer</ext-link>
,
<ext-link ext-link-type="uri" xlink:href="https://metagenomics.atc.tcs.com/function/cognizer">https://metagenomics.atc.tcs.com/function/cognizer</ext-link>
.</p>
</sec>
</div>
</front>
<back>
<div1 type="bibliography">
<listBibl>
<biblStruct>
<analytic>
<author>
<name sortKey="Prakash, T" uniqKey="Prakash T">T Prakash</name>
</author>
<author>
<name sortKey="Taylor, Td" uniqKey="Taylor T">TD Taylor</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Ye, Y" uniqKey="Ye Y">Y Ye</name>
</author>
<author>
<name sortKey="Choi J, H" uniqKey="Choi J H">J-H Choi</name>
</author>
<author>
<name sortKey="Tang, H" uniqKey="Tang H">H Tang</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Huson, Dh" uniqKey="Huson D">DH Huson</name>
</author>
<author>
<name sortKey="Xie, C" uniqKey="Xie C">C Xie</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Huson, Dh" uniqKey="Huson D">DH Huson</name>
</author>
<author>
<name sortKey="Mitra, S" uniqKey="Mitra S">S Mitra</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Sanli, K" uniqKey="Sanli K">K Sanli</name>
</author>
<author>
<name sortKey="Karlsson, Fh" uniqKey="Karlsson F">FH Karlsson</name>
</author>
<author>
<name sortKey="Nookaew, I" uniqKey="Nookaew I">I Nookaew</name>
</author>
<author>
<name sortKey="Nielsen, J" uniqKey="Nielsen J">J Nielsen</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Yu, K" uniqKey="Yu K">K Yu</name>
</author>
<author>
<name sortKey="Zhang, T" uniqKey="Zhang T">T Zhang</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Meyer, F" uniqKey="Meyer F">F Meyer</name>
</author>
<author>
<name sortKey="Paarmann, D" uniqKey="Paarmann D">D Paarmann</name>
</author>
<author>
<name sortKey="D Ouza, M" uniqKey="D Ouza M">M D’Souza</name>
</author>
<author>
<name sortKey="Olson, R" uniqKey="Olson R">R Olson</name>
</author>
<author>
<name sortKey="Glass, Em" uniqKey="Glass E">EM Glass</name>
</author>
<author>
<name sortKey="Kubal, M" uniqKey="Kubal M">M Kubal</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Goll, J" uniqKey="Goll J">J Goll</name>
</author>
<author>
<name sortKey="Rusch, Db" uniqKey="Rusch D">DB Rusch</name>
</author>
<author>
<name sortKey="Tanenbaum, Dm" uniqKey="Tanenbaum D">DM Tanenbaum</name>
</author>
<author>
<name sortKey="Thiagarajan, M" uniqKey="Thiagarajan M">M Thiagarajan</name>
</author>
<author>
<name sortKey="Li, K" uniqKey="Li K">K Li</name>
</author>
<author>
<name sortKey="Methe, Ba" uniqKey="Methe B">BA Methé</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Sun, S" uniqKey="Sun S">S Sun</name>
</author>
<author>
<name sortKey="Chen, J" uniqKey="Chen J">J Chen</name>
</author>
<author>
<name sortKey="Li, W" uniqKey="Li W">W Li</name>
</author>
<author>
<name sortKey="Altintas, I" uniqKey="Altintas I">I Altintas</name>
</author>
<author>
<name sortKey="Lin, A" uniqKey="Lin A">A Lin</name>
</author>
<author>
<name sortKey="Peltier, S" uniqKey="Peltier S">S Peltier</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Lingner, T" uniqKey="Lingner T">T Lingner</name>
</author>
<author>
<name sortKey="Asshauer, Kp" uniqKey="Asshauer K">KP Asshauer</name>
</author>
<author>
<name sortKey="Schreiber, F" uniqKey="Schreiber F">F Schreiber</name>
</author>
<author>
<name sortKey="Meinicke, P" uniqKey="Meinicke P">P Meinicke</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Markowitz, Vm" uniqKey="Markowitz V">VM Markowitz</name>
</author>
<author>
<name sortKey="Chen, I Ma" uniqKey="Chen I">I-MA Chen</name>
</author>
<author>
<name sortKey="Chu, K" uniqKey="Chu K">K Chu</name>
</author>
<author>
<name sortKey="Szeto, E" uniqKey="Szeto E">E Szeto</name>
</author>
<author>
<name sortKey="Palaniappan, K" uniqKey="Palaniappan K">K Palaniappan</name>
</author>
<author>
<name sortKey="Grechkin, Y" uniqKey="Grechkin Y">Y Grechkin</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Qin, J" uniqKey="Qin J">J Qin</name>
</author>
<author>
<name sortKey="Li, Y" uniqKey="Li Y">Y Li</name>
</author>
<author>
<name sortKey="Cai, Z" uniqKey="Cai Z">Z Cai</name>
</author>
<author>
<name sortKey="Li, S" uniqKey="Li S">S Li</name>
</author>
<author>
<name sortKey="Zhu, J" uniqKey="Zhu J">J Zhu</name>
</author>
<author>
<name sortKey="Zhang, F" uniqKey="Zhang F">F Zhang</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Karlsson, Fh" uniqKey="Karlsson F">FH Karlsson</name>
</author>
<author>
<name sortKey="Tremaroli, V" uniqKey="Tremaroli V">V Tremaroli</name>
</author>
<author>
<name sortKey="Nookaew, I" uniqKey="Nookaew I">I Nookaew</name>
</author>
<author>
<name sortKey="Bergstrom, G" uniqKey="Bergstrom G">G Bergström</name>
</author>
<author>
<name sortKey="Behre, Cj" uniqKey="Behre C">CJ Behre</name>
</author>
<author>
<name sortKey="Fagerberg, B" uniqKey="Fagerberg B">B Fagerberg</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Tatusov, Rl" uniqKey="Tatusov R">RL Tatusov</name>
</author>
<author>
<name sortKey="Koonin, Ev" uniqKey="Koonin E">EV Koonin</name>
</author>
<author>
<name sortKey="Lipman, Dj" uniqKey="Lipman D">DJ Lipman</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Kanehisa, M" uniqKey="Kanehisa M">M Kanehisa</name>
</author>
<author>
<name sortKey="Goto, S" uniqKey="Goto S">S Goto</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Kanehisa, M" uniqKey="Kanehisa M">M Kanehisa</name>
</author>
<author>
<name sortKey="Goto, S" uniqKey="Goto S">S Goto</name>
</author>
<author>
<name sortKey="Sato, Y" uniqKey="Sato Y">Y Sato</name>
</author>
<author>
<name sortKey="Kawashima, M" uniqKey="Kawashima M">M Kawashima</name>
</author>
<author>
<name sortKey="Furumichi, M" uniqKey="Furumichi M">M Furumichi</name>
</author>
<author>
<name sortKey="Tanabe, M" uniqKey="Tanabe M">M Tanabe</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Overbeek, R" uniqKey="Overbeek R">R Overbeek</name>
</author>
<author>
<name sortKey="Begley, T" uniqKey="Begley T">T Begley</name>
</author>
<author>
<name sortKey="Butler, Rm" uniqKey="Butler R">RM Butler</name>
</author>
<author>
<name sortKey="Choudhuri, Jv" uniqKey="Choudhuri J">JV Choudhuri</name>
</author>
<author>
<name sortKey="Chuang H, Y" uniqKey="Chuang H Y">H-Y Chuang</name>
</author>
<author>
<name sortKey="Cohoon, M" uniqKey="Cohoon M">M Cohoon</name>
</author>
</analytic>
</biblStruct>
<biblStruct></biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Punta, M" uniqKey="Punta M">M Punta</name>
</author>
<author>
<name sortKey="Coggill, Pc" uniqKey="Coggill P">PC Coggill</name>
</author>
<author>
<name sortKey="Eberhardt, Ry" uniqKey="Eberhardt R">RY Eberhardt</name>
</author>
<author>
<name sortKey="Mistry, J" uniqKey="Mistry J">J Mistry</name>
</author>
<author>
<name sortKey="Tate, J" uniqKey="Tate J">J Tate</name>
</author>
<author>
<name sortKey="Boursnell, C" uniqKey="Boursnell C">C Boursnell</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Goujon, M" uniqKey="Goujon M">M Goujon</name>
</author>
<author>
<name sortKey="Mcwilliam, H" uniqKey="Mcwilliam H">H McWilliam</name>
</author>
<author>
<name sortKey="Li, W" uniqKey="Li W">W Li</name>
</author>
<author>
<name sortKey="Valentin, F" uniqKey="Valentin F">F Valentin</name>
</author>
<author>
<name sortKey="Squizzato, S" uniqKey="Squizzato S">S Squizzato</name>
</author>
<author>
<name sortKey="Paern, J" uniqKey="Paern J">J Paern</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Yamada, T" uniqKey="Yamada T">T Yamada</name>
</author>
<author>
<name sortKey="Letunic, I" uniqKey="Letunic I">I Letunic</name>
</author>
<author>
<name sortKey="Okuda, S" uniqKey="Okuda S">S Okuda</name>
</author>
<author>
<name sortKey="Kanehisa, M" uniqKey="Kanehisa M">M Kanehisa</name>
</author>
<author>
<name sortKey="Bork, P" uniqKey="Bork P">P Bork</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Eddy, Sr" uniqKey="Eddy S">SR Eddy</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Dinsdale, Ea" uniqKey="Dinsdale E">EA Dinsdale</name>
</author>
<author>
<name sortKey="Edwards, Ra" uniqKey="Edwards R">RA Edwards</name>
</author>
<author>
<name sortKey="Hall, D" uniqKey="Hall D">D Hall</name>
</author>
<author>
<name sortKey="Angly, F" uniqKey="Angly F">F Angly</name>
</author>
<author>
<name sortKey="Breitbart, M" uniqKey="Breitbart M">M Breitbart</name>
</author>
<author>
<name sortKey="Brulc, Jm" uniqKey="Brulc J">JM Brulc</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Warren, Rl" uniqKey="Warren R">RL Warren</name>
</author>
<author>
<name sortKey="Freeman, Dj" uniqKey="Freeman D">DJ Freeman</name>
</author>
<author>
<name sortKey="Pleasance, S" uniqKey="Pleasance S">S Pleasance</name>
</author>
<author>
<name sortKey="Watson, P" uniqKey="Watson P">P Watson</name>
</author>
<author>
<name sortKey="Moore, Ra" uniqKey="Moore R">RA Moore</name>
</author>
<author>
<name sortKey="Cochrane, K" uniqKey="Cochrane K">K Cochrane</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Gupta, Ss" uniqKey="Gupta S">SS Gupta</name>
</author>
<author>
<name sortKey="Mohammed, Mh" uniqKey="Mohammed M">MH Mohammed</name>
</author>
<author>
<name sortKey="Ghosh, Ts" uniqKey="Ghosh T">TS Ghosh</name>
</author>
<author>
<name sortKey="Kanungo, S" uniqKey="Kanungo S">S Kanungo</name>
</author>
<author>
<name sortKey="Nair, Gb" uniqKey="Nair G">GB Nair</name>
</author>
<author>
<name sortKey="Mande, Ss" uniqKey="Mande S">SS Mande</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Belda Ferre, P" uniqKey="Belda Ferre P">P Belda-Ferre</name>
</author>
<author>
<name sortKey="Alcaraz, Ld" uniqKey="Alcaraz L">LD Alcaraz</name>
</author>
<author>
<name sortKey="Cabrera Rubio, R" uniqKey="Cabrera Rubio R">R Cabrera-Rubio</name>
</author>
<author>
<name sortKey="Romero, H" uniqKey="Romero H">H Romero</name>
</author>
<author>
<name sortKey="Sim N Soro, A" uniqKey="Sim N Soro A">A Simón-Soro</name>
</author>
<author>
<name sortKey="Pignatelli, M" uniqKey="Pignatelli M">M Pignatelli</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Tyson, Gw" uniqKey="Tyson G">GW Tyson</name>
</author>
<author>
<name sortKey="Chapman, J" uniqKey="Chapman J">J Chapman</name>
</author>
<author>
<name sortKey="Hugenholtz, P" uniqKey="Hugenholtz P">P Hugenholtz</name>
</author>
<author>
<name sortKey="Allen, Ee" uniqKey="Allen E">EE Allen</name>
</author>
<author>
<name sortKey="Ram, Rj" uniqKey="Ram R">RJ Ram</name>
</author>
<author>
<name sortKey="Richardson, Pm" uniqKey="Richardson P">PM Richardson</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Buchfink, B" uniqKey="Buchfink B">B Buchfink</name>
</author>
<author>
<name sortKey="Xie, C" uniqKey="Xie C">C Xie</name>
</author>
<author>
<name sortKey="Huson, Dh" uniqKey="Huson D">DH Huson</name>
</author>
</analytic>
</biblStruct>
</listBibl>
</div1>
</back>
</TEI>
<pmc article-type="research-article">
<pmc-dir>properties open_access</pmc-dir>
<front>
<journal-meta>
<journal-id journal-id-type="nlm-ta">PLoS One</journal-id>
<journal-id journal-id-type="iso-abbrev">PLoS ONE</journal-id>
<journal-id journal-id-type="publisher-id">plos</journal-id>
<journal-id journal-id-type="pmc">plosone</journal-id>
<journal-title-group>
<journal-title>PLoS ONE</journal-title>
</journal-title-group>
<issn pub-type="epub">1932-6203</issn>
<publisher>
<publisher-name>Public Library of Science</publisher-name>
<publisher-loc>San Francisco, CA USA</publisher-loc>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="pmid">26561344</article-id>
<article-id pub-id-type="pmc">4641738</article-id>
<article-id pub-id-type="doi">10.1371/journal.pone.0142102</article-id>
<article-id pub-id-type="publisher-id">PONE-D-15-19467</article-id>
<article-categories>
<subj-group subj-group-type="heading">
<subject>Research Article</subject>
</subj-group>
</article-categories>
<title-group>
<article-title>COGNIZER: A Framework for Functional Annotation of Metagenomic Datasets</article-title>
<alt-title alt-title-type="running-head">COGNIZER: Framework for Functional Annotation of Metagenomes</alt-title>
</title-group>
<contrib-group>
<contrib contrib-type="author">
<name>
<surname>Bose</surname>
<given-names>Tungadri</given-names>
</name>
<xref ref-type="aff" rid="aff001"></xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Haque</surname>
<given-names>Mohammed Monzoorul</given-names>
</name>
<xref ref-type="aff" rid="aff001"></xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Reddy</surname>
<given-names>CVSK</given-names>
</name>
<xref ref-type="aff" rid="aff001"></xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Mande</surname>
<given-names>Sharmila S.</given-names>
</name>
<xref rid="cor001" ref-type="corresp">*</xref>
<xref ref-type="aff" rid="aff001"></xref>
</contrib>
</contrib-group>
<aff id="aff001">
<addr-line>Bio-Sciences R&amp;D Division, TCS Innovation Labs, Tata Consultancy Services Limited, 54-B, Hadapsar Industrial Estate, Pune, 411013, Maharashtra, India</addr-line>
</aff>
<contrib-group>
<contrib contrib-type="editor">
<name>
<surname>Raghava</surname>
<given-names>Gajendra P. S.</given-names>
</name>
<role>Editor</role>
<xref ref-type="aff" rid="edit1"></xref>
</contrib>
</contrib-group>
<aff id="edit1">
<addr-line>CSIR-Institute of Microbial Technology, INDIA</addr-line>
</aff>
<author-notes>
<fn fn-type="conflict" id="coi001">
<p>
<bold>Competing Interests: </bold>
The authors of this study are employees of a commercial company (Tata Consultancy Services Limited). However, this does not alter their adherence to PLOS ONE policies on sharing data and materials. The authors also declare that no competing interests exist. The specific roles of individual authors are articulated in the "author contributions" section.</p>
</fn>
<fn fn-type="con" id="contrib001">
<p>Conceived and designed the experiments: TB MMH SSM. Performed the experiments: TB MMH. Analyzed the data: TB MMH SSM. Contributed reagents/materials/analysis tools: TB CVSKR. Wrote the paper: TB MMH SSM. Algorithm implementation: TB CVSKR.</p>
</fn>
<corresp id="cor001">* E-mail:
<email>sharmila.mande@tcs.com</email>
</corresp>
</author-notes>
<pub-date pub-type="epub">
<day>11</day>
<month>11</month>
<year>2015</year>
</pub-date>
<pub-date pub-type="collection">
<year>2015</year>
</pub-date>
<volume>10</volume>
<issue>11</issue>
<elocation-id>e0142102</elocation-id>
<history>
<date date-type="received">
<day>5</day>
<month>5</month>
<year>2015</year>
</date>
<date date-type="accepted">
<day>16</day>
<month>10</month>
<year>2015</year>
</date>
</history>
<permissions>
<copyright-year>2015</copyright-year>
<copyright-holder>Bose et al</copyright-holder>
<license xlink:href="http://creativecommons.org/licenses/by/4.0/">
<license-p>This is an open access article distributed under the terms of the
<ext-link ext-link-type="uri" xlink:href="http://creativecommons.org/licenses/by/4.0/">Creative Commons Attribution License</ext-link>
, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited</license-p>
</license>
</permissions>
<self-uri content-type="pdf" xlink:type="simple" xlink:href="pone.0142102.pdf"></self-uri>
<abstract>
<sec id="sec001">
<title>Background</title>
<p>Recent advances in sequencing technologies have resulted in an unprecedented increase in the number of metagenomes being sequenced worldwide. Given their volume, functional annotation of metagenomic sequence datasets requires specialized computational tools and techniques. Despite their high accuracy, existing stand-alone functional annotation tools require end-users to perform compute-intensive homology searches of metagenomic datasets against multiple databases prior to functional analysis. Although web-based functional annotation servers address, to some extent, the problem of availability of compute resources, uploading and analyzing huge volumes of sequence data on a shared public web service has its own set of limitations. In this study, we present COGNIZER, a comprehensive stand-alone annotation framework that enables end-users to functionally annotate the sequences constituting metagenomic datasets. The COGNIZER framework provides multiple workflow options. A subset of these options employs a novel directed-search strategy that reduces the overall compute requirements for end-users. The COGNIZER framework also includes a cross-mapping database that enables end-users to simultaneously derive or infer KEGG, Pfam, GO, and SEED subsystem information from the COG annotations.</p>
</sec>
<sec id="sec002">
<title>Results</title>
<p>Validation experiments performed with real-world metagenomes and metatranscriptomes, generated using diverse sequencing technologies, indicate that the novel directed-search strategy employed in COGNIZER reduces the compute requirements without significant loss in annotation accuracy. A comparison of COGNIZER's results with pre-computed benchmark values indicates the reliability of the cross-mapping database employed in COGNIZER.</p>
</sec>
<sec id="sec003">
<title>Conclusion</title>
<p>The COGNIZER framework is capable of comprehensively annotating any metagenomic or metatranscriptomic dataset from varied sequencing platforms in functional terms. Multiple search options in COGNIZER provide end-users the flexibility of choosing a homology search protocol based on available compute resources. The cross-mapping database in COGNIZER is of high utility since it enables end-users to directly infer/derive KEGG, Pfam, GO, and SEED subsystem annotations from COG categorizations. Furthermore, availability of COGNIZER as a stand-alone scalable implementation is expected to make it a valuable annotation tool in the field of metagenomic research.</p>
</sec>
<sec id="sec004">
<title>Availability and Implementation</title>
<p>A Linux implementation of COGNIZER is freely available for download from the following links:
<ext-link ext-link-type="uri" xlink:href="http://metagenomics.atc.tcs.com/cognizer">http://metagenomics.atc.tcs.com/cognizer</ext-link>
,
<ext-link ext-link-type="uri" xlink:href="https://metagenomics.atc.tcs.com/function/cognizer">https://metagenomics.atc.tcs.com/function/cognizer</ext-link>
.</p>
</sec>
</abstract>
<funding-group>
<funding-statement>The authors [TB, MMH, CVSKR, and SSM] of this study declare that they are employees of Tata Consultancy Services Limited, a commercial company. The company has provided support for this study in the form of salaries to authors [TB, MMH, CVSKR, and SSM] but did not have any additional role in the study design, data collection and analysis, decision to publish, or preparation of the manuscript.</funding-statement>
</funding-group>
<counts>
<fig-count count="5"></fig-count>
<table-count count="2"></table-count>
<page-count count="16"></page-count>
</counts>
<custom-meta-group>
<custom-meta id="data-availability">
<meta-name>Data Availability</meta-name>
<meta-value>All relevant data are within the paper and its Supporting Information files. A Linux implementation of COGNIZER is freely available for download from the following links:
<ext-link ext-link-type="uri" xlink:href="http://metagenomics.atc.tcs.com/cognizer">http://metagenomics.atc.tcs.com/cognizer</ext-link>
;
<ext-link ext-link-type="uri" xlink:href="https://metagenomics.atc.tcs.com/function/cognizer">https://metagenomics.atc.tcs.com/function/cognizer</ext-link>
.</meta-value>
</custom-meta>
</custom-meta-group>
</article-meta>
<notes>
<title>Data Availability</title>
<p>All relevant data are within the paper and its Supporting Information files. A Linux implementation of COGNIZER is freely available for download from the following links:
<ext-link ext-link-type="uri" xlink:href="http://metagenomics.atc.tcs.com/cognizer">http://metagenomics.atc.tcs.com/cognizer</ext-link>
;
<ext-link ext-link-type="uri" xlink:href="https://metagenomics.atc.tcs.com/function/cognizer">https://metagenomics.atc.tcs.com/function/cognizer</ext-link>
.</p>
</notes>
</front>
<body>
<sec sec-type="intro" id="sec005">
<title>Introduction</title>
<p>Recent advances in Next Generation Sequencing (NGS) techniques have facilitated large scale sequencing of genomes of microbial communities (also referred to as metagenomes) residing in diverse ecological niches. Sequencing data generated from metagenomes typically consists of millions of short nucleotide fragments (also referred to as 'reads'). One of the important steps in metagenomic data analysis pertains to estimation of the functional potential of various protein coding genes in the given dataset. Although homology-based approaches (like BLAST) provide the most accurate results with respect to functional annotation [
<xref rid="pone.0142102.ref001" ref-type="bibr">1</xref>
], the compute requirements of these approaches are enormous. Approaches employing additional heuristic steps (as compared to BLAST) have therefore been developed to address the huge computational time required for protein database searches. For example, tools like RAPSearch [
<xref rid="pone.0142102.ref002" ref-type="bibr">2</xref>
] and PAUDA [
<xref rid="pone.0142102.ref003" ref-type="bibr">3</xref>
] employ 'inexact' search techniques to significantly reduce the computational costs typically associated with sequence similarity-based searches. Although these tools outperform BLAST-like approaches in terms of execution speed, the overall quality of their results is, in several instances, not on par with that obtained using non-heuristic approaches [
<xref rid="pone.0142102.ref003" ref-type="bibr">3</xref>
].</p>
<p>MEGAN [
<xref rid="pone.0142102.ref004" ref-type="bibr">4</xref>
] and FANTOM [
<xref rid="pone.0142102.ref005" ref-type="bibr">5</xref>
] are among the few applications that are available for stand-alone functional analysis of metagenomic reads. However, these GUI-based tools are primarily designed for analyzing pre-computed BLASTx results. End-users are required to perform compute-intensive BLASTx searches before using these tools. For users with access to limited computing resources, this becomes a time-consuming step. For example, a previous study estimated that a computing facility with a 1000-CPU compute cluster would require approximately 30 days to complete a BLASTx search of a 20 GB metagenome against the NCBI-nr database [
<xref rid="pone.0142102.ref006" ref-type="bibr">6</xref>
].</p>
<p>Web-servers like MG-RAST [
<xref rid="pone.0142102.ref007" ref-type="bibr">7</xref>
], METAREP [
<xref rid="pone.0142102.ref008" ref-type="bibr">8</xref>
], CAMERA [
<xref rid="pone.0142102.ref009" ref-type="bibr">9</xref>
], CoMet [
<xref rid="pone.0142102.ref010" ref-type="bibr">10</xref>
], IMG/M [
<xref rid="pone.0142102.ref011" ref-type="bibr">11</xref>
], etc. provide an alternative means for end-users intending to perform functional annotation of metagenomic datasets. Although these web-servers provide a range of utilities for functional annotation, there is a limit on the volume of reads that can be uploaded/analyzed by a specific end-user. Moreover, due to enormous demand, jobs submitted to these servers are typically processed according to a priority listing (which, in turn, is governed by a variety of factors). Although they may appear trivial, these limitations become a major hindrance for end-users intending to analyze huge datasets. For instance, the total size of (Whole Genome Sequencing) datasets in recent metagenomic studies pertaining to diabetes [
<xref rid="pone.0142102.ref012" ref-type="bibr">12</xref>
,
<xref rid="pone.0142102.ref013" ref-type="bibr">13</xref>
] is in the range of 300–400 gigabytes. Uploading such a huge volume of data to any of the public annotation servers (and processing it there) may often prove infeasible for most end-users.</p>
<p>On a different note, protocols for functional annotation of individual sequences constituting metagenomic datasets are aimed at finding (a) COGs, i.e. the clusters of orthologous genes [
<xref rid="pone.0142102.ref014" ref-type="bibr">14</xref>
], (b) KEGG pathway mappings [
<xref rid="pone.0142102.ref015" ref-type="bibr">15</xref>
,
<xref rid="pone.0142102.ref016" ref-type="bibr">16</xref>
], (c) SEED subsystems [
<xref rid="pone.0142102.ref017" ref-type="bibr">17</xref>
], (d) Gene Ontology (GO) [
<xref rid="pone.0142102.ref018" ref-type="bibr">18</xref>
], and (e) Pfam domain families [
<xref rid="pone.0142102.ref019" ref-type="bibr">19</xref>
]. Although existing stand-alone/web-based tools (mentioned above) provide functional annotations in terms of one or more of the above-mentioned functional categories (viz., COG, KEGG, SEED, GO, Pfam), with the exception of the MG-RAST web-server, none of them provides functional annotations with respect to 'all' the mentioned categories. End-users of MG-RAST can obtain functional annotation of their uploaded datasets in terms of multiple functional databases like IMG, TrEMBL, PATRIC, SwissProt, Genbank NR, M5NR, SEED, RefSeq, eggNOG, KEGG, etc. However, as mentioned previously, the typical problems associated with uploading and analyzing huge volumes of sequence data on a shared public web-service remain a major concern. In summary, the two major limitations of existing tools for functional annotation of metagenomic datasets are (1) the requirement of computationally expensive homology-based searches prior to the use of stand-alone tools and (2) issues pertaining to the usability (with respect to upload limit, analysis turn-around time, data privacy, etc.) of web-based services.</p>
<p>In this study, we present COGNIZER, a stand-alone framework that can be employed for functional annotation of metagenomic datasets. The framework provides four annotation workflow options (schematically represented in
<xref rid="pone.0142102.g001" ref-type="fig">Fig 1</xref>
). Each option employs a distinct 'homology-search' strategy requiring varying levels of compute power. These search options enable end-users to choose a homology-search protocol based on the compute resources at their disposal. End-users having access to huge compute power can employ the first and second options, i.e. the BLASTx and RAPSearch search options, respectively. When computing resources are limited, users can deploy the COGNIZER framework with options 3 or 4. These two options employ a 'customized' COG database and use a novel directed-search strategy that can help in reducing the time required for database searches. The results of the 'homology-search' step (obtained using one of the above four options) are subsequently processed using the information present in COGNIZER's customized cross-mapping database. This step enables end-users to obtain functional profiles of a given metagenome (with respect to multiple functional categories viz., COG, KEGG, SEED, GO, and Pfam) by performing a single database search.</p>
<fig id="pone.0142102.g001" orientation="portrait" position="float">
<object-id pub-id-type="doi">10.1371/journal.pone.0142102.g001</object-id>
<label>Fig 1</label>
<caption>
<title>Workflow options in the COGNIZER framework.</title>
<p>A schematic representation of the four workflow options in the COGNIZER framework.</p>
</caption>
<graphic xlink:href="pone.0142102.g001"></graphic>
</fig>
</sec>
<sec sec-type="materials|methods" id="sec006">
<title>Methods</title>
<p>
<xref rid="pone.0142102.g001" ref-type="fig">Fig 1</xref>
schematically represents the four annotation workflow options in the COGNIZER framework. Each option involves two phases, namely, a ‘homology-search’ phase and a 'mapping' phase, the latter phase being common to all four workflow options. In the 'homology-search' phase, sequences in the metagenomic dataset (to be analyzed) are queried against the COG database [
<xref rid="pone.0142102.ref014" ref-type="bibr">14</xref>
]. Options 1–4 differ with respect to the employed 'homology-search' strategy as well as the format of the COG database. The subsequent 'mapping' phase involves inferring KEGG, SEED, GO, and Pfam annotations from the COG annotations obtained in the 'homology-search' phase. A customized cross-mapping database is employed for this purpose. The following sections describe (a) the structure of the COG database utilized in each workflow option, (b) the protocol used for creating the cross-mapping database, and (c) the overall algorithm employed (in each workflow) for obtaining functional annotations.</p>
<sec id="sec007">
<title>(Customized) COG database</title>
<p>The COG database (available for download at
<ext-link ext-link-type="uri" xlink:href="http://www.ncbi.nlm.nih.gov/COG/">http://www.ncbi.nlm.nih.gov/COG/</ext-link>
) consists of approximately 200,000 protein sequences categorized into various COG groups. All protein sequences in the COG database are tagged to at least one of the 25 major functional COG categories [
<xref rid="pone.0142102.ref014" ref-type="bibr">14</xref>
]. This database, in its original form, is employed in options 1 and 2. For options 3 and 4, a 'customized' version of this database was created using the following procedure (
<xref rid="pone.0142102.g002" ref-type="fig">Fig 2</xref>
). Sequences in each functional COG category were first clustered using ClustalW2 [
<xref rid="pone.0142102.ref020" ref-type="bibr">20</xref>
] in default mode. This resulted in one or more clusters for each COG category. Subsequently, the longest protein sequence from each cluster was chosen and tagged to indicate the COG category to which it belonged. All (tagged) representative sequences were pooled together to form the 'customized' COG database. The customized database thus generated is approximately one-sixth the size of the original COG database.</p>
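<p>The representative-selection step described above can be illustrated with a minimal Python sketch. It assumes that cluster memberships have already been computed by an external tool; the function name, the dictionary inputs, and the "id|category" tag format are hypothetical and are not taken from the COGNIZER implementation.</p>
<preformat preformat-type="code"><![CDATA[
```python
from collections import defaultdict


def build_customized_db(sequences, cluster_of, category_of):
    """Pick the longest sequence of each cluster and tag it with its COG category.

    sequences:   dict of sequence id -> protein sequence
    cluster_of:  dict of sequence id -> cluster id (hypothetical clustering output)
    category_of: dict of sequence id -> COG functional category letter
    """
    clusters = defaultdict(list)
    for seq_id in sequences:
        # Clustering is done within a category, so key on (category, cluster).
        clusters[(category_of[seq_id], cluster_of[seq_id])].append(seq_id)

    customized = {}
    for (category, _cluster), members in clusters.items():
        # The longest member represents the cluster; ties fall back to the id
        # so that the result is deterministic (an assumption of this sketch).
        rep = max(members, key=lambda s: (len(sequences[s]), s))
        # Hypothetical tag format: "sequence id|category".
        customized[f"{rep}|{category}"] = sequences[rep]
    return customized
```
]]></preformat>
<p>Under these assumptions, the number of entries in the returned dictionary equals the total number of clusters across all categories, which is how the customized database ends up smaller than the original one.</p>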
<fig id="pone.0142102.g002" orientation="portrait" position="float">
<object-id pub-id-type="doi">10.1371/journal.pone.0142102.g002</object-id>
<label>Fig 2</label>
<caption>
<title>Creation of the 'customized' COG database.</title>
<p>A schematic diagram illustrating the steps involved in the creation of the 'customized' COG database.</p>
</caption>
<graphic xlink:href="pone.0142102.g002"></graphic>
</fig>
</sec>
<sec id="sec008">
<title>Cross-mapping database</title>
<p>A schematic representation of the steps involved in the creation of the cross-mapping database is provided in
<xref rid="pone.0142102.g003" ref-type="fig">Fig 3</xref>
. The following sequence-search/data-mining approaches were employed for building a database containing cross-relationships between COG and other functional databases. Mappings between COG and KEGG identifiers were obtained by (a) mining COG-KEGG mapping information from the iPath database [
<xref rid="pone.0142102.ref021" ref-type="bibr">21</xref>
], and (b) performing BLAST-based searches of protein sequences from the COG database against the sequences from KEGG databases (using the CAMERA web-service). Information from both these sources was collated into a unified mapping file. In cases where the mappings from the two sources did not match, the mapping obtained using the BLAST approach was given preference. COG and Pfam identifier mappings were obtained by comparing COG database sequences against the Pfam database. This comparison was done using the hmmscan tool from the HMMER package [
<xref rid="pone.0142102.ref022" ref-type="bibr">22</xref>
]. Mappings between GO and Pfam annotations were obtained from the GO website (
<ext-link ext-link-type="uri" xlink:href="http://www.geneontology.org/external2go/pfam2go">http://www.geneontology.org/external2go/pfam2go</ext-link>
). These mappings were further processed to infer cross-relationships between GO and COG entries. Sequence homology searches were used for obtaining the SEED-COG mappings.</p>
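<p>Two of the collation rules described above can be sketched in Python as follows. The first function applies the stated preference for BLAST-derived COG-KEGG mappings over iPath-derived ones. The second chains COG-Pfam and Pfam-GO mappings to obtain COG-GO relationships; this chaining is an assumption about how the pfam2go mappings were "further processed", and all function names and data structures are hypothetical.</p>
<preformat preformat-type="code"><![CDATA[
```python
def merge_cog_kegg(ipath_map, blast_map):
    """Collate two COG->KEGG mappings; on disagreement the BLAST entry wins."""
    merged = dict(ipath_map)
    merged.update(blast_map)  # BLAST entries overwrite conflicting iPath entries
    return merged


def infer_cog_to_go(cog_to_pfam, pfam_to_go):
    """Chain COG->Pfam and Pfam->GO mappings into a COG->GO mapping.

    Assumption of this sketch: a COG inherits every GO term linked to any
    of its Pfam families. COGs without any GO term are left out.
    """
    cog_to_go = {}
    for cog, pfams in cog_to_pfam.items():
        go_terms = set()
        for pfam in pfams:
            go_terms.update(pfam_to_go.get(pfam, ()))
        if go_terms:
            cog_to_go[cog] = go_terms
    return cog_to_go
```
]]></preformat>
<p>Keeping each mapping as a plain dictionary means the later 'mapping' phase reduces to constant-time lookups keyed on the COG identifier.</p>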
<fig id="pone.0142102.g003" orientation="portrait" position="float">
<object-id pub-id-type="doi">10.1371/journal.pone.0142102.g003</object-id>
<label>Fig 3</label>
<caption>
<title>Procedure adopted for obtaining cross-mapping information.</title>
<p>Procedure adopted for obtaining cross-mapping information amongst sequences in the COG and the other protein functional databases (KEGG, Pfam, GO and SEED).</p>
</caption>
<graphic xlink:href="pone.0142102.g003"></graphic>
</fig>
</sec>
<sec id="sec009">
<title>Algorithm</title>
<p>Details of the four workflow options in the COGNIZER method are as follows. In option 1, the BLASTx method is employed (in the homology-search phase) for querying reads constituting metagenomic datasets. The search is performed against all sequences in the COG database. The query sequence is assigned to the COG category that corresponds to the highest scoring BLASTx hit whose e-value is lower than a user-specified threshold. In the subsequent 'mapping' phase, for each query, functional annotation with respect to other databases viz., KEGG, SEED, GO, and Pfam is inferred using COGNIZER's cross-mapping database. Steps in option 2 are similar to those in option 1 except for the use of the RAPSearch algorithm instead of BLASTx (in the homology-search phase).</p>
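<p>The assignment rule of options 1 and 2 (highest-scoring hit below an e-value threshold, followed by a lookup in the cross-mapping database) can be sketched as below. The tuple layout of the hits, the default threshold of 1e-5, and the function names are assumptions made for illustration; parsing of the actual search output is not shown.</p>
<preformat preformat-type="code"><![CDATA[
```python
def assign_best_hits(hits, max_evalue=1e-5):
    """Assign each query the COG of its highest-scoring hit under the cutoff.

    hits: iterable of (query_id, cog_id, evalue, bit_score) tuples, where
          cog_id is the COG of the matched database sequence.
    max_evalue: user-specified threshold (the default here is an assumption).
    """
    best = {}
    for query, cog_id, evalue, score in hits:
        if evalue >= max_evalue:
            continue  # keep only hits whose e-value is lower than the threshold
        if query not in best or score > best[query][1]:
            best[query] = (cog_id, score)
    return {query: cog_id for query, (cog_id, _score) in best.items()}


def annotate(assignments, cross_map):
    """Mapping phase: look up the other annotations for each assigned COG.

    cross_map: dict of cog_id -> dict such as {"KEGG": ..., "Pfam": ...}.
    """
    return {query: cross_map.get(cog, {}) for query, cog in assignments.items()}
```
]]></preformat>
<p>Queries with no hit below the threshold are simply absent from the result, which mirrors leaving such reads unannotated.</p>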
<p>
<xref rid="pone.0142102.g004" ref-type="fig">Fig 4</xref>
illustrates the overall workflow for options 3 and 4 of COGNIZER. These options work in the following manner. In the first step, query sequences in the input metagenomic dataset are partitioned into subsets by performing similarity searches against sequences in the 'customized' COG database. This yields 25 query subsets, each consisting of sequences that have similarity to one of the 25 major COG categories. In other words, step 1 assigns a tentative high-level COG classification to each query sequence. In step 2, sequences in each query subset (tagged in step 1 to a COG category) are searched only against the subset of COG database sequences that belong to the same COG category. This directed-search approach (wherein subsets of query sequences are searched only against the respective database partitions) therefore significantly reduces the search space, and consequently decreases the overall compute requirements. At the end of step 2, sequences in the query dataset are annotated in terms of COG functional categories. In step 3, the pre-computed cross-mapping database is employed for extrapolating the obtained COG annotations to directly infer functional annotations corresponding to KEGG, Pfam, GO, and SEED subsystem databases. This extrapolation step does not involve compute-intensive (alignment-based) searches. It may be noted that while option 3 employs BLASTx, option 4 uses the RAPSearch algorithm in steps 1 and 2.</p>
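<p>Steps 1 and 2 of the directed-search strategy can be outlined with the following Python sketch. The aligner is abstracted as a caller-supplied function, standing in for BLASTx or RAPSearch; the function name, argument layout, and the convention that the aligner returns the label of the best hit (or None) are all assumptions of this sketch rather than details of the COGNIZER code.</p>
<preformat preformat-type="code"><![CDATA[
```python
from collections import defaultdict


def directed_search(queries, representatives, full_db, search):
    """Two-step search: bin queries by category, then search only that partition.

    queries:         dict of query id -> sequence
    representatives: list of (sequence, category) from the customized database
    full_db:         dict of category -> list of (sequence, cog_id)
    search:          callable(query_seq, [(seq, label), ...]) -> best label or None
                     (a stand-in for BLASTx or RAPSearch)
    """
    # Step 1: tentative high-level COG category for every query.
    bins = defaultdict(list)
    for qid, seq in queries.items():
        category = search(seq, representatives)
        if category is not None:
            bins[category].append(qid)

    # Step 2: search each bin only against database sequences of its category.
    annotations = {}
    for category, qids in bins.items():
        partition = full_db.get(category, [])
        for qid in qids:
            cog = search(queries[qid], partition)
            if cog is not None:
                annotations[qid] = cog
    return annotations
```
]]></preformat>
<p>In this outline each query is compared with the representative set plus a single category partition instead of the whole database, which is where the reduction in search space comes from. A query that is binned into the wrong category in step 1 cannot be recovered in step 2.</p>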
<fig id="pone.0142102.g004" orientation="portrait" position="float">
<object-id pub-id-type="doi">10.1371/journal.pone.0142102.g004</object-id>
<label>Fig 4</label>
<caption>
<title>Workflows adopted in options 3 and 4 of the COGNIZER framework.</title>
<p>A flow-diagram depicting the steps adopted in workflow options 3 and 4 of the COGNIZER framework.</p>
</caption>
<graphic xlink:href="pone.0142102.g004"></graphic>
</fig>
</sec>
<sec id="sec010">
<title>Validation datasets</title>
<p>The performance of the COGNIZER framework was evaluated using 21 real-world datasets comprising (a) 3 hyper-saline saltern metagenomes [
<xref rid="pone.0142102.ref023" ref-type="bibr">23</xref>
], (b) 7 metatranscriptomic datasets [
<xref rid="pone.0142102.ref024" ref-type="bibr">24</xref>
], (c) 2 gut metagenomes from healthy and malnourished children [
<xref rid="pone.0142102.ref025" ref-type="bibr">25</xref>
], (d) 8 oral metagenomes [
<xref rid="pone.0142102.ref026" ref-type="bibr">26</xref>
], and (e) 1 acid mine drainage (AMD) metagenome [
<xref rid="pone.0142102.ref027" ref-type="bibr">27</xref>
]. These datasets were chosen since they fairly represented the typical characteristics of sequence data obtained using three well-known sequencing technologies viz. Illumina, Roche-454 and Sanger.</p>
</sec>
<sec id="sec011">
<title>Validation strategy</title>
<p>COGNIZER employs a cross-mapping database for deriving functional annotations corresponding to KEGG, SEED, GO, and Pfam from the COG annotations. It is therefore essential to verify the accuracy of the derived annotations. Furthermore, since some options available in COGNIZER utilize a ‘customized’ COG database (for reducing execution time), validation of the COG annotations obtained using these options is also required. Therefore, COG, KEGG, SEED, and Pfam annotations (for sequences in individual datasets) were obtained by ‘directly’ performing requisite homology searches against all sequences in the respective functional database. Annotations obtained in this manner were considered as ‘benchmarks’. Given that GO mappings in the COGNIZER framework were directly obtained from the GO website, the benchmark validation procedure for GO annotations was not performed. The results obtained with the four workflow options (in COGNIZER) were then compared against the pre-computed benchmark values.</p>
<p>The performance of the COGNIZER framework was evaluated in terms of (a) execution time, (b) Positive Predictive Value (PPV), and (c) Negative Predictive Value (NPV). The latter two metrics were calculated as follows:
<disp-formula id="pone.0142102.e001">
<alternatives>
<graphic xlink:href="pone.0142102.e001.jpg" id="pone.0142102.e001g" position="anchor" mimetype="image" orientation="portrait"></graphic>
<mml:math id="M1">
<mml:mrow>
<mml:mi mathvariant="italic">PPV</mml:mi>
<mml:mo>=</mml:mo>
<mml:mi mathvariant="italic">number of true positives</mml:mi>
<mml:mo>/</mml:mo>
<mml:mspace width="0.25em"></mml:mspace>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mi mathvariant="italic">number of true positives</mml:mi>
<mml:mo>+</mml:mo>
<mml:mi mathvariant="italic">number of false positives</mml:mi>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:math>
</alternatives>
</disp-formula>
<disp-formula id="pone.0142102.e002">
<alternatives>
<graphic xlink:href="pone.0142102.e002.jpg" id="pone.0142102.e002g" position="anchor" mimetype="image" orientation="portrait"></graphic>
<mml:math id="M2">
<mml:mrow>
<mml:mi mathvariant="italic">NPV</mml:mi>
<mml:mo>=</mml:mo>
<mml:mi mathvariant="italic">number of true negatives</mml:mi>
<mml:mo>/</mml:mo>
<mml:mspace width="0.25em"></mml:mspace>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mi mathvariant="italic">number of true negatives</mml:mi>
<mml:mo>+</mml:mo>
<mml:mi mathvariant="italic">number of false negatives</mml:mi>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:math>
</alternatives>
</disp-formula>
</p>
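<p>The two formulas above translate directly into code. The sketch below also shows one possible way of deriving the four counts for a single functional category by comparing predicted annotations with benchmark annotations; this per-category counting scheme is an assumption, since the exact definition of true/false positives and negatives is not spelled out here, and the function names are hypothetical.</p>
<preformat preformat-type="code"><![CDATA[
```python
def ppv(true_pos, false_pos):
    """Positive predictive value: TP / (TP + FP)."""
    return true_pos / (true_pos + false_pos)


def npv(true_neg, false_neg):
    """Negative predictive value: TN / (TN + FN)."""
    return true_neg / (true_neg + false_neg)


def confusion_counts(predicted, benchmark, category):
    """Count (TP, FP, TN, FN) for one category over all reads seen in either input.

    predicted, benchmark: dict of read id -> assigned category.
    Assumed scheme: a read is 'positive' if it is assigned to `category`.
    """
    tp = fp = tn = fn = 0
    for read in benchmark.keys() | predicted.keys():
        in_pred = predicted.get(read) == category
        in_bench = benchmark.get(read) == category
        if in_pred and in_bench:
            tp += 1
        elif in_pred:
            fp += 1
        elif in_bench:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn
```
]]></preformat>
<p>Both ratios are undefined when their denominator is zero (for example, a category to which no read is assigned), so a caller would need to handle that case separately.</p>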
</sec>
</sec>
<sec id="sec012">
<title>Results and Discussion</title>
<p>The performance of various options of COGNIZER, in terms of PPV and NPV, is summarized in
<xref rid="pone.0142102.t001" ref-type="table">Table 1</xref>
. Results obtained with option 1 of COGNIZER (i.e. BLASTx followed by the mapping step) show high (average >0.98) PPV and NPV values relative to the benchmark annotations. These results confirm the reliability of the cross-mapping database employed in the COGNIZER framework. A somewhat lower accuracy (average >0.94) is observed with option 2, which adopts the RAPSearch algorithm in the search phase. This is expected since RAPSearch employs a heuristic 'reduced amino acid alphabet' based search approach for reducing the associated computational costs. In this context, it is interesting to note that the marginal gain in annotation accuracy of BLASTx (option 1) over RAPSearch (option 2) comes at a huge computational cost. As observed in
<xref rid="pone.0142102.t002" ref-type="table">Table 2</xref>
, RAPSearch takes only one-fourth of the processing time required by BLASTx.</p>
<table-wrap id="pone.0142102.t001" orientation="portrait" position="float">
<object-id pub-id-type="doi">10.1371/journal.pone.0142102.t001</object-id>
<label>Table 1</label>
<caption>
<title>Evaluation of COGNIZER's annotation results in terms of positive predictive value (PPV) and negative predictive value (NPV).</title>
</caption>
<alternatives>
<graphic id="pone.0142102.t001g" xlink:href="pone.0142102.t001"></graphic>
<table frame="hsides" rules="groups">
<colgroup span="1">
<col align="left" valign="middle" span="1"></col>
<col align="left" valign="middle" span="1"></col>
<col align="left" valign="middle" span="1"></col>
<col align="left" valign="middle" span="1"></col>
<col align="left" valign="middle" span="1"></col>
<col align="left" valign="middle" span="1"></col>
<col align="left" valign="middle" span="1"></col>
<col align="left" valign="middle" span="1"></col>
<col align="left" valign="middle" span="1"></col>
<col align="left" valign="middle" span="1"></col>
<col align="left" valign="middle" span="1"></col>
</colgroup>
<thead>
<tr>
<th align="center" rowspan="1" colspan="1">Sequencing Platform</th>
<th align="center" rowspan="1" colspan="1">Dataset
<xref rid="t001fn001" ref-type="table-fn">
<sup>#</sup>
</xref>
</th>
<th align="center" rowspan="1" colspan="1">Option</th>
<th colspan="2" align="center" rowspan="1">COG Annotation</th>
<th colspan="2" align="center" rowspan="1">KEGG Annotation</th>
<th colspan="2" align="center" rowspan="1">Pfam Annotation</th>
<th colspan="2" align="center" rowspan="1">SEED Annotation</th>
</tr>
<tr>
<th align="left" rowspan="1" colspan="1"></th>
<th align="left" rowspan="1" colspan="1"></th>
<th align="left" rowspan="1" colspan="1"></th>
<th align="center" rowspan="1" colspan="1">PPV</th>
<th align="center" rowspan="1" colspan="1">NPV</th>
<th align="center" rowspan="1" colspan="1">PPV</th>
<th align="center" rowspan="1" colspan="1">NPV</th>
<th align="center" rowspan="1" colspan="1">PPV</th>
<th align="center" rowspan="1" colspan="1">NPV</th>
<th align="center" rowspan="1" colspan="1">PPV</th>
<th align="center" rowspan="1" colspan="1">NPV</th>
</tr>
</thead>
<tbody>
<tr>
<td align="center" rowspan="1" colspan="1">Illumina (Average Read Length: 100 bp)</td>
<td align="center" rowspan="1" colspan="1">High Salt Metagenome (35446)</td>
<td align="center" rowspan="1" colspan="1">1</td>
<td align="center" rowspan="1" colspan="1">1.00</td>
<td align="center" rowspan="1" colspan="1">1.00</td>
<td align="center" rowspan="1" colspan="1">0.99</td>
<td align="center" rowspan="1" colspan="1">0.99</td>
<td align="center" rowspan="1" colspan="1">0.99</td>
<td align="center" rowspan="1" colspan="1">0.99</td>
<td align="center" rowspan="1" colspan="1">0.99</td>
<td align="center" rowspan="1" colspan="1">0.99</td>
</tr>
<tr>
<td align="left" rowspan="1" colspan="1"></td>
<td align="left" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1">2</td>
<td align="center" rowspan="1" colspan="1">0.97</td>
<td align="center" rowspan="1" colspan="1">1.00</td>
<td align="center" rowspan="1" colspan="1">0.96</td>
<td align="center" rowspan="1" colspan="1">0.99</td>
<td align="center" rowspan="1" colspan="1">0.96</td>
<td align="center" rowspan="1" colspan="1">0.99</td>
<td align="center" rowspan="1" colspan="1">0.96</td>
<td align="center" rowspan="1" colspan="1">0.99</td>
</tr>
<tr>
<td align="left" rowspan="1" colspan="1"></td>
<td align="left" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1">3</td>
<td align="center" rowspan="1" colspan="1">0.77</td>
<td align="center" rowspan="1" colspan="1">0.98</td>
<td align="center" rowspan="1" colspan="1">0.75</td>
<td align="center" rowspan="1" colspan="1">0.97</td>
<td align="center" rowspan="1" colspan="1">0.75</td>
<td align="center" rowspan="1" colspan="1">0.97</td>
<td align="center" rowspan="1" colspan="1">0.79</td>
<td align="center" rowspan="1" colspan="1">0.98</td>
</tr>
<tr>
<td align="left" rowspan="1" colspan="1"></td>
<td align="left" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1">4</td>
<td align="center" rowspan="1" colspan="1">0.76</td>
<td align="center" rowspan="1" colspan="1">0.98</td>
<td align="center" rowspan="1" colspan="1">0.74</td>
<td align="center" rowspan="1" colspan="1">0.97</td>
<td align="center" rowspan="1" colspan="1">0.75</td>
<td align="center" rowspan="1" colspan="1">0.96</td>
<td align="center" rowspan="1" colspan="1">0.78</td>
<td align="center" rowspan="1" colspan="1">0.97</td>
</tr>
<tr>
<td align="left" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1">Medium Salt Metagenome (38929)</td>
<td align="center" rowspan="1" colspan="1">1</td>
<td align="center" rowspan="1" colspan="1">1.00</td>
<td align="center" rowspan="1" colspan="1">1.00</td>
<td align="center" rowspan="1" colspan="1">0.99</td>
<td align="center" rowspan="1" colspan="1">0.99</td>
<td align="center" rowspan="1" colspan="1">0.99</td>
<td align="center" rowspan="1" colspan="1">0.99</td>
<td align="center" rowspan="1" colspan="1">0.99</td>
<td align="center" rowspan="1" colspan="1">0.99</td>
</tr>
<tr>
<td align="left" rowspan="1" colspan="1"></td>
<td align="left" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1">2</td>
<td align="center" rowspan="1" colspan="1">0.98</td>
<td align="center" rowspan="1" colspan="1">1.00</td>
<td align="center" rowspan="1" colspan="1">0.97</td>
<td align="center" rowspan="1" colspan="1">0.99</td>
<td align="center" rowspan="1" colspan="1">0.97</td>
<td align="center" rowspan="1" colspan="1">0.99</td>
<td align="center" rowspan="1" colspan="1">0.96</td>
<td align="center" rowspan="1" colspan="1">0.99</td>
</tr>
<tr>
<td align="left" rowspan="1" colspan="1"></td>
<td align="left" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1">3</td>
<td align="center" rowspan="1" colspan="1">0.79</td>
<td align="center" rowspan="1" colspan="1">0.97</td>
<td align="center" rowspan="1" colspan="1">0.76</td>
<td align="center" rowspan="1" colspan="1">0.96</td>
<td align="center" rowspan="1" colspan="1">0.78</td>
<td align="center" rowspan="1" colspan="1">0.96</td>
<td align="center" rowspan="1" colspan="1">0.80</td>
<td align="center" rowspan="1" colspan="1">0.97</td>
</tr>
<tr>
<td align="left" rowspan="1" colspan="1"></td>
<td align="left" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1">4</td>
<td align="center" rowspan="1" colspan="1">0.78</td>
<td align="center" rowspan="1" colspan="1">0.96</td>
<td align="center" rowspan="1" colspan="1">0.76</td>
<td align="center" rowspan="1" colspan="1">0.95</td>
<td align="center" rowspan="1" colspan="1">0.78</td>
<td align="center" rowspan="1" colspan="1">0.95</td>
<td align="center" rowspan="1" colspan="1">0.80</td>
<td align="center" rowspan="1" colspan="1">0.96</td>
</tr>
<tr>
<td align="left" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1">Low Salt Metagenome (34296)</td>
<td align="center" rowspan="1" colspan="1">1</td>
<td align="center" rowspan="1" colspan="1">1.00</td>
<td align="center" rowspan="1" colspan="1">1.00</td>
<td align="center" rowspan="1" colspan="1">0.99</td>
<td align="center" rowspan="1" colspan="1">0.99</td>
<td align="center" rowspan="1" colspan="1">0.99</td>
<td align="center" rowspan="1" colspan="1">0.99</td>
<td align="center" rowspan="1" colspan="1">0.99</td>
<td align="center" rowspan="1" colspan="1">0.99</td>
</tr>
<tr>
<td align="left" rowspan="1" colspan="1"></td>
<td align="left" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1">2</td>
<td align="center" rowspan="1" colspan="1">0.98</td>
<td align="center" rowspan="1" colspan="1">1.00</td>
<td align="center" rowspan="1" colspan="1">0.97</td>
<td align="center" rowspan="1" colspan="1">0.99</td>
<td align="center" rowspan="1" colspan="1">0.97</td>
<td align="center" rowspan="1" colspan="1">0.99</td>
<td align="center" rowspan="1" colspan="1">0.96</td>
<td align="center" rowspan="1" colspan="1">0.99</td>
</tr>
<tr>
<td align="left" rowspan="1" colspan="1"></td>
<td align="left" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1">3</td>
<td align="center" rowspan="1" colspan="1">0.81</td>
<td align="center" rowspan="1" colspan="1">0.99</td>
<td align="center" rowspan="1" colspan="1">0.78</td>
<td align="center" rowspan="1" colspan="1">0.98</td>
<td align="center" rowspan="1" colspan="1">0.79</td>
<td align="center" rowspan="1" colspan="1">0.98</td>
<td align="center" rowspan="1" colspan="1">0.79</td>
<td align="center" rowspan="1" colspan="1">0.98</td>
</tr>
<tr>
<td align="left" rowspan="1" colspan="1"></td>
<td align="left" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1">4</td>
<td align="center" rowspan="1" colspan="1">0.79</td>
<td align="center" rowspan="1" colspan="1">0.99</td>
<td align="center" rowspan="1" colspan="1">0.77</td>
<td align="center" rowspan="1" colspan="1">0.97</td>
<td align="center" rowspan="1" colspan="1">0.76</td>
<td align="center" rowspan="1" colspan="1">0.98</td>
<td align="center" rowspan="1" colspan="1">0.77</td>
<td align="center" rowspan="1" colspan="1">0.98</td>
</tr>
<tr>
<td align="center" rowspan="1" colspan="1">Illumina Metatranscriptomic Datasets (Average Read Length: 209 bp)</td>
<td align="center" rowspan="1" colspan="1">SRR397002 (587272)</td>
<td align="center" rowspan="1" colspan="1">1</td>
<td align="center" rowspan="1" colspan="1">1.00</td>
<td align="center" rowspan="1" colspan="1">1.00</td>
<td align="center" rowspan="1" colspan="1">0.99</td>
<td align="center" rowspan="1" colspan="1">0.99</td>
<td align="center" rowspan="1" colspan="1">0.99</td>
<td align="center" rowspan="1" colspan="1">0.99</td>
<td align="center" rowspan="1" colspan="1">0.99</td>
<td align="center" rowspan="1" colspan="1">0.99</td>
</tr>
<tr>
<td align="left" rowspan="1" colspan="1"></td>
<td align="left" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1">2</td>
<td align="center" rowspan="1" colspan="1">0.95</td>
<td align="center" rowspan="1" colspan="1">0.97</td>
<td align="center" rowspan="1" colspan="1">0.90</td>
<td align="center" rowspan="1" colspan="1">0.89</td>
<td align="center" rowspan="1" colspan="1">0.94</td>
<td align="center" rowspan="1" colspan="1">0.95</td>
<td align="center" rowspan="1" colspan="1">0.93</td>
<td align="center" rowspan="1" colspan="1">0.93</td>
</tr>
<tr>
<td align="left" rowspan="1" colspan="1"></td>
<td align="left" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1">3</td>
<td align="center" rowspan="1" colspan="1">0.92</td>
<td align="center" rowspan="1" colspan="1">0.94</td>
<td align="center" rowspan="1" colspan="1">0.82</td>
<td align="center" rowspan="1" colspan="1">0.85</td>
<td align="center" rowspan="1" colspan="1">0.90</td>
<td align="center" rowspan="1" colspan="1">0.92</td>
<td align="center" rowspan="1" colspan="1">0.86</td>
<td align="center" rowspan="1" colspan="1">0.88</td>
</tr>
<tr>
<td align="left" rowspan="1" colspan="1"></td>
<td align="left" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1">4</td>
<td align="center" rowspan="1" colspan="1">0.92</td>
<td align="center" rowspan="1" colspan="1">0.93</td>
<td align="center" rowspan="1" colspan="1">0.79</td>
<td align="center" rowspan="1" colspan="1">0.79</td>
<td align="center" rowspan="1" colspan="1">0.86</td>
<td align="center" rowspan="1" colspan="1">0.91</td>
<td align="center" rowspan="1" colspan="1">0.83</td>
<td align="center" rowspan="1" colspan="1">0.84</td>
</tr>
<tr>
<td align="left" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1">SRR397004 (570339)</td>
<td align="center" rowspan="1" colspan="1">1</td>
<td align="center" rowspan="1" colspan="1">1.00</td>
<td align="center" rowspan="1" colspan="1">1.00</td>
<td align="center" rowspan="1" colspan="1">0.99</td>
<td align="center" rowspan="1" colspan="1">0.99</td>
<td align="center" rowspan="1" colspan="1">0.99</td>
<td align="center" rowspan="1" colspan="1">0.99</td>
<td align="center" rowspan="1" colspan="1">0.99</td>
<td align="center" rowspan="1" colspan="1">0.99</td>
</tr>
<tr>
<td align="left" rowspan="1" colspan="1"></td>
<td align="left" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1">2</td>
<td align="center" rowspan="1" colspan="1">0.96</td>
<td align="center" rowspan="1" colspan="1">0.96</td>
<td align="center" rowspan="1" colspan="1">0.94</td>
<td align="center" rowspan="1" colspan="1">0.92</td>
<td align="center" rowspan="1" colspan="1">0.93</td>
<td align="center" rowspan="1" colspan="1">0.94</td>
<td align="center" rowspan="1" colspan="1">0.92</td>
<td align="center" rowspan="1" colspan="1">0.94</td>
</tr>
<tr>
<td align="left" rowspan="1" colspan="1"></td>
<td align="left" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1">3</td>
<td align="center" rowspan="1" colspan="1">0.93</td>
<td align="center" rowspan="1" colspan="1">0.93</td>
<td align="center" rowspan="1" colspan="1">0.86</td>
<td align="center" rowspan="1" colspan="1">0.90</td>
<td align="center" rowspan="1" colspan="1">0.87</td>
<td align="center" rowspan="1" colspan="1">0.90</td>
<td align="center" rowspan="1" colspan="1">0.82</td>
<td align="center" rowspan="1" colspan="1">0.90</td>
</tr>
<tr>
<td align="left" rowspan="1" colspan="1"></td>
<td align="left" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1">4</td>
<td align="center" rowspan="1" colspan="1">0.90</td>
<td align="center" rowspan="1" colspan="1">0.91</td>
<td align="center" rowspan="1" colspan="1">0.83</td>
<td align="center" rowspan="1" colspan="1">0.84</td>
<td align="center" rowspan="1" colspan="1">0.85</td>
<td align="center" rowspan="1" colspan="1">0.83</td>
<td align="center" rowspan="1" colspan="1">0.76</td>
<td align="center" rowspan="1" colspan="1">0.86</td>
</tr>
<tr>
<td align="left" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1">SRR397074 (564583)</td>
<td align="center" rowspan="1" colspan="1">1</td>
<td align="center" rowspan="1" colspan="1">1.00</td>
<td align="center" rowspan="1" colspan="1">1.00</td>
<td align="center" rowspan="1" colspan="1">0.99</td>
<td align="center" rowspan="1" colspan="1">0.99</td>
<td align="center" rowspan="1" colspan="1">0.99</td>
<td align="center" rowspan="1" colspan="1">0.99</td>
<td align="center" rowspan="1" colspan="1">0.99</td>
<td align="center" rowspan="1" colspan="1">0.99</td>
</tr>
<tr>
<td align="left" rowspan="1" colspan="1"></td>
<td align="left" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1">2</td>
<td align="center" rowspan="1" colspan="1">0.96</td>
<td align="center" rowspan="1" colspan="1">0.98</td>
<td align="center" rowspan="1" colspan="1">0.91</td>
<td align="center" rowspan="1" colspan="1">0.88</td>
<td align="center" rowspan="1" colspan="1">0.89</td>
<td align="center" rowspan="1" colspan="1">0.96</td>
<td align="center" rowspan="1" colspan="1">0.94</td>
<td align="center" rowspan="1" colspan="1">0.94</td>
</tr>
<tr>
<td align="left" rowspan="1" colspan="1"></td>
<td align="left" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1">3</td>
<td align="center" rowspan="1" colspan="1">0.92</td>
<td align="center" rowspan="1" colspan="1">0.96</td>
<td align="center" rowspan="1" colspan="1">0.89</td>
<td align="center" rowspan="1" colspan="1">0.86</td>
<td align="center" rowspan="1" colspan="1">0.79</td>
<td align="center" rowspan="1" colspan="1">0.92</td>
<td align="center" rowspan="1" colspan="1">0.82</td>
<td align="center" rowspan="1" colspan="1">0.90</td>
</tr>
<tr>
<td align="left" rowspan="1" colspan="1"></td>
<td align="left" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1">4</td>
<td align="center" rowspan="1" colspan="1">0.90</td>
<td align="center" rowspan="1" colspan="1">0.92</td>
<td align="center" rowspan="1" colspan="1">0.83</td>
<td align="center" rowspan="1" colspan="1">0.81</td>
<td align="center" rowspan="1" colspan="1">0.75</td>
<td align="center" rowspan="1" colspan="1">0.84</td>
<td align="center" rowspan="1" colspan="1">0.79</td>
<td align="center" rowspan="1" colspan="1">0.86</td>
</tr>
<tr>
<td align="left" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1">SRR397076 (561040)</td>
<td align="center" rowspan="1" colspan="1">1</td>
<td align="center" rowspan="1" colspan="1">1.00</td>
<td align="center" rowspan="1" colspan="1">1.00</td>
<td align="center" rowspan="1" colspan="1">0.99</td>
<td align="center" rowspan="1" colspan="1">0.99</td>
<td align="center" rowspan="1" colspan="1">0.99</td>
<td align="center" rowspan="1" colspan="1">0.99</td>
<td align="center" rowspan="1" colspan="1">0.99</td>
<td align="center" rowspan="1" colspan="1">0.99</td>
</tr>
<tr>
<td align="left" rowspan="1" colspan="1"></td>
<td align="left" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1">2</td>
<td align="center" rowspan="1" colspan="1">0.95</td>
<td align="center" rowspan="1" colspan="1">0.97</td>
<td align="center" rowspan="1" colspan="1">0.92</td>
<td align="center" rowspan="1" colspan="1">0.90</td>
<td align="center" rowspan="1" colspan="1">0.93</td>
<td align="center" rowspan="1" colspan="1">0.92</td>
<td align="center" rowspan="1" colspan="1">0.92</td>
<td align="center" rowspan="1" colspan="1">0.90</td>
</tr>
<tr>
<td align="left" rowspan="1" colspan="1"></td>
<td align="left" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1">3</td>
<td align="center" rowspan="1" colspan="1">0.93</td>
<td align="center" rowspan="1" colspan="1">0.95</td>
<td align="center" rowspan="1" colspan="1">0.87</td>
<td align="center" rowspan="1" colspan="1">0.91</td>
<td align="center" rowspan="1" colspan="1">0.89</td>
<td align="center" rowspan="1" colspan="1">0.90</td>
<td align="center" rowspan="1" colspan="1">0.88</td>
<td align="center" rowspan="1" colspan="1">0.84</td>
</tr>
<tr>
<td align="left" rowspan="1" colspan="1"></td>
<td align="left" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1">4</td>
<td align="center" rowspan="1" colspan="1">0.91</td>
<td align="center" rowspan="1" colspan="1">0.91</td>
<td align="center" rowspan="1" colspan="1">0.80</td>
<td align="center" rowspan="1" colspan="1">0.86</td>
<td align="center" rowspan="1" colspan="1">0.85</td>
<td align="center" rowspan="1" colspan="1">0.88</td>
<td align="center" rowspan="1" colspan="1">0.83</td>
<td align="center" rowspan="1" colspan="1">0.79</td>
</tr>
<tr>
<td align="left" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1">SRR397079 (596020)</td>
<td align="center" rowspan="1" colspan="1">1</td>
<td align="center" rowspan="1" colspan="1">1.00</td>
<td align="center" rowspan="1" colspan="1">1.00</td>
<td align="center" rowspan="1" colspan="1">0.99</td>
<td align="center" rowspan="1" colspan="1">0.99</td>
<td align="center" rowspan="1" colspan="1">0.99</td>
<td align="center" rowspan="1" colspan="1">0.99</td>
<td align="center" rowspan="1" colspan="1">0.99</td>
<td align="center" rowspan="1" colspan="1">0.99</td>
</tr>
<tr>
<td align="left" rowspan="1" colspan="1"></td>
<td align="left" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1">2</td>
<td align="center" rowspan="1" colspan="1">0.96</td>
<td align="center" rowspan="1" colspan="1">0.97</td>
<td align="center" rowspan="1" colspan="1">0.92</td>
<td align="center" rowspan="1" colspan="1">0.89</td>
<td align="center" rowspan="1" colspan="1">0.94</td>
<td align="center" rowspan="1" colspan="1">0.96</td>
<td align="center" rowspan="1" colspan="1">0.96</td>
<td align="center" rowspan="1" colspan="1">0.93</td>
</tr>
<tr>
<td align="left" rowspan="1" colspan="1"></td>
<td align="left" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1">3</td>
<td align="center" rowspan="1" colspan="1">0.94</td>
<td align="center" rowspan="1" colspan="1">0.94</td>
<td align="center" rowspan="1" colspan="1">0.86</td>
<td align="center" rowspan="1" colspan="1">0.88</td>
<td align="center" rowspan="1" colspan="1">0.91</td>
<td align="center" rowspan="1" colspan="1">0.92</td>
<td align="center" rowspan="1" colspan="1">0.90</td>
<td align="center" rowspan="1" colspan="1">0.87</td>
</tr>
<tr>
<td align="left" rowspan="1" colspan="1"></td>
<td align="left" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1">4</td>
<td align="center" rowspan="1" colspan="1">0.92</td>
<td align="center" rowspan="1" colspan="1">0.90</td>
<td align="center" rowspan="1" colspan="1">0.81</td>
<td align="center" rowspan="1" colspan="1">0.82</td>
<td align="center" rowspan="1" colspan="1">0.87</td>
<td align="center" rowspan="1" colspan="1">0.89</td>
<td align="center" rowspan="1" colspan="1">0.78</td>
<td align="center" rowspan="1" colspan="1">0.82</td>
</tr>
<tr>
<td align="left" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1">SRR397146 (583386)</td>
<td align="center" rowspan="1" colspan="1">1</td>
<td align="center" rowspan="1" colspan="1">1.00</td>
<td align="center" rowspan="1" colspan="1">1.00</td>
<td align="center" rowspan="1" colspan="1">0.99</td>
<td align="center" rowspan="1" colspan="1">0.99</td>
<td align="center" rowspan="1" colspan="1">0.99</td>
<td align="center" rowspan="1" colspan="1">0.99</td>
<td align="center" rowspan="1" colspan="1">0.99</td>
<td align="center" rowspan="1" colspan="1">0.99</td>
</tr>
<tr>
<td align="left" rowspan="1" colspan="1"></td>
<td align="left" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1">2</td>
<td align="center" rowspan="1" colspan="1">0.97</td>
<td align="center" rowspan="1" colspan="1">0.97</td>
<td align="center" rowspan="1" colspan="1">0.92</td>
<td align="center" rowspan="1" colspan="1">0.91</td>
<td align="center" rowspan="1" colspan="1">0.93</td>
<td align="center" rowspan="1" colspan="1">0.92</td>
<td align="center" rowspan="1" colspan="1">0.93</td>
<td align="center" rowspan="1" colspan="1">0.94</td>
</tr>
<tr>
<td align="left" rowspan="1" colspan="1"></td>
<td align="left" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1">3</td>
<td align="center" rowspan="1" colspan="1">0.94</td>
<td align="center" rowspan="1" colspan="1">0.94</td>
<td align="center" rowspan="1" colspan="1">0.89</td>
<td align="center" rowspan="1" colspan="1">0.90</td>
<td align="center" rowspan="1" colspan="1">0.88</td>
<td align="center" rowspan="1" colspan="1">0.89</td>
<td align="center" rowspan="1" colspan="1">0.84</td>
<td align="center" rowspan="1" colspan="1">0.90</td>
</tr>
<tr>
<td align="left" rowspan="1" colspan="1"></td>
<td align="left" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1">4</td>
<td align="center" rowspan="1" colspan="1">0.90</td>
<td align="center" rowspan="1" colspan="1">0.91</td>
<td align="center" rowspan="1" colspan="1">0.81</td>
<td align="center" rowspan="1" colspan="1">0.87</td>
<td align="center" rowspan="1" colspan="1">0.87</td>
<td align="center" rowspan="1" colspan="1">0.82</td>
<td align="center" rowspan="1" colspan="1">0.81</td>
<td align="center" rowspan="1" colspan="1">0.87</td>
</tr>
<tr>
<td align="left" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1">SRR397148 (564583)</td>
<td align="center" rowspan="1" colspan="1">1</td>
<td align="center" rowspan="1" colspan="1">1.00</td>
<td align="center" rowspan="1" colspan="1">1.00</td>
<td align="center" rowspan="1" colspan="1">0.99</td>
<td align="center" rowspan="1" colspan="1">0.99</td>
<td align="center" rowspan="1" colspan="1">0.99</td>
<td align="center" rowspan="1" colspan="1">0.99</td>
<td align="center" rowspan="1" colspan="1">0.99</td>
<td align="center" rowspan="1" colspan="1">0.99</td>
</tr>
<tr>
<td align="left" rowspan="1" colspan="1"></td>
<td align="left" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1">2</td>
<td align="center" rowspan="1" colspan="1">0.96</td>
<td align="center" rowspan="1" colspan="1">0.97</td>
<td align="center" rowspan="1" colspan="1">0.92</td>
<td align="center" rowspan="1" colspan="1">0.90</td>
<td align="center" rowspan="1" colspan="1">0.95</td>
<td align="center" rowspan="1" colspan="1">0.84</td>
<td align="center" rowspan="1" colspan="1">0.90</td>
<td align="center" rowspan="1" colspan="1">0.88</td>
</tr>
<tr>
<td align="left" rowspan="1" colspan="1"></td>
<td align="left" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1">3</td>
<td align="center" rowspan="1" colspan="1">0.92</td>
<td align="center" rowspan="1" colspan="1">0.94</td>
<td align="center" rowspan="1" colspan="1">0.90</td>
<td align="center" rowspan="1" colspan="1">0.85</td>
<td align="center" rowspan="1" colspan="1">0.86</td>
<td align="center" rowspan="1" colspan="1">0.80</td>
<td align="center" rowspan="1" colspan="1">0.84</td>
<td align="center" rowspan="1" colspan="1">0.87</td>
</tr>
<tr>
<td align="left" rowspan="1" colspan="1"></td>
<td align="left" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1">4</td>
<td align="center" rowspan="1" colspan="1">0.91</td>
<td align="center" rowspan="1" colspan="1">0.91</td>
<td align="center" rowspan="1" colspan="1">0.84</td>
<td align="center" rowspan="1" colspan="1">0.82</td>
<td align="center" rowspan="1" colspan="1">0.82</td>
<td align="center" rowspan="1" colspan="1">0.76</td>
<td align="center" rowspan="1" colspan="1">0.81</td>
<td align="center" rowspan="1" colspan="1">0.80</td>
</tr>
<tr>
<td align="center" rowspan="1" colspan="1">Roche 454 (Average Read Length: 350 bp)</td>
<td align="center" rowspan="1" colspan="1">Malnourished Child Gut Metagenome (1501481)</td>
<td align="center" rowspan="1" colspan="1">1</td>
<td align="center" rowspan="1" colspan="1">1.00</td>
<td align="center" rowspan="1" colspan="1">1.00</td>
<td align="center" rowspan="1" colspan="1">0.99</td>
<td align="center" rowspan="1" colspan="1">0.99</td>
<td align="center" rowspan="1" colspan="1">0.99</td>
<td align="center" rowspan="1" colspan="1">0.99</td>
<td align="center" rowspan="1" colspan="1">0.99</td>
<td align="center" rowspan="1" colspan="1">0.99</td>
</tr>
<tr>
<td align="left" rowspan="1" colspan="1"></td>
<td align="left" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1">2</td>
<td align="center" rowspan="1" colspan="1">0.87</td>
<td align="center" rowspan="1" colspan="1">0.99</td>
<td align="center" rowspan="1" colspan="1">0.81</td>
<td align="center" rowspan="1" colspan="1">0.95</td>
<td align="center" rowspan="1" colspan="1">0.85</td>
<td align="center" rowspan="1" colspan="1">0.98</td>
<td align="center" rowspan="1" colspan="1">0.80</td>
<td align="center" rowspan="1" colspan="1">0.96</td>
</tr>
<tr>
<td align="left" rowspan="1" colspan="1"></td>
<td align="left" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1">3</td>
<td align="center" rowspan="1" colspan="1">0.88</td>
<td align="center" rowspan="1" colspan="1">0.96</td>
<td align="center" rowspan="1" colspan="1">0.85</td>
<td align="center" rowspan="1" colspan="1">0.81</td>
<td align="center" rowspan="1" colspan="1">0.85</td>
<td align="center" rowspan="1" colspan="1">0.92</td>
<td align="center" rowspan="1" colspan="1">0.81</td>
<td align="center" rowspan="1" colspan="1">0.87</td>
</tr>
<tr>
<td align="left" rowspan="1" colspan="1"></td>
<td align="left" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1">4</td>
<td align="center" rowspan="1" colspan="1">0.67</td>
<td align="center" rowspan="1" colspan="1">0.98</td>
<td align="center" rowspan="1" colspan="1">0.60</td>
<td align="center" rowspan="1" colspan="1">0.88</td>
<td align="center" rowspan="1" colspan="1">0.64</td>
<td align="center" rowspan="1" colspan="1">0.95</td>
<td align="center" rowspan="1" colspan="1">0.55</td>
<td align="center" rowspan="1" colspan="1">0.91</td>
</tr>
<tr>
<td align="left" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1">Healthy Child Gut Metagenome (1501481)</td>
<td align="center" rowspan="1" colspan="1">1</td>
<td align="center" rowspan="1" colspan="1">1.00</td>
<td align="center" rowspan="1" colspan="1">1.00</td>
<td align="center" rowspan="1" colspan="1">0.75</td>
<td align="center" rowspan="1" colspan="1">0.94</td>
<td align="center" rowspan="1" colspan="1">0.77</td>
<td align="center" rowspan="1" colspan="1">0.94</td>
<td align="center" rowspan="1" colspan="1">0.79</td>
<td align="center" rowspan="1" colspan="1">0.95</td>
</tr>
<tr>
<td align="left" rowspan="1" colspan="1"></td>
<td align="left" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1">2</td>
<td align="center" rowspan="1" colspan="1">0.86</td>
<td align="center" rowspan="1" colspan="1">0.99</td>
<td align="center" rowspan="1" colspan="1">0.96</td>
<td align="center" rowspan="1" colspan="1">0.98</td>
<td align="center" rowspan="1" colspan="1">0.96</td>
<td align="center" rowspan="1" colspan="1">0.98</td>
<td align="center" rowspan="1" colspan="1">0.95</td>
<td align="center" rowspan="1" colspan="1">0.98</td>
</tr>
<tr>
<td align="left" rowspan="1" colspan="1"></td>
<td align="left" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1">3</td>
<td align="center" rowspan="1" colspan="1">0.89</td>
<td align="center" rowspan="1" colspan="1">0.97</td>
<td align="center" rowspan="1" colspan="1">0.85</td>
<td align="center" rowspan="1" colspan="1">0.98</td>
<td align="center" rowspan="1" colspan="1">0.86</td>
<td align="center" rowspan="1" colspan="1">0.98</td>
<td align="center" rowspan="1" colspan="1">0.87</td>
<td align="center" rowspan="1" colspan="1">0.98</td>
</tr>
<tr>
<td align="left" rowspan="1" colspan="1"></td>
<td align="left" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1">4</td>
<td align="center" rowspan="1" colspan="1">0.66</td>
<td align="center" rowspan="1" colspan="1">0.98</td>
<td align="center" rowspan="1" colspan="1">0.98</td>
<td align="center" rowspan="1" colspan="1">0.98</td>
<td align="center" rowspan="1" colspan="1">0.98</td>
<td align="center" rowspan="1" colspan="1">0.98</td>
<td align="center" rowspan="1" colspan="1">0.98</td>
<td align="center" rowspan="1" colspan="1">0.98</td>
</tr>
<tr>
<td align="center" rowspan="1" colspan="1">Roche 454 Oral Metagenomic samples (Average Read Length: 400 bp)</td>
<td align="center" rowspan="1" colspan="1">4447101.3.6941 (295072)</td>
<td align="center" rowspan="1" colspan="1">1</td>
<td align="center" rowspan="1" colspan="1">1.00</td>
<td align="center" rowspan="1" colspan="1">1.00</td>
<td align="center" rowspan="1" colspan="1">0.99</td>
<td align="center" rowspan="1" colspan="1">0.99</td>
<td align="center" rowspan="1" colspan="1">0.99</td>
<td align="center" rowspan="1" colspan="1">0.99</td>
<td align="center" rowspan="1" colspan="1">0.99</td>
<td align="center" rowspan="1" colspan="1">0.99</td>
</tr>
<tr>
<td align="left" rowspan="1" colspan="1"></td>
<td align="left" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1">2</td>
<td align="center" rowspan="1" colspan="1">0.99</td>
<td align="center" rowspan="1" colspan="1">0.99</td>
<td align="center" rowspan="1" colspan="1">0.94</td>
<td align="center" rowspan="1" colspan="1">0.96</td>
<td align="center" rowspan="1" colspan="1">0.97</td>
<td align="center" rowspan="1" colspan="1">0.96</td>
<td align="center" rowspan="1" colspan="1">0.97</td>
<td align="center" rowspan="1" colspan="1">0.96</td>
</tr>
<tr>
<td align="left" rowspan="1" colspan="1"></td>
<td align="left" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1">3</td>
<td align="center" rowspan="1" colspan="1">0.97</td>
<td align="center" rowspan="1" colspan="1">0.98</td>
<td align="center" rowspan="1" colspan="1">0.91</td>
<td align="center" rowspan="1" colspan="1">0.92</td>
<td align="center" rowspan="1" colspan="1">0.94</td>
<td align="center" rowspan="1" colspan="1">0.94</td>
<td align="center" rowspan="1" colspan="1">0.79</td>
<td align="center" rowspan="1" colspan="1">0.94</td>
</tr>
<tr>
<td align="left" rowspan="1" colspan="1"></td>
<td align="left" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1">4</td>
<td align="center" rowspan="1" colspan="1">0.96</td>
<td align="center" rowspan="1" colspan="1">0.96</td>
<td align="center" rowspan="1" colspan="1">0.84</td>
<td align="center" rowspan="1" colspan="1">0.89</td>
<td align="center" rowspan="1" colspan="1">0.91</td>
<td align="center" rowspan="1" colspan="1">0.92</td>
<td align="center" rowspan="1" colspan="1">0.79</td>
<td align="center" rowspan="1" colspan="1">0.89</td>
</tr>
<tr>
<td align="left" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1">4447102.3.6942 (244881)</td>
<td align="center" rowspan="1" colspan="1">1</td>
<td align="center" rowspan="1" colspan="1">1.00</td>
<td align="center" rowspan="1" colspan="1">1.00</td>
<td align="center" rowspan="1" colspan="1">0.99</td>
<td align="center" rowspan="1" colspan="1">0.99</td>
<td align="center" rowspan="1" colspan="1">0.99</td>
<td align="center" rowspan="1" colspan="1">0.99</td>
<td align="center" rowspan="1" colspan="1">0.99</td>
<td align="center" rowspan="1" colspan="1">0.99</td>
</tr>
<tr>
<td align="left" rowspan="1" colspan="1"></td>
<td align="left" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1">2</td>
<td align="center" rowspan="1" colspan="1">0.98</td>
<td align="center" rowspan="1" colspan="1">0.99</td>
<td align="center" rowspan="1" colspan="1">0.95</td>
<td align="center" rowspan="1" colspan="1">0.96</td>
<td align="center" rowspan="1" colspan="1">0.97</td>
<td align="center" rowspan="1" colspan="1">0.96</td>
<td align="center" rowspan="1" colspan="1">0.92</td>
<td align="center" rowspan="1" colspan="1">0.97</td>
</tr>
<tr>
<td align="left" rowspan="1" colspan="1"></td>
<td align="left" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1">3</td>
<td align="center" rowspan="1" colspan="1">0.96</td>
<td align="center" rowspan="1" colspan="1">0.97</td>
<td align="center" rowspan="1" colspan="1">0.83</td>
<td align="center" rowspan="1" colspan="1">0.92</td>
<td align="center" rowspan="1" colspan="1">0.95</td>
<td align="center" rowspan="1" colspan="1">0.94</td>
<td align="center" rowspan="1" colspan="1">0.78</td>
<td align="center" rowspan="1" colspan="1">0.94</td>
</tr>
<tr>
<td align="left" rowspan="1" colspan="1"></td>
<td align="left" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1">4</td>
<td align="center" rowspan="1" colspan="1">0.96</td>
<td align="center" rowspan="1" colspan="1">0.95</td>
<td align="center" rowspan="1" colspan="1">0.76</td>
<td align="center" rowspan="1" colspan="1">0.88</td>
<td align="center" rowspan="1" colspan="1">0.92</td>
<td align="center" rowspan="1" colspan="1">0.83</td>
<td align="center" rowspan="1" colspan="1">0.77</td>
<td align="center" rowspan="1" colspan="1">0.92</td>
</tr>
<tr>
<td align="left" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1">4447103.3.6943 (464594)</td>
<td align="center" rowspan="1" colspan="1">1</td>
<td align="center" rowspan="1" colspan="1">1.00</td>
<td align="center" rowspan="1" colspan="1">1.00</td>
<td align="center" rowspan="1" colspan="1">0.99</td>
<td align="center" rowspan="1" colspan="1">0.99</td>
<td align="center" rowspan="1" colspan="1">0.99</td>
<td align="center" rowspan="1" colspan="1">0.99</td>
<td align="center" rowspan="1" colspan="1">0.99</td>
<td align="center" rowspan="1" colspan="1">0.99</td>
</tr>
<tr>
<td align="left" rowspan="1" colspan="1"></td>
<td align="left" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1">2</td>
<td align="center" rowspan="1" colspan="1">0.99</td>
<td align="center" rowspan="1" colspan="1">0.99</td>
<td align="center" rowspan="1" colspan="1">0.95</td>
<td align="center" rowspan="1" colspan="1">0.94</td>
<td align="center" rowspan="1" colspan="1">0.97</td>
<td align="center" rowspan="1" colspan="1">0.98</td>
<td align="center" rowspan="1" colspan="1">0.94</td>
<td align="center" rowspan="1" colspan="1">0.95</td>
</tr>
<tr>
<td align="left" rowspan="1" colspan="1"></td>
<td align="left" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1">3</td>
<td align="center" rowspan="1" colspan="1">0.96</td>
<td align="center" rowspan="1" colspan="1">0.97</td>
<td align="center" rowspan="1" colspan="1">0.91</td>
<td align="center" rowspan="1" colspan="1">0.89</td>
<td align="center" rowspan="1" colspan="1">0.93</td>
<td align="center" rowspan="1" colspan="1">0.96</td>
<td align="center" rowspan="1" colspan="1">0.88</td>
<td align="center" rowspan="1" colspan="1">0.90</td>
</tr>
<tr>
<td align="left" rowspan="1" colspan="1"></td>
<td align="left" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1">4</td>
<td align="center" rowspan="1" colspan="1">0.95</td>
<td align="center" rowspan="1" colspan="1">0.93</td>
<td align="center" rowspan="1" colspan="1">0.85</td>
<td align="center" rowspan="1" colspan="1">0.89</td>
<td align="center" rowspan="1" colspan="1">0.90</td>
<td align="center" rowspan="1" colspan="1">0.95</td>
<td align="center" rowspan="1" colspan="1">0.82</td>
<td align="center" rowspan="1" colspan="1">0.86</td>
</tr>
<tr>
<td align="left" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1">4447192.3.7032 (204218)</td>
<td align="center" rowspan="1" colspan="1">1</td>
<td align="center" rowspan="1" colspan="1">1.00</td>
<td align="center" rowspan="1" colspan="1">1.00</td>
<td align="center" rowspan="1" colspan="1">0.99</td>
<td align="center" rowspan="1" colspan="1">0.99</td>
<td align="center" rowspan="1" colspan="1">0.99</td>
<td align="center" rowspan="1" colspan="1">0.99</td>
<td align="center" rowspan="1" colspan="1">0.99</td>
<td align="center" rowspan="1" colspan="1">0.99</td>
</tr>
<tr>
<td align="left" rowspan="1" colspan="1"></td>
<td align="left" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1">2</td>
<td align="center" rowspan="1" colspan="1">0.99</td>
<td align="center" rowspan="1" colspan="1">0.99</td>
<td align="center" rowspan="1" colspan="1">0.95</td>
<td align="center" rowspan="1" colspan="1">0.95</td>
<td align="center" rowspan="1" colspan="1">0.97</td>
<td align="center" rowspan="1" colspan="1">0.95</td>
<td align="center" rowspan="1" colspan="1">0.92</td>
<td align="center" rowspan="1" colspan="1">0.94</td>
</tr>
<tr>
<td align="left" rowspan="1" colspan="1"></td>
<td align="left" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1">3</td>
<td align="center" rowspan="1" colspan="1">0.97</td>
<td align="center" rowspan="1" colspan="1">0.96</td>
<td align="center" rowspan="1" colspan="1">0.89</td>
<td align="center" rowspan="1" colspan="1">0.88</td>
<td align="center" rowspan="1" colspan="1">0.95</td>
<td align="center" rowspan="1" colspan="1">0.92</td>
<td align="center" rowspan="1" colspan="1">0.83</td>
<td align="center" rowspan="1" colspan="1">0.92</td>
</tr>
<tr>
<td align="left" rowspan="1" colspan="1"></td>
<td align="left" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1">4</td>
<td align="center" rowspan="1" colspan="1">0.96</td>
<td align="center" rowspan="1" colspan="1">0.92</td>
<td align="center" rowspan="1" colspan="1">0.82</td>
<td align="center" rowspan="1" colspan="1">0.84</td>
<td align="center" rowspan="1" colspan="1">0.93</td>
<td align="center" rowspan="1" colspan="1">0.74</td>
<td align="center" rowspan="1" colspan="1">0.77</td>
<td align="center" rowspan="1" colspan="1">0.88</td>
</tr>
<tr>
<td align="left" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1">4447903.3.7714 (306740)</td>
<td align="center" rowspan="1" colspan="1">1</td>
<td align="center" rowspan="1" colspan="1">1.00</td>
<td align="center" rowspan="1" colspan="1">1.00</td>
<td align="center" rowspan="1" colspan="1">0.99</td>
<td align="center" rowspan="1" colspan="1">0.99</td>
<td align="center" rowspan="1" colspan="1">0.99</td>
<td align="center" rowspan="1" colspan="1">0.99</td>
<td align="center" rowspan="1" colspan="1">0.99</td>
<td align="center" rowspan="1" colspan="1">0.99</td>
</tr>
<tr>
<td align="left" rowspan="1" colspan="1"></td>
<td align="left" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1">2</td>
<td align="center" rowspan="1" colspan="1">0.98</td>
<td align="center" rowspan="1" colspan="1">0.98</td>
<td align="center" rowspan="1" colspan="1">0.94</td>
<td align="center" rowspan="1" colspan="1">0.96</td>
<td align="center" rowspan="1" colspan="1">0.97</td>
<td align="center" rowspan="1" colspan="1">0.96</td>
<td align="center" rowspan="1" colspan="1">0.91</td>
<td align="center" rowspan="1" colspan="1">0.96</td>
</tr>
<tr>
<td align="left" rowspan="1" colspan="1"></td>
<td align="left" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1">3</td>
<td align="center" rowspan="1" colspan="1">0.96</td>
<td align="center" rowspan="1" colspan="1">0.94</td>
<td align="center" rowspan="1" colspan="1">0.90</td>
<td align="center" rowspan="1" colspan="1">0.91</td>
<td align="center" rowspan="1" colspan="1">0.93</td>
<td align="center" rowspan="1" colspan="1">0.93</td>
<td align="center" rowspan="1" colspan="1">0.87</td>
<td align="center" rowspan="1" colspan="1">0.94</td>
</tr>
<tr>
<td align="left" rowspan="1" colspan="1"></td>
<td align="left" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1">4</td>
<td align="center" rowspan="1" colspan="1">0.95</td>
<td align="center" rowspan="1" colspan="1">0.91</td>
<td align="center" rowspan="1" colspan="1">0.86</td>
<td align="center" rowspan="1" colspan="1">0.88</td>
<td align="center" rowspan="1" colspan="1">0.91</td>
<td align="center" rowspan="1" colspan="1">0.89</td>
<td align="center" rowspan="1" colspan="1">0.79</td>
<td align="center" rowspan="1" colspan="1">0.89</td>
</tr>
<tr>
<td align="left" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1">4447943.3.7744 (339503)</td>
<td align="center" rowspan="1" colspan="1">1</td>
<td align="center" rowspan="1" colspan="1">1.00</td>
<td align="center" rowspan="1" colspan="1">1.00</td>
<td align="center" rowspan="1" colspan="1">0.99</td>
<td align="center" rowspan="1" colspan="1">0.99</td>
<td align="center" rowspan="1" colspan="1">0.99</td>
<td align="center" rowspan="1" colspan="1">0.99</td>
<td align="center" rowspan="1" colspan="1">0.99</td>
<td align="center" rowspan="1" colspan="1">0.99</td>
</tr>
<tr>
<td align="left" rowspan="1" colspan="1"></td>
<td align="left" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1">2</td>
<td align="center" rowspan="1" colspan="1">0.99</td>
<td align="center" rowspan="1" colspan="1">0.97</td>
<td align="center" rowspan="1" colspan="1">0.93</td>
<td align="center" rowspan="1" colspan="1">0.93</td>
<td align="center" rowspan="1" colspan="1">0.97</td>
<td align="center" rowspan="1" colspan="1">0.97</td>
<td align="center" rowspan="1" colspan="1">0.90</td>
<td align="center" rowspan="1" colspan="1">0.96</td>
</tr>
<tr>
<td align="left" rowspan="1" colspan="1"></td>
<td align="left" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1">3</td>
<td align="center" rowspan="1" colspan="1">0.97</td>
<td align="center" rowspan="1" colspan="1">0.95</td>
<td align="center" rowspan="1" colspan="1">0.89</td>
<td align="center" rowspan="1" colspan="1">0.89</td>
<td align="center" rowspan="1" colspan="1">0.95</td>
<td align="center" rowspan="1" colspan="1">0.92</td>
<td align="center" rowspan="1" colspan="1">0.82</td>
<td align="center" rowspan="1" colspan="1">0.94</td>
</tr>
<tr>
<td align="left" rowspan="1" colspan="1"></td>
<td align="left" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1">4</td>
<td align="center" rowspan="1" colspan="1">0.96</td>
<td align="center" rowspan="1" colspan="1">0.93</td>
<td align="center" rowspan="1" colspan="1">0.83</td>
<td align="center" rowspan="1" colspan="1">0.83</td>
<td align="center" rowspan="1" colspan="1">0.92</td>
<td align="center" rowspan="1" colspan="1">0.91</td>
<td align="center" rowspan="1" colspan="1">0.76</td>
<td align="center" rowspan="1" colspan="1">0.91</td>
</tr>
<tr>
<td align="left" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1">4447970.3.1 (70503)</td>
<td align="center" rowspan="1" colspan="1">1</td>
<td align="center" rowspan="1" colspan="1">1.00</td>
<td align="center" rowspan="1" colspan="1">1.00</td>
<td align="center" rowspan="1" colspan="1">0.99</td>
<td align="center" rowspan="1" colspan="1">0.99</td>
<td align="center" rowspan="1" colspan="1">0.99</td>
<td align="center" rowspan="1" colspan="1">0.99</td>
<td align="center" rowspan="1" colspan="1">0.99</td>
<td align="center" rowspan="1" colspan="1">0.99</td>
</tr>
<tr>
<td align="left" rowspan="1" colspan="1"></td>
<td align="left" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1">2</td>
<td align="center" rowspan="1" colspan="1">0.99</td>
<td align="center" rowspan="1" colspan="1">0.98</td>
<td align="center" rowspan="1" colspan="1">0.93</td>
<td align="center" rowspan="1" colspan="1">0.94</td>
<td align="center" rowspan="1" colspan="1">0.97</td>
<td align="center" rowspan="1" colspan="1">0.97</td>
<td align="center" rowspan="1" colspan="1">0.89</td>
<td align="center" rowspan="1" colspan="1">0.94</td>
</tr>
<tr>
<td align="left" rowspan="1" colspan="1"></td>
<td align="left" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1">3</td>
<td align="center" rowspan="1" colspan="1">0.96</td>
<td align="center" rowspan="1" colspan="1">0.96</td>
<td align="center" rowspan="1" colspan="1">0.88</td>
<td align="center" rowspan="1" colspan="1">0.90</td>
<td align="center" rowspan="1" colspan="1">0.95</td>
<td align="center" rowspan="1" colspan="1">0.93</td>
<td align="center" rowspan="1" colspan="1">0.81</td>
<td align="center" rowspan="1" colspan="1">0.94</td>
</tr>
<tr>
<td align="left" rowspan="1" colspan="1"></td>
<td align="left" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1">4</td>
<td align="center" rowspan="1" colspan="1">0.96</td>
<td align="center" rowspan="1" colspan="1">0.91</td>
<td align="center" rowspan="1" colspan="1">0.84</td>
<td align="center" rowspan="1" colspan="1">0.86</td>
<td align="center" rowspan="1" colspan="1">0.92</td>
<td align="center" rowspan="1" colspan="1">0.92</td>
<td align="center" rowspan="1" colspan="1">0.70</td>
<td align="center" rowspan="1" colspan="1">0.89</td>
</tr>
<tr>
<td align="left" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1">4447971.3.6 (97722)</td>
<td align="center" rowspan="1" colspan="1">1</td>
<td align="center" rowspan="1" colspan="1">1.00</td>
<td align="center" rowspan="1" colspan="1">1.00</td>
<td align="center" rowspan="1" colspan="1">0.99</td>
<td align="center" rowspan="1" colspan="1">0.99</td>
<td align="center" rowspan="1" colspan="1">0.99</td>
<td align="center" rowspan="1" colspan="1">0.99</td>
<td align="center" rowspan="1" colspan="1">0.99</td>
<td align="center" rowspan="1" colspan="1">0.99</td>
</tr>
<tr>
<td align="left" rowspan="1" colspan="1"></td>
<td align="left" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1">2</td>
<td align="center" rowspan="1" colspan="1">0.98</td>
<td align="center" rowspan="1" colspan="1">0.97</td>
<td align="center" rowspan="1" colspan="1">0.93</td>
<td align="center" rowspan="1" colspan="1">0.93</td>
<td align="center" rowspan="1" colspan="1">0.98</td>
<td align="center" rowspan="1" colspan="1">0.97</td>
<td align="center" rowspan="1" colspan="1">0.92</td>
<td align="center" rowspan="1" colspan="1">0.93</td>
</tr>
<tr>
<td align="left" rowspan="1" colspan="1"></td>
<td align="left" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1">3</td>
<td align="center" rowspan="1" colspan="1">0.97</td>
<td align="center" rowspan="1" colspan="1">0.95</td>
<td align="center" rowspan="1" colspan="1">0.91</td>
<td align="center" rowspan="1" colspan="1">0.89</td>
<td align="center" rowspan="1" colspan="1">0.92</td>
<td align="center" rowspan="1" colspan="1">0.94</td>
<td align="center" rowspan="1" colspan="1">0.85</td>
<td align="center" rowspan="1" colspan="1">0.93</td>
</tr>
<tr>
<td align="left" rowspan="1" colspan="1"></td>
<td align="left" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1">4</td>
<td align="center" rowspan="1" colspan="1">0.96</td>
<td align="center" rowspan="1" colspan="1">0.92</td>
<td align="center" rowspan="1" colspan="1">0.88</td>
<td align="center" rowspan="1" colspan="1">0.84</td>
<td align="center" rowspan="1" colspan="1">0.89</td>
<td align="center" rowspan="1" colspan="1">0.93</td>
<td align="center" rowspan="1" colspan="1">0.78</td>
<td align="center" rowspan="1" colspan="1">0.88</td>
</tr>
<tr>
<td align="center" rowspan="1" colspan="1">Sanger (Average Read Length: 1000 bp)</td>
<td align="center" rowspan="1" colspan="1">Acid Mine Drainage Metagenome (180713)</td>
<td align="center" rowspan="1" colspan="1">1</td>
<td align="center" rowspan="1" colspan="1">1.00</td>
<td align="center" rowspan="1" colspan="1">1.00</td>
<td align="center" rowspan="1" colspan="1">0.99</td>
<td align="center" rowspan="1" colspan="1">0.99</td>
<td align="center" rowspan="1" colspan="1">0.99</td>
<td align="center" rowspan="1" colspan="1">0.99</td>
<td align="center" rowspan="1" colspan="1">0.99</td>
<td align="center" rowspan="1" colspan="1">0.99</td>
</tr>
<tr>
<td align="left" rowspan="1" colspan="1"></td>
<td align="left" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1">2</td>
<td align="center" rowspan="1" colspan="1">0.98</td>
<td align="center" rowspan="1" colspan="1">0.78</td>
<td align="center" rowspan="1" colspan="1">0.93</td>
<td align="center" rowspan="1" colspan="1">0.74</td>
<td align="center" rowspan="1" colspan="1">0.96</td>
<td align="center" rowspan="1" colspan="1">0.72</td>
<td align="center" rowspan="1" colspan="1">0.90</td>
<td align="center" rowspan="1" colspan="1">0.87</td>
</tr>
<tr>
<td align="left" rowspan="1" colspan="1"></td>
<td align="left" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1">3</td>
<td align="center" rowspan="1" colspan="1">0.95</td>
<td align="center" rowspan="1" colspan="1">0.82</td>
<td align="center" rowspan="1" colspan="1">0.65</td>
<td align="center" rowspan="1" colspan="1">0.79</td>
<td align="center" rowspan="1" colspan="1">0.82</td>
<td align="center" rowspan="1" colspan="1">0.73</td>
<td align="center" rowspan="1" colspan="1">0.63</td>
<td align="center" rowspan="1" colspan="1">0.90</td>
</tr>
<tr>
<td align="left" rowspan="1" colspan="1"></td>
<td align="left" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1">4</td>
<td align="center" rowspan="1" colspan="1">0.95</td>
<td align="center" rowspan="1" colspan="1">0.68</td>
<td align="center" rowspan="1" colspan="1">0.72</td>
<td align="center" rowspan="1" colspan="1">0.66</td>
<td align="center" rowspan="1" colspan="1">0.86</td>
<td align="center" rowspan="1" colspan="1">0.63</td>
<td align="center" rowspan="1" colspan="1">0.69</td>
<td align="center" rowspan="1" colspan="1">0.80</td>
</tr>
</tbody>
</table>
</alternatives>
<table-wrap-foot>
<fn id="t001fn001">
<p># Number within brackets indicates the total number of reads in each of the validation datasets. For oral metagenomes, MG-RAST ids are provided as dataset identifiers.</p>
</fn>
</table-wrap-foot>
</table-wrap>
<table-wrap id="pone.0142102.t002" orientation="portrait" position="float">
<object-id pub-id-type="doi">10.1371/journal.pone.0142102.t002</object-id>
<label>Table 2</label>
<caption>
<title>Comparison of computing time required by different options in the COGNIZER framework.</title>
</caption>
<alternatives>
<graphic id="pone.0142102.t002g" xlink:href="pone.0142102.t002"></graphic>
<table frame="hsides" rules="groups">
<colgroup span="1">
<col align="left" valign="middle" span="1"></col>
<col align="left" valign="middle" span="1"></col>
<col align="left" valign="middle" span="1"></col>
<col align="left" valign="middle" span="1"></col>
<col align="left" valign="middle" span="1"></col>
</colgroup>
<thead>
<tr>
<th align="center" rowspan="1" colspan="1">Sequencing Platform</th>
<th align="center" rowspan="1" colspan="1">Dataset
<xref rid="t002fn002" ref-type="table-fn">
<sup>#</sup>
</xref>
</th>
<th align="center" rowspan="1" colspan="1">Option</th>
<th align="center" rowspan="1" colspan="1">Time (s)
<xref rid="t002fn001" ref-type="table-fn">*</xref>
</th>
<th align="center" rowspan="1" colspan="1">Percentage Reduction in time as compared to option 1</th>
</tr>
</thead>
<tbody>
<tr>
<td align="center" rowspan="1" colspan="1">Illumina (Average Read Length: 100 bp)</td>
<td align="center" rowspan="1" colspan="1">High Salt Metagenome (35446)</td>
<td align="center" rowspan="1" colspan="1">1</td>
<td align="center" rowspan="1" colspan="1">507</td>
<td align="center" rowspan="1" colspan="1">-</td>
</tr>
<tr>
<td align="center" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1">2</td>
<td align="center" rowspan="1" colspan="1">120</td>
<td align="center" rowspan="1" colspan="1">76.33</td>
</tr>
<tr>
<td align="center" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1">3</td>
<td align="center" rowspan="1" colspan="1">378</td>
<td align="center" rowspan="1" colspan="1">25.44</td>
</tr>
<tr>
<td align="center" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1">4</td>
<td align="center" rowspan="1" colspan="1">111</td>
<td align="center" rowspan="1" colspan="1">78.11</td>
</tr>
<tr>
<td align="center" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1">Medium Salt Metagenome (38929)</td>
<td align="center" rowspan="1" colspan="1">1</td>
<td align="center" rowspan="1" colspan="1">551</td>
<td align="center" rowspan="1" colspan="1">-</td>
</tr>
<tr>
<td align="center" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1">2</td>
<td align="center" rowspan="1" colspan="1">125</td>
<td align="center" rowspan="1" colspan="1">77.31</td>
</tr>
<tr>
<td align="center" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1">3</td>
<td align="center" rowspan="1" colspan="1">392</td>
<td align="center" rowspan="1" colspan="1">28.86</td>
</tr>
<tr>
<td align="center" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1">4</td>
<td align="center" rowspan="1" colspan="1">111</td>
<td align="center" rowspan="1" colspan="1">79.85</td>
</tr>
<tr>
<td align="center" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1">Low Salt Metagenome (34296)</td>
<td align="center" rowspan="1" colspan="1">1</td>
<td align="center" rowspan="1" colspan="1">398</td>
<td align="center" rowspan="1" colspan="1">-</td>
</tr>
<tr>
<td align="center" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1">2</td>
<td align="center" rowspan="1" colspan="1">94</td>
<td align="center" rowspan="1" colspan="1">76.38</td>
</tr>
<tr>
<td align="center" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1">3</td>
<td align="center" rowspan="1" colspan="1">287</td>
<td align="center" rowspan="1" colspan="1">27.89</td>
</tr>
<tr>
<td align="center" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1">4</td>
<td align="center" rowspan="1" colspan="1">90</td>
<td align="center" rowspan="1" colspan="1">77.39</td>
</tr>
<tr>
<td align="center" rowspan="1" colspan="1">Illumina Meta-transcriptomic Datasets (Average Read Length: 209 bp)</td>
<td align="center" rowspan="1" colspan="1">SRR397002 (587272)</td>
<td align="center" rowspan="1" colspan="1">1</td>
<td align="center" rowspan="1" colspan="1">15272</td>
<td align="center" rowspan="1" colspan="1">-</td>
</tr>
<tr>
<td align="center" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1">2</td>
<td align="center" rowspan="1" colspan="1">2772</td>
<td align="center" rowspan="1" colspan="1">81.85</td>
</tr>
<tr>
<td align="center" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1">3</td>
<td align="center" rowspan="1" colspan="1">9274</td>
<td align="center" rowspan="1" colspan="1">39.27</td>
</tr>
<tr>
<td align="center" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1">4</td>
<td align="center" rowspan="1" colspan="1">1136</td>
<td align="center" rowspan="1" colspan="1">92.56</td>
</tr>
<tr>
<td align="center" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1">SRR397004 (570339)</td>
<td align="center" rowspan="1" colspan="1">1</td>
<td align="center" rowspan="1" colspan="1">15491</td>
<td align="center" rowspan="1" colspan="1">-</td>
</tr>
<tr>
<td align="center" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1">2</td>
<td align="center" rowspan="1" colspan="1">2814</td>
<td align="center" rowspan="1" colspan="1">81.83</td>
</tr>
<tr>
<td align="center" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1">3</td>
<td align="center" rowspan="1" colspan="1">8949</td>
<td align="center" rowspan="1" colspan="1">42.23</td>
</tr>
<tr>
<td align="center" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1">4</td>
<td align="center" rowspan="1" colspan="1">1157</td>
<td align="center" rowspan="1" colspan="1">92.53</td>
</tr>
<tr>
<td align="center" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1">SRR397074 (564583)</td>
<td align="center" rowspan="1" colspan="1">1</td>
<td align="center" rowspan="1" colspan="1">17627</td>
<td align="center" rowspan="1" colspan="1">-</td>
</tr>
<tr>
<td align="center" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1">2</td>
<td align="center" rowspan="1" colspan="1">3579</td>
<td align="center" rowspan="1" colspan="1">79.70</td>
</tr>
<tr>
<td align="center" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1">3</td>
<td align="center" rowspan="1" colspan="1">9536</td>
<td align="center" rowspan="1" colspan="1">45.90</td>
</tr>
<tr>
<td align="center" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1">4</td>
<td align="center" rowspan="1" colspan="1">1595</td>
<td align="center" rowspan="1" colspan="1">90.95</td>
</tr>
<tr>
<td align="center" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1">SRR397076 (561040)</td>
<td align="center" rowspan="1" colspan="1">1</td>
<td align="center" rowspan="1" colspan="1">17979</td>
<td align="center" rowspan="1" colspan="1">-</td>
</tr>
<tr>
<td align="center" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1">2</td>
<td align="center" rowspan="1" colspan="1">3535</td>
<td align="center" rowspan="1" colspan="1">80.34</td>
</tr>
<tr>
<td align="center" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1">3</td>
<td align="center" rowspan="1" colspan="1">9282</td>
<td align="center" rowspan="1" colspan="1">48.37</td>
</tr>
<tr>
<td align="center" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1">4</td>
<td align="center" rowspan="1" colspan="1">1583</td>
<td align="center" rowspan="1" colspan="1">91.20</td>
</tr>
<tr>
<td align="center" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1">SRR397079 (596020)</td>
<td align="center" rowspan="1" colspan="1">1</td>
<td align="center" rowspan="1" colspan="1">14411</td>
<td align="center" rowspan="1" colspan="1">-</td>
</tr>
<tr>
<td align="center" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1">2</td>
<td align="center" rowspan="1" colspan="1">3039</td>
<td align="center" rowspan="1" colspan="1">78.91</td>
</tr>
<tr>
<td align="center" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1">3</td>
<td align="center" rowspan="1" colspan="1">7294</td>
<td align="center" rowspan="1" colspan="1">49.39</td>
</tr>
<tr>
<td align="center" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1">4</td>
<td align="center" rowspan="1" colspan="1">1759</td>
<td align="center" rowspan="1" colspan="1">87.79</td>
</tr>
<tr>
<td align="center" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1">SRR397146 (583386)</td>
<td align="center" rowspan="1" colspan="1">1</td>
<td align="center" rowspan="1" colspan="1">14664</td>
<td align="center" rowspan="1" colspan="1">-</td>
</tr>
<tr>
<td align="center" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1">2</td>
<td align="center" rowspan="1" colspan="1">3375</td>
<td align="center" rowspan="1" colspan="1">76.98</td>
</tr>
<tr>
<td align="center" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1">3</td>
<td align="center" rowspan="1" colspan="1">8261</td>
<td align="center" rowspan="1" colspan="1">43.66</td>
</tr>
<tr>
<td align="center" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1">4</td>
<td align="center" rowspan="1" colspan="1">1732</td>
<td align="center" rowspan="1" colspan="1">88.19</td>
</tr>
<tr>
<td align="center" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1">SRR397148 (564583)</td>
<td align="center" rowspan="1" colspan="1">1</td>
<td align="center" rowspan="1" colspan="1">13752</td>
<td align="center" rowspan="1" colspan="1">-</td>
</tr>
<tr>
<td align="center" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1">2</td>
<td align="center" rowspan="1" colspan="1">2986</td>
<td align="center" rowspan="1" colspan="1">78.29</td>
</tr>
<tr>
<td align="center" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1">3</td>
<td align="center" rowspan="1" colspan="1">7950</td>
<td align="center" rowspan="1" colspan="1">42.19</td>
</tr>
<tr>
<td align="center" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1">4</td>
<td align="center" rowspan="1" colspan="1">1697</td>
<td align="center" rowspan="1" colspan="1">87.66</td>
</tr>
<tr>
<td align="center" rowspan="1" colspan="1">Roche 454 (Average Read Length: 350 bp)</td>
<td align="center" rowspan="1" colspan="1">Malnourished Child Gut Metagenome (1501481)</td>
<td align="center" rowspan="1" colspan="1">1</td>
<td align="center" rowspan="1" colspan="1">56700</td>
<td align="center" rowspan="1" colspan="1">-</td>
</tr>
<tr>
<td align="center" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1">2</td>
<td align="center" rowspan="1" colspan="1">17040</td>
<td align="center" rowspan="1" colspan="1">69.95</td>
</tr>
<tr>
<td align="center" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1">3</td>
<td align="center" rowspan="1" colspan="1">40620</td>
<td align="center" rowspan="1" colspan="1">28.36</td>
</tr>
<tr>
<td align="center" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1">4</td>
<td align="center" rowspan="1" colspan="1">9840</td>
<td align="center" rowspan="1" colspan="1">82.65</td>
</tr>
<tr>
<td align="center" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1">Healthy Child Gut Metagenome (1501481)</td>
<td align="center" rowspan="1" colspan="1">1</td>
<td align="center" rowspan="1" colspan="1">48540</td>
<td align="center" rowspan="1" colspan="1">-</td>
</tr>
<tr>
<td align="center" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1">2</td>
<td align="center" rowspan="1" colspan="1">13800</td>
<td align="center" rowspan="1" colspan="1">71.57</td>
</tr>
<tr>
<td align="center" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1">3</td>
<td align="center" rowspan="1" colspan="1">36900</td>
<td align="center" rowspan="1" colspan="1">23.98</td>
</tr>
<tr>
<td align="center" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1">4</td>
<td align="center" rowspan="1" colspan="1">8220</td>
<td align="center" rowspan="1" colspan="1">83.07</td>
</tr>
<tr>
<td align="center" rowspan="1" colspan="1">Roche 454 Oral Metagenomic Samples (Average Read Length: 400 bp)</td>
<td align="center" rowspan="1" colspan="1">4447101.3.6941 (295072)</td>
<td align="center" rowspan="1" colspan="1">1</td>
<td align="center" rowspan="1" colspan="1">19085</td>
<td align="center" rowspan="1" colspan="1">-</td>
</tr>
<tr>
<td align="center" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1">2</td>
<td align="center" rowspan="1" colspan="1">6048</td>
<td align="center" rowspan="1" colspan="1">68.31</td>
</tr>
<tr>
<td align="center" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1">3</td>
<td align="center" rowspan="1" colspan="1">12525</td>
<td align="center" rowspan="1" colspan="1">34.37</td>
</tr>
<tr>
<td align="center" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1">4</td>
<td align="center" rowspan="1" colspan="1">3706</td>
<td align="center" rowspan="1" colspan="1">80.58</td>
</tr>
<tr>
<td align="center" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1">4447102.3.6942 (244881)</td>
<td align="center" rowspan="1" colspan="1">1</td>
<td align="center" rowspan="1" colspan="1">14780</td>
<td align="center" rowspan="1" colspan="1">-</td>
</tr>
<tr>
<td align="center" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1">2</td>
<td align="center" rowspan="1" colspan="1">4466</td>
<td align="center" rowspan="1" colspan="1">69.78</td>
</tr>
<tr>
<td align="center" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1">3</td>
<td align="center" rowspan="1" colspan="1">9224</td>
<td align="center" rowspan="1" colspan="1">37.59</td>
</tr>
<tr>
<td align="center" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1">4</td>
<td align="center" rowspan="1" colspan="1">2728</td>
<td align="center" rowspan="1" colspan="1">81.54</td>
</tr>
<tr>
<td align="center" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1">4447103.3.6943 (464594)</td>
<td align="center" rowspan="1" colspan="1">1</td>
<td align="center" rowspan="1" colspan="1">30130</td>
<td align="center" rowspan="1" colspan="1">-</td>
</tr>
<tr>
<td align="center" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1">2</td>
<td align="center" rowspan="1" colspan="1">10084</td>
<td align="center" rowspan="1" colspan="1">66.53</td>
</tr>
<tr>
<td align="center" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1">3</td>
<td align="center" rowspan="1" colspan="1">20716</td>
<td align="center" rowspan="1" colspan="1">31.24</td>
</tr>
<tr>
<td align="center" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1">4</td>
<td align="center" rowspan="1" colspan="1">6048</td>
<td align="center" rowspan="1" colspan="1">79.93</td>
</tr>
<tr>
<td align="center" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1">4447192.3.7032 (204218)</td>
<td align="center" rowspan="1" colspan="1">1</td>
<td align="center" rowspan="1" colspan="1">11772</td>
<td align="center" rowspan="1" colspan="1">-</td>
</tr>
<tr>
<td align="center" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1">2</td>
<td align="center" rowspan="1" colspan="1">3621</td>
<td align="center" rowspan="1" colspan="1">69.24</td>
</tr>
<tr>
<td align="center" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1">3</td>
<td align="center" rowspan="1" colspan="1">8027</td>
<td align="center" rowspan="1" colspan="1">31.81</td>
</tr>
<tr>
<td align="center" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1">4</td>
<td align="center" rowspan="1" colspan="1">2382</td>
<td align="center" rowspan="1" colspan="1">79.77</td>
</tr>
<tr>
<td align="center" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1">4447903.3.7714 (306740)</td>
<td align="center" rowspan="1" colspan="1">1</td>
<td align="center" rowspan="1" colspan="1">18253</td>
<td align="center" rowspan="1" colspan="1">-</td>
</tr>
<tr>
<td align="center" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1">2</td>
<td align="center" rowspan="1" colspan="1">5713</td>
<td align="center" rowspan="1" colspan="1">68.70</td>
</tr>
<tr>
<td align="center" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1">3</td>
<td align="center" rowspan="1" colspan="1">12108</td>
<td align="center" rowspan="1" colspan="1">33.67</td>
</tr>
<tr>
<td align="center" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1">4</td>
<td align="center" rowspan="1" colspan="1">3725</td>
<td align="center" rowspan="1" colspan="1">79.59</td>
</tr>
<tr>
<td align="center" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1">4447943.3.7744 (339503)</td>
<td align="center" rowspan="1" colspan="1">1</td>
<td align="center" rowspan="1" colspan="1">21127</td>
<td align="center" rowspan="1" colspan="1">-</td>
</tr>
<tr>
<td align="center" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1">2</td>
<td align="center" rowspan="1" colspan="1">6852</td>
<td align="center" rowspan="1" colspan="1">67.57</td>
</tr>
<tr>
<td align="center" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1">3</td>
<td align="center" rowspan="1" colspan="1">13941</td>
<td align="center" rowspan="1" colspan="1">34.01</td>
</tr>
<tr>
<td align="center" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1">4</td>
<td align="center" rowspan="1" colspan="1">4143</td>
<td align="center" rowspan="1" colspan="1">80.39</td>
</tr>
<tr>
<td align="center" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1">4447970.3.1 (70503)</td>
<td align="center" rowspan="1" colspan="1">1</td>
<td align="center" rowspan="1" colspan="1">4087</td>
<td align="center" rowspan="1" colspan="1">-</td>
</tr>
<tr>
<td align="center" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1">2</td>
<td align="center" rowspan="1" colspan="1">1307</td>
<td align="center" rowspan="1" colspan="1">68.02</td>
</tr>
<tr>
<td align="center" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1">3</td>
<td align="center" rowspan="1" colspan="1">2763</td>
<td align="center" rowspan="1" colspan="1">32.40</td>
</tr>
<tr>
<td align="center" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1">4</td>
<td align="center" rowspan="1" colspan="1">882</td>
<td align="center" rowspan="1" colspan="1">78.42</td>
</tr>
<tr>
<td align="center" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1">4447971.3.6 (97722)</td>
<td align="center" rowspan="1" colspan="1">1</td>
<td align="center" rowspan="1" colspan="1">5652</td>
<td align="center" rowspan="1" colspan="1">-</td>
</tr>
<tr>
<td align="center" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1">2</td>
<td align="center" rowspan="1" colspan="1">1893</td>
<td align="center" rowspan="1" colspan="1">66.51</td>
</tr>
<tr>
<td align="center" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1">3</td>
<td align="center" rowspan="1" colspan="1">3803</td>
<td align="center" rowspan="1" colspan="1">32.71</td>
</tr>
<tr>
<td align="center" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1">4</td>
<td align="center" rowspan="1" colspan="1">1168</td>
<td align="center" rowspan="1" colspan="1">79.33</td>
</tr>
<tr>
<td align="center" rowspan="1" colspan="1">Sanger (Average Read Length: 1000 bp)</td>
<td align="center" rowspan="1" colspan="1">Acid Mine Drainage Metagenome (180713)</td>
<td align="center" rowspan="1" colspan="1">1</td>
<td align="center" rowspan="1" colspan="1">21680</td>
<td align="center" rowspan="1" colspan="1">-</td>
</tr>
<tr>
<td align="center" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1">2</td>
<td align="center" rowspan="1" colspan="1">7100</td>
<td align="center" rowspan="1" colspan="1">67.25</td>
</tr>
<tr>
<td align="center" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1">3</td>
<td align="center" rowspan="1" colspan="1">9020</td>
<td align="center" rowspan="1" colspan="1">58.39</td>
</tr>
<tr>
<td align="center" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1"></td>
<td align="center" rowspan="1" colspan="1">4</td>
<td align="center" rowspan="1" colspan="1">5650</td>
<td align="center" rowspan="1" colspan="1">73.94</td>
</tr>
</tbody>
</table>
</alternatives>
<table-wrap-foot>
<fn id="t002fn001">
<p>* All validation experiments were performed on a CentOS (ver. 6.3) server with 64 Intel Xeon dual-core 2.3 GHz processors and 128 GB of RAM. Individual options of COGNIZER were executed using 32 CPU threads.</p>
</fn>
<fn id="t002fn002">
<p>
<sup>#</sup>
Number within brackets indicates the total number of reads in each of the validation datasets. For oral metagenomes, MG-RAST ids are provided as dataset identifiers.</p>
</fn>
</table-wrap-foot>
</table-wrap>
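<p>The last column of Table 2 can be reproduced from the timing column. The minimal Python sketch below assumes the reduction is computed as 100 * (t1 - tk) / t1, where t1 is the option 1 time and tk the time of another option, rounded to two decimals. The function name percent_reduction is illustrative and not part of COGNIZER; the sample timings are the High Salt Metagenome values from Table 2.</p>

```python
def percent_reduction(baseline_secs: float, option_secs: float) -> float:
    """Percentage reduction in run time relative to the option 1 baseline."""
    if baseline_secs == 0:
        raise ValueError("baseline time must be non-zero")
    return round(100.0 * (baseline_secs - option_secs) / baseline_secs, 2)


# High Salt Metagenome timings in seconds, keyed by option number (Table 2).
high_salt = {1: 507, 2: 120, 3: 378, 4: 111}

# Reduction of options 2-4 relative to option 1.
reductions = {
    option: percent_reduction(high_salt[1], secs)
    for option, secs in high_salt.items()
    if option != 1
}
print(reductions)
```

<p>The same function can be applied to any other dataset row in Table 2 by substituting its four timings.</p>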
<p>Despite employing a directed-search strategy against a customized (reduced) COG database, the PPV and NPV values obtained with options 3 and 4 of COGNIZER are, in the majority of cases, observed to be in the range of 0.76–0.95 (
<xref rid="pone.0142102.t001" ref-type="table">Table 1</xref>
). Notably, for most datasets with reads longer than 300 bp, the mean PPV and NPV values of options 3 and 4 are observed to be relatively higher than those obtained for datasets with shorter reads. The probable reason is as follows. Longer sequence fragments are more likely to generate robust alignments, which decreases the likelihood of a false positive prediction. Furthermore, proteins typically comprise multiple functional domains, so longer sequence fragments are more likely to encompass information corresponding to several domains. The slight improvement in results obtained with datasets having higher mean sequence lengths (typically those from the Roche 454 and Sanger sequencing technologies) reflects this. Given that most currently available sequencing technologies can generate reads of at least 250 bp, the results obtained with options 3 and 4 (for datasets having sequences of length 300 bp and above) are relevant in the present context. With respect to processing time, options 3 and 4 are observed to outperform options 1 and 2, respectively (
<xref rid="pone.0142102.t002" ref-type="table">Table 2</xref>
), thereby reflecting the utility of the directed search strategy in reducing the computational costs.</p>
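<p>For readers less familiar with the two metrics reported in Table 1, the sketch below uses the standard definitions PPV = TP / (TP + FP) and NPV = TN / (TN + FN). How true and false outcomes were counted for each option is not restated here, so the counts in the example are hypothetical and serve only to illustrate the arithmetic.</p>

```python
def ppv(true_pos: int, false_pos: int) -> float:
    """Positive predictive value: share of positive calls that are correct."""
    total = true_pos + false_pos
    if total == 0:
        raise ValueError("no positive predictions")
    return true_pos / total


def npv(true_neg: int, false_neg: int) -> float:
    """Negative predictive value: share of negative calls that are correct."""
    total = true_neg + false_neg
    if total == 0:
        raise ValueError("no negative predictions")
    return true_neg / total


# Hypothetical counts for one functional category (not taken from the paper).
counts = {"tp": 90, "fp": 10, "tn": 80, "fn": 20}
print(round(ppv(counts["tp"], counts["fp"]), 2))
print(round(npv(counts["tn"], counts["fn"]), 2))
```

<p>A longer read that yields a more robust alignment would, under these definitions, lower the false positive count and therefore raise the PPV.</p>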
<p>A heat map depicting correlation between the annotation results obtained using option 1 and those obtained using the other three options of COGNIZER is presented in
<xref rid="pone.0142102.g005" ref-type="fig">Fig 5</xref>
. In summary, validation results provided in Tables
<xref rid="pone.0142102.t001" ref-type="table">1</xref>
and
<xref rid="pone.0142102.t002" ref-type="table">2</xref>
along with results depicted in
<xref rid="pone.0142102.g005" ref-type="fig">Fig 5</xref>
indicate that options 2 and 4 represent an optimal trade-off between execution time and annotation accuracy. It is pertinent to note here that all workflow options of COGNIZER (including options 2 and 4) rely on the cross-mapping database for deriving annotations pertaining to different functional databases (KEGG, SEED, GO, and Pfam) using a single homology search against the COG database. Consequently, this database constitutes a key component of this stand-alone functional annotation framework. Interestingly, results presented in Tables
<xref rid="pone.0142102.t001" ref-type="table">1</xref>
and
<xref rid="pone.0142102.t002" ref-type="table">2</xref>
and
<xref rid="pone.0142102.g005" ref-type="fig">Fig 5</xref>
further indicate that the drop in annotation accuracy with options 2–4 (as compared to option 1) is more or less consistent across various datasets (irrespective of read length). This is expected given that options 2–4 involve additional heuristic features. Overall, the annotation accuracy appears to depend on both the query sequence length and the heuristic option employed.</p>
<fig id="pone.0142102.g005" orientation="portrait" position="float">
<object-id pub-id-type="doi">10.1371/journal.pone.0142102.g005</object-id>
<label>Fig 5</label>
<caption>
<title>Correlation between prediction results obtained using option 1 and those obtained using the other three options in the COGNIZER framework.</title>
<p>A heat map of the correlation coefficients between the annotations obtained using option 1 and those obtained using the other three options of the COGNIZER framework. Pearson correlation coefficients were obtained with p-values below 0.00001. In option 1, the BLASTx method is employed (in the homology-search phase) for querying reads constituting metagenomic datasets. The search is performed against all sequences in the COG database. In the subsequent 'mapping' phase, functional annotations are inferred for each query using COGNIZER's cross-mapping database. In option 2, the RAPSearch algorithm is used instead of BLASTx (in the homology-search phase). Options 3 and 4 are analogous to options 1 and 2 respectively, except that a reduced/customised COG database is used during the homology-search phase.</p>
</caption>
<graphic xlink:href="pone.0142102.g005"></graphic>
</fig>
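<p>The kind of comparison summarised in Fig 5 can be sketched as a Pearson correlation between two vectors of annotation counts, one per workflow option. The category counts below are invented for illustration, and the exact vectors correlated in this study are not restated here, so this is an assumption about the general approach rather than the authors' procedure.</p>

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(x) != len(y) or not x:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant vector")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical read counts per functional category for two options
# (same category order in both lists; values are illustrative only).
option1_counts = [120, 340, 55, 210, 18]
option4_counts = [110, 325, 60, 190, 15]

print("r =", round(pearson(option1_counts, option4_counts), 4))
```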
<p>COGNIZER relies primarily on the COG database. The main reason for choosing the COG database is as follows. The COG database, comprising approximately 200,000 protein sequences, is relatively small compared with other protein databases. For instance, the Pfam database, a collection of HMM profiles (not actual protein sequences), exceeds 1.2 GB (as compared to 70 MB for the COG database). Furthermore, the COG database captures most of the known protein functional categories. A recent review has reported that, in spite of the difference in database sizes, the quality of annotation (i.e. categorization of protein sequences into functional classes) obtained using the COG and the RefSeq databases is comparable [
<xref rid="pone.0142102.ref001" ref-type="bibr">1</xref>
]. The directed-search approach employed in COGNIZER therefore helps in further reducing the computing requirements without substantial loss in annotation accuracy. It is pertinent to note here that in all validation experiments, the peak memory requirement of COGNIZER rarely exceeded 500 MB of RAM.</p>
<p>Usage of options 3 and 4, which employ a reduced COG database in conjunction with the cross-mapping framework, is logically expected to result in some degree of loss in annotation accuracy. Notwithstanding this fact, the availability of compute resources is expected to drive/dictate the choice of options by end-users. For instance, analysis of the diabetes datasets [
<xref rid="pone.0142102.ref012" ref-type="bibr">12</xref>
,
<xref rid="pone.0142102.ref013" ref-type="bibr">13</xref>
] (having a cumulative volume of 300–400 gigabytes) is expected to entail huge compute resources and time, and hence usage of option 4 appears to be the logical choice. In spite of some loss in annotation accuracy, the results generated using this option would help in obtaining macro-level profiles (corresponding to various functional aspects) of these metagenomes. Results presented in this study with varied datasets (with all four options) are expected to serve as a guideline for end-users to decide upon an acceptable trade-off between execution time and prediction accuracy based on the compute resources available at their end.</p>
<p>The architectures of existing protein databases (e.g. Pfam, COG, SEED, etc.) are not similar. While the COG database is based on the evolutionary relatedness of genes/proteins from different organisms, the Pfam database contains information pertaining to protein domains and families. The KEGG annotations, in contrast, are employed for estimating metabolic pathways that are functional among the organisms constituting a metagenome. With its cross-mapping database, COGNIZER enables obtaining multiple functional annotations using a single homology search.</p>
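<p>The idea of deriving several annotation types from one COG assignment can be sketched as a simple table lookup. The table contents, identifiers, and function name below are hypothetical placeholders and do not reproduce COGNIZER's actual cross-mapping database or its file format.</p>

```python
from collections import Counter

# Hypothetical cross-mapping table: COG identifier to terms in other
# databases. All identifiers are invented placeholders.
CROSS_MAP = {
    "COG_A": {"KEGG": ["K_1"], "Pfam": ["PF_1", "PF_2"],
              "GO": ["GO_1"], "SEED": ["SS_1"]},
    "COG_B": {"KEGG": ["K_2"], "Pfam": ["PF_1"],
              "GO": ["GO_2"], "SEED": ["SS_1"]},
}

TARGET_DATABASES = ("KEGG", "Pfam", "GO", "SEED")


def annotate(read_to_cog):
    """Expand per-read COG hits into term counts for each target database."""
    profiles = {db: Counter() for db in TARGET_DATABASES}
    for cog in read_to_cog.values():
        # Reads whose COG has no entry in the table contribute nothing.
        for db, terms in CROSS_MAP.get(cog, {}).items():
            profiles[db].update(terms)
    return profiles


# Hypothetical output of a single homology search against COG.
hits = {"read1": "COG_A", "read2": "COG_A",
        "read3": "COG_B", "read4": "COG_unmapped"}

for db, profile in annotate(hits).items():
    print(db, dict(profile))
```

<p>In this sketch the homology search is run once, and every further annotation type is obtained by lookup, which is the property the paragraph above attributes to the cross-mapping database.</p>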
<p>A recently published study [
<xref rid="pone.0142102.ref003" ref-type="bibr">3</xref>
] has proposed an alternative approach (PAUDA) for annotating metagenomic datasets against protein databases. Although PAUDA outperforms all four options (available in the COGNIZER framework) in terms of operational speed, the authors report that the tool achieves an assignment rate of only 33% as compared to BLASTx. The NPV of PAUDA is therefore expected to be very low. In contrast, results obtained with all four options of COGNIZER demonstrate significantly higher NPV values. In addition, the cross-mapping utility in the COGNIZER framework enables end-users to obtain multiple functional annotations (using a single homology search) in a time-efficient manner. The COGNIZER framework therefore provides significant added value to researchers working in the field of metagenomics and metatranscriptomics.</p>
<p>The COGNIZER software has been implemented as a generic framework. In principle, any sequence alignment tool can be integrated within this framework for performing homology searches of query sequences against sequences in the COG database (or its customized variant). In the present implementation, only sequence alignment tools that are compatible with both 32-bit and 64-bit system architectures were included. Given this, the present distribution of COGNIZER does not integrate DIAMOND [
<xref rid="pone.0142102.ref028" ref-type="bibr">28</xref>
]—a recently published homology search tool (with a 64-bit implementation) that can perform sequence alignments at a pace that drastically exceeds any of the tools currently implemented in the COGNIZER framework. In spite of its superior processing speed, experiments performed with
a subset of the
datasets (used for evaluating the performance of COGNIZER) indicated a lower sensitivity/specificity of DIAMOND as compared to that obtained with RAPSearch and/or BLASTx (
<xref rid="pone.0142102.s001" ref-type="supplementary-material">S1 Fig</xref>
). However, as mentioned above, end-users intending to harness the rapid processing speed of DIAMOND can easily integrate this tool into the COGNIZER framework.</p>
</sec>
<sec sec-type="conclusions" id="sec013">
<title>Conclusion</title>
<p>Validation results demonstrate that the COGNIZER framework is capable of comprehensively annotating any metagenomic or metatranscriptomic dataset (from varied sequencing platforms) in functional terms. Multiple search options in COGNIZER provide the flexibility for choosing a homology search protocol based on available compute resources. The cross-mapping database in COGNIZER enables end-users to directly infer/derive Pfam, KEGG, GO, and SEED subsystem annotations from COG categorizations. This cross-mapping greatly increases the utility of COGNIZER. Furthermore, availability of COGNIZER as a stand-alone (scalable) implementation is expected to make it a valuable annotation tool in the field of metagenomic and metatranscriptomic research.</p>
</sec>
<sec sec-type="supplementary-material" id="sec014">
<title>Supporting Information</title>
<supplementary-material content-type="local-data" id="pone.0142102.s001">
<label>S1 Fig</label>
<caption>
<title>Comparison of the performance of RAPSearch and DIAMOND.</title>
<p>Specificity and sensitivity of RAPSearch and DIAMOND relative to BLASTx. The analysis was performed at an e-value cut-off of 0.00001.</p>
<p>(TIF)</p>
</caption>
<media xlink:href="pone.0142102.s001.tif">
<caption>
<p>Click here for additional data file.</p>
</caption>
</media>
</supplementary-material>
</sec>
</body>
<back>
<ack>
<p>Mr. Tungadri Bose is a PhD scholar at the Indian Institute of Technology, Bombay and would like to acknowledge the Institute for its support. We thank Hemang Gandhi for his help in setting up a web server for COGNIZER.</p>
</ack>
<ref-list>
<title>References</title>
<ref id="pone.0142102.ref001">
<label>1</label>
<mixed-citation publication-type="journal">
<name>
<surname>Prakash</surname>
<given-names>T</given-names>
</name>
,
<name>
<surname>Taylor</surname>
<given-names>TD</given-names>
</name>
.
<article-title>Functional assignment of metagenomic data: challenges and applications</article-title>
.
<source>Brief Bioinformatics</source>
.
<year>2012</year>
;
<volume>13</volume>
:
<fpage>711</fpage>
<lpage>727</lpage>
.
<comment>doi:
<ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.1093/bib/bbs033">10.1093/bib/bbs033</ext-link>
</comment>
<pub-id pub-id-type="pmid">22772835</pub-id>
</mixed-citation>
</ref>
<ref id="pone.0142102.ref002">
<label>2</label>
<mixed-citation publication-type="journal">
<name>
<surname>Ye</surname>
<given-names>Y</given-names>
</name>
,
<name>
<surname>Choi</surname>
<given-names>J-H</given-names>
</name>
,
<name>
<surname>Tang</surname>
<given-names>H</given-names>
</name>
.
<article-title>RAPSearch: a fast protein similarity search tool for short reads</article-title>
.
<source>BMC Bioinformatics</source>
.
<year>2011</year>
;
<volume>12</volume>
:
<fpage>159</fpage>
<comment>doi:
<ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.1186/1471-2105-12-159">10.1186/1471-2105-12-159</ext-link>
</comment>
<pub-id pub-id-type="pmid">21575167</pub-id>
</mixed-citation>
</ref>
<ref id="pone.0142102.ref003">
<label>3</label>
<mixed-citation publication-type="journal">
<name>
<surname>Huson</surname>
<given-names>DH</given-names>
</name>
,
<name>
<surname>Xie</surname>
<given-names>C</given-names>
</name>
.
<article-title>A poor man’s BLASTX—high-throughput metagenomic protein database search using PAUDA</article-title>
.
<source>Bioinformatics</source>
.
<year>2014</year>
;
<volume>30</volume>
:
<fpage>38</fpage>
<lpage>39</lpage>
.
<comment>doi:
<ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.1093/bioinformatics/btt254">10.1093/bioinformatics/btt254</ext-link>
</comment>
<pub-id pub-id-type="pmid">23658416</pub-id>
</mixed-citation>
</ref>
<ref id="pone.0142102.ref004">
<label>4</label>
<mixed-citation publication-type="journal">
<name>
<surname>Huson</surname>
<given-names>DH</given-names>
</name>
,
<name>
<surname>Mitra</surname>
<given-names>S</given-names>
</name>
.
<article-title>Introduction to the analysis of environmental sequences: metagenomics with MEGAN</article-title>
.
<source>Methods Mol Biol</source>
.
<year>2012</year>
;
<volume>856</volume>
:
<fpage>415</fpage>
<lpage>429</lpage>
.
<comment>doi:
<ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.1007/978-1-61779-585-5_17">10.1007/978-1-61779-585-5_17</ext-link>
</comment>
<pub-id pub-id-type="pmid">22399469</pub-id>
</mixed-citation>
</ref>
<ref id="pone.0142102.ref005">
<label>5</label>
<mixed-citation publication-type="journal">
<name>
<surname>Sanli</surname>
<given-names>K</given-names>
</name>
,
<name>
<surname>Karlsson</surname>
<given-names>FH</given-names>
</name>
,
<name>
<surname>Nookaew</surname>
<given-names>I</given-names>
</name>
,
<name>
<surname>Nielsen</surname>
<given-names>J</given-names>
</name>
.
<article-title>FANTOM: Functional and taxonomic analysis of metagenomes</article-title>
.
<source>BMC Bioinformatics</source>
.
<year>2013</year>
;
<volume>14</volume>
:
<fpage>38</fpage>
<comment>doi:
<ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.1186/1471-2105-14-38">10.1186/1471-2105-14-38</ext-link>
</comment>
<pub-id pub-id-type="pmid">23375020</pub-id>
</mixed-citation>
</ref>
<ref id="pone.0142102.ref006">
<label>6</label>
<mixed-citation publication-type="journal">
<name>
<surname>Yu</surname>
<given-names>K</given-names>
</name>
,
<name>
<surname>Zhang</surname>
<given-names>T</given-names>
</name>
.
<article-title>Construction of customized sub-databases from NCBI-nr database for rapid annotation of huge metagenomic datasets using a combined BLAST and MEGAN approach</article-title>
.
<source>PLoS ONE</source>
.
<year>2013</year>
;
<volume>8</volume>
:
<fpage>e59831</fpage>
<comment>doi:
<ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.1371/journal.pone.0059831">10.1371/journal.pone.0059831</ext-link>
</comment>
<pub-id pub-id-type="pmid">23573212</pub-id>
</mixed-citation>
</ref>
<ref id="pone.0142102.ref007">
<label>7</label>
<mixed-citation publication-type="journal">
<name>
<surname>Meyer</surname>
<given-names>F</given-names>
</name>
,
<name>
<surname>Paarmann</surname>
<given-names>D</given-names>
</name>
,
<name>
<surname>D’Souza</surname>
<given-names>M</given-names>
</name>
,
<name>
<surname>Olson</surname>
<given-names>R</given-names>
</name>
,
<name>
<surname>Glass</surname>
<given-names>EM</given-names>
</name>
,
<name>
<surname>Kubal</surname>
<given-names>M</given-names>
</name>
,
<etal>et al</etal>
<article-title>The metagenomics RAST server—a public resource for the automatic phylogenetic and functional analysis of metagenomes</article-title>
.
<source>BMC Bioinformatics</source>
.
<year>2008</year>
;
<volume>9</volume>
:
<fpage>386</fpage>
<comment>doi:
<ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.1186/1471-2105-9-386">10.1186/1471-2105-9-386</ext-link>
</comment>
<pub-id pub-id-type="pmid">18803844</pub-id>
</mixed-citation>
</ref>
<ref id="pone.0142102.ref008">
<label>8</label>
<mixed-citation publication-type="journal">
<name>
<surname>Goll</surname>
<given-names>J</given-names>
</name>
,
<name>
<surname>Rusch</surname>
<given-names>DB</given-names>
</name>
,
<name>
<surname>Tanenbaum</surname>
<given-names>DM</given-names>
</name>
,
<name>
<surname>Thiagarajan</surname>
<given-names>M</given-names>
</name>
,
<name>
<surname>Li</surname>
<given-names>K</given-names>
</name>
,
<name>
<surname>Methé</surname>
<given-names>BA</given-names>
</name>
,
<etal>et al</etal>
<article-title>METAREP: JCVI metagenomics reports—an open source tool for high-performance comparative metagenomics</article-title>
.
<source>Bioinformatics</source>
.
<year>2010</year>
;
<volume>26</volume>
:
<fpage>2631</fpage>
<lpage>2632</lpage>
.
<comment>doi:
<ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.1093/bioinformatics/btq455">10.1093/bioinformatics/btq455</ext-link>
</comment>
<pub-id pub-id-type="pmid">20798169</pub-id>
</mixed-citation>
</ref>
<ref id="pone.0142102.ref009">
<label>9</label>
<mixed-citation publication-type="journal">
<name>
<surname>Sun</surname>
<given-names>S</given-names>
</name>
,
<name>
<surname>Chen</surname>
<given-names>J</given-names>
</name>
,
<name>
<surname>Li</surname>
<given-names>W</given-names>
</name>
,
<name>
<surname>Altintas</surname>
<given-names>I</given-names>
</name>
,
<name>
<surname>Lin</surname>
<given-names>A</given-names>
</name>
,
<name>
<surname>Peltier</surname>
<given-names>S</given-names>
</name>
,
<etal>et al</etal>
<article-title>Community cyberinfrastructure for Advanced Microbial Ecology Research and Analysis: the CAMERA resource</article-title>
.
<source>Nucleic Acids Res</source>
.
<year>2011</year>
;
<volume>39</volume>
:
<fpage>D546</fpage>
<lpage>551</lpage>
.
<comment>doi:
<ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.1093/nar/gkq1102">10.1093/nar/gkq1102</ext-link>
</comment>
<pub-id pub-id-type="pmid">21045053</pub-id>
</mixed-citation>
</ref>
<ref id="pone.0142102.ref010">
<label>10</label>
<mixed-citation publication-type="journal">
<name>
<surname>Lingner</surname>
<given-names>T</given-names>
</name>
,
<name>
<surname>Asshauer</surname>
<given-names>KP</given-names>
</name>
,
<name>
<surname>Schreiber</surname>
<given-names>F</given-names>
</name>
,
<name>
<surname>Meinicke</surname>
<given-names>P</given-names>
</name>
.
<article-title>CoMet—a web server for comparative functional profiling of metagenomes</article-title>
.
<source>Nucleic Acids Res</source>
.
<year>2011</year>
;
<volume>39</volume>
:
<fpage>W518</fpage>
<lpage>523</lpage>
.
<comment>doi:
<ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.1093/nar/gkr388">10.1093/nar/gkr388</ext-link>
</comment>
<pub-id pub-id-type="pmid">21622656</pub-id>
</mixed-citation>
</ref>
<ref id="pone.0142102.ref011">
<label>11</label>
<mixed-citation publication-type="journal">
<name>
<surname>Markowitz</surname>
<given-names>VM</given-names>
</name>
,
<name>
<surname>Chen</surname>
<given-names>I-MA</given-names>
</name>
,
<name>
<surname>Chu</surname>
<given-names>K</given-names>
</name>
,
<name>
<surname>Szeto</surname>
<given-names>E</given-names>
</name>
,
<name>
<surname>Palaniappan</surname>
<given-names>K</given-names>
</name>
,
<name>
<surname>Grechkin</surname>
<given-names>Y</given-names>
</name>
,
<etal>et al</etal>
<article-title>IMG/M: the integrated metagenome data management and comparative analysis system</article-title>
.
<source>Nucleic Acids Res</source>
.
<year>2012</year>
;
<volume>40</volume>
:
<fpage>D123</fpage>
<lpage>D129</lpage>
.
<comment>doi:
<ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.1093/nar/gkr975">10.1093/nar/gkr975</ext-link>
</comment>
<pub-id pub-id-type="pmid">22086953</pub-id>
</mixed-citation>
</ref>
<ref id="pone.0142102.ref012">
<label>12</label>
<mixed-citation publication-type="journal">
<name>
<surname>Qin</surname>
<given-names>J</given-names>
</name>
,
<name>
<surname>Li</surname>
<given-names>Y</given-names>
</name>
,
<name>
<surname>Cai</surname>
<given-names>Z</given-names>
</name>
,
<name>
<surname>Li</surname>
<given-names>S</given-names>
</name>
,
<name>
<surname>Zhu</surname>
<given-names>J</given-names>
</name>
,
<name>
<surname>Zhang</surname>
<given-names>F</given-names>
</name>
,
<etal>et al</etal>
<article-title>A metagenome-wide association study of gut microbiota in type 2 diabetes</article-title>
.
<source>Nature</source>
.
<year>2012</year>
;
<volume>490</volume>
:
<fpage>55</fpage>
<lpage>60</lpage>
.
<comment>doi:
<ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.1038/nature11450">10.1038/nature11450</ext-link>
</comment>
<pub-id pub-id-type="pmid">23023125</pub-id>
</mixed-citation>
</ref>
<ref id="pone.0142102.ref013">
<label>13</label>
<mixed-citation publication-type="journal">
<name>
<surname>Karlsson</surname>
<given-names>FH</given-names>
</name>
,
<name>
<surname>Tremaroli</surname>
<given-names>V</given-names>
</name>
,
<name>
<surname>Nookaew</surname>
<given-names>I</given-names>
</name>
,
<name>
<surname>Bergström</surname>
<given-names>G</given-names>
</name>
,
<name>
<surname>Behre</surname>
<given-names>CJ</given-names>
</name>
,
<name>
<surname>Fagerberg</surname>
<given-names>B</given-names>
</name>
,
<etal>et al</etal>
<article-title>Gut metagenome in European women with normal, impaired and diabetic glucose control</article-title>
.
<source>Nature</source>
.
<year>2013</year>
;
<volume>498</volume>
:
<fpage>99</fpage>
<lpage>103</lpage>
.
<comment>doi:
<ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.1038/nature12198">10.1038/nature12198</ext-link>
</comment>
<pub-id pub-id-type="pmid">23719380</pub-id>
</mixed-citation>
</ref>
<ref id="pone.0142102.ref014">
<label>14</label>
<mixed-citation publication-type="journal">
<name>
<surname>Tatusov</surname>
<given-names>RL</given-names>
</name>
,
<name>
<surname>Koonin</surname>
<given-names>EV</given-names>
</name>
,
<name>
<surname>Lipman</surname>
<given-names>DJ</given-names>
</name>
.
<article-title>A genomic perspective on protein families</article-title>
.
<source>Science</source>
.
<year>1997</year>
;
<volume>278</volume>
:
<fpage>631</fpage>
<lpage>637</lpage>
.
<pub-id pub-id-type="pmid">9381173</pub-id>
</mixed-citation>
</ref>
<ref id="pone.0142102.ref015">
<label>15</label>
<mixed-citation publication-type="journal">
<name>
<surname>Kanehisa</surname>
<given-names>M</given-names>
</name>
,
<name>
<surname>Goto</surname>
<given-names>S</given-names>
</name>
.
<article-title>KEGG: kyoto encyclopedia of genes and genomes</article-title>
.
<source>Nucleic Acids Res</source>
.
<year>2000</year>
;
<volume>28</volume>
:
<fpage>27</fpage>
<lpage>30</lpage>
.
<pub-id pub-id-type="pmid">10592173</pub-id>
</mixed-citation>
</ref>
<ref id="pone.0142102.ref016">
<label>16</label>
<mixed-citation publication-type="journal">
<name>
<surname>Kanehisa</surname>
<given-names>M</given-names>
</name>
,
<name>
<surname>Goto</surname>
<given-names>S</given-names>
</name>
,
<name>
<surname>Sato</surname>
<given-names>Y</given-names>
</name>
,
<name>
<surname>Kawashima</surname>
<given-names>M</given-names>
</name>
,
<name>
<surname>Furumichi</surname>
<given-names>M</given-names>
</name>
,
<name>
<surname>Tanabe</surname>
<given-names>M</given-names>
</name>
.
<article-title>Data, information, knowledge and principle: back to metabolism in KEGG</article-title>
.
<source>Nucleic Acids Res</source>
.
<year>2014</year>
;
<volume>42</volume>
:
<fpage>D199</fpage>
<lpage>205</lpage>
.
<comment>doi:
<ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.1093/nar/gkt1076">10.1093/nar/gkt1076</ext-link>
</comment>
<pub-id pub-id-type="pmid">24214961</pub-id>
</mixed-citation>
</ref>
<ref id="pone.0142102.ref017">
<label>17</label>
<mixed-citation publication-type="journal">
<name>
<surname>Overbeek</surname>
<given-names>R</given-names>
</name>
,
<name>
<surname>Begley</surname>
<given-names>T</given-names>
</name>
,
<name>
<surname>Butler</surname>
<given-names>RM</given-names>
</name>
,
<name>
<surname>Choudhuri</surname>
<given-names>JV</given-names>
</name>
,
<name>
<surname>Chuang</surname>
<given-names>H-Y</given-names>
</name>
,
<name>
<surname>Cohoon</surname>
<given-names>M</given-names>
</name>
,
<etal>et al</etal>
<article-title>The subsystems approach to genome annotation and its use in the project to annotate 1000 genomes</article-title>
.
<source>Nucleic Acids Res</source>
.
<year>2005</year>
;
<volume>33</volume>
:
<fpage>5691</fpage>
<lpage>5702</lpage>
.
<comment>doi:
<ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.1093/nar/gki866">10.1093/nar/gki866</ext-link>
</comment>
<pub-id pub-id-type="pmid">16214803</pub-id>
</mixed-citation>
</ref>
<ref id="pone.0142102.ref018">
<label>18</label>
<mixed-citation publication-type="journal">
<collab>Reference Genome Group of the Gene Ontology Consortium</collab>
.
<article-title>The Gene Ontology’s Reference Genome Project: a unified framework for functional annotation across species</article-title>
.
<source>PLoS Comput Biol</source>
.
<year>2009</year>
;
<volume>5</volume>
:
<fpage>e1000431</fpage>
<comment>doi:
<ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.1371/journal.pcbi.1000431">10.1371/journal.pcbi.1000431</ext-link>
</comment>
<pub-id pub-id-type="pmid">19578431</pub-id>
</mixed-citation>
</ref>
<ref id="pone.0142102.ref019">
<label>19</label>
<mixed-citation publication-type="journal">
<name>
<surname>Punta</surname>
<given-names>M</given-names>
</name>
,
<name>
<surname>Coggill</surname>
<given-names>PC</given-names>
</name>
,
<name>
<surname>Eberhardt</surname>
<given-names>RY</given-names>
</name>
,
<name>
<surname>Mistry</surname>
<given-names>J</given-names>
</name>
,
<name>
<surname>Tate</surname>
<given-names>J</given-names>
</name>
,
<name>
<surname>Boursnell</surname>
<given-names>C</given-names>
</name>
,
<etal>et al</etal>
<article-title>The Pfam protein families database</article-title>
.
<source>Nucleic Acids Res</source>
.
<year>2012</year>
;
<volume>40</volume>
:
<fpage>D290</fpage>
<lpage>301</lpage>
.
<comment>doi:
<ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.1093/nar/gkr1065">10.1093/nar/gkr1065</ext-link>
</comment>
<pub-id pub-id-type="pmid">22127870</pub-id>
</mixed-citation>
</ref>
<ref id="pone.0142102.ref020">
<label>20</label>
<mixed-citation publication-type="journal">
<name>
<surname>Goujon</surname>
<given-names>M</given-names>
</name>
,
<name>
<surname>McWilliam</surname>
<given-names>H</given-names>
</name>
,
<name>
<surname>Li</surname>
<given-names>W</given-names>
</name>
,
<name>
<surname>Valentin</surname>
<given-names>F</given-names>
</name>
,
<name>
<surname>Squizzato</surname>
<given-names>S</given-names>
</name>
,
<name>
<surname>Paern</surname>
<given-names>J</given-names>
</name>
,
<etal>et al</etal>
<article-title>A new bioinformatics analysis tools framework at EMBL-EBI</article-title>
.
<source>Nucleic Acids Res</source>
.
<year>2010</year>
;
<volume>38</volume>
:
<fpage>W695</fpage>
<lpage>699</lpage>
.
<comment>doi:
<ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.1093/nar/gkq313">10.1093/nar/gkq313</ext-link>
</comment>
<pub-id pub-id-type="pmid">20439314</pub-id>
</mixed-citation>
</ref>
<ref id="pone.0142102.ref021">
<label>21</label>
<mixed-citation publication-type="journal">
<name>
<surname>Yamada</surname>
<given-names>T</given-names>
</name>
,
<name>
<surname>Letunic</surname>
<given-names>I</given-names>
</name>
,
<name>
<surname>Okuda</surname>
<given-names>S</given-names>
</name>
,
<name>
<surname>Kanehisa</surname>
<given-names>M</given-names>
</name>
,
<name>
<surname>Bork</surname>
<given-names>P</given-names>
</name>
.
<article-title>iPath2.0: interactive pathway explorer</article-title>
.
<source>Nucleic Acids Res</source>
.
<year>2011</year>
;
<volume>39</volume>
:
<fpage>W412</fpage>
<lpage>415</lpage>
.
<comment>doi:
<ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.1093/nar/gkr313">10.1093/nar/gkr313</ext-link>
</comment>
<pub-id pub-id-type="pmid">21546551</pub-id>
</mixed-citation>
</ref>
<ref id="pone.0142102.ref022">
<label>22</label>
<mixed-citation publication-type="journal">
<name>
<surname>Eddy</surname>
<given-names>SR</given-names>
</name>
.
<article-title>Accelerated Profile HMM Searches</article-title>
.
<source>PLoS Comput Biol</source>
.
<year>2011</year>
;
<volume>7</volume>
:
<fpage>e1002195</fpage>
<comment>doi:
<ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.1371/journal.pcbi.1002195">10.1371/journal.pcbi.1002195</ext-link>
</comment>
<pub-id pub-id-type="pmid">22039361</pub-id>
</mixed-citation>
</ref>
<ref id="pone.0142102.ref023">
<label>23</label>
<mixed-citation publication-type="journal">
<name>
<surname>Dinsdale</surname>
<given-names>EA</given-names>
</name>
,
<name>
<surname>Edwards</surname>
<given-names>RA</given-names>
</name>
,
<name>
<surname>Hall</surname>
<given-names>D</given-names>
</name>
,
<name>
<surname>Angly</surname>
<given-names>F</given-names>
</name>
,
<name>
<surname>Breitbart</surname>
<given-names>M</given-names>
</name>
,
<name>
<surname>Brulc</surname>
<given-names>JM</given-names>
</name>
,
<etal>et al</etal>
<article-title>Functional metagenomic profiling of nine biomes</article-title>
.
<source>Nature</source>
.
<year>2008</year>
;
<volume>452</volume>
:
<fpage>629</fpage>
<lpage>632</lpage>
.
<comment>doi:
<ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.1038/nature06810">10.1038/nature06810</ext-link>
</comment>
<pub-id pub-id-type="pmid">18337718</pub-id>
</mixed-citation>
</ref>
<ref id="pone.0142102.ref024">
<label>24</label>
<mixed-citation publication-type="journal">
<name>
<surname>Warren</surname>
<given-names>RL</given-names>
</name>
,
<name>
<surname>Freeman</surname>
<given-names>DJ</given-names>
</name>
,
<name>
<surname>Pleasance</surname>
<given-names>S</given-names>
</name>
,
<name>
<surname>Watson</surname>
<given-names>P</given-names>
</name>
,
<name>
<surname>Moore</surname>
<given-names>RA</given-names>
</name>
,
<name>
<surname>Cochrane</surname>
<given-names>K</given-names>
</name>
,
<etal>et al</etal>
<article-title>Co-occurrence of anaerobic bacteria in colorectal carcinomas</article-title>
.
<source>Microbiome</source>
.
<year>2013</year>
;
<volume>1</volume>
:
<fpage>16</fpage>
<comment>doi:
<ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.1186/2049-2618-1-16">10.1186/2049-2618-1-16</ext-link>
</comment>
<pub-id pub-id-type="pmid">24450771</pub-id>
</mixed-citation>
</ref>
<ref id="pone.0142102.ref025">
<label>25</label>
<mixed-citation publication-type="journal">
<name>
<surname>Gupta</surname>
<given-names>SS</given-names>
</name>
,
<name>
<surname>Mohammed</surname>
<given-names>MH</given-names>
</name>
,
<name>
<surname>Ghosh</surname>
<given-names>TS</given-names>
</name>
,
<name>
<surname>Kanungo</surname>
<given-names>S</given-names>
</name>
,
<name>
<surname>Nair</surname>
<given-names>GB</given-names>
</name>
,
<name>
<surname>Mande</surname>
<given-names>SS</given-names>
</name>
.
<article-title>Metagenome of the gut of a malnourished child</article-title>
.
<source>Gut Pathog</source>
.
<year>2011</year>
;
<volume>3</volume>
:
<fpage>7</fpage>
<comment>doi:
<ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.1186/1757-4749-3-7">10.1186/1757-4749-3-7</ext-link>
</comment>
<pub-id pub-id-type="pmid">21599906</pub-id>
</mixed-citation>
</ref>
<ref id="pone.0142102.ref026">
<label>26</label>
<mixed-citation publication-type="journal">
<name>
<surname>Belda-Ferre</surname>
<given-names>P</given-names>
</name>
,
<name>
<surname>Alcaraz</surname>
<given-names>LD</given-names>
</name>
,
<name>
<surname>Cabrera-Rubio</surname>
<given-names>R</given-names>
</name>
,
<name>
<surname>Romero</surname>
<given-names>H</given-names>
</name>
,
<name>
<surname>Simón-Soro</surname>
<given-names>A</given-names>
</name>
,
<name>
<surname>Pignatelli</surname>
<given-names>M</given-names>
</name>
,
<etal>et al</etal>
<article-title>The oral metagenome in health and disease</article-title>
.
<source>ISME J</source>
.
<year>2012</year>
;
<volume>6</volume>
:
<fpage>46</fpage>
<lpage>56</lpage>
.
<comment>doi:
<ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.1038/ismej.2011.85">10.1038/ismej.2011.85</ext-link>
</comment>
<pub-id pub-id-type="pmid">21716308</pub-id>
</mixed-citation>
</ref>
<ref id="pone.0142102.ref027">
<label>27</label>
<mixed-citation publication-type="journal">
<name>
<surname>Tyson</surname>
<given-names>GW</given-names>
</name>
,
<name>
<surname>Chapman</surname>
<given-names>J</given-names>
</name>
,
<name>
<surname>Hugenholtz</surname>
<given-names>P</given-names>
</name>
,
<name>
<surname>Allen</surname>
<given-names>EE</given-names>
</name>
,
<name>
<surname>Ram</surname>
<given-names>RJ</given-names>
</name>
,
<name>
<surname>Richardson</surname>
<given-names>PM</given-names>
</name>
,
<etal>et al</etal>
<article-title>Community structure and metabolism through reconstruction of microbial genomes from the environment</article-title>
.
<source>Nature</source>
.
<year>2004</year>
;
<volume>428</volume>
:
<fpage>37</fpage>
<lpage>43</lpage>
.
<comment>doi:
<ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.1038/nature02340">10.1038/nature02340</ext-link>
</comment>
<pub-id pub-id-type="pmid">14961025</pub-id>
</mixed-citation>
</ref>
<ref id="pone.0142102.ref028">
<label>28</label>
<mixed-citation publication-type="journal">
<name>
<surname>Buchfink</surname>
<given-names>B</given-names>
</name>
,
<name>
<surname>Xie</surname>
<given-names>C</given-names>
</name>
,
<name>
<surname>Huson</surname>
<given-names>DH</given-names>
</name>
.
<article-title>Fast and sensitive protein alignment using DIAMOND</article-title>
.
<source>Nat Meth</source>
.
<year>2015</year>
;
<volume>12</volume>
:
<fpage>59</fpage>
<lpage>60</lpage>
.
<comment>doi:
<ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.1038/nmeth.3176">10.1038/nmeth.3176</ext-link>
</comment>
</mixed-citation>
</ref>
</ref-list>
</back>
</pmc>
</record>

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Ticri/CIDE/explor/CyberinfraV1/Data/Pmc/Corpus
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000082 | SxmlIndent | more

Or

HfdSelect -h $EXPLOR_AREA/Data/Pmc/Corpus/biblio.hfd -nk 000082 | SxmlIndent | more

To add a link to this page within the Wicri network

{{Explor lien
   |wiki=    Ticri/CIDE
   |area=    CyberinfraV1
   |flux=    Pmc
   |étape=   Corpus
   |type=    RBID
   |clé=     PMC:4641738
   |texte=   COGNIZER: A Framework for Functional Annotation of Metagenomic Datasets
}}

To generate wiki pages

HfdIndexSelect -h $EXPLOR_AREA/Data/Pmc/Corpus/RBID.i   -Sk "pubmed:26561344" \
       | HfdSelect -Kh $EXPLOR_AREA/Data/Pmc/Corpus/biblio.hfd   \
       | NlmPubMed2Wicri -a CyberinfraV1 

Wicri

This area was generated with Dilib version V0.6.25.
Data generation: Thu Oct 27 09:30:58 2016. Site generation: Sun Mar 10 23:08:40 2024