Cyberinfrastructure exploration server

Warning: this site is under development!
Warning: this site is generated automatically from raw corpora.
The information is therefore not validated.

MetaPathways: a modular pipeline for constructing pathway/genome databases from environmental sequence information

Internal identifier: 000274 (Pmc/Corpus); previous: 000273; next: 000275

Authors: Kishori M. Konwar; Niels W. Hanson; Antoine P. Pagé; Steven J. Hallam

RBID : PMC:3695837

Abstract

Background

A central challenge to understanding the ecological and biogeochemical roles of microorganisms in natural and human-engineered ecosystems is the reconstruction of metabolic interaction networks from environmental sequence information. The dominant paradigm in metabolic reconstruction is to assign functional annotations using BLAST. Functional annotations are then projected onto symbolic representations of metabolism in the form of KEGG pathways or SEED subsystems.

Results

Here we present MetaPathways, an open source pipeline for pathway inference that uses the PathoLogic algorithm to map functional annotations onto the MetaCyc collection of reactions and pathways, and construct environmental Pathway/Genome Databases (ePGDBs) compatible with the editing and navigation features of Pathway Tools. The pipeline accepts assembled or unassembled nucleotide sequences, performs quality assessment and control, predicts and annotates noncoding genes and open reading frames, and produces inputs to PathoLogic. In addition to constructing ePGDBs, MetaPathways uses MLTreeMap to build phylogenetic trees for selected taxonomic anchor and functional gene markers, converts General Feature Format (GFF) files into concatenated GenBank files for ePGDB construction based on third-party annotations, and generates useful file formats including Sequin files for direct GenBank submission and gene feature tables summarizing annotations, MLTreeMap trees, and ePGDB pathway coverage summaries for statistical comparisons.

Conclusions

MetaPathways provides users with a modular annotation and analysis pipeline for predicting metabolic interaction networks from environmental sequence information using an alternative to KEGG pathways and SEED subsystems mapping. It is extensible to genomic and transcriptomic datasets from a wide range of sequencing platforms, and generates useful data products for microbial community structure and function analysis. The MetaPathways software package, installation instructions, and example data can be obtained from http://hallam.microbiology.ubc.ca/MetaPathways.


DOI: 10.1186/1471-2105-14-202
PubMed: 23800136
PubMed Central: 3695837

The document in XML format

<record>
<TEI>
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en">MetaPathways: a modular pipeline for constructing pathway/genome databases from environmental sequence information</title>
<author>
<name sortKey="Konwar, Kishori M" sort="Konwar, Kishori M" uniqKey="Konwar K" first="Kishori M" last="Konwar">Kishori M. Konwar</name>
<affiliation>
<nlm:aff id="I1">Department of Microbiology &amp; Immunology, University of British Columbia, Vancouver, BC V6T1Z3, Canada</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Hanson, Niels W" sort="Hanson, Niels W" uniqKey="Hanson N" first="Niels W" last="Hanson">Niels W. Hanson</name>
<affiliation>
<nlm:aff id="I2">Graduate Program in Bioinformatics, University of British Columbia, Vancouver, BC Canada</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Page, Antoine P" sort="Page, Antoine P" uniqKey="Page A" first="Antoine P" last="Pagé">Antoine P. Pagé</name>
<affiliation>
<nlm:aff id="I1">Department of Microbiology &amp; Immunology, University of British Columbia, Vancouver, BC V6T1Z3, Canada</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Hallam, Steven J" sort="Hallam, Steven J" uniqKey="Hallam S" first="Steven J" last="Hallam">Steven J. Hallam</name>
<affiliation>
<nlm:aff id="I1">Department of Microbiology &amp; Immunology, University of British Columbia, Vancouver, BC V6T1Z3, Canada</nlm:aff>
</affiliation>
<affiliation>
<nlm:aff id="I2">Graduate Program in Bioinformatics, University of British Columbia, Vancouver, BC Canada</nlm:aff>
</affiliation>
</author>
</titleStmt>
<publicationStmt>
<idno type="wicri:source">PMC</idno>
<idno type="pmid">23800136</idno>
<idno type="pmc">3695837</idno>
<idno type="url">http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3695837</idno>
<idno type="RBID">PMC:3695837</idno>
<idno type="doi">10.1186/1471-2105-14-202</idno>
<date when="2013">2013</date>
<idno type="wicri:Area/Pmc/Corpus">000274</idno>
</publicationStmt>
<sourceDesc>
<biblStruct>
<analytic>
<title xml:lang="en" level="a" type="main">MetaPathways: a modular pipeline for constructing pathway/genome databases from environmental sequence information</title>
<author>
<name sortKey="Konwar, Kishori M" sort="Konwar, Kishori M" uniqKey="Konwar K" first="Kishori M" last="Konwar">Kishori M. Konwar</name>
<affiliation>
<nlm:aff id="I1">Department of Microbiology &amp; Immunology, University of British Columbia, Vancouver, BC V6T1Z3, Canada</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Hanson, Niels W" sort="Hanson, Niels W" uniqKey="Hanson N" first="Niels W" last="Hanson">Niels W. Hanson</name>
<affiliation>
<nlm:aff id="I2">Graduate Program in Bioinformatics, University of British Columbia, Vancouver, BC Canada</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Page, Antoine P" sort="Page, Antoine P" uniqKey="Page A" first="Antoine P" last="Pagé">Antoine P. Pagé</name>
<affiliation>
<nlm:aff id="I1">Department of Microbiology &amp; Immunology, University of British Columbia, Vancouver, BC V6T1Z3, Canada</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Hallam, Steven J" sort="Hallam, Steven J" uniqKey="Hallam S" first="Steven J" last="Hallam">Steven J. Hallam</name>
<affiliation>
<nlm:aff id="I1">Department of Microbiology &amp; Immunology, University of British Columbia, Vancouver, BC V6T1Z3, Canada</nlm:aff>
</affiliation>
<affiliation>
<nlm:aff id="I2">Graduate Program in Bioinformatics, University of British Columbia, Vancouver, BC Canada</nlm:aff>
</affiliation>
</author>
</analytic>
<series>
<title level="j">BMC Bioinformatics</title>
<idno type="eISSN">1471-2105</idno>
<imprint>
<date when="2013">2013</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
</fileDesc>
<profileDesc>
<textClass></textClass>
</profileDesc>
</teiHeader>
<front>
<div type="abstract" xml:lang="en">
<sec>
<title>Background</title>
<p>A central challenge to understanding the ecological and biogeochemical roles of microorganisms in natural and human-engineered ecosystems is the reconstruction of metabolic interaction networks from environmental sequence information. The dominant paradigm in metabolic reconstruction is to assign functional annotations using BLAST. Functional annotations are then projected onto symbolic representations of metabolism in the form of KEGG pathways or SEED subsystems.</p>
</sec>
<sec>
<title>Results</title>
<p>Here we present MetaPathways, an open source pipeline for pathway inference that uses the PathoLogic algorithm to map functional annotations onto the MetaCyc collection of reactions and pathways, and construct environmental Pathway/Genome Databases (ePGDBs) compatible with the editing and navigation features of Pathway Tools. The pipeline accepts assembled or unassembled nucleotide sequences, performs quality assessment and control, predicts and annotates noncoding genes and open reading frames, and produces inputs to PathoLogic. In addition to constructing ePGDBs, MetaPathways uses MLTreeMap to build phylogenetic trees for selected taxonomic anchor and functional gene markers, converts General Feature Format (GFF) files into concatenated GenBank files for ePGDB construction based on third-party annotations, and generates useful file formats including Sequin files for direct GenBank submission and gene feature tables summarizing annotations, MLTreeMap trees, and ePGDB pathway coverage summaries for statistical comparisons.</p>
</sec>
<sec>
<title>Conclusions</title>
<p>MetaPathways provides users with a modular annotation and analysis pipeline for predicting metabolic interaction networks from environmental sequence information using an alternative to KEGG pathways and SEED subsystems mapping. It is extensible to genomic and transcriptomic datasets from a wide range of sequencing platforms, and generates useful data products for microbial community structure and function analysis. The MetaPathways software package, installation instructions, and example data can be obtained from
<ext-link ext-link-type="uri" xlink:href="http://hallam.microbiology.ubc.ca/MetaPathways">http://hallam.microbiology.ubc.ca/MetaPathways</ext-link>
.</p>
</sec>
</div>
</front>
<back>
<div1 type="bibliography">
<listBibl>
<biblStruct>
<analytic>
<author>
<name sortKey="Wright, Jj" uniqKey="Wright J">JJ Wright</name>
</author>
<author>
<name sortKey="Konwar, Km" uniqKey="Konwar K">KM Konwar</name>
</author>
<author>
<name sortKey="Hallam, Sj" uniqKey="Hallam S">SJ Hallam</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Delong, Ef" uniqKey="Delong E">EF Delong</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Falkowski, Pg" uniqKey="Falkowski P">PG Falkowski</name>
</author>
<author>
<name sortKey="Fenchel, T" uniqKey="Fenchel T">T Fenchel</name>
</author>
<author>
<name sortKey="Delong, Ef" uniqKey="Delong E">EF Delong</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Kunin, V" uniqKey="Kunin V">V Kunin</name>
</author>
<author>
<name sortKey="Copeland, A" uniqKey="Copeland A">A Copeland</name>
</author>
<author>
<name sortKey="Lapidus, A" uniqKey="Lapidus A">A Lapidus</name>
</author>
<author>
<name sortKey="Mavromatis, K" uniqKey="Mavromatis K">K Mavromatis</name>
</author>
<author>
<name sortKey="Hugenholtz, P" uniqKey="Hugenholtz P">P Hugenholtz</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Mavromatis, K" uniqKey="Mavromatis K">K Mavromatis</name>
</author>
<author>
<name sortKey="Ivanova, N" uniqKey="Ivanova N">N Ivanova</name>
</author>
<author>
<name sortKey="Barry, K" uniqKey="Barry K">K Barry</name>
</author>
<author>
<name sortKey="Shapiro, H" uniqKey="Shapiro H">H Shapiro</name>
</author>
<author>
<name sortKey="Goltsman, E" uniqKey="Goltsman E">E Goltsman</name>
</author>
<author>
<name sortKey="Mchardy, Ac" uniqKey="Mchardy A">AC McHardy</name>
</author>
<author>
<name sortKey="Rigoutsos, I" uniqKey="Rigoutsos I">I Rigoutsos</name>
</author>
<author>
<name sortKey="Salamov, A" uniqKey="Salamov A">A Salamov</name>
</author>
<author>
<name sortKey="Korzeniewski, F" uniqKey="Korzeniewski F">F Korzeniewski</name>
</author>
<author>
<name sortKey="Land, M" uniqKey="Land M">M Land</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Wooley, Jc" uniqKey="Wooley J">JC Wooley</name>
</author>
<author>
<name sortKey="Godzik, A" uniqKey="Godzik A">A Godzik</name>
</author>
<author>
<name sortKey="Friedberg, I" uniqKey="Friedberg I">I Friedberg</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Altschul, Sf" uniqKey="Altschul S">SF Altschul</name>
</author>
<author>
<name sortKey="Gish, W" uniqKey="Gish W">W Gish</name>
</author>
<author>
<name sortKey="Miller, W" uniqKey="Miller W">W Miller</name>
</author>
<author>
<name sortKey="Myers, Ew" uniqKey="Myers E">EW Myers</name>
</author>
<author>
<name sortKey="Lipman, Dj" uniqKey="Lipman D">DJ Lipman</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Kanehisa, M" uniqKey="Kanehisa M">M Kanehisa</name>
</author>
<author>
<name sortKey="Goto, S" uniqKey="Goto S">S Goto</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Okuda, S" uniqKey="Okuda S">S Okuda</name>
</author>
<author>
<name sortKey="Yamada, T" uniqKey="Yamada T">T Yamada</name>
</author>
<author>
<name sortKey="Hamajima, M" uniqKey="Hamajima M">M Hamajima</name>
</author>
<author>
<name sortKey="Itoh, M" uniqKey="Itoh M">M Itoh</name>
</author>
<author>
<name sortKey="Katayama, T" uniqKey="Katayama T">T Katayama</name>
</author>
<author>
<name sortKey="Bork, P" uniqKey="Bork P">P Bork</name>
</author>
<author>
<name sortKey="Goto, S" uniqKey="Goto S">S Goto</name>
</author>
<author>
<name sortKey="Kanehisa, M" uniqKey="Kanehisa M">M Kanehisa</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Claudel Renard, C" uniqKey="Claudel Renard C">C Claudel Renard</name>
</author>
<author>
<name sortKey="Chevalet, C" uniqKey="Chevalet C">C Chevalet</name>
</author>
<author>
<name sortKey="Faraut, T" uniqKey="Faraut T">T Faraut</name>
</author>
<author>
<name sortKey="Kahn, D" uniqKey="Kahn D">D Kahn</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Overbeek, R" uniqKey="Overbeek R">R Overbeek</name>
</author>
<author>
<name sortKey="Begley, T" uniqKey="Begley T">T Begley</name>
</author>
<author>
<name sortKey="Butler, Rm" uniqKey="Butler R">RM Butler</name>
</author>
<author>
<name sortKey="Choudhuri, Jv" uniqKey="Choudhuri J">JV Choudhuri</name>
</author>
<author>
<name sortKey="Chuang, H Y" uniqKey="Chuang H">H-Y Chuang</name>
</author>
<author>
<name sortKey="Cohoon, M" uniqKey="Cohoon M">M Cohoon</name>
</author>
<author>
<name sortKey="De Crecy Lagard, V" uniqKey="De Crecy Lagard V">V de Crécy-Lagard</name>
</author>
<author>
<name sortKey="Diaz, N" uniqKey="Diaz N">N Diaz</name>
</author>
<author>
<name sortKey="Disz, T" uniqKey="Disz T">T Disz</name>
</author>
<author>
<name sortKey="Edwards, R" uniqKey="Edwards R">R Edwards</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Markowitz, Vm" uniqKey="Markowitz V">VM Markowitz</name>
</author>
<author>
<name sortKey="Ivanova, Nn" uniqKey="Ivanova N">NN Ivanova</name>
</author>
<author>
<name sortKey="Szeto, E" uniqKey="Szeto E">E Szeto</name>
</author>
<author>
<name sortKey="Palaniappan, K" uniqKey="Palaniappan K">K Palaniappan</name>
</author>
<author>
<name sortKey="Chu, K" uniqKey="Chu K">K Chu</name>
</author>
<author>
<name sortKey="Dalevi, D" uniqKey="Dalevi D">D Dalevi</name>
</author>
<author>
<name sortKey="Chen, I Ma" uniqKey="Chen I">I-MA Chen</name>
</author>
<author>
<name sortKey="Grechkin, Y" uniqKey="Grechkin Y">Y Grechkin</name>
</author>
<author>
<name sortKey="Dubchak, I" uniqKey="Dubchak I">I Dubchak</name>
</author>
<author>
<name sortKey="Anderson, I" uniqKey="Anderson I">I Anderson</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Markowitz, Vm" uniqKey="Markowitz V">VM Markowitz</name>
</author>
<author>
<name sortKey="Chen, I Ma" uniqKey="Chen I">I-MA Chen</name>
</author>
<author>
<name sortKey="Chu, K" uniqKey="Chu K">K Chu</name>
</author>
<author>
<name sortKey="Szeto, E" uniqKey="Szeto E">E Szeto</name>
</author>
<author>
<name sortKey="Palaniappan, K" uniqKey="Palaniappan K">K Palaniappan</name>
</author>
<author>
<name sortKey="Grechkin, Y" uniqKey="Grechkin Y">Y Grechkin</name>
</author>
<author>
<name sortKey="Ratner, A" uniqKey="Ratner A">A Ratner</name>
</author>
<author>
<name sortKey="Jacob, B" uniqKey="Jacob B">B Jacob</name>
</author>
<author>
<name sortKey="Pati, A" uniqKey="Pati A">A Pati</name>
</author>
<author>
<name sortKey="Huntemann, M" uniqKey="Huntemann M">M Huntemann</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Seshadri, R" uniqKey="Seshadri R">R Seshadri</name>
</author>
<author>
<name sortKey="Kravitz, Sa" uniqKey="Kravitz S">SA Kravitz</name>
</author>
<author>
<name sortKey="Smarr, L" uniqKey="Smarr L">L Smarr</name>
</author>
<author>
<name sortKey="Gilna, P" uniqKey="Gilna P">P Gilna</name>
</author>
<author>
<name sortKey="Frazier, M" uniqKey="Frazier M">M Frazier</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Meyer, F" uniqKey="Meyer F">F Meyer</name>
</author>
<author>
<name sortKey="Paarmann, D" uniqKey="Paarmann D">D Paarmann</name>
</author>
<author>
<name sortKey="D Souza, M" uniqKey="D Souza M">M D'Souza</name>
</author>
<author>
<name sortKey="Olson, R" uniqKey="Olson R">R Olson</name>
</author>
<author>
<name sortKey="Glass, Em" uniqKey="Glass E">EM Glass</name>
</author>
<author>
<name sortKey="Kubal, M" uniqKey="Kubal M">M Kubal</name>
</author>
<author>
<name sortKey="Paczian, T" uniqKey="Paczian T">T Paczian</name>
</author>
<author>
<name sortKey="Rodriguez, A" uniqKey="Rodriguez A">A Rodriguez</name>
</author>
<author>
<name sortKey="Stevens, R" uniqKey="Stevens R">R Stevens</name>
</author>
<author>
<name sortKey="Wilke, A" uniqKey="Wilke A">A Wilke</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Aziz, Rk" uniqKey="Aziz R">RK Aziz</name>
</author>
<author>
<name sortKey="Bartels, D" uniqKey="Bartels D">D Bartels</name>
</author>
<author>
<name sortKey="Best, Aa" uniqKey="Best A">AA Best</name>
</author>
<author>
<name sortKey="Dejongh, M" uniqKey="Dejongh M">M DeJongh</name>
</author>
<author>
<name sortKey="Disz, T" uniqKey="Disz T">T Disz</name>
</author>
<author>
<name sortKey="Edwards, Ra" uniqKey="Edwards R">RA Edwards</name>
</author>
<author>
<name sortKey="Formsma, K" uniqKey="Formsma K">K Formsma</name>
</author>
<author>
<name sortKey="Gerdes, S" uniqKey="Gerdes S">S Gerdes</name>
</author>
<author>
<name sortKey="Glass, Em" uniqKey="Glass E">EM Glass</name>
</author>
<author>
<name sortKey="Kubal, M" uniqKey="Kubal M">M Kubal</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Meyer, F" uniqKey="Meyer F">F Meyer</name>
</author>
<author>
<name sortKey="Overbeek, R" uniqKey="Overbeek R">R Overbeek</name>
</author>
<author>
<name sortKey="Rodriguez, A" uniqKey="Rodriguez A">A Rodriguez</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Karp, Pd" uniqKey="Karp P">PD Karp</name>
</author>
<author>
<name sortKey="Paley, S" uniqKey="Paley S">S Paley</name>
</author>
<author>
<name sortKey="Romero, P" uniqKey="Romero P">P Romero</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Karp, Pd" uniqKey="Karp P">PD Karp</name>
</author>
<author>
<name sortKey="Paley, Sm" uniqKey="Paley S">SM Paley</name>
</author>
<author>
<name sortKey="Krummenacker, M" uniqKey="Krummenacker M">M Krummenacker</name>
</author>
<author>
<name sortKey="Latendresse, M" uniqKey="Latendresse M">M Latendresse</name>
</author>
<author>
<name sortKey="Dale, Jm" uniqKey="Dale J">JM Dale</name>
</author>
<author>
<name sortKey="Lee, Tj" uniqKey="Lee T">TJ Lee</name>
</author>
<author>
<name sortKey="Kaipa, P" uniqKey="Kaipa P">P Kaipa</name>
</author>
<author>
<name sortKey="Gilham, F" uniqKey="Gilham F">F Gilham</name>
</author>
<author>
<name sortKey="Spaulding, A" uniqKey="Spaulding A">A Spaulding</name>
</author>
<author>
<name sortKey="Popescu, L" uniqKey="Popescu L">L Popescu</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Karp, Pd" uniqKey="Karp P">PD Karp</name>
</author>
<author>
<name sortKey="Latendresse, M" uniqKey="Latendresse M">M Latendresse</name>
</author>
<author>
<name sortKey="Caspi, R" uniqKey="Caspi R">R Caspi</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Latendresse, M" uniqKey="Latendresse M">M Latendresse</name>
</author>
<author>
<name sortKey="Krummenacker, M" uniqKey="Krummenacker M">M Krummenacker</name>
</author>
<author>
<name sortKey="Trupp, M" uniqKey="Trupp M">M Trupp</name>
</author>
<author>
<name sortKey="Karp, Pd" uniqKey="Karp P">PD Karp</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Hucka, M" uniqKey="Hucka M">M Hucka</name>
</author>
<author>
<name sortKey="Finney, A" uniqKey="Finney A">A Finney</name>
</author>
<author>
<name sortKey="Sauro, Hm" uniqKey="Sauro H">HM Sauro</name>
</author>
<author>
<name sortKey="Bolouri, H" uniqKey="Bolouri H">H Bolouri</name>
</author>
<author>
<name sortKey="Doyle, Jc" uniqKey="Doyle J">JC Doyle</name>
</author>
<author>
<name sortKey="Kitano, H" uniqKey="Kitano H">H Kitano</name>
</author>
<author>
<name sortKey="Arkin, Ap" uniqKey="Arkin A">AP Arkin</name>
</author>
<author>
<name sortKey="Bornstein, Bj" uniqKey="Bornstein B">BJ Bornstein</name>
</author>
<author>
<name sortKey="Bray, D" uniqKey="Bray D">D Bray</name>
</author>
<author>
<name sortKey="Cornish Bowden, A" uniqKey="Cornish Bowden A">A Cornish-Bowden</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Karp, Pd" uniqKey="Karp P">PD Karp</name>
</author>
<author>
<name sortKey="Riley, M" uniqKey="Riley M">M Riley</name>
</author>
<author>
<name sortKey="Saier, M" uniqKey="Saier M">M Saier</name>
</author>
<author>
<name sortKey="Paulsen, It" uniqKey="Paulsen I">IT Paulsen</name>
</author>
<author>
<name sortKey="Paley, Sm" uniqKey="Paley S">SM Paley</name>
</author>
<author>
<name sortKey="Pellegrini Toole, A" uniqKey="Pellegrini Toole A">A Pellegrini-Toole</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Caspi, R" uniqKey="Caspi R">R Caspi</name>
</author>
<author>
<name sortKey="Altman, T" uniqKey="Altman T">T Altman</name>
</author>
<author>
<name sortKey="Dreher, K" uniqKey="Dreher K">K Dreher</name>
</author>
<author>
<name sortKey="Fulcher, Ca" uniqKey="Fulcher C">CA Fulcher</name>
</author>
<author>
<name sortKey="Subhraveti, P" uniqKey="Subhraveti P">P Subhraveti</name>
</author>
<author>
<name sortKey="Keseler, Im" uniqKey="Keseler I">IM Keseler</name>
</author>
<author>
<name sortKey="Kothari, A" uniqKey="Kothari A">A Kothari</name>
</author>
<author>
<name sortKey="Krummenacker, M" uniqKey="Krummenacker M">M Krummenacker</name>
</author>
<author>
<name sortKey="Latendresse, M" uniqKey="Latendresse M">M Latendresse</name>
</author>
<author>
<name sortKey="Mueller, La" uniqKey="Mueller L">LA Mueller</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Latendresse, M" uniqKey="Latendresse M">M Latendresse</name>
</author>
<author>
<name sortKey="Paley, S" uniqKey="Paley S">S Paley</name>
</author>
<author>
<name sortKey="Karp, Pd" uniqKey="Karp P">PD Karp</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Stark, M" uniqKey="Stark M">M Stark</name>
</author>
<author>
<name sortKey="Berger, Sa" uniqKey="Berger S">SA Berger</name>
</author>
<author>
<name sortKey="Stamatakis, A" uniqKey="Stamatakis A">A Stamatakis</name>
</author>
<author>
<name sortKey="Mering Von, C" uniqKey="Mering Von C">C Mering von</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Hyatt, D" uniqKey="Hyatt D">D Hyatt</name>
</author>
<author>
<name sortKey="Chen, G L" uniqKey="Chen G">G-L Chen</name>
</author>
<author>
<name sortKey="Locascio, Pf" uniqKey="Locascio P">PF LoCascio</name>
</author>
<author>
<name sortKey="Land, Ml" uniqKey="Land M">ML Land</name>
</author>
<author>
<name sortKey="Larimer, Fw" uniqKey="Larimer F">FW Larimer</name>
</author>
<author>
<name sortKey="Hauser, Lj" uniqKey="Hauser L">LJ Hauser</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Tatusov, Rl" uniqKey="Tatusov R">RL Tatusov</name>
</author>
<author>
<name sortKey="Natale, Da" uniqKey="Natale D">DA Natale</name>
</author>
<author>
<name sortKey="Garkavtsev, Iv" uniqKey="Garkavtsev I">IV Garkavtsev</name>
</author>
<author>
<name sortKey="Tatusova, Ta" uniqKey="Tatusova T">TA Tatusova</name>
</author>
<author>
<name sortKey="Shankavaram, Ut" uniqKey="Shankavaram U">UT Shankavaram</name>
</author>
<author>
<name sortKey="Rao, Bs" uniqKey="Rao B">BS Rao</name>
</author>
<author>
<name sortKey="Kiryutin, B" uniqKey="Kiryutin B">B Kiryutin</name>
</author>
<author>
<name sortKey="Galperin, My" uniqKey="Galperin M">MY Galperin</name>
</author>
<author>
<name sortKey="Fedorova, Nd" uniqKey="Fedorova N">ND Fedorova</name>
</author>
<author>
<name sortKey="Koonin, Ev" uniqKey="Koonin E">EV Koonin</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Pruitt, Kd" uniqKey="Pruitt K">KD Pruitt</name>
</author>
<author>
<name sortKey="Maglott, Dr" uniqKey="Maglott D">DR Maglott</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Kielbasa, Sm" uniqKey="Kielbasa S">SM Kiełbasa</name>
</author>
<author>
<name sortKey="Wan, R" uniqKey="Wan R">R Wan</name>
</author>
<author>
<name sortKey="Sato, K" uniqKey="Sato K">K Sato</name>
</author>
<author>
<name sortKey="Horton, P" uniqKey="Horton P">P Horton</name>
</author>
<author>
<name sortKey="Frith, Mc" uniqKey="Frith M">MC Frith</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Rasko, Da" uniqKey="Rasko D">DA Rasko</name>
</author>
<author>
<name sortKey="Myers, Gsa" uniqKey="Myers G">GSA Myers</name>
</author>
<author>
<name sortKey="Ravel, J" uniqKey="Ravel J">J Ravel</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Rost, B" uniqKey="Rost B">B Rost</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Gentzsch, W" uniqKey="Gentzsch W">W Gentzsch</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Pruesse, E" uniqKey="Pruesse E">E Pruesse</name>
</author>
<author>
<name sortKey="Quast, C" uniqKey="Quast C">C Quast</name>
</author>
<author>
<name sortKey="Knittel, K" uniqKey="Knittel K">K Knittel</name>
</author>
<author>
<name sortKey="Fuchs, Bm" uniqKey="Fuchs B">BM Fuchs</name>
</author>
<author>
<name sortKey="Ludwig, W" uniqKey="Ludwig W">W Ludwig</name>
</author>
<author>
<name sortKey="Peplies, J" uniqKey="Peplies J">J Peplies</name>
</author>
<author>
<name sortKey="Glockner, Fo" uniqKey="Glockner F">FO Glöckner</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Desantis, Tz" uniqKey="Desantis T">TZ DeSantis</name>
</author>
<author>
<name sortKey="Hugenholtz, P" uniqKey="Hugenholtz P">P Hugenholtz</name>
</author>
<author>
<name sortKey="Larsen, N" uniqKey="Larsen N">N Larsen</name>
</author>
<author>
<name sortKey="Rojas, M" uniqKey="Rojas M">M Rojas</name>
</author>
<author>
<name sortKey="Brodie, El" uniqKey="Brodie E">EL Brodie</name>
</author>
<author>
<name sortKey="Keller, K" uniqKey="Keller K">K Keller</name>
</author>
<author>
<name sortKey="Huber, T" uniqKey="Huber T">T Huber</name>
</author>
<author>
<name sortKey="Dalevi, D" uniqKey="Dalevi D">D Dalevi</name>
</author>
<author>
<name sortKey="Hu, P" uniqKey="Hu P">P Hu</name>
</author>
<author>
<name sortKey="Andersen, Gl" uniqKey="Andersen G">GL Andersen</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Lowe, Tm" uniqKey="Lowe T">TM Lowe</name>
</author>
<author>
<name sortKey="Eddy, Sr" uniqKey="Eddy S">SR Eddy</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Huson, Dh" uniqKey="Huson D">DH Huson</name>
</author>
<author>
<name sortKey="Auch, Af" uniqKey="Auch A">AF Auch</name>
</author>
<author>
<name sortKey="Qi, J" uniqKey="Qi J">J Qi</name>
</author>
<author>
<name sortKey="Schuster, Sc" uniqKey="Schuster S">SC Schuster</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Latendresse, M" uniqKey="Latendresse M">M Latendresse</name>
</author>
<author>
<name sortKey="Karp, Pd" uniqKey="Karp P">PD Karp</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Paley, Sm" uniqKey="Paley S">SM Paley</name>
</author>
<author>
<name sortKey="Karp, Pd" uniqKey="Karp P">PD Karp</name>
</author>
</analytic>
</biblStruct>
<biblStruct></biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Dale, Jm" uniqKey="Dale J">JM Dale</name>
</author>
<author>
<name sortKey="Popescu, L" uniqKey="Popescu L">L Popescu</name>
</author>
<author>
<name sortKey="Karp, Pd" uniqKey="Karp P">PD Karp</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Richter, Dc" uniqKey="Richter D">DC Richter</name>
</author>
<author>
<name sortKey="Ott, F" uniqKey="Ott F">F Ott</name>
</author>
<author>
<name sortKey="Auch, Af" uniqKey="Auch A">AF Auch</name>
</author>
<author>
<name sortKey="Schmid, R" uniqKey="Schmid R">R Schmid</name>
</author>
<author>
<name sortKey="Huson, Dh" uniqKey="Huson D">DH Huson</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Barton, Ad" uniqKey="Barton A">AD Barton</name>
</author>
<author>
<name sortKey="Dutkiewicz, S" uniqKey="Dutkiewicz S">S Dutkiewicz</name>
</author>
<author>
<name sortKey="Flierl, G" uniqKey="Flierl G">G Flierl</name>
</author>
<author>
<name sortKey="Bragg, J" uniqKey="Bragg J">J Bragg</name>
</author>
<author>
<name sortKey="Follows, Mj" uniqKey="Follows M">MJ Follows</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Follows, Mj" uniqKey="Follows M">MJ Follows</name>
</author>
<author>
<name sortKey="Dutkiewicz, S" uniqKey="Dutkiewicz S">S Dutkiewicz</name>
</author>
<author>
<name sortKey="Grant, S" uniqKey="Grant S">S Grant</name>
</author>
<author>
<name sortKey="Chisholm, Sw" uniqKey="Chisholm S">SW Chisholm</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Larsen, Pe" uniqKey="Larsen P">PE Larsen</name>
</author>
<author>
<name sortKey="Field, D" uniqKey="Field D">D Field</name>
</author>
<author>
<name sortKey="Gilbert, Ja" uniqKey="Gilbert J">JA Gilbert</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Larsen, Pe" uniqKey="Larsen P">PE Larsen</name>
</author>
<author>
<name sortKey="Collart, Fr" uniqKey="Collart F">FR Collart</name>
</author>
<author>
<name sortKey="Field, D" uniqKey="Field D">D Field</name>
</author>
<author>
<name sortKey="Meyer, F" uniqKey="Meyer F">F Meyer</name>
</author>
<author>
<name sortKey="Keegan, Kp" uniqKey="Keegan K">KP Keegan</name>
</author>
<author>
<name sortKey="Henry, Cs" uniqKey="Henry C">CS Henry</name>
</author>
<author>
<name sortKey="Mcgrath, J" uniqKey="Mcgrath J">J McGrath</name>
</author>
<author>
<name sortKey="Quinn, J" uniqKey="Quinn J">J Quinn</name>
</author>
<author>
<name sortKey="Gilbert, Ja" uniqKey="Gilbert J">JA Gilbert</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Abubucker, S" uniqKey="Abubucker S">S Abubucker</name>
</author>
<author>
<name sortKey="Segata, N" uniqKey="Segata N">N Segata</name>
</author>
<author>
<name sortKey="Goll, J" uniqKey="Goll J">J Goll</name>
</author>
<author>
<name sortKey="Schubert, Am" uniqKey="Schubert A">AM Schubert</name>
</author>
<author>
<name sortKey="Izard, J" uniqKey="Izard J">J Izard</name>
</author>
<author>
<name sortKey="Cantarel, Bl" uniqKey="Cantarel B">BL Cantarel</name>
</author>
<author>
<name sortKey="Rodriguez Mueller, B" uniqKey="Rodriguez Mueller B">B Rodriguez-Mueller</name>
</author>
<author>
<name sortKey="Zucker, J" uniqKey="Zucker J">J Zucker</name>
</author>
<author>
<name sortKey="Thiagarajan, M" uniqKey="Thiagarajan M">M Thiagarajan</name>
</author>
<author>
<name sortKey="Henrissat, B" uniqKey="Henrissat B">B Henrissat</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Ye, Y" uniqKey="Ye Y">Y Ye</name>
</author>
<author>
<name sortKey="Doak, Tg" uniqKey="Doak T">TG Doak</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Goll, J" uniqKey="Goll J">J Goll</name>
</author>
<author>
<name sortKey="Thiagarajan, M" uniqKey="Thiagarajan M">M Thiagarajan</name>
</author>
<author>
<name sortKey="Abubucker, S" uniqKey="Abubucker S">S Abubucker</name>
</author>
<author>
<name sortKey="Huttenhower, C" uniqKey="Huttenhower C">C Huttenhower</name>
</author>
<author>
<name sortKey="Yooseph, S" uniqKey="Yooseph S">S Yooseph</name>
</author>
<author>
<name sortKey="Methe, Ba" uniqKey="Methe B">BA Methé</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Henry, Cs" uniqKey="Henry C">CS Henry</name>
</author>
<author>
<name sortKey="Dejongh, M" uniqKey="Dejongh M">M DeJongh</name>
</author>
<author>
<name sortKey="Best, Aa" uniqKey="Best A">AA Best</name>
</author>
<author>
<name sortKey="Frybarger, Pm" uniqKey="Frybarger P">PM Frybarger</name>
</author>
<author>
<name sortKey="Linsay, B" uniqKey="Linsay B">B Linsay</name>
</author>
<author>
<name sortKey="Stevens, Rl" uniqKey="Stevens R">RL Stevens</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Henry, Cs" uniqKey="Henry C">CS Henry</name>
</author>
<author>
<name sortKey="Overbeek, R" uniqKey="Overbeek R">R Overbeek</name>
</author>
<author>
<name sortKey="Xia, F" uniqKey="Xia F">F Xia</name>
</author>
<author>
<name sortKey="Best, Aa" uniqKey="Best A">AA Best</name>
</author>
<author>
<name sortKey="Glass, E" uniqKey="Glass E">E Glass</name>
</author>
<author>
<name sortKey="Gilbert, J" uniqKey="Gilbert J">J Gilbert</name>
</author>
<author>
<name sortKey="Larsen, P" uniqKey="Larsen P">P Larsen</name>
</author>
<author>
<name sortKey="Edwards, R" uniqKey="Edwards R">R Edwards</name>
</author>
<author>
<name sortKey="Disz, T" uniqKey="Disz T">T Disz</name>
</author>
<author>
<name sortKey="Meyer, F" uniqKey="Meyer F">F Meyer</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Kalyanaraman, A" uniqKey="Kalyanaraman A">A Kalyanaraman</name>
</author>
<author>
<name sortKey="Aluru, S" uniqKey="Aluru S">S Aluru</name>
</author>
<author>
<name sortKey="Kothari, S" uniqKey="Kothari S">S Kothari</name>
</author>
<author>
<name sortKey="Brendel, V" uniqKey="Brendel V">V Brendel</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Yooseph, S" uniqKey="Yooseph S">S Yooseph</name>
</author>
<author>
<name sortKey="Sutton, G" uniqKey="Sutton G">G Sutton</name>
</author>
<author>
<name sortKey="Rusch, Db" uniqKey="Rusch D">DB Rusch</name>
</author>
<author>
<name sortKey="Halpern, Al" uniqKey="Halpern A">AL Halpern</name>
</author>
<author>
<name sortKey="Williamson, Sj" uniqKey="Williamson S">SJ Williamson</name>
</author>
<author>
<name sortKey="Remington, K" uniqKey="Remington K">K Remington</name>
</author>
<author>
<name sortKey="Eisen, Ja" uniqKey="Eisen J">JA Eisen</name>
</author>
<author>
<name sortKey="Heidelberg, Kb" uniqKey="Heidelberg K">KB Heidelberg</name>
</author>
<author>
<name sortKey="Manning, G" uniqKey="Manning G">G Manning</name>
</author>
<author>
<name sortKey="Li, W" uniqKey="Li W">W Li</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Kalyanaraman, A" uniqKey="Kalyanaraman A">A Kalyanaraman</name>
</author>
<author>
<name sortKey="Cannon, Wr" uniqKey="Cannon W">WR Cannon</name>
</author>
<author>
<name sortKey="Latt, B" uniqKey="Latt B">B Latt</name>
</author>
<author>
<name sortKey="Baxter, Dj" uniqKey="Baxter D">DJ Baxter</name>
</author>
</analytic>
</biblStruct>
</listBibl>
</div1>
</back>
</TEI>
<pmc article-type="product-review" xml:lang="en">
<pmc-dir>properties open_access</pmc-dir>
<front>
<journal-meta>
<journal-id journal-id-type="nlm-ta">BMC Bioinformatics</journal-id>
<journal-id journal-id-type="iso-abbrev">BMC Bioinformatics</journal-id>
<journal-title-group>
<journal-title>BMC Bioinformatics</journal-title>
</journal-title-group>
<issn pub-type="epub">1471-2105</issn>
<publisher>
<publisher-name>BioMed Central</publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="pmid">23800136</article-id>
<article-id pub-id-type="pmc">3695837</article-id>
<article-id pub-id-type="publisher-id">1471-2105-14-202</article-id>
<article-id pub-id-type="doi">10.1186/1471-2105-14-202</article-id>
<article-categories>
<subj-group subj-group-type="heading">
<subject>Software</subject>
</subj-group>
</article-categories>
<title-group>
<article-title>MetaPathways: a modular pipeline for constructing pathway/genome databases from environmental sequence information</article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author" id="A1">
<name>
<surname>Konwar</surname>
<given-names>Kishori M</given-names>
</name>
<xref ref-type="aff" rid="I1">1</xref>
<email>kishori@mail.ubc.ca</email>
</contrib>
<contrib contrib-type="author" id="A2">
<name>
<surname>Hanson</surname>
<given-names>Niels W</given-names>
</name>
<xref ref-type="aff" rid="I2">2</xref>
<email>nielsh@mail.ubc.ca</email>
</contrib>
<contrib contrib-type="author" id="A3">
<name>
<surname>Pagé</surname>
<given-names>Antoine P</given-names>
</name>
<xref ref-type="aff" rid="I1">1</xref>
<email>appage@mail.ubc.ca</email>
</contrib>
<contrib contrib-type="author" corresp="yes" id="A4">
<name>
<surname>Hallam</surname>
<given-names>Steven J</given-names>
</name>
<xref ref-type="aff" rid="I1">1</xref>
<xref ref-type="aff" rid="I2">2</xref>
<email>shallam@mail.ubc.ca</email>
</contrib>
</contrib-group>
<aff id="I1">
<label>1</label>
Department of Microbiology &amp; Immunology, University of British Columbia, Vancouver, BC V6T1Z3, Canada</aff>
<aff id="I2">
<label>2</label>
Graduate Program in Bioinformatics, University of British Columbia, Vancouver, BC Canada</aff>
<pub-date pub-type="collection">
<year>2013</year>
</pub-date>
<pub-date pub-type="epub">
<day>21</day>
<month>6</month>
<year>2013</year>
</pub-date>
<volume>14</volume>
<fpage>202</fpage>
<lpage>202</lpage>
<history>
<date date-type="received">
<day>24</day>
<month>1</month>
<year>2013</year>
</date>
<date date-type="accepted">
<day>13</day>
<month>6</month>
<year>2013</year>
</date>
</history>
<permissions>
<copyright-statement>Copyright © 2013 Konwar et al.; licensee BioMed Central Ltd.</copyright-statement>
<copyright-year>2013</copyright-year>
<copyright-holder>Konwar et al.; licensee BioMed Central Ltd.</copyright-holder>
<license license-type="open-access" xlink:href="http://creativecommons.org/licenses/by/2.0">
<license-p>This is an Open Access article distributed under the terms of the Creative Commons Attribution License (
<ext-link ext-link-type="uri" xlink:href="http://creativecommons.org/licenses/by/2.0">http://creativecommons.org/licenses/by/2.0</ext-link>
), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</license-p>
</license>
</permissions>
<self-uri xlink:href="http://www.biomedcentral.com/1471-2105/14/202"></self-uri>
<abstract>
<sec>
<title>Background</title>
<p>A central challenge to understanding the ecological and biogeochemical roles of microorganisms in natural and human engineered ecosystems is the reconstruction of metabolic interaction networks from environmental sequence information. The dominant paradigm in metabolic reconstruction is to assign functional annotations using BLAST. Functional annotations are then projected onto symbolic representations of metabolism in the form of KEGG pathways or SEED subsystems.</p>
</sec>
<sec>
<title>Results</title>
<p>Here we present MetaPathways, an open source pipeline for pathway inference that uses the PathoLogic algorithm to map functional annotations onto the MetaCyc collection of reactions and pathways, and construct environmental Pathway/Genome Databases (ePGDBs) compatible with the editing and navigation features of Pathway Tools. The pipeline accepts assembled or unassembled nucleotide sequences, performs quality assessment and control, predicts and annotates noncoding genes and open reading frames, and produces inputs to PathoLogic. In addition to constructing ePGDBs, MetaPathways uses MLTreeMap to build phylogenetic trees for selected taxonomic anchor and functional gene markers, converts General Feature Format (GFF) files into concatenated GenBank files for ePGDB construction based on third-party annotations, and generates useful file formats including Sequin files for direct GenBank submission and gene feature tables summarizing annotations, MLTreeMap trees, and ePGDB pathway coverage summaries for statistical comparisons.</p>
</sec>
<sec>
<title>Conclusions</title>
<p>MetaPathways provides users with a modular annotation and analysis pipeline for predicting metabolic interaction networks from environmental sequence information using an alternative to KEGG pathways and SEED subsystems mapping. It is extensible to genomic and transcriptomic datasets from a wide range of sequencing platforms, and generates useful data products for microbial community structure and function analysis. The MetaPathways software package, installation instructions, and example data can be obtained from
<ext-link ext-link-type="uri" xlink:href="http://hallam.microbiology.ubc.ca/MetaPathways">http://hallam.microbiology.ubc.ca/MetaPathways</ext-link>
.</p>
</sec>
</abstract>
<kwd-group>
<kwd>Environmental pathway/Genome Database (ePGDB)</kwd>
<kwd>Metagenome</kwd>
<kwd>Pathway tools</kwd>
<kwd>PathoLogic</kwd>
<kwd>MetaCyc</kwd>
<kwd>Microbial community</kwd>
<kwd>Metabolism</kwd>
<kwd>Metabolic interaction networks</kwd>
</kwd-group>
</article-meta>
</front>
<body>
<sec>
<title>Background</title>
<p>Metabolic interactions between microorganisms direct matter and energy transformations integral to ecosystem function [
<xref ref-type="bibr" rid="B1">1</xref>
-
<xref ref-type="bibr" rid="B3">3</xref>
]. Plurality sequencing methods enable exploration of potential (metagenomic) and expressed (metatranscriptomic) metabolic interactions with the aid of computational methods that assemble or cluster contiguous reads, search for patterns or motifs representing genes, and reconstruct pathways from environmental sequence information [
<xref ref-type="bibr" rid="B4">4</xref>
-
<xref ref-type="bibr" rid="B6">6</xref>
]. The prevailing paradigm in pathway reconstruction is to assign functional annotation based on sequence homology using BLAST [
<xref ref-type="bibr" rid="B7">7</xref>
]. Functional annotations are then projected onto symbolic representations of metabolism such as KEGG pathways [
<xref ref-type="bibr" rid="B8">8</xref>
-
<xref ref-type="bibr" rid="B10">10</xref>
] or SEED subsystems [
<xref ref-type="bibr" rid="B11">11</xref>
] revealing network structure.</p>
<p>With the expansion of next generation sequencing technologies, increasingly complex datasets are being generated for thousands of environmental samples resulting in analytic bottlenecks with the potential to stymie pathway reconstruction efforts. As a result, on-line services for metabolic reconstruction have been developed to externalize data processing burdens and provide warehousing and visualization tools for environmental sequence information. Popular on-line services for metabolic reconstruction include Integrated Microbial Genomes and Metagenomes (IMG/M), Community Cyberinfrastructure for Advanced Microbial Ecology Research and Analysis (CAMERA), and Metagenome Rapid Annotation using Subsystem Technology (MG-RAST). Both IMG/M [
<xref ref-type="bibr" rid="B12">12</xref>
,
<xref ref-type="bibr" rid="B13">13</xref>
] and CAMERA [
<xref ref-type="bibr" rid="B14">14</xref>
] warehouse public datasets and provide management, exploration, and visualization tools for environmental sequence information. MG-RAST [
<xref ref-type="bibr" rid="B15">15</xref>
,
<xref ref-type="bibr" rid="B16">16</xref>
] warehouses public datasets and provides gene prediction and annotation services based on SEED subsystems mapping using FIGfams [
<xref ref-type="bibr" rid="B17">17</xref>
] and BLAST. While on-line services increase access to computational resources, idiosyncratic data processing and management practices common to each service insulate users from command-line optimization and create formatting and data transfer restrictions.</p>
<p>Pathway Tools [
<xref ref-type="bibr" rid="B18">18</xref>
,
<xref ref-type="bibr" rid="B19">19</xref>
] is a production-quality software system that enables construction, management and navigation of symbolic representations of metabolism in the form of Pathway/Genome databases (PGDBs). A PGDB encodes contemporary knowledge about the network properties of a cellular organism. Pathway Tools supports four modular operations including metabolic pathway prediction using PathoLogic [
<xref ref-type="bibr" rid="B18">18</xref>
,
<xref ref-type="bibr" rid="B20">20</xref>
], metabolic flux modeling using MetaFlux [
<xref ref-type="bibr" rid="B21">21</xref>
], PGDB editing and navigation tools including manual or automated search functions, and comparative analysis and systems level visualizations. Further, genes, reactions, and pathways can be exported via the Systems Biology Markup Language (SBML) framework, allowing interoperability and downstream analysis with compatible systems biology tools [
<xref ref-type="bibr" rid="B22">22</xref>
]. The PathoLogic module allows users to construct new PGDBs from an annotated genome using MetaCyc [
<xref ref-type="bibr" rid="B23">23</xref>
,
<xref ref-type="bibr" rid="B24">24</xref>
], a highly curated, non-redundant and experimentally validated database of metabolic pathways representing all domains of life. Unlike KEGG pathways or SEED subsystems, MetaCyc emphasizes smaller, evolutionarily conserved units of metabolism or pathway variants that are regulated and transferred together. MetaCyc is also extensively commented with pathway descriptions, literature citations, and enzyme properties including subunit composition, substrate specificity, cofactors, activators, and inhibitors, each connected to specific pathway variants. A web-server version of the Pathway Tools editing and navigation tools supports on-line browsing, manual curation and web publishing of PGDBs. Currently, PGDBs for 2037 cellular organisms have been constructed and incorporated into the BioCyc collection [
<xref ref-type="bibr" rid="B25">25</xref>
].</p>
<p>Here we extend the PGDB concept for cellular organisms to microbial community structure and function through the introduction of MetaPathways, a modular pipeline for pathway inference that uses the PathoLogic algorithm to build environmental PGDBs (ePGDBs) compatible with the editing and navigation features of Pathway Tools. The pipeline accepts assembled contigs or unassembled nucleotide sequences, performs quality control and coverage estimates, predicts and annotates noncoding genes and open reading frames, and produces concatenated GenBank files used as inputs to PathoLogic. In addition to constructing ePGDBs, MetaPathways uses MLTreeMap [
<xref ref-type="bibr" rid="B26">26</xref>
] to build phylogenetic trees for selected taxonomic anchor and functional gene markers, converts General Feature Format (.gff) files into concatenated GenBank (.gbk) files for ePGDB construction using third-party annotations, and generates useful file formats including Sequin files for direct GenBank submission and gene feature tables summarizing annotations, MLTreeMap, and ePGDBs for statistical comparisons.</p>
</sec>
<sec>
<title>Implementation</title>
<p>MetaPathways is a modular pipeline written in Python that calls software components written in C/C++, Perl, and Python. Required input files for MetaPathways include metagenomic or metatranscriptomic sequence data in one of several file formats (.fasta, .gff, or .gbk). The pipeline consists of five operational stages: (1) Quality control (QC) and open reading frame (ORF) prediction, (2) ORF annotation, (3) Modular analysis, (4) ePGDB construction, and (5) Pathway Export (Figure 
<xref ref-type="fig" rid="F1">1</xref>
). A parameter file (.parameters.txt) delimits software settings for successive operational stages and can be easily edited to enable or disable specific operations or modify default settings associated with specific software components (Figure 
<xref ref-type="fig" rid="F2">2</xref>
).</p>
<fig id="F1" position="float">
<label>Figure 1</label>
<caption>
<p>
<bold>The MetaPathways pipeline consists of five operational stages: (1) Quality control (QC) and open reading frame (ORF) prediction, (2) ORF annotation, (3) Modular analysis, (4) ePGDB construction, and (5) Pathway Export.</bold>
Inputs and executables are depicted on the left with corresponding output directories and exported files on the right.</p>
</caption>
<graphic xlink:href="1471-2105-14-202-1"></graphic>
</fig>
<fig id="F2" position="float">
<label>Figure 2</label>
<caption>
<p>
<bold>The parameter file (.parameters.txt) delimits software settings for successive operational stages and can be easily edited to enable or disable specific operations or modify default settings associated with specific software components.</bold>
Execution flags including yes, skip, and grid control successive pipeline operations.</p>
</caption>
<graphic xlink:href="1471-2105-14-202-2"></graphic>
</fig>
<sec>
<title>Quality control &amp; ORF prediction</title>
<p>Input sequences below a user-defined length threshold are removed, incompletely specified bases are converted to the character ‘N’, and sequence identifiers are sequentially renamed (i.e., sample_#). A mapping file (.mapping.txt) is exported relating the original input sequence names to the sequential names. ORFs are predicted using the Prokaryotic Dynamic Programming Genefinding Algorithm (Prodigal), which can detect incomplete or fragmentary ORFs [
<xref ref-type="bibr" rid="B27">27</xref>
]. Coordinate information, nucleotide sequences, and conceptually translated amino acid sequences for predicted ORFs are exported as .gff, .fna, and .faa files, respectively. By default, ORFs shorter than 180 nucleotides or 60 amino acids are removed (.qced.faa), and nucleotide (.nuc.stats) and amino acid (.amino.stats) sequence distribution summaries before and after post-processing are exported.</p>
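<p>The sequence-level QC steps described above can be sketched as follows. This is a minimal illustration only, not the MetaPathways source: the function name, the zero-based numbering, and the treatment of every character other than A, C, G, T or N as an incompletely specified base are assumptions made for the example.</p>
<preformat preformat-type="code"><![CDATA[
```python
import re

# Assumption: anything other than A, C, G, T or N counts as an
# incompletely specified base and is rewritten as 'N'.
AMBIGUOUS = re.compile(r"[^ACGTN]")


def quality_control(records, sample, min_length=180):
    """Filter, clean and sequentially rename (identifier, sequence) pairs.

    Returns the cleaned records and a mapping of new name -> original name.
    min_length is a placeholder; the pipeline threshold is user-defined.
    """
    cleaned, mapping = [], {}
    for original_id, seq in records:
        if len(seq) < min_length:
            continue  # drop sequences below the length threshold
        seq = AMBIGUOUS.sub("N", seq.upper())
        new_id = f"{sample}_{len(cleaned)}"  # sample_0, sample_1, ...
        cleaned.append((new_id, seq))
        mapping[new_id] = original_id
    return cleaned, mapping


records = [("readA", "acgtrya"), ("readB", "AC")]
cleaned, mapping = quality_control(records, "sample", min_length=5)
print(cleaned)   # [('sample_0', 'ACGTNNA')]
print(mapping)   # {'sample_0': 'readA'}
```
]]></preformat>
<p>The returned mapping plays the role of the exported .mapping.txt file, so that results can be traced back to the original sequence names.</p>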
</sec>
<sec>
<title>ORF annotation</title>
<p>Conceptually translated ORFs are queried against user-defined reference protein databases including KEGG [
<xref ref-type="bibr" rid="B8">8</xref>
], COG [
<xref ref-type="bibr" rid="B28">28</xref>
], RefSeq [
<xref ref-type="bibr" rid="B29">29</xref>
], and MetaCyc, where MetaCyc refers to the pathway hole-filler database included with Pathway Tools [
<xref ref-type="bibr" rid="B19">19</xref>
], using the protein BLAST or optimized LAST algorithm [
<xref ref-type="bibr" rid="B30">30</xref>
] in tabular format (.blastout/.lastout). Concomitant with reference protein database queries, self-BLAST bit-scores are calculated (.refscores) enabling a measure of similarity using the BLAST-score ratio (BSR) [
<xref ref-type="bibr" rid="B31">31</xref>
]. BLAST summary tables reporting e-values, percent identities, bit-scores, lengths, and BSRs are exported for each reference database (.blastout.parsed.txt). By default, annotations with BSRs below 0.4 corresponding to the so-called “Twilight Zone” of gene annotation [
<xref ref-type="bibr" rid="B32">32</xref>
] are excluded from summary tables.</p>
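<p>As an illustration of the BSR filter, the sketch below divides the bit-score of each hit by the bit-score of the query aligned against itself and keeps hits at or above the 0.4 default. The function names and the in-memory data layout are hypothetical; the actual .blastout and .refscores parsing is not shown.</p>
<preformat preformat-type="code"><![CDATA[
```python
def blast_score_ratio(hit_bitscore, self_bitscore):
    """BSR = bit-score of the hit / bit-score of the query against itself."""
    if self_bitscore <= 0:
        raise ValueError("self bit-score must be positive")
    return hit_bitscore / self_bitscore


def filter_hits(hits, refscores, min_bsr=0.4):
    """Keep hits whose BSR is at least min_bsr.

    hits: iterable of (orf_id, target_id, bitscore)
    refscores: dict of orf_id -> self bit-score
    """
    kept = []
    for orf_id, target_id, bitscore in hits:
        bsr = blast_score_ratio(bitscore, refscores[orf_id])
        if bsr >= min_bsr:
            kept.append((orf_id, target_id, round(bsr, 3)))
    return kept


hits = [("orf1", "K00001", 120.0), ("orf1", "K00002", 30.0)]
refscores = {"orf1": 200.0}
print(filter_hits(hits, refscores))  # [('orf1', 'K00001', 0.6)]
```
]]></preformat>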
<p>BLAST represents a computational burden that can limit pipeline performance on big datasets when implemented on local machines. Therefore, we have adopted a representational state transfer (REST) design supporting implementation on external Sun Grid Engine servers or supercomputers [
<xref ref-type="bibr" rid="B33">33</xref>
]. A user-defined connection filter (username, password and external server address for configuration) and externalization script enable setup (uploading, formatting, and installing BLAST databases and executables), parallel splitting of BLAST jobs, queue submission and management, and the collection and consolidation of results back to the local machine. This creates a RESTful system that is robust to unforeseen interruption and is readily transferable to the cloud. MetaPathways can also incorporate third party annotations sourced from .gbk or .gff files directly using embedded file-interconversion scripts.</p>
<p>Tabular BLAST results returned from local or external resources are used to assign product descriptions to predicted ORFs using an internal heuristic that standardizes product descriptions. For each ORF, the hit with the best e-value from each reference database is selected and given an “information score” based on the number of distinct enzymatic words, with a preference for Enzyme Commission (EC) numbers (+10 score). The functional annotation with the highest information score is appended to the ORF description and exported as a tabular file (.annotated.gff). Predicted ORFs with no BLAST hits are annotated as “hypothetical protein.” In addition, BLAST summaries of functional annotations at different hierarchical levels are exported for the KEGG and COG databases (.{DB}.stats.txt). Following functional annotation of predicted ORFs, nucleotide sequences are queried against reference nucleotide databases including SILVA [
<xref ref-type="bibr" rid="B34">34</xref>
] and GreenGenes [
<xref ref-type="bibr" rid="B35">35</xref>
] to identify ribosomal RNA genes. BLAST summary tables containing e-values, percent identities, bit-scores, lengths and taxonomic identity are exported for each reference database (.rRNA.stats.txt). This information is combined with the file .annotated.gff to generate input files for ePGDB construction, a standard .gbk file, and a .sequin file for NCBI submission.</p>
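<p>One way to picture the information score is sketched below. The text specifies only that the score counts distinct enzymatic words and adds 10 when an EC number is present; the word tokenizer, the stop list of uninformative words, and the function names here are assumptions for illustration and may differ from the pipeline's own heuristic.</p>
<preformat preformat-type="code"><![CDATA[
```python
import re

# Matches EC numbers such as 1.1.1.1 or partial ones such as 2.7.-.-
EC_NUMBER = re.compile(r"\b\d+\.(?:\d+|-)\.(?:\d+|-)\.(?:\d+|-)")

# Hypothetical stop list; the pipeline's actual word list is not given.
UNINFORMATIVE = {
    "protein", "putative", "hypothetical", "predicted", "probable",
    "family", "domain", "and", "of", "the", "ec",
}


def information_score(description):
    """Count distinct informative words; add 10 if an EC number is present."""
    words = set(re.findall(r"[a-z][a-z-]+", description.lower()))
    score = len(words - UNINFORMATIVE)
    if EC_NUMBER.search(description):
        score += 10
    return score


def best_annotation(candidates):
    """candidates: product descriptions, one top hit per reference database."""
    if not candidates:
        return "hypothetical protein"  # no BLAST hits
    return max(candidates, key=information_score)


print(information_score("alcohol dehydrogenase (EC 1.1.1.1)"))  # 12
print(best_annotation(["putative dehydrogenase",
                       "alcohol dehydrogenase (EC 1.1.1.1)"]))
```
]]></preformat>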
</sec>
<sec>
<title>Analyses</title>
<p>MetaPathways currently implements three modular analyses using existing or derived files as input (.fasta, .gbk, or .gff input formats and derived tabular results). The first analytic module implements tRNA-scan (version 1.4) to identify relevant tRNAs from QC nucleotide sequences [
<xref ref-type="bibr" rid="B36">36</xref>
]. Resulting tRNA identifications are appended to the .gbk and .sequin files as additional annotations. The second analytic module implements the widely used lowest common ancestor (LCA) algorithm for taxonomic binning [
<xref ref-type="bibr" rid="B37">37</xref>
]. The lowest common ancestor in the NCBI taxonomic hierarchy is selected based on the previously calculated BLAST hits from the RefSeq database. This effectively sums the number of BLAST hits at the lowest shared position of the hierarchy. The RefSeq taxonomic names often contain multiple synonyms or alternative spellings. Therefore, names that conform to the official NCBI taxonomy are selected in preference to unknown synonyms. The third analytic module implements MLTreeMap (version 2.061) to identify and construct trees for selected phylogenetic and functional marker genes from QC nucleotide sequences [
<xref ref-type="bibr" rid="B26">26</xref>
]. Results from LCA and MLTreeMap analysis are exported as a tabular file (fxn_and_taxa_table.txt). Additional analysis modules implemented from the command line can be directly inserted into the pipeline. By convention, results from each analysis are placed in a self-titled directory under the parent results directory (e.g., /results/mltreemap).</p>
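<p>The LCA step of the second module can be illustrated with a short sketch. It assumes that each RefSeq hit for an ORF has already been resolved to a root-to-leaf lineage of NCBI taxon names; that lookup, the synonym handling described above, and the function name are outside the sketch or hypothetical.</p>
<preformat preformat-type="code"><![CDATA[
```python
def lowest_common_ancestor(lineages):
    """Return the deepest taxon shared by all lineages.

    lineages: list of root-to-leaf taxon name lists, one per BLAST hit.
    Returns None when there are no hits or no shared root.
    """
    if not lineages:
        return None
    shared = []
    # zip walks the lineages rank by rank and stops at the shortest one.
    for ranks in zip(*lineages):
        if len(set(ranks)) != 1:
            break  # lineages diverge below the previous rank
        shared.append(ranks[0])
    return shared[-1] if shared else None


hits = [
    ["root", "Bacteria", "Proteobacteria", "Alphaproteobacteria"],
    ["root", "Bacteria", "Proteobacteria", "Gammaproteobacteria"],
    ["root", "Bacteria", "Firmicutes"],
]
print(lowest_common_ancestor(hits))      # Bacteria
print(lowest_common_ancestor(hits[:2]))  # Proteobacteria
```
]]></preformat>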
</sec>
<sec>
<title>ePGDB construction</title>
<p>The annotated ORF file (.annotated.gff) is parsed and separated into four files including (1) an annotation file containing gene product information, (2) a nucleotide sequence file in .fasta format, (3) a genetic-elements file, and (4) a PGDB parameters file (/ptools/). For the purposes of ePGDB construction, nucleotide input files are concatenated to form a single “chromosomal” element defining a composite genome. Concatenation is necessary to improve Pathway Tools performance on input files containing thousands of genetic-elements in batch mode. PathoLogic uses these input files to predict metabolic pathways based on defined biochemical rules (pathway completion, diagnostic/key enzymes, biosynthesis and degradation constraints) resulting in ePGDB construction and export to the local Pathway Tools internal library ($Pathway_Tools/user/).</p>
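<p>The concatenation step can be pictured as joining contigs end to end while shifting each ORF's coordinates by the offset of its contig. The sketch below is an assumption about how such a composite element could be built (1-based inclusive coordinates, no spacer between contigs, hypothetical function name); it is not taken from the MetaPathways code.</p>
<preformat preformat-type="code"><![CDATA[
```python
def concatenate_contigs(contigs, orfs):
    """Join contigs into one element and shift ORF coordinates to match.

    contigs: list of (contig_id, sequence)
    orfs: list of (contig_id, start, end), 1-based inclusive on the contig
    Returns the composite sequence and the ORFs in composite coordinates.
    """
    offsets, pieces, position = {}, [], 0
    for contig_id, seq in contigs:
        offsets[contig_id] = position  # bases that precede this contig
        pieces.append(seq)
        position += len(seq)
    shifted = [
        (cid, start + offsets[cid], end + offsets[cid])
        for cid, start, end in orfs
    ]
    return "".join(pieces), shifted


contigs = [("c1", "AAAA"), ("c2", "CCCCCC")]
orfs = [("c1", 1, 3), ("c2", 2, 6)]
genome, shifted = concatenate_contigs(contigs, orfs)
print(genome)   # AAAACCCCCC
print(shifted)  # [('c1', 1, 3), ('c2', 6, 10)]
```
]]></preformat>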
<p>Environmental PGDBs and their contents are accessible, internally or externally, through a built-in web server, allowing the knowledge of genes, proteins, metabolic and regulatory networks embedded within them to be queried, compared, curated and shared in a distributed fashion via the Internet. In addition to powerful search and retrieval functions, Pathway Tools provides a metabolic encyclopedia based on primary literature citations, encompassing more than 1900 evolutionarily conserved sub-pathways within the MetaCyc schema [
<xref ref-type="bibr" rid="B21">21</xref>
,
<xref ref-type="bibr" rid="B24">24</xref>
,
<xref ref-type="bibr" rid="B38">38</xref>
]. The “Cellular Overview” feature displays ePGDB contents in the form of interactive glyphs that link sub-pathways together in a global picture of metabolism [
<xref ref-type="bibr" rid="B39">39</xref>
]. Hovering over a glyph activates a tooltip that identifies the pathway and clicking on a glyph reveals pathway interactions at the level of enzymes, reactions and identified ORFs (Figure 
<xref ref-type="fig" rid="F3">3</xref>
). Direct comparisons between ePGDBs can be made using coloured overlays on the cellular overview revealing similarities and differences in metabolic pathway composition (Figure 
<xref ref-type="fig" rid="F4">4</xref>
).</p>
<fig id="F3" position="float">
<label>Figure 3</label>
<caption>
<p>
<bold>Environmental PGDBs (ePGDBs) and their contents are accessible through a built-in web server, allowing the knowledge of genes, proteins, metabolic and regulatory networks embedded within them to be queried, compared, curated and shared in a distributed fashion via the Internet.</bold>
The “Cellular Overview” feature displays ePGDB contents in the form of interactive glyphs that link sub-pathways together in a global picture of metabolism scalable down to the level of pathways, reactions and individual open reading frames.</p>
</caption>
<graphic xlink:href="1471-2105-14-202-3"></graphic>
</fig>
<fig id="F4" position="float">
<label>Figure 4</label>
<caption>
<p>
<bold>Cellular overview in the Pathway Tools software highlighting pathways found in the naphtha-degrading culture sample.</bold>
Using the assembled Illumina pathways as a backbone (blue), common predicted pathways from the 454 (red) and Sanger (green) sequencing datasets are overlaid, allowing exploration of pathways predicted using different sequencing technologies and depths.</p>
</caption>
<graphic xlink:href="1471-2105-14-202-4"></graphic>
</fig>
</sec>
<sec>
<title>Pathway export</title>
<p>Information extracted from ePGDBs, including ORF identities, enzyme abundance, and pathway coverage, is exported in tabular format (.pathways.txt and pathway_rxns.txt). A receipt and time-stamp for each successful pipeline execution is created containing the specific parameter settings used in ePGDB creation (.run_parameters.txt).</p>
</sec>
<sec>
<title>Performance</title>
<p>MetaPathways performance was evaluated using unassembled (Sanger fosmid end, 454 pyrosequencing) and assembled (Illumina HiSeq) genomic sequence information sourced from a naphtha-degrading, methanogenic enrichment culture (Additional file
<xref ref-type="supplementary-material" rid="S1">1</xref>
). Input datasets captured a range of nucleotide sequence numbers, lengths and sample coverage. Base pathway prediction and runtime increased as a function of nucleotide sequence number. While runtime complexity varies in relation to input file size and external resource allocation, empirical runtimes approached an upper limit of 2,300 sequences per minute when externalizing BLAST on the Western Canadian Research Grid [
<xref ref-type="bibr" rid="B40">40</xref>
] (Additional file
<xref ref-type="supplementary-material" rid="S1">1</xref>
). Remaining analyses and data transformations were performed locally on a Mac Pro desktop computer running Mac OS X 10.6.8 with 2 × 2.4 GHz Quad-Core Intel Xeon processors and 24 GB of 1066 MHz DDR3 RAM.</p>
<sec>
<title>Evaluation of pathway prediction with simulated metagenomes</title>
<p>Previous studies have evaluated PathoLogic’s performance on fully-sequenced genomes establishing its pathway prediction power in relation to machine learning methods [
<xref ref-type="bibr" rid="B41">41</xref>
]. To determine PathoLogic’s performance on combined and incomplete genomes sourced from environmental sequence information we generated simulated metagenomes from 10 BioCyc tier-2 PGDBs (Additional file
<xref ref-type="supplementary-material" rid="S2">2</xref>
) using MetaSim [
<xref ref-type="bibr" rid="B42">42</xref>
] (Sanger sequencing, average length 700 bp, standard deviation 100 bp) with differing sequence coverage and taxon distribution profiles (Sim1 and Sim2). Tier-2 PGDBs were selected to minimize potential name mapping errors between MetaPathways’ annotations and extant MetaCyc annotations [
<xref ref-type="bibr" rid="B41">41</xref>
]. In Sim1 each genome was present at equal coverage and in Sim2 the
<italic>Caulobacter crescentus</italic>
NA1000 genome was overrepresented by 20-fold (Figure 
<xref ref-type="fig" rid="F5">5</xref>
a). Simulations manifesting progressively larger fractions of total unique sequence length (unique-Gm) revealed that pathway recovery increases with sequence coverage (Figure 
<xref ref-type="fig" rid="F5">5</xref>
b). Specificity, a measure of the confidence in accurate pathway prediction, was high (>85%) regardless of taxonomic distribution or sequence coverage (Figure 
<xref ref-type="fig" rid="F5">5</xref>
c) consistent with reduced Type I errors (false positives). However, sensitivity, a measure of the confidence in predicting specific pathways present in the sample, was reduced at low coverage consistent with increased Type II errors (false negatives) (Figure 
<xref ref-type="fig" rid="F5">5</xref>
c). A 6% reduction in pathway recovery between Sim1 and Sim2 was observed, suggesting that pathway prediction follows a collector’s curve in which core metabolic functions shared between community members initially accumulate. As coverage increases, the encounter frequency for accessory genes increases, resulting in improved pathway prediction approaching a limit based on extant MetaCyc pathways. Summary statistics including the F-measure and Matthews Correlation Coefficient, which balance Type I and Type II errors, reinforce the observation that PathoLogic’s performance improves with increasing sequence coverage (Table 
<xref ref-type="table" rid="T1">1</xref>
and Additional file
<xref ref-type="supplementary-material" rid="S3">3</xref>
).</p>
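<p>For reference, the classification statistics named above can be computed from a confusion matrix of predicted versus gold-standard pathways using their standard definitions. The sketch below uses illustrative counts rather than values from this study, the function name is hypothetical, and it assumes that no denominator is zero.</p>
<preformat preformat-type="code"><![CDATA[
```python
import math


def classification_metrics(tp, fp, tn, fn):
    """Standard binary classification statistics from confusion counts.

    tp/fp: pathways predicted and present/absent in the gold standard
    tn/fn: pathways not predicted and absent/present in the gold standard
    """
    precision = tp / (tp + fp)
    sensitivity = tp / (tp + fn)   # recall; falls as Type II errors rise
    specificity = tn / (tn + fp)   # falls as Type I errors rise
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    f_measure = 2 * precision * sensitivity / (precision + sensitivity)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    matthews = (tp * tn - fp * fn) / denom
    return {
        "precision": precision,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "f_measure": f_measure,
        "matthews": matthews,
    }


# Illustrative counts only, not data from the simulations.
metrics = classification_metrics(tp=40, fp=10, tn=140, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```
]]></preformat>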
<fig id="F5" position="float">
<label>Figure 5</label>
<caption>
<p>
<bold>Analysis of in silico simulated sequencing experiments across different levels of coverage and taxon distribution.</bold>
Sim1 (blue) contains ten tier-2 PGDB genomes in approximately equal proportion. Sim2 (red) has one taxon overrepresented by 20-fold. Tier-2 taxa were selected on the basis of approximately equal genome size and gene content (
<bold>a</bold>
). Predicted pathway recovery as a percentage of the total pathways predicted from the full genome (
<bold>b</bold>
). Specificity (triangles) and sensitivity (squares) classification performance of predicted pathways using the pathways predicted on the full genomes as the gold standard (
<bold>c</bold>
). Interpolating lines were drawn via a natural spline.</p>
</caption>
<graphic xlink:href="1471-2105-14-202-5"></graphic>
</fig>
<table-wrap position="float" id="T1">
<label>Table 1</label>
<caption>
<p>Pathway classification performance statistics for simulated metagenomes Sim1 and Sim2 at progressively larger sequence coverage</p>
</caption>
<table frame="hsides" rules="groups" border="1">
<colgroup>
<col align="center"></col>
<col align="center"></col>
<col align="center"></col>
<col align="center"></col>
<col align="center"></col>
<col align="center"></col>
<col align="center"></col>
<col align="center"></col>
</colgroup>
<thead valign="top">
<tr>
<th align="center">
<bold>Sample</bold>
</th>
<th align="center">
<bold>Gm</bold>
</th>
<th align="center">
<bold>Precision</bold>
</th>
<th align="center">
<bold>Sensitivity</bold>
</th>
<th align="center">
<bold>Specificity</bold>
</th>
<th align="center">
<bold>Accuracy</bold>
</th>
<th align="center">
<bold>F-measure</bold>
</th>
<th align="center">
<bold>Matthews</bold>
</th>
</tr>
</thead>
<tbody>
<tr>
<td align="center" valign="bottom">Sim1
<hr></hr>
</td>
<td align="center" valign="bottom">(1/32)
<hr></hr>
</td>
<td align="center" valign="bottom">0.96
<hr></hr>
</td>
<td align="center" valign="bottom">0.31
<hr></hr>
</td>
<td align="center" valign="bottom">0.99
<hr></hr>
</td>
<td align="center" valign="bottom">0.73
<hr></hr>
</td>
<td align="center" valign="bottom">0.47
<hr></hr>
</td>
<td align="center" valign="bottom">0.79
<hr></hr>
</td>
</tr>
<tr>
<td align="center" valign="bottom">Sim1
<hr></hr>
</td>
<td align="center" valign="bottom">(1/16)
<hr></hr>
</td>
<td align="center" valign="bottom">0.70
<hr></hr>
</td>
<td align="center" valign="bottom">0.38
<hr></hr>
</td>
<td align="center" valign="bottom">0.98
<hr></hr>
</td>
<td align="center" valign="bottom">0.75
<hr></hr>
</td>
<td align="center" valign="bottom">0.53
<hr></hr>
</td>
<td align="center" valign="bottom">0.73
<hr></hr>
</td>
</tr>
<tr>
<td align="center" valign="bottom">Sim1
<hr></hr>
</td>
<td align="center" valign="bottom">(1/8)
<hr></hr>
</td>
<td align="center" valign="bottom">0.76
<hr></hr>
</td>
<td align="center" valign="bottom">0.57
<hr></hr>
</td>
<td align="center" valign="bottom">0.98
<hr></hr>
</td>
<td align="center" valign="bottom">0.82
<hr></hr>
</td>
<td align="center" valign="bottom">0.71
<hr></hr>
</td>
<td align="center" valign="bottom">0.81
<hr></hr>
</td>
</tr>
<tr>
<td align="center" valign="bottom">Sim1
<hr></hr>
</td>
<td align="center" valign="bottom">(1/4)
<hr></hr>
</td>
<td align="center" valign="bottom">0.85
<hr></hr>
</td>
<td align="center" valign="bottom">0.69
<hr></hr>
</td>
<td align="center" valign="bottom">0.97
<hr></hr>
</td>
<td align="center" valign="bottom">0.86
<hr></hr>
</td>
<td align="center" valign="bottom">0.80
<hr></hr>
</td>
<td align="center" valign="bottom">0.83
<hr></hr>
</td>
</tr>
<tr>
<td align="center" valign="bottom">Sim1
<hr></hr>
</td>
<td align="center" valign="bottom">(1/2)
<hr></hr>
</td>
<td align="center" valign="bottom">0.81
<hr></hr>
</td>
<td align="center" valign="bottom">0.80
<hr></hr>
</td>
<td align="center" valign="bottom">0.98
<hr></hr>
</td>
<td align="center" valign="bottom">0.91
<hr></hr>
</td>
<td align="center" valign="bottom">0.87
<hr></hr>
</td>
<td align="center" valign="bottom">0.88
<hr></hr>
</td>
</tr>
<tr>
<td align="center" valign="bottom">Sim1
<hr></hr>
</td>
<td align="center" valign="bottom">(1/1)
<hr></hr>
</td>
<td align="center" valign="bottom">0.84
<hr></hr>
</td>
<td align="center" valign="bottom">0.89
<hr></hr>
</td>
<td align="center" valign="bottom">0.97
<hr></hr>
</td>
<td align="center" valign="bottom">0.94
<hr></hr>
</td>
<td align="center" valign="bottom">0.92
<hr></hr>
</td>
<td align="center" valign="bottom">0.91
<hr></hr>
</td>
</tr>
<tr>
<td align="center" valign="bottom">Sim2
<hr></hr>
</td>
<td align="center" valign="bottom">(1/32)
<hr></hr>
</td>
<td align="center" valign="bottom">0.93
<hr></hr>
</td>
<td align="center" valign="bottom">0.25
<hr></hr>
</td>
<td align="center" valign="bottom">0.99
<hr></hr>
</td>
<td align="center" valign="bottom">0.70
<hr></hr>
</td>
<td align="center" valign="bottom">0.40
<hr></hr>
</td>
<td align="center" valign="bottom">0.74
<hr></hr>
</td>
</tr>
<tr>
<td align="center" valign="bottom">Sim2
<hr></hr>
</td>
<td align="center" valign="bottom">(1/16)
<hr></hr>
</td>
<td align="center" valign="bottom">0.95
<hr></hr>
</td>
<td align="center" valign="bottom">0.36
<hr></hr>
</td>
<td align="center" valign="bottom">0.99
<hr></hr>
</td>
<td align="center" valign="bottom">0.75
<hr></hr>
</td>
<td align="center" valign="bottom">0.53
<hr></hr>
</td>
<td align="center" valign="bottom">0.78
<hr></hr>
</td>
</tr>
<tr>
<td align="center" valign="bottom">Sim2
<hr></hr>
</td>
<td align="center" valign="bottom">(1/8)
<hr></hr>
</td>
<td align="center" valign="bottom">0.93
<hr></hr>
</td>
<td align="center" valign="bottom">0.49
<hr></hr>
</td>
<td align="center" valign="bottom">0.98
<hr></hr>
</td>
<td align="center" valign="bottom">0.79
<hr></hr>
</td>
<td align="center" valign="bottom">0.64
<hr></hr>
</td>
<td align="center" valign="bottom">0.78
<hr></hr>
</td>
</tr>
<tr>
<td align="center" valign="bottom">Sim2
<hr></hr>
</td>
<td align="center" valign="bottom">(1/4)
<hr></hr>
</td>
<td align="center" valign="bottom">0.95
<hr></hr>
</td>
<td align="center" valign="bottom">0.62
<hr></hr>
</td>
<td align="center" valign="bottom">0.98
<hr></hr>
</td>
<td align="center" valign="bottom">0.84
<hr></hr>
</td>
<td align="center" valign="bottom">0.75
<hr></hr>
</td>
<td align="center" valign="bottom">0.83
<hr></hr>
</td>
</tr>
<tr>
<td align="center" valign="bottom">Sim2
<hr></hr>
</td>
<td align="center" valign="bottom">(1/2)
<hr></hr>
</td>
<td align="center" valign="bottom">0.97
<hr></hr>
</td>
<td align="center" valign="bottom">0.70
<hr></hr>
</td>
<td align="center" valign="bottom">0.98
<hr></hr>
</td>
<td align="center" valign="bottom">0.88
<hr></hr>
</td>
<td align="center" valign="bottom">0.81
<hr></hr>
</td>
<td align="center" valign="bottom">0.87
<hr></hr>
</td>
</tr>
<tr>
<td align="center">Sim2</td>
<td align="center">(1/1)</td>
<td align="center">0.95</td>
<td align="center">0.81</td>
<td align="center">0.97</td>
<td align="center">0.91</td>
<td align="center">0.87</td>
<td align="center">0.87</td>
</tr>
</tbody>
</table>
</table-wrap>
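<p>For readers who wish to recompute the statistics in Table 1 from the confusion tables in Additional file 3, the following Python sketch applies the standard definitions of precision, sensitivity, specificity, accuracy, F-measure and Matthews correlation coefficient. The function name <monospace>classification_stats</monospace> and the example counts are illustrative assumptions; they are not taken from the MetaPathways code or from the supplementary data.</p>
<preformat>
```python
import math


def _ratio(numerator, denominator):
    """Divide, returning NaN when the denominator is zero."""
    return float("nan") if denominator == 0 else numerator / denominator


def classification_stats(tp, fp, tn, fn):
    """Compute Table 1 style statistics from confusion-table counts.

    tp, fp, tn, fn are counts of true positive, false positive,
    true negative and false negative pathway predictions.
    """
    mcc_denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "precision": _ratio(tp, tp + fp),
        "sensitivity": _ratio(tp, tp + fn),    # recall; lowered by Type II errors
        "specificity": _ratio(tn, tn + fp),    # lowered by Type I errors
        "accuracy": _ratio(tp + tn, tp + fp + tn + fn),
        # Harmonic mean of precision and sensitivity, written in count form.
        "f_measure": _ratio(2 * tp, 2 * tp + fp + fn),
        "matthews": _ratio(tp * tn - fp * fn, mcc_denominator),
    }


# Hypothetical counts, for illustration only (not from Additional file 3).
stats = classification_stats(tp=80, fp=10, tn=300, fn=20)
for name, value in stats.items():
    print(f"{name}: {value:.2f}")
```
</preformat>
<p>Returning NaN for an empty denominator is a design choice of this sketch; it avoids a division error when, for example, no pathways are predicted at very low coverage.</p>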
</sec>
</sec>
<sec>
<title>Related work</title>
<p>While efforts to model microbial community structure in relation to environmental parameters have successfully predicted real-world distribution and diversity patterns in the surface ocean [
<xref ref-type="bibr" rid="B43">43</xref>
-
<xref ref-type="bibr" rid="B45">45</xref>
], the extension of modeling approaches to microbial metabolic interaction networks remains nascent. Function-based models such as Predicted Relative Metabolic Turnover (PRMT) predict metabolic flux in the environment based on the abundance of unique functional annotations using MG-RAST [
<xref ref-type="bibr" rid="B46">46</xref>
]. More recently, Abubucker and colleagues developed the Human Microbiome Project Unified Metabolic Analysis Network (HUMAnN) for metabolic reconstruction [
<xref ref-type="bibr" rid="B47">47</xref>
]. HUMAnN integrates MinPath to reconcile the multiple mapping problem associated with BLAST-based annotations for metabolic inference based on KEGG pathways and SEED subsystems [
<xref ref-type="bibr" rid="B48">48</xref>
], with additional taxonomic limitation and gap-filling algorithms to reduce false positives and correct for rare genes in abundant pathways. HUMAnN results have been compared using the Metagenomics Reports (METAREP) data storage and retrieval pipeline, which supports scalable and dynamic analysis of complex environmental datasets [
<xref ref-type="bibr" rid="B49">49</xref>
]. While Pathway Tools uses its own set of biochemical rules for pathway prediction, an alternative to Pathway Tools for the construction of genome-scale metabolic networks has also been integrated into SEED servers. This approach projects reactions onto the comparatively coarser KEGG metabolic map without further filtering or weighting of results, and applies mixed integer linear optimization to fill reaction gaps [
<xref ref-type="bibr" rid="B50">50</xref>
,
<xref ref-type="bibr" rid="B51">51</xref>
]. However, this method has not yet been applied to metabolic interaction networks in the environment.</p>
</sec>
<sec>
<title>Pipeline limitations</title>
<p>Compared to current methods that project functional annotations from environmental sequence information onto KEGG pathways or SEED subsystems, MetaPathways enables an alternative algorithmic approach to metabolic reconstruction using evolutionarily conserved pathway prediction based on coverage and biochemical pathway rules. Moreover, the pipeline performs taxonomic binning and functional gene annotation, integrates external resource partitioning on compute clusters using the Sun Grid Engine, and supports useful data transformation and formatting options. While we have demonstrated pipeline scalability with next-generation sequencing datasets, further improvements to computationally intensive stages including BLAST/LAST-based annotation and ePGDB construction are needed to keep pace with projected advances in sequencing throughput. Future pipeline implementations will enable users to harness multi-core desktop computers to build local grid engines or to externalize BLAST and ePGDB construction on commercial computing resources such as the Amazon Elastic Compute Cloud (EC2). As an alternative to comprehensive all-against-all homology searches, future pipeline implementations will also incorporate scalable and distributed clustering algorithms enabling functional annotation based on hierarchical cluster assignments [
<xref ref-type="bibr" rid="B17">17</xref>
,
<xref ref-type="bibr" rid="B52">52</xref>
-
<xref ref-type="bibr" rid="B54">54</xref>
].</p>
<p>Aside from runtime improvements, additional data transformation and visual analysis modules expanding on existing taxonomic binning and marker gene identification components are needed. These include coverage statistics for assembled sequence information, data matrices and interactive visualizations indicating numerical abundance and taxonomic distribution of enzymatic steps, self-organizing maps, and automated methods to append single-cell or population genome assemblies to the NCBI hierarchy for more accurate taxonomic binning. Additional reference databases for 5S, 7S and 23S RNA genes and updates to the current MetaCyc database that include more biogeochemically relevant pathways are needed to improve BLAST and cluster-based annotation efforts. Finally, more experience and operational insight are needed in constructing, comparing and interpreting ePGDBs to identify potential sources of error and inform ongoing Pathway Tools development efforts.</p>
</sec>
</sec>
<sec sec-type="conclusions">
<title>Conclusions</title>
<p>MetaPathways provides users with a modular annotation and analysis pipeline for predicting metabolic interaction networks from environmental sequence information. It is extensible to genomic and transcriptomic datasets from multiple sequencing platforms, and generates useful data products for microbial community structure and functional analysis including phylogenetic trees, taxonomic bins and tabular annotation files. The pipeline provides local and external computing solutions for implementing BLAST/LAST homology searches, resolves data handling issues associated with .gbk and .gff file conversion and NCBI submission, and generates ePGDBs using Pathway Tools for pathway inference and interactive visualization. The MetaPathways software, installation instructions, tutorials and example data can be obtained from
<ext-link ext-link-type="uri" xlink:href="http://github.com/hallamlab/MetaPathways/">http://github.com/hallamlab/MetaPathways/</ext-link>
or
<ext-link ext-link-type="uri" xlink:href="http://hallam.microbiology.ubc.ca/MetaPathways">http://hallam.microbiology.ubc.ca/MetaPathways</ext-link>
.</p>
</sec>
<sec>
<title>Availability and requirements</title>
<p>
<bold>Project Name:</bold>
MetaPathways 1.0.</p>
<p>
<bold>Project Home Page:</bold>
<ext-link ext-link-type="uri" xlink:href="http://hallam.microbiology.ubc.ca/MetaPathways">http://hallam.microbiology.ubc.ca/MetaPathways</ext-link>
.</p>
<p>
<bold>Operating system(s):</bold>
Linux/Unix, Mac OS X 10.6.x or later, Windows XP.</p>
<p>
<bold>Programming Languages:</bold>
C/C++, Python 2.7+, Perl.</p>
<p>
<bold>Other Requirements:</bold>
GCC compiler, NCBI BLAST 2.2.25+, Pathway Tools 16.0.</p>
<p>
<bold>License:</bold>
GNU GPL; academic licenses are needed for BLAST and Pathway Tools.</p>
</sec>
<sec>
<title>Abbreviations</title>
<p>ePGDB: Environmental Pathway/Genome Database; ORF: Open reading frame; EC: Enzyme commission; REST: Representational state transfer.</p>
</sec>
<sec>
<title>Competing interests</title>
<p>The authors are unaware of any competing interests.</p>
</sec>
<sec>
<title>Authors’ contributions</title>
<p>KMK was the primary pipeline developer and co-wrote the paper. NWH assisted with pipeline development and co-wrote the paper. APP helped conceptualize and build initial pipeline implementations. SJH conceived pipeline architecture, provided essential feedback on data products, integration and formatting, and co-wrote the paper. All authors read and approved the final manuscript.</p>
</sec>
<sec sec-type="supplementary-material">
<title>Supplementary Material</title>
<supplementary-material content-type="local-data" id="S1">
<caption>
<title>Additional file 1</title>
<p>MetaPathways validation summary based on a comparison of three sequencing methods on a common sample.</p>
</caption>
<media xlink:href="1471-2105-14-202-S1.xlsx">
<caption>
<p>Click here for file</p>
</caption>
</media>
</supplementary-material>
<supplementary-material content-type="local-data" id="S2">
<caption>
<title>Additional file 2</title>
<p>Source genome statistics for simulated metagenomes Sim1 and Sim2.</p>
</caption>
<media xlink:href="1471-2105-14-202-S2.xlsx">
<caption>
<p>Click here for file</p>
</caption>
</media>
</supplementary-material>
<supplementary-material content-type="local-data" id="S3">
<caption>
<title>Additional file 3</title>
<p>Confusion tables for classification analysis of simulated metagenomes Sim1 and Sim2 at progressively larger sequence coverage.</p>
</caption>
<media xlink:href="1471-2105-14-202-S3.xlsx">
<caption>
<p>Click here for file</p>
</caption>
</media>
</supplementary-material>
</sec>
</body>
<back>
<sec>
<title>Acknowledgements</title>
<p>This work was carried out under the auspices of Genome Canada, Genome British Columbia, the Natural Sciences and Engineering Research Council (NSERC) of Canada, the Canada Foundation for Innovation (CFI) and the Canadian Institute for Advanced Research (CIFAR). The Western Canadian Research Grid (WestGrid) provided access to high-performance computing resources. KMK was supported by the Tula Foundation funded Centre for Microbial Diversity and Evolution (CMDE) at UBC. We would like to thank Charles Howes and Simon Eng for technical support, Peter Karp and the SRI International staff for invaluable comments related to design ethos and implementation of MetaPathways, and Dr. Julia Foght at the University of Alberta for providing genomic sequence information from a naphtha-degrading enrichment culture used in pipeline performance evaluation.</p>
</sec>
<ref-list>
<ref id="B1">
<mixed-citation publication-type="journal">
<name>
<surname>Wright</surname>
<given-names>JJ</given-names>
</name>
<name>
<surname>Konwar</surname>
<given-names>KM</given-names>
</name>
<name>
<surname>Hallam</surname>
<given-names>SJ</given-names>
</name>
<article-title>Microbial ecology of expanding oxygen minimum zones</article-title>
<source>Nat Rev Microbiol</source>
<year>2012</year>
<volume>10</volume>
<fpage>381</fpage>
<lpage>394</lpage>
<pub-id pub-id-type="pmid">22580367</pub-id>
</mixed-citation>
</ref>
<ref id="B2">
<mixed-citation publication-type="journal">
<name>
<surname>Delong</surname>
<given-names>EF</given-names>
</name>
<article-title>Towards microbial systems science: integrating microbial perspective, from genomes to biomes</article-title>
<source>Environ Microbiol</source>
<year>2002</year>
<volume>4</volume>
<fpage>9</fpage>
<lpage>10</lpage>
<pub-id pub-id-type="doi">10.1046/j.1462-2920.2002.t01-12-00257.x</pub-id>
<pub-id pub-id-type="pmid">11966814</pub-id>
</mixed-citation>
</ref>
<ref id="B3">
<mixed-citation publication-type="journal">
<name>
<surname>Falkowski</surname>
<given-names>PG</given-names>
</name>
<name>
<surname>Fenchel</surname>
<given-names>T</given-names>
</name>
<name>
<surname>Delong</surname>
<given-names>EF</given-names>
</name>
<article-title>The microbial engines that drive Earth's biogeochemical cycles</article-title>
<source>Science</source>
<year>2008</year>
<volume>320</volume>
<fpage>1034</fpage>
<lpage>1039</lpage>
<pub-id pub-id-type="doi">10.1126/science.1153213</pub-id>
<pub-id pub-id-type="pmid">18497287</pub-id>
</mixed-citation>
</ref>
<ref id="B4">
<mixed-citation publication-type="journal">
<name>
<surname>Kunin</surname>
<given-names>V</given-names>
</name>
<name>
<surname>Copeland</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Lapidus</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Mavromatis</surname>
<given-names>K</given-names>
</name>
<name>
<surname>Hugenholtz</surname>
<given-names>P</given-names>
</name>
<article-title>A bioinformatician's guide to metagenomics</article-title>
<source>Microbiol Mol Biol Rev</source>
<year>2008</year>
<volume>72</volume>
<fpage>557</fpage>
<lpage>578</lpage>
<pub-id pub-id-type="doi">10.1128/MMBR.00009-08</pub-id>
<pub-id pub-id-type="pmid">19052320</pub-id>
</mixed-citation>
</ref>
<ref id="B5">
<mixed-citation publication-type="journal">
<name>
<surname>Mavromatis</surname>
<given-names>K</given-names>
</name>
<name>
<surname>Ivanova</surname>
<given-names>N</given-names>
</name>
<name>
<surname>Barry</surname>
<given-names>K</given-names>
</name>
<name>
<surname>Shapiro</surname>
<given-names>H</given-names>
</name>
<name>
<surname>Goltsman</surname>
<given-names>E</given-names>
</name>
<name>
<surname>McHardy</surname>
<given-names>AC</given-names>
</name>
<name>
<surname>Rigoutsos</surname>
<given-names>I</given-names>
</name>
<name>
<surname>Salamov</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Korzeniewski</surname>
<given-names>F</given-names>
</name>
<name>
<surname>Land</surname>
<given-names>M</given-names>
</name>
<etal></etal>
<article-title>Use of simulated data sets to evaluate the fidelity of metagenomic processing methods</article-title>
<source>Nat Meth</source>
<year>2007</year>
<volume>4</volume>
<fpage>495</fpage>
<lpage>500</lpage>
<pub-id pub-id-type="doi">10.1038/nmeth1043</pub-id>
</mixed-citation>
</ref>
<ref id="B6">
<mixed-citation publication-type="journal">
<name>
<surname>Wooley</surname>
<given-names>JC</given-names>
</name>
<name>
<surname>Godzik</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Friedberg</surname>
<given-names>I</given-names>
</name>
<article-title>A primer on metagenomics</article-title>
<source>PLoS Comput Biol</source>
<year>2010</year>
<volume>6</volume>
<fpage>e1000667</fpage>
<pub-id pub-id-type="doi">10.1371/journal.pcbi.1000667</pub-id>
<pub-id pub-id-type="pmid">20195499</pub-id>
</mixed-citation>
</ref>
<ref id="B7">
<mixed-citation publication-type="journal">
<name>
<surname>Altschul</surname>
<given-names>SF</given-names>
</name>
<name>
<surname>Gish</surname>
<given-names>W</given-names>
</name>
<name>
<surname>Miller</surname>
<given-names>W</given-names>
</name>
<name>
<surname>Myers</surname>
<given-names>EW</given-names>
</name>
<name>
<surname>Lipman</surname>
<given-names>DJ</given-names>
</name>
<article-title>Basic local alignment search tool</article-title>
<source>J Mol Biol</source>
<year>1990</year>
<volume>215</volume>
<fpage>403</fpage>
<lpage>410</lpage>
<pub-id pub-id-type="pmid">2231712</pub-id>
</mixed-citation>
</ref>
<ref id="B8">
<mixed-citation publication-type="journal">
<name>
<surname>Kanehisa</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Goto</surname>
<given-names>S</given-names>
</name>
<article-title>KEGG: Kyoto Encyclopedia of Genes and Genomes</article-title>
<source>Nucleic Acids Res</source>
<year>2000</year>
<volume>28</volume>
<fpage>27</fpage>
<lpage>30</lpage>
<pub-id pub-id-type="doi">10.1093/nar/28.1.27</pub-id>
<pub-id pub-id-type="pmid">10592173</pub-id>
</mixed-citation>
</ref>
<ref id="B9">
<mixed-citation publication-type="journal">
<name>
<surname>Okuda</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Yamada</surname>
<given-names>T</given-names>
</name>
<name>
<surname>Hamajima</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Itoh</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Katayama</surname>
<given-names>T</given-names>
</name>
<name>
<surname>Bork</surname>
<given-names>P</given-names>
</name>
<name>
<surname>Goto</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Kanehisa</surname>
<given-names>M</given-names>
</name>
<article-title>KEGG Atlas mapping for global analysis of metabolic pathways</article-title>
<source>Nucleic Acids Res</source>
<year>2008</year>
<volume>36</volume>
<fpage>W423</fpage>
<lpage>W426</lpage>
<pub-id pub-id-type="doi">10.1093/nar/gkn282</pub-id>
<pub-id pub-id-type="pmid">18477636</pub-id>
</mixed-citation>
</ref>
<ref id="B10">
<mixed-citation publication-type="journal">
<name>
<surname>Claudel Renard</surname>
<given-names>C</given-names>
</name>
<name>
<surname>Chevalet</surname>
<given-names>C</given-names>
</name>
<name>
<surname>Faraut</surname>
<given-names>T</given-names>
</name>
<name>
<surname>Kahn</surname>
<given-names>D</given-names>
</name>
<article-title>Enzyme-specific profiles for genome annotation: PRIAM</article-title>
<source>Nucleic Acids Res</source>
<year>2003</year>
<volume>31</volume>
<fpage>6633</fpage>
<lpage>6639</lpage>
<pub-id pub-id-type="doi">10.1093/nar/gkg847</pub-id>
<pub-id pub-id-type="pmid">14602924</pub-id>
</mixed-citation>
</ref>
<ref id="B11">
<mixed-citation publication-type="journal">
<name>
<surname>Overbeek</surname>
<given-names>R</given-names>
</name>
<name>
<surname>Begley</surname>
<given-names>T</given-names>
</name>
<name>
<surname>Butler</surname>
<given-names>RM</given-names>
</name>
<name>
<surname>Choudhuri</surname>
<given-names>JV</given-names>
</name>
<name>
<surname>Chuang</surname>
<given-names>H-Y</given-names>
</name>
<name>
<surname>Cohoon</surname>
<given-names>M</given-names>
</name>
<name>
<surname>de Crécy-Lagard</surname>
<given-names>V</given-names>
</name>
<name>
<surname>Diaz</surname>
<given-names>N</given-names>
</name>
<name>
<surname>Disz</surname>
<given-names>T</given-names>
</name>
<name>
<surname>Edwards</surname>
<given-names>R</given-names>
</name>
<etal></etal>
<article-title>The subsystems approach to genome annotation and its use in the project to annotate 1000 genomes</article-title>
<source>Nucleic Acids Res</source>
<year>2005</year>
<volume>33</volume>
<fpage>5691</fpage>
<lpage>5702</lpage>
<pub-id pub-id-type="doi">10.1093/nar/gki866</pub-id>
<pub-id pub-id-type="pmid">16214803</pub-id>
</mixed-citation>
</ref>
<ref id="B12">
<mixed-citation publication-type="journal">
<name>
<surname>Markowitz</surname>
<given-names>VM</given-names>
</name>
<name>
<surname>Ivanova</surname>
<given-names>NN</given-names>
</name>
<name>
<surname>Szeto</surname>
<given-names>E</given-names>
</name>
<name>
<surname>Palaniappan</surname>
<given-names>K</given-names>
</name>
<name>
<surname>Chu</surname>
<given-names>K</given-names>
</name>
<name>
<surname>Dalevi</surname>
<given-names>D</given-names>
</name>
<name>
<surname>Chen</surname>
<given-names>I-MA</given-names>
</name>
<name>
<surname>Grechkin</surname>
<given-names>Y</given-names>
</name>
<name>
<surname>Dubchak</surname>
<given-names>I</given-names>
</name>
<name>
<surname>Anderson</surname>
<given-names>I</given-names>
</name>
<etal></etal>
<article-title>IMG/M: a data management and analysis system for metagenomes</article-title>
<source>Nucleic Acids Res</source>
<year>2008</year>
<volume>36</volume>
<fpage>D534</fpage>
<lpage>D538</lpage>
<pub-id pub-id-type="pmid">17932063</pub-id>
</mixed-citation>
</ref>
<ref id="B13">
<mixed-citation publication-type="journal">
<name>
<surname>Markowitz</surname>
<given-names>VM</given-names>
</name>
<name>
<surname>Chen</surname>
<given-names>I-MA</given-names>
</name>
<name>
<surname>Chu</surname>
<given-names>K</given-names>
</name>
<name>
<surname>Szeto</surname>
<given-names>E</given-names>
</name>
<name>
<surname>Palaniappan</surname>
<given-names>K</given-names>
</name>
<name>
<surname>Grechkin</surname>
<given-names>Y</given-names>
</name>
<name>
<surname>Ratner</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Jacob</surname>
<given-names>B</given-names>
</name>
<name>
<surname>Pati</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Huntemann</surname>
<given-names>M</given-names>
</name>
<etal></etal>
<article-title>IMG/M: the integrated metagenome data management and comparative analysis system</article-title>
<source>Nucleic Acids Res</source>
<year>2012</year>
<volume>40</volume>
<fpage>D123</fpage>
<lpage>D129</lpage>
<pub-id pub-id-type="doi">10.1093/nar/gkr975</pub-id>
<pub-id pub-id-type="pmid">22086953</pub-id>
</mixed-citation>
</ref>
<ref id="B14">
<mixed-citation publication-type="journal">
<name>
<surname>Seshadri</surname>
<given-names>R</given-names>
</name>
<name>
<surname>Kravitz</surname>
<given-names>SA</given-names>
</name>
<name>
<surname>Smarr</surname>
<given-names>L</given-names>
</name>
<name>
<surname>Gilna</surname>
<given-names>P</given-names>
</name>
<name>
<surname>Frazier</surname>
<given-names>M</given-names>
</name>
<article-title>CAMERA: a community resource for metagenomics</article-title>
<source>PLoS Biol</source>
<year>2007</year>
<volume>5</volume>
<fpage>e75</fpage>
<pub-id pub-id-type="doi">10.1371/journal.pbio.0050075</pub-id>
<pub-id pub-id-type="pmid">17355175</pub-id>
</mixed-citation>
</ref>
<ref id="B15">
<mixed-citation publication-type="journal">
<name>
<surname>Meyer</surname>
<given-names>F</given-names>
</name>
<name>
<surname>Paarmann</surname>
<given-names>D</given-names>
</name>
<name>
<surname>D'Souza</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Olson</surname>
<given-names>R</given-names>
</name>
<name>
<surname>Glass</surname>
<given-names>EM</given-names>
</name>
<name>
<surname>Kubal</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Paczian</surname>
<given-names>T</given-names>
</name>
<name>
<surname>Rodriguez</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Stevens</surname>
<given-names>R</given-names>
</name>
<name>
<surname>Wilke</surname>
<given-names>A</given-names>
</name>
<etal></etal>
<article-title>The metagenomics RAST server – a public resource for the automatic phylogenetic and functional analysis of metagenomes</article-title>
<source>BMC Bioinformatics</source>
<year>2008</year>
<volume>9</volume>
<fpage>386</fpage>
<pub-id pub-id-type="doi">10.1186/1471-2105-9-386</pub-id>
<pub-id pub-id-type="pmid">18803844</pub-id>
</mixed-citation>
</ref>
<ref id="B16">
<mixed-citation publication-type="journal">
<name>
<surname>Aziz</surname>
<given-names>RK</given-names>
</name>
<name>
<surname>Bartels</surname>
<given-names>D</given-names>
</name>
<name>
<surname>Best</surname>
<given-names>AA</given-names>
</name>
<name>
<surname>DeJongh</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Disz</surname>
<given-names>T</given-names>
</name>
<name>
<surname>Edwards</surname>
<given-names>RA</given-names>
</name>
<name>
<surname>Formsma</surname>
<given-names>K</given-names>
</name>
<name>
<surname>Gerdes</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Glass</surname>
<given-names>EM</given-names>
</name>
<name>
<surname>Kubal</surname>
<given-names>M</given-names>
</name>
<etal></etal>
<article-title>The RAST Server: Rapid Annotations using Subsystems Technology</article-title>
<source>BMC Genomics</source>
<year>2008</year>
<volume>9</volume>
<fpage>75</fpage>
<pub-id pub-id-type="doi">10.1186/1471-2164-9-75</pub-id>
<pub-id pub-id-type="pmid">18261238</pub-id>
</mixed-citation>
</ref>
<ref id="B17">
<mixed-citation publication-type="journal">
<name>
<surname>Meyer</surname>
<given-names>F</given-names>
</name>
<name>
<surname>Overbeek</surname>
<given-names>R</given-names>
</name>
<name>
<surname>Rodriguez</surname>
<given-names>A</given-names>
</name>
<article-title>FIGfams: yet another set of protein families</article-title>
<source>Nucleic Acids Res</source>
<year>2009</year>
<volume>37</volume>
<fpage>6643</fpage>
<lpage>6654</lpage>
<pub-id pub-id-type="doi">10.1093/nar/gkp698</pub-id>
<pub-id pub-id-type="pmid">19762480</pub-id>
</mixed-citation>
</ref>
<ref id="B18">
<mixed-citation publication-type="journal">
<name>
<surname>Karp</surname>
<given-names>PD</given-names>
</name>
<name>
<surname>Paley</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Romero</surname>
<given-names>P</given-names>
</name>
<article-title>The Pathway Tools software</article-title>
<source>Bioinformatics</source>
<year>2002</year>
<volume>18</volume>
<fpage>S225</fpage>
<lpage>S232</lpage>
<pub-id pub-id-type="doi">10.1093/bioinformatics/18.suppl_1.S225</pub-id>
<pub-id pub-id-type="pmid">12169551</pub-id>
</mixed-citation>
</ref>
<ref id="B19">
<mixed-citation publication-type="journal">
<name>
<surname>Karp</surname>
<given-names>PD</given-names>
</name>
<name>
<surname>Paley</surname>
<given-names>SM</given-names>
</name>
<name>
<surname>Krummenacker</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Latendresse</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Dale</surname>
<given-names>JM</given-names>
</name>
<name>
<surname>Lee</surname>
<given-names>TJ</given-names>
</name>
<name>
<surname>Kaipa</surname>
<given-names>P</given-names>
</name>
<name>
<surname>Gilham</surname>
<given-names>F</given-names>
</name>
<name>
<surname>Spaulding</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Popescu</surname>
<given-names>L</given-names>
</name>
<etal></etal>
<article-title>Pathway Tools version 13.0: integrated software for pathway/genome informatics and systems biology</article-title>
<source>Brief Bioinformatics</source>
<year>2010</year>
<volume>11</volume>
<fpage>40</fpage>
<lpage>79</lpage>
<pub-id pub-id-type="doi">10.1093/bib/bbp043</pub-id>
<pub-id pub-id-type="pmid">19955237</pub-id>
</mixed-citation>
</ref>
<ref id="B20">
<mixed-citation publication-type="journal">
<name>
<surname>Karp</surname>
<given-names>PD</given-names>
</name>
<name>
<surname>Latendresse</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Caspi</surname>
<given-names>R</given-names>
</name>
<article-title>The Pathway Tools pathway prediction algorithm</article-title>
<source>Stand Genomic Sci</source>
<year>2011</year>
<volume>5</volume>
<fpage>424</fpage>
<lpage>429</lpage>
<pub-id pub-id-type="doi">10.4056/sigs.1794338</pub-id>
<pub-id pub-id-type="pmid">22675592</pub-id>
</mixed-citation>
</ref>
<ref id="B21">
<mixed-citation publication-type="journal">
<name>
<surname>Latendresse</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Krummenacker</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Trupp</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Karp</surname>
<given-names>PD</given-names>
</name>
<article-title>Construction and completion of flux balance models from pathway databases</article-title>
<source>Bioinformatics</source>
<year>2012</year>
<volume>28</volume>
<fpage>388</fpage>
<lpage>396</lpage>
<pub-id pub-id-type="doi">10.1093/bioinformatics/btr681</pub-id>
<pub-id pub-id-type="pmid">22262672</pub-id>
</mixed-citation>
</ref>
<ref id="B22">
<mixed-citation publication-type="journal">
<name>
<surname>Hucka</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Finney</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Sauro</surname>
<given-names>HM</given-names>
</name>
<name>
<surname>Bolouri</surname>
<given-names>H</given-names>
</name>
<name>
<surname>Doyle</surname>
<given-names>JC</given-names>
</name>
<name>
<surname>Kitano</surname>
<given-names>H</given-names>
</name>
<name>
<surname>Arkin</surname>
<given-names>AP</given-names>
</name>
<name>
<surname>Bornstein</surname>
<given-names>BJ</given-names>
</name>
<name>
<surname>Bray</surname>
<given-names>D</given-names>
</name>
<name>
<surname>Cornish-Bowden</surname>
<given-names>A</given-names>
</name>
<etal></etal>
<article-title>The systems biology markup language (SBML): a medium for representation and exchange of biochemical network models</article-title>
<source>Bioinformatics</source>
<year>2003</year>
<volume>19</volume>
<fpage>524</fpage>
<lpage>531</lpage>
<pub-id pub-id-type="doi">10.1093/bioinformatics/btg015</pub-id>
<pub-id pub-id-type="pmid">12611808</pub-id>
</mixed-citation>
</ref>
<ref id="B23">
<mixed-citation publication-type="journal">
<name>
<surname>Karp</surname>
<given-names>PD</given-names>
</name>
<name>
<surname>Riley</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Saier</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Paulsen</surname>
<given-names>IT</given-names>
</name>
<name>
<surname>Paley</surname>
<given-names>SM</given-names>
</name>
<name>
<surname>Pellegrini-Toole</surname>
<given-names>A</given-names>
</name>
<article-title>The EcoCyc and MetaCyc databases</article-title>
<source>Nucleic Acids Res</source>
<year>2000</year>
<volume>28</volume>
<fpage>56</fpage>
<lpage>59</lpage>
<pub-id pub-id-type="doi">10.1093/nar/28.1.56</pub-id>
<pub-id pub-id-type="pmid">10592180</pub-id>
</mixed-citation>
</ref>
<ref id="B24">
<mixed-citation publication-type="journal">
<name>
<surname>Caspi</surname>
<given-names>R</given-names>
</name>
<name>
<surname>Altman</surname>
<given-names>T</given-names>
</name>
<name>
<surname>Dreher</surname>
<given-names>K</given-names>
</name>
<name>
<surname>Fulcher</surname>
<given-names>CA</given-names>
</name>
<name>
<surname>Subhraveti</surname>
<given-names>P</given-names>
</name>
<name>
<surname>Keseler</surname>
<given-names>IM</given-names>
</name>
<name>
<surname>Kothari</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Krummenacker</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Latendresse</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Mueller</surname>
<given-names>LA</given-names>
</name>
<etal></etal>
<article-title>The MetaCyc database of metabolic pathways and enzymes and the BioCyc collection of pathway/genome databases</article-title>
<source>Nucleic Acids Res</source>
<year>2012</year>
<volume>40</volume>
<fpage>D742</fpage>
<lpage>D753</lpage>
<pub-id pub-id-type="doi">10.1093/nar/gkr1014</pub-id>
<pub-id pub-id-type="pmid">22102576</pub-id>
</mixed-citation>
</ref>
<ref id="B25">
<mixed-citation publication-type="journal">
<name>
<surname>Latendresse</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Paley</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Karp</surname>
<given-names>PD</given-names>
</name>
<article-title>Browsing metabolic and regulatory networks with BioCyc</article-title>
<source>Methods Mol Biol</source>
<year>2012</year>
<volume>804</volume>
<fpage>197</fpage>
<lpage>216</lpage>
<pub-id pub-id-type="doi">10.1007/978-1-61779-361-5_11</pub-id>
<pub-id pub-id-type="pmid">22144155</pub-id>
</mixed-citation>
</ref>
<ref id="B26">
<mixed-citation publication-type="journal">
<name>
<surname>Stark</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Berger</surname>
<given-names>SA</given-names>
</name>
<name>
<surname>Stamatakis</surname>
<given-names>A</given-names>
</name>
<name>
<surname>von Mering</surname>
<given-names>C</given-names>
</name>
<article-title>MLTreeMap–accurate Maximum Likelihood placement of environmental DNA sequences into taxonomic and functional reference phylogenies</article-title>
<source>BMC Genomics</source>
<year>2010</year>
<volume>11</volume>
<fpage>461</fpage>
<pub-id pub-id-type="doi">10.1186/1471-2164-11-461</pub-id>
<pub-id pub-id-type="pmid">20687950</pub-id>
</mixed-citation>
</ref>
<ref id="B27">
<mixed-citation publication-type="journal">
<name>
<surname>Hyatt</surname>
<given-names>D</given-names>
</name>
<name>
<surname>Chen</surname>
<given-names>G-L</given-names>
</name>
<name>
<surname>LoCascio</surname>
<given-names>PF</given-names>
</name>
<name>
<surname>Land</surname>
<given-names>ML</given-names>
</name>
<name>
<surname>Larimer</surname>
<given-names>FW</given-names>
</name>
<name>
<surname>Hauser</surname>
<given-names>LJ</given-names>
</name>
<article-title>Prodigal: prokaryotic gene recognition and translation initiation site identification</article-title>
<source>BMC Bioinformatics</source>
<year>2010</year>
<volume>11</volume>
<fpage>119</fpage>
<pub-id pub-id-type="doi">10.1186/1471-2105-11-119</pub-id>
<pub-id pub-id-type="pmid">20211023</pub-id>
</mixed-citation>
</ref>
<ref id="B28">
<mixed-citation publication-type="journal">
<name>
<surname>Tatusov</surname>
<given-names>RL</given-names>
</name>
<name>
<surname>Natale</surname>
<given-names>DA</given-names>
</name>
<name>
<surname>Garkavtsev</surname>
<given-names>IV</given-names>
</name>
<name>
<surname>Tatusova</surname>
<given-names>TA</given-names>
</name>
<name>
<surname>Shankavaram</surname>
<given-names>UT</given-names>
</name>
<name>
<surname>Rao</surname>
<given-names>BS</given-names>
</name>
<name>
<surname>Kiryutin</surname>
<given-names>B</given-names>
</name>
<name>
<surname>Galperin</surname>
<given-names>MY</given-names>
</name>
<name>
<surname>Fedorova</surname>
<given-names>ND</given-names>
</name>
<name>
<surname>Koonin</surname>
<given-names>EV</given-names>
</name>
<article-title>The COG database: new developments in phylogenetic classification of proteins from complete genomes</article-title>
<source>Nucleic Acids Res</source>
<year>2001</year>
<volume>29</volume>
<fpage>22</fpage>
<lpage>28</lpage>
<pub-id pub-id-type="doi">10.1093/nar/29.1.22</pub-id>
<pub-id pub-id-type="pmid">11125040</pub-id>
</mixed-citation>
</ref>
<ref id="B29">
<mixed-citation publication-type="journal">
<name>
<surname>Pruitt</surname>
<given-names>KD</given-names>
</name>
<name>
<surname>Maglott</surname>
<given-names>DR</given-names>
</name>
<article-title>RefSeq and LocusLink: NCBI gene-centered resources</article-title>
<source>Nucleic Acids Res</source>
<year>2001</year>
<volume>29</volume>
<fpage>137</fpage>
<lpage>140</lpage>
<pub-id pub-id-type="doi">10.1093/nar/29.1.137</pub-id>
<pub-id pub-id-type="pmid">11125071</pub-id>
</mixed-citation>
</ref>
<ref id="B30">
<mixed-citation publication-type="journal">
<name>
<surname>Kiełbasa</surname>
<given-names>SM</given-names>
</name>
<name>
<surname>Wan</surname>
<given-names>R</given-names>
</name>
<name>
<surname>Sato</surname>
<given-names>K</given-names>
</name>
<name>
<surname>Horton</surname>
<given-names>P</given-names>
</name>
<name>
<surname>Frith</surname>
<given-names>MC</given-names>
</name>
<article-title>Adaptive seeds tame genomic sequence comparison</article-title>
<source>Genome Res</source>
<year>2011</year>
<volume>21</volume>
<fpage>487</fpage>
<lpage>493</lpage>
<pub-id pub-id-type="doi">10.1101/gr.113985.110</pub-id>
<pub-id pub-id-type="pmid">21209072</pub-id>
</mixed-citation>
</ref>
<ref id="B31">
<mixed-citation publication-type="journal">
<name>
<surname>Rasko</surname>
<given-names>DA</given-names>
</name>
<name>
<surname>Myers</surname>
<given-names>GSA</given-names>
</name>
<name>
<surname>Ravel</surname>
<given-names>J</given-names>
</name>
<article-title>Visualization of comparative genomic analyses by BLAST score ratio</article-title>
<source>BMC Bioinformatics</source>
<year>2005</year>
<volume>6</volume>
<fpage>2</fpage>
<pub-id pub-id-type="doi">10.1186/1471-2105-6-2</pub-id>
<pub-id pub-id-type="pmid">15634352</pub-id>
</mixed-citation>
</ref>
<ref id="B32">
<mixed-citation publication-type="journal">
<name>
<surname>Rost</surname>
<given-names>B</given-names>
</name>
<article-title>Twilight zone of protein sequence alignments</article-title>
<source>Protein Eng</source>
<year>1999</year>
<volume>12</volume>
<fpage>85</fpage>
<lpage>94</lpage>
<pub-id pub-id-type="doi">10.1093/protein/12.2.85</pub-id>
<pub-id pub-id-type="pmid">10195279</pub-id>
</mixed-citation>
</ref>
<ref id="B33">
<mixed-citation publication-type="other">
<name>
<surname>Gentzsch</surname>
<given-names>W</given-names>
</name>
<article-title>Sun Grid Engine: towards creating a compute power grid</article-title>
<source>CCGRID-01. IEEE Comput. Soc</source>
<year>2001</year>
<fpage>35</fpage>
<lpage>36</lpage>
</mixed-citation>
</ref>
<ref id="B34">
<mixed-citation publication-type="journal">
<name>
<surname>Pruesse</surname>
<given-names>E</given-names>
</name>
<name>
<surname>Quast</surname>
<given-names>C</given-names>
</name>
<name>
<surname>Knittel</surname>
<given-names>K</given-names>
</name>
<name>
<surname>Fuchs</surname>
<given-names>BM</given-names>
</name>
<name>
<surname>Ludwig</surname>
<given-names>W</given-names>
</name>
<name>
<surname>Peplies</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Glöckner</surname>
<given-names>FO</given-names>
</name>
<article-title>SILVA: a comprehensive online resource for quality checked and aligned ribosomal RNA sequence data compatible with ARB</article-title>
<source>Nucleic Acids Res</source>
<year>2007</year>
<volume>35</volume>
<fpage>7188</fpage>
<lpage>7196</lpage>
<pub-id pub-id-type="doi">10.1093/nar/gkm864</pub-id>
<pub-id pub-id-type="pmid">17947321</pub-id>
</mixed-citation>
</ref>
<ref id="B35">
<mixed-citation publication-type="journal">
<name>
<surname>DeSantis</surname>
<given-names>TZ</given-names>
</name>
<name>
<surname>Hugenholtz</surname>
<given-names>P</given-names>
</name>
<name>
<surname>Larsen</surname>
<given-names>N</given-names>
</name>
<name>
<surname>Rojas</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Brodie</surname>
<given-names>EL</given-names>
</name>
<name>
<surname>Keller</surname>
<given-names>K</given-names>
</name>
<name>
<surname>Huber</surname>
<given-names>T</given-names>
</name>
<name>
<surname>Dalevi</surname>
<given-names>D</given-names>
</name>
<name>
<surname>Hu</surname>
<given-names>P</given-names>
</name>
<name>
<surname>Andersen</surname>
<given-names>GL</given-names>
</name>
<article-title>Greengenes, a chimera-checked 16S rRNA gene database and workbench compatible with ARB</article-title>
<source>Appl Environ Microbiol</source>
<year>2006</year>
<volume>72</volume>
<fpage>5069</fpage>
<lpage>5072</lpage>
<pub-id pub-id-type="doi">10.1128/AEM.03006-05</pub-id>
<pub-id pub-id-type="pmid">16820507</pub-id>
</mixed-citation>
</ref>
<ref id="B36">
<mixed-citation publication-type="journal">
<name>
<surname>Lowe</surname>
<given-names>TM</given-names>
</name>
<name>
<surname>Eddy</surname>
<given-names>SR</given-names>
</name>
<article-title>tRNAscan-SE: a program for improved detection of transfer RNA genes in genomic sequence</article-title>
<source>Nucleic Acids Res</source>
<year>1997</year>
<volume>25</volume>
<fpage>955</fpage>
<lpage>964</lpage>
</mixed-citation>
</ref>
<ref id="B37">
<mixed-citation publication-type="journal">
<name>
<surname>Huson</surname>
<given-names>DH</given-names>
</name>
<name>
<surname>Auch</surname>
<given-names>AF</given-names>
</name>
<name>
<surname>Qi</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Schuster</surname>
<given-names>SC</given-names>
</name>
<article-title>MEGAN analysis of metagenomic data</article-title>
<source>Genome Res</source>
<year>2007</year>
<volume>17</volume>
<fpage>377</fpage>
<lpage>386</lpage>
<pub-id pub-id-type="doi">10.1101/gr.5969107</pub-id>
<pub-id pub-id-type="pmid">17255551</pub-id>
</mixed-citation>
</ref>
<ref id="B38">
<mixed-citation publication-type="journal">
<name>
<surname>Latendresse</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Karp</surname>
<given-names>PD</given-names>
</name>
<article-title>An advanced web query interface for biological databases</article-title>
<source>Database (Oxford)</source>
<year>2010</year>
<volume>2010</volume>
<fpage>baq006</fpage>
<pub-id pub-id-type="doi">10.1093/database/baq006</pub-id>
<pub-id pub-id-type="pmid">20624715</pub-id>
</mixed-citation>
</ref>
<ref id="B39">
<mixed-citation publication-type="journal">
<name>
<surname>Paley</surname>
<given-names>SM</given-names>
</name>
<name>
<surname>Karp</surname>
<given-names>PD</given-names>
</name>
<article-title>The Pathway Tools cellular overview diagram and Omics Viewer</article-title>
<source>Nucleic Acids Res</source>
<year>2006</year>
<volume>34</volume>
<fpage>3771</fpage>
<lpage>3778</lpage>
<pub-id pub-id-type="doi">10.1093/nar/gkl334</pub-id>
<pub-id pub-id-type="pmid">16893960</pub-id>
</mixed-citation>
</ref>
<ref id="B40">
<mixed-citation publication-type="other">
<source>Western Canadian Research Grid (WestGrid)</source>
<comment>
<ext-link ext-link-type="uri" xlink:href="http://www.westgrid.ca/">http://www.westgrid.ca/</ext-link>
</comment>
</mixed-citation>
</ref>
<ref id="B41">
<mixed-citation publication-type="journal">
<name>
<surname>Dale</surname>
<given-names>JM</given-names>
</name>
<name>
<surname>Popescu</surname>
<given-names>L</given-names>
</name>
<name>
<surname>Karp</surname>
<given-names>PD</given-names>
</name>
<article-title>Machine learning methods for metabolic pathway prediction</article-title>
<source>BMC Bioinformatics</source>
<year>2010</year>
<volume>11</volume>
<fpage>15</fpage>
<pub-id pub-id-type="doi">10.1186/1471-2105-11-15</pub-id>
<pub-id pub-id-type="pmid">20064214</pub-id>
</mixed-citation>
</ref>
<ref id="B42">
<mixed-citation publication-type="journal">
<name>
<surname>Richter</surname>
<given-names>DC</given-names>
</name>
<name>
<surname>Ott</surname>
<given-names>F</given-names>
</name>
<name>
<surname>Auch</surname>
<given-names>AF</given-names>
</name>
<name>
<surname>Schmid</surname>
<given-names>R</given-names>
</name>
<name>
<surname>Huson</surname>
<given-names>DH</given-names>
</name>
<article-title>MetaSim—A Sequencing Simulator for Genomics and Metagenomics</article-title>
<source>PLoS One</source>
<year>2008</year>
<volume>3</volume>
<fpage>e3373</fpage>
<pub-id pub-id-type="doi">10.1371/journal.pone.0003373</pub-id>
<pub-id pub-id-type="pmid">18841204</pub-id>
</mixed-citation>
</ref>
<ref id="B43">
<mixed-citation publication-type="journal">
<name>
<surname>Barton</surname>
<given-names>AD</given-names>
</name>
<name>
<surname>Dutkiewicz</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Flierl</surname>
<given-names>G</given-names>
</name>
<name>
<surname>Bragg</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Follows</surname>
<given-names>MJ</given-names>
</name>
<article-title>Patterns of diversity in marine phytoplankton</article-title>
<source>Science</source>
<year>2010</year>
<volume>327</volume>
<fpage>1509</fpage>
<lpage>1511</lpage>
<pub-id pub-id-type="doi">10.1126/science.1184961</pub-id>
<pub-id pub-id-type="pmid">20185684</pub-id>
</mixed-citation>
</ref>
<ref id="B44">
<mixed-citation publication-type="journal">
<name>
<surname>Follows</surname>
<given-names>MJ</given-names>
</name>
<name>
<surname>Dutkiewicz</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Grant</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Chisholm</surname>
<given-names>SW</given-names>
</name>
<article-title>Emergent Biogeography of Microbial Communities in a Model Ocean</article-title>
<source>Science</source>
<year>2007</year>
<volume>315</volume>
<fpage>1843</fpage>
<lpage>1846</lpage>
<pub-id pub-id-type="doi">10.1126/science.1138544</pub-id>
<pub-id pub-id-type="pmid">17395828</pub-id>
</mixed-citation>
</ref>
<ref id="B45">
<mixed-citation publication-type="journal">
<name>
<surname>Larsen</surname>
<given-names>PE</given-names>
</name>
<name>
<surname>Field</surname>
<given-names>D</given-names>
</name>
<name>
<surname>Gilbert</surname>
<given-names>JA</given-names>
</name>
<article-title>Predicting bacterial community assemblages using an artificial neural network approach</article-title>
<source>Nat Methods</source>
<year>2012</year>
<volume>9</volume>
<fpage>621</fpage>
<lpage>625</lpage>
<pub-id pub-id-type="doi">10.1038/nmeth.1975</pub-id>
</mixed-citation>
</ref>
<ref id="B46">
<mixed-citation publication-type="journal">
<name>
<surname>Larsen</surname>
<given-names>PE</given-names>
</name>
<name>
<surname>Collart</surname>
<given-names>FR</given-names>
</name>
<name>
<surname>Field</surname>
<given-names>D</given-names>
</name>
<name>
<surname>Meyer</surname>
<given-names>F</given-names>
</name>
<name>
<surname>Keegan</surname>
<given-names>KP</given-names>
</name>
<name>
<surname>Henry</surname>
<given-names>CS</given-names>
</name>
<name>
<surname>McGrath</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Quinn</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Gilbert</surname>
<given-names>JA</given-names>
</name>
<article-title>Predicted Relative Metabolomic Turnover (PRMT): determining metabolic turnover from a coastal marine metagenomic dataset</article-title>
<source>Microbial Informatics and Experimentation</source>
<year>2011</year>
<volume>1</volume>
<fpage>4</fpage>
<pub-id pub-id-type="doi">10.1186/2042-5783-1-4</pub-id>
<pub-id pub-id-type="pmid">22587810</pub-id>
</mixed-citation>
</ref>
<ref id="B47">
<mixed-citation publication-type="journal">
<name>
<surname>Abubucker</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Segata</surname>
<given-names>N</given-names>
</name>
<name>
<surname>Goll</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Schubert</surname>
<given-names>AM</given-names>
</name>
<name>
<surname>Izard</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Cantarel</surname>
<given-names>BL</given-names>
</name>
<name>
<surname>Rodriguez-Mueller</surname>
<given-names>B</given-names>
</name>
<name>
<surname>Zucker</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Thiagarajan</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Henrissat</surname>
<given-names>B</given-names>
</name>
<etal></etal>
<article-title>Metabolic Reconstruction for Metagenomic Data and Its Application to the Human Microbiome</article-title>
<source>PLoS Comput Biol</source>
<year>2012</year>
<volume>8</volume>
<fpage>e1002358</fpage>
<pub-id pub-id-type="doi">10.1371/journal.pcbi.1002358</pub-id>
<pub-id pub-id-type="pmid">22719234</pub-id>
</mixed-citation>
</ref>
<ref id="B48">
<mixed-citation publication-type="journal">
<name>
<surname>Ye</surname>
<given-names>Y</given-names>
</name>
<name>
<surname>Doak</surname>
<given-names>TG</given-names>
</name>
<article-title>A parsimony approach to biological pathway reconstruction/inference for genomes and metagenomes</article-title>
<source>PLoS Comput Biol</source>
<year>2009</year>
<volume>5</volume>
<fpage>e1000465</fpage>
<pub-id pub-id-type="doi">10.1371/journal.pcbi.1000465</pub-id>
<pub-id pub-id-type="pmid">19680427</pub-id>
</mixed-citation>
</ref>
<ref id="B49">
<mixed-citation publication-type="journal">
<name>
<surname>Goll</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Thiagarajan</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Abubucker</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Huttenhower</surname>
<given-names>C</given-names>
</name>
<name>
<surname>Yooseph</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Methé</surname>
<given-names>BA</given-names>
</name>
<article-title>A case study for large-scale human microbiome analysis using JCVI's metagenomics reports (METAREP)</article-title>
<source>PLoS One</source>
<year>2012</year>
<volume>7</volume>
<fpage>e29044</fpage>
<pub-id pub-id-type="doi">10.1371/journal.pone.0029044</pub-id>
<pub-id pub-id-type="pmid">22719821</pub-id>
</mixed-citation>
</ref>
<ref id="B50">
<mixed-citation publication-type="journal">
<name>
<surname>Henry</surname>
<given-names>CS</given-names>
</name>
<name>
<surname>DeJongh</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Best</surname>
<given-names>AA</given-names>
</name>
<name>
<surname>Frybarger</surname>
<given-names>PM</given-names>
</name>
<name>
<surname>Linsay</surname>
<given-names>B</given-names>
</name>
<name>
<surname>Stevens</surname>
<given-names>RL</given-names>
</name>
<article-title>High-throughput generation, optimization and analysis of genome-scale metabolic models</article-title>
<source>Nat Biotechnol</source>
<year>2010</year>
<volume>28</volume>
<fpage>977</fpage>
<lpage>982</lpage>
<pub-id pub-id-type="doi">10.1038/nbt.1672</pub-id>
<pub-id pub-id-type="pmid">20802497</pub-id>
</mixed-citation>
</ref>
<ref id="B51">
<mixed-citation publication-type="journal">
<name>
<surname>Henry</surname>
<given-names>CS</given-names>
</name>
<name>
<surname>Overbeek</surname>
<given-names>R</given-names>
</name>
<name>
<surname>Xia</surname>
<given-names>F</given-names>
</name>
<name>
<surname>Best</surname>
<given-names>AA</given-names>
</name>
<name>
<surname>Glass</surname>
<given-names>E</given-names>
</name>
<name>
<surname>Gilbert</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Larsen</surname>
<given-names>P</given-names>
</name>
<name>
<surname>Edwards</surname>
<given-names>R</given-names>
</name>
<name>
<surname>Disz</surname>
<given-names>T</given-names>
</name>
<name>
<surname>Meyer</surname>
<given-names>F</given-names>
</name>
<etal></etal>
<article-title>Connecting genotype to phenotype in the era of high-throughput sequencing</article-title>
<source>Biochim Biophys Acta</source>
<year>2011</year>
<volume>1810</volume>
<fpage>967</fpage>
<lpage>977</lpage>
</mixed-citation>
</ref>
<ref id="B52">
<mixed-citation publication-type="journal">
<name>
<surname>Kalyanaraman</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Aluru</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Kothari</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Brendel</surname>
<given-names>V</given-names>
</name>
<article-title>Efficient clustering of large EST data sets on parallel computers</article-title>
<source>Nucleic Acids Res</source>
<year>2003</year>
<volume>31</volume>
<fpage>2963</fpage>
<lpage>2974</lpage>
<pub-id pub-id-type="doi">10.1093/nar/gkg379</pub-id>
<pub-id pub-id-type="pmid">12771222</pub-id>
</mixed-citation>
</ref>
<ref id="B53">
<mixed-citation publication-type="journal">
<name>
<surname>Yooseph</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Sutton</surname>
<given-names>G</given-names>
</name>
<name>
<surname>Rusch</surname>
<given-names>DB</given-names>
</name>
<name>
<surname>Halpern</surname>
<given-names>AL</given-names>
</name>
<name>
<surname>Williamson</surname>
<given-names>SJ</given-names>
</name>
<name>
<surname>Remington</surname>
<given-names>K</given-names>
</name>
<name>
<surname>Eisen</surname>
<given-names>JA</given-names>
</name>
<name>
<surname>Heidelberg</surname>
<given-names>KB</given-names>
</name>
<name>
<surname>Manning</surname>
<given-names>G</given-names>
</name>
<name>
<surname>Li</surname>
<given-names>W</given-names>
</name>
<article-title>The Sorcerer II Global Ocean Sampling expedition: expanding the universe of protein families</article-title>
<source>PLoS Biol</source>
<year>2007</year>
<volume>5</volume>
<fpage>e16</fpage>
<pub-id pub-id-type="doi">10.1371/journal.pbio.0050016</pub-id>
<pub-id pub-id-type="pmid">17355171</pub-id>
</mixed-citation>
</ref>
<ref id="B54">
<mixed-citation publication-type="journal">
<name>
<surname>Kalyanaraman</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Cannon</surname>
<given-names>WR</given-names>
</name>
<name>
<surname>Latt</surname>
<given-names>B</given-names>
</name>
<name>
<surname>Baxter</surname>
<given-names>DJ</given-names>
</name>
<article-title>MapReduce implementation of a hybrid spectral library-database search method for large-scale peptide identification</article-title>
<source>Bioinformatics</source>
<year>2011</year>
<volume>27</volume>
<fpage>3072</fpage>
<lpage>3073</lpage>
<pub-id pub-id-type="doi">10.1093/bioinformatics/btr523</pub-id>
<pub-id pub-id-type="pmid">21926122</pub-id>
</mixed-citation>
</ref>
</ref-list>
</back>
</pmc>
</record>

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Ticri/CIDE/explor/CyberinfraV1/Data/Pmc/Corpus
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000274 | SxmlIndent | more

Or

HfdSelect -h $EXPLOR_AREA/Data/Pmc/Corpus/biblio.hfd -nk 000274 | SxmlIndent | more

To link to this page within the Wicri network

{{Explor lien
   |wiki=    Ticri/CIDE
   |area=    CyberinfraV1
   |flux=    Pmc
   |étape=   Corpus
   |type=    RBID
   |clé=     PMC:3695837
   |texte=   MetaPathways: a modular pipeline for constructing pathway/genome databases from environmental sequence information
}}

To generate wiki pages

HfdIndexSelect -h $EXPLOR_AREA/Data/Pmc/Corpus/RBID.i   -Sk "pubmed:23800136" \
       | HfdSelect -Kh $EXPLOR_AREA/Data/Pmc/Corpus/biblio.hfd   \
       | NlmPubMed2Wicri -a CyberinfraV1 

This area was generated with Dilib version V0.6.25.
Data generation: Thu Oct 27 09:30:58 2016. Site generation: Sun Mar 10 23:08:40 2024