NBC: the Naïve Bayes Classification tool webserver for taxonomic classification of metagenomic reads

Internal identifier: 000501 (Pmc/Corpus); previous: 000500; next: 000502

Authors: Gail L. Rosen; Erin R. Reichenberger; Aaron M. Rosenfeld

Source:

Bioinformatics [1367-4803]; 2010.

RBID : PMC:3008645

Abstract

Motivation: Datasets from high-throughput sequencing technologies have yielded a vast amount of data about organisms in environmental samples. Yet, it is still a challenge to assess the exact organism content in these samples because the task of taxonomic classification is too computationally complex to annotate all reads in a dataset. An easy-to-use webserver is needed to process these reads. While many methods exist, only a few are publicly available on webservers, and out of those, most do not annotate all reads.

Results: We introduce a webserver that implements the naïve Bayes classifier (NBC) to classify all metagenomic reads to their best taxonomic match. Results indicate that NBC can assign next-generation sequencing reads to their taxonomic classification and can find significant populations of genera that other classifiers may miss.

Availability: Publicly available at: http://nbc.ece.drexel.edu.

Contact: gailr@ece.drexel.edu
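
The article body describes NBC as assigning each read a log-likelihood score over N-mer features and reporting the best-scoring genome. The following is a minimal sketch of that idea, not the webserver's actual code: the function names, the toy genomes, the uniform prior and the additive smoothing constant are all assumptions made for illustration.

```python
import math
from collections import Counter


def nmers(seq, n):
    """Return all overlapping N-mers of a sequence, uppercased."""
    seq = seq.upper()
    return [seq[i:i + n] for i in range(len(seq) - n + 1)]


def train(genomes, n):
    """Count N-mers per genome; `genomes` maps a name to a sequence string."""
    models = {}
    for name, seq in genomes.items():
        counts = Counter(nmers(seq, n))
        models[name] = (counts, sum(counts.values()))
    return models


def log_likelihood(read, counts, total, n, pseudo=1.0):
    """Sum of log P(N-mer | genome); additive smoothing is an assumption."""
    vocab = 4 ** n  # number of possible N-mers over the alphabet A, C, G, T
    return sum(
        math.log((counts.get(kmer, 0) + pseudo) / (total + pseudo * vocab))
        for kmer in nmers(read, n)
    )


def classify(read, models, n):
    """Score a read against every genome and return (best name, all scores)."""
    scores = {
        name: log_likelihood(read, counts, total, n)
        for name, (counts, total) in models.items()
    }
    best = max(scores, key=scores.get)
    return best, scores
```

Because the best score is taken over every genome in the model, each read always receives an assignment, which matches the abstract's claim that all reads are classified. Calling `classify(read, train(genomes, n), n)` with a dictionary of genome sequences exercises the whole sketch; the article recommends N = 15 on the real server, whereas a toy run would use a much smaller N.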

Url:

http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3008645

DOI: 10.1093/bioinformatics/btq619
PubMed: 21062764
PubMed Central: 3008645

Links to Exploration step

PMC:3008645

The document in XML format

<record><TEI><teiHeader><fileDesc><titleStmt><title xml:lang="en">NBC: the Naïve Bayes Classification tool webserver for taxonomic classification of metagenomic reads</title>
<author><name sortKey="Rosen, Gail L" sort="Rosen, Gail L" uniqKey="Rosen G" first="Gail L." last="Rosen">Gail L. Rosen</name>
<affiliation><nlm:aff id="AFF1">Department of Electrical and Computer Engineering,</nlm:aff>
</affiliation>
</author>
<author><name sortKey="Reichenberger, Erin R" sort="Reichenberger, Erin R" uniqKey="Reichenberger E" first="Erin R." last="Reichenberger">Erin R. Reichenberger</name>
<affiliation><nlm:aff wicri:cut=" and" id="AFF1">School of Biomedical Engineering, Science, and Health Systems</nlm:aff>
</affiliation>
</author>
<author><name sortKey="Rosenfeld, Aaron M" sort="Rosenfeld, Aaron M" uniqKey="Rosenfeld A" first="Aaron M." last="Rosenfeld">Aaron M. Rosenfeld</name>
<affiliation><nlm:aff id="AFF1">Department of Computer Science, Drexel University, Philadelphia, PA, USA</nlm:aff>
</affiliation>
</author>
</titleStmt>
<publicationStmt><idno type="wicri:source">PMC</idno>
<idno type="pmid">21062764</idno>
<idno type="pmc">3008645</idno>
<idno type="url">http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3008645</idno>
<idno type="RBID">PMC:3008645</idno>
<idno type="doi">10.1093/bioinformatics/btq619</idno>
<date when="2010">2010</date>
<idno type="wicri:Area/Pmc/Corpus">000501</idno>
</publicationStmt>
<sourceDesc><biblStruct><analytic><title xml:lang="en" level="a" type="main">NBC: the Naïve Bayes Classification tool webserver for taxonomic classification of metagenomic reads</title>
<author><name sortKey="Rosen, Gail L" sort="Rosen, Gail L" uniqKey="Rosen G" first="Gail L." last="Rosen">Gail L. Rosen</name>
<affiliation><nlm:aff id="AFF1">Department of Electrical and Computer Engineering,</nlm:aff>
</affiliation>
</author>
<author><name sortKey="Reichenberger, Erin R" sort="Reichenberger, Erin R" uniqKey="Reichenberger E" first="Erin R." last="Reichenberger">Erin R. Reichenberger</name>
<affiliation><nlm:aff wicri:cut=" and" id="AFF1">School of Biomedical Engineering, Science, and Health Systems</nlm:aff>
</affiliation>
</author>
<author><name sortKey="Rosenfeld, Aaron M" sort="Rosenfeld, Aaron M" uniqKey="Rosenfeld A" first="Aaron M." last="Rosenfeld">Aaron M. Rosenfeld</name>
<affiliation><nlm:aff id="AFF1">Department of Computer Science, Drexel University, Philadelphia, PA, USA</nlm:aff>
</affiliation>
</author>
</analytic>
<series><title level="j">Bioinformatics</title>
<idno type="ISSN">1367-4803</idno>
<idno type="eISSN">1367-4811</idno>
<imprint><date when="2010">2010</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
</fileDesc>
<profileDesc><textClass></textClass>
</profileDesc>
</teiHeader>
<front><div type="abstract" xml:lang="en"><p><bold>Motivation:</bold>
 Datasets from high-throughput sequencing technologies have yielded a vast amount of data about organisms in environmental samples. Yet, it is still a challenge to assess the exact organism content in these samples because the task of taxonomic classification is too computationally complex to annotate all reads in a dataset. An easy-to-use webserver is needed to process these reads. While many methods exist, only a few are publicly available on webservers, and out of those, most do not annotate all reads.</p>
<p><bold>Results:</bold>
 We introduce a webserver that implements the naïve Bayes classifier (NBC) to classify all metagenomic reads to their best taxonomic match. Results indicate that NBC can assign next-generation sequencing reads to their taxonomic classification and can find significant populations of genera that other classifiers may miss.</p>
<p><bold>Availability:</bold>
 Publicly available at: <ext-link ext-link-type="uri" xlink:href="http://nbc.ece.drexel.edu">http://nbc.ece.drexel.edu</ext-link>
.</p>
<p><bold>Contact:</bold>
 <email>gailr@ece.drexel.edu</email>
</p>
</div>
</front>
<back><div1 type="bibliography"><listBibl><biblStruct><analytic><author><name sortKey="Altschul, Sf" uniqKey="Altschul S">SF Altschul</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Gerlach, W" uniqKey="Gerlach W">W Gerlach</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Hery, M" uniqKey="Hery M">M Hery</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Huson, De" uniqKey="Huson D">DE Huson</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Mchardy, Ac" uniqKey="Mchardy A">AC McHardy</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Meyer, F" uniqKey="Meyer F">F Meyer</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Nolan, M" uniqKey="Nolan M">M Nolan</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Overbeek, R" uniqKey="Overbeek R">R Overbeek</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Pond, Sk" uniqKey="Pond S">SK Pond</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Rosen, Gl" uniqKey="Rosen G">GL Rosen</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Rosen, Gl" uniqKey="Rosen G">GL Rosen</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Schluter, A" uniqKey="Schluter A">A Schlüter</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Seshadri, R" uniqKey="Seshadri R">R Seshadri</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Vinneras, B" uniqKey="Vinneras B">B Vinneras</name>
</author>
</analytic>
</biblStruct>
</listBibl>
</div1>
</back>
</TEI>
<pmc article-type="research-article"><pmc-dir>properties open_access</pmc-dir>
  <front><journal-meta><journal-id journal-id-type="nlm-ta">Bioinformatics</journal-id>
<journal-id journal-id-type="publisher-id">bioinformatics</journal-id>
<journal-id journal-id-type="hwp">bioinfo</journal-id>
<journal-title-group><journal-title>Bioinformatics</journal-title>
</journal-title-group>
<issn pub-type="ppub">1367-4803</issn>
<issn pub-type="epub">1367-4811</issn>
<publisher><publisher-name>Oxford University Press</publisher-name>
</publisher>
</journal-meta>
<article-meta><article-id pub-id-type="pmid">21062764</article-id>
<article-id pub-id-type="pmc">3008645</article-id>
<article-id pub-id-type="doi">10.1093/bioinformatics/btq619</article-id>
<article-id pub-id-type="publisher-id">btq619</article-id>
<article-categories><subj-group subj-group-type="heading"><subject>Applications Note</subject>
<subj-group><subject>Genome Analysis</subject>
</subj-group>
</subj-group>
</article-categories>
<title-group><article-title>NBC: the Naïve Bayes Classification tool webserver for taxonomic classification of metagenomic reads</article-title>
</title-group>
<contrib-group><contrib contrib-type="author"><name><surname>Rosen</surname>
<given-names>Gail L.</given-names>
</name>
<xref ref-type="aff" rid="AFF1"><sup>1</sup>
</xref>
<xref ref-type="corresp" rid="COR1">*</xref>
</contrib>
<contrib contrib-type="author"><name><surname>Reichenberger</surname>
<given-names>Erin R.</given-names>
</name>
<xref ref-type="aff" rid="AFF1"><sup>2</sup>
</xref>
</contrib>
<contrib contrib-type="author"><name><surname>Rosenfeld</surname>
<given-names>Aaron M.</given-names>
</name>
<xref ref-type="aff" rid="AFF1"><sup>3</sup>
</xref>
</contrib>
</contrib-group>
<aff id="AFF1"><sup>1</sup>
Department of Electrical and Computer Engineering,<sup>2</sup>
School of Biomedical Engineering, Science, and Health Systems and<sup>3</sup>
Department of Computer Science, Drexel University, Philadelphia, PA, USA</aff>
<author-notes><corresp id="COR1">* To whom correspondence should be addressed.</corresp>
<fn><p>Associate Editor: John Quackenbush</p>
</fn>
</author-notes>
<pub-date pub-type="ppub"><day>1</day>
<month>1</month>
<year>2011</year>
</pub-date>
<pub-date pub-type="epub"><day>8</day>
<month>11</month>
<year>2010</year>
</pub-date>
<pub-date pub-type="pmc-release"><day>8</day>
<month>11</month>
<year>2010</year>
</pub-date>
<pmc-comment> PMC Release delay is 0 months and 0 days and was based on the 
 							. </pmc-comment>
      <volume>27</volume>
<issue>1</issue>
<fpage>127</fpage>
<lpage>129</lpage>
<history><date date-type="received"><day>2</day>
<month>8</month>
<year>2010</year>
</date>
<date date-type="rev-recd"><day>12</day>
<month>10</month>
<year>2010</year>
</date>
<date date-type="accepted"><day>29</day>
<month>10</month>
<year>2010</year>
</date>
</history>
<permissions><copyright-statement>© The Author(s) 2010. Published by Oxford University Press.</copyright-statement>
<copyright-year>2010</copyright-year>
<license license-type="creative-commons" xlink:href="http://creativecommons.org/licenses/by-nc/2.0/uk/"><license-p><pmc-comment>CREATIVE COMMONS</pmc-comment>
This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (<ext-link ext-link-type="uri" xlink:href="http://creativecommons.org/licenses/by-nc/2.5">http://creativecommons.org/licenses/by-nc/2.5</ext-link>
), which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.</license-p>
</license>
</permissions>
<abstract><p><bold>Motivation:</bold>
 Datasets from high-throughput sequencing technologies have yielded a vast amount of data about organisms in environmental samples. Yet, it is still a challenge to assess the exact organism content in these samples because the task of taxonomic classification is too computationally complex to annotate all reads in a dataset. An easy-to-use webserver is needed to process these reads. While many methods exist, only a few are publicly available on webservers, and out of those, most do not annotate all reads.</p>
<p><bold>Results:</bold>
 We introduce a webserver that implements the naïve Bayes classifier (NBC) to classify all metagenomic reads to their best taxonomic match. Results indicate that NBC can assign next-generation sequencing reads to their taxonomic classification and can find significant populations of genera that other classifiers may miss.</p>
<p><bold>Availability:</bold>
 Publicly available at: <ext-link ext-link-type="uri" xlink:href="http://nbc.ece.drexel.edu">http://nbc.ece.drexel.edu</ext-link>
.</p>
<p><bold>Contact:</bold>
 <email>gailr@ece.drexel.edu</email>
</p>
</abstract>
</article-meta>
</front>
<body><sec sec-type="intro" id="SEC1"><title>1 INTRODUCTION</title>
<p>After acquiring a sample and using next-generation technology to perform shotgun sequencing, the next step in metagenomic analysis is to assess the taxonomic content of the sample. This methodology, also known as phylogenetic analysis, gives a simple look at ‘Who is in this sample?’ The first tool used for taxonomic assessment, which is still widely used, is the Basic Local Alignment Search Tool (BLAST; <xref ref-type="bibr" rid="B1">Altschul <italic>et al.</italic>
, 1990</xref>
). In recent years, several specialized webservers have been made available to the public to ease the process of taxonomically classifying reads, namely Phylopythia (<xref ref-type="bibr" rid="B5">McHardy <italic>et al.</italic>
, 2007</xref>
), CAMERA (<xref ref-type="bibr" rid="B13">Seshadri <italic>et al.</italic>
, 2007</xref>
), WebCARMA (<xref ref-type="bibr" rid="B2">Gerlach <italic>et al.</italic>
, 2009</xref>
), MG-RAST (<xref ref-type="bibr" rid="B6">Meyer <italic>et al.</italic>
, 2008</xref>
) and Galaxy (<xref ref-type="bibr" rid="B9">Pond <italic>et al.</italic>
, 2009</xref>
). Unlike BLAST, Phylopythia and WebCARMA return more specific taxonomic information and assign reads to higher taxonomic levels using a consensus of BLAST top-hit taxonomies [also known as ‘last common ancestor’ algorithms (<xref ref-type="bibr" rid="B4">Huson <italic>et al.</italic>
, 2007</xref>
)]. In this article, we restrict our comparison to remote stand-alone webservers and exclude methods that are available only as locally installable software. Ultimately, all the metagenomic analysis webservers aim to ease analysis of complex environmental samples for users who do not have the resources to maintain their own databases and systems.</p>
<p>Phylopythia was the first taxonomic classification webserver to be implemented. Phylopythia is based on a support vector machine (SVM) classification method and produces very good accuracy for long (≥ 1 Kbp) reads (<xref ref-type="bibr" rid="B5">McHardy <italic>et al.</italic>
, 2007</xref>
). WebCARMA is a homology-based approach that matches environmental gene tags (EGTs) to protein families and reports good results for long and ultrashort 35-bp reads. It uses (i) BLASTX to find candidate EGTs and (ii) Pfam (protein family) hidden Markov models (HMMs) to match the EGTs against protein families during an EGT candidate selection process. MG-RAST (Metagenome Rapid Annotation using Subsystem Technology) (<xref ref-type="bibr" rid="B6">Meyer <italic>et al.</italic>
, 2008</xref>
), CAMERA (Community Cyberinfrastructure for Advanced Microbial Ecology Research and Analysis) (<xref ref-type="bibr" rid="B13">Seshadri <italic>et al.</italic>
, 2007</xref>
) and the Galaxy Project (<xref ref-type="bibr" rid="B9">Pond <italic>et al.</italic>
, 2009</xref>
) are high-throughput metagenomic pipelines that aim to provide one-stop analysis of metagenomic samples. For taxonomic classification of shotgun sequencing reads, MG-RAST offers a homology-based approach, SEED (<xref ref-type="bibr" rid="B8">Overbeek <italic>et al.</italic>
, 2005</xref>
). CAMERA and Galaxy provide high-throughput implementations and custom databases for BLASTN. BLASTN yields best hit sequence matches and is known to have reasonable accuracy (<xref ref-type="bibr" rid="B10">Rosen <italic>et al.</italic>
, 2009</xref>
).</p>
<p>Previously, Rosen <italic>et al.</italic>
 explored a machine learning method, the naïve Bayes classifier (NBC), as a way to classify fragments that can annotate more sequences than BLAST (<xref ref-type="bibr" rid="B11">Rosen <italic>et al.</italic>
, 2008</xref>
). We now implement the algorithm on a webserver for public use and benchmark it against other web sites.</p>
</sec>
<sec id="SEC2"><title>2 METHODS AND MATERIALS</title>
<p>We selected a previously benchmarked dataset (<xref ref-type="bibr" rid="B2">Gerlach <italic>et al.</italic>
, 2009</xref>
): the Biogas reactor dataset (<xref ref-type="bibr" rid="B12">Schlüter <italic>et al.</italic>
, 2008</xref>
), composed of 353 213 reads with an average length of 230 bp. We selected a real dataset as opposed to a synthetic one because we did not want to tailor the dataset to any specific database, since the database varies between web sites. This comparison fairly assesses each webserver's performance on a ‘real’ dataset containing known and novel organisms.</p>
<p>We conducted our tests against NBC and five other webservers in July and August of 2010. WebCARMA and MG-RAST require no parameters. Phylopythia requires the type of model to match against. MG-RAST requires an <italic>E</italic>
-value cutoff under the SEED viewer (for which we selected the highest value). We selected default BLAST parameters for the NT database for Galaxy. For NBC, we used an <italic>N</italic>
mer size of 15 and the default 1032-organism genome list. For CAMERA, we retained only the top-hit organism for each read and used the ‘All Prokaryotes’ BLASTN database (with default values for the remaining parameters).</p>
<p>We implement the NBC approach in <xref ref-type="bibr" rid="B11">Rosen <italic>et al.</italic>
 (2008</xref>
) that assigns each read a log-likelihood score. We introduce two functions of NBC: (i) the novice functionality and (ii) the expert functionality. We expect that most users will fit into the ‘novice’ category, which will enable them to upload their FASTA file of reads and obtain a file of summarized results matching each read to its most likely organism, given the training database. The parameters that (expert and novice) users can choose from are as follows:</p>
<p><italic>Upload File</italic>
: the FASTA formatted file of metagenomic reads. The webserver also accepts .zip, .gz and .tgz of several FASTA files.</p>
<p><italic>Genome list</italic>
: the algorithm's running time grows linearly with the number of genomes scored against. So, if an expert user has prior knowledge about the expected microbes in the environment, he/she can select only those microbes to score against. This both shortens the computation time and reduces false positives.</p>
<p><italic>Nmer length</italic>
: the user can select different <italic>N</italic>
mer feature sizes, but it is recommended that the novice user use <italic>N</italic>
 = 15 since it works well for both long and short reads (<xref ref-type="bibr" rid="B11">Rosen <italic>et al.</italic>
, 2008</xref>
).</p>
<p><italic>Email</italic>
: The user's email address is required so that they can be notified as to where to retrieve the results when the job is completed.</p>
<p><italic>Output</italic>
: For a beginner, we suggest (i) uploading a FASTA file with the metagenomic reads and (ii) entering an email address. The output is a link to a directory that contains the original upload file (renamed userAnalysisFile.txt), the list of genomes that were scored against (masterGenomeList.txt) and a summary of the matches for each read (summarized_results.txt). The expert user may be particularly interested in the *.csv.gz files, in which the ‘score distribution’ of each read can be analyzed in more depth.</p>
</sec>
<sec sec-type="discussion" id="SEC3"><title>3 DISCUSSION</title>
<p>In <xref ref-type="fig" rid="F1">Figure 1</xref>
, we show the percentage of reads (out of the whole dataset) assigned to the top eight genera for each algorithm. All methods agree on Clostridium and Bacillus, while most methods (except Galaxy) agree on the prominence of Methanoculleus. CAMERA supports NBC's findings of Pseudomonas and Burkholderia, which are known to be found in sewage treatment plants (<xref ref-type="bibr" rid="B14">Vinneras <italic>et al.</italic>
, 2006</xref>
). [The biogas reactor contained ∼2% chicken manure so it can have the traits of sludge waste (<xref ref-type="bibr" rid="B12">Schlüter <italic>et al.</italic>
, 2008</xref>
)]. In <xref ref-type="bibr" rid="B3">Hery <italic>et al.</italic>
 (2010</xref>
), Pseudomonas and Sorangium have been found in sludge wastes. Streptosporangium and Streptomyces are commonly found in vegetable gardens (<xref ref-type="bibr" rid="B7">Nolan <italic>et al.</italic>
, 2010</xref>), which is quite reasonable since this is an agricultural bioreactor. Therefore, NBC may have found significant populations of genera that other classifiers have missed. Thermosinus is not in NBC's completed microbial training database; therefore, NBC found no matches for it.
<fig id="F1" position="float"><label>Fig. 1.</label>
<caption><p>Percentage of reads that are assigned to a particular genus out of all 454-pyrosequencing reads from the Biogas reactor community. CAMERA and NBC tend to agree for over 70% of the genera shown, while MG-RAST agrees with CAMERA and NBC for roughly 50%. WebCARMA bins fewer reads, and Galaxy has high variability. For the first 5602 reads (1.5 Mb web site limit), Phylopythia only classifies eight reads to the phylum level and is not included in the graph due to its inability to make assignments at the genus level.</p>
</caption>
<graphic xlink:href="btq619f1"></graphic>
</fig>
</p>
<p>NBC took 21 h to run and classified 100% of the reads, compared with 12 h/23% for WebCARMA, 5 h/99% for CAMERA, 2–3 h/140% for Galaxy <xref ref-type="fn" rid="FN1"><sup>1</sup>
</xref>
, and a few weeks <xref ref-type="fn" rid="FN2"><sup>2</sup>
</xref>
/56.2% for MG-RAST. NBC runs on a 4-core Intel machine, and its speed would increase linearly with distributed computing in the future.</p>
</sec>
<sec sec-type="conclusions" id="SEC4"><title>4 CONCLUSION</title>
<p>The naïve Bayes classification tool is implemented on a web site for public use. We demonstrate that the tool can handle a complete pyrosequencing dataset and that it gives the full taxonomy for each read, so that users can easily analyze the taxonomic composition of their datasets. Unlike other tools, NBC classifies every read. It is also easy to use, runs an entire dataset in a reasonable amount of time and yields competitive results.</p>
</sec>
</body>
<back><fn-group><fn id="FN1"><p><sup>1</sup>
In Galaxy, the number of BLAST hits is greater than the original number of reads.</p>
</fn>
<fn id="FN2"><p><sup>2</sup>
There was a lengthy wait queue for MG-RAST. It is difficult to assess true run times due to each site's different hardware and usage.</p>
</fn>
</fn-group>
<ack><title>ACKNOWLEDGEMENT</title>
<p>We thank Christopher Cramer for the scoring code and binary packages.</p>
<p><italic>Funding</italic>
: Supported in part by the National Science Foundation CAREER award #0845827 and Department of Energy award DE-SC0004335.</p>
<p><italic>Conflict of Interest</italic>
: none declared.</p>
</ack>
<ref-list><title>REFERENCES</title>
<ref id="B1"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Altschul</surname>
<given-names>SF</given-names>
</name>
<etal></etal>
</person-group>
<article-title>Basic local alignment search tool</article-title>
<source>J. Mol. Biol.</source>
<year>1990</year>
<volume>215</volume>
<fpage>403</fpage>
<lpage>410</lpage>
<pub-id pub-id-type="pmid">2231712</pub-id>
</element-citation>
</ref>
<ref id="B2"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Gerlach</surname>
<given-names>W</given-names>
</name>
<etal></etal>
</person-group>
<article-title>WebCARMA: a web application for the functional and taxonomic classification of unassembled metagenomic reads</article-title>
<source>BMC Bioinformatics</source>
<year>2009</year>
<volume>10</volume>
</element-citation>
</ref>
<ref id="B3"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Hery</surname>
<given-names>M</given-names>
</name>
<etal></etal>
</person-group>
<article-title>Monitoring of bacterial communities during low temperature thermal treatment of activated sludge combining DNA phylochip and respirometry techniques</article-title>
<source>Water Res.</source>
<year>2010</year>
<comment>[Epub ahead of print, doi: 10.1016/j.watres.2010.07.003.]</comment>
</element-citation>
</ref>
<ref id="B4"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Huson</surname>
<given-names>DE</given-names>
</name>
<etal></etal>
</person-group>
<article-title>MEGAN analysis of metagenomic data</article-title>
<source>Genome Res.</source>
<year>2007</year>
<volume>17</volume>
<fpage>377</fpage>
<lpage>386</lpage>
<pub-id pub-id-type="pmid">17255551</pub-id>
</element-citation>
</ref>
<ref id="B5"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>McHardy</surname>
<given-names>AC</given-names>
</name>
<etal></etal>
</person-group>
<article-title>Accurate phylogenetic classification of variable-length DNA fragments</article-title>
<source>Nat. Methods</source>
<year>2007</year>
<volume>4</volume>
<fpage>63</fpage>
<lpage>72</lpage>
<pub-id pub-id-type="pmid">17179938</pub-id>
</element-citation>
</ref>
<ref id="B6"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Meyer</surname>
<given-names>F</given-names>
</name>
<etal></etal>
</person-group>
<article-title>The metagenomics RAST server - a public resource for the automatic phylogenetic and functional analysis of metagenomes</article-title>
<source>BMC Bioinformatics</source>
<year>2008</year>
<volume>9</volume>
<fpage>386</fpage>
<pub-id pub-id-type="pmid">18803844</pub-id>
</element-citation>
</ref>
<ref id="B7"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Nolan</surname>
<given-names>M</given-names>
</name>
<etal></etal>
</person-group>
<article-title>Complete genome sequence of Streptosporangium roseum type strain (NI 9100T)</article-title>
<source>Stand. Genomic Sci.</source>
<year>2010</year>
<volume>2</volume>
<fpage>1</fpage>
</element-citation>
</ref>
<ref id="B8"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Overbeek</surname>
<given-names>R</given-names>
</name>
<etal></etal>
</person-group>
<article-title>The subsystems approach to genome annotation and its use in the project to annotate 1000 genomes</article-title>
<source>Nucleic Acids Res.</source>
<year>2005</year>
<volume>33</volume>
<fpage>5691</fpage>
<lpage>5702</lpage>
</element-citation>
</ref>
<ref id="B9"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Pond</surname>
<given-names>SK</given-names>
</name>
<etal></etal>
</person-group>
<article-title>Windshield splatter analysis with the Galaxy metagenomic pipeline</article-title>
<source>Genome Res.</source>
<year>2009</year>
<volume>19</volume>
<fpage>2144</fpage>
<lpage>2153</lpage>
<pub-id pub-id-type="pmid">19819906</pub-id>
</element-citation>
</ref>
<ref id="B10"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Rosen</surname>
<given-names>GL</given-names>
</name>
<etal></etal>
</person-group>
<article-title>Signal processing for metagenomics: extracting information from the soup</article-title>
<source>Curr. Genomics</source>
<year>2009</year>
<volume>10</volume>
<fpage>493</fpage>
<lpage>510</lpage>
<pub-id pub-id-type="pmid">20436876</pub-id>
</element-citation>
</ref>
<ref id="B11"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Rosen</surname>
<given-names>GL</given-names>
</name>
<etal></etal>
</person-group>
<article-title>Metagenome fragment classification using <italic>n</italic>
-mer frequency profiles</article-title>
<source>Adv. Bioinformatics</source>
<year>2008</year>
<volume>2008</volume>
<comment>Article ID 205969</comment>
</element-citation>
</ref>
<ref id="B12"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Schlüter</surname>
<given-names>A</given-names>
</name>
<etal></etal>
</person-group>
<article-title>The metagenome of a biogas-producing microbial community of a production-scale biogas plant fermenter analysed by the 454-pyrosequencing technology</article-title>
<source>J. Biotechnol.</source>
<year>2008</year>
<volume>136</volume>
<fpage>77</fpage>
<lpage>90</lpage>
<pub-id pub-id-type="pmid">18597880</pub-id>
</element-citation>
</ref>
<ref id="B13"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Seshadri</surname>
<given-names>R</given-names>
</name>
<etal></etal>
</person-group>
<article-title>CAMERA: a community resource for metagenomics</article-title>
<source>PLoS Biol.</source>
<year>2007</year>
<volume>5</volume>
</element-citation>
</ref>
<ref id="B14"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Vinneras</surname>
<given-names>B</given-names>
</name>
<etal></etal>
</person-group>
<article-title>Identification of the microbiological community in biogas systems and evaluation of microbial risks from gas usage</article-title>
<source>Sci. Total Environ.</source>
<year>2006</year>
<volume>367</volume>
<fpage>606</fpage>
<lpage>615</lpage>
<pub-id pub-id-type="pmid">16556456</pub-id>
</element-citation>
</ref>
</ref-list>
</back>
</pmc>
</record>

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Ticri/CIDE/explor/CyberinfraV1/Data/Pmc/Corpus

HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000501 | SxmlIndent | more

HfdSelect -h $EXPLOR_AREA/Data/Pmc/Corpus/biblio.hfd -nk 000501 | SxmlIndent | more
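
Where the Dilib tools are not installed, the bibliographic fields of a record like this one can also be read with a general-purpose XML parser. The sketch below uses Python's standard `xml.etree.ElementTree` on a trimmed, hypothetical sample modeled on the record's `titleStmt` and `publicationStmt`; the `summarize` helper and the sample string are illustrative assumptions. The full record as displayed uses the `nlm:`, `wicri:` and `xlink:` prefixes without visible namespace declarations, so it would probably need those declarations added before a namespace-aware parser accepts it.

```python
import xml.etree.ElementTree as ET

# Trimmed, hypothetical sample modeled on the record's TEI header.
SAMPLE = """<record><TEI><teiHeader><fileDesc>
<titleStmt><title xml:lang="en">NBC: the Naive Bayes Classification tool webserver</title>
<author><name first="Gail L." last="Rosen">Gail L. Rosen</name></author>
<author><name first="Erin R." last="Reichenberger">Erin R. Reichenberger</name></author>
</titleStmt>
<publicationStmt>
<idno type="pmid">21062764</idno>
<idno type="doi">10.1093/bioinformatics/btq619</idno>
</publicationStmt>
</fileDesc></teiHeader></TEI></record>"""


def summarize(xml_text):
    """Extract the title, author names and identifiers from a TEI-style record."""
    root = ET.fromstring(xml_text)
    stmt = root.find("./TEI/teiHeader/fileDesc/titleStmt")
    # Map each <idno type="..."> to its text content.
    ids = {idno.get("type"): idno.text for idno in root.iter("idno")}
    return {
        "title": stmt.findtext("title"),
        "authors": [name.text for name in stmt.iter("name")],
        "ids": ids,
    }
```

Calling `summarize(SAMPLE)` returns a dictionary with the keys `title`, `authors` and `ids`, which is usually enough to build a citation line or a lookup by DOI.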

To link to this page within the Wicri network

{{Explor lien
   |wiki=    Ticri/CIDE
   |area=    CyberinfraV1
   |flux=    Pmc
   |étape=   Corpus
   |type=    RBID
   |clé=     PMC:3008645
   |texte=   NBC: the Naïve Bayes Classification tool webserver for taxonomic classification of metagenomic reads
}}

To generate wiki pages

HfdIndexSelect -h $EXPLOR_AREA/Data/Pmc/Corpus/RBID.i   -Sk "pubmed:21062764" \
       | HfdSelect -Kh $EXPLOR_AREA/Data/Pmc/Corpus/biblio.hfd   \
       | NlmPubMed2Wicri -a CyberinfraV1

This area was generated with Dilib version V0.6.25.
Data generation: Thu Oct 27 09:30:58 2016. Site generation: Sun Mar 10 23:08:40 2024
