Strategic information system and agriculture (exploration server)

Internal identifier: 000162 (Pmc/Corpus); previous: 0001619; next: 0001630

The document in XML format

<record>
<TEI>
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en">MetaBioME: a database to explore commercially useful enzymes in metagenomic datasets</title>
<author>
<name sortKey="Sharma, Vineet K" sort="Sharma, Vineet K" uniqKey="Sharma V" first="Vineet K." last="Sharma">Vineet K. Sharma</name>
</author>
<author>
<name sortKey="Kumar, Naveen" sort="Kumar, Naveen" uniqKey="Kumar N" first="Naveen" last="Kumar">Naveen Kumar</name>
</author>
<author>
<name sortKey="Prakash, Tulika" sort="Prakash, Tulika" uniqKey="Prakash T" first="Tulika" last="Prakash">Tulika Prakash</name>
</author>
<author>
<name sortKey="Taylor, Todd D" sort="Taylor, Todd D" uniqKey="Taylor T" first="Todd D." last="Taylor">Todd D. Taylor</name>
</author>
</titleStmt>
<publicationStmt>
<idno type="wicri:source">PMC</idno>
<idno type="pmid">19906710</idno>
<idno type="pmc">2808964</idno>
<idno type="url">http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2808964</idno>
<idno type="RBID">PMC:2808964</idno>
<idno type="doi">10.1093/nar/gkp1001</idno>
<date when="2009">2009</date>
<idno type="wicri:Area/Pmc/Corpus">000162</idno>
<idno type="wicri:explorRef" wicri:stream="Pmc" wicri:step="Corpus" wicri:corpus="PMC">000162</idno>
</publicationStmt>
<sourceDesc>
<biblStruct>
<analytic>
<title xml:lang="en" level="a" type="main">MetaBioME: a database to explore commercially useful enzymes in metagenomic datasets</title>
<author>
<name sortKey="Sharma, Vineet K" sort="Sharma, Vineet K" uniqKey="Sharma V" first="Vineet K." last="Sharma">Vineet K. Sharma</name>
</author>
<author>
<name sortKey="Kumar, Naveen" sort="Kumar, Naveen" uniqKey="Kumar N" first="Naveen" last="Kumar">Naveen Kumar</name>
</author>
<author>
<name sortKey="Prakash, Tulika" sort="Prakash, Tulika" uniqKey="Prakash T" first="Tulika" last="Prakash">Tulika Prakash</name>
</author>
<author>
<name sortKey="Taylor, Todd D" sort="Taylor, Todd D" uniqKey="Taylor T" first="Todd D." last="Taylor">Todd D. Taylor</name>
</author>
</analytic>
<series>
<title level="j">Nucleic Acids Research</title>
<idno type="ISSN">0305-1048</idno>
<idno type="eISSN">1362-4962</idno>
<imprint>
<date when="2009">2009</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
</fileDesc>
<profileDesc>
<textClass></textClass>
</profileDesc>
</teiHeader>
<front>
<div type="abstract" xml:lang="en">
<p>Microbial enzymes have many known applications as biocatalysts in biotechnology, agriculture, medical and other industries. However, only a few enzymes are currently employed for such commercial applications. In this scenario, the current onslaught of metagenomic data provides a new unexplored treasure trove of genomic wealth that can not only enhance the enzyme repertoire by the discovery of novel commercially useful enzymes (CUEs) but can also reveal better functional variants for existing CUEs. We prepared a catalogue of CUEs using text mining of PubMed abstracts and other publicly available information, and manually curated the data to identify 510 CUEs. Further, in order to identify novel homologues of these CUEs, we identified potential ORFs in publicly available metagenomic datasets from 10 diverse sources. Using this strategy, we have developed a resource called MetaBioME (
<ext-link ext-link-type="uri" xlink:href="http://metasystems.riken.jp/metabiome/">http://metasystems.riken.jp/metabiome/</ext-link>
) that comprises (i) a database of CUEs and (ii) a comprehensive platform to facilitate homology-based computational identification of novel homologous CUEs from metagenomic and bacterial genomic datasets. Using MetaBioME, we have identified several novel homologues to known CUEs that can potentially serve as leads for further experimental verification.</p>
</div>
</front>
<back>
<div1 type="bibliography">
<listBibl>
<biblStruct>
<analytic>
<author>
<name sortKey="Arnold, Fh" uniqKey="Arnold F">FH Arnold</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Ferrer, M" uniqKey="Ferrer M">M Ferrer</name>
</author>
<author>
<name sortKey="Martinez Abarca, F" uniqKey="Martinez Abarca F">F Martinez-Abarca</name>
</author>
<author>
<name sortKey="Golyshin, Pn" uniqKey="Golyshin P">PN Golyshin</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Lorenz, P" uniqKey="Lorenz P">P Lorenz</name>
</author>
<author>
<name sortKey="Eck, J" uniqKey="Eck J">J Eck</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Tringe, Sg" uniqKey="Tringe S">SG Tringe</name>
</author>
<author>
<name sortKey="Rubin, Em" uniqKey="Rubin E">EM Rubin</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Bairoch, A" uniqKey="Bairoch A">A Bairoch</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Gasteiger, E" uniqKey="Gasteiger E">E Gasteiger</name>
</author>
<author>
<name sortKey="Gattiker, A" uniqKey="Gattiker A">A Gattiker</name>
</author>
<author>
<name sortKey="Hoogland, C" uniqKey="Hoogland C">C Hoogland</name>
</author>
<author>
<name sortKey="Ivanyi, I" uniqKey="Ivanyi I">I Ivanyi</name>
</author>
<author>
<name sortKey="Appel, Rd" uniqKey="Appel R">RD Appel</name>
</author>
<author>
<name sortKey="Bairoch, A" uniqKey="Bairoch A">A Bairoch</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Kiefer, F" uniqKey="Kiefer F">F Kiefer</name>
</author>
<author>
<name sortKey="Arnold, K" uniqKey="Arnold K">K Arnold</name>
</author>
<author>
<name sortKey="Kunzli, M" uniqKey="Kunzli M">M Kunzli</name>
</author>
<author>
<name sortKey="Bordoli, L" uniqKey="Bordoli L">L Bordoli</name>
</author>
<author>
<name sortKey="Schwede, T" uniqKey="Schwede T">T Schwede</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Sayers, Ew" uniqKey="Sayers E">EW Sayers</name>
</author>
<author>
<name sortKey="Barrett, T" uniqKey="Barrett T">T Barrett</name>
</author>
<author>
<name sortKey="Benson, Da" uniqKey="Benson D">DA Benson</name>
</author>
<author>
<name sortKey="Bryant, Sh" uniqKey="Bryant S">SH Bryant</name>
</author>
<author>
<name sortKey="Canese, K" uniqKey="Canese K">K Canese</name>
</author>
<author>
<name sortKey="Chetvernin, V" uniqKey="Chetvernin V">V Chetvernin</name>
</author>
<author>
<name sortKey="Church, Dm" uniqKey="Church D">DM Church</name>
</author>
<author>
<name sortKey="Dicuccio, M" uniqKey="Dicuccio M">M DiCuccio</name>
</author>
<author>
<name sortKey="Edgar, R" uniqKey="Edgar R">R Edgar</name>
</author>
<author>
<name sortKey="Federhen, S" uniqKey="Federhen S">S Federhen</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Chang, A" uniqKey="Chang A">A Chang</name>
</author>
<author>
<name sortKey="Scheer, M" uniqKey="Scheer M">M Scheer</name>
</author>
<author>
<name sortKey="Grote, A" uniqKey="Grote A">A Grote</name>
</author>
<author>
<name sortKey="Schomburg, I" uniqKey="Schomburg I">I Schomburg</name>
</author>
<author>
<name sortKey="Schomburg, D" uniqKey="Schomburg D">D Schomburg</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Marchler Bauer, A" uniqKey="Marchler Bauer A">A Marchler-Bauer</name>
</author>
<author>
<name sortKey="Anderson, Jb" uniqKey="Anderson J">JB Anderson</name>
</author>
<author>
<name sortKey="Chitsaz, F" uniqKey="Chitsaz F">F Chitsaz</name>
</author>
<author>
<name sortKey="Derbyshire, Mk" uniqKey="Derbyshire M">MK Derbyshire</name>
</author>
<author>
<name sortKey="Weese Scott, C" uniqKey="Weese Scott C">C Weese-Scott</name>
</author>
<author>
<name sortKey="Fong, Jh" uniqKey="Fong J">JH Fong</name>
</author>
<author>
<name sortKey="Geer, Ly" uniqKey="Geer L">LY Geer</name>
</author>
<author>
<name sortKey="Geer, Rc" uniqKey="Geer R">RC Geer</name>
</author>
<author>
<name sortKey="Gonzales, Nr" uniqKey="Gonzales N">NR Gonzales</name>
</author>
<author>
<name sortKey="Gwadz, M" uniqKey="Gwadz M">M Gwadz</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Dutta, S" uniqKey="Dutta S">S Dutta</name>
</author>
<author>
<name sortKey="Burkhardt, K" uniqKey="Burkhardt K">K Burkhardt</name>
</author>
<author>
<name sortKey="Young, J" uniqKey="Young J">J Young</name>
</author>
<author>
<name sortKey="Swaminathan, Gj" uniqKey="Swaminathan G">GJ Swaminathan</name>
</author>
<author>
<name sortKey="Matsuura, T" uniqKey="Matsuura T">T Matsuura</name>
</author>
<author>
<name sortKey="Henrick, K" uniqKey="Henrick K">K Henrick</name>
</author>
<author>
<name sortKey="Nakamura, H" uniqKey="Nakamura H">H Nakamura</name>
</author>
<author>
<name sortKey="Berman, Hm" uniqKey="Berman H">HM Berman</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Bernstein, Hj" uniqKey="Bernstein H">HJ Bernstein</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Hongoh, Y" uniqKey="Hongoh Y">Y Hongoh</name>
</author>
<author>
<name sortKey="Sharma, Vk" uniqKey="Sharma V">VK Sharma</name>
</author>
<author>
<name sortKey="Prakash, T" uniqKey="Prakash T">T Prakash</name>
</author>
<author>
<name sortKey="Noda, S" uniqKey="Noda S">S Noda</name>
</author>
<author>
<name sortKey="Taylor, Td" uniqKey="Taylor T">TD Taylor</name>
</author>
<author>
<name sortKey="Kudo, T" uniqKey="Kudo T">T Kudo</name>
</author>
<author>
<name sortKey="Sakaki, Y" uniqKey="Sakaki Y">Y Sakaki</name>
</author>
<author>
<name sortKey="Toyoda, A" uniqKey="Toyoda A">A Toyoda</name>
</author>
<author>
<name sortKey="Hattori, M" uniqKey="Hattori M">M Hattori</name>
</author>
<author>
<name sortKey="Ohkuma, M" uniqKey="Ohkuma M">M Ohkuma</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Hongoh, Y" uniqKey="Hongoh Y">Y Hongoh</name>
</author>
<author>
<name sortKey="Sharma, Vk" uniqKey="Sharma V">VK Sharma</name>
</author>
<author>
<name sortKey="Prakash, T" uniqKey="Prakash T">T Prakash</name>
</author>
<author>
<name sortKey="Noda, S" uniqKey="Noda S">S Noda</name>
</author>
<author>
<name sortKey="Toh, H" uniqKey="Toh H">H Toh</name>
</author>
<author>
<name sortKey="Taylor, Td" uniqKey="Taylor T">TD Taylor</name>
</author>
<author>
<name sortKey="Kudo, T" uniqKey="Kudo T">T Kudo</name>
</author>
<author>
<name sortKey="Sakaki, Y" uniqKey="Sakaki Y">Y Sakaki</name>
</author>
<author>
<name sortKey="Toyoda, A" uniqKey="Toyoda A">A Toyoda</name>
</author>
<author>
<name sortKey="Hattori, M" uniqKey="Hattori M">M Hattori</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Delcher, Al" uniqKey="Delcher A">AL Delcher</name>
</author>
<author>
<name sortKey="Bratke, Ka" uniqKey="Bratke K">KA Bratke</name>
</author>
<author>
<name sortKey="Powers, Ec" uniqKey="Powers E">EC Powers</name>
</author>
<author>
<name sortKey="Salzberg, Sl" uniqKey="Salzberg S">SL Salzberg</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Noguchi, H" uniqKey="Noguchi H">H Noguchi</name>
</author>
<author>
<name sortKey="Park, J" uniqKey="Park J">J Park</name>
</author>
<author>
<name sortKey="Takagi, T" uniqKey="Takagi T">T Takagi</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Li, W" uniqKey="Li W">W Li</name>
</author>
<author>
<name sortKey="Godzik, A" uniqKey="Godzik A">A Godzik</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Kent, Wj" uniqKey="Kent W">WJ Kent</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Katoh, K" uniqKey="Katoh K">K Katoh</name>
</author>
<author>
<name sortKey="Kuma, K" uniqKey="Kuma K">K Kuma</name>
</author>
<author>
<name sortKey="Miyata, T" uniqKey="Miyata T">T Miyata</name>
</author>
<author>
<name sortKey="Toh, H" uniqKey="Toh H">H Toh</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Nobeli, I" uniqKey="Nobeli I">I Nobeli</name>
</author>
<author>
<name sortKey="Favia, Ad" uniqKey="Favia A">AD Favia</name>
</author>
<author>
<name sortKey="Thornton, Jm" uniqKey="Thornton J">JM Thornton</name>
</author>
</analytic>
</biblStruct>
</listBibl>
</div1>
</back>
</TEI>
<pmc article-type="research-article">
<pmc-dir>properties open_access</pmc-dir>
<front>
<journal-meta>
<journal-id journal-id-type="nlm-ta">Nucleic Acids Res</journal-id>
<journal-id journal-id-type="iso-abbrev">Nucleic Acids Res</journal-id>
<journal-id journal-id-type="publisher-id">nar</journal-id>
<journal-id journal-id-type="hwp">nar</journal-id>
<journal-title-group>
<journal-title>Nucleic Acids Research</journal-title>
</journal-title-group>
<issn pub-type="ppub">0305-1048</issn>
<issn pub-type="epub">1362-4962</issn>
<publisher>
<publisher-name>Oxford University Press</publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="pmid">19906710</article-id>
<article-id pub-id-type="pmc">2808964</article-id>
<article-id pub-id-type="doi">10.1093/nar/gkp1001</article-id>
<article-id pub-id-type="publisher-id">gkp1001</article-id>
<article-categories>
<subj-group subj-group-type="heading">
<subject>Articles</subject>
</subj-group>
</article-categories>
<title-group>
<article-title>MetaBioME: a database to explore commercially useful enzymes in metagenomic datasets</article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author">
<name>
<surname>Sharma</surname>
<given-names>Vineet K.</given-names>
</name>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Kumar</surname>
<given-names>Naveen</given-names>
</name>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Prakash</surname>
<given-names>Tulika</given-names>
</name>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Taylor</surname>
<given-names>Todd D.</given-names>
</name>
<xref ref-type="corresp" rid="COR1">*</xref>
</contrib>
</contrib-group>
<aff>MetaSystems Research Team, Computational Systems Biology Research Group, Advanced Computational Sciences Department, Advanced Science Institute, RIKEN, Yokohama, Kanagawa 230-0045, Japan</aff>
<author-notes>
<corresp id="COR1">*To whom correspondence should be addressed. Tel: +81-45-503-9285; Fax: +81-45-503-9176; Email:
<email>taylor@riken.jp</email>
</corresp>
</author-notes>
<pub-date pub-type="ppub">
<month>1</month>
<year>2010</year>
</pub-date>
<pub-date pub-type="epub">
<day>11</day>
<month>11</month>
<year>2009</year>
</pub-date>
<pub-date pub-type="pmc-release">
<day>10</day>
<month>11</month>
<year>2009</year>
</pub-date>
<volume>38</volume>
<issue>Database issue</issue>
<issue-title>Database issue</issue-title>
<fpage>D468</fpage>
<lpage>D472</lpage>
<history>
<date date-type="received">
<day>15</day>
<month>8</month>
<year>2009</year>
</date>
<date date-type="rev-recd">
<day>8</day>
<month>10</month>
<year>2009</year>
</date>
<date date-type="accepted">
<day>17</day>
<month>10</month>
<year>2009</year>
</date>
</history>
<permissions>
<copyright-statement>© The Author(s) 2009. Published by Oxford University Press.</copyright-statement>
<copyright-year>2009</copyright-year>
<license license-type="creative-commons" xlink:href="http://creativecommons.org/licenses/by-nc/2.5/uk/">
<license-p>
<pmc-comment>CREATIVE COMMONS</pmc-comment>
This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (
<ext-link ext-link-type="uri" xlink:href="http://creativecommons.org/licenses/by-nc/2.5/uk/">http://creativecommons.org/licenses/by-nc/2.5/uk/</ext-link>
) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.</license-p>
</license>
</permissions>
<abstract>
<p>Microbial enzymes have many known applications as biocatalysts in biotechnology, agriculture, medical and other industries. However, only a few enzymes are currently employed for such commercial applications. In this scenario, the current onslaught of metagenomic data provides a new unexplored treasure trove of genomic wealth that can not only enhance the enzyme repertoire by the discovery of novel commercially useful enzymes (CUEs) but can also reveal better functional variants for existing CUEs. We prepared a catalogue of CUEs using text mining of PubMed abstracts and other publicly available information, and manually curated the data to identify 510 CUEs. Further, in order to identify novel homologues of these CUEs, we identified potential ORFs in publicly available metagenomic datasets from 10 diverse sources. Using this strategy, we have developed a resource called MetaBioME (
<ext-link ext-link-type="uri" xlink:href="http://metasystems.riken.jp/metabiome/">http://metasystems.riken.jp/metabiome/</ext-link>
) that comprises (i) a database of CUEs and (ii) a comprehensive platform to facilitate homology-based computational identification of novel homologous CUEs from metagenomic and bacterial genomic datasets. Using MetaBioME, we have identified several novel homologues to known CUEs that can potentially serve as leads for further experimental verification.</p>
</abstract>
</article-meta>
</front>
<body>
<sec>
<title>INTRODUCTION</title>
<p>Characteristics such as high efficiency and stereo-selectivity render naturally occurring enzymes suitable for commercial applications (
<xref ref-type="bibr" rid="B1">1</xref>
). These ‘commercially useful enzymes’ (or CUEs), predominantly used as ‘biocatalysts’, offer ecologically friendly or ‘green’ solutions for the implementation of biochemical processes at a reduced cost and produce a large variety of chemical substances (
<xref ref-type="bibr" rid="B2">2</xref>
). Despite these merits, only a limited number of enzymes have been commercially exploited. This limitation is primarily due to the lack of availability of microbial enzymes that can perform the desired chemical reactions.</p>
<p>In the current era of sequencing, when numerous genomes are being sequenced, the rate of discovery of novel ORFs far exceeds the rate of their functional characterization, so most of these ORFs are labelled as hypothetical proteins with unknown functions. The wet-lab work required for enzyme characterization is far more laborious than sequencing, which is one reason for the growing gap between the number of potential enzymes deduced from these genomic sequences and the number actually implemented in industry.</p>
<p>Another reason for the above limitation is that the CUEs that are currently used may not be ‘ideal’ enzymes for a given bioprocess, and sometimes the industrial processes have to be designed to purposely fit these mediocre enzymes (
<xref ref-type="bibr" rid="B3">3</xref>
). Therefore, improving the existing enzymes to make them suitable for commercial exploitation or finding better functional variants is a key challenge.</p>
<p>One promising approach is to augment our knowledgebase by exploring the inherent diversity of nature, which harbours numerous species whose enzymes perform a vast range of molecular transformations in diverse biological systems with great precision and specificity. In this scenario, metagenomics has emerged as a powerful culture-independent approach for exploring the complexity and diversity of microbial genomes in their natural environments (
<xref ref-type="bibr" rid="B4">4</xref>
). Potentially, it can not only enhance the enzyme repertoire by the discovery of novel CUEs but can also reveal better functional variants for the existing CUEs. The current onslaught of metagenomic data provides a unique opportunity to discover novel functional variants for existing CUEs using sequence homology-based approaches.</p>
<p>Therefore, in the present work, to first catalogue the known CUEs, we used publicly available information to curate a unique and comprehensive database of CUEs mostly comprising biocatalysts currently used in diverse commercial applications or having potential applications. Further, in order to find the homologues of these CUEs, we explored 10 metagenomic data sources and 971 completed bacterial genomes and identified several novel homologues for most of the known CUEs. Using this strategy, we developed the comprehensive Metagenomic BioMining Engine (MetaBioME), which can be used as an intuitive search engine to access manually curated data on the CUEs, stored in a relational database, along with several options to identify their homologues from multiple metagenomic datasets and completed bacterial genomes.</p>
</sec>
<sec>
<title>DATABASE CONSTRUCTION AND CONTENTS</title>
<sec>
<title>Enzyme database</title>
<p>For this analysis, we have exclusively used the Enzyme Commission number (EC number) system to refer to enzymes and define their functions (
<xref ref-type="bibr" rid="B5">5</xref>
). Information on the complete set of 4877 enzymes annotated with EC numbers was retrieved from the ENZYME nomenclature database, as available at ExPASy (March 3, 2009) (
<xref ref-type="bibr" rid="B6">6</xref>
). The corresponding Swiss-Prot sequences were retrieved from the Swiss-Prot database (release 56.9, March 3, 2009) (
<xref ref-type="bibr" rid="B7">7</xref>
).</p>
</sec>
<sec>
<title>Database of CUEs</title>
<p>We curated a database of 510 enzymes with known or potential commercial applications (CUEs) using the information available at NCBI PubMed (
<xref ref-type="bibr" rid="B8">8</xref>
) and BRENDA (
<xref ref-type="bibr" rid="B9">9</xref>
). All English-language abstracts containing the keyword ‘enzyme’ were retrieved from PubMed in XML format and imported into a MySQL database (version 5.1) (
<xref ref-type="fig" rid="F1">Figure 1</xref>
). The initial set of candidate CUEs was identified using the ‘Natural Language full-text search’ and ‘Boolean full-text search’ features of MySQL. Additional information on known CUEs was retrieved from BRENDA. The combined candidate CUEs were then manually curated to identify the final set of 510 CUEs (CUEsDB). Based on their known applications, these CUEs were classified into nine broad application categories, namely: Agriculture, Biosensor, Biotechnology, Energy, Environment, Food and Nutrition, Medical, Other Industries and Miscellaneous.
<fig id="F1" position="float">
<label>Figure 1.</label>
<caption>
<p>Steps in the construction of the MetaBioME database.</p>
</caption>
<graphic xlink:href="gkp1001f1"></graphic>
</fig>
</p>
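<p>As a rough illustration of the Boolean filtering step described above, the following Python sketch selects candidate abstracts by required and excluded keywords. It is a simplified stand-in for MySQL's Boolean full-text search, not the queries actually used to build CUEsDB: the function names, the sample abstracts and the query terms are all hypothetical.</p>
<preformat>
```python
import re


def tokenize(text):
    """Lower-case a text and return its set of alphabetic words."""
    return set(re.findall(r"[a-z]+", text.lower()))


def boolean_match(text, required=(), excluded=()):
    """Return True if every required word is present and no excluded word is."""
    words = tokenize(text)
    has_required = all(word in words for word in required)
    has_excluded = any(word in words for word in excluded)
    return has_required and not has_excluded


# Hypothetical abstracts keyed by made-up identifiers (not real PubMed records).
abstracts = {
    "rec1": "A thermostable enzyme applied as an industrial biocatalyst.",
    "rec2": "This enzyme regulates a signalling pathway in mice.",
    "rec3": "An industrial process without any biological catalyst.",
}

# Hypothetical query: the abstract must mention both 'enzyme' and 'industrial'.
candidates = sorted(
    rec_id
    for rec_id, text in abstracts.items()
    if boolean_match(text, required=("enzyme", "industrial"))
)
print(candidates)
```
</preformat>
<p>With these sample records, only the first abstract satisfies both required terms. Candidates selected in this way would still need manual curation, as described above.</p>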
</sec>
<sec>
<title>Other resources</title>
<p>The non-redundant (NR) sequence database, sequences of 971 completed bacterial genomes (
<ext-link ext-link-type="uri" xlink:href="ftp.ncbi.nih.gov/genomes/Bacteria">ftp.ncbi.nih.gov/genomes/Bacteria</ext-link>
as of September 21, 2009), and Conserved Domain Database (
<ext-link ext-link-type="ftp" xlink:href="ftp://ftp.ncbi.nih.gov/pub/mmdb/cdd">ftp://ftp.ncbi.nih.gov/pub/mmdb/cdd</ext-link>
) were retrieved from NCBI (
<xref ref-type="bibr" rid="B8">8</xref>
,
<xref ref-type="bibr" rid="B10">10</xref>
). The Protein Data Bank (PDB) database was retrieved from the Worldwide Protein Data Bank (wwPDB) (
<ext-link ext-link-type="uri" xlink:href="http://www.wwpdb.org/">http://www.wwpdb.org/</ext-link>
) (
<xref ref-type="bibr" rid="B11">11</xref>
). Images of protein structures were created using RasMol (version 2.6) (
<xref ref-type="bibr" rid="B12">12</xref>
).</p>
</sec>
<sec>
<title>Mining the metagenomic databases</title>
<p>In the current version of the database, we have included the publicly available metagenomic sequence data from 10 sources (environments) comprising 44 datasets (details are available at
<ext-link ext-link-type="uri" xlink:href="http://metasystems.riken.jp/metabiome/metagenome.php">http://metasystems.riken.jp/metabiome/metagenome.php</ext-link>
) generated using Sanger sequencing technology, except for the mouse gut, for which both Sanger and 454 sequencing technologies were used. The assembled metagenomic data were retrieved from the NCBI Entrez Genome Project (
<ext-link ext-link-type="uri" xlink:href="http://www.ncbi.nlm.nih.gov/genomes/lenvs.cgi">http://www.ncbi.nlm.nih.gov/genomes/lenvs.cgi</ext-link>
). Complete and partial ORFs (≥150 nucleotides) were predicted in the metagenomic contigs using the SuperGene algorithm (part of our in-house iMetaSys pipeline (
<xref ref-type="bibr" rid="B13">13</xref>
,
<xref ref-type="bibr" rid="B14">14</xref>
)) that integrates both the Glimmer (
<xref ref-type="bibr" rid="B15">15</xref>
) and MetaGene (
<xref ref-type="bibr" rid="B16">16</xref>
) gene prediction software. The Cd-hit program (version 3.1.2) (
<xref ref-type="bibr" rid="B17">17</xref>
) was used to cluster the metagenomic ORFs. Swiss-Prot protein sequences were available for only 409 of the 510 CUEs; these sequences were aligned with the predicted ORFs of each metagenomic dataset using BLASTP with a threshold of
<italic>E</italic>
&lt; 10
<sup>−6</sup>
. The output was generated in XML format, parsed and imported into a MySQL database (Metabase).</p>
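<p>A minimal sketch of the two numeric filters mentioned above (ORF length of at least 150 nucleotides and a BLASTP E-value below 10<sup>−6</sup>) is given here in Python. The record layout, field names and sample values are assumptions made for illustration. Gene prediction, clustering and parsing of the BLAST XML output are not shown, and this is not the iMetaSys implementation.</p>
<preformat>
```python
from collections import namedtuple

# Field names are assumptions for this sketch, not the Metabase schema.
Hit = namedtuple("Hit", "cue orf evalue")

MIN_ORF_NT = 150      # minimum ORF length stated in the text (nucleotides)
EVALUE_CUTOFF = 1e-6  # hits must have an E-value strictly below this value


def keep_orfs(orf_lengths):
    """Return the IDs of ORFs that are at least MIN_ORF_NT nucleotides long."""
    return {orf for orf, length in orf_lengths.items() if length >= MIN_ORF_NT}


def filter_hits(hits, valid_orfs):
    """Keep hits to retained ORFs whose E-value is below the cutoff."""
    kept = []
    for hit in hits:
        if hit.orf not in valid_orfs:
            continue  # the ORF was too short to be retained
        if hit.evalue >= EVALUE_CUTOFF:
            continue  # not significant at the stated threshold
        kept.append(hit)
    return kept


# Hypothetical ORF lengths and BLASTP hits.
orf_lengths = {"orf_a": 450, "orf_b": 120, "orf_c": 900}
hits = [
    Hit("CUE_1", "orf_a", 1e-40),
    Hit("CUE_1", "orf_b", 1e-20),  # dropped: ORF shorter than 150 nt
    Hit("CUE_2", "orf_c", 1e-3),   # dropped: E-value too high
]

kept = filter_hits(hits, keep_orfs(orf_lengths))
print([hit.orf for hit in kept])
```
</preformat>
<p>In this toy input only the hit to orf_a passes both filters.</p>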
</sec>
<sec>
<title>Web Interface and Metabase development</title>
<p>The open-source LAMP stack (Red Hat Enterprise Linux 4, Apache (version 2.2.8), MySQL (version 5.0.45), PHP (version 5.2.4) and Perl (version 5.8.5)) was used to develop the GUI and the back-end database, called ‘Metabase’. The web server was built on the Apache HTTP Server (version 2.2.8). Client-side scripting was done using XHTML, JavaScript and AJAX, and server-side scripting was done using PHP and XML. The external applications BLAT (v34) (
<xref ref-type="bibr" rid="B18">18</xref>
), BLAST (version 2.2.17) and MAFFT (version 6.240) (
<xref ref-type="bibr" rid="B19">19</xref>
) were integrated for analysis.</p>
</sec>
</sec>
<sec>
<title>RESULTS, QUERIES AND WEB INTERFACE</title>
<sec>
<title>Distribution of CUEs in application categories</title>
<p>The distribution of the 510 CUEs across nine application categories provides a useful scheme for selecting enzymes involved in an application area of interest. Since an enzyme may be employed in more than one application, the categories overlap. The ‘Biotechnology’ category contains the most CUEs (234, 46%) and the ‘Energy’ category the fewest (13, 3%) (
<ext-link ext-link-type="uri" xlink:href="http://nar.oxfordjournals.org/cgi/content/full/gkp1001/DC1">Supplementary Figures S1–S3</ext-link>
).</p>
</sec>
<sec>
<title>Identification of potential homologues to known CUEs</title>
<p>Using a homology-based approach, we identified novel homologues for 199 (49%) of the 409 CUEs with Swiss-Prot sequences in the metagenomic datasets, using a stringent threshold of identity ≥50% and coverage ≥90% (
<xref ref-type="table" rid="T1">Table 1</xref>
). Upon relaxing the above cut-off (identity ≥30% and coverage ≥90%), we identified an expanded list of novel homologues for a total of 305 (75%) out of 409 CUEs in the metagenomic datasets (
<ext-link ext-link-type="uri" xlink:href="http://nar.oxfordjournals.org/cgi/content/full/gkp1001/DC1">Supplementary Table S1</ext-link>
). Within this expanded list, homologues for 20 CUEs were commonly found in all nine metagenomic datasets (the coral viral metagenome dataset was excluded from this analysis), while homologues for 64 CUEs only appeared once each among the nine metagenomic datasets (
<ext-link ext-link-type="uri" xlink:href="http://nar.oxfordjournals.org/cgi/content/full/gkp1001/DC1">Supplementary Table S2</ext-link>
).
<table-wrap id="T1" position="float">
<label>Table 1.</label>
<caption>
<p>Distribution of CUEs showing significant homology with novel ORFs in metagenomic datasets</p>
</caption>
<table frame="hsides" rules="groups">
<thead align="left">
<tr>
<th rowspan="1" colspan="1">Application</th>
<th rowspan="1" colspan="1">
<sup>a</sup>
Total potential homologues (%)</th>
<th colspan="9" align="center" rowspan="1">
<hr></hr>
Homologous ORFs predicted in Metagenomic Datasets (Coverage: ≥90%, Identity ≥50%)
<sup>b</sup>
</th>
</tr>
<tr>
<th rowspan="1" colspan="1"></th>
<th rowspan="1" colspan="1"></th>
<th rowspan="1" colspan="1">Human gut</th>
<th rowspan="1" colspan="1">Mouse gut</th>
<th rowspan="1" colspan="1">Termite gut</th>
<th rowspan="1" colspan="1">Marine</th>
<th rowspan="1" colspan="1">Mine drainage</th>
<th rowspan="1" colspan="1">Sludge</th>
<th rowspan="1" colspan="1">Soil</th>
<th rowspan="1" colspan="1">Microbial mat</th>
<th rowspan="1" colspan="1">Whale fall</th>
</tr>
</thead>
<tbody align="left">
<tr>
<td rowspan="1" colspan="1">Agriculture</td>
<td rowspan="1" colspan="1">14 (34)</td>
<td rowspan="1" colspan="1">11</td>
<td rowspan="1" colspan="1">1</td>
<td rowspan="1" colspan="1">4</td>
<td rowspan="1" colspan="1">12</td>
<td rowspan="1" colspan="1">3</td>
<td rowspan="1" colspan="1">7</td>
<td rowspan="1" colspan="1">5</td>
<td rowspan="1" colspan="1">4</td>
<td rowspan="1" colspan="1">6</td>
</tr>
<tr>
<td rowspan="1" colspan="1">Biosensor</td>
<td rowspan="1" colspan="1">34 (51)</td>
<td rowspan="1" colspan="1">27</td>
<td rowspan="1" colspan="1">1</td>
<td rowspan="1" colspan="1">3</td>
<td rowspan="1" colspan="1">30</td>
<td rowspan="1" colspan="1">1</td>
<td rowspan="1" colspan="1">19</td>
<td rowspan="1" colspan="1">10</td>
<td rowspan="1" colspan="1">7</td>
<td rowspan="1" colspan="1">11</td>
</tr>
<tr>
<td rowspan="1" colspan="1">Biotechnology</td>
<td rowspan="1" colspan="1">104 (54)</td>
<td rowspan="1" colspan="1">79</td>
<td rowspan="1" colspan="1">7</td>
<td rowspan="1" colspan="1">18</td>
<td rowspan="1" colspan="1">92</td>
<td rowspan="1" colspan="1">15</td>
<td rowspan="1" colspan="1">53</td>
<td rowspan="1" colspan="1">22</td>
<td rowspan="1" colspan="1">14</td>
<td rowspan="1" colspan="1">33</td>
</tr>
<tr>
<td rowspan="1" colspan="1">Energy</td>
<td rowspan="1" colspan="1">6 (55)</td>
<td rowspan="1" colspan="1">6</td>
<td rowspan="1" colspan="1">1</td>
<td rowspan="1" colspan="1">2</td>
<td rowspan="1" colspan="1">5</td>
<td rowspan="1" colspan="1">0</td>
<td rowspan="1" colspan="1">1</td>
<td rowspan="1" colspan="1">0</td>
<td rowspan="1" colspan="1">1</td>
<td rowspan="1" colspan="1">0</td>
</tr>
<tr>
<td rowspan="1" colspan="1">Environment</td>
<td rowspan="1" colspan="1">31 (49)</td>
<td rowspan="1" colspan="1">17</td>
<td rowspan="1" colspan="1">3</td>
<td rowspan="1" colspan="1">2</td>
<td rowspan="1" colspan="1">29</td>
<td rowspan="1" colspan="1">1</td>
<td rowspan="1" colspan="1">11</td>
<td rowspan="1" colspan="1">7</td>
<td rowspan="1" colspan="1">5</td>
<td rowspan="1" colspan="1">8</td>
</tr>
<tr>
<td rowspan="1" colspan="1">Food and Nutrition</td>
<td rowspan="1" colspan="1">46 (47)</td>
<td rowspan="1" colspan="1">39</td>
<td rowspan="1" colspan="1">1</td>
<td rowspan="1" colspan="1">6</td>
<td rowspan="1" colspan="1">39</td>
<td rowspan="1" colspan="1">1</td>
<td rowspan="1" colspan="1">13</td>
<td rowspan="1" colspan="1">6</td>
<td rowspan="1" colspan="1">3</td>
<td rowspan="1" colspan="1">12</td>
</tr>
<tr>
<td rowspan="1" colspan="1">Medical</td>
<td rowspan="1" colspan="1">38 (42)</td>
<td rowspan="1" colspan="1">33</td>
<td rowspan="1" colspan="1">5</td>
<td rowspan="1" colspan="1">7</td>
<td rowspan="1" colspan="1">32</td>
<td rowspan="1" colspan="1">5</td>
<td rowspan="1" colspan="1">21</td>
<td rowspan="1" colspan="1">13</td>
<td rowspan="1" colspan="1">7</td>
<td rowspan="1" colspan="1">16</td>
</tr>
<tr>
<td rowspan="1" colspan="1">Miscellaneous</td>
<td rowspan="1" colspan="1">8 (47)</td>
<td rowspan="1" colspan="1">7</td>
<td rowspan="1" colspan="1">1</td>
<td rowspan="1" colspan="1">4</td>
<td rowspan="1" colspan="1">8</td>
<td rowspan="1" colspan="1">1</td>
<td rowspan="1" colspan="1">4</td>
<td rowspan="1" colspan="1">2</td>
<td rowspan="1" colspan="1">2</td>
<td rowspan="1" colspan="1">2</td>
</tr>
<tr>
<td rowspan="1" colspan="1">Other industries</td>
<td rowspan="1" colspan="1">7 (35)</td>
<td rowspan="1" colspan="1">6</td>
<td rowspan="1" colspan="1">0</td>
<td rowspan="1" colspan="1">1</td>
<td rowspan="1" colspan="1">5</td>
<td rowspan="1" colspan="1">0</td>
<td rowspan="1" colspan="1">2</td>
<td rowspan="1" colspan="1">1</td>
<td rowspan="1" colspan="1">1</td>
<td rowspan="1" colspan="1">2</td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<fn id="TF1">
<p>
<sup>a</sup>
Total number of homologous ORFs and their percentages (in brackets) that showed ≥50% sequence identity and ≥90% alignment coverage with CUEs, taken together for all metagenomic datasets.</p>
</fn>
<fn id="TF2">
<p>
<sup>b</sup>
The number of homologous CUEs out of the total number of CUEs in that category.</p>
</fn>
</table-wrap-foot>
</table-wrap>
</p>
</sec>
<sec>
<title>Description of web resource: MetaBioME</title>
<p>MetaBioME has two main components: (i) a curated database of CUEs and (ii) comprehensive bio-mining options to search for novel homologues of the known CUEs in metagenomic and bacterial genomic datasets. For comprehensive querying, we have designed four query pages, which are briefly described below.</p>
</sec>
<sec>
<title>MetaSearch: search for homologous CUEs in metagenomic datasets and completed bacterial genomes</title>
<p>The ‘MetaSearch’ query page is designed to identify novel homologues to the existing set of CUEs from multiple metagenomic datasets and completed bacterial genomes. It consists of a set of CUEs pre-classified in nine application categories that help the user to select any CUEs of interest based on the area of application (
<ext-link ext-link-type="uri" xlink:href="http://nar.oxfordjournals.org/cgi/content/full/gkp1001/DC1">Supplementary Figure S4</ext-link>
). Queries can be made by selecting one or more of the application categories or by using the ‘Advanced Search’ option to select any particular enzyme class (EC) or enzyme name (
<ext-link ext-link-type="uri" xlink:href="http://nar.oxfordjournals.org/cgi/content/full/gkp1001/DC1">Supplementary Figure S5</ext-link>
). This selected set of enzymes can be searched for in the available metagenomic data or completed bacterial genomes. Queries can also be made using multiple keywords and Boolean operators by selecting different attributes, such as enzyme name or keywords, biochemical pathway and substrates or products. A sample query to search for CUEs belonging to the ‘Environment’ application category in the ‘Soil’ metagenomic source is shown in
<ext-link ext-link-type="uri" xlink:href="http://nar.oxfordjournals.org/cgi/content/full/gkp1001/DC1">Supplementary Figures S4, S6 and S7</ext-link>
.</p>
<p>On query submission, MetaBioME examines the sequence similarity of all known Swiss-Prot sequences of CUEs belonging to the selected application categories with all the predicted metagenomic ORFs of the selected metagenomic dataset(s) or with all proteins of the selected bacterial genomes. The subsequent ‘MetaResults’ page displays the qualified hits as a table sorted on the basis of percent coverage (
<ext-link ext-link-type="uri" xlink:href="http://nar.oxfordjournals.org/cgi/content/full/gkp1001/DC1">Supplementary Figure S6</ext-link>
). Comprehensive information can be retrieved by clicking on the Swiss-Prot ID link on the MetaResults page, opening up the ‘MetaBioME profile’ page (
<ext-link ext-link-type="uri" xlink:href="http://nar.oxfordjournals.org/cgi/content/full/gkp1001/DC1">Supplementary Figure S7</ext-link>
). The profile page summarizes the available information about the selected CUE. Below this summary is a table of all the predicted ORFs in the metagenomic contig or bacterial genome, with a description of the ORF that showed the highest similarity to the selected Swiss-Prot sequence of the CUE; this ORF is also displayed in the contig view window. Next come an alignment view of the homologous ORF with the CUE sequence, a summary of the closest match of the homologous ORF to a known finished bacterial genome, and information on the closest available PDB structure. The following window displays the alignment of the CUE sequence with all novel metagenomic ORFs (all datasets), clustered using cd-hit. The next table lists other Swiss-Prot IDs belonging to the same EC number that showed lower similarity.</p>
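<p>The selection and ordering of hits described above can be illustrated with a minimal sketch. It assumes that the cut-offs quoted in the table footnote (≥50% sequence identity and ≥90% alignment coverage) are the ones used to qualify hits, which the text does not state explicitly for the MetaResults page. The record layout, field names and identifiers are hypothetical and are not taken from MetaBioME.</p>
<preformat>
```python
from dataclasses import dataclass


@dataclass
class Hit:
    """One CUE-versus-ORF alignment; all field names are hypothetical."""
    cue_id: str
    orf_id: str
    identity: float   # percent sequence identity
    coverage: float   # percent alignment coverage


def qualified_hits(hits, min_identity=50.0, min_coverage=90.0):
    """Keep hits meeting both cut-offs, highest coverage first."""
    kept = [h for h in hits
            if h.identity >= min_identity and h.coverage >= min_coverage]
    return sorted(kept, key=lambda h: h.coverage, reverse=True)


# Placeholder data for illustration only (not real MetaBioME results).
example = [
    Hit("cue_1", "orf_a", 62.0, 95.0),
    Hit("cue_1", "orf_b", 48.0, 99.0),   # fails the identity cut-off
    Hit("cue_2", "orf_c", 71.0, 100.0),
    Hit("cue_2", "orf_d", 80.0, 85.0),   # fails the coverage cut-off
]
ranked = qualified_hits(example)
print([h.orf_id for h in ranked])
```
</preformat>
<p>Both thresholds are exposed as parameters in this sketch so that a stricter or looser screen could be tried without changing the sorting step.</p>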
<p>‘MetaBioME Rating’ rates the homologous ORF on a scale of 1–5 stars (weakest to best match). In the case of a good match (≥2 stars), users can perform an ‘Advanced Analysis’ to (i) examine the alignment of the CUE sequence with the homologous ORF, (ii) examine the sequence similarity among all Swiss-Prot sequences of the CUE and the homologous ORF, (iii) examine the presence of conserved domains in the homologous ORF using the NCBI Conserved Domain Database (CDD) or (iv) look for more homologues of the CUE in other metagenomic datasets or bacterial genomes.</p>
</sec>
<sec>
<title>CUEsXplorer—explore curated CUEs</title>
<p>This query page provides options for browsing the CUEs database with respect to application category or EC classification. Users can retrieve details about enzyme function and a curation summary by selecting any enzyme. A complete list of all CUEs can also be retrieved from this query page.</p>
</sec>
<sec>
<title>MetaXplorer—search for enzymes in metagenomic datasets</title>
<p>This query page lets users search the metagenomic datasets or completed bacterial genomes for all known enzymes in the six EC classes, irrespective of their role as a CUE. Detailed information about the enzymes, their biochemical pathways and all Swiss-Prot IDs belonging to the selected EC number can be retrieved. Any representative Swiss-Prot sequence can be further searched in one or more of the metagenomic datasets or completed bacterial genomes.</p>
</sec>
<sec>
<title>MetaAlign—search for nucleotide/protein sequences in metagenomic datasets</title>
<p>MetaAlign is an application powered by the BLAT and BLAST sequence alignment tools. Options are provided to carry out homology-based searches by uploading (i) single or multiple (multi-fasta format) nucleotide or protein sequences to search against the metagenomic sequences or bacterial genomes and (ii) the user’s own genomic or metagenomic sequences to search against the CUEs database.</p>
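<p>As an illustration of the multi-fasta input mentioned above, the following minimal sketch splits such a file into individual records before they would be passed to an alignment tool. It is not MetaAlign's implementation; the function name and the sample sequences are hypothetical.</p>
<preformat>
```python
def parse_multi_fasta(text):
    """Return a list of (header, sequence) pairs from FASTA-formatted text."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)  # sequence lines may wrap over several lines
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


# Made-up sequences for illustration only.
sample = """>query_1 hypothetical protein
MKTAYIAKQR
QISFVKSHFS
>query_2
MSLNE
"""
records = parse_multi_fasta(sample)
print(records)
```
</preformat>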
</sec>
</sec>
<sec>
<title>DISCUSSION AND FUTURE DIRECTIONS</title>
<p>The richness and natural diversity of metagenomic data are so great that functional genes of interest are very likely to be retrieved, and this likelihood will increase with the availability of additional metagenomic datasets and complete genomic sequences. Therefore, an automated homology-based computational approach like MetaBioME has great potential to reveal novel functional homologues of known CUEs. To our knowledge, this is the first comprehensive effort to curate a publicly available database of CUEs and the first such resource for exploring them in multiple metagenomic datasets or bacterial genomes.</p>
<p>It is a challenging task to look for an ‘ideal biocatalyst’, since the requirements and conditions of bioprocesses are not constant and the commercial significance of an enzyme can only be established by experimental studies. Therefore, MetaBioME does not take an exclusive approach of looking for ideal biocatalysts or CUEs with novel function, but instead employs an inclusive approach that tries to identify all possible homologues of known CUEs using stringent criteria. These homologous ORFs come from the naturally existing, diverse protein repertoire of as yet unidentified microbial genomes that have evolved and survived in diverse environments, in some cases for billions of years. Thus, each resultant homologous ORF is likely to be functional and to have distinct characteristics, such as thermodynamic and pH stability, turnover frequency and specific activity, depending upon its environmental source (
<xref ref-type="bibr" rid="B20">20</xref>
). These novel homologous ORFs expand the currently known family of CUEs and their functional repertoire, and provide a wide range of possible enzymes to choose from and employ as per the requirements of any given bioprocess. Such approaches are useful for pharmaceutical and supporting fine-chemical companies (
<xref ref-type="bibr" rid="B3">3</xref>
), and especially for biotechnological companies that explore multiple diverse biocatalysts in order to build and expand their in-house toolboxes for biotransformations.</p>
<p>Certainly, the enzymatic properties and commercial potential of the novel homologous CUEs identified through MetaBioME need to be examined through more intensive bioinformatic analyses before more costly experimental characterization is performed, but at least initially they can serve as potential leads for such analyses. In future versions of MetaBioME, we aim to increase our knowledgebase of CUEs and to include more metagenomic datasets and completed bacterial genomes, with additional options for in silico analysis and data mining. MetaBioME can be queried through a public web interface at
<ext-link ext-link-type="uri" xlink:href="http://metasystems.riken.jp/metabiome">http://metasystems.riken.jp/metabiome</ext-link>
.</p>
</sec>
<sec>
<title>SUPPLEMENTARY DATA</title>
<p>
<ext-link ext-link-type="uri" xlink:href="http://nar.oxfordjournals.org/cgi/content/full/gkp1001/DC1">Supplementary Data</ext-link>
are available at NAR Online.</p>
</sec>
<sec>
<title>FUNDING</title>
<p>Funding for open access charge: Operational expenditure fund of RIKEN.</p>
<p>
<italic>Conflict of interest statement</italic>
. None declared.</p>
</sec>
</body>
<back>
<ack>
<title>ACKNOWLEDGEMENTS</title>
<p>We thank Takujiro Katayama (Hitachi Government and Public Corporation System Engineering, Ltd) and Chiharu Kawagoe (Hitachi, Ltd) for providing technical support. We also thank Naoko Kobayashi and Yui Bando for their administrative assistance.</p>
</ack>
<ref-list>
<title>REFERENCES</title>
<ref id="B1">
<label>1</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Arnold</surname>
<given-names>FH</given-names>
</name>
</person-group>
<article-title>Combinatorial and computational challenges for biocatalyst design</article-title>
<source>Nature</source>
<year>2001</year>
<volume>409</volume>
<fpage>253</fpage>
<lpage>257</lpage>
<pub-id pub-id-type="pmid">11196654</pub-id>
</element-citation>
</ref>
<ref id="B2">
<label>2</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Ferrer</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Martinez-Abarca</surname>
<given-names>F</given-names>
</name>
<name>
<surname>Golyshin</surname>
<given-names>PN</given-names>
</name>
</person-group>
<article-title>Mining genomes and ‘metagenomes’ for novel catalysts</article-title>
<source>Curr. Opin. Biotechnol.</source>
<year>2005</year>
<volume>16</volume>
<fpage>588</fpage>
<lpage>593</lpage>
<pub-id pub-id-type="pmid">16171989</pub-id>
</element-citation>
</ref>
<ref id="B3">
<label>3</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Lorenz</surname>
<given-names>P</given-names>
</name>
<name>
<surname>Eck</surname>
<given-names>J</given-names>
</name>
</person-group>
<article-title>Metagenomics and industrial applications</article-title>
<source>Nat. Rev. Microbiol.</source>
<year>2005</year>
<volume>3</volume>
<fpage>510</fpage>
<lpage>516</lpage>
<pub-id pub-id-type="pmid">15931168</pub-id>
</element-citation>
</ref>
<ref id="B4">
<label>4</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Tringe</surname>
<given-names>SG</given-names>
</name>
<name>
<surname>Rubin</surname>
<given-names>EM</given-names>
</name>
</person-group>
<article-title>Metagenomics: DNA sequencing of environmental samples</article-title>
<source>Nat. Rev. Genet.</source>
<year>2005</year>
<volume>6</volume>
<fpage>805</fpage>
<lpage>814</lpage>
<pub-id pub-id-type="pmid">16304596</pub-id>
</element-citation>
</ref>
<ref id="B5">
<label>5</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Bairoch</surname>
<given-names>A</given-names>
</name>
</person-group>
<article-title>The ENZYME database in 2000</article-title>
<source>Nucleic Acids Res.</source>
<year>2000</year>
<volume>28</volume>
<fpage>304</fpage>
<lpage>305</lpage>
<pub-id pub-id-type="pmid">10592255</pub-id>
</element-citation>
</ref>
<ref id="B6">
<label>6</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Gasteiger</surname>
<given-names>E</given-names>
</name>
<name>
<surname>Gattiker</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Hoogland</surname>
<given-names>C</given-names>
</name>
<name>
<surname>Ivanyi</surname>
<given-names>I</given-names>
</name>
<name>
<surname>Appel</surname>
<given-names>RD</given-names>
</name>
<name>
<surname>Bairoch</surname>
<given-names>A</given-names>
</name>
</person-group>
<article-title>ExPASy: the proteomics server for in-depth protein knowledge and analysis</article-title>
<source>Nucleic Acids Res.</source>
<year>2003</year>
<volume>31</volume>
<fpage>3784</fpage>
<lpage>3788</lpage>
<pub-id pub-id-type="pmid">12824418</pub-id>
</element-citation>
</ref>
<ref id="B7">
<label>7</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Kiefer</surname>
<given-names>F</given-names>
</name>
<name>
<surname>Arnold</surname>
<given-names>K</given-names>
</name>
<name>
<surname>Kunzli</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Bordoli</surname>
<given-names>L</given-names>
</name>
<name>
<surname>Schwede</surname>
<given-names>T</given-names>
</name>
</person-group>
<article-title>The SWISS-MODEL Repository and associated resources</article-title>
<source>Nucleic Acids Res.</source>
<year>2009</year>
<volume>37</volume>
<fpage>D387</fpage>
<lpage>D392</lpage>
<pub-id pub-id-type="pmid">18931379</pub-id>
</element-citation>
</ref>
<ref id="B8">
<label>8</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Sayers</surname>
<given-names>EW</given-names>
</name>
<name>
<surname>Barrett</surname>
<given-names>T</given-names>
</name>
<name>
<surname>Benson</surname>
<given-names>DA</given-names>
</name>
<name>
<surname>Bryant</surname>
<given-names>SH</given-names>
</name>
<name>
<surname>Canese</surname>
<given-names>K</given-names>
</name>
<name>
<surname>Chetvernin</surname>
<given-names>V</given-names>
</name>
<name>
<surname>Church</surname>
<given-names>DM</given-names>
</name>
<name>
<surname>DiCuccio</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Edgar</surname>
<given-names>R</given-names>
</name>
<name>
<surname>Federhen</surname>
<given-names>S</given-names>
</name>
<etal></etal>
</person-group>
<article-title>Database resources of the National Center for Biotechnology Information</article-title>
<source>Nucleic Acids Res.</source>
<year>2009</year>
<volume>37</volume>
<fpage>D5</fpage>
<lpage>D15</lpage>
<pub-id pub-id-type="pmid">18940862</pub-id>
</element-citation>
</ref>
<ref id="B9">
<label>9</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Chang</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Scheer</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Grote</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Schomburg</surname>
<given-names>I</given-names>
</name>
<name>
<surname>Schomburg</surname>
<given-names>D</given-names>
</name>
</person-group>
<article-title>BRENDA, AMENDA and FRENDA the enzyme information system: new content and tools in 2009</article-title>
<source>Nucleic Acids Res.</source>
<year>2009</year>
<volume>37</volume>
<fpage>D588</fpage>
<lpage>D592</lpage>
<pub-id pub-id-type="pmid">18984617</pub-id>
</element-citation>
</ref>
<ref id="B10">
<label>10</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Marchler-Bauer</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Anderson</surname>
<given-names>JB</given-names>
</name>
<name>
<surname>Chitsaz</surname>
<given-names>F</given-names>
</name>
<name>
<surname>Derbyshire</surname>
<given-names>MK</given-names>
</name>
<name>
<surname>Weese-Scott</surname>
<given-names>C</given-names>
</name>
<name>
<surname>Fong</surname>
<given-names>JH</given-names>
</name>
<name>
<surname>Geer</surname>
<given-names>LY</given-names>
</name>
<name>
<surname>Geer</surname>
<given-names>RC</given-names>
</name>
<name>
<surname>Gonzales</surname>
<given-names>NR</given-names>
</name>
<name>
<surname>Gwadz</surname>
<given-names>M</given-names>
</name>
<etal></etal>
</person-group>
<article-title>CDD: specific functional annotation with the Conserved Domain Database</article-title>
<source>Nucleic Acids Res.</source>
<year>2009</year>
<volume>37</volume>
<fpage>D205</fpage>
<lpage>D210</lpage>
<pub-id pub-id-type="pmid">18984618</pub-id>
</element-citation>
</ref>
<ref id="B11">
<label>11</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Dutta</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Burkhardt</surname>
<given-names>K</given-names>
</name>
<name>
<surname>Young</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Swaminathan</surname>
<given-names>GJ</given-names>
</name>
<name>
<surname>Matsuura</surname>
<given-names>T</given-names>
</name>
<name>
<surname>Henrick</surname>
<given-names>K</given-names>
</name>
<name>
<surname>Nakamura</surname>
<given-names>H</given-names>
</name>
<name>
<surname>Berman</surname>
<given-names>HM</given-names>
</name>
</person-group>
<article-title>Data deposition and annotation at the worldwide protein data bank</article-title>
<source>Mol. Biotechnol.</source>
<year>2009</year>
<volume>42</volume>
<fpage>1</fpage>
<lpage>13</lpage>
<pub-id pub-id-type="pmid">19082769</pub-id>
</element-citation>
</ref>
<ref id="B12">
<label>12</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Bernstein</surname>
<given-names>HJ</given-names>
</name>
</person-group>
<article-title>Recent changes to RasMol, recombining the variants</article-title>
<source>Trends Biochem. Sci.</source>
<year>2000</year>
<volume>25</volume>
<fpage>453</fpage>
<lpage>455</lpage>
<pub-id pub-id-type="pmid">10973060</pub-id>
</element-citation>
</ref>
<ref id="B13">
<label>13</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Hongoh</surname>
<given-names>Y</given-names>
</name>
<name>
<surname>Sharma</surname>
<given-names>VK</given-names>
</name>
<name>
<surname>Prakash</surname>
<given-names>T</given-names>
</name>
<name>
<surname>Noda</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Taylor</surname>
<given-names>TD</given-names>
</name>
<name>
<surname>Kudo</surname>
<given-names>T</given-names>
</name>
<name>
<surname>Sakaki</surname>
<given-names>Y</given-names>
</name>
<name>
<surname>Toyoda</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Hattori</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Ohkuma</surname>
<given-names>M</given-names>
</name>
</person-group>
<article-title>Complete genome of the uncultured Termite Group 1 bacteria in a single host protist cell</article-title>
<source>Proc. Natl Acad. Sci. USA</source>
<year>2008</year>
<volume>105</volume>
<fpage>5555</fpage>
<lpage>5560</lpage>
<pub-id pub-id-type="pmid">18391199</pub-id>
</element-citation>
</ref>
<ref id="B14">
<label>14</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Hongoh</surname>
<given-names>Y</given-names>
</name>
<name>
<surname>Sharma</surname>
<given-names>VK</given-names>
</name>
<name>
<surname>Prakash</surname>
<given-names>T</given-names>
</name>
<name>
<surname>Noda</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Toh</surname>
<given-names>H</given-names>
</name>
<name>
<surname>Taylor</surname>
<given-names>TD</given-names>
</name>
<name>
<surname>Kudo</surname>
<given-names>T</given-names>
</name>
<name>
<surname>Sakaki</surname>
<given-names>Y</given-names>
</name>
<name>
<surname>Toyoda</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Hattori</surname>
<given-names>M</given-names>
</name>
<etal></etal>
</person-group>
<article-title>Genome of an endosymbiont coupling N2 fixation to cellulolysis within protist cells in termite gut</article-title>
<source>Science</source>
<year>2008</year>
<volume>322</volume>
<fpage>1108</fpage>
<lpage>1109</lpage>
<pub-id pub-id-type="pmid">19008447</pub-id>
</element-citation>
</ref>
<ref id="B15">
<label>15</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Delcher</surname>
<given-names>AL</given-names>
</name>
<name>
<surname>Bratke</surname>
<given-names>KA</given-names>
</name>
<name>
<surname>Powers</surname>
<given-names>EC</given-names>
</name>
<name>
<surname>Salzberg</surname>
<given-names>SL</given-names>
</name>
</person-group>
<article-title>Identifying bacterial genes and endosymbiont DNA with Glimmer</article-title>
<source>Bioinformatics</source>
<year>2007</year>
<volume>23</volume>
<fpage>673</fpage>
<lpage>679</lpage>
<pub-id pub-id-type="pmid">17237039</pub-id>
</element-citation>
</ref>
<ref id="B16">
<label>16</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Noguchi</surname>
<given-names>H</given-names>
</name>
<name>
<surname>Park</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Takagi</surname>
<given-names>T</given-names>
</name>
</person-group>
<article-title>MetaGene: prokaryotic gene finding from environmental genome shotgun sequences</article-title>
<source>Nucleic Acids Res.</source>
<year>2006</year>
<volume>34</volume>
<fpage>5623</fpage>
<lpage>5630</lpage>
<pub-id pub-id-type="pmid">17028096</pub-id>
</element-citation>
</ref>
<ref id="B17">
<label>17</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Li</surname>
<given-names>W</given-names>
</name>
<name>
<surname>Godzik</surname>
<given-names>A</given-names>
</name>
</person-group>
<article-title>Cd-hit: a fast program for clustering and comparing large sets of protein or nucleotide sequences</article-title>
<source>Bioinformatics</source>
<year>2006</year>
<volume>22</volume>
<fpage>1658</fpage>
<lpage>1659</lpage>
<pub-id pub-id-type="pmid">16731699</pub-id>
</element-citation>
</ref>
<ref id="B18">
<label>18</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Kent</surname>
<given-names>WJ</given-names>
</name>
</person-group>
<article-title>BLAT–the BLAST-like alignment tool</article-title>
<source>Genome Res.</source>
<year>2002</year>
<volume>12</volume>
<fpage>656</fpage>
<lpage>664</lpage>
<pub-id pub-id-type="pmid">11932250</pub-id>
</element-citation>
</ref>
<ref id="B19">
<label>19</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Katoh</surname>
<given-names>K</given-names>
</name>
<name>
<surname>Kuma</surname>
<given-names>K</given-names>
</name>
<name>
<surname>Miyata</surname>
<given-names>T</given-names>
</name>
<name>
<surname>Toh</surname>
<given-names>H</given-names>
</name>
</person-group>
<article-title>Improvement in the accuracy of multiple sequence alignment program MAFFT</article-title>
<source>Genome Inform.</source>
<year>2005</year>
<volume>16</volume>
<fpage>22</fpage>
<lpage>33</lpage>
<pub-id pub-id-type="pmid">16362903</pub-id>
</element-citation>
</ref>
<ref id="B20">
<label>20</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Nobeli</surname>
<given-names>I</given-names>
</name>
<name>
<surname>Favia</surname>
<given-names>AD</given-names>
</name>
<name>
<surname>Thornton</surname>
<given-names>JM</given-names>
</name>
</person-group>
<article-title>Protein promiscuity and its implications for biotechnology</article-title>
<source>Nat. Biotechnol.</source>
<year>2009</year>
<volume>27</volume>
<fpage>157</fpage>
<lpage>167</lpage>
<pub-id pub-id-type="pmid">19204698</pub-id>
</element-citation>
</ref>
</ref-list>
</back>
</pmc>
</record>
