eCAMI: simultaneous classification and motif identification for enzyme annotation.

Internal identifier: 000346 (PubMed/Corpus); previous: 000345; next: 000347

Authors: Jing Xu; Han Zhang; Jinfang Zheng; Philippe Dovoedo; Yanbin Yin

Source:

Bioinformatics (Oxford, England) [1367-4811]; 2020.

RBID : pubmed:31794006

Abstract

Carbohydrate-active enzymes (CAZymes) are extremely important to bioenergy, human gut microbiome, and plant pathogen researches and industries. Here we developed a new amino acid k-mer-based CAZyme classification, motif identification and genome annotation tool using a bipartite network algorithm. Using this tool, we classified 390 CAZyme families into thousands of subfamilies each with distinguishing k-mer peptides. These k-mers represented the characteristic motifs (in the form of a collection of conserved short peptides) of each subfamily, and thus were further used to annotate new genomes for CAZymes. This idea was also generalized to extract characteristic k-mer peptides for all the Swiss-Prot enzymes classified by the EC (enzyme commission) numbers and applied to enzyme EC prediction.

DOI: 10.1093/bioinformatics/btz908
PubMed: 31794006
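
The abstract above describes linking protein sequences to the short peptides (k-mers) they contain. The following is a minimal sketch of that general idea only, not the eCAMI implementation: it builds a toy two-sided mapping from k-mers to the sequences containing them and keeps the k-mers shared by several sequences. The function names, the toy sequences, `k=4`, and the `min_seqs` threshold are all illustrative assumptions.

```python
from collections import defaultdict


def kmers(seq, k):
    """Return the set of overlapping k-mers in an amino acid sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def build_bipartite(sequences, k):
    """Map each k-mer to the IDs of the sequences that contain it.

    The k-mers form one side of a bipartite graph and the sequences
    the other; an edge means "this sequence contains this k-mer".
    """
    kmer_to_seqs = defaultdict(set)
    for seq_id, seq in sequences.items():
        for kmer in kmers(seq, k):
            kmer_to_seqs[kmer].add(seq_id)
    return kmer_to_seqs


def shared_kmers(kmer_to_seqs, min_seqs):
    """Keep only k-mers found in at least `min_seqs` sequences."""
    return {kmer: ids for kmer, ids in kmer_to_seqs.items()
            if len(ids) >= min_seqs}


# Hypothetical toy sequences, not real CAZyme data.
toy = {"p1": "MKTAYIAK", "p2": "GGTAYIAW", "p3": "MKLLNNPQ"}
graph = build_bipartite(toy, k=4)
conserved = shared_kmers(graph, min_seqs=2)
print(sorted(conserved))  # ['AYIA', 'TAYI']
```

In this toy example, `p1` and `p2` share two 4-mers and would plausibly be grouped together, while `p3` shares none. How the published tool clusters sequences and scores k-mers is not specified in the abstract and is not reproduced here.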

Links to Exploration step

pubmed:31794006

The document in XML format

<record><TEI><teiHeader><fileDesc><titleStmt><title xml:lang="en">eCAMI: simultaneous classification and motif identification for enzyme annotation.</title>
<author><name sortKey="Xu, Jing" sort="Xu, Jing" uniqKey="Xu J" first="Jing" last="Xu">Jing Xu</name>
<affiliation><nlm:affiliation>College of Artificial Intelligence, Nankai University, Tianjin 300071, China.</nlm:affiliation>
</affiliation>
</author>
<author><name sortKey="Zhang, Han" sort="Zhang, Han" uniqKey="Zhang H" first="Han" last="Zhang">Han Zhang</name>
<affiliation><nlm:affiliation>College of Artificial Intelligence, Nankai University, Tianjin 300071, China.</nlm:affiliation>
</affiliation>
</author>
<author><name sortKey="Zheng, Jinfang" sort="Zheng, Jinfang" uniqKey="Zheng J" first="Jinfang" last="Zheng">Jinfang Zheng</name>
<affiliation><nlm:affiliation>Department of Food Science and Technology, Nebraska Food for Health Center, University of Nebraska, Lincoln, NE 68588, USA.</nlm:affiliation>
</affiliation>
</author>
<author><name sortKey="Dovoedo, Philippe" sort="Dovoedo, Philippe" uniqKey="Dovoedo P" first="Philippe" last="Dovoedo">Philippe Dovoedo</name>
<affiliation><nlm:affiliation>Department of Mathematical Sciences, Northern Illinois University, DeKalb, IL 60115, USA.</nlm:affiliation>
</affiliation>
</author>
<author><name sortKey="Yin, Yanbin" sort="Yin, Yanbin" uniqKey="Yin Y" first="Yanbin" last="Yin">Yanbin Yin</name>
<affiliation><nlm:affiliation>Department of Food Science and Technology, Nebraska Food for Health Center, University of Nebraska, Lincoln, NE 68588, USA.</nlm:affiliation>
</affiliation>
</author>
</titleStmt>
<publicationStmt><idno type="wicri:source">PubMed</idno>
<date when="2020">2020</date>
<idno type="RBID">pubmed:31794006</idno>
<idno type="pmid">31794006</idno>
<idno type="doi">10.1093/bioinformatics/btz908</idno>
<idno type="wicri:Area/PubMed/Corpus">000346</idno>
<idno type="wicri:explorRef" wicri:stream="PubMed" wicri:step="Corpus" wicri:corpus="PubMed">000346</idno>
</publicationStmt>
<sourceDesc><biblStruct><analytic><title xml:lang="en">eCAMI: simultaneous classification and motif identification for enzyme annotation.</title>
<author><name sortKey="Xu, Jing" sort="Xu, Jing" uniqKey="Xu J" first="Jing" last="Xu">Jing Xu</name>
<affiliation><nlm:affiliation>College of Artificial Intelligence, Nankai University, Tianjin 300071, China.</nlm:affiliation>
</affiliation>
</author>
<author><name sortKey="Zhang, Han" sort="Zhang, Han" uniqKey="Zhang H" first="Han" last="Zhang">Han Zhang</name>
<affiliation><nlm:affiliation>College of Artificial Intelligence, Nankai University, Tianjin 300071, China.</nlm:affiliation>
</affiliation>
</author>
<author><name sortKey="Zheng, Jinfang" sort="Zheng, Jinfang" uniqKey="Zheng J" first="Jinfang" last="Zheng">Jinfang Zheng</name>
<affiliation><nlm:affiliation>Department of Food Science and Technology, Nebraska Food for Health Center, University of Nebraska, Lincoln, NE 68588, USA.</nlm:affiliation>
</affiliation>
</author>
<author><name sortKey="Dovoedo, Philippe" sort="Dovoedo, Philippe" uniqKey="Dovoedo P" first="Philippe" last="Dovoedo">Philippe Dovoedo</name>
<affiliation><nlm:affiliation>Department of Mathematical Sciences, Northern Illinois University, DeKalb, IL 60115, USA.</nlm:affiliation>
</affiliation>
</author>
<author><name sortKey="Yin, Yanbin" sort="Yin, Yanbin" uniqKey="Yin Y" first="Yanbin" last="Yin">Yanbin Yin</name>
<affiliation><nlm:affiliation>Department of Food Science and Technology, Nebraska Food for Health Center, University of Nebraska, Lincoln, NE 68588, USA.</nlm:affiliation>
</affiliation>
</author>
</analytic>
<series><title level="j">Bioinformatics (Oxford, England)</title>
<idno type="eISSN">1367-4811</idno>
<imprint><date when="2020" type="published">2020</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
</fileDesc>
<profileDesc><textClass></textClass>
</profileDesc>
</teiHeader>
<front><div type="abstract" xml:lang="en">Carbohydrate-active enzymes (CAZymes) are extremely important to bioenergy, human gut microbiome, and plant pathogen researches and industries. Here we developed a new amino acid k-mer-based CAZyme classification, motif identification and genome annotation tool using a bipartite network algorithm. Using this tool, we classified 390 CAZyme families into thousands of subfamilies each with distinguishing k-mer peptides. These k-mers represented the characteristic motifs (in the form of a collection of conserved short peptides) of each subfamily, and thus were further used to annotate new genomes for CAZymes. This idea was also generalized to extract characteristic k-mer peptides for all the Swiss-Prot enzymes classified by the EC (enzyme commission) numbers and applied to enzyme EC prediction.</div>
</front>
</TEI>
<pubmed><MedlineCitation Status="In-Data-Review" Owner="NLM"><PMID Version="1">31794006</PMID>
<DateRevised><Year>2020</Year>
<Month>04</Month>
<Day>09</Day>
</DateRevised>
<Article PubModel="Print"><Journal><ISSN IssnType="Electronic">1367-4811</ISSN>
<JournalIssue CitedMedium="Internet"><Volume>36</Volume>
<Issue>7</Issue>
<PubDate><Year>2020</Year>
<Month>Apr</Month>
<Day>01</Day>
</PubDate>
</JournalIssue>
<Title>Bioinformatics (Oxford, England)</Title>
<ISOAbbreviation>Bioinformatics</ISOAbbreviation>
</Journal>
<ArticleTitle>eCAMI: simultaneous classification and motif identification for enzyme annotation.</ArticleTitle>
<Pagination><MedlinePgn>2068-2075</MedlinePgn>
</Pagination>
<ELocationID EIdType="doi" ValidYN="Y">10.1093/bioinformatics/btz908</ELocationID>
<Abstract><AbstractText Label="MOTIVATION" NlmCategory="BACKGROUND">Carbohydrate-active enzymes (CAZymes) are extremely important to bioenergy, human gut microbiome, and plant pathogen researches and industries. Here we developed a new amino acid k-mer-based CAZyme classification, motif identification and genome annotation tool using a bipartite network algorithm. Using this tool, we classified 390 CAZyme families into thousands of subfamilies each with distinguishing k-mer peptides. These k-mers represented the characteristic motifs (in the form of a collection of conserved short peptides) of each subfamily, and thus were further used to annotate new genomes for CAZymes. This idea was also generalized to extract characteristic k-mer peptides for all the Swiss-Prot enzymes classified by the EC (enzyme commission) numbers and applied to enzyme EC prediction.</AbstractText>
<AbstractText Label="RESULTS" NlmCategory="RESULTS">This new tool was implemented as a Python package named eCAMI. Benchmark analysis of eCAMI against the state-of-the-art tools on CAZyme and enzyme EC datasets found that: (i) eCAMI has the best performance in terms of accuracy and memory use for CAZyme and enzyme EC classification and annotation; (ii) the k-mer-based tools (including PPR-Hotpep, CUPP and eCAMI) perform better than homology-based tools and deep-learning tools in enzyme EC prediction. Lastly, we confirmed that the k-mer-based tools have the unique ability to identify the characteristic k-mer peptides in the predicted enzymes.</AbstractText>
<AbstractText Label="AVAILABILITY AND IMPLEMENTATION" NlmCategory="METHODS">https://github.com/yinlabniu/eCAMI and https://github.com/zhanglabNKU/eCAMI.</AbstractText>
<AbstractText Label="SUPPLEMENTARY INFORMATION" NlmCategory="BACKGROUND">Supplementary data are available at Bioinformatics online.</AbstractText>
<CopyrightInformation>© The Author(s) 2019. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.</CopyrightInformation>
</Abstract>
<AuthorList CompleteYN="Y"><Author ValidYN="Y"><LastName>Xu</LastName>
<ForeName>Jing</ForeName>
<Initials>J</Initials>
<AffiliationInfo><Affiliation>College of Artificial Intelligence, Nankai University, Tianjin 300071, China.</Affiliation>
</AffiliationInfo>
<AffiliationInfo><Affiliation>College of Computer Science, Nankai University, Tianjin 300071, China.</Affiliation>
</AffiliationInfo>
</Author>
<Author ValidYN="Y"><LastName>Zhang</LastName>
<ForeName>Han</ForeName>
<Initials>H</Initials>
<AffiliationInfo><Affiliation>College of Artificial Intelligence, Nankai University, Tianjin 300071, China.</Affiliation>
</AffiliationInfo>
</Author>
<Author ValidYN="Y"><LastName>Zheng</LastName>
<ForeName>Jinfang</ForeName>
<Initials>J</Initials>
<AffiliationInfo><Affiliation>Department of Food Science and Technology, Nebraska Food for Health Center, University of Nebraska, Lincoln, NE 68588, USA.</Affiliation>
</AffiliationInfo>
</Author>
<Author ValidYN="Y"><LastName>Dovoedo</LastName>
<ForeName>Philippe</ForeName>
<Initials>P</Initials>
<AffiliationInfo><Affiliation>Department of Mathematical Sciences, Northern Illinois University, DeKalb, IL 60115, USA.</Affiliation>
</AffiliationInfo>
</Author>
<Author ValidYN="Y"><LastName>Yin</LastName>
<ForeName>Yanbin</ForeName>
<Initials>Y</Initials>
<AffiliationInfo><Affiliation>Department of Food Science and Technology, Nebraska Food for Health Center, University of Nebraska, Lincoln, NE 68588, USA.</Affiliation>
</AffiliationInfo>
</Author>
</AuthorList>
<Language>eng</Language>
<PublicationTypeList><PublicationType UI="D016428">Journal Article</PublicationType>
</PublicationTypeList>
</Article>
<MedlineJournalInfo><Country>England</Country>
<MedlineTA>Bioinformatics</MedlineTA>
<NlmUniqueID>9808944</NlmUniqueID>
<ISSNLinking>1367-4803</ISSNLinking>
</MedlineJournalInfo>
<CitationSubset>IM</CitationSubset>
</MedlineCitation>
<PubmedData><History><PubMedPubDate PubStatus="received"><Year>2019</Year>
<Month>09</Month>
<Day>18</Day>
</PubMedPubDate>
<PubMedPubDate PubStatus="revised"><Year>2019</Year>
<Month>11</Month>
<Day>20</Day>
</PubMedPubDate>
<PubMedPubDate PubStatus="accepted"><Year>2019</Year>
<Month>11</Month>
<Day>30</Day>
</PubMedPubDate>
<PubMedPubDate PubStatus="pubmed"><Year>2019</Year>
<Month>12</Month>
<Day>4</Day>
<Hour>6</Hour>
<Minute>0</Minute>
</PubMedPubDate>
<PubMedPubDate PubStatus="medline"><Year>2019</Year>
<Month>12</Month>
<Day>4</Day>
<Hour>6</Hour>
<Minute>0</Minute>
</PubMedPubDate>
<PubMedPubDate PubStatus="entrez"><Year>2019</Year>
<Month>12</Month>
<Day>4</Day>
<Hour>6</Hour>
<Minute>0</Minute>
</PubMedPubDate>
</History>
<PublicationStatus>ppublish</PublicationStatus>
<ArticleIdList><ArticleId IdType="pubmed">31794006</ArticleId>
<ArticleId IdType="pii">5651014</ArticleId>
<ArticleId IdType="doi">10.1093/bioinformatics/btz908</ArticleId>
</ArticleIdList>
</PubmedData>
</pubmed>
</record>
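
As an illustration of how the fields in the record above can be read programmatically, here is a minimal sketch using Python's standard `xml.etree.ElementTree`. It works on an abridged copy of the `<pubmed>` subtree only; the TEI part uses the `nlm:` and `wicri:` prefixes without declaring them in this excerpt, so a namespace-aware parser may reject the full record as shown. The function name `summarize` and the returned dictionary keys are assumptions made for this example.

```python
import xml.etree.ElementTree as ET

# Abridged copy of the <pubmed> subtree from the record above
# (affiliations, dates and history omitted for brevity).
PUBMED_XML = """
<pubmed><MedlineCitation Status="In-Data-Review" Owner="NLM"><PMID Version="1">31794006</PMID>
<Article PubModel="Print"><Journal><ISSN IssnType="Electronic">1367-4811</ISSN>
<JournalIssue CitedMedium="Internet"><Volume>36</Volume>
<Issue>7</Issue>
</JournalIssue>
<Title>Bioinformatics (Oxford, England)</Title>
</Journal>
<ArticleTitle>eCAMI: simultaneous classification and motif identification for enzyme annotation.</ArticleTitle>
<Pagination><MedlinePgn>2068-2075</MedlinePgn>
</Pagination>
<ELocationID EIdType="doi" ValidYN="Y">10.1093/bioinformatics/btz908</ELocationID>
<AuthorList CompleteYN="Y"><Author ValidYN="Y"><LastName>Xu</LastName>
<ForeName>Jing</ForeName>
</Author>
<Author ValidYN="Y"><LastName>Zhang</LastName>
<ForeName>Han</ForeName>
</Author>
<Author ValidYN="Y"><LastName>Zheng</LastName>
<ForeName>Jinfang</ForeName>
</Author>
<Author ValidYN="Y"><LastName>Dovoedo</LastName>
<ForeName>Philippe</ForeName>
</Author>
<Author ValidYN="Y"><LastName>Yin</LastName>
<ForeName>Yanbin</ForeName>
</Author>
</AuthorList>
</Article>
</MedlineCitation>
</pubmed>
"""


def summarize(xml_text):
    """Extract a few citation fields from a <pubmed> element."""
    root = ET.fromstring(xml_text.strip())
    article = root.find("MedlineCitation/Article")
    authors = [
        f"{a.findtext('ForeName')} {a.findtext('LastName')}"
        for a in article.iterfind("AuthorList/Author")
    ]
    return {
        "pmid": root.findtext("MedlineCitation/PMID"),
        "title": article.findtext("ArticleTitle"),
        "doi": article.findtext("ELocationID[@EIdType='doi']"),
        "journal": article.findtext("Journal/Title"),
        "pages": article.findtext("Pagination/MedlinePgn"),
        "authors": authors,
    }


summary = summarize(PUBMED_XML)
print(summary["pmid"], summary["doi"])
print("; ".join(summary["authors"]))
```

The same element paths should apply to the complete `<pubmed>` subtree, since the sketch relies only on elements that appear in the record.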

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Sante/explor/MersV1/Data/PubMed/Corpus

HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000346 | SxmlIndent | more

HfdSelect -h $EXPLOR_AREA/Data/PubMed/Corpus/biblio.hfd -nk 000346 | SxmlIndent | more

To link to this page within the Wicri network

{{Explor lien
   |wiki=    Sante
   |area=    MersV1
   |flux=    PubMed
   |étape=   Corpus
   |type=    RBID
   |clé=     pubmed:31794006
   |texte=   eCAMI: simultaneous classification and motif identification for enzyme annotation.
}}

To generate wiki pages

HfdIndexSelect -h $EXPLOR_AREA/Data/PubMed/Corpus/RBID.i   -Sk "pubmed:31794006" \
       | HfdSelect -Kh $EXPLOR_AREA/Data/PubMed/Corpus/biblio.hfd   \
       | NlmPubMed2Wicri -a MersV1

This area was generated with Dilib version V0.6.33.
Data generation: Mon Apr 20 23:26:43 2020. Site generation: Sat Mar 27 09:06:09 2021

	MERS exploration server
	Warning: this site is under development. It was generated automatically from raw corpora, so the information has not been validated.
