SARS exploration server

Warning: this site is under development!
Warning: this site was generated automatically from raw corpora.
The information has therefore not been validated.

Prediction of functional class of the SARS coronavirus proteins by a statistical learning method.

Internal identifier: 001190 (Ncbi/Merge); previous: 001189; next: 001191

Authors: C Z Cai [Singapore]; L Y Han; X. Chen; Z W Cao; Y Z Chen

Source: Journal of proteome research [1535-3893]; 2005 Sep-Oct; 4(5): 1855-62

RBID : pubmed:16212442

French descriptors

English descriptors

Abstract

The complete genome of severe acute respiratory syndrome coronavirus (SARS-CoV) reveals the existence of putative proteins unique to SARS-CoV. Identification of their function facilitates a mechanistic understanding of SARS infection and drug development for its treatment. The sequence of the majority of these putative proteins has no significant similarity to those of known proteins, which complicates the task of using sequence analysis tools to probe their function. Support vector machines (SVM), useful for predicting the functional class of distantly related proteins, is employed to ascribe a possible functional class to SARS-CoV proteins. Testing results indicate that SVM is able to predict the functional class of 73% of the known SARS-CoV proteins with available sequences and 67% of 18 other novel viral proteins. A combination of the sequence comparison method BLAST and SVMProt can further improve the prediction accuracy of SVMProt such that the functional class of two additional SARS-CoV proteins is correctly predicted. Our study suggests that the SARS-CoV genome possibly contains a putative voltage-gated ion channel, structural proteins, a carbon-oxygen lyase, oxidoreductases acting on the CH-OH group of donors, and an ATP-binding cassette transporter. A web version of our software, SVMProt, is accessible at http://jing.cz3.nus.edu.sg/cgi-bin/svmprot.cgi .
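
To make the approach described in the abstract more concrete, here is a minimal sketch of a two-class linear SVM trained on amino acid composition vectors. It is not the SVMProt implementation: the abstract does not state which sequence descriptors or kernel are used, so the composition feature, the plain sub-gradient training loop, the toy sequences, and their class labels are all assumptions made for illustration.

```python
# Illustrative only: NOT the SVMProt code. Features, training loop,
# sequences and labels are assumptions chosen to keep the sketch small.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def composition(seq):
    """Return the fraction of each of the 20 standard residues in seq."""
    seq = seq.upper()
    counts = [seq.count(a) for a in AMINO_ACIDS]
    total = sum(counts) or 1  # avoid division by zero on empty input
    return [c / total for c in counts]


def train_linear_svm(X, y, lam=0.01, eta=0.1, epochs=200):
    """Minimise the L2-regularised hinge loss by sub-gradient descent."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            # Shrink the weights (regularisation term).
            w = [(1.0 - eta * lam) * wj for wj in w]
            if margin < 1.0:
                # Sample lies inside the margin: move the boundary away from it.
                w = [wj + eta * yi * xj for wj, xj in zip(w, xi)]
                b += eta * yi
    return w, b


def predict(w, b, seq):
    """Return +1 or -1 for a raw sequence."""
    score = sum(wj * xj for wj, xj in zip(w, composition(seq))) + b
    return 1 if score >= 0 else -1


# Hypothetical toy training set: +1 = hydrophobic-rich, -1 = charged-rich.
TRAIN = [
    ("LLLLVVVVAAAAIIII", 1),
    ("FFLLVVIIAALLVVII", 1),
    ("KKKKDDDDEEEERRRR", -1),
    ("DDEEKKRRDDEEKKHH", -1),
]

if __name__ == "__main__":
    X = [composition(s) for s, _ in TRAIN]
    y = [label for _, label in TRAIN]
    w, b = train_linear_svm(X, y)
    for s in ("IVLLAFVVLI", "DEKRDEKKRE"):
        print(s, predict(w, b, s))
```

A real functional-class predictor would need one such classifier per class, far richer descriptors, and curated positive and negative training sets; the sketch only shows the mechanics of mapping a sequence to a fixed-length vector and separating two groups with a margin.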

DOI: 10.1021/pr050110a
PubMed: 16212442

The document in XML format

<record>
<TEI>
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en">Prediction of functional class of the SARS coronavirus proteins by a statistical learning method.</title>
<author>
<name sortKey="Cai, C Z" sort="Cai, C Z" uniqKey="Cai C" first="C Z" last="Cai">C Z Cai</name>
<affiliation wicri:level="4">
<nlm:affiliation>Bioinformatics and Drug Design Group, Department of Computational Science, National University of Singapore, Blk SOC1, Level 7, 3 Science Drive 2, Singapore 117543.</nlm:affiliation>
<orgName type="university">Université nationale de Singapour</orgName>
<country>Singapour</country>
</affiliation>
</author>
<author>
<name sortKey="Han, L Y" sort="Han, L Y" uniqKey="Han L" first="L Y" last="Han">L Y Han</name>
</author>
<author>
<name sortKey="Chen, X" sort="Chen, X" uniqKey="Chen X" first="X" last="Chen">X. Chen</name>
</author>
<author>
<name sortKey="Cao, Z W" sort="Cao, Z W" uniqKey="Cao Z" first="Z W" last="Cao">Z W Cao</name>
</author>
<author>
<name sortKey="Chen, Y Z" sort="Chen, Y Z" uniqKey="Chen Y" first="Y Z" last="Chen">Y Z Chen</name>
</author>
</titleStmt>
<publicationStmt>
<idno type="wicri:source">PubMed</idno>
<date when="2005">
<PubDate>
<MedlineDate>2005 Sep-Oct</MedlineDate>
</PubDate>
</date>
<idno type="RBID">pubmed:16212442</idno>
<idno type="pmid">16212442</idno>
<idno type="doi">10.1021/pr050110a</idno>
<idno type="wicri:Area/PubMed/Corpus">002514</idno>
<idno type="wicri:explorRef" wicri:stream="PubMed" wicri:step="Corpus" wicri:corpus="PubMed">002514</idno>
<idno type="wicri:Area/PubMed/Curation">002514</idno>
<idno type="wicri:explorRef" wicri:stream="PubMed" wicri:step="Curation">002514</idno>
<idno type="wicri:Area/PubMed/Checkpoint">003521</idno>
<idno type="wicri:explorRef" wicri:stream="Checkpoint" wicri:step="PubMed">003521</idno>
<idno type="wicri:Area/Ncbi/Merge">001190</idno>
</publicationStmt>
<sourceDesc>
<biblStruct>
<analytic>
<title xml:lang="en">Prediction of functional class of the SARS coronavirus proteins by a statistical learning method.</title>
<author>
<name sortKey="Cai, C Z" sort="Cai, C Z" uniqKey="Cai C" first="C Z" last="Cai">C Z Cai</name>
<affiliation wicri:level="4">
<nlm:affiliation>Bioinformatics and Drug Design Group, Department of Computational Science, National University of Singapore, Blk SOC1, Level 7, 3 Science Drive 2, Singapore 117543.</nlm:affiliation>
<orgName type="university">Université nationale de Singapour</orgName>
<country>Singapour</country>
</affiliation>
</author>
<author>
<name sortKey="Han, L Y" sort="Han, L Y" uniqKey="Han L" first="L Y" last="Han">L Y Han</name>
</author>
<author>
<name sortKey="Chen, X" sort="Chen, X" uniqKey="Chen X" first="X" last="Chen">X. Chen</name>
</author>
<author>
<name sortKey="Cao, Z W" sort="Cao, Z W" uniqKey="Cao Z" first="Z W" last="Cao">Z W Cao</name>
</author>
<author>
<name sortKey="Chen, Y Z" sort="Chen, Y Z" uniqKey="Chen Y" first="Y Z" last="Chen">Y Z Chen</name>
</author>
</analytic>
<series>
<title level="j">Journal of proteome research</title>
<idno type="ISSN">1535-3893</idno>
</series>
</biblStruct>
</sourceDesc>
</fileDesc>
<profileDesc>
<textClass>
<keywords scheme="KwdEn" xml:lang="en">
<term>Adenosine Triphosphate (chemistry)</term>
<term>Algorithms</term>
<term>Artificial Intelligence</term>
<term>Computational Biology (methods)</term>
<term>Databases, Protein</term>
<term>Genome, Viral</term>
<term>Models, Statistical</term>
<term>Open Reading Frames</term>
<term>Proteome</term>
<term>Proteomics (methods)</term>
<term>SARS Virus (chemistry)</term>
<term>Sequence Alignment</term>
<term>Sequence Analysis, Protein</term>
<term>Software</term>
<term>Viral Proteins (chemistry)</term>
</keywords>
<keywords scheme="KwdFr" xml:lang="fr">
<term>Adénosine triphosphate</term>
<term>Algorithmes</term>
<term>Alignement de séquences</term>
<term>Analyse de séquence de protéine</term>
<term>Bases de données de protéines</term>
<term>Biologie informatique</term>
<term>Cadres ouverts de lecture</term>
<term>Génome viral</term>
<term>Intelligence artificielle</term>
<term>Logiciel</term>
<term>Modèles statistiques</term>
<term>Protéines virales</term>
<term>Protéome</term>
<term>Protéomique</term>
<term>Virus du SRAS</term>
</keywords>
<keywords scheme="MESH" type="chemical" qualifier="chemistry" xml:lang="en">
<term>Adenosine Triphosphate</term>
<term>Viral Proteins</term>
</keywords>
<keywords scheme="MESH" qualifier="chemistry" xml:lang="en">
<term>SARS Virus</term>
</keywords>
<keywords scheme="MESH" qualifier="methods" xml:lang="en">
<term>Computational Biology</term>
<term>Proteomics</term>
</keywords>
<keywords scheme="MESH" xml:lang="en">
<term>Algorithms</term>
<term>Artificial Intelligence</term>
<term>Databases, Protein</term>
<term>Genome, Viral</term>
<term>Models, Statistical</term>
<term>Open Reading Frames</term>
<term>Proteome</term>
<term>Sequence Alignment</term>
<term>Sequence Analysis, Protein</term>
<term>Software</term>
</keywords>
<keywords scheme="MESH" xml:lang="fr">
<term>Adénosine triphosphate</term>
<term>Algorithmes</term>
<term>Alignement de séquences</term>
<term>Analyse de séquence de protéine</term>
<term>Bases de données de protéines</term>
<term>Biologie informatique</term>
<term>Cadres ouverts de lecture</term>
<term>Génome viral</term>
<term>Intelligence artificielle</term>
<term>Logiciel</term>
<term>Modèles statistiques</term>
<term>Protéines virales</term>
<term>Protéome</term>
<term>Protéomique</term>
<term>Virus du SRAS</term>
</keywords>
</textClass>
</profileDesc>
</teiHeader>
<front>
<div type="abstract" xml:lang="en">The complete genome of severe acute respiratory syndrome coronavirus (SARS-CoV) reveals the existence of putative proteins unique to SARS-CoV. Identification of their function facilitates a mechanistic understanding of SARS infection and drug development for its treatment. The sequence of the majority of these putative proteins has no significant similarity to those of known proteins, which complicates the task of using sequence analysis tools to probe their function. Support vector machines (SVM), useful for predicting the functional class of distantly related proteins, is employed to ascribe a possible functional class to SARS-CoV proteins. Testing results indicate that SVM is able to predict the functional class of 73% of the known SARS-CoV proteins with available sequences and 67% of 18 other novel viral proteins. A combination of the sequence comparison method BLAST and SVMProt can further improve the prediction accuracy of SVMProt such that the functional class of two additional SARS-CoV proteins is correctly predicted. Our study suggests that the SARS-CoV genome possibly contains a putative voltage-gated ion channel, structural proteins, a carbon-oxygen lyase, oxidoreductases acting on the CH-OH group of donors, and an ATP-binding cassette transporter. A web version of our software, SVMProt, is accessible at http://jing.cz3.nus.edu.sg/cgi-bin/svmprot.cgi .</div>
</front>
</TEI>
<pubmed>
<MedlineCitation Status="MEDLINE" Owner="NLM">
<PMID Version="1">16212442</PMID>
<DateCompleted>
<Year>2006</Year>
<Month>01</Month>
<Day>03</Day>
</DateCompleted>
<DateRevised>
<Year>2013</Year>
<Month>11</Month>
<Day>21</Day>
</DateRevised>
<Article PubModel="Print">
<Journal>
<ISSN IssnType="Print">1535-3893</ISSN>
<JournalIssue CitedMedium="Print">
<Volume>4</Volume>
<Issue>5</Issue>
<PubDate>
<MedlineDate>2005 Sep-Oct</MedlineDate>
</PubDate>
</JournalIssue>
<Title>Journal of proteome research</Title>
<ISOAbbreviation>J. Proteome Res.</ISOAbbreviation>
</Journal>
<ArticleTitle>Prediction of functional class of the SARS coronavirus proteins by a statistical learning method.</ArticleTitle>
<Pagination>
<MedlinePgn>1855-62</MedlinePgn>
</Pagination>
<Abstract>
<AbstractText>The complete genome of severe acute respiratory syndrome coronavirus (SARS-CoV) reveals the existence of putative proteins unique to SARS-CoV. Identification of their function facilitates a mechanistic understanding of SARS infection and drug development for its treatment. The sequence of the majority of these putative proteins has no significant similarity to those of known proteins, which complicates the task of using sequence analysis tools to probe their function. Support vector machines (SVM), useful for predicting the functional class of distantly related proteins, is employed to ascribe a possible functional class to SARS-CoV proteins. Testing results indicate that SVM is able to predict the functional class of 73% of the known SARS-CoV proteins with available sequences and 67% of 18 other novel viral proteins. A combination of the sequence comparison method BLAST and SVMProt can further improve the prediction accuracy of SVMProt such that the functional class of two additional SARS-CoV proteins is correctly predicted. Our study suggests that the SARS-CoV genome possibly contains a putative voltage-gated ion channel, structural proteins, a carbon-oxygen lyase, oxidoreductases acting on the CH-OH group of donors, and an ATP-binding cassette transporter. A web version of our software, SVMProt, is accessible at http://jing.cz3.nus.edu.sg/cgi-bin/svmprot.cgi .</AbstractText>
</Abstract>
<AuthorList CompleteYN="Y">
<Author ValidYN="Y">
<LastName>Cai</LastName>
<ForeName>C Z</ForeName>
<Initials>CZ</Initials>
<AffiliationInfo>
<Affiliation>Bioinformatics and Drug Design Group, Department of Computational Science, National University of Singapore, Blk SOC1, Level 7, 3 Science Drive 2, Singapore 117543.</Affiliation>
</AffiliationInfo>
</Author>
<Author ValidYN="Y">
<LastName>Han</LastName>
<ForeName>L Y</ForeName>
<Initials>LY</Initials>
</Author>
<Author ValidYN="Y">
<LastName>Chen</LastName>
<ForeName>X</ForeName>
<Initials>X</Initials>
</Author>
<Author ValidYN="Y">
<LastName>Cao</LastName>
<ForeName>Z W</ForeName>
<Initials>ZW</Initials>
</Author>
<Author ValidYN="Y">
<LastName>Chen</LastName>
<ForeName>Y Z</ForeName>
<Initials>YZ</Initials>
</Author>
</AuthorList>
<Language>eng</Language>
<PublicationTypeList>
<PublicationType UI="D016428">Journal Article</PublicationType>
<PublicationType UI="D013485">Research Support, Non-U.S. Gov't</PublicationType>
</PublicationTypeList>
</Article>
<MedlineJournalInfo>
<Country>United States</Country>
<MedlineTA>J Proteome Res</MedlineTA>
<NlmUniqueID>101128775</NlmUniqueID>
<ISSNLinking>1535-3893</ISSNLinking>
</MedlineJournalInfo>
<ChemicalList>
<Chemical>
<RegistryNumber>0</RegistryNumber>
<NameOfSubstance UI="D020543">Proteome</NameOfSubstance>
</Chemical>
<Chemical>
<RegistryNumber>0</RegistryNumber>
<NameOfSubstance UI="D014764">Viral Proteins</NameOfSubstance>
</Chemical>
<Chemical>
<RegistryNumber>8L70Q75FXE</RegistryNumber>
<NameOfSubstance UI="D000255">Adenosine Triphosphate</NameOfSubstance>
</Chemical>
</ChemicalList>
<CitationSubset>IM</CitationSubset>
<MeshHeadingList>
<MeshHeading>
<DescriptorName UI="D000255" MajorTopicYN="N">Adenosine Triphosphate</DescriptorName>
<QualifierName UI="Q000737" MajorTopicYN="N">chemistry</QualifierName>
</MeshHeading>
<MeshHeading>
<DescriptorName UI="D000465" MajorTopicYN="N">Algorithms</DescriptorName>
</MeshHeading>
<MeshHeading>
<DescriptorName UI="D001185" MajorTopicYN="N">Artificial Intelligence</DescriptorName>
</MeshHeading>
<MeshHeading>
<DescriptorName UI="D019295" MajorTopicYN="N">Computational Biology</DescriptorName>
<QualifierName UI="Q000379" MajorTopicYN="Y">methods</QualifierName>
</MeshHeading>
<MeshHeading>
<DescriptorName UI="D030562" MajorTopicYN="N">Databases, Protein</DescriptorName>
</MeshHeading>
<MeshHeading>
<DescriptorName UI="D016679" MajorTopicYN="N">Genome, Viral</DescriptorName>
</MeshHeading>
<MeshHeading>
<DescriptorName UI="D015233" MajorTopicYN="N">Models, Statistical</DescriptorName>
</MeshHeading>
<MeshHeading>
<DescriptorName UI="D016366" MajorTopicYN="N">Open Reading Frames</DescriptorName>
</MeshHeading>
<MeshHeading>
<DescriptorName UI="D020543" MajorTopicYN="N">Proteome</DescriptorName>
</MeshHeading>
<MeshHeading>
<DescriptorName UI="D040901" MajorTopicYN="N">Proteomics</DescriptorName>
<QualifierName UI="Q000379" MajorTopicYN="N">methods</QualifierName>
</MeshHeading>
<MeshHeading>
<DescriptorName UI="D045473" MajorTopicYN="N">SARS Virus</DescriptorName>
<QualifierName UI="Q000737" MajorTopicYN="Y">chemistry</QualifierName>
</MeshHeading>
<MeshHeading>
<DescriptorName UI="D016415" MajorTopicYN="N">Sequence Alignment</DescriptorName>
</MeshHeading>
<MeshHeading>
<DescriptorName UI="D020539" MajorTopicYN="N">Sequence Analysis, Protein</DescriptorName>
</MeshHeading>
<MeshHeading>
<DescriptorName UI="D012984" MajorTopicYN="N">Software</DescriptorName>
</MeshHeading>
<MeshHeading>
<DescriptorName UI="D014764" MajorTopicYN="N">Viral Proteins</DescriptorName>
<QualifierName UI="Q000737" MajorTopicYN="Y">chemistry</QualifierName>
</MeshHeading>
</MeshHeadingList>
</MedlineCitation>
<PubmedData>
<History>
<PubMedPubDate PubStatus="pubmed">
<Year>2005</Year>
<Month>10</Month>
<Day>11</Day>
<Hour>9</Hour>
<Minute>0</Minute>
</PubMedPubDate>
<PubMedPubDate PubStatus="medline">
<Year>2006</Year>
<Month>1</Month>
<Day>4</Day>
<Hour>9</Hour>
<Minute>0</Minute>
</PubMedPubDate>
<PubMedPubDate PubStatus="entrez">
<Year>2005</Year>
<Month>10</Month>
<Day>11</Day>
<Hour>9</Hour>
<Minute>0</Minute>
</PubMedPubDate>
</History>
<PublicationStatus>ppublish</PublicationStatus>
<ArticleIdList>
<ArticleId IdType="pubmed">16212442</ArticleId>
<ArticleId IdType="doi">10.1021/pr050110a</ArticleId>
</ArticleIdList>
</PubmedData>
</pubmed>
<affiliations>
<list>
<country>
<li>Singapour</li>
</country>
<orgName>
<li>Université nationale de Singapour</li>
</orgName>
</list>
<tree>
<noCountry>
<name sortKey="Cao, Z W" sort="Cao, Z W" uniqKey="Cao Z" first="Z W" last="Cao">Z W Cao</name>
<name sortKey="Chen, X" sort="Chen, X" uniqKey="Chen X" first="X" last="Chen">X. Chen</name>
<name sortKey="Chen, Y Z" sort="Chen, Y Z" uniqKey="Chen Y" first="Y Z" last="Chen">Y Z Chen</name>
<name sortKey="Han, L Y" sort="Han, L Y" uniqKey="Han L" first="L Y" last="Han">L Y Han</name>
</noCountry>
<country name="Singapour">
<noRegion>
<name sortKey="Cai, C Z" sort="Cai, C Z" uniqKey="Cai C" first="C Z" last="Cai">C Z Cai</name>
</noRegion>
</country>
</tree>
</affiliations>
</record>
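
For readers without Dilib, the bibliographic fields can also be pulled out of the record with a general-purpose XML parser. The sketch below uses Python's standard `xml.etree.ElementTree` on a trimmed copy of the `<pubmed>` element shown above. Two caveats: the TEI part of the record uses the `wicri:` and `nlm:` prefixes without declaring them, so a strict parser would likely reject the full `<record>` as is; and the `summarize` helper and the trimmed fragment are illustrative assumptions, not part of the Wicri tooling.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the <pubmed> element from the record above (illustrative).
PUBMED_FRAGMENT = """
<pubmed>
<MedlineCitation Status="MEDLINE" Owner="NLM">
<PMID Version="1">16212442</PMID>
<Article PubModel="Print">
<ArticleTitle>Prediction of functional class of the SARS coronavirus proteins by a statistical learning method.</ArticleTitle>
<AuthorList CompleteYN="Y">
<Author ValidYN="Y"><LastName>Cai</LastName><ForeName>C Z</ForeName></Author>
<Author ValidYN="Y"><LastName>Han</LastName><ForeName>L Y</ForeName></Author>
<Author ValidYN="Y"><LastName>Chen</LastName><ForeName>X</ForeName></Author>
<Author ValidYN="Y"><LastName>Cao</LastName><ForeName>Z W</ForeName></Author>
<Author ValidYN="Y"><LastName>Chen</LastName><ForeName>Y Z</ForeName></Author>
</AuthorList>
</Article>
</MedlineCitation>
<PubmedData>
<ArticleIdList>
<ArticleId IdType="pubmed">16212442</ArticleId>
<ArticleId IdType="doi">10.1021/pr050110a</ArticleId>
</ArticleIdList>
</PubmedData>
</pubmed>
"""


def summarize(xml_text):
    """Extract PMID, title, author names and DOI from a <pubmed> element."""
    root = ET.fromstring(xml_text.strip())
    authors = [
        f"{a.findtext('ForeName')} {a.findtext('LastName')}"
        for a in root.iter("Author")
    ]
    # Map each identifier type ("pubmed", "doi", ...) to its value.
    ids = {i.get("IdType"): i.text for i in root.iter("ArticleId")}
    return {
        "pmid": root.findtext("MedlineCitation/PMID"),
        "title": root.findtext(".//ArticleTitle"),
        "authors": authors,
        "doi": ids.get("doi"),
    }


if __name__ == "__main__":
    for key, value in summarize(PUBMED_FRAGMENT).items():
        print(f"{key}: {value}")
```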

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Sante/explor/SrasV1/Data/Ncbi/Merge
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 001190 | SxmlIndent | more

Or

HfdSelect -h $EXPLOR_AREA/Data/Ncbi/Merge/biblio.hfd -nk 001190 | SxmlIndent | more

To link to this page within the Wicri network

{{Explor lien
   |wiki=    Sante
   |area=    SrasV1
   |flux=    Ncbi
   |étape=   Merge
   |type=    RBID
   |clé=     pubmed:16212442
   |texte=   Prediction of functional class of the SARS coronavirus proteins by a statistical learning method.
}}
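
When many records need such a link, the template text can be generated rather than typed by hand. The following is a small sketch, assuming only the parameter names and layout visible in the template above; the `explor_lien` function and its argument names are hypothetical and are not part of Dilib or Wicri.

```python
def explor_lien(wiki, area, flux, etape, type_, cle, texte):
    """Build the text of an {{Explor lien}} template call.

    Parameter names (wiki, area, flux, étape, type, clé, texte) are copied
    from the template shown above; the helper itself is hypothetical.
    """
    fields = [
        ("wiki", wiki),
        ("area", area),
        ("flux", flux),
        ("étape", etape),
        ("type", type_),
        ("clé", cle),
        ("texte", texte),
    ]
    lines = ["{{Explor lien"]
    for name, value in fields:
        # Pad "name=" to 9 characters so the values line up as in the original.
        lines.append(f"   |{name + '=':<9}{value}")
    lines.append("}}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(explor_lien(
        wiki="Sante",
        area="SrasV1",
        flux="Ncbi",
        etape="Merge",
        type_="RBID",
        cle="pubmed:16212442",
        texte="Prediction of functional class of the SARS coronavirus "
              "proteins by a statistical learning method.",
    ))
```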

To generate wiki pages

HfdIndexSelect -h $EXPLOR_AREA/Data/Ncbi/Merge/RBID.i   -Sk "pubmed:16212442" \
       | HfdSelect -Kh $EXPLOR_AREA/Data/Ncbi/Merge/biblio.hfd   \
       | NlmPubMed2Wicri -a SrasV1 

Wicri

This area was generated with Dilib version V0.6.33.
Data generation: Tue Apr 28 14:49:16 2020. Site generation: Sat Mar 27 22:06:49 2021