Prediction of functional class of the SARS coronavirus proteins by a statistical learning method.
Internal identifier: 002514 (PubMed/Curation); previous: 002513; next: 002515
Authors: C Z Cai [Singapore]; L Y Han; X. Chen; Z W Cao; Y Z Chen
Source:
- Journal of proteome research [ 1535-3893 ]
French descriptors
- KwdFr :
- Adénosine triphosphate (), Algorithmes, Alignement de séquences, Analyse de séquence de protéine, Bases de données de protéines, Biologie informatique (), Cadres ouverts de lecture, Génome viral, Intelligence artificielle, Logiciel, Modèles statistiques, Protéines virales (), Protéome, Protéomique (), Virus du SRAS ().
- MESH :
- Adénosine triphosphate, Algorithmes, Alignement de séquences, Analyse de séquence de protéine, Bases de données de protéines, Biologie informatique, Cadres ouverts de lecture, Génome viral, Intelligence artificielle, Logiciel, Modèles statistiques, Protéines virales, Protéome, Protéomique, Virus du SRAS.
English descriptors
- KwdEn :
- Adenosine Triphosphate (chemistry); Algorithms; Artificial Intelligence; Computational Biology (methods); Databases, Protein; Genome, Viral; Models, Statistical; Open Reading Frames; Proteome; Proteomics (methods); SARS Virus (chemistry); Sequence Alignment; Sequence Analysis, Protein; Software; Viral Proteins (chemistry).
- MESH :
- chemical, chemistry: Adenosine Triphosphate; Viral Proteins.
- chemistry: SARS Virus.
- methods: Computational Biology; Proteomics.
- Algorithms; Artificial Intelligence; Databases, Protein; Genome, Viral; Models, Statistical; Open Reading Frames; Proteome; Sequence Alignment; Sequence Analysis, Protein; Software.
Abstract
The complete genome of severe acute respiratory syndrome coronavirus (SARS-CoV) reveals the existence of putative proteins unique to SARS-CoV. Identification of their function facilitates a mechanistic understanding of SARS infection and drug development for its treatment. The sequence of the majority of these putative proteins has no significant similarity to those of known proteins, which complicates the task of using sequence analysis tools to probe their function. Support vector machines (SVM), useful for predicting the functional class of distantly related proteins, is employed to ascribe a possible functional class to SARS-CoV proteins. Testing results indicate that SVM is able to predict the functional class of 73% of the known SARS-CoV proteins with available sequences and 67% of 18 other novel viral proteins. A combination of the sequence comparison method BLAST and SVMProt can further improve the prediction accuracy of SVMProt such that the functional class of two additional SARS-CoV proteins is correctly predicted. Our study suggests that the SARS-CoV genome possibly contains a putative voltage-gated ion channel, structural proteins, a carbon-oxygen lyase, oxidoreductases acting on the CH-OH group of donors, and an ATP-binding cassette transporter. A web version of our software, SVMProt, is accessible at http://jing.cz3.nus.edu.sg/cgi-bin/svmprot.cgi .
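An SVM classifier needs each protein expressed as a fixed-length numeric vector rather than as a variable-length sequence. The sketch below shows one simple encoding, amino-acid composition, purely as an illustration; the abstract does not say which descriptors SVMProt uses, so the function name `composition_vector` and the toy peptide are assumptions, not the authors' method.

```python
from collections import Counter

# The 20 standard amino acids in one-letter code.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def composition_vector(sequence: str) -> list[float]:
    """Return the fraction of each standard amino acid in `sequence`.

    Illustrative encoding only; not necessarily the one used by SVMProt.
    """
    # Ignore gaps, ambiguity codes, and any other non-standard symbols.
    residues = [r for r in sequence.upper() if r in AMINO_ACIDS]
    if not residues:
        raise ValueError("sequence contains no standard amino acids")
    counts = Counter(residues)
    return [counts[aa] / len(residues) for aa in AMINO_ACIDS]


# Hypothetical toy peptide, not a SARS-CoV sequence.
features = composition_vector("MKTAYIAKQR")
print(len(features))
```

Vectors built this way have the same length for every protein, so they could be passed, together with known functional-class labels, to any SVM implementation for training.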
DOI: 10.1021/pr050110a
PubMed: 16212442
Links toward previous steps (curation, corpus...)
- to stream PubMed, to step Corpus: to go to this record in the Curation step: 002514
Links to Exploration step
pubmed:16212442
The document in XML format
<record><TEI><teiHeader><fileDesc><titleStmt><title xml:lang="en">Prediction of functional class of the SARS coronavirus proteins by a statistical learning method.</title>
<author><name sortKey="Cai, C Z" sort="Cai, C Z" uniqKey="Cai C" first="C Z" last="Cai">C Z Cai</name>
<affiliation wicri:level="4"><nlm:affiliation>Bioinformatics and Drug Design Group, Department of Computational Science, National University of Singapore, Blk SOC1, Level 7, 3 Science Drive 2, Singapore 117543.</nlm:affiliation>
<orgName type="university">Université nationale de Singapour</orgName>
<country>Singapour</country>
</affiliation>
</author>
<author><name sortKey="Han, L Y" sort="Han, L Y" uniqKey="Han L" first="L Y" last="Han">L Y Han</name>
</author>
<author><name sortKey="Chen, X" sort="Chen, X" uniqKey="Chen X" first="X" last="Chen">X. Chen</name>
</author>
<author><name sortKey="Cao, Z W" sort="Cao, Z W" uniqKey="Cao Z" first="Z W" last="Cao">Z W Cao</name>
</author>
<author><name sortKey="Chen, Y Z" sort="Chen, Y Z" uniqKey="Chen Y" first="Y Z" last="Chen">Y Z Chen</name>
</author>
</titleStmt>
<publicationStmt><idno type="wicri:source">PubMed</idno>
<date when="????"><PubDate><MedlineDate>2005 Sep-Oct</MedlineDate>
</PubDate>
</date>
<idno type="RBID">pubmed:16212442</idno>
<idno type="pmid">16212442</idno>
<idno type="doi">10.1021/pr050110a</idno>
<idno type="wicri:Area/PubMed/Corpus">002514</idno>
<idno type="wicri:explorRef" wicri:stream="PubMed" wicri:step="Corpus" wicri:corpus="PubMed">002514</idno>
<idno type="wicri:Area/PubMed/Curation">002514</idno>
<idno type="wicri:explorRef" wicri:stream="PubMed" wicri:step="Curation">002514</idno>
</publicationStmt>
<sourceDesc><biblStruct><analytic><title xml:lang="en">Prediction of functional class of the SARS coronavirus proteins by a statistical learning method.</title>
<author><name sortKey="Cai, C Z" sort="Cai, C Z" uniqKey="Cai C" first="C Z" last="Cai">C Z Cai</name>
<affiliation wicri:level="4"><nlm:affiliation>Bioinformatics and Drug Design Group, Department of Computational Science, National University of Singapore, Blk SOC1, Level 7, 3 Science Drive 2, Singapore 117543.</nlm:affiliation>
<orgName type="university">Université nationale de Singapour</orgName>
<country>Singapour</country>
</affiliation>
</author>
<author><name sortKey="Han, L Y" sort="Han, L Y" uniqKey="Han L" first="L Y" last="Han">L Y Han</name>
</author>
<author><name sortKey="Chen, X" sort="Chen, X" uniqKey="Chen X" first="X" last="Chen">X. Chen</name>
</author>
<author><name sortKey="Cao, Z W" sort="Cao, Z W" uniqKey="Cao Z" first="Z W" last="Cao">Z W Cao</name>
</author>
<author><name sortKey="Chen, Y Z" sort="Chen, Y Z" uniqKey="Chen Y" first="Y Z" last="Chen">Y Z Chen</name>
</author>
</analytic>
<series><title level="j">Journal of proteome research</title>
<idno type="ISSN">1535-3893</idno>
</series>
</biblStruct>
</sourceDesc>
</fileDesc>
<profileDesc><textClass><keywords scheme="KwdEn" xml:lang="en"><term>Adenosine Triphosphate (chemistry)</term>
<term>Algorithms</term>
<term>Artificial Intelligence</term>
<term>Computational Biology (methods)</term>
<term>Databases, Protein</term>
<term>Genome, Viral</term>
<term>Models, Statistical</term>
<term>Open Reading Frames</term>
<term>Proteome</term>
<term>Proteomics (methods)</term>
<term>SARS Virus (chemistry)</term>
<term>Sequence Alignment</term>
<term>Sequence Analysis, Protein</term>
<term>Software</term>
<term>Viral Proteins (chemistry)</term>
</keywords>
<keywords scheme="KwdFr" xml:lang="fr"><term>Adénosine triphosphate ()</term>
<term>Algorithmes</term>
<term>Alignement de séquences</term>
<term>Analyse de séquence de protéine</term>
<term>Bases de données de protéines</term>
<term>Biologie informatique ()</term>
<term>Cadres ouverts de lecture</term>
<term>Génome viral</term>
<term>Intelligence artificielle</term>
<term>Logiciel</term>
<term>Modèles statistiques</term>
<term>Protéines virales ()</term>
<term>Protéome</term>
<term>Protéomique ()</term>
<term>Virus du SRAS ()</term>
</keywords>
<keywords scheme="MESH" type="chemical" qualifier="chemistry" xml:lang="en"><term>Adenosine Triphosphate</term>
<term>Viral Proteins</term>
</keywords>
<keywords scheme="MESH" qualifier="chemistry" xml:lang="en"><term>SARS Virus</term>
</keywords>
<keywords scheme="MESH" qualifier="methods" xml:lang="en"><term>Computational Biology</term>
<term>Proteomics</term>
</keywords>
<keywords scheme="MESH" xml:lang="en"><term>Algorithms</term>
<term>Artificial Intelligence</term>
<term>Databases, Protein</term>
<term>Genome, Viral</term>
<term>Models, Statistical</term>
<term>Open Reading Frames</term>
<term>Proteome</term>
<term>Sequence Alignment</term>
<term>Sequence Analysis, Protein</term>
<term>Software</term>
</keywords>
<keywords scheme="MESH" xml:lang="fr"><term>Adénosine triphosphate</term>
<term>Algorithmes</term>
<term>Alignement de séquences</term>
<term>Analyse de séquence de protéine</term>
<term>Bases de données de protéines</term>
<term>Biologie informatique</term>
<term>Cadres ouverts de lecture</term>
<term>Génome viral</term>
<term>Intelligence artificielle</term>
<term>Logiciel</term>
<term>Modèles statistiques</term>
<term>Protéines virales</term>
<term>Protéome</term>
<term>Protéomique</term>
<term>Virus du SRAS</term>
</keywords>
</textClass>
</profileDesc>
</teiHeader>
<front><div type="abstract" xml:lang="en">The complete genome of severe acute respiratory syndrome coronavirus (SARS-CoV) reveals the existence of putative proteins unique to SARS-CoV. Identification of their function facilitates a mechanistic understanding of SARS infection and drug development for its treatment. The sequence of the majority of these putative proteins has no significant similarity to those of known proteins, which complicates the task of using sequence analysis tools to probe their function. Support vector machines (SVM), useful for predicting the functional class of distantly related proteins, is employed to ascribe a possible functional class to SARS-CoV proteins. Testing results indicate that SVM is able to predict the functional class of 73% of the known SARS-CoV proteins with available sequences and 67% of 18 other novel viral proteins. A combination of the sequence comparison method BLAST and SVMProt can further improve the prediction accuracy of SVMProt such that the functional class of two additional SARS-CoV proteins is correctly predicted. Our study suggests that the SARS-CoV genome possibly contains a putative voltage-gated ion channel, structural proteins, a carbon-oxygen lyase, oxidoreductases acting on the CH-OH group of donors, and an ATP-binding cassette transporter. A web version of our software, SVMProt, is accessible at http://jing.cz3.nus.edu.sg/cgi-bin/svmprot.cgi .</div>
</front>
</TEI>
<pubmed><MedlineCitation Status="MEDLINE" Owner="NLM"><PMID Version="1">16212442</PMID>
<DateCompleted><Year>2006</Year>
<Month>01</Month>
<Day>03</Day>
</DateCompleted>
<DateRevised><Year>2013</Year>
<Month>11</Month>
<Day>21</Day>
</DateRevised>
<Article PubModel="Print"><Journal><ISSN IssnType="Print">1535-3893</ISSN>
<JournalIssue CitedMedium="Print"><Volume>4</Volume>
<Issue>5</Issue>
<PubDate><MedlineDate>2005 Sep-Oct</MedlineDate>
</PubDate>
</JournalIssue>
<Title>Journal of proteome research</Title>
<ISOAbbreviation>J. Proteome Res.</ISOAbbreviation>
</Journal>
<ArticleTitle>Prediction of functional class of the SARS coronavirus proteins by a statistical learning method.</ArticleTitle>
<Pagination><MedlinePgn>1855-62</MedlinePgn>
</Pagination>
<Abstract><AbstractText>The complete genome of severe acute respiratory syndrome coronavirus (SARS-CoV) reveals the existence of putative proteins unique to SARS-CoV. Identification of their function facilitates a mechanistic understanding of SARS infection and drug development for its treatment. The sequence of the majority of these putative proteins has no significant similarity to those of known proteins, which complicates the task of using sequence analysis tools to probe their function. Support vector machines (SVM), useful for predicting the functional class of distantly related proteins, is employed to ascribe a possible functional class to SARS-CoV proteins. Testing results indicate that SVM is able to predict the functional class of 73% of the known SARS-CoV proteins with available sequences and 67% of 18 other novel viral proteins. A combination of the sequence comparison method BLAST and SVMProt can further improve the prediction accuracy of SVMProt such that the functional class of two additional SARS-CoV proteins is correctly predicted. Our study suggests that the SARS-CoV genome possibly contains a putative voltage-gated ion channel, structural proteins, a carbon-oxygen lyase, oxidoreductases acting on the CH-OH group of donors, and an ATP-binding cassette transporter. A web version of our software, SVMProt, is accessible at http://jing.cz3.nus.edu.sg/cgi-bin/svmprot.cgi .</AbstractText>
</Abstract>
<AuthorList CompleteYN="Y"><Author ValidYN="Y"><LastName>Cai</LastName>
<ForeName>C Z</ForeName>
<Initials>CZ</Initials>
<AffiliationInfo><Affiliation>Bioinformatics and Drug Design Group, Department of Computational Science, National University of Singapore, Blk SOC1, Level 7, 3 Science Drive 2, Singapore 117543.</Affiliation>
</AffiliationInfo>
</Author>
<Author ValidYN="Y"><LastName>Han</LastName>
<ForeName>L Y</ForeName>
<Initials>LY</Initials>
</Author>
<Author ValidYN="Y"><LastName>Chen</LastName>
<ForeName>X</ForeName>
<Initials>X</Initials>
</Author>
<Author ValidYN="Y"><LastName>Cao</LastName>
<ForeName>Z W</ForeName>
<Initials>ZW</Initials>
</Author>
<Author ValidYN="Y"><LastName>Chen</LastName>
<ForeName>Y Z</ForeName>
<Initials>YZ</Initials>
</Author>
</AuthorList>
<Language>eng</Language>
<PublicationTypeList><PublicationType UI="D016428">Journal Article</PublicationType>
<PublicationType UI="D013485">Research Support, Non-U.S. Gov't</PublicationType>
</PublicationTypeList>
</Article>
<MedlineJournalInfo><Country>United States</Country>
<MedlineTA>J Proteome Res</MedlineTA>
<NlmUniqueID>101128775</NlmUniqueID>
<ISSNLinking>1535-3893</ISSNLinking>
</MedlineJournalInfo>
<ChemicalList><Chemical><RegistryNumber>0</RegistryNumber>
<NameOfSubstance UI="D020543">Proteome</NameOfSubstance>
</Chemical>
<Chemical><RegistryNumber>0</RegistryNumber>
<NameOfSubstance UI="D014764">Viral Proteins</NameOfSubstance>
</Chemical>
<Chemical><RegistryNumber>8L70Q75FXE</RegistryNumber>
<NameOfSubstance UI="D000255">Adenosine Triphosphate</NameOfSubstance>
</Chemical>
</ChemicalList>
<CitationSubset>IM</CitationSubset>
<MeshHeadingList><MeshHeading><DescriptorName UI="D000255" MajorTopicYN="N">Adenosine Triphosphate</DescriptorName>
<QualifierName UI="Q000737" MajorTopicYN="N">chemistry</QualifierName>
</MeshHeading>
<MeshHeading><DescriptorName UI="D000465" MajorTopicYN="N">Algorithms</DescriptorName>
</MeshHeading>
<MeshHeading><DescriptorName UI="D001185" MajorTopicYN="N">Artificial Intelligence</DescriptorName>
</MeshHeading>
<MeshHeading><DescriptorName UI="D019295" MajorTopicYN="N">Computational Biology</DescriptorName>
<QualifierName UI="Q000379" MajorTopicYN="Y">methods</QualifierName>
</MeshHeading>
<MeshHeading><DescriptorName UI="D030562" MajorTopicYN="N">Databases, Protein</DescriptorName>
</MeshHeading>
<MeshHeading><DescriptorName UI="D016679" MajorTopicYN="N">Genome, Viral</DescriptorName>
</MeshHeading>
<MeshHeading><DescriptorName UI="D015233" MajorTopicYN="N">Models, Statistical</DescriptorName>
</MeshHeading>
<MeshHeading><DescriptorName UI="D016366" MajorTopicYN="N">Open Reading Frames</DescriptorName>
</MeshHeading>
<MeshHeading><DescriptorName UI="D020543" MajorTopicYN="N">Proteome</DescriptorName>
</MeshHeading>
<MeshHeading><DescriptorName UI="D040901" MajorTopicYN="N">Proteomics</DescriptorName>
<QualifierName UI="Q000379" MajorTopicYN="N">methods</QualifierName>
</MeshHeading>
<MeshHeading><DescriptorName UI="D045473" MajorTopicYN="N">SARS Virus</DescriptorName>
<QualifierName UI="Q000737" MajorTopicYN="Y">chemistry</QualifierName>
</MeshHeading>
<MeshHeading><DescriptorName UI="D016415" MajorTopicYN="N">Sequence Alignment</DescriptorName>
</MeshHeading>
<MeshHeading><DescriptorName UI="D020539" MajorTopicYN="N">Sequence Analysis, Protein</DescriptorName>
</MeshHeading>
<MeshHeading><DescriptorName UI="D012984" MajorTopicYN="N">Software</DescriptorName>
</MeshHeading>
<MeshHeading><DescriptorName UI="D014764" MajorTopicYN="N">Viral Proteins</DescriptorName>
<QualifierName UI="Q000737" MajorTopicYN="Y">chemistry</QualifierName>
</MeshHeading>
</MeshHeadingList>
</MedlineCitation>
<PubmedData><History><PubMedPubDate PubStatus="pubmed"><Year>2005</Year>
<Month>10</Month>
<Day>11</Day>
<Hour>9</Hour>
<Minute>0</Minute>
</PubMedPubDate>
<PubMedPubDate PubStatus="medline"><Year>2006</Year>
<Month>1</Month>
<Day>4</Day>
<Hour>9</Hour>
<Minute>0</Minute>
</PubMedPubDate>
<PubMedPubDate PubStatus="entrez"><Year>2005</Year>
<Month>10</Month>
<Day>11</Day>
<Hour>9</Hour>
<Minute>0</Minute>
</PubMedPubDate>
</History>
<PublicationStatus>ppublish</PublicationStatus>
<ArticleIdList><ArticleId IdType="pubmed">16212442</ArticleId>
<ArticleId IdType="doi">10.1021/pr050110a</ArticleId>
</ArticleIdList>
</PubmedData>
</pubmed>
</record>
To manipulate this document under Unix (Dilib)
EXPLOR_STEP=$WICRI_ROOT/Sante/explor/SrasV1/Data/PubMed/Curation
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 002514 | SxmlIndent | more
Or
HfdSelect -h $EXPLOR_AREA/Data/PubMed/Curation/biblio.hfd -nk 002514 | SxmlIndent | more
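Once a record has been selected, its fields can also be read with a general-purpose XML parser. The following is a minimal sketch in Python, assuming the `<pubmed>` element has been saved as a string; `PUBMED_XML` is an abbreviated excerpt of the record above and `summarize` is a hypothetical helper, not part of Dilib. Only the `<pubmed>` part is parsed because the TEI part, as shown here, uses `wicri:` and `nlm:` prefixes without namespace declarations, which a strict parser would reject.

```python
import xml.etree.ElementTree as ET

# Abbreviated excerpt of the <pubmed> element from the record above.
PUBMED_XML = """
<pubmed><MedlineCitation Status="MEDLINE" Owner="NLM"><PMID Version="1">16212442</PMID>
<Article PubModel="Print">
<ArticleTitle>Prediction of functional class of the SARS coronavirus proteins by a statistical learning method.</ArticleTitle>
</Article>
<MeshHeadingList>
<MeshHeading><DescriptorName UI="D000465" MajorTopicYN="N">Algorithms</DescriptorName>
</MeshHeading>
<MeshHeading><DescriptorName UI="D045473" MajorTopicYN="N">SARS Virus</DescriptorName>
<QualifierName UI="Q000737" MajorTopicYN="Y">chemistry</QualifierName>
</MeshHeading>
</MeshHeadingList>
</MedlineCitation>
</pubmed>
"""


def summarize(xml_text):
    """Extract the PMID, title, and MeSH headings from a <pubmed> element."""
    root = ET.fromstring(xml_text.strip())
    citation = root.find("MedlineCitation")
    headings = []
    for heading in citation.iter("MeshHeading"):
        descriptor = heading.findtext("DescriptorName")
        # A heading may carry zero or more qualifiers.
        qualifiers = [q.text for q in heading.findall("QualifierName")]
        headings.append((descriptor, qualifiers))
    return {
        "pmid": citation.findtext("PMID"),
        "title": citation.findtext("Article/ArticleTitle"),
        "mesh": headings,
    }


summary = summarize(PUBMED_XML)
print(summary["pmid"], summary["mesh"])
```

Each MeSH heading is returned as a (descriptor, qualifiers) pair, mirroring the way the record groups `QualifierName` elements under their `DescriptorName`.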
To link to this page within the Wicri network
{{Explor lien |wiki= Sante |area= SrasV1 |flux= PubMed |étape= Curation |type= RBID |clé= pubmed:16212442 |texte= Prediction of functional class of the SARS coronavirus proteins by a statistical learning method. }}
To generate wiki pages
HfdIndexSelect -h $EXPLOR_AREA/Data/PubMed/Curation/RBID.i -Sk "pubmed:16212442" \
| HfdSelect -Kh $EXPLOR_AREA/Data/PubMed/Curation/biblio.hfd \
| NlmPubMed2Wicri -a SrasV1
This area was generated with Dilib version V0.6.33.