
Amino acid classification based spectrum kernel fusion for protein subnuclear localization

Internal identifier: 000A87 (Pmc/Curation); previous: 000A86; next: 000A88

Authors: Suyu Mei [People's Republic of China]; Wang Fei [People's Republic of China]

Source:

BMC Bioinformatics [1471-2105]; 2010.

RBID : PMC:3009488

Abstract

Background

Predicting protein localization in subnuclear organelles is more challenging than general protein subcellular localization. To the best of our knowledge, only three computational models for protein subnuclear localization exist thus far. Two of them are based on protein primary sequence only. The first model assumed a homogeneous amino acid substitution pattern across all residue sites and used BLOSUM62 to encode the k-mers of a protein sequence. An ensemble of SVMs based on different k-mers made the final prediction, achieving 50% overall accuracy. This simplifying assumption did not exploit the protein sequence profile and ignored the heterogeneity of amino acid substitution patterns across sites. The second model derived the PsePSSM feature representation from the protein sequence by simply averaging the PSSM profile and combined it with the PseAA feature representation to construct a kNN ensemble classifier, Nuc-PLoc, achieving 67.4% overall accuracy. Both sequence-only models achieved relatively poor predictive performance. The third model required GO annotations to be available, which restricts its applicability.

Methods

In this paper, we use only the amino acid information of the protein sequence, without any other information, to design a widely applicable model for protein subnuclear localization. We use the K-spectrum kernel to exploit the contextual information around an amino acid and the conserved motif information. Besides expanding the window size, we adopt various amino acid classification approaches to capture diverse aspects of amino acid physicochemical properties. Each amino acid classification generates a series of spectrum kernels based on different window sizes. Thus, (I) window expansion can capture more contextual information and cover size-varying motifs; (II) various amino acid classifications can exploit multi-aspect biological information from the protein sequence. Finally, we combine all the spectrum kernels by simple addition into a single kernel, called SpectrumKernel+, for protein subnuclear localization.
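The kernel construction described in this paragraph can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the two residue groupings (`IDENTITY`, `HYDROPATHY`), all function names, and the absence of any normalization are hypothetical choices, since this record does not list the classification schemes or window sizes the paper actually used.

```python
from collections import Counter
from typing import Dict, List, Sequence

# Hypothetical amino acid classifications, for illustration only.
# The paper's actual schemes are not given in this record.
IDENTITY: Dict[str, str] = {aa: aa for aa in "ACDEFGHIKLMNPQRSTVWY"}
HYDROPATHY: Dict[str, str] = {
    **{aa: "h" for aa in "AVLIMFWC"},  # hydrophobic (assumed grouping)
    **{aa: "p" for aa in "STNQYGPH"},  # polar or neutral (assumed grouping)
    **{aa: "c" for aa in "DEKR"},      # charged (assumed grouping)
}


def reduce_sequence(seq: str, mapping: Dict[str, str]) -> str:
    """Rewrite a sequence over the class alphabet; unknown residues become '?'."""
    return "".join(mapping.get(aa, "?") for aa in seq)


def spectrum(seq: str, k: int) -> Counter:
    """Count every contiguous k-mer (window of size k) in the sequence."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def spectrum_kernel(a: str, b: str, k: int) -> float:
    """k-spectrum kernel: inner product of the two k-mer count vectors."""
    sa, sb = spectrum(a, k), spectrum(b, k)
    return float(sum(count * sb[kmer] for kmer, count in sa.items()))


def summed_kernel(a: str, b: str,
                  mappings: Sequence[Dict[str, str]],
                  ks: Sequence[int]) -> float:
    """Add the spectrum kernels over every (classification, window size) pair."""
    total = 0.0
    for mapping in mappings:
        ra, rb = reduce_sequence(a, mapping), reduce_sequence(b, mapping)
        for k in ks:
            total += spectrum_kernel(ra, rb, k)
    return total


def gram_matrix(seqs: Sequence[str],
                mappings: Sequence[Dict[str, str]],
                ks: Sequence[int]) -> List[List[float]]:
    """Pairwise summed-kernel values for a list of sequences."""
    return [[summed_kernel(a, b, mappings, ks) for b in seqs] for a in seqs]
```

A Gram matrix built this way could be handed to any classifier that accepts a precomputed kernel, for example scikit-learn's `SVC(kernel="precomputed")`; whether the paper normalizes or weights the individual kernels before summing is not stated here.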

Results

We evaluate performance on two benchmark datasets: Lei and Nuc-PLoc. Experimental results show that SpectrumKernel+ substantially improves on the previous model Nuc-PLoc, with an overall accuracy of 83.47% against 67.4%; and 71.23% against 50% for Lei SVM Ensemble and 66.50% for Lei GO SVM Ensemble.

Conclusion

SpectrumKernel+ can exploit the rich amino acid information of a protein sequence by embedding the multi-aspect amino acid physicochemical properties, captured by amino acid classification approaches, into implicit size-varying motifs. The kernels derived from diverse amino acid classification approaches and different k-mer sizes are summed together for data integration. Experiments show that SpectrumKernel+ significantly outperforms the existing models for protein subnuclear localization.
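One way to write the kernel summation mentioned here, offered as a hedged formalization rather than the paper's own notation: let $c \in C$ index the amino acid classifications (each applied residue by residue to a sequence), $k \in W$ the window sizes, and $\Phi_k$ the map from a string to its vector of k-mer counts. The symbols $C$, $W$, $c$, and $\Phi_k$ are assumptions introduced for illustration.

```latex
K_{k}^{(c)}(x, y) = \left\langle \Phi_k\bigl(c(x)\bigr),\, \Phi_k\bigl(c(y)\bigr) \right\rangle,
\qquad
K_{+}(x, y) = \sum_{c \in C} \sum_{k \in W} K_{k}^{(c)}(x, y)
```

Each term is an inner product of explicit feature vectors and is therefore a valid (positive semidefinite) kernel; a sum of such kernels is again a valid kernel and equals the inner product of the concatenated feature vectors. Any normalization or weighting of the terms is not specified in this abstract and is omitted.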

Url:

http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3009488

DOI: 10.1186/1471-2105-11-S1-S17
PubMed: 20122188
PubMed Central: 3009488

Links toward previous steps (curation, corpus...)

to stream Pmc, to step Corpus: go to this record in the Curation step: 000A87

Links to Exploration step

PMC:3009488

The document in XML format

<record><TEI><teiHeader><fileDesc><titleStmt><title xml:lang="en">Amino acid classification based spectrum kernel fusion for protein subnuclear localization</title>
<author><name sortKey="Mei, Suyu" sort="Mei, Suyu" uniqKey="Mei S" first="Suyu" last="Mei">Suyu Mei</name>
<affiliation wicri:level="1"><nlm:aff id="I1">Shanghai Key Laboratory of Intelligent Information Processing, School of Computer Science, Fudan University, Shanghai, PR China</nlm:aff>
<country xml:lang="fr">République populaire de Chine</country>
<wicri:regionArea>Shanghai Key Laboratory of Intelligent Information Processing, School of Computer Science, Fudan University, Shanghai</wicri:regionArea>
</affiliation>
</author>
<author><name sortKey="Fei, Wang" sort="Fei, Wang" uniqKey="Fei W" first="Wang" last="Fei">Wang Fei</name>
<affiliation wicri:level="1"><nlm:aff id="I1">Shanghai Key Laboratory of Intelligent Information Processing, School of Computer Science, Fudan University, Shanghai, PR China</nlm:aff>
<country xml:lang="fr">République populaire de Chine</country>
<wicri:regionArea>Shanghai Key Laboratory of Intelligent Information Processing, School of Computer Science, Fudan University, Shanghai</wicri:regionArea>
</affiliation>
</author>
</titleStmt>
<publicationStmt><idno type="wicri:source">PMC</idno>
<idno type="pmid">20122188</idno>
<idno type="pmc">3009488</idno>
<idno type="url">http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3009488</idno>
<idno type="RBID">PMC:3009488</idno>
<idno type="doi">10.1186/1471-2105-11-S1-S17</idno>
<date when="2010">2010</date>
<idno type="wicri:Area/Pmc/Corpus">000A87</idno>
<idno type="wicri:explorRef" wicri:stream="Pmc" wicri:step="Corpus" wicri:corpus="PMC">000A87</idno>
<idno type="wicri:Area/Pmc/Curation">000A87</idno>
<idno type="wicri:explorRef" wicri:stream="Pmc" wicri:step="Curation">000A87</idno>
</publicationStmt>
<sourceDesc><biblStruct><analytic><title xml:lang="en" level="a" type="main">Amino acid classification based spectrum kernel fusion for protein subnuclear localization</title>
<author><name sortKey="Mei, Suyu" sort="Mei, Suyu" uniqKey="Mei S" first="Suyu" last="Mei">Suyu Mei</name>
<affiliation wicri:level="1"><nlm:aff id="I1">Shanghai Key Laboratory of Intelligent Information Processing, School of Computer Science, Fudan University, Shanghai, PR China</nlm:aff>
<country xml:lang="fr">République populaire de Chine</country>
<wicri:regionArea>Shanghai Key Laboratory of Intelligent Information Processing, School of Computer Science, Fudan University, Shanghai</wicri:regionArea>
</affiliation>
</author>
<author><name sortKey="Fei, Wang" sort="Fei, Wang" uniqKey="Fei W" first="Wang" last="Fei">Wang Fei</name>
<affiliation wicri:level="1"><nlm:aff id="I1">Shanghai Key Laboratory of Intelligent Information Processing, School of Computer Science, Fudan University, Shanghai, PR China</nlm:aff>
<country xml:lang="fr">République populaire de Chine</country>
<wicri:regionArea>Shanghai Key Laboratory of Intelligent Information Processing, School of Computer Science, Fudan University, Shanghai</wicri:regionArea>
</affiliation>
</author>
</analytic>
<series><title level="j">BMC Bioinformatics</title>
<idno type="eISSN">1471-2105</idno>
<imprint><date when="2010">2010</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
</fileDesc>
<profileDesc><textClass></textClass>
</profileDesc>
</teiHeader>
<front><div type="abstract" xml:lang="en"><sec><title>Background</title>
<p>Prediction of protein localization in subnuclear organelles is more challenging than general protein subcelluar localization. There are only three computational models for protein subnuclear localization thus far, to the best of our knowledge. Two models were based on protein primary sequence only. The first model assumed homogeneous amino acid substitution pattern across all protein sequence residue sites and used BLOSUM62 to encode <italic>k</italic>
-mer of protein sequence. Ensemble of SVM based on different <italic>k</italic>
-mers drew the final conclusion, achieving 50% overall accuracy. The simplified assumption did not exploit protein sequence profile and ignored the fact of heterogeneous amino acid substitution patterns across sites. The second model derived the <italic>PsePSSM </italic>
feature representation from protein sequence by simply averaging the profile PSSM and combined the <italic>PseAA </italic>
feature representation to construct a kNN ensemble classifier <italic>Nuc-PLoc</italic>
, achieving 67.4% overall accuracy. The two models based on protein primary sequence only both achieved relatively poor predictive performance. The third model required that GO annotations be available, thus restricting the model's applicability.</p>
</sec>
<sec><title>Methods</title>
<p>In this paper, we only use the amino acid information of protein sequence without any other information to design a widely-applicable model for protein subnuclear localization. We use <italic>K</italic>
-spectrum kernel to exploit the contextual information around an amino acid and the conserved motif information. Besides expanding window size, we adopt various amino acid classification approaches to capture diverse aspects of amino acid physiochemical properties. Each amino acid classification generates a series of spectrum kernels based on different window size. Thus, (I) window expansion can capture more contextual information and cover size-varying motifs; (II) various amino acid classifications can exploit multi-aspect biological information from the protein sequence. Finally, we combine all the spectrum kernels by simple addition into one single kernel called <italic>SpectrumKernel+ </italic>
for protein subnuclear localization.</p>
</sec>
<sec><title>Results</title>
<p>We conduct the performance evaluation experiments on two benchmark datasets: <italic>Lei </italic>
and <italic>Nuc-PLoc</italic>
. Experimental results show that <italic>SpectrumKernel+ </italic>
achieves substantial performance improvement against the previous model <italic>Nuc-PLoc</italic>
, with overall accuracy <italic>83.47% </italic>
against <italic>67.4%</italic>
; and <italic>71.23% </italic>
against <italic>50% </italic>
of <italic>Lei SVM Ensemble</italic>
, against 66.50% of <italic>Lei GO SVM Ensemble</italic>
.</p>
</sec>
<sec><title>Conclusion</title>
<p>The method <italic>SpectrumKernel</italic>
+ can exploit rich amino acid information of protein sequence by embedding into implicit size-varying motifs the multi-aspect amino acid physiochemical properties captured by amino acid classification approaches. The kernels derived from diverse amino acid classification approaches and different sizes of <italic>k</italic>
-mer are summed together for data integration. Experiments show that the method <italic>SpectrumKernel</italic>
+ significantly outperforms the existing models for protein subnuclear localization.</p>
</sec>
</div>
</front>
</TEI>
<pmc article-type="research-article"><pmc-dir>properties open_access</pmc-dir>
  <front><journal-meta><journal-id journal-id-type="nlm-ta">BMC Bioinformatics</journal-id>
<journal-title-group><journal-title>BMC Bioinformatics</journal-title>
</journal-title-group>
<issn pub-type="epub">1471-2105</issn>
<publisher><publisher-name>BioMed Central</publisher-name>
</publisher>
</journal-meta>
<article-meta><article-id pub-id-type="pmid">20122188</article-id>
<article-id pub-id-type="pmc">3009488</article-id>
<article-id pub-id-type="publisher-id">1471-2105-11-S1-S17</article-id>
<article-id pub-id-type="doi">10.1186/1471-2105-11-S1-S17</article-id>
<article-categories><subj-group subj-group-type="heading"><subject>Research</subject>
</subj-group>
</article-categories>
<title-group><article-title>Amino acid classification based spectrum kernel fusion for protein subnuclear localization</article-title>
</title-group>
<contrib-group><contrib contrib-type="author" id="A1"><name><surname>Mei</surname>
<given-names>Suyu</given-names>
</name>
<xref ref-type="aff" rid="I1">1</xref>
<email>meisuyureg@sohu.com</email>
</contrib>
<contrib contrib-type="author" corresp="yes" id="A2"><name><surname>Fei</surname>
<given-names>Wang</given-names>
</name>
<xref ref-type="aff" rid="I1">1</xref>
<email>wangfei@fudan.edu.cn</email>
</contrib>
</contrib-group>
<aff id="I1"><label>1</label>
Shanghai Key Laboratory of Intelligent Information Processing, School of Computer Science, Fudan University, Shanghai, PR China</aff>
<pub-date pub-type="collection"><year>2010</year>
</pub-date>
<pub-date pub-type="epub"><day>18</day>
<month>1</month>
<year>2010</year>
</pub-date>
<volume>11</volume>
<issue>Suppl 1</issue>
<supplement><named-content content-type="supplement-title">Selected articles from the Eighth Asia-Pacific Bioinformatics Conference (APBC 2010)</named-content>
<named-content content-type="supplement-editor">Laxmi Parida and Gene Myers</named-content>
</supplement>
<fpage>S17</fpage>
<lpage>S17</lpage>
<permissions><copyright-statement>Copyright ©2010 Mei and Fei; licensee BioMed Central Ltd.</copyright-statement>
<copyright-year>2010</copyright-year>
<copyright-holder>Mei and Fei; licensee BioMed Central Ltd.</copyright-holder>
<license license-type="open-access" xlink:href="http://creativecommons.org/licenses/by/2.0"><license-p>This is an open access article distributed under the terms of the Creative Commons Attribution License (<ext-link ext-link-type="uri" xlink:href="http://creativecommons.org/licenses/by/2.0">http://creativecommons.org/licenses/by/2.0</ext-link>
), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</license-p>
</license>
</permissions>
<self-uri xlink:href="http://www.biomedcentral.com/1471-2105/11/S1/S17"></self-uri>
<abstract><sec><title>Background</title>
<p>Prediction of protein localization in subnuclear organelles is more challenging than general protein subcelluar localization. There are only three computational models for protein subnuclear localization thus far, to the best of our knowledge. Two models were based on protein primary sequence only. The first model assumed homogeneous amino acid substitution pattern across all protein sequence residue sites and used BLOSUM62 to encode <italic>k</italic>
-mer of protein sequence. Ensemble of SVM based on different <italic>k</italic>
-mers drew the final conclusion, achieving 50% overall accuracy. The simplified assumption did not exploit protein sequence profile and ignored the fact of heterogeneous amino acid substitution patterns across sites. The second model derived the <italic>PsePSSM </italic>
feature representation from protein sequence by simply averaging the profile PSSM and combined the <italic>PseAA </italic>
feature representation to construct a kNN ensemble classifier <italic>Nuc-PLoc</italic>
, achieving 67.4% overall accuracy. The two models based on protein primary sequence only both achieved relatively poor predictive performance. The third model required that GO annotations be available, thus restricting the model's applicability.</p>
</sec>
<sec><title>Methods</title>
<p>In this paper, we only use the amino acid information of protein sequence without any other information to design a widely-applicable model for protein subnuclear localization. We use <italic>K</italic>
-spectrum kernel to exploit the contextual information around an amino acid and the conserved motif information. Besides expanding window size, we adopt various amino acid classification approaches to capture diverse aspects of amino acid physiochemical properties. Each amino acid classification generates a series of spectrum kernels based on different window size. Thus, (I) window expansion can capture more contextual information and cover size-varying motifs; (II) various amino acid classifications can exploit multi-aspect biological information from the protein sequence. Finally, we combine all the spectrum kernels by simple addition into one single kernel called <italic>SpectrumKernel+ </italic>
for protein subnuclear localization.</p>
</sec>
<sec><title>Results</title>
<p>We conduct the performance evaluation experiments on two benchmark datasets: <italic>Lei </italic>
and <italic>Nuc-PLoc</italic>
. Experimental results show that <italic>SpectrumKernel+ </italic>
achieves substantial performance improvement against the previous model <italic>Nuc-PLoc</italic>
, with overall accuracy <italic>83.47% </italic>
against <italic>67.4%</italic>
; and <italic>71.23% </italic>
against <italic>50% </italic>
of <italic>Lei SVM Ensemble</italic>
, against 66.50% of <italic>Lei GO SVM Ensemble</italic>
.</p>
</sec>
<sec><title>Conclusion</title>
<p>The method <italic>SpectrumKernel</italic>
+ can exploit rich amino acid information of protein sequence by embedding into implicit size-varying motifs the multi-aspect amino acid physiochemical properties captured by amino acid classification approaches. The kernels derived from diverse amino acid classification approaches and different sizes of <italic>k</italic>
-mer are summed together for data integration. Experiments show that the method <italic>SpectrumKernel</italic>
+ significantly outperforms the existing models for protein subnuclear localization.</p>
</sec>
</abstract>
<conference><conf-date>18-21 January 2010</conf-date>
<conf-name>The Eighth Asia Pacific Bioinformatics Conference (APBC 2010)</conf-name>
<conf-loc>Bangalore, India</conf-loc>
</conference>
</article-meta>
</front>
</pmc>
</record>

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Sante/explor/MersV1/Data/Pmc/Curation

HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000A87 | SxmlIndent | more

HfdSelect -h $EXPLOR_AREA/Data/Pmc/Curation/biblio.hfd -nk 000A87 | SxmlIndent | more

To add a link to this page in the Wicri network

{{Explor lien
   |wiki=    Sante
   |area=    MersV1
   |flux=    Pmc
   |étape=   Curation
   |type=    RBID
   |clé=     PMC:3009488
   |texte=   Amino acid classification based spectrum kernel fusion for protein subnuclear localization
}}

To generate wiki pages

HfdIndexSelect -h $EXPLOR_AREA/Data/Pmc/Curation/RBID.i   -Sk "pubmed:20122188" \
       | HfdSelect -Kh $EXPLOR_AREA/Data/Pmc/Curation/biblio.hfd   \
       | NlmPubMed2Wicri -a MersV1

This area was generated with Dilib version V0.6.33.
Data generation: Mon Apr 20 23:26:43 2020. Site generation: Sat Mar 27 09:06:09 2021

MERS exploration server
Warning: this site is under development! Warning: this site was generated by automated means from raw corpora. The information has therefore not been validated.
