MERS exploration server

Warning: this site is under development!
Warning: this site was generated automatically from raw corpora.
The information has therefore not been validated.

Physicochemical property distributions for accurate and rapid pairwise protein homology detection

Internal identifier: 000A88 (Pmc/Curation); previous: 000A87; next: 000A89

Authors: Bobbie-Jo M. Webb-Robertson [United States]; Kyle G. Ratuiste [United States]; Christopher S. Oehmen [United States]

Source:

RBID : PMC:2851606

Abstract

Background

The challenge of remote homology detection is that many evolutionarily related sequences have very little similarity at the amino acid level. Kernel-based discriminative methods, such as support vector machines (SVMs), that use vector representations of sequences derived from sequence properties have been shown to have superior accuracy when compared to traditional approaches for the task of remote homology detection.

Results

We introduce a new method for feature vector representation based on the physicochemical properties of the primary protein sequence. A distribution of physicochemical property scores is assembled from 4-mers of the sequence and normalized based on the null distribution of the property over all possible 4-mers. With this approach there is little computational cost associated with the transformation of the protein into feature space, and overall performance in terms of remote homology detection is comparable with current state-of-the-art methods. We demonstrate that the features can be used for the task of pairwise remote homology detection with improved accuracy versus sequence-based methods such as BLAST and other feature-based methods of similar computational cost.

Conclusions

A protein feature method based on physicochemical properties is a viable approach for extracting features in a computationally inexpensive manner while retaining the sensitivity of SVM protein homology detection. Furthermore, identifying features that can be used for generic pairwise homology detection in lieu of family-based homology detection is important for applications such as large database searches and comparative genomics.
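
The feature construction summarized in the Results can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say which physicochemical properties are used, how a 4-mer is scored, or how the distribution is binned. The sketch therefore assumes a single property (Kyte-Doolittle hydropathy), scores each 4-mer as the mean of its residue values, and bins the scores using deciles of the scores of all 20^4 possible 4-mers as the null distribution. The names `kmer_score`, `null_bin_edges` and `property_features` are hypothetical.

```python
from bisect import bisect_right
from itertools import product

# Illustrative property only: Kyte-Doolittle hydropathy values per residue.
# The abstract does not say which physicochemical properties are used.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def kmer_score(kmer, scale):
    """Score a k-mer as the mean property value of its residues (assumed rule)."""
    return round(sum(scale[residue] for residue in kmer) / len(kmer), 6)


def null_bin_edges(scale, k=4, n_bins=10):
    """Interior bin edges at equally spaced quantiles of the scores of all 20**k k-mers."""
    scores = sorted(kmer_score(kmer, scale) for kmer in product(scale, repeat=k))
    return [scores[(i * len(scores)) // n_bins] for i in range(1, n_bins)]


def property_features(sequence, scale, edges, k=4):
    """Fraction of the sequence's overlapping k-mers that falls in each null-defined bin."""
    counts = [0] * (len(edges) + 1)
    total = 0
    for start in range(len(sequence) - k + 1):
        kmer = sequence[start:start + k]
        # Skip k-mers that contain residues missing from the scale (e.g. X).
        if all(residue in scale for residue in kmer):
            counts[bisect_right(edges, kmer_score(kmer, scale))] += 1
            total += 1
    return [count / total for count in counts] if total else [0.0] * len(counts)


if __name__ == "__main__":
    edges = null_bin_edges(KYTE_DOOLITTLE)
    print(property_features("MKVLAAGIVGLLLAGCS", KYTE_DOOLITTLE, edges))
```

Because the bin edges are quantiles of the null distribution, each bin holds roughly one tenth of all possible 4-mers, so a value well above 0.1 in a bin suggests that the sequence is enriched in that score range relative to the null. The resulting fixed-length vector could then be passed to an SVM; that step is outside this sketch.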


Url:
DOI: 10.1186/1471-2105-11-145
PubMed: 20302613
PubMed Central: 2851606

The document in XML format

<record>
<TEI>
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en">Physicochemical property distributions for accurate and rapid pairwise protein homology detection</title>
<author>
<name sortKey="Webb Robertson, Bobbie Jo M" sort="Webb Robertson, Bobbie Jo M" uniqKey="Webb Robertson B" first="Bobbie-Jo M" last="Webb-Robertson">Bobbie-Jo M. Webb-Robertson</name>
<affiliation wicri:level="1">
<nlm:aff id="I1">Computational Biology and Bioinformatics, Pacific Northwest National Laboratory, Richland, WA 99352, USA</nlm:aff>
<country xml:lang="fr">États-Unis</country>
<wicri:regionArea>Computational Biology and Bioinformatics, Pacific Northwest National Laboratory, Richland, WA 99352</wicri:regionArea>
</affiliation>
</author>
<author>
<name sortKey="Ratuiste, Kyle G" sort="Ratuiste, Kyle G" uniqKey="Ratuiste K" first="Kyle G" last="Ratuiste">Kyle G. Ratuiste</name>
<affiliation wicri:level="1">
<nlm:aff id="I2">Department of Chemistry, Gonzaga University, Spokane, WA 99258, USA</nlm:aff>
<country xml:lang="fr">États-Unis</country>
<wicri:regionArea>Department of Chemistry, Gonzaga University, Spokane, WA 99258</wicri:regionArea>
</affiliation>
</author>
<author>
<name sortKey="Oehmen, Christopher S" sort="Oehmen, Christopher S" uniqKey="Oehmen C" first="Christopher S" last="Oehmen">Christopher S. Oehmen</name>
<affiliation wicri:level="1">
<nlm:aff id="I1">Computational Biology and Bioinformatics, Pacific Northwest National Laboratory, Richland, WA 99352, USA</nlm:aff>
<country xml:lang="fr">États-Unis</country>
<wicri:regionArea>Computational Biology and Bioinformatics, Pacific Northwest National Laboratory, Richland, WA 99352</wicri:regionArea>
</affiliation>
</author>
</titleStmt>
<publicationStmt>
<idno type="wicri:source">PMC</idno>
<idno type="pmid">20302613</idno>
<idno type="pmc">2851606</idno>
<idno type="url">http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2851606</idno>
<idno type="RBID">PMC:2851606</idno>
<idno type="doi">10.1186/1471-2105-11-145</idno>
<date when="2010">2010</date>
<idno type="wicri:Area/Pmc/Corpus">000A88</idno>
<idno type="wicri:explorRef" wicri:stream="Pmc" wicri:step="Corpus" wicri:corpus="PMC">000A88</idno>
<idno type="wicri:Area/Pmc/Curation">000A88</idno>
<idno type="wicri:explorRef" wicri:stream="Pmc" wicri:step="Curation">000A88</idno>
</publicationStmt>
<sourceDesc>
<biblStruct>
<analytic>
<title xml:lang="en" level="a" type="main">Physicochemical property distributions for accurate and rapid pairwise protein homology detection</title>
<author>
<name sortKey="Webb Robertson, Bobbie Jo M" sort="Webb Robertson, Bobbie Jo M" uniqKey="Webb Robertson B" first="Bobbie-Jo M" last="Webb-Robertson">Bobbie-Jo M. Webb-Robertson</name>
<affiliation wicri:level="1">
<nlm:aff id="I1">Computational Biology and Bioinformatics, Pacific Northwest National Laboratory, Richland, WA 99352, USA</nlm:aff>
<country xml:lang="fr">États-Unis</country>
<wicri:regionArea>Computational Biology and Bioinformatics, Pacific Northwest National Laboratory, Richland, WA 99352</wicri:regionArea>
</affiliation>
</author>
<author>
<name sortKey="Ratuiste, Kyle G" sort="Ratuiste, Kyle G" uniqKey="Ratuiste K" first="Kyle G" last="Ratuiste">Kyle G. Ratuiste</name>
<affiliation wicri:level="1">
<nlm:aff id="I2">Department of Chemistry, Gonzaga University, Spokane, WA 99258, USA</nlm:aff>
<country xml:lang="fr">États-Unis</country>
<wicri:regionArea>Department of Chemistry, Gonzaga University, Spokane, WA 99258</wicri:regionArea>
</affiliation>
</author>
<author>
<name sortKey="Oehmen, Christopher S" sort="Oehmen, Christopher S" uniqKey="Oehmen C" first="Christopher S" last="Oehmen">Christopher S. Oehmen</name>
<affiliation wicri:level="1">
<nlm:aff id="I1">Computational Biology and Bioinformatics, Pacific Northwest National Laboratory, Richland, WA 99352, USA</nlm:aff>
<country xml:lang="fr">États-Unis</country>
<wicri:regionArea>Computational Biology and Bioinformatics, Pacific Northwest National Laboratory, Richland, WA 99352</wicri:regionArea>
</affiliation>
</author>
</analytic>
<series>
<title level="j">BMC Bioinformatics</title>
<idno type="eISSN">1471-2105</idno>
<imprint>
<date when="2010">2010</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
</fileDesc>
<profileDesc>
<textClass></textClass>
</profileDesc>
</teiHeader>
<front>
<div type="abstract" xml:lang="en">
<sec>
<title>Background</title>
<p>The challenge of remote homology detection is that many evolutionarily related sequences have very little similarity at the amino acid level. Kernel-based discriminative methods, such as support vector machines (SVMs), that use vector representations of sequences derived from sequence properties have been shown to have superior accuracy when compared to traditional approaches for the task of remote homology detection.</p>
</sec>
<sec>
<title>Results</title>
<p>We introduce a new method for feature vector representation based on the physicochemical properties of the primary protein sequence. A distribution of physicochemical property scores is assembled from 4-mers of the sequence and normalized based on the null distribution of the property over all possible 4-mers. With this approach there is little computational cost associated with the transformation of the protein into feature space, and overall performance in terms of remote homology detection is comparable with current state-of-the-art methods. We demonstrate that the features can be used for the task of pairwise remote homology detection with improved accuracy versus sequence-based methods such as BLAST and other feature-based methods of similar computational cost.</p>
</sec>
<sec>
<title>Conclusions</title>
<p>A protein feature method based on physicochemical properties is a viable approach for extracting features in a computationally inexpensive manner while retaining the sensitivity of SVM protein homology detection. Furthermore, identifying features that can be used for generic pairwise homology detection in lieu of family-based homology detection is important for applications such as large database searches and comparative genomics.</p>
</sec>
</div>
</front>
</TEI>
<pmc article-type="research-article">
<pmc-dir>properties open_access</pmc-dir>
<front>
<journal-meta>
<journal-id journal-id-type="nlm-ta">BMC Bioinformatics</journal-id>
<journal-title-group>
<journal-title>BMC Bioinformatics</journal-title>
</journal-title-group>
<issn pub-type="epub">1471-2105</issn>
<publisher>
<publisher-name>BioMed Central</publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="pmid">20302613</article-id>
<article-id pub-id-type="pmc">2851606</article-id>
<article-id pub-id-type="publisher-id">1471-2105-11-145</article-id>
<article-id pub-id-type="doi">10.1186/1471-2105-11-145</article-id>
<article-categories>
<subj-group subj-group-type="heading">
<subject>Methodology article</subject>
</subj-group>
</article-categories>
<title-group>
<article-title>Physicochemical property distributions for accurate and rapid pairwise protein homology detection</article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author" corresp="yes" id="A1">
<name>
<surname>Webb-Robertson</surname>
<given-names>Bobbie-Jo M</given-names>
</name>
<xref ref-type="aff" rid="I1">1</xref>
<email>bj@pnl.gov</email>
</contrib>
<contrib contrib-type="author" id="A2">
<name>
<surname>Ratuiste</surname>
<given-names>Kyle G</given-names>
</name>
<xref ref-type="aff" rid="I2">2</xref>
<email>kgratuiste@gmail.com</email>
</contrib>
<contrib contrib-type="author" id="A3">
<name>
<surname>Oehmen</surname>
<given-names>Christopher S</given-names>
</name>
<xref ref-type="aff" rid="I1">1</xref>
<email>christopher.oehmen@pnl.gov</email>
</contrib>
</contrib-group>
<aff id="I1">
<label>1</label>
Computational Biology and Bioinformatics, Pacific Northwest National Laboratory, Richland, WA 99352, USA</aff>
<aff id="I2">
<label>2</label>
Department of Chemistry, Gonzaga University, Spokane, WA 99258, USA</aff>
<pub-date pub-type="collection">
<year>2010</year>
</pub-date>
<pub-date pub-type="epub">
<day>19</day>
<month>3</month>
<year>2010</year>
</pub-date>
<volume>11</volume>
<fpage>145</fpage>
<lpage>145</lpage>
<history>
<date date-type="received">
<day>24</day>
<month>10</month>
<year>2009</year>
</date>
<date date-type="accepted">
<day>19</day>
<month>3</month>
<year>2010</year>
</date>
</history>
<permissions>
<copyright-statement>Copyright ©2010 Webb-Robertson et al; licensee BioMed Central Ltd.</copyright-statement>
<copyright-year>2010</copyright-year>
<copyright-holder>Webb-Robertson et al; licensee BioMed Central Ltd.</copyright-holder>
<license license-type="open-access" xlink:href="http://creativecommons.org/licenses/by/2.0">
<license-p>This is an Open Access article distributed under the terms of the Creative Commons Attribution License (
<ext-link ext-link-type="uri" xlink:href="http://creativecommons.org/licenses/by/2.0">http://creativecommons.org/licenses/by/2.0</ext-link>
), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</license-p>
</license>
</permissions>
<self-uri xlink:href="http://www.biomedcentral.com/1471-2105/11/145"></self-uri>
<abstract>
<sec>
<title>Background</title>
<p>The challenge of remote homology detection is that many evolutionarily related sequences have very little similarity at the amino acid level. Kernel-based discriminative methods, such as support vector machines (SVMs), that use vector representations of sequences derived from sequence properties have been shown to have superior accuracy when compared to traditional approaches for the task of remote homology detection.</p>
</sec>
<sec>
<title>Results</title>
<p>We introduce a new method for feature vector representation based on the physicochemical properties of the primary protein sequence. A distribution of physicochemical property scores is assembled from 4-mers of the sequence and normalized based on the null distribution of the property over all possible 4-mers. With this approach there is little computational cost associated with the transformation of the protein into feature space, and overall performance in terms of remote homology detection is comparable with current state-of-the-art methods. We demonstrate that the features can be used for the task of pairwise remote homology detection with improved accuracy versus sequence-based methods such as BLAST and other feature-based methods of similar computational cost.</p>
</sec>
<sec>
<title>Conclusions</title>
<p>A protein feature method based on physicochemical properties is a viable approach for extracting features in a computationally inexpensive manner while retaining the sensitivity of SVM protein homology detection. Furthermore, identifying features that can be used for generic pairwise homology detection in lieu of family-based homology detection is important for applications such as large database searches and comparative genomics.</p>
</sec>
</abstract>
</article-meta>
</front>
</pmc>
</record>

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Sante/explor/MersV1/Data/Pmc/Curation
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000A88 | SxmlIndent | more

Or

HfdSelect -h $EXPLOR_AREA/Data/Pmc/Curation/biblio.hfd -nk 000A88 | SxmlIndent | more
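
Outside Dilib, the XML record shown on this page can also be read with a general-purpose XML parser. As displayed, the record uses the `wicri:`, `nlm:` and `xlink:` prefixes without declaring them, so a namespace-aware parser rejects it unchanged. The following minimal Python sketch is not part of Dilib. It binds those prefixes to placeholder URIs (an assumption, since this page does not give the real namespace URIs for `wicri:` and `nlm:`) and extracts a few identifiers. `parse_record` and `extract_ids` are hypothetical helper names, and `SAMPLE` is a short excerpt of the record.

```python
import xml.etree.ElementTree as ET

# Assumption: the page does not give the real namespace URIs for wicri: and nlm:,
# so placeholder URNs are bound here purely to make the record parseable.
PLACEHOLDER_NAMESPACES = {
    "wicri": "urn:example:wicri",
    "nlm": "urn:example:nlm",
    "xlink": "http://www.w3.org/1999/xlink",
}


def parse_record(xml_text):
    """Parse a <record> after declaring the prefixes it uses on its root element."""
    declarations = " ".join(
        f'xmlns:{prefix}="{uri}"' for prefix, uri in PLACEHOLDER_NAMESPACES.items()
    )
    patched = xml_text.replace("<record>", f"<record {declarations}>", 1)
    return ET.fromstring(patched)


def extract_ids(root, wanted=("doi", "pmid", "pmc")):
    """Collect selected <idno> values, keyed by their type attribute."""
    ids = {}
    for idno in root.iter("idno"):
        if idno.get("type") in wanted:
            ids[idno.get("type")] = idno.text
    return ids


# Short excerpt of the record, used as sample input.
SAMPLE = """<record>
<TEI>
<teiHeader>
<fileDesc>
<publicationStmt>
<idno type="wicri:source">PMC</idno>
<idno type="pmid">20302613</idno>
<idno type="pmc">2851606</idno>
<idno type="doi">10.1186/1471-2105-11-145</idno>
<idno type="wicri:explorRef" wicri:stream="Pmc" wicri:step="Curation">000A88</idno>
</publicationStmt>
</fileDesc>
</teiHeader>
</TEI>
</record>"""

if __name__ == "__main__":
    print(extract_ids(parse_record(SAMPLE)))
```

Patching only the root element leaves the rest of the record untouched, so the same approach should apply to the full record once it is saved to a string.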

To add a link to this page within the Wicri network

{{Explor lien
   |wiki=    Sante
   |area=    MersV1
   |flux=    Pmc
   |étape=   Curation
   |type=    RBID
   |clé=     PMC:2851606
   |texte=   Physicochemical property distributions for accurate and rapid pairwise protein homology detection
}}

To generate wiki pages

HfdIndexSelect -h $EXPLOR_AREA/Data/Pmc/Curation/RBID.i   -Sk "pubmed:20302613" \
       | HfdSelect -Kh $EXPLOR_AREA/Data/Pmc/Curation/biblio.hfd   \
       | NlmPubMed2Wicri -a MersV1 

Wicri

This area was generated with Dilib version V0.6.33.
Data generation: Mon Apr 20 23:26:43 2020. Site generation: Sat Mar 27 09:06:09 2021