Codon number shapes peptide redundancy in the universal proteome composition.
Internal identifier: 000700 (Ncbi/Merge); previous: 000699; next: 000701
Authors: Anthony Kusalik [Canada]; Brett Trost; Mik Bickis; Candida Fasano; Giovanni Capone; Darja Kanduc
Source:
- Peptides [ 1873-5169 ] ; 2009.
French descriptors
- KwdFr :
- MESH :
English descriptors
- KwdEn :
- MESH :
- chemical , analysis : Proteome.
- chemical , genetics : Peptides, Proteome.
- chemical : Codon.
- Amino Acid Sequence, Computational Biology, Databases, Protein, Humans, Molecular Sequence Data, Sequence Analysis, Protein.
Abstract
The proteomes catalogued in the UniRef100 database were collected into a single proteome set and examined for actual versus theoretical pentapeptide occurrences. We found a highly diversified degree of pentapeptide redundancy. Numerically, 953 pentamers are expressed only once in the protein world, whereas 103 pentamers occur more than 50,000 times. Moreover, it seems that 417 potentially possible pentapeptides are not present in the protein world. On the whole, tracing the redundancy profile of the protein world as a function of pentapeptide occurrences reveals a quasi-Gaussian curve, with tails representing scarcely and repeatedly occurring 5-mers. Analysis of physico-chemical-biological parameters shows that codon number is the main factor influencing and favoring specific pentapeptide frequencies in the universal proteome composition. That is, when compared to the set of never-expressed 5-mers, the pentapeptides frequently represented in the universal proteome are endowed with a higher number of multi-codonic amino acids. In contrast, the bulkiness degree and the hydrophobicity level play a smaller role. Unexpectedly, the heat of formation of pentapeptide appears to have the least influence.
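The comparison of actual versus theoretical pentapeptide occurrences described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names (`count_kmers`, `redundancy_summary`, `codon_score`) and the toy sequences are assumptions introduced here, and the codon counts are those of the standard genetic code (61 sense codons). With 20 standard amino acids, the theoretical number of pentapeptides is 20^5 = 3,200,000.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Number of codons per amino acid in the standard genetic code.
CODON_NUMBER = {
    "A": 4, "C": 2, "D": 2, "E": 2, "F": 2, "G": 4, "H": 2, "I": 3,
    "K": 2, "L": 6, "M": 1, "N": 2, "P": 4, "Q": 2, "R": 6, "S": 6,
    "T": 4, "V": 4, "W": 1, "Y": 2,
}


def count_kmers(sequences, k=5):
    """Count overlapping k-mers made only of the 20 standard residues."""
    valid = set(AMINO_ACIDS)
    counts = Counter()
    for seq in sequences:
        seq = seq.upper()
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            # Skip windows containing ambiguous or non-standard symbols (e.g. X).
            if valid.issuperset(kmer):
                counts[kmer] += 1
    return counts


def redundancy_summary(counts, k=5):
    """Compare observed k-mers with the theoretical 20**k possibilities."""
    possible = len(AMINO_ACIDS) ** k
    return {
        "possible": possible,
        "observed": len(counts),
        "absent": possible - len(counts),
        "singletons": sum(1 for c in counts.values() if c == 1),
    }


def codon_score(peptide):
    """Sum of codon numbers over the residues of a peptide."""
    return sum(CODON_NUMBER[aa] for aa in peptide)


# Toy input (hypothetical sequences, not taken from UniRef100).
toy_sequences = ["MKLLLLLSA", "LLLLLSAXW"]
counts = count_kmers(toy_sequences)
summary = redundancy_summary(counts)
print(summary)
print(codon_score("LLLLL"), codon_score("MWMWM"))
```

On a real data set the same counting would be run over every sequence of the proteome set; the codon score is one simple way to compare frequent and never-observed 5-mers by their content of multi-codonic amino acids.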
DOI: 10.1016/j.peptides.2009.06.035
PubMed: 19591891
Links to previous steps (curation, corpus...)
- to stream PubMed, to step Corpus: 002004
- to stream PubMed, to step Curation: 002004
- to stream PubMed, to step Checkpoint: 001F28
Links to Exploration step
pubmed:19591891
The document in XML format
<record><TEI><teiHeader><fileDesc><titleStmt><title xml:lang="en">Codon number shapes peptide redundancy in the universal proteome composition.</title>
<author><name sortKey="Kusalik, Anthony" sort="Kusalik, Anthony" uniqKey="Kusalik A" first="Anthony" last="Kusalik">Anthony Kusalik</name>
<affiliation wicri:level="1"><nlm:affiliation>Department of Computer Science, University of Saskatchewan, Saskatoon, Canada.</nlm:affiliation>
<country xml:lang="fr">Canada</country>
<wicri:regionArea>Department of Computer Science, University of Saskatchewan, Saskatoon</wicri:regionArea>
<wicri:noRegion>Saskatoon</wicri:noRegion>
</affiliation>
</author>
<author><name sortKey="Trost, Brett" sort="Trost, Brett" uniqKey="Trost B" first="Brett" last="Trost">Brett Trost</name>
</author>
<author><name sortKey="Bickis, Mik" sort="Bickis, Mik" uniqKey="Bickis M" first="Mik" last="Bickis">Mik Bickis</name>
</author>
<author><name sortKey="Fasano, Candida" sort="Fasano, Candida" uniqKey="Fasano C" first="Candida" last="Fasano">Candida Fasano</name>
</author>
<author><name sortKey="Capone, Giovanni" sort="Capone, Giovanni" uniqKey="Capone G" first="Giovanni" last="Capone">Giovanni Capone</name>
</author>
<author><name sortKey="Kanduc, Darja" sort="Kanduc, Darja" uniqKey="Kanduc D" first="Darja" last="Kanduc">Darja Kanduc</name>
</author>
</titleStmt>
<publicationStmt><idno type="wicri:source">PubMed</idno>
<date when="2009">2009</date>
<idno type="RBID">pubmed:19591891</idno>
<idno type="pmid">19591891</idno>
<idno type="doi">10.1016/j.peptides.2009.06.035</idno>
<idno type="wicri:Area/PubMed/Corpus">002004</idno>
<idno type="wicri:explorRef" wicri:stream="PubMed" wicri:step="Corpus" wicri:corpus="PubMed">002004</idno>
<idno type="wicri:Area/PubMed/Curation">002004</idno>
<idno type="wicri:explorRef" wicri:stream="PubMed" wicri:step="Curation">002004</idno>
<idno type="wicri:Area/PubMed/Checkpoint">001F28</idno>
<idno type="wicri:explorRef" wicri:stream="PubMed" wicri:step="Checkpoint">001F28</idno>
<idno type="wicri:Area/Ncbi/Merge">000700</idno>
</publicationStmt>
<sourceDesc><biblStruct><analytic><title xml:lang="en">Codon number shapes peptide redundancy in the universal proteome composition.</title>
<author><name sortKey="Kusalik, Anthony" sort="Kusalik, Anthony" uniqKey="Kusalik A" first="Anthony" last="Kusalik">Anthony Kusalik</name>
<affiliation wicri:level="1"><nlm:affiliation>Department of Computer Science, University of Saskatchewan, Saskatoon, Canada.</nlm:affiliation>
<country xml:lang="fr">Canada</country>
<wicri:regionArea>Department of Computer Science, University of Saskatchewan, Saskatoon</wicri:regionArea>
<wicri:noRegion>Saskatoon</wicri:noRegion>
</affiliation>
</author>
<author><name sortKey="Trost, Brett" sort="Trost, Brett" uniqKey="Trost B" first="Brett" last="Trost">Brett Trost</name>
</author>
<author><name sortKey="Bickis, Mik" sort="Bickis, Mik" uniqKey="Bickis M" first="Mik" last="Bickis">Mik Bickis</name>
</author>
<author><name sortKey="Fasano, Candida" sort="Fasano, Candida" uniqKey="Fasano C" first="Candida" last="Fasano">Candida Fasano</name>
</author>
<author><name sortKey="Capone, Giovanni" sort="Capone, Giovanni" uniqKey="Capone G" first="Giovanni" last="Capone">Giovanni Capone</name>
</author>
<author><name sortKey="Kanduc, Darja" sort="Kanduc, Darja" uniqKey="Kanduc D" first="Darja" last="Kanduc">Darja Kanduc</name>
</author>
</analytic>
<series><title level="j">Peptides</title>
<idno type="eISSN">1873-5169</idno>
<imprint><date when="2009" type="published">2009</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
</fileDesc>
<profileDesc><textClass><keywords scheme="KwdEn" xml:lang="en"><term>Amino Acid Sequence</term>
<term>Codon</term>
<term>Computational Biology</term>
<term>Databases, Protein</term>
<term>Humans</term>
<term>Molecular Sequence Data</term>
<term>Peptides (genetics)</term>
<term>Proteome (analysis)</term>
<term>Proteome (genetics)</term>
<term>Sequence Analysis, Protein</term>
</keywords>
<keywords scheme="KwdFr" xml:lang="fr"><term>Analyse de séquence de protéine</term>
<term>Bases de données de protéines</term>
<term>Biologie informatique</term>
<term>Codon</term>
<term>Données de séquences moléculaires</term>
<term>Humains</term>
<term>Peptides (génétique)</term>
<term>Protéome (analyse)</term>
<term>Protéome (génétique)</term>
<term>Séquence d'acides aminés</term>
</keywords>
<keywords scheme="MESH" type="chemical" qualifier="analysis" xml:lang="en"><term>Proteome</term>
</keywords>
<keywords scheme="MESH" type="chemical" qualifier="genetics" xml:lang="en"><term>Peptides</term>
<term>Proteome</term>
</keywords>
<keywords scheme="MESH" type="chemical" xml:lang="en"><term>Codon</term>
</keywords>
<keywords scheme="MESH" qualifier="analyse" xml:lang="fr"><term>Protéome</term>
</keywords>
<keywords scheme="MESH" qualifier="génétique" xml:lang="fr"><term>Peptides</term>
<term>Protéome</term>
</keywords>
<keywords scheme="MESH" xml:lang="en"><term>Amino Acid Sequence</term>
<term>Computational Biology</term>
<term>Databases, Protein</term>
<term>Humans</term>
<term>Molecular Sequence Data</term>
<term>Sequence Analysis, Protein</term>
</keywords>
<keywords scheme="MESH" xml:lang="fr"><term>Analyse de séquence de protéine</term>
<term>Bases de données de protéines</term>
<term>Biologie informatique</term>
<term>Codon</term>
<term>Données de séquences moléculaires</term>
<term>Humains</term>
<term>Séquence d'acides aminés</term>
</keywords>
</textClass>
</profileDesc>
</teiHeader>
<front><div type="abstract" xml:lang="en">The proteomes catalogued in the UniRef100 database were collected into a single proteome set and examined for actual versus theoretical pentapeptide occurrences. We found a highly diversified degree of pentapeptide redundancy. Numerically, 953 pentamers are expressed only once in the protein world, whereas 103 pentamers occur more than 50,000 times. Moreover, it seems that 417 potentially possible pentapeptides are not present in the protein world. On the whole, tracing the redundancy profile of the protein world as a function of pentapeptide occurrences reveals a quasi-Gaussian curve, with tails representing scarcely and repeatedly occurring 5-mers. Analysis of physico-chemical-biological parameters shows that codon number is the main factor influencing and favoring specific pentapeptide frequencies in the universal proteome composition. That is, when compared to the set of never-expressed 5-mers, the pentapeptides frequently represented in the universal proteome are endowed with a higher number of multi-codonic amino acids. In contrast, the bulkiness degree and the hydrophobicity level play a smaller role. Unexpectedly, the heat of formation of pentapeptide appears to have the least influence.</div>
</front>
</TEI>
<pubmed><MedlineCitation Status="MEDLINE" Owner="NLM"><PMID Version="1">19591891</PMID>
<DateCompleted><Year>2010</Year>
<Month>02</Month>
<Day>01</Day>
</DateCompleted>
<DateRevised><Year>2009</Year>
<Month>09</Month>
<Day>30</Day>
</DateRevised>
<Article PubModel="Print-Electronic"><Journal><ISSN IssnType="Electronic">1873-5169</ISSN>
<JournalIssue CitedMedium="Internet"><Volume>30</Volume>
<Issue>10</Issue>
<PubDate><Year>2009</Year>
<Month>Oct</Month>
</PubDate>
</JournalIssue>
<Title>Peptides</Title>
<ISOAbbreviation>Peptides</ISOAbbreviation>
</Journal>
<ArticleTitle>Codon number shapes peptide redundancy in the universal proteome composition.</ArticleTitle>
<Pagination><MedlinePgn>1940-4</MedlinePgn>
</Pagination>
<ELocationID EIdType="doi" ValidYN="Y">10.1016/j.peptides.2009.06.035</ELocationID>
<Abstract><AbstractText>The proteomes catalogued in the UniRef100 database were collected into a single proteome set and examined for actual versus theoretical pentapeptide occurrences. We found a highly diversified degree of pentapeptide redundancy. Numerically, 953 pentamers are expressed only once in the protein world, whereas 103 pentamers occur more than 50,000 times. Moreover, it seems that 417 potentially possible pentapeptides are not present in the protein world. On the whole, tracing the redundancy profile of the protein world as a function of pentapeptide occurrences reveals a quasi-Gaussian curve, with tails representing scarcely and repeatedly occurring 5-mers. Analysis of physico-chemical-biological parameters shows that codon number is the main factor influencing and favoring specific pentapeptide frequencies in the universal proteome composition. That is, when compared to the set of never-expressed 5-mers, the pentapeptides frequently represented in the universal proteome are endowed with a higher number of multi-codonic amino acids. In contrast, the bulkiness degree and the hydrophobicity level play a smaller role. Unexpectedly, the heat of formation of pentapeptide appears to have the least influence.</AbstractText>
</Abstract>
<AuthorList CompleteYN="Y"><Author ValidYN="Y"><LastName>Kusalik</LastName>
<ForeName>Anthony</ForeName>
<Initials>A</Initials>
<AffiliationInfo><Affiliation>Department of Computer Science, University of Saskatchewan, Saskatoon, Canada.</Affiliation>
</AffiliationInfo>
</Author>
<Author ValidYN="Y"><LastName>Trost</LastName>
<ForeName>Brett</ForeName>
<Initials>B</Initials>
</Author>
<Author ValidYN="Y"><LastName>Bickis</LastName>
<ForeName>Mik</ForeName>
<Initials>M</Initials>
</Author>
<Author ValidYN="Y"><LastName>Fasano</LastName>
<ForeName>Candida</ForeName>
<Initials>C</Initials>
</Author>
<Author ValidYN="Y"><LastName>Capone</LastName>
<ForeName>Giovanni</ForeName>
<Initials>G</Initials>
</Author>
<Author ValidYN="Y"><LastName>Kanduc</LastName>
<ForeName>Darja</ForeName>
<Initials>D</Initials>
</Author>
</AuthorList>
<Language>eng</Language>
<PublicationTypeList><PublicationType UI="D016428">Journal Article</PublicationType>
</PublicationTypeList>
<ArticleDate DateType="Electronic"><Year>2009</Year>
<Month>07</Month>
<Day>08</Day>
</ArticleDate>
</Article>
<MedlineJournalInfo><Country>United States</Country>
<MedlineTA>Peptides</MedlineTA>
<NlmUniqueID>8008690</NlmUniqueID>
<ISSNLinking>0196-9781</ISSNLinking>
</MedlineJournalInfo>
<ChemicalList><Chemical><RegistryNumber>0</RegistryNumber>
<NameOfSubstance UI="D003062">Codon</NameOfSubstance>
</Chemical>
<Chemical><RegistryNumber>0</RegistryNumber>
<NameOfSubstance UI="D010455">Peptides</NameOfSubstance>
</Chemical>
<Chemical><RegistryNumber>0</RegistryNumber>
<NameOfSubstance UI="D020543">Proteome</NameOfSubstance>
</Chemical>
</ChemicalList>
<CitationSubset>IM</CitationSubset>
<MeshHeadingList><MeshHeading><DescriptorName UI="D000595" MajorTopicYN="N">Amino Acid Sequence</DescriptorName>
</MeshHeading>
<MeshHeading><DescriptorName UI="D003062" MajorTopicYN="Y">Codon</DescriptorName>
</MeshHeading>
<MeshHeading><DescriptorName UI="D019295" MajorTopicYN="N">Computational Biology</DescriptorName>
</MeshHeading>
<MeshHeading><DescriptorName UI="D030562" MajorTopicYN="N">Databases, Protein</DescriptorName>
</MeshHeading>
<MeshHeading><DescriptorName UI="D006801" MajorTopicYN="N">Humans</DescriptorName>
</MeshHeading>
<MeshHeading><DescriptorName UI="D008969" MajorTopicYN="N">Molecular Sequence Data</DescriptorName>
</MeshHeading>
<MeshHeading><DescriptorName UI="D010455" MajorTopicYN="N">Peptides</DescriptorName>
<QualifierName UI="Q000235" MajorTopicYN="Y">genetics</QualifierName>
</MeshHeading>
<MeshHeading><DescriptorName UI="D020543" MajorTopicYN="Y">Proteome</DescriptorName>
<QualifierName UI="Q000032" MajorTopicYN="N">analysis</QualifierName>
<QualifierName UI="Q000235" MajorTopicYN="N">genetics</QualifierName>
</MeshHeading>
<MeshHeading><DescriptorName UI="D020539" MajorTopicYN="N">Sequence Analysis, Protein</DescriptorName>
</MeshHeading>
</MeshHeadingList>
</MedlineCitation>
<PubmedData><History><PubMedPubDate PubStatus="received"><Year>2009</Year>
<Month>04</Month>
<Day>21</Day>
</PubMedPubDate>
<PubMedPubDate PubStatus="revised"><Year>2009</Year>
<Month>06</Month>
<Day>29</Day>
</PubMedPubDate>
<PubMedPubDate PubStatus="accepted"><Year>2009</Year>
<Month>06</Month>
<Day>30</Day>
</PubMedPubDate>
<PubMedPubDate PubStatus="entrez"><Year>2009</Year>
<Month>7</Month>
<Day>14</Day>
<Hour>9</Hour>
<Minute>0</Minute>
</PubMedPubDate>
<PubMedPubDate PubStatus="pubmed"><Year>2009</Year>
<Month>7</Month>
<Day>14</Day>
<Hour>9</Hour>
<Minute>0</Minute>
</PubMedPubDate>
<PubMedPubDate PubStatus="medline"><Year>2010</Year>
<Month>2</Month>
<Day>2</Day>
<Hour>6</Hour>
<Minute>0</Minute>
</PubMedPubDate>
</History>
<PublicationStatus>ppublish</PublicationStatus>
<ArticleIdList><ArticleId IdType="pubmed">19591891</ArticleId>
<ArticleId IdType="pii">S0196-9781(09)00272-1</ArticleId>
<ArticleId IdType="doi">10.1016/j.peptides.2009.06.035</ArticleId>
</ArticleIdList>
</PubmedData>
</pubmed>
<affiliations><list><country><li>Canada</li>
</country>
</list>
<tree><noCountry><name sortKey="Bickis, Mik" sort="Bickis, Mik" uniqKey="Bickis M" first="Mik" last="Bickis">Mik Bickis</name>
<name sortKey="Capone, Giovanni" sort="Capone, Giovanni" uniqKey="Capone G" first="Giovanni" last="Capone">Giovanni Capone</name>
<name sortKey="Fasano, Candida" sort="Fasano, Candida" uniqKey="Fasano C" first="Candida" last="Fasano">Candida Fasano</name>
<name sortKey="Kanduc, Darja" sort="Kanduc, Darja" uniqKey="Kanduc D" first="Darja" last="Kanduc">Darja Kanduc</name>
<name sortKey="Trost, Brett" sort="Trost, Brett" uniqKey="Trost B" first="Brett" last="Trost">Brett Trost</name>
</noCountry>
<country name="Canada"><noRegion><name sortKey="Kusalik, Anthony" sort="Kusalik, Anthony" uniqKey="Kusalik A" first="Anthony" last="Kusalik">Anthony Kusalik</name>
</noRegion>
</country>
</tree>
</affiliations>
</record>
To manipulate this document under Unix (Dilib)
EXPLOR_STEP=$WICRI_ROOT/Sante/explor/MersV1/Data/Ncbi/Merge
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000700 | SxmlIndent | more
Or
HfdSelect -h $EXPLOR_AREA/Data/Ncbi/Merge/biblio.hfd -nk 000700 | SxmlIndent | more
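Once the record has been selected, its fields can be read programmatically. The following Python sketch is an illustration only, assuming the record text resembles the XML shown above; `extract_pubmed_fields` and the abbreviated `sample` string are hypothetical and are not part of Dilib. Because the TEI part uses the `wicri:` and `nlm:` prefixes without namespace declarations, a strict XML parser rejects the whole record, so the sketch isolates the `<pubmed>` element before parsing.

```python
import xml.etree.ElementTree as ET


def extract_pubmed_fields(record_text):
    """Return PMID, DOI, title and authors from the <pubmed> part of a record."""
    start = record_text.find("<pubmed>")
    end = record_text.find("</pubmed>")
    if start == -1 or end == -1:
        raise ValueError("no <pubmed> element found")
    # Parse only the <pubmed> element, which contains no undeclared prefixes.
    root = ET.fromstring(record_text[start:end + len("</pubmed>")])
    citation = root.find("MedlineCitation")
    article = citation.find("Article")
    doi = None
    for eloc in article.findall("ELocationID"):
        if eloc.get("EIdType") == "doi":
            doi = eloc.text
    authors = [
        f"{a.findtext('ForeName')} {a.findtext('LastName')}"
        for a in article.findall("AuthorList/Author")
    ]
    return {
        "pmid": citation.findtext("PMID"),
        "doi": doi,
        "title": article.findtext("ArticleTitle"),
        "authors": authors,
    }


# Abbreviated stand-in for the record above (hypothetical test input).
sample = (
    '<record><TEI><teiHeader><affiliation wicri:level="1"/></teiHeader></TEI>'
    '<pubmed><MedlineCitation Status="MEDLINE" Owner="NLM">'
    '<PMID Version="1">19591891</PMID>'
    '<Article PubModel="Print-Electronic">'
    '<ArticleTitle>Codon number shapes peptide redundancy in the '
    'universal proteome composition.</ArticleTitle>'
    '<ELocationID EIdType="doi" ValidYN="Y">10.1016/j.peptides.2009.06.035'
    '</ELocationID>'
    '<AuthorList CompleteYN="Y">'
    '<Author ValidYN="Y"><LastName>Kusalik</LastName>'
    '<ForeName>Anthony</ForeName></Author>'
    '<Author ValidYN="Y"><LastName>Trost</LastName>'
    '<ForeName>Brett</ForeName></Author>'
    '</AuthorList></Article></MedlineCitation></pubmed></record>'
)
fields = extract_pubmed_fields(sample)
print(fields)
```

In practice the output of the selection command could be saved to a file and passed to the function as a string; how the text reaches Python is left open here.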
To link to this page within the Wicri network
{{Explor lien |wiki= Sante |area= MersV1 |flux= Ncbi |étape= Merge |type= RBID |clé= pubmed:19591891 |texte= Codon number shapes peptide redundancy in the universal proteome composition. }}
To generate wiki pages
HfdIndexSelect -h $EXPLOR_AREA/Data/Ncbi/Merge/RBID.i -Sk "pubmed:19591891" \
       | HfdSelect -Kh $EXPLOR_AREA/Data/Ncbi/Merge/biblio.hfd \
       | NlmPubMed2Wicri -a MersV1
This area was generated with Dilib version V0.6.33.