MERS exploration server

Warning: this site is under development!
Warning: this site was generated automatically from raw corpora.
The information has therefore not been validated.

Global Vectors Representation of Protein Sequences and Its Application for Predicting Self-Interacting Proteins with Multi-Grained Cascade Forest Model.

Internal identifier: 000364 (PubMed/Corpus); previous: 000363; next: 000365

Authors: Zhan-Heng Chen; Zhu-Hong You; Wen-Bo Zhang; Yan-Bin Wang; Li Cheng; Daniyal Alghazzawi

Source: Genes [2073-4425]; 2019

RBID: pubmed:31726752

Abstract

Self-interacting proteins (SIPs) are of paramount importance in current molecular biology. A number of traditional experimental methods for identifying SIPs have been developed in the past few years. However, these methods are costly, time-consuming and inefficient, which often limits their use for SIP prediction. Computational methods have therefore emerged to meet this need. In this paper, we propose, for the first time, a novel deep learning model that incorporates a natural language processing (NLP) method to predict potential SIPs from protein sequence information. More specifically, the protein sequence is assembled de novo from k-mers. Then, we obtain the global vectors representation of each protein sequence using an NLP technique. Finally, based on known self-interacting and non-interacting proteins, a multi-grained cascade forest model is trained to predict SIPs. Comprehensive experiments on yeast and human datasets yielded accuracies of 91.45% and 93.12%, respectively. The experimental results show that amino acid semantic information is very helpful for addressing the problem of sequences containing both self-interacting and non-interacting pairs of proteins. This work has potential applications to various biological classification problems.
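
As a rough illustration of the first two steps named in the abstract, the sketch below splits a sequence into k-mers and then collects the co-occurrence counts on which global-vectors (GloVe-style) embeddings are usually fitted. The values k=3 and window=2, the stride of 1, the helper names and the toy sequence are all assumptions made for illustration; the abstract does not state them, and this is not the authors' implementation.

```python
from collections import Counter


def kmers(sequence, k=3):
    """Return the overlapping k-mers of a sequence (stride 1)."""
    return [sequence[i:i + k] for i in range(len(sequence) - k + 1)]


def cooccurrence(tokens, window=2):
    """Count ordered token pairs that appear within `window` positions of each other."""
    counts = Counter()
    for i, left in enumerate(tokens):
        # Look only at the next `window` tokens to the right of position i.
        for right in tokens[i + 1:i + 1 + window]:
            counts[(left, right)] += 1
    return counts


# Toy amino-acid sequence, invented for illustration only.
tokens = kmers("MKVLAAGIV", k=3)
pairs = cooccurrence(tokens, window=2)
print(tokens)
print(pairs[("MKV", "KVL")])
```

An embedding model would then be fitted to counts like `pairs`, and the resulting per-k-mer vectors combined into one feature vector per protein before classification; those later steps are not shown here.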

DOI: 10.3390/genes10110924
PubMed: 31726752

The document in XML format

<record>
<TEI>
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en">Global Vectors Representation of Protein Sequences and Its Application for Predicting Self-Interacting Proteins with Multi-Grained Cascade Forest Model.</title>
<author>
<name sortKey="Chen, Zhan Heng" sort="Chen, Zhan Heng" uniqKey="Chen Z" first="Zhan-Heng" last="Chen">Zhan-Heng Chen</name>
<affiliation>
<nlm:affiliation>The Xinjiang Technical Institute of Physics and Chemistry, Chinese Academy of Sciences, Urumqi 830011, China.</nlm:affiliation>
</affiliation>
</author>
<author>
<name sortKey="You, Zhu Hong" sort="You, Zhu Hong" uniqKey="You Z" first="Zhu-Hong" last="You">Zhu-Hong You</name>
<affiliation>
<nlm:affiliation>The Xinjiang Technical Institute of Physics and Chemistry, Chinese Academy of Sciences, Urumqi 830011, China.</nlm:affiliation>
</affiliation>
</author>
<author>
<name sortKey="Zhang, Wen Bo" sort="Zhang, Wen Bo" uniqKey="Zhang W" first="Wen-Bo" last="Zhang">Wen-Bo Zhang</name>
<affiliation>
<nlm:affiliation>The Xinjiang Technical Institute of Physics and Chemistry, Chinese Academy of Sciences, Urumqi 830011, China.</nlm:affiliation>
</affiliation>
</author>
<author>
<name sortKey="Wang, Yan Bin" sort="Wang, Yan Bin" uniqKey="Wang Y" first="Yan-Bin" last="Wang">Yan-Bin Wang</name>
<affiliation>
<nlm:affiliation>The Xinjiang Technical Institute of Physics and Chemistry, Chinese Academy of Sciences, Urumqi 830011, China.</nlm:affiliation>
</affiliation>
</author>
<author>
<name sortKey="Cheng, Li" sort="Cheng, Li" uniqKey="Cheng L" first="Li" last="Cheng">Li Cheng</name>
<affiliation>
<nlm:affiliation>The Xinjiang Technical Institute of Physics and Chemistry, Chinese Academy of Sciences, Urumqi 830011, China.</nlm:affiliation>
</affiliation>
</author>
<author>
<name sortKey="Alghazzawi, Daniyal" sort="Alghazzawi, Daniyal" uniqKey="Alghazzawi D" first="Daniyal" last="Alghazzawi">Daniyal Alghazzawi</name>
<affiliation>
<nlm:affiliation>Department of Information Systems, King Abdulaziz University, Jeddah 21589, Saudi Arabia.</nlm:affiliation>
</affiliation>
</author>
</titleStmt>
<publicationStmt>
<idno type="wicri:source">PubMed</idno>
<date when="2019">2019</date>
<idno type="RBID">pubmed:31726752</idno>
<idno type="pmid">31726752</idno>
<idno type="doi">10.3390/genes10110924</idno>
<idno type="wicri:Area/PubMed/Corpus">000364</idno>
<idno type="wicri:explorRef" wicri:stream="PubMed" wicri:step="Corpus" wicri:corpus="PubMed">000364</idno>
</publicationStmt>
<sourceDesc>
<biblStruct>
<analytic>
<title xml:lang="en">Global Vectors Representation of Protein Sequences and Its Application for Predicting Self-Interacting Proteins with Multi-Grained Cascade Forest Model.</title>
<author>
<name sortKey="Chen, Zhan Heng" sort="Chen, Zhan Heng" uniqKey="Chen Z" first="Zhan-Heng" last="Chen">Zhan-Heng Chen</name>
<affiliation>
<nlm:affiliation>The Xinjiang Technical Institute of Physics and Chemistry, Chinese Academy of Sciences, Urumqi 830011, China.</nlm:affiliation>
</affiliation>
</author>
<author>
<name sortKey="You, Zhu Hong" sort="You, Zhu Hong" uniqKey="You Z" first="Zhu-Hong" last="You">Zhu-Hong You</name>
<affiliation>
<nlm:affiliation>The Xinjiang Technical Institute of Physics and Chemistry, Chinese Academy of Sciences, Urumqi 830011, China.</nlm:affiliation>
</affiliation>
</author>
<author>
<name sortKey="Zhang, Wen Bo" sort="Zhang, Wen Bo" uniqKey="Zhang W" first="Wen-Bo" last="Zhang">Wen-Bo Zhang</name>
<affiliation>
<nlm:affiliation>The Xinjiang Technical Institute of Physics and Chemistry, Chinese Academy of Sciences, Urumqi 830011, China.</nlm:affiliation>
</affiliation>
</author>
<author>
<name sortKey="Wang, Yan Bin" sort="Wang, Yan Bin" uniqKey="Wang Y" first="Yan-Bin" last="Wang">Yan-Bin Wang</name>
<affiliation>
<nlm:affiliation>The Xinjiang Technical Institute of Physics and Chemistry, Chinese Academy of Sciences, Urumqi 830011, China.</nlm:affiliation>
</affiliation>
</author>
<author>
<name sortKey="Cheng, Li" sort="Cheng, Li" uniqKey="Cheng L" first="Li" last="Cheng">Li Cheng</name>
<affiliation>
<nlm:affiliation>The Xinjiang Technical Institute of Physics and Chemistry, Chinese Academy of Sciences, Urumqi 830011, China.</nlm:affiliation>
</affiliation>
</author>
<author>
<name sortKey="Alghazzawi, Daniyal" sort="Alghazzawi, Daniyal" uniqKey="Alghazzawi D" first="Daniyal" last="Alghazzawi">Daniyal Alghazzawi</name>
<affiliation>
<nlm:affiliation>Department of Information Systems, King Abdulaziz University, Jeddah 21589, Saudi Arabia.</nlm:affiliation>
</affiliation>
</author>
</analytic>
<series>
<title level="j">Genes</title>
<idno type="eISSN">2073-4425</idno>
<imprint>
<date when="2019" type="published">2019</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
</fileDesc>
<profileDesc>
<textClass></textClass>
</profileDesc>
</teiHeader>
<front>
<div type="abstract" xml:lang="en">Self-interacting proteins (SIPs) is of paramount importance in current molecular biology. There have been developed a number of traditional biological experiment methods for predicting SIPs in the past few years. However, these methods are costly, time-consuming and inefficient, and often limit their usage for predicting SIPs. Therefore, the development of computational method emerges at the times require. In this paper, we for the first time proposed a novel deep learning model which combined natural language processing (NLP) method for potential SIPs prediction from the protein sequence information. More specifically, the protein sequence is de novo assembled by
<i>k-</i>
<i>mers</i>
. Then, we obtained the global vectors representation for each protein sequences by using natural language processing (NLP) technique. Finally, based on the knowledge of known self-interacting and non-interacting proteins, a multi-grained cascade forest model is trained to predict SIPs. Comprehensive experiments were performed on
<i>yeast</i>
and
<i>human</i>
datasets, which obtained an accuracy rate of 91.45% and 93.12%, respectively. From our evaluations, the experimental results show that the use of amino acid semantics information is very helpful for addressing the problem of sequences containing both self-interacting and non-interacting pairs of proteins. This work would have potential applications for various biological classification problems.</div>
</front>
</TEI>
<pubmed>
<MedlineCitation Status="In-Process" Owner="NLM">
<PMID Version="1">31726752</PMID>
<DateRevised>
<Year>2020</Year>
<Month>02</Month>
<Day>05</Day>
</DateRevised>
<Article PubModel="Electronic">
<Journal>
<ISSN IssnType="Electronic">2073-4425</ISSN>
<JournalIssue CitedMedium="Internet">
<Volume>10</Volume>
<Issue>11</Issue>
<PubDate>
<Year>2019</Year>
<Month>11</Month>
<Day>12</Day>
</PubDate>
</JournalIssue>
<Title>Genes</Title>
<ISOAbbreviation>Genes (Basel)</ISOAbbreviation>
</Journal>
<ArticleTitle>Global Vectors Representation of Protein Sequences and Its Application for Predicting Self-Interacting Proteins with Multi-Grained Cascade Forest Model.</ArticleTitle>
<ELocationID EIdType="pii" ValidYN="Y">E924</ELocationID>
<ELocationID EIdType="doi" ValidYN="Y">10.3390/genes10110924</ELocationID>
<Abstract>
<AbstractText>Self-interacting proteins (SIPs) is of paramount importance in current molecular biology. There have been developed a number of traditional biological experiment methods for predicting SIPs in the past few years. However, these methods are costly, time-consuming and inefficient, and often limit their usage for predicting SIPs. Therefore, the development of computational method emerges at the times require. In this paper, we for the first time proposed a novel deep learning model which combined natural language processing (NLP) method for potential SIPs prediction from the protein sequence information. More specifically, the protein sequence is de novo assembled by
<i>k-</i>
<i>mers</i>
. Then, we obtained the global vectors representation for each protein sequences by using natural language processing (NLP) technique. Finally, based on the knowledge of known self-interacting and non-interacting proteins, a multi-grained cascade forest model is trained to predict SIPs. Comprehensive experiments were performed on
<i>yeast</i>
and
<i>human</i>
datasets, which obtained an accuracy rate of 91.45% and 93.12%, respectively. From our evaluations, the experimental results show that the use of amino acid semantics information is very helpful for addressing the problem of sequences containing both self-interacting and non-interacting pairs of proteins. This work would have potential applications for various biological classification problems.</AbstractText>
</Abstract>
<AuthorList CompleteYN="Y">
<Author ValidYN="Y">
<LastName>Chen</LastName>
<ForeName>Zhan-Heng</ForeName>
<Initials>ZH</Initials>
<AffiliationInfo>
<Affiliation>The Xinjiang Technical Institute of Physics and Chemistry, Chinese Academy of Sciences, Urumqi 830011, China.</Affiliation>
</AffiliationInfo>
<AffiliationInfo>
<Affiliation>University of Chinese Academy of Sciences, Beijing 100049, China.</Affiliation>
</AffiliationInfo>
</Author>
<Author ValidYN="Y">
<LastName>You</LastName>
<ForeName>Zhu-Hong</ForeName>
<Initials>ZH</Initials>
<AffiliationInfo>
<Affiliation>The Xinjiang Technical Institute of Physics and Chemistry, Chinese Academy of Sciences, Urumqi 830011, China.</Affiliation>
</AffiliationInfo>
<AffiliationInfo>
<Affiliation>University of Chinese Academy of Sciences, Beijing 100049, China.</Affiliation>
</AffiliationInfo>
</Author>
<Author ValidYN="Y">
<LastName>Zhang</LastName>
<ForeName>Wen-Bo</ForeName>
<Initials>WB</Initials>
<AffiliationInfo>
<Affiliation>The Xinjiang Technical Institute of Physics and Chemistry, Chinese Academy of Sciences, Urumqi 830011, China.</Affiliation>
</AffiliationInfo>
<AffiliationInfo>
<Affiliation>University of Chinese Academy of Sciences, Beijing 100049, China.</Affiliation>
</AffiliationInfo>
</Author>
<Author ValidYN="Y">
<LastName>Wang</LastName>
<ForeName>Yan-Bin</ForeName>
<Initials>YB</Initials>
<AffiliationInfo>
<Affiliation>The Xinjiang Technical Institute of Physics and Chemistry, Chinese Academy of Sciences, Urumqi 830011, China.</Affiliation>
</AffiliationInfo>
</Author>
<Author ValidYN="Y">
<LastName>Cheng</LastName>
<ForeName>Li</ForeName>
<Initials>L</Initials>
<AffiliationInfo>
<Affiliation>The Xinjiang Technical Institute of Physics and Chemistry, Chinese Academy of Sciences, Urumqi 830011, China.</Affiliation>
</AffiliationInfo>
<AffiliationInfo>
<Affiliation>University of Chinese Academy of Sciences, Beijing 100049, China.</Affiliation>
</AffiliationInfo>
</Author>
<Author ValidYN="Y">
<LastName>Alghazzawi</LastName>
<ForeName>Daniyal</ForeName>
<Initials>D</Initials>
<AffiliationInfo>
<Affiliation>Department of Information Systems, King Abdulaziz University, Jeddah 21589, Saudi Arabia.</Affiliation>
</AffiliationInfo>
</Author>
</AuthorList>
<Language>eng</Language>
<PublicationTypeList>
<PublicationType UI="D016428">Journal Article</PublicationType>
<PublicationType UI="D013485">Research Support, Non-U.S. Gov't</PublicationType>
</PublicationTypeList>
<ArticleDate DateType="Electronic">
<Year>2019</Year>
<Month>11</Month>
<Day>12</Day>
</ArticleDate>
</Article>
<MedlineJournalInfo>
<Country>Switzerland</Country>
<MedlineTA>Genes (Basel)</MedlineTA>
<NlmUniqueID>101551097</NlmUniqueID>
<ISSNLinking>2073-4425</ISSNLinking>
</MedlineJournalInfo>
<CitationSubset>IM</CitationSubset>
<KeywordList Owner="NOTNLM">
<Keyword MajorTopicYN="Y">de novo protein sequence</Keyword>
<Keyword MajorTopicYN="Y">global vector representation</Keyword>
<Keyword MajorTopicYN="Y">multi-grained cascade forest</Keyword>
<Keyword MajorTopicYN="Y">self-interacting proteins</Keyword>
</KeywordList>
<CoiStatement>The authors declare no conflict of interest.</CoiStatement>
</MedlineCitation>
<PubmedData>
<History>
<PubMedPubDate PubStatus="received">
<Year>2019</Year>
<Month>08</Month>
<Day>31</Day>
</PubMedPubDate>
<PubMedPubDate PubStatus="revised">
<Year>2019</Year>
<Month>11</Month>
<Day>05</Day>
</PubMedPubDate>
<PubMedPubDate PubStatus="accepted">
<Year>2019</Year>
<Month>11</Month>
<Day>06</Day>
</PubMedPubDate>
<PubMedPubDate PubStatus="entrez">
<Year>2019</Year>
<Month>11</Month>
<Day>16</Day>
<Hour>6</Hour>
<Minute>0</Minute>
</PubMedPubDate>
<PubMedPubDate PubStatus="pubmed">
<Year>2019</Year>
<Month>11</Month>
<Day>16</Day>
<Hour>6</Hour>
<Minute>0</Minute>
</PubMedPubDate>
<PubMedPubDate PubStatus="medline">
<Year>2019</Year>
<Month>11</Month>
<Day>16</Day>
<Hour>6</Hour>
<Minute>0</Minute>
</PubMedPubDate>
</History>
<PublicationStatus>epublish</PublicationStatus>
<ArticleIdList>
<ArticleId IdType="pubmed">31726752</ArticleId>
<ArticleId IdType="pii">genes10110924</ArticleId>
<ArticleId IdType="doi">10.3390/genes10110924</ArticleId>
<ArticleId IdType="pmc">PMC6896115</ArticleId>
</ArticleIdList>
<ReferenceList>
<Reference>
<Citation>PLoS Comput Biol. 2007 Jul;3(7):e119</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">17630824</ArticleId>
</ArticleIdList>
</Reference>
</ReferenceList>
<ReferenceList>
<Reference>
<Citation>Nucleic Acids Res. 2004 Jan 1;32(Database issue):D449-51</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">14681454</ArticleId>
</ArticleIdList>
</Reference>
</ReferenceList>
<ReferenceList>
<Reference>
<Citation>Int J Mol Sci. 2014 Jul 18;15(7):12731-49</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">25046746</ArticleId>
</ArticleIdList>
</Reference>
</ReferenceList>
<ReferenceList>
<Reference>
<Citation>BMC Bioinformatics. 2017 May 25;18(1):277</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">28545462</ArticleId>
</ArticleIdList>
</Reference>
</ReferenceList>
<ReferenceList>
<Reference>
<Citation>Nucleic Acids Res. 2013 Jan;41(Database issue):D1228-33</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">23180781</ArticleId>
</ArticleIdList>
</Reference>
</ReferenceList>
<ReferenceList>
<Reference>
<Citation>Mol Cell Proteomics. 2013 Jun;12(6):1689-700</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">23422585</ArticleId>
</ArticleIdList>
</Reference>
</ReferenceList>
<ReferenceList>
<Reference>
<Citation>Front Genet. 2019 Mar 01;10:90</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">30881376</ArticleId>
</ArticleIdList>
</Reference>
</ReferenceList>
<ReferenceList>
<Reference>
<Citation>Genomics. 2014 Dec;104(6 Pt B):496-503</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">25458812</ArticleId>
</ArticleIdList>
</Reference>
</ReferenceList>
<ReferenceList>
<Reference>
<Citation>Nucleic Acids Res. 2014 Jan;42(Database issue):D358-63</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">24234451</ArticleId>
</ArticleIdList>
</Reference>
</ReferenceList>
<ReferenceList>
<Reference>
<Citation>BMC Syst Biol. 2018 Dec 21;12(Suppl 8):129</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">30577794</ArticleId>
</ArticleIdList>
</Reference>
</ReferenceList>
<ReferenceList>
<Reference>
<Citation>Nucleic Acids Res. 2005 Jun 27;33(11):3629-35</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">15983135</ArticleId>
</ArticleIdList>
</Reference>
</ReferenceList>
<ReferenceList>
<Reference>
<Citation>Nucleic Acids Res. 2017 Jan 4;45(D1):D369-D379</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">27980099</ArticleId>
</ArticleIdList>
</Reference>
</ReferenceList>
<ReferenceList>
<Reference>
<Citation>PLoS One. 2015 Nov 10;10(11):e0141287</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">26555596</ArticleId>
</ArticleIdList>
</Reference>
</ReferenceList>
<ReferenceList>
<Reference>
<Citation>Nucleic Acids Res. 2017 Jan 4;45(D1):D158-D169</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">27899622</ArticleId>
</ArticleIdList>
</Reference>
</ReferenceList>
<ReferenceList>
<Reference>
<Citation>Genomics. 2013 Oct;102(4):237-42</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">23747746</ArticleId>
</ArticleIdList>
</Reference>
</ReferenceList>
<ReferenceList>
<Reference>
<Citation>Methods. 2016 Jan 15;93:84-91</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">26370280</ArticleId>
</ArticleIdList>
</Reference>
</ReferenceList>
<ReferenceList>
<Reference>
<Citation>IEEE/ACM Trans Comput Biol Bioinform. 2017 Mar-Apr;14(2):345-352</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">28368812</ArticleId>
</ArticleIdList>
</Reference>
</ReferenceList>
<ReferenceList>
<Reference>
<Citation>IEEE Trans Cybern. 2017 Mar;47(3):731-743</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">28113829</ArticleId>
</ArticleIdList>
</Reference>
</ReferenceList>
<ReferenceList>
<Reference>
<Citation>Bioinformatics. 2010 Nov 1;26(21):2744-51</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">20817744</ArticleId>
</ArticleIdList>
</Reference>
</ReferenceList>
<ReferenceList>
<Reference>
<Citation>PLoS Comput Biol. 2007 Apr 27;3(4):e43</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">17465672</ArticleId>
</ArticleIdList>
</Reference>
</ReferenceList>
<ReferenceList>
<Reference>
<Citation>Nature. 2012 Oct 25;490(7421):556-60</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">23023127</ArticleId>
</ArticleIdList>
</Reference>
</ReferenceList>
<ReferenceList>
<Reference>
<Citation>J Biol. 2006;5(4):11</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">16762047</ArticleId>
</ArticleIdList>
</Reference>
</ReferenceList>
<ReferenceList>
<Reference>
<Citation>BMC Bioinformatics. 2011 Dec 22;12:489</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">22192482</ArticleId>
</ArticleIdList>
</Reference>
</ReferenceList>
<ReferenceList>
<Reference>
<Citation>Int J Mol Sci. 2019 Feb 21;20(4):null</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">30795499</ArticleId>
</ArticleIdList>
</Reference>
</ReferenceList>
<ReferenceList>
<Reference>
<Citation>Amino Acids. 2016 Jul;48(7):1655-65</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">27074717</ArticleId>
</ArticleIdList>
</Reference>
</ReferenceList>
<ReferenceList>
<Reference>
<Citation>Nat Commun. 2019 Mar 18;10(1):1240</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">30886144</ArticleId>
</ArticleIdList>
</Reference>
</ReferenceList>
<ReferenceList>
<Reference>
<Citation>Nucleic Acids Res. 2017 Jan 4;45(D1):D362-D368</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">27924014</ArticleId>
</ArticleIdList>
</Reference>
</ReferenceList>
<ReferenceList>
<Reference>
<Citation>Cells. 2019 Feb 03;8(2):</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">30717470</ArticleId>
</ArticleIdList>
</Reference>
</ReferenceList>
<ReferenceList>
<Reference>
<Citation>Nat Biotechnol. 2007 Oct;25(10):1119-26</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">17921997</ArticleId>
</ArticleIdList>
</Reference>
</ReferenceList>
<ReferenceList>
<Reference>
<Citation>Nucleic Acids Res. 2019 Jan 8;47(D1):D376-D381</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">30371822</ArticleId>
</ArticleIdList>
</Reference>
</ReferenceList>
<ReferenceList>
<Reference>
<Citation>Science. 2003 Oct 17;302(5644):449-53</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">14564010</ArticleId>
</ArticleIdList>
</Reference>
</ReferenceList>
<ReferenceList>
<Reference>
<Citation>Nat Methods. 2013 Mar;10(3):221-7</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">23353650</ArticleId>
</ArticleIdList>
</Reference>
</ReferenceList>
</PubmedData>
</pubmed>
</record>
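
To pull individual identifiers out of a record like the one above, a standard XML parser is enough for the `<pubmed>` part. The sketch below uses Python's `xml.etree.ElementTree` on a shortened copy of that part; the helper name `article_ids` is hypothetical. Note that the TEI part of this dump uses the `nlm:` and `wicri:` prefixes without declaring them, so a namespace-aware parser would probably reject the full record unless those declarations are added first.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the <pubmed> part of the record above.
FRAGMENT = """
<pubmed>
<MedlineCitation Status="In-Process" Owner="NLM">
<PMID Version="1">31726752</PMID>
</MedlineCitation>
<PubmedData>
<ArticleIdList>
<ArticleId IdType="pubmed">31726752</ArticleId>
<ArticleId IdType="pii">genes10110924</ArticleId>
<ArticleId IdType="doi">10.3390/genes10110924</ArticleId>
<ArticleId IdType="pmc">PMC6896115</ArticleId>
</ArticleIdList>
</PubmedData>
</pubmed>
"""


def article_ids(xml_text):
    """Map each IdType to its value for the article itself."""
    root = ET.fromstring(xml_text.strip())
    # Direct child of PubmedData only, so ArticleIdList elements nested
    # inside <Reference> entries are not picked up.
    id_list = root.find("PubmedData/ArticleIdList")
    return {node.get("IdType"): node.text for node in id_list.findall("ArticleId")}


ids = article_ids(FRAGMENT)
print(ids["doi"])
```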

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Sante/explor/MersV1/Data/PubMed/Corpus
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000364 | SxmlIndent | more

Or

HfdSelect -h $EXPLOR_AREA/Data/PubMed/Corpus/biblio.hfd -nk 000364 | SxmlIndent | more

To add a link to this page within the Wicri network

{{Explor lien
   |wiki=    Sante
   |area=    MersV1
   |flux=    PubMed
   |étape=   Corpus
   |type=    RBID
   |clé=     pubmed:31726752
   |texte=   Global Vectors Representation of Protein Sequences and Its Application for Predicting Self-Interacting Proteins with Multi-Grained Cascade Forest Model.
}}

To generate wiki pages

HfdIndexSelect -h $EXPLOR_AREA/Data/PubMed/Corpus/RBID.i   -Sk "pubmed:31726752" \
       | HfdSelect -Kh $EXPLOR_AREA/Data/PubMed/Corpus/biblio.hfd   \
       | NlmPubMed2Wicri -a MersV1 

Wicri

This area was generated with Dilib version V0.6.33.
Data generation: Mon Apr 20 23:26:43 2020. Site generation: Sat Mar 27 09:06:09 2021