MERS exploration server

Warning: this site is under development!
Warning: this site was generated automatically from raw corpora.
The information has therefore not been validated.

Efficient Mining Multi-mers in a Variety of Biological Sequences.

Internal identifier: 000840 (PubMed/Corpus); previous: 000839; next: 000841

Authors: Jingsong Zhang; Jianmei Guo; Ming Zhang; Xiangtian Yu; Xiaoqing Yu; Weifeng Guo; Tao Zeng; Luonan Chen

Source:

RBID : pubmed:29993642

Abstract

Counting the occurrence frequency of each k-mer in a biological sequence is a preliminary yet important step in many bioinformatics applications. However, most k-mer counting algorithms rely on a given k to produce single-length k-mers, which is inefficient for sequence analysis for different k. Moreover, existing k-mer counters focus more on DNA and RNA sequences and less on protein ones. In practice, the analysis of k-mers in protein sequences can provide substantial biological insights in structure, function and evolution. To this end, an efficient algorithm, called MulMer (Multiple-Mer mining), is proposed to mine k-mers of various lengths termed multi-mers via inverted-index technique, which is orders of magnitude faster than the conventional forward-index methods. Moreover, to the best of our knowledge, MulMer is the first able to mine multi-mers in a variety of sequences, including DNA, RNA and protein sequences.
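
The abstract describes mining k-mers of several lengths through an inverted index, that is, a map from each k-mer to the positions where it occurs. The sketch below illustrates that general idea only; it is not the MulMer algorithm, whose details are not given in this record. The function name `mine_multi_mers` and the parameters `k_min` and `k_max` are assumptions made for illustration. The index for length k + 1 is derived from the index for length k by extending each recorded position by one character, so the sequence is not rescanned for every k, and the same code applies to DNA, RNA or protein alphabets.

```python
from collections import defaultdict


def mine_multi_mers(sequence, k_min, k_max):
    """Count every k-mer with k_min <= k <= k_max (hypothetical helper).

    Illustrative inverted-index sketch, not the published MulMer method.
    """
    if k_min < 1 or k_max < k_min:
        raise ValueError("require 1 <= k_min <= k_max")

    n = len(sequence)

    # Inverted index for the shortest length: k-mer -> list of start positions.
    level = defaultdict(list)
    for start in range(n - k_min + 1):
        level[sequence[start:start + k_min]].append(start)

    counts = {}
    k = k_min
    while level:
        # The count of a k-mer is the length of its position list.
        for mer, positions in level.items():
            counts[mer] = len(positions)
        if k == k_max:
            break

        # Build the (k + 1)-mer index by extending each occurrence by one character.
        next_level = defaultdict(list)
        for mer, positions in level.items():
            for start in positions:
                end = start + k
                if end < n:
                    next_level[mer + sequence[end]].append(start)
        level = next_level
        k += 1

    return counts
```

For example, `mine_multi_mers("ACGTACGT", 2, 3)` reports `AC` twice and `GTA` once, and the protein-like string `"MKVMKV"` can be passed in exactly the same way.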

DOI: 10.1109/TCBB.2018.2828313
PubMed: 29993642

Links to Exploration step

pubmed:29993642

The document in XML format

<record>
<TEI>
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en">Efficient Mining Multi-mers in a Variety of Biological Sequences.</title>
<author>
<name sortKey="Zhang, Jingsong" sort="Zhang, Jingsong" uniqKey="Zhang J" first="Jingsong" last="Zhang">Jingsong Zhang</name>
</author>
<author>
<name sortKey="Guo, Jianmei" sort="Guo, Jianmei" uniqKey="Guo J" first="Jianmei" last="Guo">Jianmei Guo</name>
</author>
<author>
<name sortKey="Zhang, Ming" sort="Zhang, Ming" uniqKey="Zhang M" first="Ming" last="Zhang">Ming Zhang</name>
</author>
<author>
<name sortKey="Yu, Xiangtian" sort="Yu, Xiangtian" uniqKey="Yu X" first="Xiangtian" last="Yu">Xiangtian Yu</name>
</author>
<author>
<name sortKey="Yu, Xiaoqing" sort="Yu, Xiaoqing" uniqKey="Yu X" first="Xiaoqing" last="Yu">Xiaoqing Yu</name>
</author>
<author>
<name sortKey="Guo, Weifeng" sort="Guo, Weifeng" uniqKey="Guo W" first="Weifeng" last="Guo">Weifeng Guo</name>
</author>
<author>
<name sortKey="Zeng, Tao" sort="Zeng, Tao" uniqKey="Zeng T" first="Tao" last="Zeng">Tao Zeng</name>
</author>
<author>
<name sortKey="Chen, Luonan" sort="Chen, Luonan" uniqKey="Chen L" first="Luonan" last="Chen">Luonan Chen</name>
</author>
</titleStmt>
<publicationStmt>
<idno type="wicri:source">PubMed</idno>
<date when="2018">2018</date>
<idno type="RBID">pubmed:29993642</idno>
<idno type="pmid">29993642</idno>
<idno type="doi">10.1109/TCBB.2018.2828313</idno>
<idno type="wicri:Area/PubMed/Corpus">000840</idno>
<idno type="wicri:explorRef" wicri:stream="PubMed" wicri:step="Corpus" wicri:corpus="PubMed">000840</idno>
</publicationStmt>
<sourceDesc>
<biblStruct>
<analytic>
<title xml:lang="en">Efficient Mining Multi-mers in a Variety of Biological Sequences.</title>
<author>
<name sortKey="Zhang, Jingsong" sort="Zhang, Jingsong" uniqKey="Zhang J" first="Jingsong" last="Zhang">Jingsong Zhang</name>
</author>
<author>
<name sortKey="Guo, Jianmei" sort="Guo, Jianmei" uniqKey="Guo J" first="Jianmei" last="Guo">Jianmei Guo</name>
</author>
<author>
<name sortKey="Zhang, Ming" sort="Zhang, Ming" uniqKey="Zhang M" first="Ming" last="Zhang">Ming Zhang</name>
</author>
<author>
<name sortKey="Yu, Xiangtian" sort="Yu, Xiangtian" uniqKey="Yu X" first="Xiangtian" last="Yu">Xiangtian Yu</name>
</author>
<author>
<name sortKey="Yu, Xiaoqing" sort="Yu, Xiaoqing" uniqKey="Yu X" first="Xiaoqing" last="Yu">Xiaoqing Yu</name>
</author>
<author>
<name sortKey="Guo, Weifeng" sort="Guo, Weifeng" uniqKey="Guo W" first="Weifeng" last="Guo">Weifeng Guo</name>
</author>
<author>
<name sortKey="Zeng, Tao" sort="Zeng, Tao" uniqKey="Zeng T" first="Tao" last="Zeng">Tao Zeng</name>
</author>
<author>
<name sortKey="Chen, Luonan" sort="Chen, Luonan" uniqKey="Chen L" first="Luonan" last="Chen">Luonan Chen</name>
</author>
</analytic>
<series>
<title level="j">IEEE/ACM transactions on computational biology and bioinformatics</title>
<idno type="eISSN">1557-9964</idno>
<imprint>
<date when="2018" type="published">2018</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
</fileDesc>
<profileDesc>
<textClass></textClass>
</profileDesc>
</teiHeader>
<front>
<div type="abstract" xml:lang="en">Counting the occurrence frequency of each k-mer in a biological sequence is a preliminary yet important step in many bioinformatics applications. However, most k-mer counting algorithms rely on a given k to produce single-length k-mers, which is inefficient for sequence analysis for different k. Moreover, existing k-mer counters focus more on DNA and RNA sequences and less on protein ones. In practice, the analysis of k-mers in protein sequences can provide substantial biological insights in structure, function and evolution. To this end, an efficient algorithm, called MulMer (Multiple-Mer mining), is proposed to mine k-mers of various lengths termed multi-mers via inverted-index technique, which is orders of magnitude faster than the conventional forward-index methods. Moreover, to the best of our knowledge, MulMer is the first able to mine multi-mers in a variety of sequences, including DNA, RNA and protein sequences.</div>
</front>
</TEI>
<pubmed>
<MedlineCitation Status="Publisher" Owner="NLM">
<PMID Version="1">29993642</PMID>
<DateRevised>
<Year>2019</Year>
<Month>11</Month>
<Day>20</Day>
</DateRevised>
<Article PubModel="Print-Electronic">
<Journal>
<ISSN IssnType="Electronic">1557-9964</ISSN>
<JournalIssue CitedMedium="Internet">
<PubDate>
<Year>2018</Year>
<Month>Apr</Month>
<Day>19</Day>
</PubDate>
</JournalIssue>
<Title>IEEE/ACM transactions on computational biology and bioinformatics</Title>
<ISOAbbreviation>IEEE/ACM Trans Comput Biol Bioinform</ISOAbbreviation>
</Journal>
<ArticleTitle>Efficient Mining Multi-mers in a Variety of Biological Sequences.</ArticleTitle>
<ELocationID EIdType="doi" ValidYN="Y">10.1109/TCBB.2018.2828313</ELocationID>
<Abstract>
<AbstractText>Counting the occurrence frequency of each k-mer in a biological sequence is a preliminary yet important step in many bioinformatics applications. However, most k-mer counting algorithms rely on a given k to produce single-length k-mers, which is inefficient for sequence analysis for different k. Moreover, existing k-mer counters focus more on DNA and RNA sequences and less on protein ones. In practice, the analysis of k-mers in protein sequences can provide substantial biological insights in structure, function and evolution. To this end, an efficient algorithm, called MulMer (Multiple-Mer mining), is proposed to mine k-mers of various lengths termed multi-mers via inverted-index technique, which is orders of magnitude faster than the conventional forward-index methods. Moreover, to the best of our knowledge, MulMer is the first able to mine multi-mers in a variety of sequences, including DNA, RNA and protein sequences.</AbstractText>
</Abstract>
<AuthorList CompleteYN="Y">
<Author ValidYN="Y">
<LastName>Zhang</LastName>
<ForeName>Jingsong</ForeName>
<Initials>J</Initials>
</Author>
<Author ValidYN="Y">
<LastName>Guo</LastName>
<ForeName>Jianmei</ForeName>
<Initials>J</Initials>
</Author>
<Author ValidYN="Y">
<LastName>Zhang</LastName>
<ForeName>Ming</ForeName>
<Initials>M</Initials>
</Author>
<Author ValidYN="Y">
<LastName>Yu</LastName>
<ForeName>Xiangtian</ForeName>
<Initials>X</Initials>
</Author>
<Author ValidYN="Y">
<LastName>Yu</LastName>
<ForeName>Xiaoqing</ForeName>
<Initials>X</Initials>
</Author>
<Author ValidYN="Y">
<LastName>Guo</LastName>
<ForeName>Weifeng</ForeName>
<Initials>W</Initials>
</Author>
<Author ValidYN="Y">
<LastName>Zeng</LastName>
<ForeName>Tao</ForeName>
<Initials>T</Initials>
</Author>
<Author ValidYN="Y">
<LastName>Chen</LastName>
<ForeName>Luonan</ForeName>
<Initials>L</Initials>
</Author>
</AuthorList>
<Language>eng</Language>
<PublicationTypeList>
<PublicationType UI="D016428">Journal Article</PublicationType>
</PublicationTypeList>
<ArticleDate DateType="Electronic">
<Year>2018</Year>
<Month>04</Month>
<Day>19</Day>
</ArticleDate>
</Article>
<MedlineJournalInfo>
<Country>United States</Country>
<MedlineTA>IEEE/ACM Trans Comput Biol Bioinform</MedlineTA>
<NlmUniqueID>101196755</NlmUniqueID>
<ISSNLinking>1545-5963</ISSNLinking>
</MedlineJournalInfo>
</MedlineCitation>
<PubmedData>
<History>
<PubMedPubDate PubStatus="entrez">
<Year>2018</Year>
<Month>7</Month>
<Day>12</Day>
<Hour>6</Hour>
<Minute>0</Minute>
</PubMedPubDate>
<PubMedPubDate PubStatus="pubmed">
<Year>2018</Year>
<Month>7</Month>
<Day>12</Day>
<Hour>6</Hour>
<Minute>0</Minute>
</PubMedPubDate>
<PubMedPubDate PubStatus="medline">
<Year>2018</Year>
<Month>7</Month>
<Day>12</Day>
<Hour>6</Hour>
<Minute>0</Minute>
</PubMedPubDate>
</History>
<PublicationStatus>aheadofprint</PublicationStatus>
<ArticleIdList>
<ArticleId IdType="pubmed">29993642</ArticleId>
<ArticleId IdType="doi">10.1109/TCBB.2018.2828313</ArticleId>
</ArticleIdList>
</PubmedData>
</pubmed>
</record>
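
As a rough illustration of how the `<pubmed>` part of such a record might be read programmatically, the sketch below parses a trimmed excerpt with Python's standard `xml.etree.ElementTree` module. The helper name `summarize` and the constant `EXCERPT` are assumptions made for this example, and the excerpt keeps only two of the eight authors for brevity. Note that the TEI part of the record, as displayed here, uses `wicri:`-prefixed attributes with no visible namespace declaration, so a namespace-aware parser would probably reject the full record unless that prefix is declared or removed first.

```python
import xml.etree.ElementTree as ET

# Trimmed excerpt of the <pubmed> element shown above (illustrative only).
EXCERPT = """<pubmed>
<MedlineCitation Status="Publisher" Owner="NLM">
<PMID Version="1">29993642</PMID>
<Article PubModel="Print-Electronic">
<ArticleTitle>Efficient Mining Multi-mers in a Variety of Biological Sequences.</ArticleTitle>
<ELocationID EIdType="doi" ValidYN="Y">10.1109/TCBB.2018.2828313</ELocationID>
<AuthorList CompleteYN="Y">
<Author ValidYN="Y"><LastName>Zhang</LastName><ForeName>Jingsong</ForeName><Initials>J</Initials></Author>
<Author ValidYN="Y"><LastName>Chen</LastName><ForeName>Luonan</ForeName><Initials>L</Initials></Author>
</AuthorList>
</Article>
</MedlineCitation>
</pubmed>"""


def summarize(xml_text):
    """Return the PMID, title, DOI and author names of a <pubmed> element."""
    root = ET.fromstring(xml_text)
    article = root.find("./MedlineCitation/Article")
    return {
        "pmid": root.findtext("./MedlineCitation/PMID"),
        "title": article.findtext("ArticleTitle"),
        # Select the electronic location whose type is "doi".
        "doi": article.findtext("ELocationID[@EIdType='doi']"),
        "authors": [
            f"{author.findtext('ForeName')} {author.findtext('LastName')}"
            for author in article.iterfind("AuthorList/Author")
        ],
    }


if __name__ == "__main__":
    print(summarize(EXCERPT))
```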

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Sante/explor/MersV1/Data/PubMed/Corpus
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000840 | SxmlIndent | more

Or

HfdSelect -h $EXPLOR_AREA/Data/PubMed/Corpus/biblio.hfd -nk 000840 | SxmlIndent | more

To link to this page within the Wicri network

{{Explor lien
   |wiki=    Sante
   |area=    MersV1
   |flux=    PubMed
   |étape=   Corpus
   |type=    RBID
   |clé=     pubmed:29993642
   |texte=   Efficient Mining Multi-mers in a Variety of Biological Sequences.
}}

To generate wiki pages

HfdIndexSelect -h $EXPLOR_AREA/Data/PubMed/Corpus/RBID.i   -Sk "pubmed:29993642" \
       | HfdSelect -Kh $EXPLOR_AREA/Data/PubMed/Corpus/biblio.hfd   \
       | NlmPubMed2Wicri -a MersV1 

Wicri

This area was generated with Dilib version V0.6.33.
Data generation: Mon Apr 20 23:26:43 2020. Site generation: Sat Mar 27 09:06:09 2021