Asymptotically optimal minimizers schemes
Identifieur interne : 001E79 ( Ncbi/Checkpoint ); précédent : 001E78; suivant : 001E80Asymptotically optimal minimizers schemes
Auteurs : Guillaume Marçais [États-Unis] ; Dan Deblasio [États-Unis] ; Carl Kingsford [États-Unis]Source :
- Bioinformatics [ 1367-4803 ] ; 2018.
Descripteurs français
- KwdFr :
- MESH :
English descriptors
- KwdEn :
- MESH :
- methods : Computational Biology.
- Algorithms.
Abstract
The minimizers technique is a method to sample
From the analysis of the asymptotic behavior of minimizers, forward and local schemes, we show that the previously believed lower bound on minimizers schemes does not hold, and that schemes with density lower than thought possible actually exist. The proof is constructive and leads to an efficient algorithm to compare
Url:
DOI: 10.1093/bioinformatics/bty258
PubMed: 29949995
PubMed Central: 6037127
Affiliations:
Links toward previous steps (curation, corpus...)
- to stream Pmc, to step Corpus: 000B22
- to stream Pmc, to step Curation: 000B22
- to stream Pmc, to step Checkpoint: 000632
- to stream PubMed, to step Corpus: 000854
- to stream PubMed, to step Curation: 000854
- to stream PubMed, to step Checkpoint: 000A07
- to stream Ncbi, to step Merge: 001E79
- to stream Ncbi, to step Curation: 001E79
Links to Exploration step
PMC:6037127Le document en format XML
<record><TEI><teiHeader><fileDesc><titleStmt><title xml:lang="en">Asymptotically optimal minimizers schemes</title>
<author><name sortKey="Marcais, Guillaume" sort="Marcais, Guillaume" uniqKey="Marcais G" first="Guillaume" last="Marçais">Guillaume Marçais</name>
<affiliation wicri:level="4"><nlm:aff id="bty258-aff1">Department of Computational Biology, Carnegie Mellon University, Pittsburgh, PA, USA</nlm:aff>
<country xml:lang="fr">États-Unis</country>
<wicri:regionArea>Department of Computational Biology, Carnegie Mellon University, Pittsburgh, PA</wicri:regionArea>
<placeName><region type="state">Pennsylvanie</region>
<settlement type="city">Pittsburgh</settlement>
</placeName>
<orgName type="university">Université Carnegie-Mellon</orgName>
</affiliation>
</author>
<author><name sortKey="Deblasio, Dan" sort="Deblasio, Dan" uniqKey="Deblasio D" first="Dan" last="Deblasio">Dan Deblasio</name>
<affiliation wicri:level="4"><nlm:aff id="bty258-aff1">Department of Computational Biology, Carnegie Mellon University, Pittsburgh, PA, USA</nlm:aff>
<country xml:lang="fr">États-Unis</country>
<wicri:regionArea>Department of Computational Biology, Carnegie Mellon University, Pittsburgh, PA</wicri:regionArea>
<placeName><region type="state">Pennsylvanie</region>
<settlement type="city">Pittsburgh</settlement>
</placeName>
<orgName type="university">Université Carnegie-Mellon</orgName>
</affiliation>
</author>
<author><name sortKey="Kingsford, Carl" sort="Kingsford, Carl" uniqKey="Kingsford C" first="Carl" last="Kingsford">Carl Kingsford</name>
<affiliation wicri:level="4"><nlm:aff id="bty258-aff1">Department of Computational Biology, Carnegie Mellon University, Pittsburgh, PA, USA</nlm:aff>
<country xml:lang="fr">États-Unis</country>
<wicri:regionArea>Department of Computational Biology, Carnegie Mellon University, Pittsburgh, PA</wicri:regionArea>
<placeName><region type="state">Pennsylvanie</region>
<settlement type="city">Pittsburgh</settlement>
</placeName>
<orgName type="university">Université Carnegie-Mellon</orgName>
</affiliation>
</author>
</titleStmt>
<publicationStmt><idno type="wicri:source">PMC</idno>
<idno type="pmid">29949995</idno>
<idno type="pmc">6037127</idno>
<idno type="url">http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6037127</idno>
<idno type="RBID">PMC:6037127</idno>
<idno type="doi">10.1093/bioinformatics/bty258</idno>
<date when="2018">2018</date>
<idno type="wicri:Area/Pmc/Corpus">000B22</idno>
<idno type="wicri:explorRef" wicri:stream="Pmc" wicri:step="Corpus" wicri:corpus="PMC">000B22</idno>
<idno type="wicri:Area/Pmc/Curation">000B22</idno>
<idno type="wicri:explorRef" wicri:stream="Pmc" wicri:step="Curation">000B22</idno>
<idno type="wicri:Area/Pmc/Checkpoint">000632</idno>
<idno type="wicri:explorRef" wicri:stream="Pmc" wicri:step="Checkpoint">000632</idno>
<idno type="wicri:source">PubMed</idno>
<idno type="RBID">pubmed:29949995</idno>
<idno type="wicri:Area/PubMed/Corpus">000854</idno>
<idno type="wicri:explorRef" wicri:stream="PubMed" wicri:step="Corpus" wicri:corpus="PubMed">000854</idno>
<idno type="wicri:Area/PubMed/Curation">000854</idno>
<idno type="wicri:explorRef" wicri:stream="PubMed" wicri:step="Curation">000854</idno>
<idno type="wicri:Area/PubMed/Checkpoint">000A07</idno>
<idno type="wicri:explorRef" wicri:stream="Checkpoint" wicri:step="PubMed">000A07</idno>
<idno type="wicri:Area/Ncbi/Merge">001E79</idno>
<idno type="wicri:Area/Ncbi/Curation">001E79</idno>
<idno type="wicri:Area/Ncbi/Checkpoint">001E79</idno>
</publicationStmt>
<sourceDesc><biblStruct><analytic><title xml:lang="en" level="a" type="main">Asymptotically optimal minimizers schemes</title>
<author><name sortKey="Marcais, Guillaume" sort="Marcais, Guillaume" uniqKey="Marcais G" first="Guillaume" last="Marçais">Guillaume Marçais</name>
<affiliation wicri:level="4"><nlm:aff id="bty258-aff1">Department of Computational Biology, Carnegie Mellon University, Pittsburgh, PA, USA</nlm:aff>
<country xml:lang="fr">États-Unis</country>
<wicri:regionArea>Department of Computational Biology, Carnegie Mellon University, Pittsburgh, PA</wicri:regionArea>
<placeName><region type="state">Pennsylvanie</region>
<settlement type="city">Pittsburgh</settlement>
</placeName>
<orgName type="university">Université Carnegie-Mellon</orgName>
</affiliation>
</author>
<author><name sortKey="Deblasio, Dan" sort="Deblasio, Dan" uniqKey="Deblasio D" first="Dan" last="Deblasio">Dan Deblasio</name>
<affiliation wicri:level="4"><nlm:aff id="bty258-aff1">Department of Computational Biology, Carnegie Mellon University, Pittsburgh, PA, USA</nlm:aff>
<country xml:lang="fr">États-Unis</country>
<wicri:regionArea>Department of Computational Biology, Carnegie Mellon University, Pittsburgh, PA</wicri:regionArea>
<placeName><region type="state">Pennsylvanie</region>
<settlement type="city">Pittsburgh</settlement>
</placeName>
<orgName type="university">Université Carnegie-Mellon</orgName>
</affiliation>
</author>
<author><name sortKey="Kingsford, Carl" sort="Kingsford, Carl" uniqKey="Kingsford C" first="Carl" last="Kingsford">Carl Kingsford</name>
<affiliation wicri:level="4"><nlm:aff id="bty258-aff1">Department of Computational Biology, Carnegie Mellon University, Pittsburgh, PA, USA</nlm:aff>
<country xml:lang="fr">États-Unis</country>
<wicri:regionArea>Department of Computational Biology, Carnegie Mellon University, Pittsburgh, PA</wicri:regionArea>
<placeName><region type="state">Pennsylvanie</region>
<settlement type="city">Pittsburgh</settlement>
</placeName>
<orgName type="university">Université Carnegie-Mellon</orgName>
</affiliation>
</author>
</analytic>
<series><title level="j">Bioinformatics</title>
<idno type="ISSN">1367-4803</idno>
<idno type="eISSN">1367-4811</idno>
<imprint><date when="2018">2018</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
</fileDesc>
<profileDesc><textClass><keywords scheme="KwdEn" xml:lang="en"><term>Algorithms</term>
<term>Computational Biology (methods)</term>
</keywords>
<keywords scheme="KwdFr" xml:lang="fr"><term>Algorithmes</term>
<term>Biologie informatique ()</term>
</keywords>
<keywords scheme="MESH" qualifier="methods" xml:lang="en"><term>Computational Biology</term>
</keywords>
<keywords scheme="MESH" xml:lang="en"><term>Algorithms</term>
</keywords>
<keywords scheme="MESH" xml:lang="fr"><term>Algorithmes</term>
<term>Biologie informatique</term>
</keywords>
</textClass>
</profileDesc>
</teiHeader>
<front><div type="abstract" xml:lang="en"><title>Abstract</title>
<sec id="s1"><title>Motivation</title>
<p>The minimizers technique is a method to sample <italic>k</italic>
-mers that is used in many bioinformatics software to reduce computation, memory usage and run time. The number of applications using minimizers keeps on growing steadily. Despite its many uses, the theoretical understanding of minimizers is still very limited. In many applications, selecting as few <italic>k</italic>
-mers as possible (i.e. having a low <italic>density</italic>
) is beneficial. The density is highly dependent on the choice of the order on the <italic>k</italic>
-mers. Different applications use different orders, but none of these orders are optimal. A better understanding of minimizers schemes, and the related local and forward schemes, will allow designing schemes with lower density and thereby making existing and future bioinformatics tools even more efficient.</p>
</sec>
<sec id="s2"><title>Results</title>
<p>From the analysis of the asymptotic behavior of minimizers, forward and local schemes, we show that the previously believed lower bound on minimizers schemes does not hold, and that schemes with density lower than thought possible actually exist. The proof is constructive and leads to an efficient algorithm to compare <italic>k</italic>
-mers. These orders are the first known orders that are asymptotically optimal. Additionally, we give improved bounds on the density achievable by the three type of schemes.</p>
</sec>
</div>
</front>
<back><div1 type="bibliography"><listBibl><biblStruct><analytic><author><name sortKey="De Bruijn, N G" uniqKey="De Bruijn N">N.G. de Bruijn</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Deorowicz, S" uniqKey="Deorowicz S">S. Deorowicz</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Grabowski, S" uniqKey="Grabowski S">S. Grabowski</name>
</author>
<author><name sortKey="Raniszewski, M" uniqKey="Raniszewski M">M. Raniszewski</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Kawulok, J" uniqKey="Kawulok J">J. Kawulok</name>
</author>
<author><name sortKey="Deorowicz, S" uniqKey="Deorowicz S">S. Deorowicz</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Lempel, A" uniqKey="Lempel A">A. Lempel</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Li, H" uniqKey="Li H">H. Li</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Li, Y" uniqKey="Li Y">Y. Li</name>
</author>
<author><name sortKey="Xifengyan, X" uniqKey="Xifengyan X">X. XifengYan</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Li, Y" uniqKey="Li Y">Y. Li</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Lichiardopol, N" uniqKey="Lichiardopol N">N. Lichiardopol</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Marcais, G" uniqKey="Marcais G">G. Marçais</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Maurer, U M" uniqKey="Maurer U">U.M. Maurer</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Mykkeltveit, J" uniqKey="Mykkeltveit J">J. Mykkeltveit</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Ondov, B D" uniqKey="Ondov B">B.D. Ondov</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Orenstein, Y" uniqKey="Orenstein Y">Y. Orenstein</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Paindavoine, M" uniqKey="Paindavoine M">M. Paindavoine</name>
</author>
<author><name sortKey="Vialla, B" uniqKey="Vialla B">B. Vialla</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Roberts, M" uniqKey="Roberts M">M. Roberts</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Roberts, M" uniqKey="Roberts M">M. Roberts</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Schleimer, S" uniqKey="Schleimer S">S. Schleimer</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Wood, D E" uniqKey="Wood D">D.E. Wood</name>
</author>
<author><name sortKey="Salzberg, S L" uniqKey="Salzberg S">S.L. Salzberg</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Ye, C" uniqKey="Ye C">C. Ye</name>
</author>
</analytic>
</biblStruct>
</listBibl>
</div1>
</back>
</TEI>
<affiliations><list><country><li>États-Unis</li>
</country>
<region><li>Pennsylvanie</li>
</region>
<settlement><li>Pittsburgh</li>
</settlement>
<orgName><li>Université Carnegie-Mellon</li>
</orgName>
</list>
<tree><country name="États-Unis"><region name="Pennsylvanie"><name sortKey="Marcais, Guillaume" sort="Marcais, Guillaume" uniqKey="Marcais G" first="Guillaume" last="Marçais">Guillaume Marçais</name>
</region>
<name sortKey="Deblasio, Dan" sort="Deblasio, Dan" uniqKey="Deblasio D" first="Dan" last="Deblasio">Dan Deblasio</name>
<name sortKey="Kingsford, Carl" sort="Kingsford, Carl" uniqKey="Kingsford C" first="Carl" last="Kingsford">Carl Kingsford</name>
</country>
</tree>
</affiliations>
</record>
Pour manipuler ce document sous Unix (Dilib)
EXPLOR_STEP=$WICRI_ROOT/Sante/explor/MersV1/Data/Ncbi/Checkpoint
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 001E79 | SxmlIndent | more
Ou
HfdSelect -h $EXPLOR_AREA/Data/Ncbi/Checkpoint/biblio.hfd -nk 001E79 | SxmlIndent | more
Pour mettre un lien sur cette page dans le réseau Wicri
{{Explor lien |wiki= Sante |area= MersV1 |flux= Ncbi |étape= Checkpoint |type= RBID |clé= PMC:6037127 |texte= Asymptotically optimal minimizers schemes }}
Pour générer des pages wiki
HfdIndexSelect -h $EXPLOR_AREA/Data/Ncbi/Checkpoint/RBID.i -Sk "pubmed:29949995" \ | HfdSelect -Kh $EXPLOR_AREA/Data/Ncbi/Checkpoint/biblio.hfd \ | NlmPubMed2Wicri -a MersV1
This area was generated with Dilib version V0.6.33. |