Gerbil: a fast and memory-efficient k-mer counter with GPU-support
Internal identifier: 000D76 (Main/Exploration); previous: 000D75; next: 000D77
Authors: Marius Erbert; Steffen Rechner; Matthias Müller-Hannemann
Source:
- Algorithms for Molecular Biology : AMB [ 1748-7188 ] ; 2017.
Abstract
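A basic task in bioinformatics is the counting of k-mers in genome sequences. Existing k-mer counting tools are most often optimized for small k < 32 and suffer from excessive memory resource consumption or degrading performance for large k. However, given the technology trend towards long reads of next-generation sequencers, support for large k becomes increasingly important.
We present the open source k-mer counting software Gerbil that has been designed for the efficient counting of k-mers for k ≥ 32. Our software is the result of an intensive process of algorithm engineering. It implements a two-step approach. In the first step, genome reads are loaded from disk and redistributed to temporary files. In a second step, the k-mers of each temporary file are counted via a hash table approach. In addition to its basic functionality, Gerbil can optionally use GPUs to accelerate the counting step. In a set of experiments with real-world genome data sets, we show that Gerbil is able to efficiently support both small and large k.
While Gerbil’s performance is comparable to existing state-of-the-art open source k-mer counting tools for small k < 32, it vastly outperforms its competitors for large k, thereby enabling new applications which require large values of k.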
The online version of this article (doi:10.1186/s13015-017-0097-9) contains supplementary material, which is available to authorized users.
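The two-step approach named in the abstract (redistribute reads to temporary files, then count the k-mers of each file with a hash table) can be illustrated with a minimal Python sketch. It is not Gerbil's implementation. The function names, the number of temporary files, and the CRC32-based choice of file are assumptions made for illustration only. GPU acceleration is not shown.

```python
import os
import tempfile
import zlib
from collections import Counter


def distribute(reads, k, num_files, tmp_dir):
    """Step 1: write every k-mer of every read to one of num_files temporary files.

    Assumption of this sketch: the file is chosen by a CRC32 checksum of the
    k-mer, so all occurrences of the same k-mer land in the same file.
    """
    paths = [os.path.join(tmp_dir, f"bucket_{i}.txt") for i in range(num_files)]
    files = [open(p, "w") for p in paths]
    try:
        for read in reads:
            for i in range(len(read) - k + 1):
                kmer = read[i:i + k]
                bucket = zlib.crc32(kmer.encode("ascii")) % num_files
                files[bucket].write(kmer + "\n")
    finally:
        for f in files:
            f.close()
    return paths


def count_file(path):
    """Step 2: count the k-mers of one temporary file with a hash table."""
    counts = Counter()
    with open(path) as f:
        for line in f:
            counts[line.rstrip("\n")] += 1
    return counts


def count_kmers(reads, k, num_files=4):
    """Run both steps and merge the per-file counts into one dictionary."""
    total = {}
    with tempfile.TemporaryDirectory() as tmp_dir:
        for path in distribute(reads, k, num_files, tmp_dir):
            # The files hold disjoint k-mer sets, so update() never overwrites a key.
            total.update(count_file(path))
    return total
```

For example, count_kmers(["ACGTACGT"], k=4) counts ACGT twice and CGTA, GTAC and TACG once each. Because each temporary file is counted on its own, only one file's hash table has to be in memory at a time.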
Url:
DOI: 10.1186/s13015-017-0097-9
PubMed: 28373894
PubMed Central: 5374613
Affiliations:
Links to previous steps (curation, corpus...)
- to stream Pmc, to step Corpus: 000247
- to stream Pmc, to step Curation: 000247
- to stream Pmc, to step Checkpoint: 000849
- to stream PubMed, to step Corpus: 000D36
- to stream PubMed, to step Curation: 000D36
- to stream PubMed, to step Checkpoint: 000C94
- to stream Ncbi, to step Merge: 001995
- to stream Ncbi, to step Curation: 001995
- to stream Ncbi, to step Checkpoint: 001995
- to stream Main, to step Merge: 000D79
- to stream Main, to step Curation: 000D76
The document in XML format
<record><TEI><teiHeader><fileDesc><titleStmt><title xml:lang="en">Gerbil: a fast and memory-efficient <italic>k</italic>
-mer counter with GPU-support</title>
<author><name sortKey="Erbert, Marius" sort="Erbert, Marius" uniqKey="Erbert M" first="Marius" last="Erbert">Marius Erbert</name>
<affiliation><nlm:aff id="Aff1"></nlm:aff>
</affiliation>
</author>
<author><name sortKey="Rechner, Steffen" sort="Rechner, Steffen" uniqKey="Rechner S" first="Steffen" last="Rechner">Steffen Rechner</name>
<affiliation><nlm:aff id="Aff1"></nlm:aff>
</affiliation>
</author>
<author><name sortKey="Muller Hannemann, Matthias" sort="Muller Hannemann, Matthias" uniqKey="Muller Hannemann M" first="Matthias" last="Müller-Hannemann">Matthias Müller-Hannemann</name>
<affiliation><nlm:aff id="Aff1"></nlm:aff>
</affiliation>
</author>
</titleStmt>
<publicationStmt><idno type="wicri:source">PMC</idno>
<idno type="pmid">28373894</idno>
<idno type="pmc">5374613</idno>
<idno type="url">http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5374613</idno>
<idno type="RBID">PMC:5374613</idno>
<idno type="doi">10.1186/s13015-017-0097-9</idno>
<date when="2017">2017</date>
<idno type="wicri:Area/Pmc/Corpus">000247</idno>
<idno type="wicri:explorRef" wicri:stream="Pmc" wicri:step="Corpus" wicri:corpus="PMC">000247</idno>
<idno type="wicri:Area/Pmc/Curation">000247</idno>
<idno type="wicri:explorRef" wicri:stream="Pmc" wicri:step="Curation">000247</idno>
<idno type="wicri:Area/Pmc/Checkpoint">000849</idno>
<idno type="wicri:explorRef" wicri:stream="Pmc" wicri:step="Checkpoint">000849</idno>
<idno type="wicri:source">PubMed</idno>
<idno type="RBID">pubmed:28373894</idno>
<idno type="wicri:Area/PubMed/Corpus">000D36</idno>
<idno type="wicri:explorRef" wicri:stream="PubMed" wicri:step="Corpus" wicri:corpus="PubMed">000D36</idno>
<idno type="wicri:Area/PubMed/Curation">000D36</idno>
<idno type="wicri:explorRef" wicri:stream="PubMed" wicri:step="Curation">000D36</idno>
<idno type="wicri:Area/PubMed/Checkpoint">000C94</idno>
<idno type="wicri:explorRef" wicri:stream="PubMed" wicri:step="Checkpoint">000C94</idno>
<idno type="wicri:Area/Ncbi/Merge">001995</idno>
<idno type="wicri:Area/Ncbi/Curation">001995</idno>
<idno type="wicri:Area/Ncbi/Checkpoint">001995</idno>
<idno type="wicri:Area/Main/Merge">000D79</idno>
<idno type="wicri:Area/Main/Curation">000D76</idno>
<idno type="wicri:Area/Main/Exploration">000D76</idno>
</publicationStmt>
<sourceDesc><biblStruct><analytic><title xml:lang="en" level="a" type="main">Gerbil: a fast and memory-efficient <italic>k</italic>
-mer counter with GPU-support</title>
<author><name sortKey="Erbert, Marius" sort="Erbert, Marius" uniqKey="Erbert M" first="Marius" last="Erbert">Marius Erbert</name>
<affiliation><nlm:aff id="Aff1"></nlm:aff>
</affiliation>
</author>
<author><name sortKey="Rechner, Steffen" sort="Rechner, Steffen" uniqKey="Rechner S" first="Steffen" last="Rechner">Steffen Rechner</name>
<affiliation><nlm:aff id="Aff1"></nlm:aff>
</affiliation>
</author>
<author><name sortKey="Muller Hannemann, Matthias" sort="Muller Hannemann, Matthias" uniqKey="Muller Hannemann M" first="Matthias" last="Müller-Hannemann">Matthias Müller-Hannemann</name>
<affiliation><nlm:aff id="Aff1"></nlm:aff>
</affiliation>
</author>
</analytic>
<series><title level="j">Algorithms for Molecular Biology : AMB</title>
<idno type="eISSN">1748-7188</idno>
<imprint><date when="2017">2017</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
</fileDesc>
<profileDesc><textClass></textClass>
</profileDesc>
</teiHeader>
<front><div type="abstract" xml:lang="en"><sec><title>Background</title>
<p>A basic task in bioinformatics is the counting of <italic>k</italic>
-mers in genome sequences. Existing <italic>k</italic>
-mer counting tools are most often optimized for small <italic>k</italic>
&lt; 32 and suffer from excessive memory resource consumption or degrading performance for large <italic>k</italic>
. However, given the technology trend towards long reads of next-generation sequencers, support for large <italic>k</italic>
becomes increasingly important.</p>
</sec>
<sec><title>Results</title>
<p>We present the open source <italic>k</italic>
-mer counting software <italic>Gerbil</italic>
that has been designed for the efficient counting of <italic>k</italic>
-mers for <italic>k</italic>
≥ 32. Our software is the result of an intensive process of algorithm engineering. It implements a two-step approach. In the first step, genome reads are loaded from disk and redistributed to temporary files. In a second step, the <italic>k</italic>
-mers of each temporary file are counted via a hash table approach. In addition to its basic functionality, <italic>Gerbil</italic>
can optionally use GPUs to accelerate the counting step. In a set of experiments with real-world genome data sets, we show that <italic>Gerbil</italic>
is able to efficiently support both small and large <italic>k</italic>
.</p>
</sec>
<sec><title>Conclusions</title>
<p>While <italic>Gerbil</italic>
’s performance is comparable to existing state-of-the-art open source <italic>k</italic>
-mer counting tools for small <italic>k</italic>
&lt; 32, it vastly outperforms its competitors for large <italic>k</italic>
, thereby enabling new applications which require large values of <italic>k</italic>
.</p>
</sec>
<sec><title>Electronic supplementary material</title>
<p>The online version of this article (doi:10.1186/s13015-017-0097-9) contains supplementary material, which is available to authorized users.</p>
</sec>
</div>
</front>
<back><div1 type="bibliography"><listBibl><biblStruct><analytic><author><name sortKey="Xavier, Bb" uniqKey="Xavier B">BB Xavier</name>
</author>
<author><name sortKey="Sabirova, J" uniqKey="Sabirova J">J Sabirova</name>
</author>
<author><name sortKey="Pieter, M" uniqKey="Pieter M">M Pieter</name>
</author>
<author><name sortKey="Hernalsteens, J P" uniqKey="Hernalsteens J">J-P Hernalsteens</name>
</author>
<author><name sortKey="De Greve, H" uniqKey="De Greve H">H de Greve</name>
</author>
<author><name sortKey="Goossens, H" uniqKey="Goossens H">H Goossens</name>
</author>
<author><name sortKey="Malhotra Kumar, S" uniqKey="Malhotra Kumar S">S Malhotra-Kumar</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Chikhi, R" uniqKey="Chikhi R">R Chikhi</name>
</author>
<author><name sortKey="Medvedev, P" uniqKey="Medvedev P">P Medvedev</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Sameith, K" uniqKey="Sameith K">K Sameith</name>
</author>
<author><name sortKey="Roscito, Jg" uniqKey="Roscito J">JG Roscito</name>
</author>
<author><name sortKey="Hiller, M" uniqKey="Hiller M">M Hiller</name>
</author>
</analytic>
</biblStruct>
<biblStruct></biblStruct>
<biblStruct><analytic><author><name sortKey="Marcais, G" uniqKey="Marcais G">G Marçais</name>
</author>
<author><name sortKey="Kingsford, C" uniqKey="Kingsford C">C Kingsford</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Melsted, P" uniqKey="Melsted P">P Melsted</name>
</author>
<author><name sortKey="Pritchard, Jk" uniqKey="Pritchard J">JK Pritchard</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Rizk, G" uniqKey="Rizk G">G Rizk</name>
</author>
<author><name sortKey="Lavenier, D" uniqKey="Lavenier D">D Lavenier</name>
</author>
<author><name sortKey="Chikhi, R" uniqKey="Chikhi R">R Chikhi</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Deorowicz, S" uniqKey="Deorowicz S">S Deorowicz</name>
</author>
<author><name sortKey="Debudaj Grabysz, A" uniqKey="Debudaj Grabysz A">A Debudaj-Grabysz</name>
</author>
<author><name sortKey="Grabowski, S" uniqKey="Grabowski S">S Grabowski</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Roy, Rs" uniqKey="Roy R">RS Roy</name>
</author>
<author><name sortKey="Bhattacharya, D" uniqKey="Bhattacharya D">D Bhattacharya</name>
</author>
<author><name sortKey="Schliep, A" uniqKey="Schliep A">A Schliep</name>
</author>
</analytic>
</biblStruct>
<biblStruct></biblStruct>
<biblStruct><analytic><author><name sortKey="Deorowicz, S" uniqKey="Deorowicz S">S Deorowicz</name>
</author>
<author><name sortKey="Kokot, M" uniqKey="Kokot M">M Kokot</name>
</author>
<author><name sortKey="Grabowski, S" uniqKey="Grabowski S">S Grabowski</name>
</author>
<author><name sortKey="Debudaj Grabysz, A" uniqKey="Debudaj Grabysz A">A Debudaj-Grabysz</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Perez, N" uniqKey="Perez N">N Pérez</name>
</author>
<author><name sortKey="Gutierrez, M" uniqKey="Gutierrez M">M Gutierrez</name>
</author>
<author><name sortKey="Vera, N" uniqKey="Vera N">N Vera</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Mamun, Aa" uniqKey="Mamun A">AA Mamun</name>
</author>
<author><name sortKey="Pal, S" uniqKey="Pal S">S Pal</name>
</author>
<author><name sortKey="Rajasekaran, S" uniqKey="Rajasekaran S">S Rajasekaran</name>
</author>
</analytic>
</biblStruct>
<biblStruct></biblStruct>
<biblStruct><analytic><author><name sortKey="Roberts, M" uniqKey="Roberts M">M Roberts</name>
</author>
<author><name sortKey="Hunt, Br" uniqKey="Hunt B">BR Hunt</name>
</author>
<author><name sortKey="Yorke, Ja" uniqKey="Yorke J">JA Yorke</name>
</author>
<author><name sortKey="Bolanos, Ra" uniqKey="Bolanos R">RA Bolanos</name>
</author>
<author><name sortKey="Delcher, Al" uniqKey="Delcher A">AL Delcher</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Roberts, M" uniqKey="Roberts M">M Roberts</name>
</author>
<author><name sortKey="Hayes, W" uniqKey="Hayes W">W Hayes</name>
</author>
<author><name sortKey="Hunt, Br" uniqKey="Hunt B">BR Hunt</name>
</author>
<author><name sortKey="Mount, Sm" uniqKey="Mount S">SM Mount</name>
</author>
<author><name sortKey="Yorke, Ja" uniqKey="Yorke J">JA Yorke</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Kim, Ke" uniqKey="Kim K">KE Kim</name>
</author>
<author><name sortKey="Peluso, P" uniqKey="Peluso P">P Peluso</name>
</author>
<author><name sortKey="Babayan, P" uniqKey="Babayan P">P Babayan</name>
</author>
<author><name sortKey="Yeadon, Pj" uniqKey="Yeadon P">PJ Yeadon</name>
</author>
<author><name sortKey="Yu, C" uniqKey="Yu C">C Yu</name>
</author>
<author><name sortKey="Fisher, Ww" uniqKey="Fisher W">WW Fisher</name>
</author>
<author><name sortKey="Chin, Cs" uniqKey="Chin C">CS Chin</name>
</author>
<author><name sortKey="Rapicavoli, Na" uniqKey="Rapicavoli N">NA Rapicavoli</name>
</author>
<author><name sortKey="Rank, Dr" uniqKey="Rank D">DR Rank</name>
</author>
<author><name sortKey="Li, J" uniqKey="Li J">J Li</name>
</author>
</analytic>
</biblStruct>
</listBibl>
</div1>
</back>
</TEI>
<affiliations><list></list>
<tree><noCountry><name sortKey="Erbert, Marius" sort="Erbert, Marius" uniqKey="Erbert M" first="Marius" last="Erbert">Marius Erbert</name>
<name sortKey="Muller Hannemann, Matthias" sort="Muller Hannemann, Matthias" uniqKey="Muller Hannemann M" first="Matthias" last="Müller-Hannemann">Matthias Müller-Hannemann</name>
<name sortKey="Rechner, Steffen" sort="Rechner, Steffen" uniqKey="Rechner S" first="Steffen" last="Rechner">Steffen Rechner</name>
</noCountry>
</tree>
</affiliations>
</record>
To manipulate this document under Unix (Dilib)
EXPLOR_STEP=$WICRI_ROOT/Sante/explor/MersV1/Data/Main/Exploration
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000D76 | SxmlIndent | more
Or
HfdSelect -h $EXPLOR_AREA/Data/Main/Exploration/biblio.hfd -nk 000D76 | SxmlIndent | more
To add a link to this page in the Wicri network
{{Explor lien |wiki= Sante |area= MersV1 |flux= Main |étape= Exploration |type= RBID |clé= PMC:5374613 |texte= Gerbil: a fast and memory-efficient k-mer counter with GPU-support }}
To generate wiki pages
HfdIndexSelect -h $EXPLOR_AREA/Data/Main/Exploration/RBID.i -Sk "pubmed:28373894" \
| HfdSelect -Kh $EXPLOR_AREA/Data/Main/Exploration/biblio.hfd \
| NlmPubMed2Wicri -a MersV1
This area was generated with Dilib version V0.6.33.