QuorUM: An Error Corrector for Illumina Reads
Internal identifier: 001610 (Main/Merge)
Authors: Guillaume Marçais; James A. Yorke; Aleksey Zimin
Source:
- PLoS ONE [1932-6203]; 2015.
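English descriptors
- KwdEn: Algorithms; Animals; Computational Biology (methods); Genome; Genomics (methods); High-Throughput Nucleotide Sequencing (methods); Humans; Sequence Analysis, DNA (methods); Software
- MESH (methods): Computational Biology; Genomics; High-Throughput Nucleotide Sequencing; Sequence Analysis, DNA
- MESH: Algorithms; Animals; Genome; Humans; Software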
Abstract
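Illumina sequencing data can provide high coverage of a genome by relatively short (most often 100 bp to 150 bp) reads at a low cost. Even with a low (advertised 1%) error rate, 100× coverage Illumina data on average has an error in some read at every base in the genome. These errors make handling the data more complicated because they result in a large number of low-count erroneous k-mers in the reads. However, there is enough information in the reads to correct most of the sequencing errors, thus making subsequent use of the data (e.g. for mapping or assembly) easier. Here we use the term “error correction” to denote the reduction in errors due to both changes in individual bases and trimming of unusable sequence. We developed error correction software called QuorUM. QuorUM is mainly aimed at error correcting Illumina reads for subsequent assembly. It is designed around the novel idea of minimizing the number of distinct erroneous k-mers in the output reads and preserving the most true k-mers, and we introduce a composite statistic π that measures how successful we are at achieving this dual goal. We evaluate the performance of QuorUM by correcting actual Illumina reads from genomes for which a reference assembly is available.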
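The coverage arithmetic and the link between a sequencing error and low-count k-mers can be illustrated with a small sketch. This is not QuorUM's algorithm; it only counts k-mers in toy reads. The function names, the toy sequences, and the choice k = 5 are assumptions made for illustration.

```python
from collections import Counter


def expected_errors_per_position(coverage, error_rate):
    """Expected number of erroneous base calls covering one genome position."""
    return coverage * error_rate


def count_kmers(reads, k):
    """Count every k-mer in the reads (forward strand only, for simplicity)."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


# With 100x coverage and a 1% error rate, about one erroneous base call
# is expected at each genome position.
errors_per_base = expected_errors_per_position(100, 0.01)

# Toy data (hypothetical): five error-free copies of a read plus one copy
# carrying a single substitution (A -> T at index 7).
true_read = "ACGTTGCATGGACCTA"
bad_read = "ACGTTGCTTGGACCTA"
reads = [true_read] * 5 + [bad_read]

k = 5
counts = count_kmers(reads, k)
true_kmers = set(count_kmers([true_read], k))

# k-mers that never occur in the error-free read are the erroneous ones.
erroneous = {kmer: c for kmer, c in counts.items() if kmer not in true_kmers}

print(f"expected errors per position: {errors_per_base}")
print(f"erroneous {k}-mers and their counts: {erroneous}")
```

In this toy case the single substitution introduces k distinct k-mers that each occur once, while the true k-mers keep counts of 5 or 6. That count gap is the kind of information a k-mer based corrector can exploit.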
We produce trimmed and error-corrected reads that result in assemblies with longer contigs and fewer errors. We compared QuorUM against several published error correctors and found that it is the best performer in most of the metrics we use. QuorUM is efficiently implemented, making use of current multi-core computing architectures, and it is suitable for large data sets (1 billion bases checked and corrected per day per core). We also demonstrate that a third-party assembler (SOAPdenovo) benefits significantly from using QuorUM error-corrected reads. QuorUM error-corrected reads result in a factor of 1.1 to 4 improvement in N50 contig size compared to using the original reads with SOAPdenovo for the data sets investigated.
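For readers unfamiliar with the N50 statistic used in the comparison above, the following minimal sketch shows one common way to compute it. The function name and the contig lengths are hypothetical; the numbers are not results from the paper.

```python
def n50(contig_lengths):
    """Return the N50: the largest length L such that contigs of length >= L
    together contain at least half of the total assembled bases."""
    lengths = sorted(contig_lengths, reverse=True)
    half = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        if running >= half:
            return length
    raise ValueError("contig_lengths must not be empty")


# Hypothetical contig lengths for two assemblies of the same genome.
from_raw_reads = [100, 80, 60, 40, 20]
from_corrected_reads = [200, 60, 40]

improvement = n50(from_corrected_reads) / n50(from_raw_reads)
print(f"N50 improvement factor: {improvement}")
```

A factor above 1 means that half of the assembled bases sit in longer contigs after correction, which is the sense in which the abstract reports an improvement of 1.1 to 4.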
QuorUM is distributed as an independent software package and as a module of the MaSuRCA assembly software. Both are available under the GPL open source license at http://www.genome.umd.edu.
Url: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4471408
DOI: 10.1371/journal.pone.0130821
PubMed: 26083032
PubMed Central: 4471408
The document in XML format
<record><TEI><teiHeader><fileDesc><titleStmt><title xml:lang="en">QuorUM: An Error Corrector for Illumina Reads</title>
<author><name sortKey="Marcais, Guillaume" sort="Marcais, Guillaume" uniqKey="Marcais G" first="Guillaume" last="Marçais">Guillaume Marçais</name>
<affiliation><nlm:aff id="aff001"></nlm:aff>
</affiliation>
</author>
<author><name sortKey="Yorke, James A" sort="Yorke, James A" uniqKey="Yorke J" first="James A." last="Yorke">James A. Yorke</name>
<affiliation><nlm:aff id="aff001"></nlm:aff>
</affiliation>
</author>
<author><name sortKey="Zimin, Aleksey" sort="Zimin, Aleksey" uniqKey="Zimin A" first="Aleksey" last="Zimin">Aleksey Zimin</name>
<affiliation><nlm:aff id="aff001"></nlm:aff>
</affiliation>
</author>
</titleStmt>
<publicationStmt><idno type="wicri:source">PMC</idno>
<idno type="pmid">26083032</idno>
<idno type="pmc">4471408</idno>
<idno type="url">http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4471408</idno>
<idno type="RBID">PMC:4471408</idno>
<idno type="doi">10.1371/journal.pone.0130821</idno>
<date when="2015">2015</date>
<idno type="wicri:Area/Pmc/Corpus">001008</idno>
<idno type="wicri:explorRef" wicri:stream="Pmc" wicri:step="Corpus" wicri:corpus="PMC">001008</idno>
<idno type="wicri:Area/Pmc/Curation">001008</idno>
<idno type="wicri:explorRef" wicri:stream="Pmc" wicri:step="Curation">001008</idno>
<idno type="wicri:Area/Pmc/Checkpoint">000D37</idno>
<idno type="wicri:explorRef" wicri:stream="Pmc" wicri:step="Checkpoint">000D37</idno>
<idno type="wicri:source">PubMed</idno>
<idno type="RBID">pubmed:26083032</idno>
<idno type="wicri:Area/PubMed/Corpus">001586</idno>
<idno type="wicri:explorRef" wicri:stream="PubMed" wicri:step="Corpus" wicri:corpus="PubMed">001586</idno>
<idno type="wicri:Area/PubMed/Curation">001586</idno>
<idno type="wicri:explorRef" wicri:stream="PubMed" wicri:step="Curation">001586</idno>
<idno type="wicri:Area/PubMed/Checkpoint">001378</idno>
<idno type="wicri:explorRef" wicri:stream="PubMed" wicri:step="Checkpoint">001378</idno>
<idno type="wicri:Area/Ncbi/Merge">001154</idno>
<idno type="wicri:Area/Ncbi/Curation">001154</idno>
<idno type="wicri:Area/Ncbi/Checkpoint">001154</idno>
<idno type="wicri:Area/Main/Merge">001610</idno>
</publicationStmt>
<sourceDesc><biblStruct><analytic><title xml:lang="en" level="a" type="main">QuorUM: An Error Corrector for Illumina Reads</title>
<author><name sortKey="Marcais, Guillaume" sort="Marcais, Guillaume" uniqKey="Marcais G" first="Guillaume" last="Marçais">Guillaume Marçais</name>
<affiliation><nlm:aff id="aff001"></nlm:aff>
</affiliation>
</author>
<author><name sortKey="Yorke, James A" sort="Yorke, James A" uniqKey="Yorke J" first="James A." last="Yorke">James A. Yorke</name>
<affiliation><nlm:aff id="aff001"></nlm:aff>
</affiliation>
</author>
<author><name sortKey="Zimin, Aleksey" sort="Zimin, Aleksey" uniqKey="Zimin A" first="Aleksey" last="Zimin">Aleksey Zimin</name>
<affiliation><nlm:aff id="aff001"></nlm:aff>
</affiliation>
</author>
</analytic>
<series><title level="j">PLoS ONE</title>
<idno type="eISSN">1932-6203</idno>
<imprint><date when="2015">2015</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
</fileDesc>
<profileDesc><textClass><keywords scheme="KwdEn" xml:lang="en"><term>Algorithms</term>
<term>Animals</term>
<term>Computational Biology (methods)</term>
<term>Genome</term>
<term>Genomics (methods)</term>
<term>High-Throughput Nucleotide Sequencing (methods)</term>
<term>Humans</term>
<term>Sequence Analysis, DNA (methods)</term>
<term>Software</term>
</keywords>
<keywords scheme="KwdFr" xml:lang="fr"><term>Algorithmes</term>
<term>Analyse de séquence d'ADN (méthodes)</term>
<term>Animaux</term>
<term>Biologie informatique (méthodes)</term>
<term>Génome</term>
<term>Génomique (méthodes)</term>
<term>Humains</term>
<term>Logiciel</term>
<term>Séquençage nucléotidique à haut débit (méthodes)</term>
</keywords>
<keywords scheme="MESH" qualifier="methods" xml:lang="en"><term>Computational Biology</term>
<term>Genomics</term>
<term>High-Throughput Nucleotide Sequencing</term>
<term>Sequence Analysis, DNA</term>
</keywords>
<keywords scheme="MESH" xml:lang="en"><term>Algorithms</term>
<term>Animals</term>
<term>Genome</term>
<term>Humans</term>
<term>Software</term>
</keywords>
<keywords scheme="MESH" xml:lang="fr"><term>Algorithmes</term>
<term>Analyse de séquence d'ADN</term>
<term>Animaux</term>
<term>Biologie informatique</term>
<term>Génome</term>
<term>Génomique</term>
<term>Humains</term>
<term>Logiciel</term>
<term>Séquençage nucléotidique à haut débit</term>
</keywords>
</textClass>
</profileDesc>
</teiHeader>
<front><div type="abstract" xml:lang="en"><sec id="sec001"><title>Motivation</title>
<p>Illumina sequencing data can provide high coverage of a genome by relatively short (most often 100 bp to 150 bp) reads at a low cost. Even with a low (advertised 1%) error rate, 100× coverage Illumina data on average has an error in some read at every base in the genome. These errors make handling the data more complicated because they result in a large number of low-count erroneous <italic>k</italic>-mers in the reads. However, there is enough information in the reads to correct most of the sequencing errors, thus making subsequent use of the data (e.g. for mapping or assembly) easier. Here we use the term “error correction” to denote the reduction in errors due to both changes in individual bases and trimming of unusable sequence. We developed error correction software called QuorUM. QuorUM is mainly aimed at error correcting Illumina reads for subsequent assembly. It is designed around the novel idea of minimizing the number of distinct erroneous <italic>k</italic>-mers in the output reads and preserving the most true <italic>k</italic>-mers, and we introduce a composite statistic π that measures how successful we are at achieving this dual goal. We evaluate the performance of QuorUM by correcting actual Illumina reads from genomes for which a reference assembly is available.</p>
</sec>
<sec id="sec002"><title>Results</title>
<p>We produce trimmed and error-corrected reads that result in assemblies with longer contigs and fewer errors. We compared QuorUM against several published error correctors and found that it is the best performer in most of the metrics we use. QuorUM is efficiently implemented, making use of current multi-core computing architectures, and it is suitable for large data sets (1 billion bases checked and corrected per day per core). We also demonstrate that a third-party assembler (SOAPdenovo) benefits significantly from using QuorUM error-corrected reads. QuorUM error-corrected reads result in a factor of 1.1 to 4 improvement in N50 contig size compared to using the original reads with SOAPdenovo for the data sets investigated.</p>
</sec>
<sec id="sec003"><title>Availability</title>
<p>QuorUM is distributed as an independent software package and as a module of the MaSuRCA assembly software. Both are available under the GPL open source license at <ext-link ext-link-type="uri" xlink:href="http://www.genome.umd.edu">http://www.genome.umd.edu</ext-link>.</p>
</sec>
<sec id="sec004"><title>Contact</title>
<p><email>gmarcais@umd.edu</email>.</p>
</sec>
</div>
</front>
<back><div1 type="bibliography"><listBibl><biblStruct><analytic><author><name sortKey="Zerbino, Dr" uniqKey="Zerbino D">DR Zerbino</name>
</author>
<author><name sortKey="Birney, E" uniqKey="Birney E">E Birney</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Li, R" uniqKey="Li R">R Li</name>
</author>
<author><name sortKey="Zhu, H" uniqKey="Zhu H">H Zhu</name>
</author>
<author><name sortKey="Ruan, J" uniqKey="Ruan J">J Ruan</name>
</author>
<author><name sortKey="Qian, W" uniqKey="Qian W">W Qian</name>
</author>
<author><name sortKey="Fang, X" uniqKey="Fang X">X Fang</name>
</author>
<author><name sortKey="Shi, Z" uniqKey="Shi Z">Z Shi</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Chaisson, Mj" uniqKey="Chaisson M">MJ Chaisson</name>
</author>
<author><name sortKey="Pevzner, Pa" uniqKey="Pevzner P">PA Pevzner</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Gnerre, S" uniqKey="Gnerre S">S Gnerre</name>
</author>
<author><name sortKey="Maccallum, I" uniqKey="Maccallum I">I Maccallum</name>
</author>
<author><name sortKey="Przybylski, D" uniqKey="Przybylski D">D Przybylski</name>
</author>
<author><name sortKey="Ribeiro, Fj" uniqKey="Ribeiro F">FJ Ribeiro</name>
</author>
<author><name sortKey="Burton, Jn" uniqKey="Burton J">JN Burton</name>
</author>
<author><name sortKey="Walker, Bj" uniqKey="Walker B">BJ Walker</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Salzberg, Sl" uniqKey="Salzberg S">SL Salzberg</name>
</author>
<author><name sortKey="Phillippy, Am" uniqKey="Phillippy A">AM Phillippy</name>
</author>
<author><name sortKey="Zimin, Av" uniqKey="Zimin A">AV Zimin</name>
</author>
<author><name sortKey="Puiu, D" uniqKey="Puiu D">D Puiu</name>
</author>
<author><name sortKey="Magoc, T" uniqKey="Magoc T">T Magoc</name>
</author>
<author><name sortKey="Koren, S" uniqKey="Koren S">S Koren</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Magoc, T" uniqKey="Magoc T">T Magoc</name>
</author>
<author><name sortKey="Pabinger, S" uniqKey="Pabinger S">S Pabinger</name>
</author>
<author><name sortKey="Canzar, S" uniqKey="Canzar S">S Canzar</name>
</author>
<author><name sortKey="Liu, X" uniqKey="Liu X">X Liu</name>
</author>
<author><name sortKey="Su, Q" uniqKey="Su Q">Q Su</name>
</author>
<author><name sortKey="Puiu, D" uniqKey="Puiu D">D Puiu</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Zimin, Av" uniqKey="Zimin A">AV Zimin</name>
</author>
<author><name sortKey="Marcais, G" uniqKey="Marcais G">G Marçais</name>
</author>
<author><name sortKey="Puiu, D" uniqKey="Puiu D">D Puiu</name>
</author>
<author><name sortKey="Roberts, M" uniqKey="Roberts M">M Roberts</name>
</author>
<author><name sortKey="Salzberg, Sl" uniqKey="Salzberg S">SL Salzberg</name>
</author>
<author><name sortKey="Yorke, Ja" uniqKey="Yorke J">JA Yorke</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Ilie, L" uniqKey="Ilie L">L Ilie</name>
</author>
<author><name sortKey="Fazayeli, F" uniqKey="Fazayeli F">F Fazayeli</name>
</author>
<author><name sortKey="Ilie, S" uniqKey="Ilie S">S Ilie</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Kao, Wc" uniqKey="Kao W">WC Kao</name>
</author>
<author><name sortKey="Chan, Ah" uniqKey="Chan A">AH Chan</name>
</author>
<author><name sortKey="Song, Ys" uniqKey="Song Y">YS Song</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Salmela, L" uniqKey="Salmela L">L Salmela</name>
</author>
<author><name sortKey="Schroder, J" uniqKey="Schroder J">J Schröder</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Simpson, Jt" uniqKey="Simpson J">JT Simpson</name>
</author>
<author><name sortKey="Durbin, R" uniqKey="Durbin R">R Durbin</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Ilie, L" uniqKey="Ilie L">L Ilie</name>
</author>
<author><name sortKey="Molnar, M" uniqKey="Molnar M">M Molnar</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Liu, Y" uniqKey="Liu Y">Y Liu</name>
</author>
<author><name sortKey="Schroder, J" uniqKey="Schroder J">J Schröder</name>
</author>
<author><name sortKey="Schmidt, B" uniqKey="Schmidt B">B Schmidt</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Kelley, Dr" uniqKey="Kelley D">DR Kelley</name>
</author>
<author><name sortKey="Schatz, Mc" uniqKey="Schatz M">MC Schatz</name>
</author>
<author><name sortKey="Salzberg, Sl" uniqKey="Salzberg S">SL Salzberg</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Pevzner, Pa" uniqKey="Pevzner P">PA Pevzner</name>
</author>
<author><name sortKey="Tang, H" uniqKey="Tang H">H Tang</name>
</author>
<author><name sortKey="Waterman, Ms" uniqKey="Waterman M">MS Waterman</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Zhao, X" uniqKey="Zhao X">X Zhao</name>
</author>
<author><name sortKey="Palmer, Le" uniqKey="Palmer L">LE Palmer</name>
</author>
<author><name sortKey="Bolanos, R" uniqKey="Bolanos R">R Bolanos</name>
</author>
<author><name sortKey="Mircean, C" uniqKey="Mircean C">C Mircean</name>
</author>
<author><name sortKey="Fasulo, D" uniqKey="Fasulo D">D Fasulo</name>
</author>
<author><name sortKey="Wittenberg, Gm" uniqKey="Wittenberg G">GM Wittenberg</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Shi, H" uniqKey="Shi H">H Shi</name>
</author>
<author><name sortKey="Schmidt, B" uniqKey="Schmidt B">B Schmidt</name>
</author>
<author><name sortKey="Liu, W" uniqKey="Liu W">W Liu</name>
</author>
<author><name sortKey="Muller Wittig, W" uniqKey="Muller Wittig W">W Müller-Wittig</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Yang, X" uniqKey="Yang X">X Yang</name>
</author>
<author><name sortKey="Chockalingam, Sp" uniqKey="Chockalingam S">SP Chockalingam</name>
</author>
<author><name sortKey="Aluru, S" uniqKey="Aluru S">S Aluru</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Bentley, Dr" uniqKey="Bentley D">DR Bentley</name>
</author>
<author><name sortKey="Balasubramanian, S" uniqKey="Balasubramanian S">S Balasubramanian</name>
</author>
<author><name sortKey="Swerdlow, Hp" uniqKey="Swerdlow H">HP Swerdlow</name>
</author>
<author><name sortKey="Smith, Gp" uniqKey="Smith G">GP Smith</name>
</author>
<author><name sortKey="Milton, J" uniqKey="Milton J">J Milton</name>
</author>
<author><name sortKey="Brown, Cg" uniqKey="Brown C">CG Brown</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Marcais, G" uniqKey="Marcais G">G Marçais</name>
</author>
<author><name sortKey="Kingsford, C" uniqKey="Kingsford C">C Kingsford</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Mackenzie, C" uniqKey="Mackenzie C">C Mackenzie</name>
</author>
<author><name sortKey="Choudhary, M" uniqKey="Choudhary M">M Choudhary</name>
</author>
<author><name sortKey="Larimer, Fw" uniqKey="Larimer F">FW Larimer</name>
</author>
<author><name sortKey="Predki, Pf" uniqKey="Predki P">PF Predki</name>
</author>
<author><name sortKey="Stilwagen, S" uniqKey="Stilwagen S">S Stilwagen</name>
</author>
<author><name sortKey="Armitage, Jp" uniqKey="Armitage J">JP Armitage</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Waterston, Rh" uniqKey="Waterston R">RH Waterston</name>
</author>
<author><name sortKey="Lindblad Toh, K" uniqKey="Lindblad Toh K">K Lindblad-Toh</name>
</author>
<author><name sortKey="Birney, E" uniqKey="Birney E">E Birney</name>
</author>
<author><name sortKey="Rogers, J" uniqKey="Rogers J">J Rogers</name>
</author>
<author><name sortKey="Abril, Jf" uniqKey="Abril J">JF Abril</name>
</author>
<author><name sortKey="Agarwal, P" uniqKey="Agarwal P">P Agarwal</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Langmead, B" uniqKey="Langmead B">B Langmead</name>
</author>
<author><name sortKey="Trapnell, C" uniqKey="Trapnell C">C Trapnell</name>
</author>
<author><name sortKey="Pop, M" uniqKey="Pop M">M Pop</name>
</author>
<author><name sortKey="Salzberg, Sl" uniqKey="Salzberg S">SL Salzberg</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Zimin, A" uniqKey="Zimin A">A Zimin</name>
</author>
<author><name sortKey="Stevens, Ka" uniqKey="Stevens K">KA Stevens</name>
</author>
<author><name sortKey="Crepeau, Mw" uniqKey="Crepeau M">MW Crepeau</name>
</author>
<author><name sortKey="Holtz Morris, A" uniqKey="Holtz Morris A">A Holtz-Morris</name>
</author>
<author><name sortKey="Koriabine, M" uniqKey="Koriabine M">M Koriabine</name>
</author>
<author><name sortKey="Marcais, G" uniqKey="Marcais G">G Marçais</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Luo, R" uniqKey="Luo R">R Luo</name>
</author>
<author><name sortKey="Liu, B" uniqKey="Liu B">B Liu</name>
</author>
<author><name sortKey="Xie, Y" uniqKey="Xie Y">Y Xie</name>
</author>
<author><name sortKey="Li, Z" uniqKey="Li Z">Z Li</name>
</author>
<author><name sortKey="Huang, W" uniqKey="Huang W">W Huang</name>
</author>
<author><name sortKey="Yuan, J" uniqKey="Yuan J">J Yuan</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Gurevich, A" uniqKey="Gurevich A">A Gurevich</name>
</author>
<author><name sortKey="Saveliev, V" uniqKey="Saveliev V">V Saveliev</name>
</author>
<author><name sortKey="Vyahhi, N" uniqKey="Vyahhi N">N Vyahhi</name>
</author>
<author><name sortKey="Tesler, G" uniqKey="Tesler G">G Tesler</name>
</author>
</analytic>
</biblStruct>
</listBibl>
</div1>
</back>
</TEI>
</record>
To manipulate this document under Unix (Dilib)
EXPLOR_STEP=$WICRI_ROOT/Sante/explor/MersV1/Data/Main/Merge
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 001610 | SxmlIndent | more
Or
HfdSelect -h $EXPLOR_AREA/Data/Main/Merge/biblio.hfd -nk 001610 | SxmlIndent | more
To add a link to this page within the Wicri network
{{Explor lien |wiki= Sante |area= MersV1 |flux= Main |étape= Merge |type= RBID |clé= PMC:4471408 |texte= QuorUM: An Error Corrector for Illumina Reads }}
To generate wiki pages
HfdIndexSelect -h $EXPLOR_AREA/Data/Main/Merge/RBID.i -Sk "pubmed:26083032" \
  | HfdSelect -Kh $EXPLOR_AREA/Data/Main/Merge/biblio.hfd \
  | NlmPubMed2Wicri -a MersV1
This area was generated with Dilib version V0.6.33.