MetaCache: context-aware classification of metagenomic reads using minhashing.

Internal identifier: 000C26 (PubMed/Checkpoint); previous: 000C25; next: 000C27

Authors: André Müller; Christian Hundt; Andreas Hildebrandt; Thomas Hankeln [Germany]; Bertil Schmidt

Source:

Bioinformatics (Oxford, England) [1367-4811]; 2017.

RBID : pubmed:28961782

French descriptors

KwdFr :
- Algorithmes, Analyse de séquence d'ADN, Humains, Logiciel, Métagénomique (), Séquençage nucléotidique à haut débit.
MESH :
- Algorithmes, Analyse de séquence d'ADN, Humains, Logiciel, Métagénomique, Séquençage nucléotidique à haut débit.

English descriptors

KwdEn :
- Algorithms, High-Throughput Nucleotide Sequencing, Humans, Metagenomics (methods), Sequence Analysis, DNA, Software.
MESH :
- methods : Metagenomics.
- Algorithms, High-Throughput Nucleotide Sequencing, Humans, Sequence Analysis, DNA, Software.

Abstract

Metagenomic shotgun sequencing studies are becoming increasingly popular, with prominent examples including the sequencing of human microbiomes and diverse environments. A fundamental computational problem in this context is read classification, i.e. the assignment of each read to a taxonomic label. Due to the large number of reads produced by modern high-throughput sequencing technologies and the rapidly increasing number of available reference genomes, corresponding software tools suffer from either long runtimes, large memory requirements or low accuracy.

DOI: 10.1093/bioinformatics/btx520
PubMed: 28961782
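
The title refers to minhashing, and the Results paragraph of the published abstract (reproduced in the XML record on this page) describes subsampling k-mers within both the probed reads and locally constrained regions of the reference genomes. The following is a minimal Python sketch of that general idea only. It is not MetaCache's implementation: the function names, the BLAKE2b hash, the bottom-s sketch variant, and the values of `k`, `s` and `window` are all illustrative assumptions.

```python
import hashlib

def kmers(seq, k):
    """Yield every overlapping k-mer of seq."""
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]

def hash_kmer(kmer):
    """Map a k-mer to a 64-bit integer with a stable hash (assumed: BLAKE2b)."""
    digest = hashlib.blake2b(kmer.encode("ascii"), digest_size=8).digest()
    return int.from_bytes(digest, "big")

def sketch(seq, k=16, s=16):
    """Bottom-s minhash sketch: the s smallest distinct k-mer hashes of seq."""
    return set(sorted({hash_kmer(m) for m in kmers(seq, k)})[:s])

def window_sketches(genome, window=128, k=16, s=16):
    """Sketch each window of a reference separately, keyed by window index."""
    stride = window - k + 1  # windows overlap by k-1 bases so no k-mer is lost
    starts = range(0, max(len(genome) - k + 1, 1), stride)
    return {w: sketch(genome[start:start + window], k, s)
            for w, start in enumerate(starts)}

def best_window(read, sketches, k=16, s=16):
    """Return (window index, shared hash count) for the best-matching window."""
    read_sketch = sketch(read, k, s)
    return max(((w, len(read_sketch & sk)) for w, sk in sketches.items()),
               key=lambda hit: hit[1])
```

Under these assumptions, a read copied verbatim from one whole window produces a sketch identical to that window's, so it reaches the maximum possible overlap there. Storing only `s` hashes per window, rather than every k-mer, is what keeps such an index small.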

Affiliations:

Links toward previous steps (curation, corpus...)

to stream PubMed, to step Corpus: 000B35
to stream PubMed, to step Curation: 000B35

Links to Exploration step

pubmed:28961782

The document in XML format

<record><TEI><teiHeader><fileDesc><titleStmt><title xml:lang="en">MetaCache: context-aware classification of metagenomic reads using minhashing.</title>
<author><name sortKey="Muller, Andre" sort="Muller, Andre" uniqKey="Muller A" first="André" last="Müller">André Müller</name>
<affiliation><nlm:affiliation>Department of Computer Science.</nlm:affiliation>
<wicri:noCountry code="no comma">Department of Computer Science.</wicri:noCountry>
</affiliation>
</author>
<author><name sortKey="Hundt, Christian" sort="Hundt, Christian" uniqKey="Hundt C" first="Christian" last="Hundt">Christian Hundt</name>
<affiliation><nlm:affiliation>Department of Computer Science.</nlm:affiliation>
<wicri:noCountry code="no comma">Department of Computer Science.</wicri:noCountry>
</affiliation>
</author>
<author><name sortKey="Hildebrandt, Andreas" sort="Hildebrandt, Andreas" uniqKey="Hildebrandt A" first="Andreas" last="Hildebrandt">Andreas Hildebrandt</name>
<affiliation><nlm:affiliation>Department of Computer Science.</nlm:affiliation>
<wicri:noCountry code="no comma">Department of Computer Science.</wicri:noCountry>
</affiliation>
</author>
<author><name sortKey="Hankeln, Thomas" sort="Hankeln, Thomas" uniqKey="Hankeln T" first="Thomas" last="Hankeln">Thomas Hankeln</name>
<affiliation wicri:level="3"><nlm:affiliation>Molecular Genetics and Genome Analysis Group, Department of Biology, Department of Biology, Johannes Gutenberg University, 55128 Mainz, Germany.</nlm:affiliation>
<country xml:lang="fr">Allemagne</country>
<wicri:regionArea>Molecular Genetics and Genome Analysis Group, Department of Biology, Department of Biology, Johannes Gutenberg University, 55128 Mainz</wicri:regionArea>
<placeName><region type="land" nuts="2">Rhénanie-Palatinat</region>
<settlement type="city">Mayence</settlement>
</placeName>
</affiliation>
</author>
<author><name sortKey="Schmidt, Bertil" sort="Schmidt, Bertil" uniqKey="Schmidt B" first="Bertil" last="Schmidt">Bertil Schmidt</name>
<affiliation><nlm:affiliation>Department of Computer Science.</nlm:affiliation>
<wicri:noCountry code="no comma">Department of Computer Science.</wicri:noCountry>
</affiliation>
</author>
</titleStmt>
<publicationStmt><idno type="wicri:source">PubMed</idno>
<date when="2017">2017</date>
<idno type="RBID">pubmed:28961782</idno>
<idno type="pmid">28961782</idno>
<idno type="doi">10.1093/bioinformatics/btx520</idno>
<idno type="wicri:Area/PubMed/Corpus">000B35</idno>
<idno type="wicri:explorRef" wicri:stream="PubMed" wicri:step="Corpus" wicri:corpus="PubMed">000B35</idno>
<idno type="wicri:Area/PubMed/Curation">000B35</idno>
<idno type="wicri:explorRef" wicri:stream="PubMed" wicri:step="Curation">000B35</idno>
<idno type="wicri:Area/PubMed/Checkpoint">000C26</idno>
<idno type="wicri:explorRef" wicri:stream="Checkpoint" wicri:step="PubMed">000C26</idno>
</publicationStmt>
<sourceDesc><biblStruct><analytic><title xml:lang="en">MetaCache: context-aware classification of metagenomic reads using minhashing.</title>
<author><name sortKey="Muller, Andre" sort="Muller, Andre" uniqKey="Muller A" first="André" last="Müller">André Müller</name>
<affiliation><nlm:affiliation>Department of Computer Science.</nlm:affiliation>
<wicri:noCountry code="no comma">Department of Computer Science.</wicri:noCountry>
</affiliation>
</author>
<author><name sortKey="Hundt, Christian" sort="Hundt, Christian" uniqKey="Hundt C" first="Christian" last="Hundt">Christian Hundt</name>
<affiliation><nlm:affiliation>Department of Computer Science.</nlm:affiliation>
<wicri:noCountry code="no comma">Department of Computer Science.</wicri:noCountry>
</affiliation>
</author>
<author><name sortKey="Hildebrandt, Andreas" sort="Hildebrandt, Andreas" uniqKey="Hildebrandt A" first="Andreas" last="Hildebrandt">Andreas Hildebrandt</name>
<affiliation><nlm:affiliation>Department of Computer Science.</nlm:affiliation>
<wicri:noCountry code="no comma">Department of Computer Science.</wicri:noCountry>
</affiliation>
</author>
<author><name sortKey="Hankeln, Thomas" sort="Hankeln, Thomas" uniqKey="Hankeln T" first="Thomas" last="Hankeln">Thomas Hankeln</name>
<affiliation wicri:level="3"><nlm:affiliation>Molecular Genetics and Genome Analysis Group, Department of Biology, Department of Biology, Johannes Gutenberg University, 55128 Mainz, Germany.</nlm:affiliation>
<country xml:lang="fr">Allemagne</country>
<wicri:regionArea>Molecular Genetics and Genome Analysis Group, Department of Biology, Department of Biology, Johannes Gutenberg University, 55128 Mainz</wicri:regionArea>
<placeName><region type="land" nuts="2">Rhénanie-Palatinat</region>
<settlement type="city">Mayence</settlement>
</placeName>
</affiliation>
</author>
<author><name sortKey="Schmidt, Bertil" sort="Schmidt, Bertil" uniqKey="Schmidt B" first="Bertil" last="Schmidt">Bertil Schmidt</name>
<affiliation><nlm:affiliation>Department of Computer Science.</nlm:affiliation>
<wicri:noCountry code="no comma">Department of Computer Science.</wicri:noCountry>
</affiliation>
</author>
</analytic>
<series><title level="j">Bioinformatics (Oxford, England)</title>
<idno type="eISSN">1367-4811</idno>
<imprint><date when="2017" type="published">2017</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
</fileDesc>
<profileDesc><textClass><keywords scheme="KwdEn" xml:lang="en"><term>Algorithms</term>
<term>High-Throughput Nucleotide Sequencing</term>
<term>Humans</term>
<term>Metagenomics (methods)</term>
<term>Sequence Analysis, DNA</term>
<term>Software</term>
</keywords>
<keywords scheme="KwdFr" xml:lang="fr"><term>Algorithmes</term>
<term>Analyse de séquence d'ADN</term>
<term>Humains</term>
<term>Logiciel</term>
<term>Métagénomique ()</term>
<term>Séquençage nucléotidique à haut débit</term>
</keywords>
<keywords scheme="MESH" qualifier="methods" xml:lang="en"><term>Metagenomics</term>
</keywords>
<keywords scheme="MESH" xml:lang="en"><term>Algorithms</term>
<term>High-Throughput Nucleotide Sequencing</term>
<term>Humans</term>
<term>Sequence Analysis, DNA</term>
<term>Software</term>
</keywords>
<keywords scheme="MESH" xml:lang="fr"><term>Algorithmes</term>
<term>Analyse de séquence d'ADN</term>
<term>Humains</term>
<term>Logiciel</term>
<term>Métagénomique</term>
<term>Séquençage nucléotidique à haut débit</term>
</keywords>
</textClass>
</profileDesc>
</teiHeader>
<front><div type="abstract" xml:lang="en">Metagenomic shotgun sequencing studies are becoming increasingly popular with prominent examples including the sequencing of human microbiomes and diverse environments. A fundamental computational problem in this context is read classification, i.e. the assignment of each read to a taxonomic label. Due to the large number of reads produced by modern high-throughput sequencing technologies and the rapidly increasing number of available reference genomes corresponding software tools suffer from either long runtimes, large memory requirements or low accuracy.</div>
</front>
</TEI>
<pubmed><MedlineCitation Status="MEDLINE" IndexingMethod="Curated" Owner="NLM"><PMID Version="1">28961782</PMID>
<DateCompleted><Year>2018</Year>
<Month>08</Month>
<Day>02</Day>
</DateCompleted>
<DateRevised><Year>2018</Year>
<Month>12</Month>
<Day>02</Day>
</DateRevised>
<Article PubModel="Print"><Journal><ISSN IssnType="Electronic">1367-4811</ISSN>
<JournalIssue CitedMedium="Internet"><Volume>33</Volume>
<Issue>23</Issue>
<PubDate><Year>2017</Year>
<Month>Dec</Month>
<Day>01</Day>
</PubDate>
</JournalIssue>
<Title>Bioinformatics (Oxford, England)</Title>
<ISOAbbreviation>Bioinformatics</ISOAbbreviation>
</Journal>
<ArticleTitle>MetaCache: context-aware classification of metagenomic reads using minhashing.</ArticleTitle>
<Pagination><MedlinePgn>3740-3748</MedlinePgn>
</Pagination>
<ELocationID EIdType="doi" ValidYN="Y">10.1093/bioinformatics/btx520</ELocationID>
<Abstract><AbstractText Label="Motivation" NlmCategory="UNASSIGNED">Metagenomic shotgun sequencing studies are becoming increasingly popular with prominent examples including the sequencing of human microbiomes and diverse environments. A fundamental computational problem in this context is read classification, i.e. the assignment of each read to a taxonomic label. Due to the large number of reads produced by modern high-throughput sequencing technologies and the rapidly increasing number of available reference genomes corresponding software tools suffer from either long runtimes, large memory requirements or low accuracy.</AbstractText>
<AbstractText Label="Results" NlmCategory="UNASSIGNED">We introduce MetaCache-a novel software for read classification using the big data technique minhashing. Our approach performs context-aware classification of reads by computing representative subsamples of k-mers within both, probed reads and locally constrained regions of the reference genomes. As a result, MetaCache consumes significantly less memory compared to the state-of-the-art read classifiers Kraken and CLARK while achieving highly competitive sensitivity and precision at comparable speed. For example, using NCBI RefSeq draft and completed genomes with a total length of around 140 billion bases as reference, MetaCache's database consumes only 62 GB of memory while both Kraken and CLARK fail to construct their respective databases on a workstation with 512 GB RAM. Our experimental results further show that classification accuracy continuously improves when increasing the amount of utilized reference genome data.</AbstractText>
<AbstractText Label="Availability and implementation" NlmCategory="UNASSIGNED">MetaCache is open source software written in C++ and can be downloaded at http://github.com/muellan/metacache.</AbstractText>
<AbstractText Label="Contact" NlmCategory="UNASSIGNED">bertil.schmidt@uni-mainz.de.</AbstractText>
<AbstractText Label="Supplementary information" NlmCategory="UNASSIGNED">Supplementary data are available at Bioinformatics online.</AbstractText>
<CopyrightInformation>© The Author (2017). Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com</CopyrightInformation>
</Abstract>
<AuthorList CompleteYN="Y"><Author ValidYN="Y"><LastName>Müller</LastName>
<ForeName>André</ForeName>
<Initials>A</Initials>
<AffiliationInfo><Affiliation>Department of Computer Science.</Affiliation>
</AffiliationInfo>
</Author>
<Author ValidYN="Y"><LastName>Hundt</LastName>
<ForeName>Christian</ForeName>
<Initials>C</Initials>
<AffiliationInfo><Affiliation>Department of Computer Science.</Affiliation>
</AffiliationInfo>
</Author>
<Author ValidYN="Y"><LastName>Hildebrandt</LastName>
<ForeName>Andreas</ForeName>
<Initials>A</Initials>
<AffiliationInfo><Affiliation>Department of Computer Science.</Affiliation>
</AffiliationInfo>
</Author>
<Author ValidYN="Y"><LastName>Hankeln</LastName>
<ForeName>Thomas</ForeName>
<Initials>T</Initials>
<AffiliationInfo><Affiliation>Molecular Genetics and Genome Analysis Group, Department of Biology, Department of Biology, Johannes Gutenberg University, 55128 Mainz, Germany.</Affiliation>
</AffiliationInfo>
</Author>
<Author ValidYN="Y"><LastName>Schmidt</LastName>
<ForeName>Bertil</ForeName>
<Initials>B</Initials>
<AffiliationInfo><Affiliation>Department of Computer Science.</Affiliation>
</AffiliationInfo>
</Author>
</AuthorList>
<Language>eng</Language>
<PublicationTypeList><PublicationType UI="D016428">Journal Article</PublicationType>
</PublicationTypeList>
</Article>
<MedlineJournalInfo><Country>England</Country>
<MedlineTA>Bioinformatics</MedlineTA>
<NlmUniqueID>9808944</NlmUniqueID>
<ISSNLinking>1367-4803</ISSNLinking>
</MedlineJournalInfo>
<CitationSubset>IM</CitationSubset>
<MeshHeadingList><MeshHeading><DescriptorName UI="D000465" MajorTopicYN="N">Algorithms</DescriptorName>
</MeshHeading>
<MeshHeading><DescriptorName UI="D059014" MajorTopicYN="N">High-Throughput Nucleotide Sequencing</DescriptorName>
</MeshHeading>
<MeshHeading><DescriptorName UI="D006801" MajorTopicYN="N">Humans</DescriptorName>
</MeshHeading>
<MeshHeading><DescriptorName UI="D056186" MajorTopicYN="N">Metagenomics</DescriptorName>
<QualifierName UI="Q000379" MajorTopicYN="Y">methods</QualifierName>
</MeshHeading>
<MeshHeading><DescriptorName UI="D017422" MajorTopicYN="N">Sequence Analysis, DNA</DescriptorName>
</MeshHeading>
<MeshHeading><DescriptorName UI="D012984" MajorTopicYN="Y">Software</DescriptorName>
</MeshHeading>
</MeshHeadingList>
</MedlineCitation>
<PubmedData><History><PubMedPubDate PubStatus="received"><Year>2017</Year>
<Month>02</Month>
<Day>15</Day>
</PubMedPubDate>
<PubMedPubDate PubStatus="accepted"><Year>2017</Year>
<Month>08</Month>
<Day>14</Day>
</PubMedPubDate>
<PubMedPubDate PubStatus="pubmed"><Year>2017</Year>
<Month>9</Month>
<Day>30</Day>
<Hour>6</Hour>
<Minute>0</Minute>
</PubMedPubDate>
<PubMedPubDate PubStatus="medline"><Year>2018</Year>
<Month>8</Month>
<Day>3</Day>
<Hour>6</Hour>
<Minute>0</Minute>
</PubMedPubDate>
<PubMedPubDate PubStatus="entrez"><Year>2017</Year>
<Month>9</Month>
<Day>30</Day>
<Hour>6</Hour>
<Minute>0</Minute>
</PubMedPubDate>
</History>
<PublicationStatus>ppublish</PublicationStatus>
<ArticleIdList><ArticleId IdType="pubmed">28961782</ArticleId>
<ArticleId IdType="pii">4083578</ArticleId>
<ArticleId IdType="doi">10.1093/bioinformatics/btx520</ArticleId>
</ArticleIdList>
</PubmedData>
</pubmed>
<affiliations><list><country><li>Allemagne</li>
</country>
<region><li>Rhénanie-Palatinat</li>
</region>
<settlement><li>Mayence</li>
</settlement>
</list>
<tree><noCountry><name sortKey="Hildebrandt, Andreas" sort="Hildebrandt, Andreas" uniqKey="Hildebrandt A" first="Andreas" last="Hildebrandt">Andreas Hildebrandt</name>
<name sortKey="Hundt, Christian" sort="Hundt, Christian" uniqKey="Hundt C" first="Christian" last="Hundt">Christian Hundt</name>
<name sortKey="Muller, Andre" sort="Muller, Andre" uniqKey="Muller A" first="André" last="Müller">André Müller</name>
<name sortKey="Schmidt, Bertil" sort="Schmidt, Bertil" uniqKey="Schmidt B" first="Bertil" last="Schmidt">Bertil Schmidt</name>
</noCountry>
<country name="Allemagne"><region name="Rhénanie-Palatinat"><name sortKey="Hankeln, Thomas" sort="Hankeln, Thomas" uniqKey="Hankeln T" first="Thomas" last="Hankeln">Thomas Hankeln</name>
</region>
</country>
</tree>
</affiliations>
</record>

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Sante/explor/MersV1/Data/PubMed/Checkpoint

HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000C26 | SxmlIndent | more

# Equivalent form, assuming EXPLOR_AREA is set to the area root ($WICRI_ROOT/Sante/explor/MersV1):
HfdSelect -h $EXPLOR_AREA/Data/PubMed/Checkpoint/biblio.hfd -nk 000C26 | SxmlIndent | more
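
Once a record has been extracted, its fields can be read with a general-purpose XML parser. The sketch below uses Python's standard `xml.etree.ElementTree`. As displayed on this page, the TEI part of the record uses the `nlm:` and `wicri:` prefixes without namespace declarations, which a strict parser rejects, so this example parses only the `<pubmed>` subtree. The helper names `pubmed_fragment` and `summarize` and the trimmed `SAMPLE` record are hypothetical and not part of Dilib.

```python
import xml.etree.ElementTree as ET

# Trimmed stand-in for the record shown above (illustrative only).
SAMPLE = """<record><TEI><teiHeader><nlm:affiliation>Department of Computer Science.</nlm:affiliation></teiHeader></TEI>
<pubmed><MedlineCitation><PMID Version="1">28961782</PMID>
<Article><ArticleTitle>MetaCache: context-aware classification of metagenomic reads using minhashing.</ArticleTitle>
<AuthorList><Author><LastName>Müller</LastName><ForeName>André</ForeName></Author>
<Author><LastName>Schmidt</LastName><ForeName>Bertil</ForeName></Author></AuthorList>
</Article></MedlineCitation>
<PubmedData><ArticleIdList><ArticleId IdType="pubmed">28961782</ArticleId>
<ArticleId IdType="doi">10.1093/bioinformatics/btx520</ArticleId></ArticleIdList></PubmedData>
</pubmed></record>"""

def pubmed_fragment(record_xml):
    """Return the <pubmed>...</pubmed> substring of a record."""
    start = record_xml.index("<pubmed>")
    end = record_xml.index("</pubmed>") + len("</pubmed>")
    return record_xml[start:end]

def summarize(record_xml):
    """Extract PMID, title, DOI and author names from the PubMed subtree."""
    root = ET.fromstring(pubmed_fragment(record_xml))
    authors = [
        f"{a.findtext('ForeName')} {a.findtext('LastName')}"
        for a in root.iter("Author")
    ]
    return {
        "pmid": root.findtext("MedlineCitation/PMID"),
        "title": root.findtext("MedlineCitation/Article/ArticleTitle"),
        "doi": root.findtext("PubmedData/ArticleIdList/ArticleId[@IdType='doi']"),
        "authors": authors,
    }

if __name__ == "__main__":
    print(summarize(SAMPLE))
```

Cutting out the `<pubmed>` element by string search is a deliberate shortcut for this sketch. A more robust approach would declare the missing namespaces before parsing the whole record.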

To link to this page within the Wicri network

{{Explor lien
   |wiki=    Sante
   |area=    MersV1
   |flux=    PubMed
   |étape=   Checkpoint
   |type=    RBID
   |clé=     pubmed:28961782
   |texte=   MetaCache: context-aware classification of metagenomic reads using minhashing.
}}

To generate wiki pages

HfdIndexSelect -h $EXPLOR_AREA/Data/PubMed/Checkpoint/RBID.i   -Sk "pubmed:28961782" \
       | HfdSelect -Kh $EXPLOR_AREA/Data/PubMed/Checkpoint/biblio.hfd   \
       | NlmPubMed2Wicri -a MersV1

This area was generated with Dilib version V0.6.33.
Data generation: Mon Apr 20 23:26:43 2020. Site generation: Sat Mar 27 09:06:09 2021

	MERS exploration server
	Warning: this site is under development! It was generated automatically from raw corpora, so the information has not been validated.
