MERS exploration server

Warning: this site is under development!
Warning: this site was generated automatically from raw corpora.
The information has therefore not been validated.

MetaCluster 5.0: a two-round binning approach for metagenomic data for low-abundance species in a noisy sample

Internal identifier: 002233 (Main/Exploration); previous: 002232; next: 002234

Authors: Yi Wang; Henry C. M. Leung; S. M. Yiu; Francis Y. L. Chin

Source: Bioinformatics (ISSN 1367-4803), 2012

RBID : PMC:3436824

Abstract

Motivation: Metagenomic binning remains an important topic in metagenomic analysis. Existing unsupervised binning methods for next-generation sequencing (NGS) reads do not perform well on (i) samples with low-abundance species or (ii) samples (even with high abundance) when there are many extremely low-abundance species. These two problems are common for real metagenomic datasets. Binning methods that can solve these problems are desirable.

Results: We propose a two-round binning method (MetaCluster 5.0) that aims to identify both low-abundance and high-abundance species in the presence of a large amount of noise due to many extremely low-abundance species. In summary, MetaCluster 5.0 uses a filtering strategy to remove noise from the extremely low-abundance species. It separates reads of high-abundance species from those of low-abundance species in two different rounds. To overcome the issue of low coverage for low-abundance species, multiple w values are used to group reads with overlapping w-mers: reads from high-abundance species are grouped with high confidence based on a large w, and binning then expands to low-abundance species using a relaxed (shorter) w. Compared with the recent tools TOSS and MetaCluster 4.0, MetaCluster 5.0 can find more species (especially those with low abundance of, say, 6× to 10×) and can achieve better sensitivity and specificity using less memory and running time.
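
The two-round idea described above can be illustrated with a small Python sketch. This is not the MetaCluster 5.0 implementation: the function names, the union-find grouping, and the `min_group_size` threshold (a rough stand-in for "high confidence") are assumptions made for illustration only, and the noise-filtering step is not modelled. It groups reads that share at least one w-mer, first with a large w, then regroups the leftover reads with a shorter w.

```python
from collections import defaultdict


def wmers(read, w):
    """Return the set of all length-w substrings (w-mers) of a read."""
    return {read[i:i + w] for i in range(len(read) - w + 1)}


def group_by_shared_wmers(reads, w):
    """Group read indices so that reads sharing a w-mer end up together."""
    parent = list(range(len(reads)))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    first_seen = {}  # w-mer -> index of the first read that contained it
    for idx, read in enumerate(reads):
        for mer in wmers(read, w):
            if mer in first_seen:
                parent[find(idx)] = find(first_seen[mer])
            else:
                first_seen[mer] = idx

    groups = defaultdict(list)
    for idx in range(len(reads)):
        groups[find(idx)].append(idx)
    return sorted(sorted(g) for g in groups.values())


def two_round_grouping(reads, w_large, w_small, min_group_size):
    """Hypothetical two-round scheme: strict grouping, then relaxed grouping."""
    # Round 1: a large w; keep only groups big enough to be trusted
    # (min_group_size is an assumed proxy for high abundance).
    round1 = [g for g in group_by_shared_wmers(reads, w_large)
              if len(g) >= min_group_size]
    assigned = {i for g in round1 for i in g}

    # Round 2: regroup the remaining reads with a relaxed (shorter) w.
    rest = [i for i in range(len(reads)) if i not in assigned]
    local = group_by_shared_wmers([reads[i] for i in rest], w_small)
    round2 = [[rest[j] for j in g] for g in local]
    return round1, round2


# Toy reads: indices 0-2 overlap heavily, 3-4 overlap only slightly.
reads = ["ACGTACGTAA", "CGTACGTAAC", "GTACGTAACC", "TTTTGGGGCC", "TGGGGCCAAA"]
high, low = two_round_grouping(reads, w_large=8, w_small=4, min_group_size=3)
```

With these toy reads, the first three reads share 8-mers and form the round-one group, while the last two share only shorter w-mers and are grouped in round two. Real reads would also need reverse-complement handling and sequencing-error tolerance, which this sketch omits.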

Availability: http://i.cs.hku.hk/~alse/MetaCluster/

Contact: chin@cs.hku.hk


Url:
DOI: 10.1093/bioinformatics/bts397
PubMed: 22962452
PubMed Central: 3436824


The document in XML format

<record>
<TEI>
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en">MetaCluster 5.0: a two-round binning approach for metagenomic data for low-abundance species in a noisy sample</title>
<author>
<name sortKey="Wang, Yi" sort="Wang, Yi" uniqKey="Wang Y" first="Yi" last="Wang">Yi Wang</name>
</author>
<author>
<name sortKey="Leung, Henry C M" sort="Leung, Henry C M" uniqKey="Leung H" first="Henry C. M." last="Leung">Henry C. M. Leung</name>
</author>
<author>
<name sortKey="Yiu, S M" sort="Yiu, S M" uniqKey="Yiu S" first="S. M." last="Yiu">S. M. Yiu</name>
</author>
<author>
<name sortKey="Chin, Francis Y L" sort="Chin, Francis Y L" uniqKey="Chin F" first="Francis Y. L." last="Chin">Francis Y. L. Chin</name>
</author>
</titleStmt>
<publicationStmt>
<idno type="wicri:source">PMC</idno>
<idno type="pmid">22962452</idno>
<idno type="pmc">3436824</idno>
<idno type="url">http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3436824</idno>
<idno type="RBID">PMC:3436824</idno>
<idno type="doi">10.1093/bioinformatics/bts397</idno>
<date when="2012">2012</date>
<idno type="wicri:Area/Pmc/Corpus">000B08</idno>
<idno type="wicri:explorRef" wicri:stream="Pmc" wicri:step="Corpus" wicri:corpus="PMC">000B08</idno>
<idno type="wicri:Area/Pmc/Curation">000B08</idno>
<idno type="wicri:explorRef" wicri:stream="Pmc" wicri:step="Curation">000B08</idno>
<idno type="wicri:Area/Pmc/Checkpoint">001295</idno>
<idno type="wicri:explorRef" wicri:stream="Pmc" wicri:step="Checkpoint">001295</idno>
<idno type="wicri:source">PubMed</idno>
<idno type="RBID">pubmed:22962452</idno>
<idno type="wicri:Area/PubMed/Corpus">001D51</idno>
<idno type="wicri:explorRef" wicri:stream="PubMed" wicri:step="Corpus" wicri:corpus="PubMed">001D51</idno>
<idno type="wicri:Area/PubMed/Curation">001D51</idno>
<idno type="wicri:explorRef" wicri:stream="PubMed" wicri:step="Curation">001D51</idno>
<idno type="wicri:Area/PubMed/Checkpoint">001C59</idno>
<idno type="wicri:explorRef" wicri:stream="Checkpoint" wicri:step="PubMed">001C59</idno>
<idno type="wicri:Area/Ncbi/Merge">000992</idno>
<idno type="wicri:Area/Ncbi/Curation">000992</idno>
<idno type="wicri:Area/Ncbi/Checkpoint">000992</idno>
<idno type="wicri:doubleKey">1367-4803:2012:Wang Y:metacluster:a:two</idno>
<idno type="wicri:Area/Main/Merge">002258</idno>
<idno type="wicri:Area/Main/Curation">002233</idno>
<idno type="wicri:Area/Main/Exploration">002233</idno>
</publicationStmt>
<sourceDesc>
<biblStruct>
<analytic>
<title xml:lang="en" level="a" type="main">MetaCluster 5.0: a two-round binning approach for metagenomic data for low-abundance species in a noisy sample</title>
<author>
<name sortKey="Wang, Yi" sort="Wang, Yi" uniqKey="Wang Y" first="Yi" last="Wang">Yi Wang</name>
</author>
<author>
<name sortKey="Leung, Henry C M" sort="Leung, Henry C M" uniqKey="Leung H" first="Henry C. M." last="Leung">Henry C. M. Leung</name>
</author>
<author>
<name sortKey="Yiu, S M" sort="Yiu, S M" uniqKey="Yiu S" first="S. M." last="Yiu">S. M. Yiu</name>
</author>
<author>
<name sortKey="Chin, Francis Y L" sort="Chin, Francis Y L" uniqKey="Chin F" first="Francis Y. L." last="Chin">Francis Y. L. Chin</name>
</author>
</analytic>
<series>
<title level="j">Bioinformatics</title>
<idno type="ISSN">1367-4803</idno>
<idno type="eISSN">1367-4811</idno>
<imprint>
<date when="2012">2012</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
</fileDesc>
<profileDesc>
<textClass>
<keywords scheme="KwdEn" xml:lang="en">
<term>Algorithms</term>
<term>Metagenomics (methods)</term>
<term>Sensitivity and Specificity</term>
<term>Sequence Analysis, DNA (methods)</term>
<term>Software</term>
</keywords>
<keywords scheme="KwdFr" xml:lang="fr">
<term>Algorithmes</term>
<term>Analyse de séquence d'ADN (méthodes)</term>
<term>Logiciel</term>
<term>Métagénomique (méthodes)</term>
<term>Sensibilité et spécificité</term>
</keywords>
<keywords scheme="MESH" qualifier="methods" xml:lang="en">
<term>Metagenomics</term>
<term>Sequence Analysis, DNA</term>
</keywords>
<keywords scheme="MESH" xml:lang="en">
<term>Algorithms</term>
<term>Sensitivity and Specificity</term>
<term>Software</term>
</keywords>
<keywords scheme="MESH" xml:lang="fr">
<term>Algorithmes</term>
<term>Analyse de séquence d'ADN</term>
<term>Logiciel</term>
<term>Métagénomique</term>
<term>Sensibilité et spécificité</term>
</keywords>
</textClass>
</profileDesc>
</teiHeader>
<front>
<div type="abstract" xml:lang="en">
<p>
<bold>Motivation:</bold>
Metagenomic binning remains an important topic in metagenomic analysis. Existing unsupervised binning methods for next-generation sequencing (NGS) reads do not perform well on (i) samples with low-abundance species or (ii) samples (even with high abundance) when there are many extremely low-abundance species. These two problems are common for real metagenomic datasets. Binning methods that can solve these problems are desirable.</p>
<p>
<bold>Results:</bold>
We propose a two-round binning method (MetaCluster 5.0) that aims to identify both low-abundance and high-abundance species in the presence of a large amount of noise due to many extremely low-abundance species. In summary, MetaCluster 5.0 uses a filtering strategy to remove noise from the extremely low-abundance species. It separates reads of high-abundance species from those of low-abundance species in two different rounds. To overcome the issue of low coverage for low-abundance species, multiple
<italic>w</italic>
values are used to group reads with overlapping
<italic>w</italic>
-mers, whereas reads from high-abundance species are grouped with high confidence based on a large
<italic>w</italic>
and then binning expands to low-abundance species using a relaxed (shorter)
<italic>w</italic>
. Compared with the recent tools TOSS and MetaCluster 4.0, MetaCluster 5.0 can find more species (especially those with low abundance of, say, 6× to 10×) and can achieve better sensitivity and specificity using less memory and running time.</p>
<p>
<bold>Availability:</bold>
<ext-link ext-link-type="uri" xlink:href="http://i.cs.hku.hk/~alse/MetaCluster/">http://i.cs.hku.hk/~alse/MetaCluster/</ext-link>
</p>
<p>
<bold>Contact:</bold>
<email>chin@cs.hku.hk</email>
</p>
</div>
</front>
<back>
</back>
</TEI>
<affiliations>
<list></list>
<tree>
<noCountry>
<name sortKey="Chin, Francis Y L" sort="Chin, Francis Y L" uniqKey="Chin F" first="Francis Y. L." last="Chin">Francis Y. L. Chin</name>
<name sortKey="Leung, Henry C M" sort="Leung, Henry C M" uniqKey="Leung H" first="Henry C. M." last="Leung">Henry C. M. Leung</name>
<name sortKey="Wang, Yi" sort="Wang, Yi" uniqKey="Wang Y" first="Yi" last="Wang">Yi Wang</name>
<name sortKey="Yiu, S M" sort="Yiu, S M" uniqKey="Yiu S" first="S. M." last="Yiu">S. M. Yiu</name>
</noCountry>
</tree>
</affiliations>
</record>

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Sante/explor/MersV1/Data/Main/Exploration
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 002233 | SxmlIndent | more

Or

HfdSelect -h $EXPLOR_AREA/Data/Main/Exploration/biblio.hfd -nk 002233 | SxmlIndent | more

To link to this page within the Wicri network

{{Explor lien
   |wiki=    Sante
   |area=    MersV1
   |flux=    Main
   |étape=   Exploration
   |type=    RBID
   |clé=     PMC:3436824
   |texte=   MetaCluster 5.0: a two-round binning approach for metagenomic data for low-abundance species in a noisy sample
}}

To generate wiki pages

HfdIndexSelect -h $EXPLOR_AREA/Data/Main/Exploration/RBID.i   -Sk "pubmed:22962452" \
       | HfdSelect -Kh $EXPLOR_AREA/Data/Main/Exploration/biblio.hfd   \
       | NlmPubMed2Wicri -a MersV1 

Wicri

This area was generated with Dilib version V0.6.33.
Data generation: Mon Apr 20 23:26:43 2020. Site generation: Sat Mar 27 09:06:09 2021