MERS exploration server

Exploiting topic modeling to boost metagenomic reads binning.

Internal identifier: 001552 (PubMed/Checkpoint); previous: 001551; next: 001553

Authors: Ruichang Zhang; Zhanzhan Cheng; Jihong Guan; Shuigeng Zhou

Source: BMC bioinformatics (eISSN 1471-2105), 2015

RBID : pubmed:25859745

Abstract

With the rapid development of high-throughput technologies, researchers can sequence the whole metagenome of a microbial community sampled directly from the environment. The assignment of these metagenomic reads into different species or taxonomical classes is a vital step for metagenomic analysis, which is referred to as binning of metagenomic data.

DOI: 10.1186/1471-2105-16-S5-S2
PubMed: 25859745
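
The RESULTS part of the abstract (given in full in the XML record) says that each read is first represented as a set of k-mers with their frequencies in the read. A minimal sketch of that representation step follows. It is not the authors' implementation: the function name `kmer_frequencies`, the default `k=4`, the normalization to relative frequencies, and the skipping of k-mers with ambiguous bases are all assumptions made for illustration. The later LDA and SKWIC steps are not shown.

```python
from collections import Counter
from typing import Dict


def kmer_frequencies(read: str, k: int = 4) -> Dict[str, float]:
    """Return the relative frequency of each k-mer in one read.

    Hypothetical helper: k=4 and the normalization are illustrative
    assumptions, not values taken from the paper.
    """
    read = read.upper()
    # Slide a window of width k over the read.
    kmers = [read[i:i + k] for i in range(len(read) - k + 1)]
    # Assumption: drop k-mers containing ambiguous bases such as N.
    kmers = [km for km in kmers if set(km) <= set("ACGT")]
    counts = Counter(kmers)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {km: c / total for km, c in counts.items()}


freqs = kmer_frequencies("ACGTACGT", k=4)
```

Each read then becomes a sparse vector indexed by k-mer, which is the kind of "bag of words" input a topic model such as LDA expects.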

The document in XML format

<record>
<TEI>
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en">Exploiting topic modeling to boost metagenomic reads binning.</title>
<author>
<name sortKey="Zhang, Ruichang" sort="Zhang, Ruichang" uniqKey="Zhang R" first="Ruichang" last="Zhang">Ruichang Zhang</name>
</author>
<author>
<name sortKey="Cheng, Zhanzhan" sort="Cheng, Zhanzhan" uniqKey="Cheng Z" first="Zhanzhan" last="Cheng">Zhanzhan Cheng</name>
</author>
<author>
<name sortKey="Guan, Jihong" sort="Guan, Jihong" uniqKey="Guan J" first="Jihong" last="Guan">Jihong Guan</name>
</author>
<author>
<name sortKey="Zhou, Shuigeng" sort="Zhou, Shuigeng" uniqKey="Zhou S" first="Shuigeng" last="Zhou">Shuigeng Zhou</name>
</author>
</titleStmt>
<publicationStmt>
<idno type="wicri:source">PubMed</idno>
<date when="2015">2015</date>
<idno type="RBID">pubmed:25859745</idno>
<idno type="pmid">25859745</idno>
<idno type="doi">10.1186/1471-2105-16-S5-S2</idno>
<idno type="wicri:Area/PubMed/Corpus">001652</idno>
<idno type="wicri:explorRef" wicri:stream="PubMed" wicri:step="Corpus" wicri:corpus="PubMed">001652</idno>
<idno type="wicri:Area/PubMed/Curation">001652</idno>
<idno type="wicri:explorRef" wicri:stream="PubMed" wicri:step="Curation">001652</idno>
<idno type="wicri:Area/PubMed/Checkpoint">001552</idno>
<idno type="wicri:explorRef" wicri:stream="PubMed" wicri:step="Checkpoint">001552</idno>
</publicationStmt>
<sourceDesc>
<biblStruct>
<analytic>
<title xml:lang="en">Exploiting topic modeling to boost metagenomic reads binning.</title>
<author>
<name sortKey="Zhang, Ruichang" sort="Zhang, Ruichang" uniqKey="Zhang R" first="Ruichang" last="Zhang">Ruichang Zhang</name>
</author>
<author>
<name sortKey="Cheng, Zhanzhan" sort="Cheng, Zhanzhan" uniqKey="Cheng Z" first="Zhanzhan" last="Cheng">Zhanzhan Cheng</name>
</author>
<author>
<name sortKey="Guan, Jihong" sort="Guan, Jihong" uniqKey="Guan J" first="Jihong" last="Guan">Jihong Guan</name>
</author>
<author>
<name sortKey="Zhou, Shuigeng" sort="Zhou, Shuigeng" uniqKey="Zhou S" first="Shuigeng" last="Zhou">Shuigeng Zhou</name>
</author>
</analytic>
<series>
<title level="j">BMC bioinformatics</title>
<idno type="eISSN">1471-2105</idno>
<imprint>
<date when="2015" type="published">2015</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
</fileDesc>
<profileDesc>
<textClass>
<keywords scheme="KwdEn" xml:lang="en">
<term>Algorithms</term>
<term>Cluster Analysis</term>
<term>DNA Barcoding, Taxonomic (methods)</term>
<term>Genome, Bacterial</term>
<term>High-Throughput Nucleotide Sequencing</term>
<term>Metagenome</term>
<term>Metagenomics (methods)</term>
<term>Microbiota (genetics)</term>
<term>Molecular Sequence Annotation (methods)</term>
<term>Phylogeny</term>
<term>Sequence Analysis, DNA (methods)</term>
<term>Software</term>
</keywords>
<keywords scheme="KwdFr" xml:lang="fr">
<term>Algorithmes</term>
<term>Analyse de regroupements</term>
<term>Analyse de séquence d'ADN ()</term>
<term>Annotation de séquence moléculaire ()</term>
<term>Codage à barres de l'ADN pour la taxonomie ()</term>
<term>Génome bactérien</term>
<term>Logiciel</term>
<term>Microbiote (génétique)</term>
<term>Métagénome</term>
<term>Métagénomique ()</term>
<term>Phylogénie</term>
<term>Séquençage nucléotidique à haut débit</term>
</keywords>
<keywords scheme="MESH" qualifier="genetics" xml:lang="en">
<term>Microbiota</term>
</keywords>
<keywords scheme="MESH" qualifier="génétique" xml:lang="fr">
<term>Microbiote</term>
</keywords>
<keywords scheme="MESH" qualifier="methods" xml:lang="en">
<term>DNA Barcoding, Taxonomic</term>
<term>Metagenomics</term>
<term>Molecular Sequence Annotation</term>
<term>Sequence Analysis, DNA</term>
</keywords>
<keywords scheme="MESH" xml:lang="en">
<term>Algorithms</term>
<term>Cluster Analysis</term>
<term>Genome, Bacterial</term>
<term>High-Throughput Nucleotide Sequencing</term>
<term>Metagenome</term>
<term>Phylogeny</term>
<term>Software</term>
</keywords>
<keywords scheme="MESH" xml:lang="fr">
<term>Algorithmes</term>
<term>Analyse de regroupements</term>
<term>Analyse de séquence d'ADN</term>
<term>Annotation de séquence moléculaire</term>
<term>Codage à barres de l'ADN pour la taxonomie</term>
<term>Génome bactérien</term>
<term>Logiciel</term>
<term>Métagénome</term>
<term>Métagénomique</term>
<term>Phylogénie</term>
<term>Séquençage nucléotidique à haut débit</term>
</keywords>
</textClass>
</profileDesc>
</teiHeader>
<front>
<div type="abstract" xml:lang="en">With the rapid development of high-throughput technologies, researchers can sequence the whole metagenome of a microbial community sampled directly from the environment. The assignment of these metagenomic reads into different species or taxonomical classes is a vital step for metagenomic analysis, which is referred to as binning of metagenomic data.</div>
</front>
</TEI>
<pubmed>
<MedlineCitation Status="MEDLINE" IndexingMethod="Curated" Owner="NLM">
<PMID Version="1">25859745</PMID>
<DateCompleted>
<Year>2015</Year>
<Month>07</Month>
<Day>28</Day>
</DateCompleted>
<DateRevised>
<Year>2018</Year>
<Month>12</Month>
<Day>02</Day>
</DateRevised>
<Article PubModel="Print-Electronic">
<Journal>
<ISSN IssnType="Electronic">1471-2105</ISSN>
<JournalIssue CitedMedium="Internet">
<Volume>16 Suppl 5</Volume>
<PubDate>
<Year>2015</Year>
</PubDate>
</JournalIssue>
<Title>BMC bioinformatics</Title>
<ISOAbbreviation>BMC Bioinformatics</ISOAbbreviation>
</Journal>
<ArticleTitle>Exploiting topic modeling to boost metagenomic reads binning.</ArticleTitle>
<Pagination>
<MedlinePgn>S2</MedlinePgn>
</Pagination>
<ELocationID EIdType="doi" ValidYN="Y">10.1186/1471-2105-16-S5-S2</ELocationID>
<Abstract>
<AbstractText Label="BACKGROUND" NlmCategory="BACKGROUND">With the rapid development of high-throughput technologies, researchers can sequence the whole metagenome of a microbial community sampled directly from the environment. The assignment of these metagenomic reads into different species or taxonomical classes is a vital step for metagenomic analysis, which is referred to as binning of metagenomic data.</AbstractText>
<AbstractText Label="RESULTS" NlmCategory="RESULTS">In this paper, we propose a new method TM-MCluster for binning metagenomic reads. First, we represent each metagenomic read as a set of "k-mers" with their frequencies occurring in the read. Then, we employ a probabilistic topic model -- the Latent Dirichlet Allocation (LDA) model to the reads, which generates a number of hidden "topics" such that each read can be represented by a distribution vector of the generated topics. Finally, as in the MCluster method, we apply SKWIC -- a variant of the classical K-means algorithm with automatic feature weighting mechanism to cluster these reads represented by topic distributions.</AbstractText>
<AbstractText Label="CONCLUSIONS" NlmCategory="CONCLUSIONS">Experiments show that the new method TM-MCluster outperforms major existing methods, including AbundanceBin, MetaCluster 3.0/5.0 and MCluster. This result indicates that the exploitation of topic modeling can effectively improve the binning performance of metagenomic reads.</AbstractText>
</Abstract>
<AuthorList CompleteYN="Y">
<Author ValidYN="Y">
<LastName>Zhang</LastName>
<ForeName>Ruichang</ForeName>
<Initials>R</Initials>
</Author>
<Author ValidYN="Y">
<LastName>Cheng</LastName>
<ForeName>Zhanzhan</ForeName>
<Initials>Z</Initials>
</Author>
<Author ValidYN="Y">
<LastName>Guan</LastName>
<ForeName>Jihong</ForeName>
<Initials>J</Initials>
</Author>
<Author ValidYN="Y">
<LastName>Zhou</LastName>
<ForeName>Shuigeng</ForeName>
<Initials>S</Initials>
</Author>
</AuthorList>
<Language>eng</Language>
<PublicationTypeList>
<PublicationType UI="D016428">Journal Article</PublicationType>
<PublicationType UI="D013485">Research Support, Non-U.S. Gov't</PublicationType>
</PublicationTypeList>
<ArticleDate DateType="Electronic">
<Year>2015</Year>
<Month>03</Month>
<Day>18</Day>
</ArticleDate>
</Article>
<MedlineJournalInfo>
<Country>England</Country>
<MedlineTA>BMC Bioinformatics</MedlineTA>
<NlmUniqueID>100965194</NlmUniqueID>
<ISSNLinking>1471-2105</ISSNLinking>
</MedlineJournalInfo>
<CitationSubset>IM</CitationSubset>
<MeshHeadingList>
<MeshHeading>
<DescriptorName UI="D000465" MajorTopicYN="Y">Algorithms</DescriptorName>
</MeshHeading>
<MeshHeading>
<DescriptorName UI="D016000" MajorTopicYN="N">Cluster Analysis</DescriptorName>
</MeshHeading>
<MeshHeading>
<DescriptorName UI="D058893" MajorTopicYN="N">DNA Barcoding, Taxonomic</DescriptorName>
<QualifierName UI="Q000379" MajorTopicYN="Y">methods</QualifierName>
</MeshHeading>
<MeshHeading>
<DescriptorName UI="D016680" MajorTopicYN="N">Genome, Bacterial</DescriptorName>
</MeshHeading>
<MeshHeading>
<DescriptorName UI="D059014" MajorTopicYN="N">High-Throughput Nucleotide Sequencing</DescriptorName>
</MeshHeading>
<MeshHeading>
<DescriptorName UI="D054892" MajorTopicYN="Y">Metagenome</DescriptorName>
</MeshHeading>
<MeshHeading>
<DescriptorName UI="D056186" MajorTopicYN="N">Metagenomics</DescriptorName>
<QualifierName UI="Q000379" MajorTopicYN="Y">methods</QualifierName>
</MeshHeading>
<MeshHeading>
<DescriptorName UI="D064307" MajorTopicYN="N">Microbiota</DescriptorName>
<QualifierName UI="Q000235" MajorTopicYN="Y">genetics</QualifierName>
</MeshHeading>
<MeshHeading>
<DescriptorName UI="D058977" MajorTopicYN="N">Molecular Sequence Annotation</DescriptorName>
<QualifierName UI="Q000379" MajorTopicYN="Y">methods</QualifierName>
</MeshHeading>
<MeshHeading>
<DescriptorName UI="D010802" MajorTopicYN="N">Phylogeny</DescriptorName>
</MeshHeading>
<MeshHeading>
<DescriptorName UI="D017422" MajorTopicYN="N">Sequence Analysis, DNA</DescriptorName>
<QualifierName UI="Q000379" MajorTopicYN="Y">methods</QualifierName>
</MeshHeading>
<MeshHeading>
<DescriptorName UI="D012984" MajorTopicYN="N">Software</DescriptorName>
</MeshHeading>
</MeshHeadingList>
</MedlineCitation>
<PubmedData>
<History>
<PubMedPubDate PubStatus="entrez">
<Year>2015</Year>
<Month>4</Month>
<Day>11</Day>
<Hour>6</Hour>
<Minute>0</Minute>
</PubMedPubDate>
<PubMedPubDate PubStatus="pubmed">
<Year>2015</Year>
<Month>4</Month>
<Day>11</Day>
<Hour>6</Hour>
<Minute>0</Minute>
</PubMedPubDate>
<PubMedPubDate PubStatus="medline">
<Year>2015</Year>
<Month>7</Month>
<Day>29</Day>
<Hour>6</Hour>
<Minute>0</Minute>
</PubMedPubDate>
</History>
<PublicationStatus>ppublish</PublicationStatus>
<ArticleIdList>
<ArticleId IdType="pubmed">25859745</ArticleId>
<ArticleId IdType="pii">1471-2105-16-S5-S2</ArticleId>
<ArticleId IdType="doi">10.1186/1471-2105-16-S5-S2</ArticleId>
<ArticleId IdType="pmc">PMC4402587</ArticleId>
</ArticleIdList>
<ReferenceList>
<Reference>
<Citation>BMC Bioinformatics. 2009;10 Suppl 1:S12</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">19208111</ArticleId>
</ArticleIdList>
</Reference>
<Reference>
<Citation>IEEE/ACM Trans Comput Biol Bioinform. 2012 Jul-Aug;9(4):980-91</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">21844637</ArticleId>
</ArticleIdList>
</Reference>
<Reference>
<Citation>J Comput Biol. 2011 Mar;18(3):523-34</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">21385052</ArticleId>
</ArticleIdList>
</Reference>
<Reference>
<Citation>BMC Bioinformatics. 2009;10:56</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">19210774</ArticleId>
</ArticleIdList>
</Reference>
<Reference>
<Citation>PLoS One. 2008;3(8):e3064</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">18725973</ArticleId>
</ArticleIdList>
</Reference>
<Reference>
<Citation>Bioinformatics. 2012 Sep 15;28(18):i356-i362</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">22962452</ArticleId>
</ArticleIdList>
</Reference>
<Reference>
<Citation>Nat Methods. 2007 Jan;4(1):63-72</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">17179938</ArticleId>
</ArticleIdList>
</Reference>
<Reference>
<Citation>Nature. 2004 Mar 4;428(6978):37-43</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">14961025</ArticleId>
</ArticleIdList>
</Reference>
<Reference>
<Citation>Proc Natl Acad Sci U S A. 2004 Apr 6;101 Suppl 1:5228-35</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">14872004</ArticleId>
</ArticleIdList>
</Reference>
<Reference>
<Citation>Genome Inform. 2009 Oct;23(1):3-12</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">20180257</ArticleId>
</ArticleIdList>
</Reference>
<Reference>
<Citation>BMC Genomics. 2014;15 Suppl 1:S12</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">24564377</ArticleId>
</ArticleIdList>
</Reference>
<Reference>
<Citation>BMC Bioinformatics. 2008;9:546</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">19091119</ArticleId>
</ArticleIdList>
</Reference>
<Reference>
<Citation>Nature. 2010 Mar 4;464(7285):59-65</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">20203603</ArticleId>
</ArticleIdList>
</Reference>
<Reference>
<Citation>J Comput Biol. 2012 Feb;19(2):241-9</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">22300323</ArticleId>
</ArticleIdList>
</Reference>
<Reference>
<Citation>Nat Methods. 2009 Sep;6(9):673-6</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">19648916</ArticleId>
</ArticleIdList>
</Reference>
<Reference>
<Citation>IEEE/ACM Trans Comput Biol Bioinform. 2014 Jan-Feb;11(1):42-54</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">26355506</ArticleId>
</ArticleIdList>
</Reference>
<Reference>
<Citation>BMC Bioinformatics. 2006;7:58</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">16466569</ArticleId>
</ArticleIdList>
</Reference>
<Reference>
<Citation>PLoS One. 2008;3(10):e3373</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">18841204</ArticleId>
</ArticleIdList>
</Reference>
<Reference>
<Citation>Bioinformatics. 2011 Jun 1;27(11):1489-95</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">21493653</ArticleId>
</ArticleIdList>
</Reference>
<Reference>
<Citation>BMC Genomics. 2010;11:461</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">20687950</ArticleId>
</ArticleIdList>
</Reference>
<Reference>
<Citation>Nat Methods. 2007 Jun;4(6):495-500</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">17468765</ArticleId>
</ArticleIdList>
</Reference>
<Reference>
<Citation>Genome Biol. 2009;10(10):R108</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">19814784</ArticleId>
</ArticleIdList>
</Reference>
</ReferenceList>
</PubmedData>
</pubmed>
<affiliations>
<list></list>
<tree>
<noCountry>
<name sortKey="Cheng, Zhanzhan" sort="Cheng, Zhanzhan" uniqKey="Cheng Z" first="Zhanzhan" last="Cheng">Zhanzhan Cheng</name>
<name sortKey="Guan, Jihong" sort="Guan, Jihong" uniqKey="Guan J" first="Jihong" last="Guan">Jihong Guan</name>
<name sortKey="Zhang, Ruichang" sort="Zhang, Ruichang" uniqKey="Zhang R" first="Ruichang" last="Zhang">Ruichang Zhang</name>
<name sortKey="Zhou, Shuigeng" sort="Zhou, Shuigeng" uniqKey="Zhou S" first="Shuigeng" last="Zhou">Shuigeng Zhou</name>
</noCountry>
</tree>
</affiliations>
</record>

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Sante/explor/MersV1/Data/PubMed/Checkpoint
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 001552 | SxmlIndent | more

Or

HfdSelect -h $EXPLOR_AREA/Data/PubMed/Checkpoint/biblio.hfd -nk 001552 | SxmlIndent | more

To add a link to this page in the Wicri network

{{Explor lien
   |wiki=    Sante
   |area=    MersV1
   |flux=    PubMed
   |étape=   Checkpoint
   |type=    RBID
   |clé=     pubmed:25859745
   |texte=   Exploiting topic modeling to boost metagenomic reads binning.
}}

To generate wiki pages

HfdIndexSelect -h $EXPLOR_AREA/Data/PubMed/Checkpoint/RBID.i   -Sk "pubmed:25859745" \
       | HfdSelect -Kh $EXPLOR_AREA/Data/PubMed/Checkpoint/biblio.hfd   \
       | NlmPubMed2Wicri -a MersV1 

This area was generated with Dilib version V0.6.33.
Data generation: Mon Apr 20 23:26:43 2020. Site generation: Sat Mar 27 09:06:09 2021