MERS exploration server

Warning: this site is under development!
Warning: this site was generated automatically from raw corpora.
The information has therefore not been validated.

Exploiting topic modeling to boost metagenomic reads binning

Internal identifier: 000567 (Pmc/Curation); previous: 000566; next: 000568

Authors: Ruichang Zhang [People's Republic of China]; Zhanzhan Cheng [People's Republic of China]; Jihong Guan [People's Republic of China]; Shuigeng Zhou [People's Republic of China]

RBID : PMC:4402587

Abstract

Background

With the rapid development of high-throughput technologies, researchers can sequence the whole metagenome of a microbial community sampled directly from the environment. Assigning these metagenomic reads to different species or taxonomic classes, referred to as binning of metagenomic data, is a vital step in metagenomic analysis.

Results

In this paper, we propose a new method TM-MCluster for binning metagenomic reads. First, we represent each metagenomic read as a set of "k-mers" together with their frequencies in the read. Then, we apply a probabilistic topic model, Latent Dirichlet Allocation (LDA), to the reads; it generates a number of hidden "topics" such that each read can be represented by a distribution vector over the generated topics. Finally, as in the MCluster method, we apply SKWIC, a variant of the classical K-means algorithm with an automatic feature weighting mechanism, to cluster the reads represented by their topic distributions.
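
The first step described above, representing a read by its k-mers and their frequencies, can be sketched as follows. This is a minimal illustration and not the authors' implementation: the function name `kmer_frequencies`, the default k = 4, the handling of ambiguous bases, and the example read are all assumptions made here. The abstract does not say whether raw counts or relative frequencies are used; relative frequencies are assumed. The LDA and SKWIC steps are not shown.

```python
from collections import Counter
from typing import Dict


def kmer_frequencies(read: str, k: int = 4) -> Dict[str, float]:
    """Return the relative frequency of every k-mer occurring in `read`.

    Hypothetical helper for illustration; k = 4 is an assumed default.
    """
    read = read.upper()
    # Slide a window of width k over the read and skip windows that
    # contain anything other than the four unambiguous bases.
    kmers = [
        read[i:i + k]
        for i in range(len(read) - k + 1)
        if set(read[i:i + k]) <= set("ACGT")
    ]
    counts = Counter(kmers)
    total = sum(counts.values())
    # A read shorter than k (or with no valid window) gives an empty profile.
    return {kmer: n / total for kmer, n in counts.items()} if total else {}


if __name__ == "__main__":
    profile = kmer_frequencies("ACGTACGTAC", k=4)
    for kmer, freq in sorted(profile.items()):
        print(kmer, round(freq, 3))
```

Each read then becomes a sparse vector indexed by k-mer, which plays the role that a bag of words plays for a text document when a topic model is fitted.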

Conclusions

Experiments show that the new method TM-MCluster outperforms major existing methods, including AbundanceBin, MetaCluster 3.0/5.0 and MCluster. This result indicates that the exploitation of topic modeling can effectively improve the binning performance of metagenomic reads.


Url:
DOI: 10.1186/1471-2105-16-S5-S2
PubMed: 25859745
PubMed Central: 4402587

Links to Exploration step

PMC:4402587

The document in XML format

<record>
<TEI>
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en">Exploiting topic modeling to boost metagenomic reads binning</title>
<author>
<name sortKey="Zhang, Ruichang" sort="Zhang, Ruichang" uniqKey="Zhang R" first="Ruichang" last="Zhang">Ruichang Zhang</name>
<affiliation wicri:level="1">
<nlm:aff id="I1">Shanghai Key Lab of Intelligent Information Processing, and School of Computer Science, Fudan University, 220 Handan Road, Shanghai 200433, China</nlm:aff>
<country xml:lang="fr">République populaire de Chine</country>
<wicri:regionArea>Shanghai Key Lab of Intelligent Information Processing, and School of Computer Science, Fudan University, 220 Handan Road, Shanghai 200433</wicri:regionArea>
</affiliation>
</author>
<author>
<name sortKey="Cheng, Zhanzhan" sort="Cheng, Zhanzhan" uniqKey="Cheng Z" first="Zhanzhan" last="Cheng">Zhanzhan Cheng</name>
<affiliation wicri:level="1">
<nlm:aff id="I1">Shanghai Key Lab of Intelligent Information Processing, and School of Computer Science, Fudan University, 220 Handan Road, Shanghai 200433, China</nlm:aff>
<country xml:lang="fr">République populaire de Chine</country>
<wicri:regionArea>Shanghai Key Lab of Intelligent Information Processing, and School of Computer Science, Fudan University, 220 Handan Road, Shanghai 200433</wicri:regionArea>
</affiliation>
</author>
<author>
<name sortKey="Guan, Jihong" sort="Guan, Jihong" uniqKey="Guan J" first="Jihong" last="Guan">Jihong Guan</name>
<affiliation wicri:level="1">
<nlm:aff id="I2">Department of Computer Science and Technology, Tongji University, 4800 Cao'an Highway, Shanghai 201804, China</nlm:aff>
<country xml:lang="fr">République populaire de Chine</country>
<wicri:regionArea>Department of Computer Science and Technology, Tongji University, 4800 Cao'an Highway, Shanghai 201804</wicri:regionArea>
</affiliation>
</author>
<author>
<name sortKey="Zhou, Shuigeng" sort="Zhou, Shuigeng" uniqKey="Zhou S" first="Shuigeng" last="Zhou">Shuigeng Zhou</name>
<affiliation wicri:level="1">
<nlm:aff id="I1">Shanghai Key Lab of Intelligent Information Processing, and School of Computer Science, Fudan University, 220 Handan Road, Shanghai 200433, China</nlm:aff>
<country xml:lang="fr">République populaire de Chine</country>
<wicri:regionArea>Shanghai Key Lab of Intelligent Information Processing, and School of Computer Science, Fudan University, 220 Handan Road, Shanghai 200433</wicri:regionArea>
</affiliation>
</author>
</titleStmt>
<publicationStmt>
<idno type="wicri:source">PMC</idno>
<idno type="pmid">25859745</idno>
<idno type="pmc">4402587</idno>
<idno type="url">http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4402587</idno>
<idno type="RBID">PMC:4402587</idno>
<idno type="doi">10.1186/1471-2105-16-S5-S2</idno>
<date when="2015">2015</date>
<idno type="wicri:Area/Pmc/Corpus">000567</idno>
<idno type="wicri:explorRef" wicri:stream="Pmc" wicri:step="Corpus" wicri:corpus="PMC">000567</idno>
<idno type="wicri:Area/Pmc/Curation">000567</idno>
<idno type="wicri:explorRef" wicri:stream="Pmc" wicri:step="Curation">000567</idno>
</publicationStmt>
<sourceDesc>
<biblStruct>
<analytic>
<title xml:lang="en" level="a" type="main">Exploiting topic modeling to boost metagenomic reads binning</title>
<author>
<name sortKey="Zhang, Ruichang" sort="Zhang, Ruichang" uniqKey="Zhang R" first="Ruichang" last="Zhang">Ruichang Zhang</name>
<affiliation wicri:level="1">
<nlm:aff id="I1">Shanghai Key Lab of Intelligent Information Processing, and School of Computer Science, Fudan University, 220 Handan Road, Shanghai 200433, China</nlm:aff>
<country xml:lang="fr">République populaire de Chine</country>
<wicri:regionArea>Shanghai Key Lab of Intelligent Information Processing, and School of Computer Science, Fudan University, 220 Handan Road, Shanghai 200433</wicri:regionArea>
</affiliation>
</author>
<author>
<name sortKey="Cheng, Zhanzhan" sort="Cheng, Zhanzhan" uniqKey="Cheng Z" first="Zhanzhan" last="Cheng">Zhanzhan Cheng</name>
<affiliation wicri:level="1">
<nlm:aff id="I1">Shanghai Key Lab of Intelligent Information Processing, and School of Computer Science, Fudan University, 220 Handan Road, Shanghai 200433, China</nlm:aff>
<country xml:lang="fr">République populaire de Chine</country>
<wicri:regionArea>Shanghai Key Lab of Intelligent Information Processing, and School of Computer Science, Fudan University, 220 Handan Road, Shanghai 200433</wicri:regionArea>
</affiliation>
</author>
<author>
<name sortKey="Guan, Jihong" sort="Guan, Jihong" uniqKey="Guan J" first="Jihong" last="Guan">Jihong Guan</name>
<affiliation wicri:level="1">
<nlm:aff id="I2">Department of Computer Science and Technology, Tongji University, 4800 Cao'an Highway, Shanghai 201804, China</nlm:aff>
<country xml:lang="fr">République populaire de Chine</country>
<wicri:regionArea>Department of Computer Science and Technology, Tongji University, 4800 Cao'an Highway, Shanghai 201804</wicri:regionArea>
</affiliation>
</author>
<author>
<name sortKey="Zhou, Shuigeng" sort="Zhou, Shuigeng" uniqKey="Zhou S" first="Shuigeng" last="Zhou">Shuigeng Zhou</name>
<affiliation wicri:level="1">
<nlm:aff id="I1">Shanghai Key Lab of Intelligent Information Processing, and School of Computer Science, Fudan University, 220 Handan Road, Shanghai 200433, China</nlm:aff>
<country xml:lang="fr">République populaire de Chine</country>
<wicri:regionArea>Shanghai Key Lab of Intelligent Information Processing, and School of Computer Science, Fudan University, 220 Handan Road, Shanghai 200433</wicri:regionArea>
</affiliation>
</author>
</analytic>
<series>
<title level="j">BMC Bioinformatics</title>
<idno type="eISSN">1471-2105</idno>
<imprint>
<date when="2015">2015</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
</fileDesc>
<profileDesc>
<textClass></textClass>
</profileDesc>
</teiHeader>
<front>
<div type="abstract" xml:lang="en">
<sec>
<title>Background</title>
<p>With the rapid development of high-throughput technologies, researchers can sequence the whole metagenome of a microbial community sampled directly from the environment. Assigning these metagenomic reads to different species or taxonomic classes, referred to as
<italic>binning </italic>
of metagenomic data, is a vital step in metagenomic analysis.</p>
</sec>
<sec>
<title>Results</title>
<p>In this paper, we propose a new method
<italic>TM-MCluster </italic>
for binning metagenomic reads. First, we represent each metagenomic read as a set of "k-mers" together with their frequencies in the read. Then, we apply a probabilistic topic model, Latent Dirichlet Allocation (LDA), to the reads; it generates a number of hidden "topics" such that each read can be represented by a distribution vector over the generated topics. Finally, as in the MCluster method, we apply SKWIC, a variant of the classical K-means algorithm with an automatic feature weighting mechanism, to cluster the reads represented by their topic distributions.</p>
</sec>
<sec>
<title>Conclusions</title>
<p>Experiments show that the new method TM-MCluster outperforms major existing methods, including AbundanceBin, MetaCluster 3.0/5.0 and MCluster. This result indicates that the exploitation of topic modeling can effectively improve the binning performance of metagenomic reads.</p>
</sec>
</div>
</front>
</TEI>
<pmc article-type="abstract" xml:lang="en">
<pmc-dir>properties open_access</pmc-dir>
<front>
<journal-meta>
<journal-id journal-id-type="nlm-ta">BMC Bioinformatics</journal-id>
<journal-id journal-id-type="iso-abbrev">BMC Bioinformatics</journal-id>
<journal-title-group>
<journal-title>BMC Bioinformatics</journal-title>
</journal-title-group>
<issn pub-type="epub">1471-2105</issn>
<publisher>
<publisher-name>BioMed Central</publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="pmid">25859745</article-id>
<article-id pub-id-type="pmc">4402587</article-id>
<article-id pub-id-type="publisher-id">1471-2105-16-S5-S2</article-id>
<article-id pub-id-type="doi">10.1186/1471-2105-16-S5-S2</article-id>
<article-categories>
<subj-group subj-group-type="heading">
<subject>Proceedings</subject>
</subj-group>
</article-categories>
<title-group>
<article-title>Exploiting topic modeling to boost metagenomic reads binning</article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author" id="A1">
<name>
<surname>Zhang</surname>
<given-names>Ruichang</given-names>
</name>
<xref ref-type="aff" rid="I1">1</xref>
</contrib>
<contrib contrib-type="author" id="A2">
<name>
<surname>Cheng</surname>
<given-names>Zhanzhan</given-names>
</name>
<xref ref-type="aff" rid="I1">1</xref>
</contrib>
<contrib contrib-type="author" id="A3">
<name>
<surname>Guan</surname>
<given-names>Jihong</given-names>
</name>
<xref ref-type="aff" rid="I2">2</xref>
</contrib>
<contrib contrib-type="author" corresp="yes" id="A4">
<name>
<surname>Zhou</surname>
<given-names>Shuigeng</given-names>
</name>
<xref ref-type="aff" rid="I1">1</xref>
<email>sgzhou@fudan.edu.cn</email>
</contrib>
</contrib-group>
<aff id="I1">
<label>1</label>
Shanghai Key Lab of Intelligent Information Processing, and School of Computer Science, Fudan University, 220 Handan Road, Shanghai 200433, China</aff>
<aff id="I2">
<label>2</label>
Department of Computer Science and Technology, Tongji University, 4800 Cao'an Highway, Shanghai 201804, China</aff>
<pub-date pub-type="collection">
<year>2015</year>
</pub-date>
<pub-date pub-type="epub">
<day>18</day>
<month>3</month>
<year>2015</year>
</pub-date>
<volume>16</volume>
<issue>Suppl 5</issue>
<supplement>
<named-content content-type="supplement-title">Selected articles from the 10th International Symposium on Bioinformatics Research and Applications (ISBRA-14): Bioinformatics</named-content>
<named-content content-type="supplement-editor">Min Li, Jianxin Wang and Fangxiang Wu</named-content>
<named-content content-type="supplement-sponsor">Publication of this supplement has not been supported by sponsorship. Information about the source of funding for publication charges can be found in the individual articles. Articles have been through the journal's standard peer review process. The Supplement Editors declare that they have no competing interests.</named-content>
</supplement>
<fpage>S2</fpage>
<lpage>S2</lpage>
<permissions>
<copyright-statement>Copyright © 2015 Zhang et al.; licensee BioMed Central Ltd.</copyright-statement>
<copyright-year>2015</copyright-year>
<copyright-holder>Zhang et al.; licensee BioMed Central Ltd.</copyright-holder>
<license license-type="open-access" xlink:href="http://creativecommons.org/licenses/by/4.0">
<license-p>This is an Open Access article distributed under the terms of the Creative Commons Attribution License (
<ext-link ext-link-type="uri" xlink:href="http://creativecommons.org/licenses/by/4.0">http://creativecommons.org/licenses/by/4.0</ext-link>
), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. The Creative Commons Public Domain Dedication waiver (
<ext-link ext-link-type="uri" xlink:href="http://creativecommons.org/publicdomain/zero/1.0/">http://creativecommons.org/publicdomain/zero/1.0/</ext-link>
) applies to the data made available in this article, unless otherwise stated.</license-p>
</license>
</permissions>
<self-uri xlink:href="http://www.biomedcentral.com/1471-2105/16/S5/S2"></self-uri>
<abstract>
<sec>
<title>Background</title>
<p>With the rapid development of high-throughput technologies, researchers can sequence the whole metagenome of a microbial community sampled directly from the environment. Assigning these metagenomic reads to different species or taxonomic classes, referred to as
<italic>binning </italic>
of metagenomic data, is a vital step in metagenomic analysis.</p>
</sec>
<sec>
<title>Results</title>
<p>In this paper, we propose a new method
<italic>TM-MCluster </italic>
for binning metagenomic reads. First, we represent each metagenomic read as a set of "k-mers" together with their frequencies in the read. Then, we apply a probabilistic topic model, Latent Dirichlet Allocation (LDA), to the reads; it generates a number of hidden "topics" such that each read can be represented by a distribution vector over the generated topics. Finally, as in the MCluster method, we apply SKWIC, a variant of the classical K-means algorithm with an automatic feature weighting mechanism, to cluster the reads represented by their topic distributions.</p>
</sec>
<sec>
<title>Conclusions</title>
<p>Experiments show that the new method TM-MCluster outperforms major existing methods, including AbundanceBin, MetaCluster 3.0/5.0 and MCluster. This result indicates that the exploitation of topic modeling can effectively improve the binning performance of metagenomic reads.</p>
</sec>
</abstract>
<kwd-group>
<kwd>Metagenomics</kwd>
<kwd>Metagenomic data binning</kwd>
<kwd>Topic modeling</kwd>
</kwd-group>
<conference>
<conf-date>28-30 June 2014</conf-date>
<conf-name>10th International Symposium on Bioinformatics Research and Applications (ISBRA-14)</conf-name>
<conf-loc>Zhangjiajie, China</conf-loc>
</conference>
</article-meta>
</front>
</pmc>
</record>

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Sante/explor/MersV1/Data/Pmc/Curation
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000567 | SxmlIndent | more

Or

HfdSelect -h $EXPLOR_AREA/Data/Pmc/Curation/biblio.hfd -nk 000567 | SxmlIndent | more
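
The indented XML printed by the commands above can also be post-processed with a general-purpose XML library. The sketch below is not part of Dilib; it uses Python's standard `xml.etree.ElementTree` to list the title and author names from a simplified fragment modeled on the record shown earlier. The `SAMPLE` string and the `summarize` function are assumptions made for illustration. The full record uses prefixed names such as `wicri:level` and `nlm:aff`, which a namespace-aware parser accepts only if the corresponding namespace declarations are present, so those parts are left out here.

```python
import xml.etree.ElementTree as ET

# Simplified fragment modeled on the <titleStmt> of the record; the
# wicri:/nlm: prefixed parts are omitted so that it parses without
# namespace declarations (an assumption made for this sketch).
SAMPLE = """
<record>
  <TEI>
    <teiHeader>
      <fileDesc>
        <titleStmt>
          <title xml:lang="en">Exploiting topic modeling to boost metagenomic reads binning</title>
          <author><name first="Ruichang" last="Zhang">Ruichang Zhang</name></author>
          <author><name first="Shuigeng" last="Zhou">Shuigeng Zhou</name></author>
        </titleStmt>
      </fileDesc>
    </teiHeader>
  </TEI>
</record>
"""


def summarize(xml_text):
    """Return (title, [author names]) read from a record's <titleStmt>."""
    root = ET.fromstring(xml_text)
    stmt = root.find("./TEI/teiHeader/fileDesc/titleStmt")
    title = stmt.findtext("title")
    authors = [name.text for name in stmt.findall("author/name")]
    return title, authors


if __name__ == "__main__":
    title, authors = summarize(SAMPLE)
    print(title)
    print("; ".join(authors))
```

In practice the XML text would come from the output of the selection command rather than from a literal string.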

To link to this page in the Wicri network

{{Explor lien
   |wiki=    Sante
   |area=    MersV1
   |flux=    Pmc
   |étape=   Curation
   |type=    RBID
   |clé=     PMC:4402587
   |texte=   Exploiting topic modeling to boost metagenomic reads binning
}}

To generate wiki pages

HfdIndexSelect -h $EXPLOR_AREA/Data/Pmc/Curation/RBID.i   -Sk "pubmed:25859745" \
       | HfdSelect -Kh $EXPLOR_AREA/Data/Pmc/Curation/biblio.hfd   \
       | NlmPubMed2Wicri -a MersV1 

Wicri

This area was generated with Dilib version V0.6.33.
Data generation: Mon Apr 20 23:26:43 2020. Site generation: Sat Mar 27 09:06:09 2021