Exploiting topic modeling to boost metagenomic reads binning
Internal identifier: 000567 (Pmc/Curation); previous: 000566; next: 000568
Authors: Ruichang Zhang [People's Republic of China]; Zhanzhan Cheng [People's Republic of China]; Jihong Guan [People's Republic of China]; Shuigeng Zhou [People's Republic of China]
Source:
- BMC Bioinformatics [1471-2105]; 2015.
Abstract
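With the rapid development of high-throughput technologies, researchers can sequence the whole metagenome of a microbial community sampled directly from the environment. The assignment of these metagenomic reads into different species or taxonomical classes is a vital step for metagenomic analysis, which is referred to as binning of metagenomic data.
In this paper, we propose a new method TM-MCluster for binning metagenomic reads. First, we represent each metagenomic read as a set of "k-mers" with their frequencies occurring in the read. Then, we employ a probabilistic topic model -- the Latent Dirichlet Allocation (LDA) model to the reads, which generates a number of hidden "topics" such that each read can be represented by a distribution vector of the generated topics. Finally, as in the MCluster method, we apply SKWIC -- a variant of the classical K-means algorithm with automatic feature weighting mechanism to cluster these reads represented by topic distributions.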
Experiments show that the new method TM-MCluster outperforms major existing methods, including AbundanceBin, MetaCluster 3.0/5.0 and MCluster. This result indicates that the exploitation of topic modeling can effectively improve the binning performance of metagenomic reads.
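As a rough illustration of the first step described in the abstract (representing each read by its k-mers and their frequencies), here is a minimal Python sketch. It is not the authors' implementation: the function names, the choice of k, and the decision to skip windows containing ambiguous bases such as N are all assumptions made for this example. The later LDA and SKWIC steps are not shown.

```python
from collections import Counter
from itertools import product


def kmer_counts(read, k=4):
    """Count every overlapping k-mer in one read.

    Assumption for this sketch: windows containing anything other
    than A, C, G or T (for example N) are skipped.
    """
    read = read.upper()
    counts = Counter()
    for i in range(len(read) - k + 1):
        kmer = read[i:i + k]
        if set(kmer) <= set("ACGT"):
            counts[kmer] += 1
    return counts


def count_vector(read, k=4):
    """Lay the counts out over a fixed vocabulary of all 4**k k-mers.

    Each k-mer plays the role of a "word" and each read the role of a
    "document", which is the input shape a topic model expects.
    """
    vocabulary = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = kmer_counts(read, k)
    return [counts[kmer] for kmer in vocabulary]


if __name__ == "__main__":
    # Hypothetical toy reads, not data from the paper.
    for read in ["ACGTACGT", "TTTTNTTT"]:
        print(read, dict(kmer_counts(read, k=2)))
```

Stacking the `count_vector` rows for all reads gives a read-by-k-mer count matrix, which a topic model such as LDA could then reduce to per-read topic distributions before clustering.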
Url: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4402587
DOI: 10.1186/1471-2105-16-S5-S2
PubMed: 25859745
PubMed Central: 4402587
Links toward previous steps (curation, corpus...)
- to stream Pmc, to step Corpus: To go to this record in the Curation step: 000567
Links to Exploration step
PMC:4402587
The document in XML format
<record><TEI><teiHeader><fileDesc><titleStmt><title xml:lang="en">Exploiting topic modeling to boost metagenomic reads binning</title>
<author><name sortKey="Zhang, Ruichang" sort="Zhang, Ruichang" uniqKey="Zhang R" first="Ruichang" last="Zhang">Ruichang Zhang</name>
<affiliation wicri:level="1"><nlm:aff id="I1">Shanghai Key Lab of Intelligent Information Processing, and School of Computer Science, Fudan University, 220 Handan Road, Shanghai 200433, China</nlm:aff>
<country xml:lang="fr">République populaire de Chine</country>
<wicri:regionArea>Shanghai Key Lab of Intelligent Information Processing, and School of Computer Science, Fudan University, 220 Handan Road, Shanghai 200433</wicri:regionArea>
</affiliation>
</author>
<author><name sortKey="Cheng, Zhanzhan" sort="Cheng, Zhanzhan" uniqKey="Cheng Z" first="Zhanzhan" last="Cheng">Zhanzhan Cheng</name>
<affiliation wicri:level="1"><nlm:aff id="I1">Shanghai Key Lab of Intelligent Information Processing, and School of Computer Science, Fudan University, 220 Handan Road, Shanghai 200433, China</nlm:aff>
<country xml:lang="fr">République populaire de Chine</country>
<wicri:regionArea>Shanghai Key Lab of Intelligent Information Processing, and School of Computer Science, Fudan University, 220 Handan Road, Shanghai 200433</wicri:regionArea>
</affiliation>
</author>
<author><name sortKey="Guan, Jihong" sort="Guan, Jihong" uniqKey="Guan J" first="Jihong" last="Guan">Jihong Guan</name>
<affiliation wicri:level="1"><nlm:aff id="I2">Department of Computer Science and Technology, Tongji University, 4800 Cao'an Highway, Shanghai 201804, China</nlm:aff>
<country xml:lang="fr">République populaire de Chine</country>
<wicri:regionArea>Department of Computer Science and Technology, Tongji University, 4800 Cao'an Highway, Shanghai 201804</wicri:regionArea>
</affiliation>
</author>
<author><name sortKey="Zhou, Shuigeng" sort="Zhou, Shuigeng" uniqKey="Zhou S" first="Shuigeng" last="Zhou">Shuigeng Zhou</name>
<affiliation wicri:level="1"><nlm:aff id="I1">Shanghai Key Lab of Intelligent Information Processing, and School of Computer Science, Fudan University, 220 Handan Road, Shanghai 200433, China</nlm:aff>
<country xml:lang="fr">République populaire de Chine</country>
<wicri:regionArea>Shanghai Key Lab of Intelligent Information Processing, and School of Computer Science, Fudan University, 220 Handan Road, Shanghai 200433</wicri:regionArea>
</affiliation>
</author>
</titleStmt>
<publicationStmt><idno type="wicri:source">PMC</idno>
<idno type="pmid">25859745</idno>
<idno type="pmc">4402587</idno>
<idno type="url">http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4402587</idno>
<idno type="RBID">PMC:4402587</idno>
<idno type="doi">10.1186/1471-2105-16-S5-S2</idno>
<date when="2015">2015</date>
<idno type="wicri:Area/Pmc/Corpus">000567</idno>
<idno type="wicri:explorRef" wicri:stream="Pmc" wicri:step="Corpus" wicri:corpus="PMC">000567</idno>
<idno type="wicri:Area/Pmc/Curation">000567</idno>
<idno type="wicri:explorRef" wicri:stream="Pmc" wicri:step="Curation">000567</idno>
</publicationStmt>
<sourceDesc><biblStruct><analytic><title xml:lang="en" level="a" type="main">Exploiting topic modeling to boost metagenomic reads binning</title>
<author><name sortKey="Zhang, Ruichang" sort="Zhang, Ruichang" uniqKey="Zhang R" first="Ruichang" last="Zhang">Ruichang Zhang</name>
<affiliation wicri:level="1"><nlm:aff id="I1">Shanghai Key Lab of Intelligent Information Processing, and School of Computer Science, Fudan University, 220 Handan Road, Shanghai 200433, China</nlm:aff>
<country xml:lang="fr">République populaire de Chine</country>
<wicri:regionArea>Shanghai Key Lab of Intelligent Information Processing, and School of Computer Science, Fudan University, 220 Handan Road, Shanghai 200433</wicri:regionArea>
</affiliation>
</author>
<author><name sortKey="Cheng, Zhanzhan" sort="Cheng, Zhanzhan" uniqKey="Cheng Z" first="Zhanzhan" last="Cheng">Zhanzhan Cheng</name>
<affiliation wicri:level="1"><nlm:aff id="I1">Shanghai Key Lab of Intelligent Information Processing, and School of Computer Science, Fudan University, 220 Handan Road, Shanghai 200433, China</nlm:aff>
<country xml:lang="fr">République populaire de Chine</country>
<wicri:regionArea>Shanghai Key Lab of Intelligent Information Processing, and School of Computer Science, Fudan University, 220 Handan Road, Shanghai 200433</wicri:regionArea>
</affiliation>
</author>
<author><name sortKey="Guan, Jihong" sort="Guan, Jihong" uniqKey="Guan J" first="Jihong" last="Guan">Jihong Guan</name>
<affiliation wicri:level="1"><nlm:aff id="I2">Department of Computer Science and Technology, Tongji University, 4800 Cao'an Highway, Shanghai 201804, China</nlm:aff>
<country xml:lang="fr">République populaire de Chine</country>
<wicri:regionArea>Department of Computer Science and Technology, Tongji University, 4800 Cao'an Highway, Shanghai 201804</wicri:regionArea>
</affiliation>
</author>
<author><name sortKey="Zhou, Shuigeng" sort="Zhou, Shuigeng" uniqKey="Zhou S" first="Shuigeng" last="Zhou">Shuigeng Zhou</name>
<affiliation wicri:level="1"><nlm:aff id="I1">Shanghai Key Lab of Intelligent Information Processing, and School of Computer Science, Fudan University, 220 Handan Road, Shanghai 200433, China</nlm:aff>
<country xml:lang="fr">République populaire de Chine</country>
<wicri:regionArea>Shanghai Key Lab of Intelligent Information Processing, and School of Computer Science, Fudan University, 220 Handan Road, Shanghai 200433</wicri:regionArea>
</affiliation>
</author>
</analytic>
<series><title level="j">BMC Bioinformatics</title>
<idno type="eISSN">1471-2105</idno>
<imprint><date when="2015">2015</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
</fileDesc>
<profileDesc><textClass></textClass>
</profileDesc>
</teiHeader>
<front><div type="abstract" xml:lang="en"><sec><title>Background</title>
<p>With the rapid development of high-throughput technologies, researchers can sequence the whole metagenome of a microbial community sampled directly from the environment. The assignment of these metagenomic reads into different species or taxonomical classes is a vital step for metagenomic analysis, which is referred to as <italic>binning </italic>
of metagenomic data.</p>
</sec>
<sec><title>Results</title>
<p>In this paper, we propose a new method <italic>TM-MCluster </italic>
for binning metagenomic reads. First, we represent each metagenomic read as a set of "k-mers" with their frequencies occurring in the read. Then, we employ a probabilistic topic model -- the Latent Dirichlet Allocation (LDA) model to the reads, which generates a number of hidden "topics" such that each read can be represented by a distribution vector of the generated topics. Finally, as in the MCluster method, we apply SKWIC -- a variant of the classical K-means algorithm with automatic feature weighting mechanism to cluster these reads represented by topic distributions.</p>
</sec>
<sec><title>Conclusions</title>
<p>Experiments show that the new method TM-MCluster outperforms major existing methods, including AbundanceBin, MetaCluster 3.0/5.0 and MCluster. This result indicates that the exploitation of topic modeling can effectively improve the binning performance of metagenomic reads.</p>
</sec>
</div>
</front>
</TEI>
<pmc article-type="abstract" xml:lang="en"><pmc-dir>properties open_access</pmc-dir>
<front><journal-meta><journal-id journal-id-type="nlm-ta">BMC Bioinformatics</journal-id>
<journal-id journal-id-type="iso-abbrev">BMC Bioinformatics</journal-id>
<journal-title-group><journal-title>BMC Bioinformatics</journal-title>
</journal-title-group>
<issn pub-type="epub">1471-2105</issn>
<publisher><publisher-name>BioMed Central</publisher-name>
</publisher>
</journal-meta>
<article-meta><article-id pub-id-type="pmid">25859745</article-id>
<article-id pub-id-type="pmc">4402587</article-id>
<article-id pub-id-type="publisher-id">1471-2105-16-S5-S2</article-id>
<article-id pub-id-type="doi">10.1186/1471-2105-16-S5-S2</article-id>
<article-categories><subj-group subj-group-type="heading"><subject>Proceedings</subject>
</subj-group>
</article-categories>
<title-group><article-title>Exploiting topic modeling to boost metagenomic reads binning</article-title>
</title-group>
<contrib-group><contrib contrib-type="author" id="A1"><name><surname>Zhang</surname>
<given-names>Ruichang</given-names>
</name>
<xref ref-type="aff" rid="I1">1</xref>
</contrib>
<contrib contrib-type="author" id="A2"><name><surname>Cheng</surname>
<given-names>Zhanzhan</given-names>
</name>
<xref ref-type="aff" rid="I1">1</xref>
</contrib>
<contrib contrib-type="author" id="A3"><name><surname>Guan</surname>
<given-names>Jihong</given-names>
</name>
<xref ref-type="aff" rid="I2">2</xref>
</contrib>
<contrib contrib-type="author" corresp="yes" id="A4"><name><surname>Zhou</surname>
<given-names>Shuigeng</given-names>
</name>
<xref ref-type="aff" rid="I1">1</xref>
<email>sgzhou@fudan.edu.cn</email>
</contrib>
</contrib-group>
<aff id="I1"><label>1</label>
Shanghai Key Lab of Intelligent Information Processing, and School of Computer Science, Fudan University, 220 Handan Road, Shanghai 200433, China</aff>
<aff id="I2"><label>2</label>
Department of Computer Science and Technology, Tongji University, 4800 Cao'an Highway, Shanghai 201804, China</aff>
<pub-date pub-type="collection"><year>2015</year>
</pub-date>
<pub-date pub-type="epub"><day>18</day>
<month>3</month>
<year>2015</year>
</pub-date>
<volume>16</volume>
<issue>Suppl 5</issue>
<supplement><named-content content-type="supplement-title">Selected articles from the 10th International Symposium on Bioinformatics Research and Applications (ISBRA-14): Bioinformatics</named-content>
<named-content content-type="supplement-editor">Min Li, Jianxin Wang and Fangxiang Wu</named-content>
<named-content content-type="supplement-sponsor">Publication of this supplement has not been supported by sponsorship. Information about the source of funding for publication charges can be found in the individual articles. Articles have been through the journal's standard peer review process. The Supplement Editors declare that they have no competing interests.</named-content>
</supplement>
<fpage>S2</fpage>
<lpage>S2</lpage>
<permissions><copyright-statement>Copyright © 2015 Zhang et al.; licensee BioMed Central Ltd.</copyright-statement>
<copyright-year>2015</copyright-year>
<copyright-holder>Zhang et al.; licensee BioMed Central Ltd.</copyright-holder>
<license license-type="open-access" xlink:href="http://creativecommons.org/licenses/by/4.0"><license-p>This is an Open Access article distributed under the terms of the Creative Commons Attribution License (<ext-link ext-link-type="uri" xlink:href="http://creativecommons.org/licenses/by/4.0">http://creativecommons.org/licenses/by/4.0</ext-link>
), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. The Creative Commons Public Domain Dedication waiver (<ext-link ext-link-type="uri" xlink:href="http://creativecommons.org/publicdomain/zero/1.0/">http://creativecommons.org/publicdomain/zero/1.0/</ext-link>
) applies to the data made available in this article, unless otherwise stated.</license-p>
</license>
</permissions>
<self-uri xlink:href="http://www.biomedcentral.com/1471-2105/16/S5/S2"></self-uri>
<abstract><sec><title>Background</title>
<p>With the rapid development of high-throughput technologies, researchers can sequence the whole metagenome of a microbial community sampled directly from the environment. The assignment of these metagenomic reads into different species or taxonomical classes is a vital step for metagenomic analysis, which is referred to as <italic>binning </italic>
of metagenomic data.</p>
</sec>
<sec><title>Results</title>
<p>In this paper, we propose a new method <italic>TM-MCluster </italic>
for binning metagenomic reads. First, we represent each metagenomic read as a set of "k-mers" with their frequencies occurring in the read. Then, we employ a probabilistic topic model -- the Latent Dirichlet Allocation (LDA) model to the reads, which generates a number of hidden "topics" such that each read can be represented by a distribution vector of the generated topics. Finally, as in the MCluster method, we apply SKWIC -- a variant of the classical K-means algorithm with automatic feature weighting mechanism to cluster these reads represented by topic distributions.</p>
</sec>
<sec><title>Conclusions</title>
<p>Experiments show that the new method TM-MCluster outperforms major existing methods, including AbundanceBin, MetaCluster 3.0/5.0 and MCluster. This result indicates that the exploitation of topic modeling can effectively improve the binning performance of metagenomic reads.</p>
</sec>
</abstract>
<kwd-group><kwd>Metagenomics</kwd>
<kwd>Metagenomic data binning</kwd>
<kwd>Topic modeling</kwd>
</kwd-group>
<conference><conf-date>28-30 June 2014</conf-date>
<conf-name>10th International Symposium on Bioinformatics Research and Applications (ISBRA-14)</conf-name>
<conf-loc>Zhangjiajie, China</conf-loc>
</conference>
</article-meta>
</front>
</pmc>
</record>
To manipulate this document under Unix (Dilib)
EXPLOR_STEP=$WICRI_ROOT/Sante/explor/MersV1/Data/Pmc/Curation
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000567 | SxmlIndent | more
Or
HfdSelect -h $EXPLOR_AREA/Data/Pmc/Curation/biblio.hfd -nk 000567 | SxmlIndent | more
To link to this page within the Wicri network
{{Explor lien |wiki= Sante |area= MersV1 |flux= Pmc |étape= Curation |type= RBID |clé= PMC:4402587 |texte= Exploiting topic modeling to boost metagenomic reads binning }}
To generate wiki pages
HfdIndexSelect -h $EXPLOR_AREA/Data/Pmc/Curation/RBID.i -Sk "pubmed:25859745" \
  | HfdSelect -Kh $EXPLOR_AREA/Data/Pmc/Curation/biblio.hfd \
  | NlmPubMed2Wicri -a MersV1
This area was generated with Dilib version V0.6.33.