MERS exploration server

Warning: this site is under development!
Warning: this site was generated automatically from raw corpora.
The information has therefore not been validated.

Unsupervised binning of environmental genomic fragments based on an error robust selection of l-mers

Internal identifier: 001346 (Pmc/Checkpoint); previous: 001345; next: 001347

Authors: Bin Yang [People's Republic of China, Hong Kong]; Yu Peng [Hong Kong]; Henry Chi-Ming Leung [Hong Kong]; Siu-Ming Yiu [Hong Kong]; Jing-Chi Chen [Hong Kong]; Francis Yuk-Lun Chin [Hong Kong]

RBID : PMC:3165929

Abstract

Background

With the rapid development of genome sequencing techniques, traditional research methods based on the isolation and cultivation of microorganisms are gradually being replaced by metagenomics, also known as environmental genomics. The first step of metagenomics, which is still a major bottleneck, is the taxonomic characterization of DNA fragments (reads) resulting from sequencing a sample of mixed species. This step is usually referred to as “binning”. Existing binning methods are based on supervised or semi-supervised approaches, which rely heavily on reference genomes of known microorganisms and phylogenetic marker genes. Due to the limited availability of reference genomes and the bias and instability of marker genes, existing binning methods may not be applicable in many cases.

Results

In this paper, we present an unsupervised binning method based on the distribution of a carefully selected set of l-mers (substrings of length l in DNA fragments). Our experiments show that the method can accurately bin DNA fragments of various lengths and relative species abundance ratios without using any reference or training datasets.
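
To make the notion of an l-mer distribution concrete, the following minimal Python sketch counts every overlapping substring of length l in a fragment and normalizes the counts into a frequency vector. It is an illustration only: the function names `lmer_counts` and `lmer_frequency_vector` are hypothetical, and the sketch uses all 4^l possible l-mers rather than the carefully selected subset the abstract refers to, whose selection rule is not described here.

```python
from collections import Counter
from itertools import product


def lmer_counts(fragment, l):
    """Count every overlapping substring of length l in a DNA fragment."""
    fragment = fragment.upper()
    # A fragment shorter than l yields an empty range and an empty Counter.
    return Counter(fragment[i:i + l] for i in range(len(fragment) - l + 1))


def lmer_frequency_vector(fragment, l):
    """Return normalized l-mer frequencies in lexicographic order over A, C, G, T.

    Assumption: all 4**l l-mers are used; l-mers containing other symbols
    (for example N) are ignored.
    """
    counts = lmer_counts(fragment, l)
    all_lmers = ["".join(p) for p in product("ACGT", repeat=l)]
    total = sum(counts[m] for m in all_lmers)
    if total == 0:
        return [0.0] * len(all_lmers)
    return [counts[m] / total for m in all_lmers]


if __name__ == "__main__":
    print(lmer_counts("ACGTAC", 2))
    print(lmer_frequency_vector("ACGTAC", 2))
```

Fragments from the same species would be expected to have similar vectors, which is what a clustering step could then exploit; the clustering itself is outside the scope of this sketch.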

Another feature of our method is its error robustness. The binning accuracy decreases by less than 1% when the sequencing error rate increases from 0% to 5%. Note that the typical sequencing error rate of existing commercial sequencing platforms is less than 2%.
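
As a rough intuition for why sequencing errors perturb an l-mer profile only locally, the sketch below compares the l-mer counts of a read before and after a single substitution. A substitution at one position can affect at most l of the overlapping l-mers. This is a general property of l-mer counting, not the authors' error-robust selection procedure, and the helper names are hypothetical.

```python
from collections import Counter


def lmer_counts(fragment, l):
    """Count every overlapping substring of length l in a DNA fragment."""
    return Counter(fragment[i:i + l] for i in range(len(fragment) - l + 1))


def lost_lmer_occurrences(original, mutated, l):
    """Number of l-mer occurrences present in the original but missing after mutation."""
    before = lmer_counts(original, l)
    after = lmer_counts(mutated, l)
    # Counter subtraction keeps only positive differences.
    return sum((before - after).values())


if __name__ == "__main__":
    read = "ACGTACGTAC"
    # Simulate one substitution error at position 5 (C -> T).
    noisy = read[:5] + "T" + read[6:]
    l = 4
    total = len(read) - l + 1
    lost = lost_lmer_occurrences(read, noisy, l)
    print(f"{lost} of {total} l-mer occurrences changed (upper bound: l = {l})")
```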

Conclusions

We provide a new and effective tool for solving the metagenome binning problem without using any reference datasets or marker information from known reference genomes (species). The source code of our software tool, the reference genomes of the species used to generate the test datasets, and the corresponding test datasets are available at http://i.cs.hku.hk/~alse/MetaCluster/.


DOI: 10.1186/1471-2105-11-S2-S5
PubMed: 20406503
PubMed Central: 3165929


The document in XML format

<record>
<TEI>
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en">Unsupervised binning of environmental genomic fragments based on an error robust selection of
<italic>l</italic>
-mers</title>
<author>
<name sortKey="Yang, Bin" sort="Yang, Bin" uniqKey="Yang B" first="Bin" last="Yang">Bin Yang</name>
<affiliation wicri:level="1">
<nlm:aff id="I1">State Key Laboratory of Bioelectronics, School of Biological Science &amp; Medical Engineering, Southeast University, Nanjing, Jiangsu, 210096 PR China</nlm:aff>
<country xml:lang="fr">République populaire de Chine</country>
<wicri:regionArea>State Key Laboratory of Bioelectronics, School of Biological Science &amp; Medical Engineering, Southeast University, Nanjing, Jiangsu</wicri:regionArea>
<wicri:noRegion>Jiangsu</wicri:noRegion>
</affiliation>
<affiliation wicri:level="1">
<nlm:aff id="I2">Department of Computer Science, The University of Hong Kong, Pok Fu Lam Road, Hong Kong</nlm:aff>
<country xml:lang="fr">Hong Kong</country>
<wicri:regionArea>Department of Computer Science, The University of Hong Kong, Pok Fu Lam Road</wicri:regionArea>
<wicri:noRegion>Pok Fu Lam Road</wicri:noRegion>
</affiliation>
</author>
<author>
<name sortKey="Peng, Yu" sort="Peng, Yu" uniqKey="Peng Y" first="Yu" last="Peng">Yu Peng</name>
<affiliation wicri:level="1">
<nlm:aff id="I2">Department of Computer Science, The University of Hong Kong, Pok Fu Lam Road, Hong Kong</nlm:aff>
<country xml:lang="fr">Hong Kong</country>
<wicri:regionArea>Department of Computer Science, The University of Hong Kong, Pok Fu Lam Road</wicri:regionArea>
<wicri:noRegion>Pok Fu Lam Road</wicri:noRegion>
</affiliation>
</author>
<author>
<name sortKey="Leung, Henry Chi Ming" sort="Leung, Henry Chi Ming" uniqKey="Leung H" first="Henry Chi-Ming" last="Leung">Henry Chi-Ming Leung</name>
<affiliation wicri:level="1">
<nlm:aff id="I2">Department of Computer Science, The University of Hong Kong, Pok Fu Lam Road, Hong Kong</nlm:aff>
<country xml:lang="fr">Hong Kong</country>
<wicri:regionArea>Department of Computer Science, The University of Hong Kong, Pok Fu Lam Road</wicri:regionArea>
<wicri:noRegion>Pok Fu Lam Road</wicri:noRegion>
</affiliation>
</author>
<author>
<name sortKey="Yiu, Siu Ming" sort="Yiu, Siu Ming" uniqKey="Yiu S" first="Siu-Ming" last="Yiu">Siu-Ming Yiu</name>
<affiliation wicri:level="1">
<nlm:aff id="I2">Department of Computer Science, The University of Hong Kong, Pok Fu Lam Road, Hong Kong</nlm:aff>
<country xml:lang="fr">Hong Kong</country>
<wicri:regionArea>Department of Computer Science, The University of Hong Kong, Pok Fu Lam Road</wicri:regionArea>
<wicri:noRegion>Pok Fu Lam Road</wicri:noRegion>
</affiliation>
</author>
<author>
<name sortKey="Chen, Jing Chi" sort="Chen, Jing Chi" uniqKey="Chen J" first="Jing-Chi" last="Chen">Jing-Chi Chen</name>
<affiliation wicri:level="1">
<nlm:aff id="I2">Department of Computer Science, The University of Hong Kong, Pok Fu Lam Road, Hong Kong</nlm:aff>
<country xml:lang="fr">Hong Kong</country>
<wicri:regionArea>Department of Computer Science, The University of Hong Kong, Pok Fu Lam Road</wicri:regionArea>
<wicri:noRegion>Pok Fu Lam Road</wicri:noRegion>
</affiliation>
</author>
<author>
<name sortKey="Chin, Francis Yuk Lun" sort="Chin, Francis Yuk Lun" uniqKey="Chin F" first="Francis Yuk-Lun" last="Chin">Francis Yuk-Lun Chin</name>
<affiliation wicri:level="1">
<nlm:aff id="I2">Department of Computer Science, The University of Hong Kong, Pok Fu Lam Road, Hong Kong</nlm:aff>
<country xml:lang="fr">Hong Kong</country>
<wicri:regionArea>Department of Computer Science, The University of Hong Kong, Pok Fu Lam Road</wicri:regionArea>
<wicri:noRegion>Pok Fu Lam Road</wicri:noRegion>
</affiliation>
</author>
</titleStmt>
<publicationStmt>
<idno type="wicri:source">PMC</idno>
<idno type="pmid">20406503</idno>
<idno type="pmc">3165929</idno>
<idno type="url">http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3165929</idno>
<idno type="RBID">PMC:3165929</idno>
<idno type="doi">10.1186/1471-2105-11-S2-S5</idno>
<date when="2010">2010</date>
<idno type="wicri:Area/Pmc/Corpus">000A89</idno>
<idno type="wicri:explorRef" wicri:stream="Pmc" wicri:step="Corpus" wicri:corpus="PMC">000A89</idno>
<idno type="wicri:Area/Pmc/Curation">000A89</idno>
<idno type="wicri:explorRef" wicri:stream="Pmc" wicri:step="Curation">000A89</idno>
<idno type="wicri:Area/Pmc/Checkpoint">001346</idno>
<idno type="wicri:explorRef" wicri:stream="Pmc" wicri:step="Checkpoint">001346</idno>
</publicationStmt>
<sourceDesc>
<biblStruct>
<analytic>
<title xml:lang="en" level="a" type="main">Unsupervised binning of environmental genomic fragments based on an error robust selection of
<italic>l</italic>
-mers</title>
<author>
<name sortKey="Yang, Bin" sort="Yang, Bin" uniqKey="Yang B" first="Bin" last="Yang">Bin Yang</name>
<affiliation wicri:level="1">
<nlm:aff id="I1">State Key Laboratory of Bioelectronics, School of Biological Science &amp; Medical Engineering, Southeast University, Nanjing, Jiangsu, 210096 PR China</nlm:aff>
<country xml:lang="fr">République populaire de Chine</country>
<wicri:regionArea>State Key Laboratory of Bioelectronics, School of Biological Science &amp; Medical Engineering, Southeast University, Nanjing, Jiangsu</wicri:regionArea>
<wicri:noRegion>Jiangsu</wicri:noRegion>
</affiliation>
<affiliation wicri:level="1">
<nlm:aff id="I2">Department of Computer Science, The University of Hong Kong, Pok Fu Lam Road, Hong Kong</nlm:aff>
<country xml:lang="fr">Hong Kong</country>
<wicri:regionArea>Department of Computer Science, The University of Hong Kong, Pok Fu Lam Road</wicri:regionArea>
<wicri:noRegion>Pok Fu Lam Road</wicri:noRegion>
</affiliation>
</author>
<author>
<name sortKey="Peng, Yu" sort="Peng, Yu" uniqKey="Peng Y" first="Yu" last="Peng">Yu Peng</name>
<affiliation wicri:level="1">
<nlm:aff id="I2">Department of Computer Science, The University of Hong Kong, Pok Fu Lam Road, Hong Kong</nlm:aff>
<country xml:lang="fr">Hong Kong</country>
<wicri:regionArea>Department of Computer Science, The University of Hong Kong, Pok Fu Lam Road</wicri:regionArea>
<wicri:noRegion>Pok Fu Lam Road</wicri:noRegion>
</affiliation>
</author>
<author>
<name sortKey="Leung, Henry Chi Ming" sort="Leung, Henry Chi Ming" uniqKey="Leung H" first="Henry Chi-Ming" last="Leung">Henry Chi-Ming Leung</name>
<affiliation wicri:level="1">
<nlm:aff id="I2">Department of Computer Science, The University of Hong Kong, Pok Fu Lam Road, Hong Kong</nlm:aff>
<country xml:lang="fr">Hong Kong</country>
<wicri:regionArea>Department of Computer Science, The University of Hong Kong, Pok Fu Lam Road</wicri:regionArea>
<wicri:noRegion>Pok Fu Lam Road</wicri:noRegion>
</affiliation>
</author>
<author>
<name sortKey="Yiu, Siu Ming" sort="Yiu, Siu Ming" uniqKey="Yiu S" first="Siu-Ming" last="Yiu">Siu-Ming Yiu</name>
<affiliation wicri:level="1">
<nlm:aff id="I2">Department of Computer Science, The University of Hong Kong, Pok Fu Lam Road, Hong Kong</nlm:aff>
<country xml:lang="fr">Hong Kong</country>
<wicri:regionArea>Department of Computer Science, The University of Hong Kong, Pok Fu Lam Road</wicri:regionArea>
<wicri:noRegion>Pok Fu Lam Road</wicri:noRegion>
</affiliation>
</author>
<author>
<name sortKey="Chen, Jing Chi" sort="Chen, Jing Chi" uniqKey="Chen J" first="Jing-Chi" last="Chen">Jing-Chi Chen</name>
<affiliation wicri:level="1">
<nlm:aff id="I2">Department of Computer Science, The University of Hong Kong, Pok Fu Lam Road, Hong Kong</nlm:aff>
<country xml:lang="fr">Hong Kong</country>
<wicri:regionArea>Department of Computer Science, The University of Hong Kong, Pok Fu Lam Road</wicri:regionArea>
<wicri:noRegion>Pok Fu Lam Road</wicri:noRegion>
</affiliation>
</author>
<author>
<name sortKey="Chin, Francis Yuk Lun" sort="Chin, Francis Yuk Lun" uniqKey="Chin F" first="Francis Yuk-Lun" last="Chin">Francis Yuk-Lun Chin</name>
<affiliation wicri:level="1">
<nlm:aff id="I2">Department of Computer Science, The University of Hong Kong, Pok Fu Lam Road, Hong Kong</nlm:aff>
<country xml:lang="fr">Hong Kong</country>
<wicri:regionArea>Department of Computer Science, The University of Hong Kong, Pok Fu Lam Road</wicri:regionArea>
<wicri:noRegion>Pok Fu Lam Road</wicri:noRegion>
</affiliation>
</author>
</analytic>
<series>
<title level="j">BMC Bioinformatics</title>
<idno type="eISSN">1471-2105</idno>
<imprint>
<date when="2010">2010</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
</fileDesc>
<profileDesc>
<textClass></textClass>
</profileDesc>
</teiHeader>
<front>
<div type="abstract" xml:lang="en">
<sec>
<title>Background</title>
<p>With the rapid development of genome sequencing techniques, traditional research methods based on the isolation and cultivation of microorganisms are gradually being replaced by metagenomics, also known as environmental genomics. The first step of metagenomics, which is still a major bottleneck, is the taxonomic characterization of DNA fragments (reads) resulting from sequencing a sample of mixed species. This step is usually referred to as “binning”. Existing binning methods are based on supervised or semi-supervised approaches, which rely heavily on reference genomes of known microorganisms and phylogenetic marker genes. Due to the limited availability of reference genomes and the bias and instability of marker genes, existing binning methods may not be applicable in many cases.</p>
</sec>
<sec>
<title>Results</title>
<p>In this paper, we present an unsupervised binning method based on the distribution of a carefully selected set of
<italic>l</italic>
-mers (substrings of length
<italic>l</italic>
in DNA fragments). Our experiments show that the method can accurately bin DNA fragments of various lengths and relative species abundance ratios without using any reference or training datasets.</p>
<p>Another feature of our method is its error robustness. The binning accuracy decreases by less than 1% when the sequencing error rate increases from 0% to 5%. Note that the typical sequencing error rate of existing commercial sequencing platforms is less than 2%.</p>
</sec>
<sec>
<title>Conclusions</title>
<p>We provide a new and effective tool for solving the metagenome binning problem without using any reference datasets or marker information from known reference genomes (species). The source code of our software tool, the reference genomes of the species used to generate the test datasets, and the corresponding test datasets are available at
<ext-link ext-link-type="uri" xlink:href="http://i.cs.hku.hk/~alse/MetaCluster/">http://i.cs.hku.hk/~alse/MetaCluster/</ext-link>
.</p>
</sec>
</div>
</front>
</TEI>
<pmc article-type="research-article">
<pmc-dir>properties open_access</pmc-dir>
<front>
<journal-meta>
<journal-id journal-id-type="nlm-ta">BMC Bioinformatics</journal-id>
<journal-title-group>
<journal-title>BMC Bioinformatics</journal-title>
</journal-title-group>
<issn pub-type="epub">1471-2105</issn>
<publisher>
<publisher-name>BioMed Central</publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="pmid">20406503</article-id>
<article-id pub-id-type="pmc">3165929</article-id>
<article-id pub-id-type="publisher-id">1471-2105-11-S2-S5</article-id>
<article-id pub-id-type="doi">10.1186/1471-2105-11-S2-S5</article-id>
<article-categories>
<subj-group subj-group-type="heading">
<subject>Proceedings</subject>
</subj-group>
</article-categories>
<title-group>
<article-title>Unsupervised binning of environmental genomic fragments based on an error robust selection of
<italic>l</italic>
-mers</article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author" corresp="yes" id="A1">
<name>
<surname>Yang</surname>
<given-names>Bin</given-names>
</name>
<xref ref-type="aff" rid="I1">1</xref>
<xref ref-type="aff" rid="I2">2</xref>
<email>byang@cs.hku.hk</email>
</contrib>
<contrib contrib-type="author" id="A2">
<name>
<surname>Peng</surname>
<given-names>Yu</given-names>
</name>
<xref ref-type="aff" rid="I2">2</xref>
<email>ypeng@cs.hku.hk</email>
</contrib>
<contrib contrib-type="author" id="A3">
<name>
<surname>Leung</surname>
<given-names>Henry Chi-Ming</given-names>
</name>
<xref ref-type="aff" rid="I2">2</xref>
<email>cmleung2@cs.hku.hk</email>
</contrib>
<contrib contrib-type="author" id="A4">
<name>
<surname>Yiu</surname>
<given-names>Siu-Ming</given-names>
</name>
<xref ref-type="aff" rid="I2">2</xref>
<email>smyiu@cs.hku.hk</email>
</contrib>
<contrib contrib-type="author" id="A5">
<name>
<surname>Chen</surname>
<given-names>Jing-Chi</given-names>
</name>
<xref ref-type="aff" rid="I2">2</xref>
<email>jchen@cs.hku.hk</email>
</contrib>
<contrib contrib-type="author" corresp="yes" id="A6">
<name>
<surname>Chin</surname>
<given-names>Francis Yuk-Lun</given-names>
</name>
<xref ref-type="aff" rid="I2">2</xref>
<email>chin@cs.hku.hk</email>
</contrib>
</contrib-group>
<aff id="I1">
<label>1</label>
State Key Laboratory of Bioelectronics, School of Biological Science &amp; Medical Engineering, Southeast University, Nanjing, Jiangsu, 210096 PR China</aff>
<aff id="I2">
<label>2</label>
Department of Computer Science, The University of Hong Kong, Pok Fu Lam Road, Hong Kong</aff>
<pub-date pub-type="collection">
<year>2010</year>
</pub-date>
<pub-date pub-type="epub">
<day>16</day>
<month>4</month>
<year>2010</year>
</pub-date>
<volume>11</volume>
<issue>Suppl 2</issue>
<supplement>
<named-content content-type="supplement-title">Third International Workshop on Data and Text Mining in Bioinformatics (DTMBio) 2009</named-content>
<named-content content-type="supplement-editor">Min Song, Doheon Lee and Jun Huan</named-content>
</supplement>
<fpage>S5</fpage>
<lpage>S5</lpage>
<permissions>
<copyright-statement>Copyright ©2010 Yang and Chin; licensee BioMed Central Ltd.</copyright-statement>
<copyright-year>2010</copyright-year>
<copyright-holder>Yang and Chin; licensee BioMed Central Ltd.</copyright-holder>
<license license-type="open-access" xlink:href="http://creativecommons.org/licenses/by/2.0">
<license-p>This is an open access article distributed under the terms of the Creative Commons Attribution License (
<ext-link ext-link-type="uri" xlink:href="http://creativecommons.org/licenses/by/2.0">http://creativecommons.org/licenses/by/2.0</ext-link>
), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</license-p>
</license>
</permissions>
<self-uri xlink:href="http://www.biomedcentral.com/1471-2105/11/S2/S5"></self-uri>
<abstract>
<sec>
<title>Background</title>
<p>With the rapid development of genome sequencing techniques, traditional research methods based on the isolation and cultivation of microorganisms are gradually being replaced by metagenomics, also known as environmental genomics. The first step of metagenomics, which is still a major bottleneck, is the taxonomic characterization of DNA fragments (reads) resulting from sequencing a sample of mixed species. This step is usually referred to as “binning”. Existing binning methods are based on supervised or semi-supervised approaches, which rely heavily on reference genomes of known microorganisms and phylogenetic marker genes. Due to the limited availability of reference genomes and the bias and instability of marker genes, existing binning methods may not be applicable in many cases.</p>
</sec>
<sec>
<title>Results</title>
<p>In this paper, we present an unsupervised binning method based on the distribution of a carefully selected set of
<italic>l</italic>
-mers (substrings of length
<italic>l</italic>
in DNA fragments). Our experiments show that the method can accurately bin DNA fragments of various lengths and relative species abundance ratios without using any reference or training datasets.</p>
<p>Another feature of our method is its error robustness. The binning accuracy decreases by less than 1% when the sequencing error rate increases from 0% to 5%. Note that the typical sequencing error rate of existing commercial sequencing platforms is less than 2%.</p>
</sec>
<sec>
<title>Conclusions</title>
<p>We provide a new and effective tool for solving the metagenome binning problem without using any reference datasets or marker information from known reference genomes (species). The source code of our software tool, the reference genomes of the species used to generate the test datasets, and the corresponding test datasets are available at
<ext-link ext-link-type="uri" xlink:href="http://i.cs.hku.hk/~alse/MetaCluster/">http://i.cs.hku.hk/~alse/MetaCluster/</ext-link>
.</p>
</sec>
</abstract>
<conference>
<conf-date>
<day>6</day>
<month>11</month>
<year>2009</year>
</conf-date>
<conf-name>Third International Workshop on Data and Text Mining in Bioinformatics (DTMBio) 2009</conf-name>
<conf-loc>Hong Kong</conf-loc>
</conference>
</article-meta>
</front>
</pmc>
<affiliations>
<list>
<country>
<li>Hong Kong</li>
<li>République populaire de Chine</li>
</country>
</list>
<tree>
<country name="République populaire de Chine">
<noRegion>
<name sortKey="Yang, Bin" sort="Yang, Bin" uniqKey="Yang B" first="Bin" last="Yang">Bin Yang</name>
</noRegion>
</country>
<country name="Hong Kong">
<noRegion>
<name sortKey="Yang, Bin" sort="Yang, Bin" uniqKey="Yang B" first="Bin" last="Yang">Bin Yang</name>
</noRegion>
<name sortKey="Chen, Jing Chi" sort="Chen, Jing Chi" uniqKey="Chen J" first="Jing-Chi" last="Chen">Jing-Chi Chen</name>
<name sortKey="Chin, Francis Yuk Lun" sort="Chin, Francis Yuk Lun" uniqKey="Chin F" first="Francis Yuk-Lun" last="Chin">Francis Yuk-Lun Chin</name>
<name sortKey="Leung, Henry Chi Ming" sort="Leung, Henry Chi Ming" uniqKey="Leung H" first="Henry Chi-Ming" last="Leung">Henry Chi-Ming Leung</name>
<name sortKey="Peng, Yu" sort="Peng, Yu" uniqKey="Peng Y" first="Yu" last="Peng">Yu Peng</name>
<name sortKey="Yiu, Siu Ming" sort="Yiu, Siu Ming" uniqKey="Yiu S" first="Siu-Ming" last="Yiu">Siu-Ming Yiu</name>
</country>
</tree>
</affiliations>
</record>

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Sante/explor/MersV1/Data/Pmc/Checkpoint
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 001346 | SxmlIndent | more

Or

HfdSelect -h $EXPLOR_AREA/Data/Pmc/Checkpoint/biblio.hfd -nk 001346 | SxmlIndent | more

To add a link to this page in the Wicri network

{{Explor lien
   |wiki=    Sante
   |area=    MersV1
   |flux=    Pmc
   |étape=   Checkpoint
   |type=    RBID
   |clé=     PMC:3165929
   |texte=   Unsupervised binning of environmental genomic fragments based on an error robust selection of l-mers
}}

To generate wiki pages

HfdIndexSelect -h $EXPLOR_AREA/Data/Pmc/Checkpoint/RBID.i   -Sk "pubmed:20406503" \
       | HfdSelect -Kh $EXPLOR_AREA/Data/Pmc/Checkpoint/biblio.hfd   \
       | NlmPubMed2Wicri -a MersV1 

Wicri

This area was generated with Dilib version V0.6.33.
Data generation: Mon Apr 20 23:26:43 2020. Site generation: Sat Mar 27 09:06:09 2021