MERS exploration server

Warning: this site is under development!
Warning: this site was generated automatically from raw corpora.
The information has therefore not been validated.

Improving Phrap-Based Assembly of the Rat Using “Reliable” Overlaps

Internal identifier: 001058 (Pmc/Curation); previous: 001057; next: 001059

Authors: Michael Roberts [United States]; Aleksey V. Zimin [United States]; Wayne Hayes [United States]; Brian R. Hunt [United States]; Cevat Ustun [United States]; James R. White [United States]; Paul Havlak [United States]; James Yorke [United States]

RBID: PMC:2266800

Abstract

The assembly methods used for whole-genome shotgun (WGS) data have a major impact on the quality of resulting draft genomes. We present a novel algorithm to generate a set of “reliable” overlaps based on identifying repeat k-mers. To demonstrate the benefits of using reliable overlaps, we have created a version of the Phrap assembly program that uses only overlaps from a specific list. We call this version PhrapUMD. Integrating PhrapUMD and our “reliable-overlap” algorithm with the Baylor College of Medicine assembler, Atlas, we assemble the BACs from the Rattus norvegicus genome project. Starting with the same data as the Nov. 2002 Atlas assembly, we compare our results and the Atlas assembly to the 4.3 Mb of rat sequence in the 21 BACs that have been finished. Our version of the draft assembly of the 21 BACs increases the coverage of finished sequence from 93.4% to 96.3%, while simultaneously reducing the base error rate from 4.5 to 1.1 errors per 10,000 bases. There are a number of ways of assessing the relative merits of assemblies when the finished sequence is available. If one views the overall quality of an assembly as proportional to the inverse of the product of the error rate and sequence missed, then the assembly presented here is seven times better. The UMD Overlapper with options for reliable overlaps is available from the authors at http://www.genome.umd.edu. We also provide the changes to the Phrap source code enabling it to use only the reliable overlaps.
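
The "seven times better" figure can be checked from the numbers quoted in the abstract. The sketch below assumes that "sequence missed" means 100% minus the coverage of finished sequence, and that quality is taken as exactly the inverse of the product of error rate and sequence missed. The function name `quality_ratio` is illustrative and does not come from the paper.

```python
def quality_ratio(cov_old_pct, err_old, cov_new_pct, err_new):
    """Return quality(new) / quality(old), where quality = 1 / (error rate * missed).

    Coverage is given in percent; error rates are errors per 10,000 bases.
    The units cancel in the ratio, so no conversion is needed.
    """
    # Assumption: "sequence missed" is the uncovered share of finished sequence.
    missed_old = 100.0 - cov_old_pct
    missed_new = 100.0 - cov_new_pct
    # quality_new / quality_old = (err_old * missed_old) / (err_new * missed_new)
    return (err_old * missed_old) / (err_new * missed_new)


# Figures from the abstract: Atlas (Nov. 2002) versus the PhrapUMD-based assembly.
ratio = quality_ratio(cov_old_pct=93.4, err_old=4.5, cov_new_pct=96.3, err_new=1.1)
print(f"{ratio:.1f}")
```

With the abstract's figures the ratio comes out at roughly 7.3, consistent with the stated sevenfold improvement.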


DOI: 10.1371/journal.pone.0001836
PubMed: 18350171
PubMed Central: 2266800

The document in XML format

<record>
<TEI>
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en">Improving Phrap-Based Assembly of the Rat Using “Reliable” Overlaps</title>
<author>
<name sortKey="Roberts, Michael" sort="Roberts, Michael" uniqKey="Roberts M" first="Michael" last="Roberts">Michael Roberts</name>
<affiliation wicri:level="1">
<nlm:aff id="aff1">
<addr-line>Institute for Physical Science and Technology, University of Maryland, College Park, Maryland, United States of America</addr-line>
</nlm:aff>
<country xml:lang="fr">États-Unis</country>
<wicri:regionArea>Institute for Physical Science and Technology, University of Maryland, College Park, Maryland</wicri:regionArea>
</affiliation>
</author>
<author>
<name sortKey="Zimin, Aleksey V" sort="Zimin, Aleksey V" uniqKey="Zimin A" first="Aleksey V." last="Zimin">Aleksey V. Zimin</name>
<affiliation wicri:level="1">
<nlm:aff id="aff1">
<addr-line>Institute for Physical Science and Technology, University of Maryland, College Park, Maryland, United States of America</addr-line>
</nlm:aff>
<country xml:lang="fr">États-Unis</country>
<wicri:regionArea>Institute for Physical Science and Technology, University of Maryland, College Park, Maryland</wicri:regionArea>
</affiliation>
</author>
<author>
<name sortKey="Hayes, Wayne" sort="Hayes, Wayne" uniqKey="Hayes W" first="Wayne" last="Hayes">Wayne Hayes</name>
<affiliation wicri:level="1">
<nlm:aff id="aff1">
<addr-line>Institute for Physical Science and Technology, University of Maryland, College Park, Maryland, United States of America</addr-line>
</nlm:aff>
<country xml:lang="fr">États-Unis</country>
<wicri:regionArea>Institute for Physical Science and Technology, University of Maryland, College Park, Maryland</wicri:regionArea>
</affiliation>
</author>
<author>
<name sortKey="Hunt, Brian R" sort="Hunt, Brian R" uniqKey="Hunt B" first="Brian R." last="Hunt">Brian R. Hunt</name>
<affiliation wicri:level="1">
<nlm:aff id="aff1">
<addr-line>Institute for Physical Science and Technology, University of Maryland, College Park, Maryland, United States of America</addr-line>
</nlm:aff>
<country xml:lang="fr">États-Unis</country>
<wicri:regionArea>Institute for Physical Science and Technology, University of Maryland, College Park, Maryland</wicri:regionArea>
</affiliation>
</author>
<author>
<name sortKey="Ustun, Cevat" sort="Ustun, Cevat" uniqKey="Ustun C" first="Cevat" last="Ustun">Cevat Ustun</name>
<affiliation wicri:level="1">
<nlm:aff id="aff1">
<addr-line>Institute for Physical Science and Technology, University of Maryland, College Park, Maryland, United States of America</addr-line>
</nlm:aff>
<country xml:lang="fr">États-Unis</country>
<wicri:regionArea>Institute for Physical Science and Technology, University of Maryland, College Park, Maryland</wicri:regionArea>
</affiliation>
</author>
<author>
<name sortKey="White, James R" sort="White, James R" uniqKey="White J" first="James R." last="White">James R. White</name>
<affiliation wicri:level="1">
<nlm:aff id="aff1">
<addr-line>Institute for Physical Science and Technology, University of Maryland, College Park, Maryland, United States of America</addr-line>
</nlm:aff>
<country xml:lang="fr">États-Unis</country>
<wicri:regionArea>Institute for Physical Science and Technology, University of Maryland, College Park, Maryland</wicri:regionArea>
</affiliation>
</author>
<author>
<name sortKey="Havlak, Paul" sort="Havlak, Paul" uniqKey="Havlak P" first="Paul" last="Havlak">Paul Havlak</name>
<affiliation wicri:level="1">
<nlm:aff id="aff2">
<addr-line>Human Genome Sequencing Center, Baylor College of Medicine, Houston, Texas, United States of America</addr-line>
</nlm:aff>
<country xml:lang="fr">États-Unis</country>
<wicri:regionArea>Human Genome Sequencing Center, Baylor College of Medicine, Houston, Texas</wicri:regionArea>
</affiliation>
</author>
<author>
<name sortKey="Yorke, James" sort="Yorke, James" uniqKey="Yorke J" first="James" last="Yorke">James Yorke</name>
<affiliation wicri:level="1">
<nlm:aff id="aff1">
<addr-line>Institute for Physical Science and Technology, University of Maryland, College Park, Maryland, United States of America</addr-line>
</nlm:aff>
<country xml:lang="fr">États-Unis</country>
<wicri:regionArea>Institute for Physical Science and Technology, University of Maryland, College Park, Maryland</wicri:regionArea>
</affiliation>
</author>
</titleStmt>
<publicationStmt>
<idno type="wicri:source">PMC</idno>
<idno type="pmid">18350171</idno>
<idno type="pmc">2266800</idno>
<idno type="url">http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2266800</idno>
<idno type="RBID">PMC:2266800</idno>
<idno type="doi">10.1371/journal.pone.0001836</idno>
<date when="2008">2008</date>
<idno type="wicri:Area/Pmc/Corpus">001058</idno>
<idno type="wicri:explorRef" wicri:stream="Pmc" wicri:step="Corpus" wicri:corpus="PMC">001058</idno>
<idno type="wicri:Area/Pmc/Curation">001058</idno>
<idno type="wicri:explorRef" wicri:stream="Pmc" wicri:step="Curation">001058</idno>
</publicationStmt>
<sourceDesc>
<biblStruct>
<analytic>
<title xml:lang="en" level="a" type="main">Improving Phrap-Based Assembly of the Rat Using “Reliable” Overlaps</title>
<author>
<name sortKey="Roberts, Michael" sort="Roberts, Michael" uniqKey="Roberts M" first="Michael" last="Roberts">Michael Roberts</name>
<affiliation wicri:level="1">
<nlm:aff id="aff1">
<addr-line>Institute for Physical Science and Technology, University of Maryland, College Park, Maryland, United States of America</addr-line>
</nlm:aff>
<country xml:lang="fr">États-Unis</country>
<wicri:regionArea>Institute for Physical Science and Technology, University of Maryland, College Park, Maryland</wicri:regionArea>
</affiliation>
</author>
<author>
<name sortKey="Zimin, Aleksey V" sort="Zimin, Aleksey V" uniqKey="Zimin A" first="Aleksey V." last="Zimin">Aleksey V. Zimin</name>
<affiliation wicri:level="1">
<nlm:aff id="aff1">
<addr-line>Institute for Physical Science and Technology, University of Maryland, College Park, Maryland, United States of America</addr-line>
</nlm:aff>
<country xml:lang="fr">États-Unis</country>
<wicri:regionArea>Institute for Physical Science and Technology, University of Maryland, College Park, Maryland</wicri:regionArea>
</affiliation>
</author>
<author>
<name sortKey="Hayes, Wayne" sort="Hayes, Wayne" uniqKey="Hayes W" first="Wayne" last="Hayes">Wayne Hayes</name>
<affiliation wicri:level="1">
<nlm:aff id="aff1">
<addr-line>Institute for Physical Science and Technology, University of Maryland, College Park, Maryland, United States of America</addr-line>
</nlm:aff>
<country xml:lang="fr">États-Unis</country>
<wicri:regionArea>Institute for Physical Science and Technology, University of Maryland, College Park, Maryland</wicri:regionArea>
</affiliation>
</author>
<author>
<name sortKey="Hunt, Brian R" sort="Hunt, Brian R" uniqKey="Hunt B" first="Brian R." last="Hunt">Brian R. Hunt</name>
<affiliation wicri:level="1">
<nlm:aff id="aff1">
<addr-line>Institute for Physical Science and Technology, University of Maryland, College Park, Maryland, United States of America</addr-line>
</nlm:aff>
<country xml:lang="fr">États-Unis</country>
<wicri:regionArea>Institute for Physical Science and Technology, University of Maryland, College Park, Maryland</wicri:regionArea>
</affiliation>
</author>
<author>
<name sortKey="Ustun, Cevat" sort="Ustun, Cevat" uniqKey="Ustun C" first="Cevat" last="Ustun">Cevat Ustun</name>
<affiliation wicri:level="1">
<nlm:aff id="aff1">
<addr-line>Institute for Physical Science and Technology, University of Maryland, College Park, Maryland, United States of America</addr-line>
</nlm:aff>
<country xml:lang="fr">États-Unis</country>
<wicri:regionArea>Institute for Physical Science and Technology, University of Maryland, College Park, Maryland</wicri:regionArea>
</affiliation>
</author>
<author>
<name sortKey="White, James R" sort="White, James R" uniqKey="White J" first="James R." last="White">James R. White</name>
<affiliation wicri:level="1">
<nlm:aff id="aff1">
<addr-line>Institute for Physical Science and Technology, University of Maryland, College Park, Maryland, United States of America</addr-line>
</nlm:aff>
<country xml:lang="fr">États-Unis</country>
<wicri:regionArea>Institute for Physical Science and Technology, University of Maryland, College Park, Maryland</wicri:regionArea>
</affiliation>
</author>
<author>
<name sortKey="Havlak, Paul" sort="Havlak, Paul" uniqKey="Havlak P" first="Paul" last="Havlak">Paul Havlak</name>
<affiliation wicri:level="1">
<nlm:aff id="aff2">
<addr-line>Human Genome Sequencing Center, Baylor College of Medicine, Houston, Texas, United States of America</addr-line>
</nlm:aff>
<country xml:lang="fr">États-Unis</country>
<wicri:regionArea>Human Genome Sequencing Center, Baylor College of Medicine, Houston, Texas</wicri:regionArea>
</affiliation>
</author>
<author>
<name sortKey="Yorke, James" sort="Yorke, James" uniqKey="Yorke J" first="James" last="Yorke">James Yorke</name>
<affiliation wicri:level="1">
<nlm:aff id="aff1">
<addr-line>Institute for Physical Science and Technology, University of Maryland, College Park, Maryland, United States of America</addr-line>
</nlm:aff>
<country xml:lang="fr">États-Unis</country>
<wicri:regionArea>Institute for Physical Science and Technology, University of Maryland, College Park, Maryland</wicri:regionArea>
</affiliation>
</author>
</analytic>
<series>
<title level="j">PLoS ONE</title>
<idno type="eISSN">1932-6203</idno>
<imprint>
<date when="2008">2008</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
</fileDesc>
<profileDesc>
<textClass></textClass>
</profileDesc>
</teiHeader>
<front>
<div type="abstract" xml:lang="en">
<p>The assembly methods used for whole-genome shotgun (WGS) data have a major impact on the quality of resulting draft genomes. We present a novel algorithm to generate a set of “reliable” overlaps based on identifying repeat k-mers. To demonstrate the benefits of using reliable overlaps, we have created a version of the Phrap assembly program that uses only overlaps from a specific list. We call this version
<italic>PhrapUMD</italic>
. Integrating PhrapUMD and our “reliable-overlap” algorithm with the Baylor College of Medicine assembler, Atlas, we assemble the BACs from the
<italic>Rattus norvegicus</italic>
genome project. Starting with the same data as the Nov. 2002 Atlas assembly, we compare our results and the Atlas assembly to the 4.3 Mb of rat sequence in the 21 BACs that have been finished. Our version of the draft assembly of the 21 BACs increases the coverage of finished sequence from 93.4% to 96.3%, while simultaneously reducing the base error rate from 4.5 to 1.1 errors per 10,000 bases. There are a number of ways of assessing the relative merits of assemblies when the finished sequence is available. If one views the overall quality of an assembly as proportional to the inverse of the product of the error rate and sequence missed, then the assembly presented here is seven times better. The UMD Overlapper with options for reliable overlaps is available from the authors at
<ext-link ext-link-type="uri" xlink:href="http://www.genome.umd.edu">http://www.genome.umd.edu</ext-link>
. We also provide the changes to the Phrap source code enabling it to use only the reliable overlaps.</p>
</div>
</front>
</TEI>
<pmc article-type="research-article">
<pmc-dir>properties open_access</pmc-dir>
<front>
<journal-meta>
<journal-id journal-id-type="nlm-ta">PLoS One</journal-id>
<journal-id journal-id-type="iso-abbrev">PLoS ONE</journal-id>
<journal-id journal-id-type="publisher-id">plos</journal-id>
<journal-id journal-id-type="pmc">plosone</journal-id>
<journal-title-group>
<journal-title>PLoS ONE</journal-title>
</journal-title-group>
<issn pub-type="epub">1932-6203</issn>
<publisher>
<publisher-name>Public Library of Science</publisher-name>
<publisher-loc>San Francisco, USA</publisher-loc>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="pmid">18350171</article-id>
<article-id pub-id-type="pmc">2266800</article-id>
<article-id pub-id-type="publisher-id">07-PONE-RA-02495R1</article-id>
<article-id pub-id-type="doi">10.1371/journal.pone.0001836</article-id>
<article-categories>
<subj-group subj-group-type="heading">
<subject>Research Article</subject>
</subj-group>
<subj-group subj-group-type="Discipline">
<subject>Computational Biology</subject>
<subject>Genetics and Genomics/Bioinformatics</subject>
<subject>Genetics and Genomics/Genome Projects</subject>
</subj-group>
</article-categories>
<title-group>
<article-title>Improving Phrap-Based Assembly of the Rat Using “Reliable” Overlaps</article-title>
<alt-title alt-title-type="running-head">Improving Phrap-Based Assembly</alt-title>
</title-group>
<contrib-group>
<contrib contrib-type="author">
<name>
<surname>Roberts</surname>
<given-names>Michael</given-names>
</name>
<xref ref-type="aff" rid="aff1">
<sup>1</sup>
</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Zimin</surname>
<given-names>Aleksey V.</given-names>
</name>
<xref ref-type="aff" rid="aff1">
<sup>1</sup>
</xref>
<xref ref-type="corresp" rid="cor1">
<sup>*</sup>
</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Hayes</surname>
<given-names>Wayne</given-names>
</name>
<xref ref-type="aff" rid="aff1">
<sup>1</sup>
</xref>
<xref ref-type="author-notes" rid="fn1">
<sup>¤a</sup>
</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Hunt</surname>
<given-names>Brian R.</given-names>
</name>
<xref ref-type="aff" rid="aff1">
<sup>1</sup>
</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Ustun</surname>
<given-names>Cevat</given-names>
</name>
<xref ref-type="aff" rid="aff1">
<sup>1</sup>
</xref>
<xref ref-type="author-notes" rid="fn2">
<sup>¤b</sup>
</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>White</surname>
<given-names>James R.</given-names>
</name>
<xref ref-type="aff" rid="aff1">
<sup>1</sup>
</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Havlak</surname>
<given-names>Paul</given-names>
</name>
<xref ref-type="aff" rid="aff2">
<sup>2</sup>
</xref>
<xref ref-type="author-notes" rid="fn3">
<sup>¤c</sup>
</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Yorke</surname>
<given-names>James</given-names>
</name>
<xref ref-type="aff" rid="aff1">
<sup>1</sup>
</xref>
</contrib>
</contrib-group>
<aff id="aff1">
<label>1</label>
<addr-line>Institute for Physical Science and Technology, University of Maryland, College Park, Maryland, United States of America</addr-line>
</aff>
<aff id="aff2">
<label>2</label>
<addr-line>Human Genome Sequencing Center, Baylor College of Medicine, Houston, Texas, United States of America</addr-line>
</aff>
<contrib-group>
<contrib contrib-type="editor">
<name>
<surname>Hall</surname>
<given-names>Neil</given-names>
</name>
<role>Editor</role>
<xref ref-type="aff" rid="edit1"></xref>
</contrib>
</contrib-group>
<aff id="edit1">University of Liverpool, United Kingdom</aff>
<author-notes>
<corresp id="cor1">* E-mail:
<email>alekseyz@ipst.umd.edu</email>
</corresp>
<fn fn-type="con">
<p>Conceived and designed the experiments: MR AZ WH BH JY CU. Performed the experiments: MR AZ WH JW CU. Analyzed the data: MR PH AZ WH JW CU. Contributed reagents/materials/analysis tools: PH. Wrote the paper: AZ WH JY. Other: Headed the project: JY. PI on the grant: JY.</p>
</fn>
<fn id="fn1" fn-type="current-aff">
<label>¤a</label>
<p>Current address: Department of Computer Science, University of California Irvine, Irvine, California, United States of America</p>
</fn>
<fn id="fn2" fn-type="current-aff">
<label>¤b</label>
<p>Current address: Department of Biology, California Institute of Technology, Pasadena, California, United States of America</p>
</fn>
<fn id="fn3" fn-type="current-aff">
<label>¤c</label>
<p>Current address: National Center for Biotechnology Information, National Institutes of Health, Bethesda, Maryland, United States of America</p>
</fn>
</author-notes>
<pub-date pub-type="collection">
<year>2008</year>
</pub-date>
<pub-date pub-type="epub">
<day>19</day>
<month>3</month>
<year>2008</year>
</pub-date>
<volume>3</volume>
<issue>3</issue>
<elocation-id>e1836</elocation-id>
<history>
<date date-type="received">
<day>15</day>
<month>10</month>
<year>2007</year>
</date>
<date date-type="accepted">
<day>9</day>
<month>2</month>
<year>2008</year>
</date>
</history>
<permissions>
<copyright-statement>Roberts et al.</copyright-statement>
<copyright-year>2008</copyright-year>
<license xlink:href="http://creativecommons.org/licenses/by/4.0/">
<license-p>This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.</license-p>
</license>
</permissions>
<abstract>
<p>The assembly methods used for whole-genome shotgun (WGS) data have a major impact on the quality of resulting draft genomes. We present a novel algorithm to generate a set of “reliable” overlaps based on identifying repeat k-mers. To demonstrate the benefits of using reliable overlaps, we have created a version of the Phrap assembly program that uses only overlaps from a specific list. We call this version
<italic>PhrapUMD</italic>
. Integrating PhrapUMD and our “reliable-overlap” algorithm with the Baylor College of Medicine assembler, Atlas, we assemble the BACs from the
<italic>Rattus norvegicus</italic>
genome project. Starting with the same data as the Nov. 2002 Atlas assembly, we compare our results and the Atlas assembly to the 4.3 Mb of rat sequence in the 21 BACs that have been finished. Our version of the draft assembly of the 21 BACs increases the coverage of finished sequence from 93.4% to 96.3%, while simultaneously reducing the base error rate from 4.5 to 1.1 errors per 10,000 bases. There are a number of ways of assessing the relative merits of assemblies when the finished sequence is available. If one views the overall quality of an assembly as proportional to the inverse of the product of the error rate and sequence missed, then the assembly presented here is seven times better. The UMD Overlapper with options for reliable overlaps is available from the authors at
<ext-link ext-link-type="uri" xlink:href="http://www.genome.umd.edu">http://www.genome.umd.edu</ext-link>
. We also provide the changes to the Phrap source code enabling it to use only the reliable overlaps.</p>
</abstract>
<counts>
<page-count count="5"></page-count>
</counts>
</article-meta>
</front>
</pmc>
</record>

To work with this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Sante/explor/MersV1/Data/Pmc/Curation
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 001058 | SxmlIndent | more

Or

HfdSelect -h $EXPLOR_AREA/Data/Pmc/Curation/biblio.hfd -nk 001058 | SxmlIndent | more

To link to this page from the Wicri network

{{Explor lien
   |wiki=    Sante
   |area=    MersV1
   |flux=    Pmc
   |étape=   Curation
   |type=    RBID
   |clé=     PMC:2266800
   |texte=   Improving Phrap-Based Assembly of the Rat Using “Reliable” Overlaps
}}

To generate wiki pages

HfdIndexSelect -h $EXPLOR_AREA/Data/Pmc/Curation/RBID.i   -Sk "pubmed:18350171" \
       | HfdSelect -Kh $EXPLOR_AREA/Data/Pmc/Curation/biblio.hfd   \
       | NlmPubMed2Wicri -a MersV1 

Wicri

This area was generated with Dilib version V0.6.33.
Data generation: Mon Apr 20 23:26:43 2020. Site generation: Sat Mar 27 09:06:09 2021