MERS exploration server

Warning: this site is under development!
Warning: this site was generated automatically from raw corpora.
The information has therefore not been validated.

Exploiting sparseness in de novo genome assembly.

Internal identifier: 001C83 (PubMed/Checkpoint); previous: 001C82; next: 001C84

Authors: Chengxi Ye [United States]; Zhanshan Sam Ma; Charles H. Cannon; Mihai Pop; Douglas W. Yu

Source: BMC Bioinformatics [1471-2105]; 2012.

RBID: pubmed:22537038

French descriptors

English descriptors

Abstract

The very large memory requirements for the construction of assembly graphs for de novo genome assembly limit current algorithms to super-computing environments.

DOI: 10.1186/1471-2105-13-S6-S1
PubMed: 22537038
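
The METHODS portion of the abstract, preserved in the XML record on this page, describes a sparse assembly graph that stores only a small fraction of the observed k-mers as nodes, together with the links between those nodes. The Python sketch below is a toy illustration of that idea under stated assumptions, not SparseAssembler's implementation: the function names, the parameter `g` (how many k-mers are skipped between stored nodes), the content stored on each link, and the single error-free input sequence are all hypothetical choices made for this example.

```python
def dense_kmers(seq, k):
    """All distinct k-mers of seq, as a dense k-mer node set would hold them."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def sparse_graph(seq, k, g):
    """Keep one k-mer out of every g + 1 and link consecutive kept k-mers.

    Assumption for this sketch: each link stores the bases needed to
    extend the source node up to the end of the target node.
    """
    starts = range(0, len(seq) - k + 1, g + 1)
    nodes = [seq[s:s + k] for s in starts]
    links = {}
    for a, b in zip(starts, starts[1:]):
        links[seq[a:a + k]] = (seq[b:b + k], seq[a + k:b + k])
    return nodes, links


def walk(start, links):
    """Follow links from start and spell out the sequence they encode."""
    contig, current, seen = start, start, {start}
    while current in links:
        current, extension = links[current]
        if current in seen:  # stop on a repeated node instead of looping
            break
        seen.add(current)
        contig += extension
    return contig


seq = "ACGTTGCATGCCGATAGCTA"  # made-up sequence, not data from the paper
nodes, links = sparse_graph(seq, k=5, g=2)
print(len(dense_kmers(seq, 5)), "dense k-mers vs", len(nodes), "sparse nodes")
print("walk reproduces input:", walk(nodes[0], links) == seq)
```

In this toy case the sparse graph holds fewer nodes than the dense k-mer set, yet walking its links still spells out the input. A real assembler must also handle sequencing errors, repeats, and reverse complements, none of which this sketch attempts.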


Affiliations:


Links to previous steps (curation, corpus...)


Links to Exploration step

pubmed:22537038

The document in XML format

<record>
<TEI>
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en">Exploiting sparseness in de novo genome assembly.</title>
<author>
<name sortKey="Ye, Chengxi" sort="Ye, Chengxi" uniqKey="Ye C" first="Chengxi" last="Ye">Chengxi Ye</name>
<affiliation wicri:level="1">
<nlm:affiliation>Ecology &amp; Evolution of Plant-Animal Interaction Group, Xishuangbanna Tropical Botanic Garden, Chinese Academy of Sciences, Menglun, Yunnan 666303 China. cxy@umd.edu</nlm:affiliation>
<country wicri:rule="url">États-Unis</country>
</affiliation>
</author>
<author>
<name sortKey="Ma, Zhanshan Sam" sort="Ma, Zhanshan Sam" uniqKey="Ma Z" first="Zhanshan Sam" last="Ma">Zhanshan Sam Ma</name>
</author>
<author>
<name sortKey="Cannon, Charles H" sort="Cannon, Charles H" uniqKey="Cannon C" first="Charles H" last="Cannon">Charles H. Cannon</name>
</author>
<author>
<name sortKey="Pop, Mihai" sort="Pop, Mihai" uniqKey="Pop M" first="Mihai" last="Pop">Mihai Pop</name>
</author>
<author>
<name sortKey="Yu, Douglas W" sort="Yu, Douglas W" uniqKey="Yu D" first="Douglas W" last="Yu">Douglas W. Yu</name>
</author>
</titleStmt>
<publicationStmt>
<idno type="wicri:source">PubMed</idno>
<date when="2012">2012</date>
<idno type="RBID">pubmed:22537038</idno>
<idno type="pmid">22537038</idno>
<idno type="doi">10.1186/1471-2105-13-S6-S1</idno>
<idno type="wicri:Area/PubMed/Corpus">001D85</idno>
<idno type="wicri:explorRef" wicri:stream="PubMed" wicri:step="Corpus" wicri:corpus="PubMed">001D85</idno>
<idno type="wicri:Area/PubMed/Curation">001D85</idno>
<idno type="wicri:explorRef" wicri:stream="PubMed" wicri:step="Curation">001D85</idno>
<idno type="wicri:Area/PubMed/Checkpoint">001C83</idno>
<idno type="wicri:explorRef" wicri:stream="Checkpoint" wicri:step="PubMed">001C83</idno>
</publicationStmt>
<sourceDesc>
<biblStruct>
<analytic>
<title xml:lang="en">Exploiting sparseness in de novo genome assembly.</title>
<author>
<name sortKey="Ye, Chengxi" sort="Ye, Chengxi" uniqKey="Ye C" first="Chengxi" last="Ye">Chengxi Ye</name>
<affiliation wicri:level="1">
<nlm:affiliation>Ecology &amp; Evolution of Plant-Animal Interaction Group, Xishuangbanna Tropical Botanic Garden, Chinese Academy of Sciences, Menglun, Yunnan 666303 China. cxy@umd.edu</nlm:affiliation>
<country wicri:rule="url">États-Unis</country>
</affiliation>
</author>
<author>
<name sortKey="Ma, Zhanshan Sam" sort="Ma, Zhanshan Sam" uniqKey="Ma Z" first="Zhanshan Sam" last="Ma">Zhanshan Sam Ma</name>
</author>
<author>
<name sortKey="Cannon, Charles H" sort="Cannon, Charles H" uniqKey="Cannon C" first="Charles H" last="Cannon">Charles H. Cannon</name>
</author>
<author>
<name sortKey="Pop, Mihai" sort="Pop, Mihai" uniqKey="Pop M" first="Mihai" last="Pop">Mihai Pop</name>
</author>
<author>
<name sortKey="Yu, Douglas W" sort="Yu, Douglas W" uniqKey="Yu D" first="Douglas W" last="Yu">Douglas W. Yu</name>
</author>
</analytic>
<series>
<title level="j">BMC bioinformatics</title>
<idno type="eISSN">1471-2105</idno>
<imprint>
<date when="2012" type="published">2012</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
</fileDesc>
<profileDesc>
<textClass>
<keywords scheme="KwdEn" xml:lang="en">
<term>Algorithms</term>
<term>Computer Storage Devices</term>
<term>Escherichia coli (genetics)</term>
<term>Genome</term>
<term>Genome, Human</term>
<term>High-Throughput Nucleotide Sequencing</term>
<term>Humans</term>
<term>Sequence Analysis, DNA</term>
<term>Software</term>
</keywords>
<keywords scheme="KwdFr" xml:lang="fr">
<term>Algorithmes</term>
<term>Analyse de séquence d'ADN</term>
<term>Dispositifs mémoires d'ordinateur</term>
<term>Escherichia coli (génétique)</term>
<term>Génome</term>
<term>Génome humain</term>
<term>Humains</term>
<term>Logiciel</term>
<term>Séquençage nucléotidique à haut débit</term>
</keywords>
<keywords scheme="MESH" qualifier="genetics" xml:lang="en">
<term>Escherichia coli</term>
</keywords>
<keywords scheme="MESH" qualifier="génétique" xml:lang="fr">
<term>Escherichia coli</term>
</keywords>
<keywords scheme="MESH" xml:lang="en">
<term>Algorithms</term>
<term>Computer Storage Devices</term>
<term>Genome</term>
<term>Genome, Human</term>
<term>High-Throughput Nucleotide Sequencing</term>
<term>Humans</term>
<term>Sequence Analysis, DNA</term>
<term>Software</term>
</keywords>
<keywords scheme="MESH" xml:lang="fr">
<term>Algorithmes</term>
<term>Analyse de séquence d'ADN</term>
<term>Dispositifs mémoires d'ordinateur</term>
<term>Génome</term>
<term>Génome humain</term>
<term>Humains</term>
<term>Logiciel</term>
<term>Séquençage nucléotidique à haut débit</term>
</keywords>
</textClass>
</profileDesc>
</teiHeader>
<front>
<div type="abstract" xml:lang="en">The very large memory requirements for the construction of assembly graphs for de novo genome assembly limit current algorithms to super-computing environments.</div>
</front>
</TEI>
<pubmed>
<MedlineCitation Status="MEDLINE" IndexingMethod="Curated" Owner="NLM">
<PMID Version="1">22537038</PMID>
<DateCompleted>
<Year>2012</Year>
<Month>10</Month>
<Day>26</Day>
</DateCompleted>
<DateRevised>
<Year>2018</Year>
<Month>12</Month>
<Day>01</Day>
</DateRevised>
<Article PubModel="Electronic">
<Journal>
<ISSN IssnType="Electronic">1471-2105</ISSN>
<JournalIssue CitedMedium="Internet">
<Volume>13 Suppl 6</Volume>
<PubDate>
<Year>2012</Year>
<Month>Apr</Month>
<Day>19</Day>
</PubDate>
</JournalIssue>
<Title>BMC bioinformatics</Title>
<ISOAbbreviation>BMC Bioinformatics</ISOAbbreviation>
</Journal>
<ArticleTitle>Exploiting sparseness in de novo genome assembly.</ArticleTitle>
<Pagination>
<MedlinePgn>S1</MedlinePgn>
</Pagination>
<ELocationID EIdType="doi" ValidYN="Y">10.1186/1471-2105-13-S6-S1</ELocationID>
<Abstract>
<AbstractText Label="BACKGROUND" NlmCategory="BACKGROUND">The very large memory requirements for the construction of assembly graphs for de novo genome assembly limit current algorithms to super-computing environments.</AbstractText>
<AbstractText Label="METHODS" NlmCategory="METHODS">In this paper, we demonstrate that constructing a sparse assembly graph which stores only a small fraction of the observed k-mers as nodes and the links between these nodes allows the de novo assembly of even moderately-sized genomes (~500 M) on a typical laptop computer.</AbstractText>
<AbstractText Label="RESULTS" NlmCategory="RESULTS">We implement this sparse graph concept in a proof-of-principle software package, SparseAssembler, utilizing a new sparse k-mer graph structure evolved from the de Bruijn graph. We test our SparseAssembler with both simulated and real data, achieving ~90% memory savings and retaining high assembly accuracy, without sacrificing speed in comparison to existing de novo assemblers.</AbstractText>
</Abstract>
<AuthorList CompleteYN="Y">
<Author ValidYN="Y">
<LastName>Ye</LastName>
<ForeName>Chengxi</ForeName>
<Initials>C</Initials>
<AffiliationInfo>
<Affiliation>Ecology &amp; Evolution of Plant-Animal Interaction Group, Xishuangbanna Tropical Botanic Garden, Chinese Academy of Sciences, Menglun, Yunnan 666303 China. cxy@umd.edu</Affiliation>
</AffiliationInfo>
</Author>
<Author ValidYN="Y">
<LastName>Ma</LastName>
<ForeName>Zhanshan Sam</ForeName>
<Initials>ZS</Initials>
</Author>
<Author ValidYN="Y">
<LastName>Cannon</LastName>
<ForeName>Charles H</ForeName>
<Initials>CH</Initials>
</Author>
<Author ValidYN="Y">
<LastName>Pop</LastName>
<ForeName>Mihai</ForeName>
<Initials>M</Initials>
</Author>
<Author ValidYN="Y">
<LastName>Yu</LastName>
<ForeName>Douglas W</ForeName>
<Initials>DW</Initials>
</Author>
</AuthorList>
<Language>eng</Language>
<PublicationTypeList>
<PublicationType UI="D016428">Journal Article</PublicationType>
<PublicationType UI="D013485">Research Support, Non-U.S. Gov't</PublicationType>
<PublicationType UI="D013486">Research Support, U.S. Gov't, Non-P.H.S.</PublicationType>
</PublicationTypeList>
<ArticleDate DateType="Electronic">
<Year>2012</Year>
<Month>04</Month>
<Day>19</Day>
</ArticleDate>
</Article>
<MedlineJournalInfo>
<Country>England</Country>
<MedlineTA>BMC Bioinformatics</MedlineTA>
<NlmUniqueID>100965194</NlmUniqueID>
<ISSNLinking>1471-2105</ISSNLinking>
</MedlineJournalInfo>
<CitationSubset>IM</CitationSubset>
<MeshHeadingList>
<MeshHeading>
<DescriptorName UI="D000465" MajorTopicYN="N">Algorithms</DescriptorName>
</MeshHeading>
<MeshHeading>
<DescriptorName UI="D016248" MajorTopicYN="Y">Computer Storage Devices</DescriptorName>
</MeshHeading>
<MeshHeading>
<DescriptorName UI="D004926" MajorTopicYN="N">Escherichia coli</DescriptorName>
<QualifierName UI="Q000235" MajorTopicYN="N">genetics</QualifierName>
</MeshHeading>
<MeshHeading>
<DescriptorName UI="D016678" MajorTopicYN="Y">Genome</DescriptorName>
</MeshHeading>
<MeshHeading>
<DescriptorName UI="D015894" MajorTopicYN="N">Genome, Human</DescriptorName>
</MeshHeading>
<MeshHeading>
<DescriptorName UI="D059014" MajorTopicYN="N">High-Throughput Nucleotide Sequencing</DescriptorName>
</MeshHeading>
<MeshHeading>
<DescriptorName UI="D006801" MajorTopicYN="N">Humans</DescriptorName>
</MeshHeading>
<MeshHeading>
<DescriptorName UI="D017422" MajorTopicYN="N">Sequence Analysis, DNA</DescriptorName>
</MeshHeading>
<MeshHeading>
<DescriptorName UI="D012984" MajorTopicYN="Y">Software</DescriptorName>
</MeshHeading>
</MeshHeadingList>
</MedlineCitation>
<PubmedData>
<History>
<PubMedPubDate PubStatus="entrez">
<Year>2012</Year>
<Month>4</Month>
<Day>28</Day>
<Hour>6</Hour>
<Minute>0</Minute>
</PubMedPubDate>
<PubMedPubDate PubStatus="pubmed">
<Year>2012</Year>
<Month>5</Month>
<Day>2</Day>
<Hour>6</Hour>
<Minute>0</Minute>
</PubMedPubDate>
<PubMedPubDate PubStatus="medline">
<Year>2012</Year>
<Month>10</Month>
<Day>27</Day>
<Hour>6</Hour>
<Minute>0</Minute>
</PubMedPubDate>
</History>
<PublicationStatus>epublish</PublicationStatus>
<ArticleIdList>
<ArticleId IdType="pubmed">22537038</ArticleId>
<ArticleId IdType="pii">1471-2105-13-S6-S1</ArticleId>
<ArticleId IdType="doi">10.1186/1471-2105-13-S6-S1</ArticleId>
<ArticleId IdType="pmc">PMC3369186</ArticleId>
</ArticleIdList>
<ReferenceList>
<Reference>
<Citation>Science. 2000 Mar 24;287(5461):2196-204</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">10731133</ArticleId>
</ArticleIdList>
</Reference>
<Reference>
<Citation>Genome Res. 2012 Mar;22(3):549-56</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">22156294</ArticleId>
</ArticleIdList>
</Reference>
<Reference>
<Citation>Genome Res. 2002 Jan;12(1):177-89</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">11779843</ArticleId>
</ArticleIdList>
</Reference>
<Reference>
<Citation>Genome Res. 2003 Jan;13(1):81-90</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">12529309</ArticleId>
</ArticleIdList>
</Reference>
<Reference>
<Citation>Genome Biol. 2004;5(2):R12</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">14759262</ArticleId>
</ArticleIdList>
</Reference>
<Reference>
<Citation>Genome Res. 2004 Apr;14(4):721-32</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">15060016</ArticleId>
</ArticleIdList>
</Reference>
<Reference>
<Citation>Bioinformatics. 2004 Sep 1;20(13):2067-74</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">15059830</ArticleId>
</ArticleIdList>
</Reference>
<Reference>
<Citation>Bioinformatics. 2004 Dec 12;20(18):3363-9</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">15256412</ArticleId>
</ArticleIdList>
</Reference>
<Reference>
<Citation>Bioinformatics. 2005 Sep 1;21 Suppl 2:ii79-85</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">16204131</ArticleId>
</ArticleIdList>
</Reference>
<Reference>
<Citation>Bioinformatics. 2007 Feb 15;23(4):500-1</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">17158514</ArticleId>
</ArticleIdList>
</Reference>
<Reference>
<Citation>PLoS One. 2007;2(5):e484</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">17534434</ArticleId>
</ArticleIdList>
</Reference>
<Reference>
<Citation>Genome Res. 2007 Nov;17(11):1697-706</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">17908823</ArticleId>
</ArticleIdList>
</Reference>
<Reference>
<Citation>Trends Genet. 2008 Mar;24(3):142-9</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">18262676</ArticleId>
</ArticleIdList>
</Reference>
<Reference>
<Citation>Genome Res. 2008 May;18(5):821-9</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">18349386</ArticleId>
</ArticleIdList>
</Reference>
<Reference>
<Citation>Genome Biol. 2008;9(3):R55</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">18341692</ArticleId>
</ArticleIdList>
</Reference>
<Reference>
<Citation>Genome Res. 2009 Jun;19(6):1117-23</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">19251739</ArticleId>
</ArticleIdList>
</Reference>
<Reference>
<Citation>Genome Res. 2010 Feb;20(2):265-72</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">20019144</ArticleId>
</ArticleIdList>
</Reference>
<Reference>
<Citation>Proc Natl Acad Sci U S A. 2011 Jan 25;108(4):1513-8</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">21187386</ArticleId>
</ArticleIdList>
</Reference>
<Reference>
<Citation>Bioinformatics. 2011 Feb 15;27(4):479-86</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">21245053</ArticleId>
</ArticleIdList>
</Reference>
<Reference>
<Citation>Bioinformatics. 2011 Mar 15;27(6):764-70</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">21217122</ArticleId>
</ArticleIdList>
</Reference>
<Reference>
<Citation>PLoS One. 2011;6(3):e17915</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">21423806</ArticleId>
</ArticleIdList>
</Reference>
<Reference>
<Citation>Bioinformatics. 2011 Aug 1;27(15):2031-7</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">21636596</ArticleId>
</ArticleIdList>
</Reference>
<Reference>
<Citation>BMC Bioinformatics. 2011;12:333</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">21831268</ArticleId>
</ArticleIdList>
</Reference>
<Reference>
<Citation>Bioinformatics. 2011 Nov 1;27(21):2957-63</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">21903629</ArticleId>
</ArticleIdList>
</Reference>
<Reference>
<Citation>Genome Res. 2012 Mar;22(3):557-67</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">22147368</ArticleId>
</ArticleIdList>
</Reference>
<Reference>
<Citation>Proc Natl Acad Sci U S A. 2001 Aug 14;98(17):9748-53</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">11504945</ArticleId>
</ArticleIdList>
</Reference>
</ReferenceList>
</PubmedData>
</pubmed>
<affiliations>
<list>
<country>
<li>États-Unis</li>
</country>
</list>
<tree>
<noCountry>
<name sortKey="Cannon, Charles H" sort="Cannon, Charles H" uniqKey="Cannon C" first="Charles H" last="Cannon">Charles H. Cannon</name>
<name sortKey="Ma, Zhanshan Sam" sort="Ma, Zhanshan Sam" uniqKey="Ma Z" first="Zhanshan Sam" last="Ma">Zhanshan Sam Ma</name>
<name sortKey="Pop, Mihai" sort="Pop, Mihai" uniqKey="Pop M" first="Mihai" last="Pop">Mihai Pop</name>
<name sortKey="Yu, Douglas W" sort="Yu, Douglas W" uniqKey="Yu D" first="Douglas W" last="Yu">Douglas W. Yu</name>
</noCountry>
<country name="États-Unis">
<noRegion>
<name sortKey="Ye, Chengxi" sort="Ye, Chengxi" uniqKey="Ye C" first="Chengxi" last="Ye">Chengxi Ye</name>
</noRegion>
</country>
</tree>
</affiliations>
</record>
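
As a hedged illustration of how the `<pubmed>` part of such a record could be read programmatically, the sketch below uses Python's standard `xml.etree.ElementTree` on a shortened fragment. The fragment is abridged by hand from the record above (two authors only, truncated affiliation), and the helper name `summarize` is hypothetical. The TEI part is left out on purpose: it uses the `wicri:` and `nlm:` prefixes, whose namespace declarations do not appear in this page, so a namespace-aware parser would likely reject it as shown. In the fragment the ampersand in the affiliation is written as `&amp;`, which well-formed XML requires.

```python
import xml.etree.ElementTree as ET

# Abridged copy of the <pubmed> element; not the full record.
FRAGMENT = """<pubmed>
<MedlineCitation Status="MEDLINE" Owner="NLM">
<PMID Version="1">22537038</PMID>
<Article PubModel="Electronic">
<ArticleTitle>Exploiting sparseness in de novo genome assembly.</ArticleTitle>
<AuthorList CompleteYN="Y">
<Author ValidYN="Y">
<LastName>Ye</LastName>
<ForeName>Chengxi</ForeName>
<AffiliationInfo>
<Affiliation>Ecology &amp; Evolution of Plant-Animal Interaction Group</Affiliation>
</AffiliationInfo>
</Author>
<Author ValidYN="Y">
<LastName>Pop</LastName>
<ForeName>Mihai</ForeName>
</Author>
</AuthorList>
</Article>
</MedlineCitation>
</pubmed>"""


def summarize(xml_text):
    """Return PMID, title, author names and affiliations from a <pubmed> element."""
    root = ET.fromstring(xml_text)
    citation = root.find("MedlineCitation")
    authors = [
        f"{author.findtext('ForeName')} {author.findtext('LastName')}"
        for author in citation.iter("Author")
    ]
    return {
        "pmid": citation.findtext("PMID"),
        "title": citation.findtext("Article/ArticleTitle"),
        "authors": authors,
        "affiliations": [node.text for node in citation.iter("Affiliation")],
    }


summary = summarize(FRAGMENT)
print(summary["pmid"], summary["title"])
print(summary["authors"])
```

The parser decodes `&amp;` back to a plain `&` in the returned affiliation text, so downstream code sees the original string.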

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Sante/explor/MersV1/Data/PubMed/Checkpoint
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 001C83 | SxmlIndent | more

Or

HfdSelect -h $EXPLOR_AREA/Data/PubMed/Checkpoint/biblio.hfd -nk 001C83 | SxmlIndent | more

To link to this page within the Wicri network

{{Explor lien
   |wiki=    Sante
   |area=    MersV1
   |flux=    PubMed
   |étape=   Checkpoint
   |type=    RBID
   |clé=     pubmed:22537038
   |texte=   Exploiting sparseness in de novo genome assembly.
}}
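
When many records need such a link, the template text can be produced from the record fields. The sketch below is one possible way to do that in Python; the function `explor_link` and its argument names are hypothetical, while the template name and the parameter names (`wiki`, `area`, `flux`, `étape`, `type`, `clé`, `texte`) are copied from the template above. The column alignment simply imitates the layout shown on this page and is assumed not to matter to the wiki.

```python
def explor_link(wiki, area, flux, etape, type_, cle, texte):
    """Render an {{Explor lien}} template call with the layout used above."""
    fields = [
        ("wiki", wiki),
        ("area", area),
        ("flux", flux),
        ("étape", etape),
        ("type", type_),
        ("clé", cle),
        ("texte", texte),
    ]
    lines = ["{{Explor lien"]
    for name, value in fields:
        # Pad "   |name=" to 13 characters so the values line up.
        lines.append(f"   |{name}=".ljust(13) + value)
    lines.append("}}")
    return "\n".join(lines)


print(explor_link(
    wiki="Sante",
    area="MersV1",
    flux="PubMed",
    etape="Checkpoint",
    type_="RBID",
    cle="pubmed:22537038",
    texte="Exploiting sparseness in de novo genome assembly.",
))
```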

To generate wiki pages

HfdIndexSelect -h $EXPLOR_AREA/Data/PubMed/Checkpoint/RBID.i   -Sk "pubmed:22537038" \
       | HfdSelect -Kh $EXPLOR_AREA/Data/PubMed/Checkpoint/biblio.hfd   \
       | NlmPubMed2Wicri -a MersV1 

This area was generated with Dilib version V0.6.33.
Data generation: Mon Apr 20 23:26:43 2020. Site generation: Sat Mar 27 09:06:09 2021