LightAssembler: fast and memory-efficient assembly algorithm for high-throughput sequencing reads.

Internal identifier: 001699 (Ncbi/Merge); previous: 001698; next: 001700

Authors: Sara El-Metwally [Egypt]; Magdi Zakaria [Egypt]; Taher Hamza [Egypt]

Source:

Bioinformatics (Oxford, England) [ 1367-4811 ] ; 2016.

RBID : pubmed:27412092

French descriptors

KwdFr :
- Algorithmes, Analyse de séquence d'ADN, Animaux, Génome, Génomique, Humains, Séquençage nucléotidique à haut débit.
MESH :
- Algorithmes, Analyse de séquence d'ADN, Animaux, Génome, Génomique, Humains, Séquençage nucléotidique à haut débit.

English descriptors

KwdEn :
- Algorithms, Animals, Genome, Genomics, High-Throughput Nucleotide Sequencing (methods), Humans, Sequence Analysis, DNA.
MESH :
- methods : High-Throughput Nucleotide Sequencing.
- Algorithms, Animals, Genome, Genomics, Humans, Sequence Analysis, DNA.

Abstract

The deluge of current sequenced data has exceeded Moore's Law, more than doubling every 2 years since next-generation sequencing (NGS) technologies were invented. Accordingly, we will be able to generate more and more data at high speed and at fixed cost, but lack the computational resources to store, process and analyze it. With error-prone high-throughput NGS reads and genomic repeats, the assembly graph contains a massive number of redundant nodes and branching edges. Most assembly pipelines require this large graph to reside in memory to start their workflows, which is intractable for mammalian genomes. Resource-efficient genome assemblers combine the power of advanced computing techniques and innovative data structures to encode the assembly graph efficiently in computer memory.

DOI: 10.1093/bioinformatics/btw470
PubMed: 27412092

Links toward previous steps (curation, corpus...)

to stream PubMed, to step Corpus: 000F22
to stream PubMed, to step Curation: 000F22
to stream PubMed, to step Checkpoint: 001029

Links to Exploration step

pubmed:27412092

The document in XML format

<record><TEI><teiHeader><fileDesc><titleStmt><title xml:lang="en">LightAssembler: fast and memory-efficient assembly algorithm for high-throughput sequencing reads.</title>
<author><name sortKey="El Metwally, Sara" sort="El Metwally, Sara" uniqKey="El Metwally S" first="Sara" last="El-Metwally">Sara El-Metwally</name>
<affiliation wicri:level="4"><nlm:affiliation>Molecular and Computational Biology, University of Southern California, Los Angeles, CA 90089, USA Department of Computer Science, Faculty of Computers and Information, Mansoura University, Mansoura 35516, Egypt.</nlm:affiliation>
<country xml:lang="fr">Égypte</country>
<wicri:regionArea>Molecular and Computational Biology, University of Southern California, Los Angeles, CA 90089, USA Department of Computer Science, Faculty of Computers and Information, Mansoura University, Mansoura 35516</wicri:regionArea>
<orgName type="university">Université de Californie du Sud</orgName>
<placeName><settlement type="city">Los Angeles</settlement>
<region type="state">Californie</region>
</placeName>
</affiliation>
</author>
<author><name sortKey="Zakaria, Magdi" sort="Zakaria, Magdi" uniqKey="Zakaria M" first="Magdi" last="Zakaria">Magdi Zakaria</name>
<affiliation wicri:level="1"><nlm:affiliation>Department of Computer Science, Faculty of Computers and Information, Mansoura University, Mansoura 35516, Egypt.</nlm:affiliation>
<country xml:lang="fr">Égypte</country>
<wicri:regionArea>Department of Computer Science, Faculty of Computers and Information, Mansoura University, Mansoura 35516</wicri:regionArea>
<wicri:noRegion>Mansoura 35516</wicri:noRegion>
</affiliation>
</author>
<author><name sortKey="Hamza, Taher" sort="Hamza, Taher" uniqKey="Hamza T" first="Taher" last="Hamza">Taher Hamza</name>
<affiliation wicri:level="1"><nlm:affiliation>Department of Computer Science, Faculty of Computers and Information, Mansoura University, Mansoura 35516, Egypt.</nlm:affiliation>
<country xml:lang="fr">Égypte</country>
<wicri:regionArea>Department of Computer Science, Faculty of Computers and Information, Mansoura University, Mansoura 35516</wicri:regionArea>
<wicri:noRegion>Mansoura 35516</wicri:noRegion>
</affiliation>
</author>
</titleStmt>
<publicationStmt><idno type="wicri:source">PubMed</idno>
<date when="2016">2016</date>
<idno type="RBID">pubmed:27412092</idno>
<idno type="pmid">27412092</idno>
<idno type="doi">10.1093/bioinformatics/btw470</idno>
<idno type="wicri:Area/PubMed/Corpus">000F22</idno>
<idno type="wicri:explorRef" wicri:stream="PubMed" wicri:step="Corpus" wicri:corpus="PubMed">000F22</idno>
<idno type="wicri:Area/PubMed/Curation">000F22</idno>
<idno type="wicri:explorRef" wicri:stream="PubMed" wicri:step="Curation">000F22</idno>
<idno type="wicri:Area/PubMed/Checkpoint">001029</idno>
<idno type="wicri:explorRef" wicri:stream="Checkpoint" wicri:step="PubMed">001029</idno>
<idno type="wicri:Area/Ncbi/Merge">001699</idno>
</publicationStmt>
<sourceDesc><biblStruct><analytic><title xml:lang="en">LightAssembler: fast and memory-efficient assembly algorithm for high-throughput sequencing reads.</title>
<author><name sortKey="El Metwally, Sara" sort="El Metwally, Sara" uniqKey="El Metwally S" first="Sara" last="El-Metwally">Sara El-Metwally</name>
<affiliation wicri:level="4"><nlm:affiliation>Molecular and Computational Biology, University of Southern California, Los Angeles, CA 90089, USA Department of Computer Science, Faculty of Computers and Information, Mansoura University, Mansoura 35516, Egypt.</nlm:affiliation>
<country xml:lang="fr">Égypte</country>
<wicri:regionArea>Molecular and Computational Biology, University of Southern California, Los Angeles, CA 90089, USA Department of Computer Science, Faculty of Computers and Information, Mansoura University, Mansoura 35516</wicri:regionArea>
<orgName type="university">Université de Californie du Sud</orgName>
<placeName><settlement type="city">Los Angeles</settlement>
<region type="state">Californie</region>
</placeName>
</affiliation>
</author>
<author><name sortKey="Zakaria, Magdi" sort="Zakaria, Magdi" uniqKey="Zakaria M" first="Magdi" last="Zakaria">Magdi Zakaria</name>
<affiliation wicri:level="1"><nlm:affiliation>Department of Computer Science, Faculty of Computers and Information, Mansoura University, Mansoura 35516, Egypt.</nlm:affiliation>
<country xml:lang="fr">Égypte</country>
<wicri:regionArea>Department of Computer Science, Faculty of Computers and Information, Mansoura University, Mansoura 35516</wicri:regionArea>
<wicri:noRegion>Mansoura 35516</wicri:noRegion>
</affiliation>
</author>
<author><name sortKey="Hamza, Taher" sort="Hamza, Taher" uniqKey="Hamza T" first="Taher" last="Hamza">Taher Hamza</name>
<affiliation wicri:level="1"><nlm:affiliation>Department of Computer Science, Faculty of Computers and Information, Mansoura University, Mansoura 35516, Egypt.</nlm:affiliation>
<country xml:lang="fr">Égypte</country>
<wicri:regionArea>Department of Computer Science, Faculty of Computers and Information, Mansoura University, Mansoura 35516</wicri:regionArea>
<wicri:noRegion>Mansoura 35516</wicri:noRegion>
</affiliation>
</author>
</analytic>
<series><title level="j">Bioinformatics (Oxford, England)</title>
<idno type="eISSN">1367-4811</idno>
<imprint><date when="2016" type="published">2016</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
</fileDesc>
<profileDesc><textClass><keywords scheme="KwdEn" xml:lang="en"><term>Algorithms</term>
<term>Animals</term>
<term>Genome</term>
<term>Genomics</term>
<term>High-Throughput Nucleotide Sequencing (methods)</term>
<term>Humans</term>
<term>Sequence Analysis, DNA</term>
</keywords>
<keywords scheme="KwdFr" xml:lang="fr"><term>Algorithmes</term>
<term>Analyse de séquence d'ADN</term>
<term>Animaux</term>
<term>Génome</term>
<term>Génomique</term>
<term>Humains</term>
<term>Séquençage nucléotidique à haut débit ()</term>
</keywords>
<keywords scheme="MESH" qualifier="methods" xml:lang="en"><term>High-Throughput Nucleotide Sequencing</term>
</keywords>
<keywords scheme="MESH" xml:lang="en"><term>Algorithms</term>
<term>Animals</term>
<term>Genome</term>
<term>Genomics</term>
<term>Humans</term>
<term>Sequence Analysis, DNA</term>
</keywords>
<keywords scheme="MESH" xml:lang="fr"><term>Algorithmes</term>
<term>Analyse de séquence d'ADN</term>
<term>Animaux</term>
<term>Génome</term>
<term>Génomique</term>
<term>Humains</term>
<term>Séquençage nucléotidique à haut débit</term>
</keywords>
</textClass>
</profileDesc>
</teiHeader>
<front><div type="abstract" xml:lang="en">The deluge of current sequenced data has exceeded Moore's Law, more than doubling every 2 years since the next-generation sequencing (NGS) technologies were invented. Accordingly, we will able to generate more and more data with high speed at fixed cost, but lack the computational resources to store, process and analyze it. With error prone high throughput NGS reads and genomic repeats, the assembly graph contains massive amount of redundant nodes and branching edges. Most assembly pipelines require this large graph to reside in memory to start their workflows, which is intractable for mammalian genomes. Resource-efficient genome assemblers combine both the power of advanced computing techniques and innovative data structures to encode the assembly graph efficiently in a computer memory.</div>
</front>
</TEI>
<pubmed><MedlineCitation Status="MEDLINE" IndexingMethod="Curated" Owner="NLM"><PMID Version="1">27412092</PMID>
<DateCompleted><Year>2017</Year>
<Month>08</Month>
<Day>15</Day>
</DateCompleted>
<DateRevised><Year>2018</Year>
<Month>12</Month>
<Day>02</Day>
</DateRevised>
<Article PubModel="Print-Electronic"><Journal><ISSN IssnType="Electronic">1367-4811</ISSN>
<JournalIssue CitedMedium="Internet"><Volume>32</Volume>
<Issue>21</Issue>
<PubDate><Year>2016</Year>
<Month>11</Month>
<Day>01</Day>
</PubDate>
</JournalIssue>
<Title>Bioinformatics (Oxford, England)</Title>
<ISOAbbreviation>Bioinformatics</ISOAbbreviation>
</Journal>
<ArticleTitle>LightAssembler: fast and memory-efficient assembly algorithm for high-throughput sequencing reads.</ArticleTitle>
<Pagination><MedlinePgn>3215-3223</MedlinePgn>
</Pagination>
<Abstract><AbstractText Label="MOTIVATION">The deluge of current sequenced data has exceeded Moore's Law, more than doubling every 2 years since the next-generation sequencing (NGS) technologies were invented. Accordingly, we will able to generate more and more data with high speed at fixed cost, but lack the computational resources to store, process and analyze it. With error prone high throughput NGS reads and genomic repeats, the assembly graph contains massive amount of redundant nodes and branching edges. Most assembly pipelines require this large graph to reside in memory to start their workflows, which is intractable for mammalian genomes. Resource-efficient genome assemblers combine both the power of advanced computing techniques and innovative data structures to encode the assembly graph efficiently in a computer memory.</AbstractText>
<AbstractText Label="RESULTS">LightAssembler is a lightweight assembly algorithm designed to be executed on a desktop machine. It uses a pair of cache oblivious Bloom filters, one holding a uniform sample of [Formula: see text]-spaced sequenced [Formula: see text]-mers and the other holding [Formula: see text]-mers classified as likely correct, using a simple statistical test. LightAssembler contains a light implementation of the graph traversal and simplification modules that achieves comparable assembly accuracy and contiguity to other competing tools. Our method reduces the memory usage by [Formula: see text] compared to the resource-efficient assemblers using benchmark datasets from GAGE and Assemblathon projects. While LightAssembler can be considered as a gap-based sequence assembler, different gap sizes result in an almost constant assembly size and genome coverage.</AbstractText>
<AbstractText Label="AVAILABILITY AND IMPLEMENTATION">https://github.com/SaraEl-Metwally/LightAssembler CONTACT: sarah_almetwally4@mans.edu.eg Supplementary information: Supplementary data are available at Bioinformatics online.</AbstractText>
<CopyrightInformation>© The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.</CopyrightInformation>
</Abstract>
<AuthorList CompleteYN="Y"><Author ValidYN="Y"><LastName>El-Metwally</LastName>
<ForeName>Sara</ForeName>
<Initials>S</Initials>
<AffiliationInfo><Affiliation>Molecular and Computational Biology, University of Southern California, Los Angeles, CA 90089, USA Department of Computer Science, Faculty of Computers and Information, Mansoura University, Mansoura 35516, Egypt.</Affiliation>
</AffiliationInfo>
</Author>
<Author ValidYN="Y"><LastName>Zakaria</LastName>
<ForeName>Magdi</ForeName>
<Initials>M</Initials>
<AffiliationInfo><Affiliation>Department of Computer Science, Faculty of Computers and Information, Mansoura University, Mansoura 35516, Egypt.</Affiliation>
</AffiliationInfo>
</Author>
<Author ValidYN="Y"><LastName>Hamza</LastName>
<ForeName>Taher</ForeName>
<Initials>T</Initials>
<AffiliationInfo><Affiliation>Department of Computer Science, Faculty of Computers and Information, Mansoura University, Mansoura 35516, Egypt.</Affiliation>
</AffiliationInfo>
</Author>
</AuthorList>
<Language>eng</Language>
<PublicationTypeList><PublicationType UI="D016428">Journal Article</PublicationType>
</PublicationTypeList>
<ArticleDate DateType="Electronic"><Year>2016</Year>
<Month>07</Month>
<Day>13</Day>
</ArticleDate>
</Article>
<MedlineJournalInfo><Country>England</Country>
<MedlineTA>Bioinformatics</MedlineTA>
<NlmUniqueID>9808944</NlmUniqueID>
<ISSNLinking>1367-4803</ISSNLinking>
</MedlineJournalInfo>
<CitationSubset>IM</CitationSubset>
<MeshHeadingList><MeshHeading><DescriptorName UI="D000465" MajorTopicYN="Y">Algorithms</DescriptorName>
</MeshHeading>
<MeshHeading><DescriptorName UI="D000818" MajorTopicYN="N">Animals</DescriptorName>
</MeshHeading>
<MeshHeading><DescriptorName UI="D016678" MajorTopicYN="N">Genome</DescriptorName>
</MeshHeading>
<MeshHeading><DescriptorName UI="D023281" MajorTopicYN="N">Genomics</DescriptorName>
</MeshHeading>
<MeshHeading><DescriptorName UI="D059014" MajorTopicYN="N">High-Throughput Nucleotide Sequencing</DescriptorName>
<QualifierName UI="Q000379" MajorTopicYN="Y">methods</QualifierName>
</MeshHeading>
<MeshHeading><DescriptorName UI="D006801" MajorTopicYN="N">Humans</DescriptorName>
</MeshHeading>
<MeshHeading><DescriptorName UI="D017422" MajorTopicYN="N">Sequence Analysis, DNA</DescriptorName>
</MeshHeading>
</MeshHeadingList>
</MedlineCitation>
<PubmedData><History><PubMedPubDate PubStatus="received"><Year>2016</Year>
<Month>04</Month>
<Day>02</Day>
</PubMedPubDate>
<PubMedPubDate PubStatus="accepted"><Year>2016</Year>
<Month>06</Month>
<Day>28</Day>
</PubMedPubDate>
<PubMedPubDate PubStatus="pubmed"><Year>2016</Year>
<Month>10</Month>
<Day>30</Day>
<Hour>6</Hour>
<Minute>0</Minute>
</PubMedPubDate>
<PubMedPubDate PubStatus="medline"><Year>2017</Year>
<Month>8</Month>
<Day>16</Day>
<Hour>6</Hour>
<Minute>0</Minute>
</PubMedPubDate>
<PubMedPubDate PubStatus="entrez"><Year>2016</Year>
<Month>7</Month>
<Day>15</Day>
<Hour>6</Hour>
<Minute>0</Minute>
</PubMedPubDate>
</History>
<PublicationStatus>ppublish</PublicationStatus>
<ArticleIdList><ArticleId IdType="pubmed">27412092</ArticleId>
<ArticleId IdType="pii">btw470</ArticleId>
<ArticleId IdType="doi">10.1093/bioinformatics/btw470</ArticleId>
</ArticleIdList>
</PubmedData>
</pubmed>
<affiliations><list><country><li>Égypte</li>
</country>
<region><li>Californie</li>
</region>
<settlement><li>Los Angeles</li>
</settlement>
<orgName><li>Université de Californie du Sud</li>
</orgName>
</list>
<tree><country name="Égypte"><region name="Californie"><name sortKey="El Metwally, Sara" sort="El Metwally, Sara" uniqKey="El Metwally S" first="Sara" last="El-Metwally">Sara El-Metwally</name>
</region>
<name sortKey="Hamza, Taher" sort="Hamza, Taher" uniqKey="Hamza T" first="Taher" last="Hamza">Taher Hamza</name>
<name sortKey="Zakaria, Magdi" sort="Zakaria, Magdi" uniqKey="Zakaria M" first="Magdi" last="Zakaria">Magdi Zakaria</name>
</country>
</tree>
</affiliations>
</record>

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Sante/explor/MersV1/Data/Ncbi/Merge

HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 001699 | SxmlIndent | more

HfdSelect -h $EXPLOR_AREA/Data/Ncbi/Merge/biblio.hfd -nk 001699 | SxmlIndent | more
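
Where Dilib is not installed, the basic citation fields can probably be pulled from the `<pubmed>` part of the record with a general-purpose XML parser. The following is a minimal sketch in Python, not part of the Dilib toolchain. It assumes the `<pubmed>` subtree has been saved on its own: the full record uses `wicri:` and `nlm:` prefixes without declaring them, so a strict parser is likely to reject it as-is. The embedded excerpt is trimmed from the record above, and the names `PUBMED_XML` and `summarize` are illustrative only.

```python
import xml.etree.ElementTree as ET

# Trimmed excerpt of the <pubmed> subtree of this record (illustration only).
PUBMED_XML = """
<pubmed><MedlineCitation Status="MEDLINE" Owner="NLM"><PMID Version="1">27412092</PMID>
<Article PubModel="Print-Electronic"><Journal><ISSN IssnType="Electronic">1367-4811</ISSN>
<JournalIssue CitedMedium="Internet"><Volume>32</Volume>
<Issue>21</Issue>
</JournalIssue>
<ISOAbbreviation>Bioinformatics</ISOAbbreviation>
</Journal>
<ArticleTitle>LightAssembler: fast and memory-efficient assembly algorithm for high-throughput sequencing reads.</ArticleTitle>
<Pagination><MedlinePgn>3215-3223</MedlinePgn>
</Pagination>
</Article>
</MedlineCitation>
<PubmedData><ArticleIdList><ArticleId IdType="pubmed">27412092</ArticleId>
<ArticleId IdType="pii">btw470</ArticleId>
<ArticleId IdType="doi">10.1093/bioinformatics/btw470</ArticleId>
</ArticleIdList>
</PubmedData>
</pubmed>
"""


def summarize(pubmed_xml: str) -> dict:
    """Return a few citation fields from a <pubmed> element given as text.

    Hypothetical helper: fields that are absent come back as None.
    """
    root = ET.fromstring(pubmed_xml.strip())
    article = "MedlineCitation/Article"
    return {
        "pmid": root.findtext("MedlineCitation/PMID"),
        "title": root.findtext(f"{article}/ArticleTitle"),
        "journal": root.findtext(f"{article}/Journal/ISOAbbreviation"),
        "volume": root.findtext(f"{article}/Journal/JournalIssue/Volume"),
        "issue": root.findtext(f"{article}/Journal/JournalIssue/Issue"),
        "pages": root.findtext(f"{article}/Pagination/MedlinePgn"),
        # Select the ArticleId whose IdType attribute is "doi".
        "doi": root.findtext("PubmedData/ArticleIdList/ArticleId[@IdType='doi']"),
    }


if __name__ == "__main__":
    for field, value in summarize(PUBMED_XML).items():
        print(f"{field}: {value}")
```

The same function could be fed the output of the `HfdSelect` commands above once the `<pubmed>` element is isolated; how best to isolate it depends on the local setup and is left open here.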

To link to this page in the Wicri network

{{Explor lien
   |wiki=    Sante
   |area=    MersV1
   |flux=    Ncbi
   |étape=   Merge
   |type=    RBID
   |clé=     pubmed:27412092
   |texte=   LightAssembler: fast and memory-efficient assembly algorithm for high-throughput sequencing reads.
}}

To generate wiki pages

HfdIndexSelect -h $EXPLOR_AREA/Data/Ncbi/Merge/RBID.i   -Sk "pubmed:27412092" \
       | HfdSelect -Kh $EXPLOR_AREA/Data/Ncbi/Merge/biblio.hfd   \
       | NlmPubMed2Wicri -a MersV1

This area was generated with Dilib version V0.6.33.
Data generation: Mon Apr 20 23:26:43 2020. Site generation: Sat Mar 27 09:06:09 2021

	MERS exploration server
	Warning: this site is under development! It was generated automatically from raw corpora, so the information has not been validated.