MERS exploration server

Warning: this site is under development!
Warning: this site was generated automatically from raw corpora.
The information has therefore not been validated.

Improved assembly of noisy long reads by k-mer validation.

Internal identifier: 000E98 (PubMed/Curation); previous: 000E97; next: 000E99

Authors: Antonio Bernardo Carvalho [Brazil]; Eduardo G. Dupim [Brazil]; Gabriel Goldstein [Brazil]

RBID : pubmed:27831497

Abstract

Genome assembly depends critically on read length. Two recent technologies, from Pacific Biosciences (PacBio) and Oxford Nanopore, produce read lengths >20 kb, which yield de novo genome assemblies with vastly greater contiguity than those based on Sanger, Illumina, or other technologies. However, the very high error rates of these two new technologies (∼15% per base) make assembly imprecise at repeats longer than the read length and computationally expensive. Here we show that the contiguity and quality of the assembly of these noisy long reads can be significantly improved at a minimal cost by leveraging the low error rate and low cost of Illumina short reads. Namely, k-mers from the PacBio raw reads that are not present in Illumina reads (which account for ∼95% of the distinct k-mers) are deemed sequencing errors and ignored at the seed alignment step. By focusing on the ∼5% of k-mers that are error-free, read overlap sensitivity is dramatically increased. Of equal importance, the validation procedure can be extended to exclude repetitive k-mers, which prevents read miscorrection at repeats and further improves the resulting assemblies. We tested the k-mer validation procedure using one long-read technology (PacBio) and one assembler (MHAP/Celera Assembler), but it is very likely to yield analogous improvements with alternative long-read technologies and assemblers, such as Oxford Nanopore and BLASR/DALIGNER/Falcon, respectively.
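
The validation idea described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation and it is not how MHAP is invoked; it only shows the filtering principle on toy strings. The function names (`kmers`, `valid_kmers`, `seed_kmers`), the parameters `min_count` and `max_count`, and the example reads are all assumptions made for illustration. For simplicity the sketch works on the forward strand only and ignores reverse complements.

```python
from collections import Counter


def kmers(seq, k):
    """Yield every substring of length k in seq, left to right."""
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


def valid_kmers(short_reads, k, min_count=1, max_count=None):
    """Return the k-mers seen in the accurate short reads.

    min_count and max_count are hypothetical knobs: raising min_count
    discards rare (possibly erroneous) short-read k-mers, and setting
    max_count excludes high-copy k-mers, a stand-in for the repeat
    filter mentioned in the abstract.
    """
    counts = Counter()
    for read in short_reads:
        counts.update(kmers(read, k))
    return {
        km for km, c in counts.items()
        if c >= min_count and (max_count is None or c <= max_count)
    }


def seed_kmers(long_read, k, valid):
    """Keep only the (position, k-mer) pairs of a noisy long read whose
    k-mer is validated; all other k-mers are treated as errors."""
    return [(i, km) for i, km in enumerate(kmers(long_read, k)) if km in valid]


if __name__ == "__main__":
    short_reads = ["ACGTACGA", "CGTACGAT"]   # toy accurate reads
    noisy_long_read = "ACGTTCGAT"            # toy read with one substitution
    valid = valid_kmers(short_reads, k=4)
    print(seed_kmers(noisy_long_read, 4, valid))
    # prints [(0, 'ACGT'), (5, 'CGAT')]
```

In this toy case the single substitution corrupts four of the six k-mers of the long read, and only the two unaffected ones survive as seeds. In a real pipeline the valid set would come from a dedicated k-mer counter run on the Illumina data rather than from an in-memory `Counter`.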

DOI: 10.1101/gr.209247.116
PubMed: 27831497

The document in XML format

<record>
<TEI>
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en">Improved assembly of noisy long reads by k-mer validation.</title>
<author>
<name sortKey="Carvalho, Antonio Bernardo" sort="Carvalho, Antonio Bernardo" uniqKey="Carvalho A" first="Antonio Bernardo" last="Carvalho">Antonio Bernardo Carvalho</name>
<affiliation wicri:level="1">
<nlm:affiliation>Departamento de Genética, Universidade Federal do Rio de Janeiro, CEP 21941-971, Rio de Janeiro, Brazil.</nlm:affiliation>
<country xml:lang="fr">Brésil</country>
<wicri:regionArea>Departamento de Genética, Universidade Federal do Rio de Janeiro, CEP 21941-971, Rio de Janeiro</wicri:regionArea>
</affiliation>
</author>
<author>
<name sortKey="Dupim, Eduardo G" sort="Dupim, Eduardo G" uniqKey="Dupim E" first="Eduardo G" last="Dupim">Eduardo G. Dupim</name>
<affiliation wicri:level="1">
<nlm:affiliation>Departamento de Genética, Universidade Federal do Rio de Janeiro, CEP 21941-971, Rio de Janeiro, Brazil.</nlm:affiliation>
<country xml:lang="fr">Brésil</country>
<wicri:regionArea>Departamento de Genética, Universidade Federal do Rio de Janeiro, CEP 21941-971, Rio de Janeiro</wicri:regionArea>
</affiliation>
</author>
<author>
<name sortKey="Goldstein, Gabriel" sort="Goldstein, Gabriel" uniqKey="Goldstein G" first="Gabriel" last="Goldstein">Gabriel Goldstein</name>
<affiliation wicri:level="1">
<nlm:affiliation>Departamento de Genética, Universidade Federal do Rio de Janeiro, CEP 21941-971, Rio de Janeiro, Brazil.</nlm:affiliation>
<country xml:lang="fr">Brésil</country>
<wicri:regionArea>Departamento de Genética, Universidade Federal do Rio de Janeiro, CEP 21941-971, Rio de Janeiro</wicri:regionArea>
</affiliation>
</author>
</titleStmt>
<publicationStmt>
<idno type="wicri:source">PubMed</idno>
<date when="2016">2016</date>
<idno type="RBID">pubmed:27831497</idno>
<idno type="pmid">27831497</idno>
<idno type="doi">10.1101/gr.209247.116</idno>
<idno type="wicri:Area/PubMed/Corpus">000E98</idno>
<idno type="wicri:explorRef" wicri:stream="PubMed" wicri:step="Corpus" wicri:corpus="PubMed">000E98</idno>
<idno type="wicri:Area/PubMed/Curation">000E98</idno>
<idno type="wicri:explorRef" wicri:stream="PubMed" wicri:step="Curation">000E98</idno>
</publicationStmt>
<sourceDesc>
<biblStruct>
<analytic>
<title xml:lang="en">Improved assembly of noisy long reads by k-mer validation.</title>
<author>
<name sortKey="Carvalho, Antonio Bernardo" sort="Carvalho, Antonio Bernardo" uniqKey="Carvalho A" first="Antonio Bernardo" last="Carvalho">Antonio Bernardo Carvalho</name>
<affiliation wicri:level="1">
<nlm:affiliation>Departamento de Genética, Universidade Federal do Rio de Janeiro, CEP 21941-971, Rio de Janeiro, Brazil.</nlm:affiliation>
<country xml:lang="fr">Brésil</country>
<wicri:regionArea>Departamento de Genética, Universidade Federal do Rio de Janeiro, CEP 21941-971, Rio de Janeiro</wicri:regionArea>
</affiliation>
</author>
<author>
<name sortKey="Dupim, Eduardo G" sort="Dupim, Eduardo G" uniqKey="Dupim E" first="Eduardo G" last="Dupim">Eduardo G. Dupim</name>
<affiliation wicri:level="1">
<nlm:affiliation>Departamento de Genética, Universidade Federal do Rio de Janeiro, CEP 21941-971, Rio de Janeiro, Brazil.</nlm:affiliation>
<country xml:lang="fr">Brésil</country>
<wicri:regionArea>Departamento de Genética, Universidade Federal do Rio de Janeiro, CEP 21941-971, Rio de Janeiro</wicri:regionArea>
</affiliation>
</author>
<author>
<name sortKey="Goldstein, Gabriel" sort="Goldstein, Gabriel" uniqKey="Goldstein G" first="Gabriel" last="Goldstein">Gabriel Goldstein</name>
<affiliation wicri:level="1">
<nlm:affiliation>Departamento de Genética, Universidade Federal do Rio de Janeiro, CEP 21941-971, Rio de Janeiro, Brazil.</nlm:affiliation>
<country xml:lang="fr">Brésil</country>
<wicri:regionArea>Departamento de Genética, Universidade Federal do Rio de Janeiro, CEP 21941-971, Rio de Janeiro</wicri:regionArea>
</affiliation>
</author>
</analytic>
<series>
<title level="j">Genome research</title>
<idno type="eISSN">1549-5469</idno>
<imprint>
<date when="2016" type="published">2016</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
</fileDesc>
<profileDesc>
<textClass>
<keywords scheme="KwdEn" xml:lang="en">
<term>Algorithms</term>
<term>Animals</term>
<term>Contig Mapping (methods)</term>
<term>Contig Mapping (standards)</term>
<term>High-Throughput Nucleotide Sequencing (methods)</term>
<term>High-Throughput Nucleotide Sequencing (standards)</term>
<term>Humans</term>
<term>Nanopores</term>
<term>Sequence Analysis, DNA (methods)</term>
<term>Sequence Analysis, DNA (standards)</term>
</keywords>
<keywords scheme="KwdFr" xml:lang="fr">
<term>Algorithmes</term>
<term>Analyse de séquence d'ADN (méthodes)</term>
<term>Analyse de séquence d'ADN (normes)</term>
<term>Animaux</term>
<term>Cartographie de contigs (méthodes)</term>
<term>Cartographie de contigs (normes)</term>
<term>Humains</term>
<term>Nanopores</term>
<term>Séquençage nucléotidique à haut débit (méthodes)</term>
<term>Séquençage nucléotidique à haut débit (normes)</term>
</keywords>
<keywords scheme="MESH" qualifier="methods" xml:lang="en">
<term>Contig Mapping</term>
<term>High-Throughput Nucleotide Sequencing</term>
<term>Sequence Analysis, DNA</term>
</keywords>
<keywords scheme="MESH" qualifier="normes" xml:lang="fr">
<term>Analyse de séquence d'ADN</term>
<term>Cartographie de contigs</term>
<term>Séquençage nucléotidique à haut débit</term>
</keywords>
<keywords scheme="MESH" qualifier="standards" xml:lang="en">
<term>Contig Mapping</term>
<term>High-Throughput Nucleotide Sequencing</term>
<term>Sequence Analysis, DNA</term>
</keywords>
<keywords scheme="MESH" xml:lang="en">
<term>Algorithms</term>
<term>Animals</term>
<term>Humans</term>
<term>Nanopores</term>
</keywords>
<keywords scheme="MESH" xml:lang="fr">
<term>Algorithmes</term>
<term>Analyse de séquence d'ADN</term>
<term>Animaux</term>
<term>Cartographie de contigs</term>
<term>Humains</term>
<term>Nanopores</term>
<term>Séquençage nucléotidique à haut débit</term>
</keywords>
</textClass>
</profileDesc>
</teiHeader>
<front>
<div type="abstract" xml:lang="en">Genome assembly depends critically on read length. Two recent technologies, from Pacific Biosciences (PacBio) and Oxford Nanopore, produce read lengths >20 kb, which yield de novo genome assemblies with vastly greater contiguity than those based on Sanger, Illumina, or other technologies. However, the very high error rates of these two new technologies (∼15% per base) makes assembly imprecise at repeats longer than the read length and computationally expensive. Here we show that the contiguity and quality of the assembly of these noisy long reads can be significantly improved at a minimal cost, by leveraging on the low error rate and low cost of Illumina short reads. Namely, k-mers from the PacBio raw reads that are not present in Illumina reads (which account for ∼95% of the distinct k-mers) are deemed sequencing errors and ignored at the seed alignment step. By focusing on the ∼5% of k-mers that are error free, read overlap sensitivity is dramatically increased. Of equal importance, the validation procedure can be extended to exclude repetitive k-mers, which prevents read miscorrection at repeats and further improves the resulting assemblies. We tested the k-mer validation procedure using one long-read technology (PacBio) and one assembler (MHAP/Celera Assembler), but it is very likely to yield analogous improvements with alternative long-read technologies and assemblers, such as Oxford Nanopore and BLASR/DALIGNER/Falcon, respectively.</div>
</front>
</TEI>
<pubmed>
<MedlineCitation Status="MEDLINE" Owner="NLM">
<PMID Version="1">27831497</PMID>
<DateCompleted>
<Year>2017</Year>
<Month>12</Month>
<Day>15</Day>
</DateCompleted>
<DateRevised>
<Year>2019</Year>
<Month>01</Month>
<Day>15</Day>
</DateRevised>
<Article PubModel="Print-Electronic">
<Journal>
<ISSN IssnType="Electronic">1549-5469</ISSN>
<JournalIssue CitedMedium="Internet">
<Volume>26</Volume>
<Issue>12</Issue>
<PubDate>
<Year>2016</Year>
<Month>12</Month>
</PubDate>
</JournalIssue>
<Title>Genome research</Title>
<ISOAbbreviation>Genome Res.</ISOAbbreviation>
</Journal>
<ArticleTitle>Improved assembly of noisy long reads by k-mer validation.</ArticleTitle>
<Pagination>
<MedlinePgn>1710-1720</MedlinePgn>
</Pagination>
<Abstract>
<AbstractText>Genome assembly depends critically on read length. Two recent technologies, from Pacific Biosciences (PacBio) and Oxford Nanopore, produce read lengths >20 kb, which yield de novo genome assemblies with vastly greater contiguity than those based on Sanger, Illumina, or other technologies. However, the very high error rates of these two new technologies (∼15% per base) makes assembly imprecise at repeats longer than the read length and computationally expensive. Here we show that the contiguity and quality of the assembly of these noisy long reads can be significantly improved at a minimal cost, by leveraging on the low error rate and low cost of Illumina short reads. Namely, k-mers from the PacBio raw reads that are not present in Illumina reads (which account for ∼95% of the distinct k-mers) are deemed sequencing errors and ignored at the seed alignment step. By focusing on the ∼5% of k-mers that are error free, read overlap sensitivity is dramatically increased. Of equal importance, the validation procedure can be extended to exclude repetitive k-mers, which prevents read miscorrection at repeats and further improves the resulting assemblies. We tested the k-mer validation procedure using one long-read technology (PacBio) and one assembler (MHAP/Celera Assembler), but it is very likely to yield analogous improvements with alternative long-read technologies and assemblers, such as Oxford Nanopore and BLASR/DALIGNER/Falcon, respectively.</AbstractText>
<CopyrightInformation>© 2016 Carvalho et al.; Published by Cold Spring Harbor Laboratory Press.</CopyrightInformation>
</Abstract>
<AuthorList CompleteYN="Y">
<Author ValidYN="Y">
<LastName>Carvalho</LastName>
<ForeName>Antonio Bernardo</ForeName>
<Initials>AB</Initials>
<Identifier Source="ORCID">0000-0001-8959-6469</Identifier>
<AffiliationInfo>
<Affiliation>Departamento de Genética, Universidade Federal do Rio de Janeiro, CEP 21941-971, Rio de Janeiro, Brazil.</Affiliation>
</AffiliationInfo>
</Author>
<Author ValidYN="Y">
<LastName>Dupim</LastName>
<ForeName>Eduardo G</ForeName>
<Initials>EG</Initials>
<AffiliationInfo>
<Affiliation>Departamento de Genética, Universidade Federal do Rio de Janeiro, CEP 21941-971, Rio de Janeiro, Brazil.</Affiliation>
</AffiliationInfo>
</Author>
<Author ValidYN="Y">
<LastName>Goldstein</LastName>
<ForeName>Gabriel</ForeName>
<Initials>G</Initials>
<AffiliationInfo>
<Affiliation>Departamento de Genética, Universidade Federal do Rio de Janeiro, CEP 21941-971, Rio de Janeiro, Brazil.</Affiliation>
</AffiliationInfo>
</Author>
</AuthorList>
<Language>eng</Language>
<PublicationTypeList>
<PublicationType UI="D016428">Journal Article</PublicationType>
<PublicationType UI="D013485">Research Support, Non-U.S. Gov't</PublicationType>
</PublicationTypeList>
<ArticleDate DateType="Electronic">
<Year>2016</Year>
<Month>10</Month>
<Day>07</Day>
</ArticleDate>
</Article>
<MedlineJournalInfo>
<Country>United States</Country>
<MedlineTA>Genome Res</MedlineTA>
<NlmUniqueID>9518021</NlmUniqueID>
<ISSNLinking>1088-9051</ISSNLinking>
</MedlineJournalInfo>
<CitationSubset>IM</CitationSubset>
<MeshHeadingList>
<MeshHeading>
<DescriptorName UI="D000465" MajorTopicYN="N">Algorithms</DescriptorName>
</MeshHeading>
<MeshHeading>
<DescriptorName UI="D000818" MajorTopicYN="N">Animals</DescriptorName>
</MeshHeading>
<MeshHeading>
<DescriptorName UI="D020451" MajorTopicYN="N">Contig Mapping</DescriptorName>
<QualifierName UI="Q000379" MajorTopicYN="Y">methods</QualifierName>
<QualifierName UI="Q000592" MajorTopicYN="Y">standards</QualifierName>
</MeshHeading>
<MeshHeading>
<DescriptorName UI="D059014" MajorTopicYN="N">High-Throughput Nucleotide Sequencing</DescriptorName>
<QualifierName UI="Q000379" MajorTopicYN="N">methods</QualifierName>
<QualifierName UI="Q000592" MajorTopicYN="N">standards</QualifierName>
</MeshHeading>
<MeshHeading>
<DescriptorName UI="D006801" MajorTopicYN="N">Humans</DescriptorName>
</MeshHeading>
<MeshHeading>
<DescriptorName UI="D058608" MajorTopicYN="N">Nanopores</DescriptorName>
</MeshHeading>
<MeshHeading>
<DescriptorName UI="D017422" MajorTopicYN="N">Sequence Analysis, DNA</DescriptorName>
<QualifierName UI="Q000379" MajorTopicYN="N">methods</QualifierName>
<QualifierName UI="Q000592" MajorTopicYN="N">standards</QualifierName>
</MeshHeading>
</MeshHeadingList>
</MedlineCitation>
<PubmedData>
<History>
<PubMedPubDate PubStatus="received">
<Year>2016</Year>
<Month>05</Month>
<Day>01</Day>
</PubMedPubDate>
<PubMedPubDate PubStatus="accepted">
<Year>2016</Year>
<Month>09</Month>
<Day>29</Day>
</PubMedPubDate>
<PubMedPubDate PubStatus="pubmed">
<Year>2016</Year>
<Month>11</Month>
<Day>11</Day>
<Hour>6</Hour>
<Minute>0</Minute>
</PubMedPubDate>
<PubMedPubDate PubStatus="medline">
<Year>2017</Year>
<Month>12</Month>
<Day>16</Day>
<Hour>6</Hour>
<Minute>0</Minute>
</PubMedPubDate>
<PubMedPubDate PubStatus="entrez">
<Year>2016</Year>
<Month>11</Month>
<Day>11</Day>
<Hour>6</Hour>
<Minute>0</Minute>
</PubMedPubDate>
</History>
<PublicationStatus>ppublish</PublicationStatus>
<ArticleIdList>
<ArticleId IdType="pubmed">27831497</ArticleId>
<ArticleId IdType="pii">gr.209247.116</ArticleId>
<ArticleId IdType="doi">10.1101/gr.209247.116</ArticleId>
<ArticleId IdType="pmc">PMC5131822</ArticleId>
</ArticleIdList>
<ReferenceList>
<Reference>
<Citation>Genetics. 2000 Feb;154(2):759-69</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">10655227</ArticleId>
</ArticleIdList>
</Reference>
<Reference>
<Citation>Genetica. 2000;109(1-2):113-23</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">11293786</ArticleId>
</ArticleIdList>
</Reference>
<Reference>
<Citation>Genetica. 2003 Mar;117(2-3):227-37</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">12723702</ArticleId>
</ArticleIdList>
</Reference>
<Reference>
<Citation>Genome Biol. 2004;5(2):R12</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">14759262</ArticleId>
</ArticleIdList>
</Reference>
<Reference>
<Citation>Genome Biol. 2008;9(3):R55</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">18341692</ArticleId>
</ArticleIdList>
</Reference>
<Reference>
<Citation>Genetics. 1991 Sep;129(1):177-89</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">1936957</ArticleId>
</ArticleIdList>
</Reference>
<Reference>
<Citation>J Comput Biol. 2009 Jul;16(7):897-908</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">19580519</ArticleId>
</ArticleIdList>
</Reference>
<Reference>
<Citation>Nature. 2010 Oct 28;467(7319):1061-73</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">20981092</ArticleId>
</ArticleIdList>
</Reference>
<Reference>
<Citation>Bioinformatics. 2011 Mar 15;27(6):764-70</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">21217122</ArticleId>
</ArticleIdList>
</Reference>
<Reference>
<Citation>Nat Genet. 2011 Aug 28;43(10):956-63</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">21874002</ArticleId>
</ArticleIdList>
</Reference>
<Reference>
<Citation>Nature. 2011 Aug 28;477(7365):419-23</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">21874022</ArticleId>
</ArticleIdList>
</Reference>
<Reference>
<Citation>Annu Rev Genomics Hum Genet. 2012;13:83-108</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">22483277</ArticleId>
</ArticleIdList>
</Reference>
<Reference>
<Citation>Cell. 2012 May 11;149(4):912-22</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">22559943</ArticleId>
</ArticleIdList>
</Reference>
<Reference>
<Citation>Nat Biotechnol. 2012 Jul 01;30(7):693-700</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">22750884</ArticleId>
</ArticleIdList>
</Reference>
<Reference>
<Citation>BMC Bioinformatics. 2012 Sep 19;13:238</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">22988817</ArticleId>
</ArticleIdList>
</Reference>
<Reference>
<Citation>Nat Rev Genet. 2013 Mar;14(3):157-67</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">23358380</ArticleId>
</ArticleIdList>
</Reference>
<Reference>
<Citation>Bioinformatics. 2013 Apr 15;29(8):1072-5</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">23422339</ArticleId>
</ArticleIdList>
</Reference>
<Reference>
<Citation>Nat Methods. 2013 Jun;10(6):563-9</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">23644548</ArticleId>
</ArticleIdList>
</Reference>
<Reference>
<Citation>Genome Biol. 2013 May 29;14(5):R51</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">23718773</ArticleId>
</ArticleIdList>
</Reference>
<Reference>
<Citation>PLoS One. 2013 Jul 23;8(7):e68824</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">23894349</ArticleId>
</ArticleIdList>
</Reference>
<Reference>
<Citation>Genome Biol. 2013;14(9):R101</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">24034426</ArticleId>
</ArticleIdList>
</Reference>
<Reference>
<Citation>PLoS One. 2014 Sep 04;9(9):e106689</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">25188499</ArticleId>
</ArticleIdList>
</Reference>
<Reference>
<Citation>Nature. 2015 Jan 29;517(7536):608-11</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">25383537</ArticleId>
</ArticleIdList>
</Reference>
<Reference>
<Citation>Curr Opin Microbiol. 2015 Feb;23:110-20</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">25461581</ArticleId>
</ArticleIdList>
</Reference>
<Reference>
<Citation>MBio. 2015 Mar 31;6(2):null</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">25827421</ArticleId>
</ArticleIdList>
</Reference>
<Reference>
<Citation>G3 (Bethesda). 2015 Apr 09;5(6):1145-50</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">25858959</ArticleId>
</ArticleIdList>
</Reference>
<Reference>
<Citation>Sci Data. 2014 Nov 25;1:140045</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">25977796</ArticleId>
</ArticleIdList>
</Reference>
<Reference>
<Citation>Nat Biotechnol. 2015 Jun;33(6):623-30</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">26006009</ArticleId>
</ArticleIdList>
</Reference>
<Reference>
<Citation>Nat Methods. 2015 Aug;12(8):733-5</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">26076426</ArticleId>
</ArticleIdList>
</Reference>
<Reference>
<Citation>Nat Commun. 2015 Jun 16;6:7394</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">26077599</ArticleId>
</ArticleIdList>
</Reference>
<Reference>
<Citation>Proc Natl Acad Sci U S A. 2015 Oct 6;112(40):12450-5</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">26385968</ArticleId>
</ArticleIdList>
</Reference>
<Reference>
<Citation>Genome Res. 2015 Nov;25(11):1750-6</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">26447147</ArticleId>
</ArticleIdList>
</Reference>
<Reference>
<Citation>G3 (Bethesda). 2015 Oct 23;5(12):2843-56</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">26497146</ArticleId>
</ArticleIdList>
</Reference>
<Reference>
<Citation>Bioinformatics. 2016 Jul 15;32(14):2103-10</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">27153593</ArticleId>
</ArticleIdList>
</Reference>
<Reference>
<Citation>Sci Data. 2016 Jun 07;3:160025</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">27271295</ArticleId>
</ArticleIdList>
</Reference>
<Reference>
<Citation>Genome Res. 2017 May;27(5):722-736</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">28298431</ArticleId>
</ArticleIdList>
</Reference>
<Reference>
<Citation>J Comput Biol. 1995 Summer;2(2):275-90</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">7497129</ArticleId>
</ArticleIdList>
</Reference>
<Reference>
<Citation>Genome Res. 1997 May;7(5):401-9</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">9149936</ArticleId>
</ArticleIdList>
</Reference>
<Reference>
<Citation>Science. 1997 Sep 5;277(5331):1453-62</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">9278503</ArticleId>
</ArticleIdList>
</Reference>
</ReferenceList>
</PubmedData>
</pubmed>
</record>

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Sante/explor/MersV1/Data/PubMed/Curation
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000E98 | SxmlIndent | more

Or

HfdSelect -h $EXPLOR_AREA/Data/PubMed/Curation/biblio.hfd -nk 000E98 | SxmlIndent | more

To link to this page within the Wicri network

{{Explor lien
   |wiki=    Sante
   |area=    MersV1
   |flux=    PubMed
   |étape=   Curation
   |type=    RBID
   |clé=     pubmed:27831497
   |texte=   Improved assembly of noisy long reads by k-mer validation.
}}

To generate wiki pages

HfdIndexSelect -h $EXPLOR_AREA/Data/PubMed/Curation/RBID.i   -Sk "pubmed:27831497" \
       | HfdSelect -Kh $EXPLOR_AREA/Data/PubMed/Curation/biblio.hfd   \
       | NlmPubMed2Wicri -a MersV1 

Wicri

This area was generated with Dilib version V0.6.33.
Data generation: Mon Apr 20 23:26:43 2020. Site generation: Sat Mar 27 09:06:09 2021