Haplotype-aware graph indexes.
Internal identifier: 000455 (PubMed/Corpus); previous: 000454; next: 000456
Authors: Jouni Sirén; Erik Garrison; Adam M. Novak; Benedict Paten; Richard Durbin
Source:
- Bioinformatics (Oxford, England) [1367-4811]; 2020.
Abstract
The variation graph toolkit (VG) represents genetic variation as a graph. Although each path in the graph is a potential haplotype, most paths are non-biological, unlikely recombinations of true haplotypes.
DOI: 10.1093/bioinformatics/btz575
PubMed: 31406990
Links to Exploration step
pubmed:31406990
The document in XML format
<record><TEI><teiHeader><fileDesc><titleStmt><title xml:lang="en">Haplotype-aware graph indexes.</title>
<author><name sortKey="Siren, Jouni" sort="Siren, Jouni" uniqKey="Siren J" first="Jouni" last="Sirén">Jouni Sirén</name>
<affiliation><nlm:affiliation>UC Santa Cruz Genomics Institute, University of California, Santa Cruz, CA 95064, USA.</nlm:affiliation>
</affiliation>
</author>
<author><name sortKey="Garrison, Erik" sort="Garrison, Erik" uniqKey="Garrison E" first="Erik" last="Garrison">Erik Garrison</name>
<affiliation><nlm:affiliation>Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton CB10 1SA, UK.</nlm:affiliation>
</affiliation>
</author>
<author><name sortKey="Novak, Adam M" sort="Novak, Adam M" uniqKey="Novak A" first="Adam M" last="Novak">Adam M. Novak</name>
<affiliation><nlm:affiliation>UC Santa Cruz Genomics Institute, University of California, Santa Cruz, CA 95064, USA.</nlm:affiliation>
</affiliation>
</author>
<author><name sortKey="Paten, Benedict" sort="Paten, Benedict" uniqKey="Paten B" first="Benedict" last="Paten">Benedict Paten</name>
<affiliation><nlm:affiliation>UC Santa Cruz Genomics Institute, University of California, Santa Cruz, CA 95064, USA.</nlm:affiliation>
</affiliation>
</author>
<author><name sortKey="Durbin, Richard" sort="Durbin, Richard" uniqKey="Durbin R" first="Richard" last="Durbin">Richard Durbin</name>
<affiliation><nlm:affiliation>Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton CB10 1SA, UK.</nlm:affiliation>
</affiliation>
</author>
</titleStmt>
<publicationStmt><idno type="wicri:source">PubMed</idno>
<date when="2020">2020</date>
<idno type="RBID">pubmed:31406990</idno>
<idno type="pmid">31406990</idno>
<idno type="doi">10.1093/bioinformatics/btz575</idno>
<idno type="wicri:Area/PubMed/Corpus">000455</idno>
<idno type="wicri:explorRef" wicri:stream="PubMed" wicri:step="Corpus" wicri:corpus="PubMed">000455</idno>
</publicationStmt>
<sourceDesc><biblStruct><analytic><title xml:lang="en">Haplotype-aware graph indexes.</title>
<author><name sortKey="Siren, Jouni" sort="Siren, Jouni" uniqKey="Siren J" first="Jouni" last="Sirén">Jouni Sirén</name>
<affiliation><nlm:affiliation>UC Santa Cruz Genomics Institute, University of California, Santa Cruz, CA 95064, USA.</nlm:affiliation>
</affiliation>
</author>
<author><name sortKey="Garrison, Erik" sort="Garrison, Erik" uniqKey="Garrison E" first="Erik" last="Garrison">Erik Garrison</name>
<affiliation><nlm:affiliation>Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton CB10 1SA, UK.</nlm:affiliation>
</affiliation>
</author>
<author><name sortKey="Novak, Adam M" sort="Novak, Adam M" uniqKey="Novak A" first="Adam M" last="Novak">Adam M. Novak</name>
<affiliation><nlm:affiliation>UC Santa Cruz Genomics Institute, University of California, Santa Cruz, CA 95064, USA.</nlm:affiliation>
</affiliation>
</author>
<author><name sortKey="Paten, Benedict" sort="Paten, Benedict" uniqKey="Paten B" first="Benedict" last="Paten">Benedict Paten</name>
<affiliation><nlm:affiliation>UC Santa Cruz Genomics Institute, University of California, Santa Cruz, CA 95064, USA.</nlm:affiliation>
</affiliation>
</author>
<author><name sortKey="Durbin, Richard" sort="Durbin, Richard" uniqKey="Durbin R" first="Richard" last="Durbin">Richard Durbin</name>
<affiliation><nlm:affiliation>Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton CB10 1SA, UK.</nlm:affiliation>
</affiliation>
</author>
</analytic>
<series><title level="j">Bioinformatics (Oxford, England)</title>
<idno type="eISSN">1367-4811</idno>
<imprint><date when="2020" type="published">2020</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
</fileDesc>
<profileDesc><textClass></textClass>
</profileDesc>
</teiHeader>
<front><div type="abstract" xml:lang="en">The variation graph toolkit (VG) represents genetic variation as a graph. Although each path in the graph is a potential haplotype, most paths are non-biological, unlikely recombinations of true haplotypes.</div>
</front>
</TEI>
<pubmed><MedlineCitation Status="In-Data-Review" Owner="NLM"><PMID Version="1">31406990</PMID>
<DateRevised><Year>2020</Year>
<Month>03</Month>
<Day>16</Day>
</DateRevised>
<Article PubModel="Print"><Journal><ISSN IssnType="Electronic">1367-4811</ISSN>
<JournalIssue CitedMedium="Internet"><Volume>36</Volume>
<Issue>2</Issue>
<PubDate><Year>2020</Year>
<Month>Jan</Month>
<Day>15</Day>
</PubDate>
</JournalIssue>
<Title>Bioinformatics (Oxford, England)</Title>
<ISOAbbreviation>Bioinformatics</ISOAbbreviation>
</Journal>
<ArticleTitle>Haplotype-aware graph indexes.</ArticleTitle>
<Pagination><MedlinePgn>400-407</MedlinePgn>
</Pagination>
<ELocationID EIdType="doi" ValidYN="Y">10.1093/bioinformatics/btz575</ELocationID>
<Abstract><AbstractText Label="MOTIVATION" NlmCategory="BACKGROUND">The variation graph toolkit (VG) represents genetic variation as a graph. Although each path in the graph is a potential haplotype, most paths are non-biological, unlikely recombinations of true haplotypes.</AbstractText>
<AbstractText Label="RESULTS" NlmCategory="RESULTS">We augment the VG model with haplotype information to identify which paths are more likely to exist in nature. For this purpose, we develop a scalable implementation of the graph extension of the positional Burrows-Wheeler transform. We demonstrate the scalability of the new implementation by building a whole-genome index of the 5008 haplotypes of the 1000 Genomes Project, and an index of all 108 070 Trans-Omics for Precision Medicine Freeze 5 chromosome 17 haplotypes. We also develop an algorithm for simplifying variation graphs for k-mer indexing without losing any k-mers in the haplotypes.</AbstractText>
<AbstractText Label="AVAILABILITY AND IMPLEMENTATION" NlmCategory="METHODS">Our software is available at https://github.com/vgteam/vg, https://github.com/jltsiren/gbwt and https://github.com/jltsiren/gcsa2.</AbstractText>
<AbstractText Label="SUPPLEMENTARY INFORMATION" NlmCategory="BACKGROUND">Supplementary data are available at Bioinformatics online.</AbstractText>
<CopyrightInformation>© The Author(s) 2019. Published by Oxford University Press.</CopyrightInformation>
</Abstract>
<AuthorList CompleteYN="Y"><Author ValidYN="Y"><LastName>Sirén</LastName>
<ForeName>Jouni</ForeName>
<Initials>J</Initials>
<AffiliationInfo><Affiliation>UC Santa Cruz Genomics Institute, University of California, Santa Cruz, CA 95064, USA.</Affiliation>
</AffiliationInfo>
<AffiliationInfo><Affiliation>Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton CB10 1SA, UK.</Affiliation>
</AffiliationInfo>
</Author>
<Author ValidYN="Y"><LastName>Garrison</LastName>
<ForeName>Erik</ForeName>
<Initials>E</Initials>
<AffiliationInfo><Affiliation>Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton CB10 1SA, UK.</Affiliation>
</AffiliationInfo>
</Author>
<Author ValidYN="Y"><LastName>Novak</LastName>
<ForeName>Adam M</ForeName>
<Initials>AM</Initials>
<AffiliationInfo><Affiliation>UC Santa Cruz Genomics Institute, University of California, Santa Cruz, CA 95064, USA.</Affiliation>
</AffiliationInfo>
</Author>
<Author ValidYN="Y"><LastName>Paten</LastName>
<ForeName>Benedict</ForeName>
<Initials>B</Initials>
<AffiliationInfo><Affiliation>UC Santa Cruz Genomics Institute, University of California, Santa Cruz, CA 95064, USA.</Affiliation>
</AffiliationInfo>
</Author>
<Author ValidYN="Y"><LastName>Durbin</LastName>
<ForeName>Richard</ForeName>
<Initials>R</Initials>
<AffiliationInfo><Affiliation>Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton CB10 1SA, UK.</Affiliation>
</AffiliationInfo>
<AffiliationInfo><Affiliation>Department of Genetics, University of Cambridge, Cambridge CB2 3EH, UK.</Affiliation>
</AffiliationInfo>
</Author>
</AuthorList>
<Language>eng</Language>
<GrantList CompleteYN="Y"><Grant><GrantID>207492/Z/17/Z</GrantID>
<Acronym>WT_</Acronym>
<Agency>Wellcome Trust</Agency>
<Country>United Kingdom</Country>
</Grant>
</GrantList>
<PublicationTypeList><PublicationType UI="D016428">Journal Article</PublicationType>
</PublicationTypeList>
</Article>
<MedlineJournalInfo><Country>England</Country>
<MedlineTA>Bioinformatics</MedlineTA>
<NlmUniqueID>9808944</NlmUniqueID>
<ISSNLinking>1367-4803</ISSNLinking>
</MedlineJournalInfo>
<CitationSubset>IM</CitationSubset>
</MedlineCitation>
<PubmedData><History><PubMedPubDate PubStatus="received"><Year>2019</Year>
<Month>02</Month>
<Day>20</Day>
</PubMedPubDate>
<PubMedPubDate PubStatus="revised"><Year>2019</Year>
<Month>05</Month>
<Day>29</Day>
</PubMedPubDate>
<PubMedPubDate PubStatus="accepted"><Year>2019</Year>
<Month>07</Month>
<Day>18</Day>
</PubMedPubDate>
<PubMedPubDate PubStatus="pubmed"><Year>2019</Year>
<Month>8</Month>
<Day>14</Day>
<Hour>6</Hour>
<Minute>0</Minute>
</PubMedPubDate>
<PubMedPubDate PubStatus="medline"><Year>2019</Year>
<Month>8</Month>
<Day>14</Day>
<Hour>6</Hour>
<Minute>0</Minute>
</PubMedPubDate>
<PubMedPubDate PubStatus="entrez"><Year>2019</Year>
<Month>8</Month>
<Day>14</Day>
<Hour>6</Hour>
<Minute>0</Minute>
</PubMedPubDate>
</History>
<PublicationStatus>ppublish</PublicationStatus>
<ArticleIdList><ArticleId IdType="pubmed">31406990</ArticleId>
<ArticleId IdType="pii">5538990</ArticleId>
<ArticleId IdType="doi">10.1093/bioinformatics/btz575</ArticleId>
</ArticleIdList>
</PubmedData>
</pubmed>
</record>
To manipulate this document under Unix (Dilib)
EXPLOR_STEP=$WICRI_ROOT/Sante/explor/MersV1/Data/PubMed/Corpus
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000455 | SxmlIndent | more
Or
HfdSelect -h $EXPLOR_AREA/Data/PubMed/Corpus/biblio.hfd -nk 000455 | SxmlIndent | more
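The commands above print the record as indented XML. As a minimal sketch (not part of Dilib), and assuming the `<pubmed>` subtree has been captured as a string, Python's standard `xml.etree.ElementTree` could be used to pull out the main bibliographic fields. The function name `summarize` and the trimmed excerpt `PUBMED_XML` are illustrative assumptions. The sketch parses only the `<pubmed>` subtree, because the full record as shown here uses `nlm:` and `wicri:` prefixes without namespace declarations, which a strict XML parser would reject.

```python
import xml.etree.ElementTree as ET

# Trimmed excerpt of the <pubmed> subtree of this record (illustrative only;
# dates, grants and affiliations are omitted for brevity).
PUBMED_XML = """<pubmed><MedlineCitation Status="In-Data-Review" Owner="NLM">
<PMID Version="1">31406990</PMID>
<Article PubModel="Print">
<Journal><ISSN IssnType="Electronic">1367-4811</ISSN>
<JournalIssue CitedMedium="Internet"><Volume>36</Volume><Issue>2</Issue></JournalIssue>
</Journal>
<ArticleTitle>Haplotype-aware graph indexes.</ArticleTitle>
<Pagination><MedlinePgn>400-407</MedlinePgn></Pagination>
<ELocationID EIdType="doi" ValidYN="Y">10.1093/bioinformatics/btz575</ELocationID>
<AuthorList CompleteYN="Y">
<Author ValidYN="Y"><LastName>Sirén</LastName><ForeName>Jouni</ForeName></Author>
<Author ValidYN="Y"><LastName>Garrison</LastName><ForeName>Erik</ForeName></Author>
<Author ValidYN="Y"><LastName>Novak</LastName><ForeName>Adam M</ForeName></Author>
<Author ValidYN="Y"><LastName>Paten</LastName><ForeName>Benedict</ForeName></Author>
<Author ValidYN="Y"><LastName>Durbin</LastName><ForeName>Richard</ForeName></Author>
</AuthorList>
</Article>
</MedlineCitation></pubmed>"""


def summarize(pubmed_xml: str) -> dict:
    """Extract a few bibliographic fields from a <pubmed> subtree (hypothetical helper)."""
    root = ET.fromstring(pubmed_xml)
    article = root.find("MedlineCitation/Article")
    return {
        "pmid": root.findtext("MedlineCitation/PMID"),
        "title": article.findtext("ArticleTitle"),
        # Select the ELocationID element whose EIdType attribute is "doi".
        "doi": article.findtext("ELocationID[@EIdType='doi']"),
        "volume": article.findtext("Journal/JournalIssue/Volume"),
        "pages": article.findtext("Pagination/MedlinePgn"),
        "authors": [
            f"{a.findtext('ForeName')} {a.findtext('LastName')}"
            for a in article.iterfind("AuthorList/Author")
        ],
    }


if __name__ == "__main__":
    for key, value in summarize(PUBMED_XML).items():
        print(f"{key}: {value}")
```

In practice the string would come from the output of the `HfdSelect` pipeline rather than from an inline literal; how that output is captured is left open here.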
To link to this page within the Wicri network
{{Explor lien |wiki= Sante |area= MersV1 |flux= PubMed |étape= Corpus |type= RBID |clé= pubmed:31406990 |texte= Haplotype-aware graph indexes. }}
To generate wiki pages
HfdIndexSelect -h $EXPLOR_AREA/Data/PubMed/Corpus/RBID.i -Sk "pubmed:31406990" \
       | HfdSelect -Kh $EXPLOR_AREA/Data/PubMed/Corpus/biblio.hfd \
       | NlmPubMed2Wicri -a MersV1
This area was generated with Dilib version V0.6.33.