MERS exploration server

Warning: this site is under development.
Warning: this site was generated automatically from raw corpora.
The information has therefore not been validated.

KAnalyze: a fast versatile pipelined k-mer toolkit.

Internal identifier: 001A25 (PubMed/Corpus); previous: 001A24; next: 001A26

Authors: Peter Audano; Fredrik Vannberg

Source: Bioinformatics (Oxford, England), 2014

RBID : pubmed:24642064

English descriptors

Abstract

Converting nucleotide sequences into short overlapping fragments of uniform length, k-mers, is a common step in many bioinformatics applications. While existing software packages count k-mers, few are optimized for speed, offer an application programming interface (API), a graphical interface or contain features that make it extensible and maintainable. We designed KAnalyze to compete with the fastest k-mer counters, to produce reliable output and to support future development efforts through well-architected, documented and testable code. Currently, KAnalyze can output k-mer counts in a sorted tab-delimited file or stream k-mers as they are read. KAnalyze can process large datasets with 2 GB of memory. This project is implemented in Java 7, and the command line interface (CLI) is designed to integrate into pipelines written in any language.
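
As a rough illustration of what the abstract means by splitting a sequence into overlapping k-mers and writing sorted, tab-delimited counts, here is a minimal Python sketch. It is not KAnalyze's implementation (which is written in Java and pipelined for large inputs); the function names `count_kmers` and `format_counts` and the example sequence are assumptions made for this illustration only.

```python
from collections import Counter


def count_kmers(sequence, k):
    """Count every overlapping k-mer in a nucleotide sequence.

    Hypothetical helper for illustration; not part of KAnalyze.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    seq = sequence.upper()
    # A sequence of length n yields n - k + 1 overlapping k-mers
    # (none at all when the sequence is shorter than k).
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def format_counts(counts):
    """Render counts as sorted, tab-delimited lines: k-mer, then count."""
    return "\n".join(f"{kmer}\t{n}" for kmer, n in sorted(counts.items()))


if __name__ == "__main__":
    # Example input chosen for this sketch only.
    print(format_counts(count_kmers("ACGTACGT", 3)))
```

This in-memory approach only suits small inputs; processing large datasets within a fixed memory budget, as the abstract describes, requires a different design.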

DOI: 10.1093/bioinformatics/btu152
PubMed: 24642064

Links to Exploration step

pubmed:24642064

The document in XML format

<record>
<TEI>
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en">KAnalyze: a fast versatile pipelined k-mer toolkit.</title>
<author>
<name sortKey="Audano, Peter" sort="Audano, Peter" uniqKey="Audano P" first="Peter" last="Audano">Peter Audano</name>
<affiliation>
<nlm:affiliation>School of Biology, Georgia Institute of Technology, Atlanta, GA 30332, USA.</nlm:affiliation>
</affiliation>
</author>
<author>
<name sortKey="Vannberg, Fredrik" sort="Vannberg, Fredrik" uniqKey="Vannberg F" first="Fredrik" last="Vannberg">Fredrik Vannberg</name>
<affiliation>
<nlm:affiliation>School of Biology, Georgia Institute of Technology, Atlanta, GA 30332, USA.</nlm:affiliation>
</affiliation>
</author>
</titleStmt>
<publicationStmt>
<idno type="wicri:source">PubMed</idno>
<date when="2014">2014</date>
<idno type="RBID">pubmed:24642064</idno>
<idno type="pmid">24642064</idno>
<idno type="doi">10.1093/bioinformatics/btu152</idno>
<idno type="wicri:Area/PubMed/Corpus">001A25</idno>
<idno type="wicri:explorRef" wicri:stream="PubMed" wicri:step="Corpus" wicri:corpus="PubMed">001A25</idno>
</publicationStmt>
<sourceDesc>
<biblStruct>
<analytic>
<title xml:lang="en">KAnalyze: a fast versatile pipelined k-mer toolkit.</title>
<author>
<name sortKey="Audano, Peter" sort="Audano, Peter" uniqKey="Audano P" first="Peter" last="Audano">Peter Audano</name>
<affiliation>
<nlm:affiliation>School of Biology, Georgia Institute of Technology, Atlanta, GA 30332, USA.</nlm:affiliation>
</affiliation>
</author>
<author>
<name sortKey="Vannberg, Fredrik" sort="Vannberg, Fredrik" uniqKey="Vannberg F" first="Fredrik" last="Vannberg">Fredrik Vannberg</name>
<affiliation>
<nlm:affiliation>School of Biology, Georgia Institute of Technology, Atlanta, GA 30332, USA.</nlm:affiliation>
</affiliation>
</author>
</analytic>
<series>
<title level="j">Bioinformatics (Oxford, England)</title>
<idno type="eISSN">1367-4811</idno>
<imprint>
<date when="2014" type="published">2014</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
</fileDesc>
<profileDesc>
<textClass>
<keywords scheme="KwdEn" xml:lang="en">
<term>Algorithms</term>
<term>Chromosomes, Human, Pair 1 (chemistry)</term>
<term>Humans</term>
<term>Sequence Analysis, DNA (methods)</term>
<term>Software</term>
</keywords>
<keywords scheme="MESH" qualifier="chemistry" xml:lang="en">
<term>Chromosomes, Human, Pair 1</term>
</keywords>
<keywords scheme="MESH" qualifier="methods" xml:lang="en">
<term>Sequence Analysis, DNA</term>
</keywords>
<keywords scheme="MESH" xml:lang="en">
<term>Algorithms</term>
<term>Humans</term>
<term>Software</term>
</keywords>
</textClass>
</profileDesc>
</teiHeader>
<front>
<div type="abstract" xml:lang="en">Converting nucleotide sequences into short overlapping fragments of uniform length, k-mers, is a common step in many bioinformatics applications. While existing software packages count k-mers, few are optimized for speed, offer an application programming interface (API), a graphical interface or contain features that make it extensible and maintainable. We designed KAnalyze to compete with the fastest k-mer counters, to produce reliable output and to support future development efforts through well-architected, documented and testable code. Currently, KAnalyze can output k-mer counts in a sorted tab-delimited file or stream k-mers as they are read. KAnalyze can process large datasets with 2 GB of memory. This project is implemented in Java 7, and the command line interface (CLI) is designed to integrate into pipelines written in any language.</div>
</front>
</TEI>
<pubmed>
<MedlineCitation Status="MEDLINE" IndexingMethod="Curated" Owner="NLM">
<PMID Version="1">24642064</PMID>
<DateCompleted>
<Year>2014</Year>
<Month>09</Month>
<Day>18</Day>
</DateCompleted>
<DateRevised>
<Year>2018</Year>
<Month>12</Month>
<Day>02</Day>
</DateRevised>
<Article PubModel="Print-Electronic">
<Journal>
<ISSN IssnType="Electronic">1367-4811</ISSN>
<JournalIssue CitedMedium="Internet">
<Volume>30</Volume>
<Issue>14</Issue>
<PubDate>
<Year>2014</Year>
<Month>Jul</Month>
<Day>15</Day>
</PubDate>
</JournalIssue>
<Title>Bioinformatics (Oxford, England)</Title>
<ISOAbbreviation>Bioinformatics</ISOAbbreviation>
</Journal>
<ArticleTitle>KAnalyze: a fast versatile pipelined k-mer toolkit.</ArticleTitle>
<Pagination>
<MedlinePgn>2070-2</MedlinePgn>
</Pagination>
<ELocationID EIdType="doi" ValidYN="Y">10.1093/bioinformatics/btu152</ELocationID>
<Abstract>
<AbstractText Label="MOTIVATION" NlmCategory="BACKGROUND">Converting nucleotide sequences into short overlapping fragments of uniform length, k-mers, is a common step in many bioinformatics applications. While existing software packages count k-mers, few are optimized for speed, offer an application programming interface (API), a graphical interface or contain features that make it extensible and maintainable. We designed KAnalyze to compete with the fastest k-mer counters, to produce reliable output and to support future development efforts through well-architected, documented and testable code. Currently, KAnalyze can output k-mer counts in a sorted tab-delimited file or stream k-mers as they are read. KAnalyze can process large datasets with 2 GB of memory. This project is implemented in Java 7, and the command line interface (CLI) is designed to integrate into pipelines written in any language.</AbstractText>
<AbstractText Label="RESULTS" NlmCategory="RESULTS">As a k-mer counter, KAnalyze outperforms Jellyfish, DSK and a pipeline built on Perl and Linux utilities. Through extensive unit and system testing, we have verified that KAnalyze produces the correct k-mer counts over multiple datasets and k-mer sizes.</AbstractText>
<AbstractText Label="AVAILABILITY AND IMPLEMENTATION" NlmCategory="METHODS">KAnalyze is available on SourceForge: https://sourceforge.net/projects/kanalyze/.</AbstractText>
<CopyrightInformation>© The Author 2014. Published by Oxford University Press.</CopyrightInformation>
</Abstract>
<AuthorList CompleteYN="Y">
<Author ValidYN="Y">
<LastName>Audano</LastName>
<ForeName>Peter</ForeName>
<Initials>P</Initials>
<AffiliationInfo>
<Affiliation>School of Biology, Georgia Institute of Technology, Atlanta, GA 30332, USA.</Affiliation>
</AffiliationInfo>
</Author>
<Author ValidYN="Y">
<LastName>Vannberg</LastName>
<ForeName>Fredrik</ForeName>
<Initials>F</Initials>
<AffiliationInfo>
<Affiliation>School of Biology, Georgia Institute of Technology, Atlanta, GA 30332, USA.</Affiliation>
</AffiliationInfo>
</Author>
</AuthorList>
<Language>eng</Language>
<GrantList CompleteYN="Y">
<Grant>
<GrantID>T32 GM105490</GrantID>
<Acronym>GM</Acronym>
<Agency>NIGMS NIH HHS</Agency>
<Country>United States</Country>
</Grant>
</GrantList>
<PublicationTypeList>
<PublicationType UI="D016428">Journal Article</PublicationType>
<PublicationType UI="D013485">Research Support, Non-U.S. Gov't</PublicationType>
</PublicationTypeList>
<ArticleDate DateType="Electronic">
<Year>2014</Year>
<Month>03</Month>
<Day>18</Day>
</ArticleDate>
</Article>
<MedlineJournalInfo>
<Country>England</Country>
<MedlineTA>Bioinformatics</MedlineTA>
<NlmUniqueID>9808944</NlmUniqueID>
<ISSNLinking>1367-4803</ISSNLinking>
</MedlineJournalInfo>
<CitationSubset>IM</CitationSubset>
<MeshHeadingList>
<MeshHeading>
<DescriptorName UI="D000465" MajorTopicYN="N">Algorithms</DescriptorName>
</MeshHeading>
<MeshHeading>
<DescriptorName UI="D002878" MajorTopicYN="N">Chromosomes, Human, Pair 1</DescriptorName>
<QualifierName UI="Q000737" MajorTopicYN="N">chemistry</QualifierName>
</MeshHeading>
<MeshHeading>
<DescriptorName UI="D006801" MajorTopicYN="N">Humans</DescriptorName>
</MeshHeading>
<MeshHeading>
<DescriptorName UI="D017422" MajorTopicYN="N">Sequence Analysis, DNA</DescriptorName>
<QualifierName UI="Q000379" MajorTopicYN="Y">methods</QualifierName>
</MeshHeading>
<MeshHeading>
<DescriptorName UI="D012984" MajorTopicYN="Y">Software</DescriptorName>
</MeshHeading>
</MeshHeadingList>
</MedlineCitation>
<PubmedData>
<History>
<PubMedPubDate PubStatus="entrez">
<Year>2014</Year>
<Month>3</Month>
<Day>20</Day>
<Hour>6</Hour>
<Minute>0</Minute>
</PubMedPubDate>
<PubMedPubDate PubStatus="pubmed">
<Year>2014</Year>
<Month>3</Month>
<Day>20</Day>
<Hour>6</Hour>
<Minute>0</Minute>
</PubMedPubDate>
<PubMedPubDate PubStatus="medline">
<Year>2014</Year>
<Month>9</Month>
<Day>19</Day>
<Hour>6</Hour>
<Minute>0</Minute>
</PubMedPubDate>
</History>
<PublicationStatus>ppublish</PublicationStatus>
<ArticleIdList>
<ArticleId IdType="pubmed">24642064</ArticleId>
<ArticleId IdType="pii">btu152</ArticleId>
<ArticleId IdType="doi">10.1093/bioinformatics/btu152</ArticleId>
<ArticleId IdType="pmc">PMC4080738</ArticleId>
</ArticleIdList>
<ReferenceList>
<Reference>
<Citation>Nat Biotechnol. 2013 Apr;31(4):325-30</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">23475072</ArticleId>
</ArticleIdList>
</Reference>
<Reference>
<Citation>PLoS Biol. 2014 Jan;12(1):e1001745</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">24415924</ArticleId>
</ArticleIdList>
</Reference>
<Reference>
<Citation>Bioinformatics. 2013 Mar 1;29(5):652-3</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">23325618</ArticleId>
</ArticleIdList>
</Reference>
<Reference>
<Citation>Bioinformatics. 2011 Mar 15;27(6):764-70</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">21217122</ArticleId>
</ArticleIdList>
</Reference>
<Reference>
<Citation>Nucleic Acids Res. 2009 Jan;37(Database issue):D77-82</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">18842628</ArticleId>
</ArticleIdList>
</Reference>
</ReferenceList>
</PubmedData>
</pubmed>
</record>

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Sante/explor/MersV1/Data/PubMed/Corpus
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 001A25 | SxmlIndent | more

Or

HfdSelect -h $EXPLOR_AREA/Data/PubMed/Corpus/biblio.hfd -nk 001A25 | SxmlIndent | more

To link to this page within the Wicri network

{{Explor lien
   |wiki=    Sante
   |area=    MersV1
   |flux=    PubMed
   |étape=   Corpus
   |type=    RBID
   |clé=     pubmed:24642064
   |texte=   KAnalyze: a fast versatile pipelined k-mer toolkit.
}}

To generate wiki pages

HfdIndexSelect -h $EXPLOR_AREA/Data/PubMed/Corpus/RBID.i   -Sk "pubmed:24642064" \
       | HfdSelect -Kh $EXPLOR_AREA/Data/PubMed/Corpus/biblio.hfd   \
       | NlmPubMed2Wicri -a MersV1 

Wicri

This area was generated with Dilib version V0.6.33.
Data generation: Mon Apr 20 23:26:43 2020. Site generation: Sat Mar 27 09:06:09 2021