MERS exploration server

Warning: this site is under development!
Warning: this site was generated automatically from raw corpora.
The information has therefore not been validated.

ntHash: recursive nucleotide hashing

Internal identifier: 000F38 (Main/Exploration); previous: 000F37; next: 000F39

Authors: Hamid Mohamadi; Justin Chu; Benjamin P. Vandervalk; Inanc Birol

RBID : PMC:5181554

Abstract

Motivation: Hashing has been widely used for indexing, querying and rapid similarity search in many bioinformatics applications, including sequence alignment, genome and transcriptome assembly, k-mer counting and error correction. Hence, expediting hashing operations would have a substantial impact in the field, making bioinformatics applications faster and more efficient.

Results: We present ntHash, a hashing algorithm tuned for processing DNA/RNA sequences. It performs the best when calculating hash values for adjacent k-mers in an input sequence, operating an order of magnitude faster than the best performing alternatives in typical use cases.
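
The abstract does not spell out how hashes of adjacent k-mers are derived from one another, so the following is only a minimal sketch of the general idea behind a recursive (rolling) hash, not the ntHash implementation or its API. It assumes a cyclic-polynomial style scheme built from 64-bit rotations and XOR; the seed constants, the function names `rol`, `hash_kmer` and `rolling_hashes`, and the restriction to uppercase A, C, G, T are all illustrative assumptions.

```python
MASK = (1 << 64) - 1  # keep every value within 64 bits

# Illustrative per-nucleotide 64-bit seeds (assumed values, not taken from the abstract).
SEED = {
    "A": 0x3C8BFBB395C60474,
    "C": 0x3193C18562A02B4C,
    "G": 0x20323ED082572324,
    "T": 0x295549F54BE24456,
}


def rol(x, r):
    """Rotate the 64-bit value x left by r bits."""
    r %= 64
    return ((x << r) | (x >> (64 - r))) & MASK


def hash_kmer(kmer):
    """Hash one k-mer from scratch: O(k) work."""
    h = 0
    for base in kmer:
        h = rol(h, 1) ^ SEED[base]
    return h


def rolling_hashes(seq, k):
    """Yield the hash of every k-mer in seq, left to right.

    Only the first k-mer is hashed from scratch. Each later hash is
    derived from the previous one in O(1): rotate, cancel the base that
    left the window, and mix in the base that entered it.
    """
    if k <= 0 or len(seq) < k:
        return
    h = hash_kmer(seq[:k])
    yield h
    for i in range(k, len(seq)):
        outgoing = rol(SEED[seq[i - k]], k)  # contribution of the base leaving the window
        h = rol(h, 1) ^ outgoing ^ SEED[seq[i]]
        yield h


if __name__ == "__main__":
    seq, k = "ACGTACGTTGCA", 5
    for pos, value in enumerate(rolling_hashes(seq, k)):
        # The rolled value must agree with hashing the k-mer from scratch.
        assert value == hash_kmer(seq[pos:pos + k])
        print(seq[pos:pos + k], f"{value:016x}")
```

Because rotation distributes over XOR, the update step reproduces exactly what `hash_kmer` would compute for the shifted window, which is why sliding along a sequence costs constant time per k-mer instead of time proportional to k.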

Availability and implementation: ntHash is available online at http://www.bcgsc.ca/platform/bioinfo/software/nthash and is free for academic use.

Contacts: hmohamadi@bcgsc.ca or ibirol@bcgsc.ca

Supplementary information: Supplementary data are available at Bioinformatics online.


DOI: 10.1093/bioinformatics/btw397
PubMed: 27423894
PubMed Central: 5181554


The document in XML format

<record>
<TEI>
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en">ntHash: recursive nucleotide hashing</title>
<author>
<name sortKey="Mohamadi, Hamid" sort="Mohamadi, Hamid" uniqKey="Mohamadi H" first="Hamid" last="Mohamadi">Hamid Mohamadi</name>
</author>
<author>
<name sortKey="Chu, Justin" sort="Chu, Justin" uniqKey="Chu J" first="Justin" last="Chu">Justin Chu</name>
</author>
<author>
<name sortKey="Vandervalk, Benjamin P" sort="Vandervalk, Benjamin P" uniqKey="Vandervalk B" first="Benjamin P." last="Vandervalk">Benjamin P. Vandervalk</name>
</author>
<author>
<name sortKey="Birol, Inanc" sort="Birol, Inanc" uniqKey="Birol I" first="Inanc" last="Birol">Inanc Birol</name>
</author>
</titleStmt>
<publicationStmt>
<idno type="wicri:source">PMC</idno>
<idno type="pmid">27423894</idno>
<idno type="pmc">5181554</idno>
<idno type="url">http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5181554</idno>
<idno type="RBID">PMC:5181554</idno>
<idno type="doi">10.1093/bioinformatics/btw397</idno>
<date when="2016">2016</date>
<idno type="wicri:Area/Pmc/Corpus">000B14</idno>
<idno type="wicri:explorRef" wicri:stream="Pmc" wicri:step="Corpus" wicri:corpus="PMC">000B14</idno>
<idno type="wicri:Area/Pmc/Curation">000B14</idno>
<idno type="wicri:explorRef" wicri:stream="Pmc" wicri:step="Curation">000B14</idno>
<idno type="wicri:Area/Pmc/Checkpoint">000955</idno>
<idno type="wicri:explorRef" wicri:stream="Pmc" wicri:step="Checkpoint">000955</idno>
<idno type="wicri:source">PubMed</idno>
<idno type="RBID">pubmed:27423894</idno>
<idno type="wicri:Area/PubMed/Corpus">001046</idno>
<idno type="wicri:explorRef" wicri:stream="PubMed" wicri:step="Corpus" wicri:corpus="PubMed">001046</idno>
<idno type="wicri:Area/PubMed/Curation">001046</idno>
<idno type="wicri:explorRef" wicri:stream="PubMed" wicri:step="Curation">001046</idno>
<idno type="wicri:Area/PubMed/Checkpoint">000E25</idno>
<idno type="wicri:explorRef" wicri:stream="Checkpoint" wicri:step="PubMed">000E25</idno>
<idno type="wicri:Area/Ncbi/Merge">001701</idno>
<idno type="wicri:Area/Ncbi/Curation">001701</idno>
<idno type="wicri:Area/Ncbi/Checkpoint">001701</idno>
<idno type="wicri:doubleKey">1367-4803:2016:Mohamadi H:nthash:recursive:nucleotide</idno>
<idno type="wicri:Area/Main/Merge">000F41</idno>
<idno type="wicri:Area/Main/Curation">000F38</idno>
<idno type="wicri:Area/Main/Exploration">000F38</idno>
</publicationStmt>
<sourceDesc>
<biblStruct>
<analytic>
<title xml:lang="en" level="a" type="main">ntHash: recursive nucleotide hashing</title>
<author>
<name sortKey="Mohamadi, Hamid" sort="Mohamadi, Hamid" uniqKey="Mohamadi H" first="Hamid" last="Mohamadi">Hamid Mohamadi</name>
</author>
<author>
<name sortKey="Chu, Justin" sort="Chu, Justin" uniqKey="Chu J" first="Justin" last="Chu">Justin Chu</name>
</author>
<author>
<name sortKey="Vandervalk, Benjamin P" sort="Vandervalk, Benjamin P" uniqKey="Vandervalk B" first="Benjamin P." last="Vandervalk">Benjamin P. Vandervalk</name>
</author>
<author>
<name sortKey="Birol, Inanc" sort="Birol, Inanc" uniqKey="Birol I" first="Inanc" last="Birol">Inanc Birol</name>
</author>
</analytic>
<series>
<title level="j">Bioinformatics</title>
<idno type="ISSN">1367-4803</idno>
<idno type="eISSN">1367-4811</idno>
<imprint>
<date when="2016">2016</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
</fileDesc>
<profileDesc>
<textClass>
<keywords scheme="KwdEn" xml:lang="en">
<term>Algorithms</term>
<term>Animals</term>
<term>Humans</term>
<term>Nucleotides</term>
<term>Sequence Alignment</term>
<term>Sequence Analysis, DNA</term>
<term>Software</term>
</keywords>
<keywords scheme="KwdFr" xml:lang="fr">
<term>Algorithmes</term>
<term>Alignement de séquences</term>
<term>Analyse de séquence d'ADN</term>
<term>Animaux</term>
<term>Humains</term>
<term>Logiciel</term>
<term>Nucléotides</term>
</keywords>
<keywords scheme="MESH" type="chemical" xml:lang="en">
<term>Nucleotides</term>
</keywords>
<keywords scheme="MESH" xml:lang="en">
<term>Algorithms</term>
<term>Animals</term>
<term>Humans</term>
<term>Sequence Alignment</term>
<term>Sequence Analysis, DNA</term>
<term>Software</term>
</keywords>
<keywords scheme="MESH" xml:lang="fr">
<term>Algorithmes</term>
<term>Alignement de séquences</term>
<term>Analyse de séquence d'ADN</term>
<term>Animaux</term>
<term>Humains</term>
<term>Logiciel</term>
<term>Nucléotides</term>
</keywords>
</textClass>
</profileDesc>
</teiHeader>
<front>
<div type="abstract" xml:lang="en">
<p>
<bold>Motivation</bold>
: Hashing has been widely used for indexing, querying and rapid similarity search in many bioinformatics applications, including sequence alignment, genome and transcriptome assembly,
<italic>k</italic>
-mer counting and error correction. Hence, expediting hashing operations would have a substantial impact in the field, making bioinformatics applications faster and more efficient.</p>
<p>
<bold>Results</bold>
: We present ntHash, a hashing algorithm tuned for processing DNA/RNA sequences. It performs the best when calculating hash values for adjacent
<italic>k</italic>
-mers in an input sequence, operating an order of magnitude faster than the best performing alternatives in typical use cases.</p>
<p>
<bold>Availability and implementation</bold>
: ntHash is available online at
<ext-link ext-link-type="uri" xlink:href="http://www.bcgsc.ca/platform/bioinfo/software/nthash">http://www.bcgsc.ca/platform/bioinfo/software/nthash</ext-link>
and is free for academic use.</p>
<p>
<bold>Contacts</bold>
:
<email>hmohamadi@bcgsc.ca</email>
or
<email>ibirol@bcgsc.ca</email>
</p>
<p>
<bold>Supplementary information:</bold>
<ext-link ext-link-type="uri" xlink:href="http://bioinformatics.oxfordjournals.org/lookup/suppl/doi:10.1093/bioinformatics/btw397/-/DC1">Supplementary data</ext-link>
are available at
<italic>Bioinformatics</italic>
online.</p>
</div>
</front>
<back>
<div1 type="bibliography">
<listBibl>
<biblStruct>
<analytic>
<author>
<name sortKey="Cohen, J D" uniqKey="Cohen J">J.D. Cohen</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Gonnet, G H" uniqKey="Gonnet G">G.H. Gonnet</name>
</author>
<author>
<name sortKey="Baezayates, R A" uniqKey="Baezayates R">R.A. Baezayates</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Karp, R M" uniqKey="Karp R">R.M. Karp</name>
</author>
<author>
<name sortKey="Rabin, M O" uniqKey="Rabin M">M.O. Rabin</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Lemire, D" uniqKey="Lemire D">D. Lemire</name>
</author>
<author>
<name sortKey="Kaser, O" uniqKey="Kaser O">O. Kaser</name>
</author>
</analytic>
</biblStruct>
</listBibl>
</div1>
</back>
</TEI>
<affiliations>
<list></list>
<tree>
<noCountry>
<name sortKey="Birol, Inanc" sort="Birol, Inanc" uniqKey="Birol I" first="Inanc" last="Birol">Inanc Birol</name>
<name sortKey="Chu, Justin" sort="Chu, Justin" uniqKey="Chu J" first="Justin" last="Chu">Justin Chu</name>
<name sortKey="Mohamadi, Hamid" sort="Mohamadi, Hamid" uniqKey="Mohamadi H" first="Hamid" last="Mohamadi">Hamid Mohamadi</name>
<name sortKey="Vandervalk, Benjamin P" sort="Vandervalk, Benjamin P" uniqKey="Vandervalk B" first="Benjamin P." last="Vandervalk">Benjamin P. Vandervalk</name>
</noCountry>
</tree>
</affiliations>
</record>
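
As a hedged illustration of how the record above can be read programmatically, the sketch below pulls the title, authors and identifiers out of a trimmed excerpt using Python's standard `xml.etree.ElementTree`. The excerpt, the `summarize` function and the returned dictionary layout are assumptions made for this example. Note that the full record uses `wicri:` and `xlink:` prefixes whose namespace declarations do not appear on this page, so a namespace-aware parser will reject it unless those prefixes are declared; the excerpt deliberately leaves them out.

```python
import xml.etree.ElementTree as ET

# Trimmed excerpt of the record (attributes reduced for brevity).
EXCERPT = """<record><TEI><teiHeader><fileDesc>
<titleStmt>
<title xml:lang="en">ntHash: recursive nucleotide hashing</title>
<author><name sortKey="Mohamadi, Hamid" first="Hamid" last="Mohamadi">Hamid Mohamadi</name></author>
<author><name sortKey="Chu, Justin" first="Justin" last="Chu">Justin Chu</name></author>
<author><name sortKey="Vandervalk, Benjamin P" first="Benjamin P." last="Vandervalk">Benjamin P. Vandervalk</name></author>
<author><name sortKey="Birol, Inanc" first="Inanc" last="Birol">Inanc Birol</name></author>
</titleStmt>
<publicationStmt>
<idno type="pmid">27423894</idno>
<idno type="pmc">5181554</idno>
<idno type="doi">10.1093/bioinformatics/btw397</idno>
</publicationStmt>
</fileDesc></teiHeader></TEI></record>"""


def summarize(xml_text):
    """Return title, author names and selected identifiers from a record."""
    root = ET.fromstring(xml_text)
    stmt = root.find(".//titleStmt")
    # Map each idno type to its text; later duplicates would overwrite earlier ones.
    ids = {e.get("type"): e.text for e in root.iter("idno")}
    return {
        "title": stmt.findtext("title"),
        "authors": [name.text for name in stmt.findall("author/name")],
        "doi": ids.get("doi"),
        "pmid": ids.get("pmid"),
    }


if __name__ == "__main__":
    info = summarize(EXCERPT)
    print(info["title"])
    print("; ".join(info["authors"]))
    print("DOI:", info["doi"], "PMID:", info["pmid"])
```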

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Sante/explor/MersV1/Data/Main/Exploration
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000F38 | SxmlIndent | more

Or

HfdSelect -h $EXPLOR_AREA/Data/Main/Exploration/biblio.hfd -nk 000F38 | SxmlIndent | more

To link to this page within the Wicri network

{{Explor lien
   |wiki=    Sante
   |area=    MersV1
   |flux=    Main
   |étape=   Exploration
   |type=    RBID
   |clé=     PMC:5181554
   |texte=   ntHash: recursive nucleotide hashing
}}

To generate wiki pages

HfdIndexSelect -h $EXPLOR_AREA/Data/Main/Exploration/RBID.i   -Sk "pubmed:27423894" \
       | HfdSelect -Kh $EXPLOR_AREA/Data/Main/Exploration/biblio.hfd   \
       | NlmPubMed2Wicri -a MersV1 

This area was generated with Dilib version V0.6.33.
Data generation: Mon Apr 20 23:26:43 2020. Site generation: Sat Mar 27 09:06:09 2021