ntHash: recursive nucleotide hashing
Internal identifier: 000955 (Pmc/Checkpoint); previous: 000954; next: 000956
Authors: Hamid Mohamadi; Justin Chu; Benjamin P. Vandervalk; Inanc Birol
Source: Bioinformatics [1367-4803]; 2016.
DOI: 10.1093/bioinformatics/btw397
PubMed: 27423894
PubMed Central: 5181554
The document in XML format
<record><TEI><teiHeader><fileDesc><titleStmt><title xml:lang="en">ntHash: recursive nucleotide hashing</title>
<author><name sortKey="Mohamadi, Hamid" sort="Mohamadi, Hamid" uniqKey="Mohamadi H" first="Hamid" last="Mohamadi">Hamid Mohamadi</name>
</author>
<author><name sortKey="Chu, Justin" sort="Chu, Justin" uniqKey="Chu J" first="Justin" last="Chu">Justin Chu</name>
</author>
<author><name sortKey="Vandervalk, Benjamin P" sort="Vandervalk, Benjamin P" uniqKey="Vandervalk B" first="Benjamin P." last="Vandervalk">Benjamin P. Vandervalk</name>
</author>
<author><name sortKey="Birol, Inanc" sort="Birol, Inanc" uniqKey="Birol I" first="Inanc" last="Birol">Inanc Birol</name>
</author>
</titleStmt>
<publicationStmt><idno type="wicri:source">PMC</idno>
<idno type="pmid">27423894</idno>
<idno type="pmc">5181554</idno>
<idno type="url">http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5181554</idno>
<idno type="RBID">PMC:5181554</idno>
<idno type="doi">10.1093/bioinformatics/btw397</idno>
<date when="2016">2016</date>
<idno type="wicri:Area/Pmc/Corpus">000B14</idno>
<idno type="wicri:explorRef" wicri:stream="Pmc" wicri:step="Corpus" wicri:corpus="PMC">000B14</idno>
<idno type="wicri:Area/Pmc/Curation">000B14</idno>
<idno type="wicri:explorRef" wicri:stream="Pmc" wicri:step="Curation">000B14</idno>
<idno type="wicri:Area/Pmc/Checkpoint">000955</idno>
<idno type="wicri:explorRef" wicri:stream="Pmc" wicri:step="Checkpoint">000955</idno>
</publicationStmt>
<sourceDesc><biblStruct><analytic><title xml:lang="en" level="a" type="main">ntHash: recursive nucleotide hashing</title>
<author><name sortKey="Mohamadi, Hamid" sort="Mohamadi, Hamid" uniqKey="Mohamadi H" first="Hamid" last="Mohamadi">Hamid Mohamadi</name>
</author>
<author><name sortKey="Chu, Justin" sort="Chu, Justin" uniqKey="Chu J" first="Justin" last="Chu">Justin Chu</name>
</author>
<author><name sortKey="Vandervalk, Benjamin P" sort="Vandervalk, Benjamin P" uniqKey="Vandervalk B" first="Benjamin P." last="Vandervalk">Benjamin P. Vandervalk</name>
</author>
<author><name sortKey="Birol, Inanc" sort="Birol, Inanc" uniqKey="Birol I" first="Inanc" last="Birol">Inanc Birol</name>
</author>
</analytic>
<series><title level="j">Bioinformatics</title>
<idno type="ISSN">1367-4803</idno>
<idno type="eISSN">1367-4811</idno>
<imprint><date when="2016">2016</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
</fileDesc>
<profileDesc><textClass></textClass>
</profileDesc>
</teiHeader>
<front><div type="abstract" xml:lang="en"><p><bold>Motivation</bold>: Hashing has been widely used for indexing, querying and rapid similarity search in many bioinformatics applications, including sequence alignment, genome and transcriptome assembly, <italic>k</italic>-mer counting and error correction. Hence, expediting hashing operations would have a substantial impact in the field, making bioinformatics applications faster and more efficient.</p>
<p><bold>Results</bold>: We present ntHash, a hashing algorithm tuned for processing DNA/RNA sequences. It performs the best when calculating hash values for adjacent <italic>k</italic>-mers in an input sequence, operating an order of magnitude faster than the best performing alternatives in typical use cases.</p>
<p><bold>Availability and implementation</bold>
: ntHash is available online at <ext-link ext-link-type="uri" xlink:href="http://www.bcgsc.ca/platform/bioinfo/software/nthash">http://www.bcgsc.ca/platform/bioinfo/software/nthash</ext-link>
and is free for academic use.</p>
<p><bold>Contacts</bold>
: <email>hmohamadi@bcgsc.ca</email>
or <email>ibirol@bcgsc.ca</email>
</p>
<p><bold>Supplementary information:</bold>
<ext-link ext-link-type="uri" xlink:href="http://bioinformatics.oxfordjournals.org/lookup/suppl/doi:10.1093/bioinformatics/btw397/-/DC1">Supplementary data</ext-link>
are available at <italic>Bioinformatics</italic>
online.</p>
</div>
</front>
<back><div1 type="bibliography"><listBibl><biblStruct><analytic><author><name sortKey="Cohen, J D" uniqKey="Cohen J">J.D. Cohen</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Gonnet, G H" uniqKey="Gonnet G">G.H. Gonnet</name>
</author>
<author><name sortKey="Baezayates, R A" uniqKey="Baezayates R">R.A. Baezayates</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Karp, R M" uniqKey="Karp R">R.M. Karp</name>
</author>
<author><name sortKey="Rabin, M O" uniqKey="Rabin M">M.O. Rabin</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Lemire, D" uniqKey="Lemire D">D. Lemire</name>
</author>
<author><name sortKey="Kaser, O" uniqKey="Kaser O">O. Kaser</name>
</author>
</analytic>
</biblStruct>
</listBibl>
</div1>
</back>
</TEI>
<pmc article-type="research-article"><pmc-dir>properties open_access</pmc-dir>
<front><journal-meta><journal-id journal-id-type="nlm-ta">Bioinformatics</journal-id>
<journal-id journal-id-type="iso-abbrev">Bioinformatics</journal-id>
<journal-id journal-id-type="publisher-id">bioinformatics</journal-id>
<journal-id journal-id-type="hwp">bioinfo</journal-id>
<journal-title-group><journal-title>Bioinformatics</journal-title>
</journal-title-group>
<issn pub-type="ppub">1367-4803</issn>
<issn pub-type="epub">1367-4811</issn>
<publisher><publisher-name>Oxford University Press</publisher-name>
</publisher>
</journal-meta>
<article-meta><article-id pub-id-type="pmid">27423894</article-id>
<article-id pub-id-type="pmc">5181554</article-id>
<article-id pub-id-type="doi">10.1093/bioinformatics/btw397</article-id>
<article-id pub-id-type="publisher-id">btw397</article-id>
<article-categories><subj-group subj-group-type="heading"><subject>Applications Notes</subject>
<subj-group subj-group-type="heading"><subject>Sequence Analysis</subject>
</subj-group>
</subj-group>
</article-categories>
<title-group><article-title>ntHash: recursive nucleotide hashing</article-title>
</title-group>
<contrib-group><contrib contrib-type="author"><name><surname>Mohamadi</surname>
<given-names>Hamid</given-names>
</name>
<xref ref-type="corresp" rid="btw397-cor1">*</xref>
</contrib>
<contrib contrib-type="author"><name><surname>Chu</surname>
<given-names>Justin</given-names>
</name>
</contrib>
<contrib contrib-type="author"><name><surname>Vandervalk</surname>
<given-names>Benjamin P.</given-names>
</name>
</contrib>
<contrib contrib-type="author"><name><surname>Birol</surname>
<given-names>Inanc</given-names>
</name>
<xref ref-type="corresp" rid="btw397-cor1">*</xref>
</contrib>
<aff id="btw397-aff1">Canada’s Michael Smith Genome Sciences Centre, British Columbia Cancer Agency, Vancouver, BC V5Z 4S6, Canada</aff>
</contrib-group>
<author-notes><corresp id="btw397-cor1">*To whom correspondence should be addressed.</corresp>
<fn id="btw397-FM1"><p>Associate Editor: Bonnie Berger</p>
</fn>
</author-notes>
<pub-date pub-type="ppub"><day>15</day>
<month>11</month>
<year>2016</year>
</pub-date>
<pub-date pub-type="epub"><day>16</day>
<month>7</month>
<year>2016</year>
</pub-date>
<pub-date pub-type="pmc-release"><day>16</day>
<month>7</month>
<year>2016</year>
</pub-date>
<pmc-comment> PMC Release delay is 0 months and 0 days and was based on the . </pmc-comment>
<volume>32</volume>
<issue>22</issue>
<fpage>3492</fpage>
<lpage>3494</lpage>
<history><date date-type="received"><day>3</day>
<month>2</month>
<year>2016</year>
</date>
<date date-type="rev-recd"><day>14</day>
<month>6</month>
<year>2016</year>
</date>
<date date-type="accepted"><day>17</day>
<month>6</month>
<year>2016</year>
</date>
</history>
<permissions><copyright-statement>© The Author 2016. Published by Oxford University Press.</copyright-statement>
<copyright-year>2016</copyright-year>
<license xlink:href="http://creativecommons.org/licenses/by-nc/4.0/" license-type="creative-commons"><license-p>This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (<ext-link ext-link-type="uri" xlink:href="http://creativecommons.org/licenses/by-nc/4.0/">http://creativecommons.org/licenses/by-nc/4.0/</ext-link>
), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com</license-p>
</license>
</permissions>
<abstract><p><bold>Motivation</bold>: Hashing has been widely used for indexing, querying and rapid similarity search in many bioinformatics applications, including sequence alignment, genome and transcriptome assembly, <italic>k</italic>-mer counting and error correction. Hence, expediting hashing operations would have a substantial impact in the field, making bioinformatics applications faster and more efficient.</p>
<p><bold>Results</bold>: We present ntHash, a hashing algorithm tuned for processing DNA/RNA sequences. It performs the best when calculating hash values for adjacent <italic>k</italic>-mers in an input sequence, operating an order of magnitude faster than the best performing alternatives in typical use cases.</p>
<p><bold>Availability and implementation</bold>
: ntHash is available online at <ext-link ext-link-type="uri" xlink:href="http://www.bcgsc.ca/platform/bioinfo/software/nthash">http://www.bcgsc.ca/platform/bioinfo/software/nthash</ext-link>
and is free for academic use.</p>
<p><bold>Contacts</bold>
: <email>hmohamadi@bcgsc.ca</email>
or <email>ibirol@bcgsc.ca</email>
</p>
<p><bold>Supplementary information:</bold>
<ext-link ext-link-type="uri" xlink:href="http://bioinformatics.oxfordjournals.org/lookup/suppl/doi:10.1093/bioinformatics/btw397/-/DC1">Supplementary data</ext-link>
are available at <italic>Bioinformatics</italic>
online.</p>
</abstract>
<counts><page-count count="3"></page-count>
</counts>
</article-meta>
</front>
</pmc>
<affiliations><list></list>
<tree><noCountry><name sortKey="Birol, Inanc" sort="Birol, Inanc" uniqKey="Birol I" first="Inanc" last="Birol">Inanc Birol</name>
<name sortKey="Chu, Justin" sort="Chu, Justin" uniqKey="Chu J" first="Justin" last="Chu">Justin Chu</name>
<name sortKey="Mohamadi, Hamid" sort="Mohamadi, Hamid" uniqKey="Mohamadi H" first="Hamid" last="Mohamadi">Hamid Mohamadi</name>
<name sortKey="Vandervalk, Benjamin P" sort="Vandervalk, Benjamin P" uniqKey="Vandervalk B" first="Benjamin P." last="Vandervalk">Benjamin P. Vandervalk</name>
</noCountry>
</tree>
</affiliations>
</record>
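The abstract in the record above describes hashing adjacent k-mers recursively, so that each hash is derived from the previous one rather than recomputed. The sketch below illustrates that general idea with a generic cyclic-polynomial rolling hash in Python. It is not the ntHash implementation: the seed constants, the 64-bit word size, and the names `rol`, `hash_kmer`, and `rolling_hashes` are assumptions made for illustration only, and reverse-complement (canonical) hashing is not covered.

```python
MASK64 = (1 << 64) - 1

# Hypothetical per-nucleotide seeds: arbitrary 64-bit constants chosen for
# this illustration, NOT the values used by ntHash.
SEED = {
    "A": 0x9E3779B97F4A7C15,
    "C": 0xC2B2AE3D27D4EB4F,
    "G": 0x165667B19E3779F9,
    "T": 0x27D4EB2F165667C5,
}


def rol(x, r):
    """Rotate the 64-bit integer x left by r bits."""
    r %= 64
    return ((x << r) | (x >> (64 - r))) & MASK64


def hash_kmer(kmer):
    """Hash one k-mer from scratch: XOR of the seeds, each rotated by position."""
    k = len(kmer)
    h = 0
    for i, base in enumerate(kmer):
        h ^= rol(SEED[base], k - 1 - i)
    return h


def rolling_hashes(seq, k):
    """Yield the hash of every k-mer in seq, in order.

    The first k-mer is hashed from scratch; each later hash is derived from
    the previous one with a fixed number of operations.
    """
    if k <= 0 or len(seq) < k:
        return
    h = hash_kmer(seq[:k])
    yield h
    for i in range(1, len(seq) - k + 1):
        outgoing = seq[i - 1]       # base leaving the window
        incoming = seq[i + k - 1]   # base entering the window
        # Rotate the old hash by one, cancel the outgoing base (which would
        # now carry a rotation of k), and XOR in the incoming base unrotated.
        h = rol(h, 1) ^ rol(SEED[outgoing], k) ^ SEED[incoming]
        yield h


if __name__ == "__main__":
    for value in rolling_hashes("ACGTTGCA", 4):
        print(f"{value:016x}")
```

Because rotation distributes over XOR, the updated value equals the hash obtained by calling `hash_kmer` on the new window directly; the update simply avoids revisiting the k - 1 bases that the two windows share.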
To manipulate this document under Unix (Dilib)
EXPLOR_STEP=$WICRI_ROOT/Sante/explor/MersV1/Data/Pmc/Checkpoint
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000955 | SxmlIndent | more
Or
HfdSelect -h $EXPLOR_AREA/Data/Pmc/Checkpoint/biblio.hfd -nk 000955 | SxmlIndent | more
To link to this page within the Wicri network
{{Explor lien |wiki= Sante |area= MersV1 |flux= Pmc |étape= Checkpoint |type= RBID |clé= PMC:5181554 |texte= ntHash: recursive nucleotide hashing }}
To generate wiki pages
HfdIndexSelect -h $EXPLOR_AREA/Data/Pmc/Checkpoint/RBID.i -Sk "pubmed:27423894" \
       | HfdSelect -Kh $EXPLOR_AREA/Data/Pmc/Checkpoint/biblio.hfd \
       | NlmPubMed2Wicri -a MersV1
This area was generated with Dilib version V0.6.33.