COSINE: non-seeding method for mapping long noisy sequences
Internal identifier: 000918 (Pmc/Checkpoint); previous: 000917; next: 000919
Authors: Pegah Tootoonchi Afshar [United States]; Wing Hung Wong [United States]
Source: Nucleic Acids Research [0305-1048]; 2017.
Abstract
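Third generation sequencing (TGS) are highly promising technologies but the long and noisy reads from TGS are difficult to align using existing algorithms. Here, we present COSINE, a conceptually new method designed specifically for aligning long reads contaminated by a high level of errors. COSINE computes the context similarity of two stretches of nucleobases given the similarity over distributions of their short k-mers (k = 3–4) along the sequences. The results on simulated and real data show that COSINE achieves high sensitivity and specificity under a wide range of read accuracies. When the error rate is high, COSINE can offer substantial advantages over existing alignment methods.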
DOI: 10.1093/nar/gkx511
PubMed: 28586438
PubMed Central: 5737678
The document in XML format
<record><TEI><teiHeader><fileDesc><titleStmt><title xml:lang="en">COSINE: non-seeding method for mapping long noisy sequences</title>
<author><name sortKey="Afshar, Pegah Tootoonchi" sort="Afshar, Pegah Tootoonchi" uniqKey="Afshar P" first="Pegah Tootoonchi" last="Afshar">Pegah Tootoonchi Afshar</name>
<affiliation wicri:level="2"><nlm:aff id="AFF1">Department of Electrical Engineering, School of Engineering, Stanford University, Stanford, CA 94305, USA</nlm:aff>
<country xml:lang="fr">États-Unis</country>
<wicri:regionArea>Department of Electrical Engineering, School of Engineering, Stanford University, Stanford, CA 94305</wicri:regionArea>
<placeName><region type="state">Californie</region>
</placeName>
</affiliation>
</author>
<author><name sortKey="Wong, Wing Hung" sort="Wong, Wing Hung" uniqKey="Wong W" first="Wing Hung" last="Wong">Wing Hung Wong</name>
<affiliation wicri:level="2"><nlm:aff id="AFF2">Department of Statistics and Department of Biomedical Data Science, Stanford University, Stanford, CA 94305, USA</nlm:aff>
<country xml:lang="fr">États-Unis</country>
<wicri:regionArea>Department of Statistics and Department of Biomedical Data Science, Stanford University, Stanford, CA 94305</wicri:regionArea>
<placeName><region type="state">Californie</region>
</placeName>
</affiliation>
</author>
</titleStmt>
<publicationStmt><idno type="wicri:source">PMC</idno>
<idno type="pmid">28586438</idno>
<idno type="pmc">5737678</idno>
<idno type="url">http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5737678</idno>
<idno type="RBID">PMC:5737678</idno>
<idno type="doi">10.1093/nar/gkx511</idno>
<date when="2017">2017</date>
<idno type="wicri:Area/Pmc/Corpus">000F60</idno>
<idno type="wicri:explorRef" wicri:stream="Pmc" wicri:step="Corpus" wicri:corpus="PMC">000F60</idno>
<idno type="wicri:Area/Pmc/Curation">000F60</idno>
<idno type="wicri:explorRef" wicri:stream="Pmc" wicri:step="Curation">000F60</idno>
<idno type="wicri:Area/Pmc/Checkpoint">000918</idno>
<idno type="wicri:explorRef" wicri:stream="Pmc" wicri:step="Checkpoint">000918</idno>
</publicationStmt>
<sourceDesc><biblStruct><analytic><title xml:lang="en" level="a" type="main">COSINE: non-seeding method for mapping long noisy sequences</title>
<author><name sortKey="Afshar, Pegah Tootoonchi" sort="Afshar, Pegah Tootoonchi" uniqKey="Afshar P" first="Pegah Tootoonchi" last="Afshar">Pegah Tootoonchi Afshar</name>
<affiliation wicri:level="2"><nlm:aff id="AFF1">Department of Electrical Engineering, School of Engineering, Stanford University, Stanford, CA 94305, USA</nlm:aff>
<country xml:lang="fr">États-Unis</country>
<wicri:regionArea>Department of Electrical Engineering, School of Engineering, Stanford University, Stanford, CA 94305</wicri:regionArea>
<placeName><region type="state">Californie</region>
</placeName>
</affiliation>
</author>
<author><name sortKey="Wong, Wing Hung" sort="Wong, Wing Hung" uniqKey="Wong W" first="Wing Hung" last="Wong">Wing Hung Wong</name>
<affiliation wicri:level="2"><nlm:aff id="AFF2">Department of Statistics and Department of Biomedical Data Science, Stanford University, Stanford, CA 94305, USA</nlm:aff>
<country xml:lang="fr">États-Unis</country>
<wicri:regionArea>Department of Statistics and Department of Biomedical Data Science, Stanford University, Stanford, CA 94305</wicri:regionArea>
<placeName><region type="state">Californie</region>
</placeName>
</affiliation>
</author>
</analytic>
<series><title level="j">Nucleic Acids Research</title>
<idno type="ISSN">0305-1048</idno>
<idno type="eISSN">1362-4962</idno>
<imprint><date when="2017">2017</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
</fileDesc>
<profileDesc><textClass></textClass>
</profileDesc>
</teiHeader>
<front><div type="abstract" xml:lang="en"><title>Abstract</title>
<p>Third generation sequencing (TGS) are highly promising technologies but the long and noisy reads from TGS are difficult to align using existing algorithms. Here, we present COSINE, a conceptually new method designed specifically for aligning long reads contaminated by a high level of errors. COSINE computes the context similarity of two stretches of nucleobases given the similarity over distributions of their short <italic>k</italic>-mers (<italic>k</italic> = 3–4) along the sequences. The results on simulated and real data show that COSINE achieves high sensitivity and specificity under a wide range of read accuracies. When the error rate is high, COSINE can offer substantial advantages over existing alignment methods.</p>
</div>
</front>
</TEI>
<pmc article-type="research-article"><pmc-dir>properties open_access</pmc-dir>
<front><journal-meta><journal-id journal-id-type="nlm-ta">Nucleic Acids Res</journal-id>
<journal-id journal-id-type="iso-abbrev">Nucleic Acids Res</journal-id>
<journal-id journal-id-type="publisher-id">nar</journal-id>
<journal-title-group><journal-title>Nucleic Acids Research</journal-title>
</journal-title-group>
<issn pub-type="ppub">0305-1048</issn>
<issn pub-type="epub">1362-4962</issn>
<publisher><publisher-name>Oxford University Press</publisher-name>
</publisher>
</journal-meta>
<article-meta><article-id pub-id-type="pmid">28586438</article-id>
<article-id pub-id-type="pmc">5737678</article-id>
<article-id pub-id-type="doi">10.1093/nar/gkx511</article-id>
<article-id pub-id-type="publisher-id">gkx511</article-id>
<article-categories><subj-group subj-group-type="heading"><subject>Methods Online</subject>
</subj-group>
</article-categories>
<title-group><article-title>COSINE: non-seeding method for mapping long noisy sequences</article-title>
</title-group>
<contrib-group><contrib contrib-type="author"><name><surname>Afshar</surname>
<given-names>Pegah Tootoonchi</given-names>
</name>
<xref ref-type="aff" rid="AFF1">1</xref>
</contrib>
<contrib contrib-type="author"><name><surname>Wong</surname>
<given-names>Wing Hung</given-names>
</name>
<pmc-comment>whwong@stanford.edu </pmc-comment>
<xref ref-type="aff" rid="AFF2">2</xref>
<xref ref-type="corresp" rid="COR1"></xref>
</contrib>
</contrib-group>
<aff id="AFF1"><label>1</label>
Department of Electrical Engineering, School of Engineering, Stanford University, Stanford, CA 94305, USA</aff>
<aff id="AFF2"><label>2</label>
Department of Statistics and Department of Biomedical Data Science, Stanford University, Stanford, CA 94305, USA</aff>
<author-notes><corresp id="COR1"><label>*</label>
To whom correspondence should be addressed. Tel: +1 650 725 2915; Fax: +1 650 725 8977; Email: <email>whwong@stanford.edu</email>
</corresp>
</author-notes>
<pub-date pub-type="ppub"><day>21</day>
<month>8</month>
<year>2017</year>
</pub-date>
<pub-date pub-type="epub" iso-8601-date="2017-06-06"><day>06</day>
<month>6</month>
<year>2017</year>
</pub-date>
<pub-date pub-type="pmc-release"><day>06</day>
<month>6</month>
<year>2017</year>
</pub-date>
<pmc-comment> PMC Release delay is 0 months and 0 days and was based on the . </pmc-comment>
<volume>45</volume>
<issue>14</issue>
<fpage>e132</fpage>
<lpage>e132</lpage>
<history><date date-type="accepted"><day>04</day>
<month>6</month>
<year>2017</year>
</date>
<date date-type="rev-recd"><day>16</day>
<month>5</month>
<year>2017</year>
</date>
<date date-type="received"><day>28</day>
<month>8</month>
<year>2016</year>
</date>
</history>
<permissions><copyright-statement>© The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.</copyright-statement>
<copyright-year>2017</copyright-year>
<license license-type="cc-by-nc" xlink:href="http://creativecommons.org/licenses/by-nc/4.0/"><license-p>This is an Open Access article distributed under the terms of the Creative Commons Attribution License (<uri xlink:href="http://creativecommons.org/licenses/by-nc/4.0/">http://creativecommons.org/licenses/by-nc/4.0/</uri>), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact <email>journals.permissions@oup.com</email>
</license-p>
</license>
</permissions>
<self-uri xlink:href="gkx511.pdf"></self-uri>
<abstract><title>Abstract</title>
<p>Third generation sequencing (TGS) are highly promising technologies but the long and noisy reads from TGS are difficult to align using existing algorithms. Here, we present COSINE, a conceptually new method designed specifically for aligning long reads contaminated by a high level of errors. COSINE computes the context similarity of two stretches of nucleobases given the similarity over distributions of their short <italic>k</italic>-mers (<italic>k</italic> = 3–4) along the sequences. The results on simulated and real data show that COSINE achieves high sensitivity and specificity under a wide range of read accuracies. When the error rate is high, COSINE can offer substantial advantages over existing alignment methods.</p>
</abstract>
<counts><page-count count="13"></page-count>
</counts>
</article-meta>
</front>
</pmc>
<affiliations><list><country><li>États-Unis</li>
</country>
<region><li>Californie</li>
</region>
</list>
<tree><country name="États-Unis"><region name="Californie"><name sortKey="Afshar, Pegah Tootoonchi" sort="Afshar, Pegah Tootoonchi" uniqKey="Afshar P" first="Pegah Tootoonchi" last="Afshar">Pegah Tootoonchi Afshar</name>
</region>
<name sortKey="Wong, Wing Hung" sort="Wong, Wing Hung" uniqKey="Wong W" first="Wing Hung" last="Wong">Wing Hung Wong</name>
</country>
</tree>
</affiliations>
</record>
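The abstract in the record above describes COSINE as comparing two stretches of sequence through the distributions of their short k-mers (k = 3–4). The following is only a minimal sketch of that general idea, not the authors' implementation: it assumes cosine similarity between raw k-mer count vectors as the comparison measure, which the abstract does not specify, and the function names `kmer_counts`, `cosine_similarity` and `kmer_similarity` are hypothetical.

```python
from collections import Counter
from math import sqrt


def kmer_counts(seq: str, k: int = 3) -> Counter:
    """Count every overlapping k-mer in seq (empty if seq is shorter than k)."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine of the angle between two sparse count vectors."""
    # Only k-mers present in both vectors contribute to the dot product.
    dot = sum(a[key] * b[key] for key in a.keys() & b.keys())
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # assumption: an empty profile is treated as dissimilar
    return dot / (norm_a * norm_b)


def kmer_similarity(x: str, y: str, k: int = 3) -> float:
    """Hypothetical helper: similarity of two sequences via their k-mer counts."""
    return cosine_similarity(kmer_counts(x, k), kmer_counts(y, k))


if __name__ == "__main__":
    # A single substitution lowers the score without driving it to zero.
    print(kmer_similarity("ACGTACGT", "ACGTACGT"))
    print(kmer_similarity("ACGTACGT", "ACGTTCGT"))
```

Because the score depends on k-mer composition rather than on exact matches at fixed positions, a few scattered errors change only a few vector entries, which is the intuition behind avoiding exact-match seeds on noisy reads.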
To manipulate this document under Unix (Dilib)
EXPLOR_STEP=$WICRI_ROOT/Sante/explor/MersV1/Data/Pmc/Checkpoint
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000918 | SxmlIndent | more
Or
HfdSelect -h $EXPLOR_AREA/Data/Pmc/Checkpoint/biblio.hfd -nk 000918 | SxmlIndent | more
To link to this page within the Wicri network
{{Explor lien |wiki= Sante |area= MersV1 |flux= Pmc |étape= Checkpoint |type= RBID |clé= PMC:5737678 |texte= COSINE: non-seeding method for mapping long noisy sequences }}
To generate wiki pages
HfdIndexSelect -h $EXPLOR_AREA/Data/Pmc/Checkpoint/RBID.i -Sk "pubmed:28586438" \
       | HfdSelect -Kh $EXPLOR_AREA/Data/Pmc/Checkpoint/biblio.hfd \
       | NlmPubMed2Wicri -a MersV1
This area was generated with Dilib version V0.6.33.