MERS exploration server

Warning: this site is under development!
Warning: this site was generated automatically from raw corpora.
The information has therefore not been validated.

COSINE: non-seeding method for mapping long noisy sequences.

Internal identifier: 000D74 (PubMed/Checkpoint); previous: 000D73; next: 000D75

Authors: Pegah Tootoonchi Afshar [United States]; Wing Hung Wong [United States]

Source: Nucleic acids research, 2017, vol. 45, no. 14, e132

RBID: pubmed:28586438

Abstract

Third generation sequencing (TGS) are highly promising technologies but the long and noisy reads from TGS are difficult to align using existing algorithms. Here, we present COSINE, a conceptually new method designed specifically for aligning long reads contaminated by a high level of errors. COSINE computes the context similarity of two stretches of nucleobases given the similarity over distributions of their short k-mers (k = 3-4) along the sequences. The results on simulated and real data show that COSINE achieves high sensitivity and specificity under a wide range of read accuracies. When the error rate is high, COSINE can offer substantial advantages over existing alignment methods.
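
The abstract describes comparing two stretches of sequence through the distributions of their short k-mers. The following is a minimal illustrative sketch of that general idea, not the authors' implementation: it builds a normalized 3-mer frequency profile for each sequence and compares the profiles with cosine similarity. The choice of cosine similarity, the function names, and the toy sequences are all assumptions made for illustration; the abstract does not specify the similarity measure or any windowing scheme.

```python
from collections import Counter
from itertools import product
from math import sqrt


def kmer_profile(seq: str, k: int = 3) -> list:
    """Normalized frequency of every DNA k-mer in seq, in a fixed ACGT order."""
    n_windows = max(len(seq) - k + 1, 1)
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    all_kmers = ("".join(p) for p in product("ACGT", repeat=k))
    return [counts[kmer] / n_windows for kmer in all_kmers]


def cosine_similarity(u: list, v: list) -> float:
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


# Hypothetical toy sequences, not data from the paper.
read = "ACGTTGCAACGTAGCTAGGCTA"
noisy = "ACGTTGCACGTAGCTTAGGCTA"      # one deletion and one insertion relative to read
unrelated = "GGGGGGCCCCCCGGGGGGCCCC"

sim_noisy = cosine_similarity(kmer_profile(read), kmer_profile(noisy))
sim_unrelated = cosine_similarity(kmer_profile(read), kmer_profile(unrelated))
print(f"read vs noisy copy: {sim_noisy:.3f}")
print(f"read vs unrelated:  {sim_unrelated:.3f}")
```

Because an insertion or deletion disturbs only the few k-mers that overlap it, the profile of the noisy copy stays close to that of the original, while the unrelated sequence shares almost no k-mers with it.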

DOI: 10.1093/nar/gkx511
PubMed: 28586438


The document in XML format

<record>
<TEI>
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en">COSINE: non-seeding method for mapping long noisy sequences.</title>
<author>
<name sortKey="Afshar, Pegah Tootoonchi" sort="Afshar, Pegah Tootoonchi" uniqKey="Afshar P" first="Pegah Tootoonchi" last="Afshar">Pegah Tootoonchi Afshar</name>
<affiliation wicri:level="2">
<nlm:affiliation>Department of Electrical Engineering, School of Engineering, Stanford University, Stanford, CA 94305, USA.</nlm:affiliation>
<country xml:lang="fr">États-Unis</country>
<wicri:regionArea>Department of Electrical Engineering, School of Engineering, Stanford University, Stanford, CA 94305</wicri:regionArea>
<placeName>
<region type="state">Californie</region>
</placeName>
</affiliation>
</author>
<author>
<name sortKey="Wong, Wing Hung" sort="Wong, Wing Hung" uniqKey="Wong W" first="Wing Hung" last="Wong">Wing Hung Wong</name>
<affiliation wicri:level="2">
<nlm:affiliation>Department of Statistics and Department of Biomedical Data Science, Stanford University, Stanford, CA 94305, USA.</nlm:affiliation>
<country xml:lang="fr">États-Unis</country>
<wicri:regionArea>Department of Statistics and Department of Biomedical Data Science, Stanford University, Stanford, CA 94305</wicri:regionArea>
<placeName>
<region type="state">Californie</region>
</placeName>
</affiliation>
</author>
</titleStmt>
<publicationStmt>
<idno type="wicri:source">PubMed</idno>
<date when="2017">2017</date>
<idno type="RBID">pubmed:28586438</idno>
<idno type="pmid">28586438</idno>
<idno type="doi">10.1093/nar/gkx511</idno>
<idno type="wicri:Area/PubMed/Corpus">000C69</idno>
<idno type="wicri:explorRef" wicri:stream="PubMed" wicri:step="Corpus" wicri:corpus="PubMed">000C69</idno>
<idno type="wicri:Area/PubMed/Curation">000C69</idno>
<idno type="wicri:explorRef" wicri:stream="PubMed" wicri:step="Curation">000C69</idno>
<idno type="wicri:Area/PubMed/Checkpoint">000D74</idno>
<idno type="wicri:explorRef" wicri:stream="Checkpoint" wicri:step="PubMed">000D74</idno>
</publicationStmt>
<sourceDesc>
<biblStruct>
<analytic>
<title xml:lang="en">COSINE: non-seeding method for mapping long noisy sequences.</title>
<author>
<name sortKey="Afshar, Pegah Tootoonchi" sort="Afshar, Pegah Tootoonchi" uniqKey="Afshar P" first="Pegah Tootoonchi" last="Afshar">Pegah Tootoonchi Afshar</name>
<affiliation wicri:level="2">
<nlm:affiliation>Department of Electrical Engineering, School of Engineering, Stanford University, Stanford, CA 94305, USA.</nlm:affiliation>
<country xml:lang="fr">États-Unis</country>
<wicri:regionArea>Department of Electrical Engineering, School of Engineering, Stanford University, Stanford, CA 94305</wicri:regionArea>
<placeName>
<region type="state">Californie</region>
</placeName>
</affiliation>
</author>
<author>
<name sortKey="Wong, Wing Hung" sort="Wong, Wing Hung" uniqKey="Wong W" first="Wing Hung" last="Wong">Wing Hung Wong</name>
<affiliation wicri:level="2">
<nlm:affiliation>Department of Statistics and Department of Biomedical Data Science, Stanford University, Stanford, CA 94305, USA.</nlm:affiliation>
<country xml:lang="fr">États-Unis</country>
<wicri:regionArea>Department of Statistics and Department of Biomedical Data Science, Stanford University, Stanford, CA 94305</wicri:regionArea>
<placeName>
<region type="state">Californie</region>
</placeName>
</affiliation>
</author>
</analytic>
<series>
<title level="j">Nucleic acids research</title>
<idno type="eISSN">1362-4962</idno>
<imprint>
<date when="2017" type="published">2017</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
</fileDesc>
<profileDesc>
<textClass>
<keywords scheme="KwdEn" xml:lang="en">
<term>Algorithms</term>
<term>Base Sequence</term>
<term>Computational Biology (methods)</term>
<term>High-Throughput Nucleotide Sequencing (methods)</term>
<term>High-Throughput Nucleotide Sequencing (statistics &amp; numerical data)</term>
<term>Reproducibility of Results</term>
<term>Sequence Alignment (methods)</term>
<term>Software</term>
</keywords>
<keywords scheme="KwdFr" xml:lang="fr">
<term>Algorithmes</term>
<term>Alignement de séquences</term>
<term>Biologie informatique</term>
<term>Logiciel</term>
<term>Reproductibilité des résultats</term>
<term>Séquence nucléotidique</term>
<term>Séquençage nucléotidique à haut débit</term>
</keywords>
<keywords scheme="MESH" qualifier="methods" xml:lang="en">
<term>Computational Biology</term>
<term>High-Throughput Nucleotide Sequencing</term>
<term>Sequence Alignment</term>
</keywords>
<keywords scheme="MESH" qualifier="statistics &amp; numerical data" xml:lang="en">
<term>High-Throughput Nucleotide Sequencing</term>
</keywords>
<keywords scheme="MESH" xml:lang="en">
<term>Algorithms</term>
<term>Base Sequence</term>
<term>Reproducibility of Results</term>
<term>Software</term>
</keywords>
<keywords scheme="MESH" xml:lang="fr">
<term>Algorithmes</term>
<term>Alignement de séquences</term>
<term>Biologie informatique</term>
<term>Logiciel</term>
<term>Reproductibilité des résultats</term>
<term>Séquence nucléotidique</term>
<term>Séquençage nucléotidique à haut débit</term>
</keywords>
</textClass>
</profileDesc>
</teiHeader>
<front>
<div type="abstract" xml:lang="en">Third generation sequencing (TGS) are highly promising technologies but the long and noisy reads from TGS are difficult to align using existing algorithms. Here, we present COSINE, a conceptually new method designed specifically for aligning long reads contaminated by a high level of errors. COSINE computes the context similarity of two stretches of nucleobases given the similarity over distributions of their short k-mers (k = 3-4) along the sequences. The results on simulated and real data show that COSINE achieves high sensitivity and specificity under a wide range of read accuracies. When the error rate is high, COSINE can offer substantial advantages over existing alignment methods.</div>
</front>
</TEI>
<pubmed>
<MedlineCitation Status="MEDLINE" Owner="NLM">
<PMID Version="1">28586438</PMID>
<DateCompleted>
<Year>2017</Year>
<Month>10</Month>
<Day>23</Day>
</DateCompleted>
<DateRevised>
<Year>2020</Year>
<Month>03</Month>
<Day>06</Day>
</DateRevised>
<Article PubModel="Print">
<Journal>
<ISSN IssnType="Electronic">1362-4962</ISSN>
<JournalIssue CitedMedium="Internet">
<Volume>45</Volume>
<Issue>14</Issue>
<PubDate>
<Year>2017</Year>
<Month>Aug</Month>
<Day>21</Day>
</PubDate>
</JournalIssue>
<Title>Nucleic acids research</Title>
<ISOAbbreviation>Nucleic Acids Res.</ISOAbbreviation>
</Journal>
<ArticleTitle>COSINE: non-seeding method for mapping long noisy sequences.</ArticleTitle>
<Pagination>
<MedlinePgn>e132</MedlinePgn>
</Pagination>
<ELocationID EIdType="doi" ValidYN="Y">10.1093/nar/gkx511</ELocationID>
<Abstract>
<AbstractText>Third generation sequencing (TGS) are highly promising technologies but the long and noisy reads from TGS are difficult to align using existing algorithms. Here, we present COSINE, a conceptually new method designed specifically for aligning long reads contaminated by a high level of errors. COSINE computes the context similarity of two stretches of nucleobases given the similarity over distributions of their short k-mers (k = 3-4) along the sequences. The results on simulated and real data show that COSINE achieves high sensitivity and specificity under a wide range of read accuracies. When the error rate is high, COSINE can offer substantial advantages over existing alignment methods.</AbstractText>
<CopyrightInformation>© The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.</CopyrightInformation>
</Abstract>
<AuthorList CompleteYN="Y">
<Author ValidYN="Y">
<LastName>Afshar</LastName>
<ForeName>Pegah Tootoonchi</ForeName>
<Initials>PT</Initials>
<AffiliationInfo>
<Affiliation>Department of Electrical Engineering, School of Engineering, Stanford University, Stanford, CA 94305, USA.</Affiliation>
</AffiliationInfo>
</Author>
<Author ValidYN="Y">
<LastName>Wong</LastName>
<ForeName>Wing Hung</ForeName>
<Initials>WH</Initials>
<AffiliationInfo>
<Affiliation>Department of Statistics and Department of Biomedical Data Science, Stanford University, Stanford, CA 94305, USA.</Affiliation>
</AffiliationInfo>
</Author>
</AuthorList>
<Language>eng</Language>
<GrantList CompleteYN="Y">
<Grant>
<GrantID>R01 HG007834</GrantID>
<Acronym>HG</Acronym>
<Agency>NHGRI NIH HHS</Agency>
<Country>United States</Country>
</Grant>
</GrantList>
<PublicationTypeList>
<PublicationType UI="D016428">Journal Article</PublicationType>
</PublicationTypeList>
</Article>
<MedlineJournalInfo>
<Country>England</Country>
<MedlineTA>Nucleic Acids Res</MedlineTA>
<NlmUniqueID>0411011</NlmUniqueID>
<ISSNLinking>0305-1048</ISSNLinking>
</MedlineJournalInfo>
<CitationSubset>IM</CitationSubset>
<MeshHeadingList>
<MeshHeading>
<DescriptorName UI="D000465" MajorTopicYN="Y">Algorithms</DescriptorName>
</MeshHeading>
<MeshHeading>
<DescriptorName UI="D001483" MajorTopicYN="N">Base Sequence</DescriptorName>
</MeshHeading>
<MeshHeading>
<DescriptorName UI="D019295" MajorTopicYN="N">Computational Biology</DescriptorName>
<QualifierName UI="Q000379" MajorTopicYN="Y">methods</QualifierName>
</MeshHeading>
<MeshHeading>
<DescriptorName UI="D059014" MajorTopicYN="N">High-Throughput Nucleotide Sequencing</DescriptorName>
<QualifierName UI="Q000379" MajorTopicYN="N">methods</QualifierName>
<QualifierName UI="Q000706" MajorTopicYN="N">statistics &amp; numerical data</QualifierName>
</MeshHeading>
<MeshHeading>
<DescriptorName UI="D015203" MajorTopicYN="N">Reproducibility of Results</DescriptorName>
</MeshHeading>
<MeshHeading>
<DescriptorName UI="D016415" MajorTopicYN="N">Sequence Alignment</DescriptorName>
<QualifierName UI="Q000379" MajorTopicYN="Y">methods</QualifierName>
</MeshHeading>
<MeshHeading>
<DescriptorName UI="D012984" MajorTopicYN="Y">Software</DescriptorName>
</MeshHeading>
</MeshHeadingList>
</MedlineCitation>
<PubmedData>
<History>
<PubMedPubDate PubStatus="received">
<Year>2016</Year>
<Month>08</Month>
<Day>28</Day>
</PubMedPubDate>
<PubMedPubDate PubStatus="accepted">
<Year>2017</Year>
<Month>06</Month>
<Day>04</Day>
</PubMedPubDate>
<PubMedPubDate PubStatus="pubmed">
<Year>2017</Year>
<Month>6</Month>
<Day>7</Day>
<Hour>6</Hour>
<Minute>0</Minute>
</PubMedPubDate>
<PubMedPubDate PubStatus="medline">
<Year>2017</Year>
<Month>10</Month>
<Day>24</Day>
<Hour>6</Hour>
<Minute>0</Minute>
</PubMedPubDate>
<PubMedPubDate PubStatus="entrez">
<Year>2017</Year>
<Month>6</Month>
<Day>7</Day>
<Hour>6</Hour>
<Minute>0</Minute>
</PubMedPubDate>
</History>
<PublicationStatus>ppublish</PublicationStatus>
<ArticleIdList>
<ArticleId IdType="pubmed">28586438</ArticleId>
<ArticleId IdType="pii">3861609</ArticleId>
<ArticleId IdType="doi">10.1093/nar/gkx511</ArticleId>
<ArticleId IdType="pmc">PMC5737678</ArticleId>
</ArticleIdList>
<ReferenceList>
<Reference>
<Citation>J Biomol Tech. 2005 Dec;16(4):453-8</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">16522868</ArticleId>
</ArticleIdList>
</Reference>
</ReferenceList>
<ReferenceList>
<Reference>
<Citation>Gigascience. 2014 Oct 20;3:22</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">25386338</ArticleId>
</ArticleIdList>
</Reference>
</ReferenceList>
<ReferenceList>
<Reference>
<Citation>Bioinformatics. 2013 Jan 1;29(1):119-21</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">23129296</ArticleId>
</ArticleIdList>
</Reference>
</ReferenceList>
<ReferenceList>
<Reference>
<Citation>Cancers (Basel). 2015 Sep 23;7(3):1925-58</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">26404381</ArticleId>
</ArticleIdList>
</Reference>
</ReferenceList>
<ReferenceList>
<Reference>
<Citation>Biomol Detect Quantif. 2015 Mar;3:1-8</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">26753127</ArticleId>
</ArticleIdList>
</Reference>
</ReferenceList>
<ReferenceList>
<Reference>
<Citation>Cell. 2013 Sep 26;155(1):27-38</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">24074859</ArticleId>
</ArticleIdList>
</Reference>
</ReferenceList>
<ReferenceList>
<Reference>
<Citation>J Genet Genomics. 2011 Mar 20;38(3):95-109</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">21477781</ArticleId>
</ArticleIdList>
</Reference>
</ReferenceList>
<ReferenceList>
<Reference>
<Citation>Biochim Biophys Acta. 2014 Oct;1842(10):1932-1941</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">24995601</ArticleId>
</ArticleIdList>
</Reference>
</ReferenceList>
<ReferenceList>
<Reference>
<Citation>J Biomed Biotechnol. 2012;2012:251364</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">22829749</ArticleId>
</ArticleIdList>
</Reference>
</ReferenceList>
<ReferenceList>
<Reference>
<Citation>Nucleic Acids Res. 2002 Jul 15;30(14):3059-66</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">12136088</ArticleId>
</ArticleIdList>
</Reference>
</ReferenceList>
<ReferenceList>
<Reference>
<Citation>Brief Bioinform. 2016 Jan;17(1):154-79</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">26026159</ArticleId>
</ArticleIdList>
</Reference>
</ReferenceList>
<ReferenceList>
<Reference>
<Citation>Nucleic Acids Res. 1982 Jan 11;10(1):133-9</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">6174932</ArticleId>
</ArticleIdList>
</Reference>
</ReferenceList>
<ReferenceList>
<Reference>
<Citation>Genome Res. 2011 Mar;21(3):487-93</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">21209072</ArticleId>
</ArticleIdList>
</Reference>
</ReferenceList>
<ReferenceList>
<Reference>
<Citation>J Comput Biol. 2002;9(1):23-33</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">11911793</ArticleId>
</ArticleIdList>
</Reference>
</ReferenceList>
<ReferenceList>
<Reference>
<Citation>Nat Biotechnol. 2008 Oct;26(10):1135-45</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">18846087</ArticleId>
</ArticleIdList>
</Reference>
</ReferenceList>
<ReferenceList>
<Reference>
<Citation>Brief Bioinform. 2017 Nov 1;18(6):940-953</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">27559152</ArticleId>
</ArticleIdList>
</Reference>
</ReferenceList>
<ReferenceList>
<Reference>
<Citation>Bioinformatics. 2014 Dec 1;30(23):3399-401</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">25143291</ArticleId>
</ArticleIdList>
</Reference>
</ReferenceList>
<ReferenceList>
<Reference>
<Citation>BMC Bioinformatics. 2012 Sep 19;13:238</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">22988817</ArticleId>
</ArticleIdList>
</Reference>
</ReferenceList>
<ReferenceList>
<Reference>
<Citation>F1000Res. 2015 Oct 15;4:1075</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">26834992</ArticleId>
</ArticleIdList>
</Reference>
</ReferenceList>
</PubmedData>
</pubmed>
<affiliations>
<list>
<country>
<li>États-Unis</li>
</country>
<region>
<li>Californie</li>
</region>
</list>
<tree>
<country name="États-Unis">
<region name="Californie">
<name sortKey="Afshar, Pegah Tootoonchi" sort="Afshar, Pegah Tootoonchi" uniqKey="Afshar P" first="Pegah Tootoonchi" last="Afshar">Pegah Tootoonchi Afshar</name>
</region>
<name sortKey="Wong, Wing Hung" sort="Wong, Wing Hung" uniqKey="Wong W" first="Wing Hung" last="Wong">Wing Hung Wong</name>
</country>
</tree>
</affiliations>
</record>
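
As an illustration of how the identifiers in this record could be read programmatically, here is a minimal sketch using Python's standard `xml.etree.ElementTree`. It parses only the `<ArticleIdList>` fragment, copied from the `<PubmedData>` section, because the full record uses the `wicri:` and `nlm:` prefixes without declaring them in this excerpt, which a namespace-aware parser would reject. The variable names are assumptions chosen for the example.

```python
import xml.etree.ElementTree as ET

# Fragment copied from the <PubmedData> section of the record.
fragment = """
<ArticleIdList>
  <ArticleId IdType="pubmed">28586438</ArticleId>
  <ArticleId IdType="pii">3861609</ArticleId>
  <ArticleId IdType="doi">10.1093/nar/gkx511</ArticleId>
  <ArticleId IdType="pmc">PMC5737678</ArticleId>
</ArticleIdList>
"""

root = ET.fromstring(fragment)

# Map each identifier type (pubmed, pii, doi, pmc) to its value.
article_ids = {el.get("IdType"): el.text for el in root.findall("ArticleId")}

for id_type, value in article_ids.items():
    print(f"{id_type}: {value}")
```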

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Sante/explor/MersV1/Data/PubMed/Checkpoint
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000D74 | SxmlIndent | more

Or

HfdSelect -h $EXPLOR_AREA/Data/PubMed/Checkpoint/biblio.hfd -nk 000D74 | SxmlIndent | more

To link to this page within the Wicri network

{{Explor lien
   |wiki=    Sante
   |area=    MersV1
   |flux=    PubMed
   |étape=   Checkpoint
   |type=    RBID
   |clé=     pubmed:28586438
   |texte=   COSINE: non-seeding method for mapping long noisy sequences.
}}
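
If many such links have to be produced, the template text can be generated rather than typed by hand. The sketch below is a hypothetical helper, not part of Dilib or Wicri: the function name `explor_link` and its parameter names are assumptions, and only the template field names (`wiki`, `area`, `flux`, `étape`, `type`, `clé`, `texte`) come from the block above.

```python
def explor_link(wiki: str, area: str, flux: str, etape: str,
                key_type: str, key: str, text: str) -> str:
    """Render an {{Explor lien}} template call with values aligned in one column."""
    fields = [
        ("wiki", wiki),
        ("area", area),
        ("flux", flux),
        ("étape", etape),
        ("type", key_type),
        ("clé", key),
        ("texte", text),
    ]
    lines = ["{{Explor lien"]
    for name, value in fields:
        # Pad "   |name=" to 13 characters so every value starts in the same column.
        lines.append(f"   |{name}=".ljust(13) + value)
    lines.append("}}")
    return "\n".join(lines)


link = explor_link(
    wiki="Sante",
    area="MersV1",
    flux="PubMed",
    etape="Checkpoint",
    key_type="RBID",
    key="pubmed:28586438",
    text="COSINE: non-seeding method for mapping long noisy sequences.",
)
print(link)
```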

To generate wiki pages

HfdIndexSelect -h $EXPLOR_AREA/Data/PubMed/Checkpoint/RBID.i   -Sk "pubmed:28586438" \
       | HfdSelect -Kh $EXPLOR_AREA/Data/PubMed/Checkpoint/biblio.hfd   \
       | NlmPubMed2Wicri -a MersV1 

Wicri

This area was generated with Dilib version V0.6.33.
Data generation: Mon Apr 20 23:26:43 2020. Site generation: Sat Mar 27 09:06:09 2021