MERS Exploration Server

Warning: this site is under development!
Warning: this site was generated automatically from raw corpora.
The information has therefore not been validated.

SlideSort: all pairs similarity search for short reads.

Internal identifier: 001F20 (PubMed/Corpus); previous: 001F19; next: 001F21

Authors: Kana Shimizu; Koji Tsuda

Source:

RBID : pubmed:21148542

English descriptors

Abstract

Recent progress in DNA sequencing technologies calls for fast and accurate algorithms that can evaluate sequence similarity for a huge amount of short reads. Searching similar pairs from a string pool is a fundamental process of de novo genome assembly, genome-wide alignment and other important analyses.
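
To make the task named in the abstract concrete, the following is a minimal sketch of a brute-force all-pairs search under edit distance. It is not the SlideSort algorithm, which the record describes as narrowing the search with chains of common k-mers; it is only the naive baseline that such methods aim to improve on. The function names `edit_distance` and `similar_pairs` and the `max_dist` parameter are assumptions made for illustration and do not come from the SlideSort software.

```python
from itertools import combinations


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance via the classic two-row dynamic program."""
    prev = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,           # delete ca
                            curr[j - 1] + 1,       # insert cb
                            prev[j - 1] + cost))   # match or substitute
        prev = curr
    return prev[-1]


def similar_pairs(reads, max_dist):
    """Hypothetical helper: index pairs (i, j), i < j, within max_dist edits.

    Every pair is compared, so the cost grows quadratically with the
    number of reads; this is the cost that filtering methods try to avoid.
    """
    return [(i, j)
            for (i, a), (j, b) in combinations(enumerate(reads), 2)
            if edit_distance(a, b) <= max_dist]


if __name__ == "__main__":
    reads = ["ACGT", "ACGA", "TTTT", "ACG"]
    print(similar_pairs(reads, max_dist=1))
```

Because this sketch computes one edit distance per pair, it is only practical for small pools; it is meant to show what "all pairs similarity search" returns, not how to scale it.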

DOI: 10.1093/bioinformatics/btq677
PubMed: 21148542

Links to Exploration step

pubmed:21148542

The document in XML format

<record>
<TEI>
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en">SlideSort: all pairs similarity search for short reads.</title>
<author>
<name sortKey="Shimizu, Kana" sort="Shimizu, Kana" uniqKey="Shimizu K" first="Kana" last="Shimizu">Kana Shimizu</name>
<affiliation>
<nlm:affiliation>Computational Biology Research Center, National Institute of Advanced Industrial Science and Technology, Japan.</nlm:affiliation>
</affiliation>
</author>
<author>
<name sortKey="Tsuda, Koji" sort="Tsuda, Koji" uniqKey="Tsuda K" first="Koji" last="Tsuda">Koji Tsuda</name>
</author>
</titleStmt>
<publicationStmt>
<idno type="wicri:source">PubMed</idno>
<date when="2011">2011</date>
<idno type="RBID">pubmed:21148542</idno>
<idno type="pmid">21148542</idno>
<idno type="doi">10.1093/bioinformatics/btq677</idno>
<idno type="wicri:Area/PubMed/Corpus">001F20</idno>
<idno type="wicri:explorRef" wicri:stream="PubMed" wicri:step="Corpus" wicri:corpus="PubMed">001F20</idno>
</publicationStmt>
<sourceDesc>
<biblStruct>
<analytic>
<title xml:lang="en">SlideSort: all pairs similarity search for short reads.</title>
<author>
<name sortKey="Shimizu, Kana" sort="Shimizu, Kana" uniqKey="Shimizu K" first="Kana" last="Shimizu">Kana Shimizu</name>
<affiliation>
<nlm:affiliation>Computational Biology Research Center, National Institute of Advanced Industrial Science and Technology, Japan.</nlm:affiliation>
</affiliation>
</author>
<author>
<name sortKey="Tsuda, Koji" sort="Tsuda, Koji" uniqKey="Tsuda K" first="Koji" last="Tsuda">Koji Tsuda</name>
</author>
</analytic>
<series>
<title level="j">Bioinformatics (Oxford, England)</title>
<idno type="eISSN">1367-4811</idno>
<imprint>
<date when="2011" type="published">2011</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
</fileDesc>
<profileDesc>
<textClass>
<keywords scheme="KwdEn" xml:lang="en">
<term>Algorithms</term>
<term>Base Sequence</term>
<term>Computational Biology (methods)</term>
<term>Sequence Analysis, DNA (methods)</term>
<term>Software</term>
</keywords>
<keywords scheme="MESH" qualifier="methods" xml:lang="en">
<term>Computational Biology</term>
<term>Sequence Analysis, DNA</term>
</keywords>
<keywords scheme="MESH" xml:lang="en">
<term>Algorithms</term>
<term>Base Sequence</term>
<term>Software</term>
</keywords>
</textClass>
</profileDesc>
</teiHeader>
<front>
<div type="abstract" xml:lang="en">Recent progress in DNA sequencing technologies calls for fast and accurate algorithms that can evaluate sequence similarity for a huge amount of short reads. Searching similar pairs from a string pool is a fundamental process of de novo genome assembly, genome-wide alignment and other important analyses.</div>
</front>
</TEI>
<pubmed>
<MedlineCitation Status="MEDLINE" IndexingMethod="Curated" Owner="NLM">
<PMID Version="1">21148542</PMID>
<DateCompleted>
<Year>2011</Year>
<Month>04</Month>
<Day>19</Day>
</DateCompleted>
<DateRevised>
<Year>2018</Year>
<Month>12</Month>
<Day>01</Day>
</DateRevised>
<Article PubModel="Print-Electronic">
<Journal>
<ISSN IssnType="Electronic">1367-4811</ISSN>
<JournalIssue CitedMedium="Internet">
<Volume>27</Volume>
<Issue>4</Issue>
<PubDate>
<Year>2011</Year>
<Month>Feb</Month>
<Day>15</Day>
</PubDate>
</JournalIssue>
<Title>Bioinformatics (Oxford, England)</Title>
<ISOAbbreviation>Bioinformatics</ISOAbbreviation>
</Journal>
<ArticleTitle>SlideSort: all pairs similarity search for short reads.</ArticleTitle>
<Pagination>
<MedlinePgn>464-70</MedlinePgn>
</Pagination>
<ELocationID EIdType="doi" ValidYN="Y">10.1093/bioinformatics/btq677</ELocationID>
<Abstract>
<AbstractText Label="MOTIVATION" NlmCategory="BACKGROUND">Recent progress in DNA sequencing technologies calls for fast and accurate algorithms that can evaluate sequence similarity for a huge amount of short reads. Searching similar pairs from a string pool is a fundamental process of de novo genome assembly, genome-wide alignment and other important analyses.</AbstractText>
<AbstractText Label="RESULTS" NlmCategory="RESULTS">In this study, we designed and implemented an exact algorithm SlideSort that finds all similar pairs from a string pool in terms of edit distance. Using an efficient pattern growth algorithm, SlideSort discovers chains of common k-mers to narrow down the search. Compared to existing methods based on single k-mers, our method is more effective in reducing the number of edit distance calculations. In comparison to backtracking methods such as BWA, our method is much faster in finding remote matches, scaling easily to tens of millions of sequences. Our software has an additional function of single link clustering, which is useful in summarizing short reads for further processing.</AbstractText>
<AbstractText Label="AVAILABILITY" NlmCategory="BACKGROUND">Executable binary files and C++ libraries are available at http://www.cbrc.jp/~shimizu/slidesort/ for Linux and Windows.</AbstractText>
</Abstract>
<AuthorList CompleteYN="Y">
<Author ValidYN="Y">
<LastName>Shimizu</LastName>
<ForeName>Kana</ForeName>
<Initials>K</Initials>
<AffiliationInfo>
<Affiliation>Computational Biology Research Center, National Institute of Advanced Industrial Science and Technology, Japan.</Affiliation>
</AffiliationInfo>
</Author>
<Author ValidYN="Y">
<LastName>Tsuda</LastName>
<ForeName>Koji</ForeName>
<Initials>K</Initials>
</Author>
</AuthorList>
<Language>eng</Language>
<PublicationTypeList>
<PublicationType UI="D016428">Journal Article</PublicationType>
<PublicationType UI="D013485">Research Support, Non-U.S. Gov't</PublicationType>
</PublicationTypeList>
<ArticleDate DateType="Electronic">
<Year>2010</Year>
<Month>12</Month>
<Day>09</Day>
</ArticleDate>
</Article>
<MedlineJournalInfo>
<Country>England</Country>
<MedlineTA>Bioinformatics</MedlineTA>
<NlmUniqueID>9808944</NlmUniqueID>
<ISSNLinking>1367-4803</ISSNLinking>
</MedlineJournalInfo>
<CitationSubset>IM</CitationSubset>
<MeshHeadingList>
<MeshHeading>
<DescriptorName UI="D000465" MajorTopicYN="Y">Algorithms</DescriptorName>
</MeshHeading>
<MeshHeading>
<DescriptorName UI="D001483" MajorTopicYN="N">Base Sequence</DescriptorName>
</MeshHeading>
<MeshHeading>
<DescriptorName UI="D019295" MajorTopicYN="N">Computational Biology</DescriptorName>
<QualifierName UI="Q000379" MajorTopicYN="N">methods</QualifierName>
</MeshHeading>
<MeshHeading>
<DescriptorName UI="D017422" MajorTopicYN="N">Sequence Analysis, DNA</DescriptorName>
<QualifierName UI="Q000379" MajorTopicYN="Y">methods</QualifierName>
</MeshHeading>
<MeshHeading>
<DescriptorName UI="D012984" MajorTopicYN="Y">Software</DescriptorName>
</MeshHeading>
</MeshHeadingList>
</MedlineCitation>
<PubmedData>
<History>
<PubMedPubDate PubStatus="entrez">
<Year>2010</Year>
<Month>12</Month>
<Day>15</Day>
<Hour>6</Hour>
<Minute>0</Minute>
</PubMedPubDate>
<PubMedPubDate PubStatus="pubmed">
<Year>2010</Year>
<Month>12</Month>
<Day>15</Day>
<Hour>6</Hour>
<Minute>0</Minute>
</PubMedPubDate>
<PubMedPubDate PubStatus="medline">
<Year>2011</Year>
<Month>4</Month>
<Day>20</Day>
<Hour>6</Hour>
<Minute>0</Minute>
</PubMedPubDate>
</History>
<PublicationStatus>ppublish</PublicationStatus>
<ArticleIdList>
<ArticleId IdType="pubmed">21148542</ArticleId>
<ArticleId IdType="pii">btq677</ArticleId>
<ArticleId IdType="doi">10.1093/bioinformatics/btq677</ArticleId>
<ArticleId IdType="pmc">PMC3035798</ArticleId>
</ArticleIdList>
<ReferenceList>
<Reference>
<Citation>Bioinformatics. 2008 Oct 15;24(20):2395-6</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">18697769</ArticleId>
</ArticleIdList>
</Reference>
<Reference>
<Citation>Bioinformatics. 2009 May 1;25(9):1105-11</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">19289445</ArticleId>
</ArticleIdList>
</Reference>
<Reference>
<Citation>Genome Biol. 2009;10(3):R25</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">19261174</ArticleId>
</ArticleIdList>
</Reference>
<Reference>
<Citation>Genome Res. 2009 Jun;19(6):1117-23</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">19251739</ArticleId>
</ArticleIdList>
</Reference>
<Reference>
<Citation>Genome Res. 2009 Jul;19(7):1309-15</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">19439514</ArticleId>
</ArticleIdList>
</Reference>
<Reference>
<Citation>Genome Res. 2008 May;18(5):821-9</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">18349386</ArticleId>
</ArticleIdList>
</Reference>
<Reference>
<Citation>Bioinformatics. 2009 Aug 1;25(15):1966-7</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">19497933</ArticleId>
</ArticleIdList>
</Reference>
<Reference>
<Citation>Genome Res. 2009 Sep;19(9):1646-54</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">19592482</ArticleId>
</ArticleIdList>
</Reference>
<Reference>
<Citation>Science. 1985 Mar 22;227(4693):1435-41</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">2983426</ArticleId>
</ArticleIdList>
</Reference>
<Reference>
<Citation>J Clin Monit Comput. 2005 Oct;19(4-5):319-28</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">16328946</ArticleId>
</ArticleIdList>
</Reference>
<Reference>
<Citation>Bioinformatics. 2007 Feb 15;23(4):500-1</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">17158514</ArticleId>
</ArticleIdList>
</Reference>
<Reference>
<Citation>Bioinformatics. 2009 Jul 15;25(14):1754-60</Citation>
<ArticleIdList>
<ArticleId IdType="pubmed">19451168</ArticleId>
</ArticleIdList>
</Reference>
</ReferenceList>
</PubmedData>
</pubmed>
</record>

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Sante/explor/MersV1/Data/PubMed/Corpus
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 001F20 | SxmlIndent | more

Or

HfdSelect -h $EXPLOR_AREA/Data/PubMed/Corpus/biblio.hfd -nk 001F20 | SxmlIndent | more

To add a link to this page in the Wicri network

{{Explor lien
   |wiki=    Sante
   |area=    MersV1
   |flux=    PubMed
   |étape=   Corpus
   |type=    RBID
   |clé=     pubmed:21148542
   |texte=   SlideSort: all pairs similarity search for short reads.
}}

To generate wiki pages

HfdIndexSelect -h $EXPLOR_AREA/Data/PubMed/Corpus/RBID.i   -Sk "pubmed:21148542" \
       | HfdSelect -Kh $EXPLOR_AREA/Data/PubMed/Corpus/biblio.hfd   \
       | NlmPubMed2Wicri -a MersV1 

Wicri

This area was generated with Dilib version V0.6.33.
Data generation: Mon Apr 20 23:26:43 2020. Site generation: Sat Mar 27 09:06:09 2021