Efficient sequential and parallel algorithms for planted motif search
Identifieur interne : 000950 ( Pmc/Curation ); précédent : 000949; suivant : 000951Efficient sequential and parallel algorithms for planted motif search
Auteurs : Marius Nicolae [États-Unis] ; Sanguthevar Rajasekaran [États-Unis]Source :
- BMC Bioinformatics [ 1471-2105 ] ; 2014.
Abstract
Motif searching is an important step in the detection of rare events occurring in a set of DNA or protein sequences. One formulation of the problem is known as (
This paper presents a fast exact parallel PMS algorithm called PMS8. PMS8 is the first algorithm to solve the challenging (
We present PMS8, an efficient exact algorithm for Planted Motif Search. PMS8 introduces novel ideas for generating common neighborhoods. We have also implemented a parallel version for this algorithm. PMS8 can solve instances not solved by any previous algorithms.
Url:
DOI: 10.1186/1471-2105-15-34
PubMed: 24479443
PubMed Central: 3924400
Links toward previous steps (curation, corpus...)
- to stream Pmc, to step Corpus: Pour aller vers cette notice dans l'étape Curation :000950
Links to Exploration step
PMC:3924400Le document en format XML
<record><TEI><teiHeader><fileDesc><titleStmt><title xml:lang="en">Efficient sequential and parallel algorithms for planted motif search</title>
<author><name sortKey="Nicolae, Marius" sort="Nicolae, Marius" uniqKey="Nicolae M" first="Marius" last="Nicolae">Marius Nicolae</name>
<affiliation wicri:level="1"><nlm:aff id="I1">Department of Computer Science and Engineering, University of Connecticut, Storrs, CT, USA</nlm:aff>
<country xml:lang="fr">États-Unis</country>
<wicri:regionArea>Department of Computer Science and Engineering, University of Connecticut, Storrs, CT</wicri:regionArea>
</affiliation>
</author>
<author><name sortKey="Rajasekaran, Sanguthevar" sort="Rajasekaran, Sanguthevar" uniqKey="Rajasekaran S" first="Sanguthevar" last="Rajasekaran">Sanguthevar Rajasekaran</name>
<affiliation wicri:level="1"><nlm:aff id="I1">Department of Computer Science and Engineering, University of Connecticut, Storrs, CT, USA</nlm:aff>
<country xml:lang="fr">États-Unis</country>
<wicri:regionArea>Department of Computer Science and Engineering, University of Connecticut, Storrs, CT</wicri:regionArea>
</affiliation>
</author>
</titleStmt>
<publicationStmt><idno type="wicri:source">PMC</idno>
<idno type="pmid">24479443</idno>
<idno type="pmc">3924400</idno>
<idno type="url">http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3924400</idno>
<idno type="RBID">PMC:3924400</idno>
<idno type="doi">10.1186/1471-2105-15-34</idno>
<date when="2014">2014</date>
<idno type="wicri:Area/Pmc/Corpus">000950</idno>
<idno type="wicri:explorRef" wicri:stream="Pmc" wicri:step="Corpus" wicri:corpus="PMC">000950</idno>
<idno type="wicri:Area/Pmc/Curation">000950</idno>
<idno type="wicri:explorRef" wicri:stream="Pmc" wicri:step="Curation">000950</idno>
</publicationStmt>
<sourceDesc><biblStruct><analytic><title xml:lang="en" level="a" type="main">Efficient sequential and parallel algorithms for planted motif search</title>
<author><name sortKey="Nicolae, Marius" sort="Nicolae, Marius" uniqKey="Nicolae M" first="Marius" last="Nicolae">Marius Nicolae</name>
<affiliation wicri:level="1"><nlm:aff id="I1">Department of Computer Science and Engineering, University of Connecticut, Storrs, CT, USA</nlm:aff>
<country xml:lang="fr">États-Unis</country>
<wicri:regionArea>Department of Computer Science and Engineering, University of Connecticut, Storrs, CT</wicri:regionArea>
</affiliation>
</author>
<author><name sortKey="Rajasekaran, Sanguthevar" sort="Rajasekaran, Sanguthevar" uniqKey="Rajasekaran S" first="Sanguthevar" last="Rajasekaran">Sanguthevar Rajasekaran</name>
<affiliation wicri:level="1"><nlm:aff id="I1">Department of Computer Science and Engineering, University of Connecticut, Storrs, CT, USA</nlm:aff>
<country xml:lang="fr">États-Unis</country>
<wicri:regionArea>Department of Computer Science and Engineering, University of Connecticut, Storrs, CT</wicri:regionArea>
</affiliation>
</author>
</analytic>
<series><title level="j">BMC Bioinformatics</title>
<idno type="eISSN">1471-2105</idno>
<imprint><date when="2014">2014</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
</fileDesc>
<profileDesc><textClass></textClass>
</profileDesc>
</teiHeader>
<front><div type="abstract" xml:lang="en"><sec><title>Background</title>
<p>Motif searching is an important step in the detection of rare events occurring in a set of DNA or protein sequences. One formulation of the problem is known as (<italic>l</italic>
,<italic>d</italic>
)-motif search or Planted Motif Search (PMS). In PMS we are given two integers <italic>l</italic>
and <italic>d</italic>
and <italic>n</italic>
biological sequences. We want to find all sequences of length <italic>l</italic>
that appear in each of the input sequences with at most <italic>d</italic>
mismatches. The PMS problem is NP-complete. PMS algorithms are typically evaluated on certain instances considered challenging. Despite ample research in the area, a considerable performance gap exists because many state of the art algorithms have large runtimes even for moderately challenging instances.</p>
</sec>
<sec><title>Results</title>
<p>This paper presents a fast exact parallel PMS algorithm called PMS8. PMS8 is the first algorithm to solve the challenging (<italic>l</italic>
,<italic>d</italic>
) instances (25,10) and (26,11). PMS8 is also efficient on instances with larger <italic>l</italic>
and <italic>d</italic>
such as (50,21). We include a comparison of PMS8 with several state of the art algorithms on multiple problem instances. This paper also presents necessary and sufficient conditions for 3 <italic>l</italic>
-mers to have a common <italic>d</italic>
-neighbor. The program is freely available at <ext-link ext-link-type="uri" xlink:href="http://engr.uconn.edu/~man09004/PMS8/">http://engr.uconn.edu/~man09004/PMS8/</ext-link>
.</p>
</sec>
<sec><title>Conclusions</title>
<p>We present PMS8, an efficient exact algorithm for Planted Motif Search. PMS8 introduces novel ideas for generating common neighborhoods. We have also implemented a parallel version for this algorithm. PMS8 can solve instances not solved by any previous algorithms.</p>
</sec>
</div>
</front>
<back><div1 type="bibliography"><listBibl><biblStruct><analytic><author><name sortKey="Pevzner, P" uniqKey="Pevzner P">P Pevzner</name>
</author>
<author><name sortKey="Sze, S" uniqKey="Sze S">S Sze</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Lanctot, J" uniqKey="Lanctot J">J Lanctot</name>
</author>
<author><name sortKey="Li, M" uniqKey="Li M">M Li</name>
</author>
<author><name sortKey="Ma, B" uniqKey="Ma B">B Ma</name>
</author>
<author><name sortKey="Wang, S" uniqKey="Wang S">S Wang</name>
</author>
<author><name sortKey="Zhang, L" uniqKey="Zhang L">L Zhang</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Yu, Q" uniqKey="Yu Q">Q Yu</name>
</author>
<author><name sortKey="Huo, H" uniqKey="Huo H">H Huo</name>
</author>
<author><name sortKey="Zhang, Y" uniqKey="Zhang Y">Y Zhang</name>
</author>
<author><name sortKey="Guo, H" uniqKey="Guo H">H Guo</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Davila, J" uniqKey="Davila J">J Davila</name>
</author>
<author><name sortKey="Balla, S" uniqKey="Balla S">S Balla</name>
</author>
<author><name sortKey="Rajasekaran, S" uniqKey="Rajasekaran S">S Rajasekaran</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Dinh, H" uniqKey="Dinh H">H Dinh</name>
</author>
<author><name sortKey="Rajasekaran, S" uniqKey="Rajasekaran S">S Rajasekaran</name>
</author>
<author><name sortKey="Davila, J" uniqKey="Davila J">J Davila</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Ho, E" uniqKey="Ho E">E Ho</name>
</author>
<author><name sortKey="Jakubowski, C" uniqKey="Jakubowski C">C Jakubowski</name>
</author>
<author><name sortKey="Gunderson, S" uniqKey="Gunderson S">S Gunderson</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Desaraju, S" uniqKey="Desaraju S">S Desaraju</name>
</author>
<author><name sortKey="Mukkamala, R" uniqKey="Mukkamala R">R Mukkamala</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Dinh, H" uniqKey="Dinh H">H Dinh</name>
</author>
<author><name sortKey="Rajasekaran, S" uniqKey="Rajasekaran S">S Rajasekaran</name>
</author>
<author><name sortKey="Kundeti, V" uniqKey="Kundeti V">V Kundeti</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Bandyopadhyay, S" uniqKey="Bandyopadhyay S">S Bandyopadhyay</name>
</author>
<author><name sortKey="Sahni, S" uniqKey="Sahni S">S Sahni</name>
</author>
<author><name sortKey="Rajasekaran, S" uniqKey="Rajasekaran S">S Rajasekaran</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Rajasekaran, S" uniqKey="Rajasekaran S">S Rajasekaran</name>
</author>
<author><name sortKey="Dinh, H" uniqKey="Dinh H">H Dinh</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Gramm, J" uniqKey="Gramm J">J Gramm</name>
</author>
<author><name sortKey="Niedermeier, R" uniqKey="Niedermeier R">R Niedermeier</name>
</author>
<author><name sortKey="Rossmanith, P" uniqKey="Rossmanith P">P Rossmanith</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Abbas, M" uniqKey="Abbas M">M Abbas</name>
</author>
<author><name sortKey="Abouelhoda, M" uniqKey="Abouelhoda M">M Abouelhoda</name>
</author>
<author><name sortKey="Bahig, H" uniqKey="Bahig H">H Bahig</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Dasari, N" uniqKey="Dasari N">N Dasari</name>
</author>
<author><name sortKey="Ranjan, D" uniqKey="Ranjan D">D Ranjan</name>
</author>
<author><name sortKey="Zubair, M" uniqKey="Zubair M">M Zubair</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Dasari, N" uniqKey="Dasari N">N Dasari</name>
</author>
<author><name sortKey="Desh, R" uniqKey="Desh R">R Desh</name>
</author>
<author><name sortKey="Zubair, M" uniqKey="Zubair M">M Zubair</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Dasari, Ns" uniqKey="Dasari N">NS Dasari</name>
</author>
<author><name sortKey="Desh, R" uniqKey="Desh R">R Desh</name>
</author>
<author><name sortKey="Zubair, M" uniqKey="Zubair M">M Zubair</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Sahoo, B" uniqKey="Sahoo B">B Sahoo</name>
</author>
<author><name sortKey="Sourav, R" uniqKey="Sourav R">R Sourav</name>
</author>
<author><name sortKey="Ranjan, R" uniqKey="Ranjan R">R Ranjan</name>
</author>
<author><name sortKey="Padhy, S" uniqKey="Padhy S">S Padhy</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Sun, H" uniqKey="Sun H">H Sun</name>
</author>
<author><name sortKey="Low, M" uniqKey="Low M">M Low</name>
</author>
<author><name sortKey="Hsu, W" uniqKey="Hsu W">W Hsu</name>
</author>
<author><name sortKey="Tan, C" uniqKey="Tan C">C Tan</name>
</author>
<author><name sortKey="Rajapakse, J" uniqKey="Rajapakse J">J Rajapakse</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Sun, Hq" uniqKey="Sun H">HQ Sun</name>
</author>
<author><name sortKey="Low, M" uniqKey="Low M">M Low</name>
</author>
<author><name sortKey="Hsu, Wj" uniqKey="Hsu W">WJ Hsu</name>
</author>
<author><name sortKey="Rajapakse, J" uniqKey="Rajapakse J">J Rajapakse</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Faheem, Hm" uniqKey="Faheem H">HM Faheem</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Tompa, M" uniqKey="Tompa M">M Tompa</name>
</author>
<author><name sortKey="Li, N" uniqKey="Li N">N Li</name>
</author>
<author><name sortKey="Bailey, Tl" uniqKey="Bailey T">TL Bailey</name>
</author>
<author><name sortKey="Church, Gm" uniqKey="Church G">GM Church</name>
</author>
<author><name sortKey="De Moor, B" uniqKey="De Moor B">B De Moor</name>
</author>
<author><name sortKey="Eskin, E" uniqKey="Eskin E">E Eskin</name>
</author>
<author><name sortKey="Favorov, Av" uniqKey="Favorov A">AV Favorov</name>
</author>
<author><name sortKey="Frith, Mc" uniqKey="Frith M">MC Frith</name>
</author>
<author><name sortKey="Fu, Y" uniqKey="Fu Y">Y Fu</name>
</author>
<author><name sortKey="Kent, Wj" uniqKey="Kent W">WJ Kent</name>
</author>
</analytic>
</biblStruct>
</listBibl>
</div1>
</back>
</TEI>
<pmc article-type="research-article" xml:lang="en"><pmc-dir>properties open_access</pmc-dir>
<front><journal-meta><journal-id journal-id-type="nlm-ta">BMC Bioinformatics</journal-id>
<journal-id journal-id-type="iso-abbrev">BMC Bioinformatics</journal-id>
<journal-title-group><journal-title>BMC Bioinformatics</journal-title>
</journal-title-group>
<issn pub-type="epub">1471-2105</issn>
<publisher><publisher-name>BioMed Central</publisher-name>
</publisher>
</journal-meta>
<article-meta><article-id pub-id-type="pmid">24479443</article-id>
<article-id pub-id-type="pmc">3924400</article-id>
<article-id pub-id-type="publisher-id">1471-2105-15-34</article-id>
<article-id pub-id-type="doi">10.1186/1471-2105-15-34</article-id>
<article-categories><subj-group subj-group-type="heading"><subject>Research Article</subject>
</subj-group>
</article-categories>
<title-group><article-title>Efficient sequential and parallel algorithms for planted motif search</article-title>
</title-group>
<contrib-group><contrib contrib-type="author" corresp="yes" id="A1"><name><surname>Nicolae</surname>
<given-names>Marius</given-names>
</name>
<xref ref-type="aff" rid="I1">1</xref>
<email>marius.nicolae@engr.uconn.edu</email>
</contrib>
<contrib contrib-type="author" id="A2"><name><surname>Rajasekaran</surname>
<given-names>Sanguthevar</given-names>
</name>
<xref ref-type="aff" rid="I1">1</xref>
<email>rajasek@engr.uconn.edu</email>
</contrib>
</contrib-group>
<aff id="I1"><label>1</label>
Department of Computer Science and Engineering, University of Connecticut, Storrs, CT, USA</aff>
<pub-date pub-type="collection"><year>2014</year>
</pub-date>
<pub-date pub-type="epub"><day>31</day>
<month>1</month>
<year>2014</year>
</pub-date>
<volume>15</volume>
<fpage>34</fpage>
<lpage>34</lpage>
<history><date date-type="received"><day>15</day>
<month>3</month>
<year>2013</year>
</date>
<date date-type="accepted"><day>27</day>
<month>1</month>
<year>2014</year>
</date>
</history>
<permissions><copyright-statement>Copyright © 2014 Nicolae and Rajasekaran; licensee BioMed Central Ltd.</copyright-statement>
<copyright-year>2014</copyright-year>
<copyright-holder>Nicolae and Rajasekaran; licensee BioMed Central Ltd.</copyright-holder>
<license license-type="open-access" xlink:href="http://creativecommons.org/licenses/by/2.0"><license-p>This is an Open Access article distributed under the terms of the Creative Commons Attribution License (<ext-link ext-link-type="uri" xlink:href="http://creativecommons.org/licenses/by/2.0">http://creativecommons.org/licenses/by/2.0</ext-link>
), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</license-p>
</license>
</permissions>
<self-uri xlink:href="http://www.biomedcentral.com/1471-2105/15/34"></self-uri>
<abstract><sec><title>Background</title>
<p>Motif searching is an important step in the detection of rare events occurring in a set of DNA or protein sequences. One formulation of the problem is known as (<italic>l</italic>
,<italic>d</italic>
)-motif search or Planted Motif Search (PMS). In PMS we are given two integers <italic>l</italic>
and <italic>d</italic>
and <italic>n</italic>
biological sequences. We want to find all sequences of length <italic>l</italic>
that appear in each of the input sequences with at most <italic>d</italic>
mismatches. The PMS problem is NP-complete. PMS algorithms are typically evaluated on certain instances considered challenging. Despite ample research in the area, a considerable performance gap exists because many state of the art algorithms have large runtimes even for moderately challenging instances.</p>
</sec>
<sec><title>Results</title>
<p>This paper presents a fast exact parallel PMS algorithm called PMS8. PMS8 is the first algorithm to solve the challenging (<italic>l</italic>
,<italic>d</italic>
) instances (25,10) and (26,11). PMS8 is also efficient on instances with larger <italic>l</italic>
and <italic>d</italic>
such as (50,21). We include a comparison of PMS8 with several state of the art algorithms on multiple problem instances. This paper also presents necessary and sufficient conditions for 3 <italic>l</italic>
-mers to have a common <italic>d</italic>
-neighbor. The program is freely available at <ext-link ext-link-type="uri" xlink:href="http://engr.uconn.edu/~man09004/PMS8/">http://engr.uconn.edu/~man09004/PMS8/</ext-link>
.</p>
</sec>
<sec><title>Conclusions</title>
<p>We present PMS8, an efficient exact algorithm for Planted Motif Search. PMS8 introduces novel ideas for generating common neighborhoods. We have also implemented a parallel version for this algorithm. PMS8 can solve instances not solved by any previous algorithms.</p>
</sec>
</abstract>
</article-meta>
</front>
</pmc>
</record>
Pour manipuler ce document sous Unix (Dilib)
EXPLOR_STEP=$WICRI_ROOT/Sante/explor/MersV1/Data/Pmc/Curation
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000950 | SxmlIndent | more
Ou
HfdSelect -h $EXPLOR_AREA/Data/Pmc/Curation/biblio.hfd -nk 000950 | SxmlIndent | more
Pour mettre un lien sur cette page dans le réseau Wicri
{{Explor lien |wiki= Sante |area= MersV1 |flux= Pmc |étape= Curation |type= RBID |clé= PMC:3924400 |texte= Efficient sequential and parallel algorithms for planted motif search }}
Pour générer des pages wiki
HfdIndexSelect -h $EXPLOR_AREA/Data/Pmc/Curation/RBID.i -Sk "pubmed:24479443" \ | HfdSelect -Kh $EXPLOR_AREA/Data/Pmc/Curation/biblio.hfd \ | NlmPubMed2Wicri -a MersV1
This area was generated with Dilib version V0.6.33. |