Serveur d'exploration MERS

Attention, ce site est en cours de développement !
Attention, site généré par des moyens informatiques à partir de corpus bruts.
Les informations ne sont donc pas validées.

Efficient sequential and parallel algorithms for planted motif search

Identifieur interne : 000950 ( Pmc/Curation ); précédent : 000949; suivant : 000951

Efficient sequential and parallel algorithms for planted motif search

Auteurs : Marius Nicolae [États-Unis] ; Sanguthevar Rajasekaran [États-Unis]

Source :

RBID : PMC:3924400

Abstract

Background

Motif searching is an important step in the detection of rare events occurring in a set of DNA or protein sequences. One formulation of the problem is known as (l,d)-motif search or Planted Motif Search (PMS). In PMS we are given two integers l and d and n biological sequences. We want to find all sequences of length l that appear in each of the input sequences with at most d mismatches. The PMS problem is NP-complete. PMS algorithms are typically evaluated on certain instances considered challenging. Despite ample research in the area, a considerable performance gap exists because many state of the art algorithms have large runtimes even for moderately challenging instances.

Results

This paper presents a fast exact parallel PMS algorithm called PMS8. PMS8 is the first algorithm to solve the challenging (l,d) instances (25,10) and (26,11). PMS8 is also efficient on instances with larger l and d such as (50,21). We include a comparison of PMS8 with several state of the art algorithms on multiple problem instances. This paper also presents necessary and sufficient conditions for 3 l-mers to have a common d-neighbor. The program is freely available at http://engr.uconn.edu/~man09004/PMS8/.

Conclusions

We present PMS8, an efficient exact algorithm for Planted Motif Search. PMS8 introduces novel ideas for generating common neighborhoods. We have also implemented a parallel version for this algorithm. PMS8 can solve instances not solved by any previous algorithms.


Url:
DOI: 10.1186/1471-2105-15-34
PubMed: 24479443
PubMed Central: 3924400

Links toward previous steps (curation, corpus...)


Links to Exploration step

PMC:3924400

Le document en format XML

<record>
<TEI>
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en">Efficient sequential and parallel algorithms for planted motif search</title>
<author>
<name sortKey="Nicolae, Marius" sort="Nicolae, Marius" uniqKey="Nicolae M" first="Marius" last="Nicolae">Marius Nicolae</name>
<affiliation wicri:level="1">
<nlm:aff id="I1">Department of Computer Science and Engineering, University of Connecticut, Storrs, CT, USA</nlm:aff>
<country xml:lang="fr">États-Unis</country>
<wicri:regionArea>Department of Computer Science and Engineering, University of Connecticut, Storrs, CT</wicri:regionArea>
</affiliation>
</author>
<author>
<name sortKey="Rajasekaran, Sanguthevar" sort="Rajasekaran, Sanguthevar" uniqKey="Rajasekaran S" first="Sanguthevar" last="Rajasekaran">Sanguthevar Rajasekaran</name>
<affiliation wicri:level="1">
<nlm:aff id="I1">Department of Computer Science and Engineering, University of Connecticut, Storrs, CT, USA</nlm:aff>
<country xml:lang="fr">États-Unis</country>
<wicri:regionArea>Department of Computer Science and Engineering, University of Connecticut, Storrs, CT</wicri:regionArea>
</affiliation>
</author>
</titleStmt>
<publicationStmt>
<idno type="wicri:source">PMC</idno>
<idno type="pmid">24479443</idno>
<idno type="pmc">3924400</idno>
<idno type="url">http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3924400</idno>
<idno type="RBID">PMC:3924400</idno>
<idno type="doi">10.1186/1471-2105-15-34</idno>
<date when="2014">2014</date>
<idno type="wicri:Area/Pmc/Corpus">000950</idno>
<idno type="wicri:explorRef" wicri:stream="Pmc" wicri:step="Corpus" wicri:corpus="PMC">000950</idno>
<idno type="wicri:Area/Pmc/Curation">000950</idno>
<idno type="wicri:explorRef" wicri:stream="Pmc" wicri:step="Curation">000950</idno>
</publicationStmt>
<sourceDesc>
<biblStruct>
<analytic>
<title xml:lang="en" level="a" type="main">Efficient sequential and parallel algorithms for planted motif search</title>
<author>
<name sortKey="Nicolae, Marius" sort="Nicolae, Marius" uniqKey="Nicolae M" first="Marius" last="Nicolae">Marius Nicolae</name>
<affiliation wicri:level="1">
<nlm:aff id="I1">Department of Computer Science and Engineering, University of Connecticut, Storrs, CT, USA</nlm:aff>
<country xml:lang="fr">États-Unis</country>
<wicri:regionArea>Department of Computer Science and Engineering, University of Connecticut, Storrs, CT</wicri:regionArea>
</affiliation>
</author>
<author>
<name sortKey="Rajasekaran, Sanguthevar" sort="Rajasekaran, Sanguthevar" uniqKey="Rajasekaran S" first="Sanguthevar" last="Rajasekaran">Sanguthevar Rajasekaran</name>
<affiliation wicri:level="1">
<nlm:aff id="I1">Department of Computer Science and Engineering, University of Connecticut, Storrs, CT, USA</nlm:aff>
<country xml:lang="fr">États-Unis</country>
<wicri:regionArea>Department of Computer Science and Engineering, University of Connecticut, Storrs, CT</wicri:regionArea>
</affiliation>
</author>
</analytic>
<series>
<title level="j">BMC Bioinformatics</title>
<idno type="eISSN">1471-2105</idno>
<imprint>
<date when="2014">2014</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
</fileDesc>
<profileDesc>
<textClass></textClass>
</profileDesc>
</teiHeader>
<front>
<div type="abstract" xml:lang="en">
<sec>
<title>Background</title>
<p>Motif searching is an important step in the detection of rare events occurring in a set of DNA or protein sequences. One formulation of the problem is known as (
<italic>l</italic>
,
<italic>d</italic>
)-motif search or Planted Motif Search (PMS). In PMS we are given two integers
<italic>l</italic>
and
<italic>d</italic>
and
<italic>n</italic>
biological sequences. We want to find all sequences of length
<italic>l</italic>
that appear in each of the input sequences with at most
<italic>d</italic>
mismatches. The PMS problem is NP-complete. PMS algorithms are typically evaluated on certain instances considered challenging. Despite ample research in the area, a considerable performance gap exists because many state of the art algorithms have large runtimes even for moderately challenging instances.</p>
</sec>
<sec>
<title>Results</title>
<p>This paper presents a fast exact parallel PMS algorithm called PMS8. PMS8 is the first algorithm to solve the challenging (
<italic>l</italic>
,
<italic>d</italic>
) instances (25,10) and (26,11). PMS8 is also efficient on instances with larger
<italic>l</italic>
and
<italic>d</italic>
such as (50,21). We include a comparison of PMS8 with several state of the art algorithms on multiple problem instances. This paper also presents necessary and sufficient conditions for 3
<italic>l</italic>
-mers to have a common
<italic>d</italic>
-neighbor. The program is freely available at
<ext-link ext-link-type="uri" xlink:href="http://engr.uconn.edu/~man09004/PMS8/">http://engr.uconn.edu/~man09004/PMS8/</ext-link>
.</p>
</sec>
<sec>
<title>Conclusions</title>
<p>We present PMS8, an efficient exact algorithm for Planted Motif Search. PMS8 introduces novel ideas for generating common neighborhoods. We have also implemented a parallel version for this algorithm. PMS8 can solve instances not solved by any previous algorithms.</p>
</sec>
</div>
</front>
<back>
<div1 type="bibliography">
<listBibl>
<biblStruct>
<analytic>
<author>
<name sortKey="Pevzner, P" uniqKey="Pevzner P">P Pevzner</name>
</author>
<author>
<name sortKey="Sze, S" uniqKey="Sze S">S Sze</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Lanctot, J" uniqKey="Lanctot J">J Lanctot</name>
</author>
<author>
<name sortKey="Li, M" uniqKey="Li M">M Li</name>
</author>
<author>
<name sortKey="Ma, B" uniqKey="Ma B">B Ma</name>
</author>
<author>
<name sortKey="Wang, S" uniqKey="Wang S">S Wang</name>
</author>
<author>
<name sortKey="Zhang, L" uniqKey="Zhang L">L Zhang</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Yu, Q" uniqKey="Yu Q">Q Yu</name>
</author>
<author>
<name sortKey="Huo, H" uniqKey="Huo H">H Huo</name>
</author>
<author>
<name sortKey="Zhang, Y" uniqKey="Zhang Y">Y Zhang</name>
</author>
<author>
<name sortKey="Guo, H" uniqKey="Guo H">H Guo</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Davila, J" uniqKey="Davila J">J Davila</name>
</author>
<author>
<name sortKey="Balla, S" uniqKey="Balla S">S Balla</name>
</author>
<author>
<name sortKey="Rajasekaran, S" uniqKey="Rajasekaran S">S Rajasekaran</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Dinh, H" uniqKey="Dinh H">H Dinh</name>
</author>
<author>
<name sortKey="Rajasekaran, S" uniqKey="Rajasekaran S">S Rajasekaran</name>
</author>
<author>
<name sortKey="Davila, J" uniqKey="Davila J">J Davila</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Ho, E" uniqKey="Ho E">E Ho</name>
</author>
<author>
<name sortKey="Jakubowski, C" uniqKey="Jakubowski C">C Jakubowski</name>
</author>
<author>
<name sortKey="Gunderson, S" uniqKey="Gunderson S">S Gunderson</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Desaraju, S" uniqKey="Desaraju S">S Desaraju</name>
</author>
<author>
<name sortKey="Mukkamala, R" uniqKey="Mukkamala R">R Mukkamala</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Dinh, H" uniqKey="Dinh H">H Dinh</name>
</author>
<author>
<name sortKey="Rajasekaran, S" uniqKey="Rajasekaran S">S Rajasekaran</name>
</author>
<author>
<name sortKey="Kundeti, V" uniqKey="Kundeti V">V Kundeti</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Bandyopadhyay, S" uniqKey="Bandyopadhyay S">S Bandyopadhyay</name>
</author>
<author>
<name sortKey="Sahni, S" uniqKey="Sahni S">S Sahni</name>
</author>
<author>
<name sortKey="Rajasekaran, S" uniqKey="Rajasekaran S">S Rajasekaran</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Rajasekaran, S" uniqKey="Rajasekaran S">S Rajasekaran</name>
</author>
<author>
<name sortKey="Dinh, H" uniqKey="Dinh H">H Dinh</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Gramm, J" uniqKey="Gramm J">J Gramm</name>
</author>
<author>
<name sortKey="Niedermeier, R" uniqKey="Niedermeier R">R Niedermeier</name>
</author>
<author>
<name sortKey="Rossmanith, P" uniqKey="Rossmanith P">P Rossmanith</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Abbas, M" uniqKey="Abbas M">M Abbas</name>
</author>
<author>
<name sortKey="Abouelhoda, M" uniqKey="Abouelhoda M">M Abouelhoda</name>
</author>
<author>
<name sortKey="Bahig, H" uniqKey="Bahig H">H Bahig</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Dasari, N" uniqKey="Dasari N">N Dasari</name>
</author>
<author>
<name sortKey="Ranjan, D" uniqKey="Ranjan D">D Ranjan</name>
</author>
<author>
<name sortKey="Zubair, M" uniqKey="Zubair M">M Zubair</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Dasari, N" uniqKey="Dasari N">N Dasari</name>
</author>
<author>
<name sortKey="Desh, R" uniqKey="Desh R">R Desh</name>
</author>
<author>
<name sortKey="Zubair, M" uniqKey="Zubair M">M Zubair</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Dasari, Ns" uniqKey="Dasari N">NS Dasari</name>
</author>
<author>
<name sortKey="Desh, R" uniqKey="Desh R">R Desh</name>
</author>
<author>
<name sortKey="Zubair, M" uniqKey="Zubair M">M Zubair</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Sahoo, B" uniqKey="Sahoo B">B Sahoo</name>
</author>
<author>
<name sortKey="Sourav, R" uniqKey="Sourav R">R Sourav</name>
</author>
<author>
<name sortKey="Ranjan, R" uniqKey="Ranjan R">R Ranjan</name>
</author>
<author>
<name sortKey="Padhy, S" uniqKey="Padhy S">S Padhy</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Sun, H" uniqKey="Sun H">H Sun</name>
</author>
<author>
<name sortKey="Low, M" uniqKey="Low M">M Low</name>
</author>
<author>
<name sortKey="Hsu, W" uniqKey="Hsu W">W Hsu</name>
</author>
<author>
<name sortKey="Tan, C" uniqKey="Tan C">C Tan</name>
</author>
<author>
<name sortKey="Rajapakse, J" uniqKey="Rajapakse J">J Rajapakse</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Sun, Hq" uniqKey="Sun H">HQ Sun</name>
</author>
<author>
<name sortKey="Low, M" uniqKey="Low M">M Low</name>
</author>
<author>
<name sortKey="Hsu, Wj" uniqKey="Hsu W">WJ Hsu</name>
</author>
<author>
<name sortKey="Rajapakse, J" uniqKey="Rajapakse J">J Rajapakse</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Faheem, Hm" uniqKey="Faheem H">HM Faheem</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Tompa, M" uniqKey="Tompa M">M Tompa</name>
</author>
<author>
<name sortKey="Li, N" uniqKey="Li N">N Li</name>
</author>
<author>
<name sortKey="Bailey, Tl" uniqKey="Bailey T">TL Bailey</name>
</author>
<author>
<name sortKey="Church, Gm" uniqKey="Church G">GM Church</name>
</author>
<author>
<name sortKey="De Moor, B" uniqKey="De Moor B">B De Moor</name>
</author>
<author>
<name sortKey="Eskin, E" uniqKey="Eskin E">E Eskin</name>
</author>
<author>
<name sortKey="Favorov, Av" uniqKey="Favorov A">AV Favorov</name>
</author>
<author>
<name sortKey="Frith, Mc" uniqKey="Frith M">MC Frith</name>
</author>
<author>
<name sortKey="Fu, Y" uniqKey="Fu Y">Y Fu</name>
</author>
<author>
<name sortKey="Kent, Wj" uniqKey="Kent W">WJ Kent</name>
</author>
</analytic>
</biblStruct>
</listBibl>
</div1>
</back>
</TEI>
<pmc article-type="research-article" xml:lang="en">
<pmc-dir>properties open_access</pmc-dir>
<front>
<journal-meta>
<journal-id journal-id-type="nlm-ta">BMC Bioinformatics</journal-id>
<journal-id journal-id-type="iso-abbrev">BMC Bioinformatics</journal-id>
<journal-title-group>
<journal-title>BMC Bioinformatics</journal-title>
</journal-title-group>
<issn pub-type="epub">1471-2105</issn>
<publisher>
<publisher-name>BioMed Central</publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="pmid">24479443</article-id>
<article-id pub-id-type="pmc">3924400</article-id>
<article-id pub-id-type="publisher-id">1471-2105-15-34</article-id>
<article-id pub-id-type="doi">10.1186/1471-2105-15-34</article-id>
<article-categories>
<subj-group subj-group-type="heading">
<subject>Research Article</subject>
</subj-group>
</article-categories>
<title-group>
<article-title>Efficient sequential and parallel algorithms for planted motif search</article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author" corresp="yes" id="A1">
<name>
<surname>Nicolae</surname>
<given-names>Marius</given-names>
</name>
<xref ref-type="aff" rid="I1">1</xref>
<email>marius.nicolae@engr.uconn.edu</email>
</contrib>
<contrib contrib-type="author" id="A2">
<name>
<surname>Rajasekaran</surname>
<given-names>Sanguthevar</given-names>
</name>
<xref ref-type="aff" rid="I1">1</xref>
<email>rajasek@engr.uconn.edu</email>
</contrib>
</contrib-group>
<aff id="I1">
<label>1</label>
Department of Computer Science and Engineering, University of Connecticut, Storrs, CT, USA</aff>
<pub-date pub-type="collection">
<year>2014</year>
</pub-date>
<pub-date pub-type="epub">
<day>31</day>
<month>1</month>
<year>2014</year>
</pub-date>
<volume>15</volume>
<fpage>34</fpage>
<lpage>34</lpage>
<history>
<date date-type="received">
<day>15</day>
<month>3</month>
<year>2013</year>
</date>
<date date-type="accepted">
<day>27</day>
<month>1</month>
<year>2014</year>
</date>
</history>
<permissions>
<copyright-statement>Copyright © 2014 Nicolae and Rajasekaran; licensee BioMed Central Ltd.</copyright-statement>
<copyright-year>2014</copyright-year>
<copyright-holder>Nicolae and Rajasekaran; licensee BioMed Central Ltd.</copyright-holder>
<license license-type="open-access" xlink:href="http://creativecommons.org/licenses/by/2.0">
<license-p>This is an Open Access article distributed under the terms of the Creative Commons Attribution License (
<ext-link ext-link-type="uri" xlink:href="http://creativecommons.org/licenses/by/2.0">http://creativecommons.org/licenses/by/2.0</ext-link>
), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</license-p>
</license>
</permissions>
<self-uri xlink:href="http://www.biomedcentral.com/1471-2105/15/34"></self-uri>
<abstract>
<sec>
<title>Background</title>
<p>Motif searching is an important step in the detection of rare events occurring in a set of DNA or protein sequences. One formulation of the problem is known as (
<italic>l</italic>
,
<italic>d</italic>
)-motif search or Planted Motif Search (PMS). In PMS we are given two integers
<italic>l</italic>
and
<italic>d</italic>
and
<italic>n</italic>
biological sequences. We want to find all sequences of length
<italic>l</italic>
that appear in each of the input sequences with at most
<italic>d</italic>
mismatches. The PMS problem is NP-complete. PMS algorithms are typically evaluated on certain instances considered challenging. Despite ample research in the area, a considerable performance gap exists because many state of the art algorithms have large runtimes even for moderately challenging instances.</p>
</sec>
<sec>
<title>Results</title>
<p>This paper presents a fast exact parallel PMS algorithm called PMS8. PMS8 is the first algorithm to solve the challenging (
<italic>l</italic>
,
<italic>d</italic>
) instances (25,10) and (26,11). PMS8 is also efficient on instances with larger
<italic>l</italic>
and
<italic>d</italic>
such as (50,21). We include a comparison of PMS8 with several state of the art algorithms on multiple problem instances. This paper also presents necessary and sufficient conditions for 3
<italic>l</italic>
-mers to have a common
<italic>d</italic>
-neighbor. The program is freely available at
<ext-link ext-link-type="uri" xlink:href="http://engr.uconn.edu/~man09004/PMS8/">http://engr.uconn.edu/~man09004/PMS8/</ext-link>
.</p>
</sec>
<sec>
<title>Conclusions</title>
<p>We present PMS8, an efficient exact algorithm for Planted Motif Search. PMS8 introduces novel ideas for generating common neighborhoods. We have also implemented a parallel version for this algorithm. PMS8 can solve instances not solved by any previous algorithms.</p>
</sec>
</abstract>
</article-meta>
</front>
</pmc>
</record>

Pour manipuler ce document sous Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Sante/explor/MersV1/Data/Pmc/Curation
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000950 | SxmlIndent | more

Ou

HfdSelect -h $EXPLOR_AREA/Data/Pmc/Curation/biblio.hfd -nk 000950 | SxmlIndent | more

Pour mettre un lien sur cette page dans le réseau Wicri

{{Explor lien
   |wiki=    Sante
   |area=    MersV1
   |flux=    Pmc
   |étape=   Curation
   |type=    RBID
   |clé=     PMC:3924400
   |texte=   Efficient sequential and parallel algorithms for planted motif search
}}

Pour générer des pages wiki

HfdIndexSelect -h $EXPLOR_AREA/Data/Pmc/Curation/RBID.i   -Sk "pubmed:24479443" \
       | HfdSelect -Kh $EXPLOR_AREA/Data/Pmc/Curation/biblio.hfd   \
       | NlmPubMed2Wicri -a MersV1 

Wicri

This area was generated with Dilib version V0.6.33.
Data generation: Mon Apr 20 23:26:43 2020. Site generation: Sat Mar 27 09:06:09 2021