MERS Exploration Server

Warning: this site is under development!
Warning: this site was generated automatically from raw corpora.
The information has therefore not been validated.

Best hits of 11110110111: model-free selection and parameter-free sensitivity calculation of spaced seeds

Internal identifier: 000246 (Pmc/Corpus); previous: 000245; next: 000247

Authors: Laurent Noé

Source:

RBID: PMC:5310094

Abstract

Background

Spaced seeds, also named gapped q-grams, gapped k-mers, or spaced q-grams, have been proven to be more sensitive than contiguous seeds (contiguous q-grams, contiguous k-mers) in nucleic and amino-acid sequence analysis. Initially proposed to detect sequence similarities and to anchor sequence alignments, spaced seeds have more recently been applied in several alignment-free methods. Unfortunately, spaced seeds must be designed beforehand. This task is known to be time-consuming because of the number of spaced seed candidates. Moreover, the design can be affected by a set of arbitrarily chosen parameters from the probabilistic alignment models used. In this general context, dominant seeds have been introduced by Mak and Benson (Bioinformatics 25:302–308, 2009) on the Bernoulli model, in order to reduce the number of spaced seed candidates that are further processed in a parameter-free calculation of the sensitivity.
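
To make the notion of a seed hit concrete, the following minimal sketch checks whether a seed hits an alignment. It assumes the usual binary encoding, which the abstract itself does not spell out: in the seed, '1' marks a position that must match and '0' a don't-care position; in the alignment, '1' is a match column and '0' a mismatch column. The function name seed_hits and the example alignment are illustrative only.

```python
def seed_hits(seed: str, alignment: str) -> bool:
    """Return True if `seed` hits `alignment` at one position or more.

    seed:      string over {'1', '0'}; '1' = must match, '0' = don't care.
    alignment: string over {'1', '0'}; '1' = match column, '0' = mismatch.
    """
    span = len(seed)
    # Offsets inside the seed where a match is required.
    required = [i for i, c in enumerate(seed) if c == "1"]
    # Slide the seed over every start position where it fits entirely.
    for start in range(len(alignment) - span + 1):
        if all(alignment[start + i] == "1" for i in required):
            return True
    return False


spaced = "11110110111"    # the seed from the title: weight 9, span 11
contiguous = "111111111"  # contiguous seed of the same weight
alignment = "1111011011101111"  # hypothetical alignment with 3 mismatches

print(seed_hits(spaced, alignment))
print(seed_hits(contiguous, alignment))
```

On this hand-picked alignment the spaced seed finds a hit while the contiguous seed of equal weight does not, because the mismatches fall on don't-care positions of the spaced seed. A single alignment proves nothing about sensitivity in general; it only illustrates the mechanism.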

Results

We extend the work of Mak and Benson on single and multiple seeds by considering the Hit Integration model of Chung and Park (BMC Bioinform 11:31, 2010), and demonstrate that the same dominance definition can be applied and that a parameter-free study can be performed without any significant additional cost. We also consider two new discrete models, namely the Heaviside and the Dirac models, where lossless seeds can be integrated. From a theoretical standpoint, we establish a generic framework on all the proposed models, by applying a counting semi-ring to quickly compute large polynomial coefficients needed by the dominance filter. From a practical standpoint, we confirm that dominant seeds reduce the set of either single seeds to analyse thoroughly or multiple seeds to store. Moreover, at http://bioinfo.cristal.univ-lille.fr/yass/iedera_dominance, we provide a full list of spaced seeds computed on the four aforementioned models, with one (continuous) parameter left free for each model, and with several (discrete) alignment lengths.
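
The idea of a parameter-free sensitivity on the Bernoulli model can be sketched as follows. For a fixed alignment length n, let c[k] be the number of alignments with exactly k matches that the seed hits; the sensitivity for a match probability p is then the sum over k of c[k] * p^k * (1-p)^(n-k), so the coefficients c[k] can be computed once and reused for any p. The sketch below obtains c[k] by brute-force enumeration of all 2^n alignments, which is not the counting semi-ring computation described in the article and is only practical for small n. The function names and the coefficient-wise comparison in dominates are assumptions made for illustration: coefficient-wise superiority is sufficient for one seed to be at least as sensitive as another for every p, but it is not claimed to be the exact dominance criterion of Mak and Benson.

```python
from itertools import product


def hits(seed, alignment):
    """True if the seed ('1' = must match) hits the 0/1 alignment tuple."""
    need = [i for i, c in enumerate(seed) if c == "1"]
    return any(
        all(alignment[s + i] == 1 for i in need)
        for s in range(len(alignment) - len(seed) + 1)
    )


def hit_counts(seed, n):
    """c[k] = number of length-n alignments with k matches hit by the seed."""
    c = [0] * (n + 1)
    for a in product((0, 1), repeat=n):  # brute force: 2**n alignments
        if hits(seed, a):
            c[sum(a)] += 1
    return c


def sensitivity(c, p):
    """Bernoulli sensitivity for match probability p, from the counts c."""
    n = len(c) - 1
    return sum(ck * p**k * (1 - p) ** (n - k) for k, ck in enumerate(c))


def dominates(ca, cb):
    """Illustrative check: ca >= cb coefficient-wise, and not identical."""
    return all(x >= y for x, y in zip(ca, cb)) and ca != cb


n = 12  # small, hypothetical alignment length
spaced = hit_counts("11011", n)
contiguous = hit_counts("1111", n)
print(spaced)
print(contiguous)
print(dominates(spaced, contiguous), dominates(contiguous, spaced))
print(sensitivity(spaced, 0.7), sensitivity(contiguous, 0.7))
```

Because each term p^k * (1-p)^(n-k) is non-negative on [0, 1], a seed whose counts are all at least those of another seed cannot be less sensitive for any p, which is why such a filter needs no model parameter.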


Url:
DOI: 10.1186/s13015-017-0092-1
PubMed: 28289437
PubMed Central: 5310094

Links to Exploration step

PMC:5310094

The document in XML format

<record>
<TEI>
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en">Best hits of 11110110111: model-free selection and parameter-free sensitivity calculation of spaced seeds</title>
<author>
<name sortKey="Noe, Laurent" sort="Noe, Laurent" uniqKey="Noe L" first="Laurent" last="Noé">Laurent Noé</name>
<affiliation>
<nlm:aff id="Aff1"></nlm:aff>
</affiliation>
</author>
</titleStmt>
<publicationStmt>
<idno type="wicri:source">PMC</idno>
<idno type="pmid">28289437</idno>
<idno type="pmc">5310094</idno>
<idno type="url">http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5310094</idno>
<idno type="RBID">PMC:5310094</idno>
<idno type="doi">10.1186/s13015-017-0092-1</idno>
<date when="2017">2017</date>
<idno type="wicri:Area/Pmc/Corpus">000246</idno>
<idno type="wicri:explorRef" wicri:stream="Pmc" wicri:step="Corpus" wicri:corpus="PMC">000246</idno>
</publicationStmt>
<sourceDesc>
<biblStruct>
<analytic>
<title xml:lang="en" level="a" type="main">Best hits of 11110110111: model-free selection and parameter-free sensitivity calculation of spaced seeds</title>
<author>
<name sortKey="Noe, Laurent" sort="Noe, Laurent" uniqKey="Noe L" first="Laurent" last="Noé">Laurent Noé</name>
<affiliation>
<nlm:aff id="Aff1"></nlm:aff>
</affiliation>
</author>
</analytic>
<series>
<title level="j">Algorithms for Molecular Biology : AMB</title>
<idno type="eISSN">1748-7188</idno>
<imprint>
<date when="2017">2017</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
</fileDesc>
<profileDesc>
<textClass></textClass>
</profileDesc>
</teiHeader>
<front>
<div type="abstract" xml:lang="en">
<sec>
<title>Background</title>
<p>
<italic>Spaced seeds</italic>
, also named
<italic>gapped q-grams, gapped k-mers, or spaced q-grams</italic>
, have been proven to be more sensitive than contiguous seeds (
<italic>contiguous q-grams, contiguous k-mers</italic>
) in nucleic and amino-acid sequence analysis. Initially proposed to detect sequence similarities and to anchor sequence alignments, spaced seeds have more recently been applied in several
<italic>alignment-free</italic>
methods. Unfortunately, spaced seeds must be designed beforehand. This task is known to be time-consuming because of the number of spaced seed candidates. Moreover, the design can be affected by a set of
<italic>arbitrarily chosen</italic>
parameters from the probabilistic alignment models used. In this general context,
<italic>Dominant seeds</italic>
have been introduced by Mak and Benson (Bioinformatics 25:302–308, 2009) on the Bernoulli model, in order to reduce the number of spaced seed candidates that are further processed in a
<italic>parameter-free</italic>
calculation of the sensitivity.</p>
</sec>
<sec>
<title>Results</title>
<p>We extend the work of Mak and Benson on single and multiple seeds by considering the Hit Integration model of Chung and Park (BMC Bioinform 11:31, 2010), and demonstrate that the same dominance definition can be applied and that a parameter-free study can be performed without any significant additional cost. We also consider two new discrete models, namely the Heaviside and the Dirac models, where lossless seeds can be integrated. From a theoretical standpoint, we establish a generic framework on all the proposed models, by applying a
<italic>counting semi-ring</italic>
to quickly compute large polynomial coefficients needed by the
<italic>dominance</italic>
filter. From a practical standpoint, we confirm that
<italic>dominant seeds</italic>
reduce the set of either single seeds to analyse thoroughly or multiple seeds to store. Moreover, at
<ext-link ext-link-type="uri" xlink:href="http://bioinfo.cristal.univ-lille.fr/yass/iedera%5fdominance">http://bioinfo.cristal.univ-lille.fr/yass/iedera_dominance</ext-link>
, we provide a full list of spaced seeds computed on the four aforementioned models, with one (continuous) parameter left free for each model, and with several (discrete) alignment lengths.</p>
</sec>
</div>
</front>
<back>
<div1 type="bibliography">
<listBibl>
<biblStruct>
<analytic>
<author>
<name sortKey="Mak, Dyf" uniqKey="Mak D">DYF Mak</name>
</author>
<author>
<name sortKey="Benson, G" uniqKey="Benson G">G Benson</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Chung, Wh" uniqKey="Chung W">WH Chung</name>
</author>
<author>
<name sortKey="Park, Sb" uniqKey="Park S">SB Park</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Ma, B" uniqKey="Ma B">B Ma</name>
</author>
<author>
<name sortKey="Tromp, J" uniqKey="Tromp J">J Tromp</name>
</author>
<author>
<name sortKey="Li, M" uniqKey="Li M">M Li</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Burkhardt, S" uniqKey="Burkhardt S">S Burkhardt</name>
</author>
<author>
<name sortKey="K Rkk Inen, J" uniqKey="K Rkk Inen J">J Kärkkäinen</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Brejova, B" uniqKey="Brejova B">B Brejová</name>
</author>
<author>
<name sortKey="Brown, Dg" uniqKey="Brown D">DG Brown</name>
</author>
<author>
<name sortKey="Vina, T" uniqKey="Vina T">T Vinař</name>
</author>
</analytic>
</biblStruct>
<biblStruct></biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Mak, Dyf" uniqKey="Mak D">DYF Mak</name>
</author>
<author>
<name sortKey="Gelfand, Y" uniqKey="Gelfand Y">Y Gelfand</name>
</author>
<author>
<name sortKey="Benson, G" uniqKey="Benson G">G Benson</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Chen, K" uniqKey="Chen K">K Chen</name>
</author>
<author>
<name sortKey="Zhu, Q" uniqKey="Zhu Q">Q Zhu</name>
</author>
<author>
<name sortKey="Yang, F" uniqKey="Yang F">F Yang</name>
</author>
<author>
<name sortKey="Tang, D" uniqKey="Tang D">D Tang</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Cs Ros, M" uniqKey="Cs Ros M">M Csűrös</name>
</author>
<author>
<name sortKey="Ma, B" uniqKey="Ma B">B Ma</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Ilie, L" uniqKey="Ilie L">L Ilie</name>
</author>
<author>
<name sortKey="Ilie, S" uniqKey="Ilie S">S Ilie</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Chen, W" uniqKey="Chen W">W Chen</name>
</author>
<author>
<name sortKey="Sung, Wk" uniqKey="Sung W">WK Sung</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Noe, L" uniqKey="Noe L">L Noé</name>
</author>
<author>
<name sortKey="Kucherov, G" uniqKey="Kucherov G">G Kucherov</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Yang, J" uniqKey="Yang J">J Yang</name>
</author>
<author>
<name sortKey="Zhang, L" uniqKey="Zhang L">L Zhang</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Zhou, L" uniqKey="Zhou L">L Zhou</name>
</author>
<author>
<name sortKey="Stanton, J" uniqKey="Stanton J">J Stanton</name>
</author>
<author>
<name sortKey="Florea, L" uniqKey="Florea L">L Florea</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Frith, Mc" uniqKey="Frith M">MC Frith</name>
</author>
<author>
<name sortKey="Noe, L" uniqKey="Noe L">L Noé</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Li, M" uniqKey="Li M">M Li</name>
</author>
<author>
<name sortKey="Ma, B" uniqKey="Ma B">B Ma</name>
</author>
<author>
<name sortKey="Kisman, D" uniqKey="Kisman D">D Kisman</name>
</author>
<author>
<name sortKey="Tromp, J" uniqKey="Tromp J">J Tromp</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Sun, Y" uniqKey="Sun Y">Y Sun</name>
</author>
<author>
<name sortKey="Buhler, J" uniqKey="Buhler J">J Buhler</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Kucherov, G" uniqKey="Kucherov G">G Kucherov</name>
</author>
<author>
<name sortKey="Noe, L" uniqKey="Noe L">L Noé</name>
</author>
<author>
<name sortKey="Roytberg, Ma" uniqKey="Roytberg M">MA Roytberg</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Farach Colton, M" uniqKey="Farach Colton M">M Farach-Colton</name>
</author>
<author>
<name sortKey="Landau, Gm" uniqKey="Landau G">GM Landau</name>
</author>
<author>
<name sortKey="Cenk Sahinalp, S" uniqKey="Cenk Sahinalp S">S Cenk Sahinalp</name>
</author>
<author>
<name sortKey="Tsur, D" uniqKey="Tsur D">D Tsur</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Kielbasa, Sm" uniqKey="Kielbasa S">SM Kiełbasa</name>
</author>
<author>
<name sortKey="Wan, R" uniqKey="Wan R">R Wan</name>
</author>
<author>
<name sortKey="Sato, K" uniqKey="Sato K">K Sato</name>
</author>
<author>
<name sortKey="Horton, P" uniqKey="Horton P">P Horton</name>
</author>
<author>
<name sortKey="Frith, Mc" uniqKey="Frith M">MC Frith</name>
</author>
</analytic>
</biblStruct>
<biblStruct></biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Crochemore, M" uniqKey="Crochemore M">M Crochemore</name>
</author>
<author>
<name sortKey="Tischler, G" uniqKey="Tischler G">G Tischler</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Onodera, T" uniqKey="Onodera T">T Onodera</name>
</author>
<author>
<name sortKey="Shibuya, T" uniqKey="Shibuya T">T Shibuya</name>
</author>
</analytic>
</biblStruct>
<biblStruct></biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Shrestha, Ams" uniqKey="Shrestha A">AMS Shrestha</name>
</author>
<author>
<name sortKey="Frith, Mc" uniqKey="Frith M">MC Frith</name>
</author>
<author>
<name sortKey="Horton, P" uniqKey="Horton P">P Horton</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Birol, I" uniqKey="Birol I">I Birol</name>
</author>
<author>
<name sortKey="Chu, J" uniqKey="Chu J">J Chu</name>
</author>
<author>
<name sortKey="Mohamadi, H" uniqKey="Mohamadi H">H Mohamadi</name>
</author>
<author>
<name sortKey="Jackman, Sd" uniqKey="Jackman S">SD Jackman</name>
</author>
<author>
<name sortKey="Raghavan, K" uniqKey="Raghavan K">K Raghavan</name>
</author>
<author>
<name sortKey="Vandervalk, Bp" uniqKey="Vandervalk B">BP Vandervalk</name>
</author>
<author>
<name sortKey="Raymond, A" uniqKey="Raymond A">A Raymond</name>
</author>
<author>
<name sortKey="Warren, Rl" uniqKey="Warren R">RL Warren</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Keich, U" uniqKey="Keich U">U Keich</name>
</author>
<author>
<name sortKey="Li, M" uniqKey="Li M">M Li</name>
</author>
<author>
<name sortKey="Ma, B" uniqKey="Ma B">B Ma</name>
</author>
<author>
<name sortKey="Tromp, J" uniqKey="Tromp J">J Tromp</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Nicolas, F" uniqKey="Nicolas F">F Nicolas</name>
</author>
<author>
<name sortKey="Rivals, E" uniqKey="Rivals E">É Rivals</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Ma, B" uniqKey="Ma B">B Ma</name>
</author>
<author>
<name sortKey="Yao, H" uniqKey="Yao H">H Yao</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Schwartz, S" uniqKey="Schwartz S">S Schwartz</name>
</author>
<author>
<name sortKey="Kent, Wj" uniqKey="Kent W">WJ Kent</name>
</author>
<author>
<name sortKey="Smit, A" uniqKey="Smit A">A Smit</name>
</author>
<author>
<name sortKey="Zhang, Z" uniqKey="Zhang Z">Z Zhang</name>
</author>
<author>
<name sortKey="Baertsch, R" uniqKey="Baertsch R">R Baertsch</name>
</author>
<author>
<name sortKey="Hardison, Rc" uniqKey="Hardison R">RC Hardison</name>
</author>
<author>
<name sortKey="Haussler, D" uniqKey="Haussler D">D Haussler</name>
</author>
<author>
<name sortKey="Miller, W" uniqKey="Miller W">W Miller</name>
</author>
</analytic>
</biblStruct>
<biblStruct></biblStruct>
<biblStruct></biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Lin, H" uniqKey="Lin H">H Lin</name>
</author>
<author>
<name sortKey="Zhang, Z" uniqKey="Zhang Z">Z Zhang</name>
</author>
<author>
<name sortKey="Zhang, Mq" uniqKey="Zhang M">MQ Zhang</name>
</author>
<author>
<name sortKey="Ma, B" uniqKey="Ma B">B Ma</name>
</author>
<author>
<name sortKey="Li, M" uniqKey="Li M">M Li</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Rumble, Sm" uniqKey="Rumble S">SM Rumble</name>
</author>
<author>
<name sortKey="Lacroute, P" uniqKey="Lacroute P">P Lacroute</name>
</author>
<author>
<name sortKey="Dalca, Av" uniqKey="Dalca A">AV Dalca</name>
</author>
<author>
<name sortKey="Fiume, M" uniqKey="Fiume M">M Fiume</name>
</author>
<author>
<name sortKey="Sidow, A" uniqKey="Sidow A">A Sidow</name>
</author>
<author>
<name sortKey="Brudno, M" uniqKey="Brudno M">M Brudno</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Chen, Y" uniqKey="Chen Y">Y Chen</name>
</author>
<author>
<name sortKey="Souaiaia, T" uniqKey="Souaiaia T">T Souaiaia</name>
</author>
<author>
<name sortKey="Chen, T" uniqKey="Chen T">T Chen</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Giladi, E" uniqKey="Giladi E">E Giladi</name>
</author>
<author>
<name sortKey="Healy, J" uniqKey="Healy J">J Healy</name>
</author>
<author>
<name sortKey="Myers, G" uniqKey="Myers G">G Myers</name>
</author>
<author>
<name sortKey="Hart, C" uniqKey="Hart C">C Hart</name>
</author>
<author>
<name sortKey="Kapranov, P" uniqKey="Kapranov P">P Kapranov</name>
</author>
<author>
<name sortKey="Lipson, D" uniqKey="Lipson D">D Lipson</name>
</author>
<author>
<name sortKey="Roels, S" uniqKey="Roels S">S Roels</name>
</author>
<author>
<name sortKey="Thayer, E" uniqKey="Thayer E">E Thayer</name>
</author>
<author>
<name sortKey="Letovsky, S" uniqKey="Letovsky S">S Letovsky</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="David, M" uniqKey="David M">M David</name>
</author>
<author>
<name sortKey="Dzamba, M" uniqKey="Dzamba M">M Dzamba</name>
</author>
<author>
<name sortKey="Lister, D" uniqKey="Lister D">D Lister</name>
</author>
<author>
<name sortKey="Ilie, L" uniqKey="Ilie L">L Ilie</name>
</author>
<author>
<name sortKey="Brudno, M" uniqKey="Brudno M">M Brudno</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Sovi, I" uniqKey="Sovi I">I Sović</name>
</author>
<author>
<name sortKey="Siki, M" uniqKey="Siki M">M Šikić</name>
</author>
<author>
<name sortKey="Wilm, A" uniqKey="Wilm A">A Wilm</name>
</author>
<author>
<name sortKey="Fenlon, Sn" uniqKey="Fenlon S">SN Fenlon</name>
</author>
<author>
<name sortKey="Chen, S" uniqKey="Chen S">S Chen</name>
</author>
<author>
<name sortKey="Nagarajan, N" uniqKey="Nagarajan N">N Nagarajan</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Preparata, Fp" uniqKey="Preparata F">FP Preparata</name>
</author>
<author>
<name sortKey="Oliver, Js" uniqKey="Oliver J">JS Oliver</name>
</author>
</analytic>
</biblStruct>
<biblStruct></biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Feng, S" uniqKey="Feng S">S Feng</name>
</author>
<author>
<name sortKey="Tillier, Erm" uniqKey="Tillier E">ERM Tillier</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Chung, W H" uniqKey="Chung W">W-H Chung</name>
</author>
<author>
<name sortKey="Park, S B" uniqKey="Park S">S-B Park</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Ilie, L" uniqKey="Ilie L">L Ilie</name>
</author>
<author>
<name sortKey="Ilie, S" uniqKey="Ilie S">S Ilie</name>
</author>
<author>
<name sortKey="Khoshraftar, S" uniqKey="Khoshraftar S">S Khoshraftar</name>
</author>
<author>
<name sortKey="Mansouri Bigvand, A" uniqKey="Mansouri Bigvand A">A Mansouri Bigvand</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Ilie, L" uniqKey="Ilie L">L Ilie</name>
</author>
<author>
<name sortKey="Mohamadi, H" uniqKey="Mohamadi H">H Mohamadi</name>
</author>
<author>
<name sortKey="Brian Golding, G" uniqKey="Brian Golding G">G Brian Golding</name>
</author>
<author>
<name sortKey="Smyth, Wf" uniqKey="Smyth W">WF Smyth</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Kisman, D" uniqKey="Kisman D">D Kisman</name>
</author>
<author>
<name sortKey="Li, M" uniqKey="Li M">M Li</name>
</author>
<author>
<name sortKey="Ma, B" uniqKey="Ma B">B Ma</name>
</author>
<author>
<name sortKey="Li, W" uniqKey="Li W">W Li</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Brown, Dg" uniqKey="Brown D">DG Brown</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Roytberg, Ma" uniqKey="Roytberg M">MA Roytberg</name>
</author>
<author>
<name sortKey="Gambin, A" uniqKey="Gambin A">A Gambin</name>
</author>
<author>
<name sortKey="Noe, L" uniqKey="Noe L">L Noé</name>
</author>
<author>
<name sortKey="Lasota, S" uniqKey="Lasota S">S Lasota</name>
</author>
<author>
<name sortKey="Furletova, E" uniqKey="Furletova E">E Furletova</name>
</author>
<author>
<name sortKey="Szczurek, E" uniqKey="Szczurek E">E Szczurek</name>
</author>
<author>
<name sortKey="Kucherov, G" uniqKey="Kucherov G">G Kucherov</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Nguyen, V H" uniqKey="Nguyen V">V-H Nguyen</name>
</author>
<author>
<name sortKey="Lavenier, D" uniqKey="Lavenier D">D Lavenier</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Startek, M" uniqKey="Startek M">M Startek</name>
</author>
<author>
<name sortKey="Lasota, S" uniqKey="Lasota S">S Lasota</name>
</author>
<author>
<name sortKey="Sykulski, M" uniqKey="Sykulski M">M Sykulski</name>
</author>
<author>
<name sortKey="Bulak, A" uniqKey="Bulak A">A Bułak</name>
</author>
<author>
<name sortKey="Noe, L" uniqKey="Noe L">L Noé</name>
</author>
<author>
<name sortKey="Kucherov, G" uniqKey="Kucherov G">G Kucherov</name>
</author>
<author>
<name sortKey="Gambin, A" uniqKey="Gambin A">A Gambin</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Li, W" uniqKey="Li W">W Li</name>
</author>
<author>
<name sortKey="Ma, B" uniqKey="Ma B">B Ma</name>
</author>
<author>
<name sortKey="Zhang, K" uniqKey="Zhang K">K Zhang</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Buchfink, B" uniqKey="Buchfink B">B Buchfink</name>
</author>
<author>
<name sortKey="Xie, C" uniqKey="Xie C">C Xie</name>
</author>
<author>
<name sortKey="Huson, Dh" uniqKey="Huson D">DH Huson</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Somervuo, P" uniqKey="Somervuo P">P Somervuo</name>
</author>
<author>
<name sortKey="Holm, L" uniqKey="Holm L">L Holm</name>
</author>
</analytic>
</biblStruct>
<biblStruct></biblStruct>
<biblStruct></biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Ilie, L" uniqKey="Ilie L">L Ilie</name>
</author>
<author>
<name sortKey="Ilie, S" uniqKey="Ilie S">S Ilie</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Ilie, L" uniqKey="Ilie L">L Ilie</name>
</author>
<author>
<name sortKey="Ilie, S" uniqKey="Ilie S">S Ilie</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Ilie, S" uniqKey="Ilie S">S Ilie</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Egidi, L" uniqKey="Egidi L">L Egidi</name>
</author>
<author>
<name sortKey="Manzini, G" uniqKey="Manzini G">G Manzini</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Egidi, L" uniqKey="Egidi L">L Egidi</name>
</author>
<author>
<name sortKey="Manzini, G" uniqKey="Manzini G">G Manzini</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Egidi, L" uniqKey="Egidi L">L Egidi</name>
</author>
<author>
<name sortKey="Manzini, G" uniqKey="Manzini G">G Manzini</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Egidi, L" uniqKey="Egidi L">L Egidi</name>
</author>
<author>
<name sortKey="Manzini, G" uniqKey="Manzini G">G Manzini</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Brejova, B" uniqKey="Brejova B">B Brejová</name>
</author>
<author>
<name sortKey="Brown, Dg" uniqKey="Brown D">DG Brown</name>
</author>
<author>
<name sortKey="Vina, T" uniqKey="Vina T">T Vinař</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Buhler, J" uniqKey="Buhler J">J Buhler</name>
</author>
<author>
<name sortKey="Keich, U" uniqKey="Keich U">U Keich</name>
</author>
<author>
<name sortKey="Sun, Y" uniqKey="Sun Y">Y Sun</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Preparata, Fp" uniqKey="Preparata F">FP Preparata</name>
</author>
<author>
<name sortKey="Zhang, L" uniqKey="Zhang L">L Zhang</name>
</author>
<author>
<name sortKey="Choi, Kp" uniqKey="Choi K">KP Choi</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Kucherov, G" uniqKey="Kucherov G">G Kucherov</name>
</author>
<author>
<name sortKey="Noe, L" uniqKey="Noe L">L Noé</name>
</author>
<author>
<name sortKey="Roytberg, Ma" uniqKey="Roytberg M">MA Roytberg</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Zhang, L" uniqKey="Zhang L">L Zhang</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Kong, Y" uniqKey="Kong Y">Y Kong</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Noe, L" uniqKey="Noe L">L Noé</name>
</author>
<author>
<name sortKey="Girdea, M" uniqKey="Girdea M">M Gîrdea</name>
</author>
<author>
<name sortKey="Kucherov, G" uniqKey="Kucherov G">G Kucherov</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Marschall, T" uniqKey="Marschall T">T Marschall</name>
</author>
<author>
<name sortKey="Herms, I" uniqKey="Herms I">I Herms</name>
</author>
<author>
<name sortKey="Kaltenbach, H M" uniqKey="Kaltenbach H">H-M Kaltenbach</name>
</author>
<author>
<name sortKey="Rahmann, S" uniqKey="Rahmann S">S Rahmann</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Martin, Dek" uniqKey="Martin D">DEK Martin</name>
</author>
<author>
<name sortKey="Noe, L" uniqKey="Noe L">L Noé</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Horwege, S" uniqKey="Horwege S">S Horwege</name>
</author>
<author>
<name sortKey="Lindner, S" uniqKey="Lindner S">S Lindner</name>
</author>
<author>
<name sortKey="Boden, M" uniqKey="Boden M">M Boden</name>
</author>
<author>
<name sortKey="Hatje, K" uniqKey="Hatje K">K Hatje</name>
</author>
<author>
<name sortKey="Kollmar, M" uniqKey="Kollmar M">M Kollmar</name>
</author>
<author>
<name sortKey="Leimeister, C A" uniqKey="Leimeister C">C-A Leimeister</name>
</author>
<author>
<name sortKey="Morgenstern, B" uniqKey="Morgenstern B">B Morgenstern</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Leimeister, Ca" uniqKey="Leimeister C">CA Leimeister</name>
</author>
<author>
<name sortKey="Boden, M" uniqKey="Boden M">M Boden</name>
</author>
<author>
<name sortKey="Horwege, S" uniqKey="Horwege S">S Horwege</name>
</author>
<author>
<name sortKey="Lindner, S" uniqKey="Lindner S">S Lindner</name>
</author>
<author>
<name sortKey="Morgenstern, B" uniqKey="Morgenstern B">B Morgenstern</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Ghandi, M" uniqKey="Ghandi M">M Ghandi</name>
</author>
<author>
<name sortKey="Mohammad Noori, M" uniqKey="Mohammad Noori M">M Mohammad-Noori</name>
</author>
<author>
<name sortKey="Beer, Ma" uniqKey="Beer M">MA Beer</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Morgenstern, B" uniqKey="Morgenstern B">B Morgenstern</name>
</author>
<author>
<name sortKey="Zhu, B" uniqKey="Zhu B">B Zhu</name>
</author>
<author>
<name sortKey="Horwege, S" uniqKey="Horwege S">S Horwege</name>
</author>
<author>
<name sortKey="Leimeister, Ca" uniqKey="Leimeister C">CA Leimeister</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="B Inda, K" uniqKey="B Inda K">K Břinda</name>
</author>
<author>
<name sortKey="Sykulski, M" uniqKey="Sykulski M">M Sykulski</name>
</author>
<author>
<name sortKey="Kucherov, G" uniqKey="Kucherov G">G Kucherov</name>
</author>
</analytic>
</biblStruct>
<biblStruct></biblStruct>
<biblStruct></biblStruct>
<biblStruct></biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Gheraibia, Y" uniqKey="Gheraibia Y">Y Gheraibia</name>
</author>
<author>
<name sortKey="Moussaoui, A" uniqKey="Moussaoui A">A Moussaoui</name>
</author>
<author>
<name sortKey="Djenouri, Y" uniqKey="Djenouri Y">Y Djenouri</name>
</author>
<author>
<name sortKey="Kabir, S" uniqKey="Kabir S">S Kabir</name>
</author>
<author>
<name sortKey="Yin, P Y" uniqKey="Yin P">P-Y Yin</name>
</author>
<author>
<name sortKey="Mazouzi, S" uniqKey="Mazouzi S">S Mazouzi</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Hahn, L" uniqKey="Hahn L">L Hahn</name>
</author>
<author>
<name sortKey="Leimeister, C A" uniqKey="Leimeister C">C-A Leimeister</name>
</author>
<author>
<name sortKey="Ounit, R" uniqKey="Ounit R">R Ounit</name>
</author>
<author>
<name sortKey="Lonardi, S" uniqKey="Lonardi S">S Lonardi</name>
</author>
<author>
<name sortKey="Morgenstern, B" uniqKey="Morgenstern B">B Morgenstern</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Choi, Kp" uniqKey="Choi K">KP Choi</name>
</author>
<author>
<name sortKey="Zeng, F" uniqKey="Zeng F">F Zeng</name>
</author>
<author>
<name sortKey="Zhang, L" uniqKey="Zhang L">L Zhang</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Allauzen, C" uniqKey="Allauzen C">C Allauzen</name>
</author>
<author>
<name sortKey="Riley, M" uniqKey="Riley M">M Riley</name>
</author>
<author>
<name sortKey="Schalkwyk, J" uniqKey="Schalkwyk J">J Schalkwyk</name>
</author>
<author>
<name sortKey="Skut, W" uniqKey="Skut W">W Skut</name>
</author>
<author>
<name sortKey="Mohri, M" uniqKey="Mohri M">M Mohri</name>
</author>
</analytic>
</biblStruct>
<biblStruct></biblStruct>
<biblStruct></biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Hopcroft, Je" uniqKey="Hopcroft J">JE Hopcroft</name>
</author>
<author>
<name sortKey="Motwani, R" uniqKey="Motwani R">R Motwani</name>
</author>
<author>
<name sortKey="Ullman, Jd" uniqKey="Ullman J">JD Ullman</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Aston, Jad" uniqKey="Aston J">JAD Aston</name>
</author>
<author>
<name sortKey="Martin, Dek" uniqKey="Martin D">DEK Martin</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Noe, L" uniqKey="Noe L">L Noé</name>
</author>
<author>
<name sortKey="Martin, Dek" uniqKey="Martin D">DEK Martin</name>
</author>
</analytic>
</biblStruct>
<biblStruct></biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Ma, B" uniqKey="Ma B">B Ma</name>
</author>
<author>
<name sortKey="Li, M" uniqKey="Li M">M Li</name>
</author>
</analytic>
</biblStruct>
<biblStruct></biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Nicodeme, P" uniqKey="Nicodeme P">P Nicodème</name>
</author>
<author>
<name sortKey="Salvy, B" uniqKey="Salvy B">B Salvy</name>
</author>
<author>
<name sortKey="Flajolet, P" uniqKey="Flajolet P">P Flajolet</name>
</author>
</analytic>
</biblStruct>
<biblStruct></biblStruct>
</listBibl>
</div1>
</back>
</TEI>
<pmc article-type="research-article">
<pmc-dir>properties open_access</pmc-dir>
<front>
<journal-meta>
<journal-id journal-id-type="nlm-ta">Algorithms Mol Biol</journal-id>
<journal-id journal-id-type="iso-abbrev">Algorithms Mol Biol</journal-id>
<journal-title-group>
<journal-title>Algorithms for Molecular Biology : AMB</journal-title>
</journal-title-group>
<issn pub-type="epub">1748-7188</issn>
<publisher>
<publisher-name>BioMed Central</publisher-name>
<publisher-loc>London</publisher-loc>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="pmid">28289437</article-id>
<article-id pub-id-type="pmc">5310094</article-id>
<article-id pub-id-type="publisher-id">92</article-id>
<article-id pub-id-type="doi">10.1186/s13015-017-0092-1</article-id>
<article-categories>
<subj-group subj-group-type="heading">
<subject>Research</subject>
</subj-group>
</article-categories>
<title-group>
<article-title>Best hits of 11110110111: model-free selection and parameter-free sensitivity calculation of spaced seeds</article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author" corresp="yes">
<contrib-id contrib-id-type="orcid">http://orcid.org/0000-0002-1170-8376</contrib-id>
<name>
<surname>Noé</surname>
<given-names>Laurent</given-names>
</name>
<address>
<email>laurent.noe@univ-lille.fr</email>
</address>
<xref ref-type="aff" rid="Aff1"></xref>
</contrib>
<aff id="Aff1">
<institution-wrap>
<institution-id institution-id-type="ISNI">0000 0001 2186 1211</institution-id>
<institution-id institution-id-type="GRID">grid.4461.7</institution-id>
<institution>CRIStAL (UMR 9189 Lille University/CNRS)-Inria Lille, Bat M3 ext,</institution>
<institution>Université Lille 1,</institution>
</institution-wrap>
59655 Villeneuve d’Ascq, France</aff>
</contrib-group>
<pub-date pub-type="epub">
<day>14</day>
<month>2</month>
<year>2017</year>
</pub-date>
<pub-date pub-type="pmc-release">
<day>14</day>
<month>2</month>
<year>2017</year>
</pub-date>
<pub-date pub-type="collection">
<year>2017</year>
</pub-date>
<volume>12</volume>
<elocation-id>1</elocation-id>
<history>
<date date-type="received">
<day>19</day>
<month>9</month>
<year>2016</year>
</date>
<date date-type="accepted">
<day>30</day>
<month>1</month>
<year>2017</year>
</date>
</history>
<permissions>
<copyright-statement>© The Author(s) 2017</copyright-statement>
<license license-type="OpenAccess">
<license-p>
<bold>Open Access</bold>
This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (
<ext-link ext-link-type="uri" xlink:href="http://creativecommons.org/licenses/by/4.0/">http://creativecommons.org/licenses/by/4.0/</ext-link>
), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (
<ext-link ext-link-type="uri" xlink:href="http://creativecommons.org/publicdomain/zero/1.0/">http://creativecommons.org/publicdomain/zero/1.0/</ext-link>
) applies to the data made available in this article, unless otherwise stated.</license-p>
</license>
</permissions>
<abstract id="Abs1">
<sec>
<title>Background</title>
<p>
<italic>Spaced seeds</italic>
, also named
<italic>gapped q-grams, gapped k-mers, spaced q-grams</italic>
, have been proven to be more sensitive than contiguous seeds (
<italic>contiguous q-grams, contiguous k-mers</italic>
) in nucleic and amino-acid sequence analysis. Initially proposed to detect sequence similarities and to anchor sequence alignments, spaced seeds have more recently been applied in several
<italic>alignment-free</italic>
related methods. Unfortunately, spaced seeds need to be initially designed. This task is known to be time-consuming due to the number of spaced seed candidates. Moreover, it can be altered by a set of
<italic>arbitrarily chosen</italic>
parameters from the probabilistic alignment models used. In this general context,
<italic>Dominant seeds</italic>
have been introduced by Mak and Benson (Bioinformatics 25:302–308, 2009) on the Bernoulli model, in order to reduce the number of spaced seed candidates that are further processed in a
<italic>parameter-free</italic>
calculation of the sensitivity.</p>
</sec>
<sec>
<title>Results</title>
<p>We expand the scope of work of Mak and Benson on single and multiple seeds by considering the Hit Integration model of Chung and Park (BMC Bioinform 11:31, 2010), demonstrate that the same dominance definition can be applied, and that a parameter-free study can be performed without any significant additional cost. We also consider two new discrete models, namely the Heaviside and the Dirac models, where lossless seeds can be integrated. From a theoretical standpoint, we establish a generic framework on all the proposed models, by applying a
<italic>counting semi-ring</italic>
to quickly compute large polynomial coefficients needed by the
<italic>dominance</italic>
filter. From a practical standpoint, we confirm that
<italic>dominant seeds</italic>
reduce the set of either single seeds to analyse thoroughly or multiple seeds to store. Moreover, in
<ext-link ext-link-type="uri" xlink:href="http://bioinfo.cristal.univ-lille.fr/yass/iedera%5fdominance">http://bioinfo.cristal.univ-lille.fr/yass/iedera_dominance</ext-link>
, we provide a full list of spaced seeds computed on the four aforementioned models, with one (continuous) parameter left free for each model, and with several (discrete) alignment lengths.</p>
</sec>
</abstract>
<kwd-group xml:lang="en">
<title>Keywords</title>
<kwd>Spaced seeds</kwd>
<kwd>Dominant seeds</kwd>
<kwd>Bernoulli</kwd>
<kwd>Hit Integration</kwd>
<kwd>Heaviside</kwd>
<kwd>Dirac</kwd>
<kwd>Counting semi-ring</kwd>
<kwd>Polynomial form</kwd>
<kwd>DFA</kwd>
</kwd-group>
<funding-group>
<award-group>
<funding-source>
<institution>INRIA</institution>
</funding-source>
</award-group>
</funding-group>
<custom-meta-group>
<custom-meta>
<meta-name>issue-copyright-statement</meta-name>
<meta-value>© The Author(s) 2017</meta-value>
</custom-meta>
</custom-meta-group>
</article-meta>
</front>
<body>
<sec id="Sec1">
<title>Background</title>
<p>
<italic>Optimized spaced seeds</italic>
, or
<italic>best gapped q-grams</italic>
, have independently been proposed in PatternHunter [
<xref ref-type="bibr" rid="CR3">3</xref>
] and by Burkhardt and Kärkkäinen [
<xref ref-type="bibr" rid="CR4">4</xref>
]. The primary objective was either to improve the sensitivity of the heuristic but efficient
<italic>hit and extend</italic>
BLAST-like strategy (without using the
<italic>neighborhood word principle</italic>
<xref ref-type="fn" rid="Fn1">1</xref>
), or to increase the selectivity for lossless filters on alignments of size
<inline-formula id="IEq1">
<alternatives>
<tex-math id="M1">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\ell$$\end{document}</tex-math>
<mml:math id="M2">
<mml:mi>ℓ</mml:mi>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq1.gif"></inline-graphic>
</alternatives>
</inline-formula>
under a given Hamming distance of
<italic>k</italic>
.</p>
<p>Several extensions of the spaced seed model have since been proposed for the two aforementioned problems: vector seeds [
<xref ref-type="bibr" rid="CR5">5</xref>
], one gapped
<italic>q</italic>
-grams [
<xref ref-type="bibr" rid="CR6">6</xref>
] or indel seeds [
<xref ref-type="bibr" rid="CR7">7</xref>
,
<xref ref-type="bibr" rid="CR8">8</xref>
], neighbor seeds [
<xref ref-type="bibr" rid="CR9">9</xref>
,
<xref ref-type="bibr" rid="CR10">10</xref>
], transition seeds [
<xref ref-type="bibr" rid="CR11">11</xref>
–
<xref ref-type="bibr" rid="CR15">15</xref>
], multiple seeds [
<xref ref-type="bibr" rid="CR16">16</xref>
–
<xref ref-type="bibr" rid="CR19">19</xref>
], adaptive seeds [
<xref ref-type="bibr" rid="CR20">20</xref>
] and related work on the associated indexes [
<xref ref-type="bibr" rid="CR21">21</xref>
–
<xref ref-type="bibr" rid="CR26">26</xref>
], just to mention a few.</p>
<p>Unfortunately, spaced seeds are known to raise hard problems, for the seed sensitivity computation [
<xref ref-type="bibr" rid="CR27">27</xref>
], for the lossless computation [
<xref ref-type="bibr" rid="CR28">28</xref>
], and moreover for the seed design [
<xref ref-type="bibr" rid="CR29">29</xref>
]. Yet the choice of the right seed pattern has a significant impact on genomic sequence comparison [
<xref ref-type="bibr" rid="CR3">3</xref>
,
<xref ref-type="bibr" rid="CR12">12</xref>
,
<xref ref-type="bibr" rid="CR16">16</xref>
,
<xref ref-type="bibr" rid="CR20">20</xref>
,
<xref ref-type="bibr" rid="CR30">30</xref>
–
<xref ref-type="bibr" rid="CR38">38</xref>
], on oligonucleotide design [
<xref ref-type="bibr" rid="CR39">39</xref>
–
<xref ref-type="bibr" rid="CR44">44</xref>
], as well as on amino acid sequence comparison [
<xref ref-type="bibr" rid="CR45">45</xref>
–
<xref ref-type="bibr" rid="CR53">53</xref>
]; this has led to several effective methods to (possibly greedily) select spaced seeds [
<xref ref-type="bibr" rid="CR54">54</xref>
–
<xref ref-type="bibr" rid="CR61">61</xref>
] with elaborate alignment models and their associated algorithms [
<xref ref-type="bibr" rid="CR62">62</xref>
–
<xref ref-type="bibr" rid="CR70">70</xref>
].</p>
<p>Another less frequently mentioned problem is that the seed design is mostly performed on a
<italic>fixed and already fully parameterized</italic>
alignment model (for example, a
<italic>Bernoulli</italic>
model where the
<italic>probability of a match</italic>
<italic>p</italic>
is set to 0.7). This leaves little flexibility in the choice of the optimal seed when, for example, the scoring system, and thus the expected distribution of alignments, is changed.</p>
<p>We note that several recent works mention the use of spaced seeds in
<italic>alignment-free</italic>
methods [
<xref ref-type="bibr" rid="CR71">71</xref>
–
<xref ref-type="bibr" rid="CR73">73</xref>
] with applications in phylogenetic distance estimation [
<xref ref-type="bibr" rid="CR74">74</xref>
] and metagenomic classification [
<xref ref-type="bibr" rid="CR75">75</xref>
,
<xref ref-type="bibr" rid="CR76">76</xref>
], just to cite a few.</p>
<p>Finally, we also noticed that several recent studies use the
<italic>overlap complexity</italic>
 [
<xref ref-type="bibr" rid="CR54">54</xref>
,
<xref ref-type="bibr" rid="CR56">56</xref>
,
<xref ref-type="bibr" rid="CR57">57</xref>
,
<xref ref-type="bibr" rid="CR77">77</xref>
–
<xref ref-type="bibr" rid="CR79">79</xref>
] which is closely linked to the
<italic>variance</italic>
of the number of spaced-word matches [
<xref ref-type="bibr" rid="CR80">80</xref>
] and is known to provide an upper/lower bound for the expectation of the length preceding the first seed hit [
<xref ref-type="bibr" rid="CR27">27</xref>
,
<xref ref-type="bibr" rid="CR66">66</xref>
,
<xref ref-type="bibr" rid="CR81">81</xref>
]. We mention here that a similar
<italic>parameter-free</italic>
approach could also be applied for the
<italic>variance induced</italic>
selection of seeds, but an interesting question remains in that case: to find a
<italic>dominance equivalent</italic>
criterion associated with the selection of candidate seeds.</p>
<p>The paper is organized as follows. We start with an introduction to the
<italic>spaced seed model</italic>
and its associated
<italic>sensitivity</italic>
or
<italic>lossless aspect</italic>
, and show how
<italic>semi-rings</italic>
on DFA can help determine such features. Section “
<xref rid="Sec3" ref-type="sec">Semi-rings and number of alignments</xref>
” restricts the description to
<italic>counting semi-rings</italic>
that are applied on a specific DFA to perform an efficient dynamic programming algorithm on a set of counters. This is a prerequisite for the next two sections, which present respectively
<italic>continuous models</italic>
and
<italic>discrete models</italic>
. Section “
<xref rid="Sec4" ref-type="sec">Continuous models</xref>
” is divided into two parts: the first one outlines the
<italic>polynomial form of the sensitivity</italic>
proposed by [
<xref ref-type="bibr" rid="CR1">1</xref>
] to compute the sensitivity on the
<italic>Bernoulli model</italic>
together with the associated
<italic>dominance principle</italic>
, whereas the second one extends this
<italic>polynomial form</italic>
to the
<italic>Hit Integration model</italic>
of [
<xref ref-type="bibr" rid="CR2">2</xref>
], and explains why the dominance principle remains valid. Section “
<xref rid="Sec7" ref-type="sec">Discrete models</xref>
” describes two new
<italic>Dirac</italic>
and
<italic>Heaviside</italic>
models, and shows how
<italic>lossless seeds</italic>
can be integrated into them. Then, we report our experimental analysis on all the aforementioned models, display and explain several optimal seed Pareto plots for the restricted case of a single seed, and link to a wide range of compiled results for multiple seeds. The last section brings the discussion to the asymptotic problem, and to several finite extensions.</p>
</sec>
<sec id="Sec2">
<title>Spaced seeds and seed sensitivity</title>
<p>We suppose here that strings are indexed starting from position number 1. For a given string
<italic>u</italic>
, we will use the following notation:
<italic>u</italic>
[
<italic>i</italic>
] gives the
<italic>i</italic>
-th symbol of
<italic>u</italic>
, |
<italic>u</italic>
| is the length of
<italic>u</italic>
, and
<inline-formula id="IEq2">
<alternatives>
<tex-math id="M3">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$|u|_a$$\end{document}</tex-math>
<mml:math id="M4">
<mml:msub>
<mml:mrow>
<mml:mo stretchy="false">|</mml:mo>
<mml:mi>u</mml:mi>
<mml:mo stretchy="false">|</mml:mo>
</mml:mrow>
<mml:mi>a</mml:mi>
</mml:msub>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq2.gif"></inline-graphic>
</alternatives>
</inline-formula>
is the number of occurrences of the symbol
<italic>a</italic>
that
<italic>u</italic>
contains.</p>
<p>Nucleotide sequence alignments without
<italic>indels</italic>
can be represented as a succession of
<italic>match</italic>
or
<italic>mismatch</italic>
symbols, and thus represented as a string
<italic>x</italic>
over a binary alphabet
<inline-formula id="IEq3">
<alternatives>
<tex-math id="M5">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\{\texttt {0},\texttt {1}\}$$\end{document}</tex-math>
<mml:math id="M6">
<mml:mrow>
<mml:mo stretchy="false">{</mml:mo>
<mml:mn mathvariant="monospace">0</mml:mn>
<mml:mo>,</mml:mo>
<mml:mn mathvariant="monospace">1</mml:mn>
<mml:mo stretchy="false">}</mml:mo>
</mml:mrow>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq3.gif"></inline-graphic>
</alternatives>
</inline-formula>
.</p>
<p>A spaced seed can be represented as a string
<inline-formula id="IEq4">
<alternatives>
<tex-math id="M7">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\pi$$\end{document}</tex-math>
<mml:math id="M8">
<mml:mi mathvariant="italic">π</mml:mi>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq4.gif"></inline-graphic>
</alternatives>
</inline-formula>
over a binary alphabet
<inline-formula id="IEq5">
<alternatives>
<tex-math id="M9">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\{\text {0},\text {1}\}$$\end{document}</tex-math>
<mml:math id="M10">
<mml:mrow>
<mml:mo stretchy="false">{</mml:mo>
<mml:mtext>0</mml:mtext>
<mml:mo>,</mml:mo>
<mml:mtext>1</mml:mtext>
<mml:mo stretchy="false">}</mml:mo>
</mml:mrow>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq5.gif"></inline-graphic>
</alternatives>
</inline-formula>
but with a different meaning for each of the two symbols:
<inline-formula id="IEq6">
<alternatives>
<tex-math id="M11">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\text {1}$$\end{document}</tex-math>
<mml:math id="M12">
<mml:mtext>1</mml:mtext>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq6.gif"></inline-graphic>
</alternatives>
</inline-formula>
indicates a position on the seed
<inline-formula id="IEq7">
<alternatives>
<tex-math id="M13">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\pi$$\end{document}</tex-math>
<mml:math id="M14">
<mml:mi mathvariant="italic">π</mml:mi>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq7.gif"></inline-graphic>
</alternatives>
</inline-formula>
where a single
<italic>match</italic>
must occur in the alignment
<italic>x</italic>
(it is thus called a
<italic>must match</italic>
symbol), whereas
<inline-formula id="IEq8">
<alternatives>
<tex-math id="M15">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\text {0}$$\end{document}</tex-math>
<mml:math id="M16">
<mml:mtext>0</mml:mtext>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq8.gif"></inline-graphic>
</alternatives>
</inline-formula>
indicates a position where a single
<italic>match</italic>
or a single
<italic>mismatch</italic>
is allowed (it is thus called a
<italic>don’t-care</italic>
symbol).</p>
<p>The
<italic>weight</italic>
of a seed
<inline-formula id="IEq9">
<alternatives>
<tex-math id="M17">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\pi$$\end{document}</tex-math>
<mml:math id="M18">
<mml:mi mathvariant="italic">π</mml:mi>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq9.gif"></inline-graphic>
</alternatives>
</inline-formula>
(denoted by
<italic>w</italic>
or
<inline-formula id="IEq10">
<alternatives>
<tex-math id="M19">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$w_\pi$$\end{document}</tex-math>
<mml:math id="M20">
<mml:msub>
<mml:mi>w</mml:mi>
<mml:mi mathvariant="italic">π</mml:mi>
</mml:msub>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq10.gif"></inline-graphic>
</alternatives>
</inline-formula>
) is defined as the number of
<italic>must match</italic>
symbols (
<inline-formula id="IEq11">
<alternatives>
<tex-math id="M21">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$w_\pi = |\pi |_1$$\end{document}</tex-math>
<mml:math id="M22">
<mml:mrow>
<mml:msub>
<mml:mi>w</mml:mi>
<mml:mi mathvariant="italic">π</mml:mi>
</mml:msub>
<mml:mo>=</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mo stretchy="false">|</mml:mo>
<mml:mi mathvariant="italic">π</mml:mi>
<mml:mo stretchy="false">|</mml:mo>
</mml:mrow>
<mml:mn>1</mml:mn>
</mml:msub>
</mml:mrow>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq11.gif"></inline-graphic>
</alternatives>
</inline-formula>
): the weight is frequently held constant or bounded below by a minimal value, because it is related to the
<italic>selectivity</italic>
of the seed. The
<italic>span</italic>
or
<italic>length</italic>
of a seed
<inline-formula id="IEq12">
<alternatives>
<tex-math id="M23">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\pi$$\end{document}</tex-math>
<mml:math id="M24">
<mml:mi mathvariant="italic">π</mml:mi>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq12.gif"></inline-graphic>
</alternatives>
</inline-formula>
(denoted by
<inline-formula id="IEq13">
<alternatives>
<tex-math id="M25">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$s_\pi$$\end{document}</tex-math>
<mml:math id="M26">
<mml:msub>
<mml:mi>s</mml:mi>
<mml:mi mathvariant="italic">π</mml:mi>
</mml:msub>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq13.gif"></inline-graphic>
</alternatives>
</inline-formula>
) is its full length (
<inline-formula id="IEq14">
<alternatives>
<tex-math id="M27">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$s_\pi = |\pi |$$\end{document}</tex-math>
<mml:math id="M28">
<mml:mrow>
<mml:msub>
<mml:mi>s</mml:mi>
<mml:mi mathvariant="italic">π</mml:mi>
</mml:msub>
<mml:mo>=</mml:mo>
<mml:mrow>
<mml:mo stretchy="false">|</mml:mo>
<mml:mi mathvariant="italic">π</mml:mi>
<mml:mo stretchy="false">|</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq14.gif"></inline-graphic>
</alternatives>
</inline-formula>
). We will also frequently use
<inline-formula id="IEq15">
<alternatives>
<tex-math id="M29">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\ell$$\end{document}</tex-math>
<mml:math id="M30">
<mml:mi>ℓ</mml:mi>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq15.gif"></inline-graphic>
</alternatives>
</inline-formula>
for the length of the alignment (
<inline-formula id="IEq16">
<alternatives>
<tex-math id="M31">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\ell =|x|$$\end{document}</tex-math>
<mml:math id="M32">
<mml:mrow>
<mml:mi>ℓ</mml:mi>
<mml:mo>=</mml:mo>
<mml:mo stretchy="false">|</mml:mo>
<mml:mi>x</mml:mi>
<mml:mo stretchy="false">|</mml:mo>
</mml:mrow>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq16.gif"></inline-graphic>
</alternatives>
</inline-formula>
).</p>
<p>The spaced seed
<inline-formula id="IEq17">
<alternatives>
<tex-math id="M33">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\pi$$\end{document}</tex-math>
<mml:math id="M34">
<mml:mi mathvariant="italic">π</mml:mi>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq17.gif"></inline-graphic>
</alternatives>
</inline-formula>
<italic>hits</italic>
at position
<italic>i</italic>
of the alignment
<italic>x</italic>
where
<inline-formula id="IEq18">
<alternatives>
<tex-math id="M35">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$i \in \big [1\ldots\,|x|-|\pi |+1\big ] = \big [1\ldots\,\ell -s_\pi +1\big ]$$\end{document}</tex-math>
<mml:math id="M36">
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>∈</mml:mo>
<mml:mrow>
<mml:mrow>
<mml:mo maxsize="1.2em" minsize="1.2em" stretchy="true">[</mml:mo>
</mml:mrow>
<mml:mn>1</mml:mn>
<mml:mo>…</mml:mo>
<mml:mspace width="0.166667em"></mml:mspace>
<mml:mo stretchy="false">|</mml:mo>
<mml:mi>x</mml:mi>
<mml:mo stretchy="false">|</mml:mo>
</mml:mrow>
<mml:mo>-</mml:mo>
<mml:mrow>
<mml:mo stretchy="false">|</mml:mo>
<mml:mi mathvariant="italic">π</mml:mi>
<mml:mo stretchy="false">|</mml:mo>
</mml:mrow>
<mml:mo>+</mml:mo>
<mml:mn>1</mml:mn>
<mml:mrow>
<mml:mo maxsize="1.2em" minsize="1.2em" stretchy="true">]</mml:mo>
</mml:mrow>
<mml:mo>=</mml:mo>
<mml:mrow>
<mml:mo maxsize="1.2em" minsize="1.2em" stretchy="true">[</mml:mo>
</mml:mrow>
<mml:mn>1</mml:mn>
<mml:mo>…</mml:mo>
<mml:mspace width="0.166667em"></mml:mspace>
<mml:mi>ℓ</mml:mi>
<mml:mo>-</mml:mo>
<mml:msub>
<mml:mi>s</mml:mi>
<mml:mi mathvariant="italic">π</mml:mi>
</mml:msub>
<mml:mo>+</mml:mo>
<mml:mn>1</mml:mn>
<mml:mrow>
<mml:mo maxsize="1.2em" minsize="1.2em" stretchy="true">]</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq18.gif"></inline-graphic>
</alternatives>
</inline-formula>
iff
<disp-formula id="Equ3">
<alternatives>
<tex-math id="M37">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\begin{aligned} \forall j \in \big [1\ldots\,s_\pi \big ] \qquad \pi [j] = \text {1}\implies x[j+i-1] = \texttt {1} \end{aligned}$$\end{document}</tex-math>
<mml:math id="M38" display="block">
<mml:mrow>
<mml:mtable columnspacing="0.5ex">
<mml:mtr>
<mml:mtd columnalign="right">
<mml:mrow>
<mml:mo>∀</mml:mo>
<mml:mi>j</mml:mi>
<mml:mo>∈</mml:mo>
<mml:mrow>
<mml:mo maxsize="1.2em" minsize="1.2em" stretchy="true">[</mml:mo>
</mml:mrow>
<mml:mn>1</mml:mn>
<mml:mo>…</mml:mo>
<mml:mspace width="0.166667em"></mml:mspace>
<mml:msub>
<mml:mi>s</mml:mi>
<mml:mi mathvariant="italic">π</mml:mi>
</mml:msub>
<mml:mrow>
<mml:mo maxsize="1.2em" minsize="1.2em" stretchy="true">]</mml:mo>
</mml:mrow>
<mml:mspace width="2em"></mml:mspace>
<mml:mi mathvariant="italic">π</mml:mi>
<mml:mrow>
<mml:mo stretchy="false">[</mml:mo>
<mml:mi>j</mml:mi>
<mml:mo stretchy="false">]</mml:mo>
</mml:mrow>
<mml:mo>=</mml:mo>
<mml:mtext>1</mml:mtext>
<mml:mo stretchy="false">⟹</mml:mo>
<mml:mi>x</mml:mi>
<mml:mrow>
<mml:mo stretchy="false">[</mml:mo>
<mml:mi>j</mml:mi>
<mml:mo>+</mml:mo>
<mml:mi>i</mml:mi>
<mml:mo>-</mml:mo>
<mml:mn>1</mml:mn>
<mml:mo stretchy="false">]</mml:mo>
</mml:mrow>
<mml:mo>=</mml:mo>
<mml:mn mathvariant="monospace">1</mml:mn>
</mml:mrow>
</mml:mtd>
</mml:mtr>
</mml:mtable>
</mml:mrow>
</mml:math>
<graphic xlink:href="13015_2017_92_Article_Equ3.gif" position="anchor"></graphic>
</alternatives>
</disp-formula>
For example, the seed
<inline-formula id="IEq19">
<alternatives>
<tex-math id="M39">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\pi = \text {1101}$$\end{document}</tex-math>
<mml:math id="M40">
<mml:mrow>
<mml:mi mathvariant="italic">π</mml:mi>
<mml:mo>=</mml:mo>
<mml:mtext>1101</mml:mtext>
</mml:mrow>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq19.gif"></inline-graphic>
</alternatives>
</inline-formula>
hits the alignment
<inline-formula id="IEq20">
<alternatives>
<tex-math id="M41">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$x = \texttt {111010101111}$$\end{document}</tex-math>
<mml:math id="M42">
<mml:mrow>
<mml:mi>x</mml:mi>
<mml:mo>=</mml:mo>
<mml:mn mathvariant="monospace">111010101111</mml:mn>
</mml:mrow>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq20.gif"></inline-graphic>
</alternatives>
</inline-formula>
twice, at positions 2 and 9.
<graphic position="anchor" xlink:href="13015_2017_92_Figa_HTML" id="MO20"></graphic>
</p>
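<p>As an illustration only, the definitions of weight, span and hit can be sketched in Python as follows. The function names (weight, span, hit_positions) are assumptions introduced for this sketch and do not come from any tool mentioned in the text; seeds and alignments are passed as plain strings over the symbols 0 and 1, and positions are reported starting from 1, as in the notation used here.</p>
<preformat preformat-type="code">
```python
def weight(seed: str) -> int:
    """Number of must-match symbols '1' in the seed."""
    return seed.count("1")


def span(seed: str) -> int:
    """Full length of the seed, don't-care symbols included."""
    return len(seed)


def hit_positions(seed: str, alignment: str) -> list[int]:
    """Return the 1-based positions i at which the seed hits the alignment."""
    s = span(seed)
    positions = []
    # i ranges over [1 .. l - s + 1]; the range is empty if the seed is longer than x.
    for i in range(1, len(alignment) - s + 2):
        window = alignment[i - 1 : i - 1 + s]
        # Every must-match position of the seed has to face a match ('1');
        # don't-care positions ('0' in the seed) are skipped.
        if all(x == "1" for p, x in zip(seed, window) if p == "1"):
            positions.append(i)
    return positions


# Example taken from the text: seed 1101 on alignment 111010101111.
print(hit_positions("1101", "111010101111"))  # [2, 9]
```
</preformat>
<p>On the example of the text, the sketch reports the two hits at positions 2 and 9; for the seed 11110110111 of the title, weight returns 9 and span returns 11.</p>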
<p>Naturally, the shape of the seed, i.e. the possible placement of a set of
<italic>don’t-care</italic>
symbols between any consecutive pair of the
<italic>w</italic>
<italic>must match</italic>
symbols, plays a significant role and must be carefully controlled. Requiring
<italic>at least one hit</italic>
for a seed, on an alignment
<italic>x</italic>
, is the most common (but not unique) way to select a
<italic>good seed</italic>
.</p>
<p>However, depending on the context and the problem being solved, even measuring this simple feature can easily take one of the two (previously briefly mentioned) forms:
<list list-type="alpha-lower">
<list-item>
<p>When considering that any alignment
<italic>x</italic>
is of given length
<inline-formula id="IEq32">
<alternatives>
<tex-math id="M43">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\ell$$\end{document}</tex-math>
<mml:math id="M44">
<mml:mi></mml:mi>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq32.gif"></inline-graphic>
</alternatives>
</inline-formula>
, and each symbol is generated by a Bernoulli model (so there is no restriction on the number of match or mismatch symbols an alignment must contain, but with some configurations more probable than others), the problem is to select a
<italic>good seed</italic>
(respectively the
<italic>best seed</italic>
) as the one that has a
<italic>high probability</italic>
(respectively the
<italic>best probability</italic>
) to hit at least once.</p>
</list-item>
<list-item>
<p>When considering that any alignment
<italic>x</italic>
is of given length
<inline-formula id="IEq33">
<alternatives>
<tex-math id="M45">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\ell$$\end{document}</tex-math>
<mml:math id="M46">
<mml:mi></mml:mi>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq33.gif"></inline-graphic>
</alternatives>
</inline-formula>
, and contains at most
<italic>k</italic>
mismatch symbols, a classical requirement for a
<italic>good seed</italic>
is to guarantee that
<italic>all the possible alignments</italic>
, obtained by any placement of
<italic>k</italic>
mismatch symbols on the
<inline-formula id="IEq34">
<alternatives>
<tex-math id="M47">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\ell$$\end{document}</tex-math>
<mml:math id="M48">
<mml:mi></mml:mi>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq34.gif"></inline-graphic>
</alternatives>
</inline-formula>
alignment symbols, will
<italic>all</italic>
be detected by at least one seed hit each: when this distinctive feature occurs, the seed is considered
<italic>lossless</italic>
or
<inline-formula id="IEq35">
<alternatives>
<tex-math id="M49">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$(\ell ,k)$$\end{document}</tex-math>
<mml:math id="M50">
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>ℓ</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>k</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq35.gif"></inline-graphic>
</alternatives>
</inline-formula>
<italic>-lossless</italic>
.</p>
</list-item>
</list>
<fig id="Fig1">
<label>Fig. 1</label>
<caption>
<p>Spaced seed DFA. We represent the
<italic>at least one hit</italic>
DFA for the spaced seed
<inline-formula id="IEq36">
<alternatives>
<tex-math id="M51">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\pi = \text {1101}$$\end{document}</tex-math>
<mml:math id="M52">
<mml:mrow>
<mml:mi mathvariant="italic">π</mml:mi>
<mml:mo>=</mml:mo>
<mml:mtext>1101</mml:mtext>
</mml:mrow>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq36.gif"></inline-graphic>
</alternatives>
</inline-formula>
. This automaton recognizes any alignment sequence with at least one occurrence of
<inline-formula id="IEq37">
<alternatives>
<tex-math id="M53">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\texttt {1101}$$\end{document}</tex-math>
<mml:math id="M54">
<mml:mn mathvariant="monospace">1101</mml:mn>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq37.gif"></inline-graphic>
</alternatives>
</inline-formula>
or
<inline-formula id="IEq38">
<alternatives>
<tex-math id="M55">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\texttt {1111}$$\end{document}</tex-math>
<mml:math id="M56">
<mml:mn mathvariant="monospace">1111</mml:mn>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq38.gif"></inline-graphic>
</alternatives>
</inline-formula>
</p>
</caption>
<graphic xlink:href="13015_2017_92_Fig1_HTML" id="MO2"></graphic>
</fig>
</p>
<p>The two problems can be solved by first considering the language recognized by the seed
<inline-formula id="IEq39">
<alternatives>
<tex-math id="M57">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\pi$$\end{document}</tex-math>
<mml:math id="M58">
<mml:mi mathvariant="italic">π</mml:mi>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq39.gif"></inline-graphic>
</alternatives>
</inline-formula>
, in this context the
<italic>at least one hit</italic>
  regular language, and its associated DFA. As an illustration, Fig.  
<xref rid="Fig1" ref-type="fig">1</xref>
displays the
<italic>at least one hit</italic>
  DFA for the spaced seed
<inline-formula id="IEq40">
<alternatives>
<tex-math id="M59">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\text {1101}$$\end{document}</tex-math>
<mml:math id="M60">
<mml:mtext>1101</mml:mtext>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq40.gif"></inline-graphic>
</alternatives>
</inline-formula>
: this automaton recognizes the associated regular language
<inline-formula id="IEq41">
<alternatives>
<tex-math id="M61">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\{\texttt {0},\texttt {1}\}^{*} ( \texttt {1101} | \texttt {1111}) \{\texttt {0},\texttt {1}\}^{*}$$\end{document}</tex-math>
<mml:math id="M62">
<mml:mrow>
<mml:msup>
<mml:mrow>
<mml:mo stretchy="false">{</mml:mo>
<mml:mn mathvariant="monospace">0</mml:mn>
<mml:mo>,</mml:mo>
<mml:mn mathvariant="monospace">1</mml:mn>
<mml:mo stretchy="false">}</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:mrow></mml:mrow>
<mml:mo>∗</mml:mo>
</mml:mrow>
</mml:msup>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mn mathvariant="monospace">1101</mml:mn>
<mml:mo stretchy="false">|</mml:mo>
<mml:mn mathvariant="monospace">1111</mml:mn>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
<mml:msup>
<mml:mrow>
<mml:mo stretchy="false">{</mml:mo>
<mml:mn mathvariant="monospace">0</mml:mn>
<mml:mo>,</mml:mo>
<mml:mn mathvariant="monospace">1</mml:mn>
<mml:mo stretchy="false">}</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:mrow></mml:mrow>
<mml:mo>∗</mml:mo>
</mml:mrow>
</mml:msup>
</mml:mrow>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq41.gif"></inline-graphic>
</alternatives>
</inline-formula>
, or less formally, any binary alignment sequence
<italic>x</italic>
that has
<italic>at least one</italic>
occurrence of
<inline-formula id="IEq42">
<alternatives>
<tex-math id="M63">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\texttt {1101}$$\end{document}</tex-math>
<mml:math id="M64">
<mml:mn mathvariant="monospace">1101</mml:mn>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq42.gif"></inline-graphic>
</alternatives>
</inline-formula>
or
<inline-formula id="IEq43">
<alternatives>
<tex-math id="M65">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\texttt {1111}$$\end{document}</tex-math>
<mml:math id="M66">
<mml:mn mathvariant="monospace">1111</mml:mn>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq43.gif"></inline-graphic>
</alternatives>
</inline-formula>
as a factor.</p>
<p>The second step consists in computing, with a simple dynamic programming (DP) procedure defined for every state of the DFA and for each step
<inline-formula id="IEq44">
<alternatives>
<tex-math id="M67">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$i \in \big [1\ldots\,\ell \big ]$$\end{document}</tex-math>
<mml:math id="M68">
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>∈</mml:mo>
<mml:mrow>
<mml:mo maxsize="1.2em" minsize="1.2em" stretchy="true">[</mml:mo>
</mml:mrow>
<mml:mn>1</mml:mn>
<mml:mo>…</mml:mo>
<mml:mspace width="0.166667em"></mml:mspace>
<mml:mi>ℓ</mml:mi>
<mml:mrow>
<mml:mo maxsize="1.2em" minsize="1.2em" stretchy="true">]</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq44.gif"></inline-graphic>
</alternatives>
</inline-formula>
,
<list list-type="alpha-lower">
<list-item>
<p>Either the probability of reaching each of the automaton states.</p>
</list-item>
<list-item>
<p>Otherwise, the minimal number of mismatch symbols 0 that have been crossed to reach each state.</p>
</list-item>
</list>
For example, considering the probability problem (a) on a Bernoulli model where a
<italic>match</italic>
has a probability
<italic>p</italic>
set to 0.7, the computation proceeds in two stages. First, the transition symbols 0 and 1 of the automaton of Fig. 
<xref rid="Fig1" ref-type="fig">1</xref>
are
<italic>“replaced”</italic>
by their respective probabilities 0.3 and 0.7. Then, on each step
<italic>i</italic>
, one computes the probability
<inline-formula id="IEq45">
<alternatives>
<tex-math id="M69">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\mathscr {P}(i,q)$$\end{document}</tex-math>
<mml:math id="M70">
<mml:mrow>
<mml:mi mathvariant="script">P</mml:mi>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>i</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>q</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq45.gif"></inline-graphic>
</alternatives>
</inline-formula>
to reach each of the states
<italic>q</italic>
by applying a recursive formula that uses the probability to be at any of its preceding states on step
<inline-formula id="IEq46">
<alternatives>
<tex-math id="M71">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$i-1$$\end{document}</tex-math>
<mml:math id="M72">
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>-</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq46.gif"></inline-graphic>
</alternatives>
</inline-formula>
. For the automaton of Fig.  
<xref rid="Fig1" ref-type="fig">1</xref>
, this gives
<inline-graphic xlink:href="13015_2017_92_Figb_HTML.gif" id="d29e1436"></inline-graphic>
. On step
<inline-formula id="IEq108">
<alternatives>
<tex-math id="M73">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$i=4$$\end{document}</tex-math>
<mml:math id="M74">
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>=</mml:mo>
<mml:mn>4</mml:mn>
</mml:mrow>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq108.gif"></inline-graphic>
</alternatives>
</inline-formula>
, the probability to reach the final state
<inline-formula id="IEq109">
<alternatives>
<tex-math id="M75">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$q_6$$\end{document}</tex-math>
<mml:math id="M76">
<mml:msub>
<mml:mi>q</mml:mi>
<mml:mn>6</mml:mn>
</mml:msub>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq109.gif"></inline-graphic>
</alternatives>
</inline-formula>
evaluates to
<inline-formula id="IEq110">
<alternatives>
<tex-math id="M77">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\mathscr {P}(4,q_6)= 0.343$$\end{document}</tex-math>
<mml:math id="M78">
<mml:mrow>
<mml:mi mathvariant="script">P</mml:mi>
<mml:mo stretchy="false">(</mml:mo>
<mml:mn>4</mml:mn>
<mml:mo>,</mml:mo>
<mml:msub>
<mml:mi>q</mml:mi>
<mml:mn>6</mml:mn>
</mml:msub>
<mml:mo stretchy="false">)</mml:mo>
<mml:mo>=</mml:mo>
<mml:mn>0.343</mml:mn>
</mml:mrow>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq110.gif"></inline-graphic>
</alternatives>
</inline-formula>
(
<inline-formula id="IEq111">
<alternatives>
<tex-math id="M79">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$0.7^3$$\end{document}</tex-math>
<mml:math id="M80">
<mml:mrow>
<mml:msup>
<mml:mn>0.7</mml:mn>
<mml:mn>3</mml:mn>
</mml:msup>
</mml:mrow>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq111.gif"></inline-graphic>
</alternatives>
</inline-formula>
), which is the expected (and first non-zero) probability for the seed
<inline-formula id="IEq112">
<alternatives>
<tex-math id="M81">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\pi = \text {1101}$$\end{document}</tex-math>
<mml:math id="M82">
<mml:mrow>
<mml:mi mathvariant="italic">π</mml:mi>
<mml:mo>=</mml:mo>
<mml:mtext>1101</mml:mtext>
</mml:mrow>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq112.gif"></inline-graphic>
</alternatives>
</inline-formula>
to detect alignments of length
<inline-formula id="IEq113">
<alternatives>
<tex-math id="M83">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\ell =4$$\end{document}</tex-math>
<mml:math id="M84">
<mml:mrow>
<mml:mi>ℓ</mml:mi>
<mml:mo>=</mml:mo>
<mml:mn>4</mml:mn>
</mml:mrow>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq113.gif"></inline-graphic>
</alternatives>
</inline-formula>
; on step
<inline-formula id="IEq114">
<alternatives>
<tex-math id="M85">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$i=5$$\end{document}</tex-math>
<mml:math id="M86">
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>=</mml:mo>
<mml:mn>5</mml:mn>
</mml:mrow>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq114.gif"></inline-graphic>
</alternatives>
</inline-formula>
, the probability to reach
<inline-formula id="IEq115">
<alternatives>
<tex-math id="M87">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$q_6$$\end{document}</tex-math>
<mml:math id="M88">
<mml:msub>
<mml:mi>q</mml:mi>
<mml:mn>6</mml:mn>
</mml:msub>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq115.gif"></inline-graphic>
</alternatives>
</inline-formula>
evaluates to
<inline-formula id="IEq116">
<alternatives>
<tex-math id="M89">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\mathscr {P}(5,q_6) = 0.51793$$\end{document}</tex-math>
<mml:math id="M90">
<mml:mrow>
<mml:mi mathvariant="script">P</mml:mi>
<mml:mo stretchy="false">(</mml:mo>
<mml:mn>5</mml:mn>
<mml:mo>,</mml:mo>
<mml:msub>
<mml:mi>q</mml:mi>
<mml:mn>6</mml:mn>
</mml:msub>
<mml:mo stretchy="false">)</mml:mo>
<mml:mo>=</mml:mo>
<mml:mn>0.51793</mml:mn>
</mml:mrow>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq116.gif"></inline-graphic>
</alternatives>
</inline-formula>
(
<inline-formula id="IEq117">
<alternatives>
<tex-math id="M91">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$0.7^3 \times (1 + 0.3 + 0.7 \times 0.3)$$\end{document}</tex-math>
<mml:math id="M92">
<mml:mrow>
<mml:msup>
<mml:mn>0.7</mml:mn>
<mml:mn>3</mml:mn>
</mml:msup>
<mml:mo>×</mml:mo>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mn>1</mml:mn>
<mml:mo>+</mml:mo>
<mml:mn>0.3</mml:mn>
<mml:mo>+</mml:mo>
<mml:mn>0.7</mml:mn>
<mml:mo>×</mml:mo>
<mml:mn>0.3</mml:mn>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq117.gif"></inline-graphic>
</alternatives>
</inline-formula>
) to detect alignments of length
<inline-formula id="IEq118">
<alternatives>
<tex-math id="M93">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\ell =5$$\end{document}</tex-math>
<mml:math id="M94">
<mml:mrow>
<mml:mi>ℓ</mml:mi>
<mml:mo>=</mml:mo>
<mml:mn>5</mml:mn>
</mml:mrow>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq118.gif"></inline-graphic>
</alternatives>
</inline-formula>
.</p>
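<p>The recursion above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the implementation used for this work: the names <monospace>step</monospace> and <monospace>hit_probability</monospace> are hypothetical, and the automaton is a simple window-based one that recognizes the same at least one hit language as the DFA of Fig. <xref rid="Fig1" ref-type="fig">1</xref> without being minimal.</p>
<preformat><![CDATA[
```python
from collections import defaultdict

HIT = "HIT"  # absorbing final state: the seed has hit at least once


def seed_hits(seed, window):
    """True if every must-match position ('1') of the seed faces a match ('1')."""
    return all(w == "1" for s, w in zip(seed, window) if s == "1")


def step(seed, state, symbol):
    """Transition of a window-based 'at least one hit' automaton (assumed, not minimal).

    A non-final state is the string of the last len(seed) - 1 alignment symbols.
    """
    if state == HIT:
        return HIT
    window = state + symbol
    if len(window) == len(seed) and seed_hits(seed, window):
        return HIT
    # keep only the last len(seed) - 1 symbols for the next step
    return window[-(len(seed) - 1):] if len(seed) > 1 else ""


def hit_probability(seed, length, p):
    """Probability of at least one hit on a Bernoulli alignment (match probability p)."""
    probs = {"": 1.0}  # step 0: all the mass is on the initial state
    for _ in range(length):
        nxt = defaultdict(float)
        for state, mass in probs.items():
            nxt[step(seed, state, "1")] += mass * p          # read a match
            nxt[step(seed, state, "0")] += mass * (1.0 - p)  # read a mismatch
        probs = nxt
    return probs.get(HIT, 0.0)
```
]]></preformat>
<p>Under these assumptions, <monospace>hit_probability("1101", 4, 0.7)</monospace> and <monospace>hit_probability("1101", 5, 0.7)</monospace> should reproduce, up to floating-point rounding, the values 0.343 and 0.51793 given above.</p>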
<p>As another example, consider the lossless property (b) for the spaced seed
<inline-formula id="IEq119">
<alternatives>
<tex-math id="M95">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\pi = \text {1101}$$\end{document}</tex-math>
<mml:math id="M96">
<mml:mrow>
<mml:mi mathvariant="italic">π</mml:mi>
<mml:mo>=</mml:mo>
<mml:mtext>1101</mml:mtext>
</mml:mrow>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq119.gif"></inline-graphic>
</alternatives>
</inline-formula>
: we can show that this seed is lossless for one single mismatch, when
<inline-formula id="IEq120">
<alternatives>
<tex-math id="M97">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\ell \ge 6$$\end{document}</tex-math>
<mml:math id="M98">
<mml:mrow>
<mml:mi>ℓ</mml:mi>
<mml:mo>≥</mml:mo>
<mml:mn>6</mml:mn>
</mml:mrow>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq120.gif"></inline-graphic>
</alternatives>
</inline-formula>
(but computational details are left to the reader, after a remark on
<italic>tropical semi-rings</italic>
in the next paragraph): the seed is thus
<inline-formula id="IEq121">
<alternatives>
<tex-math id="M99">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$(\ell =6,k=1)$$\end{document}</tex-math>
<mml:math id="M100">
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>ℓ</mml:mi>
<mml:mo>=</mml:mo>
<mml:mn>6</mml:mn>
<mml:mo>,</mml:mo>
<mml:mi>k</mml:mi>
<mml:mo>=</mml:mo>
<mml:mn>1</mml:mn>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq121.gif"></inline-graphic>
</alternatives>
</inline-formula>
-lossless; however, this seed is not
<inline-formula id="IEq122">
<alternatives>
<tex-math id="M101">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$(\ell =5,k=1)$$\end{document}</tex-math>
<mml:math id="M102">
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>ℓ</mml:mi>
<mml:mo>=</mml:mo>
<mml:mn>5</mml:mn>
<mml:mo>,</mml:mo>
<mml:mi>k</mml:mi>
<mml:mo>=</mml:mo>
<mml:mn>1</mml:mn>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq122.gif"></inline-graphic>
</alternatives>
</inline-formula>
-lossless, since reading the single-mismatch sequence
<inline-formula id="IEq123">
<alternatives>
<tex-math id="M103">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\texttt {10111}$$\end{document}</tex-math>
<mml:math id="M104">
<mml:mn mathvariant="monospace">10111</mml:mn>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq123.gif"></inline-graphic>
</alternatives>
</inline-formula>
leads to a non-final state.</p>
<p>Finally, we simply mention that this second computational step involves the implicit use of
<italic>semi-rings</italic>
,
<list list-type="alpha-lower">
<list-item>
<p>Either
<italic>probability semi-rings</italic>
:
<inline-formula id="IEq124">
<alternatives>
<tex-math id="M105">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$(E = \mathbb {R}_{0 \le r \le 1},\; \oplus = +,\; \otimes = \;\times \;,\; 0_{\oplus ,\epsilon _\otimes } = 0,\; 1_{\otimes } = 1)$$\end{document}</tex-math>
<mml:math id="M106">
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>E</mml:mi>
<mml:mo>=</mml:mo>
<mml:msub>
<mml:mi mathvariant="double-struck">R</mml:mi>
<mml:mrow>
<mml:mn>0</mml:mn>
<mml:mo>≤</mml:mo>
<mml:mi>r</mml:mi>
<mml:mo>≤</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:msub>
<mml:mo>,</mml:mo>
<mml:mspace width="0.277778em"></mml:mspace>
<mml:mo>⊕</mml:mo>
<mml:mo>=</mml:mo>
<mml:mo>+</mml:mo>
<mml:mo>,</mml:mo>
<mml:mspace width="0.277778em"></mml:mspace>
<mml:mo>⊗</mml:mo>
<mml:mo>=</mml:mo>
<mml:mspace width="0.277778em"></mml:mspace>
<mml:mo>×</mml:mo>
<mml:mspace width="0.277778em"></mml:mspace>
<mml:mo>,</mml:mo>
<mml:mspace width="0.277778em"></mml:mspace>
<mml:msub>
<mml:mn>0</mml:mn>
<mml:mrow>
<mml:mo>⊕</mml:mo>
<mml:mo>,</mml:mo>
<mml:msub>
<mml:mi mathvariant="italic">ϵ</mml:mi>
<mml:mo>⊗</mml:mo>
</mml:msub>
</mml:mrow>
</mml:msub>
<mml:mo>=</mml:mo>
<mml:mn>0</mml:mn>
<mml:mo>,</mml:mo>
<mml:mspace width="0.277778em"></mml:mspace>
<mml:msub>
<mml:mn>1</mml:mn>
<mml:mo>⊗</mml:mo>
</mml:msub>
<mml:mo>=</mml:mo>
<mml:mn>1</mml:mn>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq124.gif"></inline-graphic>
</alternatives>
</inline-formula>
; the final state(s) of the DFA give(s) the probability of having
<italic>at least one hit</italic>
after
<inline-formula id="IEq125">
<alternatives>
<tex-math id="M107">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\ell$$\end{document}</tex-math>
<mml:math id="M108">
<mml:mi>ℓ</mml:mi>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq125.gif"></inline-graphic>
</alternatives>
</inline-formula>
steps of the DP algorithm,</p>
</list-item>
<list-item>
<p>Otherwise
<italic>tropical semi-rings</italic>
:
<inline-formula id="IEq126">
<alternatives>
<tex-math id="M109">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$(E = \mathbb {R}_{\ge 0},\; \oplus = \min ,\; \otimes = +,\; 0_{\oplus ,\epsilon _\otimes } = \infty ,\; 1_{\otimes } = 0)$$\end{document}</tex-math>
<mml:math id="M110">
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>E</mml:mi>
<mml:mo>=</mml:mo>
<mml:msub>
<mml:mi mathvariant="double-struck">R</mml:mi>
<mml:mrow>
<mml:mo>≥</mml:mo>
<mml:mn>0</mml:mn>
</mml:mrow>
</mml:msub>
<mml:mo>,</mml:mo>
<mml:mspace width="0.277778em"></mml:mspace>
<mml:mo>⊕</mml:mo>
<mml:mo>=</mml:mo>
<mml:mo>min</mml:mo>
<mml:mo>,</mml:mo>
<mml:mspace width="0.277778em"></mml:mspace>
<mml:mo>⊗</mml:mo>
<mml:mo>=</mml:mo>
<mml:mo>+</mml:mo>
<mml:mo>,</mml:mo>
<mml:mspace width="0.277778em"></mml:mspace>
<mml:msub>
<mml:mn>0</mml:mn>
<mml:mrow>
<mml:mo>⊕</mml:mo>
<mml:mo>,</mml:mo>
<mml:msub>
<mml:mi mathvariant="italic">ϵ</mml:mi>
<mml:mo>⊗</mml:mo>
</mml:msub>
</mml:mrow>
</mml:msub>
<mml:mo>=</mml:mo>
<mml:mi>∞</mml:mi>
<mml:mo>,</mml:mo>
<mml:mspace width="0.277778em"></mml:mspace>
<mml:msub>
<mml:mn>1</mml:mn>
<mml:mo>⊗</mml:mo>
</mml:msub>
<mml:mo>=</mml:mo>
<mml:mn>0</mml:mn>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq126.gif"></inline-graphic>
</alternatives>
</inline-formula>
. The seed is
<inline-formula id="IEq127">
<alternatives>
<tex-math id="M111">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$(\ell ,k)$$\end{document}</tex-math>
<mml:math id="M112">
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>ℓ</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>k</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq127.gif"></inline-graphic>
</alternatives>
</inline-formula>
<italic>-lossless</italic>
iff all the non-final states of the DFA have a minimal number of mismatches that is strictly greater than
<italic>k</italic>
, after
<inline-formula id="IEq128">
<alternatives>
<tex-math id="M113">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\ell$$\end{document}</tex-math>
<mml:math id="M114">
<mml:mi>ℓ</mml:mi>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq128.gif"></inline-graphic>
</alternatives>
</inline-formula>
steps of the DP algorithm.
<xref ref-type="fn" rid="Fn2">2</xref>
</p>
</list-item>
</list>
</p>
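<p>As a hedged sketch of how both semi-rings can share a single DP procedure, the following Python code parameterizes the recursion by the two operations ⊕ and ⊗ and by the weights given to the symbols 0 and 1. All names (<monospace>Semiring</monospace>, <monospace>run_dp</monospace>, <monospace>is_lossless</monospace>) are illustrative assumptions, and the automaton is a non-minimal window-based one that recognizes the at least one hit language, not the DFA of Fig. <xref rid="Fig1" ref-type="fig">1</xref> itself.</p>
<preformat><![CDATA[
```python
import math
from typing import Callable, NamedTuple


class Semiring(NamedTuple):
    plus: Callable   # the "oplus" operation, merges paths reaching the same state
    times: Callable  # the "otimes" operation, extends a path by one transition
    zero: object     # neutral element of plus
    one: object      # neutral element of times


PROBABILITY = Semiring(lambda a, b: a + b, lambda a, b: a * b, 0.0, 1.0)
TROPICAL = Semiring(min, lambda a, b: a + b, math.inf, 0)

HIT = "HIT"  # absorbing final state: the seed has hit at least once


def step(seed, state, symbol):
    """Window-based 'at least one hit' transition (assumed, not minimal)."""
    if state == HIT:
        return HIT
    window = state + symbol
    if len(window) == len(seed) and all(
        w == "1" for s, w in zip(seed, window) if s == "1"
    ):
        return HIT
    return window[-(len(seed) - 1):] if len(seed) > 1 else ""


def run_dp(seed, length, semiring, weight):
    """Value of every reachable state after `length` steps, in the given semi-ring."""
    values = {"": semiring.one}
    for _ in range(length):
        nxt = {}
        for state, value in values.items():
            for symbol in "01":
                target = step(seed, state, symbol)
                contrib = semiring.times(value, weight[symbol])
                nxt[target] = semiring.plus(nxt.get(target, semiring.zero), contrib)
        values = nxt
    return values


def is_lossless(seed, length, k):
    """(length, k)-lossless iff every non-final state needs more than k mismatches."""
    values = run_dp(seed, length, TROPICAL, {"0": 1, "1": 0})
    return all(v > k for state, v in values.items() if state != HIT)
```
]]></preformat>
<p>With the tropical instance, <monospace>is_lossless("1101", 6, 1)</monospace> holds while <monospace>is_lossless("1101", 5, 1)</monospace> does not, in agreement with the example of the sequence 10111. Calling <monospace>run_dp</monospace> with <monospace>PROBABILITY</monospace> and the weights 0.3 and 0.7 gives the hit probability in the <monospace>HIT</monospace> entry.</p>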
</sec>
<sec id="Sec3">
<title>Semi-rings and number of alignments</title>
<p>Semi-rings are a flexible and powerful tool, employed for example to compute probabilities, scores, distances, counts (to name a few) in a generic dynamic programming framework [
<xref ref-type="bibr" rid="CR82">82</xref>
,
<xref ref-type="bibr" rid="CR83">83</xref>
]. The first issue, mentioned at the end of the previous section, is choosing the semi-ring suited to the question being addressed. Sometimes, selecting an alternative semi-ring to
<italic>count elements</italic>
turns out to be a flexible choice that solves more involved problems (for example,
<italic>computing probabilities</italic>
is one of them, and is described in the next section).
<fig id="Fig2">
<label>Fig. 2</label>
<caption>
<p>DFA intersection product. We represent the resulting intersection product of the
<italic>at least one hit</italic>
DFA for the seed
<inline-formula id="IEq132">
<alternatives>
<tex-math id="M115">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\pi = \text {101}$$\end{document}</tex-math>
<mml:math id="M116">
<mml:mrow>
<mml:mi mathvariant="italic">π</mml:mi>
<mml:mo>=</mml:mo>
<mml:mtext>101</mml:mtext>
</mml:mrow>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq132.gif"></inline-graphic>
</alternatives>
</inline-formula>
(
<italic>top horizontal</italic>
automaton), with the
<inline-formula id="IEq133">
<alternatives>
<tex-math id="M117">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\text {1}$$\end{document}</tex-math>
<mml:math id="M118">
<mml:mtext>1</mml:mtext>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq133.gif"></inline-graphic>
</alternatives>
</inline-formula>
-counting DFA (
<italic>left vertical</italic>
automaton). The
<italic>dashed</italic>
transitions represent an ellipsis in the construction between
<inline-formula id="IEq134">
<alternatives>
<tex-math id="M119">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$m=2$$\end{document}</tex-math>
<mml:math id="M120">
<mml:mrow>
<mml:mi>m</mml:mi>
<mml:mo>=</mml:mo>
<mml:mn>2</mml:mn>
</mml:mrow>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq134.gif"></inline-graphic>
</alternatives>
</inline-formula>
and
<inline-formula id="IEq135">
<alternatives>
<tex-math id="M121">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$m=\ell -1$$\end{document}</tex-math>
<mml:math id="M122">
<mml:mrow>
<mml:mi>m</mml:mi>
<mml:mo>=</mml:mo>
<mml:mi>ℓ</mml:mi>
<mml:mo>-</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq135.gif"></inline-graphic>
</alternatives>
</inline-formula>
, while the
<italic>dotted</italic>
transitions at the bottom of the resulting automaton make it complete</p>
</caption>
<graphic xlink:href="13015_2017_92_Fig2_HTML" id="MO3"></graphic>
</fig>
</p>
<p>Counting semi-rings [
<xref ref-type="bibr" rid="CR84">84</xref>
] are suited to this task: when applied to the
<italic>right language</italic>
and its
<italic>right automaton</italic>
, they report the number of alignments
<inline-formula id="IEq136">
<alternatives>
<tex-math id="M123">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$c_{\pi ,m}$$\end{document}</tex-math>
<mml:math id="M124">
<mml:msub>
<mml:mi>c</mml:mi>
<mml:mrow>
<mml:mi mathvariant="italic">π</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>m</mml:mi>
</mml:mrow>
</mml:msub>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq136.gif"></inline-graphic>
</alternatives>
</inline-formula>
that are detected by the seed
<inline-formula id="IEq137">
<alternatives>
<tex-math id="M125">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\pi$$\end{document}</tex-math>
<mml:math id="M126">
<mml:mi mathvariant="italic">π</mml:mi>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq137.gif"></inline-graphic>
</alternatives>
</inline-formula>
while having
<italic>m</italic>
matches out of
<inline-formula id="IEq138">
<alternatives>
<tex-math id="M127">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\ell$$\end{document}</tex-math>
<mml:math id="M128">
<mml:mi>ℓ</mml:mi>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq138.gif"></inline-graphic>
</alternatives>
</inline-formula>
alignment symbols. The main idea that enables the computation of these
<inline-formula id="IEq139">
<alternatives>
<tex-math id="M129">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$c_{\pi ,m}$$\end{document}</tex-math>
<mml:math id="M130">
<mml:msub>
<mml:mi>c</mml:mi>
<mml:mrow>
<mml:mi mathvariant="italic">π</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>m</mml:mi>
</mml:mrow>
</mml:msub>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq139.gif"></inline-graphic>
</alternatives>
</inline-formula>
counting coefficients (illustrated in Fig. 
<xref rid="Fig2" ref-type="fig">2</xref>
as the intersection product) is to first intersect the language recognized by the seed
<inline-formula id="IEq140">
<alternatives>
<tex-math id="M131">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\pi$$\end{document}</tex-math>
<mml:math id="M132">
<mml:mi mathvariant="italic">π</mml:mi>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq140.gif"></inline-graphic>
</alternatives>
</inline-formula>
(the
<italic>at least one hit</italic>
language of
<inline-formula id="IEq141">
<alternatives>
<tex-math id="M133">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\pi$$\end{document}</tex-math>
<mml:math id="M134">
<mml:mi mathvariant="italic">π</mml:mi>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq141.gif"></inline-graphic>
</alternatives>
</inline-formula>
) with the classes of alignments that have exactly
<italic>m</italic>
matches: the automaton associated with all of these classes of alignments with
<italic>m</italic>
matches has a very simple linear form with
<inline-formula id="IEq142">
<alternatives>
<tex-math id="M135">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\ell +1$$\end{document}</tex-math>
<mml:math id="M136">
<mml:mrow>
<mml:mi>ℓ</mml:mi>
<mml:mo>+</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq142.gif"></inline-graphic>
</alternatives>
</inline-formula>
states, where several distinct final states are defined according to all the possible values of
<inline-formula id="IEq143">
<alternatives>
<tex-math id="M137">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$m \in [0\ldots\,\ell ]$$\end{document}</tex-math>
<mml:math id="M138">
<mml:mrow>
<mml:mi>m</mml:mi>
<mml:mo>∈</mml:mo>
<mml:mo stretchy="false">[</mml:mo>
<mml:mn>0</mml:mn>
<mml:mo>…</mml:mo>
<mml:mspace width="0.166667em"></mml:mspace>
<mml:mi>ℓ</mml:mi>
<mml:mo stretchy="false">]</mml:mo>
</mml:mrow>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq143.gif"></inline-graphic>
</alternatives>
</inline-formula>
. Finally, since the intersection of two regular languages is regular [Theorem 4.8 of the timeless
<xref ref-type="bibr" rid="CR85">85</xref>
], this intersection can be represented by a conventional DFA that keeps several distinct final states.</p>
<p>As an illustration, Fig.  
<xref rid="Fig2" ref-type="fig">2</xref>
displays the
<italic>at least one hit</italic>
DFA for the spaced seed
<inline-formula id="IEq144">
<alternatives>
<tex-math id="M139">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\text {101}$$\end{document}</tex-math>
<mml:math id="M140">
<mml:mtext>101</mml:mtext>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq144.gif"></inline-graphic>
</alternatives>
</inline-formula>
(at the top), the linear
<inline-formula id="IEq145">
<alternatives>
<tex-math id="M141">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\text {1}$$\end{document}</tex-math>
<mml:math id="M142">
<mml:mtext>1</mml:mtext>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq145.gif"></inline-graphic>
</alternatives>
</inline-formula>
-counting DFA (on the vertical left part) to isolate alignments with exactly
<italic>m</italic>
matches, and finally their intersection product, which represents the intersection language as a new DFA (itself obtained by crossing
<italic>synchronously</italic>
the two previous DFAs). Note that each of the final states
<inline-formula id="IEq146">
<alternatives>
<tex-math id="M143">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$p_m \times q_5$$\end{document}</tex-math>
<mml:math id="M144">
<mml:mrow>
<mml:msub>
<mml:mi>p</mml:mi>
<mml:mi>m</mml:mi>
</mml:msub>
<mml:mo>×</mml:mo>
<mml:msub>
<mml:mi>q</mml:mi>
<mml:mn>5</mml:mn>
</mml:msub>
</mml:mrow>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq146.gif"></inline-graphic>
</alternatives>
</inline-formula>
(for
<inline-formula id="IEq147">
<alternatives>
<tex-math id="M145">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$m &lt; \ell$$\end{document}</tex-math>
<mml:math id="M146">
<mml:mrow>
<mml:mi>m</mml:mi>
<mml:mo>&lt;</mml:mo>
<mml:mi>ℓ</mml:mi>
</mml:mrow>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq147.gif"></inline-graphic>
</alternatives>
</inline-formula>
) of the resulting DFA is reached by alignment sequences with exactly
<italic>m</italic>
matches that are also detected by the seed
<inline-formula id="IEq148">
<alternatives>
<tex-math id="M147">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\text {101}$$\end{document}</tex-math>
<mml:math id="M148">
<mml:mtext>101</mml:mtext>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq148.gif"></inline-graphic>
</alternatives>
</inline-formula>
(except for the last state
<inline-formula id="IEq149">
<alternatives>
<tex-math id="M149">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$p_l \times q_5$$\end{document}</tex-math>
<mml:math id="M150">
<mml:mrow>
<mml:msub>
<mml:mi>p</mml:mi>
<mml:mi>l</mml:mi>
</mml:msub>
<mml:mo>×</mml:mo>
<mml:msub>
<mml:mi>q</mml:mi>
<mml:mn>5</mml:mn>
</mml:msub>
</mml:mrow>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq149.gif"></inline-graphic>
</alternatives>
</inline-formula>
, where
<inline-formula id="IEq150">
<alternatives>
<tex-math id="M151">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\ge \ell$$\end{document}</tex-math>
<mml:math id="M152">
<mml:mrow>
<mml:mo>≥</mml:mo>
<mml:mi>ℓ</mml:mi>
</mml:mrow>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq150.gif"></inline-graphic>
</alternatives>
</inline-formula>
matches may have been detected).</p>
<p>Then, starting with the empty word (counted once from the initial state
<inline-formula id="IEq151">
<alternatives>
<tex-math id="M153">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$p_0 \times q_1$$\end{document}</tex-math>
<mml:math id="M154">
<mml:mrow>
<mml:msub>
<mml:mi>p</mml:mi>
<mml:mn>0</mml:mn>
</mml:msub>
<mml:mo>×</mml:mo>
<mml:msub>
<mml:mi>q</mml:mi>
<mml:mn>1</mml:mn>
</mml:msub>
</mml:mrow>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq151.gif"></inline-graphic>
</alternatives>
</inline-formula>
), it is possible to count the words of size one (the two words 0 and 1 on a binary alphabet) by following the transitions from the initial state to
<inline-formula id="IEq152">
<alternatives>
<tex-math id="M155">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$p_0 \times q_1$$\end{document}</tex-math>
<mml:math id="M156">
<mml:mrow>
<mml:msub>
<mml:mi>p</mml:mi>
<mml:mn>0</mml:mn>
</mml:msub>
<mml:mo>×</mml:mo>
<mml:msub>
<mml:mi>q</mml:mi>
<mml:mn>1</mml:mn>
</mml:msub>
</mml:mrow>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq152.gif"></inline-graphic>
</alternatives>
</inline-formula>
and
<inline-formula id="IEq153">
<alternatives>
<tex-math id="M157">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$p_1 \times q_2$$\end{document}</tex-math>
<mml:math id="M158">
<mml:mrow>
<mml:msub>
<mml:mi>p</mml:mi>
<mml:mn>1</mml:mn>
</mml:msub>
<mml:mo>×</mml:mo>
<mml:msub>
<mml:mi>q</mml:mi>
<mml:mn>2</mml:mn>
</mml:msub>
</mml:mrow>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq153.gif"></inline-graphic>
</alternatives>
</inline-formula>
, respectively; from the (two) states already reached, it is then possible to count words of size two (four words on a binary alphabet), and so on, while keeping, for each DFA state
<inline-formula id="IEq154">
<alternatives>
<tex-math id="M159">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$p_m \times q_j$$\end{document}</tex-math>
<mml:math id="M160">
<mml:mrow>
<mml:msub>
<mml:mi>p</mml:mi>
<mml:mi>m</mml:mi>
</mml:msub>
<mml:mo>×</mml:mo>
<mml:msub>
<mml:mi>q</mml:mi>
<mml:mi>j</mml:mi>
</mml:msub>
</mml:mrow>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq154.gif"></inline-graphic>
</alternatives>
</inline-formula>
and on each step
<italic>i</italic>
, a
<italic>single count</italic>
record, which represents the size of the subset of the partition of the
<inline-formula id="IEq155">
<alternatives>
<tex-math id="M161">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$2^i$$\end{document}</tex-math>
<mml:math id="M162">
<mml:msup>
<mml:mn>2</mml:mn>
<mml:mi>i</mml:mi>
</mml:msup>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq155.gif"></inline-graphic>
</alternatives>
</inline-formula>
words that reach
<inline-formula id="IEq156">
<alternatives>
<tex-math id="M163">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$p_m \times q_j$$\end{document}</tex-math>
<mml:math id="M164">
<mml:mrow>
<mml:msub>
<mml:mi>p</mml:mi>
<mml:mi>m</mml:mi>
</mml:msub>
<mml:mo>×</mml:mo>
<mml:msub>
<mml:mi>q</mml:mi>
<mml:mi>j</mml:mi>
</mml:msub>
</mml:mrow>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq156.gif"></inline-graphic>
</alternatives>
</inline-formula>
.</p>
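<p>As a minimal sketch of this level-by-level counting (not the implementation used for the results reported in this article), the following Python code keeps a single count per reached product state at each step. It assumes a naive, non-minimal <italic>at least one hit</italic> automaton whose state is the window of the last symbols read together with a hit flag, crossed synchronously with a match counter. The function names <monospace>hit_coefficients</monospace> and <monospace>brute_force</monospace> are hypothetical and only serve the illustration.</p>

```python
from itertools import product


def hit_coefficients(seed, length):
    """Return [c_0, ..., c_length]: c_m counts the binary alignments of the
    given length that have exactly m matches and are hit at least once."""
    span = len(seed)
    must_match = [i for i, ch in enumerate(seed) if ch == "1"]
    # Product state: (matches so far, window of the last span-1 symbols, hit flag).
    # Assumption: (window, hit) stands in for a minimized seed DFA.
    counts = {(0, (), False): 1}  # the empty word, counted once
    for _ in range(length):
        nxt = {}
        for (m, window, hit), n in counts.items():
            for symbol in (0, 1):  # 0 = mismatch, 1 = match
                full = window + (symbol,)
                new_hit = hit
                if len(full) == span:
                    # A whole seed span has been read: test the seed here.
                    new_hit = hit or all(full[i] for i in must_match)
                    full = full[1:]
                key = (m + symbol, full, new_hit)
                nxt[key] = nxt.get(key, 0) + n
        counts = nxt
    coeffs = [0] * (length + 1)
    for (m, _window, hit), n in counts.items():
        if hit:  # final states: one per value of m
            coeffs[m] += n
    return coeffs


def brute_force(seed, length):
    """Same coefficients by enumerating all 2**length alignments (cross-check)."""
    span = len(seed)
    must_match = [i for i, ch in enumerate(seed) if ch == "1"]
    coeffs = [0] * (length + 1)
    for word in product((0, 1), repeat=length):
        starts = range(length - span + 1)
        if any(all(word[s + i] for i in must_match) for s in starts):
            coeffs[sum(word)] += 1
    return coeffs
```

<p>For the seed 101 and alignments of length 4, this sketch returns the coefficients 0, 0, 2, 4, 1: the detected alignments are 0101 and 1010 (two matches), 0111, 1011, 1101 and 1110 (three matches), and 1111 (four matches).</p>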
<p>Note that, for a seed
<inline-formula id="IEq157">
<alternatives>
<tex-math id="M165">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\pi$$\end{document}</tex-math>
<mml:math id="M166">
<mml:mi mathvariant="italic">π</mml:mi>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq157.gif"></inline-graphic>
</alternatives>
</inline-formula>
of weight
<inline-formula id="IEq158">
<alternatives>
<tex-math id="M167">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$w_\pi$$\end{document}</tex-math>
<mml:math id="M168">
<mml:msub>
<mml:mi>w</mml:mi>
<mml:mi mathvariant="italic">π</mml:mi>
</mml:msub>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq158.gif"></inline-graphic>
</alternatives>
</inline-formula>
and span
<inline-formula id="IEq159">
<alternatives>
<tex-math id="M169">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$s_\pi$$\end{document}</tex-math>
<mml:math id="M170">
<mml:msub>
<mml:mi>s</mml:mi>
<mml:mi mathvariant="italic">π</mml:mi>
</mml:msub>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq159.gif"></inline-graphic>
</alternatives>
</inline-formula>
(thus with
<inline-formula id="IEq160">
<alternatives>
<tex-math id="M171">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$s_\pi -w_\pi$$\end{document}</tex-math>
<mml:math id="M172">
<mml:mrow>
<mml:msub>
<mml:mi>s</mml:mi>
<mml:mi mathvariant="italic">π</mml:mi>
</mml:msub>
<mml:mo>-</mml:mo>
<mml:msub>
<mml:mi>w</mml:mi>
<mml:mi mathvariant="italic">π</mml:mi>
</mml:msub>
</mml:mrow>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq160.gif"></inline-graphic>
</alternatives>
</inline-formula>
<italic>don’t-care</italic>
symbols), the
<italic>at least one hit</italic>
automaton size is in
<inline-formula id="IEq161">
<alternatives>
<tex-math id="M173">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\mathcal {O}(w_\pi 2^{s_\pi -w_\pi })$$\end{document}</tex-math>
<mml:math id="M174">
<mml:mrow>
<mml:mi mathvariant="script">O</mml:mi>
<mml:mo stretchy="false">(</mml:mo>
<mml:msub>
<mml:mi>w</mml:mi>
<mml:mi mathvariant="italic">π</mml:mi>
</mml:msub>
<mml:msup>
<mml:mn>2</mml:mn>
<mml:mrow>
<mml:msub>
<mml:mi>s</mml:mi>
<mml:mi mathvariant="italic">π</mml:mi>
</mml:msub>
<mml:mo>-</mml:mo>
<mml:msub>
<mml:mi>w</mml:mi>
<mml:mi mathvariant="italic">π</mml:mi>
</mml:msub>
</mml:mrow>
</mml:msup>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq161.gif"></inline-graphic>
</alternatives>
</inline-formula>
, so the intersection with the classes of alignments that have
<italic>m</italic>
matches out of
<inline-formula id="IEq162">
<alternatives>
<tex-math id="M175">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\ell$$\end{document}</tex-math>
<mml:math id="M176">
<mml:mi>ℓ</mml:mi>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq162.gif"></inline-graphic>
</alternatives>
</inline-formula>
leads to a full size in
<inline-formula id="IEq163">
<alternatives>
<tex-math id="M177">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\mathcal {O}(\ell w_\pi 2^{s_\pi -w_\pi })$$\end{document}</tex-math>
<mml:math id="M178">
<mml:mrow>
<mml:mi mathvariant="script">O</mml:mi>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>ℓ</mml:mi>
<mml:msub>
<mml:mi>w</mml:mi>
<mml:mi mathvariant="italic">π</mml:mi>
</mml:msub>
<mml:msup>
<mml:mn>2</mml:mn>
<mml:mrow>
<mml:msub>
<mml:mi>s</mml:mi>
<mml:mi mathvariant="italic">π</mml:mi>
</mml:msub>
<mml:mo>-</mml:mo>
<mml:msub>
<mml:mi>w</mml:mi>
<mml:mi mathvariant="italic">π</mml:mi>
</mml:msub>
</mml:mrow>
</mml:msup>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq163.gif"></inline-graphic>
</alternatives>
</inline-formula>
: the computational complexity of the algorithm can thus be estimated as
<inline-formula id="IEq164">
<alternatives>
<tex-math id="M179">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\mathcal {O}(\ell ^2 w_\pi 2^{s_\pi -w_\pi })$$\end{document}</tex-math>
<mml:math id="M180">
<mml:mrow>
<mml:mi mathvariant="script">O</mml:mi>
<mml:mo stretchy="false">(</mml:mo>
<mml:msup>
<mml:mi>ℓ</mml:mi>
<mml:mn>2</mml:mn>
</mml:msup>
<mml:msub>
<mml:mi>w</mml:mi>
<mml:mi mathvariant="italic">π</mml:mi>
</mml:msub>
<mml:msup>
<mml:mn>2</mml:mn>
<mml:mrow>
<mml:msub>
<mml:mi>s</mml:mi>
<mml:mi mathvariant="italic">π</mml:mi>
</mml:msub>
<mml:mo>-</mml:mo>
<mml:msub>
<mml:mi>w</mml:mi>
<mml:mi mathvariant="italic">π</mml:mi>
</mml:msub>
</mml:mrow>
</mml:msup>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq164.gif"></inline-graphic>
</alternatives>
</inline-formula>
in time and
<inline-formula id="IEq165">
<alternatives>
<tex-math id="M181">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\mathcal {O}(\ell w_\pi 2^{s_\pi -w_\pi })$$\end{document}</tex-math>
<mml:math id="M182">
<mml:mrow>
<mml:mi mathvariant="script">O</mml:mi>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>ℓ</mml:mi>
<mml:msub>
<mml:mi>w</mml:mi>
<mml:mi mathvariant="italic">π</mml:mi>
</mml:msub>
<mml:msup>
<mml:mn>2</mml:mn>
<mml:mrow>
<mml:msub>
<mml:mi>s</mml:mi>
<mml:mi mathvariant="italic">π</mml:mi>
</mml:msub>
<mml:mo>-</mml:mo>
<mml:msub>
<mml:mi>w</mml:mi>
<mml:mi mathvariant="italic">π</mml:mi>
</mml:msub>
</mml:mrow>
</mml:msup>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq165.gif"></inline-graphic>
</alternatives>
</inline-formula>
in space. As shown by [
<xref ref-type="bibr" rid="CR1">1</xref>
], it can be processed incrementally for all the alignment lengths up to
<inline-formula id="IEq166">
<alternatives>
<tex-math id="M183">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\ell$$\end{document}</tex-math>
<mml:math id="M184">
<mml:mi>ℓ</mml:mi>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq166.gif"></inline-graphic>
</alternatives>
</inline-formula>
, with the only restriction that the numbers of alignments per state (
<inline-formula id="IEq167">
<alternatives>
<tex-math id="M185">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\le 2^\ell$$\end{document}</tex-math>
<mml:math id="M186">
<mml:mrow>
<mml:mo>≤</mml:mo>
<mml:msup>
<mml:mn>2</mml:mn>
<mml:mi>ℓ</mml:mi>
</mml:msup>
</mml:mrow>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq167.gif"></inline-graphic>
</alternatives>
</inline-formula>
) fit inside an integer word (64 or 128 bits).</p>
<p>We first mention that a
<italic>breadth-first</italic>
construction of the intersection product can be used to limit the
<italic>depth</italic>
of the reached states to
<inline-formula id="IEq168">
<alternatives>
<tex-math id="M187">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\ell$$\end{document}</tex-math>
<mml:math id="M188">
<mml:mi>ℓ</mml:mi>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq168.gif"></inline-graphic>
</alternatives>
</inline-formula>
. We note that several authors have performed equivalent tasks with a matrix for the full automaton [
<xref ref-type="bibr" rid="CR86">86</xref>
], or with a vector for each automaton state [
<xref ref-type="bibr" rid="CR1">1</xref>
], probably because contiguous memory access performs better. An advantage of such a lazy evaluation of the automaton product may be that, besides the fact that it is a
<italic>generic</italic>
automaton product, we avoid
<italic>sparse data-structures</italic>
combined with
<italic>many non-reachable</italic>
states (for example,
<inline-formula id="IEq169">
<alternatives>
<tex-math id="M189">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$p_{\ell -1} \times q_1$$\end{document}</tex-math>
<mml:math id="M190">
<mml:mrow>
<mml:msub>
<mml:mi>p</mml:mi>
<mml:mrow>
<mml:mi>ℓ</mml:mi>
<mml:mo>-</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:msub>
<mml:mo>×</mml:mo>
<mml:msub>
<mml:mi>q</mml:mi>
<mml:mn>1</mml:mn>
</mml:msub>
</mml:mrow>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq169.gif"></inline-graphic>
</alternatives>
</inline-formula>
and
<inline-formula id="IEq170">
<alternatives>
<tex-math id="M191">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$p_{\ell } \times q_1$$\end{document}</tex-math>
<mml:math id="M192">
<mml:mrow>
<mml:msub>
<mml:mi>p</mml:mi>
<mml:mi>ℓ</mml:mi>
</mml:msub>
<mml:mo>×</mml:mo>
<mml:msub>
<mml:mi>q</mml:mi>
<mml:mn>1</mml:mn>
</mml:msub>
</mml:mrow>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq170.gif"></inline-graphic>
</alternatives>
</inline-formula>
will never be reached by any sequence of size
<inline-formula id="IEq171">
<alternatives>
<tex-math id="M193">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\ell > 2$$\end{document}</tex-math>
<mml:math id="M194">
<mml:mrow>
<mml:mi>ℓ</mml:mi>
<mml:mo>></mml:mo>
<mml:mn>2</mml:mn>
</mml:mrow>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq171.gif"></inline-graphic>
</alternatives>
</inline-formula>
: since two
<italic>mismatches</italic>
are needed to reach them,
<inline-formula id="IEq172">
<alternatives>
<tex-math id="M195">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$p_m$$\end{document}</tex-math>
<mml:math id="M196">
<mml:msub>
<mml:mi>p</mml:mi>
<mml:mi>m</mml:mi>
</mml:msub>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq172.gif"></inline-graphic>
</alternatives>
</inline-formula>
must always have its associated number of
<italic>matches</italic>
<inline-formula id="IEq173">
<alternatives>
<tex-math id="M197">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$m \le \ell -2$$\end{document}</tex-math>
<mml:math id="M198">
<mml:mrow>
<mml:mi>m</mml:mi>
<mml:mo>≤</mml:mo>
<mml:mi>ℓ</mml:mi>
<mml:mo>-</mml:mo>
<mml:mn>2</mml:mn>
</mml:mrow>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq173.gif"></inline-graphic>
</alternatives>
</inline-formula>
).</p>
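<p>The effect of such a lazy, breadth-first construction can be sketched in a few lines of Python. The code below is only an illustration under stated assumptions: it uses a naive, non-minimal <italic>at least one hit</italic> automaton (window of the last symbols read, plus a hit flag) rather than a minimized DFA, and the names <monospace>seed_step</monospace> and <monospace>reachable_product_states</monospace> are hypothetical. Product states are created only when a transition actually reaches them, and the exploration stops at the depth given by the alignment length.</p>

```python
from collections import deque


def seed_step(seed, state, symbol):
    """One transition of a naive 'at least one hit' automaton.
    state = (window of at most span-1 last symbols, hit flag)."""
    window, hit = state
    full = window + (symbol,)
    if len(full) == len(seed):
        # All must-match positions of the seed carry a match (1)?
        hit = hit or all(b for b, ch in zip(full, seed) if ch == "1")
        full = full[1:]
    return (full, hit)


def reachable_product_states(seed, length):
    """Breadth-first, lazy construction of the product with the match counter,
    limited to states reachable by words of at most `length` symbols."""
    start = (0, ((), False))  # (matches counted so far, seed-automaton state)
    depth = {start: 0}
    queue = deque([start])
    while queue:
        m, q = queue.popleft()
        d = depth[(m, q)]
        if d == length:
            continue  # do not expand beyond the alignment length
        for symbol in (0, 1):  # 0 = mismatch, 1 = match
            nxt = (m + symbol, seed_step(seed, q, symbol))
            if nxt not in depth:
                depth[nxt] = d + 1
                queue.append(nxt)
    return set(depth)
```

<p>With the seed 101, a state whose window holds two mismatches is only created for match counts of at most the alignment length minus two, so the corresponding cells of a full matrix representation are never allocated here.</p>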
<p>We finally mention that a similar method was used in [
<xref ref-type="bibr" rid="CR87">87</xref>
] to compute correlation coefficients between the seed
<italic>number of hits</italic>
or the seed
<italic>coverage</italic>
, and the
<italic>true</italic>
alignment Hamming distance.
<xref ref-type="fn" rid="Fn3">3</xref>
</p>
<p>In the following sections, we will use the (
<italic>m</italic>
-matches counting)
<inline-formula id="IEq174">
<alternatives>
<tex-math id="M199">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$c_{\pi ,m}$$\end{document}</tex-math>
<mml:math id="M200">
<mml:msub>
<mml:mi>c</mml:mi>
<mml:mrow>
<mml:mi mathvariant="italic">π</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>m</mml:mi>
</mml:mrow>
</mml:msub>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq174.gif"></inline-graphic>
</alternatives>
</inline-formula>
coefficients to compute either probabilities on continuous models or frequencies on discrete models.</p>
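<p>As a small illustration of the continuous case, the sketch below assumes a Bernoulli model in which each alignment symbol is a match independently with probability <italic>p</italic>, so that every alignment with <italic>m</italic> matches has the same probability and the counting coefficients can simply be used as weights. The function name <monospace>hit_probability</monospace> is hypothetical, and the coefficients are those of the seed 101 on alignments of length 4, enumerated by hand (0101 and 1010 with two matches; 0111, 1011, 1101 and 1110 with three; 1111 with four).</p>

```python
from fractions import Fraction


def hit_probability(coeffs, p):
    """Sum over m of c_m * p^m * (1 - p)^(length - m), with coeffs[m] = c_m."""
    length = len(coeffs) - 1
    return sum(c * p**m * (1 - p) ** (length - m) for m, c in enumerate(coeffs))


# Seed 101, alignment length 4: c_0..c_4, enumerated by hand (see lead-in).
COEFFS_101_LEN4 = [0, 0, 2, 4, 1]

# Exact arithmetic with Fraction avoids floating-point rounding.
half = hit_probability(COEFFS_101_LEN4, Fraction(1, 2))
```

<p>With <italic>p</italic> = 1/2 all sixteen alignments are equally likely, and the value 7/16 simply reflects the seven detected alignments.</p>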
</sec>
<sec id="Sec4">
<title>Continuous models</title>
<sec id="Sec5">
<title>Bernoulli polynomial form and dominance between seeds</title>
<p>Once the
<inline-formula id="IEq175">
<alternatives>
<tex-math id="M201">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$c_{\pi ,m}$$\end{document}</tex-math>
<mml:math id="M202">
<mml:msub>
<mml:mi>c</mml:mi>
<mml:mrow>
<mml:mi mathvariant="italic">π</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>m</mml:mi>
</mml:mrow>
</mml:msub>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq175.gif"></inline-graphic>
</alternatives>
</inline-formula>
coefficients (the number of alignments of length
<inline-formula id="IEq176">
<alternatives>
<tex-math id="M203">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\ell$$\end{document}</tex-math>
<mml:math id="M204">
<mml:mi>ℓ</mml:mi>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq176.gif"></inline-graphic>
</alternatives>
</inline-formula>
with
<italic>m</italic>
matches that are detected by the seed
<inline-formula id="IEq177">
<alternatives>
<tex-math id="M205">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\pi$$\end{document}</tex-math>
<mml:math id="M206">
<mml:mi mathvariant="italic">π</mml:mi>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq177.gif"></inline-graphic>
</alternatives>
</inline-formula>
) are determined, the probability of hitting an alignment of length
<inline-formula id="IEq178">
<alternatives>
<tex-math id="M207">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\ell$$\end{document}</tex-math>
<mml:math id="M208">
<mml:mi>ℓ</mml:mi>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq178.gif"></inline-graphic>
</alternatives>
</inline-formula>
under a Bernoulli model (where the probability of having a match is
<italic>p</italic>
) can be directly computed as a polynomial over
<italic>p</italic>
of degree at most
<inline-formula id="IEq179">
<alternatives>
<tex-math id="M209">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\ell$$\end{document}</tex-math>
<mml:math id="M210">
<mml:mi>ℓ</mml:mi>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq179.gif"></inline-graphic>
</alternatives>
</inline-formula>
:
<disp-formula id="Equ1">
<label>1</label>
<alternatives>
<tex-math id="M211">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\begin{aligned}Pr_\pi (p,\ell ) &= c_{\pi ,0} \, p^0 (1-p)^\ell + c_{\pi ,1}\, p^1 (1-p)^{\ell -1} +\cdots \nonumber \\& \quad \cdots + c_{\pi ,\ell -1}\, p^{\ell -1} (1-p)^{1} + c_{\pi ,\ell }\, p^\ell (1-p)^0 \end{aligned}$$\end{document}</tex-math>
<mml:math id="M212" display="block">
<mml:mrow>
<mml:mtable columnspacing="0.5ex">
<mml:mtr>
<mml:mtd columnalign="right">
<mml:mrow>
<mml:mi>P</mml:mi>
<mml:msub>
<mml:mi>r</mml:mi>
<mml:mi mathvariant="italic">π</mml:mi>
</mml:msub>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>p</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>ℓ</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:mtd>
<mml:mtd columnalign="left">
<mml:mrow>
<mml:mo>=</mml:mo>
<mml:msub>
<mml:mi>c</mml:mi>
<mml:mrow>
<mml:mi mathvariant="italic">π</mml:mi>
<mml:mo>,</mml:mo>
<mml:mn>0</mml:mn>
</mml:mrow>
</mml:msub>
<mml:mspace width="0.166667em"></mml:mspace>
<mml:msup>
<mml:mi>p</mml:mi>
<mml:mn>0</mml:mn>
</mml:msup>
<mml:msup>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mn>1</mml:mn>
<mml:mo>-</mml:mo>
<mml:mi>p</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
<mml:mi>ℓ</mml:mi>
</mml:msup>
<mml:mo>+</mml:mo>
<mml:msub>
<mml:mi>c</mml:mi>
<mml:mrow>
<mml:mi mathvariant="italic">π</mml:mi>
<mml:mo>,</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:msub>
<mml:mspace width="0.166667em"></mml:mspace>
<mml:msup>
<mml:mi>p</mml:mi>
<mml:mn>1</mml:mn>
</mml:msup>
<mml:msup>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mn>1</mml:mn>
<mml:mo>-</mml:mo>
<mml:mi>p</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:mi>ℓ</mml:mi>
<mml:mo>-</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:msup>
<mml:mo>+</mml:mo>
<mml:mo>⋯</mml:mo>
</mml:mrow>
</mml:mtd>
</mml:mtr>
<mml:mtr>
<mml:mtd columnalign="right">
<mml:mrow></mml:mrow>
</mml:mtd>
<mml:mtd columnalign="left">
<mml:mrow>
<mml:mspace width="1em"></mml:mspace>
<mml:mo>⋯</mml:mo>
<mml:mo>+</mml:mo>
<mml:msub>
<mml:mi>c</mml:mi>
<mml:mrow>
<mml:mi mathvariant="italic">π</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>ℓ</mml:mi>
<mml:mo>-</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:msub>
<mml:mspace width="0.166667em"></mml:mspace>
<mml:msup>
<mml:mi>p</mml:mi>
<mml:mrow>
<mml:mi>ℓ</mml:mi>
<mml:mo>-</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:msup>
<mml:msup>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mn>1</mml:mn>
<mml:mo>-</mml:mo>
<mml:mi>p</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
<mml:mn>1</mml:mn>
</mml:msup>
<mml:mo>+</mml:mo>
<mml:msub>
<mml:mi>c</mml:mi>
<mml:mrow>
<mml:mi mathvariant="italic">π</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>ℓ</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mspace width="0.166667em"></mml:mspace>
<mml:msup>
<mml:mi>p</mml:mi>
<mml:mi>ℓ</mml:mi>
</mml:msup>
<mml:msup>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mn>1</mml:mn>
<mml:mo>-</mml:mo>
<mml:mi>p</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
<mml:mn>0</mml:mn>
</mml:msup>
</mml:mrow>
</mml:mtd>
</mml:mtr>
</mml:mtable>
</mml:mrow>
</mml:math>
<graphic xlink:href="13015_2017_92_Article_Equ1.gif" position="anchor"></graphic>
</alternatives>
</disp-formula>
The expression (
<xref rid="Equ1" ref-type="">1</xref>
) was first proposed by [
<xref ref-type="bibr" rid="CR1">1</xref>
] for spaced seeds, noticing that each alignment with
<italic>m</italic>
match symbols and
<inline-formula id="IEq180">
<alternatives>
<tex-math id="M213">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\ell -m$$\end{document}</tex-math>
<mml:math id="M214">
<mml:mrow>
<mml:mi>ℓ</mml:mi>
<mml:mo>-</mml:mo>
<mml:mi>m</mml:mi>
</mml:mrow>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq180.gif"></inline-graphic>
</alternatives>
</inline-formula>
mismatch symbols,
<italic>“no matter how arranged”</italic>
, has the same probability
<inline-formula id="IEq181">
<alternatives>
<tex-math id="M215">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$p^m (1-p)^{\ell -m}$$\end{document}</tex-math>
<mml:math id="M216">
<mml:mrow>
<mml:msup>
<mml:mi>p</mml:mi>
<mml:mi>m</mml:mi>
</mml:msup>
<mml:msup>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mn>1</mml:mn>
<mml:mo>-</mml:mo>
<mml:mi>p</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:mi>ℓ</mml:mi>
<mml:mo>-</mml:mo>
<mml:mi>m</mml:mi>
</mml:mrow>
</mml:msup>
</mml:mrow>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq181.gif"></inline-graphic>
</alternatives>
</inline-formula>
to occur. The coefficient
<inline-formula id="IEq182">
<alternatives>
<tex-math id="M217">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$c_{\pi ,m}$$\end{document}</tex-math>
<mml:math id="M218">
<mml:msub>
<mml:mi>c</mml:mi>
<mml:mrow>
<mml:mi mathvariant="italic">π</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>m</mml:mi>
</mml:mrow>
</mml:msub>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq182.gif"></inline-graphic>
</alternatives>
</inline-formula>
then gives the number of such (distinct, hence mutually exclusive) alignments that are detected by the seed
<inline-formula id="IEq183">
<alternatives>
<tex-math id="M219">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\pi$$\end{document}</tex-math>
<mml:math id="M220">
<mml:mi mathvariant="italic">π</mml:mi>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq183.gif"></inline-graphic>
</alternatives>
</inline-formula>
. This leads, for all the possible numbers of match/mismatch symbols in an alignment of length
<inline-formula id="IEq184">
<alternatives>
<tex-math id="M221">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\ell$$\end{document}</tex-math>
<mml:math id="M222">
<mml:mi>ℓ</mml:mi>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq184.gif"></inline-graphic>
</alternatives>
</inline-formula>
, to the expression (
<xref rid="Equ1" ref-type="">1</xref>
) of the sensitivity for
<inline-formula id="IEq185">
<alternatives>
<tex-math id="M223">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\pi$$\end{document}</tex-math>
<mml:math id="M224">
<mml:mi mathvariant="italic">π</mml:mi>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq185.gif"></inline-graphic>
</alternatives>
</inline-formula>
. At first sight, this formula might appear numerically unstable unless the computation is adapted, because of the large
<inline-formula id="IEq186">
<alternatives>
<tex-math id="M225">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$c_{\pi ,m}$$\end{document}</tex-math>
<mml:math id="M226">
<mml:msub>
<mml:mi>c</mml:mi>
<mml:mrow>
<mml:mi mathvariant="italic">π</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>m</mml:mi>
</mml:mrow>
</mml:msub>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq186.gif"></inline-graphic>
</alternatives>
</inline-formula>
coefficients, as opposed to the rather small
<inline-formula id="IEq187">
<alternatives>
<tex-math id="M227">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$p^m (1-p)^{\ell -m}$$\end{document}</tex-math>
<mml:math id="M228">
<mml:mrow>
<mml:msup>
<mml:mi>p</mml:mi>
<mml:mi>m</mml:mi>
</mml:msup>
<mml:msup>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mn>1</mml:mn>
<mml:mo>-</mml:mo>
<mml:mi>p</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:mi>ℓ</mml:mi>
<mml:mo>-</mml:mo>
<mml:mi>m</mml:mi>
</mml:mrow>
</mml:msup>
</mml:mrow>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq187.gif"></inline-graphic>
</alternatives>
</inline-formula>
probability values. But we will see that this expression (
<xref rid="Equ1" ref-type="">1</xref>
) is not evaluated so frequently and, when it is, requires more involved tools than a classical numerical computation.</p>
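<p>As a minimal sketch of what the coefficients in expression (1) count, the following Python code enumerates every alignment of a very small length, records how many alignments with exactly m matches are hit by a seed, and then evaluates the sensitivity with exact rational arithmetic. The function names (seed_hits, coefficients, sensitivity) are hypothetical and chosen only for this illustration. The brute-force enumeration is exponential in the alignment length, so it is an assumption suited to toy lengths only and is not the method used in this article.</p>

```python
from fractions import Fraction
from itertools import product


def seed_hits(seed, alignment):
    """Return True if the seed matches the alignment at one position or more.

    seed      -- string over '1' (match required) and '0' (joker position)
    alignment -- tuple of 0/1 values, where 1 stands for a match symbol
    """
    span = len(seed)
    for start in range(len(alignment) - span + 1):
        if all(alignment[start + i] == 1
               for i, symbol in enumerate(seed) if symbol == '1'):
            return True
    return False


def coefficients(seed, length):
    """Count, for each m, the alignments with exactly m matches hit by the seed."""
    c = [0] * (length + 1)
    for alignment in product((0, 1), repeat=length):  # 2**length alignments
        if seed_hits(seed, alignment):
            c[sum(alignment)] += 1
    return c


def sensitivity(c, p):
    """Evaluate sum_m c[m] * p**m * (1-p)**(length-m) without rounding."""
    length = len(c) - 1
    p = Fraction(p)
    return sum(c_m * p**m * (1 - p)**(length - m) for m, c_m in enumerate(c))


c = coefficients("11", 3)               # hit alignments: 110, 011, 111
print(c)                                # [0, 0, 2, 1]
print(sensitivity(c, Fraction(1, 2)))   # 3/8
```

<p>Working with Fraction values keeps the large integer coefficients and the small probability terms exact, which sidesteps the rounding concern raised above at the price of slower arithmetic.</p>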
<p>Mak and Benson [
<xref ref-type="bibr" rid="CR1">1</xref>
] also include in their paper an elegant and simple
<italic>partial order</italic>
named dominance between seeds: suppose that two spaced seeds
<inline-formula id="IEq188">
<alternatives>
<tex-math id="M229">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\pi _a$$\end{document}</tex-math>
<mml:math id="M230">
<mml:msub>
<mml:mi mathvariant="italic">π</mml:mi>
<mml:mi>a</mml:mi>
</mml:msub>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq188.gif"></inline-graphic>
</alternatives>
</inline-formula>
and
<inline-formula id="IEq189">
<alternatives>
<tex-math id="M231">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\pi _b$$\end{document}</tex-math>
<mml:math id="M232">
<mml:msub>
<mml:mi mathvariant="italic">π</mml:mi>
<mml:mi>b</mml:mi>
</mml:msub>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq189.gif"></inline-graphic>
</alternatives>
</inline-formula>
have to be compared according to their respective
<inline-formula id="IEq190">
<alternatives>
<tex-math id="M233">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$c_{\pi _a,m}$$\end{document}</tex-math>
<mml:math id="M234">
<mml:msub>
<mml:mi>c</mml:mi>
<mml:mrow>
<mml:msub>
<mml:mi mathvariant="italic">π</mml:mi>
<mml:mi>a</mml:mi>
</mml:msub>
<mml:mo>,</mml:mo>
<mml:mi>m</mml:mi>
</mml:mrow>
</mml:msub>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq190.gif"></inline-graphic>
</alternatives>
</inline-formula>
and
<inline-formula id="IEq191">
<alternatives>
<tex-math id="M235">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$c_{\pi _b,m}$$\end{document}</tex-math>
<mml:math id="M236">
<mml:msub>
<mml:mi>c</mml:mi>
<mml:mrow>
<mml:msub>
<mml:mi mathvariant="italic">π</mml:mi>
<mml:mi>b</mml:mi>
</mml:msub>
<mml:mo>,</mml:mo>
<mml:mi>m</mml:mi>
</mml:mrow>
</mml:msub>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq191.gif"></inline-graphic>
</alternatives>
</inline-formula>
coefficients. Now assume that
<inline-formula id="IEq192">
<alternatives>
<tex-math id="M237">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\forall m \in [0\ldots\,\ell ] \quad c_{\pi _a,m} \ge c_{\pi _b,m}$$\end{document}</tex-math>
<mml:math id="M238">
<mml:mrow>
<mml:mo>∀</mml:mo>
<mml:mi>m</mml:mi>
<mml:mo>∈</mml:mo>
<mml:mrow>
<mml:mo stretchy="false">[</mml:mo>
<mml:mn>0</mml:mn>
<mml:mo>…</mml:mo>
<mml:mspace width="0.166667em"></mml:mspace>
<mml:mi>ℓ</mml:mi>
<mml:mo stretchy="false">]</mml:mo>
</mml:mrow>
<mml:mspace width="1em"></mml:mspace>
<mml:msub>
<mml:mi>c</mml:mi>
<mml:mrow>
<mml:msub>
<mml:mi mathvariant="italic">π</mml:mi>
<mml:mi>a</mml:mi>
</mml:msub>
<mml:mo>,</mml:mo>
<mml:mi>m</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>≥</mml:mo>
<mml:msub>
<mml:mi>c</mml:mi>
<mml:mrow>
<mml:msub>
<mml:mi mathvariant="italic">π</mml:mi>
<mml:mi>b</mml:mi>
</mml:msub>
<mml:mo>,</mml:mo>
<mml:mi>m</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq192.gif"></inline-graphic>
</alternatives>
</inline-formula>
(with a strict inequality for at least one of the coefficients); then we can conclude that
<inline-formula id="IEq193">
<alternatives>
<tex-math id="M239">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\pi _a$$\end{document}</tex-math>
<mml:math id="M240">
<mml:msub>
<mml:mi mathvariant="italic">π</mml:mi>
<mml:mi>a</mml:mi>
</mml:msub>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq193.gif"></inline-graphic>
</alternatives>
</inline-formula>
<italic>dominates</italic>
<inline-formula id="IEq194">
<alternatives>
<tex-math id="M241">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\pi _b$$\end{document}</tex-math>
<mml:math id="M242">
<mml:msub>
<mml:mi mathvariant="italic">π</mml:mi>
<mml:mi>b</mml:mi>
</mml:msub>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq194.gif"></inline-graphic>
</alternatives>
</inline-formula>
, and thus that
<inline-formula id="IEq195">
<alternatives>
<tex-math id="M243">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\pi _b$$\end{document}</tex-math>
<mml:math id="M244">
<mml:msub>
<mml:mi mathvariant="italic">π</mml:mi>
<mml:mi>b</mml:mi>
</mml:msub>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq195.gif"></inline-graphic>
</alternatives>
</inline-formula>
can be discarded from the possible set of optimal seeds. Indeed, the definition of the sensitivity by the formula (
<xref rid="Equ1" ref-type="">1</xref>
) as a sum of the
<italic>same positive</italic>
terms
<inline-formula id="IEq196">
<alternatives>
<tex-math id="M245">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$p^m (1-p)^{\ell -m}$$\end{document}</tex-math>
<mml:math id="M246">
<mml:mrow>
<mml:msup>
<mml:mi>p</mml:mi>
<mml:mi>m</mml:mi>
</mml:msup>
<mml:msup>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mn>1</mml:mn>
<mml:mo>-</mml:mo>
<mml:mi>p</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:mi>ℓ</mml:mi>
<mml:mo>-</mml:mo>
<mml:mi>m</mml:mi>
</mml:mrow>
</mml:msup>
</mml:mrow>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq196.gif"></inline-graphic>
</alternatives>
</inline-formula>
, each term being respectively multiplied by a
<italic>seed-dependent positive</italic>
coefficient
<inline-formula id="IEq197">
<alternatives>
<tex-math id="M247">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$c_{\pi ,m}$$\end{document}</tex-math>
<mml:math id="M248">
<mml:msub>
<mml:mi>c</mml:mi>
<mml:mrow>
<mml:mi mathvariant="italic">π</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>m</mml:mi>
</mml:mrow>
</mml:msub>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq197.gif"></inline-graphic>
</alternatives>
</inline-formula>
, guarantees that the sensitivity of
<inline-formula id="IEq198">
<alternatives>
<tex-math id="M249">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\pi _b$$\end{document}</tex-math>
<mml:math id="M250">
<mml:msub>
<mml:mi mathvariant="italic">π</mml:mi>
<mml:mi>b</mml:mi>
</mml:msub>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq198.gif"></inline-graphic>
</alternatives>
</inline-formula>
will never be better than the sensitivity of
<inline-formula id="IEq199">
<alternatives>
<tex-math id="M251">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\pi _a$$\end{document}</tex-math>
<mml:math id="M252">
<mml:msub>
<mml:mi mathvariant="italic">π</mml:mi>
<mml:mi>a</mml:mi>
</mml:msub>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq199.gif"></inline-graphic>
</alternatives>
</inline-formula>
, whichever parameter
<inline-formula id="IEq200">
<alternatives>
<tex-math id="M253">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$p \in [0,1]$$\end{document}</tex-math>
<mml:math id="M254">
<mml:mrow>
<mml:mi>p</mml:mi>
<mml:mo>∈</mml:mo>
<mml:mo stretchy="false">[</mml:mo>
<mml:mn>0</mml:mn>
<mml:mo>,</mml:mo>
<mml:mn>1</mml:mn>
<mml:mo stretchy="false">]</mml:mo>
</mml:mrow>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq200.gif"></inline-graphic>
</alternatives>
</inline-formula>
is chosen.</p>
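<p>The dominance test described above can be sketched in a few lines of Python. The function names (dominates, sensitivity) are hypothetical, and the two coefficient vectors are a toy case of weight 2 on alignments of length 4, obtained by listing the 16 possible alignments by hand; they are not taken from the tables of this article.</p>

```python
from fractions import Fraction


def dominates(c_a, c_b):
    """True if c_a[m] >= c_b[m] for all m, with one strict inequality or more."""
    if len(c_a) != len(c_b):
        raise ValueError("both seeds must be evaluated on the same alignment length")
    pairs = list(zip(c_a, c_b))
    return all(a >= b for a, b in pairs) and any(a > b for a, b in pairs)


def sensitivity(c, p):
    """Evaluate sum_m c[m] * p**m * (1-p)**(length-m)."""
    length = len(c) - 1
    return sum(c_m * p**m * (1 - p)**(length - m) for m, c_m in enumerate(c))


# Toy coefficients for alignment length 4 (index m = number of matches).
c_contiguous = [0, 0, 3, 4, 1]   # seed 11
c_spaced = [0, 0, 2, 4, 1]       # seed 101

print(dominates(c_contiguous, c_spaced))   # True
print(dominates(c_spaced, c_contiguous))   # False

# Dominance implies the sensitivity ordering for every p in [0, 1];
# here it is only spot-checked on a grid of exact rational values.
grid = [Fraction(k, 20) for k in range(21)]
print(all(sensitivity(c_contiguous, p) >= sensitivity(c_spaced, p) for p in grid))  # True
```

<p>Note that the test itself never needs a value of p: it compares integer coefficients only, which is what makes the selection parameter-free.</p>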
<p>In practice, from the initial set of all the possible seeds of given weight
<italic>w</italic>
and maximal span
<italic>s</italic>
, several seeds can be discarded using this dominance principle, which reduces the initial set to a small subset of seeds that remain candidates for optimality. However, this
<italic>dominance principle</italic>
is a
<italic>partial order</italic>
between seeds, which means that some pairs of seeds
<italic>cannot</italic>
be compared.
<table-wrap id="Tab1">
<label>Table 1</label>
<caption>
<p>Polynomial coefficients</p>
</caption>
<graphic xlink:href="13015_2017_92_Tab1_HTML" id="MO5"></graphic>
<table-wrap-foot>
<p>Number
<inline-formula id="IEq201">
<alternatives>
<tex-math id="M255">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$c_{\pi ,m}$$\end{document}</tex-math>
<mml:math id="M256">
<mml:msub>
<mml:mi>c</mml:mi>
<mml:mrow>
<mml:mi mathvariant="italic">π</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>m</mml:mi>
</mml:mrow>
</mml:msub>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq201.gif"></inline-graphic>
</alternatives>
</inline-formula>
of alignments of length
<inline-formula id="IEq202">
<alternatives>
<tex-math id="M257">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\ell =64$$\end{document}</tex-math>
<mml:math id="M258">
<mml:mrow>
<mml:mi>ℓ</mml:mi>
<mml:mo>=</mml:mo>
<mml:mn>64</mml:mn>
</mml:mrow>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq202.gif"></inline-graphic>
</alternatives>
</inline-formula>
with exactly
<italic>m</italic>
matches that are hit by the contiguous seed (first column) and by the Patternhunter I spaced seed (second column), together with their difference (third column). The fourth column indicates the maximal number of alignments of length
<inline-formula id="IEq203">
<alternatives>
<tex-math id="M259">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\ell =64$$\end{document}</tex-math>
<mml:math id="M260">
<mml:mrow>
<mml:mi>ℓ</mml:mi>
<mml:mo>=</mml:mo>
<mml:mn>64</mml:mn>
</mml:mrow>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq203.gif"></inline-graphic>
</alternatives>
</inline-formula>
with exactly
<italic>m</italic>
matches that could have been detected. When this number equals the value in the first or the second column, the seed is considered to be
<italic>lossless</italic>
, and the background of the cell is then pink.</p>
</table-wrap-foot>
</table-wrap>
</p>
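<p>The filtering step described above can be sketched as follows, under strong simplifying assumptions: seeds are enumerated naively, coefficients are obtained by brute force over all alignments (so only tiny lengths are feasible), and candidates are compared pairwise. All function names are hypothetical, and this is not the semi-ring computation used later in this article.</p>

```python
from itertools import combinations, product


def enumerate_seeds(weight, max_span):
    """All seeds with the given weight (2 or more) and span at most max_span.

    A seed starts and ends with '1'; inner positions are '1' or '0' (joker).
    """
    seeds = []
    for span in range(weight, max_span + 1):
        inner = span - 2
        for ones in combinations(range(inner), weight - 2):
            body = ['1' if i in ones else '0' for i in range(inner)]
            seeds.append('1' + ''.join(body) + '1')
    return seeds


def coefficients(seed, length):
    """c[m] = number of alignments with exactly m matches hit by the seed."""
    c = [0] * (length + 1)
    required = [i for i, symbol in enumerate(seed) if symbol == '1']
    starts = range(length - len(seed) + 1)
    for alignment in product((0, 1), repeat=length):
        if any(all(alignment[s + i] for i in required) for s in starts):
            c[sum(alignment)] += 1
    return c


def dominates(c_a, c_b):
    """Coefficient-wise >= with one strict inequality or more."""
    pairs = list(zip(c_a, c_b))
    return all(a >= b for a, b in pairs) and any(a > b for a, b in pairs)


def dominant_seeds(weight, max_span, length):
    """Keep the seeds whose coefficient vector is dominated by no other seed."""
    table = {seed: coefficients(seed, length)
             for seed in enumerate_seeds(weight, max_span)}
    return [seed for seed, c in table.items()
            if not any(dominates(other, c) for other in table.values())]


print(enumerate_seeds(3, 5))
# ['111', '1101', '1011', '11001', '10101', '10011']
print(dominant_seeds(3, 5, 10))
```

<p>Seeds that share identical coefficient vectors, such as a seed and its mirror image, do not dominate each other in this sketch and are therefore both kept.</p>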
<p>As an illustration, Table 
<xref rid="Tab1" ref-type="table">1</xref>
lists the
<inline-formula id="IEq204">
<alternatives>
<tex-math id="M261">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$c_{\pi ,m}$$\end{document}</tex-math>
<mml:math id="M262">
<mml:msub>
<mml:mi>c</mml:mi>
<mml:mrow>
<mml:mi mathvariant="italic">π</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>m</mml:mi>
</mml:mrow>
</mml:msub>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq204.gif"></inline-graphic>
</alternatives>
</inline-formula>
coefficients of two single seeds, the contiguous seed (11111111111) and the Patternhunter I spaced seed (111010010100110111), for the alignment length
<inline-formula id="IEq205">
<alternatives>
<tex-math id="M263">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\ell =64$$\end{document}</tex-math>
<mml:math id="M264">
<mml:mrow>
<mml:mi>ℓ</mml:mi>
<mml:mo>=</mml:mo>
<mml:mn>64</mml:mn>
</mml:mrow>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq205.gif"></inline-graphic>
</alternatives>
</inline-formula>
. Note that comparing only the pairs of coefficients
<inline-formula id="IEq206">
<alternatives>
<tex-math id="M265">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$c_{\mathtt{11111111111},m}$$\end{document}</tex-math>
<mml:math id="M266">
<mml:msub>
<mml:mi>c</mml:mi>
<mml:mrow>
<mml:mn mathvariant="monospace">11111111111</mml:mn>
<mml:mo>,</mml:mo>
<mml:mi>m</mml:mi>
</mml:mrow>
</mml:msub>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq206.gif"></inline-graphic>
</alternatives>
</inline-formula>
and
<inline-formula id="IEq207">
<alternatives>
<tex-math id="M267">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$c_{\mathtt{111010010100110111},m}$$\end{document}</tex-math>
<mml:math id="M268">
<mml:msub>
<mml:mi>c</mml:mi>
<mml:mrow>
<mml:mn mathvariant="monospace">111010010100110111</mml:mn>
<mml:mo>,</mml:mo>
<mml:mi>m</mml:mi>
</mml:mrow>
</mml:msub>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq207.gif"></inline-graphic>
</alternatives>
</inline-formula>
does not allow either of the two seeds to be chosen or discarded by the dominance principle, since
<inline-formula id="IEq208">
<alternatives>
<tex-math id="M269">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$c_{\mathtt{11111111111},m} > c_{\mathtt{111010010100110111},m}$$\end{document}</tex-math>
<mml:math id="M270">
<mml:mrow>
<mml:msub>
<mml:mi>c</mml:mi>
<mml:mrow>
<mml:mn mathvariant="monospace">11111111111</mml:mn>
<mml:mo>,</mml:mo>
<mml:mi>m</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>></mml:mo>
<mml:msub>
<mml:mi>c</mml:mi>
<mml:mrow>
<mml:mn mathvariant="monospace">111010010100110111</mml:mn>
<mml:mo>,</mml:mo>
<mml:mi>m</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq208.gif"></inline-graphic>
</alternatives>
</inline-formula>
when
<inline-formula id="IEq209">
<alternatives>
<tex-math id="M271">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$m \le 18$$\end{document}</tex-math>
<mml:math id="M272">
<mml:mrow>
<mml:mi>m</mml:mi>
<mml:mo>≤</mml:mo>
<mml:mn>18</mml:mn>
</mml:mrow>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq209.gif"></inline-graphic>
</alternatives>
</inline-formula>
, or
<inline-formula id="IEq210">
<alternatives>
<tex-math id="M273">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$c_{\mathtt{11111111111},m} \le c_{\mathtt{111010010100110111},m}$$\end{document}</tex-math>
<mml:math id="M274">
<mml:mrow>
<mml:msub>
<mml:mi>c</mml:mi>
<mml:mrow>
<mml:mn mathvariant="monospace">11111111111</mml:mn>
<mml:mo>,</mml:mo>
<mml:mi>m</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>≤</mml:mo>
<mml:msub>
<mml:mi>c</mml:mi>
<mml:mrow>
<mml:mn mathvariant="monospace">111010010100110111</mml:mn>
<mml:mo>,</mml:mo>
<mml:mi>m</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq210.gif"></inline-graphic>
</alternatives>
</inline-formula>
otherwise (with a strict inequality when
<inline-formula id="IEq211">
<alternatives>
<tex-math id="M275">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$m \le 59$$\end{document}</tex-math>
<mml:math id="M276">
<mml:mrow>
<mml:mi>m</mml:mi>
<mml:mo></mml:mo>
<mml:mn>59</mml:mn>
</mml:mrow>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq211.gif"></inline-graphic>
</alternatives>
</inline-formula>
). In fact, both seeds belong to the set of dominant seeds of weight
<inline-formula id="IEq212">
<alternatives>
<tex-math id="M277">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$w=11$$\end{document}</tex-math>
<mml:math id="M278">
<mml:mrow>
<mml:mi>w</mml:mi>
<mml:mo>=</mml:mo>
<mml:mn>11</mml:mn>
</mml:mrow>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq212.gif"></inline-graphic>
</alternatives>
</inline-formula>
found on alignments of length
<inline-formula id="IEq213">
<alternatives>
<tex-math id="M279">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\ell =64$$\end{document}</tex-math>
<mml:math id="M280">
<mml:mrow>
<mml:mi>ℓ</mml:mi>
<mml:mo>=</mml:mo>
<mml:mn>64</mml:mn>
</mml:mrow>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq213.gif"></inline-graphic>
</alternatives>
</inline-formula>
, as mentioned by [
<xref ref-type="bibr" rid="CR1">1</xref>
], and verified in our experiments.</p>
<p>Surprisingly, according to the experiments of [
<xref ref-type="bibr" rid="CR1">1</xref>
], very few single seeds are
<italic>overall dominant</italic>
in the class of seeds of the same weight
<italic>w</italic>
and fixed or restricted span
<italic>s</italic>
(e.g.
<inline-formula id="IEq214">
<alternatives>
<tex-math id="M281">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$s \le 2\times w$$\end{document}</tex-math>
<mml:math id="M282">
<mml:mrow>
<mml:mi>s</mml:mi>
<mml:mo>≤</mml:mo>
<mml:mn>2</mml:mn>
<mml:mo>×</mml:mo>
<mml:mi>w</mml:mi>
</mml:mrow>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq214.gif"></inline-graphic>
</alternatives>
</inline-formula>
): this
<italic>dominance</italic>
criterion was thus used as a filter for the pre-selection of optimal seeds. In the section
<italic>“Experiments”</italic>
, we show that the dominance selection also scales reasonably well for selecting multiple-seed candidates.</p>
</sec>
<sec id="Sec6">
<title>Hit Integration and its associated polynomial form</title>
<p>
<italic>Hit Integration (HI)</italic>
for a given seed
<inline-formula id="IEq215">
<alternatives>
<tex-math id="M283">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\pi$$\end{document}</tex-math>
<mml:math id="M284">
<mml:mi mathvariant="italic">π</mml:mi>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq215.gif"></inline-graphic>
</alternatives>
</inline-formula>
was proposed by [
<xref ref-type="bibr" rid="CR2">2</xref>
] as
<inline-formula id="IEq216">
<alternatives>
<tex-math id="M285">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\frac{\int _{p_a}^{p_b} Pr_\pi (p,\ell ) \, dp}{p_b-p_a}$$\end{document}</tex-math>
<mml:math id="M286">
<mml:mfrac>
<mml:mrow>
<mml:msubsup>
<mml:mo>∫</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mi>p</mml:mi>
<mml:mi>a</mml:mi>
</mml:msub>
</mml:mrow>
<mml:msub>
<mml:mi>p</mml:mi>
<mml:mi>b</mml:mi>
</mml:msub>
</mml:msubsup>
<mml:mi>P</mml:mi>
<mml:msub>
<mml:mi>r</mml:mi>
<mml:mi mathvariant="italic">π</mml:mi>
</mml:msub>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>p</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>ℓ</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
<mml:mspace width="0.166667em"></mml:mspace>
<mml:mi>d</mml:mi>
<mml:mi>p</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:msub>
<mml:mi>p</mml:mi>
<mml:mi>b</mml:mi>
</mml:msub>
<mml:mo>-</mml:mo>
<mml:msub>
<mml:mi>p</mml:mi>
<mml:mi>a</mml:mi>
</mml:msub>
</mml:mrow>
</mml:mfrac>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq216.gif"></inline-graphic>
</alternatives>
</inline-formula>
for a given interval
<inline-formula id="IEq217">
<alternatives>
<tex-math id="M287">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$[p_a,p_b]$$\end{document}</tex-math>
<mml:math id="M288">
<mml:mrow>
<mml:mo stretchy="false">[</mml:mo>
<mml:msub>
<mml:mi>p</mml:mi>
<mml:mi>a</mml:mi>
</mml:msub>
<mml:mo>,</mml:mo>
<mml:msub>
<mml:mi>p</mml:mi>
<mml:mi>b</mml:mi>
</mml:msub>
<mml:mo stretchy="false">]</mml:mo>
</mml:mrow>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq217.gif"></inline-graphic>
</alternatives>
</inline-formula>
(with
<inline-formula id="IEq218">
<alternatives>
<tex-math id="M289">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$0 \le p_a < p_b \le 1$$\end{document}</tex-math>
<mml:math id="M290">
<mml:mrow>
<mml:mn>0</mml:mn>
<mml:mo>≤</mml:mo>
<mml:msub>
<mml:mi>p</mml:mi>
<mml:mi>a</mml:mi>
</mml:msub>
<mml:mo><</mml:mo>
<mml:msub>
<mml:mi>p</mml:mi>
<mml:mi>b</mml:mi>
</mml:msub>
<mml:mo>≤</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq218.gif"></inline-graphic>
</alternatives>
</inline-formula>
), where
<inline-formula id="IEq219">
<alternatives>
<tex-math id="M291">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$Pr_\pi (p,\ell )$$\end{document}</tex-math>
<mml:math id="M292">
<mml:mrow>
<mml:mi>P</mml:mi>
<mml:msub>
<mml:mi>r</mml:mi>
<mml:mi mathvariant="italic">π</mml:mi>
</mml:msub>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>p</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>ℓ</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq219.gif"></inline-graphic>
</alternatives>
</inline-formula>
is the probability for the seed
<inline-formula id="IEq220">
<alternatives>
<tex-math id="M293">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\pi$$\end{document}</tex-math>
<mml:math id="M294">
<mml:mi mathvariant="italic">π</mml:mi>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq220.gif"></inline-graphic>
</alternatives>
</inline-formula>
to hit an alignment of length
<inline-formula id="IEq221">
<alternatives>
<tex-math id="M295">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\ell$$\end{document}</tex-math>
<mml:math id="M296">
<mml:mi>ℓ</mml:mi>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq221.gif"></inline-graphic>
</alternatives>
</inline-formula>
generated by a Bernoulli model of parameter
<italic>p</italic>
, as mentioned at the beginning of the previous section.</p>
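<p>Assuming the coefficients of a seed are already known, the Hit Integration value defined above can be sketched numerically by integrating each Bernoulli term exactly. The Python code below expands (1-p) raised to the power k with the binomial theorem and integrates monomials in p, using rational arithmetic throughout. The function names (integral_term, hit_integration) are hypothetical, and this direct expansion is only an illustration; it is not presented as the polynomial form derived in this article.</p>

```python
from fractions import Fraction
from math import comb


def integral_term(m, k, p_a, p_b):
    """Exact integral of p**m * (1-p)**k over the interval [p_a, p_b]."""
    p_a, p_b = Fraction(p_a), Fraction(p_b)
    total = Fraction(0)
    for j in range(k + 1):
        exponent = m + j + 1
        # Binomial expansion: (1-p)**k = sum_j (-1)**j * C(k, j) * p**j
        total += (-1)**j * comb(k, j) * (p_b**exponent - p_a**exponent) / exponent
    return total


def hit_integration(c, p_a, p_b):
    """Mean sensitivity over [p_a, p_b] for coefficients c[0..length]."""
    p_a, p_b = Fraction(p_a), Fraction(p_b)
    if not (p_a >= 0 and p_b > p_a and 1 >= p_b):
        raise ValueError("expected 0 <= p_a, p_a strictly below p_b, p_b <= 1")
    length = len(c) - 1
    integral = sum(c_m * integral_term(m, length - m, p_a, p_b)
                   for m, c_m in enumerate(c))
    return integral / (p_b - p_a)


# Toy coefficients of the seed 11 on alignments of length 3.
c = [0, 0, 2, 1]
print(hit_integration(c, 0, 1))                # 5/12
print(hit_integration(c, Fraction(1, 2), 1))   # 67/96
```

<p>The alternating signs of the binomial expansion would cause cancellation in floating point for long alignments, which is why this sketch stays with Fraction values.</p>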
<p>The main idea behind this integral formula is that a single
<italic>p</italic>
value, set once and for all, gives higher probabilities to alignments with percent identities close to
<italic>p</italic>
; to cope with this bias, a given interval
<inline-formula id="IEq222">
<alternatives>
<tex-math id="M297">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$[p_a,p_b]$$\end{document}</tex-math>
<mml:math id="M298">
<mml:mrow>
<mml:mo stretchy="false">[</mml:mo>
<mml:msub>
<mml:mi>p</mml:mi>
<mml:mi>a</mml:mi>
</mml:msub>
<mml:mo>,</mml:mo>
<mml:msub>
<mml:mi>p</mml:mi>
<mml:mi>b</mml:mi>
</mml:msub>
<mml:mo stretchy="false">]</mml:mo>
</mml:mrow>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq222.gif"></inline-graphic>
</alternatives>
</inline-formula>
is more suitable. In terms of the generative process,
<inline-formula id="IEq223">
<alternatives>
<tex-math id="M299">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\frac{\int _{p_a}^{p_b} Pr_\pi (p,\ell ) \, dp}{p_b-p_a}$$\end{document}</tex-math>
<mml:math id="M300">
<mml:mfrac>
<mml:mrow>
<mml:msubsup>
<mml:mo>∫</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mi>p</mml:mi>
<mml:mi>a</mml:mi>
</mml:msub>
</mml:mrow>
<mml:msub>
<mml:mi>p</mml:mi>
<mml:mi>b</mml:mi>
</mml:msub>
</mml:msubsup>
<mml:mi>P</mml:mi>
<mml:msub>
<mml:mi>r</mml:mi>
<mml:mi mathvariant="italic">π</mml:mi>
</mml:msub>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>p</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>ℓ</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
<mml:mspace width="0.166667em"></mml:mspace>
<mml:mi>d</mml:mi>
<mml:mi>p</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:msub>
<mml:mi>p</mml:mi>
<mml:mi>b</mml:mi>
</mml:msub>
<mml:mo>-</mml:mo>
<mml:msub>
<mml:mi>p</mml:mi>
<mml:mi>a</mml:mi>
</mml:msub>
</mml:mrow>
</mml:mfrac>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq223.gif"></inline-graphic>
</alternatives>
</inline-formula>
can be
<italic>interpreted</italic>
as choosing uniformly a value for the Bernoulli parameter
<italic>p</italic>
in the range
<inline-formula id="IEq224">
<alternatives>
<tex-math id="M301">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$[p_a,p_b]$$\end{document}</tex-math>
<mml:math id="M302">
<mml:mrow>
<mml:mo stretchy="false">[</mml:mo>
<mml:msub>
<mml:mi>p</mml:mi>
<mml:mi>a</mml:mi>
</mml:msub>
<mml:mo>,</mml:mo>
<mml:msub>
<mml:mi>p</mml:mi>
<mml:mi>b</mml:mi>
</mml:msub>
<mml:mo stretchy="false">]</mml:mo>
</mml:mrow>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq224.gif"></inline-graphic>
</alternatives>
</inline-formula>
, once per alignment sequence, before running the Bernoulli model to generate the full alignment sequence
<italic>x</italic>
of length
<inline-formula id="IEq225">
<alternatives>
<tex-math id="M303">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\ell$$\end{document}</tex-math>
<mml:math id="M304">
<mml:mi></mml:mi>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq225.gif"></inline-graphic>
</alternatives>
</inline-formula>
.
<fig id="Fig3">
<label>Fig. 3</label>
<caption>
<p>Bernoulli, Hit Integration, and Heaviside models. The Bernoulli (for
<inline-formula id="IEq226">
<alternatives>
<tex-math id="M305">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$p = 0.7$$\end{document}</tex-math>
<mml:math id="M306">
<mml:mrow>
<mml:mi>p</mml:mi>
<mml:mo>=</mml:mo>
<mml:mn>0.7</mml:mn>
</mml:mrow>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq226.gif"></inline-graphic>
</alternatives>
</inline-formula>
), the
<inline-formula id="IEq227">
<alternatives>
<tex-math id="M307">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\int _{0.5}^{1.0}$$\end{document}</tex-math>
<mml:math id="M308">
<mml:msubsup>
<mml:mo>∫</mml:mo>
<mml:mrow>
<mml:mn>0.5</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mn>1.0</mml:mn>
</mml:mrow>
</mml:msubsup>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq227.gif"></inline-graphic>
</alternatives>
</inline-formula>
Hit Integration, and the
<inline-formula id="IEq228">
<alternatives>
<tex-math id="M309">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\sum _{\frac{1}{2}}^{1}$$\end{document}</tex-math>
<mml:math id="M310">
<mml:msubsup>
<mml:mo>∑</mml:mo>
<mml:mrow>
<mml:mfrac>
<mml:mn>1</mml:mn>
<mml:mn>2</mml:mn>
</mml:mfrac>
</mml:mrow>
<mml:mn>1</mml:mn>
</mml:msubsup>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq228.gif"></inline-graphic>
</alternatives>
</inline-formula>
Heaviside probability mass functions of the number of matches, on alignments of length
<inline-formula id="IEq229">
<alternatives>
<tex-math id="M311">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\ell =64$$\end{document}</tex-math>
<mml:math id="M312">
<mml:mrow>
<mml:mi>ℓ</mml:mi>
<mml:mo>=</mml:mo>
<mml:mn>64</mml:mn>
</mml:mrow>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq229.gif"></inline-graphic>
</alternatives>
</inline-formula>
. Highlighted dots indicate the weights given for each alignment class with a given number of matches
<italic>m</italic>
out of
<inline-formula id="IEq230">
<alternatives>
<tex-math id="M313">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\ell$$\end{document}</tex-math>
<mml:math id="M314">
<mml:mi>ℓ</mml:mi>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq230.gif"></inline-graphic>
</alternatives>
</inline-formula>
alignment symbols, under each of the three models. Note that, since the sum of the weights is always 1 for any model, and since the class of alignments with exactly
<inline-formula id="IEq231">
<alternatives>
<tex-math id="M315">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$m=32$$\end{document}</tex-math>
<mml:math id="M316">
<mml:mrow>
<mml:mi>m</mml:mi>
<mml:mo>=</mml:mo>
<mml:mn>32</mml:mn>
</mml:mrow>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq231.gif"></inline-graphic>
</alternatives>
</inline-formula>
matches out of
<inline-formula id="IEq232">
<alternatives>
<tex-math id="M317">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\ell =64$$\end{document}</tex-math>
<mml:math id="M318">
<mml:mrow>
<mml:mi>ℓ</mml:mi>
<mml:mo>=</mml:mo>
<mml:mn>64</mml:mn>
</mml:mrow>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq232.gif"></inline-graphic>
</alternatives>
</inline-formula>
is fully included in the
<inline-formula id="IEq233">
<alternatives>
<tex-math id="M319">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\sum _{\frac{1}{2}}^{1}$$\end{document}</tex-math>
<mml:math id="M320">
<mml:msubsup>
<mml:mo>∑</mml:mo>
<mml:mrow>
<mml:mfrac>
<mml:mn>1</mml:mn>
<mml:mn>2</mml:mn>
</mml:mfrac>
</mml:mrow>
<mml:mn>1</mml:mn>
</mml:msubsup>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq233.gif"></inline-graphic>
</alternatives>
</inline-formula>
Heaviside model but only half-included in the
<inline-formula id="IEq234">
<alternatives>
<tex-math id="M321">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\int _{0.5}^{1.0}$$\end{document}</tex-math>
<mml:math id="M322">
<mml:msubsup>
<mml:mo>∫</mml:mo>
<mml:mrow>
<mml:mn>0.5</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mn>1.0</mml:mn>
</mml:mrow>
</mml:msubsup>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq234.gif"></inline-graphic>
</alternatives>
</inline-formula>
Hit Integration model, there is a slight difference between the two resulting lines</p>
</caption>
<graphic xlink:href="13015_2017_92_Fig3_HTML" id="MO6"></graphic>
</fig>
</p>
<p>An illustration of the full probability mass function for the
<italic>Hit Integration</italic>
compared with the
<italic>Bernoulli</italic>
and the
<italic>Heaviside</italic>
distributions (the latter is defined in the next section) is given in Fig. 
<xref rid="Fig3" ref-type="fig">3</xref>
for alignments of length
<inline-formula id="IEq235">
<alternatives>
<tex-math id="M323">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\ell =64$$\end{document}</tex-math>
<mml:math id="M324">
<mml:mrow>
<mml:mi>ℓ</mml:mi>
<mml:mo>=</mml:mo>
<mml:mn>64</mml:mn>
</mml:mrow>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq235.gif"></inline-graphic>
</alternatives>
</inline-formula>
.</p>
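<p>As a rough illustration of how such weights could be computed, the following Python sketch evaluates, with exact rational arithmetic, the probability mass function of the number of matches under the Bernoulli model and under the Hit Integration model. The function names (integral_monomial, bernoulli_pmf, hit_integration_pmf) are hypothetical and are not taken from the paper or from its software; the Heaviside model is left out since it is only defined in the next section.</p>
<preformat>
```python
from fractions import Fraction
from math import comb


def integral_monomial(u, v, pa, pb):
    """Exact integral of p^u (1-p)^v over [pa, pb].

    (1-p)^v is expanded with the binomial theorem and integrated term by term.
    """
    def antiderivative(p):
        return sum(
            Fraction((-1) ** k * comb(v, k), u + k + 1) * p ** (u + k + 1)
            for k in range(v + 1)
        )
    return antiderivative(pb) - antiderivative(pa)


def bernoulli_pmf(length, p):
    """Weight of the class of alignments with m matches, for m = 0..length."""
    return [
        comb(length, m) * p ** m * (1 - p) ** (length - m)
        for m in range(length + 1)
    ]


def hit_integration_pmf(length, pa, pb):
    """Same weights when p is drawn uniformly in [pa, pb] once per alignment."""
    return [
        comb(length, m) * integral_monomial(m, length - m, pa, pb) / (pb - pa)
        for m in range(length + 1)
    ]


if __name__ == "__main__":
    # Parameters of the illustration: length 64, p = 0.7, interval [0.5, 1].
    bern = bernoulli_pmf(64, Fraction(7, 10))
    hint = hit_integration_pmf(64, Fraction(1, 2), Fraction(1))
    for m in (32, 45, 64):
        print(m, float(bern[m]), float(hint[m]))
```
</preformat>
<p>Using Fraction rather than floating-point numbers is a design choice of this sketch: the alternating sum in the antiderivative is prone to cancellation, and exact arithmetic keeps the weights summing to exactly 1.</p>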
<p>Chung and Park [
<xref ref-type="bibr" rid="CR2">2</xref>
] pointed out that designed spaced seeds were of different shapes, and that several seeds obtained on
<inline-formula id="IEq236">
<alternatives>
<tex-math id="M325">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$[p_a=0, p_b=1]$$\end{document}</tex-math>
<mml:math id="M326">
<mml:mrow>
<mml:mo stretchy="false">[</mml:mo>
<mml:msub>
<mml:mi>p</mml:mi>
<mml:mi>a</mml:mi>
</mml:msub>
<mml:mo>=</mml:mo>
<mml:mn>0</mml:mn>
<mml:mo>,</mml:mo>
<mml:msub>
<mml:mi>p</mml:mi>
<mml:mi>b</mml:mi>
</mml:msub>
<mml:mo>=</mml:mo>
<mml:mn>1</mml:mn>
<mml:mo stretchy="false">]</mml:mo>
</mml:mrow>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq236.gif"></inline-graphic>
</alternatives>
</inline-formula>
or
<inline-formula id="IEq237">
<alternatives>
<tex-math id="M327">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$[p_a=0.5, p_b=1]$$\end{document}</tex-math>
<mml:math id="M328">
<mml:mrow>
<mml:mo stretchy="false">[</mml:mo>
<mml:msub>
<mml:mi>p</mml:mi>
<mml:mi>a</mml:mi>
</mml:msub>
<mml:mo>=</mml:mo>
<mml:mn>0.5</mml:mn>
<mml:mo>,</mml:mo>
<mml:msub>
<mml:mi>p</mml:mi>
<mml:mi>b</mml:mi>
</mml:msub>
<mml:mo>=</mml:mo>
<mml:mn>1</mml:mn>
<mml:mo stretchy="false">]</mml:mo>
</mml:mrow>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq237.gif"></inline-graphic>
</alternatives>
</inline-formula>
were
<italic>in practice</italic>
better (compared with three other criteria tested in their paper). We also noticed that the method of [
<xref ref-type="bibr" rid="CR2">2</xref>
] was modeled on the [
<xref ref-type="bibr" rid="CR27">27</xref>
] recursive decomposition, and is based on a very careful and non-trivial analysis of the terms
<inline-formula id="IEq238">
<alternatives>
<tex-math id="M329">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$I^k[i,b]$$\end{document}</tex-math>
<mml:math id="M330">
<mml:mrow>
<mml:msup>
<mml:mi>I</mml:mi>
<mml:mi>k</mml:mi>
</mml:msup>
<mml:mrow>
<mml:mo stretchy="false">[</mml:mo>
<mml:mi>i</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>b</mml:mi>
<mml:mo stretchy="false">]</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq238.gif"></inline-graphic>
</alternatives>
</inline-formula>
defined by:
<disp-formula id="Equ4">
<alternatives>
<tex-math id="M331">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\begin{aligned} I^k[i,b] =\int p^k \times Pr_\pi \big (\langle i\;,b\;\rangle \big )\, dp \end{aligned}$$\end{document}</tex-math>
<mml:math id="M332" display="block">
<mml:mrow>
<mml:mtable columnspacing="0.5ex">
<mml:mtr>
<mml:mtd columnalign="right">
<mml:mrow>
<mml:msup>
<mml:mi>I</mml:mi>
<mml:mi>k</mml:mi>
</mml:msup>
<mml:mrow>
<mml:mo stretchy="false">[</mml:mo>
<mml:mi>i</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>b</mml:mi>
<mml:mo stretchy="false">]</mml:mo>
</mml:mrow>
<mml:mo>=</mml:mo>
<mml:mo>∫</mml:mo>
<mml:msup>
<mml:mi>p</mml:mi>
<mml:mi>k</mml:mi>
</mml:msup>
<mml:mo>×</mml:mo>
<mml:mi>P</mml:mi>
<mml:msub>
<mml:mi>r</mml:mi>
<mml:mi mathvariant="italic">π</mml:mi>
</mml:msub>
<mml:mrow>
<mml:mo maxsize="1.2em" minsize="1.2em" stretchy="true">(</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:mo stretchy="false">⟨</mml:mo>
<mml:mi>i</mml:mi>
<mml:mspace width="0.277778em"></mml:mspace>
<mml:mo>,</mml:mo>
<mml:mi>b</mml:mi>
<mml:mspace width="0.277778em"></mml:mspace>
<mml:mo stretchy="false">⟩</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:mo maxsize="1.2em" minsize="1.2em" stretchy="true">)</mml:mo>
</mml:mrow>
<mml:mspace width="0.166667em"></mml:mspace>
<mml:mi>d</mml:mi>
<mml:mi>p</mml:mi>
</mml:mrow>
</mml:mtd>
</mml:mtr>
</mml:mtable>
</mml:mrow>
</mml:math>
<graphic xlink:href="13015_2017_92_Article_Equ4.gif" position="anchor"></graphic>
</alternatives>
</disp-formula>
where
<italic>i</italic>
is the position along the alignment and
<italic>b</italic>
is the alignment suffix that is also π-prefix hitting, over the parameter
<inline-formula id="IEq239">
<alternatives>
<tex-math id="M333">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$k \in \big [|b|_1 \ldots\, \ell -i+|b|\big ]$$\end{document}</tex-math>
<mml:math id="M334">
<mml:mrow>
<mml:mi>k</mml:mi>
<mml:mo>∈</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mrow>
<mml:mo maxsize="1.2em" minsize="1.2em" stretchy="true">[</mml:mo>
</mml:mrow>
<mml:mo stretchy="false">|</mml:mo>
<mml:mi>b</mml:mi>
<mml:mo stretchy="false">|</mml:mo>
</mml:mrow>
<mml:mn>1</mml:mn>
</mml:msub>
<mml:mo>…</mml:mo>
<mml:mspace width="0.166667em"></mml:mspace>
<mml:mi>ℓ</mml:mi>
<mml:mo>-</mml:mo>
<mml:mi>i</mml:mi>
<mml:mo>+</mml:mo>
<mml:mrow>
<mml:mo stretchy="false">|</mml:mo>
<mml:mi>b</mml:mi>
<mml:mo stretchy="false">|</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:mo maxsize="1.2em" minsize="1.2em" stretchy="true">]</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq239.gif"></inline-graphic>
</alternatives>
</inline-formula>
, and their relationship: this leads to their recurrence formula
<inline-formula id="IEq240">
<alternatives>
<tex-math id="M335">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$I^k[i,b] = I^k[i,b0] + I^{k+1}[i,b1] - I^{k+1}[i,b0]$$\end{document}</tex-math>
<mml:math id="M336">
<mml:mrow>
<mml:msup>
<mml:mi>I</mml:mi>
<mml:mi>k</mml:mi>
</mml:msup>
<mml:mrow>
<mml:mo stretchy="false">[</mml:mo>
<mml:mi>i</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>b</mml:mi>
<mml:mo stretchy="false">]</mml:mo>
</mml:mrow>
<mml:mo>=</mml:mo>
<mml:msup>
<mml:mi>I</mml:mi>
<mml:mi>k</mml:mi>
</mml:msup>
<mml:mrow>
<mml:mo stretchy="false">[</mml:mo>
<mml:mi>i</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>b</mml:mi>
<mml:mn>0</mml:mn>
<mml:mo stretchy="false">]</mml:mo>
</mml:mrow>
<mml:mo>+</mml:mo>
<mml:msup>
<mml:mi>I</mml:mi>
<mml:mrow>
<mml:mi>k</mml:mi>
<mml:mo>+</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:msup>
<mml:mrow>
<mml:mo stretchy="false">[</mml:mo>
<mml:mi>i</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>b</mml:mi>
<mml:mn>1</mml:mn>
<mml:mo stretchy="false">]</mml:mo>
</mml:mrow>
<mml:mo>-</mml:mo>
<mml:msup>
<mml:mi>I</mml:mi>
<mml:mrow>
<mml:mi>k</mml:mi>
<mml:mo>+</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:msup>
<mml:mrow>
<mml:mo stretchy="false">[</mml:mo>
<mml:mi>i</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>b</mml:mi>
<mml:mn>0</mml:mn>
<mml:mo stretchy="false">]</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq240.gif"></inline-graphic>
</alternatives>
</inline-formula>
computed with the [
<xref ref-type="bibr" rid="CR27">27</xref>
] algorithm scheme, using an additional internal loop layer for
<inline-formula id="IEq241">
<alternatives>
<tex-math id="M337">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$k \in [|b|_1 \ldots\, \ell -i+|b|]$$\end{document}</tex-math>
<mml:math id="M338">
<mml:mrow>
<mml:mi>k</mml:mi>
<mml:mo>∈</mml:mo>
<mml:mo stretchy="false">[</mml:mo>
<mml:mo stretchy="false">|</mml:mo>
<mml:mi>b</mml:mi>
<mml:msub>
<mml:mo stretchy="false">|</mml:mo>
<mml:mn>1</mml:mn>
</mml:msub>
<mml:mo>…</mml:mo>
<mml:mspace width="0.166667em"></mml:mspace>
<mml:mi>ℓ</mml:mi>
<mml:mo>-</mml:mo>
<mml:mi>i</mml:mi>
<mml:mo>+</mml:mo>
<mml:mo stretchy="false">|</mml:mo>
<mml:mi>b</mml:mi>
<mml:mo stretchy="false">|</mml:mo>
<mml:mo stretchy="false">]</mml:mo>
</mml:mrow>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq241.gif"></inline-graphic>
</alternatives>
</inline-formula>
, and a
<italic>non-obvious ordering of the computed terms on k vs |b|</italic>
to remain
<italic>DP-tractable</italic>
.</p>
<p>Even if the algorithm we propose to compute the Hit Integration (in the next paragraph) has the same
<italic>theoretical worst case</italic>
complexity, its advantages are twofold:
<list list-type="bullet">
<list-item>
<p>We propose a dynamic programming algorithm that is
<italic>strictly equivalent</italic>
to the one previously proposed for the Bernoulli model: in fact, both model-dependent algorithms can even pool their most
<italic>time-consuming</italic>
part. Moreover, the automaton used by the dynamic programming algorithm can be minimized beforehand: this reduction is
<italic>greatly appreciated</italic>
when multiple seeds are processed.</p>
</list-item>
<list-item>
<p>We propose a parameter-free approach for the
<inline-formula id="IEq242">
<alternatives>
<tex-math id="M339">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$p_a$$\end{document}</tex-math>
<mml:math id="M340">
<mml:msub>
<mml:mi>p</mml:mi>
<mml:mi>a</mml:mi>
</mml:msub>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq242.gif"></inline-graphic>
</alternatives>
</inline-formula>
or
<inline-formula id="IEq243">
<alternatives>
<tex-math id="M341">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$p_b$$\end{document}</tex-math>
<mml:math id="M342">
<mml:msub>
<mml:mi>p</mml:mi>
<mml:mi>b</mml:mi>
</mml:msub>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq243.gif"></inline-graphic>
</alternatives>
</inline-formula>
parameters: it is therefore possible to compute, on
<italic>any interval</italic>
, how far a seed is optimal; moreover, we will show that the
<italic>dominance</italic>
criterion can be applied as a pre-processing step.</p>
</list-item>
</list>
The Hit Integration
<inline-formula id="IEq244">
<alternatives>
<tex-math id="M343">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\int _{p_a}^{p_b} Pr_\pi (p,\ell ) \, dp$$\end{document}</tex-math>
<mml:math id="M344">
<mml:mrow>
<mml:msubsup>
<mml:mo>∫</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mi>p</mml:mi>
<mml:mi>a</mml:mi>
</mml:msub>
</mml:mrow>
<mml:msub>
<mml:mi>p</mml:mi>
<mml:mi>b</mml:mi>
</mml:msub>
</mml:msubsup>
<mml:mi>P</mml:mi>
<mml:msub>
<mml:mi>r</mml:mi>
<mml:mi mathvariant="italic">π</mml:mi>
</mml:msub>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>p</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>ℓ</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
<mml:mspace width="0.166667em"></mml:mspace>
<mml:mi>d</mml:mi>
<mml:mi>p</mml:mi>
</mml:mrow>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq244.gif"></inline-graphic>
</alternatives>
</inline-formula>
can be rewritten, by applying the polynomial formula (
<xref rid="Equ1" ref-type="">1</xref>
), as:
<disp-formula id="Equ2">
<label>2</label>
<alternatives>
<tex-math id="M345">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\begin{aligned} \int _{p_a}^{p_b}\!Pr_\pi (p,\ell ) \, dp & = \!\!\int _{p_a}^{p_b}\!\sum _{m=0}^{\ell } c_{\pi ,m} \, p^m (1-p)^{\ell -m} dp\nonumber \\ & = \!\!\sum _{m=0}^{\ell } c_{\pi ,m}\int _{p_a}^{p_b}\!p^m (1-p)^{\ell -m} dp \end{aligned}$$\end{document}</tex-math>
<mml:math id="M346" display="block">
<mml:mrow>
<mml:mtable columnspacing="0.5ex">
<mml:mtr>
<mml:mtd columnalign="right">
<mml:mrow>
<mml:msubsup>
<mml:mo>∫</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mi>p</mml:mi>
<mml:mi>a</mml:mi>
</mml:msub>
</mml:mrow>
<mml:msub>
<mml:mi>p</mml:mi>
<mml:mi>b</mml:mi>
</mml:msub>
</mml:msubsup>
<mml:mspace width="-0.166667em"></mml:mspace>
<mml:mi>P</mml:mi>
<mml:msub>
<mml:mi>r</mml:mi>
<mml:mi mathvariant="italic">π</mml:mi>
</mml:msub>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>p</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>ℓ</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
<mml:mspace width="0.166667em"></mml:mspace>
<mml:mi>d</mml:mi>
<mml:mi>p</mml:mi>
</mml:mrow>
</mml:mtd>
<mml:mtd columnalign="left">
<mml:mrow>
<mml:mo>=</mml:mo>
<mml:mspace width="-0.166667em"></mml:mspace>
<mml:mspace width="-0.166667em"></mml:mspace>
<mml:msubsup>
<mml:mo>∫</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mi>p</mml:mi>
<mml:mi>a</mml:mi>
</mml:msub>
</mml:mrow>
<mml:msub>
<mml:mi>p</mml:mi>
<mml:mi>b</mml:mi>
</mml:msub>
</mml:msubsup>
<mml:mspace width="-0.166667em"></mml:mspace>
<mml:munderover>
<mml:mo>∑</mml:mo>
<mml:mrow>
<mml:mi>m</mml:mi>
<mml:mo>=</mml:mo>
<mml:mn>0</mml:mn>
</mml:mrow>
<mml:mi>ℓ</mml:mi>
</mml:munderover>
<mml:msub>
<mml:mi>c</mml:mi>
<mml:mrow>
<mml:mi mathvariant="italic">π</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>m</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mspace width="0.166667em"></mml:mspace>
<mml:msup>
<mml:mi>p</mml:mi>
<mml:mi>m</mml:mi>
</mml:msup>
<mml:msup>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mn>1</mml:mn>
<mml:mo>-</mml:mo>
<mml:mi>p</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:mi>ℓ</mml:mi>
<mml:mo>-</mml:mo>
<mml:mi>m</mml:mi>
</mml:mrow>
</mml:msup>
<mml:mi>d</mml:mi>
<mml:mi>p</mml:mi>
</mml:mrow>
</mml:mtd>
</mml:mtr>
<mml:mtr>
<mml:mtd columnalign="right">
<mml:mrow></mml:mrow>
</mml:mtd>
<mml:mtd columnalign="left">
<mml:mrow>
<mml:mo>=</mml:mo>
<mml:mspace width="-0.166667em"></mml:mspace>
<mml:mspace width="-0.166667em"></mml:mspace>
<mml:munderover>
<mml:mo>∑</mml:mo>
<mml:mrow>
<mml:mi>m</mml:mi>
<mml:mo>=</mml:mo>
<mml:mn>0</mml:mn>
</mml:mrow>
<mml:mi>ℓ</mml:mi>
</mml:munderover>
<mml:msub>
<mml:mi>c</mml:mi>
<mml:mrow>
<mml:mi mathvariant="italic">π</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>m</mml:mi>
</mml:mrow>
</mml:msub>
<mml:msubsup>
<mml:mo>∫</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mi>p</mml:mi>
<mml:mi>a</mml:mi>
</mml:msub>
</mml:mrow>
<mml:msub>
<mml:mi>p</mml:mi>
<mml:mi>b</mml:mi>
</mml:msub>
</mml:msubsup>
<mml:mspace width="-0.166667em"></mml:mspace>
<mml:msup>
<mml:mi>p</mml:mi>
<mml:mi>m</mml:mi>
</mml:msup>
<mml:msup>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mn>1</mml:mn>
<mml:mo>-</mml:mo>
<mml:mi>p</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:mi>ℓ</mml:mi>
<mml:mo>-</mml:mo>
<mml:mi>m</mml:mi>
</mml:mrow>
</mml:msup>
<mml:mi>d</mml:mi>
<mml:mi>p</mml:mi>
</mml:mrow>
</mml:mtd>
</mml:mtr>
</mml:mtable>
</mml:mrow>
</mml:math>
<graphic xlink:href="13015_2017_92_Article_Equ2.gif" position="anchor"></graphic>
</alternatives>
</disp-formula>
Two interesting features can then be deduced from this trivial rewriting.</p>
<p>First, for any constant integers
<italic>u</italic>
and
<italic>v</italic>
, since the integral of the polynomial part
<inline-formula id="IEq245">
<alternatives>
<tex-math id="M347">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\int _{p_a}^{p_b} p^u (1-p)^{v} \, dp = \Big [ p^{u+1} \sum _{k=0}^{v} {v \atopwithdelims ()k} \frac{(-p)^k}{u+k+1} \Big ]_{p_a}^{p_b}$$\end{document}</tex-math>
<mml:math id="M348">
<mml:mrow>
<mml:msubsup>
<mml:mo>∫</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mi>p</mml:mi>
<mml:mi>a</mml:mi>
</mml:msub>
</mml:mrow>
<mml:msub>
<mml:mi>p</mml:mi>
<mml:mi>b</mml:mi>
</mml:msub>
</mml:msubsup>
<mml:msup>
<mml:mi>p</mml:mi>
<mml:mi>u</mml:mi>
</mml:msup>
<mml:msup>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mn>1</mml:mn>
<mml:mo>-</mml:mo>
<mml:mi>p</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
<mml:mi>v</mml:mi>
</mml:msup>
<mml:mspace width="0.166667em"></mml:mspace>
<mml:mi>d</mml:mi>
<mml:mi>p</mml:mi>
<mml:mo>=</mml:mo>
<mml:mrow>
<mml:mo maxsize="1.623em" minsize="1.623em" stretchy="true">[</mml:mo>
</mml:mrow>
<mml:msup>
<mml:mi>p</mml:mi>
<mml:mrow>
<mml:mi>u</mml:mi>
<mml:mo>+</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:msup>
<mml:msubsup>
<mml:mo>∑</mml:mo>
<mml:mrow>
<mml:mi>k</mml:mi>
<mml:mo>=</mml:mo>
<mml:mn>0</mml:mn>
</mml:mrow>
<mml:mi>v</mml:mi>
</mml:msubsup>
<mml:mfenced close=")" open="(" separators="">
<mml:mfrac linethickness="0pt">
<mml:mi>v</mml:mi>
<mml:mi>k</mml:mi>
</mml:mfrac>
</mml:mfenced>
<mml:mfrac>
<mml:msup>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mo>-</mml:mo>
<mml:mi>p</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
<mml:mi>k</mml:mi>
</mml:msup>
<mml:mrow>
<mml:mi>u</mml:mi>
<mml:mo>+</mml:mo>
<mml:mi>k</mml:mi>
<mml:mo>+</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:mfrac>
<mml:msubsup>
<mml:mrow>
<mml:mo maxsize="1.623em" minsize="1.623em" stretchy="true">]</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:msub>
<mml:mi>p</mml:mi>
<mml:mi>a</mml:mi>
</mml:msub>
</mml:mrow>
<mml:msub>
<mml:mi>p</mml:mi>
<mml:mi>b</mml:mi>
</mml:msub>
</mml:msubsup>
</mml:mrow>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq245.gif"></inline-graphic>
</alternatives>
</inline-formula>
can be easily computed (as a polynomial of larger degree), the integral on the right-hand side of formula (
<xref rid="Equ2" ref-type="">2</xref>
) can be pre-computed independently of the counting coefficients
<inline-formula id="IEq246">
<alternatives>
<tex-math id="M349">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$c_{\pi ,m}$$\end{document}</tex-math>
<mml:math id="M350">
<mml:msub>
<mml:mi>c</mml:mi>
<mml:mrow>
<mml:mi mathvariant="italic">π</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>m</mml:mi>
</mml:mrow>
</mml:msub>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq246.gif"></inline-graphic>
</alternatives>
</inline-formula>
, and thus independently of the seed
<inline-formula id="IEq247">
<alternatives>
<tex-math id="M351">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\pi$$\end{document}</tex-math>
<mml:math id="M352">
<mml:mi mathvariant="italic">π</mml:mi>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq247.gif"></inline-graphic>
</alternatives>
</inline-formula>
. Thus, only the
<inline-formula id="IEq248">
<alternatives>
<tex-math id="M353">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$c_{\pi ,m}$$\end{document}</tex-math>
<mml:math id="M354">
<mml:msub>
<mml:mi>c</mml:mi>
<mml:mrow>
<mml:mi mathvariant="italic">π</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>m</mml:mi>
</mml:mrow>
</mml:msub>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq248.gif"></inline-graphic>
</alternatives>
</inline-formula>
coefficients characterize the seed
<inline-formula id="IEq249">
<alternatives>
<tex-math id="M355">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\pi$$\end{document}</tex-math>
<mml:math id="M356">
<mml:mi mathvariant="italic">π</mml:mi>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq249.gif"></inline-graphic>
</alternatives>
</inline-formula>
for
<italic>both</italic>
the Bernoulli model
<italic>and</italic>
the Hit Integration model.</p>
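<p>The separation between seed-dependent coefficients and seed-independent integrals can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: the counting coefficients are obtained here by brute-force enumeration of all alignments (practical for very small lengths only, and not the automaton-based dynamic programming of the paper), and the names hit_coefficients, monomial_integral, hit_integration and bernoulli_sensitivity, as well as the toy seed 1101, are hypothetical.</p>
<preformat>
```python
from fractions import Fraction
from itertools import product
from math import comb


def hit_coefficients(seed, length):
    """coeffs[m] = number of alignments with m matches that the seed hits.

    Brute force over all 2^length binary alignments (1 = match), so this is
    only usable for small lengths.
    """
    must_match = [j for j, symbol in enumerate(seed) if symbol == "1"]
    coeffs = [0] * (length + 1)
    for x in product((0, 1), repeat=length):
        hit = any(
            all(x[i + j] for j in must_match)
            for i in range(length - len(seed) + 1)
        )
        if hit:
            coeffs[sum(x)] += 1
    return coeffs


def monomial_integral(u, v, pa, pb):
    """Integral of p^u (1-p)^v over [pa, pb], via the closed-form primitive."""
    def primitive(p):
        return p ** (u + 1) * sum(
            comb(v, k) * (-p) ** k / Fraction(u + k + 1) for k in range(v + 1)
        )
    return primitive(pb) - primitive(pa)


def hit_integration(coeffs, pa, pb):
    """Sum over m of coeffs[m] times the integral of p^m (1-p)^(length-m)."""
    length = len(coeffs) - 1
    # This table does not depend on the seed: it can be shared by all seeds.
    table = [monomial_integral(m, length - m, pa, pb) for m in range(length + 1)]
    return sum(c * t for c, t in zip(coeffs, table))


def bernoulli_sensitivity(coeffs, p):
    """Hit probability under the Bernoulli model, from the same coefficients."""
    length = len(coeffs) - 1
    return sum(c * p ** m * (1 - p) ** (length - m) for m, c in enumerate(coeffs))


if __name__ == "__main__":
    coeffs = hit_coefficients("1101", 8)
    print(coeffs)
    print(float(bernoulli_sensitivity(coeffs, Fraction(7, 10))))
    print(float(hit_integration(coeffs, Fraction(1, 2), Fraction(1)) * 2))
```
</preformat>
<p>In this sketch the last printed value divides the integral by the interval width (here 1/2), which gives the average hit probability over the interval; the same coefficient list is reused unchanged for both models.</p>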
<p>Moreover, we can see that, for
<inline-formula id="IEq250">
<alternatives>
<tex-math id="M357">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$0 \le p_a < p_b \le 1$$\end{document}</tex-math>
<mml:math id="M358">
<mml:mrow>
<mml:mn>0</mml:mn>
<mml:mo>≤</mml:mo>
<mml:msub>
<mml:mi>p</mml:mi>
<mml:mi>a</mml:mi>
</mml:msub>
<mml:mo><</mml:mo>
<mml:msub>
<mml:mi>p</mml:mi>
<mml:mi>b</mml:mi>
</mml:msub>
<mml:mo>≤</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq250.gif"></inline-graphic>
</alternatives>
</inline-formula>
and for all
<inline-formula id="IEq251">
<alternatives>
<tex-math id="M359">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$m \in [0\ldots\,\ell ]$$\end{document}</tex-math>
<mml:math id="M360">
<mml:mrow>
<mml:mi>m</mml:mi>
<mml:mo>∈</mml:mo>
<mml:mo stretchy="false">[</mml:mo>
<mml:mn>0</mml:mn>
<mml:mo>…</mml:mo>
<mml:mspace width="0.166667em"></mml:mspace>
<mml:mi>ℓ</mml:mi>
<mml:mo stretchy="false">]</mml:mo>
</mml:mrow>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq251.gif"></inline-graphic>
</alternatives>
</inline-formula>
, the integral
<inline-formula id="IEq252">
<alternatives>
<tex-math id="M361">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\int _{p_a}^{p_b} p^m (1-p)^{\ell -m} \, dp$$\end{document}</tex-math>
<mml:math id="M362">
<mml:mrow>
<mml:msubsup>
<mml:mo>∫</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mi>p</mml:mi>
<mml:mi>a</mml:mi>
</mml:msub>
</mml:mrow>
<mml:msub>
<mml:mi>p</mml:mi>
<mml:mi>b</mml:mi>
</mml:msub>
</mml:msubsup>
<mml:msup>
<mml:mi>p</mml:mi>
<mml:mi>m</mml:mi>
</mml:msup>
<mml:msup>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mn>1</mml:mn>
<mml:mo>-</mml:mo>
<mml:mi>p</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:mi>ℓ</mml:mi>
<mml:mo>-</mml:mo>
<mml:mi>m</mml:mi>
</mml:mrow>
</mml:msup>
<mml:mspace width="0.166667em"></mml:mspace>
<mml:mi>d</mml:mi>
<mml:mi>p</mml:mi>
</mml:mrow>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq252.gif"></inline-graphic>
</alternatives>
</inline-formula>
on the right-hand side of formula (
<xref rid="Equ2" ref-type="">2</xref>
) is always positive. Therefore, the
<italic>dominance between seeds</italic>
can also be applied directly to the
<inline-formula id="IEq253">
<alternatives>
<tex-math id="M363">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$c_{\pi ,m}$$\end{document}</tex-math>
<mml:math id="M364">
<mml:msub>
<mml:mi>c</mml:mi>
<mml:mrow>
<mml:mi mathvariant="italic">π</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>m</mml:mi>
</mml:mrow>
</mml:msub>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq253.gif"></inline-graphic>
</alternatives>
</inline-formula>
coefficients to select dominant seeds before computing the Hit Integration (for any range
<inline-formula id="IEq254">
<alternatives>
<tex-math id="M365">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$[p_a,p_b]$$\end{document}</tex-math>
<mml:math id="M366">
<mml:mrow>
<mml:mo stretchy="false">[</mml:mo>
<mml:msub>
<mml:mi>p</mml:mi>
<mml:mi>a</mml:mi>
</mml:msub>
<mml:mo>,</mml:mo>
<mml:msub>
<mml:mi>p</mml:mi>
<mml:mi>b</mml:mi>
</mml:msub>
<mml:mo stretchy="false">]</mml:mo>
</mml:mrow>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq254.gif"></inline-graphic>
</alternatives>
</inline-formula>
) by applying the formula (
<xref rid="Equ2" ref-type="">2</xref>
), thereby saving computation time for the optimal set of seeds.</p>
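<p>The shared role of the coefficients can be illustrated on a toy case. The sketch below is not the counting semi-ring method of this article: it enumerates all alignments exhaustively, which is only feasible for very small lengths. The function names (<monospace>hits</monospace>, <monospace>coefficients</monospace>, <monospace>bernoulli</monospace>, <monospace>monomial_integral</monospace>, <monospace>hit_integration</monospace>) are hypothetical, and the forms assumed for formulas (1) and (2), a sum of coefficient-weighted monomials and its integral without any normalising factor, are assumptions made for the illustration.</p>
<preformat>
```python
from fractions import Fraction
from itertools import product
from math import comb


def hits(seed, alignment):
    """True when, at some position, every '1' of the seed faces a match (1)."""
    positions = range(len(alignment) - len(seed) + 1)
    return any(
        all(alignment[i + j] == 1 for j, s in enumerate(seed) if s == "1")
        for i in positions
    )


def coefficients(seed, length):
    """c[m]: number of alignments of this length with m matches hit by the seed."""
    c = [0] * (length + 1)
    for alignment in product((0, 1), repeat=length):  # 2**length cases: toy sizes only
        if hits(seed, alignment):
            c[sum(alignment)] += 1
    return c


def bernoulli(c, p):
    """Assumed form of formula (1): sum over m of c[m] * p^m * (1-p)^(length-m)."""
    length = len(c) - 1
    return sum(cm * p**m * (1 - p) ** (length - m) for m, cm in enumerate(c))


def monomial_integral(m, length, pa, pb):
    """Exact integral of p^m * (1-p)^(length-m) over [pa, pb], by binomial expansion."""
    n = length - m
    return sum(
        comb(n, k) * (-1) ** k * (pb ** (m + k + 1) - pa ** (m + k + 1)) / (m + k + 1)
        for k in range(n + 1)
    )


def hit_integration(c, pa, pb):
    """Assumed (unnormalised) form of formula (2): c[m] weighted by the integrals."""
    length = len(c) - 1
    return sum(cm * monomial_integral(m, length, pa, pb) for m, cm in enumerate(c))


c = coefficients("101", 4)
print(c)                                             # [0, 0, 2, 4, 1]
print(hit_integration(c, Fraction(0), Fraction(1)))  # 7/15
```
</preformat>
<p>Both criteria read the same coefficient vector; only the weight attached to each coefficient changes, and every such weight is positive, which is why a coefficient-wise comparison of two seeds carries over from one model to the other.</p>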
<p>As a consequence, even if the
<italic>optimal</italic>
seeds selected from the Bernoulli and the Hit Integration models may have different shapes, all such
<italic>optimal</italic>
seeds are guaranteed to be
<italic>dominant</italic>
<xref ref-type="fn" rid="Fn4">4</xref>
in the sense of [
<xref ref-type="bibr" rid="CR1">1</xref>
]. Note that the dominance of a seed can be computed independently of any parameter
<italic>p</italic>
, or here, any parameters
<inline-formula id="IEq255">
<alternatives>
<tex-math id="M367">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$[p_a,p_b]$$\end{document}</tex-math>
<mml:math id="M368">
<mml:mrow>
<mml:mo stretchy="false">[</mml:mo>
<mml:msub>
<mml:mi>p</mml:mi>
<mml:mi>a</mml:mi>
</mml:msub>
<mml:mo>,</mml:mo>
<mml:msub>
<mml:mi>p</mml:mi>
<mml:mi>b</mml:mi>
</mml:msub>
<mml:mo stretchy="false">]</mml:mo>
</mml:mrow>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq255.gif"></inline-graphic>
</alternatives>
</inline-formula>
: the dominance criterion can thus be used to pre-select seeds with exactly the same process as the one proposed at the end of the previous part.
<fig id="Fig4">
<label>Fig. 4</label>
<caption>
<p>Bernoulli and Hit Integration polynomials. The Bernoulli and
<inline-formula id="IEq256">
<alternatives>
<tex-math id="M369">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\int _0^x$$\end{document}</tex-math>
<mml:math id="M370">
<mml:msubsup>
<mml:mo>∫</mml:mo>
<mml:mn>0</mml:mn>
<mml:mi>x</mml:mi>
</mml:msubsup>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq256.gif"></inline-graphic>
</alternatives>
</inline-formula>
Hit Integration polynomials plots for the contiguous seed and the Patternhunter I spaced seed, on alignments of length
<inline-formula id="IEq257">
<alternatives>
<tex-math id="M371">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\ell =64$$\end{document}</tex-math>
<mml:math id="M372">
<mml:mrow>
<mml:mi>ℓ</mml:mi>
<mml:mo>=</mml:mo>
<mml:mn>64</mml:mn>
</mml:mrow>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq257.gif"></inline-graphic>
</alternatives>
</inline-formula>
. The two polynomials have been plotted according to their respective formulas (
<xref rid="Equ1" ref-type="">1</xref>
) and (
<xref rid="Equ2" ref-type="">2</xref>
). A
<italic> vertical mark</italic>
indicates where they cross each other in the range
<inline-formula id="IEq258">
<alternatives>
<tex-math id="M373">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$x \in \, ]0,1[$$\end{document}</tex-math>
<mml:math id="M374">
<mml:mrow>
<mml:mi>x</mml:mi>
<mml:mo>∈</mml:mo>
<mml:mspace width="0.166667em"></mml:mspace>
<mml:mo stretchy="false">]</mml:mo>
<mml:mn>0</mml:mn>
<mml:mo>,</mml:mo>
<mml:mn>1</mml:mn>
<mml:mo stretchy="false">[</mml:mo>
</mml:mrow>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq258.gif"></inline-graphic>
</alternatives>
</inline-formula>
: the contiguous seed is better under this marked value; otherwise, the Patternhunter I spaced seed is better</p>
</caption>
<graphic xlink:href="13015_2017_92_Fig4_HTML" id="MO9"></graphic>
</fig>
</p>
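<p>A pairwise version of the dominance filter can be sketched as follows. It assumes that one seed dominates another when each of its coefficients is at least as large and at least one is strictly larger; the helper names <monospace>dominates</monospace> and <monospace>dominant_seeds</monospace> are hypothetical, and the three coefficient vectors are hand-counted toy values for weight-2 seeds on alignments of length 4, not results from this article.</p>
<preformat>
```python
def dominates(ca, cb):
    """True when ca[m] >= cb[m] for every m, with a strict gain for at least one m."""
    return all(x >= y for x, y in zip(ca, cb)) and any(x > y for x, y in zip(ca, cb))


def dominant_seeds(coeffs):
    """Keep the seeds whose coefficient vector is dominated by no other seed."""
    return [
        seed
        for seed, c in coeffs.items()
        if not any(dominates(other, c) for name, other in coeffs.items() if name != seed)
    ]


# Hand-counted c[m] for three weight-2 seeds on alignments of length 4 (toy values).
toy = {
    "11": [0, 0, 3, 4, 1],
    "101": [0, 0, 2, 4, 1],
    "1001": [0, 0, 1, 2, 1],
}
print(dominant_seeds(toy))  # ['11']
```
</preformat>
<p>The comparison uses integers only, so no value of the match probability or of the integration range is involved. Two seeds with incomparable vectors, or with identical vectors, are both kept by this sketch, since dominance is only a partial order.</p>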
<p>As an illustration, Fig.  
<xref rid="Fig4" ref-type="fig">4</xref>
plots the Bernoulli (
<italic>left</italic>
) and the
<inline-formula id="IEq259">
<alternatives>
<tex-math id="M375">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\int _0^x$$\end{document}</tex-math>
<mml:math id="M376">
<mml:msubsup>
<mml:mo>∫</mml:mo>
<mml:mn>0</mml:mn>
<mml:mi>x</mml:mi>
</mml:msubsup>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq259.gif"></inline-graphic>
</alternatives>
</inline-formula>
Hit Integration (
<italic>right</italic>
) polynomials of two seeds: the contiguous seed (11111111111) and the Patternhunter I spaced seed (111010010100110111), the two seeds already mentioned among the forty dominant seeds of weight
<inline-formula id="IEq260">
<alternatives>
<tex-math id="M377">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$w=11$$\end{document}</tex-math>
<mml:math id="M378">
<mml:mrow>
<mml:mi>w</mml:mi>
<mml:mo>=</mml:mo>
<mml:mn>11</mml:mn>
</mml:mrow>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq260.gif"></inline-graphic>
</alternatives>
</inline-formula>
on alignments of length
<inline-formula id="IEq261">
<alternatives>
<tex-math id="M379">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\ell =64$$\end{document}</tex-math>
<mml:math id="M380">
<mml:mrow>
<mml:mi>ℓ</mml:mi>
<mml:mo>=</mml:mo>
<mml:mn>64</mml:mn>
</mml:mrow>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq261.gif"></inline-graphic>
</alternatives>
</inline-formula>
. Note that the Patternhunter I spaced seed turns out to be better than the contiguous seed, if we consider the Bernoulli criterion, only when
<inline-formula id="IEq262">
<alternatives>
<tex-math id="M381">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$p > 0.13209$$\end{document}</tex-math>
<mml:math id="M382">
<mml:mrow>
<mml:mi>p</mml:mi>
<mml:mo>&gt;</mml:mo>
<mml:mn>0.13209</mml:mn>
</mml:mrow>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq262.gif"></inline-graphic>
</alternatives>
</inline-formula>
(
<italic>dark red dashed line</italic>
)
<xref ref-type="fn" rid="Fn5">5</xref>
, or if we consider the
<inline-formula id="IEq263">
<alternatives>
<tex-math id="M383">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\int _0^x$$\end{document}</tex-math>
<mml:math id="M384">
<mml:msubsup>
<mml:mo>∫</mml:mo>
<mml:mn>0</mml:mn>
<mml:mi>x</mml:mi>
</mml:msubsup>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq263.gif"></inline-graphic>
</alternatives>
</inline-formula>
Hit Integration criterion only when
<inline-formula id="IEq264">
<alternatives>
<tex-math id="M385">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$x > 0.14301$$\end{document}</tex-math>
<mml:math id="M386">
<mml:mrow>
<mml:mi>x</mml:mi>
<mml:mo>&gt;</mml:mo>
<mml:mn>0.14301</mml:mn>
</mml:mrow>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq264.gif"></inline-graphic>
</alternatives>
</inline-formula>
(
<italic>dark red dashed line</italic>
). However, if one wants to consider, not the
<inline-formula id="IEq265">
<alternatives>
<tex-math id="M387">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\int _0^x$$\end{document}</tex-math>
<mml:math id="M388">
<mml:msubsup>
<mml:mo>∫</mml:mo>
<mml:mn>0</mml:mn>
<mml:mi>x</mml:mi>
</mml:msubsup>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq265.gif"></inline-graphic>
</alternatives>
</inline-formula>
, but the
<inline-formula id="IEq266">
<alternatives>
<tex-math id="M389">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\int _x^1$$\end{document}</tex-math>
<mml:math id="M390">
<mml:msubsup>
<mml:mo>∫</mml:mo>
<mml:mi>x</mml:mi>
<mml:mn>1</mml:mn>
</mml:msubsup>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq266.gif"></inline-graphic>
</alternatives>
</inline-formula>
Hit Integration criterion (data not shown), then the Patternhunter I spaced seed always outperforms the contiguous seed, even though both seeds are dominant in terms of
<inline-formula id="IEq267">
<alternatives>
<tex-math id="M391">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$c_{\pi ,m}$$\end{document}</tex-math>
<mml:math id="M392">
<mml:msub>
<mml:mi>c</mml:mi>
<mml:mrow>
<mml:mi mathvariant="italic">π</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>m</mml:mi>
</mml:mrow>
</mml:msub>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq267.gif"></inline-graphic>
</alternatives>
</inline-formula>
coefficients and therefore cannot be compared directly under this
<italic>partial order</italic>
dominance.
<fig id="Fig5">
<label>Fig. 5</label>
<caption>
<p>Bernoulli and Dirac optimal seeds. The Bernoulli and Dirac optimal seeds, for single seeds of weight 11 and span
<inline-formula id="IEq268">
<alternatives>
<tex-math id="M393">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\le 22$$\end{document}</tex-math>
<mml:math id="M394">
<mml:mrow>
<mml:mo>≤</mml:mo>
<mml:mn>22</mml:mn>
</mml:mrow>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq268.gif"></inline-graphic>
</alternatives>
</inline-formula>
, over the match probability or the match frequency of each model (
<italic>x</italic>
-axis), and on any alignment length
<inline-formula id="IEq269">
<alternatives>
<tex-math id="M395">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\ell \in [22\ldots64]$$\end{document}</tex-math>
<mml:math id="M396">
<mml:mrow>
<mml:mi>ℓ</mml:mi>
<mml:mo>∈</mml:mo>
<mml:mo stretchy="false">[</mml:mo>
<mml:mn>22</mml:mn>
<mml:mo>…</mml:mo>
<mml:mn>64</mml:mn>
<mml:mo stretchy="false">]</mml:mo>
</mml:mrow>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq269.gif"></inline-graphic>
</alternatives>
</inline-formula>
(
<italic>y</italic>
-axis). On both Figs. 5 and
<xref rid="Fig6" ref-type="fig">6</xref>
, we choose to represent the same seeds with the same label and with the same background color. On discrete models, a pink mark is set. Seeds on the right of this mark are lossless for the two parameters indicated on the right margin: the minimum number of matches
<italic>m</italic>
over the alignment length
<inline-formula id="IEq270">
<alternatives>
<tex-math id="M397">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\ell$$\end{document}</tex-math>
<mml:math id="M398">
<mml:mi>ℓ</mml:mi>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq270.gif"></inline-graphic>
</alternatives>
</inline-formula>
</p>
</caption>
<graphic xlink:href="13015_2017_92_Fig5_HTML" id="MO10"></graphic>
</fig>
</p>
<p>We finally mention that, for alignments of length
<inline-formula id="IEq271">
<alternatives>
<tex-math id="M399">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\ell =64$$\end{document}</tex-math>
<mml:math id="M400">
<mml:mrow>
<mml:mi>ℓ</mml:mi>
<mml:mo>=</mml:mo>
<mml:mn>64</mml:mn>
</mml:mrow>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq271.gif"></inline-graphic>
</alternatives>
</inline-formula>
, both the contiguous seed and the Patternhunter I seed are in the set of the twelve optimal seeds found for the Bernoulli model
<xref ref-type="fn" rid="Fn6">6</xref>
(they are reported by symbols
<inline-graphic xlink:href="13015_2017_92_Figaa_HTML.gif" id="d29e5993"></inline-graphic>
and
<inline-graphic xlink:href="13015_2017_92_Figab_HTML.gif" id="d29e5996"></inline-graphic>
in Fig.  
<xref rid="Fig5" ref-type="fig">5</xref>
,
<italic> top line</italic>
of the first plot). Both are also in the set of the eight optimal seeds for the
<inline-formula id="IEq272">
<alternatives>
<tex-math id="M401">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\int _0^x$$\end{document}</tex-math>
<mml:math id="M402">
<mml:msubsup>
<mml:mo>∫</mml:mo>
<mml:mn>0</mml:mn>
<mml:mi>x</mml:mi>
</mml:msubsup>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq272.gif"></inline-graphic>
</alternatives>
</inline-formula>
Hit Integration model. But, quite surprisingly, neither of the two is in the set of the four optimal seeds for the
<inline-formula id="IEq273">
<alternatives>
<tex-math id="M403">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\int _x^1$$\end{document}</tex-math>
<mml:math id="M404">
<mml:msubsup>
<mml:mo>∫</mml:mo>
<mml:mi>x</mml:mi>
<mml:mn>1</mml:mn>
</mml:msubsup>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq273.gif"></inline-graphic>
</alternatives>
</inline-formula>
Hit Integration model (reported in Fig.  
<xref rid="Fig6" ref-type="fig">6</xref>
,
<italic> top line</italic>
of first plot). In fact, for the
<inline-formula id="IEq274">
<alternatives>
<tex-math id="M405">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\int _x^1$$\end{document}</tex-math>
<mml:math id="M406">
<mml:msubsup>
<mml:mo>∫</mml:mo>
<mml:mi>x</mml:mi>
<mml:mn>1</mml:mn>
</mml:msubsup>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq274.gif"></inline-graphic>
</alternatives>
</inline-formula>
Hit Integration model, the spaced seed 111001011001010111 (reported by a symbol
<inline-graphic xlink:href="13015_2017_92_Figac_HTML.gif" id="d29e6061"></inline-graphic>
in Fig.  
<xref rid="Fig6" ref-type="fig">6</xref>
, top line of first plot) is optimal
<xref ref-type="fn" rid="Fn7">7</xref>
on a wide range of
<italic>x</italic>
(
<inline-formula id="IEq277">
<alternatives>
<tex-math id="M407">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$x \in [0,0.97189]$$\end{document}</tex-math>
<mml:math id="M408">
<mml:mrow>
<mml:mi>x</mml:mi>
<mml:mo>∈</mml:mo>
<mml:mo stretchy="false">[</mml:mo>
<mml:mn>0</mml:mn>
<mml:mo>,</mml:mo>
<mml:mn>0.97189</mml:mn>
<mml:mo stretchy="false">]</mml:mo>
</mml:mrow>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq277.gif"></inline-graphic>
</alternatives>
</inline-formula>
) before being surpassed by three other seeds (
<inline-graphic xlink:href="13015_2017_92_Figad_HTML.gif" id="d29e6135"></inline-graphic>
,
<inline-graphic xlink:href="13015_2017_92_Figae_HTML.gif" id="d29e6138"></inline-graphic>
and
<inline-graphic xlink:href="13015_2017_92_Figaf_HTML.gif" id="d29e6141"></inline-graphic>
in Fig.  
<xref rid="Fig6" ref-type="fig">6</xref>
,
<italic> top line</italic>
of the first plot).
<fig id="Fig6">
<label>Fig. 6</label>
<caption>
<p>Hit Integration and Heaviside optimal seeds. The
<inline-formula id="IEq278">
<alternatives>
<tex-math id="M409">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\int _{x}^{1}$$\end{document}</tex-math>
<mml:math id="M410">
<mml:msubsup>
<mml:mo>∫</mml:mo>
<mml:mrow>
<mml:mi>x</mml:mi>
</mml:mrow>
<mml:mn>1</mml:mn>
</mml:msubsup>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq278.gif"></inline-graphic>
</alternatives>
</inline-formula>
Hit Integration and
<inline-formula id="IEq279">
<alternatives>
<tex-math id="M411">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\sum _{x}^{1}$$\end{document}</tex-math>
<mml:math id="M412">
<mml:msubsup>
<mml:mo>∑</mml:mo>
<mml:mrow>
<mml:mi>x</mml:mi>
</mml:mrow>
<mml:mn>1</mml:mn>
</mml:msubsup>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq279.gif"></inline-graphic>
</alternatives>
</inline-formula>
Heaviside optimal seeds, for single seeds of weight 11 and span
<inline-formula id="IEq280">
<alternatives>
<tex-math id="M413">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\le 22$$\end{document}</tex-math>
<mml:math id="M414">
<mml:mrow>
<mml:mo>≤</mml:mo>
<mml:mn>22</mml:mn>
</mml:mrow>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq280.gif"></inline-graphic>
</alternatives>
</inline-formula>
, over the match probability or the match frequency of each model (
<italic>x</italic>
-
<italic>axis</italic>
), and on any alignment length
<inline-formula id="IEq281">
<alternatives>
<tex-math id="M415">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\ell \in [22\ldots64]$$\end{document}</tex-math>
<mml:math id="M416">
<mml:mrow>
<mml:mi>ℓ</mml:mi>
<mml:mo>∈</mml:mo>
<mml:mo stretchy="false">[</mml:mo>
<mml:mn>22</mml:mn>
<mml:mo>…</mml:mo>
<mml:mn>64</mml:mn>
<mml:mo stretchy="false">]</mml:mo>
</mml:mrow>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq281.gif"></inline-graphic>
</alternatives>
</inline-formula>
(
<italic>y</italic>
-
<italic>axis</italic>
). On both Figs.
<xref rid="Fig5" ref-type="fig">5</xref>
and 6, we choose to represent the same seeds with the same label and with the same background color. On discrete models, a
<italic> pink mark</italic>
is set. Seeds on the right of this mark are lossless for the two parameters indicated on the right margin: the minimum number of matches
<italic>m</italic>
over the alignment length
<inline-formula id="IEq282">
<alternatives>
<tex-math id="M417">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\ell$$\end{document}</tex-math>
<mml:math id="M418">
<mml:mi>ℓ</mml:mi>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq282.gif"></inline-graphic>
</alternatives>
</inline-formula>
</p>
</caption>
<graphic xlink:href="13015_2017_92_Fig6_HTML" id="MO11"></graphic>
</fig>
</p>
</sec>
</sec>
<sec id="Sec7">
<title>Discrete models and lossless seeds</title>
<p>In this section, we propose two additional models for selecting seeds. We will name them
<italic>Dirac</italic>
and
<italic>Heaviside</italic>
. These models can be seen as the
<italic>discrete</italic>
counterparts of the Bernoulli and the Hit Integration models, and are simply defined by:
<list list-type="order">
<list-item>
<p>
<inline-formula id="IEq283">
<alternatives>
<tex-math id="M419">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$Dirac_\pi (m,\ell ) = \frac{c_{\pi ,m}}{{\ell \atopwithdelims ()m}}$$\end{document}</tex-math>
<mml:math id="M420">
<mml:mrow>
<mml:mi>D</mml:mi>
<mml:mi>i</mml:mi>
<mml:mi>r</mml:mi>
<mml:mi>a</mml:mi>
<mml:msub>
<mml:mi>c</mml:mi>
<mml:mi mathvariant="italic">π</mml:mi>
</mml:msub>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>m</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>ℓ</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
<mml:mo>=</mml:mo>
<mml:mfrac>
<mml:msub>
<mml:mi>c</mml:mi>
<mml:mrow>
<mml:mi mathvariant="italic">π</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>m</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mfenced close=")" open="(" separators="">
<mml:mfrac linethickness="0pt">
<mml:mi>ℓ</mml:mi>
<mml:mi>m</mml:mi>
</mml:mfrac>
</mml:mfenced>
</mml:mfrac>
</mml:mrow>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq283.gif"></inline-graphic>
</alternatives>
</inline-formula>
, to give the ratio between the number of alignments detected by the seed
<inline-formula id="IEq284">
<alternatives>
<tex-math id="M421">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\pi$$\end{document}</tex-math>
<mml:math id="M422">
<mml:mi mathvariant="italic">π</mml:mi>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq284.gif"></inline-graphic>
</alternatives>
</inline-formula>
over all the alignments of length
<inline-formula id="IEq285">
<alternatives>
<tex-math id="M423">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\ell$$\end{document}</tex-math>
<mml:math id="M424">
<mml:mi>ℓ</mml:mi>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq285.gif"></inline-graphic>
</alternatives>
</inline-formula>
with exactly
<italic>m</italic>
matches,</p>
</list-item>
<list-item>
<p>
<inline-formula id="IEq286">
<alternatives>
<tex-math id="M425">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$Heaviside_\pi (m_a,m_b,\ell ) = \frac{\sum \limits _{m=m_a}^{m_b} Dirac_\pi (m,\ell )}{m_b - m_a + 1}$$\end{document}</tex-math>
<mml:math id="M426">
<mml:mrow>
<mml:mi>H</mml:mi>
<mml:mi>e</mml:mi>
<mml:mi>a</mml:mi>
<mml:mi>v</mml:mi>
<mml:mi>i</mml:mi>
<mml:mi>s</mml:mi>
<mml:mi>i</mml:mi>
<mml:mi>d</mml:mi>
<mml:msub>
<mml:mi>e</mml:mi>
<mml:mi mathvariant="italic">π</mml:mi>
</mml:msub>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:msub>
<mml:mi>m</mml:mi>
<mml:mi>a</mml:mi>
</mml:msub>
<mml:mo>,</mml:mo>
<mml:msub>
<mml:mi>m</mml:mi>
<mml:mi>b</mml:mi>
</mml:msub>
<mml:mo>,</mml:mo>
<mml:mi>ℓ</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
<mml:mo>=</mml:mo>
<mml:mfrac>
<mml:mrow>
<mml:munderover>
<mml:mo movablelimits="false">∑</mml:mo>
<mml:mrow>
<mml:mi>m</mml:mi>
<mml:mo>=</mml:mo>
<mml:msub>
<mml:mi>m</mml:mi>
<mml:mi>a</mml:mi>
</mml:msub>
</mml:mrow>
<mml:msub>
<mml:mi>m</mml:mi>
<mml:mi>b</mml:mi>
</mml:msub>
</mml:munderover>
<mml:mi>D</mml:mi>
<mml:mi>i</mml:mi>
<mml:mi>r</mml:mi>
<mml:mi>a</mml:mi>
<mml:msub>
<mml:mi>c</mml:mi>
<mml:mi mathvariant="italic">π</mml:mi>
</mml:msub>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>m</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>ℓ</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mrow>
<mml:msub>
<mml:mi>m</mml:mi>
<mml:mi>b</mml:mi>
</mml:msub>
<mml:mo>-</mml:mo>
<mml:msub>
<mml:mi>m</mml:mi>
<mml:mi>a</mml:mi>
</mml:msub>
<mml:mo>+</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:mfrac>
</mml:mrow>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq286.gif"></inline-graphic>
</alternatives>
</inline-formula>
, to give the average ratio, over any number of matches
<italic>m</italic>
between
<inline-formula id="IEq287">
<alternatives>
<tex-math id="M427">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$m_a$$\end{document}</tex-math>
<mml:math id="M428">
<mml:msub>
<mml:mi>m</mml:mi>
<mml:mi>a</mml:mi>
</mml:msub>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq287.gif"></inline-graphic>
</alternatives>
</inline-formula>
and
<inline-formula id="IEq288">
<alternatives>
<tex-math id="M429">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$m_b$$\end{document}</tex-math>
<mml:math id="M430">
<mml:msub>
<mml:mi>m</mml:mi>
<mml:mi>b</mml:mi>
</mml:msub>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq288.gif"></inline-graphic>
</alternatives>
</inline-formula>
(out of
<inline-formula id="IEq289">
<alternatives>
<tex-math id="M431">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\ell$$\end{document}</tex-math>
<mml:math id="M432">
<mml:mi>ℓ</mml:mi>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq289.gif"></inline-graphic>
</alternatives>
</inline-formula>
) of the previously defined Dirac model. The
<italic>Heaviside</italic>
full distribution has already been illustrated in Fig.  
<xref rid="Fig3" ref-type="fig">3</xref>
, together with the
<italic>Hit Integration</italic>
distribution with similar parameters.</p>
</list-item>
</list>
As long as we allow the possible loss of some of the
<italic>strictly equivalent</italic>
<xref ref-type="fn" rid="Fn8">8</xref>
seeds in terms of sensitivity defined by the Dirac and Heaviside functions, the
<italic>dominance</italic>
criterion can be applied to filter out many candidate seeds.</p>
<p>In addition, the Dirac and Heaviside functions are based on
<italic>rational number</italic>
computations and comparisons: they are thus one or two orders of magnitude faster to compute and lighter to store than the polynomial forms given by the continuous models of the previous section.</p>
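<p>The two discrete definitions above translate directly into exact rational arithmetic. The following sketch assumes that the coefficient vector of a seed is already available; the vector used here is a hand-counted toy value for the seed 101 on alignments of length 4, and the names <monospace>dirac</monospace>, <monospace>heaviside</monospace> and <monospace>lossless_threshold</monospace> are hypothetical.</p>
<preformat>
```python
from fractions import Fraction
from math import comb


def dirac(c, m):
    """Fraction of the alignments with exactly m matches that are hit by the seed."""
    length = len(c) - 1
    return Fraction(c[m], comb(length, m))


def heaviside(c, ma, mb):
    """Average of the Dirac ratios for m from ma to mb (both included)."""
    return sum(dirac(c, m) for m in range(ma, mb + 1)) / (mb - ma + 1)


def lossless_threshold(c):
    """Smallest m such that every alignment with at least m matches is hit."""
    length = len(c) - 1
    m = length + 1  # returned as is when even the all-match alignment is missed
    while m > 0 and dirac(c, m - 1) == 1:
        m -= 1
    return m


c = [0, 0, 2, 4, 1]           # toy coefficients: seed 101, alignments of length 4
print(dirac(c, 2))            # 1/3
print(heaviside(c, 2, 4))     # 7/9
print(lossless_threshold(c))  # 3
```
</preformat>
<p>A Dirac ratio equal to 1 means that no alignment with that number of matches is missed, which is the lossless situation marked in Figs. 5 and 6.</p>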
<p>Finally, an interesting feature of the
<inline-formula id="IEq296">
<alternatives>
<tex-math id="M433">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$Dirac_\pi (m,\ell )$$\end{document}</tex-math>
<mml:math id="M434">
<mml:mrow>
<mml:mi>D</mml:mi>
<mml:mi>i</mml:mi>
<mml:mi>r</mml:mi>
<mml:mi>a</mml:mi>
<mml:msub>
<mml:mi>c</mml:mi>
<mml:mi mathvariant="italic">π</mml:mi>
</mml:msub>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>m</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>ℓ</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq296.gif"></inline-graphic>
</alternatives>
</inline-formula>
, also true for the specific
<inline-formula id="IEq297">
<alternatives>
<tex-math id="M435">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$Heaviside_\pi (m,\ell ,\ell )$$\end{document}</tex-math>
<mml:math id="M436">
<mml:mrow>
<mml:mi>H</mml:mi>
<mml:mi>e</mml:mi>
<mml:mi>a</mml:mi>
<mml:mi>v</mml:mi>
<mml:mi>i</mml:mi>
<mml:mi>s</mml:mi>
<mml:mi>i</mml:mi>
<mml:mi>d</mml:mi>
<mml:msub>
<mml:mi>e</mml:mi>
<mml:mi mathvariant="italic">π</mml:mi>
</mml:msub>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>m</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>ℓ</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>ℓ</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq297.gif"></inline-graphic>
</alternatives>
</inline-formula>
, is that, when the number of match symbols
<italic>m</italic>
is large enough, one seed
<inline-formula id="IEq298">
<alternatives>
<tex-math id="M437">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\pi$$\end{document}</tex-math>
<mml:math id="M438">
<mml:mi mathvariant="italic">π</mml:mi>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq298.gif"></inline-graphic>
</alternatives>
</inline-formula>
(or sometimes several seeds) can satisfy the equality
<inline-formula id="IEq299">
<alternatives>
<tex-math id="M439">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$c_{\pi ,m'} = {\ell \atopwithdelims ()m'}$$\end{document}</tex-math>
<mml:math id="M440">
<mml:mrow>
<mml:msub>
<mml:mi>c</mml:mi>
<mml:mrow>
<mml:mi mathvariant="italic">π</mml:mi>
<mml:mo>,</mml:mo>
<mml:msup>
<mml:mi>m</mml:mi>
<mml:mo>′</mml:mo>
</mml:msup>
</mml:mrow>
</mml:msub>
<mml:mo>=</mml:mo>
<mml:mfenced close=")" open="(" separators="">
<mml:mfrac linethickness="0pt">
<mml:mi>ℓ</mml:mi>
<mml:msup>
<mml:mi>m</mml:mi>
<mml:mo>′</mml:mo>
</mml:msup>
</mml:mfrac>
</mml:mfenced>
</mml:mrow>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq299.gif"></inline-graphic>
</alternatives>
</inline-formula>
for all
<inline-formula id="IEq300">
<alternatives>
<tex-math id="M441">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$m' \ge m$$\end{document}</tex-math>
<mml:math id="M442">
<mml:mrow>
<mml:msup>
<mml:mi>m</mml:mi>
<mml:mo>′</mml:mo>
</mml:msup>
<mml:mo>≥</mml:mo>
<mml:mi>m</mml:mi>
</mml:mrow>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq300.gif"></inline-graphic>
</alternatives>
</inline-formula>
. Such seeds are thus lossless since they can detect all the alignments of length
<inline-formula id="IEq301">
<alternatives>
<tex-math id="M443">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\ell$$\end{document}</tex-math>
<mml:math id="M444">
<mml:mi>ℓ</mml:mi>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq301.gif"></inline-graphic>
</alternatives>
</inline-formula>
with at least
<italic>m</italic>
matches (or with at most
<inline-formula id="IEq302">
<alternatives>
<tex-math id="M445">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\ell -m$$\end{document}</tex-math>
<mml:math id="M446">
<mml:mrow>
<mml:mi>ℓ</mml:mi>
<mml:mo>-</mml:mo>
<mml:mi>m</mml:mi>
</mml:mrow>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq302.gif"></inline-graphic>
</alternatives>
</inline-formula>
mismatches), and obviously the best lossless ones are retained in the set of dominant seeds, when the equality
<inline-formula id="IEq303">
<alternatives>
<tex-math id="M447">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$c_{\pi ,m} = {\ell \atopwithdelims ()m}$$\end{document}</tex-math>
<mml:math id="M448">
<mml:mrow>
<mml:msub>
<mml:mi>c</mml:mi>
<mml:mrow>
<mml:mi mathvariant="italic">π</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>m</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>=</mml:mo>
<mml:mfenced close=")" open="(" separators="">
<mml:mfrac linethickness="0pt">
<mml:mi>ℓ</mml:mi>
<mml:mi>m</mml:mi>
</mml:mfrac>
</mml:mfenced>
</mml:mrow>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq303.gif"></inline-graphic>
</alternatives>
</inline-formula>
occurs. As a side consequence, the best
<italic>lossless seeds</italic>
are also in the set of
<italic>dominant seeds</italic>
and will be reported in the experiments.</p>
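<p>The lossless condition above can be checked by brute force on small instances. The following Python sketch is only an illustration and is not the counting semi-ring computation used in this work: it assumes seeds and alignments are written as strings over the characters 1 (match) and 0 (joker or mismatch), and the function names hits, hit_counts and lossless_threshold are hypothetical helpers introduced here.

```python
from itertools import combinations
from math import comb

def hits(seed, alignment):
    # True if every '1' of the seed faces a '1' (match) at some offset.
    span = len(seed)
    for start in range(len(alignment) - span + 1):
        if all(alignment[start + i] == "1"
               for i, s in enumerate(seed) if s == "1"):
            return True
    return False

def hit_counts(seed, length):
    # counts[m] = number of alignments of the given length with exactly
    # m matches that are hit by the seed (exhaustive enumeration).
    counts = []
    for m in range(length + 1):
        c = 0
        for match_positions in combinations(range(length), m):
            chosen = set(match_positions)
            alignment = "".join("1" if i in chosen else "0"
                                for i in range(length))
            if hits(seed, alignment):
                c += 1
        counts.append(c)
    return counts

def lossless_threshold(seed, length):
    # Smallest m such that counts[m2] == C(length, m2) for all m2 >= m,
    # or None when even the all-match alignment is not hit.
    counts = hit_counts(seed, length)
    threshold = None
    for m in range(length, -1, -1):
        if counts[m] == comb(length, m):
            threshold = m
        else:
            break
    return threshold

if __name__ == "__main__":
    print(hit_counts("11", 3))
    print(lossless_threshold("101", 5))
```

The enumeration is exponential in the alignment length, so this sketch is only practical for very short alignments; it is meant to make the equality between the hit count and the binomial coefficient concrete.</p>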
<p>Note that, to keep a symmetric notation with the
<inline-formula id="IEq304">
<alternatives>
<tex-math id="M449">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\int _{p_a}^{p_b}\,$$\end{document}</tex-math>
<mml:math id="M450">
<mml:mrow>
<mml:msubsup>
<mml:mo>∫</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mi>p</mml:mi>
<mml:mi>a</mml:mi>
</mml:msub>
</mml:mrow>
<mml:msub>
<mml:mi>p</mml:mi>
<mml:mi>b</mml:mi>
</mml:msub>
</mml:msubsup>
<mml:mspace width="0.166667em"></mml:mspace>
</mml:mrow>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq304.gif"></inline-graphic>
</alternatives>
</inline-formula>
<italic>Hit Integration</italic>
, and also have the same range for the domain of definition (
<inline-formula id="IEq305">
<alternatives>
<tex-math id="M451">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$0 \le p_a < p_b \le 1$$\end{document}</tex-math>
<mml:math id="M452">
<mml:mrow>
<mml:mn>0</mml:mn>
<mml:mo>≤</mml:mo>
<mml:msub>
<mml:mi>p</mml:mi>
<mml:mi>a</mml:mi>
</mml:msub>
<mml:mo>&lt;</mml:mo>
<mml:msub>
<mml:mi>p</mml:mi>
<mml:mi>b</mml:mi>
</mml:msub>
<mml:mo>≤</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq305.gif"></inline-graphic>
</alternatives>
</inline-formula>
), we will use the “frequency” notation
<inline-formula id="IEq306">
<alternatives>
<tex-math id="M453">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\sum _{f_a}^{f_b}\,$$\end{document}</tex-math>
<mml:math id="M454">
<mml:mrow>
<mml:msubsup>
<mml:mo>∑</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mi>f</mml:mi>
<mml:mi>a</mml:mi>
</mml:msub>
</mml:mrow>
<mml:msub>
<mml:mi>f</mml:mi>
<mml:mi>b</mml:mi>
</mml:msub>
</mml:msubsup>
<mml:mspace width="0.166667em"></mml:mspace>
</mml:mrow>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq306.gif"></inline-graphic>
</alternatives>
</inline-formula>
<italic>Heaviside</italic>
to designate
<inline-formula id="IEq307">
<alternatives>
<tex-math id="M455">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$Heaviside(\lfloor \ell \times f_a \rfloor ,\lfloor \ell \times f_b \rfloor ,\ell )$$\end{document}</tex-math>
<mml:math id="M456">
<mml:mrow>
<mml:mi>H</mml:mi>
<mml:mi>e</mml:mi>
<mml:mi>a</mml:mi>
<mml:mi>v</mml:mi>
<mml:mi>i</mml:mi>
<mml:mi>s</mml:mi>
<mml:mi>i</mml:mi>
<mml:mi>d</mml:mi>
<mml:mi>e</mml:mi>
<mml:mo stretchy="false">(</mml:mo>
<mml:mrow>
<mml:mo>⌊</mml:mo>
<mml:mi>ℓ</mml:mi>
<mml:mo>×</mml:mo>
<mml:msub>
<mml:mi>f</mml:mi>
<mml:mi>a</mml:mi>
</mml:msub>
<mml:mo>⌋</mml:mo>
</mml:mrow>
<mml:mo>,</mml:mo>
<mml:mrow>
<mml:mo>⌊</mml:mo>
<mml:mi>ℓ</mml:mi>
<mml:mo>×</mml:mo>
<mml:msub>
<mml:mi>f</mml:mi>
<mml:mi>b</mml:mi>
</mml:msub>
<mml:mo>⌋</mml:mo>
</mml:mrow>
<mml:mo>,</mml:mo>
<mml:mi>ℓ</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq307.gif"></inline-graphic>
</alternatives>
</inline-formula>
. We will also rescale the
<italic>Dirac</italic>
function on the
<italic>Bernoulli’s</italic>
domain of definition, by using the frequency
<italic>f</italic>
(
<inline-formula id="IEq308">
<alternatives>
<tex-math id="M457">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$0 \le f \le 1$$\end{document}</tex-math>
<mml:math id="M458">
<mml:mrow>
<mml:mn>0</mml:mn>
<mml:mo>≤</mml:mo>
<mml:mi>f</mml:mi>
<mml:mo>≤</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq308.gif"></inline-graphic>
</alternatives>
</inline-formula>
) to designate
<inline-formula id="IEq309">
<alternatives>
<tex-math id="M459">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$Dirac(\lfloor \ell \times f \rfloor ,\ell )$$\end{document}</tex-math>
<mml:math id="M460">
<mml:mrow>
<mml:mi>D</mml:mi>
<mml:mi>i</mml:mi>
<mml:mi>r</mml:mi>
<mml:mi>a</mml:mi>
<mml:mi>c</mml:mi>
<mml:mo stretchy="false">(</mml:mo>
<mml:mo>⌊</mml:mo>
<mml:mi>ℓ</mml:mi>
<mml:mo>×</mml:mo>
<mml:mi>f</mml:mi>
<mml:mo>⌋</mml:mo>
<mml:mo>,</mml:mo>
<mml:mi>ℓ</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq309.gif"></inline-graphic>
</alternatives>
</inline-formula>
.</p>
</sec>
<sec id="Sec8">
<title>Experiments</title>
<p>Single spaced seeds (
<inline-formula id="IEq310">
<alternatives>
<tex-math id="M461">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$n =1$$\end{document}</tex-math>
<mml:math id="M462">
<mml:mrow>
<mml:mi>n</mml:mi>
<mml:mo>=</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq310.gif"></inline-graphic>
</alternatives>
</inline-formula>
) and multiple co-designed spaced seeds (
<inline-formula id="IEq311">
<alternatives>
<tex-math id="M463">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$n \in [2\ldots\,4]$$\end{document}</tex-math>
<mml:math id="M464">
<mml:mrow>
<mml:mi>n</mml:mi>
<mml:mo>∈</mml:mo>
<mml:mo stretchy="false">[</mml:mo>
<mml:mn>2</mml:mn>
<mml:mo>…</mml:mo>
<mml:mspace width="0.166667em"></mml:mspace>
<mml:mn>4</mml:mn>
<mml:mo stretchy="false">]</mml:mo>
</mml:mrow>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq311.gif"></inline-graphic>
</alternatives>
</inline-formula>
) of weight
<inline-formula id="IEq312">
<alternatives>
<tex-math id="M465">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$w \in [3\ldots\,16]$$\end{document}</tex-math>
<mml:math id="M466">
<mml:mrow>
<mml:mi>w</mml:mi>
<mml:mo>∈</mml:mo>
<mml:mo stretchy="false">[</mml:mo>
<mml:mn>3</mml:mn>
<mml:mo>…</mml:mo>
<mml:mspace width="0.166667em"></mml:mspace>
<mml:mn>16</mml:mn>
<mml:mo stretchy="false">]</mml:mo>
</mml:mrow>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq312.gif"></inline-graphic>
</alternatives>
</inline-formula>
and span
<italic>s</italic>
at most
<inline-formula id="IEq313">
<alternatives>
<tex-math id="M467">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$2 \times w$$\end{document}</tex-math>
<mml:math id="M468">
<mml:mrow>
<mml:mn>2</mml:mn>
<mml:mo>×</mml:mo>
<mml:mi>w</mml:mi>
</mml:mrow>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq313.gif"></inline-graphic>
</alternatives>
</inline-formula>
have been considered. Note that, for single seeds of large weight (
<inline-formula id="IEq314">
<alternatives>
<tex-math id="M469">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$w \ge 15$$\end{document}</tex-math>
<mml:math id="M470">
<mml:mrow>
<mml:mi>w</mml:mi>
<mml:mo>≥</mml:mo>
<mml:mn>15</mml:mn>
</mml:mrow>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq314.gif"></inline-graphic>
</alternatives>
</inline-formula>
), or for multiple seeds, the full enumeration is respectively burdensome or intractable, so we prefer to apply the hill-climbing algorithm of Iedera [
<xref ref-type="bibr" rid="CR88">88</xref>
]: selected dominant spaced seeds are thus
<italic>locally dominant</italic>
, since it would be computationally unfeasible to guarantee their overall dominance. All the spaced seeds are evaluated on alignments of length
<inline-formula id="IEq315">
<alternatives>
<tex-math id="M471">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\ell \in [2 \times w\ldots\,64]$$\end{document}</tex-math>
<mml:math id="M472">
<mml:mrow>
<mml:mi>ℓ</mml:mi>
<mml:mo>∈</mml:mo>
<mml:mo stretchy="false">[</mml:mo>
<mml:mn>2</mml:mn>
<mml:mo>×</mml:mo>
<mml:mi>w</mml:mi>
<mml:mo>…</mml:mo>
<mml:mspace width="0.166667em"></mml:mspace>
<mml:mn>64</mml:mn>
<mml:mo stretchy="false">]</mml:mo>
</mml:mrow>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq315.gif"></inline-graphic>
</alternatives>
</inline-formula>
.</p>
<p>The main idea during the evaluation, also used by [
<xref ref-type="bibr" rid="CR1">1</xref>
] but only for the single Bernoulli criterion and on a single spaced seed, is to split the computation into two distinct stages:
<list list-type="order">
<list-item>
<p>Selecting the
<italic>set of dominant seeds</italic>
is the first stage: it provides a reduced set of candidate seeds. Note that the dominant selection is applicable without prior knowledge of the sensitivity criterion being used, provided that this sensitivity criterion is established on
<italic>i.i.d. sequence</italic>
alignments (this last requirement is true for the
<italic>Bernoulli</italic>
, the
<italic>Hit Integration</italic>
, the
<italic>Dirac</italic>
, and the
<italic>Heaviside</italic>
models).</p>
</list-item>
<list-item>
<p>Comparing each of the seeds from the
<italic>set of dominant seeds</italic>
with a sensitivity criterion is the second stage: it usually depends on
<italic>at least</italic>
one parameter (for example, for the Bernoulli model: the probability
<italic>p</italic>
to generate a match) which has different consequences on continuous and discrete models:
<list list-type="bullet">
<list-item>
<label></label>
<p>For the
<italic>Bernoulli</italic>
and the
<italic>Hit Integration</italic>
continuous models, this implies comparing
<italic>p</italic>
-parametrized or
<inline-formula id="IEq316">
<alternatives>
<tex-math id="M473">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$[p_a,p_b]$$\end{document}</tex-math>
<mml:math id="M474">
<mml:mrow>
<mml:mo stretchy="false">[</mml:mo>
<mml:msub>
<mml:mi>p</mml:mi>
<mml:mi>a</mml:mi>
</mml:msub>
<mml:mo>,</mml:mo>
<mml:msub>
<mml:mi>p</mml:mi>
<mml:mi>b</mml:mi>
</mml:msub>
<mml:mo stretchy="false">]</mml:mo>
</mml:mrow>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq316.gif"></inline-graphic>
</alternatives>
</inline-formula>
-parametrized polynomials: we follow the idea proposed in [
<xref ref-type="bibr" rid="CR1">1</xref>
] for the
<italic>Bernoulli</italic>
model and also apply it on the
<italic>Hit Integration</italic>
model where we compute the
<inline-formula id="IEq317">
<alternatives>
<tex-math id="M475">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\int _0^x$$\end{document}</tex-math>
<mml:math id="M476">
<mml:msubsup>
<mml:mo>∫</mml:mo>
<mml:mn>0</mml:mn>
<mml:mi>x</mml:mi>
</mml:msubsup>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq317.gif"></inline-graphic>
</alternatives>
</inline-formula>
<italic>HI</italic>
and the
<inline-formula id="IEq318">
<alternatives>
<tex-math id="M477">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\int _x^1$$\end{document}</tex-math>
<mml:math id="M478">
<mml:msubsup>
<mml:mo>∫</mml:mo>
<mml:mi>x</mml:mi>
<mml:mn>1</mml:mn>
</mml:msubsup>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq318.gif"></inline-graphic>
</alternatives>
</inline-formula>
<italic>HI</italic>
respectively. Let us concentrate on the Bernoulli model with a (single) free parameter
<italic>p</italic>
: For two dominant seeds
<inline-formula id="IEq319">
<alternatives>
<tex-math id="M479">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\pi _a$$\end{document}</tex-math>
<mml:math id="M480">
<mml:msub>
<mml:mi mathvariant="italic">π</mml:mi>
<mml:mi>a</mml:mi>
</mml:msub>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq319.gif"></inline-graphic>
</alternatives>
</inline-formula>
and
<inline-formula id="IEq320">
<alternatives>
<tex-math id="M481">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\pi _b$$\end{document}</tex-math>
<mml:math id="M482">
<mml:msub>
<mml:mi mathvariant="italic">π</mml:mi>
<mml:mi>b</mml:mi>
</mml:msub>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq320.gif"></inline-graphic>
</alternatives>
</inline-formula>
and a given length
<inline-formula id="IEq321">
<alternatives>
<tex-math id="M483">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\ell$$\end{document}</tex-math>
<mml:math id="M484">
<mml:mi>ℓ</mml:mi>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq321.gif"></inline-graphic>
</alternatives>
</inline-formula>
, we compute their respective polynomials
<inline-formula id="IEq322">
<alternatives>
<tex-math id="M485">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$Pr_{\pi _a}(p,\ell )$$\end{document}</tex-math>
<mml:math id="M486">
<mml:mrow>
<mml:mi>P</mml:mi>
<mml:msub>
<mml:mi>r</mml:mi>
<mml:msub>
<mml:mi mathvariant="italic">π</mml:mi>
<mml:mi>a</mml:mi>
</mml:msub>
</mml:msub>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>p</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>ℓ</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq322.gif"></inline-graphic>
</alternatives>
</inline-formula>
and
<inline-formula id="IEq323">
<alternatives>
<tex-math id="M487">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$Pr_{\pi _b}(p,\ell )$$\end{document}</tex-math>
<mml:math id="M488">
<mml:mrow>
<mml:mi>P</mml:mi>
<mml:msub>
<mml:mi>r</mml:mi>
<mml:msub>
<mml:mi mathvariant="italic">π</mml:mi>
<mml:mi>b</mml:mi>
</mml:msub>
</mml:msub>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>p</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>ℓ</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq323.gif"></inline-graphic>
</alternatives>
</inline-formula>
and their difference
<inline-formula id="IEq324">
<alternatives>
<tex-math id="M489">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$Pr_{\pi _a - \pi _b}(p,\ell ) = Pr_{\pi _a}(p,\ell ) - Pr_{\pi _b}(p,\ell )$$\end{document}</tex-math>
<mml:math id="M490">
<mml:mrow>
<mml:mi>P</mml:mi>
<mml:msub>
<mml:mi>r</mml:mi>
<mml:mrow>
<mml:msub>
<mml:mi mathvariant="italic">π</mml:mi>
<mml:mi>a</mml:mi>
</mml:msub>
<mml:mo>-</mml:mo>
<mml:msub>
<mml:mi mathvariant="italic">π</mml:mi>
<mml:mi>b</mml:mi>
</mml:msub>
</mml:mrow>
</mml:msub>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>p</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>ℓ</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
<mml:mo>=</mml:mo>
<mml:mi>P</mml:mi>
<mml:msub>
<mml:mi>r</mml:mi>
<mml:msub>
<mml:mi mathvariant="italic">π</mml:mi>
<mml:mi>a</mml:mi>
</mml:msub>
</mml:msub>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>p</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>ℓ</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
<mml:mo>-</mml:mo>
<mml:mi>P</mml:mi>
<mml:msub>
<mml:mi>r</mml:mi>
<mml:msub>
<mml:mi mathvariant="italic">π</mml:mi>
<mml:mi>b</mml:mi>
</mml:msub>
</mml:msub>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>p</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>ℓ</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq324.gif"></inline-graphic>
</alternatives>
</inline-formula>
(an example of its associated coefficients is shown in the third column of Table 
<xref rid="Tab1" ref-type="table">1</xref>
), from which zeros in the range
<inline-formula id="IEq325">
<alternatives>
<tex-math id="M491">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$p \in [0,1]$$\end{document}</tex-math>
<mml:math id="M492">
<mml:mrow>
<mml:mi>p</mml:mi>
<mml:mo>∈</mml:mo>
<mml:mo stretchy="false">[</mml:mo>
<mml:mn>0</mml:mn>
<mml:mo>,</mml:mo>
<mml:mn>1</mml:mn>
<mml:mo stretchy="false">]</mml:mo>
</mml:mrow>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq325.gif"></inline-graphic>
</alternatives>
</inline-formula>
are numerically extracted using solvers from Maple or Maxima. Using the
<italic>p</italic>
-intervals between these zeros, it is then possible to determine whether
<inline-formula id="IEq326">
<alternatives>
<tex-math id="M493">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$Pr_{\pi _a - \pi _b}(p,\ell )$$\end{document}</tex-math>
<mml:math id="M494">
<mml:mrow>
<mml:mi>P</mml:mi>
<mml:msub>
<mml:mi>r</mml:mi>
<mml:mrow>
<mml:msub>
<mml:mi mathvariant="italic">π</mml:mi>
<mml:mi>a</mml:mi>
</mml:msub>
<mml:mo>-</mml:mo>
<mml:msub>
<mml:mi mathvariant="italic">π</mml:mi>
<mml:mi>b</mml:mi>
</mml:msub>
</mml:mrow>
</mml:msub>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>p</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>ℓ</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq326.gif"></inline-graphic>
</alternatives>
</inline-formula>
is positive or negative, and thus which of the two seeds
<inline-formula id="IEq327">
<alternatives>
<tex-math id="M495">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\pi _a$$\end{document}</tex-math>
<mml:math id="M496">
<mml:msub>
<mml:mi mathvariant="italic">π</mml:mi>
<mml:mi>a</mml:mi>
</mml:msub>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq327.gif"></inline-graphic>
</alternatives>
</inline-formula>
or
<inline-formula id="IEq328">
<alternatives>
<tex-math id="M497">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\pi _b$$\end{document}</tex-math>
<mml:math id="M498">
<mml:msub>
<mml:mi mathvariant="italic">π</mml:mi>
<mml:mi>b</mml:mi>
</mml:msub>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq328.gif"></inline-graphic>
</alternatives>
</inline-formula>
is better according to
<italic>p</italic>
. Finally, the Pareto envelope (
<italic>optimal seeds</italic>
) can be extracted from the initial set of dominant seeds.</p>
</list-item>
<list-item>
<label></label>
<p>For the
<italic>Dirac</italic>
and the
<italic>Heaviside</italic>
discrete models, this implies comparing, instead of real-valued polynomials, integer numbers for the Dirac model (and respectively rational numbers for the Heaviside model), which is an easier and lighter process. The
<bold>Pareto envelope</bold>
can then be easily extracted from these discrete models to select the
<italic>optimal seeds</italic>
from the set of dominant seeds. We have also extracted the lossless part for the
<italic>Dirac</italic>
and the
<inline-formula id="IEq329">
<alternatives>
<tex-math id="M499">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\sum _x^1$$\end{document}</tex-math>
<mml:math id="M500">
<mml:msubsup>
<mml:mo>∑</mml:mo>
<mml:mi>x</mml:mi>
<mml:mn>1</mml:mn>
</mml:msubsup>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq329.gif"></inline-graphic>
</alternatives>
</inline-formula>
<italic>Heaviside</italic>
criteria.</p>
</list-item>
</list>
</p>
</list-item>
</list>
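The second stage on the Bernoulli model can be sketched on a toy instance. The Python code below is a hedged illustration rather than the procedure actually run here: it replaces the Maple or Maxima solvers with a simple grid scan followed by bisection (which can miss zeros of even multiplicity or zeros closer than one grid step), it obtains the coefficients by exhaustive enumeration, and the names hits, hit_counts, sensitivity and difference_zeros are hypothetical.

```python
from itertools import product

def hits(seed, alignment):
    # True if every '1' of the seed faces a '1' (match) at some offset.
    span = len(seed)
    for start in range(len(alignment) - span + 1):
        if all(alignment[start + i] == "1"
               for i, s in enumerate(seed) if s == "1"):
            return True
    return False

def hit_counts(seed, length):
    # counts[m] = number of hit alignments with exactly m matches.
    counts = [0] * (length + 1)
    for columns in product("01", repeat=length):
        alignment = "".join(columns)
        if hits(seed, alignment):
            counts[alignment.count("1")] += 1
    return counts

def sensitivity(counts, p):
    # Bernoulli sensitivity: sum over m of counts[m] * p^m * (1-p)^(l-m).
    length = len(counts) - 1
    return sum(c * p**m * (1 - p)**(length - m)
               for m, c in enumerate(counts))

def difference_zeros(counts_a, counts_b, grid=1000, tol=1e-12):
    # Zeros of sensitivity_a - sensitivity_b strictly inside (0, 1),
    # located by sign changes on a regular grid, refined by bisection.
    def diff(p):
        return sensitivity(counts_a, p) - sensitivity(counts_b, p)
    zeros = []
    prev_p, prev_d = None, 0.0
    for k in range(1, grid):
        p = k / grid
        d = diff(p)
        if d == 0.0:
            zeros.append(p)          # exact zero on a grid point
            prev_p, prev_d = None, 0.0
            continue
        if prev_p is not None and 0.0 > prev_d * d:
            lo, hi, d_lo = prev_p, p, prev_d
            while hi - lo > tol:
                mid = (lo + hi) / 2
                d_mid = diff(mid)
                if d_mid == 0.0:
                    lo = hi = mid
                elif 0.0 > d_lo * d_mid:
                    hi = mid
                else:
                    lo, d_lo = mid, d_mid
            zeros.append((lo + hi) / 2)
        prev_p, prev_d = p, d
    return zeros

if __name__ == "__main__":
    a = hit_counts("11", 3)
    b = hit_counts("101", 3)
    print(a, b, difference_zeros(a, b))
```

Once the zeros are known, evaluating the difference at one point inside each interval between consecutive zeros gives its sign on that interval, and therefore which of the two seeds is better there.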
In the aforementioned experiments, we noticed that the size of the
<italic>set of dominant seeds</italic>
was at most
<inline-formula id="IEq330">
<alternatives>
<tex-math id="M501">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$3359$$\end{document}</tex-math>
<mml:math id="M502">
<mml:mrow>
<mml:mn>3359</mml:mn>
</mml:mrow>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq330.gif"></inline-graphic>
</alternatives>
</inline-formula>
(with a median size of 57 and an average size of 303 over all the experiments). To briefly illustrate this point, the maximum size observed in each experiment is listed in Table 
<xref rid="Tab2" ref-type="table">2</xref>
.
<table-wrap id="Tab2">
<label>Table 2</label>
<caption>
<p>Maximum size of the set of dominant seeds</p>
</caption>
<table frame="hsides" rules="groups">
<thead>
<tr>
<th align="left"></th>
<th align="left" colspan="14">
<italic>w</italic>
</th>
</tr>
<tr>
<th align="left">
<italic>n</italic>
</th>
<th align="left">3</th>
<th align="left">4</th>
<th align="left">5</th>
<th align="left">6</th>
<th align="left">7</th>
<th align="left">8</th>
<th align="left">9</th>
<th align="left">10</th>
<th align="left">11</th>
<th align="left">12</th>
<th align="left">13</th>
<th align="left">14</th>
<th align="left">15</th>
<th align="left">16</th>
</tr>
</thead>
<tbody>
<tr>
<td align="left" rowspan="2">1</td>
<td align="left">2</td>
<td align="left">7</td>
<td align="left">8</td>
<td align="left">13</td>
<td align="left">15</td>
<td align="left">26</td>
<td align="left">23</td>
<td align="left">32</td>
<td align="left">40</td>
<td align="left">45</td>
<td align="left">46</td>
<td align="left">48</td>
<td align="left">74</td>
<td align="left">84</td>
</tr>
<tr>
<td align="left">(
<italic>64</italic>
)</td>
<td align="left">(
<italic>64</italic>
)</td>
<td align="left">(
<italic>62</italic>
)</td>
<td align="left">(
<italic>64</italic>
)</td>
<td align="left">(
<italic>64</italic>
)</td>
<td align="left">(
<italic>61</italic>
)</td>
<td align="left">(
<italic>60</italic>
)</td>
<td align="left">(
<italic>62</italic>
)</td>
<td align="left">(
<italic>64</italic>
)</td>
<td align="left">(
<italic>63</italic>
)</td>
<td align="left">(
<italic>64</italic>
)</td>
<td align="left">(
<italic>59</italic>
)</td>
<td align="left">(
<italic>64</italic>
)</td>
<td align="left">(
<italic>64</italic>
)</td>
</tr>
<tr>
<td align="left" rowspan="2">2</td>
<td align="left">5</td>
<td align="left">12</td>
<td align="left">35</td>
<td align="left">41</td>
<td align="left">52</td>
<td align="left">99</td>
<td align="left">128</td>
<td align="left">197</td>
<td align="left">231</td>
<td align="left">207</td>
<td align="left">350</td>
<td align="left">320</td>
<td align="left">439</td>
<td align="left">376</td>
</tr>
<tr>
<td align="left">(
<italic>64</italic>
)</td>
<td align="left">(
<italic>63</italic>
)</td>
<td align="left">(
<italic>63</italic>
)</td>
<td align="left">(
<italic>61</italic>
)</td>
<td align="left">(
<italic>64</italic>
)</td>
<td align="left">(
<italic>64</italic>
)</td>
<td align="left">(
<italic>60</italic>
)</td>
<td align="left">(
<italic>62</italic>
)</td>
<td align="left">(
<italic>61</italic>
)</td>
<td align="left">(
<italic>59</italic>
)</td>
<td align="left">(
<italic>63</italic>
)</td>
<td align="left">(
<italic>64</italic>
)</td>
<td align="left">(
<italic>64</italic>
)</td>
<td align="left">(
<italic>41</italic>
)</td>
</tr>
<tr>
<td align="left" rowspan="2">3</td>
<td align="left">6</td>
<td align="left">26</td>
<td align="left">85</td>
<td align="left">84</td>
<td align="left">204</td>
<td align="left">320</td>
<td align="left">391</td>
<td align="left">485</td>
<td align="left">854</td>
<td align="left">932</td>
<td align="left">1103</td>
<td align="left">1449</td>
<td align="left">1508</td>
<td align="left">1812</td>
</tr>
<tr>
<td align="left">(
<italic>60</italic>
)</td>
<td align="left">(
<italic>64</italic>
)</td>
<td align="left">(
<italic>64</italic>
)</td>
<td align="left">(
<italic>62</italic>
)</td>
<td align="left">(
<italic>64</italic>
)</td>
<td align="left">(
<italic>60</italic>
)</td>
<td align="left">(
<italic>56</italic>
)</td>
<td align="left">(
<italic>56</italic>
)</td>
<td align="left">(
<italic>62</italic>
)</td>
<td align="left">(
<italic>64</italic>
)</td>
<td align="left">(
<italic>64</italic>
)</td>
<td align="left">(
<italic>41</italic>
)</td>
<td align="left">(
<italic>64</italic>
)</td>
<td align="left">(
<italic>63</italic>
)</td>
</tr>
<tr>
<td align="left" rowspan="2">4</td>
<td align="left">7</td>
<td align="left">29</td>
<td align="left">124</td>
<td align="left">190</td>
<td align="left">254</td>
<td align="left">535</td>
<td align="left">811</td>
<td align="left">1041</td>
<td align="left">1450</td>
<td align="left">1908</td>
<td align="left">1775</td>
<td align="left">2364</td>
<td align="left">3125</td>
<td align="left">3359</td>
</tr>
<tr>
<td align="left">(
<italic>64</italic>
)</td>
<td align="left">(
<italic>64</italic>
)</td>
<td align="left">(
<italic>64</italic>
)</td>
<td align="left">(
<italic>64</italic>
)</td>
<td align="left">(
<italic>64</italic>
)</td>
<td align="left">(
<italic>59</italic>
)</td>
<td align="left">(
<italic>64</italic>
)</td>
<td align="left">(
<italic>58</italic>
)</td>
<td align="left">(
<italic>63</italic>
)</td>
<td align="left">(
<italic>64</italic>
)</td>
<td align="left">(
<italic>62</italic>
)</td>
<td align="left">(
<italic>39</italic>
)</td>
<td align="left">(
<italic>63</italic>
)</td>
<td align="left">(
<italic>37</italic>
)</td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<p>For
<italic>n</italic>
seeds of weight
<italic>w</italic>
, we indicate the maximum size of the dominant set found in our experiments over all the alignment lengths
<inline-formula id="IEq331">
<alternatives>
<tex-math id="M503">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\ell \in [s\ldots64]$$\end{document}</tex-math>
<mml:math id="M504">
<mml:mrow>
<mml:mi>ℓ</mml:mi>
<mml:mo>∈</mml:mo>
<mml:mo stretchy="false">[</mml:mo>
<mml:mi>s</mml:mi>
<mml:mo>…</mml:mo>
<mml:mn>64</mml:mn>
<mml:mo stretchy="false">]</mml:mo>
</mml:mrow>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq331.gif"></inline-graphic>
</alternatives>
</inline-formula>
. We also give the largest alignment length
<inline-formula id="IEq332">
<alternatives>
<tex-math id="M505">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$(\ell )$$\end{document}</tex-math>
<mml:math id="M506">
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>ℓ</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq332.gif"></inline-graphic>
</alternatives>
</inline-formula>
where this maximum has been reached</p>
</table-wrap-foot>
</table-wrap>
</p>
<p>So far, we restricted the span of our designed seeds to
<inline-formula id="IEq333">
<alternatives>
<tex-math id="M507">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$2 \times w$$\end{document}</tex-math>
<mml:math id="M508">
<mml:mrow>
<mml:mn>2</mml:mn>
<mml:mo>×</mml:mo>
<mml:mi>w</mml:mi>
</mml:mrow>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq333.gif"></inline-graphic>
</alternatives>
</inline-formula>
, and did not consider a single fixed probability
<italic>p</italic>
during the optimization process. These restrictions could of course be relaxed, but we note that the computed sensitivities are close to (even if not, strictly speaking, “better than”) the best ones reported in several publications [
<xref ref-type="bibr" rid="CR56">56</xref>
,
<xref ref-type="bibr" rid="CR77">77</xref>
,
<xref ref-type="bibr" rid="CR78">78</xref>
,
<xref ref-type="bibr" rid="CR80">80</xref>
], where the emphasis was on the heuristic used for designing seeds, the speed of the optimization algorithm, and the best seed for a fixed probability
<italic>p</italic>
. Table 
<xref rid="Tab3" ref-type="table">3</xref>
has been extracted from Table
<xref rid="Tab1" ref-type="table">1</xref>
of a recently published paper [
<xref ref-type="bibr" rid="CR80">80</xref>
] and summarizes the best known sensitivities.
<table-wrap id="Tab3">
<label>Table 3</label>
<caption>
<p>Sensitivity comparison of different programs</p>
</caption>
<table frame="hsides" rules="groups">
<thead>
<tr>
<th align="left">
<italic>w</italic>
</th>
<th align="left">
<italic>p</italic>
</th>
<th align="left">SpEED</th>
<th align="left">AcoSeed</th>
<th align="left">FastHC</th>
<th align="left">MuteHC</th>
<th align="left">Rasbhari</th>
<th align="left">Current sensitivity (
<inline-formula id="IEq337">
<alternatives>
<tex-math id="M509">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\delta$$\end{document}</tex-math>
<mml:math id="M510">
<mml:mi mathvariant="italic">δ</mml:mi>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq337.gif"></inline-graphic>
</alternatives>
</inline-formula>
)</th>
</tr>
</thead>
<tbody>
<tr>
<td align="left" rowspan="3">10</td>
<td char="." align="char">0.75</td>
<td char="." align="char">90.9098</td>
<td char="." align="char">90.9513</td>
<td char="." align="char">90.7312</td>
<td char="." align="char">
<italic>92.6812</italic>
</td>
<td char="." align="char">90.9614</td>
<td char="." align="char">90.8753 (1.8059%)</td>
</tr>
<tr>
<td char="." align="char">0.80</td>
<td char="." align="char">97.8337</td>
<td char="." align="char">97.8521</td>
<td char="." align="char">97.7625</td>
<td char="." align="char">
<italic>98.3836</italic>
</td>
<td char="." align="char">97.8554</td>
<td char="." align="char">97.8203 (0.5633%)</td>
</tr>
<tr>
<td char="." align="char">0.85</td>
<td char="." align="char">99.7569</td>
<td char="." align="char">99.7614</td>
<td char="." align="char">99.7431</td>
<td char="." align="char">
<italic>99.8356</italic>
</td>
<td char="." align="char">99.7618</td>
<td char="." align="char">99.7568 (0.0788%)</td>
</tr>
<tr>
<td align="left" rowspan="3">11</td>
<td char="." align="char">0.75</td>
<td char="." align="char">83.3793</td>
<td char="." align="char">
<italic>83.4728</italic>
</td>
<td char="." align="char">83.3068</td>
<td char="." align="char">83.4127</td>
<td char="." align="char">83.4679</td>
<td char="." align="char">83.4297 (0.0431%)</td>
</tr>
<tr>
<td char="." align="char">0.80</td>
<td char="." align="char">94.9861</td>
<td char="." align="char">95.037</td>
<td char="." align="char">94.9453</td>
<td char="." align="char">95.0194</td>
<td char="." align="char">
<italic>95.0386</italic>
</td>
<td char="." align="char">95.0127 (0.0259%)</td>
</tr>
<tr>
<td char="." align="char">0.85</td>
<td char="." align="char">99.2431</td>
<td char="." align="char">99.2478</td>
<td char="." align="char">99.2250</td>
<td char="." align="char">99.2486</td>
<td char="." align="char">
<italic>99.2506</italic>
</td>
<td char="." align="char">99.2452 (0.0054%)</td>
</tr>
<tr>
<td align="left" rowspan="3">12</td>
<td char="." align="char">0.80</td>
<td char="." align="char">90.5750</td>
<td char="." align="char">90.6328</td>
<td char="." align="char">90.4735</td>
<td char="." align="char">90.5820</td>
<td char="." align="char">
<italic>90.6648</italic>
</td>
<td char="." align="char">90.5571 (0.1077%)</td>
</tr>
<tr>
<td char="." align="char">0.85</td>
<td char="." align="char">98.1589</td>
<td char="." align="char">98.1766</td>
<td char="." align="char">98.1199</td>
<td char="." align="char">98.1670</td>
<td char="." align="char">
<italic>98.1824</italic>
</td>
<td char="." align="char">98.1591 (0.0233%)</td>
</tr>
<tr>
<td char="." align="char">0.90</td>
<td char="." align="char">99.8821</td>
<td char="." align="char">99.8853</td>
<td char="." align="char">99.8771</td>
<td char="." align="char">99.8836</td>
<td char="." align="char">
<italic>99.8864</italic>
</td>
<td char="." align="char">99.8840 (0.0024%)</td>
</tr>
<tr>
<td align="left" rowspan="3">16</td>
<td char="." align="char">0.85</td>
<td char="." align="char">84.8212</td>
<td char="." align="char">
<italic>84.9829</italic>
</td>
<td char="." align="char">84.6558</td>
<td char="." align="char">84.8764</td>
<td char="." align="char">84.969</td>
<td char="." align="char">84.9668 (0.0161%)</td>
</tr>
<tr>
<td char="." align="char">0.90</td>
<td char="." align="char">97.4321</td>
<td char="." align="char">97.4712</td>
<td char="." align="char">97.3556</td>
<td char="." align="char">97.4460</td>
<td char="." align="char">
<italic>97.5035</italic>
</td>
<td char="." align="char">97.4730 (0.0305%)</td>
</tr>
<tr>
<td char="." align="char">0.95</td>
<td char="." align="char">99.9388</td>
<td char="." align="char">99.9419</td>
<td char="." align="char">99.9347</td>
<td char="." align="char">99.9424</td>
<td char="." align="char">
<italic>99.9441</italic>
</td>
<td char="." align="char">99.9414 (0.0027%)</td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<p>Italic values indicate the best sensitivity</p>
<p>Sensitivities are reported for
<inline-formula id="IEq334">
<alternatives>
<tex-math id="M511">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$n=4$$\end{document}</tex-math>
<mml:math id="M512">
<mml:mrow>
<mml:mi>n</mml:mi>
<mml:mo>=</mml:mo>
<mml:mn>4</mml:mn>
</mml:mrow>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq334.gif"></inline-graphic>
</alternatives>
</inline-formula>
seeds of weight
<italic>w</italic>
on alignments of length
<inline-formula id="IEq335">
<alternatives>
<tex-math id="M513">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$l=50$$\end{document}</tex-math>
<mml:math id="M514">
<mml:mrow>
<mml:mi>l</mml:mi>
<mml:mo>=</mml:mo>
<mml:mn>50</mml:mn>
</mml:mrow>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq335.gif"></inline-graphic>
</alternatives>
</inline-formula>
under a Bernoulli model with a match probability
<italic>p</italic>
. All the reported results are extracted from Table
<xref rid="Tab1" ref-type="table">1</xref>
of [
<xref ref-type="bibr" rid="CR80">80</xref>
], except for the last column, which corresponds to our current public seeds, with a
<inline-formula id="IEq336">
<alternatives>
<tex-math id="M515">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\delta$$\end{document}</tex-math>
<mml:math id="M516">
<mml:mi mathvariant="italic">δ</mml:mi>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq336.gif"></inline-graphic>
</alternatives>
</inline-formula>
difference to the optimal seed</p>
</table-wrap-foot>
</table-wrap>
</p>
<p>Note that we did not use any
<italic>Overlap Complexity</italic>
/
<italic>Covariance</italic>
heuristic optimisation here (to stay in a generic framework), and simply applied the very simple hill-climbing algorithm of Iedera. We also note that our seeds are not necessarily the best ones, but since they are published, their sensitivity can be checked using other software, such as mandala [
<xref ref-type="bibr" rid="CR63">63</xref>
], SpEED [
<xref ref-type="bibr" rid="CR56">56</xref>
], or rasbhari [
<xref ref-type="bibr" rid="CR80">80</xref>
] ([
<xref ref-type="bibr" rid="CR43">43</xref>
,
<xref ref-type="bibr" rid="CR57">57</xref>
] did the same with the seeds obtained with the SpEED software).</p>
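<p>As a hedged illustration of how such an independent check can be done, the following sketch computes the hit probability of a set of seeds on alignments of length <italic>ℓ</italic> under a Bernoulli model with match probability <italic>p</italic>, by a plain dynamic programming over the last alignment symbols. It is not the algorithm implemented in Iedera or in the programs cited above: the function names <monospace>is_hit</monospace> and <monospace>hit_probability</monospace> are hypothetical, seeds are assumed to be 0/1 strings that start and end with a 1, and the number of states grows as 2<sup>span−1</sup>, so the sketch is only practical for short seeds.</p>

```python
def is_hit(window, seeds):
    """True if at least one seed matches the right end of the 0/1 window."""
    for seed in seeds:
        tail = window[-len(seed):]
        # Only the '1' (must-match) positions of the seed are checked.
        if all(a == "1" for a, s in zip(tail, seed) if s == "1"):
            return True
    return False


def hit_probability(seeds, length, p):
    """Probability that some seed hits a Bernoulli(p) alignment of given length."""
    span = max(len(seed) for seed in seeds)
    # Probability mass of alignments not hit so far, indexed by their last
    # span-1 symbols. The window is padded with mismatches on the left, which
    # is harmless because every seed is assumed to start with a '1'.
    not_hit = {"0" * (span - 1): 1.0}
    for _ in range(length):
        following = {}
        for window, mass in not_hit.items():
            for symbol, prob in (("1", p), ("0", 1.0 - p)):
                full = window + symbol
                if not is_hit(full, seeds):
                    key = full[1:]
                    following[key] = following.get(key, 0.0) + mass * prob
        not_hit = following
    return 1.0 - sum(not_hit.values())
```

<p>For example, <monospace>hit_probability(["11", "101"], 3, 0.5)</monospace> evaluates a two-seed set on alignments of length 3. The same call with four seeds of weight <italic>w</italic> and length 50 would correspond to the setting of the table above, but at a cost that this naive state space makes prohibitive for long spans.</p>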
<p>Finally, to show a typical output of this generalized parameter-free approach, optimal single (
<inline-formula id="IEq350">
<alternatives>
<tex-math id="M517">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$n=1$$\end{document}</tex-math>
<mml:math id="M518">
<mml:mrow>
<mml:mi>n</mml:mi>
<mml:mo>=</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq350.gif"></inline-graphic>
</alternatives>
</inline-formula>
) seeds of weight
<inline-formula id="IEq351">
<alternatives>
<tex-math id="M519">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$w=11$$\end{document}</tex-math>
<mml:math id="M520">
<mml:mrow>
<mml:mi>w</mml:mi>
<mml:mo>=</mml:mo>
<mml:mn>11</mml:mn>
</mml:mrow>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq351.gif"></inline-graphic>
</alternatives>
</inline-formula>
have been plotted according to the main parameter of each model (horizontal axis) and the length
<inline-formula id="IEq352">
<alternatives>
<tex-math id="M521">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\ell$$\end{document}</tex-math>
<mml:math id="M522">
<mml:mi>ℓ</mml:mi>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq352.gif"></inline-graphic>
</alternatives>
</inline-formula>
of the alignment (vertical axis) in Figs.  
<xref rid="Fig5" ref-type="fig">5</xref>
 and 
<xref rid="Fig6" ref-type="fig">6</xref>
. On discrete models, a pink mark represents the lossless border: seeds to the right of this border are necessarily
<bold>lossless</bold>
for the set of parameters. On the right margin of the discrete models, we indicate the fraction of the minimum number of matches
<italic>m</italic>
over the alignment length
<inline-formula id="IEq353">
<alternatives>
<tex-math id="M523">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\ell$$\end{document}</tex-math>
<mml:math id="M524">
<mml:mi>ℓ</mml:mi>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq353.gif"></inline-graphic>
</alternatives>
</inline-formula>
to be
<italic>lossless</italic>
.</p>
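<p>The notion of lossless border can be illustrated with a small brute-force check. The sketch below assumes that a seed set is lossless for a pair (<italic>ℓ</italic>, <italic>m</italic>) when it hits every alignment of length <italic>ℓ</italic> with at least <italic>m</italic> matches. Since turning a mismatch into a match never removes a hit, it is enough to enumerate the alignments with exactly <italic>ℓ</italic> − <italic>m</italic> mismatches. The names <monospace>seed_hits</monospace> and <monospace>is_lossless</monospace> are hypothetical, and this enumeration is not the counting method used in the paper.</p>

```python
from itertools import combinations


def seed_hits(seed, alignment):
    """True if the seed matches the alignment at some position."""
    span = len(seed)
    return any(
        all(alignment[start + i] == "1" for i, c in enumerate(seed) if c == "1")
        for start in range(len(alignment) - span + 1)
    )


def is_lossless(seeds, length, min_matches):
    """Brute-force test: every alignment with min_matches matches must be hit.

    Assumes 0 ≤ min_matches ≤ length.
    """
    mismatches = length - min_matches
    for positions in combinations(range(length), mismatches):
        alignment = ["1"] * length
        for pos in positions:
            alignment[pos] = "0"
        if not any(seed_hits(seed, alignment) for seed in seeds):
            return False  # this alignment escapes all the seeds
    return True
```

<p>With the contiguous seed <monospace>111</monospace>, the call <monospace>is_lossless(["111"], 6, 5)</monospace> returns <monospace>True</monospace>, whereas <monospace>is_lossless(["111"], 5, 4)</monospace> returns <monospace>False</monospace> because the alignment <monospace>11011</monospace> is missed.</p>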
<p>We provide the scripts and the whole set of single and multiple seeds, in
<ext-link ext-link-type="uri" xlink:href="http://bioinfo.cristal.univ-lille.fr/yass/iedera%5fdominance">http://bioinfo.cristal.univ-lille.fr/yass/iedera_dominance</ext-link>
in the hope that this will be useful to alignment software and to spaced-seed alignment-free metagenomic classifiers.</p>
</sec>
<sec id="Sec9">
<title>Discussion</title>
<p>In this paper, we have presented a generalization of the usage of dominant seeds, first on the Hit Integration model with a parameter-free approach, and also on two new discrete models (named Dirac and Heaviside) that are related to lossless seeds. In this parameter-free context, we show that all these models can be computed with the help of a method for counting alignments of particular classes, themselves represented by regular languages, and a counting semi-ring to perform an efficient set size computation.</p>
<p>We open the discussion with the complementary asymptotic problem, before going to finite but multivariate model extensions.</p>
<sec id="Sec10">
<title>Complementary asymptotic problem</title>
<p>So far, we have only considered a set of finite alignment lengths
<inline-formula id="IEq354">
<alternatives>
<tex-math id="M525">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\ell$$\end{document}</tex-math>
<mml:math id="M526">
<mml:mi>ℓ</mml:mi>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq354.gif"></inline-graphic>
</alternatives>
</inline-formula>
to design seeds.
<italic>But </italic>
limiting the length is far from satisfactory, so the following problem also deserves consideration: the asymptotic hit probability of seeds [
<xref ref-type="bibr" rid="CR63">63</xref>
,
<xref ref-type="bibr" rid="CR89">89</xref>
–
<xref ref-type="bibr" rid="CR91">91</xref>
].</p>
<p>As an example, if we consider the Bernoulli model where we choose
<italic>p</italic>
in the interval ]0, 1[, and then consider the probability
<inline-formula id="IEq355">
<alternatives>
<tex-math id="M527">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$Pr_\pi (p,\ell )$$\end{document}</tex-math>
<mml:math id="M528">
<mml:mrow>
<mml:mi>P</mml:mi>
<mml:msub>
<mml:mi>r</mml:mi>
<mml:mi mathvariant="italic">π</mml:mi>
</mml:msub>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>p</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>ℓ</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq355.gif"></inline-graphic>
</alternatives>
</inline-formula>
for
<inline-formula id="IEq356">
<alternatives>
<tex-math id="M529">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\pi$$\end{document}</tex-math>
<mml:math id="M530">
<mml:mi mathvariant="italic">π</mml:mi>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq356.gif"></inline-graphic>
</alternatives>
</inline-formula>
to hit an alignment of length
<inline-formula id="IEq357">
<alternatives>
<tex-math id="M531">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\ell$$\end{document}</tex-math>
<mml:math id="M532">
<mml:mi>ℓ</mml:mi>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq357.gif"></inline-graphic>
</alternatives>
</inline-formula>
(denoted
<inline-formula id="IEq358">
<alternatives>
<tex-math id="M533">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$Pr_\pi (\ell )$$\end{document}</tex-math>
<mml:math id="M534">
<mml:mrow>
<mml:mi>P</mml:mi>
<mml:msub>
<mml:mi>r</mml:mi>
<mml:mi mathvariant="italic">π</mml:mi>
</mml:msub>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>ℓ</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq358.gif"></inline-graphic>
</alternatives>
</inline-formula>
to simplify), then it can be shown that the complementary probability
<inline-formula id="IEq359">
<alternatives>
<tex-math id="M535">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\overline{Pr_\pi (\ell )}$$\end{document}</tex-math>
<mml:math id="M536">
<mml:mover>
<mml:mrow>
<mml:mi>P</mml:mi>
<mml:msub>
<mml:mi>r</mml:mi>
<mml:mi mathvariant="italic">π</mml:mi>
</mml:msub>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>ℓ</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mo>¯</mml:mo>
</mml:mover>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq359.gif"></inline-graphic>
</alternatives>
</inline-formula>
 [see for example
<xref ref-type="bibr" rid="CR91">91</xref>
, equation (3)] follows
<disp-formula id="Equ5">
<alternatives>
<tex-math id="M537">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\begin{aligned} \lim _{\ell \rightarrow \infty } \overline{Pr_\pi (\ell )} = \beta _\pi \lambda _\pi ^\ell \big (1+o(1)\big ) \end{aligned}$$\end{document}</tex-math>
<mml:math id="M538" display="block">
<mml:mrow>
<mml:mtable columnspacing="0.5ex">
<mml:mtr>
<mml:mtd columnalign="right">
<mml:mrow>
<mml:munder>
<mml:mo movablelimits="true">lim</mml:mo>
<mml:mrow>
<mml:mi>ℓ</mml:mi>
<mml:mo stretchy="false">→</mml:mo>
<mml:mi>∞</mml:mi>
</mml:mrow>
</mml:munder>
<mml:mover>
<mml:mrow>
<mml:mi>P</mml:mi>
<mml:msub>
<mml:mi>r</mml:mi>
<mml:mi mathvariant="italic">π</mml:mi>
</mml:msub>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>ℓ</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mo>¯</mml:mo>
</mml:mover>
<mml:mo>=</mml:mo>
<mml:msub>
<mml:mi mathvariant="italic">β</mml:mi>
<mml:mi mathvariant="italic">π</mml:mi>
</mml:msub>
<mml:msubsup>
<mml:mi mathvariant="italic">λ</mml:mi>
<mml:mi mathvariant="italic">π</mml:mi>
<mml:mi>ℓ</mml:mi>
</mml:msubsup>
<mml:mrow>
<mml:mo maxsize="1.2em" minsize="1.2em" stretchy="true">(</mml:mo>
</mml:mrow>
<mml:mn>1</mml:mn>
<mml:mo>+</mml:mo>
<mml:mi>o</mml:mi>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mn>1</mml:mn>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:mo maxsize="1.2em" minsize="1.2em" stretchy="true">)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:mtd>
</mml:mtr>
</mml:mtable>
</mml:mrow>
</mml:math>
<graphic xlink:href="13015_2017_92_Article_Equ5.gif" position="anchor"></graphic>
</alternatives>
</disp-formula>
Here
<inline-formula id="IEq360">
<alternatives>
<tex-math id="M539">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\lambda _\pi$$\end{document}</tex-math>
<mml:math id="M540">
<mml:msub>
<mml:mi mathvariant="italic">λ</mml:mi>
<mml:mi mathvariant="italic">π</mml:mi>
</mml:msub>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq360.gif"></inline-graphic>
</alternatives>
</inline-formula>
is the largest (positive) eigenvalue of the sub-stochastic matrix of
<inline-formula id="IEq361">
<alternatives>
<tex-math id="M541">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\pi$$\end{document}</tex-math>
<mml:math id="M542">
<mml:mi mathvariant="italic">π</mml:mi>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq361.gif"></inline-graphic>
</alternatives>
</inline-formula>
from which final states have been removed; this matrix thus gives the distribution
<inline-formula id="IEq362">
<alternatives>
<tex-math id="M543">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\overline{Pr_\pi (\ell )}$$\end{document}</tex-math>
<mml:math id="M544">
<mml:mover>
<mml:mrow>
<mml:mi>P</mml:mi>
<mml:msub>
<mml:mi>r</mml:mi>
<mml:mi mathvariant="italic">π</mml:mi>
</mml:msub>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>ℓ</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mo>¯</mml:mo>
</mml:mover>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq362.gif"></inline-graphic>
</alternatives>
</inline-formula>
when raised to the power
<inline-formula id="IEq363">
<alternatives>
<tex-math id="M545">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\ell$$\end{document}</tex-math>
<mml:math id="M546">
<mml:mi>ℓ</mml:mi>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq363.gif"></inline-graphic>
</alternatives>
</inline-formula>
 (see section 3.1 for
<inline-formula id="IEq364">
<alternatives>
<tex-math id="M547">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\lambda _\pi$$\end{document}</tex-math>
<mml:math id="M548">
<mml:msub>
<mml:mi mathvariant="italic">λ</mml:mi>
<mml:mi mathvariant="italic">π</mml:mi>
</mml:msub>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq364.gif"></inline-graphic>
</alternatives>
</inline-formula>
and
<inline-formula id="IEq365">
<alternatives>
<tex-math id="M549">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\beta _\pi$$\end{document}</tex-math>
<mml:math id="M550">
<mml:msub>
<mml:mi mathvariant="italic">β</mml:mi>
<mml:mi mathvariant="italic">π</mml:mi>
</mml:msub>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq365.gif"></inline-graphic>
</alternatives>
</inline-formula>
of [
<xref ref-type="bibr" rid="CR63">63</xref>
]).</p>
<p>As an example, for
<inline-formula id="IEq366">
<alternatives>
<tex-math id="M551">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$p = 0.7$$\end{document}</tex-math>
<mml:math id="M552">
<mml:mrow>
<mml:mi>p</mml:mi>
<mml:mo>=</mml:mo>
<mml:mn>0.7</mml:mn>
</mml:mrow>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq366.gif"></inline-graphic>
</alternatives>
</inline-formula>
and for the Patternhunter I spaced seed, we have (with the help of a Maple script)
<inline-formula id="IEq367">
<alternatives>
<tex-math id="M553">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\{\lambda ,\beta \}_\mathtt{111010010100110111} = \{0.98731,0.22667\}$$\end{document}</tex-math>
<mml:math id="M554">
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mo stretchy="false">{</mml:mo>
<mml:mi mathvariant="italic">λ</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi mathvariant="italic">β</mml:mi>
<mml:mo stretchy="false">}</mml:mo>
</mml:mrow>
<mml:mn mathvariant="monospace">111010010100110111</mml:mn>
</mml:msub>
<mml:mo>=</mml:mo>
<mml:mrow>
<mml:mo stretchy="false">{</mml:mo>
<mml:mn>0.98731</mml:mn>
<mml:mo>,</mml:mo>
<mml:mn>0.22667</mml:mn>
<mml:mo stretchy="false">}</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq367.gif"></inline-graphic>
</alternatives>
</inline-formula>
, which can be compared with the contiguous seed of the same weight
<inline-formula id="IEq368">
<alternatives>
<tex-math id="M555">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\{\lambda ,\beta \}_\mathtt{11111111111} = \{0.99364,0.44784\}$$\end{document}</tex-math>
<mml:math id="M556">
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mo stretchy="false">{</mml:mo>
<mml:mi mathvariant="italic">λ</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi mathvariant="italic">β</mml:mi>
<mml:mo stretchy="false">}</mml:mo>
</mml:mrow>
<mml:mn mathvariant="monospace">11111111111</mml:mn>
</mml:msub>
<mml:mo>=</mml:mo>
<mml:mrow>
<mml:mo stretchy="false">{</mml:mo>
<mml:mn>0.99364</mml:mn>
<mml:mo>,</mml:mo>
<mml:mn>0.44784</mml:mn>
<mml:mo stretchy="false">}</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq368.gif"></inline-graphic>
</alternatives>
</inline-formula>
. [
<xref ref-type="bibr" rid="CR63">63</xref>
] have proven that, in the class of seeds with the same weight, contiguous seeds have the largest value
<inline-formula id="IEq369">
<alternatives>
<tex-math id="M557">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\lambda$$\end{document}</tex-math>
<mml:math id="M558">
<mml:mi mathvariant="italic">λ</mml:mi>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq369.gif"></inline-graphic>
</alternatives>
</inline-formula>
and are thus the asymptotic worst case in terms of hit probability, a trait shared with the
<italic>uniformly spaced</italic>
seeds of the same weight (e.g. 101010101010101010101 or 1001001001001001001001001001001).</p>
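<p>To make the role of the eigenvalue <italic>λ</italic> more concrete, here is a minimal numerical sketch for a single seed and a fixed value of <italic>p</italic>. It assumes a naive automaton whose states are the last span−1 alignment symbols, drops every transition that produces a hit, and estimates the largest eigenvalue of the resulting sub-stochastic matrix by power iteration, that is, as the limit of the ratio between the non-hit probabilities at two consecutive lengths. This is neither the Maple script mentioned above nor the construction of the cited works; the function name <monospace>largest_eigenvalue</monospace> and the default number of iterations are hypothetical choices, and the coefficient <italic>β</italic> is not computed.</p>

```python
def largest_eigenvalue(seed, p, iterations=200):
    """Power-iteration estimate of lambda for one seed under Bernoulli(p)."""
    span = len(seed)
    # Normalised distribution over the last span-1 symbols of non-hit alignments.
    mass = {"0" * (span - 1): 1.0}
    lam = 0.0
    for _ in range(iterations):
        following = {}
        for window, weight in mass.items():
            for symbol, prob in (("1", p), ("0", 1.0 - p)):
                full = window + symbol
                # Keep the transition only if some required position mismatches.
                if any(a == "0" for a, s in zip(full, seed) if s == "1"):
                    key = full[1:]
                    following[key] = following.get(key, 0.0) + weight * prob
        # Since mass sums to 1, this sum is the one-step survival ratio.
        lam = sum(following.values())
        mass = {key: value / lam for key, value in following.items()}
    return lam
```

<p>Under these assumptions, <monospace>largest_eigenvalue("11111111111", 0.7)</monospace> should return a value that agrees with the 0.99364 quoted above for the contiguous seed of weight 11. The same call on the Patternhunter I seed requires about 2<sup>17</sup> states per iteration, which this pure Python sketch handles only slowly.</p>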
<p>Comparing seeds asymptotically can thus be done easily by comparing their respective
<inline-formula id="IEq370">
<alternatives>
<tex-math id="M559">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\lambda$$\end{document}</tex-math>
<mml:math id="M560">
<mml:mi mathvariant="italic">λ</mml:mi>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq370.gif"></inline-graphic>
</alternatives>
</inline-formula>
eigenvalues, or their
<inline-formula id="IEq371">
<alternatives>
<tex-math id="M561">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\beta$$\end{document}</tex-math>
<mml:math id="M562">
<mml:mi mathvariant="italic">β</mml:mi>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq371.gif"></inline-graphic>
</alternatives>
</inline-formula>
when
<inline-formula id="IEq372">
<alternatives>
<tex-math id="M563">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\lambda$$\end{document}</tex-math>
<mml:math id="M564">
<mml:mi mathvariant="italic">λ</mml:mi>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq372.gif"></inline-graphic>
</alternatives>
</inline-formula>
equality occurs, but it seems to be
<italic>computationally possible</italic>
<xref ref-type="fn" rid="Fn9">9</xref>
only if
<italic>p</italic>
is set numerically before the analysis.</p>
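<p>The asymptotic comparison described above can be illustrated numerically once <italic>p</italic> is fixed. The following minimal sketch is not the procedure used in this paper: it assumes a Bernoulli model with match probability <italic>p</italic> and a seed that starts and ends with 1, and it estimates λ by power iteration over the transitions between non-hit states. The function name <monospace>non_hit_growth_rate</monospace> and its parameters are hypothetical.</p>
<preformat preformat-type="code"><![CDATA[
```python
def non_hit_growth_rate(seed: str, p: float, iterations: int = 500) -> float:
    """Estimate the dominant eigenvalue (lambda) of the non-hit transitions.

    Assumptions (illustrative only): Bernoulli alignments with match
    probability p, and a seed written with '1' (must match) and '0'
    (don't care) that starts and ends with '1'.
    """
    if not seed or set(seed) - {"0", "1"} or seed[0] != "1" or seed[-1] != "1":
        raise ValueError("seed must be a 0/1 string starting and ending with 1")
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")

    span = len(seed)
    seed_mask = int(seed, 2)
    n_states = 1 << (span - 1)      # a state stores the last span-1 symbols
    state_mask = n_states - 1

    vec = [1.0 / n_states] * n_states
    rate = 0.0
    for _ in range(iterations):
        nxt = [0.0] * n_states
        for state, mass in enumerate(vec):
            if mass == 0.0:
                continue
            for bit, prob in ((1, p), (0, 1.0 - p)):
                window = (state << 1) | bit
                if (window & seed_mask) == seed_mask:
                    continue        # the seed hits: this mass is discarded
                nxt[window & state_mask] += mass * prob
        rate = sum(nxt)             # surviving mass per step, tends to lambda
        vec = [x / rate for x in nxt]
    return rate
```
]]></preformat>
<p>Under these assumptions, a smaller returned value means a faster decay of the non-hit probability for the chosen <italic>p</italic>. The state vector has 2 to the power (span minus 1) entries, so this pure Python sketch is only practical for short spans.</p>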
<p>Moreover,
<italic>dominant seeds</italic>
extracted in this paper for a limited alignment length
<inline-formula id="IEq373">
<alternatives>
<tex-math id="M565">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\ell$$\end{document}</tex-math>
<mml:math id="M566">
<mml:mi>ℓ</mml:mi>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq373.gif"></inline-graphic>
</alternatives>
</inline-formula>
(here
<inline-formula id="IEq374">
<alternatives>
<tex-math id="M567">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\ell \le 64$$\end{document}</tex-math>
<mml:math id="M568">
<mml:mrow>
<mml:mi>ℓ</mml:mi>
<mml:mo>≤</mml:mo>
<mml:mn>64</mml:mn>
</mml:mrow>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq374.gif"></inline-graphic>
</alternatives>
</inline-formula>
) would not necessarily be optimal for arbitrary
<inline-formula id="IEq375">
<alternatives>
<tex-math id="M569">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\ell$$\end{document}</tex-math>
<mml:math id="M570">
<mml:mi>ℓ</mml:mi>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq375.gif"></inline-graphic>
</alternatives>
</inline-formula>
: such seeds can, however, be justified as “good” candidates for seeds of restricted span (e.g.
<inline-formula id="IEq376">
<alternatives>
<tex-math id="M571">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$s \le 2\times w$$\end{document}</tex-math>
<mml:math id="M572">
<mml:mrow>
<mml:mi>s</mml:mi>
<mml:mo>≤</mml:mo>
<mml:mn>2</mml:mn>
<mml:mo>×</mml:mo>
<mml:mi>w</mml:mi>
</mml:mrow>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq376.gif"></inline-graphic>
</alternatives>
</inline-formula>
), but definitely not the optimal ones, unless dominance is computed on a wider range of alignment length
<inline-formula id="IEq377">
<alternatives>
<tex-math id="M573">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\ell$$\end{document}</tex-math>
<mml:math id="M574">
<mml:mi>ℓ</mml:mi>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq377.gif"></inline-graphic>
</alternatives>
</inline-formula>
values.</p>
<p>For example, the best (smallest)
<inline-formula id="IEq378">
<alternatives>
<tex-math id="M575">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\lambda$$\end{document}</tex-math>
<mml:math id="M576">
<mml:mi mathvariant="italic">λ</mml:mi>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq378.gif"></inline-graphic>
</alternatives>
</inline-formula>
for any
<italic>dominant</italic>
seed of weight
<inline-formula id="IEq379">
<alternatives>
<tex-math id="M577">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$w=11$$\end{document}</tex-math>
<mml:math id="M578">
<mml:mrow>
<mml:mi>w</mml:mi>
<mml:mo>=</mml:mo>
<mml:mn>11</mml:mn>
</mml:mrow>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq379.gif"></inline-graphic>
</alternatives>
</inline-formula>
and span at most
<inline-formula id="IEq380">
<alternatives>
<tex-math id="M579">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$2 \times w$$\end{document}</tex-math>
<mml:math id="M580">
<mml:mrow>
<mml:mn>2</mml:mn>
<mml:mo>×</mml:mo>
<mml:mi>w</mml:mi>
</mml:mrow>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq380.gif"></inline-graphic>
</alternatives>
</inline-formula>
, on alignments of length
<inline-formula id="IEq381">
<alternatives>
<tex-math id="M581">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\ell \le 64$$\end{document}</tex-math>
<mml:math id="M582">
<mml:mrow>
<mml:mi>ℓ</mml:mi>
<mml:mo>≤</mml:mo>
<mml:mn>64</mml:mn>
</mml:mrow>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq381.gif"></inline-graphic>
</alternatives>
</inline-formula>
is 0.98714 for the seed 1110010100110010111. Surprisingly, even if this seed reaches the smallest
<inline-formula id="IEq382">
<alternatives>
<tex-math id="M583">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\lambda$$\end{document}</tex-math>
<mml:math id="M584">
<mml:mi mathvariant="italic">λ</mml:mi>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq382.gif"></inline-graphic>
</alternatives>
</inline-formula>
within its
<italic>dominant</italic>
class, it never appeared among the
<italic>optimal</italic>
seeds in any of our experiments. Moreover, we have checked that another seed, 1110010100100100010111, has an even smaller
<inline-formula id="IEq383">
<alternatives>
<tex-math id="M585">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\lambda = 0.98669$$\end{document}</tex-math>
<mml:math id="M586">
<mml:mrow>
<mml:mi mathvariant="italic">λ</mml:mi>
<mml:mo>=</mml:mo>
<mml:mn>0.98669</mml:mn>
</mml:mrow>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq383.gif"></inline-graphic>
</alternatives>
</inline-formula>
: this last seed was not dominant for
<inline-formula id="IEq384">
<alternatives>
<tex-math id="M587">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\ell \le 64$$\end{document}</tex-math>
<mml:math id="M588">
<mml:mrow>
<mml:mi>ℓ</mml:mi>
<mml:mo>≤</mml:mo>
<mml:mn>64</mml:mn>
</mml:mrow>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq384.gif"></inline-graphic>
</alternatives>
</inline-formula>
, but would be in the class of seeds of span at most
<inline-formula id="IEq385">
<alternatives>
<tex-math id="M589">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$2 \times w$$\end{document}</tex-math>
<mml:math id="M590">
<mml:mrow>
<mml:mn>2</mml:mn>
<mml:mo>×</mml:mo>
<mml:mi>w</mml:mi>
</mml:mrow>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq385.gif"></inline-graphic>
</alternatives>
</inline-formula>
if larger values of
<inline-formula id="IEq386">
<alternatives>
<tex-math id="M591">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\ell$$\end{document}</tex-math>
<mml:math id="M592">
<mml:mi>ℓ</mml:mi>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq386.gif"></inline-graphic>
</alternatives>
</inline-formula>
were selected.</p>
<p>Finally, a parameter-free analysis involving both
<italic>p</italic>
and
<inline-formula id="IEq387">
<alternatives>
<tex-math id="M593">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\ell$$\end{document}</tex-math>
<mml:math id="M594">
<mml:mi>ℓ</mml:mi>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq387.gif"></inline-graphic>
</alternatives>
</inline-formula>
seems difficult to apply to large seeds. It is interesting to note that several of our preliminary experiments
<italic>suggest</italic>
that, asymptotically, and only
<xref ref-type="fn" rid="Fn10">10</xref>
for a
<italic>restricted set</italic>
of seeds (e.g. of weight
<inline-formula id="IEq389">
<alternatives>
<tex-math id="M595">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$w=11$$\end{document}</tex-math>
<mml:math id="M596">
<mml:mrow>
<mml:mi>w</mml:mi>
<mml:mo>=</mml:mo>
<mml:mn>11</mml:mn>
</mml:mrow>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq389.gif"></inline-graphic>
</alternatives>
</inline-formula>
and span at most
<inline-formula id="IEq390">
<alternatives>
<tex-math id="M597">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$2 \times w$$\end{document}</tex-math>
<mml:math id="M598">
<mml:mrow>
<mml:mn>2</mml:mn>
<mml:mo>×</mml:mo>
<mml:mi>w</mml:mi>
</mml:mrow>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq390.gif"></inline-graphic>
</alternatives>
</inline-formula>
),
<italic>one seed is optimal whatever the value of p</italic>
. This remains to be confirmed experimentally and theoretically, because special cases may exist in which two or more seeds share the
<italic>p</italic>
partition.</p>
</sec>
<sec id="Sec11">
<title>Models and multivariate analysis</title>
<p>As far as
<italic>i.i.d. sequences</italic>
are considered, the full framework of [
<xref ref-type="bibr" rid="CR1">1</xref>
], including the dominant seed selection, can be applied on
<italic>any extended spaced seed model</italic>
(such as transition-constrained seeds, vector seeds, indel seeds, ...). However, additional free parameters (such as the transition/transversion rate, the indel/mismatch rate, ...) lead to an increase in the number of alignment classes (for example, alignments of length
<inline-formula id="IEq391">
<alternatives>
<tex-math id="M599">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\ell$$\end{document}</tex-math>
<mml:math id="M600">
<mml:mi>ℓ</mml:mi>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq391.gif"></inline-graphic>
</alternatives>
</inline-formula>
, with
<italic>i</italic>
indels,
<italic>v</italic>
transversion errors,
<italic>t</italic>
transition errors, and the remaining
<italic>m</italic>
matches, such that
<inline-formula id="IEq392">
<alternatives>
<tex-math id="M601">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\ell =i+v+t+m$$\end{document}</tex-math>
<mml:math id="M602">
<mml:mrow>
<mml:mi>ℓ</mml:mi>
<mml:mo>=</mml:mo>
<mml:mi>i</mml:mi>
<mml:mo>+</mml:mo>
<mml:mi>v</mml:mi>
<mml:mo>+</mml:mo>
<mml:mi>t</mml:mi>
<mml:mo>+</mml:mo>
<mml:mi>m</mml:mi>
</mml:mrow>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq392.gif"></inline-graphic>
</alternatives>
</inline-formula>
) that have to be considered by the dominance selection. Moreover, leaving more than one parameter free at this point involves a much more complex multivariate polynomial analysis.</p>
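<p>To give an idea of this increase, the following minimal sketch counts the alignment classes under the decomposition stated above, where a class is a tuple of per-symbol counts that sum to the alignment length. The function names are hypothetical, and the count only reflects this simple combinatorial reading of the classes, not any implementation detail of the framework of [1].</p>
<preformat preformat-type="code"><![CDATA[
```python
from math import comb


def count_alignment_classes(length: int, n_symbols: int) -> int:
    """Number of ways to write `length` as an ordered sum of `n_symbols`
    non-negative counts (stars and bars)."""
    return comb(length + n_symbols - 1, n_symbols - 1)


def enumerate_classes(length: int):
    """Yield every (i, v, t, m) such that i + v + t + m == length."""
    for i in range(length + 1):
        for v in range(length - i + 1):
            for t in range(length - i - v + 1):
                yield (i, v, t, length - i - v - t)


# Match/mismatch only: one class per number of matches.
bernoulli_classes = count_alignment_classes(64, 2)
# Indels, transversions, transitions and matches.
extended_classes = count_alignment_classes(64, 4)
```
]]></preformat>
<p>With an alignment length of 64, the two-symbol case gives 65 classes, whereas the four-symbol case gives 47905 classes, each of which would have to be handled by the dominance selection.</p>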
<p>More generally, if
<italic>i.i.d. sequences</italic>
are ignored, and dominant seed selection thus abandoned in its original form, one could mix several numerically-fixed models: for example, mixing a given HMM representing coding sequences with a numerically-fixed Bernoulli model. The idea here is to use a
<italic>free probability parameter</italic>
to balance the two models: either initially, before generating the alignment, to choose one of the two models; or during the alignment generation process, to switch between the two models. The designed seeds could thus be
<italic>two-handed</italic>
for analyzing both coding and non-coding genomic sequences at the same time, but with an additional control parameter that helps to change the known percentage of such genomic sequences. To compute the sensitivity in this model, a simple idea is to apply a polynomial semi-ring (with at least one free variable: here the one used to create the balance) on the automaton, and to perform a symbolic rather than a numeric computation.</p>
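<p>The symbolic computation mentioned above can be sketched on a deliberately simplified case. The assumptions are ours and are not part of the original method: both models are reduced to Bernoulli models with fixed match probabilities <monospace>p_a</monospace> and <monospace>p_b</monospace> (the HMM is not modelled), the model is redrawn independently at each position with a free probability alpha of choosing the first one, the seed starts with 1, and polynomials in alpha are stored as coefficient lists. All function names are hypothetical.</p>
<preformat preformat-type="code"><![CDATA[
```python
def poly_add(a, b):
    """Add two polynomials given as coefficient lists (lowest degree first)."""
    n = max(len(a), len(b))
    return [(a[k] if k < len(a) else 0.0) + (b[k] if k < len(b) else 0.0)
            for k in range(n)]


def poly_mul(a, b):
    """Multiply two polynomials given as coefficient lists."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out


def hit_polynomial(seed, length, p_a, p_b):
    """Hit probability of `seed` on alignments of `length`, as a polynomial
    in the free mixing parameter alpha (illustrative simplification)."""
    span = len(seed)
    seed_mask = int(seed, 2)
    state_mask = (1 << (span - 1)) - 1
    match = [p_b, p_a - p_b]             # alpha*p_a + (1 - alpha)*p_b
    mismatch = [1.0 - p_b, p_b - p_a]    # 1 - match
    weights = {0: [1.0]}                 # state = last span-1 symbols
    for _ in range(length):
        nxt = {}
        for state, poly in weights.items():
            for bit, step in ((1, match), (0, mismatch)):
                window = (state << 1) | bit
                if (window & seed_mask) == seed_mask:
                    continue             # seed hit: leaves the non-hit states
                target = window & state_mask
                term = poly_mul(poly, step)
                nxt[target] = poly_add(nxt[target], term) if target in nxt else term
        weights = nxt
    non_hit = [0.0]
    for poly in weights.values():
        non_hit = poly_add(non_hit, poly)
    return poly_add([1.0], [-c for c in non_hit])


def evaluate(poly, alpha):
    """Evaluate a coefficient list at a numeric value of alpha."""
    return sum(c * alpha ** k for k, c in enumerate(poly))
```
]]></preformat>
<p>In this sketch the polynomial is computed once, and any numeric value of alpha can then be substituted with <monospace>evaluate</monospace> without running the automaton again, which is the point of keeping the parameter free.</p>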
<p>Finally, as a logical consequence of the two previous remarks, we mention that any HMM with one (or possibly several) free probability parameter(s) could always be analysed with a (multivariate) polynomial semi-ring, thus extending the scope of the method to applications that depend on finite state machines: such parameter-free pre-processing can, at some point, be applied; moreover, if several equivalence classes are established in terms of probability, it may be possible to use an equivalent dominance method to filter out candidates when comparing several elements.</p>
</sec>
</sec>
</body>
<back>
<fn-group>
<fn id="Fn1">
<label>1</label>
<p>We mention an interesting analysis in [
<xref ref-type="bibr" rid="CR92">92</xref>
].</p>
</fn>
<fn id="Fn2">
<label>2</label>
<p>The opposite
<italic>is equivalent</italic>
to saying that
<italic>at least one</italic>
string of length
<inline-formula id="IEq129">
<alternatives>
<tex-math id="M603">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\ell$$\end{document}</tex-math>
<mml:math id="M604">
<mml:mi>ℓ</mml:mi>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq129.gif"></inline-graphic>
</alternatives>
</inline-formula>
with
<inline-formula id="IEq130">
<alternatives>
<tex-math id="M605">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\le k$$\end{document}</tex-math>
<mml:math id="M606">
<mml:mrow>
<mml:mo>≤</mml:mo>
<mml:mi>k</mml:mi>
</mml:mrow>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq130.gif"></inline-graphic>
</alternatives>
</inline-formula>
mismatches is not hit by the seed; in other words, that the seed is not
<inline-formula id="IEq131">
<alternatives>
<tex-math id="M607">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$(\ell ,k)$$\end{document}</tex-math>
<mml:math id="M608">
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>ℓ</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>k</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq131.gif"></inline-graphic>
</alternatives>
</inline-formula>
-lossless. Note that
<italic>k</italic>
does not need to be initially set: it can be estimated using this requirement, even after the DP run.</p>
</fn>
<fn id="Fn3">
<label>3</label>
<p>Technical details are available at
<ext-link ext-link-type="uri" xlink:href="http://bioinfo.cristal.univ-lille.fr/yass/iedera%5fcoverage/index%5fadditional.html">http://bioinfo.cristal.univ-lille.fr/yass/iedera_coverage/index_additional.html</ext-link>
.</p>
</fn>
<fn id="Fn4">
<label>4</label>
<p>This side result is not discussed in [
<xref ref-type="bibr" rid="CR2">2</xref>
], probably because they were more interested in the seed rank and not necessarily in the “optimal seed”, which they sometimes called “dominant”.</p>
</fn>
<fn id="Fn5">
<label>5</label>
<p>As already observed by [
<xref ref-type="bibr" rid="CR63">63</xref>
].</p>
</fn>
<fn id="Fn6">
<label>6</label>
<p>As already mentioned by [
<xref ref-type="bibr" rid="CR1">1</xref>
].</p>
</fn>
<fn id="Fn7">
<label>7</label>
<p>As already mentioned by [
<xref ref-type="bibr" rid="CR2">2</xref>
], but for the non-parametrized
<inline-formula id="IEq275">
<alternatives>
<tex-math id="M609">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\int _0^1$$\end{document}</tex-math>
<mml:math id="M610">
<mml:msubsup>
<mml:mo>∫</mml:mo>
<mml:mn>0</mml:mn>
<mml:mn>1</mml:mn>
</mml:msubsup>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq275.gif"></inline-graphic>
</alternatives>
</inline-formula>
and
<inline-formula id="IEq276">
<alternatives>
<tex-math id="M611">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\int _{\frac{1}{2}}^1$$\end{document}</tex-math>
<mml:math id="M612">
<mml:msubsup>
<mml:mo>∫</mml:mo>
<mml:mrow>
<mml:mfrac>
<mml:mn>1</mml:mn>
<mml:mn>2</mml:mn>
</mml:mfrac>
</mml:mrow>
<mml:mn>1</mml:mn>
</mml:msubsup>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq276.gif"></inline-graphic>
</alternatives>
</inline-formula>
Hit Integration model.</p>
</fn>
<fn id="Fn8">
<label>8</label>
<p>To give a quick and intuitive example, we consider an extreme case: an alignment of fixed length
<inline-formula id="IEq290">
<alternatives>
<tex-math id="M613">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\ell$$\end{document}</tex-math>
<mml:math id="M614">
<mml:mi>ℓ</mml:mi>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq290.gif"></inline-graphic>
</alternatives>
</inline-formula>
without any mismatch symbol. Any seed
<inline-formula id="IEq291">
<alternatives>
<tex-math id="M615">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\pi$$\end{document}</tex-math>
<mml:math id="M616">
<mml:mi mathvariant="italic">π</mml:mi>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq291.gif"></inline-graphic>
</alternatives>
</inline-formula>
of weight
<inline-formula id="IEq292">
<alternatives>
<tex-math id="M617">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$w_\pi \le \ell$$\end{document}</tex-math>
<mml:math id="M618">
<mml:mrow>
<mml:msub>
<mml:mi>w</mml:mi>
<mml:mi mathvariant="italic">π</mml:mi>
</mml:msub>
<mml:mo>≤</mml:mo>
<mml:mi>ℓ</mml:mi>
</mml:mrow>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq292.gif"></inline-graphic>
</alternatives>
</inline-formula>
and span
<inline-formula id="IEq293">
<alternatives>
<tex-math id="M619">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$s_\pi \le \ell$$\end{document}</tex-math>
<mml:math id="M620">
<mml:mrow>
<mml:msub>
<mml:mi>s</mml:mi>
<mml:mi mathvariant="italic">π</mml:mi>
</mml:msub>
<mml:mo>≤</mml:mo>
<mml:mi>ℓ</mml:mi>
</mml:mrow>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq293.gif"></inline-graphic>
</alternatives>
</inline-formula>
obviously detects this alignment, whatever its shape is, so
<inline-formula id="IEq294">
<alternatives>
<tex-math id="M621">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$Dirac_\pi (m=\ell ,\ell )$$\end{document}</tex-math>
<mml:math id="M622">
<mml:mrow>
<mml:mi>D</mml:mi>
<mml:mi>i</mml:mi>
<mml:mi>r</mml:mi>
<mml:mi>a</mml:mi>
<mml:msub>
<mml:mi>c</mml:mi>
<mml:mi mathvariant="italic">π</mml:mi>
</mml:msub>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>m</mml:mi>
<mml:mo>=</mml:mo>
<mml:mi>ℓ</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>ℓ</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq294.gif"></inline-graphic>
</alternatives>
</inline-formula>
and
<inline-formula id="IEq295">
<alternatives>
<tex-math id="M623">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$Heaviside_\pi (m_a=\ell ,m_b=\ell ,\ell )$$\end{document}</tex-math>
<mml:math id="M624">
<mml:mrow>
<mml:mi>H</mml:mi>
<mml:mi>e</mml:mi>
<mml:mi>a</mml:mi>
<mml:mi>v</mml:mi>
<mml:mi>i</mml:mi>
<mml:mi>s</mml:mi>
<mml:mi>i</mml:mi>
<mml:mi>d</mml:mi>
<mml:msub>
<mml:mi>e</mml:mi>
<mml:mi mathvariant="italic">π</mml:mi>
</mml:msub>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:msub>
<mml:mi>m</mml:mi>
<mml:mi>a</mml:mi>
</mml:msub>
<mml:mo>=</mml:mo>
<mml:mi>ℓ</mml:mi>
<mml:mo>,</mml:mo>
<mml:msub>
<mml:mi>m</mml:mi>
<mml:mi>b</mml:mi>
</mml:msub>
<mml:mo>=</mml:mo>
<mml:mi>ℓ</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>ℓ</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq295.gif"></inline-graphic>
</alternatives>
</inline-formula>
reach their maximal sensitivity of 1. For a given weight
<italic>w</italic>
, the restriction of all these seeds to dominant seeds implies that many are lost when dominance selection is applied to keep the best representatives.</p>
</fn>
<fn id="Fn9">
<label>9</label>
<p>At least to the author; the parametrized problem is nonetheless interesting in itself.</p>
</fn>
<fn id="Fn10">
<label>10</label>
<p>This restriction on the set of seeds is necessary: if it is removed, the span of the best seeds will increase with
<inline-formula id="IEq388">
<alternatives>
<tex-math id="M625">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\ell$$\end{document}</tex-math>
<mml:math id="M626">
<mml:mi>ℓ</mml:mi>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq388.gif"></inline-graphic>
</alternatives>
</inline-formula>
, see [
<xref ref-type="bibr" rid="CR18">18</xref>
].</p>
</fn>
</fn-group>
<ack>
<title>Acknowledgments and funding</title>
<p>Donald E. K. Martin provided substantive comments on an earlier version of this manuscript. The author would like to thank the second reviewer for his/her thorough review, which significantly contributed to improving the quality of the paper. The publication costs were covered by the French Institute for Research in Computer Science and Automation (Inria).</p>
<sec id="d29e9805">
<title>Competing interests</title>
<p>The author declares that he has no competing interests.</p>
</sec>
<sec id="d29e9810">
<title>Availability of data and materials</title>
<p>All data and source code are freely available and may be downloaded from:
<ext-link ext-link-type="uri" xlink:href="http://bioinfo.cristal.univ-lille.fr/yass/iedera%5fdominance/">http://bioinfo.cristal.univ-lille.fr/yass/iedera_dominance/</ext-link>
</p>
</sec>
<sec id="d29e9820">
<title>Consent for publication</title>
<p>Not applicable. The manuscript does not contain any data from any individual person.</p>
</sec>
<sec id="d29e9825">
<title>Ethical approval</title>
<p>The manuscript does not report new studies involving any animal or human data or tissue.</p>
</sec>
</ack>
<ref-list id="Bib1">
<title>References</title>
<ref id="CR1">
<label>1.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Mak</surname>
<given-names>DYF</given-names>
</name>
<name>
<surname>Benson</surname>
<given-names>G</given-names>
</name>
</person-group>
<article-title>All hits all the time: parameter free calculation of seed sensitivity</article-title>
<source>Bioinformatics</source>
<year>2009</year>
<volume>25</volume>
<issue>3</issue>
<fpage>302</fpage>
<lpage>308</lpage>
<pub-id pub-id-type="doi">10.1093/bioinformatics/btn643</pub-id>
<pub-id pub-id-type="pmid">19095701</pub-id>
</element-citation>
</ref>
<ref id="CR2">
<label>2.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Chung</surname>
<given-names>WH</given-names>
</name>
<name>
<surname>Park</surname>
<given-names>SB</given-names>
</name>
</person-group>
<article-title>Hit integration for identifying optimal spaced seeds</article-title>
<source>BMC Bioinform</source>
<year>2010</year>
<volume>11</volume>
<issue>1</issue>
<fpage>S37</fpage>
<pub-id pub-id-type="doi">10.1186/1471-2105-11-S1-S37</pub-id>
</element-citation>
</ref>
<ref id="CR3">
<label>3.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Ma</surname>
<given-names>B</given-names>
</name>
<name>
<surname>Tromp</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Li</surname>
<given-names>M</given-names>
</name>
</person-group>
<article-title>PatternHunter: faster and more sensitive homology search</article-title>
<source>Bioinformatics</source>
<year>2002</year>
<volume>18</volume>
<issue>3</issue>
<fpage>440</fpage>
<lpage>445</lpage>
<pub-id pub-id-type="doi">10.1093/bioinformatics/18.3.440</pub-id>
<pub-id pub-id-type="pmid">11934743</pub-id>
</element-citation>
</ref>
<ref id="CR4">
<label>4.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Burkhardt</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Kärkkäinen</surname>
<given-names>J</given-names>
</name>
</person-group>
<article-title>Better filtering with gapped
<inline-formula id="IEq393">
<alternatives>
<tex-math id="M627">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$q$$\end{document}</tex-math>
<mml:math id="M628">
<mml:mi>q</mml:mi>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq393.gif"></inline-graphic>
</alternatives>
</inline-formula>
-grams</article-title>
<source>Fund Inform</source>
<year>2002</year>
<volume>56</volume>
<issue>1–2</issue>
<fpage>51</fpage>
<lpage>70</lpage>
</element-citation>
</ref>
<ref id="CR5">
<label>5.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Brejová</surname>
<given-names>B</given-names>
</name>
<name>
<surname>Brown</surname>
<given-names>DG</given-names>
</name>
<name>
<surname>Vinař</surname>
<given-names>T</given-names>
</name>
</person-group>
<article-title>Vector seeds: an extension to spaced seeds</article-title>
<source>J Comput Syst Sci</source>
<year>2005</year>
<volume>70</volume>
<issue>3</issue>
<fpage>364</fpage>
<lpage>380</lpage>
<pub-id pub-id-type="doi">10.1016/j.jcss.2004.12.008</pub-id>
</element-citation>
</ref>
<ref id="CR6">
<label>6.</label>
<mixed-citation publication-type="other">Burkhardt S, Kärkkäinen J. One-gapped
<inline-formula id="IEq395">
<alternatives>
<tex-math id="M629">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$q$$\end{document}</tex-math>
<mml:math id="M630">
<mml:mi>q</mml:mi>
</mml:math>
<inline-graphic xlink:href="13015_2017_92_Article_IEq395.gif"></inline-graphic>
</alternatives>
</inline-formula>
-gram filters for Levenshtein distance. Proceedings of the 13th symposium on combinatorial pattern matching (CPM), vol 2373, Lecture Notes in Computer Science Fukuoka (Japan). Berlin: Springer; 2002. p. 225–34.</mixed-citation>
</ref>
<ref id="CR7">
<label>7.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Mak</surname>
<given-names>DYF</given-names>
</name>
<name>
<surname>Gelfand</surname>
<given-names>Y</given-names>
</name>
<name>
<surname>Benson</surname>
<given-names>G</given-names>
</name>
</person-group>
<article-title>Indel seeds for homology search</article-title>
<source>Bioinformatics</source>
<year>2006</year>
<volume>22</volume>
<issue>14</issue>
<fpage>341</fpage>
<lpage>349</lpage>
<pub-id pub-id-type="doi">10.1093/bioinformatics/btl263</pub-id>
<pub-id pub-id-type="pmid">16317072</pub-id>
</element-citation>
</ref>
<ref id="CR8">
<label>8.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Chen</surname>
<given-names>K</given-names>
</name>
<name>
<surname>Zhu</surname>
<given-names>Q</given-names>
</name>
<name>
<surname>Yang</surname>
<given-names>F</given-names>
</name>
<name>
<surname>Tang</surname>
<given-names>D</given-names>
</name>
</person-group>
<article-title>An efficient way of finding good indel seeds for local homology search</article-title>
<source>Chin Sci Bull</source>
<year>2009</year>
<volume>54</volume>
<issue>20</issue>
<fpage>3837</fpage>
<lpage>3842</lpage>
<pub-id pub-id-type="doi">10.1007/s11434-009-0531-6</pub-id>
</element-citation>
</ref>
<ref id="CR9">
<label>9.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Csűrös</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Ma</surname>
<given-names>B</given-names>
</name>
</person-group>
<article-title>Rapid homology search with neighbor seeds</article-title>
<source>Algorithmica</source>
<year>2007</year>
<volume>48</volume>
<issue>2</issue>
<fpage>187</fpage>
<lpage>202</lpage>
<pub-id pub-id-type="doi">10.1007/s00453-007-0062-y</pub-id>
</element-citation>
</ref>
<ref id="CR10">
<label>10.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Ilie</surname>
<given-names>L</given-names>
</name>
<name>
<surname>Ilie</surname>
<given-names>S</given-names>
</name>
</person-group>
<article-title>Fast computation of neighbor seeds</article-title>
<source>Bioinformatics</source>
<year>2009</year>
<volume>25</volume>
<issue>6</issue>
<fpage>822</fpage>
<lpage>823</lpage>
<pub-id pub-id-type="doi">10.1093/bioinformatics/btp054</pub-id>
<pub-id pub-id-type="pmid">19176560</pub-id>
</element-citation>
</ref>
<ref id="CR11">
<label>11.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Chen</surname>
<given-names>W</given-names>
</name>
<name>
<surname>Sung</surname>
<given-names>WK</given-names>
</name>
</person-group>
<article-title>On half gapped seed</article-title>
<source>Genome Inform</source>
<year>2003</year>
<volume>14</volume>
<fpage>176</fpage>
<lpage>185</lpage>
<pub-id pub-id-type="pmid">15706532</pub-id>
</element-citation>
</ref>
<ref id="CR12">
<label>12.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Noé</surname>
<given-names>L</given-names>
</name>
<name>
<surname>Kucherov</surname>
<given-names>G</given-names>
</name>
</person-group>
<article-title>Improved hit criteria for DNA local alignment</article-title>
<source>BMC Bioinform</source>
<year>2004</year>
<volume>5</volume>
<fpage>149</fpage>
<pub-id pub-id-type="doi">10.1186/1471-2105-5-149</pub-id>
</element-citation>
</ref>
<ref id="CR13">
<label>13.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Yang</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Zhang</surname>
<given-names>L</given-names>
</name>
</person-group>
<article-title>Run probabilities of seed-like patterns and identifying good transition seeds</article-title>
<source>J Comput Biol</source>
<year>2008</year>
<volume>15</volume>
<issue>10</issue>
<fpage>1295</fpage>
<lpage>1313</lpage>
<pub-id pub-id-type="doi">10.1089/cmb.2007.0209</pub-id>
</element-citation>
</ref>
<ref id="CR14">
<label>14.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Zhou</surname>
<given-names>L</given-names>
</name>
<name>
<surname>Stanton</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Florea</surname>
<given-names>L</given-names>
</name>
</person-group>
<article-title>Universal seeds for cDNA-to-genome comparison</article-title>
<source>BMC Bioinform</source>
<year>2008</year>
<volume>9</volume>
<fpage>36</fpage>
<pub-id pub-id-type="doi">10.1186/1471-2105-9-36</pub-id>
</element-citation>
</ref>
<ref id="CR15">
<label>15.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Frith</surname>
<given-names>MC</given-names>
</name>
<name>
<surname>Noé</surname>
<given-names>L</given-names>
</name>
</person-group>
<article-title>Improved search heuristics find 20 000 new alignments between human and mouse genomes</article-title>
<source>Nucleic Acids Res</source>
<year>2014</year>
<volume>42</volume>
<issue>7</issue>
<fpage>59</fpage>
<pub-id pub-id-type="doi">10.1093/nar/gku104</pub-id>
</element-citation>
</ref>
<ref id="CR16">
<label>16.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Li</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Ma</surname>
<given-names>B</given-names>
</name>
<name>
<surname>Kisman</surname>
<given-names>D</given-names>
</name>
<name>
<surname>Tromp</surname>
<given-names>J</given-names>
</name>
</person-group>
<article-title>PatternHunter II: highly sensitive and fast homology search</article-title>
<source>J Bioinform Comput Biol</source>
<year>2004</year>
<volume>2</volume>
<issue>3</issue>
<fpage>417</fpage>
<lpage>439</lpage>
<pub-id pub-id-type="doi">10.1142/S0219720004000661</pub-id>
<pub-id pub-id-type="pmid">15359419</pub-id>
</element-citation>
</ref>
<ref id="CR17">
<label>17.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Sun</surname>
<given-names>Y</given-names>
</name>
<name>
<surname>Buhler</surname>
<given-names>J</given-names>
</name>
</person-group>
<article-title>Designing multiple simultaneous seeds for DNA similarity search</article-title>
<source>J Comput Biol</source>
<year>2005</year>
<volume>12</volume>
<issue>6</issue>
<fpage>847</fpage>
<lpage>861</lpage>
<pub-id pub-id-type="doi">10.1089/cmb.2005.12.847</pub-id>
<pub-id pub-id-type="pmid">16108721</pub-id>
</element-citation>
</ref>
<ref id="CR18">
<label>18.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Kucherov</surname>
<given-names>G</given-names>
</name>
<name>
<surname>Noé</surname>
<given-names>L</given-names>
</name>
<name>
<surname>Roytberg</surname>
<given-names>MA</given-names>
</name>
</person-group>
<article-title>Multiseed lossless filtration</article-title>
<source>IEEE/ACM Trans Comput Biol Bioinform</source>
<year>2005</year>
<volume>2</volume>
<issue>1</issue>
<fpage>51</fpage>
<lpage>61</lpage>
<pub-id pub-id-type="doi">10.1109/TCBB.2005.12</pub-id>
<pub-id pub-id-type="pmid">17044164</pub-id>
</element-citation>
</ref>
<ref id="CR19">
<label>19.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Farach-Colton</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Landau</surname>
<given-names>GM</given-names>
</name>
<name>
<surname>Sahinalp</surname>
<given-names>SC</given-names>
</name>
<name>
<surname>Tsur</surname>
<given-names>D</given-names>
</name>
</person-group>
<article-title>Optimal spaced seeds for faster approximate string matching</article-title>
<source>J Comput Syst Sci</source>
<year>2007</year>
<volume>73</volume>
<issue>7</issue>
<fpage>1035</fpage>
<lpage>1044</lpage>
<pub-id pub-id-type="doi">10.1016/j.jcss.2007.03.007</pub-id>
</element-citation>
</ref>
<ref id="CR20">
<label>20.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Kiełbasa</surname>
<given-names>SM</given-names>
</name>
<name>
<surname>Wan</surname>
<given-names>R</given-names>
</name>
<name>
<surname>Sato</surname>
<given-names>K</given-names>
</name>
<name>
<surname>Horton</surname>
<given-names>P</given-names>
</name>
<name>
<surname>Frith</surname>
<given-names>MC</given-names>
</name>
</person-group>
<article-title>Adaptive seeds tame genomic sequence comparison</article-title>
<source>Genome Res</source>
<year>2011</year>
<volume>21</volume>
<issue>3</issue>
<fpage>487</fpage>
<lpage>493</lpage>
<pub-id pub-id-type="doi">10.1101/gr.113985.110</pub-id>
<pub-id pub-id-type="pmid">21209072</pub-id>
</element-citation>
</ref>
<ref id="CR21">
<label>21.</label>
<mixed-citation publication-type="other">Peterlongo P, Pisanti N, Boyer F, Sagot MF. Lossless filter for finding long multiple approximate repetitions using a new data structure, the bi-factor array. In: Consens M, Navarro G, editors. Proceedings of the 12th international conference on string processing and information retrieval (SPIRE). Lecture Notes in Computer Science, vol 3772. Buenos Aires; 2005. p. 179–190.</mixed-citation>
</ref>
<ref id="CR22">
<label>22.</label>
<element-citation publication-type="book">
<person-group person-group-type="author">
<name>
<surname>Crochemore</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Tischler</surname>
<given-names>G</given-names>
</name>
</person-group>
<person-group person-group-type="editor">
<name>
<surname>Chavez</surname>
<given-names>E</given-names>
</name>
<name>
<surname>Lonardi</surname>
<given-names>S</given-names>
</name>
</person-group>
<article-title>The gapped suffix array: a new index structure for fast approximate matching</article-title>
<source>Proceedings of the 17th international symposium on string processing and information retrieval (SPIRE)</source>
<year>2010</year>
<publisher-loc>Los Cabos</publisher-loc>
<publisher-name>Springer</publisher-name>
<fpage>359</fpage>
<lpage>364</lpage>
</element-citation>
</ref>
<ref id="CR23">
<label>23.</label>
<element-citation publication-type="book">
<person-group person-group-type="author">
<name>
<surname>Onodera</surname>
<given-names>T</given-names>
</name>
<name>
<surname>Shibuya</surname>
<given-names>T</given-names>
</name>
</person-group>
<person-group person-group-type="editor">
<name>
<surname>Asano</surname>
<given-names>T</given-names>
</name>
<name>
<surname>Nakano</surname>
<given-names>S-I</given-names>
</name>
<name>
<surname>Okamoto</surname>
<given-names>Y</given-names>
</name>
<name>
<surname>Watanabe</surname>
<given-names>O</given-names>
</name>
</person-group>
<article-title>An index structure for spaced seed search</article-title>
<source>Proceedings of the 22nd international symposium on algorithms and computation (ISAAC)</source>
<year>2011</year>
<publisher-loc>Yokohama (Japan)</publisher-loc>
<publisher-name>Springer</publisher-name>
<fpage>764</fpage>
<lpage>772</lpage>
</element-citation>
</ref>
<ref id="CR24">
<label>24.</label>
<mixed-citation publication-type="other">Gagie T, Manzini G, Valenzuela D. Compressed spaced suffix arrays. In: Proceedings of the 2nd international conference on algorithms for big data (ICABD). CEUR-WS, vol 1146. Palermo; 2014. p. 37–45.</mixed-citation>
</ref>
<ref id="CR25">
<label>25.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Shrestha</surname>
<given-names>AMS</given-names>
</name>
<name>
<surname>Frith</surname>
<given-names>MC</given-names>
</name>
<name>
<surname>Horton</surname>
<given-names>P</given-names>
</name>
</person-group>
<article-title>A bioinformatician’s guide to the forefront of suffix array construction algorithms</article-title>
<source>Brief Bioinform</source>
<year>2014</year>
<volume>15</volume>
<issue>2</issue>
<fpage>138</fpage>
<lpage>154</lpage>
<pub-id pub-id-type="doi">10.1093/bib/bbt081</pub-id>
<pub-id pub-id-type="pmid">24413184</pub-id>
</element-citation>
</ref>
<ref id="CR26">
<label>26.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Birol</surname>
<given-names>I</given-names>
</name>
<name>
<surname>Chu</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Mohamadi</surname>
<given-names>H</given-names>
</name>
<name>
<surname>Jackman</surname>
<given-names>SD</given-names>
</name>
<name>
<surname>Raghavan</surname>
<given-names>K</given-names>
</name>
<name>
<surname>Vandervalk</surname>
<given-names>BP</given-names>
</name>
<name>
<surname>Raymond</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Warren</surname>
<given-names>RL</given-names>
</name>
</person-group>
<article-title>Spaced seed data structures for de novo assembly</article-title>
<source>Int J Genom</source>
<year>2015</year>
<volume>2015</volume>
<fpage>196591</fpage>
</element-citation>
</ref>
<ref id="CR27">
<label>27.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Keich</surname>
<given-names>U</given-names>
</name>
<name>
<surname>Li</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Ma</surname>
<given-names>B</given-names>
</name>
<name>
<surname>Tromp</surname>
<given-names>J</given-names>
</name>
</person-group>
<article-title>On spaced seeds for similarity search</article-title>
<source>Discret Appl Math</source>
<year>2004</year>
<volume>138</volume>
<issue>3</issue>
<fpage>253</fpage>
<lpage>263</lpage>
<pub-id pub-id-type="doi">10.1016/S0166-218X(03)00382-2</pub-id>
</element-citation>
</ref>
<ref id="CR28">
<label>28.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Nicolas</surname>
<given-names>F</given-names>
</name>
<name>
<surname>Rivals</surname>
<given-names>É</given-names>
</name>
</person-group>
<article-title>Hardness of optimal spaced seed design</article-title>
<source>J Comput Syst Sci</source>
<year>2008</year>
<volume>74</volume>
<issue>5</issue>
<fpage>831</fpage>
<lpage>849</lpage>
<pub-id pub-id-type="doi">10.1016/j.jcss.2007.10.001</pub-id>
</element-citation>
</ref>
<ref id="CR29">
<label>29.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Ma</surname>
<given-names>B</given-names>
</name>
<name>
<surname>Yao</surname>
<given-names>H</given-names>
</name>
</person-group>
<article-title>Seed optimization for i.i.d. similarities is no easier than optimal Golomb ruler design</article-title>
<source>Inf Process Lett</source>
<year>2009</year>
<volume>109</volume>
<issue>19</issue>
<fpage>1120</fpage>
<lpage>1124</lpage>
<pub-id pub-id-type="doi">10.1016/j.ipl.2009.07.008</pub-id>
</element-citation>
</ref>
<ref id="CR30">
<label>30.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Schwartz</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Kent</surname>
<given-names>WJ</given-names>
</name>
<name>
<surname>Smit</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Zhang</surname>
<given-names>Z</given-names>
</name>
<name>
<surname>Baertsch</surname>
<given-names>R</given-names>
</name>
<name>
<surname>Hardison</surname>
<given-names>RC</given-names>
</name>
<name>
<surname>Haussler</surname>
<given-names>D</given-names>
</name>
<name>
<surname>Miller</surname>
<given-names>W</given-names>
</name>
</person-group>
<article-title>Human-mouse alignments with BLASTZ</article-title>
<source>Genome Res</source>
<year>2003</year>
<volume>13</volume>
<fpage>103</fpage>
<lpage>107</lpage>
<pub-id pub-id-type="doi">10.1101/gr.809403</pub-id>
<pub-id pub-id-type="pmid">12529312</pub-id>
</element-citation>
</ref>
<ref id="CR31">
<label>31.</label>
<mixed-citation publication-type="other">Darling AE, Treangen TJ, Zhang L, Kuiken C, Messeguer X, Perna NT. Procrastination leads to efficient filtration for local multiple alignment. Proceedings of the 6th international workshop on algorithms in bioinformatics (WABI), vol 4175. Lecture notes in bioinformatics. Zürich: Springer; 2006. p. 126–37.</mixed-citation>
</ref>
<ref id="CR32">
<label>32.</label>
<mixed-citation publication-type="other">Harris RS. Improved pairwise alignment of genomic DNA. Ph.D. thesis, The Pennsylvania State University; 2007.</mixed-citation>
</ref>
<ref id="CR33">
<label>33.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Lin</surname>
<given-names>H</given-names>
</name>
<name>
<surname>Zhang</surname>
<given-names>Z</given-names>
</name>
<name>
<surname>Zhang</surname>
<given-names>MQ</given-names>
</name>
<name>
<surname>Ma</surname>
<given-names>B</given-names>
</name>
<name>
<surname>Li</surname>
<given-names>M</given-names>
</name>
</person-group>
<article-title>ZOOM! Zillions Of Oligos Mapped</article-title>
<source>Bioinformatics</source>
<year>2008</year>
<volume>24</volume>
<issue>21</issue>
<fpage>2431</fpage>
<lpage>2437</lpage>
<pub-id pub-id-type="doi">10.1093/bioinformatics/btn416</pub-id>
<pub-id pub-id-type="pmid">18684737</pub-id>
</element-citation>
</ref>
<ref id="CR34">
<label>34.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Rumble</surname>
<given-names>SM</given-names>
</name>
<name>
<surname>Lacroute</surname>
<given-names>P</given-names>
</name>
<name>
<surname>Dalca</surname>
<given-names>AV</given-names>
</name>
<name>
<surname>Fiume</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Sidow</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Brudno</surname>
<given-names>M</given-names>
</name>
</person-group>
<article-title>SHRiMP: accurate mapping of short color-space reads</article-title>
<source>PLoS Comp Biol</source>
<year>2009</year>
<volume>5</volume>
<issue>5</issue>
<fpage>1000386</fpage>
<pub-id pub-id-type="doi">10.1371/journal.pcbi.1000386</pub-id>
</element-citation>
</ref>
<ref id="CR35">
<label>35.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Chen</surname>
<given-names>Y</given-names>
</name>
<name>
<surname>Souaiaia</surname>
<given-names>T</given-names>
</name>
<name>
<surname>Chen</surname>
<given-names>T</given-names>
</name>
</person-group>
<article-title>PerM: efficient mapping of short sequencing reads with periodic full sensitive spaced seeds</article-title>
<source>Bioinformatics</source>
<year>2009</year>
<volume>25</volume>
<issue>19</issue>
<fpage>2514</fpage>
<lpage>2521</lpage>
<pub-id pub-id-type="doi">10.1093/bioinformatics/btp486</pub-id>
<pub-id pub-id-type="pmid">19675096</pub-id>
</element-citation>
</ref>
<ref id="CR36">
<label>36.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Giladi</surname>
<given-names>E</given-names>
</name>
<name>
<surname>Healy</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Myers</surname>
<given-names>G</given-names>
</name>
<name>
<surname>Hart</surname>
<given-names>C</given-names>
</name>
<name>
<surname>Kapranov</surname>
<given-names>P</given-names>
</name>
<name>
<surname>Lipson</surname>
<given-names>D</given-names>
</name>
<name>
<surname>Roels</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Thayer</surname>
<given-names>E</given-names>
</name>
<name>
<surname>Letovsky</surname>
<given-names>S</given-names>
</name>
</person-group>
<article-title>Error tolerant indexing and alignment of short reads with covering template families</article-title>
<source>J Comput Biol</source>
<year>2010</year>
<volume>17</volume>
<issue>10</issue>
<fpage>1397</fpage>
<lpage>1411</lpage>
<pub-id pub-id-type="doi">10.1089/cmb.2010.0005</pub-id>
<pub-id pub-id-type="pmid">20937014</pub-id>
</element-citation>
</ref>
<ref id="CR37">
<label>37.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>David</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Dzamba</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Lister</surname>
<given-names>D</given-names>
</name>
<name>
<surname>Ilie</surname>
<given-names>L</given-names>
</name>
<name>
<surname>Brudno</surname>
<given-names>M</given-names>
</name>
</person-group>
<article-title>SHRiMP2: Sensitive yet practical short read mapping</article-title>
<source>Bioinformatics</source>
<year>2011</year>
<volume>27</volume>
<issue>7</issue>
<fpage>1011</fpage>
<lpage>1012</lpage>
<pub-id pub-id-type="doi">10.1093/bioinformatics/btr046</pub-id>
<pub-id pub-id-type="pmid">21278192</pub-id>
</element-citation>
</ref>
<ref id="CR38">
<label>38.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Sović</surname>
<given-names>I</given-names>
</name>
<name>
<surname>Šikić</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Wilm</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Fenlon</surname>
<given-names>SN</given-names>
</name>
<name>
<surname>Chen</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Nagarajan</surname>
<given-names>N</given-names>
</name>
</person-group>
<article-title>Fast and sensitive mapping of nanopore sequencing reads with GraphMap</article-title>
<source>Nat Commun</source>
<year>2016</year>
<volume>7</volume>
<fpage>11307</fpage>
<pub-id pub-id-type="doi">10.1038/ncomms11307</pub-id>
<pub-id pub-id-type="pmid">27079541</pub-id>
</element-citation>
</ref>
<ref id="CR39">
<label>39.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Preparata</surname>
<given-names>FP</given-names>
</name>
<name>
<surname>Oliver</surname>
<given-names>JS</given-names>
</name>
</person-group>
<article-title>DNA sequencing by hybridization using semi-degenerate bases</article-title>
<source>J Comput Biol</source>
<year>2004</year>
<volume>11</volume>
<issue>4</issue>
<fpage>753</fpage>
<lpage>765</lpage>
<pub-id pub-id-type="doi">10.1089/cmb.2004.11.753</pub-id>
</element-citation>
</ref>
<ref id="CR40">
<label>40.</label>
<mixed-citation publication-type="other">Tsur D. Optimal probing patterns for sequencing by hybridization. Proceedings of the 6th international workshop on algorithms in bioinformatics (WABI), vol 4175. Lecture notes in bioinformatics. Zürich: Springer; 2006. p. 366–75.</mixed-citation>
</ref>
<ref id="CR41">
<label>41.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Feng</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Tillier</surname>
<given-names>ERM</given-names>
</name>
</person-group>
<article-title>A fast and flexible approach to oligonucleotide probe design for genomes and gene families</article-title>
<source>Bioinformatics</source>
<year>2007</year>
<volume>23</volume>
<issue>10</issue>
<fpage>1195</fpage>
<lpage>1202</lpage>
<pub-id pub-id-type="doi">10.1093/bioinformatics/btm114</pub-id>
<pub-id pub-id-type="pmid">17392329</pub-id>
</element-citation>
</ref>
<ref id="CR42">
<label>42.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Chung</surname>
<given-names>W-H</given-names>
</name>
<name>
<surname>Park</surname>
<given-names>S-B</given-names>
</name>
</person-group>
<article-title>An empirical study of choosing efficient discriminative seeds for oligonucleotide design</article-title>
<source>BMC Genom</source>
<year>2009</year>
<volume>10</volume>
<issue>Suppl 3</issue>
<fpage>3</fpage>
<pub-id pub-id-type="doi">10.1186/1471-2164-10-S3-S3</pub-id>
</element-citation>
</ref>
<ref id="CR43">
<label>43.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Ilie</surname>
<given-names>L</given-names>
</name>
<name>
<surname>Ilie</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Khoshraftar</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Mansouri Bigvand</surname>
<given-names>A</given-names>
</name>
</person-group>
<article-title>Seeds for effective oligonucleotide design</article-title>
<source>BMC Genom</source>
<year>2011</year>
<volume>12</volume>
<fpage>280</fpage>
<pub-id pub-id-type="doi">10.1186/1471-2164-12-280</pub-id>
</element-citation>
</ref>
<ref id="CR44">
<label>44.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Ilie</surname>
<given-names>L</given-names>
</name>
<name>
<surname>Mohamadi</surname>
<given-names>H</given-names>
</name>
<name>
<surname>Golding</surname>
<given-names>GB</given-names>
</name>
<name>
<surname>Smyth</surname>
<given-names>WF</given-names>
</name>
</person-group>
<article-title>BOND: Basic Oligo Nucleotide Design</article-title>
<source>BMC Bioinform</source>
<year>2013</year>
<volume>14</volume>
<fpage>69</fpage>
<pub-id pub-id-type="doi">10.1186/1471-2105-14-69</pub-id>
</element-citation>
</ref>
<ref id="CR45">
<label>45.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Kisman</surname>
<given-names>D</given-names>
</name>
<name>
<surname>Li</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Ma</surname>
<given-names>B</given-names>
</name>
<name>
<surname>Li</surname>
<given-names>W</given-names>
</name>
</person-group>
<article-title>tPatternHunter: gapped, fast and sensitive translated homology search</article-title>
<source>Bioinformatics</source>
<year>2005</year>
<volume>21</volume>
<issue>4</issue>
<fpage>542</fpage>
<lpage>544</lpage>
<pub-id pub-id-type="doi">10.1093/bioinformatics/bti035</pub-id>
<pub-id pub-id-type="pmid">15374861</pub-id>
</element-citation>
</ref>
<ref id="CR46">
<label>46.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Brown</surname>
<given-names>DG</given-names>
</name>
</person-group>
<article-title>Optimizing multiple seeds for protein homology search</article-title>
<source>IEEE/ACM Trans Comput Biol Bioinform</source>
<year>2005</year>
<volume>2</volume>
<issue>1</issue>
<fpage>23</fpage>
<lpage>38</lpage>
<pub-id pub-id-type="doi">10.1109/TCBB.2005.13</pub-id>
</element-citation>
</ref>
<ref id="CR47">
<label>47.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Roytberg</surname>
<given-names>MA</given-names>
</name>
<name>
<surname>Gambin</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Noé</surname>
<given-names>L</given-names>
</name>
<name>
<surname>Lasota</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Furletova</surname>
<given-names>E</given-names>
</name>
<name>
<surname>Szczurek</surname>
<given-names>E</given-names>
</name>
<name>
<surname>Kucherov</surname>
<given-names>G</given-names>
</name>
</person-group>
<article-title>On subset seeds for protein alignment</article-title>
<source>IEEE/ACM Trans Comput Biol Bioinform</source>
<year>2009</year>
<volume>6</volume>
<issue>3</issue>
<fpage>483</fpage>
<lpage>494</lpage>
<pub-id pub-id-type="doi">10.1109/TCBB.2009.4</pub-id>
<pub-id pub-id-type="pmid">19644175</pub-id>
</element-citation>
</ref>
<ref id="CR48">
<label>48.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Nguyen</surname>
<given-names>V-H</given-names>
</name>
<name>
<surname>Lavenier</surname>
<given-names>D</given-names>
</name>
</person-group>
<article-title>PLAST: parallel local alignment search tool for database comparison</article-title>
<source>BMC Bioinform</source>
<year>2009</year>
<volume>10</volume>
<fpage>329</fpage>
<pub-id pub-id-type="doi">10.1186/1471-2105-10-329</pub-id>
</element-citation>
</ref>
<ref id="CR49">
<label>49.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Startek</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Lasota</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Sykulski</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Bułak</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Noé</surname>
<given-names>L</given-names>
</name>
<name>
<surname>Kucherov</surname>
<given-names>G</given-names>
</name>
<name>
<surname>Gambin</surname>
<given-names>A</given-names>
</name>
</person-group>
<article-title>Efficient alternatives to PSI-BLAST</article-title>
<source>Bull Pol Acad Sci Tech Sci</source>
<year>2012</year>
<volume>60</volume>
<issue>3</issue>
<fpage>495</fpage>
<lpage>505</lpage>
</element-citation>
</ref>
<ref id="CR50">
<label>50.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Li</surname>
<given-names>W</given-names>
</name>
<name>
<surname>Ma</surname>
<given-names>B</given-names>
</name>
<name>
<surname>Zhang</surname>
<given-names>K</given-names>
</name>
</person-group>
<article-title>Optimizing spaced k-mer neighbors for efficient filtration in protein similarity search</article-title>
<source>IEEE/ACM Trans Comput Biol Bioinform</source>
<year>2014</year>
<volume>11</volume>
<issue>2</issue>
<fpage>398</fpage>
<lpage>406</lpage>
<pub-id pub-id-type="doi">10.1109/TCBB.2014.2306831</pub-id>
<pub-id pub-id-type="pmid">26355786</pub-id>
</element-citation>
</ref>
<ref id="CR51">
<label>51.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Buchfink</surname>
<given-names>B</given-names>
</name>
<name>
<surname>Xie</surname>
<given-names>C</given-names>
</name>
<name>
<surname>Huson</surname>
<given-names>DH</given-names>
</name>
</person-group>
<article-title>Fast and sensitive protein alignment using DIAMOND</article-title>
<source>Nat Methods</source>
<year>2014</year>
<volume>12</volume>
<fpage>59</fpage>
<lpage>60</lpage>
<pub-id pub-id-type="doi">10.1038/nmeth.3176</pub-id>
<pub-id pub-id-type="pmid">25402007</pub-id>
</element-citation>
</ref>
<ref id="CR52">
<label>52.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Somervuo</surname>
<given-names>P</given-names>
</name>
<name>
<surname>Holm</surname>
<given-names>L</given-names>
</name>
</person-group>
<article-title>SANSparallel: interactive homology search against Uniprot</article-title>
<source>Nucleic Acids Res</source>
<year>2015</year>
<volume>43</volume>
<issue>W1</issue>
<fpage>24</fpage>
<lpage>29</lpage>
<pub-id pub-id-type="doi">10.1093/nar/gkv317</pub-id>
</element-citation>
</ref>
<ref id="CR53">
<label>53.</label>
<mixed-citation publication-type="other">Petrov I, Brillet S, Drezen E, Quiniou S, Antin L, Durand P, Lavenier D. KLAST: fast and sensitive software to compare large genomic databanks on cloud. In: Proceedings world congress in computer science, computer engineering, and applied computing (WORLDCOMP). Las Vegas; 2015. p. 85–90.</mixed-citation>
</ref>
<ref id="CR54">
<label>54.</label>
<mixed-citation publication-type="other">Yang IH, Wang SH, Chen YH, Huang PH, Ye L, Huang X, Chao KM. Efficient methods for generating optimal single and multiple spaced seeds. In: Proceedings of the IEEE 4th symposium on bioinformatics and bioengineering (BIBE). Taichung: IEEE Computer Society Press; 2004. p. 411–16.</mixed-citation>
</ref>
<ref id="CR55">
<label>55.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Ilie</surname>
<given-names>L</given-names>
</name>
<name>
<surname>Ilie</surname>
<given-names>S</given-names>
</name>
</person-group>
<article-title>Multiple spaced seeds for homology search</article-title>
<source>Bioinformatics</source>
<year>2007</year>
<volume>23</volume>
<issue>22</issue>
<fpage>2969</fpage>
<lpage>2977</lpage>
<pub-id pub-id-type="doi">10.1093/bioinformatics/btm422</pub-id>
<pub-id pub-id-type="pmid">17804438</pub-id>
</element-citation>
</ref>
<ref id="CR56">
<label>56.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Ilie</surname>
<given-names>L</given-names>
</name>
<name>
<surname>Ilie</surname>
<given-names>S</given-names>
</name>
</person-group>
<article-title>SpEED: fast computation of sensitive spaced seeds</article-title>
<source>Bioinformatics</source>
<year>2011</year>
<volume>27</volume>
<issue>17</issue>
<fpage>2433</fpage>
<lpage>2434</lpage>
<pub-id pub-id-type="doi">10.1093/bioinformatics/btr368</pub-id>
<pub-id pub-id-type="pmid">21690104</pub-id>
</element-citation>
</ref>
<ref id="CR57">
<label>57.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Ilie</surname>
<given-names>S</given-names>
</name>
</person-group>
<article-title>Efficient computation of spaced seeds</article-title>
<source>BMC Res Notes</source>
<year>2012</year>
<volume>5</volume>
<fpage>123</fpage>
<pub-id pub-id-type="doi">10.1186/1756-0500-5-123</pub-id>
<pub-id pub-id-type="pmid">22373455</pub-id>
</element-citation>
</ref>
<ref id="CR58">
<label>58.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Egidi</surname>
<given-names>L</given-names>
</name>
<name>
<surname>Manzini</surname>
<given-names>G</given-names>
</name>
</person-group>
<article-title>Better spaced seeds using quadratic residues</article-title>
<source>J Comput Syst Sci</source>
<year>2013</year>
<volume>79</volume>
<issue>7</issue>
<fpage>1144</fpage>
<lpage>1155</lpage>
<pub-id pub-id-type="doi">10.1016/j.jcss.2013.03.002</pub-id>
</element-citation>
</ref>
<ref id="CR59">
<label>59.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Egidi</surname>
<given-names>L</given-names>
</name>
<name>
<surname>Manzini</surname>
<given-names>G</given-names>
</name>
</person-group>
<article-title>Design and analysis of periodic multiple seeds</article-title>
<source>Theor Comput Sci</source>
<year>2014</year>
<volume>522</volume>
<fpage>62</fpage>
<lpage>76</lpage>
<pub-id pub-id-type="doi">10.1016/j.tcs.2013.12.007</pub-id>
</element-citation>
</ref>
<ref id="CR60">
<label>60.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Egidi</surname>
<given-names>L</given-names>
</name>
<name>
<surname>Manzini</surname>
<given-names>G</given-names>
</name>
</person-group>
<article-title>Spaced seeds design using perfect rulers</article-title>
<source>Fund Inform</source>
<year>2014</year>
<volume>131</volume>
<issue>2</issue>
<fpage>187</fpage>
<lpage>203</lpage>
</element-citation>
</ref>
<ref id="CR61">
<label>61.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Egidi</surname>
<given-names>L</given-names>
</name>
<name>
<surname>Manzini</surname>
<given-names>G</given-names>
</name>
</person-group>
<article-title>Multiple seeds sensitivity using a single seed with threshold</article-title>
<source>J Bioinform Comput Biol</source>
<year>2015</year>
<volume>13</volume>
<issue>4</issue>
<fpage>1550011</fpage>
<pub-id pub-id-type="doi">10.1142/S0219720015500110</pub-id>
<pub-id pub-id-type="pmid">25747382</pub-id>
</element-citation>
</ref>
<ref id="CR62">
<label>62.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Brejová</surname>
<given-names>B</given-names>
</name>
<name>
<surname>Brown</surname>
<given-names>DG</given-names>
</name>
<name>
<surname>Vinař</surname>
<given-names>T</given-names>
</name>
</person-group>
<article-title>Optimal spaced seeds for homologous coding regions</article-title>
<source>J Bioinform Comput Biol</source>
<year>2004</year>
<volume>1</volume>
<issue>4</issue>
<fpage>595</fpage>
<lpage>610</lpage>
<pub-id pub-id-type="doi">10.1142/S0219720004000326</pub-id>
<pub-id pub-id-type="pmid">15290755</pub-id>
</element-citation>
</ref>
<ref id="CR63">
<label>63.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Buhler</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Keich</surname>
<given-names>U</given-names>
</name>
<name>
<surname>Sun</surname>
<given-names>Y</given-names>
</name>
</person-group>
<article-title>Designing seeds for similarity search in genomic DNA</article-title>
<source>J Comput Syst Sci</source>
<year>2005</year>
<volume>70</volume>
<issue>3</issue>
<fpage>342</fpage>
<lpage>363</lpage>
<pub-id pub-id-type="doi">10.1016/j.jcss.2004.12.003</pub-id>
</element-citation>
</ref>
<ref id="CR64">
<label>64.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Preparata</surname>
<given-names>FP</given-names>
</name>
<name>
<surname>Zhang</surname>
<given-names>L</given-names>
</name>
<name>
<surname>Choi</surname>
<given-names>KP</given-names>
</name>
</person-group>
<article-title>Quick, practical selection of effective seeds for homology search</article-title>
<source>J Comput Biol</source>
<year>2005</year>
<volume>12</volume>
<issue>9</issue>
<fpage>1137</fpage>
<lpage>1152</lpage>
<pub-id pub-id-type="doi">10.1089/cmb.2005.12.1137</pub-id>
<pub-id pub-id-type="pmid">16305325</pub-id>
</element-citation>
</ref>
<ref id="CR65">
<label>65.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Kucherov</surname>
<given-names>G</given-names>
</name>
<name>
<surname>Noé</surname>
<given-names>L</given-names>
</name>
<name>
<surname>Roytberg</surname>
<given-names>MA</given-names>
</name>
</person-group>
<article-title>A unifying framework for seed sensitivity and its application to subset seeds</article-title>
<source>J Bioinform Comput Biol</source>
<year>2006</year>
<volume>4</volume>
<issue>2</issue>
<fpage>553</fpage>
<lpage>569</lpage>
<pub-id pub-id-type="doi">10.1142/S0219720006001977</pub-id>
<pub-id pub-id-type="pmid">16819802</pub-id>
</element-citation>
</ref>
<ref id="CR66">
<label>66.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Zhang</surname>
<given-names>L</given-names>
</name>
</person-group>
<article-title>Superiority of spaced seeds for homology search</article-title>
<source>IEEE/ACM Trans Comput Biol Bioinform</source>
<year>2007</year>
<volume>4</volume>
<issue>3</issue>
<fpage>496</fpage>
<lpage>505</lpage>
<pub-id pub-id-type="doi">10.1109/tcbb.2007.1013</pub-id>
<pub-id pub-id-type="pmid">17666769</pub-id>
</element-citation>
</ref>
<ref id="CR67">
<label>67.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Kong</surname>
<given-names>Y</given-names>
</name>
</person-group>
<article-title>Generalized correlation functions and their applications in selection of optimal multiple spaced seeds for homology search</article-title>
<source>J Comput Biol</source>
<year>2007</year>
<volume>14</volume>
<issue>2</issue>
<fpage>238</fpage>
<lpage>254</lpage>
<pub-id pub-id-type="doi">10.1089/cmb.2006.0008</pub-id>
<pub-id pub-id-type="pmid">17456017</pub-id>
</element-citation>
</ref>
<ref id="CR68">
<label>68.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Noé</surname>
<given-names>L</given-names>
</name>
<name>
<surname>Gîrdea</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Kucherov</surname>
<given-names>G</given-names>
</name>
</person-group>
<article-title>Designing efficient spaced seeds for SOLiD read mapping</article-title>
<source>Adv Bioinform</source>
<year>2010</year>
<volume>2010</volume>
<fpage>708501</fpage>
<pub-id pub-id-type="doi">10.1155/2010/708501</pub-id>
</element-citation>
</ref>
<ref id="CR69">
<label>69.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Marschall</surname>
<given-names>T</given-names>
</name>
<name>
<surname>Herms</surname>
<given-names>I</given-names>
</name>
<name>
<surname>Kaltenbach</surname>
<given-names>H-M</given-names>
</name>
<name>
<surname>Rahmann</surname>
<given-names>S</given-names>
</name>
</person-group>
<article-title>Probabilistic arithmetic automata and their applications</article-title>
<source>IEEE/ACM Trans Comput Biol Bioinform</source>
<year>2012</year>
<volume>9</volume>
<issue>6</issue>
<fpage>1737</fpage>
<lpage>1750</lpage>
<pub-id pub-id-type="doi">10.1109/TCBB.2012.109</pub-id>
<pub-id pub-id-type="pmid">22868683</pub-id>
</element-citation>
</ref>
<ref id="CR70">
<label>70.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Martin</surname>
<given-names>DEK</given-names>
</name>
<name>
<surname>Noé</surname>
<given-names>L</given-names>
</name>
</person-group>
<article-title>Faster exact distributions of pattern statistics through sequential elimination of states</article-title>
<source>Ann Inst Stat Math</source>
<year>2017</year>
<volume>69</volume>
<fpage>1</fpage>
<lpage>18</lpage>
<pub-id pub-id-type="doi">10.1007/s10463-015-0540-y</pub-id>
</element-citation>
</ref>
<ref id="CR71">
<label>71.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Horwege</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Lindner</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Boden</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Hatje</surname>
<given-names>K</given-names>
</name>
<name>
<surname>Kollmar</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Leimeister</surname>
<given-names>C-A</given-names>
</name>
<name>
<surname>Morgenstern</surname>
<given-names>B</given-names>
</name>
</person-group>
<article-title>Spaced words and kmacs: fast alignment-free sequence comparison based on inexact word matches</article-title>
<source>Nucleic Acids Res</source>
<year>2014</year>
<volume>42</volume>
<issue>W1</issue>
<fpage>W7</fpage>
<lpage>W11</lpage>
<pub-id pub-id-type="doi">10.1093/nar/gku398</pub-id>
</element-citation>
</ref>
<ref id="CR72">
<label>72.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Leimeister</surname>
<given-names>CA</given-names>
</name>
<name>
<surname>Boden</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Horwege</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Lindner</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Morgenstern</surname>
<given-names>B</given-names>
</name>
</person-group>
<article-title>Fast alignment-free sequence comparison using spaced-word frequencies</article-title>
<source>Bioinformatics</source>
<year>2014</year>
<volume>30</volume>
<issue>14</issue>
<fpage>1991</fpage>
<lpage>1999</lpage>
<pub-id pub-id-type="doi">10.1093/bioinformatics/btu177</pub-id>
<pub-id pub-id-type="pmid">24700317</pub-id>
</element-citation>
</ref>
<ref id="CR73">
<label>73.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Ghandi</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Mohammad-Noori</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Beer</surname>
<given-names>MA</given-names>
</name>
</person-group>
<article-title>Robust k-mer frequency estimation using gapped k-mers</article-title>
<source>J Math Biol</source>
<year>2014</year>
<volume>69</volume>
<issue>2</issue>
<fpage>469</fpage>
<lpage>500</lpage>
<pub-id pub-id-type="doi">10.1007/s00285-013-0705-3</pub-id>
<pub-id pub-id-type="pmid">23861010</pub-id>
</element-citation>
</ref>
<ref id="CR74">
<label>74.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Morgenstern</surname>
<given-names>B</given-names>
</name>
<name>
<surname>Zhu</surname>
<given-names>B</given-names>
</name>
<name>
<surname>Horwege</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Leimeister</surname>
<given-names>CA</given-names>
</name>
</person-group>
<article-title>Estimating evolutionary distances between genomic sequences from spaced-word matches</article-title>
<source>Algorithms Mol Biol</source>
<year>2015</year>
<volume>10</volume>
<fpage>5</fpage>
<pub-id pub-id-type="doi">10.1186/s13015-015-0032-x</pub-id>
<pub-id pub-id-type="pmid">25685176</pub-id>
</element-citation>
</ref>
<ref id="CR75">
<label>75.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Břinda</surname>
<given-names>K</given-names>
</name>
<name>
<surname>Sykulski</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Kucherov</surname>
<given-names>G</given-names>
</name>
</person-group>
<article-title>Spaced seeds improve k-mer based metagenomic classification</article-title>
<source>Bioinformatics</source>
<year>2015</year>
<volume>31</volume>
<issue>22</issue>
<fpage>3584</fpage>
<lpage>3592</lpage>
<pub-id pub-id-type="doi">10.1093/bioinformatics/btv419</pub-id>
<pub-id pub-id-type="pmid">26209798</pub-id>
</element-citation>
</ref>
<ref id="CR76">
<label>76.</label>
<mixed-citation publication-type="other">Ounit R, Lonardi S. Higher classification sensitivity of short metagenomic reads with CLARK-S. Bioinformatics. 2016.</mixed-citation>
</ref>
<ref id="CR77">
<label>77.</label>
<mixed-citation publication-type="other">Duc DD, Dinh HQ, Dang TH, Laukens K, Hoang XH. AcoSeeD: an ant colony optimization for finding optimal spaced seeds in biological sequence search. In: Proceedings of the 8th international conference on swarm intelligence (ANTS), vol 7461. Lecture notes in computer science. Brussels: Springer; 2012. p. 204–11.</mixed-citation>
</ref>
<ref id="CR78">
<label>78.</label>
<mixed-citation publication-type="other">Do PT, Tran-Thi CG. An improvement of the overlap complexity in the spaced seed searching problem between genomic DNAs. In: Proceedings of the 2nd National Foundation for Science and Technology Development Conference on Information and Computer Science (NICS). Ho Chi Minh City; 2015. p. 271–76.</mixed-citation>
</ref>
<ref id="CR79">
<label>79.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Gheraibia</surname>
<given-names>Y</given-names>
</name>
<name>
<surname>Moussaoui</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Djenouri</surname>
<given-names>Y</given-names>
</name>
<name>
<surname>Kabir</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Yin</surname>
<given-names>P-Y</given-names>
</name>
<name>
<surname>Mazouzi</surname>
<given-names>S</given-names>
</name>
</person-group>
<article-title>Penguin search optimisation algorithm for finding optimal spaced seeds</article-title>
<source>Int J Softw Sci Comput Intell</source>
<year>2015</year>
<volume>7</volume>
<issue>2</issue>
<fpage>85</fpage>
<lpage>99</lpage>
<pub-id pub-id-type="doi">10.4018/IJSSCI.2015040105</pub-id>
</element-citation>
</ref>
<ref id="CR80">
<label>80.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Hahn</surname>
<given-names>L</given-names>
</name>
<name>
<surname>Leimeister</surname>
<given-names>C-A</given-names>
</name>
<name>
<surname>Ounit</surname>
<given-names>R</given-names>
</name>
<name>
<surname>Lonardi</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Morgenstern</surname>
<given-names>B</given-names>
</name>
</person-group>
<article-title>rasbhari: optimizing spaced seeds for database searching, read mapping and alignment-free sequence comparison</article-title>
<source>PLoS Comput Biol</source>
<year>2016</year>
<volume>12</volume>
<issue>10</issue>
<fpage>1005107</fpage>
<pub-id pub-id-type="doi">10.1371/journal.pcbi.1005107</pub-id>
</element-citation>
</ref>
<ref id="CR81">
<label>81.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Choi</surname>
<given-names>KP</given-names>
</name>
<name>
<surname>Zeng</surname>
<given-names>F</given-names>
</name>
<name>
<surname>Zhang</surname>
<given-names>L</given-names>
</name>
</person-group>
<article-title>Good spaced seeds for homology search</article-title>
<source>Bioinformatics</source>
<year>2004</year>
<volume>20</volume>
<issue>7</issue>
<fpage>1053</fpage>
<lpage>1059</lpage>
<pub-id pub-id-type="doi">10.1093/bioinformatics/bth037</pub-id>
<pub-id pub-id-type="pmid">14764573</pub-id>
</element-citation>
</ref>
<ref id="CR82">
<label>82.</label>
<element-citation publication-type="book">
<person-group person-group-type="author">
<name>
<surname>Allauzen</surname>
<given-names>C</given-names>
</name>
<name>
<surname>Riley</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Schalkwyk</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Skut</surname>
<given-names>W</given-names>
</name>
<name>
<surname>Mohri</surname>
<given-names>M</given-names>
</name>
</person-group>
<person-group person-group-type="editor">
<name>
<surname>Holub</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Zdarek</surname>
<given-names>J</given-names>
</name>
</person-group>
<article-title>OpenFst: a general and efficient weighted finite-state transducer library</article-title>
<source>Proceedings of the 12th international conference on implementation and application of automata (CIAA)</source>
<year>2007</year>
<publisher-loc>Prague</publisher-loc>
<publisher-name>Springer</publisher-name>
<fpage>11</fpage>
<lpage>23</lpage>
</element-citation>
</ref>
<ref id="CR83">
<label>83.</label>
<mixed-citation publication-type="other">Mohri M. Weighted automata algorithms. In: Handbook of weighted automata. Berlin: Springer; 2009. p. 213–54.</mixed-citation>
</ref>
<ref id="CR84">
<label>84.</label>
<mixed-citation publication-type="other">Huang L. Dynamic programming algorithms in semiring and hypergraph frameworks. Technical report, University of Pennsylvania, Philadelphia, USA; 2006.</mixed-citation>
</ref>
<ref id="CR85">
<label>85.</label>
<element-citation publication-type="book">
<person-group person-group-type="author">
<name>
<surname>Hopcroft</surname>
<given-names>JE</given-names>
</name>
<name>
<surname>Motwani</surname>
<given-names>R</given-names>
</name>
<name>
<surname>Ullman</surname>
<given-names>JD</given-names>
</name>
</person-group>
<source>Introduction to automata theory, languages, and computation</source>
<year>2007</year>
<edition>3</edition>
<publisher-loc>New York</publisher-loc>
<publisher-name>Pearson</publisher-name>
</element-citation>
</ref>
<ref id="CR86">
<label>86.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Aston</surname>
<given-names>JAD</given-names>
</name>
<name>
<surname>Martin</surname>
<given-names>DEK</given-names>
</name>
</person-group>
<article-title>Distributions associated with general runs and patterns in hidden Markov models</article-title>
<source>Ann Appl Stat</source>
<year>2007</year>
<volume>1</volume>
<issue>2</issue>
<fpage>585</fpage>
<lpage>611</lpage>
<pub-id pub-id-type="doi">10.1214/07-AOAS125</pub-id>
</element-citation>
</ref>
<ref id="CR87">
<label>87.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Noé</surname>
<given-names>L</given-names>
</name>
<name>
<surname>Martin</surname>
<given-names>DEK</given-names>
</name>
</person-group>
<article-title>A coverage criterion for spaced seeds and its applications to support vector machine string kernels and k-mer distances</article-title>
<source>J Comput Biol</source>
<year>2014</year>
<volume>21</volume>
<issue>12</issue>
<fpage>947</fpage>
<lpage>963</lpage>
<pub-id pub-id-type="doi">10.1089/cmb.2014.0173</pub-id>
<pub-id pub-id-type="pmid">25393923</pub-id>
</element-citation>
</ref>
<ref id="CR88">
<label>88.</label>
<mixed-citation publication-type="other">Kucherov G, Noé L, Roytberg MA. Iedera subset seed design tool. <ext-link ext-link-type="uri" xlink:href="http://bioinfo.lifl.fr/yass/iedera.php">http://bioinfo.lifl.fr/yass/iedera.php</ext-link>; 2016.</mixed-citation>
</ref>
<ref id="CR89">
<label>89.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Ma</surname>
<given-names>B</given-names>
</name>
<name>
<surname>Li</surname>
<given-names>M</given-names>
</name>
</person-group>
<article-title>On the complexity of spaced seeds</article-title>
<source>J Comput Syst Sci</source>
<year>2007</year>
<volume>73</volume>
<issue>7</issue>
<fpage>1024</fpage>
<lpage>1034</lpage>
<pub-id pub-id-type="doi">10.1016/j.jcss.2007.03.008</pub-id>
</element-citation>
</ref>
<ref id="CR90">
<label>90.</label>
<mixed-citation publication-type="other">Li M, Ma B, Zhang L. Superiority and complexity of the spaced seeds. In: Proceedings of the 17th symposium on discrete algorithms (SODA). Miami: ACM Press; 2006. p. 444–53.</mixed-citation>
</ref>
<ref id="CR91">
<label>91.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Nicodème</surname>
<given-names>P</given-names>
</name>
<name>
<surname>Salvy</surname>
<given-names>B</given-names>
</name>
<name>
<surname>Flajolet</surname>
<given-names>P</given-names>
</name>
</person-group>
<article-title>Motif statistics</article-title>
<source>Theor Comput Sci</source>
<year>2002</year>
<volume>287</volume>
<issue>2</issue>
<fpage>593</fpage>
<lpage>617</lpage>
<pub-id pub-id-type="doi">10.1016/S0304-3975(01)00264-X</pub-id>
</element-citation>
</ref>
<ref id="CR92">
<label>92.</label>
<mixed-citation publication-type="other">Myers G. What’s behind BLAST. In: Models and algorithms for genome evolution, vol 19. Computational biology. Berlin: Springer; 2013. p. 3–15.</mixed-citation>
</ref>
</ref-list>
</back>
</pmc>
</record>

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Sante/explor/MersV1/Data/Pmc/Corpus
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000246 | SxmlIndent | more

Or

HfdSelect -h $EXPLOR_AREA/Data/Pmc/Corpus/biblio.hfd -nk 000246 | SxmlIndent | more

To link to this page within the Wicri network

{{Explor lien
   |wiki=    Sante
   |area=    MersV1
   |flux=    Pmc
   |étape=   Corpus
   |type=    RBID
   |clé=     PMC:5310094
   |texte=   Best hits of 11110110111: model-free selection and parameter-free sensitivity calculation of spaced seeds
}}

To generate wiki pages

HfdIndexSelect -h $EXPLOR_AREA/Data/Pmc/Corpus/RBID.i   -Sk "pubmed:28289437" \
       | HfdSelect -Kh $EXPLOR_AREA/Data/Pmc/Corpus/biblio.hfd   \
       | NlmPubMed2Wicri -a MersV1 

Wicri

This area was generated with Dilib version V0.6.33.
Data generation: Mon Apr 20 23:26:43 2020. Site generation: Sat Mar 27 09:06:09 2021