Multi-seed lossless filtration
Internal identifier: 000650 (PascalFrancis/Corpus); previous: 000649; next: 000651
Authors: Gregory Kucherov; Laurent Noe; Mikhail Roytberg
Source: Lecture notes in computer science [ 0302-9743 ]; 2004.
RBID : Pascal:04-0412576
Abstract
We study a method of seed-based lossless filtration for approximate string matching and related applications. The method is based on a simultaneous use of several spaced seeds rather than a single seed as studied by Burkhardt and Karkkainen [1]. We present algorithms to compute several important parameters of seed families, study their combinatorial properties, and describe several techniques to construct efficient families. We also report a large-scale application of the proposed technique to the problem of oligonucleotide selection for an EST sequence database.
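To make the notion of lossless filtration concrete, here is a minimal brute-force sketch. It is not the algorithm of the paper, which this record does not reproduce. It assumes the usual conventions: a spaced seed is written as a string over `#` (position that must match) and `-` (don't-care position), and a seed family is lossless for a problem (m, k) when every alignment of length m with at most k mismatches is hit by at least one seed. The function names `hits` and `is_lossless` are illustrative only.

```python
from itertools import combinations


def hits(seed, mismatches, m):
    """Return True if `seed` fits somewhere in a length-m alignment
    so that none of its '#' positions falls on a mismatch."""
    required = [i for i, c in enumerate(seed) if c == "#"]
    for start in range(m - len(seed) + 1):
        if all(start + i not in mismatches for i in required):
            return True
    return False


def is_lossless(family, m, k):
    """Exhaustively check that every placement of k mismatches in a
    length-m alignment is hit by at least one seed of `family`.
    Checking exactly k mismatches suffices: removing a mismatch can
    only make a hit easier. The cost grows as C(m, k), so this is
    only practical for small m and k."""
    for placement in combinations(range(m), k):
        mismatches = set(placement)
        if not any(hits(seed, mismatches, m) for seed in family):
            return False
    return True


if __name__ == "__main__":
    # Each seed alone misses some alignment of length 5 with 1 mismatch...
    print(is_lossless(["##-#"], 5, 1))
    print(is_lossless(["#-##"], 5, 1))
    # ...but the two seeds used together cover every such alignment.
    print(is_lossless(["##-#", "#-##"], 5, 1))
```

On this toy instance, neither weight-3 seed is lossless alone for m = 5, k = 1, while the two-seed family is. This is the kind of gain from using several seeds simultaneously that the abstract refers to.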
Record in standard format (ISO 2709)
See the documentation on the Inist Standard format.
pA |
A01 | 01 | 1 | | @0 0302-9743 |
A05 | | | | @2 3109 |
A08 | 01 | 1 | ENG | @1 Multi-seed lossless filtration |
A09 | 01 | 1 | ENG | @1 CPM 2004 : combinatorial pattern matching : Istanbul, 5-7 July 2004 |
A11 | 01 | 1 | | @1 KUCHEROV (Gregory) |
A11 | 02 | 1 | | @1 NOE (Laurent) |
A11 | 03 | 1 | | @1 ROYTBERG (Mikhail) |
A12 | 01 | 1 | | @1 SAHINALP (Suleyman Cenk) @9 ed. |
A12 | 02 | 1 | | @1 MUTHUKRISHNAN (S.) @9 ed. |
A12 | 03 | 1 | | @1 DOGRUSOZ (Ugur) @9 ed. |
A14 | 01 | | | @1 INRIA/LORIA, 615, rue du Jardin Botanique, B.P. 101 @2 54602, Villers-lès-Nancy @3 FRA @Z 1 aut. @Z 2 aut. |
A14 | 02 | | | @1 Institute of Mathematical Problems in Biology @2 Pushchino, Moscow Region, 142290 @3 RUS @Z 3 aut. |
A20 | | | | @1 297-310 |
A21 | | | | @1 2004 |
A23 | 01 | | | @0 ENG |
A26 | 01 | | | @0 3-540-22341-X |
A43 | 01 | | | @1 INIST @2 16343 @5 354000117912830220 |
A44 | | | | @0 0000 @1 © 2004 INIST-CNRS. All rights reserved. |
A45 | | | | @0 21 ref. |
A47 | 01 | 1 | | @0 04-0412576 |
A60 | | | | @1 P @2 C |
A61 | | | | @0 A |
A64 | 01 | 1 | | @0 Lecture notes in computer science |
A66 | 01 | | | @0 DEU |
C01 | 01 | | ENG | @0 We study a method of seed-based lossless filtration for approximate string matching and related applications. The method is based on a simultaneous use of several spaced seeds rather than a single seed as studied by Burkhardt and Karkkainen [1]. We present algorithms to compute several important parameters of seed families, study their combinatorial properties, and describe several techniques to construct efficient families. We also report a large-scale application of the proposed technique to the problem of oligonucleotide selection for an EST sequence database. |
C02 | 01 | X | | @0 001D02A05 |
C03 | 01 | X | FRE | @0 Problème combinatoire @5 01 |
C03 | 01 | X | ENG | @0 Combinatorial problem @5 01 |
C03 | 01 | X | SPA | @0 Problema combinatorio @5 01 |
C03 | 02 | X | FRE | @0 Concordance forme @5 06 |
C03 | 02 | X | ENG | @0 Pattern matching @5 06 |
C03 | 03 | X | FRE | @0 Echelle grande @5 07 |
C03 | 03 | X | ENG | @0 Large scale @5 07 |
C03 | 03 | X | SPA | @0 Escala grande @5 07 |
C03 | 04 | X | FRE | @0 Base donnée @5 08 |
C03 | 04 | X | ENG | @0 Database @5 08 |
C03 | 04 | X | SPA | @0 Base dato @5 08 |
C03 | 05 | X | FRE | @0 Semence @5 18 |
C03 | 05 | X | ENG | @0 Seed @5 18 |
C03 | 05 | X | SPA | @0 Semilla @5 18 |
C03 | 06 | X | FRE | @0 Filtration @5 19 |
C03 | 06 | X | ENG | @0 Filtration @5 19 |
C03 | 06 | X | SPA | @0 Filtración @5 19 |
C03 | 07 | 3 | FRE | @0 Appariement chaîne @5 20 |
C03 | 07 | 3 | ENG | @0 String matching @5 20 |
C03 | 08 | X | FRE | @0 Oligonucléotide @5 21 |
C03 | 08 | X | ENG | @0 Oligonucleotide @5 21 |
C03 | 08 | X | SPA | @0 Oligonucleótido @5 21 |
C03 | 09 | X | FRE | @0 Problème sélection @5 23 |
C03 | 09 | X | ENG | @0 Selection problem @5 23 |
C03 | 09 | X | SPA | @0 Problema selección @5 23 |
N21 | | | | @1 236 |
N44 | 01 | | | @1 OTO |
N82 | | | | @1 OTO |
pR |
A30 | 01 | 1 | ENG | @1 Combinatorial pattern matching. Annual symposium @2 15 @3 Istanbul TUR @4 2004-07-05 |
Inist format (server)
NO : | PASCAL 04-0412576 INIST |
ET : | Multi-seed lossless filtration |
AU : | KUCHEROV (Gregory); NOE (Laurent); ROYTBERG (Mikhail); SAHINALP (Suleyman Cenk); MUTHUKRISHNAN (S.); DOGRUSOZ (Ugur) |
AF : | INRIA/LORIA, 615, rue du Jardin Botanique, B.P. 101/54602, Villers-lès-Nancy/France (1 aut., 2 aut.); Institute of Mathematical Problems in Biology/Pushchino, Moscow Region, 142290/Russie (3 aut.) |
DT : | Publication en série; Congrès; Niveau analytique |
SO : | Lecture notes in computer science; ISSN 0302-9743; Allemagne; Da. 2004; Vol. 3109; Pp. 297-310; Bibl. 21 ref. |
LA : | Anglais |
EA : | We study a method of seed-based lossless filtration for approximate string matching and related applications. The method is based on a simultaneous use of several spaced seeds rather than a single seed as studied by Burkhardt and Karkkainen [1]. We present algorithms to compute several important parameters of seed families, study their combinatorial properties, and describe several techniques to construct efficient families. We also report a large-scale application of the proposed technique to the problem of oligonucleotide selection for an EST sequence database. |
CC : | 001D02A05 |
FD : | Problème combinatoire; Concordance forme; Echelle grande; Base donnée; Semence; Filtration; Appariement chaîne; Oligonucléotide; Problème sélection |
ED : | Combinatorial problem; Pattern matching; Large scale; Database; Seed; Filtration; String matching; Oligonucleotide; Selection problem |
SD : | Problema combinatorio; Escala grande; Base dato; Semilla; Filtración; Oligonucleótido; Problema selección |
LO : | INIST-16343.354000117912830220 |
ID : | 04-0412576 |
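The server listing above is a simple two-column layout: a short field code, a colon, then a pipe-delimited value in which repeated items are separated by semicolons. As a hedged illustration of how such lines could be read programmatically, the sketch below parses them into a dictionary. The function name `parse_server_record` and the choice of which fields to split into lists are assumptions made for this example; they are not part of Dilib or of any Inist tool.

```python
# Fields whose values are treated here as semicolon-separated lists
# (an assumption based on how they appear in the listing above).
MULTI_VALUED = {"AU", "DT", "FD", "ED", "SD"}


def parse_server_record(lines):
    """Parse lines such as 'LA : | Anglais |' into a dict keyed by
    field code. Lines without a pipe or without a field code are skipped."""
    record = {}
    for line in lines:
        parts = [p.strip() for p in line.split("|")]
        if len(parts) < 2:
            continue
        key = parts[0].rstrip(":").strip()
        if not key:
            continue
        value = parts[1]
        if key in MULTI_VALUED:
            record[key] = [v.strip() for v in value.split(";") if v.strip()]
        else:
            record[key] = value
    return record


if __name__ == "__main__":
    sample = [
        "NO : | PASCAL 04-0412576 INIST |",
        "ET : | Multi-seed lossless filtration |",
        "AU : | KUCHEROV (Gregory); NOE (Laurent); ROYTBERG (Mikhail) |",
        "LA : | Anglais |",
    ]
    rec = parse_server_record(sample)
    print(rec["ET"])
    print(rec["AU"])
```

Fields such as `AF` and `SO` also contain semicolons but mix several kinds of information, so this sketch leaves them as plain strings.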
Links to Exploration step
Pascal:04-0412576
The document in XML format
<record><TEI><teiHeader><fileDesc><titleStmt><title xml:lang="en" level="a">Multi-seed lossless filtration</title>
<author><name sortKey="Kucherov, Gregory" sort="Kucherov, Gregory" uniqKey="Kucherov G" first="Gregory" last="Kucherov">Gregory Kucherov</name>
<affiliation><inist:fA14 i1="01"><s1>INRIA/LORIA, 615, rue du Jardin Botanique, B.P. 101</s1>
<s2>54602, Villers-lès-Nancy</s2>
<s3>FRA</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
</inist:fA14>
</affiliation>
</author>
<author><name sortKey="Noe, Laurent" sort="Noe, Laurent" uniqKey="Noe L" first="Laurent" last="Noe">Laurent Noe</name>
<affiliation><inist:fA14 i1="01"><s1>INRIA/LORIA, 615, rue du Jardin Botanique, B.P. 101</s1>
<s2>54602, Villers-lès-Nancy</s2>
<s3>FRA</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
</inist:fA14>
</affiliation>
</author>
<author><name sortKey="Roytberg, Mikhail" sort="Roytberg, Mikhail" uniqKey="Roytberg M" first="Mikhail" last="Roytberg">Mikhail Roytberg</name>
<affiliation><inist:fA14 i1="02"><s1>Institute of Mathematical Problems in Biology</s1>
<s2>Pushchino, Moscow Region, 142290</s2>
<s3>RUS</s3>
<sZ>3 aut.</sZ>
</inist:fA14>
</affiliation>
</author>
</titleStmt>
<publicationStmt><idno type="wicri:source">INIST</idno>
<idno type="inist">04-0412576</idno>
<date when="2004">2004</date>
<idno type="stanalyst">PASCAL 04-0412576 INIST</idno>
<idno type="RBID">Pascal:04-0412576</idno>
<idno type="wicri:Area/PascalFrancis/Corpus">000650</idno>
</publicationStmt>
<sourceDesc><biblStruct><analytic><title xml:lang="en" level="a">Multi-seed lossless filtration</title>
<author><name sortKey="Kucherov, Gregory" sort="Kucherov, Gregory" uniqKey="Kucherov G" first="Gregory" last="Kucherov">Gregory Kucherov</name>
<affiliation><inist:fA14 i1="01"><s1>INRIA/LORIA, 615, rue du Jardin Botanique, B.P. 101</s1>
<s2>54602, Villers-lès-Nancy</s2>
<s3>FRA</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
</inist:fA14>
</affiliation>
</author>
<author><name sortKey="Noe, Laurent" sort="Noe, Laurent" uniqKey="Noe L" first="Laurent" last="Noe">Laurent Noe</name>
<affiliation><inist:fA14 i1="01"><s1>INRIA/LORIA, 615, rue du Jardin Botanique, B.P. 101</s1>
<s2>54602, Villers-lès-Nancy</s2>
<s3>FRA</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
</inist:fA14>
</affiliation>
</author>
<author><name sortKey="Roytberg, Mikhail" sort="Roytberg, Mikhail" uniqKey="Roytberg M" first="Mikhail" last="Roytberg">Mikhail Roytberg</name>
<affiliation><inist:fA14 i1="02"><s1>Institute of Mathematical Problems in Biology</s1>
<s2>Pushchino, Moscow Region, 142290</s2>
<s3>RUS</s3>
<sZ>3 aut.</sZ>
</inist:fA14>
</affiliation>
</author>
</analytic>
<series><title level="j" type="main">Lecture notes in computer science</title>
<idno type="ISSN">0302-9743</idno>
<imprint><date when="2004">2004</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
<seriesStmt><title level="j" type="main">Lecture notes in computer science</title>
<idno type="ISSN">0302-9743</idno>
</seriesStmt>
</fileDesc>
<profileDesc><textClass><keywords scheme="KwdEn" xml:lang="en"><term>Combinatorial problem</term>
<term>Database</term>
<term>Filtration</term>
<term>Large scale</term>
<term>Oligonucleotide</term>
<term>Pattern matching</term>
<term>Seed</term>
<term>Selection problem</term>
<term>String matching</term>
</keywords>
<keywords scheme="Pascal" xml:lang="fr"><term>Problème combinatoire</term>
<term>Concordance forme</term>
<term>Echelle grande</term>
<term>Base donnée</term>
<term>Semence</term>
<term>Filtration</term>
<term>Appariement chaîne</term>
<term>Oligonucléotide</term>
<term>Problème sélection</term>
</keywords>
</textClass>
</profileDesc>
</teiHeader>
<front><div type="abstract" xml:lang="en">We study a method of seed-based lossless filtration for approximate string matching and related applications. The method is based on a simultaneous use of several spaced seeds rather than a single seed as studied by Burkhardt and Karkkainen [1]. We present algorithms to compute several important parameters of seed families, study their combinatorial properties, and describe several techniques to construct efficient families. We also report a large-scale application of the proposed technique to the problem of oligonucleotide selection for an EST sequence database.</div>
</front>
</TEI>
<inist><standard h6="B"><pA><fA01 i1="01" i2="1"><s0>0302-9743</s0>
</fA01>
<fA05><s2>3109</s2>
</fA05>
<fA08 i1="01" i2="1" l="ENG"><s1>Multi-seed lossless filtration</s1>
</fA08>
<fA09 i1="01" i2="1" l="ENG"><s1>CPM 2004 : combinatorial pattern matching : Istanbul, 5-7 July 2004</s1>
</fA09>
<fA11 i1="01" i2="1"><s1>KUCHEROV (Gregory)</s1>
</fA11>
<fA11 i1="02" i2="1"><s1>NOE (Laurent)</s1>
</fA11>
<fA11 i1="03" i2="1"><s1>ROYTBERG (Mikhail)</s1>
</fA11>
<fA12 i1="01" i2="1"><s1>SAHINALP (Suleyman Cenk)</s1>
<s9>ed.</s9>
</fA12>
<fA12 i1="02" i2="1"><s1>MUTHUKRISHNAN (S.)</s1>
<s9>ed.</s9>
</fA12>
<fA12 i1="03" i2="1"><s1>DOGRUSOZ (Ugur)</s1>
<s9>ed.</s9>
</fA12>
<fA14 i1="01"><s1>INRIA/LORIA, 615, rue du Jardin Botanique, B.P. 101</s1>
<s2>54602, Villers-lès-Nancy</s2>
<s3>FRA</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
</fA14>
<fA14 i1="02"><s1>Institute of Mathematical Problems in Biology</s1>
<s2>Pushchino, Moscow Region, 142290</s2>
<s3>RUS</s3>
<sZ>3 aut.</sZ>
</fA14>
<fA20><s1>297-310</s1>
</fA20>
<fA21><s1>2004</s1>
</fA21>
<fA23 i1="01"><s0>ENG</s0>
</fA23>
<fA26 i1="01"><s0>3-540-22341-X</s0>
</fA26>
<fA43 i1="01"><s1>INIST</s1>
<s2>16343</s2>
<s5>354000117912830220</s5>
</fA43>
<fA44><s0>0000</s0>
<s1>© 2004 INIST-CNRS. All rights reserved.</s1>
</fA44>
<fA45><s0>21 ref.</s0>
</fA45>
<fA47 i1="01" i2="1"><s0>04-0412576</s0>
</fA47>
<fA60><s1>P</s1>
<s2>C</s2>
</fA60>
<fA64 i1="01" i2="1"><s0>Lecture notes in computer science</s0>
</fA64>
<fA66 i1="01"><s0>DEU</s0>
</fA66>
<fC01 i1="01" l="ENG"><s0>We study a method of seed-based lossless filtration for approximate string matching and related applications. The method is based on a simultaneous use of several spaced seeds rather than a single seed as studied by Burkhardt and Karkkainen [1]. We present algorithms to compute several important parameters of seed families, study their combinatorial properties, and describe several techniques to construct efficient families. We also report a large-scale application of the proposed technique to the problem of oligonucleotide selection for an EST sequence database.</s0>
</fC01>
<fC02 i1="01" i2="X"><s0>001D02A05</s0>
</fC02>
<fC03 i1="01" i2="X" l="FRE"><s0>Problème combinatoire</s0>
<s5>01</s5>
</fC03>
<fC03 i1="01" i2="X" l="ENG"><s0>Combinatorial problem</s0>
<s5>01</s5>
</fC03>
<fC03 i1="01" i2="X" l="SPA"><s0>Problema combinatorio</s0>
<s5>01</s5>
</fC03>
<fC03 i1="02" i2="X" l="FRE"><s0>Concordance forme</s0>
<s5>06</s5>
</fC03>
<fC03 i1="02" i2="X" l="ENG"><s0>Pattern matching</s0>
<s5>06</s5>
</fC03>
<fC03 i1="03" i2="X" l="FRE"><s0>Echelle grande</s0>
<s5>07</s5>
</fC03>
<fC03 i1="03" i2="X" l="ENG"><s0>Large scale</s0>
<s5>07</s5>
</fC03>
<fC03 i1="03" i2="X" l="SPA"><s0>Escala grande</s0>
<s5>07</s5>
</fC03>
<fC03 i1="04" i2="X" l="FRE"><s0>Base donnée</s0>
<s5>08</s5>
</fC03>
<fC03 i1="04" i2="X" l="ENG"><s0>Database</s0>
<s5>08</s5>
</fC03>
<fC03 i1="04" i2="X" l="SPA"><s0>Base dato</s0>
<s5>08</s5>
</fC03>
<fC03 i1="05" i2="X" l="FRE"><s0>Semence</s0>
<s5>18</s5>
</fC03>
<fC03 i1="05" i2="X" l="ENG"><s0>Seed</s0>
<s5>18</s5>
</fC03>
<fC03 i1="05" i2="X" l="SPA"><s0>Semilla</s0>
<s5>18</s5>
</fC03>
<fC03 i1="06" i2="X" l="FRE"><s0>Filtration</s0>
<s5>19</s5>
</fC03>
<fC03 i1="06" i2="X" l="ENG"><s0>Filtration</s0>
<s5>19</s5>
</fC03>
<fC03 i1="06" i2="X" l="SPA"><s0>Filtración</s0>
<s5>19</s5>
</fC03>
<fC03 i1="07" i2="3" l="FRE"><s0>Appariement chaîne</s0>
<s5>20</s5>
</fC03>
<fC03 i1="07" i2="3" l="ENG"><s0>String matching</s0>
<s5>20</s5>
</fC03>
<fC03 i1="08" i2="X" l="FRE"><s0>Oligonucléotide</s0>
<s5>21</s5>
</fC03>
<fC03 i1="08" i2="X" l="ENG"><s0>Oligonucleotide</s0>
<s5>21</s5>
</fC03>
<fC03 i1="08" i2="X" l="SPA"><s0>Oligonucleótido</s0>
<s5>21</s5>
</fC03>
<fC03 i1="09" i2="X" l="FRE"><s0>Problème sélection</s0>
<s5>23</s5>
</fC03>
<fC03 i1="09" i2="X" l="ENG"><s0>Selection problem</s0>
<s5>23</s5>
</fC03>
<fC03 i1="09" i2="X" l="SPA"><s0>Problema selección</s0>
<s5>23</s5>
</fC03>
<fN21><s1>236</s1>
</fN21>
<fN44 i1="01"><s1>OTO</s1>
</fN44>
<fN82><s1>OTO</s1>
</fN82>
</pA>
<pR><fA30 i1="01" i2="1" l="ENG"><s1>Combinatorial pattern matching. Annual symposium</s1>
<s2>15</s2>
<s3>Istanbul TUR</s3>
<s4>2004-07-05</s4>
</fA30>
</pR>
</standard>
<server><NO>PASCAL 04-0412576 INIST</NO>
<ET>Multi-seed lossless filtration</ET>
<AU>KUCHEROV (Gregory); NOE (Laurent); ROYTBERG (Mikhail); SAHINALP (Suleyman Cenk); MUTHUKRISHNAN (S.); DOGRUSOZ (Ugur)</AU>
<AF>INRIA/LORIA, 615, rue du Jardin Botanique, B.P. 101/54602, Villers-lès-Nancy/France (1 aut., 2 aut.); Institute of Mathematical Problems in Biology/Pushchino, Moscow Region, 142290/Russie (3 aut.)</AF>
<DT>Publication en série; Congrès; Niveau analytique</DT>
<SO>Lecture notes in computer science; ISSN 0302-9743; Allemagne; Da. 2004; Vol. 3109; Pp. 297-310; Bibl. 21 ref.</SO>
<LA>Anglais</LA>
<EA>We study a method of seed-based lossless filtration for approximate string matching and related applications. The method is based on a simultaneous use of several spaced seeds rather than a single seed as studied by Burkhardt and Karkkainen [1]. We present algorithms to compute several important parameters of seed families, study their combinatorial properties, and describe several techniques to construct efficient families. We also report a large-scale application of the proposed technique to the problem of oligonucleotide selection for an EST sequence database.</EA>
<CC>001D02A05</CC>
<FD>Problème combinatoire; Concordance forme; Echelle grande; Base donnée; Semence; Filtration; Appariement chaîne; Oligonucléotide; Problème sélection</FD>
<ED>Combinatorial problem; Pattern matching; Large scale; Database; Seed; Filtration; String matching; Oligonucleotide; Selection problem</ED>
<SD>Problema combinatorio; Escala grande; Base dato; Semilla; Filtración; Oligonucleótido; Problema selección</SD>
<LO>INIST-16343.354000117912830220</LO>
<ID>04-0412576</ID>
</server>
</inist>
</record>
To manipulate this document under Unix (Dilib)
EXPLOR_STEP=$WICRI_ROOT/Wicri/Lorraine/explor/InforLorV4/Data/PascalFrancis/Corpus
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000650 | SxmlIndent | more
Or
HfdSelect -h $EXPLOR_AREA/Data/PascalFrancis/Corpus/biblio.hfd -nk 000650 | SxmlIndent | more
To add a link to this page within the Wicri network
{{Explor lien
|wiki= Wicri/Lorraine
|area= InforLorV4
|flux= PascalFrancis
|étape= Corpus
|type= RBID
|clé= Pascal:04-0412576
|texte= Multi-seed lossless filtration
}}
This area was generated with Dilib version V0.6.33. Data generation: Mon Jun 10 21:56:28 2019. Site generation: Fri Feb 25 15:29:27 2022