Exploration server for computer science research in Lorraine

Warning: this site is under development!
Warning: this site was generated automatically from raw corpora.
The information has therefore not been validated.

Multi-seed lossless filtration

Internal identifier: 000650 (PascalFrancis/Corpus); previous: 000649; next: 000651

Authors: Gregory Kucherov; Laurent Noe; Mikhail Roytberg

Source:

RBID : Pascal:04-0412576

French descriptors

English descriptors

Abstract

We study a method of seed-based lossless filtration for approximate string matching and related applications. The method is based on a simultaneous use of several spaced seeds rather than a single seed as studied by Burkhardt and Karkkainen [1]. We present algorithms to compute several important parameters of seed families, study their combinatorial properties, and describe several techniques to construct efficient families. We also report a large-scale application of the proposed technique to the problem of oligonucleotide selection for an EST sequence database.
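
The idea of a lossless seed family can be illustrated with a small brute-force check. This is only a sketch under stated assumptions, not the algorithms of the paper: seeds are written with `#` for a position that must match and `-` for a don't-care position (a common notation, assumed here), and the function names `seed_hits` and `is_lossless` are hypothetical. A family is taken to be lossless for length `m` and `k` mismatches if every placement of `k` mismatches over `m` positions still leaves at least one seed occurrence that touches only matching positions.

```python
from itertools import combinations


def seed_hits(seed, mismatches, m):
    """Return True if `seed` fits somewhere in a length-m alignment
    without any of its '#' positions landing on a mismatch."""
    required = [i for i, c in enumerate(seed) if c == "#"]
    for start in range(m - len(seed) + 1):
        if all(start + off not in mismatches for off in required):
            return True
    return False


def is_lossless(seeds, m, k):
    """Brute-force check over every placement of k mismatches.

    Checking exactly k mismatches is enough: removing a mismatch
    can only make a hit easier, never harder.
    """
    k = min(k, m)  # cannot place more mismatches than positions
    for positions in combinations(range(m), k):
        mism = set(positions)
        if not any(seed_hits(s, mism, m) for s in seeds):
            return False  # this mismatch placement escapes every seed
    return True


if __name__ == "__main__":
    # Toy case: length 5, 2 mismatches.
    print(is_lossless(["##"], 5, 2))         # contiguous seed alone
    print(is_lossless(["#-#"], 5, 2))        # spaced seed alone
    print(is_lossless(["##", "#-#"], 5, 2))  # both seeds together
```

In this toy case neither seed is lossless on its own (`##` misses mismatches at positions 1 and 3, `#-#` misses mismatches at positions 1 and 2), but the two seeds used together cover every placement. The enumeration grows combinatorially with `m` and `k`, so it is only practical for very small parameters.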

Record in standard format (ISO 2709)

See the documentation on the Inist Standard format.

pA  
A01 01  1    @0 0302-9743
A05       @2 3109
A08 01  1  ENG  @1 Multi-seed lossless filtration
A09 01  1  ENG  @1 CPM 2004 : combinatorial pattern matching : Istanbul, 5-7 July 2004
A11 01  1    @1 KUCHEROV (Gregory)
A11 02  1    @1 NOE (Laurent)
A11 03  1    @1 ROYTBERG (Mikhail)
A12 01  1    @1 SAHINALP (Suleyman Cenk) @9 ed.
A12 02  1    @1 MUTHUKRISHNAN (S.) @9 ed.
A12 03  1    @1 DOGRUSOZ (Ugur) @9 ed.
A14 01      @1 INRIA/LORIA, 615, rue du Jardin Botanique, B.P. 101 @2 54602, Villers-lès-Nancy @3 FRA @Z 1 aut. @Z 2 aut.
A14 02      @1 Institute of Mathematical Problems in Biology @2 Pushchino, Moscow Region, 142290 @3 RUS @Z 3 aut.
A20       @1 297-310
A21       @1 2004
A23 01      @0 ENG
A26 01      @0 3-540-22341-X
A43 01      @1 INIST @2 16343 @5 354000117912830220
A44       @0 0000 @1 © 2004 INIST-CNRS. All rights reserved.
A45       @0 21 ref.
A47 01  1    @0 04-0412576
A60       @1 P @2 C
A61       @0 A
A64 01  1    @0 Lecture notes in computer science
A66 01      @0 DEU
C01 01    ENG  @0 We study a method of seed-based lossless filtration for approximate string matching and related applications. The method is based on a simultaneous use of several spaced seeds rather than a single seed as studied by Burkhardt and Karkkainen [1]. We present algorithms to compute several important parameters of seed families, study their combinatorial properties, and describe several techniques to construct efficient families. We also report a large-scale application of the proposed technique to the problem of oligonucleotide selection for an EST sequence database.
C02 01  X    @0 001D02A05
C03 01  X  FRE  @0 Problème combinatoire @5 01
C03 01  X  ENG  @0 Combinatorial problem @5 01
C03 01  X  SPA  @0 Problema combinatorio @5 01
C03 02  X  FRE  @0 Concordance forme @5 06
C03 02  X  ENG  @0 Pattern matching @5 06
C03 03  X  FRE  @0 Echelle grande @5 07
C03 03  X  ENG  @0 Large scale @5 07
C03 03  X  SPA  @0 Escala grande @5 07
C03 04  X  FRE  @0 Base donnée @5 08
C03 04  X  ENG  @0 Database @5 08
C03 04  X  SPA  @0 Base dato @5 08
C03 05  X  FRE  @0 Semence @5 18
C03 05  X  ENG  @0 Seed @5 18
C03 05  X  SPA  @0 Semilla @5 18
C03 06  X  FRE  @0 Filtration @5 19
C03 06  X  ENG  @0 Filtration @5 19
C03 06  X  SPA  @0 Filtración @5 19
C03 07  3  FRE  @0 Appariement chaîne @5 20
C03 07  3  ENG  @0 String matching @5 20
C03 08  X  FRE  @0 Oligonucléotide @5 21
C03 08  X  ENG  @0 Oligonucleotide @5 21
C03 08  X  SPA  @0 Oligonucleótido @5 21
C03 09  X  FRE  @0 Problème sélection @5 23
C03 09  X  ENG  @0 Selection problem @5 23
C03 09  X  SPA  @0 Problema selección @5 23
N21       @1 236
N44 01      @1 OTO
N82       @1 OTO
pR  
A30 01  1  ENG  @1 Combinatorial pattern matching. Annual symposium @2 15 @3 Istanbul TUR @4 2004-07-05
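
The field lines above follow a visible pattern: a field tag, optional qualifiers, then subfields introduced by `@` and a one-character code. A minimal Python sketch for splitting one such line is given below. It is based only on the textual layout shown on this page, not on the Inist format documentation; the function name `parse_field` is hypothetical, and the sketch assumes that no subfield value itself contains `@` followed by a character and a space.

```python
import re

# A subfield starts with '@', a one-character code, then whitespace.
SUBFIELD = re.compile(r"@(\w)\s+")


def parse_field(line):
    """Split a line such as 'A11 01  1    @1 KUCHEROV (Gregory)' into
    (tag, qualifiers, [(subfield_code, value), ...]).

    Returns None for lines without any '@' subfield (e.g. 'pA').
    """
    head, sep, rest = line.partition("@")
    if not sep:
        return None
    tokens = head.split()
    tag, qualifiers = tokens[0], tokens[1:]
    # re.split with one capturing group yields:
    # ['', code1, value1, code2, value2, ...]
    parts = SUBFIELD.split("@" + rest)
    subfields = [
        (code, value.strip())
        for code, value in zip(parts[1::2], parts[2::2])
    ]
    return tag, qualifiers, subfields


if __name__ == "__main__":
    sample = ("A14 02      @1 Institute of Mathematical Problems in Biology "
              "@2 Pushchino, Moscow Region, 142290 @3 RUS @Z 3 aut.")
    tag, qualifiers, subfields = parse_field(sample)
    print(tag, qualifiers)
    for code, value in subfields:
        print(code, value)
```

Repeated subfield codes (such as the two `@Z` entries of field A14 01) are kept in order as separate pairs rather than merged into a dictionary.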

Inist format (server)

NO : PASCAL 04-0412576 INIST
ET : Multi-seed lossless filtration
AU : KUCHEROV (Gregory); NOE (Laurent); ROYTBERG (Mikhail); SAHINALP (Suleyman Cenk); MUTHUKRISHNAN (S.); DOGRUSOZ (Ugur)
AF : INRIA/LORIA, 615, rue du Jardin Botanique, B.P. 101/54602, Villers-lès-Nancy/France (1 aut., 2 aut.); Institute of Mathematical Problems in Biology/Pushchino, Moscow Region, 142290/Russie (3 aut.)
DT : Publication en série; Congrès; Niveau analytique
SO : Lecture notes in computer science; ISSN 0302-9743; Allemagne; Da. 2004; Vol. 3109; Pp. 297-310; Bibl. 21 ref.
LA : Anglais
EA : We study a method of seed-based lossless filtration for approximate string matching and related applications. The method is based on a simultaneous use of several spaced seeds rather than a single seed as studied by Burkhardt and Karkkainen [1]. We present algorithms to compute several important parameters of seed families, study their combinatorial properties, and describe several techniques to construct efficient families. We also report a large-scale application of the proposed technique to the problem of oligonucleotide selection for an EST sequence database.
CC : 001D02A05
FD : Problème combinatoire; Concordance forme; Echelle grande; Base donnée; Semence; Filtration; Appariement chaîne; Oligonucléotide; Problème sélection
ED : Combinatorial problem; Pattern matching; Large scale; Database; Seed; Filtration; String matching; Oligonucleotide; Selection problem
SD : Problema combinatorio; Escala grande; Base dato; Semilla; Filtración; Oligonucleótido; Problema selección
LO : INIST-16343.354000117912830220
ID : 04-0412576

Links to Exploration step

Pascal:04-0412576

The document in XML format

<record>
<TEI>
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en" level="a">Multi-seed lossless filtration</title>
<author>
<name sortKey="Kucherov, Gregory" sort="Kucherov, Gregory" uniqKey="Kucherov G" first="Gregory" last="Kucherov">Gregory Kucherov</name>
<affiliation>
<inist:fA14 i1="01">
<s1>INRIA/LORIA, 615, rue du Jardin Botanique, B.P. 101</s1>
<s2>54602, Villers-lès-Nancy</s2>
<s3>FRA</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
</inist:fA14>
</affiliation>
</author>
<author>
<name sortKey="Noe, Laurent" sort="Noe, Laurent" uniqKey="Noe L" first="Laurent" last="Noe">Laurent Noe</name>
<affiliation>
<inist:fA14 i1="01">
<s1>INRIA/LORIA, 615, rue du Jardin Botanique, B.P. 101</s1>
<s2>54602, Villers-lès-Nancy</s2>
<s3>FRA</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
</inist:fA14>
</affiliation>
</author>
<author>
<name sortKey="Roytberg, Mikhail" sort="Roytberg, Mikhail" uniqKey="Roytberg M" first="Mikhail" last="Roytberg">Mikhail Roytberg</name>
<affiliation>
<inist:fA14 i1="02">
<s1>Institute of Mathematical Problems in Biology</s1>
<s2>Pushchino, Moscow Region, 142290</s2>
<s3>RUS</s3>
<sZ>3 aut.</sZ>
</inist:fA14>
</affiliation>
</author>
</titleStmt>
<publicationStmt>
<idno type="wicri:source">INIST</idno>
<idno type="inist">04-0412576</idno>
<date when="2004">2004</date>
<idno type="stanalyst">PASCAL 04-0412576 INIST</idno>
<idno type="RBID">Pascal:04-0412576</idno>
<idno type="wicri:Area/PascalFrancis/Corpus">000650</idno>
</publicationStmt>
<sourceDesc>
<biblStruct>
<analytic>
<title xml:lang="en" level="a">Multi-seed lossless filtration</title>
<author>
<name sortKey="Kucherov, Gregory" sort="Kucherov, Gregory" uniqKey="Kucherov G" first="Gregory" last="Kucherov">Gregory Kucherov</name>
<affiliation>
<inist:fA14 i1="01">
<s1>INRIA/LORIA, 615, rue du Jardin Botanique, B.P. 101</s1>
<s2>54602, Villers-lès-Nancy</s2>
<s3>FRA</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
</inist:fA14>
</affiliation>
</author>
<author>
<name sortKey="Noe, Laurent" sort="Noe, Laurent" uniqKey="Noe L" first="Laurent" last="Noe">Laurent Noe</name>
<affiliation>
<inist:fA14 i1="01">
<s1>INRIA/LORIA, 615, rue du Jardin Botanique, B.P. 101</s1>
<s2>54602, Villers-lès-Nancy</s2>
<s3>FRA</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
</inist:fA14>
</affiliation>
</author>
<author>
<name sortKey="Roytberg, Mikhail" sort="Roytberg, Mikhail" uniqKey="Roytberg M" first="Mikhail" last="Roytberg">Mikhail Roytberg</name>
<affiliation>
<inist:fA14 i1="02">
<s1>Institute of Mathematical Problems in Biology</s1>
<s2>Pushchino, Moscow Region, 142290</s2>
<s3>RUS</s3>
<sZ>3 aut.</sZ>
</inist:fA14>
</affiliation>
</author>
</analytic>
<series>
<title level="j" type="main">Lecture notes in computer science</title>
<idno type="ISSN">0302-9743</idno>
<imprint>
<date when="2004">2004</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
<seriesStmt>
<title level="j" type="main">Lecture notes in computer science</title>
<idno type="ISSN">0302-9743</idno>
</seriesStmt>
</fileDesc>
<profileDesc>
<textClass>
<keywords scheme="KwdEn" xml:lang="en">
<term>Combinatorial problem</term>
<term>Database</term>
<term>Filtration</term>
<term>Large scale</term>
<term>Oligonucleotide</term>
<term>Pattern matching</term>
<term>Seed</term>
<term>Selection problem</term>
<term>String matching</term>
</keywords>
<keywords scheme="Pascal" xml:lang="fr">
<term>Problème combinatoire</term>
<term>Concordance forme</term>
<term>Echelle grande</term>
<term>Base donnée</term>
<term>Semence</term>
<term>Filtration</term>
<term>Appariement chaîne</term>
<term>Oligonucléotide</term>
<term>Problème sélection</term>
</keywords>
</textClass>
</profileDesc>
</teiHeader>
<front>
<div type="abstract" xml:lang="en">We study a method of seed-based lossless filtration for approximate string matching and related applications. The method is based on a simultaneous use of several spaced seeds rather than a single seed as studied by Burkhardt and Karkkainen [1]. We present algorithms to compute several important parameters of seed families, study their combinatorial properties, and describe several techniques to construct efficient families. We also report a large-scale application of the proposed technique to the problem of oligonucleotide selection for an EST sequence database.</div>
</front>
</TEI>
<inist>
<standard h6="B">
<pA>
<fA01 i1="01" i2="1">
<s0>0302-9743</s0>
</fA01>
<fA05>
<s2>3109</s2>
</fA05>
<fA08 i1="01" i2="1" l="ENG">
<s1>Multi-seed lossless filtration</s1>
</fA08>
<fA09 i1="01" i2="1" l="ENG">
<s1>CPM 2004 : combinatorial pattern matching : Istanbul, 5-7 July 2004</s1>
</fA09>
<fA11 i1="01" i2="1">
<s1>KUCHEROV (Gregory)</s1>
</fA11>
<fA11 i1="02" i2="1">
<s1>NOE (Laurent)</s1>
</fA11>
<fA11 i1="03" i2="1">
<s1>ROYTBERG (Mikhail)</s1>
</fA11>
<fA12 i1="01" i2="1">
<s1>SAHINALP (Suleyman Cenk)</s1>
<s9>ed.</s9>
</fA12>
<fA12 i1="02" i2="1">
<s1>MUTHUKRISHNAN (S.)</s1>
<s9>ed.</s9>
</fA12>
<fA12 i1="03" i2="1">
<s1>DOGRUSOZ (Ugur)</s1>
<s9>ed.</s9>
</fA12>
<fA14 i1="01">
<s1>INRIA/LORIA, 615, rue du Jardin Botanique, B.P. 101</s1>
<s2>54602, Villers-lès-Nancy</s2>
<s3>FRA</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
</fA14>
<fA14 i1="02">
<s1>Institute of Mathematical Problems in Biology</s1>
<s2>Pushchino, Moscow Region, 142290</s2>
<s3>RUS</s3>
<sZ>3 aut.</sZ>
</fA14>
<fA20>
<s1>297-310</s1>
</fA20>
<fA21>
<s1>2004</s1>
</fA21>
<fA23 i1="01">
<s0>ENG</s0>
</fA23>
<fA26 i1="01">
<s0>3-540-22341-X</s0>
</fA26>
<fA43 i1="01">
<s1>INIST</s1>
<s2>16343</s2>
<s5>354000117912830220</s5>
</fA43>
<fA44>
<s0>0000</s0>
<s1>© 2004 INIST-CNRS. All rights reserved.</s1>
</fA44>
<fA45>
<s0>21 ref.</s0>
</fA45>
<fA47 i1="01" i2="1">
<s0>04-0412576</s0>
</fA47>
<fA60>
<s1>P</s1>
<s2>C</s2>
</fA60>
<fA61>
<s0>A</s0>
</fA61>
<fA64 i1="01" i2="1">
<s0>Lecture notes in computer science</s0>
</fA64>
<fA66 i1="01">
<s0>DEU</s0>
</fA66>
<fC01 i1="01" l="ENG">
<s0>We study a method of seed-based lossless filtration for approximate string matching and related applications. The method is based on a simultaneous use of several spaced seeds rather than a single seed as studied by Burkhardt and Karkkainen [1]. We present algorithms to compute several important parameters of seed families, study their combinatorial properties, and describe several techniques to construct efficient families. We also report a large-scale application of the proposed technique to the problem of oligonucleotide selection for an EST sequence database.</s0>
</fC01>
<fC02 i1="01" i2="X">
<s0>001D02A05</s0>
</fC02>
<fC03 i1="01" i2="X" l="FRE">
<s0>Problème combinatoire</s0>
<s5>01</s5>
</fC03>
<fC03 i1="01" i2="X" l="ENG">
<s0>Combinatorial problem</s0>
<s5>01</s5>
</fC03>
<fC03 i1="01" i2="X" l="SPA">
<s0>Problema combinatorio</s0>
<s5>01</s5>
</fC03>
<fC03 i1="02" i2="X" l="FRE">
<s0>Concordance forme</s0>
<s5>06</s5>
</fC03>
<fC03 i1="02" i2="X" l="ENG">
<s0>Pattern matching</s0>
<s5>06</s5>
</fC03>
<fC03 i1="03" i2="X" l="FRE">
<s0>Echelle grande</s0>
<s5>07</s5>
</fC03>
<fC03 i1="03" i2="X" l="ENG">
<s0>Large scale</s0>
<s5>07</s5>
</fC03>
<fC03 i1="03" i2="X" l="SPA">
<s0>Escala grande</s0>
<s5>07</s5>
</fC03>
<fC03 i1="04" i2="X" l="FRE">
<s0>Base donnée</s0>
<s5>08</s5>
</fC03>
<fC03 i1="04" i2="X" l="ENG">
<s0>Database</s0>
<s5>08</s5>
</fC03>
<fC03 i1="04" i2="X" l="SPA">
<s0>Base dato</s0>
<s5>08</s5>
</fC03>
<fC03 i1="05" i2="X" l="FRE">
<s0>Semence</s0>
<s5>18</s5>
</fC03>
<fC03 i1="05" i2="X" l="ENG">
<s0>Seed</s0>
<s5>18</s5>
</fC03>
<fC03 i1="05" i2="X" l="SPA">
<s0>Semilla</s0>
<s5>18</s5>
</fC03>
<fC03 i1="06" i2="X" l="FRE">
<s0>Filtration</s0>
<s5>19</s5>
</fC03>
<fC03 i1="06" i2="X" l="ENG">
<s0>Filtration</s0>
<s5>19</s5>
</fC03>
<fC03 i1="06" i2="X" l="SPA">
<s0>Filtración</s0>
<s5>19</s5>
</fC03>
<fC03 i1="07" i2="3" l="FRE">
<s0>Appariement chaîne</s0>
<s5>20</s5>
</fC03>
<fC03 i1="07" i2="3" l="ENG">
<s0>String matching</s0>
<s5>20</s5>
</fC03>
<fC03 i1="08" i2="X" l="FRE">
<s0>Oligonucléotide</s0>
<s5>21</s5>
</fC03>
<fC03 i1="08" i2="X" l="ENG">
<s0>Oligonucleotide</s0>
<s5>21</s5>
</fC03>
<fC03 i1="08" i2="X" l="SPA">
<s0>Oligonucleótido</s0>
<s5>21</s5>
</fC03>
<fC03 i1="09" i2="X" l="FRE">
<s0>Problème sélection</s0>
<s5>23</s5>
</fC03>
<fC03 i1="09" i2="X" l="ENG">
<s0>Selection problem</s0>
<s5>23</s5>
</fC03>
<fC03 i1="09" i2="X" l="SPA">
<s0>Problema selección</s0>
<s5>23</s5>
</fC03>
<fN21>
<s1>236</s1>
</fN21>
<fN44 i1="01">
<s1>OTO</s1>
</fN44>
<fN82>
<s1>OTO</s1>
</fN82>
</pA>
<pR>
<fA30 i1="01" i2="1" l="ENG">
<s1>Combinatorial pattern matching. Annual symposium</s1>
<s2>15</s2>
<s3>Istanbul TUR</s3>
<s4>2004-07-05</s4>
</fA30>
</pR>
</standard>
<server>
<NO>PASCAL 04-0412576 INIST</NO>
<ET>Multi-seed lossless filtration</ET>
<AU>KUCHEROV (Gregory); NOE (Laurent); ROYTBERG (Mikhail); SAHINALP (Suleyman Cenk); MUTHUKRISHNAN (S.); DOGRUSOZ (Ugur)</AU>
<AF>INRIA/LORIA, 615, rue du Jardin Botanique, B.P. 101/54602, Villers-lès-Nancy/France (1 aut., 2 aut.); Institute of Mathematical Problems in Biology/Pushchino, Moscow Region, 142290/Russie (3 aut.)</AF>
<DT>Publication en série; Congrès; Niveau analytique</DT>
<SO>Lecture notes in computer science; ISSN 0302-9743; Allemagne; Da. 2004; Vol. 3109; Pp. 297-310; Bibl. 21 ref.</SO>
<LA>Anglais</LA>
<EA>We study a method of seed-based lossless filtration for approximate string matching and related applications. The method is based on a simultaneous use of several spaced seeds rather than a single seed as studied by Burkhardt and Karkkainen [1]. We present algorithms to compute several important parameters of seed families, study their combinatorial properties, and describe several techniques to construct efficient families. We also report a large-scale application of the proposed technique to the problem of oligonucleotide selection for an EST sequence database.</EA>
<CC>001D02A05</CC>
<FD>Problème combinatoire; Concordance forme; Echelle grande; Base donnée; Semence; Filtration; Appariement chaîne; Oligonucléotide; Problème sélection</FD>
<ED>Combinatorial problem; Pattern matching; Large scale; Database; Seed; Filtration; String matching; Oligonucleotide; Selection problem</ED>
<SD>Problema combinatorio; Escala grande; Base dato; Semilla; Filtración; Oligonucleótido; Problema selección</SD>
<LO>INIST-16343.354000117912830220</LO>
<ID>04-0412576</ID>
</server>
</inist>
</record>

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Wicri/Lorraine/explor/InforLorV4/Data/PascalFrancis/Corpus
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000650 | SxmlIndent | more

Or

HfdSelect -h $EXPLOR_AREA/Data/PascalFrancis/Corpus/biblio.hfd -nk 000650 | SxmlIndent | more

To link to this page within the Wicri network

{{Explor lien
   |wiki=    Wicri/Lorraine
   |area=    InforLorV4
   |flux=    PascalFrancis
   |étape=   Corpus
   |type=    RBID
   |clé=     Pascal:04-0412576
   |texte=   Multi-seed lossless filtration
}}

This area was generated with Dilib version V0.6.33.
Data generation: Mon Jun 10 21:56:28 2019. Site generation: Fri Feb 25 15:29:27 2022