Multi-seed lossless filtration

Internal identifier: 000650 (PascalFrancis/Corpus); previous: 000649; next: 000651

Authors: Gregory Kucherov; Laurent Noe; Mikhail Roytberg

Source: Lecture notes in computer science [ISSN 0302-9743]; 2004.

RBID : Pascal:04-0412576

French descriptors

Pascal (Inist)
- Problème combinatoire, Concordance forme, Echelle grande, Base donnée, Semence, Filtration, Appariement chaîne, Oligonucléotide, Problème sélection.

English descriptors

KwdEn :
- Combinatorial problem, Database, Filtration, Large scale, Oligonucleotide, Pattern matching, Seed, Selection problem, String matching.

Abstract

We study a method of seed-based lossless filtration for approximate string matching and related applications. The method is based on a simultaneous use of several spaced seeds rather than a single seed as studied by Burkhardt and Karkkainen [1]. We present algorithms to compute several important parameters of seed families, study their combinatorial properties, and describe several techniques to construct efficient families. We also report a large-scale application of the proposed technique to the problem of oligonucleotide selection for an EST sequence database.
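
The lossless property mentioned in the abstract can be illustrated with a small brute-force check. The sketch below is an illustration under stated assumptions, not the authors' algorithm: a seed is written as a string of `#` (a position that must match) and `-` (a position that is ignored), and a family is called lossless for a hypothetical (m, k) setting if every placement of k mismatches among m aligned positions leaves at least one seed that fits, at some offset, entirely on matching positions. The function names `seed_hits` and `is_lossless` are invented for this example.

```python
from itertools import combinations


def seed_hits(seed, matches):
    """Return True if `seed` fits at some offset with every '#' on a match.

    `seed` is a string over {'#', '-'}; `matches` is a list of booleans,
    True where the two aligned strings agree.
    """
    required = [i for i, c in enumerate(seed) if c == "#"]
    # A seed longer than the alignment yields an empty range and never hits.
    for offset in range(len(matches) - len(seed) + 1):
        if all(matches[offset + i] for i in required):
            return True
    return False


def is_lossless(family, m, k):
    """Brute-force check: does `family` detect every alignment of length m
    with exactly k mismatches? (Fewer mismatches are only easier to detect.)
    """
    for mismatches in combinations(range(m), k):
        matches = [True] * m
        for p in mismatches:
            matches[p] = False
        if not any(seed_hits(seed, matches) for seed in family):
            return False  # this mismatch placement escapes every seed
    return True


if __name__ == "__main__":
    print(is_lossless(["##"], 3, 1))
    print(is_lossless(["#-#"], 3, 1))
    print(is_lossless(["##", "#-#"], 3, 1))
```

With these definitions, neither `##` nor `#-#` alone is lossless for m = 3, k = 1, but the two together are, which shows on a toy scale how a family can succeed where a single seed does not. The enumeration grows combinatorially with m and k, so this check is only practical for small parameters.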

Record in standard format (ISO 2709)

See the documentation on the Inist Standard format.

A01	`01`	`1`		`@0 0302-9743`
A05				`@2 3109`
A08	`01`	`1`	`ENG`	`@1 Multi-seed lossless filtration`
A09	`01`	`1`	`ENG`	`@1 CPM 2004 : combinatorial pattern matching : Istanbul, 5-7 July 2004`
A11	`01`	`1`		`@1 KUCHEROV (Gregory)`
A11	`02`	`1`		`@1 NOE (Laurent)`
A11	`03`	`1`		`@1 ROYTBERG (Mikhail)`
A12	`01`	`1`		`@1 SAHINALP (Suleyman Cenk) @9 ed.`
A12	`02`	`1`		`@1 MUTHUKRISHNAN (S.) @9 ed.`
A12	`03`	`1`		`@1 DOGRUSOZ (Ugur) @9 ed.`
A14	`01`			`@1 INRIA/LORIA, 615, rue du Jardin Botanique, B.P. 101 @2 54602, Villers-lès-Nancy @3 FRA @Z 1 aut. @Z 2 aut.`
A14	`02`			`@1 Institute of Mathematical Problems in Biology @2 Pushchino, Moscow Region, 142290 @3 RUS @Z 3 aut.`
A20				`@1 297-310`
A21				`@1 2004`
A23	`01`			`@0 ENG`
A26	`01`			`@0 3-540-22341-X`
A43	`01`			`@1 INIST @2 16343 @5 354000117912830220`
A44				`@0 0000 @1 © 2004 INIST-CNRS. All rights reserved.`
A45				`@0 21 ref.`
A47	`01`	`1`		`@0 04-0412576`
A60				`@1 P @2 C`
A61				`@0 A`
A64	`01`	`1`		`@0 Lecture notes in computer science`
A66	`01`			`@0 DEU`
C01	`01`		`ENG`	`@0 We study a method of seed-based lossless filtration for approximate string matching and related applications. The method is based on a simultaneous use of several spaced seeds rather than a single seed as studied by Burkhardt and Karkkainen [1]. We present algorithms to compute several important parameters of seed families, study their combinatorial properties, and describe several techniques to construct efficient families. We also report a large-scale application of the proposed technique to the problem of oligonucleotide selection for an EST sequence database.`
C02	`01`	`X`		`@0 001D02A05`
C03	`01`	`X`	`FRE`	`@0 Problème combinatoire @5 01`
C03	`01`	`X`	`ENG`	`@0 Combinatorial problem @5 01`
C03	`01`	`X`	`SPA`	`@0 Problema combinatorio @5 01`
C03	`02`	`X`	`FRE`	`@0 Concordance forme @5 06`
C03	`02`	`X`	`ENG`	`@0 Pattern matching @5 06`
C03	`03`	`X`	`FRE`	`@0 Echelle grande @5 07`
C03	`03`	`X`	`ENG`	`@0 Large scale @5 07`
C03	`03`	`X`	`SPA`	`@0 Escala grande @5 07`
C03	`04`	`X`	`FRE`	`@0 Base donnée @5 08`
C03	`04`	`X`	`ENG`	`@0 Database @5 08`
C03	`04`	`X`	`SPA`	`@0 Base dato @5 08`
C03	`05`	`X`	`FRE`	`@0 Semence @5 18`
C03	`05`	`X`	`ENG`	`@0 Seed @5 18`
C03	`05`	`X`	`SPA`	`@0 Semilla @5 18`
C03	`06`	`X`	`FRE`	`@0 Filtration @5 19`
C03	`06`	`X`	`ENG`	`@0 Filtration @5 19`
C03	`06`	`X`	`SPA`	`@0 Filtración @5 19`
C03	`07`	`3`	`FRE`	`@0 Appariement chaîne @5 20`
C03	`07`	`3`	`ENG`	`@0 String matching @5 20`
C03	`08`	`X`	`FRE`	`@0 Oligonucléotide @5 21`
C03	`08`	`X`	`ENG`	`@0 Oligonucleotide @5 21`
C03	`08`	`X`	`SPA`	`@0 Oligonucleótido @5 21`
C03	`09`	`X`	`FRE`	`@0 Problème sélection @5 23`
C03	`09`	`X`	`ENG`	`@0 Selection problem @5 23`
C03	`09`	`X`	`SPA`	`@0 Problema selección @5 23`
N21				`@1 236`
N44	`01`			`@1 OTO`
N82				`@1 OTO`

A30	`01`	`1`	`ENG`	`@1 Combinatorial pattern matching. Annual symposium @2 15 @3 Istanbul TUR @4 2004-07-05`

Inist format (server)

NO :	PASCAL 04-0412576 INIST
ET :	Multi-seed lossless filtration
AU :	KUCHEROV (Gregory); NOE (Laurent); ROYTBERG (Mikhail); SAHINALP (Suleyman Cenk); MUTHUKRISHNAN (S.); DOGRUSOZ (Ugur)
AF :	INRIA/LORIA, 615, rue du Jardin Botanique, B.P. 101/54602, Villers-lès-Nancy/France (1 aut., 2 aut.); Institute of Mathematical Problems in Biology/Pushchino, Moscow Region, 142290/Russie (3 aut.)
DT :	Publication en série; Congrès; Niveau analytique
SO :	Lecture notes in computer science; ISSN 0302-9743; Allemagne; Da. 2004; Vol. 3109; Pp. 297-310; Bibl. 21 ref.
LA :	Anglais
EA :	We study a method of seed-based lossless filtration for approximate string matching and related applications. The method is based on a simultaneous use of several spaced seeds rather than a single seed as studied by Burkhardt and Karkkainen [1]. We present algorithms to compute several important parameters of seed families, study their combinatorial properties, and describe several techniques to construct efficient families. We also report a large-scale application of the proposed technique to the problem of oligonucleotide selection for an EST sequence database.
CC :	001D02A05
FD :	Problème combinatoire; Concordance forme; Echelle grande; Base donnée; Semence; Filtration; Appariement chaîne; Oligonucléotide; Problème sélection
ED :	Combinatorial problem; Pattern matching; Large scale; Database; Seed; Filtration; String matching; Oligonucleotide; Selection problem
SD :	Problema combinatorio; Escala grande; Base dato; Semilla; Filtración; Oligonucleótido; Problema selección
LO :	INIST-16343.354000117912830220
ID :	04-0412576

Links to Exploration step

Pascal:04-0412576

The document in XML format

<record><TEI><teiHeader><fileDesc><titleStmt><title xml:lang="en" level="a">Multi-seed lossless filtration</title>
<author><name sortKey="Kucherov, Gregory" sort="Kucherov, Gregory" uniqKey="Kucherov G" first="Gregory" last="Kucherov">Gregory Kucherov</name>
<affiliation><inist:fA14 i1="01"><s1>INRIA/LORIA, 615, rue du Jardin Botanique, B.P. 101</s1>
<s2>54602, Villers-lès-Nancy</s2>
<s3>FRA</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
</inist:fA14>
</affiliation>
</author>
<author><name sortKey="Noe, Laurent" sort="Noe, Laurent" uniqKey="Noe L" first="Laurent" last="Noe">Laurent Noe</name>
<affiliation><inist:fA14 i1="01"><s1>INRIA/LORIA, 615, rue du Jardin Botanique, B.P. 101</s1>
<s2>54602, Villers-lès-Nancy</s2>
<s3>FRA</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
</inist:fA14>
</affiliation>
</author>
<author><name sortKey="Roytberg, Mikhail" sort="Roytberg, Mikhail" uniqKey="Roytberg M" first="Mikhail" last="Roytberg">Mikhail Roytberg</name>
<affiliation><inist:fA14 i1="02"><s1>Institute of Mathematical Problems in Biology</s1>
<s2>Pushchino, Moscow Region, 142290</s2>
<s3>RUS</s3>
<sZ>3 aut.</sZ>
</inist:fA14>
</affiliation>
</author>
</titleStmt>
<publicationStmt><idno type="wicri:source">INIST</idno>
<idno type="inist">04-0412576</idno>
<date when="2004">2004</date>
<idno type="stanalyst">PASCAL 04-0412576 INIST</idno>
<idno type="RBID">Pascal:04-0412576</idno>
<idno type="wicri:Area/PascalFrancis/Corpus">000650</idno>
</publicationStmt>
<sourceDesc><biblStruct><analytic><title xml:lang="en" level="a">Multi-seed lossless filtration</title>
<author><name sortKey="Kucherov, Gregory" sort="Kucherov, Gregory" uniqKey="Kucherov G" first="Gregory" last="Kucherov">Gregory Kucherov</name>
<affiliation><inist:fA14 i1="01"><s1>INRIA/LORIA, 615, rue du Jardin Botanique, B.P. 101</s1>
<s2>54602, Villers-lès-Nancy</s2>
<s3>FRA</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
</inist:fA14>
</affiliation>
</author>
<author><name sortKey="Noe, Laurent" sort="Noe, Laurent" uniqKey="Noe L" first="Laurent" last="Noe">Laurent Noe</name>
<affiliation><inist:fA14 i1="01"><s1>INRIA/LORIA, 615, rue du Jardin Botanique, B.P. 101</s1>
<s2>54602, Villers-lès-Nancy</s2>
<s3>FRA</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
</inist:fA14>
</affiliation>
</author>
<author><name sortKey="Roytberg, Mikhail" sort="Roytberg, Mikhail" uniqKey="Roytberg M" first="Mikhail" last="Roytberg">Mikhail Roytberg</name>
<affiliation><inist:fA14 i1="02"><s1>Institute of Mathematical Problems in Biology</s1>
<s2>Pushchino, Moscow Region, 142290</s2>
<s3>RUS</s3>
<sZ>3 aut.</sZ>
</inist:fA14>
</affiliation>
</author>
</analytic>
<series><title level="j" type="main">Lecture notes in computer science</title>
<idno type="ISSN">0302-9743</idno>
<imprint><date when="2004">2004</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
<seriesStmt><title level="j" type="main">Lecture notes in computer science</title>
<idno type="ISSN">0302-9743</idno>
</seriesStmt>
</fileDesc>
<profileDesc><textClass><keywords scheme="KwdEn" xml:lang="en"><term>Combinatorial problem</term>
<term>Database</term>
<term>Filtration</term>
<term>Large scale</term>
<term>Oligonucleotide</term>
<term>Pattern matching</term>
<term>Seed</term>
<term>Selection problem</term>
<term>String matching</term>
</keywords>
<keywords scheme="Pascal" xml:lang="fr"><term>Problème combinatoire</term>
<term>Concordance forme</term>
<term>Echelle grande</term>
<term>Base donnée</term>
<term>Semence</term>
<term>Filtration</term>
<term>Appariement chaîne</term>
<term>Oligonucléotide</term>
<term>Problème sélection</term>
</keywords>
</textClass>
</profileDesc>
</teiHeader>
<front><div type="abstract" xml:lang="en">We study a method of seed-based lossless filtration for approximate string matching and related applications. The method is based on a simultaneous use of several spaced seeds rather than a single seed as studied by Burkhardt and Karkkainen [1]. We present algorithms to compute several important parameters of seed families, study their combinatorial properties, and describe several techniques to construct efficient families. We also report a large-scale application of the proposed technique to the problem of oligonucleotide selection for an EST sequence database.</div>
</front>
</TEI>
<inist><standard h6="B"><pA><fA01 i1="01" i2="1"><s0>0302-9743</s0>
</fA01>
<fA05><s2>3109</s2>
</fA05>
<fA08 i1="01" i2="1" l="ENG"><s1>Multi-seed lossless filtration</s1>
</fA08>
<fA09 i1="01" i2="1" l="ENG"><s1>CPM 2004 : combinatorial pattern matching : Istanbul, 5-7 July 2004</s1>
</fA09>
<fA11 i1="01" i2="1"><s1>KUCHEROV (Gregory)</s1>
</fA11>
<fA11 i1="02" i2="1"><s1>NOE (Laurent)</s1>
</fA11>
<fA11 i1="03" i2="1"><s1>ROYTBERG (Mikhail)</s1>
</fA11>
<fA12 i1="01" i2="1"><s1>SAHINALP (Suleyman Cenk)</s1>
<s9>ed.</s9>
</fA12>
<fA12 i1="02" i2="1"><s1>MUTHUKRISHNAN (S.)</s1>
<s9>ed.</s9>
</fA12>
<fA12 i1="03" i2="1"><s1>DOGRUSOZ (Ugur)</s1>
<s9>ed.</s9>
</fA12>
<fA14 i1="01"><s1>INRIA/LORIA, 615, rue du Jardin Botanique, B.P. 101</s1>
<s2>54602, Villers-lès-Nancy</s2>
<s3>FRA</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
</fA14>
<fA14 i1="02"><s1>Institute of Mathematical Problems in Biology</s1>
<s2>Pushchino, Moscow Region, 142290</s2>
<s3>RUS</s3>
<sZ>3 aut.</sZ>
</fA14>
<fA20><s1>297-310</s1>
</fA20>
<fA21><s1>2004</s1>
</fA21>
<fA23 i1="01"><s0>ENG</s0>
</fA23>
<fA26 i1="01"><s0>3-540-22341-X</s0>
</fA26>
<fA43 i1="01"><s1>INIST</s1>
<s2>16343</s2>
<s5>354000117912830220</s5>
</fA43>
<fA44><s0>0000</s0>
<s1>© 2004 INIST-CNRS. All rights reserved.</s1>
</fA44>
<fA45><s0>21 ref.</s0>
</fA45>
<fA47 i1="01" i2="1"><s0>04-0412576</s0>
</fA47>
<fA60><s1>P</s1>
<s2>C</s2>
</fA60>
<fA61><s0>A</s0>
</fA61>
<fA64 i1="01" i2="1"><s0>Lecture notes in computer science</s0>
</fA64>
<fA66 i1="01"><s0>DEU</s0>
</fA66>
<fC01 i1="01" l="ENG"><s0>We study a method of seed-based lossless filtration for approximate string matching and related applications. The method is based on a simultaneous use of several spaced seeds rather than a single seed as studied by Burkhardt and Karkkainen [1]. We present algorithms to compute several important parameters of seed families, study their combinatorial properties, and describe several techniques to construct efficient families. We also report a large-scale application of the proposed technique to the problem of oligonucleotide selection for an EST sequence database.</s0>
</fC01>
<fC02 i1="01" i2="X"><s0>001D02A05</s0>
</fC02>
<fC03 i1="01" i2="X" l="FRE"><s0>Problème combinatoire</s0>
<s5>01</s5>
</fC03>
<fC03 i1="01" i2="X" l="ENG"><s0>Combinatorial problem</s0>
<s5>01</s5>
</fC03>
<fC03 i1="01" i2="X" l="SPA"><s0>Problema combinatorio</s0>
<s5>01</s5>
</fC03>
<fC03 i1="02" i2="X" l="FRE"><s0>Concordance forme</s0>
<s5>06</s5>
</fC03>
<fC03 i1="02" i2="X" l="ENG"><s0>Pattern matching</s0>
<s5>06</s5>
</fC03>
<fC03 i1="03" i2="X" l="FRE"><s0>Echelle grande</s0>
<s5>07</s5>
</fC03>
<fC03 i1="03" i2="X" l="ENG"><s0>Large scale</s0>
<s5>07</s5>
</fC03>
<fC03 i1="03" i2="X" l="SPA"><s0>Escala grande</s0>
<s5>07</s5>
</fC03>
<fC03 i1="04" i2="X" l="FRE"><s0>Base donnée</s0>
<s5>08</s5>
</fC03>
<fC03 i1="04" i2="X" l="ENG"><s0>Database</s0>
<s5>08</s5>
</fC03>
<fC03 i1="04" i2="X" l="SPA"><s0>Base dato</s0>
<s5>08</s5>
</fC03>
<fC03 i1="05" i2="X" l="FRE"><s0>Semence</s0>
<s5>18</s5>
</fC03>
<fC03 i1="05" i2="X" l="ENG"><s0>Seed</s0>
<s5>18</s5>
</fC03>
<fC03 i1="05" i2="X" l="SPA"><s0>Semilla</s0>
<s5>18</s5>
</fC03>
<fC03 i1="06" i2="X" l="FRE"><s0>Filtration</s0>
<s5>19</s5>
</fC03>
<fC03 i1="06" i2="X" l="ENG"><s0>Filtration</s0>
<s5>19</s5>
</fC03>
<fC03 i1="06" i2="X" l="SPA"><s0>Filtración</s0>
<s5>19</s5>
</fC03>
<fC03 i1="07" i2="3" l="FRE"><s0>Appariement chaîne</s0>
<s5>20</s5>
</fC03>
<fC03 i1="07" i2="3" l="ENG"><s0>String matching</s0>
<s5>20</s5>
</fC03>
<fC03 i1="08" i2="X" l="FRE"><s0>Oligonucléotide</s0>
<s5>21</s5>
</fC03>
<fC03 i1="08" i2="X" l="ENG"><s0>Oligonucleotide</s0>
<s5>21</s5>
</fC03>
<fC03 i1="08" i2="X" l="SPA"><s0>Oligonucleótido</s0>
<s5>21</s5>
</fC03>
<fC03 i1="09" i2="X" l="FRE"><s0>Problème sélection</s0>
<s5>23</s5>
</fC03>
<fC03 i1="09" i2="X" l="ENG"><s0>Selection problem</s0>
<s5>23</s5>
</fC03>
<fC03 i1="09" i2="X" l="SPA"><s0>Problema selección</s0>
<s5>23</s5>
</fC03>
<fN21><s1>236</s1>
</fN21>
<fN44 i1="01"><s1>OTO</s1>
</fN44>
<fN82><s1>OTO</s1>
</fN82>
</pA>
<pR><fA30 i1="01" i2="1" l="ENG"><s1>Combinatorial pattern matching. Annual symposium</s1>
<s2>15</s2>
<s3>Istanbul TUR</s3>
<s4>2004-07-05</s4>
</fA30>
</pR>
</standard>
<server><NO>PASCAL 04-0412576 INIST</NO>
<ET>Multi-seed lossless filtration</ET>
<AU>KUCHEROV (Gregory); NOE (Laurent); ROYTBERG (Mikhail); SAHINALP (Suleyman Cenk); MUTHUKRISHNAN (S.); DOGRUSOZ (Ugur)</AU>
<AF>INRIA/LORIA, 615, rue du Jardin Botanique, B.P. 101/54602, Villers-lès-Nancy/France (1 aut., 2 aut.); Institute of Mathematical Problems in Biology/Pushchino, Moscow Region, 142290/Russie (3 aut.)</AF>
<DT>Publication en série; Congrès; Niveau analytique</DT>
<SO>Lecture notes in computer science; ISSN 0302-9743; Allemagne; Da. 2004; Vol. 3109; Pp. 297-310; Bibl. 21 ref.</SO>
<LA>Anglais</LA>
<EA>We study a method of seed-based lossless filtration for approximate string matching and related applications. The method is based on a simultaneous use of several spaced seeds rather than a single seed as studied by Burkhardt and Karkkainen [1]. We present algorithms to compute several important parameters of seed families, study their combinatorial properties, and describe several techniques to construct efficient families. We also report a large-scale application of the proposed technique to the problem of oligonucleotide selection for an EST sequence database.</EA>
<CC>001D02A05</CC>
<FD>Problème combinatoire; Concordance forme; Echelle grande; Base donnée; Semence; Filtration; Appariement chaîne; Oligonucléotide; Problème sélection</FD>
<ED>Combinatorial problem; Pattern matching; Large scale; Database; Seed; Filtration; String matching; Oligonucleotide; Selection problem</ED>
<SD>Problema combinatorio; Escala grande; Base dato; Semilla; Filtración; Oligonucleótido; Problema selección</SD>
<LO>INIST-16343.354000117912830220</LO>
<ID>04-0412576</ID>
</server>
</inist>
</record>

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Wicri/Lorraine/explor/InforLorV4/Data/PascalFrancis/Corpus

HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000650 | SxmlIndent | more

HfdSelect -h $EXPLOR_AREA/Data/PascalFrancis/Corpus/biblio.hfd -nk 000650 | SxmlIndent | more

To add a link to this page in the Wicri network

{{Explor lien
   |wiki=    Wicri/Lorraine
   |area=    InforLorV4
   |flux=    PascalFrancis
   |étape=   Corpus
   |type=    RBID
   |clé=     Pascal:04-0412576
   |texte=   Multi-seed lossless filtration
}}

This area was generated with Dilib version V0.6.33.
Data generation: Mon Jun 10 21:56:28 2019. Site generation: Fri Feb 25 15:29:27 2022

	Exploration server for computer science research in Lorraine
	Warning: this site is under development. It was generated automatically from raw corpora, so the information has not been validated.
