Serveur d'exploration MERS

Attention, ce site est en cours de développement !
Attention, site généré par des moyens informatiques à partir de corpus bruts.
Les informations ne sont donc pas validées.

Procrastination Leads to Efficient Filtration for Local Multiple Alignment

Identifieur interne : 000B96 ( Istex/Corpus ); précédent : 000B95; suivant : 000B97

Procrastination Leads to Efficient Filtration for Local Multiple Alignment

Auteurs : Aaron E. Darling ; Todd J. Treangen ; Louxin Zhang ; Carla Kuiken ; Xavier Messeguer ; Nicole T. Perna

Source :

RBID : ISTEX:542B9BC57D7447DF7EFDFE6A4AB552289FE92D7E

Abstract

Abstract: We describe an efficient local multiple alignment filtration heuristic for identification of conserved regions in one or more DNA sequences. The method incorporates several novel ideas: (1) palindromic spaced seed patterns to match both DNA strands simultaneously, (2) seed extension (chaining) in order of decreasing multiplicity, and (3) procrastination when low multiplicity matches are encountered. The resulting local multiple alignments may have nucleotide substitutions and internal gaps as large as w characters in any occurrence of the motif. The algorithm consumes $\mathcal{O}(wN)$ memory and $\mathcal{O}(wN \log wN)$ time where N is the sequence length. We score the significance of multiple alignments using entropy-based motif scoring methods. We demonstrate the performance of our filtration method on Alu-repeat rich segments of the human genome and a large set of Hepatitis C virus genomes. The GPL implementation of our algorithm in C++ is called procrastAligner and is freely available from http://gel.ahabs.wisc.edu/procrastination

Url:
DOI: 10.1007/11851561_12

Links to Exploration step

ISTEX:542B9BC57D7447DF7EFDFE6A4AB552289FE92D7E

Le document en format XML

<record>
<TEI wicri:istexFullTextTei="biblStruct">
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en">Procrastination Leads to Efficient Filtration for Local Multiple Alignment</title>
<author>
<name sortKey="Darling, Aaron E" sort="Darling, Aaron E" uniqKey="Darling A" first="Aaron E." last="Darling">Aaron E. Darling</name>
<affiliation>
<mods:affiliation>Department of Computer Science, University of Wisconsin, USA</mods:affiliation>
</affiliation>
<affiliation>
<mods:affiliation>E-mail: darling@cs.wisc.edu</mods:affiliation>
</affiliation>
</author>
<author>
<name sortKey="Treangen, Todd J" sort="Treangen, Todd J" uniqKey="Treangen T" first="Todd J." last="Treangen">Todd J. Treangen</name>
<affiliation>
<mods:affiliation>Department of Computer Science, Technical University of Catalonia, Barcelona, Spain</mods:affiliation>
</affiliation>
<affiliation>
<mods:affiliation>E-mail: treangen@lsi.upc.edu</mods:affiliation>
</affiliation>
</author>
<author>
<name sortKey="Zhang, Louxin" sort="Zhang, Louxin" uniqKey="Zhang L" first="Louxin" last="Zhang">Louxin Zhang</name>
<affiliation>
<mods:affiliation>Department of Mathematics, National University of Singapore, Singapore</mods:affiliation>
</affiliation>
</author>
<author>
<name sortKey="Kuiken, Carla" sort="Kuiken, Carla" uniqKey="Kuiken C" first="Carla" last="Kuiken">Carla Kuiken</name>
<affiliation>
<mods:affiliation>T-10 Theoretical Biology Division, Los Alamos National Laboratory, USA</mods:affiliation>
</affiliation>
</author>
<author>
<name sortKey="Messeguer, Xavier" sort="Messeguer, Xavier" uniqKey="Messeguer X" first="Xavier" last="Messeguer">Xavier Messeguer</name>
<affiliation>
<mods:affiliation>Department of Computer Science, Technical University of Catalonia, Barcelona, Spain</mods:affiliation>
</affiliation>
</author>
<author>
<name sortKey="Perna, Nicole T" sort="Perna, Nicole T" uniqKey="Perna N" first="Nicole T." last="Perna">Nicole T. Perna</name>
<affiliation>
<mods:affiliation>Department of Animal Health and Biomedical Sciences, Genome Center, University of Wisconsin, USA</mods:affiliation>
</affiliation>
</author>
</titleStmt>
<publicationStmt>
<idno type="wicri:source">ISTEX</idno>
<idno type="RBID">ISTEX:542B9BC57D7447DF7EFDFE6A4AB552289FE92D7E</idno>
<date when="2006" year="2006">2006</date>
<idno type="doi">10.1007/11851561_12</idno>
<idno type="url">https://api.istex.fr/ark:/67375/HCB-ZC5KHRTV-K/fulltext.pdf</idno>
<idno type="wicri:Area/Istex/Corpus">000B96</idno>
<idno type="wicri:explorRef" wicri:stream="Istex" wicri:step="Corpus" wicri:corpus="ISTEX">000B96</idno>
</publicationStmt>
<sourceDesc>
<biblStruct>
<analytic>
<title level="a" type="main" xml:lang="en">Procrastination Leads to Efficient Filtration for Local Multiple Alignment</title>
<author>
<name sortKey="Darling, Aaron E" sort="Darling, Aaron E" uniqKey="Darling A" first="Aaron E." last="Darling">Aaron E. Darling</name>
<affiliation>
<mods:affiliation>Department of Computer Science, University of Wisconsin, USA</mods:affiliation>
</affiliation>
<affiliation>
<mods:affiliation>E-mail: darling@cs.wisc.edu</mods:affiliation>
</affiliation>
</author>
<author>
<name sortKey="Treangen, Todd J" sort="Treangen, Todd J" uniqKey="Treangen T" first="Todd J." last="Treangen">Todd J. Treangen</name>
<affiliation>
<mods:affiliation>Department of Computer Science, Technical University of Catalonia, Barcelona, Spain</mods:affiliation>
</affiliation>
<affiliation>
<mods:affiliation>E-mail: treangen@lsi.upc.edu</mods:affiliation>
</affiliation>
</author>
<author>
<name sortKey="Zhang, Louxin" sort="Zhang, Louxin" uniqKey="Zhang L" first="Louxin" last="Zhang">Louxin Zhang</name>
<affiliation>
<mods:affiliation>Department of Mathematics, National University of Singapore, Singapore</mods:affiliation>
</affiliation>
</author>
<author>
<name sortKey="Kuiken, Carla" sort="Kuiken, Carla" uniqKey="Kuiken C" first="Carla" last="Kuiken">Carla Kuiken</name>
<affiliation>
<mods:affiliation>T-10 Theoretical Biology Division, Los Alamos National Laboratory, USA</mods:affiliation>
</affiliation>
</author>
<author>
<name sortKey="Messeguer, Xavier" sort="Messeguer, Xavier" uniqKey="Messeguer X" first="Xavier" last="Messeguer">Xavier Messeguer</name>
<affiliation>
<mods:affiliation>Department of Computer Science, Technical University of Catalonia, Barcelona, Spain</mods:affiliation>
</affiliation>
</author>
<author>
<name sortKey="Perna, Nicole T" sort="Perna, Nicole T" uniqKey="Perna N" first="Nicole T." last="Perna">Nicole T. Perna</name>
<affiliation>
<mods:affiliation>Department of Animal Health and Biomedical Sciences, Genome Center, University of Wisconsin, USA</mods:affiliation>
</affiliation>
</author>
</analytic>
<monogr></monogr>
<series>
<title level="s" type="main" xml:lang="en">Lecture Notes in Computer Science</title>
<idno type="ISSN">0302-9743</idno>
<idno type="eISSN">1611-3349</idno>
<idno type="ISSN">0302-9743</idno>
</series>
</biblStruct>
</sourceDesc>
<seriesStmt>
<idno type="ISSN">0302-9743</idno>
</seriesStmt>
</fileDesc>
<profileDesc>
<textClass></textClass>
</profileDesc>
</teiHeader>
<front>
<div type="abstract" xml:lang="en">Abstract: We describe an efficient local multiple alignment filtration heuristic for identification of conserved regions in one or more DNA sequences. The method incorporates several novel ideas: (1) palindromic spaced seed patterns to match both DNA strands simultaneously, (2) seed extension (chaining) in order of decreasing multiplicity, and (3) procrastination when low multiplicity matches are encountered. The resulting local multiple alignments may have nucleotide substitutions and internal gaps as large as w characters in any occurrence of the motif. The algorithm consumes $\mathcal{O}(wN)$ memory and $\mathcal{O}(wN \log wN)$ time where N is the sequence length. We score the significance of multiple alignments using entropy-based motif scoring methods. We demonstrate the performance of our filtration method on Alu-repeat rich segments of the human genome and a large set of Hepatitis C virus genomes. The GPL implementation of our algorithm in C++ is called procrastAligner and is freely available from http://gel.ahabs.wisc.edu/procrastination</div>
</front>
</TEI>
<istex>
<corpusName>springer-ebooks</corpusName>
<author>
<json:item>
<name>Aaron E. Darling</name>
<affiliations>
<json:string>Department of Computer Science, University of Wisconsin, USA</json:string>
<json:string>E-mail: darling@cs.wisc.edu</json:string>
</affiliations>
</json:item>
<json:item>
<name>Todd J. Treangen</name>
<affiliations>
<json:string>Department of Computer Science, Technical University of Catalonia, Barcelona, Spain</json:string>
<json:string>E-mail: treangen@lsi.upc.edu</json:string>
</affiliations>
</json:item>
<json:item>
<name>Louxin Zhang</name>
<affiliations>
<json:string>Department of Mathematics, National University of Singapore, Singapore</json:string>
</affiliations>
</json:item>
<json:item>
<name>Carla Kuiken</name>
<affiliations>
<json:string>T-10 Theoretical Biology Division, Los Alamos National Laboratory, USA</json:string>
</affiliations>
</json:item>
<json:item>
<name>Xavier Messeguer</name>
<affiliations>
<json:string>Department of Computer Science, Technical University of Catalonia, Barcelona, Spain</json:string>
</affiliations>
</json:item>
<json:item>
<name>Nicole T. Perna</name>
<affiliations>
<json:string>Department of Animal Health and Biomedical Sciences, Genome Center, University of Wisconsin, USA</json:string>
</affiliations>
</json:item>
</author>
<arkIstex>ark:/67375/HCB-ZC5KHRTV-K</arkIstex>
<language>
<json:string>eng</json:string>
</language>
<originalGenre>
<json:string>OriginalPaper</json:string>
</originalGenre>
<abstract>Abstract: We describe an efficient local multiple alignment filtration heuristic for identification of conserved regions in one or more DNA sequences. The method incorporates several novel ideas: (1) palindromic spaced seed patterns to match both DNA strands simultaneously, (2) seed extension (chaining) in order of decreasing multiplicity, and (3) procrastination when low multiplicity matches are encountered. The resulting local multiple alignments may have nucleotide substitutions and internal gaps as large as w characters in any occurrence of the motif. The algorithm consumes $\mathcal{O}(wN)$ memory and $\mathcal{O}(wN \log wN)$ time where N is the sequence length. We score the significance of multiple alignments using entropy-based motif scoring methods. We demonstrate the performance of our filtration method on Alu-repeat rich segments of the human genome and a large set of Hepatitis C virus genomes. The GPL implementation of our algorithm in C++ is called procrastAligner and is freely available from http://gel.ahabs.wisc.edu/procrastination</abstract>
<qualityIndicators>
<score>8.788</score>
<pdfWordCount>5161</pdfWordCount>
<pdfCharCount>27875</pdfCharCount>
<pdfVersion>1.3</pdfVersion>
<pdfPageCount>12</pdfPageCount>
<pdfPageSize>430 x 660 pts</pdfPageSize>
<refBibsNative>false</refBibsNative>
<abstractWordCount>149</abstractWordCount>
<abstractCharCount>1062</abstractCharCount>
<keywordCount>0</keywordCount>
</qualityIndicators>
<title>Procrastination Leads to Efficient Filtration for Local Multiple Alignment</title>
<chapterId>
<json:string>12</json:string>
<json:string>Chap12</json:string>
</chapterId>
<genre>
<json:string>conference</json:string>
</genre>
<serie>
<title>Lecture Notes in Computer Science</title>
<language>
<json:string>unknown</json:string>
</language>
<copyrightDate>2006</copyrightDate>
<issn>
<json:string>0302-9743</json:string>
</issn>
<eissn>
<json:string>1611-3349</json:string>
</eissn>
<editor>
<json:item>
<name>David Hutchison</name>
<affiliations>
<json:string>Lancaster University, UK</json:string>
</affiliations>
</json:item>
<json:item>
<name>Takeo Kanade</name>
<affiliations>
<json:string>Carnegie Mellon University, Pittsburgh, PA, USA</json:string>
</affiliations>
</json:item>
<json:item>
<name>Josef Kittler</name>
<affiliations>
<json:string>University of Surrey, Guildford, UK</json:string>
</affiliations>
</json:item>
<json:item>
<name>Jon M. Kleinberg</name>
<affiliations>
<json:string>Cornell University, Ithaca, NY, USA</json:string>
</affiliations>
</json:item>
<json:item>
<name>Friedemann Mattern</name>
<affiliations>
<json:string>ETH Zurich, Switzerland</json:string>
</affiliations>
</json:item>
<json:item>
<name>John C. Mitchell</name>
<affiliations>
<json:string>Stanford University, CA, USA</json:string>
</affiliations>
</json:item>
<json:item>
<name>Moni Naor</name>
<affiliations>
<json:string>Weizmann Institute of Science, Rehovot, Israel</json:string>
</affiliations>
</json:item>
<json:item>
<name>Oscar Nierstrasz</name>
<affiliations>
<json:string>University of Bern, Switzerland</json:string>
</affiliations>
</json:item>
<json:item>
<name>C. Pandu Rangan</name>
<affiliations>
<json:string>Indian Institute of Technology, Madras, India</json:string>
</affiliations>
</json:item>
<json:item>
<name>Bernhard Steffen</name>
<affiliations>
<json:string>University of Dortmund, Germany</json:string>
</affiliations>
</json:item>
<json:item>
<name>Madhu Sudan</name>
<affiliations>
<json:string>Massachusetts Institute of Technology, MA, USA</json:string>
</affiliations>
</json:item>
<json:item>
<name>Demetri Terzopoulos</name>
<affiliations>
<json:string>University of California, Los Angeles, CA, USA</json:string>
</affiliations>
</json:item>
<json:item>
<name>Dough Tygar</name>
<affiliations>
<json:string>University of California, Berkeley, CA, USA</json:string>
</affiliations>
</json:item>
<json:item>
<name>Moshe Y. Vardi</name>
<affiliations>
<json:string>Rice University, Houston, TX, USA</json:string>
</affiliations>
</json:item>
<json:item>
<name>Gerhard Weikum</name>
<affiliations>
<json:string>Max-Planck Institute of Computer Science, Saarbruecken, Germany</json:string>
</affiliations>
</json:item>
</editor>
</serie>
<host>
<title>Algorithms in Bioinformatics</title>
<language>
<json:string>unknown</json:string>
</language>
<copyrightDate>2006</copyrightDate>
<doi>
<json:string>10.1007/11851561</json:string>
</doi>
<issn>
<json:string>0302-9743</json:string>
</issn>
<eissn>
<json:string>1611-3349</json:string>
</eissn>
<eisbn>
<json:string>978-3-540-39584-3</json:string>
</eisbn>
<bookId>
<json:string>978-3-540-39584-3</json:string>
</bookId>
<isbn>
<json:string>978-3-540-39583-6</json:string>
</isbn>
<volume>4175</volume>
<pages>
<first>126</first>
<last>137</last>
</pages>
<genre>
<json:string>book-series</json:string>
</genre>
<editor>
<json:item>
<name>Philipp Bücher</name>
<affiliations>
<json:string>Ecole Polytechnique Fédérale de Lausanne, Switzerland</json:string>
<json:string>E-mail: Philipp.Bucher@isrec.unil.ch</json:string>
</affiliations>
</json:item>
<json:item>
<name>Bernard M. E. Moret</name>
<affiliations>
<json:string>Laboratory for Computational Biology and Bioinformatics, EPFL (Ecole Polytechnique Fédérale de Lausanne), Swiss Institute of Bioinformatics, Lausanne, Switzerland</json:string>
<json:string>E-mail: bernard.moret@epfl.ch</json:string>
</affiliations>
</json:item>
</editor>
<subject>
<json:item>
<value>Computer Science</value>
</json:item>
<json:item>
<value>Computer Science</value>
</json:item>
<json:item>
<value>Algorithm Analysis and Problem Complexity</value>
</json:item>
<json:item>
<value>Computation by Abstract Devices</value>
</json:item>
<json:item>
<value>Data Structures</value>
</json:item>
<json:item>
<value>Discrete Mathematics in Computer Science</value>
</json:item>
<json:item>
<value>Probability and Statistics in Computer Science</value>
</json:item>
<json:item>
<value>Bioinformatics</value>
</json:item>
</subject>
</host>
<ark>
<json:string>ark:/67375/HCB-ZC5KHRTV-K</json:string>
</ark>
<publicationDate>2006</publicationDate>
<copyrightDate>2006</copyrightDate>
<doi>
<json:string>10.1007/11851561_12</json:string>
</doi>
<id>542B9BC57D7447DF7EFDFE6A4AB552289FE92D7E</id>
<score>1</score>
<fulltext>
<json:item>
<extension>pdf</extension>
<original>true</original>
<mimetype>application/pdf</mimetype>
<uri>https://api.istex.fr/ark:/67375/HCB-ZC5KHRTV-K/fulltext.pdf</uri>
</json:item>
<json:item>
<extension>zip</extension>
<original>false</original>
<mimetype>application/zip</mimetype>
<uri>https://api.istex.fr/ark:/67375/HCB-ZC5KHRTV-K/bundle.zip</uri>
</json:item>
<istex:fulltextTEI uri="https://api.istex.fr/ark:/67375/HCB-ZC5KHRTV-K/fulltext.tei">
<teiHeader>
<fileDesc>
<titleStmt>
<title level="a" type="main" xml:lang="en">Procrastination Leads to Efficient Filtration for Local Multiple Alignment</title>
</titleStmt>
<publicationStmt>
<authority>ISTEX</authority>
<availability>
<licence>Springer-Verlag Berlin Heidelberg</licence>
</availability>
<date when="2006">2006</date>
</publicationStmt>
<notesStmt>
<note type="conference" source="proceedings" scheme="https://content-type.data.istex.fr/ark:/67375/XTP-BFHXPBJJ-3">conference</note>
<note type="publication-type" subtype="book-series" scheme="https://publication-type.data.istex.fr/ark:/67375/JMC-0G6R5W5T-Z">book-series</note>
</notesStmt>
<sourceDesc>
<biblStruct>
<analytic>
<title level="a" type="main" xml:lang="en">Procrastination Leads to Efficient Filtration for Local Multiple Alignment</title>
<author>
<persName>
<forename type="first">Aaron</forename>
<forename type="first">E.</forename>
<surname>Darling</surname>
</persName>
<email>darling@cs.wisc.edu</email>
<affiliation>
<orgName type="department">Department of Computer Science</orgName>
<orgName type="institution">University of Wisconsin</orgName>
<address>
<country key="US">UNITED STATES</country>
</address>
</affiliation>
</author>
<author>
<persName>
<forename type="first">Todd</forename>
<forename type="first">J.</forename>
<surname>Treangen</surname>
</persName>
<email>treangen@lsi.upc.edu</email>
<affiliation>
<orgName type="department">Department of Computer Science</orgName>
<orgName type="institution">Technical University of Catalonia</orgName>
<address>
<settlement>Barcelona</settlement>
<country key="ES">SPAIN</country>
</address>
</affiliation>
</author>
<author>
<persName>
<forename type="first">Louxin</forename>
<surname>Zhang</surname>
</persName>
<affiliation>
<orgName type="department">Department of Mathematics</orgName>
<orgName type="institution">National University of Singapore</orgName>
<address>
<country key="SG">SINGAPORE</country>
</address>
</affiliation>
</author>
<author>
<persName>
<forename type="first">Carla</forename>
<surname>Kuiken</surname>
</persName>
<affiliation>
<orgName type="department">T-10 Theoretical Biology Division</orgName>
<orgName type="institution">Los Alamos National Laboratory</orgName>
<address>
<country key="US">UNITED STATES</country>
</address>
</affiliation>
</author>
<author>
<persName>
<forename type="first">Xavier</forename>
<surname>Messeguer</surname>
</persName>
<affiliation>
<orgName type="department">Department of Computer Science</orgName>
<orgName type="institution">Technical University of Catalonia</orgName>
<address>
<settlement>Barcelona</settlement>
<country key="ES">SPAIN</country>
</address>
</affiliation>
</author>
<author>
<persName>
<forename type="first">Nicole</forename>
<forename type="first">T.</forename>
<surname>Perna</surname>
</persName>
<affiliation>
<orgName type="department">Department of Animal Health and Biomedical Sciences, Genome Center</orgName>
<orgName type="institution">University of Wisconsin</orgName>
<address>
<country key="US">UNITED STATES</country>
</address>
</affiliation>
</author>
<idno type="istex">542B9BC57D7447DF7EFDFE6A4AB552289FE92D7E</idno>
<idno type="ark">ark:/67375/HCB-ZC5KHRTV-K</idno>
<idno type="DOI">10.1007/11851561_12</idno>
</analytic>
<monogr>
<title level="m" type="main">Algorithms in Bioinformatics</title>
<title level="m" type="sub">6th International Workshop, WABI 2006, Zurich, Switzerland, September 11-13, 2006. Proceedings</title>
<idno type="DOI">10.1007/11851561</idno>
<idno type="book-id">978-3-540-39584-3</idno>
<idno type="ISBN">978-3-540-39583-6</idno>
<idno type="eISBN">978-3-540-39584-3</idno>
<idno type="chapter-id">Chap12</idno>
<editor>
<persName>
<forename type="first">Philipp</forename>
<surname>Bücher</surname>
</persName>
<email>Philipp.Bucher@isrec.unil.ch</email>
<affiliation>
<orgName type="institution">Ecole Polytechnique Fédérale de Lausanne</orgName>
<address>
<country key="CH">SWITZERLAND</country>
</address>
</affiliation>
</editor>
<editor>
<persName>
<forename type="first">Bernard</forename>
<forename type="first">M.</forename>
<forename type="first">E.</forename>
<surname>Moret</surname>
</persName>
<email>bernard.moret@epfl.ch</email>
<affiliation>
<orgName type="department">Laboratory for Computational Biology and Bioinformatics</orgName>
<orgName type="institution">EPFL (Ecole Polytechnique Fédérale de Lausanne), Swiss Institute of Bioinformatics</orgName>
<address>
<settlement>Lausanne</settlement>
<country key="CH">SWITZERLAND</country>
</address>
</affiliation>
</editor>
<imprint>
<biblScope unit="vol">4175</biblScope>
<biblScope unit="page" from="126">126</biblScope>
<biblScope unit="page" to="137">137</biblScope>
<biblScope unit="chapter-count">36</biblScope>
</imprint>
</monogr>
<series>
<title level="s" type="main" xml:lang="en">Lecture Notes in Computer Science</title>
<editor>
<persName>
<forename type="first">David</forename>
<surname>Hutchison</surname>
</persName>
<affiliation>
<orgName type="institution">Lancaster University</orgName>
<address>
<country key="GB">UNITED KINGDOM</country>
</address>
</affiliation>
</editor>
<editor>
<persName>
<forename type="first">Takeo</forename>
<surname>Kanade</surname>
</persName>
<affiliation>
<orgName type="institution">Carnegie Mellon University</orgName>
<address>
<settlement>Pittsburgh</settlement>
<region>PA</region>
<country key="US">UNITED STATES</country>
</address>
</affiliation>
</editor>
<editor>
<persName>
<forename type="first">Josef</forename>
<surname>Kittler</surname>
</persName>
<affiliation>
<orgName type="institution">University of Surrey</orgName>
<address>
<settlement>Guildford</settlement>
<country key="GB">UNITED KINGDOM</country>
</address>
</affiliation>
</editor>
<editor>
<persName>
<forename type="first">Jon</forename>
<forename type="first">M.</forename>
<surname>Kleinberg</surname>
</persName>
<affiliation>
<orgName type="institution">Cornell University</orgName>
<address>
<settlement>Ithaca</settlement>
<region>NY</region>
<country key="US">UNITED STATES</country>
</address>
</affiliation>
</editor>
<editor>
<persName>
<forename type="first">Friedemann</forename>
<surname>Mattern</surname>
</persName>
<affiliation>
<orgName type="institution">ETH Zurich</orgName>
<address>
<country key="CH">SWITZERLAND</country>
</address>
</affiliation>
</editor>
<editor>
<persName>
<forename type="first">John</forename>
<forename type="first">C.</forename>
<surname>Mitchell</surname>
</persName>
<affiliation>
<orgName type="institution">Stanford University</orgName>
<address>
<settlement>CA</settlement>
<country key="US">UNITED STATES</country>
</address>
</affiliation>
</editor>
<editor>
<persName>
<forename type="first">Moni</forename>
<surname>Naor</surname>
</persName>
<affiliation>
<orgName type="institution">Weizmann Institute of Science</orgName>
<address>
<settlement>Rehovot</settlement>
<country key="IL">ISRAEL</country>
</address>
</affiliation>
</editor>
<editor>
<persName>
<forename type="first">Oscar</forename>
<surname>Nierstrasz</surname>
</persName>
<affiliation>
<orgName type="institution">University of Bern</orgName>
<address>
<country key="CH">SWITZERLAND</country>
</address>
</affiliation>
</editor>
<editor>
<persName>
<forename type="first">C.</forename>
<surname>Pandu Rangan</surname>
</persName>
<affiliation>
<orgName type="institution">Indian Institute of Technology</orgName>
<address>
<settlement>Madras</settlement>
<country key="IN">INDIA</country>
</address>
</affiliation>
</editor>
<editor>
<persName>
<forename type="first">Bernhard</forename>
<surname>Steffen</surname>
</persName>
<affiliation>
<orgName type="institution">University of Dortmund</orgName>
<address>
<country key="DE">GERMANY</country>
</address>
</affiliation>
</editor>
<editor>
<persName>
<forename type="first">Madhu</forename>
<surname>Sudan</surname>
</persName>
<affiliation>
<orgName type="institution">Massachusetts Institute of Technology</orgName>
<address>
<settlement>MA</settlement>
<country key="US">UNITED STATES</country>
</address>
</affiliation>
</editor>
<editor>
<persName>
<forename type="first">Demetri</forename>
<surname>Terzopoulos</surname>
</persName>
<affiliation>
<orgName type="institution">University of California</orgName>
<address>
<settlement>Los Angeles</settlement>
<region>CA</region>
<country key="US">UNITED STATES</country>
</address>
</affiliation>
</editor>
<editor>
<persName>
<forename type="first">Dough</forename>
<surname>Tygar</surname>
</persName>
<affiliation>
<orgName type="institution">University of California</orgName>
<address>
<settlement>Berkeley</settlement>
<region>CA</region>
<country key="US">UNITED STATES</country>
</address>
</affiliation>
</editor>
<editor>
<persName>
<forename type="first">Moshe</forename>
<forename type="first">Y.</forename>
<surname>Vardi</surname>
</persName>
<affiliation>
<orgName type="institution">Rice University</orgName>
<address>
<settlement>Houston</settlement>
<region>TX</region>
<country key="US">UNITED STATES</country>
</address>
</affiliation>
</editor>
<editor>
<persName>
<forename type="first">Gerhard</forename>
<surname>Weikum</surname>
</persName>
<affiliation>
<orgName type="institution">Max-Planck Institute of Computer Science</orgName>
<address>
<settlement>Saarbruecken</settlement>
<country key="DE">GERMANY</country>
</address>
</affiliation>
</editor>
<idno type="pISSN">0302-9743</idno>
<idno type="eISSN">1611-3349</idno>
<idno type="seriesID">558</idno>
</series>
</biblStruct>
</sourceDesc>
</fileDesc>
<profileDesc>
<abstract xml:lang="en">
<head>Abstract</head>
<p>We describe an efficient local multiple alignment filtration heuristic for identification of conserved regions in one or more DNA sequences. The method incorporates several novel ideas: (1) palindromic spaced seed patterns to match both DNA strands simultaneously, (2) seed extension (chaining) in order of decreasing multiplicity, and (3) procrastination when low multiplicity matches are encountered. The resulting local multiple alignments may have nucleotide substitutions and internal gaps as large as
<hi rend="italic">w</hi>
characters in any occurrence of the motif. The algorithm consumes
<formula xml:id="IEq1" notation="TEX">
<media mimeType="image" url=""></media>
$\mathcal{O}(wN)$ </formula>
memory and
<formula xml:id="IEq2" notation="TEX">
<media mimeType="image" url=""></media>
$\mathcal{O}(wN \log wN)$ </formula>
time where
<hi rend="italic">N</hi>
is the sequence length. We score the significance of multiple alignments using entropy-based motif scoring methods. We demonstrate the performance of our filtration method on Alu-repeat rich segments of the human genome and a large set of Hepatitis C virus genomes. The GPL implementation of our algorithm in C++ is called and is freely available from http://gel.ahabs.wisc.edu/procrastination</p>
</abstract>
<textClass ana="subject">
<keywords scheme="book-subject-collection">
<list>
<label>SUCO11645</label>
<item>
<term>Computer Science</term>
</item>
</list>
</keywords>
</textClass>
<textClass ana="subject">
<keywords scheme="book-subject">
<list>
<label>I</label>
<item>
<term type="Primary">Computer Science</term>
</item>
<label>I16021</label>
<item>
<term type="Secondary" subtype="priority-1">Algorithm Analysis and Problem Complexity</term>
</item>
<label>I16013</label>
<item>
<term type="Secondary" subtype="priority-2">Computation by Abstract Devices</term>
</item>
<label>I15017</label>
<item>
<term type="Secondary" subtype="priority-3">Data Structures</term>
</item>
<label>I17028</label>
<item>
<term type="Secondary" subtype="priority-4">Discrete Mathematics in Computer Science</term>
</item>
<label>I17036</label>
<item>
<term type="Secondary" subtype="priority-5">Probability and Statistics in Computer Science</term>
</item>
<label>L15001</label>
<item>
<term type="Secondary" subtype="priority-6">Bioinformatics</term>
</item>
</list>
</keywords>
</textClass>
<langUsage>
<language ident="EN"></language>
</langUsage>
</profileDesc>
</teiHeader>
</istex:fulltextTEI>
<json:item>
<extension>txt</extension>
<original>false</original>
<mimetype>text/plain</mimetype>
<uri>https://api.istex.fr/ark:/67375/HCB-ZC5KHRTV-K/fulltext.txt</uri>
</json:item>
</fulltext>
<metadata>
<istex:metadataXml wicri:clean="corpus springer-ebooks not found" wicri:toSee="no header">
<istex:xmlDeclaration>version="1.0" encoding="UTF-8"</istex:xmlDeclaration>
<istex:docType PUBLIC="-//Springer-Verlag//DTD A++ V2.4//EN" URI="http://devel.springer.de/A++/V2.4/DTD/A++V2.4.dtd" name="istex:docType"></istex:docType>
<istex:document>
<Publisher>
<PublisherInfo>
<PublisherName>Springer Berlin Heidelberg</PublisherName>
<PublisherLocation>Berlin, Heidelberg</PublisherLocation>
</PublisherInfo>
<Series>
<SeriesInfo SeriesType="Series" TocLevels="0">
<SeriesID>558</SeriesID>
<SeriesPrintISSN>0302-9743</SeriesPrintISSN>
<SeriesElectronicISSN>1611-3349</SeriesElectronicISSN>
<SeriesTitle Language="En">Lecture Notes in Computer Science</SeriesTitle>
</SeriesInfo>
<SeriesHeader>
<EditorGroup>
<Editor AffiliationIDS="Aff1">
<EditorName DisplayOrder="Western">
<GivenName>David</GivenName>
<FamilyName>Hutchison</FamilyName>
</EditorName>
</Editor>
<Editor AffiliationIDS="Aff2">
<EditorName DisplayOrder="Western">
<GivenName>Takeo</GivenName>
<FamilyName>Kanade</FamilyName>
</EditorName>
</Editor>
<Editor AffiliationIDS="Aff3">
<EditorName DisplayOrder="Western">
<GivenName>Josef</GivenName>
<FamilyName>Kittler</FamilyName>
</EditorName>
</Editor>
<Editor AffiliationIDS="Aff4">
<EditorName DisplayOrder="Western">
<GivenName>Jon</GivenName>
<GivenName>M.</GivenName>
<FamilyName>Kleinberg</FamilyName>
</EditorName>
</Editor>
<Editor AffiliationIDS="Aff5">
<EditorName DisplayOrder="Western">
<GivenName>Friedemann</GivenName>
<FamilyName>Mattern</FamilyName>
</EditorName>
</Editor>
<Editor AffiliationIDS="Aff6">
<EditorName DisplayOrder="Western">
<GivenName>John</GivenName>
<GivenName>C.</GivenName>
<FamilyName>Mitchell</FamilyName>
</EditorName>
</Editor>
<Editor AffiliationIDS="Aff7">
<EditorName DisplayOrder="Western">
<GivenName>Moni</GivenName>
<FamilyName>Naor</FamilyName>
</EditorName>
</Editor>
<Editor AffiliationIDS="Aff8">
<EditorName DisplayOrder="Western">
<GivenName>Oscar</GivenName>
<FamilyName>Nierstrasz</FamilyName>
</EditorName>
</Editor>
<Editor AffiliationIDS="Aff9">
<EditorName DisplayOrder="Western">
<GivenName>C.</GivenName>
<FamilyName>Pandu Rangan</FamilyName>
</EditorName>
</Editor>
<Editor AffiliationIDS="Aff10">
<EditorName DisplayOrder="Western">
<GivenName>Bernhard</GivenName>
<FamilyName>Steffen</FamilyName>
</EditorName>
</Editor>
<Editor AffiliationIDS="Aff11">
<EditorName DisplayOrder="Western">
<GivenName>Madhu</GivenName>
<FamilyName>Sudan</FamilyName>
</EditorName>
</Editor>
<Editor AffiliationIDS="Aff12">
<EditorName DisplayOrder="Western">
<GivenName>Demetri</GivenName>
<FamilyName>Terzopoulos</FamilyName>
</EditorName>
</Editor>
<Editor AffiliationIDS="Aff13">
<EditorName DisplayOrder="Western">
<GivenName>Dough</GivenName>
<FamilyName>Tygar</FamilyName>
</EditorName>
</Editor>
<Editor AffiliationIDS="Aff14">
<EditorName DisplayOrder="Western">
<GivenName>Moshe</GivenName>
<GivenName>Y.</GivenName>
<FamilyName>Vardi</FamilyName>
</EditorName>
</Editor>
<Editor AffiliationIDS="Aff15">
<EditorName DisplayOrder="Western">
<GivenName>Gerhard</GivenName>
<FamilyName>Weikum</FamilyName>
</EditorName>
</Editor>
<Affiliation ID="Aff1">
<OrgName>Lancaster University</OrgName>
<OrgAddress>
<Country>UK</Country>
</OrgAddress>
</Affiliation>
<Affiliation ID="Aff2">
<OrgName>Carnegie Mellon University</OrgName>
<OrgAddress>
<City>Pittsburgh</City>
<State>PA</State>
<Country>USA</Country>
</OrgAddress>
</Affiliation>
<Affiliation ID="Aff3">
<OrgName>University of Surrey</OrgName>
<OrgAddress>
<City>Guildford</City>
<Country>UK</Country>
</OrgAddress>
</Affiliation>
<Affiliation ID="Aff4">
<OrgName>Cornell University</OrgName>
<OrgAddress>
<City>Ithaca</City>
<State>NY</State>
<Country>USA</Country>
</OrgAddress>
</Affiliation>
<Affiliation ID="Aff5">
<OrgName>ETH Zurich</OrgName>
<OrgAddress>
<Country>Switzerland</Country>
</OrgAddress>
</Affiliation>
<Affiliation ID="Aff6">
<OrgName>Stanford University</OrgName>
<OrgAddress>
<City>CA</City>
<Country>USA</Country>
</OrgAddress>
</Affiliation>
<Affiliation ID="Aff7">
<OrgName>Weizmann Institute of Science</OrgName>
<OrgAddress>
<City>Rehovot</City>
<Country>Israel</Country>
</OrgAddress>
</Affiliation>
<Affiliation ID="Aff8">
<OrgName>University of Bern</OrgName>
<OrgAddress>
<Country>Switzerland</Country>
</OrgAddress>
</Affiliation>
<Affiliation ID="Aff9">
<OrgName>Indian Institute of Technology</OrgName>
<OrgAddress>
<City>Madras</City>
<Country>India</Country>
</OrgAddress>
</Affiliation>
<Affiliation ID="Aff10">
<OrgName>University of Dortmund</OrgName>
<OrgAddress>
<Country>Germany</Country>
</OrgAddress>
</Affiliation>
<Affiliation ID="Aff11">
<OrgName>Massachusetts Institute of Technology</OrgName>
<OrgAddress>
<City>MA</City>
<Country>USA</Country>
</OrgAddress>
</Affiliation>
<Affiliation ID="Aff12">
<OrgName>University of California</OrgName>
<OrgAddress>
<City>Los Angeles</City>
<State>CA</State>
<Country>USA</Country>
</OrgAddress>
</Affiliation>
<Affiliation ID="Aff13">
<OrgName>University of California</OrgName>
<OrgAddress>
<City>Berkeley</City>
<State>CA</State>
<Country>USA</Country>
</OrgAddress>
</Affiliation>
<Affiliation ID="Aff14">
<OrgName>Rice University</OrgName>
<OrgAddress>
<City>Houston</City>
<State>TX</State>
<Country>USA</Country>
</OrgAddress>
</Affiliation>
<Affiliation ID="Aff15">
<OrgName>Max-Planck Institute of Computer Science</OrgName>
<OrgAddress>
<City>Saarbruecken</City>
<Country>Germany</Country>
</OrgAddress>
</Affiliation>
</EditorGroup>
</SeriesHeader>
<SubSeries>
<SubSeriesInfo>
<SubSeriesID>5381</SubSeriesID>
<SubSeriesPrintISSN>0302-9743</SubSeriesPrintISSN>
<SubSeriesElectronicISSN>1611-3349</SubSeriesElectronicISSN>
<SubSeriesTitle Language="En">Lecture Notes in Bioinformatics</SubSeriesTitle>
</SubSeriesInfo>
<SubSeriesHeader>
<EditorGroup>
<Editor AffiliationIDS="Aff16">
<EditorName DisplayOrder="Western">
<GivenName>Sorin</GivenName>
<FamilyName>Istrail</FamilyName>
</EditorName>
</Editor>
<Editor AffiliationIDS="Aff17">
<EditorName DisplayOrder="Western">
<GivenName>Pavel</GivenName>
<FamilyName>Pevzner</FamilyName>
</EditorName>
</Editor>
<Editor AffiliationIDS="Aff18">
<EditorName DisplayOrder="Western">
<GivenName>Michael</GivenName>
<FamilyName>Waterman</FamilyName>
</EditorName>
</Editor>
<Affiliation ID="Aff16">
<OrgName>Celera Genomics, Applied Biosystems</OrgName>
<OrgAddress>
<City>Rockville</City>
<State>MD</State>
<Country>USA</Country>
</OrgAddress>
</Affiliation>
<Affiliation ID="Aff17">
<OrgName>University of California</OrgName>
<OrgAddress>
<City>San Diego</City>
<State>CA</State>
<Country>USA</Country>
</OrgAddress>
</Affiliation>
<Affiliation ID="Aff18">
<OrgName>University of Southern California</OrgName>
<OrgAddress>
<City>Los Angeles</City>
<State>CA</State>
<Country>USA</Country>
</OrgAddress>
</Affiliation>
</EditorGroup>
</SubSeriesHeader>
</SubSeries>
<Book Language="En">
<BookInfo BookProductType="Proceedings" ContainsESM="No" Language="En" MediaType="eBook" NumberingDepth="2" NumberingStyle="ContentOnly" OutputMedium="All" TocLevels="0">
<BookID>978-3-540-39584-3</BookID>
<BookTitle>Algorithms in Bioinformatics</BookTitle>
<BookSubTitle>6th International Workshop, WABI 2006, Zurich, Switzerland, September 11-13, 2006. Proceedings</BookSubTitle>
<BookVolumeNumber>4175</BookVolumeNumber>
<BookSequenceNumber>4175</BookSequenceNumber>
<BookDOI>10.1007/11851561</BookDOI>
<BookTitleID>141336</BookTitleID>
<BookPrintISBN>978-3-540-39583-6</BookPrintISBN>
<BookElectronicISBN>978-3-540-39584-3</BookElectronicISBN>
<BookChapterCount>36</BookChapterCount>
<BookCopyright>
<CopyrightHolderName>Springer-Verlag Berlin Heidelberg</CopyrightHolderName>
<CopyrightYear>2006</CopyrightYear>
</BookCopyright>
<BookSubjectGroup>
<BookSubject Code="I" Type="Primary">Computer Science</BookSubject>
<BookSubject Code="I16021" Priority="1" Type="Secondary">Algorithm Analysis and Problem Complexity</BookSubject>
<BookSubject Code="I16013" Priority="2" Type="Secondary">Computation by Abstract Devices</BookSubject>
<BookSubject Code="I15017" Priority="3" Type="Secondary">Data Structures</BookSubject>
<BookSubject Code="I17028" Priority="4" Type="Secondary">Discrete Mathematics in Computer Science</BookSubject>
<BookSubject Code="I17036" Priority="5" Type="Secondary">Probability and Statistics in Computer Science</BookSubject>
<BookSubject Code="L15001" Priority="6" Type="Secondary">Bioinformatics</BookSubject>
<SubjectCollection Code="SUCO11645">Computer Science</SubjectCollection>
</BookSubjectGroup>
<BookContext>
<SeriesID>558</SeriesID>
<SubSeriesID>5381</SubSeriesID>
</BookContext>
</BookInfo>
<BookHeader>
<EditorGroup>
<Editor AffiliationIDS="Aff19">
<EditorName DisplayOrder="Western">
<GivenName>Philipp</GivenName>
<FamilyName>Bücher</FamilyName>
</EditorName>
<Contact>
<Email>Philipp.Bucher@isrec.unil.ch</Email>
</Contact>
</Editor>
<Editor AffiliationIDS="Aff20">
<EditorName DisplayOrder="Western">
<GivenName>Bernard</GivenName>
<GivenName>M.</GivenName>
<GivenName>E.</GivenName>
<FamilyName>Moret</FamilyName>
</EditorName>
<Contact>
<Email>bernard.moret@epfl.ch</Email>
</Contact>
</Editor>
<Affiliation ID="Aff19">
<OrgName>Ecole Polytechnique Fédérale de Lausanne</OrgName>
<OrgAddress>
<Country>Switzerland</Country>
</OrgAddress>
</Affiliation>
<Affiliation ID="Aff20">
<OrgDivision>Laboratory for Computational Biology and Bioinformatics</OrgDivision>
<OrgName>EPFL (Ecole Polytechnique Fédérale de Lausanne), Swiss Institute of Bioinformatics</OrgName>
<OrgAddress>
<City>Lausanne</City>
<Country>Switzerland</Country>
</OrgAddress>
</Affiliation>
</EditorGroup>
</BookHeader>
<Chapter ID="Chap12" Language="En">
<ChapterInfo ChapterType="OriginalPaper" ContainsESM="No" NumberingDepth="2" NumberingStyle="ContentOnly" TocLevels="0">
<ChapterID>12</ChapterID>
<ChapterDOI>10.1007/11851561_12</ChapterDOI>
<ChapterSequenceNumber>12</ChapterSequenceNumber>
<ChapterTitle Language="En">Procrastination Leads to Efficient Filtration for Local Multiple Alignment</ChapterTitle>
<ChapterFirstPage>126</ChapterFirstPage>
<ChapterLastPage>137</ChapterLastPage>
<ChapterCopyright>
<CopyrightHolderName>Springer-Verlag Berlin Heidelberg</CopyrightHolderName>
<CopyrightYear>2006</CopyrightYear>
</ChapterCopyright>
<ChapterGrants Type="Regular">
<MetadataGrant Grant="OpenAccess"></MetadataGrant>
<AbstractGrant Grant="OpenAccess"></AbstractGrant>
<BodyPDFGrant Grant="Restricted"></BodyPDFGrant>
<BodyHTMLGrant Grant="Restricted"></BodyHTMLGrant>
<BibliographyGrant Grant="Restricted"></BibliographyGrant>
<ESMGrant Grant="Restricted"></ESMGrant>
</ChapterGrants>
<ChapterContext>
<SeriesID>558</SeriesID>
<BookID>978-3-540-39584-3</BookID>
<BookTitle>Algorithms in Bioinformatics</BookTitle>
</ChapterContext>
</ChapterInfo>
<ChapterHeader>
<AuthorGroup>
<Author AffiliationIDS="Aff21">
<AuthorName DisplayOrder="Western">
<GivenName>Aaron</GivenName>
<GivenName>E.</GivenName>
<FamilyName>Darling</FamilyName>
</AuthorName>
<Contact>
<Email>darling@cs.wisc.edu</Email>
</Contact>
</Author>
<Author AffiliationIDS="Aff22">
<AuthorName DisplayOrder="Western">
<GivenName>Todd</GivenName>
<GivenName>J.</GivenName>
<FamilyName>Treangen</FamilyName>
</AuthorName>
<Contact>
<Email>treangen@lsi.upc.edu</Email>
</Contact>
</Author>
<Author AffiliationIDS="Aff24">
<AuthorName DisplayOrder="Western">
<GivenName>Louxin</GivenName>
<FamilyName>Zhang</FamilyName>
</AuthorName>
</Author>
<Author AffiliationIDS="Aff25">
<AuthorName DisplayOrder="Western">
<GivenName>Carla</GivenName>
<FamilyName>Kuiken</FamilyName>
</AuthorName>
</Author>
<Author AffiliationIDS="Aff22">
<AuthorName DisplayOrder="Western">
<GivenName>Xavier</GivenName>
<FamilyName>Messeguer</FamilyName>
</AuthorName>
</Author>
<Author AffiliationIDS="Aff23">
<AuthorName DisplayOrder="Western">
<GivenName>Nicole</GivenName>
<GivenName>T.</GivenName>
<FamilyName>Perna</FamilyName>
</AuthorName>
</Author>
<Affiliation ID="Aff21">
<OrgDivision>Department of Computer Science</OrgDivision>
<OrgName>University of Wisconsin</OrgName>
<OrgAddress>
<Country>USA</Country>
</OrgAddress>
</Affiliation>
<Affiliation ID="Aff22">
<OrgDivision>Department of Computer Science</OrgDivision>
<OrgName>Technical University of Catalonia</OrgName>
<OrgAddress>
<City>Barcelona</City>
<Country>Spain</Country>
</OrgAddress>
</Affiliation>
<Affiliation ID="Aff23">
<OrgDivision>Department of Animal Health and Biomedical Sciences, Genome Center</OrgDivision>
<OrgName>University of Wisconsin</OrgName>
<OrgAddress>
<Country>USA</Country>
</OrgAddress>
</Affiliation>
<Affiliation ID="Aff24">
<OrgDivision>Department of Mathematics</OrgDivision>
<OrgName>National University of Singapore</OrgName>
<OrgAddress>
<Country>Singapore</Country>
</OrgAddress>
</Affiliation>
<Affiliation ID="Aff25">
<OrgDivision>T-10 Theoretical Biology Division</OrgDivision>
<OrgName>Los Alamos National Laboratory</OrgName>
<OrgAddress>
<Country>USA</Country>
</OrgAddress>
</Affiliation>
</AuthorGroup>
<Abstract ID="Abs1" Language="En">
<Heading>Abstract</Heading>
<Para>We describe an efficient local multiple alignment filtration heuristic for identification of conserved regions in one or more DNA sequences. The method incorporates several novel ideas: (1) palindromic spaced seed patterns to match both DNA strands simultaneously, (2) seed extension (chaining) in order of decreasing multiplicity, and (3) procrastination when low multiplicity matches are encountered. The resulting local multiple alignments may have nucleotide substitutions and internal gaps as large as
<Emphasis Type="Italic">w</Emphasis>
characters in any occurrence of the motif. The algorithm consumes
<InlineEquation ID="IEq1">
<InlineMediaObject>
<ImageObject FileRef="978-3-540-39584-3_12_Chapter_TeX2GIFIEq1.gif" Format="GIF" Color="BlackWhite" Type="Linedraw" Rendition="HTML"></ImageObject>
</InlineMediaObject>
<EquationSource Format="TEX">$\mathcal{O}(wN)$</EquationSource>
</InlineEquation>
memory and
<InlineEquation ID="IEq2">
<InlineMediaObject>
<ImageObject FileRef="978-3-540-39584-3_12_Chapter_TeX2GIFIEq2.gif" Format="GIF" Color="BlackWhite" Type="Linedraw" Rendition="HTML"></ImageObject>
</InlineMediaObject>
<EquationSource Format="TEX">$\mathcal{O}(wN \log wN)$</EquationSource>
</InlineEquation>
time where
<Emphasis Type="Italic">N</Emphasis>
is the sequence length. We score the significance of multiple alignments using entropy-based motif scoring methods. We demonstrate the performance of our filtration method on Alu-repeat rich segments of the human genome and a large set of Hepatitis C virus genomes. The GPL implementation of our algorithm in C++ is called
<Literal>procrastAligner</Literal>
and is freely available from http://gel.ahabs.wisc.edu/procrastination</Para>
</Abstract>
</ChapterHeader>
<NoBody></NoBody>
</Chapter>
</Book>
</Series>
</Publisher>
</istex:document>
</istex:metadataXml>
<mods version="3.6">
<titleInfo lang="en">
<title>Procrastination Leads to Efficient Filtration for Local Multiple Alignment</title>
</titleInfo>
<titleInfo type="alternative" contentType="CDATA">
<title>Procrastination Leads to Efficient Filtration for Local Multiple Alignment</title>
</titleInfo>
<name type="personal">
<namePart type="given">Aaron</namePart>
<namePart type="given">E.</namePart>
<namePart type="family">Darling</namePart>
<affiliation>Department of Computer Science, University of Wisconsin, USA</affiliation>
<affiliation>E-mail: darling@cs.wisc.edu</affiliation>
<role>
<roleTerm type="text">author</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Todd</namePart>
<namePart type="given">J.</namePart>
<namePart type="family">Treangen</namePart>
<affiliation>Department of Computer Science, Technical University of Catalonia, Barcelona, Spain</affiliation>
<affiliation>E-mail: treangen@lsi.upc.edu</affiliation>
<role>
<roleTerm type="text">author</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Louxin</namePart>
<namePart type="family">Zhang</namePart>
<affiliation>Department of Mathematics, National University of Singapore, Singapore</affiliation>
<role>
<roleTerm type="text">author</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Carla</namePart>
<namePart type="family">Kuiken</namePart>
<affiliation>T-10 Theoretical Biology Division, Los Alamos National Laboratory, USA</affiliation>
<role>
<roleTerm type="text">author</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Xavier</namePart>
<namePart type="family">Messeguer</namePart>
<affiliation>Department of Computer Science, Technical University of Catalonia, Barcelona, Spain</affiliation>
<role>
<roleTerm type="text">author</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Nicole</namePart>
<namePart type="given">T.</namePart>
<namePart type="family">Perna</namePart>
<affiliation>Department of Animal Health and Biomedical Sciences, Genome Center, University of Wisconsin, USA</affiliation>
<role>
<roleTerm type="text">author</roleTerm>
</role>
</name>
<typeOfResource>text</typeOfResource>
<genre displayLabel="OriginalPaper" authority="ISTEX" authorityURI="https://content-type.data.istex.fr" type="conference" valueURI="https://content-type.data.istex.fr/ark:/67375/XTP-BFHXPBJJ-3">conference</genre>
<originInfo>
<publisher>Springer Berlin Heidelberg</publisher>
<place>
<placeTerm type="text">Berlin, Heidelberg</placeTerm>
</place>
<dateIssued encoding="w3cdtf">2006</dateIssued>
<copyrightDate encoding="w3cdtf">2006</copyrightDate>
</originInfo>
<language>
<languageTerm type="code" authority="rfc3066">en</languageTerm>
<languageTerm type="code" authority="iso639-2b">eng</languageTerm>
</language>
<abstract lang="en">Abstract: We describe an efficient local multiple alignment filtration heuristic for identification of conserved regions in one or more DNA sequences. The method incorporates several novel ideas: (1) palindromic spaced seed patterns to match both DNA strands simultaneously, (2) seed extension (chaining) in order of decreasing multiplicity, and (3) procrastination when low multiplicity matches are encountered. The resulting local multiple alignments may have nucleotide substitutions and internal gaps as large as w characters in any occurrence of the motif. The algorithm consumes $\mathcal{O}(wN)$ memory and $\mathcal{O}(wN \log wN)$ time where N is the sequence length. We score the significance of multiple alignments using entropy-based motif scoring methods. We demonstrate the performance of our filtration method on Alu-repeat rich segments of the human genome and a large set of Hepatitis C virus genomes. The GPL implementation of our algorithm in C++ is called procrastAligner and is freely available from http://gel.ahabs.wisc.edu/procrastination</abstract>
<relatedItem type="host">
<titleInfo>
<title>Algorithms in Bioinformatics</title>
<subTitle>6th International Workshop, WABI 2006, Zurich, Switzerland, September 11-13, 2006. Proceedings</subTitle>
</titleInfo>
<name type="personal">
<namePart type="given">Philipp</namePart>
<namePart type="family">Bücher</namePart>
<affiliation>Ecole Polytechnique Fédérale de Lausanne, Switzerland</affiliation>
<affiliation>E-mail: Philipp.Bucher@isrec.unil.ch</affiliation>
<role>
<roleTerm type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Bernard</namePart>
<namePart type="given">M.</namePart>
<namePart type="given">E.</namePart>
<namePart type="family">Moret</namePart>
<affiliation>Laboratory for Computational Biology and Bioinformatics, EPFL (Ecole Polytechnique Fédérale de Lausanne), Swiss Institute of Bioinformatics, Lausanne, Switzerland</affiliation>
<affiliation>E-mail: bernard.moret@epfl.ch</affiliation>
<role>
<roleTerm type="text">editor</roleTerm>
</role>
</name>
<genre type="book-series" authority="ISTEX" authorityURI="https://publication-type.data.istex.fr" valueURI="https://publication-type.data.istex.fr/ark:/67375/JMC-0G6R5W5T-Z">book-series</genre>
<originInfo>
<publisher>Springer</publisher>
<copyrightDate encoding="w3cdtf">2006</copyrightDate>
<issuance>monographic</issuance>
</originInfo>
<subject>
<genre>Book-Subject-Collection</genre>
<topic authority="SpringerSubjectCodes" authorityURI="SUCO11645">Computer Science</topic>
</subject>
<subject>
<genre>Book-Subject-Group</genre>
<topic authority="SpringerSubjectCodes" authorityURI="I">Computer Science</topic>
<topic authority="SpringerSubjectCodes" authorityURI="I16021">Algorithm Analysis and Problem Complexity</topic>
<topic authority="SpringerSubjectCodes" authorityURI="I16013">Computation by Abstract Devices</topic>
<topic authority="SpringerSubjectCodes" authorityURI="I15017">Data Structures</topic>
<topic authority="SpringerSubjectCodes" authorityURI="I17028">Discrete Mathematics in Computer Science</topic>
<topic authority="SpringerSubjectCodes" authorityURI="I17036">Probability and Statistics in Computer Science</topic>
<topic authority="SpringerSubjectCodes" authorityURI="L15001">Bioinformatics</topic>
</subject>
<identifier type="DOI">10.1007/11851561</identifier>
<identifier type="ISBN">978-3-540-39583-6</identifier>
<identifier type="eISBN">978-3-540-39584-3</identifier>
<identifier type="ISSN">0302-9743</identifier>
<identifier type="eISSN">1611-3349</identifier>
<identifier type="BookTitleID">141336</identifier>
<identifier type="BookID">978-3-540-39584-3</identifier>
<identifier type="BookChapterCount">36</identifier>
<identifier type="BookVolumeNumber">4175</identifier>
<identifier type="BookSequenceNumber">4175</identifier>
<part>
<date>2006</date>
<detail type="volume">
<number>4175</number>
<caption>vol.</caption>
</detail>
<extent unit="pages">
<start>126</start>
<end>137</end>
</extent>
</part>
<recordInfo>
<recordOrigin>Springer-Verlag Berlin Heidelberg, 2006</recordOrigin>
</recordInfo>
</relatedItem>
<relatedItem type="series">
<titleInfo>
<title>Lecture Notes in Computer Science</title>
</titleInfo>
<name type="personal">
<namePart type="given">David</namePart>
<namePart type="family">Hutchison</namePart>
<affiliation>Lancaster University, UK</affiliation>
<role>
<roleTerm type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Takeo</namePart>
<namePart type="family">Kanade</namePart>
<affiliation>Carnegie Mellon University, Pittsburgh, PA, USA</affiliation>
<role>
<roleTerm type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Josef</namePart>
<namePart type="family">Kittler</namePart>
<affiliation>University of Surrey, Guildford, UK</affiliation>
<role>
<roleTerm type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Jon</namePart>
<namePart type="given">M.</namePart>
<namePart type="family">Kleinberg</namePart>
<affiliation>Cornell University, Ithaca, NY, USA</affiliation>
<role>
<roleTerm type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Friedemann</namePart>
<namePart type="family">Mattern</namePart>
<affiliation>ETH Zurich, Switzerland</affiliation>
<role>
<roleTerm type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">John</namePart>
<namePart type="given">C.</namePart>
<namePart type="family">Mitchell</namePart>
<affiliation>Stanford University, CA, USA</affiliation>
<role>
<roleTerm type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Moni</namePart>
<namePart type="family">Naor</namePart>
<affiliation>Weizmann Institute of Science, Rehovot, Israel</affiliation>
<role>
<roleTerm type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Oscar</namePart>
<namePart type="family">Nierstrasz</namePart>
<affiliation>University of Bern, Switzerland</affiliation>
<role>
<roleTerm type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">C.</namePart>
<namePart type="family">Pandu Rangan</namePart>
<affiliation>Indian Institute of Technology, Madras, India</affiliation>
<role>
<roleTerm type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Bernhard</namePart>
<namePart type="family">Steffen</namePart>
<affiliation>University of Dortmund, Germany</affiliation>
<role>
<roleTerm type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Madhu</namePart>
<namePart type="family">Sudan</namePart>
<affiliation>Massachusetts Institute of Technology, MA, USA</affiliation>
<role>
<roleTerm type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Demetri</namePart>
<namePart type="family">Terzopoulos</namePart>
<affiliation>University of California, Los Angeles, CA, USA</affiliation>
<role>
<roleTerm type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Dough</namePart>
<namePart type="family">Tygar</namePart>
<affiliation>University of California, Berkeley, CA, USA</affiliation>
<role>
<roleTerm type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Moshe</namePart>
<namePart type="given">Y.</namePart>
<namePart type="family">Vardi</namePart>
<affiliation>Rice University, Houston, TX, USA</affiliation>
<role>
<roleTerm type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Gerhard</namePart>
<namePart type="family">Weikum</namePart>
<affiliation>Max-Planck Institute of Computer Science, Saarbruecken, Germany</affiliation>
<role>
<roleTerm type="text">editor</roleTerm>
</role>
</name>
<originInfo>
<publisher>Springer</publisher>
<copyrightDate encoding="w3cdtf">2006</copyrightDate>
<issuance>serial</issuance>
</originInfo>
<relatedItem type="constituent">
<titleInfo>
<title>Lecture Notes in Bioinformatics</title>
</titleInfo>
<name type="personal">
<namePart type="given">David</namePart>
<namePart type="family">Hutchison</namePart>
<affiliation>Lancaster University, UK</affiliation>
<role>
<roleTerm type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Takeo</namePart>
<namePart type="family">Kanade</namePart>
<affiliation>Carnegie Mellon University, Pittsburgh, PA, USA</affiliation>
<role>
<roleTerm type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Josef</namePart>
<namePart type="family">Kittler</namePart>
<affiliation>University of Surrey, Guildford, UK</affiliation>
<role>
<roleTerm type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Jon</namePart>
<namePart type="given">M.</namePart>
<namePart type="family">Kleinberg</namePart>
<affiliation>Cornell University, Ithaca, NY, USA</affiliation>
<role>
<roleTerm type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Friedemann</namePart>
<namePart type="family">Mattern</namePart>
<affiliation>ETH Zurich, Switzerland</affiliation>
<role>
<roleTerm type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">John</namePart>
<namePart type="given">C.</namePart>
<namePart type="family">Mitchell</namePart>
<affiliation>Stanford University, CA, USA</affiliation>
<role>
<roleTerm type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Moni</namePart>
<namePart type="family">Naor</namePart>
<affiliation>Weizmann Institute of Science, Rehovot, Israel</affiliation>
<role>
<roleTerm type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Oscar</namePart>
<namePart type="family">Nierstrasz</namePart>
<affiliation>University of Bern, Switzerland</affiliation>
<role>
<roleTerm type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">C.</namePart>
<namePart type="family">Pandu Rangan</namePart>
<affiliation>Indian Institute of Technology, Madras, India</affiliation>
<role>
<roleTerm type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Bernhard</namePart>
<namePart type="family">Steffen</namePart>
<affiliation>University of Dortmund, Germany</affiliation>
<role>
<roleTerm type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Madhu</namePart>
<namePart type="family">Sudan</namePart>
<affiliation>Massachusetts Institute of Technology, MA, USA</affiliation>
<role>
<roleTerm type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Demetri</namePart>
<namePart type="family">Terzopoulos</namePart>
<affiliation>University of California, Los Angeles, CA, USA</affiliation>
<role>
<roleTerm type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Dough</namePart>
<namePart type="family">Tygar</namePart>
<affiliation>University of California, Berkeley, CA, USA</affiliation>
<role>
<roleTerm type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Moshe</namePart>
<namePart type="given">Y.</namePart>
<namePart type="family">Vardi</namePart>
<affiliation>Rice University, Houston, TX, USA</affiliation>
<role>
<roleTerm type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Gerhard</namePart>
<namePart type="family">Weikum</namePart>
<affiliation>Max-Planck Institute of Computer Science, Saarbruecken, Germany</affiliation>
<role>
<roleTerm type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Sorin</namePart>
<namePart type="family">Istrail</namePart>
<affiliation>Celera Genomics, Applied Biosystems, Rockville, MD, USA</affiliation>
<role>
<roleTerm type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Pavel</namePart>
<namePart type="family">Pevzner</namePart>
<affiliation>University of California, San Diego, CA, USA</affiliation>
<role>
<roleTerm type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Michael</namePart>
<namePart type="family">Waterman</namePart>
<affiliation>University of Southern California, Los Angeles, CA, USA</affiliation>
<role>
<roleTerm type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Philipp</namePart>
<namePart type="family">Bücher</namePart>
<affiliation>Ecole Polytechnique Fédérale de Lausanne, Switzerland</affiliation>
<affiliation>E-mail: Philipp.Bucher@isrec.unil.ch</affiliation>
<role>
<roleTerm type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Bernard</namePart>
<namePart type="given">M.</namePart>
<namePart type="given">E.</namePart>
<namePart type="family">Moret</namePart>
<affiliation>Laboratory for Computational Biology and Bioinformatics, EPFL (Ecole Polytechnique Fédérale de Lausanne), Swiss Institute of Bioinformatics, Lausanne, Switzerland</affiliation>
<affiliation>E-mail: bernard.moret@epfl.ch</affiliation>
<role>
<roleTerm type="text">editor</roleTerm>
</role>
</name>
<genre type="sub-series"></genre>
<identifier type="ISSN">0302-9743</identifier>
<identifier type="eISSN">1611-3349</identifier>
<identifier type="SubSeriesID">5381</identifier>
</relatedItem>
<identifier type="ISSN">0302-9743</identifier>
<identifier type="eISSN">1611-3349</identifier>
<identifier type="SeriesID">558</identifier>
<recordInfo>
<recordOrigin>Springer-Verlag Berlin Heidelberg, 2006</recordOrigin>
</recordInfo>
</relatedItem>
<identifier type="istex">542B9BC57D7447DF7EFDFE6A4AB552289FE92D7E</identifier>
<identifier type="ark">ark:/67375/HCB-ZC5KHRTV-K</identifier>
<identifier type="DOI">10.1007/11851561_12</identifier>
<identifier type="ChapterID">12</identifier>
<identifier type="ChapterID">Chap12</identifier>
<accessCondition type="use and reproduction" contentType="copyright">Springer-Verlag Berlin Heidelberg, 2006</accessCondition>
<recordInfo>
<recordContentSource authority="ISTEX" authorityURI="https://loaded-corpus.data.istex.fr" valueURI="https://loaded-corpus.data.istex.fr/ark:/67375/XBH-RLRX46XW-4">springer</recordContentSource>
<recordOrigin>Springer-Verlag Berlin Heidelberg, 2006</recordOrigin>
</recordInfo>
</mods>
<json:item>
<extension>json</extension>
<original>false</original>
<mimetype>application/json</mimetype>
<uri>https://api.istex.fr/ark:/67375/HCB-ZC5KHRTV-K/record.json</uri>
</json:item>
</metadata>
</istex>
</record>

Pour manipuler ce document sous Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Sante/explor/MersV1/Data/Istex/Corpus
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000B96 | SxmlIndent | more

Ou

HfdSelect -h $EXPLOR_AREA/Data/Istex/Corpus/biblio.hfd -nk 000B96 | SxmlIndent | more

Pour mettre un lien sur cette page dans le réseau Wicri

{{Explor lien
   |wiki=    Sante
   |area=    MersV1
   |flux=    Istex
   |étape=   Corpus
   |type=    RBID
   |clé=     ISTEX:542B9BC57D7447DF7EFDFE6A4AB552289FE92D7E
   |texte=   Procrastination Leads to Efficient Filtration for Local Multiple Alignment
}}

Wicri

This area was generated with Dilib version V0.6.33.
Data generation: Mon Apr 20 23:26:43 2020. Site generation: Sat Mar 27 09:06:09 2021