Procrastination Leads to Efficient Filtration for Local Multiple Alignment
Internal identifier: 000B96 (Istex/Curation); previous: 000B95; next: 000B97
Authors: Aaron E. Darling [United States]; Todd J. Treangen [Spain, United States]; Louxin Zhang [Singapore]; Carla Kuiken [United States]; Xavier Messeguer [Spain]; Nicole T. Perna [United States]
Source:
- Lecture Notes in Computer Science [0302-9743]
Abstract
We describe an efficient local multiple alignment filtration heuristic for identification of conserved regions in one or more DNA sequences. The method incorporates several novel ideas: (1) palindromic spaced seed patterns to match both DNA strands simultaneously, (2) seed extension (chaining) in order of decreasing multiplicity, and (3) procrastination when low multiplicity matches are encountered. The resulting local multiple alignments may have nucleotide substitutions and internal gaps as large as w characters in any occurrence of the motif. The algorithm consumes $\mathcal{O}(wN)$ memory and $\mathcal{O}(wN \log wN)$ time where N is the sequence length. We score the significance of multiple alignments using entropy-based motif scoring methods. We demonstrate the performance of our filtration method on Alu-repeat-rich segments of the human genome and a large set of Hepatitis C virus genomes. The GPL implementation of our algorithm in C++ is called procrastAligner and is freely available from http://gel.ahabs.wisc.edu/procrastination
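To make idea (1) concrete, the following is a minimal sketch of how a palindromic spaced seed can give the same key to a window and to its reverse complement, so that both strands are indexed in one pass. It is not taken from procrastAligner: the pattern `1101011`, the function names, and the choice of the lexicographically smaller string as the canonical key are all assumptions made for illustration. The last helper only orders seed keys by decreasing multiplicity, echoing idea (2); it does not perform chaining or procrastination.

```python
from collections import defaultdict

# Hypothetical helper names; none of them come from procrastAligner.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string over A, C, G, T."""
    return seq.translate(COMPLEMENT)[::-1]


def seed_key(window, pattern):
    """Canonical key of a window under a palindromic spaced seed.

    '1' marks a position that must match, '0' a wildcard. Because the
    pattern reads the same in both directions, the characters sampled
    from the reverse-complemented window are the reverse complement of
    the characters sampled from the window itself. Taking the smaller
    of the two strings therefore yields one key for both strands.
    """
    if pattern != pattern[::-1]:
        raise ValueError("seed pattern must be palindromic")
    sampled = "".join(c for c, p in zip(window, pattern) if p == "1")
    return min(sampled, reverse_complement(sampled))


def index_seeds(seq, pattern):
    """Map each canonical seed key to the window start positions in seq."""
    hits = defaultdict(list)
    span = len(pattern)
    for start in range(len(seq) - span + 1):
        hits[seed_key(seq[start:start + span], pattern)].append(start)
    return dict(hits)


def by_decreasing_multiplicity(hits):
    """List (key, positions) pairs, most frequent seed keys first."""
    return sorted(hits.items(), key=lambda item: -len(item[1]))
```

For example, `index_seeds("ACGTTGATCAACGT", "1101011")` places the window starting at position 0 and its reverse-complement occurrence starting at position 7 under the same key, and a substitution at a wildcard position of the window leaves the key unchanged.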
DOI: 10.1007/11851561_12
ISTEX:542B9BC57D7447DF7EFDFE6A4AB552289FE92D7E
The document in XML format
<record><TEI wicri:istexFullTextTei="biblStruct"><teiHeader><fileDesc><titleStmt><title xml:lang="en">Procrastination Leads to Efficient Filtration for Local Multiple Alignment</title>
<author><name sortKey="Darling, Aaron E" sort="Darling, Aaron E" uniqKey="Darling A" first="Aaron E." last="Darling">Aaron E. Darling</name>
<affiliation wicri:level="1"><mods:affiliation>Department of Computer Science, University of Wisconsin, USA</mods:affiliation>
<country xml:lang="fr">États-Unis</country>
<wicri:regionArea>Department of Computer Science, University of Wisconsin</wicri:regionArea>
</affiliation>
<affiliation wicri:level="1"><mods:affiliation>E-mail: darling@cs.wisc.edu</mods:affiliation>
<country wicri:rule="url">États-Unis</country>
</affiliation>
</author>
<author><name sortKey="Treangen, Todd J" sort="Treangen, Todd J" uniqKey="Treangen T" first="Todd J." last="Treangen">Todd J. Treangen</name>
<affiliation wicri:level="1"><mods:affiliation>Department of Computer Science, Technical University of Catalonia, Barcelona, Spain</mods:affiliation>
<country xml:lang="fr">Espagne</country>
<wicri:regionArea>Department of Computer Science, Technical University of Catalonia, Barcelona</wicri:regionArea>
</affiliation>
<affiliation wicri:level="1"><mods:affiliation>E-mail: treangen@lsi.upc.edu</mods:affiliation>
<country wicri:rule="url">États-Unis</country>
</affiliation>
</author>
<author><name sortKey="Zhang, Louxin" sort="Zhang, Louxin" uniqKey="Zhang L" first="Louxin" last="Zhang">Louxin Zhang</name>
<affiliation wicri:level="1"><mods:affiliation>Department of Mathematics, National University of Singapore, Singapore</mods:affiliation>
<country xml:lang="fr">Singapour</country>
<wicri:regionArea>Department of Mathematics, National University of Singapore</wicri:regionArea>
</affiliation>
</author>
<author><name sortKey="Kuiken, Carla" sort="Kuiken, Carla" uniqKey="Kuiken C" first="Carla" last="Kuiken">Carla Kuiken</name>
<affiliation wicri:level="1"><mods:affiliation>T-10 Theoretical Biology Division, Los Alamos National Laboratory, USA</mods:affiliation>
<country xml:lang="fr">États-Unis</country>
<wicri:regionArea>T-10 Theoretical Biology Division, Los Alamos National Laboratory</wicri:regionArea>
</affiliation>
</author>
<author><name sortKey="Messeguer, Xavier" sort="Messeguer, Xavier" uniqKey="Messeguer X" first="Xavier" last="Messeguer">Xavier Messeguer</name>
<affiliation wicri:level="1"><mods:affiliation>Department of Computer Science, Technical University of Catalonia, Barcelona, Spain</mods:affiliation>
<country xml:lang="fr">Espagne</country>
<wicri:regionArea>Department of Computer Science, Technical University of Catalonia, Barcelona</wicri:regionArea>
</affiliation>
</author>
<author><name sortKey="Perna, Nicole T" sort="Perna, Nicole T" uniqKey="Perna N" first="Nicole T." last="Perna">Nicole T. Perna</name>
<affiliation wicri:level="1"><mods:affiliation>Department of Animal Health and Biomedical Sciences, Genome Center, University of Wisconsin, USA</mods:affiliation>
<country xml:lang="fr">États-Unis</country>
<wicri:regionArea>Department of Animal Health and Biomedical Sciences, Genome Center, University of Wisconsin</wicri:regionArea>
</affiliation>
</author>
</titleStmt>
<publicationStmt><idno type="wicri:source">ISTEX</idno>
<idno type="RBID">ISTEX:542B9BC57D7447DF7EFDFE6A4AB552289FE92D7E</idno>
<date when="2006" year="2006">2006</date>
<idno type="doi">10.1007/11851561_12</idno>
<idno type="url">https://api.istex.fr/ark:/67375/HCB-ZC5KHRTV-K/fulltext.pdf</idno>
<idno type="wicri:Area/Istex/Corpus">000B96</idno>
<idno type="wicri:explorRef" wicri:stream="Istex" wicri:step="Corpus" wicri:corpus="ISTEX">000B96</idno>
<idno type="wicri:Area/Istex/Curation">000B96</idno>
</publicationStmt>
<sourceDesc><biblStruct><analytic><title level="a" type="main" xml:lang="en">Procrastination Leads to Efficient Filtration for Local Multiple Alignment</title>
<author><name sortKey="Darling, Aaron E" sort="Darling, Aaron E" uniqKey="Darling A" first="Aaron E." last="Darling">Aaron E. Darling</name>
<affiliation wicri:level="1"><mods:affiliation>Department of Computer Science, University of Wisconsin, USA</mods:affiliation>
<country xml:lang="fr">États-Unis</country>
<wicri:regionArea>Department of Computer Science, University of Wisconsin</wicri:regionArea>
</affiliation>
<affiliation wicri:level="1"><mods:affiliation>E-mail: darling@cs.wisc.edu</mods:affiliation>
<country wicri:rule="url">États-Unis</country>
</affiliation>
</author>
<author><name sortKey="Treangen, Todd J" sort="Treangen, Todd J" uniqKey="Treangen T" first="Todd J." last="Treangen">Todd J. Treangen</name>
<affiliation wicri:level="1"><mods:affiliation>Department of Computer Science, Technical University of Catalonia, Barcelona, Spain</mods:affiliation>
<country xml:lang="fr">Espagne</country>
<wicri:regionArea>Department of Computer Science, Technical University of Catalonia, Barcelona</wicri:regionArea>
</affiliation>
<affiliation wicri:level="1"><mods:affiliation>E-mail: treangen@lsi.upc.edu</mods:affiliation>
<country wicri:rule="url">États-Unis</country>
</affiliation>
</author>
<author><name sortKey="Zhang, Louxin" sort="Zhang, Louxin" uniqKey="Zhang L" first="Louxin" last="Zhang">Louxin Zhang</name>
<affiliation wicri:level="1"><mods:affiliation>Department of Mathematics, National University of Singapore, Singapore</mods:affiliation>
<country xml:lang="fr">Singapour</country>
<wicri:regionArea>Department of Mathematics, National University of Singapore</wicri:regionArea>
</affiliation>
</author>
<author><name sortKey="Kuiken, Carla" sort="Kuiken, Carla" uniqKey="Kuiken C" first="Carla" last="Kuiken">Carla Kuiken</name>
<affiliation wicri:level="1"><mods:affiliation>T-10 Theoretical Biology Division, Los Alamos National Laboratory, USA</mods:affiliation>
<country xml:lang="fr">États-Unis</country>
<wicri:regionArea>T-10 Theoretical Biology Division, Los Alamos National Laboratory</wicri:regionArea>
</affiliation>
</author>
<author><name sortKey="Messeguer, Xavier" sort="Messeguer, Xavier" uniqKey="Messeguer X" first="Xavier" last="Messeguer">Xavier Messeguer</name>
<affiliation wicri:level="1"><mods:affiliation>Department of Computer Science, Technical University of Catalonia, Barcelona, Spain</mods:affiliation>
<country xml:lang="fr">Espagne</country>
<wicri:regionArea>Department of Computer Science, Technical University of Catalonia, Barcelona</wicri:regionArea>
</affiliation>
</author>
<author><name sortKey="Perna, Nicole T" sort="Perna, Nicole T" uniqKey="Perna N" first="Nicole T." last="Perna">Nicole T. Perna</name>
<affiliation wicri:level="1"><mods:affiliation>Department of Animal Health and Biomedical Sciences, Genome Center, University of Wisconsin, USA</mods:affiliation>
<country xml:lang="fr">États-Unis</country>
<wicri:regionArea>Department of Animal Health and Biomedical Sciences, Genome Center, University of Wisconsin</wicri:regionArea>
</affiliation>
</author>
</analytic>
<monogr></monogr>
<series><title level="s" type="main" xml:lang="en">Lecture Notes in Computer Science</title>
<idno type="ISSN">0302-9743</idno>
<idno type="eISSN">1611-3349</idno>
</series>
</biblStruct>
</sourceDesc>
<seriesStmt><idno type="ISSN">0302-9743</idno>
</seriesStmt>
</fileDesc>
<profileDesc><textClass></textClass>
</profileDesc>
</teiHeader>
<front><div type="abstract" xml:lang="en">Abstract: We describe an efficient local multiple alignment filtration heuristic for identification of conserved regions in one or more DNA sequences. The method incorporates several novel ideas: (1) palindromic spaced seed patterns to match both DNA strands simultaneously, (2) seed extension (chaining) in order of decreasing multiplicity, and (3) procrastination when low multiplicity matches are encountered. The resulting local multiple alignments may have nucleotide substitutions and internal gaps as large as w characters in any occurrence of the motif. The algorithm consumes $\mathcal{O}(wN)$ memory and $\mathcal{O}(wN \log wN)$ time where N is the sequence length. We score the significance of multiple alignments using entropy-based motif scoring methods. We demonstrate the performance of our filtration method on Alu-repeat rich segments of the human genome and a large set of Hepatitis C virus genomes. The GPL implementation of our algorithm in C++ is called procrastAligner and is freely available from http://gel.ahabs.wisc.edu/procrastination</div>
</front>
</TEI>
</record>
To manipulate this document under Unix (Dilib)
EXPLOR_STEP=$WICRI_ROOT/Sante/explor/MersV1/Data/Istex/Curation
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000B96 | SxmlIndent | more
Or
HfdSelect -h $EXPLOR_AREA/Data/Istex/Curation/biblio.hfd -nk 000B96 | SxmlIndent | more
To link to this page within the Wicri network
{{Explor lien |wiki= Sante |area= MersV1 |flux= Istex |étape= Curation |type= RBID |clé= ISTEX:542B9BC57D7447DF7EFDFE6A4AB552289FE92D7E |texte= Procrastination Leads to Efficient Filtration for Local Multiple Alignment }}
This area was generated with Dilib version V0.6.33.