Cyberinfrastructure exploration server

Warning: this site is under development!
Warning: this site was generated automatically from raw corpora.
The information has therefore not been validated.

Internal identifier: 000502 ( Pmc/Corpus ); previous: 0005019; next: 0005030


The document in XML format

<record>
<TEI>
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en">FR-HIT, a very fast program to recruit metagenomic reads to homologous reference genomes</title>
<author>
<name sortKey="Niu, Beifang" sort="Niu, Beifang" uniqKey="Niu B" first="Beifang" last="Niu">Beifang Niu</name>
</author>
<author>
<name sortKey="Zhu, Zhengwei" sort="Zhu, Zhengwei" uniqKey="Zhu Z" first="Zhengwei" last="Zhu">Zhengwei Zhu</name>
</author>
<author>
<name sortKey="Fu, Limin" sort="Fu, Limin" uniqKey="Fu L" first="Limin" last="Fu">Limin Fu</name>
</author>
<author>
<name sortKey="Wu, Sitao" sort="Wu, Sitao" uniqKey="Wu S" first="Sitao" last="Wu">Sitao Wu</name>
</author>
<author>
<name sortKey="Li, Weizhong" sort="Li, Weizhong" uniqKey="Li W" first="Weizhong" last="Li">Weizhong Li</name>
</author>
</titleStmt>
<publicationStmt>
<idno type="wicri:source">PMC</idno>
<idno type="pmid">21505035</idno>
<idno type="pmc">3106194</idno>
<idno type="url">http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3106194</idno>
<idno type="RBID">PMC:3106194</idno>
<idno type="doi">10.1093/bioinformatics/btr252</idno>
<date when="2011">2011</date>
<idno type="wicri:Area/Pmc/Corpus">000502</idno>
</publicationStmt>
<sourceDesc>
<biblStruct>
<analytic>
<title xml:lang="en" level="a" type="main">FR-HIT, a very fast program to recruit metagenomic reads to homologous reference genomes</title>
<author>
<name sortKey="Niu, Beifang" sort="Niu, Beifang" uniqKey="Niu B" first="Beifang" last="Niu">Beifang Niu</name>
</author>
<author>
<name sortKey="Zhu, Zhengwei" sort="Zhu, Zhengwei" uniqKey="Zhu Z" first="Zhengwei" last="Zhu">Zhengwei Zhu</name>
</author>
<author>
<name sortKey="Fu, Limin" sort="Fu, Limin" uniqKey="Fu L" first="Limin" last="Fu">Limin Fu</name>
</author>
<author>
<name sortKey="Wu, Sitao" sort="Wu, Sitao" uniqKey="Wu S" first="Sitao" last="Wu">Sitao Wu</name>
</author>
<author>
<name sortKey="Li, Weizhong" sort="Li, Weizhong" uniqKey="Li W" first="Weizhong" last="Li">Weizhong Li</name>
</author>
</analytic>
<series>
<title level="j">Bioinformatics</title>
<idno type="ISSN">1367-4803</idno>
<idno type="eISSN">1367-4811</idno>
<imprint>
<date when="2011">2011</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
</fileDesc>
<profileDesc>
<textClass></textClass>
</profileDesc>
</teiHeader>
<front>
<div type="abstract" xml:lang="en">
<p>
<bold>Summary:</bold>
Fragment recruitment, a process of aligning sequencing reads to reference genomes, is a crucial step in metagenomic data analysis. The available sequence alignment programs are either slow or insufficient for recruiting metagenomic reads. We implemented an efficient algorithm, FR-HIT, for fragment recruitment. We applied FR-HIT and several other tools including BLASTN, MegaBLAST, BLAT, LAST, SSAHA2, SOAP2, BWA and BWA-SW to recruit four metagenomic datasets from different types of sequencers. On average, FR-HIT and BLASTN recruited significantly more reads than the other programs, while FR-HIT is about two orders of magnitude faster than BLASTN. FR-HIT is slower than the fastest programs (SOAP2, BWA and BWA-SW), but it recruited 1–5 times more reads.</p>
<p>
<bold>Availability:</bold>
<ext-link ext-link-type="uri" xlink:href="http://weizhongli-lab.org/frhit">http://weizhongli-lab.org/frhit</ext-link>
.</p>
<p>
<bold>Contact:</bold>
<email>liwz@sdsc.edu</email>
</p>
<p>
<bold>Supplementary information:</bold>
<ext-link ext-link-type="uri" xlink:href="http://bioinformatics.oxfordjournals.org/cgi/content/full/btr252/DC1">Supplementary data</ext-link>
are available at
<italic>Bioinformatics</italic>
online.</p>
</div>
</front>
<back>
<div1 type="bibliography">
<listBibl>
<biblStruct>
<analytic>
<author>
<name sortKey="Altschul, Sf" uniqKey="Altschul S">SF Altschul</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Burkhardt, S" uniqKey="Burkhardt S">S Burkhardt</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Jokinen, P" uniqKey="Jokinen P">P Jokinen</name>
</author>
<author>
<name sortKey="Ukkonen, E" uniqKey="Ukkonen E">E Ukkonen</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Kent, Wj" uniqKey="Kent W">WJ Kent</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Kielbasa, Sm" uniqKey="Kielbasa S">SM Kielbasa</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Langmead, B" uniqKey="Langmead B">B Langmead</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Li, H" uniqKey="Li H">H Li</name>
</author>
<author>
<name sortKey="Durbin, R" uniqKey="Durbin R">R Durbin</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Li, R" uniqKey="Li R">R Li</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Ning, Z" uniqKey="Ning Z">Z Ning</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Owolabi, O" uniqKey="Owolabi O">O Owolabi</name>
</author>
<author>
<name sortKey="Mcgregor, Dr" uniqKey="Mcgregor D">DR McGregor</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Pearson, Wr" uniqKey="Pearson W">WR Pearson</name>
</author>
<author>
<name sortKey="Lipman, Dj" uniqKey="Lipman D">DJ Lipman</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Qin, J" uniqKey="Qin J">J Qin</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Rusch, Db" uniqKey="Rusch D">DB Rusch</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Sun, S" uniqKey="Sun S">S Sun</name>
</author>
</analytic>
</biblStruct>
</listBibl>
</div1>
</back>
</TEI>
<pmc article-type="research-article">
<pmc-dir>properties open_access</pmc-dir>
<front>
<journal-meta>
<journal-id journal-id-type="nlm-ta">Bioinformatics</journal-id>
<journal-id journal-id-type="publisher-id">bioinformatics</journal-id>
<journal-id journal-id-type="hwp">bioinfo</journal-id>
<journal-title-group>
<journal-title>Bioinformatics</journal-title>
</journal-title-group>
<issn pub-type="ppub">1367-4803</issn>
<issn pub-type="epub">1367-4811</issn>
<publisher>
<publisher-name>Oxford University Press</publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="pmid">21505035</article-id>
<article-id pub-id-type="pmc">3106194</article-id>
<article-id pub-id-type="doi">10.1093/bioinformatics/btr252</article-id>
<article-id pub-id-type="publisher-id">btr252</article-id>
<article-categories>
<subj-group subj-group-type="heading">
<subject>Applications Note</subject>
<subj-group>
<subject>Sequence Analysis</subject>
</subj-group>
</subj-group>
</article-categories>
<title-group>
<article-title>FR-HIT, a very fast program to recruit metagenomic reads to homologous reference genomes</article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author">
<name>
<surname>Niu</surname>
<given-names>Beifang</given-names>
</name>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Zhu</surname>
<given-names>Zhengwei</given-names>
</name>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Fu</surname>
<given-names>Limin</given-names>
</name>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Wu</surname>
<given-names>Sitao</given-names>
</name>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Li</surname>
<given-names>Weizhong</given-names>
</name>
<xref ref-type="corresp" rid="COR1">*</xref>
</contrib>
</contrib-group>
<aff>Center for Research in Biological Systems, University of California San Diego, La Jolla, CA, USA</aff>
<author-notes>
<corresp id="COR1">*To whom correspondence should be addressed.</corresp>
<fn>
<p>Associate Editor: Martin Bishop</p>
</fn>
</author-notes>
<pub-date pub-type="ppub">
<day>15</day>
<month>6</month>
<year>2011</year>
</pub-date>
<pub-date pub-type="epub">
<day>19</day>
<month>4</month>
<year>2011</year>
</pub-date>
<pub-date pub-type="pmc-release">
<day>19</day>
<month>4</month>
<year>2011</year>
</pub-date>
<pmc-comment> PMC Release delay is 0 months and 0 days and was based on the . </pmc-comment>
<volume>27</volume>
<issue>12</issue>
<fpage>1704</fpage>
<lpage>1705</lpage>
<history>
<date date-type="received">
<day>19</day>
<month>2</month>
<year>2011</year>
</date>
<date date-type="rev-recd">
<day>8</day>
<month>4</month>
<year>2011</year>
</date>
<date date-type="accepted">
<day>11</day>
<month>4</month>
<year>2011</year>
</date>
</history>
<permissions>
<copyright-statement>© The Author(s) 2011. Published by Oxford University Press.</copyright-statement>
<copyright-year>2011</copyright-year>
<license license-type="creative-commons" xlink:href="http://creativecommons.org/licenses/by-nc/2.5">
<license-p>
<pmc-comment>CREATIVE COMMONS</pmc-comment>
This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (
<ext-link ext-link-type="uri" xlink:href="http://creativecommons.org/licenses/by-nc/2.5">http://creativecommons.org/licenses/by-nc/2.5</ext-link>
), which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.</license-p>
</license>
</permissions>
<abstract>
<p>
<bold>Summary:</bold>
Fragment recruitment, a process of aligning sequencing reads to reference genomes, is a crucial step in metagenomic data analysis. The available sequence alignment programs are either slow or insufficient for recruiting metagenomic reads. We implemented an efficient algorithm, FR-HIT, for fragment recruitment. We applied FR-HIT and several other tools including BLASTN, MegaBLAST, BLAT, LAST, SSAHA2, SOAP2, BWA and BWA-SW to recruit four metagenomic datasets from different types of sequencers. On average, FR-HIT and BLASTN recruited significantly more reads than the other programs, while FR-HIT is about two orders of magnitude faster than BLASTN. FR-HIT is slower than the fastest programs (SOAP2, BWA and BWA-SW), but it recruited 1–5 times more reads.</p>
<p>
<bold>Availability:</bold>
<ext-link ext-link-type="uri" xlink:href="http://weizhongli-lab.org/frhit">http://weizhongli-lab.org/frhit</ext-link>
.</p>
<p>
<bold>Contact:</bold>
<email>liwz@sdsc.edu</email>
</p>
<p>
<bold>Supplementary information:</bold>
<ext-link ext-link-type="uri" xlink:href="http://bioinformatics.oxfordjournals.org/cgi/content/full/btr252/DC1">Supplementary data</ext-link>
are available at
<italic>Bioinformatics</italic>
online.</p>
</abstract>
<counts>
<page-count count="2"></page-count>
</counts>
</article-meta>
</front>
<body>
<sec sec-type="intro" id="SEC1">
<title>1 INTRODUCTION</title>
<p>Metagenomic data provide a more comprehensive picture for our understanding of the microbial world. An important step of such understanding is to compare the raw sequencing reads against the available microbial genomes to analyze the phylogenetic composition, genes and functions of the samples. Such a procedure, referred to as fragment recruitment, was introduced in the Global Ocean Sampling (GOS) metagenomics study (
<xref ref-type="bibr" rid="B13">Rusch
<italic>et al.</italic>
, 2007</xref>
).</p>
<p>Sequences from metagenomic samples exhibit great differences from the available genomes. Although there are thousands of available complete microbial genomes, they hardly cover the broad and diverse species in many metagenomic samples. A typical metagenomic dataset may have hundreds or thousands of species, and many of them are novel. Therefore, it is critical for fragment recruitment methods to align reads to homologous genomes.</p>
<p>In the GOS study, BLAST (
<xref ref-type="bibr" rid="B1">Altschul
<italic>et al.</italic>
, 1997</xref>
) was used for fragment recruitment. However, it is too slow to handle large datasets. The explosion of next-generation sequencing data stimulated the development of new mapping programs, such as SOAP (
<xref ref-type="bibr" rid="B8">Li
<italic>et al.</italic>
, 2008</xref>
), Bowtie (
<xref ref-type="bibr" rid="B6">Langmead
<italic>et al.</italic>
, 2009</xref>
), BWA (
<xref ref-type="bibr" rid="B7">Li and Durbin, 2009</xref>
) and many others. These programs are several orders of magnitude faster than BLAST, but they identify only very stringent similarities, tolerating just a few mismatches and gaps. These mapping programs are therefore insufficient for fragment recruitment. The slightly slower programs like BLAT (
<xref ref-type="bibr" rid="B4">Kent, 2002</xref>
), SSAHA2 (
<xref ref-type="bibr" rid="B9">Ning
<italic>et al.</italic>
, 2001</xref>
) and LAST (
<xref ref-type="bibr" rid="B5">Kielbasa
<italic>et al.</italic>
, 2011</xref>
) can recruit more reads than the mapping programs, but their fragment recruiting capacities are still limited. In this article, we present a new fragment recruitment method, FR-HIT. Given reference genomes, metagenomic reads and sequence identity and alignment length cutoffs, the goal of FR-HIT is to align the most reads to references with minimal computational time.</p>
</sec>
<sec sec-type="methods" id="SEC2">
<title>2 METHODS AND IMPLEMENTATION</title>
<p>FR-HIT first constructs a
<italic>k</italic>
-mer hash table for the reference genome sequences. Then for each query, it performs seeding, filtering and banded alignment to identify the alignments to reference sequences that meet user-defined cutoffs.</p>
<sec id="SEC2.1">
<title>2.1 Constructing
<italic>k</italic>
-mer hash table</title>
<p>The reference genome sequences are converted into a
<italic>k</italic>
-mer hash table. The default value of
<italic>k</italic>
is 11 and can be adjusted from 8 to 12. We include overlapping
<italic>k</italic>
-mers at an equidistant step from reference sequences. A reference sequence of length
<italic>m</italic>
contains (
<italic>m − k</italic>
)/(
<italic>k − p</italic>
) + 1
<italic>k</italic>
-mers with an overlap of
<italic>p</italic>
bases. Here,
<italic>p</italic>
is also a user-adjustable parameter. The hash table stores the indexes of reference sequences and the offset positions of
<italic>k</italic>
-mers on reference sequences.</p>
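<p>The index construction described above can be sketched as follows. This is a minimal illustration, not the FR-HIT implementation (which is written in C++): the function name build_kmer_index, the list-of-strings input and the dictionary layout are assumptions made for the example. It samples a k-mer every (k − p) bases and stores, for each k-mer, the reference index and offset.</p>

```python
from collections import defaultdict


def build_kmer_index(references, k, p):
    """Index k-mers sampled every (k - p) bases from each reference.

    references: list of DNA strings (hypothetical input format).
    k: k-mer length (the text gives a default of 11, adjustable from 8 to 12).
    p: overlap, in bases, between consecutive sampled k-mers.
    Returns a dict mapping each k-mer to a list of (reference index, offset).
    """
    if p not in range(k):
        raise ValueError("overlap p must satisfy 0 <= p and p + 1 <= k")
    step = k - p
    index = defaultdict(list)
    for ref_id, seq in enumerate(references):
        # Offsets 0, step, 2*step, ... while a full k-mer still fits.
        for offset in range(0, len(seq) - k + 1, step):
            index[seq[offset:offset + k]].append((ref_id, offset))
    return index


if __name__ == "__main__":
    idx = build_kmer_index(["ACGTACGTACGT"], k=4, p=2)
    for kmer, hits in sorted(idx.items()):
        print(kmer, hits)
```

<p>For a reference of length 12 with k = 4 and p = 2, the sketch stores (12 − 4)/(4 − 2) + 1 = 5 k-mers, in agreement with the count given in the text.</p>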
</sec>
<sec id="SEC2.2">
<title>2.2 Seeding</title>
<p>Seeding identifies candidate blocks, which are fragments of reference sequences that can be potentially aligned with the query. For each query, we count all its overlapping
<italic>k</italic>
-mers and scan the
<italic>k</italic>
-mer hash table to collect the
<italic>k</italic>
-mers shared by reference sequences.</p>
<p>We identify pieces of reference sequences that the query can be aligned to. These pieces are anchored by the shared
<italic>k</italic>
-mers. For a reference, any cluster of
≥2 pieces within
<italic>b</italic>
bases will derive a candidate block. This block covers all the pieces in that cluster and has extra
<italic>b</italic>
bases at each end. Here,
<italic>b</italic>
is the bandwidth to be introduced in
<xref ref-type="sec" rid="SEC2.4">Section 2.4.</xref>
If two candidate blocks overlap, they are joined together into one candidate block. We repeat this until no overlapping blocks are observed.</p>
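<p>A minimal sketch of the block construction on a single reference is given below. The function name candidate_blocks and its arguments are hypothetical, and two readings of the text are assumed: a piece joins a cluster when it starts within b bases of the previous piece, and a cluster needs at least two pieces to yield a block. Blocks are half-open intervals (start, end).</p>

```python
def candidate_blocks(piece_starts, k, b, ref_len):
    """Turn shared k-mer positions on one reference into candidate blocks.

    piece_starts: offsets on the reference where a query k-mer was found.
    k: k-mer length, b: bandwidth, ref_len: length of the reference.
    """
    starts = sorted(set(piece_starts))

    # Group pieces: a piece joins the current cluster when it starts
    # within b bases of the previous piece (assumed interpretation).
    clusters = []
    for pos in starts:
        if clusters and b >= pos - clusters[-1][-1]:
            clusters[-1].append(pos)
        else:
            clusters.append([pos])

    # Clusters with two or more pieces become blocks that cover all
    # pieces and are padded by b bases at each end (clipped to the reference).
    blocks = [
        (max(0, c[0] - b), min(ref_len, c[-1] + k + b))
        for c in clusters
        if len(c) >= 2
    ]

    # Join overlapping blocks. Because blocks are sorted by start,
    # one left-to-right pass leaves no overlapping pair behind.
    merged = []
    for start, end in blocks:
        if merged and merged[-1][1] > start:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


if __name__ == "__main__":
    print(candidate_blocks([100, 105, 300, 500, 504], k=11, b=10, ref_len=1000))
    print(candidate_blocks([100, 108, 125, 133], k=11, b=10, ref_len=1000))
```

<p>In the first call the isolated piece at offset 300 produces no block; in the second, the two padded blocks overlap and are joined into one.</p>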
</sec>
<sec id="SEC2.3">
<title>2.3 Filtering</title>
<p>Filtering removes the candidate blocks that do not enclose qualified alignments.
<italic>K</italic>
-mer filtering was originally used in QUASAR (
<xref ref-type="bibr" rid="B2">Burkhardt
<italic>et al.</italic>
, 1999</xref>
). Two sequences of length
<italic>n</italic>
with Hamming distance ϵ share at least
<italic>n</italic>
+ 1 − (
<italic>ϵ</italic>
+ 1)
<italic>k</italic>
common
<italic>k</italic>
-mers (
<xref ref-type="bibr" rid="B3">Jokinen and Ukkonen, 1991</xref>
;
<xref ref-type="bibr" rid="B10">Owolabi and McGregor, 1988</xref>
). Here, ϵ is the number of mismatches in an alignment. Based on user-defined length and sequence identity cutoffs, we calculate the number of mismatches and reject the candidate blocks that do not have enough common
<italic>k</italic>
-mers. In this step, the length of a
<italic>k</italic>
-mer is 4.</p>
</sec>
<sec id="SEC2.4">
<title>2.4 Banded alignment</title>
<p>FR-HIT performs banded alignments (
<xref ref-type="bibr" rid="B11">Pearson and Lipman, 1988</xref>
) between the query and the candidate blocks that passed the filter. The bandwidth is also a user-defined value. For each candidate block, the band that contains the most shared
<italic>k</italic>
-mers is used. If a reference sequence has multiple candidate blocks, these blocks are sorted by the number of shared
<italic>k</italic>
-mers in decreasing order. Banded alignments are performed in this order, and if
<italic>t</italic>
banded alignments do not recruit this query, no more banded alignment is tried for this reference. Here,
<italic>t</italic>
is a parameter with default value of 10.</p>
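<p>The idea of a banded alignment can be illustrated with the sketch below. It is a simplified stand-in, not FR-HIT's alignment routine: it minimizes edit distance instead of using a scoring scheme, it centres the band on the single diagonal with the most shared k-mers, and all function and parameter names are hypothetical. The query must be aligned in full, while the block may be entered and left anywhere inside the band.</p>

```python
from collections import Counter


def best_diagonal(hits):
    """hits: (query_offset, block_offset) pairs of shared k-mers.

    Returns the diagonal (block_offset - query_offset) with the most hits.
    """
    counts = Counter(b_off - q_off for q_off, b_off in hits)
    return counts.most_common(1)[0][0]


def banded_edit_distance(query, block, diagonal, bandwidth):
    """Edit distance of the whole query against part of the block.

    Cell (i, j) is filled only when abs((j - i) - diagonal) is at most
    bandwidth. Returns float('inf') when the band misses the block.
    """
    inf = float("inf")
    n, m = len(query), len(block)

    def band(i):
        lo = max(0, i + diagonal - bandwidth)
        hi = min(m, i + diagonal + bandwidth)
        return range(lo, hi + 1)

    # Row 0: the alignment may start at any block position in the band.
    prev = {j: 0 for j in band(0)}
    for i in range(1, n + 1):
        cur = {}
        for j in band(i):
            # Query base i aligned to a gap in the block.
            best = prev.get(j, inf) + 1
            if j >= 1:
                # Block base j aligned to a gap in the query.
                best = min(best, cur.get(j - 1, inf) + 1)
                # Match or mismatch of query base i with block base j.
                cost = 0 if query[i - 1] == block[j - 1] else 1
                best = min(best, prev.get(j - 1, inf) + cost)
            cur[j] = best
        prev = cur
    # The alignment may end at any block position in the band.
    return min(prev.values(), default=inf)


if __name__ == "__main__":
    d = best_diagonal([(0, 2), (1, 3), (5, 9)])
    print(d, banded_edit_distance("ACGT", "TTACGTTT", d, bandwidth=1))
```

<p>Restricting the computation to the band reduces the work per query from the product of the two lengths to roughly the query length times the band width, which is the reason for using a band at all.</p>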
</sec>
<sec id="SEC2.5">
<title>2.5 Implementation</title>
<p>FR-HIT is written in C++ and distributed at
<ext-link ext-link-type="uri" xlink:href="http://weizhongli-lab.org/frhit">http://weizhongli-lab.org/frhit</ext-link>
with documentation and testing data. FR-HIT takes reference sequences in FASTA format and queries in FASTA or FASTQ format and produces recruitment results. If a query hits multiple references or multiple locations of a reference, FR-HIT reports all these alignments. Currently, FR-HIT does not support reads in color space.</p>
</sec>
</sec>
<sec sec-type="results" id="SEC3">
<title>3 RESULTS</title>
<p>We applied FR-HIT to four metagenomic datasets and compared it with BLASTN, MegaBLAST, SOAP2, BWA, BWA-SW, SSAHA2, BLAT and LAST. The first dataset has 1 million 75 bp Illumina reads from MetaHIT sample MH0006 (
<xref ref-type="bibr" rid="B12">Qin
<italic>et al.</italic>
, 2010</xref>
). The other three datasets are from 454 GS20, GSFLX and Titanium platforms, with 688 590, 288 735 and 502 399 reads, respectively. Their average lengths are 99, 233 and 345 bp, respectively. The GS20 and GSFLX datasets were downloaded from CAMERA (
<xref ref-type="bibr" rid="B14">Sun
<italic>et al.</italic>
, 2011</xref>
) under IDs SCUMS_SMPL_Arctic and BATS_SMPL_174-2. The Titanium data were from NCBI under accession SRR029691. For the Illumina dataset, we used the 194 human gut genomes from the MetaHIT study as references. For the 454 datasets, we used the 1985 completed bacterial genome sequences downloaded from NCBI in April 2010 as references. The two reference databases are 1.139 and 3.823 GB in size.</p>
<p>A read is considered recruited if it is aligned to a reference with
≥30 bp and ≥80% identity. Such cutoffs represent a basic need for fragment recruitment: to recruit more reads while preventing obviously spurious hits. More discussion and examples of parameters are available in
<ext-link ext-link-type="uri" xlink:href="http://bioinformatics.oxfordjournals.org/cgi/content/full/btr252/DC1">Supplementary Material</ext-link>
. Parameters of all the programs are listed in
<ext-link ext-link-type="uri" xlink:href="http://bioinformatics.oxfordjournals.org/cgi/content/full/btr252/DC1">Supplementary Table S2</ext-link>
. The CPU time and the number of recruited reads are shown in
<xref ref-type="fig" rid="F1">Figure 1</xref>
and
<ext-link ext-link-type="uri" xlink:href="http://bioinformatics.oxfordjournals.org/cgi/content/full/btr252/DC1">Supplementary Table S3</ext-link>
. FR-HIT's results with different parameters are provided in
<ext-link ext-link-type="uri" xlink:href="http://bioinformatics.oxfordjournals.org/cgi/content/full/btr252/DC1">Supplementary Table S4</ext-link>
. On average, FR-HIT and BLASTN recruited significantly more reads than the other programs, while FR-HIT is
<italic>~</italic>
2 orders of magnitude faster than BLASTN. FR-HIT is slower than the fastest mapping programs SOAP2, BWA and BWA-SW, but it recruited 1–5 times more reads. In these tests, FR-HIT shows a better recruitment rate and speed than SSAHA2. FR-HIT is slightly slower than MegaBLAST, BLAT and LAST, but it recruited many more reads than they did. Using the Illumina data as an example, BLASTN recruited 475 584 reads in 7168 min. SOAP2 used 1.5 min, but recruited only 141 417 reads.
<fig id="F1" position="float">
<label>Fig. 1.</label>
<caption>
<p>Recruitment rate and speed of FR-HIT and other programs. The
<italic>x</italic>
-axis (logarithmic scale) is CPU minutes on AMD Opteron 8380 Shanghai 2.5 GHz processors;
<italic>y</italic>
-axis is the number of recruited reads. SOAP2 and BWA, short read mapping tools, were used only on the Illumina data.</p>
</caption>
<graphic xlink:href="btr252f1"></graphic>
</fig>
</p>
<p>FR-HIT recruited 523 868 reads in 45 min. Metagenomic data contain many novel species, so 49–64% of reads cannot be recruited by FR-HIT. Due to the use of overlapping
<italic>k</italic>
-mers, FR-HIT needs more memory than other programs. It used
<italic>~</italic>
4 and 8 GB for the two reference databases in these tests.</p>
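<p>The quoted figures can be checked against the "two orders of magnitude" and "1–5 times more reads" statements with a few lines of arithmetic. The sketch assumes that the FR-HIT figure (523 868 reads in 45 min) refers to the same Illumina dataset as the BLASTN and SOAP2 figures; the variable names are chosen for the example.</p>

```python
import math

# Figures quoted in the text for the Illumina example (reads, CPU minutes).
blastn_reads, blastn_minutes = 475_584, 7168
soap2_reads, soap2_minutes = 141_417, 1.5
frhit_reads, frhit_minutes = 523_868, 45

# Speed of FR-HIT relative to BLASTN and to SOAP2.
speedup_vs_blastn = blastn_minutes / frhit_minutes
orders_of_magnitude = math.log10(speedup_vs_blastn)
slowdown_vs_soap2 = frhit_minutes / soap2_minutes

# Recruited reads of FR-HIT relative to SOAP2 and to BLASTN.
read_ratio_vs_soap2 = frhit_reads / soap2_reads
read_ratio_vs_blastn = frhit_reads / blastn_reads

print(f"Speedup over BLASTN: {speedup_vs_blastn:.0f}x "
      f"({orders_of_magnitude:.1f} orders of magnitude)")
print(f"Slowdown relative to SOAP2: {slowdown_vs_soap2:.0f}x")
print(f"Reads relative to SOAP2: {read_ratio_vs_soap2:.1f}x")
print(f"Reads relative to BLASTN: {read_ratio_vs_blastn:.2f}x")
```

<p>Under that assumption, FR-HIT is roughly 159 times faster than BLASTN on this dataset, 30 times slower than SOAP2, and recruits about 3.7 times as many reads as SOAP2.</p>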
<p>
<italic>Funding</italic>
: Supported by Awards (
<award-id>R01RR025030</award-id>
and
<award-id>R01HG005978</award-id>
) from
<funding-source>National Center for Research Resources and National Human Genome Research Institute</funding-source>
.</p>
<p>
<italic>Conflict of Interest</italic>
: none declared.</p>
</sec>
<sec sec-type="supplementary-material">
<title>Supplementary Material</title>
<supplementary-material id="PMC_1" content-type="local-data">
<caption>
<title>Supplementary Data</title>
</caption>
<media mimetype="text" mime-subtype="html" xlink:href="supp_27_12_1704__index.html"></media>
<media xlink:role="associated-file" mimetype="application" mime-subtype="msword" xlink:href="supp_btr252_fr-hit-v2-supp.doc"></media>
</supplementary-material>
</sec>
</body>
<back>
<ref-list>
<title>REFERENCES</title>
<ref id="B1">
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Altschul</surname>
<given-names>SF</given-names>
</name>
<etal></etal>
</person-group>
<article-title>Gapped BLAST and PSI-BLAST: a new generation of protein database search programs</article-title>
<source>Nucleic Acids Res.</source>
<year>1997</year>
<volume>25</volume>
<fpage>3389</fpage>
<lpage>3402</lpage>
<pub-id pub-id-type="pmid">9254694</pub-id>
</element-citation>
</ref>
<ref id="B2">
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Burkhardt</surname>
<given-names>S</given-names>
</name>
<etal></etal>
</person-group>
<article-title>q-gram based database searching using a suffix array (QUASAR)</article-title>
<source>RECOMB</source>
<year>1999</year>
<volume>99</volume>
<fpage>77</fpage>
<lpage>83</lpage>
</element-citation>
</ref>
<ref id="B3">
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Jokinen</surname>
<given-names>P</given-names>
</name>
<name>
<surname>Ukkonen</surname>
<given-names>E</given-names>
</name>
</person-group>
<article-title>Two algorithms for approximate string matching in static texts</article-title>
<source>Lect. Notes Computer Sci.</source>
<year>1991</year>
<volume>520</volume>
<fpage>240</fpage>
<lpage>248</lpage>
</element-citation>
</ref>
<ref id="B4">
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Kent</surname>
<given-names>WJ</given-names>
</name>
</person-group>
<article-title>BLAT–the BLAST-like alignment tool</article-title>
<source>Genome Res.</source>
<year>2002</year>
<volume>12</volume>
<fpage>656</fpage>
<lpage>664</lpage>
<pub-id pub-id-type="pmid">11932250</pub-id>
</element-citation>
</ref>
<ref id="B5">
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Kielbasa</surname>
<given-names>SM</given-names>
</name>
<etal></etal>
</person-group>
<article-title>Adaptive seeds tame genomic sequence comparison</article-title>
<source>Genome Res.</source>
<year>2011</year>
<volume>21</volume>
<fpage>487</fpage>
<lpage>493</lpage>
<pub-id pub-id-type="pmid">21209072</pub-id>
</element-citation>
</ref>
<ref id="B6">
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Langmead</surname>
<given-names>B</given-names>
</name>
<etal></etal>
</person-group>
<article-title>Ultrafast and memory-efficient alignment of short DNA sequences to the human genome</article-title>
<source>Genome Biol.</source>
<year>2009</year>
<volume>10</volume>
<fpage>R25</fpage>
<pub-id pub-id-type="pmid">19261174</pub-id>
</element-citation>
</ref>
<ref id="B7">
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Li</surname>
<given-names>H</given-names>
</name>
<name>
<surname>Durbin</surname>
<given-names>R</given-names>
</name>
</person-group>
<article-title>Fast and accurate short read alignment with Burrows-Wheeler transform</article-title>
<source>Bioinformatics</source>
<year>2009</year>
<volume>25</volume>
<fpage>1754</fpage>
<lpage>1760</lpage>
<pub-id pub-id-type="pmid">19451168</pub-id>
</element-citation>
</ref>
<ref id="B8">
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Li</surname>
<given-names>R</given-names>
</name>
<etal></etal>
</person-group>
<article-title>SOAP: short oligonucleotide alignment program</article-title>
<source>Bioinformatics</source>
<year>2008</year>
<volume>24</volume>
<fpage>713</fpage>
<lpage>714</lpage>
<pub-id pub-id-type="pmid">18227114</pub-id>
</element-citation>
</ref>
<ref id="B9">
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Ning</surname>
<given-names>Z</given-names>
</name>
<etal></etal>
</person-group>
<article-title>SSAHA: a fast search method for large DNA databases</article-title>
<source>Genome Res.</source>
<year>2001</year>
<volume>11</volume>
<fpage>1725</fpage>
<lpage>1729</lpage>
<pub-id pub-id-type="pmid">11591649</pub-id>
</element-citation>
</ref>
<ref id="B10">
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Owolabi</surname>
<given-names>O</given-names>
</name>
<name>
<surname>McGregor</surname>
<given-names>DR</given-names>
</name>
</person-group>
<article-title>Fast approximate string matching</article-title>
<source>Software Pract. Exper.</source>
<year>1988</year>
<volume>18</volume>
<fpage>387</fpage>
<lpage>393</lpage>
</element-citation>
</ref>
<ref id="B11">
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Pearson</surname>
<given-names>WR</given-names>
</name>
<name>
<surname>Lipman</surname>
<given-names>DJ</given-names>
</name>
</person-group>
<article-title>Improved tools for biological sequence comparison</article-title>
<source>Proc. Natl Acad. Sci. USA</source>
<year>1988</year>
<volume>85</volume>
<fpage>2444</fpage>
<lpage>2448</lpage>
<pub-id pub-id-type="pmid">3162770</pub-id>
</element-citation>
</ref>
<ref id="B12">
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Qin</surname>
<given-names>J</given-names>
</name>
<etal></etal>
</person-group>
<article-title>A human gut microbial gene catalogue established by metagenomic sequencing</article-title>
<source>Nature</source>
<year>2010</year>
<volume>464</volume>
<fpage>59</fpage>
<lpage>65</lpage>
<pub-id pub-id-type="pmid">20203603</pub-id>
</element-citation>
</ref>
<ref id="B13">
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Rusch</surname>
<given-names>DB</given-names>
</name>
<etal></etal>
</person-group>
<article-title>The Sorcerer II Global Ocean sampling expedition: northwest Atlantic through eastern tropical Pacific</article-title>
<source>PLoS Biol.</source>
<year>2007</year>
<volume>5</volume>
<fpage>e77</fpage>
<pub-id pub-id-type="pmid">17355176</pub-id>
</element-citation>
</ref>
<ref id="B14">
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Sun</surname>
<given-names>S</given-names>
</name>
<etal></etal>
</person-group>
<article-title>Community cyberinfrastructure for advanced microbial ecology research and analysis: the CAMERA resource</article-title>
<source>Nucleic Acids Res.</source>
<year>2011</year>
<volume>39</volume>
<fpage>D546</fpage>
<lpage>D551</lpage>
<pub-id pub-id-type="pmid">21045053</pub-id>
</element-citation>
</ref>
</ref-list>
</back>
</pmc>
</record>

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Ticri/CIDE/explor/CyberinfraV1/Data/Pmc/Corpus
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000502  | SxmlIndent | more

Or

HfdSelect -h $EXPLOR_AREA/Data/Pmc/Corpus/biblio.hfd -nk 000502  | SxmlIndent | more

To link to this page within the Wicri network

{{Explor lien
   |wiki=    Ticri/CIDE
   |area=    CyberinfraV1
   |flux=    Pmc
   |étape=   Corpus
   |type=    RBID
   |clé=     
   |texte=   
}}

Wicri

This area was generated with Dilib version V0.6.25.
Data generation: Thu Oct 27 09:30:58 2016. Site generation: Sun Mar 10 23:08:40 2024