MERS Exploration Server


Internal identifier: 0005470 (Pmc/Corpus); previous: 0005469; next: 0005471



The document in XML format

<record>
<TEI>
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en">Functional characterization of endogenous siRNA target genes in
<italic>Caenorhabditis elegans</italic>
</title>
<author>
<name sortKey="Asikainen, Suvi" sort="Asikainen, Suvi" uniqKey="Asikainen S" first="Suvi" last="Asikainen">Suvi Asikainen</name>
<affiliation>
<nlm:aff id="I1">Department of Biosciences, University of Kuopio, P.O. Box 1627, Kuopio 70211, Finland</nlm:aff>
</affiliation>
<affiliation>
<nlm:aff id="I2">Department of Neurobiology, A. I. Virtanen Institute, University of Kuopio, P.O. Box 1627, Kuopio 70211, Finland</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Heikkinen, Liisa" sort="Heikkinen, Liisa" uniqKey="Heikkinen L" first="Liisa" last="Heikkinen">Liisa Heikkinen</name>
<affiliation>
<nlm:aff id="I1">Department of Biosciences, University of Kuopio, P.O. Box 1627, Kuopio 70211, Finland</nlm:aff>
</affiliation>
<affiliation>
<nlm:aff id="I2">Department of Neurobiology, A. I. Virtanen Institute, University of Kuopio, P.O. Box 1627, Kuopio 70211, Finland</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Wong, Garry" sort="Wong, Garry" uniqKey="Wong G" first="Garry" last="Wong">Garry Wong</name>
<affiliation>
<nlm:aff id="I1">Department of Biosciences, University of Kuopio, P.O. Box 1627, Kuopio 70211, Finland</nlm:aff>
</affiliation>
<affiliation>
<nlm:aff id="I2">Department of Neurobiology, A. I. Virtanen Institute, University of Kuopio, P.O. Box 1627, Kuopio 70211, Finland</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Storvik, Markus" sort="Storvik, Markus" uniqKey="Storvik M" first="Markus" last="Storvik">Markus Storvik</name>
<affiliation>
<nlm:aff id="I1">Department of Biosciences, University of Kuopio, P.O. Box 1627, Kuopio 70211, Finland</nlm:aff>
</affiliation>
<affiliation>
<nlm:aff id="I3">Department of Pharmacology and Toxicology, University of Kuopio, P.O. Box 1627, Kuopio 70211, Finland</nlm:aff>
</affiliation>
</author>
</titleStmt>
<publicationStmt>
<idno type="wicri:source">PMC</idno>
<idno type="pmid">18522735</idno>
<idno type="pmc">2440555</idno>
<idno type="url">http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2440555</idno>
<idno type="RBID">PMC:2440555</idno>
<idno type="doi">10.1186/1471-2164-9-270</idno>
<date when="2008">2008</date>
<idno type="wicri:Area/Pmc/Corpus">000547</idno>
<idno type="wicri:explorRef" wicri:stream="Pmc" wicri:step="Corpus" wicri:corpus="PMC">000547</idno>
</publicationStmt>
<sourceDesc>
<biblStruct>
<analytic>
<title xml:lang="en" level="a" type="main">Functional characterization of endogenous siRNA target genes in
<italic>Caenorhabditis elegans</italic>
</title>
<author>
<name sortKey="Asikainen, Suvi" sort="Asikainen, Suvi" uniqKey="Asikainen S" first="Suvi" last="Asikainen">Suvi Asikainen</name>
<affiliation>
<nlm:aff id="I1">Department of Biosciences, University of Kuopio, P.O. Box 1627, Kuopio 70211, Finland</nlm:aff>
</affiliation>
<affiliation>
<nlm:aff id="I2">Department of Neurobiology, A. I. Virtanen Institute, University of Kuopio, P.O. Box 1627, Kuopio 70211, Finland</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Heikkinen, Liisa" sort="Heikkinen, Liisa" uniqKey="Heikkinen L" first="Liisa" last="Heikkinen">Liisa Heikkinen</name>
<affiliation>
<nlm:aff id="I1">Department of Biosciences, University of Kuopio, P.O. Box 1627, Kuopio 70211, Finland</nlm:aff>
</affiliation>
<affiliation>
<nlm:aff id="I2">Department of Neurobiology, A. I. Virtanen Institute, University of Kuopio, P.O. Box 1627, Kuopio 70211, Finland</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Wong, Garry" sort="Wong, Garry" uniqKey="Wong G" first="Garry" last="Wong">Garry Wong</name>
<affiliation>
<nlm:aff id="I1">Department of Biosciences, University of Kuopio, P.O. Box 1627, Kuopio 70211, Finland</nlm:aff>
</affiliation>
<affiliation>
<nlm:aff id="I2">Department of Neurobiology, A. I. Virtanen Institute, University of Kuopio, P.O. Box 1627, Kuopio 70211, Finland</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Storvik, Markus" sort="Storvik, Markus" uniqKey="Storvik M" first="Markus" last="Storvik">Markus Storvik</name>
<affiliation>
<nlm:aff id="I1">Department of Biosciences, University of Kuopio, P.O. Box 1627, Kuopio 70211, Finland</nlm:aff>
</affiliation>
<affiliation>
<nlm:aff id="I3">Department of Pharmacology and Toxicology, University of Kuopio, P.O. Box 1627, Kuopio 70211, Finland</nlm:aff>
</affiliation>
</author>
</analytic>
<series>
<title level="j">BMC Genomics</title>
<idno type="eISSN">1471-2164</idno>
<imprint>
<date when="2008">2008</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
</fileDesc>
<profileDesc>
<textClass></textClass>
</profileDesc>
</teiHeader>
<front>
<div type="abstract" xml:lang="en">
<sec>
<title>Background</title>
<p>Small interfering RNA (siRNA) molecules mediate sequence-specific silencing in RNA interference (RNAi), a gene regulatory phenomenon observed in almost all organisms. Large-scale sequencing of small RNA libraries obtained from
<italic>C. elegans </italic>
has revealed that a broad spectrum of siRNAs is endogenously transcribed from genomic sequences. The biological role and molecular diversity of
<italic>C. elegans </italic>
endogenous siRNA (endo-siRNA) molecules, nonetheless, remain poorly understood. In order to gain insight into their biological function, we annotated two large libraries of endo-siRNA sequences, identified their cognate targets, and performed gene ontology analysis to identify enriched functional categories.</p>
</sec>
<sec>
<title>Results</title>
<p>We observed systematic trends in the categorization of target genes according to siRNA length: 18- to 22-mer siRNAs were associated with genes required for embryonic development; 23-mers were associated uniquely with post-embryonic development; 24- to 26-mers were associated with phosphorus metabolism or protein modification. Moreover, some argonaute-related genes were associated with siRNAs with multiple reads. Sequence frequency graphs suggest that siRNAs of different lengths share similarities in overall sequence structure: the 5' end begins with G, while the body is predominantly U and C.</p>
</sec>
<sec>
<title>Conclusion</title>
<p>These results suggest that the lengths of endogenous siRNA molecules are consequential to their biological functions since the gene ontology categories for their cognate mRNA targets vary depending upon their lengths.</p>
</sec>
</div>
</front>
<back>
<div1 type="bibliography">
<listBibl>
<biblStruct></biblStruct>
<biblStruct></biblStruct>
<biblStruct></biblStruct>
<biblStruct></biblStruct>
<biblStruct></biblStruct>
<biblStruct></biblStruct>
<biblStruct></biblStruct>
<biblStruct></biblStruct>
<biblStruct></biblStruct>
<biblStruct></biblStruct>
<biblStruct></biblStruct>
<biblStruct></biblStruct>
<biblStruct></biblStruct>
<biblStruct></biblStruct>
<biblStruct></biblStruct>
<biblStruct></biblStruct>
<biblStruct></biblStruct>
<biblStruct></biblStruct>
<biblStruct></biblStruct>
</listBibl>
</div1>
</back>
</TEI>
<pmc article-type="research-article">
<pmc-dir>properties open_access</pmc-dir>
<front>
<journal-meta>
<journal-id journal-id-type="nlm-ta">BMC Genomics</journal-id>
<journal-title>BMC Genomics</journal-title>
<issn pub-type="epub">1471-2164</issn>
<publisher>
<publisher-name>BioMed Central</publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="pmid">18522735</article-id>
<article-id pub-id-type="pmc">2440555</article-id>
<article-id pub-id-type="publisher-id">1471-2164-9-270</article-id>
<article-id pub-id-type="doi">10.1186/1471-2164-9-270</article-id>
<article-categories>
<subj-group subj-group-type="heading">
<subject>Research Article</subject>
</subj-group>
</article-categories>
<title-group>
<article-title>Functional characterization of endogenous siRNA target genes in
<italic>Caenorhabditis elegans</italic>
</article-title>
</title-group>
<contrib-group>
<contrib id="A1" contrib-type="author">
<name>
<surname>Asikainen</surname>
<given-names>Suvi</given-names>
</name>
<xref ref-type="aff" rid="I1">1</xref>
<xref ref-type="aff" rid="I2">2</xref>
<email>suvi.asikainen@uku.fi</email>
</contrib>
<contrib id="A2" contrib-type="author">
<name>
<surname>Heikkinen</surname>
<given-names>Liisa</given-names>
</name>
<xref ref-type="aff" rid="I1">1</xref>
<xref ref-type="aff" rid="I2">2</xref>
<email>liisa.heikkinen@uku.fi</email>
</contrib>
<contrib id="A3" contrib-type="author">
<name>
<surname>Wong</surname>
<given-names>Garry</given-names>
</name>
<xref ref-type="aff" rid="I1">1</xref>
<xref ref-type="aff" rid="I2">2</xref>
<email>garry.wong@uku.fi</email>
</contrib>
<contrib id="A4" corresp="yes" contrib-type="author">
<name>
<surname>Storvik</surname>
<given-names>Markus</given-names>
</name>
<xref ref-type="aff" rid="I1">1</xref>
<xref ref-type="aff" rid="I3">3</xref>
<email>markus.storvik@uku.fi</email>
</contrib>
</contrib-group>
<aff id="I1">
<label>1</label>
Department of Biosciences, University of Kuopio, P.O. Box 1627, Kuopio 70211, Finland</aff>
<aff id="I2">
<label>2</label>
Department of Neurobiology, A. I. Virtanen Institute, University of Kuopio, P.O. Box 1627, Kuopio 70211, Finland</aff>
<aff id="I3">
<label>3</label>
Department of Pharmacology and Toxicology, University of Kuopio, P.O. Box 1627, Kuopio 70211, Finland</aff>
<pub-date pub-type="collection">
<year>2008</year>
</pub-date>
<pub-date pub-type="epub">
<day>3</day>
<month>6</month>
<year>2008</year>
</pub-date>
<volume>9</volume>
<fpage>270</fpage>
<lpage>270</lpage>
<ext-link ext-link-type="uri" xlink:href="http://www.biomedcentral.com/1471-2164/9/270"></ext-link>
<history>
<date date-type="received">
<day>12</day>
<month>2</month>
<year>2008</year>
</date>
<date date-type="accepted">
<day>3</day>
<month>6</month>
<year>2008</year>
</date>
</history>
<permissions>
<copyright-statement>Copyright © 2008 Asikainen et al; licensee BioMed Central Ltd.</copyright-statement>
<copyright-year>2008</copyright-year>
<copyright-holder>Asikainen et al; licensee BioMed Central Ltd.</copyright-holder>
<license license-type="open-access" xlink:href="http://creativecommons.org/licenses/by/2.0">
<p>This is an Open Access article distributed under the terms of the Creative Commons Attribution License (
<ext-link ext-link-type="uri" xlink:href="http://creativecommons.org/licenses/by/2.0"></ext-link>
), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</p>
<pmc-comment> Asikainen Suvi suvi.asikainen@uku.fi Functional characterization of endogenous siRNA target genes in Caenorhabditis elegans 2008BMC Genomics 9(1): 270-. (2008)1471-2164(2008)9:1&lt;270&gt;urn:ISSN:1471-2164</pmc-comment>
</license>
</permissions>
<abstract>
<sec>
<title>Background</title>
<p>Small interfering RNA (siRNA) molecules mediate sequence-specific silencing in RNA interference (RNAi), a gene regulatory phenomenon observed in almost all organisms. Large-scale sequencing of small RNA libraries obtained from
<italic>C. elegans </italic>
has revealed that a broad spectrum of siRNAs is endogenously transcribed from genomic sequences. The biological role and molecular diversity of
<italic>C. elegans </italic>
endogenous siRNA (endo-siRNA) molecules, nonetheless, remain poorly understood. In order to gain insight into their biological function, we annotated two large libraries of endo-siRNA sequences, identified their cognate targets, and performed gene ontology analysis to identify enriched functional categories.</p>
</sec>
<sec>
<title>Results</title>
<p>We observed systematic trends in the categorization of target genes according to siRNA length: 18- to 22-mer siRNAs were associated with genes required for embryonic development; 23-mers were associated uniquely with post-embryonic development; 24- to 26-mers were associated with phosphorus metabolism or protein modification. Moreover, some argonaute-related genes were associated with siRNAs with multiple reads. Sequence frequency graphs suggest that siRNAs of different lengths share similarities in overall sequence structure: the 5' end begins with G, while the body is predominantly U and C.</p>
</sec>
<sec>
<title>Conclusion</title>
<p>These results suggest that the lengths of endogenous siRNA molecules are consequential to their biological functions since the gene ontology categories for their cognate mRNA targets vary depending upon their lengths.</p>
</sec>
</abstract>
</article-meta>
</front>
<body>
<sec>
<title>Background</title>
<p>The genome of
<italic>C. elegans </italic>
contains two principal groups of small RNA species capable of interfering with gene expression. The first, microRNAs (miRNAs), are a relatively well-characterized class of small RNAs, ~22 nucleotides (nt) in length and derived from a hairpin precursor of ~65–70 nt, that regulate gene expression patterns during organism development and are found in almost all eukaryotes [
<xref ref-type="bibr" rid="B1">1</xref>
,
<xref ref-type="bibr" rid="B2">2</xref>
]. miRNAs are often derived from their own transcript or from intron sequences of protein coding genes [
<xref ref-type="bibr" rid="B2">2</xref>
,
<xref ref-type="bibr" rid="B3">3</xref>
]. Endogenous small interfering RNAs (endo-siRNAs) are a second class of endogenous regulators of gene expression. They are often derived from exons and perfectly match the mRNA sequence of a target gene [
<xref ref-type="bibr" rid="B3">3</xref>
-
<xref ref-type="bibr" rid="B5">5</xref>
]. In contrast, exogenous small interfering RNAs (exo-siRNAs) can be processed from a double-stranded RNA (dsRNA) precursor derived from cellular transfections, microinjections, feeding, or the genetic material of invading viruses [
<xref ref-type="bibr" rid="B7">7</xref>
-
<xref ref-type="bibr" rid="B10">10</xref>
]. The biogenesis of exo-siRNAs appears to differ from that of endo-siRNAs, and thus separate pathways for processing and mediating their silencing actions have been proposed [
<xref ref-type="bibr" rid="B4">4</xref>
,
<xref ref-type="bibr" rid="B5">5</xref>
].</p>
<p>According to the current model of siRNA biogenesis, both exogenous and endogenous siRNAs are processed from double-stranded RNA (dsRNA) precursors [
<xref ref-type="bibr" rid="B7">7</xref>
] by Dicer, a ribonuclease III enzyme (RNase III) that cleaves the long double-stranded RNA molecule to yield 21–25 nt siRNAs [
<xref ref-type="bibr" rid="B2">2</xref>
,
<xref ref-type="bibr" rid="B4">4</xref>
]. Subsequent loading of the siRNA into the RNAi silencing complex, followed by action of RNA-dependent RNA polymerase (RdRP) on target mRNA template, yields a population of secondary siRNAs that are able to interfere with gene expression through transcriptional repression, translational block, or mRNA cleavage [
<xref ref-type="bibr" rid="B2">2</xref>
,
<xref ref-type="bibr" rid="B4">4</xref>
,
<xref ref-type="bibr" rid="B11">11</xref>
-
<xref ref-type="bibr" rid="B13">13</xref>
]. RRF-3 is the first identified
<italic>C. elegans </italic>
RdRP homolog required for accumulation of at least a portion of endo-siRNAs [
<xref ref-type="bibr" rid="B4">4</xref>
,
<xref ref-type="bibr" rid="B5">5</xref>
].
<italic>rrf-3 </italic>
mutants lack many endo-siRNAs and have an enhanced RNAi phenotype, presumably due to release of its associated pathway proteins from the endogenous RNAi (endo-RNAi) pathway to the exogenous RNAi (exo-RNAi) pathway [
<xref ref-type="bibr" rid="B4">4</xref>
,
<xref ref-type="bibr" rid="B5">5</xref>
,
<xref ref-type="bibr" rid="B14">14</xref>
]. Microarray expression analysis of
<italic>rrf-3 </italic>
mutants has suggested that endo-siRNAs produced from RRF-3 dependent synthesis regulate a large number of protein coding genes, especially those involved in spermatogenesis and protein phosphorylation [
<xref ref-type="bibr" rid="B5">5</xref>
,
<xref ref-type="bibr" rid="B15">15</xref>
].</p>
<p>Two studies that performed large-scale sequencing of small RNA libraries cloned from mixed stage populations of
<italic>C. elegans </italic>
have provided initial material for grouping and classification of candidate endo-siRNAs [
<xref ref-type="bibr" rid="B5">5</xref>
,
<xref ref-type="bibr" rid="B6">6</xref>
]. A preliminary analysis revealed a sub-class of endo-siRNAs, referred to as 21U-RNAs [
<xref ref-type="bibr" rid="B6">6</xref>
]. 21U-RNAs are precisely 21 nt long, begin with uridine 5'-monophosphate and originate from more than 5700, primarily non-coding, genomic loci which are dispersed in two broad regions of
<italic>C. elegans </italic>
chromosome IV. The absence of complementary mRNA matches suggests that this siRNA type does not operate through the "classical" mRNA degradation mechanism of RNAi but may instead direct alternative actions such as modifications of chromatin structure [
<xref ref-type="bibr" rid="B6">6</xref>
]. Another less abundant class of endo-siRNA that arises from non-coding genomic regions has also been described. These candidate siRNA sequences are dispersed all along the genome and are referred to as tiny non-coding RNAs (tncRNAs) [
<xref ref-type="bibr" rid="B5">5</xref>
]. The absence of corresponding mRNA sequence suggests that tncRNAs may function in a manner similar to 21U-RNAs. Both studies [
<xref ref-type="bibr" rid="B5">5</xref>
,
<xref ref-type="bibr" rid="B6">6</xref>
] observed a number of candidate siRNA sequences corresponding perfectly to one or several mRNA sequences. The mRNA match provides strong evidence that this particular class of endo-siRNA is capable of hybridizing with the gene product and interfering with gene expression through the "classical" mRNA degradation mechanism.</p>
<p>In order to understand the cellular functions regulated by endogenous siRNAs collected from
<italic>C. elegans</italic>
, we merged the libraries from two sequencing projects [
<xref ref-type="bibr" rid="B5">5</xref>
,
<xref ref-type="bibr" rid="B6">6</xref>
] containing all siRNA sequences currently available, resulting in a collection of 7136 candidate endo-siRNA sequences. We characterized their length distributions and the relationship between siRNA length and the function of the targeted genes. We observed that endo-siRNA molecules of different lengths are associated with functionally different target genes. Moreover, we observed that some argonaute-related genes are associated with siRNAs with multiple reads.</p>
</sec>
<sec>
<title>Results</title>
<sec>
<title>Length distribution of candidate siRNA sequences</title>
<p>The large libraries of candidate endo-siRNA sequences obtained by sequencing efforts of Lee and co-workers from the Ambros laboratory, and Ruby and co-workers from the Bartel laboratory [
<xref ref-type="bibr" rid="B5">5</xref>
,
<xref ref-type="bibr" rid="B6">6</xref>
] prompted us to merge the libraries and analyze them by length distribution. Of the 7136 short RNA sequences in total, 4024 exhibited antisense complementarity to 2344 known mRNA sequences (Additional files
<xref ref-type="supplementary-material" rid="S1">1</xref>
and
<xref ref-type="supplementary-material" rid="S2">2</xref>
). These putative siRNAs were derived from exonic coding regions, and the corresponding mRNAs were defined as their cognate targets.</p>
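<p>The matching step described above can be illustrated with a minimal sketch. It assumes that "antisense complementarity" means the reverse complement of an siRNA occurs as an exact substring of an mRNA; the function names and the toy sequences are hypothetical and are not taken from the study, whose actual alignment procedure is described in its methods section.</p>
<preformat>
```python
# Hypothetical helper names and toy data; not the pipeline used in the study.
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(rna):
    """Return the reverse complement of an RNA string (A, C, G, U only)."""
    return rna.translate(COMPLEMENT)[::-1]


def find_cognate_targets(sirnas, mrnas):
    """Map each siRNA id to the ids of mRNAs it is perfectly antisense to.

    An siRNA counts as antisense to an mRNA when the reverse complement
    of the siRNA occurs as an exact substring of the mRNA sequence.
    siRNAs without any match are omitted from the result.
    """
    targets = {}
    for sirna_id, sirna_seq in sirnas.items():
        site = reverse_complement(sirna_seq)
        hits = [mrna_id for mrna_id, mrna_seq in mrnas.items() if site in mrna_seq]
        if hits:
            targets[sirna_id] = hits
    return targets


if __name__ == "__main__":
    toy_mrnas = {
        "mRNA_1": "AUGGCUAGCUAGGAUCCGAUCGUAA",
        "mRNA_2": "AUGCCCGGGAAAUUUCCCUAA",
    }
    toy_sirnas = {"si_1": "GGAUCCUAGCUAGC", "si_2": "ACGUACGUACGU"}
    print(find_cognate_targets(toy_sirnas, toy_mrnas))
```
</preformat>
<p>A plain substring scan is quadratic in the worst case; for libraries of several thousand sequences an indexed aligner would normally be preferred, but the scan keeps the definition of a perfect antisense match explicit.</p>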
<p>Length distributions encompassing all siRNA sequences from the two source libraries are shown in Figure
<xref ref-type="fig" rid="F1">1a</xref>
. siRNAs from Lee and co-workers are shown in Figure
<xref ref-type="fig" rid="F1">1b</xref>
and siRNAs from Ruby and co-workers are shown in Figure
<xref ref-type="fig" rid="F1">1c</xref>
. siRNAs with an mRNA match are shaded, and siRNAs without a match are shown in a lighter color. In both libraries, the short RNAs that do not match an mRNA are widely distributed in length, while siRNAs with a match exhibit a more centered distribution. Strikingly, siRNAs with no mRNA match are heavily overrepresented in the library of Lee and co-workers, whereas in the library of Ruby and co-workers the majority of siRNAs match an mRNA. In both libraries, the lengths of siRNAs matching an mRNA are approximately normally distributed over 18- to 24-mers, with the highest peak at 22-mers (Figure
<xref ref-type="fig" rid="F1">1b</xref>
and
<xref ref-type="fig" rid="F1">1c</xref>
). The libraries differ in that additional 24- to 26-mers are present in the library of Ruby and co-workers, indicating selectivity in the cloning method for small RNA molecules of these lengths. In addition, 21-mers are underrepresented in the library of Lee and co-workers compared with the number of 21-mers in the library of Ruby and co-workers.</p>
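<p>A length distribution of this kind, split by whether a sequence has an mRNA match, can be tabulated as in the following sketch. The function name and input layout (a mapping from sequence identifier to sequence, plus a set of matched identifiers) are assumptions made for illustration only.</p>
<preformat>
```python
from collections import Counter


def length_histogram(sequences, matched_ids):
    """Count sequences per length, split by mRNA-match status.

    sequences: dict of id -> RNA string
    matched_ids: set of ids that have an mRNA match
    Returns a dict of length -> (n_matched, n_unmatched), ordered by length.
    """
    matched = Counter()
    unmatched = Counter()
    for seq_id, seq in sequences.items():
        if seq_id in matched_ids:
            matched[len(seq)] += 1
        else:
            unmatched[len(seq)] += 1
    lengths = sorted(set(matched) | set(unmatched))
    # Counter returns 0 for lengths that were never seen in one of the groups.
    return {n: (matched[n], unmatched[n]) for n in lengths}
```
</preformat>
<p>Running the same function once per source library and once on the merged set would give the three panels' worth of counts; the shaded portion of each bar corresponds to the first element of each tuple.</p>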
<fig position="float" id="F1">
<label>Figure 1</label>
<caption>
<p>
<bold>Length distribution of siRNAs</bold>
. The siRNAs were obtained by high-throughput sequencing of mixed stage
<italic>C. elegans </italic>
populations from two library sources (Lee et al., 2006; Ruby et al., 2006). The height of the bars indicates the number of siRNAs for the respective length. The shaded portion of the histograms indicates the number of siRNAs with an mRNA sequence match. A) Length distribution of all 7136 short RNA sequences from
<italic>C. elegans </italic>
available from the two library sources. Sequence lengths ranged from 12 to 37 nucleotides. A total of 4024 siRNA sequences from the two library sources matched an mRNA sequence. B) Length distribution of siRNAs from the library reported in Lee et al., 2006. C) Length distribution of siRNAs from the library reported in Ruby et al., 2006.</p>
</caption>
<graphic xlink:href="1471-2164-9-270-1"></graphic>
</fig>
</sec>
<sec>
<title>siRNA sequences with multiple reads</title>
<p>The abundance (or counts) of individual reads of siRNA sequences was available only for the data from Ruby and co-workers [
<xref ref-type="bibr" rid="B6">6</xref>
]. A total of 125 siRNA sequences were read three or more times, 102 of which matched mRNA sequences. A list of candidate target genes with three or more identical siRNA reads is shown in Additional file
<xref ref-type="supplementary-material" rid="S3">3</xref>
. Groups of multi-read siRNAs were further observed to exhibit a trend towards overrepresentation of 26-mers (Figure
<xref ref-type="fig" rid="F2">2</xref>
). Upon closer inspection, many of the siRNAs with several reads targeted the same mRNAs. When the functions of these putative targets were identified, the "Argonaute and Dicer protein, PAZ" was observed to be significantly enriched as a nomenclature category. The argonaute-related genes targeted by multi-read siRNAs included C16C10.3 (contains PAZ and PIWI RNA-binding domains), F55C9.3 (PAZ) and F58G1.1 (PAZ/PIWI). F55C9.3 was targeted by 16 siRNAs which fell into two length categories: 19- to 20-mers were represented as a single read, while 25- to 26-mers were represented as one to ten reads. All siRNA sites along the gene F55C9.3 (transcript NM_075351) are presented as an example of the distribution of siRNAs along the sequence of the candidate target mRNA (Figure
<xref ref-type="fig" rid="F3">3</xref>
).</p>
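<p>The selection of multi-read siRNAs and their grouping by target can be sketched as follows. The threshold of three reads comes from the text; the function name, the input dictionaries, and the gene labels in any example are hypothetical.</p>
<preformat>
```python
from collections import defaultdict


def multi_read_targets(read_counts, targets, min_reads=3):
    """Group multi-read siRNAs by the mRNA they match.

    read_counts: dict of siRNA id -> number of reads
    targets: dict of siRNA id -> list of matched mRNA ids (unmatched ids may be absent)
    Returns a dict of mRNA id -> sorted list of siRNA ids with at least min_reads reads.
    """
    by_target = defaultdict(list)
    for sirna_id, n_reads in read_counts.items():
        if n_reads >= min_reads:
            for mrna_id in targets.get(sirna_id, []):
                by_target[mrna_id].append(sirna_id)
    return {mrna_id: sorted(ids) for mrna_id, ids in by_target.items()}
```
</preformat>
<p>Targets that collect several multi-read siRNAs stand out as keys with long lists, which is the pattern the text reports for the argonaute-related genes.</p>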
<fig position="float" id="F2">
<label>Figure 2</label>
<caption>
<p>
<bold>Length distribution of the multi-read siRNAs</bold>
. Multi-read siRNAs were obtained from the library in Ruby et al., 2006 as described in the methods section. The lengths were calculated and the number of counts for each nucleotide length was plotted.</p>
</caption>
<graphic xlink:href="1471-2164-9-270-2"></graphic>
</fig>
<fig position="float" id="F3">
<label>Figure 3</label>
<caption>
<p>
<bold>Length distribution and target sites of siRNAs complementary to Argonaute-related mRNA NM_075351 (F55C9.3)</bold>
. The siRNA identifier, number of reads, length, siRNA sequence, and position along the transcript are shown. The short bars indicate the position along the transcript and the numbers indicate the nucleotide position along the sequence. There are one- to two-nucleotide-shorter versions of certain siRNAs that have been read multiple times. The number of reads was not available (n.a.) in some cases because those siRNAs were obtained from Lee et al. (2006).</p>
</caption>
<graphic xlink:href="1471-2164-9-270-3"></graphic>
</fig>
</sec>
<sec>
<title>Gene Ontology terms correspond to specific siRNA lengths</title>
<p>The targets of siRNAs in each length category were associated with the specific Gene Ontology (GO) terms shown in Table
<xref ref-type="table" rid="T1">1</xref>
and Additional file
<xref ref-type="supplementary-material" rid="S4">4</xref>
. The 18- to 22-mers were associated with the term embryonic development while 23-mers were uniquely associated with post-embryonic development. Longer siRNAs were linked to other GO terms. The 24- to 26-mer siRNAs were linked to phosphorus metabolism or protein modification.</p>
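<p>To make the notion of an "enriched" GO term concrete, the sketch below computes a one-sided hypergeometric p-value for a single term. This is a generic textbook test offered for illustration; it is not the procedure used by DAVID, and the function and parameter names are assumptions.</p>
<preformat>
```python
from math import comb


def hypergeom_enrichment_p(k, n, K, N):
    """One-sided hypergeometric p-value for over-representation of a term.

    k: target genes annotated with the term
    n: size of the target gene list
    K: background genes annotated with the term
    N: size of the background gene set
    Returns the probability of seeing k or more annotated genes by chance
    when n genes are drawn without replacement from the background.
    """
    upper = min(n, K)
    total = comb(N, n)
    # math.comb returns 0 when the lower index exceeds the upper one,
    # so impossible draws contribute nothing to the tail sum.
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total
```
</preformat>
<p>Exact integer arithmetic avoids underflow for very small p-values, at the cost of speed on large backgrounds. When many GO terms are tested at once, a multiple-testing correction would normally be applied to the resulting p-values.</p>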
<table-wrap position="float" id="T1">
<label>Table 1</label>
<caption>
<p>The enriched GO terms for target genes of endo-siRNAs.</p>
</caption>
<table frame="hsides" rules="groups">
<thead>
<tr>
<td align="left" colspan="4">
<bold>All siRNAs with a putative target</bold>
</td>
</tr>
</thead>
<tbody>
<tr>
<td align="left">
<bold>Biological process</bold>
</td>
<td align="center">
<bold>N</bold>
</td>
<td align="center">
<bold>Per cent</bold>
</td>
<td align="center">
<bold>p-value</bold>
</td>
</tr>
<tr>
<td colspan="4">
<hr></hr>
</td>
</tr>
<tr>
<td align="left"> embryonic development</td>
<td align="center">534</td>
<td align="center">23.9</td>
<td align="center">3.6E-44</td>
</tr>
<tr>
<td align="left"> multicellular organismal process</td>
<td align="center">729</td>
<td align="center">32.7</td>
<td align="center">7.3E-42</td>
</tr>
<tr>
<td align="left"> reproduction</td>
<td align="center">397</td>
<td align="center">17.8</td>
<td align="center">2.0E-32</td>
</tr>
<tr>
<td align="left"> larval development</td>
<td align="center">321</td>
<td align="center">14.4</td>
<td align="center">4.3E-24</td>
</tr>
<tr>
<td align="left"> growth</td>
<td align="center">388</td>
<td align="center">17.4</td>
<td align="center">6.1E-24</td>
</tr>
<tr>
<td align="left"> post-embryonic development</td>
<td align="center">339</td>
<td align="center">15.2</td>
<td align="center">7.1E-24</td>
</tr>
<tr>
<td align="left"> cell division</td>
<td align="center">100</td>
<td align="center">4.5</td>
<td align="center">7.9E-23</td>
</tr>
<tr>
<td align="left"> embryonic cleavage</td>
<td align="center">68</td>
<td align="center">3.0</td>
<td align="center">9.4E-16</td>
</tr>
<tr>
<td align="left"> sexual reproduction</td>
<td align="center">170</td>
<td align="center">7.6</td>
<td align="center">2.1E-15</td>
</tr>
<tr>
<td align="left"> reproductive process</td>
<td align="center">185</td>
<td align="center">8.3</td>
<td align="center">1.2E-14</td>
</tr>
<tr>
<td align="left">
<bold>Molecular function</bold>
</td>
<td align="center">
<bold>N</bold>
</td>
<td align="center">
<bold>Per cent</bold>
</td>
<td align="center">
<bold>p-value</bold>
</td>
</tr>
<tr>
<td align="left"> nucleotide binding</td>
<td align="center">291</td>
<td align="center">13.0</td>
<td align="center">1.7E-23</td>
</tr>
<tr>
<td align="left"> purine nucleotide binding</td>
<td align="center">245</td>
<td align="center">11.0</td>
<td align="center">4.5E-16</td>
</tr>
<tr>
<td align="left"> ribonucleotide binding</td>
<td align="center">235</td>
<td align="center">10.5</td>
<td align="center">4.6E-16</td>
</tr>
<tr>
<td align="left"> ATP binding</td>
<td align="center">205</td>
<td align="center">9.2</td>
<td align="center">6.1E-15</td>
</tr>
<tr>
<td align="left"> adenyl ribonucleotide binding</td>
<td align="center">205</td>
<td align="center">9.2</td>
<td align="center">7.6E-15</td>
</tr>
<tr>
<td align="left"> protein binding</td>
<td align="center">398</td>
<td align="center">17.8</td>
<td align="center">9.7E-15</td>
</tr>
<tr>
<td align="left"> RNA binding</td>
<td align="center">83</td>
<td align="center">3.7</td>
<td align="center">8.6E-12</td>
</tr>
<tr>
<td align="left"> nucleic acid binding</td>
<td align="center">295</td>
<td align="center">13.2</td>
<td align="center">6.5E-11</td>
</tr>
<tr>
<td align="left"> helicase activity</td>
<td align="center">28</td>
<td align="center">1.3</td>
<td align="center">1.0E-5</td>
</tr>
<tr>
<td align="left"> protein serine/threonine kinase activity</td>
<td align="center">83</td>
<td align="center">3.7</td>
<td align="center">1.9E-5</td>
</tr>
<tr>
<td align="left">
<bold>22 nucleotides long siRNAs</bold>
</td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td align="left">
<bold>Biological process</bold>
</td>
<td align="center">
<bold>N</bold>
</td>
<td align="center">
<bold>Per cent</bold>
</td>
<td align="center">
<bold>p-value</bold>
</td>
</tr>
<tr>
<td align="left"> multicellular organismal development</td>
<td align="center">293</td>
<td align="center">33.4</td>
<td align="center">1.1E-21</td>
</tr>
<tr>
<td align="left"> embryonic development ending in birth or egg hatching</td>
<td align="center">221</td>
<td align="center">25.2</td>
<td align="center">2.7E-20</td>
</tr>
<tr>
<td align="left"> reproduction</td>
<td align="center">170</td>
<td align="center">19.4</td>
<td align="center">7.5E-16</td>
</tr>
<tr>
<td align="left"> growth</td>
<td align="center">161</td>
<td align="center">18.4</td>
<td align="center">1.1E-10</td>
</tr>
<tr>
<td align="left"> cell division</td>
<td align="center">44</td>
<td align="center">5.0</td>
<td align="center">1.4E-10</td>
</tr>
<tr>
<td align="left">
<bold>23 nucleotides long siRNAs</bold>
</td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td align="left">
<bold>Biological process</bold>
</td>
<td align="center">
<bold>N</bold>
</td>
<td align="center">
<bold>Per cent</bold>
</td>
<td align="center">
<bold>p-value</bold>
</td>
</tr>
<tr>
<td align="left"> multicellular organismal development</td>
<td align="center">87</td>
<td align="center">25.4</td>
<td align="center">1.6E-5</td>
</tr>
<tr>
<td align="left"> reproduction</td>
<td align="center">50</td>
<td align="center">14.6</td>
<td align="center">1.9E-4</td>
</tr>
<tr>
<td align="left"> anatomical structure development</td>
<td align="center">34</td>
<td align="center">9.9</td>
<td align="center">4.9E-4</td>
</tr>
<tr>
<td align="left"> cell division</td>
<td align="center">14</td>
<td align="center">4.1</td>
<td align="center">7.5E-4</td>
</tr>
<tr>
<td align="left"> cell cycle process</td>
<td align="center">13</td>
<td align="center">3.8</td>
<td align="center">1.2E-3</td>
</tr>
<tr>
<td align="left">
<bold>26 nucleotides long siRNAs</bold>
</td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td align="left">
<bold>Biological process</bold>
</td>
<td align="center">
<bold>N</bold>
</td>
<td align="center">
<bold>Per cent</bold>
</td>
<td align="center">
<bold>p-value</bold>
</td>
</tr>
<tr>
<td align="left"> phosphate metabolic process</td>
<td align="center">26</td>
<td align="center">11.9</td>
<td align="center">4.4E-09</td>
</tr>
<tr>
<td align="left"> biopolymer modification</td>
<td align="center">27</td>
<td align="center">12.3</td>
<td align="center">1.7E-07</td>
</tr>
<tr>
<td align="left"> cellular protein metabolic process</td>
<td align="center">32</td>
<td align="center">14.6</td>
<td align="center">1.1E-04</td>
</tr>
<tr>
<td align="left"> cellular macromolecule metabolic process</td>
<td align="center">32</td>
<td align="center">14.6</td>
<td align="center">2.0E-04</td>
</tr>
<tr>
<td align="left"> biopolymer metabolic process</td>
<td align="center">38</td>
<td align="center">17.4</td>
<td align="center">2.4E-04</td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<p>Target gene lists were submitted to DAVID 2.0 as described in Methods. The 10 most significant GO categories are shown for all siRNAs with a putative target. The 5 most significant GO categories are presented for the siRNAs according to their length. The lists for siRNAs of all lengths are presented in Additional file
<xref ref-type="supplementary-material" rid="S4">4</xref>
.</p>
</table-wrap-foot>
</table-wrap>
</sec>
<sec>
<title>Nucleotide sequence of siRNA molecules</title>
<p>The starting nucleotide was previously shown to be guanine (G) in 85% of endogenous siRNAs in the library of Ruby and co-workers [
<xref ref-type="bibr" rid="B6">6</xref>
]. Thus, it was of interest to measure the most common starting nucleotide after combining libraries. After plotting the starting nucleotide of siRNA sequences against their length, we observed a similar trend with G as the most abundant first 5' nucleotide (Figure
<xref ref-type="fig" rid="F4">4</xref>
). Shorter siRNAs exhibited uracil (U), cytosine (C) or adenine (A) frequently as the starting nucleotide. Furthermore, we constructed frequency sequence graphs (logos) for 4024 siRNA sequences of 12–29 nt in length with an mRNA match (Figure
<xref ref-type="fig" rid="F5">5</xref>
). siRNAs of all lengths prefer G at their 5' end and C at their 3' end, with both A and U well represented along the body. All siRNAs from the combined library exhibited an average A/U content of 53.1%, while the A/U content of all mRNAs was previously reported to be 42.7% [
<xref ref-type="bibr" rid="B16">16</xref>
] and 64.6% in the whole genome [
<xref ref-type="bibr" rid="B17">17</xref>
]. The 12- to 16-mers exhibit variable 3' nucleotides (A, G or U), suggesting that short sequences are potential degradation products of longer sequences. To address this question, we aligned the siRNA sequences and observed that 5 of 25 sequences (20%) among the 12-mers and 3 of 22 sequences (13.6%) among the 13-mers were contained within longer siRNA sequences (Additional file
<xref ref-type="supplementary-material" rid="S5">5</xref>
). Frequency sequence graphs were also generated for the combined library of 7136 RNA sequences (12- to 33-mers), including those without mRNA matches (Additional file
<xref ref-type="supplementary-material" rid="S6">6</xref>
). Results were similar between the two sets of frequency sequence graphs.</p>
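<p>The quantities discussed in this section (the distribution of 5' nucleotides by siRNA length, the pooled A/U content, and the containment of short sequences within longer ones) can be illustrated with a minimal sketch. The function names and the example sequences are hypothetical and were not part of the original analysis, which relied on WebLogo and sequence alignment; the sketch only shows one straightforward way such values could be computed.</p>
<preformat>
```python
from collections import Counter, defaultdict


def five_prime_distribution(sequences):
    """Return {length: {nucleotide: percentage}} for the first (5') nucleotide."""
    counts = defaultdict(Counter)
    for seq in sequences:
        seq = seq.upper().replace("T", "U")  # treat DNA-style input as RNA
        if seq:
            counts[len(seq)][seq[0]] += 1
    return {
        length: {nt: 100.0 * n / sum(c.values()) for nt, n in c.items()}
        for length, c in counts.items()
    }


def au_content(sequences):
    """Return the pooled A/U percentage over all nucleotides in the sequences."""
    joined = "".join(sequences).upper().replace("T", "U")
    if not joined:
        return 0.0
    return 100.0 * (joined.count("A") + joined.count("U")) / len(joined)


def contained_in_longer(short_seqs, all_seqs):
    """Return the short sequences that occur as exact substrings of a longer one."""
    return [
        s for s in short_seqs
        # A substring match with unequal lengths implies the other sequence is longer.
        if any(len(t) != len(s) and s in t for t in all_seqs)
    ]


# Hypothetical toy sequences, for illustration only.
example = ["GAUC", "GUAC", "UAGC", "GAUCA"]
distribution = five_prime_distribution(example)
fragments = contained_in_longer(["GAUC", "GUAC"], example)
```
</preformat>
<p>In this sketch the containment test is an exact substring match, which corresponds to the perfect-match criterion described for Additional file 5; any tolerance for mismatches would be an additional assumption.</p>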
<fig position="float" id="F4">
<label>Figure 4</label>
<caption>
<p>
<bold>The distribution of 5' nucleotides in siRNAs grouped by length</bold>
. Each bar represents the percentage of siRNAs for the given length. The shading indicates the 5' starting nucleotide.</p>
</caption>
<graphic xlink:href="1471-2164-9-270-4"></graphic>
</fig>
<fig position="float" id="F5">
<label>Figure 5</label>
<caption>
<p>
<bold>The frequency sequence graphs (logos) of siRNAs which have putative target mRNAs</bold>
. siRNA sequences were collected and divided by size as described in Methods. Sequence logos were generated using WebLogo
<ext-link ext-link-type="uri" xlink:href="http://weblogo.berkeley.edu"></ext-link>
. The siRNA length is shown on the left and the number of siRNAs in each length category is shown on the right.</p>
</caption>
<graphic xlink:href="1471-2164-9-270-5"></graphic>
</fig>
</sec>
<sec>
<title>Conservation of candidate mRNA target sequences</title>
<p>The conservation of the putative target genes for the siRNAs was inspected by using pre-calculated BLAST similarities between
<italic>C. elegans </italic>
and
<italic>C. briggsae </italic>
(Wormbase release WS170). Of the putative target genes, 14.4% had perfect matches between the two species, whereas only 8.8% of the rest of the genome had this score, reflecting conservation of siRNA-associated genes (data not shown). However, genes targeted by several different siRNAs were found to be more weakly conserved than the target genes on average (Additional file
<xref ref-type="supplementary-material" rid="S7">7</xref>
).</p>
</sec>
</sec>
<sec>
<title>Discussion</title>
<p>The existence of endogenous siRNAs (endo-siRNAs) encoded by the genome in
<italic>C. elegans </italic>
has been reported by several groups [
<xref ref-type="bibr" rid="B2">2</xref>
,
<xref ref-type="bibr" rid="B4">4</xref>
-
<xref ref-type="bibr" rid="B6">6</xref>
]. The evidence for classification of these sequences as endo-siRNAs has been the observation that RNAi machinery is required for their accumulation. In addition, many RNAi pathway mutants exhibit elevated levels of gene expression indicating the loss of regulatory RNAs [
<xref ref-type="bibr" rid="B5">5</xref>
,
<xref ref-type="bibr" rid="B15">15</xref>
]. We combined datasets representing candidate endogenous siRNAs collected to date from
<italic>C. elegans </italic>
by high-throughput sequencing efforts [
<xref ref-type="bibr" rid="B5">5</xref>
,
<xref ref-type="bibr" rid="B6">6</xref>
]. Two electronic libraries of endo-siRNAs were gathered and annotated.</p>
<p>The length of siRNAs in the combined data set was about 22 nt on average, both for siRNA sequences with matching mRNAs and for those without (Figure
<xref ref-type="fig" rid="F1">1a</xref>
). Examining the siRNAs as separate groups by source laboratory revealed differences in their length distributions, with 26-mers more highly represented in the library of Ruby and co-workers [
<xref ref-type="bibr" rid="B6">6</xref>
] (Figure
<xref ref-type="fig" rid="F1">1c</xref>
). In both libraries, centered distributions of 18- to 23-mers suggest a uniform class of endogenous siRNAs with the ability to hybridize with mRNA sequences (Figures
<xref ref-type="fig" rid="F1">1a, 1b</xref>
and
<xref ref-type="fig" rid="F1">1c</xref>
). siRNAs with no mRNA matches exhibited a wider length distribution. It is possible that siRNAs without mRNA matches represent a less uniform group, suggesting alternative modes of synthesis for these sequences. siRNAs in the library of Ruby and co-workers [
<xref ref-type="bibr" rid="B6">6</xref>
] were enriched for siRNAs with matching mRNAs (Figure
<xref ref-type="fig" rid="F1">1c</xref>
) while siRNAs with fewer mRNA matches were more highly represented in the library of Lee and co-workers [
<xref ref-type="bibr" rid="B5">5</xref>
] (Figure
<xref ref-type="fig" rid="F1">1b</xref>
). A possible explanation lies in the cloning methods used by the two laboratories. The siRNA library of Lee and co-workers was constructed using a cloning method independent of 5' monophosphate ligation, whereas the library of Ruby and co-workers was constructed using a linker ligation biased toward detecting 5' monophosphate ends of siRNA molecules. The higher proportion of siRNAs with mRNA matches in the library of Ruby and co-workers could be explained if the majority of the sequences captured with a 5' monophosphate arose from secondary siRNA synthesis on the mRNA template. Secondary siRNAs produced via the exo-RNAi pathway have been shown to prefer a 5' triphosphate end, while primary siRNAs are generally thought to carry a 5' monophosphate resulting from Dicer-dependent synthesis [
<xref ref-type="bibr" rid="B9">9</xref>
,
<xref ref-type="bibr" rid="B12">12</xref>
]. However, a 5' monophosphate could also be produced by enzymes able to remove the 5' γ- and β-phosphates, such as
<italic>C. elegans </italic>
PIR-1 [
<xref ref-type="bibr" rid="B4">4</xref>
]. Another way to obtain a 5' monophosphate end would be the removal of the entire 5' nucleotide together with the triphosphate [
<xref ref-type="bibr" rid="B4">4</xref>
,
<xref ref-type="bibr" rid="B6">6</xref>
,
<xref ref-type="bibr" rid="B12">12</xref>
]. The existence of abundant 25-mers along with even more abundant 26-mers provides support for this hypothesis. The argonaute-related gene F55C9.3 was chosen to model the distribution of siRNA sequences along the mRNA. A total of 16 unique siRNAs, with 1–10 reads each, aligned to the F55C9.3 mRNA and fell into two length categories. The length difference between the 19- to 20-mers and the 24- to 26-mers suggests the existence of two distinct classes of siRNAs on the mRNA. It is tempting to speculate that the shorter ones, with only one read each, are members of a class of primary siRNAs and that the abundant longer ones, with up to ten identical reads, are secondary siRNAs synthesized by an RdRP. However, biochemical studies are needed to further characterize these classes of siRNAs.</p>
<p>Since the siRNAs from the two sources available so far have over 2200 candidate target genes in total, we categorized these genes by their Gene Ontologies (Table
<xref ref-type="table" rid="T1">1</xref>
and Additional file
<xref ref-type="supplementary-material" rid="S4">4</xref>
). An interesting observation was that the length of endogenous siRNAs seems to determine the functional category of their putative target genes. The matching mRNAs for 22-mer siRNAs were associated with the GO term embryonic development, while candidate targets for 23-mers were uniquely associated with post-embryonic development. It has been shown that some endogenous siRNAs and almost all miRNAs exhibit developmentally regulated expression patterns [
<xref ref-type="bibr" rid="B1">1</xref>
,
<xref ref-type="bibr" rid="B2">2</xref>
]. Interestingly, the 24- to 26-mer siRNAs were associated with phosphorus metabolism or protein modification. We hypothesize that these relatively long 25- to 26-mer endo-siRNA molecules could be synthesized by a specific RdRP such as RRF-3 and its associated proteins, since a large group of genes linked to phosphorus metabolism was observed to be over-expressed in
<italic>rrf</italic>
-3 mutant worm strains [
<xref ref-type="bibr" rid="B15">15</xref>
]. RRF-3 has been shown to be required for the biogenesis of several classes of endogenous secondary siRNAs [
<xref ref-type="bibr" rid="B4">4</xref>
,
<xref ref-type="bibr" rid="B5">5</xref>
]. In addition, the
<italic>C. elegans </italic>
argonaute family of RNA binding proteins could exhibit specificity for the length of siRNA and direct silencing of genes associated with specific biological processes such as embryonic development, post-embryonic development, or phosphorus metabolism.</p>
<p>After plotting the starting nucleotide of siRNA sequences against their length, we observed that most siRNAs have G at their 5' end. In addition, sequence frequency graphs showed a preference for C at their 3' end. A and U were well represented along the body of siRNA molecules. A high A/U frequency might be needed to lower the energy required to remove newly synthesized endo-siRNA molecules and allow the RdRP complex to continue unprimed production of additional secondary siRNA molecules along the template [
<xref ref-type="bibr" rid="B12">12</xref>
,
<xref ref-type="bibr" rid="B18">18</xref>
]. Only short siRNAs of 12 to 16 nt in length exhibited variable 3' nucleotides (A, G or U), suggesting that these are potential degradation products of longer ones.</p>
<p>Target genes for the siRNAs were observed to be more highly conserved between
<italic>C. elegans </italic>
and
<italic>C. briggsae </italic>
than genes on average. One explanation is that siRNAs target genes with conserved exons. It is also possible that evolutionarily conserved target genes have had time to develop siRNA sequences, for example, to regulate their expression during specific developmental stages. When each target gene was classified by the number of associated siRNAs, it was observed that target genes with more than three siRNAs tended to be poorly conserved between the two nematode species. This might indicate accelerated production of secondary siRNAs for young, species-specific genes.</p>
</sec>
<sec>
<title>Conclusion</title>
<p>In order to understand the cellular functions regulated by endogenous siRNAs collected to date from
<italic>C. elegans</italic>
, we merged the libraries from two sequencing projects [
<xref ref-type="bibr" rid="B5">5</xref>
,
<xref ref-type="bibr" rid="B6">6</xref>
] containing all publicly available siRNA sequences, resulting in a collection of 7136 endo-siRNA sequences. We characterized their length distributions and the relationship between siRNA length and the function of the targeted genes. The endo-siRNA sequences corresponded to functionally different target genes depending on siRNA length: 18- to 22-mers match mRNA targets associated with embryonic development, targets of 23-mer siRNAs are associated with post-embryonic development, and targets of 24- to 26-mer siRNAs are involved in phosphorus metabolism or protein modification. Genes targeted by siRNAs with multiple reads included several poorly characterized argonaute-related genes. We conclude that the function of the target genes of endogenous siRNAs appears to vary depending on siRNA length. These results also provide additional evidence for the existence of a number of siRNA biosynthesis mechanisms capable of regulating gene expression associated with specific biological processes.</p>
</sec>
<sec sec-type="methods">
<title>Methods</title>
<sec>
<title>Sequences</title>
<p>The siRNA sequences, generated from mixed-stage
<italic>C. elegans </italic>
populations by high-throughput sequencing technologies, were taken from previously published studies [
<xref ref-type="bibr" rid="B5">5</xref>
,
<xref ref-type="bibr" rid="B6">6</xref>
]. A total of 7136 siRNA sequences were annotated using the NCBI BLAST server with the Wormbase release WS170 nematode RefSeq mRNA database, which contains only high-quality sequences for gene transcripts. The results were filtered to retain only plus/minus hits against known mRNAs, which represent antisense siRNA sequences. The results were obtained as RefSeq mRNA identifiers that were unique to known transcripts.</p>
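<p>The plus/minus filtering step can be sketched as follows. This is an illustration under stated assumptions, not the procedure used in this study: it assumes the BLAST results are available in the standard 12-column tabular format, in which a hit on the minus strand of the subject is reported with a subject start coordinate greater than the subject end coordinate. The function name and the identifiers in the example are hypothetical.</p>
<preformat>
```python
import csv
import io
from collections import defaultdict


def antisense_targets(tabular_text):
    """Map each query (siRNA) ID to the set of subject (mRNA) IDs hit in plus/minus orientation.

    Assumed columns: qseqid, sseqid, pident, length, mismatch, gapopen,
    qstart, qend, sstart, send, evalue, bitscore.
    """
    targets = defaultdict(set)
    for row in csv.reader(io.StringIO(tabular_text), delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank and comment lines
        qstart, qend = int(row[6]), int(row[7])
        sstart, send = int(row[8]), int(row[9])
        # Query on the plus strand, subject coordinates reversed (minus strand).
        if qend >= qstart and sstart > send:
            targets[row[0]].add(row[1])
    return dict(targets)


# Hypothetical hits, for illustration only.
example_hits = "\n".join([
    "si1\tNM_1\t100.0\t22\t0\t0\t1\t22\t300\t279\t1e-5\t44.1",  # plus/minus, kept
    "si1\tNM_2\t100.0\t22\t0\t0\t1\t22\t50\t71\t1e-5\t44.1",    # plus/plus, dropped
    "si2\tNM_1\t100.0\t21\t0\t0\t1\t21\t90\t70\t1e-4\t42.1",    # plus/minus, kept
])
siRNA_to_mRNA = antisense_targets(example_hits)
```
</preformat>
<p>Collecting the subject identifiers in a set per siRNA keeps each putative target only once, even when several alignments to the same transcript are reported.</p>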
</sec>
<sec>
<title>Length distributions</title>
<p>siRNAs with equal sequence lengths were grouped together and the sequence frequency graphs were generated separately for each length. Nucleotide frequency graphs of the included siRNAs were created using WebLogo (University of California, Berkeley, CA, USA). The same process was repeated for the subset of siRNAs that were associated with mRNA sequences.</p>
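<p>Grouping by length and tabulating per-position nucleotide frequencies, which is the information a sequence logo summarizes, can be sketched as follows. The logos in this study were produced with WebLogo; the functions below are hypothetical helpers that only illustrate the underlying bookkeeping, and the example sequences are invented.</p>
<preformat>
```python
from collections import Counter, defaultdict


def group_by_length(sequences):
    """Return {length: [sequences of that length]}."""
    groups = defaultdict(list)
    for seq in sequences:
        groups[len(seq)].append(seq.upper())
    return dict(groups)


def position_frequencies(equal_length_seqs):
    """Return one {nucleotide: fraction} dict per position for equal-length sequences."""
    n = len(equal_length_seqs)
    return [
        {nt: count / n for nt, count in Counter(column).items()}
        for column in zip(*equal_length_seqs)  # iterate over alignment columns
    ]


# Hypothetical toy sequences, for illustration only.
groups = group_by_length(["GAC", "GUC", "UAC", "GAUC"])
frequencies_3mers = position_frequencies(groups[3])
```
</preformat>
<p>Because all sequences within a group have the same length, no alignment step is needed before the columns are counted.</p>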
</sec>
<sec>
<title>Functional categories</title>
<p>The enriched functional categories for the siRNAs of different lengths were obtained using the DAVID 2.0 web service [
<xref ref-type="bibr" rid="B19">19</xref>
]. The descriptions of the genes that were targets of 8 or more siRNAs were obtained using Biomart together with the WB180 genome. In order to study the conservation of the genes encoding the putative target mRNAs, the precalculated BLAST expectation (E) values for the observed homologies between the putative target genes in
<italic>C. elegans</italic>
, and their orthologs in
<italic>C. briggsae</italic>
, were obtained using Wormbase WB180. SPSS version 14 was used in the graphical presentation of the results (SPSS Inc., Chicago, Illinois, USA).</p>
</sec>
</sec>
<sec>
<title>Authors' contributions</title>
<p>SA conceptualized, planned, performed and designed the analysis and wrote the manuscript. LH performed the experiments and participated in writing the manuscript. GW participated in the design of the experiment, management of the project, and writing of the manuscript. MS performed the experiments, wrote the manuscript, and managed the data and submission. All authors read and approved the final manuscript.</p>
</sec>
<sec sec-type="supplementary-material">
<title>Supplementary Material</title>
<supplementary-material content-type="local-data" id="S1">
<caption>
<title>Additional file 1</title>
<p>
<bold>The collection of all 4024 siRNAs with putative mRNA targets</bold>
. The sequences, lengths, and the putative target mRNAs for the siRNAs from two sources (Lee et al., 2006, Ruby et al., 2006) are presented.</p>
</caption>
<media xlink:href="1471-2164-9-270-S1.xls" mimetype="application" mime-subtype="vnd.ms-excel">
<caption>
<p>Click here for file</p>
</caption>
</media>
</supplementary-material>
<supplementary-material content-type="local-data" id="S2">
<caption>
<title>Additional file 2</title>
<p>
<bold>The collection of all 7136 short RNAs</bold>
. The sequences, lengths, and the 5' nucleotides of the entire RNA collection from two sources (Lee et al., 2006, Ruby et al., 2006) are presented.</p>
</caption>
<media xlink:href="1471-2164-9-270-S2.xls" mimetype="application" mime-subtype="vnd.ms-excel">
<caption>
<p>Click here for file</p>
</caption>
</media>
</supplementary-material>
<supplementary-material content-type="local-data" id="S3">
<caption>
<title>Additional file 3</title>
<p>
<bold>The individual siRNAs with the most reads</bold>
. The data from Ruby et al. (2006) included the number of individual detections (reads) of each sequence, and these are presented here.</p>
</caption>
<media xlink:href="1471-2164-9-270-S3.xls" mimetype="application" mime-subtype="vnd.ms-excel">
<caption>
<p>Click here for file</p>
</caption>
</media>
</supplementary-material>
<supplementary-material content-type="local-data" id="S4">
<caption>
<title>Additional file 4</title>
<p>
<bold>The full list of the enriched GO terms of siRNAs by length</bold>
. Target gene lists were submitted to DAVID 2.0 as described in Methods. The 10 most significant GO categories are shown for all siRNAs with a putative target. The 5 most significant GO categories are presented for the siRNAs according to their length.</p>
</caption>
<media xlink:href="1471-2164-9-270-S4.pdf" mimetype="application" mime-subtype="pdf">
<caption>
<p>Click here for file</p>
</caption>
</media>
</supplementary-material>
<supplementary-material content-type="local-data" id="S5">
<caption>
<title>Additional file 5</title>
<p>
<bold>Redundancy and fragments of short siRNAs</bold>
. The number of short siRNAs that aligned with longer siRNAs is presented on the first sheet. On the second sheet, the siRNAs (siRNA1) whose sequence is included in one or more longer siRNA sequences (siRNA2) with a perfect match are presented.</p>
</caption>
<media xlink:href="1471-2164-9-270-S5.xls" mimetype="application" mime-subtype="vnd.ms-excel">
<caption>
<p>Click here for file</p>
</caption>
</media>
</supplementary-material>
<supplementary-material content-type="local-data" id="S6">
<caption>
<title>Additional file 6</title>
<p>
<bold>The frequency sequence graphs (logos) of the whole collection of short RNAs</bold>
. The sequences included those without putative targets from Lee et al. (2006) and Ruby et al. (2006). The number of siRNAs in each length category is shown on the right.</p>
</caption>
<media xlink:href="1471-2164-9-270-S6.pdf" mimetype="application" mime-subtype="pdf">
<caption>
<p>Click here for file</p>
</caption>
</media>
</supplementary-material>
<supplementary-material content-type="local-data" id="S7">
<caption>
<title>Additional file 7</title>
<p>
<bold>Conservation of genes targeted by several siRNAs</bold>
. Some genes (mRNAs) were putative targets for many siRNAs. The genes with the highest number of siRNAs appear to be poorly conserved in recent evolution. The homology between
<italic>C. elegans </italic>
and
<italic>C. briggsae </italic>
was estimated by precalculated BLAST expectation (E) values. The smaller the E-value (down to 10^-300), the stronger the statistical support for the homology. Certain genes, such as K02E2.6, did not appear to have an ortholog in
<italic>C. briggsae</italic>
.</p>
</caption>
<media xlink:href="1471-2164-9-270-S7.xls" mimetype="application" mime-subtype="vnd.ms-excel">
<caption>
<p>Click here for file</p>
</caption>
</media>
</supplementary-material>
</sec>
</body>
<back>
<ack>
<sec>
<title>Acknowledgements</title>
<p>We thank Vuokko Aarnio and Merja Lakso for helpful discussions. The Bartel Laboratory and Ambros Laboratory are gratefully acknowledged for making their data available. The Finnish Graduate School of Neurosciences provided funding for SA. The Saastamoinen Foundation provided funding to LH.</p>
</sec>
</ack>
<ref-list>
<ref id="B1">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Pasquinelli</surname>
<given-names>AE</given-names>
</name>
<name>
<surname>Ruvkun</surname>
<given-names>G</given-names>
</name>
</person-group>
<article-title>Control of developmental timing by microRNAs and their targets</article-title>
<source>Annu Rev Cell Dev Biol</source>
<year>2002</year>
<volume>18</volume>
<fpage>495</fpage>
<lpage>513</lpage>
<pub-id pub-id-type="pmid">12142272</pub-id>
<pub-id pub-id-type="doi">10.1146/annurev.cellbio.18.012502.105832</pub-id>
</citation>
</ref>
<ref id="B2">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Ambros</surname>
<given-names>V</given-names>
</name>
<name>
<surname>Lee</surname>
<given-names>RC</given-names>
</name>
<name>
<surname>Lavanway</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Williams</surname>
<given-names>PT</given-names>
</name>
<name>
<surname>Jewell</surname>
<given-names>D</given-names>
</name>
</person-group>
<article-title>MicroRNAs and other tiny endogenous RNAs in C. elegans</article-title>
<source>Curr Biol</source>
<year>2003</year>
<volume>13</volume>
<fpage>807</fpage>
<lpage>818</lpage>
<pub-id pub-id-type="pmid">12747828</pub-id>
<pub-id pub-id-type="doi">10.1016/S0960-9822(03)00287-2</pub-id>
</citation>
</ref>
<ref id="B3">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Chapman</surname>
<given-names>EJ</given-names>
</name>
<name>
<surname>Carrington</surname>
<given-names>JC</given-names>
</name>
</person-group>
<article-title>Specialization and evolution of endogenous small RNA pathways</article-title>
<source>Nat Rev Genet</source>
<year>2007</year>
<volume>8</volume>
<fpage>884</fpage>
<lpage>96</lpage>
<pub-id pub-id-type="pmid">17943195</pub-id>
<pub-id pub-id-type="doi">10.1038/nrg2179</pub-id>
</citation>
</ref>
<ref id="B4">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Duchaine</surname>
<given-names>TF</given-names>
</name>
<name>
<surname>Wohlschlegel</surname>
<given-names>JA</given-names>
</name>
<name>
<surname>Kennedy</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Bei</surname>
<given-names>Y</given-names>
</name>
<name>
<surname>Conte</surname>
<given-names>D</given-names>
<suffix>Jr</suffix>
</name>
<name>
<surname>Pang</surname>
<given-names>K</given-names>
</name>
<name>
<surname>Brownell</surname>
<given-names>DR</given-names>
</name>
<name>
<surname>Harding</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Mitani</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Ruvkun</surname>
<given-names>G</given-names>
</name>
<name>
<surname>Yates</surname>
<given-names>JR</given-names>
<suffix>3rd</suffix>
</name>
<name>
<surname>Mello</surname>
<given-names>CC</given-names>
</name>
</person-group>
<article-title>Functional proteomics reveals the biochemical niche of C. elegans DCR-1 in multiple small-RNA-mediated pathways</article-title>
<source>Cell</source>
<year>2006</year>
<volume>124</volume>
<fpage>343</fpage>
<lpage>54</lpage>
<pub-id pub-id-type="pmid">16439208</pub-id>
<pub-id pub-id-type="doi">10.1016/j.cell.2005.11.036</pub-id>
</citation>
</ref>
<ref id="B5">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Lee</surname>
<given-names>RC</given-names>
</name>
<name>
<surname>Hammell</surname>
<given-names>CM</given-names>
</name>
<name>
<surname>Ambros</surname>
<given-names>V</given-names>
</name>
</person-group>
<article-title>Interacting endogenous and exogenous RNAi pathways in Caenorhabditis elegans</article-title>
<source>RNA</source>
<year>2006</year>
<volume>12</volume>
<fpage>589</fpage>
<lpage>597</lpage>
<pub-id pub-id-type="pmid">16489184</pub-id>
<pub-id pub-id-type="doi">10.1261/rna.2231506</pub-id>
</citation>
</ref>
<ref id="B6">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Ruby</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Jan</surname>
<given-names>C</given-names>
</name>
<name>
<surname>Player</surname>
<given-names>C</given-names>
</name>
<name>
<surname>Axtell</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Lee</surname>
<given-names>W</given-names>
</name>
<name>
<surname>Nusbaum</surname>
<given-names>C</given-names>
</name>
<name>
<surname>Ge</surname>
<given-names>H</given-names>
</name>
<name>
<surname>Bartel</surname>
<given-names>D</given-names>
</name>
</person-group>
<article-title>Large-Scale Sequencing Reveals 21U-RNAs and Additional MicroRNAs and Endogenous siRNAs in C. elegans</article-title>
<source>Cell</source>
<year>2006</year>
<volume>127</volume>
<fpage>1193</fpage>
<lpage>1207</lpage>
<pub-id pub-id-type="pmid">17174894</pub-id>
<pub-id pub-id-type="doi">10.1016/j.cell.2006.10.040</pub-id>
</citation>
</ref>
<ref id="B7">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Fire</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Xu</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Montgomery</surname>
<given-names>MK</given-names>
</name>
<name>
<surname>Kostas</surname>
<given-names>SA</given-names>
</name>
<name>
<surname>Driver</surname>
<given-names>SE</given-names>
</name>
<name>
<surname>Mello</surname>
<given-names>CC</given-names>
</name>
</person-group>
<article-title>Potent and specific genetic interference by double-stranded RNA in Caenorhabditis elegans</article-title>
<source>Nature</source>
<year>1998</year>
<volume>391</volume>
<fpage>806</fpage>
<lpage>811</lpage>
<pub-id pub-id-type="pmid">9486653</pub-id>
<pub-id pub-id-type="doi">10.1038/35888</pub-id>
</citation>
</ref>
<ref id="B8">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Timmons</surname>
<given-names>L</given-names>
</name>
<name>
<surname>Fire</surname>
<given-names>A</given-names>
</name>
</person-group>
<article-title>Specific interference by ingested dsRNA</article-title>
<source>Nature</source>
<year>1998</year>
<volume>395</volume>
<fpage>854</fpage>
<pub-id pub-id-type="pmid">9804418</pub-id>
<pub-id pub-id-type="doi">10.1038/27579</pub-id>
</citation>
</ref>
<ref id="B9">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Elbashir</surname>
<given-names>SM</given-names>
</name>
<name>
<surname>Harborth</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Lendeckel</surname>
<given-names>W</given-names>
</name>
<name>
<surname>Yalcin</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Weber</surname>
<given-names>K</given-names>
</name>
<name>
<surname>Tuschl</surname>
<given-names>T</given-names>
</name>
</person-group>
<article-title>Duplexes of 21-nucleotide RNAs mediate RNA interference in cultured mammalian cells</article-title>
<source>Nature</source>
<year>2001</year>
<volume>411</volume>
<fpage>494</fpage>
<lpage>498</lpage>
<pub-id pub-id-type="pmid">11373684</pub-id>
<pub-id pub-id-type="doi">10.1038/35078107</pub-id>
</citation>
</ref>
<ref id="B10">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Wilkins</surname>
<given-names>C</given-names>
</name>
<name>
<surname>Dishongh</surname>
<given-names>R</given-names>
</name>
<name>
<surname>Moore</surname>
<given-names>SC</given-names>
</name>
<name>
<surname>Whitt</surname>
<given-names>MA</given-names>
</name>
<name>
<surname>Chow</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Machaca</surname>
<given-names>K</given-names>
</name>
</person-group>
<article-title>RNA interference is an antiviral defence mechanism in Caenorhabditis elegans</article-title>
<source>Nature</source>
<year>2005</year>
<volume>436</volume>
<fpage>1044</fpage>
<lpage>1047</lpage>
<pub-id pub-id-type="pmid">16107852</pub-id>
<pub-id pub-id-type="doi">10.1038/nature03957</pub-id>
</citation>
</ref>
<ref id="B11">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Sijen</surname>
<given-names>T</given-names>
</name>
<name>
<surname>Fleenor</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Simmer</surname>
<given-names>F</given-names>
</name>
<name>
<surname>Thijssen</surname>
<given-names>KL</given-names>
</name>
<name>
<surname>Parrish</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Timmons</surname>
<given-names>L</given-names>
</name>
<name>
<surname>Plasterk</surname>
<given-names>RH</given-names>
</name>
<name>
<surname>Fire</surname>
<given-names>A</given-names>
</name>
</person-group>
<article-title>On the role of RNA amplification in dsRNA-triggered gene silencing</article-title>
<source>Cell</source>
<year>2001</year>
<volume>107</volume>
<fpage>465</fpage>
<lpage>76</lpage>
<pub-id pub-id-type="pmid">11719187</pub-id>
<pub-id pub-id-type="doi">10.1016/S0092-8674(01)00576-1</pub-id>
</citation>
</ref>
<ref id="B12">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Pak</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Fire</surname>
<given-names>A</given-names>
</name>
</person-group>
<article-title>Distinct populations of primary and secondary effectors during RNAi in C. elegans</article-title>
<source>Science</source>
<year>2007</year>
<volume>315</volume>
<fpage>199</fpage>
<lpage>200</lpage>
<pub-id pub-id-type="pmid">17218517</pub-id>
<pub-id pub-id-type="doi">10.1126/science.1132839</pub-id>
</citation>
</ref>
<ref id="B13">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Grishok</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Sinskey</surname>
<given-names>JL</given-names>
</name>
<name>
<surname>Sharp</surname>
<given-names>PA</given-names>
</name>
</person-group>
<article-title>Transcriptional silencing of a transgene by RNAi in the soma of C. elegans</article-title>
<source>Genes and development</source>
<year>2005</year>
<volume>19</volume>
<fpage>683</fpage>
<lpage>696</lpage>
<pub-id pub-id-type="pmid">15741313</pub-id>
<pub-id pub-id-type="doi">10.1101/gad.1247705</pub-id>
</citation>
</ref>
<ref id="B14">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Simmer</surname>
<given-names>F</given-names>
</name>
<name>
<surname>Tijsterman</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Parrish</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Koushika</surname>
<given-names>SP</given-names>
</name>
<name>
<surname>Nonet</surname>
<given-names>ML</given-names>
</name>
<name>
<surname>Fire</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Ahringer</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Plasterk</surname>
<given-names>RH</given-names>
</name>
</person-group>
<article-title>Loss of the putative RNA-directed RNA polymerase RRF-3 makes C. elegans hypersensitive to RNAi</article-title>
<source>Curr Biol</source>
<year>2002</year>
<volume>12</volume>
<fpage>1317</fpage>
<lpage>9</lpage>
<pub-id pub-id-type="pmid">12176360</pub-id>
<pub-id pub-id-type="doi">10.1016/S0960-9822(02)01041-2</pub-id>
</citation>
</ref>
<ref id="B15">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Asikainen</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Storvik</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Lakso</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Wong</surname>
<given-names>G</given-names>
</name>
</person-group>
<article-title>Whole genome microarray analysis of C. elegans rrf-3 and eri-1 mutants</article-title>
<source>FEBS Letters</source>
<year>2007</year>
<volume>581</volume>
<fpage>5050</fpage>
<lpage>5054</lpage>
<pub-id pub-id-type="pmid">17919598</pub-id>
<pub-id pub-id-type="doi">10.1016/j.febslet.2007.09.043</pub-id>
</citation>
</ref>
<ref id="B16">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Zhang</surname>
<given-names>L</given-names>
</name>
<name>
<surname>Kasif</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Cantor</surname>
<given-names>CR</given-names>
</name>
<name>
<surname>Broude</surname>
<given-names>NE</given-names>
</name>
</person-group>
<article-title>GC/AT-content spikes as genomic punctuation marks</article-title>
<source>Proc Natl Acad Sci USA</source>
<year>2004</year>
<volume>101</volume>
<fpage>16855</fpage>
<lpage>16860</lpage>
<pub-id pub-id-type="pmid">15548610</pub-id>
<pub-id pub-id-type="doi">10.1073/pnas.0407821101</pub-id>
</citation>
</ref>
<ref id="B17">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Stein</surname>
<given-names>LD</given-names>
</name>
<name>
<surname>Bao</surname>
<given-names>Z</given-names>
</name>
<name>
<surname>Blasiar</surname>
<given-names>D</given-names>
</name>
<name>
<surname>Blumenthal</surname>
<given-names>T</given-names>
</name>
<name>
<surname>Brent</surname>
<given-names>MR</given-names>
</name>
<name>
<surname>Chen</surname>
<given-names>N</given-names>
</name>
<name>
<surname>Chinwalla</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Clarke</surname>
<given-names>L</given-names>
</name>
<name>
<surname>Clee</surname>
<given-names>C</given-names>
</name>
<name>
<surname>Coghlan</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Coulson</surname>
<given-names>A</given-names>
</name>
<name>
<surname>D'Eustachio</surname>
<given-names>P</given-names>
</name>
<name>
<surname>Fitch</surname>
<given-names>DHA</given-names>
</name>
<name>
<surname>Fulton</surname>
<given-names>LA</given-names>
</name>
<name>
<surname>Fulton</surname>
<given-names>RE</given-names>
</name>
<name>
<surname>Griffiths-Jones</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Harris</surname>
<given-names>TW</given-names>
</name>
<name>
<surname>Hillier</surname>
<given-names>LW</given-names>
</name>
<name>
<surname>Kamath</surname>
<given-names>R</given-names>
</name>
<name>
<surname>Kuwabara</surname>
<given-names>PE</given-names>
</name>
<name>
<surname>Mardis</surname>
<given-names>ER</given-names>
</name>
<name>
<surname>Marra</surname>
<given-names>MA</given-names>
</name>
<name>
<surname>Miner</surname>
<given-names>TL</given-names>
</name>
<name>
<surname>Minx</surname>
<given-names>P</given-names>
</name>
<name>
<surname>Mullikin</surname>
<given-names>JC</given-names>
</name>
<name>
<surname>Plumb</surname>
<given-names>RW</given-names>
</name>
<name>
<surname>Rogers</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Schein</surname>
<given-names>JE</given-names>
</name>
<name>
<surname>Sohrmann</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Spieth</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Stajich</surname>
<given-names>JE</given-names>
</name>
<name>
<surname>Wei</surname>
<given-names>C</given-names>
</name>
<name>
<surname>Willey</surname>
<given-names>D</given-names>
</name>
<name>
<surname>Wilson</surname>
<given-names>RK</given-names>
</name>
<name>
<surname>Durbin</surname>
<given-names>R</given-names>
</name>
<name>
<surname>Waterston</surname>
<given-names>RH</given-names>
</name>
</person-group>
<article-title>The Genome Sequence of Caenorhabditis briggsae: A Platform for Comparative Genomics</article-title>
<source>PLoS Biol</source>
<year>2003</year>
<volume>1</volume>
<fpage>E45</fpage>
<pub-id pub-id-type="pmid">14624247</pub-id>
<pub-id pub-id-type="doi">10.1371/journal.pbio.0000045</pub-id>
</citation>
</ref>
<ref id="B18">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Sijen</surname>
<given-names>T</given-names>
</name>
<name>
<surname>Steiner</surname>
<given-names>FA</given-names>
</name>
<name>
<surname>Thijssen</surname>
<given-names>KL</given-names>
</name>
<name>
<surname>Plasterk</surname>
<given-names>RH</given-names>
</name>
</person-group>
<article-title>Secondary siRNAs result from unprimed RNA synthesis and form a distinct class</article-title>
<source>Science</source>
<year>2007</year>
<volume>315</volume>
<fpage>244</fpage>
<lpage>247</lpage>
<pub-id pub-id-type="pmid">17158288</pub-id>
<pub-id pub-id-type="doi">10.1126/science.1136699</pub-id>
</citation>
</ref>
<ref id="B19">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Dennis</surname>
<given-names>G</given-names>
<suffix>Jr</suffix>
</name>
<name>
<surname>Sherman</surname>
<given-names>BT</given-names>
</name>
<name>
<surname>Hosack</surname>
<given-names>DA</given-names>
</name>
<name>
<surname>Yang</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Gao</surname>
<given-names>W</given-names>
</name>
<name>
<surname>Lane</surname>
<given-names>HC</given-names>
</name>
<name>
<surname>Lempicki</surname>
<given-names>RA</given-names>
</name>
</person-group>
<article-title>DAVID: Database for Annotation, Visualization, and Integrated Discovery</article-title>
<source>Genome Biology</source>
<year>2003</year>
<volume>4</volume>
<fpage>P3</fpage>
<pub-id pub-id-type="pmid">12734009</pub-id>
<pub-id pub-id-type="doi">10.1186/gb-2003-4-5-p3</pub-id>
</citation>
</ref>
</ref-list>
</back>
</pmc>
</record>

To work with this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Sante/explor/MersV1/Data/Pmc/Corpus
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 0005470 | SxmlIndent | more

Or

HfdSelect -h $EXPLOR_AREA/Data/Pmc/Corpus/biblio.hfd -nk 0005470 | SxmlIndent | more

To add a link to this page within the Wicri network

{{Explor lien
   |wiki=    Sante
   |area=    MersV1
   |flux=    Pmc
   |étape=   Corpus
   |type=    RBID
   |clé=     
   |texte=   
}}


This area was generated with Dilib version V0.6.33.
Data generation: Mon Apr 20 23:26:43 2020. Site generation: Sat Mar 27 09:06:09 2021