Orange tree exploration server

Angiosperms Are Unique among Land Plant Lineages in the Occurrence of Key Genes in the RNA-Directed DNA Methylation (RdDM) Pathway

Internal identifier: 000102 (Pmc/Corpus); previous: 000101; next: 000103

Authors: Lu Ma; Andrea Hatlen; Laura J. Kelly; Hannes Becher; Wencai Wang; Ales Kovarik; Ilia J. Leitch; Andrew R. Leitch

RBID : PMC:4607528

Abstract

The RNA-directed DNA methylation (RdDM) pathway can be divided into three phases: 1) small interfering RNA biogenesis, 2) de novo methylation, and 3) chromatin modification. To determine the degree of conservation of this pathway, we searched for key genes among land plants. We used OrthoMCL and the OrthoMCL Viridiplantae database to analyze proteomes of species in bryophytes, lycophytes, monilophytes, gymnosperms, and angiosperms. We also analyzed small RNA size categories and, in two gymnosperms, cytosine methylation in ribosomal DNA. Six proteins were restricted to angiosperms, these being NRPD4/NRPE4, RDM1, DMS3 (defective in meristem silencing 3), SHH1 (SAWADEE homeodomain homolog 1), KTF1, and SUVR2, although we failed to find the last three proteins in Fritillaria persica, a species with a giant genome. Small RNAs of 24 nt in length were abundant only in angiosperms. Phylogenetic analyses of Dicer-like (DCL) proteins showed that DCL2 was restricted to seed plants, although it was absent in Gnetum gnemon and Welwitschia mirabilis. The data suggest that phases (1) and (2) of the RdDM pathway, described for model angiosperms, evolved with angiosperms. The absence of some features of RdDM in F. persica may be associated with its large genome. Phase (3) is probably the most conserved part of the pathway across land plants. DCL2, which is involved in virus defense and interacts with the canonical RdDM pathway to facilitate CHH methylation, is absent outside seed plants. Its absence in G. gnemon and W. mirabilis, coupled with distinctive patterns of CHH methylation, suggests a secondary loss of DCL2 following the divergence of Gnetales.


DOI: 10.1093/gbe/evv171
PubMed: 26338185
PubMed Central: 4607528

The document in XML format

<record>
<TEI>
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en">Angiosperms Are Unique among Land Plant Lineages in the Occurrence of Key Genes in the RNA-Directed DNA Methylation (RdDM) Pathway</title>
<author>
<name sortKey="Ma, Lu" sort="Ma, Lu" uniqKey="Ma L" first="Lu" last="Ma">Lu Ma</name>
<affiliation>
<nlm:aff id="evv171-AFF1">School of Biological and Chemical Sciences, Queen Mary University of London, United Kingdom</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Hatlen, Andrea" sort="Hatlen, Andrea" uniqKey="Hatlen A" first="Andrea" last="Hatlen">Andrea Hatlen</name>
<affiliation>
<nlm:aff id="evv171-AFF1">School of Biological and Chemical Sciences, Queen Mary University of London, United Kingdom</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Kelly, Laura J" sort="Kelly, Laura J" uniqKey="Kelly L" first="Laura J." last="Kelly">Laura J. Kelly</name>
<affiliation>
<nlm:aff id="evv171-AFF1">School of Biological and Chemical Sciences, Queen Mary University of London, United Kingdom</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Becher, Hannes" sort="Becher, Hannes" uniqKey="Becher H" first="Hannes" last="Becher">Hannes Becher</name>
<affiliation>
<nlm:aff id="evv171-AFF1">School of Biological and Chemical Sciences, Queen Mary University of London, United Kingdom</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Wang, Wencai" sort="Wang, Wencai" uniqKey="Wang W" first="Wencai" last="Wang">Wencai Wang</name>
<affiliation>
<nlm:aff id="evv171-AFF1">School of Biological and Chemical Sciences, Queen Mary University of London, United Kingdom</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Kovarik, Ales" sort="Kovarik, Ales" uniqKey="Kovarik A" first="Ales" last="Kovarik">Ales Kovarik</name>
<affiliation>
<nlm:aff id="evv171-AFF2">Department of Molecular Epigenetics, Institute of Biophysics, Academy of Sciences of the Czech Republic, Brno, Czech Republic</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Leitch, Ilia J" sort="Leitch, Ilia J" uniqKey="Leitch I" first="Ilia J." last="Leitch">Ilia J. Leitch</name>
<affiliation>
<nlm:aff id="evv171-AFF3">Department of Comparative Plant and Fungal Biology, Royal Botanic Gardens, Kew, Richmond, Surrey, United Kingdom</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Leitch, Andrew R" sort="Leitch, Andrew R" uniqKey="Leitch A" first="Andrew R." last="Leitch">Andrew R. Leitch</name>
<affiliation>
<nlm:aff id="evv171-AFF1">School of Biological and Chemical Sciences, Queen Mary University of London, United Kingdom</nlm:aff>
</affiliation>
</author>
</titleStmt>
<publicationStmt>
<idno type="wicri:source">PMC</idno>
<idno type="pmid">26338185</idno>
<idno type="pmc">4607528</idno>
<idno type="url">http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4607528</idno>
<idno type="RBID">PMC:4607528</idno>
<idno type="doi">10.1093/gbe/evv171</idno>
<date when="2015">2015</date>
<idno type="wicri:Area/Pmc/Corpus">000102</idno>
</publicationStmt>
<sourceDesc>
<biblStruct>
<analytic>
<title xml:lang="en" level="a" type="main">Angiosperms Are Unique among Land Plant Lineages in the Occurrence of Key Genes in the RNA-Directed DNA Methylation (RdDM) Pathway</title>
<author>
<name sortKey="Ma, Lu" sort="Ma, Lu" uniqKey="Ma L" first="Lu" last="Ma">Lu Ma</name>
<affiliation>
<nlm:aff id="evv171-AFF1">School of Biological and Chemical Sciences, Queen Mary University of London, United Kingdom</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Hatlen, Andrea" sort="Hatlen, Andrea" uniqKey="Hatlen A" first="Andrea" last="Hatlen">Andrea Hatlen</name>
<affiliation>
<nlm:aff id="evv171-AFF1">School of Biological and Chemical Sciences, Queen Mary University of London, United Kingdom</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Kelly, Laura J" sort="Kelly, Laura J" uniqKey="Kelly L" first="Laura J." last="Kelly">Laura J. Kelly</name>
<affiliation>
<nlm:aff id="evv171-AFF1">School of Biological and Chemical Sciences, Queen Mary University of London, United Kingdom</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Becher, Hannes" sort="Becher, Hannes" uniqKey="Becher H" first="Hannes" last="Becher">Hannes Becher</name>
<affiliation>
<nlm:aff id="evv171-AFF1">School of Biological and Chemical Sciences, Queen Mary University of London, United Kingdom</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Wang, Wencai" sort="Wang, Wencai" uniqKey="Wang W" first="Wencai" last="Wang">Wencai Wang</name>
<affiliation>
<nlm:aff id="evv171-AFF1">School of Biological and Chemical Sciences, Queen Mary University of London, United Kingdom</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Kovarik, Ales" sort="Kovarik, Ales" uniqKey="Kovarik A" first="Ales" last="Kovarik">Ales Kovarik</name>
<affiliation>
<nlm:aff id="evv171-AFF2">Department of Molecular Epigenetics, Institute of Biophysics, Academy of Sciences of the Czech Republic, Brno, Czech Republic</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Leitch, Ilia J" sort="Leitch, Ilia J" uniqKey="Leitch I" first="Ilia J." last="Leitch">Ilia J. Leitch</name>
<affiliation>
<nlm:aff id="evv171-AFF3">Department of Comparative Plant and Fungal Biology, Royal Botanic Gardens, Kew, Richmond, Surrey, United Kingdom</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Leitch, Andrew R" sort="Leitch, Andrew R" uniqKey="Leitch A" first="Andrew R." last="Leitch">Andrew R. Leitch</name>
<affiliation>
<nlm:aff id="evv171-AFF1">School of Biological and Chemical Sciences, Queen Mary University of London, United Kingdom</nlm:aff>
</affiliation>
</author>
</analytic>
<series>
<title level="j">Genome Biology and Evolution</title>
<idno type="eISSN">1759-6653</idno>
<imprint>
<date when="2015">2015</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
</fileDesc>
<profileDesc>
<textClass></textClass>
</profileDesc>
</teiHeader>
<front>
<div type="abstract" xml:lang="en">
<p>The RNA-directed DNA methylation (RdDM) pathway can be divided into three phases: 1) small interfering RNA biogenesis, 2) de novo methylation, and 3) chromatin modification. To determine the degree of conservation of this pathway, we searched for key genes among land plants. We used OrthoMCL and the OrthoMCL Viridiplantae database to analyze proteomes of species in bryophytes, lycophytes, monilophytes, gymnosperms, and angiosperms. We also analyzed small RNA size categories and, in two gymnosperms, cytosine methylation in ribosomal DNA. Six proteins were restricted to angiosperms, these being NRPD4/NRPE4, RDM1, DMS3 (defective in meristem silencing 3), SHH1 (SAWADEE homeodomain homolog 1), KTF1, and SUVR2, although we failed to find the last three proteins in
<italic>Fritillaria persica</italic>
, a species with a giant genome. Small RNAs of 24 nt in length were abundant only in angiosperms. Phylogenetic analyses of Dicer-like (DCL) proteins showed that DCL2 was restricted to seed plants, although it was absent in
<italic>Gnetum gnemon</italic>
and
<italic>Welwitschia mirabilis</italic>
. The data suggest that phases (1) and (2) of the RdDM pathway, described for model angiosperms, evolved with angiosperms. The absence of some features of RdDM in
<italic>F. persica</italic>
may be associated with its large genome. Phase (3) is probably the most conserved part of the pathway across land plants. DCL2, which is involved in virus defense and interacts with the canonical RdDM pathway to facilitate CHH methylation, is absent outside seed plants. Its absence in
<italic>G. gnemon</italic>
and
<italic>W. mirabilis</italic>
, coupled with distinctive patterns of CHH methylation, suggests a secondary loss of DCL2 following the divergence of Gnetales.</p>
</div>
</front>
<back>
<div1 type="bibliography">
<listBibl>
<biblStruct>
<analytic>
<author>
<name sortKey="Andika, Ib" uniqKey="Andika I">IB Andika</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Becher, H" uniqKey="Becher H">H Becher</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Bennetzen, Jl" uniqKey="Bennetzen J">JL Bennetzen</name>
</author>
<author>
<name sortKey="Wang, H" uniqKey="Wang H">H Wang</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Bernstein, E" uniqKey="Bernstein E">E Bernstein</name>
</author>
<author>
<name sortKey="Caudy, Aa" uniqKey="Caudy A">AA Caudy</name>
</author>
<author>
<name sortKey="Hammond, Sm" uniqKey="Hammond S">SM Hammond</name>
</author>
<author>
<name sortKey="Hannon, Gj" uniqKey="Hannon G">GJ Hannon</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Bohmdorfer, G" uniqKey="Bohmdorfer G">G Böhmdorfer</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Capella Gutierrez, S" uniqKey="Capella Gutierrez S">S Capella-Gutiérrez</name>
</author>
<author>
<name sortKey="Silla Martinez, Jm" uniqKey="Silla Martinez J">JM Silla-Martínez</name>
</author>
<author>
<name sortKey="Gabaldon, T" uniqKey="Gabaldon T">T Gabaldón</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Chang, J M" uniqKey="Chang J">J-M Chang</name>
</author>
<author>
<name sortKey="Di Tommaso, P" uniqKey="Di Tommaso P">P Di Tommaso</name>
</author>
<author>
<name sortKey="Lefort, V" uniqKey="Lefort V">V Lefort</name>
</author>
<author>
<name sortKey="Gascuel, O" uniqKey="Gascuel O">O Gascuel</name>
</author>
<author>
<name sortKey="Notredame, C" uniqKey="Notredame C">C Notredame</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Darriba, D" uniqKey="Darriba D">D Darriba</name>
</author>
<author>
<name sortKey="Taboada, Gl" uniqKey="Taboada G">GL Taboada</name>
</author>
<author>
<name sortKey="Doallo, R" uniqKey="Doallo R">R Doallo</name>
</author>
<author>
<name sortKey="Posada, D" uniqKey="Posada D">D Posada</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Dolgosheina, Ev" uniqKey="Dolgosheina E">EV Dolgosheina</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Edgar, Rc" uniqKey="Edgar R">RC Edgar</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Fei, Q" uniqKey="Fei Q">Q Fei</name>
</author>
<author>
<name sortKey="Xia, R" uniqKey="Xia R">R Xia</name>
</author>
<author>
<name sortKey="Meyers, Bc" uniqKey="Meyers B">BC Meyers</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Fischer, S" uniqKey="Fischer S">S Fischer</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Fuchs, J" uniqKey="Fuchs J">J Fuchs</name>
</author>
<author>
<name sortKey="Jovtchev, G" uniqKey="Jovtchev G">G Jovtchev</name>
</author>
<author>
<name sortKey="Schubert, I" uniqKey="Schubert I">I Schubert</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Gao, Z" uniqKey="Gao Z">Z Gao</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Grabherr, Mg" uniqKey="Grabherr M">MG Grabherr</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Grover, Ce" uniqKey="Grover C">CE Grover</name>
</author>
<author>
<name sortKey="Wendel, Jf" uniqKey="Wendel J">JF Wendel</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="He, Xj" uniqKey="He X">XJ He</name>
</author>
<author>
<name sortKey="Hsu, Yf" uniqKey="Hsu Y">YF Hsu</name>
</author>
<author>
<name sortKey="Pontes, O" uniqKey="Pontes O">O Pontes</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="He, Xj" uniqKey="He X">XJ He</name>
</author>
<author>
<name sortKey="Hsu, Yf" uniqKey="Hsu Y">YF Hsu</name>
</author>
<author>
<name sortKey="Zhu, S" uniqKey="Zhu S">S Zhu</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Hetzl, J" uniqKey="Hetzl J">J Hetzl</name>
</author>
<author>
<name sortKey="Foerster, Am" uniqKey="Foerster A">AM Foerster</name>
</author>
<author>
<name sortKey="Raidl, G" uniqKey="Raidl G">G Raidl</name>
</author>
<author>
<name sortKey="Mittelsten Scheid, O" uniqKey="Mittelsten Scheid O">O Mittelsten Scheid</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Kelly, Lj" uniqKey="Kelly L">LJ Kelly</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Kenrick, P" uniqKey="Kenrick P">P Kenrick</name>
</author>
<author>
<name sortKey="Crane, Pr" uniqKey="Crane P">PR Crane</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Kovarik, A" uniqKey="Kovarik A">A Kovarik</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Kovach, A" uniqKey="Kovach A">A Kovach</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Law, J" uniqKey="Law J">J Law</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Law, J" uniqKey="Law J">J Law</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Leitch, Ar" uniqKey="Leitch A">AR Leitch</name>
</author>
<author>
<name sortKey="Leitch, Ij" uniqKey="Leitch I">IJ Leitch</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Leitch, Ij" uniqKey="Leitch I">IJ Leitch</name>
</author>
<author>
<name sortKey="Leitch, Ar" uniqKey="Leitch A">AR Leitch</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Li, Fw" uniqKey="Li F">FW Li</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Li, L" uniqKey="Li L">L Li</name>
</author>
<author>
<name sortKey="Stoeckert, Cj" uniqKey="Stoeckert C">CJ Stoeckert</name>
</author>
<author>
<name sortKey="Roos, Ds" uniqKey="Roos D">DS Roos</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Lim, Ky" uniqKey="Lim K">KY Lim</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Lisch, D" uniqKey="Lisch D">D Lisch</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Liu, Q" uniqKey="Liu Q">Q Liu</name>
</author>
<author>
<name sortKey="Feng, Y" uniqKey="Feng Y">Y Feng</name>
</author>
<author>
<name sortKey="Zhu, Z" uniqKey="Zhu Z">Z Zhu</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Margis, R" uniqKey="Margis R">R Margis</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Mathews, S" uniqKey="Mathews S">S Mathews</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Matzke, Ma" uniqKey="Matzke M">MA Matzke</name>
</author>
<author>
<name sortKey="Kanno, T" uniqKey="Kanno T">T Kanno</name>
</author>
<author>
<name sortKey="Matzke, Ajm" uniqKey="Matzke A">AJM Matzke</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Matzke, Ma" uniqKey="Matzke M">MA Matzke</name>
</author>
<author>
<name sortKey="Mosher, Ra" uniqKey="Mosher R">RA Mosher</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Metcalfe, Cj" uniqKey="Metcalfe C">CJ Metcalfe</name>
</author>
<author>
<name sortKey="Casane, D" uniqKey="Casane D">D Casane</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Morse, Am" uniqKey="Morse A">AM Morse</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Notredame, C" uniqKey="Notredame C">C Notredame</name>
</author>
<author>
<name sortKey="Higgins, Dg" uniqKey="Higgins D">DG Higgins</name>
</author>
<author>
<name sortKey="Heringa, J" uniqKey="Heringa J">J Heringa</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Nuthikattu, S" uniqKey="Nuthikattu S">S Nuthikattu</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Nystedt, B" uniqKey="Nystedt B">B Nystedt</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Parchman, Tl" uniqKey="Parchman T">TL Parchman</name>
</author>
<author>
<name sortKey="Geist, Ks" uniqKey="Geist K">KS Geist</name>
</author>
<author>
<name sortKey="Grahnen, Ja" uniqKey="Grahnen J">JA Grahnen</name>
</author>
<author>
<name sortKey="Benkman, Cw" uniqKey="Benkman C">CW Benkman</name>
</author>
<author>
<name sortKey="Buerkle, Ca" uniqKey="Buerkle C">CA Buerkle</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Ream, Ts" uniqKey="Ream T">TS Ream</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Sasaki, T" uniqKey="Sasaki T">T Sasaki</name>
</author>
<author>
<name sortKey="Lorkovic, Zj" uniqKey="Lorkovic Z">ZJ Lorković</name>
</author>
<author>
<name sortKey="Liang, Sc" uniqKey="Liang S">SC Liang</name>
</author>
<author>
<name sortKey="Matzke, Ajm" uniqKey="Matzke A">AJM Matzke</name>
</author>
<author>
<name sortKey="Matzke, Ma" uniqKey="Matzke M">MA Matzke</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Schauer, Se" uniqKey="Schauer S">SE Schauer</name>
</author>
<author>
<name sortKey="Jacobsen, Se" uniqKey="Jacobsen S">SE Jacobsen</name>
</author>
<author>
<name sortKey="Meinke, Dw" uniqKey="Meinke D">DW Meinke</name>
</author>
<author>
<name sortKey="Ray, A" uniqKey="Ray A">A Ray</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Song, X" uniqKey="Song X">X Song</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Stamatakis, A" uniqKey="Stamatakis A">A Stamatakis</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Stroud, H" uniqKey="Stroud H">H Stroud</name>
</author>
<author>
<name sortKey="Greenberg, Mvc" uniqKey="Greenberg M">MVC Greenberg</name>
</author>
<author>
<name sortKey="Feng, S" uniqKey="Feng S">S Feng</name>
</author>
<author>
<name sortKey="Bernatavichute, Yv" uniqKey="Bernatavichute Y">YV Bernatavichute</name>
</author>
<author>
<name sortKey="Jacobsen, Se" uniqKey="Jacobsen S">SE Jacobsen</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Tucker, Sl" uniqKey="Tucker S">SL Tucker</name>
</author>
<author>
<name sortKey="Reece, J" uniqKey="Reece J">J Reece</name>
</author>
<author>
<name sortKey="Ream, Ts" uniqKey="Ream T">TS Ream</name>
</author>
<author>
<name sortKey="Pikaard, Cs" uniqKey="Pikaard C">CS Pikaard</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Wan, L C" uniqKey="Wan L">L-C Wan</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Wellman, Ch" uniqKey="Wellman C">CH Wellman</name>
</author>
<author>
<name sortKey="Osterloff, Pl" uniqKey="Osterloff P">PL Osterloff</name>
</author>
<author>
<name sortKey="Mohiuddin, U" uniqKey="Mohiuddin U">U Mohiuddin</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Woodhouse, Mr" uniqKey="Woodhouse M">MR Woodhouse</name>
</author>
<author>
<name sortKey="Freeling, M" uniqKey="Freeling M">M Freeling</name>
</author>
<author>
<name sortKey="Lisch, D" uniqKey="Lisch D">D Lisch</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Zhang, H" uniqKey="Zhang H">H Zhang</name>
</author>
<author>
<name sortKey="Ma, Zy" uniqKey="Ma Z">ZY Ma</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Zhang, J" uniqKey="Zhang J">J Zhang</name>
</author>
<author>
<name sortKey="Wu, T" uniqKey="Wu T">T Wu</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Zimmermann, P" uniqKey="Zimmermann P">P Zimmermann</name>
</author>
<author>
<name sortKey="Hirsch Hoffmann, M" uniqKey="Hirsch Hoffmann M">M Hirsch-Hoffmann</name>
</author>
<author>
<name sortKey="Hennig, L" uniqKey="Hennig L">L Hennig</name>
</author>
<author>
<name sortKey="Gruissem, W" uniqKey="Gruissem W">W Gruissem</name>
</author>
</analytic>
</biblStruct>
</listBibl>
</div1>
</back>
</TEI>
<pmc article-type="research-article">
<pmc-dir>properties open_access</pmc-dir>
<front>
<journal-meta>
<journal-id journal-id-type="nlm-ta">Genome Biol Evol</journal-id>
<journal-id journal-id-type="iso-abbrev">Genome Biol Evol</journal-id>
<journal-id journal-id-type="publisher-id">gbe</journal-id>
<journal-id journal-id-type="hwp">gbe</journal-id>
<journal-title-group>
<journal-title>Genome Biology and Evolution</journal-title>
</journal-title-group>
<issn pub-type="epub">1759-6653</issn>
<publisher>
<publisher-name>Oxford University Press</publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="pmid">26338185</article-id>
<article-id pub-id-type="pmc">4607528</article-id>
<article-id pub-id-type="doi">10.1093/gbe/evv171</article-id>
<article-id pub-id-type="publisher-id">evv171</article-id>
<article-categories>
<subj-group subj-group-type="heading">
<subject>Research Article</subject>
</subj-group>
</article-categories>
<title-group>
<article-title>Angiosperms Are Unique among Land Plant Lineages in the Occurrence of Key Genes in the RNA-Directed DNA Methylation (RdDM) Pathway</article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author">
<name>
<surname>Ma</surname>
<given-names>Lu</given-names>
</name>
<xref ref-type="aff" rid="evv171-AFF1">
<sup>1</sup>
</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Hatlen</surname>
<given-names>Andrea</given-names>
</name>
<xref ref-type="aff" rid="evv171-AFF1">
<sup>1</sup>
</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Kelly</surname>
<given-names>Laura J.</given-names>
</name>
<xref ref-type="aff" rid="evv171-AFF1">
<sup>1</sup>
</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Becher</surname>
<given-names>Hannes</given-names>
</name>
<xref ref-type="aff" rid="evv171-AFF1">
<sup>1</sup>
</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Wang</surname>
<given-names>Wencai</given-names>
</name>
<xref ref-type="aff" rid="evv171-AFF1">
<sup>1</sup>
</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Kovarik</surname>
<given-names>Ales</given-names>
</name>
<xref ref-type="aff" rid="evv171-AFF2">
<sup>2</sup>
</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Leitch</surname>
<given-names>Ilia J.</given-names>
</name>
<xref ref-type="aff" rid="evv171-AFF3">
<sup>3</sup>
</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Leitch</surname>
<given-names>Andrew R.</given-names>
</name>
<xref ref-type="aff" rid="evv171-AFF1">
<sup>1</sup>
</xref>
<xref ref-type="corresp" rid="evv171-COR1">*</xref>
</contrib>
<aff id="evv171-AFF1">
<sup>1</sup>
School of Biological and Chemical Sciences, Queen Mary University of London, United Kingdom</aff>
<aff id="evv171-AFF2">
<sup>2</sup>
Department of Molecular Epigenetics, Institute of Biophysics, Academy of Sciences of the Czech Republic, Brno, Czech Republic</aff>
<aff id="evv171-AFF3">
<sup>3</sup>
Department of Comparative Plant and Fungal Biology, Royal Botanic Gardens, Kew, Richmond, Surrey, United Kingdom</aff>
</contrib-group>
<author-notes>
<corresp id="evv171-COR1">*Corresponding author: E-mail:
<email>a.r.leitch@qmul.ac.uk</email>
.</corresp>
<fn id="FN100">
<p>
<bold>Associate editor:</bold>
Maria Costantini</p>
</fn>
</author-notes>
<pub-date pub-type="collection">
<month>9</month>
<year>2015</year>
</pub-date>
<pub-date pub-type="epub">
<day>02</day>
<month>9</month>
<year>2015</year>
</pub-date>
<pub-date pub-type="pmc-release">
<day>02</day>
<month>9</month>
<year>2015</year>
</pub-date>
<pmc-comment> PMC Release delay is 0 months and 0 days and was based on the . </pmc-comment>
<volume>7</volume>
<issue>9</issue>
<fpage>2648</fpage>
<lpage>2662</lpage>
<history>
<date date-type="accepted">
<day>26</day>
<month>8</month>
<year>2015</year>
</date>
</history>
<permissions>
<copyright-statement>© The Author(s) 2015. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.</copyright-statement>
<copyright-year>2015</copyright-year>
<license xlink:href="http://creativecommons.org/licenses/by/4.0/" license-type="creative-commons">
<license-p>This is an Open Access article distributed under the terms of the Creative Commons Attribution License (
<ext-link ext-link-type="uri" xlink:href="http://creativecommons.org/licenses/by/4.0/">http://creativecommons.org/licenses/by/4.0/</ext-link>
), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.</license-p>
</license>
</permissions>
<abstract>
<p>The RNA-directed DNA methylation (RdDM) pathway can be divided into three phases: 1) small interfering RNA biogenesis, 2) de novo methylation, and 3) chromatin modification. To determine the degree of conservation of this pathway, we searched for key genes among land plants. We used OrthoMCL and the OrthoMCL Viridiplantae database to analyze proteomes of species in bryophytes, lycophytes, monilophytes, gymnosperms, and angiosperms. We also analyzed small RNA size categories and, in two gymnosperms, cytosine methylation in ribosomal DNA. Six proteins were restricted to angiosperms, these being NRPD4/NRPE4, RDM1, DMS3 (defective in meristem silencing 3), SHH1 (SAWADEE homeodomain homolog 1), KTF1, and SUVR2, although we failed to find the last three proteins in
<italic>Fritillaria persica</italic>
, a species with a giant genome. Small RNAs of 24 nt in length were abundant only in angiosperms. Phylogenetic analyses of Dicer-like (DCL) proteins showed that DCL2 was restricted to seed plants, although it was absent in
<italic>Gnetum gnemon</italic>
and
<italic>Welwitschia mirabilis</italic>
. The data suggest that phases (1) and (2) of the RdDM pathway, described for model angiosperms, evolved with angiosperms. The absence of some features of RdDM in
<italic>F. persica</italic>
may be associated with its large genome. Phase (3) is probably the most conserved part of the pathway across land plants. DCL2, which is involved in virus defense and interacts with the canonical RdDM pathway to facilitate CHH methylation, is absent outside seed plants. Its absence in
<italic>G. gnemon</italic>
and
<italic>W. mirabilis</italic>
, coupled with distinctive patterns of CHH methylation, suggests a secondary loss of DCL2 following the divergence of Gnetales.</p>
</abstract>
<kwd-group>
<kwd>chromatin modification</kwd>
<kwd>DNA methylation</kwd>
<kwd>evolution</kwd>
<kwd>RNA-directed DNA methylation</kwd>
<kwd>seed plants</kwd>
</kwd-group>
<counts>
<page-count count="15"></page-count>
</counts>
</article-meta>
</front>
<body>
<sec sec-type="intro">
<title>Introduction</title>
<p>The first land plants appeared in the fossil record around 470–480 Ma (
<xref rid="evv171-B21" ref-type="bibr">Kenrick and Crane 1997</xref>
;
<xref rid="evv171-B51" ref-type="bibr">Wellman et al. 2003</xref>
) and the species which survive today can be broadly divided into four major groups: 1) the nonvascular plants, which comprise bryophytes (liverworts, mosses, and hornworts); 2) the lycophytes, the earliest diverging extant group of vascular plants; 3) the monilophytes, which include the horsetails (
<italic>Equisetum</italic>
), whisk ferns (e.g.,
<italic>Psilotum</italic>
), ophioglossoid ferns (e.g.,
<italic>Ophioglossum</italic>
), and true ferns; and 4) the seed plants comprising angiosperms (flowering plants) and gymnosperms (naked-seed plants). With the huge increase in genomic data available for species belonging to these different groups, it has become clear that the genome dynamics of each group are distinctive (reviewed in Leitch and Leitch [
<xref rid="evv171-B26" ref-type="bibr">2012</xref>
,
<xref rid="evv171-B27" ref-type="bibr">2013</xref>
]). Here, we explore the composition of the epigenetic machinery in representatives of these major groups and suggest how the differences encountered might have played a role in shaping their genome dynamics. Specifically, we compare the genes involved in controlling the RNA-directed DNA methylation (RdDM) pathway, with particular emphasis on angiosperms and gymnosperms, and include representatives of the other land plant groups to determine the direction of change in the evolution of this pathway.</p>
<p>Gymnosperms comprise approximately 780 species and are represented by four distinct lineages, the cycads (Cycadales),
<italic>Ginkgo</italic>
(Ginkgoales), Gnetales, and Coniferales (conifers). Our understanding of their genome structure, and the epigenetic processes that regulate their genomes, is largely restricted to Pinaceae (
<xref rid="evv171-B26" ref-type="bibr">Leitch and Leitch 2012</xref>
). Outside this family, understanding is minimal and in most cases missing entirely. Nevertheless, we do know that gymnosperms have reduced frequencies of polyploidy in all but
<italic>Ephedra</italic>
(Gnetales;
<xref rid="evv171-B26" ref-type="bibr">Leitch and Leitch 2012</xref>
) and there is some evidence of alternative mechanisms to regulate the evolution of their genome, for example, different epigenetic marks associated with heterochromatin (
<xref rid="evv171-B13" ref-type="bibr">Fuchs et al. 2008</xref>
), higher levels of transcription of retrotransposons in conifers than angiosperms (
<xref rid="evv171-B38" ref-type="bibr">Morse et al. 2009</xref>
;
<xref rid="evv171-B42" ref-type="bibr">Parchman et al. 2010</xref>
), and lower levels of unequal recombination to remove the long-terminal repeats (LTRs) from LTR retrotransposons (
<xref rid="evv171-B41" ref-type="bibr">Nystedt et al. 2013</xref>
). Such differences have been postulated to have fundamentally shaped patterns of genome evolution in seed plants (
<xref rid="evv171-B26" ref-type="bibr">Leitch and Leitch 2012</xref>
).</p>
<p>It is widely recognized that the diversity of genome sizes in land plants arises from differences in the accumulation of repetitive elements, including tandem and dispersed repeats, as well as the frequency of polyploidy, or whole-genome duplication, in the lineages’ ancestry. This article focuses on the evolution of mechanisms that control the accumulation of repeats and searches for differences in these mechanisms between representative species of the major land plant lineages.</p>
<p>Regulation of repeats in angiosperms broadly falls into two categories: 1) RdDM de novo methylation and 2) maintenance methylation pathways, the latter involving genes which play a role in CG and CHG methylation. This work focuses on the RdDM pathway, leading to the heterochromatinization of repeats in angiosperms, as summarized in
<xref ref-type="fig" rid="evv171-F1">figure 1</xref>
, which outlines the canonical pathway. Briefly, the RdDM pathway can be divided into three phases: 1) RNA polymerase IV (Pol IV)-dependent small interfering RNA (siRNA) biogenesis, 2) RNA polymerase V (Pol V)-mediated de novo DNA methylation, and 3) chromatin alteration or modification (reviewed in
<xref rid="evv171-B36" ref-type="bibr">Matzke and Mosher [2014]</xref>
). In the first phase, Pol IV synthesizes RNA transcripts, which are made double stranded by RNA-dependent RNA polymerase 2 (RDR2) and “diced” (cut) into 24 nt siRNAs by the Dicer-like 3 (DCL3) endonuclease. These siRNAs are then complexed with the argonaute (AGO) protein AGO4, directed back to the nucleus, and targeted to DNA repeats through sequence homology. In phase (2), Pol V further transcribes the repeats and, in association with the 24 nt siRNAs, facilitates RdDM in a poorly understood process. Finally, in phase (3), the products of genes involved in histone modification and chromatin folding heterochromatinize the DNA sequence. This process “seeds” methylation, which may spread into surrounding genic regions and be extended and supplemented by the maintenance methylation pathways. These typically involve the recognition and full methylation of hemimethylated CG and CHG sites by the DNA methyltransferases methyltransferase 1 (MET1) and chromomethylase 3 (CMT3), respectively, and of CHH sites by CMT2 (
<xref rid="evv171-B36" ref-type="bibr">Matzke and Mosher 2014</xref>
).
<fig id="evv171-F1" orientation="portrait" position="float">
<label>F
<sc>ig</sc>
. 1.—</label>
<caption>
<p>The RdDM pathway, taken from
<xref rid="evv171-B36" ref-type="bibr">Matzke and Mosher (2014)</xref>
. The genes involved are shown, and details are given, in the source reference. The pathway is divided into three key phases, 1) Pol IV-dependent siRNA biogenesis, 2) Pol V-mediated de novo DNA methylation, and 3) chromatin alterations. An overview of their activity is given in the introduction and the full names of genes given in
<ext-link ext-link-type="uri" xlink:href="http://gbe.oxfordjournals.org/lookup/suppl/doi:10.1093/gbe/evv171/-/DC1">supplementary table S1</ext-link>
,
<ext-link ext-link-type="uri" xlink:href="http://gbe.oxfordjournals.org/lookup/suppl/doi:10.1093/gbe/evv171/-/DC1">Supplementary Material</ext-link>
online.</p>
</caption>
<graphic xlink:href="evv171f1p"></graphic>
</fig>
</p>
<p>The vast majority of research into the genes involved in the RdDM pathway in plants has been conducted in
<italic>Arabidopsis thaliana</italic>
. Thus, to search for these genes across the different land plant groups, we retrieved the key genes and their paralogues from
<italic>Ar. thaliana</italic>
(
<xref ref-type="fig" rid="evv171-F2">fig. 2</xref>
) and searched for them in OrthoMCL cluster groups generated from publicly available proteome sequence databases of bryophytes, lycophytes, monilophytes, representatives from the gymnosperm lineages, the early diverging angiosperm
<italic>Amborella trichopoda</italic>
, and some model angiosperms (e.g.,
<italic>Zea mays</italic>
). In addition, gymnosperms have significantly larger genomes than most angiosperms, and we hypothesized that this difference may be due to different activities of RdDM. To test this hypothesis, we also searched the transcriptome of
<italic>Fritillaria persica</italic>
, an angiosperm with a particularly large genome (1C = 41.21 pg;
<xref rid="evv171-B20" ref-type="bibr">Kelly et al. 2015</xref>
).
<fig id="evv171-F2" orientation="portrait" position="float">
<label>F
<sc>ig</sc>
. 2.—</label>
<caption>
<p>Key genes of the RdDM pathway taken from
<xref rid="evv171-B36" ref-type="bibr">Matzke and Mosher (2014)</xref>
and
<xref rid="evv171-B35" ref-type="bibr">Matzke et al. (2015)</xref>
that were analyzed here. Genes in bold were not detected outside angiosperms. The genes are grouped into three categories corresponding to the three phases of the RdDM pathway shown in
<xref ref-type="fig" rid="evv171-F1">figure 1</xref>
.</p>
</caption>
<graphic xlink:href="evv171f2p"></graphic>
</fig>
</p>
</sec>
<sec sec-type="materials|methods">
<title>Materials and Methods</title>
<sec>
<title>Data Used to Search for Orthologues</title>
<p>A flow diagram illustrating our bioinformatic approaches is shown in
<ext-link ext-link-type="uri" xlink:href="http://gbe.oxfordjournals.org/lookup/suppl/doi:10.1093/gbe/evv171/-/DC1">supplementary figure S1</ext-link>
,
<ext-link ext-link-type="uri" xlink:href="http://gbe.oxfordjournals.org/lookup/suppl/doi:10.1093/gbe/evv171/-/DC1">Supplementary Material</ext-link>
online. Twelve proteomes from representative taxa of land plants were selected comprising 1) the angiosperms
<italic>Ar. </italic>
<italic>thaliana</italic>
(L.) Heynh.,
<italic>Am. </italic>
<italic>trichopoda</italic>
Baill.,
<italic>F. </italic>
<italic>persica</italic>
L.,
<italic>Z. </italic>
<italic>mays</italic>
L.; 2) the gymnosperms
<italic>Ginkgo biloba</italic>
L.,
<italic>Gnetum gnemon</italic>
L.,
<italic>Picea abies</italic>
(L.) H.Karst.,
<italic>Pinus taeda</italic>
L.,
<italic>Welwitschia mirabilis</italic>
Hook.f.; 3) the monilophyte
<italic>Pteridium aquilinum</italic>
(L.) Kuhn; 4) the lycophyte
<italic>Selaginella moellendorffii</italic>
Hieron; and 5) the bryophyte
<italic>Physcomitrella patens</italic>
(Hedw.) Bruch and Schimp.</p>
<p>Proteome data for eight of these species were retrieved from public databases (see
<xref ref-type="table" rid="evv171-T1">table 1</xref>
). For the remaining four species we derived proteome data from transcriptomes. For
<italic>Pt. aquilinum</italic>
we used the transcriptome data from
<xref rid="evv171-B28" ref-type="bibr">Li et al. (2014)</xref>
. For
<italic>Gn. gnemon</italic>
, we downloaded raw Illumina reads from the NCBI Sequence Read Archive (SRA accession ERR364403;
<xref ref-type="table" rid="evv171-T1">table 1</xref>
). For
<italic>F. persica</italic>
and
<italic>W. mirabilis</italic>
, we used new transcriptomic data (see below), and reads from the mRNA library of
<italic>F. persica</italic>
,
<italic>Gn. gnemon</italic>
, and
<italic>W. mirabilis</italic>
were de novo assembled (
<xref ref-type="table" rid="evv171-T1">table 1</xref>
) using Trinity (version r2013-02-25;
<xref rid="evv171-B15" ref-type="bibr">Grabherr et al. 2011</xref>
) with default settings. TransDecoder was then used to identify the protein-coding regions from the de novo assembled contigs using default settings and keeping sequences longer than 100 amino acids.
<table-wrap id="evv171-T1" orientation="portrait" position="float">
<label>Table 1</label>
<caption>
<p>Transcriptomes, Proteomes, and sRNA Data Used in This Study</p>
</caption>
<table frame="hsides" rules="groups">
<thead align="left">
<tr>
<th rowspan="1" colspan="1">Species</th>
<th rowspan="1" colspan="1">Abbreviation</th>
<th rowspan="1" colspan="1">Source of Proteomes</th>
<th rowspan="1" colspan="1">No. of Proteins</th>
<th rowspan="1" colspan="1">Source of sRNA</th>
<th rowspan="1" colspan="1">Tissue for sRNA</th>
<th rowspan="1" colspan="1">No. of sRNAs (18–26 nt)</th>
</tr>
</thead>
<tbody align="left">
<tr>
<td rowspan="1" colspan="1">
<bold>Angiosperms</bold>
</td>
<td rowspan="1" colspan="1"></td>
<td rowspan="1" colspan="1"></td>
<td rowspan="1" colspan="1"></td>
<td rowspan="1" colspan="1"></td>
<td rowspan="1" colspan="1"></td>
<td rowspan="1" colspan="1"></td>
</tr>
<tr>
<td rowspan="1" colspan="1">    
<italic>Amborella trichopoda</italic>
</td>
<td align="left" rowspan="1" colspan="1">ATRI</td>
<td align="left" rowspan="1" colspan="1">Phytozome v10 (
<ext-link ext-link-type="uri" xlink:href="http://phytozome.jgi.doe.gov/">http://phytozome.jgi.doe.gov/</ext-link>
)</td>
<td align="left" rowspan="1" colspan="1">26,846</td>
<td align="left" rowspan="1" colspan="1">
<ext-link ext-link-type="uri" xlink:href="http://smallrna.udel.edu/data.php">http://smallrna.udel.edu/data.php</ext-link>
</td>
<td align="left" rowspan="1" colspan="1">Leaves</td>
<td align="left" rowspan="1" colspan="1">4,003,853</td>
</tr>
<tr>
<td rowspan="1" colspan="1">    
<italic>Arabidopsis thaliana</italic>
</td>
<td align="left" rowspan="1" colspan="1">ATHA</td>
<td align="left" rowspan="1" colspan="1">Phytozome v10 (
<ext-link ext-link-type="uri" xlink:href="http://phytozome.jgi.doe.gov/">http://phytozome.jgi.doe.gov/</ext-link>
)</td>
<td align="left" rowspan="1" colspan="1">35,386</td>
<td align="left" rowspan="1" colspan="1">GEO (GSM154370)</td>
<td align="left" rowspan="1" colspan="1">Leaves</td>
<td align="left" rowspan="1" colspan="1">15,831</td>
</tr>
<tr>
<td rowspan="1" colspan="1">    
<italic>Fritillaria persica</italic>
</td>
<td align="left" rowspan="1" colspan="1">FPER</td>
<td align="left" rowspan="1" colspan="1">Trinity-assembled (see link
<xref ref-type="table-fn" rid="evv171-TF1">
<sup>a</sup>
</xref>
)</td>
<td align="left" rowspan="1" colspan="1">62,452</td>
<td align="left" rowspan="1" colspan="1"></td>
<td align="left" rowspan="1" colspan="1"></td>
<td align="left" rowspan="1" colspan="1"></td>
</tr>
<tr>
<td rowspan="1" colspan="1">    
<italic>Zea mays</italic>
</td>
<td align="left" rowspan="1" colspan="1">ZMAY</td>
<td align="left" rowspan="1" colspan="1">Phytozome v10 (
<ext-link ext-link-type="uri" xlink:href="http://phytozome.jgi.doe.gov/">http://phytozome.jgi.doe.gov/</ext-link>
)</td>
<td align="left" rowspan="1" colspan="1">88,760</td>
<td align="left" rowspan="1" colspan="1">
<ext-link ext-link-type="uri" xlink:href="http://smallrna.udel.edu/data.php">http://smallrna.udel.edu/data.php</ext-link>
</td>
<td align="left" rowspan="1" colspan="1">Leaves</td>
<td align="left" rowspan="1" colspan="1">3,662,565</td>
</tr>
<tr>
<td rowspan="1" colspan="1">
<bold>Gymnosperms</bold>
</td>
<td rowspan="1" colspan="1"></td>
<td rowspan="1" colspan="1"></td>
<td rowspan="1" colspan="1"></td>
<td rowspan="1" colspan="1"></td>
<td rowspan="1" colspan="1"></td>
<td rowspan="1" colspan="1"></td>
</tr>
<tr>
<td rowspan="1" colspan="1">    
<italic>Ginkgo biloba</italic>
</td>
<td align="left" rowspan="1" colspan="1">GBIL</td>
<td align="left" rowspan="1" colspan="1">
<ext-link ext-link-type="ftp" xlink:href="ftp://ftp.plantbiology.msu.edu/">ftp://ftp.plantbiology.msu.edu/pub/data/MPGR/Ginkgo_biloba/</ext-link>
</td>
<td align="left" rowspan="1" colspan="1">65,468</td>
<td align="left" rowspan="1" colspan="1">
<ext-link ext-link-type="uri" xlink:href="http://smallrna">http://smallrna.udel.edu/data.php</ext-link>
</td>
<td align="left" rowspan="1" colspan="1">Leaves</td>
<td align="left" rowspan="1" colspan="1">3,623,537</td>
</tr>
<tr>
<td rowspan="1" colspan="1">    
<italic>Gnetum gnemon</italic>
</td>
<td align="left" rowspan="1" colspan="1">GMON</td>
<td align="left" rowspan="1" colspan="1">Trinity-assembled (see link
<xref ref-type="table-fn" rid="evv171-TF1">
<sup>a</sup>
</xref>
) ERR364403</td>
<td align="left" rowspan="1" colspan="1">26,782</td>
<td align="left" rowspan="1" colspan="1"></td>
<td align="left" rowspan="1" colspan="1"></td>
<td align="left" rowspan="1" colspan="1"></td>
</tr>
<tr>
<td rowspan="1" colspan="1">    
<italic>Welwitschia mirabilis</italic>
</td>
<td align="left" rowspan="1" colspan="1">WMIR</td>
<td align="left" rowspan="1" colspan="1">Trinity-assembled (see link
<xref ref-type="table-fn" rid="evv171-TF1">
<sup>a</sup>
</xref>
)</td>
<td align="left" rowspan="1" colspan="1">18,255</td>
<td align="left" rowspan="1" colspan="1">See link
<xref ref-type="table-fn" rid="evv171-TF1">
<sup>a</sup>
</xref>
</td>
<td align="left" rowspan="1" colspan="1">Leaves</td>
<td align="left" rowspan="1" colspan="1">56,649,017</td>
</tr>
<tr>
<td rowspan="1" colspan="1">    
<italic>Picea abies</italic>
</td>
<td align="left" rowspan="1" colspan="1">PABI</td>
<td align="left" rowspan="1" colspan="1">
<ext-link ext-link-type="uri" xlink:href="http://congenie.org">http://congenie.org</ext-link>
</td>
<td align="left" rowspan="1" colspan="1">66,632</td>
<td align="left" rowspan="1" colspan="1">
<ext-link ext-link-type="uri" xlink:href="http://smallrna">http://smallrna.udel.edu/data.php</ext-link>
</td>
<td align="left" rowspan="1" colspan="1">Needles</td>
<td align="left" rowspan="1" colspan="1">3,010,087</td>
</tr>
<tr>
<td rowspan="1" colspan="1">    
<italic>Pinus taeda</italic>
</td>
<td align="left" rowspan="1" colspan="1">PTAE</td>
<td align="left" rowspan="1" colspan="1">
<ext-link ext-link-type="uri" xlink:href="http://pinegenome.org/pinerefseq/">http://pinegenome.org/pinerefseq/</ext-link>
(v1.01)</td>
<td align="left" rowspan="1" colspan="1">64,809</td>
<td align="left" rowspan="1" colspan="1"></td>
<td align="left" rowspan="1" colspan="1"></td>
<td align="left" rowspan="1" colspan="1"></td>
</tr>
<tr>
<td rowspan="1" colspan="1">
<bold>Monilophytes</bold>
</td>
<td rowspan="1" colspan="1"></td>
<td rowspan="1" colspan="1"></td>
<td rowspan="1" colspan="1"></td>
<td rowspan="1" colspan="1"></td>
<td rowspan="1" colspan="1"></td>
<td rowspan="1" colspan="1"></td>
</tr>
<tr>
<td rowspan="1" colspan="1">    
<italic>Pteridium aquilinum</italic>
</td>
<td align="left" rowspan="1" colspan="1">PAQU</td>
<td align="left" rowspan="1" colspan="1">NCBI Transcriptome Shotgun Assembly (GASP00000000.1)</td>
<td align="left" rowspan="1" colspan="1">23,332</td>
<td align="left" rowspan="1" colspan="1"></td>
<td align="left" rowspan="1" colspan="1"></td>
<td align="left" rowspan="1" colspan="1"></td>
</tr>
<tr>
<td rowspan="1" colspan="1">
<bold>Lycophytes</bold>
</td>
<td rowspan="1" colspan="1"></td>
<td rowspan="1" colspan="1"></td>
<td rowspan="1" colspan="1"></td>
<td rowspan="1" colspan="1"></td>
<td rowspan="1" colspan="1"></td>
<td rowspan="1" colspan="1"></td>
</tr>
<tr>
<td rowspan="1" colspan="1">    
<italic>Selaginella moellendorffii</italic>
</td>
<td align="left" rowspan="1" colspan="1">SMOE</td>
<td align="left" rowspan="1" colspan="1">Phytozome v10 (
<ext-link ext-link-type="uri" xlink:href="http://phytozome.jgi.doe.gov/">http://phytozome.jgi.doe.gov/</ext-link>
)</td>
<td align="left" rowspan="1" colspan="1">22,285</td>
<td align="left" rowspan="1" colspan="1">GEO (GSM176654)</td>
<td align="left" rowspan="1" colspan="1">Above-ground tissues</td>
<td align="left" rowspan="1" colspan="1">130,240</td>
</tr>
<tr>
<td rowspan="1" colspan="1">
<bold>Bryophytes</bold>
</td>
<td rowspan="1" colspan="1"></td>
<td rowspan="1" colspan="1"></td>
<td rowspan="1" colspan="1"></td>
<td rowspan="1" colspan="1"></td>
<td rowspan="1" colspan="1"></td>
<td rowspan="1" colspan="1"></td>
</tr>
<tr>
<td rowspan="1" colspan="1">    
<italic>Physcomitrella patens</italic>
</td>
<td align="left" rowspan="1" colspan="1">PPAT</td>
<td align="left" rowspan="1" colspan="1">Phytozome v10 (
<ext-link ext-link-type="uri" xlink:href="http://phytozome.jgi.doe.gov/">http://phytozome.jgi.doe.gov/</ext-link>
)</td>
<td align="left" rowspan="1" colspan="1">42,392</td>
<td align="left" rowspan="1" colspan="1">GEO (GSM115095)</td>
<td align="left" rowspan="1" colspan="1">Protonemata</td>
<td align="left" rowspan="1" colspan="1">97,999</td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<fn id="evv171-TF100">
<p>N
<sc>ote</sc>
—All URLs were last accessed on September 10, 2015.</p>
</fn>
<fn id="evv171-TF1">
<p>
<sup>a</sup>
<ext-link ext-link-type="uri" xlink:href="https://goo.gl/PrNKfB">https://goo.gl/PrNKfB</ext-link>
.</p>
</fn>
</table-wrap-foot>
</table-wrap>
</p>
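<p>The length filter applied after protein prediction can be illustrated with a minimal Python sketch. It is not the pipeline used in this study: the function names read_fasta and keep_long_proteins are hypothetical, and the sketch assumes that the predicted proteins are available as FASTA text and that a trailing stop symbol (*) should not count towards the length.</p>

```python
def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def keep_long_proteins(records, min_length=100):
    """Keep proteins longer than min_length residues (assumed threshold of 100).

    A trailing '*' (stop symbol) is ignored when measuring length; this is an
    assumption of the sketch, not a documented setting of any tool.
    """
    return [(h, s) for h, s in records if len(s.rstrip("*")) > min_length]
```

<p>In this sketch a protein of exactly 100 residues is discarded, because the text specifies sequences longer than 100 amino acids.</p>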
</sec>
<sec>
<title>RNA Sequencing</title>
<p>For
<italic>F. persica</italic>
we obtained transcriptomic data by extracting mRNA from leaves as in
<xref rid="evv171-B2" ref-type="bibr">Becher et al. (2014)</xref>
. The transcriptome of
<italic>F. persica</italic>
was sequenced by the Centre for Genomic Research at the University of Liverpool, United Kingdom, on a HiSeq2000 (100 bp paired-end reads). Total RNA of
<italic>W. mirabilis</italic>
was extracted from fresh leaf fragments using a mirVana miRNA isolation kit (Life Technologies) following the manufacturer’s instructions. Both transcriptome and small RNA (sRNA) sequencing of
<italic>W. mirabilis</italic>
were conducted by BGI, Shenzhen, China, on the HiSeq2000 platform (library fragment size for
<italic>W. mirabilis</italic>
transcriptome sequencing was 270 bp with 91 bp paired-end reads; library fragment size for
<italic>W. mirabilis</italic>
sRNA sequencing was 107 bp with 50 bp single-end reads).</p>
</sec>
<sec>
<title>Finding OrthoMCL Gene Groups of the RdDM Pathway in Land Plants</title>
<p>Proteomes from the 12 representative land plant taxa were filtered using the pipeline OrthoMCL (v2.0.9;
<xref rid="evv171-B29" ref-type="bibr">Li et al. 2003</xref>
;
<xref rid="evv171-B12" ref-type="bibr">Fischer et al. 2011</xref>
) and the number of “good” protein sequences, as defined by OrthoMCL (using default settings), for each species is shown in
<xref ref-type="table" rid="evv171-T1">table 1</xref>
. We searched the proteomes against each other using BLASTp and then used OrthoMCL to generate groups of proteins (clusters) based on similarity, keeping matches with
<italic>E</italic>
values <1e
<sup>−5</sup>
and ≥50% match along the protein length. The MCL algorithm was used to generate the OrthoMCL clusters of proteins with an inflation value of 1.5. To find orthologues of genes in the RdDM pathway, we first retrieved the protein sequences listed in
<xref ref-type="fig" rid="evv171-F2">figure 2</xref>
(see also
<ext-link ext-link-type="uri" xlink:href="http://gbe.oxfordjournals.org/lookup/suppl/doi:10.1093/gbe/evv171/-/DC1">supplementary table S1</ext-link>
,
<ext-link ext-link-type="uri" xlink:href="http://gbe.oxfordjournals.org/lookup/suppl/doi:10.1093/gbe/evv171/-/DC1">Supplementary Material</ext-link>
online, for the full names of each protein) from
<italic>Ar. thaliana</italic>
, and used these to extract orthologous and paralogous proteins from the OrthoMCL clusters of the other 11 species. The protein information retrieved for DMS3 (defective in meristem silencing 3), KTF1, DCL, and RDM1 (RNA-directed DNA methylation 1) is given in
<ext-link ext-link-type="uri" xlink:href="http://gbe.oxfordjournals.org/lookup/suppl/doi:10.1093/gbe/evv171/-/DC1">supplementary tables S2–S5</ext-link>
,
<ext-link ext-link-type="uri" xlink:href="http://gbe.oxfordjournals.org/lookup/suppl/doi:10.1093/gbe/evv171/-/DC1">Supplementary Material</ext-link>
online, respectively, and all protein sequences from each group are given in FASTA format in the
<ext-link ext-link-type="uri" xlink:href="http://gbe.oxfordjournals.org/lookup/suppl/doi:10.1093/gbe/evv171/-/DC1">Supplementary data file S1</ext-link>
,
<ext-link ext-link-type="uri" xlink:href="http://gbe.oxfordjournals.org/lookup/suppl/doi:10.1093/gbe/evv171/-/DC1">Supplementary Material</ext-link>
online. Custom Python scripts (available on request) were used to extract the protein groups and corresponding protein sequences for each locus in the RdDM pathway based on the gene names used for
<italic>Ar. thaliana</italic>
(reference proteins,
<ext-link ext-link-type="uri" xlink:href="http://gbe.oxfordjournals.org/lookup/suppl/doi:10.1093/gbe/evv171/-/DC1">supplementary table S1</ext-link>
,
<ext-link ext-link-type="uri" xlink:href="http://gbe.oxfordjournals.org/lookup/suppl/doi:10.1093/gbe/evv171/-/DC1">Supplementary Material</ext-link>
online).</p>
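<p>The extraction step described above can be sketched in Python as follows. This is only an illustration under stated assumptions, not the custom scripts used in this study: it assumes a plain-text groups file in which each line reads "group: TAXON|protein TAXON|protein ...", that taxon labels follow the abbreviations in table 1, and that reference proteins are identified by their locus accessions. The function names parse_groups and groups_with_reference are hypothetical.</p>

```python
def parse_groups(lines):
    """Parse lines such as 'g1: ATHA|AT1G01040 ZMAY|z1' into a dictionary.

    Returns {group_name: [(taxon, protein_id), ...]}. Every member is assumed
    to carry a 'TAXON|' prefix.
    """
    groups = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue
        name, _, members = line.partition(":")
        groups[name.strip()] = [tuple(m.split("|", 1)) for m in members.split()]
    return groups


def groups_with_reference(groups, reference_ids, taxon="ATHA"):
    """Map each reference protein of the given taxon to the group containing it.

    Reference proteins that were not clustered are simply absent from the result.
    """
    wanted = set(reference_ids)
    found = {}
    for name, members in groups.items():
        for member_taxon, protein in members:
            if member_taxon == taxon and protein in wanted:
                found[protein] = name
    return found
```

<p>Once the group holding a reference protein is known, its remaining members are the candidate orthologues and paralogues in the other species.</p>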
<p>For six of the RdDM genes, the OrthoMCL groups did not contain sequences from any of the nonangiosperm species analyzed. For these genes, we also searched for orthologues in the OrthoMCL Viridiplantae database (
<ext-link ext-link-type="uri" xlink:href="http://www.orthomcl.org/orthomcl/">http://www.orthomcl.org/orthomcl/</ext-link>
, last accessed September 10, 2015) by BLASTp. The OrthoMCL Viridiplantae database includes data from the following eight plant species—angiosperms:
<italic>Ar. thaliana</italic>
,
<italic>Oryza sativa</italic>
, and
<italic>Ricinus communis</italic>
; bryophytes:
<italic>Ph. patens</italic>
; and green algae:
<italic>Chlamydomonas reinhardtii</italic>
,
<italic>Micromonas</italic>
sp. RCC299,
<italic>Ostreococcus tauri</italic>
, and
<italic>Volvox carteri</italic>
.</p>
</sec>
<sec>
<title>Generation of DCL Protein Trees</title>
<p>Using the approach above, DCL putative orthologues from the 12 representative land plant species were extracted from the OrthoMCL output by searching for OrthoMCL groups containing each of the four
<italic>Arabidopsis</italic>
DCL genes (i.e., DCL1, accession AT1G01040; DCL2, accession AT3G03300; DCL3, accession AT3G43920; DCL4, accession AT5G20320). Protein domains of all sequences were analyzed by scanning predicted protein sequences against the Pfam protein database (
<ext-link ext-link-type="uri" xlink:href="http://pfam.xfam.org/search">http://pfam.xfam.org/search</ext-link>
, last accessed September 10, 2015). When more than one splice variant was present for a gene, only the longest protein sequence was kept for analysis. When more than one incomplete protein from the same species had the same domains, we kept the longest variant. Protein sequences that passed these selection criteria were aligned using MUSCLE with default parameters (version 3.8.31;
<xref rid="evv171-B10" ref-type="bibr">Edgar 2004</xref>
) and trimmed using trimAl (version 1.2rev59;
<xref rid="evv171-B6" ref-type="bibr">Capella-Gutiérrez et al. 2009</xref>
) with the setting “automated1” to remove regions with an excessive amount of missing data or poorly aligned regions. We used ProtTest (version 3.4;
<xref rid="evv171-B8" ref-type="bibr">Darriba et al. 2011</xref>
) to select the best model (LG+I+G) based on Bayesian Information Criterion, and RAxML (version 7.4.2;
<xref rid="evv171-B47" ref-type="bibr">Stamatakis 2006</xref>
) to build the phylogenetic trees, performing 1,000 bootstrap replicates and using the following options: -p 12345, -f a, -c 4, -x 12345.</p>
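<p>The rule of keeping only the longest protein per gene can be sketched in Python as below. The sketch is illustrative only: it assumes that isoforms share a gene identifier followed by a dot and an isoform number (for example AT1G01040.1), which may not hold for every proteome used here, and the function name longest_per_gene is hypothetical.</p>

```python
def longest_per_gene(proteins, gene_of=lambda pid: pid.rsplit(".", 1)[0]):
    """Select the longest protein for each gene.

    proteins: {protein_id: sequence}
    gene_of:  maps a protein id to its gene id; the default strips a final
              '.N' isoform suffix (an assumption about the naming scheme).
    Returns {gene_id: (protein_id, sequence)}. Ties keep the id that sorts first.
    """
    best = {}
    for pid, seq in sorted(proteins.items()):
        gene = gene_of(pid)
        if gene not in best or len(seq) > len(best[gene][1]):
            best[gene] = (pid, seq)
    return best
```

<p>A different gene_of function can be supplied for proteomes whose identifiers follow another convention, such as Trinity contig names.</p>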
</sec>
<sec>
<title>Generation of RDM1 Protein Tree</title>
<p>Phylogenetic analysis of protein sequences from the RDM1 locus was performed in the same way as described above for DCL. However, only four RDM1 protein sequences were isolated from the OrthoMCL results, all from angiosperms. Consequently, we also searched for putative homologues by performing BLASTp searches against the NCBI Protein Reference Sequence database (
<ext-link ext-link-type="uri" xlink:href="http://www.ncbi.nlm.nih.gov/protein">http://www.ncbi.nlm.nih.gov/protein</ext-link>
, last accessed September 10, 2015), retaining all protein matches with an
<italic>E</italic>
value <1e
<sup>−5</sup>
and ≥50% identity.</p>
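<p>The two retention thresholds can be applied to BLAST results with a short Python sketch. It assumes the hits were saved in the standard 12-column tabular format (percent identity in column 3, E value in column 11); the study does not state how its BLASTp output was stored, so this is an assumption, and the function name filter_blast_hits is hypothetical.</p>

```python
def filter_blast_hits(lines, max_evalue=1e-5, min_identity=50.0):
    """Filter tab-separated BLAST hits by E value and percent identity.

    Keeps hits whose E value is below max_evalue and whose identity is at
    least min_identity. Returns (query, subject, identity, evalue) tuples.
    """
    kept = []
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.rstrip("\n").split("\t")
        identity, evalue = float(fields[2]), float(fields[10])
        if max_evalue > evalue and identity >= min_identity:
            kept.append((fields[0], fields[1], identity, evalue))
    return kept
```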
</sec>
<sec>
<title>sRNA Analysis</title>
<p>Most of the sRNA data analyzed were downloaded from public databases (
<xref ref-type="table" rid="evv171-T1">table 1</xref>
) and comprised reads that had already been trimmed to remove adapter sequences. For
<italic>W. mirabilis</italic>
, sRNAs were sequenced here (see above). Custom Python scripts were used to obtain the length of each sRNA sequence within the 18–26 nt size range.</p>
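<p>A size profile of this kind can be computed with a few lines of Python. The sketch below is not the custom script used in this study; it assumes adapter-trimmed reads supplied as plain strings, and the function name length_distribution is hypothetical.</p>

```python
from collections import Counter


def length_distribution(reads, min_len=18, max_len=26):
    """Return {length: fraction of reads} for reads of min_len to max_len nt.

    Reads outside the inclusive size range are ignored, so the fractions sum
    to 1 whenever at least one read falls inside the range.
    """
    counts = Counter(len(r) for r in reads if max_len >= len(r) >= min_len)
    total = sum(counts.values())
    return {n: (counts[n] / total if total else 0.0)
            for n in range(min_len, max_len + 1)}
```

<p>Comparing the fraction at 24 nt between species is then a direct way to ask whether 24 nt sRNAs are abundant.</p>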
</sec>
<sec>
<title>Southern Hybridization</title>
<p>Purified genomic DNAs of
<italic>Gi. biloba</italic>
,
<italic>Gn. gnemon</italic>
and, as a control, the angiosperm
<italic>Nicotiana tabacum</italic>
L. (∼2 μg/sample) were digested with the restriction enzymes MspI, HpaII, BstNI or ScrFI and separated by gel electrophoresis on a 0.9% (w/v) agarose gel. The gels were then alkali-blotted onto Hybond-XL membranes (GE Healthcare, Little Chalfont, United Kingdom) and hybridized with a
<sup>32</sup>
P-labeled DNA probe (DekaLabel kit, MBI Fermentas, Vilnius, Lithuania) for the 18S ribosomal RNA (rRNA) gene according to protocols described in
<xref rid="evv171-B22" ref-type="bibr">Kovarik et al. (2005)</xref>
. After washing (2 × 5 min with 2 × SSC, 0.1% SDS at room temperature followed by 2 × 15 min with 0.6 × SSC, 0.1% SDS, 65 °C), the hybridization bands were visualized with a PhosphorImager (Typhoon 9410, GE Healthcare, PA) and the data quantified by ImageQuant software (GE Healthcare, PA). The 18S probe was a 300-bp fragment (
<xref ref-type="fig" rid="evv171-F6">fig. 6</xref>
<italic>a</italic>
) obtained by amplification of the 18S rRNA gene of the gymnosperm
<italic>Cycas revoluta</italic>
Thunb. using primers described further below.</p>
</sec>
<sec>
<title>Bisulphite Sequencing</title>
<p>Modification of DNA with bisulphite was carried out with an EpiTect kit (Qiagen, Germany) using 1.3 μg of genomic DNA from leaves. The primers used amplified the coding strand of the 18S rRNA gene subregion shown in
<xref ref-type="fig" rid="evv171-F6">figure 6</xref>
<italic>a</italic>
and did not discriminate between methylated and nonmethylated templates. The primer sequences were as follows: 18SBIS forward: 5′-TATGAGTYTGGTAATTGGAATG-3′; 18SBIS reverse: 5′-TTTAARCACTCTAATTTCTTCAAA-3′. The polymerase chain reaction (total volume 25 μl) used 1.0 μl of bisulphite-converted DNA as the template, 4 nmol of each dNTP, 8 pmol of each primer, and 0.8 U of Kapa
<italic>Taq</italic>
DNA polymerase (Kapa Biosystems). Cycling conditions were as follows: initial denaturation (94 °C/3 min); 35 cycles of 94 °C/20 s, 55 °C/20 s, and 72 °C/20 s; and a final extension (72 °C/10 min). The resulting c. 300 bp products were separated by gel electrophoresis, purified, and cloned into a TA vector (pDrive, Qiagen). In total, 22 and 18 clones were sequenced from
<italic>Gn. gnemon</italic>
and
<italic>Gi. biloba</italic>
, respectively. After trimming of primers the 241 bp-long sequences were aligned and statistically evaluated using CyMATE software (
<xref rid="evv171-B19" ref-type="bibr">Hetzl et al. 2007</xref>
).</p>
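<p>The logic of scoring bisulphite clones by sequence context can be illustrated with a minimal Python sketch. It is not a substitute for CyMATE and does not reproduce its algorithm. The sketch assumes that each clone is aligned to the unconverted reference without gaps, that only the amplified strand is scored, and that a retained C marks a methylated cytosine whereas a T marks an unmethylated one. The function names cytosine_context and methylation_by_context are hypothetical.</p>

```python
def cytosine_context(reference, i):
    """Classify the cytosine at index i as 'CG', 'CHG' or 'CHH'.

    Returns None if position i is not a C or if the sequence ends before the
    context can be determined.
    """
    if reference[i] != "C":
        return None
    nxt = reference[i + 1:i + 3]
    if nxt[:1] == "G":
        return "CG"
    if len(nxt) != 2:
        return None
    return "CHG" if nxt[1] == "G" else "CHH"


def methylation_by_context(reference, clones):
    """Return {context: fraction methylated} pooled over all clones.

    A context with no scorable cytosines maps to None.
    """
    methylated = {"CG": 0, "CHG": 0, "CHH": 0}
    total = {"CG": 0, "CHG": 0, "CHH": 0}
    for i in range(len(reference)):
        context = cytosine_context(reference, i)
        if context is None:
            continue
        for clone in clones:
            if clone[i] in ("C", "T"):  # ignore ambiguous or erroneous bases
                total[context] += 1
                methylated[context] += clone[i] == "C"
    return {c: (methylated[c] / total[c] if total[c] else None) for c in total}
```

<p>Here H stands for A, C, or T, so CHG and CHH sites are distinguished only by whether the second base after the cytosine is a G.</p>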
</sec>
</sec>
<sec sec-type="results">
<title>Results</title>
<sec>
<title>OrthoMCL Clustering</title>
<p>The proteomes of 12 species were compiled to include representative taxa from all four major land plant lineages. Together these 12 taxa generated between 18,255 (
<italic>W. mirabilis</italic>
) and 88,760 (
<italic>Z. mays</italic>
) protein sequences, giving a total of 543,399 proteins that OrthoMCL clustered into 55,357 groups containing both paralogues and orthologues (
<xref ref-type="table" rid="evv171-T1">table 1</xref>
).</p>
<p>We found OrthoMCL groups for all 31 genes/gene families listed in
<xref ref-type="fig" rid="evv171-F2">figure 2</xref>
which represent genes belonging to the three phases of the canonical RdDM pathway, namely: 1) Pol IV-dependent siRNA biogenesis, 2) Pol V-mediated de novo DNA methylation, and 3) chromatin alterations (
<xref ref-type="fig" rid="evv171-F1">fig. 1</xref>
) together with additional factors also involved in cytosine methylation. OrthoMCL groups of nine proteins or families involved in the RdDM pathway, namely NRPD2/NRPE2, NRPE9B, NRPB1, RDR, DCL, HEN, AGO, HDA, and UBP, contained sequences from all 12 of the species analyzed, indicating high levels of conservation for these loci across land plants (highlighted in green in
<ext-link ext-link-type="uri" xlink:href="http://gbe.oxfordjournals.org/lookup/suppl/doi:10.1093/gbe/evv171/-/DC1">supplementary table S1</ext-link>
,
<ext-link ext-link-type="uri" xlink:href="http://gbe.oxfordjournals.org/lookup/suppl/doi:10.1093/gbe/evv171/-/DC1">Supplementary Material</ext-link>
online). MET1, which codes for Methyltransferase 1, and DDM1 (Decreased DNA methylation 1), which is a chromatin remodeler protein, were also found in all analyzed species.</p>
<p>Putative homologues of DMS3 were found in all plants except
<italic>Pin. taeda</italic>
and
<italic>Ph. patens</italic>
(
<ext-link ext-link-type="uri" xlink:href="http://gbe.oxfordjournals.org/lookup/suppl/doi:10.1093/gbe/evv171/-/DC1">supplementary table S2</ext-link>
,
<ext-link ext-link-type="uri" xlink:href="http://gbe.oxfordjournals.org/lookup/suppl/doi:10.1093/gbe/evv171/-/DC1">Supplementary Material</ext-link>
online). However, closer analysis revealed that these proteins were either unusually long, as is typical of SMC proteins involved in chromatin remodeling (
<xref rid="evv171-B35" ref-type="bibr">Matzke et al. 2015</xref>
), or so short that it was not possible to distinguish DMS3 from SMC homologues. Consequently, proteins longer than 700 amino acids or shorter than 150 amino acids were removed. Proteins with a histidine kinase-like ATPase motif, present in the SMC-related protein AtGMI1 (
<xref rid="evv171-B5" ref-type="bibr">Böhmdorfer et al. 2011</xref>
) but not in DMS3, were also removed. This left only putative DMS3 OrthoMCL groups in seed plants. These proteins were aligned using T-Coffee (
<xref rid="evv171-B39" ref-type="bibr">Notredame et al. 2000</xref>
) and alignment quality was assessed using the Transitive Consistency Score (TCS;
<xref rid="evv171-B7" ref-type="bibr">Chang et al. 2015</xref>
). Four proteins had poor alignment (TCS ≤ 16, see
<ext-link ext-link-type="uri" xlink:href="http://gbe.oxfordjournals.org/lookup/suppl/doi:10.1093/gbe/evv171/-/DC1">supplementary table S2</ext-link>
,
<ext-link ext-link-type="uri" xlink:href="http://gbe.oxfordjournals.org/lookup/suppl/doi:10.1093/gbe/evv171/-/DC1">Supplementary Material</ext-link>
online) and these, together with one isoform of the protein from
<italic>Z. mays</italic>
, were removed from the analysis, leaving seven sequences, all from seed plants. The SMC-related protein from
<italic>Ar. thaliana</italic>
(GMI1_AT5G24280) was added to the alignment. Phylogenetic analysis of these eight sequences revealed two groups, one containing angiosperms, the other gymnosperms, each being separated by GMI1 from
<italic>Ar. thaliana</italic>
(
<ext-link ext-link-type="uri" xlink:href="http://gbe.oxfordjournals.org/lookup/suppl/doi:10.1093/gbe/evv171/-/DC1">supplementary fig. S2</ext-link>
,
<ext-link ext-link-type="uri" xlink:href="http://gbe.oxfordjournals.org/lookup/suppl/doi:10.1093/gbe/evv171/-/DC1">Supplementary Material</ext-link>
online). Thus, sequences from gymnosperms cannot be distinguished from SMC-related proteins, and only in angiosperms can we confidently identify DMS3-like sequences, consistent with
<xref rid="evv171-B35" ref-type="bibr">Matzke et al. (2015)</xref>
.</p>
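<p>The length and alignment-quality filters described above can be sketched as follows. This is a minimal illustration only, not the authors' pipeline: the record type, field names, and example values are hypothetical, and the removal of proteins carrying a histidine kinase-like ATPase motif is not modelled.</p>

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """A putative DMS3 protein (all fields are illustrative)."""
    name: str
    length: int  # protein length in amino acids
    tcs: int     # per-sequence Transitive Consistency Score


def keep_candidate(c, min_len=150, max_len=700, poor_tcs=16):
    """Apply the length window and TCS cut-off described in the text."""
    length_ok = c.length in range(min_len, max_len + 1)  # 150 to 700 aa inclusive
    aligned_ok = c.tcs > poor_tcs                         # a TCS of 16 or lower counts as poor
    return length_ok and aligned_ok


# Hypothetical candidates, invented purely to exercise each filter.
candidates = [
    Candidate("too_long", 1200, 60),
    Candidate("too_short", 90, 55),
    Candidate("poorly_aligned", 420, 16),
    Candidate("retained", 430, 72),
]
kept = [c.name for c in candidates if keep_candidate(c)]
print(kept)
```

<p>In this sketch only the candidate inside the length window and above the TCS cut-off is retained; the thresholds are exposed as parameters so that other cut-offs can be explored.</p>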
<p>We searched the data for OrthoMCL groups that were found only in angiosperms and so missing in all other land plant groups and found, in addition to DMS3, a further five proteins in this category: NRPD4/NRPE4, SHH1 (SAWADEE homeodomain homolog 1), RDM1, SUVR2, and KTF1 (all shown in bold in
<xref ref-type="fig" rid="evv171-F2">figure 2</xref>
and highlighted in blue in
<ext-link ext-link-type="uri" xlink:href="http://gbe.oxfordjournals.org/lookup/suppl/doi:10.1093/gbe/evv171/-/DC1">supplementary table S1</ext-link>
,
<ext-link ext-link-type="uri" xlink:href="http://gbe.oxfordjournals.org/lookup/suppl/doi:10.1093/gbe/evv171/-/DC1">Supplementary Material</ext-link>
online). All six proteins belong to phase (1) and/or (2) of the RdDM pathway. From the 12 proteomes included in the OrthoMCL analysis, only the eudicot
<italic>Ar. thaliana</italic>
yielded sequences for SUVR2 and SHH1 (but see below).</p>
<p>We extended our proteome analysis to include the OrthoMCL Viridiplantae database, which contains data from six additional plant species not analyzed above (
<italic>O. sativa</italic>
,
<italic>R. communis</italic>
,
<italic>Ch. reinhardtii</italic>
,
<italic>Micromonas</italic>
sp.,
<italic>O. tauri</italic>
, and
<italic>V. carteri</italic>
). We focused our search on identifying homologues of the six proteins found only in angiosperms (see above). Using this extended approach SUVR2 and SHH1 were found in all three angiosperm species listed in the OrthoMCL Viridiplantae database, including the monocot
<italic>O. sativa</italic>
, showing that these gene families are not restricted to eudicots. In
<italic>Ar. thaliana</italic>
we identified three putative homologues of the SHH1 family and five of the SUVR family, whereas in both
<italic>O. sativa</italic>
and
<italic>R. communis</italic>
we identified one putative homologue in each.</p>
<p>Beyond the angiosperms, no sequences with homology to NRPD4/NRPE4, SHH1, RDM1, and SUVR2 were found in gymnosperms, monilophytes or lycophytes, but putative KTF1 (KOW domain-containing transcription factor 1; a synonym of SPT5L) homologues were found in the bryophyte
<italic>Ph. patens</italic>
, and the green algae
<italic>Micromonas</italic>
sp. RCC299 and
<italic>V. carteri</italic>
in the OrthoMCL Viridiplantae database. Because OrthoMCL relies on low thresholds of BLAST similarity (
<italic>E</italic>
values below 1 × 10
<sup>−5</sup>
and ≥50% match along the protein length), we further characterized these proteins by searching for NGN and KOW domains, together with the WG/GW motifs characteristic of KTF1 (
<xref rid="evv171-B17" ref-type="bibr">He, Hsu, Zhu, et al. 2009</xref>
;
<xref rid="evv171-B35" ref-type="bibr">Matzke et al. 2015</xref>
) (
<ext-link ext-link-type="uri" xlink:href="http://gbe.oxfordjournals.org/lookup/suppl/doi:10.1093/gbe/evv171/-/DC1">supplementary fig. S3</ext-link>
,
<ext-link ext-link-type="uri" xlink:href="http://gbe.oxfordjournals.org/lookup/suppl/doi:10.1093/gbe/evv171/-/DC1">Supplementary Material</ext-link>
online). We failed to find NGN and KOW domains outside the angiosperms (
<ext-link ext-link-type="uri" xlink:href="http://gbe.oxfordjournals.org/lookup/suppl/doi:10.1093/gbe/evv171/-/DC1">supplementary table S3</ext-link>
,
<ext-link ext-link-type="uri" xlink:href="http://gbe.oxfordjournals.org/lookup/suppl/doi:10.1093/gbe/evv171/-/DC1">Supplementary Material</ext-link>
online). We also noticed that while the putative KTF1 sequences in the angiosperm
<italic>F. persica</italic>
contained both NGN and KOW domains (
<xref rid="evv171-B35" ref-type="bibr">Matzke et al. 2015</xref>
), they lacked GW/WG motifs, perhaps because the protein is a partial assembly (
<ext-link ext-link-type="uri" xlink:href="http://gbe.oxfordjournals.org/lookup/suppl/doi:10.1093/gbe/evv171/-/DC1">supplementary fig. S3 and</ext-link>
<ext-link ext-link-type="uri" xlink:href="http://gbe.oxfordjournals.org/lookup/suppl/doi:10.1093/gbe/evv171/-/DC1">table S3</ext-link>
,
<ext-link ext-link-type="uri" xlink:href="http://gbe.oxfordjournals.org/lookup/suppl/doi:10.1093/gbe/evv171/-/DC1">Supplementary Material</ext-link>
online).</p>
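<p>The logic of the KTF1 characterization can be illustrated with a short sketch. It assumes that domain annotations (for example from a prior Pfam scan) are already available as a set of names; the function names, the toy sequences, and the requirement of at least one WG/GW motif are assumptions made for illustration, not the criteria used in the study.</p>

```python
def count_gw_wg(sequence):
    """Count GW and WG dipeptides in a protein sequence (overlaps allowed)."""
    seq = sequence.upper()
    return sum(1 for i in range(len(seq) - 1) if seq[i:i + 2] in ("GW", "WG"))


def classify_ktf1_candidate(domains, sequence):
    """Classify a candidate from its annotated domains and WG/GW motif content."""
    has_domains = {"NGN", "KOW"}.issubset(domains)
    has_motifs = count_gw_wg(sequence) > 0  # assumed minimum of one motif
    if has_domains and has_motifs:
        return "KTF1-like"
    if has_domains:
        return "domains only (possibly partial)"
    return "not supported"


# Toy sequences, invented for illustration; they are not real proteins.
print(classify_ktf1_candidate({"NGN", "KOW"}, "MKGWSSWGAA"))
print(classify_ktf1_candidate({"NGN", "KOW"}, "MKLSSAAVT"))
print(classify_ktf1_candidate({"KOW"}, "MKGWSSWGAA"))
```

<p>The middle case mirrors the situation described for the partial assembly, where both domains are detected but no WG/GW motif is found.</p>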
<p>In summary, the combined results from our analysis indicated that NRPD4/NRPE4, SHH1, RDM1, KTF1, DMS3, and SUVR2 are restricted to angiosperms.</p>
</sec>
<sec>
<title>Phylogenetic Relationships between Members of the DCL Family Proteins</title>
<p>In
<italic>Ar. thaliana</italic>
there are four known paralogues in the DCL family: DCL1, which generates 21 nt microRNAs (miRNAs); DCL2, which generates 22 nt siRNAs from viral sequences; DCL3, which is involved in RdDM and generates 24 nt siRNAs (
<xref ref-type="fig" rid="evv171-F1">fig. 1</xref>
); and DCL4, which generates 21 nt siRNAs and trans-acting siRNAs. In
<italic>Ar. thaliana</italic>
, expression levels of each DCL family member are similar and at medium levels in most tissues (
<xref rid="evv171-B55" ref-type="bibr">Zimmermann et al. 2004</xref>
), so we might expect to detect the presence of orthologues in other species, if they are present.</p>
<p>From the 12 species that are the focus of this study, a total of 84 proteins formed a “DCL family group.” They included the four DCL family members in
<italic>Ar. thaliana</italic>
(DCL1–DCL4), which, when complete, should each exceed 1,300 amino acids. Protein domains of all sequences were analyzed by scanning against the Pfam protein database (
<ext-link ext-link-type="uri" xlink:href="http://pfam.xfam.org/search">http://pfam.xfam.org/search</ext-link>
, last accessed September 10, 2015). When more than one splice variant or size variant was present, only the longest protein sequence was kept for analysis. Variants that differed in sequence rather than in splicing or length were all kept, leaving 56 protein sequences (
<ext-link ext-link-type="uri" xlink:href="http://gbe.oxfordjournals.org/lookup/suppl/doi:10.1093/gbe/evv171/-/DC1">supplementary table S4</ext-link>
,
<ext-link ext-link-type="uri" xlink:href="http://gbe.oxfordjournals.org/lookup/suppl/doi:10.1093/gbe/evv171/-/DC1">Supplementary Material</ext-link>
online). Phylogenetic relationships between the DCL family members are shown in
<xref ref-type="fig" rid="evv171-F3">figure 3</xref>
. The sequences group into four strongly supported clades, and in each clade there is an
<italic>Ar. thaliana</italic>
DCL member, as expected. This enabled us to label the clades DCL1–DCL4. Recently, it has been noted that there are two distinct clades of DCL3-like sequences in monocots, called DCL3a and DCL3b, the latter renamed DCL5 (
<xref rid="evv171-B33" ref-type="bibr">Margis et al. 2006</xref>
;
<xref rid="evv171-B46" ref-type="bibr">Song et al. 2012</xref>
;
<xref rid="evv171-B11" ref-type="bibr">Fei et al. 2013</xref>
), and represented by two DCL3 clades each containing
<italic>Z. mays</italic>
and
<italic>O. sativa</italic>
sequences (see
<xref ref-type="fig" rid="evv171-F3">fig. 3</xref>
).
<xref ref-type="table" rid="evv171-T2">Table 2</xref>
summarizes the number of sequences (paralogues) for each of the four DCL family members across the 12 species analyzed. All species had proteins related to DCL1. Of particular note was DCL2, which was absent outside the seed plants (i.e., angiosperms and gymnosperms) and, perhaps significantly, also absent in the two gymnosperms analyzed belonging to Gnetales (
<italic>W. mirabilis</italic>
and
<italic>Gn. gnemon</italic>
). There were also isolated absences of DCL3 (in
<italic>W. mirabilis</italic>
and
<italic>Pt. aquilinum</italic>
) and DCL4 (in
<italic>Gn. gnemon</italic>
and
<italic>S. moellendorffii</italic>
). It may also be significant that we found only two DCL4 domains in
<italic>F. persica</italic>
(Helicase C and Dicer-dimer domains, out of the nine DCL domains considered,
<ext-link ext-link-type="uri" xlink:href="http://gbe.oxfordjournals.org/lookup/suppl/doi:10.1093/gbe/evv171/-/DC1">supplementary table S4</ext-link>
,
<ext-link ext-link-type="uri" xlink:href="http://gbe.oxfordjournals.org/lookup/suppl/doi:10.1093/gbe/evv171/-/DC1">Supplementary Material</ext-link>
online).
<fig id="evv171-F3" orientation="portrait" position="float">
<label>F
<sc>ig</sc>
. 3.—</label>
<caption>
<p>Phylogenetic relationships between DCL sequences showing four distinct DCL clades (DCL1-4). The DCL3b (or DCL5, Song et al. 2012;
<xref rid="evv171-B11" ref-type="bibr">Fei et al. 2013</xref>
) clade is labeled with a red asterisk. The (+/−) symbols indicate the land plant group in which each DCL paralogue was found.
<italic>Physcomitrella patens</italic>
(PPAT),
<italic>Selaginella moellendorffii</italic>
(SMOE),
<italic>Pteridium aquilinum</italic>
(PAQU),
<italic>Pinus taeda</italic>
(PTAE),
<italic>Picea abies</italic>
(PABI),
<italic>Welwitschia mirabilis</italic>
(WMIR),
<italic>Gnetum gnemon</italic>
(GMON),
<italic>Ginkgo biloba</italic>
(GBIL),
<italic>Amborella trichopoda</italic>
(ATRI),
<italic>Fritillaria persica</italic>
(FPER),
<italic>Zea mays</italic>
(ZMAY),
<italic>Oryza sativa</italic>
(OSAT), and
<italic>Arabidopsis thaliana</italic>
(ATHA).</p>
</caption>
<graphic xlink:href="evv171f3p"></graphic>
</fig>
<table-wrap id="evv171-T2" orientation="portrait" position="float">
<label>Table 2</label>
<caption>
<p>Numbers of Paralogues in Each of the DCL Family Members</p>
</caption>
<table frame="hsides" rules="groups">
<thead align="left">
<tr>
<th rowspan="1" colspan="1">Species</th>
<th rowspan="1" colspan="1">Total Number</th>
<th rowspan="1" colspan="1">DCL1</th>
<th rowspan="1" colspan="1">DCL2</th>
<th rowspan="1" colspan="1">DCL3</th>
<th rowspan="1" colspan="1">DCL4</th>
</tr>
</thead>
<tbody align="left">
<tr>
<td rowspan="1" colspan="1">
<bold>Angiosperms</bold>
</td>
<td rowspan="1" colspan="1"></td>
<td rowspan="1" colspan="1"></td>
<td rowspan="1" colspan="1"></td>
<td rowspan="1" colspan="1"></td>
<td rowspan="1" colspan="1"></td>
</tr>
<tr>
<td rowspan="1" colspan="1">    
<italic>Arabidopsis thaliana</italic>
</td>
<td rowspan="1" colspan="1">4</td>
<td rowspan="1" colspan="1">1</td>
<td rowspan="1" colspan="1">1</td>
<td rowspan="1" colspan="1">1</td>
<td rowspan="1" colspan="1">1</td>
</tr>
<tr>
<td rowspan="1" colspan="1">    
<italic>Zea mays</italic>
</td>
<td rowspan="1" colspan="1">6</td>
<td rowspan="1" colspan="1">1</td>
<td rowspan="1" colspan="1">2</td>
<td rowspan="1" colspan="1">2 (DCL3/5)</td>
<td rowspan="1" colspan="1">1</td>
</tr>
<tr>
<td rowspan="1" colspan="1">    
<italic>Fritillaria persica</italic>
</td>
<td rowspan="1" colspan="1">4</td>
<td rowspan="1" colspan="1">1</td>
<td rowspan="1" colspan="1">1</td>
<td rowspan="1" colspan="1">1</td>
<td rowspan="1" colspan="1">1</td>
</tr>
<tr>
<td rowspan="1" colspan="1">    
<italic>Amborella trichopoda</italic>
</td>
<td rowspan="1" colspan="1">4</td>
<td rowspan="1" colspan="1">1</td>
<td rowspan="1" colspan="1">1</td>
<td rowspan="1" colspan="1">1</td>
<td rowspan="1" colspan="1">1</td>
</tr>
<tr>
<td rowspan="1" colspan="1">
<bold>Gymnosperms</bold>
</td>
<td rowspan="1" colspan="1"></td>
<td rowspan="1" colspan="1"></td>
<td rowspan="1" colspan="1"></td>
<td rowspan="1" colspan="1"></td>
<td rowspan="1" colspan="1"></td>
</tr>
<tr>
<td rowspan="1" colspan="1">    
<italic>Ginkgo biloba</italic>
</td>
<td rowspan="1" colspan="1">4</td>
<td rowspan="1" colspan="1">1</td>
<td rowspan="1" colspan="1">1</td>
<td rowspan="1" colspan="1">1</td>
<td rowspan="1" colspan="1">1</td>
</tr>
<tr>
<td rowspan="1" colspan="1">    
<italic>Picea abies</italic>
</td>
<td rowspan="1" colspan="1">5</td>
<td rowspan="1" colspan="1">2
<xref ref-type="table-fn" rid="evv171-TF2">
<sup>a</sup>
</xref>
</td>
<td rowspan="1" colspan="1">1</td>
<td rowspan="1" colspan="1">1</td>
<td rowspan="1" colspan="1">1</td>
</tr>
<tr>
<td rowspan="1" colspan="1">    
<italic>Pinus taeda</italic>
</td>
<td rowspan="1" colspan="1">7</td>
<td rowspan="1" colspan="1">2</td>
<td rowspan="1" colspan="1">1</td>
<td rowspan="1" colspan="1">2</td>
<td rowspan="1" colspan="1">2
<xref ref-type="table-fn" rid="evv171-TF2">
<sup>a</sup>
</xref>
</td>
</tr>
<tr>
<td rowspan="1" colspan="1">    
<italic>Gnetum gnemon</italic>
</td>
<td rowspan="1" colspan="1">3</td>
<td rowspan="1" colspan="1">2
<xref ref-type="table-fn" rid="evv171-TF2">
<sup>a</sup>
</xref>
</td>
<td rowspan="1" colspan="1">0</td>
<td rowspan="1" colspan="1">1</td>
<td rowspan="1" colspan="1">0</td>
</tr>
<tr>
<td rowspan="1" colspan="1">    
<italic>Welwitschia mirabilis</italic>
</td>
<td rowspan="1" colspan="1">3</td>
<td rowspan="1" colspan="1">1</td>
<td rowspan="1" colspan="1">0</td>
<td rowspan="1" colspan="1">0</td>
<td rowspan="1" colspan="1">2</td>
</tr>
<tr>
<td rowspan="1" colspan="1">
<bold>Monilophytes</bold>
</td>
<td rowspan="1" colspan="1"></td>
<td rowspan="1" colspan="1"></td>
<td rowspan="1" colspan="1"></td>
<td rowspan="1" colspan="1"></td>
<td rowspan="1" colspan="1"></td>
</tr>
<tr>
<td rowspan="1" colspan="1">    
<italic>Pteridium aquilinum</italic>
</td>
<td rowspan="1" colspan="1">3</td>
<td rowspan="1" colspan="1">2
<xref ref-type="table-fn" rid="evv171-TF2">
<sup>a</sup>
</xref>
</td>
<td rowspan="1" colspan="1">0</td>
<td rowspan="1" colspan="1">0</td>
<td rowspan="1" colspan="1">1</td>
</tr>
<tr>
<td rowspan="1" colspan="1">
<bold>Lycophytes</bold>
</td>
<td rowspan="1" colspan="1"></td>
<td rowspan="1" colspan="1"></td>
<td rowspan="1" colspan="1"></td>
<td rowspan="1" colspan="1"></td>
<td rowspan="1" colspan="1"></td>
</tr>
<tr>
<td rowspan="1" colspan="1">    
<italic>Selaginella moellendorffii</italic>
</td>
<td rowspan="1" colspan="1">3</td>
<td rowspan="1" colspan="1">1</td>
<td rowspan="1" colspan="1">0</td>
<td rowspan="1" colspan="1">2</td>
<td rowspan="1" colspan="1">0</td>
</tr>
<tr>
<td rowspan="1" colspan="1">
<bold>Bryophytes</bold>
</td>
<td rowspan="1" colspan="1"></td>
<td rowspan="1" colspan="1"></td>
<td rowspan="1" colspan="1"></td>
<td rowspan="1" colspan="1"></td>
<td rowspan="1" colspan="1"></td>
</tr>
<tr>
<td rowspan="1" colspan="1">    
<italic>Physcomitrella patens</italic>
</td>
<td rowspan="1" colspan="1">4</td>
<td rowspan="1" colspan="1">2</td>
<td rowspan="1" colspan="1">0</td>
<td rowspan="1" colspan="1">1</td>
<td rowspan="1" colspan="1">1</td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<fn id="evv171-TF2">
<p>
<sup>a</sup>
The assembled proteins are incomplete, and based on their sequences and domains present (see
<ext-link ext-link-type="uri" xlink:href="http://gbe.oxfordjournals.org/lookup/suppl/doi:10.1093/gbe/evv171/-/DC1">supplementary table S4</ext-link>
,
<ext-link ext-link-type="uri" xlink:href="http://gbe.oxfordjournals.org/lookup/suppl/doi:10.1093/gbe/evv171/-/DC1">Supplementary Material</ext-link>
online) they may represent a single protein.</p>
</fn>
</table-wrap-foot>
</table-wrap>
</p>
</sec>
<sec>
<title>RDM1 Family</title>
<p>OrthoMCL clustering revealed one RDM1 orthologue each for
<italic>Ar. thaliana</italic>
,
<italic>F. persica</italic>
,
<italic>Z. mays</italic>
, and
<italic>Am. trichopoda</italic>
. To better understand the evolution of RDM1, we BLAST-searched the NCBI Protein Reference Sequence database to look for further sequences with similarity to RDM1 and found 68, all from 35 angiosperm species (comprising one early-diverging, 3 monocot, and 12 eudicot families;
<ext-link ext-link-type="uri" xlink:href="http://gbe.oxfordjournals.org/lookup/suppl/doi:10.1093/gbe/evv171/-/DC1">supplementary table S5</ext-link>
,
<ext-link ext-link-type="uri" xlink:href="http://gbe.oxfordjournals.org/lookup/suppl/doi:10.1093/gbe/evv171/-/DC1">Supplementary Material</ext-link>
online). These were aligned and used to build a phylogenetic tree of sequence relationships (
<xref ref-type="fig" rid="evv171-F4">fig. 4</xref>
). Within the RDM1 phylogenetic tree, three family-specific clades were recovered: one comprising all the sequences from Brassicaceae species, another containing all the sequences from Fabaceae species, and a third made up of sequences from species belonging to Solanaceae. These three eudicot clades were very strongly supported (bootstrap support >95%). For five genera from four further eudicot families with two or more sequences (i.e.,
<italic>Citrus</italic>
[Rutaceae],
<italic>Cucumis</italic>
[Cucurbitaceae],
<italic>Theobroma</italic>
[Malvaceae], and
<italic>Fragaria</italic>
and
<italic>Pyrus</italic>
[both Rosaceae]), the sequences clustered by genus with strong support. A further clade was identified which contained all RDM1 sequences from monocot species, but it lacked strong support.
<fig id="evv171-F4" orientation="portrait" position="float">
<label>F
<sc>ig</sc>
. 4.—</label>
<caption>
<p>Unrooted phylogenetic tree depicting relationships between RDM1-like protein sequences from angiosperms. All protein sequences used to build the tree were extracted from the NCBI Protein Reference Sequence database by BLASTp (see
<ext-link ext-link-type="uri" xlink:href="http://gbe.oxfordjournals.org/lookup/suppl/doi:10.1093/gbe/evv171/-/DC1">supplementary table S5</ext-link>
,
<ext-link ext-link-type="uri" xlink:href="http://gbe.oxfordjournals.org/lookup/suppl/doi:10.1093/gbe/evv171/-/DC1">Supplementary Material</ext-link>
online, for GenBank accession numbers used).
<italic>Amborella trichopoda</italic>
(
<italic>A. trich</italic>
),
<italic>Arabidopsis lyrata</italic>
subsp.
<italic>lyrata</italic>
(
<italic>A. lyrat</italic>
),
<italic>Arabidopsis thaliana</italic>
(
<italic>A. thali</italic>
),
<italic>Brachypodium distachyon</italic>
(
<italic>B. dista</italic>
),
<italic>Brassica rapa</italic>
(
<italic>B. rapa</italic>
),
<italic>Camelina sativa</italic>
(
<italic>C. sativ</italic>
),
<italic>Capsella rubella</italic>
(
<italic>C. rubel</italic>
),
<italic>Cicer arietinum</italic>
(
<italic>C. ariet</italic>
),
<italic>Citrus clementina</italic>
(
<italic>C. cleme</italic>
),
<italic>Citrus sinensis</italic>
(
<italic>C. sinen</italic>
),
<italic>Cucumis melo</italic>
(
<italic>C. melo</italic>
),
<italic>Cucumis sativus</italic>
(
<italic>C. sativu</italic>
),
<italic>Eucalyptus grandis</italic>
(
<italic>E. grand</italic>
),
<italic>Eutrema salsugineum</italic>
(
<italic>E. salsu</italic>
),
<italic>Fragaria vesca</italic>
subsp.
<italic>vesca</italic>
(
<italic>F. vesca</italic>
),
<italic>Glycine max</italic>
(
<italic>G. max</italic>
),
<italic>Medicago truncatula</italic>
(
<italic>M. trunc</italic>
),
<italic>Morus notabilis</italic>
(
<italic>M. notab</italic>
),
<italic>Musa acuminata</italic>
subsp.
<italic>malaccensis</italic>
(
<italic>M. malac</italic>
),
<italic>Nelumbo nucifera</italic>
(
<italic>N. nucif</italic>
),
<italic>Nicotiana sylvestris</italic>
(
<italic>N. sylve</italic>
),
<italic>Nicotiana tomentosiformis</italic>
(
<italic>N. tomen</italic>
),
<italic>Phaseolus vulgaris</italic>
(
<italic>P. vulga</italic>
),
<italic>Phoenix dactylifera</italic>
(
<italic>P. dacty</italic>
),
<italic>Populus trichocarpa</italic>
(
<italic>P. trich</italic>
),
<italic>Prunus mume</italic>
(
<italic>P. mume</italic>
),
<italic>Prunus persica</italic>
(
<italic>P. persi</italic>
),
<italic>Pyrus</italic>
x
<italic>bretschneideri</italic>
(
<italic>P. brets</italic>
),
<italic>Ricinus communis</italic>
(
<italic>R. commu</italic>
),
<italic>Setaria italica</italic>
(
<italic>S. itali</italic>
),
<italic>Solanum lycopersicum</italic>
(
<italic>S. lycop</italic>
),
<italic>Solanum tuberosum</italic>
(
<italic>S. tuber</italic>
),
<italic>Theobroma cacao</italic>
(
<italic>T. cacao</italic>
),
<italic>Vitis vinifera</italic>
(
<italic>V. vinif</italic>
),
<italic>Zea mays</italic>
(
<italic>Z. mays</italic>
). Where there were multiple sequences from a single species, a number follows the taxon abbreviation. Numbers on branches show bootstrap support values for key nodes discussed in the text; due to reasons of space, the support values for other nodes have been omitted.</p>
</caption>
<graphic xlink:href="evv171f4p"></graphic>
</fig>
</p>
</sec>
<sec>
<title>Length Distribution of sRNA</title>
<p>Because the total number of available sequences differed in the eight sRNA data sets examined (from angiosperms:
<italic>Am. trichopoda</italic>
,
<italic>Z. mays</italic>
,
<italic>Ar. thaliana</italic>
; from gymnosperms:
<italic>Gi. biloba</italic>
,
<italic>Pic. abies</italic>
,
<italic>W. mirabilis</italic>
; from lycophytes:
<italic>S. moellendorffii</italic>
; and from bryophytes:
<italic>Ph. patens</italic>
; see
<xref ref-type="table" rid="evv171-T1">table 1</xref>
), we plotted the percentage of sRNA sequences belonging to each size category (
<xref ref-type="fig" rid="evv171-F5">fig. 5</xref>
). The most abundant category was 24 nt for all angiosperms (
<xref ref-type="fig" rid="evv171-F5">fig. 5</xref>
<italic>a</italic>
). In contrast, for all other land plants analyzed the 21 nt sRNA size category was most abundant (
<xref ref-type="fig" rid="evv171-F5">fig. 5</xref>
<italic>b</italic>
).
<fig id="evv171-F5" orientation="portrait" position="float">
<label>F
<sc>ig</sc>
. 5.—</label>
<caption>
<p>Length distribution of sRNA sequences from (
<italic>a</italic>
) three angiosperm species and (
<italic>b</italic>
) five other land plant species listed in
<xref ref-type="table" rid="evv171-T1">table 1</xref>
. The percentage of total reads for each size class is plotted.</p>
</caption>
<graphic xlink:href="evv171f5p"></graphic>
</fig>
</p>
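<p>Normalizing read counts to percentages, as done here to compare data sets of different sizes, can be sketched as follows. This is an illustrative example only: the function name, the 18 to 30 nt window, and the toy reads are assumptions and do not reproduce the data sets listed in table 1.</p>

```python
from collections import Counter


def length_percentages(reads, min_len=18, max_len=30):
    """Return the percentage of reads in each length class within a size window."""
    counts = Counter(len(r) for r in reads if len(r) in range(min_len, max_len + 1))
    total = sum(counts.values())
    if total == 0:
        return {}
    return {n: 100.0 * counts[n] / total for n in sorted(counts)}


# Toy data set: three 24 nt reads, one 21 nt read, and one read outside the window.
reads = ["A" * 24] * 3 + ["C" * 21] + ["G" * 40]
percentages = length_percentages(reads)
most_abundant = max(percentages, key=percentages.get)
print(percentages)
print(most_abundant)
```

<p>Because each class is expressed as a share of the reads retained in that data set, profiles from libraries of very different depth can be plotted on the same axis.</p>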
</sec>
<sec>
<title>Cytosine Methylation in the Gymnosperms Gi. biloba and Gn. gnemon</title>
<p>Since DCL2 was missing in
<italic>Gn. gnemon</italic>
and
<italic>W. mirabilis</italic>
, and it is thought to interact with RdDM in the noncanonical methylation of cytosine (involving RDR6;
<xref rid="evv171-B40" ref-type="bibr">Nuthikattu et al. 2013</xref>
), we conducted bisulphite sequencing of the 18S rDNA in
<italic>Gn. gnemon</italic>
and
<italic>Gi. biloba</italic>
to compare levels of CHH methylation.
<xref ref-type="fig" rid="evv171-F6">Figure 6</xref>
<italic>b</italic>
shows that in both species CG and CHG methylation levels were high, but CHH methylation was very low in
<italic>Gn. gnemon</italic>
.
<fig id="evv171-F6" orientation="portrait" position="float">
<label>F
<sc>ig</sc>
. 6.—</label>
<caption>
<p>Bisulphite sequencing of part of the 18S rDNA in
<italic>Ginkgo biloba</italic>
and
<italic>Gnetum gnemon</italic>
was used to determine the level of C methylation. (
<italic>a</italic>
) Diagrammatic scheme of the
<italic>Gn. gnemon</italic>
18S rDNA unit (GenBank accession number U42416.1) showing the loop regions (V1–V7, brown arrows) and the region selected for bisulphite sequencing (blue line). (
<italic>b</italic>
) Results of methylation analysis. Note the relatively low level of non-CG methylation in
<italic>Gn. gnemon</italic>
where only 4/451 CHH sites (0.9%) were methylated.</p>
</caption>
<graphic xlink:href="evv171f6p"></graphic>
</fig>
</p>
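<p>The three sequence contexts compared here (CG, CHG, and CHH, where H is A, C, or T) and the percentage quoted in the figure legend can be illustrated with a short sketch. It is a simplification that considers a single strand and an unambiguous A/C/G/T sequence; the function names and the toy sequence are hypothetical and are not part of the authors' analysis.</p>

```python
def cytosine_context(seq, i):
    """Return 'CG', 'CHG', 'CHH', or None for the base at position i (one strand only)."""
    seq = seq.upper()
    if seq[i] != "C":
        return None
    downstream = seq[i + 1:i + 3]  # up to two bases after the cytosine
    if downstream[:1] == "G":
        return "CG"
    if len(downstream) != 2:
        return None                # too close to the end to classify
    return "CHG" if downstream[1] == "G" else "CHH"


def percent_methylated(methylated, total):
    """Percentage of sites scored as methylated."""
    return 100.0 * methylated / total


toy = "ACGTCAGCTTC"  # invented sequence for illustration
contexts = {i: cytosine_context(toy, i) for i, base in enumerate(toy) if base == "C"}
print(contexts)
print(round(percent_methylated(4, 451), 1))  # the 4/451 CHH sites quoted in the legend
```

<p>The last line reproduces the arithmetic behind the 0.9% figure given for CHH sites in the legend.</p>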
<p>To further study methylation patterns, we used Southern hybridization with an 18S rDNA probe against restricted genomic DNA (using methylation-sensitive and methylation-insensitive isoschizomers) from the gymnosperms
<italic>Gn. gnemon</italic>
,
<italic>Gi. biloba</italic>
, and the angiosperm
<italic>Nicotiana tabacum</italic>
(chosen as a control because the methylation status of its rDNA has been extensively studied;
<xref rid="evv171-B30" ref-type="bibr">Lim et al. 2000</xref>
). We observed more extensive digestion of
<italic>Gn. gnemon</italic>
DNA with MspI (sensitive to CHG methylation) compared with the other species (
<ext-link ext-link-type="uri" xlink:href="http://gbe.oxfordjournals.org/lookup/suppl/doi:10.1093/gbe/evv171/-/DC1">supplementary fig. S4</ext-link>
,
<ext-link ext-link-type="uri" xlink:href="http://gbe.oxfordjournals.org/lookup/suppl/doi:10.1093/gbe/evv171/-/DC1">Supplementary Material</ext-link>
online, red circles). This confirmed a relative undermethylation of
<italic>Gn. gnemon</italic>
rDNA. In contrast, the fraction of rDNA resistant to digestion with methylation-sensitive enzymes (
<ext-link ext-link-type="uri" xlink:href="http://gbe.oxfordjournals.org/lookup/suppl/doi:10.1093/gbe/evv171/-/DC1">supplementary fig. S4</ext-link>
,
<ext-link ext-link-type="uri" xlink:href="http://gbe.oxfordjournals.org/lookup/suppl/doi:10.1093/gbe/evv171/-/DC1">Supplementary Material</ext-link>
online, red bars) was relatively high in
<italic>Gi. biloba</italic>
and
<italic>N. tabacum</italic>
, indicating dense methylation of their units.</p>
</sec>
</sec>
<sec sec-type="discussion">
<title>Discussion</title>
<sec>
<title>Differences in RdDM Pathway Genes across Land Plants</title>
<p>Our analyses identified six proteins that were absent outside angiosperms (
<xref ref-type="fig" rid="evv171-F7">fig. 7</xref>
): NRPD4/NRPE4, SHH1, RDM1, DMS3, KTF1, and SUVR2. All are involved in phases (1) and (2) of the canonical RdDM pathway (
<xref ref-type="fig" rid="evv171-F1">figs. 1</xref>
and
<xref ref-type="fig" rid="evv171-F2">2</xref>
and
<ext-link ext-link-type="uri" xlink:href="http://gbe.oxfordjournals.org/lookup/suppl/doi:10.1093/gbe/evv171/-/DC1">supplementary table S1</ext-link>
,
<ext-link ext-link-type="uri" xlink:href="http://gbe.oxfordjournals.org/lookup/suppl/doi:10.1093/gbe/evv171/-/DC1">Supplementary Material</ext-link>
online). Collectively, the data suggest that phases (1) and (2) of the RdDM pathway have diverged between the different land plant groups whereas phase (3), which is involved in chromatin remodeling, is the most highly conserved part of the pathway. The other proteins of phases (1) and (2) that we analyzed were found across land plants, perhaps with variant functions outside the seed plants (as in DCL, see below).
<fig id="evv171-F7" orientation="portrait" position="float">
<label>F
<sc>ig</sc>
. 7.—</label>
<caption>
<p>Synthesis of data showing likely origin of gene families associated with the RdDM pathway in the evolution of land plants. The summary tree topology was based on
<xref rid="evv171-B34" ref-type="bibr">Mathews (2009)</xref>
.</p>
</caption>
<graphic xlink:href="evv171f7p"></graphic>
</fig>
</p>
<p>NRPD4/NRPE4 is known to function as part of the RNA Pol IV and Pol V complexes. The subunit in both complexes is encoded by the same gene and is distinct from the NRPB4 subunit of RNA polymerase II (Pol II) in
<italic>Ar. thaliana</italic>
(
<xref rid="evv171-B17" ref-type="bibr">He, Hsu, Pontes, et al. 2009</xref>
;
<xref rid="evv171-B43" ref-type="bibr">Ream et al. 2009</xref>
). NRPD4/NRPE4 forms subcomplexes with NRPD7 and NRPE7 in Pol IV and Pol V, respectively (
<xref rid="evv171-B43" ref-type="bibr">Ream et al. 2009</xref>
). Pol IV and Pol V are central to the RdDM pathway and probably to its evolution (
<xref rid="evv171-B35" ref-type="bibr">Matzke et al. 2015</xref>
). Previously, it was suggested that NRPD4 evolved after the divergence of
<italic>Ph. patens</italic>
and before angiosperms (
<xref rid="evv171-B49" ref-type="bibr">Tucker et al. 2010</xref>
). Our data extend these findings by showing that NRPD4/E4 diverged with the angiosperms.</p>
<p>Pol IV is thought to be recruited to a subset of target loci for siRNA production by the protein SHH1 which recognizes and binds to H3 histones when they are unmethylated at lysine 4 (=H3K4) and methylated at lysine 9 (=H3K9), that is, markers of heterochromatin production (
<xref rid="evv171-B25" ref-type="bibr">Law et al. 2013</xref>
;
<xref rid="evv171-B53" ref-type="bibr">Zhang, Ma, et al. 2013</xref>
;
<xref rid="evv171-B35" ref-type="bibr">Matzke et al. 2015</xref>
). Our failure to detect SHH1 outside angiosperms is consistent with the lack of the NRPD4/NRPE4 subunits of Pol IV.</p>
<p>RDM1 is reported to be needed for Pol V function (
<xref rid="evv171-B35" ref-type="bibr">Matzke et al. 2015</xref>
) and is currently understood to interact with the Pol V pathway in phase (2) of the RdDM pathway in two ways: (a) acting as a homodimer protein bridging between AGO4 and DRM2 in the de novo methylation step (
<xref rid="evv171-B14" ref-type="bibr">Gao et al. 2010</xref>
;
<xref rid="evv171-B44" ref-type="bibr">Sasaki et al. 2014</xref>
), and (b) acting as a monomer in the DDR complex (together with DRD1 and DMS3) that facilitates Pol V transcription (
<xref rid="evv171-B24" ref-type="bibr">Law et al. 2010</xref>
). Certainly, Arabidopsis
<italic>rdm1</italic>
mutants show a nearly complete loss of DNA methylation via the RdDM pathway (
<xref rid="evv171-B14" ref-type="bibr">Gao et al. 2010</xref>
;
<xref rid="evv171-B48" ref-type="bibr">Stroud et al. 2013</xref>
;
<xref rid="evv171-B44" ref-type="bibr">Sasaki et al. 2014</xref>
). Previously
<xref rid="evv171-B35" ref-type="bibr">Matzke et al. (2015)</xref>
noted that RDM1 was restricted to angiosperms, and we confirm this in our taxonomically more diverse survey, which includes representatives from all major land plant groups. The phylogenetic tree inferred from RDM1 sequences from 35 angiosperm species illustrates that sequences from three eudicot families cluster into discrete, highly supported, clades (
<xref ref-type="fig" rid="evv171-F4">fig. 4</xref>
).</p>
<p>Overall, it appears that the specialized components of both Pol IV and Pol V pathways may only be present in angiosperms (
<xref ref-type="fig" rid="evv171-F7">fig. 7</xref>
).</p>
<p>The protein SUVR2 was also shown to be restricted to angiosperms (
<xref ref-type="fig" rid="evv171-F2">fig. 2</xref>
and
<ext-link ext-link-type="uri" xlink:href="http://gbe.oxfordjournals.org/lookup/suppl/doi:10.1093/gbe/evv171/-/DC1">supplementary table S1</ext-link>
,
<ext-link ext-link-type="uri" xlink:href="http://gbe.oxfordjournals.org/lookup/suppl/doi:10.1093/gbe/evv171/-/DC1">Supplementary Material</ext-link>
online). It is a putative histone methyltransferase that is not directly required for the generation of siRNAs by the RdDM pathway, but was recently shown to be required for DRM2 establishment and for maintaining methylation downstream of siRNA biogenesis (
<xref rid="evv171-B48" ref-type="bibr">Stroud et al. 2013</xref>
).</p>
<p>The final angiosperm-specific protein is KTF1, a transcription factor that plays a role in phase (2) of the RdDM pathway by coordinating transcriptional elongation with chromatin modifications and pre-mRNA processing via interactions with AGO4 (
<xref rid="evv171-B18" ref-type="bibr">He, Hsu, Zhu, et al. 2009</xref>
). It was previously reported to be restricted to angiosperms (
<xref rid="evv171-B35" ref-type="bibr">Matzke et al. 2015</xref>
), consistent with findings from the more extensive survey here.</p>
</sec>
<sec>
<title>DCL Proteins</title>
<p>DCL proteins are multidomain endoribonucleases, which “dice” or cut long double-stranded RNA precursors into sRNAs (
<xref rid="evv171-B4" ref-type="bibr">Bernstein et al. 2001</xref>
;
<xref rid="evv171-B32" ref-type="bibr">Liu et al. 2009</xref>
;
<xref rid="evv171-B35" ref-type="bibr">Matzke et al. 2015</xref>
). The number of DCL family members varies among different organisms and patterns of evolution across eukaryotes, including an alga and three angiosperms, have been discussed previously (
<xref rid="evv171-B33" ref-type="bibr">Margis et al. 2006</xref>
). In
<italic>Ar. thaliana</italic>
there are four DCL gene paralogues (DCL1–DCL4) (
<xref rid="evv171-B45" ref-type="bibr">Schauer et al. 2002</xref>
), but in other eukaryotic groups the numbers can vary from one to more than four types (
<xref rid="evv171-B4" ref-type="bibr">Bernstein et al. 2001</xref>
;
<xref rid="evv171-B32" ref-type="bibr">Liu et al. 2009</xref>
). Our analysis showed that only DCL1 was found in all the land plant lineages examined, which suggests it is the most highly conserved. In
<italic>Ar. thaliana</italic>
, this protein has a role in generating 21 nt miRNAs involved in posttranscriptional regulation of their target genes.</p>
<p>The isolated absences of other DCL family members in our analysis (e.g., DCL3 in
<italic>W. mirabilis</italic>
and
<italic>Pt. aquilinum</italic>
and DCL4 in
<italic>Gn. gnemon</italic>
and
<italic>S. moellendorffii</italic>
) may have arisen because the gene transcripts were not sequenced or detected by us. We have therefore put more weight on our findings where there is strong phylogenetic signal in the patterns of gene losses and gains (
<xref ref-type="fig" rid="evv171-F3">fig. 3</xref>
).</p>
<p>In our analysis, DCL2 was only detected in species belonging to the seed plants (
<xref ref-type="fig" rid="evv171-F3">fig. 3</xref>
), although it was not found in the two species of the gymnosperm order Gnetales examined (i.e.,
<italic>W. mirabilis</italic>
and
<italic>Gn. gnemon</italic>
;
<xref ref-type="fig" rid="evv171-F3">fig. 3</xref>
). It is therefore possible that DCL2 sequences have been secondarily lost with the divergence of these species in Gnetales (
<xref ref-type="fig" rid="evv171-F7">fig. 7</xref>
). DCL2 is thought to be involved in RNA-mediated virus resistance and is associated with the production of 22 nt sRNAs. There may also be interactions between the posttranscriptional gene silencing pathway that targets RNA polymerase II-transcribed genes, including newly transposed retroelements, and the noncanonical methylation of cytosines in the RdDM pathway (
<xref rid="evv171-B40" ref-type="bibr">Nuthikattu et al. 2013</xref>
). The latter involves the activities of DCL2 and DCL4 to generate 21 and 22 nt sRNAs. Bisulphite sequencing of
<italic>Gn. gnemon</italic>
revealed unusually low levels of CHH methylation in 18S rDNA sequences compared with
<italic>Gi. biloba</italic>
, which does have DCL2 (
<xref ref-type="fig" rid="evv171-F6">fig. 6</xref>
<italic>b</italic>
and
<ext-link ext-link-type="uri" xlink:href="http://gbe.oxfordjournals.org/lookup/suppl/doi:10.1093/gbe/evv171/-/DC1">supplementary fig. S4</ext-link>
,
<ext-link ext-link-type="uri" xlink:href="http://gbe.oxfordjournals.org/lookup/suppl/doi:10.1093/gbe/evv171/-/DC1">Supplementary Material</ext-link>
online). Such a result is consistent with an absence of interaction of DCL2 with RdDM in
<italic>Gn. gnemon</italic>
. If so, the absence of DCL2 outside the seed plants could have similar consequences for the degree of methylation at noncanonical cytosines.</p>
<p>DCL3, which generates 24 nt sRNAs and is directly involved in the canonical RdDM pathway (
<xref ref-type="fig" rid="evv171-F1">fig. 1</xref>
), was found in all plant groups except in the monilophyte studied (
<xref ref-type="fig" rid="evv171-F3">fig. 3</xref>
). Although previous studies failed to detect 24 nt sRNAs in conifers (
<xref rid="evv171-B9" ref-type="bibr">Dolgosheina et al. 2008</xref>
), recently they were reported to be present in some tissues of
<italic>Cunninghamia lanceolata</italic>
(
<xref rid="evv171-B50" ref-type="bibr">Wan et al. 2012</xref>
),
<italic>Pic. abies</italic>
(
<xref rid="evv171-B41" ref-type="bibr">Nystedt et al. 2013</xref>
), and
<italic>Larix leptolepis</italic>
(
<xref rid="evv171-B54" ref-type="bibr">Zhang, Wu, et al. 2013</xref>
), consistent with the results presented here. Indeed, our survey of the sRNAs generated across land plants shows that all species have a fraction of sRNAs that are 24 nt long, although it is only in the angiosperms that these comprise the major fraction of sRNAs (
<xref ref-type="fig" rid="evv171-F5">fig. 5</xref>
). The observation that 24 nt sRNAs were present in the monilophyte
<italic>Pt. aquilinum</italic>
(
<xref ref-type="fig" rid="evv171-F5">fig. 5</xref>
) may indicate that we have simply failed to find DCL3 in the transcriptome data currently available, rather than the gene being absent from its genome.</p>
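<p>As an illustration of how sRNA size classes of the kind discussed above can be tabulated, the following minimal Python sketch counts read lengths and reports the proportion of reads in each class. The function name size_class_fractions and the toy reads are hypothetical and are given for illustration only; they are not the pipeline or the data used in this study.</p>
<preformat>
```python
from collections import Counter


def size_class_fractions(reads):
    """Return {read length: fraction of all reads} for a list of sRNA sequences."""
    counts = Counter(len(seq) for seq in reads)
    total = sum(counts.values())
    # An empty input yields an empty dict, so no division by zero occurs.
    return {length: n / total for length, n in sorted(counts.items())}


# Hypothetical toy reads (two of 21 nt, one of 22 nt, one of 24 nt).
toy_reads = ["A" * 21, "C" * 21, "G" * 22, "U" * 24]
fractions = size_class_fractions(toy_reads)
print(fractions)
```
</preformat>
<p>With real data, the 24 nt entry of the returned dictionary would give the share of reads in the size class associated with DCL3 activity, which could then be compared across species.</p>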
<p>DCL4 is thought to be involved in trans-acting RNA metabolism and post-transcriptional gene regulation, generating 21 nt sRNAs. We found DCL4 in all land plant lineages except the lycophyte studied (
<xref ref-type="table" rid="evv171-T2">table 2</xref>
and
<xref ref-type="fig" rid="evv171-F3">fig. 3</xref>
).</p>
<p>When considering genes missing from the pathway, it should be noted that there is redundancy in function between these DCL families, which results in limited phenotypes in knock-out experiments (
<xref rid="evv171-B1" ref-type="bibr">Andika et al. 2015</xref>
). This means that the losses of particular DCL families may be functionally compensated for by the activity of another DCL family member.</p>
</sec>
<sec>
<title>Influence of Different Epigenetic Machinery on Genome Structures</title>
<p>The primary role of RdDM is considered to be the epigenetic silencing of repeats, predominantly retroelements across the genome. This silencing process leads to chromatin remodeling or heterochromatinization, which typically renders the repeats transcriptionally silent (
<xref rid="evv171-B36" ref-type="bibr">Matzke and Mosher 2014</xref>
;
<xref rid="evv171-B35" ref-type="bibr">Matzke et al. 2015</xref>
). Among angiosperms, modifications to, or breakdown of, the RdDM pathway are known to lead to repeat amplification, as shown, for example, by the inactivity of an orthologue of RDR2 in
<italic>Z. mays</italic>
resulting in enhanced transposon activity (
<xref rid="evv171-B52" ref-type="bibr">Woodhouse et al. 2006</xref>
).</p>
<p>The differences in the epigenetic machinery among representatives of the major land plant groups we show here might potentially influence the evolutionary dynamics of their genomes. Angiosperms are thought to have dynamic genome structures compared with gymnosperms, with a higher level of turnover of retroelements (
<xref rid="evv171-B26" ref-type="bibr">Leitch and Leitch 2012</xref>
), at least in those species with a small genome (cf.
<xref rid="evv171-B20" ref-type="bibr">Kelly et al. 2015</xref>
). Angiosperms are also remarkable among comparably sized eukaryotic groups in terms of their genome size diversity. Not only do they have the largest range for any comparable group, varying approximately 2,400-fold (1C = 0.063–152.23 pg), but the distribution of genome sizes is also skewed towards small genomes, with the modal and median values being just 1C = 0.6 pg and 2.5 pg, respectively (
<xref rid="evv171-B27" ref-type="bibr">Leitch and Leitch 2013</xref>
).</p>
<p>To determine if angiosperms with large genomes have anything unusual in their RdDM pathway we analyzed
<italic>F. persica</italic>
, which has an extraordinarily large genome for any eukaryote (1C = 41.21 pg,
<xref rid="evv171-B20" ref-type="bibr">Kelly et al. 2015</xref>
), nearly 300 times that of
<italic>Ar. thaliana</italic>.
Previously, in a study of a related species (
<italic>F. imperialis</italic>
; 1C = 43 pg), we identified a pararetrovirus-like repeat sequence (FriEPRV) which was estimated to be present in approximately 21,000 copies, accounting for 0.4% of its genome (
<xref rid="evv171-B2" ref-type="bibr">Becher et al. 2014</xref>
). We showed high levels of cytosine methylation and an abundance of 24 nt sRNA reads that mapped exclusively to the repeat, a result which did not suggest anything unusual in the RdDM pathway. Nevertheless, here we failed to detect NRPD1, SUVR2, and SHH1 (
<ext-link ext-link-type="uri" xlink:href="http://gbe.oxfordjournals.org/lookup/suppl/doi:10.1093/gbe/evv171/-/DC1">supplementary table S1</ext-link>
,
<ext-link ext-link-type="uri" xlink:href="http://gbe.oxfordjournals.org/lookup/suppl/doi:10.1093/gbe/evv171/-/DC1">Supplementary Material</ext-link>
online) in
<italic>F. persica</italic>
. Potentially, KTF1 is also missing since the OrthoMCL group protein isolated lacks GW/WG motifs, which function to interact with AGO4 and siRNAs (
<xref rid="evv171-B18" ref-type="bibr">He, Hsu, Zhu, et al. 2009</xref>
). However, for this protein we cannot rule out incomplete assembly (
<ext-link ext-link-type="uri" xlink:href="http://gbe.oxfordjournals.org/lookup/suppl/doi:10.1093/gbe/evv171/-/DC1">supplementary fig. S3</ext-link>
,
<ext-link ext-link-type="uri" xlink:href="http://gbe.oxfordjournals.org/lookup/suppl/doi:10.1093/gbe/evv171/-/DC1">Supplementary Material</ext-link>
online). Similarly, we only found two domains for DCL4 (
<ext-link ext-link-type="uri" xlink:href="http://gbe.oxfordjournals.org/lookup/suppl/doi:10.1093/gbe/evv171/-/DC1">supplementary table S4</ext-link>
,
<ext-link ext-link-type="uri" xlink:href="http://gbe.oxfordjournals.org/lookup/suppl/doi:10.1093/gbe/evv171/-/DC1">Supplementary Material</ext-link>
online), although this too might point towards an incomplete assembly. Nevertheless, collectively, it remains possible that there is divergence in particular components of the RdDM pathway in
<italic>Fritillaria</italic>
, which perhaps impacts on the amplification and elimination of different types of repeat in the genome. If so, this may contribute to the observation that
<italic>Fritillaria</italic>
genomes comprise a high diversity of highly heterogeneous repeats, each representing a rather small proportion of the genome (
<xref rid="evv171-B20" ref-type="bibr">Kelly et al. 2015</xref>
). Such a pattern of repeats may also be present in other species with large genomes (
<xref rid="evv171-B37" ref-type="bibr">Metcalfe and Casane 2013</xref>
). This pattern differs from that generally found in species with small genomes, where amplification of one or a few repeat families can result in the contrasting genome sizes observed (
<xref rid="evv171-B16" ref-type="bibr">Grover and Wendel 2010</xref>
;
<xref rid="evv171-B3" ref-type="bibr">Bennetzen and Wang 2014</xref>
).</p>
<p>In contrast to angiosperms, gymnosperms have relatively limited genome size variation (just 16-fold overall, 2.25–36.00 pg) despite having the highest proportion of species with recorded DNA C-values (∼25% of species;
<xref rid="evv171-B26" ref-type="bibr">Leitch and Leitch 2012</xref>
,
<xref rid="evv171-B27" ref-type="bibr">2013</xref>
). In addition, the mode and median genome size values are significantly higher compared with angiosperms (gymnosperm mode 1C = 10.0 pg, median 1C = 7.9 pg and mean 1C = 18.6 pg). Such differences, coupled with the heterogeneous repeat profiles of the Coniferales species examined (
<xref rid="evv171-B23" ref-type="bibr">Kovach et al. 2010</xref>
;
<xref rid="evv171-B41" ref-type="bibr">Nystedt et al. 2013</xref>
), could also be related to differences observed in the epigenetic machineries. Potentially in angiosperms RdDM pathways evolved as another, or alternative, layer of transposon proliferation control not found in other land plant groups. In angiosperms, it is thought that activated transposons (transcribing RNA) are resilenced through RdDM. However, we are unaware of evidence for an active transposon in gymnosperms, despite their large genomes, whereas there are many examples in angiosperms (
<xref rid="evv171-B31" ref-type="bibr">Lisch 2013</xref>
). Possibly gymnosperms and other land plants have alternative mechanisms to silence transposons, such as an elevated frequency of C to T mutation of noncoding, highly methylated repeats.</p>
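<p>The fold ranges quoted for angiosperm and gymnosperm genome sizes follow directly from the 1C extremes given in the text. A minimal Python sketch of that arithmetic is shown here; the function name fold_range is hypothetical, and the only inputs are the minimum and maximum 1C values stated above.</p>
<preformat>
```python
def fold_range(smallest_pg, largest_pg):
    """Ratio of the largest to the smallest 1C genome size (both in pg)."""
    return largest_pg / smallest_pg


# 1C extremes as quoted in the text.
angiosperm_fold = fold_range(0.063, 152.23)
gymnosperm_fold = fold_range(2.25, 36.00)

print(round(angiosperm_fold), round(gymnosperm_fold))  # prints: 2416 16
```
</preformat>
<p>The angiosperm ratio of about 2,416 corresponds to the "approximately 2,400-fold" range cited, and the gymnosperm ratio is exactly 16.</p>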
<p>Available cytological data in monilophytes, lycophytes, and bryophytes point to further differences with seed plants in patterns of genome organization (
<xref rid="evv171-B27" ref-type="bibr">Leitch and Leitch 2013</xref>
). Sadly, however, the lack of extensive genomic data for these land plant groups precludes generalizations about their genome dynamics and the role that epigenetics may play. It is clear that more molecular studies are needed to probe the role of RdDM in contributing to the contrasting genomic profiles observed across land plants.</p>
</sec>
</sec>
<sec sec-type="supplementary-material">
<title>Supplementary Material</title>
<p>
<ext-link ext-link-type="uri" xlink:href="http://gbe.oxfordjournals.org/lookup/suppl/doi:10.1093/gbe/evv171/-/DC1">Supplementary data file S1</ext-link>
,
<ext-link ext-link-type="uri" xlink:href="http://gbe.oxfordjournals.org/lookup/suppl/doi:10.1093/gbe/evv171/-/DC1">figures S1–S4</ext-link>
, and
<ext-link ext-link-type="uri" xlink:href="http://gbe.oxfordjournals.org/lookup/suppl/doi:10.1093/gbe/evv171/-/DC1">tables S1–S5</ext-link>
are available at
<italic>Genome Biology and Evolution</italic>
online (
<ext-link ext-link-type="uri" xlink:href="http://www.gbe.oxfordjournals.org/">http://www.gbe.oxfordjournals.org/</ext-link>
).</p>
<supplementary-material id="PMC_1" content-type="local-data">
<caption>
<title>Supplementary Data</title>
</caption>
<media mimetype="text" mime-subtype="html" xlink:href="supp_7_9_2648__index.html"></media>
<media xlink:role="associated-file" mimetype="application" mime-subtype="pdf" xlink:href="supp_evv171_Ma_et_al_Revised_suppl_data_21_Aug_2015.pdf"></media>
<media xlink:role="associated-file" mimetype="application" mime-subtype="msword" xlink:href="supp_evv171_New_Microsoft_Office_Word_Document.docx"></media>
</supplementary-material>
</sec>
</body>
<back>
<ack>
<title>Acknowledgments</title>
<p>The authors are grateful for funding from FP7 Marie Curie IEF (Lu Ma), the FP7 Marie Curie ITN INTERCROSSING (Andrea Hatlen), the Czech Science Foundation (501/12/G090, Ales Kovarik), China Scholarship Council (Wencai Wang), and NERC (NE/G01724/1, Laura Kelly, Ilia Leitch, Andrew Leitch). The Illumina sequencing of
<italic>Fritillaria</italic>
was funded by NERC (NE/G01724/1) and generated by the Centre of Genomic Research in the University of Liverpool, United Kingdom. This research utilized Queen Mary's MidPlus computational facilities, supported by QMUL Research-IT and funded by
<funding-source>EPSRC</funding-source>
grant
<award-id>EP/K000128/1</award-id>
. The authors thank an anonymous referee for a rigorous, insightful, and helpful review.</p>
</ack>
<ref-list>
<title>Literature Cited</title>
<ref id="evv171-B1">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Andika</surname>
<given-names>IB</given-names>
</name>
<etal></etal>
</person-group>
<year>2015</year>
<article-title>Differential contributions of plant Dicer-like proteins to antiviral defences against potato virus X in leaves and roots</article-title>
.
<source>Plant J.</source>
<volume>81</volume>
:
<fpage>781</fpage>
<lpage>793</lpage>
.
<pub-id pub-id-type="pmid">25619543</pub-id>
</mixed-citation>
</ref>
<ref id="evv171-B2">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Becher</surname>
<given-names>H</given-names>
</name>
<etal></etal>
</person-group>
<year>2014</year>
<article-title>Endogenous pararetrovirus sequences associated with 24 nt small RNAs at the centromeres of
<italic>Fritillaria imperialis</italic>
L. (Liliaceae), a species with a giant genome</article-title>
.
<source>Plant J.</source>
<volume>80</volume>
:
<fpage>823</fpage>
<lpage>833</lpage>
.
<pub-id pub-id-type="pmid">25230921</pub-id>
</mixed-citation>
</ref>
<ref id="evv171-B3">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Bennetzen</surname>
<given-names>JL</given-names>
</name>
<name>
<surname>Wang</surname>
<given-names>H</given-names>
</name>
</person-group>
<year>2014</year>
<article-title>The contributions of transposable elements to the structure, function, and evolution of plant genomes</article-title>
.
<source>Annu Rev Plant Biol.</source>
<volume>65</volume>
:
<fpage>505</fpage>
<lpage>530</lpage>
.
<pub-id pub-id-type="pmid">24579996</pub-id>
</mixed-citation>
</ref>
<ref id="evv171-B4">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Bernstein</surname>
<given-names>E</given-names>
</name>
<name>
<surname>Caudy</surname>
<given-names>AA</given-names>
</name>
<name>
<surname>Hammond</surname>
<given-names>SM</given-names>
</name>
<name>
<surname>Hannon</surname>
<given-names>GJ</given-names>
</name>
</person-group>
<year>2001</year>
<article-title>Role for a bidentate ribonuclease in the initiation step of RNA interference</article-title>
.
<source>Nature</source>
<volume>409</volume>
:
<fpage>363</fpage>
<lpage>366</lpage>
.
<pub-id pub-id-type="pmid">11201747</pub-id>
</mixed-citation>
</ref>
<ref id="evv171-B5">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Böhmdorfer</surname>
<given-names>G</given-names>
</name>
<etal></etal>
</person-group>
<year>2011</year>
<article-title>GMI1, a structural-maintenance-of-chromosomes-hinge domain-containing protein, is involved in somatic homologous recombination in
<italic>Arabidopsis</italic>
</article-title>
.
<source>Plant J.</source>
<volume>67</volume>
:
<fpage>420</fpage>
-
<lpage>433</lpage>
.
<pub-id pub-id-type="pmid">21481027</pub-id>
</mixed-citation>
</ref>
<ref id="evv171-B6">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Capella-Gutiérrez</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Silla-Martínez</surname>
<given-names>JM</given-names>
</name>
<name>
<surname>Gabaldón</surname>
<given-names>T</given-names>
</name>
</person-group>
<year>2009</year>
<article-title>trimAl: a tool for automated alignment trimming in large-scale phylogenetic analyses</article-title>
.
<source>Bioinformatics</source>
<volume>25</volume>
:
<fpage>1972</fpage>
<lpage>1973</lpage>
.
<pub-id pub-id-type="pmid">19505945</pub-id>
</mixed-citation>
</ref>
<ref id="evv171-B7">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Chang</surname>
<given-names>J-M</given-names>
</name>
<name>
<surname>Di Tommaso</surname>
<given-names>P</given-names>
</name>
<name>
<surname>Lefort</surname>
<given-names>V</given-names>
</name>
<name>
<surname>Gascuel</surname>
<given-names>O</given-names>
</name>
<name>
<surname>Notredame</surname>
<given-names>C</given-names>
</name>
</person-group>
<year>2015</year>
<article-title>TCS: a web server for multiple sequence alignment evaluation and phylogenetic reconstruction</article-title>
.
<source>Nucleic Acids Res.</source>
<volume>43</volume>
:
<fpage>W3</fpage>
<lpage>W6</lpage>
.
<pub-id pub-id-type="pmid">25855806</pub-id>
</mixed-citation>
</ref>
<ref id="evv171-B8">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Darriba</surname>
<given-names>D</given-names>
</name>
<name>
<surname>Taboada</surname>
<given-names>GL</given-names>
</name>
<name>
<surname>Doallo</surname>
<given-names>R</given-names>
</name>
<name>
<surname>Posada</surname>
<given-names>D</given-names>
</name>
</person-group>
<year>2011</year>
<article-title>ProtTest 3: fast selection of best-fit models of protein evolution</article-title>
.
<source>Bioinformatics</source>
<volume>27</volume>
:
<fpage>1164</fpage>
<lpage>1165</lpage>
.
<pub-id pub-id-type="pmid">21335321</pub-id>
</mixed-citation>
</ref>
<ref id="evv171-B9">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Dolgosheina</surname>
<given-names>EV</given-names>
</name>
<etal></etal>
</person-group>
<year>2008</year>
<article-title>Conifers have a unique small RNA silencing signature</article-title>
.
<source>RNA</source>
<volume>14</volume>
:
<fpage>1508</fpage>
<lpage>1515</lpage>
.
<pub-id pub-id-type="pmid">18566193</pub-id>
</mixed-citation>
</ref>
<ref id="evv171-B10">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Edgar</surname>
<given-names>RC</given-names>
</name>
</person-group>
<year>2004</year>
<article-title>MUSCLE: multiple sequence alignment with high accuracy and high throughput</article-title>
.
<source>Nucleic Acids Res.</source>
<volume>32</volume>
:
<fpage>1792</fpage>
<lpage>1797</lpage>
.
<pub-id pub-id-type="pmid">15034147</pub-id>
</mixed-citation>
</ref>
<ref id="evv171-B11">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Fei</surname>
<given-names>Q</given-names>
</name>
<name>
<surname>Xia</surname>
<given-names>R</given-names>
</name>
<name>
<surname>Meyers</surname>
<given-names>BC</given-names>
</name>
</person-group>
<year>2013</year>
<article-title>Phased, secondary, small interfering RNAs in posttranscriptional regulatory networks</article-title>
.
<source>Plant Cell</source>
<volume>25</volume>
:
<fpage>2400</fpage>
<lpage>2415</lpage>
.
<pub-id pub-id-type="pmid">23881411</pub-id>
</mixed-citation>
</ref>
<ref id="evv171-B12">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Fischer</surname>
<given-names>S</given-names>
</name>
<etal></etal>
</person-group>
<year>2011</year>
<article-title>Using OrthoMCL to assign proteins to OrthoMCL-DB groups or to cluster proteomes into new ortholog groups</article-title>
.
<source>Curr Protoc Bioinformatics</source>
.
<comment>Chapter 6:Unit 6.12.1–6.12.19</comment>
.</mixed-citation>
</ref>
<ref id="evv171-B13">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Fuchs</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Jovtchev</surname>
<given-names>G</given-names>
</name>
<name>
<surname>Schubert</surname>
<given-names>I</given-names>
</name>
</person-group>
<year>2008</year>
<article-title>The chromosomal distribution of histone methylation marks in gymnosperms differs from that of angiosperms</article-title>
.
<source>Chromosom Res.</source>
<volume>16</volume>
:
<fpage>891</fpage>
<lpage>898</lpage>
.</mixed-citation>
</ref>
<ref id="evv171-B14">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Gao</surname>
<given-names>Z</given-names>
</name>
<etal></etal>
</person-group>
<year>2010</year>
<article-title>An RNA polymerase II- and AGO4-associated protein acts in RNA-directed DNA methylation</article-title>
.
<source>Nature</source>
<volume>465</volume>
:
<fpage>106</fpage>
<lpage>109</lpage>
.
<pub-id pub-id-type="pmid">20410883</pub-id>
</mixed-citation>
</ref>
<ref id="evv171-B15">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Grabherr</surname>
<given-names>MG</given-names>
</name>
<etal></etal>
</person-group>
<year>2011</year>
<article-title>Full-length transcriptome assembly from RNA-Seq data without a reference genome</article-title>
.
<source>Nat Biotechnol.</source>
<volume>29</volume>
:
<fpage>644</fpage>
<lpage>652</lpage>
.
<pub-id pub-id-type="pmid">21572440</pub-id>
</mixed-citation>
</ref>
<ref id="evv171-B16">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Grover</surname>
<given-names>CE</given-names>
</name>
<name>
<surname>Wendel</surname>
<given-names>JF</given-names>
</name>
</person-group>
<year>2010</year>
<article-title>Recent insights into mechanisms of genome size change in plants</article-title>
.
<source>J Bot.</source>
<volume>2010</volume>
:
<fpage>1</fpage>
<lpage>8</lpage>
.</mixed-citation>
</ref>
<ref id="evv171-B17">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>He</surname>
<given-names>XJ</given-names>
</name>
<name>
<surname>Hsu</surname>
<given-names>YF</given-names>
</name>
<name>
<surname>Pontes</surname>
<given-names>O</given-names>
</name>
<etal></etal>
</person-group>
<year>2009</year>
<article-title>NRPD4, a protein related to the RPB4 subunit of RNA polymerase II, is a component of RNA polymerases IV and V and is required for RNA-directed DNA methylation</article-title>
.
<source>Genes Dev.</source>
<volume>23</volume>
:
<fpage>318</fpage>
<lpage>330</lpage>
.
<pub-id pub-id-type="pmid">19204117</pub-id>
</mixed-citation>
</ref>
<ref id="evv171-B18">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>He</surname>
<given-names>XJ</given-names>
</name>
<name>
<surname>Hsu</surname>
<given-names>YF</given-names>
</name>
<name>
<surname>Zhu</surname>
<given-names>S</given-names>
</name>
<etal></etal>
</person-group>
<year>2009</year>
<article-title>An effector of RNA-directed DNA methylation in
<italic>Arabidopsis</italic>
is an ARGONAUTE 4- and RNA-binding protein</article-title>
.
<source>Cell</source>
<volume>137</volume>
:
<fpage>498</fpage>
<lpage>508</lpage>
.
<pub-id pub-id-type="pmid">19410546</pub-id>
</mixed-citation>
</ref>
<ref id="evv171-B19">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Hetzl</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Foerster</surname>
<given-names>AM</given-names>
</name>
<name>
<surname>Raidl</surname>
<given-names>G</given-names>
</name>
<name>
<surname>Mittelsten Scheid</surname>
<given-names>O</given-names>
</name>
</person-group>
<year>2007</year>
<article-title>CyMATE: a new tool for methylation analysis of plant genomic DNA after bisulphite sequencing</article-title>
.
<source>Plant J.</source>
<volume>51</volume>
:
<fpage>526</fpage>
<lpage>536</lpage>
.
<pub-id pub-id-type="pmid">17559516</pub-id>
</mixed-citation>
</ref>
<ref id="evv171-B20">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Kelly</surname>
<given-names>LJ</given-names>
</name>
<etal></etal>
</person-group>
<year>2015</year>
<article-title>Analysis of the giant genomes of
<italic>Fritillaria</italic>
(Liliaceae) indicates that a lack of DNA removal characterizes extreme expansions in genome size</article-title>
.
<source>New Phytol</source>
:
<comment>doi: 10.1111/nph.13471</comment>
.</mixed-citation>
</ref>
<ref id="evv171-B21">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Kenrick</surname>
<given-names>P</given-names>
</name>
<name>
<surname>Crane</surname>
<given-names>PR</given-names>
</name>
</person-group>
<year>1997</year>
<article-title>The origin and early evolution of plants on land</article-title>
.
<source>Nature</source>
<volume>389</volume>
:
<fpage>33</fpage>
<lpage>39</lpage>
.</mixed-citation>
</ref>
<ref id="evv171-B22">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Kovarik</surname>
<given-names>A</given-names>
</name>
<etal></etal>
</person-group>
<year>2005</year>
<article-title>Rapid concerted evolution of nuclear ribosomal DNA in two
<italic>Tragopogon</italic>
allopolyploids of recent and recurrent origin</article-title>
.
<source>Genetics</source>
<volume>169</volume>
:
<fpage>931</fpage>
<lpage>944</lpage>
.
<pub-id pub-id-type="pmid">15654116</pub-id>
</mixed-citation>
</ref>
<ref id="evv171-B23">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Kovach</surname>
<given-names>A</given-names>
</name>
<etal></etal>
</person-group>
<year>2010</year>
<article-title>The
<italic>Pinus taeda</italic>
genome is characterized by diverse and highly diverged repetitive sequences</article-title>
.
<source>BMC Genomics</source>
<volume>11</volume>
:
<fpage>420</fpage>
.
<pub-id pub-id-type="pmid">20609256</pub-id>
</mixed-citation>
</ref>
<ref id="evv171-B24">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Law</surname>
<given-names>J</given-names>
</name>
<etal></etal>
</person-group>
<year>2010</year>
<article-title>A protein complex required for polymerase V transcripts and RNA-directed DNA methylation in
<italic>Arabidopsis</italic>
</article-title>
.
<source>Curr Biol.</source>
<volume>20</volume>
:
<fpage>951</fpage>
<lpage>956</lpage>
.
<pub-id pub-id-type="pmid">20409711</pub-id>
</mixed-citation>
</ref>
<ref id="evv171-B25">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Law</surname>
<given-names>J</given-names>
</name>
<etal></etal>
</person-group>
<year>2013</year>
<article-title>Polymerase IV occupancy at RNA-directed DNA methylation sites requires SHH1</article-title>
.
<source>Nature</source>
<volume>498</volume>
:
<fpage>385</fpage>
<lpage>389</lpage>
.
<pub-id pub-id-type="pmid">23636332</pub-id>
</mixed-citation>
</ref>
<ref id="evv171-B26">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Leitch</surname>
<given-names>AR</given-names>
</name>
<name>
<surname>Leitch</surname>
<given-names>IJ</given-names>
</name>
</person-group>
<year>2012</year>
<article-title>Ecological and genetic factors linked to contrasting genome dynamics in seed plants</article-title>
.
<source>New Phytol.</source>
<volume>194</volume>
:
<fpage>629</fpage>
<lpage>646</lpage>
.
<pub-id pub-id-type="pmid">22432525</pub-id>
</mixed-citation>
</ref>
<ref id="evv171-B27">
<mixed-citation publication-type="book">
<person-group person-group-type="author">
<name>
<surname>Leitch</surname>
<given-names>IJ</given-names>
</name>
<name>
<surname>Leitch</surname>
<given-names>AR</given-names>
</name>
</person-group>
<year>2013</year>
<article-title>Genome size diversity and evolution in land plants</article-title>
. In:
<person-group person-group-type="editor">
<name>
<surname>Leitch</surname>
<given-names>IJ</given-names>
</name>
<name>
<surname>Greilhuber</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Dolezel</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Wendel</surname>
<given-names>JF</given-names>
</name>
</person-group>
, editors.
<source>Plant genome diversity</source>
.
<volume>Vol. 2</volume>
<publisher-loc>Vienna</publisher-loc>
:
<publisher-name>Springer</publisher-name>
p.
<fpage>307</fpage>
<lpage>322</lpage>
.</mixed-citation>
</ref>
<ref id="evv171-B28">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Li</surname>
<given-names>FW</given-names>
</name>
<etal></etal>
</person-group>
<year>2014</year>
<article-title>Horizontal transfer of an adaptive chimeric photoreceptor from bryophytes to ferns</article-title>
.
<source>Proc Natl Acad Sci U S A.</source>
<volume>111</volume>
:
<fpage>6672</fpage>
<lpage>6677</lpage>
.
<pub-id pub-id-type="pmid">24733898</pub-id>
</mixed-citation>
</ref>
<ref id="evv171-B29">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Li</surname>
<given-names>L</given-names>
</name>
<name>
<surname>Stoeckert</surname>
<given-names>CJ</given-names>
</name>
<name>
<surname>Roos</surname>
<given-names>DS</given-names>
</name>
</person-group>
<year>2003</year>
<article-title>OrthoMCL: identification of ortholog groups for eukaryotic genomes</article-title>
.
<source>Genome Res.</source>
<volume>13</volume>
:
<fpage>2178</fpage>
<lpage>2189</lpage>
.
<pub-id pub-id-type="pmid">12952885</pub-id>
</mixed-citation>
</ref>
<ref id="evv171-B30">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Lim</surname>
<given-names>KY</given-names>
</name>
<etal></etal>
</person-group>
<year>2000</year>
<article-title>Gene conversion of ribosomal DNA in
<italic>Nicotiana tabacum</italic>
is associated with undermethylated, decondensed and probably active gene units</article-title>
.
<source>Chromosoma</source>
<volume>109</volume>
:
<fpage>161</fpage>
<lpage>172</lpage>
.
<pub-id pub-id-type="pmid">10929194</pub-id>
</mixed-citation>
</ref>
<ref id="evv171-B31">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Lisch</surname>
<given-names>D</given-names>
</name>
</person-group>
<year>2013</year>
<article-title>How important are transposons in plant evolution?</article-title>
<source>Nat Rev Genet.</source>
<volume>14</volume>
:
<fpage>49</fpage>
<lpage>61</lpage>
.</mixed-citation>
</ref>
<ref id="evv171-B32">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Liu</surname>
<given-names>Q</given-names>
</name>
<name>
<surname>Feng</surname>
<given-names>Y</given-names>
</name>
<name>
<surname>Zhu</surname>
<given-names>Z</given-names>
</name>
</person-group>
<year>2009</year>
<article-title>Dicer-like (DCL) proteins in plants</article-title>
.
<source>Funct Integr Genomics.</source>
<volume>9</volume>
:
<fpage>277</fpage>
<lpage>286</lpage>
.
<pub-id pub-id-type="pmid">19221817</pub-id>
</mixed-citation>
</ref>
<ref id="evv171-B33">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Margis</surname>
<given-names>R</given-names>
</name>
<etal></etal>
</person-group>
<year>2006</year>
<article-title>The evolution and diversification of Dicers in plants</article-title>
.
<source>FEBS Lett.</source>
<volume>580</volume>
:
<fpage>2442</fpage>
<lpage>2450</lpage>
.
<pub-id pub-id-type="pmid">16638569</pub-id>
</mixed-citation>
</ref>
<ref id="evv171-B34">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Mathews</surname>
<given-names>S</given-names>
</name>
</person-group>
<year>2009</year>
<article-title>Phylogenetic relationships among seed plants: persistent questions and the limits of molecular data</article-title>
.
<source>Am. J. Bot.</source>
<volume>96</volume>
:
<fpage>228</fpage>
<lpage>236</lpage>
.
<pub-id pub-id-type="pmid">21628186</pub-id>
</mixed-citation>
</ref>
<ref id="evv171-B35">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Matzke</surname>
<given-names>MA</given-names>
</name>
<name>
<surname>Kanno</surname>
<given-names>T</given-names>
</name>
<name>
<surname>Matzke</surname>
<given-names>AJM</given-names>
</name>
</person-group>
<year>2015</year>
<article-title>RNA-directed DNA methylation: the evolution of a complex epigenetic pathway in flowering plants</article-title>
.
<source>Annu Rev Plant Biol.</source>
<volume>66</volume>
:
<fpage>243</fpage>
<lpage>267</lpage>
.
<pub-id pub-id-type="pmid">25494460</pub-id>
</mixed-citation>
</ref>
<ref id="evv171-B36">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Matzke</surname>
<given-names>MA</given-names>
</name>
<name>
<surname>Mosher</surname>
<given-names>RA</given-names>
</name>
</person-group>
<year>2014</year>
<article-title>RNA-directed DNA methylation: an epigenetic pathway of increasing complexity</article-title>
.
<source>Nat Rev Genet.</source>
<volume>15</volume>
:
<fpage>394</fpage>
<lpage>408</lpage>
.
<pub-id pub-id-type="pmid">24805120</pub-id>
</mixed-citation>
</ref>
<ref id="evv171-B37">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Metcalfe</surname>
<given-names>CJ</given-names>
</name>
<name>
<surname>Casane</surname>
<given-names>D</given-names>
</name>
</person-group>
<year>2013</year>
<article-title>Accommodating the load: the transposable element content of very large genomes</article-title>
.
<source>Mob Genet Elements.</source>
<volume>3</volume>
:
<fpage>e24775</fpage>
.
<pub-id pub-id-type="pmid">24616835</pub-id>
</mixed-citation>
</ref>
<ref id="evv171-B38">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Morse</surname>
<given-names>AM</given-names>
</name>
<etal></etal>
</person-group>
<year>2009</year>
<article-title>Evolution of genome size and complexity in
<italic>Pinus</italic>
</article-title>
.
<source>PLoS One</source>
<volume>4</volume>
:
<fpage>e4332</fpage>
.
<pub-id pub-id-type="pmid">19194510</pub-id>
</mixed-citation>
</ref>
<ref id="evv171-B39">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Notredame</surname>
<given-names>C</given-names>
</name>
<name>
<surname>Higgins</surname>
<given-names>DG</given-names>
</name>
<name>
<surname>Heringa</surname>
<given-names>J</given-names>
</name>
</person-group>
<year>2000</year>
<article-title>T-coffee: a novel method for fast and accurate multiple sequence alignment</article-title>
.
<source>J Mol Biol.</source>
<volume>302</volume>
:
<fpage>205</fpage>
<lpage>217</lpage>
.
<pub-id pub-id-type="pmid">10964570</pub-id>
</mixed-citation>
</ref>
<ref id="evv171-B40">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Nuthikattu</surname>
<given-names>S</given-names>
</name>
<etal></etal>
</person-group>
<year>2013</year>
<article-title>The initiation of epigenetic silencing of active transposable elements is triggered by RDR6 and 21-22 nucleotide small interfering RNAs</article-title>
.
<source>Plant Physiol.</source>
<volume>162</volume>
:
<fpage>116</fpage>
<lpage>131</lpage>
.
<pub-id pub-id-type="pmid">23542151</pub-id>
</mixed-citation>
</ref>
<ref id="evv171-B41">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Nystedt</surname>
<given-names>B</given-names>
</name>
<etal></etal>
</person-group>
<year>2013</year>
<article-title>The Norway spruce genome sequence and conifer genome evolution</article-title>
.
<source>Nature</source>
<volume>497</volume>
:
<fpage>579</fpage>
<lpage>584</lpage>
.
<pub-id pub-id-type="pmid">23698360</pub-id>
</mixed-citation>
</ref>
<ref id="evv171-B42">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Parchman</surname>
<given-names>TL</given-names>
</name>
<name>
<surname>Geist</surname>
<given-names>KS</given-names>
</name>
<name>
<surname>Grahnen</surname>
<given-names>JA</given-names>
</name>
<name>
<surname>Benkman</surname>
<given-names>CW</given-names>
</name>
<name>
<surname>Buerkle</surname>
<given-names>CA</given-names>
</name>
</person-group>
<year>2010</year>
<article-title>Transcriptome sequencing in an ecologically important tree species: assembly, annotation, and marker discovery</article-title>
.
<source>BMC Genomics</source>
<volume>11</volume>
:
<fpage>180</fpage>
.
<pub-id pub-id-type="pmid">20233449</pub-id>
</mixed-citation>
</ref>
<ref id="evv171-B43">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Ream</surname>
<given-names>TS</given-names>
</name>
<etal></etal>
</person-group>
<year>2009</year>
<article-title>Subunit compositions of the RNA-silencing enzymes Pol IV and Pol V reveal their origins as specialized forms of RNA Polymerase II</article-title>
.
<source>Mol Cell.</source>
<volume>33</volume>
:
<fpage>192</fpage>
<lpage>203</lpage>
.
<pub-id pub-id-type="pmid">19110459</pub-id>
</mixed-citation>
</ref>
<ref id="evv171-B44">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Sasaki</surname>
<given-names>T</given-names>
</name>
<name>
<surname>Lorković</surname>
<given-names>ZJ</given-names>
</name>
<name>
<surname>Liang</surname>
<given-names>SC</given-names>
</name>
<name>
<surname>Matzke</surname>
<given-names>AJM</given-names>
</name>
<name>
<surname>Matzke</surname>
<given-names>MA</given-names>
</name>
</person-group>
<year>2014</year>
<article-title>The ability to form homodimers is essential for RDM1 to function in RNA-directed DNA methylation</article-title>
.
<source>PLoS One</source>
<volume>9</volume>
:
<fpage>e88190</fpage>
.
<pub-id pub-id-type="pmid">24498436</pub-id>
</mixed-citation>
</ref>
<ref id="evv171-B45">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Schauer</surname>
<given-names>SE</given-names>
</name>
<name>
<surname>Jacobsen</surname>
<given-names>SE</given-names>
</name>
<name>
<surname>Meinke</surname>
<given-names>DW</given-names>
</name>
<name>
<surname>Ray</surname>
<given-names>A</given-names>
</name>
</person-group>
<year>2002</year>
<article-title>DICER-LIKE1: blind men and elephants in
<italic>Arabidopsis</italic>
development</article-title>
.
<source>Trends Plant Sci.</source>
<volume>7</volume>
:
<fpage>487</fpage>
<lpage>491</lpage>
.
<pub-id pub-id-type="pmid">12417148</pub-id>
</mixed-citation>
</ref>
<ref id="evv171-B46">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Song</surname>
<given-names>X</given-names>
</name>
<etal></etal>
</person-group>
<year>2012</year>
<article-title>Roles of DCL4 and DCL3b in rice phased small RNA biogenesis</article-title>
.
<source>Plant J.</source>
<volume>69</volume>
:
<fpage>462</fpage>
<lpage>474</lpage>
.
<pub-id pub-id-type="pmid">21973320</pub-id>
</mixed-citation>
</ref>
<ref id="evv171-B47">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Stamatakis</surname>
<given-names>A</given-names>
</name>
</person-group>
<year>2006</year>
<article-title>RAxML-VI-HPC: maximum likelihood-based phylogenetic analyses with thousands of taxa and mixed models</article-title>
.
<source>Bioinformatics</source>
<volume>22</volume>
:
<fpage>2688</fpage>
<lpage>2690</lpage>
.
<pub-id pub-id-type="pmid">16928733</pub-id>
</mixed-citation>
</ref>
<ref id="evv171-B48">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Stroud</surname>
<given-names>H</given-names>
</name>
<name>
<surname>Greenberg</surname>
<given-names>MVC</given-names>
</name>
<name>
<surname>Feng</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Bernatavichute</surname>
<given-names>YV</given-names>
</name>
<name>
<surname>Jacobsen</surname>
<given-names>SE</given-names>
</name>
</person-group>
<year>2013</year>
<article-title>Comprehensive analysis of silencing mutants reveals complex regulation of the
<italic>Arabidopsis</italic>
methylome</article-title>
.
<source>Cell</source>
<volume>152</volume>
:
<fpage>352</fpage>
<lpage>364</lpage>
.
<pub-id pub-id-type="pmid">23313553</pub-id>
</mixed-citation>
</ref>
<ref id="evv171-B49">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Tucker</surname>
<given-names>SL</given-names>
</name>
<name>
<surname>Reece</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Ream</surname>
<given-names>TS</given-names>
</name>
<name>
<surname>Pikaard</surname>
<given-names>CS</given-names>
</name>
</person-group>
<year>2010</year>
<article-title>Evolutionary history of plant multisubunit RNA polymerases IV and V: subunit origins via genome-wide and segmental gene duplications, retrotransposition, and lineage-specific subfunctionalization</article-title>
.
<source>Cold Spring Harb Symp Quant Biol.</source>
<volume>75</volume>
:
<fpage>285</fpage>
<lpage>297</lpage>
.
<pub-id pub-id-type="pmid">21447813</pub-id>
</mixed-citation>
</ref>
<ref id="evv171-B50">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Wan</surname>
<given-names>L-C</given-names>
</name>
<etal></etal>
</person-group>
<year>2012</year>
<article-title>Identification and characterization of small non-coding RNAs from Chinese fir by high throughput sequencing</article-title>
.
<source>BMC Plant Biol.</source>
<volume>12</volume>
:
<fpage>146</fpage>
.
<pub-id pub-id-type="pmid">22894611</pub-id>
</mixed-citation>
</ref>
<ref id="evv171-B51">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Wellman</surname>
<given-names>CH</given-names>
</name>
<name>
<surname>Osterloff</surname>
<given-names>PL</given-names>
</name>
<name>
<surname>Mohiuddin</surname>
<given-names>U</given-names>
</name>
</person-group>
<year>2003</year>
<article-title>Fragments of the earliest land plants</article-title>
.
<source>Nature</source>
<volume>425</volume>
:
<fpage>282</fpage>
<lpage>285</lpage>
.
<pub-id pub-id-type="pmid">13679913</pub-id>
</mixed-citation>
</ref>
<ref id="evv171-B52">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Woodhouse</surname>
<given-names>MR</given-names>
</name>
<name>
<surname>Freeling</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Lisch</surname>
<given-names>D</given-names>
</name>
</person-group>
<year>2006</year>
<article-title>Initiation, establishment, and maintenance of heritable MuDR transposon silencing in maize are mediated by distinct factors</article-title>
.
<source>PLoS Biol.</source>
<volume>4</volume>
:
<fpage>e339</fpage>
.
<pub-id pub-id-type="pmid">16968137</pub-id>
</mixed-citation>
</ref>
<ref id="evv171-B53">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Zhang</surname>
<given-names>H</given-names>
</name>
<name>
<surname>Ma</surname>
<given-names>ZY</given-names>
</name>
<etal></etal>
</person-group>
<year>2013</year>
<article-title>DTF1 is a core component of RNA-directed DNA methylation and may assist in the recruitment of Pol IV</article-title>
.
<source>Proc Natl Acad Sci U S A.</source>
<volume>110</volume>
:
<fpage>8290</fpage>
<lpage>8295</lpage>
.
<pub-id pub-id-type="pmid">23637343</pub-id>
</mixed-citation>
</ref>
<ref id="evv171-B54">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Zhang</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Wu</surname>
<given-names>T</given-names>
</name>
<etal></etal>
</person-group>
<year>2013</year>
<article-title>Dynamic expression of small RNA populations in larch (
<italic>Larix leptolepis</italic>
)</article-title>
.
<source>Planta</source>
<volume>237</volume>
:
<fpage>89</fpage>
<lpage>101</lpage>
.
<pub-id pub-id-type="pmid">22983700</pub-id>
</mixed-citation>
</ref>
<ref id="evv171-B55">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Zimmermann</surname>
<given-names>P</given-names>
</name>
<name>
<surname>Hirsch-Hoffmann</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Hennig</surname>
<given-names>L</given-names>
</name>
<name>
<surname>Gruissem</surname>
<given-names>W</given-names>
</name>
</person-group>
<year>2004</year>
<article-title>GENEVESTIGATOR.
<italic>Arabidopsis</italic>
microarray database and analysis toolbox</article-title>
.
<source>Plant Physiol.</source>
<volume>136</volume>
:
<fpage>2621</fpage>
<lpage>2632</lpage>
.
<pub-id pub-id-type="pmid">15375207</pub-id>
</mixed-citation>
</ref>
</ref-list>
</back>
</pmc>
</record>

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Wicri/Bois/explor/OrangerV1/Data/Pmc/Corpus
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000102 | SxmlIndent | more

Or

HfdSelect -h $EXPLOR_AREA/Data/Pmc/Corpus/biblio.hfd -nk 000102 | SxmlIndent | more

To link to this page within the Wicri network

{{Explor lien
   |wiki=    Wicri/Bois
   |area=    OrangerV1
   |flux=    Pmc
   |étape=   Corpus
   |type=    RBID
   |clé=     PMC:4607528
   |texte=   Angiosperms Are Unique among Land Plant Lineages in the Occurrence of Key Genes in the RNA-Directed DNA Methylation (RdDM) Pathway
}}

To generate wiki pages

HfdIndexSelect -h $EXPLOR_AREA/Data/Pmc/Corpus/RBID.i   -Sk "pubmed:26338185" \
       | HfdSelect -Kh $EXPLOR_AREA/Data/Pmc/Corpus/biblio.hfd   \
       | NlmPubMed2Wicri -a OrangerV1 

This area was generated with Dilib version V0.6.25.
Data generation: Sat Dec 3 17:11:04 2016. Site generation: Wed Mar 6 18:18:32 2024