Exploration server on the orange tree

Warning: this site is under development!
Warning: this site was generated automatically from raw corpora.
The information has therefore not been validated.

Phylogenetic and Genomic Analyses Resolve the Origin of Important Plant Genes Derived from Transposable Elements

Internal identifier: 001067 (Pmc/Corpus); previous: 001066; next: 001068

Authors: Zoé Joly-Lopez; Douglas R. Hoen; Mathieu Blanchette; Thomas E. Bureau

Source:

RBID : PMC:4948706

Abstract

Once perceived as merely selfish, transposable elements (TEs) are now recognized as potent agents of adaptation. One way TEs contribute to evolution is through TE exaptation, a process whereby TEs, which persist by replicating in the genome, transform into novel host genes, which persist by conferring phenotypic benefits. Known exapted TEs (ETEs) contribute diverse and vital functions, and may facilitate punctuated equilibrium, yet little is known about this process. To better understand TE exaptation, we designed an approach to resolve the phylogenetic context and timing of exaptation events and subsequent patterns of ETE diversification. Starting with known ETEs, we search in diverse genomes for basal ETEs and closely related TEs, carefully curate the numerous candidate sequences, and infer detailed phylogenies. To distinguish TEs from ETEs, we also weigh several key genomic characteristics including repetitiveness, terminal repeats, pseudogenic features, and conserved domains. Applying this approach to the well-characterized plant ETEs MUG and FHY3, we show that each group is paraphyletic and we argue that this pattern demonstrates that each originated in not one but multiple exaptation events. These exaptations and subsequent ETE diversification occurred throughout angiosperm evolution including the crown group expansion, the angiosperm radiation, and the primitive evolution of angiosperms. In addition, we detect evidence of several putative novel ETE families. Our findings support the hypothesis that TE exaptation generates novel genes more frequently than is currently thought, often coinciding with key periods of evolution.


Url:
DOI: 10.1093/molbev/msw067
PubMed: 27189548
PubMed Central: 4948706

Links to Exploration step

PMC:4948706

The document in XML format

<record>
<TEI>
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en">Phylogenetic and Genomic Analyses Resolve the Origin of Important Plant Genes Derived from Transposable Elements</title>
<author>
<name sortKey="Joly Lopez, Zoe" sort="Joly Lopez, Zoe" uniqKey="Joly Lopez Z" first="Zoé" last="Joly-Lopez">Zoé Joly-Lopez</name>
<affiliation>
<nlm:aff id="msw067-aff1">Department of Biology, McGill University, Montréal, QC, Canada</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Hoen, Douglas R" sort="Hoen, Douglas R" uniqKey="Hoen D" first="Douglas R." last="Hoen">Douglas R. Hoen</name>
<affiliation>
<nlm:aff id="msw067-aff1">Department of Biology, McGill University, Montréal, QC, Canada</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Blanchette, Mathieu" sort="Blanchette, Mathieu" uniqKey="Blanchette M" first="Mathieu" last="Blanchette">Mathieu Blanchette</name>
<affiliation>
<nlm:aff id="msw067-aff2">School of Computer Science, McGill University, Montréal, QC, Canada</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Bureau, Thomas E" sort="Bureau, Thomas E" uniqKey="Bureau T" first="Thomas E." last="Bureau">Thomas E. Bureau</name>
<affiliation>
<nlm:aff id="msw067-aff1">Department of Biology, McGill University, Montréal, QC, Canada</nlm:aff>
</affiliation>
</author>
</titleStmt>
<publicationStmt>
<idno type="wicri:source">PMC</idno>
<idno type="pmid">27189548</idno>
<idno type="pmc">4948706</idno>
<idno type="url">http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4948706</idno>
<idno type="RBID">PMC:4948706</idno>
<idno type="doi">10.1093/molbev/msw067</idno>
<date when="2016">2016</date>
<idno type="wicri:Area/Pmc/Corpus">001067</idno>
</publicationStmt>
<sourceDesc>
<biblStruct>
<analytic>
<title xml:lang="en" level="a" type="main">Phylogenetic and Genomic Analyses Resolve the Origin of Important Plant Genes Derived from Transposable Elements</title>
<author>
<name sortKey="Joly Lopez, Zoe" sort="Joly Lopez, Zoe" uniqKey="Joly Lopez Z" first="Zoé" last="Joly-Lopez">Zoé Joly-Lopez</name>
<affiliation>
<nlm:aff id="msw067-aff1">Department of Biology, McGill University, Montréal, QC, Canada</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Hoen, Douglas R" sort="Hoen, Douglas R" uniqKey="Hoen D" first="Douglas R." last="Hoen">Douglas R. Hoen</name>
<affiliation>
<nlm:aff id="msw067-aff1">Department of Biology, McGill University, Montréal, QC, Canada</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Blanchette, Mathieu" sort="Blanchette, Mathieu" uniqKey="Blanchette M" first="Mathieu" last="Blanchette">Mathieu Blanchette</name>
<affiliation>
<nlm:aff id="msw067-aff2">School of Computer Science, McGill University, Montréal, QC, Canada</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Bureau, Thomas E" sort="Bureau, Thomas E" uniqKey="Bureau T" first="Thomas E." last="Bureau">Thomas E. Bureau</name>
<affiliation>
<nlm:aff id="msw067-aff1">Department of Biology, McGill University, Montréal, QC, Canada</nlm:aff>
</affiliation>
</author>
</analytic>
<series>
<title level="j">Molecular Biology and Evolution</title>
<idno type="ISSN">0737-4038</idno>
<idno type="eISSN">1537-1719</idno>
<imprint>
<date when="2016">2016</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
</fileDesc>
<profileDesc>
<textClass></textClass>
</profileDesc>
</teiHeader>
<front>
<div type="abstract" xml:lang="en">
<p>Once perceived as merely selfish, transposable elements (TEs) are now recognized as potent agents of adaptation. One way TEs contribute to evolution is through TE exaptation, a process whereby TEs, which persist by replicating in the genome, transform into novel host genes, which persist by conferring phenotypic benefits. Known exapted TEs (ETEs) contribute diverse and vital functions, and may facilitate punctuated equilibrium, yet little is known about this process. To better understand TE exaptation, we designed an approach to resolve the phylogenetic context and timing of exaptation events and subsequent patterns of ETE diversification. Starting with known ETEs, we search in diverse genomes for basal ETEs and closely related TEs, carefully curate the numerous candidate sequences, and infer detailed phylogenies. To distinguish TEs from ETEs, we also weigh several key genomic characteristics including repetitiveness, terminal repeats, pseudogenic features, and conserved domains. Applying this approach to the well-characterized plant ETEs <italic>MUG</italic> and <italic>FHY3</italic>, we show that each group is paraphyletic and we argue that this pattern demonstrates that each originated in not one but multiple exaptation events. These exaptations and subsequent ETE diversification occurred throughout angiosperm evolution including the crown group expansion, the angiosperm radiation, and the primitive evolution of angiosperms. In addition, we detect evidence of several putative novel ETE families. Our findings support the hypothesis that TE exaptation generates novel genes more frequently than is currently thought, often coinciding with key periods of evolution.</p>
</div>
</front>
<back>
<div1 type="bibliography">
<listBibl>
<biblStruct>
<analytic>
<author>
<name sortKey="Agrawal, A" uniqKey="Agrawal A">A Agrawal</name>
</author>
<author>
<name sortKey="Eastman, Qm" uniqKey="Eastman Q">QM Eastman</name>
</author>
<author>
<name sortKey="Schatz, Dg" uniqKey="Schatz D">DG. Schatz</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Altekar, G" uniqKey="Altekar G">G Altekar</name>
</author>
<author>
<name sortKey="Dwarkadas, S" uniqKey="Dwarkadas S">S Dwarkadas</name>
</author>
<author>
<name sortKey="Huelsenbeck, Jp" uniqKey="Huelsenbeck J">JP Huelsenbeck</name>
</author>
<author>
<name sortKey="Ronquist, F" uniqKey="Ronquist F">F. Ronquist</name>
</author>
</analytic>
</biblStruct>
<biblStruct></biblStruct>
<biblStruct></biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Babu, Mm" uniqKey="Babu M">MM Babu</name>
</author>
<author>
<name sortKey="Iyer, Lm" uniqKey="Iyer L">LM Iyer</name>
</author>
<author>
<name sortKey="Balaji, S" uniqKey="Balaji S">S Balaji</name>
</author>
<author>
<name sortKey="Aravind, L" uniqKey="Aravind L">L. Aravind</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Benjak, A" uniqKey="Benjak A">A Benjak</name>
</author>
<author>
<name sortKey="Forneck, A" uniqKey="Forneck A">A Forneck</name>
</author>
<author>
<name sortKey="Casacuberta, Jm" uniqKey="Casacuberta J">JM. Casacuberta</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Birol, I" uniqKey="Birol I">I Birol</name>
</author>
<author>
<name sortKey="Raymond, A" uniqKey="Raymond A">A Raymond</name>
</author>
<author>
<name sortKey="Jackman, Sd" uniqKey="Jackman S">SD Jackman</name>
</author>
<author>
<name sortKey="Pleasance, S" uniqKey="Pleasance S">S Pleasance</name>
</author>
<author>
<name sortKey="Coope, R" uniqKey="Coope R">R Coope</name>
</author>
<author>
<name sortKey="Taylor, Ga" uniqKey="Taylor G">GA Taylor</name>
</author>
<author>
<name sortKey="Yuen, Mm" uniqKey="Yuen M">MM Yuen</name>
</author>
<author>
<name sortKey="Keeling, Ci" uniqKey="Keeling C">CI Keeling</name>
</author>
<author>
<name sortKey="Brand, D" uniqKey="Brand D">D Brand</name>
</author>
<author>
<name sortKey="Vandervalk, Bp" uniqKey="Vandervalk B">BP Vandervalk</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Boc, A" uniqKey="Boc A">A Boc</name>
</author>
<author>
<name sortKey="Diallo, Ab" uniqKey="Diallo A">AB Diallo</name>
</author>
<author>
<name sortKey="Makarenkov, V" uniqKey="Makarenkov V">V. Makarenkov</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Borisov, Ay" uniqKey="Borisov A">AY Borisov</name>
</author>
<author>
<name sortKey="Madsen, Lh" uniqKey="Madsen L">LH Madsen</name>
</author>
<author>
<name sortKey="Tsyganov, Ve" uniqKey="Tsyganov V">VE Tsyganov</name>
</author>
<author>
<name sortKey="Umehara, Y" uniqKey="Umehara Y">Y Umehara</name>
</author>
<author>
<name sortKey="Voroshilova, Va" uniqKey="Voroshilova V">VA Voroshilova</name>
</author>
<author>
<name sortKey="Batagov, Ao" uniqKey="Batagov A">AO Batagov</name>
</author>
<author>
<name sortKey="Sandal, N" uniqKey="Sandal N">N Sandal</name>
</author>
<author>
<name sortKey="Mortensen, A" uniqKey="Mortensen A">A Mortensen</name>
</author>
<author>
<name sortKey="Schauser, L" uniqKey="Schauser L">L Schauser</name>
</author>
<author>
<name sortKey="Ellis, N" uniqKey="Ellis N">N Ellis</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Bundock, P" uniqKey="Bundock P">P Bundock</name>
</author>
<author>
<name sortKey="Hooykaas, P" uniqKey="Hooykaas P">P. Hooykaas</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Camacho, C" uniqKey="Camacho C">C Camacho</name>
</author>
<author>
<name sortKey="Coulouris, G" uniqKey="Coulouris G">G Coulouris</name>
</author>
<author>
<name sortKey="Avagyan, V" uniqKey="Avagyan V">V Avagyan</name>
</author>
<author>
<name sortKey="Ma, N" uniqKey="Ma N">N Ma</name>
</author>
<author>
<name sortKey="Papadopoulos, J" uniqKey="Papadopoulos J">J Papadopoulos</name>
</author>
<author>
<name sortKey="Bealer, K" uniqKey="Bealer K">K Bealer</name>
</author>
<author>
<name sortKey="Madden, Tl" uniqKey="Madden T">TL. Madden</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Castresana, J" uniqKey="Castresana J">J. Castresana</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Chardin, C" uniqKey="Chardin C">C Chardin</name>
</author>
<author>
<name sortKey="Girin, T" uniqKey="Girin T">T Girin</name>
</author>
<author>
<name sortKey="Roudier, F" uniqKey="Roudier F">F Roudier</name>
</author>
<author>
<name sortKey="Meyer, C" uniqKey="Meyer C">C Meyer</name>
</author>
<author>
<name sortKey="Krapp, A" uniqKey="Krapp A">A. Krapp</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Cowan, R" uniqKey="Cowan R">R Cowan</name>
</author>
<author>
<name sortKey="Hoen, D" uniqKey="Hoen D">D Hoen</name>
</author>
<author>
<name sortKey="Schoen, D" uniqKey="Schoen D">D Schoen</name>
</author>
<author>
<name sortKey="Bureau, T" uniqKey="Bureau T">T. Bureau</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Dereeper, A" uniqKey="Dereeper A">A Dereeper</name>
</author>
<author>
<name sortKey="Audic, S" uniqKey="Audic S">S Audic</name>
</author>
<author>
<name sortKey="Claverie, Jm" uniqKey="Claverie J">JM Claverie</name>
</author>
<author>
<name sortKey="Blanc, G" uniqKey="Blanc G">G. Blanc</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Dereeper, A" uniqKey="Dereeper A">A Dereeper</name>
</author>
<author>
<name sortKey="Guignon, V" uniqKey="Guignon V">V Guignon</name>
</author>
<author>
<name sortKey="Blanc, G" uniqKey="Blanc G">G Blanc</name>
</author>
<author>
<name sortKey="Audic, S" uniqKey="Audic S">S Audic</name>
</author>
<author>
<name sortKey="Buffet, S" uniqKey="Buffet S">S Buffet</name>
</author>
<author>
<name sortKey="Chevenet, F" uniqKey="Chevenet F">F Chevenet</name>
</author>
<author>
<name sortKey="Dufayard, Jf" uniqKey="Dufayard J">JF Dufayard</name>
</author>
<author>
<name sortKey="Guindon, S" uniqKey="Guindon S">S Guindon</name>
</author>
<author>
<name sortKey="Lefort, V" uniqKey="Lefort V">V Lefort</name>
</author>
<author>
<name sortKey="Lescot, M" uniqKey="Lescot M">M Lescot</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Donoghue, Mt" uniqKey="Donoghue M">MT Donoghue</name>
</author>
<author>
<name sortKey="Keshavaiah, C" uniqKey="Keshavaiah C">C Keshavaiah</name>
</author>
<author>
<name sortKey="Swamidatta, Sh" uniqKey="Swamidatta S">SH Swamidatta</name>
</author>
<author>
<name sortKey="Spillane, C" uniqKey="Spillane C">C. Spillane</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Doolittle, Wf" uniqKey="Doolittle W">WF Doolittle</name>
</author>
<author>
<name sortKey="Sapienza, C" uniqKey="Sapienza C">C. Sapienza</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Feschotte, C" uniqKey="Feschotte C">C Feschotte</name>
</author>
<author>
<name sortKey="Pritham, Ej" uniqKey="Pritham E">EJ. Pritham</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Flagel, Le" uniqKey="Flagel L">LE Flagel</name>
</author>
<author>
<name sortKey="Wendel, Jf" uniqKey="Wendel J">JF. Wendel</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Gao, Y" uniqKey="Gao Y">Y Gao</name>
</author>
<author>
<name sortKey="Liu, H" uniqKey="Liu H">H Liu</name>
</author>
<author>
<name sortKey="An, C" uniqKey="An C">C An</name>
</author>
<author>
<name sortKey="Shi, Y" uniqKey="Shi Y">Y Shi</name>
</author>
<author>
<name sortKey="Liu, X" uniqKey="Liu X">X Liu</name>
</author>
<author>
<name sortKey="Yuan, W" uniqKey="Yuan W">W Yuan</name>
</author>
<author>
<name sortKey="Zhang, B" uniqKey="Zhang B">B Zhang</name>
</author>
<author>
<name sortKey="Yang, J" uniqKey="Yang J">J Yang</name>
</author>
<author>
<name sortKey="Yu, C" uniqKey="Yu C">C Yu</name>
</author>
<author>
<name sortKey="Gao, H" uniqKey="Gao H">H. Gao</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Gascuel, O" uniqKey="Gascuel O">O. Gascuel</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Gould, Sj" uniqKey="Gould S">SJ Gould</name>
</author>
<author>
<name sortKey="Lloyd, Ea" uniqKey="Lloyd E">EA. Lloyd</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Gould, Sj" uniqKey="Gould S">SJ Gould</name>
</author>
<author>
<name sortKey="Vrba, Es" uniqKey="Vrba E">ES. Vrba</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Guilfoyle, Tj" uniqKey="Guilfoyle T">TJ Guilfoyle</name>
</author>
<author>
<name sortKey="Hagen, G" uniqKey="Hagen G">G. Hagen</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Han, Y" uniqKey="Han Y">Y Han</name>
</author>
<author>
<name sortKey="Burnette, Jm" uniqKey="Burnette J">JM Burnette</name>
</author>
<author>
<name sortKey="Wessler, Sr" uniqKey="Wessler S">SR. Wessler</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Hoen, Dr" uniqKey="Hoen D">DR Hoen</name>
</author>
<author>
<name sortKey="Bureau, Te" uniqKey="Bureau T">TE. Bureau</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Hoen, Dr" uniqKey="Hoen D">DR Hoen</name>
</author>
<author>
<name sortKey="Bureau, Te" uniqKey="Bureau T">TE. Bureau</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Hoen, Dr" uniqKey="Hoen D">DR Hoen</name>
</author>
<author>
<name sortKey="Park, Kc" uniqKey="Park K">KC Park</name>
</author>
<author>
<name sortKey="Elrouby, N" uniqKey="Elrouby N">N Elrouby</name>
</author>
<author>
<name sortKey="Yu, Z" uniqKey="Yu Z">Z Yu</name>
</author>
<author>
<name sortKey="Mohabir, N" uniqKey="Mohabir N">N Mohabir</name>
</author>
<author>
<name sortKey="Cowan, Rk" uniqKey="Cowan R">RK Cowan</name>
</author>
<author>
<name sortKey="Bureau, Te" uniqKey="Bureau T">TE. Bureau</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Huang, X" uniqKey="Huang X">X Huang</name>
</author>
<author>
<name sortKey="Ouyang, X" uniqKey="Ouyang X">X Ouyang</name>
</author>
<author>
<name sortKey="Yang, P" uniqKey="Yang P">P Yang</name>
</author>
<author>
<name sortKey="Lau, Os" uniqKey="Lau O">OS Lau</name>
</author>
<author>
<name sortKey="Li, G" uniqKey="Li G">G Li</name>
</author>
<author>
<name sortKey="Li, J" uniqKey="Li J">J Li</name>
</author>
<author>
<name sortKey="Chen, H" uniqKey="Chen H">H Chen</name>
</author>
<author>
<name sortKey="Deng, Xw" uniqKey="Deng X">XW. Deng</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Hudson, M" uniqKey="Hudson M">M Hudson</name>
</author>
<author>
<name sortKey="Ringli, C" uniqKey="Ringli C">C Ringli</name>
</author>
<author>
<name sortKey="Boylan, Mt" uniqKey="Boylan M">MT Boylan</name>
</author>
<author>
<name sortKey="Quail, Ph" uniqKey="Quail P">PH. Quail</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Hudson, Me" uniqKey="Hudson M">ME Hudson</name>
</author>
<author>
<name sortKey="Lisch, Dr" uniqKey="Lisch D">DR Lisch</name>
</author>
<author>
<name sortKey="Quail, Ph" uniqKey="Quail P">PH. Quail</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Huelsenbeck, Jp" uniqKey="Huelsenbeck J">JP Huelsenbeck</name>
</author>
<author>
<name sortKey="Ronquist, F" uniqKey="Ronquist F">F Ronquist</name>
</author>
<author>
<name sortKey="Nielsen, R" uniqKey="Nielsen R">R Nielsen</name>
</author>
<author>
<name sortKey="Bollback, Jp" uniqKey="Bollback J">JP. Bollback</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Joly Lopez, Z" uniqKey="Joly Lopez Z">Z Joly-Lopez</name>
</author>
<author>
<name sortKey="Forczek, E" uniqKey="Forczek E">E Forczek</name>
</author>
<author>
<name sortKey="Hoen, Dr" uniqKey="Hoen D">DR Hoen</name>
</author>
<author>
<name sortKey="Juretic, N" uniqKey="Juretic N">N Juretic</name>
</author>
<author>
<name sortKey="Bureau, Te" uniqKey="Bureau T">TE. Bureau</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Juretic, N" uniqKey="Juretic N">N Juretic</name>
</author>
<author>
<name sortKey="Hoen, D" uniqKey="Hoen D">D Hoen</name>
</author>
<author>
<name sortKey="Huynh, M" uniqKey="Huynh M">M Huynh</name>
</author>
<author>
<name sortKey="Harrison, P" uniqKey="Harrison P">P Harrison</name>
</author>
<author>
<name sortKey="Bureau, T" uniqKey="Bureau T">T. Bureau</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Kapitonov, V" uniqKey="Kapitonov V">V Kapitonov</name>
</author>
<author>
<name sortKey="Jurka, J" uniqKey="Jurka J">J. Jurka</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Kapitonov, Vv" uniqKey="Kapitonov V">VV Kapitonov</name>
</author>
<author>
<name sortKey="Jurka, J" uniqKey="Jurka J">J. Jurka</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Katoh, K" uniqKey="Katoh K">K Katoh</name>
</author>
<author>
<name sortKey="Standley, Dm" uniqKey="Standley D">DM. Standley</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Kawashima, T" uniqKey="Kawashima T">T Kawashima</name>
</author>
<author>
<name sortKey="Berger, F" uniqKey="Berger F">F. Berger</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Korasick, Da" uniqKey="Korasick D">DA Korasick</name>
</author>
<author>
<name sortKey="Westfall, Cs" uniqKey="Westfall C">CS Westfall</name>
</author>
<author>
<name sortKey="Lee, Sg" uniqKey="Lee S">SG Lee</name>
</author>
<author>
<name sortKey="Nanao, Mh" uniqKey="Nanao M">MH Nanao</name>
</author>
<author>
<name sortKey="Dumas, R" uniqKey="Dumas R">R Dumas</name>
</author>
<author>
<name sortKey="Hagen, G" uniqKey="Hagen G">G Hagen</name>
</author>
<author>
<name sortKey="Guilfoyle, Tj" uniqKey="Guilfoyle T">TJ Guilfoyle</name>
</author>
<author>
<name sortKey="Jez, Jm" uniqKey="Jez J">JM Jez</name>
</author>
<author>
<name sortKey="Strader, Lc" uniqKey="Strader L">LC. Strader</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Larsson, A" uniqKey="Larsson A">A. Larsson</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Le, Qh" uniqKey="Le Q">QH Le</name>
</author>
<author>
<name sortKey="Wright, S" uniqKey="Wright S">S Wright</name>
</author>
<author>
<name sortKey="Yu, Z" uniqKey="Yu Z">Z Yu</name>
</author>
<author>
<name sortKey="Bureau, T" uniqKey="Bureau T">T. Bureau</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Levin, Hl" uniqKey="Levin H">HL Levin</name>
</author>
<author>
<name sortKey="Moran, Jv" uniqKey="Moran J">JV. Moran</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Li, G" uniqKey="Li G">G Li</name>
</author>
<author>
<name sortKey="Siddiqui, H" uniqKey="Siddiqui H">H Siddiqui</name>
</author>
<author>
<name sortKey="Teng, Y" uniqKey="Teng Y">Y Teng</name>
</author>
<author>
<name sortKey="Lin, R" uniqKey="Lin R">R Lin</name>
</author>
<author>
<name sortKey="Wan, Xy" uniqKey="Wan X">XY Wan</name>
</author>
<author>
<name sortKey="Li, J" uniqKey="Li J">J Li</name>
</author>
<author>
<name sortKey="Lau, Os" uniqKey="Lau O">OS Lau</name>
</author>
<author>
<name sortKey="Ouyang, X" uniqKey="Ouyang X">X Ouyang</name>
</author>
<author>
<name sortKey="Dai, M" uniqKey="Dai M">M Dai</name>
</author>
<author>
<name sortKey="Wan, J" uniqKey="Wan J">J Wan</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Lin, R" uniqKey="Lin R">R Lin</name>
</author>
<author>
<name sortKey="Ding, L" uniqKey="Ding L">L Ding</name>
</author>
<author>
<name sortKey="Casola, C" uniqKey="Casola C">C Casola</name>
</author>
<author>
<name sortKey="Ripoll, Dr" uniqKey="Ripoll D">DR Ripoll</name>
</author>
<author>
<name sortKey="Feschotte, C" uniqKey="Feschotte C">C Feschotte</name>
</author>
<author>
<name sortKey="Wang, H" uniqKey="Wang H">H. Wang</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Lin, R" uniqKey="Lin R">R Lin</name>
</author>
<author>
<name sortKey="Teng, Y" uniqKey="Teng Y">Y Teng</name>
</author>
<author>
<name sortKey="Park, Hj" uniqKey="Park H">HJ Park</name>
</author>
<author>
<name sortKey="Ding, L" uniqKey="Ding L">L Ding</name>
</author>
<author>
<name sortKey="Black, C" uniqKey="Black C">C Black</name>
</author>
<author>
<name sortKey="Fang, P" uniqKey="Fang P">P Fang</name>
</author>
<author>
<name sortKey="Wang, H" uniqKey="Wang H">H. Wang</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Lin, R" uniqKey="Lin R">R Lin</name>
</author>
<author>
<name sortKey="Wang, H" uniqKey="Wang H">H. Wang</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Lisch, D" uniqKey="Lisch D">D. Lisch</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Louis, A" uniqKey="Louis A">A Louis</name>
</author>
<author>
<name sortKey="Muffato, M" uniqKey="Muffato M">M Muffato</name>
</author>
<author>
<name sortKey="Roest Crollius, H" uniqKey="Roest Crollius H">H. Roest Crollius</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Marchler Bauer, A" uniqKey="Marchler Bauer A">A Marchler-Bauer</name>
</author>
<author>
<name sortKey="Lu, S" uniqKey="Lu S">S Lu</name>
</author>
<author>
<name sortKey="Anderson, Jb" uniqKey="Anderson J">JB Anderson</name>
</author>
<author>
<name sortKey="Chitsaz, F" uniqKey="Chitsaz F">F Chitsaz</name>
</author>
<author>
<name sortKey="Derbyshire, Mk" uniqKey="Derbyshire M">MK Derbyshire</name>
</author>
<author>
<name sortKey="Deweese Scott, C" uniqKey="Deweese Scott C">C DeWeese-Scott</name>
</author>
<author>
<name sortKey="Fong, Jh" uniqKey="Fong J">JH Fong</name>
</author>
<author>
<name sortKey="Geer, Ly" uniqKey="Geer L">LY Geer</name>
</author>
<author>
<name sortKey="Geer, Rc" uniqKey="Geer R">RC Geer</name>
</author>
<author>
<name sortKey="Gonzales, Nr" uniqKey="Gonzales N">NR Gonzales</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Miller, Wj" uniqKey="Miller W">WJ Miller</name>
</author>
<author>
<name sortKey="Hagemann, S" uniqKey="Hagemann S">S Hagemann</name>
</author>
<author>
<name sortKey="Reiter, E" uniqKey="Reiter E">E Reiter</name>
</author>
<author>
<name sortKey="Pinsker, W" uniqKey="Pinsker W">W. Pinsker</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Oliver, Kr" uniqKey="Oliver K">KR Oliver</name>
</author>
<author>
<name sortKey="Mccomb, Ja" uniqKey="Mccomb J">JA McComb</name>
</author>
<author>
<name sortKey="Greene, Wk" uniqKey="Greene W">WK. Greene</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Orgel, Le" uniqKey="Orgel L">LE Orgel</name>
</author>
<author>
<name sortKey="Crick, Fh" uniqKey="Crick F">FH. Crick</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Ouyang, X" uniqKey="Ouyang X">X Ouyang</name>
</author>
<author>
<name sortKey="Li, J" uniqKey="Li J">J Li</name>
</author>
<author>
<name sortKey="Li, G" uniqKey="Li G">G Li</name>
</author>
<author>
<name sortKey="Li, B" uniqKey="Li B">B Li</name>
</author>
<author>
<name sortKey="Chen, B" uniqKey="Chen B">B Chen</name>
</author>
<author>
<name sortKey="Shen, H" uniqKey="Shen H">H Shen</name>
</author>
<author>
<name sortKey="Huang, X" uniqKey="Huang X">X Huang</name>
</author>
<author>
<name sortKey="Mo, X" uniqKey="Mo X">X Mo</name>
</author>
<author>
<name sortKey="Wan, X" uniqKey="Wan X">X Wan</name>
</author>
<author>
<name sortKey="Lin, R" uniqKey="Lin R">R Lin</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Pardue, Ml" uniqKey="Pardue M">ML Pardue</name>
</author>
<author>
<name sortKey="Debaryshe, Pg" uniqKey="Debaryshe P">PG. DeBaryshe</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Parisod, C" uniqKey="Parisod C">C Parisod</name>
</author>
<author>
<name sortKey="Alix, K" uniqKey="Alix K">K Alix</name>
</author>
<author>
<name sortKey="Just, J" uniqKey="Just J">J Just</name>
</author>
<author>
<name sortKey="Petit, M" uniqKey="Petit M">M Petit</name>
</author>
<author>
<name sortKey="Sarilar, V" uniqKey="Sarilar V">V Sarilar</name>
</author>
<author>
<name sortKey="Mhiri, C" uniqKey="Mhiri C">C Mhiri</name>
</author>
<author>
<name sortKey="Ainouche, M" uniqKey="Ainouche M">M Ainouche</name>
</author>
<author>
<name sortKey="Chalhoub, B" uniqKey="Chalhoub B">B Chalhoub</name>
</author>
<author>
<name sortKey="Grandbastien, Ma" uniqKey="Grandbastien M">MA. Grandbastien</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Prasad, Bd" uniqKey="Prasad B">BD Prasad</name>
</author>
<author>
<name sortKey="Goel, S" uniqKey="Goel S">S Goel</name>
</author>
<author>
<name sortKey="Krishna, P" uniqKey="Krishna P">P. Krishna</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Quesneville, H" uniqKey="Quesneville H">H Quesneville</name>
</author>
<author>
<name sortKey="Nouaud, D" uniqKey="Nouaud D">D Nouaud</name>
</author>
<author>
<name sortKey="Anxolabehere, D" uniqKey="Anxolabehere D">D. Anxolabehere</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Rawn, Sm" uniqKey="Rawn S">SM Rawn</name>
</author>
<author>
<name sortKey="Cross, Jc" uniqKey="Cross J">JC. Cross</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Rebollo, R" uniqKey="Rebollo R">R Rebollo</name>
</author>
<author>
<name sortKey="Horard, B" uniqKey="Horard B">B Horard</name>
</author>
<author>
<name sortKey="Hubert, B" uniqKey="Hubert B">B Hubert</name>
</author>
<author>
<name sortKey="Vieira, C" uniqKey="Vieira C">C. Vieira</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Ronquist, F" uniqKey="Ronquist F">F Ronquist</name>
</author>
<author>
<name sortKey="Huelsenbeck, Jp" uniqKey="Huelsenbeck J">JP. Huelsenbeck</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Rosso, Mg" uniqKey="Rosso M">MG Rosso</name>
</author>
<author>
<name sortKey="Li, Y" uniqKey="Li Y">Y Li</name>
</author>
<author>
<name sortKey="Strizhov, N" uniqKey="Strizhov N">N Strizhov</name>
</author>
<author>
<name sortKey="Reiss, B" uniqKey="Reiss B">B Reiss</name>
</author>
<author>
<name sortKey="Dekker, K" uniqKey="Dekker K">K Dekker</name>
</author>
<author>
<name sortKey="Weisshaar, B" uniqKey="Weisshaar B">B. Weisshaar</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Roussigne, M" uniqKey="Roussigne M">M Roussigne</name>
</author>
<author>
<name sortKey="Kossida, S" uniqKey="Kossida S">S Kossida</name>
</author>
<author>
<name sortKey="Lavigne, Ac" uniqKey="Lavigne A">AC Lavigne</name>
</author>
<author>
<name sortKey="Clouaire, T" uniqKey="Clouaire T">T Clouaire</name>
</author>
<author>
<name sortKey="Ecochard, V" uniqKey="Ecochard V">V Ecochard</name>
</author>
<author>
<name sortKey="Glories, A" uniqKey="Glories A">A Glories</name>
</author>
<author>
<name sortKey="Amalric, F" uniqKey="Amalric F">F Amalric</name>
</author>
<author>
<name sortKey="Girard, Jp" uniqKey="Girard J">JP. Girard</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Saccaro, Nl" uniqKey="Saccaro N">NL Saccaro</name>
</author>
<author>
<name sortKey="Van Sluys, M A" uniqKey="Van Sluys M">M-A Van Sluys</name>
</author>
<author>
<name sortKey="De Mello Varani, A" uniqKey="De Mello Varani A">A de Mello Varani</name>
</author>
<author>
<name sortKey="Rossi, M" uniqKey="Rossi M">M Rossi</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Sinzelle, L" uniqKey="Sinzelle L">L Sinzelle</name>
</author>
<author>
<name sortKey="Izsvak, Z" uniqKey="Izsvak Z">Z Izsvak</name>
</author>
<author>
<name sortKey="Ivics, Z" uniqKey="Ivics Z">Z Ivics</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Stirnberg, P" uniqKey="Stirnberg P">P Stirnberg</name>
</author>
<author>
<name sortKey="Zhao, S" uniqKey="Zhao S">S Zhao</name>
</author>
<author>
<name sortKey="Williamson, L" uniqKey="Williamson L">L Williamson</name>
</author>
<author>
<name sortKey="Ward, S" uniqKey="Ward S">S Ward</name>
</author>
<author>
<name sortKey="Leyser, O" uniqKey="Leyser O">O Leyser</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Sumimoto, H" uniqKey="Sumimoto H">H Sumimoto</name>
</author>
<author>
<name sortKey="Kamakura, S" uniqKey="Kamakura S">S Kamakura</name>
</author>
<author>
<name sortKey="Ito, T" uniqKey="Ito T">T Ito</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Tang, W" uniqKey="Tang W">W Tang</name>
</author>
<author>
<name sortKey="Ji, Q" uniqKey="Ji Q">Q Ji</name>
</author>
<author>
<name sortKey="Huang, Y" uniqKey="Huang Y">Y Huang</name>
</author>
<author>
<name sortKey="Jiang, Z" uniqKey="Jiang Z">Z Jiang</name>
</author>
<author>
<name sortKey="Bao, M" uniqKey="Bao M">M Bao</name>
</author>
<author>
<name sortKey="Wang, H" uniqKey="Wang H">H Wang</name>
</author>
<author>
<name sortKey="Lin, R" uniqKey="Lin R">R Lin</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Trehin, C" uniqKey="Trehin C">C Trehin</name>
</author>
<author>
<name sortKey="Schrempp, S" uniqKey="Schrempp S">S Schrempp</name>
</author>
<author>
<name sortKey="Chauvet, A" uniqKey="Chauvet A">A Chauvet</name>
</author>
<author>
<name sortKey="Berne Dedieu, A" uniqKey="Berne Dedieu A">A Berne-Dedieu</name>
</author>
<author>
<name sortKey="Thierry, Am" uniqKey="Thierry A">AM Thierry</name>
</author>
<author>
<name sortKey="Faure, Je" uniqKey="Faure J">JE Faure</name>
</author>
<author>
<name sortKey="Negrutiu, I" uniqKey="Negrutiu I">I Negrutiu</name>
</author>
<author>
<name sortKey="Morel, P" uniqKey="Morel P">P Morel</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Van Leeuwen, H" uniqKey="Van Leeuwen H">H van Leeuwen</name>
</author>
<author>
<name sortKey="Monfort, A" uniqKey="Monfort A">A Monfort</name>
</author>
<author>
<name sortKey="Puigdomenech, P" uniqKey="Puigdomenech P">P Puigdomenech</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Wang, H" uniqKey="Wang H">H Wang</name>
</author>
<author>
<name sortKey="Deng, Xw" uniqKey="Deng X">XW Deng</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Wang, H" uniqKey="Wang H">H Wang</name>
</author>
<author>
<name sortKey="Wang, H" uniqKey="Wang H">H Wang</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Whitelam, Gc" uniqKey="Whitelam G">GC Whitelam</name>
</author>
<author>
<name sortKey="Johnson, E" uniqKey="Johnson E">E Johnson</name>
</author>
<author>
<name sortKey="Peng, J" uniqKey="Peng J">J Peng</name>
</author>
<author>
<name sortKey="Carol, P" uniqKey="Carol P">P Carol</name>
</author>
<author>
<name sortKey="Anderson, Ml" uniqKey="Anderson M">ML Anderson</name>
</author>
<author>
<name sortKey="Cowl, Js" uniqKey="Cowl J">JS Cowl</name>
</author>
<author>
<name sortKey="Harberd, Np" uniqKey="Harberd N">NP Harberd</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Yang, Z" uniqKey="Yang Z">Z Yang</name>
</author>
<author>
<name sortKey="Wong, Ws" uniqKey="Wong W">WS Wong</name>
</author>
<author>
<name sortKey="Nielsen, R" uniqKey="Nielsen R">R Nielsen</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Zeh, Dw" uniqKey="Zeh D">DW Zeh</name>
</author>
<author>
<name sortKey="Zeh, Ja" uniqKey="Zeh J">JA Zeh</name>
</author>
<author>
<name sortKey="Ishida, Y" uniqKey="Ishida Y">Y Ishida</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Zeng, L" uniqKey="Zeng L">L Zeng</name>
</author>
<author>
<name sortKey="Zhang, Q" uniqKey="Zhang Q">Q Zhang</name>
</author>
<author>
<name sortKey="Sun, R" uniqKey="Sun R">R Sun</name>
</author>
<author>
<name sortKey="Kong, H" uniqKey="Kong H">H Kong</name>
</author>
<author>
<name sortKey="Zhang, N" uniqKey="Zhang N">N Zhang</name>
</author>
<author>
<name sortKey="Ma, H" uniqKey="Ma H">H Ma</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Zhang, W" uniqKey="Zhang W">W Zhang</name>
</author>
<author>
<name sortKey="Wu, J" uniqKey="Wu J">J Wu</name>
</author>
<author>
<name sortKey="Ward, Md" uniqKey="Ward M">MD Ward</name>
</author>
<author>
<name sortKey="Yang, S" uniqKey="Yang S">S Yang</name>
</author>
<author>
<name sortKey="Chuang, Ya" uniqKey="Chuang Y">YA Chuang</name>
</author>
<author>
<name sortKey="Xiao, M" uniqKey="Xiao M">M Xiao</name>
</author>
<author>
<name sortKey="Li, R" uniqKey="Li R">R Li</name>
</author>
<author>
<name sortKey="Leahy, Dj" uniqKey="Leahy D">DJ Leahy</name>
</author>
<author>
<name sortKey="Worley, Pf" uniqKey="Worley P">PF Worley</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Zientara Rytter, K" uniqKey="Zientara Rytter K">K Zientara-Rytter</name>
</author>
<author>
<name sortKey="Sirko, A" uniqKey="Sirko A">A Sirko</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Zimin, A" uniqKey="Zimin A">A Zimin</name>
</author>
<author>
<name sortKey="Stevens, Ka" uniqKey="Stevens K">KA Stevens</name>
</author>
<author>
<name sortKey="Crepeau, Mw" uniqKey="Crepeau M">MW Crepeau</name>
</author>
<author>
<name sortKey="Holtz Morris, A" uniqKey="Holtz Morris A">A Holtz-Morris</name>
</author>
<author>
<name sortKey="Koriabine, M" uniqKey="Koriabine M">M Koriabine</name>
</author>
<author>
<name sortKey="Marcais, G" uniqKey="Marcais G">G Marcais</name>
</author>
<author>
<name sortKey="Puiu, D" uniqKey="Puiu D">D Puiu</name>
</author>
<author>
<name sortKey="Roberts, M" uniqKey="Roberts M">M Roberts</name>
</author>
<author>
<name sortKey="Wegrzyn, Jl" uniqKey="Wegrzyn J">JL Wegrzyn</name>
</author>
<author>
<name sortKey="De Jong, Pj" uniqKey="De Jong P">PJ de Jong</name>
</author>
</analytic>
</biblStruct>
</listBibl>
</div1>
</back>
</TEI>
<pmc article-type="research-article">
<pmc-dir>properties open_access</pmc-dir>
<front>
<journal-meta>
<journal-id journal-id-type="nlm-ta">Mol Biol Evol</journal-id>
<journal-id journal-id-type="iso-abbrev">Mol. Biol. Evol</journal-id>
<journal-id journal-id-type="publisher-id">molbev</journal-id>
<journal-id journal-id-type="hwp">molbiolevol</journal-id>
<journal-title-group>
<journal-title>Molecular Biology and Evolution</journal-title>
</journal-title-group>
<issn pub-type="ppub">0737-4038</issn>
<issn pub-type="epub">1537-1719</issn>
<publisher>
<publisher-name>Oxford University Press</publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="pmid">27189548</article-id>
<article-id pub-id-type="pmc">4948706</article-id>
<article-id pub-id-type="doi">10.1093/molbev/msw067</article-id>
<article-id pub-id-type="publisher-id">msw067</article-id>
<article-categories>
<subj-group subj-group-type="heading">
<subject>Discoveries</subject>
</subj-group>
</article-categories>
<title-group>
<article-title>Phylogenetic and Genomic Analyses Resolve the Origin of Important Plant Genes Derived from Transposable Elements</article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author">
<name>
<surname>Joly-Lopez</surname>
<given-names>Zoé</given-names>
</name>
<xref ref-type="author-notes" rid="msw067-FM1">
<sup></sup>
</xref>
<xref ref-type="author-notes" rid="msw067-FM2">
<sup></sup>
</xref>
<xref ref-type="aff" rid="msw067-aff1">
<sup>1</sup>
</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Hoen</surname>
<given-names>Douglas R.</given-names>
</name>
<xref ref-type="author-notes" rid="msw067-FM1">
<sup></sup>
</xref>
<xref ref-type="aff" rid="msw067-aff1">
<sup>1</sup>
</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Blanchette</surname>
<given-names>Mathieu</given-names>
</name>
<xref ref-type="aff" rid="msw067-aff2">
<sup>2</sup>
</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Bureau</surname>
<given-names>Thomas E.</given-names>
</name>
<xref ref-type="corresp" rid="msw067-cor1">*</xref>
<xref ref-type="aff" rid="msw067-aff1">
<sup>1</sup>
</xref>
</contrib>
<aff id="msw067-aff1">
<sup>1</sup>
Department of Biology, McGill University, Montréal, QC, Canada</aff>
<aff id="msw067-aff2">
<sup>2</sup>
School of Computer Science, McGill University, Montréal, QC, Canada</aff>
</contrib-group>
<author-notes>
<fn id="msw067-FM1">
<p>
<sup></sup>
These authors contributed equally to this work.</p>
</fn>
<fn id="msw067-FM2">
<p>
<sup></sup>
Present address: Center for Genomics and Systems Biology, Department of Biology, New York University, New York, NY, USA.</p>
</fn>
<corresp id="msw067-cor1">*
<bold>Corresponding author:</bold>
E-mail:
<email>thomas.bureau@mcgill.ca</email>
.</corresp>
<fn id="msw067-FM3">
<p>
<bold>Associate editor:</bold>
Brandon Gaut</p>
</fn>
</author-notes>
<pub-date pub-type="ppub">
<month>8</month>
<year>2016</year>
</pub-date>
<pub-date pub-type="epub">
<day>28</day>
<month>4</month>
<year>2016</year>
</pub-date>
<pub-date pub-type="pmc-release">
<day>28</day>
<month>4</month>
<year>2016</year>
</pub-date>
<volume>33</volume>
<issue>8</issue>
<fpage>1937</fpage>
<lpage>1956</lpage>
<permissions>
<copyright-statement>© The Author 2016. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.</copyright-statement>
<copyright-year>2016</copyright-year>
<license xlink:href="http://creativecommons.org/licenses/by-nc/4.0/" license-type="creative-commons">
<license-p>This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (
<ext-link ext-link-type="uri" xlink:href="http://creativecommons.org/licenses/by-nc/4.0/">http://creativecommons.org/licenses/by-nc/4.0/</ext-link>
), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com</license-p>
</license>
</permissions>
<abstract>
<p>Once perceived as merely selfish, transposable elements (TEs) are now recognized as potent agents of adaptation. One way TEs contribute to evolution is through TE exaptation, a process whereby TEs, which persist by replicating in the genome, transform into novel host genes, which persist by conferring phenotypic benefits. Known exapted TEs (ETEs) contribute diverse and vital functions, and may facilitate punctuated equilibrium, yet little is known about this process. To better understand TE exaptation, we designed an approach to resolve the phylogenetic context and timing of exaptation events and subsequent patterns of ETE diversification. Starting with known ETEs, we search in diverse genomes for basal ETEs and closely related TEs, carefully curate the numerous candidate sequences, and infer detailed phylogenies. To distinguish TEs from ETEs, we also weigh several key genomic characteristics including repetitiveness, terminal repeats, pseudogenic features, and conserved domains. Applying this approach to the well-characterized plant ETEs
<italic>MUG</italic>
and
<italic>FHY3</italic>
, we show that each group is paraphyletic and we argue that this pattern demonstrates that each originated in not one but multiple exaptation events. These exaptations and subsequent ETE diversification occurred throughout angiosperm evolution including the crown group expansion, the angiosperm radiation, and the primitive evolution of angiosperms. In addition, we detect evidence of several putative novel ETE families. Our findings support the hypothesis that TE exaptation generates novel genes more frequently than is currently thought, often coinciding with key periods of evolution.</p>
</abstract>
<kwd-group kwd-group-type="author">
<kwd>
<italic>MUSTANG</italic>
</kwd>
<kwd>
<italic>FAR1</italic>
</kwd>
<kwd>
<italic>FHY3</italic>
</kwd>
<kwd>
<italic>FRS</italic>
</kwd>
<kwd>transposable elements</kwd>
<kwd>exaptation</kwd>
<kwd>molecular domestication</kwd>
<kwd>phylogeny</kwd>
<kwd>evolution</kwd>
<kwd>mutator</kwd>
<kwd>MULE</kwd>
<kwd>Phox/Bem1p</kwd>
<kwd>PB1</kwd>
<kwd>Peptidase C48</kwd>
<kwd>angiosperm radiation</kwd>
<kwd>adaptation</kwd>
<kwd>transposon</kwd>
<kwd>co-option</kwd>
<kwd>selective constraint</kwd>
</kwd-group>
<counts>
<page-count count="20"></page-count>
</counts>
</article-meta>
</front>
<body>
<sec>
<title>Background</title>
<p>Transposable elements (TEs) are DNA segments that mediate their own duplication and thereby accumulate to high abundance in most eukaryotic genomes. Because of this, TEs were once perceived as selfish (
<xref rid="msw067-B18" ref-type="bibr">Doolittle and Sapienza 1980</xref>
;
<xref rid="msw067-B53" ref-type="bibr">Orgel and Crick 1980</xref>
); however, it is now understood that they confer many benefits, such as maintaining genome structure, generating variation, and producing evolutionary innovation (
<xref rid="msw067-B20" ref-type="bibr">Flagel and Wendel 2009</xref>
;
<xref rid="msw067-B48" ref-type="bibr">Lisch 2009</xref>
;
<xref rid="msw067-B56" ref-type="bibr">Parisod et al. 2010</xref>
;
<xref rid="msw067-B60" ref-type="bibr">Rebollo et al. 2010</xref>
;
<xref rid="msw067-B43" ref-type="bibr">Levin and Moran 2011</xref>
;
<xref rid="msw067-B55" ref-type="bibr">Pardue and DeBaryshe 2011</xref>
). In plants, TEs are important contributors to evolution and diversity; for example, in more than 60 reported instances, TEs have modified existing genes or given rise to novel phenotypic genes (
<xref rid="msw067-B52" ref-type="bibr">Oliver et al. 2013</xref>
).</p>
<p>One mechanism through which TEs contribute to evolution is exaptation. Although the familiar term “adaptation” refers to biological features selected to increase the benefit of existing roles, the term “exaptation” instead refers to features co-opted to perform entirely new roles (
<xref rid="msw067-B24" ref-type="bibr">Gould and Vrba 1982</xref>
;
<xref rid="msw067-B23" ref-type="bibr">Gould and Lloyd 1999</xref>
). More specifically, TE exaptation (also referred to as co-option or molecular domestication) (
<xref rid="msw067-B51" ref-type="bibr">Miller et al. 1992</xref>
) is a process by which a TE, originally conserved through “self-replicative selection,” transitions to increase the fitness of the organism and becomes conserved through “phenotypic selection” (
<xref rid="msw067-B18" ref-type="bibr">Doolittle and Sapienza 1980</xref>
;
<xref rid="msw067-B28" ref-type="bibr">Hoen and Bureau 2015</xref>
). In eukaryotes, TE exaptation has made possible major evolutionary innovations, including the vertebrate adaptive immune system (
<xref rid="msw067-B1" ref-type="bibr">Agrawal et al. 1998</xref>
;
<xref rid="msw067-B36" ref-type="bibr">Kapitonov and Jurka 2004</xref>
,
<xref rid="msw067-B37" ref-type="bibr">2005</xref>
), the mammalian placenta (
<xref rid="msw067-B59" ref-type="bibr">Rawn and Cross 2008</xref>
), and human cognition (
<xref rid="msw067-B77" ref-type="bibr">Zhang et al. 2015</xref>
). Until recently, most exapted TE genes (ETEs) were discovered fortuitously in forward genetic screens (e.g.,
<xref rid="msw067-B10" ref-type="bibr">Bundock and Hooykaas 2005</xref>
;
<xref rid="msw067-B45" ref-type="bibr">Lin et al. 2007</xref>
); however, advances in sequencing technology have enabled us to directly identify putative ETEs using computational analysis of genomic data (
<xref rid="msw067-B28" ref-type="bibr">Hoen and Bureau 2015</xref>
), then validate them by reverse genetic characterization of phenotypes (
<xref rid="msw067-B14" ref-type="bibr">Cowan et al. 2005</xref>
;
<xref rid="msw067-B34" ref-type="bibr">Joly-Lopez et al. 2012</xref>
). For instance, in a large-scale search for ETEs in one family of plants (the Brassicaceae), we recently identified 67 ETEs, more than half of them novel, suggesting that ETEs may be far more abundant than previously thought (
<xref rid="msw067-B28" ref-type="bibr">Hoen and Bureau 2015</xref>
).</p>
<p>Discoveries such as this magnify the importance of better understanding the TE exaptation process. However, like many evolutionary mechanisms involving TEs, investigating exaptation can be difficult. One particularly challenging aspect is reconstructing ETE evolutionary histories in order to determine the number and timing of exaptation events, and the identity of the ancestral genomes in which they arose. Such analyses require identifying both ETEs and closely related TEs. But because TE families frequently go extinct within genomes (
<xref rid="msw067-B17" ref-type="bibr">Donoghue et al. 2011</xref>
), TEs directly descended from the progenitors of an ETE family may no longer exist, or if they do exist may be present in only a small fraction of genomes. Furthermore, even if TEs closely related to an ETE family do exist in a sequenced genome, identifying and analyzing them may be difficult given the large number of sequenced genomes and vast number of TEs that each genome may contain. Nevertheless, investigating the evolutionary history of ETEs is worthwhile and increasingly feasible as the number of sequenced genomes continues to grow. A better understanding of the origins of ETEs would also help to guide experimental studies, since ETEs that originated in different exaptation events ought also to have different functions.</p>
<p>We therefore undertook to demonstrate the feasibility and value of such investigations by thoroughly characterizing the evolutionary history of the two largest and best-characterized families of ETEs in plants:
<italic>MUSTANG</italic>
(
<italic>MUG</italic>
) and
<italic>FAR1-RELATED SEQUENCES</italic>
(
<italic>FRS</italic>
).
<italic>MUG</italic>
, which descended from TEs of the
<italic>Mutator</italic>
-like element (MULE) superfamily (
<xref rid="msw067-B14" ref-type="bibr">Cowan et al. 2005</xref>
;
<xref rid="msw067-B34" ref-type="bibr">Joly-Lopez et al. 2012</xref>
), consists in
<italic>Arabidopsis</italic>
<italic>thaliana</italic>
of eight genes equally divided between two subfamilies,
<italic>MUGA</italic>
and
<italic>MUGB</italic>
(
<xref rid="msw067-B34" ref-type="bibr">Joly-Lopez et al. 2012</xref>
). Double mutants within each subfamily (
<italic>mug1 mug2</italic>
in
<italic>MUGA</italic>
and
<italic>mug7 mug8</italic>
in
<italic>MUGB</italic>
) produce stronger phenotypes than the single mutants and, despite similar broad phenotypes (e.g., delayed flowering and reduced yield), each subfamily exhibits different specific phenotypes such as reduced chlorophyll production in
<italic>mug1 mug2</italic>
(
<xref rid="msw067-B34" ref-type="bibr">Joly-Lopez et al. 2012</xref>
). In addition to the conserved domains typically associated with MULE transposases,
<italic>MUGB</italic>
but not
<italic>MUGA</italic>
genes contain a Phox and Bem1p (PB1) domain, which adopts a ubiquitin-like β-grasp fold structure (
<xref rid="msw067-B67" ref-type="bibr">Sumimoto et al. 2007</xref>
). Plant genomes have greater numbers of PB1-containing genes than other eukaryotes, and although a few have been characterized and found to be involved in a wide range of biological processes, little is known about the source or biological function of most plant PB1 domains (
<xref rid="msw067-B9" ref-type="bibr">Borisov et al. 2003</xref>
;
<xref rid="msw067-B57" ref-type="bibr">Prasad et al. 2010</xref>
;
<xref rid="msw067-B25" ref-type="bibr">Guilfoyle and Hagen 2012</xref>
;
<xref rid="msw067-B69" ref-type="bibr">Trehin et al. 2013</xref>
;
<xref rid="msw067-B13" ref-type="bibr">Chardin et al. 2014</xref>
;
<xref rid="msw067-B40" ref-type="bibr">Korasick et al. 2014</xref>
;
<xref rid="msw067-B78" ref-type="bibr">Zientara-Rytter and Sirko 2014</xref>
).
<italic>MUG</italic>
genes are present in both monocots and eudicots, and likely in basal angiosperms, but have not been identified in other plants, suggesting that exaptation might have occurred during early angiosperm evolution (
<xref rid="msw067-B14" ref-type="bibr">Cowan et al. 2005</xref>
;
<xref rid="msw067-B64" ref-type="bibr">Saccaro et al. 2007</xref>
;
<xref rid="msw067-B34" ref-type="bibr">Joly-Lopez et al. 2012</xref>
). Interestingly, at the time of the monocot-eudicot split,
<italic>MUGA</italic>
had already undergone at least two conserved duplications whereas
<italic>MUGB</italic>
had undergone none (
<xref rid="msw067-B34" ref-type="bibr">Joly-Lopez et al. 2012</xref>
). Overall, such differences in phenotype, gene structure, and phylogeny show that, despite their close similarity in sequence,
<italic>MUGA</italic>
and
<italic>MUGB</italic>
followed different evolutionary trajectories; however, it is not yet known whether these differences arose before or after exaptation.</p>
<p>The second well-characterized group of plant ETEs,
<italic>FRS</italic>
, consists in
<italic>A. thaliana</italic>
of the genes
<italic>FAR-RED IMPAIRED RESPONSE 1</italic>
(
<italic>FAR1</italic>
) and
<italic>FAR-RED ELONGATED HYPOCOTYLS 3</italic>
(
<italic>FHY3</italic>
) (
<xref rid="msw067-B73" ref-type="bibr">Whitelam et al. 1993</xref>
;
<xref rid="msw067-B31" ref-type="bibr">Hudson et al. 1999</xref>
), as well as 12 additional
<italic>FAR1-RELATED SEQUENCE</italic>
(
<italic>FRS</italic>
) genes (
<xref rid="msw067-B4" ref-type="bibr">
<italic>Arabidopsis</italic>
Genome 2000</xref>
;
<xref rid="msw067-B32" ref-type="bibr">Hudson et al. 2003</xref>
;
<xref rid="msw067-B47" ref-type="bibr">Lin and Wang 2004</xref>
).
<italic>FAR1</italic>
and
<italic>FHY3</italic>
encode transcription factors essential for far-red light responses controlled by phytochrome A (
<xref rid="msw067-B45" ref-type="bibr">Lin et al. 2007</xref>
). They also play roles in diverse developmental and physiological processes (
<xref rid="msw067-B47" ref-type="bibr">Lin and Wang 2004</xref>
;
<xref rid="msw067-B44" ref-type="bibr">Li et al. 2011</xref>
;
<xref rid="msw067-B54" ref-type="bibr">Ouyang et al. 2011</xref>
;
<xref rid="msw067-B30" ref-type="bibr">Huang et al. 2012</xref>
;
<xref rid="msw067-B66" ref-type="bibr">Stirnberg et al. 2012</xref>
;
<xref rid="msw067-B21" ref-type="bibr">Gao et al. 2013</xref>
;
<xref rid="msw067-B68" ref-type="bibr">Tang et al. 2013</xref>
). The remaining
<italic>FRS</italic>
genes are thus far not well characterized, but have been suggested to play distinct roles in light-controlled development in
<italic>Arabidopsis</italic>
(
<xref rid="msw067-B47" ref-type="bibr">Lin and Wang 2004</xref>
). Like
<italic>MUG</italic>
, the
<italic>FRS</italic>
family was derived from MULEs and is found in monocots as well as eudicots, so it is thought to have originated in one or more exaptation events during early angiosperm evolution (
<xref rid="msw067-B45" ref-type="bibr">Lin et al. 2007</xref>
).</p>
<p>To better understand TE exaptation and the evolutionary histories of
<italic>MUG</italic>
and
<italic>FRS</italic>
, we explore the following questions: How many TE exaptation events led to the formation of these two groups of ETEs? When did these exaptations occur and did they happen in close succession or at widely separated times? Did they involve similar or divergent MULE families, and might any characteristics of the progenitor TE families shed light on the structural and functional characteristics of the ETEs? After they had been established, how often and when did each ETE family diversify? Finally, what does the timing of exaptation and diversification events suggest about their potential role in evolution? To answer these questions, we take the following approach: 1) identify sequences closely related to
<italic>MUG</italic>
or
<italic>FRS</italic>
in diverse genomes, 2) generate curated, reliable phylogenies, and 3) measure genomic attributes that differentiate ETEs from TEs. We then analyze the results to determine how many exaptation events occurred and when. Our findings reveal that TE exaptation has contributed more to angiosperm evolution than previously understood, and that it likely provides many functions of potential practical importance that have yet to be discovered.</p>
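<p>The question of how many exaptation events occurred can be illustrated with a small Python sketch. It is not the procedure used in this study and should be read only as a toy model under explicit assumptions of our own: the tree is rooted, the ancestral lineage was a TE, and an exapted lineage never reverts to a TE. Under those assumptions, each maximal clade whose leaves are all ETEs is counted as one exaptation event. The function names, the nested-tuple tree encoding, and the leaf labels are all hypothetical.</p>
<preformat preformat-type="code">
```python
# Illustrative sketch only; NOT the authors' pipeline.
# Assumptions (ours): rooted tree, TE ancestor at the root, and no reversion
# from ETE back to TE. Each maximal all-ETE clade then counts as one event.

def leaf_states(tree, states):
    """Return the set of states ('TE' or 'ETE') found among the leaves of tree."""
    if isinstance(tree, str):
        return {states[tree]}
    found = set()
    for child in tree:
        found |= leaf_states(child, states)
    return found


def count_exaptations(tree, states):
    """Count maximal clades made up only of ETE leaves in a nested-tuple tree."""
    if leaf_states(tree, states) == {"ETE"}:
        return 1          # whole clade is ETE: one event, do not descend further
    if isinstance(tree, str):
        return 0          # a single TE leaf
    return sum(count_exaptations(child, states) for child in tree)


# Hypothetical toy tree; leaf names are placeholders, not sequences from the paper.
toy_tree = ((("ete_a1", "ete_a2"), "te_1"), (("ete_b1", "te_2"), "te_3"))
toy_states = {
    "ete_a1": "ETE", "ete_a2": "ETE", "ete_b1": "ETE",
    "te_1": "TE", "te_2": "TE", "te_3": "TE",
}

n_events = count_exaptations(toy_tree, toy_states)
print(n_events)  # 2
```
</preformat>
<p>In this toy tree the ETE leaves do not form a single clade, because TE leaves are interspersed among them, so the sketch reports two separate events rather than one. A real analysis would also need to weigh branch support and the genomic attributes that separate TEs from ETEs.</p>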
</sec>
<sec>
<title>Results and Discussion</title>
<sec>
<title>Identifying Genomes of Interest</title>
<p>To maximize our chances of finding extant TEs closely related to
<italic>MUG</italic>
and to sample diverse ETE clades, we searched a large number (62) of genomes including representatives from all major angiosperm lineages (
<ext-link ext-link-type="uri" xlink:href="http://mbe.oxfordjournals.org/lookup/suppl/doi:10.1093/molbev/msw067/-/DC1">supplementary table S1</ext-link>
,
<ext-link ext-link-type="uri" xlink:href="http://mbe.oxfordjournals.org/lookup/suppl/doi:10.1093/molbev/msw067/-/DC1">Supplementary Material</ext-link>
online). We expected to find thousands of sequences (mainly TEs) but did not want to exclude any TEs or ETEs of potential interest, even in such distant genomes as
<italic>Picea abies</italic>
. We thus devised a strategy to screen for genomes of potential interest (
<xref ref-type="fig" rid="msw067-F1">fig. 1</xref>
). First, we determined which genomes contained any sequences of interest by searching each genome individually. Then we selected the genomes (22) that appeared, according to their individual phylogenetic trees, to contain TEs descended from the last common ancestor of all
<italic>MUG</italic>
query sequences (
<ext-link ext-link-type="uri" xlink:href="http://mbe.oxfordjournals.org/lookup/suppl/doi:10.1093/molbev/msw067/-/DC1">supplementary fig. S1</ext-link>
,
<ext-link ext-link-type="uri" xlink:href="http://mbe.oxfordjournals.org/lookup/suppl/doi:10.1093/molbev/msw067/-/DC1">Supplementary Material</ext-link>
online). We also retained genomes with apparent TEs that contained a PB1 domain (see below). For additional analyses, we added three species that occupy key positions in the species phylogeny but are represented by expressed sequence tag (EST) assemblies rather than fully sequenced genomes: the basal eudicot
<italic>Nelumbo</italic>
<italic>nucifera</italic>
, the magnoliid
<italic>Persea</italic>
<italic>americana</italic>
, and the basal angiosperm
<italic>Nuphar</italic>
<italic>advena</italic>
. In total, this amounted to 25 genomes for subsequent study (
<xref ref-type="table" rid="msw067-T1">table 1</xref>
).
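The screening rule described in this paragraph can be sketched in Python as follows. This is only an illustration under stated assumptions, not the software used in the study: the class and function names are hypothetical, the boolean flags for the named species are hand-copied from table 1, and the final entry is an invented placeholder for a genome that met neither criterion.
<preformat preformat-type="code">
```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GenomeScreen:
    """Screening flags for one genome (hypothetical structure)."""
    species: str
    has_mug_lca_tes: bool      # TEs descended from the MUG last common ancestor
    has_pb1_tes: bool          # apparent TEs carrying a PB1 domain
    est_only: bool = False     # EST assembly added for its phylogenetic position


def is_retained(genome):
    """Keep a genome if it meets either TE criterion or was added as an EST assembly."""
    return genome.has_mug_lca_tes or genome.has_pb1_tes or genome.est_only


screens = [
    GenomeScreen("Aquilegia coerulea", True, False),
    GenomeScreen("Citrus clementina", False, True),
    GenomeScreen("Amborella trichopoda", True, True),
    GenomeScreen("Nelumbo nucifera", False, False, est_only=True),
    GenomeScreen("placeholder genome", False, False),  # hypothetical, not from the paper
]

retained = [g.species for g in screens if is_retained(g)]
print(retained)
```
</preformat>
Only the placeholder entry is dropped, because it satisfies neither TE criterion and is not one of the added EST assemblies.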
<fig id="msw067-F1" orientation="portrait" position="float">
<label>Fig. 1</label>
<caption>
<p>Analysis flowchart. Numbered arrows indicate two separate paths; unnumbered arrows indicate steps performed in both iterations. Arrows numbered (1) indicate the first iteration, in which 62 genomes were individually searched to identify genomes of interest. Arrows numbered (2) indicate the second iteration, conducted on all selected genomes with additional curation steps. Programs used are shown in bold and additional descriptions in italics. Parallelogram, input or output; rectangle, process; diamond, action, which includes manual intervention. In the first iteration, TBLASTN, PHI, MAFFT, and FastTreeMP were run using TARGeT.</p>
</caption>
<graphic xlink:href="msw067f1p"></graphic>
</fig>
<table-wrap id="msw067-T1" orientation="portrait" position="float">
<label>Table 1</label>
<caption>
<p>Summary of the 25 Selected Genomes Included in the
<italic>MUG</italic>
Tree.</p>
</caption>
<table frame="hsides" rules="groups">
<thead>
<tr>
<th rowspan="1" colspan="1"></th>
<th rowspan="1" colspan="1"></th>
<th rowspan="1" colspan="1"></th>
<th rowspan="1" colspan="1"></th>
<th align="center" colspan="2" rowspan="1">Genome Selection Criteria
<hr></hr>
</th>
</tr>
<tr>
<th rowspan="1" colspan="1">Species Name</th>
<th rowspan="1" colspan="1">Division, Order</th>
<th rowspan="1" colspan="1">
<italic>AtMUGA</italic>
Homologs
<xref ref-type="table-fn" rid="msw067-TF2">
<sup>b</sup>
</xref>
</th>
<th rowspan="1" colspan="1">
<italic>AtMUGB</italic>
Homologs
<xref ref-type="table-fn" rid="msw067-TF2">
<sup>b</sup>
</xref>
</th>
<th rowspan="1" colspan="1">TEs Derived from
<italic>MUG</italic>
LCA
<xref ref-type="table-fn" rid="msw067-TF3">
<sup>c</sup>
</xref>
</th>
<th rowspan="1" colspan="1">TEs with PB1</th>
</tr>
</thead>
<tbody>
<tr>
<td rowspan="1" colspan="1">
<italic>Aquilegia coerulea</italic>
</td>
<td rowspan="1" colspan="1">Eudicot, Ranunculales</td>
<td rowspan="1" colspan="1">2</td>
<td rowspan="1" colspan="1">3</td>
<td rowspan="1" colspan="1">Yes</td>
<td rowspan="1" colspan="1">No</td>
</tr>
<tr>
<td rowspan="1" colspan="1">
<italic>Amborella trichopoda</italic>
</td>
<td rowspan="1" colspan="1">Basal angiosperm</td>
<td rowspan="1" colspan="1">3</td>
<td rowspan="1" colspan="1">0</td>
<td rowspan="1" colspan="1">Yes</td>
<td rowspan="1" colspan="1">Yes</td>
</tr>
<tr>
<td rowspan="1" colspan="1">
<italic>Citrus clementina</italic>
</td>
<td rowspan="1" colspan="1">Eudicot, Sapindales</td>
<td rowspan="1" colspan="1">3</td>
<td rowspan="1" colspan="1">4</td>
<td rowspan="1" colspan="1">No</td>
<td rowspan="1" colspan="1">Yes</td>
</tr>
<tr>
<td rowspan="1" colspan="1">
<italic>Citrus sinensis</italic>
</td>
<td rowspan="1" colspan="1">Eudicot, Sapindales</td>
<td rowspan="1" colspan="1">3</td>
<td rowspan="1" colspan="1">4</td>
<td rowspan="1" colspan="1">Yes</td>
<td rowspan="1" colspan="1">Yes</td>
</tr>
<tr>
<td rowspan="1" colspan="1">
<italic>Elaeis oleifera</italic>
</td>
<td rowspan="1" colspan="1">Monocot, Arecales</td>
<td rowspan="1" colspan="1">4</td>
<td rowspan="1" colspan="1">2</td>
<td rowspan="1" colspan="1">Yes</td>
<td rowspan="1" colspan="1">Yes</td>
</tr>
<tr>
<td rowspan="1" colspan="1">
<italic>Eutrema salsugineum</italic>
</td>
<td rowspan="1" colspan="1">Eudicot, Brassicales</td>
<td rowspan="1" colspan="1">4</td>
<td rowspan="1" colspan="1">3</td>
<td rowspan="1" colspan="1">Yes</td>
<td rowspan="1" colspan="1">No</td>
</tr>
<tr>
<td rowspan="1" colspan="1">
<italic>Fragaria vesca</italic>
</td>
<td rowspan="1" colspan="1">Eudicot, Rosales</td>
<td rowspan="1" colspan="1">3</td>
<td rowspan="1" colspan="1">3</td>
<td rowspan="1" colspan="1">Yes</td>
<td rowspan="1" colspan="1">No</td>
</tr>
<tr>
<td rowspan="1" colspan="1">
<italic>Glycine max</italic>
</td>
<td rowspan="1" colspan="1">Eudicot, Fabales</td>
<td rowspan="1" colspan="1">6</td>
<td rowspan="1" colspan="1">4</td>
<td rowspan="1" colspan="1">Yes</td>
<td rowspan="1" colspan="1">Yes</td>
</tr>
<tr>
<td rowspan="1" colspan="1">
<italic>Gossypium raimondii</italic>
</td>
<td rowspan="1" colspan="1">Eudicot, Malvales</td>
<td rowspan="1" colspan="1">5</td>
<td rowspan="1" colspan="1">6</td>
<td rowspan="1" colspan="1">Yes</td>
<td rowspan="1" colspan="1">No</td>
</tr>
<tr>
<td rowspan="1" colspan="1">
<italic>Malus domestica</italic>
</td>
<td rowspan="1" colspan="1">Eudicot, Rosales</td>
<td rowspan="1" colspan="1">7</td>
<td rowspan="1" colspan="1">8</td>
<td rowspan="1" colspan="1">Yes</td>
<td rowspan="1" colspan="1">No</td>
</tr>
<tr>
<td rowspan="1" colspan="1">
<italic>Manihot esculenta</italic>
</td>
<td rowspan="1" colspan="1">Eudicot, Malpighiales</td>
<td rowspan="1" colspan="1">1</td>
<td rowspan="1" colspan="1">4</td>
<td rowspan="1" colspan="1">Yes</td>
<td rowspan="1" colspan="1">Yes</td>
</tr>
<tr>
<td rowspan="1" colspan="1">
<italic>Mimulus guttatus</italic>
</td>
<td rowspan="1" colspan="1">Eudicot, Lamiales</td>
<td rowspan="1" colspan="1">4</td>
<td rowspan="1" colspan="1">4</td>
<td rowspan="1" colspan="1">Yes</td>
<td rowspan="1" colspan="1">Yes</td>
</tr>
<tr>
<td rowspan="1" colspan="1">
<italic>Nelumbo nucifera</italic>
<xref ref-type="table-fn" rid="msw067-TF1">
<sup>a</sup>
</xref>
</td>
<td rowspan="1" colspan="1">Eudicot, Proteales</td>
<td rowspan="1" colspan="1">3</td>
<td rowspan="1" colspan="1">0</td>
<td rowspan="1" colspan="1">No</td>
<td rowspan="1" colspan="1">No</td>
</tr>
<tr>
<td rowspan="1" colspan="1">
<italic>Nuphar advena</italic>
<xref ref-type="table-fn" rid="msw067-TF1">
<sup>a</sup>
</xref>
</td>
<td rowspan="1" colspan="1">Basal angiosperm</td>
<td rowspan="1" colspan="1">5</td>
<td rowspan="1" colspan="1">5</td>
<td rowspan="1" colspan="1">No</td>
<td rowspan="1" colspan="1">No</td>
</tr>
<tr>
<td rowspan="1" colspan="1">
<italic>Panicum virgatum</italic>
</td>
<td rowspan="1" colspan="1">Monocot, Poales</td>
<td rowspan="1" colspan="1">5</td>
<td rowspan="1" colspan="1">4</td>
<td rowspan="1" colspan="1">No</td>
<td rowspan="1" colspan="1">Yes</td>
</tr>
<tr>
<td rowspan="1" colspan="1">
<italic>Persea americana</italic>
<xref ref-type="table-fn" rid="msw067-TF1">
<sup>a</sup>
</xref>
</td>
<td rowspan="1" colspan="1">Magnoliids, Laurales</td>
<td rowspan="1" colspan="1">2</td>
<td rowspan="1" colspan="1">1</td>
<td rowspan="1" colspan="1">No</td>
<td rowspan="1" colspan="1">No</td>
</tr>
<tr>
<td rowspan="1" colspan="1">
<italic>Phoenix dactylifera</italic>
</td>
<td rowspan="1" colspan="1">Monocot, Arecales</td>
<td rowspan="1" colspan="1">3</td>
<td rowspan="1" colspan="1">4</td>
<td rowspan="1" colspan="1">Yes</td>
<td rowspan="1" colspan="1">No</td>
</tr>
<tr>
<td rowspan="1" colspan="1">
<italic>Physcomitrella patens</italic>
</td>
<td rowspan="1" colspan="1">Moss, Funariales</td>
<td rowspan="1" colspan="1">0</td>
<td rowspan="1" colspan="1">0</td>
<td rowspan="1" colspan="1">Yes</td>
<td rowspan="1" colspan="1">No</td>
</tr>
<tr>
<td rowspan="1" colspan="1">
<italic>Prunus persica</italic>
</td>
<td rowspan="1" colspan="1">Eudicot, Rosales</td>
<td rowspan="1" colspan="1">4</td>
<td rowspan="1" colspan="1">3</td>
<td rowspan="1" colspan="1">Yes</td>
<td rowspan="1" colspan="1">Yes</td>
</tr>
<tr>
<td rowspan="1" colspan="1">
<italic>Setaria italica</italic>
</td>
<td rowspan="1" colspan="1">Monocot, Poales</td>
<td rowspan="1" colspan="1">2</td>
<td rowspan="1" colspan="1">3</td>
<td rowspan="1" colspan="1">Yes</td>
<td rowspan="1" colspan="1">No</td>
</tr>
<tr>
<td rowspan="1" colspan="1">
<italic>Solanum lycopersicum</italic>
</td>
<td rowspan="1" colspan="1">Eudicot, Solanales</td>
<td rowspan="1" colspan="1">1</td>
<td rowspan="1" colspan="1">5</td>
<td rowspan="1" colspan="1">Yes</td>
<td rowspan="1" colspan="1">Yes</td>
</tr>
<tr>
<td rowspan="1" colspan="1">
<italic>Solanum tuberosum</italic>
</td>
<td rowspan="1" colspan="1">Eudicot, Solanales</td>
<td rowspan="1" colspan="1">1</td>
<td rowspan="1" colspan="1">4</td>
<td rowspan="1" colspan="1">Yes</td>
<td rowspan="1" colspan="1">Yes</td>
</tr>
<tr>
<td rowspan="1" colspan="1">
<italic>Theobroma cacao</italic>
</td>
<td rowspan="1" colspan="1">Eudicot, Malvales</td>
<td rowspan="1" colspan="1">4</td>
<td rowspan="1" colspan="1">5</td>
<td rowspan="1" colspan="1">Yes</td>
<td rowspan="1" colspan="1">No</td>
</tr>
<tr>
<td rowspan="1" colspan="1">
<italic>Vitis vinifera</italic>
</td>
<td rowspan="1" colspan="1">Eudicot, Vitales</td>
<td rowspan="1" colspan="1">3</td>
<td rowspan="1" colspan="1">3</td>
<td rowspan="1" colspan="1">Yes</td>
<td rowspan="1" colspan="1">Yes</td>
</tr>
<tr>
<td rowspan="1" colspan="1">
<italic>Zea mays</italic>
</td>
<td rowspan="1" colspan="1">Monocot, Poales</td>
<td rowspan="1" colspan="1">4</td>
<td rowspan="1" colspan="1">2</td>
<td rowspan="1" colspan="1">No</td>
<td rowspan="1" colspan="1">Yes</td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<fn id="msw067-TF1">
<p>
<sup>a</sup>
Genomes included in the final list although they did not fulfill the genome selection criteria. They were included because of their strategic position in the tree. These genomes are not shown in the main tree figures but are present in the
<ext-link ext-link-type="uri" xlink:href="http://mbe.oxfordjournals.org/lookup/suppl/doi:10.1093/molbev/msw067/-/DC1">supplementary tree figures</ext-link>
,
<ext-link ext-link-type="uri" xlink:href="http://mbe.oxfordjournals.org/lookup/suppl/doi:10.1093/molbev/msw067/-/DC1">Supplementary Material</ext-link>
online.</p>
</fn>
<fn id="msw067-TF2">
<p>
<sup>b</sup>
Number of homologous sequences for each given species.</p>
</fn>
<fn id="msw067-TF3">
<p>
<sup>c</sup>
LCA, Last common ancestor.</p>
</fn>
</table-wrap-foot>
</table-wrap>
</p>
</sec>
<sec>
<title>Generating and Curating the Phylogenies</title>
<p>We next constructed a phylogenetic tree that included all ETEs and TEs of interest (
<xref ref-type="fig" rid="msw067-F1">fig. 1</xref>
; iteration 2). Alignment curation is critical to the construction of high quality phylogenies, especially for TE genes because they often contain frameshifts, truncations, deletions, and insertions. If not removed, highly degenerate sequences with long gaps or poor alignment within otherwise well-conserved blocks may reduce the accuracy of phylogenetic inference (
<xref rid="msw067-B12" ref-type="bibr">Castresana 2000</xref>
). Conversely, we did not want to remove all pseudogenes, because doing so would eliminate the majority of TE-derived sequences (see below). We thus devised an approach that combined multiple methods to remove extraneous sequences.</p>
<p>To exclude sequences related to
<italic>MUG</italic>
only distantly, we increased the search stringency (see Materials and Methods), resulting in 2,077 sequences. We built a preliminary alignment (MAFFT) and used it to remove problematic sequences. We generated a final alignment (MAFFT), curated it further (GBlocks), and inferred a “full”
<italic>MUG</italic>
approximately maximum-likelihood phylogenetic tree (FastTreeMP) (
<ext-link ext-link-type="uri" xlink:href="http://mbe.oxfordjournals.org/lookup/suppl/doi:10.1093/molbev/msw067/-/DC1">supplementary fig. S2</ext-link>
,
<ext-link ext-link-type="uri" xlink:href="http://mbe.oxfordjournals.org/lookup/suppl/doi:10.1093/molbev/msw067/-/DC1">Supplementary Material</ext-link>
online). Lastly, for clarity of presentation, we also generated a “simplified”
<italic>MUG</italic>
tree containing only the sequences most closely related to
<italic>MUG</italic>
by pruning branches more distant than the last common ancestor of all known
<italic>MUG</italic>
genes (
<xref ref-type="fig" rid="msw067-F2">fig. 2</xref>
,
<ext-link ext-link-type="uri" xlink:href="http://mbe.oxfordjournals.org/lookup/suppl/doi:10.1093/molbev/msw067/-/DC1">supplementary fig. S3</ext-link>
,
<ext-link ext-link-type="uri" xlink:href="http://mbe.oxfordjournals.org/lookup/suppl/doi:10.1093/molbev/msw067/-/DC1">Supplementary Material</ext-link>
online). For
<italic>FRS</italic>
, we used a similar approach, except the final pruning step was not performed because of the large evolutionary distance between certain
<italic>FRS</italic>
subtrees (
<ext-link ext-link-type="uri" xlink:href="http://mbe.oxfordjournals.org/lookup/suppl/doi:10.1093/molbev/msw067/-/DC1">supplementary figs. S4 and S5</ext-link>
,
<ext-link ext-link-type="uri" xlink:href="http://mbe.oxfordjournals.org/lookup/suppl/doi:10.1093/molbev/msw067/-/DC1">Supplementary Material</ext-link>
online). To evaluate the robustness of the resulting
<italic>MUG</italic>
and
<italic>FRS</italic>
trees, we selected a subset of sequences (131) from each tree, representing all major clades, and performed analyses using two additional phylogenetic methods: a Bayesian (MrBayes) (
<xref rid="msw067-B61" ref-type="bibr">Ronquist and Huelsenbeck 2003</xref>
) and a Neighbor joining method (BioNJ) (
<xref rid="msw067-B22" ref-type="bibr">Gascuel 1997</xref>
) (see Materials and Methods). The resulting trees agreed with the topologies of the original approximately maximum-likelihood trees, including strong support for all key nodes (
<ext-link ext-link-type="uri" xlink:href="http://mbe.oxfordjournals.org/lookup/suppl/doi:10.1093/molbev/msw067/-/DC1">supplementary figs. S6–S9</ext-link>
,
<ext-link ext-link-type="uri" xlink:href="http://mbe.oxfordjournals.org/lookup/suppl/doi:10.1093/molbev/msw067/-/DC1">Supplementary Material</ext-link>
online).
<fig id="msw067-F2" orientation="portrait" position="float">
<label>F
<sc>ig</sc>
. 2</label>
<caption>
<p>Simplified phylogenetic tree of
<italic>MUG</italic>
genes and TEs. Curated phylogenetic tree showing sequences descended from the last common ancestor of
<italic>MUGA</italic>
and
<italic>MUGB</italic>
, rooted at fungal
<italic>hop</italic>
. Terminal triangles represent clades, with circumferential width proportional to number of genes (see key). In clades with genes from only one or a few species the species are labeled, otherwise clades are labeled according to taxon. Clades and branches from
<italic>Amborella trichopoda</italic>
are colored red. Major clades of
<italic>MUGA</italic>
and
<italic>MUGB</italic>
are labeled according to
<xref rid="msw067-B34" ref-type="bibr">Joly-Lopez et al. (2012)</xref>
and the positions of
<italic>Arabidopsis</italic>
<italic>thaliana AtMUG1-8</italic>
genes are indicated. For simplicity, TE features are categorized by the number of TE characteristics (out of 3) associated with each clade to emphasize differences between clades of known ETEs and clades of putative TEs (see key). For the same tree including detailed TE characteristics, see
<ext-link ext-link-type="uri" xlink:href="http://mbe.oxfordjournals.org/lookup/suppl/doi:10.1093/molbev/msw067/-/DC1">supplementary figure S3</ext-link>
,
<ext-link ext-link-type="uri" xlink:href="http://mbe.oxfordjournals.org/lookup/suppl/doi:10.1093/molbev/msw067/-/DC1">Supplementary Material</ext-link>
online. Radial branch lengths are proportional to the inferred number of substitutions per site (circumferential branch length is arbitrary). Circles at internal nodes have color and size corresponding to “local support values” (Shimodaira–Hasegawa test [
<xref rid="msw067-B75" ref-type="bibr">Zeh et al. 2009</xref>
;
<xref rid="msw067-B52" ref-type="bibr">Oliver et al. 2013</xref>
]). Empty red diamonds indicate known exaptation events; red asterisks indicate putative novel exaptation events. Greek letters indicate branches referred to in the main text. Dashed lines indicate clades, dotted lines are species labels. Pink branches are used to highlight
<italic>Am. trichopoda</italic>
clades or individual sequences. See
<ext-link ext-link-type="uri" xlink:href="http://mbe.oxfordjournals.org/lookup/suppl/doi:10.1093/molbev/msw067/-/DC1">supplementary figure S2</ext-link>
,
<ext-link ext-link-type="uri" xlink:href="http://mbe.oxfordjournals.org/lookup/suppl/doi:10.1093/molbev/msw067/-/DC1">Supplementary Material</ext-link>
online, for a fully expanded phylogenetic tree.</p>
</caption>
<graphic xlink:href="msw067f2p"></graphic>
</fig>
</p>
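<p>The final pruning step described above (keeping only sequences descended from the last common ancestor of all known genes) can be sketched as follows. This is a minimal illustration and not the authors' implementation; the <monospace>Node</monospace> class, the <monospace>lca_subtree</monospace> function, and the leaf names are hypothetical and introduced only for this example.</p>
<preformat>
```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class Node:
    """A rooted tree node; leaves carry a sequence name (hypothetical structure)."""
    name: Optional[str] = None
    children: List["Node"] = field(default_factory=list)

    def leaf_names(self) -> Set[str]:
        """Return the names of all leaves below (or at) this node."""
        if not self.children:
            return {self.name}
        names: Set[str] = set()
        for child in self.children:
            names |= child.leaf_names()
        return names


def lca_subtree(root: Node, targets: Set[str]) -> Node:
    """Return the smallest subtree of `root` that contains every target leaf."""
    if not targets.issubset(root.leaf_names()):
        raise ValueError("not all target leaves are present in the tree")
    node = root
    while True:
        # Descend for as long as one child alone still holds all targets.
        holder = next(
            (c for c in node.children if targets.issubset(c.leaf_names())),
            None,
        )
        if holder is None:
            return node
        node = holder


# Toy tree: (hop, (TE_far, ((MUGA_1, TE_near), MUGB_1)))
toy = Node(children=[
    Node("hop"),
    Node(children=[
        Node("TE_far"),
        Node(children=[
            Node(children=[Node("MUGA_1"), Node("TE_near")]),
            Node("MUGB_1"),
        ]),
    ]),
])
simplified = lca_subtree(toy, {"MUGA_1", "MUGB_1"})
print(sorted(simplified.leaf_names()))
```
</preformat>
<p>In this toy tree the distant sequence and the outgroup fall outside the retained subtree, whereas the TE nested between the two known genes is kept, which is the property the simplified tree relies on.</p>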
</sec>
<sec>
<title>ETEs versus TEs: Two Distinctive Types of Clade</title>
<p>We now had phylogenies that included only the sequences most closely related to
<italic>MUG</italic>
or
<italic>FRS</italic>
. But how could we determine which clades are ETEs and which are TEs? Certain attributes of individual sequences (and families) can be used to differentiate between TEs and ETEs (
<xref rid="msw067-B28" ref-type="bibr">Hoen and Bureau 2015</xref>
). Three such attributes are intrinsic products of phylogenetic analysis (
<xref ref-type="fig" rid="msw067-F2">fig. 2</xref>
): 1) TE genes often have high copy-number (
<xref rid="msw067-B19" ref-type="bibr">Feschotte and Pritham 2007</xref>
), which is reflected in the phylogenetic tree by clades with high numbers of paralogs (and low numbers of orthologs). 2) Recently active TE families may include genes with highly similar sequences, reflected by short terminal branches. 3) Many TEs are lineage-specific, reflected by incongruities between the topologies of the clades and those of the species phylogeny; in other words, whereas phylogenetic sister sequences of ETEs (and regular genes) are usually orthologs from sister species, sister clades of TEs are often from species that are not closely related.</p>
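<p>As an illustration of how the first two phylogenetic attributes might be summarized per clade, the sketch below computes copies per species and the fraction of short terminal branches. It is not taken from the study: the <monospace>Leaf</monospace> record, the <monospace>summarize_clade</monospace> function, and the 0.01 substitutions per site cutoff are assumptions chosen for the example.</p>
<preformat>
```python
from dataclasses import dataclass
from statistics import median
from typing import Dict, List


@dataclass
class Leaf:
    """One sequence in a clade (hypothetical record for this sketch)."""
    species: str
    terminal_branch: float  # substitutions per site


def summarize_clade(leaves: List[Leaf], short_branch: float = 0.01) -> Dict[str, float]:
    """Summarize copy number and terminal branch lengths for one clade.

    `short_branch` is an illustrative cutoff, not a value prescribed by the study.
    """
    species = {leaf.species for leaf in leaves}
    n_short = sum(1 for leaf in leaves if short_branch > leaf.terminal_branch)
    return {
        "n_sequences": len(leaves),
        "n_species": len(species),
        # Many paralogs from few species suggests a TE-like clade.
        "copies_per_species": len(leaves) / len(species),
        # Many near-identical copies suggests recent activity.
        "fraction_short_branches": n_short / len(leaves),
        "median_terminal_branch": median(leaf.terminal_branch for leaf in leaves),
    }


te_like = [Leaf("Elaeis oleifera", 0.005)] * 3 + [Leaf("Elaeis oleifera", 0.2)]
ete_like = [Leaf("Vitis vinifera", 0.10), Leaf("Prunus persica", 0.12)]
print(summarize_clade(te_like))
print(summarize_clade(ete_like))
```
</preformat>
<p>A clade with many copies from a single species and many short terminal branches would point toward a TE family, whereas roughly one copy per species is what an ETE (or regular gene) clade is expected to show. The third attribute, incongruence with the species phylogeny, requires the species tree and is not covered by this sketch.</p>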
<p>To better distinguish TEs from ETEs, we also evaluated additional sequence characteristics (
<xref ref-type="fig" rid="msw067-F2">fig. 2</xref>
;
<ext-link ext-link-type="uri" xlink:href="http://mbe.oxfordjournals.org/lookup/suppl/doi:10.1093/molbev/msw067/-/DC1">supplementary fig. S3</ext-link>
and
<ext-link ext-link-type="uri" xlink:href="http://mbe.oxfordjournals.org/lookup/suppl/doi:10.1093/molbev/msw067/-/DC1">table S2</ext-link>
,
<ext-link ext-link-type="uri" xlink:href="http://mbe.oxfordjournals.org/lookup/suppl/doi:10.1093/molbev/msw067/-/DC1">Supplementary Material</ext-link>
online) (
<xref rid="msw067-B28" ref-type="bibr">Hoen and Bureau 2015</xref>
): 1) presence of frameshifts and in-frame stop codons; 2) repetitiveness of flanking DNA sequences; 3) presence of potential terminal inverted repeats (TIRs); and 4) presence of a peptidase C48 conserved domain, which is useful in identifying clades of sequences that lack TIRs but nevertheless are TEs (see Materials and Methods). We also searched for the PB1 domain, not because it helps in distinguishing TEs from ETEs, but because of its as-yet unexplained presence in
<italic>MUGB</italic>
. 5) In addition, we examine microsynteny, which has previously been used to identify putative ETEs and has been described for
<italic>MUG</italic>
and
<italic>FRS</italic>
genes across Brassicaceae genomes (
<xref rid="msw067-B14" ref-type="bibr">Cowan et al. 2005</xref>
;
<xref rid="msw067-B28" ref-type="bibr">Hoen and Bureau 2015</xref>
). We extend this work by examining microsynteny in the diverse monocot and eudicot genomes examined in this study (
<ext-link ext-link-type="uri" xlink:href="http://mbe.oxfordjournals.org/lookup/suppl/doi:10.1093/molbev/msw067/-/DC1">supplementary fig. S10</ext-link>
,
<ext-link ext-link-type="uri" xlink:href="http://mbe.oxfordjournals.org/lookup/suppl/doi:10.1093/molbev/msw067/-/DC1">Supplementary Material</ext-link>
online).</p>
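<p>The first sequence characteristic, pseudogenic features, can be illustrated with a short sketch that counts in-frame stop codons and flags a length that is not a multiple of three. The function names are hypothetical, the standard genetic code is assumed, and the study's actual detection procedure (see Materials and Methods) may differ.</p>
<preformat>
```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard genetic code (assumption)


def count_inframe_stops(cds: str) -> int:
    """Count stop codons in reading frame 0, ignoring one terminal stop codon."""
    seq = cds.upper().replace("-", "")  # drop alignment gaps
    usable = len(seq) - len(seq) % 3    # ignore a trailing partial codon
    codons = [seq[i:i + 3] for i in range(0, usable, 3)]
    if codons and codons[-1] in STOP_CODONS:
        codons = codons[:-1]            # a final stop codon is expected
    return sum(1 for codon in codons if codon in STOP_CODONS)


def has_frameshift_hint(cds: str) -> bool:
    """Crude hint only: ungapped length is not a multiple of three."""
    return len(cds.replace("-", "")) % 3 != 0


intact = "ATGAAATAA"
degenerate = "ATGTAGAAATGATTTTAA"
print(count_inframe_stops(intact), count_inframe_stops(degenerate))
print(has_frameshift_hint("ATGAA"))
```
</preformat>
<p>A length check like this can only hint at a frameshift; reliably detecting frameshifts generally requires comparison against a homologous protein, which this sketch does not attempt.</p>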
<p>Combining these phylogenetic and sequence characteristics, two distinctive types of subtree or clade become apparent. The first type is putative ETEs, including all known
<italic>MUG</italic>
or
<italic>FRS</italic>
genes. Focusing first on
<italic>MUG</italic>
(
<xref ref-type="fig" rid="msw067-F2">fig. 2</xref>
), consistent with previous results (
<xref rid="msw067-B14" ref-type="bibr">Cowan et al. 2005</xref>
;
<xref rid="msw067-B34" ref-type="bibr">Joly-Lopez et al. 2012</xref>
) and our single-species phylogenies (
<ext-link ext-link-type="uri" xlink:href="http://mbe.oxfordjournals.org/lookup/suppl/doi:10.1093/molbev/msw067/-/DC1">supplementary fig. S1</ext-link>
,
<ext-link ext-link-type="uri" xlink:href="http://mbe.oxfordjournals.org/lookup/suppl/doi:10.1093/molbev/msw067/-/DC1">Supplementary Material</ext-link>
online), known
<italic>MUG</italic>
genes form two subtrees:
<italic>MUGA</italic>
and
<italic>MUGB</italic>
. Each subtree has several major clades consisting of orthologs from diverse species (
<italic>MUGA</italic>
, 74 sequences;
<italic>MUGB</italic>
, 78). Indeed, each subtree includes at least one sequence from every examined angiosperm (except
<italic>Amborella</italic>
<italic>trichopoda</italic>
; see below), but does not include any sequence from a nonangiosperm species. The topologies of each subtree and major clade are broadly congruent with the known species phylogeny (
<xref rid="msw067-B3" ref-type="bibr">Amborella Genome Project 2013</xref>
); that is, most branches (monocots vs. dicots, Arecales vs. Poales, asterids vs. rosids, Fabidae vs. Malvidae, etc.) agree with the species topology. Finally, in these putative ETE subtrees few sequences have TE characteristics (those that do are presumably false positives).</p>
<p>The second type of subtree or clade is putative TEs. This includes most of the remaining sequences (1,672 of 1,824 sequences in the
<italic>MUG</italic>
phylogeny). In contrast to the ETE subtrees, these clades are lineage-specific (they are from single or closely related species) and have sister clades from distantly rather than closely related species. They also often have short terminal branches. Although some small clades are ambiguous, many clades have multiple strong TE characteristics. For example, clade α (
<xref ref-type="fig" rid="msw067-F2">fig. 2</xref>
) has 14 sequences: all are from
<italic>Elaeis oleifera</italic>
, all have pseudogenic features (e.g., stop codons: min 2, mean 8), 43% have potential TIRs, and the clade has high DNA repetitiveness (median 51 copies;
<ext-link ext-link-type="uri" xlink:href="http://mbe.oxfordjournals.org/lookup/suppl/doi:10.1093/molbev/msw067/-/DC1">supplementary table S2</ext-link>
,
<ext-link ext-link-type="uri" xlink:href="http://mbe.oxfordjournals.org/lookup/suppl/doi:10.1093/molbev/msw067/-/DC1">Supplementary Material</ext-link>
online). Furthermore, while
<italic>E. oleifera</italic>
is a monocot, its nearest sister clade (β) is distant (0.86 subs/site), specific to the eudicot
<italic>Mimulus</italic>
<italic>guttatus</italic>
, and itself consists of 29 sequences with strong TE characteristics including several very short terminal branches (e.g., nine are shorter than 0.01 subs/site).</p>
</sec>
<sec>
<title>
<italic>MUGA</italic>
and
<italic>MUGB</italic>
Are Paraphyletic with Respect to TEs</title>
<p>Using our phylogeny labeled with genomic attributes that distinguish ETEs from TEs (
<xref ref-type="fig" rid="msw067-F2">fig. 2</xref>
,
<ext-link ext-link-type="uri" xlink:href="http://mbe.oxfordjournals.org/lookup/suppl/doi:10.1093/molbev/msw067/-/DC1">supplementary fig. S3</ext-link>
,
<ext-link ext-link-type="uri" xlink:href="http://mbe.oxfordjournals.org/lookup/suppl/doi:10.1093/molbev/msw067/-/DC1">Supplementary Material</ext-link>
online), we can now begin to address interesting evolutionary questions. First, did
<italic>MUGA</italic>
and
<italic>MUGB</italic>
originate together in a single exaptation event, or separately in two (or more) events? As expected,
<italic>MUGA</italic>
and
<italic>MUGB</italic>
are more closely related to one another (i.e., have a shorter genetic distance between the roots of their respective subtrees) than to the vast majority of apparent TE clades (
<ext-link ext-link-type="uri" xlink:href="http://mbe.oxfordjournals.org/lookup/suppl/doi:10.1093/molbev/msw067/-/DC1">supplementary fig. S2</ext-link>
,
<ext-link ext-link-type="uri" xlink:href="http://mbe.oxfordjournals.org/lookup/suppl/doi:10.1093/molbev/msw067/-/DC1">Supplementary Material</ext-link>
online). However, a few TE clades are more closely related to either
<italic>MUGA</italic>
or
<italic>MUGB</italic>
; in other words,
<italic>MUGA</italic>
and
<italic>MUGB</italic>
are paraphyletic with respect to several TE clades. Indeed, in
<xref ref-type="fig" rid="msw067-F2">figure 2</xref>
, all sequences are descended from the last common ancestor of
<italic>MUGA</italic>
and
<italic>MUGB</italic>
, so this is true of all the apparent TEs included in this figure.</p>
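<p>The pattern described above can be checked mechanically: a group of ETEs is not monophyletic if the smallest clade containing all of its members also contains other sequences. The sketch below uses a nested-tuple tree and hypothetical leaf names loosely modeled on the branches discussed in the text; it is an illustration under these assumptions, not the procedure used in the study.</p>
<preformat>
```python
from typing import Set, Tuple, Union

# A tree is either a leaf name or a tuple of subtrees (hypothetical encoding).
Tree = Union[str, Tuple["Tree", ...]]


def leaves(tree: Tree) -> Set[str]:
    """Return the set of leaf names in `tree`."""
    if isinstance(tree, str):
        return {tree}
    out: Set[str] = set()
    for child in tree:
        out |= leaves(child)
    return out


def smallest_clade(tree: Tree, group: Set[str]) -> Set[str]:
    """Leaf set of the smallest clade containing every member of `group`."""
    node = tree
    while not isinstance(node, str):
        holders = [c for c in node if group.issubset(leaves(c))]
        if not holders:
            break  # members are split across children: this node is the LCA
        node = holders[0]
    return leaves(node)


def intruders(tree: Tree, group: Set[str]) -> Set[str]:
    """Sequences nested within the group's clade that are not group members."""
    return smallest_clade(tree, group) - group


toy = ("hop", (("MUGA_1", "MUGA_2"),
               ("TE_zeta", ("TE_gamma", ("MUGB_1", "MUGB_2")))))
mug = {"MUGA_1", "MUGA_2", "MUGB_1", "MUGB_2"}
print(sorted(intruders(toy, mug)))
print(intruders(toy, {"MUGB_1", "MUGB_2"}))
```
</preformat>
<p>An empty result means the group forms a clade on its own; a non-empty result lists the sequences with respect to which the group is paraphyletic in the sense used here.</p>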
<p>For example, perhaps the most interesting clade of apparent TEs is the sister clade to
<italic>MUGB</italic>
(γ), a large family of TEs (53 sequences) in the basal angiosperm
<italic>Am. trichopoda</italic>
. Most originated in rapid expansions and all now are apparent pseudogenes (stop codons: min 3, mean 9.5) (
<ext-link ext-link-type="uri" xlink:href="http://mbe.oxfordjournals.org/lookup/suppl/doi:10.1093/molbev/msw067/-/DC1">supplementary table S2</ext-link>
,
<ext-link ext-link-type="uri" xlink:href="http://mbe.oxfordjournals.org/lookup/suppl/doi:10.1093/molbev/msw067/-/DC1">Supplementary Material</ext-link>
online). Consistent with their state of degeneration, only 11 are associated with high-identity TIRs and the median DNA copy-number is 22. Interestingly, many (13) of the sequences are associated with PB1 domains, suggesting that the
<italic>MUGB</italic>
PB1 domain originated from its last common ancestor with these TEs (see below). Note that although this phylogeny also suggests that
<italic>MUGA</italic>
has a small
<italic>Am. trichopoda</italic>
sister clade (two sequences) (δ), the placement of this particular clade is uncertain because these were among the highly degenerate sequences not removed in the final phylogeny, and furthermore the placement of this clade was unstable between multiple independent builds of the full phylogeny (not shown).</p>
<p>In addition to clade γ, there are two additional branches of putative TEs. One is again from
<italic>Am. trichopoda</italic>
(ε), but it has low local branch support (16%) so may not truly be a distinct branch. The other includes sequences from eight additional species (ζ), and although not all phylogenetic relationships between these putative subclades are well resolved, there is strong support that this branch as a whole is ancestral to
<italic>MUGB</italic>
(100% local branch support), and conversely that the
<italic>MUGA</italic>
branch is ancestral to it (95% local branch support). Note that one of the eight species in branch ζ is
<italic>Vitis</italic>
<italic>vinifera</italic>
, even though a previous analysis failed to identify any
<italic>V. vinifera</italic>
MULEs paraphyletic to
<italic>MUGA</italic>
and
<italic>MUGB</italic>
(
<xref rid="msw067-B34" ref-type="bibr">Joly-Lopez et al. 2012</xref>
), possibly because that analysis included only draft
<italic>V. vinifera</italic>
sequences (no other genomes) and a short
<italic>mudrA</italic>
subsequence (
<xref rid="msw067-B6" ref-type="bibr">Benjak et al. 2008</xref>
). Our topology is also supported by all respective single-genome phylogenies (
<ext-link ext-link-type="uri" xlink:href="http://mbe.oxfordjournals.org/lookup/suppl/doi:10.1093/molbev/msw067/-/DC1">supplementary fig. S1</ext-link>
,
<ext-link ext-link-type="uri" xlink:href="http://mbe.oxfordjournals.org/lookup/suppl/doi:10.1093/molbev/msw067/-/DC1">Supplementary Material</ext-link>
online), by multiple independent builds of the full phylogeny using a range of curation settings (not shown), and by analyses using multiple phylogenetic methods (
<ext-link ext-link-type="uri" xlink:href="http://mbe.oxfordjournals.org/lookup/suppl/doi:10.1093/molbev/msw067/-/DC1">supplementary figs. S6 and S8</ext-link>
,
<ext-link ext-link-type="uri" xlink:href="http://mbe.oxfordjournals.org/lookup/suppl/doi:10.1093/molbev/msw067/-/DC1">Supplementary Material</ext-link>
online).</p>
</sec>
<sec>
<title>Paraphyly Implies Separate Exaptation Events Because ETE Reversion Is Unlikely</title>
<p>Our results thus provide strong evidence that
<italic>MUGA</italic>
and
<italic>MUGB</italic>
are paraphyletic with respect to TEs. How might such a topology have arisen? The simplest explanation is straightforward:
<italic>MUGA</italic>
and
<italic>MUGB</italic>
originated in separate exaptation events (
<xref ref-type="fig" rid="msw067-F3">fig. 3
<italic>b</italic>
and
<italic>c</italic>
</xref>
) and the TE branches simply reflect the evolutionary history of the progenitor TE families.
<fig id="msw067-F3" orientation="portrait" position="float">
<label>Fig. 3</label>
<caption>
<p>One exaptation versus two. Differences in phylogenetic relationships that would result if two ETEs originate in a single exaptation versus separate exaptations. T*, common ancestral TE; T, T0, and T1, extant TEs; A and B, ETEs; empty diamonds, exaptation events; filled diamond, hypothetical ETE reversion event. (
<italic>a</italic>
) If A and B originate in a single exaptation, then A and B should not be paraphyletic (descended from a common evolutionary ancestor, but not including all the descendant groups) with respect to any TE family (but see case
<italic>d</italic>
). (
<italic>b</italic>
) If A and B originate in separate exaptations of a single TE family (T), then A and B are paraphyletic with respect to T. (
<italic>c</italic>
) If A and B originate in separate exaptations of different (but related) TE families (T0 and T1), then A and B are paraphyletic with respect to both T0 and T1. (
<italic>d</italic>
) Hypothetically, if an ETE were able to revert to being a TE (T) after the ETE family had already differentiated into at least two branches (A and B), it could also result in paraphyly with respect to the TE even if the ETE family had originated in a single exaptation event.</p>
</caption>
<graphic xlink:href="msw067f3p"></graphic>
</fig>
</p>
<p>But could there be another explanation? Theoretically, such paraphyly might also arise from a reversal of the TE exaptation process; that is, by ETEs transforming into TEs (
<xref ref-type="fig" rid="msw067-F3">fig. 3
<italic>d</italic>
</xref>
). Is ETE reversion plausible? Consider the functional changes and underlying mutations required for TE exaptation versus those that would be required for ETE reversion. Fundamentally, the process of TE exaptation involves a transition from persisting by self-replicative selection to persisting by phenotypic selection. That is, TEs persist by replicating within a genome to escape disabling mutations, whereas ETEs are conserved by conferring phenotypic benefits to the organism. To achieve this transition, TE exaptation entails various functional changes. The most basic of these follows directly from this fundamental change in selection regime: whereas for TEs transposition is essential for the sequence to persist, for ETEs it is not only nonessential, but harmful since transposition may disrupt ETE expression or cause other deleterious mutations. As a consequence, upon exaptation we expect ETE genes to become immobilized and their mobility-related flanking DNA sequences such as TIRs to become degraded or deleted. Indeed, all known (well-supported) ETE genes are immobilized. In addition, the phenotypic functions of ETE-encoded proteins rarely involve mobility-related molecular activities such as DNA cleavage or integration, which again are deleterious to the genome. Thus, with the rare exception of ETEs that have retained such activities in tightly controlled contexts, such as V(D)J recombination (
<xref rid="msw067-B37" ref-type="bibr">Kapitonov and Jurka 2005</xref>
), ETEs lose their ability to perform various molecular activities required for transposition. Yet another change is that whereas most TEs are silenced, ETE genes must be expressed at relatively high levels in order to confer phenotypic benefits.</p>
<p>Although TE exaptation involves several functional changes, as evidenced by known ETEs such as
<italic>MUG</italic>
and
<italic>FRS</italic>
, it does occur at some frequency. We propose two reasons for this. First, each underlying mutation has a high probability. For example, any one of many possible point mutations to a transposase could decrease or nullify its ability to catalyze transposition. In addition, TEs frequently sustain deletions, and any sufficiently long deletion in a TIR may lead to immobilization (
<xref rid="msw067-B19" ref-type="bibr">Feschotte and Pritham 2007</xref>
;
<xref rid="msw067-B65" ref-type="bibr">Sinzelle et al. 2009</xref>
). Furthermore, some mutations can have dual consequences. For example, transcriptional silencing in plants of DNA transposons is largely mediated by RNA-directed DNA methylation focused at the TIRs (
<xref rid="msw067-B39" ref-type="bibr">Kawashima and Berger 2014</xref>
); thus, partial deletion of a TIR could result in both immobilization and desilencing. The second reason we propose that TE exaptation can occur despite requiring multiple mutations is that none of these mutations are required for phenotypic selection pressure to begin; thus, they could occur independently and in any order. Fundamentally, this is because the molecular activities that permit an ETE to produce beneficial phenotypes are inherent to the TEs themselves. For example, DNA transposases such as
<italic>mudrA</italic>
are often exapted to become transcription factors, utilizing the TE's molecular functions of specific DNA binding and protein-protein interaction. Thus, a transposase could produce a beneficial phenotype and become a nascent ETE even before being immobilized, allowing phenotypic selection to drive the TE exaptation process (
<xref rid="msw067-B27" ref-type="bibr">Hoen and Bureau 2012</xref>
).</p>
<p>In contrast, consider the functional changes and mutations that would be required for an ETE to revert to being a TE. Note that for a reversion to be preserved in the phylogenetic record, the ETE family would first need to diversify into at least two well-separated branches; otherwise it would simply appear to be a regular TE family. First, the ETE would need to reacquire mobility-related flanking DNA structures such as TIRs. Unlike deleting them, reacquiring TIRs would seem exceedingly difficult. How might it occur? The following series of four mutations (not necessarily in this order) might for example permit TIR reacquisition: 1) a TE (with TIRs) inserts close to one side of the ETE; 2) a second TE of the same family inserts close to the other side; 3) the interior TIR is deleted from one TE; and 4) the interior TIR is deleted from the second TE. Another possible mechanism of TIR acquisition might be transduplication, which is the direct capture of genomic sequences by certain types of TEs; however, transduplication rarely if ever results in the duplication of entire genes (
<xref rid="msw067-B35" ref-type="bibr">Juretic et al. 2005</xref>
). Regardless of how it might occur, TIR acquisition could theoretically allow the ETE to become mobilized
<italic>in trans</italic>
by a transposase encoded by the TE family that donated the TIRs. But even this would not be sufficient to reestablish self-mobility and thus selfish selection, because the encoded protein would also need to reacquire any required molecular functions it had lost, such as DNA cleavage and integration, and doing so might require multiple, specific amino acid substitutions. Finally, the revertant transposase would need to be able to specifically bind particular target sequences in the reacquired TIRs, which would require additional specific amino acid substitutions to its DNA-binding domain (except in the seemingly unlikely event that the TIRs were reacquired from the same TE family from which the ETE descended, and binding-site specificity had been retained during the intervening evolutionary period). Furthermore, and most crucially, these mutations would all need to occur before selfish selection could even begin to act; that is, they would need to be simultaneous.</p>
<p>So, whereas TE exaptation could occur with relatively few, independent loss-of-function mutations and be driven by phenotypic selection, ETE reversion would require a larger set of gain-of-function mutations that must occur simultaneously. Therefore, while theoretically possible, reversion of a well-established ETE seems extremely improbable. Indeed, while a large and growing number of ETEs have been reported in the literature (
<xref rid="msw067-B28" ref-type="bibr">Hoen and Bureau 2015</xref>
), there are no reports of revertant ETEs, nor do we find evidence of reversion in either the
<italic>MUGA</italic>
or
<italic>MUGB</italic>
subtrees, nor in any of the
<italic>FRS</italic>
subtrees (see below). Finally, suppose that ETE reversion did very occasionally occur. We would still have difficulty explaining the paraphyly of
<italic>MUGA</italic>
and
<italic>MUGB</italic>
because the tree topology would require not just one, but at least two ETE reversions: one for each of the two (well-supported) TE branches (γ and ζ). Thus, the simplest explanation by far of the observed phylogeny is that
<italic>MUGA</italic>
and
<italic>MUGB</italic>
formed in separate exaptation events. Indeed, as shown below, differences between the two subtrees suggest that they also originated far apart in time, and thus ought to be considered as separate families, a conclusion further supported by differences in their gene structures (e.g., PB1) and experimental evidence (see below).</p>
</sec>
<sec>
<title>
<italic>MUGB</italic>
Originated in the Angiosperm Crown Group and Diversified in Monocots and Eudicots</title>
<p>Our approach enabled us to identify TEs closely related to
<italic>MUGA</italic>
and
<italic>MUGB</italic>
and show that they originated in separate exaptation events. By examining the detailed phylogenies of each ETE subtree, we can also resolve the timing of the original exaptation events, as well as the pattern and timing of diversification within each family subsequent to exaptation.</p>
<p>The
<italic>MUGB</italic>
subtree includes clades of single or low copy-number homologs in all examined crown monocots and eudicots (
<xref ref-type="fig" rid="msw067-F2">fig. 2</xref>
,
<ext-link ext-link-type="uri" xlink:href="http://mbe.oxfordjournals.org/lookup/suppl/doi:10.1093/molbev/msw067/-/DC1">supplementary fig. S2</ext-link>
,
<ext-link ext-link-type="uri" xlink:href="http://mbe.oxfordjournals.org/lookup/suppl/doi:10.1093/molbev/msw067/-/DC1">Supplementary Material</ext-link>
online). Consistent with previous results (
<xref rid="msw067-B34" ref-type="bibr">Joly-Lopez et al. 2012</xref>
), monocot and eudicot homologs form two monophyletic subtrees. This shows that the progenitor
<italic>MUGB</italic>
gene did not undergo duplication prior to the monocot-eudicot split, suggesting it may have originated not long before the split. The basal branches of the
<italic>MUGB</italic>
tree confirm this. In the basal-most extant angiosperm genome,
<italic>Am. trichopoda</italic>
, even though we detected both a large number of MULEs (see above) and several
<italic>MUGA</italic>
homologs (see below), we found no
<italic>MUGB</italic>
homolog. We did however find putative
<italic>MUGB</italic>
homologs in EST assemblies of the second most basal lineage,
<italic>Nu. advena</italic>
, as well as the magnoliids
<italic>Pe. americana</italic>
and
<italic>Liriodendron tulipifera</italic>
, suggesting that
<italic>MUGB</italic>
exaptation likely occurred between the divergence of the Amborellales and the Nymphaeales (
<ext-link ext-link-type="uri" xlink:href="http://mbe.oxfordjournals.org/lookup/suppl/doi:10.1093/molbev/msw067/-/DC1">supplementary figs. S1 and S2</ext-link>
,
<ext-link ext-link-type="uri" xlink:href="http://mbe.oxfordjournals.org/lookup/suppl/doi:10.1093/molbev/msw067/-/DC1">Supplementary Material</ext-link>
online). This places the origin of
<italic>MUGB</italic>
near the beginning of the angiosperm radiation (∼145 Ma), a period which produced all five major angiosperm lineages including the magnoliids, the monocots, and the eudicots (
<xref rid="msw067-B3" ref-type="bibr">
<italic>Amborella</italic>
Genome 2013</xref>
;
<xref rid="msw067-B76" ref-type="bibr">Zeng et al. 2014</xref>
).</p>
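<p>The bracketing logic behind this inference can be sketched in a few lines of Python. This is an illustrative sketch only, not part of the analysis pipeline: the function name bracket_origin is hypothetical, the presence and absence calls are transcribed from the paragraph above, and the relative order of the last three lineages is arbitrary because it does not affect the result. The sketch assumes that a non-detection reflects true absence, which sequence searches alone cannot guarantee.</p>

```python
def bracket_origin(lineages, present):
    """Bracket the origin of a gene between two successive lineages.

    lineages: names ordered from earliest- to latest-diverging.
    present:  dict mapping each name to True if a homolog was detected.
    Returns (last lineage without a homolog, first lineage with one).
    Assumes a non-detection reflects true absence rather than gene loss
    or incomplete data.
    """
    for i, name in enumerate(lineages):
        if present[name]:
            earlier = lineages[i - 1] if i > 0 else None
            return earlier, name
    # No lineage carries the gene: the origin is not bracketed.
    return (lineages[-1] if lineages else None), None


# Presence/absence calls for MUGB as described in the text.
lineages = ["Am. trichopoda", "Nu. advena", "magnoliids", "monocots", "eudicots"]
mugb = {
    "Am. trichopoda": False,  # no MUGB homolog found
    "Nu. advena": True,       # putative homologs in EST assemblies
    "magnoliids": True,       # Pe. americana and L. tulipifera
    "monocots": True,
    "eudicots": True,
}

print(bracket_origin(lineages, mugb))
```

<p>With these calls the function returns the pair (Am. trichopoda, Nu. advena), matching the interval proposed above for the origin of MUGB.</p>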
<p>Not only does the topology near the root of the entire
<italic>MUGB</italic>
subtree enable us to resolve the timing of the origin of
<italic>MUGB</italic>
, but similarly the topology of its internal clades enables us to resolve the timing of subsequent
<italic>MUGB</italic>
duplication events.
<italic>MUGB</italic>
has two main monocot-specific clades with long root branches: Bm1 (0.18 subs/site) and Bm2 (0.13 subs/site), suggesting that the single progenitor
<italic>MUGB</italic>
gene duplicated once in early monocot evolution. Each of these two clades is composed of diverse species including representatives of both examined monocot orders (Arecales and Poales). In addition to this early duplication, Bm1 and Bm2 each subsequently underwent duplications prior to Poales diversification, as well as further duplications in certain lineages (
<ext-link ext-link-type="uri" xlink:href="http://mbe.oxfordjournals.org/lookup/suppl/doi:10.1093/molbev/msw067/-/DC1">supplementary fig. S2</ext-link>
,
<ext-link ext-link-type="uri" xlink:href="http://mbe.oxfordjournals.org/lookup/suppl/doi:10.1093/molbev/msw067/-/DC1">Supplementary Material</ext-link>
online). Together with losses in certain lineages, these duplications resulted in
<italic>MUGB</italic>
having between two and five (median 3) paralogs per monocot genome.</p>
<p>In eudicots, the pattern of
<italic>MUGB</italic>
diversification is somewhat different. There are homologs in the basal eudicot
<italic>Aquilegia coerulea</italic>
(Ranunculales) but not in
<italic>N. nucifera</italic>
(Proteales), and
<italic>MUGB</italic>
is divided into three major eudicot clades (Be1, Be2, and Be3) (
<xref rid="msw067-B34" ref-type="bibr">Joly-Lopez et al. 2012</xref>
). Clade Be3, which diverged first and has a particularly long root branch (0.27 subs/site), includes homologs in all examined crown eudicot species except
<italic>Glycine max</italic>
. The Be3 Brassicales subclade (which includes
<italic>AtMUG8</italic>
) has a particularly long root branch (0.42 subs/site), as do the other
<italic>MUGB</italic>
Brassicales subclades. The two remaining major eudicot clades, Be1 and Be2, resulted from a far more recent duplication (root branch lengths of 0.08 and 0.07 subs/site, respectively), yet each also includes all examined crown eudicot species, except that Be2 contains no homolog from
<italic>Eutrema salsugineum</italic>
and Be1 contains none from
<italic>Manihot esculenta</italic>
. These and additional duplications and losses have resulted in
<italic>MUGB</italic>
having three to eight (median 4) paralogs per eudicot genome, typically one per subclade. Note that higher copy-numbers in certain genomes resulted from recent duplications and many are pseudogenes; for example, seven of eight
<italic>Malus domestica</italic>
<italic>MUGB</italic>
genes have premature stop codons (
<ext-link ext-link-type="uri" xlink:href="http://mbe.oxfordjournals.org/lookup/suppl/doi:10.1093/molbev/msw067/-/DC1">supplementary table S2</ext-link>
,
<ext-link ext-link-type="uri" xlink:href="http://mbe.oxfordjournals.org/lookup/suppl/doi:10.1093/molbev/msw067/-/DC1">Supplementary Material</ext-link>
online). In addition to the three previously assigned clades, this phylogeny suggests that one of the subclades of Be2 (θ) might better be considered as a fourth major eudicot clade.</p>
</sec>
<sec>
<title>
<italic>MUGA</italic>
Originated and Diversified in the Angiosperm Stem Group</title>
<p>Although
<italic>MUGB</italic>
originated during the angiosperm radiation,
<italic>MUGA</italic>
originated much earlier. The first clue of this early origin is the absence of any well-supported TE clades closely related to
<italic>MUGA</italic>
. Conclusive evidence is provided by the topology and basal branches of the
<italic>MUGA</italic>
subtree.
<italic>MUGA</italic>
consists of three major clades (A1, A2, and A3) that, unlike the
<italic>MUGB</italic>
clades, each include orthologs from all major angiosperm lineages examined, including monocots and eudicots (
<xref ref-type="fig" rid="msw067-F2">fig. 2</xref>
). Thus,
<italic>MUGA</italic>
underwent two duplications prior to the divergence of monocots and eudicots. In addition,
<italic>MUGA</italic>
includes homologs not only in the magnoliids
<italic>Pe. americana</italic>
and
<italic>L. tulipifera</italic>
, but importantly also in the basal angiosperm
<italic>Am. trichopoda</italic>
. Furthermore, these basal branches do not stem from the base of the
<italic>MUGA</italic>
tree, but instead there are
<italic>Am. trichopoda</italic>
and other basal angiosperm branches specific to each of the three major
<italic>MUGA</italic>
clades. Therefore,
<italic>MUGA</italic>
must have originated and diversified at least as early as the angiosperm stem group, prior to the radiation of all extant angiosperms. Indeed, the root branches of the two best-supported major clades (A1 and A2) are long (0.2 subs/site), suggesting that the exaptation and initial diversification of
<italic>MUGA</italic>
likely occurred long before this angiosperm radiation.</p>
<p>Subsequent to its initial diversification,
<italic>MUGA</italic>
did not undergo any further duplications, except for clade A2 in early core eudicots and certain species-specific duplications. Interestingly, the earliest-diverging clade (A3) is also the least conserved between taxa. For example, although it is present in
<italic>Carica papaya</italic>
, a basal member of the Brassicales, the same order as
<italic>Arabidopsis</italic>
, it does not include homologs from the Brassicaceae (e.g.,
<italic>A. thaliana</italic>
) (
<xref rid="msw067-B34" ref-type="bibr">Joly-Lopez et al. 2012</xref>
). Also, while it does include monocots of the order Arecales (
<italic>E. oleifera</italic>
and
<italic>P. dactylifera</italic>
), no homologs were found in the examined Poales (
<italic>Zea mays, Oryza sativa, Panicum virgatum,</italic>
and
<italic>Setaria italica</italic>
). Conversely, clades A1 and A2 each include homologs from both of these monocot orders. As a consequence of these diversification events,
<italic>MUGA</italic>
copy-number ranges from one (
<italic>Solanum lycopersicum</italic>
,
<italic>S. tuberosum</italic>
, and
<italic>M. esculenta</italic>
; all in clade A2) to seven (
<italic>Ma. domestica</italic>
), with most angiosperms having three or four
<italic>MUGA</italic>
paralogs (median 3).</p>
<p>Could
<italic>MUGA</italic>
have originated even earlier, prior to the divergence of angiosperms and gymnosperms? We searched the genomes of nonangiosperm species, including the assembled genome of the gymnosperm
<italic>P. abies</italic>
and EST assemblies for
<italic>Pinus sylvestris, Abies sibirica, Juniperus communis,</italic>
and
<italic>Gnetum gnemon</italic>
, but found no potential
<italic>MUG</italic>
homologs. We also found no homologs using supplementary TBLASTN searches (query
<italic>AtMUG1</italic>
; default E-value, 1e-3) of the genome assemblies (
<ext-link ext-link-type="uri" xlink:href="http://congenie.org">http://congenie.org</ext-link>
, last accessed October 23, 2015) of
<italic>Picea glauca</italic>
(white spruce; PG29-v4.0) (
<xref rid="msw067-B7" ref-type="bibr">Birol et al. 2013</xref>
) and
<italic>Pinus taeda</italic>
(loblolly pine; v1.0) (
<xref rid="msw067-B79" ref-type="bibr">Zimin et al. 2014</xref>
). Although negative search results such as these cannot definitively rule out the presence of
<italic>MUGA</italic>
, it is revealing that members of both
<italic>MUG</italic>
families have been found in virtually every angiosperm that has been searched, including full genome assemblies, EST assemblies, and even in most sufficiently large EST databases (both in this study and in
<xref rid="msw067-B34" ref-type="bibr">Joly-Lopez et al. [2012]</xref>
), yet none has been found in any nonangiosperm. Thus,
<italic>MUGA</italic>
likely originated early in angiosperm evolution, subsequent to divergence from gymnosperms (estimated at 290–310 Ma), but well before the angiosperm radiation (
<xref rid="msw067-B76" ref-type="bibr">Zeng et al. 2014</xref>
). Little is known about this lineage of preangiosperm species (the angiosperm stem group), which recent evidence suggests may have originated around 225–250 Ma in the Late-to-Middle Triassic (
<xref rid="msw067-B76" ref-type="bibr">Zeng et al. 2014</xref>
). We are just beginning to address whether
<italic>MUGA</italic>
may have played a role in the many crucial adaptations that occurred in the angiosperm stem group.</p>
</sec>
<sec>
<title>Experimental Results and
<italic>d</italic>
N/
<italic>d</italic>
S Analyses Suggest Functional Overlap within Families</title>
<p>As we show above, characterizing the phylogenetic patterns of ETE families and their cognate TEs is useful from an evolutionary standpoint because it elucidates when and how often TE exaptation and ETE diversification occur. It is also interesting from a practical standpoint, since the evolutionary histories of ETEs may reflect their potential phenotypic functions, molecular interactions, and genetic redundancies.</p>
<p>To illustrate, consider the four
<italic>MUGA</italic>
paralogs in
<italic>A. thaliana</italic>
, which are of particular interest because we previously characterized some of them phenotypically (
<xref rid="msw067-B34" ref-type="bibr">Joly-Lopez et al. 2012</xref>
). While single knockout mutants of
<italic>MUGA</italic>
genes show only subtle phenotypes under controlled laboratory conditions compared with wild-type Col-0 (
<xref ref-type="fig" rid="msw067-F4">fig. 4
<italic>A</italic>
</xref>
),
<italic>mug1 mug2</italic>
double mutants exhibit strong phenotypes for traits usually associated with plant fitness. Similarly,
<italic>MUGB</italic>
single mutants do not show strong phenotypes whereas certain double mutants, such as
<italic>mug7 mug8</italic>
, do have serious defects (
<xref rid="msw067-B34" ref-type="bibr">Joly-Lopez et al. 2012</xref>
). Here, we show that although the other
<italic>MUGA</italic>
double mutant combinations do exhibit phenotypes such as delayed flowering time and reduced rosette diameter and inflorescence height, these phenotypes appear under standard laboratory conditions to be weaker than for
<italic>mug1 mug2</italic>
(
<xref ref-type="fig" rid="msw067-F4">fig. 4
<italic>B</italic>
</xref>
) (
<xref rid="msw067-B34" ref-type="bibr">Joly-Lopez et al. 2012</xref>
).
<fig id="msw067-F4" orientation="portrait" position="float">
<label>F
<sc>ig</sc>
. 4</label>
<caption>
<p>Phenotypes for
<italic>MUG</italic>
mutants, including the triple mutant
<italic>mug1 mug2 mug3</italic>
in
<italic>Arabidopsis thaliana</italic>
. (
<italic>A</italic>
)
Phenotypes of wild-type (Col-0), and
<italic>mug1</italic>
to
<italic>mug4</italic>
single mutants, based on traits that have been associated with fitness and that span the plant life cycle. The phenotypic assays for
<italic>mug1</italic>
and
<italic>mug2</italic>
were performed independently in two different growth chambers from
<italic>mug3</italic>
and
<italic>mug4</italic>
; hence the two results for Col-0. (
<italic>B</italic>
) Results of the phenotypic analysis for the five previously uncharacterized double mutant combinations.
<italic>n</italic>
= 60 plants. Images of 2-week-old seedlings grown on one-half MS media and representing double mutant combinations of
<italic>MUGA</italic>
. (
<italic>C</italic>
) The table shows the segregation ratio for 120 F2 plants following genotyping by PCR. On the bottom, images of 3-week-old
<italic>mug1 mug2</italic>
and
<italic>mug1 mug2 mug3</italic>
mutant seedlings heterozygous and homozygous for
<italic>mug3</italic>
, all grown on the same one-half MS media supplemented with 2% sucrose. Scale bar = 1.3 cm. On the right, difference in size of
<italic>mug1 mug2</italic>
, and
<italic>mug1 mug2 mug3</italic>
mutant plants at 50 days after sterilization. Scale bar = 4 cm. (
<italic>D</italic>
) Results of the phenotypic analysis for the other triple mutant combinations.
<italic>n</italic>
= 30 plants. On the bottom, images of 2-week-old seedlings grown on one-half MS media. On the right, image of 40-day-old mature plants for the triple mutants compared with Col-0. Scale bar = 4 cm. For the phenotypic analyses, statistical significance is based on a two-sample Student's
<italic>t</italic>
-test (α = 0.05); *
<italic>P</italic>
 < 0.05, **
<italic>P</italic>
 < 0.01, ***
<italic>P</italic>
 < 0.001.</p>
</caption>
<graphic xlink:href="msw067f4p"></graphic>
</fig>
</p>
<p>The evolutionary history of
<italic>MUGA</italic>
is a starting point to explain these differences.
<italic>MUGA</italic>
has two major clades that are conserved among all angiosperms: A1 and A2 (A3 is not present in
<italic>A. thaliana</italic>
—see above;
<xref ref-type="fig" rid="msw067-F2">fig. 2</xref>
;
<ext-link ext-link-type="uri" xlink:href="http://mbe.oxfordjournals.org/lookup/suppl/doi:10.1093/molbev/msw067/-/DC1">supplementary figs. S2 and S3</ext-link>
,
<ext-link ext-link-type="uri" xlink:href="http://mbe.oxfordjournals.org/lookup/suppl/doi:10.1093/molbev/msw067/-/DC1">Supplementary Material</ext-link>
online). Clade A1 has two subclades that diverged early in eudicot evolution, one of which includes
<italic>AtMUG1</italic>
, the other
<italic>AtMUG2</italic>
. Although further experiments are warranted to confirm these results, this phylogeny suggests that, whereas
<italic>AtMUG1</italic>
and
<italic>AtMUG2</italic>
may have subfunctionalized to perform eudicot-specific functions that are difficult to detect in single mutants under our growth conditions, they may also have redundancies for more deeply conserved functions that are revealed by the double mutants. Indeed, the topology is similar to that of
<italic>AtFAR1</italic>
and
<italic>AtFHY3</italic>
, which are known to have partially redundant functions and direct molecular interactions.</p>
<p>In addition to
<italic>AtMUG1</italic>
and
<italic>AtMUG2</italic>
, clade A1 includes a third
<italic>A. thaliana</italic>
paralog,
<italic>AtMUG3</italic>
. The above phylogeny-based reasoning suggests that mutating all three A1 paralogs should produce even stronger defects. To test this hypothesis, we generated a
<italic>mug1 mug2 mug3</italic>
triple mutant. The progeny of an F2 plant homozygous for
<italic>mug1</italic>
and
<italic>mug2</italic>
and heterozygous for
<italic>mug3</italic>
were screened for triple mutants (
<italic>n</italic>
= 120), and homozygous seedlings were successfully genotyped only when seeds were grown on media supplemented with additional carbohydrates (2% sucrose vs. 1%). Segregation ratios suggest that
<italic>mug3</italic>
is recessive and segregating independently (
<xref ref-type="fig" rid="msw067-F4">fig. 4
<italic>C</italic>
</xref>
). The
<italic>mug1 mug2 mug3</italic>
triple mutant showed an additive phenotype that was more severe than the double mutant: increased pale yellow-green coloration, longer delays in flowering, and smaller overall size (
<xref ref-type="fig" rid="msw067-F4">fig. 4
<italic>C</italic>
</xref>
). In addition, whereas wild-type plants produce thousands of seeds, triple mutants yielded two orders of magnitude fewer seeds (average 30), and some plants yielded no seed at all (data not shown). These results support the hypothesis that clade A1 genes together perform critical functions, at least in
<italic>A. thaliana</italic>
.</p>
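<p>One conventional way to evaluate such segregation data is a chi-square goodness-of-fit test against the 1:2:1 genotype ratio expected among the progeny of a parent heterozygous at a single independently segregating locus. The Python sketch below illustrates that calculation. It is not a description of the analysis actually performed: the text does not state which test was used, the function names are hypothetical, and the observed counts are invented for illustration and are not the values reported in fig. 4C.</p>

```python
import math


def chi_square_gof(observed, ratio):
    """Return (chi-square statistic, expected counts) for observed
    genotype counts tested against an expected Mendelian ratio."""
    total = sum(observed)
    expected = [total * r / sum(ratio) for r in ratio]
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return stat, expected


def p_value_df2(stat):
    """Upper-tail probability of a chi-square statistic with 2 degrees
    of freedom (three genotype classes minus one). For 2 degrees of
    freedom the survival function is exactly exp(-x / 2)."""
    return math.exp(-stat / 2.0)


# Hypothetical counts of wild-type, heterozygous and homozygous mug3
# genotypes among 120 progeny (NOT the values reported in fig. 4C).
observed = [28, 62, 30]
stat, expected = chi_square_gof(observed, (1, 2, 1))
p = p_value_df2(stat)
print(expected, round(stat, 3), round(p, 3))
```

<p>A large P value, as obtained for these invented counts, would indicate no detectable departure from the 1:2:1 expectation, which is the pattern consistent with a single locus segregating independently.</p>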
<p>Finally, there is a fourth
<italic>A. thaliana</italic>
paralog (
<italic>AtMUG4</italic>
), which belongs to a second major clade, A2, that diverged from clade A1 during angiosperm stem group evolution and is itself conserved among all angiosperms. Such a long period of divergence suggests that
<italic>AtMUG1</italic>
,
<italic>AtMUG2</italic>
, and
<italic>AtMUG3</italic>
may have greater functional overlap or genetic redundancy with one another than with
<italic>AtMUG4</italic>
. This phylogeny-based reasoning is supported by our experimental results, which show that double and triple mutant combinations involving
<italic>AtMUG4</italic>
display less severe defects than
<italic>mug1 mug2</italic>
and
<italic>mug1 mug2 mug3</italic>
(
<xref ref-type="fig" rid="msw067-F4">fig. 4
<italic>B</italic>
and
<italic>D</italic>
</xref>
). Moreover, we have not been able to generate a quadruple mutant of all
<italic>A. thaliana MUGA</italic>
genes, suggesting that these long-diverged lineages may still maintain redundancy for some deeply conserved angiosperm function and that the absence of the
<italic>MUGA</italic>
family may be lethal in
<italic>A. thaliana</italic>
.</p>
<p>In general, while ETEs from the same family might share functional redundancies or similarities, ETEs derived from separate exaptation events likely do not. This is because, fundamentally, TE exaptation is the acquisition of a novel phenotypic function by a sequence with no prior phenotypic function; thus, the novel functions acquired in different exaptations ought to be independent of one another. Nevertheless, we might expect that certain functional similarities between different ETE families could result from common attributes between the progenitor TEs, such as their molecular activities or expression patterns. Functional similarities might also arise from similar phenotypic selective pressures at their times of exaptation. Applying similar phylogenetic analysis and reasoning to
<italic>FRS</italic>
might also aid our understanding of experimental results for that group of ETEs (
<xref rid="msw067-B47" ref-type="bibr">Lin and Wang 2004</xref>
).</p>
<p>In addition to experimental analysis, we had previously examined selective pressures in
<italic>MUGA</italic>
and
<italic>MUGB</italic>
coding regions using
<italic>d</italic>
N/
<italic>d</italic>
S analysis and found evidence of purifying selection for the entire coding region encompassing three conserved domains found in progenitor TEs (
<xref rid="msw067-B14" ref-type="bibr">Cowan et al. 2005</xref>
;
<xref rid="msw067-B34" ref-type="bibr">Joly-Lopez et al. 2012</xref>
). To explore this further, we selected 121 representative
<italic>MUG</italic>
sequences (
<xref ref-type="fig" rid="msw067-F2">fig. 2</xref>
,
<ext-link ext-link-type="uri" xlink:href="http://mbe.oxfordjournals.org/lookup/suppl/doi:10.1093/molbev/msw067/-/DC1">supplementary fig. S2</ext-link>
,
<ext-link ext-link-type="uri" xlink:href="http://mbe.oxfordjournals.org/lookup/suppl/doi:10.1093/molbev/msw067/-/DC1">Supplementary Material</ext-link>
online), generated a phylogenetic tree (BioNJ;
<ext-link ext-link-type="uri" xlink:href="http://mbe.oxfordjournals.org/lookup/suppl/doi:10.1093/molbev/msw067/-/DC1">supplementary fig. S11
<italic>A</italic>
</ext-link>
,
<ext-link ext-link-type="uri" xlink:href="http://mbe.oxfordjournals.org/lookup/suppl/doi:10.1093/molbev/msw067/-/DC1">Supplementary Material</ext-link>
online), and estimated
<italic>d</italic>
N/
<italic>d</italic>
S ratios using CODEML. Overall,
<italic>d</italic>
N/
<italic>d</italic>
S for the whole tree suggests that 73% and 26% of sites, respectively, are under negative and positive selection. Results were similar using a branch-site model (test 2): 80% and 20%, respectively (“root branch 1” in
<ext-link ext-link-type="uri" xlink:href="http://mbe.oxfordjournals.org/lookup/suppl/doi:10.1093/molbev/msw067/-/DC1">supplementary fig. S11
<italic>B</italic>
</ext-link>
,
<ext-link ext-link-type="uri" xlink:href="http://mbe.oxfordjournals.org/lookup/suppl/doi:10.1093/molbev/msw067/-/DC1">Supplementary Material</ext-link>
online).</p>
<p>To better understand selection on the
<italic>MUGA</italic>
subclades, we selected three additional branches labeled “2,” “3,” and “4” to represent clades A3, A2, and A1, respectively (
<ext-link ext-link-type="uri" xlink:href="http://mbe.oxfordjournals.org/lookup/suppl/doi:10.1093/molbev/msw067/-/DC1">supplementary fig. S11
<italic>A</italic>
</ext-link>
,
<ext-link ext-link-type="uri" xlink:href="http://mbe.oxfordjournals.org/lookup/suppl/doi:10.1093/molbev/msw067/-/DC1">Supplementary Material</ext-link>
online). The
<italic>d</italic>
N/
<italic>d</italic>
S ratio for branch 2 (Clade A1), which encompasses
<italic>MUG1</italic>
-
<italic>MUG3</italic>
, did not show significant positive selection, and the branch appears to be mostly fixed under negative selection. In contrast, we detected strong positive selection on branch 3 (
<italic>P</italic>
 = 0.0057) (Clade A2), which encompasses
<italic>MUG4</italic>
and homologs, where the ω-value for a subset of sites is well above 1, suggesting that certain sites, identified by a Bayes Empirical Bayes (BEB) analysis of positively selected sites, are under stronger positive selection than the branch overall (“root branch 3” in
<ext-link ext-link-type="uri" xlink:href="http://mbe.oxfordjournals.org/lookup/suppl/doi:10.1093/molbev/msw067/-/DC1">supplementary fig. S11
<italic>B</italic>
</ext-link>
,
<ext-link ext-link-type="uri" xlink:href="http://mbe.oxfordjournals.org/lookup/suppl/doi:10.1093/molbev/msw067/-/DC1">Supplementary Material</ext-link>
online). Four positively selected sites were detected in branch 3 (
<ext-link ext-link-type="uri" xlink:href="http://mbe.oxfordjournals.org/lookup/suppl/doi:10.1093/molbev/msw067/-/DC1">supplementary fig. S11
<italic>C</italic>
</ext-link>
,
<ext-link ext-link-type="uri" xlink:href="http://mbe.oxfordjournals.org/lookup/suppl/doi:10.1093/molbev/msw067/-/DC1">Supplementary Material</ext-link>
online). Interestingly, these amino acids lie on the border of a conserved domain (
<ext-link ext-link-type="uri" xlink:href="http://mbe.oxfordjournals.org/lookup/suppl/doi:10.1093/molbev/msw067/-/DC1">supplementary fig. S11
<italic>D</italic>
</ext-link>
,
<ext-link ext-link-type="uri" xlink:href="http://mbe.oxfordjournals.org/lookup/suppl/doi:10.1093/molbev/msw067/-/DC1">Supplementary Material</ext-link>
online). Although confirmation is needed, the observation that the
<italic>MUG1-MUG3</italic>
subtree branch is more “fixed” than
<italic>MUG4</italic>
suggests that the
<italic>MUG1-MUG3</italic>
ancestral sequence may have acquired a phenotypic function earlier than the
<italic>MUG4</italic>
ancestral sequence, possibly due to being domesticated separately from
<italic>MUG4</italic>
, which would be consistent with the tree topology (above). Alternatively, it may suggest that the function of
<italic>MUG1-MUG3</italic>
became fixed early while
<italic>MUG4</italic>
continued to undergo subfunctionalization.</p>
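<p>Branch-site tests of this kind are usually evaluated with a likelihood ratio test that compares the log-likelihood of a null model, in which ω is fixed at 1 on the foreground branch, with that of an alternative model in which ω is free to exceed 1. The Python sketch below shows that comparison under a common convention of referring twice the log-likelihood difference to a chi-square distribution with one degree of freedom. The function name and the log-likelihood values are hypothetical; they are not taken from the CODEML output of this study, and the text does not state how the reported P value was computed.</p>

```python
import math


def lrt_p_value_df1(lnl_null, lnl_alt):
    """Likelihood ratio test for two nested models.

    Returns (statistic, p-value), referring 2 * (lnL_alt - lnL_null) to
    a chi-square distribution with 1 degree of freedom, whose upper-tail
    probability is erfc(sqrt(x / 2)).
    """
    stat = max(2.0 * (lnl_alt - lnl_null), 0.0)
    return stat, math.erfc(math.sqrt(stat / 2.0))


# Hypothetical log-likelihoods for one foreground branch
# (illustrative values only, not results from this study).
stat, p = lrt_p_value_df1(lnl_null=-1000.0, lnl_alt=-997.0)
print(stat, p)
```

<p>A P value below the chosen significance threshold would favor the alternative model, that is, positive selection on a subset of sites along the foreground branch.</p>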
</sec>
<sec>
<title>
<italic>FRS</italic>
Consists of Five Families that Originated and Diversified at Different Times</title>
<p>In addition to
<italic>MUG</italic>
, one other group of plant ETEs has been well-characterized:
<italic>FRS</italic>
(
<xref rid="msw067-B45" ref-type="bibr">Lin et al. 2007</xref>
). Although both
<italic>MUG</italic>
and
<italic>FRS</italic>
are derived from TEs of the MULE superfamily, their respective TE lineages (mudrA and
<italic>FAR1</italic>
) are highly diverged (
<xref rid="msw067-B45" ref-type="bibr">Lin et al. 2007</xref>
). Similar to
<italic>MUG</italic>
, a previously published phylogeny includes among the descendants of the last common
<italic>FRS</italic>
ancestor two branches of TEs (
<italic>LOM-1</italic>
in
<italic>O. sativa</italic>
and
<italic>M. truncatula</italic>
; and
<italic>Jittery</italic>
in
<italic>Z. mays</italic>
), suggesting that
<italic>FRS</italic>
may have originated in more than one exaptation event (
<xref rid="msw067-B45" ref-type="bibr">Lin et al. 2007</xref>
).</p>
<p>To test whether
<italic>FRS</italic>
originated in one or in multiple exaptation events, and to resolve the timing of exaptation and subsequent
<italic>FRS</italic>
diversification, we followed an approach similar to the one used for
<italic>MUG</italic>
. We selected a representative query from each of five previously identified
<italic>FRS</italic>
lineages (
<italic>FRS10</italic>
,
<italic>FRS6</italic>
,
<italic>FHY3</italic>
,
<italic>FRS3</italic>
, and
<italic>FRS7</italic>
;
<ext-link ext-link-type="uri" xlink:href="http://mbe.oxfordjournals.org/lookup/suppl/doi:10.1093/molbev/msw067/-/DC1">supplementary fig. S12</ext-link>
,
<ext-link ext-link-type="uri" xlink:href="http://mbe.oxfordjournals.org/lookup/suppl/doi:10.1093/molbev/msw067/-/DC1">Supplementary Material</ext-link>
online) (
<xref rid="msw067-B45" ref-type="bibr">Lin et al. 2007</xref>
). We used TARGeT to search for homologs among the 25 final genomes, to which we added
<italic>O. sativa</italic>
and
<italic>M. truncatula</italic>
in order to include
<italic>LOM-1</italic>
. This resulted in 1,117 sequences, to which we added all 14 known
<italic>A. thaliana FRS</italic>
sequences, fungal
<italic>hop</italic>
as outgroup, and maize
<italic>Jittery</italic>
. We generated a multiple sequence alignment (MAFFT), curated it by removing columns with at least 50% gaps and using Gblocks to remove poorly conserved blocks (69% of 602 positions retained), and finally inferred a phylogenetic tree (FastTreeMP). Lastly, we used the same methods as for
<italic>MUG</italic>
to identify sequence attributes characteristic of TEs: premature stop codons, frameshifts, DNA repetitiveness, and potential TIRs, as well as the conserved domains PB1 and C48.</p>
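<p>The first curation step described above, removing alignment columns in which at least 50% of sequences have a gap, can be sketched as follows. This is a minimal illustration under stated assumptions, not the script used in this study: the function name drop_gappy_columns, the set of gap characters, and the toy alignment are all hypothetical, and the subsequent Gblocks filtering is not reproduced.</p>

```python
def drop_gappy_columns(alignment, min_gap_fraction=0.5, gap_chars="-."):
    """Remove alignment columns whose gap fraction is at least
    min_gap_fraction.

    alignment: dict mapping sequence names to aligned sequences of
    equal length. Returns a new dict with the retained columns only.
    """
    seqs = list(alignment.values())
    if not seqs:
        return {}
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("all aligned sequences must have the same length")
    n = len(seqs)
    keep = []
    for col in range(length):
        gaps = sum(1 for s in seqs if s[col] in gap_chars)
        # Keep the column only if its gap fraction is below the cutoff.
        if min_gap_fraction > gaps / n:
            keep.append(col)
    return {
        name: "".join(seq[c] for c in keep)
        for name, seq in alignment.items()
    }


# Toy alignment (hypothetical sequences, for illustration only).
toy = {
    "seqA": "MK-LV",
    "seqB": "M--LV",
    "seqC": "MKA-V",
    "seqD": "M---I",
}
trimmed = drop_gappy_columns(toy)
print(trimmed)
```

<p>In the toy alignment, the second and fourth columns have gaps in exactly half of the sequences and the third in three quarters, so all three are removed and only the first and last columns are retained.</p>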
<p>The results, while broadly consistent with the phylogeny of
<xref rid="msw067-B45" ref-type="bibr">Lin et al. (2007)</xref>
, were nonetheless surprising (
<xref ref-type="fig" rid="msw067-F5">fig. 5</xref>
tree;
<ext-link ext-link-type="uri" xlink:href="http://mbe.oxfordjournals.org/lookup/suppl/doi:10.1093/molbev/msw067/-/DC1">supplementary figs. S4 and S5</ext-link>
,
<ext-link ext-link-type="uri" xlink:href="http://mbe.oxfordjournals.org/lookup/suppl/doi:10.1093/molbev/msw067/-/DC1">Supplementary Material</ext-link>
online). Not only is
<italic>FRS</italic>
paraphyletic with respect to the two TE branches reported by
<xref rid="msw067-B45" ref-type="bibr">Lin et al. (2007)</xref>
, but moreover all five
<italic>FRS</italic>
subtrees are paraphyletic with respect to various apparent TE clades. Thus, each of these five subtrees likely arose in a separate exaptation event, making them separate ETE families (see above). The most obvious case is the
<italic>FRS10</italic>
subtree (18 sequences), which is ancestral to all other
<italic>FRS</italic>
subtrees (89% local support), as well as to a large subtree that includes diverse clades of apparent TEs (e.g.,
<italic>α</italic>
) (
<ext-link ext-link-type="uri" xlink:href="http://mbe.oxfordjournals.org/lookup/suppl/doi:10.1093/molbev/msw067/-/DC1">supplementary table S2</ext-link>
,
<ext-link ext-link-type="uri" xlink:href="http://mbe.oxfordjournals.org/lookup/suppl/doi:10.1093/molbev/msw067/-/DC1">Supplementary Material</ext-link>
online). The four remaining
<italic>FRS</italic>
families can be analyzed as two pairs of nearest neighbors (
<italic>FHY3</italic>
and
<italic>FRS6</italic>
;
<italic>FRS7</italic>
and
<italic>FRS3</italic>
), both of which are also separated by apparent TEs. The
<italic>FHY3</italic>
(98 sequences) and
<italic>FRS6</italic>
(75 sequences) subtrees are paraphyletic with respect to diverse apparent TE clades (e.g., β: 100% local support; 12 sequences; all but one have pseudogenic features; DNA repetitiveness, 56). Similarly, the
<italic>FRS7</italic>
(29 sequences) and
<italic>FRS3</italic>
(123 sequences) subtrees are paraphyletic with respect to apparent TE clades in various eudicots and
<italic>Am. trichopoda</italic>
(e.g., γ: 89% local support; five sequences; all five have pseudogenic features; potential TIRs, 60%; DNA repetitiveness, 13).
<fig id="msw067-F5" orientation="portrait" position="float">
<label>F
<sc>ig</sc>
. 5</label>
<caption>
<p>Phylogenetic tree of
<italic>FRS</italic>
genes and TEs. Curated phylogenetic tree of all identified
<italic>FRS</italic>
sequences, rooted in fungal
<italic>hop</italic>
. The five
<italic>FRS</italic>
clades are labeled following
<xref rid="msw067-B45" ref-type="bibr">Lin et al. (2007)</xref>
. Putative TE clades that include
<italic>Jittery</italic>
and
<italic>LOM-1</italic>
are indicated. Attributes are labeled only if present in more than one sequence per clade and for clarity only selected putative TE clades are labeled. Terminal triangles represent clades, with circumferential width proportional to number of genes (see key). In clades with genes from only one or a few species the species are labeled, otherwise clades are labeled according to taxon. Clades and branches from
<italic>Amborella trichopoda</italic>
are colored red. For simplicity, TE features are categorized by the number of TE characteristics (out of 2) associated with each clade to emphasize differences between clades of known ETEs and clades of putative TEs (see key). For the same tree including detailed TE characteristics, see
<ext-link ext-link-type="uri" xlink:href="http://mbe.oxfordjournals.org/lookup/suppl/doi:10.1093/molbev/msw067/-/DC1">supplementary figure S5</ext-link>
,
<ext-link ext-link-type="uri" xlink:href="http://mbe.oxfordjournals.org/lookup/suppl/doi:10.1093/molbev/msw067/-/DC1">Supplementary Material</ext-link>
online. Radial branch lengths are proportional to the inferred number of substitutions per site (circumferential branch length is arbitrary). Circles at internal nodes have color and size corresponding to their “local support values” (Shimodaira–Hasegawa test [
<xref rid="msw067-B75" ref-type="bibr">Zeh et al. 2009</xref>
;
<xref rid="msw067-B52" ref-type="bibr">Oliver et al. 2013</xref>
]). Empty red diamonds indicate known exaptation events; red asterisks indicate putative novel exaptation events. Greek letters indicate branches referred to in the main text. Dashed lines indicate clades, dotted lines are species labels. Pink branches are used to highlight
<italic>Am. trichopoda</italic>
clades or individual sequences. See
<ext-link ext-link-type="uri" xlink:href="http://mbe.oxfordjournals.org/lookup/suppl/doi:10.1093/molbev/msw067/-/DC1">supplementary figure S4</ext-link>
,
<ext-link ext-link-type="uri" xlink:href="http://mbe.oxfordjournals.org/lookup/suppl/doi:10.1093/molbev/msw067/-/DC1">Supplementary Material</ext-link>
online, for a fully expanded phylogenetic tree.</p>
</caption>
<graphic xlink:href="msw067f5p"></graphic>
</fig>
</p>
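<p>As a minimal sketch of what "paraphyletic with respect to apparent TE clades" means operationally, the following Python checks whether a set of labeled leaves forms a clade in a rooted tree, and which other leaves interrupt it. The nested-tuple tree, the leaf names, and the function names are hypothetical illustrations; they are not the curated FRS phylogeny or the software used in this study.</p>
<preformat preformat-type="code"><![CDATA[
```python
# Toy rooted tree as nested tuples; leaves are strings.
# Hypothetical topology: an ETE group interrupted by a TE clade.
TOY_TREE = (
    "outgroup",
    (
        ("ETE_a1", "ETE_a2"),
        (
            ("TE_1", "TE_2"),
            ("ETE_b1", "ETE_b2"),
        ),
    ),
)


def leaves(node):
    """Return the set of leaf names under a node."""
    if isinstance(node, str):
        return {node}
    result = set()
    for child in node:
        result |= leaves(child)
    return result


def is_monophyletic(tree, group):
    """True if some node of the tree has exactly `group` as its leaf set."""
    group = set(group)
    if leaves(tree) == group:
        return True
    if isinstance(tree, str):
        return False
    return any(is_monophyletic(child, group) for child in tree)


def interrupting_leaves(tree, group):
    """Leaves outside `group` inside the smallest clade that contains it."""
    group = set(group)
    node = tree
    while not isinstance(node, str):
        # Descend while a single child still contains the whole group.
        containing = [c for c in node if group.issubset(leaves(c))]
        if not containing:
            break
        node = containing[0]
    return leaves(node) - group
```
]]></preformat>
<p>In this toy tree the four ETE leaves do not form a clade, because TE_1 and TE_2 descend from their last common ancestor. This is the pattern that the text interprets as evidence for separate exaptation events.</p>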
<p>Furthermore, as for
<italic>MUGA</italic>
versus
<italic>MUGB</italic>
, multiple exaptation events are also supported by the internal topologies of the five
<italic>FRS</italic>
subtrees, which show that the five families originated and diversified at different times (
<xref ref-type="table" rid="msw067-T2">table 2</xref>
). Each of the
<italic>FRS</italic>
subtrees is broadly congruent with the species topology, but each has a different apparent last common ancestor (
<xref ref-type="fig" rid="msw067-F5">fig. 5</xref>
,
<ext-link ext-link-type="uri" xlink:href="http://mbe.oxfordjournals.org/lookup/suppl/doi:10.1093/molbev/msw067/-/DC1">supplementary fig. S5</ext-link>
,
<ext-link ext-link-type="uri" xlink:href="http://mbe.oxfordjournals.org/lookup/suppl/doi:10.1093/molbev/msw067/-/DC1">Supplementary Material</ext-link>
online): two appear to have originated in early angiosperms (
<italic>FRS3</italic>
and
<italic>FRS10</italic>
), two in early eudicots (
<italic>FRS6</italic>
and
<italic>FRS7</italic>
), and one in early core eudicots (
<italic>FHY3</italic>
).
<table-wrap id="msw067-T2" orientation="portrait" position="float">
<label>Table 2</label>
<caption>
<p>Periods of Origin and Diversification of Known
<italic>MUG</italic>
and
<italic>FRS</italic>
Families
<xref ref-type="table-fn" rid="msw067-TF4">
<sup>a</sup>
</xref>
.</p>
</caption>
<table frame="hsides" rules="groups">
<thead>
<tr>
<th rowspan="1" colspan="1"></th>
<th rowspan="1" colspan="1">Stem Group Angiosperms
<xref ref-type="table-fn" rid="msw067-TF5">
<sup>b</sup>
</xref>
</th>
<th rowspan="1" colspan="1">Basal Angiosperms
<xref ref-type="table-fn" rid="msw067-TF6">
<sup>c</sup>
</xref>
</th>
<th rowspan="1" colspan="1">Monocots
<xref ref-type="table-fn" rid="msw067-TF7">
<sup>d</sup>
</xref>
</th>
<th rowspan="1" colspan="1">Eudicots
<xref ref-type="table-fn" rid="msw067-TF8">
<sup>e</sup>
</xref>
</th>
<th rowspan="1" colspan="1">Core Eudicots
<xref ref-type="table-fn" rid="msw067-TF9">
<sup>f</sup>
</xref>
</th>
</tr>
</thead>
<tbody>
<tr>
<td rowspan="1" colspan="1">
<italic>MUGA</italic>
</td>
<td rowspan="1" colspan="1">Origin +2</td>
<td rowspan="1" colspan="1"></td>
<td rowspan="1" colspan="1"></td>
<td rowspan="1" colspan="1"></td>
<td rowspan="1" colspan="1">1</td>
</tr>
<tr>
<td rowspan="1" colspan="1">
<italic>MUGB</italic>
</td>
<td rowspan="1" colspan="1"></td>
<td rowspan="1" colspan="1">Origin</td>
<td rowspan="1" colspan="1">1</td>
<td rowspan="1" colspan="1">1</td>
<td rowspan="1" colspan="1">1</td>
</tr>
<tr>
<td rowspan="1" colspan="1">
<italic>FRS3</italic>
</td>
<td rowspan="1" colspan="1"></td>
<td rowspan="1" colspan="1">Origin
<xref ref-type="table-fn" rid="msw067-TF10">
<sup>g</sup>
</xref>
</td>
<td rowspan="1" colspan="1">4</td>
<td rowspan="1" colspan="1">2</td>
<td rowspan="1" colspan="1">1</td>
</tr>
<tr>
<td rowspan="1" colspan="1">
<italic>FRS10</italic>
</td>
<td rowspan="1" colspan="1"></td>
<td rowspan="1" colspan="1">Origin</td>
<td rowspan="1" colspan="1"></td>
<td rowspan="1" colspan="1"></td>
<td rowspan="1" colspan="1">1</td>
</tr>
<tr>
<td rowspan="1" colspan="1">
<italic>FRS6</italic>
</td>
<td rowspan="1" colspan="1"></td>
<td rowspan="1" colspan="1"></td>
<td rowspan="1" colspan="1"></td>
<td rowspan="1" colspan="1">Origin</td>
<td rowspan="1" colspan="1">3</td>
</tr>
<tr>
<td rowspan="1" colspan="1">
<italic>FRS7</italic>
</td>
<td rowspan="1" colspan="1"></td>
<td rowspan="1" colspan="1"></td>
<td rowspan="1" colspan="1"></td>
<td rowspan="1" colspan="1">Origin</td>
<td rowspan="1" colspan="1"></td>
</tr>
<tr>
<td rowspan="1" colspan="1">
<italic>FHY3</italic>
</td>
<td rowspan="1" colspan="1"></td>
<td rowspan="1" colspan="1"></td>
<td rowspan="1" colspan="1"></td>
<td rowspan="1" colspan="1"></td>
<td rowspan="1" colspan="1">Origin +4</td>
</tr>
<tr>
<td rowspan="1" colspan="1">Total</td>
<td rowspan="1" colspan="1">3</td>
<td rowspan="1" colspan="1">3</td>
<td rowspan="1" colspan="1">5</td>
<td rowspan="1" colspan="1">5</td>
<td rowspan="1" colspan="1">12</td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<fn id="msw067-TF4">
<p>
<sup>a</sup>
Numerals are the number of postexaptation duplications occurring in a given period (interior nodes).</p>
</fn>
<fn id="msw067-TF5">
<p>
<sup>b</sup>
Includes monocots, eudicots, and
<italic>Am. trichopoda</italic>
.</p>
</fn>
<fn id="msw067-TF6">
<p>
<sup>c</sup>
Includes monocots and eudicots but does not include
<italic>Am. trichopoda</italic>
.</p>
</fn>
<fn id="msw067-TF7">
<p>
<sup>d</sup>
Includes only monocots.</p>
</fn>
<fn id="msw067-TF8">
<p>
<sup>e</sup>
Includes only eudicots and includes
<italic>A. coerulea</italic>
.</p>
</fn>
<fn id="msw067-TF9">
<p>
<sup>f</sup>
Includes only eudicots and does not include
<italic>A. coerulea</italic>
.</p>
</fn>
<fn id="msw067-TF10">
<p>
<sup>g</sup>
May have originated in stem group (see text).</p>
</fn>
</table-wrap-foot>
</table-wrap>
</p>
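<p>The "Total" row of table 2 can be reproduced with a short tally. The sketch below re-encodes the table cells by hand and assumes, as our reading of the table rather than a stated rule, that each "Origin" counts as one event in addition to the listed duplications. The dictionary layout and names are illustrative only.</p>
<preformat preformat-type="code"><![CDATA[
```python
# Table 2 re-encoded as (origin_here, duplications) per family and period.
PERIODS = ["stem", "basal", "monocots", "eudicots", "core_eudicots"]

TABLE2 = {
    "MUGA":  {"stem": (True, 2), "core_eudicots": (False, 1)},
    "MUGB":  {"basal": (True, 0), "monocots": (False, 1),
              "eudicots": (False, 1), "core_eudicots": (False, 1)},
    "FRS3":  {"basal": (True, 0), "monocots": (False, 4),
              "eudicots": (False, 2), "core_eudicots": (False, 1)},
    "FRS10": {"basal": (True, 0), "core_eudicots": (False, 1)},
    "FRS6":  {"eudicots": (True, 0), "core_eudicots": (False, 3)},
    "FRS7":  {"eudicots": (True, 0)},
    "FHY3":  {"core_eudicots": (True, 4)},
}


def column_totals(table):
    """Sum origins (counted as 1 each) and duplications for every period."""
    totals = {period: 0 for period in PERIODS}
    for cells in table.values():
        for period, (is_origin, duplications) in cells.items():
            totals[period] += int(is_origin) + duplications
    return totals
```
]]></preformat>
<p>Under that assumption the tally gives 3, 3, 5, 5, and 12 events for the five periods, matching the printed totals.</p>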
<p>Specifically, the
<italic>FRS3</italic>
family contains multiple monocot clades that include both Arecales and Poales, and similarly contains multiple clades that include diverse eudicots including the basal eudicot
<italic>Aq. coerulea</italic>
. (Note that one monocot clade is not monophyletic with the other monocot clades, but has low local branch support (46%) and thus is likely mislocated.) Interestingly, the sister clade to this well-supported part of the
<italic>FRS3</italic>
subtree is a small
<italic>Am. trichopoda</italic>
clade (δ) with no TE characteristics (
<ext-link ext-link-type="uri" xlink:href="http://mbe.oxfordjournals.org/lookup/suppl/doi:10.1093/molbev/msw067/-/DC1">supplementary table S2</ext-link>
,
<ext-link ext-link-type="uri" xlink:href="http://mbe.oxfordjournals.org/lookup/suppl/doi:10.1093/molbev/msw067/-/DC1">Supplementary Material</ext-link>
online) but only low local support (32%), and might also be part of the
<italic>FRS3</italic>
family. This suggests that
<italic>FRS3</italic>
was likely exapted in the basal angiosperms, or perhaps earlier in the angiosperm stem group, and subsequently diversified into various descendant lineages.</p>
<p>The
<italic>FRS10</italic>
family also includes monocot homologs, and although it also has an
<italic>Am. trichopoda</italic>
sister clade, both
<italic>Am. trichopoda</italic>
sequences contain multiple stop codons and frameshifts. Furthermore, the clade has an ancestral eudicot branch, in violation of the species phylogeny (
<ext-link ext-link-type="uri" xlink:href="http://mbe.oxfordjournals.org/lookup/suppl/doi:10.1093/molbev/msw067/-/DC1">supplementary fig. S4</ext-link>
,
<ext-link ext-link-type="uri" xlink:href="http://mbe.oxfordjournals.org/lookup/suppl/doi:10.1093/molbev/msw067/-/DC1">Supplementary Material</ext-link>
online), which includes a clade of six highly similar sequences in
<italic>G. max</italic>
that contain stop codons and frameshifts. These results suggest that
<italic>AtFRS10</italic>
and
<italic>AtFRS11</italic>
might have separate origins, but additional analysis would be required to confirm this. It seems more likely that they do form a single family, which originated in early angiosperms and diversified only once, in early core eudicots.</p>
<p>The remaining three
<italic>FRS</italic>
families appear to be eudicot-specific. In the case of the
<italic>FRS7</italic>
family,
<xref rid="msw067-B45" ref-type="bibr">Lin et al. (2007)</xref>
reported monocot homologs; however, we found none, even though we included in our search both monocot genomes searched by
<xref rid="msw067-B45" ref-type="bibr">Lin et al. (2007)</xref>
(
<italic>Z. mays</italic>
and
<italic>O. sativa</italic>
). To confirm, we used TARGeT to individually search all 11 monocots in our initial 62 genomes, yet still found no
<italic>FRS7</italic>
homologs (not shown). Instead, the most basal
<italic>FRS7</italic>
homologs we found were in
<italic>Aq. coerulea</italic>
, suggesting that
<italic>FRS7</italic>
originated in early eudicots. Interestingly, unlike other
<italic>FRS</italic>
and
<italic>MUG</italic>
families,
<italic>FRS7</italic>
is single-copy in most genomes, with only one widely conserved duplication, which occurred in the early Brassicaceae.</p>
<p>Lastly, the
<italic>FHY3</italic>
family is of particular interest because it includes
<italic>AtFHY3</italic>
and
<italic>AtFAR1</italic>
, currently the best characterized plant ETEs (
<xref rid="msw067-B72" ref-type="bibr">Wang and Wang 2015</xref>
). The
<italic>FHY3</italic>
family includes homologs in diverse core eudicots including the asterids (e.g.,
<italic>M. guttatus</italic>
), but not in
<italic>Aq. coerulea</italic>
, suggesting it likely originated in early core eudicots. Thus, interestingly, among the seven
<italic>MUG</italic>
and
<italic>FRS</italic>
families, the best-characterized family also happens to be the youngest. Furthermore, it has the distinction of comprising more core eudicot-specific clades than any other family: five of them, each with a single paralog in most core eudicots, including both rosids and asterids. Consistent with previous results (
<xref rid="msw067-B45" ref-type="bibr">Lin et al. 2007</xref>
),
<italic>AtFHY3</italic>
and
<italic>AtFAR1</italic>
are the sole
<italic>A. thaliana</italic>
paralogs in neighboring clades.</p>
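<p>The reasoning used above to date each family (the oldest period whose defining lineages all contain homologs) can be sketched as a small decision function. The period definitions follow the footnotes of table 2; the function name, argument names, and the "unresolved" fallback are hypothetical and are not part of the original analysis.</p>
<preformat preformat-type="code"><![CDATA[
```python
def period_of_origin(has_amborella, has_monocots, has_aquilegia, has_core_eudicots):
    """Map presence/absence of homologs to a table 2 period label."""
    has_eudicots = has_aquilegia or has_core_eudicots
    if has_amborella:
        # Stem group: Am. trichopoda plus monocots plus eudicots.
        if has_monocots and has_eudicots:
            return "stem group angiosperms"
        return "unresolved"
    if has_monocots and has_eudicots:
        return "basal angiosperms"
    if has_monocots:
        return "monocots"
    if has_aquilegia:
        # Eudicots including the basal eudicot Aq. coerulea.
        return "eudicots"
    if has_core_eudicots:
        return "core eudicots"
    return "unresolved"
```
]]></preformat>
<p>For example, a family found in core eudicots but not in Aq. coerulea, monocots, or Am. trichopoda (the pattern reported for FHY3) maps to "core eudicots", whereas counting the weakly supported Am. trichopoda clade as part of FRS3 would move that family from "basal angiosperms" to "stem group angiosperms".</p>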
</sec>
<sec>
<title>Potential Novel ETEs</title>
<p>Our results also suggest that the two
<italic>MUG</italic>
and five
<italic>FRS</italic>
subtrees may not be the only ETEs in these phylogenies. A few additional clades (at least nine clades containing 126 sequences;
<ext-link ext-link-type="uri" xlink:href="http://mbe.oxfordjournals.org/lookup/suppl/doi:10.1093/molbev/msw067/-/DC1">supplementary table S2</ext-link>
,
<ext-link ext-link-type="uri" xlink:href="http://mbe.oxfordjournals.org/lookup/suppl/doi:10.1093/molbev/msw067/-/DC1">Supplementary Material</ext-link>
online) have attributes suggesting that they may also be ETEs rather than TEs, attributes such as low copy-number, high proportions of paralogs to orthologs, and topologies congruent with the known species phylogeny.</p>
<p>In the simplified
<italic>MUG</italic>
tree (
<xref ref-type="fig" rid="msw067-F2">fig. 2</xref>
,
<ext-link ext-link-type="uri" xlink:href="http://mbe.oxfordjournals.org/lookup/suppl/doi:10.1093/molbev/msw067/-/DC1">supplementary fig. S3</ext-link>
,
<ext-link ext-link-type="uri" xlink:href="http://mbe.oxfordjournals.org/lookup/suppl/doi:10.1093/molbev/msw067/-/DC1">Supplementary Material</ext-link>
online), there is one such clade (λ): three sequences that are single-copy in
<italic>V. vinifera</italic>
,
<italic>S. lycopersicum</italic>
, and
<italic>S. tuberosum</italic>
, none of which have premature stop codons or frameshifts, TIRs, or repetitive flanking DNA. However, this clade is missing sequences in several sister species, so unlike the
<italic>MUG</italic>
and
<italic>FRS</italic>
families, if this clade is an ETE family, it is only weakly conserved.</p>
<p>More convincing cases are found in the
<italic>FRS</italic>
tree (
<xref ref-type="fig" rid="msw067-F5">fig. 5</xref>
,
<ext-link ext-link-type="uri" xlink:href="http://mbe.oxfordjournals.org/lookup/suppl/doi:10.1093/molbev/msw067/-/DC1">supplementary fig. S5</ext-link>
,
<ext-link ext-link-type="uri" xlink:href="http://mbe.oxfordjournals.org/lookup/suppl/doi:10.1093/molbev/msw067/-/DC1">Supplementary Material</ext-link>
online). For example, clade ε is closely related to
<italic>FRS6</italic>
but separated by a large family of TEs in citrus (β) and other taxa. The clade includes orthologs from all examined monocots in several species-congruent clades, a topology similar to the monocot clades of the
<italic>FRS3</italic>
family. Only six of the 47 sequences have pseudogenic characteristics (five from
<italic>E. oleifera</italic>
alone), and the clade has no other TE characteristics. To further investigate this clade, we examined the six
<italic>O. sativa</italic>
paralogs using Genomicus (
<xref rid="msw067-B49" ref-type="bibr">Louis et al. 2013</xref>
) to determine whether they have maintained microsynteny with at least two adjacent genes, a characteristic common to most ETEs but not functional TE genes (
<xref rid="msw067-B28" ref-type="bibr">Hoen and Bureau 2015</xref>
). Unlike a negative control of 25 known TEs that had no microsynteny beyond Oryza, and similar to a positive control of
<italic>FRS3</italic>
homologs, in clade ε five of six
<italic>O. sativa</italic>
sequences do have conserved microsynteny among the Poaceae (
<ext-link ext-link-type="uri" xlink:href="http://mbe.oxfordjournals.org/lookup/suppl/doi:10.1093/molbev/msw067/-/DC1">supplementary table S2</ext-link>
,
<ext-link ext-link-type="uri" xlink:href="http://mbe.oxfordjournals.org/lookup/suppl/doi:10.1093/molbev/msw067/-/DC1">Supplementary Material</ext-link>
online). These results suggest that clade ε is a family of bona fide monocot-specific ETEs.</p>
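<p>The microsynteny criterion (a gene retains at least two adjacent genes across species) can be illustrated with a toy check. This is only a sketch: it assumes gene orders are given as lists of ortholog-group labels, uses an arbitrary window of three genes on each side, and does not reproduce how Genomicus computes synteny. All names are hypothetical.</p>
<preformat preformat-type="code"><![CDATA[
```python
def flanking(gene_order, focal, window=3):
    """Return the genes within `window` positions of `focal`, excluding it."""
    i = gene_order.index(focal)
    start = max(0, i - window)
    return set(gene_order[start:i] + gene_order[i + 1:i + 1 + window])


def has_microsynteny(order_a, focal_a, order_b, focal_b, window=3, min_shared=2):
    """True if the focal genes share at least `min_shared` flanking orthologs."""
    shared = flanking(order_a, focal_a, window).intersection(
        flanking(order_b, focal_b, window))
    return len(shared) >= min_shared
```
]]></preformat>
<p>A sequence whose neighbors are conserved between two grass genomes would pass this check, whereas a recently transposed element, whose flanking genes differ between genomes, would not.</p>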
<p>An alternative explanation for this topology is that, rather than novel ETEs, clade ε might be part of the
<italic>FRS6</italic>
family, even though the intervening apparent TE clades have high local branch support (100%). However, this is not the only such subtree in the
<italic>FRS</italic>
phylogeny. Clade ζ, which is not closely related to any known
<italic>FRS</italic>
ETEs, has similar characteristics. Finally, perhaps the strongest example is clade θ, which consists of 25 species-congruent sequences that are conserved in diverse core eudicots and, except a single pseudogene, have no TE characteristics. A caveat to this interpretation is that putative TE sequences in the
<italic>FRS</italic>
phylogeny generally have fewer pseudogenic and other TE characteristics than sequences in the
<italic>MUG</italic>
phylogeny, making the distinction between ETEs and TEs less apparent (
<ext-link ext-link-type="uri" xlink:href="http://mbe.oxfordjournals.org/lookup/suppl/doi:10.1093/molbev/msw067/-/DC1">supplementary table S2</ext-link>
,
<ext-link ext-link-type="uri" xlink:href="http://mbe.oxfordjournals.org/lookup/suppl/doi:10.1093/molbev/msw067/-/DC1">Supplementary Material</ext-link>
online). Thus, although these clades have intriguing characteristics, further analysis is needed to determine whether they are indeed novel ETEs, unusual TE families, or artifacts.</p>
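<p>To make the contrast between apparent TE clades and candidate ETE clades concrete, the sketch below tabulates the fraction of sequences with pseudogenic features, using the counts quoted in the text for clades β, γ, ε, and θ. The 0.5 cutoff and the "TE-like"/"ETE-like" labels are arbitrary illustrations, not criteria used in this study, which weighs several characteristics together.</p>
<preformat preformat-type="code"><![CDATA[
```python
# (clade, number of sequences, number with pseudogenic features), from the text.
CLADES = [
    ("beta", 12, 11),
    ("gamma", 5, 5),
    ("epsilon", 47, 6),
    ("theta", 25, 1),
]


def pseudogenic_fraction(n_sequences, n_pseudogenic):
    """Fraction of sequences in a clade that carry pseudogenic features."""
    return n_pseudogenic / n_sequences


def summarize(clades, cutoff=0.5):
    """Label each clade by whether most of its sequences look pseudogenic.

    The cutoff is an illustrative assumption, not a threshold from the study.
    """
    summary = {}
    for name, n_seq, n_pseudo in clades:
        frac = pseudogenic_fraction(n_seq, n_pseudo)
        label = "TE-like" if frac > cutoff else "ETE-like"
        summary[name] = (round(frac, 2), label)
    return summary
```
]]></preformat>
<p>With these counts, β and γ are dominated by pseudogenic sequences, whereas ε and θ are not, which is the asymmetry the text relies on. A single feature is of course weaker evidence than the combined criteria discussed above.</p>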
<p>If these clades do represent novel ETE families, it would be consistent with our recent finding that ETEs may be far more abundant than is currently understood (
<xref rid="msw067-B28" ref-type="bibr">Hoen and Bureau 2015</xref>
). Note that none of the novel ETEs we reported in that study are present in these phylogenies because none are related closely enough to
<italic>MUG</italic>
or
<italic>FRS</italic>
. Conversely, none of the potential novel ETEs reported here could have been found in that previous study because none include an
<italic>A. thaliana</italic>
ortholog.</p>
<p>If some of these subtrees do represent novel ETEs, it increases even further the contribution of TE exaptation to angiosperm evolution, especially in monocots. Indeed, this would not be surprising, given that three of five known
<italic>FRS</italic>
families are eudicot-specific while none are monocot-specific. This is likely due to a selection bias: the initial search for
<italic>FRS</italic>
genes was restricted to the
<italic>A. thaliana</italic>
genome (
<xref rid="msw067-B47" ref-type="bibr">Lin and Wang 2004</xref>
) and subsequent phenotypic characterization was apparently limited to close orthologs of the initial twelve
<italic>FRS</italic>
genes found in
<italic>A. thaliana</italic>
(
<xref rid="msw067-B45" ref-type="bibr">Lin et al. 2007</xref>
).</p>
</sec>
<sec>
<title>C48-MULEs Form Widely Diverged Clades, Some with TIRs</title>
<p>In addition to characterizing ETEs, our results also uncovered TE clades with noteworthy characteristics. Along with three conserved domains normally present in the
<italic>mudrA</italic>
transposase, certain MULE families include a second gene,
<italic>Kaonashi</italic>
(
<italic>KI</italic>
), that contains a peptidase C48 domain normally found in
<italic>ubiquitin-like protein-specific proteases</italic>
(
<italic>ULPs</italic>
) (
<xref rid="msw067-B29" ref-type="bibr">Hoen et al. 2006</xref>
;
<xref rid="msw067-B70" ref-type="bibr">van Leeuwen et al. 2007</xref>
;
<xref rid="msw067-B6" ref-type="bibr">Benjak et al. 2008</xref>
). Although the function of
<italic>KI</italic>
is unknown, at least some KI-MULEs do not have easily identifiable TIRs, yet remain capable of transposition (
<xref rid="msw067-B29" ref-type="bibr">Hoen et al. 2006</xref>
).</p>
<p>In the full
<italic>MUG</italic>
tree (
<ext-link ext-link-type="uri" xlink:href="http://mbe.oxfordjournals.org/lookup/suppl/doi:10.1093/molbev/msw067/-/DC1">supplementary fig. S2</ext-link>
,
<ext-link ext-link-type="uri" xlink:href="http://mbe.oxfordjournals.org/lookup/suppl/doi:10.1093/molbev/msw067/-/DC1">Supplementary Material</ext-link>
online), although three C48-MULE clades are closely related to
<italic>MUG</italic>
(in
<italic>V. vinifera</italic>
,
<italic>M. guttatus</italic>
, and
<italic>E. oleifera</italic>
), most are concentrated in the branches furthest from
<italic>MUG</italic>
. Consistent with previous results (
<xref rid="msw067-B42" ref-type="bibr">Le et al. 2000</xref>
;
<xref rid="msw067-B29" ref-type="bibr">Hoen et al. 2006</xref>
;
<xref rid="msw067-B6" ref-type="bibr">Benjak et al. 2008</xref>
), most C48-MULE clades have high proportions of associated sequences with potential TIRs. However, some do not; for example, a large clade that appears to have been recently active in the basal eudicot
<italic>Aq. coerulea</italic>
. Furthermore, the branches of the tree containing most C48-MULE clades also include clades that lack C48 (some associated with TIRs, some not), a sporadic phylogenetic distribution similar to that of PB1 (see below), suggesting that C48 was lost from various lineages. Interestingly, several MULE clades are associated with both C48 and PB1 domains. Finally, in addition to those in the
<italic>MUG</italic>
tree, we found C48-MULEs in the
<italic>FRS</italic>
tree, in the genomes of
<italic>V. vinifera</italic>
,
<italic>Citrus clementina</italic>
,
<italic>Citrus sinensis</italic>
,
<italic>M. truncatula, N. nucifera, Theobroma cacao</italic>
, and
<italic>M. guttatus</italic>
. Consistent with previous results in melon (
<xref rid="msw067-B70" ref-type="bibr">van Leeuwen et al. 2007</xref>
), we also identified C48-MULEs in the
<italic>FRS</italic>
tree (
<ext-link ext-link-type="uri" xlink:href="http://mbe.oxfordjournals.org/lookup/suppl/doi:10.1093/molbev/msw067/-/DC1">supplementary fig. S5</ext-link>
,
<ext-link ext-link-type="uri" xlink:href="http://mbe.oxfordjournals.org/lookup/suppl/doi:10.1093/molbev/msw067/-/DC1">Supplementary Material</ext-link>
online), even though it is widely diverged from
<italic>MUG</italic>
.</p>
</sec>
<sec>
<title>The PB1 Conserved Domain Is Present in Diverse MULEs</title>
<p>Finally, in addition to peptidase C48, we surprisingly found certain TEs that contain another unusual domain. As discussed above,
<italic>MUGA</italic>
and
<italic>MUGB</italic>
have a key difference in their gene structures: every known
<italic>MUGB</italic>
gene but no
<italic>MUGA</italic>
gene contains a PB1 domain (
<xref ref-type="fig" rid="msw067-F2">fig. 2</xref>
). The origin of this domain in
<italic>MUGB</italic>
has been a mystery because, unlike other domains present in
<italic>MUG</italic>
, no MULE or indeed any TE has been reported to contain PB1.</p>
<p>Surprisingly, here we detected PB1 domains associated with apparent TEs in nine genomes in the
<italic>MUG</italic>
tree (
<xref ref-type="fig" rid="msw067-F2">fig. 2</xref>
;
<ext-link ext-link-type="uri" xlink:href="http://mbe.oxfordjournals.org/lookup/suppl/doi:10.1093/molbev/msw067/-/DC1">supplementary fig. S3</ext-link>
,
<ext-link ext-link-type="uri" xlink:href="http://mbe.oxfordjournals.org/lookup/suppl/doi:10.1093/molbev/msw067/-/DC1">Supplementary Material</ext-link>
online), including several potentially active TE clades such as the sister clade of
<italic>MUGB</italic>
. PB1-MULEs might have previously remained undetected for several reasons. First, in the literature we could find no previous report of a specific search for the PB1 domain in TEs. Second, PB1-MULEs are present in only a small fraction of genomes (9 of 62 examined). Third, none of the genomes containing PB1-MULEs happen to be model genomes (e.g.,
<italic>A. thaliana</italic>
and
<italic>O. sativa</italic>
contain none). Finally, even among these nine genomes, although PB1 is abundant in MULEs that are closely related to
<italic>MUG</italic>
(found in 178 of 397 non-
<italic>MUG</italic>
sequences that are paraphyletic to
<italic>MUGA</italic>
and
<italic>MUGB</italic>
;
<xref ref-type="fig" rid="msw067-F2">fig. 2</xref>
), it is rare among other MULEs (found in only 71 of 658 remaining sequences in the full
<italic>MUG</italic>
tree;
<ext-link ext-link-type="uri" xlink:href="http://mbe.oxfordjournals.org/lookup/suppl/doi:10.1093/molbev/msw067/-/DC1">supplementary figs. S2 and S3</ext-link>
,
<ext-link ext-link-type="uri" xlink:href="http://mbe.oxfordjournals.org/lookup/suppl/doi:10.1093/molbev/msw067/-/DC1">Supplementary Material</ext-link>
online) and in none of the sequences of the
<italic>FRS</italic>
tree. Discovery of these PB1-MULEs solves the mystery of the origin of the
<italic>MUGB</italic>
PB1 domain. Furthermore, these high copy-number PB1-MULE families may explain previous observations that PB1 domains are far more abundant in plants than in other kingdoms.</p>
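<p>The contrast in PB1 frequency between MULEs near MUG and the remaining MULEs can be checked directly from the counts given above (178 of 397 versus 71 of 658). The helper functions below are a minimal illustration; the odds ratio is our own summary statistic and was not reported in the study.</p>
<preformat preformat-type="code"><![CDATA[
```python
def proportion(hits, total):
    """Fraction of sequences that carry the domain."""
    return hits / total


def odds_ratio(hits_a, total_a, hits_b, total_b):
    """Odds of carrying the domain in group A relative to group B."""
    odds_a = hits_a / (total_a - hits_a)
    odds_b = hits_b / (total_b - hits_b)
    return odds_a / odds_b


near_mug = proportion(178, 397)   # MULEs paraphyletic to MUGA and MUGB
other = proportion(71, 658)       # remaining sequences in the full MUG tree
```
]]></preformat>
<p>These counts correspond to roughly 45% of the MUG-proximal sequences and 11% of the remaining sequences carrying PB1.</p>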
<p>The detailed phylogenetic pattern of PB1-MULEs is lineage-specific and sporadic. Clades associated with PB1 are tightly interspersed with clades not associated with it, and even clades associated with PB1 have highly variable proportions of PB1-MULEs (
<xref ref-type="fig" rid="msw067-F2">fig. 2</xref>
). This sporadic distribution pattern may have arisen either because PB1 has been acquired multiple times in separate TE branches, or because it has been repeatedly lost. To determine which of these is more likely, we aligned the PB1 amino acid subsequence, including both PB1-MULEs and
<italic>MUGB</italic>
members, and inferred a separate phylogenetic tree. Except for minor differences that can be explained by the low information content of this domain, which is short (84 aa in
<italic>AtMUG7</italic>
) and has many variable positions (
<xref rid="msw067-B67" ref-type="bibr">Sumimoto et al. 2007</xref>
), the topology of the PB1 tree is broadly congruent with the
<italic>MUG</italic>
tree (
<ext-link ext-link-type="uri" xlink:href="http://mbe.oxfordjournals.org/lookup/suppl/doi:10.1093/molbev/msw067/-/DC1">supplementary fig. S13</ext-link>
,
<ext-link ext-link-type="uri" xlink:href="http://mbe.oxfordjournals.org/lookup/suppl/doi:10.1093/molbev/msw067/-/DC1">Supplementary Material</ext-link>
online). Although far from conclusive, these results are consistent with a single origin of PB1 in a common MULE ancestor followed by lineage-specific losses.</p>
<p>The origin of PB1 in MULEs is unknown, but given that PB1 has not been reported in other TEs, it may have been acquired through transduplication (
<xref rid="msw067-B35" ref-type="bibr">Juretic et al. 2005</xref>
), similar to how peptidase C48 domains may have been acquired (
<xref rid="msw067-B29" ref-type="bibr">Hoen et al. 2006</xref>
). This possibility is supported by the
<italic>MUGB</italic>
gene structure: compared with the other MULE transposase genes, PB1 occurs in an additional short 5′-exon, consistent with the general pattern of MULE transduplication. Transduplication is also supported by the recurrent deletion of PB1 from various MULE clades, suggesting that although it may somehow improve the transpositional success of PB1-MULE families, it is not essential. Interestingly, transduplication—the co-option of genes by TEs for a “selfish” function—is in a sense the evolutionary inverse of TE exaptation—the co-option of selfish TE genes for a phenotypic function. Thus if a MULE ancestor of
<italic>MUGB</italic>
did originally acquire PB1 by transduplication, there is an interesting corollary: the
<italic>MUGB</italic>
PB1 domains have likely undergone a complete co-evolutionary cycle, from phenotypic function to selfish function and back again.</p>
<p>How frequently are non-TE conserved domains transduplicated by TEs, then exapted from the TEs, and eventually have all evidence of their origin erased by extinction of the TE family? We can recognize
<italic>MUGB</italic>
as an ETE family because of its TE-specific MULE domain; however, many TEs have no TE-specific domain (
<xref rid="msw067-B28" ref-type="bibr">Hoen and Bureau 2015</xref>
), or may have lost any TE-specific domains during or following exaptation. Thus perhaps cycles of exaptation have enabled the amplification and diversification of not just PB1, but other sequences as well. If such cycles took place during primordial evolution, few traces of the origins of these sequences would remain in extant genomes (
<xref rid="msw067-B63" ref-type="bibr">Roussigne et al. 2003</xref>
;
<xref rid="msw067-B58" ref-type="bibr">Quesneville et al. 2005</xref>
;
<xref rid="msw067-B5" ref-type="bibr">Babu et al. 2006</xref>
).</p>
</sec>
</sec>
<sec>
<title>Summary and Conclusions</title>
<p>We have shown that through careful phylogenetic analysis of ETE families, we may obtain a better understanding of the evolutionary role of TE exaptation. By analyzing ETEs in angiosperms previously thought to constitute only two families,
<italic>MUG</italic>
and
<italic>FRS</italic>
, we have shown that they instead likely originated in a total of at least seven separate exaptation events, more than triple the number of TE exaptation events previously understood for these ETEs, for a total among the 22 final genomes of 281 ETEs out of 2,934 sequences. Furthermore, we report preliminary evidence suggesting that additional ETE families have yet to be characterized. These results confirm and expand upon another recent study in which we showed that the number of ETEs in
<italic>A. thaliana</italic>
is more than double that previously reported (
<xref rid="msw067-B28" ref-type="bibr">Hoen and Bureau 2015</xref>
).</p>
<p>In addition to improving our theoretical understanding of how TEs have contributed to genome evolution, there is another motivation for better resolving the phylogenetic history of ETEs. As we have shown for
<italic>MUGA</italic>
and
<italic>MUGB</italic>
, ETEs of different families may have similar broad phenotypes, such as delays in development or decreases in plant size. This may be explained by the fact that many different families often act in concert to generate complex traits. However, it would be surprising if common functions were shared by ETEs from different families that originated in separate exaptation events, especially from widely diverged TE families (e.g.,
<italic>FRS10</italic>
vs. the four other
<italic>FRS</italic>
families) or greatly separated in time (e.g.,
<italic>MUG1</italic>
and
<italic>MUG7</italic>
). For instance, while AtFHY3 and AtFAR1 have been shown not only to rescue each other but also to heterodimerize (
<xref rid="msw067-B46" ref-type="bibr">Lin et al. 2008</xref>
), they have not been shown to complement any FRS protein outside the
<italic>FHY3</italic>
family. Furthermore,
<italic>AtFHY3</italic>
and
<italic>AtFAR1</italic>
have been well characterized and shown to act as transcription factors, to bind to thousands of sites in the genome, to differentially regulate hundreds of genes under light or dark conditions, and to regulate far-red induced hypocotyl de-etiolation (
<xref rid="msw067-B71" ref-type="bibr">Wang and Deng 2002</xref>
). Thus, it is important to emphasize that each of the four additional
<italic>FRS</italic>
families, which are thus far largely uncharacterized, has as much potential as
<italic>AtFHY3</italic>
and
<italic>AtFAR1</italic>
to impact plant function. For example, we have recently shown that ETEs are often involved in abiotic stress responses, including genes in both the
<italic>MUGA</italic>
and
<italic>MUGB</italic>
families, in at least four
<italic>FRS</italic>
families, and in a large set of novel ETEs (unpublished results;
<xref rid="msw067-B47" ref-type="bibr">Lin and Wang 2004</xref>
;
<xref rid="msw067-B45" ref-type="bibr">Lin et al. 2007</xref>
;
<xref rid="msw067-B54" ref-type="bibr">Ouyang et al. 2011</xref>
;
<xref rid="msw067-B21" ref-type="bibr">Gao et al. 2013</xref>
).</p>
<p>In conclusion, it has not gone unnoticed that the self-perpetuating nature of TEs, sometimes denigrated as selfish, endows them with the capacity to act as agents of periodic rapid evolution (
<xref rid="msw067-B28" ref-type="bibr">Hoen and Bureau 2015</xref>
). This study uses phylogenetic analysis to investigate TE exaptation and highlights the importance of resolving the origin and evolution of ETE families. Such analyses can contribute greatly to our understanding of the potential functions and interactions of ETEs. We have shown that both the
<italic>MUG</italic>
and
<italic>FRS</italic>
groups of ETEs are not single families, but instead are derived from multiple exaptation events. These TE exaptations and subsequent ETE diversification contributed to all key stages of angiosperm evolution, from the early stem group, to the angiosperm radiation, to recent crown group radiations. In the future, such evolutionary histories will help improve the design and interpretation of experimental studies of ETEs and, we hope, will encourage additional investigations of as-yet uncharacterized ETE families.</p>
</sec>
<sec sec-type="materials|methods">
<title>Materials and Methods</title>
<p>To maximize our ability to find TEs closely related to
<italic>MUG</italic>
or
<italic>FRS</italic>
, as well as to identify basal and diverse ETEs in these groups, we searched a large number of genomes (62 species), including representatives from all major angiosperm lineages (49 species: 3 basal angiosperms, 2 magnoliids, 11 monocots, 33 eudicots), six gymnosperms, five algae, one vascular plant, and one moss species (
<ext-link ext-link-type="uri" xlink:href="http://mbe.oxfordjournals.org/lookup/suppl/doi:10.1093/molbev/msw067/-/DC1">supplementary table S1</ext-link>
,
<ext-link ext-link-type="uri" xlink:href="http://mbe.oxfordjournals.org/lookup/suppl/doi:10.1093/molbev/msw067/-/DC1">Supplementary Material</ext-link>
online). As queries, to maximize search sensitivity we selected seven amino acid sequences representing diverse
<italic>MUG</italic>
sequences, including monocots and eudicots from each previously identified
<italic>MUGA</italic>
and
<italic>MUGB</italic>
clade (
<xref rid="msw067-B34" ref-type="bibr">Joly-Lopez et al. 2012</xref>
), or
<italic>FRS</italic>
clade (
<xref rid="msw067-B45" ref-type="bibr">Lin et al. 2007</xref>
;
<ext-link ext-link-type="uri" xlink:href="http://mbe.oxfordjournals.org/lookup/suppl/doi:10.1093/molbev/msw067/-/DC1">supplementary fig. S1</ext-link>
,
<ext-link ext-link-type="uri" xlink:href="http://mbe.oxfordjournals.org/lookup/suppl/doi:10.1093/molbev/msw067/-/DC1">Supplementary Material</ext-link>
online).</p>
<sec>
<title>Genome Selection</title>
<p>Because we expected to find thousands of sequences not closely related to
<italic>MUG</italic>
or
<italic>FRS</italic>
and therefore not of interest, to reduce the size of the final analysis we devised a strategy to identify only the genomes of potential interest (
<xref ref-type="fig" rid="msw067-F1">fig. 1</xref>
). First, we determined which genomes contained any sequence of interest by searching each genome individually. We used a customized command-line version of the TARGeT pipeline (Tree Analysis of Related Genes and Transposons) (v2.00; [
<xref rid="msw067-B26" ref-type="bibr">Han et al. 2009</xref>
]; Cavinder B, personal communication) along with TBLASTN (v2.2.26) to align queries and genomes (see below), PHI (v2.4; [
<xref rid="msw067-B26" ref-type="bibr">Han et al. 2009</xref>
];
<ext-link ext-link-type="uri" xlink:href="http://target.iplantcollaborative.org/">http://target.iplantcollaborative.org/</ext-link>
, last accessed February 3, 2016) to join local alignments and count stop codons and frameshifts, MAFFT (v7.158b; option –max_iterate 100; [
<xref rid="msw067-B38" ref-type="bibr">Katoh and Standley 2013</xref>
]) to generate multiple alignments, and FastTreeMP (v2.1.7 SSE3 OpenMP; option -gamma; [
<xref rid="msw067-B38" ref-type="bibr">Katoh and Standley 2013</xref>
]) to infer phylogenetic trees. For the initial search, we selected a permissive similarity threshold (TBLASTN E-value, 1e-30) in order to identify even distantly related putative
<italic>MUG</italic>
homologs.</p>
<p>We selected for further analyses the genomes that fulfilled one or both of the following criteria: they contained apparent TEs that were descended from the last common ancestor of all
<italic>MUG</italic>
query sequences, or they had homologs associated with PB1 domains. In addition, we included for further analyses three genomes with key positions in the species phylogeny: the basal eudicot
<italic>N. nucifera</italic>
, which has the unique biological feature of being an aquatic herbaceous species; the magnoliid
<italic>Pe. americana</italic>
, which is basal to monocots and eudicots; and the basal angiosperm
<italic>Nu. advena</italic>
.</p>
</sec>
<sec>
<title>Alignment, Curation, and Tree Building</title>
<p>We then searched the genomes of interest, again using a command-line version of TARGeT that we customized to search multiple genomes, using a similarity threshold selected to maximize stringency while still retaining all sequences of interest (TBLASTN E-value, 1e-55). We included the following sequences for the phylogenetic analysis: fungal
<italic>hop</italic>
, maize
<italic>mudrA</italic>
, maize
<italic>Jittery</italic>
, and all previously identified
<italic>MUG</italic>
genes (
<xref rid="msw067-B34" ref-type="bibr">Joly-Lopez et al. 2012</xref>
), which were from the following six genomes:
<italic>C. papaya</italic>
,
<italic>Sorghum bicolor</italic>
,
<italic>Oryza sativa</italic>
,
<italic>Brachypodium distachyon</italic>
,
<italic>Medicago truncatula</italic>
, and
<italic>A. thaliana</italic>
(
<xref ref-type="fig" rid="msw067-F2">fig. 2</xref>
;
<ext-link ext-link-type="uri" xlink:href="http://mbe.oxfordjournals.org/lookup/suppl/doi:10.1093/molbev/msw067/-/DC1">supplementary figs. S2 and S3</ext-link>
,
<ext-link ext-link-type="uri" xlink:href="http://mbe.oxfordjournals.org/lookup/suppl/doi:10.1093/molbev/msw067/-/DC1">Supplementary Material</ext-link>
online). We built a preliminary alignment (MAFFT), then removed 188 problematic sequences that contained long truncations, insertions, deletions, or frameshifts, resulting in poor alignment within highly conserved blocks. We then generated a final multiple alignment (MAFFT), which we curated by first removing columns with gaps in 50% of sequences or more, then using Gblocks (
<xref rid="msw067-B12" ref-type="bibr">Castresana 2000</xref>
) to retain only highly conserved alignment blocks (63% of 555 columns). We inferred a final
<italic>MUG</italic>
phylogenetic tree (FastTreeMP) (
<ext-link ext-link-type="uri" xlink:href="http://mbe.oxfordjournals.org/lookup/suppl/doi:10.1093/molbev/msw067/-/DC1">supplementary fig. S3</ext-link>
,
<ext-link ext-link-type="uri" xlink:href="http://mbe.oxfordjournals.org/lookup/suppl/doi:10.1093/molbev/msw067/-/DC1">Supplementary Material</ext-link>
online). Lastly, for clarity of presentation, we made a simplified
<italic>MUG</italic>
tree containing only the sequences most closely related to
<italic>MUG</italic>
by pruning branches more diverged than the last common ancestor of all known
<italic>MUG</italic>
genes (
<xref ref-type="fig" rid="msw067-F2">fig. 2</xref>
). FigTree (v1.4.2;
<ext-link ext-link-type="uri" xlink:href="http://tree.bio.ed.ac.uk/software/figtree/">http://tree.bio.ed.ac.uk/software/figtree/</ext-link>
, last accessed February 3, 2016) was used to visualize the phylogenetic trees.</p>
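<p>The gap-filtering step applied before Gblocks (removing columns with gaps in 50% of sequences or more) can be illustrated with a minimal sketch. It assumes the alignment is held in memory as a mapping from sequence name to aligned string. The function name <monospace>drop_gappy_columns</monospace>, its parameters, and the toy sequences are hypothetical and are not part of MAFFT, Gblocks, or the pipeline used in this study.</p>
<preformat>
```python
def drop_gappy_columns(alignment, max_gap_fraction=0.5, gap_chars="-."):
    """Remove alignment columns whose gap fraction is >= max_gap_fraction.

    alignment: dict mapping a sequence name to its aligned string.
    All aligned strings must have the same length.
    """
    names = list(alignment)
    if not names:
        return {}
    length = len(alignment[names[0]])
    if any(len(alignment[n]) != length for n in names):
        raise ValueError("all aligned sequences must have the same length")

    kept = []
    for col in range(length):
        # Count sequences that have a gap character in this column.
        gaps = sum(1 for n in names if alignment[n][col] in gap_chars)
        if gaps / len(names) >= max_gap_fraction:
            continue  # gaps in 50% of sequences or more: drop the column
        kept.append(col)

    # Rebuild every sequence from the retained columns only.
    return {n: "".join(alignment[n][c] for c in kept) for n in names}


# Toy alignment (invented data): columns 3 and 6 are mostly gaps.
toy = {
    "seqA": "MK-LV-A",
    "seqB": "MK-LVQA",
    "seqC": "M--LV-A",
    "seqD": "MKTLV-A",
}
trimmed = drop_gappy_columns(toy)
```
</preformat>
<p>In this toy case the third and sixth columns are dropped because three of the four sequences have a gap there, whereas the second column (one gap in four) is retained. A further block-selection step, such as the Gblocks filtering used here, would then be applied to the trimmed alignment.</p>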
<p>To validate the phylogenetic analysis, we used two additional methods: 1) neighbor joining using BioNJ/Neighbor (PHYLIP; v3.66; default parameters; 300 bootstraps;
<ext-link ext-link-type="uri" xlink:href="http://www.Phylogeny.fr">http://www.Phylogeny.fr</ext-link>
, last accessed February 3, 2016 [
<xref rid="msw067-B16" ref-type="bibr">Dereeper et al. 2008</xref>
,
<xref rid="msw067-B15" ref-type="bibr">2010</xref>
]); 2) Bayesian MCMC using MrBayes v3.2.6 (default parameters, except “heating temperature” 0.01 for
<italic>FRS</italic>
[
<xref rid="msw067-B33" ref-type="bibr">Huelsenbeck et al. 2001</xref>
;
<xref rid="msw067-B61" ref-type="bibr">Ronquist and Huelsenbeck 2003</xref>
;
<xref rid="msw067-B2" ref-type="bibr">Altekar et al. 2004</xref>
]), obtaining standard deviation of split frequencies of 0.008 after 2,000,000 generations for
<italic>MUG</italic>
and 0.052 after 2,000,000 generations for
<italic>FRS</italic>
.</p>
</sec>
<sec>
<title>Discriminating TEs from ETEs</title>
<p>To discriminate TEs from ETEs, we evaluated four sequence characteristics. 1) To evaluate pseudogenic features, we counted the number of stop codons and frameshifts as identified by PHI (
<xref rid="msw067-B26" ref-type="bibr">Han et al. 2009</xref>
). 2) To identify flanking repetitive sequences, we aligned the DNA sequence flanking either side (3 kb) of each putative homolog to its respective genome (NCBI BLASTN v.2.2.29+; E-value, 1e-100; [
<xref rid="msw067-B11" ref-type="bibr">Camacho et al. 2009</xref>
]). To reduce artifacts caused by the presence of any repetitive sequences unrelated to the putative homologs (e.g., insertions of other TEs), we calculated for each putative homolog the minimum number of non-self hits flanking either side, thus eliminating cases where a TE had inserted on only one side of the homolog. We then used the median of these repetitiveness values per clade, so that even if unrelated TEs had inserted on both sides of some elements, the repetitiveness measure would not be unduly affected, especially for large clades. 3) To identify potential TIRs, we again used the DNA sequences flanking each putative homolog, but this time aligned the two sides together (BLASTN; E-value, 0.01; reverse strand). Because the lengths of MULEs vary widely, we analyzed a large range of flanking lengths (1–30 kb), then chose a biologically reasonable representative length (10 kb) that had low (presumably false) positives among known
<italic>MUG</italic>
s and low (presumably false) negatives among other sequences. 4) Finally, to detect Peptidase C48 domains we used NCBI RPS-TBLASTN (E
<italic>-</italic>
value, 0.01; v.2.2.29+; [
<xref rid="msw067-B11" ref-type="bibr">Camacho et al. 2009</xref>
]) and the NCBI Conserved Domain Database (
<xref rid="msw067-B50" ref-type="bibr">Marchler-Bauer et al. 2011</xref>
) to search the genomic DNA sequence corresponding to each putative homolog plus 5 kb flanking each side. The same method was also used to identify PB1 domains.</p>
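<p>The repetitiveness measure described under point 2 (the minimum number of non-self hits on the two flanks of each homolog, then the median per clade) can be sketched as follows. This is an illustration only: it assumes the BLASTN hits have already been counted for each flank, and the function name <monospace>clade_repetitiveness</monospace>, the input layout, and all numbers are hypothetical.</p>
<preformat>
```python
from collections import defaultdict
from statistics import median


def clade_repetitiveness(flank_hits):
    """Summarize flanking repetitiveness per clade.

    flank_hits: iterable of (clade, left_hits, right_hits), one entry per
    putative homolog, where the hit counts are non-self hits on each flank.
    Returns a dict mapping clade to the median of min(left_hits, right_hits).
    """
    per_clade = defaultdict(list)
    for clade, left, right in flank_hits:
        # Taking the minimum ignores a repeat inserted on only one side.
        per_clade[clade].append(min(left, right))
    # The median limits the influence of a few elements with repeats on both sides.
    return {clade: median(values) for clade, values in per_clade.items()}


# Invented counts for two hypothetical clades.
toy = [
    ("cladeA", 0, 250),   # unrelated repeat on one side only
    ("cladeA", 1, 0),
    ("cladeA", 40, 35),   # unrelated repeats on both sides of one element
    ("cladeB", 120, 90),
    ("cladeB", 80, 300),
    ("cladeB", 60, 75),
]
scores = clade_repetitiveness(toy)
```
</preformat>
<p>With these invented counts, the first clade scores 0 despite one member having repeats on both sides, while the second clade scores 80, which is the pattern expected for a clade of repetitive elements.</p>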
</sec>
<sec>
<title>Plant Material</title>
<p>To characterize the
<italic>MUG</italic>
mutant phenotypes, we used the approach and methods as described in
<xref rid="msw067-B34" ref-type="bibr">Joly-Lopez et al. (2012</xref>
). The mutants
<italic>mug1-1</italic>
(GK_514B01),
<italic>mug2-3</italic>
(SALK_090878),
<italic>mug3-1</italic>
(SALK_053113), and
<italic>mug4-2</italic>
(SALK_036408) were obtained from GABI-Kat (
<ext-link ext-link-type="uri" xlink:href="http://www.gabi-kat.de">http://www.gabi-kat.de</ext-link>
, last accessed February 3, 2016) (
<xref rid="msw067-B34" ref-type="bibr">Joly-Lopez et al. 2012</xref>
) and SALK (
<ext-link ext-link-type="uri" xlink:href="http://www.arabidopsis.org/abrc">http://www.arabidopsis.org/abrc</ext-link>
, last accessed February 3, 2016) (
<xref rid="msw067-B62" ref-type="bibr">Rosso et al. 2003</xref>
) T-DNA insertion populations. Positions of insertion sites in double mutants used in phenotypic analyses were confirmed by sequencing the allele-specific PCR products. Wild-type ecotype Col-0 seeds were originally obtained from Lehle Seeds (
<ext-link ext-link-type="uri" xlink:href="http://www.arabidopsis.com">www.arabidopsis.com</ext-link>
). For the triple mutant genotyping and phenotypic analyses, seeds were plated on one-half MS media supplemented with 2% sucrose instead of 1% as described previously (
<xref rid="msw067-B34" ref-type="bibr">Joly-Lopez et al. 2012</xref>
).</p>
</sec>
<sec>
<title>
<italic>d</italic>
N/
<italic>d</italic>
S Analysis</title>
<p>The selective pressure for the
<italic>MUGA</italic>
family within the
<italic>MUG</italic>
tree was examined using
<italic>d</italic>
N/
<italic>d</italic>
S analysis. The same amino acid MAFFT alignment (121 sequences) was used as in the
<italic>MUG</italic>
BioNJ/Neighbor analysis. Amino acids were replaced with the corresponding genomic DNA sequences. Tree adjustments and branch calling were made using the tree viewer T-REX (
<xref rid="msw067-B8" ref-type="bibr">Boc et al. 2012</xref>
).
<italic>d</italic>
N/
<italic>d</italic>
S was estimated using CODEML (Phylogenetic Analysis by Maximum Likelihood [PAML] package; version 4.8a, released August 2014; default parameters except cleandata = 0, with fix_omega = 1 for the null model and fix_omega = 0 for the alternative [model 2]). Sites under positive selection were analyzed using Bayes empirical Bayes (BEB;
<xref rid="msw067-B74" ref-type="bibr">Yang et al. 2005</xref>
) and the positions of the residues were visualized using the alignment viewer AliView (
<xref rid="msw067-B41" ref-type="bibr">Larsson 2014</xref>
). The aligned amino acid sequence of
<italic>MUG4</italic>
was used as the query to search the NCBI Conserved Domain Database to detect the positions of the conserved domains (
<xref rid="msw067-B50" ref-type="bibr">Marchler-Bauer et al. 2011</xref>
).</p>
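<p>A null model (fix_omega = 1) and an alternative model (fix_omega = 0) are commonly compared with a likelihood ratio test, although the text above does not state which test was applied. The sketch below assumes that such a test is used, that the two models differ by one free parameter (so the statistic is compared with a chi-square distribution with one degree of freedom), and that the log-likelihoods have been read from the CODEML output. The function name and the log-likelihood values are invented for illustration.</p>
<preformat>
```python
import math


def likelihood_ratio_test(lnl_null, lnl_alt):
    """Compare two nested models that differ by one free parameter.

    Returns (statistic, p_value), where statistic = 2 * (lnl_alt - lnl_null)
    and the p-value is the chi-square upper tail with 1 degree of freedom.
    """
    # Clamp at zero in case the optimizer left the alternative slightly worse.
    statistic = max(0.0, 2.0 * (lnl_alt - lnl_null))
    # For 1 degree of freedom, the chi-square survival function equals
    # erfc(sqrt(x / 2)), so no external statistics library is needed.
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value


# Invented log-likelihoods, not values from this study.
stat, p_value = likelihood_ratio_test(lnl_null=-12345.6, lnl_alt=-12340.1)
```
</preformat>
<p>A small p-value would indicate that letting omega vary fits the data better than fixing it at 1. When the constrained value lies on the boundary of the parameter space, the one-degree-of-freedom chi-square comparison is generally regarded as conservative.</p>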
</sec>
</sec>
<sec sec-type="supplementary-material">
<title>Supplementary Material</title>
<p>
<ext-link ext-link-type="uri" xlink:href="http://mbe.oxfordjournals.org/lookup/suppl/doi:10.1093/molbev/msw067/-/DC1">Supplementary materials</ext-link>
are available at
<italic>Molecular Biology and Evolution</italic>
online (
<ext-link ext-link-type="uri" xlink:href="http://www.mbe.oxfordjournals.org">http://www.mbe.oxfordjournals.org</ext-link>
).</p>
<supplementary-material id="PMC_1" content-type="local-data">
<caption>
<title>Supplementary Data</title>
</caption>
<media mimetype="text" mime-subtype="html" xlink:href="supp_33_8_1937__index.html"></media>
<media xlink:role="associated-file" mimetype="application" mime-subtype="x-zip-compressed" xlink:href="supp_msw067_suppl_data.zip"></media>
</supplementary-material>
</sec>
</body>
<back>
<ack>
<title>Acknowledgments</title>
<p>The authors thank Adrian E. Platts for his help with the
<italic>d</italic>
N/
<italic>d</italic>
S analyses using CODEML and for his very useful advice. The authors also thank Brad Cavinder and James Burnette for supplying a command-line version of the TARGeT pipeline. This research was funded by grants from Genome Québec and Genome Canada to T.E.B. and M.B., and the Natural Sciences and Engineering Research Council of Canada (NSERC Discovery) to T.E.B. The authors declare that they have no conflicts of interest. D.R.H. and Z.J.L. contributed equally. They conceived and designed the study, collected and analyzed the data, and drafted the manuscript. T.E.B. and M.B. participated in conceiving the study and revising the manuscript. All authors read and approved the final manuscript.</p>
</ack>
<ref-list>
<title>References</title>
<ref id="msw067-B1">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Agrawal</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Eastman</surname>
<given-names>QM</given-names>
</name>
<name>
<surname>Schatz</surname>
<given-names>DG.</given-names>
</name>
</person-group>
<year>1998</year>
<article-title>Transposition mediated by RAG1 and RAG2 and its implications for the evolution of the immune system</article-title>
.
<source>Nature</source>
<volume>394</volume>
:
<fpage>744</fpage>
<lpage>751</lpage>
.
<pub-id pub-id-type="pmid">9723614</pub-id>
</mixed-citation>
</ref>
<ref id="msw067-B2">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Altekar</surname>
<given-names>G</given-names>
</name>
<name>
<surname>Dwarkadas</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Huelsenbeck</surname>
<given-names>JP</given-names>
</name>
<name>
<surname>Ronquist</surname>
<given-names>F.</given-names>
</name>
</person-group>
<year>2004</year>
<article-title>Parallel metropolis coupled Markov chain Monte Carlo for Bayesian phylogenetic inference</article-title>
.
<source>Bioinformatics</source>
<volume>20</volume>
:
<fpage>407</fpage>
<lpage>415</lpage>
.
<pub-id pub-id-type="pmid">14960467</pub-id>
</mixed-citation>
</ref>
<ref id="msw067-B3">
<mixed-citation publication-type="journal">
<collab>
<italic>Amborella</italic>
Genome Project</collab>
.
<year>2013</year>
<article-title>The
<italic>Amborella</italic>
genome and the evolution of flowering plants</article-title>
.
<source>Science</source>
<volume>342</volume>
:
<fpage>1241089</fpage>
.
<pub-id pub-id-type="pmid">24357323</pub-id>
</mixed-citation>
</ref>
<ref id="msw067-B4">
<mixed-citation publication-type="journal">
<collab>
<italic>Arabidopsis</italic>
Genome Initiative</collab>
.
<year>2000</year>
<article-title>Analysis of the genome sequence of the flowering plant
<italic>Arabidopsis thaliana</italic>
</article-title>
.
<source>Nature</source>
<volume>408</volume>
:
<fpage>796</fpage>
<lpage>815</lpage>
.
<pub-id pub-id-type="pmid">11130711</pub-id>
</mixed-citation>
</ref>
<ref id="msw067-B5">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Babu</surname>
<given-names>MM</given-names>
</name>
<name>
<surname>Iyer</surname>
<given-names>LM</given-names>
</name>
<name>
<surname>Balaji</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Aravind</surname>
<given-names>L.</given-names>
</name>
</person-group>
<year>2006</year>
<article-title>The natural history of the WRKY-GCM1 zinc fingers and the relationship between transcription factors and transposons</article-title>
.
<source>Nucleic Acids Res</source>
.
<volume>34</volume>
:
<fpage>6505</fpage>
<lpage>6520</lpage>
.
<pub-id pub-id-type="pmid">17130173</pub-id>
</mixed-citation>
</ref>
<ref id="msw067-B6">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Benjak</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Forneck</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Casacuberta</surname>
<given-names>JM.</given-names>
</name>
</person-group>
<year>2008</year>
<article-title>Genome-wide analysis of the “cut-and-paste” transposons of grapevine</article-title>
.
<source>PLoS One</source>
<volume>3</volume>
:
<fpage>e3107.</fpage>
<pub-id pub-id-type="pmid">18769592</pub-id>
</mixed-citation>
</ref>
<ref id="msw067-B7">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Birol</surname>
<given-names>I</given-names>
</name>
<name>
<surname>Raymond</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Jackman</surname>
<given-names>SD</given-names>
</name>
<name>
<surname>Pleasance</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Coope</surname>
<given-names>R</given-names>
</name>
<name>
<surname>Taylor</surname>
<given-names>GA</given-names>
</name>
<name>
<surname>Yuen</surname>
<given-names>MM</given-names>
</name>
<name>
<surname>Keeling</surname>
<given-names>CI</given-names>
</name>
<name>
<surname>Brand</surname>
<given-names>D</given-names>
</name>
<name>
<surname>Vandervalk</surname>
<given-names>BP</given-names>
</name>
</person-group>
,
<etal></etal>
<year>2013</year>
<article-title>Assembling the 20 Gb white spruce (
<italic>Picea glauca</italic>
) genome from whole-genome shotgun sequencing data</article-title>
.
<source>Bioinformatics</source>
<volume>29</volume>
:
<fpage>1492</fpage>
<lpage>1497</lpage>
.
<pub-id pub-id-type="pmid">23698863</pub-id>
</mixed-citation>
</ref>
<ref id="msw067-B8">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Boc</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Diallo</surname>
<given-names>AB</given-names>
</name>
<name>
<surname>Makarenkov</surname>
<given-names>V.</given-names>
</name>
</person-group>
<year>2012</year>
<article-title>T-REX: a web server for inferring, validating and visualizing phylogenetic trees and networks</article-title>
.
<source>Nucleic Acids Res</source>
.
<volume>40</volume>
:
<fpage>W573</fpage>
<lpage>W579</lpage>
.
<pub-id pub-id-type="pmid">22675075</pub-id>
</mixed-citation>
</ref>
<ref id="msw067-B9">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Borisov</surname>
<given-names>AY</given-names>
</name>
<name>
<surname>Madsen</surname>
<given-names>LH</given-names>
</name>
<name>
<surname>Tsyganov</surname>
<given-names>VE</given-names>
</name>
<name>
<surname>Umehara</surname>
<given-names>Y</given-names>
</name>
<name>
<surname>Voroshilova</surname>
<given-names>VA</given-names>
</name>
<name>
<surname>Batagov</surname>
<given-names>AO</given-names>
</name>
<name>
<surname>Sandal</surname>
<given-names>N</given-names>
</name>
<name>
<surname>Mortensen</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Schauser</surname>
<given-names>L</given-names>
</name>
<name>
<surname>Ellis</surname>
<given-names>N</given-names>
</name>
</person-group>
,
<etal></etal>
<year>2003</year>
<article-title>The Sym35 gene required for root nodule development in pea is an ortholog of Nin from
<italic>Lotus japonicus</italic>
</article-title>
.
<source>Plant Physiol</source>
.
<volume>131</volume>
:
<fpage>1009</fpage>
<lpage>1017</lpage>
.
<pub-id pub-id-type="pmid">12644653</pub-id>
</mixed-citation>
</ref>
<ref id="msw067-B10">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Bundock</surname>
<given-names>P</given-names>
</name>
<name>
<surname>Hooykaas</surname>
<given-names>P.</given-names>
</name>
</person-group>
<year>2005</year>
<article-title>An
<italic>Arabidopsis</italic>
hAT-like transposase is essential for plant development</article-title>
.
<source>Nature</source>
<volume>436</volume>
:
<fpage>282</fpage>
<lpage>284</lpage>
.
<pub-id pub-id-type="pmid">16015335</pub-id>
</mixed-citation>
</ref>
<ref id="msw067-B11">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Camacho</surname>
<given-names>C</given-names>
</name>
<name>
<surname>Coulouris</surname>
<given-names>G</given-names>
</name>
<name>
<surname>Avagyan</surname>
<given-names>V</given-names>
</name>
<name>
<surname>Ma</surname>
<given-names>N</given-names>
</name>
<name>
<surname>Papadopoulos</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Bealer</surname>
<given-names>K</given-names>
</name>
<name>
<surname>Madden</surname>
<given-names>TL.</given-names>
</name>
</person-group>
<year>2009</year>
<article-title>BLAST+: architecture and applications</article-title>
.
<source>BMC Bioinformatics</source>
<volume>10</volume>
:
<fpage>421.</fpage>
<pub-id pub-id-type="pmid">20003500</pub-id>
</mixed-citation>
</ref>
<ref id="msw067-B12">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Castresana</surname>
<given-names>J.</given-names>
</name>
</person-group>
<year>2000</year>
<article-title>Selection of conserved blocks from multiple alignments for their use in phylogenetic analysis</article-title>
.
<source>Mol Biol Evol</source>
.
<volume>17</volume>
:
<fpage>540</fpage>
<lpage>552</lpage>
.
<pub-id pub-id-type="pmid">10742046</pub-id>
</mixed-citation>
</ref>
<ref id="msw067-B13">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Chardin</surname>
<given-names>C</given-names>
</name>
<name>
<surname>Girin</surname>
<given-names>T</given-names>
</name>
<name>
<surname>Roudier</surname>
<given-names>F</given-names>
</name>
<name>
<surname>Meyer</surname>
<given-names>C</given-names>
</name>
<name>
<surname>Krapp</surname>
<given-names>A.</given-names>
</name>
</person-group>
<year>2014</year>
<article-title>The plant RWP-RK transcription factors: key regulators of nitrogen responses and of gametophyte development</article-title>
.
<source>J Exp Bot</source>
.
<volume>65</volume>
:
<fpage>5577</fpage>
<lpage>5587</lpage>
.
<pub-id pub-id-type="pmid">24987011</pub-id>
</mixed-citation>
</ref>
<ref id="msw067-B14">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Cowan</surname>
<given-names>R</given-names>
</name>
<name>
<surname>Hoen</surname>
<given-names>D</given-names>
</name>
<name>
<surname>Schoen</surname>
<given-names>D</given-names>
</name>
<name>
<surname>Bureau</surname>
<given-names>T.</given-names>
</name>
</person-group>
<year>2005</year>
<article-title>MUSTANG is a novel family of domesticated transposase genes found in diverse angiosperms</article-title>
.
<source>Mol Biol Evol</source>
.
<volume>22</volume>
:
<fpage>2084</fpage>
<lpage>2089</lpage>
.
<pub-id pub-id-type="pmid">15987878</pub-id>
</mixed-citation>
</ref>
<ref id="msw067-B15">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Dereeper</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Audic</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Claverie</surname>
<given-names>JM</given-names>
</name>
<name>
<surname>Blanc</surname>
<given-names>G.</given-names>
</name>
</person-group>
<year>2010</year>
<article-title>BLAST-EXPLORER helps you building datasets for phylogenetic analysis</article-title>
.
<source>BMC Evol Biol</source>
.
<volume>10</volume>
:
<fpage>8.</fpage>
<pub-id pub-id-type="pmid">20067610</pub-id>
</mixed-citation>
</ref>
<ref id="msw067-B16">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Dereeper</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Guignon</surname>
<given-names>V</given-names>
</name>
<name>
<surname>Blanc</surname>
<given-names>G</given-names>
</name>
<name>
<surname>Audic</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Buffet</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Chevenet</surname>
<given-names>F</given-names>
</name>
<name>
<surname>Dufayard</surname>
<given-names>JF</given-names>
</name>
<name>
<surname>Guindon</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Lefort</surname>
<given-names>V</given-names>
</name>
<name>
<surname>Lescot</surname>
<given-names>M</given-names>
</name>
</person-group>
,
<etal></etal>
<year>2008</year>
<article-title>Phylogeny.fr: robust phylogenetic analysis for the non-specialist</article-title>
.
<source>Nucleic Acids Res</source>
.
<volume>36</volume>
:
<fpage>W465</fpage>
<lpage>W469</lpage>
.
<pub-id pub-id-type="pmid">18424797</pub-id>
</mixed-citation>
</ref>
<ref id="msw067-B17">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Donoghue</surname>
<given-names>MT</given-names>
</name>
<name>
<surname>Keshavaiah</surname>
<given-names>C</given-names>
</name>
<name>
<surname>Swamidatta</surname>
<given-names>SH</given-names>
</name>
<name>
<surname>Spillane</surname>
<given-names>C.</given-names>
</name>
</person-group>
<year>2011</year>
<article-title>Evolutionary origins of Brassicaceae specific genes in
<italic>Arabidopsis thaliana</italic>
</article-title>
.
<source>BMC Evol Biol</source>
.
<volume>11</volume>
:
<fpage>47.</fpage>
<pub-id pub-id-type="pmid">21332978</pub-id>
</mixed-citation>
</ref>
<ref id="msw067-B18">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Doolittle</surname>
<given-names>WF</given-names>
</name>
<name>
<surname>Sapienza</surname>
<given-names>C.</given-names>
</name>
</person-group>
<year>1980</year>
<article-title>Selfish genes, the phenotype paradigm and genome evolution</article-title>
.
<source>Nature</source>
<volume>284</volume>
:
<fpage>601</fpage>
<lpage>603</lpage>
.
<pub-id pub-id-type="pmid">6245369</pub-id>
</mixed-citation>
</ref>
<ref id="msw067-B19">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Feschotte</surname>
<given-names>C</given-names>
</name>
<name>
<surname>Pritham</surname>
<given-names>EJ.</given-names>
</name>
</person-group>
<year>2007</year>
<article-title>DNA Transposons and the evolution of eukaryotic genomes</article-title>
.
<source>Annu Rev Genet</source>
.
<volume>41</volume>
:
<fpage>331</fpage>
<lpage>368</lpage>
.
<pub-id pub-id-type="pmid">18076328</pub-id>
</mixed-citation>
</ref>
<ref id="msw067-B20">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Flagel</surname>
<given-names>LE</given-names>
</name>
<name>
<surname>Wendel</surname>
<given-names>JF.</given-names>
</name>
</person-group>
<year>2009</year>
<article-title>Gene duplication and evolutionary novelty in plants</article-title>
.
<source>New Phytol</source>
.
<volume>183</volume>
:
<fpage>557</fpage>
<lpage>564</lpage>
.
<pub-id pub-id-type="pmid">19555435</pub-id>
</mixed-citation>
</ref>
<ref id="msw067-B21">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Gao</surname>
<given-names>Y</given-names>
</name>
<name>
<surname>Liu</surname>
<given-names>H</given-names>
</name>
<name>
<surname>An</surname>
<given-names>C</given-names>
</name>
<name>
<surname>Shi</surname>
<given-names>Y</given-names>
</name>
<name>
<surname>Liu</surname>
<given-names>X</given-names>
</name>
<name>
<surname>Yuan</surname>
<given-names>W</given-names>
</name>
<name>
<surname>Zhang</surname>
<given-names>B</given-names>
</name>
<name>
<surname>Yang</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Yu</surname>
<given-names>C</given-names>
</name>
<name>
<surname>Gao</surname>
<given-names>H.</given-names>
</name>
</person-group>
<year>2013</year>
<article-title>
<italic>Arabidopsis</italic>
FRS4/CPD25 and FHY3/CPD45 work cooperatively to promote the expression of the chloroplast division gene ARC5 and chloroplast division</article-title>
.
<source>Plant J</source>
.
<volume>75</volume>
:
<fpage>795</fpage>
<lpage>807</lpage>
.
<pub-id pub-id-type="pmid">23662592</pub-id>
</mixed-citation>
</ref>
<ref id="msw067-B22">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Gascuel</surname>
<given-names>O.</given-names>
</name>
</person-group>
<year>1997</year>
<article-title>BIONJ: an improved version of the NJ algorithm based on a simple model of sequence data</article-title>
.
<source>Mol Biol Evol</source>
.
<volume>14</volume>
:
<fpage>685</fpage>
<lpage>695</lpage>
.
<pub-id pub-id-type="pmid">9254330</pub-id>
</mixed-citation>
</ref>
<ref id="msw067-B23">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Gould</surname>
<given-names>SJ</given-names>
</name>
<name>
<surname>Lloyd</surname>
<given-names>EA.</given-names>
</name>
</person-group>
<year>1999</year>
<article-title>Individuality and adaptation across levels of selection: how shall we name and generalize the unit of Darwinism?</article-title>
<source>Proc Natl Acad Sci U S A</source>
.
<volume>96</volume>
:
<fpage>11904</fpage>
<lpage>11909</lpage>
.
<pub-id pub-id-type="pmid">10518549</pub-id>
</mixed-citation>
</ref>
<ref id="msw067-B24">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Gould</surname>
<given-names>SJ</given-names>
</name>
<name>
<surname>Vrba</surname>
<given-names>ES.</given-names>
</name>
</person-group>
<year>1982</year>
<article-title>Exaptation-a missing term in the science of form</article-title>
.
<source>Paleobiology</source>
<volume>8</volume>
:
<fpage>4</fpage>
<lpage>15</lpage>
.</mixed-citation>
</ref>
<ref id="msw067-B25">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Guilfoyle</surname>
<given-names>TJ</given-names>
</name>
<name>
<surname>Hagen</surname>
<given-names>G.</given-names>
</name>
</person-group>
<year>2012</year>
<article-title>Getting a grasp on domain III/IV responsible for auxin response factor-IAA protein interactions</article-title>
.
<source>Plant Sci</source>
.
<volume>190</volume>
:
<fpage>82</fpage>
<lpage>88</lpage>
.
<pub-id pub-id-type="pmid">22608522</pub-id>
</mixed-citation>
</ref>
<ref id="msw067-B26">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Han</surname>
<given-names>Y</given-names>
</name>
<name>
<surname>Burnette</surname>
<given-names>JM</given-names>
<suffix>3rd</suffix>
</name>
<name>
<surname>Wessler</surname>
<given-names>SR.</given-names>
</name>
</person-group>
<year>2009</year>
<article-title>TARGeT: a web-based pipeline for retrieving and characterizing gene and transposable element families from genomic sequences</article-title>
.
<source>Nucleic Acids Res</source>
.
<volume>37</volume>
:
<fpage>e78</fpage>
.
<pub-id pub-id-type="pmid">19429695</pub-id>
</mixed-citation>
</ref>
<ref id="msw067-B27">
<mixed-citation publication-type="book">
<person-group person-group-type="author">
<name>
<surname>Hoen</surname>
<given-names>DR</given-names>
</name>
<name>
<surname>Bureau</surname>
<given-names>TE.</given-names>
</name>
</person-group>
<year>2012</year>
<chapter-title>Transposable element exaptation in plants</chapter-title>
In:
<person-group person-group-type="editor">
<name>
<surname>Grandbastien</surname>
<given-names>MA</given-names>
</name>
<name>
<surname>Casacuberta</surname>
<given-names>JM</given-names>
</name>
</person-group>
, editors.
<source>Plant transposable elements</source>
.
<publisher-loc>Berlin, Heidelberg</publisher-loc>
:
<publisher-name>Springer</publisher-name>
. p.
<fpage>219</fpage>
<lpage>251</lpage>
.</mixed-citation>
</ref>
<ref id="msw067-B28">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Hoen</surname>
<given-names>DR</given-names>
</name>
<name>
<surname>Bureau</surname>
<given-names>TE.</given-names>
</name>
</person-group>
<year>2015</year>
<article-title>Discovery of novel genes derived from transposable elements using integrative genomic analysis</article-title>
.
<source>Mol Biol Evol</source>
.
<volume>32</volume>
:
<fpage>1487</fpage>
<lpage>1506</lpage>
.
<pub-id pub-id-type="pmid">25713212</pub-id>
</mixed-citation>
</ref>
<ref id="msw067-B29">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Hoen</surname>
<given-names>DR</given-names>
</name>
<name>
<surname>Park</surname>
<given-names>KC</given-names>
</name>
<name>
<surname>Elrouby</surname>
<given-names>N</given-names>
</name>
<name>
<surname>Yu</surname>
<given-names>Z</given-names>
</name>
<name>
<surname>Mohabir</surname>
<given-names>N</given-names>
</name>
<name>
<surname>Cowan</surname>
<given-names>RK</given-names>
</name>
<name>
<surname>Bureau</surname>
<given-names>TE.</given-names>
</name>
</person-group>
<year>2006</year>
<article-title>Transposon-mediated expansion and diversification of a family of ULP-like genes</article-title>
.
<source>Mol Biol Evol</source>
.
<volume>23</volume>
:
<fpage>1254</fpage>
<lpage>1268</lpage>
.
<pub-id pub-id-type="pmid">16581939</pub-id>
</mixed-citation>
</ref>
<ref id="msw067-B30">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Huang</surname>
<given-names>X</given-names>
</name>
<name>
<surname>Ouyang</surname>
<given-names>X</given-names>
</name>
<name>
<surname>Yang</surname>
<given-names>P</given-names>
</name>
<name>
<surname>Lau</surname>
<given-names>OS</given-names>
</name>
<name>
<surname>Li</surname>
<given-names>G</given-names>
</name>
<name>
<surname>Li</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Chen</surname>
<given-names>H</given-names>
</name>
<name>
<surname>Deng</surname>
<given-names>XW.</given-names>
</name>
</person-group>
<year>2012</year>
<article-title>
<italic>Arabidopsis</italic>
FHY3 and HY5 positively mediate induction of COP1 transcription in response to photomorphogenic UV-B light</article-title>
.
<source>Plant Cell</source>
<volume>24</volume>
:
<fpage>4590</fpage>
<lpage>4606</lpage>
.
<pub-id pub-id-type="pmid">23150635</pub-id>
</mixed-citation>
</ref>
<ref id="msw067-B31">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Hudson</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Ringli</surname>
<given-names>C</given-names>
</name>
<name>
<surname>Boylan</surname>
<given-names>MT</given-names>
</name>
<name>
<surname>Quail</surname>
<given-names>PH.</given-names>
</name>
</person-group>
<year>1999</year>
<article-title>The FAR1 locus encodes a novel nuclear protein specific to phytochrome A signaling</article-title>
.
<source>Genes Dev</source>
.
<volume>13</volume>
:
<fpage>2017</fpage>
<lpage>2027</lpage>
.
<pub-id pub-id-type="pmid">10444599</pub-id>
</mixed-citation>
</ref>
<ref id="msw067-B32">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Hudson</surname>
<given-names>ME</given-names>
</name>
<name>
<surname>Lisch</surname>
<given-names>DR</given-names>
</name>
<name>
<surname>Quail</surname>
<given-names>PH.</given-names>
</name>
</person-group>
<year>2003</year>
<article-title>The FHY3 and FAR1 genes encode transposase-related proteins involved in regulation of gene expression by the phytochrome A-signaling pathway</article-title>
.
<source>Plant J</source>
.
<volume>34</volume>
:
<fpage>453</fpage>
<lpage>471</lpage>
.
<pub-id pub-id-type="pmid">12753585</pub-id>
</mixed-citation>
</ref>
<ref id="msw067-B33">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Huelsenbeck</surname>
<given-names>JP</given-names>
</name>
<name>
<surname>Ronquist</surname>
<given-names>F</given-names>
</name>
<name>
<surname>Nielsen</surname>
<given-names>R</given-names>
</name>
<name>
<surname>Bollback</surname>
<given-names>JP.</given-names>
</name>
</person-group>
<year>2001</year>
<article-title>Bayesian inference of phylogeny and its impact on evolutionary biology</article-title>
.
<source>Science</source>
<volume>294</volume>
:
<fpage>2310</fpage>
<lpage>2314</lpage>
.
<pub-id pub-id-type="pmid">11743192</pub-id>
</mixed-citation>
</ref>
<ref id="msw067-B34">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Joly-Lopez</surname>
<given-names>Z</given-names>
</name>
<name>
<surname>Forczek</surname>
<given-names>E</given-names>
</name>
<name>
<surname>Hoen</surname>
<given-names>DR</given-names>
</name>
<name>
<surname>Juretic</surname>
<given-names>N</given-names>
</name>
<name>
<surname>Bureau</surname>
<given-names>TE.</given-names>
</name>
</person-group>
<year>2012</year>
<article-title>A gene family derived from transposable elements during early angiosperm evolution has reproductive fitness benefits in
<italic>Arabidopsis thaliana</italic>
</article-title>
.
<source>PLoS Genet</source>
.
<volume>8</volume>
:
<fpage>e1002931</fpage>
.
<pub-id pub-id-type="pmid">22969437</pub-id>
</mixed-citation>
</ref>
<ref id="msw067-B35">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Juretic</surname>
<given-names>N</given-names>
</name>
<name>
<surname>Hoen</surname>
<given-names>D</given-names>
</name>
<name>
<surname>Huynh</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Harrison</surname>
<given-names>P</given-names>
</name>
<name>
<surname>Bureau</surname>
<given-names>T.</given-names>
</name>
</person-group>
<year>2005</year>
<article-title>The evolutionary fate of MULE-mediated duplications of host gene fragments in rice</article-title>
.
<source>Genome Res</source>
.
<volume>15</volume>
:
<fpage>1292</fpage>
<lpage>1297</lpage>
.
<pub-id pub-id-type="pmid">16140995</pub-id>
</mixed-citation>
</ref>
<ref id="msw067-B36">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Kapitonov</surname>
<given-names>V</given-names>
</name>
<name>
<surname>Jurka</surname>
<given-names>J.</given-names>
</name>
</person-group>
<year>2004</year>
<article-title>Harbinger transposons and an ancient HARBI1 gene derived from a transposase</article-title>
.
<source>DNA Cell Biol</source>
.
<volume>23</volume>
:
<fpage>311</fpage>
<lpage>324</lpage>
.
<pub-id pub-id-type="pmid">15169610</pub-id>
</mixed-citation>
</ref>
<ref id="msw067-B37">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Kapitonov</surname>
<given-names>VV</given-names>
</name>
<name>
<surname>Jurka</surname>
<given-names>J.</given-names>
</name>
</person-group>
<year>2005</year>
<article-title>RAG1 core and V(D)J recombination signal sequences were derived from Transib transposons</article-title>
.
<source>PLoS Biol</source>
.
<volume>3</volume>
:
<fpage>e181</fpage>
.
<pub-id pub-id-type="pmid">15898832</pub-id>
</mixed-citation>
</ref>
<ref id="msw067-B38">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Katoh</surname>
<given-names>K</given-names>
</name>
<name>
<surname>Standley</surname>
<given-names>DM.</given-names>
</name>
</person-group>
<year>2013</year>
<article-title>MAFFT multiple sequence alignment software version 7: improvements in performance and usability</article-title>
.
<source>Mol Biol Evol</source>
.
<volume>30</volume>
:
<fpage>772</fpage>
<lpage>780</lpage>
.
<pub-id pub-id-type="pmid">23329690</pub-id>
</mixed-citation>
</ref>
<ref id="msw067-B39">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Kawashima</surname>
<given-names>T</given-names>
</name>
<name>
<surname>Berger</surname>
<given-names>F.</given-names>
</name>
</person-group>
<year>2014</year>
<article-title>Epigenetic reprogramming in plant sexual reproduction</article-title>
.
<source>Nat Rev Genet</source>
.
<volume>15</volume>
:
<fpage>613</fpage>
<lpage>624</lpage>
.
<pub-id pub-id-type="pmid">25048170</pub-id>
</mixed-citation>
</ref>
<ref id="msw067-B40">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Korasick</surname>
<given-names>DA</given-names>
</name>
<name>
<surname>Westfall</surname>
<given-names>CS</given-names>
</name>
<name>
<surname>Lee</surname>
<given-names>SG</given-names>
</name>
<name>
<surname>Nanao</surname>
<given-names>MH</given-names>
</name>
<name>
<surname>Dumas</surname>
<given-names>R</given-names>
</name>
<name>
<surname>Hagen</surname>
<given-names>G</given-names>
</name>
<name>
<surname>Guilfoyle</surname>
<given-names>TJ</given-names>
</name>
<name>
<surname>Jez</surname>
<given-names>JM</given-names>
</name>
<name>
<surname>Strader</surname>
<given-names>LC.</given-names>
</name>
</person-group>
<year>2014</year>
<article-title>Molecular basis for auxin response factor protein interaction and the control of auxin response repression</article-title>
.
<source>Proc Natl Acad Sci U S A</source>
.
<volume>111</volume>
:
<fpage>5427</fpage>
<lpage>5432</lpage>
.
<pub-id pub-id-type="pmid">24706860</pub-id>
</mixed-citation>
</ref>
<ref id="msw067-B41">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Larsson</surname>
<given-names>A.</given-names>
</name>
</person-group>
<year>2014</year>
<article-title>AliView: a fast and lightweight alignment viewer and editor for large datasets</article-title>
.
<source>Bioinformatics</source>
<volume>30</volume>
:
<fpage>3276</fpage>
<lpage>3278</lpage>
.
<pub-id pub-id-type="pmid">25095880</pub-id>
</mixed-citation>
</ref>
<ref id="msw067-B42">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Le</surname>
<given-names>QH</given-names>
</name>
<name>
<surname>Wright</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Yu</surname>
<given-names>Z</given-names>
</name>
<name>
<surname>Bureau</surname>
<given-names>T.</given-names>
</name>
</person-group>
<year>2000</year>
<article-title>Transposon diversity in
<italic>Arabidopsis thaliana</italic>
</article-title>
.
<source>Proc Natl Acad Sci U S A</source>
.
<volume>97</volume>
:
<fpage>7376</fpage>
<lpage>7381</lpage>
.
<pub-id pub-id-type="pmid">10861007</pub-id>
</mixed-citation>
</ref>
<ref id="msw067-B43">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Levin</surname>
<given-names>HL</given-names>
</name>
<name>
<surname>Moran</surname>
<given-names>JV.</given-names>
</name>
</person-group>
<year>2011</year>
<article-title>Dynamic interactions between transposable elements and their hosts</article-title>
.
<source>Nat Rev Genet</source>
.
<volume>12</volume>
:
<fpage>615</fpage>
<lpage>627</lpage>
.
<pub-id pub-id-type="pmid">21850042</pub-id>
</mixed-citation>
</ref>
<ref id="msw067-B44">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Li</surname>
<given-names>G</given-names>
</name>
<name>
<surname>Siddiqui</surname>
<given-names>H</given-names>
</name>
<name>
<surname>Teng</surname>
<given-names>Y</given-names>
</name>
<name>
<surname>Lin</surname>
<given-names>R</given-names>
</name>
<name>
<surname>Wan</surname>
<given-names>XY</given-names>
</name>
<name>
<surname>Li</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Lau</surname>
<given-names>OS</given-names>
</name>
<name>
<surname>Ouyang</surname>
<given-names>X</given-names>
</name>
<name>
<surname>Dai</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Wan</surname>
<given-names>J</given-names>
</name>
</person-group>
,
<etal></etal>
<year>2011</year>
<article-title>Coordinated transcriptional regulation underlying the circadian clock in
<italic>Arabidopsis</italic>
</article-title>
.
<source>Nat Cell Biol</source>
.
<volume>13</volume>
:
<fpage>616</fpage>
<lpage>622</lpage>
.
<pub-id pub-id-type="pmid">21499259</pub-id>
</mixed-citation>
</ref>
<ref id="msw067-B45">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Lin</surname>
<given-names>R</given-names>
</name>
<name>
<surname>Ding</surname>
<given-names>L</given-names>
</name>
<name>
<surname>Casola</surname>
<given-names>C</given-names>
</name>
<name>
<surname>Ripoll</surname>
<given-names>DR</given-names>
</name>
<name>
<surname>Feschotte</surname>
<given-names>C</given-names>
</name>
<name>
<surname>Wang</surname>
<given-names>H.</given-names>
</name>
</person-group>
<year>2007</year>
<article-title>Transposase-derived transcription factors regulate light signaling in Arabidopsis</article-title>
.
<source>Science</source>
<volume>318</volume>
:
<fpage>1302</fpage>
<lpage>1305</lpage>
.
<pub-id pub-id-type="pmid">18033885</pub-id>
</mixed-citation>
</ref>
<ref id="msw067-B46">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Lin</surname>
<given-names>R</given-names>
</name>
<name>
<surname>Teng</surname>
<given-names>Y</given-names>
</name>
<name>
<surname>Park</surname>
<given-names>HJ</given-names>
</name>
<name>
<surname>Ding</surname>
<given-names>L</given-names>
</name>
<name>
<surname>Black</surname>
<given-names>C</given-names>
</name>
<name>
<surname>Fang</surname>
<given-names>P</given-names>
</name>
<name>
<surname>Wang</surname>
<given-names>H.</given-names>
</name>
</person-group>
<year>2008</year>
<article-title>Discrete and essential roles of the multiple domains of
<italic>Arabidopsis</italic>
FHY3 in mediating phytochrome A signal transduction</article-title>
.
<source>Plant Physiol</source>
.
<volume>148</volume>
:
<fpage>981</fpage>
<lpage>992</lpage>
.
<pub-id pub-id-type="pmid">18715961</pub-id>
</mixed-citation>
</ref>
<ref id="msw067-B47">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Lin</surname>
<given-names>R</given-names>
</name>
<name>
<surname>Wang</surname>
<given-names>H.</given-names>
</name>
</person-group>
<year>2004</year>
<article-title>Arabidopsis FHY3/FAR1 gene family and distinct roles of its members in light control of
<italic>Arabidopsis</italic>
development</article-title>
.
<source>Plant Physiol</source>
.
<volume>136</volume>
:
<fpage>4010</fpage>
<lpage>4022</lpage>
.
<pub-id pub-id-type="pmid">15591448</pub-id>
</mixed-citation>
</ref>
<ref id="msw067-B48">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Lisch</surname>
<given-names>D.</given-names>
</name>
</person-group>
<year>2009</year>
<article-title>Epigenetic regulation of transposable elements in plants</article-title>
.
<source>Annu Rev Plant Biol</source>
.
<volume>60</volume>
:
<fpage>43</fpage>
<lpage>66</lpage>
.
<pub-id pub-id-type="pmid">19007329</pub-id>
</mixed-citation>
</ref>
<ref id="msw067-B49">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Louis</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Muffato</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Roest Crollius</surname>
<given-names>H.</given-names>
</name>
</person-group>
<year>2013</year>
<article-title>Genomicus: five genome browsers for comparative genomics in eukaryota</article-title>
.
<source>Nucleic Acids Res</source>
.
<volume>41</volume>
:
<fpage>D700</fpage>
<lpage>D705</lpage>
.
<pub-id pub-id-type="pmid">23193262</pub-id>
</mixed-citation>
</ref>
<ref id="msw067-B50">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Marchler-Bauer</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Lu</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Anderson</surname>
<given-names>JB</given-names>
</name>
<name>
<surname>Chitsaz</surname>
<given-names>F</given-names>
</name>
<name>
<surname>Derbyshire</surname>
<given-names>MK</given-names>
</name>
<name>
<surname>DeWeese-Scott</surname>
<given-names>C</given-names>
</name>
<name>
<surname>Fong</surname>
<given-names>JH</given-names>
</name>
<name>
<surname>Geer</surname>
<given-names>LY</given-names>
</name>
<name>
<surname>Geer</surname>
<given-names>RC</given-names>
</name>
<name>
<surname>Gonzales</surname>
<given-names>NR</given-names>
</name>
</person-group>
,
<etal></etal>
<year>2011</year>
<article-title>CDD: a conserved domain database for the functional annotation of proteins</article-title>
.
<source>Nucleic Acids Res</source>
.
<volume>39</volume>
:
<fpage>D225</fpage>
<lpage>D229</lpage>
.
<pub-id pub-id-type="pmid">21109532</pub-id>
</mixed-citation>
</ref>
<ref id="msw067-B51">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Miller</surname>
<given-names>WJ</given-names>
</name>
<name>
<surname>Hagemann</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Reiter</surname>
<given-names>E</given-names>
</name>
<name>
<surname>Pinsker</surname>
<given-names>W.</given-names>
</name>
</person-group>
<year>1992</year>
<article-title>P-element homologous sequences are tandemly repeated in the genome of Drosophila guanche</article-title>
.
<source>Proc Natl Acad Sci U S A</source>
.
<volume>89</volume>
:
<fpage>4018</fpage>
<lpage>4022</lpage>
.
<pub-id pub-id-type="pmid">1315047</pub-id>
</mixed-citation>
</ref>
<ref id="msw067-B52">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Oliver</surname>
<given-names>KR</given-names>
</name>
<name>
<surname>McComb</surname>
<given-names>JA</given-names>
</name>
<name>
<surname>Greene</surname>
<given-names>WK.</given-names>
</name>
</person-group>
<year>2013</year>
<article-title>Transposable elements: powerful contributors to angiosperm evolution and diversity</article-title>
.
<source>Genome Biol Evol</source>
.
<volume>5</volume>
:
<fpage>1886</fpage>
<lpage>1901</lpage>
.
<pub-id pub-id-type="pmid">24065734</pub-id>
</mixed-citation>
</ref>
<ref id="msw067-B53">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Orgel</surname>
<given-names>LE</given-names>
</name>
<name>
<surname>Crick</surname>
<given-names>FH.</given-names>
</name>
</person-group>
<year>1980</year>
<article-title>Selfish DNA: the ultimate parasite</article-title>
.
<source>Nature</source>
<volume>284</volume>
:
<fpage>604</fpage>
<lpage>607</lpage>
.
<pub-id pub-id-type="pmid">7366731</pub-id>
</mixed-citation>
</ref>
<ref id="msw067-B54">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Ouyang</surname>
<given-names>X</given-names>
</name>
<name>
<surname>Li</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Li</surname>
<given-names>G</given-names>
</name>
<name>
<surname>Li</surname>
<given-names>B</given-names>
</name>
<name>
<surname>Chen</surname>
<given-names>B</given-names>
</name>
<name>
<surname>Shen</surname>
<given-names>H</given-names>
</name>
<name>
<surname>Huang</surname>
<given-names>X</given-names>
</name>
<name>
<surname>Mo</surname>
<given-names>X</given-names>
</name>
<name>
<surname>Wan</surname>
<given-names>X</given-names>
</name>
<name>
<surname>Lin</surname>
<given-names>R</given-names>
</name>
</person-group>
,
<etal></etal>
<year>2011</year>
<article-title>Genome-wide binding site analysis of FAR-RED ELONGATED HYPOCOTYL3 reveals its novel function in
<italic>Arabidopsis</italic>
development</article-title>
.
<source>Plant Cell</source>
<volume>23</volume>
:
<fpage>2514</fpage>
<lpage>2535</lpage>
.
<pub-id pub-id-type="pmid">21803941</pub-id>
</mixed-citation>
</ref>
<ref id="msw067-B55">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Pardue</surname>
<given-names>ML</given-names>
</name>
<name>
<surname>DeBaryshe</surname>
<given-names>PG.</given-names>
</name>
</person-group>
<year>2011</year>
<article-title>Retrotransposons that maintain chromosome ends</article-title>
.
<source>Proc Natl Acad Sci U S A</source>
.
<volume>108</volume>
:
<fpage>20317</fpage>
<lpage>20324</lpage>
.
<pub-id pub-id-type="pmid">21821789</pub-id>
</mixed-citation>
</ref>
<ref id="msw067-B56">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Parisod</surname>
<given-names>C</given-names>
</name>
<name>
<surname>Alix</surname>
<given-names>K</given-names>
</name>
<name>
<surname>Just</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Petit</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Sarilar</surname>
<given-names>V</given-names>
</name>
<name>
<surname>Mhiri</surname>
<given-names>C</given-names>
</name>
<name>
<surname>Ainouche</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Chalhoub</surname>
<given-names>B</given-names>
</name>
<name>
<surname>Grandbastien</surname>
<given-names>MA.</given-names>
</name>
</person-group>
<year>2010</year>
<article-title>Impact of transposable elements on the organization and function of allopolyploid genomes</article-title>
.
<source>New Phytol</source>
.
<volume>186</volume>
:
<fpage>37</fpage>
<lpage>45</lpage>
.
<pub-id pub-id-type="pmid">20002321</pub-id>
</mixed-citation>
</ref>
<ref id="msw067-B57">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Prasad</surname>
<given-names>BD</given-names>
</name>
<name>
<surname>Goel</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Krishna</surname>
<given-names>P.</given-names>
</name>
</person-group>
<year>2010</year>
<article-title>In silico identification of carboxylate clamp type tetratricopeptide repeat proteins in Arabidopsis and rice as putative co-chaperones of Hsp90/Hsp70</article-title>
.
<source>PLoS One</source>
<volume>5</volume>
:
<fpage>e12761</fpage>
.
<pub-id pub-id-type="pmid">20856808</pub-id>
</mixed-citation>
</ref>
<ref id="msw067-B58">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Quesneville</surname>
<given-names>H</given-names>
</name>
<name>
<surname>Nouaud</surname>
<given-names>D</given-names>
</name>
<name>
<surname>Anxolabehere</surname>
<given-names>D.</given-names>
</name>
</person-group>
<year>2005</year>
<article-title>Recurrent recruitment of the THAP DNA-binding domain and molecular domestication of the P-transposable element</article-title>
.
<source>Mol Biol Evol</source>
.
<volume>22</volume>
:
<fpage>741</fpage>
<lpage>746</lpage>
.
<pub-id pub-id-type="pmid">15574804</pub-id>
</mixed-citation>
</ref>
<ref id="msw067-B59">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Rawn</surname>
<given-names>SM</given-names>
</name>
<name>
<surname>Cross</surname>
<given-names>JC.</given-names>
</name>
</person-group>
<year>2008</year>
<article-title>The evolution, regulation, and function of placenta-specific genes</article-title>
.
<source>Annu Rev Cell Dev Biol</source>
.
<volume>24</volume>
:
<fpage>159</fpage>
<lpage>181</lpage>
.
<pub-id pub-id-type="pmid">18616428</pub-id>
</mixed-citation>
</ref>
<ref id="msw067-B60">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Rebollo</surname>
<given-names>R</given-names>
</name>
<name>
<surname>Horard</surname>
<given-names>B</given-names>
</name>
<name>
<surname>Hubert</surname>
<given-names>B</given-names>
</name>
<name>
<surname>Vieira</surname>
<given-names>C.</given-names>
</name>
</person-group>
<year>2010</year>
<article-title>Jumping genes and epigenetics: towards new species</article-title>
.
<source>Gene</source>
<volume>454</volume>
:
<fpage>1</fpage>
<lpage>7</lpage>
.
<pub-id pub-id-type="pmid">20102733</pub-id>
</mixed-citation>
</ref>
<ref id="msw067-B61">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Ronquist</surname>
<given-names>F</given-names>
</name>
<name>
<surname>Huelsenbeck</surname>
<given-names>JP.</given-names>
</name>
</person-group>
<year>2003</year>
<article-title>MrBayes 3: Bayesian phylogenetic inference under mixed models</article-title>
.
<source>Bioinformatics</source>
<volume>19</volume>
:
<fpage>1572</fpage>
<lpage>1574</lpage>
.
<pub-id pub-id-type="pmid">12912839</pub-id>
</mixed-citation>
</ref>
<ref id="msw067-B62">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Rosso</surname>
<given-names>MG</given-names>
</name>
<name>
<surname>Li</surname>
<given-names>Y</given-names>
</name>
<name>
<surname>Strizhov</surname>
<given-names>N</given-names>
</name>
<name>
<surname>Reiss</surname>
<given-names>B</given-names>
</name>
<name>
<surname>Dekker</surname>
<given-names>K</given-names>
</name>
<name>
<surname>Weisshaar</surname>
<given-names>B.</given-names>
</name>
</person-group>
<year>2003</year>
<article-title>An
<italic>Arabidopsis thaliana</italic>
T-DNA mutagenized population (GABI-Kat) for flanking sequence tag-based reverse genetics</article-title>
.
<source>Plant Mol Biol</source>
.
<volume>53</volume>
:
<fpage>247</fpage>
<lpage>259</lpage>
.
<pub-id pub-id-type="pmid">14756321</pub-id>
</mixed-citation>
</ref>
<ref id="msw067-B63">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Roussigne</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Kossida</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Lavigne</surname>
<given-names>AC</given-names>
</name>
<name>
<surname>Clouaire</surname>
<given-names>T</given-names>
</name>
<name>
<surname>Ecochard</surname>
<given-names>V</given-names>
</name>
<name>
<surname>Glories</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Amalric</surname>
<given-names>F</given-names>
</name>
<name>
<surname>Girard</surname>
<given-names>JP.</given-names>
</name>
</person-group>
<year>2003</year>
<article-title>The THAP domain: a novel protein motif with similarity to the DNA-binding domain of P element transposase</article-title>
.
<source>Trends Biochem Sci</source>
.
<volume>28</volume>
:
<fpage>66</fpage>
<lpage>69</lpage>
.
<pub-id pub-id-type="pmid">12575992</pub-id>
</mixed-citation>
</ref>
<ref id="msw067-B64">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Saccaro</surname>
<given-names>NL</given-names>
</name>
<name>
<surname>Van Sluys</surname>
<given-names>M-A</given-names>
</name>
<name>
<surname>de Mello Varani</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Rossi</surname>
<given-names>M.</given-names>
</name>
</person-group>
<year>2007</year>
<article-title>MudrA-like sequences from rice and sugarcane cluster as two bona fide transposon clades and two domesticated transposases</article-title>
.
<source>Gene</source>
<volume>392</volume>
:
<fpage>117</fpage>
<lpage>125</lpage>
.
<pub-id pub-id-type="pmid">17289300</pub-id>
</mixed-citation>
</ref>
<ref id="msw067-B65">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Sinzelle</surname>
<given-names>L</given-names>
</name>
<name>
<surname>Izsvak</surname>
<given-names>Z</given-names>
</name>
<name>
<surname>Ivics</surname>
<given-names>Z.</given-names>
</name>
</person-group>
<year>2009</year>
<article-title>Molecular domestication of transposable elements: from detrimental parasites to useful host genes</article-title>
.
<source>Cell Mol Life Sci</source>
.
<volume>66</volume>
:
<fpage>1073</fpage>
<lpage>1093</lpage>
.
<pub-id pub-id-type="pmid">19132291</pub-id>
</mixed-citation>
</ref>
<ref id="msw067-B66">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Stirnberg</surname>
<given-names>P</given-names>
</name>
<name>
<surname>Zhao</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Williamson</surname>
<given-names>L</given-names>
</name>
<name>
<surname>Ward</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Leyser</surname>
<given-names>O.</given-names>
</name>
</person-group>
<year>2012</year>
<article-title>FHY3 promotes shoot branching and stress tolerance in
<italic>Arabidopsis</italic>
in an AXR1-dependent manner</article-title>
.
<source>Plant J</source>
.
<volume>71</volume>
:
<fpage>907</fpage>
<lpage>920</lpage>
.
<pub-id pub-id-type="pmid">22540368</pub-id>
</mixed-citation>
</ref>
<ref id="msw067-B67">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Sumimoto</surname>
<given-names>H</given-names>
</name>
<name>
<surname>Kamakura</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Ito</surname>
<given-names>T.</given-names>
</name>
</person-group>
<year>2007</year>
<article-title>Structure and function of the PB1 domain, a protein interaction module conserved in animals, fungi, amoebas, and plants</article-title>
.
<source>Sci STKE</source>
.
<volume>2007</volume>
:
<fpage>re6</fpage>
.
<pub-id pub-id-type="pmid">17726178</pub-id>
</mixed-citation>
</ref>
<ref id="msw067-B68">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Tang</surname>
<given-names>W</given-names>
</name>
<name>
<surname>Ji</surname>
<given-names>Q</given-names>
</name>
<name>
<surname>Huang</surname>
<given-names>Y</given-names>
</name>
<name>
<surname>Jiang</surname>
<given-names>Z</given-names>
</name>
<name>
<surname>Bao</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Wang</surname>
<given-names>H</given-names>
</name>
<name>
<surname>Lin</surname>
<given-names>R.</given-names>
</name>
</person-group>
<year>2013</year>
<article-title>FAR-RED ELONGATED HYPOCOTYL3 and FAR-RED IMPAIRED RESPONSE1 transcription factors integrate light and abscisic acid signaling in Arabidopsis</article-title>
.
<source>Plant Physiol</source>
.
<volume>163</volume>
:
<fpage>857</fpage>
<lpage>866</lpage>
.
<pub-id pub-id-type="pmid">23946351</pub-id>
</mixed-citation>
</ref>
<ref id="msw067-B69">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Trehin</surname>
<given-names>C</given-names>
</name>
<name>
<surname>Schrempp</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Chauvet</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Berne-Dedieu</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Thierry</surname>
<given-names>AM</given-names>
</name>
<name>
<surname>Faure</surname>
<given-names>JE</given-names>
</name>
<name>
<surname>Negrutiu</surname>
<given-names>I</given-names>
</name>
<name>
<surname>Morel</surname>
<given-names>P.</given-names>
</name>
</person-group>
<year>2013</year>
<article-title>QUIRKY interacts with STRUBBELIG and PAL OF QUIRKY to regulate cell growth anisotropy during
<italic>Arabidopsis</italic>
gynoecium development</article-title>
.
<source>Development</source>
<volume>140</volume>
:
<fpage>4807</fpage>
<lpage>4817</lpage>
.
<pub-id pub-id-type="pmid">24173806</pub-id>
</mixed-citation>
</ref>
<ref id="msw067-B70">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>van Leeuwen</surname>
<given-names>H</given-names>
</name>
<name>
<surname>Monfort</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Puigdomenech</surname>
<given-names>P.</given-names>
</name>
</person-group>
<year>2007</year>
<article-title>Mutator-like elements identified in melon, Arabidopsis and rice contain ULP1 protease domains</article-title>
.
<source>Mol Genet Genomics</source>
.
<volume>277</volume>
:
<fpage>357</fpage>
<lpage>364</lpage>
.
<pub-id pub-id-type="pmid">17136348</pub-id>
</mixed-citation>
</ref>
<ref id="msw067-B71">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Wang</surname>
<given-names>H</given-names>
</name>
<name>
<surname>Deng</surname>
<given-names>XW.</given-names>
</name>
</person-group>
<year>2002</year>
<article-title>
<italic>Arabidopsis</italic>
FHY3 defines a key phytochrome A signaling component directly interacting with its homologous partner FAR1</article-title>
.
<source>EMBO J</source>
.
<volume>21</volume>
:
<fpage>1339</fpage>
<lpage>1349</lpage>
.
<pub-id pub-id-type="pmid">11889039</pub-id>
</mixed-citation>
</ref>
<ref id="msw067-B72">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Wang</surname>
<given-names>H</given-names>
</name>
<name>
<surname>Wang</surname>
<given-names>H.</given-names>
</name>
</person-group>
<year>2015</year>
<article-title>Multifaceted roles of FHY3 and FAR1 in light signaling and beyond</article-title>
.
<source>Trends Plant Sci</source>
.
<volume>20</volume>
:
<fpage>453</fpage>
<lpage>461</lpage>
.
<pub-id pub-id-type="pmid">25956482</pub-id>
</mixed-citation>
</ref>
<ref id="msw067-B73">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Whitelam</surname>
<given-names>GC</given-names>
</name>
<name>
<surname>Johnson</surname>
<given-names>E</given-names>
</name>
<name>
<surname>Peng</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Carol</surname>
<given-names>P</given-names>
</name>
<name>
<surname>Anderson</surname>
<given-names>ML</given-names>
</name>
<name>
<surname>Cowl</surname>
<given-names>JS</given-names>
</name>
<name>
<surname>Harberd</surname>
<given-names>NP.</given-names>
</name>
</person-group>
<year>1993</year>
<article-title>Phytochrome A null mutants of Arabidopsis display a wild-type phenotype in white light</article-title>
.
<source>Plant Cell</source>
<volume>5</volume>
:
<fpage>757</fpage>
<lpage>768</lpage>
.
<pub-id pub-id-type="pmid">8364355</pub-id>
</mixed-citation>
</ref>
<ref id="msw067-B74">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Yang</surname>
<given-names>Z</given-names>
</name>
<name>
<surname>Wong</surname>
<given-names>WS</given-names>
</name>
<name>
<surname>Nielsen</surname>
<given-names>R.</given-names>
</name>
</person-group>
<year>2005</year>
<article-title>Bayes empirical Bayes inference of amino acid sites under positive selection</article-title>
.
<source>Mol Biol Evol</source>
.
<volume>22</volume>
:
<fpage>1107</fpage>
<lpage>1118</lpage>
.
<pub-id pub-id-type="pmid">15689528</pub-id>
</mixed-citation>
</ref>
<ref id="msw067-B75">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Zeh</surname>
<given-names>DW</given-names>
</name>
<name>
<surname>Zeh</surname>
<given-names>JA</given-names>
</name>
<name>
<surname>Ishida</surname>
<given-names>Y.</given-names>
</name>
</person-group>
<year>2009</year>
<article-title>Transposable elements and an epigenetic basis for punctuated equilibria</article-title>
.
<source>BioEssays</source>
<volume>31</volume>
:
<fpage>715</fpage>
<lpage>726</lpage>
.
<pub-id pub-id-type="pmid">19472370</pub-id>
</mixed-citation>
</ref>
<ref id="msw067-B76">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Zeng</surname>
<given-names>L</given-names>
</name>
<name>
<surname>Zhang</surname>
<given-names>Q</given-names>
</name>
<name>
<surname>Sun</surname>
<given-names>R</given-names>
</name>
<name>
<surname>Kong</surname>
<given-names>H</given-names>
</name>
<name>
<surname>Zhang</surname>
<given-names>N</given-names>
</name>
<name>
<surname>Ma</surname>
<given-names>H.</given-names>
</name>
</person-group>
<year>2014</year>
<article-title>Resolution of deep angiosperm phylogeny using conserved nuclear genes and estimates of early divergence times</article-title>
.
<source>Nat Commun</source>
.
<volume>5</volume>
:
<fpage>4956</fpage>
.
<pub-id pub-id-type="pmid">25249442</pub-id>
</mixed-citation>
</ref>
<ref id="msw067-B77">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Zhang</surname>
<given-names>W</given-names>
</name>
<name>
<surname>Wu</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Ward</surname>
<given-names>MD</given-names>
</name>
<name>
<surname>Yang</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Chuang</surname>
<given-names>YA</given-names>
</name>
<name>
<surname>Xiao</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Li</surname>
<given-names>R</given-names>
</name>
<name>
<surname>Leahy</surname>
<given-names>DJ</given-names>
</name>
<name>
<surname>Worley</surname>
<given-names>PF.</given-names>
</name>
</person-group>
<year>2015</year>
<article-title>Structural basis of Arc binding to synaptic proteins: implications for cognitive disease</article-title>
.
<source>Neuron</source>
<volume>86</volume>
:
<fpage>490</fpage>
<lpage>500</lpage>
.
<pub-id pub-id-type="pmid">25864631</pub-id>
</mixed-citation>
</ref>
<ref id="msw067-B78">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Zientara-Rytter</surname>
<given-names>K</given-names>
</name>
<name>
<surname>Sirko</surname>
<given-names>A.</given-names>
</name>
</person-group>
<year>2014</year>
<article-title>Significant role of PB1 and UBA domains in multimerization of Joka2, a selective autophagy cargo receptor from tobacco</article-title>
.
<source>Front Plant Sci</source>
.
<volume>5</volume>
:
<fpage>13</fpage>
.
<pub-id pub-id-type="pmid">24550923</pub-id>
</mixed-citation>
</ref>
<ref id="msw067-B79">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Zimin</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Stevens</surname>
<given-names>KA</given-names>
</name>
<name>
<surname>Crepeau</surname>
<given-names>MW</given-names>
</name>
<name>
<surname>Holtz-Morris</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Koriabine</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Marcais</surname>
<given-names>G</given-names>
</name>
<name>
<surname>Puiu</surname>
<given-names>D</given-names>
</name>
<name>
<surname>Roberts</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Wegrzyn</surname>
<given-names>JL</given-names>
</name>
<name>
<surname>de Jong</surname>
<given-names>PJ</given-names>
</name>
</person-group>
,
<etal></etal>
<year>2014</year>
<article-title>Sequencing and assembly of the 22-Gb loblolly pine genome</article-title>
.
<source>Genetics</source>
<volume>196</volume>
:
<fpage>875</fpage>
<lpage>890</lpage>
.
<pub-id pub-id-type="pmid">24653210</pub-id>
</mixed-citation>
</ref>
</ref-list>
</back>
</pmc>
</record>

To work with this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Wicri/Bois/explor/OrangerV1/Data/Pmc/Corpus
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 001067 | SxmlIndent | more

Or

HfdSelect -h $EXPLOR_AREA/Data/Pmc/Corpus/biblio.hfd -nk 001067 | SxmlIndent | more

To link to this page within the Wicri network

{{Explor lien
   |wiki=    Wicri/Bois
   |area=    OrangerV1
   |flux=    Pmc
   |étape=   Corpus
   |type=    RBID
   |clé=     PMC:4948706
   |texte=   Phylogenetic and Genomic Analyses Resolve the Origin of Important Plant Genes Derived from Transposable Elements
}}

To generate wiki pages

HfdIndexSelect -h $EXPLOR_AREA/Data/Pmc/Corpus/RBID.i   -Sk "pubmed:27189548" \
       | HfdSelect -Kh $EXPLOR_AREA/Data/Pmc/Corpus/biblio.hfd   \
       | NlmPubMed2Wicri -a OrangerV1 


This area was generated with Dilib version V0.6.25.
Data generation: Sat Dec 3 17:11:04 2016. Site generation: Wed Mar 6 18:18:32 2024