Intron Invasions Trace Algal Speciation and Reveal Nearly Identical Arctic and Antarctic Micromonas Populations

Internal identifier: 000566 (Pmc/Corpus); previous: 000565; next: 000567

Authors: Melinda P. Simmons; Charles Bachy; Sebastian Sudek; Marijke J. Van Baren; Lisa Sudek; Manuel Ares; Alexandra Z. Worden

Source:

Molecular Biology and Evolution [0737-4038]; 2015.

RBID: PMC:4540971

Abstract

Spliceosomal introns are a hallmark of eukaryotic genes that are hypothesized to play important roles in genome evolution but have poorly understood origins. Although most introns lack sequence homology to each other, new families of spliceosomal introns that are repeated hundreds of times in individual genomes have recently been discovered in a few organisms. The prevalence and conservation of these introner elements (IEs) or introner-like elements in other taxa, as well as their evolutionary relationships to regular spliceosomal introns, are still unknown. Here, we systematically investigate introns in the widespread marine green alga Micromonas and report new families of IEs, numerous intron presence–absence polymorphisms, and potential intron insertion hot-spots. The new families enabled identification of conserved IE secondary structure features and establishment of a novel general model for repetitive intron proliferation across genomes. Despite shared secondary structure, the IE families from each Micromonas lineage bear no obvious sequence similarity to those in the other lineages, suggesting that their appearance is intimately linked with the process of speciation. Two of the new IE families come from an Arctic culture (Micromonas Clade E2) isolated from a polar region where abundance of this alga is increasing due to climate-induced changes. The same two families were detected in metagenomic data from Antarctica, a system where Micromonas has never before been reported. Strikingly high identity between the Arctic isolate and Antarctic coding sequences that flank the IEs suggests connectivity between populations in the two polar systems that we postulate occurs through deep-sea currents. Recovery of Clade E2 sequences in North Atlantic Deep Waters beneath the Gulf Stream supports this hypothesis. Our research illuminates the dynamic relationships between an unusual class of repetitive introns, genome evolution, speciation, and global distribution of this sentinel marine alga.

Url:

http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4540971

DOI: 10.1093/molbev/msv122
PubMed: 25998521
PubMed Central: 4540971

Links to Exploration step

PMC:4540971

The document in XML format

<record><TEI><teiHeader><fileDesc><titleStmt><title xml:lang="en">Intron Invasions Trace Algal Speciation and Reveal Nearly Identical Arctic and Antarctic <italic>Micromonas</italic>
 Populations</title>
<author><name sortKey="Simmons, Melinda P" sort="Simmons, Melinda P" uniqKey="Simmons M" first="Melinda P." last="Simmons">Melinda P. Simmons</name>
<affiliation><nlm:aff id="msv122-AFF1">Monterey Bay Aquarium Research Institute (MBARI), Moss Landing, CA</nlm:aff>
</affiliation>
<affiliation><nlm:aff id="msv122-AFF2">Department of Ocean Sciences, University of California Santa Cruz</nlm:aff>
</affiliation>
</author>
<author><name sortKey="Bachy, Charles" sort="Bachy, Charles" uniqKey="Bachy C" first="Charles" last="Bachy">Charles Bachy</name>
<affiliation><nlm:aff id="msv122-AFF1">Monterey Bay Aquarium Research Institute (MBARI), Moss Landing, CA</nlm:aff>
</affiliation>
</author>
<author><name sortKey="Sudek, Sebastian" sort="Sudek, Sebastian" uniqKey="Sudek S" first="Sebastian" last="Sudek">Sebastian Sudek</name>
<affiliation><nlm:aff id="msv122-AFF1">Monterey Bay Aquarium Research Institute (MBARI), Moss Landing, CA</nlm:aff>
</affiliation>
</author>
<author><name sortKey="Van Baren, Marijke J" sort="Van Baren, Marijke J" uniqKey="Van Baren M" first="Marijke J." last="Van Baren">Marijke J. Van Baren</name>
<affiliation><nlm:aff id="msv122-AFF1">Monterey Bay Aquarium Research Institute (MBARI), Moss Landing, CA</nlm:aff>
</affiliation>
</author>
<author><name sortKey="Sudek, Lisa" sort="Sudek, Lisa" uniqKey="Sudek L" first="Lisa" last="Sudek">Lisa Sudek</name>
<affiliation><nlm:aff id="msv122-AFF1">Monterey Bay Aquarium Research Institute (MBARI), Moss Landing, CA</nlm:aff>
</affiliation>
</author>
<author><name sortKey="Ares, Manuel" sort="Ares, Manuel" uniqKey="Ares M" first="Manuel" last="Ares">Manuel Ares</name>
<affiliation><nlm:aff id="msv122-AFF3">Department of Molecular, Cell &amp; Developmental Biology, University of California Santa Cruz</nlm:aff>
</affiliation>
</author>
<author><name sortKey="Worden, Alexandra Z" sort="Worden, Alexandra Z" uniqKey="Worden A" first="Alexandra Z." last="Worden">Alexandra Z. Worden</name>
<affiliation><nlm:aff id="msv122-AFF1">Monterey Bay Aquarium Research Institute (MBARI), Moss Landing, CA</nlm:aff>
</affiliation>
<affiliation><nlm:aff id="msv122-AFF2">Department of Ocean Sciences, University of California Santa Cruz</nlm:aff>
</affiliation>
<affiliation><nlm:aff id="msv122-AFF4">Integrated Microbial Biodiversity Program, Canadian Institute for Advanced Research, Toronto, ON, Canada</nlm:aff>
</affiliation>
</author>
</titleStmt>
<publicationStmt><idno type="wicri:source">PMC</idno>
<idno type="pmid">25998521</idno>
<idno type="pmc">4540971</idno>
<idno type="url">http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4540971</idno>
<idno type="RBID">PMC:4540971</idno>
<idno type="doi">10.1093/molbev/msv122</idno>
<date when="2015">2015</date>
<idno type="wicri:Area/Pmc/Corpus">000566</idno>
</publicationStmt>
<sourceDesc><biblStruct><analytic><title xml:lang="en" level="a" type="main">Intron Invasions Trace Algal Speciation and Reveal Nearly Identical Arctic and Antarctic <italic>Micromonas</italic>
 Populations</title>
<author><name sortKey="Simmons, Melinda P" sort="Simmons, Melinda P" uniqKey="Simmons M" first="Melinda P." last="Simmons">Melinda P. Simmons</name>
<affiliation><nlm:aff id="msv122-AFF1">Monterey Bay Aquarium Research Institute (MBARI), Moss Landing, CA</nlm:aff>
</affiliation>
<affiliation><nlm:aff id="msv122-AFF2">Department of Ocean Sciences, University of California Santa Cruz</nlm:aff>
</affiliation>
</author>
<author><name sortKey="Bachy, Charles" sort="Bachy, Charles" uniqKey="Bachy C" first="Charles" last="Bachy">Charles Bachy</name>
<affiliation><nlm:aff id="msv122-AFF1">Monterey Bay Aquarium Research Institute (MBARI), Moss Landing, CA</nlm:aff>
</affiliation>
</author>
<author><name sortKey="Sudek, Sebastian" sort="Sudek, Sebastian" uniqKey="Sudek S" first="Sebastian" last="Sudek">Sebastian Sudek</name>
<affiliation><nlm:aff id="msv122-AFF1">Monterey Bay Aquarium Research Institute (MBARI), Moss Landing, CA</nlm:aff>
</affiliation>
</author>
<author><name sortKey="Van Baren, Marijke J" sort="Van Baren, Marijke J" uniqKey="Van Baren M" first="Marijke J." last="Van Baren">Marijke J. Van Baren</name>
<affiliation><nlm:aff id="msv122-AFF1">Monterey Bay Aquarium Research Institute (MBARI), Moss Landing, CA</nlm:aff>
</affiliation>
</author>
<author><name sortKey="Sudek, Lisa" sort="Sudek, Lisa" uniqKey="Sudek L" first="Lisa" last="Sudek">Lisa Sudek</name>
<affiliation><nlm:aff id="msv122-AFF1">Monterey Bay Aquarium Research Institute (MBARI), Moss Landing, CA</nlm:aff>
</affiliation>
</author>
<author><name sortKey="Ares, Manuel" sort="Ares, Manuel" uniqKey="Ares M" first="Manuel" last="Ares">Manuel Ares</name>
<affiliation><nlm:aff id="msv122-AFF3">Department of Molecular, Cell &amp; Developmental Biology, University of California Santa Cruz</nlm:aff>
</affiliation>
</author>
<author><name sortKey="Worden, Alexandra Z" sort="Worden, Alexandra Z" uniqKey="Worden A" first="Alexandra Z." last="Worden">Alexandra Z. Worden</name>
<affiliation><nlm:aff id="msv122-AFF1">Monterey Bay Aquarium Research Institute (MBARI), Moss Landing, CA</nlm:aff>
</affiliation>
<affiliation><nlm:aff id="msv122-AFF2">Department of Ocean Sciences, University of California Santa Cruz</nlm:aff>
</affiliation>
<affiliation><nlm:aff id="msv122-AFF4">Integrated Microbial Biodiversity Program, Canadian Institute for Advanced Research, Toronto, ON, Canada</nlm:aff>
</affiliation>
</author>
</analytic>
<series><title level="j">Molecular Biology and Evolution</title>
<idno type="ISSN">0737-4038</idno>
<idno type="eISSN">1537-1719</idno>
<imprint><date when="2015">2015</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
</fileDesc>
<profileDesc><textClass></textClass>
</profileDesc>
</teiHeader>
<front><div type="abstract" xml:lang="en"><p>Spliceosomal introns are a hallmark of eukaryotic genes that are hypothesized to play important roles in genome evolution but have poorly understood origins. Although most introns lack sequence homology to each other, new families of spliceosomal introns that are repeated hundreds of times in individual genomes have recently been discovered in a few organisms. The prevalence and conservation of these introner elements (IEs) or introner-like elements in other taxa, as well as their evolutionary relationships to regular spliceosomal introns, are still unknown. Here, we systematically investigate introns in the widespread marine green alga <italic>Micromonas</italic>
 and report new families of IEs, numerous intron presence–absence polymorphisms, and potential intron insertion hot-spots. The new families enabled identification of conserved IE secondary structure features and establishment of a novel general model for repetitive intron proliferation across genomes. Despite shared secondary structure, the IE families from each <italic>Micromonas</italic>
 lineage bear no obvious sequence similarity to those in the other lineages, suggesting that their appearance is intimately linked with the process of speciation. Two of the new IE families come from an Arctic culture (<italic>Micromonas</italic>
 Clade E2) isolated from a polar region where abundance of this alga is increasing due to climate-induced changes. The same two families were detected in metagenomic data from Antarctica, a system where <italic>Micromonas</italic>
 has never before been reported. Strikingly high identity between the Arctic isolate and Antarctic coding sequences that flank the IEs suggests connectivity between populations in the two polar systems that we postulate occurs through deep-sea currents. Recovery of Clade E2 sequences in North Atlantic Deep Waters beneath the Gulf Stream supports this hypothesis. Our research illuminates the dynamic relationships between an unusual class of repetitive introns, genome evolution, speciation, and global distribution of this sentinel marine alga.</p>
</div>
</front>
<back><div1 type="bibliography"><listBibl><biblStruct><analytic><author><name sortKey="Aguilera, A" uniqKey="Aguilera A">A Aguilera</name>
</author>
<author><name sortKey="Garcia Muse, T" uniqKey="Garcia Muse T">T Garcia-Muse</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Altschul, Sf" uniqKey="Altschul S">SF Altschul</name>
</author>
<author><name sortKey="Gish, W" uniqKey="Gish W">W Gish</name>
</author>
<author><name sortKey="Miller, W" uniqKey="Miller W">W Miller</name>
</author>
<author><name sortKey="Myers, Ew" uniqKey="Myers E">EW Myers</name>
</author>
<author><name sortKey="Lipman, Dj" uniqKey="Lipman D">DJ Lipman</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Altschul, Sf" uniqKey="Altschul S">SF Altschul</name>
</author>
<author><name sortKey="Madden, Tl" uniqKey="Madden T">TL Madden</name>
</author>
<author><name sortKey="Schaffer, Aa" uniqKey="Schaffer A">AA Schaffer</name>
</author>
<author><name sortKey="Zhang, J" uniqKey="Zhang J">J Zhang</name>
</author>
<author><name sortKey="Zhang, Z" uniqKey="Zhang Z">Z Zhang</name>
</author>
<author><name sortKey="Miller, W" uniqKey="Miller W">W Miller</name>
</author>
<author><name sortKey="Lipman, Dj" uniqKey="Lipman D">DJ Lipman</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Bailey, Tl" uniqKey="Bailey T">TL Bailey</name>
</author>
<author><name sortKey="Elkan, C" uniqKey="Elkan C">C Elkan</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Blanc, G" uniqKey="Blanc G">G Blanc</name>
</author>
<author><name sortKey="Duncan, G" uniqKey="Duncan G">G Duncan</name>
</author>
<author><name sortKey="Agarkova, I" uniqKey="Agarkova I">I Agarkova</name>
</author>
<author><name sortKey="Borodovsky, M" uniqKey="Borodovsky M">M Borodovsky</name>
</author>
<author><name sortKey="Gurnon, J" uniqKey="Gurnon J">J Gurnon</name>
</author>
<author><name sortKey="Kuo, A" uniqKey="Kuo A">A Kuo</name>
</author>
<author><name sortKey="Lindquist, E" uniqKey="Lindquist E">E Lindquist</name>
</author>
<author><name sortKey="Lucas, S" uniqKey="Lucas S">S Lucas</name>
</author>
<author><name sortKey="Pangilinan, J" uniqKey="Pangilinan J">J Pangilinan</name>
</author>
<author><name sortKey="Polle, J" uniqKey="Polle J">J Polle</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Broecker, Ws" uniqKey="Broecker W">WS Broecker</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Brogna, S" uniqKey="Brogna S">S Brogna</name>
</author>
<author><name sortKey="Wen, J" uniqKey="Wen J">J Wen</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Chan, Ya" uniqKey="Chan Y">YA Chan</name>
</author>
<author><name sortKey="Hieter, P" uniqKey="Hieter P">P Hieter</name>
</author>
<author><name sortKey="Stirling, Pc" uniqKey="Stirling P">PC Stirling</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Crooks, Ge" uniqKey="Crooks G">GE Crooks</name>
</author>
<author><name sortKey="Hon, G" uniqKey="Hon G">G Hon</name>
</author>
<author><name sortKey="Chandonia, Jm" uniqKey="Chandonia J">JM Chandonia</name>
</author>
<author><name sortKey="Brenner, Se" uniqKey="Brenner S">SE Brenner</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Csuros, M" uniqKey="Csuros M">M Csuros</name>
</author>
<author><name sortKey="Rogozin, Ib" uniqKey="Rogozin I">IB Rogozin</name>
</author>
<author><name sortKey="Koonin, Ev" uniqKey="Koonin E">EV Koonin</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Curtis, Ba" uniqKey="Curtis B">BA Curtis</name>
</author>
<author><name sortKey="Archibald, Jm" uniqKey="Archibald J">JM Archibald</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Davis, Lg" uniqKey="Davis L">LG Davis</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="De Wit, Pj" uniqKey="De Wit P">PJ de Wit</name>
</author>
<author><name sortKey="Van Der Burgt, A" uniqKey="Van Der Burgt A">A van der Burgt</name>
</author>
<author><name sortKey="Okmen, B" uniqKey="Okmen B">B Okmen</name>
</author>
<author><name sortKey="Stergiopoulos, I" uniqKey="Stergiopoulos I">I Stergiopoulos</name>
</author>
<author><name sortKey="Abd Elsalam, Ka" uniqKey="Abd Elsalam K">KA Abd-Elsalam</name>
</author>
<author><name sortKey="Aerts, Al" uniqKey="Aerts A">AL Aerts</name>
</author>
<author><name sortKey="Bahkali, Ah" uniqKey="Bahkali A">AH Bahkali</name>
</author>
<author><name sortKey="Beenen, Hg" uniqKey="Beenen H">HG Beenen</name>
</author>
<author><name sortKey="Chettri, P" uniqKey="Chettri P">P Chettri</name>
</author>
<author><name sortKey="Cox, Mp" uniqKey="Cox M">MP Cox</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Denoeud, F" uniqKey="Denoeud F">F Denoeud</name>
</author>
<author><name sortKey="Henriet, S" uniqKey="Henriet S">S Henriet</name>
</author>
<author><name sortKey="Mungpakdee, S" uniqKey="Mungpakdee S">S Mungpakdee</name>
</author>
<author><name sortKey="Aury, Jm" uniqKey="Aury J">JM Aury</name>
</author>
<author><name sortKey="Da Silva, C" uniqKey="Da Silva C">C Da Silva</name>
</author>
<author><name sortKey="Brinkmann, H" uniqKey="Brinkmann H">H Brinkmann</name>
</author>
<author><name sortKey="Mikhaleva, J" uniqKey="Mikhaleva J">J Mikhaleva</name>
</author>
<author><name sortKey="Olsen, Lc" uniqKey="Olsen L">LC Olsen</name>
</author>
<author><name sortKey="Jubin, C" uniqKey="Jubin C">C Jubin</name>
</author>
<author><name sortKey="Canestro, C" uniqKey="Canestro C">C Canestro</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Dickson, L" uniqKey="Dickson L">L Dickson</name>
</author>
<author><name sortKey="Huang, Hr" uniqKey="Huang H">HR Huang</name>
</author>
<author><name sortKey="Liu, L" uniqKey="Liu L">L Liu</name>
</author>
<author><name sortKey="Matsuura, M" uniqKey="Matsuura M">M Matsuura</name>
</author>
<author><name sortKey="Lambowitz, Am" uniqKey="Lambowitz A">AM Lambowitz</name>
</author>
<author><name sortKey="Perlman, Ps" uniqKey="Perlman P">PS Perlman</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Eskes, R" uniqKey="Eskes R">R Eskes</name>
</author>
<author><name sortKey="Liu, L" uniqKey="Liu L">L Liu</name>
</author>
<author><name sortKey="Ma, Hw" uniqKey="Ma H">HW Ma</name>
</author>
<author><name sortKey="Chao, My" uniqKey="Chao M">MY Chao</name>
</author>
<author><name sortKey="Dickson, L" uniqKey="Dickson L">L Dickson</name>
</author>
<author><name sortKey="Lambowitz, Am" uniqKey="Lambowitz A">AM Lambowitz</name>
</author>
<author><name sortKey="Perlman, Ps" uniqKey="Perlman P">PS Perlman</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Fink, Gr" uniqKey="Fink G">GR Fink</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Foulon, E" uniqKey="Foulon E">E Foulon</name>
</author>
<author><name sortKey="Not, F" uniqKey="Not F">F Not</name>
</author>
<author><name sortKey="Jalabert, F" uniqKey="Jalabert F">F Jalabert</name>
</author>
<author><name sortKey="Cariou, T" uniqKey="Cariou T">T Cariou</name>
</author>
<author><name sortKey="Massana, R" uniqKey="Massana R">R Massana</name>
</author>
<author><name sortKey="Simon, N" uniqKey="Simon N">N Simon</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Fulford Smith, Sp" uniqKey="Fulford Smith S">SP Fulford-Smith</name>
</author>
<author><name sortKey="Sikes, El" uniqKey="Sikes E">EL Sikes</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Gilbert, W" uniqKey="Gilbert W">W Gilbert</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Goodwin, Sb" uniqKey="Goodwin S">SB Goodwin</name>
</author>
<author><name sortKey="M Arek, Sb" uniqKey="M Arek S">SB M’Barek</name>
</author>
<author><name sortKey="Dhillon, B" uniqKey="Dhillon B">B Dhillon</name>
</author>
<author><name sortKey="Wittenberg, Ah" uniqKey="Wittenberg A">AH Wittenberg</name>
</author>
<author><name sortKey="Crane, Cf" uniqKey="Crane C">CF Crane</name>
</author>
<author><name sortKey="Hane, Jk" uniqKey="Hane J">JK Hane</name>
</author>
<author><name sortKey="Foster, Aj" uniqKey="Foster A">AJ Foster</name>
</author>
<author><name sortKey="Van Der Lee, Ta" uniqKey="Van Der Lee T">TA Van der Lee</name>
</author>
<author><name sortKey="Grimwood, J" uniqKey="Grimwood J">J Grimwood</name>
</author>
<author><name sortKey="Aerts, A" uniqKey="Aerts A">A Aerts</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Huang, S" uniqKey="Huang S">S Huang</name>
</author>
<author><name sortKey="Chen, Z" uniqKey="Chen Z">Z Chen</name>
</author>
<author><name sortKey="Yan, X" uniqKey="Yan X">X Yan</name>
</author>
<author><name sortKey="Yu, T" uniqKey="Yu T">T Yu</name>
</author>
<author><name sortKey="Huang, G" uniqKey="Huang G">G Huang</name>
</author>
<author><name sortKey="Yan, Q" uniqKey="Yan Q">Q Yan</name>
</author>
<author><name sortKey="Pontarotti, Pa" uniqKey="Pontarotti P">PA Pontarotti</name>
</author>
<author><name sortKey="Zhao, H" uniqKey="Zhao H">H Zhao</name>
</author>
<author><name sortKey="Li, J" uniqKey="Li J">J Li</name>
</author>
<author><name sortKey="Yang, P" uniqKey="Yang P">P Yang</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Jobb, G" uniqKey="Jobb G">G Jobb</name>
</author>
<author><name sortKey="Von Haeseler, A" uniqKey="Von Haeseler A">A von Haeseler</name>
</author>
<author><name sortKey="Strimmer, K" uniqKey="Strimmer K">K Strimmer</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Katoh, K" uniqKey="Katoh K">K Katoh</name>
</author>
<author><name sortKey="Kuma, K" uniqKey="Kuma K">K Kuma</name>
</author>
<author><name sortKey="Toh, H" uniqKey="Toh H">H Toh</name>
</author>
<author><name sortKey="Miyata, T" uniqKey="Miyata T">T Miyata</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Keeling, Pj" uniqKey="Keeling P">PJ Keeling</name>
</author>
<author><name sortKey="Burki, F" uniqKey="Burki F">F Burki</name>
</author>
<author><name sortKey="Wilcox, Hm" uniqKey="Wilcox H">HM Wilcox</name>
</author>
<author><name sortKey="Allam, B" uniqKey="Allam B">B Allam</name>
</author>
<author><name sortKey="Allen, Ee" uniqKey="Allen E">EE Allen</name>
</author>
<author><name sortKey="Amaral Zettler, La" uniqKey="Amaral Zettler L">LA Amaral-Zettler</name>
</author>
<author><name sortKey="Armbrust, Ev" uniqKey="Armbrust E">EV Armbrust</name>
</author>
<author><name sortKey="Archibald, Jm" uniqKey="Archibald J">JM Archibald</name>
</author>
<author><name sortKey="Bharti, Ak" uniqKey="Bharti A">AK Bharti</name>
</author>
<author><name sortKey="Bell, Cj" uniqKey="Bell C">CJ Bell</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Kilias, Es" uniqKey="Kilias E">ES Kilias</name>
</author>
<author><name sortKey="Nothig, E M" uniqKey="Nothig E">E-M Nöthig</name>
</author>
<author><name sortKey="Wolf, C" uniqKey="Wolf C">C Wolf</name>
</author>
<author><name sortKey="Metfies, K" uniqKey="Metfies K">K Metfies</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Koonin, Ev" uniqKey="Koonin E">EV Koonin</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Li, W" uniqKey="Li W">W Li</name>
</author>
<author><name sortKey="Tucker, Ae" uniqKey="Tucker A">AE Tucker</name>
</author>
<author><name sortKey="Sung, W" uniqKey="Sung W">W Sung</name>
</author>
<author><name sortKey="Thomas, Wk" uniqKey="Thomas W">WK Thomas</name>
</author>
<author><name sortKey="Lynch, M" uniqKey="Lynch M">M Lynch</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Li, Wkw" uniqKey="Li W">WKW Li</name>
</author>
<author><name sortKey="Mclaughlin, Fa" uniqKey="Mclaughlin F">FA McLaughlin</name>
</author>
<author><name sortKey="Lovejoy, C" uniqKey="Lovejoy C">C Lovejoy</name>
</author>
<author><name sortKey="Carmack, Ec" uniqKey="Carmack E">EC Carmack</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Llopart, A" uniqKey="Llopart A">A Llopart</name>
</author>
<author><name sortKey="Comeron, Jm" uniqKey="Comeron J">JM Comeron</name>
</author>
<author><name sortKey="Brunet, Fg" uniqKey="Brunet F">FG Brunet</name>
</author>
<author><name sortKey="Lachaise, D" uniqKey="Lachaise D">D Lachaise</name>
</author>
<author><name sortKey="Long, M" uniqKey="Long M">M Long</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Lovejoy, C" uniqKey="Lovejoy C">C Lovejoy</name>
</author>
<author><name sortKey="Vincent, Wf" uniqKey="Vincent W">WF Vincent</name>
</author>
<author><name sortKey="Bonilla, S" uniqKey="Bonilla S">S Bonilla</name>
</author>
<author><name sortKey="Roy, S" uniqKey="Roy S">S Roy</name>
</author>
<author><name sortKey="Martineau, Mj" uniqKey="Martineau M">MJ Martineau</name>
</author>
<author><name sortKey="Terrado, R" uniqKey="Terrado R">R Terrado</name>
</author>
<author><name sortKey="Potvin, M" uniqKey="Potvin M">M Potvin</name>
</author>
<author><name sortKey="Massana, R" uniqKey="Massana R">R Massana</name>
</author>
<author><name sortKey="Pedros Alio, C" uniqKey="Pedros Alio C">C Pedros-Alio</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Lynch, M" uniqKey="Lynch M">M Lynch</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Marin, B" uniqKey="Marin B">B Marin</name>
</author>
<author><name sortKey="Melkonian, M" uniqKey="Melkonian M">M Melkonian</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Mcrose, D" uniqKey="Mcrose D">D McRose</name>
</author>
<author><name sortKey="Guo, J" uniqKey="Guo J">J Guo</name>
</author>
<author><name sortKey="Monier, A" uniqKey="Monier A">A Monier</name>
</author>
<author><name sortKey="Sudek, S" uniqKey="Sudek S">S Sudek</name>
</author>
<author><name sortKey="Wilken, S" uniqKey="Wilken S">S Wilken</name>
</author>
<author><name sortKey="Yan, S" uniqKey="Yan S">S Yan</name>
</author>
<author><name sortKey="Mock, T" uniqKey="Mock T">T Mock</name>
</author>
<author><name sortKey="Archibald, Jm" uniqKey="Archibald J">JM Archibald</name>
</author>
<author><name sortKey="Begley, Tp" uniqKey="Begley T">TP Begley</name>
</author>
<author><name sortKey="Reyes Prieto, A" uniqKey="Reyes Prieto A">A Reyes-Prieto</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Modrek, B" uniqKey="Modrek B">B Modrek</name>
</author>
<author><name sortKey="Lee, C" uniqKey="Lee C">C Lee</name>
</author>
</analytic>
</biblStruct>
<biblStruct></biblStruct>
<biblStruct><analytic><author><name sortKey="Monier, A" uniqKey="Monier A">A Monier</name>
</author>
<author><name sortKey="Sudek, S" uniqKey="Sudek S">S Sudek</name>
</author>
<author><name sortKey="Fast, Nm" uniqKey="Fast N">NM Fast</name>
</author>
<author><name sortKey="Worden, Az" uniqKey="Worden A">AZ Worden</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Moore, Mj" uniqKey="Moore M">MJ Moore</name>
</author>
<author><name sortKey="Sharp, Pa" uniqKey="Sharp P">PA Sharp</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Morozov, Eg" uniqKey="Morozov E">EG Morozov</name>
</author>
<author><name sortKey="Demidov, An" uniqKey="Demidov A">AN Demidov</name>
</author>
<author><name sortKey="Tarakanov, Ry" uniqKey="Tarakanov R">RY Tarakanov</name>
</author>
<author><name sortKey="Zenk, W" uniqKey="Zenk W">W Zenk</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Morrison, Ak" uniqKey="Morrison A">AK Morrison</name>
</author>
<author><name sortKey="Frolicher, Tl" uniqKey="Frolicher T">TL Frölicher</name>
</author>
<author><name sortKey="Sarmiento, Jl" uniqKey="Sarmiento J">JL Sarmiento</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Parra, G" uniqKey="Parra G">G Parra</name>
</author>
<author><name sortKey="Bradnam, K" uniqKey="Bradnam K">K Bradnam</name>
</author>
<author><name sortKey="Rose, Ab" uniqKey="Rose A">AB Rose</name>
</author>
<author><name sortKey="Korf, I" uniqKey="Korf I">I Korf</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Philippe, H" uniqKey="Philippe H">H Philippe</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Posada, D" uniqKey="Posada D">D Posada</name>
</author>
<author><name sortKey="Crandall, Ka" uniqKey="Crandall K">KA Crandall</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Rogozin, Ib" uniqKey="Rogozin I">IB Rogozin</name>
</author>
<author><name sortKey="Carmel, L" uniqKey="Carmel L">L Carmel</name>
</author>
<author><name sortKey="Csuros, M" uniqKey="Csuros M">M Csuros</name>
</author>
<author><name sortKey="Koonin, Ev" uniqKey="Koonin E">EV Koonin</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Ronquist, F" uniqKey="Ronquist F">F Ronquist</name>
</author>
<author><name sortKey="Teslenko, M" uniqKey="Teslenko M">M Teslenko</name>
</author>
<author><name sortKey="Van Der Mark, P" uniqKey="Van Der Mark P">P van der Mark</name>
</author>
<author><name sortKey="Ayres, Dl" uniqKey="Ayres D">DL Ayres</name>
</author>
<author><name sortKey="Darling, A" uniqKey="Darling A">A Darling</name>
</author>
<author><name sortKey="Hohna, S" uniqKey="Hohna S">S Hohna</name>
</author>
<author><name sortKey="Larget, B" uniqKey="Larget B">B Larget</name>
</author>
<author><name sortKey="Liu, L" uniqKey="Liu L">L Liu</name>
</author>
<author><name sortKey="Suchard, Ma" uniqKey="Suchard M">MA Suchard</name>
</author>
<author><name sortKey="Huelsenbeck, Jp" uniqKey="Huelsenbeck J">JP Huelsenbeck</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Roy, Sw" uniqKey="Roy S">SW Roy</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Roy, Sw" uniqKey="Roy S">SW Roy</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Roy, Sw" uniqKey="Roy S">SW Roy</name>
</author>
<author><name sortKey="Gilbert, W" uniqKey="Gilbert W">W Gilbert</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Slapeta, J" uniqKey="Slapeta J">J Slapeta</name>
</author>
<author><name sortKey="Lopez Garcia, P" uniqKey="Lopez Garcia P">P Lopez-Garcia</name>
</author>
<author><name sortKey="Moreira, D" uniqKey="Moreira D">D Moreira</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Storici, F" uniqKey="Storici F">F Storici</name>
</author>
<author><name sortKey="Bebenek, K" uniqKey="Bebenek K">K Bebenek</name>
</author>
<author><name sortKey="Kunkel, Ta" uniqKey="Kunkel T">TA Kunkel</name>
</author>
<author><name sortKey="Gordenin, Da" uniqKey="Gordenin D">DA Gordenin</name>
</author>
<author><name sortKey="Resnick, Ma" uniqKey="Resnick M">MA Resnick</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Sun, S" uniqKey="Sun S">S Sun</name>
</author>
<author><name sortKey="Chen, J" uniqKey="Chen J">J Chen</name>
</author>
<author><name sortKey="Li, W" uniqKey="Li W">W Li</name>
</author>
<author><name sortKey="Altintas, I" uniqKey="Altintas I">I Altintas</name>
</author>
<author><name sortKey="Lin, A" uniqKey="Lin A">A Lin</name>
</author>
<author><name sortKey="Peltier, S" uniqKey="Peltier S">S Peltier</name>
</author>
<author><name sortKey="Stocks, K" uniqKey="Stocks K">K Stocks</name>
</author>
<author><name sortKey="Allen, Ee" uniqKey="Allen E">EE Allen</name>
</author>
<author><name sortKey="Ellisman, M" uniqKey="Ellisman M">M Ellisman</name>
</author>
<author><name sortKey="Grethe, J" uniqKey="Grethe J">J Grethe</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Sverdlov, Sv" uniqKey="Sverdlov S">SV Sverdlov</name>
</author>
<author><name sortKey="Rogozin, Ib" uniqKey="Rogozin I">IB Rogozin</name>
</author>
<author><name sortKey="Babenko, Vn" uniqKey="Babenko V">VN Babenko</name>
</author>
<author><name sortKey="Koonin, Ev" uniqKey="Koonin E">EV Koonin</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Talley, Ld" uniqKey="Talley L">LD Talley</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Torriani, Sf" uniqKey="Torriani S">SF Torriani</name>
</author>
<author><name sortKey="Stukenbrock, Eh" uniqKey="Stukenbrock E">EH Stukenbrock</name>
</author>
<author><name sortKey="Brunner, Pc" uniqKey="Brunner P">PC Brunner</name>
</author>
<author><name sortKey="Mcdonald, Ba" uniqKey="Mcdonald B">BA McDonald</name>
</author>
<author><name sortKey="Croll, D" uniqKey="Croll D">D Croll</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Tseng, Ck" uniqKey="Tseng C">CK Tseng</name>
</author>
<author><name sortKey="Cheng, Sc" uniqKey="Cheng S">SC Cheng</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Tseng, Ck" uniqKey="Tseng C">CK Tseng</name>
</author>
<author><name sortKey="Cheng, Sc" uniqKey="Cheng S">SC Cheng</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Van Der Burgt, A" uniqKey="Van Der Burgt A">A van der Burgt</name>
</author>
<author><name sortKey="Severing, E" uniqKey="Severing E">E Severing</name>
</author>
<author><name sortKey="De Wit, Pj" uniqKey="De Wit P">PJ de Wit</name>
</author>
<author><name sortKey="Collemare, J" uniqKey="Collemare J">J Collemare</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Verhelst, B" uniqKey="Verhelst B">B Verhelst</name>
</author>
<author><name sortKey="Van De Peer, Y" uniqKey="Van De Peer Y">Y Van de Peer</name>
</author>
<author><name sortKey="Rouze, P" uniqKey="Rouze P">P Rouze</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Worden, Az" uniqKey="Worden A">AZ Worden</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Worden, Az" uniqKey="Worden A">AZ Worden</name>
</author>
<author><name sortKey="Lee, Jh" uniqKey="Lee J">JH Lee</name>
</author>
<author><name sortKey="Mock, T" uniqKey="Mock T">T Mock</name>
</author>
<author><name sortKey="Rouze, P" uniqKey="Rouze P">P Rouze</name>
</author>
<author><name sortKey="Simmons, Mp" uniqKey="Simmons M">MP Simmons</name>
</author>
<author><name sortKey="Aerts, Al" uniqKey="Aerts A">AL Aerts</name>
</author>
<author><name sortKey="Allen, Ae" uniqKey="Allen A">AE Allen</name>
</author>
<author><name sortKey="Cuvelier, Ml" uniqKey="Cuvelier M">ML Cuvelier</name>
</author>
<author><name sortKey="Derelle, E" uniqKey="Derelle E">E Derelle</name>
</author>
<author><name sortKey="Everett, Mv" uniqKey="Everett M">MV Everett</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Worden, Az" uniqKey="Worden A">AZ Worden</name>
</author>
<author><name sortKey="Nolan, Jk" uniqKey="Nolan J">JK Nolan</name>
</author>
<author><name sortKey="Palenik, B" uniqKey="Palenik B">B Palenik</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Yenerall, P" uniqKey="Yenerall P">P Yenerall</name>
</author>
<author><name sortKey="Zhou, L" uniqKey="Zhou L">L Zhou</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Zimmerly, S" uniqKey="Zimmerly S">S Zimmerly</name>
</author>
<author><name sortKey="Guo, H" uniqKey="Guo H">H Guo</name>
</author>
<author><name sortKey="Eskest, R" uniqKey="Eskest R">R Eskest</name>
</author>
<author><name sortKey="Yang, J" uniqKey="Yang J">J Yang</name>
</author>
<author><name sortKey="Perlman, Ps" uniqKey="Perlman P">PS Perlman</name>
</author>
<author><name sortKey="Lambowitz, Am" uniqKey="Lambowitz A">AM Lambowitz</name>
</author>
</analytic>
</biblStruct>
<biblStruct><analytic><author><name sortKey="Zuker, M" uniqKey="Zuker M">M Zuker</name>
</author>
</analytic>
</biblStruct>
</listBibl>
</div1>
</back>
</TEI>
<pmc article-type="research-article"><pmc-dir>properties open_access</pmc-dir>
  <front><journal-meta><journal-id journal-id-type="nlm-ta">Mol Biol Evol</journal-id>
<journal-id journal-id-type="iso-abbrev">Mol. Biol. Evol</journal-id>
<journal-id journal-id-type="publisher-id">molbev</journal-id>
<journal-id journal-id-type="hwp">molbiolevol</journal-id>
<journal-title-group><journal-title>Molecular Biology and Evolution</journal-title>
</journal-title-group>
<issn pub-type="ppub">0737-4038</issn>
<issn pub-type="epub">1537-1719</issn>
<publisher><publisher-name>Oxford University Press</publisher-name>
</publisher>
</journal-meta>
<article-meta><article-id pub-id-type="pmid">25998521</article-id>
<article-id pub-id-type="pmc">4540971</article-id>
<article-id pub-id-type="doi">10.1093/molbev/msv122</article-id>
<article-id pub-id-type="publisher-id">msv122</article-id>
<article-categories><subj-group subj-group-type="heading"><subject>Fast Track</subject>
</subj-group>
</article-categories>
<title-group><article-title>Intron Invasions Trace Algal Speciation and Reveal Nearly Identical Arctic and Antarctic <italic>Micromonas</italic>
 Populations</article-title>
</title-group>
<contrib-group><contrib contrib-type="author"><name><surname>Simmons</surname>
<given-names>Melinda P.</given-names>
</name>
<xref ref-type="author-notes" rid="msv122-FN1"><sup>†</sup>
</xref>
<xref ref-type="aff" rid="msv122-AFF1"><sup>1</sup>
</xref>
<xref ref-type="aff" rid="msv122-AFF2"><sup>2</sup>
</xref>
</contrib>
<contrib contrib-type="author"><name><surname>Bachy</surname>
<given-names>Charles</given-names>
</name>
<xref ref-type="author-notes" rid="msv122-FN1"><sup>†</sup>
</xref>
<xref ref-type="aff" rid="msv122-AFF1"><sup>1</sup>
</xref>
</contrib>
<contrib contrib-type="author"><name><surname>Sudek</surname>
<given-names>Sebastian</given-names>
</name>
<xref ref-type="aff" rid="msv122-AFF1"><sup>1</sup>
</xref>
</contrib>
<contrib contrib-type="author"><name><surname>van Baren</surname>
<given-names>Marijke J.</given-names>
</name>
<xref ref-type="aff" rid="msv122-AFF1"><sup>1</sup>
</xref>
</contrib>
<contrib contrib-type="author"><name><surname>Sudek</surname>
<given-names>Lisa</given-names>
</name>
<xref ref-type="aff" rid="msv122-AFF1"><sup>1</sup>
</xref>
</contrib>
<contrib contrib-type="author"><name><surname>Ares</surname>
<given-names>Manuel</given-names>
<suffix>Jr</suffix>
</name>
<xref ref-type="aff" rid="msv122-AFF3"><sup>3</sup>
</xref>
</contrib>
<contrib contrib-type="author"><name><surname>Worden</surname>
<given-names>Alexandra Z.</given-names>
</name>
<xref ref-type="corresp" rid="msv122-COR1">*</xref>
<xref ref-type="aff" rid="msv122-AFF1"><sup>1</sup>
</xref>
<xref ref-type="aff" rid="msv122-AFF2"><sup>2</sup>
</xref>
<xref ref-type="aff" rid="msv122-AFF4"><sup>4</sup>
</xref>
</contrib>
<aff id="msv122-AFF1"><sup>1</sup>
Monterey Bay Aquarium Research Institute (MBARI), Moss Landing, CA</aff>
<aff id="msv122-AFF2"><sup>2</sup>
Department of Ocean Sciences, University of California Santa Cruz</aff>
<aff id="msv122-AFF3"><sup>3</sup>
Department of Molecular, Cell &amp; Developmental Biology, University of California Santa Cruz</aff>
<aff id="msv122-AFF4"><sup>4</sup>
Integrated Microbial Biodiversity Program, Canadian Institute for Advanced Research, Toronto, ON, Canada</aff>
</contrib-group>
<author-notes><fn id="msv122-FN1"><p><sup>†</sup>
These authors contributed equally to this work.</p>
</fn>
<corresp id="msv122-COR1"><bold>*Corresponding author:</bold>
 E-mail: <email>azworden@mbari.org</email>
.</corresp>
<fn id="msv122-FN2"><p><bold>Associate editor:</bold>
 Hongzhi Kong</p>
</fn>
</author-notes>
<pub-date pub-type="ppub"><month>9</month>
<year>2015</year>
</pub-date>
<pub-date pub-type="epub"><day>20</day>
<month>5</month>
<year>2015</year>
</pub-date>
<pub-date pub-type="pmc-release"><day>20</day>
<month>5</month>
<year>2015</year>
</pub-date>
<pmc-comment> PMC Release delay is 0 months and 0 days and was based on the . </pmc-comment>
      <volume>32</volume>
<issue>9</issue>
<fpage>2219</fpage>
<lpage>2235</lpage>
<permissions><copyright-statement>© The Author 2015. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.</copyright-statement>
<copyright-year>2015</copyright-year>
<license xlink:href="http://creativecommons.org/licenses/by-nc/4.0/" license-type="creative-commons"><license-p>This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (<ext-link ext-link-type="uri" xlink:href="http://creativecommons.org/licenses/by-nc/4.0/">http://creativecommons.org/licenses/by-nc/4.0/</ext-link>
), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com</license-p>
</license>
</permissions>
<abstract><p>Spliceosomal introns are a hallmark of eukaryotic genes that are hypothesized to play important roles in genome evolution but have poorly understood origins. Although most introns lack sequence homology to each other, new families of spliceosomal introns that are repeated hundreds of times in individual genomes have recently been discovered in a few organisms. The prevalence and conservation of these introner elements (IEs) or introner-like elements in other taxa, as well as their evolutionary relationships to regular spliceosomal introns, are still unknown. Here, we systematically investigate introns in the widespread marine green alga <italic>Micromonas</italic>
 and report new families of IEs, numerous intron presence–absence polymorphisms, and potential intron insertion hot-spots. The new families enabled identification of conserved IE secondary structure features and establishment of a novel general model for repetitive intron proliferation across genomes. Despite shared secondary structure, the IE families from each <italic>Micromonas</italic>
 lineage bear no obvious sequence similarity to those in the other lineages, suggesting that their appearance is intimately linked with the process of speciation. Two of the new IE families come from an Arctic culture (<italic>Micromonas</italic>
 Clade E2) isolated from a polar region where abundance of this alga is increasing due to climate-induced changes. The same two families were detected in metagenomic data from Antarctica, a system where <italic>Micromonas</italic>
 has never before been reported. Strikingly high identity between the Arctic isolate and Antarctic coding sequences that flank the IEs suggests connectivity between populations in the two polar systems that we postulate occurs through deep-sea currents. Recovery of Clade E2 sequences in North Atlantic Deep Waters beneath the Gulf Stream supports this hypothesis. Our research illuminates the dynamic relationships between an unusual class of repetitive introns, genome evolution, speciation, and global distribution of this sentinel marine alga.</p>
</abstract>
<kwd-group><kwd>introns</kwd>
<kwd>marine algae</kwd>
<kwd>polar systems</kwd>
<kwd>phytoplankton</kwd>
<kwd>repetitive introns</kwd>
<kwd>Introner Elements</kwd>
</kwd-group>
<counts><page-count count="17"></page-count>
</counts>
</article-meta>
</front>
<body><sec sec-type="intro"><title>Introduction</title>
<p>Spliceosomal introns are distinctly eukaryotic gene features whose origins remain mysterious. Intimately linked with eukaryote evolution, introns interrupt coding information and must be removed from primary RNA transcripts by splicing. Introns and splicing are thought to provide eukaryotes with mechanisms for diversifying mRNA molecules from a gene after transcription. Mutations that create new mRNA splicing patterns can confer advantages, in particular by encoding novel proteins (<xref rid="msv122-B20" ref-type="bibr">Gilbert 1978</xref>
; <xref rid="msv122-B27" ref-type="bibr">Koonin 2006</xref>
), or even generating multiple functionally distinct protein products from an individual gene (<xref rid="msv122-B35" ref-type="bibr">Modrek and Lee 2002</xref>
). Splicing can also modulate gene expression through mechanisms such as nonsense-mediated decay and intron-mediated enhancement (<xref rid="msv122-B7" ref-type="bibr">Brogna and Wen 2009</xref>
; <xref rid="msv122-B40" ref-type="bibr">Parra et al. 2011</xref>
).</p>
<p>Although introns play influential roles in eukaryotic biology, the molecular and evolutionary processes that structure their distributions in genomes remain difficult to trace (<xref rid="msv122-B11" ref-type="bibr">Curtis and Archibald 2010</xref>
; <xref rid="msv122-B43" ref-type="bibr">Rogozin et al. 2012</xref>
). In divergent taxa such as mammals and plants, introns can be found in homologous positions of orthologous genes where they are neutrally evolving and lack sequence homology (<xref rid="msv122-B51" ref-type="bibr">Sverdlov et al. 2007</xref>
; <xref rid="msv122-B43" ref-type="bibr">Rogozin et al. 2012</xref>
). This has been taken as evidence that the last eukaryotic common ancestor (LECA) had an intron-rich genome, that introns were a part of genes since the earliest evolutionary stages of life, and that the lack of spliceosomal introns in present-day archaea and bacteria resulted from subsequent streamlining (<xref rid="msv122-B45" ref-type="bibr">Roy 2003</xref>
; <xref rid="msv122-B27" ref-type="bibr">Koonin 2006</xref>
; <xref rid="msv122-B47" ref-type="bibr">Roy and Gilbert 2006</xref>
). An alternative hypothesis holds that introns appeared and spread with the emergence of eukaryotes, with intron loss being the dominant process since descent from ancestral eukaryotes (<xref rid="msv122-B27" ref-type="bibr">Koonin 2006</xref>
; <xref rid="msv122-B46" ref-type="bibr">Roy 2006</xref>
; <xref rid="msv122-B10" ref-type="bibr">Csuros et al. 2011</xref>
).</p>
<p>Several studies on closely related taxa have recently suggested that intron gain is an important and ongoing process. These studies have uncovered situations in which only one of a pair of orthologous genes has an intron, representing either recent insertion or precise deletion of the intron (e.g., <xref rid="msv122-B30" ref-type="bibr">Llopart et al. 2002</xref>
; <xref rid="msv122-B28" ref-type="bibr">Li, Tucker, et al. 2009</xref>
). There are more than two dozen such intron presence–absence polymorphisms between two isolates of the crustacean <italic>Daphnia pulex</italic>
, suggesting that parallel intron gains have recently occurred at homologous positions within orthologous genes (<xref rid="msv122-B28" ref-type="bibr">Li, Tucker, et al. 2009</xref>
). Short repeats (5–12 nt) flanking the inserted introns led to the hypothesis that polymorphic <italic>D. pulex</italic>
 introns are derived from double-strand break repair (<xref rid="msv122-B28" ref-type="bibr">Li, Tucker, et al. 2009</xref>
), although the origin of the donor intron sequence (needed for gain) is not clear.</p>
<p>As genomes from closely related organisms are sequenced, examples of new intron types are also emerging. These involve intron presence–absence polymorphisms where identical or nearly identical introns are present in one genome but are seemingly absent from related taxa, suggesting that these repetitive introns act as transposable elements to propagate across a given genome (<xref rid="msv122-B59" ref-type="bibr">Worden et al. 2009</xref>
). Originally reported in the unicellular green alga <italic>Micromonas</italic>
, an ecologically important genus of marine prasinophyte algae, these repeated intron families (introner elements, IEs) have three properties in common (<xref rid="msv122-B59" ref-type="bibr">Worden et al. 2009</xref>
). First, members of a single IE family have highly similar sequences; for example, a family called IE3 (<xref rid="msv122-B59" ref-type="bibr">Worden et al. 2009</xref>
) has 32 members with identical nucleotide sequences scattered about the <italic>Micromonas pusilla</italic>
 CCMP1545 genome, and many more members that differ only at a single position. Second, each individual IE resides within a transcription unit in the sense orientation, and is removed after transcription by the spliceosome during mRNA processing. Third, IEs display intron presence–absence patterns characteristic of intron gain by repeat expansion in the genomes where they are abundant. Examples of repetitive introns also appear in the larvacean tunicate <italic>Oikopleura dioica</italic>
 (<xref rid="msv122-B14" ref-type="bibr">Denoeud et al. 2010</xref>
), and a number of terrestrial fungi (<xref rid="msv122-B53" ref-type="bibr">Torriani et al. 2011</xref>
; <xref rid="msv122-B56" ref-type="bibr">van der Burgt et al. 2012</xref>
). Notably, no repetitive intron family described thus far appears to encode a protein that could promote selective reverse splicing or transposition of these unusual introns.</p>
<p>IEs provide an interesting case study because <italic>Micromonas</italic>
 appears to have large effective population sizes with periodic isolation and reduction on short time scales (seasonal) as well as long-term isolation influenced by changes in glaciation, land mass organization, and ocean circulation. Although <italic>Micromonas</italic>
 has low intron numbers relative to other Viridiplantae, such as chlorophytes and land plants (<xref rid="msv122-B59" ref-type="bibr">Worden et al. 2009</xref>
; <xref rid="msv122-B5" ref-type="bibr">Blanc et al. 2010</xref>
), the 22 Mb genome of <italic>M. pusilla</italic>
 CCMP1545 is 1 Mb larger than that of <italic>Micromonas</italic>
 sp. RCC299 due almost entirely to the presence of four IE families (IE1–IE4) that collectively have over 6,000 members (<xref rid="msv122-B59" ref-type="bibr">Worden et al. 2009</xref>
). None of these families is found in RCC299, which instead contains a small, distinct IE family of approximately 221 members (<xref rid="msv122-B57" ref-type="bibr">Verhelst et al. 2013</xref>
). These two isolates share at most 90% of their protein-encoding genes, and represent two of six known <italic>Micromonas</italic>
 clades, each thought to represent different species (<xref rid="msv122-B48" ref-type="bibr">Slapeta et al. 2006</xref>
; <xref rid="msv122-B58" ref-type="bibr">Worden 2006</xref>
; <xref rid="msv122-B59" ref-type="bibr">Worden et al. 2009</xref>
). It remains an open question whether IEs are present in other <italic>Micromonas</italic>
 clades or are an atypical feature that is peculiarly abundant in the genome of CCMP1545, a North Atlantic strain isolated in the 1950s.</p>
<p>We systematically searched <italic>Micromonas</italic>
 isolates from around the world that represent the five established cultured clades to determine whether IEs are present in multiple clades. Combined with metagenomic analyses, our results reveal new IE families, expanding our understanding of these curious elements. Furthermore, we find that a newly delineated <italic>Micromonas</italic>
 clade containing the Arctic isolate CCMP2099 is widespread in the Southern Ocean, where <italic>Micromonas</italic>
 has not previously been reported. Environmental polymerase chain reaction (PCR) and cloning-based studies demonstrate the presence of this clade in the deep current that transports Arctic waters to the Southern Ocean, as well as polymorphic insertions of other IE families in Pacific Ocean populations. Our studies highlight the utility of IE families to track global distributions of <italic>Micromonas</italic>
 species. Moreover, comparison of the new IE families identified an unusual structural feature that is potentially relevant to the spread of IEs. These results lead us to propose a novel model for the mechanism of intron transposition and to postulate that IEs influence the diversification of <italic>Micromonas</italic>
 lineages.</p>
</sec>
<sec sec-type="results"><title>Results</title>
<sec><title><italic>Micromonas</italic>
 Clades Distinguished by Cultured Isolates and Environmental Clones</title>
<p>The first goal of our study was to evaluate the presence and distribution of IEs in Clade D isolates apart from CCMP1545 (representing the species <italic>M. pusilla</italic>
), as well as in other <italic>Micromonas</italic>
 clades. Because this effort relies on robust clade discrimination, <italic>Micromonas</italic>
 isolates were analyzed using new and existing 18S rRNA gene data from cultured strains and environmental clone libraries. The phylogenetic reconstruction established the existence of seven clades (<xref ref-type="fig" rid="msv122-F1">fig. 1</xref>
<italic>a</italic>
) designated here largely according to a previously established lettered naming system for cultured clades and Clade -.IV for the uncultured clade (<xref rid="msv122-B48" ref-type="bibr">Slapeta et al. 2006</xref>
; <xref rid="msv122-B58" ref-type="bibr">Worden 2006</xref>
) (<xref ref-type="table" rid="msv122-T1">table 1</xref>
, <ext-link ext-link-type="uri" xlink:href="http://mbe.oxfordjournals.org/lookup/suppl/doi:10.1093/molbev/msv122/-/DC1">supplementary table S1</ext-link>
, <ext-link ext-link-type="uri" xlink:href="http://mbe.oxfordjournals.org/lookup/suppl/doi:10.1093/molbev/msv122/-/DC1">Supplementary Material</ext-link>
 online). In the 18S rRNA gene phylogeny the key node separating <italic>Micromonas</italic>
 Clades A and B did not acquire bootstrap support (<xref ref-type="fig" rid="msv122-F1">fig. 1</xref>
<italic>a</italic>
); however, these clades were distinguished using a concatenation of four protein-encoding genes (<ext-link ext-link-type="uri" xlink:href="http://mbe.oxfordjournals.org/lookup/suppl/doi:10.1093/molbev/msv122/-/DC1">supplementary fig. S1</ext-link>
, <ext-link ext-link-type="uri" xlink:href="http://mbe.oxfordjournals.org/lookup/suppl/doi:10.1093/molbev/msv122/-/DC1">Supplementary Material</ext-link>
 online, see below). In contrast to prior phylogenies, Clade E members formed two distinct clades owing to the incorporation of new data (<xref ref-type="fig" rid="msv122-F1">fig. 1</xref>
<italic>a</italic>
 and <ext-link ext-link-type="uri" xlink:href="http://mbe.oxfordjournals.org/lookup/suppl/doi:10.1093/molbev/msv122/-/DC1">supplementary fig. S1</ext-link>
, <ext-link ext-link-type="uri" xlink:href="http://mbe.oxfordjournals.org/lookup/suppl/doi:10.1093/molbev/msv122/-/DC1">Supplementary Material</ext-link> online). We termed these Clades E1 and E2, with the latter containing the Arctic isolate CCMP2099.
<fig id="msv122-F1" orientation="portrait" position="float"><label>F<sc>ig</sc>
. 1.</label>
<caption><p>Molecular phylogeny of <italic>Micromonas</italic>
 and insertion sequences in gene homologs from cultured clades. (<italic>a</italic>
) Bayesian reconstruction of the 18S rRNA gene sequences from the Mamiellophyceae and other prasinophytes, using 1,646 unambiguously aligned positions and the GTR + Γ + I model of substitution. <italic>Micromonas</italic>
 clades (blue) are highlighted. Clade names are designated with letters, as in <xref rid="msv122-B48" ref-type="bibr">Slapeta et al. (2006)</xref>
 and roman numerals, as in <xref rid="msv122-B59" ref-type="bibr">Worden et al. (2009)</xref>
. Differentiation of Clade E.III into Clades E1 and E2 (black labeling) was achieved herein using new data. Sequences from environmental clone library studies were included for Clade -.IV (an uncultured clade) and groups with sparse representation in culture collections, such as the E2 Clade. Other widespread Mamiellophyceae genera shown, <italic>Ostreococcus</italic>
 (pink) and <italic>Bathycoccus</italic>
 (green), also have genome-sequenced representatives used in primer design for the IE PCR study. The tree is rooted by the prasinophyte <italic>Pycnococcus</italic>
-clade for display purposes. (<italic>b–e</italic>
) Architecture of amplified regions of protein-encoding genes investigated in cultured <italic>Micromonas</italic>
 clades (<xref ref-type="table" rid="msv122-T1">table 1</xref>
 ). Thick bars (blue) represent exons; vertical turquoise lines denote loci where introns are present (accompanied by thin horizontal intron lines) or absent (vertical line only). Thin horizontal lines represent Clade D IEs (yellow) and newly identified introns in Clade C (blue) and Clade E2 (red, purple) homologs of the Transporter. The first two Clade E2 introns (red) are highly similar (alignment under panel [<italic>e</italic>
 ]). Note that E1 and E2 data are lacking for the ATPase, as are Actin data for E2, presumably due to primer mismatches later identified using transcriptome assemblies.</p>
</caption>
<graphic xlink:href="msv122f1p"></graphic>
</fig>
<table-wrap id="msv122-T1" orientation="portrait" position="float"><label>Table 1.</label>
<caption><p><italic>Micromonas</italic>
 Isolates Grown and Number of Assembled Sequences Obtained from Clones for Each Gene Homolog Investigated.</p>
</caption>
<table frame="hsides" rules="groups"><thead align="left"><tr><th rowspan="1" colspan="1">Isolate</th>
<th rowspan="1" colspan="1">Clade</th>
<th rowspan="1" colspan="1">Actin</th>
<th rowspan="1" colspan="1">ATPase</th>
<th rowspan="1" colspan="1">Transporter</th>
<th rowspan="1" colspan="1">Dehydrogenase</th>
</tr>
</thead>
<tbody align="left"><tr><td rowspan="1" colspan="1">RCC299</td>
<td rowspan="1" colspan="1">A<xref ref-type="table-fn" rid="msv122-TF2"><sup>a</sup>
</xref>
</td>
<td rowspan="1" colspan="1">2</td>
<td rowspan="1" colspan="1">2</td>
<td rowspan="1" colspan="1">1<xref ref-type="table-fn" rid="msv122-TF3"><sup>b</sup>
</xref>
</td>
<td rowspan="1" colspan="1">2</td>
</tr>
<tr><td rowspan="1" colspan="1">CCMP492</td>
<td rowspan="1" colspan="1">A</td>
<td rowspan="1" colspan="1">2</td>
<td rowspan="1" colspan="1">2</td>
<td rowspan="1" colspan="1">0<xref ref-type="table-fn" rid="msv122-TF3"><sup>b</sup>
</xref>
</td>
<td rowspan="1" colspan="1">2</td>
</tr>
<tr><td rowspan="1" colspan="1">CCMP1764</td>
<td rowspan="1" colspan="1">B</td>
<td rowspan="1" colspan="1">2</td>
<td rowspan="1" colspan="1">2</td>
<td rowspan="1" colspan="1">2</td>
<td rowspan="1" colspan="1">2</td>
</tr>
<tr><td rowspan="1" colspan="1">NEPCC29</td>
<td rowspan="1" colspan="1">C</td>
<td rowspan="1" colspan="1">2</td>
<td rowspan="1" colspan="1">0<xref ref-type="table-fn" rid="msv122-TF4"><sup>c</sup>
</xref>
</td>
<td rowspan="1" colspan="1">2</td>
<td rowspan="1" colspan="1">2</td>
</tr>
<tr><td rowspan="1" colspan="1">CS222</td>
<td rowspan="1" colspan="1">C</td>
<td rowspan="1" colspan="1">2</td>
<td rowspan="1" colspan="1">2</td>
<td rowspan="1" colspan="1">2</td>
<td rowspan="1" colspan="1">2</td>
</tr>
<tr><td rowspan="1" colspan="1">CCMP1195</td>
<td rowspan="1" colspan="1">C</td>
<td rowspan="1" colspan="1">2</td>
<td rowspan="1" colspan="1">2</td>
<td rowspan="1" colspan="1">2</td>
<td rowspan="1" colspan="1">2</td>
</tr>
<tr><td rowspan="1" colspan="1">CCMP490</td>
<td rowspan="1" colspan="1">D</td>
<td rowspan="1" colspan="1">2</td>
<td rowspan="1" colspan="1">2</td>
<td rowspan="1" colspan="1">2</td>
<td rowspan="1" colspan="1">2</td>
</tr>
<tr><td rowspan="1" colspan="1">CCMP1545</td>
<td rowspan="1" colspan="1">D</td>
<td rowspan="1" colspan="1">2</td>
<td rowspan="1" colspan="1">2</td>
<td rowspan="1" colspan="1">2</td>
<td rowspan="1" colspan="1">2</td>
</tr>
<tr><td rowspan="1" colspan="1">CCMP1646</td>
<td rowspan="1" colspan="1">E</td>
<td rowspan="1" colspan="1">2</td>
<td rowspan="1" colspan="1">0<xref ref-type="table-fn" rid="msv122-TF5"><sup>d</sup>
</xref>
</td>
<td rowspan="1" colspan="1">2</td>
<td rowspan="1" colspan="1">1</td>
</tr>
<tr><td rowspan="1" colspan="1">CCMP2099</td>
<td rowspan="1" colspan="1">E</td>
<td rowspan="1" colspan="1">0<sup>‡</sup>
</td>
<td rowspan="1" colspan="1">0<xref ref-type="table-fn" rid="msv122-TF5"><sup>d</sup>
</xref>
</td>
<td rowspan="1" colspan="1">2</td>
<td rowspan="1" colspan="1">2</td>
</tr>
</tbody>
</table>
<table-wrap-foot><fn id="msv122-TF1"><p>N<sc>ote</sc>
.—Clade designations based on <xref rid="msv122-B48" ref-type="bibr">Slapeta et al. (2006)</xref>
.</p>
</fn>
<fn id="msv122-TF2"><p><sup>a</sup>
RCC299 was not included in Slapeta’s analyses; therefore this assignment is based on phylogenetic analyses herein.</p>
</fn>
<fn id="msv122-TF3"><p><sup>b</sup>
The primers produced sequences from a different predicted ABC Transporter in Clade A strain CCMP492 and in RCC299; the correct RCC299 gene homolog was obtained from the sequenced genome and the CCMP492 amplicon was discarded from further analyses.</p>
</fn>
<fn id="msv122-TF4"><p><sup>c</sup>
While successful for other Clade C strains, the correct ATPase homolog was not retrieved in cloned NEPCC29 sequences.</p>
</fn>
<fn id="msv122-TF5"><p><sup>d</sup>
Comparison to transcript sequences, obtained later from Clade E2 isolate CCMP2099, revealed extensive primer mismatches for these genes, likely explaining unsuccessful PCR results.</p>
</fn>
</table-wrap-foot>
</table-wrap>
</p>
<p>We next evaluated prasinophyte orthologs of four genes carrying IEs in CCMP1545. The selected genes were identified in genome sequences available from two <italic>Micromonas</italic>
 isolates and from other members of the prasinophyte class Mamiellophyceae, specifically two <italic>Ostreococcus</italic>
 strains (<xref ref-type="fig" rid="msv122-F1">fig. 1</xref>
<italic>a</italic>
). The genes encode a putative calcium ATPase (hereafter ATPase), a putative NADH dehydrogenase [ubiquinone] flavoprotein 1 (hereafter Dehydrogenase), Actin, and an ATP-binding cassette transporter (hereafter Transporter). PCR primers were designed against the available genome-sequenced prasinophytes to amplify gene regions spanning one or more known IEs in CCMP1545 (<ext-link ext-link-type="uri" xlink:href="http://mbe.oxfordjournals.org/lookup/suppl/doi:10.1093/molbev/msv122/-/DC1">supplementary table S2</ext-link>
, <ext-link ext-link-type="uri" xlink:href="http://mbe.oxfordjournals.org/lookup/suppl/doi:10.1093/molbev/msv122/-/DC1">Supplementary Material</ext-link>
 online). <italic>Micromonas</italic>
 representatives from each cultured clade were then selected and grown, and these regions were examined using PCR-based sequence data (<xref ref-type="table" rid="msv122-T1">table 1</xref>
).</p>
<p><italic>Micromonas</italic>
 Clade D representative CCMP490, isolated in the western North Atlantic, had IEs at the same insertion positions and phases as in the CCMP1545 orthologs (<xref ref-type="fig" rid="msv122-F1">fig. 1</xref>
<italic>b</italic>
–<italic>e</italic>
). Identities for IEs at homologous positions were 100% and 99% for the two Actin IE1s and 100% (IE2), 98% (IE1) and 97% (IE3) for those in the ATPase, Transporter and Dehydrogenase, respectively. Three of these five IEs were phase-0 (i.e., introns located between two codons), whereas the 5′-most IE in the Actin gene fragment was phase-1 and the Transporter IE was phase-2. Splice sites (ss) were canonical, with GT/AG ss in the Actin 3′-IE, ATPase and Dehydrogenase and GC/AG ss in the Actin 5′-IE and Transporter IE (<ext-link ext-link-type="uri" xlink:href="http://mbe.oxfordjournals.org/lookup/suppl/doi:10.1093/molbev/msv122/-/DC1">supplementary table S3</ext-link>
, <ext-link ext-link-type="uri" xlink:href="http://mbe.oxfordjournals.org/lookup/suppl/doi:10.1093/molbev/msv122/-/DC1">Supplementary Material</ext-link>
 online). However, <italic>Micromonas</italic>
 Clade D IEs were not found in Clade A, B, C, E1, or E2 orthologs (<xref ref-type="fig" rid="msv122-F1">fig. 1</xref>
<italic>b</italic>
–<italic>e</italic>
) and were therefore termed D-IEs (<xref ref-type="table" rid="msv122-T2">table 2</xref>).
<table-wrap id="msv122-T2" orientation="portrait" position="float"><label>Table 2.</label>
<caption><p>IE Families and Their Distribution in the <italic>Micromonas</italic>
 Clades.</p>
</caption>
<table frame="hsides" rules="groups"><thead align="left"><tr><th rowspan="1" colspan="1">This Study</th>
<th rowspan="1" colspan="1"><xref rid="msv122-B59" ref-type="bibr">Worden (2009)</xref>
</th>
<th rowspan="1" colspan="1"><xref rid="msv122-B57" ref-type="bibr">Verhelst (2013)</xref>
</th>
<th rowspan="1" colspan="1">Strain or Metagenomic Read</th>
</tr>
</thead>
<tbody align="left"><tr><td rowspan="1" colspan="1">D-IE1</td>
<td rowspan="1" colspan="1">IE1</td>
<td rowspan="1" colspan="1">IEA1</td>
<td rowspan="1" colspan="1">CCMP1545, CCMP490, temperate &amp; tropical metagenomes</td>
</tr>
<tr><td rowspan="1" colspan="1">D-IE2</td>
<td rowspan="1" colspan="1">IE2</td>
<td rowspan="1" colspan="1">IEA2</td>
<td rowspan="1" colspan="1">CCMP1545, CCMP490, temperate and tropical metagenomes</td>
</tr>
<tr><td rowspan="1" colspan="1">D-IE3</td>
<td rowspan="1" colspan="1">IE3</td>
<td rowspan="1" colspan="1">IEA3</td>
<td rowspan="1" colspan="1">CCMP1545, CCMP490 (metagenomes not searched)</td>
</tr>
<tr><td rowspan="1" colspan="1">D-IE4</td>
<td rowspan="1" colspan="1">IE4</td>
<td rowspan="1" colspan="1">IEA4</td>
<td rowspan="1" colspan="1">CCMP1545 (CCMP490 and metagenomes not searched)</td>
</tr>
<tr><td rowspan="1" colspan="1">Unconf.</td>
<td rowspan="1" colspan="1">Not reported</td>
<td rowspan="1" colspan="1">IEB</td>
<td rowspan="1" colspan="1">CCMP1545</td>
</tr>
<tr><td rowspan="1" colspan="1">Unconf.</td>
<td rowspan="1" colspan="1">Not reported</td>
<td rowspan="1" colspan="1">IED</td>
<td rowspan="1" colspan="1">CCMP1545</td>
</tr>
<tr><td rowspan="1" colspan="1">ABC-IE</td>
<td rowspan="1" colspan="1">Not reported</td>
<td rowspan="1" colspan="1">IEC, seen in RCC299</td>
<td rowspan="1" colspan="1">NEPCC29, CS222, CCMP1195, CCMP1764, RCC299, temperate metagenomes</td>
</tr>
<tr><td rowspan="1" colspan="1">E2-IEt1</td>
<td rowspan="1" colspan="1">Not reported</td>
<td rowspan="1" colspan="1">Not reported</td>
<td rowspan="1" colspan="1">CCMP2099, NADW, Antarctic metagenomes</td>
</tr>
<tr><td rowspan="1" colspan="1">E2-IEt2</td>
<td rowspan="1" colspan="1">Not reported</td>
<td rowspan="1" colspan="1">Not reported</td>
<td rowspan="1" colspan="1">CCMP2099, NADW, Antarctic metagenomes</td>
</tr>
</tbody>
</table>
<table-wrap-foot><fn id="msv122-TF6"><p>N<sc>ote</sc>
.—Families IEB and IED reported in <xref rid="msv122-B57" ref-type="bibr">Verhelst et al. (2013)</xref>
 are considered unconfirmed; these groups have very few members, which are highly diverged, and most are not spliced as annotated in RNA-seq data.</p>
</fn>
</table-wrap-foot>
</table-wrap>
</p>
</sec>
<sec><title>Discovery of New IE Families with Lineage-Specific Distributions</title>
<p>Although D-IEs were not present in the other <italic>Micromonas</italic>
 clades, new IEs were discovered. Three novel introns were identified in the Transporter gene of <italic>Micromonas</italic>
 Clade E2 representative CCMP2099, but not in the E1 isolate (<xref ref-type="fig" rid="msv122-F1">fig. 1</xref>
<italic>e</italic>
). These were in nonhomologous insertion positions to D-IEs from Clade D isolates. The two 5′-most introns in the CCMP2099 Transporter have 89% identity to one another (<xref ref-type="fig" rid="msv122-F1">fig. 1</xref>
<italic>e</italic>
 alignment), higher than expected for regular spliceosomal introns (RSIs). The 5′-most novel intron is phase-1, the next phase-0, and the last phase-1. The last intron, located near the 3′-end of the Transporter PCR product, is longer (185 nt) than the upstream introns (74 and 75 nt). BLASTn queries against the RCC299 and CCMP1545 genome assemblies, as well as CCMP1764 genomic DNA reads, did not recover significant hits. Thus, the newly identified CCMP2099 introns do not appear to be similar to IEs in other <italic>Micromonas</italic>
 clades with sequenced genomes.</p>
<p>Given the lack of additional genome data, it remained unclear whether the new introns were present in other CCMP2099 genes, as expected of an IE family. Therefore, environmental data were searched and multiple copies of the shorter CCMP2099 Transporter intronic sequences (type 1) were recovered in metagenomes from the Antarctic (Ace Lake, Southern Ocean, Ross Sea; <xref ref-type="fig" rid="msv122-F2">fig. 2</xref>
<italic>a</italic>
). The detected sequences interrupted multiple different protein-encoding genes according to BLASTx analyses of the metagenomic reads against NCBI’s nr database. Those with known functions included an autophagocytosis-associated protein, a putative aminopeptidase, transcriptional repressors, and an early light-induced protein (ELIP). These CCMP2099 “type 1” intronic sequences were identical or nearly identical to those in metagenomes (<xref ref-type="fig" rid="msv122-F2">fig. 2</xref>
<italic>b</italic>
 and <italic>c</italic>
) enabling the identification of a conserved intron motif using over 100 environmental sequences (<ext-link ext-link-type="uri" xlink:href="http://mbe.oxfordjournals.org/lookup/suppl/doi:10.1093/molbev/msv122/-/DC1">supplementary fig. S2<italic>a</italic>
</ext-link>
, <ext-link ext-link-type="uri" xlink:href="http://mbe.oxfordjournals.org/lookup/suppl/doi:10.1093/molbev/msv122/-/DC1">Supplementary Material</ext-link>
 online). This indicates that the two 5′-most introns (i.e., type 1) from the CCMP2099 Transporter represent an IE family present in Clade E2 that is not present in data from other <italic>Micromonas</italic> clades or in nonpolar environmental data. We therefore termed this new repetitive intron family E2-IEt1.
<fig id="msv122-F2" orientation="portrait" position="float"><label>F<sc>ig</sc>
. 2.</label>
<caption><p>The global distribution of <italic>Micromonas</italic>
 introns in available metagenomes and discovery of new IE families. (<italic>a</italic>
) Isolation sites for cultured <italic>Micromonas</italic>
 strains (circles), the sample site for environmental clone libraries generated herein (star), and sites where multiple BLASTn hits were recovered in public metagenomic data (symbols and color-codes as indicated on legend). Inset borders are color-coded to show corresponding map regions. Note that red triangles (representing E2-IEt1) lie beneath every purple triangle (E2-IEt2 sequences) and the location of the deep profile (<ext-link ext-link-type="uri" xlink:href="http://mbe.oxfordjournals.org/lookup/suppl/doi:10.1093/molbev/msv122/-/DC1">supplementary table S3</ext-link>
, <ext-link ext-link-type="uri" xlink:href="http://mbe.oxfordjournals.org/lookup/suppl/doi:10.1093/molbev/msv122/-/DC1">Supplementary Material</ext-link>
 online) is not shown. (<italic>b</italic>
) E2-IEt1 consensus sequence from Antarctic metagenomic reads encoding eight different proteins. (<italic>c</italic>
) Aligned E2-IEt1 (and 12 exonic flanking nucleotides at each end) from Antarctic reads, including two from the same gene present in different samples (2 such examples, bottom 4 E2-IEs; excluded from [<italic>b</italic>
] to avoid overrepresenting element conservation) and from the CCMP2099 Transporter gene. CCMP2099 transcript contigs (nonbold numbers) are shown beneath each DNA sequence. Regions flanking the Arctic CCMP2099 E2-IEt1.a and Antarctic metagenomic E2-IEt1.24 and E2-IEt1.25 (from different Antarctic samples) in the Transporter gene were identical, as were the E2-IEs themselves except for a single “T” at different positions in E2-IEt1.24 and E2-IEt1.25 (potentially representing 454 homopolymer accuracy issues).</p>
</caption>
<graphic xlink:href="msv122f2p"></graphic>
</fig>
</p>
<p>As seen for E2-IEt1, the distinct Clade E2 Transporter intronic sequence (type 2) garnered multiple hits in Antarctic metagenomic data, revealing another IE family (E2-IEt2; <xref ref-type="fig" rid="msv122-F2">fig. 2</xref>
<italic>a</italic>
). IE flanking regions of the recovered reads represented yet other protein-encoding genes, including a putative amidophosphoribosyl transferase, a pre-mRNA-processing-splicing factor, cytochrome P450 monooxygenase, eukaryotic translation initiation factor 6, and transcription initiation factor TFIID subunit 10. The longer E2-IEt2s averaged 186 ± 8 nt (<ext-link ext-link-type="uri" xlink:href="http://mbe.oxfordjournals.org/lookup/suppl/doi:10.1093/molbev/msv122/-/DC1">supplementary fig. S2<italic>b</italic>
</ext-link>
 and <ext-link ext-link-type="uri" xlink:href="http://mbe.oxfordjournals.org/lookup/suppl/doi:10.1093/molbev/msv122/-/DC1"><italic>c</italic>
</ext-link>
, <ext-link ext-link-type="uri" xlink:href="http://mbe.oxfordjournals.org/lookup/suppl/doi:10.1093/molbev/msv122/-/DC1">Supplementary Material</ext-link>
 online, ranging up to 207 nt) and were always geographically colocated with E2-IEt1 although the latter were found at additional sites. Detection efficiency for E2-IEt2s is likely lower than for E2-IEt1s because they are less likely to be captured completely within approximately 300-bp 454-metagenomic reads. Both E2-IE types were detected in many surface samples, but were also found below the photic zone at two sites at 330 m (at –66.57, 142.32) and 1,320 m (–67.07, 145.20, E2-IEt1 family only).</p>
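<p>The read-length argument above can be made concrete with a rough geometric sketch. It assumes fixed 300-bp reads whose start positions are uniformly distributed, and a hypothetical requirement of 12 exonic nucleotides on each side of the element for a hit to be scored; neither assumption comes from the analysis itself, and the function name is illustrative. Only the element lengths (75 and 186 nt) are taken from the text.</p>

```python
def full_capture_positions(read_len: int, element_len: int, flank: int) -> int:
    """Count read start offsets for which a read of `read_len` nt contains the
    whole element plus `flank` nt of exon on each side (0 if it cannot fit)."""
    needed = element_len + 2 * flank
    return max(read_len - needed + 1, 0)


READ_LEN = 300   # approximate 454 read length mentioned in the text
FLANK = 12       # assumed minimum exonic flank per side (hypothetical)

short_ie = full_capture_positions(READ_LEN, 75, FLANK)    # E2-IEt1-like length
long_ie = full_capture_positions(READ_LEN, 186, FLANK)    # E2-IEt2 mean length

# Relative chance that a read overlapping the locus captures the element fully.
relative_capture = long_ie / short_ie
print(short_ie, long_ie, round(relative_capture, 2))
```

<p>Under these assumptions, a 186-nt element would be fully captured at a little under half the rate of a 75-nt element, which is consistent in direction with the lower detection efficiency suggested for E2-IEt2s.</p>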
<p>RNA-seq transcriptome assemblies from CCMP2099 (<xref rid="msv122-B34" ref-type="bibr">McRose et al. 2014</xref>
) revealed further similarities between this Arctic isolate and the Antarctic metagenomic sequences. CCMP2099 transcripts matched predicted exonic coding regions from Antarctic E2-IE-containing metagenomic reads (see e.g., flanking sequence; <xref ref-type="fig" rid="msv122-F2">fig. 2</xref>
<italic>c</italic>
 and <ext-link ext-link-type="uri" xlink:href="http://mbe.oxfordjournals.org/lookup/suppl/doi:10.1093/molbev/msv122/-/DC1">supplementary fig. S2<italic>c</italic>
</ext-link>
, <ext-link ext-link-type="uri" xlink:href="http://mbe.oxfordjournals.org/lookup/suppl/doi:10.1093/molbev/msv122/-/DC1">Supplementary Material</ext-link>
 online). Three hundred and sixty E2-IEt1-containing metagenomic reads with exonic regions present on both sides of the E2-IEt1 hit came from 165 different proteins also present in CCMP2099 transcriptome assemblies. Thirty-one and 248 of these metagenomic reads shared 100% and 99% coding sequence (CDS) identity with the matching CCMP2099 transcript, respectively, whereas the overall average was 98 ± 2% for all 360 sequences. The results experimentally confirmed splicing of E2-IEs as well as exclusive use of canonical GT/GC donor and AG acceptor sites. The results also demonstrate that <italic>Micromonas</italic>
 resides in Antarctica, and that Arctic and Antarctic populations have high nucleotide conservation.</p>
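<p>For readers who wish to reproduce this kind of comparison, the sketch below shows one simple way to score CDS identity between an aligned metagenomic read and a transcript, counting only ungapped columns. The function name and the toy alignment are hypothetical and are not the pipeline used here; the read counts (31, 248, and 360) are those reported above.</p>

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over alignment columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    pairs = [
        (x, y)
        for x, y in zip(aligned_a.upper(), aligned_b.upper())
        if x != "-" and y != "-"
    ]
    if not pairs:
        raise ValueError("no ungapped columns to compare")
    matches = sum(1 for x, y in pairs if x == y)
    return 100.0 * matches / len(pairs)


# Toy alignment (hypothetical sequences, one mismatch in 14 columns).
read_cds = "ATGGCTAAGGTCGA"
transcript_cds = "ATGGCTAAGGTTGA"
toy_identity = percent_identity(read_cds, transcript_cds)

# Share of the 360 reads reported at 99% or 100% CDS identity.
share_99_or_100 = (31 + 248) / 360
print(round(toy_identity, 2), round(share_99_or_100, 3))
```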
<p>The North Atlantic Deep Waters (NADW) provide a connection between Arctic and Antarctic waters because they form and sink in the Labrador and Nordic Seas, then flow in a deep (>1,500–2,000 m depth), thick layer through the North Atlantic and South Atlantic to the Southern Ocean where some upwelling occurs in the Weddell Sea (<xref rid="msv122-B6" ref-type="bibr">Broecker 1991</xref>
; <xref rid="msv122-B38" ref-type="bibr">Morozov et al. 2010</xref>
; <xref rid="msv122-B52" ref-type="bibr">Talley 2013</xref>
). The NADW signature is strong on the western side of the North Atlantic basin. Therefore, we extracted DNA from depth profile samples taken in the region of the Gulf Stream Current, several of which showed temperature and salinity characteristics of the NADW (<ext-link ext-link-type="uri" xlink:href="http://mbe.oxfordjournals.org/lookup/suppl/doi:10.1093/molbev/msv122/-/DC1">supplementary table S4</ext-link>
, <ext-link ext-link-type="uri" xlink:href="http://mbe.oxfordjournals.org/lookup/suppl/doi:10.1093/molbev/msv122/-/DC1">Supplementary Material</ext-link>
 online). We also designed primers specific to the flanking sequence of the CCMP2099 Transporter E2-IEt1, applied them to these samples, and obtained a PCR product of the anticipated 200 bp size in the 3,000 m sample. Bands with product sizes ≥300 bp were present in the 500, 2,000, and 4,000 m samples, as well as in a 90 m sample from a subtropical North Atlantic cast. Sequences from clones of the larger products came from bacteria (Verrucomicrobia, Planctomycetes, and Actinomycetes) and were unlike the E2-IEt1 sequence, whereas sequences from the cloned product of the 3,000 m sample came from CCMP2099 (98–100% nt identity).</p>
<p>Other new introns were found in <italic>Micromonas</italic>
 Clade C isolates. These were in the Transporter gene but at a nonhomologous position to both Clade D and Clade E2 IEs (<xref ref-type="fig" rid="msv122-F1">fig. 1</xref>
<italic>e</italic>
). These phase-0 introns had one nucleotide polymorphism between each Clade C strain. PCR products from Clade A and B orthologs did not contain introns, but 64 hits (<italic>E</italic>
 values 10<sup>−5</sup>–10<sup>−8</sup>, with nucleotide identity 88–94%) were retrieved when the Clade C intron was used as a query against genomic DNA sequences from Clade B isolate CCMP1764 (see <ext-link ext-link-type="uri" xlink:href="http://mbe.oxfordjournals.org/lookup/suppl/doi:10.1093/molbev/msv122/-/DC1">supplementary fig. S3<italic>a</italic>
</ext-link>
, <ext-link ext-link-type="uri" xlink:href="http://mbe.oxfordjournals.org/lookup/suppl/doi:10.1093/molbev/msv122/-/DC1">Supplementary Material</ext-link>
 online). Hits were also recovered in RCC299 and the two best of these represented confirmed introns, one in a putative Calmodulin-binding protein (JGI Prot. ID 55550) and the other in Ribonuclease H (JGI Prot. ID 105055) (<ext-link ext-link-type="uri" xlink:href="http://mbe.oxfordjournals.org/lookup/suppl/doi:10.1093/molbev/msv122/-/DC1">supplementary fig. S3<italic>b</italic>
</ext-link>
, <ext-link ext-link-type="uri" xlink:href="http://mbe.oxfordjournals.org/lookup/suppl/doi:10.1093/molbev/msv122/-/DC1">Supplementary Material</ext-link>
 online). Across Clades A, B, and C, the identified intron sequences shared higher sequence identity (∼80%) than observed for RSIs (below 50%, see Materials and Methods). Highly similar sequences were also present in metagenomic data (<xref ref-type="fig" rid="msv122-F2">fig. 2<italic>a</italic>
</xref>
) with flanking regions that encode different proteins, for example, a putative intraflagellar transport protein and a putative superfamily I helicase. Splice sites were confirmed by aligning metagenomic reads with transcripts from Clade C isolate NEPCC29 (e.g., <ext-link ext-link-type="uri" xlink:href="http://mbe.oxfordjournals.org/lookup/suppl/doi:10.1093/molbev/msv122/-/DC1">supplementary fig. S3<italic>c</italic>
</ext-link>
, <ext-link ext-link-type="uri" xlink:href="http://mbe.oxfordjournals.org/lookup/suppl/doi:10.1093/molbev/msv122/-/DC1">Supplementary Material</ext-link>
 online) and compositional features were evaluated using an alignment of metagenomic and culture-derived sequences (<ext-link ext-link-type="uri" xlink:href="http://mbe.oxfordjournals.org/lookup/suppl/doi:10.1093/molbev/msv122/-/DC1">supplementary fig. S3<italic>d</italic>
</ext-link>
, <ext-link ext-link-type="uri" xlink:href="http://mbe.oxfordjournals.org/lookup/suppl/doi:10.1093/molbev/msv122/-/DC1">Supplementary Material</ext-link>
 online). The presence of highly similar intron sequences in the Clade C Transporter gene, multiple <italic>Micromonas</italic>
 Clade A and B genes, and in metagenomes, identified them as yet another IE family, termed here ABC-IE (<xref ref-type="table" rid="msv122-T2">table 2</xref>
).</p>
<p>As observed for cultures, intron phase varied for IEs in metagenomic reads. ABC-IEs in manually curated metagenomic sequences that captured both 5′ and 3′ ss (<italic>n</italic>
 = 13) were phase-0 (7) and phase-1 (6). For a manually curated subset of E2-IEt1s that met the same criteria (<italic>n</italic>
 = 10), phase-0 (6), phase-1 (3) and phase-2 (1) were observed. E2-IEt2s were also found at all three phases, but distributed as phase-0 (3), phase-1 (5) and phase-2 (2). Additionally, an E2-IE (E2-IEt1.10) in the Antarctic metagenomic data was in the same <italic>ELIP</italic>
 gene and codon as a D-IE1 in CCMP1545, but in a different phase (<ext-link ext-link-type="uri" xlink:href="http://mbe.oxfordjournals.org/lookup/suppl/doi:10.1093/molbev/msv122/-/DC1">supplementary fig. S4</ext-link>
, <ext-link ext-link-type="uri" xlink:href="http://mbe.oxfordjournals.org/lookup/suppl/doi:10.1093/molbev/msv122/-/DC1">Supplementary Material</ext-link>
 online).</p>
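<p>As an illustration of how intron phase is assigned, the sketch below derives phase from the number of coding nucleotides upstream of the insertion point and tallies the curated counts reported above. The function names and the example offsets are hypothetical; only the phase counts come from the text.</p>

```python
from collections import Counter


def intron_phase(upstream_cds_nt: int) -> int:
    """Phase of an intron given the number of coding nucleotides 5' of it.

    Phase 0: between two codons; phase 1: after the first base of a codon;
    phase 2: after the second base of a codon.
    """
    return upstream_cds_nt % 3


def phase_fractions(counts: Counter) -> dict:
    """Fraction of elements in each phase for one IE family."""
    total = sum(counts.values())
    return {phase: counts[phase] / total for phase in (0, 1, 2)}


# Curated metagenomic counts reported in the text (phase: number of IEs).
reported = {
    "ABC-IE": Counter({0: 7, 1: 6, 2: 0}),
    "E2-IEt1": Counter({0: 6, 1: 3, 2: 1}),
    "E2-IEt2": Counter({0: 3, 1: 5, 2: 2}),
}

# Hypothetical upstream CDS lengths, used only to exercise intron_phase().
example_offsets = [0, 3, 4, 8, 12]
example_phases = [intron_phase(n) for n in example_offsets]

for family, counts in reported.items():
    print(family, phase_fractions(counts))
```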
</sec>
<sec><title>Natural Variation in Polymorphic Introns Detected by Environmental Cloning</title>
<p><italic>Micromonas</italic>
 Clade D is also found in the North Pacific Ocean (<xref rid="msv122-B58" ref-type="bibr">Worden 2006</xref>
), but IE sequence similarities to the Atlantic Clade D isolates CCMP1545 and CCMP490, or indeed IE presence, are unknown. Two of the PCR primer sets targeting both <italic>Micromonas</italic>
 and <italic>Ostreococcus</italic>
 were used to construct environmental clone libraries from samples collected in spring and autumn in the eastern North Pacific Ocean. Of the total clones successfully sequenced, 122 (ATPase) and 294 (Actin) came from the targeted homologs and comprised intron-bearing (24 ATPase; 160 Actin) and intronless (98 ATPase; 134 Actin) sequences. The latter came from <italic>Micromonas</italic>
 (although not Clade D), <italic>Ostreococcus</italic>
 and sometimes more distant taxa. Apart from determining their clade assignment, intronless sequences were not further analyzed.</p>
<p>The majority of Pacific environmental <italic>Micromonas</italic>
 Clade D sequences contained IEs at positions homologous to those in the Clade D cultures, and all had canonical ss (<xref ref-type="fig" rid="msv122-F3">fig. 3</xref>
<italic>a</italic>
 and <italic>b</italic>
 and <ext-link ext-link-type="uri" xlink:href="http://mbe.oxfordjournals.org/lookup/suppl/doi:10.1093/molbev/msv122/-/DC1">supplementary table S3</ext-link>
, <ext-link ext-link-type="uri" xlink:href="http://mbe.oxfordjournals.org/lookup/suppl/doi:10.1093/molbev/msv122/-/DC1">Supplementary Material</ext-link>
 online). Sequence clustering showed that, for both genes, some environmental clones were identical to CCMP1545 and CCMP490 throughout the gene and IE, whereas others had nucleotide differences (<xref ref-type="fig" rid="msv122-F3">fig. 3</xref>
<italic>c</italic>
 and <ext-link ext-link-type="uri" xlink:href="http://mbe.oxfordjournals.org/lookup/suppl/doi:10.1093/molbev/msv122/-/DC1">supplementary fig. S5</ext-link>
 and <ext-link ext-link-type="uri" xlink:href="http://mbe.oxfordjournals.org/lookup/suppl/doi:10.1093/molbev/msv122/-/DC1">datafile S1</ext-link>
, <ext-link ext-link-type="uri" xlink:href="http://mbe.oxfordjournals.org/lookup/suppl/doi:10.1093/molbev/msv122/-/DC1">Supplementary Material</ext-link> online). For environmental clones containing both Actin D-IEs, the 5′- and more 3′-IEs had 97–100% and 98–100% nucleotide identity, respectively.
<fig id="msv122-F3" orientation="portrait" position="float"><label>F<sc>ig</sc>
. 3.</label>
<caption><p>Intron presence–absence patterns in Pacific Ocean environmental clones. Architecture for regions of the genes encoding (<italic>a</italic>
) the putative Calcium ATPase and (<italic>b</italic>
) Actin are shown. Thick bars (blue) represent exons, vertical turquoise lines denote loci where introns are present (accompanied by thin horizontal intron lines) or absent (vertical line only). Thin horizontal lines represent D-IEs (yellow, D-IEs) and a newly identified presence–absence polymorphism (green) in environmental clones similar to Clade D. ATPase Cluster B consists of six environmental sequences, whereas Actin Cluster S4 and the RSI-bearing Clade D-like type have one and two clones, respectively. (<italic>c</italic>
) Nucleotide polymorphisms in the amplified region of IE-bearing ATPase homologs. Coding region (black) and D-IEs (orange) lengths are indicated above top bar and numbering below corresponds to SNP positions. The number of sequences (100% identical) in each cluster from cultures and environmental clones from spring or fall Pacific clone libraries is indicated. Dots represent identical nucleotides to those of the first sequence and variants denote other nucleotides. Only positions with polymorphisms are shown. The asterisk (orange) represents a 5′-IE (184 nt) in Env. Cluster B sequences, absent from all other ATPase sequences (variant nucleotide numbering does not include the Cluster B 5′-IE).</p>
</caption>
<graphic xlink:href="msv122f3p"></graphic>
</fig>
</p>
<p>Intron-bearing clones with the least conservation to the ATPase and Actin genes from D-lineage cultures also had deviant intron numbers. Spring and autumn clones (P3I4_28 and P5I4_80, respectively) lacked the Actin 5′- and 3′-IEs, but contained a phase-0 intron at a nonhomologous intervening position (<xref ref-type="fig" rid="msv122-F3">fig. 3</xref>
<italic>b</italic>
, Env. Clade D-like). These two nearly identical clones (one mismatch) have higher CDS identity to Clade D isolates (96%) than to other cultured <italic>Micromonas</italic>
 clades (∼90–91%), and appear to represent a basal Clade D group. High identity across the amplified Actin CDS region precluded further resolution by phylogenetic approaches, and because similar sequences were not detected in available <italic>Micromonas</italic>
 genomic or metagenomic data, we concluded that this intron is a polymorphic RSI. However, introns in the other divergent Clade D environmental clusters were D-IEs. Despite sharing 99% CDS identity, Actin spring clone P3I4_VIIII47 (Cluster S4; <ext-link ext-link-type="uri" xlink:href="http://mbe.oxfordjournals.org/lookup/suppl/doi:10.1093/molbev/msv122/-/DC1">supplementary fig. S6</ext-link>
, <ext-link ext-link-type="uri" xlink:href="http://mbe.oxfordjournals.org/lookup/suppl/doi:10.1093/molbev/msv122/-/DC1">Supplementary Material</ext-link>
 online) lacked the more 5′ D-IE1 and the 3′ D-IE1 had relatively low identity (93%) to homologously positioned D-IE1s (<xref ref-type="fig" rid="msv122-F3">fig. 3</xref>
<italic>b</italic>
 and <ext-link ext-link-type="uri" xlink:href="http://mbe.oxfordjournals.org/lookup/suppl/doi:10.1093/molbev/msv122/-/DC1">supplementary fig. S5</ext-link>
, <ext-link ext-link-type="uri" xlink:href="http://mbe.oxfordjournals.org/lookup/suppl/doi:10.1093/molbev/msv122/-/DC1">Supplementary Material</ext-link>
 online). Notably, a phase-0 intron with canonical ss was also found in Actin from the prasinophyte <italic>Pterosperma cristatum</italic>
, at the same codon as the more 3′ D-IEs in <italic>Micromonas</italic>
 Clade D Actin genes (also phase-0; <ext-link ext-link-type="uri" xlink:href="http://mbe.oxfordjournals.org/lookup/suppl/doi:10.1093/molbev/msv122/-/DC1">supplementary fig. S6</ext-link>
, <ext-link ext-link-type="uri" xlink:href="http://mbe.oxfordjournals.org/lookup/suppl/doi:10.1093/molbev/msv122/-/DC1">Supplementary Material</ext-link>
 online). Finally, in the ATPase Cluster B (6 clones), a D-IE1 (phase-0) was identified upstream of the D-IE2 shared with cultures and other environmental Clade D clusters (<xref ref-type="fig" rid="msv122-F3">fig. 3</xref>
<italic>a</italic>
 and <italic>c</italic>
). Cluster B CDS (and the more 3′ D-IE) nucleotide differences were also distinct (<xref ref-type="fig" rid="msv122-F3">fig. 3</xref>
<italic>c</italic>
) and phylogenetic analysis placed it in a supported position basal to the <italic>Micromonas</italic>
 Clade D-lineage, with low evolutionary distance from the cultured strains (<ext-link ext-link-type="uri" xlink:href="http://mbe.oxfordjournals.org/lookup/suppl/doi:10.1093/molbev/msv122/-/DC1">supplementary fig. S7</ext-link>
, <ext-link ext-link-type="uri" xlink:href="http://mbe.oxfordjournals.org/lookup/suppl/doi:10.1093/molbev/msv122/-/DC1">Supplementary Material</ext-link>
 online). Collectively, the intron presence–absence polymorphisms observed in wild Pacific <italic>Micromonas</italic>
 closest to Clade D Atlantic isolates and the nucleotide polymorphisms present in environmental clusters are suggestive of a dynamic IE landscape that influences the development of discrete <italic>Micromonas</italic>
 populations.</p>
</sec>
<sec><title>Structural Features of IE Families</title>
<p>We next searched for RNA structural features that might hint at mechanisms of transposition. Previous comparisons of IE sequences have been limited to a few families in the same genome (<xref rid="msv122-B59" ref-type="bibr">Worden et al. 2009</xref>
; <xref rid="msv122-B57" ref-type="bibr">Verhelst et al. 2013</xref>
), where multiple expansion episodes have created sequence-related sets of elements within which conservation of mechanistically required features is difficult to discern. Here, we chose IE groups from within the ABC, D and E2 lineages that were present in multiple exact copies as these likely represent recently active elements. Three short motifs conserved between these groups reflect function in splicing (<xref ref-type="fig" rid="msv122-F4">fig. 4</xref>
, see also <xref ref-type="fig" rid="msv122-F2">fig. 2</xref>
<italic>b</italic>
 and <italic>c</italic>
, and <ext-link ext-link-type="uri" xlink:href="http://mbe.oxfordjournals.org/lookup/suppl/doi:10.1093/molbev/msv122/-/DC1">supplementary figs. S2</ext-link>
 and <ext-link ext-link-type="uri" xlink:href="http://mbe.oxfordjournals.org/lookup/suppl/doi:10.1093/molbev/msv122/-/DC1">S3</ext-link>
 and <ext-link ext-link-type="uri" xlink:href="http://mbe.oxfordjournals.org/lookup/suppl/doi:10.1093/molbev/msv122/-/DC1">table S5</ext-link>
, <ext-link ext-link-type="uri" xlink:href="http://mbe.oxfordjournals.org/lookup/suppl/doi:10.1093/molbev/msv122/-/DC1">Supplementary Material</ext-link>
 online): GYRaGu represents the 5′ ss, GacUG<underline>A</underline>
C contains the intron branchpoint (underlined), and CAG is at the 3′ ss (not shown in <xref ref-type="fig" rid="msv122-F4">fig. 4</xref>
). Other than these regions and 1–2 pyrimidine-rich (mostly U) segments, no primary sequence similarity was detected between ABC-IE (<xref ref-type="fig" rid="msv122-F4">fig. 4</xref>
<italic>a</italic>
 and <italic>b</italic>
), D-IE (<xref ref-type="fig" rid="msv122-F4">fig. 4</xref>
<italic>c</italic>
), and E2-IEt1 (<xref ref-type="fig" rid="msv122-F4">fig. 4</xref>
<italic>d</italic>). However, all had a sequence complementary to their respective 5′ ss within a few tens of nucleotides downstream.
<fig id="msv122-F4" orientation="portrait" position="float"><label>F<sc>ig</sc>
. 4.</label>
<caption><p>Secondary structure models for Introner lariat RNAs. A sequence complementary to the 5′ splice site is found in several IE types. The feature does not appear as a conserved primary sequence element because its sequence varies to maintain pairing with the 5′ ss. 2′-5′ linkage between the branchpoint A residue and the G at the 5′-end of the intron is shown with an asterisk and bases are numbered from the beginning of the intron. Additional sequences between the 5′ splice site and the branchpoint are represented by a line, and sequences downstream from the branchpoint are not shown. (<italic>a</italic>
) ABC-IE in the Transporter gene of NEPCC29. (<italic>b</italic>
) ABC-IE in the Transporter gene of RCC299 (two exact copies in this genome). (<italic>c</italic>
) An example D-IE3 from an NADH dehydrogenase subunit that is present in 32 identical copies in CCMP1545, with another 28 copies that have single base changes. The loop is larger than in panels (<italic>a</italic>
) and (<italic>b</italic>
) and other structures can form but a 5′ ss complementary sequence is present. (<italic>d</italic>
) Secondary structure of the Type 1 E2-IE from the CCMP2099 Transporter gene. (<italic>e</italic>
) A generalized structure for IE lariats showing the 5′ ss paired with the sequence downstream.</p>
</caption>
<graphic xlink:href="msv122f4p"></graphic>
</fig>
</p>
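<p>A simple way to screen an intron for the feature described above is to search downstream of the 5′ ss for its reverse complement. The sketch below is a minimal illustration that assumes exact Watson-Crick complementarity (G-U wobble pairs are ignored), a 6-nt splice-site segment, and a 60-nt search window; these parameters, the function names, and the toy intron sequence are hypothetical and do not reproduce the structure modeling used for the figure.</p>

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a sequence, reported in the DNA alphabet."""
    return seq.upper().replace("U", "T").translate(COMPLEMENT)[::-1]


def find_ss_complement(intron: str, ss_len: int = 6, window: int = 60) -> int:
    """Return the 0-based start of the first exact complement of the 5' splice-site
    segment within `window` nt downstream of it, or -1 if none is found."""
    dna = intron.upper().replace("U", "T")
    target = reverse_complement(dna[:ss_len])
    downstream = dna[ss_len:ss_len + window]
    hit = downstream.find(target)
    return -1 if hit == -1 else hit + ss_len


# Toy intron (hypothetical): 5' ss GTGAGT, a complementary segment ACTCAC,
# a GACTGAC branchpoint-like motif, and a CAG 3' end.
toy_intron = "GTGAGT" + "TTCCTTGC" + "ACTCAC" + "TTTTGACTGACCTTTCAG"
print(find_ss_complement(toy_intron))
```

<p>Because the complementary segment covaries with the 5′ ss, a search of this kind must be run per element rather than with a single fixed motif.</p>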
</sec>
</sec>
<sec sec-type="discussion"><title>Discussion</title>
<p>The discovery of repetitive elements that bear properties of spliceosomal introns has raised a number of questions about the origin of introns and their influence on eukaryotic genomes. Until now, such introns had only been observed in <italic>Micromonas</italic>
 Clade D (<xref rid="msv122-B59" ref-type="bibr">Worden et al. 2009</xref>
) and more minimally in Clade A (<xref rid="msv122-B57" ref-type="bibr">Verhelst et al. 2013</xref>
) as well as in several fungi (<xref rid="msv122-B53" ref-type="bibr">Torriani et al. 2011</xref>
; <xref rid="msv122-B56" ref-type="bibr">van der Burgt et al. 2012</xref>
). They appear to be absent from closely related prasinophytes, plants, mammals, and other taxa for which genome or gene sequences are available. Given biases in genomic resources (<xref rid="msv122-B25" ref-type="bibr">Keeling et al. 2014</xref>
), their true distributions across the broader expanses of the eukaryotic tree of life remain unknown.</p>
<p>Here, we report new IE families and document polymorphic introns, including some RSIs, in cultured and wild <italic>Micromonas.</italic>
 The results demonstrate that multiple pervasive IE families exist which have no extended sequence homology to introns or genomic sequence in distant <italic>Micromonas</italic>
 clades, but instead correspond to individual lineages. We have designated these using a lettered prefix representing the <italic>Micromonas</italic>
 clade(s) they occupy, including modification of the original CCMP1545 nomenclature to D-IEs (<xref ref-type="table" rid="msv122-T2">table 2</xref>
). Our results lend insight into several outstanding questions on biogeography of this algal genus as well as intron and IE evolution.</p>
<sec><title>Diversification of <italic>Micromonas</italic>
</title>
<p>The established <italic>Micromonas</italic>
 clades are thought to reflect species level differences and have been defined using multiple marker genes (<xref rid="msv122-B48" ref-type="bibr">Slapeta et al. 2006</xref>
; <xref rid="msv122-B58" ref-type="bibr">Worden 2006</xref>
; <xref rid="msv122-B33" ref-type="bibr">Marin and Melkonian 2010</xref>
) (<xref ref-type="fig" rid="msv122-F1">fig. 1</xref>
<italic>a</italic>
 and <ext-link ext-link-type="uri" xlink:href="http://mbe.oxfordjournals.org/lookup/suppl/doi:10.1093/molbev/msv122/-/DC1">supplementary fig. S1</ext-link>
, <ext-link ext-link-type="uri" xlink:href="http://mbe.oxfordjournals.org/lookup/suppl/doi:10.1093/molbev/msv122/-/DC1">Supplementary Material</ext-link>
 online). Clade D, in which IEs were first reported (<xref rid="msv122-B59" ref-type="bibr">Worden et al. 2009</xref>
), is the most basal <italic>Micromonas</italic>
 group. Its divergence from other clades is estimated at 66 ± 10 Ma with evolutionary distances that appear to be greater than those between <italic>Maize</italic>
 and <italic>Oryza</italic>
 (<xref rid="msv122-B48" ref-type="bibr">Slapeta et al. 2006</xref>
). Across the <italic>Micromonas</italic>
 clades examined here, only Clade D isolates (<italic>M. pusilla</italic>
) contain D-IEs. D-IEs remain the best characterized repetitive introns, in part because the CCMP1545 genome has been sequenced and because they are numerous. D-IEs range in length, but the most abundant group (D-IE1, 6,112 members) is on average slightly shorter (173 nt) than the 3,553 RSIs (192 nt) in CCMP1545 (<xref rid="msv122-B59" ref-type="bibr">Worden et al. 2009</xref>
; <xref rid="msv122-B57" ref-type="bibr">Verhelst et al. 2013</xref>
). Multiple D-IEs in North Pacific environmental clones had 100% identity to those from the two North Atlantic Clade D isolates although these water masses last had surface water connectivity approximately 3 Ma or more ago (before the formation of the Isthmus of Panama) (<xref rid="msv122-B64" ref-type="bibr">Molnar 2008</xref>
). Separation of the relevant populations is presumably older, given the locations where CCMP1545 and CCMP490 were isolated and circulation patterns. Together, the results indicate that the putative founder D-IE invaded the genome at or shortly after the time of Clade D separation from other <italic>Micromonas</italic>
 clades, potentially contributing to its divergence.</p>
<p>The more recent divergence of Clades A, B, and C, relative to other <italic>Micromonas</italic>
 clades, is underscored by ABC-IE relatedness. The ABC-IEs in the Transporter intron presence–absence polymorphism, and other CCMP1764 (Clade B) and RCC299 (Clade A) genes have high identity to each other as well as to a family of approximately 200 members reported in RCC299 (<xref rid="msv122-B57" ref-type="bibr">Verhelst et al. 2013</xref>
) (table 2, <ext-link ext-link-type="uri" xlink:href="http://mbe.oxfordjournals.org/lookup/suppl/doi:10.1093/molbev/msv122/-/DC1">supplementary fig. S3</ext-link>
, <ext-link ext-link-type="uri" xlink:href="http://mbe.oxfordjournals.org/lookup/suppl/doi:10.1093/molbev/msv122/-/DC1">Supplementary Material</ext-link>
 online). This suggests that the common ancestor of these clades contained ABC-IEs, whereas their heterogeneous distribution in orthologs (e.g., <xref ref-type="fig" rid="msv122-F1">fig. 1</xref>
<italic>e</italic>
) from the three clades indicates that differential colonization (gains) and/or losses are connected to subsequent clade divergence.</p>
<p>The discovery of additional IE families, such as those unique to Clade E2, lends further support to the idea that IE invasion may relate to diversification. Novel E2-IEt1s and E2-IEt2s were present in CCMP2099 and Antarctic metagenomic data, but none was found in the PCR-tested genes from Clade E1 isolate CCMP1646 (for which a genome sequence is lacking). This result could arise from undersampling of Clade E1. However, E2-IEs were not detected in metagenomic data from tropical sites or from temperate sites similar to the one where CCMP1646 was isolated, supporting absence from Clade E1 (<xref ref-type="fig" rid="msv122-F2">fig. 2</xref>
<italic>a</italic>
). Indeed, unless completely purged from Clade E1, E2-IEs must have been gained during or after E1 and E2 separation, potentially influencing E2 divergence through genome invasion. In line with these findings, our phylogenetic reconstruction supports separation of Clade E into Clades E1 and E2 (<xref ref-type="fig" rid="msv122-F1">fig. 1</xref>
<italic>a</italic>
 and <ext-link ext-link-type="uri" xlink:href="http://mbe.oxfordjournals.org/lookup/suppl/doi:10.1093/molbev/msv122/-/DC1">supplementary fig. S1</ext-link>
, <ext-link ext-link-type="uri" xlink:href="http://mbe.oxfordjournals.org/lookup/suppl/doi:10.1093/molbev/msv122/-/DC1">Supplementary Material</ext-link>
 online).</p>
<p>Phylogenetic analysis using the concatenated gene sequences generated here provides a more robust assessment of clade composition than most previous studies (<ext-link ext-link-type="uri" xlink:href="http://mbe.oxfordjournals.org/lookup/suppl/doi:10.1093/molbev/msv122/-/DC1">supplementary fig. S1</ext-link>
, <ext-link ext-link-type="uri" xlink:href="http://mbe.oxfordjournals.org/lookup/suppl/doi:10.1093/molbev/msv122/-/DC1">Supplementary Material</ext-link>
 online). The distinct sequences and insertion positions of IE family members raise possible explanations for how intron invasion might influence development of the <italic>Micromonas</italic>
 clades. The lineage-specific IE families (<xref ref-type="table" rid="msv122-T2">table 2</xref>
, <xref ref-type="fig" rid="msv122-F2">fig. 2</xref>
 and <ext-link ext-link-type="uri" xlink:href="http://mbe.oxfordjournals.org/lookup/suppl/doi:10.1093/molbev/msv122/-/DC1">supplementary figs. S2</ext-link>
 and <ext-link ext-link-type="uri" xlink:href="http://mbe.oxfordjournals.org/lookup/suppl/doi:10.1093/molbev/msv122/-/DC1">S3</ext-link>
, <ext-link ext-link-type="uri" xlink:href="http://mbe.oxfordjournals.org/lookup/suppl/doi:10.1093/molbev/msv122/-/DC1">Supplementary Material</ext-link>
 online) together with insertion polymorphisms of ABC-IEs (e.g., <xref ref-type="fig" rid="msv122-F1">fig. 1</xref>
<italic>e</italic>
) and D-IEs (<xref ref-type="fig" rid="msv122-F3">fig. 3</xref>
) suggest a heterogeneous landscape in IE distributions across the genomes of extant taxa. Such IE presence–absence polymorphisms could shape diversification by impeding homologous recombination that likely occurs in nature (<xref rid="msv122-B59" ref-type="bibr">Worden et al. 2009</xref>
). The level of CDS divergence among strains in the ABC lineage is low and may not be sufficient to impede sexual reproduction. However, the differential presence of ABC-IEs would have a significant impact on recombination. Divergence could also be influenced by other mechanisms related to differential invasion, for example, differential losses of gene function—the detrimental consequences of faulty insertions (<xref rid="msv122-B28" ref-type="bibr">Li, Tucker, et al. 2009</xref>
); influences on protein evolution and regulatory changes related to alternative splicing; or establishment of new proteins through exon shuffling facilitated by phase-0 insertions (<xref rid="msv122-B27" ref-type="bibr">Koonin 2006</xref>
; <xref rid="msv122-B22" ref-type="bibr">Huang et al. 2014</xref>
). All of these would contribute to development of accessory genome components unique to each <italic>Micromonas</italic>
 lineage. Thus, although we cannot rule out presence of all IE families in a more ancestral alga that then underwent major differential losses, the more parsimonious explanation of the patterns observed is that the putative founder of each IE family invaded the respective genome at, or shortly after, separation of that clade from other <italic>Micromonas</italic>
 clades. In this scenario, IE propagation could well have contributed to the expansion and diversification of the <italic>Micromonas</italic>
 radiation, which shows greater divergence across clades than other Mamiellophyceae genera (<xref ref-type="fig" rid="msv122-F1">fig. 1</xref>
<italic>a</italic>
).</p>
</sec>
<sec><title>Global IE Family Distributions Point to Bipolar Connectivity</title>
<p>The geographic patterns observed for IE families advance knowledge of global <italic>Micromonas</italic>
 biogeography (<xref ref-type="fig" rid="msv122-F2">fig. 2</xref>
<italic>a</italic>
). Here, D-IEs and ABC-IEs were frequently observed in the same temperate water samples, and colocation of these clades has been reported in the English Channel and Southern California Bight which are also temperate (<xref rid="msv122-B58" ref-type="bibr">Worden 2006</xref>
; <xref rid="msv122-B18" ref-type="bibr">Foulon et al. 2008</xref>
). D-IEs are present in some low latitude (tropical) metagenomes and CCMP1764 (carrying ABC-IE) was isolated from the tropics. Lack of ABC-IEs in tropical metagenomic data may reflect lower numbers of this IE family across ABC-lineage genomes (contributing to lower frequency and detection in metagenomic data than D-IEs) and/or large differences in relative cell abundances at the time of sampling. Detection of ABC-IEs in Atlantic, Pacific and Indian Ocean metagenomes expands the known range of the <italic>Micromonas</italic>
 ABC lineage, which now spans tropical to high-latitude temperate waters just below 60°S (<xref ref-type="fig" rid="msv122-F2">fig. 2</xref>
<italic>a</italic>
).</p>
<p>Importantly, our studies demonstrate that <italic>Micromonas</italic>
 is present in the Southern Ocean, the circumpolar waters around Antarctica (<xref rid="msv122-B39" ref-type="bibr">Morrison et al. 2015</xref>
). E2-IEs were discovered in the Southern Ocean using query sequences from CCMP2099 which is considered endemic to the Arctic—restricted by geographic and ecological barriers (<xref rid="msv122-B31" ref-type="bibr">Lovejoy et al. 2007</xref>
). The clear separation of Arctic and Antarctic waters, high sequence identity between protein-encoding portions of the Antarctic E2-IE-containing reads and CCMP2099 transcripts, and of E2-IEs themselves (e.g., <xref ref-type="fig" rid="msv122-F2">fig. 2</xref>
<italic>b</italic>
 and <italic>c</italic>
 and <ext-link ext-link-type="uri" xlink:href="http://mbe.oxfordjournals.org/lookup/suppl/doi:10.1093/molbev/msv122/-/DC1">supplementary fig. S2</ext-link>
, <ext-link ext-link-type="uri" xlink:href="http://mbe.oxfordjournals.org/lookup/suppl/doi:10.1093/molbev/msv122/-/DC1">Supplementary Material</ext-link>
 online), together pinpoint E2-IE gains (as well as the split of the Clade E lineage) to a time before physical separation of these <italic>Micromonas</italic>
 populations. If originally endemic to the Arctic (<xref rid="msv122-B31" ref-type="bibr">Lovejoy et al. 2007</xref>
), then how did Clade E2 come to be present in both polar systems, which have “always” been divided by warmer equatorial waters? Apart from ballast water (unlikely based on shipping routes), an effective transport mechanism would be through the NADW which is formed in the Labrador and Nordic Seas, and upwelled in the Southern Ocean (e.g., near sites where E2-IEs were detected below the photic zone). Indeed, although unaware of <italic>Micromonas</italic>
 presence in the Southern Ocean, <xref rid="msv122-B48" ref-type="bibr">Slapeta et al. (2006)</xref>
 suggested that <italic>Micromonas</italic>
 may be circulated around the world in deep-sea currents at a low metabolic state. Our amplification of E2-IEs and flanking sequence from the NADW demonstrates that Clade E2 <italic>Micromonas</italic>
, or their intact DNA, are present in this deep-sea current and could therefore be a source for Antarctic populations, although clearly more comprehensive sampling of deep currents is warranted. It takes approximately 100 years for freshly formed NADW to reach the Southern Ocean (Talley L, Scripps Institution of Oceanography, personal communication). Thus it seems possible that spores or a cell-walled life stage (<xref rid="msv122-B59" ref-type="bibr">Worden et al. 2009</xref>
) may serve as more stable morphotypes during long periods of transport in deep-sea currents.</p>
<p>We also found E2-IEs and flanking sequence corresponding to CCMP2099 transcripts in Antarctica’s Ace Lake which is thought to have undergone little change over the past 4,000 years (<xref rid="msv122-B19" ref-type="bibr">Fulford-Smith and Sikes 1996</xref>
). In these samples, salinities ranged down to 22 ppt with temperatures between 0.42 and 1 °C, whereas the E2-IE-containing Southern Ocean samples had temperatures as low as −1.9 °C and higher salinities (e.g., 33–34 ppt). CCMP2099 grows from 0 to 12 °C and occurs at Arctic sites with salinities from 27 to 34 ppt, but its salinity range has not been experimentally characterized (<xref rid="msv122-B31" ref-type="bibr">Lovejoy et al. 2007</xref>
; <xref rid="msv122-B26" ref-type="bibr">Kilias et al. 2014</xref>
). Our findings extend this range significantly and indicate that considerable salinity reductions will not adversely affect these cells. This is important because <italic>Micromonas</italic>
 abundance has increased in association with climate-change-enhanced ice melt and corresponding salinity reductions (to ∼29 ppt) in the Canadian Arctic (<xref rid="msv122-B28" ref-type="bibr">Li, McLaughlin, et al. 2009</xref>
). <italic>Micromonas</italic>
 is much smaller than the phytoplankton it is replacing, resulting in different food web connections and sinking rates. Hence, further increases in <italic>Micromonas</italic>
 abundance will likely have major ecosystem consequences.</p>
</sec>
<sec><title>Intron Stability</title>
<p>Intron phase is hypothesized to play a role in stability, with introns that split a codon being more likely to cause faulty splicing or intron sliding (<xref rid="msv122-B32" ref-type="bibr">Lynch 2002</xref>
). About 50% of IEs in the two most abundant D-IE families are phase-0, the remainder being split between phase-1 and phase-2 (<xref rid="msv122-B57" ref-type="bibr">Verhelst et al. 2013</xref>
). The statistical power of data on E2-IEs and ABC-IEs is lower, but most appear to be phase-0, except E2-IEt2s. Nevertheless, phase-2 IEs were observed for each family and these general patterns may reflect different IE stabilities within the genomes of the various <italic>Micromonas</italic>
 clades.</p>
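<p>The phase categories discussed above can be made concrete with a minimal Python sketch. It assumes the usual convention that intron phase equals the number of coding nucleotides upstream of the intron, modulo 3. The function names and the example coordinates are hypothetical and are not taken from this study.</p>
<preformat><![CDATA[
```python
from collections import Counter


def intron_phase(coding_nt_upstream):
    """Return the intron phase (0, 1, or 2).

    Assumed convention: phase is the count of coding nucleotides
    upstream of the intron, modulo 3. Phase 0 lies between codons;
    phases 1 and 2 split a codon.
    """
    if coding_nt_upstream < 0:
        raise ValueError("upstream coding length must be non-negative")
    return coding_nt_upstream % 3


def phase_fractions(upstream_lengths):
    """Tally the fraction of introns that fall in each phase."""
    counts = Counter(intron_phase(n) for n in upstream_lengths)
    total = len(upstream_lengths)
    return {phase: counts.get(phase, 0) / total for phase in (0, 1, 2)}


# Hypothetical upstream coding lengths (nt) for four introns.
example = [90, 121, 302, 45]
fractions = phase_fractions(example)
```
]]></preformat>
<p>Applied to a real intron family, a tally of this kind would give the phase-0 fraction that is compared among IE families in the text.</p>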
<p>In several fungi, introner-like elements (ILEs) have been identified and proposed to degrade into RSIs, thereby serving as a source of RSIs (<xref rid="msv122-B53" ref-type="bibr">Torriani et al. 2011</xref>
; <xref rid="msv122-B56" ref-type="bibr">van der Burgt et al. 2012</xref>
). The average number of total introns per gene (1.4 ± 2) in the four best characterized of these fungi is higher than in CCMP1545 (0.9) or RCC299 (0.6) (<xref rid="msv122-B59" ref-type="bibr">Worden et al. 2009</xref>
; <xref rid="msv122-B21" ref-type="bibr">Goodwin et al. 2011</xref>
; <xref rid="msv122-B13" ref-type="bibr">de Wit et al. 2012</xref>
). However, the number of ILEs per genome (372 ± 180) is akin to ABC-IEs in RCC299 and much lower than D-IEs in CCMP1545. Hence, the balance between RSIs and repetitive introns is different in these fungi than for Clade D (and potentially Clade E2) <italic>Micromonas.</italic>
 If IEs are degraded into RSIs, then the resulting introns may be less stable, given the low overall RSI numbers in <italic>Micromonas.</italic>
 Moreover, CCMP1545 RSIs are on average longer than D-IEs. Thus IEs in <italic>M. pusilla</italic>
 do not seem to fit criteria for degradation to RSIs, suggesting differences from fungal ILEs—the closest analog to IEs reported to date.</p>
<p>An intron-rich LECA has been used to explain occurrence of introns at homologous positions in gene orthologs from divergent eukaryotes (<xref rid="msv122-B27" ref-type="bibr">Koonin 2006</xref>
; <xref rid="msv122-B47" ref-type="bibr">Roy and Gilbert 2006</xref>
). An alternative hypothesis is that some regions, or types of sequence composition, are predisposed to intron-insertion. <xref rid="msv122-B28" ref-type="bibr">Li et al. (2009)</xref>
 reported parallel gains at homologous positions in independent allelic lineages of <italic>D. pulex.</italic>
 In addition to IEs in nonhomologous positions (e.g., <xref ref-type="fig" rid="msv122-F3">fig. 3</xref>
<italic>a</italic>
 and <italic>b</italic>
), we observed many at homologous positions in different isolates and environmental samples. However, intron phases were sometimes not matched within the homologous codon (e.g., <ext-link ext-link-type="uri" xlink:href="http://mbe.oxfordjournals.org/lookup/suppl/doi:10.1093/molbev/msv122/-/DC1">supplementary fig. S4</ext-link>
, <ext-link ext-link-type="uri" xlink:href="http://mbe.oxfordjournals.org/lookup/suppl/doi:10.1093/molbev/msv122/-/DC1">Supplementary Material</ext-link>
 online), as also observed for an intron and intein in a relative of <italic>Micromonas</italic>
, <italic>Bathycoccus</italic>
 (<xref rid="msv122-B36" ref-type="bibr">Monier et al. 2013</xref>
). Although these observations are too few to garner statistical support (which would require availability of either genome sequences or more gene homolog sequences from many more isolates), they are suggestive of parallel gains and potential intron insertion “hot spots” as proposed by <xref rid="msv122-B28" ref-type="bibr">Li et al. (2009)</xref>
 for <italic>D. pulex.</italic>
 If LECA derived, the ancestral intron would presumably have been replaced by the D-IE and have undergone differential losses in the other <italic>Micromonas</italic>
 clades, a less parsimonious scenario than parallel gain. Regardless, the likelihood of either type of scenario relies on the biological mechanism behind intron gains or losses. The fundamental question is: How are IEs propagated across an individual genome?</p>
</sec>
<sec><title>A Mechanism for IE Propagation</title>
<p>Because IEs are always found on the coding strand, new copies have been hypothesized to arise by intron RNA transposition through reverse splicing (<xref rid="msv122-B54" ref-type="bibr">Tseng and Cheng 2008</xref>
, <xref rid="msv122-B55" ref-type="bibr">2013</xref>
) into an mRNA followed by reverse transcription of the RNA to cDNA (<xref rid="msv122-B57" ref-type="bibr">Verhelst et al. 2013</xref>
) and homologous recombination. At the same time, a well-accepted model for intron removal invokes reverse transcription of spliced mRNA followed by homologous recombination (<xref rid="msv122-B17" ref-type="bibr">Fink 1987</xref>
). This mechanism is thought to account for the strong 5′ position bias in RSIs due to the greater representation of cDNA products arising from the 3′-end of transcripts (<xref rid="msv122-B17" ref-type="bibr">Fink 1987</xref>
). By the same logic it follows that cDNA from a reverse spliced IE RNA would produce a net 3′ bias in the gene position of IEs gained, but this has not been observed. Furthermore, the anticipated amount of cDNA available for these competing gain and loss mechanisms would greatly favor intron removal due to the far greater abundance of spliced mRNA (<xref rid="msv122-B43" ref-type="bibr">Rogozin et al. 2012</xref>
; <xref rid="msv122-B61" ref-type="bibr">Yenerall and Zhou 2012</xref>
), creating a paradox. We propose an alternative hypothesis that avoids this paradox: IE insertion by reverse splicing directly into single-stranded DNA (ssDNA) of R-loops (<xref ref-type="fig" rid="msv122-F5">fig. 5</xref>
). R-loops are recently appreciated aberrant structures that form behind blocked or stalled RNA polymerase complexes in which the nascent RNA strand pairs back to the underwound DNA template strand behind the RNA polymerase. This displaces an ssDNA loop that can lead to local mutation at the site of R-loops, as well as genome instability (<xref rid="msv122-B1" ref-type="bibr">Aguilera and Garcia-Muse 2012</xref>
; <xref rid="msv122-B8" ref-type="bibr">Chan et al. 2014</xref>).
<fig id="msv122-F5" orientation="portrait" position="float"><label>F<sc>ig</sc>
. 5.</label>
<caption><p>Proposed model for IE reverse splicing into ssDNA generated at R-loops. (<italic>a</italic>
) Diagram of a stalled RNA polymerase II complex behind which an R-loop has formed by pairing of the nascent transcript with the template strand of DNA. A spliceosome that carries the lariat intron product of a recent splicing event binds to the displaced nontemplate DNA strand. RNA (red) and DNA (black) are shown along with nucleosomes (discs) and spliceosome (blue oval). The lightning bolt indicates potential for where the first step of reverse splicing (the reverse of the second step of forward splicing) might occur on the DNA. (<italic>b</italic>
) Detailed description of a possible reverse splicing mechanism for IE transposition at R-loops. See text for additional details.</p>
</caption>
<graphic xlink:href="msv122f5p"></graphic>
</fig>
</p>
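<p>The 5′ versus 3′ positional bias invoked above can be quantified in a simple way. The Python sketch below is an illustration under our own assumptions (hypothetical function names and invented coordinates). It expresses each intron insertion site as a fraction of coding-sequence length, so that values near 0 lie at the 5′ end and values near 1 at the 3′ end.</p>
<preformat><![CDATA[
```python
from statistics import mean


def relative_position(insert_site_nt, cds_length_nt):
    """Insertion site as a fraction of CDS length (0 = 5' end, 1 = 3' end)."""
    if cds_length_nt <= 0:
        raise ValueError("CDS length must be positive")
    if not 0 <= insert_site_nt <= cds_length_nt:
        raise ValueError("insertion site lies outside the CDS")
    return insert_site_nt / cds_length_nt


def positional_bias(sites):
    """Mean relative position over (site, CDS length) pairs.

    A value near 0.5 would be expected if insertions were spread
    uniformly along genes; lower values suggest a 5' bias and higher
    values a 3' bias.
    """
    return mean(relative_position(site, length) for site, length in sites)


# Invented (insertion site, CDS length) pairs, in nucleotides.
toy_sites = [(150, 1500), (750, 1500), (1200, 1500)]
bias = positional_bias(toy_sites)
```
]]></preformat>
<p>A formal analysis would compare the full distribution of relative positions against a uniform expectation rather than rely on the mean alone.</p>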
<p>We propose that an “armed” spliceosome (<xref rid="msv122-B32" ref-type="bibr">Lynch 2002</xref>
) forms after the spliced mRNA has been released and retains the IE intron RNA. We further postulate that a special sequence or structure of the IE, possibly including the stem-loop in our secondary structure predictions (<xref ref-type="fig" rid="msv122-F4">fig. 4</xref>
), interferes with spliceosome disassembly and debranching, leading to the persistence of a splicing complex that is primed for reverse splicing. In this model, the ssDNA of the R-loop binds the armed spliceosome in the binding site formerly occupied by spliced mRNA exons, after the completion of the second step of splicing (<xref ref-type="fig" rid="msv122-F5">fig. 5</xref>
<italic>b</italic>
). The 3′OH at the end of the intron lariat (the leaving group in step 2 of forward splicing) becomes the attacking group on a phosphate in the ssDNA in the reverse reaction of step 2. The leaving group in this transesterification is a DNA 3′OH bound in the spliceosome where free exon 1 would be bound between the first and second steps of forward splicing. In a second transesterification, this 3′OH attacks the phosphate at the lariat branch, with the 2′OH of the branch point adenosine (see <xref ref-type="fig" rid="msv122-F4">fig. 4</xref>
) acting as the leaving group, in a reverse reaction of step 1 of splicing. This inserts the intron RNA cleanly into the ssDNA. As the R-loop is repaired (<xref rid="msv122-B1" ref-type="bibr">Aguilera and Garcia-Muse 2012</xref>
; <xref rid="msv122-B8" ref-type="bibr">Chan et al. 2014</xref>
), either reverse transcriptase or a DNA repair polymerase (<xref rid="msv122-B49" ref-type="bibr">Storici et al. 2007</xref>
) copies the strand containing the RNA into DNA and a new copy of the IE element is incorporated into the genome.</p>
<p>Our model relies on the ability of the spliceosome to catalyze reverse splicing on an ssDNA substrate. <xref rid="msv122-B37" ref-type="bibr">Moore and Sharp (1992)</xref>
 substituted the ribose moiety at the end of exon 1 in a model pre-mRNA with deoxyribose and found that the rates of the first (where the adjacent 3′OH is the leaving group) and second (where the adjacent 3′OH is the attacking group) steps of forward splicing were not affected. The reverse reactions should be similarly unaffected, suggesting that ssDNA exons should suffice for catalysis of reverse splicing. For comparison, the retrotransposition mechanism of catalytically similar group II self-splicing introns involves reverse splicing into DNA (<xref rid="msv122-B62" ref-type="bibr">Zimmerly et al. 1995</xref>
; <xref rid="msv122-B16" ref-type="bibr">Eskes et al. 2000</xref>
; <xref rid="msv122-B15" ref-type="bibr">Dickson et al. 2001</xref>
). Although the source of reverse transcriptase activity for repair of the inserted RNA into DNA is uncertain in our model, evidence exists that cellular DNA polymerases can copy an RNA template (<xref rid="msv122-B49" ref-type="bibr">Storici et al. 2007</xref>
).</p>
<p>One prediction of this mechanism is that IE insertions will be biased toward R-loop susceptible locations (<xref rid="msv122-B1" ref-type="bibr">Aguilera and Garcia-Muse 2012</xref>
; <xref rid="msv122-B8" ref-type="bibr">Chan et al. 2014</xref>
), rather than strictly by the transcription rate or cDNA production efficiency predicted by other models (<xref rid="msv122-B61" ref-type="bibr">Yenerall and Zhou 2012</xref>
). This bias should hold for the fungal ILEs as well. A general tendency for R-looped regions to act as targets for intron insertion might explain why intron insertion events appear to occur near each other, but not exactly in the same location (<xref rid="msv122-B61" ref-type="bibr">Yenerall and Zhou 2012</xref>
). R-loops suffer other kinds of mutation, and recruitment of repair proteins to R-loops may incidentally help promote reverse splicing and intron insertion. We envision that transposable IEs may arise spontaneously if the intron RNA sequence evolves so that 1) the intron is refractory to disassembly from the spliceosome and 2) the forward and reverse rates of splicing for that intron become similar. Because R-looping is intrinsic to transcription, IEs may be widespread and appear de novo in any genome.</p>
</sec>
</sec>
<sec sec-type="conclusions"><title>Conclusions</title>
<p>Intron gains were once considered rare events. Studies based on increased genome-level taxonomic sampling reaching beyond heavily investigated multicellular eukaryotic lineages have revealed major exceptions to this rule. The <italic>Micromonas</italic>
 species are extraordinary due to the number, variety, and repetitive nature of polymorphic introns comprising unique IE families that trace speciation. After analyzing representatives of different <italic>Micromonas</italic>
 clades, Pacific Ocean clone libraries, and global metagenomes, we have laid a foundation for future research on the heterogeneity and functional implications of the <italic>Micromonas</italic>
 IE landscape. We hypothesize that invasion of distinct IE families facilitated the divergence of extant <italic>Micromonas</italic>
 lineages from their last common ancestor. This could have occurred through IE-influenced processes, including impedance of homologous recombination, differential gene losses, and protein innovations resulting in gain of new functions. Our R-loop-based model for IE proliferation is generalizable to the majority of eukaryotes; thus, as genome sequences become available for a greater diversity of eukaryotes, we anticipate discoveries of other rampant invasions by repetitive intronic elements. Together with analyses of potential functions of repetitive introns, such studies will provide a more comprehensive view of intron gain and its influence on the eukaryotic tree of life.</p>
</sec>
<sec sec-type="materials|methods"><title>Materials and Methods</title>
<sec><title>Culturing and Nucleic Acid Extraction</title>
<p>Ten <italic>Micromonas</italic>
 isolates were grown at 200 photon m<sup>−2</sup> s<sup>−1</sup>
 PAR (measured using a QSL2101 light meter; Biospherical Instruments Inc., San Diego CA) on a 14-h/10-h light/dark cycle (<xref ref-type="table" rid="msv122-T1">table 1</xref>
). These were obtained immediately prior to the study from several culture collections (or for RCC299 and CCMP1545 from in-house) and grown in standard media and conditions (<ext-link ext-link-type="uri" xlink:href="http://mbe.oxfordjournals.org/lookup/suppl/doi:10.1093/molbev/msv122/-/DC1">supplementary table S1</ext-link>
, <ext-link ext-link-type="uri" xlink:href="http://mbe.oxfordjournals.org/lookup/suppl/doi:10.1093/molbev/msv122/-/DC1">Supplementary Material</ext-link>
 online). Additionally, <italic>Micromonas</italic>
 RCC434, RCC472, and RCC1614 were grown to improve 18S rRNA gene sequence availability for three clades. Cells were harvested by centrifugation at 6,000 or 8,000 × g, the supernatant removed immediately and pelleted cells frozen at –80 °C until extraction. DNA was extracted using a QIAGEN DNeasy Kit (Germantown, MD) according to the manufacturer’s instructions except for CCMP1764 which was extracted using a protocol for genome quality DNA (<ext-link ext-link-type="uri" xlink:href="http://www.mbari.org/phyto-genome/Resources.html">http://www.mbari.org/phyto-genome/Resources.html</ext-link>
, last accessed June 8, 2015). Environmental samples were collected near the end of the Scripps Institution of Oceanography pier (32°53′N, 117°15′W) in April and October 2001 and extracted as part of a previous study (<xref rid="msv122-B58" ref-type="bibr">Worden 2006</xref>
).</p>
</sec>
<sec><title>PCR, Cloning, and Sequencing</title>
<p>PCR primers were designed to conserved regions of four gene homologs found in the genomes of <italic>Ostreococcus tauri</italic>
, <italic>Ostreococcus lucimarinus</italic>
, <italic>Micromonas</italic>
 sp. RCC299, <italic>M. pusilla</italic>
 CCMP1545, and to span IEs in the latter (<ext-link ext-link-type="uri" xlink:href="http://mbe.oxfordjournals.org/lookup/suppl/doi:10.1093/molbev/msv122/-/DC1">supplementary table S2</ext-link>
, <ext-link ext-link-type="uri" xlink:href="http://mbe.oxfordjournals.org/lookup/suppl/doi:10.1093/molbev/msv122/-/DC1">Supplementary Material</ext-link>
 online). Accessions for these in CCMP1545 are: Actin, XM_003061058.1; ATPase, XM_003062703.1; Transporter, XM_003060502.1; Dehydrogenase, XM_003058664.1. The three latter genes are single copy in the genome, whereas the Actin primers were specific to one of several related copies. In addition, 18S rRNA gene primers (18SEUKF: 5′-ACCTGGTTGATCCTGCCAG-3′; 18SEUKR: 5′-TGATCCTTCYGCAGGTTCAC-3′) were used to verify isolate identity as in <xref rid="msv122-B60" ref-type="bibr">Worden et al. (2004)</xref>
. DNA from each isolate was amplified in individual reactions for each of the five genes (i.e., including the 18S rRNA). Specifically, 25 µl PCR reactions consisted of 9 µl nuclease free water, 12.5 µl HotStar Master Mix (Qiagen), 500 nM each of forward and reverse primers, and 1 µl of DNA. For negative controls, an additional 1 µl of nuclease free water was used in place of DNA. PCR conditions were as follows: 30–32 cycles at 94 °C for 30 s, annealing for 30 s (see <ext-link ext-link-type="uri" xlink:href="http://mbe.oxfordjournals.org/lookup/suppl/doi:10.1093/molbev/msv122/-/DC1">supplementary table S2</ext-link>
, <ext-link ext-link-type="uri" xlink:href="http://mbe.oxfordjournals.org/lookup/suppl/doi:10.1093/molbev/msv122/-/DC1">Supplementary Material</ext-link>
 online, for temperatures), extension at 72 °C for 2 min, preceded by 15-min initial denaturation at 95 °C, and followed by 10-min extension at 72 °C.</p>
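<p>As a worked check of the reaction recipe above, the following Python sketch sums the listed volumes and computes the primer stock concentration that would be needed to reach 500 nM of each primer. Primer stock concentrations are not stated in the text, so the equal split of the remaining volume between the two primers is an assumption made here, and all function and variable names are illustrative.</p>
<preformat><![CDATA[
```python
def remaining_volume_ul(total_ul, component_volumes_ul):
    """Volume left in the reaction after the listed components."""
    remaining = total_ul - sum(component_volumes_ul)
    if remaining < 0:
        raise ValueError("components exceed the total reaction volume")
    return remaining


def required_stock_nM(final_nM, total_ul, added_ul):
    """Stock concentration from C1 * V1 = C2 * V2."""
    return final_nM * total_ul / added_ul


# Volumes stated in the text for the 25 ul isolate reactions.
total = 25.0
listed = [9.0, 12.5, 1.0]  # water, master mix, DNA template

left = remaining_volume_ul(total, listed)  # volume available for primers
per_primer = left / 2                      # ASSUMPTION: equal split
stock = required_stock_nM(500.0, total, per_primer)

print(f"{left} ul for primers, {per_primer} ul each, "
      f"stock = {stock / 1000} uM")
```
]]></preformat>
<p>Under this assumption the stated volumes are consistent with a 10 µM stock of each primer, although this value is inferred rather than reported.</p>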
<p>For cultures, products for the different genes were amplified separately from each of the cultures and run on a 1% agarose gel. The majority showed a single band and was purified using the QIAquick PCR Kit. For those with multiple bands (starting cultures contained bacterial contaminants which primer design did not account for), specifically the ATPase of CCMP490, CCMP1195, CS222, NEPCC29, the Transporter of RCC472, and the NADH dehydrogenase of CCMP490, NEPCC29, PCR products were excised from the gels and purified using the QIAquick Gel Extraction Kit (Qiagen). Actin, the only eukaryote specific gene investigated, had single bands for all cultures. The PCR products from each culture and gene were then independently cloned using the TOPO TA Cloning kit (Life Technologies, Carlsbad, CA). Insert lengths ranged from 422 to 1,256 nt (<ext-link ext-link-type="uri" xlink:href="http://mbe.oxfordjournals.org/lookup/suppl/doi:10.1093/molbev/msv122/-/DC1">supplementary tables S2</ext-link>
 and <ext-link ext-link-type="uri" xlink:href="http://mbe.oxfordjournals.org/lookup/suppl/doi:10.1093/molbev/msv122/-/DC1">S3</ext-link>
, <ext-link ext-link-type="uri" xlink:href="http://mbe.oxfordjournals.org/lookup/suppl/doi:10.1093/molbev/msv122/-/DC1">Supplementary Material</ext-link>
 online). For each culture, 2–16 colonies were picked and plasmids purified using the QIAprep Miniprep Kit. The plasmids were sequenced bidirectionally on an ABI 3100 using BigDye terminator v3.1 chemistry (Life Technologies) with M13F (5′-CTGGCCGTCGTTTTAC-3′) and M13R (5′-CAGGAAACAGCTATGAC-3′). Additional sequencing primers were used for internal regions of the 18S rDNA: 502F (5′-GGAGGGCAAGTCTGGT-3′) and EUK1174R (5′-CCCGTGTTGAGTCAAA-3′).</p>
<p>For environmental samples, a different approach was used for investigating potential CCMP2099 presence in NADW (Atlantic Ocean samples) than for the environmental clone libraries used to investigate IE diversity (Pacific Ocean samples, see below). For the former, primers (ABC.E2F: GGCGAACCAGCAACAACGAGAAG; ABC.E2R: GCTTCGTCCTGGAGTTTCGCC) were designed to specifically amplify a 200-bp region spanning the first E2-IEt1 of the CCMP2099 Transporter (<xref ref-type="fig" rid="msv122-F1">fig. 1</xref>
<italic>e</italic>
). Twenty-five µl PCR reactions consisted of 9.5 µl nuclease-free water, 12.5 µl Qiagen HotStar Master Mix, 500 nM each of forward and reverse primers, and 1.5 µl of the extracted DNA template, the positive control (CCMP2099 DNA) or negative control (nuclease-free water). PCR was carried out under the following conditions: 35 cycles of 94 °C for 30 s, annealing at 56.5 °C for 30 s, extension at 72 °C for 90 s, preceded by 15-min denaturation at 95 °C, and followed by extension at 72 °C (10 min). PCR products were purified using the QIAquick PCR Kit (Qiagen) and cloned using the TOPO TA Cloning kit (Life Technologies) following the instructions provided by the manufacturer. For each PCR product, 4–16 colonies were picked and cloned inserts amplified with the vector primers M13F and M13R. Inserts were sequenced unidirectionally using M13F for all clones, and bidirectionally (M13F and M13R) for 3,000‐m clones on an Applied Biosystems Hitachi 3500 xL Genetic Analyzer using BigDye terminator v3.1 chemistry (Life Technologies). Clone reads were assembled in Geneious v8.1 with manual curation. To validate presence of eukaryotic DNA in these samples, PCR was performed using the 18S rRNA gene primers (as above) and products verified by size on a gel.</p>
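<p>The cycling program described above can be tallied with a short Python sketch to estimate the minimum run time. The calculation ignores ramping between temperatures, which depends on the thermocycler, and the function and variable names are illustrative only.</p>
<preformat><![CDATA[
```python
def program_minutes(cycles, step_seconds, initial_min, final_min):
    """Minimum hold time of a PCR program, excluding temperature ramps."""
    cycling_s = cycles * sum(step_seconds)
    return initial_min + cycling_s / 60 + final_min


# NADW-targeted PCR as described in the text: 35 cycles of 30 s
# denaturation, 30 s annealing, and 90 s extension, with a 15 min
# initial denaturation and a 10 min final extension.
nadw_minutes = program_minutes(35, [30, 30, 90], 15, 10)
print(f"minimum program time: {nadw_minutes} min")
```
]]></preformat>
<p>The same function can be applied to the isolate PCRs by substituting their cycle number and step durations.</p>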
<p>PCR was also performed on North Pacific environmental samples using the ATPase and Actin primer sets (<ext-link ext-link-type="uri" xlink:href="http://mbe.oxfordjournals.org/lookup/suppl/doi:10.1093/molbev/msv122/-/DC1">supplementary table S2</ext-link>
, <ext-link ext-link-type="uri" xlink:href="http://mbe.oxfordjournals.org/lookup/suppl/doi:10.1093/molbev/msv122/-/DC1">Supplementary Material</ext-link>
 online) as above. The reactions were performed independently for the spring and fall templates collected in <xref rid="msv122-B58" ref-type="bibr">Worden (2006)</xref>
. Two libraries were constructed for each of two genes from the spring and fall samples because, for each gene amplification, two different bands (one reflecting intron-less and one reflecting intron-containing sequences) were excised from a 1% agarose gel and independently cloned after cleanup (i.e., eight total; results were obtained for seven libraries because one non-IE-bearing product was lost in processing). In total, 96 colonies were picked per library and plasmids purified according to the methods of <xref rid="msv122-B12" ref-type="bibr">Davis (1986)</xref>
 prior to sequencing using a 3730xl DNA Analyzer (Life Technologies).</p>
<p>Reads from each clone were assembled using DNAStar (Lasergene) and manual curation. Nonprasinophyte sequences were removed from environmental clone libraries based on an initial BLASTx and BLASTn (<xref rid="msv122-B3" ref-type="bibr">Altschul et al. 1997</xref>
) evaluation against NCBI’s nonredundant database and an in-house database of publicly available genomes. CCMP1545 and CCMP490 ATPase Clusters O and P (one clone per strain) each had a one-nucleotide difference from the CCMP1545 genome sequence (<xref rid="msv122-B59" ref-type="bibr">Worden et al. 2009</xref>
) and PCR-derived sequences from this study (across the entire amplicon, including IE sequence); these were considered PCR artifacts and not analyzed further. CCMP1764 DNA was sequenced using the 454-FLX platform.</p>
</sec>
<sec><title>Clustering, RSI Identities, and Phylogenetics</title>
<p>Genomic DNA and cDNA sequences were aligned using ClustalW, or manually in DNAStar or BioEdit. Clustering of environmental clones was performed using BLASTClust (<xref rid="msv122-B2" ref-type="bibr">Altschul et al. 1990</xref>
) with a similarity threshold of 100% and a minimum length coverage of 1.0. Pairwise intron nucleotide identities were computed using EMBOSS water (employing the Smith–Waterman algorithm), unless otherwise specified. Sequence logos were constructed using WebLogo (<xref rid="msv122-B9" ref-type="bibr">Crooks et al. 2004</xref>
) after manual curation of the insertion sequence alignments. RSI identities were calculated for introns in the β-tubulin gene because sequences exist for all cultured <italic>Micromonas</italic>
 clades and three introns are present. Two of these represent two different homologous RSIs (termed 5′ and 3′ here) for which RSI-locus comparisons show 73% (5′ RSI) and 51% (3′ RSI) nucleotide identity between NEPCC29 (Clade C) and RCC299 (Clade A). RSIs at the same loci in Clades D, E1 or E2 have less than 50% identity to those in the other clades. BLASTn queries of the RCC299 and CCMP1545 β-tubulin RSIs against their respective genome sequences return only self-hits, and pairwise alignment of these sequential β-tubulin RSIs yields identities of less than 50% within each strain.</p>
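<p>As an illustration only, clustering at a 100% similarity threshold with full-length coverage behaves much like grouping clones whose sequences are identical. The minimal Python sketch below makes that simplifying assumption; it is not BLASTClust itself. The function names and the toy clone sequences are hypothetical, and treating a sequence and its reverse complement as the same cluster is an additional assumption not stated in the text.</p>
<preformat><![CDATA[
```python
from collections import defaultdict

_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def cluster_identical(clones):
    """Group clone IDs whose sequences are identical over their full length.

    `clones` maps clone ID -> nucleotide sequence. Comparison is
    case-insensitive and strand-independent (an assumption of this sketch).
    """
    clusters = defaultdict(list)
    for clone_id, seq in clones.items():
        seq = seq.upper()
        # Use the lexicographically smaller strand as the cluster key so a
        # sequence and its reverse complement land in the same cluster.
        key = min(seq, reverse_complement(seq))
        clusters[key].append(clone_id)
    # Largest clusters first; ties broken by the first clone ID.
    return sorted(clusters.values(), key=lambda ids: (-len(ids), ids[0]))


# Hypothetical toy data, not sequences from the study.
example = {
    "clone_01": "ATGGCGTACCTG",
    "clone_02": "ATGGCGTACCTG",  # identical to clone_01
    "clone_03": "CAGGTACGCCAT",  # reverse complement of clone_01
    "clone_04": "ATGGCGTACCTA",  # one mismatch, so it forms its own cluster
}
print(cluster_identical(example))
```
]]></preformat>
<p>In this toy input, a single mismatch is enough to place a clone in a separate cluster, which is the behavior expected of a 100% identity threshold.</p>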
<p>For the 18S rRNA gene phylogeny, we retrieved nearly complete (>1,500 bp) 18S rDNA from Mamiellophyceae and prasinophyte sister clades from NCBI and added those generated herein (see above). Sequences were aligned using MAFFT (<xref rid="msv122-B24" ref-type="bibr">Katoh et al. 2005</xref>
). Regions of unambiguous alignment were identified using MUST (<xref rid="msv122-B41" ref-type="bibr">Philippe 1993</xref>
) and all gap-containing positions removed, except for ten positions (corresponding to nucleotides 645–655 in the <italic>Micromonas pusilla</italic>
 CCMP1545 sequence #AY954994) that help resolve <italic>Micromonas</italic>
 clade differences. Phylogenetic reconstructions were statistically evaluated using Bayesian inference (BI) and maximum-likelihood (ML) methods from 1,646 homologous positions. The GTR + Γ + I model of nucleotide substitution was used for both analyses. Phylogenetic analyses were performed using MrBayes 3.232 for BI (<xref rid="msv122-B44" ref-type="bibr">Ronquist et al. 2012</xref>
) and Treefinder for ML (<xref rid="msv122-B23" ref-type="bibr">Jobb et al. 2004</xref>
). Bayesian analyses were performed with two independent runs of 1,000,000 generations each. After a burn-in of 350,000 trees per run, the remaining trees were used to reconstruct a consensus tree and to obtain posterior probabilities for node support. Bootstrap values were calculated using 1,000 replicates with the same substitution model.</p>
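<p>The removal of gap-containing positions, with a small set of columns deliberately retained, can be sketched as follows. This is a minimal Python illustration under stated assumptions and does not reproduce MUST: the function name <monospace>strip_gap_columns</monospace>, the <monospace>keep</monospace> parameter (0-based column indices), and the toy alignment are all hypothetical.</p>
<preformat><![CDATA[
```python
def strip_gap_columns(alignment, keep=()):
    """Drop every alignment column that contains a gap character.

    `alignment` maps sequence name -> aligned sequence (all equal length).
    `keep` lists 0-based column indices to retain even if they hold gaps.
    """
    names = list(alignment)
    rows = [alignment[name] for name in names]
    length = len(rows[0])
    if any(len(row) != length for row in rows):
        raise ValueError("aligned sequences must have equal length")
    keep = set(keep)
    # A column survives if it is explicitly kept or is gap-free in all rows.
    retained = [
        col for col in range(length)
        if col in keep or all(row[col] != "-" for row in rows)
    ]
    return {
        name: "".join(row[col] for col in retained)
        for name, row in zip(names, rows)
    }


# Hypothetical toy alignment, not data from the study.
toy = {
    "clade_A": "ACG-TTA",
    "clade_B": "AC-GTTA",
    "clade_C": "ACGGT-A",
}
print(strip_gap_columns(toy))
print(strip_gap_columns(toy, keep=[3]))
```
]]></preformat>
<p>Retaining selected gapped columns, as in the second call, mirrors the idea of keeping a few positions that help resolve clade differences while discarding all other gapped sites.</p>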
<p>For the ATPase phylogenetic analyses, introns and IEs were removed from the nucleotide sequences. Then, sequences from cultures and representative environmental sequences were aligned using MAFFT (<xref rid="msv122-B24" ref-type="bibr">Katoh et al. 2005</xref>
). Regions with unambiguous alignment were identified using MUST (<xref rid="msv122-B41" ref-type="bibr">Philippe 1993</xref>
), and all gap-containing positions were removed. An ML phylogeny was built from 734 homologous nucleotide positions using the TVM + G model including relaxing parameters of first, second, and third codon positions. The model was selected using Modeltest (<xref rid="msv122-B42" ref-type="bibr">Posada and Crandall 1998</xref>
) as implemented in Treefinder (<xref rid="msv122-B23" ref-type="bibr">Jobb et al. 2004</xref>
). The Dehydrogenase, Transporter, and Actin genes were analyzed similarly but using only sequences from cultures and MMETSP data from the same strains (<xref rid="msv122-B25" ref-type="bibr">Keeling et al. 2014</xref>
) to gain full-length information. The four resulting alignments (including the ATPase) were concatenated. CCMP490 sequences were partial and the missing data were treated as missing entries in the matrix. An ML tree was constructed from 4,612 homologous positions (in the alignments of these four genes) using the same evolutionary model as for the ATPase gene. Bootstrap support was calculated using 1,000 ML replicates for all these phylogenies.</p>
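<p>Concatenating per-gene alignments while coding absent data as missing entries can be sketched in a few lines of Python. This is an illustrative sketch only: the function name, the use of <monospace>?</monospace> as the missing-data symbol, and the toy strains are assumptions, and the text does not specify the software used for this step.</p>
<preformat><![CDATA[
```python
def concatenate_alignments(gene_alignments, missing="?"):
    """Concatenate per-gene alignments into a single supermatrix.

    `gene_alignments` is a list of dicts (taxon -> aligned sequence), one
    dict per gene. A taxon absent from a gene is padded with `missing`.
    """
    # Collect taxa in order of first appearance.
    taxa = []
    for alignment in gene_alignments:
        for taxon in alignment:
            if taxon not in taxa:
                taxa.append(taxon)

    parts = {taxon: [] for taxon in taxa}
    for alignment in gene_alignments:
        lengths = {len(seq) for seq in alignment.values()}
        if len(lengths) != 1:
            raise ValueError("sequences within a gene must be aligned")
        width = lengths.pop()
        for taxon in taxa:
            parts[taxon].append(alignment.get(taxon, missing * width))
    return {taxon: "".join(chunks) for taxon, chunks in parts.items()}


# Hypothetical toy data: strain_3 lacks the first gene entirely.
gene_a = {"strain_1": "ACGT", "strain_2": "ACGA"}
gene_b = {"strain_1": "TTG", "strain_2": "TTC", "strain_3": "TTA"}
supermatrix = concatenate_alignments([gene_a, gene_b])
print(supermatrix)
```
]]></preformat>
<p>A partially sequenced gene would be handled the same way at the column level, with the unsequenced positions already filled by the missing-data symbol in the per-gene alignment.</p>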
</sec>
<sec><title>Metagenome Searches</title>
<p>Metagenomic searches (fig. 2<italic>a</italic>
) were performed using D-IE1.1 and the D-IE from the Transporter (PID68853) to represent Clade D elements, the Transporter-located IE in strain NEPCC29 to represent ABC-IEs, and finally E2-IE Type 1 and Type 2 sequences from the CCMP2099 Transporter, as queries in BLASTn, implemented in CAMERA (<xref rid="msv122-B50" ref-type="bibr">Sun et al. 2011</xref>
) using the CAMERA “all metagenomic 454” data set as of March 1, 2014. The complexity filter was off and only hits with <italic>E</italic> value < 10<sup>−5</sup> were returned; <italic>E</italic> values for hits containing an IE typically ranged from 10<sup>−7</sup> to 10<sup>−100</sup>
. Only metagenomic reads with flanking sequence on either side of the “hit” alignment region were further characterized. Sequences were verified as being IE-like through alignment and used as BLASTn queries against the CCMP2099 and NEPCC29 transcriptomes (<xref rid="msv122-B34" ref-type="bibr">McRose et al. 2014</xref>
). Nucleotide identities were typically 99% between metagenomic flanking sequences and transcripts (Antarctic sequences versus CCMP2099), as well as between the IEC taken from model RCC299 Mipur011i11380 (<xref rid="msv122-B57" ref-type="bibr">Verhelst et al. 2013</xref>
) and the NEPCC29 Transporter.</p>
<p>To confirm identities between Antarctic CDS and CCMP2099 transcripts, the CCMP2099 Transporter E2-IEt1 was reblasted against 400 metagenomic sequences, and the resulting hits were searched using MEME (<xref rid="msv122-B4" ref-type="bibr">Bailey and Elkan 1994</xref>
) to find a common 50 nt motif (the IEs are longer but length variation is typically associated with the start of the polyU-stretch). The motif was then used to search all 400 sequences to find 402 hits in 396 sequences. E2-IEt1s (identified as G[CT]N[2-7 nt]—motif—N[7-37 nt]AG) were then excised from these sequences, and 21 sequences removed from the analysis because although the motif was present, the IE was not complete. The remaining (<italic>n</italic>
 = 375) read segments were used as BLASTn queries against CCMP2099, RCC299, and CCMP1545 sequences. All best hits were to the CCMP2099 transcriptome. The 360 best hits (15 had no hit) were used to compute nucleotide identities between CCMP2099 and protein-encoding portions of the Antarctic metagenomic reads.</p>
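<p>The excision pattern described above (G[CT], 2 to 7 nt, motif, 7 to 37 nt, AG) can be expressed as a regular expression. The Python sketch below is a simplified illustration under several assumptions: the motif is matched as an exact string rather than as the MEME-derived motif, the toy motif is a made-up 10-nt sequence instead of the 50-nt motif, and the shortest possible element is taken when more than one terminal AG would fit. The function and variable names are hypothetical.</p>
<preformat><![CDATA[
```python
import re


def excise_element(read, motif, min_up=2, max_up=7, min_down=7, max_down=37):
    """Find G[CT]-N(2-7)-motif-N(7-37)-AG in `read` and split around it.

    Returns (upstream_flank, element, downstream_flank), or None when the
    read holds no complete element. `motif` is matched as an exact string.
    """
    read = read.upper()
    pattern = re.compile(
        "G[CT][ACGT]{%d,%d}%s[ACGT]{%d,%d}?AG"
        % (min_up, max_up, re.escape(motif.upper()), min_down, max_down)
    )
    match = pattern.search(read)
    if match is None:
        return None
    return read[:match.start()], match.group(0), read[match.end():]


# Hypothetical toy motif and read, not sequences from the study.
toy_motif = "TTTTCCGGAA"
toy_read = "ATGAAA" + "GTAAC" + toy_motif + "CTCTCTCT" + "AG" + "GCTTAA"
print(excise_element(toy_read, toy_motif))

# A read that is truncated after the motif yields no complete element.
print(excise_element("ATGAAAGTAAC" + toy_motif + "C", toy_motif))
```
]]></preformat>
<p>The two flanks returned by the function correspond to the portions of a read that would then be compared against transcripts; reads returning no match correspond to cases where the motif is present but the element is incomplete.</p>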
</sec>
<sec><title>Searches for RNA Structure</title>
<p>IE sequences were submitted to the mFold server <ext-link ext-link-type="uri" xlink:href="http://mfold.rna.albany.edu/?q=mfold">http://mfold.rna.albany.edu/?q=mfold</ext-link>
 (last accessed June 8, 2015) and the mFold output (<xref rid="msv122-B63" ref-type="bibr">Zuker 2003</xref>
) for different IEs was evaluated by inspection. Several structures of decreasing stability were evaluated for each IE. The most frequent common feature of folding for several IEs was the stem loop occupying the 5′ splice site sequence shown in <xref ref-type="fig" rid="msv122-F4">figure 4</xref>
.</p>
</sec>
</sec>
<sec sec-type="supplementary-material"><title>Supplementary Material</title>
<p><ext-link ext-link-type="uri" xlink:href="http://mbe.oxfordjournals.org/lookup/suppl/doi:10.1093/molbev/msv122/-/DC1">Supplementary material</ext-link>
, <ext-link ext-link-type="uri" xlink:href="http://mbe.oxfordjournals.org/lookup/suppl/doi:10.1093/molbev/msv122/-/DC1">tables S1−S5</ext-link>
, <ext-link ext-link-type="uri" xlink:href="http://mbe.oxfordjournals.org/lookup/suppl/doi:10.1093/molbev/msv122/-/DC1">figures S1−S7</ext-link>
, and <ext-link ext-link-type="uri" xlink:href="http://mbe.oxfordjournals.org/lookup/suppl/doi:10.1093/molbev/msv122/-/DC1">datafile S1</ext-link>
 are available at <italic>Molecular Biology and Evolution</italic>
 online (<ext-link ext-link-type="uri" xlink:href="http://www.mbe.oxfordjournals.org/">http://www.mbe.oxfordjournals.org/</ext-link>
).</p>
<supplementary-material id="PMC_1" content-type="local-data"><caption><title>Supplementary Data</title>
</caption>
<media mimetype="text" mime-subtype="html" xlink:href="supp_32_9_2219__index.html"></media>
<media xlink:role="associated-file" mimetype="application" mime-subtype="vnd.ms-excel" xlink:href="supp_msv122_Simmonsetal_SupplementaryDatafile1.xlsx"></media>
<media xlink:role="associated-file" mimetype="application" mime-subtype="pdf" xlink:href="supp_msv122_Simmonsetal_SupplFigs_24Apr2015.pdf"></media>
<media xlink:role="associated-file" mimetype="application" mime-subtype="pdf" xlink:href="supp_msv122_Supplementary_Tables_Simmons_et_al_24Apr2015_final.pdf"></media>
<media xlink:role="associated-file" mimetype="application" mime-subtype="pdf" xlink:href="supp_msv122_Supplementary_Figure_legends_SimmonsBachyetal.pdf"></media>
</supplementary-material>
</sec>
</body>
<back><ack><title>Acknowledgments</title>
<p>The authors thank H.M. Wilcox, E. Demir-Hilton, and D. McRose for assistance. They are grateful to G. Dick and S. Jain for blasting IEs against their deep-sea metagenomes (negative results) and also to J. Sarmiento and L. Talley for kindly discussing NADW formation and movement with us. Sequences generated in this study have been deposited in the NCBI database under accessions KR089059–KR089061 for 18S rRNA gene sequences from <italic>Micromonas</italic>
 RCC434, RCC472, and RCC1614; KR089139–KR089205 for the four genes studied using PCR amplification; representative (nonredundant) sequences KR089062–KR089138 (ATPase) and KR089206–KR089345 (Actin) from environmental clone libraries from the Eastern Pacific; KR152644–KR152649 for the 3,000-m NADW Transporter clones; and genomic DNA 454-FLX reads from CCMP1764 have been deposited in CAMERA under project CAM_PROJ_CCMP1764. M.A. was supported by <funding-source>NIH</funding-source>
 grant <award-id>GM040478</award-id>
. This research was supported by the <funding-source>David and Lucile Packard Foundation</funding-source>
, a <funding-source>Gordon and Betty Moore Foundation Investigator</funding-source>
 Award (<award-id>GBMF3788</award-id>
), <award-id>NSF-IOS0843119</award-id>
, and <award-id>DOE-DE-SC0004765</award-id>
 grants to A.Z.W.</p>
</ack>
<ref-list><title>References</title>
<ref id="msv122-B1"><mixed-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Aguilera</surname>
<given-names>A</given-names>
</name>
<name><surname>Garcia-Muse</surname>
<given-names>T</given-names>
</name>
</person-group>
<year>2012</year>
<article-title>R loops: from transcription byproducts to threats to genome stability</article-title>
. <source>Mol Cell.</source>
<volume>46</volume>
:<fpage>115</fpage>
–<lpage>124</lpage>
.<pub-id pub-id-type="pmid">22541554</pub-id>
</mixed-citation>
</ref>
<ref id="msv122-B2"><mixed-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Altschul</surname>
<given-names>SF</given-names>
</name>
<name><surname>Gish</surname>
<given-names>W</given-names>
</name>
<name><surname>Miller</surname>
<given-names>W</given-names>
</name>
<name><surname>Myers</surname>
<given-names>EW</given-names>
</name>
<name><surname>Lipman</surname>
<given-names>DJ</given-names>
</name>
</person-group>
<year>1990</year>
<article-title>Basic local alignment search tool</article-title>
. <source>J Mol Biol.</source>
<volume>215</volume>
:<fpage>403</fpage>
–<lpage>410</lpage>
.<pub-id pub-id-type="pmid">2231712</pub-id>
</mixed-citation>
</ref>
<ref id="msv122-B3"><mixed-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Altschul</surname>
<given-names>SF</given-names>
</name>
<name><surname>Madden</surname>
<given-names>TL</given-names>
</name>
<name><surname>Schaffer</surname>
<given-names>AA</given-names>
</name>
<name><surname>Zhang</surname>
<given-names>J</given-names>
</name>
<name><surname>Zhang</surname>
<given-names>Z</given-names>
</name>
<name><surname>Miller</surname>
<given-names>W</given-names>
</name>
<name><surname>Lipman</surname>
<given-names>DJ</given-names>
</name>
</person-group>
<year>1997</year>
<article-title>Gapped BLAST and PSI-BLAST: a new generation of protein database search programs</article-title>
. <source>Nucleic Acids Res.</source>
<volume>25</volume>
:<fpage>3389</fpage>
–<lpage>3402</lpage>
.<pub-id pub-id-type="pmid">9254694</pub-id>
</mixed-citation>
</ref>
<ref id="msv122-B4"><mixed-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Bailey</surname>
<given-names>TL</given-names>
</name>
<name><surname>Elkan</surname>
<given-names>C</given-names>
</name>
</person-group>
<year>1994</year>
<article-title>Fitting a mixture model by expectation maximization to discover motifs in biopolymers</article-title>
. <source>Proc Int Conf Intell Syst Mol Biol.</source>
<volume>2</volume>
:<fpage>28</fpage>
–<lpage>36</lpage>
.<pub-id pub-id-type="pmid">7584402</pub-id>
</mixed-citation>
</ref>
<ref id="msv122-B5"><mixed-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Blanc</surname>
<given-names>G</given-names>
</name>
<name><surname>Duncan</surname>
<given-names>G</given-names>
</name>
<name><surname>Agarkova</surname>
<given-names>I</given-names>
</name>
<name><surname>Borodovsky</surname>
<given-names>M</given-names>
</name>
<name><surname>Gurnon</surname>
<given-names>J</given-names>
</name>
<name><surname>Kuo</surname>
<given-names>A</given-names>
</name>
<name><surname>Lindquist</surname>
<given-names>E</given-names>
</name>
<name><surname>Lucas</surname>
<given-names>S</given-names>
</name>
<name><surname>Pangilinan</surname>
<given-names>J</given-names>
</name>
<name><surname>Polle</surname>
<given-names>J</given-names>
</name>
<etal></etal>
</person-group>
<year>2010</year>
<article-title>The <italic>Chlorella variabilis</italic>
 NC64A genome reveals adaptation to photosymbiosis, coevolution with viruses, and cryptic sex</article-title>
. <source>Plant Cell</source>
<volume>22</volume>
:<fpage>2943</fpage>
–<lpage>2955</lpage>
.<pub-id pub-id-type="pmid">20852019</pub-id>
</mixed-citation>
</ref>
<ref id="msv122-B6"><mixed-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Broecker</surname>
<given-names>WS</given-names>
</name>
</person-group>
<year>1991</year>
<article-title>The great ocean conveyor</article-title>
. <source>Oceanography</source>
<volume>4</volume>
:<fpage>79</fpage>
–<lpage>89</lpage>
.</mixed-citation>
</ref>
<ref id="msv122-B7"><mixed-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Brogna</surname>
<given-names>S</given-names>
</name>
<name><surname>Wen</surname>
<given-names>J</given-names>
</name>
</person-group>
<year>2009</year>
<article-title>Nonsense-mediated mRNA decay (NMD) mechanisms</article-title>
. <source>Nat Struct Mol Biol.</source>
<volume>16</volume>
:<fpage>107</fpage>
–<lpage>113</lpage>
.<pub-id pub-id-type="pmid">19190664</pub-id>
</mixed-citation>
</ref>
<ref id="msv122-B8"><mixed-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Chan</surname>
<given-names>YA</given-names>
</name>
<name><surname>Hieter</surname>
<given-names>P</given-names>
</name>
<name><surname>Stirling</surname>
<given-names>PC</given-names>
</name>
</person-group>
<year>2014</year>
<article-title>Mechanisms of genome instability induced by RNA-processing defects</article-title>
. <source>Trends Genet.</source>
<volume>30</volume>
:<fpage>245</fpage>
–<lpage>253</lpage>
.<pub-id pub-id-type="pmid">24794811</pub-id>
</mixed-citation>
</ref>
<ref id="msv122-B9"><mixed-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Crooks</surname>
<given-names>GE</given-names>
</name>
<name><surname>Hon</surname>
<given-names>G</given-names>
</name>
<name><surname>Chandonia</surname>
<given-names>JM</given-names>
</name>
<name><surname>Brenner</surname>
<given-names>SE</given-names>
</name>
</person-group>
<year>2004</year>
<article-title>WebLogo: a sequence logo generator</article-title>
. <source>Genome Res.</source>
<volume>14</volume>
:<fpage>1188</fpage>
–<lpage>1190</lpage>
.<pub-id pub-id-type="pmid">15173120</pub-id>
</mixed-citation>
</ref>
<ref id="msv122-B10"><mixed-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Csuros</surname>
<given-names>M</given-names>
</name>
<name><surname>Rogozin</surname>
<given-names>IB</given-names>
</name>
<name><surname>Koonin</surname>
<given-names>EV</given-names>
</name>
</person-group>
<year>2011</year>
<article-title>A detailed history of intron-rich eukaryotic ancestors inferred from a global survey of 100 complete genomes</article-title>
. <source>PLoS Comput Biol.</source>
<volume>7</volume>
:<fpage>e1002150</fpage>
.<pub-id pub-id-type="pmid">21935348</pub-id>
</mixed-citation>
</ref>
<ref id="msv122-B11"><mixed-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Curtis</surname>
<given-names>BA</given-names>
</name>
<name><surname>Archibald</surname>
<given-names>JM</given-names>
</name>
</person-group>
<year>2010</year>
<article-title>A spliceosomal intron of mitochondrial DNA origin</article-title>
. <source>Curr Biol.</source>
<volume>20</volume>
:<fpage>R919</fpage>
–<lpage>R920</lpage>
.<pub-id pub-id-type="pmid">21056829</pub-id>
</mixed-citation>
</ref>
<ref id="msv122-B12"><mixed-citation publication-type="book"><person-group person-group-type="author"><name><surname>Davis</surname>
<given-names>LG</given-names>
</name>
</person-group>
<year>1986</year>
<article-title>Plasmid “Mini-Prep” method</article-title>
. In: <person-group person-group-type="editor"><name><surname>Davis</surname>
<given-names>LG</given-names>
</name>
<name><surname>Dibner</surname>
<given-names>MD</given-names>
</name>
<name><surname>Battey</surname>
<given-names>JF</given-names>
</name>
</person-group>
, editors. <source>Basic methods in molecular biology</source>
. <publisher-name>Elsevier Science Publishing Co, Inc, New York</publisher-name>
 p. <fpage>102</fpage>
–<lpage>104</lpage>
.</mixed-citation>
</ref>
<ref id="msv122-B13"><mixed-citation publication-type="journal"><person-group person-group-type="author"><name><surname>de Wit</surname>
<given-names>PJ</given-names>
</name>
<name><surname>van der Burgt</surname>
<given-names>A</given-names>
</name>
<name><surname>Okmen</surname>
<given-names>B</given-names>
</name>
<name><surname>Stergiopoulos</surname>
<given-names>I</given-names>
</name>
<name><surname>Abd-Elsalam</surname>
<given-names>KA</given-names>
</name>
<name><surname>Aerts</surname>
<given-names>AL</given-names>
</name>
<name><surname>Bahkali</surname>
<given-names>AH</given-names>
</name>
<name><surname>Beenen</surname>
<given-names>HG</given-names>
</name>
<name><surname>Chettri</surname>
<given-names>P</given-names>
</name>
<name><surname>Cox</surname>
<given-names>MP</given-names>
</name>
<etal></etal>
</person-group>
<year>2012</year>
<article-title>The genomes of the fungal plant pathogens <italic>Cladosporium fulvum</italic>
 and <italic>Dothistroma septosporum</italic>
 reveal adaptation to different hosts and lifestyles but also signatures of common ancestry</article-title>
. <source>PLoS Genet.</source>
<volume>8</volume>
:<fpage>e1003088</fpage>
.<pub-id pub-id-type="pmid">23209441</pub-id>
</mixed-citation>
</ref>
<ref id="msv122-B14"><mixed-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Denoeud</surname>
<given-names>F</given-names>
</name>
<name><surname>Henriet</surname>
<given-names>S</given-names>
</name>
<name><surname>Mungpakdee</surname>
<given-names>S</given-names>
</name>
<name><surname>Aury</surname>
<given-names>JM</given-names>
</name>
<name><surname>Da Silva</surname>
<given-names>C</given-names>
</name>
<name><surname>Brinkmann</surname>
<given-names>H</given-names>
</name>
<name><surname>Mikhaleva</surname>
<given-names>J</given-names>
</name>
<name><surname>Olsen</surname>
<given-names>LC</given-names>
</name>
<name><surname>Jubin</surname>
<given-names>C</given-names>
</name>
<name><surname>Canestro</surname>
<given-names>C</given-names>
</name>
<etal></etal>
</person-group>
<year>2010</year>
<article-title>Plasticity of animal genome architecture unmasked by rapid evolution of a pelagic tunicate</article-title>
. <source>Science</source>
<volume>330</volume>
:<fpage>1381</fpage>
–<lpage>1385</lpage>
.<pub-id pub-id-type="pmid">21097902</pub-id>
</mixed-citation>
</ref>
<ref id="msv122-B15"><mixed-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Dickson</surname>
<given-names>L</given-names>
</name>
<name><surname>Huang</surname>
<given-names>HR</given-names>
</name>
<name><surname>Liu</surname>
<given-names>L</given-names>
</name>
<name><surname>Matsuura</surname>
<given-names>M</given-names>
</name>
<name><surname>Lambowitz</surname>
<given-names>AM</given-names>
</name>
<name><surname>Perlman</surname>
<given-names>PS</given-names>
</name>
</person-group>
<year>2001</year>
<article-title>Retrotransposition of a yeast group II intron occurs by reverse splicing directly into ectopic DNA sites</article-title>
. <source>Proc Natl Acad Sci U S A.</source>
<volume>98</volume>
:<fpage>13207</fpage>
–<lpage>13212</lpage>
.<pub-id pub-id-type="pmid">11687644</pub-id>
</mixed-citation>
</ref>
<ref id="msv122-B16"><mixed-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Eskes</surname>
<given-names>R</given-names>
</name>
<name><surname>Liu</surname>
<given-names>L</given-names>
</name>
<name><surname>Ma</surname>
<given-names>HW</given-names>
</name>
<name><surname>Chao</surname>
<given-names>MY</given-names>
</name>
<name><surname>Dickson</surname>
<given-names>L</given-names>
</name>
<name><surname>Lambowitz</surname>
<given-names>AM</given-names>
</name>
<name><surname>Perlman</surname>
<given-names>PS</given-names>
</name>
</person-group>
<year>2000</year>
<article-title>Multiple homing pathways used by yeast mitochondrial group II introns</article-title>
. <source>Mol Cell Biol.</source>
<volume>20</volume>
:<fpage>8432</fpage>
–<lpage>8446</lpage>
.<pub-id pub-id-type="pmid">11046140</pub-id>
</mixed-citation>
</ref>
<ref id="msv122-B17"><mixed-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Fink</surname>
<given-names>GR</given-names>
</name>
</person-group>
<year>1987</year>
<article-title>Pseudogenes in yeast?</article-title>
. <source>Cell</source>
<volume>49</volume>
:<fpage>5</fpage>
–<lpage>6</lpage>
.<pub-id pub-id-type="pmid">3549000</pub-id>
</mixed-citation>
</ref>
<ref id="msv122-B18"><mixed-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Foulon</surname>
<given-names>E</given-names>
</name>
<name><surname>Not</surname>
<given-names>F</given-names>
</name>
<name><surname>Jalabert</surname>
<given-names>F</given-names>
</name>
<name><surname>Cariou</surname>
<given-names>T</given-names>
</name>
<name><surname>Massana</surname>
<given-names>R</given-names>
</name>
<name><surname>Simon</surname>
<given-names>N</given-names>
</name>
</person-group>
<year>2008</year>
<article-title>Ecological niche partitioning in the picoplanktonic green alga <italic>Micromonas pusilla</italic>
: evidence from environmental surveys using phylogenetic probes</article-title>
. <source>Environ Microbiol.</source>
<volume>10</volume>
:<fpage>2433</fpage>
–<lpage>2443</lpage>
.<pub-id pub-id-type="pmid">18537812</pub-id>
</mixed-citation>
</ref>
<ref id="msv122-B19"><mixed-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Fulford-Smith</surname>
<given-names>SP</given-names>
</name>
<name><surname>Sikes</surname>
<given-names>EL</given-names>
</name>
</person-group>
<year>1996</year>
<article-title>The evolution of Ace Lake, Antarctica, determined from sedimentary diatom assemblages</article-title>
. <source>Palaeogeogr Palaeoclimatol Palaeoecol.</source>
<volume>124</volume>
:<fpage>73</fpage>
–<lpage>86</lpage>
.</mixed-citation>
</ref>
<ref id="msv122-B20"><mixed-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Gilbert</surname>
<given-names>W</given-names>
</name>
</person-group>
<year>1978</year>
<article-title>Why genes in pieces?</article-title>
. <source>Nature</source>
<volume>271</volume>
:<fpage>501</fpage>
.<pub-id pub-id-type="pmid">622185</pub-id>
</mixed-citation>
</ref>
<ref id="msv122-B21"><mixed-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Goodwin</surname>
<given-names>SB</given-names>
</name>
<name><surname>M’Barek</surname>
<given-names>SB</given-names>
</name>
<name><surname>Dhillon</surname>
<given-names>B</given-names>
</name>
<name><surname>Wittenberg</surname>
<given-names>AH</given-names>
</name>
<name><surname>Crane</surname>
<given-names>CF</given-names>
</name>
<name><surname>Hane</surname>
<given-names>JK</given-names>
</name>
<name><surname>Foster</surname>
<given-names>AJ</given-names>
</name>
<name><surname>Van der Lee</surname>
<given-names>TA</given-names>
</name>
<name><surname>Grimwood</surname>
<given-names>J</given-names>
</name>
<name><surname>Aerts</surname>
<given-names>A</given-names>
</name>
<etal></etal>
</person-group>
<year>2011</year>
<article-title>Finished genome of the fungal wheat pathogen <italic>Mycosphaerella graminicola</italic>
 reveals dispensome structure, chromosome plasticity, and stealth pathogenesis</article-title>
. <source>PLoS Genet.</source>
<volume>7</volume>
:<fpage>e1002070</fpage>
.<pub-id pub-id-type="pmid">21695235</pub-id>
</mixed-citation>
</ref>
<ref id="msv122-B22"><mixed-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Huang</surname>
<given-names>S</given-names>
</name>
<name><surname>Chen</surname>
<given-names>Z</given-names>
</name>
<name><surname>Yan</surname>
<given-names>X</given-names>
</name>
<name><surname>Yu</surname>
<given-names>T</given-names>
</name>
<name><surname>Huang</surname>
<given-names>G</given-names>
</name>
<name><surname>Yan</surname>
<given-names>Q</given-names>
</name>
<name><surname>Pontarotti</surname>
<given-names>PA</given-names>
</name>
<name><surname>Zhao</surname>
<given-names>H</given-names>
</name>
<name><surname>Li</surname>
<given-names>J</given-names>
</name>
<name><surname>Yang</surname>
<given-names>P</given-names>
</name>
<etal></etal>
</person-group>
<year>2014</year>
<article-title>Decelerated genome evolution in modern vertebrates revealed by analysis of multiple lancelet genomes</article-title>
. <source>Nat Commun.</source>
<volume>5</volume>
:<fpage>5896</fpage>
.<pub-id pub-id-type="pmid">25523484</pub-id>
</mixed-citation>
</ref>
<ref id="msv122-B23"><mixed-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Jobb</surname>
<given-names>G</given-names>
</name>
<name><surname>von Haeseler</surname>
<given-names>A</given-names>
</name>
<name><surname>Strimmer</surname>
<given-names>K</given-names>
</name>
</person-group>
<year>2004</year>
<article-title>TREEFINDER: a powerful graphical analysis environment for molecular phylogenetics</article-title>
. <source>BMC Evol Biol.</source>
<volume>4</volume>
:<fpage>18</fpage>
.<pub-id pub-id-type="pmid">15222900</pub-id>
</mixed-citation>
</ref>
<ref id="msv122-B24"><mixed-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Katoh</surname>
<given-names>K</given-names>
</name>
<name><surname>Kuma</surname>
<given-names>K</given-names>
</name>
<name><surname>Toh</surname>
<given-names>H</given-names>
</name>
<name><surname>Miyata</surname>
<given-names>T</given-names>
</name>
</person-group>
<year>2005</year>
<article-title>MAFFT version 5: improvement in accuracy of multiple sequence alignment</article-title>
. <source>Nucleic Acids Res.</source>
<volume>33</volume>
:<fpage>511</fpage>
–<lpage>518</lpage>
.<pub-id pub-id-type="pmid">15661851</pub-id>
</mixed-citation>
</ref>
<ref id="msv122-B25"><mixed-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Keeling</surname>
<given-names>PJ</given-names>
</name>
<name><surname>Burki</surname>
<given-names>F</given-names>
</name>
<name><surname>Wilcox</surname>
<given-names>HM</given-names>
</name>
<name><surname>Allam</surname>
<given-names>B</given-names>
</name>
<name><surname>Allen</surname>
<given-names>EE</given-names>
</name>
<name><surname>Amaral-Zettler</surname>
<given-names>LA</given-names>
</name>
<name><surname>Armbrust</surname>
<given-names>EV</given-names>
</name>
<name><surname>Archibald</surname>
<given-names>JM</given-names>
</name>
<name><surname>Bharti</surname>
<given-names>AK</given-names>
</name>
<name><surname>Bell</surname>
<given-names>CJ</given-names>
</name>
<etal></etal>
</person-group>
<year>2014</year>
<article-title>The Marine Microbial Eukaryote Transcriptome Sequencing Project (MMETSP): illuminating the functional diversity of eukaryotic life in the oceans through transcriptome sequencing</article-title>
. <source>PLoS Biol.</source>
<volume>12</volume>
:<fpage>e1001889</fpage>
.<pub-id pub-id-type="pmid">24959919</pub-id>
</mixed-citation>
</ref>
<ref id="msv122-B26"><mixed-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Kilias</surname>
<given-names>ES</given-names>
</name>
<name><surname>Nöthig</surname>
<given-names>E-M</given-names>
</name>
<name><surname>Wolf</surname>
<given-names>C</given-names>
</name>
<name><surname>Metfies</surname>
<given-names>K</given-names>
</name>
</person-group>
<year>2014</year>
<article-title>Picoeukaryote plankton composition off West Spitsbergen at the entrance to the Arctic Ocean</article-title>
. <source>J Euk Microbiol.</source>
<volume>61</volume>
:<fpage>569</fpage>
–<lpage>579</lpage>
.<pub-id pub-id-type="pmid">24996010</pub-id>
</mixed-citation>
</ref>
<ref id="msv122-B27"><mixed-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Koonin</surname>
<given-names>EV</given-names>
</name>
</person-group>
<year>2006</year>
<article-title>The origin of introns and their role in eukaryogenesis: a compromise solution to the introns-early versus introns-late debate?</article-title>
<source>Biol Direct.</source>
<volume>1</volume>
:<fpage>22</fpage>
.<pub-id pub-id-type="pmid">16907971</pub-id>
</mixed-citation>
</ref>
<ref id="msv122-B28"><mixed-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Li</surname>
<given-names>W</given-names>
</name>
<name><surname>Tucker</surname>
<given-names>AE</given-names>
</name>
<name><surname>Sung</surname>
<given-names>W</given-names>
</name>
<name><surname>Thomas</surname>
<given-names>WK</given-names>
</name>
<name><surname>Lynch</surname>
<given-names>M</given-names>
</name>
</person-group>
<year>2009</year>
<article-title>Extensive, recent intron gains in <italic>Daphnia</italic>
 populations</article-title>
. <source>Science</source>
<volume>326</volume>
:<fpage>1260</fpage>
–<lpage>1262</lpage>
.<pub-id pub-id-type="pmid">19965475</pub-id>
</mixed-citation>
</ref>
<ref id="msv122-B29"><mixed-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Li</surname>
<given-names>WKW</given-names>
</name>
<name><surname>McLaughlin</surname>
<given-names>FA</given-names>
</name>
<name><surname>Lovejoy</surname>
<given-names>C</given-names>
</name>
<name><surname>Carmack</surname>
<given-names>EC</given-names>
</name>
</person-group>
<year>2009</year>
<article-title>Smallest algae thrive as the Arctic Ocean freshens</article-title>
. <source>Science</source>
<volume>326</volume>
:<fpage>539</fpage>
.<pub-id pub-id-type="pmid">19900890</pub-id>
</mixed-citation>
</ref>
<ref id="msv122-B30"><mixed-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Llopart</surname>
<given-names>A</given-names>
</name>
<name><surname>Comeron</surname>
<given-names>JM</given-names>
</name>
<name><surname>Brunet</surname>
<given-names>FG</given-names>
</name>
<name><surname>Lachaise</surname>
<given-names>D</given-names>
</name>
<name><surname>Long</surname>
<given-names>M</given-names>
</name>
</person-group>
<year>2002</year>
<article-title>Intron presence-absence polymorphism in <italic>Drosophila</italic>
 driven by positive Darwinian selection</article-title>
. <source>Proc Natl Acad Sci U S A.</source>
<volume>99</volume>
:<fpage>8121</fpage>
–<lpage>8126</lpage>
.<pub-id pub-id-type="pmid">12060758</pub-id>
</mixed-citation>
</ref>
<ref id="msv122-B31"><mixed-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Lovejoy</surname>
<given-names>C</given-names>
</name>
<name><surname>Vincent</surname>
<given-names>WF</given-names>
</name>
<name><surname>Bonilla</surname>
<given-names>S</given-names>
</name>
<name><surname>Roy</surname>
<given-names>S</given-names>
</name>
<name><surname>Martineau</surname>
<given-names>MJ</given-names>
</name>
<name><surname>Terrado</surname>
<given-names>R</given-names>
</name>
<name><surname>Potvin</surname>
<given-names>M</given-names>
</name>
<name><surname>Massana</surname>
<given-names>R</given-names>
</name>
<name><surname>Pedros-Alio</surname>
<given-names>C</given-names>
</name>
</person-group>
<year>2007</year>
<article-title>Distribution, phylogeny, and growth of cold-adapted picoprasinophytes in arctic seas</article-title>
. <source>J Phycol.</source>
<volume>43</volume>
:<fpage>78</fpage>
–<lpage>89</lpage>
.</mixed-citation>
</ref>
<ref id="msv122-B32"><mixed-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Lynch</surname>
<given-names>M</given-names>
</name>
</person-group>
<year>2002</year>
<article-title>Intron evolution as a population-genetic process</article-title>
. <source>Proc Natl Acad Sci U S A.</source>
<volume>99</volume>
:<fpage>6118</fpage>
–<lpage>6123</lpage>
.<pub-id pub-id-type="pmid">11983904</pub-id>
</mixed-citation>
</ref>
<ref id="msv122-B33"><mixed-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Marin</surname>
<given-names>B</given-names>
</name>
<name><surname>Melkonian</surname>
<given-names>M</given-names>
</name>
</person-group>
<year>2010</year>
<article-title>Molecular phylogeny and classification of the Mamiellophyceae class. nov. (Chlorophyta) based on sequence comparisons of the nuclear- and plastid-encoded rRNA operons</article-title>
. <source>Protist</source>
<volume>161</volume>
:<fpage>304</fpage>
–<lpage>336</lpage>
.<pub-id pub-id-type="pmid">20005168</pub-id>
</mixed-citation>
</ref>
<ref id="msv122-B34"><mixed-citation publication-type="journal"><person-group person-group-type="author"><name><surname>McRose</surname>
<given-names>D</given-names>
</name>
<name><surname>Guo</surname>
<given-names>J</given-names>
</name>
<name><surname>Monier</surname>
<given-names>A</given-names>
</name>
<name><surname>Sudek</surname>
<given-names>S</given-names>
</name>
<name><surname>Wilken</surname>
<given-names>S</given-names>
</name>
<name><surname>Yan</surname>
<given-names>S</given-names>
</name>
<name><surname>Mock</surname>
<given-names>T</given-names>
</name>
<name><surname>Archibald</surname>
<given-names>JM</given-names>
</name>
<name><surname>Begley</surname>
<given-names>TP</given-names>
</name>
<name><surname>Reyes-Prieto</surname>
<given-names>A</given-names>
</name>
<etal></etal>
</person-group>
<year>2014</year>
<article-title>Alternatives to vitamin B1 uptake revealed with discovery of riboswitches in multiple marine eukaryotic lineages</article-title>
. <source>ISME J.</source>
<volume>8</volume>
:<fpage>2517</fpage>
–<lpage>2529</lpage>
.<pub-id pub-id-type="pmid">25171333</pub-id>
</mixed-citation>
</ref>
<ref id="msv122-B35"><mixed-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Modrek</surname>
<given-names>B</given-names>
</name>
<name><surname>Lee</surname>
<given-names>C</given-names>
</name>
</person-group>
<year>2002</year>
<article-title>A genomic view of alternative splicing</article-title>
. <source>Nat Genet.</source>
<volume>30</volume>
:<fpage>13</fpage>
–<lpage>19</lpage>
.<pub-id pub-id-type="pmid">11753382</pub-id>
</mixed-citation>
</ref>
<ref id="msv122-B64"><mixed-citation publication-type="other">Molnar P. 2008. Closing of the Central American Seaway and the Ice Age: A critical review. <italic>Paleoceanography</italic>
. 23: PA2201.</mixed-citation>
</ref>
<ref id="msv122-B36"><mixed-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Monier</surname>
<given-names>A</given-names>
</name>
<name><surname>Sudek</surname>
<given-names>S</given-names>
</name>
<name><surname>Fast</surname>
<given-names>NM</given-names>
</name>
<name><surname>Worden</surname>
<given-names>AZ</given-names>
</name>
</person-group>
<year>2013</year>
<article-title>Gene invasion in distant eukaryotic lineages: discovery of mutually exclusive genetic elements reveals marine biodiversity</article-title>
. <source>ISME J.</source>
<volume>7</volume>
:<fpage>1764</fpage>
–<lpage>1774</lpage>
.<pub-id pub-id-type="pmid">23635865</pub-id>
</mixed-citation>
</ref>
<ref id="msv122-B37"><mixed-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Moore</surname>
<given-names>MJ</given-names>
</name>
<name><surname>Sharp</surname>
<given-names>PA</given-names>
</name>
</person-group>
<year>1992</year>
<article-title>Site-specific modification of pre-mRNA: the 2′-hydroxyl groups at the splice sites</article-title>
. <source>Science</source>
<volume>256</volume>
:<fpage>992</fpage>
–<lpage>997</lpage>
.<pub-id pub-id-type="pmid">1589782</pub-id>
</mixed-citation>
</ref>
<ref id="msv122-B38"><mixed-citation publication-type="book"><person-group person-group-type="author"><name><surname>Morozov</surname>
<given-names>EG</given-names>
</name>
<name><surname>Demidov</surname>
<given-names>AN</given-names>
</name>
<name><surname>Tarakanov</surname>
<given-names>RY</given-names>
</name>
<name><surname>Zenk</surname>
<given-names>W</given-names>
</name>
</person-group>
<year>2010</year>
<article-title>Deep water masses of the South and North Atlantic</article-title>
. In: <source>Abyssal channels in the Atlantic Ocean: water structure and flows</source>
. <publisher-loc>New York</publisher-loc>
, <publisher-name>Springer</publisher-name>
 p. <fpage>266</fpage>
.</mixed-citation>
</ref>
<ref id="msv122-B39"><mixed-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Morrison</surname>
<given-names>AK</given-names>
</name>
<name><surname>Frölicher</surname>
<given-names>TL</given-names>
</name>
<name><surname>Sarmiento</surname>
<given-names>JL</given-names>
</name>
</person-group>
<year>2015</year>
<article-title>Upwelling in the Southern Ocean</article-title>
. <source>Physics Today</source>
<volume>68</volume>
:<fpage>27</fpage>
–<lpage>32</lpage>
.</mixed-citation>
</ref>
<ref id="msv122-B40"><mixed-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Parra</surname>
<given-names>G</given-names>
</name>
<name><surname>Bradnam</surname>
<given-names>K</given-names>
</name>
<name><surname>Rose</surname>
<given-names>AB</given-names>
</name>
<name><surname>Korf</surname>
<given-names>I</given-names>
</name>
</person-group>
<year>2011</year>
<article-title>Comparative and functional analysis of intron-mediated enhancement signals reveals conserved features among plants</article-title>
. <source>Nucleic Acids Res.</source>
<volume>39</volume>
:<fpage>5328</fpage>
–<lpage>5337</lpage>
.<pub-id pub-id-type="pmid">21427088</pub-id>
</mixed-citation>
</ref>
<ref id="msv122-B41"><mixed-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Philippe</surname>
<given-names>H</given-names>
</name>
</person-group>
<year>1993</year>
<article-title>MUST, a computer package of Management Utilities for Sequences and Trees</article-title>
. <source>Nucleic Acids Res.</source>
<volume>21</volume>
:<fpage>5264</fpage>
–<lpage>5272</lpage>
.<pub-id pub-id-type="pmid">8255784</pub-id>
</mixed-citation>
</ref>
<ref id="msv122-B42"><mixed-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Posada</surname>
<given-names>D</given-names>
</name>
<name><surname>Crandall</surname>
<given-names>KA</given-names>
</name>
</person-group>
<year>1998</year>
<article-title>Modeltest: testing the model of DNA substitution</article-title>
. <source>Bioinformatics</source>
<volume>14</volume>
:<fpage>817</fpage>
–<lpage>818</lpage>
.<pub-id pub-id-type="pmid">9918953</pub-id>
</mixed-citation>
</ref>
<ref id="msv122-B43"><mixed-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Rogozin</surname>
<given-names>IB</given-names>
</name>
<name><surname>Carmel</surname>
<given-names>L</given-names>
</name>
<name><surname>Csuros</surname>
<given-names>M</given-names>
</name>
<name><surname>Koonin</surname>
<given-names>EV</given-names>
</name>
</person-group>
<year>2012</year>
<article-title>Origin and evolution of spliceosomal introns</article-title>
. <source>Biol Direct.</source>
<volume>7</volume>
:<fpage>11</fpage>
.<pub-id pub-id-type="pmid">22507701</pub-id>
</mixed-citation>
</ref>
<ref id="msv122-B44"><mixed-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Ronquist</surname>
<given-names>F</given-names>
</name>
<name><surname>Teslenko</surname>
<given-names>M</given-names>
</name>
<name><surname>van der Mark</surname>
<given-names>P</given-names>
</name>
<name><surname>Ayres</surname>
<given-names>DL</given-names>
</name>
<name><surname>Darling</surname>
<given-names>A</given-names>
</name>
<name><surname>Hohna</surname>
<given-names>S</given-names>
</name>
<name><surname>Larget</surname>
<given-names>B</given-names>
</name>
<name><surname>Liu</surname>
<given-names>L</given-names>
</name>
<name><surname>Suchard</surname>
<given-names>MA</given-names>
</name>
<name><surname>Huelsenbeck</surname>
<given-names>JP</given-names>
</name>
</person-group>
<year>2012</year>
<article-title>MrBayes 3.2: efficient Bayesian phylogenetic inference and model choice across a large model space</article-title>
. <source>Syst Biol.</source>
<volume>61</volume>
:<fpage>539</fpage>
–<lpage>542</lpage>
.<pub-id pub-id-type="pmid">22357727</pub-id>
</mixed-citation>
</ref>
<ref id="msv122-B45"><mixed-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Roy</surname>
<given-names>SW</given-names>
</name>
</person-group>
<year>2003</year>
<article-title>Recent evidence for the exon theory of genes</article-title>
. <source>Genetica</source>
<volume>118</volume>
:<fpage>251</fpage>
–<lpage>266</lpage>
.<pub-id pub-id-type="pmid">12868614</pub-id>
</mixed-citation>
</ref>
<ref id="msv122-B46"><mixed-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Roy</surname>
<given-names>SW</given-names>
</name>
</person-group>
<year>2006</year>
<article-title>Intron-rich ancestors</article-title>
. <source>Trends Genet.</source>
<volume>22</volume>
:<fpage>468</fpage>
–<lpage>471</lpage>
.<pub-id pub-id-type="pmid">16857287</pub-id>
</mixed-citation>
</ref>
<ref id="msv122-B47"><mixed-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Roy</surname>
<given-names>SW</given-names>
</name>
<name><surname>Gilbert</surname>
<given-names>W</given-names>
</name>
</person-group>
<year>2006</year>
<article-title>The evolution of spliceosomal introns: patterns, puzzles and progress</article-title>
. <source>Nat Rev Genet.</source>
<volume>7</volume>
:<fpage>211</fpage>
–<lpage>221</lpage>
.<pub-id pub-id-type="pmid">16485020</pub-id>
</mixed-citation>
</ref>
<ref id="msv122-B48"><mixed-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Slapeta</surname>
<given-names>J</given-names>
</name>
<name><surname>Lopez-Garcia</surname>
<given-names>P</given-names>
</name>
<name><surname>Moreira</surname>
<given-names>D</given-names>
</name>
</person-group>
<year>2006</year>
<article-title>Global dispersal and ancient cryptic species in the smallest marine eukaryotes</article-title>
. <source>Mol Biol Evol.</source>
<volume>23</volume>
:<fpage>23</fpage>
–<lpage>29</lpage>
.<pub-id pub-id-type="pmid">16120798</pub-id>
</mixed-citation>
</ref>
<ref id="msv122-B49"><mixed-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Storici</surname>
<given-names>F</given-names>
</name>
<name><surname>Bebenek</surname>
<given-names>K</given-names>
</name>
<name><surname>Kunkel</surname>
<given-names>TA</given-names>
</name>
<name><surname>Gordenin</surname>
<given-names>DA</given-names>
</name>
<name><surname>Resnick</surname>
<given-names>MA</given-names>
</name>
</person-group>
<year>2007</year>
<article-title>RNA-templated DNA repair</article-title>
. <source>Nature</source>
<volume>447</volume>
:<fpage>338</fpage>
–<lpage>341</lpage>
.<pub-id pub-id-type="pmid">17429354</pub-id>
</mixed-citation>
</ref>
<ref id="msv122-B50"><mixed-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Sun</surname>
<given-names>S</given-names>
</name>
<name><surname>Chen</surname>
<given-names>J</given-names>
</name>
<name><surname>Li</surname>
<given-names>W</given-names>
</name>
<name><surname>Altintas</surname>
<given-names>I</given-names>
</name>
<name><surname>Lin</surname>
<given-names>A</given-names>
</name>
<name><surname>Peltier</surname>
<given-names>S</given-names>
</name>
<name><surname>Stocks</surname>
<given-names>K</given-names>
</name>
<name><surname>Allen</surname>
<given-names>EE</given-names>
</name>
<name><surname>Ellisman</surname>
<given-names>M</given-names>
</name>
<name><surname>Grethe</surname>
<given-names>J</given-names>
</name>
<etal></etal>
</person-group>
<year>2011</year>
<article-title>Community cyberinfrastructure for Advanced Microbial Ecology Research and Analysis: the CAMERA resource</article-title>
. <source>Nucleic Acids Res.</source>
<volume>39</volume>
:<fpage>D546</fpage>
–<lpage>D551</lpage>
.<pub-id pub-id-type="pmid">21045053</pub-id>
</mixed-citation>
</ref>
<ref id="msv122-B51"><mixed-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Sverdlov</surname>
<given-names>AV</given-names>
</name>
<name><surname>Rogozin</surname>
<given-names>IB</given-names>
</name>
<name><surname>Babenko</surname>
<given-names>VN</given-names>
</name>
<name><surname>Koonin</surname>
<given-names>EV</given-names>
</name>
</person-group>
<year>2005</year>
<article-title>Conservation versus parallel gains in intron evolution</article-title>
. <source>Nucleic Acids Res.</source>
<volume>33</volume>
:<fpage>1741</fpage>
–<lpage>1748</lpage>
.<pub-id pub-id-type="pmid">15788746</pub-id>
</mixed-citation>
</ref>
<ref id="msv122-B52"><mixed-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Talley</surname>
<given-names>LD</given-names>
</name>
</person-group>
<year>2013</year>
<article-title>Closure of the global overturning circulation through the Indian, Pacific, and Southern Oceans: schematics and transports</article-title>
. <source>Oceanography</source>
<volume>26</volume>
:<fpage>80</fpage>
–<lpage>97</lpage>
.</mixed-citation>
</ref>
<ref id="msv122-B53"><mixed-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Torriani</surname>
<given-names>SF</given-names>
</name>
<name><surname>Stukenbrock</surname>
<given-names>EH</given-names>
</name>
<name><surname>Brunner</surname>
<given-names>PC</given-names>
</name>
<name><surname>McDonald</surname>
<given-names>BA</given-names>
</name>
<name><surname>Croll</surname>
<given-names>D</given-names>
</name>
</person-group>
<year>2011</year>
<article-title>Evidence for extensive recent intron transposition in closely related fungi</article-title>
. <source>Curr Biol.</source>
<volume>21</volume>
:<fpage>2017</fpage>
–<lpage>2022</lpage>
.<pub-id pub-id-type="pmid">22100062</pub-id>
</mixed-citation>
</ref>
<ref id="msv122-B54"><mixed-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Tseng</surname>
<given-names>CK</given-names>
</name>
<name><surname>Cheng</surname>
<given-names>SC</given-names>
</name>
</person-group>
<year>2008</year>
<article-title>Both catalytic steps of nuclear pre-mRNA splicing are reversible</article-title>
. <source>Science</source>
<volume>320</volume>
:<fpage>1782</fpage>
–<lpage>1784</lpage>
.<pub-id pub-id-type="pmid">18583613</pub-id>
</mixed-citation>
</ref>
<ref id="msv122-B55"><mixed-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Tseng</surname>
<given-names>CK</given-names>
</name>
<name><surname>Cheng</surname>
<given-names>SC</given-names>
</name>
</person-group>
<year>2013</year>
<article-title>The spliceosome catalyzes debranching in competition with reverse of the first chemical reaction</article-title>
. <source>RNA</source>
<volume>19</volume>
:<fpage>971</fpage>
–<lpage>981</lpage>
.<pub-id pub-id-type="pmid">23681507</pub-id>
</mixed-citation>
</ref>
<ref id="msv122-B56"><mixed-citation publication-type="journal"><person-group person-group-type="author"><name><surname>van der Burgt</surname>
<given-names>A</given-names>
</name>
<name><surname>Severing</surname>
<given-names>E</given-names>
</name>
<name><surname>de Wit</surname>
<given-names>PJ</given-names>
</name>
<name><surname>Collemare</surname>
<given-names>J</given-names>
</name>
</person-group>
<year>2012</year>
<article-title>Birth of new spliceosomal introns in fungi by multiplication of introner-like elements</article-title>
. <source>Curr Biol.</source>
<volume>22</volume>
:<fpage>1260</fpage>
–<lpage>1265</lpage>
.<pub-id pub-id-type="pmid">22658596</pub-id>
</mixed-citation>
</ref>
<ref id="msv122-B57"><mixed-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Verhelst</surname>
<given-names>B</given-names>
</name>
<name><surname>Van de Peer</surname>
<given-names>Y</given-names>
</name>
<name><surname>Rouze</surname>
<given-names>P</given-names>
</name>
</person-group>
<year>2013</year>
<article-title>The complex intron landscape and massive intron invasion in a picoeukaryote provides insights into intron evolution</article-title>
. <source>Genome Biol Evol.</source>
<volume>5</volume>
:<fpage>2393</fpage>
–<lpage>2401</lpage>
.<pub-id pub-id-type="pmid">24273312</pub-id>
</mixed-citation>
</ref>
<ref id="msv122-B58"><mixed-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Worden</surname>
<given-names>AZ</given-names>
</name>
</person-group>
<year>2006</year>
<article-title>Picoeukaryote diversity in coastal waters of the Pacific Ocean</article-title>
. <source>Aquat Microb Ecol.</source>
<volume>43</volume>
:<fpage>165</fpage>
–<lpage>175</lpage>
.</mixed-citation>
</ref>
<ref id="msv122-B59"><mixed-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Worden</surname>
<given-names>AZ</given-names>
</name>
<name><surname>Lee</surname>
<given-names>JH</given-names>
</name>
<name><surname>Mock</surname>
<given-names>T</given-names>
</name>
<name><surname>Rouze</surname>
<given-names>P</given-names>
</name>
<name><surname>Simmons</surname>
<given-names>MP</given-names>
</name>
<name><surname>Aerts</surname>
<given-names>AL</given-names>
</name>
<name><surname>Allen</surname>
<given-names>AE</given-names>
</name>
<name><surname>Cuvelier</surname>
<given-names>ML</given-names>
</name>
<name><surname>Derelle</surname>
<given-names>E</given-names>
</name>
<name><surname>Everett</surname>
<given-names>MV</given-names>
</name>
<etal></etal>
</person-group>
<year>2009</year>
<article-title>Green evolution and dynamic adaptations revealed by genomes of the marine picoeukaryotes <italic>Micromonas</italic>
</article-title>
. <source>Science</source>
<volume>324</volume>
:<fpage>268</fpage>
–<lpage>272</lpage>
.<pub-id pub-id-type="pmid">19359590</pub-id>
</mixed-citation>
</ref>
<ref id="msv122-B60"><mixed-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Worden</surname>
<given-names>AZ</given-names>
</name>
<name><surname>Nolan</surname>
<given-names>JK</given-names>
</name>
<name><surname>Palenik</surname>
<given-names>B</given-names>
</name>
</person-group>
<year>2004</year>
<article-title>Assessing the dynamics and ecology of marine picophytoplankton: the importance of the eukaryotic component</article-title>
. <source>Limnol Oceanogr.</source>
<volume>49</volume>
:<fpage>168</fpage>
–<lpage>179</lpage>
.</mixed-citation>
</ref>
<ref id="msv122-B61"><mixed-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Yenerall</surname>
<given-names>P</given-names>
</name>
<name><surname>Zhou</surname>
<given-names>L</given-names>
</name>
</person-group>
<year>2012</year>
<article-title>Identifying the mechanisms of intron gain: progress and trends</article-title>
. <source>Biol Direct.</source>
<volume>7</volume>
:<fpage>29</fpage>
.<pub-id pub-id-type="pmid">22963364</pub-id>
</mixed-citation>
</ref>
<ref id="msv122-B62"><mixed-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Zimmerly</surname>
<given-names>S</given-names>
</name>
<name><surname>Guo</surname>
<given-names>H</given-names>
</name>
<name><surname>Eskes</surname>
<given-names>R</given-names>
</name>
<name><surname>Yang</surname>
<given-names>J</given-names>
</name>
<name><surname>Perlman</surname>
<given-names>PS</given-names>
</name>
<name><surname>Lambowitz</surname>
<given-names>AM</given-names>
</name>
</person-group>
<year>1995</year>
<article-title>A group II intron RNA is a catalytic component of a DNA endonuclease involved in intron mobility</article-title>
. <source>Cell</source>
<volume>83</volume>
:<fpage>529</fpage>
–<lpage>538</lpage>
.<pub-id pub-id-type="pmid">7585955</pub-id>
</mixed-citation>
</ref>
<ref id="msv122-B63"><mixed-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Zuker</surname>
<given-names>M</given-names>
</name>
</person-group>
<year>2003</year>
<article-title>Mfold web server for nucleic acid folding and hybridization prediction</article-title>
. <source>Nucleic Acids Res.</source>
<volume>31</volume>
:<fpage>3406</fpage>
–<lpage>3415</lpage>
.<pub-id pub-id-type="pmid">12824337</pub-id>
</mixed-citation>
</ref>
</ref-list>
</back>
</pmc>
</record>

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Ticri/CIDE/explor/CyberinfraV1/Data/Pmc/Corpus

HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000566 | SxmlIndent | more

HfdSelect -h $EXPLOR_AREA/Data/Pmc/Corpus/biblio.hfd -nk 000566 | SxmlIndent | more

To link to this page within the Wicri network

{{Explor lien
   |wiki=    Ticri/CIDE
   |area=    CyberinfraV1
   |flux=    Pmc
   |étape=   Corpus
   |type=    RBID
   |clé=     PMC:4540971
   |texte=   Intron Invasions Trace Algal Speciation and Reveal Nearly Identical Arctic and Antarctic Micromonas Populations
}}

To generate wiki pages

HfdIndexSelect -h $EXPLOR_AREA/Data/Pmc/Corpus/RBID.i   -Sk "pubmed:25998521" \
       | HfdSelect -Kh $EXPLOR_AREA/Data/Pmc/Corpus/biblio.hfd   \
       | NlmPubMed2Wicri -a CyberinfraV1

This area was generated with Dilib version V0.6.25.
Data generation: Thu Oct 27 09:30:58 2016. Site generation: Sun Mar 10 23:08:40 2024

	Cyberinfrastructure exploration server
	Warning: this site is under development! It was generated automatically from raw corpora, so the information has not been validated.
