Cyberinfrastructure exploration server

Warning: this site is under development!
Warning: this site was generated automatically from raw corpora.
The information has therefore not been validated.

Nucleomorph and plastid genome sequences of the chlorarachniophyte Lotharella oceanica: convergent reductive evolution and frequent recombination in nucleomorph-bearing algae

Internal identifier: 000167 (Pmc/Corpus); previous: 000166; next: 000168

Authors: Goro Tanifuji; Naoko T. Onodera; Matthew W. Brown; Bruce A. Curtis; Andrew J. Roger; Gane Ka-Shu Wong; Michael Melkonian; John M. Archibald

Source: BMC Genomics (2014)

RBID : PMC:4035089

Abstract

Background

Nucleomorphs are residual nuclei derived from eukaryotic endosymbionts in chlorarachniophyte and cryptophyte algae. The endosymbionts that gave rise to nucleomorphs and plastids in these two algal groups were green and red algae, respectively. Despite their independent origin, the chlorarachniophyte and cryptophyte nucleomorph genomes share similar genomic features such as extreme size reduction and a three-chromosome architecture. This suggests that similar reductive evolutionary forces have acted to shape the nucleomorph genomes in the two groups. Thus far, however, only a single chlorarachniophyte nucleomorph and plastid genome has been sequenced, making broad evolutionary inferences within the chlorarachniophytes and between chlorarachniophytes and cryptophytes difficult. We have sequenced the nucleomorph and plastid genomes of the chlorarachniophyte Lotharella oceanica in order to gain insight into nucleomorph and plastid genome diversity and evolution.

Results

The L. oceanica nucleomorph genome was found to consist of three linear chromosomes totaling ~610 kilobase pairs (kbp), much larger than the 373 kbp nucleomorph genome of the model chlorarachniophyte Bigelowiella natans. The L. oceanica plastid genome is 71 kbp in size, similar to that of B. natans. Unexpectedly long (~35 kbp) sub-telomeric repeat regions were identified in the L. oceanica nucleomorph genome; internal multi-copy regions were also detected. Gene content analyses revealed that nucleomorph house-keeping genes and spliceosomal intron positions are well conserved between the L. oceanica and B. natans nucleomorph genomes. More broadly, gene retention patterns were found to be similar between nucleomorph genomes in chlorarachniophytes and cryptophytes. Chlorarachniophyte plastid genomes showed near identical protein coding gene complements as well as a high level of synteny.

Conclusions

We have provided insight into the process of nucleomorph genome evolution by elucidating the fine-scale dynamics of sub-telomeric repeat regions. Homologous recombination at the chromosome ends appears to be frequent, serving to expand and contract nucleomorph genome size. The main factor influencing nucleomorph genome size variation between different chlorarachniophyte species appears to be expansion-contraction of these telomere-associated repeats rather than changes in the number of unique protein coding genes. The dynamic nature of chlorarachniophyte nucleomorph genomes lies in stark contrast to their plastid genomes, which appear to be highly stable in terms of gene content and synteny.
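As a rough check on the scale of the sub-telomeric repeats, the figures quoted in the abstract can be combined in a few lines of Python. This is only a back-of-envelope sketch: it assumes that every end of each of the three linear chromosomes carries one ~35 kbp repeat region, which the abstract does not state, and it ignores the internal multi-copy regions and any repeats in B. natans. The function name `size_summary` is hypothetical.

```python
# Figures quoted in the abstract, all in kilobase pairs (kbp).
L_OCEANICA_NM_KBP = 610        # approximate L. oceanica nucleomorph genome size
B_NATANS_NM_KBP = 373          # B. natans nucleomorph genome size
SUBTELOMERIC_REPEAT_KBP = 35   # approximate length of one sub-telomeric repeat region
CHROMOSOMES = 3                # three linear nucleomorph chromosomes

# ASSUMPTION (not stated in the abstract): each linear chromosome
# carries one repeat region at both of its ends.
ENDS_PER_CHROMOSOME = 2


def size_summary():
    """Return rough size figures derived from the abstract's numbers."""
    repeat_total = CHROMOSOMES * ENDS_PER_CHROMOSOME * SUBTELOMERIC_REPEAT_KBP
    size_difference = L_OCEANICA_NM_KBP - B_NATANS_NM_KBP
    repeat_fraction = repeat_total / L_OCEANICA_NM_KBP
    return {
        "repeat_total_kbp": repeat_total,
        "size_difference_kbp": size_difference,
        "repeat_fraction": repeat_fraction,
    }


if __name__ == "__main__":
    summary = size_summary()
    print(f"Sub-telomeric repeats (assumed total): {summary['repeat_total_kbp']} kbp")
    print(f"L. oceanica minus B. natans: {summary['size_difference_kbp']} kbp")
    print(f"Repeat share of L. oceanica genome: {summary['repeat_fraction']:.0%}")
```

Under that assumption the repeats would total about 210 kbp, roughly a third of the L. oceanica nucleomorph genome and of the same order as the 237 kbp size difference between the two species. The comparison is indicative only, since it does not account for repeat content in B. natans.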

Electronic supplementary material

The online version of this article (doi:10.1186/1471-2164-15-374) contains supplementary material, which is available to authorized users.


URL: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4035089
DOI: 10.1186/1471-2164-15-374
PubMed: 24885563
PubMed Central: 4035089

Links to Exploration step

PMC:4035089

The document in XML format

<record>
<TEI>
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en">Nucleomorph and plastid genome sequences of the chlorarachniophyte <italic>Lotharella oceanica</italic>: convergent reductive evolution and frequent recombination in nucleomorph-bearing algae</title>
<author>
<name sortKey="Tanifuji, Goro" sort="Tanifuji, Goro" uniqKey="Tanifuji G" first="Goro" last="Tanifuji">Goro Tanifuji</name>
<affiliation>
<nlm:aff id="Aff1">Department of Biochemistry and Molecular Biology, Canadian Institute for Advanced Research, Integrated Microbial Biodiversity Program, Dalhousie University, Halifax, Nova Scotia B3H 4R2 Canada</nlm:aff>
</affiliation>
<affiliation>
<nlm:aff id="Aff2">Faculty of Life and Environmental Sciences, University of Tsukuba, Tsukuba, Ibaraki 305-8577 Japan</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Onodera, Naoko T" sort="Onodera, Naoko T" uniqKey="Onodera N" first="Naoko T" last="Onodera">Naoko T. Onodera</name>
<affiliation>
<nlm:aff id="Aff1">Department of Biochemistry and Molecular Biology, Canadian Institute for Advanced Research, Integrated Microbial Biodiversity Program, Dalhousie University, Halifax, Nova Scotia B3H 4R2 Canada</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Brown, Matthew W" sort="Brown, Matthew W" uniqKey="Brown M" first="Matthew W" last="Brown">Matthew W. Brown</name>
<affiliation>
<nlm:aff id="Aff3">Department of Biological Sciences, Mississippi State University, Mississippi State Mississippi, 39762 USA</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Curtis, Bruce A" sort="Curtis, Bruce A" uniqKey="Curtis B" first="Bruce A" last="Curtis">Bruce A. Curtis</name>
<affiliation>
<nlm:aff id="Aff1">Department of Biochemistry and Molecular Biology, Canadian Institute for Advanced Research, Integrated Microbial Biodiversity Program, Dalhousie University, Halifax, Nova Scotia B3H 4R2 Canada</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Roger, Andrew J" sort="Roger, Andrew J" uniqKey="Roger A" first="Andrew J" last="Roger">Andrew J. Roger</name>
<affiliation>
<nlm:aff id="Aff1">Department of Biochemistry and Molecular Biology, Canadian Institute for Advanced Research, Integrated Microbial Biodiversity Program, Dalhousie University, Halifax, Nova Scotia B3H 4R2 Canada</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Ka Shu Wong, Gane" sort="Ka Shu Wong, Gane" uniqKey="Ka Shu Wong G" first="Gane" last="Ka-Shu Wong">Gane Ka-Shu Wong</name>
<affiliation>
<nlm:aff id="Aff4">Department of Biological Sciences, University of Alberta, Edmonton, AB T6G 2E9 Canada</nlm:aff>
</affiliation>
<affiliation>
<nlm:aff id="Aff5">Department of Medicine, University of Alberta, Edmonton, AB T6G 2E1 Canada</nlm:aff>
</affiliation>
<affiliation>
<nlm:aff id="Aff6">BGI-Shenzhen, Beishan Industrial Zone, Yantian District, Shenzhen 518083 China</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Melkonian, Michael" sort="Melkonian, Michael" uniqKey="Melkonian M" first="Michael" last="Melkonian">Michael Melkonian</name>
<affiliation>
<nlm:aff id="Aff7">Department of Botany, Cologne Biocenter, University of Cologne, Cologne, 50674 Germany</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Archibald, John M" sort="Archibald, John M" uniqKey="Archibald J" first="John M" last="Archibald">John M. Archibald</name>
<affiliation>
<nlm:aff id="Aff1">Department of Biochemistry and Molecular Biology, Canadian Institute for Advanced Research, Integrated Microbial Biodiversity Program, Dalhousie University, Halifax, Nova Scotia B3H 4R2 Canada</nlm:aff>
</affiliation>
</author>
</titleStmt>
<publicationStmt>
<idno type="wicri:source">PMC</idno>
<idno type="pmid">24885563</idno>
<idno type="pmc">4035089</idno>
<idno type="url">http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4035089</idno>
<idno type="RBID">PMC:4035089</idno>
<idno type="doi">10.1186/1471-2164-15-374</idno>
<date when="2014">2014</date>
<idno type="wicri:Area/Pmc/Corpus">000167</idno>
</publicationStmt>
<sourceDesc>
<biblStruct>
<analytic>
<title xml:lang="en" level="a" type="main">Nucleomorph and plastid genome sequences of the chlorarachniophyte <italic>Lotharella oceanica</italic>: convergent reductive evolution and frequent recombination in nucleomorph-bearing algae</title>
<author>
<name sortKey="Tanifuji, Goro" sort="Tanifuji, Goro" uniqKey="Tanifuji G" first="Goro" last="Tanifuji">Goro Tanifuji</name>
<affiliation>
<nlm:aff id="Aff1">Department of Biochemistry and Molecular Biology, Canadian Institute for Advanced Research, Integrated Microbial Biodiversity Program, Dalhousie University, Halifax, Nova Scotia B3H 4R2 Canada</nlm:aff>
</affiliation>
<affiliation>
<nlm:aff id="Aff2">Faculty of Life and Environmental Sciences, University of Tsukuba, Tsukuba, Ibaraki 305-8577 Japan</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Onodera, Naoko T" sort="Onodera, Naoko T" uniqKey="Onodera N" first="Naoko T" last="Onodera">Naoko T. Onodera</name>
<affiliation>
<nlm:aff id="Aff1">Department of Biochemistry and Molecular Biology, Canadian Institute for Advanced Research, Integrated Microbial Biodiversity Program, Dalhousie University, Halifax, Nova Scotia B3H 4R2 Canada</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Brown, Matthew W" sort="Brown, Matthew W" uniqKey="Brown M" first="Matthew W" last="Brown">Matthew W. Brown</name>
<affiliation>
<nlm:aff id="Aff3">Department of Biological Sciences, Mississippi State University, Mississippi State Mississippi, 39762 USA</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Curtis, Bruce A" sort="Curtis, Bruce A" uniqKey="Curtis B" first="Bruce A" last="Curtis">Bruce A. Curtis</name>
<affiliation>
<nlm:aff id="Aff1">Department of Biochemistry and Molecular Biology, Canadian Institute for Advanced Research, Integrated Microbial Biodiversity Program, Dalhousie University, Halifax, Nova Scotia B3H 4R2 Canada</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Roger, Andrew J" sort="Roger, Andrew J" uniqKey="Roger A" first="Andrew J" last="Roger">Andrew J. Roger</name>
<affiliation>
<nlm:aff id="Aff1">Department of Biochemistry and Molecular Biology, Canadian Institute for Advanced Research, Integrated Microbial Biodiversity Program, Dalhousie University, Halifax, Nova Scotia B3H 4R2 Canada</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Ka Shu Wong, Gane" sort="Ka Shu Wong, Gane" uniqKey="Ka Shu Wong G" first="Gane" last="Ka-Shu Wong">Gane Ka-Shu Wong</name>
<affiliation>
<nlm:aff id="Aff4">Department of Biological Sciences, University of Alberta, Edmonton, AB T6G 2E9 Canada</nlm:aff>
</affiliation>
<affiliation>
<nlm:aff id="Aff5">Department of Medicine, University of Alberta, Edmonton, AB T6G 2E1 Canada</nlm:aff>
</affiliation>
<affiliation>
<nlm:aff id="Aff6">BGI-Shenzhen, Beishan Industrial Zone, Yantian District, Shenzhen 518083 China</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Melkonian, Michael" sort="Melkonian, Michael" uniqKey="Melkonian M" first="Michael" last="Melkonian">Michael Melkonian</name>
<affiliation>
<nlm:aff id="Aff7">Department of Botany, Cologne Biocenter, University of Cologne, Cologne, 50674 Germany</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Archibald, John M" sort="Archibald, John M" uniqKey="Archibald J" first="John M" last="Archibald">John M. Archibald</name>
<affiliation>
<nlm:aff id="Aff1">Department of Biochemistry and Molecular Biology, Canadian Institute for Advanced Research, Integrated Microbial Biodiversity Program, Dalhousie University, Halifax, Nova Scotia B3H 4R2 Canada</nlm:aff>
</affiliation>
</author>
</analytic>
<series>
<title level="j">BMC Genomics</title>
<idno type="eISSN">1471-2164</idno>
<imprint>
<date when="2014">2014</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
</fileDesc>
<profileDesc>
<textClass></textClass>
</profileDesc>
</teiHeader>
<front>
<div type="abstract" xml:lang="en">
<sec>
<title>Background</title>
<p>Nucleomorphs are residual nuclei derived from eukaryotic endosymbionts in chlorarachniophyte and cryptophyte algae. The endosymbionts that gave rise to nucleomorphs and plastids in these two algal groups were green and red algae, respectively. Despite their independent origin, the chlorarachniophyte and cryptophyte nucleomorph genomes share similar genomic features such as extreme size reduction and a three-chromosome architecture. This suggests that similar reductive evolutionary forces have acted to shape the nucleomorph genomes in the two groups. Thus far, however, only a single chlorarachniophyte nucleomorph and plastid genome has been sequenced, making broad evolutionary inferences within the chlorarachniophytes and between chlorarachniophytes and cryptophytes difficult. We have sequenced the nucleomorph and plastid genomes of the chlorarachniophyte <italic>Lotharella oceanica</italic> in order to gain insight into nucleomorph and plastid genome diversity and evolution.</p>
</sec>
<sec>
<title>Results</title>
<p>The <italic>L. oceanica</italic> nucleomorph genome was found to consist of three linear chromosomes totaling ~610 kilobase pairs (kbp), much larger than the 373 kbp nucleomorph genome of the model chlorarachniophyte <italic>Bigelowiella natans</italic>. The <italic>L. oceanica</italic> plastid genome is 71 kbp in size, similar to that of <italic>B. natans</italic>. Unexpectedly long (~35 kbp) sub-telomeric repeat regions were identified in the <italic>L. oceanica</italic> nucleomorph genome; internal multi-copy regions were also detected. Gene content analyses revealed that nucleomorph house-keeping genes and spliceosomal intron positions are well conserved between the <italic>L. oceanica</italic> and <italic>B. natans</italic> nucleomorph genomes. More broadly, gene retention patterns were found to be similar between nucleomorph genomes in chlorarachniophytes and cryptophytes. Chlorarachniophyte plastid genomes showed near identical protein coding gene complements as well as a high level of synteny.</p>
</sec>
<sec>
<title>Conclusions</title>
<p>We have provided insight into the process of nucleomorph genome evolution by elucidating the fine-scale dynamics of sub-telomeric repeat regions. Homologous recombination at the chromosome ends appears to be frequent, serving to expand and contract nucleomorph genome size. The main factor influencing nucleomorph genome size variation between different chlorarachniophyte species appears to be expansion-contraction of these telomere-associated repeats rather than changes in the number of unique protein coding genes. The dynamic nature of chlorarachniophyte nucleomorph genomes lies in stark contrast to their plastid genomes, which appear to be highly stable in terms of gene content and synteny.</p>
</sec>
<sec>
<title>Electronic supplementary material</title>
<p>The online version of this article (doi:10.1186/1471-2164-15-374) contains supplementary material, which is available to authorized users.</p>
</sec>
</div>
</front>
<back>
<div1 type="bibliography">
<listBibl>
<biblStruct>
<analytic>
<author>
<name sortKey="Dolezal, P" uniqKey="Dolezal P">P Dolezal</name>
</author>
<author>
<name sortKey="Likic, V" uniqKey="Likic V">V Likic</name>
</author>
<author>
<name sortKey="Tachezy, J" uniqKey="Tachezy J">J Tachezy</name>
</author>
<author>
<name sortKey="Lithgow, T" uniqKey="Lithgow T">T Lithgow</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Gray, Mw" uniqKey="Gray M">MW Gray</name>
</author>
<author>
<name sortKey="Burger, G" uniqKey="Burger G">G Burger</name>
</author>
<author>
<name sortKey="Lang, Bf" uniqKey="Lang B">BF Lang</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Gould, Sb" uniqKey="Gould S">SB Gould</name>
</author>
<author>
<name sortKey="Waller, Rr" uniqKey="Waller R">RR Waller</name>
</author>
<author>
<name sortKey="Mcfadden, Gi" uniqKey="Mcfadden G">GI McFadden</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Reyes Prieto, A" uniqKey="Reyes Prieto A">A Reyes-Prieto</name>
</author>
<author>
<name sortKey="Weber, Apm" uniqKey="Weber A">APM Weber</name>
</author>
<author>
<name sortKey="Bhattacharya, D" uniqKey="Bhattacharya D">D Bhattacharya</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Martin, W" uniqKey="Martin W">W Martin</name>
</author>
<author>
<name sortKey="Herrmann, Rg" uniqKey="Herrmann R">RG Herrmann</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Martin, W" uniqKey="Martin W">W Martin</name>
</author>
<author>
<name sortKey="Rujan, T" uniqKey="Rujan T">T Rujan</name>
</author>
<author>
<name sortKey="Richly, E" uniqKey="Richly E">E Richly</name>
</author>
<author>
<name sortKey="Hansen, A" uniqKey="Hansen A">A Hansen</name>
</author>
<author>
<name sortKey="Cornelsen, S" uniqKey="Cornelsen S">S Cornelsen</name>
</author>
<author>
<name sortKey="Lins, T" uniqKey="Lins T">T Lins</name>
</author>
<author>
<name sortKey="Leister, D" uniqKey="Leister D">D Leister</name>
</author>
<author>
<name sortKey="Stoebe, B" uniqKey="Stoebe B">B Stoebe</name>
</author>
<author>
<name sortKey="Hasegawa, M" uniqKey="Hasegawa M">M Hasegawa</name>
</author>
<author>
<name sortKey="Penny, D" uniqKey="Penny D">D Penny</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Timmis, Jn" uniqKey="Timmis J">JN Timmis</name>
</author>
<author>
<name sortKey="Ayliffe, Ma" uniqKey="Ayliffe M">MA Ayliffe</name>
</author>
<author>
<name sortKey="Huang, Cy" uniqKey="Huang C">CY Huang</name>
</author>
<author>
<name sortKey="Martin, W" uniqKey="Martin W">W Martin</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Zybailov, B" uniqKey="Zybailov B">B Zybailov</name>
</author>
<author>
<name sortKey="Rutschow, H" uniqKey="Rutschow H">H Rutschow</name>
</author>
<author>
<name sortKey="Friso, G" uniqKey="Friso G">G Friso</name>
</author>
<author>
<name sortKey="Rudella, A" uniqKey="Rudella A">A Rudella</name>
</author>
<author>
<name sortKey="Emanuelsson, O" uniqKey="Emanuelsson O">O Emanuelsson</name>
</author>
<author>
<name sortKey="Sun, Q" uniqKey="Sun Q">Q Sun</name>
</author>
<author>
<name sortKey="Van Wijk, Kj" uniqKey="Van Wijk K">KJ van Wijk</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Tanifuji, G" uniqKey="Tanifuji G">G Tanifuji</name>
</author>
<author>
<name sortKey="Archibald, Jm" uniqKey="Archibald J">JM Archibald</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Douglas, S" uniqKey="Douglas S">S Douglas</name>
</author>
<author>
<name sortKey="Murphy, Ca" uniqKey="Murphy C">CA Murphy</name>
</author>
<author>
<name sortKey="Spencer, Df" uniqKey="Spencer D">DF Spencer</name>
</author>
<author>
<name sortKey="Gray, Mw" uniqKey="Gray M">MW Gray</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Van De Peer, Y" uniqKey="Van De Peer Y">Y van de Peer</name>
</author>
<author>
<name sortKey="Rensing, Sa" uniqKey="Rensing S">SA Rensing</name>
</author>
<author>
<name sortKey="Maier, Ug" uniqKey="Maier U">UG Maier</name>
</author>
<author>
<name sortKey="De Wachter, R" uniqKey="De Wachter R">R De Wachter</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Rogers, Mb" uniqKey="Rogers M">MB Rogers</name>
</author>
<author>
<name sortKey="Gilson, Pr" uniqKey="Gilson P">PR Gilson</name>
</author>
<author>
<name sortKey="Su, V" uniqKey="Su V">V Su</name>
</author>
<author>
<name sortKey="Mcfadden, Gi" uniqKey="Mcfadden G">GI McFadden</name>
</author>
<author>
<name sortKey="Keeling, Pj" uniqKey="Keeling P">PJ Keeling</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Silver, Td" uniqKey="Silver T">TD Silver</name>
</author>
<author>
<name sortKey="Koike, S" uniqKey="Koike S">S Koike</name>
</author>
<author>
<name sortKey="Yabuki, A" uniqKey="Yabuki A">A Yabuki</name>
</author>
<author>
<name sortKey="Kofuji, R" uniqKey="Kofuji R">R Kofuji</name>
</author>
<author>
<name sortKey="Archibald, Jm" uniqKey="Archibald J">JM Archibald</name>
</author>
<author>
<name sortKey="Ishida, Ki" uniqKey="Ishida K">KI Ishida</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Tanifuji, G" uniqKey="Tanifuji G">G Tanifuji</name>
</author>
<author>
<name sortKey="Onodera, Nt" uniqKey="Onodera N">NT Onodera</name>
</author>
<author>
<name sortKey="Hara, Y" uniqKey="Hara Y">Y Hara</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Ishida, K" uniqKey="Ishida K">K Ishida</name>
</author>
<author>
<name sortKey="Endo, H" uniqKey="Endo H">H Endo</name>
</author>
<author>
<name sortKey="Koike, S" uniqKey="Koike S">S Koike</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Silver, Td" uniqKey="Silver T">TD Silver</name>
</author>
<author>
<name sortKey="Moore, Ce" uniqKey="Moore C">CE Moore</name>
</author>
<author>
<name sortKey="Archibald, Jm" uniqKey="Archibald J">JM Archibald</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Gilson, Pr" uniqKey="Gilson P">PR Gilson</name>
</author>
<author>
<name sortKey="Su, V" uniqKey="Su V">V Su</name>
</author>
<author>
<name sortKey="Slamovits, Ch" uniqKey="Slamovits C">CH Slamovits</name>
</author>
<author>
<name sortKey="Reith, Me" uniqKey="Reith M">ME Reith</name>
</author>
<author>
<name sortKey="Keeling, Pj" uniqKey="Keeling P">PJ Keeling</name>
</author>
<author>
<name sortKey="Mcfadden, Gi" uniqKey="Mcfadden G">GI McFadden</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Douglas, S" uniqKey="Douglas S">S Douglas</name>
</author>
<author>
<name sortKey="Zauner, S" uniqKey="Zauner S">S Zauner</name>
</author>
<author>
<name sortKey="Fraunholz, M" uniqKey="Fraunholz M">M Fraunholz</name>
</author>
<author>
<name sortKey="Beaton, M" uniqKey="Beaton M">M Beaton</name>
</author>
<author>
<name sortKey="Penny, S" uniqKey="Penny S">S Penny</name>
</author>
<author>
<name sortKey="Deng, Lt" uniqKey="Deng L">LT Deng</name>
</author>
<author>
<name sortKey="Wu, Xn" uniqKey="Wu X">XN Wu</name>
</author>
<author>
<name sortKey="Reith, M" uniqKey="Reith M">M Reith</name>
</author>
<author>
<name sortKey="Cavalier Smith, T" uniqKey="Cavalier Smith T">T Cavalier-Smith</name>
</author>
<author>
<name sortKey="Maier, Ug" uniqKey="Maier U">UG Maier</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Lane, Ce" uniqKey="Lane C">CE Lane</name>
</author>
<author>
<name sortKey="Van Den Heuvel, K" uniqKey="Van Den Heuvel K">K van den Heuvel</name>
</author>
<author>
<name sortKey="Kozera, C" uniqKey="Kozera C">C Kozera</name>
</author>
<author>
<name sortKey="Curtis, Ba" uniqKey="Curtis B">BA Curtis</name>
</author>
<author>
<name sortKey="Parsons, Bj" uniqKey="Parsons B">BJ Parsons</name>
</author>
<author>
<name sortKey="Bowman, S" uniqKey="Bowman S">S Bowman</name>
</author>
<author>
<name sortKey="Archibald, Jm" uniqKey="Archibald J">JM Archibald</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Tanifuji, G" uniqKey="Tanifuji G">G Tanifuji</name>
</author>
<author>
<name sortKey="Onodera, Nt" uniqKey="Onodera N">NT Onodera</name>
</author>
<author>
<name sortKey="Wheeler, Tj" uniqKey="Wheeler T">TJ Wheeler</name>
</author>
<author>
<name sortKey="Dlutek, M" uniqKey="Dlutek M">M Dlutek</name>
</author>
<author>
<name sortKey="Donaher, N" uniqKey="Donaher N">N Donaher</name>
</author>
<author>
<name sortKey="Archibald, Jm" uniqKey="Archibald J">JM Archibald</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Moore, Ce" uniqKey="Moore C">CE Moore</name>
</author>
<author>
<name sortKey="Curtis, Ba" uniqKey="Curtis B">BA Curtis</name>
</author>
<author>
<name sortKey="Mills, T" uniqKey="Mills T">T Mills</name>
</author>
<author>
<name sortKey="Tanifuji, G" uniqKey="Tanifuji G">G Tanifuji</name>
</author>
<author>
<name sortKey="Archibald, Jm" uniqKey="Archibald J">JM Archibald</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Katinka, Md" uniqKey="Katinka M">MD Katinka</name>
</author>
<author>
<name sortKey="Duprat, S" uniqKey="Duprat S">S Duprat</name>
</author>
<author>
<name sortKey="Cornillot, E" uniqKey="Cornillot E">E Cornillot</name>
</author>
<author>
<name sortKey="Metenier, G" uniqKey="Metenier G">G Metenier</name>
</author>
<author>
<name sortKey="Thomarat, F" uniqKey="Thomarat F">F Thomarat</name>
</author>
<author>
<name sortKey="Prensier, G" uniqKey="Prensier G">G Prensier</name>
</author>
<author>
<name sortKey="Barbe, V" uniqKey="Barbe V">V Barbe</name>
</author>
<author>
<name sortKey="Peyretaillade, E" uniqKey="Peyretaillade E">E Peyretaillade</name>
</author>
<author>
<name sortKey="Brottier, P" uniqKey="Brottier P">P Brottier</name>
</author>
<author>
<name sortKey="Wincker, P" uniqKey="Wincker P">P Wincker</name>
</author>
<author>
<name sortKey="Delbac, F" uniqKey="Delbac F">F Delbac</name>
</author>
<author>
<name sortKey="Alaoui, Hei" uniqKey="Alaoui H">HEI Alaoui</name>
</author>
<author>
<name sortKey="Peyret, P" uniqKey="Peyret P">P Peyret</name>
</author>
<author>
<name sortKey="Saurin, W" uniqKey="Saurin W">W Saurin</name>
</author>
<author>
<name sortKey="Gouy, M" uniqKey="Gouy M">M Gouy</name>
</author>
<author>
<name sortKey="Weissenbach, J" uniqKey="Weissenbach J">J Weissenbach</name>
</author>
<author>
<name sortKey="Vivares, Cp" uniqKey="Vivares C">CP Vivares</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Sasaki, Y" uniqKey="Sasaki Y">Y Sasaki</name>
</author>
<author>
<name sortKey="Ishikawa, J" uniqKey="Ishikawa J">J Ishikawa</name>
</author>
<author>
<name sortKey="Yamashita, A" uniqKey="Yamashita A">A Yamashita</name>
</author>
<author>
<name sortKey="Oshima, K" uniqKey="Oshima K">K Oshima</name>
</author>
<author>
<name sortKey="Kenri, T" uniqKey="Kenri T">T Kenri</name>
</author>
<author>
<name sortKey="Furuya, K" uniqKey="Furuya K">K Furuya</name>
</author>
<author>
<name sortKey="Yoshino, C" uniqKey="Yoshino C">C Yoshino</name>
</author>
<author>
<name sortKey="Horino, A" uniqKey="Horino A">A Horino</name>
</author>
<author>
<name sortKey="Shiba, T" uniqKey="Shiba T">T Shiba</name>
</author>
<author>
<name sortKey="Sasaki, T" uniqKey="Sasaki T">T Sasaki</name>
</author>
<author>
<name sortKey="Hattori, M" uniqKey="Hattori M">M Hattori</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Corradi, N" uniqKey="Corradi N">N Corradi</name>
</author>
<author>
<name sortKey="Pombert, Jf" uniqKey="Pombert J">JF Pombert</name>
</author>
<author>
<name sortKey="Farinelli, L" uniqKey="Farinelli L">L Farinelli</name>
</author>
<author>
<name sortKey="Didier, Es" uniqKey="Didier E">ES Didier</name>
</author>
<author>
<name sortKey="Keeling, Pj" uniqKey="Keeling P">PJ Keeling</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Mccutcheon, Jp" uniqKey="Mccutcheon J">JP McCutcheon</name>
</author>
<author>
<name sortKey="Mcdonald, Br" uniqKey="Mcdonald B">BR McDonald</name>
</author>
<author>
<name sortKey="Moran, Na" uniqKey="Moran N">NA Moran</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Smith, Dr" uniqKey="Smith D">DR Smith</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Slamovits, Ch" uniqKey="Slamovits C">CH Slamovits</name>
</author>
<author>
<name sortKey="Keeling, Pj" uniqKey="Keeling P">PJ Keeling</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Merchant, Ss" uniqKey="Merchant S">SS Merchant</name>
</author>
<author>
<name sortKey="Prochnik, Se" uniqKey="Prochnik S">SE Prochnik</name>
</author>
<author>
<name sortKey="Vallon, O" uniqKey="Vallon O">O Vallon</name>
</author>
<author>
<name sortKey="Harris, Eh" uniqKey="Harris E">EH Harris</name>
</author>
<author>
<name sortKey="Karpowicz, Sj" uniqKey="Karpowicz S">SJ Karpowicz</name>
</author>
<author>
<name sortKey="Witman, Gb" uniqKey="Witman G">GB Witman</name>
</author>
<author>
<name sortKey="Terry, A" uniqKey="Terry A">A Terry</name>
</author>
<author>
<name sortKey="Salamov, A" uniqKey="Salamov A">A Salamov</name>
</author>
<author>
<name sortKey="Fritz Laylin, Lk" uniqKey="Fritz Laylin L">LK Fritz-Laylin</name>
</author>
<author>
<name sortKey="Marechal Drouard, L" uniqKey="Marechal Drouard L">L Marechal-Drouard</name>
</author>
<author>
<name sortKey="Marshall, Wf" uniqKey="Marshall W">WF Marshall</name>
</author>
<author>
<name sortKey="Qu, Lh" uniqKey="Qu L">LH Qu</name>
</author>
<author>
<name sortKey="Nelson, Dr" uniqKey="Nelson D">DR Nelson</name>
</author>
<author>
<name sortKey="Sanderfoot, Aa" uniqKey="Sanderfoot A">AA Sanderfoot</name>
</author>
<author>
<name sortKey="Spalding, Mh" uniqKey="Spalding M">MH Spalding</name>
</author>
<author>
<name sortKey="Kapitonov, Vv" uniqKey="Kapitonov V">VV Kapitonov</name>
</author>
<author>
<name sortKey="Ren, Qh" uniqKey="Ren Q">QH Ren</name>
</author>
<author>
<name sortKey="Ferris, P" uniqKey="Ferris P">P Ferris</name>
</author>
<author>
<name sortKey="Lindquist, E" uniqKey="Lindquist E">E Lindquist</name>
</author>
<author>
<name sortKey="Shapiro, H" uniqKey="Shapiro H">H Shapiro</name>
</author>
<author>
<name sortKey="Lucas, Sm" uniqKey="Lucas S">SM Lucas</name>
</author>
<author>
<name sortKey="Grimwood, J" uniqKey="Grimwood J">J Grimwood</name>
</author>
<author>
<name sortKey="Schmutz, J" uniqKey="Schmutz J">J Schmutz</name>
</author>
<author>
<name sortKey="Cardol, P" uniqKey="Cardol P">P Cardol</name>
</author>
<author>
<name sortKey="Cerutti, H" uniqKey="Cerutti H">H Cerutti</name>
</author>
<author>
<name sortKey="Chanfreau, G" uniqKey="Chanfreau G">G Chanfreau</name>
</author>
<author>
<name sortKey="Chen, Cl" uniqKey="Chen C">CL Chen</name>
</author>
<author>
<name sortKey="Cognat, V" uniqKey="Cognat V">V Cognat</name>
</author>
<author>
<name sortKey="Croft, Mt" uniqKey="Croft M">MT Croft</name>
</author>
<author>
<name sortKey="Dent, R" uniqKey="Dent R">R Dent</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Palenik, B" uniqKey="Palenik B">B Palenik</name>
</author>
<author>
<name sortKey="Grimwood, J" uniqKey="Grimwood J">J Grimwood</name>
</author>
<author>
<name sortKey="Aerts, A" uniqKey="Aerts A">A Aerts</name>
</author>
<author>
<name sortKey="Rouze, P" uniqKey="Rouze P">P Rouze</name>
</author>
<author>
<name sortKey="Salamov, A" uniqKey="Salamov A">A Salamov</name>
</author>
<author>
<name sortKey="Putnam, N" uniqKey="Putnam N">N Putnam</name>
</author>
<author>
<name sortKey="Dupont, C" uniqKey="Dupont C">C Dupont</name>
</author>
<author>
<name sortKey="Jorgensen, R" uniqKey="Jorgensen R">R Jorgensen</name>
</author>
<author>
<name sortKey="Derelle, E" uniqKey="Derelle E">E Derelle</name>
</author>
<author>
<name sortKey="Rombauts, S" uniqKey="Rombauts S">S Rombauts</name>
</author>
<author>
<name sortKey="Zhou, Km" uniqKey="Zhou K">KM Zhou</name>
</author>
<author>
<name sortKey="Otillar, R" uniqKey="Otillar R">R Otillar</name>
</author>
<author>
<name sortKey="Merchant, Ss" uniqKey="Merchant S">SS Merchant</name>
</author>
<author>
<name sortKey="Podell, S" uniqKey="Podell S">S Podell</name>
</author>
<author>
<name sortKey="Gaasterland, T" uniqKey="Gaasterland T">T Gaasterland</name>
</author>
<author>
<name sortKey="Napoli, C" uniqKey="Napoli C">C Napoli</name>
</author>
<author>
<name sortKey="Gendler, K" uniqKey="Gendler K">K Gendler</name>
</author>
<author>
<name sortKey="Manuell, A" uniqKey="Manuell A">A Manuell</name>
</author>
<author>
<name sortKey="Tai, V" uniqKey="Tai V">V Tai</name>
</author>
<author>
<name sortKey="Vallon, O" uniqKey="Vallon O">O Vallon</name>
</author>
<author>
<name sortKey="Piganeau, G" uniqKey="Piganeau G">G Piganeau</name>
</author>
<author>
<name sortKey="Jancek, S" uniqKey="Jancek S">S Jancek</name>
</author>
<author>
<name sortKey="Heijde, M" uniqKey="Heijde M">M Heijde</name>
</author>
<author>
<name sortKey="Jabbari, K" uniqKey="Jabbari K">K Jabbari</name>
</author>
<author>
<name sortKey="Bowler, C" uniqKey="Bowler C">C Bowler</name>
</author>
<author>
<name sortKey="Lohr, M" uniqKey="Lohr M">M Lohr</name>
</author>
<author>
<name sortKey="Robbens, S" uniqKey="Robbens S">S Robbens</name>
</author>
<author>
<name sortKey="Werner, G" uniqKey="Werner G">G Werner</name>
</author>
<author>
<name sortKey="Dubchak, I" uniqKey="Dubchak I">I Dubchak</name>
</author>
<author>
<name sortKey="Pazour, Gj" uniqKey="Pazour G">GJ Pazour</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Roy, Sw" uniqKey="Roy S">SW Roy</name>
</author>
<author>
<name sortKey="Penny, D" uniqKey="Penny D">D Penny</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Douglas, Se" uniqKey="Douglas S">SE Douglas</name>
</author>
<author>
<name sortKey="Penny, Sl" uniqKey="Penny S">SL Penny</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Khan, H" uniqKey="Khan H">H Khan</name>
</author>
<author>
<name sortKey="Parks, N" uniqKey="Parks N">N Parks</name>
</author>
<author>
<name sortKey="Kozera, C" uniqKey="Kozera C">C Kozera</name>
</author>
<author>
<name sortKey="Curtis, Ba" uniqKey="Curtis B">BA Curtis</name>
</author>
<author>
<name sortKey="Parsons, Bj" uniqKey="Parsons B">BJ Parsons</name>
</author>
<author>
<name sortKey="Bowman, S" uniqKey="Bowman S">S Bowman</name>
</author>
<author>
<name sortKey="Archibald, Jm" uniqKey="Archibald J">JM Archibald</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Donaher, N" uniqKey="Donaher N">N Donaher</name>
</author>
<author>
<name sortKey="Tanifuji, G" uniqKey="Tanifuji G">G Tanifuji</name>
</author>
<author>
<name sortKey="Onodera, Nt" uniqKey="Onodera N">NT Onodera</name>
</author>
<author>
<name sortKey="Malfatti, Sa" uniqKey="Malfatti S">SA Malfatti</name>
</author>
<author>
<name sortKey="Chain, Psg" uniqKey="Chain P">PSG Chain</name>
</author>
<author>
<name sortKey="Hara, Y" uniqKey="Hara Y">Y Hara</name>
</author>
<author>
<name sortKey="Archibald, Jm" uniqKey="Archibald J">JM Archibald</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Emanuelsson, O" uniqKey="Emanuelsson O">O Emanuelsson</name>
</author>
<author>
<name sortKey="Nielsen, H" uniqKey="Nielsen H">H Nielsen</name>
</author>
<author>
<name sortKey="Von Heijne, G" uniqKey="Von Heijne G">G von Heijne</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Emanuelsson, O" uniqKey="Emanuelsson O">O Emanuelsson</name>
</author>
<author>
<name sortKey="Nielsen, H" uniqKey="Nielsen H">H Nielsen</name>
</author>
<author>
<name sortKey="Brunak, S" uniqKey="Brunak S">S Brunak</name>
</author>
<author>
<name sortKey="Von Heijne, G" uniqKey="Von Heijne G">G von Heijne</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Lane, Ce" uniqKey="Lane C">CE Lane</name>
</author>
<author>
<name sortKey="Khan, H" uniqKey="Khan H">H Khan</name>
</author>
<author>
<name sortKey="Mackinnon, M" uniqKey="Mackinnon M">M MacKinnon</name>
</author>
<author>
<name sortKey="Fong, A" uniqKey="Fong A">A Fong</name>
</author>
<author>
<name sortKey="Theophilou, S" uniqKey="Theophilou S">S Theophilou</name>
</author>
<author>
<name sortKey="Archibald, Jm" uniqKey="Archibald J">JM Archibald</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Tanifuji, G" uniqKey="Tanifuji G">G Tanifuji</name>
</author>
<author>
<name sortKey="Onodera, Nt" uniqKey="Onodera N">NT Onodera</name>
</author>
<author>
<name sortKey="Moore, Ce" uniqKey="Moore C">CE Moore</name>
</author>
<author>
<name sortKey="Archibald, Jm" uniqKey="Archibald J">JM Archibald</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Galtier, N" uniqKey="Galtier N">N Galtier</name>
</author>
<author>
<name sortKey="Piganeau, G" uniqKey="Piganeau G">G Piganeau</name>
</author>
<author>
<name sortKey="Mouchiroud, D" uniqKey="Mouchiroud D">D Mouchiroud</name>
</author>
<author>
<name sortKey="Duret, L" uniqKey="Duret L">L Duret</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Escobar, Js" uniqKey="Escobar J">JS Escobar</name>
</author>
<author>
<name sortKey="Glemin, S" uniqKey="Glemin S">S Glemin</name>
</author>
<author>
<name sortKey="Galtier, N" uniqKey="Galtier N">N Galtier</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Curtis, Ba" uniqKey="Curtis B">BA Curtis</name>
</author>
<author>
<name sortKey="Tanifuji, G" uniqKey="Tanifuji G">G Tanifuji</name>
</author>
<author>
<name sortKey="Burki, F" uniqKey="Burki F">F Burki</name>
</author>
<author>
<name sortKey="Gruber, A" uniqKey="Gruber A">A Gruber</name>
</author>
<author>
<name sortKey="Irimia, M" uniqKey="Irimia M">M Irimia</name>
</author>
<author>
<name sortKey="Maruyama, S" uniqKey="Maruyama S">S Maruyama</name>
</author>
<author>
<name sortKey="Arias, Mc" uniqKey="Arias M">MC Arias</name>
</author>
<author>
<name sortKey="Ball, Sg" uniqKey="Ball S">SG Ball</name>
</author>
<author>
<name sortKey="Gile, Gh" uniqKey="Gile G">GH Gile</name>
</author>
<author>
<name sortKey="Hirakawa, Y" uniqKey="Hirakawa Y">Y Hirakawa</name>
</author>
<author>
<name sortKey="Hopkins, Jf" uniqKey="Hopkins J">JF Hopkins</name>
</author>
<author>
<name sortKey="Kuo, A" uniqKey="Kuo A">A Kuo</name>
</author>
<author>
<name sortKey="Rensing, Sa" uniqKey="Rensing S">SA Rensing</name>
</author>
<author>
<name sortKey="Schmutz, J" uniqKey="Schmutz J">J Schmutz</name>
</author>
<author>
<name sortKey="Symeonidi, A" uniqKey="Symeonidi A">A Symeonidi</name>
</author>
<author>
<name sortKey="Elias, M" uniqKey="Elias M">M Elias</name>
</author>
<author>
<name sortKey="Eveleigh, Rjm" uniqKey="Eveleigh R">RJM Eveleigh</name>
</author>
<author>
<name sortKey="Herman, Ek" uniqKey="Herman E">EK Herman</name>
</author>
<author>
<name sortKey="Klute, Mj" uniqKey="Klute M">MJ Klute</name>
</author>
<author>
<name sortKey="Nakayama, T" uniqKey="Nakayama T">T Nakayama</name>
</author>
<author>
<name sortKey="Obornik, M" uniqKey="Obornik M">M Oborník</name>
</author>
<author>
<name sortKey="Reyes Prieto, A" uniqKey="Reyes Prieto A">A Reyes-Prieto</name>
</author>
<author>
<name sortKey="Armbrust, Ev" uniqKey="Armbrust E">EV Armbrust</name>
</author>
<author>
<name sortKey="Aves, St" uniqKey="Aves S">ST Aves</name>
</author>
<author>
<name sortKey="Beiko, Rg" uniqKey="Beiko R">RG Beiko</name>
</author>
<author>
<name sortKey="Coutinho, P" uniqKey="Coutinho P">P Coutinho</name>
</author>
<author>
<name sortKey="Dacks, Jb" uniqKey="Dacks J">JB Dacks</name>
</author>
<author>
<name sortKey="Durnford, Dg" uniqKey="Durnford D">DG Durnford</name>
</author>
<author>
<name sortKey="Fast, Nm" uniqKey="Fast N">NM Fast</name>
</author>
<author>
<name sortKey="Green, Br" uniqKey="Green B">BR Green</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Turmel, M" uniqKey="Turmel M">M Turmel</name>
</author>
<author>
<name sortKey="Gagnon, Mc" uniqKey="Gagnon M">MC Gagnon</name>
</author>
<author>
<name sortKey="O Kelly, Cj" uniqKey="O Kelly C">CJ O'Kelly</name>
</author>
<author>
<name sortKey="Otis, C" uniqKey="Otis C">C Otis</name>
</author>
<author>
<name sortKey="Lemieux, C" uniqKey="Lemieux C">C Lemieux</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Leliaert, F" uniqKey="Leliaert F">F Leliaert</name>
</author>
<author>
<name sortKey="Smith, Dr" uniqKey="Smith D">DR Smith</name>
</author>
<author>
<name sortKey="Moreau, H" uniqKey="Moreau H">H Moreau</name>
</author>
<author>
<name sortKey="Herron, Md" uniqKey="Herron M">MD Herron</name>
</author>
<author>
<name sortKey="Verbruggen, H" uniqKey="Verbruggen H">H Verbruggen</name>
</author>
<author>
<name sortKey="Delwiche, Cf" uniqKey="Delwiche C">CF Delwiche</name>
</author>
<author>
<name sortKey="De Clerck, O" uniqKey="De Clerck O">O De Clerck</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Marin, B" uniqKey="Marin B">B Marin</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Marin, B" uniqKey="Marin B">B Marin</name>
</author>
<author>
<name sortKey="Melkonian, M" uniqKey="Melkonian M">M Melkonian</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Stamatakis, A" uniqKey="Stamatakis A">A Stamatakis</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Lartillot, N" uniqKey="Lartillot N">N Lartillot</name>
</author>
<author>
<name sortKey="Rodrigue, N" uniqKey="Rodrigue N">N Rodrigue</name>
</author>
<author>
<name sortKey="Stubbs, D" uniqKey="Stubbs D">D Stubbs</name>
</author>
<author>
<name sortKey="Richer, J" uniqKey="Richer J">J Richer</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Sun, Sl" uniqKey="Sun S">SL Sun</name>
</author>
<author>
<name sortKey="Chen, J" uniqKey="Chen J">J Chen</name>
</author>
<author>
<name sortKey="Li, Wz" uniqKey="Li W">WZ Li</name>
</author>
<author>
<name sortKey="Altintas, I" uniqKey="Altintas I">I Altintas</name>
</author>
<author>
<name sortKey="Lin, A" uniqKey="Lin A">A Lin</name>
</author>
<author>
<name sortKey="Peltier, S" uniqKey="Peltier S">S Peltier</name>
</author>
<author>
<name sortKey="Stocks, K" uniqKey="Stocks K">K Stocks</name>
</author>
<author>
<name sortKey="Allen, Ee" uniqKey="Allen E">EE Allen</name>
</author>
<author>
<name sortKey="Ellisman, M" uniqKey="Ellisman M">M Ellisman</name>
</author>
<author>
<name sortKey="Grethe, J" uniqKey="Grethe J">J Grethe</name>
</author>
<author>
<name sortKey="Wooley, J" uniqKey="Wooley J">J Wooley</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Liu, Y" uniqKey="Liu Y">Y Liu</name>
</author>
<author>
<name sortKey="Schmidt, B" uniqKey="Schmidt B">B Schmidt</name>
</author>
<author>
<name sortKey="Maskell, Dl" uniqKey="Maskell D">DL Maskell</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Li, H" uniqKey="Li H">H Li</name>
</author>
<author>
<name sortKey="Durbin, R" uniqKey="Durbin R">R Durbin</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Li, H" uniqKey="Li H">H Li</name>
</author>
<author>
<name sortKey="Handsaker, B" uniqKey="Handsaker B">B Handsaker</name>
</author>
<author>
<name sortKey="Wysoker, A" uniqKey="Wysoker A">A Wysoker</name>
</author>
<author>
<name sortKey="Fennell, T" uniqKey="Fennell T">T Fennell</name>
</author>
<author>
<name sortKey="Ruan, J" uniqKey="Ruan J">J Ruan</name>
</author>
<author>
<name sortKey="Homer, N" uniqKey="Homer N">N Homer</name>
</author>
<author>
<name sortKey="Marth, G" uniqKey="Marth G">G Marth</name>
</author>
<author>
<name sortKey="Abecasis, G" uniqKey="Abecasis G">G Abecasis</name>
</author>
<author>
<name sortKey="Durbin, R" uniqKey="Durbin R">R Durbin</name>
</author>
<author>
<name sortKey="Proc, Gpd" uniqKey="Proc G">1000 Genome Project Data Processing Subgroup</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Robinson, Jt" uniqKey="Robinson J">JT Robinson</name>
</author>
<author>
<name sortKey="Thorvaldsdottir, H" uniqKey="Thorvaldsdottir H">H Thorvaldsdottir</name>
</author>
<author>
<name sortKey="Winckler, W" uniqKey="Winckler W">W Winckler</name>
</author>
<author>
<name sortKey="Guttman, M" uniqKey="Guttman M">M Guttman</name>
</author>
<author>
<name sortKey="Lander, Es" uniqKey="Lander E">ES Lander</name>
</author>
<author>
<name sortKey="Getz, G" uniqKey="Getz G">G Getz</name>
</author>
<author>
<name sortKey="Mesirov, Jp" uniqKey="Mesirov J">JP Mesirov</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Larkin, Ma" uniqKey="Larkin M">MA Larkin</name>
</author>
<author>
<name sortKey="Blackshields, G" uniqKey="Blackshields G">G Blackshields</name>
</author>
<author>
<name sortKey="Brown, Np" uniqKey="Brown N">NP Brown</name>
</author>
<author>
<name sortKey="Chenna, R" uniqKey="Chenna R">R Chenna</name>
</author>
<author>
<name sortKey="Mcgettigan, Pa" uniqKey="Mcgettigan P">PA McGettigan</name>
</author>
<author>
<name sortKey="Mcwilliam, H" uniqKey="Mcwilliam H">H McWilliam</name>
</author>
<author>
<name sortKey="Valentin, F" uniqKey="Valentin F">F Valentin</name>
</author>
<author>
<name sortKey="Wallace, Im" uniqKey="Wallace I">IM Wallace</name>
</author>
<author>
<name sortKey="Wilm, A" uniqKey="Wilm A">A Wilm</name>
</author>
<author>
<name sortKey="Lopez, R" uniqKey="Lopez R">R Lopez</name>
</author>
<author>
<name sortKey="Thompson, Jd" uniqKey="Thompson J">JD Thompson</name>
</author>
<author>
<name sortKey="Gibson, Tj" uniqKey="Gibson T">TJ Gibson</name>
</author>
<author>
<name sortKey="Higgins, Dg" uniqKey="Higgins D">DG Higgins</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Crooks, Ge" uniqKey="Crooks G">GE Crooks</name>
</author>
<author>
<name sortKey="Hon, G" uniqKey="Hon G">G Hon</name>
</author>
<author>
<name sortKey="Chandonia, Jm" uniqKey="Chandonia J">JM Chandonia</name>
</author>
<author>
<name sortKey="Brenner, Se" uniqKey="Brenner S">SE Brenner</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Rutherford, K" uniqKey="Rutherford K">K Rutherford</name>
</author>
<author>
<name sortKey="Parkhill, J" uniqKey="Parkhill J">J Parkhill</name>
</author>
<author>
<name sortKey="Crook, J" uniqKey="Crook J">J Crook</name>
</author>
<author>
<name sortKey="Horsnell, T" uniqKey="Horsnell T">T Horsnell</name>
</author>
<author>
<name sortKey="Rice, P" uniqKey="Rice P">P Rice</name>
</author>
<author>
<name sortKey="Rajandream, Ma" uniqKey="Rajandream M">MA Rajandream</name>
</author>
<author>
<name sortKey="Barrell, B" uniqKey="Barrell B">B Barrell</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Schattner, P" uniqKey="Schattner P">P Schattner</name>
</author>
<author>
<name sortKey="Brooks, An" uniqKey="Brooks A">AN Brooks</name>
</author>
<author>
<name sortKey="Lowe, Tm" uniqKey="Lowe T">TM Lowe</name>
</author>
</analytic>
</biblStruct>
<biblStruct></biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Renner, T" uniqKey="Renner T">T Renner</name>
</author>
<author>
<name sortKey="Waters, Er" uniqKey="Waters E">ER Waters</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Katoh, K" uniqKey="Katoh K">K Katoh</name>
</author>
<author>
<name sortKey="Standley, Dm" uniqKey="Standley D">DM Standley</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Capella Gutierrez, S" uniqKey="Capella Gutierrez S">S Capella-Gutierrez</name>
</author>
<author>
<name sortKey="Silla Martinez, Jm" uniqKey="Silla Martinez J">JM Silla-Martinez</name>
</author>
<author>
<name sortKey="Gabaldon, T" uniqKey="Gabaldon T">T Gabaldon</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Criscuolo, A" uniqKey="Criscuolo A">A Criscuolo</name>
</author>
<author>
<name sortKey="Gribaldo, S" uniqKey="Gribaldo S">S Gribaldo</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Brown, Mw" uniqKey="Brown M">MW Brown</name>
</author>
<author>
<name sortKey="Sharpe, Sc" uniqKey="Sharpe S">SC Sharpe</name>
</author>
<author>
<name sortKey="Silberman, Jd" uniqKey="Silberman J">JD Silberman</name>
</author>
<author>
<name sortKey="Heiss, Aa" uniqKey="Heiss A">AA Heiss</name>
</author>
<author>
<name sortKey="Lang, Bf" uniqKey="Lang B">BF Lang</name>
</author>
<author>
<name sortKey="Simpson, Agb" uniqKey="Simpson A">AGB Simpson</name>
</author>
<author>
<name sortKey="Roger, Aj" uniqKey="Roger A">AJ Roger</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Burki, F" uniqKey="Burki F">F Burki</name>
</author>
<author>
<name sortKey="Shalchian Tabrizi, K" uniqKey="Shalchian Tabrizi K">K Shalchian-Tabrizi</name>
</author>
<author>
<name sortKey="Minge, M" uniqKey="Minge M">M Minge</name>
</author>
<author>
<name sortKey="Skjaeveland, A" uniqKey="Skjaeveland A">A Skjaeveland</name>
</author>
<author>
<name sortKey="Nikolaev, Si" uniqKey="Nikolaev S">SI Nikolaev</name>
</author>
<author>
<name sortKey="Jakobsen, Ks" uniqKey="Jakobsen K">KS Jakobsen</name>
</author>
<author>
<name sortKey="Pawlowski, J" uniqKey="Pawlowski J">J Pawlowski</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Burki, F" uniqKey="Burki F">F Burki</name>
</author>
<author>
<name sortKey="Okamoto, N" uniqKey="Okamoto N">N Okamoto</name>
</author>
<author>
<name sortKey="Pombert, Jf" uniqKey="Pombert J">JF Pombert</name>
</author>
<author>
<name sortKey="Keeling, Pj" uniqKey="Keeling P">PJ Keeling</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Hampl, V" uniqKey="Hampl V">V Hampl</name>
</author>
<author>
<name sortKey="Hug, L" uniqKey="Hug L">L Hug</name>
</author>
<author>
<name sortKey="Leigh, Jw" uniqKey="Leigh J">JW Leigh</name>
</author>
<author>
<name sortKey="Dacks, Jb" uniqKey="Dacks J">JB Dacks</name>
</author>
<author>
<name sortKey="Lang, Bf" uniqKey="Lang B">BF Lang</name>
</author>
<author>
<name sortKey="Simpson, Agb" uniqKey="Simpson A">AGB Simpson</name>
</author>
<author>
<name sortKey="Roger, Aj" uniqKey="Roger A">AJ Roger</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Brown, Mw" uniqKey="Brown M">MW Brown</name>
</author>
<author>
<name sortKey="Kolisko, M" uniqKey="Kolisko M">M Kolisko</name>
</author>
<author>
<name sortKey="Silberman, Jd" uniqKey="Silberman J">JD Silberman</name>
</author>
<author>
<name sortKey="Roger, Aj" uniqKey="Roger A">AJ Roger</name>
</author>
</analytic>
</biblStruct>
</listBibl>
</div1>
</back>
</TEI>
<pmc article-type="research-article">
<pmc-dir>properties open_access</pmc-dir>
<front>
<journal-meta>
<journal-id journal-id-type="nlm-ta">BMC Genomics</journal-id>
<journal-id journal-id-type="iso-abbrev">BMC Genomics</journal-id>
<journal-title-group>
<journal-title>BMC Genomics</journal-title>
</journal-title-group>
<issn pub-type="epub">1471-2164</issn>
<publisher>
<publisher-name>BioMed Central</publisher-name>
<publisher-loc>London</publisher-loc>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="pmid">24885563</article-id>
<article-id pub-id-type="pmc">4035089</article-id>
<article-id pub-id-type="publisher-id">6068</article-id>
<article-id pub-id-type="doi">10.1186/1471-2164-15-374</article-id>
<article-categories>
<subj-group subj-group-type="heading">
<subject>Research Article</subject>
</subj-group>
</article-categories>
<title-group>
<article-title>Nucleomorph and plastid genome sequences of the chlorarachniophyte
<italic>Lotharella oceanica</italic>
: convergent reductive evolution and frequent recombination in nucleomorph-bearing algae</article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author" corresp="yes">
<name>
<surname>Tanifuji</surname>
<given-names>Goro</given-names>
</name>
<address>
<email>tanifuji.goro.gn@u.tsukuba.ac.jp</email>
</address>
<xref ref-type="aff" rid="Aff1"></xref>
<xref ref-type="aff" rid="Aff2"></xref>
</contrib>
<contrib contrib-type="author" corresp="yes">
<name>
<surname>Onodera</surname>
<given-names>Naoko T</given-names>
</name>
<address>
<email>naokot@dal.ca</email>
</address>
<xref ref-type="aff" rid="Aff1"></xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Brown</surname>
<given-names>Matthew W</given-names>
</name>
<address>
<email>matthew.brown@msstate.edu</email>
</address>
<xref ref-type="aff" rid="Aff3"></xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Curtis</surname>
<given-names>Bruce A</given-names>
</name>
<address>
<email>bruce.curtis@dal.ca</email>
</address>
<xref ref-type="aff" rid="Aff1"></xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Roger</surname>
<given-names>Andrew J</given-names>
</name>
<address>
<email>andrew.roger@dal.ca</email>
</address>
<xref ref-type="aff" rid="Aff1"></xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Wong</surname>
<given-names>Gane Ka-Shu</given-names>
</name>
<address>
<email>gane@ualberta.ca</email>
</address>
<xref ref-type="aff" rid="Aff4"></xref>
<xref ref-type="aff" rid="Aff5"></xref>
<xref ref-type="aff" rid="Aff6"></xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Melkonian</surname>
<given-names>Michael</given-names>
</name>
<address>
<email>michael.melkonian@uni-koeln.de</email>
</address>
<xref ref-type="aff" rid="Aff7"></xref>
</contrib>
<contrib contrib-type="author" corresp="yes">
<name>
<surname>Archibald</surname>
<given-names>John M</given-names>
</name>
<address>
<email>john.archibald@dal.ca</email>
</address>
<xref ref-type="aff" rid="Aff1"></xref>
</contrib>
<aff id="Aff1">
<label></label>
Department of Biochemistry and Molecular Biology, Canadian Institute for Advanced Research, Integrated Microbial Biodiversity Program, Dalhousie University, Halifax, Nova Scotia B3H 4R2 Canada</aff>
<aff id="Aff2">
<label></label>
Faculty of Life and Environmental Sciences, University of Tsukuba, Tsukuba, Ibaraki 305-8577 Japan</aff>
<aff id="Aff3">
<label></label>
Department of Biological Sciences, Mississippi State University, Mississippi State, Mississippi 39762 USA</aff>
<aff id="Aff4">
<label></label>
Department of Biological Sciences, University of Alberta, Edmonton, AB T6G 2E9 Canada</aff>
<aff id="Aff5">
<label></label>
Department of Medicine, University of Alberta, Edmonton, AB T6G 2E1 Canada</aff>
<aff id="Aff6">
<label></label>
BGI-Shenzhen, Beishan Industrial Zone, Yantian District, Shenzhen 518083 China</aff>
<aff id="Aff7">
<label></label>
Department of Botany, Cologne Biocenter, University of Cologne, Cologne, 50674 Germany</aff>
</contrib-group>
<pub-date pub-type="epub">
<day>15</day>
<month>5</month>
<year>2014</year>
</pub-date>
<pub-date pub-type="pmc-release">
<day>15</day>
<month>5</month>
<year>2014</year>
</pub-date>
<pub-date pub-type="collection">
<year>2014</year>
</pub-date>
<volume>15</volume>
<issue>1</issue>
<elocation-id>374</elocation-id>
<history>
<date date-type="received">
<day>10</day>
<month>1</month>
<year>2014</year>
</date>
<date date-type="accepted">
<day>9</day>
<month>5</month>
<year>2014</year>
</date>
</history>
<permissions>
<copyright-statement>© Tanifuji et al.; licensee BioMed Central Ltd. 2014</copyright-statement>
<license license-type="open-access">
<license-p>This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (
<ext-link ext-link-type="uri" xlink:href="http://creativecommons.org/licenses/by/2.0">http://creativecommons.org/licenses/by/2.0</ext-link>
), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain Dedication waiver (
<ext-link ext-link-type="uri" xlink:href="http://creativecommons.org/publicdomain/zero/1.0/">http://creativecommons.org/publicdomain/zero/1.0/</ext-link>
) applies to the data made available in this article, unless otherwise stated.</license-p>
</license>
</permissions>
<abstract id="Abs1">
<sec>
<title>Background</title>
<p>Nucleomorphs are residual nuclei derived from eukaryotic endosymbionts in chlorarachniophyte and cryptophyte algae. The endosymbionts that gave rise to nucleomorphs and plastids in these two algal groups were green and red algae, respectively. Despite their independent origin, the chlorarachniophyte and cryptophyte nucleomorph genomes share similar genomic features such as extreme size reduction and a three-chromosome architecture. This suggests that similar reductive evolutionary forces have acted to shape the nucleomorph genomes in the two groups. Thus far, however, only a single chlorarachniophyte nucleomorph and plastid genome has been sequenced, making broad evolutionary inferences within the chlorarachniophytes and between chlorarachniophytes and cryptophytes difficult. We have sequenced the nucleomorph and plastid genomes of the chlorarachniophyte
<italic>Lotharella oceanica</italic>
in order to gain insight into nucleomorph and plastid genome diversity and evolution.</p>
</sec>
<sec>
<title>Results</title>
<p>The
<italic>L. oceanica</italic>
nucleomorph genome was found to consist of three linear chromosomes totaling ~610 kilobase pairs (kbp), much larger than the 373 kbp nucleomorph genome of the model chlorarachniophyte
<italic>Bigelowiella natans</italic>
. The
<italic>L. oceanica</italic>
plastid genome is 71 kbp in size, similar to that of
<italic>B. natans</italic>
. Unexpectedly long (~35 kbp) sub-telomeric repeat regions were identified in the
<italic>L. oceanica</italic>
nucleomorph genome; internal multi-copy regions were also detected. Gene content analyses revealed that nucleomorph house-keeping genes and spliceosomal intron positions are well conserved between the
<italic>L. oceanica</italic>
and
<italic>B. natans</italic>
nucleomorph genomes. More broadly, gene retention patterns were found to be similar between nucleomorph genomes in chlorarachniophytes and cryptophytes. Chlorarachniophyte plastid genomes showed near identical protein coding gene complements as well as a high level of synteny.</p>
</sec>
<sec>
<title>Conclusions</title>
<p>We have provided insight into the process of nucleomorph genome evolution by elucidating the fine-scale dynamics of sub-telomeric repeat regions. Homologous recombination at the chromosome ends appears to be frequent, serving to expand and contract nucleomorph genome size. The main factor influencing nucleomorph genome size variation between different chlorarachniophyte species appears to be expansion-contraction of these telomere-associated repeats rather than changes in the number of unique protein coding genes. The dynamic nature of chlorarachniophyte nucleomorph genomes lies in stark contrast to their plastid genomes, which appear to be highly stable in terms of gene content and synteny.</p>
</sec>
<sec>
<title>Electronic supplementary material</title>
<p>The online version of this article (doi:10.1186/1471-2164-15-374) contains supplementary material, which is available to authorized users.</p>
</sec>
</abstract>
<kwd-group xml:lang="en">
<title>Keywords</title>
<kwd>Nucleomorph</kwd>
<kwd>Genome reduction</kwd>
<kwd>Chlorarachniophytes</kwd>
<kwd>Cryptophytes</kwd>
<kwd>Endosymbiosis</kwd>
<kwd>Phylogenomics</kwd>
</kwd-group>
<custom-meta-group>
<custom-meta>
<meta-name>issue-copyright-statement</meta-name>
<meta-value>© The Author(s) 2014</meta-value>
</custom-meta>
</custom-meta-group>
</article-meta>
</front>
<body>
<sec id="Sec1">
<title>Background</title>
<p>Endosymbiosis has been a driving force in the evolution of eukaryotic cells. All known eukaryotes possess mitochondria (or mitochondrion-derived organelles), which evolved from an ancestor of modern-day alpha-proteobacteria [
<xref ref-type="bibr" rid="CR1">1</xref>
,
<xref ref-type="bibr" rid="CR2">2</xref>
]. Plastids, the light-gathering organelles of plants and algae, have evolved in several different ways [
<xref ref-type="bibr" rid="CR3">3</xref>
,
<xref ref-type="bibr" rid="CR4">4</xref>
]. The plastids of green, red and glaucophyte algae are believed to have evolved from a single “primary” endosymbiotic event that occurred between a eukaryotic host and a cyanobacterial endosymbiont. Subsequently, the primary plastids of red and green algae spread to other eukaryotic lineages by “secondary” endosymbiosis, e.g., in the haptophytes, stramenopiles and euglenophytes. Genome size reduction is a commonly seen phenomenon in endosymbiosis. Plastid and mitochondrial genomes are typically &lt;5% of the size of the bacterial genomes from which they are believed to have evolved [
<xref ref-type="bibr" rid="CR5">5</xref>
,
<xref ref-type="bibr" rid="CR6">6</xref>
] and in the case of secondary endosymbiosis the nucleus of the eukaryotic endosymbiont is usually lost entirely [
<xref ref-type="bibr" rid="CR3">3</xref>
,
<xref ref-type="bibr" rid="CR4">4</xref>
]. This massive genome reduction is due to the combined effects of endosymbiotic gene transfer (EGT) and the outright loss of genes presumed to be unnecessary for an endosymbiotic lifestyle [
<xref ref-type="bibr" rid="CR5">5</xref>
–
<xref ref-type="bibr" rid="CR7">7</xref>
]. More than 1,000 organelle-targeted proteins, encoded by the host nucleus, are necessary for the proper functioning of modern-day plastids and mitochondria [
<xref ref-type="bibr" rid="CR8">8</xref>
].</p>
<p>Nucleomorphs are residual secondary endosymbiont nuclei found in two different eukaryotic algal lineages, the chlorarachniophytes and cryptophytes [
<xref ref-type="bibr" rid="CR9">9</xref>
]. These unusual organelles exist between the second and third envelope membranes of their plastids. This space is known as the periplastidial compartment (PPC), which corresponds to the cytosol of the engulfed eukaryotic endosymbiont. Molecular phylogenetic analyses have shown that the nucleomorph and plastid of chlorarachniophytes are derived from endosymbiotic green algae, whereas red algae gave rise to the nucleomorph and plastid in cryptophytes [
<xref ref-type="bibr" rid="CR10">10</xref>
–
<xref ref-type="bibr" rid="CR12">12</xref>
]. Nucleomorph genomes are highly reduced: at present the observed size range is between 0.33 and 1 megabase pairs (Mbp) [
<xref ref-type="bibr" rid="CR13">13</xref>
–
<xref ref-type="bibr" rid="CR15">15</xref>
]. Despite their independent origins, the nucleomorph genomes of both cryptophytes and chlorarachniophytes exhibit common structural features such as the presence of three linear chromosomes and sub-telomeric ribosomal RNA (rRNA) operons [
<xref ref-type="bibr" rid="CR9">9</xref>
,
<xref ref-type="bibr" rid="CR13">13</xref>
,
<xref ref-type="bibr" rid="CR14">14</xref>
,
<xref ref-type="bibr" rid="CR16">16</xref>
]. These similarities suggest that both endosymbiont genomes have been subjected to similar reductive pressures during the process of secondary endosymbiosis.</p>
<p>Nucleomorph genomes have been completely sequenced in one chlorarachniophyte,
<italic>Bigelowiella natans</italic>
[
<xref ref-type="bibr" rid="CR17">17</xref>
], and four cryptophytes,
<italic>Guillardia theta</italic>
[
<xref ref-type="bibr" rid="CR18">18</xref>
],
<italic>Hemiselmis andersenii</italic>
[
<xref ref-type="bibr" rid="CR19">19</xref>
],
<italic>Cryptomonas paramecium</italic>
[
<xref ref-type="bibr" rid="CR20">20</xref>
] and
<italic>Chroomonas mesostigmatica</italic>
[
<xref ref-type="bibr" rid="CR21">21</xref>
]. The number of protein genes encoded by the nucleomorph genomes examined thus far is between 300 and 500 and gene density is high, in some cases approximately one gene per kbp. Protein sequence lengths and intergenic spacer regions are also reduced compared to those of free-living algae [
<xref ref-type="bibr" rid="CR19">19</xref>
–
<xref ref-type="bibr" rid="CR21">21</xref>
]. As in mitochondria, plastids, and some endosymbionts and parasites, nucleomorph genomes exhibit a highly biased A + T content (ca. 75%) [
<xref ref-type="bibr" rid="CR9">9</xref>
,
<xref ref-type="bibr" rid="CR22">22</xref>
–
<xref ref-type="bibr" rid="CR26">26</xref>
]. Most of the proteins encoded by nucleomorph genomes are predicted to be involved in housekeeping functions such as translation and transcription [
<xref ref-type="bibr" rid="CR17">17</xref>
–
<xref ref-type="bibr" rid="CR21">21</xref>
]. Only a small number of genes for plastid-associated proteins are present (17 in the chlorarachniophyte
<italic>B. natans</italic>
and 18–31 in cryptophytes). Tanifuji
<italic>et al.</italic>
[
<xref ref-type="bibr" rid="CR20">20</xref>
] found that the sequenced nucleomorph genomes of cryptophytes and chlorarachniophytes share a similar set of core proteins, suggesting convergent patterns of gene retention in these independently reduced genomes. Overall, chlorarachniophyte and cryptophyte nucleomorph genomes are similar in terms of size, structure and gene content, despite having evolved from different algal endosymbionts.</p>
<p>Perhaps the most striking difference between chlorarachniophyte and cryptophyte nucleomorph genomes is the number of spliceosomal introns. While chlorarachniophyte nucleomorph genomes are intron-rich (
<italic>B. natans</italic>
has 852 of them), few or no introns are found in cryptophyte nucleomorph genomes [
<xref ref-type="bibr" rid="CR9">9</xref>
,
<xref ref-type="bibr" rid="CR17">17</xref>
–
<xref ref-type="bibr" rid="CR21">21</xref>
,
<xref ref-type="bibr" rid="CR27">27</xref>
]. Nucleomorph introns in the chlorarachniophytes
<italic>B. natans</italic>
and
<italic>Gymnochlora stellata</italic>
are tiny (18–24 bp) compared to those in green algae [
<xref ref-type="bibr" rid="CR17">17</xref>
,
<xref ref-type="bibr" rid="CR27">27</xref>
]. In the green algae
<italic>Chlamydomonas reinhardtii</italic>
and
<italic>Ostreococcus tauri</italic>
, for example, mean intron lengths are 373 and 103 bp, respectively [
<xref ref-type="bibr" rid="CR28">28</xref>
,
<xref ref-type="bibr" rid="CR29">29</xref>
]. Comparison of intron positions in the
<italic>B. natans</italic>
nucleomorph genome to green algae (e.g.,
<italic>C. reinhardtii</italic>
) and plants suggests that the nucleomorph introns were acquired from the green algal nucleus that gave rise to the nucleomorph [
<xref ref-type="bibr" rid="CR17">17</xref>
,
<xref ref-type="bibr" rid="CR30">30</xref>
]. The radical intron size reduction in chlorarachniophyte nucleomorph genomes appears to have taken place prior to the divergence of the modern-day species that have been examined [
<xref ref-type="bibr" rid="CR27">27</xref>
].</p>
<p>Although knowledge of nucleomorph genome evolution is accumulating, progress is hampered by the existence of only a single chlorarachniophyte nucleomorph genome sequence. The true diversity of nucleomorph genome structure and coding capacity in chlorarachniophytes is at present unclear. In order to gain insight into the patterns and processes of reduction within and between chlorarachniophyte and cryptophyte nucleomorph genomes, additional sequences from chlorarachniophytes are necessary. Plastid genome sequences are also important pieces of the puzzle. Complete plastid genomes have been sequenced for three cryptophytes,
<italic>G. theta</italic>
[
<xref ref-type="bibr" rid="CR31">31</xref>
],
<italic>Rhodomonas salina</italic>
[
<xref ref-type="bibr" rid="CR32">32</xref>
] and
<italic>C. paramecium</italic>
[
<xref ref-type="bibr" rid="CR33">33</xref>
], but only a single chlorarachniophyte plastid genome sequence is presently available, that of
<italic>B. natans</italic>
[
<xref ref-type="bibr" rid="CR12">12</xref>
].</p>
<p>Here we present complete nucleomorph and plastid genome sequences for the chlorarachniophyte
<italic>Lotharella oceanica</italic>
and compare them to their counterparts in the model chlorarachniophyte
<italic>B. natans</italic>
. The results suggest that recombination is an important factor driving the shrinkage and occasional expansion of nucleomorph genomes in chlorarachniophytes and perhaps cryptophytes. We have also carried out a phylogenetic analysis of nucleomorph and plastid proteins in order to gain insight into the origin of the chlorarachniophyte secondary endosymbiont.</p>
</sec>
<sec id="Sec2">
<title>Results and discussion</title>
<sec id="Sec3">
<title>
<italic>Lotharella oceanica</italic>
nucleomorph and plastid genome sequences</title>
<p>The
<italic>L. oceanica</italic>
nucleomorph genome consists of three linear chromosomes totaling ~610 kbp, which is ~240 kbp larger than that of
<italic>B. natans</italic>
, the first complete nucleomorph genome sequenced for a chlorarachniophyte (Figure 
<xref rid="Fig1" ref-type="fig">1</xref>
and Table 
<xref rid="Tab1" ref-type="table">1</xref>
). The
<italic>L. oceanica</italic>
chromosomes are ~210,000, 207,543 and 194,115 bp in size. On chromosome I, an internally repeated region consisting of at least five tandem repeats containing ClpC and tfIIa-gamma genes and three additional ORFs was identified (the repeat number was estimated by considering the sequence coverage depth in this region) (Figure 
<xref rid="Fig1" ref-type="fig">1</xref>
). Because of this repeat, the exact size of chromosome I is unclear. However, our assembly-based chromosome size predictions are consistent with previous estimates of ~210, ~205, and ~195 kbp obtained by pulsed-field gel electrophoresis [
<xref ref-type="bibr" rid="CR13">13</xref>
].
<fig id="Fig1">
<label>Figure 1</label>
<caption>
<p>
<bold>Physical map of the</bold>
<bold>
<italic>Lotharella oceanica</italic>
</bold>
<bold>nucleomorph genome.</bold>
The genome is ~610 kbp in size with three chromosomes, shown artificially broken at their midpoint. Annotated genes are colored according to the functional categories shown in the lower right. The exact number of tandem repeats containing the ClpC and tfIIa-gamma genes on chromosome I is not known but was estimated to be at least five. Orange boxes indicate regions syntenic with the
<italic>Bigelowiella natans</italic>
nucleomorph genome (see text). Gray boxes show multi-copy regions. Genes mapped on the left side of each chromosome are transcribed bottom to top and those on the right, top to bottom.</p>
</caption>
<graphic xlink:href="12864_2014_6068_Fig1_HTML" id="d30e670"></graphic>
</fig>
</p>
<table-wrap id="Tab1">
<label>Table 1</label>
<caption>
<p>
<bold>Summary of chlorarachniophyte nucleomorph and plastid genomes</bold>
</p>
</caption>
<table frame="hsides" rules="groups">
<tbody>
<tr>
<td>
<bold>Nucleomorph</bold>
</td>
<td>
<bold>
<italic>L. oceanica</italic>
</bold>
</td>
<td>
<bold>
<italic>B. natans</italic>
</bold>
</td>
</tr>
<tr>
<td>Total genome size (bp)</td>
<td>~611,658</td>
<td>372,879</td>
</tr>
<tr>
<td>chr. 1 size (bp)</td>
<td>~210,000</td>
<td>140,598</td>
</tr>
<tr>
<td>chr. 2 size (bp)</td>
<td>207,543</td>
<td>134,144</td>
</tr>
<tr>
<td>chr. 3 size (bp)</td>
<td>194,115</td>
<td>98,137</td>
</tr>
<tr>
<td>G + C content (%)</td>
<td>33.0 (24.0*)</td>
<td>28.5 (25.1*)</td>
</tr>
<tr>
<td># of introns</td>
<td>1,011</td>
<td>951**</td>
</tr>
<tr>
<td># of tRNAs</td>
<td>19</td>
<td>19</td>
</tr>
<tr>
<td>Average intergenic spacer size (bp)</td>
<td>147.4 (140.9*)</td>
<td>82.1</td>
</tr>
<tr>
<td># of protein genes</td>
<td>610</td>
<td>284**</td>
</tr>
<tr>
<td># of non-redundant protein genes</td>
<td>348</td>
<td>283</td>
</tr>
<tr>
<td>
<bold>Plastid</bold>
</td>
<td>
<bold>
<italic>L. oceanica</italic>
</bold>
</td>
<td>
<bold>
<italic>B. natans</italic>
</bold>
</td>
</tr>
<tr>
<td>Total genome size (bp)</td>
<td>70,997</td>
<td>~69,000</td>
</tr>
<tr>
<td># of protein genes</td>
<td>60</td>
<td>61</td>
</tr>
<tr>
<td>G + C content (%)</td>
<td>30.6</td>
<td>30.2</td>
</tr>
<tr>
<td># of tRNAs</td>
<td>28</td>
<td>31</td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<p>*Calculated with subtelomeric regions excluded.</p>
<p>**Numbers updated from original publication (Gilson et al. [
<xref ref-type="bibr" rid="CR17">17</xref>
]).</p>
</table-wrap-foot>
</table-wrap>
<p>The G + C content of the three
<italic>L. oceanica</italic>
nucleomorph chromosomes is 33.6%, 32.8%, and 32.7% for I, II, and III, respectively. However, ~35 kbp sub-telomeric regions repeated on each of the six chromosome ends were found to be of much higher G + C content, 49.4% (see below). Excluding the sub-telomeric regions, the G + C content is 25.3, 23.9, and 22.8% for chromosomes I, II, and III, similar to that in the
<italic>B. natans</italic>
nucleomorph genome (25.1%) as well as the four cryptophyte nucleomorph genomes sequenced to date (25.2%-26.4%) [
<xref ref-type="bibr" rid="CR9">9</xref>
,
<xref ref-type="bibr" rid="CR17">17</xref>
–
<xref ref-type="bibr" rid="CR21">21</xref>
].</p>
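<p>The comparison of whole-chromosome and internal-region G + C content can be illustrated with a minimal sketch. The function names and the toy sequence below are illustrative assumptions, not the procedure used in this study; the sketch only shows how trimming a fixed-length region from each chromosome end changes the calculated value.</p>

```python
def gc_content(seq: str) -> float:
    """Return the G + C content of seq as a percentage.

    Only A, C, G and T are counted; ambiguous bases are ignored.
    """
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    total = gc + at
    if total == 0:
        return 0.0
    return 100.0 * gc / total


def gc_excluding_ends(seq: str, end_len: int) -> float:
    """G + C content after trimming end_len bases from each chromosome end."""
    if end_len * 2 >= len(seq):
        raise ValueError("trimmed regions cover the whole sequence")
    return gc_content(seq[end_len:len(seq) - end_len])


# Toy chromosome (hypothetical): G + C-rich ends flanking an A + T-rich interior.
toy_end = "GCGCATGCGC"        # 80% G + C
toy_internal = "AATTAATTGC"   # 20% G + C
toy_chromosome = toy_end + toy_internal * 3 + toy_end

whole = gc_content(toy_chromosome)                 # ends included
internal = gc_excluding_ends(toy_chromosome, 10)   # ends excluded
```

<p>In this toy case the value drops from 44.0% to 20.0% once the ends are excluded, mirroring the direction of the difference reported above.</p>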
<p>The plastid genome of
<italic>L. oceanica</italic>
was found to be circular and 70,997 bp in size, encoding 60 proteins (two pseudogenes) and 28 tRNAs, plus 6 rRNAs in inverted repeats (Figure 
<xref rid="Fig2" ref-type="fig">2</xref>
and Table 
<xref rid="Tab1" ref-type="table">1</xref>
). The G + C content is 30.6%. The genome is highly similar to the
<italic>B. natans</italic>
plastid genome in size, gene order, G + C content and structure, albeit with a few exceptions: 1) a psaA gene is missing from one of the two ends of the inverted repeats, and a petD gene is missing from the other end; 2) the positions of petD and petB are switched; and 3) a gene encoding a putative reverse transcriptase was found in
<italic>L. oceanica</italic>
between the chlI and petA genes, implying the existence of a group II intron in
<italic>L. oceanica</italic>
. However, an ORF was not detectable in the vicinity of the reverse transcriptase-encoding region (Figure 
<xref rid="Fig2" ref-type="fig">2</xref>
). The
<italic>L. oceanica</italic>
plastid genome contains three small inverted repeats. One consists of a 28 bp sequence located between the psbJ and tRNA-Phe genes; the other two, of 76 bp and 93 bp, are located side by side between atpI and psbE.
The region between the atpI and psbE genes could not be sequenced in
<italic>B. natans</italic>
, presumably due to extensive secondary structure. We were successful in determining the DNA sequence of the corresponding region in the
<italic>L. oceanica</italic>
plastid genome. However, its sequence characteristics do not provide further insight into whether it is an origin of replication, as was suggested for
<italic>B. natans</italic>
[
<xref ref-type="bibr" rid="CR12">12</xref>
]. The unusual rRNA operon inversion seen in
<italic>B. natans</italic>
, in which the small and large subunit rRNA genes are on opposite strands, is also present in the
<italic>L. oceanica</italic>
plastid genome.
<fig id="Fig2">
<label>Figure 2</label>
<caption>
<p>
<bold>Circular physical map of the plastid genome of</bold>
<bold>
<italic>Lotharella oceanica.</italic>
</bold>
The 70,997 bp genome contains inverted rRNA operons, 60 predicted protein genes, and 28 tRNA genes. Genes shown on the outside of the circle are transcribed clockwise. Annotated genes are colored according to the functional categories shown in the center.</p>
</caption>
<graphic xlink:href="12864_2014_6068_Fig2_HTML" id="d30e953"></graphic>
</fig>
</p>
</sec>
<sec id="Sec4">
<title>Synteny, gene content and intron evolution in chlorarachniophyte nucleomorph genomes</title>
<p>The structure and coding capacity of the nucleomorph genomes in
<italic>L. oceanica</italic>
and
<italic>B. natans</italic>
are summarized in Table 
<xref rid="Tab1" ref-type="table">1</xref>
and Additional file
<xref rid="MOESM1" ref-type="media">1</xref>
. A total of 610 protein genes were predicted in the
<italic>L. oceanica</italic>
nucleomorph genome; this number is roughly twice that in
<italic>B. natans</italic>
. However, this is mainly due to a total of 258 predicted ORFs with no introns in the ~35 kbp sub-telomeric regions. In fact, the number of non-redundant protein genes (counting each multicopy gene as one) is 348, 65 more than in
<italic>B. natans</italic>
(283) (Table 
<xref rid="Tab1" ref-type="table">1</xref>
and Additional file
<xref rid="MOESM1" ref-type="media">1</xref>
). Since the sub-telomeric region of each chromosome end shows extremely low levels of gene expression compared with internally located genes, it is unclear if the sub-telomeric ORFs are real protein-coding genes (see below). In addition, the total number of introns in both genomes is similar, with 1,011 in
<italic>L. oceanica</italic>
and 951 in
<italic>B. natans</italic>
(see following section). Both genomes contain 19 tRNA genes, although the two sets differ slightly in content. Bona fide snRNAs were not found in the
<italic>L. oceanica</italic>
nucleomorph genome.</p>
<p>The average intergenic spacer size in the
<italic>L. oceanica</italic>
genome is 147.4 bp, which is significantly larger than in
<italic>B. natans</italic>
(82.1 bp) (supported by a t-test (
<italic>p</italic>
 < 0.001) (Table 
<xref rid="Tab1" ref-type="table">1</xref>
)). The average length of all predicted proteins in the
<italic>L. oceanica</italic>
genome (247.2 amino acids) was found to be smaller than that in
<italic>B. natans</italic>
(326.8) (
<italic>p</italic>
 < 0.001). However, no significant difference (
<italic>p</italic>
 > 0.05) was seen in the average length of 166 shared proteins (excluding ORFs with identifiable protein motifs and unique ORFs found in nucleomorph genomes), with an average of 334.4 amino acids in
<italic>L. oceanica</italic>
and 343.1 in
<italic>B. natans</italic>
.</p>
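<p>As an illustration of how an average intergenic spacer size can be derived from gene coordinates, consider the following minimal sketch. The coordinate convention (1-based, inclusive), the treatment of overlapping genes (they contribute no spacer) and the toy coordinates are all assumptions made for this example; the source does not specify how such cases were handled.</p>

```python
from statistics import mean


def intergenic_spacers(genes):
    """Return the lengths of the gaps between consecutive genes.

    genes: list of (start, end) tuples, 1-based inclusive, on one chromosome.
    Overlapping or abutting genes contribute no spacer (an assumption).
    """
    if not genes:
        return []
    ordered = sorted(genes)
    spacers = []
    max_end = ordered[0][1]  # right-most gene end seen so far
    for start, end in ordered[1:]:
        gap = start - max_end - 1
        if gap > 0:
            spacers.append(gap)
        max_end = max(max_end, end)
    return spacers


# Toy annotation (hypothetical coordinates): the second and third genes overlap.
toy_genes = [(1, 100), (151, 300), (290, 400), (501, 600)]
toy_spacers = intergenic_spacers(toy_genes)   # [50, 100]
toy_mean = mean(toy_spacers)                  # 75
```

<p>Spacer lists computed this way for two genomes could then be compared with a two-sample test such as the t-test mentioned above.</p>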
<p>The
<italic>L. oceanica</italic>
nucleomorph chromosomes were found to contain a repetitive sequence consisting of a 36 nt repeat located between all of the ~35 kbp sub-telomeric regions and the internal single-copy regions (Figure 
<xref rid="Fig1" ref-type="fig">1</xref>
; between 5 and >29 repeats of the following sequence: RTAYCTRGTTRCCTTATCGTATGCCATGGCTTTATC). Repetitive sequences were found in the nucleomorph genome of the cryptophyte
<italic>C. mesostigmatica</italic>
[
<xref ref-type="bibr" rid="CR21">21</xref>
], but in that case they consisted of A-T-rich simple repeats such as [TTA]
<sub>n</sub>
located in the ITS between the 5S and 28S rDNA, and [AT
<sub>4–5</sub>
]
<sub>14</sub>
and [TA
<sub>2</sub>
GA
<sub>2</sub>
TA
<sub>5</sub>
]
<sub>4–25</sub>
in intergenic spacers of the sub-telomeric repeats. A long homopolymer of [A/T]
<sub>24–37</sub>
was also found in several sites within 28S rDNA in
<italic>C. mesostigmatica</italic>
. In contrast, the repetitive sequences in
<italic>L. oceanica</italic>
were less A-T rich and less variable in sequence. Intriguingly, each 36 nt repeat in
<italic>L. oceanica</italic>
has a potential initiation codon in the same reading frame as the gene for the molecular chaperone dnaK, which is of cyanobacterial origin and presumed to be targeted to the plastid. However, transit peptide prediction for the repetitive sequence, as well as for the region immediately upstream of the predicted dnaK start codon, did not identify a stretch of amino acids predicted to target the protein to the plastid. On the other hand, an additional dnaK gene located internally on chromosome I was predicted to encode a transit peptide using the ChloroP [
<xref ref-type="bibr" rid="CR34">34</xref>
] and TargetP [
<xref ref-type="bibr" rid="CR35">35</xref>
] programs.</p>
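<p>The 36 nt repeat consensus given above uses the standard IUPAC ambiguity codes R (A or G) and Y (C or T). A minimal sketch of how copies of this unit could be counted in a sequence is shown below; the function names and the toy sequence are illustrative assumptions, and only exact matches to the degenerate consensus are counted.</p>

```python
import re

# IUPAC codes needed for the consensus reported in the text.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "R": "[AG]", "Y": "[CT]"}

# Consensus of the 36 nt repeat unit, copied from the text.
REPEAT = "RTAYCTRGTTRCCTTATCGTATGCCATGGCTTTATC"


def repeat_pattern(consensus: str):
    """Translate an IUPAC consensus into a compiled regular expression."""
    return re.compile("".join(IUPAC[base] for base in consensus))


def count_repeats(seq: str) -> int:
    """Count non-overlapping exact matches to the degenerate repeat unit."""
    return len(repeat_pattern(REPEAT).findall(seq.upper()))


# Toy sequence (hypothetical): three tandem units with R = A and Y = C.
toy_unit = REPEAT.replace("R", "A").replace("Y", "C")
toy_seq = "GG" + toy_unit * 3 + "TT"
toy_count = count_repeats(toy_seq)   # 3
```

<p>The consensus itself contains ATG, in keeping with the potential initiation codon noted above.</p>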
<p>Genome synteny has been investigated previously in four completely sequenced cryptophyte nucleomorph genomes [
<xref ref-type="bibr" rid="CR19">19</xref>
–
<xref ref-type="bibr" rid="CR21">21</xref>
]. In
<italic>L. oceanica</italic>
and
<italic>B. natans</italic>
we identified 17 short blocks of synteny (14 blocks using the same criteria as Moore
<italic>et al</italic>
. [
<xref ref-type="bibr" rid="CR21">21</xref>
]) (Figure 
<xref rid="Fig1" ref-type="fig">1</xref>
). In contrast, the plastid genomes in the two chlorarachniophytes showed almost the same gene order. A similar pattern is seen in cryptophytes. The plastid genomes of the photosynthetic species
<italic>G. theta</italic>
and
<italic>R. salina</italic>
are completely syntenic, and while the non-photosynthetic species
<italic>C. paramecium</italic>
contains one rearranged region around the ribosomal operon in its plastid genome, the overall level of synteny is still high [
<xref ref-type="bibr" rid="CR31">31</xref>
–
<xref ref-type="bibr" rid="CR33">33</xref>
]. This suggests that recombination is much more frequent in the nucleomorph genomes of both cryptophytes and chlorarachniophytes than in the plastid genomes (refer to the following section for further discussion). In sum, the nucleomorph genomes of
<italic>B. natans</italic>
and
<italic>L. oceanica</italic>
differ remarkably in terms of the number of predicted proteins, structure and synteny, while their plastid genomes have changed little since the two organisms diverged from a common ancestor.</p>
</sec>
<sec id="Sec5">
<title>Sub-telomeric repeats and secondary expansion of the
<italic>L. oceanica</italic>
nucleomorph genome</title>
<p>The most unexpected finding in the
<italic>L. oceanica</italic>
nucleomorph genome was the presence of large, nearly identical (with only 3 nucleotide differences) ~35 kbp sub-telomeric repeats on each of the six chromosome ends (Figure
<xref rid="Fig1" ref-type="fig">1</xref>
). As mentioned above, these repeats have a G + C content (49.4%) that is much higher than that of the internal regions of the chromosomes, which are ~25% G + C (Figure 
<xref rid="Fig3" ref-type="fig">3</xref>
). Lane
<italic>et al.</italic>
[
<xref ref-type="bibr" rid="CR36">36</xref>
] reported a ‘high’ G + C content in the short sub-telomeric region (830–4,030 bp) between the rDNA operon and the nearby ubc4 gene in the cryptophyte nucleomorph genomes of
<italic>G. theta</italic>
,
<italic>Hanusia phi</italic>
and
<italic>Proteomonas sulcata</italic>
. Regions of ~50% G + C, including the rDNA operons themselves, also exist on at least four of the six chromosome ends in completely sequenced cryptophyte nucleomorph genomes [
<xref ref-type="bibr" rid="CR9">9</xref>
]. However, in these species the repeats are relatively short with lengths ranging from 7–12 kbp (including 6–7.6 kbp of rDNA).
<fig id="Fig3">
<label>Figure 3</label>
<caption>
<p>
<bold>G + C content and gene expression in the</bold>
<bold>
<italic>Lotharella oceanica</italic>
</bold>
<bold>nucleomorph genome.</bold>
Gray boxes indicate gene expression levels corresponding to the RNA-Seq coverage depth of each gene. Red lines show G + C levels along the chromosomes, as plotted by the Artemis genome annotation software with default settings. The black boxes under the graph indicate multicopy regions.</p>
</caption>
<graphic xlink:href="12864_2014_6068_Fig3_HTML" id="d30e1184"></graphic>
</fig>
</p>
<p>Also remarkable are the protein genes located within the nucleomorph sub-telomeric repeats of
<italic>L. oceanica</italic>
. Of the 45 protein genes predicted in this region, only two show obvious sequence similarity to known proteins (dnaK and hsp70) (as in previous studies, potential genes were defined as being >150 nucleotides in length from an initiator methionine codon). Interestingly, transcriptome analyses of
<italic>L. oceanica</italic>
showed that the level of transcription of these 43 unknown genes in sub-telomeric repeats was extremely low (the mean RNA-Seq coverage of the 43 ORFs was 0.536; 26/46 ORFs show no coverage at all), almost 2,500 times lower than the average expression level for the protein genes in single-copy regions of the chromosomes (Figure 
<xref rid="Fig3" ref-type="fig">3</xref>
). It is therefore possible that the 43 ORFs lacking obvious functions in the
<italic>L. oceanica</italic>
sub-telomeric regions are in fact not real protein genes. A recent comprehensive analysis of nucleomorph gene expression in
<italic>G. theta, C. mesostigmatica, C. paramecium</italic>
and
<italic>B. natans</italic>
showed that as a whole, nucleomorphs exhibit high transcription levels, with >97% of these four nucleomorph genomes being transcribed into mRNA, including non-coding regions [
<xref ref-type="bibr" rid="CR37">37</xref>
]. Therefore, the extremely low transcription levels of the ORFs residing within the ~35 kbp sub-telomeric regions of the
<italic>L. oceanica</italic>
genome are unusual and consistent with them being spurious ORFs. The completely sequenced nucleomorph genomes of
<italic>G. theta, C. mesostigmatica, C. paramecium</italic>
and
<italic>B. natans</italic>
also contain sub-telomeric repeats, but they are much shorter than those in
<italic>L. oceanica</italic>
, and we were unable to obtain a clear picture of the extent to which they are expressed.</p>
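<p>The ORF criterion used above (more than 150 nucleotides from an initiator methionine codon) can be sketched as follows. This is a minimal illustration under stated assumptions, not the annotation procedure used in this study: it scans only the forward strand, assumes the standard stop codons TAA, TAG and TGA, and counts the stop codon in the ORF length.</p>

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard genetic code (assumption)


def find_orfs(seq: str, min_len: int = 150):
    """Return (start, end) pairs of forward-strand ORFs longer than min_len nt.

    Coordinates are 0-based and end-exclusive. Each ORF runs from the first
    ATG after the previous stop codon to the next in-frame stop codon.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                if i + 3 - start > min_len:
                    orfs.append((start, i + 3))
                start = None
    return orfs


# Toy sequences (hypothetical): one ORF of 156 nt and one of 36 nt.
long_orf = "ATG" + "GCT" * 50 + "TAA"
short_orf = "ATG" + "GCT" * 10 + "TAA"
long_hits = find_orfs(long_orf)     # [(0, 156)]
short_hits = find_orfs(short_orf)   # []
```

<p>A length threshold alone accepts any sufficiently long open frame, which is why the expression data discussed above are informative about whether such ORFs are real genes.</p>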
<p>What is the biological significance of the long sub-telomeric repeat regions in
<italic>L. oceanica</italic>
? Gene conversion via genetic recombination is the most likely explanation for the maintenance of the near-identical rDNA sequences at the nucleomorph chromosome ends and such a process presumably acts to homogenize the adjacent sub-telomeric region as well. It seems significant that there are short repetitive sequences located next to the ~35 kbp sub-telomeric regions in
<italic>L. oceanica</italic>
, which show variable numbers of sequence units (5 to >29 repeats, 180 to ~1,000 bp) (Figure 
<xref rid="Fig1" ref-type="fig">1</xref>
). These heterogeneous repetitive sequences might be mediators of genetic recombination within and between nucleomorph chromosomes. Also, the elevated G + C content of the sub-telomeric regions in the
<italic>L. oceanica</italic>
nucleomorph genome is consistent with the phenomenon of GC-biased gene conversion, which has been demonstrated in other genomes such as those of mammals [
<xref ref-type="bibr" rid="CR38">38</xref>
,
<xref ref-type="bibr" rid="CR39">39</xref>
].</p>
<p>Investigation of the hsp70 gene in
<italic>L. oceanica</italic>
provided insight into nucleomorph genome dynamics. No fewer than 10 copies of hsp70 were found in the genome: an ‘internal’ gene on chromosome I, another on chromosome II, six copies within the sub-telomeric repeats and two pseudogenes on chromosomes I and III (Figure 
<xref rid="Fig1" ref-type="fig">1</xref>
). The sub-telomeric hsp70 genes have a G + C content of 44.1%, higher than that of the internal copies (37.3% and 35.0%). Homologous recombination events have presumably occurred at the nucleomorph chromosome ends, resulting in hsp70 genes whose sequences are homogenized by gene conversion. The G + C content of the hsp70 genes located internally on chromosomes I and II is still higher than the average G + C content (~25%, excluding sub-telomeric regions). This is the case for other internally located multicopy regions as well (Figure 
<xref rid="Fig3" ref-type="fig">3</xref>
). These observations suggest that gene conversion may be homogenizing multicopy genes throughout the
<italic>L. oceanica</italic>
nucleomorph genome, not just those residing in sub-telomeric regions. Such a process could result in the loss of some or all of the sub-telomeric repeat regions; homologous regions could disappear by unequal crossing over between sister chromatids. In cryptophytes, evidence for active recombination in nucleomorph genomes has come from investigations of
<italic>C. paramecium</italic>
and
<italic>H. andersenii</italic>
, where complete rDNA operons have been lost from two and three of the six chromosome ends, respectively (only 5S rDNA regions remain) [
<xref ref-type="bibr" rid="CR19">19</xref>
,
<xref ref-type="bibr" rid="CR20">20</xref>
].</p>
<p>In order to gain further insight into the dynamic nature of nucleomorph chromosomes in
<italic>L. oceanica</italic>
, we examined the evolution of hsp70, a multi-copy gene present at least once on all three chromosomes and whose protein product is amenable to phylogenetic analysis. The three main types of hsp70 genes/proteins in
<italic>L. oceanica</italic>
were found to be monophyletic in phylogenetic trees and to branch sister to the
<italic>B. natans</italic>
nucleomorph hsp70, which is not sub-telomeric in its location (Additional file
<xref rid="MOESM2" ref-type="media">2</xref>
). This result is consistent with the idea that gene duplication occurred sometime after
<italic>L. oceanica</italic>
and
<italic>B. natans</italic>
diverged from one another. Alternatively,
<italic>B. natans</italic>
may have possessed multiple hsp70 genes in its nucleomorph genome in the past, as does
<italic>L. oceanica</italic>
today, but lost them, perhaps due to recombination. On balance, however, it seems likely that while nucleomorph genomes are highly reduced, active recombination at the sub-telomeric repeats can serve to expand the genome over evolutionary time.</p>
<p>Previous nucleomorph comparative genomic investigations revealed the existence of numerous blocks of synteny (more than three homologous genes in the same order) among the cryptophyte genomes of
<italic>G. theta</italic>
,
<italic>H. andersenii</italic>
and
<italic>C. paramecium</italic>
, some of which are in the range of ~25-45 kbp. However, the genome of the cryptophyte
<italic>C. mesostigmatica</italic>
was found to be much more fragmented compared to the other three, despite the fact that
<italic>C. mesostigmatica</italic>
and
<italic>H. andersenii</italic>
are specifically related to one another [
<xref ref-type="bibr" rid="CR21">21</xref>
]. Moore <italic>et al.</italic> [
<xref ref-type="bibr" rid="CR21">21</xref>
] showed that in
<italic>C. mesostigmatica</italic>
, the average number of genes in each syntenic block compared to
<italic>G. theta</italic>
,
<italic>H. andersenii</italic>
and
<italic>C. paramecium</italic>
is 6.7-9.0, whereas it is 9.4-10.9 when these three cryptophyte species are compared to one another. Compared to their plastid genomes, which show an extremely high level of gene synteny, the nucleomorph genomes of
<italic>L. oceanica</italic>
and
<italic>B. natans</italic>
are highly scrambled. We identified 14 nucleomorph syntenic blocks (using the same criteria as Moore
<italic>et al.</italic>
[
<xref ref-type="bibr" rid="CR21">21</xref>
]) and the average number of genes in these regions was found to be 5.5, even smaller than the level of synteny seen between
<italic>C. mesostigmatica</italic>
and the other three cryptophyte nucleomorph genomes. Lane
<italic>et al</italic>
. [
<xref ref-type="bibr" rid="CR19">19</xref>
] suggested that the high level of synteny in cryptophyte nucleomorph genomes is due to relatively low rates of recombination in internal regions. Why might the nucleomorph genomes of
<italic>L. oceanica</italic>
and
<italic>C. mesostigmatica</italic>
possess somewhat smaller blocks of synteny compared to the other nucleomorph genomes? Although unrelated to one another, these two nucleomorph genomes are similar in their possession of multicopy regions, repetitive sequence elements and somewhat larger genome sizes. Therefore, it is possible that, in addition to homologous recombination at the sub-telomeric regions, homologous and non-homologous recombination has occurred in internal regions along the entire lengths of the chromosomes in
<italic>L. oceanica</italic>
and
<italic>C. mesostigmatica</italic>
. The significantly larger mean intergenic spacer size in
<italic>L. oceanica</italic>
(147.4 bp) compared to
<italic>B. natans</italic>
(82.1 bp) also supports the idea of more frequent internal recombination; in regions where gene density is lower, recombination events would be less likely to disrupt essential genes. Compared to other cryptophytes, the
<italic>C. mesostigmatica</italic>
nucleomorph genome shows a somewhat larger average intergenic spacer size as well [
<xref ref-type="bibr" rid="CR21">21</xref>
]. A more detailed picture of the impact of recombination on genome structure in chlorarachniophyte nucleomorphs will require genomic data from additional strains and species; with only two sequences in hand, it is unclear which of the two genomes, that of
<italic>L. oceanica</italic>
or
<italic>B. natans</italic>
, is more recombinogenic.</p>
</sec>
<sec id="Sec6">
<title>Convergent nucleomorph genome evolution in cryptophytes and chlorarachniophytes</title>
<p>The
<italic>L. oceanica</italic>
nucleomorph genome contains 610 predicted protein genes. Of the 348 non-redundant protein genes, 160 were categorized as proteins with predicted functions shared with other eukaryotes under previously proposed criteria [
<xref ref-type="bibr" rid="CR20">20</xref>
]. 17 of 348 non-redundant protein genes had homologs in cyanobacteria and were designated ‘plastid-associated’. 171 protein genes were considered to be ‘nucleomorph ORFans’ (nORFans), that is, nucleomorph proteins with unknown functions (most of these proteins show no sequence similarity to any known protein, in nucleomorphs or other genomes; see Additional file
<xref rid="MOESM1" ref-type="media">1</xref>
). However, 43 of these nORFans reside within the sub-telomeric repeats (43 nORFans × 6 repeats), and thus account for 25% of the non-redundant nORFan genes. These genes were removed from our comparative genomic investigations.</p>
<p>Compared to the non-redundant gene set in the
<italic>B. natans</italic>
nucleomorph genome,
<italic>L. oceanica</italic>
retains similar sets of eukaryotic conserved genes (151 of 160 
<italic>L. oceanica</italic>
protein genes in these categories are present in
<italic>B. natans</italic>
) (Figure 
<xref rid="Fig4" ref-type="fig">4</xref>
, Additional files
<xref rid="MOESM1" ref-type="media">1</xref>
and
<xref rid="MOESM3" ref-type="media">3</xref>
). Fourteen genes in these categories (rpl14A, rpl24, rpl30, rpoF, rps24-like, rpoL, rad25, dip2, rhel1, pcna, rfc4, msl1, snrpE-1, and ub2) are missing in
<italic>L. oceanica</italic>
but are present in the
<italic>B. natans</italic>
nucleomorph genome, and vice versa for nine
<italic>L. oceanica</italic>
genes (rpl15, rpl21A, nop56-like, cc1-like, psf2, ruvB2-like, tbl3, KH-domain and BRCA1) (Additional file
<xref rid="MOESM1" ref-type="media">1</xref>
). Although several protein genes are missing, the loss of whole suites of proteins predicted to function together in complexes was not observed in
<italic>L. oceanica</italic>
.
<fig id="Fig4">
<label>Figure 4</label>
<caption>
<p>
<bold>Comparison of nucleomorph gene content within and between chlorarachniophyte and cryptophyte algae.</bold>
The Venn diagrams show the number of shared and / or unique genes in three categories: eukaryotic conserved (upper left), nucleomorph ORFans (upper right), and plastid-associated genes in the two chlorarachniophyte nucleomorph genomes (middle center), and core nucleomorph genes (excluding spliceosomal machinery genes and plastid-associated genes) in chlorarachniophytes and cryptophytes (bottom).</p>
</caption>
<graphic xlink:href="12864_2014_6068_Fig4_HTML" id="d30e1454"></graphic>
</fig>
</p>
<p>In stark contrast to the eukaryotic conserved genes with predicted functions, only 16.4-20.1% of nORFans were obviously shared between
<italic>L. oceanica</italic>
(21/128) and
<italic>B. natans</italic>
(21/101) (Figure 
<xref rid="Fig4" ref-type="fig">4</xref>
). Interestingly, a similar tendency was observed when considering nucleomorph gene sets among three cryptophytes [
<xref ref-type="bibr" rid="CR20">20</xref>
]. In this case, 93.0-99.6% of eukaryotic conserved ORFs (excluding spliceosome machinery genes) were shared with one another in
<italic>G. theta</italic>
,
<italic>H. andersenii</italic>
and
<italic>C. paramecium</italic>
, compared with only 12.2-23.2% of nORFans (Figure 
<xref rid="Fig4" ref-type="fig">4</xref>
). The nORFans of chlorarachniophytes and cryptophytes have been proposed to represent particularly fast-evolving genes whose sequences have diverged to the point where their homologs in related nucleomorph genomes (and other nuclear genomes) can no longer be detected [
<xref ref-type="bibr" rid="CR9">9</xref>
,
<xref ref-type="bibr" rid="CR19">19</xref>
–
<xref ref-type="bibr" rid="CR21">21</xref>
]. Additionally, the
<italic>L. oceanica</italic>
and
<italic>B. natans</italic>
genomes encode the exact same set of 17 plastid-associated proteins. This parallels the cryptophyte genomes sequenced thus far (excluding the non-photosynthetic cryptophyte,
<italic>C. paramecium</italic>
), which share a set of 31 plastid-associated protein genes [
<xref ref-type="bibr" rid="CR9">9</xref>
,
<xref ref-type="bibr" rid="CR18">18</xref>
–
<xref ref-type="bibr" rid="CR21">21</xref>
].</p>
<p>The convergent evolution of nucleomorph genomes in cryptophytes and chlorarachniophytes is apparent from the perspective of G + C content, chromosome structure (in particular the rDNA-containing chromosome ends), and the presence of three linear chromosomes. Tanifuji
<italic>et al.</italic>
[
<xref ref-type="bibr" rid="CR20">20</xref>
] extended this to the level of gene content, demonstrating that the majority of conserved eukaryotic genes (81.7%) in the
<italic>B. natans</italic>
nucleomorph genome were contained in the core set of cryptophyte nucleomorph genes, i.e., genes shared by all three photosynthetic cryptophytes (spliceosomal genes with discernible functions were excluded because the cryptophyte
<italic>H. andersenii</italic>
lacks introns and thus spliceosomal machinery). This observation is bolstered by the present study. A ‘core’ set of 112 chlorarachniophyte nucleomorph genes was identified by comparison of the
<italic>L. oceanica</italic>
and
<italic>B. natans</italic>
gene sets (excluding plastid-associated and spliceosomal genes). 90 of these 112 genes (80.4%) were shared with at least one of the cryptophytes (which have a 217-gene core set) (Figure 
<xref rid="Fig4" ref-type="fig">4</xref>
). There is now strong support for the idea that these two independently evolved nucleomorph genomes have undergone similar reductive pressures and converged upon similar gene sets.</p>
<p>The availability of a complete nuclear genome sequence for
<italic>B. natans</italic>
has allowed us to explore the fate of the nine ‘missing’ conserved eukaryotic genes in this organism (rpl15, rpl21A, nop56-like, cc1-like, psf2, ruvB2-like, tbl3, KH-domain and BRCA1). Curtis
<italic>et al.</italic>
[
<xref ref-type="bibr" rid="CR40">40</xref>
] predicted 1,002 and 2,401 nucleus-encoded, periplastidal compartment (PPC)-targeted proteins in
<italic>B. natans</italic>
and the cryptophyte
<italic>G. theta</italic>
, respectively. Interestingly, in
<italic>G. theta</italic>
16 of 17 genes missing from the nucleomorph genome but present in other cryptophyte nucleomorphs appear to have been lost without being replaced by a nucleus-encoded, PPC-targeted homolog (only the kinase-encoding gene kin(cdc) was potentially replaced by a host-derived, PPC-targeted protein). As in cryptophytes, we were unable to identify obvious orthologs of the nine genes missing from the
<italic>B. natans</italic>
nucleomorph genome that are present in the
<italic>L. oceanica</italic>
nucleomorph genome by searching the
<italic>B. natans</italic>
nuclear genome. Obvious replacements for the 19 protein genes missing from the
<italic>L. oceanica</italic>
nucleomorph genome were also not found in transcriptome data from this organism (a complete genome sequence is not available for analysis). That is, the genes do not appear to have undergone nucleomorph-to-host-nucleus gene transfer. One explanation is that the functions of these missing proteins have been taken over by functionally ambiguous nORFan proteins and/or nucleus-encoded, PPC-targeted hypothetical proteins. Determining whether this is the case will require much more comparative sequence data and a better understanding of the biochemical processes taking place in the PPC.</p>
<p>Several other factors also contribute to the larger nucleomorph genome size in
<italic>L. oceanica</italic>
. Unlike the
<italic>B. natans</italic>
genome where only an rps8 gene and sub-telomeric regions are multicopy, there are many multicopy regions in the
<italic>L. oceanica</italic>
nucleomorph genome. In addition to the sub-telomeric regions and tandem repeats containing the ClpC and tfIIa-gamma genes on chromosome I, this includes 15 internally duplicated regions (Figure 
<xref rid="Fig1" ref-type="fig">1</xref>
) that result in numerous multicopy genes (e.g., eif4A, rpl10A, rpl23, rpl27, rps9, orf264, orf150, orf328, orf363 (2 copies each) and gsp2 (3 copies)). These genes contribute to the increase in the number of total protein genes in the
<italic>L. oceanica</italic>
genome (352, excluding ORFs in sub-telomeric regions), resulting in 69 more genes than in
<italic>B. natans</italic>
(283). Another contributing factor is a slightly larger average intergenic spacer size in
<italic>L. oceanica</italic>
relative to
<italic>B. natans</italic>
(Table 
<xref rid="Tab1" ref-type="table">1</xref>
). This mirrors the situation seen in the ‘large’ nucleomorph genome of the cryptophyte
<italic>C. mesostigmatica</italic>.
All things considered, a similar set of structural differences can explain the observed variation in nucleomorph genome size from species to species in both chlorarachniophytes and cryptophytes, despite their independent origins.</p>
</sec>
<sec id="Sec7">
<title>Origins and evolution of chlorarachniophyte nucleomorph introns</title>
<p>A striking difference between chlorarachniophyte and cryptophyte nucleomorph genomes is the number of spliceosomal introns. Gilson
<italic>et al.</italic>
[
<xref ref-type="bibr" rid="CR17">17</xref>
] found 852 pygmy introns in the
<italic>B. natans</italic>
nucleomorph genome, whereas cryptophyte nucleomorph genomes contain only 0–25 spliceosomal introns [
<xref ref-type="bibr" rid="CR17">17</xref>
–
<xref ref-type="bibr" rid="CR21">21</xref>
]. In this study we identified 1,011 
<italic>L. oceanica</italic>
nucleomorph introns using Illumina transcriptome data (RNA-Seq). For the
<italic>B. natans</italic>
nucleomorph, 115 new splice sites were also identified and corrected using Illumina RNA-Seq data. With these adjustments, the total number of spliceosomal introns in the
<italic>B. natans</italic>
nucleomorph genome has increased from 852 (the original estimate [
<xref ref-type="bibr" rid="CR17">17</xref>
]) to 951. As in
<italic>B. natans</italic>
, the
<italic>L. oceanica</italic>
introns have canonical GT-AG intron boundaries, with the exception of three introns (two GC-AG and one GA-AG). Most of these introns are 18–23 nt in size. One intron, within an RNA helicase gene (prp43-2), was found to be 32 nt long, making it the largest nucleomorph intron known (the previous largest was a 27 nt intron found in the rpb6 gene of the
<italic>Gymnochlora stellata</italic>
nucleomorph genome [
<xref ref-type="bibr" rid="CR27">27</xref>
]).</p>
<p>The size distribution of introns in the
<italic>L. oceanica</italic>
and
<italic>B. natans</italic>
genomes is shown in Figure 
<xref rid="Fig5" ref-type="fig">5</xref>
. The pattern of intron abundance versus size in
<italic>L. oceanica</italic>
is similar to that in
<italic>B. natans</italic>
in that 19 nt-long introns are most abundant (496 in
<italic>L. oceanica</italic>
and 654 in
<italic>B. natans</italic>
), although the proportion of 19 nt introns in
<italic>L. oceanica</italic>
(49.1%) is smaller than in
<italic>B. natans</italic>
(68.8%). In
<italic>L. oceanica</italic>
, the size distribution is shifted towards longer introns (introns shorter than 20 nt (18 or 19 nt) account for 56.7% and 81.5% of the total, whereas introns of 20 nt or longer account for 43.3% and 18.5%, in
<italic>L. oceanica</italic>
and
<italic>B. natans</italic>
, respectively) (Figure 
<xref rid="Fig5" ref-type="fig">5</xref>
A). A previous comparison of introns in 54 homologous genes in the
<italic>B. natans</italic>
and
<italic>G. stellata</italic>
nucleomorph genomes [
<xref ref-type="bibr" rid="CR27">27</xref>
] showed that the
<italic>G. stellata</italic>
genome also possesses a higher proportion of 19 nt-introns (78.3%) than
<italic>B. natans</italic>
(75.0%). The intron size distribution was also somewhat shifted towards longer introns in
<italic>G. stellata</italic>
(20.3% and 13.3% of introns were 20–24 nt long in
<italic>G. stellata</italic>
and
<italic>B. natans</italic>
, respectively). The fact that the ~385 kbp
<italic>G. stellata</italic>
nucleomorph genome is estimated to be ~12 kbp larger than that of
<italic>B. natans</italic>
and tends to have longer introns than
<italic>B. natans</italic>
is consistent with our
<italic>L. oceanica</italic>
data in showing a positive correlation between nucleomorph genome size and intron length.
<fig id="Fig5">
<label>Figure 5</label>
<caption>
<p>
<bold>Introns in the <italic>Lotharella oceanica</italic> and <italic>Bigelowiella natans</italic> nucleomorph genomes.</bold>
<bold>A)</bold>
Intron size distribution for
<italic>L. oceanica</italic>
(left) and
<italic>B. natans</italic>
(right). The numbers above each bar show the actual numbers of introns in each size category.
<bold>B)</bold>
Intron comparison of two chlorarachniophyte nucleomorph genomes. The Venn diagram shows the number of shared and/or unique comparable spliceosomal introns in the two genomes. Comparable spliceosomal introns were selected using the criteria proposed by Roy and Penny [
<xref ref-type="bibr" rid="CR30">30</xref>
].</p>
</caption>
<graphic xlink:href="12864_2014_6068_Fig5_HTML" id="d30e1769"></graphic>
</fig>
</p>
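<p>The percentages quoted above for 19 nt introns can be reproduced from the intron counts given in the text. The following is a minimal sketch of that arithmetic; the helper name <monospace>percent</monospace> is introduced here for illustration only and is not part of the original analysis.</p>
<preformat>
```python
def percent(part, total):
    """Return part/total as a percentage rounded to one decimal place."""
    if total == 0:
        raise ValueError("total must be non-zero")
    return round(100 * part / total, 1)


# Counts of 19 nt introns and total intron numbers as reported in the text.
lo_19nt = percent(496, 1011)   # L. oceanica
bn_19nt = percent(654, 951)    # B. natans
print(lo_19nt, bn_19nt)
```
</preformat>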
<p>In terms of phase and base composition, the
<italic>L. oceanica</italic>
and
<italic>B. natans</italic>
nucleomorph introns are similar. Specifically, there is an abundance of phase 0 introns at similar proportions (58.3% in
<italic>L. oceanica</italic>
and 59.4% in
<italic>B. natans</italic>
), a bias towards A residues at position -2, and high A + T content in the intron sequences themselves (Additional file
<xref rid="MOESM4" ref-type="media">4</xref>
). The latter two features are also seen in
<italic>G. stellata</italic>
nucleomorph introns [
<xref ref-type="bibr" rid="CR27">27</xref>
].</p>
<p>625 and 579 introns were found in 146 homologous nucleomorph genes shared between
<italic>L. oceanica</italic>
and
<italic>B. natans</italic>
, respectively. Of these introns, 151 
<italic>L. oceanica</italic>
and 141
<italic>B. natans</italic>
introns were comparable under the criteria used by Roy and Penny [
<xref ref-type="bibr" rid="CR30">30</xref>
] (see Methods). In terms of intron position, 136 introns were shared between these two species (Figure 
<xref rid="Fig5" ref-type="fig">5</xref>
B). In order to gain insight into patterns of intron gain and loss during reductive evolution, 20 introns not shared between
<italic>L. oceanica</italic>
and
<italic>B. natans</italic>
were compared with those of two well-studied species, the plant
<italic>Arabidopsis thaliana</italic>
and the green alga
<italic>Chlamydomonas reinhardtii</italic>
. As shown in Additional file
<xref rid="MOESM5" ref-type="media">5</xref>
, one intron in rpc2 was absent only in
<italic>L. oceanica</italic>
and two introns (in the rpb10 and rpl44 genes) were absent only in
<italic>B. natans</italic>
. In these cases we can safely infer intron loss in
<italic>L. oceanica</italic>
and
<italic>B. natans</italic>
, respectively. Based on the fact that the intron density in the
<italic>B. natans</italic>
nucleomorph genome is similar to that of
<italic>C. reinhardtii</italic>
and
<italic>A. thaliana</italic>
, Slamovits and Keeling [
<xref ref-type="bibr" rid="CR27">27</xref>
] suggested that nucleomorph intron loss is rare. Our comparison of
<italic>B. natans</italic>
and
<italic>L. oceanica</italic>
shows a high similarity in the number of introns in homologous genes, and is consistent with this notion.</p>
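<p>The figure of 20 unshared introns follows directly from the numbers of comparable introns reported above (151 and 141, of which 136 are shared). The sketch below illustrates this two-set arithmetic; the function name <monospace>venn_counts</monospace> is hypothetical and is used here only as an illustration.</p>
<preformat>
```python
def venn_counts(total_a, total_b, shared):
    """Return (unique_a, unique_b, unshared_total) for a two-set Venn diagram."""
    if shared > min(total_a, total_b):
        raise ValueError("shared count cannot exceed either set size")
    unique_a = total_a - shared
    unique_b = total_b - shared
    return unique_a, unique_b, unique_a + unique_b


# Comparable introns: 151 in L. oceanica, 141 in B. natans, 136 shared.
lo_only, bn_only, unshared = venn_counts(151, 141, 136)
print(lo_only, bn_only, unshared)  # prints: 15 5 20
```
</preformat>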
</sec>
<sec id="Sec8">
<title>Phylogenetic analyses</title>
<p>A fundamental question in the evolution of chlorarachniophyte algae is the nature of the green algal endosymbiont that gave rise to the plastid and nucleomorph. Previous phylogenetic trees inferred from concatenated plastid-encoded proteins suggested that the chlorarachniophyte endosymbiont was related to the so-called TUC group of green algae, consisting of trebouxiophytes, ulvophytes, and chlorophyceans, within the core chlorophytes [
<xref ref-type="bibr" rid="CR12">12</xref>
,
<xref ref-type="bibr" rid="CR41">41</xref>
–
<xref ref-type="bibr" rid="CR44">44</xref>
]. However, the precise position of the chlorarachniophyte plastid and nucleomorph within the TUC group was unclear. We attempted to address this issue by assembling and analyzing a large data set including both nucleomorph and plastid proteins.</p>
<p>The taxon and data resources used in these analyses are shown in Additional file
<xref rid="MOESM6" ref-type="media">6</xref>
. Three separate supermatrices were constructed: (i) a 52-protein dataset containing nucleomorph and nuclear proteins (12,854 amino acid (AA) sites), (ii) a 47-protein plastid dataset (10,026 AA sites in total) and (iii) a 99-protein combined dataset (25,688 AA sites in total) (see Methods). Maximum likelihood trees inferred using RAxML [
<xref ref-type="bibr" rid="CR45">45</xref>
] are shown in Additional file
<xref rid="MOESM7" ref-type="media">7</xref>
. As expected,
<italic>L. oceanica</italic>
and
<italic>B. natans</italic>
grouped within the Viridiplantae clade. The plastid protein tree and combined dataset tree place the two chlorarachniophyte species as monophyletic with the TUC clade with 100% bootstrap support. The relationships within the TUC clade were, however, unclear. In a Bayesian analysis using PhyloBayes [
<xref ref-type="bibr" rid="CR46">46</xref>
],
<italic>L. oceanica</italic>
and
<italic>B. natans</italic>
grouped with the TUC group and the Mamiellophyceae with maximum support (1.0 and 100%; Figure 
<xref rid="Fig6" ref-type="fig">6</xref>
). Together, these analyses are consistent with the hypothesis that the endosymbiont that gave rise to the chlorarachniophyte nucleomorph and plastid was more closely related to members of the TUC group than to other green algae. However, the branching pattern within the TUC clade, including the chlorarachniophytes, was not resolved by ML analysis (Figure 
<xref rid="Fig6" ref-type="fig">6</xref>
and Additional file
<xref rid="MOESM7" ref-type="media">7</xref>
c). Therefore, although the analyses presented herein use the most gene-rich datasets thus far assembled to address the question of the origin of the chlorarachniophyte endosymbiont, they still lack sufficient taxon sampling and the resolution needed to elucidate the relationship between chlorarachniophytes and the members of the TUC clade.
<italic>L. oceanica</italic>
and
<italic>B. natans</italic>
are clearly the longest branches in our trees, with the exception of the non-photosynthetic euglenoid,
<italic>Euglena longa</italic>,
in the plastid tree. These long branch lengths suggest that substitution rates in the nucleomorph and plastid genomes accelerated after secondary endosymbiosis. Such long branches are a well-known cause of phylogenetic artifacts.
<fig id="Fig6">
<label>Figure 6</label>
<caption>
<p>
<bold>Phylogenetic tree inferred from a concatenated set of 99 proteins (52 nucleomorph-/nucleus-encoded proteins and 47 plastid proteins) using PhyloBayes with the CAT + GTR + gamma model (4 rate categories).</bold>
Support values below the lines indicate Bayesian posterior probabilities, while the upper numbers are bootstrap support values based on RAxML. Black circles indicate branches supported with 100% bootstrap values and posterior probabilities of 1.0. Nodes where support values are less than 50% are shown with an asterisk (*). Scale bars shown by solid and broken lines indicate inferred number of amino acid substitutions per site. The fraction of amino acid sites present in the data matrix (site coverage) is shown on the left.</p>
</caption>
<graphic xlink:href="12864_2014_6068_Fig6_HTML" id="d30e1933"></graphic>
</fig>
</p>
</sec>
</sec>
<sec id="Sec9" sec-type="conclusions">
<title>Conclusions</title>
<p>We have sequenced the nucleomorph and plastid genomes of the chlorarachniophyte
<italic>L. oceanica</italic>
. The
<italic>L. oceanica</italic>
nucleomorph genome is the largest sequenced so far within the chlorarachniophytes, ~240 kbp larger than that of
<italic>B. natans</italic>
. Large, nearly identical sub-telomeric repeats on each of the six chromosome ends are the main contributors to its increased size, an apparent consequence of frequent homologous recombination at the chromosome ends that serves to expand and contract these regions in different species. Internally duplicated regions have also contributed to the secondary expansion of the
<italic>L. oceanica</italic>
nucleomorph genome. The
<italic>L. oceanica</italic>
nucleomorph genome contains 610 protein genes (348 non-redundant). 94.3% (151/160) of the eukaryotic conserved genes (i.e., ‘housekeeping’ genes) and all the plastid-associated genes were shared between
<italic>L. oceanica</italic>
and
<italic>B. natans</italic>
. Intron sizes and positions were also conserved. The presence of an overlap between the core gene sets in chlorarachniophyte and cryptophyte nucleomorph genomes speaks to convergent patterns of gene retention in these independently evolved nucleomorphs. In contrast to its dynamically evolving nucleomorph genome, the
<italic>L. oceanica</italic>
plastid genome is highly similar to the
<italic>B. natans</italic>
plastid genome in terms of genome size, gene content and gene synteny. Phylogenetic trees inferred from large datasets consisting of nucleomorph-/nucleus- and plastid-encoded proteins support the idea that the endosymbiont that gave rise to the chlorarachniophyte plastid and nucleomorph was a relative of the TUC group of green algae. Further phylogenetic resolution will require sequence data from more diverse members of the chlorarachniophytes, which will also improve our understanding of the tempo and mode of nucleomorph and plastid genome evolution in these enigmatic unicellular algae.</p>
</sec>
<sec id="Sec10" sec-type="methods">
<title>Methods</title>
<sec id="Sec11">
<title>Cell Culturing, DNA extraction, and genome sequencing</title>
<p>
<italic>Lotharella oceanica</italic>
(CCMP622) cells were cultured in K medium under a 12:12 light/dark cycle at room temperature. Two-month-old cells were then collected by centrifugation at 2,300 rpm for 15 min at 4°C. DNA was extracted using a standard SDS-phenol/chloroform method. Total cellular DNA (6 mg) was separated by Hoechst dye (No. 33258, Sigma-Aldrich, St. Louis, MO, USA)-cesium chloride density gradient centrifugation at 35,000 rpm for 65 h at 4°C as described previously [
<xref ref-type="bibr" rid="CR19">19</xref>
–
<xref ref-type="bibr" rid="CR21">21</xref>
]. Discrete bands were purified and gradient centrifugation was repeated twice using the same conditions to increase purity. The genomic identities of the resulting four DNA fractions were determined by semi-quantitative PCR using four primer sets which specifically amplified nuclear (actin), nucleomorph (18S rDNA), plastid (rbcL), and mitochondrial (cox1) genes. A DNA fraction (8.4 μg) found to be enriched with nucleomorph and plastid DNA was sent to the Genome Quebec Innovation Center (Montreal, Quebec) for TruSeq library construction and Illumina HiSeq2000 sequencing.</p>
</sec>
<sec id="Sec12">
<title>RNA extraction and transcriptome sequencing</title>
<p>Two-month-old cell cultures were transferred to new culture flasks with a half volume of new K medium three days before RNA extraction. Cells were harvested by centrifugation at 2,300 rpm for 15 min at 4°C and RNA was extracted using TRIzol (Invitrogen). Extracted total RNA was digested with DNase I at 37°C for 15 min and purified using the RNeasy
<sup>®</sup>
Mini Kit (Qiagen, Toronto, ON, Canada). 5 μg of total RNA was used for sequencing on the Illumina HiSeq2000 platform and reads were assembled into contigs at the National Center for Genome Resources (Santa Fe, NM, USA) within the context of the Marine Microbial Eukaryote Transcriptome Project (
<ext-link ext-link-type="uri" xlink:href="http://marinemicroeukaryotes.org/">http://marinemicroeukaryotes.org/</ext-link>
). The data were deposited in the CAMERA portal [
<xref ref-type="bibr" rid="CR47">47</xref>
] under the project ID MMETSP0040.</p>
</sec>
<sec id="Sec13">
<title>Nucleomorph and plastid genome assembly</title>
<p>424,477,680 paired-end reads (100 bases in length) were trimmed to 95 bases where the mean base quality score was >32 (Illumina 1.8+, Phred+33) using the FASTX-Toolkit (ver. 0.0.13) (
<ext-link ext-link-type="uri" xlink:href="http://hannonlab.cshl.edu/fastx_toolkit/">http://hannonlab.cshl.edu/fastx_toolkit/</ext-link>
). The sequence assembly was done using PASHA (ver. 1.0.3) [
<xref ref-type="bibr" rid="CR48">48</xref>
] with a k-mer size of 31. The first assembly generated ~1,800 scaffolds >5 kbp. The three largest scaffolds (93,892 bp, 89,349 bp, and 46,146 bp) were identified as nucleomorph in origin, with a depth of coverage of ~7,000× and a G + C content of ~20% (the mapping strategy for calculating coverage is described in the next section). No scaffolds of plastid origin were identified in the first assembly. The reads used to construct scaffolds in the first assembly were filtered out, leaving 246 million reads. In our experience, scaffolds with extremely high levels of coverage tend to be ignored during assembly. Therefore, only 1,700,000 reads were used for the second-round assembly of the plastid genome. Four scaffolds were generated, all of which were found to be plastid in origin (12,337 bp, 9,456 bp, 28,109 bp, and 6,832 bp) with a depth of coverage of ~1,600×. Two plastid-derived contigs from the transcriptome data (6,735 bp and 7,150 bp) were also used for subsequent manual assembly.</p>
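<p>The read pre-processing described above (trimming to a fixed length and applying a mean base quality threshold under Phred+33 encoding) can be illustrated with a short sketch. This is not the FASTX-Toolkit implementation; the function names are hypothetical, and the order of operations (trim first, then keep the read only if the mean quality of the trimmed read exceeds the threshold) is an assumption made for illustration.</p>
<preformat>
```python
def mean_phred(quality, offset=33):
    """Mean Phred score of an ASCII-encoded FASTQ quality string."""
    if not quality:
        raise ValueError("empty quality string")
    return sum(ord(ch) - offset for ch in quality) / len(quality)


def trim_and_filter(seq, quality, keep=95, min_mean=32, offset=33):
    """Trim a read to `keep` bases; return None if its mean quality is too low.

    Assumption: the threshold is applied to the trimmed read, and a read
    passes only when its mean quality is strictly greater than `min_mean`.
    """
    seq, quality = seq[:keep], quality[:keep]
    if mean_phred(quality, offset) > min_mean:
        return seq, quality
    return None


# 'I' encodes Q40 and '5' encodes Q20 under Phred+33.
good = trim_and_filter("A" * 100, "I" * 100)
bad = trim_and_filter("A" * 100, "5" * 100)
```
</preformat>
<p>For Phred+64 data, as used for the RNA-Seq reads, the same sketch would be called with <monospace>offset=64</monospace>.</p>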
<p>For the nucleomorph genome, paired-end reads corresponding to the plastid genome and/or low-coverage scaffolds (below 1,000×) in the first assembly were discarded, after which ~30 million paired-end reads remained. Only 2 million of these ~30 million paired-end reads were used for the third assembly in order to avoid problems associated with high coverage depths. In total, 40 scaffolds (177 contigs) were generated with ~200× coverage. All scaffolds with a size of >4.5 kbp (11 scaffolds, 370,802 bp in total) were clearly nucleomorph in origin.</p>
<p>To fill gaps between/within scaffolds and contigs, 51 and 17 primers were designed for the nucleomorph and plastid genomes, respectively. Primers were also designed against five loci of the multicopy hsp70 gene, including a frame-shifted gene and one with an internal stop codon, to verify sequences. PCR was done using Takara Ex Taq and PCR products were cloned into the pGEM-T Easy vector (Promega). The cloned samples were sent for Sanger sequencing at the Genewiz sequencing facility. To verify the structure of chromosomal repeats, long-range PCR was performed using Takara LA Taq with primers that were designed against unique internal regions of the chromosomes and the chromosomal repeats themselves.</p>
</sec>
<sec id="Sec14">
<title>Genome mapping, intron detection, and intron comparison</title>
<p>21,095,918 RNA-Seq reads (100 bases long) were trimmed to 85 bases where the mean base quality score was above 24 (Illumina 1.5+, Phred+64) using the FASTX-Toolkit (ver. 0.0.13). All reads were mapped to the three complete nucleomorph chromosomes as a reference using BWA (Burrows-Wheeler Aligner, ver. 0.6.2) [
<xref ref-type="bibr" rid="CR49">49</xref>
]. For transcriptome analyses, the base coverage at each nucleotide position was calculated using the mpileup function in SAMtools (ver. 0.1.18) [
<xref ref-type="bibr" rid="CR50">50</xref>
]. The depth of coverage for each gene was calculated by summing the coverage at specific gene coordinates and normalizing the output for gene length using in-house Perl scripts, as in Tanifuji et al. [
<xref ref-type="bibr" rid="CR37">37</xref>
]. Intron positions were detected manually from the BWA mapping results and visualized using the Integrative Genomics Viewer (IGV ver. 2.0) [
<xref ref-type="bibr" rid="CR51">51</xref>
]. RNA-Seq transcriptome data for
<italic>B. natans</italic>
(MMETSP0045) were also mapped to the nucleomorph genome of this organism (accession numbers NC_010004.1, NC_010005.1 and NC_010006.1), with the goal of verifying intron predictions based on Sanger EST sequencing by Gilson
<italic>et al.</italic>
[
<xref ref-type="bibr" rid="CR17">17</xref>
]. For comparison of intron positions between
<italic>L. oceanica</italic>
and
<italic>B. natans</italic>
, 145 homologous genes were individually aligned using ClustalW (ver. 2.1) [
<xref ref-type="bibr" rid="CR52">52</xref>
] and intron positions within high-quality alignment regions were compared (a minimum of 15 gap-free amino acid residues on either side of the intron-containing codon with >50% identity were retained for downstream analysis, as in Roy and Penny [
<xref ref-type="bibr" rid="CR30">30</xref>
]). In order to verify intron presence/absence in
<italic>A. thaliana</italic>
and
<italic>C. reinhardtii</italic>
for 20 intron sites not shared between
<italic>B. natans</italic>
and
<italic>L. oceanica</italic>
, nuclear genome sequences and annotations were obtained from the public database of the National Center for Biotechnology Information (NCBI) for
<italic>A. thaliana</italic>
(NC_003070–NC_003071, NC_003074–NC_003076) and the Joint Genome Institute
<italic>Chlamydomonas</italic>
v4 portal for
<italic>C. reinhardtii</italic>
[
<xref ref-type="bibr" rid="CR28">28</xref>
]. Orthologous genes in
<italic>A. thaliana</italic>
and
<italic>C. reinhardtii</italic>
were identified by BlastP using nucleomorph protein sequences from
<italic>B. natans</italic>
and
<italic>L. oceanica</italic>
as queries. The protein sequences from all four species were aligned using ClustalW (ver. 2.1) [
<xref ref-type="bibr" rid="CR52">52</xref>
]; regions with >40% amino acid sequence identity were compared; introns were retained for further analysis if the criteria of Roy and Penny [
<xref ref-type="bibr" rid="CR30">30</xref>
] were met. The graphs of sequence conservation and base frequencies of introns and their flanking regions were generated using WebLogo 3 [
<xref ref-type="bibr" rid="CR53">53</xref>
]. Gene expression analysis for
<italic>B. natans</italic>
was carried out as described previously [
<xref ref-type="bibr" rid="CR37">37</xref>
].</p>
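<p>The alignment-quality filter applied to intron positions (at least 15 gap-free residues on either side of the intron-containing codon, with more than 50% identity) can be sketched as follows. This is an illustrative reading of the criterion rather than the script used in the study: the function names are hypothetical, and it is assumed here that the identity threshold is evaluated separately on each flank of a pairwise alignment.</p>
<preformat>
```python
def flank_passes(seq_a, seq_b, flank=15, min_identity=0.5):
    """Check one flank: full length, gap-free, identity above the threshold."""
    if len(seq_a) != flank or len(seq_b) != flank:
        return False
    if "-" in seq_a or "-" in seq_b:
        return False
    matches = sum(1 for a, b in zip(seq_a, seq_b) if a == b)
    return matches / flank > min_identity


def intron_site_comparable(aln_a, aln_b, col, flank=15, min_identity=0.5):
    """Return True if both flanks of alignment column `col` pass the filter.

    `aln_a` and `aln_b` are two aligned protein sequences of equal length
    (gaps written as '-'); `col` is the column of the intron-containing codon.
    """
    if len(aln_a) != len(aln_b):
        raise ValueError("aligned sequences must have equal length")
    if flank > col:
        return False  # not enough columns upstream of the intron codon
    left = slice(col - flank, col)
    right = slice(col + 1, col + 1 + flank)
    return (flank_passes(aln_a[left], aln_b[left], flank, min_identity)
            and flank_passes(aln_a[right], aln_b[right], flank, min_identity))
```
</preformat>
<p>A site that lies within 15 columns of either end of the alignment is rejected by this sketch, because a complete flank cannot be evaluated there.</p>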
</sec>
<sec id="Sec15">
<title>Gene annotation</title>
<p>After the identification of introns in the
<italic>L. oceanica</italic>
nucleomorph genome, ORFs longer than 50 amino acids were identified using Artemis (ver. 13.0) [
<xref ref-type="bibr" rid="CR54">54</xref>
]; ORFs in the plastid genome were identified similarly. Blastx and blastp searches were done for all ORFs against the
<italic>B. natans</italic>
nucleomorph genome (Accession No. NC_010004.1, NC_010005.1 and NC_010006.1) and plastid genome sequence data (Accession No. NC_008408.1) with an e-value cut-off of 0.01. The non-redundant (nr) database was also searched with a cut-off e-value of 0.001. Each predicted gene was functionally categorized according to Tanifuji
<italic>et al.</italic>
[
<xref ref-type="bibr" rid="CR20">20</xref>
]. Transfer RNA prediction was done by tRNAscan-SE ver.1.2.1 [
<xref ref-type="bibr" rid="CR55">55</xref>
] using a ‘Eukaryotic’ source for the nucleomorph genome and a ‘Mito/Chloroplast’ source for the plastid genome. Ribosomal RNA operons were identified using published data (Accession No. HQ009889).</p>
</sec>
<sec id="Sec16">
<title>Average protein lengths and intergenic spacer sizes</title>
<p>The mean length of proteins shared between
<italic>L. oceanica</italic>
and
<italic>B. natans</italic>
(excluding nORFans) was calculated using 166 proteins. The mean length for all proteins was calculated using 610 and 284 proteins in
<italic>L. oceanica</italic>
and
<italic>B. natans</italic>
, respectively, and the average intergenic spacer sizes were calculated using 633 (
<italic>L. oceanica</italic>
) and 305 (
<italic>B. natans</italic>
) intergenic spacers. Both t-tests and paired t-tests were performed using R statistical software (ver. 2.14.1) [
<xref ref-type="bibr" rid="CR56">56</xref>
].</p>
</sec>
<sec id="Sec17">
<title>Phylogenetic analysis of hsp70</title>
<p>A dataset of 34 hsp70 amino acid sequences (including mitochondrial, plastid, cytoplasmic, and ER isoforms) from
<italic>Cyanidioschyzon merolae</italic>
,
<italic>Thalassiosira pseudonana</italic>
,
<italic>Ostreococcus tauri</italic>
,
<italic>Synechocystis</italic>
sp. strain PCC 6803,
<italic>C. reinhardtii</italic>
, and
<italic>A. thaliana</italic>
was assembled according to Renner and Waters [
<xref ref-type="bibr" rid="CR57">57</xref>
]. Including the sequences of three hsp70 isoforms encoded in the
<italic>L. oceanica</italic>
nucleomorph genome and one from
<italic>B. natans</italic>
, all 38 amino acid sequences were aligned using MAFFT version 7 [
<xref ref-type="bibr" rid="CR58">58</xref>
]. Alignments were then trimmed automatically using trimAl v1.2 [
<xref ref-type="bibr" rid="CR59">59</xref>
] with a gap threshold of 0.8 and a similarity threshold of 0.001. A phylogenetic tree was constructed using RAxML ver. 7.0.3 [
<xref ref-type="bibr" rid="CR45">45</xref>
] with the LG substitution matrix + Gamma + Invar (4 site rate categories). Bootstrap values were calculated with the rapid bootstrap method with 1,000 replicates.</p>
</sec>
<sec id="Sec18">
<title>Phylogenomic dataset construction</title>
<p>77 nucleomorph and 47 plastid proteins shared among chlorarachniophytes and cryptophytes were selected from the
<italic>L. oceanica</italic>
genomic data and used to construct a phylogenomic dataset (multicopy genes and genes/proteins showing low similarity to homologs in other organisms were excluded). First, the protein sequences for each
<italic>L. oceanica</italic>
protein were used to identify homologs using blastp against OrthoMCL v.5 (
<ext-link ext-link-type="uri" xlink:href="http://www.orthomcl.org">http://www.orthomcl.org</ext-link>
) with an e-value cutoff of 1e-5. The OrthoMCL ID of the top hit for each input protein was then collected and compared against all other input protein OrthoMCL IDs. If an OrthoMCL ID was present more than once, these genes were removed from consideration, as it may be the result of paralogy. The remaining OrthoMCL IDs were then used to create a reference ortholog dataset by collecting the corresponding orthologs (based on OrthoMCL ID) from
<italic>Chlamydomonas reinhardtii</italic>
in the OrthoMCL database. In cases where a
<italic>C. reinhardtii</italic>
ortholog was not present in the OrthoMCL database, the corresponding ortholog from
<italic>Arabidopsis thaliana, Cyanidioschyzon merolae, Ostreococcus lucimarinus,</italic>
or
<italic>Volvox carteri</italic>
f.
<italic>nagariensis</italic>
was collected (see Additional file
<xref rid="MOESM6" ref-type="media">6</xref>
).</p>
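<p>The paralogy screen described above, in which any OrthoMCL ID recovered as the top hit for more than one input protein is discarded, can be sketched as a simple counting step. The function name, protein names and group IDs below are hypothetical and serve only to illustrate the logic; the in-house pipeline itself is not reproduced here.</p>
<preformat>
```python
from collections import Counter


def drop_repeated_groups(top_hit_ids):
    """Keep only proteins whose top-hit OrthoMCL ID occurs exactly once.

    `top_hit_ids` maps each input protein name to the OrthoMCL ID of its
    best blastp hit.
    """
    counts = Counter(top_hit_ids.values())
    return {protein: group
            for protein, group in top_hit_ids.items()
            if counts[group] == 1}


# Hypothetical protein names and OrthoMCL IDs, for illustration only.
example = {
    "rpl3": "OG5_000001",
    "rps4": "OG5_000002",
    "hsp70a": "OG5_000003",
    "hsp70b": "OG5_000003",  # same ID as hsp70a, so both are removed
}
kept = drop_repeated_groups(example)
```
</preformat>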
<p>Various organismal genomic/transcriptomic data and plastid data collected from publicly available sources were used as input for an in-house pipeline, described below, for the creation of single protein datasets and, subsequently, the phylogenomic data matrix. The organismal data were individually screened for orthologs using either blastp or tblastn, depending on the data type, with the above reference ortholog sequences used as queries in
<sc>Blastmonkey</sc>
from the Barrel-o-Monkeys toolkit (
<ext-link ext-link-type="uri" xlink:href="http://rogerlab.biochem.dal.ca">http://rogerlab.biochem.dal.ca</ext-link>
). If the sequence dataset was nucleotides, then the tblastn hits were translated to amino acid residues. Blastp was then used to screen these putative orthologs against the OrthoMCL database, and the output for each gene from each organism was compared against a dictionary of orthologous OrthoMCL IDs. Those putative orthologs that did not match orthologous IDs were designated as paralogs and removed. The remaining orthologs from each organism were combined and aligned using MAFFT-LINSI [
<xref ref-type="bibr" rid="CR58">58</xref>
]. Ambiguously aligned positions were identified and removed using Block Mapping and Gathering with Entropy (BMGE) [
<xref ref-type="bibr" rid="CR60">60</xref>
]. For each individual protein alignment, maximum-likelihood (ML) trees were inferred in
<sc>RAxML</sc>
v7.2.6 [
<xref ref-type="bibr" rid="CR46">46</xref>
] using the LG + gamma distribution with four rate categories, with 10 ML tree searches and 100 ML bootstrap (MLBS) replicates. To test for undetected paralogy or contaminants, we constructed a consensus tree (ConTree) representing phylogenetic groupings of well-established eukaryotic clades [
<xref ref-type="bibr" rid="CR61">61</xref>–<xref ref-type="bibr" rid="CR65">65</xref>
]. Individual nucleomorph/nuclear protein trees that placed taxa in conflicting positions relative to the ConTree with more than 70% MLBS, or that contained zero-length or extremely long branches, were checked manually. These sequences were further scrutinized for hidden paralogy and contamination issues through reciprocal blastp against
<italic>Oryza sativa</italic>
orthologs in NCBI. All problematic sequences identified using these methods were removed from the dataset. The resulting protein alignments were then re-trimmed for ambiguously aligned positions using BMGE and concatenated into three separate supermatrices: a nuclear/nucleomorph dataset (52 proteins: 12,854 amino acid (AA) sites), a plastid dataset (47 proteins: 10,026 AA sites), and a combined nuclear-nucleomorph/plastid dataset (99 proteins: 25,688 AA sites), using an in-house script employing alvert.py from the Barrel-o-Monkeys toolkit. For the combined nuclear/plastid dataset, taxon sampling was reduced to focus specifically on plants/algae and chlorarachniophyte taxa whose plastid genomes were available along with a sufficient amount of nuclear/nucleomorph data. In this dataset two chimeras were constructed using nuclear and plastid data from different but closely related taxa, namely
<italic>Bryopsis plumosa</italic>
(nuclear) ×
<italic>B. hypnoides</italic>
(plastid) and
<italic>Nitella hyalina</italic>
(nuclear) ×
<italic>Chara vulgaris</italic>
(plastid). Data sources, details on gene sampling, and information on missing data are in Additional file
<xref rid="MOESM6" ref-type="media">6</xref>
.</p>
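<p>The concatenation step can be illustrated with a short sketch. It is a stand-in written for this illustration and does not reproduce alvert.py or the in-house script; the function name, the taxon labels and the use of "?" for missing data are assumptions.</p>
<preformat>
```python
def concatenate_alignments(alignments, missing="?"):
    """Join per-protein alignments into one supermatrix.

    alignments is a list of dicts mapping taxon name to an aligned
    sequence; within one dict all sequences must share the same length.
    Taxa absent from a protein are padded with the missing character.
    """
    taxa = sorted({taxon for aln in alignments for taxon in aln})
    supermatrix = {taxon: [] for taxon in taxa}
    for aln in alignments:
        lengths = {len(seq) for seq in aln.values()}
        if len(lengths) != 1:
            raise ValueError("sequences in an alignment differ in length")
        width = lengths.pop()
        for taxon in taxa:
            supermatrix[taxon].append(aln.get(taxon, missing * width))
    return {taxon: "".join(parts) for taxon, parts in supermatrix.items()}


# Toy alignments with hypothetical taxon labels.
protein_1 = {"taxonA": "MKV-", "taxonB": "MRVL"}
protein_2 = {"taxonA": "GGA", "taxonC": "GSA"}
matrix = concatenate_alignments([protein_1, protein_2])
```
</preformat>
<p>Padding absent taxa keeps every row the same length, so a chimeric taxon could be represented under this scheme by supplying nuclear and plastid sequences from two related species under a single label.</p>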
</sec>
<sec id="Sec19">
<title>Phylogenomic analyses</title>
<p>ML trees for each dataset were estimated from 60 independent searches using
<sc>RAxML</sc>
under the LG + gamma model (four rate categories) with empirical amino acid frequencies, as selected by the Akaike information criterion. Topological support was assessed using 1,000
<sc>RAxML</sc>
bootstrap replicates (Additional file
<xref rid="MOESM7" ref-type="media">7</xref>
). For the combined nucleomorph-nucleus/plastid dataset, Bayesian inferences (BI) were made in PhyloBayes-MPI [
<xref ref-type="bibr" rid="CR46">46</xref>
] under the CAT-GTR + gamma (four rate categories) model, and two independent Markov chain Monte Carlo chains were run for 8,000 generations, sampling every two generations. For PhyloBayes analyses, constant sites were removed to decrease computational time. The chains converged at 400 generations: the largest discrepancy in posterior probabilities (maxdiff) was below 0.012, and the effective sizes of the continuous model parameters were within the range of acceptable values (>50). Posterior probabilities of post-burn-in bipartitions (2 chains, 3,600 trees, sampling every 10 trees) were mapped onto the consensus BI topology (Figure 
<xref rid="Fig6" ref-type="fig">6</xref>
). The MLBS values were mapped onto this tree.</p>
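<p>The maxdiff statistic used above as a convergence diagnostic is the largest difference in the posterior frequency of any bipartition between chains. The sketch below computes it for two chains from already tabulated frequencies. It is not the PhyloBayes implementation, and the function name and toy values are hypothetical.</p>
<preformat>
```python
def max_bipartition_difference(chain_a, chain_b):
    """Largest absolute difference in bipartition frequency between chains.

    Each argument maps a bipartition (any hashable label) to its
    posterior frequency in one chain; a bipartition missing from a
    chain is treated as having frequency 0.
    """
    splits = set(chain_a) | set(chain_b)
    return max(
        (abs(chain_a.get(s, 0.0) - chain_b.get(s, 0.0)) for s in splits),
        default=0.0,
    )


# Hypothetical frequencies; bipartitions are labelled by the taxa on one side.
chain_1 = {frozenset({"A", "B"}): 1.00, frozenset({"A", "B", "C"}): 0.75}
chain_2 = {
    frozenset({"A", "B"}): 1.00,
    frozenset({"A", "B", "C"}): 0.50,
    frozenset({"C", "D"}): 0.125,
}
maxdiff = max_bipartition_difference(chain_1, chain_2)
```
</preformat>
<p>Here the chains disagree most on the second bipartition, giving a maxdiff of 0.25; values close to zero indicate that the chains sample bipartitions at nearly the same frequencies.</p>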
</sec>
<sec id="Sec20">
<title>Availability of supporting data</title>
<p>Nucleomorph and plastid genome sequences and annotations were deposited in GenBank under the following accession numbers: nucleomorph chromosomes 1–3 (CP006627-CP006629), plastid genome (KF438023). The RNA-Seq data are available in the CAMERA portal (
<ext-link ext-link-type="uri" xlink:href="http://camera.calit2.net/">http://camera.calit2.net/</ext-link>
) under the project ID MMETSP0040. Supporting protein alignment datasets used in this work are available from the LabArchives (
<ext-link ext-link-type="uri" xlink:href="http://www.labarchives.com/">http://www.labarchives.com/</ext-link>
) repository under doi:10.6070/H4BV7DJZ.</p>
</sec>
<sec id="Sec21">
<title>Ethics</title>
<p>Ethics approval was not required for the research described in this manuscript.</p>
</sec>
</sec>
<sec sec-type="supplementary-material">
<title>Electronic supplementary material</title>
<sec id="Sec22">
<supplementary-material content-type="local-data" id="MOESM1">
<media xlink:href="12864_2014_6068_MOESM1_ESM.pdf">
<caption>
<p>Additional file 1:
<bold>Nucleomorph Gene Content List of</bold>
<bold>
<italic>L. oceanica</italic>
</bold>
<bold>and</bold>
<bold>
<italic>B. natans.</italic>
</bold>
(PDF 98 KB)</p>
</caption>
</media>
</supplementary-material>
<supplementary-material content-type="local-data" id="MOESM2">
<media xlink:href="12864_2014_6068_MOESM2_ESM.pdf">
<caption>
<p>Additional file 2:
<bold>RAxML tree inferred from hsp70 proteins.</bold>
(PDF 325 KB)</p>
</caption>
</media>
</supplementary-material>
<supplementary-material content-type="local-data" id="MOESM3">
<media xlink:href="12864_2014_6068_MOESM3_ESM.pdf">
<caption>
<p>Additional file 3:
<bold>Venn Diagram of gene content between</bold>
<bold>
<italic>L. oceanica</italic>
</bold>
<bold>and</bold>
<bold>
<italic>B. natans.</italic>
</bold>
(PDF 308 KB)</p>
</caption>
</media>
</supplementary-material>
<supplementary-material content-type="local-data" id="MOESM4">
<media xlink:href="12864_2014_6068_MOESM4_ESM.pdf">
<caption>
<p>Additional file 4:
<bold>Base composition of introns and their flanking regions.</bold>
(PDF 512 KB)</p>
</caption>
</media>
</supplementary-material>
<supplementary-material content-type="local-data" id="MOESM5">
<media xlink:href="12864_2014_6068_MOESM5_ESM.pdf">
<caption>
<p>Additional file 5:
<bold>Intron comparison between chlorarachniophytes and green algae.</bold>
(PDF 271 KB)</p>
</caption>
</media>
</supplementary-material>
<supplementary-material content-type="local-data" id="MOESM6">
<media xlink:href="12864_2014_6068_MOESM6_ESM.pdf">
<caption>
<p>Additional file 6:
<bold>Taxon list and resource for phylogenetic analysis.</bold>
(PDF 41 KB)</p>
</caption>
</media>
</supplementary-material>
<supplementary-material content-type="local-data" id="MOESM7">
<media xlink:href="12864_2014_6068_MOESM7_ESM.pdf">
<caption>
<p>Additional file 7:
<bold>RAxML trees inferred from different protein sets.</bold>
(PDF 563 KB)</p>
</caption>
</media>
</supplementary-material>
</sec>
</sec>
</body>
<back>
<fn-group>
<fn>
<p>
<bold>Competing interests</bold>
</p>
<p>The authors declare that they have no competing interests.</p>
</fn>
<fn>
<p>
<bold>Authors’ contributions</bold>
</p>
<p>GT and NTO analyzed and annotated the genomes. GT, NTO, MWB, BAC and JMA drafted the manuscript. NTO carried out experiments. MWB and AJR performed phylogenetic analyses. GKSW and MM provided EST data for green algae. All authors read and approved the final manuscript.</p>
</fn>
</fn-group>
<ack>
<title>Acknowledgements</title>
<p>This research was funded by an operating grant from the Canadian Institutes of Health Research awarded to JMA and the Gordon and Betty Moore Foundation through Grant #2637 to the National Center for Genome Resources (NCGR). The 1000 Plants (1KP) initiative, led by GKSW, is funded by the Alberta Ministry of Innovation and Advanced Education, Alberta Innovates Technology Futures (AITF) Innovates Centres of Research Excellence (iCORE), Musea Ventures, and BGI-Shenzhen. GT was supported by a Tula Foundation postdoctoral fellowship from the Centre for Comparative Genomics and Evolutionary Bioinformatics at Dalhousie University. JMA and AJR are Senior Fellows of the Canadian Institute for Advanced Research, Program in Integrated Microbial Biodiversity; JMA was a holder of a CIHR New Investigator Award. Computations were performed on the TCS and GPC supercomputers at the SciNet HPC Consortium. SciNet is funded by: the Canada Foundation for Innovation under the auspices of Compute Canada; the Government of Ontario; Ontario Research Fund - Research Excellence; and the University of Toronto.</p>
</ack>
<ref-list id="Bib1">
<title>References</title>
<ref id="CR1">
<label>1.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Dolezal</surname>
<given-names>P</given-names>
</name>
<name>
<surname>Likic</surname>
<given-names>V</given-names>
</name>
<name>
<surname>Tachezy</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Lithgow</surname>
<given-names>T</given-names>
</name>
</person-group>
<article-title>Evolution of the molecular machines for protein import into mitochondria</article-title>
<source>Science</source>
<year>2006</year>
<volume>313</volume>
<fpage>314</fpage>
<lpage>318</lpage>
<pub-id pub-id-type="doi">10.1126/science.1127895</pub-id>
<pub-id pub-id-type="pmid">16857931</pub-id>
</element-citation>
</ref>
<ref id="CR2">
<label>2.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Gray</surname>
<given-names>MW</given-names>
</name>
<name>
<surname>Burger</surname>
<given-names>G</given-names>
</name>
<name>
<surname>Lang</surname>
<given-names>BF</given-names>
</name>
</person-group>
<article-title>Mitochondrial evolution</article-title>
<source>Science</source>
<year>1999</year>
<volume>283</volume>
<fpage>1476</fpage>
<lpage>1481</lpage>
<pub-id pub-id-type="doi">10.1126/science.283.5407.1476</pub-id>
<pub-id pub-id-type="pmid">10066161</pub-id>
</element-citation>
</ref>
<ref id="CR3">
<label>3.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Gould</surname>
<given-names>SB</given-names>
</name>
<name>
<surname>Waller</surname>
<given-names>RF</given-names>
</name>
<name>
<surname>McFadden</surname>
<given-names>GI</given-names>
</name>
</person-group>
<article-title>Plastid evolution</article-title>
<source>Annu Rev Plant Biol</source>
<year>2008</year>
<volume>59</volume>
<fpage>491</fpage>
<lpage>517</lpage>
<pub-id pub-id-type="doi">10.1146/annurev.arplant.59.032607.092915</pub-id>
<pub-id pub-id-type="pmid">18315522</pub-id>
</element-citation>
</ref>
<ref id="CR4">
<label>4.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Reyes-Prieto</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Weber</surname>
<given-names>APM</given-names>
</name>
<name>
<surname>Bhattacharya</surname>
<given-names>D</given-names>
</name>
</person-group>
<article-title>The origin and establishment of the plastid in algae and plants</article-title>
<source>Annu Rev Genet</source>
<year>2007</year>
<volume>41</volume>
<fpage>147</fpage>
<lpage>168</lpage>
<pub-id pub-id-type="doi">10.1146/annurev.genet.41.110306.130134</pub-id>
<pub-id pub-id-type="pmid">17600460</pub-id>
</element-citation>
</ref>
<ref id="CR5">
<label>5.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Martin</surname>
<given-names>W</given-names>
</name>
<name>
<surname>Herrmann</surname>
<given-names>RG</given-names>
</name>
</person-group>
<article-title>Gene transfer from organelles to the nucleus: How much, what happens, and why?</article-title>
<source>Plant Physiol</source>
<year>1998</year>
<volume>118</volume>
<fpage>9</fpage>
<lpage>17</lpage>
<pub-id pub-id-type="doi">10.1104/pp.118.1.9</pub-id>
<pub-id pub-id-type="pmid">9733521</pub-id>
</element-citation>
</ref>
<ref id="CR6">
<label>6.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Martin</surname>
<given-names>W</given-names>
</name>
<name>
<surname>Rujan</surname>
<given-names>T</given-names>
</name>
<name>
<surname>Richly</surname>
<given-names>E</given-names>
</name>
<name>
<surname>Hansen</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Cornelsen</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Lins</surname>
<given-names>T</given-names>
</name>
<name>
<surname>Leister</surname>
<given-names>D</given-names>
</name>
<name>
<surname>Stoebe</surname>
<given-names>B</given-names>
</name>
<name>
<surname>Hasegawa</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Penny</surname>
<given-names>D</given-names>
</name>
</person-group>
<article-title>Evolutionary analysis of
<italic>Arabidopsis</italic>
, cyanobacterial, and chloroplast genomes reveals plastid phylogeny and thousands of cyanobacterial genes in the nucleus</article-title>
<source>Proc Natl Acad Sci USA</source>
<year>2002</year>
<volume>99</volume>
<fpage>12246</fpage>
<lpage>12251</lpage>
<pub-id pub-id-type="doi">10.1073/pnas.182432999</pub-id>
<pub-id pub-id-type="pmid">12218172</pub-id>
</element-citation>
</ref>
<ref id="CR7">
<label>7.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Timmis</surname>
<given-names>JN</given-names>
</name>
<name>
<surname>Ayliffe</surname>
<given-names>MA</given-names>
</name>
<name>
<surname>Huang</surname>
<given-names>CY</given-names>
</name>
<name>
<surname>Martin</surname>
<given-names>W</given-names>
</name>
</person-group>
<article-title>Endosymbiotic gene transfer: Organelle genomes forge eukaryotic chromosomes</article-title>
<source>Nat Rev Genet</source>
<year>2004</year>
<volume>5</volume>
<fpage>123</fpage>
<lpage>135</lpage>
<pub-id pub-id-type="doi">10.1038/nrg1271</pub-id>
<pub-id pub-id-type="pmid">14735123</pub-id>
</element-citation>
</ref>
<ref id="CR8">
<label>8.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Zybailov</surname>
<given-names>B</given-names>
</name>
<name>
<surname>Rutschow</surname>
<given-names>H</given-names>
</name>
<name>
<surname>Friso</surname>
<given-names>G</given-names>
</name>
<name>
<surname>Rudella</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Emanuelsson</surname>
<given-names>O</given-names>
</name>
<name>
<surname>Sun</surname>
<given-names>Q</given-names>
</name>
<name>
<surname>van Wijk</surname>
<given-names>KJ</given-names>
</name>
</person-group>
<article-title>Sorting signals, N-terminal modifications and abundance of the chloroplast proteome</article-title>
<source>PLoS One</source>
<year>2008</year>
<volume>3</volume>
<fpage>e1994</fpage>
<pub-id pub-id-type="doi">10.1371/journal.pone.0001994</pub-id>
</element-citation>
</ref>
<ref id="CR9">
<label>9.</label>
<element-citation publication-type="book">
<person-group person-group-type="author">
<name>
<surname>Tanifuji</surname>
<given-names>G</given-names>
</name>
<name>
<surname>Archibald</surname>
<given-names>JM</given-names>
</name>
</person-group>
<person-group person-group-type="editor">
<name>
<surname>Löffelhardt</surname>
<given-names>Wolfgang</given-names>
</name>
</person-group>
<article-title>Nucleomorph comparative genomics</article-title>
<source>Endosymbiosis</source>
<year>2014</year>
<publisher-loc>Vienna</publisher-loc>
<publisher-name>Springer</publisher-name>
<fpage>197</fpage>
<lpage>214</lpage>
</element-citation>
</ref>
<ref id="CR10">
<label>10.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Douglas</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Murphy</surname>
<given-names>CA</given-names>
</name>
<name>
<surname>Spencer</surname>
<given-names>DF</given-names>
</name>
<name>
<surname>Gray</surname>
<given-names>MW</given-names>
</name>
</person-group>
<article-title>Cryptomonad algae are evolutionary chimaeras of two phylogenetically distinct unicellular eukaryotes</article-title>
<source>Nature</source>
<year>1991</year>
<volume>350</volume>
<fpage>148</fpage>
<lpage>151</lpage>
<pub-id pub-id-type="doi">10.1038/350148a0</pub-id>
<pub-id pub-id-type="pmid">2005963</pub-id>
</element-citation>
</ref>
<ref id="CR11">
<label>11.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>van de Peer</surname>
<given-names>Y</given-names>
</name>
<name>
<surname>Rensing</surname>
<given-names>SA</given-names>
</name>
<name>
<surname>Maier</surname>
<given-names>UG</given-names>
</name>
<name>
<surname>De Wachter</surname>
<given-names>R</given-names>
</name>
</person-group>
<article-title>Substitution rate calibration of small subunit ribosomal RNA identifies chlorarachniophyte endosymbionts as remnants of green algae</article-title>
<source>Proc Natl Acad Sci USA</source>
<year>1996</year>
<volume>93</volume>
<fpage>7732</fpage>
<lpage>7736</lpage>
<pub-id pub-id-type="doi">10.1073/pnas.93.15.7732</pub-id>
<pub-id pub-id-type="pmid">8755544</pub-id>
</element-citation>
</ref>
<ref id="CR12">
<label>12.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Rogers</surname>
<given-names>MB</given-names>
</name>
<name>
<surname>Gilson</surname>
<given-names>PR</given-names>
</name>
<name>
<surname>Su</surname>
<given-names>V</given-names>
</name>
<name>
<surname>McFadden</surname>
<given-names>GI</given-names>
</name>
<name>
<surname>Keeling</surname>
<given-names>PJ</given-names>
</name>
</person-group>
<article-title>The complete chloroplast genome of the chlorarachniophyte
<italic>Bigelowiella natans</italic>
: Evidence for independent origins of chlorarachniophyte and euglenid secondary endosymbionts</article-title>
<source>Mol Biol Evol</source>
<year>2007</year>
<volume>24</volume>
<fpage>54</fpage>
<lpage>62</lpage>
<pub-id pub-id-type="doi">10.1093/molbev/msl129</pub-id>
<pub-id pub-id-type="pmid">16990439</pub-id>
</element-citation>
</ref>
<ref id="CR13">
<label>13.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Silver</surname>
<given-names>TD</given-names>
</name>
<name>
<surname>Koike</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Yabuki</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Kofuji</surname>
<given-names>R</given-names>
</name>
<name>
<surname>Archibald</surname>
<given-names>JM</given-names>
</name>
<name>
<surname>Ishida</surname>
<given-names>KI</given-names>
</name>
</person-group>
<article-title>Phylogeny and nucleomorph karyotype diversity of chlorarachniophyte algae</article-title>
<source>J Eukaryot Microbiol</source>
<year>2007</year>
<volume>54</volume>
<fpage>403</fpage>
<lpage>410</lpage>
<pub-id pub-id-type="doi">10.1111/j.1550-7408.2007.00279.x</pub-id>
<pub-id pub-id-type="pmid">17910684</pub-id>
</element-citation>
</ref>
<ref id="CR14">
<label>14.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Tanifuji</surname>
<given-names>G</given-names>
</name>
<name>
<surname>Onodera</surname>
<given-names>NT</given-names>
</name>
<name>
<surname>Hara</surname>
<given-names>Y</given-names>
</name>
</person-group>
<article-title>Nucleomorph genome diversity and its phylogenetic implications in cryptomonad algae</article-title>
<source>Phycol Res</source>
<year>2010</year>
<volume>58</volume>
<fpage>230</fpage>
<lpage>237</lpage>
<pub-id pub-id-type="doi">10.1111/j.1440-1835.2010.00580.x</pub-id>
</element-citation>
</ref>
<ref id="CR15">
<label>15.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Ishida</surname>
<given-names>K</given-names>
</name>
<name>
<surname>Endo</surname>
<given-names>H</given-names>
</name>
<name>
<surname>Koike</surname>
<given-names>S</given-names>
</name>
</person-group>
<article-title>
<italic>Partenskyella glossopodia</italic>
(Chlorarachniophyceae) possesses a nucleomorph genome of approximately 1 Mbp</article-title>
<source>Phycol Res</source>
<year>2011</year>
<volume>59</volume>
<fpage>120</fpage>
<lpage>122</lpage>
<pub-id pub-id-type="doi">10.1111/j.1440-1835.2011.00608.x</pub-id>
</element-citation>
</ref>
<ref id="CR16">
<label>16.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Silver</surname>
<given-names>TD</given-names>
</name>
<name>
<surname>Moore</surname>
<given-names>CE</given-names>
</name>
<name>
<surname>Archibald</surname>
<given-names>JM</given-names>
</name>
</person-group>
<article-title>Nucleomorph ribosomal DNA and telomere dynamics in chlorarachniophyte algae</article-title>
<source>J Eukaryot Microbiol</source>
<year>2010</year>
<volume>57</volume>
<fpage>453</fpage>
<lpage>459</lpage>
<pub-id pub-id-type="doi">10.1111/j.1550-7408.2010.00511.x</pub-id>
<pub-id pub-id-type="pmid">21040099</pub-id>
</element-citation>
</ref>
<ref id="CR17">
<label>17.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Gilson</surname>
<given-names>PR</given-names>
</name>
<name>
<surname>Su</surname>
<given-names>V</given-names>
</name>
<name>
<surname>Slamovits</surname>
<given-names>CH</given-names>
</name>
<name>
<surname>Reith</surname>
<given-names>ME</given-names>
</name>
<name>
<surname>Keeling</surname>
<given-names>PJ</given-names>
</name>
<name>
<surname>McFadden</surname>
<given-names>GI</given-names>
</name>
</person-group>
<article-title>Complete nucleotide sequence of the chlorarachniophyte nucleomorph: Nature's smallest nucleus</article-title>
<source>Proc Natl Acad Sci USA</source>
<year>2006</year>
<volume>103</volume>
<fpage>9566</fpage>
<lpage>9571</lpage>
<pub-id pub-id-type="doi">10.1073/pnas.0600707103</pub-id>
</element-citation>
</ref>
<ref id="CR18">
<label>18.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Douglas</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Zauner</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Fraunholz</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Beaton</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Penny</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Deng</surname>
<given-names>LT</given-names>
</name>
<name>
<surname>Wu</surname>
<given-names>XN</given-names>
</name>
<name>
<surname>Reith</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Cavalier-Smith</surname>
<given-names>T</given-names>
</name>
<name>
<surname>Maier</surname>
<given-names>UG</given-names>
</name>
</person-group>
<article-title>The highly reduced genome of an enslaved algal nucleus</article-title>
<source>Nature</source>
<year>2001</year>
<volume>410</volume>
<fpage>1091</fpage>
<lpage>1096</lpage>
<pub-id pub-id-type="doi">10.1038/35074092</pub-id>
<pub-id pub-id-type="pmid">11323671</pub-id>
</element-citation>
</ref>
<ref id="CR19">
<label>19.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Lane</surname>
<given-names>CE</given-names>
</name>
<name>
<surname>van den Heuvel</surname>
<given-names>K</given-names>
</name>
<name>
<surname>Kozera</surname>
<given-names>C</given-names>
</name>
<name>
<surname>Curtis</surname>
<given-names>BA</given-names>
</name>
<name>
<surname>Parsons</surname>
<given-names>BJ</given-names>
</name>
<name>
<surname>Bowman</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Archibald</surname>
<given-names>JM</given-names>
</name>
</person-group>
<article-title>Nucleomorph genome of
<italic>Hemiselmis andersenii</italic>
reveals complete intron loss and compaction as a driver of protein structure and function</article-title>
<source>Proc Natl Acad Sci USA</source>
<year>2007</year>
<volume>104</volume>
<fpage>19908</fpage>
<lpage>19913</lpage>
<pub-id pub-id-type="doi">10.1073/pnas.0707419104</pub-id>
<pub-id pub-id-type="pmid">18077423</pub-id>
</element-citation>
</ref>
<ref id="CR20">
<label>20.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Tanifuji</surname>
<given-names>G</given-names>
</name>
<name>
<surname>Onodera</surname>
<given-names>NT</given-names>
</name>
<name>
<surname>Wheeler</surname>
<given-names>TJ</given-names>
</name>
<name>
<surname>Dlutek</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Donaher</surname>
<given-names>N</given-names>
</name>
<name>
<surname>Archibald</surname>
<given-names>JM</given-names>
</name>
</person-group>
<article-title>Complete nucleomorph genome sequence of the nonphotosynthetic alga
<italic>Cryptomonas paramecium</italic>
reveals a core nucleomorph gene set</article-title>
<source>Genome Biol Evol</source>
<year>2011</year>
<volume>3</volume>
<fpage>44</fpage>
<lpage>54</lpage>
<pub-id pub-id-type="doi">10.1093/gbe/evq082</pub-id>
<pub-id pub-id-type="pmid">21147880</pub-id>
</element-citation>
</ref>
<ref id="CR21">
<label>21.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Moore</surname>
<given-names>CE</given-names>
</name>
<name>
<surname>Curtis</surname>
<given-names>BA</given-names>
</name>
<name>
<surname>Mills</surname>
<given-names>T</given-names>
</name>
<name>
<surname>Tanifuji</surname>
<given-names>G</given-names>
</name>
<name>
<surname>Archibald</surname>
<given-names>JM</given-names>
</name>
</person-group>
<article-title>Nucleomorph genome sequence of the cryptophyte alga
<italic>Chroomonas mesostigmatica</italic>
CCMP1168 reveals lineage-specific gene loss and genome complexity</article-title>
<source>Genome Biol Evol</source>
<year>2012</year>
<volume>4</volume>
<fpage>1162</fpage>
<lpage>1175</lpage>
<pub-id pub-id-type="doi">10.1093/gbe/evs090</pub-id>
<pub-id pub-id-type="pmid">23042551</pub-id>
</element-citation>
</ref>
<ref id="CR22">
<label>22.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Katinka</surname>
<given-names>MD</given-names>
</name>
<name>
<surname>Duprat</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Cornillot</surname>
<given-names>E</given-names>
</name>
<name>
<surname>Metenier</surname>
<given-names>G</given-names>
</name>
<name>
<surname>Thomarat</surname>
<given-names>F</given-names>
</name>
<name>
<surname>Prensier</surname>
<given-names>G</given-names>
</name>
<name>
<surname>Barbe</surname>
<given-names>V</given-names>
</name>
<name>
<surname>Peyretaillade</surname>
<given-names>E</given-names>
</name>
<name>
<surname>Brottier</surname>
<given-names>P</given-names>
</name>
<name>
<surname>Wincker</surname>
<given-names>P</given-names>
</name>
<name>
<surname>Delbac</surname>
<given-names>F</given-names>
</name>
<name>
<surname>Alaoui</surname>
<given-names>HEI</given-names>
</name>
<name>
<surname>Peyret</surname>
<given-names>P</given-names>
</name>
<name>
<surname>Saurin</surname>
<given-names>W</given-names>
</name>
<name>
<surname>Gouy</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Weissenbach</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Vivares</surname>
<given-names>CP</given-names>
</name>
</person-group>
<article-title>Genome sequence and gene compaction of the eukaryote parasite
<italic>Encephalitozoon cuniculi</italic>
</article-title>
<source>Nature</source>
<year>2001</year>
<volume>414</volume>
<fpage>450</fpage>
<lpage>453</lpage>
<pub-id pub-id-type="doi">10.1038/35106579</pub-id>
<pub-id pub-id-type="pmid">11719806</pub-id>
</element-citation>
</ref>
<ref id="CR23">
<label>23.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Sasaki</surname>
<given-names>Y</given-names>
</name>
<name>
<surname>Ishikawa</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Yamashita</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Oshima</surname>
<given-names>K</given-names>
</name>
<name>
<surname>Kenri</surname>
<given-names>T</given-names>
</name>
<name>
<surname>Furuya</surname>
<given-names>K</given-names>
</name>
<name>
<surname>Yoshino</surname>
<given-names>C</given-names>
</name>
<name>
<surname>Horino</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Shiba</surname>
<given-names>T</given-names>
</name>
<name>
<surname>Sasaki</surname>
<given-names>T</given-names>
</name>
<name>
<surname>Hattori</surname>
<given-names>M</given-names>
</name>
</person-group>
<article-title>The complete genomic sequence of
<italic>Mycoplasma penetrans</italic>
, an intracellular bacterial pathogen in humans</article-title>
<source>Nucleic Acids Res</source>
<year>2002</year>
<volume>30</volume>
<fpage>5293</fpage>
<lpage>5300</lpage>
<pub-id pub-id-type="doi">10.1093/nar/gkf667</pub-id>
<pub-id pub-id-type="pmid">12466555</pub-id>
</element-citation>
</ref>
<ref id="CR24">
<label>24.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Corradi</surname>
<given-names>N</given-names>
</name>
<name>
<surname>Pombert</surname>
<given-names>JF</given-names>
</name>
<name>
<surname>Farinelli</surname>
<given-names>L</given-names>
</name>
<name>
<surname>Didier</surname>
<given-names>ES</given-names>
</name>
<name>
<surname>Keeling</surname>
<given-names>PJ</given-names>
</name>
</person-group>
<article-title>The complete sequence of the smallest known nuclear genome from the microsporidian
<italic>Encephalitozoon intestinalis</italic>
</article-title>
<source>Nat Commun</source>
<year>2010</year>
<volume>1</volume>
<fpage>77</fpage>
<pub-id pub-id-type="doi">10.1038/ncomms1082</pub-id>
<pub-id pub-id-type="pmid">20865802</pub-id>
</element-citation>
</ref>
<ref id="CR25">
<label>25.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>McCutcheon</surname>
<given-names>JP</given-names>
</name>
<name>
<surname>McDonald</surname>
<given-names>BR</given-names>
</name>
<name>
<surname>Moran</surname>
<given-names>NA</given-names>
</name>
</person-group>
<article-title>Origin of an alternative genetic code in the extremely small and GC-rich genome of a bacterial symbiont</article-title>
<source>PLoS Genet</source>
<year>2009</year>
<volume>5</volume>
<fpage>e1000565</fpage>
<pub-id pub-id-type="doi">10.1371/journal.pgen.1000565</pub-id>
<pub-id pub-id-type="pmid">19609354</pub-id>
</element-citation>
</ref>
<ref id="CR26">
<label>26.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Smith</surname>
<given-names>DR</given-names>
</name>
</person-group>
<article-title>Unparalleled GC content in the plastid DNA of
<italic>Selaginella</italic>
</article-title>
<source>Plant Mol Biol</source>
<year>2009</year>
<volume>71</volume>
<fpage>627</fpage>
<lpage>639</lpage>
<pub-id pub-id-type="doi">10.1007/s11103-009-9545-3</pub-id>
<pub-id pub-id-type="pmid">19774466</pub-id>
</element-citation>
</ref>
<ref id="CR27">
<label>27.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Slamovits</surname>
<given-names>CH</given-names>
</name>
<name>
<surname>Keeling</surname>
<given-names>PJ</given-names>
</name>
</person-group>
<article-title>Evolution of ultrasmall spliceosomal introns in highly reduced nuclear genomes</article-title>
<source>Mol Biol Evol</source>
<year>2009</year>
<volume>26</volume>
<fpage>1699</fpage>
<lpage>1705</lpage>
<pub-id pub-id-type="doi">10.1093/molbev/msp081</pub-id>
<pub-id pub-id-type="pmid">19380463</pub-id>
</element-citation>
</ref>
<ref id="CR28">
<label>28.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Merchant</surname>
<given-names>SS</given-names>
</name>
<name>
<surname>Prochnik</surname>
<given-names>SE</given-names>
</name>
<name>
<surname>Vallon</surname>
<given-names>O</given-names>
</name>
<name>
<surname>Harris</surname>
<given-names>EH</given-names>
</name>
<name>
<surname>Karpowicz</surname>
<given-names>SJ</given-names>
</name>
<name>
<surname>Witman</surname>
<given-names>GB</given-names>
</name>
<name>
<surname>Terry</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Salamov</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Fritz-Laylin</surname>
<given-names>LK</given-names>
</name>
<name>
<surname>Marechal-Drouard</surname>
<given-names>L</given-names>
</name>
<name>
<surname>Marshall</surname>
<given-names>WF</given-names>
</name>
<name>
<surname>Qu</surname>
<given-names>LH</given-names>
</name>
<name>
<surname>Nelson</surname>
<given-names>DR</given-names>
</name>
<name>
<surname>Sanderfoot</surname>
<given-names>AA</given-names>
</name>
<name>
<surname>Spalding</surname>
<given-names>MH</given-names>
</name>
<name>
<surname>Kapitonov</surname>
<given-names>VV</given-names>
</name>
<name>
<surname>Ren</surname>
<given-names>QH</given-names>
</name>
<name>
<surname>Ferris</surname>
<given-names>P</given-names>
</name>
<name>
<surname>Lindquist</surname>
<given-names>E</given-names>
</name>
<name>
<surname>Shapiro</surname>
<given-names>H</given-names>
</name>
<name>
<surname>Lucas</surname>
<given-names>SM</given-names>
</name>
<name>
<surname>Grimwood</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Schmutz</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Cardol</surname>
<given-names>P</given-names>
</name>
<name>
<surname>Cerutti</surname>
<given-names>H</given-names>
</name>
<name>
<surname>Chanfreau</surname>
<given-names>G</given-names>
</name>
<name>
<surname>Chen</surname>
<given-names>CL</given-names>
</name>
<name>
<surname>Cognat</surname>
<given-names>V</given-names>
</name>
<name>
<surname>Croft</surname>
<given-names>MT</given-names>
</name>
<name>
<surname>Dent</surname>
<given-names>R</given-names>
</name>
<etal></etal>
</person-group>
<article-title>The
<italic>Chlamydomonas</italic>
genome reveals the evolution of key animal and plant functions</article-title>
<source>Science</source>
<year>2007</year>
<volume>318</volume>
<fpage>245</fpage>
<lpage>251</lpage>
<pub-id pub-id-type="doi">10.1126/science.1143609</pub-id>
<pub-id pub-id-type="pmid">17932292</pub-id>
</element-citation>
</ref>
<ref id="CR29">
<label>29.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Palenik</surname>
<given-names>B</given-names>
</name>
<name>
<surname>Grimwood</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Aerts</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Rouze</surname>
<given-names>P</given-names>
</name>
<name>
<surname>Salamov</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Putnam</surname>
<given-names>N</given-names>
</name>
<name>
<surname>Dupont</surname>
<given-names>C</given-names>
</name>
<name>
<surname>Jorgensen</surname>
<given-names>R</given-names>
</name>
<name>
<surname>Derelle</surname>
<given-names>E</given-names>
</name>
<name>
<surname>Rombauts</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Zhou</surname>
<given-names>KM</given-names>
</name>
<name>
<surname>Otillar</surname>
<given-names>R</given-names>
</name>
<name>
<surname>Merchant</surname>
<given-names>SS</given-names>
</name>
<name>
<surname>Podell</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Gaasterland</surname>
<given-names>T</given-names>
</name>
<name>
<surname>Napoli</surname>
<given-names>C</given-names>
</name>
<name>
<surname>Gendler</surname>
<given-names>K</given-names>
</name>
<name>
<surname>Manuell</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Tai</surname>
<given-names>V</given-names>
</name>
<name>
<surname>Vallon</surname>
<given-names>O</given-names>
</name>
<name>
<surname>Piganeau</surname>
<given-names>G</given-names>
</name>
<name>
<surname>Jancek</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Heijde</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Jabbari</surname>
<given-names>K</given-names>
</name>
<name>
<surname>Bowler</surname>
<given-names>C</given-names>
</name>
<name>
<surname>Lohr</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Robbens</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Werner</surname>
<given-names>G</given-names>
</name>
<name>
<surname>Dubchak</surname>
<given-names>I</given-names>
</name>
<name>
<surname>Pazour</surname>
<given-names>GJ</given-names>
</name>
<etal></etal>
</person-group>
<article-title>The tiny eukaryote
<italic>Ostreococcus</italic>
provides genomic insights into the paradox of plankton speciation</article-title>
<source>Proc Natl Acad Sci USA</source>
<year>2007</year>
<volume>104</volume>
<fpage>7705</fpage>
<lpage>7710</lpage>
<pub-id pub-id-type="doi">10.1073/pnas.0611046104</pub-id>
<pub-id pub-id-type="pmid">17460045</pub-id>
</element-citation>
</ref>
<ref id="CR30">
<label>30.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Roy</surname>
<given-names>SW</given-names>
</name>
<name>
<surname>Penny</surname>
<given-names>D</given-names>
</name>
</person-group>
<article-title>Patterns of intron loss and gain in plants: Intron loss-dominated evolution and genome-wide comparison of
<italic>O. sativa</italic>
and
<italic>A. thaliana</italic>
</article-title>
<source>Mol Biol Evol</source>
<year>2007</year>
<volume>24</volume>
<fpage>171</fpage>
<lpage>181</lpage>
<pub-id pub-id-type="doi">10.1093/molbev/msl159</pub-id>
<pub-id pub-id-type="pmid">17065597</pub-id>
</element-citation>
</ref>
<ref id="CR31">
<label>31.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Douglas</surname>
<given-names>SE</given-names>
</name>
<name>
<surname>Penny</surname>
<given-names>SL</given-names>
</name>
</person-group>
<article-title>The plastid genome of the cryptophyte alga, <italic>Guillardia theta</italic>: Complete sequence and conserved synteny groups confirm its common ancestry with red algae</article-title>
<source>J Mol Evol</source>
<year>1999</year>
<volume>48</volume>
<fpage>236</fpage>
<lpage>244</lpage>
<pub-id pub-id-type="doi">10.1007/PL00006462</pub-id>
<pub-id pub-id-type="pmid">9929392</pub-id>
</element-citation>
</ref>
<ref id="CR32">
<label>32.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Khan</surname>
<given-names>H</given-names>
</name>
<name>
<surname>Parks</surname>
<given-names>N</given-names>
</name>
<name>
<surname>Kozera</surname>
<given-names>C</given-names>
</name>
<name>
<surname>Curtis</surname>
<given-names>BA</given-names>
</name>
<name>
<surname>Parsons</surname>
<given-names>BJ</given-names>
</name>
<name>
<surname>Bowman</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Archibald</surname>
<given-names>JM</given-names>
</name>
</person-group>
<article-title>Plastid genome sequence of the cryptophyte alga
<italic>Rhodomonas salina</italic>
CCMP1319: Lateral transfer of putative DNA replication machinery and a test of chromist plastid phylogeny</article-title>
<source>Mol Biol Evol</source>
<year>2007</year>
<volume>24</volume>
<fpage>1832</fpage>
<lpage>1842</lpage>
<pub-id pub-id-type="doi">10.1093/molbev/msm101</pub-id>
<pub-id pub-id-type="pmid">17522086</pub-id>
</element-citation>
</ref>
<ref id="CR33">
<label>33.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Donaher</surname>
<given-names>N</given-names>
</name>
<name>
<surname>Tanifuji</surname>
<given-names>G</given-names>
</name>
<name>
<surname>Onodera</surname>
<given-names>NT</given-names>
</name>
<name>
<surname>Malfatti</surname>
<given-names>SA</given-names>
</name>
<name>
<surname>Chain</surname>
<given-names>PSG</given-names>
</name>
<name>
<surname>Hara</surname>
<given-names>Y</given-names>
</name>
<name>
<surname>Archibald</surname>
<given-names>JM</given-names>
</name>
</person-group>
<article-title>The complete plastid genome sequence of the secondarily nonphotosynthetic alga <italic>Cryptomonas paramecium</italic>: reduction, compaction, and accelerated evolutionary rate</article-title>
<source>Genome Biol Evol</source>
<year>2009</year>
<volume>1</volume>
<fpage>439</fpage>
<lpage>448</lpage>
<pub-id pub-id-type="doi">10.1093/gbe/evp047</pub-id>
<pub-id pub-id-type="pmid">20333213</pub-id>
</element-citation>
</ref>
<ref id="CR34">
<label>34.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Emanuelsson</surname>
<given-names>O</given-names>
</name>
<name>
<surname>Nielsen</surname>
<given-names>H</given-names>
</name>
<name>
<surname>von Heijne</surname>
<given-names>G</given-names>
</name>
</person-group>
<article-title>ChloroP, a neural network-based method for predicting chloroplast transit peptides and their cleavage sites</article-title>
<source>Protein Sci</source>
<year>1999</year>
<volume>8</volume>
<fpage>978</fpage>
<lpage>984</lpage>
<pub-id pub-id-type="doi">10.1110/ps.8.5.978</pub-id>
<pub-id pub-id-type="pmid">10338008</pub-id>
</element-citation>
</ref>
<ref id="CR35">
<label>35.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Emanuelsson</surname>
<given-names>O</given-names>
</name>
<name>
<surname>Nielsen</surname>
<given-names>H</given-names>
</name>
<name>
<surname>Brunak</surname>
<given-names>S</given-names>
</name>
<name>
<surname>von Heijne</surname>
<given-names>G</given-names>
</name>
</person-group>
<article-title>Predicting subcellular localization of proteins based on their N-terminal amino acid sequence</article-title>
<source>J Mol Biol</source>
<year>2000</year>
<volume>300</volume>
<fpage>1005</fpage>
<lpage>1016</lpage>
<pub-id pub-id-type="doi">10.1006/jmbi.2000.3903</pub-id>
<pub-id pub-id-type="pmid">10891285</pub-id>
</element-citation>
</ref>
<ref id="CR36">
<label>36.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Lane</surname>
<given-names>CE</given-names>
</name>
<name>
<surname>Khan</surname>
<given-names>H</given-names>
</name>
<name>
<surname>MacKinnon</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Fong</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Theophilou</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Archibald</surname>
<given-names>JM</given-names>
</name>
</person-group>
<article-title>Insight into the diversity and evolution of the cryptomonad nucleomorph genome</article-title>
<source>Mol Biol Evol</source>
<year>2006</year>
<volume>23</volume>
<fpage>856</fpage>
<lpage>865</lpage>
<pub-id pub-id-type="doi">10.1093/molbev/msj066</pub-id>
<pub-id pub-id-type="pmid">16306383</pub-id>
</element-citation>
</ref>
<ref id="CR37">
<label>37.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Tanifuji</surname>
<given-names>G</given-names>
</name>
<name>
<surname>Onodera</surname>
<given-names>NT</given-names>
</name>
<name>
<surname>Moore</surname>
<given-names>CE</given-names>
</name>
<name>
<surname>Archibald</surname>
<given-names>JM</given-names>
</name>
</person-group>
<article-title>Reduced nuclear genomes maintain high gene transcription levels</article-title>
<source>Mol Biol Evol</source>
<year>2014</year>
<volume>31</volume>
<fpage>625</fpage>
<lpage>635</lpage>
<pub-id pub-id-type="doi">10.1093/molbev/mst254</pub-id>
<pub-id pub-id-type="pmid">24336878</pub-id>
</element-citation>
</ref>
<ref id="CR38">
<label>38.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Galtier</surname>
<given-names>N</given-names>
</name>
<name>
<surname>Piganeau</surname>
<given-names>G</given-names>
</name>
<name>
<surname>Mouchiroud</surname>
<given-names>D</given-names>
</name>
<name>
<surname>Duret</surname>
<given-names>L</given-names>
</name>
</person-group>
<article-title>GC-content evolution in mammalian genomes: The biased gene conversion hypothesis</article-title>
<source>Genetics</source>
<year>2001</year>
<volume>159</volume>
<fpage>907</fpage>
<lpage>911</lpage>
<pub-id pub-id-type="pmid">11693127</pub-id>
</element-citation>
</ref>
<ref id="CR39">
<label>39.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Escobar</surname>
<given-names>JS</given-names>
</name>
<name>
<surname>Glemin</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Galtier</surname>
<given-names>N</given-names>
</name>
</person-group>
<article-title>GC-biased gene conversion impacts ribosomal DNA evolution in vertebrates, angiosperms, and other eukaryotes</article-title>
<source>Mol Biol Evol</source>
<year>2011</year>
<volume>28</volume>
<fpage>2561</fpage>
<lpage>2575</lpage>
<pub-id pub-id-type="doi">10.1093/molbev/msr079</pub-id>
<pub-id pub-id-type="pmid">21444650</pub-id>
</element-citation>
</ref>
<ref id="CR40">
<label>40.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Curtis</surname>
<given-names>BA</given-names>
</name>
<name>
<surname>Tanifuji</surname>
<given-names>G</given-names>
</name>
<name>
<surname>Burki</surname>
<given-names>F</given-names>
</name>
<name>
<surname>Gruber</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Irimia</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Maruyama</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Arias</surname>
<given-names>MC</given-names>
</name>
<name>
<surname>Ball</surname>
<given-names>SG</given-names>
</name>
<name>
<surname>Gile</surname>
<given-names>GH</given-names>
</name>
<name>
<surname>Hirakawa</surname>
<given-names>Y</given-names>
</name>
<name>
<surname>Hopkins</surname>
<given-names>JF</given-names>
</name>
<name>
<surname>Kuo</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Rensing</surname>
<given-names>SA</given-names>
</name>
<name>
<surname>Schmutz</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Symeonidi</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Elias</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Eveleigh</surname>
<given-names>RJM</given-names>
</name>
<name>
<surname>Herman</surname>
<given-names>EK</given-names>
</name>
<name>
<surname>Klute</surname>
<given-names>MJ</given-names>
</name>
<name>
<surname>Nakayama</surname>
<given-names>T</given-names>
</name>
<name>
<surname>Oborník</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Reyes-Prieto</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Armbrust</surname>
<given-names>EV</given-names>
</name>
<name>
<surname>Aves</surname>
<given-names>ST</given-names>
</name>
<name>
<surname>Beiko</surname>
<given-names>RG</given-names>
</name>
<name>
<surname>Coutinho</surname>
<given-names>P</given-names>
</name>
<name>
<surname>Dacks</surname>
<given-names>JB</given-names>
</name>
<name>
<surname>Durnford</surname>
<given-names>DG</given-names>
</name>
<name>
<surname>Fast</surname>
<given-names>NM</given-names>
</name>
<name>
<surname>Green</surname>
<given-names>BR</given-names>
</name>
<etal></etal>
</person-group>
<article-title>Algal genomes reveal evolutionary mosaicism and the fate of nucleomorphs</article-title>
<source>Nature</source>
<year>2012</year>
<volume>492</volume>
<fpage>59</fpage>
<lpage>65</lpage>
<pub-id pub-id-type="doi">10.1038/nature11681</pub-id>
<pub-id pub-id-type="pmid">23201678</pub-id>
</element-citation>
</ref>
<ref id="CR41">
<label>41.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Turmel</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Gagnon</surname>
<given-names>MC</given-names>
</name>
<name>
<surname>O'Kelly</surname>
<given-names>CJ</given-names>
</name>
<name>
<surname>Otis</surname>
<given-names>C</given-names>
</name>
<name>
<surname>Lemieux</surname>
<given-names>C</given-names>
</name>
</person-group>
<article-title>The chloroplast genomes of the green algae <italic>Pyramimonas</italic>, <italic>Monomastix</italic>, and <italic>Pycnococcus</italic> shed new light on the evolutionary history of prasinophytes and the origin of the secondary chloroplasts of euglenids</article-title>
<source>Mol Biol Evol</source>
<year>2009</year>
<volume>26</volume>
<fpage>631</fpage>
<lpage>648</lpage>
<pub-id pub-id-type="doi">10.1093/molbev/msn285</pub-id>
<pub-id pub-id-type="pmid">19074760</pub-id>
</element-citation>
</ref>
<ref id="CR42">
<label>42.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Leliaert</surname>
<given-names>F</given-names>
</name>
<name>
<surname>Smith</surname>
<given-names>DR</given-names>
</name>
<name>
<surname>Moreau</surname>
<given-names>H</given-names>
</name>
<name>
<surname>Herron</surname>
<given-names>MD</given-names>
</name>
<name>
<surname>Verbruggen</surname>
<given-names>H</given-names>
</name>
<name>
<surname>Delwiche</surname>
<given-names>CF</given-names>
</name>
<name>
<surname>De Clerck</surname>
<given-names>O</given-names>
</name>
</person-group>
<article-title>Phylogeny and molecular evolution of the green algae</article-title>
<source>Crit Rev Plant Sci</source>
<year>2012</year>
<volume>31</volume>
<fpage>1</fpage>
<lpage>46</lpage>
<pub-id pub-id-type="doi">10.1080/07352689.2011.615705</pub-id>
</element-citation>
</ref>
<ref id="CR43">
<label>43.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Marin</surname>
<given-names>B</given-names>
</name>
</person-group>
<article-title>Nested in the Chlorellales or independent class? Phylogeny and classification of the Pedinophyceae (Viridiplantae) revealed by molecular phylogenetic analyses of complete nuclear and plastid-encoded rRNA operons</article-title>
<source>Protist</source>
<year>2012</year>
<volume>163</volume>
<fpage>778</fpage>
<lpage>805</lpage>
<pub-id pub-id-type="doi">10.1016/j.protis.2011.11.004</pub-id>
<pub-id pub-id-type="pmid">22192529</pub-id>
</element-citation>
</ref>
<ref id="CR44">
<label>44.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Marin</surname>
<given-names>B</given-names>
</name>
<name>
<surname>Melkonian</surname>
<given-names>M</given-names>
</name>
</person-group>
<article-title>Molecular phylogeny and classification of the Mamiellophyceae class. nov. (Chlorophyta) based on sequence comparisons of the nuclear- and plastid-encoded rRNA operons</article-title>
<source>Protist</source>
<year>2010</year>
<volume>161</volume>
<fpage>304</fpage>
<lpage>336</lpage>
<pub-id pub-id-type="doi">10.1016/j.protis.2009.10.002</pub-id>
<pub-id pub-id-type="pmid">20005168</pub-id>
</element-citation>
</ref>
<ref id="CR45">
<label>45.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Stamatakis</surname>
<given-names>A</given-names>
</name>
</person-group>
<article-title>RAxML-VI-HPC: Maximum likelihood-based phylogenetic analyses with thousands of taxa and mixed models</article-title>
<source>Bioinformatics</source>
<year>2006</year>
<volume>22</volume>
<fpage>2688</fpage>
<lpage>2690</lpage>
<pub-id pub-id-type="doi">10.1093/bioinformatics/btl446</pub-id>
<pub-id pub-id-type="pmid">16928733</pub-id>
</element-citation>
</ref>
<ref id="CR46">
<label>46.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Lartillot</surname>
<given-names>N</given-names>
</name>
<name>
<surname>Rodrigue</surname>
<given-names>N</given-names>
</name>
<name>
<surname>Stubbs</surname>
<given-names>D</given-names>
</name>
<name>
<surname>Richer</surname>
<given-names>J</given-names>
</name>
</person-group>
<article-title>PhyloBayes MPI: Phylogenetic reconstruction with infinite mixtures of profiles in a parallel environment</article-title>
<source>Syst Biol</source>
<year>2013</year>
<volume>62</volume>
<fpage>611</fpage>
<lpage>615</lpage>
<pub-id pub-id-type="doi">10.1093/sysbio/syt022</pub-id>
<pub-id pub-id-type="pmid">23564032</pub-id>
</element-citation>
</ref>
<ref id="CR47">
<label>47.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Sun</surname>
<given-names>SL</given-names>
</name>
<name>
<surname>Chen</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Li</surname>
<given-names>WZ</given-names>
</name>
<name>
<surname>Altintas</surname>
<given-names>I</given-names>
</name>
<name>
<surname>Lin</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Peltier</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Stocks</surname>
<given-names>K</given-names>
</name>
<name>
<surname>Allen</surname>
<given-names>EE</given-names>
</name>
<name>
<surname>Ellisman</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Grethe</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Wooley</surname>
<given-names>J</given-names>
</name>
</person-group>
<article-title>Community cyberinfrastructure for advanced microbial ecology research and analysis: the CAMERA resource</article-title>
<source>Nucleic Acids Res</source>
<year>2011</year>
<volume>39</volume>
<fpage>D546</fpage>
<lpage>D551</lpage>
<pub-id pub-id-type="doi">10.1093/nar/gkq1102</pub-id>
<pub-id pub-id-type="pmid">21045053</pub-id>
</element-citation>
</ref>
<ref id="CR48">
<label>48.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Liu</surname>
<given-names>Y</given-names>
</name>
<name>
<surname>Schmidt</surname>
<given-names>B</given-names>
</name>
<name>
<surname>Maskell</surname>
<given-names>DL</given-names>
</name>
</person-group>
<article-title>Parallelized short read assembly of large genomes using de Bruijn graphs</article-title>
<source>BMC Bioinformatics</source>
<year>2011</year>
<volume>12</volume>
<fpage>354</fpage>
<pub-id pub-id-type="doi">10.1186/1471-2105-12-354</pub-id>
<pub-id pub-id-type="pmid">21867511</pub-id>
</element-citation>
</ref>
<ref id="CR49">
<label>49.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Li</surname>
<given-names>H</given-names>
</name>
<name>
<surname>Durbin</surname>
<given-names>R</given-names>
</name>
</person-group>
<article-title>Fast and accurate short read alignment with Burrows-Wheeler transform</article-title>
<source>Bioinformatics</source>
<year>2009</year>
<volume>25</volume>
<fpage>1754</fpage>
<lpage>1760</lpage>
<pub-id pub-id-type="doi">10.1093/bioinformatics/btp324</pub-id>
<pub-id pub-id-type="pmid">19451168</pub-id>
</element-citation>
</ref>
<ref id="CR50">
<label>50.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Li</surname>
<given-names>H</given-names>
</name>
<name>
<surname>Handsaker</surname>
<given-names>B</given-names>
</name>
<name>
<surname>Wysoker</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Fennell</surname>
<given-names>T</given-names>
</name>
<name>
<surname>Ruan</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Homer</surname>
<given-names>N</given-names>
</name>
<name>
<surname>Marth</surname>
<given-names>G</given-names>
</name>
<name>
<surname>Abecasis</surname>
<given-names>G</given-names>
</name>
<name>
<surname>Durbin</surname>
<given-names>R</given-names>
</name>
<collab>1000 Genome Project Data Processing Subgroup</collab>
</person-group>
<article-title>The sequence alignment/map format and SAMtools</article-title>
<source>Bioinformatics</source>
<year>2009</year>
<volume>25</volume>
<fpage>2078</fpage>
<lpage>2079</lpage>
<pub-id pub-id-type="doi">10.1093/bioinformatics/btp352</pub-id>
<pub-id pub-id-type="pmid">19505943</pub-id>
</element-citation>
</ref>
<ref id="CR51">
<label>51.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Robinson</surname>
<given-names>JT</given-names>
</name>
<name>
<surname>Thorvaldsdottir</surname>
<given-names>H</given-names>
</name>
<name>
<surname>Winckler</surname>
<given-names>W</given-names>
</name>
<name>
<surname>Guttman</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Lander</surname>
<given-names>ES</given-names>
</name>
<name>
<surname>Getz</surname>
<given-names>G</given-names>
</name>
<name>
<surname>Mesirov</surname>
<given-names>JP</given-names>
</name>
</person-group>
<article-title>Integrative genomics viewer</article-title>
<source>Nat Biotechnol</source>
<year>2011</year>
<volume>29</volume>
<fpage>24</fpage>
<lpage>26</lpage>
<pub-id pub-id-type="doi">10.1038/nbt.1754</pub-id>
<pub-id pub-id-type="pmid">21221095</pub-id>
</element-citation>
</ref>
<ref id="CR52">
<label>52.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Larkin</surname>
<given-names>MA</given-names>
</name>
<name>
<surname>Blackshields</surname>
<given-names>G</given-names>
</name>
<name>
<surname>Brown</surname>
<given-names>NP</given-names>
</name>
<name>
<surname>Chenna</surname>
<given-names>R</given-names>
</name>
<name>
<surname>McGettigan</surname>
<given-names>PA</given-names>
</name>
<name>
<surname>McWilliam</surname>
<given-names>H</given-names>
</name>
<name>
<surname>Valentin</surname>
<given-names>F</given-names>
</name>
<name>
<surname>Wallace</surname>
<given-names>IM</given-names>
</name>
<name>
<surname>Wilm</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Lopez</surname>
<given-names>R</given-names>
</name>
<name>
<surname>Thompson</surname>
<given-names>JD</given-names>
</name>
<name>
<surname>Gibson</surname>
<given-names>TJ</given-names>
</name>
<name>
<surname>Higgins</surname>
<given-names>DG</given-names>
</name>
</person-group>
<article-title>Clustal W and Clustal X version 2.0</article-title>
<source>Bioinformatics</source>
<year>2007</year>
<volume>23</volume>
<fpage>2947</fpage>
<lpage>2948</lpage>
<pub-id pub-id-type="doi">10.1093/bioinformatics/btm404</pub-id>
<pub-id pub-id-type="pmid">17846036</pub-id>
</element-citation>
</ref>
<ref id="CR53">
<label>53.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Crooks</surname>
<given-names>GE</given-names>
</name>
<name>
<surname>Hon</surname>
<given-names>G</given-names>
</name>
<name>
<surname>Chandonia</surname>
<given-names>JM</given-names>
</name>
<name>
<surname>Brenner</surname>
<given-names>SE</given-names>
</name>
</person-group>
<article-title>WebLogo: A sequence logo generator</article-title>
<source>Genome Res</source>
<year>2004</year>
<volume>14</volume>
<fpage>1188</fpage>
<lpage>1190</lpage>
<pub-id pub-id-type="doi">10.1101/gr.849004</pub-id>
<pub-id pub-id-type="pmid">15173120</pub-id>
</element-citation>
</ref>
<ref id="CR54">
<label>54.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Rutherford</surname>
<given-names>K</given-names>
</name>
<name>
<surname>Parkhill</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Crook</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Horsnell</surname>
<given-names>T</given-names>
</name>
<name>
<surname>Rice</surname>
<given-names>P</given-names>
</name>
<name>
<surname>Rajandream</surname>
<given-names>MA</given-names>
</name>
<name>
<surname>Barrell</surname>
<given-names>B</given-names>
</name>
</person-group>
<article-title>Artemis: sequence visualization and annotation</article-title>
<source>Bioinformatics</source>
<year>2000</year>
<volume>16</volume>
<fpage>944</fpage>
<lpage>945</lpage>
<pub-id pub-id-type="doi">10.1093/bioinformatics/16.10.944</pub-id>
<pub-id pub-id-type="pmid">11120685</pub-id>
</element-citation>
</ref>
<ref id="CR55">
<label>55.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Schattner</surname>
<given-names>P</given-names>
</name>
<name>
<surname>Brooks</surname>
<given-names>AN</given-names>
</name>
<name>
<surname>Lowe</surname>
<given-names>TM</given-names>
</name>
</person-group>
<article-title>The tRNAscan-SE, snoscan and snoGPS web servers for the detection of tRNAs and snoRNAs</article-title>
<source>Nucleic Acids Res</source>
<year>2005</year>
<volume>33</volume>
<fpage>W686</fpage>
<lpage>W689</lpage>
<pub-id pub-id-type="doi">10.1093/nar/gki366</pub-id>
<pub-id pub-id-type="pmid">15980563</pub-id>
</element-citation>
</ref>
<ref id="CR56">
<label>56.</label>
<element-citation publication-type="book">
<source>R: A language and environment for statistical computing</source>
<year>2008</year>
<publisher-loc>Vienna</publisher-loc>
<publisher-name>R Foundation for Statistical Computing</publisher-name>
</element-citation>
</ref>
<ref id="CR57">
<label>57.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Renner</surname>
<given-names>T</given-names>
</name>
<name>
<surname>Waters</surname>
<given-names>ER</given-names>
</name>
</person-group>
<article-title>Comparative genomic analysis of the Hsp70s from five diverse photosynthetic eukaryotes</article-title>
<source>Cell Stress Chaperones</source>
<year>2007</year>
<volume>12</volume>
<fpage>172</fpage>
<lpage>185</lpage>
<pub-id pub-id-type="doi">10.1379/CSC-230R1.1</pub-id>
<pub-id pub-id-type="pmid">17688196</pub-id>
</element-citation>
</ref>
<ref id="CR58">
<label>58.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Katoh</surname>
<given-names>K</given-names>
</name>
<name>
<surname>Standley</surname>
<given-names>DM</given-names>
</name>
</person-group>
<article-title>MAFFT multiple sequence alignment software version 7: Improvements in performance and usability</article-title>
<source>Mol Biol Evol</source>
<year>2013</year>
<volume>30</volume>
<fpage>772</fpage>
<lpage>780</lpage>
<pub-id pub-id-type="doi">10.1093/molbev/mst010</pub-id>
<pub-id pub-id-type="pmid">23329690</pub-id>
</element-citation>
</ref>
<ref id="CR59">
<label>59.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Capella-Gutierrez</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Silla-Martinez</surname>
<given-names>JM</given-names>
</name>
<name>
<surname>Gabaldon</surname>
<given-names>T</given-names>
</name>
</person-group>
<article-title>trimAl: a tool for automated alignment trimming in large-scale phylogenetic analyses</article-title>
<source>Bioinformatics</source>
<year>2009</year>
<volume>25</volume>
<fpage>1972</fpage>
<lpage>1973</lpage>
<pub-id pub-id-type="doi">10.1093/bioinformatics/btp348</pub-id>
<pub-id pub-id-type="pmid">19505945</pub-id>
</element-citation>
</ref>
<ref id="CR60">
<label>60.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Criscuolo</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Gribaldo</surname>
<given-names>S</given-names>
</name>
</person-group>
<article-title>BMGE (Block Mapping and Gathering with Entropy): a new software for selection of phylogenetic informative regions from multiple sequence alignments</article-title>
<source>BMC Evol Biol</source>
<year>2010</year>
<volume>10</volume>
<fpage>210</fpage>
<pub-id pub-id-type="doi">10.1186/1471-2148-10-210</pub-id>
<pub-id pub-id-type="pmid">20626897</pub-id>
</element-citation>
</ref>
<ref id="CR61">
<label>61.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Brown</surname>
<given-names>MW</given-names>
</name>
<name>
<surname>Sharpe</surname>
<given-names>SC</given-names>
</name>
<name>
<surname>Silberman</surname>
<given-names>JD</given-names>
</name>
<name>
<surname>Heiss</surname>
<given-names>AA</given-names>
</name>
<name>
<surname>Lang</surname>
<given-names>BF</given-names>
</name>
<name>
<surname>Simpson</surname>
<given-names>AGB</given-names>
</name>
<name>
<surname>Roger</surname>
<given-names>AJ</given-names>
</name>
</person-group>
<article-title>Phylogenomics demonstrates that breviate flagellates are related to opisthokonts and apusomonads</article-title>
<source>Proc Biol Sci</source>
<year>2013</year>
<volume>280</volume>
<fpage>20131755</fpage>
<pub-id pub-id-type="doi">10.1098/rspb.2013.1755</pub-id>
<pub-id pub-id-type="pmid">23986111</pub-id>
</element-citation>
</ref>
<ref id="CR62">
<label>62.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Burki</surname>
<given-names>F</given-names>
</name>
<name>
<surname>Shalchian-Tabrizi</surname>
<given-names>K</given-names>
</name>
<name>
<surname>Minge</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Skjaeveland</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Nikolaev</surname>
<given-names>SI</given-names>
</name>
<name>
<surname>Jakobsen</surname>
<given-names>KS</given-names>
</name>
<name>
<surname>Pawlowski</surname>
<given-names>J</given-names>
</name>
</person-group>
<article-title>Phylogenomics reshuffles the eukaryotic supergroups</article-title>
<source>PLoS One</source>
<year>2007</year>
<volume>2</volume>
<fpage>e790</fpage>
<pub-id pub-id-type="doi">10.1371/journal.pone.0000790</pub-id>
<pub-id pub-id-type="pmid">17726520</pub-id>
</element-citation>
</ref>
<ref id="CR63">
<label>63.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Burki</surname>
<given-names>F</given-names>
</name>
<name>
<surname>Okamoto</surname>
<given-names>N</given-names>
</name>
<name>
<surname>Pombert</surname>
<given-names>JF</given-names>
</name>
<name>
<surname>Keeling</surname>
<given-names>PJ</given-names>
</name>
</person-group>
<article-title>The evolutionary history of haptophytes and cryptophytes: phylogenomic evidence for separate origins</article-title>
<source>Proc Biol Sci</source>
<year>2012</year>
<volume>279</volume>
<fpage>2246</fpage>
<lpage>2254</lpage>
<pub-id pub-id-type="doi">10.1098/rspb.2011.2301</pub-id>
<pub-id pub-id-type="pmid">22298847</pub-id>
</element-citation>
</ref>
<ref id="CR64">
<label>64.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Hampl</surname>
<given-names>V</given-names>
</name>
<name>
<surname>Hug</surname>
<given-names>L</given-names>
</name>
<name>
<surname>Leigh</surname>
<given-names>JW</given-names>
</name>
<name>
<surname>Dacks</surname>
<given-names>JB</given-names>
</name>
<name>
<surname>Lang</surname>
<given-names>BF</given-names>
</name>
<name>
<surname>Simpson</surname>
<given-names>AGB</given-names>
</name>
<name>
<surname>Roger</surname>
<given-names>AJ</given-names>
</name>
</person-group>
<article-title>Phylogenomic analyses support the monophyly of Excavata and resolve relationships among eukaryotic “supergroups”</article-title>
<source>Proc Natl Acad Sci USA</source>
<year>2009</year>
<volume>106</volume>
<fpage>3859</fpage>
<lpage>3864</lpage>
<pub-id pub-id-type="doi">10.1073/pnas.0807880106</pub-id>
</element-citation>
</ref>
<ref id="CR65">
<label>65.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Brown</surname>
<given-names>MW</given-names>
</name>
<name>
<surname>Kolisko</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Silberman</surname>
<given-names>JD</given-names>
</name>
<name>
<surname>Roger</surname>
<given-names>AJ</given-names>
</name>
</person-group>
<article-title>Aggregative multicellularity evolved independently in the eukaryotic supergroup Rhizaria</article-title>
<source>Curr Biol</source>
<year>2012</year>
<volume>22</volume>
<fpage>1123</fpage>
<lpage>1127</lpage>
<pub-id pub-id-type="doi">10.1016/j.cub.2012.04.021</pub-id>
<pub-id pub-id-type="pmid">22608512</pub-id>
</element-citation>
</ref>
</ref-list>
</back>
</pmc>
</record>

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Ticri/CIDE/explor/CyberinfraV1/Data/Pmc/Corpus
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000167 | SxmlIndent | more

Or

HfdSelect -h $EXPLOR_AREA/Data/Pmc/Corpus/biblio.hfd -nk 000167 | SxmlIndent | more

To add a link to this page within the Wicri network

{{Explor lien
   |wiki=    Ticri/CIDE
   |area=    CyberinfraV1
   |flux=    Pmc
   |étape=   Corpus
   |type=    RBID
   |clé=     PMC:4035089
   |texte=   Nucleomorph and plastid genome sequences of the chlorarachniophyte Lotharella oceanica: convergent reductive evolution and frequent recombination in nucleomorph-bearing algae
}}

To generate wiki pages

HfdIndexSelect -h $EXPLOR_AREA/Data/Pmc/Corpus/RBID.i   -Sk "pubmed:24885563" \
       | HfdSelect -Kh $EXPLOR_AREA/Data/Pmc/Corpus/biblio.hfd   \
       | NlmPubMed2Wicri -a CyberinfraV1 


This area was generated with Dilib version V0.6.25.
Data generation: Thu Oct 27 09:30:58 2016. Site generation: Sun Mar 10 23:08:40 2024