Cyberinfrastructure exploration server

Warning: this site is under development.
Warning: this site was generated automatically from raw corpora.
The information has therefore not been validated.

A novel group of diverse Polinton-like viruses discovered by metagenome analysis

Internal identifier: 000161 (Pmc/Corpus); previous: 000160; next: 000162

Authors: Natalya Yutin; Sofiya Shevchenko; Vladimir Kapitonov; Mart Krupovic; Eugene V. Koonin

Source: BMC Biology [1741-7007]; 2015

RBID: PMC:4642659

Abstract

Background

The rapidly growing metagenomic databases provide increasing opportunities for computational discovery of new groups of organisms. Identification of new viruses is particularly straightforward given the comparatively small size of viral genomes, although fast evolution of viruses complicates the analysis of novel sequences. Here we report the metagenomic discovery of a distinct group of diverse viruses that are distantly related to the eukaryotic virus-like transposons of the Polinton superfamily.

Results

The sequence of the putative major capsid protein (MCP) of the unusual linear virophage associated with Phaeocystis globosa virus (PgVV) was used as a bait to identify potential related viruses in metagenomic databases. Assembly of the contigs encoding the PgVV MCP homologs followed by comprehensive sequence analysis of the proteins encoded in these contigs resulted in the identification of a large group of Polinton-like viruses (PLV) that resemble Polintons (polintoviruses) and virophages in genome size, and share with them a conserved minimal morphogenetic module that consists of major and minor capsid proteins and the packaging ATPase. With a single exception, the PLV lack the retrovirus-type integrase that is encoded in the genomes of all Polintons and the Mavirus group of virophages. However, some PLV encode a newly identified tyrosine recombinase-integrase that is common in bacteria and bacteriophages and is also found in the Organic Lake virophage group. Although several PLV genomes and individual genes are integrated into algal genomes, it appears likely that most of the PLV are viruses. Given the absence of protease and retrovirus-type integrase, the PLV could resemble the ancestral polintoviruses that evolved from bacterial tectiviruses. Apart from the conserved minimal morphogenetic module, the PLV widely differ in their genome complements but share a gene network with Polintons and virophages, suggestive of multiple gene exchanges within a shared gene pool.

Conclusions

The discovery of PLV substantially expands the emerging class of eukaryotic viruses and transposons that also includes Polintons and virophages. This class of selfish elements is extremely widespread and might have been a hotbed of eukaryotic virus, transposon and plasmid evolution. New families of these elements are expected to be discovered.

Electronic supplementary material

The online version of this article (doi:10.1186/s12915-015-0207-4) contains supplementary material, which is available to authorized users.


Url: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4642659
DOI: 10.1186/s12915-015-0207-4
PubMed: 26560305
PubMed Central: 4642659

The document in XML format

<record>
<TEI>
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en">A novel group of diverse Polinton-like viruses discovered by metagenome analysis</title>
<author>
<name sortKey="Yutin, Natalya" sort="Yutin, Natalya" uniqKey="Yutin N" first="Natalya" last="Yutin">Natalya Yutin</name>
<affiliation>
<nlm:aff id="Aff1">National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894 USA</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Shevchenko, Sofiya" sort="Shevchenko, Sofiya" uniqKey="Shevchenko S" first="Sofiya" last="Shevchenko">Sofiya Shevchenko</name>
<affiliation>
<nlm:aff id="Aff1">National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894 USA</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Kapitonov, Vladimir" sort="Kapitonov, Vladimir" uniqKey="Kapitonov V" first="Vladimir" last="Kapitonov">Vladimir Kapitonov</name>
<affiliation>
<nlm:aff id="Aff1">National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894 USA</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Krupovic, Mart" sort="Krupovic, Mart" uniqKey="Krupovic M" first="Mart" last="Krupovic">Mart Krupovic</name>
<affiliation>
<nlm:aff id="Aff2">Unité Biologie Moléculaire du Gène chez les Extrêmophiles, Department of Microbiology, Institut Pasteur, Paris, France</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Koonin, Eugene V" sort="Koonin, Eugene V" uniqKey="Koonin E" first="Eugene V." last="Koonin">Eugene V. Koonin</name>
<affiliation>
<nlm:aff id="Aff1">National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894 USA</nlm:aff>
</affiliation>
</author>
</titleStmt>
<publicationStmt>
<idno type="wicri:source">PMC</idno>
<idno type="pmid">26560305</idno>
<idno type="pmc">4642659</idno>
<idno type="url">http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4642659</idno>
<idno type="RBID">PMC:4642659</idno>
<idno type="doi">10.1186/s12915-015-0207-4</idno>
<date when="2015">2015</date>
<idno type="wicri:Area/Pmc/Corpus">000161</idno>
</publicationStmt>
<sourceDesc>
<biblStruct>
<analytic>
<title xml:lang="en" level="a" type="main">A novel group of diverse Polinton-like viruses discovered by metagenome analysis</title>
<author>
<name sortKey="Yutin, Natalya" sort="Yutin, Natalya" uniqKey="Yutin N" first="Natalya" last="Yutin">Natalya Yutin</name>
<affiliation>
<nlm:aff id="Aff1">National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894 USA</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Shevchenko, Sofiya" sort="Shevchenko, Sofiya" uniqKey="Shevchenko S" first="Sofiya" last="Shevchenko">Sofiya Shevchenko</name>
<affiliation>
<nlm:aff id="Aff1">National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894 USA</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Kapitonov, Vladimir" sort="Kapitonov, Vladimir" uniqKey="Kapitonov V" first="Vladimir" last="Kapitonov">Vladimir Kapitonov</name>
<affiliation>
<nlm:aff id="Aff1">National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894 USA</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Krupovic, Mart" sort="Krupovic, Mart" uniqKey="Krupovic M" first="Mart" last="Krupovic">Mart Krupovic</name>
<affiliation>
<nlm:aff id="Aff2">Unité Biologie Moléculaire du Gène chez les Extrêmophiles, Department of Microbiology, Institut Pasteur, Paris, France</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Koonin, Eugene V" sort="Koonin, Eugene V" uniqKey="Koonin E" first="Eugene V." last="Koonin">Eugene V. Koonin</name>
<affiliation>
<nlm:aff id="Aff1">National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894 USA</nlm:aff>
</affiliation>
</author>
</analytic>
<series>
<title level="j">BMC Biology</title>
<idno type="eISSN">1741-7007</idno>
<imprint>
<date when="2015">2015</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
</fileDesc>
<profileDesc>
<textClass></textClass>
</profileDesc>
</teiHeader>
<front>
<div type="abstract" xml:lang="en">
<sec>
<title>Background</title>
<p>The rapidly growing metagenomic databases provide increasing opportunities for computational discovery of new groups of organisms. Identification of new viruses is particularly straightforward given the comparatively small size of viral genomes, although fast evolution of viruses complicates the analysis of novel sequences. Here we report the metagenomic discovery of a distinct group of diverse viruses that are distantly related to the eukaryotic virus-like transposons of the Polinton superfamily.</p>
</sec>
<sec>
<title>Results</title>
<p>The sequence of the putative major capsid protein (MCP) of the unusual linear virophage associated with
<italic>Phaeocystis globosa</italic>
virus (PgVV) was used as a bait to identify potential related viruses in metagenomic databases. Assembly of the contigs encoding the PgVV MCP homologs followed by comprehensive sequence analysis of the proteins encoded in these contigs resulted in the identification of a large group of Polinton-like viruses (PLV) that resemble Polintons (polintoviruses) and virophages in genome size, and share with them a conserved minimal morphogenetic module that consists of major and minor capsid proteins and the packaging ATPase. With a single exception, the PLV lack the retrovirus-type integrase that is encoded in the genomes of all Polintons and the Mavirus group of virophages. However, some PLV encode a newly identified tyrosine recombinase-integrase that is common in bacteria and bacteriophages and is also found in the Organic Lake virophage group. Although several PLV genomes and individual genes are integrated into algal genomes, it appears likely that most of the PLV are viruses. Given the absence of protease and retrovirus-type integrase, the PLV could resemble the ancestral polintoviruses that evolved from bacterial tectiviruses. Apart from the conserved minimal morphogenetic module, the PLV widely differ in their genome complements but share a gene network with Polintons and virophages, suggestive of multiple gene exchanges within a shared gene pool.</p>
</sec>
<sec>
<title>Conclusions</title>
<p>The discovery of PLV substantially expands the emerging class of eukaryotic viruses and transposons that also includes Polintons and virophages. This class of selfish elements is extremely widespread and might have been a hotbed of eukaryotic virus, transposon and plasmid evolution. New families of these elements are expected to be discovered.</p>
</sec>
<sec>
<title>Electronic supplementary material</title>
<p>The online version of this article (doi:10.1186/s12915-015-0207-4) contains supplementary material, which is available to authorized users.</p>
</sec>
</div>
</front>
</TEI>
<pmc article-type="research-article">
<pmc-dir>properties open_access</pmc-dir>
<front>
<journal-meta>
<journal-id journal-id-type="nlm-ta">BMC Biol</journal-id>
<journal-id journal-id-type="iso-abbrev">BMC Biol</journal-id>
<journal-title-group>
<journal-title>BMC Biology</journal-title>
</journal-title-group>
<issn pub-type="epub">1741-7007</issn>
<publisher>
<publisher-name>BioMed Central</publisher-name>
<publisher-loc>London</publisher-loc>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="pmid">26560305</article-id>
<article-id pub-id-type="pmc">4642659</article-id>
<article-id pub-id-type="publisher-id">207</article-id>
<article-id pub-id-type="doi">10.1186/s12915-015-0207-4</article-id>
<article-categories>
<subj-group subj-group-type="heading">
<subject>Research Article</subject>
</subj-group>
</article-categories>
<title-group>
<article-title>A novel group of diverse Polinton-like viruses discovered by metagenome analysis</article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author">
<name>
<surname>Yutin</surname>
<given-names>Natalya</given-names>
</name>
<xref ref-type="aff" rid="Aff1"></xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Shevchenko</surname>
<given-names>Sofiya</given-names>
</name>
<xref ref-type="aff" rid="Aff1"></xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Kapitonov</surname>
<given-names>Vladimir</given-names>
</name>
<xref ref-type="aff" rid="Aff1"></xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Krupovic</surname>
<given-names>Mart</given-names>
</name>
<xref ref-type="aff" rid="Aff2"></xref>
</contrib>
<contrib contrib-type="author" corresp="yes">
<name>
<surname>Koonin</surname>
<given-names>Eugene V.</given-names>
</name>
<address>
<email>koonin@ncbi.nlm.nih.gov</email>
</address>
<xref ref-type="aff" rid="Aff1"></xref>
</contrib>
<aff id="Aff1">
<label></label>
National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894 USA</aff>
<aff id="Aff2">
<label></label>
Unité Biologie Moléculaire du Gène chez les Extrêmophiles, Department of Microbiology, Institut Pasteur, Paris, France</aff>
</contrib-group>
<pub-date pub-type="epub">
<day>11</day>
<month>11</month>
<year>2015</year>
</pub-date>
<pub-date pub-type="pmc-release">
<day>11</day>
<month>11</month>
<year>2015</year>
</pub-date>
<pub-date pub-type="collection">
<year>2015</year>
</pub-date>
<volume>13</volume>
<elocation-id>95</elocation-id>
<history>
<date date-type="received">
<day>14</day>
<month>8</month>
<year>2015</year>
</date>
<date date-type="accepted">
<day>28</day>
<month>10</month>
<year>2015</year>
</date>
</history>
<permissions>
<copyright-statement>© Yutin et al. 2015</copyright-statement>
<license license-type="OpenAccess">
<license-p>
<bold>Open Access</bold>
This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (
<ext-link ext-link-type="uri" xlink:href="http://creativecommons.org/licenses/by/4.0/">http://creativecommons.org/licenses/by/4.0/</ext-link>
), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (
<ext-link ext-link-type="uri" xlink:href="http://creativecommons.org/publicdomain/zero/1.0/">http://creativecommons.org/publicdomain/zero/1.0/</ext-link>
) applies to the data made available in this article, unless otherwise stated.</license-p>
</license>
</permissions>
<abstract id="Abs1">
<sec>
<title>Background</title>
<p>The rapidly growing metagenomic databases provide increasing opportunities for computational discovery of new groups of organisms. Identification of new viruses is particularly straightforward given the comparatively small size of viral genomes, although fast evolution of viruses complicates the analysis of novel sequences. Here we report the metagenomic discovery of a distinct group of diverse viruses that are distantly related to the eukaryotic virus-like transposons of the Polinton superfamily.</p>
</sec>
<sec>
<title>Results</title>
<p>The sequence of the putative major capsid protein (MCP) of the unusual linear virophage associated with
<italic>Phaeocystis globosa</italic>
virus (PgVV) was used as a bait to identify potential related viruses in metagenomic databases. Assembly of the contigs encoding the PgVV MCP homologs followed by comprehensive sequence analysis of the proteins encoded in these contigs resulted in the identification of a large group of Polinton-like viruses (PLV) that resemble Polintons (polintoviruses) and virophages in genome size, and share with them a conserved minimal morphogenetic module that consists of major and minor capsid proteins and the packaging ATPase. With a single exception, the PLV lack the retrovirus-type integrase that is encoded in the genomes of all Polintons and the Mavirus group of virophages. However, some PLV encode a newly identified tyrosine recombinase-integrase that is common in bacteria and bacteriophages and is also found in the Organic Lake virophage group. Although several PLV genomes and individual genes are integrated into algal genomes, it appears likely that most of the PLV are viruses. Given the absence of protease and retrovirus-type integrase, the PLV could resemble the ancestral polintoviruses that evolved from bacterial tectiviruses. Apart from the conserved minimal morphogenetic module, the PLV widely differ in their genome complements but share a gene network with Polintons and virophages, suggestive of multiple gene exchanges within a shared gene pool.</p>
</sec>
<sec>
<title>Conclusions</title>
<p>The discovery of PLV substantially expands the emerging class of eukaryotic viruses and transposons that also includes Polintons and virophages. This class of selfish elements is extremely widespread and might have been a hotbed of eukaryotic virus, transposon and plasmid evolution. New families of these elements are expected to be discovered.</p>
</sec>
<sec>
<title>Electronic supplementary material</title>
<p>The online version of this article (doi:10.1186/s12915-015-0207-4) contains supplementary material, which is available to authorized users.</p>
</sec>
</abstract>
<custom-meta-group>
<custom-meta>
<meta-name>issue-copyright-statement</meta-name>
<meta-value>© The Author(s) 2015</meta-value>
</custom-meta>
</custom-meta-group>
</article-meta>
</front>
<body>
<sec id="Sec1" sec-type="introduction">
<title>Background</title>
<p>Metagenomic sequences are a treasure trove of novel genes and genomes [
<xref ref-type="bibr" rid="CR1">1</xref>
–
<xref ref-type="bibr" rid="CR5">5</xref>
]. At present, after the release of the massive data from the Global Ocean Survey [
<xref ref-type="bibr" rid="CR6">6</xref>
] and especially the more recent Tara project [
<xref ref-type="bibr" rid="CR7">7</xref>
], the volume of metagenomic sequence data already substantially exceeds that of conventional sequence databases such as GenBank. Perhaps even more importantly, metagenomic data sets have not only a quantitative but also a qualitative advantage over genomic databases. Metagenomes are not constrained by the requirement, inherent in traditional sequence databases, that an organism be clearly identified and, at least in the case of microbes, grown in the laboratory before its genome is sequenced. Metagenomes, at least in principle, are unbiased representations of the environments from which they originate (apart from possible sequencing biases). Herein also lies the intrinsic weakness of metagenomics: strictly speaking, the organism from which a given sequence originates is never known. Sequences with high similarity to those from known organisms are readily assignable, but those from truly novel genomes can be difficult to place. A further difficulty for metagenomic discovery is the correct assembly of long contigs. Complete assembly of a typical bacterial or archaeal genome of several megabases is usually impractical, even when extremely deep sequencing is performed, and assignment of multiple contigs to a single genome is a separate, non-trivial problem.</p>
<p>The difficulties of metagenomic genome discovery are alleviated to a considerable extent when it comes to discovery of new virus genomes [
<xref ref-type="bibr" rid="CR8">8</xref>
–
<xref ref-type="bibr" rid="CR10">10</xref>
]. Viral genomes are relatively small: for many important and abundant virus families, they are only several kilobases (kb) long, so that assembly of a complete viral genome is often feasible. Furthermore, viruses possess signature genes, such as those encoding capsid proteins, that allow novel genomes to be assigned to a particular virus family even in the absence of close relatives. The flip side of the coin is that viruses typically evolve much faster than cellular life forms, so that finding homologs of virus genes is often challenging. Hence the special importance of advanced methods for sequence and structure analysis in virus discovery.</p>
<p>Virus metagenomics is a young research direction but can already boast several stories of conspicuous success in the discovery of novel groups of (putative) viruses with unexpected features. Among the most notable ones is the identification of the class of chimeric single-stranded (ss) DNA viruses that combine genes from known families of positive-strand RNA viruses and ssDNA viruses. The first discovery of a chimeric virus [
<xref ref-type="bibr" rid="CR11">11</xref>
] was followed by a systematic effort on metagenome mining, which led to the identification of several diverse groups of naturally chimeric virus genomes [
<xref ref-type="bibr" rid="CR12">12</xref>
], and finally, by a serendipitous discovery of an actual virus with a chimeric genome [
<xref ref-type="bibr" rid="CR13">13</xref>
]. These findings have substantially enriched the available collection of ssDNA virus genomes and, more importantly, have changed the existing picture of the evolution of this class of viruses. An equally spectacular discovery of metagenomics is the identification of a novel bacteriophage that is more abundant than any other phage in the human gut but remained unnoticed until metagenome mining became highly efficient [
<xref ref-type="bibr" rid="CR14">14</xref>
]. Yet another notable case in point is the use of a signature enzyme of the giant viruses in the family
<italic>Mimiviridae</italic>
, glutamine-hydrolyzing asparagine synthase, as the bait to search metagenomic databases, resulting in a substantial expansion of this virus family [
<xref ref-type="bibr" rid="CR15">15</xref>
]. There are more examples of successful application of metagenome mining for the discovery of new viruses but the above should suffice to illustrate the utility and promise of this approach. Although virus metagenomics cannot provide direct data on the structure and functionality of the discovered viruses, through genome analysis, it yields a wealth of predictions that are amenable to experimental validation.</p>
<p>We are interested in a class of viruses and virus-like transposable elements that includes Polintons (Mavericks) and virophages. Polintons are large (genomes of 15 to 20 kb, the largest among the eukaryotic transposons), self-replicating transposable elements that are integrated into the genomes of diverse unicellular and multicellular eukaryotes in highly variable numbers of copies [
<xref ref-type="bibr" rid="CR16">16</xref>
–
<xref ref-type="bibr" rid="CR20">20</xref>
]. All Polintons encode a protein-primed DNA polymerase and a retrovirus-like integrase (hence the name of these elements: POLINTons). The majority of Polintons also encode homologs of the DNA-packaging ATPase and the maturation protease that are characteristic of diverse double-stranded (ds) DNA viruses. Thus, Polintons have often been considered virus-like transposons, although no structural proteins were initially detected. To resolve this conundrum, we recently performed an exhaustive computational analysis of the Polinton-encoded protein sequences and showed that most Polintons encode putative major and minor capsid proteins (MCP and mCP, respectively) [
<xref ref-type="bibr" rid="CR21">21</xref>
]. Although sequence similarity between these proteins and capsid proteins of other viruses, such as the large and giant viruses that constitute the putative order “Megavirales”, is low, homology modeling indicates that MCP and mCP respectively adopt intact double and single jelly-roll folds, strongly suggesting that Polintons are capable of forming virions [
<xref ref-type="bibr" rid="CR21">21</xref>
]. Thus, we have hypothesized that Polintons lead a dual life style and, in addition to behaving like typical transposons, are capable of forming virions, hence the proposed name polintoviruses [
<xref ref-type="bibr" rid="CR21">21</xref>
,
<xref ref-type="bibr" rid="CR22">22</xref>
]. However, the actual polintovirus virions remain to be identified; therefore, we use the term Polintons below unless putative virus particles are explicitly considered.</p>
<p>A distinct Polinton-like transposable element called Tlr1 is present in the genome of the ciliate
<italic>Tetrahymena thermophila</italic>
[
<xref ref-type="bibr" rid="CR23">23</xref>
]. Although Tlr1 does not encode the DNA polymerase, it shares with
<italic>bona fide</italic>
Polintons the retroviral-type integrase (RVE), DNA-packaging ATPase and both capsid proteins [
<xref ref-type="bibr" rid="CR21">21</xref>
]. In addition, Tlr1 encodes several other proteins, including a PIF1-like superfamily 1 helicase, that are shared with some members of the proposed order “Megavirales” [
<xref ref-type="bibr" rid="CR22">22</xref>
,
<xref ref-type="bibr" rid="CR23">23</xref>
].</p>
<p>Extensive comparative analysis of the genomes of dsDNA viruses infecting eukaryotes as well as dsDNA plasmids and transposons has led to an evolutionary scenario in which polintoviruses evolved directly from bacteriophages of the family
<italic>Tectiviridae</italic>
and played a central role in the origin and evolution of diverse selfish elements in eukaryotes including giant viruses of the proposed order “Megavirales” [
<xref ref-type="bibr" rid="CR22">22</xref>
,
<xref ref-type="bibr" rid="CR24">24</xref>
].</p>
<p>The viruses that are arguably the closest relatives of the Polintons are the virophages, unusual small DNA viruses that parasitize giant viruses of the family
<italic>Mimiviridae</italic>
[
<xref ref-type="bibr" rid="CR25">25</xref>
–
<xref ref-type="bibr" rid="CR27">27</xref>
]. The genome size and organization of the virophages show striking similarity to the Polintons (polintoviruses) except that the virophage genomes are apparently circular [
<xref ref-type="bibr" rid="CR28">28</xref>
,
<xref ref-type="bibr" rid="CR29">29</xref>
]. The virophage genomes are about 20 to 30 kb in size and encode the MCP and mCP, the packaging ATPase and the maturation protease along with a heterogeneous set of other genes [
<xref ref-type="bibr" rid="CR29">29</xref>
]. Mavirus-like virophages also carry genes for a protein-primed DNA polymerase (pDNAP) and an RVE integrase [
<xref ref-type="bibr" rid="CR28">28</xref>
]. Thus, the Maviruses effectively qualify as polintoviruses except that so far integration into the host genome has not been demonstrated.</p>
<p>Virophages were originally isolated as virus particles from preparations of giant viruses. Efforts to discover new virophages in metagenomic databases have yielded several relatives of the previously characterized virophages, primarily from the thermal Yellowstone Lake [
<xref ref-type="bibr" rid="CR30">30</xref>
,
<xref ref-type="bibr" rid="CR31">31</xref>
]. In addition, our previous search of the metagenomic sequences available in GenBank has led to the identification of a putative novel group of virophages in the sheep rumen metagenome [
<xref ref-type="bibr" rid="CR32">32</xref>
]. Notably, these rumen virophages (RVP), in addition to the typical virophage major (but not minor) capsid protein, ATPase and protease, encode a Polinton-type pDNAP and thus appear to be hybrids between virophages and Polintons.</p>
<p>So far all searches for putative new virophages in metagenomic sequences have employed the virophage MCP as the initial bait [
<xref ref-type="bibr" rid="CR30">30</xref>
–
<xref ref-type="bibr" rid="CR32">32</xref>
]. The virophage MCP is a structurally highly derived version of the double jelly-roll fold, to the extent that sequence similarity with other capsid proteins, including those from Polintons, is virtually undetectable [
<xref ref-type="bibr" rid="CR33">33</xref>
]. Therefore, these searches appear to be inherently limited with respect to the range of viruses that can be detected, i.e. are likely to identify only viruses that possess the virophage variety of MCP.</p>
<p>An unusual putative virophage has been discovered in DNA preparations of
<italic>Phaeocystis globosa</italic>
virus (PGV) infecting an abundant marine haptophyte (chromist alga) [
<xref ref-type="bibr" rid="CR34">34</xref>
]. The PGV virophage (PgVV) has a linear genome of approximately 20 kb containing long terminal inverted repeats (TIR) and apparently can integrate into the PGV genome. Only three PgVV genes have been reported to encode proteins sharing significant sequence similarity to virophage proteins, namely a predicted primase, an endonuclease and an uncharacterized protein [
<xref ref-type="bibr" rid="CR34">34</xref>
]. However, in the process of searching for potential capsid proteins of Polintons, we identified a candidate MCP of PgVV, a distant member of the polintovirus MCP family [
<xref ref-type="bibr" rid="CR21">21</xref>
]. Given the distinct features of PgVV that differentiate it from both the characterized virophages and Polintons, we performed exhaustive searches of genomic and metagenomic sequences using the putative PgVV MCP as a bait. These searches, followed by extensive analysis of the retrieved contigs, yielded a diverse group of putative novel viruses.</p>
</sec>
<sec id="Sec2" sec-type="results">
<title>Results</title>
<sec id="Sec3">
<title>Sequence database screening for putative PgVV-like virus genomes</title>
<p>The following strategy was developed for comprehensive identification of the putative viruses encoding a PgVV-like putative MCP in genomic and metagenomic databases (see
<xref rid="Sec8" ref-type="sec">Methods</xref>
for details). The sequence of the predicted MCP of PgVV was first used as a query in a PSI-BLAST search of the non-redundant protein sequence (nr) database at the NCBI. This search detected homologs of the PgVV MCP in five eukaryotic genomes: the algae
<italic>Monoraphidium neglectum</italic>
(gi|761972244);
<italic>Aureococcus anophagefferens</italic>
(gi|676398223);
<italic>Chlamydomonas reinhardtii</italic>
(gi|159489398); and
<italic>Guillardia theta</italic>
CCMP2712 (gi|551629560, gi|551640847, gi|551645334) (three homologs in the latter genome); and the carnivorous plant
<italic>Genlisea aurea</italic>
(gi|527182119). Examination of the genomic surroundings of the putative MCP genes resulted in the identification of several genes typical of Polintons in
<italic>M. neglectum</italic>
and
<italic>G. theta</italic>
(Fig. 
<xref rid="Fig1" ref-type="fig">1</xref>
; see details below), whereas the other three genomes contained solitary MCP genes, conceivably remnants of genome invasion by PgVV-related elements. All PgVV-like MCP proteins were used as queries for TBLASTN searches against two metagenomic databases, CAMERA and marine metagenomes. All hits were collected and translated (the CAMERA hits were assembled before translation), and the resulting protein sequences were employed as queries to search the nr database using BLASTP. Metagenomic sequences for which the encoded protein produced the best hits to one of the query MCP proteins were retained. The collected metagenomic sequences were extended whenever possible using alternating cycles of BLASTN searches and assembly with Geneious (see
<xref rid="Sec8" ref-type="sec">Methods</xref>
for details). This procedure allowed us to extend even some of the Tara Oceans sequences that have been assembled prior to database submission.
<fig id="Fig1">
<label>Fig. 1</label>
<caption>
<p>Genome architectures of the Polinton-like viruses (PLV): complete genomes from identified sources. Genes are shown roughly to scale. Homologous genes are color-coded as shown in the inset. Homologous genes without predictable function (activity) are marked by same letters. Arrows represent terminal inverted repeats; their lengths and percent identity are indicated near the rightmost repeat. AEP, archaeo-eukaryotic primase; Dcm, methyltransferase of the Dcm family; GIY, GIY-YIG family nuclease; MCP, major capsid protein; mCP, minor capsid protein; PolB, protein-primed polymerase of family B; primpol, primase-polymerase; S1H, superfamily 1 helicase (distinct from the Tlr helicase in Figs. 
<xref rid="Fig2" ref-type="fig">2</xref>
and
<xref rid="Fig3" ref-type="fig">3</xref>
); S3H, superfamily 3 helicase; TVpol, transposon-viral polymerase; Yrec (OLV11), OLV11-like tyrosine recombinase (integrated PLV from
<italic>G. theta</italic>
encode a distinct tyrosine recombinase only distantly related to the OLV11-like family)</p>
</caption>
<graphic xlink:href="12915_2015_207_Fig1_HTML" id="MO1"></graphic>
</fig>
</p>
<p>The sequences of the PgVV MCP homologs that were collected through the iterative procedure described above were used as queries in an exhaustive search for homologous MCPs in metagenomes. Specifically, 20 diverse representatives were chosen using BLASTClust; for each of these, the TBLASTN search (e-value ≤10) against marine metagenomes was repeated. All contigs with hits were translated and subjected to two searches, namely BLASTP against nr and a profile-based version of TBLASTN against the marine metagenomes, with the position-specific scoring matrix derived from the alignment of all detected MCPs used as the query. The contigs encoding proteins with best hits to one of the identified MCPs or matching the MCP profile were collected. This procedure yielded nearly 300 marine metagenome contigs encoding putative PgVV-like MCPs. The 20 longer contigs that contained additional genes homologous to genes of polintoviruses or virophages were examined in detail (see Additional file
<xref rid="MOESM1" ref-type="media">1</xref>
), primarily by exhaustive analysis of the encoded protein sequences using sensitive database search methods such as PSI-BLAST and HHpred.</p>
</sec>
<sec id="Sec4">
<title>Genomic architectures and gene complements of putative novel Polinton-like viruses</title>
<p>Comprehensive analysis of the putative proteins encoded by the genes in the neighborhoods of the detected PgVV-like MCP genes involved PSI-BLAST search against the nr database, HHpred search against the Pfam and Interpro databases, as well as custom profiles for poorly conserved virus proteins, e.g. mCP. These searches resulted in the determination of the evolutionary provenance and functional predictions for many genes and suggest the existence of a broad range of diverse Polinton-like viruses (PLV) (Figs. 
<xref rid="Fig1" ref-type="fig">1</xref>
,
<xref rid="Fig2" ref-type="fig">2</xref>
and
<xref rid="Fig3" ref-type="fig">3</xref>
). Many of the PLV contigs contain long TIR, resembling the genome structure of PgVV (Figs. 
<xref rid="Fig1" ref-type="fig">1</xref>
,
<xref rid="Fig2" ref-type="fig">2</xref>
and
<xref rid="Fig3" ref-type="fig">3</xref>
), suggesting that these are complete genomes within the polintovirus and virophage genome size range, namely between 18 and 28 kb. Notably, one PLV contig contains long direct terminal repeats suggestive of a circular genome (Fig. 
<xref rid="Fig3" ref-type="fig">3</xref>
). Four of the PLV genomes are integrated in sequenced algal genomes, as indicated by the identification of sequences spanning the junctions between the putative viral and host genomes. Three such elements were detected in the
<italic>G. theta</italic>
genome and one in the genome of
<italic>M. neglectum</italic>
(Fig. 
<xref rid="Fig1" ref-type="fig">1</xref>
). Notably, another PLV genome belongs to a poorly characterized spherical virus that has been isolated from the marine algae
<italic>Tetraselmis viridis</italic>
and
<italic>T. striata</italic>
[
<xref ref-type="bibr" rid="CR35">35</xref>
–
<xref ref-type="bibr" rid="CR37">37</xref>
]. The rest of the PLV genomes are assembled metagenomic contigs and thus their hosts cannot be taxonomically assigned.
<fig id="Fig2">
<label>Fig. 2</label>
<caption>
<p>Genome architectures of the Polinton-like viruses: genomic contigs extracted and assembled from metagenomes, the PgVV-like group. Tlr1 helicase, a superfamily 1 helicase similar to the helicase encoded by the Polinton-like element Tlr1 from
<italic>Tetrahymena</italic>
. The other designations and color coding are the same as in Fig. 
<xref rid="Fig1" ref-type="fig">1</xref>
</p>
</caption>
<graphic xlink:href="12915_2015_207_Fig2_HTML" id="MO2"></graphic>
</fig>
<fig id="Fig3">
<label>Fig. 3</label>
<caption>
<p>Genome architectures of the Polinton-like viruses: genomic contigs extracted and assembled from metagenomes, the TVS-like group and unclassified contigs. The designations and color coding are the same as in Fig. 
<xref rid="Fig1" ref-type="fig">1</xref>
</p>
</caption>
<graphic xlink:href="12915_2015_207_Fig3_HTML" id="MO3"></graphic>
</fig>
</p>
<p>The PLV were detected in samples from various locations and filter fractions, but primarily in the virus fraction, i.e. below 0.22 μm (see Additional file
<xref rid="MOESM2" ref-type="media">2</xref>
). Importantly, although the assembly was performed “blindly”, i.e. from any segments present in the metagenomic database, nearly all of the assembled genomes of the putative PLV consisted of sequences originating from the same sampling location (Additional file
<xref rid="MOESM2" ref-type="media">2</xref>
), compatible with the validity of the assembly. Therefore, we named the PLV according to the sampling location: ACE, Ace Lake (Antarctica); INO, Indian Ocean; MED, Mediterranean; RED, Red Sea; SAF, South Africa; SPO, South Pacific Ocean; and YSL, Yellowstone Lakes.</p>
<p>By the design of the search procedure, all PLV genomes encode an MCP. Several of the PLV show duplication or even triplication of the MCP gene that previously has not been detected in Polintons or virophages (Figs. 
<xref rid="Fig2" ref-type="fig">2</xref>
and
<xref rid="Fig3" ref-type="fig">3</xref>
) but has been observed in mimiviruses [
<xref ref-type="bibr" rid="CR38">38</xref>
]. In addition, most of the PLV genomes, and in particular, all that contain TIR, suggestive of completeness, encode a mCP and a packaging ATPase (Figs. 
<xref rid="Fig1" ref-type="fig">1</xref>
,
<xref rid="Fig2" ref-type="fig">2</xref>
and
<xref rid="Fig3" ref-type="fig">3</xref>
). These three genes represent the (nearly) universal core of PLV genes that encode components of the virus morphogenetic module. The identification of the mCP merits special comment. The mCP sequences are extremely poorly conserved so that the prediction of the mCP in Polintons required multiple, exhaustive database searches. Even then, no mCP has been identified in PgVV. However, multiple PSI-BLAST iterations with all the predicted proteins of PLV yielded several connections with the predicted mCP of Polintons. For example, a PSI-BLAST search initiated with the sequence of a hypothetical protein encoded by gene 7 of
<italic>G. theta</italic>
element 2 showed statistically significant similarity (e-value &lt;0.001) to the predicted mCPs of several Polintons, as well as those of “Megavirales”, starting with the second search iteration. The proteins with significant similarity to the predicted Polinton mCP are conserved in nearly all PLV (Figs. 
<xref rid="Fig1" ref-type="fig">1</xref>
,
<xref rid="Fig2" ref-type="fig">2</xref>
and
<xref rid="Fig3" ref-type="fig">3</xref>
). Taken together, these observations indicate that (nearly) all PLV encode both MCP and mCP.</p>
<p>Notably, none of the PLV encodes the capsid maturation protease that is present in all virophages and nearly all Polintons. Apart from the three core genes, the gene distribution in the PLV is patchy (Figs. 
<xref rid="Fig1" ref-type="fig">1</xref>
,
<xref rid="Fig2" ref-type="fig">2</xref>
and
<xref rid="Fig3" ref-type="fig">3</xref>
). Several PLV encode a pDNAP but only the integrated element from
<italic>M. neglectum</italic>
encodes an RVE integrase. Thus, in general, the PLV do not qualify as Polintons, for which the pDNAP and RVE integrase are universal signatures. The element from
<italic>M. neglectum</italic>
seems to be the only exception as it encodes both pDNAP and RVE, and thus could be considered a Polinton although its MCP is clearly more similar to those of the PLV.</p>
<p>Several other genes of the PLV are shared with subsets of Polintons and/or virophages including putative primase-superfamily 3 helicase, superfamily 1 helicase, GIY-YIG endonuclease, lipase and two uncharacterized genes, Tlr6f and Organic Lake virophage gene 11 product, OLV11 (Figs. 
<xref rid="Fig1" ref-type="fig">1</xref>
,
<xref rid="Fig2" ref-type="fig">2</xref>
and
<xref rid="Fig3" ref-type="fig">3</xref>
). Unexpectedly, our database searches showed that the OLV11 protein that is conserved in several of the PLV and the Organic Lake group of virophages is a member of the tyrosine recombinase superfamily that is extremely widespread in bacteria, archaea and their viruses, as well as bacterial and some eukaryotic transposons [
<xref ref-type="bibr" rid="CR39">39</xref>
] (Additional file
<xref rid="MOESM3" ref-type="media">3</xref>
). OLV11-like recombinases are also encoded by PgVV and Aureococcus anophagefferens virus (gi|672551235; 2nd PSI-BLAST iteration, E = 2e-04), a member of the “Megavirales”. The three elements integrated in the
<italic>G. theta</italic>
genome encode a distinct subgroup of tyrosine recombinases (Fig. 
<xref rid="Fig1" ref-type="fig">1</xref>
). Finally, Sputnik-like virophages encode tyrosine recombinases [
<xref ref-type="bibr" rid="CR25">25</xref>
] that are not recognizably similar to either OLV11-like predicted recombinases or the recombinases encoded by the
<italic>G. theta</italic>
PLV. These findings suggest that different PLV, virophages and Polintons employ unrelated or distantly related enzymes for integration into the host genomes and have acquired the corresponding recombinase genes on multiple, independent occasions. A new common theme among the PLV is the presence of genes encoding predicted DNA methylases of the Dam and Dcm families in several genomes (Figs. 
<xref rid="Fig1" ref-type="fig">1</xref>
,
<xref rid="Fig2" ref-type="fig">2</xref>
and
<xref rid="Fig3" ref-type="fig">3</xref>
).</p>
<p>Of special note is the abundance of genes encoding diverse lipases in PLV and virophages. In particular, we identified a previously unnoticed putative lipase in the genome of PgVV (ORF6; HHpred hit to cd00519, Lipase_3, Probability = 96.8). Notably, some Polintons encode lipases most closely related to the PLA2 phospholipase domain that is tethered to the capsid proteins of parvoviruses. In a close parallel to the recombinases, these lipase genes appear to have been acquired independently on several occasions. By analogy with parvoviruses that employ the lipase activity to disrupt the cellular endosomal membrane during viral entry [
<xref ref-type="bibr" rid="CR40">40</xref>
,
<xref ref-type="bibr" rid="CR41">41</xref>
], it seems likely that the lipases of the PLV, Polintons (polintoviruses) and virophages are packed into the virions and facilitate virus penetration into the host cells.</p>
<p>The small protein Tlr6f is conserved in nearly all PLV, several virophages, numerous members of the “Megavirales” and some poorly characterized phages, suggestive of an important role in the reproduction of diverse viruses, but shows no detectable similarity to any domains with known structure or function. Another uncharacterized gene, provisionally denoted G in Figs. 
<xref rid="Fig1" ref-type="fig">1</xref>
,
<xref rid="Fig2" ref-type="fig">2</xref>
and
<xref rid="Fig3" ref-type="fig">3</xref>
, encodes a small protein without detectable similarity to any characterized domains that is conserved in a variety of PLV and also many phycodnaviruses and bacteria (Additional file
<xref rid="MOESM3" ref-type="media">3</xref>
).</p>
</sec>
<sec id="Sec5">
<title>Evolutionary relationships of the PLV genes</title>
<p>To gain insight into the evolution of the PLV, we analyzed phylogenetic trees for the MCP, the packaging ATPase and the pDNAP (the sequence conservation of the mCP is too low to produce a reliable phylogeny). The MCP sequences of the PLV were supplemented with a representative sample of the “solitary” MCPs that were detected in our search of the metagenomic and genomic sequences, and subjected to phylogenetic analysis jointly with a selection of the Polinton MCPs that appear to be the closest homologs identified in database searches (see Additional files
<xref rid="MOESM3" ref-type="media">3</xref>
and
<xref rid="MOESM4" ref-type="media">4</xref>
). The tree was rooted using the MCP sequences of phycodnaviruses and a mimivirus, apparently the next closest family [
<xref ref-type="bibr" rid="CR21">21</xref>
], as the outgroup. The resulting phylogenetic tree contains two major clades, one of which can be denoted the PgVV group and the other one the TVS group (Fig. 
<xref rid="Fig4" ref-type="fig">4</xref>
), after the best characterized representative,
<italic>Tetraselmis viridis</italic>
virus S1 (TVS1; Fig. 
<xref rid="Fig3" ref-type="fig">3</xref>
). The PgVV-like group includes all the PLV sequences shown to be integrated into algal genomes, as well as several solitary MCPs identified in genomic sequences from plants and algae. A third clade of the PLV (group X in Fig. 
<xref rid="Fig4" ref-type="fig">4</xref>
) is weakly affiliated with the TVS group. The monophyly of the PLV with respect to Polintons is moderately supported except that two Polintons from the sea anemone fall within the X group of PLV (Fig. 
<xref rid="Fig4" ref-type="fig">4</xref>
). These are the same Polinton MCPs that showed the highest similarity to the putative MCP of PgVV in the previous analysis [
<xref ref-type="bibr" rid="CR21">21</xref>
]. The respective Polintons might be chimeric elements encoding a PLV-type MCP along with pDNAP and RVE integrase characteristic of Polintons.
<fig id="Fig4">
<label>Fig. 4</label>
<caption>
<p>Phylogenetic tree of the major capsid proteins of Polinton-like viruses and related integrated elements. Proteins shown in Figs. 
<xref rid="Fig1" ref-type="fig">1</xref>
,
<xref rid="Fig2" ref-type="fig">2</xref>
and
<xref rid="Fig3" ref-type="fig">3</xref>
are marked in bold. Color code: blue, eukaryotes; green, bacteria; orange, viruses; and black, metagenomes. For protein sequences present in GenBank, the species name abbreviation and the protein identification numbers are indicated; sequences translated in this work are marked as “gene xx” followed by either contig number (for in-house assemblies) or nucleotide identification number (for GenBank metagenomic assemblies). Acapo,
<italic>Acanthamoeba polyphaga</italic>
mimivirus; Auran,
<italic>Aureococcus anophagefferens</italic>
virus; Chlre,
<italic>Chlamydomonas reinhardtii</italic>
; Genau,
<italic>Genlisea aurea</italic>
; Guil1, Guil2, Guil3,
<italic>Guillardia theta</italic>
elements 1, 2, 3, respectively; NV,
<italic>Nematostella vectensis</italic>
; PBCV-1,
<italic>Paramecium bursaria</italic>
Chlorella virus type 1; PGVV,
<italic>Phaeocystis globosa</italic>
virus virophage; Protbac,
<italic>Proteobacteria bacterium</italic>
JGI 0000113; TVS1,
<italic>Tetraselmis viridis</italic>
virus S1</p>
</caption>
<graphic xlink:href="12915_2015_207_Fig4_HTML" id="MO4"></graphic>
</fig>
</p>
<p>The phylogenetic tree of the packaging ATPases provides for inclusion of homologs from diverse sources, in particular Polintons, virophages and members of the “Megavirales” [
<xref ref-type="bibr" rid="CR22">22</xref>
]. However, the resolution power of the phylogenetic analysis in this case is low because the multiple alignment of the ATPases underlying the tree includes only 165 phylogenetically informative positions (see Additional files
<xref rid="MOESM3" ref-type="media">3</xref>
and
<xref rid="MOESM4" ref-type="media">4</xref>
). The topology of the ATPase tree is generally compatible with the MCP phylogeny in that both the PgVV and TVS groups come across as monophyletic (Fig. 
<xref rid="Fig5" ref-type="fig">5</xref>
). However, the monophyly of all PLV was not recovered because the members of group X of the PLV clustered with Polintons and members of the “Megavirales”, and in addition, the TVS group clustered with poxviruses (Fig. 
<xref rid="Fig5" ref-type="fig">5</xref>
). It remains unclear whether these affiliations in the ATPase tree reflect independent acquisition or replacement of this gene in different groups of PLV or are due to long-branch attraction and other phylogenetic artifacts. Despite the consistent segregation of the PgVV-like and TVS-like groups of PLV in phylogenetic analyses, the gene repertoires of the two groups are closely similar, which is indicative of the coherence of the PLV as a whole (Figs. 
<xref rid="Fig2" ref-type="fig">2</xref>
and
<xref rid="Fig3" ref-type="fig">3</xref>
).
<fig id="Fig5">
<label>Fig. 5</label>
<caption>
<p>Phylogenetic tree of the maturation ATPases of Polinton-like viruses, Polintons, virophages and related elements. Branches with bootstrap support less than 50 were collapsed. Sequence labeling is the same as in Fig. 
<xref rid="Fig4" ref-type="fig">4</xref>
. Genau,
<italic>Genlisea aurea</italic>
; Guil1, Guil2, Guil3,
<italic>Guillardia theta</italic>
elements 1, 2, 3, respectively; HM,
<italic>Hydra magnipapillata</italic>
; Micpu,
<italic>Micromonas pusilla</italic>
; NV,
<italic>Nematostella vectensis</italic>
; PGVV,
<italic>Phaeocystis globosa</italic>
virus virophage; Polpa,
<italic>Polysphondylium pallidum</italic>
; Proba,
<italic>Proteobacteria bacterium</italic>
JGI 0000113-E04; Tetth,
<italic>Tetrahymena thermophila</italic>
; TVS1,
<italic>Tetraselmis viridis</italic>
virus S1; TVS,
<italic>Tetraselmis viridis</italic>
virus S1 group</p>
</caption>
<graphic xlink:href="12915_2015_207_Fig5_HTML" id="MO5"></graphic>
</fig>
</p>
<p>As observed previously for virophages and Polintons [
<xref ref-type="bibr" rid="CR29">29</xref>
], analysis of the PLV genes implicated in genome replication revealed complicated relationships suggestive of complex evolution. The pDNAPs of the PLV form three distinct clades (Fig. 
<xref rid="Fig6" ref-type="fig">6</xref>
; see Additional files
<xref rid="MOESM3" ref-type="media">3</xref>
and
<xref rid="MOESM4" ref-type="media">4</xref>
). The largest of these, denoted Group 1 in Fig. 
<xref rid="Fig6" ref-type="fig">6</xref>
, clusters with the pDNAPs of fungal cytoplasmic DNA plasmids. Group 2 is associated with the pDNAPs of mitochondrial plasmids, Mavirus-like virophages and a distinct subfamily of Polintons, whereas Group 3 is the sister group of the Polinton group 1 clade (Fig. 
<xref rid="Fig6" ref-type="fig">6</xref>
). Finally, the pDNAP of the
<italic>M. neglectum</italic>
element belongs to the Polinton group 2 rather than any of the PLV clades (Fig. 
<xref rid="Fig6" ref-type="fig">6</xref>
). Notably, the pDNAPs of Group 1 are represented both in the PgVV-like group and the TVS-like group of the PLV (Figs. 
<xref rid="Fig2" ref-type="fig">2</xref>
and
<xref rid="Fig3" ref-type="fig">3</xref>
). Each of these affinities has a strong bootstrap support (Fig. 
<xref rid="Fig6" ref-type="fig">6</xref>
). Thus, it appears most likely that the PLV acquired the pDNAP genes from Polintons and possibly from DNA plasmids on multiple occasions, resulting in the combination of distinct DNAPs with different morphogenetic modules.
<fig id="Fig6">
<label>Fig. 6</label>
<caption>
<p>Phylogenetic tree of the protein-primed DNA polymerases of Polinton-like viruses, Polintons, virophages and related elements. Sequence labeling is the same as in Fig. 
<xref rid="Fig4" ref-type="fig">4</xref>
. Acysu,
<italic>Acytostelium subglobosum</italic>
; Dicfa,
<italic>Dictyostelium fasciculatum</italic>
; Entin,
<italic>Entamoeba invadens</italic>
; Giain,
<italic>Giardia intestinalis</italic>
; HM,
<italic>Hydra magnipapillata</italic>
; Micpu,
<italic>Micromonas pusilla</italic>
; Morve,
<italic>Mortierella verticillata</italic>
; NV,
<italic>Nematostella vectensis</italic>
; Polpa,
<italic>Polysphondylium pallidum</italic>
; Triva,
<italic>Trichomonas vaginalis</italic>
</p>
</caption>
<graphic xlink:href="12915_2015_207_Fig6_HTML" id="MO6"></graphic>
</fig>
</p>
<p>Apart from the pDNAPs, several PLV encompass the gene for a superfamily 3 helicase (S3H) that in PgVV, PLV-YSL1 and PLV-RED1 is fused to a distinct homolog of bacterial DNA polymerase I, known as TVpol (transposon-virus polymerase). A homologous fusion protein is encoded by Sputnik and other virophages and is predicted to function as the primase-helicase in genome replication [
<xref ref-type="bibr" rid="CR42">42</xref>
]. Several other PLV encode S3H that is not directly related to that in the TVpol fusion proteins and is likely to have an independent origin (Figs. 
<xref rid="Fig2" ref-type="fig">2</xref>
and
<xref rid="Fig3" ref-type="fig">3</xref>
). In two of the three
<italic>G. theta</italic>
integrated elements, the N-terminal regions of the respective proteins are typical archaeo-eukaryotic primases, so that the entire protein has the same domain architecture as the primase-helicase of the “Megavirales”. In the rest of the PLV, the region upstream of S3H lacks detectable homologs and potentially could encompass divergent primases or inactivated derivatives thereof.</p>
</sec>
</sec>
<sec id="Sec6" sec-type="discussion">
<title>Discussion</title>
<p>The PLV comprise a diverse set of putative viruses that is defined by the distinct, PgVV-like MCP and universally shares three genes, those for the MCP, mCP (with a few uncertainties) and the packaging ATPase. It should be noted that of the numerous PgVV-like MCPs detected in the present analysis, fewer than 10 % were found in large or extendable contigs and thus were analyzed here in detail. Altogether, the PLV seem to be an abundant group of viruses that remains to be characterized through a combination of metagenomic and virological approaches.</p>
<p>The conserved genes of the PLV represent the minimal morphogenetic module that is shared with many other viruses, including Polintons, virophages, adenoviruses and the “Megavirales”, and in most of these eukaryotic viruses, additionally includes the maturation protease. This protease is conspicuously missing from the PLV. Apart from the minimal morphogenetic module, subsets of the PLV share additional genes, some of which are implicated in genome replication, with each other, and also with Polintons and virophages (Fig. 
<xref rid="Fig7" ref-type="fig">7</xref>
).
<fig id="Fig7">
<label>Fig. 7</label>
<caption>
<p>Shared genes between Polinton-like viruses, Polintons and virophages. Different shades in the same column denote distantly related proteins that most likely have been acquired independently</p>
</caption>
<graphic xlink:href="12915_2015_207_Fig7_HTML" id="MO7"></graphic>
</fig>
</p>
<p>The discovery of the PLV expands the emerging class of dsDNA viruses of eukaryotes that share several distinctive characteristics: genome size of 15 to 30 kb; icosahedral particles approximately 60 nm in size; and homologous morphogenetic modules that consist of MCP, mCP, packaging ATPase and maturation protease [
<xref ref-type="bibr" rid="CR22">22</xref>
,
<xref ref-type="bibr" rid="CR24">24</xref>
]. The morphogenetic modules can be reduced, as is the case for the RVP, which lack the mCP [
<xref ref-type="bibr" rid="CR32">32</xref>
], and the PLV, which do not encode the protease. The typical Polintons (polintoviruses) also encode pDNAP and RVE integrase and so far have been found only in the integrated state. However, the finding that Polintons encode an MCP that, according to homology modeling results, retains the typical double jelly-roll structure, as well as the mCPs, strongly suggests the existence of Polinton (polintovirus) virions [
<xref ref-type="bibr" rid="CR21">21</xref>
,
<xref ref-type="bibr" rid="CR22">22</xref>
]. As described here, several PLV are integrated into the host genomes but only one of the integrated elements encodes an RVE integrase. The genome organization of this element integrated into the genome of the alga
<italic>M. neglectum</italic>
actually resembles a
<italic>bona fide</italic>
Polinton more than a typical PLV. The remaining integrated PLV lack any enzymes that could be implicated in integration and could conceivably employ integrases of resident Polintons within the same host
<italic>in trans</italic>
. In contrast, several other PLV encode the newly identified OLV11-like tyrosine recombinase and thus can be predicted to lead a dual, Polinton-like lifestyle that combines a virus stage with an integrated stage.</p>
<p>The involvement of the OLV11-like tyrosine recombinase in the integration of the PLV into the host genome is compatible with the evidence that PgVV integrates into the genome of its helper virus, PGV [
<xref ref-type="bibr" rid="CR34">34</xref>
]. In this work, we identified a protein, the OLV11-like tyrosine recombinase encoded by PgVV open reading frame (ORF) 3, which is likely to be responsible for this integration. Notably, the recombination hotspot on the PgVV genome has been mapped to the region between ORF3 and ORF4 [
<xref ref-type="bibr" rid="CR34">34</xref>
], pinpointing the location of a putative attachment site on the virophage genome. In this respect, PgVV resembles many temperate bacterial and archaeal viruses in which the attachment sites are located next to the integrase genes [
<xref ref-type="bibr" rid="CR43">43</xref>
]. In integration reactions mediated by tyrosine recombinases, the donor DNA molecule is typically circular [
<xref ref-type="bibr" rid="CR44">44</xref>
], suggesting that the PgVV genome, and by extension the genomes of other PLV that encode the putative tyrosine recombinase, circularize prior to integration.</p>
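<p>The consequence of integrating a circular donor can be illustrated with a toy model of gene order. The sketch below is written in Python for illustration only and is not part of the analysis reported here: it assumes a hypothetical six-gene donor (the labels ORF1 to ORF6 are placeholders, not the actual PgVV gene complement), places the attachment site between ORF3 and ORF4 as inferred above, and ignores the duplication of attachment-site sequences at the junctions.</p>
<preformat><![CDATA[
```python
from typing import List


def integrate_circular(donor: List[str], att_index: int,
                       host: List[str], host_index: int) -> List[str]:
    """Open a circular donor gene order at att_index and insert it into a linear host.

    att_index is the position in `donor` immediately after the attachment site;
    host_index is the insertion point in `host`.
    """
    if not 0 <= att_index < len(donor):
        raise ValueError("att_index must point inside the donor")
    if not 0 <= host_index <= len(host):
        raise ValueError("host_index must point inside the host")
    # A circle opened at the attachment site becomes a linear, rotated gene order.
    linearized = donor[att_index:] + donor[:att_index]
    return host[:host_index] + linearized + host[host_index:]


# Placeholder gene order; index 3 opens the circle between ORF3 and ORF4.
donor = ["ORF1", "ORF2", "ORF3", "ORF4", "ORF5", "ORF6"]
host = ["hostA", "hostB"]
integrated = integrate_circular(donor, 3, host, 1)
print(integrated)
```
]]></preformat>
<p>In this toy order, the gene next to the attachment site (ORF3, standing in for the recombinase gene) ends up at one extremity of the integrated element and ORF4 at the other.</p>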
<p>The divergent recombinases encoded by the three elements integrated into the genome of
<italic>G. theta</italic>
are of further interest. In all these elements, the recombinase genes are also located close to the extremities of the integrated genome, i.e. in the proximity of the attachment sites. Notably, in one of the elements, the recombinase gene is disrupted by the insertion of a Copia-like LTR-retrotransposon (Fig. 
<xref rid="Fig1" ref-type="fig">1</xref>
), potentially leading to immobilization of this element or making it dependent on the supply of the integrase
<italic>in trans</italic>
. Given that the majority of the PLV were discovered in the virus fraction of the respective metagenomes and often lack a detectable integrase, many if not most of the PLV genomes likely originate from virus particles. The ability of the PLV to form virions is demonstrated by the fact that the experimentally characterized TVS1 belongs to this group [
<xref ref-type="bibr" rid="CR35">35</xref>
–
<xref ref-type="bibr" rid="CR37">37</xref>
].</p>
<p>As is typical of metagenome mining studies, the hosts of the PLV are unknown. Nevertheless, the integration of several PLV genomes into algal genomes, together with the fact that the only experimentally characterized virus among the PLV, TVS1, infects an alga [
<xref ref-type="bibr" rid="CR35">35</xref>
], implies that most if not all PLV are algal viruses. The association of the PgVV with a virus that belongs to the extended family
<italic>Mimiviridae</italic>
[
<xref ref-type="bibr" rid="CR34">34</xref>
] suggests the possibility that some or all of the PLV parasitize large viruses, i.e. represent a distinct variety of virophages. Integration of the Sputnik and PgVV virophages into the host virus genomes has been reported [
<xref ref-type="bibr" rid="CR34">34</xref>
,
<xref ref-type="bibr" rid="CR45">45</xref>
].</p>
<p>The morphogenetic module and the DNAPs of Polintons appear to originate from the homologous module of bacteriophages of the family
<italic>Tectiviridae</italic>
[
<xref ref-type="bibr" rid="CR22">22</xref>
]. However, tectiviruses lack the protease and the integrase that most likely have been acquired at a later stage of evolution, possibly, from a single Ginger-like transposon [
<xref ref-type="bibr" rid="CR22">22</xref>
]. Thus, the PLV might resemble the ancestral forms of polintoviruses (Polintons). However, the alternative possibility, that the PLV are derived polintoviruses, cannot be ruled out. Moreover, given that the PLV mix with polintoviruses in some phylogenetic trees, in particular, the MCP tree (Fig. 
<xref rid="Fig4" ref-type="fig">4</xref>
), multiple, independent origins of PLV from different groups of Polintons appear possible.</p>
<p>As shown previously, the RVP are chimeric viruses that combine the virophage morphogenetic module with the Polinton-derived pDNAP [
<xref ref-type="bibr" rid="CR32">32</xref>
]. Here we demonstrate that different groups of PLV defined by phylogenetic analysis of the MCP and the packaging ATPase share the pDNAPs and other replicative enzymes with different groups of Polintons and virophages. Thus, some of the PLV also appear to be chimeras, an evolutionary trend that is increasingly observed in different groups of viruses [
<xref ref-type="bibr" rid="CR24">24</xref>
,
<xref ref-type="bibr" rid="CR46">46</xref>
]. Moreover, the network-type relationship between the gene complements of the PLV, Polintons and virophages (Fig. 
<xref rid="Fig7" ref-type="fig">7</xref>
) indicates that, on the evolutionary scale, they all share a common gene pool. Apparently, this gene pool has spawned a great variety of diverse genetic elements. The PLV are unlikely to be the last group of viruses in this class to be discovered.</p>
</sec>
<sec id="Sec7" sec-type="conclusion">
<title>Conclusions</title>
<p>Metagenomic database mining increasingly leads to the discovery of novel groups of organisms. Identification of new viruses is particularly straightforward given the comparatively small size of viral genomes, but presents additional challenges due to the typical fast evolution of viruses resulting in difficulties with respect to the detection of homologous relationships. Here we report a comprehensive metagenomic database search complemented by the use of sensitive methods to detect homologous proteins that resulted in the discovery of a novel group of Polinton-like viruses. These putative viruses resemble Polintons (polintoviruses) in the overall genome organization but possess a distinct form of the major capsid protein and a minimal morphogenetic module lacking the maturation protease that is typical of Polintons and virophages. With a single exception, the PLV also lack the RVE integrase that is encoded in the genomes of all Polintons and the Mavirus group of virophages. However, several PLV encode a predicted novel tyrosine recombinase that could provide an alternative route of integration. Although we identified several PLV genomes and individual genes integrated into eukaryotic genomes, it appears likely that most of the PLV are viruses. Given the lack of protease and RVE integrase, which appear to be relatively late acquisitions during the evolution of Polintons, the PLV could resemble the ancestral polintoviruses that evolved from bacterial tectiviruses, although the possibility that the PLV are degraded derivatives of polintoviruses cannot be ruled out either.</p>
<p>Apart from the conserved minimal morphogenetic module, the PLV widely differ in their gene repertoires but share a network of homologous genes with Polintons and virophages. Although we explicitly analyzed only 20 PLV genomes that could be assembled from metagenomic contigs to (near) completion, the overall number of detected PLV-type MCPs is much greater, indicating that these viruses are common, at least in marine habitats. To summarize, Polintons (polintoviruses), PLV and virophages are widespread among eukaryotes, share a common gene pool and appear to represent an emerging major class of eukaryotic viruses and transposons. Most likely, new families of viruses and other mobile elements within this class remain to be discovered.</p>
</sec>
<sec id="Sec8" sec-type="materials|methods">
<title>Methods</title>
<sec id="Sec9">
<title>Metagenomic database screening</title>
<p>The sequence of the PgVV MCP was first used as a query in a PSI-BLAST search [
<xref ref-type="bibr" rid="CR47">47</xref>
] of the non-redundant protein sequence database (nr) at the NCBI. This search detected homologs of the PgVV MCP in five eukaryotic genomes. PgVV MCP and its eukaryotic homologs found in the nr database were used as queries for TBLASTN searches (e-value ≤10) against two metagenomic databases, CAMERA [
<xref ref-type="bibr" rid="CR48">48</xref>
] and the NCBI Whole Genome Shotgun (WGS) contigs database [
<xref ref-type="bibr" rid="CR49">49</xref>
].</p>
<p>After the metagenomic PLV fragments were assembled, translated and validated (as described below), a preliminary phylogenetic tree for all detected MCP-like proteins of PLV was constructed using FastTree [
<xref ref-type="bibr" rid="CR50">50</xref>
]. Using this tree as a guide, 20 diverse representatives were selected, and the metagenomic databases were screened once again, as described above.</p>
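<p>The e-value filter applied to the TBLASTN hits can be sketched as follows. This is a minimal illustration in Python, not the pipeline used in this work: it assumes that the hits were exported in the standard 12-column BLAST+ tabular format (-outfmt 6), keeps the best-scoring hit per database sequence as one possible way to remove duplicates, and uses invented query names, read identifiers and scores.</p>
<preformat><![CDATA[
```python
from typing import Dict, Iterable, Iterator, List, NamedTuple


class Hit(NamedTuple):
    query: str
    subject: str
    evalue: float
    bitscore: float


def parse_tabular(lines: Iterable[str]) -> Iterator[Hit]:
    """Parse 12-column tabular BLAST output (columns 11 and 12: e-value, bit score)."""
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        yield Hit(fields[0], fields[1], float(fields[10]), float(fields[11]))


def select_subjects(hits: Iterable[Hit], max_evalue: float = 10.0) -> List[Hit]:
    """Keep the best-scoring hit per subject among hits with e-value <= max_evalue."""
    best: Dict[str, Hit] = {}
    for hit in hits:
        if hit.evalue > max_evalue:
            continue
        current = best.get(hit.subject)
        if current is None or hit.bitscore > current.bitscore:
            best[hit.subject] = hit
    return sorted(best.values(), key=lambda hit: hit.bitscore, reverse=True)


# Invented records: qseqid, sseqid, pident, length, mismatch, gapopen,
# qstart, qend, sstart, send, evalue, bitscore.
example = [
    "PgVV_MCP\tread_001\t34.5\t210\t130\t4\t10\t215\t3\t632\t2e-20\t95.1",
    "PgVV_MCP\tread_002\t25.0\t80\t58\t2\t30\t108\t12\t251\t12.0\t28.5",
    "Guil1_MCP\tread_001\t36.0\t205\t125\t3\t12\t214\t9\t623\t5e-22\t101.0",
    "Guil1_MCP\tread_003\t28.1\t150\t100\t5\t40\t188\t1\t450\t0.5\t40.2",
]
selected = select_subjects(parse_tabular(example))
for hit in selected:
    print(hit.subject, hit.query, hit.evalue)
```
]]></preformat>
<p>In this invented example, read_002 is discarded because its e-value exceeds 10, and read_001 is reported once, with the query that gave the higher bit score.</p>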
</sec>
<sec id="Sec10">
<title>CAMERA sequence assembly</title>
<p>Because the CAMERA database is mostly composed of unassembled reads, the hits obtained from CAMERA were assembled using Geneious Pro 8.0.2 (
<ext-link ext-link-type="uri" xlink:href="http://www.geneious.com">www.geneious.com</ext-link>
), separately for each query. Resulting contigs were translated using GeneMark [
<xref ref-type="bibr" rid="CR51">51</xref>
] and checked for the presence of a PLV MCP-like protein, using either (i) the first BLASTP nr hits of the GeneMark-translated proteins, or (ii) a PSI-BLAST search, initiated by a profile made from all MCPs detected up to that point, against the GeneMark-translated proteins. The contigs encoding proteins with best hits to one of the identified MCPs or matching the MCP profile were collected. Only contigs encoding PLV-like MCPs (hereinafter seeds) were taken to the next step.</p>
<p>For each seed, a BLASTN search of the terminal regions (400 nt) against CAMERA was performed using MegaBLAST [
<xref ref-type="bibr" rid="CR52">52</xref>
]. Highly similar (97 % identical nucleotides) hits were collected and assembled with the seed (Geneious Pro 8.0.3,
<italic>de novo</italic>
assembly algorithm). The resulting contig was used as a seed again, and the cycle was repeated until the contig could not be extended any further. To validate and refine the final assembly, the last seed was searched against CAMERA using MegaBLAST, and highly similar (97 % identical nucleotides) hits were collected and assembled in Geneious using the last seed as a guide (combined mapping and
<italic>de novo</italic>
assembly workflow). All final contigs were manually checked for assembly errors.</p>
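<p>The iterative extension cycle described above can be sketched as a toy procedure. The Python code below is an illustration under strong simplifying assumptions and does not reproduce MegaBLAST or the Geneious assembler: reads are matched to the terminal window of the contig by ungapped comparison on one strand only, the first match in each read is used, and the function names, the 8-nt window and the example sequence are hypothetical choices made for brevity (the procedure described above used 400-nt terminal regions).</p>
<preformat><![CDATA[
```python
from typing import List


def identity(a: str, b: str) -> float:
    """Fraction of identical positions between two equal-length strings."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


def extend_right(contig: str, reads: List[str], window: int, min_identity: float) -> str:
    """Append the longest read overhang that follows a match to the last `window` nt."""
    tail = contig[-window:]
    best = ""
    for read in reads:
        for start in range(len(read) - window + 1):
            if identity(tail, read[start:start + window]) >= min_identity:
                overhang = read[start + window:]
                if len(overhang) > len(best):
                    best = overhang
                break  # use the first match in each read
    return contig + best


def extend_left(contig: str, reads: List[str], window: int, min_identity: float) -> str:
    """Mirror image of extend_right, implemented by reversing all strings."""
    flipped = extend_right(contig[::-1], [r[::-1] for r in reads], window, min_identity)
    return flipped[::-1]


def extend_contig(seed: str, reads: List[str], window: int = 8,
                  min_identity: float = 0.97, max_cycles: int = 100) -> str:
    """Repeat terminal searches and extension until the contig stops growing."""
    if len(seed) < window:
        raise ValueError("seed must be at least as long as the terminal window")
    contig = seed
    for _ in range(max_cycles):  # the cap guards against endless growth on repeats
        longer = extend_right(contig, reads, window, min_identity)
        longer = extend_left(longer, reads, window, min_identity)
        if len(longer) == len(contig):
            break
        contig = longer
    return contig


# Invented 60-nt sequence, four overlapping reads and a 16-nt seed.
genome = ("ATGCGTACCT" "GAAGTCTAGG" "CATTCGGACT"
          "TAGCCGATAT" "CGGTCAAGCT" "TGACCGTAAC")
reads = [genome[0:25], genome[15:40], genome[30:55], genome[45:60]]
seed = genome[20:36]
print(extend_contig(seed, reads) == genome)
```
]]></preformat>
<p>With an 8-nt window, the 97 % threshold amounts to an exact match, because a single mismatch already lowers the identity to 87.5 %; with the 400-nt terminal regions used in this work, the same threshold tolerates up to 12 mismatches.</p>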
</sec>
<sec id="Sec11">
<title>WGS sequence assembly</title>
<p>The sequences obtained by the TBLASTN searches against the WGS database were collected and translated by GeneMark. Sequences that did not encode PLV-like MCPs were filtered out as described above for CAMERA. All remaining sequences belonged to the marine metagenome subset of WGS metagenomic data, and accordingly, all subsequent BLASTN searches were performed against marine metagenomes. The collected metagenomic sequences were extended whenever possible using alternating cycles of BLASTN searches and assembly with Geneious. This procedure allowed extension of some of the Tara Oceans sequences [
<xref ref-type="bibr" rid="CR7">7</xref>
] that had been assembled prior to database submission.</p>
</sec>
<sec id="Sec12">
<title>Protein sequence analysis</title>
<p>The sequences of contigs obtained as described above were translated using GeneMark, and the resulting protein sequences were used as queries to search the nr database using PSI-BLAST, the Conserved Domain Database (CDD) using RPS-BLAST [
<xref ref-type="bibr" rid="CR53">53</xref>
], and the CDD and Pfam databases using HHpred [
<xref ref-type="bibr" rid="CR54">54</xref>
].</p>
</sec>
<sec id="Sec13">
<title>Phylogenetic analysis</title>
<p>The MCP, ATPase and pDNAP protein sequences collected previously [
<xref ref-type="bibr" rid="CR29">29</xref>
] were pooled with the respective PLV proteins and their best hits from the nr database. The MCP sequences were aligned with PROMALS3D, with the PBCV MCP structure used as a template [
<xref ref-type="bibr" rid="CR55">55</xref>
]. The ATPase and pDNAP sequences were aligned using MUSCLE [
<xref ref-type="bibr" rid="CR56">56</xref>
]. Poorly aligned (low information content) positions were removed using the gappyout function of trimAl [
<xref ref-type="bibr" rid="CR57">57</xref>
]. Phylogenetic trees were constructed using the PhyML program [
<xref ref-type="bibr" rid="CR58">58</xref>
,
<xref ref-type="bibr" rid="CR59">59</xref>
], the latest version of which (
<ext-link ext-link-type="uri" xlink:href="http://www.atgc-montpellier.fr/phyml-sms/">http://www.atgc-montpellier.fr/phyml-sms/</ext-link>
) includes automatic selection of the best-fit substitution model for a given alignment. The best models identified by PhyML were LG + G6 + I + F for MCP and ATPase, and Blosum62 + G6 + I + F for pDNAP. LG, Le-Gascuel matrix; G6 + I + F, gamma shape parameter: estimated; number of categories: 6; proportion of invariable sites: estimated; and equilibrium frequencies: empirical.</p>
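<p>The effect of removing poorly aligned positions on the number of informative alignment columns can be illustrated with a short sketch. The Python code below is not the gappyout algorithm of trimAl, which chooses its cut-off automatically; it uses a fixed gap-fraction threshold as a simpler stand-in, and it counts parsimony-informative columns, which is one common definition of an informative position and is assumed here rather than taken from the analysis above. The four sequences are invented.</p>
<preformat><![CDATA[
```python
from collections import Counter
from typing import Dict

GAP = "-"


def trim_gappy_columns(alignment: Dict[str, str],
                       max_gap_fraction: float = 0.5) -> Dict[str, str]:
    """Drop columns whose fraction of gaps exceeds max_gap_fraction."""
    seqs = list(alignment.values())
    if len({len(s) for s in seqs}) != 1:
        raise ValueError("all aligned sequences must have the same length")
    keep = [i for i in range(len(seqs[0]))
            if sum(s[i] == GAP for s in seqs) / len(seqs) <= max_gap_fraction]
    return {name: "".join(seq[i] for i in keep) for name, seq in alignment.items()}


def count_informative_columns(alignment: Dict[str, str]) -> int:
    """Count columns with at least two residue types that each occur at least twice."""
    total = 0
    for column in zip(*alignment.values()):
        counts = Counter(c for c in column if c != GAP)
        if sum(1 for n in counts.values() if n >= 2) >= 2:
            total += 1
    return total


# Invented toy alignment of four sequences and six columns.
toy = {
    "s1": "MKV-AL",
    "s2": "MKI-AL",
    "s3": "MRV-GL",
    "s4": "MRIQG-",
}
trimmed = trim_gappy_columns(toy)
informative = count_informative_columns(trimmed)
print(trimmed["s4"], informative)
```
]]></preformat>
<p>In the toy alignment, the fourth column is removed because three of its four characters are gaps, and three of the five remaining columns are parsimony-informative.</p>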
</sec>
</sec>
</body>
<back>
<app-group>
<app id="App1">
<sec id="Sec14">
<title>Additional files</title>
<p>
<media position="anchor" xlink:href="12915_2015_207_MOESM1_ESM.txt" id="MOESM1">
<label>Additional file 1:</label>
<caption>
<p>
<bold>Annotated genomes of Polinton-like viruses.</bold>
(TXT 748 kb)</p>
</caption>
</media>
<media position="anchor" xlink:href="12915_2015_207_MOESM2_ESM.xlsx" id="MOESM2">
<label>Additional file 2:</label>
<caption>
<p>
<bold>Geographical origins of the PLV.</bold>
(XLSX 273 kb)</p>
</caption>
</media>
<media position="anchor" xlink:href="12915_2015_207_MOESM3_ESM.pptx" id="MOESM3">
<label>Additional file 3:</label>
<caption>
<p>
<bold>Multiple alignments of the core genes of the PLV.</bold>
(PPTX 1221 kb)</p>
</caption>
</media>
<media position="anchor" xlink:href="12915_2015_207_MOESM4_ESM.txt" id="MOESM4">
<label>Additional file 4:</label>
<caption>
<p>
<bold>PhyML trees (Newick format), trimmed alignments used for phylogenetic tree construction, and original alignments for MCP, ATPase and pDNAP.</bold>
(TXT 510 kb)</p>
</caption>
</media>
</p>
</sec>
</app>
</app-group>
<fn-group>
<fn>
<p>
<bold>Competing interests</bold>
</p>
<p>The authors declare that they have no competing interests.</p>
</fn>
<fn>
<p>
<bold>Authors’ contributions</bold>
</p>
<p>NY and SS performed data analysis; NY, SS, VK, MK and EVK analyzed the results; EVK wrote the manuscript that was edited by all authors. All authors read and approved the final manuscript.</p>
</fn>
</fn-group>
<ack>
<title>Acknowledgements</title>
<p>We thank Yuri Wolf for helpful advice on sequence assembly and analysis. NY, SS, VK and EVK were supported by intramural funds of the US Department of Health and Human Services (to the National Library of Medicine).</p>
</ack>
<ref-list id="Bib1">
<title>References</title>
<ref id="CR1">
<label>1.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Godzik</surname>
<given-names>A</given-names>
</name>
</person-group>
<article-title>Metagenomics and the protein universe</article-title>
<source>Curr Opin Struct Biol</source>
<year>2011</year>
<volume>21</volume>
<issue>3</issue>
<fpage>398</fpage>
<lpage>403</lpage>
<pub-id pub-id-type="doi">10.1016/j.sbi.2011.03.010</pub-id>
<pub-id pub-id-type="pmid">21497084</pub-id>
</element-citation>
</ref>
<ref id="CR2">
<label>2.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Simon</surname>
<given-names>C</given-names>
</name>
<name>
<surname>Daniel</surname>
<given-names>R</given-names>
</name>
</person-group>
<article-title>Metagenomic analyses: past and future trends</article-title>
<source>Appl Environ Microbiol</source>
<year>2011</year>
<volume>77</volume>
<issue>4</issue>
<fpage>1153</fpage>
<lpage>61</lpage>
<pub-id pub-id-type="doi">10.1128/AEM.02345-10</pub-id>
<pub-id pub-id-type="pmid">21169428</pub-id>
</element-citation>
</ref>
<ref id="CR3">
<label>3.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Nelson</surname>
<given-names>KE</given-names>
</name>
</person-group>
<article-title>Microbiomes</article-title>
<source>Microb Ecol</source>
<year>2013</year>
<volume>65</volume>
<issue>4</issue>
<fpage>916</fpage>
<lpage>9</lpage>
<pub-id pub-id-type="doi">10.1007/s00248-013-0227-y</pub-id>
<pub-id pub-id-type="pmid">23604403</pub-id>
</element-citation>
</ref>
<ref id="CR4">
<label>4.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Tuffin</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Anderson</surname>
<given-names>D</given-names>
</name>
<name>
<surname>Heath</surname>
<given-names>C</given-names>
</name>
<name>
<surname>Cowan</surname>
<given-names>DA</given-names>
</name>
</person-group>
<article-title>Metagenomic gene discovery: how far have we moved into novel sequence space?</article-title>
<source>Biotechnol J</source>
<year>2009</year>
<volume>4</volume>
<issue>12</issue>
<fpage>1671</fpage>
<lpage>83</lpage>
<pub-id pub-id-type="doi">10.1002/biot.200900235</pub-id>
<pub-id pub-id-type="pmid">19946882</pub-id>
</element-citation>
</ref>
<ref id="CR5">
<label>5.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Ufarte</surname>
<given-names>L</given-names>
</name>
<name>
<surname>Potocki-Veronese</surname>
<given-names>G</given-names>
</name>
<name>
<surname>Laville</surname>
<given-names>E</given-names>
</name>
</person-group>
<article-title>Discovery of new protein families and functions: new challenges in functional metagenomics for biotechnologies and microbial ecology</article-title>
<source>Front Microbiol</source>
<year>2015</year>
<volume>6</volume>
<fpage>563</fpage>
<pub-id pub-id-type="pmid">26097471</pub-id>
</element-citation>
</ref>
<ref id="CR6">
<label>6.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Yooseph</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Sutton</surname>
<given-names>G</given-names>
</name>
<name>
<surname>Rusch</surname>
<given-names>DB</given-names>
</name>
<name>
<surname>Halpern</surname>
<given-names>AL</given-names>
</name>
<name>
<surname>Williamson</surname>
<given-names>SJ</given-names>
</name>
<name>
<surname>Remington</surname>
<given-names>K</given-names>
</name>
<etal></etal>
</person-group>
<article-title>The Sorcerer II Global Ocean Sampling expedition: expanding the universe of protein families</article-title>
<source>PLoS Biol</source>
<year>2007</year>
<volume>5</volume>
<issue>3</issue>
<fpage>e16</fpage>
<pub-id pub-id-type="doi">10.1371/journal.pbio.0050016</pub-id>
<pub-id pub-id-type="pmid">17355171</pub-id>
</element-citation>
</ref>
<ref id="CR7">
<label>7.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Sunagawa</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Coelho</surname>
<given-names>LP</given-names>
</name>
<name>
<surname>Chaffron</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Kultima</surname>
<given-names>JR</given-names>
</name>
<name>
<surname>Labadie</surname>
<given-names>K</given-names>
</name>
<name>
<surname>Salazar</surname>
<given-names>G</given-names>
</name>
<etal></etal>
</person-group>
<article-title>Ocean plankton. Structure and function of the global ocean microbiome</article-title>
<source>Science</source>
<year>2015</year>
<volume>348</volume>
<issue>6237</issue>
<fpage>1261359</fpage>
<pub-id pub-id-type="doi">10.1126/science.1261359</pub-id>
<pub-id pub-id-type="pmid">25999513</pub-id>
</element-citation>
</ref>
<ref id="CR8">
<label>8.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Kristensen</surname>
<given-names>DM</given-names>
</name>
<name>
<surname>Mushegian</surname>
<given-names>AR</given-names>
</name>
<name>
<surname>Dolja</surname>
<given-names>VV</given-names>
</name>
<name>
<surname>Koonin</surname>
<given-names>EV</given-names>
</name>
</person-group>
<article-title>New dimensions of the virus world discovered through metagenomics</article-title>
<source>Trends Microbiol</source>
<year>2010</year>
<volume>18</volume>
<issue>1</issue>
<fpage>11</fpage>
<lpage>9</lpage>
<pub-id pub-id-type="doi">10.1016/j.tim.2009.11.003</pub-id>
<pub-id pub-id-type="pmid">19942437</pub-id>
</element-citation>
</ref>
<ref id="CR9">
<label>9.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Rosario</surname>
<given-names>K</given-names>
</name>
<name>
<surname>Breitbart</surname>
<given-names>M</given-names>
</name>
</person-group>
<article-title>Exploring the viral world through metagenomics</article-title>
<source>Curr Opin Virol</source>
<year>2011</year>
<volume>1</volume>
<issue>4</issue>
<fpage>289</fpage>
<lpage>97</lpage>
<pub-id pub-id-type="doi">10.1016/j.coviro.2011.06.004</pub-id>
<pub-id pub-id-type="pmid">22440785</pub-id>
</element-citation>
</ref>
<ref id="CR10">
<label>10.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Mokili</surname>
<given-names>JL</given-names>
</name>
<name>
<surname>Rohwer</surname>
<given-names>F</given-names>
</name>
<name>
<surname>Dutilh</surname>
<given-names>BE</given-names>
</name>
</person-group>
<article-title>Metagenomics and future perspectives in virus discovery</article-title>
<source>Curr Opin Virol</source>
<year>2012</year>
<volume>2</volume>
<issue>1</issue>
<fpage>63</fpage>
<lpage>77</lpage>
<pub-id pub-id-type="doi">10.1016/j.coviro.2011.12.004</pub-id>
<pub-id pub-id-type="pmid">22440968</pub-id>
</element-citation>
</ref>
<ref id="CR11">
<label>11.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Diemer</surname>
<given-names>GS</given-names>
</name>
<name>
<surname>Stedman</surname>
<given-names>KM</given-names>
</name>
</person-group>
<article-title>A novel virus genome discovered in an extreme environment suggests recombination between unrelated groups of RNA and DNA viruses</article-title>
<source>Biol Direct</source>
<year>2012</year>
<volume>7</volume>
<fpage>13</fpage>
<pub-id pub-id-type="doi">10.1186/1745-6150-7-13</pub-id>
<pub-id pub-id-type="pmid">22515485</pub-id>
</element-citation>
</ref>
<ref id="CR12">
<label>12.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Roux</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Enault</surname>
<given-names>F</given-names>
</name>
<name>
<surname>Bronner</surname>
<given-names>G</given-names>
</name>
<name>
<surname>Vaulot</surname>
<given-names>D</given-names>
</name>
<name>
<surname>Forterre</surname>
<given-names>P</given-names>
</name>
<name>
<surname>Krupovic</surname>
<given-names>M</given-names>
</name>
</person-group>
<article-title>Chimeric viruses blur the borders between the major groups of eukaryotic single-stranded DNA viruses</article-title>
<source>Nat Commun</source>
<year>2013</year>
<volume>4</volume>
<fpage>2700</fpage>
<pub-id pub-id-type="doi">10.1038/ncomms3700</pub-id>
<pub-id pub-id-type="pmid">24193254</pub-id>
</element-citation>
</ref>
<ref id="CR13">
<label>13.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Krupovic</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Zhi</surname>
<given-names>N</given-names>
</name>
<name>
<surname>Li</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Hu</surname>
<given-names>G</given-names>
</name>
<name>
<surname>Koonin</surname>
<given-names>EV</given-names>
</name>
<name>
<surname>Wong</surname>
<given-names>S</given-names>
</name>
<etal></etal>
</person-group>
<article-title>Multiple layers of chimerism in a single-stranded DNA virus discovered by deep sequencing</article-title>
<source>Genome Biol Evol</source>
<year>2015</year>
<volume>7</volume>
<issue>4</issue>
<fpage>993</fpage>
<lpage>1001</lpage>
<pub-id pub-id-type="doi">10.1093/gbe/evv034</pub-id>
<pub-id pub-id-type="pmid">25840414</pub-id>
</element-citation>
</ref>
<ref id="CR14">
<label>14.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Dutilh</surname>
<given-names>BE</given-names>
</name>
<name>
<surname>Cassman</surname>
<given-names>N</given-names>
</name>
<name>
<surname>McNair</surname>
<given-names>K</given-names>
</name>
<name>
<surname>Sanchez</surname>
<given-names>SE</given-names>
</name>
<name>
<surname>Silva</surname>
<given-names>GG</given-names>
</name>
<name>
<surname>Boling</surname>
<given-names>L</given-names>
</name>
<etal></etal>
</person-group>
<article-title>A highly abundant bacteriophage discovered in the unknown sequences of human faecal metagenomes</article-title>
<source>Nat Commun</source>
<year>2014</year>
<volume>5</volume>
<fpage>4498</fpage>
<pub-id pub-id-type="doi">10.1038/ncomms5498</pub-id>
<pub-id pub-id-type="pmid">25058116</pub-id>
</element-citation>
</ref>
<ref id="CR15">
<label>15.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Mozar</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Claverie</surname>
<given-names>JM</given-names>
</name>
</person-group>
<article-title>Expanding the Mimiviridae family using asparagine synthase as a sequence bait</article-title>
<source>Virology</source>
<year>2014</year>
<volume>466–467</volume>
<fpage>112</fpage>
<lpage>22</lpage>
<pub-id pub-id-type="doi">10.1016/j.virol.2014.05.013</pub-id>
<pub-id pub-id-type="pmid">24908633</pub-id>
</element-citation>
</ref>
<ref id="CR16">
<label>16.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Kapitonov</surname>
<given-names>VV</given-names>
</name>
<name>
<surname>Jurka</surname>
<given-names>J</given-names>
</name>
</person-group>
<article-title>Self-synthesizing DNA transposons in eukaryotes</article-title>
<source>Proc Natl Acad Sci U S A</source>
<year>2006</year>
<volume>103</volume>
<issue>12</issue>
<fpage>4540</fpage>
<lpage>5</lpage>
<pub-id pub-id-type="doi">10.1073/pnas.0600833103</pub-id>
<pub-id pub-id-type="pmid">16537396</pub-id>
</element-citation>
</ref>
<ref id="CR17">
<label>17.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Jurka</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Kapitonov</surname>
<given-names>VV</given-names>
</name>
<name>
<surname>Kohany</surname>
<given-names>O</given-names>
</name>
<name>
<surname>Jurka</surname>
<given-names>MV</given-names>
</name>
</person-group>
<article-title>Repetitive sequences in complex genomes: structure and evolution</article-title>
<source>Annu Rev Genomics Hum Genet</source>
<year>2007</year>
<volume>8</volume>
<fpage>241</fpage>
<lpage>59</lpage>
<pub-id pub-id-type="doi">10.1146/annurev.genom.8.080706.092416</pub-id>
<pub-id pub-id-type="pmid">17506661</pub-id>
</element-citation>
</ref>
<ref id="CR18">
<label>18.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Pritham</surname>
<given-names>EJ</given-names>
</name>
<name>
<surname>Putliwala</surname>
<given-names>T</given-names>
</name>
<name>
<surname>Feschotte</surname>
<given-names>C</given-names>
</name>
</person-group>
<article-title>Mavericks, a novel class of giant transposable elements widespread in eukaryotes and related to DNA viruses</article-title>
<source>Gene</source>
<year>2007</year>
<volume>390</volume>
<issue>1–2</issue>
<fpage>3</fpage>
<lpage>17</lpage>
<pub-id pub-id-type="doi">10.1016/j.gene.2006.08.008</pub-id>
<pub-id pub-id-type="pmid">17034960</pub-id>
</element-citation>
</ref>
<ref id="CR19">
<label>19.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Feschotte</surname>
<given-names>C</given-names>
</name>
<name>
<surname>Pritham</surname>
<given-names>EJ</given-names>
</name>
</person-group>
<article-title>DNA transposons and the evolution of eukaryotic genomes</article-title>
<source>Annu Rev Genet</source>
<year>2007</year>
<volume>41</volume>
<fpage>331</fpage>
<lpage>68</lpage>
<pub-id pub-id-type="doi">10.1146/annurev.genet.40.110405.090448</pub-id>
<pub-id pub-id-type="pmid">18076328</pub-id>
</element-citation>
</ref>
<ref id="CR20">
<label>20.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Haapa-Paananen</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Wahlberg</surname>
<given-names>N</given-names>
</name>
<name>
<surname>Savilahti</surname>
<given-names>H</given-names>
</name>
</person-group>
<article-title>Phylogenetic analysis of Maverick/Polinton giant transposons across organisms</article-title>
<source>Mol Phylogenet Evol</source>
<year>2014</year>
<volume>78</volume>
<fpage>271</fpage>
<lpage>4</lpage>
<pub-id pub-id-type="doi">10.1016/j.ympev.2014.05.024</pub-id>
<pub-id pub-id-type="pmid">24882428</pub-id>
</element-citation>
</ref>
<ref id="CR21">
<label>21.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Krupovic</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Bamford</surname>
<given-names>DH</given-names>
</name>
<name>
<surname>Koonin</surname>
<given-names>EV</given-names>
</name>
</person-group>
<article-title>Conservation of major and minor jelly-roll capsid proteins in Polinton (Maverick) transposons suggests that they are bona fide viruses</article-title>
<source>Biol Direct</source>
<year>2014</year>
<volume>9</volume>
<fpage>6</fpage>
<pub-id pub-id-type="doi">10.1186/1745-6150-9-6</pub-id>
<pub-id pub-id-type="pmid">24773695</pub-id>
</element-citation>
</ref>
<ref id="CR22">
<label>22.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Krupovic</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Koonin</surname>
<given-names>EV</given-names>
</name>
</person-group>
<article-title>Polintons: a hotbed of eukaryotic virus, transposon and plasmid evolution</article-title>
<source>Nat Rev Microbiol</source>
<year>2015</year>
<volume>13</volume>
<issue>2</issue>
<fpage>105</fpage>
<lpage>15</lpage>
<pub-id pub-id-type="doi">10.1038/nrmicro3389</pub-id>
<pub-id pub-id-type="pmid">25534808</pub-id>
</element-citation>
</ref>
<ref id="CR23">
<label>23.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Wuitschick</surname>
<given-names>JD</given-names>
</name>
<name>
<surname>Gershan</surname>
<given-names>JA</given-names>
</name>
<name>
<surname>Lochowicz</surname>
<given-names>AJ</given-names>
</name>
<name>
<surname>Li</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Karrer</surname>
<given-names>KM</given-names>
</name>
</person-group>
<article-title>A novel family of mobile genetic elements is limited to the germline genome in Tetrahymena thermophila</article-title>
<source>Nucleic Acids Res</source>
<year>2002</year>
<volume>30</volume>
<issue>11</issue>
<fpage>2524</fpage>
<lpage>37</lpage>
<pub-id pub-id-type="doi">10.1093/nar/30.11.2524</pub-id>
<pub-id pub-id-type="pmid">12034842</pub-id>
</element-citation>
</ref>
<ref id="CR24">
<label>24.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Koonin</surname>
<given-names>EV</given-names>
</name>
<name>
<surname>Dolja</surname>
<given-names>VV</given-names>
</name>
<name>
<surname>Krupovic</surname>
<given-names>M</given-names>
</name>
</person-group>
<article-title>Origins and evolution of viruses of eukaryotes: The ultimate modularity</article-title>
<source>Virology</source>
<year>2015</year>
<volume>479–480</volume>
<fpage>2</fpage>
<lpage>25</lpage>
<pub-id pub-id-type="doi">10.1016/j.virol.2015.02.039</pub-id>
<pub-id pub-id-type="pmid">25771806</pub-id>
</element-citation>
</ref>
<ref id="CR25">
<label>25.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>La Scola</surname>
<given-names>B</given-names>
</name>
<name>
<surname>Desnues</surname>
<given-names>C</given-names>
</name>
<name>
<surname>Pagnier</surname>
<given-names>I</given-names>
</name>
<name>
<surname>Robert</surname>
<given-names>C</given-names>
</name>
<name>
<surname>Barrassi</surname>
<given-names>L</given-names>
</name>
<name>
<surname>Fournous</surname>
<given-names>G</given-names>
</name>
<etal></etal>
</person-group>
<article-title>The virophage as a unique parasite of the giant mimivirus</article-title>
<source>Nature</source>
<year>2008</year>
<volume>455</volume>
<issue>7209</issue>
<fpage>100</fpage>
<lpage>4</lpage>
<pub-id pub-id-type="doi">10.1038/nature07218</pub-id>
<pub-id pub-id-type="pmid">18690211</pub-id>
</element-citation>
</ref>
<ref id="CR26">
<label>26.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Claverie</surname>
<given-names>JM</given-names>
</name>
<name>
<surname>Abergel</surname>
<given-names>C</given-names>
</name>
</person-group>
<article-title>Mimivirus and its virophage</article-title>
<source>Annu Rev Genet</source>
<year>2009</year>
<volume>43</volume>
<fpage>49</fpage>
<lpage>66</lpage>
<pub-id pub-id-type="doi">10.1146/annurev-genet-102108-134255</pub-id>
<pub-id pub-id-type="pmid">19653859</pub-id>
</element-citation>
</ref>
<ref id="CR27">
<label>27.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Desnues</surname>
<given-names>C</given-names>
</name>
<name>
<surname>Boyer</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Raoult</surname>
<given-names>D</given-names>
</name>
</person-group>
<article-title>Sputnik, a virophage infecting the viral domain of life</article-title>
<source>Adv Virus Res</source>
<year>2012</year>
<volume>82</volume>
<fpage>63</fpage>
<lpage>89</lpage>
<pub-id pub-id-type="doi">10.1016/B978-0-12-394621-8.00013-3</pub-id>
<pub-id pub-id-type="pmid">22420851</pub-id>
</element-citation>
</ref>
<ref id="CR28">
<label>28.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Fischer</surname>
<given-names>MG</given-names>
</name>
<name>
<surname>Suttle</surname>
<given-names>CA</given-names>
</name>
</person-group>
<article-title>A virophage at the origin of large DNA transposons</article-title>
<source>Science</source>
<year>2011</year>
<volume>332</volume>
<issue>6026</issue>
<fpage>231</fpage>
<lpage>4</lpage>
<pub-id pub-id-type="doi">10.1126/science.1199412</pub-id>
<pub-id pub-id-type="pmid">21385722</pub-id>
</element-citation>
</ref>
<ref id="CR29">
<label>29.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Yutin</surname>
<given-names>N</given-names>
</name>
<name>
<surname>Raoult</surname>
<given-names>D</given-names>
</name>
<name>
<surname>Koonin</surname>
<given-names>EV</given-names>
</name>
</person-group>
<article-title>Virophages, polintons, and transpovirons: a complex evolutionary network of diverse selfish genetic elements with different reproduction strategies</article-title>
<source>Virol J</source>
<year>2013</year>
<volume>10</volume>
<fpage>158</fpage>
<pub-id pub-id-type="doi">10.1186/1743-422X-10-158</pub-id>
<pub-id pub-id-type="pmid">23701946</pub-id>
</element-citation>
</ref>
<ref id="CR30">
<label>30.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Zhou</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Sun</surname>
<given-names>D</given-names>
</name>
<name>
<surname>Childers</surname>
<given-names>A</given-names>
</name>
<name>
<surname>McDermott</surname>
<given-names>TR</given-names>
</name>
<name>
<surname>Wang</surname>
<given-names>Y</given-names>
</name>
<name>
<surname>Liles</surname>
<given-names>MR</given-names>
</name>
</person-group>
<article-title>Three novel virophage genomes discovered from Yellowstone Lake metagenomes</article-title>
<source>J Virol</source>
<year>2015</year>
<volume>89</volume>
<issue>2</issue>
<fpage>1278</fpage>
<lpage>85</lpage>
<pub-id pub-id-type="doi">10.1128/JVI.03039-14</pub-id>
<pub-id pub-id-type="pmid">25392206</pub-id>
</element-citation>
</ref>
<ref id="CR31">
<label>31.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Zhou</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Zhang</surname>
<given-names>W</given-names>
</name>
<name>
<surname>Yan</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Xiao</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Zhang</surname>
<given-names>Y</given-names>
</name>
<name>
<surname>Li</surname>
<given-names>B</given-names>
</name>
<etal></etal>
</person-group>
<article-title>Diversity of virophages in metagenomic data sets</article-title>
<source>J Virol</source>
<year>2013</year>
<volume>87</volume>
<issue>8</issue>
<fpage>4225</fpage>
<lpage>36</lpage>
<pub-id pub-id-type="doi">10.1128/JVI.03398-12</pub-id>
<pub-id pub-id-type="pmid">23408616</pub-id>
</element-citation>
</ref>
<ref id="CR32">
<label>32.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Yutin</surname>
<given-names>N</given-names>
</name>
<name>
<surname>Kapitonov</surname>
<given-names>VV</given-names>
</name>
<name>
<surname>Koonin</surname>
<given-names>EV</given-names>
</name>
</person-group>
<article-title>A new family of hybrid virophages from an animal gut metagenome</article-title>
<source>Biol Direct</source>
<year>2015</year>
<volume>10</volume>
<fpage>19</fpage>
<pub-id pub-id-type="doi">10.1186/s13062-015-0054-9</pub-id>
<pub-id pub-id-type="pmid">25909276</pub-id>
</element-citation>
</ref>
<ref id="CR33">
<label>33.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Zhang</surname>
<given-names>X</given-names>
</name>
<name>
<surname>Sun</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Xiang</surname>
<given-names>Y</given-names>
</name>
<name>
<surname>Wong</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Klose</surname>
<given-names>T</given-names>
</name>
<name>
<surname>Raoult</surname>
<given-names>D</given-names>
</name>
<etal></etal>
</person-group>
<article-title>Structure of Sputnik, a virophage, at 3.5-Å resolution</article-title>
<source>Proc Natl Acad Sci U S A</source>
<year>2012</year>
<volume>109</volume>
<issue>45</issue>
<fpage>18431</fpage>
<lpage>6</lpage>
<pub-id pub-id-type="doi">10.1073/pnas.1211702109</pub-id>
<pub-id pub-id-type="pmid">23091035</pub-id>
</element-citation>
</ref>
<ref id="CR34">
<label>34.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Santini</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Jeudy</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Bartoli</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Poirot</surname>
<given-names>O</given-names>
</name>
<name>
<surname>Lescot</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Abergel</surname>
<given-names>C</given-names>
</name>
<etal></etal>
</person-group>
<article-title>Genome of Phaeocystis globosa virus PgV-16T highlights the common ancestry of the largest known DNA viruses infecting eukaryotes</article-title>
<source>Proc Natl Acad Sci U S A</source>
<year>2013</year>
<volume>110</volume>
<issue>26</issue>
<fpage>10800</fpage>
<lpage>5</lpage>
<pub-id pub-id-type="doi">10.1073/pnas.1303251110</pub-id>
<pub-id pub-id-type="pmid">23754393</pub-id>
</element-citation>
</ref>
<ref id="CR35">
<label>35.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Stepanova</surname>
<given-names>OA</given-names>
</name>
<name>
<surname>Boyko</surname>
<given-names>AL</given-names>
</name>
<name>
<surname>Gordienko</surname>
<given-names>AI</given-names>
</name>
<name>
<surname>Sherban</surname>
<given-names>SA</given-names>
</name>
<name>
<surname>Shevchenko</surname>
<given-names>TP</given-names>
</name>
<name>
<surname>Polischuck</surname>
<given-names>VP</given-names>
</name>
</person-group>
<article-title>Characteristics of virus of Tetraselmis viridis norris (Chlorophyta, Prasinophyceae)</article-title>
<source>Dokl Akad Nauk Ukr</source>
<year>2005</year>
<volume>1</volume>
<fpage>158</fpage>
<lpage>62</lpage>
</element-citation>
</ref>
<ref id="CR36">
<label>36.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Stepanova</surname>
<given-names>OA</given-names>
</name>
<name>
<surname>Boiko</surname>
<given-names>AL</given-names>
</name>
<name>
<surname>Shcherbatenko</surname>
<given-names>IS</given-names>
</name>
</person-group>
<article-title>Computational genome analysis of three marine algoviruses</article-title>
<source>Mikrobiol Z</source>
<year>2013</year>
<volume>75</volume>
<issue>5</issue>
<fpage>76</fpage>
<lpage>81</lpage>
<pub-id pub-id-type="pmid">24479317</pub-id>
</element-citation>
</ref>
<ref id="CR37">
<label>37.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Pagarete</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Grebert</surname>
<given-names>T</given-names>
</name>
<name>
<surname>Stepanova</surname>
<given-names>O</given-names>
</name>
<name>
<surname>Sandaa</surname>
<given-names>RA</given-names>
</name>
<name>
<surname>Bratbak</surname>
<given-names>G</given-names>
</name>
</person-group>
<article-title>Tsv-N1: a novel DNA algal virus that infects Tetraselmis striata</article-title>
<source>Viruses</source>
<year>2015</year>
<volume>7</volume>
<issue>7</issue>
<fpage>3937</fpage>
<lpage>53</lpage>
<pub-id pub-id-type="doi">10.3390/v7072806</pub-id>
<pub-id pub-id-type="pmid">26193304</pub-id>
</element-citation>
</ref>
<ref id="CR38">
<label>38.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Colson</surname>
<given-names>P</given-names>
</name>
<name>
<surname>Yutin</surname>
<given-names>N</given-names>
</name>
<name>
<surname>Shabalina</surname>
<given-names>SA</given-names>
</name>
<name>
<surname>Robert</surname>
<given-names>C</given-names>
</name>
<name>
<surname>Fournous</surname>
<given-names>G</given-names>
</name>
<name>
<surname>La Scola</surname>
<given-names>B</given-names>
</name>
<etal></etal>
</person-group>
<article-title>Viruses with more than 1,000 genes: Mamavirus, a new Acanthamoeba polyphaga mimivirus strain, and reannotation of Mimivirus genes</article-title>
<source>Genome Biol Evol</source>
<year>2011</year>
<volume>3</volume>
<fpage>737</fpage>
<lpage>42</lpage>
<pub-id pub-id-type="doi">10.1093/gbe/evr048</pub-id>
<pub-id pub-id-type="pmid">21705471</pub-id>
</element-citation>
</ref>
<ref id="CR39">
<label>39.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Das</surname>
<given-names>B</given-names>
</name>
<name>
<surname>Martinez</surname>
<given-names>E</given-names>
</name>
<name>
<surname>Midonet</surname>
<given-names>C</given-names>
</name>
<name>
<surname>Barre</surname>
<given-names>FX</given-names>
</name>
</person-group>
<article-title>Integrative mobile elements exploiting Xer recombination</article-title>
<source>Trends Microbiol</source>
<year>2013</year>
<volume>21</volume>
<issue>1</issue>
<fpage>23</fpage>
<lpage>30</lpage>
<pub-id pub-id-type="doi">10.1016/j.tim.2012.10.003</pub-id>
<pub-id pub-id-type="pmid">23127381</pub-id>
</element-citation>
</ref>
<ref id="CR40">
<label>40.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Farr</surname>
<given-names>GA</given-names>
</name>
<name>
<surname>Zhang</surname>
<given-names>LG</given-names>
</name>
<name>
<surname>Tattersall</surname>
<given-names>P</given-names>
</name>
</person-group>
<article-title>Parvoviral virions deploy a capsid-tethered lipolytic enzyme to breach the endosomal membrane during cell entry</article-title>
<source>Proc Natl Acad Sci U S A</source>
<year>2005</year>
<volume>102</volume>
<issue>47</issue>
<fpage>17148</fpage>
<lpage>53</lpage>
<pub-id pub-id-type="doi">10.1073/pnas.0508477102</pub-id>
<pub-id pub-id-type="pmid">16284249</pub-id>
</element-citation>
</ref>
<ref id="CR41">
<label>41.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Cotmore</surname>
<given-names>SF</given-names>
</name>
<name>
<surname>Tattersall</surname>
<given-names>P</given-names>
</name>
</person-group>
<article-title>Parvoviral host range and cell entry mechanisms</article-title>
<source>Adv Virus Res</source>
<year>2007</year>
<volume>70</volume>
<fpage>183</fpage>
<lpage>232</lpage>
<pub-id pub-id-type="doi">10.1016/S0065-3527(07)70005-2</pub-id>
<pub-id pub-id-type="pmid">17765706</pub-id>
</element-citation>
</ref>
<ref id="CR42">
<label>42.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Iyer</surname>
<given-names>LM</given-names>
</name>
<name>
<surname>Abhiman</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Aravind</surname>
<given-names>L</given-names>
</name>
</person-group>
<article-title>A new family of polymerases related to superfamily A DNA polymerases and T7-like DNA-dependent RNA polymerases</article-title>
<source>Biol Direct</source>
<year>2008</year>
<volume>3</volume>
<fpage>39</fpage>
<pub-id pub-id-type="doi">10.1186/1745-6150-3-39</pub-id>
<pub-id pub-id-type="pmid">18834537</pub-id>
</element-citation>
</ref>
<ref id="CR43">
<label>43.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Hall</surname>
<given-names>RM</given-names>
</name>
</person-group>
<article-title>Integrons and gene cassettes: hotspots of diversity in bacterial genomes</article-title>
<source>Ann N Y Acad Sci</source>
<year>2012</year>
<volume>1267</volume>
<fpage>71</fpage>
<lpage>8</lpage>
<pub-id pub-id-type="doi">10.1111/j.1749-6632.2012.06588.x</pub-id>
<pub-id pub-id-type="pmid">22954219</pub-id>
</element-citation>
</ref>
<ref id="CR44">
<label>44.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Dyda</surname>
<given-names>F</given-names>
</name>
<name>
<surname>Chandler</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Hickman</surname>
<given-names>AB</given-names>
</name>
</person-group>
<article-title>The emerging diversity of transpososome architectures</article-title>
<source>Q Rev Biophys</source>
<year>2012</year>
<volume>45</volume>
<issue>4</issue>
<fpage>493</fpage>
<lpage>521</lpage>
<pub-id pub-id-type="doi">10.1017/S0033583512000145</pub-id>
<pub-id pub-id-type="pmid">23217365</pub-id>
</element-citation>
</ref>
<ref id="CR45">
<label>45.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Desnues</surname>
<given-names>C</given-names>
</name>
<name>
<surname>La Scola</surname>
<given-names>B</given-names>
</name>
<name>
<surname>Yutin</surname>
<given-names>N</given-names>
</name>
<name>
<surname>Fournous</surname>
<given-names>G</given-names>
</name>
<name>
<surname>Robert</surname>
<given-names>C</given-names>
</name>
<name>
<surname>Azza</surname>
<given-names>S</given-names>
</name>
<etal></etal>
</person-group>
<article-title>Provirophages and transpovirons as the diverse mobilome of giant viruses</article-title>
<source>Proc Natl Acad Sci U S A</source>
<year>2012</year>
<volume>109</volume>
<issue>44</issue>
<fpage>18078</fpage>
<lpage>83</lpage>
<pub-id pub-id-type="doi">10.1073/pnas.1208835109</pub-id>
<pub-id pub-id-type="pmid">23071316</pub-id>
</element-citation>
</ref>
<ref id="CR46">
<label>46.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Krupovic</surname>
<given-names>M</given-names>
</name>
</person-group>
<article-title>Networks of evolutionary interactions underlying the polyphyletic origin of ssDNA viruses</article-title>
<source>Curr Opin Virol</source>
<year>2013</year>
<volume>3</volume>
<issue>5</issue>
<fpage>578</fpage>
<lpage>86</lpage>
<pub-id pub-id-type="doi">10.1016/j.coviro.2013.06.010</pub-id>
<pub-id pub-id-type="pmid">23850154</pub-id>
</element-citation>
</ref>
<ref id="CR47">
<label>47.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Altschul</surname>
<given-names>SF</given-names>
</name>
<name>
<surname>Madden</surname>
<given-names>TL</given-names>
</name>
<name>
<surname>Schaffer</surname>
<given-names>AA</given-names>
</name>
<name>
<surname>Zhang</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Zhang</surname>
<given-names>Z</given-names>
</name>
<name>
<surname>Miller</surname>
<given-names>W</given-names>
</name>
<etal></etal>
</person-group>
<article-title>Gapped BLAST and PSI-BLAST: a new generation of protein database search programs</article-title>
<source>Nucleic Acids Res</source>
<year>1997</year>
<volume>25</volume>
<issue>17</issue>
<fpage>3389</fpage>
<lpage>402</lpage>
<pub-id pub-id-type="doi">10.1093/nar/25.17.3389</pub-id>
<pub-id pub-id-type="pmid">9254694</pub-id>
</element-citation>
</ref>
<ref id="CR48">
<label>48.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Sun</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Chen</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Li</surname>
<given-names>W</given-names>
</name>
<name>
<surname>Altintas</surname>
<given-names>I</given-names>
</name>
<name>
<surname>Lin</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Peltier</surname>
<given-names>S</given-names>
</name>
<etal></etal>
</person-group>
<article-title>Community cyberinfrastructure for Advanced Microbial Ecology Research and Analysis: the CAMERA resource</article-title>
<source>Nucleic Acids Res</source>
<year>2011</year>
<volume>39</volume>
<issue>Database issue</issue>
<fpage>D546</fpage>
<lpage>551</lpage>
<pub-id pub-id-type="doi">10.1093/nar/gkq1102</pub-id>
<pub-id pub-id-type="pmid">21045053</pub-id>
</element-citation>
</ref>
<ref id="CR49">
<label>49.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<collab>NCBI Resource Coordinators</collab>
</person-group>
<article-title>Database resources of the National Center for Biotechnology Information</article-title>
<source>Nucleic Acids Res</source>
<year>2015</year>
<volume>43</volume>
<issue>Database issue</issue>
<fpage>D6</fpage>
<lpage>17</lpage>
<pub-id pub-id-type="doi">10.1093/nar/gku1130</pub-id>
<pub-id pub-id-type="pmid">25398906</pub-id>
</element-citation>
</ref>
<ref id="CR50">
<label>50.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Price</surname>
<given-names>MN</given-names>
</name>
<name>
<surname>Dehal</surname>
<given-names>PS</given-names>
</name>
<name>
<surname>Arkin</surname>
<given-names>AP</given-names>
</name>
</person-group>
<article-title>FastTree 2 – approximately maximum-likelihood trees for large alignments</article-title>
<source>PLoS ONE</source>
<year>2010</year>
<volume>5</volume>
<issue>3</issue>
<fpage>e9490</fpage>
<pub-id pub-id-type="doi">10.1371/journal.pone.0009490</pub-id>
<pub-id pub-id-type="pmid">20224823</pub-id>
</element-citation>
</ref>
<ref id="CR51">
<label>51.</label>
<mixed-citation publication-type="other">Borodovsky M, Lomsadze A. Gene identification in prokaryotic genomes, phages, metagenomes, and EST sequences with GeneMarkS suite. Curr Protoc Microbiol. 2014;32:Unit 1E.7.</mixed-citation>
</ref>
<ref id="CR52">
<label>52.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Morgulis</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Coulouris</surname>
<given-names>G</given-names>
</name>
<name>
<surname>Raytselis</surname>
<given-names>Y</given-names>
</name>
<name>
<surname>Madden</surname>
<given-names>TL</given-names>
</name>
<name>
<surname>Agarwala</surname>
<given-names>R</given-names>
</name>
<name>
<surname>Schaffer</surname>
<given-names>AA</given-names>
</name>
</person-group>
<article-title>Database indexing for production MegaBLAST searches</article-title>
<source>Bioinformatics</source>
<year>2008</year>
<volume>24</volume>
<issue>16</issue>
<fpage>1757</fpage>
<lpage>64</lpage>
<pub-id pub-id-type="doi">10.1093/bioinformatics/btn322</pub-id>
<pub-id pub-id-type="pmid">18567917</pub-id>
</element-citation>
</ref>
<ref id="CR53">
<label>53.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Marchler-Bauer</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Zheng</surname>
<given-names>C</given-names>
</name>
<name>
<surname>Chitsaz</surname>
<given-names>F</given-names>
</name>
<name>
<surname>Derbyshire</surname>
<given-names>MK</given-names>
</name>
<name>
<surname>Geer</surname>
<given-names>LY</given-names>
</name>
<name>
<surname>Geer</surname>
<given-names>RC</given-names>
</name>
<etal></etal>
</person-group>
<article-title>CDD: conserved domains and protein three-dimensional structure</article-title>
<source>Nucleic Acids Res</source>
<year>2013</year>
<volume>41</volume>
<issue>Database issue</issue>
<fpage>D348</fpage>
<lpage>352</lpage>
<pub-id pub-id-type="doi">10.1093/nar/gks1243</pub-id>
<pub-id pub-id-type="pmid">23197659</pub-id>
</element-citation>
</ref>
<ref id="CR54">
<label>54.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Soding</surname>
<given-names>J</given-names>
</name>
</person-group>
<article-title>Protein homology detection by HMM-HMM comparison</article-title>
<source>Bioinformatics</source>
<year>2005</year>
<volume>21</volume>
<issue>7</issue>
<fpage>951</fpage>
<lpage>60</lpage>
<pub-id pub-id-type="doi">10.1093/bioinformatics/bti125</pub-id>
<pub-id pub-id-type="pmid">15531603</pub-id>
</element-citation>
</ref>
<ref id="CR55">
<label>55.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Pei</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Kim</surname>
<given-names>BH</given-names>
</name>
<name>
<surname>Grishin</surname>
<given-names>NV</given-names>
</name>
</person-group>
<article-title>PROMALS3D: a tool for multiple protein sequence and structure alignments</article-title>
<source>Nucleic Acids Res</source>
<year>2008</year>
<volume>36</volume>
<issue>7</issue>
<fpage>2295</fpage>
<lpage>300</lpage>
<pub-id pub-id-type="doi">10.1093/nar/gkn072</pub-id>
<pub-id pub-id-type="pmid">18287115</pub-id>
</element-citation>
</ref>
<ref id="CR56">
<label>56.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Edgar</surname>
<given-names>RC</given-names>
</name>
</person-group>
<article-title>MUSCLE: multiple sequence alignment with high accuracy and high throughput</article-title>
<source>Nucleic Acids Res</source>
<year>2004</year>
<volume>32</volume>
<issue>5</issue>
<fpage>1792</fpage>
<lpage>7</lpage>
<pub-id pub-id-type="doi">10.1093/nar/gkh340</pub-id>
<pub-id pub-id-type="pmid">15034147</pub-id>
</element-citation>
</ref>
<ref id="CR57">
<label>57.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Capella-Gutierrez</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Silla-Martinez</surname>
<given-names>JM</given-names>
</name>
<name>
<surname>Gabaldon</surname>
<given-names>T</given-names>
</name>
</person-group>
<article-title>trimAl: a tool for automated alignment trimming in large-scale phylogenetic analyses</article-title>
<source>Bioinformatics</source>
<year>2009</year>
<volume>25</volume>
<issue>15</issue>
<fpage>1972</fpage>
<lpage>3</lpage>
<pub-id pub-id-type="doi">10.1093/bioinformatics/btp348</pub-id>
<pub-id pub-id-type="pmid">19505945</pub-id>
</element-citation>
</ref>
<ref id="CR58">
<label>58.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Guindon</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Gascuel</surname>
<given-names>O</given-names>
</name>
</person-group>
<article-title>A simple, fast, and accurate algorithm to estimate large phylogenies by maximum likelihood</article-title>
<source>Syst Biol</source>
<year>2003</year>
<volume>52</volume>
<issue>5</issue>
<fpage>696</fpage>
<lpage>704</lpage>
<pub-id pub-id-type="doi">10.1080/10635150390235520</pub-id>
<pub-id pub-id-type="pmid">14530136</pub-id>
</element-citation>
</ref>
<ref id="CR59">
<label>59.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Guindon</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Dufayard</surname>
<given-names>JF</given-names>
</name>
<name>
<surname>Lefort</surname>
<given-names>V</given-names>
</name>
<name>
<surname>Anisimova</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Hordijk</surname>
<given-names>W</given-names>
</name>
<name>
<surname>Gascuel</surname>
<given-names>O</given-names>
</name>
</person-group>
<article-title>New algorithms and methods to estimate maximum-likelihood phylogenies: assessing the performance of PhyML 3.0</article-title>
<source>Syst Biol</source>
<year>2010</year>
<volume>59</volume>
<issue>3</issue>
<fpage>307</fpage>
<lpage>21</lpage>
<pub-id pub-id-type="doi">10.1093/sysbio/syq010</pub-id>
<pub-id pub-id-type="pmid">20525638</pub-id>
</element-citation>
</ref>
</ref-list>
</back>
</pmc>
</record>

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Ticri/CIDE/explor/CyberinfraV1/Data/Pmc/Corpus
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000161 | SxmlIndent | more

Or

HfdSelect -h $EXPLOR_AREA/Data/Pmc/Corpus/biblio.hfd -nk 000161 | SxmlIndent | more

To link to this page within the Wicri network

{{Explor lien
   |wiki=    Ticri/CIDE
   |area=    CyberinfraV1
   |flux=    Pmc
   |étape=   Corpus
   |type=    RBID
   |clé=     PMC:4642659
   |texte=   A novel group of diverse Polinton-like viruses discovered by metagenome analysis
}}

To generate wiki pages

HfdIndexSelect -h $EXPLOR_AREA/Data/Pmc/Corpus/RBID.i   -Sk "pubmed:26560305" \
       | HfdSelect -Kh $EXPLOR_AREA/Data/Pmc/Corpus/biblio.hfd   \
       | NlmPubMed2Wicri -a CyberinfraV1 

This area was generated with Dilib version V0.6.25.
Data generation: Thu Oct 27 09:30:58 2016. Site generation: Sun Mar 10 23:08:40 2024