Cyberinfrastructure exploration server

Warning: this site is under development!
Warning: this site was generated automatically from raw corpora.
The information has therefore not been validated.

Internal identifier: 000142 (Pmc/Corpus); previous: 0001419; next: 0001430


The document in XML format

<record>
<TEI>
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en">Targeted diversity generation by intraterrestrial archaea and archaeal viruses</title>
<author>
<name sortKey="Paul, Blair G" sort="Paul, Blair G" uniqKey="Paul B" first="Blair G." last="Paul">Blair G. Paul</name>
<affiliation>
<nlm:aff id="a1">
<institution>Marine Science Institute, University of California</institution>
, Santa Barbara, California 93106,
<country>USA</country>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Bagby, Sarah C" sort="Bagby, Sarah C" uniqKey="Bagby S" first="Sarah C." last="Bagby">Sarah C. Bagby</name>
<affiliation>
<nlm:aff id="a1">
<institution>Marine Science Institute, University of California</institution>
, Santa Barbara, California 93106,
<country>USA</country>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Czornyj, Elizabeth" sort="Czornyj, Elizabeth" uniqKey="Czornyj E" first="Elizabeth" last="Czornyj">Elizabeth Czornyj</name>
<affiliation>
<nlm:aff id="a2">
<institution>Department of Microbiology, Immunology and Molecular Genetics, University of California</institution>
, Los Angeles, California 90095,
<country>USA</country>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Arambula, Diego" sort="Arambula, Diego" uniqKey="Arambula D" first="Diego" last="Arambula">Diego Arambula</name>
<affiliation>
<nlm:aff id="a2">
<institution>Department of Microbiology, Immunology and Molecular Genetics, University of California</institution>
, Los Angeles, California 90095,
<country>USA</country>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Handa, Sumit" sort="Handa, Sumit" uniqKey="Handa S" first="Sumit" last="Handa">Sumit Handa</name>
<affiliation>
<nlm:aff id="a3">
<institution>Department of Chemistry and Biochemistry, University of California San Diego</institution>
, La Jolla, California 92093,
<country>USA</country>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Sczyrba, Alexander" sort="Sczyrba, Alexander" uniqKey="Sczyrba A" first="Alexander" last="Sczyrba">Alexander Sczyrba</name>
<affiliation>
<nlm:aff id="a4">
<institution>Center for Biotechnology and Faculty of Technology, Bielefeld University</institution>
, 33615 Bielefeld,
<country>Germany</country>
</nlm:aff>
</affiliation>
<affiliation>
<nlm:aff id="a5">
<institution>DOE Joint Genome Institute</institution>
, Walnut Creek, California 94598,
<country>USA</country>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Ghosh, Partho" sort="Ghosh, Partho" uniqKey="Ghosh P" first="Partho" last="Ghosh">Partho Ghosh</name>
<affiliation>
<nlm:aff id="a3">
<institution>Department of Chemistry and Biochemistry, University of California San Diego</institution>
, La Jolla, California 92093,
<country>USA</country>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Miller, Jeff F" sort="Miller, Jeff F" uniqKey="Miller J" first="Jeff F." last="Miller">Jeff F. Miller</name>
<affiliation>
<nlm:aff id="a2">
<institution>Department of Microbiology, Immunology and Molecular Genetics, University of California</institution>
, Los Angeles, California 90095,
<country>USA</country>
</nlm:aff>
</affiliation>
<affiliation>
<nlm:aff id="a6">
<institution>Molecular Biology Institute, University of California</institution>
, Los Angeles, California 90095,
<country>USA</country>
</nlm:aff>
</affiliation>
<affiliation>
<nlm:aff id="a7">
<institution>California NanoSystems Institute, University of California</institution>
, Los Angeles, California 90095,
<country>USA</country>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Valentine, David L" sort="Valentine, David L" uniqKey="Valentine D" first="David L." last="Valentine">David L. Valentine</name>
<affiliation>
<nlm:aff id="a1">
<institution>Marine Science Institute, University of California</institution>
, Santa Barbara, California 93106,
<country>USA</country>
</nlm:aff>
</affiliation>
<affiliation>
<nlm:aff id="a8">
<institution>Department of Earth Science, University of California Santa Barbara</institution>
, Santa Barbara, California 93106,
<country>USA</country>
</nlm:aff>
</affiliation>
</author>
</titleStmt>
<publicationStmt>
<idno type="wicri:source">PMC</idno>
<idno type="pmid">25798780</idno>
<idno type="pmc">4372165</idno>
<idno type="url">http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4372165</idno>
<idno type="RBID">PMC:4372165</idno>
<idno type="doi">10.1038/ncomms7585</idno>
<date when="2015">2015</date>
<idno type="wicri:Area/Pmc/Corpus">000142</idno>
</publicationStmt>
<sourceDesc>
<biblStruct>
<analytic>
<title xml:lang="en" level="a" type="main">Targeted diversity generation by intraterrestrial archaea and archaeal viruses</title>
<author>
<name sortKey="Paul, Blair G" sort="Paul, Blair G" uniqKey="Paul B" first="Blair G." last="Paul">Blair G. Paul</name>
<affiliation>
<nlm:aff id="a1">
<institution>Marine Science Institute, University of California</institution>
, Santa Barbara, California 93106,
<country>USA</country>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Bagby, Sarah C" sort="Bagby, Sarah C" uniqKey="Bagby S" first="Sarah C." last="Bagby">Sarah C. Bagby</name>
<affiliation>
<nlm:aff id="a1">
<institution>Marine Science Institute, University of California</institution>
, Santa Barbara, California 93106,
<country>USA</country>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Czornyj, Elizabeth" sort="Czornyj, Elizabeth" uniqKey="Czornyj E" first="Elizabeth" last="Czornyj">Elizabeth Czornyj</name>
<affiliation>
<nlm:aff id="a2">
<institution>Department of Microbiology, Immunology and Molecular Genetics, University of California</institution>
, Los Angeles, California 90095,
<country>USA</country>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Arambula, Diego" sort="Arambula, Diego" uniqKey="Arambula D" first="Diego" last="Arambula">Diego Arambula</name>
<affiliation>
<nlm:aff id="a2">
<institution>Department of Microbiology, Immunology and Molecular Genetics, University of California</institution>
, Los Angeles, California 90095,
<country>USA</country>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Handa, Sumit" sort="Handa, Sumit" uniqKey="Handa S" first="Sumit" last="Handa">Sumit Handa</name>
<affiliation>
<nlm:aff id="a3">
<institution>Department of Chemistry and Biochemistry, University of California San Diego</institution>
, La Jolla, California 92093,
<country>USA</country>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Sczyrba, Alexander" sort="Sczyrba, Alexander" uniqKey="Sczyrba A" first="Alexander" last="Sczyrba">Alexander Sczyrba</name>
<affiliation>
<nlm:aff id="a4">
<institution>Center for Biotechnology and Faculty of Technology, Bielefeld University</institution>
, 33615 Bielefeld,
<country>Germany</country>
</nlm:aff>
</affiliation>
<affiliation>
<nlm:aff id="a5">
<institution>DOE Joint Genome Institute</institution>
, Walnut Creek, California 94598,
<country>USA</country>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Ghosh, Partho" sort="Ghosh, Partho" uniqKey="Ghosh P" first="Partho" last="Ghosh">Partho Ghosh</name>
<affiliation>
<nlm:aff id="a3">
<institution>Department of Chemistry and Biochemistry, University of California San Diego</institution>
, La Jolla, California 92093,
<country>USA</country>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Miller, Jeff F" sort="Miller, Jeff F" uniqKey="Miller J" first="Jeff F." last="Miller">Jeff F. Miller</name>
<affiliation>
<nlm:aff id="a2">
<institution>Department of Microbiology, Immunology and Molecular Genetics, University of California</institution>
, Los Angeles, California 90095,
<country>USA</country>
</nlm:aff>
</affiliation>
<affiliation>
<nlm:aff id="a6">
<institution>Molecular Biology Institute, University of California</institution>
, Los Angeles, California 90095,
<country>USA</country>
</nlm:aff>
</affiliation>
<affiliation>
<nlm:aff id="a7">
<institution>California NanoSystems Institute, University of California</institution>
, Los Angeles, California 90095,
<country>USA</country>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Valentine, David L" sort="Valentine, David L" uniqKey="Valentine D" first="David L." last="Valentine">David L. Valentine</name>
<affiliation>
<nlm:aff id="a1">
<institution>Marine Science Institute, University of California</institution>
, Santa Barbara, California 93106,
<country>USA</country>
</nlm:aff>
</affiliation>
<affiliation>
<nlm:aff id="a8">
<institution>Department of Earth Science, University of California Santa Barbara</institution>
, Santa Barbara, California 93106,
<country>USA</country>
</nlm:aff>
</affiliation>
</author>
</analytic>
<series>
<title level="j">Nature Communications</title>
<idno type="eISSN">2041-1723</idno>
<imprint>
<date when="2015">2015</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
</fileDesc>
<profileDesc>
<textClass></textClass>
</profileDesc>
</teiHeader>
<front>
<div type="abstract" xml:lang="en">
<p>In the evolutionary arms race between microbes, their parasites, and their neighbours, the capacity for rapid protein diversification is a potent weapon. Diversity-generating retroelements (DGRs) use mutagenic reverse transcription and retrohoming to generate myriad variants of a target gene. Originally discovered in pathogens, these retroelements have been identified in bacteria and their viruses, but never in archaea. Here we report the discovery of intact DGRs in two distinct intraterrestrial archaeal systems: a novel virus that appears to infect archaea in the marine subsurface, and, separately, two uncultivated nanoarchaea from the terrestrial subsurface. The viral DGR system targets putative tail fibre ligand-binding domains, potentially generating >10
<sup>18</sup>
protein variants. The two single-cell nanoarchaeal genomes each possess ≥4 distinct DGRs. Against an expected background of low genome-wide mutation rates, these results demonstrate a previously unsuspected potential for rapid, targeted sequence diversification in intraterrestrial archaea and their viruses.</p>
</div>
</front>
<back>
<div1 type="bibliography">
<listBibl>
<biblStruct>
<analytic>
<author>
<name sortKey="Kallmeyer, J" uniqKey="Kallmeyer J">J. Kallmeyer</name>
</author>
<author>
<name sortKey="Pockalny, R" uniqKey="Pockalny R">R. Pockalny</name>
</author>
<author>
<name sortKey="Adhikari, R R" uniqKey="Adhikari R">R. R. Adhikari</name>
</author>
<author>
<name sortKey="Smith, D C" uniqKey="Smith D">D. C. Smith</name>
</author>
<author>
<name sortKey="Dhondt, S" uniqKey="Dhondt S">S. D'Hondt</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Lipp, J" uniqKey="Lipp J">J. Lipp</name>
</author>
<author>
<name sortKey="Morono, Y" uniqKey="Morono Y">Y. Morono</name>
</author>
<author>
<name sortKey="Inagaki, F" uniqKey="Inagaki F">F. Inagaki</name>
</author>
<author>
<name sortKey="Hinrichs, K U" uniqKey="Hinrichs K">K.-U. Hinrichs</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Valentine, D L" uniqKey="Valentine D">D. L. Valentine</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Hoehler, T M" uniqKey="Hoehler T">T. M. Hoehler</name>
</author>
<author>
<name sortKey="J Rgensen, B B" uniqKey="J Rgensen B">B. B. Jørgensen</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Lewin, A" uniqKey="Lewin A">A. Lewin</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Liu, M" uniqKey="Liu M">M. Liu</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Doulatov, S" uniqKey="Doulatov S">S. Doulatov</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Medhekar, B" uniqKey="Medhekar B">B. Medhekar</name>
</author>
<author>
<name sortKey="Miller, J F" uniqKey="Miller J">J. F. Miller</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Mcmahon, S A" uniqKey="Mcmahon S">S. A. McMahon</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Guo, H" uniqKey="Guo H">H. Guo</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Le Coq, J" uniqKey="Le Coq J">J. Le Coq</name>
</author>
<author>
<name sortKey="Ghosh, P" uniqKey="Ghosh P">P. Ghosh</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Rohwer, F" uniqKey="Rohwer F">F. Rohwer</name>
</author>
<author>
<name sortKey="Vega Thurber, R" uniqKey="Vega Thurber R">R. Vega Thurber</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Rowlands, T" uniqKey="Rowlands T">T. Rowlands</name>
</author>
<author>
<name sortKey="Baumann, P" uniqKey="Baumann P">P. Baumann</name>
</author>
<author>
<name sortKey="Jackson, S P" uniqKey="Jackson S">S. P. Jackson</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Dwivedi, B" uniqKey="Dwivedi B">B. Dwivedi</name>
</author>
<author>
<name sortKey="Xue, B" uniqKey="Xue B">B. Xue</name>
</author>
<author>
<name sortKey="Lundin, D" uniqKey="Lundin D">D. Lundin</name>
</author>
<author>
<name sortKey="Edwards, R A" uniqKey="Edwards R">R. A. Edwards</name>
</author>
<author>
<name sortKey="Breitbart, M" uniqKey="Breitbart M">M. Breitbart</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Arambula, D" uniqKey="Arambula D">D. Arambula</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Schillinger, T" uniqKey="Schillinger T">T. Schillinger</name>
</author>
<author>
<name sortKey="Lisfi, M" uniqKey="Lisfi M">M. Lisfi</name>
</author>
<author>
<name sortKey="Chi, J" uniqKey="Chi J">J. Chi</name>
</author>
<author>
<name sortKey="Cullum, J" uniqKey="Cullum J">J. Cullum</name>
</author>
<author>
<name sortKey="Zingler, N" uniqKey="Zingler N">N. Zingler</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Goldrath, A W" uniqKey="Goldrath A">A. W. Goldrath</name>
</author>
<author>
<name sortKey="Bevan, M J" uniqKey="Bevan M">M. J. Bevan</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Alder, M N" uniqKey="Alder M">M. N. Alder</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Stokke, R" uniqKey="Stokke R">R. Stokke</name>
</author>
<author>
<name sortKey="Roalkvam, I" uniqKey="Roalkvam I">I. Roalkvam</name>
</author>
<author>
<name sortKey="Lanzen, A" uniqKey="Lanzen A">A. Lanzen</name>
</author>
<author>
<name sortKey="Haflidason, H" uniqKey="Haflidason H">H. Haflidason</name>
</author>
<author>
<name sortKey="Steen, I H" uniqKey="Steen I">I. H. Steen</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Rinke, C" uniqKey="Rinke C">C. Rinke</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Huber, H" uniqKey="Huber H">H. Huber</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Podar, M" uniqKey="Podar M">M. Podar</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Minot, S" uniqKey="Minot S">S. Minot</name>
</author>
<author>
<name sortKey="Grunberg, S" uniqKey="Grunberg S">S. Grunberg</name>
</author>
<author>
<name sortKey="Wu, G D" uniqKey="Wu G">G. D. Wu</name>
</author>
<author>
<name sortKey="Lewis, J D" uniqKey="Lewis J">J. D. Lewis</name>
</author>
<author>
<name sortKey="Bushman, F D" uniqKey="Bushman F">F. D. Bushman</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Simon, D M" uniqKey="Simon D">D. M. Simon</name>
</author>
<author>
<name sortKey="Zimmerly, S" uniqKey="Zimmerly S">S. Zimmerly</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Ye, Y" uniqKey="Ye Y">Y. Ye</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Louis Jeune, C" uniqKey="Louis Jeune C">C. Louis-Jeune</name>
</author>
<author>
<name sortKey="Andrade Navarro, M A" uniqKey="Andrade Navarro M">M. A. Andrade-Navarro</name>
</author>
<author>
<name sortKey="Perez Iratxeta, C" uniqKey="Perez Iratxeta C">C. Perez-Iratxeta</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Paull, C K" uniqKey="Paull C">C. K. Paull</name>
</author>
<author>
<name sortKey="Normark, W R" uniqKey="Normark W">W. R. Normark</name>
</author>
<author>
<name sortKey="Ussler, W" uniqKey="Ussler W">W. Ussler</name>
</author>
<author>
<name sortKey="Caress, D W" uniqKey="Caress D">D. W. Caress</name>
</author>
<author>
<name sortKey="Keaten, R" uniqKey="Keaten R">R. Keaten</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Widdel, F" uniqKey="Widdel F">F. Widdel</name>
</author>
<author>
<name sortKey="Bak, F" uniqKey="Bak F">F. Bak</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Thurber, R V" uniqKey="Thurber R">R. V. Thurber</name>
</author>
<author>
<name sortKey="Haynes, M" uniqKey="Haynes M">M. Haynes</name>
</author>
<author>
<name sortKey="Breitbart, M" uniqKey="Breitbart M">M. Breitbart</name>
</author>
<author>
<name sortKey="Wegley, L" uniqKey="Wegley L">L. Wegley</name>
</author>
<author>
<name sortKey="Rohwer, F" uniqKey="Rohwer F">F. Rohwer</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Henn, M R" uniqKey="Henn M">M. R. Henn</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Schmieder, R" uniqKey="Schmieder R">R. Schmieder</name>
</author>
<author>
<name sortKey="Lim, Y" uniqKey="Lim Y">Y. Lim</name>
</author>
<author>
<name sortKey="Rohwer, F" uniqKey="Rohwer F">F. Rohwer</name>
</author>
<author>
<name sortKey="Edwards, R" uniqKey="Edwards R">R. Edwards</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Hurwitz, B" uniqKey="Hurwitz B">B. Hurwitz</name>
</author>
<author>
<name sortKey="Deng, L" uniqKey="Deng L">L. Deng</name>
</author>
<author>
<name sortKey="Poulos, B" uniqKey="Poulos B">B. Poulos</name>
</author>
<author>
<name sortKey="Sullivan, M" uniqKey="Sullivan M">M. Sullivan</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Niu, B" uniqKey="Niu B">B. Niu</name>
</author>
<author>
<name sortKey="Fu, L" uniqKey="Fu L">L. Fu</name>
</author>
<author>
<name sortKey="Sun, S" uniqKey="Sun S">S. Sun</name>
</author>
<author>
<name sortKey="Li, W" uniqKey="Li W">W. Li</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Sun, S" uniqKey="Sun S">S. Sun</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Delcher, A L" uniqKey="Delcher A">A. L. Delcher</name>
</author>
<author>
<name sortKey="Bratke, K A" uniqKey="Bratke K">K. A. Bratke</name>
</author>
<author>
<name sortKey="Powers, E C" uniqKey="Powers E">E. C. Powers</name>
</author>
<author>
<name sortKey="Salzberg, S L" uniqKey="Salzberg S">S. L. Salzberg</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Altschul, S F" uniqKey="Altschul S">S. F. Altschul</name>
</author>
<author>
<name sortKey="Gish, W" uniqKey="Gish W">W. Gish</name>
</author>
<author>
<name sortKey="Miller, W" uniqKey="Miller W">W. Miller</name>
</author>
<author>
<name sortKey="Myers, E W" uniqKey="Myers E">E. W. Myers</name>
</author>
<author>
<name sortKey="Lipman, D J" uniqKey="Lipman D">D. J. Lipman</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Leplae, R" uniqKey="Leplae R">R. Leplae</name>
</author>
<author>
<name sortKey="Hebrant, A" uniqKey="Hebrant A">A. Hebrant</name>
</author>
<author>
<name sortKey="Wodak, S J" uniqKey="Wodak S">S. J. Wodak</name>
</author>
<author>
<name sortKey="Toussaint, A" uniqKey="Toussaint A">A. Toussaint</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Hurwitz, B L" uniqKey="Hurwitz B">B. L. Hurwitz</name>
</author>
<author>
<name sortKey="Sullivan, M B" uniqKey="Sullivan M">M. B. Sullivan</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Kelley, L A" uniqKey="Kelley L">L. A. Kelley</name>
</author>
<author>
<name sortKey="Sternberg, M J" uniqKey="Sternberg M">M. J. Sternberg</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Rice, P" uniqKey="Rice P">P. Rice</name>
</author>
<author>
<name sortKey="Longden, I" uniqKey="Longden I">I. Longden</name>
</author>
<author>
<name sortKey="Bleasby, A" uniqKey="Bleasby A">A. Bleasby</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Larkin, M A" uniqKey="Larkin M">M. A. Larkin</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Kumar, S" uniqKey="Kumar S">S. Kumar</name>
</author>
<author>
<name sortKey="Nei, M" uniqKey="Nei M">M. Nei</name>
</author>
<author>
<name sortKey="Dudley, J" uniqKey="Dudley J">J. Dudley</name>
</author>
<author>
<name sortKey="Tamura, K" uniqKey="Tamura K">K. Tamura</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Guindon, S" uniqKey="Guindon S">S. Guindon</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Pride, D T" uniqKey="Pride D">D. T. Pride</name>
</author>
<author>
<name sortKey="Meinersmann, R J" uniqKey="Meinersmann R">R. J. Meinersmann</name>
</author>
<author>
<name sortKey="Wassenaar, T M" uniqKey="Wassenaar T">T. M. Wassenaar</name>
</author>
<author>
<name sortKey="Blaser, M J" uniqKey="Blaser M">M. J. Blaser</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Teeling, H" uniqKey="Teeling H">H. Teeling</name>
</author>
<author>
<name sortKey="Meyerdierks, A" uniqKey="Meyerdierks A">A. Meyerdierks</name>
</author>
<author>
<name sortKey="Bauer, M" uniqKey="Bauer M">M. Bauer</name>
</author>
<author>
<name sortKey="Amann, R" uniqKey="Amann R">R. Amann</name>
</author>
<author>
<name sortKey="Glockner, F O" uniqKey="Glockner F">F. O. Glöckner</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Dick, G J" uniqKey="Dick G">G. J. Dick</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Oksanen, J" uniqKey="Oksanen J">J. Oksanen</name>
</author>
</analytic>
</biblStruct>
</listBibl>
</div1>
</back>
</TEI>
<pmc article-type="research-article">
<pmc-dir>properties open_access</pmc-dir>
<front>
<journal-meta>
<journal-id journal-id-type="nlm-ta">Nat Commun</journal-id>
<journal-id journal-id-type="iso-abbrev">Nat Commun</journal-id>
<journal-title-group>
<journal-title>Nature Communications</journal-title>
</journal-title-group>
<issn pub-type="epub">2041-1723</issn>
<publisher>
<publisher-name>Nature Pub. Group</publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="pmid">25798780</article-id>
<article-id pub-id-type="pmc">4372165</article-id>
<article-id pub-id-type="pii">ncomms7585</article-id>
<article-id pub-id-type="doi">10.1038/ncomms7585</article-id>
<article-categories>
<subj-group subj-group-type="heading">
<subject>Article</subject>
</subj-group>
</article-categories>
<title-group>
<article-title>Targeted diversity generation by intraterrestrial archaea and archaeal viruses</article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author">
<name>
<surname>Paul</surname>
<given-names>Blair G.</given-names>
</name>
<xref ref-type="aff" rid="a1">1</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Bagby</surname>
<given-names>Sarah C.</given-names>
</name>
<xref ref-type="aff" rid="a1">1</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Czornyj</surname>
<given-names>Elizabeth</given-names>
</name>
<xref ref-type="aff" rid="a2">2</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Arambula</surname>
<given-names>Diego</given-names>
</name>
<xref ref-type="aff" rid="a2">2</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Handa</surname>
<given-names>Sumit</given-names>
</name>
<xref ref-type="aff" rid="a3">3</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Sczyrba</surname>
<given-names>Alexander</given-names>
</name>
<xref ref-type="aff" rid="a4">4</xref>
<xref ref-type="aff" rid="a5">5</xref>
<contrib-id contrib-id-type="orcid">http://orcid.org/0000-0002-4405-3847</contrib-id>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Ghosh</surname>
<given-names>Partho</given-names>
</name>
<xref ref-type="aff" rid="a3">3</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Miller</surname>
<given-names>Jeff F.</given-names>
</name>
<xref ref-type="aff" rid="a2">2</xref>
<xref ref-type="aff" rid="a6">6</xref>
<xref ref-type="aff" rid="a7">7</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Valentine</surname>
<given-names>David L.</given-names>
</name>
<xref ref-type="corresp" rid="c1">a</xref>
<xref ref-type="aff" rid="a1">1</xref>
<xref ref-type="aff" rid="a8">8</xref>
</contrib>
<aff id="a1">
<label>1</label>
<institution>Marine Science Institute, University of California</institution>
, Santa Barbara, California 93106,
<country>USA</country>
</aff>
<aff id="a2">
<label>2</label>
<institution>Department of Microbiology, Immunology and Molecular Genetics, University of California</institution>
, Los Angeles, California 90095,
<country>USA</country>
</aff>
<aff id="a3">
<label>3</label>
<institution>Department of Chemistry and Biochemistry, University of California San Diego</institution>
, La Jolla, California 92093,
<country>USA</country>
</aff>
<aff id="a4">
<label>4</label>
<institution>Center for Biotechnology and Faculty of Technology, Bielefeld University</institution>
, 33615 Bielefeld,
<country>Germany</country>
</aff>
<aff id="a5">
<label>5</label>
<institution>DOE Joint Genome Institute</institution>
, Walnut Creek, California 94598,
<country>USA</country>
</aff>
<aff id="a6">
<label>6</label>
<institution>Molecular Biology Institute, University of California</institution>
, Los Angeles, California 90095,
<country>USA</country>
</aff>
<aff id="a7">
<label>7</label>
<institution>California NanoSystems Institute, University of California</institution>
, Los Angeles, California 90095,
<country>USA</country>
</aff>
<aff id="a8">
<label>8</label>
<institution>Department of Earth Science, University of California Santa Barbara</institution>
, Santa Barbara, California 93106,
<country>USA</country>
</aff>
</contrib-group>
<author-notes>
<corresp id="c1">
<label>a</label>
<email>valentine@geol.ucsb.edu</email>
</corresp>
</author-notes>
<pub-date pub-type="epub">
<day>23</day>
<month>03</month>
<year>2015</year>
</pub-date>
<volume>6</volume>
<elocation-id>6585</elocation-id>
<history>
<date date-type="received">
<day>10</day>
<month>12</month>
<year>2014</year>
</date>
<date date-type="accepted">
<day>09</day>
<month>02</month>
<year>2015</year>
</date>
</history>
<permissions>
<copyright-statement>Copyright © 2015, Nature Publishing Group, a division of Macmillan Publishers Limited. All Rights Reserved.</copyright-statement>
<copyright-year>2015</copyright-year>
<copyright-holder>Nature Publishing Group, a division of Macmillan Publishers Limited. All Rights Reserved.</copyright-holder>
<license license-type="open-access" xlink:href="http://creativecommons.org/licenses/by/4.0/">
<pmc-comment>author-paid</pmc-comment>
<license-p>This work is licensed under a Creative Commons Attribution 4.0 International License. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in the credit line; if the material is not included under the Creative Commons license, users will need to obtain permission from the license holder to reproduce the material. To view a copy of this license, visit
<ext-link ext-link-type="uri" xlink:href="http://creativecommons.org/licenses/by/4.0/">http://creativecommons.org/licenses/by/4.0/</ext-link>
</license-p>
</license>
</permissions>
<abstract>
<p>In the evolutionary arms race between microbes, their parasites, and their neighbours, the capacity for rapid protein diversification is a potent weapon. Diversity-generating retroelements (DGRs) use mutagenic reverse transcription and retrohoming to generate myriad variants of a target gene. Originally discovered in pathogens, these retroelements have been identified in bacteria and their viruses, but never in archaea. Here we report the discovery of intact DGRs in two distinct intraterrestrial archaeal systems: a novel virus that appears to infect archaea in the marine subsurface, and, separately, two uncultivated nanoarchaea from the terrestrial subsurface. The viral DGR system targets putative tail fibre ligand-binding domains, potentially generating >10
<sup>18</sup>
protein variants. The two single-cell nanoarchaeal genomes each possess ≥4 distinct DGRs. Against an expected background of low genome-wide mutation rates, these results demonstrate a previously unsuspected potential for rapid, targeted sequence diversification in intraterrestrial archaea and their viruses.</p>
</abstract>
<abstract abstract-type="web-summary">
<p>
<inline-graphic id="i1" xlink:href="ncomms7585-i1.jpg"></inline-graphic>
Diversity-generating retroelements (DGRs) are genetic elements that introduce sequence variation within target genes in bacteria and their viruses. Here, Paul
<italic>et al</italic>
. report the discovery of DGRs in an archaeal virus and in two archaea from marine and terrestrial subsurface environments, respectively.</p>
</abstract>
</article-meta>
</front>
<body>
<p>Energy-limited marine and terrestrial subsurface environments harbour a microbial reservoir of exceptional magnitude
<xref ref-type="bibr" rid="b1">1</xref>
. Archaea are both numerically dominant
<xref ref-type="bibr" rid="b2">2</xref>
and well adapted to energy limitations faced in various intraterrestrial environments
<xref ref-type="bibr" rid="b3">3</xref>
<xref ref-type="bibr" rid="b4">4</xref>
. Although little is understood about their physiology, metabolism, evolution, or mortality in these environments, current research predicts that they will be characterized by slow growth and low genome-wide mutation rates
<xref ref-type="bibr" rid="b5">5</xref>
.</p>
<p>Independent of the sporadic mutation rate, microbial genetic variation can be increased by processes such as gene conversion and horizontal gene transfer. The single most powerful such mechanism known in nature is the diversity-generating retroelement (DGR)
<xref ref-type="bibr" rid="b6">6</xref>
<xref ref-type="bibr" rid="b7">7</xref>
. DGRs use a process called mutagenic retrohoming for the targeted replacement of a variable repeat (VR) coding region with a sequence derived from reverse transcription of a cognate non-coding template repeat (TR) RNA
<xref ref-type="bibr" rid="b6">6</xref>
<xref ref-type="bibr" rid="b7">7</xref>
<xref ref-type="bibr" rid="b8">8</xref>
<xref ref-type="bibr" rid="b9">9</xref>
. Crucially, the reverse transcriptase (RT) used is error-prone at template adenine bases
<xref ref-type="bibr" rid="b10">10</xref>
, but has high fidelity at other template bases, modulating the rate of diversification to permit rapid exploration of target protein (TP) variants within a recognizable structural framework. Over successive waves of replication, DGR activity leads to rapid evolution of TPs, typically altering ligand-binding specificity
<xref ref-type="bibr" rid="b11">11</xref>
and even permitting phage recognition of novel host ligands
<xref ref-type="bibr" rid="b9">9</xref>
. To date, DGRs have been found widely in bacteria and their viruses, but never in an archaeal system.</p>
<p>Because parasitism is expected to be an important driver of evolution and mortality in intraterrestrial archaea
<xref ref-type="bibr" rid="b12">12</xref>
, we set out to identify and characterize viruses of anaerobic archaea from one system in the marine subsurface, a methane seep in a California borderlands basin. Our survey uncovers the complete genome of a virus that appears to infect archaea. Remarkably, this genome encodes a complete and apparently active DGR. We examine existing sequence data from archaeal systems, discovering multiple DGRs in the genomes of two subterranean nanoarchaea. These findings demonstrate that subsurface archaea and archaeal viruses maintain a mechanism for generating protein hypervariability within targeted genes, bringing the capacity for massive diversification to the archaea-dominated deep biosphere.</p>
<sec disp-level="1" sec-type="results">
<title>Results</title>
<sec disp-level="2">
<title>A putative archaeal virus encodes a DGR</title>
<p>We collected subsurface sediments from a methane seep at 820 m water depth in Santa Monica Basin. After confirming that these sediments exhibited anaerobic oxidation of methane (
<xref ref-type="supplementary-material" rid="S1">Supplementary Fig. 1</xref>
), we prepared and sequenced a viral metagenome, uncovering a novel and apparently complete viral genome (termed ANMV-1;
<xref ref-type="fig" rid="f1">Fig. 1a</xref>
). Examination of ANMV-1 coding sequences offered two key lines of evidence that this virus infects an archaeal host. First, the ANMV-1 genome encodes a TATA-box binding protein, an essential component of the transcriptional machinery in archaea and eukarya that is absent from bacteria
<xref ref-type="bibr" rid="b13">13</xref>
. Second, the ANMV-1 genome contains six genes that show sequence similarity (
<italic>e</italic>
-value 10
<sup>−7</sup>
to 10
<sup>−26</sup>
) with proteins from methanotrophic archaea (ANME-1 and ANME-2D) and none with comparable similarity to eukaryotic proteins (
<xref ref-type="supplementary-material" rid="S1">Supplementary Table 1</xref>
). We further hypothesize that ANMV-1’s archaeal host is anaerobic; ribonucleotide reductase activity is essential for phage genome replication
<xref ref-type="bibr" rid="b14">14</xref>
, and ANMV-1 encodes an oxygen-sensitive ribonucleotide reductase. In light of the active anaerobic oxidation of methane observed in the sample from which ANMV-1 was sequenced, the archaeal host may belong to an anaerobic methane-oxidizing (ANME) clade.</p>
<p>Analysis of ANMV-1 identified a cassette bearing an RT gene, two 114-bp proximal repeats that vary from each other at positions corresponding to adenines, and a short inverted repeat with potential for hairpin formation (
<xref ref-type="fig" rid="f1">Fig. 1b</xref>
). Together, these features are hallmarks of a DGR
<xref ref-type="bibr" rid="b6">6</xref>
<xref ref-type="bibr" rid="b7">7</xref>
<xref ref-type="bibr" rid="b8">8</xref>
<xref ref-type="bibr" rid="b9">9</xref>
. Since the discovery of these remarkable elements, >300 DGRs have been identified, all within the bacteria and their viruses
<xref ref-type="bibr" rid="b15">15</xref>
<xref ref-type="bibr" rid="b16">16</xref>
. ANMV-1 represents the first identification of a DGR that appears to operate in an archaeal system.</p>
<p>Although the ANMV-1 VR lies within a gene of unknown function (best BLASTp
<italic>e</italic>
-value >10
<sup>−3</sup>
, to uncharacterized proteins), the predicted secondary structure of the gene product offered important functional insights. The ANMV-1 DGR target (termed AdtA) shares greatest structural homology (37% of residues modelled with 99% Phyre confidence; r.m.s.d. 1.6 Å;
<italic>Z</italic>
=13.6) with the major tropism determinant (Mtd) of Bordetella phage BPP-1, a DGR-targeted tail fibre protein responsible for binding host ligands. AdtA contains 21 codons with potential for adenine-specific amino-acid substitutions (versus 12 in Mtd), including nine AAY codons, with potential for >10
<sup>18</sup>
variants. Thus, ANMV-1 demonstrates a degree of coding variability that is comparable to bacterial DGR systems
<xref ref-type="bibr" rid="b11">11</xref>
and outpaces the vertebrate immune system’s capacity to generate variants of antibodies or T-cell receptor proteins
<xref ref-type="bibr" rid="b17">17</xref>
<xref ref-type="bibr" rid="b18">18</xref>
. Predicted AdtA structural homology to Mtd is greatest in its C terminus, which corresponds to the C-type lectin (CLec)-fold common to many known bacterial DGR targets
<xref ref-type="bibr" rid="b11">11</xref>
<xref ref-type="bibr" rid="b15">15</xref>
. As in Mtd, the targeted AdtA residues map to partially solvent-exposed sites in the CLec domain (
<xref ref-type="supplementary-material" rid="S1">Supplementary Fig. 2</xref>
). Together, these findings point to a binding-related role for AdtA, and the genomic proximity of the
<italic>adtA</italic>
gene to phage tail fibre genes (
<xref ref-type="fig" rid="f1">Fig. 1a</xref>
) suggests host attachment as a possible function.</p>
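<p>The contribution of AAY codons to this variant space can be illustrated with a short calculation. The Python sketch below is not part of the original analysis; it assumes the standard genetic code and assumes that each adenine in a codon may be replaced by any of the four bases, and the function name reachable_amino_acids is hypothetical. Under those assumptions an AAC or AAT codon can encode 15 different amino acids and never a stop codon, so nine AAY codons alone would allow 15^9 (about 3.8 × 10^10) combinations; the remaining adenine-containing codons would account for the rest of the estimate, which this sketch does not attempt to reproduce.</p>
<preformat>
```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def reachable_amino_acids(codon):
    """Return the set of residues reachable when every A may become any base.

    Illustrative assumption: adenine positions are fully randomized and all
    other positions are copied faithfully. "*" denotes a stop codon.
    """
    options = [BASES if base == "A" else base for base in codon.upper()]
    return {CODON_TABLE["".join(variant)] for variant in product(*options)}


if __name__ == "__main__":
    for codon in ("AAC", "AAT"):
        residues = reachable_amino_acids(codon)
        print(codon, len(residues), "".join(sorted(residues)))
    print("nine AAY codons:", 15 ** 9)
```
</preformat>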
<p>The discovery of a mechanism for rapid genetic diversification in ANMV-1 raises questions about the distribution and evolution of this virus. We conducted a search for close relatives of the ANMV-1 genome in environmental metagenomic databases, identifying a group of highly similar sequences (
<xref ref-type="supplementary-material" rid="S1">Supplementary Fig. 3</xref>
) found in seafloor sediments of the Nyegga methane seeps, offshore Norway
<xref ref-type="bibr" rid="b19">19</xref>
, and in Coal Oil Point hydrocarbon seeps, offshore Santa Barbara, California. Metagenomes from both seeps cover portions of the ANMV-1 DGR cassette, including a closely related and intact RT open reading frame (ORF) from Nyegga seep sediments. These results indicate that ANMV-1 relatives are widespread in methane seeps. Furthermore, the persistence of DGR sequences in related viruses from widely separated ocean basins suggests a selective pressure to maintain the mechanism for targeted protein diversification.</p>
</sec>
<sec disp-level="2">
<title>Two Nanoarchaeota maintain multiple DGRs</title>
<p>Having identified the first DGR-containing archaeal system, an apparently widespread virus from the marine subsurface, we asked whether distinct DGRs might occur in intraterrestrial archaea themselves. We searched genomic databases for archaeal RT genes and nearby repeats with adenine variability, finding multiple putative DGRs in the two operational taxonomic units (OTU1 and OTU2) of DUSEL4, a clade of uncultivated subterranean
<italic>Nanoarchaeota</italic>
established from four sequenced cells
<xref ref-type="bibr" rid="b20">20</xref>
. Whereas the sequenced genomes of the other known nanoarchaea,
<italic>Nanoarchaeum equitans</italic>
<xref ref-type="bibr" rid="b21">21</xref>
(completely sequenced) and
<italic>Nanoarchaeote</italic>
Nst-1 (ref.
<xref ref-type="bibr" rid="b22">22</xref>
) (~91% sequenced), so far appear to contain neither DGRs nor RT genes, the DUSEL4 genomes have an abundance, with four distinct (non-redundant) DGR cassettes in a single genome (
<xref ref-type="fig" rid="f2">Fig. 2a</xref>
). Examination of DUSEL4 RT and TP sequences revealed four distinct groups of DGRs with conserved
<italic>cis</italic>
- and
<italic>trans</italic>
-acting features, each with a single representative in both OTU1 and OTU2 (
<xref ref-type="fig" rid="f2">Figs 2b</xref>
and
<xref ref-type="fig" rid="f3">3</xref>
). Intriguingly, a further search within these genomes for VR-containing genes revealed two partial DGRs—consisting only of a target gene, VR, and
<italic>cis</italic>
-acting elements—in OTU1, the representative with higher estimated genome coverage
<xref ref-type="bibr" rid="b20">20</xref>
. Evidence of adenine-directed mutagenesis in these VRs (
<xref ref-type="supplementary-material" rid="S1">Supplementary Fig. 4</xref>
) suggests a history of DGR activity at these sites, which lack an RT gene. The fragments may be fossils, left behind when the RT was recruited to a different genomic location or simply lost, or they may be diversified remotely by DGRs elsewhere in the genome.</p>
</sec>
<sec disp-level="2">
<title>Archaeal DGR components have distinct evolutionary histories</title>
<p>The possibility that DGRs might not move as a unit led us to examine the evolutionary histories of key DGR cassette components. First, we analysed the phylogeny of the newly identified archaeal DGR RTs. Canonical DGR-type RTs have been shown to form a distinct clade most closely related to bacterial group-II introns
<xref ref-type="bibr" rid="b7">7</xref>
<xref ref-type="bibr" rid="b23">23</xref>
<xref ref-type="bibr" rid="b24">24</xref>
; while known archaeal RTs are most similar to bacterial group-II and group-II-like introns, they fall outside the DGR clade
<xref ref-type="bibr" rid="b24">24</xref>
. We find that the RTs from ANMV-1 and DUSEL4 DGRs lie in a monophyletic group within the DGR clade (
<xref ref-type="fig" rid="f4">Fig. 4a</xref>
), branching separately from bacterial sequences (97% bootstrap support;
<xref ref-type="fig" rid="f4">Fig. 4b</xref>
). Underscoring the likelihood that ANMV-1 has an archaeal host, this pattern suggests that ANMV-1 and DUSEL4 DGR RTs share a common archaeal ancestry.</p>
<p>We next compared the tetranucleotide composition of DUSEL4 DGRs to that of their host genomes (for individual genome signatures, see
<xref ref-type="supplementary-material" rid="S1">Supplementary Fig. 5</xref>
) at two levels: the concatenated DGRs, and separately concatenated DGR TP genes and RT genes. While TP fragments lie well within the core genomic pattern, RT fragments present as outliers, pulling the overall DGR signature away from the genome core (
<xref ref-type="fig" rid="f5">Fig. 5a,b</xref>
). Together with the RTs’ phylogenetic relationships, this pattern suggests that DUSEL4 may have acquired its DGR RTs via horizontal transfer, perhaps from another archaeal host. The sequence conservation across multiple DGR RTs in DUSEL4 (
<xref ref-type="supplementary-material" rid="S1">Supplementary Fig. 6a</xref>
) suggests that they have a common source, perhaps a single acquisition followed by repeated gene duplications as new DGRs formed.</p>
</sec>
<sec disp-level="2">
<title>Nanoarchaeal DGRs target orphan genes</title>
<p>Most previously identified bacterial and phage DGRs diversify ligand-binding proteins, predominantly C-type lectin-like
<xref ref-type="bibr" rid="b9">9</xref>
<xref ref-type="bibr" rid="b11">11</xref>
<xref ref-type="bibr" rid="b15">15</xref>
or immunoglobulin-like folds
<xref ref-type="bibr" rid="b23">23</xref>
<xref ref-type="bibr" rid="b25">25</xref>
. By contrast, primary sequence analysis of all DUSEL4
<italic>Nanoarchaeota</italic>
DGR and DGR fragment TPs reveals that they share no protein sequence homology with either AdtA or any database representatives, but rather constitute a set of orphan genes (
<xref ref-type="supplementary-material" rid="S1">Supplementary Fig. 6b</xref>
); this finding is supported by Phyre analysis, which predicted no structural homology between characterized proteins and any nanoarchaeal TP. Initial structural investigation of one nanoarchaeal TP (OTU1 contig 3 DGR2 TP;
<xref ref-type="fig" rid="f2">Fig. 2b</xref>
) by circular dichroism (CD) revealed that the purified protein adopts a thermostable fold (
<italic>T</italic>
<sub>m</sub>
~70 °C;
<xref ref-type="supplementary-material" rid="S1">Supplementary Fig. 7</xref>
) even with limited secondary structure (12% α-helix and 25% β-strand)
<xref ref-type="bibr" rid="b26">26</xref>
. Pairwise sequence alignments of the nanoarchaeal TPs (
<xref ref-type="supplementary-material" rid="S1">Supplementary Fig. 6b</xref>
) suggest that the targets of groups i–iv are unlikely to share substantial structural homology with each other, raising the possibility that nanoarchaeal DGRs may target a broader range of protein activities than are known for bacterial and phage DGRs.</p>
</sec>
</sec>
<sec disp-level="1" sec-type="discussion">
<title>Discussion</title>
<p>Comparison of the putative archaeal DGRs with the canonical bacterial and viral DGRs reveals both similarities and distinctive features that may influence DGR function. In Bordetella phage BPP-1, certain
<italic>cis</italic>
-acting elements appear critical for efficient retrohoming, including (1) an initiation of mutagenic homing (IMH) motif that lies at the 3′ end of VR and an IMH* homologue at the 3′ end of TR; and (2) a short inverted repeat downstream of VR, capable of forming a hairpin/cruciform structure, typically with a GRNA tetraloop
<xref ref-type="bibr" rid="b10">10</xref>
. DUSEL4 DGRs appear to maintain versions of these canonical
<italic>cis</italic>
-acting elements under additional constraints. First, IMH sites in DUSEL4 include a TGGGGT motif, while DUSEL4 IMH* sites carry a corresponding TGGAAT. Second, all DUSEL4 DGR hairpins have highly constrained GRA trinucleotide loops, and each hairpin lies within its DGR’s TP gene, placing this region under selection at the level of both protein structure and DNA sequence. Investigation into the influence of these features on archaeal DGR activity may shed light on differences in the molecular mechanism of DGR retrohoming in bacterial and archaeal systems.</p>
<p>Examination of nanoarchaeal TRs suggests the capacity for individual DGRs to generate 7 × 10
<sup>10</sup>
to 9 × 10
<sup>12</sup>
variants of their TPs, with no risk of nonsense mutations (
<xref ref-type="supplementary-material" rid="S1">Supplementary Fig. 4</xref>
). Although this range is low by comparison with typical bacterial and viral DGRs, the potential evolutionary impact must be considered in light of the multiplicity of DGRs in DUSEL4
<italic>Nanoarchaeota</italic>
; whereas no bacterial or viral genome has been found to harbour >2 distinct DGRs, these nanoarchaea have ≥4. This profusion may enable subterranean nanoarchaea to explore a multidimensional fitness landscape far more rapidly than would sporadic mutation at the low rates observed for other intraterrestrial microbes
<xref ref-type="bibr" rid="b5">5</xref>
. Moreover, the fragmentary DGRs elsewhere in OTU1 suggest either that a single nanoarchaeal DGR can concurrently target multiple genes with homologous VRs, or that these DGRs are dynamic, with mobile RT/TR elements recruited from one locus to another over time. In either case, the diversity of nanoarchaeal DGR target sequences so far discovered raises the possibility that these organisms have used DGRs as a general tool for protein engineering—a hint that scientists might be able to do the same.</p>
<p>It is striking that these first discoveries of DGRs in archaeal systems should occur in a virus and in the
<italic>Nanoarchaeota</italic>
, a phylum associated with parasitism
<xref ref-type="bibr" rid="b21">21</xref>
<xref ref-type="bibr" rid="b22">22</xref>
. Whether the uncultivated organisms represented by the DUSEL4 clade live as obligate parasites remains to be determined; their more important commonality with ANMV-1 may be their occurrence in Earth’s subsurface. While massive and low-risk protein diversification offers clear advantages to any organism caught up in the Red Queen’s race, the occurrence of a DGR in the globally distributed virus ANMV-1 and the proliferation of DGRs in subterranean nanoarchaea suggest that these elements may confer additional selective advantages in a compartmentalized and energy-limited subsurface environment.</p>
</sec>
<sec disp-level="1" sec-type="methods">
<title>Methods</title>
<sec disp-level="2">
<title>Study site and sampling</title>
<p>Paull’s Pingo is a seafloor mound feature (latitude 33.799° N and longitude 118.646° W, depth ~820 m) formed by the expansion of subsurface methane hydrate
<xref ref-type="bibr" rid="b27">27</xref>
. We accessed active methane seeps at the pingo to collect sediment cores using deep submergence vehicle
<italic>Alvin</italic>
, during R/V
<italic>Atlantis</italic>
Leg AT15-53 (September 2009). Sediment core processing was conducted shipboard in an anaerobic chamber, flushed with a nitrogen headspace. One sediment core was subsectioned between 5 and 15 cm (relative to seafloor) and dedicated to methane-amended incubations. Two subsamples of 60 ml sediment were homogenized with 20 ml of sterile, anoxic artificial seawater medium
<xref ref-type="bibr" rid="b28">28</xref>
. Incubations with the homogenized sediments were prepared in 120-ml serum vials, under a 40-ml headspace of ~3% CH
<sub>4</sub>
and 97% N
<sub>2</sub>
. Incubations were amended with
<sup>13</sup>
C-labelled methane (99 atom-%
<sup>13</sup>
C) as an exogenous tracer to track methane oxidation (
<xref ref-type="supplementary-material" rid="S1">Supplementary Fig. 1</xref>
). Stable isotope ratios (
<italic>δ</italic>
<sup>13</sup>
C) for CO
<sub>2</sub>
were measured by isotope ratio mass spectrometry (Thermo Finnigan Delta XP Plus in continuous flow mode). After 1 month of enrichment, the incubation was terminated and viruses were purified for DNA sequencing.</p>
</sec>
<sec disp-level="2">
<title>Virome purification and DNA sequencing</title>
<p>Incubation slurry samples (1:2 sediment:aqueous phase) were used for virus particle purifications. Samples were vigorously homogenized by vortexing (15 min), followed by centrifugation (10 min, 500
<italic>g</italic>
). Supernatant was filtered (0.22 μm) to separate viruses from cells. Viruses were concentrated and viral DNA was extracted as previously described
<xref ref-type="bibr" rid="b29">29</xref>
. Briefly, virus particles were concentrated via caesium chloride density gradient ultracentrifugation (2 h, 22,000 
<italic>g</italic>
, 4 °C) and treated with DNase-I. DNA was extracted by cetrimonium bromide (CTAB)-chloroform and phenol-chloroform separation. Before viral DNA amplification, a 16S PCR assay to screen for cellular DNA contamination was performed with universal bacterial primers Bact27F (5′- AGAGTTTGATCCTGGCTCAG -3′) and Bact1492R (5′- GGTTACCTTGTTACGACTT -3′). Following this check, we performed Phi29 polymerase multiple displacement amplification (MDA) using the Illustra Genomiphi HY DNA Amplification Kit (GE Healthcare). Thermal cycling steps for denaturing template DNA, polymerase amplification, and post-amplification enzyme inactivation were performed according to the manufacturer’s specifications, except that the MDA amplification reaction was incubated for 2 h instead of 4 h (2 h, 30 °C). Amplified product was pyrosequenced on 454-titanium plates at the Broad Institute, as part of the Moore Marine Phage Metagenome Initiative
<xref ref-type="bibr" rid="b30">30</xref>
. Metagenomic reads can be obtained under the NCBI BioSample accession code PRJNA47435.DV-ANM1.</p>
</sec>
<sec disp-level="2">
<title>Read preprocessing, binning, and assembly</title>
<p>Raw sequencing reads were first scanned for sequencing primers, which were identified and removed using TagCleaner
<xref ref-type="bibr" rid="b31">31</xref>
. The reads were then preprocessed to remove low-quality sequence following the method of Hurwitz
<italic>et al</italic>
.
<xref ref-type="bibr" rid="b32">32</xref>
, using a custom R script. Preprocessing included, first, removal of any reads with ambiguous (N) bases; second, removal of the shortest 2.5% and longest 2.5% of reads; third, removal of reads with mean quality score >2 s.d. below the mean; and finally, de-replication with CD-Hit 454 (ref.
<xref ref-type="bibr" rid="b33">33</xref>
).</p>
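<p>The filtering steps above can be sketched in Python. This is an illustrative reimplementation under stated assumptions, not the custom R script used in the study: reads are represented as (sequence, quality scores) pairs, the length tails are trimmed by rank, the population standard deviation is used for the quality cutoff, and the final CD-Hit 454 de-replication step is omitted. The function name preprocess_reads and its parameters are hypothetical.</p>
<preformat>
```python
from statistics import mean, pstdev


def preprocess_reads(reads, tail_fraction=0.025, sd_cutoff=2.0):
    """Filter a list of (sequence, quality_scores) pairs in three steps."""
    # Step 1: drop reads that contain ambiguous (N) bases.
    kept = [read for read in reads if "N" not in read[0].upper()]
    if not kept:
        return []

    # Step 2: drop the shortest and longest tails of the length distribution.
    kept.sort(key=lambda read: len(read[0]))
    n_trim = int(len(kept) * tail_fraction)
    kept = kept[n_trim:len(kept) - n_trim]

    # Step 3: drop reads whose mean quality lies more than sd_cutoff
    # standard deviations below the mean of the per-read mean qualities.
    read_means = [mean(quality) for _, quality in kept]
    threshold = mean(read_means) - sd_cutoff * pstdev(read_means)
    return [read for read, m in zip(kept, read_means) if m >= threshold]
```
</preformat>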
<p>Reads that passed preprocessing and quality control (QC) steps were subjected to
<italic>de novo</italic>
assembly using CAMERA’s meta-assembler
<xref ref-type="bibr" rid="b34">34</xref>
. As this assembler does not permit user manipulation of read overlap parameters, we compared the meta-assembler output with a custom reassembly approach using Geneious v7.0 (Biomatters Ltd) with the following parameters: minimum overlap 35 bases, overlap pairwise identity 90% and index word length 12 nt. The ANMV-1 contig described in this study was generated from the meta-assembly and aligned globally with 97.7% pairwise nucleotide similarity to a contig obtained by the second custom
<italic>de novo</italic>
assembly. PCR screening confirmed the authenticity of the ANMV-1 DGR cassette in both template and MDA-amplified viral DNA, using primers that partially overlap TP, RT and VR/TR regions: ANMVdgrF (5′- AGGCGATGCAGACGAATGGC -3′) and ANMVdgrR (5′- TTGCCCAGAGTTACACCGAGCG -3′).</p>
</sec>
<sec disp-level="2">
<title>Metagenome annotations</title>
<p>Prediction of open reading frames was performed using Glimmer3 (ref.
<xref ref-type="bibr" rid="b35">35</xref>
) with default parameters. Translated ORF sequences were annotated via CAMERA-HMM and BLASTp
<xref ref-type="bibr" rid="b36">36</xref>
searches against the following databases: TIGRfam, Pfam, COG and NCBI-nr (
<italic>e</italic>
-value <10
<sup>−3</sup>
). To determine which ORFs from the ANMV-1 genome share similarity to viral and prophage sequences, we compared our contig’s translated ORFs with the ACLAME prophage-specific database
<xref ref-type="bibr" rid="b37">37</xref>
. To assess similarity to proteins from anaerobic methane-oxidizing archaea, we inspected NCBI-nr BLASTp results for ANME protein hits (uncultured archaeon, ANME-1; ‘
<italic>Candidatus</italic>
Methanoperedens nitroreducens’, ANME-2D; and uncultured archaeon, Gfoz37D1). A BLASTn survey was conducted against environmental metagenomic databases, including NCBI metagenomic sequences (env_nt), Moore Marine Virus Metagenomes
<xref ref-type="bibr" rid="b30">30</xref>
and Pacific Ocean Virome sequences
<xref ref-type="bibr" rid="b38">38</xref>
, to find representatives sharing high nucleotide similarity (
<italic>e</italic>
-value <10
<sup>−20</sup>
; 28-nt word size) with ANMV-1.</p>
<p>The putative DGR TP of ANMV-1, AdtA, was analysed using Phyre2 (ref.
<xref ref-type="bibr" rid="b40">40</xref>
) to find functional representatives based on secondary structural homology. Residues of the TP that aligned with high confidence to the CLec fold region of the Mtd protein of
<italic>Bordetella</italic>
phage BPP-1 (Phyre confidence >90%) were used to predict a three-dimensional model. Residue positioning was assessed by Ramachandran analysis and C-terminal variable residues were mapped from the primary sequence onto the predicted structure using Geneious v7.0 (Biomatters Ltd).</p>
</sec>
<sec disp-level="2">
<title>Comparative analysis of Nanoarchaeota genomes</title>
<p>We identified DGR-like RTs via BLASTp searches against the NCBI-WGS database. For an initial proxy of DGR repeat features, we used the EMBOSS tool Dotmatcher
<xref ref-type="bibr" rid="b40">40</xref>
to perform a dotplot analysis of homologous regions with moderate proximity (±5 kb) to RT. Candidate TR/VR pairs were confirmed when their variability was predominantly adenine-specific: at least 10 adenine-specific mismatches with respect to one strand, and no more than 2 non-adenine mismatches per 100 bp of aligned sequence.</p>
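<p>The TR/VR acceptance criteria can be expressed as a small check. The following Python sketch is an assumption-laden illustration rather than the procedure actually run: it takes two gap-free, equal-length aligned strings, treats a mismatch as adenine-specific when the TR base is A (or T, to cover adenines on the opposite strand), and scales the non-adenine allowance linearly with alignment length. The function names count_mismatches and looks_like_dgr_repeat are hypothetical.</p>
<preformat>
```python
def count_mismatches(tr, vr, template_base="A"):
    """Count (adenine-specific, other) mismatches between aligned TR and VR."""
    if len(tr) != len(vr):
        raise ValueError("TR and VR must be aligned to the same length")
    adenine = other = 0
    for t, v in zip(tr.upper(), vr.upper()):
        if t == v:
            continue
        if t == template_base:
            adenine += 1
        else:
            other += 1
    return adenine, other


def looks_like_dgr_repeat(tr, vr, min_adenine=10, max_other_per_100bp=2):
    """Apply the thresholds from the text to either strand of the alignment."""
    for template_base in ("A", "T"):  # "T" stands for A on the opposite strand
        adenine, other = count_mismatches(tr, vr, template_base)
        within_allowance = max_other_per_100bp * len(tr) >= 100 * other
        if adenine >= min_adenine and within_allowance:
            return True
    return False
```
</preformat>
<p>For example, a 100-bp pair differing at 12 template adenines and nowhere else would pass, whereas the same pair with three additional non-adenine mismatches would not.</p>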
<p>DGR-containing sequences that were analysed in this study are from single-cell genomes belonging to DUSEL4
<italic>Nanoarchaeota</italic>
, which were broadly described as part of a genome and metagenome annotation study on ‘microbial dark matter’, published elsewhere
<xref ref-type="bibr" rid="b20">20</xref>
. DUSEL4
<italic>Nanoarchaeota</italic>
representatives were previously assigned into two OTUs comprising four single-cell genomes. We describe
<italic>Nanoarchaeota</italic>
DGRs with reference to their occurrence in combined single-cell sequence assemblies: OTU1 (genomes AAA011-G17 and AAA011-L22) and OTU2 (genomes AAA011-J02 and AAA011-K22). To confirm the presence of multiple distinct DGRs in one single-cell genome, we aligned OTU1 sequences with contigs from
<italic>Nanoarchaeota</italic>
AAA011-G17, which has the highest genome completeness of the DUSEL4 representatives
<xref ref-type="bibr" rid="b20">20</xref>
.</p>
<p>Nanoarchaeota RT sequences were aligned using ClustalW
<xref ref-type="bibr" rid="b41">41</xref>
with sequences containing the catalytic RT domain, representing DGRs, group-II introns, retrons, long terminal repeats (LTRs), retroviruses, non-LTR elements and retroplasmids. The alignment was compared with a position-specific scoring matrix for the RVT-1 protein family (PF00078), and was manually realigned to conserve motifs considered essential for RT activity. Trees were constructed in MEGA v5.2 (ref.
<xref ref-type="bibr" rid="b42">42</xref>
) using PhyML
<xref ref-type="bibr" rid="b42">42</xref>
with the model LG+G+F. In addition, a PhyML tree was constructed from concatenated alignments of RT and TP amino-acid sequences to compare sequence similarities amongst Nanoarchaeota DGR cassettes.</p>
</sec>
<sec disp-level="2">
<title>TP expression and purification</title>
<p>Coding sequences of nanoarchaeal TPs were synthesized with codons optimal for expression in
<italic>Escherichia coli</italic>
(GENEWIZ, Inc.) and cloned into a modified pET28b expression vector with an N-terminal His-tag followed by a PreScission protease cleavage site. Construct integrity was confirmed by DNA sequencing. TPs were expressed in
<italic>Escherichia coli</italic>
BL21-Gold (DE3) cells. Bacteria were grown with shaking at 37 °C to an optical density (OD600) of 0.6–0.8 and then cooled to room temperature, followed by induction with 0.5 mM isopropyl β-
<sc>D</sc>
-1-thiogalactopyranoside. Bacteria were grown with shaking at room temperature for 5–6 h further, then harvested by centrifugation (25 min, 4,000
<italic>g</italic>
, 4 °C); the bacterial pellet was frozen at −80 °C.</p>
<p>Cells were thawed and resuspended in buffer A (300 mM NaCl, 50 mM Tris (pH 8) and 5 mM β-mercaptoethanol; 20 ml l
<sup>−1</sup>
of bacterial culture) supplemented with 1 mM phenylmethylsulfonyl fluoride (PMSF). The bacteria were lysed by sonication and the lysate was centrifuged (30 min, 35,000 
<italic>g</italic>
, 4 °C). The following steps were performed at 4 °C. The supernatant was applied to a column containing His-Select Nickel affinity gel (Sigma, 1 ml of resin per 20 ml of bacterial lysate), which had been equilibrated with buffer A. The column was washed with five column volumes of buffer B (300 mM NaCl, 20 mM Tris (pH 8) and 5 mM β-mercaptoethanol) containing 20 mM imidazole, and the TP was eluted with buffer B containing 250 mM imidazole. The His-tag was removed by PreScission protease cleavage (1:50 TP: protease mass ratio) overnight at 4 °C. Cleaved TP was separated from non-cleaved proteins by applying the sample to a His-Select Nickel affinity gel column (Sigma) and collecting the flowthrough. The TP was further purified by gel filtration chromatography (Superdex 75) in 300 mM NaCl, 20 mM Tris (pH 8) and 1 mM dithiothreitol. Purified protein was concentrated to 2 mg ml
<sup>−1</sup>
using ultrafiltration (10 kDa MWCO Amicon, Millipore); the concentration of TP was determined using a calculated molar extinction coefficient at 280 nm of 28,880 M
<sup>−1</sup>
 cm
<sup>−1</sup>
.</p>
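<p>The concentration determination follows the Beer-Lambert law, c = A / (ε × l). A minimal Python sketch is given below; the extinction coefficient is the value reported above, while the function name molar_concentration, the default 1-cm path length and the example absorbance are illustrative assumptions.</p>
<preformat>
```python
EXTINCTION_COEFFICIENT = 28880.0  # M^-1 cm^-1 at 280 nm, as reported in the text


def molar_concentration(absorbance_280, path_length_cm=1.0,
                        epsilon=EXTINCTION_COEFFICIENT):
    """Beer-Lambert law: c = A / (epsilon * l), returned in mol per litre."""
    if not (path_length_cm > 0 and epsilon > 0):
        raise ValueError("path length and extinction coefficient must be positive")
    return absorbance_280 / (epsilon * path_length_cm)


if __name__ == "__main__":
    # Hypothetical reading: A280 = 0.2888 in a 1-cm cuvette.
    print(molar_concentration(0.2888) * 1e6, "micromolar")
```
</preformat>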
</sec>
<sec disp-level="2">
<title>CD spectroscopy</title>
<p>CD spectra were collected for the purified nanoarchaeal TP at 10 μM in 300 mM NaF, 20 mM sodium phosphate buffer, pH 8, 1 mM dithiothreitol on an Aviv 202 CD spectrometer using a 1-mm pathlength cuvette. Spectra were recorded from 195 to 260 nm at 25 °C, with 1 nm wavelength steps and the measurement at each wavelength being averaged for 30 s. A temperature melt study was carried out by increasing the temperature of the sample from 4 to 90 °C in 1 °C increments, with the ellipticity being monitored at 216 nm. The sample was then incubated at 90 °C for 2 min and cooled from 90 to 4 °C in 1 °C decrements, with the ellipticity being monitored at 216 nm.</p>
</sec>
<sec disp-level="2">
<title>Tetranucleotide composition analysis</title>
<p>Tetranucleotide composition analysis can be used to identify core genome signatures to aid in taxonomic assignment, or to differentiate conserved protein-coding regions from those that were horizontally acquired
<xref ref-type="bibr" rid="b44">44</xref>
<xref ref-type="bibr" rid="b45">45</xref>
<xref ref-type="bibr" rid="b46">46</xref>
. Tetranucleotide distributions of Nanoarchaeota genomes were determined as previously described
<xref ref-type="bibr" rid="b43">43</xref>
, using a custom Python script. Briefly, sequences were fragmented with a 5-kb sliding window (500-bp overlapping step). Tetranucleotide frequencies were calculated by a zero-order Markov method, which applies odds ratios of observed counts for the 256 unique 4-mers, normalized to their respective mononucleotide frequencies. In order to assess tetranucleotide signatures for DGR regions (~2 kb each), while avoiding a compositional bias of flanking sequence, we concatenated DGR cassettes from both OTU1 and OTU2 and fragmented this DGR-specific sequence (~21 kb) with a sliding window as above. In addition, sequences from RT genes and TP genes were separately concatenated and fragmented with a sliding window as above to compare tetranucleotide compositions for the two DGR components. Dimensionality reduction was performed via non-metric multidimensional scaling on Euclidean distances, using the vegan package in R
<xref ref-type="bibr" rid="b47">47</xref>
, and ordination ellipses representing the 95% confidence region were drawn with the ‘ordiellipse()’ function.</p>
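<p>The windowing and odds-ratio steps can be sketched as follows. This Python example is not the custom script used in the study; it assumes that the 500-bp figure is the distance between successive window starts, that only windows of full length are kept (except when the sequence is shorter than one window), and that the zero-order expectation for a 4-mer is the product of its mononucleotide frequencies within the same fragment. The function names are hypothetical, and the subsequent non-metric multidimensional scaling is not reproduced here.</p>
<preformat>
```python
from collections import Counter
from itertools import product

TETRAMERS = ["".join(p) for p in product("ACGT", repeat=4)]


def sliding_windows(seq, size=5000, step=500):
    """Split seq into windows of `size` bases starting every `step` bases."""
    last_start = max(len(seq) - size, 0)
    return [seq[i:i + size] for i in range(0, last_start + 1, step)]


def tetranucleotide_odds(fragment):
    """Observed/expected ratio for each of the 256 tetranucleotides.

    Expected frequencies come from a zero-order Markov model, i.e. the
    product of the four mononucleotide frequencies in the fragment.
    """
    fragment = fragment.upper()
    mono = Counter(base for base in fragment if base in "ACGT")
    total_bases = sum(mono.values())
    if total_bases == 0:
        return {k: 0.0 for k in TETRAMERS}

    counts = Counter(fragment[i:i + 4] for i in range(len(fragment) - 3))
    total_tetra = sum(counts[k] for k in TETRAMERS)

    odds = {}
    for k in TETRAMERS:
        expected = 1.0
        for base in k:
            expected *= mono[base] / total_bases
        observed = counts[k] / total_tetra if total_tetra else 0.0
        odds[k] = observed / expected if expected else 0.0
    return odds
```
</preformat>
<p>Each window then yields a 256-element vector, and vectors from genome, DGR, RT and TP fragments can be compared by any distance-based ordination.</p>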
</sec>
</sec>
<sec disp-level="1">
<title>Author contributions</title>
<p>B.G.P. performed the sediment incubations and purified viral DNA. B.G.P. and S.C.B. performed preprocessing and annotation of the metagenomic data set. B.G.P., S.C.B., E.C., D.A., S.H., A.S., P.G., J.F.M. and D.L.V. conducted bioinformatic analyses of DGR sequences. S.H. and P.G. expressed and assayed nanoarchaeal target proteins and analysed the resulting data. B.G.P., S.C.B. and D.L.V. wrote the manuscript.</p>
</sec>
<sec disp-level="1">
<title>Additional information</title>
<p>
<bold>Accession codes</bold>
: Metagenomic sequence reads have been deposited in the NCBI BioSample database with accession code PRJNA47435.DV-ANM1. The ANMV-1 assembled genome sequence has been deposited in the NCBI nucleotide database with the accession code KP703175.</p>
<p>
<bold>How to cite this article:</bold>
Paul, B. G.
<italic>et al</italic>
. Targeted diversity generation by intraterrestrial archaea and archaeal viruses.
<italic>Nat. Commun.</italic>
6:6585 doi: 10.1038/ncomms7585 (2015).</p>
</sec>
<sec sec-type="supplementary-material" id="S1">
<title>Supplementary Material</title>
<supplementary-material id="d33e18" content-type="local-data">
<caption>
<title>Supplementary Information</title>
<p>Supplementary Figures 1-7 and Supplementary Table 1</p>
</caption>
<media xlink:href="ncomms7585-s1.pdf"></media>
</supplementary-material>
</sec>
</body>
<back>
<ack>
<p>This research was funded by National Science Foundation grant OCE-1046144 to D.L.V. and National Institutes of Health grant R01 AI069838 to P.G. and J.F.M.; sequencing was provided through a Gordon and Betty Moore Foundation grant to the Broad Institute. We thank Tanja Woyke for assistance in examining
<italic>Nanoarchaeota</italic>
sequences from the Microbial Dark Matter project. For assistance with viral metagenome preparation and advice on bioinformatic analyses, we thank Steven Quistad and Rob Edwards. Yanling Wang provided helpful comments on an earlier draft of the manuscript.</p>
</ack>
<ref-list>
<ref id="b1">
<mixed-citation publication-type="journal">
<name>
<surname>Kallmeyer</surname>
<given-names>J.</given-names>
</name>
,
<name>
<surname>Pockalny</surname>
<given-names>R.</given-names>
</name>
,
<name>
<surname>Adhikari</surname>
<given-names>R. R.</given-names>
</name>
,
<name>
<surname>Smith</surname>
<given-names>D. C.</given-names>
</name>
&
<name>
<surname>D'Hondt</surname>
<given-names>S.</given-names>
</name>
<article-title>Global distribution of microbial abundance and biomass in subseafloor sediment</article-title>
.
<source>Proc. Natl Acad. Sci. USA</source>
<volume>109</volume>
,
<fpage>16213</fpage>
<lpage>16216</lpage>
(
<year>2012</year>
) .
<pub-id pub-id-type="pmid">22927371</pub-id>
</mixed-citation>
</ref>
<ref id="b2">
<mixed-citation publication-type="journal">
<name>
<surname>Lipp</surname>
<given-names>J.</given-names>
</name>
,
<name>
<surname>Morono</surname>
<given-names>Y.</given-names>
</name>
,
<name>
<surname>Inagaki</surname>
<given-names>F.</given-names>
</name>
&
<name>
<surname>Hinrichs</surname>
<given-names>K.-U.</given-names>
</name>
<article-title>Significant contribution of Archaea to extant biomass in marine subsurface sediments</article-title>
.
<source>Nature</source>
<volume>454</volume>
,
<fpage>991</fpage>
<lpage>994</lpage>
(
<year>2008</year>
) .
<pub-id pub-id-type="pmid">18641632</pub-id>
</mixed-citation>
</ref>
<ref id="b3">
<mixed-citation publication-type="journal">
<name>
<surname>Valentine</surname>
<given-names>D. L.</given-names>
</name>
<article-title>Adaptations to energy stress dictate the ecology and evolution of the Archaea</article-title>
.
<source>Nat. Rev. Microbiol.</source>
<volume>5</volume>
,
<fpage>316</fpage>
<lpage>323</lpage>
(
<year>2007</year>
) .
<pub-id pub-id-type="pmid">17334387</pub-id>
</mixed-citation>
</ref>
<ref id="b4">
<mixed-citation publication-type="journal">
<name>
<surname>Hoehler</surname>
<given-names>T. M.</given-names>
</name>
&
<name>
<surname>Jørgensen</surname>
<given-names>B. B.</given-names>
</name>
<article-title>Microbial life under extreme energy limitation</article-title>
.
<source>Nat. Rev. Microbiol.</source>
<volume>11</volume>
,
<fpage>83</fpage>
<lpage>94</lpage>
(
<year>2013</year>
) .
<pub-id pub-id-type="pmid">23321532</pub-id>
</mixed-citation>
</ref>
<ref id="b5">
<mixed-citation publication-type="journal">
<name>
<surname>Lewin</surname>
<given-names>A.</given-names>
</name>
<etal></etal>
.
<article-title>The microbial communities in two apparently physically separated deep subsurface oil reservoirs show extensive DNA sequence similarities</article-title>
.
<source>Environ. Microbiol.</source>
<volume>16</volume>
,
<fpage>545</fpage>
<lpage>558</lpage>
(
<year>2014</year>
) .
<pub-id pub-id-type="pmid">23827055</pub-id>
</mixed-citation>
</ref>
<ref id="b6">
<mixed-citation publication-type="journal">
<name>
<surname>Liu</surname>
<given-names>M.</given-names>
</name>
<etal></etal>
.
<article-title>Reverse transcriptase-mediated tropism switching in Bordetella bacteriophage</article-title>
.
<source>Science</source>
<volume>295</volume>
,
<fpage>2091</fpage>
<lpage>2094</lpage>
(
<year>2002</year>
) .
<pub-id pub-id-type="pmid">11896279</pub-id>
</mixed-citation>
</ref>
<ref id="b7">
<mixed-citation publication-type="journal">
<name>
<surname>Doulatov</surname>
<given-names>S.</given-names>
</name>
<etal></etal>
.
<article-title>Tropism switching in Bordetella bacteriophage defines a family of diversity-generating retroelements</article-title>
.
<source>Nature</source>
<volume>431</volume>
,
<fpage>476</fpage>
<lpage>481</lpage>
(
<year>2004</year>
) .
<pub-id pub-id-type="pmid">15386016</pub-id>
</mixed-citation>
</ref>
<ref id="b8">
<mixed-citation publication-type="journal">
<name>
<surname>Medhekar</surname>
<given-names>B.</given-names>
</name>
&
<name>
<surname>Miller</surname>
<given-names>J. F.</given-names>
</name>
<article-title>Diversity-generating retroelements</article-title>
.
<source>Curr. Opin. Microbiol.</source>
<volume>10</volume>
,
<fpage>388</fpage>
<lpage>395</lpage>
(
<year>2007</year>
) .
<pub-id pub-id-type="pmid">17703991</pub-id>
</mixed-citation>
</ref>
<ref id="b9">
<mixed-citation publication-type="journal">
<name>
<surname>McMahon</surname>
<given-names>S. A.</given-names>
</name>
<etal></etal>
.
<article-title>The C-type lectin fold as an evolutionary solution for massive sequence variation</article-title>
.
<source>Nat. Struct. Mol. Biol.</source>
<volume>12</volume>
,
<fpage>886</fpage>
<lpage>892</lpage>
(
<year>2005</year>
) .
<pub-id pub-id-type="pmid">16170324</pub-id>
</mixed-citation>
</ref>
<ref id="b10">
<mixed-citation publication-type="journal">
<name>
<surname>Guo</surname>
<given-names>H.</given-names>
</name>
<etal></etal>
.
<article-title>Diversity-generating retroelement homing regenerates target sequences for repeated rounds of codon rewriting and protein diversification</article-title>
.
<source>Mol. Cell</source>
<volume>31</volume>
,
<fpage>813</fpage>
<lpage>823</lpage>
(
<year>2008</year>
) .
<pub-id pub-id-type="pmid">18922465</pub-id>
</mixed-citation>
</ref>
<ref id="b11">
<mixed-citation publication-type="journal">
<name>
<surname>Le Coq</surname>
<given-names>J.</given-names>
</name>
&
<name>
<surname>Ghosh</surname>
<given-names>P.</given-names>
</name>
<article-title>Conservation of the C-type lectin fold for massive sequence variation in a Treponema diversity-generating retroelement</article-title>
.
<source>Proc. Natl Acad. Sci. USA</source>
<volume>108</volume>
,
<fpage>14649</fpage>
<lpage>14653</lpage>
(
<year>2011</year>
) .
<pub-id pub-id-type="pmid">21873231</pub-id>
</mixed-citation>
</ref>
<ref id="b12">
<mixed-citation publication-type="journal">
<name>
<surname>Rohwer</surname>
<given-names>F.</given-names>
</name>
&
<name>
<surname>Vega Thurber</surname>
<given-names>R.</given-names>
</name>
<article-title>Viruses manipulate the marine environment</article-title>
.
<source>Nature</source>
<volume>459</volume>
,
<fpage>207</fpage>
<lpage>212</lpage>
(
<year>2009</year>
) .
<pub-id pub-id-type="pmid">19444207</pub-id>
</mixed-citation>
</ref>
<ref id="b13">
<mixed-citation publication-type="journal">
<name>
<surname>Rowlands</surname>
<given-names>T.</given-names>
</name>
,
<name>
<surname>Baumann</surname>
<given-names>P.</given-names>
</name>
&
<name>
<surname>Jackson</surname>
<given-names>S. P.</given-names>
</name>
<article-title>The TATA-binding protein: a general transcription factor in eukaryotes and archaebacteria</article-title>
.
<source>Science</source>
<volume>264</volume>
,
<fpage>1326</fpage>
<lpage>1329</lpage>
(
<year>1994</year>
) .
<pub-id pub-id-type="pmid">8191287</pub-id>
</mixed-citation>
</ref>
<ref id="b14">
<mixed-citation publication-type="journal">
<name>
<surname>Dwivedi</surname>
<given-names>B.</given-names>
</name>
,
<name>
<surname>Xue</surname>
<given-names>B.</given-names>
</name>
,
<name>
<surname>Lundin</surname>
<given-names>D.</given-names>
</name>
,
<name>
<surname>Edwards</surname>
<given-names>R. A.</given-names>
</name>
&
<name>
<surname>Breitbart</surname>
<given-names>M.</given-names>
</name>
<article-title>A bioinformatic analysis of ribonucleotide reductase genes in phage genomes and metagenomes</article-title>
.
<source>BMC Evol. Biol.</source>
<volume>13</volume>
,
<fpage>33</fpage>
(
<year>2013</year>
) .
<pub-id pub-id-type="pmid">23391036</pub-id>
</mixed-citation>
</ref>
<ref id="b15">
<mixed-citation publication-type="journal">
<name>
<surname>Arambula</surname>
<given-names>D.</given-names>
</name>
<etal></etal>
.
<article-title>Surface display of a massively variable lipoprotein by a Legionella diversity-generating retroelement</article-title>
.
<source>Proc. Natl Acad. Sci. USA</source>
<volume>110</volume>
,
<fpage>8212</fpage>
<lpage>8217</lpage>
(
<year>2013</year>
) .
<pub-id pub-id-type="pmid">23633572</pub-id>
</mixed-citation>
</ref>
<ref id="b16">
<mixed-citation publication-type="journal">
<name>
<surname>Schillinger</surname>
<given-names>T.</given-names>
</name>
,
<name>
<surname>Lisfi</surname>
<given-names>M.</given-names>
</name>
,
<name>
<surname>Chi</surname>
<given-names>J.</given-names>
</name>
,
<name>
<surname>Cullum</surname>
<given-names>J.</given-names>
</name>
&
<name>
<surname>Zingler</surname>
<given-names>N.</given-names>
</name>
<article-title>Analysis of a comprehensive dataset of diversity generating retroelements generated by the program DiGReF</article-title>
.
<source>BMC Genomics</source>
<volume>13</volume>
,
<fpage>430</fpage>
(
<year>2012</year>
) .
<pub-id pub-id-type="pmid">22928525</pub-id>
</mixed-citation>
</ref>
<ref id="b17">
<mixed-citation publication-type="journal">
<name>
<surname>Goldrath</surname>
<given-names>A. W.</given-names>
</name>
&
<name>
<surname>Bevan</surname>
<given-names>M. J.</given-names>
</name>
<article-title>Selecting and maintaining a diverse T-cell repertoire</article-title>
.
<source>Nature</source>
<volume>402</volume>
,
<fpage>255</fpage>
<lpage>262</lpage>
(
<year>1999</year>
) .
<pub-id pub-id-type="pmid">10580495</pub-id>
</mixed-citation>
</ref>
<ref id="b18">
<mixed-citation publication-type="journal">
<name>
<surname>Alder</surname>
<given-names>M. N.</given-names>
</name>
<etal></etal>
.
<article-title>Diversity and function of adaptive immune receptors in a jawless vertebrate</article-title>
.
<source>Science</source>
<volume>310</volume>
,
<fpage>1970</fpage>
<lpage>1973</lpage>
(
<year>2005</year>
) .
<pub-id pub-id-type="pmid">16373579</pub-id>
</mixed-citation>
</ref>
<ref id="b19">
<mixed-citation publication-type="journal">
<name>
<surname>Stokke</surname>
<given-names>R.</given-names>
</name>
,
<name>
<surname>Roalkvam</surname>
<given-names>I.</given-names>
</name>
,
<name>
<surname>Lanzen</surname>
<given-names>A.</given-names>
</name>
,
<name>
<surname>Haflidason</surname>
<given-names>H.</given-names>
</name>
&
<name>
<surname>Steen</surname>
<given-names>I. H.</given-names>
</name>
<article-title>Integrated metagenomic and metaproteomic analyses of an ANME-1-dominated community in marine cold seep sediments</article-title>
.
<source>Environ. Microbiol.</source>
<volume>14</volume>
,
<fpage>1333</fpage>
<lpage>1346</lpage>
(
<year>2012</year>
) .
<pub-id pub-id-type="pmid">22404914</pub-id>
</mixed-citation>
</ref>
<ref id="b20">
<mixed-citation publication-type="journal">
<name>
<surname>Rinke</surname>
<given-names>C.</given-names>
</name>
<etal></etal>
.
<article-title>Insights into the phylogeny and coding potential of microbial dark matter</article-title>
.
<source>Nature</source>
<volume>499</volume>
,
<fpage>431</fpage>
<lpage>437</lpage>
(
<year>2013</year>
) .
<pub-id pub-id-type="pmid">23851394</pub-id>
</mixed-citation>
</ref>
<ref id="b21">
<mixed-citation publication-type="journal">
<name>
<surname>Huber</surname>
<given-names>H.</given-names>
</name>
<etal></etal>
.
<article-title>A new phylum of Archaea represented by a nanosized hyperthermophilic symbiont</article-title>
.
<source>Nature</source>
<volume>417</volume>
,
<fpage>63</fpage>
<lpage>67</lpage>
(
<year>2002</year>
) .
<pub-id pub-id-type="pmid">11986665</pub-id>
</mixed-citation>
</ref>
<ref id="b22">
<mixed-citation publication-type="journal">
<name>
<surname>Podar</surname>
<given-names>M.</given-names>
</name>
<etal></etal>
.
<article-title>Insights into archaeal evolution and symbiosis from the genomes of a nanoarchaeon and its inferred crenarchaeal host from Obsidian Pool, Yellowstone National Park</article-title>
.
<source>Biol. Direct</source>
<volume>8</volume>
,
<fpage>9</fpage>
(
<year>2013</year>
) .
<pub-id pub-id-type="pmid">23607440</pub-id>
</mixed-citation>
</ref>
<ref id="b23">
<mixed-citation publication-type="journal">
<name>
<surname>Minot</surname>
<given-names>S.</given-names>
</name>
,
<name>
<surname>Grunberg</surname>
<given-names>S.</given-names>
</name>
,
<name>
<surname>Wu</surname>
<given-names>G. D.</given-names>
</name>
,
<name>
<surname>Lewis</surname>
<given-names>J. D.</given-names>
</name>
&
<name>
<surname>Bushman</surname>
<given-names>F. D.</given-names>
</name>
<article-title>Hypervariable loci in the human gut virome</article-title>
.
<source>Proc. Natl Acad. Sci. USA</source>
<volume>109</volume>
,
<fpage>3962</fpage>
<lpage>3966</lpage>
(
<year>2012</year>
) .
<pub-id pub-id-type="pmid">22355105</pub-id>
</mixed-citation>
</ref>
<ref id="b24">
<mixed-citation publication-type="journal">
<name>
<surname>Simon</surname>
<given-names>D. M.</given-names>
</name>
&
<name>
<surname>Zimmerly</surname>
<given-names>S.</given-names>
</name>
<article-title>A diversity of uncharacterized reverse transcriptases in bacteria</article-title>
.
<source>Nucleic Acids Res.</source>
<volume>36</volume>
,
<fpage>7219</fpage>
<lpage>7229</lpage>
(
<year>2008</year>
) .
<pub-id pub-id-type="pmid">19004871</pub-id>
</mixed-citation>
</ref>
<ref id="b25">
<mixed-citation publication-type="journal">
<name>
<surname>Ye</surname>
<given-names>Y.</given-names>
</name>
<article-title>Identification of diversity-generating retroelements in human microbiomes</article-title>
.
<source>Int. J. Mol. Sci.</source>
<volume>15</volume>
,
<fpage>14234</fpage>
<lpage>14246</lpage>
(
<year>2014</year>
) .
<pub-id pub-id-type="pmid">25196521</pub-id>
</mixed-citation>
</ref>
<ref id="b26">
<mixed-citation publication-type="journal">
<name>
<surname>Louis-Jeune</surname>
<given-names>C.</given-names>
</name>
,
<name>
<surname>Andrade-Navarro</surname>
<given-names>M. A.</given-names>
</name>
&
<name>
<surname>Perez-Iratxeta</surname>
<given-names>C.</given-names>
</name>
<article-title>Prediction of protein secondary structure from circular dichroism using theoretically derived spectra</article-title>
.
<source>Proteins</source>
<volume>80</volume>
,
<fpage>374</fpage>
<lpage>381</lpage>
(
<year>2012</year>
) .
<pub-id pub-id-type="pmid">22095872</pub-id>
</mixed-citation>
</ref>
<ref id="b27">
<mixed-citation publication-type="journal">
<name>
<surname>Paull</surname>
<given-names>C. K.</given-names>
</name>
,
<name>
<surname>Normark</surname>
<given-names>W. R.</given-names>
</name>
,
<name>
<surname>Ussler</surname>
<given-names>W.</given-names>
</name>
,
<name>
<surname>Caress</surname>
<given-names>D. W.</given-names>
</name>
&
<name>
<surname>Keaten</surname>
<given-names>R.</given-names>
</name>
<article-title>Association among active seafloor deformation, mound formation, and gas hydrate growth and accumulation within the seafloor of the Santa Monica Basin, offshore California</article-title>
.
<source>Mar. Geol.</source>
<volume>250</volume>
,
<fpage>258</fpage>
<lpage>275</lpage>
(
<year>2008</year>
) .</mixed-citation>
</ref>
<ref id="b28">
<mixed-citation publication-type="journal">
<name>
<surname>Widdel</surname>
<given-names>F.</given-names>
</name>
&
<name>
<surname>Bak</surname>
<given-names>F.</given-names>
</name>
in:
<source>The Prokaryotes</source>
2nd edn eds Balows A., Trüper H. G., Dworkin M., Harder W., Schleifer K.-H. Springer (
<year>1992</year>
) .</mixed-citation>
</ref>
<ref id="b29">
<mixed-citation publication-type="journal">
<name>
<surname>Thurber</surname>
<given-names>R. V.</given-names>
</name>
,
<name>
<surname>Haynes</surname>
<given-names>M.</given-names>
</name>
,
<name>
<surname>Breitbart</surname>
<given-names>M.</given-names>
</name>
,
<name>
<surname>Wegley</surname>
<given-names>L.</given-names>
</name>
&
<name>
<surname>Rohwer</surname>
<given-names>F.</given-names>
</name>
<article-title>Laboratory procedures to generate viral metagenomes</article-title>
.
<source>Nat. Protoc.</source>
<volume>4</volume>
,
<fpage>470</fpage>
<lpage>483</lpage>
(
<year>2009</year>
) .
<pub-id pub-id-type="pmid">19300441</pub-id>
</mixed-citation>
</ref>
<ref id="b30">
<mixed-citation publication-type="journal">
<name>
<surname>Henn</surname>
<given-names>M. R.</given-names>
</name>
<etal></etal>
.
<article-title>Analysis of high-throughput sequencing and annotation strategies for phage genomes</article-title>
.
<source>PLoS ONE</source>
<volume>5</volume>
,
<fpage>e9083</fpage>
(
<year>2010</year>
) .
<pub-id pub-id-type="pmid">20140207</pub-id>
</mixed-citation>
</ref>
<ref id="b31">
<mixed-citation publication-type="journal">
<name>
<surname>Schmieder</surname>
<given-names>R.</given-names>
</name>
,
<name>
<surname>Lim</surname>
<given-names>Y.</given-names>
</name>
,
<name>
<surname>Rohwer</surname>
<given-names>F.</given-names>
</name>
&
<name>
<surname>Edwards</surname>
<given-names>R.</given-names>
</name>
<article-title>TagCleaner: identification and removal of tag sequences from genomic and metagenomic datasets</article-title>
.
<source>BMC Bioinformatics</source>
<volume>11</volume>
,
<fpage>341</fpage>
(
<year>2010</year>
) .
<pub-id pub-id-type="pmid">20573248</pub-id>
</mixed-citation>
</ref>
<ref id="b32">
<mixed-citation publication-type="journal">
<name>
<surname>Hurwitz</surname>
<given-names>B.</given-names>
</name>
,
<name>
<surname>Deng</surname>
<given-names>L.</given-names>
</name>
,
<name>
<surname>Poulos</surname>
<given-names>B.</given-names>
</name>
&
<name>
<surname>Sullivan</surname>
<given-names>M.</given-names>
</name>
<article-title>Evaluation of methods to concentrate and purify ocean virus communities through comparative, replicated metagenomics</article-title>
.
<source>Environ. Microbiol.</source>
<volume>15</volume>
,
<fpage>1428</fpage>
<lpage>1440</lpage>
(
<year>2013</year>
) .
<pub-id pub-id-type="pmid">22845467</pub-id>
</mixed-citation>
</ref>
<ref id="b33">
<mixed-citation publication-type="journal">
<name>
<surname>Niu</surname>
<given-names>B.</given-names>
</name>
,
<name>
<surname>Fu</surname>
<given-names>L.</given-names>
</name>
,
<name>
<surname>Sun</surname>
<given-names>S.</given-names>
</name>
&
<name>
<surname>Li</surname>
<given-names>W.</given-names>
</name>
<article-title>Artificial and natural duplicates in pyrosequencing reads of metagenomic data</article-title>
.
<source>BMC Bioinformatics</source>
<volume>11</volume>
,
<fpage>187</fpage>
(
<year>2010</year>
) .
<pub-id pub-id-type="pmid">20388221</pub-id>
</mixed-citation>
</ref>
<ref id="b34">
<mixed-citation publication-type="journal">
<name>
<surname>Sun</surname>
<given-names>S.</given-names>
</name>
<etal></etal>
.
<article-title>Community cyberinfrastructure for advanced microbial ecology research and analysis: the CAMERA resource</article-title>
.
<source>Nucleic Acids Res.</source>
<volume>39</volume>
,
<fpage>D546</fpage>
<lpage>D551</lpage>
(
<year>2011</year>
) .
<pub-id pub-id-type="pmid">21045053</pub-id>
</mixed-citation>
</ref>
<ref id="b35">
<mixed-citation publication-type="journal">
<name>
<surname>Delcher</surname>
<given-names>A. L.</given-names>
</name>
,
<name>
<surname>Bratke</surname>
<given-names>K. A.</given-names>
</name>
,
<name>
<surname>Powers</surname>
<given-names>E. C.</given-names>
</name>
&
<name>
<surname>Salzberg</surname>
<given-names>S. L.</given-names>
</name>
<article-title>Identifying bacterial genes and endosymbiont DNA with Glimmer</article-title>
.
<source>Bioinformatics</source>
<volume>23</volume>
,
<fpage>673</fpage>
<lpage>679</lpage>
(
<year>2007</year>
) .
<pub-id pub-id-type="pmid">17237039</pub-id>
</mixed-citation>
</ref>
<ref id="b36">
<mixed-citation publication-type="journal">
<name>
<surname>Altschul</surname>
<given-names>S. F.</given-names>
</name>
,
<name>
<surname>Gish</surname>
<given-names>W.</given-names>
</name>
,
<name>
<surname>Miller</surname>
<given-names>W.</given-names>
</name>
,
<name>
<surname>Myers</surname>
<given-names>E. W.</given-names>
</name>
&
<name>
<surname>Lipman</surname>
<given-names>D. J.</given-names>
</name>
<article-title>Basic local alignment search tool</article-title>
.
<source>J. Mol. Biol.</source>
<volume>215</volume>
,
<fpage>403</fpage>
<lpage>410</lpage>
(
<year>1990</year>
) .</mixed-citation>
</ref>
<ref id="b37">
<mixed-citation publication-type="journal">
<name>
<surname>Leplae</surname>
<given-names>R.</given-names>
</name>
,
<name>
<surname>Hebrant</surname>
<given-names>A.</given-names>
</name>
,
<name>
<surname>Wodak</surname>
<given-names>S. J.</given-names>
</name>
&
<name>
<surname>Toussaint</surname>
<given-names>A.</given-names>
</name>
<article-title>ACLAME: a CLAssification of Mobile genetic Elements</article-title>
.
<source>Nucleic Acids Res.</source>
<volume>32</volume>
,
<fpage>D45</fpage>
<lpage>D49</lpage>
(
<year>2004</year>
) .
<pub-id pub-id-type="pmid">14681355</pub-id>
</mixed-citation>
</ref>
<ref id="b38">
<mixed-citation publication-type="journal">
<name>
<surname>Hurwitz</surname>
<given-names>B. L.</given-names>
</name>
&
<name>
<surname>Sullivan</surname>
<given-names>M. B.</given-names>
</name>
<article-title>The Pacific Ocean Virome (POV): a marine viral metagenomic dataset and associated protein clusters for quantitative viral ecology</article-title>
.
<source>PLoS ONE</source>
<volume>8</volume>
,
<fpage>e57355</fpage>
(
<year>2013</year>
) .
<pub-id pub-id-type="pmid">23468974</pub-id>
</mixed-citation>
</ref>
<ref id="b39">
<mixed-citation publication-type="journal">
<name>
<surname>Kelley</surname>
<given-names>L. A.</given-names>
</name>
&
<name>
<surname>Sternberg</surname>
<given-names>M. J.</given-names>
</name>
<article-title>Protein structure prediction on the Web: a case study using the Phyre server</article-title>
.
<source>Nat. Protoc.</source>
<volume>4</volume>
,
<fpage>363</fpage>
<lpage>371</lpage>
(
<year>2009</year>
) .
<pub-id pub-id-type="pmid">19247286</pub-id>
</mixed-citation>
</ref>
<ref id="b40">
<mixed-citation publication-type="journal">
<name>
<surname>Rice</surname>
<given-names>P.</given-names>
</name>
,
<name>
<surname>Longden</surname>
<given-names>I.</given-names>
</name>
&
<name>
<surname>Bleasby</surname>
<given-names>A.</given-names>
</name>
<article-title>EMBOSS: the European molecular biology open software suite</article-title>
.
<source>Trends Genet.</source>
<volume>16</volume>
,
<fpage>276</fpage>
<lpage>277</lpage>
(
<year>2000</year>
) .
<pub-id pub-id-type="pmid">10827456</pub-id>
</mixed-citation>
</ref>
<ref id="b41">
<mixed-citation publication-type="journal">
<name>
<surname>Larkin</surname>
<given-names>M. A.</given-names>
</name>
<etal></etal>
.
<article-title>Clustal W and Clustal X version 2.0</article-title>
.
<source>Bioinformatics</source>
<volume>23</volume>
,
<fpage>2947</fpage>
<lpage>2948</lpage>
(
<year>2007</year>
) .
<pub-id pub-id-type="pmid">17846036</pub-id>
</mixed-citation>
</ref>
<ref id="b42">
<mixed-citation publication-type="journal">
<name>
<surname>Kumar</surname>
<given-names>S.</given-names>
</name>
,
<name>
<surname>Nei</surname>
<given-names>M.</given-names>
</name>
,
<name>
<surname>Dudley</surname>
<given-names>J.</given-names>
</name>
&
<name>
<surname>Tamura</surname>
<given-names>K.</given-names>
</name>
<article-title>MEGA: a biologist-centric software for evolutionary analysis of DNA and protein sequences</article-title>
.
<source>Brief Bioinform.</source>
<volume>9</volume>
,
<fpage>299</fpage>
<lpage>306</lpage>
(
<year>2008</year>
) .
<pub-id pub-id-type="pmid">18417537</pub-id>
</mixed-citation>
</ref>
<ref id="b43">
<mixed-citation publication-type="journal">
<name>
<surname>Guindon</surname>
<given-names>S.</given-names>
</name>
<etal></etal>
.
<article-title>New algorithms and methods to estimate maximum-likelihood phylogenies: assessing the performance of PhyML 3.0</article-title>
.
<source>Syst. Biol.</source>
<volume>59</volume>
,
<fpage>307</fpage>
<lpage>321</lpage>
(
<year>2010</year>
) .
<pub-id pub-id-type="pmid">20525638</pub-id>
</mixed-citation>
</ref>
<ref id="b44">
<mixed-citation publication-type="journal">
<name>
<surname>Pride</surname>
<given-names>D. T.</given-names>
</name>
,
<name>
<surname>Meinersmann</surname>
<given-names>R. J.</given-names>
</name>
,
<name>
<surname>Wassenaar</surname>
<given-names>T. M.</given-names>
</name>
&
<name>
<surname>Blaser</surname>
<given-names>M. J.</given-names>
</name>
<article-title>Evolutionary implications of microbial genome tetranucleotide frequency biases</article-title>
.
<source>Genome Res.</source>
<volume>13</volume>
,
<fpage>145</fpage>
<lpage>158</lpage>
(
<year>2003</year>
) .
<pub-id pub-id-type="pmid">12566393</pub-id>
</mixed-citation>
</ref>
<ref id="b45">
<mixed-citation publication-type="journal">
<name>
<surname>Teeling</surname>
<given-names>H.</given-names>
</name>
,
<name>
<surname>Meyerdierks</surname>
<given-names>A.</given-names>
</name>
,
<name>
<surname>Bauer</surname>
<given-names>M.</given-names>
</name>
,
<name>
<surname>Amann</surname>
<given-names>R.</given-names>
</name>
&
<name>
<surname>Glöckner</surname>
<given-names>F. O.</given-names>
</name>
<article-title>Application of tetranucleotide frequencies for the assignment of genomic fragments</article-title>
.
<source>Environ. Microbiol.</source>
<volume>6</volume>
,
<fpage>938</fpage>
<lpage>947</lpage>
(
<year>2004</year>
) .
<pub-id pub-id-type="pmid">15305919</pub-id>
</mixed-citation>
</ref>
<ref id="b46">
<mixed-citation publication-type="journal">
<name>
<surname>Dick</surname>
<given-names>G. J.</given-names>
</name>
<etal></etal>
.
<article-title>Community-wide analysis of microbial genome sequence signatures</article-title>
.
<source>Genome Biol.</source>
<volume>10</volume>
,
<fpage>R85</fpage>
(
<year>2009</year>
) .
<pub-id pub-id-type="pmid">19698104</pub-id>
</mixed-citation>
</ref>
<ref id="b47">
<mixed-citation publication-type="other">
<name>
<surname>Oksanen</surname>
<given-names>J.</given-names>
</name>
<etal></etal>
. Vegan: Community Ecology Package. R package version 1.13-1
<ext-link ext-link-type="uri" xlink:href="http://vegan.r-forge.r-project.org/">http://vegan.r-forge.r-project.org/</ext-link>
(
<year>2008</year>
) .</mixed-citation>
</ref>
</ref-list>
<fn-group>
<fn fn-type="conflict">
<p>J.F.M. is a cofounder, equity holder and chair of the scientific advisory board of AvidBiotics Inc., a biotherapeutics company in San Francisco. The remaining authors declare no competing financial interests.</p>
</fn>
</fn-group>
</back>
<floats-group>
<fig id="f1">
<label>Figure 1</label>
<caption>
<title>Retroelement-containing ANMV-1 genome obtained from methane seep sediment.</title>
<p>(
<bold>a</bold>
) Annotated coding sequences (CDS) designated by arrows that are coloured according to predicted function. Genes with BLAST similarity to ANME protein sequences are highlighted in red below each corresponding ANMV-1 locus (
<xref ref-type="supplementary-material" rid="S1">Supplementary Table 1</xref>
). Symbols above selected annotations indicate putative gene names: terL, terminase large subunit; tbp, TATA-box binding protein; nrdD, anaerobic ribonucleoside triphosphate reductase; AdtA, DGR TP; RT, reverse transcriptase. An open box highlights the DGR cassette with flanking putative tail fibres (tail fib.), shown below the genome. (
<bold>b</bold>
) Putative
<italic>cis</italic>
- and
<italic>trans</italic>
-acting features of the ANMV-1 DGR. RT, accessory variability determinant (Avd) and AdtA ORFs are shown as blue, grey and green arrows, respectively. Purple boxes indicate template and variable repeat regions (TR and VR). The IMH and cognate IMH* sites are highlighted in yellow. The expanded DGR view depicts the putative retrohoming target site. Estimated number of nucleotide sequence variants is given above VR (TR* cDNAs), based on theoretical mutagenesis of adenines in TR intermediate RNA.</p>
</caption>
<graphic xlink:href="ncomms7585-f1"></graphic>
</fig>
<fig id="f2">
<label>Figure 2</label>
<caption>
<title>Grouping of DGRs from
<italic>Nanoarchaeota</italic>
.</title>
<p>(
<bold>a</bold>
) Positions of four DGR cassettes in each OTU, coloured by homology-based groups (note ungrouped OTU1 DGR in grey). Contigs are shown with DGRs on the forward strand (rev., reverse complement). (
<bold>b</bold>
) DGR groups, ordered by RT and TP homologies. A PhyML tree (left) was constructed with 100 bootstrap replicates (support indicated on branches) from concatenated alignments of TP and RT amino-acid sequences for each complete DGR cassette. Group 4 includes an incomplete DGR for OTU1 contig 26 (missing RT ORF). A schematic for nanoarchaeal DGRs shows the direction of information transfer during targeted mutagenesis. TP and RT genes are shown as green and blue arrows, respectively, while purple boxes indicate variable and template regions (VR and TR). Bar graphs show pairwise similarity between aligned OTU1 and OTU2 sequences for major DGR features, TP, VR, TR and RT. NA (not applicable) indicates that a feature is not found in the DGR.</p>
</caption>
<graphic xlink:href="ncomms7585-f2"></graphic>
</fig>
<fig id="f3">
<label>Figure 3</label>
<caption>
<title>Conserved and putative regulatory features of Nanoarchaeota DGRs.</title>
<p>IMH sites (IMH and IMH*) are shown as yellow boxes, and the trinucleotide-loop hairpin is given in an expanded view at right. Dark grey arrows indicate ORFs between RT and TP whose amino-acid sequences have comparable isoelectric point and molecular weight to accessory variability determinant (Avd; pI=9±1;
<italic>M</italic>
<sub>w</sub>
=10±5).</p>
</caption>
<graphic xlink:href="ncomms7585-f3"></graphic>
</fig>
<fig id="f4">
<label>Figure 4</label>
<caption>
<title>RT phylogeny for archaeal DGRs.</title>
<p>(
<bold>a</bold>
) Maximum-likelihood phylogenetic tree of RT representatives aligned with ANMV-1 and DUSEL4 Nanoarchaeota sequences. Green branches correspond to bacterial and bacteria-derived RTs (from chromosomes, plasmids, mitochondria, chloroplasts and bacteriophage), red branches indicate archaeal and archaeal virus RTs, and black branches represent RTs from eukaryotes and their viruses. Retroelement clades and key representatives are labelled as follows: DGRs, diversity-generating retroelements; DIRS, Dictyostelium retrotransposons; GemV, Geminiviridae; G2L, group-II intron-like (G2L are numbered according to Simon and Zimmerly (
<italic>24</italic>
)); Hpdn, hepadnaviruses; LTR, long terminal repeat retroelements; NPV, nucleopolyhedroviruses; non-LTR, non-long terminal repeat retroelements; RtV, Retroviridae; unk, unknown or unclassified. The scale shows substitutions per site. For clarity, bootstrap values are not shown for the full RT tree. (
<bold>b</bold>
) Expanded subtree view of DGR RT representatives. A red box highlights the archaeal DGR clade. NCBI accession codes are given for representatives in the subtree, but previously described bacterial DGRs are explicitly named. The representative for Bordetella phage BPP is labelled ‘BPP’. Coloured circles at internal nodes indicate branch support.</p>
</caption>
<graphic xlink:href="ncomms7585-f4"></graphic>
</fig>
<fig id="f5">
<label>Figure 5</label>
<caption>
<title>Tetranucleotide distributions of DUSEL4
<italic>Nanoarchaeota</italic>
.</title>
<p>(
<bold>a</bold>
,
<bold>b</bold>
) Non-metric multidimensional scaling plots of tetranucleotide distributions of (
<bold>a</bold>
) concatenated DUSEL4 DGRs (red) and (
<bold>b</bold>
) separately concatenated DUSEL4 DGR RT (blue) and TP genes (green), compared with the rest of the DUSEL4 Nanoarchaeota OTU1 and OTU2 genomes (greyscale circles). Each point on the ordination plots represents one 5-kb fragment. Dashed ellipses indicate the 95% confidence region.</p>
</caption>
<graphic xlink:href="ncomms7585-f5"></graphic>
</fig>
</floats-group>
</pmc>
</record>

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Ticri/CIDE/explor/CyberinfraV1/Data/Pmc/Corpus
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000142  | SxmlIndent | more

Or

HfdSelect -h $EXPLOR_AREA/Data/Pmc/Corpus/biblio.hfd -nk 000142  | SxmlIndent | more

To link to this page within the Wicri network

{{Explor lien
   |wiki=    Ticri/CIDE
   |area=    CyberinfraV1
   |flux=    Pmc
   |étape=   Corpus
   |type=    RBID
   |clé=     
   |texte=   
}}

Wicri

This area was generated with Dilib version V0.6.25.
Data generation: Thu Oct 27 09:30:58 2016. Site generation: Sun Mar 10 23:08:40 2024