Cyberinfrastructure Exploration Server


Internal identifier: 0001109 (Pmc/Corpus); previous: 0001108; next: 0001110



The document in XML format

<record>
<TEI>
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en">“Every Gene Is Everywhere but the Environment Selects”: Global Geolocalization of Gene Sharing in Environmental Samples through Network Analysis</title>
<author>
<name sortKey="Fondi, Marco" sort="Fondi, Marco" uniqKey="Fondi M" first="Marco" last="Fondi">Marco Fondi</name>
<affiliation>
<nlm:aff id="evw077-aff1">Laboratory of Microbial and Molecular Evolution, Department of Biology, University of Florence, Italy</nlm:aff>
</affiliation>
<affiliation>
<nlm:aff id="evw077-aff2">Computational Biology Group, University of Florence, Italy</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Karkman, Antti" sort="Karkman, Antti" uniqKey="Karkman A" first="Antti" last="Karkman">Antti Karkman</name>
<affiliation>
<nlm:aff id="evw077-aff3">Department of Food and Environmental Sciences, University of Helsinki, Finland</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Tamminen, Manu V" sort="Tamminen, Manu V" uniqKey="Tamminen M" first="Manu V." last="Tamminen">Manu V. Tamminen</name>
<affiliation>
<nlm:aff id="evw077-aff4">Department of Environmental Systems Science, ETH Zürich, Switzerland</nlm:aff>
</affiliation>
<affiliation>
<nlm:aff id="evw077-aff5">Department of Aquatic Ecology, Eawag, Switzerland</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Bosi, Emanuele" sort="Bosi, Emanuele" uniqKey="Bosi E" first="Emanuele" last="Bosi">Emanuele Bosi</name>
<affiliation>
<nlm:aff id="evw077-aff1">Laboratory of Microbial and Molecular Evolution, Department of Biology, University of Florence, Italy</nlm:aff>
</affiliation>
<affiliation>
<nlm:aff id="evw077-aff2">Computational Biology Group, University of Florence, Italy</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Virta, Marko" sort="Virta, Marko" uniqKey="Virta M" first="Marko" last="Virta">Marko Virta</name>
<affiliation>
<nlm:aff id="evw077-aff3">Department of Food and Environmental Sciences, University of Helsinki, Finland</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Fani, Renato" sort="Fani, Renato" uniqKey="Fani R" first="Renato" last="Fani">Renato Fani</name>
<affiliation>
<nlm:aff id="evw077-aff1">Laboratory of Microbial and Molecular Evolution, Department of Biology, University of Florence, Italy</nlm:aff>
</affiliation>
<affiliation>
<nlm:aff id="evw077-aff2">Computational Biology Group, University of Florence, Italy</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Alm, Eric" sort="Alm, Eric" uniqKey="Alm E" first="Eric" last="Alm">Eric Alm</name>
<affiliation>
<nlm:aff id="evw077-aff6">Department of Civil and Environmental Engineering, Massachusetts Institute of Technology</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Mcinerney, James O" sort="Mcinerney, James O" uniqKey="Mcinerney J" first="James O." last="Mcinerney">James O. McInerney</name>
<affiliation>
<nlm:aff id="evw077-aff7">Department of Biology, National University of Ireland Maynooth, County Kildare, Ireland</nlm:aff>
</affiliation>
<affiliation>
<nlm:aff id="evw077-aff8">Computational Evolutionary Biology, Faculty of Life Sciences, The University of Manchester, United Kingdom</nlm:aff>
</affiliation>
</author>
</titleStmt>
<publicationStmt>
<idno type="wicri:source">PMC</idno>
<idno type="pmid">27190206</idno>
<idno type="pmc">4898794</idno>
<idno type="url">http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4898794</idno>
<idno type="RBID">PMC:4898794</idno>
<idno type="doi">10.1093/gbe/evw077</idno>
<date when="2016">2016</date>
<idno type="wicri:Area/Pmc/Corpus">000110</idno>
</publicationStmt>
<sourceDesc>
<biblStruct>
<analytic>
<title xml:lang="en" level="a" type="main">“Every Gene Is Everywhere but the Environment Selects”: Global Geolocalization of Gene Sharing in Environmental Samples through Network Analysis</title>
<author>
<name sortKey="Fondi, Marco" sort="Fondi, Marco" uniqKey="Fondi M" first="Marco" last="Fondi">Marco Fondi</name>
<affiliation>
<nlm:aff id="evw077-aff1">Laboratory of Microbial and Molecular Evolution, Department of Biology, University of Florence, Italy</nlm:aff>
</affiliation>
<affiliation>
<nlm:aff id="evw077-aff2">Computational Biology Group, University of Florence, Italy</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Karkman, Antti" sort="Karkman, Antti" uniqKey="Karkman A" first="Antti" last="Karkman">Antti Karkman</name>
<affiliation>
<nlm:aff id="evw077-aff3">Department of Food and Environmental Sciences, University of Helsinki, Finland</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Tamminen, Manu V" sort="Tamminen, Manu V" uniqKey="Tamminen M" first="Manu V." last="Tamminen">Manu V. Tamminen</name>
<affiliation>
<nlm:aff id="evw077-aff4">Department of Environmental Systems Science, ETH Zürich, Switzerland</nlm:aff>
</affiliation>
<affiliation>
<nlm:aff id="evw077-aff5">Department of Aquatic Ecology, Eawag, Switzerland</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Bosi, Emanuele" sort="Bosi, Emanuele" uniqKey="Bosi E" first="Emanuele" last="Bosi">Emanuele Bosi</name>
<affiliation>
<nlm:aff id="evw077-aff1">Laboratory of Microbial and Molecular Evolution, Department of Biology, University of Florence, Italy</nlm:aff>
</affiliation>
<affiliation>
<nlm:aff id="evw077-aff2">Computational Biology Group, University of Florence, Italy</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Virta, Marko" sort="Virta, Marko" uniqKey="Virta M" first="Marko" last="Virta">Marko Virta</name>
<affiliation>
<nlm:aff id="evw077-aff3">Department of Food and Environmental Sciences, University of Helsinki, Finland</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Fani, Renato" sort="Fani, Renato" uniqKey="Fani R" first="Renato" last="Fani">Renato Fani</name>
<affiliation>
<nlm:aff id="evw077-aff1">Laboratory of Microbial and Molecular Evolution, Department of Biology, University of Florence, Italy</nlm:aff>
</affiliation>
<affiliation>
<nlm:aff id="evw077-aff2">Computational Biology Group, University of Florence, Italy</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Alm, Eric" sort="Alm, Eric" uniqKey="Alm E" first="Eric" last="Alm">Eric Alm</name>
<affiliation>
<nlm:aff id="evw077-aff6">Department of Civil and Environmental Engineering, Massachusetts Institute of Technology</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Mcinerney, James O" sort="Mcinerney, James O" uniqKey="Mcinerney J" first="James O." last="Mcinerney">James O. McInerney</name>
<affiliation>
<nlm:aff id="evw077-aff7">Department of Biology, National University of Ireland Maynooth, County Kildare, Ireland</nlm:aff>
</affiliation>
<affiliation>
<nlm:aff id="evw077-aff8">Computational Evolutionary Biology, Faculty of Life Sciences, The University of Manchester, United Kingdom</nlm:aff>
</affiliation>
</author>
</analytic>
<series>
<title level="j">Genome Biology and Evolution</title>
<idno type="eISSN">1759-6653</idno>
<imprint>
<date when="2016">2016</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
</fileDesc>
<profileDesc>
<textClass></textClass>
</profileDesc>
</teiHeader>
<front>
<div type="abstract" xml:lang="en">
<p>The spatial distribution of microbes on our planet is famously formulated in the Baas Becking hypothesis as “everything is everywhere but the environment selects.” While this hypothesis does not strictly rule out patterns caused by geographical effects on ecology and historical founder effects, it does propose that the remarkable dispersal potential of microbes leads to distributions generally shaped by environmental factors rather than geographical distance. By constructing sequence similarity networks from uncultured environmental samples, we show that microbial gene pool distributions are influenced far less by geography than by ecology, thus extending the Baas Becking hypothesis from whole organisms to microbial genes. We find that gene pools are shaped by their broad ecological niche (such as sea water, fresh water, host, and airborne). We also find that freshwater habitats act as a gene exchange bridge between otherwise disconnected habitats. Finally, certain antibiotic resistance genes deviate from the general trend of habitat specificity by exhibiting a high degree of cross-habitat mobility. The strong cross-habitat mobility of antibiotic resistance genes is a cause for concern and provides a paradigmatic example of the rate at which genes colonize new habitats when new selective forces emerge.</p>
</div>
</front>
<back>
<div1 type="bibliography">
<listBibl>
<biblStruct>
<analytic>
<author>
<name sortKey="Altschul, Sf" uniqKey="Altschul S">SF Altschul</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Alvarez Ponce, D" uniqKey="Alvarez Ponce D">D Alvarez-Ponce</name>
</author>
<author>
<name sortKey="Lopez, P" uniqKey="Lopez P">P Lopez</name>
</author>
<author>
<name sortKey="Bapteste, E" uniqKey="Bapteste E">E Bapteste</name>
</author>
<author>
<name sortKey="Mcinerney, Jo" uniqKey="Mcinerney J">JO. McInerney</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Aravind, L" uniqKey="Aravind L">L Aravind</name>
</author>
<author>
<name sortKey="Tatusov, Rl" uniqKey="Tatusov R">RL Tatusov</name>
</author>
<author>
<name sortKey="Wolf, Yi" uniqKey="Wolf Y">YI Wolf</name>
</author>
<author>
<name sortKey="Walker, Dr" uniqKey="Walker D">DR Walker</name>
</author>
<author>
<name sortKey="Koonin, Ev" uniqKey="Koonin E">EV. Koonin</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Baas Becking, Lgm" uniqKey="Baas Becking L">LGM. Baas Becking</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Baquero, F" uniqKey="Baquero F">F Baquero</name>
</author>
<author>
<name sortKey="Martinez, Jl" uniqKey="Martinez J">JL Martinez</name>
</author>
<author>
<name sortKey="Canton, R" uniqKey="Canton R">R. Canton</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Bastian, M" uniqKey="Bastian M">M Bastian</name>
</author>
<author>
<name sortKey="Heymann, S" uniqKey="Heymann S">S Heymann</name>
</author>
<author>
<name sortKey="Jacomy, M" uniqKey="Jacomy M">M. Jacomy</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Bosi, E" uniqKey="Bosi E">E Bosi</name>
</author>
<author>
<name sortKey="Fani, R" uniqKey="Fani R">R Fani</name>
</author>
<author>
<name sortKey="Fondi, M" uniqKey="Fondi M">M. Fondi</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Boucher, Y" uniqKey="Boucher Y">Y Boucher</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Caro Quintero, A" uniqKey="Caro Quintero A">A Caro-Quintero</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Clewell, Db" uniqKey="Clewell D">DB Clewell</name>
</author>
<author>
<name sortKey="Flannagan, Se" uniqKey="Flannagan S">SE Flannagan</name>
</author>
<author>
<name sortKey="Jaworski, Dd" uniqKey="Jaworski D">DD. Jaworski</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Dagan, T" uniqKey="Dagan T">T. Dagan</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="De Luca, G" uniqKey="De Luca G">G De Luca</name>
</author>
<author>
<name sortKey="Zanetti, F" uniqKey="Zanetti F">F Zanetti</name>
</author>
<author>
<name sortKey="Fateh Moghadm, P" uniqKey="Fateh Moghadm P">P Fateh-Moghadm</name>
</author>
<author>
<name sortKey="Stampi, S" uniqKey="Stampi S">S. Stampi</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Dishaw, Lj" uniqKey="Dishaw L">LJ Dishaw</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Dunny, Gm" uniqKey="Dunny G">GM Dunny</name>
</author>
<author>
<name sortKey="Leonard, Ba" uniqKey="Leonard B">BA Leonard</name>
</author>
<author>
<name sortKey="Hedberg, Pj" uniqKey="Hedberg P">PJ. Hedberg</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Edgar, Rc" uniqKey="Edgar R">RC. Edgar</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Edgar, Rc" uniqKey="Edgar R">RC. Edgar</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Finn, Rd" uniqKey="Finn R">RD Finn</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Finn, Rd" uniqKey="Finn R">RD Finn</name>
</author>
<author>
<name sortKey="Clements, J" uniqKey="Clements J">J Clements</name>
</author>
<author>
<name sortKey="Eddy, Sr" uniqKey="Eddy S">SR. Eddy</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Fondi, M" uniqKey="Fondi M">M Fondi</name>
</author>
<author>
<name sortKey="Fani, R" uniqKey="Fani R">R. Fani</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Forster, D" uniqKey="Forster D">D Forster</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Fuhrman, Ja" uniqKey="Fuhrman J">JA. Fuhrman</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Halary, S" uniqKey="Halary S">S Halary</name>
</author>
<author>
<name sortKey="Leigh, Jw" uniqKey="Leigh J">JW Leigh</name>
</author>
<author>
<name sortKey="Cheaib, B" uniqKey="Cheaib B">B Cheaib</name>
</author>
<author>
<name sortKey="Lopez, P" uniqKey="Lopez P">P Lopez</name>
</author>
<author>
<name sortKey="Bapteste, E" uniqKey="Bapteste E">E. Bapteste</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Hanson, Ca" uniqKey="Hanson C">CA Hanson</name>
</author>
<author>
<name sortKey="Fuhrman, Ja" uniqKey="Fuhrman J">JA Fuhrman</name>
</author>
<author>
<name sortKey="Horner Devine, Mc" uniqKey="Horner Devine M">MC Horner-Devine</name>
</author>
<author>
<name sortKey="Martiny, Jb" uniqKey="Martiny J">JB. Martiny</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Hooper, Sd" uniqKey="Hooper S">SD Hooper</name>
</author>
<author>
<name sortKey="Mavromatis, K" uniqKey="Mavromatis K">K Mavromatis</name>
</author>
<author>
<name sortKey="Kyrpides, Nc" uniqKey="Kyrpides N">NC. Kyrpides</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Hooper, Sd" uniqKey="Hooper S">SD Hooper</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Jones, Pw" uniqKey="Jones P">PW. Jones</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Jukes, Th" uniqKey="Jukes T">TH Jukes</name>
</author>
<author>
<name sortKey="Cantor, Cr" uniqKey="Cantor C">CR. Cantor</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Kohl, M" uniqKey="Kohl M">M Kohl</name>
</author>
<author>
<name sortKey="Wiese, S" uniqKey="Wiese S">S Wiese</name>
</author>
<author>
<name sortKey="Warscheid, B" uniqKey="Warscheid B">B. Warscheid</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Li, J" uniqKey="Li J">J Li</name>
</author>
<author>
<name sortKey="Shao, B" uniqKey="Shao B">B Shao</name>
</author>
<author>
<name sortKey="Shen, J" uniqKey="Shen J">J Shen</name>
</author>
<author>
<name sortKey="Wang, S" uniqKey="Wang S">S Wang</name>
</author>
<author>
<name sortKey="Wu, Y" uniqKey="Wu Y">Y. Wu</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Lima Mendez, G" uniqKey="Lima Mendez G">G Lima-Mendez</name>
</author>
<author>
<name sortKey="Van Helden, J" uniqKey="Van Helden J">J Van Helden</name>
</author>
<author>
<name sortKey="Toussaint, A" uniqKey="Toussaint A">A Toussaint</name>
</author>
<author>
<name sortKey="Leplae, R" uniqKey="Leplae R">R. Leplae</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Liu, B" uniqKey="Liu B">B Liu</name>
</author>
<author>
<name sortKey="Pop, M" uniqKey="Pop M">M. Pop</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Machado, M" uniqKey="Machado M">M Machado</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Martiny, Jb" uniqKey="Martiny J">JB Martiny</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Mcinerney, Jo" uniqKey="Mcinerney J">JO McInerney</name>
</author>
<author>
<name sortKey="Pisani, D" uniqKey="Pisani D">D Pisani</name>
</author>
<author>
<name sortKey="Bapteste, E" uniqKey="Bapteste E">E Bapteste</name>
</author>
<author>
<name sortKey="O Connell, Mj" uniqKey="O Connell M">MJ. O'Connell</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Nalbantoglu, Ou" uniqKey="Nalbantoglu O">OU Nalbantoglu</name>
</author>
<author>
<name sortKey="Way, Sf" uniqKey="Way S">SF Way</name>
</author>
<author>
<name sortKey="Hinrichs, Sh" uniqKey="Hinrichs S">SH Hinrichs</name>
</author>
<author>
<name sortKey="Sayood, K" uniqKey="Sayood K">K. Sayood</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Newman, Me" uniqKey="Newman M">ME. Newman</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Popa, O" uniqKey="Popa O">O Popa</name>
</author>
<author>
<name sortKey="Dagan, T" uniqKey="Dagan T">T. Dagan</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Raes, J" uniqKey="Raes J">J Raes</name>
</author>
<author>
<name sortKey="Letunic, I" uniqKey="Letunic I">I Letunic</name>
</author>
<author>
<name sortKey="Yamada, T" uniqKey="Yamada T">T Yamada</name>
</author>
<author>
<name sortKey="Jensen, Lj" uniqKey="Jensen L">LJ Jensen</name>
</author>
<author>
<name sortKey="Bork, P" uniqKey="Bork P">P. Bork</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Reno, Ml" uniqKey="Reno M">ML Reno</name>
</author>
<author>
<name sortKey="Held, Nl" uniqKey="Held N">NL Held</name>
</author>
<author>
<name sortKey="Fields, Cj" uniqKey="Fields C">CJ Fields</name>
</author>
<author>
<name sortKey="Burke, Pv" uniqKey="Burke P">PV Burke</name>
</author>
<author>
<name sortKey="Whitaker, Rj" uniqKey="Whitaker R">RJ. Whitaker</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Rice, P" uniqKey="Rice P">P Rice</name>
</author>
<author>
<name sortKey="Longden, I" uniqKey="Longden I">I Longden</name>
</author>
<author>
<name sortKey="Bleasby, A" uniqKey="Bleasby A">A. Bleasby</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Schluter, A" uniqKey="Schluter A">A Schluter</name>
</author>
<author>
<name sortKey="Szczepanowski, R" uniqKey="Szczepanowski R">R Szczepanowski</name>
</author>
<author>
<name sortKey="Puhler, A" uniqKey="Puhler A">A Puhler</name>
</author>
<author>
<name sortKey="Top, Em" uniqKey="Top E">EM. Top</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Schmieder, R" uniqKey="Schmieder R">R Schmieder</name>
</author>
<author>
<name sortKey="Edwards, R" uniqKey="Edwards R">R. Edwards</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Schmieder, R" uniqKey="Schmieder R">R Schmieder</name>
</author>
<author>
<name sortKey="Lim, Yw" uniqKey="Lim Y">YW Lim</name>
</author>
<author>
<name sortKey="Edwards, R" uniqKey="Edwards R">R. Edwards</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Sermonti, G" uniqKey="Sermonti G">G Sermonti</name>
</author>
<author>
<name sortKey="Petris, A" uniqKey="Petris A">A Petris</name>
</author>
<author>
<name sortKey="Micheli, M" uniqKey="Micheli M">M Micheli</name>
</author>
<author>
<name sortKey="Lanfaloni, L" uniqKey="Lanfaloni L">L. Lanfaloni</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Shanahan, Ef" uniqKey="Shanahan E">EF Shanahan</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Smillie, Cs" uniqKey="Smillie C">CS Smillie</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Staley, Jt" uniqKey="Staley J">JT Staley</name>
</author>
<author>
<name sortKey="Konopka, A" uniqKey="Konopka A">A. Konopka</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Sul, Wj" uniqKey="Sul W">WJ Sul</name>
</author>
<author>
<name sortKey="Oliver, Ta" uniqKey="Oliver T">TA Oliver</name>
</author>
<author>
<name sortKey="Ducklow, Hw" uniqKey="Ducklow H">HW Ducklow</name>
</author>
<author>
<name sortKey="Amaral Zettler, La" uniqKey="Amaral Zettler L">LA Amaral-Zettler</name>
</author>
<author>
<name sortKey="Sogin, Ml" uniqKey="Sogin M">ML. Sogin</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Sun, S" uniqKey="Sun S">S Sun</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Sunagawa, S" uniqKey="Sunagawa S">S Sunagawa</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Takamatsu, D" uniqKey="Takamatsu D">D Takamatsu</name>
</author>
<author>
<name sortKey="Osaki, M" uniqKey="Osaki M">M Osaki</name>
</author>
<author>
<name sortKey="Sekizaki, T" uniqKey="Sekizaki T">T. Sekizaki</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Talavera, G" uniqKey="Talavera G">G Talavera</name>
</author>
<author>
<name sortKey="Castresana, J" uniqKey="Castresana J">J. Castresana</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Tamminen, M" uniqKey="Tamminen M">M Tamminen</name>
</author>
<author>
<name sortKey="Virta, M" uniqKey="Virta M">M Virta</name>
</author>
<author>
<name sortKey="Fani, R" uniqKey="Fani R">R Fani</name>
</author>
<author>
<name sortKey="Fondi, M" uniqKey="Fondi M">M. Fondi</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Teeling, H" uniqKey="Teeling H">H Teeling</name>
</author>
<author>
<name sortKey="Meyerdierks, A" uniqKey="Meyerdierks A">A Meyerdierks</name>
</author>
<author>
<name sortKey="Bauer, M" uniqKey="Bauer M">M Bauer</name>
</author>
<author>
<name sortKey="Amann, R" uniqKey="Amann R">R Amann</name>
</author>
<author>
<name sortKey="Glockner, Fo" uniqKey="Glockner F">FO. Glockner</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Tringe, Sg" uniqKey="Tringe S">SG Tringe</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Van Dongen, S" uniqKey="Van Dongen S">S van Dongen</name>
</author>
<author>
<name sortKey="Abreu Goodger, C" uniqKey="Abreu Goodger C">C. Abreu-Goodger</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Zhang, T" uniqKey="Zhang T">T Zhang</name>
</author>
<author>
<name sortKey="Zhang, Xx" uniqKey="Zhang X">XX Zhang</name>
</author>
<author>
<name sortKey="Ye, L" uniqKey="Ye L">L. Ye</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Zhou, F" uniqKey="Zhou F">F Zhou</name>
</author>
<author>
<name sortKey="Xu, Y" uniqKey="Xu Y">Y. Xu</name>
</author>
</analytic>
</biblStruct>
</listBibl>
</div1>
</back>
</TEI>
<pmc article-type="research-article">
<pmc-dir>properties open_access</pmc-dir>
<front>
<journal-meta>
<journal-id journal-id-type="nlm-ta">Genome Biol Evol</journal-id>
<journal-id journal-id-type="iso-abbrev">Genome Biol Evol</journal-id>
<journal-id journal-id-type="publisher-id">gbe</journal-id>
<journal-id journal-id-type="hwp">gbe</journal-id>
<journal-title-group>
<journal-title>Genome Biology and Evolution</journal-title>
</journal-title-group>
<issn pub-type="epub">1759-6653</issn>
<publisher>
<publisher-name>Oxford University Press</publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="pmid">27190206</article-id>
<article-id pub-id-type="pmc">4898794</article-id>
<article-id pub-id-type="doi">10.1093/gbe/evw077</article-id>
<article-id pub-id-type="publisher-id">evw077</article-id>
<article-categories>
<subj-group subj-group-type="heading">
<subject>Research Article</subject>
</subj-group>
</article-categories>
<title-group>
<article-title>“Every Gene Is Everywhere but the Environment Selects”: Global Geolocalization of Gene Sharing in Environmental Samples through Network Analysis</article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author">
<name>
<surname>Fondi</surname>
<given-names>Marco</given-names>
</name>
<xref ref-type="aff" rid="evw077-aff1">
<sup>1</sup>
</xref>
<xref ref-type="aff" rid="evw077-aff2">
<sup>2</sup>
</xref>
<xref ref-type="author-notes" rid="evw077-FM1">
<sup></sup>
</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Karkman</surname>
<given-names>Antti</given-names>
</name>
<xref ref-type="aff" rid="evw077-aff3">
<sup>3</sup>
</xref>
<xref ref-type="author-notes" rid="evw077-FM1">
<sup></sup>
</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Tamminen</surname>
<given-names>Manu V.</given-names>
</name>
<xref ref-type="aff" rid="evw077-aff4">
<sup>4</sup>
</xref>
<xref ref-type="aff" rid="evw077-aff5">
<sup>5</sup>
</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Bosi</surname>
<given-names>Emanuele</given-names>
</name>
<xref ref-type="aff" rid="evw077-aff1">
<sup>1</sup>
</xref>
<xref ref-type="aff" rid="evw077-aff2">
<sup>2</sup>
</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Virta</surname>
<given-names>Marko</given-names>
</name>
<xref ref-type="aff" rid="evw077-aff3">
<sup>3</sup>
</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Fani</surname>
<given-names>Renato</given-names>
</name>
<xref ref-type="aff" rid="evw077-aff1">
<sup>1</sup>
</xref>
<xref ref-type="aff" rid="evw077-aff2">
<sup>2</sup>
</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Alm</surname>
<given-names>Eric</given-names>
</name>
<xref ref-type="aff" rid="evw077-aff6">
<sup>6</sup>
</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>McInerney</surname>
<given-names>James O.</given-names>
</name>
<xref ref-type="aff" rid="evw077-aff7">
<sup>7</sup>
</xref>
<xref ref-type="aff" rid="evw077-aff8">
<sup>8</sup>
</xref>
<xref ref-type="corresp" rid="evw077-cor1">*</xref>
</contrib>
<aff id="evw077-aff1">
<sup>1</sup>
Laboratory of Microbial and Molecular Evolution, Department of Biology, University of Florence, Italy</aff>
<aff id="evw077-aff2">
<sup>2</sup>
Computational Biology Group, University of Florence, Italy</aff>
<aff id="evw077-aff3">
<sup>3</sup>
Department of Food and Environmental Sciences, University of Helsinki, Finland</aff>
<aff id="evw077-aff4">
<sup>4</sup>
Department of Environmental Systems Science, ETH Zürich, Switzerland</aff>
<aff id="evw077-aff5">
<sup>5</sup>
Department of Aquatic Ecology, Eawag, Switzerland</aff>
<aff id="evw077-aff6">
<sup>6</sup>
Department of Civil and Environmental Engineering, Massachusetts Institute of Technology</aff>
<aff id="evw077-aff7">
<sup>7</sup>
Department of Biology, National University of Ireland Maynooth, County Kildare, Ireland</aff>
<aff id="evw077-aff8">
<sup>8</sup>
Computational Evolutionary Biology, Faculty of Life Sciences, The University of Manchester, United Kingdom</aff>
</contrib-group>
<author-notes>
<fn id="evw077-FM2">
<p>
<bold>Associate editor:</bold>
Tal Dagan</p>
</fn>
<corresp id="evw077-cor1">*Corresponding author: E-mail:
<email>james.mcinerney@manchester.ac.uk</email>
.</corresp>
<fn id="evw077-FM1">
<p>
<sup></sup>
These authors contributed equally to this work.</p>
</fn>
</author-notes>
<pub-date pub-type="collection">
<month>5</month>
<year>2016</year>
</pub-date>
<pub-date pub-type="epub">
<day>29</day>
<month>4</month>
<year>2016</year>
</pub-date>
<pub-date pub-type="pmc-release">
<day>29</day>
<month>4</month>
<year>2016</year>
</pub-date>
<pmc-comment>PMC release delay is 0 months and 0 days.</pmc-comment>
<volume>8</volume>
<issue>5</issue>
<fpage>1388</fpage>
<lpage>1400</lpage>
<history>
<date date-type="accepted">
<day>31</day>
<month>3</month>
<year>2016</year>
</date>
</history>
<permissions>
<copyright-statement>© The Author(s) 2016. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.</copyright-statement>
<copyright-year>2016</copyright-year>
<license xlink:href="http://creativecommons.org/licenses/by/4.0/" license-type="creative-commons">
<license-p>This is an Open Access article distributed under the terms of the Creative Commons Attribution License (
<ext-link ext-link-type="uri" xlink:href="http://creativecommons.org/licenses/by/4.0/">http://creativecommons.org/licenses/by/4.0/</ext-link>
), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.</license-p>
</license>
</permissions>
<abstract>
<p>The spatial distribution of microbes on our planet is famously formulated in the Baas Becking hypothesis as “everything is everywhere but the environment selects.” While this hypothesis does not strictly rule out patterns caused by geographical effects on ecology and historical founder effects, it does propose that the remarkable dispersal potential of microbes leads to distributions generally shaped by environmental factors rather than geographical distance. By constructing sequence similarity networks from uncultured environmental samples, we show that microbial gene pool distributions are influenced far less by geography than by ecology, thus extending the Baas Becking hypothesis from whole organisms to microbial genes. We find that gene pools are shaped by their broad ecological niche (such as sea water, fresh water, host, and airborne). We also find that freshwater habitats act as a gene exchange bridge between otherwise disconnected habitats. Finally, certain antibiotic resistance genes deviate from the general trend of habitat specificity by exhibiting a high degree of cross-habitat mobility. The strong cross-habitat mobility of antibiotic resistance genes is a cause for concern and provides a paradigmatic example of the rate at which genes colonize new habitats when new selective forces emerge.</p>
</abstract>
<kwd-group>
<kwd>biogeography</kwd>
<kwd>horizontal gene transfer</kwd>
<kwd>antibiotic resistance</kwd>
</kwd-group>
<counts>
<page-count count="13"></page-count>
</counts>
</article-meta>
</front>
<body>
<sec sec-type="intro">
<title>Introduction</title>
<p>The spatial distribution of microorganisms on the planet is often expressed according to Baas Becking’s famous tenet “everything is everywhere but the environment selects” (
<xref rid="evw077-B4" ref-type="bibr">Baas Becking 1934</xref>
). “Everything is everywhere” alludes to the remarkable dispersal potential of microorganisms, whereas “the environment selects” implies that only specifically adapted organisms will thrive and proliferate in a particular environment (
<xref rid="evw077-B21" ref-type="bibr">Fuhrman 2009</xref>
). The Baas Becking hypothesis does not rule out the possibility of strong geographic patterns but rather suggests that geography per se does not drive the distribution of species—geographic patterns could simply reflect an association between geography and ecology. Empirical testing of the Baas Becking hypothesis has focused mainly on specific microorganisms and/or specific environments (
<xref rid="evw077-B39" ref-type="bibr">Reno et al. 2009</xref>
;
<xref rid="evw077-B48" ref-type="bibr">Sul et al. 2013</xref>
). Because most members of microbial communities resist cultivation, understanding of molecular and ecological details of microbial biogeography remains vague (
<xref rid="evw077-B47" ref-type="bibr">Staley and Konopka 1985</xref>
;
<xref rid="evw077-B33" ref-type="bibr">Martiny et al. 2006</xref>
;
<xref rid="evw077-B38" ref-type="bibr">Raes et al. 2011</xref>
;
<xref rid="evw077-B23" ref-type="bibr">Hanson et al. 2012</xref>
). However, the recent increase in the number of metagenomes in public repositories offers an opportunity to explore the global distribution of coding sequences, universally shared phylogenetic marker genes, and horizontally transferred genes, including genes of clinical importance such as antibiotic resistance genes (
<xref rid="evw077-B19" ref-type="bibr">Fondi and Fani 2010</xref>
).</p>
<p>Furthermore, many studies have highlighted the importance of network theory and approaches based on sequence similarity networks (SSNs) in studying large-scale evolutionary relationships, including the influence of habitat and ecology in the distribution of gene pools, evolution of organisms, and horizontal gene transfer (HGT,
<xref rid="evw077-B30" ref-type="bibr">Lima-Mendez et al. 2008</xref>
;
<xref rid="evw077-B22" ref-type="bibr">Halary et al. 2010</xref>
;
<xref rid="evw077-B11" ref-type="bibr">Dagan 2011</xref>
;
<xref rid="evw077-B53" ref-type="bibr">Tamminen et al. 2012</xref>
;
<xref rid="evw077-B2" ref-type="bibr">Alvarez-Ponce et al. 2013</xref>
;
<xref rid="evw077-B20" ref-type="bibr">Forster et al. 2015</xref>
). However, in most cases, only completely sequenced genomes (including plasmids and phages) were used for these analyses, thus limiting the scope of the studies to mainly cultivable microorganisms or specific phyla (e.g., ciliates). Indeed, often the initial habitat assignment stems from where the organism was first isolated, which may not be its only, or even its preferred, habitat (
<xref rid="evw077-B24" ref-type="bibr">Hooper et al. 2009</xref>
).</p>
<p>Here, we empirically test the Baas Becking hypothesis by applying it to genes as well as organisms. By studying 339 metagenomes (pooled into 97 sampling points) using an SSN approach (
<xref rid="evw077-B19" ref-type="bibr">Fondi and Fani 2010</xref>
;
<xref rid="evw077-B22" ref-type="bibr">Halary et al. 2010</xref>
), we offer a culture-independent view of microbial gene pool commonalities and differences and investigate whether the distributions of genes are limited to particular ecological niches or whether they display a cosmopolitan or geographically defined distribution. Geographical influence on overall patterns of gene distribution is measured as the correlation between the physical distance and the degree of shared homologous sequences between the metagenomes. A positive or negative correlation indicates a distance-effect on global macroscale patterns of gene distributions, whereas absence of such correlation suggests independence between geographical distance and proportion of shared sequences. While gene dispersal may depend on the distribution patterns of microbial species, genes can also rapidly move between phylogenetically distant cells by means of HGT. To test whether the putative horizontally transferred genes follow the distribution of their hosts or form their own distribution, we converted the reconstructed SSN into an HGT network and investigated its main topological features.</p>
<p>By applying a network-oriented analysis pipeline on culture-independent environmental data, we here demonstrate the cosmopolitan distribution of genes and the influence of ecology on their distribution and, in parallel, we show that the same patterns hold for “mobile” genes. Our findings have important implications in several areas of biology, from environmental microbiology to antibiotic resistance, to microbial evolution and to the structure of present day common gene pools.</p>
</sec>
<sec sec-type="materials|methods">
<title>Materials and Methods</title>
<sec>
<title>Data Set Assembly and Validation</title>
<p>Metagenomic sequences (contigs) used in this work were downloaded from three major repositories, IMG (
<ext-link ext-link-type="uri" xlink:href="http://img.jgi.doe.gov">http://img.jgi.doe.gov</ext-link>
/), MG-RAST (
<ext-link ext-link-type="uri" xlink:href="http://metagenomics.anl.gov/">http://metagenomics.anl.gov/</ext-link>
, Meyer et al. 2008), and CAMERA (
<ext-link ext-link-type="uri" xlink:href="http://camera.calit2.net/">http://camera.calit2.net/</ext-link>
,
<xref rid="evw077-B49" ref-type="bibr">Sun et al. 2011</xref>
). The presence of redundant projects (i.e., the same project deposited in two different repositories) was checked manually and, in those cases, only one of the two projects was maintained. When only sequencing reads were available, shotgun metagenomics assembly was performed. Quality control and removal of identical reads were done with Prinseq (
<xref rid="evw077-B42" ref-type="bibr">Schmieder and Edwards 2011</xref>
). For most of the samples, assembled contigs were available on the public repositories mentioned above. In those cases where (Roche 454) shotgun DNA sequences were available, assembly was carried out with Phrap using the default parameters (
<xref rid="evw077-B32" ref-type="bibr">Machado et al. 2011</xref>
).</p>
<p>A total of 339 metagenome projects (
<ext-link ext-link-type="uri" xlink:href="http://gbe.oxfordjournals.org/lookup/suppl/doi:10.1093/gbe/evw077/-/DC1">supplementary material</ext-link>
S1,
<ext-link ext-link-type="uri" xlink:href="http://gbe.oxfordjournals.org/lookup/suppl/doi:10.1093/gbe/evw077/-/DC1">Supplementary Material</ext-link>
online) were retrieved, processed, and analyzed. Each of the retrieved projects was associated with a habitat, according to its sampling point as indicated in the metafiles associated with each project. Nine main categories were defined for sampling habitats, including soil, seawater, inland-water, wastewater, host, air, bioremediation, biotransformation, and sludge waste. Samples for which a clear habitat of the corresponding sampling point was not available were labeled as “Unknown.”</p>
<p>Additionally, for each metagenome the exact sampling point (latitude and longitude) was retrieved (Global Positioning System [GPS] coordinates). The physical distance (
<italic>d</italic>
, expressed in km) among the different sampling points was computed from their GPS coordinates using the spherical law of cosines, that is:
<disp-formula id="E1">
<mml:math id="EQ1">
<mml:mrow>
<mml:mi>d</mml:mi>
<mml:mo>=</mml:mo>
<mml:mtext> acos </mml:mtext>
<mml:mrow>
<mml:mo stretchy="true">(</mml:mo>
<mml:mrow>
<mml:mtext>sin</mml:mtext>
<mml:mrow>
<mml:mo stretchy="true">(</mml:mo>
<mml:mrow>
<mml:mi>ϕ</mml:mi>
<mml:mtext>1</mml:mtext>
</mml:mrow>
<mml:mo stretchy="true">)</mml:mo>
</mml:mrow>
<mml:mo>*</mml:mo>
<mml:mtext>sin</mml:mtext>
<mml:mrow>
<mml:mo stretchy="true">(</mml:mo>
<mml:mrow>
<mml:mi>ϕ</mml:mi>
<mml:mtext>2</mml:mtext>
</mml:mrow>
<mml:mo stretchy="true">)</mml:mo>
</mml:mrow>
<mml:mo></mml:mo>
<mml:mo>+</mml:mo>
<mml:mtext> cos</mml:mtext>
<mml:mrow>
<mml:mo stretchy="true">(</mml:mo>
<mml:mrow>
<mml:mi>ϕ</mml:mi>
<mml:mtext>1</mml:mtext>
</mml:mrow>
<mml:mo stretchy="true">)</mml:mo>
</mml:mrow>
<mml:mo>*</mml:mo>
<mml:mtext>cos</mml:mtext>
<mml:mrow>
<mml:mo stretchy="true">(</mml:mo>
<mml:mrow>
<mml:mi>ϕ</mml:mi>
<mml:mtext>2</mml:mtext>
</mml:mrow>
<mml:mo stretchy="true">)</mml:mo>
</mml:mrow>
<mml:mo>*</mml:mo>
<mml:mtext>cos</mml:mtext>
<mml:mrow>
<mml:mo stretchy="true">(</mml:mo>
<mml:mrow>
<mml:mi>Δ</mml:mi>
<mml:mi>λ</mml:mi>
</mml:mrow>
<mml:mo stretchy="true">)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mo stretchy="true">)</mml:mo>
</mml:mrow>
<mml:mo></mml:mo>
<mml:mo>*</mml:mo>
<mml:mi>R</mml:mi>
<mml:mo>,</mml:mo>
</mml:mrow>
</mml:math>
</disp-formula>
where ϕ1 and ϕ2 represent latitude values (in degrees) of points 1 and 2, Δλ represents the difference between longitude values of points 1 and 2, and
<italic>R</italic>
is the earth’s radius (mean radius = 6,371 km). In cases in which we found different metagenome projects (i.e., different naming and different number of sequences but same habitat) with (almost) identical sampling points (i.e., within a radius of 20 km), the corresponding projects were pooled into a single sequence fasta file. Ribosomal sequences were removed from each sequence data set using Ribopicker software (
<xref rid="evw077-B43" ref-type="bibr">Schmieder et al. 2012</xref>
) with default parameters.</p>
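<p>The distance formula above can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' script: the function name and the clamping step are assumptions added here, and the coordinates are assumed to be given in decimal degrees and converted to radians before the trigonometric functions are applied.</p>
<preformat>
```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, as stated in the text


def great_circle_km(lat1, lon1, lat2, lon2):
    """Spherical-law-of-cosines distance (km) between two points in decimal degrees.

    Hypothetical helper written for illustration only.
    """
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    delta_lambda = math.radians(lon2 - lon1)
    cos_angle = (math.sin(phi1) * math.sin(phi2)
                 + math.cos(phi1) * math.cos(phi2) * math.cos(delta_lambda))
    # Clamp to [-1, 1]: rounding can push the value just outside the acos domain.
    cos_angle = max(-1.0, min(1.0, cos_angle))
    return math.acos(cos_angle) * EARTH_RADIUS_KM
```
</preformat>
<p>With such a helper, two sampling points could be treated as candidates for pooling when the returned distance is at most 20 km and their habitats match.</p>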
<p>At the end of the data set assembly and checking procedures, 97 Fasta files were obtained, embedding a total of 1,019,781 contig sequences (longer than 1,500 bp). These were used as input for the homology-based network construction pipeline. Fasta files and scripts used in this work have been made publicly available at
<ext-link ext-link-type="uri" xlink:href="http://sourceforge.net/projects/metanetwork/">http://sourceforge.net/projects/metanetwork/</ext-link>
.</p>
</sec>
<sec>
<title>BLAST Searches and Evolutionary Distances Computation</title>
<p>Homology searches among sampled contigs were performed using BLASTp and BLASTn from the BLAST suite (
<xref rid="evw077-B1" ref-type="bibr">Altschul et al. 1997</xref>
). Only hits longer than 500 bp and with an
<italic>E</italic>
value lower than 1e
<sup>−100</sup>
were considered for further analysis (multiple hits among two contigs were counted only once and no constraints on the alignment coverage were imposed). Furthermore, several identity thresholds were considered, that is, 70%, 80%, 90%, 95%, and 99%. A summary of the main features of contigs embedded in our data set and BLAST hits is reported in
<ext-link ext-link-type="uri" xlink:href="http://gbe.oxfordjournals.org/lookup/suppl/doi:10.1093/gbe/evw077/-/DC1">supplementary fig. S1</ext-link>
,
<ext-link ext-link-type="uri" xlink:href="http://gbe.oxfordjournals.org/lookup/suppl/doi:10.1093/gbe/evw077/-/DC1">Supplementary Material</ext-link>
online.</p>
<p>BLAST outputs were then postprocessed in the form of undirected networks (accounting for the different identity thresholds). Two different kinds of network were obtained: 1) a metagenomic network and 2) a contig network. In the first type of network, nodes represent single metagenome projects (or metagenome project pools), whereas links represent the number of BLAST hits they share. In the second kind of network (“contig network”), every node represents a contig and two nodes are connected if a significant hit was retrieved between them.</p>
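<p>As an illustration of how filtered BLAST hits might be turned into a metagenomic network, the sketch below counts each pair of contigs once and sums the pairs per pair of samples. The input layout (tuples of query, subject, percent identity, alignment length, and E value), the function name, and the choice to ignore hits within the same sample are assumptions made here for the example; they are not taken from the authors' Perl and Python scripts.</p>
<preformat>
```python
from collections import defaultdict


def build_metagenome_network(hits, contig_to_sample, min_identity=90.0,
                             min_length=500, max_evalue=1e-100):
    """Return {(sample_a, sample_b): number of contig pairs sharing a hit}.

    hits: iterable of (query, subject, identity, length, evalue) tuples
    (a hypothetical, simplified view of tabular BLAST output).
    """
    contig_pairs = set()
    for query, subject, identity, length, evalue in hits:
        passes = (identity >= min_identity and length > min_length
                  and max_evalue > evalue)
        if passes and query != subject:
            # Sorting makes A-B and B-A the same pair, so multiple hits
            # between two contigs are counted only once.
            contig_pairs.add(tuple(sorted((query, subject))))

    edge_weights = defaultdict(int)
    for contig_a, contig_b in contig_pairs:
        sample_a = contig_to_sample[contig_a]
        sample_b = contig_to_sample[contig_b]
        if sample_a != sample_b:  # assumption: only links between samples
            edge_weights[tuple(sorted((sample_a, sample_b)))] += 1
    return dict(edge_weights)
```
</preformat>
<p>Calling the function once per identity threshold (70, 80, 90, 95, and 99) would yield one network per threshold.</p>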
<p>Five different identity thresholds were selected (70%, 80%, 90%, 95%, and 99%) and an alignment length threshold of 500 bp was set to place links between two different metagenomes (nodes). It must be noted that the size of the different metagenomes in the data set may influence their degree (i.e., the number of their connections) in the network; indeed, larger metagenomes might be more connected in the graph just by chance. To overcome this issue, we also computed a normalized value for each link, dividing the actual number of BLAST hits by the sum of the number of sequences possessed by the two metagenomes, and evaluated the correlation between connectivity and number of sequences for each metagenome in the normalized network. A Pearson product-moment calculation over the original (not normalized) graph revealed a (low) positive correlation between connectivity and sample size (Pearson-product-moment correlation = 0.126,
<italic>P</italic>
value < 2.2e-16). The same calculation repeated after normalizing link values produced a Pearson-product-moment correlation of 0.044, with a
<italic>P</italic>
value of 0.002117, suggesting a minor size effect on the computed similarity network. All BLAST postprocessing was performed with in-house-developed Perl and Python scripts.</p>
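<p>The normalization and the correlation check described above can be sketched as follows. The function names and the dictionary-based data layout are hypothetical; the sketch only restates the arithmetic given in the text (hits divided by the summed number of sequences of the two metagenomes, followed by a Pearson product-moment correlation) and does not compute a P value.</p>
<preformat>
```python
import math


def normalize_edges(edge_weights, sample_sizes):
    """Divide each link weight by the summed size of the two linked samples."""
    return {
        (a, b): weight / (sample_sizes[a] + sample_sizes[b])
        for (a, b), weight in edge_weights.items()
    }


def pearson(xs, ys):
    """Pearson product-moment correlation of two equally long sequences."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    norm = math.sqrt(sum((x - mean_x) ** 2 for x in xs)
                     * sum((y - mean_y) ** 2 for y in ys))
    if norm == 0.0:
        raise ValueError("correlation is undefined for a constant sequence")
    return covariance / norm
```
</preformat>
<p>Passing the number of bases per sample instead of the number of contigs as <italic>sample_sizes</italic> would give the alternative, base-count normalization.</p>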
<p>To account for the actual amount of sequence possessed by each sample (and not only the number of contigs possessed), we performed an alternative normalization process, dividing the number of BLAST hits between two nodes by the number of bases (not the contigs) possessed by the two corresponding samples. General trends computed in the rest of the article were not affected by the normalization procedure implemented since the clustering of the different samples was still influenced by ecology rather than by their physical distance.</p>
<p>To test whether a correlation exists between the number of BLAST hits shared by two metagenomes and their geographical distance, the Pearson-product-moment correlation was calculated. Results obtained (Pearson-product-moment correlation = −0.038,
<italic>P</italic>
value = 6 × 10
<sup>−3</sup>
) revealed the absence of a statistically significant correlation between physical distance (expressed in km) and the number of shared hits (
<ext-link ext-link-type="uri" xlink:href="http://gbe.oxfordjournals.org/lookup/suppl/doi:10.1093/gbe/evw077/-/DC1">supplementary fig. S2</ext-link>
,
<ext-link ext-link-type="uri" xlink:href="http://gbe.oxfordjournals.org/lookup/suppl/doi:10.1093/gbe/evw077/-/DC1">Supplementary Material</ext-link>
online).</p>
<p>To account for the evolutionary distances among the (coding) sequences in our data set, we also implemented the following pipeline. First, we performed an all-versus-all BLAST of the coding sequences embedded in our metagenome data set. Next, we extracted 100,000 groups of homologs among the different samples using an
<italic>E</italic>
-value threshold of 1e-70. At this stage, to avoid considering underrepresented samples, we focused our analysis only on the most represented samples (i.e., inland water, host associated, sea water, and soil). Such a low
<italic>E</italic>
-value threshold was used to retrieve highly similar sequences from the different data sets that could facilitate accurate sequence alignment and distance calculation in the next steps of the pipeline. Identified groups of homologs were then aligned using Muscle (
<xref rid="evw077-B15" ref-type="bibr">Edgar 2004a</xref>
,
<xref rid="evw077-B16" ref-type="bibr">2004b</xref>
) and the resulting multialignments were automatically edited using Gblocks (
<xref rid="evw077-B52" ref-type="bibr">Talavera and Castresana 2007</xref>
) to remove poorly aligned regions. Edited multialignments were then used as input for the distmat tool of EMBOSS (
<xref rid="evw077-B40" ref-type="bibr">Rice et al. 2000</xref>
) suite, leading to the creation of one distance matrix (according to Jukes–Cantor model [
<xref rid="evw077-B27" ref-type="bibr">Jukes and Cantor 1969</xref>
]) for each group of homologs shared by the samples. From these we calculated and compared the evolutionary distances among genes shared by the same samples and among those shared by samples from different niches.</p>
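<p>For reference, the Jukes–Cantor correction applied to a pair of aligned nucleotide sequences can be written as a short function. This is a sketch of the standard formula only, not a reimplementation of the EMBOSS distmat tool, whose gap handling and output scaling may differ; the function name and the decision to skip positions with gaps or ambiguous bases are assumptions.</p>
<preformat>
```python
import math


def jukes_cantor_distance(seq_a, seq_b):
    """Jukes-Cantor distance: d = -3/4 * ln(1 - 4/3 * p).

    p is the fraction of mismatching positions among aligned positions
    where both sequences carry an unambiguous base (A, C, G, or T).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must come from the same alignment")
    compared = 0
    mismatches = 0
    for base_a, base_b in zip(seq_a.upper(), seq_b.upper()):
        if base_a in "ACGT" and base_b in "ACGT":
            compared += 1
            if base_a != base_b:
                mismatches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    p = mismatches / compared
    if p >= 0.75:
        return math.inf  # the correction is undefined at saturation
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)
```
</preformat>
<p>Distances computed in this way for each group of homologs could then be split into within-niche and between-niche comparisons.</p>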
</sec>
<sec>
<title>Identification of Marker Genes</title>
<p>Universal phylogenetic marker genes were identified from the metagenomes using the fetchMG program version 1.0 (
<xref rid="evw077-B50" ref-type="bibr">Sunagawa et al. 2013</xref>
). All identified marker genes from one metagenome were pooled and used in network analysis. Connections between metagenomes were normalized with the sum of sequences in the two metagenomes, as described previously. To test whether a correlation existed between the number of marker genes shared by two metagenomes and their geographical distance, the Pearson-product-moment correlation was calculated (
<ext-link ext-link-type="uri" xlink:href="http://gbe.oxfordjournals.org/lookup/suppl/doi:10.1093/gbe/evw077/-/DC1">supplementary fig. S3</ext-link>
,
<ext-link ext-link-type="uri" xlink:href="http://gbe.oxfordjournals.org/lookup/suppl/doi:10.1093/gbe/evw077/-/DC1">Supplementary Material</ext-link>
online).</p>
</sec>
<sec>
<title>Network Analysis and Visualization</title>
<p>Graph topology and statistical tests were performed with the igraph (v. 1.0.0) library of the R statistical package (v. 3.1.3,
<ext-link ext-link-type="uri" xlink:href="http://www.r-project.org">http://www.r-project.org</ext-link>
/) and in-house-developed Perl and R scripts. The main graph metrics evaluated in this work were betweenness centrality, clustering coefficient, closeness centrality, and assortativity. Briefly, betweenness is a centrality measure that indicates which nodes are holding the network together; nodes with high betweenness values can be bridges between otherwise disconnected regions of the network. The clustering coefficient measures the extent to which the neighbors of a given node are interlinked. We used this coefficient as an indicator of cohesiveness around a node neighborhood. The closeness of a node is the inverse of its average distance to all other nodes in the graph. The higher the closeness, the more central is the node. Finally, assortativity measures the tendency of nodes with the same label (the source ecological niche in our case) to preferentially connect with one another in the graph (
<xref rid="evw077-B36" ref-type="bibr">Newman 2003</xref>
). If a network has perfect assortativity (
<italic>r</italic>
= 1), then all nodes connect only with nodes of the same kind. If the network has no assortativity (
<italic>r</italic>
= 0), then any node can randomly connect to any other node. If a network is perfectly disassortative (
<italic>r</italic>
= −1), all nodes connect only to nodes of a different kind.</p>
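<p>The label-based assortativity coefficient can be illustrated with a small, dependency-free function following the mixing-matrix definition of Newman (2003). The analyses in this work were run with igraph in R; the Python function below, its name, and its edge-list input are assumptions used only to make the definition concrete for an unweighted, undirected graph.</p>
<preformat>
```python
from collections import defaultdict


def label_assortativity(edges, labels):
    """Assortativity by node label for an undirected, unweighted graph.

    r = (sum_i e_ii - sum_i a_i * b_i) / (1 - sum_i a_i * b_i), where e is
    the normalized mixing matrix and a_i, b_i are its row and column sums
    (identical here because the graph is undirected).
    """
    mixing = defaultdict(float)
    for u, v in edges:
        # Count every undirected edge in both directions.
        mixing[(labels[u], labels[v])] += 1.0
        mixing[(labels[v], labels[u])] += 1.0
    total = sum(mixing.values())
    kinds = {kind for pair in mixing for kind in pair}
    trace = sum(mixing.get((k, k), 0.0) for k in kinds) / total
    row_sum = {k: sum(mixing.get((k, j), 0.0) for j in kinds) / total
               for k in kinds}
    expected = sum(value ** 2 for value in row_sum.values())
    if expected == 1.0:
        raise ValueError("undefined when every edge end has the same label")
    return (trace - expected) / (1.0 - expected)
```
</preformat>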
<p>Statistical support for these centrality measures was provided through randomization of the original graph. In more detail, the null model reflects the possibility that interactions are equally likely between any pair of nodes in the graph. In other words, our stochastic null model has no centrality structure. In this case, our randomized networks contained the same nodes, but edges were rearranged randomly among them (edge rearrangement). Statistical tests (e.g., Mann–Whitney test) were carried out each time to infer whether original and randomized networks differed significantly.</p>
<p>Network visualization and postprocessing were done using the Cytoscape and Gephi software (
<xref rid="evw077-B6" ref-type="bibr">Bastian et al. 2009</xref>
;
<xref rid="evw077-B28" ref-type="bibr">Kohl et al. 2011</xref>
). The GeoLayout Gephi plugin was used to build geocoded graphs of gene sharing.</p>
</sec>
<sec>
<title>Computational Strategy for Clusters Identification and Testing</title>
<p>To identify network clusters in the metagenomes network, a community detection algorithm (MCL,
<xref rid="evw077-B56" ref-type="bibr">van Dongen and Abreu-Goodger 2012</xref>
) was first applied to the graph. The main parameter of this algorithm is the inflation factor (IF) that modulates cluster granularity. To choose the optimal IF (i.e., to select the proper trade-off between cluster size and overall cluster homogeneity), we explored values ranging from 1.2 to 5 by steps of 0.2 and estimated cluster homogeneity by computing the average intracluster clustering coefficient (ICCC) at every step. Briefly, the clustering coefficient measures the “cliquishness” around a node; hence, its average over the nodes of a cluster can be used as a measure of the cluster homogeneity. ICCC is computed considering only the edges within clusters and, in principle, a clustering result that maximizes the ICCC produces more homogeneous graphs.
<ext-link ext-link-type="uri" xlink:href="http://gbe.oxfordjournals.org/lookup/suppl/doi:10.1093/gbe/evw077/-/DC1">Supplementary fig. S5</ext-link>
,
<ext-link ext-link-type="uri" xlink:href="http://gbe.oxfordjournals.org/lookup/suppl/doi:10.1093/gbe/evw077/-/DC1">Supplementary Material</ext-link>
online, shows the trend of the ICCC and the number of clusters at different IF values for the 90% network. As expected, the number of clusters increases as the IF increases, whereas the opposite holds for ICCC. The peak at inflation value of 1.4 suggests that this clustering solution is the best trade-off between network fragmentation and cluster size (
<ext-link ext-link-type="uri" xlink:href="http://gbe.oxfordjournals.org/lookup/suppl/doi:10.1093/gbe/evw077/-/DC1">supplementary fig. S4</ext-link>
,
<ext-link ext-link-type="uri" xlink:href="http://gbe.oxfordjournals.org/lookup/suppl/doi:10.1093/gbe/evw077/-/DC1">Supplementary Material</ext-link>
online). Additionally, this value also performed reasonably well for the networks obtained at the other identity thresholds, allowing the identification of eight major clusters (i.e., with at least two nodes).</p>
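<p>One way to compute an average intracluster clustering coefficient for a given clustering is sketched below. The text does not spell out the exact averaging scheme, so several choices here are assumptions: the graph is treated as unweighted, only edges whose two ends fall in the same cluster are kept, nodes with fewer than two intracluster neighbors contribute a coefficient of 0, and the mean is taken over all nodes. The function name is hypothetical, and the clustering itself (e.g., from MCL at a given IF) is taken as input.</p>
<preformat>
```python
from itertools import combinations


def intracluster_clustering_coefficient(edges, cluster_of):
    """Mean local clustering coefficient using intracluster edges only."""
    neighbors = {node: set() for node in cluster_of}
    for u, v in edges:
        if u != v and cluster_of[u] == cluster_of[v]:
            neighbors[u].add(v)
            neighbors[v].add(u)

    coefficients = []
    for node, nbrs in neighbors.items():
        k = len(nbrs)
        if k >= 2:
            # Count neighbor pairs that are themselves linked.
            linked = sum(1 for a, b in combinations(nbrs, 2)
                         if b in neighbors[a])
            coefficients.append(2.0 * linked / (k * (k - 1)))
        else:
            coefficients.append(0.0)
    return sum(coefficients) / len(coefficients)
```
</preformat>
<p>Repeating the calculation for the clusterings obtained at each IF value from 1.2 to 5 would give the kind of curve used to pick the trade-off.</p>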
<p>To test the presence of a correlation between the clustering of the different nodes (metagenomes) and their source habitat, we implemented a computational strategy similar to the one applied by
<xref rid="evw077-B30" ref-type="bibr">Lima-Mendez et al. (2008)</xref>
. Once the clusters were identified in the network, we evaluated the correspondence between such clusters and the source habitat of the metagenomes represented by the different nodes. In other words, we evaluated whether metagenomes belonging to the same ecological niche tended to cluster together or not in a significant manner.</p>
<p>Three different measures are classically adopted to evaluate the overlap between some kind of classifications (in our case network clustering and source ecological niche): recall (R), precision (P), and accuracy (A). R evaluates whether all nodes of a given habitat are found in the same cluster (
<italic>R</italic>
= 1) or are spread across different clusters of the network (
<italic>R</italic>
< 1). Conversely, P measures how well a given cluster corresponds to its best-matching habitat; a value of 1 indicates that all nodes in the cluster belong to the same habitat. Similar to
<xref rid="evw077-B30" ref-type="bibr">Lima-Mendez et al. (2008)</xref>
, the clustering-wise statistics were computed from the class- and cluster-wise statistics as the weighted means over all habitats/clusters of the class-/cluster-wise values. The geometric mean of R and P gives the accuracy measure. Results obtained with this approach were compared with random expectations by performing 1,000 permutations in which the labels of the nodes were shuffled while the structure of the network was maintained. The null hypothesis underlying this approach is that any node (group of sequences) can occupy any network position (i.e., could cluster with any other node in the network). Accordingly, during our randomizations, the network structure is held constant and the node labels are permuted. A graph sampled with this approach retains all network traits of the empirical graph and this enables assessment of whether the node characteristics depend on the structure of the graph. For each of the permutations, the same statistics (R, P, A) were computed and finally compared to the observed ones.</p>
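<p>The three statistics and the label-permutation test can be sketched as follows. The sketch reflects one reading of the description above and should be taken as an assumption: class-wise recall is the largest share of a habitat found in a single cluster, cluster-wise precision is the share of a cluster occupied by its best-matching habitat, and both are averaged with weights proportional to habitat and cluster size. Function names, the fixed random seed, and the empirical P value formula are illustrative choices.</p>
<preformat>
```python
import math
import random
from collections import Counter


def clustering_scores(cluster_of, habitat_of):
    """Return (recall, precision, accuracy) for a clustering versus habitats."""
    nodes = list(cluster_of)
    overlap = Counter((habitat_of[n], cluster_of[n]) for n in nodes)
    habitats = set(habitat_of[n] for n in nodes)
    clusters = set(cluster_of[n] for n in nodes)
    # Size-weighted means reduce to: sum of best overlaps / number of nodes.
    recall = sum(max(overlap[(h, c)] for c in clusters)
                 for h in habitats) / len(nodes)
    precision = sum(max(overlap[(h, c)] for h in habitats)
                    for c in clusters) / len(nodes)
    return recall, precision, math.sqrt(recall * precision)


def permutation_p_value(cluster_of, habitat_of, n_permutations=1000, seed=0):
    """Share of label shufflings whose accuracy reaches the observed one."""
    rng = random.Random(seed)
    nodes = list(cluster_of)
    observed = clustering_scores(cluster_of, habitat_of)[2]
    labels = [habitat_of[n] for n in nodes]
    at_least = 0
    for _ in range(n_permutations):
        rng.shuffle(labels)  # network and clusters stay fixed
        shuffled = dict(zip(nodes, labels))
        if clustering_scores(cluster_of, shuffled)[2] >= observed:
            at_least += 1
    return (at_least + 1) / (n_permutations + 1)
```
</preformat>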
</sec>
<sec>
<title>Contig Taxonomic Annotation and Source Molecule Identification</title>
<p>Each contig of the metagenome data set was assigned to the (putative) corresponding genus using the approach implemented in RAIphy (
<xref rid="evw077-B35" ref-type="bibr">Nalbantoglu et al. 2011</xref>
).</p>
<p>Since RAIphy is a semisupervised method that relies on reference genomes, sample types that have a better representative set of sequenced genomes may achieve higher supervised classification rates and will tend to connect with each other more frequently. To avoid possible biases due to the use of a semisupervised method, we also implemented a composition-based method (using tetranucleotide frequency distributions) for the identification of (putative) HGTs. Briefly, for each match between two contigs, the tetranucleotide frequencies of the flanking regions were compared as described in
<xref rid="evw077-B54" ref-type="bibr">Teeling et al. (2004)</xref>
. Only matches where the flanking region was at least 1,000 bp and the Pearson correlation coefficient between the tetranucleotide profiles was below 0.7 were considered as putative HGT events.</p>
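<p>The composition-based filter can be illustrated with the simplified sketch below. It is an assumption-laden approximation: it correlates raw tetranucleotide frequencies of the two flanking regions, whereas the procedure of Teeling et al. (2004) works on z-score-transformed profiles, and the function names and the way the flanks are passed in are hypothetical. Only the two thresholds (flank of at least 1,000 bp, correlation below 0.7) come from the text.</p>
<preformat>
```python
import itertools
import math

TETRAMERS = ["".join(p) for p in itertools.product("ACGT", repeat=4)]


def tetra_profile(sequence):
    """Relative frequency of each of the 256 tetranucleotides."""
    sequence = sequence.upper()
    counts = dict.fromkeys(TETRAMERS, 0)
    for start in range(len(sequence) - 3):
        word = sequence[start:start + 4]
        if word in counts:  # skips words with ambiguous bases
            counts[word] += 1
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence has no valid tetranucleotide")
    return [counts[t] / total for t in TETRAMERS]


def pearson(xs, ys):
    """Pearson correlation of two equally long numeric lists."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    norm = math.sqrt(sum((x - mean_x) ** 2 for x in xs)
                     * sum((y - mean_y) ** 2 for y in ys))
    if norm == 0.0:
        raise ValueError("correlation is undefined for a constant profile")
    return covariance / norm


def is_putative_hgt(flank_a, flank_b, min_flank=1000, max_correlation=0.7):
    """True when both flanks are long enough and compositionally dissimilar."""
    if not min(len(flank_a), len(flank_b)) >= min_flank:
        return False
    correlation = pearson(tetra_profile(flank_a), tetra_profile(flank_b))
    return max_correlation > correlation
```
</preformat>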
<p>The most likely source molecule of each contig (i.e., plasmid or chromosome) was identified using the composition-oriented software cBar (
<xref rid="evw077-B58" ref-type="bibr">Zhou and Xu 2010</xref>
). Both tools were used with default parameters.</p>
</sec>
<sec>
<title>ORF Identification and Functional Annotation</title>
<p>ORFs were identified using the FragGeneScan software (Rho et al. 2010). Functional annotation of identified ORFs was performed using hmmscan from HMMER (version 3.1b2 [
<xref rid="evw077-B18" ref-type="bibr">Finn et al. 2011</xref>
]) with an
<italic>E</italic>
-value cut-off of 0.1 and probing the Pfam database (
<xref rid="evw077-B17" ref-type="bibr">Finn et al. 2014</xref>
). Antibiotic Resistance (AR)-related genes were identified through BLAST (blastp) searches against the Antibiotic Resistance Database (
<xref rid="evw077-B31" ref-type="bibr">Liu and Pop 2009</xref>
).</p>
</sec>
<sec>
<title>Adjacency Matrix Construction</title>
<p>The adjacency matrix accounting for the degree of interconnections among samples from the different environments was computed as follows:</p>
<p>For each habitat, the proportion of connections of that habitat with all the other habitats has been computed. The proportion of connections connecting habitat A with habitat B (
<inline-formula id="IE1">
<mml:math id="IEQ1">
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="normal">P</mml:mi>
<mml:mi mathvariant="normal">C</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>A</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>B</mml:mi>
</mml:mrow>
</mml:msub>
</mml:math>
</inline-formula>
) is given by this formula:
<disp-formula id="E2">
<mml:math id="EQ2">
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mtext>PC</mml:mtext>
</mml:mrow>
<mml:mrow>
<mml:mi>A</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>B</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>=</mml:mo>
<mml:mfrac>
<mml:mrow>
<mml:mtext>Weight</mml:mtext>
<mml:mo stretchy="false">(</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mtext>Edge</mml:mtext>
</mml:mrow>
<mml:mrow>
<mml:mi>A</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>B</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:msub>
<mml:mstyle mathsize="140%" displaystyle="true">
<mml:mo>∑</mml:mo>
</mml:mstyle>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:mtext>Weight</mml:mtext>
<mml:mo stretchy="false">(</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mtext>Edge</mml:mtext>
</mml:mrow>
<mml:mrow>
<mml:mi>A</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>i</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:mfrac>
</mml:mrow>
</mml:math>
</disp-formula>
</p>
<p>The PC index ranges from 0 to 1 and measures the specificity of the connection of one habitat with respect to the others. Since the denominator represents the amount of sequences in one of the two analyzed samples, this measure is specific to each of the analyzed environments and is not symmetric, that is,
<inline-formula id="IE2">
<mml:math id="IEQ2">
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="normal">P</mml:mi>
<mml:mi mathvariant="normal">C</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>A</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>B</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>≠</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="normal">P</mml:mi>
<mml:mi mathvariant="normal">C</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>B</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>A</mml:mi>
</mml:mrow>
</mml:msub>
</mml:math>
</inline-formula>
. The PC values have been organized in the form of a matrix where all these values have been normalized by computing the row
<italic>Z</italic>
score, which means that rows of the matrix are centered and scaled by subtracting the mean of the row from every value and then dividing the resulting values by the standard deviation of the row.
<disp-formula id="E3">
<mml:math id="EQ3">
<mml:mrow>
<mml:msubsup>
<mml:mi>Z</mml:mi>
<mml:mi>i</mml:mi>
<mml:mrow>
<mml:mtext>row</mml:mtext>
</mml:mrow>
</mml:msubsup>
<mml:mo>=</mml:mo>
<mml:mfrac>
<mml:mrow>
<mml:msub>
<mml:mi>X</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:mo>−</mml:mo>
<mml:msub>
<mml:mo>µ</mml:mo>
<mml:mrow>
<mml:mtext>row</mml:mtext>
</mml:mrow>
</mml:msub>
</mml:mrow>
<mml:mrow>
<mml:msub>
<mml:mi>σ</mml:mi>
<mml:mrow>
<mml:mtext>row</mml:mtext>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:mfrac>
</mml:mrow>
</mml:math>
</disp-formula>
</p>
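<p>The two steps above (proportion of connections, then row-wise Z scores) can be sketched as follows. The nested-dictionary layout and function names are hypothetical, and the use of the sample standard deviation (n − 1 in the denominator, as in R's default) is an assumption, since the text does not say which estimator was used.</p>
<preformat>
```python
import statistics


def proportion_of_connections(weights):
    """weights[a][b]: summed edge weight between habitats a and b.

    Returns pc[a][b] = weights[a][b] / sum_i weights[a][i].
    """
    pc = {}
    for habitat, row in weights.items():
        total = sum(row.values())
        pc[habitat] = {other: weight / total for other, weight in row.items()}
    return pc


def row_z_scores(matrix):
    """Center and scale each row: (x - row mean) / row standard deviation."""
    scaled = {}
    for habitat, row in matrix.items():
        values = list(row.values())
        mean = statistics.mean(values)
        sd = statistics.stdev(values)  # assumption: sample standard deviation
        scaled[habitat] = {other: (x - mean) / sd for other, x in row.items()}
    return scaled
```
</preformat>
<p>Because each row is divided by its own total, the value for habitat A toward habitat B generally differs from the value for B toward A, which is the asymmetry noted in the text. A row whose values are all equal has zero standard deviation and would need separate handling.</p>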
</sec>
</sec>
<sec>
<title>Results and Discussion</title>
<sec>
<title>General Features</title>
<p>We built an SSN using metagenome sequences from 97 sampling sites (representing 339 metagenomic projects, see
<ext-link ext-link-type="uri" xlink:href="http://gbe.oxfordjournals.org/lookup/suppl/doi:10.1093/gbe/evw077/-/DC1">supplementary material</ext-link>
S1,
<ext-link ext-link-type="uri" xlink:href="http://gbe.oxfordjournals.org/lookup/suppl/doi:10.1093/gbe/evw077/-/DC1">Supplementary Material</ext-link>
online) where nodes represent sampling points and links reflect the number of shared homologous sequences (see Materials and Methods for network construction details). We used different sequence identity thresholds in building these SSNs (i.e., 70%, 80%, 90%, 95%, and 99%). Results presented here refer to the 90% network, although the results are valid for all identity thresholds (see
<ext-link ext-link-type="uri" xlink:href="http://gbe.oxfordjournals.org/lookup/suppl/doi:10.1093/gbe/evw077/-/DC1">supplementary material</ext-link>
S1,
<ext-link ext-link-type="uri" xlink:href="http://gbe.oxfordjournals.org/lookup/suppl/doi:10.1093/gbe/evw077/-/DC1">Supplementary Material</ext-link>
online). In
<xref ref-type="fig" rid="evw077-F1">figure 1</xref>
, the extent of sequence sharing among the different samples is presented as a network, together with the geographical location of each sampling site. To test whether physical distance and the number of homologous DNA fragments shared by the different metagenomes correlate, we calculated Pearson-product-moment correlation coefficients for samples from different (Pearson Correlation Coefficient [PCC] = −0.038 and
<italic>P</italic>
value = 6 × 10
<sup>−3</sup>
) and same habitats (from PCC = −0.2 in soil samples to 0.04 in fresh water samples,
<italic>P</italic>
values < 6 × 10
<sup>−3</sup>
;
<ext-link ext-link-type="uri" xlink:href="http://gbe.oxfordjournals.org/lookup/suppl/doi:10.1093/gbe/evw077/-/DC1">supplementary fig</ext-link>
. S2,
<ext-link ext-link-type="uri" xlink:href="http://gbe.oxfordjournals.org/lookup/suppl/doi:10.1093/gbe/evw077/-/DC1">Supplementary Material</ext-link>
online). Therefore, physical distance at the spatial resolution provided by the available metagenomes does not explain the distribution of the links in the metagenome-derived SSN, suggesting a relatively marginal role of physical distance in shaping the biological relationships. Examples of this situation are reported in
<xref ref-type="fig" rid="evw077-F1">figure 1
<italic>b</italic>
and
<italic>c</italic>
</xref>
for host- and sea water-derived samples. Metagenomes of the subnetwork of
<xref ref-type="fig" rid="evw077-F1">figure 1
<italic>b</italic>
</xref>
(samples no. 77, 25, 88 and 89, see
<ext-link ext-link-type="uri" xlink:href="http://gbe.oxfordjournals.org/lookup/suppl/doi:10.1093/gbe/evw077/-/DC1">supplementary material</ext-link>
S2,
<ext-link ext-link-type="uri" xlink:href="http://gbe.oxfordjournals.org/lookup/suppl/doi:10.1093/gbe/evw077/-/DC1">Supplementary Material</ext-link>
online), although connected to almost all the other metagenomes in the network, share many more sequences among themselves. The sequences embedded in these metagenomes were obtained from microbiomes of geographically distant arthropods:
<italic>Dendroctonus ponderosae</italic>
(samples 88 and 89),
<italic>D. frontalis</italic>
(sample 25),
<italic>Xyleborus affinis</italic>
(sample 77), and
<italic>Sirex noctilio</italic>
(sample 54). We observed a similar trend in geographically disparate specimens of sea squirt
<italic>Ciona intestinalis</italic>
(
<xref rid="evw077-B13" ref-type="bibr">Dishaw et al. 2014</xref>
), consistent with the selection of a core community by that particular ecosystem. We observed the same feature for metagenomes displayed in
<xref ref-type="fig" rid="evw077-F1">figure 1
<italic>c</italic>
</xref>
(samples no. 2, 97, 10, 39, 14, 28, 27, 2, and 8, see
<ext-link ext-link-type="uri" xlink:href="http://gbe.oxfordjournals.org/lookup/suppl/doi:10.1093/gbe/evw077/-/DC1">supplementary material</ext-link>
S2,
<ext-link ext-link-type="uri" xlink:href="http://gbe.oxfordjournals.org/lookup/suppl/doi:10.1093/gbe/evw077/-/DC1">Supplementary Material</ext-link>
online), all from seawater samples and all sharing heavy connections despite most being separated by large geographical distances. Accordingly, we speculate that the similarity of the ecological niches in which samples were collected explains the high level of gene sharing among these two sets of metagenomes.
<xref ref-type="fig" rid="evw077-F1">Figure 1
<italic>c</italic>
</xref>
also shows that, within samples sharing the same source niche, some nodes that are close in the network (e.g., 10, 7, and 97) display fewer connections among themselves than, for example, with nodes 28 and 2, which are far away on the map. This, in turn, might suggest the limits of using physical distance as a proxy for the “real” distance among gene pools. Indeed, barriers and forces other than geographical distance might account for the actual dispersal. Sea currents, for example, may create quite different environments at two close points in the network of metagenomic samples. Similarly, mountains might separate physically close terrestrial DNA pools. On the other hand, these features are hard to model confidently on a large, global scale such as the one used in this work.
<fig id="evw077-F1" orientation="portrait" position="float">
<label>F
<sc>ig</sc>
. 1.—</label>
<caption>
<p>(
<italic>A</italic>
) Overall SSN among the 97 sampling points together with their geographical positions. Each node represents a metagenome project and the links represent the presence of homologous sequences between them. Node and link sizes are proportional to the number of sequences embedded in the sample and the (normalized) number of shared sequences, respectively. In (
<italic>B</italic>
) and (
<italic>C</italic>
) specific study cases are reported (see text for details) for host-(red nodes) and sea-water (blue)-derived samples. The connections among samples from the same ecological niche and those among samples from different ecological niches are shown in (
<italic>D</italic>
) and (
<italic>E</italic>
), respectively.</p>
</caption>
<graphic xlink:href="evw077f1p"></graphic>
</fig>
</p>
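<p>The distance test described above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: it assumes that physical distance is measured as great-circle (haversine) distance, and the sample identifiers, coordinates, and shared-sequence counts are hypothetical toy values.</p>

```python
import math

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance (km) between two points given in decimal degrees."""
    r = 6371.0  # mean Earth radius in km
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def pearson(xs, ys):
    """Pearson product-moment correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Hypothetical toy data: sample id to (lat, lon), and shared-sequence counts per pair.
coords = {"s1": (43.8, 11.2), "s2": (60.2, 24.9), "s3": (47.4, 8.5)}
shared = {("s1", "s2"): 120, ("s1", "s3"): 95, ("s2", "s3"): 130}

distances = [haversine_km(*coords[a], *coords[b]) for a, b in shared]
counts = [shared[pair] for pair in shared]
pcc = pearson(distances, counts)
print(round(pcc, 3))
```

<p>A PCC close to zero on the real data, as reported in the text, would indicate that the number of shared sequences does not vary linearly with distance.</p>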
<p>A preliminary visual inspection of the network revealed that samples from the same ecological niches (
<xref ref-type="fig" rid="evw077-F1">fig. 1
<italic>D</italic>
</xref>
) are more tightly connected than samples from different niches (
<xref ref-type="fig" rid="evw077-F1">fig. 1
<italic>E</italic>
</xref>
). Thus, to explicitly test the ecological niche versus geographical distribution hypotheses, we evaluated the correlation between the grouping of the different metagenomes (i.e., the habitat composition of the major clusters in the network of
<xref ref-type="fig" rid="evw077-F1">fig. 1</xref>
) and their source habitat. We first clustered the metagenomes according to the Markov Cluster (MCL) algorithm (see Materials and Methods) and then evaluated whether metagenomes belonging to the same ecological niche tended to (significantly) cluster together using recall (R), precision (P), and accuracy (A) measures. This analysis (
<xref ref-type="fig" rid="evw077-F2">fig. 2</xref>
) revealed relatively high values of both R and P across all the different networks (average
<italic>R</italic>
= 0.588 and average
<italic>P</italic>
= 0.71). A similar trend was observed also when measuring clustering accuracy (A) (
<xref ref-type="fig" rid="evw077-F2">fig. 2</xref>
). Such high values of P, R, and A were never obtained during 1,000 random permutations (label shuffling, see Materials and Methods) of the original networks, giving a
<italic>P</italic>
value estimate < 10
<sup>−3</sup>
. The same results were observed for networks obtained with lower sequence identity thresholds (
<ext-link ext-link-type="uri" xlink:href="http://gbe.oxfordjournals.org/lookup/suppl/doi:10.1093/gbe/evw077/-/DC1">supplementary fig. S6</ext-link>
,
<ext-link ext-link-type="uri" xlink:href="http://gbe.oxfordjournals.org/lookup/suppl/doi:10.1093/gbe/evw077/-/DC1">Supplementary Material</ext-link>
online) and when evolutionary distances were considered for a set of 10,000 randomly sampled coding sequences in the data set (
<ext-link ext-link-type="uri" xlink:href="http://gbe.oxfordjournals.org/lookup/suppl/doi:10.1093/gbe/evw077/-/DC1">supplementary fig. S7</ext-link>
,
<ext-link ext-link-type="uri" xlink:href="http://gbe.oxfordjournals.org/lookup/suppl/doi:10.1093/gbe/evw077/-/DC1">Supplementary Material</ext-link>
online).
<fig id="evw077-F2" orientation="portrait" position="float">
<label>F
<sc>ig</sc>
. 2.—</label>
<caption>
<p>Recall, precision, and accuracy values for real and random networks at the 90% sequence identity threshold.</p>
</caption>
<graphic xlink:href="evw077f2p"></graphic>
</fig>
<fig id="evw077-F3" orientation="portrait" position="float">
<label>F
<sc>ig</sc>
. 3.—</label>
<caption>
<p>(
<italic>A</italic>
) Force-directed layout representation of the metagenome network (at 90% sequence identity threshold). Each metagenome is colored according to its source habitat as indicated in the legend and major coherent clusters are highlighted. (
<italic>B</italic>
) The putative HGT network derived from network shown in (
<italic>A</italic>
) (see text for details on HGT network construction).</p>
</caption>
<graphic xlink:href="evw077f3p"></graphic>
</fig>
</p>
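<p>The cluster evaluation and label-shuffling test can be illustrated with the sketch below. The text does not spell out how recall, precision, and accuracy were computed, so this example assumes a pair-counting definition (two metagenomes count as a true positive when they share both a cluster and a habitat). The cluster assignments, habitat labels, and function names are hypothetical, and the MCL clustering step itself is not reproduced.</p>

```python
import random
from itertools import combinations

def pairwise_scores(clusters, habitats):
    """Pair-counting precision, recall and accuracy of a clustering vs. habitat labels."""
    tp = fp = fn = tn = 0
    for a, b in combinations(sorted(clusters), 2):
        same_cluster = clusters[a] == clusters[b]
        same_habitat = habitats[a] == habitats[b]
        if same_cluster and same_habitat:
            tp += 1
        elif same_cluster:
            fp += 1
        elif same_habitat:
            fn += 1
        else:
            tn += 1
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    return precision, recall, accuracy

def permutation_p_value(clusters, habitats, n_perm=1000, seed=0):
    """Empirical P value for accuracy under random shuffling of habitat labels."""
    rng = random.Random(seed)
    observed = pairwise_scores(clusters, habitats)[2]
    nodes = sorted(habitats)
    labels = [habitats[n] for n in nodes]
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(labels)
        shuffled = dict(zip(nodes, labels))
        if pairwise_scores(clusters, shuffled)[2] >= observed:
            hits += 1
    # Assumption: the add-one correction keeps the estimate above zero.
    return (hits + 1) / (n_perm + 1)

# Hypothetical toy input: metagenome id to cluster id, and to source habitat.
clusters = {"m1": 0, "m2": 0, "m3": 1, "m4": 1, "m5": 1}
habitats = {"m1": "sea", "m2": "sea", "m3": "soil", "m4": "soil", "m5": "sea"}
precision, recall, accuracy = pairwise_scores(clusters, habitats)
p_value = permutation_p_value(clusters, habitats, n_perm=200)
print(precision, recall, accuracy, p_value)
```

<p>If no shuffled labeling reaches the observed score in 1,000 permutations, the empirical P value falls at or below roughly 10<sup>−3</sup>, which is the order of magnitude reported in the text.</p>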
<p>Additionally, assortativity was used to evaluate the tendency (if any) of nodes of the same type (i.e., sequences from the same source habitats) to cluster together in the network. Briefly, the assortativity coefficient measures the preference of a network's nodes to attach to others that share a particular attribute (source environment in our case) and ranges between −1 (disassortative network) and 1 (assortative network). Assortativity for the network in
<xref ref-type="fig" rid="evw077-F3">figure 3</xref>
was found to be 0.157, thus confirming a general pattern of preferential connections between nodes of a particular ecological niche. Importantly, higher assortativity values were never encountered in 1,000 randomizations of the original network (edge rearrangement, see Network Analysis and Visualization), allowing us to infer a rough estimate of a
<italic>P</italic>
value lower than 10
<sup>−3</sup>
.</p>
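<p>For readers unfamiliar with the measure, the sketch below computes an assortativity coefficient for a categorical node attribute on an undirected, unweighted graph, following Newman's standard definition. It is an illustration under stated assumptions: the edge list and habitat labels are hypothetical, edge weights are ignored, and the function name is not taken from the original work.</p>

```python
from collections import defaultdict

def attribute_assortativity(edges, attr):
    """Newman's assortativity coefficient for a categorical node attribute."""
    e = defaultdict(float)  # e[(x, y)]: number of edge ends joining type x to type y
    for u, v in edges:
        e[(attr[u], attr[v])] += 1
        e[(attr[v], attr[u])] += 1  # undirected: count both directions
    total = sum(e.values())
    a = defaultdict(float)  # a[x]: fraction of edge ends attached to type x
    trace = 0.0             # fraction of edge ends joining nodes of the same type
    for (x, y), w in e.items():
        a[x] += w / total
        if x == y:
            trace += w / total
    expected = sum(f * f for f in a.values())
    # Undefined (division by zero) when every node has the same type.
    return (trace - expected) / (1 - expected)

# Hypothetical toy graph: node id to habitat label.
habitat = {1: "sea", 2: "sea", 3: "soil", 4: "soil"}
r_within = attribute_assortativity([(1, 2), (3, 4)], habitat)
r_across = attribute_assortativity([(1, 3), (2, 4)], habitat)
print(r_within, r_across)
```

<p>In the toy graph, edges restricted to same-habitat pairs give the maximum value and edges restricted to cross-habitat pairs give the minimum, which brackets the intermediate value of 0.157 reported above.</p>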
<p>From this we conclude that the source habitat of the different sequence samples is a key factor in determining their clustering within the different SSNs.</p>
<p>A force-directed layout of the network (
<xref ref-type="fig" rid="evw077-F3">fig. 3
<italic>a</italic>
</xref>
) reveals a clear separation between sea samples (in dark blue) and samples coming from other sources such as host (red), soil (yellow), waste waters (black), and air filters (light blue). Interestingly, inland-water samples (blue) appear to lie halfway between these two major clusters. As listed in
<xref ref-type="table" rid="evw077-T1">table 1</xref>
, metagenomes from inland water samples possess the highest betweenness values in the SSN in comparison to all the other sample sources, indicating that these nodes have a central position in the network and that, in turn, they serve as connectors among otherwise separated regions of the network (Mann–Whitney
<italic>U</italic>
test,
<italic>P</italic>
values in
<xref ref-type="table" rid="evw077-T1">table 1</xref>
). These results were confirmed by randomizations (edge replacement, see Network Analysis and Visualization) of the original graph (
<xref ref-type="table" rid="evw077-T1">table 1</xref>
) according to which inland water metagenomes, and (to a lesser extent) sea water metagenomes, have betweenness centrality values higher than expected by chance. Inland water metagenomes are also less prone to form clusters within the network, since they show, on average, the lowest clustering coefficient (Mann–Whitney
<italic>U</italic>
test,
<xref ref-type="table" rid="evw077-T1">table 1</xref>
). Inland water metagenomes also possess the highest closeness centrality values in the SSN (Mann–Whitney
<italic>U</italic>
test,
<xref ref-type="table" rid="evw077-T1">table 1</xref>
). This suggests that, in water, bacteria from different origins (human, animal, and environmental) may be able to mix, co-exist, and travel to an extent that is higher than in other ecological niches. This could give rise to exchange and shuffling of genes, genetic platforms, and genetic vectors (
<xref rid="evw077-B5" ref-type="bibr">Baquero et al. 2008</xref>
). This result confirms and extends previous findings on the horizontal flow of the plasmid encoded resistome (
<xref rid="evw077-B19" ref-type="bibr">Fondi and Fani 2010</xref>
).
<table-wrap id="evw077-T1" orientation="portrait" position="float">
<label>Table 1</label>
<caption>
<p>Centrality Measures in Relation to Sample Environmental Origin in Observed and Random Networks</p>
</caption>
<table frame="hsides" rules="groups">
<colgroup span="1">
<col align="left" valign="top" span="1"></col>
<col align="center" valign="top" span="1"></col>
<col align="center" valign="top" span="1"></col>
<col align="center" valign="top" span="1"></col>
<col align="center" valign="top" span="1"></col>
<col align="center" valign="top" span="1"></col>
<col align="center" valign="top" span="1"></col>
<col align="center" valign="top" span="1"></col>
<col align="center" valign="top" span="1"></col>
</colgroup>
<thead align="left">
<tr>
<th rowspan="1" colspan="1">Network Metric</th>
<th colspan="2" align="center" rowspan="1">Soil
<hr></hr>
</th>
<th colspan="2" align="center" rowspan="1">Sea
<hr></hr>
</th>
<th colspan="2" align="center" rowspan="1">Host
<hr></hr>
</th>
<th colspan="2" align="center" rowspan="1">Inland water
<hr></hr>
</th>
</tr>
<tr>
<th rowspan="1" colspan="1"></th>
<th rowspan="1" colspan="1">Real</th>
<th rowspan="1" colspan="1">Random</th>
<th rowspan="1" colspan="1">Real</th>
<th rowspan="1" colspan="1">Random</th>
<th rowspan="1" colspan="1">Real</th>
<th rowspan="1" colspan="1">Random</th>
<th rowspan="1" colspan="1">Real</th>
<th rowspan="1" colspan="1">Random</th>
</tr>
</thead>
<tbody align="left">
<tr>
<td rowspan="2" colspan="1">Betweenness</td>
<td align="left" rowspan="1" colspan="1">4.6</td>
<td rowspan="2" colspan="1">16.8(6.02)</td>
<td align="left" rowspan="1" colspan="1">52.46</td>
<td rowspan="2" colspan="1">42.20(4.05)</td>
<td align="left" rowspan="1" colspan="1">50.52</td>
<td rowspan="2" colspan="1">61.14(5.24)</td>
<td rowspan="2" colspan="1">102.67</td>
<td rowspan="2" colspan="1">63.76(6.9)</td>
</tr>
<tr>
<td rowspan="1" colspan="1">
<italic>P</italic>
= 2*10
<sup>−3</sup>
</td>
<td align="left" rowspan="1" colspan="1">
<italic>P</italic>
= 2*10
<sup>−2</sup>
</td>
<td align="left" rowspan="1" colspan="1">
<italic>P</italic>
= 2*10
<sup>−2</sup>
</td>
</tr>
<tr>
<td rowspan="2" colspan="1">Closeness</td>
<td align="left" rowspan="1" colspan="1">0.42</td>
<td rowspan="2" colspan="1">0.44(0.008)</td>
<td align="left" rowspan="1" colspan="1">0.46</td>
<td rowspan="2" colspan="1">0.48(0.07)</td>
<td align="left" rowspan="1" colspan="1">0.49</td>
<td rowspan="2" colspan="1">0.50(0.009)</td>
<td rowspan="2" colspan="1">0.50</td>
<td rowspan="2" colspan="1">0.48(0.01)</td>
</tr>
<tr>
<td rowspan="1" colspan="1">
<italic>P</italic>
= 1*10
<sup>−3</sup>
</td>
<td align="left" rowspan="1" colspan="1">
<italic>P</italic>
= 9*10
<sup>−3</sup>
</td>
<td align="left" rowspan="1" colspan="1">
<italic>P</italic>
= 4*10
<sup>−3</sup>
</td>
</tr>
<tr>
<td rowspan="2" colspan="1">Clustering c.</td>
<td align="left" rowspan="1" colspan="1">0.68</td>
<td rowspan="2" colspan="1">0.18(0.03)</td>
<td align="left" rowspan="1" colspan="1">0.6</td>
<td rowspan="2" colspan="1">0.24(0.03)</td>
<td align="left" rowspan="1" colspan="1">0.56</td>
<td rowspan="2" colspan="1">0.24(0.02)</td>
<td rowspan="2" colspan="1">0.39</td>
<td rowspan="2" colspan="1">0.20(0.03)</td>
</tr>
<tr>
<td rowspan="1" colspan="1">
<italic>P</italic>
= 1*10
<sup>−3</sup>
</td>
<td align="left" rowspan="1" colspan="1">
<italic>P</italic>
= 3*10
<sup>−1</sup>
</td>
<td align="left" rowspan="1" colspan="1">
<italic>P</italic>
= 2*10
<sup>−3</sup>
</td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<fn id="evw077-TF1">
<p>N
<sc>ote</sc>
.—Values in parentheses after randomized values indicate standard deviation. Values after real values for soil, host, and sea metagenomes indicate
<italic>P</italic>
values for comparisons to inland water samples (Mann–Whitney
<italic>U</italic>
test).</p>
</fn>
</table-wrap-foot>
</table-wrap>
</p>
<p>As shown in
<xref ref-type="fig" rid="evw077-F3">figure 3
<italic>a</italic>
</xref>
, nine metagenomes remained disconnected from the overall network. These metagenomes included five seawater samples, two soil samples, one host sample, and one inland water sample. Not surprisingly, these metagenomes embed fewer sequences than others present in the data set. Indeed, although it has been shown that the metagenome size has a negligible effect on the overall connectivity within the network (see
<ext-link ext-link-type="uri" xlink:href="http://gbe.oxfordjournals.org/lookup/suppl/doi:10.1093/gbe/evw077/-/DC1">supplementary material</ext-link>
S1,
<ext-link ext-link-type="uri" xlink:href="http://gbe.oxfordjournals.org/lookup/suppl/doi:10.1093/gbe/evw077/-/DC1">Supplementary Material</ext-link>
online, and Materials and Methods), some exceptions may still exist. These metagenomes are connected to the others at lower identity thresholds (data not shown).</p>
</sec>
<sec>
<title>HGT Networks</title>
<p>The extent of sequence sharing among the metagenomes can be partially explained by the overlapping taxonomical space of the different samples; indeed, similar habitats may tend to be colonized by the same major taxonomical groups. This latter observation is supported by the results obtained repeating the same analysis pipeline for marker genes retrieved in the studied metagenomic samples (
<ext-link ext-link-type="uri" xlink:href="http://gbe.oxfordjournals.org/lookup/suppl/doi:10.1093/gbe/evw077/-/DC1">supplementary fig S4</ext-link>
,
<ext-link ext-link-type="uri" xlink:href="http://gbe.oxfordjournals.org/lookup/suppl/doi:10.1093/gbe/evw077/-/DC1">Supplementary Material</ext-link>
online) and likely to have a reduced susceptibility to HGT. Nevertheless, the assembled data set gives us the opportunity to assess the relationships (if any) between physical proximity, ecological niche, and HGT. To this end, a second set of networks was constructed, accounting for putative HGT events among the analyzed sequence data sets. We identified putative HGTs as blocks of nearly identical DNA (≥500 nucleotides and ≥98% sequence identity) in otherwise distantly related contigs (i.e., contigs from different genera inferred by a composition-based, semisupervised, taxonomic binning algorithm). Since the method adopted for taxonomic binning of metagenome sequences is mainly suited to microbial sequences (
<xref rid="evw077-B35" ref-type="bibr">Nalbantoglu et al. 2011</xref>
), only prokaryote to prokaryote putative gene exchanges will be considered in the following sections. Importantly, trends in sequence sharing described below were observed also when a composition-oriented method (based on the evaluation of differences in tetranucleotide frequency distribution between two contigs, see Contig taxonomic annotation and source molecule identification) was used for the identification of (putative) HGT.</p>
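<p>The selection rule for putative HGT events can be sketched as a simple filter over pairwise alignment hits. Only the two thresholds (at least 500 nucleotides and at least 98% identity) and the different-genus requirement come from the text. The record layout, the genus labels, and the handling of contigs without a genus assignment are assumptions made for illustration.</p>

```python
from dataclasses import dataclass

@dataclass
class Hit:
    contig_a: str
    contig_b: str
    aln_length: int  # aligned block length in nucleotides
    identity: float  # percent identity of the aligned block

MIN_LENGTH = 500     # threshold stated in the text
MIN_IDENTITY = 98.0  # threshold stated in the text

def putative_hgt(hits, genus_of):
    """Keep long, nearly identical hits joining contigs binned to different genera."""
    kept = []
    for h in hits:
        ga, gb = genus_of.get(h.contig_a), genus_of.get(h.contig_b)
        if ga is None or gb is None:
            continue  # assumption: contigs without a genus assignment are skipped
        if h.aln_length >= MIN_LENGTH and h.identity >= MIN_IDENTITY and ga != gb:
            kept.append(h)
    return kept

# Hypothetical toy hits and taxonomic bins (contig c5 is left unbinned on purpose).
hits = [
    Hit("c1", "c2", 800, 99.1),   # long, nearly identical, different genera
    Hit("c1", "c3", 300, 99.9),   # too short
    Hit("c2", "c3", 900, 97.0),   # identity below threshold
    Hit("c1", "c4", 1200, 98.0),  # same genus
    Hit("c3", "c5", 700, 99.0),   # one contig unbinned
]
genus_of = {"c1": "GenusA", "c2": "GenusB", "c3": "GenusA", "c4": "GenusA"}
kept = putative_hgt(hits, genus_of)
print([(h.contig_a, h.contig_b) for h in kept])
```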
<p>The network of HGT among metagenomes is reported in
<xref ref-type="fig" rid="evw077-F3">figure 3
<italic>b</italic>
</xref>
, displaying a topology very similar to the network of gene sharing (
<xref ref-type="fig" rid="evw077-F3">fig. 3
<italic>a</italic>
</xref>
) although, as might be expected, possessing fewer links. The HGT network also proves that sequence sharing between metagenomes is not just due to overlapping taxonomical space. To further investigate the HGT network, we built a second type of network in which each node represents a single contig, whereas links account for (putative) HGT events. This network contains 34,555 nodes (contigs) and 34,398 edges (putative HGT events,
<ext-link ext-link-type="uri" xlink:href="http://gbe.oxfordjournals.org/lookup/suppl/doi:10.1093/gbe/evw077/-/DC1">supplementary material</ext-link>
S3,
<ext-link ext-link-type="uri" xlink:href="http://gbe.oxfordjournals.org/lookup/suppl/doi:10.1093/gbe/evw077/-/DC1">Supplementary Material</ext-link>
online) and can be divided into 8,017 connected components (CC), the great majority embedding only a few contigs (≤10). We identified 46 larger CCs, embedding 50 or more contigs. Functional annotation was missing for 38% of the genes involved in putative HGT events. Among those that were successfully annotated using the Pfam database, the two most represented functional categories were ABC transporters and transposase DDE domain. Considering the biological role of genes embedded into these categories (resistance to xenobiotics and horizontal transfer of genes), this finding highlights the dangerous implications of the horizontal flow of genes in the spreading of microbial resistance (and resistance to xenobiotics in general) in natural environments (
<xref rid="evw077-B5" ref-type="bibr">Baquero et al. 2008</xref>
;
<xref rid="evw077-B19" ref-type="bibr">Fondi and Fani 2010</xref>
). Two examples of this are provided below.</p>
<p>To investigate the influence of ecology in shaping the HGT network, we estimated whether each CC was either homogeneous or heterogeneous in terms of the habitat of the embedded contigs. Results shown in
<xref ref-type="fig" rid="evw077-F4">figure 4
<italic>a</italic>
</xref>
revealed that almost 90% of the CCs (6,814 CCs) contain contigs belonging to the same environment. Heterogeneous clusters are less frequent, although interesting exceptions do exist (see below). The observed distribution of homogeneous clusters was compared against the (averaged) distribution of the same measure from 1,000 networks, obtained through random label reshuffling (see Computational strategy for clusters identification and testing). The distinctness of the two distributions is shown in
<xref ref-type="fig" rid="evw077-F4">figure 4
<italic>a</italic>
</xref>
and was assessed by a Mann–Whitney
<italic>U</italic>
test (
<italic>P</italic>
value < 2.2e-16). A high number of interconnections inside each of the examined habitats (e.g., host–host and sea water–sea water) were observed for most of the samples (
<xref ref-type="fig" rid="evw077-F5">fig. 5</xref>
; see below), in agreement with overall samples clustering reported in
<xref ref-type="fig" rid="evw077-F4">figure 4
<italic>a</italic>
</xref>
and with previous findings concerning the possible presence of barriers or trends to HGT (
<xref rid="evw077-B37" ref-type="bibr">Popa and Dagan 2011</xref>
). According to this whole body of data, ecology seems to exert a broad influence on recent gene exchange in environmental samples. This is in agreement with the theory according to which ecological similarity shapes networks of gene exchange by selecting for the transfer and proliferation of adaptive traits or by increasing physical interactions between community members (
<xref rid="evw077-B3" ref-type="bibr">Aravind et al. 1998</xref>
;
<xref rid="evw077-B9" ref-type="bibr">Caro-Quintero et al. 2011</xref>
;
<xref rid="evw077-B46" ref-type="bibr">Smillie et al. 2011</xref>
). For example, strong geographical differentiation apparently caused by recent gene transfer among co-occurring bacteria was observed for
<italic>Vibrio</italic>
representatives (
<xref rid="evw077-B8" ref-type="bibr">Boucher et al. 2011</xref>
).
<fig id="evw077-F4" orientation="portrait" position="float">
<label>F
<sc>ig</sc>
. 4.—</label>
<caption>
<p>Composition of network clusters in terms of habitat and molecule categories. Values of 100% on the
<italic>X</italic>
axis indicate clusters with contigs belonging to the same category; conversely, lower values indicate more heterogeneous clusters (i.e., contigs belonging to different habitat or to different molecules). The cluster composition is shown for (
<italic>A</italic>
) habitat coherence and (
<italic>B</italic>
) molecule coherence (i.e., plasmid–plasmid and chromosome–chromosome).</p>
</caption>
<graphic xlink:href="evw077f4p"></graphic>
</fig>
<fig id="evw077-F5" orientation="portrait" position="float">
<label>F
<sc>ig</sc>
. 5.—</label>
<caption>
<p>Adjacency matrix showing the relationships among the different habitat types in the putative HGT events network. For each habitat, the proportion of connections of that habitat with all the other habitats has been computed. The proportion of connections connecting habitat A with habitat B (
<inline-formula id="IE3">
<mml:math id="IEQ3">
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mtext>PC</mml:mtext>
</mml:mrow>
<mml:mrow>
<mml:mi>A</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>B</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula>
) is given by this formula:
<inline-formula id="IE6">
<mml:math id="IEQ6">
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mtext>PC</mml:mtext>
</mml:mrow>
<mml:mrow>
<mml:mi>A</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>B</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>=</mml:mo>
<mml:mfrac>
<mml:mrow>
<mml:mtext>Weight</mml:mtext>
<mml:mo stretchy="false">(</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mtext>Edge</mml:mtext>
</mml:mrow>
<mml:mrow>
<mml:mi>A</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>B</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:msub>
<mml:mstyle mathsize="140%" displaystyle="true">
<mml:mo>∑</mml:mo>
</mml:mstyle>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:mtext>Weight</mml:mtext>
<mml:mo stretchy="false">(</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mtext>Edge</mml:mtext>
</mml:mrow>
<mml:mrow>
<mml:mtext>A</mml:mtext>
<mml:mo>,</mml:mo>
<mml:mtext>i</mml:mtext>
</mml:mrow>
</mml:msub>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:mfrac>
</mml:mrow>
</mml:math>
</inline-formula>
Since the denominator represents the amount of sequences in one of the two analyzed samples, this measure is specific to each of the analyzed environments and is not symmetric (
<inline-formula id="IE4">
<mml:math id="IEQ4">
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mtext>PC</mml:mtext>
</mml:mrow>
<mml:mrow>
<mml:mi>A</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>B</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>≠</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mtext>PC</mml:mtext>
</mml:mrow>
<mml:mrow>
<mml:mi>B</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>A</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula>
). Color gradient within the matrix refers to the proportion of connections of contigs from a given habitat with all the others from other habitats, with lighter tones representing less abundant interconnections among the corresponding habitats.</p>
</caption>
<graphic xlink:href="evw077f5p"></graphic>
</fig>
</p>
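<p>The habitat-coherence analysis of connected components can be sketched as below. The example splits a contig network into connected components with a union-find structure and scores each component by the share of contigs that belong to its most common habitat. That scoring rule is an assumption, since the text does not define how the percentage on the X axis of figure 4 was computed. The edge list and habitat labels are hypothetical.</p>

```python
from collections import Counter, defaultdict

def connected_components(edges):
    """Return connected components (as sets of nodes) using union-find."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for u, v in edges:
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
    groups = defaultdict(set)
    for node in list(parent):
        groups[find(node)].add(node)
    return list(groups.values())

def homogeneity(component, habitat_of):
    """Percentage of contigs in a component that share its most common habitat."""
    counts = Counter(habitat_of[c] for c in component)
    return 100.0 * counts.most_common(1)[0][1] / len(component)

# Hypothetical toy network: edges are putative HGT events between contigs.
edges = [("c1", "c2"), ("c2", "c3"), ("c4", "c5")]
habitat_of = {"c1": "host", "c2": "host", "c3": "soil", "c4": "sea", "c5": "sea"}

components = connected_components(edges)
scores = sorted(homogeneity(cc, habitat_of) for cc in components)
print(scores)
```

<p>The same scoring can be repeated after shuffling the habitat labels among contigs to obtain the null distribution against which the observed one is compared.</p>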
<p>An adjacency matrix was built to explore more thoroughly the interconnections that link sequences from different habitats and common patterns of gene exchange among samples retrieved from different ecosystems (
<xref ref-type="fig" rid="evw077-F5">fig. 5</xref>
). Two major clusters can be identified on the basis of the dendrogram topology (Clusters 1 and 2 in
<xref ref-type="fig" rid="evw077-F5">fig. 5</xref>
). Contigs embedded in each of these clusters have similar connections toward the other environments present in the HGT network. This suggests the presence of a common pool of genes in ecosystems embedded in these clusters. Cluster 1, for example, embeds Host, Sludge waste, and Air ecosystems. This particular clustering is supported by
<xref rid="evw077-B46" ref-type="bibr">Smillie et al. (2011)</xref>
and studies showing that fecal coliforms and other animal pathogens are indeed present in sludge waste samples (
<xref rid="evw077-B26" ref-type="bibr">Jones 1980</xref>
;
<xref rid="evw077-B12" ref-type="bibr">De Luca et al. 1998</xref>
;
<xref rid="evw077-B45" ref-type="bibr">Shanahan et al. 2010</xref>
) and that opportunistic pathogens commonly isolated from human-inhabited environments have been identified in airborne environments (
<xref rid="evw077-B55" ref-type="bibr">Tringe et al. 2008</xref>
). Also, the fact that activated sludge microbiomes are characterized by high microbial density and high levels of various HGT associated traits (e.g., AR-related genes and plasmids/integrons/transposons) (
<xref rid="evw077-B41" ref-type="bibr">Schluter et al. 2007</xref>
;
<xref rid="evw077-B57" ref-type="bibr">Zhang et al. 2011</xref>
) indirectly supports the observed clustering of sludge waste samples together with microbes from other (diverse) ecological niches (e.g., clinical environment). Similarly, Cluster 2 contains ecosystems that embed overlapping microbial communities (i.e., biotransformation, bioremediation, and soil environments) and thus show similar patterns of interconnections with microbes from other ecosystems.</p>
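<p>The proportion of connections used in the adjacency matrix of figure 5, that is, the weight of the edge between habitats A and B divided by the summed weight of all edges of habitat A, can be sketched as follows. The habitat names and weights are hypothetical toy values, and whether within-habitat edges enter the sum is left to the input, since the caption does not say.</p>

```python
from collections import defaultdict

def proportion_of_connections(weights):
    """weights[(A, B)] is the weight of the undirected edge between habitats A and B."""
    w = defaultdict(dict)
    for (a, b), value in weights.items():
        w[a][b] = value
        w[b][a] = value
    pc = {}
    for a, row in w.items():
        total = sum(row.values())  # summed weight of all edges touching habitat a
        for b, value in row.items():
            pc[(a, b)] = value / total
    return pc

# Hypothetical toy weights (number of putative HGT links between habitats).
weights = {("host", "sludge"): 30, ("host", "soil"): 10, ("sludge", "soil"): 20}
pc = proportion_of_connections(weights)
print(pc[("host", "sludge")], pc[("sludge", "host")])
```

<p>Because each row is normalized by the total for its own habitat, the two directions of the same pair generally differ, which is the asymmetry noted in the caption.</p>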
<p>Exceptions to ecologically homogeneous clusters can be highlighted within our data set. Two paradigmatic examples of cross-habitat putative HGT were chosen in the overall putative HGT network and are shown in
<xref ref-type="fig" rid="evw077-F6">figure 6</xref>
. In detail,
<xref ref-type="fig" rid="evw077-F6">figure 6a</xref>
reports putative HGTs among contigs embedding tetracycline resistance determinants (
<italic>tet</italic>
34) in samples isolated from host and inland waters. Tetracycline resistance is often associated with conjugative transposons or other transferable elements (e.g., pheromone-inducible plasmids) (
<xref rid="evw077-B10" ref-type="bibr">Clewell et al. 1995</xref>
;
<xref rid="evw077-B14" ref-type="bibr">Dunny et al. 1995</xref>
) and plasmid-mediated HGT events involving such determinants have been previously identified (
<xref rid="evw077-B19" ref-type="bibr">Fondi and Fani 2010</xref>
;
<xref rid="evw077-B7" ref-type="bibr">Bosi et al. 2011</xref>
). Similarly (
<xref ref-type="fig" rid="evw077-F6">fig. 6
<italic>b</italic>
</xref>
), contigs embedding chloramphenicol resistance determinants belong to samples of very different origin (soil and host). This latter finding shows possible pathways for cross-habitat chloramphenicol-resistance propagation in the environment and is in line with previous observations on swine feedlot wastewater as a possible source of chloramphenicol-resistance genes (
<xref rid="evw077-B29" ref-type="bibr">Li et al. 2013</xref>
) and the overall capability of this class of genes to undergo HGT (
<xref rid="evw077-B44" ref-type="bibr">Sermonti et al. 1978</xref>
;
<xref rid="evw077-B51" ref-type="bibr">Takamatsu et al. 2003</xref>
). Taken together, these two cases show that interhabitat barriers and taxonomic distance can be overcome by certain genes, since phylogenetically unrelated bacteria and those inhabiting distinct environments were found to share common antibiotic resistance determinants, probably as a result of (one or multiple) HGT event(s) (
<xref rid="evw077-B23" ref-type="bibr">Halary et al. 2010</xref>
;
<xref rid="evw077-B46" ref-type="bibr">Smillie et al. 2011</xref>
).
<fig id="evw077-F6" orientation="portrait" position="float">
<label>F
<sc>ig</sc>
. 6.—</label>
<caption>
<p>Examples of putative cross-habitat HGT events among contigs (nodes) embedding (
<italic>A</italic>
) tetracycline resistance determinants and retrieved from inland waters (blue nodes) and host (red nodes) and (
<italic>B</italic>
) chloramphenicol resistance in host (red) and soil (yellow) derived samples.</p>
</caption>
<graphic xlink:href="evw077f6p"></graphic>
</fig>
</p>
<p>The network-based approach adopted here allows testing the role of plasmids and chromosomes in the overall gene exchange pattern within environmental samples. Indeed, the importance of plasmids and chromosomes in shaping the microbial HGT network has been assessed in recent works (
<xref rid="evw077-B22" ref-type="bibr">Halary et al. 2010</xref>
;
<xref rid="evw077-B46" ref-type="bibr">Smillie et al. 2011</xref>
).
<xref rid="evw077-B22" ref-type="bibr">Halary et al. (2010)</xref>
showed that gene sharing mostly occurs among molecules of the same type (molecule coherence), meaning that plasmid–plasmid and chromosome–chromosome gene sharing is more frequent than cross-molecule sharing. Accordingly, we investigated whether contigs embedded in the same CC belonged to the same or different molecules (i.e., plasmids or chromosomes). Contig sequences were assigned to their source molecule adopting a composition-based strategy as implemented in cBar (
<xref rid="evw077-B58" ref-type="bibr">Zhou and Xu 2010</xref>
) and the source molecule composition of each cluster was evaluated. Results reported in
<xref ref-type="fig" rid="evw077-F4">figure 4b</xref>
show an overall coherence within the CCs identified in the network. In particular, 5,199 CCs (∼65% of all the CCs) are highly homogeneous: more than 90% of the embedded contigs belong to the same type of DNA molecule. Conversely, heterogeneous clusters (those in which contigs are almost evenly distributed among the two types of molecules) represent 24.3% of the total number of clusters. Again, the observed distribution of homogeneous clusters was compared against the same (averaged) distribution obtained from 1,000 networks, obtained through label reshuffling (red line in
<xref ref-type="fig" rid="evw077-F4">fig. 4
<italic>b</italic>
</xref>
). The distinctness of the two distributions was assessed by a Mann–Whitney
<italic>U</italic>
test (
<italic>P</italic>
value < 2.2e-16). This finding indicates that DNA pools are mainly transferred between molecules of the same type.</p>
<p>Notably, general trends (i.e., molecule and habitat coherence) among the various clusters were not affected by the method used for estimating the number of HGT events as adopting a composition-based (i.e., tetranucleotide frequencies, see Materials and Methods) approach led to the same overall results (data not shown).</p>
</sec>
</sec>
<sec sec-type="conclusions">
<title>Conclusions</title>
<p>By adopting a similarity network approach on a comprehensive set of environmental sequences, we revealed the absence of an overall distance effect in the level of sequence sharing among microbial samples; even distant microbial communities may share more homologous sequences than geographically closer DNA pools. Metagenome gene composition is therefore strongly affected by ecology. Interestingly, inland water samples occupy a “bridge-like” position in the overall metagenome network (
<xref ref-type="fig" rid="evw077-F3">fig. 3
<italic>a</italic>
</xref>
). Hence, despite maintaining their own (specific) gene pool as assessed by clustering analyses, these samples connect microbial communities that otherwise would remain disconnected (e.g., host and seawater samples). This is in agreement with previous findings on the horizontal flow of plasmid genes (
<xref rid="evw077-B19" ref-type="bibr">Fondi and Fani 2010</xref>
) and speculations on the role of aquatic environments in the spreading of AR-related determinants (
<xref rid="evw077-B5" ref-type="bibr">Baquero et al. 2008</xref>
). These trends were confirmed when the SSN was converted into a putative HGT network by maintaining only those connections linking very similar sequences (identity ≥ 98%) in distantly related microorganisms (i.e., belonging to different genera). Ecology strongly influences the network of HGT in microbes even when samples not strictly related to humans are considered, as has also been preliminarily observed in terrestrial and aquatic environments (
<xref rid="evw077-B25" ref-type="bibr">Hooper et al. 2008</xref>
). Moreover, HGT events mainly involve molecules of the same kind (i.e., either plasmids or chromosomes) with promiscuous gene exchange being less frequent.</p>
<p>Our work shows the possible use of SSN for studying patterns in microbial ecology and also lays foundations for integrating such networks with other environmental parameters (e.g., temperature, pH, pressure, and physical barriers) and evaluating their effect on the structure of the gene sharing and HGT networks. In addition, our findings provide support for the Baas Becking hypothesis (formulated in 1934), suggesting that it also applies to genes, besides the microbes for which it was originally formulated. Overlapping microbial gene pools are likely to be found in widely geographically disparate environments, and tighter associations are observed among gene pools from similar habitats. This holds true regardless of microbial evolutionary lineages (i.e., their common evolutionary history), since we have shown that the same patterns of common gene pools remain when only genes likely shared by means of HGT events are maintained in the network. This suggests that which organism transcribes and translates a gene matters less than where that organism is located, indicating that at least some genes act as public goods (
<xref rid="evw077-B34" ref-type="bibr">McInerney et al. 2011</xref>
). Accordingly, they are available for all organisms to integrate into their genomes, although the kind of ecological niches occupied and the type of informative molecules harboring them might impose some constraints on the overall possibility of gene pools to undergo HGT. Finally, besides drafting an overall scheme of pathways for the global distribution of gene pools, the results presented here provide important biological insights into the spreading of antibiotic-resistance-related genes across multiple hosts and habitats.</p>
</sec>
<sec sec-type="supplementary-material">
<title>Supplementary Material</title>
<supplementary-material id="PMC_1" content-type="local-data">
<caption>
<title>Supplementary Data</title>
</caption>
<media mimetype="text" mime-subtype="html" xlink:href="supp_8_5_1388__index.html"></media>
<media xlink:role="associated-file" mimetype="application" mime-subtype="x-zip-compressed" xlink:href="supp_evw077_suppl_data.zip"></media>
</supplementary-material>
</sec>
</body>
<back>
<ack>
<title>Acknowledgments</title>
<p>This work was supported by a grant under the Science Foundation Ireland Incoming STTF Programme (09/RFP/EOB2510 - ISTTF 1) and a grant from Academy of Finland.</p>
</ack>
<sec sec-type="materials">
<title>Supplementary Material</title>
<p>
<ext-link ext-link-type="uri" xlink:href="http://gbe.oxfordjournals.org/lookup/suppl/doi:10.1093/gbe/evw077/-/DC1">Supplementary materials</ext-link>
S1–S3 are available at
<italic>Genome Biology and Evolution</italic>
online (
<ext-link ext-link-type="uri" xlink:href="http://www.gbe.oxfordjournals.org/">http://www.gbe.oxfordjournals.org/</ext-link>
).</p>
</sec>
<ref-list>
<title>Literature Cited</title>
<ref id="evw077-B1">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Altschul</surname>
<given-names>SF</given-names>
</name>
</person-group>
,
<etal></etal>
<year>1997</year>
<article-title>Gapped BLAST and PSI-BLAST: a new generation of protein database search programs</article-title>
.
<source>Nucleic Acids Res.</source>
<volume>25</volume>
:
<fpage>3389</fpage>
<lpage>3402</lpage>
.
<pub-id pub-id-type="pmid">9254694</pub-id>
</mixed-citation>
</ref>
<ref id="evw077-B2">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Alvarez-Ponce</surname>
<given-names>D</given-names>
</name>
<name>
<surname>Lopez</surname>
<given-names>P</given-names>
</name>
<name>
<surname>Bapteste</surname>
<given-names>E</given-names>
</name>
<name>
<surname>McInerney</surname>
<given-names>JO.</given-names>
</name>
</person-group>
<year>2013</year>
<article-title>Gene similarity networks provide tools for understanding eukaryote origins and evolution</article-title>
.
<source>Proc Natl Acad Sci U S A.</source>
<volume>110</volume>
:
<fpage>E1594</fpage>
<lpage>E1603</lpage>
. doi: 10.1073/pnas.1211371110
<pub-id pub-id-type="pmid">23576716</pub-id>
</mixed-citation>
</ref>
<ref id="evw077-B3">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Aravind</surname>
<given-names>L</given-names>
</name>
<name>
<surname>Tatusov</surname>
<given-names>RL</given-names>
</name>
<name>
<surname>Wolf</surname>
<given-names>YI</given-names>
</name>
<name>
<surname>Walker</surname>
<given-names>DR</given-names>
</name>
<name>
<surname>Koonin</surname>
<given-names>EV.</given-names>
</name>
</person-group>
<year>1998</year>
<article-title>Evidence for massive gene exchange between archaeal and bacterial hyperthermophiles</article-title>
.
<source>Trends Genet.</source>
<volume>14</volume>
:
<fpage>442</fpage>
<lpage>444</lpage>
.
<pub-id pub-id-type="pmid">9825671</pub-id>
</mixed-citation>
</ref>
<ref id="evw077-B4">
<mixed-citation publication-type="book">
<person-group person-group-type="author">
<name>
<surname>Baas Becking</surname>
<given-names>LGM.</given-names>
</name>
</person-group>
<year>1934</year>
<source>Geobiologie of inleiding tot de milieukunde</source>
.
<publisher-name>Den Haag [The Netherlands] : W.P. Van Stockum &amp; Zoon N.V.</publisher-name>
, </mixed-citation>
</ref>
<ref id="evw077-B5">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Baquero</surname>
<given-names>F</given-names>
</name>
<name>
<surname>Martinez</surname>
<given-names>JL</given-names>
</name>
<name>
<surname>Canton</surname>
<given-names>R.</given-names>
</name>
</person-group>
<year>2008</year>
<article-title>Antibiotics and antibiotic resistance in water environments</article-title>
.
<source>Curr Opin Biotechnol</source>
.
<volume>19</volume>
:
<fpage>260</fpage>
<lpage>265</lpage>
. pii: S0958-1669(08)00059-1
<pub-id pub-id-type="pmid">18534838</pub-id>
</mixed-citation>
</ref>
<ref id="evw077-B6">
<mixed-citation publication-type="other">
<person-group person-group-type="author">
<name>
<surname>Bastian</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Heymann</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Jacomy</surname>
<given-names>M.</given-names>
</name>
</person-group>
<year>2009</year>
. Gephi: an open source software for exploring and manipulating networks. International AAAI Conference on Weblogs and Social Media.</mixed-citation>
</ref>
<ref id="evw077-B7">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Bosi</surname>
<given-names>E</given-names>
</name>
<name>
<surname>Fani</surname>
<given-names>R</given-names>
</name>
<name>
<surname>Fondi</surname>
<given-names>M.</given-names>
</name>
</person-group>
<year>2011</year>
<article-title>The mosaicism of plasmids revealed by atypical genes detection and analysis</article-title>
.
<source>BMC Genomics</source>
<volume>12</volume>
:
<fpage>403.</fpage>
doi: 10.1186/1471-2164-12-403
<pub-id pub-id-type="pmid">21824433</pub-id>
</mixed-citation>
</ref>
<ref id="evw077-B8">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Boucher</surname>
<given-names>Y</given-names>
</name>
</person-group>
,
<etal></etal>
<year>2011</year>
<article-title>Local mobile gene pools rapidly cross species boundaries to create endemicity within global
<italic>Vibrio cholerae</italic>
populations</article-title>
.
<source>MBio</source>
<volume>2</volume>
. doi: 10.1128/mBio.00335-10</mixed-citation>
</ref>
<ref id="evw077-B9">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Caro-Quintero</surname>
<given-names>A</given-names>
</name>
</person-group>
,
<etal></etal>
<year>2011</year>
<article-title>Unprecedented levels of horizontal gene transfer among spatially co-occurring Shewanella bacteria from the Baltic Sea</article-title>
.
<source>ISME J.</source>
<volume>5</volume>
:
<fpage>131</fpage>
<lpage>140</lpage>
. doi: 10.1038/ismej.2010.93
<pub-id pub-id-type="pmid">20596068</pub-id>
</mixed-citation>
</ref>
<ref id="evw077-B10">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Clewell</surname>
<given-names>DB</given-names>
</name>
<name>
<surname>Flannagan</surname>
<given-names>SE</given-names>
</name>
<name>
<surname>Jaworski</surname>
<given-names>DD.</given-names>
</name>
</person-group>
<year>1995</year>
<article-title>Unconstrained bacterial promiscuity: the Tn916-Tn1545 family of conjugative transposons</article-title>
.
<source>Trends Microbiol.</source>
<volume>3</volume>
:
<fpage>229</fpage>
<lpage>236</lpage>
.
<pub-id pub-id-type="pmid">7648031</pub-id>
</mixed-citation>
</ref>
<ref id="evw077-B11">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Dagan</surname>
<given-names>T.</given-names>
</name>
</person-group>
<year>2011</year>
<article-title>Phylogenomic networks</article-title>
.
<source>Trends Microbiol.</source>
<volume>19</volume>
:
<fpage>483</fpage>
<lpage>491</lpage>
. doi: 10.1016/j.tim.2011.07.001
<pub-id pub-id-type="pmid">21820313</pub-id>
</mixed-citation>
</ref>
<ref id="evw077-B12">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>De Luca</surname>
<given-names>G</given-names>
</name>
<name>
<surname>Zanetti</surname>
<given-names>F</given-names>
</name>
<name>
<surname>Fateh-Moghadm</surname>
<given-names>P</given-names>
</name>
<name>
<surname>Stampi</surname>
<given-names>S.</given-names>
</name>
</person-group>
<year>1998</year>
<article-title>Occurrence of Listeria monocytogenes in sewage sludge</article-title>
.
<source>Zentralbl Hyg Umweltmed</source>
<volume>201</volume>
:
<fpage>269</fpage>
<lpage>277</lpage>
.
<pub-id pub-id-type="pmid">9789361</pub-id>
</mixed-citation>
</ref>
<ref id="evw077-B13">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Dishaw</surname>
<given-names>LJ</given-names>
</name>
</person-group>
,
<etal></etal>
<year>2014</year>
<article-title>The gut of geographically disparate Ciona intestinalis harbors a core microbiota</article-title>
.
<source>PLoS One</source>
<volume>9</volume>
:
<fpage>e93386.</fpage>
doi: 10.1371/journal.pone.0093386
<pub-id pub-id-type="pmid">24695540</pub-id>
</mixed-citation>
</ref>
<ref id="evw077-B14">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Dunny</surname>
<given-names>GM</given-names>
</name>
<name>
<surname>Leonard</surname>
<given-names>BA</given-names>
</name>
<name>
<surname>Hedberg</surname>
<given-names>PJ.</given-names>
</name>
</person-group>
<year>1995</year>
<article-title>Pheromone-inducible conjugation in
<italic>Enterococcus faecalis</italic>
: interbacterial and host-parasite chemical communication</article-title>
.
<source>J Bacteriol</source>
.
<volume>177</volume>
:
<fpage>871</fpage>
<lpage>876</lpage>
.
<pub-id pub-id-type="pmid">7860595</pub-id>
</mixed-citation>
</ref>
<ref id="evw077-B15">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Edgar</surname>
<given-names>RC.</given-names>
</name>
</person-group>
<year>2004a</year>
<article-title>MUSCLE: a multiple sequence alignment method with reduced time and space complexity</article-title>
.
<source>BMC Bioinformatics</source>
<volume>5</volume>
:
<fpage>113.</fpage>
doi: 10.1186/1471-2105-5-113
<pub-id pub-id-type="pmid">15318951</pub-id>
</mixed-citation>
</ref>
<ref id="evw077-B16">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Edgar</surname>
<given-names>RC.</given-names>
</name>
</person-group>
<year>2004b</year>
<article-title>MUSCLE: multiple sequence alignment with high accuracy and high throughput</article-title>
.
<source>Nucleic Acids Res.</source>
<volume>32</volume>
:
<fpage>1792</fpage>
<lpage>1797</lpage>
. doi: 10.1093/nar/gkh340
<pub-id pub-id-type="pmid">15034147</pub-id>
</mixed-citation>
</ref>
<ref id="evw077-B17">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Finn</surname>
<given-names>RD</given-names>
</name>
</person-group>
,
<etal></etal>
<year>2014</year>
<article-title>Pfam: the protein families database</article-title>
.
<source>Nucleic Acids Res.</source>
<volume>42</volume>
:
<fpage>D222</fpage>
<lpage>D230</lpage>
. doi: 10.1093/nar/gkt1223
<pub-id pub-id-type="pmid">24288371</pub-id>
</mixed-citation>
</ref>
<ref id="evw077-B18">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Finn</surname>
<given-names>RD</given-names>
</name>
<name>
<surname>Clements</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Eddy</surname>
<given-names>SR.</given-names>
</name>
</person-group>
<year>2011</year>
<article-title>HMMER web server: interactive sequence similarity searching</article-title>
.
<source>Nucleic Acids Res.</source>
<volume>39</volume>
:
<fpage>W29</fpage>
<lpage>W37</lpage>
. doi: 10.1093/nar/gkr367
<pub-id pub-id-type="pmid">21593126</pub-id>
</mixed-citation>
</ref>
<ref id="evw077-B19">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Fondi</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Fani</surname>
<given-names>R.</given-names>
</name>
</person-group>
<year>2010</year>
<article-title>The horizontal flow of the plasmid resistome: clues from inter-generic similarity networks</article-title>
.
<source>Environ Microbiol.</source>
<volume>12</volume>
:
<fpage>3228</fpage>
<lpage>3242</lpage>
. doi: 10.1111/j.1462-2920.2010.02295.x
<pub-id pub-id-type="pmid">20636373</pub-id>
</mixed-citation>
</ref>
<ref id="evw077-B20">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Forster</surname>
<given-names>D</given-names>
</name>
</person-group>
,
<etal></etal>
<year>2015</year>
<article-title>Testing ecological theories with sequence similarity networks: marine ciliates exhibit similar geographic dispersal patterns as multicellular organisms</article-title>
.
<source>BMC Biol.</source>
<volume>13</volume>
:
<fpage>16.</fpage>
doi: 10.1186/s12915-015-0125-5
<pub-id pub-id-type="pmid">25762112</pub-id>
</mixed-citation>
</ref>
<ref id="evw077-B21">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Fuhrman</surname>
<given-names>JA.</given-names>
</name>
</person-group>
<year>2009</year>
<article-title>Microbial community structure and its functional implications</article-title>
.
<source>Nature</source>
<volume>459</volume>
:
<fpage>193</fpage>
<lpage>199</lpage>
. doi: 10.1038/nature08058
<pub-id pub-id-type="pmid">19444205</pub-id>
</mixed-citation>
</ref>
<ref id="evw077-B22">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Halary</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Leigh</surname>
<given-names>JW</given-names>
</name>
<name>
<surname>Cheaib</surname>
<given-names>B</given-names>
</name>
<name>
<surname>Lopez</surname>
<given-names>P</given-names>
</name>
<name>
<surname>Bapteste</surname>
<given-names>E.</given-names>
</name>
</person-group>
<year>2010</year>
<article-title>Network analyses structure genetic diversity in independent genetic worlds</article-title>
.
<source>Proc Natl Acad Sci U S A.</source>
<volume>107</volume>
:
<fpage>127</fpage>
<lpage>132</lpage>
. doi: 10.1073/pnas.0908978107
<pub-id pub-id-type="pmid">20007769</pub-id>
</mixed-citation>
</ref>
<ref id="evw077-B23">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Hanson</surname>
<given-names>CA</given-names>
</name>
<name>
<surname>Fuhrman</surname>
<given-names>JA</given-names>
</name>
<name>
<surname>Horner-Devine</surname>
<given-names>MC</given-names>
</name>
<name>
<surname>Martiny</surname>
<given-names>JB.</given-names>
</name>
</person-group>
<year>2012</year>
<article-title>Beyond biogeographic patterns: processes shaping the microbial landscape</article-title>
.
<source>Nat Rev Microbiol.</source>
<volume>10</volume>
:
<fpage>497</fpage>
<lpage>506</lpage>
. doi: 10.1038/nrmicro2795
<pub-id pub-id-type="pmid">22580365</pub-id>
</mixed-citation>
</ref>
<ref id="evw077-B24">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Hooper</surname>
<given-names>SD</given-names>
</name>
<name>
<surname>Mavromatis</surname>
<given-names>K</given-names>
</name>
<name>
<surname>Kyrpides</surname>
<given-names>NC.</given-names>
</name>
</person-group>
<year>2009</year>
<article-title>Microbial co-habitation and lateral gene transfer: what transposases can tell us</article-title>
.
<source>Genome Biol.</source>
<volume>10</volume>
:
<fpage>R45.</fpage>
doi: 10.1186/gb-2009-10-4-r45
<pub-id pub-id-type="pmid">19393086</pub-id>
</mixed-citation>
</ref>
<ref id="evw077-B25">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Hooper</surname>
<given-names>SD</given-names>
</name>
</person-group>
,
<etal></etal>
<year>2008</year>
<article-title>A molecular study of microbe transfer between distant environments</article-title>
.
<source>PLoS One</source>
<volume>3</volume>
:
<fpage>e2607.</fpage>
doi: 10.1371/journal.pone.0002607
<pub-id pub-id-type="pmid">18612393</pub-id>
</mixed-citation>
</ref>
<ref id="evw077-B26">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Jones</surname>
<given-names>PW.</given-names>
</name>
</person-group>
<year>1980</year>
<article-title>Health hazards associated with the handling of animal wastes</article-title>
.
<source>Vet Rec</source>
.
<volume>106</volume>
:
<fpage>4</fpage>
<lpage>7</lpage>
.
<pub-id pub-id-type="pmid">7355557</pub-id>
</mixed-citation>
</ref>
<ref id="evw077-B27">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Jukes</surname>
<given-names>TH</given-names>
</name>
<name>
<surname>Cantor</surname>
<given-names>CR.</given-names>
</name>
</person-group>
<year>1969</year>
<article-title>Evolution of protein molecules</article-title>
.
<source>Mamm Protein Metab</source>
.
<volume>3</volume>
:
<fpage>21</fpage>
<lpage>132</lpage>
.</mixed-citation>
</ref>
<ref id="evw077-B28">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Kohl</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Wiese</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Warscheid</surname>
<given-names>B.</given-names>
</name>
</person-group>
<year>2011</year>
<article-title>Cytoscape: software for visualization and analysis of biological networks</article-title>
.
<source>Methods Mol Biol.</source>
<volume>696</volume>
:
<fpage>291</fpage>
<lpage>303</lpage>
. doi: 10.1007/978-1-60761-987-1_18
<pub-id pub-id-type="pmid">21063955</pub-id>
</mixed-citation>
</ref>
<ref id="evw077-B29">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Li</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Shao</surname>
<given-names>B</given-names>
</name>
<name>
<surname>Shen</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Wang</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Wu</surname>
<given-names>Y.</given-names>
</name>
</person-group>
<year>2013</year>
<article-title>Occurrence of chloramphenicol-resistance genes as environmental pollutants from swine feedlots</article-title>
.
<source>Environ Sci Technol</source>
.
<volume>47</volume>
:
<fpage>2892</fpage>
<lpage>2897</lpage>
. doi: 10.1021/es304616c
<pub-id pub-id-type="pmid">23419160</pub-id>
</mixed-citation>
</ref>
<ref id="evw077-B30">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Lima-Mendez</surname>
<given-names>G</given-names>
</name>
<name>
<surname>Van Helden</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Toussaint</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Leplae</surname>
<given-names>R.</given-names>
</name>
</person-group>
<year>2008</year>
<article-title>Reticulate representation of evolutionary and functional relationships between phage genomes</article-title>
.
<source>Mol Biol Evol.</source>
<volume>25</volume>
:
<fpage>762</fpage>
<lpage>777</lpage>
. doi: 10.1093/molbev/msn023
<pub-id pub-id-type="pmid">18234706</pub-id>
</mixed-citation>
</ref>
<ref id="evw077-B31">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Liu</surname>
<given-names>B</given-names>
</name>
<name>
<surname>Pop</surname>
<given-names>M.</given-names>
</name>
</person-group>
<year>2009</year>
<article-title>ARDB–antibiotic resistance genes database</article-title>
.
<source>Nucleic Acids Res.</source>
<volume>37</volume>
:
<fpage>D443</fpage>
<lpage>D447</lpage>
. doi: 10.1093/nar/gkn656
<pub-id pub-id-type="pmid">18832362</pub-id>
</mixed-citation>
</ref>
<ref id="evw077-B32">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Machado</surname>
<given-names>M</given-names>
</name>
</person-group>
,
<etal></etal>
<year>2011</year>
<article-title>Phred-Phrap package to analyses tools: a pipeline to facilitate population genetics re-sequencing studies</article-title>
.
<source>Investig Genet.</source>
<volume>2</volume>
:
<fpage>3.</fpage>
doi: 10.1186/2041-2223-2-3</mixed-citation>
</ref>
<ref id="evw077-B33">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Martiny</surname>
<given-names>JB</given-names>
</name>
</person-group>
,
<etal></etal>
<year>2006</year>
<article-title>Microbial biogeography: putting microorganisms on the map</article-title>
.
<source>Nat Rev Microbiol.</source>
<volume>4</volume>
:
<fpage>102</fpage>
<lpage>112</lpage>
. doi: 10.1038/nrmicro1341
<pub-id pub-id-type="pmid">16415926</pub-id>
</mixed-citation>
</ref>
<ref id="evw077-B34">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>McInerney</surname>
<given-names>JO</given-names>
</name>
<name>
<surname>Pisani</surname>
<given-names>D</given-names>
</name>
<name>
<surname>Bapteste</surname>
<given-names>E</given-names>
</name>
<name>
<surname>O'Connell</surname>
<given-names>MJ.</given-names>
</name>
</person-group>
<year>2011</year>
<article-title>The public goods hypothesis for the evolution of life on earth</article-title>
.
<source>Biol Direct.</source>
<volume>6</volume>
:
<fpage>41.</fpage>
doi: 10.1186/1745-6150-6-41
<pub-id pub-id-type="pmid">21861918</pub-id>
</mixed-citation>
</ref>
<ref id="evw077-B35">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Nalbantoglu</surname>
<given-names>OU</given-names>
</name>
<name>
<surname>Way</surname>
<given-names>SF</given-names>
</name>
<name>
<surname>Hinrichs</surname>
<given-names>SH</given-names>
</name>
<name>
<surname>Sayood</surname>
<given-names>K.</given-names>
</name>
</person-group>
<year>2011</year>
<article-title>RAIphy: phylogenetic classification of metagenomics samples using iterative refinement of relative abundance index profiles</article-title>
.
<source>BMC Bioinformatics</source>
<volume>12</volume>
:
<fpage>41.</fpage>
doi: 10.1186/1471-2105-12-41
<pub-id pub-id-type="pmid">21281493</pub-id>
</mixed-citation>
</ref>
<ref id="evw077-B36">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Newman</surname>
<given-names>ME.</given-names>
</name>
</person-group>
<year>2003</year>
<article-title>Mixing patterns in networks</article-title>
.
<source>Phys Rev E Stat Nonlin Soft Matter Phys</source>
.
<volume>67</volume>
:
<fpage>026126.</fpage>
doi: 10.1103/PhysRevE.67.026126
<pub-id pub-id-type="pmid">12636767</pub-id>
</mixed-citation>
</ref>
<ref id="evw077-B37">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Popa</surname>
<given-names>O</given-names>
</name>
<name>
<surname>Dagan</surname>
<given-names>T.</given-names>
</name>
</person-group>
<year>2011</year>
<article-title>Trends and barriers to lateral gene transfer in prokaryotes</article-title>
.
<source>Curr Opin Microbiol.</source>
<volume>14</volume>
:
<fpage>615</fpage>
<lpage>623</lpage>
. doi: 10.1016/j.mib.2011.07.027
<pub-id pub-id-type="pmid">21856213</pub-id>
</mixed-citation>
</ref>
<ref id="evw077-B38">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Raes</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Letunic</surname>
<given-names>I</given-names>
</name>
<name>
<surname>Yamada</surname>
<given-names>T</given-names>
</name>
<name>
<surname>Jensen</surname>
<given-names>LJ</given-names>
</name>
<name>
<surname>Bork</surname>
<given-names>P.</given-names>
</name>
</person-group>
<year>2011</year>
<article-title>Toward molecular trait-based ecology through integration of biogeochemical, geographical and metagenomic data</article-title>
.
<source>Mol Syst Biol.</source>
<volume>7</volume>
:
<fpage>473.</fpage>
doi: 10.1038/msb.2011.6
<pub-id pub-id-type="pmid">21407210</pub-id>
</mixed-citation>
</ref>
<ref id="evw077-B39">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Reno</surname>
<given-names>ML</given-names>
</name>
<name>
<surname>Held</surname>
<given-names>NL</given-names>
</name>
<name>
<surname>Fields</surname>
<given-names>CJ</given-names>
</name>
<name>
<surname>Burke</surname>
<given-names>PV</given-names>
</name>
<name>
<surname>Whitaker</surname>
<given-names>RJ.</given-names>
</name>
</person-group>
<year>2009</year>
<article-title>Biogeography of the
<italic>Sulfolobus islandicus</italic>
pan-genome</article-title>
.
<source>Proc Natl Acad Sci U S A.</source>
<volume>106</volume>
:
<fpage>8605</fpage>
<lpage>8610</lpage>
. doi: 10.1073/pnas.0808945106
<pub-id pub-id-type="pmid">19435847</pub-id>
</mixed-citation>
</ref>
<ref id="evw077-B40">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Rice</surname>
<given-names>P</given-names>
</name>
<name>
<surname>Longden</surname>
<given-names>I</given-names>
</name>
<name>
<surname>Bleasby</surname>
<given-names>A.</given-names>
</name>
</person-group>
<year>2000</year>
<article-title>EMBOSS: the European molecular biology open software suite</article-title>
.
<source>Trends Genet.</source>
<volume>16</volume>
:
<fpage>276</fpage>
<lpage>277</lpage>
.
<pub-id pub-id-type="pmid">10827456</pub-id>
</mixed-citation>
</ref>
<ref id="evw077-B41">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Schluter</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Szczepanowski</surname>
<given-names>R</given-names>
</name>
<name>
<surname>Puhler</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Top</surname>
<given-names>EM.</given-names>
</name>
</person-group>
<year>2007</year>
<article-title>Genomics of IncP-1 antibiotic resistance plasmids isolated from wastewater treatment plants provides evidence for a widely accessible drug resistance gene pool</article-title>
.
<source>FEMS Microbiol Rev</source>
.
<volume>31</volume>
:
<fpage>449</fpage>
<lpage>477</lpage>
. pii: FMR074
<pub-id pub-id-type="pmid">17553065</pub-id>
</mixed-citation>
</ref>
<ref id="evw077-B42">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Schmieder</surname>
<given-names>R</given-names>
</name>
<name>
<surname>Edwards</surname>
<given-names>R.</given-names>
</name>
</person-group>
<year>2011</year>
<article-title>Quality control and preprocessing of metagenomic datasets</article-title>
.
<source>Bioinformatics</source>
<volume>27</volume>
:
<fpage>863</fpage>
<lpage>864</lpage>
. doi: 10.1093/bioinformatics/btr026
<pub-id pub-id-type="pmid">21278185</pub-id>
</mixed-citation>
</ref>
<ref id="evw077-B43">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Schmieder</surname>
<given-names>R</given-names>
</name>
<name>
<surname>Lim</surname>
<given-names>YW</given-names>
</name>
<name>
<surname>Edwards</surname>
<given-names>R.</given-names>
</name>
</person-group>
<year>2012</year>
<article-title>Identification and removal of ribosomal RNA sequences from metatranscriptomes</article-title>
.
<source>Bioinformatics</source>
<volume>28</volume>
:
<fpage>433</fpage>
<lpage>435</lpage>
. doi: 10.1093/bioinformatics/btr669
<pub-id pub-id-type="pmid">22155869</pub-id>
</mixed-citation>
</ref>
<ref id="evw077-B44">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Sermonti</surname>
<given-names>G</given-names>
</name>
<name>
<surname>Petris</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Micheli</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Lanfaloni</surname>
<given-names>L.</given-names>
</name>
</person-group>
<year>1978</year>
<article-title>Chloramphenicol resistance in Streptomyces coelicolor A3(2): possible involvement of a transposable element</article-title>
.
<source>Mol Gen Genet.</source>
<volume>164</volume>
:
<fpage>99</fpage>
<lpage>103</lpage>
.
<pub-id pub-id-type="pmid">703761</pub-id>
</mixed-citation>
</ref>
<ref id="evw077-B45">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Shanahan</surname>
<given-names>EF</given-names>
</name>
</person-group>
,
<etal></etal>
<year>2010</year>
<article-title>Evaluation of pathogen removal in a solar sludge drying facility using microbial indicators</article-title>
.
<source>Int J Environ Res Public Health</source>
.
<volume>7</volume>
:
<fpage>565</fpage>
<lpage>582</lpage>
. doi: 10.3390/ijerph7020565
<pub-id pub-id-type="pmid">20616991</pub-id>
</mixed-citation>
</ref>
<ref id="evw077-B46">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Smillie</surname>
<given-names>CS</given-names>
</name>
</person-group>
,
<etal></etal>
<year>2011</year>
<article-title>Ecology drives a global network of gene exchange connecting the human microbiome</article-title>
.
<source>Nature</source>
<volume>480</volume>
:
<fpage>241</fpage>
<lpage>244</lpage>
. doi: 10.1038/nature10571
<pub-id pub-id-type="pmid">22037308</pub-id>
</mixed-citation>
</ref>
<ref id="evw077-B47">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Staley</surname>
<given-names>JT</given-names>
</name>
<name>
<surname>Konopka</surname>
<given-names>A.</given-names>
</name>
</person-group>
<year>1985</year>
<article-title>Measurement of in situ activities of nonphotosynthetic microorganisms in aquatic and terrestrial habitats</article-title>
.
<source>Annu Rev Microbiol.</source>
<volume>39</volume>
:
<fpage>321</fpage>
<lpage>346</lpage>
. doi: 10.1146/annurev.mi.39.100185.001541
<pub-id pub-id-type="pmid">3904603</pub-id>
</mixed-citation>
</ref>
<ref id="evw077-B48">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Sul</surname>
<given-names>WJ</given-names>
</name>
<name>
<surname>Oliver</surname>
<given-names>TA</given-names>
</name>
<name>
<surname>Ducklow</surname>
<given-names>HW</given-names>
</name>
<name>
<surname>Amaral-Zettler</surname>
<given-names>LA</given-names>
</name>
<name>
<surname>Sogin</surname>
<given-names>ML.</given-names>
</name>
</person-group>
<year>2013</year>
<article-title>Marine bacteria exhibit a bipolar distribution</article-title>
.
<source>Proc Natl Acad Sci U S A.</source>
<volume>110</volume>
:
<fpage>2342</fpage>
<lpage>2347</lpage>
. doi: 10.1073/pnas.1212424110
<pub-id pub-id-type="pmid">23324742</pub-id>
</mixed-citation>
</ref>
<ref id="evw077-B49">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Sun</surname>
<given-names>S</given-names>
</name>
</person-group>
,
<etal></etal>
<year>2011</year>
<article-title>Community cyberinfrastructure for advanced microbial ecology research and analysis: the CAMERA resource</article-title>
.
<source>Nucleic Acids Res.</source>
<volume>39</volume>
:
<fpage>D546</fpage>
<lpage>D551</lpage>
. doi: 10.1093/nar/gkq1102
<pub-id pub-id-type="pmid">21045053</pub-id>
</mixed-citation>
</ref>
<ref id="evw077-B50">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Sunagawa</surname>
<given-names>S</given-names>
</name>
</person-group>
,
<etal></etal>
<year>2013</year>
<article-title>Metagenomic species profiling using universal phylogenetic marker genes</article-title>
.
<source>Nat Methods</source>
.
<volume>10</volume>
:
<fpage>1196</fpage>
<lpage>1199</lpage>
. doi: 10.1038/nmeth.2693
<pub-id pub-id-type="pmid">24141494</pub-id>
</mixed-citation>
</ref>
<ref id="evw077-B51">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Takamatsu</surname>
<given-names>D</given-names>
</name>
<name>
<surname>Osaki</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Sekizaki</surname>
<given-names>T.</given-names>
</name>
</person-group>
<year>2003</year>
<article-title>Chloramphenicol resistance transposable element TnSs1 of
<italic>Streptococcus suis</italic>
, a transposon flanked by IS6-family elements</article-title>
.
<source>Plasmid</source>
<volume>49</volume>
:
<fpage>143</fpage>
<lpage>151</lpage>
.
<pub-id pub-id-type="pmid">12726767</pub-id>
</mixed-citation>
</ref>
<ref id="evw077-B52">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Talavera</surname>
<given-names>G</given-names>
</name>
<name>
<surname>Castresana</surname>
<given-names>J.</given-names>
</name>
</person-group>
<year>2007</year>
<article-title>Improvement of phylogenies after removing divergent and ambiguously aligned blocks from protein sequence alignments</article-title>
.
<source>Syst Biol.</source>
<volume>56</volume>
:
<fpage>564</fpage>
<lpage>577</lpage>
. doi: 10.1080/10635150701472164
<pub-id pub-id-type="pmid">17654362</pub-id>
</mixed-citation>
</ref>
<ref id="evw077-B53">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Tamminen</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Virta</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Fani</surname>
<given-names>R</given-names>
</name>
<name>
<surname>Fondi</surname>
<given-names>M.</given-names>
</name>
</person-group>
<year>2012</year>
<article-title>Large-scale analysis of plasmid relationships through gene-sharing networks</article-title>
.
<source>Mol Biol Evol.</source>
<volume>29</volume>
:
<fpage>1225</fpage>
<lpage>1240</lpage>
. doi: 10.1093/molbev/msr292
<pub-id pub-id-type="pmid">22130968</pub-id>
</mixed-citation>
</ref>
<ref id="evw077-B54">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Teeling</surname>
<given-names>H</given-names>
</name>
<name>
<surname>Meyerdierks</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Bauer</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Amann</surname>
<given-names>R</given-names>
</name>
<name>
<surname>Glöckner</surname>
<given-names>FO.</given-names>
</name>
</person-group>
<year>2004</year>
<article-title>Application of tetranucleotide frequencies for the assignment of genomic fragments</article-title>
.
<source>Environ Microbiol.</source>
<volume>6</volume>
:
<fpage>938</fpage>
<lpage>947</lpage>
. doi: 10.1111/j.1462-2920.2004.00624.x
<pub-id pub-id-type="pmid">15305919</pub-id>
</mixed-citation>
</ref>
<ref id="evw077-B55">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Tringe</surname>
<given-names>SG</given-names>
</name>
</person-group>
,
<etal></etal>
<year>2008</year>
<article-title>The airborne metagenome in an indoor urban environment</article-title>
.
<source>PLoS One</source>
<volume>3</volume>
:
<fpage>e1862</fpage>
. doi: 10.1371/journal.pone.0001862
<pub-id pub-id-type="pmid">18382653</pub-id>
</mixed-citation>
</ref>
<ref id="evw077-B56">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>van Dongen</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Abreu-Goodger</surname>
<given-names>C.</given-names>
</name>
</person-group>
<year>2012</year>
<article-title>Using MCL to extract clusters from networks</article-title>
.
<source>Methods Mol Biol.</source>
<volume>804</volume>
:
<fpage>14</fpage>
. doi: 10.1007/978-1-61779-361-5_15</mixed-citation>
</ref>
<ref id="evw077-B57">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Zhang</surname>
<given-names>T</given-names>
</name>
<name>
<surname>Zhang</surname>
<given-names>XX</given-names>
</name>
<name>
<surname>Ye</surname>
<given-names>L.</given-names>
</name>
</person-group>
<year>2011</year>
<article-title>Plasmid metagenome reveals high levels of antibiotic resistance genes and mobile genetic elements in activated sludge</article-title>
.
<source>PLoS One</source>
<volume>6</volume>
:
<fpage>e26041</fpage>
. doi: 10.1371/journal.pone.0026041
<pub-id pub-id-type="pmid">22016806</pub-id>
</mixed-citation>
</ref>
<ref id="evw077-B58">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Zhou</surname>
<given-names>F</given-names>
</name>
<name>
<surname>Xu</surname>
<given-names>Y.</given-names>
</name>
</person-group>
<year>2010</year>
<article-title>cBar: a computer program to distinguish plasmid-derived from chromosome-derived sequence fragments in metagenomics data</article-title>
.
<source>Bioinformatics</source>
<volume>26</volume>
:
<fpage>2051</fpage>
<lpage>2052</lpage>
. doi: 10.1093/bioinformatics/btq299
<pub-id pub-id-type="pmid">20538725</pub-id>
</mixed-citation>
</ref>
</ref-list>
</back>
</pmc>
</record>

To work with this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Ticri/CIDE/explor/CyberinfraV1/Data/Pmc/Corpus
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 0001109 | SxmlIndent | more

Or

HfdSelect -h $EXPLOR_AREA/Data/Pmc/Corpus/biblio.hfd -nk 0001109 | SxmlIndent | more

To link to this page from within the Wicri network

{{Explor lien
   |wiki=    Ticri/CIDE
   |area=    CyberinfraV1
   |flux=    Pmc
   |étape=   Corpus
   |type=    RBID
   |clé=     
   |texte=   
}}

This area was generated with Dilib version V0.6.25.
Data generation: Thu Oct 27 09:30:58 2016. Site generation: Sun Mar 10 23:08:40 2024