MERS Exploration Server

Avoiding Regions Symptomatic of Conformational and Functional Flexibility to Identify Antiviral Targets in Current and Future Coronaviruses

Internal identifier: 000C99 (Pmc/Corpus); previous: 000C98; next: 000D00

Authors: Jordon Rahaman; Jessica Siltberg-Liberles

Source:

RBID : PMC:5203785

Abstract

Within the last 15 years, two related coronaviruses (Severe Acute Respiratory Syndrome [SARS]-CoV and Middle East Respiratory Syndrome [MERS]-CoV) expanded their host range to include humans, with increased virulence in their new host. Coronaviruses were recently found to have little intrinsic disorder compared with many other virus families. Because intrinsically disordered regions have been proposed to be important for rewiring interactions between virus and host, we investigated the conservation of intrinsic disorder and secondary structure in coronaviruses in an evolutionary context. We found that regions of intrinsic disorder are rarely conserved among different coronavirus protein families, with the primary exception of the nucleocapsid. Also, secondary structure predictions are conserved across only 50–80% of sites for most protein families, implying that 20–50% of sites do not have a conserved secondary structure prediction. Furthermore, nonconserved structure sites are significantly less constrained in sequence divergence than either sites conserved in secondary structure or sites conserved in loop. When regions symptomatic of conformational flexibility, such as disordered sites and sites with nonconserved secondary structure, are avoided in order to identify potential broad-specificity antiviral targets, only one sequence motif (five residues or longer) remains from the >10,000 starting sites across all coronaviruses in this study. The identified sequence motif is found within nonstructural protein (NSP) 12 and constitutes an antiviral target potentially effective against present-day and future coronaviruses. On shorter evolutionary timescales, the SARS and MERS clades have more sequence motifs fulfilling the criteria applied. Interestingly, many motifs map to NSP12, making this a prime target for coronavirus antivirals.
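
The filtering idea described in the abstract can be illustrated with a small Python sketch. It is a toy illustration under stated assumptions, not the authors' pipeline: the function name `conserved_motifs`, the input layout (an alignment plus per-residue disorder flags and secondary structure strings), and the conservation rule (identical residue and identical predicted state in every sequence) are all hypothetical simplifications chosen for this example. Only the minimum motif length of five residues is taken from the text.

```python
from typing import List, Tuple


def conserved_motifs(
    alignment: List[str],
    disordered: List[List[bool]],
    sec_struct: List[str],
    min_len: int = 5,
) -> List[Tuple[int, str]]:
    """Return (start_column, motif) for every run of at least `min_len`
    alignment columns that passes all three filters.

    Assumed (hypothetical) filters, applied per column:
      * the residue is identical in every sequence and is not a gap,
      * the predicted secondary structure state is identical in every sequence,
      * no sequence is predicted to be disordered at that column.
    """
    n_cols = len(alignment[0])
    keep = []
    for col in range(n_cols):
        residues = {seq[col] for seq in alignment}
        states = {ss[col] for ss in sec_struct}
        same_residue = len(residues) == 1 and "-" not in residues
        same_structure = len(states) == 1
        ordered = not any(row[col] for row in disordered)
        keep.append(same_residue and same_structure and ordered)

    motifs = []
    start = None
    # The trailing False is a sentinel that closes a run ending at the last column.
    for col, ok in enumerate(keep + [False]):
        if ok and start is None:
            start = col
        elif not ok and start is not None:
            if col - start >= min_len:
                motifs.append((start, alignment[0][start:col]))
            start = None
    return motifs


# Toy input: three aligned sequences of ten columns each (made-up data).
toy_alignment = ["MKTAYSDDGQ", "MKTAYSDEGQ", "MKTAYSDDGQ"]
toy_sec_struct = ["HHHHHHHCCC", "HHHHHHHCCC", "HHHHHHECCC"]
toy_disordered = [[False] * 10, [True] + [False] * 9, [False] * 10]

print(conserved_motifs(toy_alignment, toy_disordered, toy_sec_struct))
```

In a real analysis the disorder flags and secondary structure states would come from prediction tools, and conservation would more plausibly be judged with an evolutionary rate model than with strict identity; the sketch only shows how the three filters combine before contiguous runs are extracted.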


Url:
DOI: 10.1093/gbe/evw246
PubMed: 27797946
PubMed Central: 5203785

Links to Exploration step

PMC:5203785

The document in XML format

<record>
<TEI>
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en">Avoiding Regions Symptomatic of Conformational and Functional Flexibility to Identify Antiviral Targets in Current and Future Coronaviruses</title>
<author>
<name sortKey="Rahaman, Jordon" sort="Rahaman, Jordon" uniqKey="Rahaman J" first="Jordon" last="Rahaman">Jordon Rahaman</name>
<affiliation>
<nlm:aff id="evw246-aff1">Department of Biological Sciences, Florida International University, Miami, FL</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Siltberg Liberles, Jessica" sort="Siltberg Liberles, Jessica" uniqKey="Siltberg Liberles J" first="Jessica" last="Siltberg-Liberles">Jessica Siltberg-Liberles</name>
<affiliation>
<nlm:aff id="evw246-aff1">Department of Biological Sciences, Florida International University, Miami, FL</nlm:aff>
</affiliation>
<affiliation>
<nlm:aff id="evw246-aff2">Department of Biological Sciences, Biomolecular Sciences Institute, Florida International University, Miami, FL</nlm:aff>
</affiliation>
</author>
</titleStmt>
<publicationStmt>
<idno type="wicri:source">PMC</idno>
<idno type="pmid">27797946</idno>
<idno type="pmc">5203785</idno>
<idno type="url">http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5203785</idno>
<idno type="RBID">PMC:5203785</idno>
<idno type="doi">10.1093/gbe/evw246</idno>
<date when="2016">2016</date>
<idno type="wicri:Area/Pmc/Corpus">000C99</idno>
<idno type="wicri:explorRef" wicri:stream="Pmc" wicri:step="Corpus" wicri:corpus="PMC">000C99</idno>
</publicationStmt>
<sourceDesc>
<biblStruct>
<analytic>
<title xml:lang="en" level="a" type="main">Avoiding Regions Symptomatic of Conformational and Functional Flexibility to Identify Antiviral Targets in Current and Future Coronaviruses</title>
<author>
<name sortKey="Rahaman, Jordon" sort="Rahaman, Jordon" uniqKey="Rahaman J" first="Jordon" last="Rahaman">Jordon Rahaman</name>
<affiliation>
<nlm:aff id="evw246-aff1">Department of Biological Sciences, Florida International University, Miami, FL</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Siltberg Liberles, Jessica" sort="Siltberg Liberles, Jessica" uniqKey="Siltberg Liberles J" first="Jessica" last="Siltberg-Liberles">Jessica Siltberg-Liberles</name>
<affiliation>
<nlm:aff id="evw246-aff1">Department of Biological Sciences, Florida International University, Miami, FL</nlm:aff>
</affiliation>
<affiliation>
<nlm:aff id="evw246-aff2">Department of Biological Sciences, Biomolecular Sciences Institute, Florida International University, Miami, FL</nlm:aff>
</affiliation>
</author>
</analytic>
<series>
<title level="j">Genome Biology and Evolution</title>
<idno type="eISSN">1759-6653</idno>
<imprint>
<date when="2016">2016</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
</fileDesc>
<profileDesc>
<textClass></textClass>
</profileDesc>
</teiHeader>
<front>
<div type="abstract" xml:lang="en">
<p>Within the last 15 years, two related coronaviruses (Severe Acute Respiratory Syndrome [SARS]-CoV and Middle East Respiratory Syndrome [MERS]-CoV) expanded their host range to include humans, with increased virulence in their new host. Coronaviruses were recently found to have little intrinsic disorder compared with many other virus families. Because intrinsically disordered regions have been proposed to be important for rewiring interactions between virus and host, we investigated the conservation of intrinsic disorder and secondary structure in coronaviruses in an evolutionary context. We found that regions of intrinsic disorder are rarely conserved among different coronavirus protein families, with the primary exception of the nucleocapsid. Also, secondary structure predictions are conserved across only 50–80% of sites for most protein families, implying that 20–50% of sites do not have a conserved secondary structure prediction. Furthermore, nonconserved structure sites are significantly less constrained in sequence divergence than either sites conserved in secondary structure or sites conserved in loop. When regions symptomatic of conformational flexibility, such as disordered sites and sites with nonconserved secondary structure, are avoided in order to identify potential broad-specificity antiviral targets, only one sequence motif (five residues or longer) remains from the >10,000 starting sites across all coronaviruses in this study. The identified sequence motif is found within nonstructural protein (NSP) 12 and constitutes an antiviral target potentially effective against present-day and future coronaviruses. On shorter evolutionary timescales, the SARS and MERS clades have more sequence motifs fulfilling the criteria applied. Interestingly, many motifs map to NSP12, making this a prime target for coronavirus antivirals.</p>
</div>
</front>
<back>
<div1 type="bibliography">
<listBibl>
<biblStruct>
<analytic>
<author>
<name sortKey="Anderson, Lj" uniqKey="Anderson L">LJ Anderson</name>
</author>
<author>
<name sortKey="Tong, S" uniqKey="Tong S">S. Tong</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Berman, Hm" uniqKey="Berman H">HM. Berman</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Bermingham, A" uniqKey="Bermingham A">A Bermingham</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Bornholdt, Za" uniqKey="Bornholdt Z">ZA Bornholdt</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Le Breton, M" uniqKey="Le Breton M">M Le Breton</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Bryson, K" uniqKey="Bryson K">K Bryson</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Campen, A" uniqKey="Campen A">A Campen</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Cohen, O" uniqKey="Cohen O">O Cohen</name>
</author>
<author>
<name sortKey="Ashkenazy, H" uniqKey="Ashkenazy H">H Ashkenazy</name>
</author>
<author>
<name sortKey="Belinky, F" uniqKey="Belinky F">F Belinky</name>
</author>
<author>
<name sortKey="Huchon, D" uniqKey="Huchon D">D Huchon</name>
</author>
<author>
<name sortKey="Pupko, T" uniqKey="Pupko T">T. Pupko</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Cohen, O" uniqKey="Cohen O">O Cohen</name>
</author>
<author>
<name sortKey="Pupko, T" uniqKey="Pupko T">T. Pupko</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="De Groot, Rj" uniqKey="De Groot R">RJ de Groot</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Dosztanyi, Z" uniqKey="Dosztanyi Z">Z Dosztányi</name>
</author>
<author>
<name sortKey="Csizmok, V" uniqKey="Csizmok V">V Csizmok</name>
</author>
<author>
<name sortKey="Tompa, P" uniqKey="Tompa P">P Tompa</name>
</author>
<author>
<name sortKey="Simon, I" uniqKey="Simon I">I. Simon</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Dosztanyi, Z" uniqKey="Dosztanyi Z">Z Dosztányi</name>
</author>
<author>
<name sortKey="Csizmok, V" uniqKey="Csizmok V">V Csizmók</name>
</author>
<author>
<name sortKey="Tompa, P" uniqKey="Tompa P">P Tompa</name>
</author>
<author>
<name sortKey="Simon, I" uniqKey="Simon I">I. Simon</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Drozdetskiy, A" uniqKey="Drozdetskiy A">A Drozdetskiy</name>
</author>
<author>
<name sortKey="Cole, C" uniqKey="Cole C">C Cole</name>
</author>
<author>
<name sortKey="Procter, J" uniqKey="Procter J">J Procter</name>
</author>
<author>
<name sortKey="Barton, Gj" uniqKey="Barton G">GJ. Barton</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Fehr, Ar" uniqKey="Fehr A">AR Fehr</name>
</author>
<author>
<name sortKey="Perlman, S" uniqKey="Perlman S">S. Perlman</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Flipse, J" uniqKey="Flipse J">J Flipse</name>
</author>
<author>
<name sortKey="Smit, Jm" uniqKey="Smit J">JM. Smit</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Giles, Bm" uniqKey="Giles B">BM Giles</name>
</author>
<author>
<name sortKey="Ross, Tm" uniqKey="Ross T">TM. Ross</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Giles, Bm" uniqKey="Giles B">BM Giles</name>
</author>
<author>
<name sortKey="Ross, Tm" uniqKey="Ross T">TM. Ross</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Gitlin, L" uniqKey="Gitlin L">L Gitlin</name>
</author>
<author>
<name sortKey="Hagai, T" uniqKey="Hagai T">T Hagai</name>
</author>
<author>
<name sortKey="Labarbera, A" uniqKey="Labarbera A">A LaBarbera</name>
</author>
<author>
<name sortKey="Solovey, M" uniqKey="Solovey M">M Solovey</name>
</author>
<author>
<name sortKey="Andino, R" uniqKey="Andino R">R. Andino</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Gralinski, Le" uniqKey="Gralinski L">LE Gralinski</name>
</author>
<author>
<name sortKey="Baric, Rs" uniqKey="Baric R">RS. Baric</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Huelsenbeck, Jp" uniqKey="Huelsenbeck J">JP Huelsenbeck</name>
</author>
<author>
<name sortKey="Ronquist, F" uniqKey="Ronquist F">F. Ronquist</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Huerta Cepas, J" uniqKey="Huerta Cepas J">J Huerta-Cepas</name>
</author>
<author>
<name sortKey="Serra, F" uniqKey="Serra F">F Serra</name>
</author>
<author>
<name sortKey="Bork, P" uniqKey="Bork P">P. Bork</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Hunter, Jd" uniqKey="Hunter J">JD. Hunter</name>
</author>
</analytic>
</biblStruct>
<biblStruct></biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Jones, Dt" uniqKey="Jones D">DT Jones</name>
</author>
<author>
<name sortKey="Taylor, Wr" uniqKey="Taylor W">WR Taylor</name>
</author>
<author>
<name sortKey="Thornton, Jm" uniqKey="Thornton J">JM. Thornton</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Katoh, K" uniqKey="Katoh K">K Katoh</name>
</author>
<author>
<name sortKey="Misawa, K" uniqKey="Misawa K">K Misawa</name>
</author>
<author>
<name sortKey="Kuma, K" uniqKey="Kuma K">K Kuma</name>
</author>
<author>
<name sortKey="Miyata, T" uniqKey="Miyata T">T. Miyata</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Kesturu, Gs" uniqKey="Kesturu G">GS Kesturu</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Lehmann, Kc" uniqKey="Lehmann K">KC Lehmann</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Li, Q" uniqKey="Li Q">Q Li</name>
</author>
<author>
<name sortKey="Dahl, Db" uniqKey="Dahl D">DB Dahl</name>
</author>
<author>
<name sortKey="Vannucci, M" uniqKey="Vannucci M">M Vannucci</name>
</author>
<author>
<name sortKey="Hyun, J" uniqKey="Hyun J">J Hyun</name>
</author>
<author>
<name sortKey="Tsai, Jw" uniqKey="Tsai J">JW. Tsai</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Lu, G" uniqKey="Lu G">G Lu</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Ma, Y" uniqKey="Ma Y">Y Ma</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Mayrose, I" uniqKey="Mayrose I">I Mayrose</name>
</author>
<author>
<name sortKey="Graur, D" uniqKey="Graur D">D Graur</name>
</author>
<author>
<name sortKey="Ben Tal, N" uniqKey="Ben Tal N">N Ben-Tal</name>
</author>
<author>
<name sortKey="Pupko, T" uniqKey="Pupko T">T. Pupko</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Mccloskey, Rm" uniqKey="Mccloskey R">RM McCloskey</name>
</author>
<author>
<name sortKey="Liang, Rh" uniqKey="Liang R">RH Liang</name>
</author>
<author>
<name sortKey="Harrigan, Pr" uniqKey="Harrigan P">PR Harrigan</name>
</author>
<author>
<name sortKey="Brumme, Zl" uniqKey="Brumme Z">ZL Brumme</name>
</author>
<author>
<name sortKey="Poon, Afy" uniqKey="Poon A">AFY. Poon</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Mcguffin, Lj" uniqKey="Mcguffin L">LJ McGuffin</name>
</author>
<author>
<name sortKey="Bryson, K" uniqKey="Bryson K">K Bryson</name>
</author>
<author>
<name sortKey="Jones, Dt" uniqKey="Jones D">DT. Jones</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Mokili, Jl" uniqKey="Mokili J">JL Mokili</name>
</author>
<author>
<name sortKey="Rohwer, F" uniqKey="Rohwer F">F Rohwer</name>
</author>
<author>
<name sortKey="Dutilh, Be" uniqKey="Dutilh B">BE. Dutilh</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Ortiz, Jf" uniqKey="Ortiz J">JF Ortiz</name>
</author>
<author>
<name sortKey="Macdonald, Ml" uniqKey="Macdonald M">ML MacDonald</name>
</author>
<author>
<name sortKey="Masterson, P" uniqKey="Masterson P">P Masterson</name>
</author>
<author>
<name sortKey="Uversky, Vn" uniqKey="Uversky V">VN Uversky</name>
</author>
<author>
<name sortKey="Siltberg Liberles, J" uniqKey="Siltberg Liberles J">J. Siltberg-Liberles</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Pushker, R" uniqKey="Pushker R">R Pushker</name>
</author>
<author>
<name sortKey="Mooney, C" uniqKey="Mooney C">C Mooney</name>
</author>
<author>
<name sortKey="Davey, Ne" uniqKey="Davey N">NE Davey</name>
</author>
<author>
<name sortKey="Jacque, J M" uniqKey="Jacque J">J-M Jacqué</name>
</author>
<author>
<name sortKey="Shields, Dc" uniqKey="Shields D">DC. Shields</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Reusken, Cb" uniqKey="Reusken C">CB Reusken</name>
</author>
<author>
<name sortKey="Raj, Vs" uniqKey="Raj V">VS Raj</name>
</author>
<author>
<name sortKey="Koopmans, Mp" uniqKey="Koopmans M">MP Koopmans</name>
</author>
<author>
<name sortKey="Haagmans, Bl" uniqKey="Haagmans B">BL. Haagmans</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Ronquist, F" uniqKey="Ronquist F">F Ronquist</name>
</author>
<author>
<name sortKey="Huelsenbeck, Jp" uniqKey="Huelsenbeck J">JP. Huelsenbeck</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Rosario, K" uniqKey="Rosario K">K Rosario</name>
</author>
<author>
<name sortKey="Breitbart, M" uniqKey="Breitbart M">M. Breitbart</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Siltberg Liberles, J" uniqKey="Siltberg Liberles J">J Siltberg-Liberles</name>
</author>
<author>
<name sortKey="Grahnen, Ja" uniqKey="Grahnen J">JA Grahnen</name>
</author>
<author>
<name sortKey="Liberles, Da" uniqKey="Liberles D">DA. Liberles</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Slabinski, L" uniqKey="Slabinski L">L Slabinski</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Song, H D" uniqKey="Song H">H-D Song</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Su, S" uniqKey="Su S">S Su</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Van Boheemen, S" uniqKey="Van Boheemen S">S van Boheemen</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Van Der Hoek, L" uniqKey="Van Der Hoek L">L. van der Hoek</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Ward, Jj" uniqKey="Ward J">JJ Ward</name>
</author>
<author>
<name sortKey="Mcguffin, Lj" uniqKey="Mcguffin L">LJ McGuffin</name>
</author>
<author>
<name sortKey="Bryson, K" uniqKey="Bryson K">K Bryson</name>
</author>
<author>
<name sortKey="Buxton, Bf" uniqKey="Buxton B">BF Buxton</name>
</author>
<author>
<name sortKey="Jones, Dt" uniqKey="Jones D">DT. Jones</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Woo, Pc" uniqKey="Woo P">PC Woo</name>
</author>
<author>
<name sortKey="Lau, Sk" uniqKey="Lau S">SK Lau</name>
</author>
<author>
<name sortKey="Li, Ks" uniqKey="Li K">KS Li</name>
</author>
<author>
<name sortKey="Tsang, Ak" uniqKey="Tsang A">AK Tsang</name>
</author>
<author>
<name sortKey="Yuen, K Y" uniqKey="Yuen K">K-Y. Yuen</name>
</author>
</analytic>
</biblStruct>
<biblStruct></biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Xue, B" uniqKey="Xue B">B Xue</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Yang, H" uniqKey="Yang H">H Yang</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Yu, C" uniqKey="Yu C">C Yu</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Zhang, Z" uniqKey="Zhang Z">Z Zhang</name>
</author>
<author>
<name sortKey="Shen, L" uniqKey="Shen L">L Shen</name>
</author>
<author>
<name sortKey="Gu, X" uniqKey="Gu X">X. Gu</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Zheng, A" uniqKey="Zheng A">A Zheng</name>
</author>
<author>
<name sortKey="Yuan, F" uniqKey="Yuan F">F Yuan</name>
</author>
<author>
<name sortKey="Kleinfelter, Lm" uniqKey="Kleinfelter L">LM Kleinfelter</name>
</author>
<author>
<name sortKey="Kielian, M" uniqKey="Kielian M">M. Kielian</name>
</author>
</analytic>
</biblStruct>
</listBibl>
</div1>
</back>
</TEI>
<pmc article-type="research-article">
<pmc-dir>properties open_access</pmc-dir>
<front>
<journal-meta>
<journal-id journal-id-type="nlm-ta">Genome Biol Evol</journal-id>
<journal-id journal-id-type="iso-abbrev">Genome Biol Evol</journal-id>
<journal-id journal-id-type="publisher-id">gbe</journal-id>
<journal-id journal-id-type="hwp">gbe</journal-id>
<journal-title-group>
<journal-title>Genome Biology and Evolution</journal-title>
</journal-title-group>
<issn pub-type="epub">1759-6653</issn>
<publisher>
<publisher-name>Oxford University Press</publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="pmid">27797946</article-id>
<article-id pub-id-type="pmc">5203785</article-id>
<article-id pub-id-type="doi">10.1093/gbe/evw246</article-id>
<article-id pub-id-type="publisher-id">evw246</article-id>
<article-categories>
<subj-group subj-group-type="heading">
<subject>Research Article</subject>
</subj-group>
</article-categories>
<title-group>
<article-title>Avoiding Regions Symptomatic of Conformational and Functional Flexibility to Identify Antiviral Targets in Current and Future Coronaviruses</article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author">
<name>
<surname>Rahaman</surname>
<given-names>Jordon</given-names>
</name>
<xref ref-type="aff" rid="evw246-aff1">
<sup>1</sup>
</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Siltberg-Liberles</surname>
<given-names>Jessica</given-names>
</name>
<xref ref-type="aff" rid="evw246-aff1">
<sup>1</sup>
</xref>
<xref ref-type="aff" rid="evw246-aff2">
<sup>2</sup>
</xref>
<xref ref-type="corresp" rid="evw246-cor1">*</xref>
</contrib>
<aff id="evw246-aff1">
<label>1</label>
Department of Biological Sciences, Florida International University, Miami, FL</aff>
<aff id="evw246-aff2">
<label>2</label>
Department of Biological Sciences, Biomolecular Sciences Institute, Florida International University, Miami, FL</aff>
</contrib-group>
<author-notes>
<fn id="evw246-FM1">
<p>
<bold>Associate editor</bold>
: Dr. Chantal Abergel</p>
</fn>
<corresp id="evw246-cor1">
<label>*</label>
Corresponding author: E-mail:
<email>jliberle@fiu.edu</email>
.</corresp>
</author-notes>
<pub-date pub-type="collection">
<month>11</month>
<year>2016</year>
</pub-date>
<pub-date pub-type="epub">
<day>09</day>
<month>11</month>
<year>2016</year>
</pub-date>
<pub-date pub-type="pmc-release">
<day>09</day>
<month>11</month>
<year>2016</year>
</pub-date>
<pmc-comment> PMC Release delay is 0 months and 0 days and was based on the . </pmc-comment>
<volume>8</volume>
<issue>11</issue>
<fpage>3471</fpage>
<lpage>3484</lpage>
<history>
<date date-type="accepted">
<day>03</day>
<month>10</month>
<year>2016</year>
</date>
</history>
<permissions>
<copyright-statement>© The Author(s) 2016. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.</copyright-statement>
<copyright-year>2016</copyright-year>
<license xlink:href="http://creativecommons.org/licenses/by-nc/4.0/" license-type="creative-commons">
<license-p>This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (
<ext-link ext-link-type="uri" xlink:href="http://creativecommons.org/licenses/by-nc/4.0/">http://creativecommons.org/licenses/by-nc/4.0/</ext-link>
), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com</license-p>
</license>
</permissions>
<abstract>
<p>Within the last 15 years, two related coronaviruses (Severe Acute Respiratory Syndrome [SARS]-CoV and Middle East Respiratory Syndrome [MERS]-CoV) expanded their host range to include humans, with increased virulence in their new host. Coronaviruses were recently found to have little intrinsic disorder compared with many other virus families. Because intrinsically disordered regions have been proposed to be important for rewiring interactions between virus and host, we investigated the conservation of intrinsic disorder and secondary structure in coronaviruses in an evolutionary context. We found that regions of intrinsic disorder are rarely conserved among different coronavirus protein families, with the primary exception of the nucleocapsid. Also, secondary structure predictions are conserved across only 50–80% of sites for most protein families, implying that 20–50% of sites do not have a conserved secondary structure prediction. Furthermore, nonconserved structure sites are significantly less constrained in sequence divergence than either sites conserved in secondary structure or sites conserved in loop. When regions symptomatic of conformational flexibility, such as disordered sites and sites with nonconserved secondary structure, are avoided in order to identify potential broad-specificity antiviral targets, only one sequence motif (five residues or longer) remains from the >10,000 starting sites across all coronaviruses in this study. The identified sequence motif is found within nonstructural protein (NSP) 12 and constitutes an antiviral target potentially effective against present-day and future coronaviruses. On shorter evolutionary timescales, the SARS and MERS clades have more sequence motifs fulfilling the criteria applied. Interestingly, many motifs map to NSP12, making this a prime target for coronavirus antivirals.</p>
</abstract>
<kwd-group kwd-group-type="author">
<kwd>structural disorder</kwd>
<kwd>evolutionary dynamics</kwd>
<kwd>Coronavirus</kwd>
<kwd>evolution</kwd>
<kwd>divergence</kwd>
<kwd>MERS-CoV</kwd>
</kwd-group>
<counts>
<page-count count="14"></page-count>
</counts>
</article-meta>
</front>
<body>
<sec sec-type="intro">
<title>Introduction</title>
<p>Severe Acute Respiratory Syndrome (SARS)-CoV and Middle East Respiratory Syndrome (MERS)-CoV are two closely related zoonotic coronaviruses. Both have successfully crossed the species barrier to allow animal-to-human transmission, and further to allow human-to-human transmission (
<xref rid="evw246-B42" ref-type="bibr">Song et al. 2005</xref>
;
<xref rid="evw246-B37" ref-type="bibr">Reusken et al. 2016</xref>
). The SARS outbreak in 2003 had a mortality rate of 10% (
<xref rid="evw246-B1" ref-type="bibr">Anderson et al. 2010</xref>
), and SARS-CoV was considered the most aggressive coronavirus compared to other human coronaviruses that commonly cause mild to moderate infection in their hosts (
<xref rid="evw246-B45" ref-type="bibr">van der Hoek 2007</xref>
). MERS-CoV is the cause of an ongoing outbreak of the respiratory illness MERS (
<xref rid="evw246-B10" ref-type="bibr">de Groot et al. 2013</xref>
). At the time of writing, 1,791 MERS cases have been confirmed with a mortality rate of approximately 35% (
<xref rid="evw246-B48" ref-type="bibr">World Health Organization 2016</xref>
). Both MERS and SARS have higher mortality rates in elderly and immunosuppressed populations (
<xref rid="evw246-B19" ref-type="bibr">Gralinski and Baric 2015</xref>
).</p>
<p>The host changes by MERS-CoV and SARS-CoV suggest that other coronaviruses can potentially cross the species barrier, become zoonotic, and enable human-to-human transmission, ultimately causing high morbidity and mortality. SARS-CoV and MERS-CoV exploited mechanistically different approaches to overcome the human species barrier, but the two viruses have much in common (
<xref rid="evw246-B29" ref-type="bibr">Lu et al. 2015</xref>
). Here, we aim to identify vulnerable regions in coronavirus proteomes without which neither SARS-CoV, MERS-CoV, nor their contemporary and future relatives can proliferate, and to address how targeting these regions could mobilize a defense against present and future coronaviruses.</p>
<p>SARS-CoV and MERS-CoV are positive (+)-strand RNA viruses encoding approximately 25 protein products. The MERS-CoV proteome is primarily composed of two polyproteins, ORF1a and ORF1ab; the latter is generated by a −1 ribosomal frameshift. These polyproteins are cleaved into 16 nonstructural proteins (NSPs). NSPs 1–10 are products of both polyproteins, whereas NSPs 12–16 are produced only from ORF1ab. NSP11 is unique to ORF1a (
<xref rid="evw246-B44" ref-type="bibr">van Boheemen et al. 2012</xref>
). Structural proteins envelope (E), spike (S), membrane (M), and nucleocapsid (N) are elements of the physical structure that encloses the viral genome and come from distinct reading frames, unlike ORF1a and ORF1ab, which come from overlapping reading frames. Additionally, the structural proteins are the product of subgenomic mRNAs that are joined during discontinuous negative RNA strand synthesis (
<xref rid="evw246-B44" ref-type="bibr">van Boheemen et al. 2012</xref>
). Finally, NS3 protein (NS3), NS4A protein (NS4A), NS4B protein (NS4B), NS5 protein (NS5), and Orf8b protein make up the remainder of the proteome and also arise from distinct reading frames (
<xref rid="evw246-B44" ref-type="bibr">van Boheemen et al. 2012</xref>
).</p>
<p>Our approach uses genomic sequence data, which are readily available for viruses known to cause disease. However, because most viruses pose no major threat to their host, they pass unnoticed, leaving the majority of virus genome space uncharted. With the availability of cost-efficient genome sequencing technology and recent developments in the field of viral metagenomics, large-scale identification of viral genome space is on the rise (
<xref rid="evw246-B39" ref-type="bibr">Rosario and Breitbart 2011</xref>
;
<xref rid="evw246-B34" ref-type="bibr">Mokili et al. 2012</xref>
). By exploring viral diversity, the components critical to the fitness of a viral genus can be evaluated. The common influenza virus illustrates how rapidly viral genes mutate: to maintain immune protection, an annual flu vaccination is recommended. Ongoing efforts aim to generate broadly neutralizing vaccines whose design accounts for the genomic sequences of multiple types of influenza virus, eliminating the need for frequent re-vaccination against the flu (
<xref rid="evw246-B16" ref-type="bibr">Giles and Ross 2011</xref>
,
<xref rid="evw246-B17" ref-type="bibr">2012</xref>
). Development of broadly neutralizing vaccines often relies on the consensus or ancestral sequences of extant viral sequences in order to provide greater coverage for related viruses (
<xref rid="evw246-B26" ref-type="bibr">Kesturu et al. 2006</xref>
). Unfortunately, consensus sequences can be misleading, and ancestral sequence reconstruction is error-prone for quickly diverging sequences (
<xref rid="evw246-B32" ref-type="bibr">McCloskey et al. 2014</xref>
). In addition, viruses with compact genomes often express proteins with structural disorder that may undergo structural transformations. These transformer proteins, like VP40 in Ebola, are masters at changing their structure and thus expanding their functional repertoire as needed for the life cycle of the virus (
<xref rid="evw246-B4" ref-type="bibr">Bornholdt et al. 2013</xref>
), and flexible regions are potentially important in rewiring protein–protein interactions between the virus and its host (
<xref rid="evw246-B5" ref-type="bibr">Le Breton et al. 2011</xref>
;
<xref rid="evw246-B35" ref-type="bibr">Ortiz et al. 2013</xref>
;
<xref rid="evw246-B18" ref-type="bibr">Gitlin et al. 2014</xref>
). The flexibility of many viral proteins complicates vaccine development. For instance, Dengue virus exhibits serotype-specific antibody affinity that causes antibody-dependent enhancement, an obstacle to developing a Dengue vaccine that protects against all four serotypes (
<xref rid="evw246-B15" ref-type="bibr">Flipse and Smit 2015</xref>
). To overcome the hurdle posed by structural flexibility, we propose an additional screening step in identifying potential vaccine or antiviral targets that considers the structural flexibility of the viral proteins. The Structural Genomics Initiatives increased their success rate by excluding proteins predicted to be structurally disordered (
<xref rid="evw246-B41" ref-type="bibr">Slabinski et al. 2007</xref>
). A similar approach can perhaps benefit vaccine development. Furthermore, to make this approach robust to potential mutations, minimizing loss in efficacy or resistance, the evolutionary context of sequence and structure must be considered. Thus, we suggest expanding the concept of broadly neutralizing vaccines/antivirals by increasing the diversity of viruses considered if possible. Sites conserved for sequence, structure, and with low disorder propensity among diverse virus protein homologs are very likely to be constrained from 1) changing sequence on evolutionary time scales and 2) undergoing real-time structural transitions. These sites have potential as targets for broad-specificity antivirals or vaccines because conservation makes them broad-specificity and low dynamics avoids targeting a conformational ensemble, which is not only difficult (
<xref rid="evw246-B51" ref-type="bibr">Yu et al. 2016</xref>
), but may also change as the sequence diverges (
<xref rid="evw246-B40" ref-type="bibr">Siltberg-Liberles et al. 2011</xref>
).</p>
<p>A recent large-scale study of structural disorder in >2,000 viral genomes in 41 viral families found the amount of disorder in different virus families varying from 2.9% to 23.1% (
<xref rid="evw246-B36" ref-type="bibr">Pushker et al. 2013</xref>
). It was reported that
<italic>Coronaviridae</italic>
has very low disorder content (mean disorder 3.68%) (
<xref rid="evw246-B36" ref-type="bibr">Pushker et al. 2013</xref>
).
<italic>Coronaviridae</italic>
contains two subfamilies:
<italic>Coronavirinae</italic>
and
<italic>Torovirinae</italic>
. SARS-CoV and MERS-CoV are part of the
<italic>Coronavirinae</italic>
subfamily, from here on referred to as coronavirus (CoV). The lack of disorder is intriguing because disorder may be important for rewiring interactions between viral proteins and host proteins (
<xref rid="evw246-B35" ref-type="bibr">Ortiz et al. 2013</xref>
) and providing opportunities to acquire novel functional sequence motifs (
<xref rid="evw246-B18" ref-type="bibr">Gitlin et al. 2014</xref>
). Structural disorder has also been proposed to be important for viral viability, enabling multifunctionality and vigor in response to changes in the environment (
<xref rid="evw246-B49" ref-type="bibr">Xue et al. 2014</xref>
). Given the low fraction of structural disorder reported across
<italic>Coronaviridae</italic>
, we set out to investigate the conservation of structural disorder and secondary structure across CoV. Sites identified as conserved for structure and lacking disorder can be considered to be vulnerable and druggable in the proteomes of coronaviruses. The structural divergence capacity of these regions is limited, leaving a wider range of the present and emergent coronaviruses susceptible to the effects of potential broadly neutralizing anti-CoV therapies targeting these sites. We will refer to these sites as target sites.</p>
</sec>
<sec sec-type="materials|methods">
<title>Materials and Methods</title>
<sec>
<title>Protein Family Reconstruction</title>
<p>Protein sequences were identified by individual BLAST searches with MERS-CoV (Taxonomy ID: 1335626) proteins ORF1ab (YP_009047202.1; polyprotein), S protein (YP_009047204.1), M protein (YP_009047210.1), E protein (YP_009047209.1), and N protein (YP_009047211.1) against coronaviruses. BLAST searches of the ORF1ab protein were performed against the refseq_protein database, using the start and end positions detailed in the ORF1ab NCBI Reference Sequence file. Hits were retained if they had >30% sequence identity and >50% coverage relative to the MERS-CoV query sequence. This cutoff balances alignment quality against retaining at least 10 sequences for most protein families. NSP1 (YP_009047202.1; 1-193), NSP2 (YP_009047202.1; 194-853), NS3 (YP_009047205.1), NS4A (YP_009047206.1), NS4B (YP_009047207.1), NS5 (YP_009047208.1), ORF8b protein (YP_009047212.1), and NSP11 (YP_009047203.1; 4378-4391) are not included in this study because they returned <10 BLAST hits.</p>
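<p>The hit-selection rule described above can be sketched as a simple filter. This is a minimal illustration under stated assumptions, not the pipeline used in the study: the Hit record, its field names, and the helper functions are hypothetical, and query coverage is assumed to be the aligned query span divided by the query length.</p>
<preformat><![CDATA[
```python
from typing import NamedTuple


class Hit(NamedTuple):
    """Hypothetical BLAST hit record; the field names are assumptions."""
    subject_id: str
    percent_identity: float  # percent identity over the aligned region
    query_start: int         # 1-based, inclusive
    query_end: int           # 1-based, inclusive


def query_coverage(hit, query_length):
    """Percent of the query spanned by the alignment (assumed definition)."""
    return 100.0 * (hit.query_end - hit.query_start + 1) / query_length


def passes_cutoff(hit, query_length, min_identity=30.0, min_coverage=50.0):
    """Keep hits with >30% identity and >50% query coverage (both strict)."""
    return (hit.percent_identity > min_identity
            and query_coverage(hit, query_length) > min_coverage)


def select_family(hits, query_length, min_hits=10):
    """Return the retained hits, or None if fewer than min_hits remain."""
    kept = [h for h in hits if passes_cutoff(h, query_length)]
    return kept if len(kept) >= min_hits else None
```
]]></preformat>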
<p>Multiple sequence alignments were constructed for the selected BLAST hits using MAFFT (
<xref rid="evw246-B25" ref-type="bibr">Katoh et al. 2002</xref>
). Phylogenetic trees were constructed using MrBayes 3.2.2 with a four category gamma distribution and the mixed model for amino acid substitution (
<xref rid="evw246-B20" ref-type="bibr">Huelsenbeck and Ronquist 2001</xref>
;
<xref rid="evw246-B38" ref-type="bibr">Ronquist and Huelsenbeck 2003</xref>
). Each analysis was run for five million generations, sampling every 100 generations. The final tree was constructed from the last 75% of samples, discarding the first 25% as burn-in (the default), and using the half-compatible consensus setting, which collapses weakly supported nodes (i.e., those with a posterior probability <0.5). All trees were midpoint rooted.</p>
<p>For every protein family, the amino acid substitution rate per site in its multiple sequence alignment was calculated using empirical Bayesian estimation as implemented in Rate4Site (
<xref rid="evw246-B31" ref-type="bibr">Mayrose et al. 2004</xref>
). Substitution rates were calculated using 16 gamma categories, the JTT substitution matrix (
<xref rid="evw246-B24" ref-type="bibr">Jones et al. 1992</xref>
), and the reconstructed phylogenies. The rates were normalized per protein family with an average across all sites equal to zero and SD equal to 1. This means that sites with a rate <0 are evolving slower than average, whereas sites with a rate >0 are evolving faster than average.</p>
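<p>The per-family normalization described above amounts to a z-score transformation of the site rates. The sketch below illustrates it under one assumption not stated in the text: that the SD is the population SD over all sites of a family. The function names are hypothetical.</p>
<preformat><![CDATA[
```python
from statistics import mean, pstdev


def normalize_rates(rates):
    """Rescale per-site rates to mean 0 and SD 1 within one protein family."""
    mu = mean(rates)
    sd = pstdev(rates)  # population SD; an assumption, see the lead-in
    if sd == 0:
        raise ValueError("all sites have the same rate; cannot normalize")
    return [(r - mu) / sd for r in rates]


def slower_than_average(normalized):
    """Flag sites whose normalized rate is below zero."""
    return [z < 0 for z in normalized]
```
]]></preformat>
<p>After this transformation, rates from different protein families share a common scale, so a value below zero marks a slower-than-average site in any family.</p>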
</sec>
<sec>
<title>Prediction of Intrinsic Disorder Propensity and Secondary Structure</title>
<p>Intrinsic disorder propensity was inferred using two different predictors: IUPred (default settings; “long” option) (
<xref rid="evw246-B11" ref-type="bibr">Dosztányi et al. 2005a</xref>
,
<xref rid="evw246-B12" ref-type="bibr">2005b</xref>
) and DISOPRED2 (
<xref rid="evw246-B46" ref-type="bibr">Ward et al. 2004</xref>
) for all proteins. For IUPred, the site-specific continuous disorder propensities for each protein were mapped onto their corresponding position in the multiple sequence alignment as raw disorder propensities and as binary states, order or disorder, using two cutoffs of 0.4 and 0.5. Disorder propensities below the cutoff were assigned order and disorder propensities at the cutoff or above were assigned disorder. For the DISOPRED2 predictions that were inferred using the nr database, the continuous disorder propensities for every site in a protein were mapped onto their corresponding position in the multiple sequence alignment as raw disorder propensities and as binary states, order or disorder, using a cutoff of 5. Consequently, for every protein family (a multiple sequence alignment and its corresponding phylogenetic tree), two continuous matrices and three binary matrices resulted: IUPred 0.4, IUPred 0.5, and DISOPRED2. An additional matrix was generated to indicate sites where the binary order and disorder assignments differ between IUPred 0.4 and DISOPRED2.</p>
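<p>The mapping of per-residue propensities onto alignment columns, followed by conversion to binary states, can be sketched as follows. This is an illustration only: the gap character, the use of None for gapped positions, and all function names are assumptions. Scores at or above the cutoff are assigned disorder, as described above.</p>
<preformat><![CDATA[
```python
GAP = "-"  # assumed gap character in the aligned sequences


def map_to_alignment(aligned_seq, propensities):
    """Place per-residue scores at alignment columns; gaps become None."""
    n_residues = sum(1 for c in aligned_seq if c != GAP)
    if n_residues != len(propensities):
        raise ValueError("one propensity per ungapped residue is required")
    scores = iter(propensities)
    return [None if c == GAP else next(scores) for c in aligned_seq]


def binarize(row, cutoff):
    """1 = disorder (score >= cutoff), 0 = order (score < cutoff), None = gap."""
    return [None if s is None else int(s >= cutoff) for s in row]


def disagreement(row_a, row_b):
    """1 where two binary rows differ, 0 where they agree, None at gaps."""
    return [None if a is None or b is None else int(a != b)
            for a, b in zip(row_a, row_b)]
```
]]></preformat>
<p>Applying binarize to the same row with cutoffs 0.4 and 0.5 gives the two IUPred binary rows, and disagreement marks the sites where two binary assignments differ.</p>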
<p>A similar methodology was employed to analyze secondary structure predicted by PSIPRED (
<xref rid="evw246-B33" ref-type="bibr">McGuffin et al. 2000</xref>
) and JPred (
<xref rid="evw246-B13" ref-type="bibr">Drozdetskiy et al. 2015</xref>
). For both predictors, the uniref90 database was used and sites were classified as loops, alpha helices, or beta strands and mapped back onto their corresponding sites in the multiple sequence alignment. This resulted in two three-state matrices for each protein family alignment, one for each predictor, and two binary matrices displaying secondary structure elements (alpha helix and beta strand) or loops. An additional matrix was generated to indicate sites where the secondary structure assignments differ between PSIPRED and JPred.</p>
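<p>Collapsing the three-state predictions to the binary structure or loop representation, and measuring how often two predictors differ, can be sketched as below. The state codes (H for helix, E for strand, C for loop) and the choice to skip gapped cells are assumptions made for illustration; the exact disagreement measure used in the study is not specified here.</p>
<preformat><![CDATA[
```python
# Assumed three-state codes: H = helix, E = strand, C = loop, - = gap.
STRUCTURE_STATES = {"H", "E"}


def to_binary(three_state_row):
    """Collapse helix/strand to 1 (structure) and loop to 0; gaps stay None."""
    out = []
    for s in three_state_row:
        if s == "-":
            out.append(None)
        elif s in STRUCTURE_STATES:
            out.append(1)
        elif s == "C":
            out.append(0)
        else:
            raise ValueError(f"unexpected state: {s!r}")
    return out


def percent_disagreement(matrix_a, matrix_b):
    """Percent of non-gap cells where two three-state matrices differ."""
    compared = differing = 0
    for row_a, row_b in zip(matrix_a, matrix_b):
        for a, b in zip(row_a, row_b):
            if a == "-" or b == "-":
                continue  # gapped cells are not compared
            compared += 1
            differing += a != b
    return 100.0 * differing / compared if compared else 0.0
```
]]></preformat>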
<p>For every protein family, the binary matrices resulting from the different disorder predictions and from the different secondary structure predictions were analyzed in the corresponding evolutionary context using GLOOME. GLOOME (Gain-Loss Mapping Engine) analyzes binary presence and absence patterns in a phylogenetic context (
<xref rid="evw246-B8" ref-type="bibr">Cohen et al. 2010</xref>
). In this study, the Rate4Site option in GLOOME was used to analyze the binary matrices (IUPred 0.4, IUPred 0.5, DISOPRED2, PSIPRED, and JPred) with the corresponding phylogenetic trees to map change of state across sites in each individual protein phylogeny (
<xref rid="evw246-B9" ref-type="bibr">Cohen and Pupko 2010</xref>
;
<xref rid="evw246-B8" ref-type="bibr">Cohen et al. 2010</xref>
). GLOOME was run with 16 gamma categories and a substitution matrix set to equal rates within each state and transitions between states treated equally. From the binary disorder and order matrices, transition rates between disorder and order or vice versa (DOT) were estimated. From the binary structure and loop matrices, transition rates between structure and loop or vice versa (SLT) were estimated. Similar to Rate4Site, the rates were normalized per protein family with an average across all sites equal to zero and SD equal to 1. This means that sites with a rate <0 are evolving slower than average, while sites with a rate >0 are evolving faster than average.</p>
</sec>
<sec>
<title>Protein Family Visualization</title>
<p>Protein families were visualized in an integrative manner with a phylogenetic tree, any matrix (multiple sequence alignment or predictor based) displayed as a heatmap, and site-specific sequence transition rates using Python packages ETE3 (
<xref rid="evw246-B21" ref-type="bibr">Huerta-Cepas et al. 2016</xref>
) and Matplotlib (
<xref rid="evw246-B22" ref-type="bibr">Hunter 2007</xref>
).</p>
</sec>
<sec>
<title>Statistical Analysis of Amino Acid Evolutionary Rate Distributions</title>
<p>Amino acid evolutionary rates (SEQ) for all sites across all alignments were aggregated and binned into four possible categories characterized by the distribution of PSIPRED predicted secondary structure at each site. Sites predicted to have a loop across all sequences are “conserved loops; C(L)” and sites predicted to have a helix across all sequences or a strand across all sequences are “conserved helix-strand; C(HS)” (
<xref ref-type="table" rid="evw246-T3">table 3</xref>
). Sites predicted to have all three states (helix, strand, and loop) or any combination of loop and one other state are “non-conserved helix, loop, strand; NC(HLS)” and sites predicted to have a mixture of helix and strand are “non-conserved helix-strand; NC(HS)” (
<xref ref-type="table" rid="evw246-T3">table 3</xref>
). In all cases, gaps were ignored when classifying the combination of secondary structure states at a site and when determining whether secondary structure is conserved at that site.</p>
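<p>The four site categories defined above can be expressed as a small classifier over one alignment column. This is a sketch only: the state codes (H, E, C, and - for gaps) and the function name are assumptions, and a column consisting only of gaps is given no category.</p>
<preformat><![CDATA[
```python
VALID_STATES = {"H", "E", "C"}  # assumed codes: helix, strand, loop


def classify_site(column):
    """Assign one alignment column of predicted states to a category."""
    states = {s for s in column if s != "-"}  # gaps are ignored
    if not states <= VALID_STATES:
        raise ValueError(f"unexpected states: {sorted(states - VALID_STATES)}")
    if not states:
        return None  # all-gap column: no category
    if states == {"C"}:
        return "C(L)"    # loop in all sequences
    if len(states) == 1:
        return "C(HS)"   # helix in all sequences, or strand in all
    if states == {"H", "E"}:
        return "NC(HS)"  # mixture of helix and strand, no loop
    return "NC(HLS)"     # loop mixed with helix and/or strand
```
]]></preformat>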
</sec>
</sec>
<sec sec-type="results">
<title>Results</title>
<sec>
<title>Phylogenies</title>
<p>Phylogenies were built for all protein products encoded in the MERS-CoV single-stranded RNA genome, except for NSP1, NSP2, NS3, NS4A, NS4B, NS5, ORF8b protein, and NSP11, all of which had insufficient sequence data (<10 sequence hits with BLAST).</p>
<p>NSP12 is often used as a benchmark for classifying newly identified coronaviruses. According to the International Committee on Taxonomy of Viruses (ICTV), a major criterion in determining if a coronavirus is considered novel is pairwise sequence identity below 90% for NSP12 in all comparisons to previously known coronaviruses (
<xref rid="evw246-B3" ref-type="bibr">Bermingham et al. 2012</xref>
). Four main clades, alphacoronavirus, betacoronavirus, gammacoronavirus, and deltacoronavirus (
<xref ref-type="fig" rid="evw246-F1">fig. 1</xref>
), are identified in agreement with the taxonomic classifications described by the ICTV (
<xref rid="evw246-B23" ref-type="bibr">International Committee on Taxonomy of Viruses 2015</xref>
). Coronaviruses not listed by the ICTV are assumed to be a part of the clade in which representatives with known classifications are situated in our NSP12 phylogeny.
<fig id="evw246-F1" orientation="portrait" position="float">
<label>F
<sc>ig</sc>
. 1.—</label>
<caption>
<p>CoV representative phylogeny. (
<italic>A</italic>
) NSP12 is a representative for the CoV protein phylogenies, colored by clade (alphacoronavirus or FIPV, green; betacoronavirus has four different subclades: SARS, blue; MERS, gray; HKU1, pink; EQU, purple; gammacoronavirus or SW1, yellow; deltacoronavirus or HKU19, white.) Posterior probability indicating node support is shown in red. (
<italic>B</italic>
) Protein family distribution across coronavirus based on the given cutoff (>30% sequence identity and >50% coverage relative to MERS-CoV sequence query). Clade color applied throughout the remaining figures. Areas shaded in gray with an arrow indicate that the protein family is not identified for that clade with the given cutoffs, but is found from the arrow tip. See
<ext-link ext-link-type="uri" xlink:href="http://gbe.oxfordjournals.org/lookup/suppl/doi:10.1093/gbe/evw246/-/DC1">supplementary fig. S1</ext-link>
,
<ext-link ext-link-type="uri" xlink:href="http://gbe.oxfordjournals.org/lookup/suppl/doi:10.1093/gbe/evw246/-/DC1">Supplementary Material</ext-link>
online, for the remaining phylogenetic trees.</p>
</caption>
<graphic xlink:href="evw246f1p"></graphic>
</fig>
</p>
<p>The MERS clade and SARS clade are sister clades in the NSP12 phylogeny. The HKU1 clade and EQU clade are also sister clades. Together these four clades form the Betacoronavirus clade, in accordance with the ICTV classification (
<xref rid="evw246-B23" ref-type="bibr">International Committee on Taxonomy of Viruses 2015</xref>
). Betacoronavirus is represented in all phylogenies although the order of the individual subclades varies. Alphacoronavirus is often found as the sister clade or outgroup to betacoronavirus. Deltacoronavirus or gammacoronavirus are the most distantly related to the betacoronavirus. In the nucleocapsid phylogeny, gammacoronavirus is the first outgroup clade to betacoronavirus, and alphacoronavirus is the most distant outgroup. Most NSP trees exhibit some unresolved nodes at junctures immediately preceding terminal nodes. As an effect of the 50% majority rule, most of the 546 resolved nodes are well supported with posterior probability >0.9 for 82% and >0.99 for 68% (
<ext-link ext-link-type="uri" xlink:href="http://gbe.oxfordjournals.org/lookup/suppl/doi:10.1093/gbe/evw246/-/DC1">supplementary fig. S1</ext-link>
,
<ext-link ext-link-type="uri" xlink:href="http://gbe.oxfordjournals.org/lookup/suppl/doi:10.1093/gbe/evw246/-/DC1">Supplementary Material</ext-link>
online). Most trees follow the NSP12 topology for the main clades, with minor clade rearrangements. It should be noted that for NSP5, the entire alphacoronavirus clade is placed within the betacoronavirus clade, as a sister clade to the MERS clade (
<ext-link ext-link-type="uri" xlink:href="http://gbe.oxfordjournals.org/lookup/suppl/doi:10.1093/gbe/evw246/-/DC1">supplementary fig. S1</ext-link>
,
<ext-link ext-link-type="uri" xlink:href="http://gbe.oxfordjournals.org/lookup/suppl/doi:10.1093/gbe/evw246/-/DC1">Supplementary Material</ext-link>
online). This may be due to increased sequence divergence rates or due to recombination. Recombination events are rather frequent in coronaviruses (
<xref rid="evw246-B43" ref-type="bibr">Su et al. 2016</xref>
), and the MERS clade potentially underwent multiple recombination events as part of the host change (
<xref rid="evw246-B52" ref-type="bibr">Zhang et al. 2016</xref>
).</p>
<p>With the given BLAST cutoffs, the phylogenies for membrane protein, spike protein, NSP5, and NSP8–NSP16 include recoverable homologs from all coronaviruses (i.e., all coronaviruses represented in the NSP12 phylogeny). Nucleocapsid, NSP4, and NSP7 have recoverable homologs in all clades except deltacoronavirus. NSP3 and NSP6 homologs are too divergent in deltacoronavirus and/or gammacoronavirus relative to MERS-CoV. Envelope appears specific to betacoronavirus (
<xref ref-type="fig" rid="evw246-F1">fig. 1</xref>
), but it is a short protein that has been found to diverge rapidly and is likely present outside betacoronavirus (
<xref rid="evw246-B14" ref-type="bibr">Fehr and Perlman 2015</xref>
). Because different protein families yield slightly different phylogenies, for the remaining evolutionary analyses, every protein family was analyzed in the context of its own phylogeny.</p>
</sec>
<sec>
<title>Intrinsic Disorder Is Rarely Conserved</title>
<p>For all protein families, structural disorder propensities were predicted using IUPred (
<xref rid="evw246-B11" ref-type="bibr">Dosztányi et al. 2005a</xref>
,
<xref rid="evw246-B12" ref-type="bibr">2005b</xref>
) and DISOPRED2 (
<xref rid="evw246-B46" ref-type="bibr">Ward et al. 2004</xref>
). To verify the robustness of the binary IUPred and DISOPRED2 predictions, the binary assignments were compared on a site-by-site basis (
<xref ref-type="table" rid="evw246-T1">table 1</xref>
). When converted to binary (i.e., two states per site: disordered or ordered), IUPred 0.4 and IUPred 0.5 are in good agreement, with the larger differences seen for NSP8, NSP9, and nucleocapsid (7.5%, 6.5%, and 19.0%, respectively) (
<xref ref-type="table" rid="evw246-T1">table 1</xref>
). Comparing IUPred 0.4 or IUPred 0.5 to DISOPRED2, large differences are in particular seen for nucleocapsid (38.7% and 29.7% respectively) and NSP8 (23.5% and 25.9%, respectively) (
<xref ref-type="table" rid="evw246-T1">table 1</xref>
). For nucleocapsid, regions that are found to be disordered by IUPred 0.4 are found to be ordered by IUPred 0.5 and DISOPRED2 (
<xref ref-type="fig" rid="evw246-F2">fig. 2</xref>
and
<ext-link ext-link-type="uri" xlink:href="http://gbe.oxfordjournals.org/lookup/suppl/doi:10.1093/gbe/evw246/-/DC1">supplementary fig. S2</ext-link>
,
<ext-link ext-link-type="uri" xlink:href="http://gbe.oxfordjournals.org/lookup/suppl/doi:10.1093/gbe/evw246/-/DC1">Supplementary Material</ext-link>
online). For NSP8, in regions that are only slightly disordered in a few sequences according to IUPred 0.4 and IUPred 0.5, DISOPRED2 predicts disorder to be conserved across all sequences (
<xref ref-type="fig" rid="evw246-F3">fig. 3</xref>
).
<table-wrap id="evw246-T1" orientation="portrait" position="float">
<label>Table 1</label>
<caption>
<p>Protein Family Wide Disagreement of Disorder and Secondary Structure Predictions</p>
</caption>
<table frame="hsides" rules="groups">
<colgroup span="1">
<col valign="top" align="left" span="1"></col>
</colgroup>
<tbody align="left">
<tr>
<td rowspan="1" colspan="1">
<inline-graphic xlink:href="evw246ie1p.jpg"></inline-graphic>
</td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<fn id="evw246-TF6">
<label>1</label>
<p>Tukey boxplot constructed using the IUPred 0.4 predicted disorder fraction (number of disordered sites/total sites) per sequence per protein. Green dots represent outliers; red diamonds are the mean values and red lines are the median values.</p>
</fn>
</table-wrap-foot>
</table-wrap>
<fig id="evw246-F2" orientation="portrait" position="float">
<label>F
<sc>ig</sc>
. 2.—</label>
<caption>
<p>The evolutionary context of intrinsic disorder in nucleocapsid. The phylogenetic tree was built using the multiple sequence alignments for nucleocapsid. Here, the multiple sequence alignment is colored by disorder propensity (with gaps in gray): (
<italic>A</italic>
) IUPred 0.4, blue-to-white-to-red shows disorder propensity according to the scale for IUPred 0.4. (
<italic>B</italic>
) IUPred 0.5, blue-to-white-to-red shows disorder propensity according to the scale for IUPred 0.5. (
<italic>C</italic>
) DISOPRED2, blue-to-white-to-red shows disorder propensity according to the scale for DISOPRED2. Above the heat maps, the normalized evolutionary rates per site for amino acid substitution (SEQ) and the DOT for the binary transformations of
<italic>A</italic>
–
<italic>C</italic>
are shown. Heat maps visualized with the Python packages ETE3 (
<xref rid="evw246-B21" ref-type="bibr">Huerta-Cepas et al. 2016</xref>
) and Matplotlib (
<xref rid="evw246-B22" ref-type="bibr">Hunter 2007</xref>
).</p>
</caption>
<graphic xlink:href="evw246f2p"></graphic>
</fig>
<fig id="evw246-F3" orientation="portrait" position="float">
<label>F
<sc>ig</sc>
. 3.—</label>
<caption>
<p>The evolutionary context of intrinsic disorder in NSP8. The phylogenetic tree was built using the multiple sequence alignments for NSP8. (
<italic>A</italic>
) The multiple sequence alignment is colored by amino acid according to scale, arranged based on TOP-IDP disorder promoting propensity of the amino acids (
<xref rid="evw246-B7" ref-type="bibr">Campen et al. 2008</xref>
), and gray denotes gaps. (
<italic>B</italic>
) IUPred disorder propensity per site in the multiple sequence alignment. Blue-to-white-to-red shows disorder propensity according to the scale for IUPred 0.4. (
<italic>C</italic>
) IUPred disorder propensity per site in the multiple sequence alignment. Blue-to-white-to-red shows disorder propensity according to the scale for IUPred 0.5. (
<italic>D</italic>
) DISOPRED2 disorder propensity per site in the multiple sequence alignment. Blue-to-white-to-red shows disorder propensity according to the scale. Above the multiple sequence alignment, the normalized evolutionary rates per site for amino acid substitution (SEQ) and the DOT for the binary transformations of
<italic>B</italic>
–
<italic>D</italic>
are shown. Heat maps visualized with the Python packages ETE3 (
<xref rid="evw246-B21" ref-type="bibr">Huerta-Cepas et al. 2016</xref>
) and Matplotlib (
<xref rid="evw246-B22" ref-type="bibr">Hunter 2007</xref>
). See
<ext-link ext-link-type="uri" xlink:href="http://gbe.oxfordjournals.org/lookup/suppl/doi:10.1093/gbe/evw246/-/DC1">supplementary figures S2 and S4</ext-link>
,
<ext-link ext-link-type="uri" xlink:href="http://gbe.oxfordjournals.org/lookup/suppl/doi:10.1093/gbe/evw246/-/DC1">Supplementary Material</ext-link>
online for additional graphics for every protein family.</p>
</caption>
<graphic xlink:href="evw246f3p"></graphic>
</fig>
</p>
<p>To quantify the fraction of disordered sites per protein family, we report the IUPred 0.4 results only for simplicity (
<xref ref-type="table" rid="evw246-T1">table 1</xref>
). In general, IUPred 0.4 predicts more disorder than DISOPRED2, but several protein families have almost no disordered sites. NSP3 and NSP8–NSP10 show some variation in disorder content among viruses. Based on the fraction of disorder, nucleocapsid is the only highly disordered protein among the CoVs in this study, even though NSP8–NSP10 have outliers that are >20% disordered.</p>
<p>To compare the disorder-to-order transition rates (DOT) for all protein families where the binary matrices of disorder and order include both states, the quadrant count ratio (QCR) was estimated as a measure of association in assigning slower-than-average vs. faster-than-average transition rates. For IUPred 0.4 vs. IUPred 0.5, for IUPred 0.5 vs. DISOPRED2, and for IUPred 0.4 vs. DISOPRED2, the QCRs were 0.76, 0.69, and 0.63, respectively. This shows a strong positive association for site-specific DOT for all methods and cutoffs, with IUPred 0.4 vs. IUPred 0.5 being the strongest (
<xref ref-type="table" rid="evw246-T2">table 2</xref>
). For nucleocapsid and NSP8, the positive associations are weaker, suggesting that many sites have IUPred disorder propensity in the 0.4 to 0.5 range and large differences between IUPred and DISOPRED2, in accordance with the large disagreement between the binary assignment of these predictors (
<xref ref-type="table" rid="evw246-T1">tables 1</xref>
and
<xref ref-type="table" rid="evw246-T2">2</xref>
).
<table-wrap id="evw246-T2" orientation="portrait" position="float">
<label>Table 2</label>
<caption>
<p>QCR
<xref ref-type="table-fn" rid="evw246-TF2">
<sup>a</sup>
</xref>
<bold>for DOT and SLT</bold>
</p>
</caption>
<table frame="hsides" rules="groups">
<colgroup span="1">
<col valign="top" align="left" span="1"></col>
<col valign="top" align="char" char="." span="1"></col>
<col valign="top" align="char" char="." span="1"></col>
<col valign="top" align="char" char="." span="1"></col>
<col valign="top" align="char" char="." span="1"></col>
</colgroup>
<thead align="left">
<tr>
<th rowspan="1" colspan="1"></th>
<th colspan="4" align="center" rowspan="1">Rate
<hr></hr>
</th>
</tr>
<tr>
<th rowspan="2" colspan="1">Protein family</th>
<th rowspan="1" colspan="1">DOT</th>
<th rowspan="1" colspan="1">DOT</th>
<th rowspan="1" colspan="1">DOT</th>
<th rowspan="1" colspan="1">SLT</th>
</tr>
<tr>
<th rowspan="1" colspan="1">IUPred 0.4 vs. IUPred 0.5</th>
<th rowspan="1" colspan="1">IUPred 0.5 vs. DISOPRED2</th>
<th rowspan="1" colspan="1">IUPred 0.4 vs. DISOPRED2</th>
<th rowspan="1" colspan="1">PSIPRED vs. JPred</th>
</tr>
</thead>
<tbody align="left">
<tr>
<td rowspan="1" colspan="1">NSP3</td>
<td rowspan="1" colspan="1">0.75</td>
<td rowspan="1" colspan="1">0.68</td>
<td rowspan="1" colspan="1">0.61</td>
<td rowspan="1" colspan="1">0.51</td>
</tr>
<tr>
<td rowspan="1" colspan="1">NSP4</td>
<td rowspan="1" colspan="1">N/A
<xref ref-type="table-fn" rid="evw246-TF3">
<sup>b</sup>
</xref>
</td>
<td rowspan="1" colspan="1">N/A</td>
<td rowspan="1" colspan="1">0.58</td>
<td rowspan="1" colspan="1">0.61</td>
</tr>
<tr>
<td rowspan="1" colspan="1">NSP5</td>
<td rowspan="1" colspan="1">0.72</td>
<td rowspan="1" colspan="1">0.84</td>
<td rowspan="1" colspan="1">0.73</td>
<td rowspan="1" colspan="1">0.65</td>
</tr>
<tr>
<td rowspan="1" colspan="1">NSP6</td>
<td rowspan="1" colspan="1">N/A</td>
<td rowspan="1" colspan="1">N/A</td>
<td rowspan="1" colspan="1">N/A</td>
<td rowspan="1" colspan="1">0.70</td>
</tr>
<tr>
<td rowspan="1" colspan="1">NSP7</td>
<td rowspan="1" colspan="1">N/A</td>
<td rowspan="1" colspan="1">N/A</td>
<td rowspan="1" colspan="1">0.9</td>
<td rowspan="1" colspan="1">0.66</td>
</tr>
<tr>
<td rowspan="1" colspan="1">NSP8</td>
<td rowspan="1" colspan="1">0.86</td>
<td rowspan="1" colspan="1">0.43</td>
<td rowspan="1" colspan="1">0.38</td>
<td rowspan="1" colspan="1">0.67</td>
</tr>
<tr>
<td rowspan="1" colspan="1">NSP9</td>
<td rowspan="1" colspan="1">0.76</td>
<td rowspan="1" colspan="1">0.55</td>
<td rowspan="1" colspan="1">0.62</td>
<td rowspan="1" colspan="1">0.67</td>
</tr>
<tr>
<td rowspan="1" colspan="1">NSP10</td>
<td rowspan="1" colspan="1">0.93</td>
<td rowspan="1" colspan="1">0.67</td>
<td rowspan="1" colspan="1">0.68</td>
<td rowspan="1" colspan="1">0.42</td>
</tr>
<tr>
<td rowspan="1" colspan="1">NSP12</td>
<td rowspan="1" colspan="1">0.96</td>
<td rowspan="1" colspan="1">0.89</td>
<td rowspan="1" colspan="1">0.86</td>
<td rowspan="1" colspan="1">0.57</td>
</tr>
<tr>
<td rowspan="1" colspan="1">NSP13</td>
<td rowspan="1" colspan="1">0.76</td>
<td rowspan="1" colspan="1">0.70</td>
<td rowspan="1" colspan="1">0.59</td>
<td rowspan="1" colspan="1">0.36</td>
</tr>
<tr>
<td rowspan="1" colspan="1">NSP14</td>
<td rowspan="1" colspan="1">0.87</td>
<td rowspan="1" colspan="1">0.85</td>
<td rowspan="1" colspan="1">0.8</td>
<td rowspan="1" colspan="1">0.58</td>
</tr>
<tr>
<td rowspan="1" colspan="1">NSP15</td>
<td rowspan="1" colspan="1">0.82</td>
<td rowspan="1" colspan="1">0.85</td>
<td rowspan="1" colspan="1">0.67</td>
<td rowspan="1" colspan="1">0.53</td>
</tr>
<tr>
<td rowspan="1" colspan="1">NSP16</td>
<td rowspan="1" colspan="1">0.93</td>
<td rowspan="1" colspan="1">0.86</td>
<td rowspan="1" colspan="1">0.85</td>
<td rowspan="1" colspan="1">0.59</td>
</tr>
<tr>
<td rowspan="1" colspan="1">Envelope</td>
<td rowspan="1" colspan="1">N/A</td>
<td rowspan="1" colspan="1">N/A</td>
<td rowspan="1" colspan="1">0.58</td>
<td rowspan="1" colspan="1">0.53</td>
</tr>
<tr>
<td rowspan="1" colspan="1">Membrane</td>
<td rowspan="1" colspan="1">0.54</td>
<td rowspan="1" colspan="1">0.57</td>
<td rowspan="1" colspan="1">0.67</td>
<td rowspan="1" colspan="1">0.55</td>
</tr>
<tr>
<td rowspan="1" colspan="1">Nucleocapsid</td>
<td rowspan="1" colspan="1">0.33</td>
<td rowspan="1" colspan="1">0.43</td>
<td rowspan="1" colspan="1">0.31</td>
<td rowspan="1" colspan="1">0.61</td>
</tr>
<tr>
<td rowspan="1" colspan="1">Spike</td>
<td rowspan="1" colspan="1">0.81</td>
<td rowspan="1" colspan="1">0.62</td>
<td rowspan="1" colspan="1">0.49</td>
<td rowspan="1" colspan="1">0.55</td>
</tr>
<tr>
<td rowspan="1" colspan="1">All</td>
<td rowspan="1" colspan="1">0.76</td>
<td rowspan="1" colspan="1">0.69</td>
<td rowspan="1" colspan="1">0.63</td>
<td rowspan="1" colspan="1">0.55</td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<fn id="evw246-TF2">
<label>a</label>
<p>QCR: Quadrant Count Ratio measures the association for the same site-specific rate with different predictors or cutoffs.</p>
</fn>
<fn id="evw246-TF3">
<label>b</label>
<p>N/A: at least one of the rates in the comparison could not be estimated due to the lack of any disordered state in the binary state matrix (
<ext-link ext-link-type="uri" xlink:href="http://gbe.oxfordjournals.org/lookup/suppl/doi:10.1093/gbe/evw246/-/DC1">supplementary fig. S5</ext-link>
,
<ext-link ext-link-type="uri" xlink:href="http://gbe.oxfordjournals.org/lookup/suppl/doi:10.1093/gbe/evw246/-/DC1">Supplementary Material</ext-link>
online).</p>
</fn>
</table-wrap-foot>
</table-wrap>
</p>
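<p>The quadrant count ratio reported in table 2 can be illustrated with the textbook definition: the number of sites in the concordant quadrants minus the number in the discordant quadrants, divided by the total number of sites. The sketch below assumes that the quadrants are split at zero, because the rates are normalized to a mean of zero, and that a rate of exactly zero counts toward neither group; the exact conventions used in the study are not stated here, and the function name is hypothetical.</p>
<preformat><![CDATA[
```python
def quadrant_count_ratio(rates_x, rates_y):
    """QCR = (n_concordant - n_discordant) / n_total for paired site rates.

    Concordant: both rates above zero or both below zero (quadrants I, III).
    Discordant: the signs differ (quadrants II, IV).
    """
    if len(rates_x) != len(rates_y) or not rates_x:
        raise ValueError("need two non-empty rate lists of equal length")
    concordant = discordant = 0
    for x, y in zip(rates_x, rates_y):
        product = x * y
        if product > 0:
            concordant += 1
        elif product < 0:
            discordant += 1
        # a rate of exactly zero lies on an axis and is counted in neither
    return (concordant - discordant) / len(rates_x)
```
]]></preformat>
<p>Under this definition, a value of 1 means the two predictors always agree on whether a site is slower or faster than average, and a value of 0 means no association.</p>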
</sec>
<sec>
<title>Secondary Structure Prediction and Structure-to-Loop Transitions</title>
<p>For all protein families, secondary structure elements were predicted using PSIPRED (
<xref rid="evw246-B33" ref-type="bibr">McGuffin et al. 2000</xref>
) and JPred (
<xref rid="evw246-B13" ref-type="bibr">Drozdetskiy et al. 2015</xref>
). For most protein families, the disagreement between secondary structure predictors is greater than for the disorder predictors (
<xref ref-type="table" rid="evw246-T1">table 1</xref>
). In fact, 15 of the 17 protein families compared disagree at more than 10% of alignment sites, and two of these disagree at more than 20% of sites. To compare the binary structure-to-loop transitions (SLT), QCR was estimated as a measure of association for SLT based on the different predictors. In general, there is a moderate positive association between SLT for PSIPRED vs. SLT for JPred that is weaker than for the different DOT comparisons (
<xref ref-type="table" rid="evw246-T2">table 2</xref>
). It should be noted that SLT does not differentiate between alpha helix and beta strand, but considers both as “structure.” This is a correct assumption if protein structure is conserved and consistently predicted, but for some protein families that is not the case.</p>
<p>Four protein families (NSP3, NSP12, NSP13, and spike) have more than 40% of their sites found within the NC(HLS) category with non-conserved helix, strand, and loop (two or three states present at the same site) (
<xref ref-type="table" rid="evw246-T3">table 3</xref>
). For NSP13, JPred predicts 72% of all sites to be a mixture of helix, strand, and loop, or any combination of loop and one other structural element (
<xref ref-type="fig" rid="evw246-F4">fig. 4</xref>
). Envelope and NSP6 have 13% and 12% of their respective sites in the NC(HS) category. Considering only the PSIPRED predictions, the NC(HS) category has 245 sites across all 17 protein families. That is roughly one-tenth the size of the next smallest set, C(HS), with 2275 sites. Next, C(L) has 3344 sites, and the largest category is NC(HLS) with 4257 sites. Comparing the evolutionary sequence rates for the sites in the different categories, based on PSIPRED predictions only, reveals that sites in the C(HS) category are evolving at a slower rate than sites in all other categories. NC(HS) is only just significantly different (
<italic>P</italic>
= 4.62E−03) from C(HS), and is not significantly different from NC(HLS) and C(L) (
<italic>P</italic>
= 1.85E−02 and
<italic>P</italic>
= 8.33E−01, respectively). However, NC(HLS) and C(L) are significantly different from each other, and both are significantly different from C(HS) (
<italic>P</italic>
= 1.82E−46 and
<italic>P</italic>
= 2.33E−21, respectively) (
<xref ref-type="fig" rid="evw246-F5">fig. 5</xref>
).
<fig id="evw246-F4" orientation="portrait" position="float">
<label>F
<sc>ig</sc>
. 4.—</label>
<caption>
<p>The evolutionary context of secondary structure in NSP13. The phylogenetic trees were built using the multiple sequence alignments for NSP13. (
<italic>A</italic>
) The multiple sequence alignment is colored as in
<xref ref-type="fig" rid="evw246-F3">fig. 3</xref>
. (
<italic>B</italic>
) PSIPRED secondary structure prediction per site in the multiple sequence alignment, color coded according to the scale. (
<italic>C</italic>
) JPred secondary structure prediction per site in the multiple sequence alignment, color coded according to the scale. Above the multiple sequence alignment, the normalized evolutionary rates per site for sequence substitution (SEQ) and for SLT, based on the binary transformations of panels B and C, are shown. Heat maps were visualized with the Python packages ETE2 (
<xref rid="evw246-B21" ref-type="bibr">Huerta-Cepas et al. 2016</xref>
) and Matplotlib (
<xref rid="evw246-B22" ref-type="bibr">Hunter 2007</xref>
). See
<ext-link ext-link-type="uri" xlink:href="http://gbe.oxfordjournals.org/lookup/suppl/doi:10.1093/gbe/evw246/-/DC1">supplementary figs. S3 and S4</ext-link>
,
<ext-link ext-link-type="uri" xlink:href="http://gbe.oxfordjournals.org/lookup/suppl/doi:10.1093/gbe/evw246/-/DC1">Supplementary Material</ext-link>
online for a complete set of graphics for every protein family.</p>
</caption>
<graphic xlink:href="evw246f4p"></graphic>
</fig>
<fig id="evw246-F5" orientation="portrait" position="float">
<label>F
<sc>ig</sc>
. 5.—</label>
<caption>
<p>Comparison of SEQ at sites characterized by secondary structure. All pairwise rate distributions, except NC(HS) vs. NC(HLS) and NC(HS) vs. C(L), are significantly different (
<italic>P</italic>
&lt; 0.05, after Bonferroni correction:
<italic>P</italic>
&lt; 0.008). For a summary of the U statistic and two-tailed
<italic>P</italic>
values for each pairwise comparison see
<ext-link ext-link-type="uri" xlink:href="http://gbe.oxfordjournals.org/lookup/suppl/doi:10.1093/gbe/evw246/-/DC1">supplementary table S2</ext-link>
,
<ext-link ext-link-type="uri" xlink:href="http://gbe.oxfordjournals.org/lookup/suppl/doi:10.1093/gbe/evw246/-/DC1">Supplementary Material</ext-link>
online.</p>
</caption>
<graphic xlink:href="evw246f5p"></graphic>
</fig>
<table-wrap id="evw246-T3" orientation="portrait" position="float">
<label>Table 3</label>
<caption>
<p>Structural Conservation of Sites Per Protein Family</p>
</caption>
<table frame="hsides" rules="groups">
<colgroup span="1">
<col valign="top" align="left" span="1"></col>
</colgroup>
<tbody align="left">
<tr>
<td rowspan="1" colspan="1">
<inline-graphic xlink:href="evw246ie2p.jpg"></inline-graphic>
</td>
</tr>
</tbody>
</table>
</table-wrap>
</p>
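<p>The Bonferroni-corrected threshold quoted in the legend of fig. 5 can be reproduced with a short calculation. The sketch below assumes that the correction divides 0.05 by the six possible pairwise comparisons among the four site categories; this is inferred from the reported threshold of 0.008 and is not stated explicitly in the text. The variable and function names are hypothetical.</p>
<preformat>
```python
from itertools import combinations

# The four site categories compared in the text.
categories = ["C(HS)", "C(L)", "NC(HS)", "NC(HLS)"]
pairs = list(combinations(categories, 2))

alpha = 0.05
# Assumption: one test per pair of categories, giving six comparisons.
corrected_alpha = alpha / len(pairs)

# P values quoted in the text. NC(HLS) vs. C(L) is reported as significant,
# but its P value is not quoted in the text, so it is left out here.
reported_p = {
    ("NC(HS)", "C(HS)"): 4.62e-03,
    ("NC(HS)", "NC(HLS)"): 1.85e-02,
    ("NC(HS)", "C(L)"): 8.33e-01,
    ("NC(HLS)", "C(HS)"): 1.82e-46,
    ("C(L)", "C(HS)"): 2.33e-21,
}


def is_significant(p_value, threshold):
    """True when the P value falls below the corrected threshold."""
    return threshold > p_value


verdicts = {pair: is_significant(p, corrected_alpha) for pair, p in reported_p.items()}

print(f"corrected threshold: {corrected_alpha:.4f}")
for (first, second), significant in verdicts.items():
    label = "significant" if significant else "not significant"
    print(f"{first} vs. {second}: {label}")
```
</preformat>
<p>Under this assumption the corrected threshold is 0.05/6, about 0.0083, which is why the NC(HS) vs. C(HS) comparison at 4.62E-03 is described as only just significant.</p>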
</sec>
<sec>
<title>Identifying Target Sites</title>
<p>For regions with five or more consecutive sites that were 100% conserved in sequence across 1) all CoVs or 2) the MERS and SARS clades, the structural disorder predictions from IUPred and DISOPRED2 were used to identify all ungapped sites that were consistently predicted to be ordered in every sequence. Next, the secondary structure predictions from PSIPRED and JPred were used to narrow this list further by retaining only sites whose predicted secondary structure state does not change for either predictor. Applying these filters to the more than 10,000 starting sites left a single region of five or more residues conserved across all CoVs, located within the N-terminal domain of NSP12: DNQDL (
<xref ref-type="table" rid="evw246-T4">table 4</xref>
). Interestingly, this region is in the vicinity of sites found to be important for nucleotidylating activity across the order
<italic>Nidovirales</italic>
(
<xref rid="evw246-B27" ref-type="bibr">Lehmann et al. 2015</xref>
).
<table-wrap id="evw246-T4" orientation="portrait" position="float">
<label>Table 4</label>
<caption>
<p>Sites Conserved in Sequence and Structural Property</p>
</caption>
<table frame="hsides" rules="groups">
<colgroup span="1">
<col valign="top" align="left" span="1"></col>
<col valign="top" align="left" span="1"></col>
<col valign="top" align="left" span="1"></col>
</colgroup>
<thead align="left">
<tr>
<th rowspan="1" colspan="1">Protein family</th>
<th rowspan="1" colspan="1">PfamA domain</th>
<th rowspan="1" colspan="1">Conserved sites in the MSA
<xref ref-type="table-fn" rid="evw246-TF4">
<sup>a</sup>
</xref>
</th>
</tr>
</thead>
<tbody align="left">
<tr>
<td rowspan="2" align="left" colspan="1">NSP5</td>
<td rowspan="2" align="left" colspan="1">
<italic>Peptidase_C30</italic>
</td>
<td align="left" rowspan="1" colspan="1">149-GSCGS-153
<xref ref-type="table-fn" rid="evw246-TF5">
<sup>b</sup>
</xref>
</td>
</tr>
<tr>
<td align="left" rowspan="1" colspan="1">213-AWLYAA-218
<xref ref-type="table-fn" rid="evw246-TF5">
<sup>b</sup>
</xref>
</td>
</tr>
<tr>
<td rowspan="2" align="left" colspan="1">NSP7</td>
<td rowspan="2" align="left" colspan="1">
<italic>Replicase</italic>
</td>
<td align="left" rowspan="1" colspan="1">7-KCTSVVLL-14
<xref ref-type="table-fn" rid="evw246-TF5">
<sup>b</sup>
</xref>
</td>
</tr>
<tr>
<td align="left" rowspan="1" colspan="1">16-VLQQL-20
<xref ref-type="table-fn" rid="evw246-TF5">
<sup>b</sup>
</xref>
</td>
</tr>
<tr>
<td rowspan="2" align="left" colspan="1">NSP12</td>
<td rowspan="2" align="left" colspan="1">
<italic>RPol N-term</italic>
</td>
<td align="left" rowspan="1" colspan="1">228-L
<bold>
<underline>DNQDL</underline>
</bold>
NG-235</td>
</tr>
<tr>
<td align="left" rowspan="1" colspan="1">239-DFGDF-243</td>
</tr>
<tr>
<td rowspan="4" colspan="1"></td>
<td rowspan="4" align="left" colspan="1">
<italic>RdRP_1</italic>
</td>
<td align="left" rowspan="1" colspan="1">521-DKSAG-525</td>
</tr>
<tr>
<td align="left" rowspan="1" colspan="1">588-MTNRQ-592</td>
</tr>
<tr>
<td align="left" rowspan="1" colspan="1">677-LANECAQVL-685</td>
</tr>
<tr>
<td align="left" rowspan="1" colspan="1">800-
<bold>
<underline>GGT</underline>
</bold>
SSGD-806</td>
</tr>
<tr>
<td rowspan="3" colspan="1"></td>
<td rowspan="3" align="left" colspan="1">
<italic>C-term</italic>
</td>
<td align="left" rowspan="1" colspan="1">853-YPDPSR-858</td>
</tr>
<tr>
<td align="left" rowspan="1" colspan="1">871-KTDGT-875</td>
</tr>
<tr>
<td align="left" rowspan="1" colspan="1">889-
<bold>
<underline>YPL</underline>
</bold>
TK-893</td>
</tr>
<tr>
<td align="left" rowspan="1" colspan="1">NSP13</td>
<td align="left" rowspan="1" colspan="1">
<italic>N-term</italic>
</td>
<td align="left" rowspan="1" colspan="1">10-SQTSLR-15</td>
</tr>
<tr>
<td rowspan="2" align="left" colspan="1"></td>
<td rowspan="2" align="left" colspan="1">
<italic>AAA_30</italic>
</td>
<td align="left" rowspan="1" colspan="1">362-NALPE-366</td>
</tr>
<tr>
<td align="left" rowspan="1" colspan="1">402-DPAQLP-407</td>
</tr>
<tr>
<td align="left" rowspan="1" colspan="1"></td>
<td align="left" rowspan="1" colspan="1">
<italic>AAA_12</italic>
</td>
<td align="left" rowspan="1" colspan="1">539-SSQGS-543</td>
</tr>
<tr>
<td rowspan="4" colspan="1">NSP14</td>
<td rowspan="4" align="left" colspan="1">
<italic>NSP11</italic>
</td>
<td align="left" rowspan="1" colspan="1">281-AHVAS-285
<xref ref-type="table-fn" rid="evw246-TF5">
<sup>b</sup>
</xref>
</td>
</tr>
<tr>
<td align="left" rowspan="1" colspan="1">290-MTRCLA-295
<xref ref-type="table-fn" rid="evw246-TF5">
<sup>b</sup>
</xref>
</td>
</tr>
<tr>
<td align="left" rowspan="1" colspan="1">438-HAFHT-442
<xref ref-type="table-fn" rid="evw246-TF5">
<sup>b</sup>
</xref>
</td>
</tr>
<tr>
<td align="left" rowspan="1" colspan="1">494-CNLGG-499
<xref ref-type="table-fn" rid="evw246-TF5">
<sup>b</sup>
</xref>
</td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<fn id="evw246-TF4">
<label>a</label>
<p>Sites conserved across all clades in the protein family are underlined and in bold font. All other sites are conserved across the SARS and MERS clades.</p>
</fn>
<fn id="evw246-TF5">
<label>b</label>
<p>Experimentally determined structures are available in Protein Data Bank (
<xref rid="evw246-B2" ref-type="bibr">Berman 2000</xref>
).</p>
</fn>
</table-wrap-foot>
</table-wrap>
</p>
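<p>The filtering steps described above can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the authors' implementation: all three criteria are applied per alignment column before runs of at least five consecutive passing columns are reported, disorder is encoded as O (ordered) or D (disordered), secondary structure as H, E, or C, and each predictor contributes its own list of aligned strings. The toy alignment, the prediction tracks, and every function name are invented for illustration.</p>
<preformat>
```python
def conserved_columns(rows):
    """Flag columns where every row carries the same non-gap symbol."""
    flags = []
    for column in zip(*rows):
        symbols = set(column)
        flags.append(len(symbols) == 1 and "-" not in symbols)
    return flags


def ordered_columns(rows):
    """Flag columns predicted as ordered ("O") in every row."""
    return [set(column) == {"O"} for column in zip(*rows)]


def all_tracks(check, tracks):
    """A column passes only if it passes the check for every predictor."""
    per_track = [check(rows) for rows in tracks]
    return [all(flags) for flags in zip(*per_track)]


def find_target_regions(seqs, disorder_tracks, secondary_tracks, min_len=5):
    """Return (start, end, motif) for runs of min_len or more passing columns."""
    seq_ok = conserved_columns(seqs)
    order_ok = all_tracks(ordered_columns, disorder_tracks)
    sec_ok = all_tracks(conserved_columns, secondary_tracks)
    keep = [a and b and c for a, b, c in zip(seq_ok, order_ok, sec_ok)]

    regions = []
    start = None
    for i, flag in enumerate(keep + [False]):   # sentinel closes a trailing run
        if flag and start is None:
            start = i
        elif not flag and start is not None:
            if i - start >= min_len:
                regions.append((start, i, seqs[0][start:i]))
            start = None
    return regions


# Toy data, invented for illustration (0-based, end-exclusive coordinates).
seqs = ["MKLDNQDLNGAT", "MRLDNQDLNGSS", "MKIDNQDLNGAT"]
disorder_tracks = [
    ["OOOOOOOOOOOO", "OOOOOOOOOOOO", "OOOOOOOOOOOO"],   # predictor 1
    ["OOOOOOOOOOOO", "OOOOOOOOODDD", "OOOOOOOOOOOO"],   # predictor 2
]
secondary_tracks = [
    ["CCCHHHHHCCCC", "CCCHHHHHCCCC", "CCCHHHHHCCCC"],   # predictor 1
    ["CCCHHHHHCCCC", "CCCHHHHHCCCC", "CCCHHHHHECCC"],   # predictor 2
]

regions = find_target_regions(seqs, disorder_tracks, secondary_tracks)
print(regions)
```
</preformat>
<p>In the toy data, the sequence-conserved stretch is trimmed by one column with a nonconserved secondary structure state and one column predicted as disordered in a single sequence, which illustrates how each filter shortens the candidate regions.</p>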
<p>Considering only the sequences in the SARS and MERS clades, 21 sequence regions of five residues or more were found in seven protein families (
<xref ref-type="table" rid="evw246-T4">table 4</xref>
). For NSP5, NSP7, and NSP14, experimentally determined structures show that most regions are surface accessible (
<xref ref-type="fig" rid="evw246-F6">fig. 6</xref>
). Some of the identified target sites are known for their functional importance. For instance, C145 in the middle of GSCGS in NSP5 is part of the catalytic dyad in the NSP5 protease (
<xref rid="evw246-B50" ref-type="bibr">Yang et al. 2003</xref>
). For NSP12 and NSP13, which have the majority of all sites, no structures are available. The sites adjacent to DNQDL are also conserved in the SARS and MERS clades, and five additional target sites, conserved for the SARS and MERS clades, are found in the C-terminal direction relative to the DNQDL motif (
<xref ref-type="table" rid="evw246-T4">table 4</xref>
). Continuing into the RNA-dependent RNA polymerase (RdRP) domain of NSP12, four additional regions of target sites are found, and the last three regions lie in the C-terminal part. Importantly, both the RdRP domain and the C-terminal part contain sites that are also conserved across all CoVs in this study. NSP13 has four regions of target sites distributed across the protein.
<fig id="evw246-F6" orientation="portrait" position="float">
<label>F
<sc>ig</sc>
. 6.—</label>
<caption>
<p>Target sites shown in 3D context.
<bold>(</bold>
<italic>A</italic>
) NSP5 dimer, based on PDB id 1UK4 (
<xref rid="evw246-B50" ref-type="bibr">Yang et al. 2003</xref>
). (
<italic>B</italic>
) NSP7, based on PDB id 5F22 (unpublished). (
<italic>C</italic>
) NSP14, based on PDB id 5C8T (
<xref rid="evw246-B30" ref-type="bibr">Ma et al. 2015</xref>
). Protein structures were visualized with BIOVIA Discovery Studio.</p>
</caption>
<graphic xlink:href="evw246f6p"></graphic>
</fig>
</p>
</sec>
</sec>
<sec sec-type="discussion">
<title>Discussion</title>
<p>We have analyzed the protein evolution of the genetic components that make up the MERS-CoV proteome. As previously established, MERS-CoV has the same genomic makeup as HKU4-CoV and HKU5-CoV in the MERS clade (
<xref rid="evw246-B47" ref-type="bibr">Woo et al. 2012</xref>
). Some protein products are only found in the MERS clade, and these were excluded from this study due to insufficient data. Furthermore, for other protein products, some clades may not be represented in our protein families if their proteins were too divergent. This was an important factor in determining the applied BLAST hit cutoffs, as relaxing the cutoffs produced alignments with more gaps and increasing the stringency reduced the representative pool. Because both Rate4Site and phylogenetic reconstruction are sensitive to alignment quality, we consider the chosen cutoffs suitable. We note some clade-specific differences in recoverable homologs between different CoVs, but many components are shared among them (
<xref ref-type="fig" rid="evw246-F1">fig. 1</xref>
).</p>
<p>Viral proteins often possess multifunctionality, mediated by a conformational change in response to environment-specific factors (
<xref rid="evw246-B49" ref-type="bibr">Xue et al. 2014</xref>
). Although conformational flexibility is important for function, it also offers flexibility in what sequence motifs are on display. If these sequences are rapidly diverging, different sequence motifs will be displayed, reinforcing the notion that flexible regions are potentially important in rewiring protein–protein interactions between virus and host (
<xref rid="evw246-B18" ref-type="bibr">Gitlin et al. 2014</xref>
). Although most CoV proteins have almost no intrinsic disorder, several CoV protein families have homologous sites that display loop in some sequences, helix in others, and strand in still others (
<xref ref-type="table" rid="evw246-T3">table 3</xref>
,
<ext-link ext-link-type="uri" xlink:href="http://gbe.oxfordjournals.org/lookup/suppl/doi:10.1093/gbe/evw246/-/DC1">supplementary fig. S3</ext-link>
,
<ext-link ext-link-type="uri" xlink:href="http://gbe.oxfordjournals.org/lookup/suppl/doi:10.1093/gbe/evw246/-/DC1">Supplementary Material</ext-link>
online). These sites are not necessarily disordered, but they may be conformationally flexible in real time (with secondary structure transitions in the same sequence, making them difficult to predict) or on evolutionary timescales (so that different secondary structure elements are actually present in different sequences). The C(HS) and C(L) sites make up approximately 50–80% of most multiple sequence alignments. Given the common expectation that protein structure is more conserved than sequence, these numbers are surprisingly low. For the remaining 20–50% of sites in these multiple sequence alignments, neither PSIPRED nor JPred consistently predicts the same state across sequences.</p>
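<p>The site categories discussed here can be illustrated with a small Python sketch that classifies alignment columns of three-state predictions and computes the share of conserved sites. The category rules follow the descriptions in the Results; the one-letter codes (H, E, C), the decision to ignore gaps, the toy predictions, and the function names are assumptions made for illustration.</p>
<preformat>
```python
from collections import Counter


def classify_site(column):
    """Assign one alignment column of three-state predictions to a category."""
    states = set(column) - {"-"}      # assumption: gaps are ignored
    if not states:
        return "gap only"
    if states == {"C"}:
        return "C(L)"                 # loop in every sequence
    if "C" in states:
        return "NC(HLS)"              # loop mixed with helix and/or strand
    if len(states) == 1:
        return "C(HS)"                # helix everywhere or strand everywhere
    return "NC(HS)"                   # helix in some sequences, strand in others


def conserved_fraction(counts):
    """Share of sites in the two conserved categories, C(HS) and C(L)."""
    conserved = counts.get("C(HS)", 0) + counts.get("C(L)", 0)
    return conserved / sum(counts.values())


# Toy aligned predictions, invented for illustration.
toy_predictions = ["HHCCE", "HECCC", "HHC-E"]
toy_counts = Counter(classify_site(col) for col in zip(*toy_predictions))

# PSIPRED site counts pooled over all 17 protein families, as quoted in the Results.
psipred_counts = {"C(HS)": 2275, "C(L)": 3344, "NC(HS)": 245, "NC(HLS)": 4257}
pooled_fraction = conserved_fraction(psipred_counts)

print(dict(toy_counts))
print(f"pooled conserved fraction: {pooled_fraction:.3f}")
```
</preformat>
<p>With the pooled PSIPRED counts, the two conserved categories account for about 55.5% of 10,121 sites, toward the lower end of the 50–80% range given for individual alignments.</p>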
<p>The accuracy of PSIPRED and JPred secondary structure predictions is about 80% (
<xref rid="evw246-B6" ref-type="bibr">Bryson et al. 2005</xref>
;
<xref rid="evw246-B13" ref-type="bibr">Drozdetskiy et al. 2015</xref>
). PSIPRED has been found to rarely predict an alpha helix instead of a beta strand and vice versa, and most of the PSIPRED errors are due to secondary structure not being predicted (
<xref rid="evw246-B28" ref-type="bibr">Li et al. 2014</xref>
). When secondary structure is not conserved for the same site in a multiple sequence alignment, it suggests that 1) the secondary structure prediction is inaccurate, 2) the prediction is not made with high confidence, or 3) the regions are indeed metamorphic, that is, they can transition from one element to another. Although (1) is difficult to address without experimentally determined structures for all sequences, (2) and (3) are not necessarily incompatible interpretations because low-confidence secondary structure prediction could indicate metamorphic secondary structure regions. Metamorphic secondary structure regions have interesting consequences for conformational and functional flexibility.</p>
<p>It should be noted that, despite the low number of disordered sites in most CoV proteins, several regions are not conserved in disorder propensity across all sequences, although the different predictors sometimes disagree, as in the case of NSP8. That clade-specific disordered regions result from indel events suggests that they are not essential to the critical functions of the protein but could cause gain and loss of interactions with the host. However, when disorder propensity is only mildly fading for a region that is present across the protein family, the region may be important for the fundamental function of the protein. The virus structural proteins that interact to form the virion commonly include an envelope protein, a membrane protein, and a capsid protein that together form the machinery that encases, transports, and releases the virus. The interactions between the structural proteins are often regulated by conformational changes, as for VP40 in Ebola virus (
<xref rid="evw246-B4" ref-type="bibr">Bornholdt et al. 2013</xref>
) and Envelope protein from Dengue virus (
<xref rid="evw246-B53" ref-type="bibr">Zheng et al. 2014</xref>
). Conformational changes in these proteins are needed for the virus life cycle. For CoVs, nucleocapsid is the only structural protein that is highly disordered. Yet rapid evolutionary dynamics of disorder are detected in nucleocapsid with two different IUPred cutoffs (0.4 and 0.5) and with DISOPRED2. Even if the different predictors and cutoffs disagree somewhat on where regions with rapid evolutionary dynamics are located, these patterns suggest that nucleocapsid may be rapidly changing from one virus to another. It should also be noted that two MERS clade-specific inserts, around position 241 and toward the C-terminus, are consistently predicted to be highly disordered. With inserts and changing structural dynamics between clades or viruses, the questions become 1) which sequence motifs are displayed and 2) to what extent these sequence motifs are displayed.</p>
<p>Furthermore, based on the inconsistent prediction of secondary structure elements, the possibility that CoVs are more conformationally flexible than their intrinsic disorder content implies is noteworthy. Altogether, this suggests that various mechanisms for rewiring conformational and functional space are operating in the coronaviruses studied here. If regions symptomatic of conformational and functional flexibility can be avoided in order to identify broad-specificity antiviral targets with the potential to be effective against coronaviruses of today and of the future, coronaviruses as a group may become more attractive drug targets for the pharmaceutical industry in the event that an additional coronavirus expands its host range to include humans or increases its virulence.</p>
</sec>
<sec sec-type="supplementary-material">
<title>Supplementary Material</title>
<supplementary-material id="PMC_1" content-type="local-data">
<caption>
<title>Supplementary Data</title>
</caption>
<media mimetype="text" mime-subtype="html" xlink:href="supp_8_11_3471__index.html"></media>
<media xlink:role="associated-file" mimetype="application" mime-subtype="pdf" xlink:href="supp_evw246_Accessions.pdf"></media>
<media xlink:role="associated-file" mimetype="application" mime-subtype="pdf" xlink:href="supp_evw246_supplementary_figs1-5_reduced.pdf"></media>
</supplementary-material>
</sec>
</body>
<back>
<ack>
<title>Acknowledgments</title>
<p>We thank Joseph Ahrens, Janelle Nunez-Castilla, and Helena Gomes Dos Santos for assistance in the lab and for helpful discussions. The authors would also like to acknowledge the Instructional &amp; Research Computing Center (IRCC) at Florida International University for providing HPC computing resources that have contributed to the research results reported within this article, web:
<ext-link ext-link-type="uri" xlink:href="http://ircc.fiu.edu">http://ircc.fiu.edu</ext-link>
.</p>
</ack>
<sec sec-type="materials">
<title>Supplementary Material</title>
<p>
<ext-link ext-link-type="uri" xlink:href="http://gbe.oxfordjournals.org/lookup/suppl/doi:10.1093/gbe/evw246/-/DC1">Supplementary tables S1 and S2</ext-link>
and
<ext-link ext-link-type="uri" xlink:href="http://gbe.oxfordjournals.org/lookup/suppl/doi:10.1093/gbe/evw246/-/DC1">figures S1–S5</ext-link>
are available at Genome Biology and Evolution online (
<ext-link ext-link-type="uri" xlink:href="http://www.gbe.oxfordjournals.org/">http://www.gbe.oxfordjournals.org/</ext-link>
).</p>
</sec>
<ref-list>
<title>Literature Cited</title>
<ref id="evw246-B1">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Anderson</surname>
<given-names>LJ</given-names>
</name>
<name>
<surname>Tong</surname>
<given-names>S.</given-names>
</name>
</person-group>
<year>2010</year>
<article-title>Update on SARS research and other possibly zoonotic coronaviruses</article-title>
.
<source>Int J Antimicrob Agents</source>
<volume>36 Suppl 1</volume>
:
<fpage>S21</fpage>
<lpage>S25</lpage>
.
<pub-id pub-id-type="pmid">20801001</pub-id>
</mixed-citation>
</ref>
<ref id="evw246-B2">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Berman</surname>
<given-names>HM.</given-names>
</name>
</person-group>
<year>2000</year>
<article-title>The Protein Data Bank</article-title>
.
<source>Nucleic Acids Res.</source>
<volume>28</volume>
:
<fpage>235</fpage>
<lpage>242</lpage>
.
<pub-id pub-id-type="pmid">10592235</pub-id>
</mixed-citation>
</ref>
<ref id="evw246-B3">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Bermingham</surname>
<given-names>A</given-names>
</name>
</person-group>
,
<etal></etal>
<year>2012</year>
<article-title>Severe respiratory illness caused by a novel coronavirus, in a patient transferred to the United Kingdom from the Middle East, September 2012</article-title>
.
<source>Euro Surveill</source>
.
<volume>17</volume>
:
<fpage>20290</fpage>
.
<pub-id pub-id-type="pmid">23078800</pub-id>
</mixed-citation>
</ref>
<ref id="evw246-B4">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Bornholdt</surname>
<given-names>ZA</given-names>
</name>
</person-group>
,
<etal></etal>
<year>2013</year>
<article-title>Structural rearrangement of ebola virus VP40 begets multiple functions in the virus life cycle</article-title>
.
<source>Cell</source>
<volume>154</volume>
:
<fpage>763</fpage>
<lpage>774</lpage>
.
<pub-id pub-id-type="pmid">23953110</pub-id>
</mixed-citation>
</ref>
<ref id="evw246-B5">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Le Breton</surname>
<given-names>M</given-names>
</name>
</person-group>
,
<etal></etal>
<year>2011</year>
<article-title>Flavivirus NS3 and NS5 proteins interaction network: a high-throughput yeast two-hybrid screen</article-title>
.
<source>BMC Microbiol.</source>
<volume>11</volume>
:
<fpage>234</fpage>
.
<pub-id pub-id-type="pmid">22014111</pub-id>
</mixed-citation>
</ref>
<ref id="evw246-B6">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Bryson</surname>
<given-names>K</given-names>
</name>
</person-group>
,
<etal></etal>
<year>2005</year>
<article-title>Protein structure prediction servers at University College London</article-title>
.
<source>Nucleic Acids Res.</source>
<volume>33</volume>
:
<fpage>W36</fpage>
<lpage>W38</lpage>
.
<pub-id pub-id-type="pmid">15980489</pub-id>
</mixed-citation>
</ref>
<ref id="evw246-B7">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Campen</surname>
<given-names>A</given-names>
</name>
</person-group>
,
<etal></etal>
<year>2008</year>
<article-title>TOP-IDP-scale: a new amino acid scale measuring propensity for intrinsic disorder</article-title>
.
<source>Protein Pept Lett.</source>
<volume>15</volume>
:
<fpage>956</fpage>
<lpage>963</lpage>
.
<pub-id pub-id-type="pmid">18991772</pub-id>
</mixed-citation>
</ref>
<ref id="evw246-B8">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Cohen</surname>
<given-names>O</given-names>
</name>
<name>
<surname>Ashkenazy</surname>
<given-names>H</given-names>
</name>
<name>
<surname>Belinky</surname>
<given-names>F</given-names>
</name>
<name>
<surname>Huchon</surname>
<given-names>D</given-names>
</name>
<name>
<surname>Pupko</surname>
<given-names>T.</given-names>
</name>
</person-group>
<year>2010</year>
<article-title>GLOOME: gain loss mapping engine</article-title>
.
<source>Bioinformatics</source>
<volume>26</volume>
:
<fpage>2914</fpage>
<lpage>2915</lpage>
.
<pub-id pub-id-type="pmid">20876605</pub-id>
</mixed-citation>
</ref>
<ref id="evw246-B9">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Cohen</surname>
<given-names>O</given-names>
</name>
<name>
<surname>Pupko</surname>
<given-names>T.</given-names>
</name>
</person-group>
<year>2010</year>
<article-title>Inference and characterization of horizontally transferred gene families using stochastic mapping</article-title>
.
<source>Mol Biol Evol.</source>
<volume>27</volume>
:
<fpage>703</fpage>
<lpage>713</lpage>
.
<pub-id pub-id-type="pmid">19808865</pub-id>
</mixed-citation>
</ref>
<ref id="evw246-B10">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>de Groot</surname>
<given-names>RJ</given-names>
</name>
</person-group>
,
<etal></etal>
<year>2013</year>
<article-title>Middle East respiratory syndrome coronavirus (MERS-CoV): announcement of the Coronavirus Study Group</article-title>
.
<source>J Virol</source>
<volume>87</volume>
:
<fpage>7790</fpage>
<lpage>7792</lpage>
.
<pub-id pub-id-type="pmid">23678167</pub-id>
</mixed-citation>
</ref>
<ref id="evw246-B11">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Dosztányi</surname>
<given-names>Z</given-names>
</name>
<name>
<surname>Csizmok</surname>
<given-names>V</given-names>
</name>
<name>
<surname>Tompa</surname>
<given-names>P</given-names>
</name>
<name>
<surname>Simon</surname>
<given-names>I.</given-names>
</name>
</person-group>
<year>2005a</year>
<article-title>IUPred: web server for the prediction of intrinsically unstructured regions of proteins based on estimated energy content</article-title>
.
<source>Bioinformatics</source>
<volume>21</volume>
:
<fpage>3433</fpage>
<lpage>3434</lpage>
.
<pub-id pub-id-type="pmid">15955779</pub-id>
</mixed-citation>
</ref>
<ref id="evw246-B12">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Dosztányi</surname>
<given-names>Z</given-names>
</name>
<name>
<surname>Csizmók</surname>
<given-names>V</given-names>
</name>
<name>
<surname>Tompa</surname>
<given-names>P</given-names>
</name>
<name>
<surname>Simon</surname>
<given-names>I.</given-names>
</name>
</person-group>
<year>2005b</year>
<article-title>The pairwise energy content estimated from amino acid composition discriminates between folded and intrinsically unstructured proteins</article-title>
.
<source>J Mol Biol.</source>
<volume>347</volume>
:
<fpage>827</fpage>
<lpage>839</lpage>
.
<pub-id pub-id-type="pmid">15769473</pub-id>
</mixed-citation>
</ref>
<ref id="evw246-B13">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Drozdetskiy</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Cole</surname>
<given-names>C</given-names>
</name>
<name>
<surname>Procter</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Barton</surname>
<given-names>GJ.</given-names>
</name>
</person-group>
<year>2015</year>
<article-title>JPred4: a protein secondary structure prediction server</article-title>
.
<source>Nucleic Acids Res.</source>
<volume>43</volume>
:
<fpage>W389</fpage>
<lpage>W394</lpage>
.
<pub-id pub-id-type="pmid">25883141</pub-id>
</mixed-citation>
</ref>
<ref id="evw246-B14">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Fehr</surname>
<given-names>AR</given-names>
</name>
<name>
<surname>Perlman</surname>
<given-names>S.</given-names>
</name>
</person-group>
<year>2015</year>
<article-title>Coronaviruses: an overview of their replication and pathogenesis</article-title>
.
<source>Methods Mol Biol.</source>
<volume>1282</volume>
:
<fpage>1</fpage>
<lpage>23</lpage>
.
<pub-id pub-id-type="pmid">25720466</pub-id>
</mixed-citation>
</ref>
<ref id="evw246-B15">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Flipse</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Smit</surname>
<given-names>JM.</given-names>
</name>
</person-group>
<year>2015</year>
<article-title>The Complexity of a Dengue Vaccine: A Review of the Human Antibody Response</article-title>
.
<source>PLoS Negl Trop Dis</source>
.
<volume>9</volume>
:
<fpage>e0003749</fpage>
.
<pub-id pub-id-type="pmid">26065421</pub-id>
</mixed-citation>
</ref>
<ref id="evw246-B16">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Giles</surname>
<given-names>BM</given-names>
</name>
<name>
<surname>Ross</surname>
<given-names>TM.</given-names>
</name>
</person-group>
<year>2011</year>
<article-title>A computationally optimized broadly reactive antigen (COBRA) based H5N1 VLP vaccine elicits broadly reactive antibodies in mice and ferrets</article-title>
.
<source>Vaccine</source>
<volume>29</volume>
:
<fpage>3043</fpage>
<lpage>3054</lpage>
.
<pub-id pub-id-type="pmid">21320540</pub-id>
</mixed-citation>
</ref>
<ref id="evw246-B17">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Giles</surname>
<given-names>BM</given-names>
</name>
<name>
<surname>Ross</surname>
<given-names>TM.</given-names>
</name>
</person-group>
<year>2012</year>
<article-title>Computationally optimized antigens to overcome influenza viral diversity</article-title>
.
<source>Expert Rev Vaccines</source>
<volume>11</volume>
:
<fpage>267</fpage>
<lpage>269</lpage>
.
<pub-id pub-id-type="pmid">22380818</pub-id>
</mixed-citation>
</ref>
<ref id="evw246-B18">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Gitlin</surname>
<given-names>L</given-names>
</name>
<name>
<surname>Hagai</surname>
<given-names>T</given-names>
</name>
<name>
<surname>LaBarbera</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Solovey</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Andino</surname>
<given-names>R.</given-names>
</name>
</person-group>
<year>2014</year>
<article-title>Rapid evolution of virus sequences in intrinsically disordered protein regions</article-title>
.
<source>PLoS Pathog</source>
<volume>10</volume>
:
<fpage>e1004529</fpage>
.
<pub-id pub-id-type="pmid">25502394</pub-id>
</mixed-citation>
</ref>
<ref id="evw246-B19">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Gralinski</surname>
<given-names>LE</given-names>
</name>
<name>
<surname>Baric</surname>
<given-names>RS.</given-names>
</name>
</person-group>
<year>2015</year>
<article-title>Molecular pathology of emerging coronavirus infections</article-title>
.
<source>J Pathol</source>
.
<volume>235</volume>
:
<fpage>185</fpage>
<lpage>195</lpage>
.
<pub-id pub-id-type="pmid">25270030</pub-id>
</mixed-citation>
</ref>
<ref id="evw246-B20">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Huelsenbeck</surname>
<given-names>JP</given-names>
</name>
<name>
<surname>Ronquist</surname>
<given-names>F.</given-names>
</name>
</person-group>
<year>2001</year>
<article-title>MRBAYES: Bayesian inference of phylogenetic trees</article-title>
.
<source>Bioinformatics</source>
<volume>17</volume>
:
<fpage>754</fpage>
<lpage>755</lpage>
.
<pub-id pub-id-type="pmid">11524383</pub-id>
</mixed-citation>
</ref>
<ref id="evw246-B21">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Huerta-Cepas</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Serra</surname>
<given-names>F</given-names>
</name>
<name>
<surname>Bork</surname>
<given-names>P.</given-names>
</name>
</person-group>
<year>2016</year>
<article-title>ETE 3: Reconstruction, analysis, and visualization of phylogenomic data</article-title>
.
<source>Mol Biol Evol.</source>
<volume>33</volume>
:
<fpage>1635</fpage>
<lpage>1638</lpage>
. doi: 10.1093/molbev/msw046.
<pub-id pub-id-type="pmid">26921390</pub-id>
</mixed-citation>
</ref>
<ref id="evw246-B22">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Hunter</surname>
<given-names>JD.</given-names>
</name>
</person-group>
<year>2007</year>
<article-title>Matplotlib: a 2D graphics environment</article-title>
.
<source>Comput Sci Eng</source>
.
<volume>9</volume>
:
<fpage>90</fpage>
<lpage>95</lpage>
.</mixed-citation>
</ref>
<ref id="evw246-B23">
<mixed-citation publication-type="other">
<collab>International Committee on Taxonomy of Viruses</collab>
.
<year>2015</year>
. Virus taxonomy: classification and nomenclature of viruses: Ninth Report of the International Committee on Taxonomy of Viruses. (2012) Ed: King, A.M.Q., Adams, M.J., Carstens, E.B. and Lefkowitz, E.J. San Diego: Elsevier Academic Press.</mixed-citation>
</ref>
<ref id="evw246-B24">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Jones</surname>
<given-names>DT</given-names>
</name>
<name>
<surname>Taylor</surname>
<given-names>WR</given-names>
</name>
<name>
<surname>Thornton</surname>
<given-names>JM.</given-names>
</name>
</person-group>
<year>1992</year>
<article-title>The rapid generation of mutation data matrices from protein sequences</article-title>
.
<source>Comput. Appl. Biosci</source>
<volume>8</volume>
:
<fpage>275</fpage>
<lpage>282</lpage>
.
<pub-id pub-id-type="pmid">1633570</pub-id>
</mixed-citation>
</ref>
<ref id="evw246-B25">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Katoh</surname>
<given-names>K</given-names>
</name>
<name>
<surname>Misawa</surname>
<given-names>K</given-names>
</name>
<name>
<surname>Kuma</surname>
<given-names>K</given-names>
</name>
<name>
<surname>Miyata</surname>
<given-names>T.</given-names>
</name>
</person-group>
<year>2002</year>
<article-title>MAFFT: a novel method for rapid multiple sequence alignment based on fast Fourier transform</article-title>
.
<source>Nucleic Acids Res.</source>
<volume>30</volume>
:
<fpage>3059</fpage>
<lpage>3066</lpage>
.
<pub-id pub-id-type="pmid">12136088</pub-id>
</mixed-citation>
</ref>
<ref id="evw246-B26">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Kesturu</surname>
<given-names>GS</given-names>
</name>
</person-group>
,
<etal></etal>
<year>2006</year>
<article-title>Minimization of genetic distances by the consensus, ancestral, and center-of-tree (COT) sequences for HIV-1 variants within an infected individual and the design of reagents to test immune reactivity</article-title>
.
<source>Virology</source>
<volume>348</volume>
:
<fpage>437</fpage>
<lpage>448</lpage>
.
<pub-id pub-id-type="pmid">16545415</pub-id>
</mixed-citation>
</ref>
<ref id="evw246-B27">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Lehmann</surname>
<given-names>KC</given-names>
</name>
</person-group>
,
<etal></etal>
<year>2015</year>
<article-title>Discovery of an essential nucleotidylating activity associated with a newly delineated conserved domain in the RNA polymerase-containing protein of all nidoviruses</article-title>
.
<source>Nucleic Acids Res.</source>
<volume>43</volume>
:
<fpage>8416</fpage>
<lpage>8434</lpage>
.
<pub-id pub-id-type="pmid">26304538</pub-id>
</mixed-citation>
</ref>
<ref id="evw246-B28">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Li</surname>
<given-names>Q</given-names>
</name>
<name>
<surname>Dahl</surname>
<given-names>DB</given-names>
</name>
<name>
<surname>Vannucci</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Hyun</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Tsai</surname>
<given-names>JW.</given-names>
</name>
</person-group>
<year>2014</year>
<article-title>Bayesian model of protein primary sequence for secondary structure prediction</article-title>
.
<source>PLoS One</source>
<volume>9</volume>
:
<fpage>e109832</fpage>
.
<pub-id pub-id-type="pmid">25314659</pub-id>
</mixed-citation>
</ref>
<ref id="evw246-B29">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Lu</surname>
<given-names>G</given-names>
</name>
</person-group>
,
<etal></etal>
<year>2015</year>
<article-title>Bat-to-human: spike features determining ‘host jump’ of coronaviruses SARS-CoV, MERS-CoV, and beyond</article-title>
.
<source>Trends Microbiol.</source>
<volume>23</volume>
:
<fpage>468</fpage>
<lpage>478</lpage>
.
<pub-id pub-id-type="pmid">26206723</pub-id>
</mixed-citation>
</ref>
<ref id="evw246-B30">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Ma</surname>
<given-names>Y</given-names>
</name>
</person-group>
,
<etal></etal>
<year>2015</year>
<article-title>Structural basis and functional analysis of the SARS coronavirus nsp14-nsp10 complex</article-title>
.
<source>Proc Natl Acad Sci U S A.</source>
<volume>112</volume>
:
<fpage>9436</fpage>
<lpage>9441</lpage>
.
<pub-id pub-id-type="pmid">26159422</pub-id>
</mixed-citation>
</ref>
<ref id="evw246-B31">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Mayrose</surname>
<given-names>I</given-names>
</name>
<name>
<surname>Graur</surname>
<given-names>D</given-names>
</name>
<name>
<surname>Ben-Tal</surname>
<given-names>N</given-names>
</name>
<name>
<surname>Pupko</surname>
<given-names>T.</given-names>
</name>
</person-group>
<year>2004</year>
<article-title>Comparison of site-specific rate-inference methods for protein sequences: empirical Bayesian methods are superior</article-title>
.
<source>Mol Biol Evol.</source>
<volume>21</volume>
:
<fpage>1781</fpage>
<lpage>1791</lpage>
.
<pub-id pub-id-type="pmid">15201400</pub-id>
</mixed-citation>
</ref>
<ref id="evw246-B32">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>McCloskey</surname>
<given-names>RM</given-names>
</name>
<name>
<surname>Liang</surname>
<given-names>RH</given-names>
</name>
<name>
<surname>Harrigan</surname>
<given-names>PR</given-names>
</name>
<name>
<surname>Brumme</surname>
<given-names>ZL</given-names>
</name>
<name>
<surname>Poon</surname>
<given-names>AFY.</given-names>
</name>
</person-group>
<year>2014</year>
<article-title>An evaluation of phylogenetic methods for reconstructing transmitted HIV variants using longitudinal clonal HIV sequence data</article-title>
.
<source>J Virol</source>
.
<volume>88</volume>
:
<fpage>6181</fpage>
<lpage>6194</lpage>
.
<pub-id pub-id-type="pmid">24648453</pub-id>
</mixed-citation>
</ref>
<ref id="evw246-B33">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>McGuffin</surname>
<given-names>LJ</given-names>
</name>
<name>
<surname>Bryson</surname>
<given-names>K</given-names>
</name>
<name>
<surname>Jones</surname>
<given-names>DT.</given-names>
</name>
</person-group>
<year>2000</year>
<article-title>The PSIPRED protein structure prediction server</article-title>
.
<source>Bioinformatics</source>
<volume>16</volume>
:
<fpage>404</fpage>
<lpage>405</lpage>
.
<pub-id pub-id-type="pmid">10869041</pub-id>
</mixed-citation>
</ref>
<ref id="evw246-B34">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Mokili</surname>
<given-names>JL</given-names>
</name>
<name>
<surname>Rohwer</surname>
<given-names>F</given-names>
</name>
<name>
<surname>Dutilh</surname>
<given-names>BE.</given-names>
</name>
</person-group>
<year>2012</year>
<article-title>Metagenomics and future perspectives in virus discovery</article-title>
.
<source>Curr Opin Virol</source>
.
<volume>2</volume>
:
<fpage>63</fpage>
<lpage>77</lpage>
.
<pub-id pub-id-type="pmid">22440968</pub-id>
</mixed-citation>
</ref>
<ref id="evw246-B35">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Ortiz</surname>
<given-names>JF</given-names>
</name>
<name>
<surname>MacDonald</surname>
<given-names>ML</given-names>
</name>
<name>
<surname>Masterson</surname>
<given-names>P</given-names>
</name>
<name>
<surname>Uversky</surname>
<given-names>VN</given-names>
</name>
<name>
<surname>Siltberg-Liberles</surname>
<given-names>J.</given-names>
</name>
</person-group>
<year>2013</year>
<article-title>Rapid evolutionary dynamics of structural disorder as a potential driving force for biological divergence in flaviviruses</article-title>
.
<source>Genome Biol Evol.</source>
<volume>5</volume>
:
<fpage>504</fpage>
<lpage>513</lpage>
.
<pub-id pub-id-type="pmid">23418179</pub-id>
</mixed-citation>
</ref>
<ref id="evw246-B36">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Pushker</surname>
<given-names>R</given-names>
</name>
<name>
<surname>Mooney</surname>
<given-names>C</given-names>
</name>
<name>
<surname>Davey</surname>
<given-names>NE</given-names>
</name>
<name>
<surname>Jacqué</surname>
<given-names>J-M</given-names>
</name>
<name>
<surname>Shields</surname>
<given-names>DC.</given-names>
</name>
</person-group>
<year>2013</year>
<article-title>Marked variability in the extent of protein disorder within and between viral families</article-title>
.
<source>PLoS One</source>
<volume>8</volume>
:
<fpage>e60724</fpage>
<pub-id pub-id-type="pmid">23620725</pub-id>
</mixed-citation>
</ref>
<ref id="evw246-B37">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Reusken</surname>
<given-names>CB</given-names>
</name>
<name>
<surname>Raj</surname>
<given-names>VS</given-names>
</name>
<name>
<surname>Koopmans</surname>
<given-names>MP</given-names>
</name>
<name>
<surname>Haagmans</surname>
<given-names>BL.</given-names>
</name>
</person-group>
<year>2016</year>
<article-title>Cross host transmission in the emergence of MERS coronavirus</article-title>
.
<source>Curr Opin Virol</source>
<volume>16</volume>
:
<fpage>55</fpage>
<lpage>62</lpage>
.
<pub-id pub-id-type="pmid">26826951</pub-id>
</mixed-citation>
</ref>
<ref id="evw246-B38">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Ronquist</surname>
<given-names>F</given-names>
</name>
<name>
<surname>Huelsenbeck</surname>
<given-names>JP.</given-names>
</name>
</person-group>
<year>2003</year>
<article-title>MrBayes 3: Bayesian phylogenetic inference under mixed models</article-title>
.
<source>Bioinformatics</source>
<volume>19</volume>
:
<fpage>1572</fpage>
<lpage>1574</lpage>
.
<pub-id pub-id-type="pmid">12912839</pub-id>
</mixed-citation>
</ref>
<ref id="evw246-B39">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Rosario</surname>
<given-names>K</given-names>
</name>
<name>
<surname>Breitbart</surname>
<given-names>M.</given-names>
</name>
</person-group>
<year>2011</year>
<article-title>Exploring the viral world through metagenomics</article-title>
.
<source>Curr. Opin. Virol</source>
<volume>1</volume>
:
<fpage>289</fpage>
<lpage>297</lpage>
.
<pub-id pub-id-type="pmid">22440785</pub-id>
</mixed-citation>
</ref>
<ref id="evw246-B40">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Siltberg-Liberles</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Grahnen</surname>
<given-names>JA</given-names>
</name>
<name>
<surname>Liberles</surname>
<given-names>DA.</given-names>
</name>
</person-group>
<year>2011</year>
<article-title>The evolution of protein structures and structural ensembles under functional constraint</article-title>
.
<source>Genes (Basel)</source>
<volume>2</volume>
:
<fpage>748</fpage>
<lpage>762</lpage>
.
<pub-id pub-id-type="pmid">24710290</pub-id>
</mixed-citation>
</ref>
<ref id="evw246-B41">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Slabinski</surname>
<given-names>L</given-names>
</name>
</person-group>
,
<etal></etal>
<year>2007</year>
<article-title>The challenge of protein structure determination–lessons from structural genomics</article-title>
.
<source>Protein Sci.</source>
<volume>16</volume>
:
<fpage>2472</fpage>
<lpage>2482</lpage>
.
<pub-id pub-id-type="pmid">17962404</pub-id>
</mixed-citation>
</ref>
<ref id="evw246-B42">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Song</surname>
<given-names>H-D</given-names>
</name>
</person-group>
,
<etal></etal>
<year>2005</year>
<article-title>Cross-host evolution of severe acute respiratory syndrome coronavirus in palm civet and human</article-title>
.
<source>Proc Natl Acad Sci U S A.</source>
<volume>102</volume>
:
<fpage>2430</fpage>
<lpage>2435</lpage>
.
<pub-id pub-id-type="pmid">15695582</pub-id>
</mixed-citation>
</ref>
<ref id="evw246-B43">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Su</surname>
<given-names>S</given-names>
</name>
</person-group>
,
<etal></etal>
<year>2016</year>
<article-title>Epidemiology, genetic recombination, and pathogenesis of coronaviruses</article-title>
.
<source>Trends Microbiol.</source>
<volume>24</volume>
:
<fpage>490</fpage>
<lpage>502</lpage>
.
<pub-id pub-id-type="pmid">27012512</pub-id>
</mixed-citation>
</ref>
<ref id="evw246-B44">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>van Boheemen</surname>
<given-names>S</given-names>
</name>
</person-group>
,
<etal></etal>
<year>2012</year>
<article-title>Genomic characterization of a newly discovered coronavirus associated with acute respiratory distress syndrome in humans</article-title>
.
<source>MBio</source>
<volume>3</volume>
:
<fpage>e00473-12</fpage>
.
<pub-id pub-id-type="pmid">23170002</pub-id>
</mixed-citation>
</ref>
<ref id="evw246-B45">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>van der Hoek</surname>
<given-names>L.</given-names>
</name>
</person-group>
<year>2007</year>
<article-title>Human coronaviruses: what do they cause?</article-title>
<source>Antivir Ther</source>
.
<volume>12</volume>
:
<fpage>651</fpage>
<lpage>658</lpage>
.
<pub-id pub-id-type="pmid">17944272</pub-id>
</mixed-citation>
</ref>
<ref id="evw246-B46">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Ward</surname>
<given-names>JJ</given-names>
</name>
<name>
<surname>McGuffin</surname>
<given-names>LJ</given-names>
</name>
<name>
<surname>Bryson</surname>
<given-names>K</given-names>
</name>
<name>
<surname>Buxton</surname>
<given-names>BF</given-names>
</name>
<name>
<surname>Jones</surname>
<given-names>DT.</given-names>
</name>
</person-group>
<year>2004</year>
<article-title>The DISOPRED server for the prediction of protein disorder</article-title>
.
<source>Bioinformatics</source>
<volume>20</volume>
:
<fpage>2138</fpage>
<lpage>2139</lpage>
.
<pub-id pub-id-type="pmid">15044227</pub-id>
</mixed-citation>
</ref>
<ref id="evw246-B47">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Woo</surname>
<given-names>PC</given-names>
</name>
<name>
<surname>Lau</surname>
<given-names>SK</given-names>
</name>
<name>
<surname>Li</surname>
<given-names>KS</given-names>
</name>
<name>
<surname>Tsang</surname>
<given-names>AK</given-names>
</name>
<name>
<surname>Yuen</surname>
<given-names>K-Y.</given-names>
</name>
</person-group>
<year>2012</year>
<article-title>Genetic relatedness of the novel human group C betacoronavirus to Tylonycteris bat coronavirus HKU4 and Pipistrellus bat coronavirus HKU5</article-title>
.
<source>Emerg Microbes Infect</source>
<volume>1</volume>
:
<fpage>e35</fpage>
.
<pub-id pub-id-type="pmid">26038405</pub-id>
</mixed-citation>
</ref>
<ref id="evw246-B48">
<mixed-citation publication-type="other">
<collab>World Health Organization</collab>
.
<year>2016</year>
WHO | Middle East respiratory syndrome coronavirus (MERS-CoV).
<ext-link ext-link-type="uri" xlink:href="http://www.who.int/emergencies/mers-cov/en/">http://www.who.int/emergencies/mers-cov/en/</ext-link>
.</mixed-citation>
</ref>
<ref id="evw246-B49">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Xue</surname>
<given-names>B</given-names>
</name>
</person-group>
,
<etal></etal>
<year>2014</year>
<article-title>Structural disorder in viral proteins</article-title>
.
<source>Chem. Rev</source>
<volume>114</volume>
:
<fpage>6880</fpage>
<lpage>6911</lpage>
.
<pub-id pub-id-type="pmid">24823319</pub-id>
</mixed-citation>
</ref>
<ref id="evw246-B50">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Yang</surname>
<given-names>H</given-names>
</name>
</person-group>
,
<etal></etal>
<year>2003</year>
<article-title>The crystal structures of severe acute respiratory syndrome virus main protease and its complex with an inhibitor</article-title>
.
<source>Proc Natl Acad Sci U S A.</source>
<volume>100</volume>
:
<fpage>13190</fpage>
<lpage>13195</lpage>
.
<pub-id pub-id-type="pmid">14585926</pub-id>
</mixed-citation>
</ref>
<ref id="evw246-B51">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Yu</surname>
<given-names>C</given-names>
</name>
</person-group>
,
<etal></etal>
<year>2016</year>
<article-title>Structure-based inhibitor design for the intrinsically disordered protein c-Myc</article-title>
.
<source>Sci Rep</source>
.
<volume>6</volume>
:
<fpage>22298</fpage>
.
<pub-id pub-id-type="pmid">26931396</pub-id>
</mixed-citation>
</ref>
<ref id="evw246-B52">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Zhang</surname>
<given-names>Z</given-names>
</name>
<name>
<surname>Shen</surname>
<given-names>L</given-names>
</name>
<name>
<surname>Gu</surname>
<given-names>X.</given-names>
</name>
</person-group>
<year>2016</year>
<article-title>Evolutionary dynamics of MERS-CoV: potential recombination, positive selection and transmission</article-title>
.
<source>Sci Rep</source>
.
<volume>6</volume>
:
<fpage>25049</fpage>
.
<pub-id pub-id-type="pmid">27142087</pub-id>
</mixed-citation>
</ref>
<ref id="evw246-B53">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Zheng</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Yuan</surname>
<given-names>F</given-names>
</name>
<name>
<surname>Kleinfelter</surname>
<given-names>LM</given-names>
</name>
<name>
<surname>Kielian</surname>
<given-names>M.</given-names>
</name>
</person-group>
<year>2014</year>
<article-title>A toggle switch controls the low pH-triggered rearrangement and maturation of the dengue virus envelope proteins</article-title>
.
<source>Nat Commun.</source>
<volume>5</volume>
:
<fpage>3877</fpage>
.
<pub-id pub-id-type="pmid">24846574</pub-id>
</mixed-citation>
</ref>
</ref-list>
</back>
</pmc>
</record>

To work with this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Sante/explor/MersV1/Data/Pmc/Corpus
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000C99 | SxmlIndent | more

Or

HfdSelect -h $EXPLOR_AREA/Data/Pmc/Corpus/biblio.hfd -nk 000C99 | SxmlIndent | more
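Once a record has been extracted and indented with the commands above, its reference list can be read with any XML parser. The following is a minimal sketch, not part of Dilib: it assumes the `<ref>` markup shown in this record, uses one entry copied from the list (evw246-B20) as sample input, and relies on a hypothetical helper named `summarize_ref`.

```python
import xml.etree.ElementTree as ET

# Sample entry copied from this record's reference list (evw246-B20).
SAMPLE = """<ref id="evw246-B20">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name><surname>Huelsenbeck</surname><given-names>JP</given-names></name>
<name><surname>Ronquist</surname><given-names>F.</given-names></name>
</person-group>
<year>2001</year>
<article-title>MRBAYES: Bayesian inference of phylogenetic trees</article-title>.
<source>Bioinformatics</source>
<volume>17</volume>:
<fpage>754</fpage>
<lpage>755</lpage>.
<pub-id pub-id-type="pmid">11524383</pub-id>
</mixed-citation>
</ref>"""


def summarize_ref(ref):
    """Collect the main fields of one <ref> element (hypothetical helper)."""
    cit = ref.find("mixed-citation")

    def text(tag):
        # Trailing periods are stripped so "Mol Biol Evol." and "Mol Biol Evol" match.
        value = cit.findtext(tag)
        return value.strip().rstrip(".") if value else None

    authors = []
    for name in cit.iter("name"):
        surname = name.findtext("surname", "").strip()
        given = name.findtext("given-names", "").strip().rstrip(".")
        authors.append(f"{surname} {given}".strip())

    # Keep only the PubMed identifier among the <pub-id> elements, if present.
    pmid = None
    for pub_id in cit.iter("pub-id"):
        if pub_id.get("pub-id-type") == "pmid":
            pmid = (pub_id.text or "").strip()

    return {
        "id": ref.get("id"),
        "authors": authors,
        "et_al": cit.find("etal") is not None,
        "year": text("year"),
        "title": text("article-title"),
        "source": text("source"),
        "volume": text("volume"),
        "pages": (text("fpage"), text("lpage")),
        "pmid": pmid,
    }


entry = summarize_ref(ET.fromstring(SAMPLE))
print(entry)
```

Entries without an `<lpage>` or a `<pub-id>` simply yield `None` for those fields, so the same helper can be applied to every `<ref>` in the list.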

To add a link to this page within the Wicri network

{{Explor lien
   |wiki=    Sante
   |area=    MersV1
   |flux=    Pmc
   |étape=   Corpus
   |type=    RBID
   |clé=     PMC:5203785
   |texte=   Avoiding Regions Symptomatic of Conformational and Functional Flexibility to Identify Antiviral Targets in Current and Future Coronaviruses
}}

To generate wiki pages

HfdIndexSelect -h $EXPLOR_AREA/Data/Pmc/Corpus/RBID.i   -Sk "pubmed:27797946" \
       | HfdSelect -Kh $EXPLOR_AREA/Data/Pmc/Corpus/biblio.hfd   \
       | NlmPubMed2Wicri -a MersV1 

This area was generated with Dilib version V0.6.33.
Data generation: Mon Apr 20 23:26:43 2020. Site generation: Sat Mar 27 09:06:09 2021