SARS Exploration Server


Internal identifier: 001029 (Pmc/Corpus); previous: 0010289; next: 0010300



The document in XML format

<record>
<TEI>
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en">The R Protein of SARS-CoV: Analyses of Structure and Function Based on Four Complete Genome Sequences of Isolates BJ01-BJ04</title>
<author>
<name sortKey="Xu, Zuyuan" sort="Xu, Zuyuan" uniqKey="Xu Z" first="Zuyuan" last="Xu">Zuyuan Xu</name>
<affiliation>
<nlm:aff id="aff0005">Beijing Genomics Institute, Chinese Academy of Sciences, Beijing 101300, China</nlm:aff>
</affiliation>
<affiliation>
<nlm:aff id="aff0010">James D. Watson Institute of Genome Sciences, Zhijiang Campus, Zhejiang University, Hangzhou 310008, China</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Zhang, Haiqing" sort="Zhang, Haiqing" uniqKey="Zhang H" first="Haiqing" last="Zhang">Haiqing Zhang</name>
<affiliation>
<nlm:aff id="aff0005">Beijing Genomics Institute, Chinese Academy of Sciences, Beijing 101300, China</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Tian, Xiangjun" sort="Tian, Xiangjun" uniqKey="Tian X" first="Xiangjun" last="Tian">Xiangjun Tian</name>
<affiliation>
<nlm:aff id="aff0010">James D. Watson Institute of Genome Sciences, Zhijiang Campus, Zhejiang University, Hangzhou 310008, China</nlm:aff>
</affiliation>
<affiliation>
<nlm:aff id="aff0005">Beijing Genomics Institute, Chinese Academy of Sciences, Beijing 101300, China</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Ji, Jia" sort="Ji, Jia" uniqKey="Ji J" first="Jia" last="Ji">Jia Ji</name>
<affiliation>
<nlm:aff id="aff0005">Beijing Genomics Institute, Chinese Academy of Sciences, Beijing 101300, China</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Li, Wei" sort="Li, Wei" uniqKey="Li W" first="Wei" last="Li">Wei Li</name>
<affiliation>
<nlm:aff id="aff0005">Beijing Genomics Institute, Chinese Academy of Sciences, Beijing 101300, China</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Li, Yan" sort="Li, Yan" uniqKey="Li Y" first="Yan" last="Li">Yan Li</name>
<affiliation>
<nlm:aff id="aff0005">Beijing Genomics Institute, Chinese Academy of Sciences, Beijing 101300, China</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Tian, Wei" sort="Tian, Wei" uniqKey="Tian W" first="Wei" last="Tian">Wei Tian</name>
<affiliation>
<nlm:aff id="aff0005">Beijing Genomics Institute, Chinese Academy of Sciences, Beijing 101300, China</nlm:aff>
</affiliation>
<affiliation>
<nlm:aff id="aff0010">James D. Watson Institute of Genome Sciences, Zhijiang Campus, Zhejiang University, Hangzhou 310008, China</nlm:aff>
</affiliation>
<affiliation>
<nlm:aff id="aff0015">Medical College, Xi’an Jiaotong University, Xi’an 710049, China</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Han, Yujun" sort="Han, Yujun" uniqKey="Han Y" first="Yujun" last="Han">Yujun Han</name>
<affiliation>
<nlm:aff id="aff0005">Beijing Genomics Institute, Chinese Academy of Sciences, Beijing 101300, China</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Wang, Lili" sort="Wang, Lili" uniqKey="Wang L" first="Lili" last="Wang">Lili Wang</name>
<affiliation>
<nlm:aff id="aff0005">Beijing Genomics Institute, Chinese Academy of Sciences, Beijing 101300, China</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Zhang, Zizhang" sort="Zhang, Zizhang" uniqKey="Zhang Z" first="Zizhang" last="Zhang">Zizhang Zhang</name>
<affiliation>
<nlm:aff id="aff0010">James D. Watson Institute of Genome Sciences, Zhijiang Campus, Zhejiang University, Hangzhou 310008, China</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Xu, Jing" sort="Xu, Jing" uniqKey="Xu J" first="Jing" last="Xu">Jing Xu</name>
<affiliation>
<nlm:aff id="aff0005">Beijing Genomics Institute, Chinese Academy of Sciences, Beijing 101300, China</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Wei, Wei" sort="Wei, Wei" uniqKey="Wei W" first="Wei" last="Wei">Wei Wei</name>
<affiliation>
<nlm:aff id="aff0010">James D. Watson Institute of Genome Sciences, Zhijiang Campus, Zhejiang University, Hangzhou 310008, China</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Zhu, Jingui" sort="Zhu, Jingui" uniqKey="Zhu J" first="Jingui" last="Zhu">Jingui Zhu</name>
<affiliation>
<nlm:aff id="aff0005">Beijing Genomics Institute, Chinese Academy of Sciences, Beijing 101300, China</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Sun, Haiyan" sort="Sun, Haiyan" uniqKey="Sun H" first="Haiyan" last="Sun">Haiyan Sun</name>
<affiliation>
<nlm:aff id="aff0005">Beijing Genomics Institute, Chinese Academy of Sciences, Beijing 101300, China</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Zhang, Xiaowei" sort="Zhang, Xiaowei" uniqKey="Zhang X" first="Xiaowei" last="Zhang">Xiaowei Zhang</name>
<affiliation>
<nlm:aff id="aff0005">Beijing Genomics Institute, Chinese Academy of Sciences, Beijing 101300, China</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Zhou, Jun" sort="Zhou, Jun" uniqKey="Zhou J" first="Jun" last="Zhou">Jun Zhou</name>
<affiliation>
<nlm:aff id="aff0005">Beijing Genomics Institute, Chinese Academy of Sciences, Beijing 101300, China</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Li, Songgang" sort="Li, Songgang" uniqKey="Li S" first="Songgang" last="Li">Songgang Li</name>
<affiliation>
<nlm:aff id="aff0005">Beijing Genomics Institute, Chinese Academy of Sciences, Beijing 101300, China</nlm:aff>
</affiliation>
<affiliation>
<nlm:aff id="aff0020">College of Life Sciences, Peking University, Beijing 100871, China</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Wang, Jun" sort="Wang, Jun" uniqKey="Wang J" first="Jun" last="Wang">Jun Wang</name>
<affiliation>
<nlm:aff id="aff0005">Beijing Genomics Institute, Chinese Academy of Sciences, Beijing 101300, China</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Wang, Jian" sort="Wang, Jian" uniqKey="Wang J" first="Jian" last="Wang">Jian Wang</name>
<affiliation>
<nlm:aff id="aff0005">Beijing Genomics Institute, Chinese Academy of Sciences, Beijing 101300, China</nlm:aff>
</affiliation>
<affiliation>
<nlm:aff id="aff0010">James D. Watson Institute of Genome Sciences, Zhijiang Campus, Zhejiang University, Hangzhou 310008, China</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Bi, Shengli" sort="Bi, Shengli" uniqKey="Bi S" first="Shengli" last="Bi">Shengli Bi</name>
<affiliation>
<nlm:aff id="aff0025">Center of Disease Control and Prevention, Beijing 100050, China</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Yang, Huanming" sort="Yang, Huanming" uniqKey="Yang H" first="Huanming" last="Yang">Huanming Yang</name>
<affiliation>
<nlm:aff id="aff0005">Beijing Genomics Institute, Chinese Academy of Sciences, Beijing 101300, China</nlm:aff>
</affiliation>
<affiliation>
<nlm:aff id="aff0010">James D. Watson Institute of Genome Sciences, Zhijiang Campus, Zhejiang University, Hangzhou 310008, China</nlm:aff>
</affiliation>
</author>
</titleStmt>
<publicationStmt>
<idno type="wicri:source">PMC</idno>
<idno type="pmid">15626345</idno>
<idno type="pmc">5172245</idno>
<idno type="url">http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5172245</idno>
<idno type="RBID">PMC:5172245</idno>
<idno type="doi">10.1016/S1672-0229(03)01019-2</idno>
<date when="2003">2003</date>
<idno type="wicri:Area/Pmc/Corpus">001029</idno>
<idno type="wicri:explorRef" wicri:stream="Pmc" wicri:step="Corpus" wicri:corpus="PMC">001029</idno>
</publicationStmt>
<sourceDesc>
<biblStruct>
<analytic>
<title xml:lang="en" level="a" type="main">The R Protein of SARS-CoV: Analyses of Structure and Function Based on Four Complete Genome Sequences of Isolates BJ01-BJ04</title>
<author>
<name sortKey="Xu, Zuyuan" sort="Xu, Zuyuan" uniqKey="Xu Z" first="Zuyuan" last="Xu">Zuyuan Xu</name>
<affiliation>
<nlm:aff id="aff0005">Beijing Genomics Institute, Chinese Academy of Sciences, Beijing 101300, China</nlm:aff>
</affiliation>
<affiliation>
<nlm:aff id="aff0010">James D. Watson Institute of Genome Sciences, Zhijiang Campus, Zhejiang University, Hangzhou 310008, China</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Zhang, Haiqing" sort="Zhang, Haiqing" uniqKey="Zhang H" first="Haiqing" last="Zhang">Haiqing Zhang</name>
<affiliation>
<nlm:aff id="aff0005">Beijing Genomics Institute, Chinese Academy of Sciences, Beijing 101300, China</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Tian, Xiangjun" sort="Tian, Xiangjun" uniqKey="Tian X" first="Xiangjun" last="Tian">Xiangjun Tian</name>
<affiliation>
<nlm:aff id="aff0010">James D. Watson Institute of Genome Sciences, Zhijiang Campus, Zhejiang University, Hangzhou 310008, China</nlm:aff>
</affiliation>
<affiliation>
<nlm:aff id="aff0005">Beijing Genomics Institute, Chinese Academy of Sciences, Beijing 101300, China</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Ji, Jia" sort="Ji, Jia" uniqKey="Ji J" first="Jia" last="Ji">Jia Ji</name>
<affiliation>
<nlm:aff id="aff0005">Beijing Genomics Institute, Chinese Academy of Sciences, Beijing 101300, China</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Li, Wei" sort="Li, Wei" uniqKey="Li W" first="Wei" last="Li">Wei Li</name>
<affiliation>
<nlm:aff id="aff0005">Beijing Genomics Institute, Chinese Academy of Sciences, Beijing 101300, China</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Li, Yan" sort="Li, Yan" uniqKey="Li Y" first="Yan" last="Li">Yan Li</name>
<affiliation>
<nlm:aff id="aff0005">Beijing Genomics Institute, Chinese Academy of Sciences, Beijing 101300, China</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Tian, Wei" sort="Tian, Wei" uniqKey="Tian W" first="Wei" last="Tian">Wei Tian</name>
<affiliation>
<nlm:aff id="aff0005">Beijing Genomics Institute, Chinese Academy of Sciences, Beijing 101300, China</nlm:aff>
</affiliation>
<affiliation>
<nlm:aff id="aff0010">James D. Watson Institute of Genome Sciences, Zhijiang Campus, Zhejiang University, Hangzhou 310008, China</nlm:aff>
</affiliation>
<affiliation>
<nlm:aff id="aff0015">Medical College, Xi’an Jiaotong University, Xi’an 710049, China</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Han, Yujun" sort="Han, Yujun" uniqKey="Han Y" first="Yujun" last="Han">Yujun Han</name>
<affiliation>
<nlm:aff id="aff0005">Beijing Genomics Institute, Chinese Academy of Sciences, Beijing 101300, China</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Wang, Lili" sort="Wang, Lili" uniqKey="Wang L" first="Lili" last="Wang">Lili Wang</name>
<affiliation>
<nlm:aff id="aff0005">Beijing Genomics Institute, Chinese Academy of Sciences, Beijing 101300, China</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Zhang, Zizhang" sort="Zhang, Zizhang" uniqKey="Zhang Z" first="Zizhang" last="Zhang">Zizhang Zhang</name>
<affiliation>
<nlm:aff id="aff0010">James D. Watson Institute of Genome Sciences, Zhijiang Campus, Zhejiang University, Hangzhou 310008, China</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Xu, Jing" sort="Xu, Jing" uniqKey="Xu J" first="Jing" last="Xu">Jing Xu</name>
<affiliation>
<nlm:aff id="aff0005">Beijing Genomics Institute, Chinese Academy of Sciences, Beijing 101300, China</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Wei, Wei" sort="Wei, Wei" uniqKey="Wei W" first="Wei" last="Wei">Wei Wei</name>
<affiliation>
<nlm:aff id="aff0010">James D. Watson Institute of Genome Sciences, Zhijiang Campus, Zhejiang University, Hangzhou 310008, China</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Zhu, Jingui" sort="Zhu, Jingui" uniqKey="Zhu J" first="Jingui" last="Zhu">Jingui Zhu</name>
<affiliation>
<nlm:aff id="aff0005">Beijing Genomics Institute, Chinese Academy of Sciences, Beijing 101300, China</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Sun, Haiyan" sort="Sun, Haiyan" uniqKey="Sun H" first="Haiyan" last="Sun">Haiyan Sun</name>
<affiliation>
<nlm:aff id="aff0005">Beijing Genomics Institute, Chinese Academy of Sciences, Beijing 101300, China</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Zhang, Xiaowei" sort="Zhang, Xiaowei" uniqKey="Zhang X" first="Xiaowei" last="Zhang">Xiaowei Zhang</name>
<affiliation>
<nlm:aff id="aff0005">Beijing Genomics Institute, Chinese Academy of Sciences, Beijing 101300, China</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Zhou, Jun" sort="Zhou, Jun" uniqKey="Zhou J" first="Jun" last="Zhou">Jun Zhou</name>
<affiliation>
<nlm:aff id="aff0005">Beijing Genomics Institute, Chinese Academy of Sciences, Beijing 101300, China</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Li, Songgang" sort="Li, Songgang" uniqKey="Li S" first="Songgang" last="Li">Songgang Li</name>
<affiliation>
<nlm:aff id="aff0005">Beijing Genomics Institute, Chinese Academy of Sciences, Beijing 101300, China</nlm:aff>
</affiliation>
<affiliation>
<nlm:aff id="aff0020">College of Life Sciences, Peking University, Beijing 100871, China</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Wang, Jun" sort="Wang, Jun" uniqKey="Wang J" first="Jun" last="Wang">Jun Wang</name>
<affiliation>
<nlm:aff id="aff0005">Beijing Genomics Institute, Chinese Academy of Sciences, Beijing 101300, China</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Wang, Jian" sort="Wang, Jian" uniqKey="Wang J" first="Jian" last="Wang">Jian Wang</name>
<affiliation>
<nlm:aff id="aff0005">Beijing Genomics Institute, Chinese Academy of Sciences, Beijing 101300, China</nlm:aff>
</affiliation>
<affiliation>
<nlm:aff id="aff0010">James D. Watson Institute of Genome Sciences, Zhijiang Campus, Zhejiang University, Hangzhou 310008, China</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Bi, Shengli" sort="Bi, Shengli" uniqKey="Bi S" first="Shengli" last="Bi">Shengli Bi</name>
<affiliation>
<nlm:aff id="aff0025">Center of Disease Control and Prevention, Beijing 100050, China</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Yang, Huanming" sort="Yang, Huanming" uniqKey="Yang H" first="Huanming" last="Yang">Huanming Yang</name>
<affiliation>
<nlm:aff id="aff0005">Beijing Genomics Institute, Chinese Academy of Sciences, Beijing 101300, China</nlm:aff>
</affiliation>
<affiliation>
<nlm:aff id="aff0010">James D. Watson Institute of Genome Sciences, Zhijiang Campus, Zhejiang University, Hangzhou 310008, China</nlm:aff>
</affiliation>
</author>
</analytic>
<series>
<title level="j">Genomics, Proteomics &amp; Bioinformatics</title>
<idno type="ISSN">1672-0229</idno>
<idno type="eISSN">2210-3244</idno>
<imprint>
<date when="2003">2003</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
</fileDesc>
<profileDesc>
<textClass></textClass>
</profileDesc>
</teiHeader>
<front>
<div type="abstract" xml:lang="en">
<p>The R (replicase) protein is the uniquely defined non-structural protein (NSP) responsible for RNA replication, mutation rate or fidelity, and regulation of transcription in coronaviruses and many other ssRNA viruses. Based on our complete genome sequences of four isolates (BJ01-BJ04) of SARS-CoV from Beijing, China, we analyzed the structure and predicted functions of the R protein in comparison with 13 other isolates of SARS-CoV and 6 other coronaviruses. The entire ORF (open-reading frame) encodes two major enzyme activities, RNA-dependent RNA polymerase (RdRp) and proteinase activities. The R polyprotein undergoes a complex proteolytic process to produce 15 function-related peptides. A hydrophobic domain (HOD) and a hydrophilic domain (HID) are newly identified within NSP1. The substitution rate of the R protein is close to the average of the SARS-CoV genome. The functional domains in all NSPs of the R protein give different phylogenetic results, which suggests that they have different mutation rates under selective pressure. Eleven highly conserved regions in RdRp and twelve cleavage sites for 3CLP (chymotrypsin-like protein) have been identified as potential drug targets. Findings suggest that it is possible to obtain information about the phylogeny of SARS-CoV, as well as potential tools for drug design, genotyping and diagnostics of SARS.</p>
</div>
</front>
<back>
<div1 type="bibliography">
<listBibl>
<biblStruct>
<analytic>
<author>
<name sortKey="Cavanngh, D" uniqKey="Cavanngh D">D. Cavanagh</name>
</author>
<author>
<name sortKey="Brown, T D K" uniqKey="Brown T">T.D.K. Brown</name>
</author>
</analytic>
</biblStruct>
<biblStruct></biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Ziebuhr, J" uniqKey="Ziebuhr J">J. Ziebuhr</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Qin, E D" uniqKey="Qin E">E.D. Qin</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Brierley, I" uniqKey="Brierley I">I. Brierley</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Norman, M" uniqKey="Norman M">M. Norman</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Myers, E" uniqKey="Myers E">E. Myers</name>
</author>
<author>
<name sortKey="Miller, W" uniqKey="Miller W">W. Miller</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Lim, K P" uniqKey="Lim K">K.P. Lim</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Ziebuhr, J" uniqKey="Ziebuhr J">J. Ziebuhr</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Jens, H" uniqKey="Jens H">H. Jens</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Kanjanahaluethai, A" uniqKey="Kanjanahaluethai A">A. Kanjanahaluethai</name>
</author>
<author>
<name sortKey="Baker, S C" uniqKey="Baker S">S.C. Baker</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Rueckert, R R" uniqKey="Rueckert R">R.R. Rueckert</name>
</author>
<author>
<name sortKey="Wimmer, E" uniqKey="Wimmer E">E. Wimmer</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Liu, D X" uniqKey="Liu D">D.X. Liu</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Lu, Y" uniqKey="Lu Y">Y. Lu</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Anand, K" uniqKey="Anand K">K. Anand</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Van Der Meer" uniqKey="Van Der Meer">van der Meer</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Pedersen, K W" uniqKey="Pedersen K">K.W. Pedersen</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Raamsman, M J" uniqKey="Raamsman M">M.J. Raamsman</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="De Clercq, E" uniqKey="De Clercq E">E. De Clercq</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Walker, M P" uniqKey="Walker M">M.P. Walker</name>
</author>
<author>
<name sortKey="Hong, Z" uniqKey="Hong Z">Z. Hong</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="De Clercq, E" uniqKey="De Clercq E">E. De Clercq</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Bruenn, J A" uniqKey="Bruenn J">J.A. Bruenn</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Den Boon, J A" uniqKey="Den Boon J">J.A. Den Boon</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Zhu, Q Y" uniqKey="Zhu Q">Q.Y. Zhu</name>
</author>
</analytic>
</biblStruct>
</listBibl>
</div1>
</back>
</TEI>
<pmc article-type="research-article">
<pmc-dir>properties open_access</pmc-dir>
<front>
<journal-meta>
<journal-id journal-id-type="nlm-ta">Genomics Proteomics Bioinformatics</journal-id>
<journal-id journal-id-type="iso-abbrev">Genomics Proteomics Bioinformatics</journal-id>
<journal-title-group>
<journal-title>Genomics, Proteomics &amp; Bioinformatics</journal-title>
</journal-title-group>
<issn pub-type="ppub">1672-0229</issn>
<issn pub-type="epub">2210-3244</issn>
<publisher>
<publisher-name>Elsevier</publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="pmid">15626345</article-id>
<article-id pub-id-type="pmc">5172245</article-id>
<article-id pub-id-type="publisher-id">S1672-0229(03)01019-2</article-id>
<article-id pub-id-type="doi">10.1016/S1672-0229(03)01019-2</article-id>
<article-categories>
<subj-group subj-group-type="heading">
<subject>Invited Article</subject>
</subj-group>
</article-categories>
<title-group>
<article-title>The R Protein of SARS-CoV: Analyses of Structure and Function Based on Four Complete Genome Sequences of Isolates BJ01-BJ04</article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author">
<name>
<surname>Xu</surname>
<given-names>Zuyuan</given-names>
</name>
<xref rid="aff0005" ref-type="aff">1</xref>
<xref rid="aff0010" ref-type="aff">2</xref>
<xref rid="fn1" ref-type="fn">*</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Zhang</surname>
<given-names>Haiqing</given-names>
</name>
<xref rid="aff0005" ref-type="aff">1</xref>
<xref rid="fn1" ref-type="fn">*</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Tian</surname>
<given-names>Xiangjun</given-names>
</name>
<xref rid="aff0010" ref-type="aff">2</xref>
<xref rid="aff0005" ref-type="aff">1</xref>
<xref rid="fn1" ref-type="fn">*</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Ji</surname>
<given-names>Jia</given-names>
</name>
<xref rid="aff0005" ref-type="aff">1</xref>
<xref rid="fn1" ref-type="fn">*</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Li</surname>
<given-names>Wei</given-names>
</name>
<xref rid="aff0005" ref-type="aff">1</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Li</surname>
<given-names>Yan</given-names>
</name>
<xref rid="aff0005" ref-type="aff">1</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Tian</surname>
<given-names>Wei</given-names>
</name>
<xref rid="aff0005" ref-type="aff">1</xref>
<xref rid="aff0010" ref-type="aff">2</xref>
<xref rid="aff0015" ref-type="aff">3</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Han</surname>
<given-names>Yujun</given-names>
</name>
<xref rid="aff0005" ref-type="aff">1</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Wang</surname>
<given-names>Lili</given-names>
</name>
<xref rid="aff0005" ref-type="aff">1</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Zhang</surname>
<given-names>Zizhang</given-names>
</name>
<xref rid="aff0010" ref-type="aff">2</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Xu</surname>
<given-names>Jing</given-names>
</name>
<xref rid="aff0005" ref-type="aff">1</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Wei</surname>
<given-names>Wei</given-names>
</name>
<xref rid="aff0010" ref-type="aff">2</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Zhu</surname>
<given-names>Jingui</given-names>
</name>
<xref rid="aff0005" ref-type="aff">1</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Sun</surname>
<given-names>Haiyan</given-names>
</name>
<xref rid="aff0005" ref-type="aff">1</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Zhang</surname>
<given-names>Xiaowei</given-names>
</name>
<xref rid="aff0005" ref-type="aff">1</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Zhou</surname>
<given-names>Jun</given-names>
</name>
<xref rid="aff0005" ref-type="aff">1</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Li</surname>
<given-names>Songgang</given-names>
</name>
<xref rid="aff0005" ref-type="aff">1</xref>
<xref rid="aff0020" ref-type="aff">4</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Wang</surname>
<given-names>Jun</given-names>
</name>
<xref rid="aff0005" ref-type="aff">1</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Wang</surname>
<given-names>Jian</given-names>
</name>
<xref rid="aff0005" ref-type="aff">1</xref>
<xref rid="aff0010" ref-type="aff">2</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Bi</surname>
<given-names>Shengli</given-names>
</name>
<xref rid="aff0025" ref-type="aff">5</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Yang</surname>
<given-names>Huanming</given-names>
</name>
<email>yanghm@genomics.org.cn</email>
<xref rid="aff0005" ref-type="aff">1</xref>
<xref rid="aff0010" ref-type="aff">2</xref>
<xref rid="cor1" ref-type="corresp">#</xref>
</contrib>
</contrib-group>
<aff id="aff0005">
<label>1</label>
Beijing Genomics Institute, Chinese Academy of Sciences, Beijing 101300, China</aff>
<aff id="aff0010">
<label>2</label>
James D. Watson Institute of Genome Sciences, Zhijiang Campus, Zhejiang University, Hangzhou 310008, China</aff>
<aff id="aff0015">
<label>3</label>
Medical College, Xi’an Jiaotong University, Xi’an 710049, China</aff>
<aff id="aff0020">
<label>4</label>
College of Life Sciences, Peking University, Beijing 100871, China</aff>
<aff id="aff0025">
<label>5</label>
Center of Disease Control and Prevention, Beijing 100050, China</aff>
<author-notes>
<corresp id="cor1">
<label>#</label>
Corresponding author.
<email>yanghm@genomics.org.cn</email>
</corresp>
<fn id="fn1">
<label>*</label>
<p id="ntp0050">These authors contributed equally to this work.</p>
</fn>
</author-notes>
<pub-date pub-type="pmc-release">
<day>28</day>
<month>11</month>
<year>2016</year>
</pub-date>
<pub-date pub-type="ppub">
<month>5</month>
<year>2003</year>
</pub-date>
<pub-date pub-type="epub">
<day>28</day>
<month>11</month>
<year>2016</year>
</pub-date>
<volume>1</volume>
<issue>2</issue>
<fpage>155</fpage>
<lpage>165</lpage>
<permissions>
<copyright-statement>.</copyright-statement>
<copyright-year>2003</copyright-year>
<copyright-holder>Beijing Institute of Genomics, the Chinese Academy of Sciences and the Genetics Society of China</copyright-holder>
<license license-type="CC BY-NC-ND" xlink:href="http://creativecommons.org/licenses/by-nc-nd/4.0/">
<license-p>This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).</license-p>
</license>
</permissions>
<abstract id="ab0005">
<p>The R (replicase) protein is the uniquely defined non-structural protein (NSP) responsible for RNA replication, mutation rate or fidelity, and regulation of transcription in coronaviruses and many other ssRNA viruses. Based on our complete genome sequences of four isolates (BJ01-BJ04) of SARS-CoV from Beijing, China, we analyzed the structure and predicted functions of the R protein in comparison with 13 other isolates of SARS-CoV and 6 other coronaviruses. The entire ORF (open-reading frame) encodes two major enzyme activities, RNA-dependent RNA polymerase (RdRp) and proteinase activities. The R polyprotein undergoes a complex proteolytic process to produce 15 function-related peptides. A hydrophobic domain (HOD) and a hydrophilic domain (HID) are newly identified within NSP1. The substitution rate of the R protein is close to the average of the SARS-CoV genome. The functional domains in all NSPs of the R protein give different phylogenetic results, which suggests that they have different mutation rates under selective pressure. Eleven highly conserved regions in RdRp and twelve cleavage sites for 3CLP (chymotrypsin-like protein) have been identified as potential drug targets. Findings suggest that it is possible to obtain information about the phylogeny of SARS-CoV, as well as potential tools for drug design, genotyping and diagnostics of SARS.</p>
</abstract>
<kwd-group id="keys0005">
<title>Key words</title>
<kwd>SARS</kwd>
<kwd>SARS-CoV</kwd>
<kwd>RNA-dependent RNA polymerase</kwd>
<kwd>RNA viruses</kwd>
<kwd>proteolysis</kwd>
</kwd-group>
</article-meta>
</front>
<body>
<sec id="s0005">
<title>Introduction</title>
<p>In the life cycle of coronaviruses, the R (replicase) protein, the largest protein of the virus, is the first translated product following the infection of host cells by the virus. It is immediately translated by host ribosomes into a large polyprotein. This protein is then post-translationally modified to generate structurally independent or associated functioning components, thus initiating RNA replication of the viral genome.</p>
<p>The R protein mainly harbors the RdRp (RNA-dependent RNA polymerase) activity, which replicates the genomic RNA by producing the (-) and (+) stranded RNA molecules and generates the subgenomic (+) transcripts required for the function of all the viral structural proteins and other uncharacterized proteins
<xref rid="bib1" ref-type="bibr">(
<italic>1</italic>
</xref>
,
<xref rid="bib2" ref-type="bibr">
<italic>2</italic>
)</xref>
. In addition, it supports the proteinase activities, namely the main proteinase 3CLP, which primarily mediates cleavage of RdRp and HEL (helicase), and accessory proteinases, such as PLP (papain-like protein), which are involved in the post-translational proteolytic processing of other structural or non-structural proteins
<xref rid="bib3" ref-type="bibr">(
<italic>3</italic>
)</xref>
.</p>
<p>Analyses of the structure of the R protein are important because the information derived from the primary structure can be directly applied to drug design. All of the viral structural proteins are believed to be individually expressed from a nested set of coterminal subgenomic mRNAs. This expression arises through a unique discontinuous transcription mechanism that mainly involves RdRp, making RdRp a potential target for drug inhibition. The endogenous proteolytic processing plays a prominent role in the production of the functional polypeptides. The predicted cleavage sites may therefore serve as templates for peptidomimetic substrate analogues in the design of protease inhibitors.</p>
<p>We report here an analysis of the R protein involving its structure, correlated enzymatic activities and other possible functions, as well as its evolution and potential medical implications, based on four complete genome sequences of the SARS-CoV isolates, BJ01-BJ04, and in comparison with 13 other published isolate genomes.</p>
</sec>
<sec id="s0010">
<title>Results</title>
<sec id="s0015">
<title>Genomic structure of the R protein</title>
<p>The whole ORF (open-reading frame) for the R polyprotein accounts for approximately two thirds of the viral genome at the 5′ end, nucleotide (nt) position 246 to 21,466
<xref rid="bib4" ref-type="bibr">(
<italic>4</italic>
)</xref>
. It theoretically encodes a predicted protein of 7,073 residues with an estimated molecular weight of 790.28 kDa. Two ORFs, ORF1ab (nt position 246-21,466) and ORF1a (nt position 246-13,394), overlap at a single nucleotide (cytosine) at nt position 13,379, which is the proposed site of the (-1) ribosomal frameshift
<xref rid="bib4" ref-type="bibr">(
<italic>4</italic>
)</xref>
. A putative pseudo-knot structure, the main signal for the (-1) frameshift, is located at nt position 13,380 to 13,457, immediately downstream of the conserved slippery site (
<underline>UUUAAAC</underline>
).</p>
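<p>As a consistency check on the coordinates above, the following minimal Python sketch counts the residues encoded by an ORF that is read with a (-1) frameshift. It assumes that the end coordinate is the last base of the stop codon and that the slip re-reads exactly one nucleotide; the function name encoded_residues is illustrative and is not part of the original analysis.</p>

```python
def encoded_residues(orf_start, orf_end, slip=0):
    """Count residues encoded between two inclusive nt coordinates.

    Assumes orf_end is the last base of the stop codon. slip=-1 models a
    (-1) ribosomal frameshift, in which one nucleotide is read twice.
    """
    span = orf_end - orf_start + 1      # nucleotides covered by the ORF
    translated = span - slip            # a -1 slip adds one re-read nucleotide
    codons, remainder = divmod(translated, 3)
    if remainder != 0:
        raise ValueError("coordinates do not give a whole number of codons")
    return codons - 1                   # exclude the stop codon


print(encoded_residues(246, 21466, slip=-1))  # ORF1ab: prints 7073
print(encoded_residues(246, 13394))           # ORF1a: prints 4382
```

<p>Under these assumptions, the ORF1ab span of nt 246 to 21,466 yields the 7,073 residues stated in the text.</p>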
<p>The ORF1ab for the R protein has an average GC content of 40.8% (A: U: C: G = 28%: 31%: 19%: 21%). The GC distribution appears relatively even, except for a GC-rich region at the extreme 5′ end, which corresponds to the putative leader protein sequence located at ~nt 1-800 of the R protein ORF (
<xref rid="f0005" ref-type="fig">Figure 1A</xref>
)
<xref rid="bib5" ref-type="bibr">(
<italic>5</italic>
)</xref>
.</p>
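<p>As an illustration of how an overall GC content and a positional GC profile of this kind can be computed, a minimal Python sketch follows. It is not the DNA_GC_Calculator tool used in this study; the window and step sizes and the toy sequence are hypothetical choices made for the example.</p>

```python
def gc_percent(seq):
    """Percentage of G and C among the unambiguous bases of an RNA/DNA string."""
    seq = seq.upper()
    counted = sum(seq.count(base) for base in "ACGTU")
    if counted == 0:
        raise ValueError("sequence contains no A, C, G, T or U")
    return 100.0 * (seq.count("G") + seq.count("C")) / counted


def gc_profile(seq, window=100, step=50):
    """GC percentage in overlapping windows, as (1-based start, percent) pairs."""
    return [
        (i + 1, gc_percent(seq[i:i + window]))
        for i in range(0, len(seq) - window + 1, step)
    ]


# Hypothetical toy sequence: a GC-rich stretch followed by an AU-rich one.
toy = "GCGCGCGCGC" + "AUAUAUAUAU"
print(gc_percent(toy))
print(gc_profile(toy, window=10, step=10))
```

<p>Applied to a full ORF, a profile of this type would show a GC-rich region as a run of windows above the sequence average.</p>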
<p>A substantial fraction (41.95%, 2,967/7,073) of the ORF1ab for the R protein is composed of non-polar hydrophobic residues, such as Leu (9.54%), Val (8.19%) and Ala (7.22%) (
<xref rid="t0005" ref-type="table">Table 1</xref>
), and the protein is overall weakly acidic (pI 6.3)
<xref rid="bib5" ref-type="bibr">(
<italic>5</italic>
)</xref>
. In comparison with other viral proteins, the R protein has a relatively even distribution (1.09–9.54%) of the 20 natural amino acids
<xref rid="bib5" ref-type="bibr">(
<italic>5</italic>
)</xref>
. An obvious codon usage preference was identified: codon CUU accounts for approximately 30% of Leu, GUU for 40.9% of Val, and GCU for 52.4% of Ala. However, this codon usage preference is highly similar to that of the R protein of five other coronaviruses we have analyzed, representing the three groups in
<italic>Coronaviridae</italic>
(
<xref rid="ec0005" ref-type="supplementary-material">Table S1</xref>
).</p>
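<p>The codon usage preference described above is the share of each synonymous codon within its amino acid family. A minimal Python sketch of that calculation is given below, restricted to the three residues discussed in the text; the function name and the toy ORF are hypothetical and serve only to illustrate the arithmetic.</p>

```python
from collections import Counter

# Synonymous codon families for the three residues discussed in the text
# (standard genetic code, RNA alphabet).
FAMILIES = {
    "Leu": ["UUA", "UUG", "CUU", "CUC", "CUA", "CUG"],
    "Val": ["GUU", "GUC", "GUA", "GUG"],
    "Ala": ["GCU", "GCC", "GCA", "GCG"],
}


def codon_preference(orf):
    """Percentage share of each synonymous codon within its family."""
    orf = orf.upper().replace("T", "U")
    counts = Counter(orf[i:i + 3] for i in range(0, len(orf) - 2, 3))
    shares = {}
    for residue, codons in FAMILIES.items():
        total = sum(counts[c] for c in codons)
        if total:
            shares[residue] = {c: 100.0 * counts[c] / total for c in codons}
    return shares


# Hypothetical in-frame toy ORF: Leu(CUU) Leu(CUG) Val(GUU) Ala(GCU) Ala(GCU)
toy_orf = "CUUCUGGUUGCUGCU"
shares = codon_preference(toy_orf)
print(shares)
```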
<p>The distribution of GC content and hydrophobicity revealed three highly hydrophobic but AT-rich subregions close to the middle of the ORF. The one nearest the 5′ end (nt position ~6,500-7,100) was located immediately downstream of PLP (papain-like proteinase, nt position 4896
<sup>±15</sup>
-5535
<sup>±15</sup>
), but the other two (nt position ~9,000-9,500 and ~10,700-11,600) corresponded to the known HODs (hydrophobic domains) (
<xref rid="f0005" ref-type="fig">Figure 1A and 1B</xref>
). A distinctly negatively charged and highly hydrophilic region rich in Asp and Glu, named the BGI Hydrophilic Domain (BGI-HID), was identified at nt position ~2,600-3,200 (
<xref rid="f0005" ref-type="fig">Figure 1B and 1C</xref>
).</p>
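<p>A sliding-window hydropathy profile is one common way to locate hydrophobic and hydrophilic regions such as those described above. The Python sketch below uses the Kyte-Doolittle scale as a stand-in; it is not the TopPred2 procedure used in this study, and the window size and toy peptide are hypothetical.</p>

```python
# Kyte-Doolittle hydropathy index (a stand-in scale chosen for this sketch).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(protein, window=19):
    """Mean hydropathy for every full window along a protein sequence."""
    scores = [KD[aa] for aa in protein.upper()]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]


# Hypothetical toy peptide: a Leu/Val/Ala-rich stretch next to an Asp/Glu-rich one.
toy = "LVALVALVAL" + "DEDEDEDEDE"
profile = hydropathy_profile(toy, window=10)
print(round(profile[0], 2), round(profile[-1], 2))
```

<p>Positive window means indicate hydrophobic stretches (as expected for the HODs) and strongly negative means indicate hydrophilic, charged stretches (as expected for an Asp- and Glu-rich region).</p>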
</sec>
<sec id="s0020">
<title>Localization of the function-related regions in the R protein</title>
<p>The whole ORF is composed of fifteen regions, conventionally named NSPs (non-structural proteins), which are defined by the putative cleavage sites by 3CLP and PLP, including the four known functional peptides (PLP, 3CLP, RdRp and HEL) (
<xref rid="f0010" ref-type="fig">Figure 2</xref>
). Three of the eleven uncharacterized NSPs have predicted functions derived from their similarity to known or putative counterparts in other coronaviruses, while the functions of the remaining eight are still unknown.</p>
<sec id="s0025">
<title>The RdRp activity</title>
<p>The region (NSP9) responsible for the RdRp activity is located between Codons 4,370 and 5,301 of the ORF1ab for the R protein (
<xref rid="f0010" ref-type="fig">Figure 2</xref>
;
<xref rid="t0010" ref-type="table">Table 2</xref>
). At least 11 highly homologous subregions were identified in RdRp by similarity analysis (
<xref rid="f0015" ref-type="fig">Figure 3</xref>
). The DD (double Asp) domain (Subregion H in
<xref rid="ec0005" ref-type="supplementary-material">Figure S1</xref>
), which consists of two conserved Asp residues flanked by at least five uncharged, mainly polar residues, is the most conserved region present in viral RdRp. It has been postulated to be involved in RNA binding
<xref rid="bib6" ref-type="bibr">(
<italic>6</italic>
)</xref>
.</p>
</sec>
<sec id="s0030">
<title>The Proteinase activity</title>
<p>The region for 3CLP is located between amino acids 3,241 and 3,546 of the ORF1a for the R protein, flanked by two known HODs, HD1 and HD2 (renamed HOD1 and HOD2 to distinguish them from the HID) (
<xref rid="f0010" ref-type="fig">Figure 2</xref>
;
<xref rid="t0010" ref-type="table">Table 2</xref>
). EMBOSS polydot and polycon revealed highly conserved peaks at the N-terminus. Multiple alignment located the peaks in two segments,
<italic>“LNGLWLDD”</italic>
(Codons 27-34) and
<italic>“CPRHVI”</italic>
(Codons 38–43), and defined the putative catalytic sites, His
<sup>41</sup>
and Cys
<sup>147</sup>
(
<xref rid="f0020" ref-type="fig">Fig. 4</xref>
,
<xref rid="f0025" ref-type="fig">Fig. 5</xref>
).</p>
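<p>The segment coordinates and the catalytic histidine quoted above can be checked against each other with a few lines of Python. The helper below is a hypothetical illustration; it only confirms that position 41 of a segment starting at Codon 38 with the sequence "CPRHVI" is a histidine.</p>

```python
def residue_at(segment, segment_start, position):
    """Residue at a 1-based protein position inside a segment whose first
    residue sits at segment_start (also 1-based)."""
    offset = position - segment_start
    if offset not in range(len(segment)):
        raise IndexError("position lies outside the segment")
    return segment[offset]


# Segment and coordinates as quoted in the text: "CPRHVI" at Codons 38-43.
catalytic = residue_at("CPRHVI", 38, 41)
print(catalytic)
```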
<p>The region for PLP is located between Codons 1,632
<sup>±5</sup>
and 1,845
<sup>±5</sup>
, and also between the newly identified BGI-HID and BGI-HOD in ORF1a for the R protein. It appears to be a single domain with moderate conservation by our analysis (
<xref rid="f0010" ref-type="fig">Figure 2</xref>
).</p>
</sec>
<sec id="s0035">
<title>HEL and other NSPs</title>
<p>HEL is located on NSP10 (Codons 5,302-5,902), immediately downstream of RdRp, which is postulated to be associated both structurally and functionally with the helicase and ATPase activities of HEL. In addition, the N-terminal coronavirus-specific LP (leader protein) region appears to be homologous to tetrahydrofolate dehydrogenase/cyclohydrolase in Codons 76-160. NSP1, in addition to containing PLP, is similar to the appr-1′-p processing enzyme family and to zinc carboxypeptidase A metalloprotease (M14). NSP2 is found to encompass a YejX-family in DUF463. NSP7 is similar to a common growth-factor-like protein (GFL). An FtsJ-like methyltransferase, which is thought to be involved in mRNA capping, was identified in NSP13 at the C-terminus with low identity (
<xref rid="f0010" ref-type="fig">Figure 2</xref>
;
<xref rid="t0010" ref-type="table">Table 2</xref>
).</p>
</sec>
</sec>
<sec id="s0040">
<title>Sequence variations in coronaviruses</title>
<p>With the complete sequence of BJ01 as the reference, BJ02, BJ03, and BJ04 have 8, 10, and 10 substitutions, respectively, in the R protein, of which 5 (62.5%), 8 (80%), and 9 (90%), respectively, are non-synonymous (
<xref rid="t0015" ref-type="table">Table 3</xref>
).</p>
<p>Overall, we have detected 92 substitutions in the R protein in comparison with 17 published full-length genome sequences. About 70% (70.65%, 65/92) are non-synonymous. Using the complete sequence of BJ01 as the reference, Isolate GD01 has the largest number of variations, 29, and TW1 the smallest, 5.</p>
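<p>The non-synonymous fractions quoted in this section follow directly from the substitution counts. A short Python sketch of the arithmetic is given below; the dictionary labels are illustrative, while the counts are those stated in the text.</p>

```python
def percent(part, whole):
    """Percentage rounded to two decimals."""
    return round(100.0 * part / whole, 2)


# (non-synonymous, total) substitution counts as quoted in the text.
counts = {
    "BJ02 vs BJ01": (5, 8),
    "BJ03 vs BJ01": (8, 10),
    "BJ04 vs BJ01": (9, 10),
    "all 17 genomes": (65, 92),
}
summary = {label: percent(ns, total) for label, (ns, total) in counts.items()}
for label, value in summary.items():
    print(label, value)
```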
</sec>
<sec id="s0045">
<title>Comparative genomics of the R protein</title>
<p>Besides the 14 coronaviruses, we have found only 4 matches in GenBank (Venezuelan equine encephalitis virus, Gill-associated virus, Yellow head virus and Simian hemorrhagic fever virus), either complete or partial, with the overall ORF of the R protein. It should be noted that these matches were mainly contributed by the identified conserved regions of the R protein.</p>
<p>The essentially complete sequence of the R protein has been identified in another 6 species of
<italic>Coronaviridae</italic>
. Our comparative analysis demonstrated that the N-terminal one-third is highly variable, in sharp contrast with the other two-thirds, which constitute a relatively stable region (
<xref rid="f0030" ref-type="fig">Figure 6</xref>
). Pair-wise global alignment was also performed among 7 coronaviruses, which indicated that BCoV (Bovine Coronavirus) and MHV (Murine Hepatitis Virus) have a relatively high similarity index (
<xref rid="f0035" ref-type="fig">Figure 7</xref>
).</p>
<p>An unrooted phylogenetic tree based on multiple-alignment of amino acid sequences is proposed (
<xref rid="f0040" ref-type="fig">Figure 8A</xref>
). This proposed tree places SARS-CoV outside the three known groups, between Group 1 and Group 3, with genetic distance almost similar to Group 2.</p>
<p>The unrooted phylogenetic trees were also constructed with amino acid sequences of NSP1, PLP, 3CLP, RdRp, and HEL. They demonstrated that the evolution of different regions is non-synchronous (
<xref rid="f0040" ref-type="fig">Figure 8B, C, D, E, F</xref>
).</p>
</sec>
</sec>
<sec id="s0050">
<title>Discussion</title>
<sec id="s0055">
<title>Proteinase activities and drug design</title>
<p>Two regions have been identified to be responsible for the proteinase activities of the R protein (
<xref rid="f0010" ref-type="fig">Figure 2</xref>
) based on our comparative analyses of the four complete genome sequences of the BJ group and other previously published experimental data
<xref rid="bib8" ref-type="bibr">8.</xref>
,
<xref rid="bib9" ref-type="bibr">9.</xref>
,
<xref rid="bib10" ref-type="bibr">10.</xref>
,
<xref rid="bib11" ref-type="bibr">11.</xref>
. The PLP domain, which was named after the similar catalytic dyad arrangement to cellular proteinases related to papain
<xref rid="bib3" ref-type="bibr">(
<italic>3</italic>
)</xref>
, appears to be a characteristic functional domain within the putative NSP1. The possibility that it was composed of two interactive peptides as the putative PLP1 and PLP2 in MHV was excluded, based on homologous comparison with PLPs in other coronaviruses. Substrate specificities of the PLP show that it has several preferred cleavage sites in different coronaviruses
<xref rid="bib3" ref-type="bibr">(
<italic>3</italic>
)</xref>
. We have found that the structural S (spike), E (envelope), M (membrane), and N (nucleocapsid) proteins have 18, 1, 4 and 10 possible cleavage sites, respectively. However, amino acids around those cleavage sites might also be major determinants for recognition
<xref rid="bib10" ref-type="bibr">(
<italic>10</italic>
)</xref>
. Therefore, among the above 33 sites, we could identify with certainty only one PLP cleavage site (a.a. position 39, RG|V) in the S protein and two (a.a. position 198, RG|N and a.a. position 205, RG|N) in the N protein.</p>
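<p>A motif scan of the kind used to list candidate cleavage sites can be sketched in Python as follows. This is an illustration under stated assumptions, not the procedure used in this study: the function name, the choice to report the 1-based position of the residue immediately before the scissile bond, and the toy sequence are all hypothetical.</p>

```python
import re


def find_sites(protein, p2p1="RG", p1_prime="VN"):
    """1-based positions of the P1 residue for every match of
    P2-P1 | P1' (for example RG|V or RG|N) in a protein sequence."""
    pattern = re.compile("(?=" + re.escape(p2p1) + "[" + p1_prime + "])")
    return [m.start() + len(p2p1) for m in pattern.finditer(protein)]


# Hypothetical toy sequence containing RG|V and RG|N once each.
toy = "MARGVKLLRGNQRGA"
sites = find_sites(toy)
print(sites)
```

<p>Restricting the residue allowed after the scissile bond, as done here, mirrors the point made above that the flanking amino acids help determine which candidate sites are recognized.</p>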
<p>The 3CLP domain, which was first reported from picornavirus 3C proteinases (3C
<sup>pro</sup>
) and thus named
<xref rid="bib12" ref-type="bibr">(
<italic>12</italic>
)</xref>
, is located on NSP2. Experimental data from AIBV, MHV and HCoV-229E also suggested that the 3CLP is responsible for proteinase activities demonstrated by the R protein
<xref rid="bib9" ref-type="bibr">9.</xref>
,
<xref rid="bib13" ref-type="bibr">13.</xref>
,
<xref rid="bib14" ref-type="bibr">14.</xref>
. The recombinant main proteinase of SARS-CoV can mediate cleavage of a TGEV M
<sup>pro</sup>
(main proteinase) substrate
<xref rid="bib15" ref-type="bibr">(
<italic>15</italic>
)</xref>
. The putative catalytic sites, His
<sup>41</sup>
and Cys
<sup>147</sup>
(a.a. position of NSP2) may be associated with the functional performance of 3CLP in the replication complex
<xref rid="bib16" ref-type="bibr">16.</xref>
,
<xref rid="bib17" ref-type="bibr">17.</xref>
. By searching for the 3CLP conserved cleavage sites, we have found a single site each in the S and N proteins, but none in the E and M proteins. This is consistent with a previous study reporting that M proteins become integrated without involvement of a cleaved signal peptide
<xref rid="bib18" ref-type="bibr">(
<italic>18</italic>
)</xref>
.</p>
<p>The three HODs we have identified by overall analysis of the entire ORF are significantly hydrophobic (
<xref rid="f0005" ref-type="fig">Figure 1B</xref>
). The downstream two, HOD1 and HOD2, have been postulated to mediate the microsomal membrane association of the replication complex and to alter dramatically the architecture of host cell membranes, resulting in the optimal construction of the reaction complex for replication and translation
<xref rid="bib9" ref-type="bibr">9.</xref>
,
<xref rid="bib16" ref-type="bibr">16.</xref>
. The function of BGI-HOD, as well as the BGI-HID, requires further experimental data.</p>
<p>HIV protease inhibitors have been designed to mimic the peptide linkages that are cleaved by the protease
<xref rid="bib19" ref-type="bibr">(
<italic>19</italic>
)</xref>
. A more detailed understanding of the proteinase activities and their localization in the R protein would provide essential information for drug screening and design.</p>
</sec>
<sec id="s0060">
<title>RdRp as a potential target for drug design</title>
<p>Our effort has been devoted to searching for homologous domains in the RdRp that might be candidate targets for future inhibitors for RdRp and potentially for drugs designed against SARS. RdRp inhibitors have been previously reported for HCV (hepatitis C virus)
<xref rid="bib20" ref-type="bibr">(
<italic>20</italic>
)</xref>
. It is noteworthy that the R protein has no homologue in humans. Therefore, the R protein may be a candidate target for drug development with a reduced risk of toxicity.</p>
<p>We have defined 11 domains in the RdRp subregion (
<xref rid="ec0005" ref-type="supplementary-material">Figure S1</xref>
). Among them, the F, G, J and K domains are the most conserved. Analyses of hydrophobicity and electric charge show that F and G are hydrophilic and positively charged, while J and K are neutral. The conserved amino acids in these four domains and in the remaining domains could all contribute to drug design, on the assumption that conserved regions are the most likely to be essential to the function of the protein.</p>
<p>We also note that Ribavirin, which has recently been found to inhibit RdRp in Influenza virus, is promising for clinical treatment of SARS
<xref rid="bib21" ref-type="bibr">(
<italic>21</italic>
)</xref>
. A comparison of RdRp between Influenza virus and SARS-CoV has been performed, but no obvious sequence similarity has been identified, although three-dimensionally or functionally similar motifs or domains cannot be excluded. However, it should be noted that homologous primary structures might not share the same tertiary structure.</p>
</sec>
<sec id="s0065">
<title>Evolution of the R protein</title>
<p>The RdRp probably evolved very early because it is one of the essential proteins in all RNA viruses
<xref rid="bib22" ref-type="bibr">(
<italic>22</italic>
)</xref>
. The R protein may be the only protein that is rooted in a common ancestry of many ssRNA viruses, and may have its orthologues in virus genomes outside family
<italic>Coronaviridae</italic>
on comparative analysis. The phylogenetic relationship with the R protein has been established mainly on the basis of the conservation of the homologous RdRp domains, together with the similar polycistronic genome organization, and the use of common transcriptional and post-translational strategies of the viruses
<xref rid="bib23" ref-type="bibr">(
<italic>23</italic>
)</xref>
.</p>
<p>However, the homology analysis yields rather disappointing results. No significant homology has been revealed thus far regarding the R protein of coronaviruses. The hypothesis of the divergent evolution from a common
<italic>Nidovirales</italic>
ancestor, containing a replicase gene with an organization resembling that of the contemporary subsets of
<italic>Nidovirales</italic>
, provides the most parsimonious explanation for the observed diversity of the proteolytic enzymes
<xref rid="bib3" ref-type="bibr">(
<italic>3</italic>
)</xref>
.</p>
<p>The results we have obtained for PLP, RdRp, and HEL show that they have different mutation rates, indicating that the R protein might not have evolved as a single intact unit. An alternative interpretation would be an uneven distribution of mutations among different regions, without the involvement of selective pressure.</p>
</sec>
<sec id="s0070">
<title>The R protein itself does not have a high rate of mutation</title>
<p>It is well known that RNA viruses have a relatively higher detectable rate of mutation than DNA viruses
<xref rid="bib5" ref-type="bibr">(
<italic>5</italic>
)</xref>
. It is postulated that a deficiency in proofreading by the RNA polymerase is responsible for the higher mutation rate. The infidelity of transcription would affect both the R protein per se and other proteins or functional elements.</p>
<p>It would be misleading to suggest that the substitutions in the R protein represent a disproportionately large share of the variation detected among the various isolates of SARS-CoV. In our comparative analysis, the R protein accounts for 64.8% of the total number of substitutions. However, when the large size of the protein, which accounts for a substantial proportion of the genome, is taken into consideration, the substitution rate of the R protein is only 0.43%, which is actually lower than the overall substitution rate of the entire SARS-CoV genome.</p>
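<p>One way to reproduce the 0.43% figure is to divide the 92 substitutions by the length of the ORF in nucleotides. The Python sketch below shows this reading of the calculation; the source does not spell out the formula, so both the approach and the implied genome-wide total are assumptions made for illustration.</p>

```python
def substitution_rate_percent(substitutions, region_length_nt):
    """Substitutions per nucleotide of a region, expressed as a percentage."""
    return 100.0 * substitutions / region_length_nt


# ORF1ab spans nt 246 to 21,466 (coordinates quoted in the Results section).
orf_length_nt = 21466 - 246 + 1
r_rate = substitution_rate_percent(92, orf_length_nt)

# If 92 substitutions are 64.8% of all substitutions, the implied genome-wide
# total is about 92 / 0.648 (a derived figure, not one stated in the text).
implied_total = round(92 / 0.648)
print(round(r_rate, 2), implied_total)
```

<p>The same function could be applied to the whole genome once its length is supplied; that length is not given in this section and is therefore left as an input.</p>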
<p>It can be preliminarily concluded that selective pressure might play a more significant role than the high rate of replication error of the R protein in maintaining the mutations that would be beneficial to the growth rate, host range, and so on. The relatively low substitution rate of the R protein itself may be a reflection of its stability in evolution.</p>
<p>However, it has not escaped our attention that, in spite of the high rate of substitutions, few or no indels (insertions or deletions) have been found in the various isolates of SARS-CoV so far. This observation would suggest that the RdRp maintains the reading frame with high fidelity, even though the related region or structure has not yet been defined.</p>
</sec>
</sec>
<sec id="s0075">
<title>Materials and Methods</title>
<p>Samples and sequences: SARS-CoV samples, Isolates BJ01-BJ04, were taken from the SARS patients diagnosed in February and March 2003 in Beijing, China, according to WHO guidelines (
<ext-link ext-link-type="uri" xlink:href="http://www.who.int/csr/sars/guidelines/en/" id="ir0105">http://www.who.int/csr/sars/guidelines/en/</ext-link>
). The processing of tissue samples, inoculation into Vero-6 cell culture, virion preparation and viral RNA extraction, and RT-PCR amplification and cloning into sequencing vectors were performed according to standard protocols at BGI and the Center of Disease Control and Prevention of China (CDC)
<xref rid="bib24" ref-type="bibr">(
<italic>24</italic>
)</xref>
. Sequencing was performed on MegaBACE 1000 (Amersham, New Jersey, USA) and ABI 377 (Applied Biosystems, California, USA).</p>
<p>The updated complete genome sequences of the BJ Group (BJ01-BJ04) have been deposited by BGI in GenBank (accession numbers: AY278488, AY278487, AY278490, AY279354) and are freely available (
<ext-link ext-link-type="uri" xlink:href="http://www.genomics.org.cn/bgi/news/zhongxin/news030416-2_fasta.htm" id="ir0205">http://www.genomics.org.cn/bgi/news/zhongxin/news030416-2_fasta.htm</ext-link>
). All the experimental materials, including all the cDNA clones representing various segments of the viral genome with known sequences, are available for collaborators.</p>
<p>Thirteen other full-length sequences of SARS-CoV genomes published from May 2003 by BGI or others were used in this study (accession numbers: AY278554, AY297028, AY274119, AY291451, AY283798, AY283797, AY283796, AY283795, AY283794, AY282752, AY278741, AY278491, AY278489). Ten coronavirus genome sequences containing the complete or partial ORF for the R protein were downloaded from GenBank and used for comparative analysis (accession numbers: NC_004718, AY274119, NC_003436, NC_002306, NC_003045, NC_002645, AF220295, NC_001846, NC_001451, M23694). The nucleotide positions given for all SARS-CoV isolates refer to the complete genome sequence of Isolate BJ01
<xref rid="bib5" ref-type="bibr">(
<italic>5</italic>
)</xref>
.</p>
<p>We used six presently available software packages for structure and function analysis. ORF Finder (
<ext-link ext-link-type="uri" xlink:href="http://www.ncbi.nlm.nih.gov/gorf/gorf.html" id="ir0305">http://www.ncbi.nlm.nih.gov/gorf/gorf.html</ext-link>
) was used to determine ORFs and DNA_GC_Calculator to calculate GC content (
<ext-link ext-link-type="uri" xlink:href="http://www.genome.iastate.edu/ftp/share/DNAgcCal/" id="ir0405">http://www.genome.iastate.edu/ftp/share/DNAgcCal/</ext-link>
). TopPred2 (
<ext-link ext-link-type="uri" xlink:href="http://www.sbc.su.se/~erikw/toppred2/" id="ir0605">http://www.sbc.su.se/~erikw/toppred2/</ext-link>
) was selected to predict the hydrophobic region, AnTheProt (
<ext-link ext-link-type="uri" xlink:href="http://www.bimcore.emory.edu/home/Software/NPSA/Npsa.html" id="ir0705">http://www.bimcore.emory.edu/home/Software/NPSA/Npsa.html</ext-link>
) and the EMBOSS package (
<ext-link ext-link-type="uri" xlink:href="http://www.hgmp.mrc.ac.uk/Software/EMBOSS/" id="ir0805">http://www.hgmp.mrc.ac.uk/Software/EMBOSS/</ext-link>
) to characterize the proteins, and ClustalW to perform multiple-alignment and phylogenetic analysis. All analyses mentioned above were accomplished on supercomputers DOWNING 2000/3000 (DOWNING Computers Inc., Beijing, China), SUN E10K (SUN Microsystems Inc., California, USA), SGI Origin 3800 (Silicon Graphics, Inc., California, USA), and IBM P690 (IBM Corp., New York, USA).</p>
</sec>
</body>
<back>
<ref-list id="bibliog0005">
<title>References</title>
<ref id="bib1">
<label>1.</label>
<element-citation publication-type="book" id="sbref1">
<person-group person-group-type="editor">
<name>
<surname>Cavanagh</surname>
<given-names>D.</given-names>
</name>
<name>
<surname>Brown</surname>
<given-names>T.D.K.</given-names>
</name>
</person-group>
<source>Coronaviruses and their diseases</source>
<year>1997</year>
<publisher-name>Plenum Press</publisher-name>
<publisher-loc>New York, USA</publisher-loc>
<fpage>327</fpage>
<lpage>356</lpage>
</element-citation>
</ref>
<ref id="bib2">
<label>2.</label>
<mixed-citation publication-type="other" id="othref0005">De Vries, A.F.,
<italic>et al</italic>
. The genome organization of the Nidovirales: similarities and differences between Arteri-, Toro-, and Coronaviruses.
<italic>Semin. Virol</italic>
. 8: 33–47.</mixed-citation>
</ref>
<ref id="bib3">
<label>3.</label>
<element-citation publication-type="journal" id="sbref2">
<person-group person-group-type="author">
<name>
<surname>Ziebuhr</surname>
<given-names>J.</given-names>
</name>
</person-group>
<article-title>Virus-encoded proteinases and proteolytic processing in the Nidovirales</article-title>
<source>J. Gen. Virol.</source>
<volume>81</volume>
<year>2000</year>
<fpage>853</fpage>
<lpage>879</lpage>
<pub-id pub-id-type="pmid">10725411</pub-id>
</element-citation>
</ref>
<ref id="bib4">
<label>4.</label>
<element-citation publication-type="journal" id="sbref3">
<person-group person-group-type="author">
<name>
<surname>Qin</surname>
<given-names>E.D.</given-names>
</name>
</person-group>
<article-title>A complete sequence and comparative analysis of a SARS-associated virus (Isolate BJ01)</article-title>
<source>Chin. Sci. Bull.</source>
<volume>48</volume>
<year>2003</year>
<fpage>941</fpage>
<lpage>948</lpage>
</element-citation>
</ref>
<ref id="bib5">
<label>5.</label>
<element-citation publication-type="journal" id="sbref4">
<person-group person-group-type="author">
<name>
<surname>Brierley</surname>
<given-names>I.</given-names>
</name>
</person-group>
<article-title>Ribosomal frameshifting on viral RNAs</article-title>
<source>J. Gen. Virol.</source>
<volume>76</volume>
<year>1995</year>
<fpage>1885</fpage>
<lpage>1892</lpage>
<pub-id pub-id-type="pmid">7636469</pub-id>
</element-citation>
</ref>
<ref id="bib6">
<label>6.</label>
<element-citation publication-type="book" id="sbref5">
<person-group person-group-type="editor">
<name>
<surname>Norman</surname>
<given-names>M.</given-names>
</name>
</person-group>
<series>Oxford Surveys on Eukaryotic Genes</series>
<volume>
<italic>Vol. 5</italic>
</volume>
<year>1988</year>
<publisher-name>Oxford University Press</publisher-name>
<publisher-loc>Oxford, United Kingdom</publisher-loc>
<fpage>91</fpage>
<lpage>131</lpage>
</element-citation>
</ref>
<ref id="bib7">
<label>7.</label>
<element-citation publication-type="journal" id="sbref6">
<person-group person-group-type="author">
<name>
<surname>Myers</surname>
<given-names>E.</given-names>
</name>
<name>
<surname>Miller</surname>
<given-names>W.</given-names>
</name>
</person-group>
<article-title>Optimal alignments in linear space</article-title>
<source>CABIOS</source>
<volume>4</volume>
<year>1988</year>
<fpage>11</fpage>
<lpage>17</lpage>
<pub-id pub-id-type="pmid">3382986</pub-id>
</element-citation>
</ref>
<ref id="bib8">
<label>8.</label>
<element-citation publication-type="journal" id="sbref7">
<person-group person-group-type="author">
<name>
<surname>Lim</surname>
<given-names>K.P.</given-names>
</name>
</person-group>
<article-title>Identification of a novel cleavage of the first papain-like proteinase domain encoded by open reading frame 1a of the coronavirus avian infectious bronchitis virus and characterization of the cleavage products</article-title>
<source>J. Virol.</source>
<volume>74</volume>
<year>2000</year>
<fpage>1674</fpage>
<lpage>1685</lpage>
<pub-id pub-id-type="pmid">10644337</pub-id>
</element-citation>
</ref>
<ref id="bib9">
<label>9.</label>
<element-citation publication-type="journal" id="sbref8">
<person-group person-group-type="author">
<name>
<surname>Ziebuhr</surname>
<given-names>J.</given-names>
</name>
</person-group>
<article-title>Characterization of human coronavirus (strain 229E) 3C-Like proteinase activity</article-title>
<source>J. Virol.</source>
<volume>69</volume>
<year>1995</year>
<fpage>4331</fpage>
<lpage>4338</lpage>
<pub-id pub-id-type="pmid">7769694</pub-id>
</element-citation>
</ref>
<ref id="bib10">
<label>10.</label>
<element-citation publication-type="journal" id="sbref9">
<person-group person-group-type="author">
<name>
<surname>Herold</surname>
<given-names>J.</given-names>
</name>
</person-group>
<article-title>Proteolytic processing at the amino terminus of human coronavirus 229E gene 1-encoded polyproteins: identification of a papain-like proteinase and its substrate</article-title>
<source>J. Virol.</source>
<volume>72</volume>
<year>1998</year>
<fpage>910</fpage>
<lpage>918</lpage>
<pub-id pub-id-type="pmid">9444982</pub-id>
</element-citation>
</ref>
<ref id="bib11">
<label>11.</label>
<element-citation publication-type="journal" id="sbref10">
<person-group person-group-type="author">
<name>
<surname>Kanjanahaluethai</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>Baker</surname>
<given-names>S.C.</given-names>
</name>
</person-group>
<article-title>Identification of Mouse Hepatitis Virus papain-like proteinase 2 activity</article-title>
<source>J. Virol.</source>
<volume>74</volume>
<year>2000</year>
<fpage>7911</fpage>
<lpage>7921</lpage>
<pub-id pub-id-type="pmid">10933699</pub-id>
</element-citation>
</ref>
<ref id="bib12">
<label>12.</label>
<element-citation publication-type="journal" id="sbref11">
<person-group person-group-type="author">
<name>
<surname>Rueckert</surname>
<given-names>R.R.</given-names>
</name>
<name>
<surname>Wimmer</surname>
<given-names>E.</given-names>
</name>
</person-group>
<article-title>Systematic nomenclature of picornavirus proteins</article-title>
<source>J. Virol.</source>
<volume>50</volume>
<year>1984</year>
<fpage>957</fpage>
<lpage>959</lpage>
<pub-id pub-id-type="pmid">6726891</pub-id>
</element-citation>
</ref>
<ref id="bib13">
<label>13.</label>
<element-citation publication-type="journal" id="sbref12">
<person-group person-group-type="author">
<name>
<surname>Liu</surname>
<given-names>D.X.</given-names>
</name>
</person-group>
<article-title>A 100-kilodalton polypeptide encoded by open reading frame (ORF) 1b of the coronavirus infectious bronchitis virus is processed by ORF1a products</article-title>
<source>J. Virol.</source>
<volume>68</volume>
<year>1994</year>
<fpage>5772</fpage>
<lpage>5780</lpage>
<pub-id pub-id-type="pmid">8057459</pub-id>
</element-citation>
</ref>
<ref id="bib14">
<label>14.</label>
<element-citation publication-type="journal" id="sbref13">
<person-group person-group-type="author">
<name>
<surname>Lu</surname>
<given-names>Y.</given-names>
</name>
</person-group>
<article-title>Identification and characterization of a serine-like proteinase of the murine coronavirus MHV-A59</article-title>
<source>J. Virol.</source>
<volume>69</volume>
<year>1995</year>
<fpage>3554</fpage>
<lpage>3559</lpage>
<pub-id pub-id-type="pmid">7745703</pub-id>
</element-citation>
</ref>
<ref id="bib15">
<label>15.</label>
<element-citation publication-type="journal" id="sbref14">
<person-group person-group-type="author">
<name>
<surname>Anand</surname>
<given-names>K.</given-names>
</name>
</person-group>
<article-title>Coronavirus main proteinase (3CL
<sup>pro</sup>
) structure: Basis for design of anti-SARS Drugs</article-title>
<source>Science</source>
<volume>300</volume>
<year>2003</year>
<fpage>1763</fpage>
<lpage>1767</lpage>
<pub-id pub-id-type="pmid">12746549</pub-id>
</element-citation>
</ref>
<ref id="bib16">
<label>16.</label>
<element-citation publication-type="journal" id="sbref15">
<person-group person-group-type="author">
<name>
<surname>van der Meer</surname>
</name>
</person-group>
<article-title>ORF1a-encoded replicase subunits are involved in the membrane association of the arterivirus replication complex</article-title>
<source>J. Virol.</source>
<volume>72</volume>
<year>1998</year>
<fpage>6689</fpage>
<lpage>6698</lpage>
<pub-id pub-id-type="pmid">9658116</pub-id>
</element-citation>
</ref>
<ref id="bib17">
<label>17.</label>
<element-citation publication-type="journal" id="sbref16">
<person-group person-group-type="author">
<name>
<surname>Pedersen</surname>
<given-names>K.W.</given-names>
</name>
</person-group>
<article-title>Open reading frame 1a-encoded subunits of the arterivirus replicase induce endoplasmic reticulum-derived double-membrane vesicles which carry the viral replication complex</article-title>
<source>J. Virol.</source>
<volume>73</volume>
<year>1999</year>
<fpage>2016</fpage>
<lpage>2026</lpage>
<pub-id pub-id-type="pmid">9971782</pub-id>
</element-citation>
</ref>
<ref id="bib18">
<label>18.</label>
<element-citation publication-type="journal" id="sbref17">
<person-group person-group-type="author">
<name>
<surname>Raamsman</surname>
<given-names>M.J.</given-names>
</name>
</person-group>
<article-title>Characterization of the coronavirus mouse hepatitis virus strain A59 small membrane protein E</article-title>
<source>J. Virol.</source>
<volume>74</volume>
<year>2000</year>
<fpage>2333</fpage>
<lpage>2342</lpage>
<pub-id pub-id-type="pmid">10666264</pub-id>
</element-citation>
</ref>
<ref id="bib19">
<label>19.</label>
<element-citation publication-type="journal" id="sbref18">
<person-group person-group-type="author">
<name>
<surname>De Clercq</surname>
<given-names>E.</given-names>
</name>
</person-group>
<article-title>Strategies in the design of antiviral drugs</article-title>
<source>Nature Rev.</source>
<volume>1</volume>
<year>2002</year>
<fpage>13</fpage>
<lpage>24</lpage>
</element-citation>
</ref>
<ref id="bib20">
<label>20.</label>
<element-citation publication-type="journal" id="sbref19">
<person-group person-group-type="author">
<name>
<surname>Walker</surname>
<given-names>M.P.</given-names>
</name>
<name>
<surname>Hong</surname>
<given-names>Z.</given-names>
</name>
</person-group>
<article-title>HCV RNA-dependent RNA polymerase as a target for antiviral development</article-title>
<source>Curr. Opin. Pharmacol.</source>
<volume>2</volume>
<year>2002</year>
<fpage>1</fpage>
<lpage>7</lpage>
</element-citation>
</ref>
<ref id="bib21">
<label>21.</label>
<element-citation publication-type="journal" id="sbref20">
<person-group person-group-type="author">
<name>
<surname>De Clercq</surname>
<given-names>E.</given-names>
</name>
</person-group>
<article-title>Antiviral drugs: current state of the art</article-title>
<source>J. Clin. Virol.</source>
<volume>22</volume>
<year>2001</year>
<fpage>7</fpage>
<lpage>10</lpage>
</element-citation>
</ref>
<ref id="bib22">
<label>22.</label>
<element-citation publication-type="journal" id="sbref21">
<person-group person-group-type="author">
<name>
<surname>Bruenn</surname>
<given-names>J.A.</given-names>
</name>
</person-group>
<article-title>A structural and primary sequence comparison of the viral RNA-dependent RNA polymerases</article-title>
<source>Nucleic Acids Res.</source>
<volume>31</volume>
<year>2003</year>
<fpage>1821</fpage>
<lpage>1829</lpage>
<pub-id pub-id-type="pmid">12654997</pub-id>
</element-citation>
</ref>
<ref id="bib23">
<label>23.</label>
<element-citation publication-type="journal" id="sbref22">
<person-group person-group-type="author">
<name>
<surname>Den Boon</surname>
<given-names>J.A.</given-names>
</name>
</person-group>
<article-title>Processing and evolution of the N-terminal region of the arterivirus replicase ORF1a protein: identification of two papain-like cysteine proteases</article-title>
<source>J. Virol.</source>
<volume>69</volume>
<year>1995</year>
<fpage>4500</fpage>
<lpage>4505</lpage>
</element-citation>
</ref>
<ref id="bib24">
<label>24.</label>
<element-citation publication-type="journal" id="sbref23">
<person-group person-group-type="author">
<name>
<surname>Zhu</surname>
<given-names>Q.Y.</given-names>
</name>
</person-group>
<article-title>Isolation and identification of a novel coronavirus from patients with SARS</article-title>
<source>J. Chin. Biotech.</source>
<volume>23</volume>
<year>2003</year>
<fpage>106</fpage>
<lpage>112</lpage>
</element-citation>
</ref>
</ref-list>
<sec id="s0095" sec-type="supplementary-material">
<title>Supporting Online Material</title>
<p>
<ext-link ext-link-type="uri" xlink:href="http://www.gpbjournal.org/journal/pdf/GPB1(2)-08.htm" id="ir0005">http://www.gpbjournal.org/journal/pdf/GPB1(2)-08.htm</ext-link>
</p>
<p>
<supplementary-material content-type="local-data" id="ec0005">
<caption>
<p>Table S1</p>
<p>Figure S1</p>
</caption>
<media xlink:href="mmc1.doc"></media>
</supplementary-material>
</p>
</sec>
<ack id="ack0005">
<title>Acknowledgements</title>
<p>The authors thank the Ministry of Science and Technology of China, Chinese Academy of Sciences, and National Natural Science Foundation of China for financial support. We are indebted to collaborators and clinicians from Peking Union Medical College Hospital, National Center of Disease Control of China, the Provincial Government of Zhejiang, the Municipal Governments of Beijing and Hangzhou, and the Library of Chinese Academy of Sciences. Special gratitude is expressed here to the patients and their families for their devotion and cooperation. We appreciate the comments of Dr. Gwendolyn E. P. Zahner, visiting professor at BGI, Dr. Qimin You, Dr. Lin Hu, and other colleagues on drafts of this manuscript.</p>
</ack>
</back>
<floats-group>
<fig id="f0005">
<label>Fig. 1</label>
<caption>
<p>Diagrams of the GC content (A), hydrophobicity (B) and charge distribution (C) of the R protein. The
<italic>X</italic>
-axes stand respectively for GC-content (A), hydrophobicity score (B) and charge score (C), generated by corresponding algorithms (see
<xref rid="s0075" ref-type="sec">materials and methods</xref>
for details). The corresponding
<italic>Y</italic>
-axes stand for nt position (A) or amino acid (a.a.) position (B, C) of the R protein. The window sizes are 300 nt (A) and 100 a.a. (B, C).</p>
</caption>
<alt-text id="at0005">Fig. 1</alt-text>
<graphic xlink:href="gr1"></graphic>
</fig>
<fig id="f0010">
<label>Fig. 2</label>
<caption>
<p>Diagram of the putative function-related regions in the R protein (ORF1ab and ORF1a). Based on sequence analysis, we putatively defined 15 regions that potentially function in SARS-CoV. 3CLP and PLP function as proteinases in the R protein. The open triangles indicate the cleavage sites of PLP, and the solid triangles those of 3CLP. The narrow black rectangles indicate the functional regions. The bottom ruler shows the amino acid position in the R protein in units of kilo-amino acids (ka.a.). LP: leader protein. p65-LP: MHV p65-like protein. (-1) RF: (-1) ribosomal frameshift. BHID: hydrophilic domain identified by BGI. BHOD: hydrophobic domain identified by BGI.</p>
</caption>
<alt-text id="at0010">Fig. 2</alt-text>
<graphic xlink:href="gr2"></graphic>
</fig>
<fig id="f0015">
<label>Fig. 3</label>
<caption>
<p>Similarity analysis of the region for RdRp (NSP9) in the R protein. The
<italic>X</italic>
-axis stands for the similarity score of the multiple-alignment, and the
<italic>Y</italic>
-axis stands for the amino acid position of the consensus sequence of RdRps. We used the RdRp sequences from 7 coronaviruses, including SARS-CoV, for the multiple alignment. The other 6 coronaviruses are avian infectious bronchitis virus (AIBV), bovine coronavirus (BCoV), human coronavirus 229E (HCoV-229E), murine hepatitis virus (MHV), porcine epidemic diarrhea virus (PEDV), and transmissible gastroenteritis virus (TGEV). Based on the plot (generated by EMBOSS-plotcon, window size = 10; see
<xref rid="s0075" ref-type="sec">materials and methods</xref>
for details) of the multiple alignment, we highlighted 11 highly conserved subregions of the R protein, which might contribute to important functions and could potentially be used as targets for anti-SARS drug design.</p>
</caption>
<alt-text id="at0015">Fig. 3</alt-text>
<graphic xlink:href="gr3"></graphic>
</fig>
<fig id="f0020">
<label>Fig. 4</label>
<caption>
<p>Similarity analysis and conserved subregion in 3CLP (NSP2). Based on the multiple-alignment of 3CLP from seven coronaviruses that are similar to the samples used in
<xref rid="f0015" ref-type="fig">Fig. 3</xref>
, diagram A (generated by EMBOSS-plotcon, window size = 10; see
<xref rid="s0075" ref-type="sec">materials and methods</xref>
for details) shows the most similar subregions
<italic>a</italic>
and
<italic>b</italic>
of 3CLP. Based on the global pair-wise alignment of the seven 3CLPs, polydot diagram B (generated by EMBOSS-polydot, see
<xref rid="s0075" ref-type="sec">materials and methods</xref>
for details) shows the conserved regions of every pair.</p>
</caption>
<alt-text id="at0020">Fig. 4</alt-text>
<graphic xlink:href="gr4"></graphic>
</fig>
<fig id="f0025">
<label>Fig. 5</label>
<caption>
<p>Multiple alignment of the region for 3CLP (NSP2) among seven coronaviruses. 3CLP is the main proteinase of coronaviruses, with the catalytic sites His
<sup>41</sup>
and Cys
<sup>147</sup>
. The black triangles indicate the putative catalytic sites of 3CLP. The numbers above the sequences indicate the amino acid positions in 3CLP. Amino acids are highlighted in different colors.</p>
</caption>
<alt-text id="at0025">Fig. 5</alt-text>
<graphic xlink:href="gr5"></graphic>
</fig>
<fig id="f0030">
<label>Fig. 6</label>
<caption>
<p>Dotplot diagram (generated by EMBOSS-doplot, window size=10, threshold=23, see
<xref rid="s0075" ref-type="sec">materials and methods</xref>
for details) of the similarity in the R protein between SARS-CoV and the other 6 coronaviruses. The
<italic>X</italic>
- and
<italic>Y</italic>
-axes stand for the amino acid positions of the corresponding R proteins. (A) SARS vs MHV; (B) SARS vs BCoV; (C) SARS vs AIBV; (D) SARS vs HCoV-229E; (E) SARS vs PEDV; (F) SARS vs TGEV. The plots suggest that, of the six, MHV and BCoV are the most homologous to SARS-CoV.</p>
</caption>
<alt-text id="at0030">Fig. 6</alt-text>
<graphic xlink:href="gr6"></graphic>
</fig>
<fig id="f0035">
<label>Fig. 7</label>
<caption>
<p>Pair-wise alignment of the R protein amino acid sequences of SARS-CoV and the other 6 coronaviruses. The alignment was performed with EMBOSS-stretcher (see
<xref rid="s0075" ref-type="sec">materials and methods</xref>
for details), in which the Myers and Miller algorithm
<xref rid="bib7" ref-type="bibr">(
<italic>7</italic>
)</xref>
was used instead of the standard global alignment algorithm of Needleman and Wunsch, solely to save time and memory. The bold numbers and the normal numbers indicate the identity and the similarity scores, respectively.</p>
</caption>
<alt-text id="at0035">Fig. 7</alt-text>
<graphic xlink:href="gr7"></graphic>
</fig>
<fig id="f0040">
<label>Fig. 8</label>
<caption>
<p>Proposed phylogenetic trees based on the amino acid sequences of the R protein (A), NSP1 (PLP) (B, C), 3CLP (D), RdRp (E), and NSP10 (HEL) (F). All the bootstrap trees were generated with ClustalW (see
<xref rid="s0075" ref-type="sec">materials and methods</xref>
for details). The numerical value near each branch node is the number of bootstrap trials supporting that node.</p>
</caption>
<alt-text id="at0040">Fig. 8</alt-text>
<graphic xlink:href="gr8"></graphic>
</fig>
<table-wrap id="t0005" position="float">
<label>Table 1</label>
<caption>
<p>General Biochemical Features of the R Protein</p>
</caption>
<alt-text id="at0045">Table 1</alt-text>
<table frame="hsides" rules="groups">
<thead>
<tr>
<th></th>
<th>a.a.</th>
<th>No.
<xref rid="tbl1fn1" ref-type="table-fn">*</xref>
</th>
<th align="right">F
<xref rid="tbl1fn2" ref-type="table-fn">#</xref>
(%)</th>
<th></th>
<th>a.a.</th>
<th align="center">No.
<xref rid="tbl1fn1" ref-type="table-fn">*</xref>
</th>
<th align="right">F
<xref rid="tbl1fn2" ref-type="table-fn">#</xref>
(%)</th>
</tr>
</thead>
<tbody>
<tr>
<td rowspan="10">Non-polar,
<break></break>
neutral
<hr></hr>
</td>
<td>Ala</td>
<td>511</td>
<td align="char">7.22</td>
<td rowspan="10">Polar,
<break></break>
neutral
<hr></hr>
</td>
<td>Ser</td>
<td align="center">458</td>
<td align="char">6.48</td>
</tr>
<tr>
<td>Val</td>
<td>579</td>
<td align="char">8.19</td>
<td>Thr</td>
<td align="center">495</td>
<td align="char">7.00</td>
</tr>
<tr>
<td>Leu</td>
<td>675</td>
<td align="char">9.54</td>
<td>Cys</td>
<td align="center">233</td>
<td align="char">3.29</td>
</tr>
<tr>
<td>Ile</td>
<td>343</td>
<td align="char">4.85</td>
<td>Tyr</td>
<td align="center">324</td>
<td align="char">4.58</td>
</tr>
<tr>
<td>Pro</td>
<td>274</td>
<td align="char">3.87</td>
<td>Asn</td>
<td align="center">366</td>
<td align="char">5.17</td>
</tr>
<tr>
<td>Phe</td>
<td>331</td>
<td align="char">4.68</td>
<td>Gln</td>
<td align="center">234</td>
<td align="char">3.31</td>
</tr>
<tr>
<td>Trp</td>
<td>77</td>
<td align="char">1.09</td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>Met</td>
<td>177</td>
<td align="char">2.50</td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td></td>
<td></td>
<td></td>
<td>Gly</td>
<td align="center">419</td>
<td align="char">5.92</td>
</tr>
<tr>
<td>Total</td>
<td>2,967</td>
<td align="char">41.95</td>
<td>Total</td>
<td align="center">2,529</td>
<td align="char">35.76</td>
</tr>
<tr>
<td colspan="8">
<hr></hr>
</td>
</tr>
<tr>
<td rowspan="4">Charged,
<break></break>
negative</td>
<td>Asp</td>
<td>395</td>
<td align="char">5.58</td>
<td rowspan="4">Charged,
<break></break>
positive</td>
<td>Lys</td>
<td align="center">415</td>
<td align="char">5.87</td>
</tr>
<tr>
<td>Glu</td>
<td>348</td>
<td align="char">4.92</td>
<td>Arg</td>
<td align="center">259</td>
<td align="char">3.66</td>
</tr>
<tr>
<td></td>
<td></td>
<td></td>
<td>His</td>
<td align="center">160</td>
<td align="char">2.26</td>
</tr>
<tr>
<td>Total</td>
<td>743</td>
<td align="char">10.50</td>
<td>Total</td>
<td align="center">834</td>
<td align="char">11.79</td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<fn id="tbl1fn1">
<label>*</label>
<p id="ntp0005">No.: number of occurrences of the amino acid in the R protein.</p>
</fn>
</table-wrap-foot>
<table-wrap-foot>
<fn id="tbl1fn2">
<label>#</label>
<p id="ntp0010">F: Frequency in percentage of the amino acid in the R protein.</p>
</fn>
</table-wrap-foot>
</table-wrap>
<table-wrap id="t0010" position="float">
<label>Table 2</label>
<caption>
<p>The Location and Size of the Putative Regions of the R Protein</p>
</caption>
<alt-text id="at0050">Table 2</alt-text>
<table frame="hsides" rules="groups">
<thead>
<tr>
<th align="right">Region</th>
<th align="right">Location
<xref rid="tbl2fn1" ref-type="table-fn">§</xref>
</th>
<th align="right">Size (a.a.)</th>
</tr>
</thead>
<tbody>
<tr>
<td align="right">LP
<xref rid="tbl2fn2" ref-type="table-fn"></xref>
</td>
<td align="right">246–782</td>
<td align="right">179</td>
</tr>
<tr>
<td align="right">p65-LP
<xref rid="tbl2fn3" ref-type="table-fn"></xref>
</td>
<td align="right">783-2,699</td>
<td align="right">639</td>
</tr>
<tr>
<td align="right">PLP (NSP1)</td>
<td align="right">2,700-9,965</td>
<td align="right">2,422</td>
</tr>
<tr>
<td align="right">3CLP (NSP2)</td>
<td align="right">9,966-10,883</td>
<td align="right">306</td>
</tr>
<tr>
<td align="right">NSP3</td>
<td align="right">10,884-11,753</td>
<td align="right">290</td>
</tr>
<tr>
<td align="right">NSP4</td>
<td align="right">11,754-12,002</td>
<td align="right">83</td>
</tr>
<tr>
<td align="right">NSP5</td>
<td align="right">12,003–12,596</td>
<td align="right">198</td>
</tr>
<tr>
<td align="right">NSP6</td>
<td align="right">12,597-12,935</td>
<td align="right">113</td>
</tr>
<tr>
<td align="right">NSP7</td>
<td align="right">12,936-13,352</td>
<td align="right">139</td>
</tr>
<tr>
<td align="right">NSP8</td>
<td align="right">13,353-13,394</td>
<td align="right">13</td>
</tr>
<tr>
<td align="right">RdRp (NSP9)</td>
<td align="right">13,353-13,379; 13,379-16,147</td>
<td align="right">932</td>
</tr>
<tr>
<td align="right">HEL (NSP10)</td>
<td align="right">16,148-17,950</td>
<td align="right">601</td>
</tr>
<tr>
<td align="right">NSP11</td>
<td align="right">17,951-19,531</td>
<td align="right">527</td>
</tr>
<tr>
<td align="right">NSP12</td>
<td align="right">19,532-20,569</td>
<td align="right">346</td>
</tr>
<tr>
<td align="right">NSP13</td>
<td align="right">20,570-21,466</td>
<td align="right">298</td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<fn id="tbl2fn1">
<label>§</label>
<p id="ntp0015">nucleotide position of the ORF for the R protein.</p>
</fn>
</table-wrap-foot>
<table-wrap-foot>
<fn id="tbl2fn2">
<label></label>
<p id="ntp0020">LP: leader protein.</p>
</fn>
</table-wrap-foot>
<table-wrap-foot>
<fn id="tbl2fn3">
<label></label>
<p id="ntp0025">p65-LP: MHV p65 like protein.</p>
</fn>
</table-wrap-foot>
</table-wrap>
<table-wrap id="t0015" position="float">
<label>Table 3</label>
<caption>
<p>Substitutions in Different Regions of the R Protein</p>
</caption>
<alt-text id="at0055">Table 3</alt-text>
<table frame="hsides" rules="groups">
<thead>
<tr>
<th>Regions</th>
<th align="right">Sn</th>
<th align="right">F (%)
<xref rid="tbl3fn1" ref-type="table-fn">*</xref>
</th>
<th align="right">Syn</th>
<th align="right">nSyn</th>
<th align="right">F (nSyn)(%)
<xref rid="tbl3fn2" ref-type="table-fn">#</xref>
</th>
</tr>
</thead>
<tbody>
<tr>
<td>LP
<xref rid="tbl3fn3" ref-type="table-fn">§</xref>
</td>
<td align="right">5</td>
<td align="char">0.93</td>
<td align="right">0</td>
<td align="right">5</td>
<td align="right">100</td>
</tr>
<tr>
<td>p65-LP
<xref rid="tbl3fn4" ref-type="table-fn"></xref>
</td>
<td align="right">4</td>
<td align="char">0.21</td>
<td align="right">2</td>
<td align="right">2</td>
<td align="right">50</td>
</tr>
<tr>
<td>NSP1 (PLP)</td>
<td align="right">36</td>
<td align="char">0.50</td>
<td align="right">8</td>
<td align="right">28</td>
<td align="right">77.78</td>
</tr>
<tr>
<td>NSP2 (3CLP)</td>
<td align="right">5</td>
<td align="char">0.54</td>
<td align="right">3</td>
<td align="right">2</td>
<td align="right">40</td>
</tr>
<tr>
<td>NSP3</td>
<td align="right">2</td>
<td align="char">0.23</td>
<td align="right">0</td>
<td align="right">2</td>
<td align="right">100</td>
</tr>
<tr>
<td>NSP4</td>
<td align="right">2</td>
<td align="char">0.80</td>
<td align="right">0</td>
<td align="right">2</td>
<td align="right">100</td>
</tr>
<tr>
<td>NSP5</td>
<td align="right">3</td>
<td align="char">0.51</td>
<td align="right">1</td>
<td align="right">2</td>
<td align="right">66.67</td>
</tr>
<tr>
<td>NSP6</td>
<td align="right">0</td>
<td align="char">0.00</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
</tr>
<tr>
<td>NSP7</td>
<td align="right">4</td>
<td align="char">0.96</td>
<td align="right">0</td>
<td align="right">4</td>
<td align="right">100</td>
</tr>
<tr>
<td>NSP8</td>
<td align="right">0</td>
<td align="char">0.00</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
</tr>
<tr>
<td>NSP9 (RdRp)</td>
<td align="right">4</td>
<td align="char">0.14</td>
<td align="right">0</td>
<td align="right">4</td>
<td align="right">100</td>
</tr>
<tr>
<td>NSP10 (HEL)</td>
<td align="right">7</td>
<td align="char">0.39</td>
<td align="right">4</td>
<td align="right">3</td>
<td align="right">42.86</td>
</tr>
<tr>
<td>NSP11</td>
<td align="right">6</td>
<td align="char">0.38</td>
<td align="right">4</td>
<td align="right">2</td>
<td align="right">33.33</td>
</tr>
<tr>
<td>NSP12</td>
<td align="right">4</td>
<td align="char">0.39</td>
<td align="right">2</td>
<td align="right">2</td>
<td align="right">50</td>
</tr>
<tr>
<td>NSP13</td>
<td align="right">10</td>
<td align="char">1.11</td>
<td align="right">3</td>
<td align="right">7</td>
<td align="right">70</td>
</tr>
<tr>
<td>Total</td>
<td align="right">92</td>
<td align="char">0.43</td>
<td align="right">27</td>
<td align="right">65</td>
<td align="right">70.65</td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<fn>
<p>Sn: number of substitutions; Syn: number of synonymous substitutions; nSyn: number of non-synonymous substitutions.</p>
</fn>
</table-wrap-foot>
<table-wrap-foot>
<fn id="tbl3fn1">
<label>*</label>
<p id="ntp0030">F: number of substitutions in the corresponding region as a percentage of the region's size in nucleotides.</p>
</fn>
</table-wrap-foot>
<table-wrap-foot>
<fn id="tbl3fn2">
<label>#</label>
<p id="ntp0035">F (nSyn): non-synonymous substitutions as a percentage of all substitutions in the region.</p>
</fn>
</table-wrap-foot>
<table-wrap-foot>
<fn id="tbl3fn3">
<label>§</label>
<p id="ntp0040">LP: leader protein. An L→STOP substitution was found in the leader protein.</p>
</fn>
</table-wrap-foot>
<table-wrap-foot>
<fn id="tbl3fn4">
<label></label>
<p id="ntp0045">p65-LP: MHV p65-like protein.</p>
</fn>
</table-wrap-foot>
</table-wrap>
</floats-group>
</pmc>
</record>
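
The Size column of Table 2 and the F columns of Table 3 in the record above can be cross-checked with simple arithmetic. The following is a minimal sketch, not the authors' procedure. It assumes that locations are inclusive nucleotide ranges, that the size in a.a. is the nucleotide length divided by 3, and that F is the number of substitutions per nucleotide expressed as a percentage. The function names are hypothetical; the numbers are copied from rows of the two tables.

```python
def region_size_aa(start_nt: int, end_nt: int) -> int:
    """Size in amino acids of an inclusive nucleotide range (assumption: 3 nt per codon)."""
    length_nt = end_nt - start_nt + 1
    return length_nt // 3


def substitution_frequency(n_subs: int, start_nt: int, end_nt: int) -> float:
    """Substitutions as a percentage of the region length in nucleotides (2 decimals)."""
    length_nt = end_nt - start_nt + 1
    return round(100.0 * n_subs / length_nt, 2)


def nonsyn_fraction(n_nonsyn: int, n_subs: int) -> float:
    """Non-synonymous substitutions as a percentage of all substitutions (0 if none)."""
    return round(100.0 * n_nonsyn / n_subs, 2) if n_subs else 0.0


# Values copied from Table 2 (location) and Table 3 (Sn, nSyn).
lp_range = (246, 782)        # LP
clp3_range = (9966, 10883)   # 3CLP (NSP2)

print(region_size_aa(*lp_range))             # 179, as tabulated
print(region_size_aa(*clp3_range))           # 306, as tabulated
print(substitution_frequency(5, *lp_range))  # 0.93, as tabulated
print(nonsyn_fraction(28, 36))               # 77.78, the NSP1 (PLP) row
```

Applied to the remaining rows, the same check reproduces most of the tabulated values; where a row does not match, the original article is the place to re-check rather than assuming either number is right.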

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Sante/explor/SrasV1/Data/Pmc/Corpus
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 001029  | SxmlIndent | more

Or

HfdSelect -h $EXPLOR_AREA/Data/Pmc/Corpus/biblio.hfd -nk 001029  | SxmlIndent | more
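
The commands above print the indented XML record. As a hedged illustration of what can be done with that output, the sketch below reads table rows with Python's standard `xml.etree.ElementTree`. It works on a small inline fragment shaped like the Table 3 rows of this record, so it does not depend on Dilib. Whether the full record parses as-is is an assumption that has not been checked here: prefixes such as `xlink:` need namespace declarations before a strict parser accepts them. The function name `table_rows` is hypothetical.

```python
import xml.etree.ElementTree as ET

# Inline fragment in the same shape as two Table 3 rows of the record.
SAMPLE = """
<table>
<tbody>
<tr>
<td>NSP3</td>
<td align="right">2</td>
<td align="char">0.23</td>
<td align="right">0</td>
<td align="right">2</td>
<td align="right">100</td>
</tr>
<tr>
<td>NSP13</td>
<td align="right">10</td>
<td align="char">1.11</td>
<td align="right">3</td>
<td align="right">7</td>
<td align="right">70</td>
</tr>
</tbody>
</table>
"""


def table_rows(xml_text):
    """Return each <tr> as a list of trimmed cell strings (<td> and <th> children)."""
    root = ET.fromstring(xml_text)
    rows = []
    for tr in root.iter("tr"):
        cells = ["".join(cell.itertext()).strip()
                 for cell in tr if cell.tag in ("td", "th")]
        rows.append(cells)
    return rows


rows = table_rows(SAMPLE)
for region, sn, freq, syn, nsyn, freq_nsyn in rows:
    # Convert the text cells to numbers for further checks.
    print(region, int(sn), float(freq), int(syn), int(nsyn), float(freq_nsyn))
```

Using `itertext()` rather than `cell.text` keeps the text of cells that contain nested elements, such as the footnote `<xref>` markers in the LP and p65-LP rows.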

To add a link to this page within the Wicri network

{{Explor lien
   |wiki=    Sante
   |area=    SrasV1
   |flux=    Pmc
   |étape=   Corpus
   |type=    RBID
   |clé=     
   |texte=   
}}

Wicri

This area was generated with Dilib version V0.6.33.
Data generation: Tue Apr 28 14:49:16 2020. Site generation: Sat Mar 27 22:06:49 2021