
<record>
<TEI>
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en">Virus Variation Resource – improved response to emergent viral outbreaks</title>
<author>
<name sortKey="Hatcher, Eneida L" sort="Hatcher, Eneida L" uniqKey="Hatcher E" first="Eneida L." last="Hatcher">Eneida L. Hatcher</name>
</author>
<author>
<name sortKey="Zhdanov, Sergey A" sort="Zhdanov, Sergey A" uniqKey="Zhdanov S" first="Sergey A." last="Zhdanov">Sergey A. Zhdanov</name>
</author>
<author>
<name sortKey="Bao, Yiming" sort="Bao, Yiming" uniqKey="Bao Y" first="Yiming" last="Bao">Yiming Bao</name>
</author>
<author>
<name sortKey="Blinkova, Olga" sort="Blinkova, Olga" uniqKey="Blinkova O" first="Olga" last="Blinkova">Olga Blinkova</name>
</author>
<author>
<name sortKey="Nawrocki, Eric P" sort="Nawrocki, Eric P" uniqKey="Nawrocki E" first="Eric P." last="Nawrocki">Eric P. Nawrocki</name>
</author>
<author>
<name sortKey="Ostapchuck, Yuri" sort="Ostapchuck, Yuri" uniqKey="Ostapchuck Y" first="Yuri" last="Ostapchuck">Yuri Ostapchuck</name>
</author>
<author>
<name sortKey="Sch Ffer, Alejandro A" sort="Sch Ffer, Alejandro A" uniqKey="Sch Ffer A" first="Alejandro A." last="Sch Ffer">Alejandro A. Schäffer</name>
</author>
<author>
<name sortKey="Brister, J Rodney" sort="Brister, J Rodney" uniqKey="Brister J" first="J. Rodney" last="Brister">J. Rodney Brister</name>
</author>
</titleStmt>
<publicationStmt>
<idno type="wicri:source">PMC</idno>
<idno type="pmid">27899678</idno>
<idno type="pmc">5210549</idno>
<idno type="url">http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5210549</idno>
<idno type="RBID">PMC:5210549</idno>
<idno type="doi">10.1093/nar/gkw1065</idno>
<date when="2016">2016</date>
<idno type="wicri:Area/Pmc/Corpus">000F57</idno>
<idno type="wicri:explorRef" wicri:stream="Pmc" wicri:step="Corpus" wicri:corpus="PMC">000F57</idno>
</publicationStmt>
<sourceDesc>
<biblStruct>
<analytic>
<title xml:lang="en" level="a" type="main">Virus Variation Resource – improved response to emergent viral outbreaks</title>
<author>
<name sortKey="Hatcher, Eneida L" sort="Hatcher, Eneida L" uniqKey="Hatcher E" first="Eneida L." last="Hatcher">Eneida L. Hatcher</name>
</author>
<author>
<name sortKey="Zhdanov, Sergey A" sort="Zhdanov, Sergey A" uniqKey="Zhdanov S" first="Sergey A." last="Zhdanov">Sergey A. Zhdanov</name>
</author>
<author>
<name sortKey="Bao, Yiming" sort="Bao, Yiming" uniqKey="Bao Y" first="Yiming" last="Bao">Yiming Bao</name>
</author>
<author>
<name sortKey="Blinkova, Olga" sort="Blinkova, Olga" uniqKey="Blinkova O" first="Olga" last="Blinkova">Olga Blinkova</name>
</author>
<author>
<name sortKey="Nawrocki, Eric P" sort="Nawrocki, Eric P" uniqKey="Nawrocki E" first="Eric P." last="Nawrocki">Eric P. Nawrocki</name>
</author>
<author>
<name sortKey="Ostapchuck, Yuri" sort="Ostapchuck, Yuri" uniqKey="Ostapchuck Y" first="Yuri" last="Ostapchuck">Yuri Ostapchuck</name>
</author>
<author>
<name sortKey="Sch Ffer, Alejandro A" sort="Sch Ffer, Alejandro A" uniqKey="Sch Ffer A" first="Alejandro A." last="Sch Ffer">Alejandro A. Schäffer</name>
</author>
<author>
<name sortKey="Brister, J Rodney" sort="Brister, J Rodney" uniqKey="Brister J" first="J. Rodney" last="Brister">J. Rodney Brister</name>
</author>
</analytic>
<series>
<title level="j">Nucleic Acids Research</title>
<idno type="ISSN">0305-1048</idno>
<idno type="eISSN">1362-4962</idno>
<imprint>
<date when="2016">2016</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
</fileDesc>
<profileDesc>
<textClass></textClass>
</profileDesc>
</teiHeader>
<front>
<div type="abstract" xml:lang="en">
<title>Abstract</title>
<p>The Virus Variation Resource is a value-added viral sequence data resource hosted by the National Center for Biotechnology Information. The resource is located at <ext-link ext-link-type="uri" xlink:href="http://www.ncbi.nlm.nih.gov/genome/viruses/variation/">http://www.ncbi.nlm.nih.gov/genome/viruses/variation/</ext-link> and includes modules for seven viral groups: influenza virus, <italic>Dengue virus, West Nile virus, Ebolavirus</italic>, MERS coronavirus, <italic>Rotavirus A</italic> and <italic>Zika virus</italic>. Each module is supported by pipelines that scan newly released GenBank records, annotate genes and proteins, parse sample descriptors and map them to a controlled vocabulary. These processes in turn support a purpose-built search interface where users can select sequences based on standardized gene, protein and metadata terms. Once sequences are selected, a suite of tools for downloading data, multi-sequence alignment and tree building supports a variety of user-directed activities. This manuscript describes a series of features and functionalities recently added to the Virus Variation Resource.</p>
</div>
</front>
<back>
<div1 type="bibliography">
<listBibl>
<biblStruct>
<analytic>
<author>
<name sortKey="Cochrane, G" uniqKey="Cochrane G">G. Cochrane</name>
</author>
<author>
<name sortKey="Karsch Mizrachi, I" uniqKey="Karsch Mizrachi I">I. Karsch-Mizrachi</name>
</author>
<author>
<name sortKey="Takagi, T" uniqKey="Takagi T">T. Takagi</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Benson, D A" uniqKey="Benson D">D.A. Benson</name>
</author>
<author>
<name sortKey="Clark, K" uniqKey="Clark K">K. Clark</name>
</author>
<author>
<name sortKey="Karsch Mizrachi, I" uniqKey="Karsch Mizrachi I">I. Karsch-Mizrachi</name>
</author>
<author>
<name sortKey="Lipman, D J" uniqKey="Lipman D">D.J. Lipman</name>
</author>
<author>
<name sortKey="Ostell, J" uniqKey="Ostell J">J. Ostell</name>
</author>
<author>
<name sortKey="Sayers, E W" uniqKey="Sayers E">E.W. Sayers</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Paul, D" uniqKey="Paul D">D. Paul</name>
</author>
<author>
<name sortKey="Bartenschlager, R" uniqKey="Bartenschlager R">R. Bartenschlager</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Lingala, S" uniqKey="Lingala S">S. Lingala</name>
</author>
<author>
<name sortKey="Ghany, M G" uniqKey="Ghany M">M.G. Ghany</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Mcvey, D S" uniqKey="Mcvey D">D.S. McVey</name>
</author>
<author>
<name sortKey="Wilson, W C" uniqKey="Wilson W">W.C. Wilson</name>
</author>
<author>
<name sortKey="Gay, C G" uniqKey="Gay C">C.G. Gay</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Bavia, L" uniqKey="Bavia L">L. Bavia</name>
</author>
<author>
<name sortKey="Mosimann, A L" uniqKey="Mosimann A">A.L. Mosimann</name>
</author>
<author>
<name sortKey="Aoki, M N" uniqKey="Aoki M">M.N. Aoki</name>
</author>
<author>
<name sortKey="Duarte Dos Santos, C N" uniqKey="Duarte Dos Santos C">C.N. Duarte Dos Santos</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Faggioni, G" uniqKey="Faggioni G">G. Faggioni</name>
</author>
<author>
<name sortKey="Pomponi, A" uniqKey="Pomponi A">A. Pomponi</name>
</author>
<author>
<name sortKey="De Santis, R" uniqKey="De Santis R">R. De Santis</name>
</author>
<author>
<name sortKey="Masuelli, L" uniqKey="Masuelli L">L. Masuelli</name>
</author>
<author>
<name sortKey="Ciammaruconi, A" uniqKey="Ciammaruconi A">A. Ciammaruconi</name>
</author>
<author>
<name sortKey="Monaco, F" uniqKey="Monaco F">F. Monaco</name>
</author>
<author>
<name sortKey="Di Gennaro, A" uniqKey="Di Gennaro A">A. Di Gennaro</name>
</author>
<author>
<name sortKey="Marzocchella, L" uniqKey="Marzocchella L">L. Marzocchella</name>
</author>
<author>
<name sortKey="Sambri, V" uniqKey="Sambri V">V. Sambri</name>
</author>
<author>
<name sortKey="Lelli, R" uniqKey="Lelli R">R. Lelli</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Bao, Y" uniqKey="Bao Y">Y. Bao</name>
</author>
<author>
<name sortKey="Bolotov, P" uniqKey="Bolotov P">P. Bolotov</name>
</author>
<author>
<name sortKey="Dernovoy, D" uniqKey="Dernovoy D">D. Dernovoy</name>
</author>
<author>
<name sortKey="Kiryutin, B" uniqKey="Kiryutin B">B. Kiryutin</name>
</author>
<author>
<name sortKey="Zaslavsky, L" uniqKey="Zaslavsky L">L. Zaslavsky</name>
</author>
<author>
<name sortKey="Tatusova, T" uniqKey="Tatusova T">T. Tatusova</name>
</author>
<author>
<name sortKey="Ostell, J" uniqKey="Ostell J">J. Ostell</name>
</author>
<author>
<name sortKey="Lipman, D" uniqKey="Lipman D">D. Lipman</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Foley, B" uniqKey="Foley B">B. Foley</name>
</author>
<author>
<name sortKey="Leitner, T" uniqKey="Leitner T">T. Leitner</name>
</author>
<author>
<name sortKey="Apetrei, C" uniqKey="Apetrei C">C. Apetrei</name>
</author>
<author>
<name sortKey="Hahn, B" uniqKey="Hahn B">B. Hahn</name>
</author>
<author>
<name sortKey="Mizrachi, I" uniqKey="Mizrachi I">I. Mizrachi</name>
</author>
<author>
<name sortKey="Mullins, J" uniqKey="Mullins J">J. Mullins</name>
</author>
<author>
<name sortKey="Rambaut, A" uniqKey="Rambaut A">A. Rambaut</name>
</author>
<author>
<name sortKey="Wolinsky, S" uniqKey="Wolinsky S">S. Wolinsky</name>
</author>
<author>
<name sortKey="Korber, B" uniqKey="Korber B">B. Korber</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Greene, J M" uniqKey="Greene J">J.M. Greene</name>
</author>
<author>
<name sortKey="Collins, F" uniqKey="Collins F">F. Collins</name>
</author>
<author>
<name sortKey="Lefkowitz, E J" uniqKey="Lefkowitz E">E.J. Lefkowitz</name>
</author>
<author>
<name sortKey="Roos, D" uniqKey="Roos D">D. Roos</name>
</author>
<author>
<name sortKey="Scheuermann, R H" uniqKey="Scheuermann R">R.H. Scheuermann</name>
</author>
<author>
<name sortKey="Sobral, B" uniqKey="Sobral B">B. Sobral</name>
</author>
<author>
<name sortKey="Stevens, R" uniqKey="Stevens R">R. Stevens</name>
</author>
<author>
<name sortKey="White, O" uniqKey="White O">O. White</name>
</author>
<author>
<name sortKey="Di Francesco, V" uniqKey="Di Francesco V">V. Di Francesco</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Pickett, B E" uniqKey="Pickett B">B.E. Pickett</name>
</author>
<author>
<name sortKey="Sadat, E L" uniqKey="Sadat E">E.L. Sadat</name>
</author>
<author>
<name sortKey="Zhang, Y" uniqKey="Zhang Y">Y. Zhang</name>
</author>
<author>
<name sortKey="Noronha, J M" uniqKey="Noronha J">J.M. Noronha</name>
</author>
<author>
<name sortKey="Squires, R B" uniqKey="Squires R">R.B. Squires</name>
</author>
<author>
<name sortKey="Hunt, V" uniqKey="Hunt V">V. Hunt</name>
</author>
<author>
<name sortKey="Liu, M" uniqKey="Liu M">M. Liu</name>
</author>
<author>
<name sortKey="Kumar, S" uniqKey="Kumar S">S. Kumar</name>
</author>
<author>
<name sortKey="Zaremba, S" uniqKey="Zaremba S">S. Zaremba</name>
</author>
<author>
<name sortKey="Gu, Z" uniqKey="Gu Z">Z. Gu</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Van Doorslaer, K" uniqKey="Van Doorslaer K">K. Van Doorslaer</name>
</author>
<author>
<name sortKey="Tan, Q" uniqKey="Tan Q">Q. Tan</name>
</author>
<author>
<name sortKey="Xirasagar, S" uniqKey="Xirasagar S">S. Xirasagar</name>
</author>
<author>
<name sortKey="Bandaru, S" uniqKey="Bandaru S">S. Bandaru</name>
</author>
<author>
<name sortKey="Gopalan, V" uniqKey="Gopalan V">V. Gopalan</name>
</author>
<author>
<name sortKey="Mohamoud, Y" uniqKey="Mohamoud Y">Y. Mohamoud</name>
</author>
<author>
<name sortKey="Huyen, Y" uniqKey="Huyen Y">Y. Huyen</name>
</author>
<author>
<name sortKey="Mcbride, A A" uniqKey="Mcbride A">A.A. McBride</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Resch, W" uniqKey="Resch W">W. Resch</name>
</author>
<author>
<name sortKey="Zaslavsky, L" uniqKey="Zaslavsky L">L. Zaslavsky</name>
</author>
<author>
<name sortKey="Kiryutin, B" uniqKey="Kiryutin B">B. Kiryutin</name>
</author>
<author>
<name sortKey="Rozanov, M" uniqKey="Rozanov M">M. Rozanov</name>
</author>
<author>
<name sortKey="Bao, Y" uniqKey="Bao Y">Y. Bao</name>
</author>
<author>
<name sortKey="Tatusova, T A" uniqKey="Tatusova T">T.A. Tatusova</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Brister, J R" uniqKey="Brister J">J.R. Brister</name>
</author>
<author>
<name sortKey="Bao, Y" uniqKey="Bao Y">Y. Bao</name>
</author>
<author>
<name sortKey="Zhdanov, S A" uniqKey="Zhdanov S">S.A. Zhdanov</name>
</author>
<author>
<name sortKey="Ostapchuck, Y" uniqKey="Ostapchuck Y">Y. Ostapchuck</name>
</author>
<author>
<name sortKey="Chetvernin, V" uniqKey="Chetvernin V">V. Chetvernin</name>
</author>
<author>
<name sortKey="Kiryutin, B" uniqKey="Kiryutin B">B. Kiryutin</name>
</author>
<author>
<name sortKey="Zaslavsky, L" uniqKey="Zaslavsky L">L. Zaslavsky</name>
</author>
<author>
<name sortKey="Kimelman, M" uniqKey="Kimelman M">M. Kimelman</name>
</author>
<author>
<name sortKey="Tatusova, T A" uniqKey="Tatusova T">T.A. Tatusova</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Klema, V J" uniqKey="Klema V">V.J. Klema</name>
</author>
<author>
<name sortKey="Ye, M" uniqKey="Ye M">M. Ye</name>
</author>
<author>
<name sortKey="Hindupur, A" uniqKey="Hindupur A">A. Hindupur</name>
</author>
<author>
<name sortKey="Teramoto, T" uniqKey="Teramoto T">T. Teramoto</name>
</author>
<author>
<name sortKey="Gottipati, K" uniqKey="Gottipati K">K. Gottipati</name>
</author>
<author>
<name sortKey="Padmanabhan, R" uniqKey="Padmanabhan R">R. Padmanabhan</name>
</author>
<author>
<name sortKey="Choi, K H" uniqKey="Choi K">K.H. Choi</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Bell, A" uniqKey="Bell A">A. Bell</name>
</author>
<author>
<name sortKey="Lewandowski, K" uniqKey="Lewandowski K">K. Lewandowski</name>
</author>
<author>
<name sortKey="Myers, R" uniqKey="Myers R">R. Myers</name>
</author>
<author>
<name sortKey="Wooldridge, D" uniqKey="Wooldridge D">D. Wooldridge</name>
</author>
<author>
<name sortKey="Aarons, E" uniqKey="Aarons E">E. Aarons</name>
</author>
<author>
<name sortKey="Simpson, A" uniqKey="Simpson A">A. Simpson</name>
</author>
<author>
<name sortKey="Vipond, R" uniqKey="Vipond R">R. Vipond</name>
</author>
<author>
<name sortKey="Jacobs, M" uniqKey="Jacobs M">M. Jacobs</name>
</author>
<author>
<name sortKey="Gharbia, S" uniqKey="Gharbia S">S. Gharbia</name>
</author>
<author>
<name sortKey="Zambon, M" uniqKey="Zambon M">M. Zambon</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Agbemabiese, C A" uniqKey="Agbemabiese C">C.A. Agbemabiese</name>
</author>
<author>
<name sortKey="Nakagomi, T" uniqKey="Nakagomi T">T. Nakagomi</name>
</author>
<author>
<name sortKey="Doan, Y H" uniqKey="Doan Y">Y.H. Doan</name>
</author>
<author>
<name sortKey="Do, L P" uniqKey="Do L">L.P. Do</name>
</author>
<author>
<name sortKey="Damanka, S" uniqKey="Damanka S">S. Damanka</name>
</author>
<author>
<name sortKey="Armah, G E" uniqKey="Armah G">G.E. Armah</name>
</author>
<author>
<name sortKey="Nakagomi, O" uniqKey="Nakagomi O">O. Nakagomi</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Bao, Y" uniqKey="Bao Y">Y. Bao</name>
</author>
<author>
<name sortKey="Bolotov, P" uniqKey="Bolotov P">P. Bolotov</name>
</author>
<author>
<name sortKey="Dernovoy, D" uniqKey="Dernovoy D">D. Dernovoy</name>
</author>
<author>
<name sortKey="Kiryutin, B" uniqKey="Kiryutin B">B. Kiryutin</name>
</author>
<author>
<name sortKey="Tatusova, T" uniqKey="Tatusova T">T. Tatusova</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="O Leary, N A" uniqKey="O Leary N">N.A. O'Leary</name>
</author>
<author>
<name sortKey="Wright, M W" uniqKey="Wright M">M.W. Wright</name>
</author>
<author>
<name sortKey="Brister, J R" uniqKey="Brister J">J.R. Brister</name>
</author>
<author>
<name sortKey="Ciufo, S" uniqKey="Ciufo S">S. Ciufo</name>
</author>
<author>
<name sortKey="Haddad, D" uniqKey="Haddad D">D. Haddad</name>
</author>
<author>
<name sortKey="Mcveigh, R" uniqKey="Mcveigh R">R. McVeigh</name>
</author>
<author>
<name sortKey="Rajput, B" uniqKey="Rajput B">B. Rajput</name>
</author>
<author>
<name sortKey="Robbertse, B" uniqKey="Robbertse B">B. Robbertse</name>
</author>
<author>
<name sortKey="Smith White, B" uniqKey="Smith White B">B. Smith-White</name>
</author>
<author>
<name sortKey="Ako Adjei, D" uniqKey="Ako Adjei D">D. Ako-Adjei</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Matthijnssens, J" uniqKey="Matthijnssens J">J. Matthijnssens</name>
</author>
<author>
<name sortKey="Ciarlet, M" uniqKey="Ciarlet M">M. Ciarlet</name>
</author>
<author>
<name sortKey="Mcdonald, S M" uniqKey="Mcdonald S">S.M. McDonald</name>
</author>
<author>
<name sortKey="Attoui, H" uniqKey="Attoui H">H. Attoui</name>
</author>
<author>
<name sortKey="Banyai, K" uniqKey="Banyai K">K. Banyai</name>
</author>
<author>
<name sortKey="Brister, J R" uniqKey="Brister J">J.R. Brister</name>
</author>
<author>
<name sortKey="Buesa, J" uniqKey="Buesa J">J. Buesa</name>
</author>
<author>
<name sortKey="Esona, M D" uniqKey="Esona M">M.D. Esona</name>
</author>
<author>
<name sortKey="Estes, M K" uniqKey="Estes M">M.K. Estes</name>
</author>
<author>
<name sortKey="Gentsch, J R" uniqKey="Gentsch J">J.R. Gentsch</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Brister, J R" uniqKey="Brister J">J.R. Brister</name>
</author>
<author>
<name sortKey="Bao, Y" uniqKey="Bao Y">Y. Bao</name>
</author>
<author>
<name sortKey="Kuiken, C" uniqKey="Kuiken C">C. Kuiken</name>
</author>
<author>
<name sortKey="Lefkowitz, E J" uniqKey="Lefkowitz E">E.J. Lefkowitz</name>
</author>
<author>
<name sortKey="Le Mercier, P" uniqKey="Le Mercier P">P. Le Mercier</name>
</author>
<author>
<name sortKey="Leplae, R" uniqKey="Leplae R">R. Leplae</name>
</author>
<author>
<name sortKey="Madupu, R" uniqKey="Madupu R">R. Madupu</name>
</author>
<author>
<name sortKey="Scheuermann, R H" uniqKey="Scheuermann R">R.H. Scheuermann</name>
</author>
<author>
<name sortKey="Schobel, S" uniqKey="Schobel S">S. Schobel</name>
</author>
<author>
<name sortKey="Seto, D" uniqKey="Seto D">D. Seto</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Brister, J R" uniqKey="Brister J">J.R. Brister</name>
</author>
<author>
<name sortKey="Le Mercier, P" uniqKey="Le Mercier P">P. Le Mercier</name>
</author>
<author>
<name sortKey="Hu, J C" uniqKey="Hu J">J.C. Hu</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Kuhn, J H" uniqKey="Kuhn J">J.H. Kuhn</name>
</author>
<author>
<name sortKey="Andersen, K G" uniqKey="Andersen K">K.G. Andersen</name>
</author>
<author>
<name sortKey="Bao, Y" uniqKey="Bao Y">Y. Bao</name>
</author>
<author>
<name sortKey="Bavari, S" uniqKey="Bavari S">S. Bavari</name>
</author>
<author>
<name sortKey="Becker, S" uniqKey="Becker S">S. Becker</name>
</author>
<author>
<name sortKey="Bennett, R S" uniqKey="Bennett R">R.S. Bennett</name>
</author>
<author>
<name sortKey="Bergman, N H" uniqKey="Bergman N">N.H. Bergman</name>
</author>
<author>
<name sortKey="Blinkova, O" uniqKey="Blinkova O">O. Blinkova</name>
</author>
<author>
<name sortKey="Bradfute, S" uniqKey="Bradfute S">S. Bradfute</name>
</author>
<author>
<name sortKey="Brister, J R" uniqKey="Brister J">J.R. Brister</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Brister, J R" uniqKey="Brister J">J.R. Brister</name>
</author>
<author>
<name sortKey="Ako Adjei, D" uniqKey="Ako Adjei D">D. Ako-Adjei</name>
</author>
<author>
<name sortKey="Bao, Y" uniqKey="Bao Y">Y. Bao</name>
</author>
<author>
<name sortKey="Blinkova, O" uniqKey="Blinkova O">O. Blinkova</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Altschul, S F" uniqKey="Altschul S">S.F. Altschul</name>
</author>
<author>
<name sortKey="Madden, T L" uniqKey="Madden T">T.L. Madden</name>
</author>
<author>
<name sortKey="Schaffer, A A" uniqKey="Schaffer A">A.A. Schaffer</name>
</author>
<author>
<name sortKey="Zhang, J" uniqKey="Zhang J">J. Zhang</name>
</author>
<author>
<name sortKey="Zhang, Z" uniqKey="Zhang Z">Z. Zhang</name>
</author>
<author>
<name sortKey="Miller, W" uniqKey="Miller W">W. Miller</name>
</author>
<author>
<name sortKey="Lipman, D J" uniqKey="Lipman D">D.J. Lipman</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Nawrocki, E P" uniqKey="Nawrocki E">E.P. Nawrocki</name>
</author>
<author>
<name sortKey="Eddy, S R" uniqKey="Eddy S">S.R. Eddy</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Matthijnssens, J" uniqKey="Matthijnssens J">J. Matthijnssens</name>
</author>
<author>
<name sortKey="Ciarlet, M" uniqKey="Ciarlet M">M. Ciarlet</name>
</author>
<author>
<name sortKey="Rahman, M" uniqKey="Rahman M">M. Rahman</name>
</author>
<author>
<name sortKey="Attoui, H" uniqKey="Attoui H">H. Attoui</name>
</author>
<author>
<name sortKey="Banyai, K" uniqKey="Banyai K">K. Banyai</name>
</author>
<author>
<name sortKey="Estes, M K" uniqKey="Estes M">M.K. Estes</name>
</author>
<author>
<name sortKey="Gentsch, J R" uniqKey="Gentsch J">J.R. Gentsch</name>
</author>
<author>
<name sortKey="Iturriza Gomara, M" uniqKey="Iturriza Gomara M">M. Iturriza-Gomara</name>
</author>
<author>
<name sortKey="Kirkwood, C D" uniqKey="Kirkwood C">C.D. Kirkwood</name>
</author>
<author>
<name sortKey="Martella, V" uniqKey="Martella V">V. Martella</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Maes, P" uniqKey="Maes P">P. Maes</name>
</author>
<author>
<name sortKey="Matthijnssens, J" uniqKey="Matthijnssens J">J. Matthijnssens</name>
</author>
<author>
<name sortKey="Rahman, M" uniqKey="Rahman M">M. Rahman</name>
</author>
<author>
<name sortKey="Van Ranst, M" uniqKey="Van Ranst M">M. Van Ranst</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Barrett, T" uniqKey="Barrett T">T. Barrett</name>
</author>
<author>
<name sortKey="Clark, K" uniqKey="Clark K">K. Clark</name>
</author>
<author>
<name sortKey="Gevorgyan, R" uniqKey="Gevorgyan R">R. Gevorgyan</name>
</author>
<author>
<name sortKey="Gorelenkov, V" uniqKey="Gorelenkov V">V. Gorelenkov</name>
</author>
<author>
<name sortKey="Gribov, E" uniqKey="Gribov E">E. Gribov</name>
</author>
<author>
<name sortKey="Karsch Mizrachi, I" uniqKey="Karsch Mizrachi I">I. Karsch-Mizrachi</name>
</author>
<author>
<name sortKey="Kimelman, M" uniqKey="Kimelman M">M. Kimelman</name>
</author>
<author>
<name sortKey="Pruitt, K D" uniqKey="Pruitt K">K.D. Pruitt</name>
</author>
<author>
<name sortKey="Resenchuk, S" uniqKey="Resenchuk S">S. Resenchuk</name>
</author>
<author>
<name sortKey="Tatusova, T" uniqKey="Tatusova T">T. Tatusova</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Kodama, Y" uniqKey="Kodama Y">Y. Kodama</name>
</author>
<author>
<name sortKey="Shumway, M" uniqKey="Shumway M">M. Shumway</name>
</author>
<author>
<name sortKey="Leinonen, R" uniqKey="Leinonen R">R. Leinonen</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Zaslavsky, L" uniqKey="Zaslavsky L">L. Zaslavsky</name>
</author>
<author>
<name sortKey="Bao, Y" uniqKey="Bao Y">Y. Bao</name>
</author>
<author>
<name sortKey="Tatusova, T A" uniqKey="Tatusova T">T.A. Tatusova</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Edgar, R C" uniqKey="Edgar R">R.C. Edgar</name>
</author>
</analytic>
</biblStruct>
</listBibl>
</div1>
</back>
</TEI>
<pmc article-type="research-article">
<pmc-dir>properties open_access</pmc-dir>
<front>
<journal-meta>
<journal-id journal-id-type="nlm-ta">Nucleic Acids Res</journal-id>
<journal-id journal-id-type="iso-abbrev">Nucleic Acids Res</journal-id>
<journal-id journal-id-type="publisher-id">nar</journal-id>
<journal-title-group>
<journal-title>Nucleic Acids Research</journal-title>
</journal-title-group>
<issn pub-type="ppub">0305-1048</issn>
<issn pub-type="epub">1362-4962</issn>
<publisher>
<publisher-name>Oxford University Press</publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="pmid">27899678</article-id>
<article-id pub-id-type="pmc">5210549</article-id>
<article-id pub-id-type="doi">10.1093/nar/gkw1065</article-id>
<article-id pub-id-type="publisher-id">gkw1065</article-id>
<article-categories>
<subj-group subj-group-type="heading">
<subject>Database Issue</subject>
</subj-group>
</article-categories>
<title-group>
<article-title>Virus Variation Resource – improved response to emergent viral outbreaks</article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author">
<name>
<surname>Hatcher</surname>
<given-names>Eneida L.</given-names>
</name>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Zhdanov</surname>
<given-names>Sergey A.</given-names>
</name>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Bao</surname>
<given-names>Yiming</given-names>
</name>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Blinkova</surname>
<given-names>Olga</given-names>
</name>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Nawrocki</surname>
<given-names>Eric P.</given-names>
</name>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Ostapchuck</surname>
<given-names>Yuri</given-names>
</name>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Schäffer</surname>
<given-names>Alejandro A.</given-names>
</name>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Brister</surname>
<given-names>J. Rodney</given-names>
</name>
<pmc-comment>jamesbr@ncbi.nlm.nih.gov</pmc-comment>
<xref ref-type="corresp" rid="COR1"></xref>
</contrib>
</contrib-group>
<aff id="AFF1">National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA</aff>
<author-notes>
<corresp id="COR1">
<label>*</label>
To whom correspondence should be addressed. Tel: +1 301 594 6099; Fax: +1 301 402 9651; Email:
<email>jamesbr@ncbi.nlm.nih.gov</email>
</corresp>
</author-notes>
<pub-date pub-type="ppub">
<day>04</day>
<month>1</month>
<year>2017</year>
</pub-date>
<pub-date pub-type="epub" iso-8601-date="2016-11-28">
<day>28</day>
<month>11</month>
<year>2016</year>
</pub-date>
<pub-date pub-type="pmc-release">
<day>28</day>
<month>11</month>
<year>2016</year>
</pub-date>
<pmc-comment> PMC Release delay is 0 months and 0 days and was based on the . </pmc-comment>
<volume>45</volume>
<issue>Database issue</issue>
<issue-title>Database issue</issue-title>
<fpage>D482</fpage>
<lpage>D490</lpage>
<history>
<date date-type="accepted">
<day>28</day>
<month>10</month>
<year>2016</year>
</date>
<date date-type="rev-recd">
<day>20</day>
<month>10</month>
<year>2016</year>
</date>
<date date-type="received">
<day>15</day>
<month>9</month>
<year>2016</year>
</date>
</history>
<permissions>
<copyright-statement>Published by Oxford University Press on behalf of Nucleic Acids Research 2016.</copyright-statement>
<copyright-year>2017</copyright-year>
<license license-type="us-gov">
<license-p>This work is written by (a) US Government employee(s) and is in the public domain in the US.</license-p>
</license>
<license>
<license-p>This article is made available via the PMC Open Access Subset for unrestricted re-use and analyses in any form or by any means with acknowledgement of the original source. These permissions are granted for the duration of the COVID-19 pandemic or until permissions are revoked in writing. Upon expiration of these permissions, PMC is granted a perpetual license to make this article available via PMC and Europe PMC, consistent with existing copyright protections.</license-p>
</license>
</permissions>
<self-uri xlink:href="gkw1065.pdf"></self-uri>
<abstract>
<title>Abstract</title>
<p>The Virus Variation Resource is a value-added viral sequence data resource hosted by the National Center for Biotechnology Information. The resource is located at <ext-link ext-link-type="uri" xlink:href="http://www.ncbi.nlm.nih.gov/genome/viruses/variation/">http://www.ncbi.nlm.nih.gov/genome/viruses/variation/</ext-link> and includes modules for seven viral groups: influenza virus, <italic>Dengue virus, West Nile virus, Ebolavirus</italic>, MERS coronavirus, <italic>Rotavirus A</italic> and <italic>Zika virus</italic>. Each module is supported by pipelines that scan newly released GenBank records, annotate genes and proteins, parse sample descriptors and map them to a controlled vocabulary. These processes in turn support a purpose-built search interface where users can select sequences based on standardized gene, protein and metadata terms. Once sequences are selected, a suite of tools for downloading data, multi-sequence alignment and tree building supports a variety of user-directed activities. This manuscript describes a series of features and functionalities recently added to the Virus Variation Resource.</p>
</abstract>
<counts>
<page-count count="9"></page-count>
</counts>
</article-meta>
</front>
<body>
<sec sec-type="intro" id="SEC1">
<title>INTRODUCTION</title>
<p>Genome sequences have the potential to define evolutionary relationships, elucidate disease determinants and inform public health policy decisions. The public databases that make up the International Nucleotide Sequence Database Consortium (INSDC) are an invaluable resource for a variety of genome-related sequence analysis projects (<xref rid="B1" ref-type="bibr">1</xref>). This collaboration between the National Center for Biotechnology Information (NCBI), the European Bioinformatics Institute and the DNA Data Bank of Japan supports free and unrestricted access to stored sequence data that are maintained as part of the scientific record. As nucleotide sequencing efforts extend into the future, the archival INSDC databases will support comparisons between samples collected over generations and provide infrastructure to study the evolution and impact of viruses in real time. Despite this potential, there are fundamental issues with archival databases that can only be resolved through resources that provide enhanced data, such as the NCBI Virus Variation Resource (<ext-link ext-link-type="uri" xlink:href="http://www.ncbi.nlm.nih.gov/genome/viruses/variation/">http://www.ncbi.nlm.nih.gov/genome/viruses/variation/</ext-link>), which is described in this manuscript.</p>
<p>GenBank records (
<xref rid="B2" ref-type="bibr">2</xref>
) and other INSDC sequence records are archival by design, and changes to them can be made only by one of the original submitters. Hence, it is likely that the gene and protein annotations and information about the source of the sequence will remain unchanged after a sequence is deposited in an INSDC database. This is problematic because even if communities develop sequence annotation standards, the pace of biochemical and genetic research effectively guarantees that annotations become outdated as new genetic features are characterized and naming conventions change. For example, while it has been known for some time that flavivirus genomes encode a polyprotein that is cleaved into mature peptides, sometimes with two rounds of cleavage (
<xref rid="B3" ref-type="bibr">3</xref>–
<xref rid="B6" ref-type="bibr">6</xref>
), recently, several flavivirus proteins have been identified that are translated (at least partially) from alternative reading frames (
<xref rid="B7" ref-type="bibr">7</xref>
). These alternative reading frame proteins and mature peptides, especially the products of the second round of cleavage, are not annotated in the vast majority of current GenBank records for flavivirus genomes.</p>
<p>The limitations of an archival database can be illustrated by considering a common way in which it might be used – to obtain all of the nucleotide sequences that encode a particular gene of interest. Take, for example, the RNA-dependent RNA polymerase (RdRp) of the Ebolavirus. One would need to know that this gene is also sometimes called L-protein or L-polymerase and search the database with all three names to find all relevant protein sequences. In addition, not all genes or proteins are annotated in all database entries, so one would still likely miss some potential sequences. Alternatively, a nucleotide BLAST search could be performed using the RdRp coding region from the Zaire ebolavirus Reference Sequence (RefSeq accession number NC_002549.1). However, when matching sequences are obtained, there would still be no indication of potential problems with the sequences, such as frameshifts, which may affect the biological function of the resulting protein. Even when an annotation pipeline is available to validate retrieved sequences, several additional steps would be needed to associate metadata, such as country of isolation or host, to the sequences.</p>
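<p>The naming problem described above can be made concrete with a small sketch. It assumes a hypothetical synonym table and a hypothetical record layout (a dictionary with a free-text product field); only the three RdRp names come from the text, and this is not the resource's actual implementation.</p>

```python
# Hypothetical synonym table: the three names come from the text above;
# the record layout and accession labels are assumptions for this sketch.
SYNONYMS = {
    "rna-dependent rna polymerase": "RdRp",
    "rdrp": "RdRp",
    "l-protein": "RdRp",
    "l-polymerase": "RdRp",
}


def standard_name(raw_name):
    """Return the standardized protein name, or None if the name is unknown."""
    return SYNONYMS.get(raw_name.strip().lower())


def select_by_protein(records, wanted):
    """Keep records whose free-text product name maps to the wanted name."""
    return [r for r in records if standard_name(r["product"]) == wanted]


records = [
    {"accession": "REC1", "product": "L-protein"},
    {"accession": "REC2", "product": "RNA-dependent RNA polymerase"},
    {"accession": "REC3", "product": "nucleoprotein"},
    {"accession": "REC4", "product": " L-polymerase "},
]
hits = select_by_protein(records, "RdRp")
```

<p>A single standardized term retrieves all three spellings, which is the behaviour a value-added resource aims to provide; records with no product annotation at all would still be missed by this kind of lookup.</p>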
<p>Issues regarding the long term usability of sequence data were addressed in the NCBI Influenza Virus Resource (
<xref rid="B8" ref-type="bibr">8</xref>
). This resource leveraged machine processing of GenBank records, human curation and a unique search and retrieval interface to build a value-added user experience where researchers could search for sequences using defined, standardized terms (Table
<xref rid="tbl1" ref-type="table">1</xref>
). An annotation pipeline was added later to standardize gene and protein annotation and nomenclature across all sequences. This feature supports not only standardized annotation of sequences when submitted, but also provides a mechanism to update previously submitted sequences as new genes and proteins are described. In many ways, the NCBI Influenza Virus Resource paved a path for a variety of other resources that share the common goal of making viral sequence data more accessible (
<xref rid="B9" ref-type="bibr">9</xref>–
<xref rid="B12" ref-type="bibr">12</xref>
). These include the NCBI Virus Variation Resource where the Influenza Virus Resource data model was extended to include dengue and West Nile viruses (
<xref rid="B13" ref-type="bibr">13</xref>
,
<xref rid="B14" ref-type="bibr">14</xref>
). While the initial release of this resource provided a range of functionalities, its reliance on in-house annotation pipelines and internally developed tools imposed long development cycles, making it difficult to provide new modules quickly in response to emerging outbreaks and the associated nucleotide sequencing efforts.</p>
<table-wrap id="tbl1" orientation="portrait" position="float">
<label>Table 1.</label>
<caption>
<title>Summary of data enhancements in the Virus Variation Resource</title>
</caption>
<table frame="hsides" rules="groups">
<thead>
<tr>
<th align="left" rowspan="1" colspan="1">INSDC/GenBank</th>
<th align="left" rowspan="1" colspan="1">Virus Variation Resource</th>
</tr>
</thead>
<tbody>
<tr>
<td align="left" rowspan="1" colspan="1">Inconsistent and/or out of date gene/protein names present in INSDC sequence records</td>
<td align="left" rowspan="1" colspan="1">Gene and protein sequences are validated and given consistent, up to date names</td>
</tr>
<tr>
<td align="left" rowspan="1" colspan="1"></td>
<td align="left" rowspan="1" colspan="1"></td>
</tr>
<tr>
<td align="left" rowspan="1" colspan="1">Annotation is often incomplete in INSDC sequence records, especially for mature peptides</td>
<td align="left" rowspan="1" colspan="1">All proteins and mature peptides annotated and possible sequence errors reported</td>
</tr>
<tr>
<td align="left" rowspan="1" colspan="1"></td>
<td align="left" rowspan="1" colspan="1"></td>
</tr>
<tr>
<td align="left" rowspan="1" colspan="1">Non-standardized source descriptor (metadata) vocabulary and formatting within INSDC sequence records</td>
<td align="left" rowspan="1" colspan="1">Source descriptors are parsed from several fields within INSDC sequence records and mapped to standardized terms with correct spelling</td>
</tr>
<tr>
<td align="left" rowspan="1" colspan="1"></td>
<td align="left" rowspan="1" colspan="1"></td>
</tr>
<tr>
<td align="left" rowspan="1" colspan="1">Source metadata potentially missing from INSDC sequence records</td>
<td align="left" rowspan="1" colspan="1">Source metadata can be added manually from literature</td>
</tr>
<tr>
<td align="left" rowspan="1" colspan="1"></td>
<td align="left" rowspan="1" colspan="1"></td>
</tr>
<tr>
<td align="left" rowspan="1" colspan="1">Drug resistance and/or high virulence sequence polymorphisms may not be annotated in INSDC Influenza virus sequence records</td>
<td align="left" rowspan="1" colspan="1">Documented drug resistance and high virulence sequence variations are detected and can be retrieved</td>
</tr>
<tr>
<td align="left" rowspan="1" colspan="1"></td>
<td align="left" rowspan="1" colspan="1"></td>
</tr>
<tr>
<td align="left" rowspan="1" colspan="1">Sequence searches based on metadata terms and gene/protein names can be difficult</td>
<td align="left" rowspan="1" colspan="1">Complex searches can be performed through a convenient user interface</td>
</tr>
<tr>
<td align="left" rowspan="1" colspan="1"></td>
<td align="left" rowspan="1" colspan="1"></td>
</tr>
<tr>
<td align="left" rowspan="1" colspan="1">Once sequences are retrieved, users must perform some data analysis locally or on a third party site</td>
<td align="left" rowspan="1" colspan="1">Selected sequences can be aligned or visualized as a tree within the resource</td>
</tr>
<tr>
<td align="left" rowspan="1" colspan="1"></td>
<td align="left" rowspan="1" colspan="1"></td>
</tr>
<tr>
<td align="left" rowspan="1" colspan="1">Download formats for sequences and metadata limited for some uses</td>
<td align="left" rowspan="1" colspan="1">Sequences can be downloaded in a variety of formats with customized metadata fields</td>
</tr>
</tbody>
</table>
</table-wrap>
<p>Here, we document a series of updates and improvements designed to make viral sequences more easily accessible and usable through the Virus Variation Resource, a value-added database, as well as tools that make it simple to analyze genomic relationships. The resource now includes expanded data processing pipelines and analysis tools, and supports selection and retrieval of nucleotide and protein sequences from four new viral groups: Ebolaviruses, MERS coronavirus, rotavirus, and Zika virus (Table
<xref rid="tbl2" ref-type="table">2</xref>
). The latest package of updates includes a variety of features designed to improve data usability and ease data retrieval. New processes have been added to parse source descriptor terms from GenBank records and map these to controlled vocabulary, and the resource now supports retrieval of sequences based on standardized isolation source and host terms in addition to standardized gene and protein names. A new set of filters has also been developed to identify laboratory isolates, vaccine strains or environmental samples so that they can be included or excluded from searches. A variety of updates have been made to the search interface and results table to better leverage these features, and a new set of multi-sequence alignment and tree building tools has been implemented to allow robust analysis of retrieved sequences.</p>
<table-wrap id="tbl2" orientation="portrait" position="float">
<label>Table 2.</label>
<caption>
<title>Publicly available sequence content of Virus Variation Resource (as of September 1, 2016)</title>
</caption>
<table frame="hsides" rules="groups">
<thead>
<tr>
<th align="left" rowspan="1" colspan="1">Virus module</th>
<th align="left" rowspan="1" colspan="1">Species/Types included</th>
<th align="right" rowspan="1" colspan="1">Nucleotide seq.</th>
<th align="right" rowspan="1" colspan="1">Complete genomes</th>
<th align="right" rowspan="1" colspan="1">Protein seq.</th>
</tr>
</thead>
<tbody>
<tr>
<td align="left" rowspan="1" colspan="1">Dengue virus</td>
<td align="left" rowspan="1" colspan="1">
<italic>Dengue virus</italic>
types 1, 2, 3 and 4</td>
<td align="right" rowspan="1" colspan="1">18 495</td>
<td align="right" rowspan="1" colspan="1">4140</td>
<td align="right" rowspan="1" colspan="1">17 635</td>
</tr>
<tr>
<td align="left" rowspan="1" colspan="1">Ebolavirus</td>
<td align="left" rowspan="1" colspan="1">
<italic>Zaire ebolavirus, Bundibugyo ebolavirus, Sudan ebolavirus, Reston ebolavirus, Tai Forest ebolavirus</italic>
</td>
<td align="right" rowspan="1" colspan="1">1849</td>
<td align="right" rowspan="1" colspan="1">1318</td>
<td align="right" rowspan="1" colspan="1">14 407</td>
</tr>
<tr>
<td align="left" rowspan="1" colspan="1">Influenza virus</td>
<td align="left" rowspan="1" colspan="1">
<italic>Influenza A virus, Influenza B virus, Influenza C virus</italic>
</td>
<td align="right" rowspan="1" colspan="1">471 603</td>
<td align="right" rowspan="1" colspan="1">33 717</td>
<td align="right" rowspan="1" colspan="1">624 541</td>
</tr>
<tr>
<td align="left" rowspan="1" colspan="1">MERS coronavirus</td>
<td align="left" rowspan="1" colspan="1">
<italic>Middle East respiratory syndrome-related coronavirus</italic>
</td>
<td align="right" rowspan="1" colspan="1">730</td>
<td align="right" rowspan="1" colspan="1">320</td>
<td align="right" rowspan="1" colspan="1">3269</td>
</tr>
<tr>
<td align="left" rowspan="1" colspan="1">Rotavirus</td>
<td align="left" rowspan="1" colspan="1">
<italic>Rotavirus A</italic>
</td>
<td align="right" rowspan="1" colspan="1">49 186</td>
<td align="right" rowspan="1" colspan="1">1169</td>
<td align="right" rowspan="1" colspan="1">49 607</td>
</tr>
<tr>
<td align="left" rowspan="1" colspan="1">West Nile virus</td>
<td align="left" rowspan="1" colspan="1">
<italic>West Nile virus</italic>
genotypes 1 and 2;
<italic>Kunjin virus</italic>
</td>
<td align="right" rowspan="1" colspan="1">4184</td>
<td align="right" rowspan="1" colspan="1">1675</td>
<td align="right" rowspan="1" colspan="1">3678</td>
</tr>
<tr>
<td align="left" rowspan="1" colspan="1">Zika virus</td>
<td align="left" rowspan="1" colspan="1">
<italic>Zika virus</italic>
</td>
<td align="right" rowspan="1" colspan="1">386</td>
<td align="right" rowspan="1" colspan="1">111</td>
<td align="right" rowspan="1" colspan="1">345</td>
</tr>
</tbody>
</table>
</table-wrap>
<sec id="SEC1-1">
<title>The Virus Variation model</title>
<p>The NCBI Virus Variation Resource provides users with a convenient way to search, download, and analyze viral nucleotide and protein sequences. The resource includes data processing pipelines that retrieve sequences from GenBank, provide standardized gene and protein annotation, and map sequence source descriptors (i.e. metadata) to uniform vocabularies. This data processing enables users to select sequences based on standardized gene, protein and metadata terms using a purpose-built interface. Once selected, sequences can be downloaded with the standardized metadata in a variety of formats or analyzed using web-based alignment and tree building tools. There are currently seven discrete Virus Variation modules (
<italic>Dengue virus, Ebolavirus</italic>
, influenza virus, MERS coronavirus,
<italic>Rotavirus A, West Nile virus</italic>
, and
<italic>Zika virus</italic>
), and these include a total of nearly 550 000 nucleotide sequences (see Table
<xref rid="tbl2" ref-type="table">2</xref>
). Examples of studies that used the resource for dengue virus, Ebolavirus, and rotavirus are Klema
<italic>et al</italic>
. (
<xref rid="B15" ref-type="bibr">15</xref>
), Bell
<italic>et al</italic>
. (
<xref rid="B16" ref-type="bibr">16</xref>
), Agbemabiese
<italic>et al</italic>
. (
<xref rid="B17" ref-type="bibr">17</xref>
), respectively.</p>
</sec>
<sec id="SEC1-2">
<title>Rapid deployment model</title>
<p>Current development efforts have focused on expanding the Virus Variation model to include more viruses, enhancing the functionality of the resource and providing rapid support to emergent sequencing efforts. This last point has been particularly relevant over the past several years as emerging viral outbreaks of Ebola and Zika viruses and others have quickly led to large sequencing efforts. There was a clear need to support these sequencing efforts with bioinformatics resources, but timelines prevented traditional development paths where new virus modules and features were added over the course of months. The first rapid deployment of a Virus Variation module was during the western African Ebola virus outbreak that began in December of 2013. The outbreak was declared a Public Health Emergency of International Concern by the World Health Organization on August 8, 2014 (
<ext-link ext-link-type="uri" xlink:href="http://www.who.int/mediacentre/news/statements/2014/ebola-20140808/en/">http://www.who.int/mediacentre/news/statements/2014/ebola-20140808/en/</ext-link>
). By September, a Virus Variation Resource specific to Ebolaviruses was available to help access the sequences that had begun to pour into the INSDC databases. Similarly, a Virus Variation Resource module was developed in September 2014 in response to the outbreak of Middle East respiratory syndrome-related coronavirus (MERS-CoV). Most recently, this rapid response model was repeated for the Zika virus module, which was put in place in March 2016. This need-based deployment strategy is likely a model for future efforts, and much of our current development is geared toward harmonizing processes and interfaces among individual data and software modules so as to provide more support for more virus species within the resource and to respond more efficiently to emergent large-scale sequencing efforts.</p>
</sec>
<sec id="SEC1-3">
<title>Sequence annotation</title>
<p>Accurate gene and protein annotation is necessary both to identify sequences of interest and to analyze them. The Virus Variation Resource employs annotation pipelines that support consistent gene and protein naming. Initial processing for each annotation pipeline is the same: newly released GenBank records are retrieved hourly based on their listed taxonomy. Retrieved sequences are compared to nucleotide references for that virus group using BLASTN, and the best match is determined (
<xref rid="B8" ref-type="bibr">8</xref>
,
<xref rid="B13" ref-type="bibr">13</xref>
,
<xref rid="B18" ref-type="bibr">18</xref>
). This step confirms species taxonomy, identifies segment assignment if applicable and provides information about the lineage, genotype, type or subtype. The references used are listed in Table
<xref rid="tbl3" ref-type="table">3</xref>
, and sequences that fail to match a reference within established metrics are pushed to a curation interface where they can be reviewed manually.</p>
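<p>The reference-matching step can be sketched as follows. The hit fields, the use of bit score to rank hits and the two cut-off values are assumptions made for illustration; the text only states that the best BLASTN match is chosen and that sequences outside "established metrics" go to manual curation.</p>

```python
from typing import NamedTuple


class Hit(NamedTuple):
    reference: str   # reference accession, e.g. one listed in Table 3
    bitscore: float  # alignment bit score
    identity: float  # percent identity of the alignment
    coverage: float  # percent of the query covered by the alignment


# Assumed cut-offs; the actual metrics are not given in the text.
MIN_IDENTITY = 80.0
MIN_COVERAGE = 50.0


def best_hit(hits):
    """Return the hit with the highest bit score, or None if there are no hits."""
    if not hits:
        return None
    return max(hits, key=lambda h: h.bitscore)


def assign_reference(hits):
    """Return (reference, 'auto') or (None, 'curation') for one query sequence."""
    top = best_hit(hits)
    if top is None or MIN_IDENTITY > top.identity or MIN_COVERAGE > top.coverage:
        return None, "curation"
    return top.reference, "auto"


wnv_hits = [
    Hit("NC_009942", 18500.0, 97.5, 99.0),
    Hit("NC_001563", 9100.0, 78.0, 98.0),
]
assignment = assign_reference(wnv_hits)
```

<p>In this sketch a query with no hits, or with a best hit below either cut-off, is routed to curation rather than being assigned automatically.</p>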
<table-wrap id="tbl3" orientation="portrait" position="float">
<label>Table 3.</label>
<caption>
<title>Reference sequences employed by Virus Variation</title>
</caption>
<table frame="hsides" rules="groups">
<thead>
<tr>
<th align="left" rowspan="1" colspan="1">Virus module</th>
<th align="left" rowspan="1" colspan="1">Reference sequences</th>
</tr>
</thead>
<tbody>
<tr>
<td align="left" rowspan="1" colspan="1">Dengue virus</td>
<td align="left" rowspan="1" colspan="1">NC_001477, NC_001474, NC_001475, NC_002640</td>
</tr>
<tr>
<td align="left" rowspan="1" colspan="1">Ebolavirus</td>
<td align="left" rowspan="1" colspan="1">NC_014372, NC_014373, NC_004161, NC_006432, NC_002549</td>
</tr>
<tr>
<td align="left" rowspan="1" colspan="1">Influenza virus</td>
<td align="left" rowspan="1" colspan="1">References are created by Virus Variation staff as needed, and a comprehensive list is maintained here:
<ext-link ext-link-type="ftp" xlink:href="ftp://ftp.ncbi.nih.gov/genomes/INFLUENZA/ANNOTATION/">ftp://ftp.ncbi.nih.gov/genomes/INFLUENZA/ANNOTATION/</ext-link>
</td>
</tr>
<tr>
<td align="left" rowspan="1" colspan="1">MERS coronavirus</td>
<td align="left" rowspan="1" colspan="1">NC_019843</td>
</tr>
<tr>
<td align="left" rowspan="1" colspan="1">Rotavirus</td>
<td align="left" rowspan="1" colspan="1">References are selected and maintained by the Rotavirus Classification Working Group (
<xref rid="B27" ref-type="bibr">27</xref>
,
<xref rid="B28" ref-type="bibr">28</xref>
) and updates can be found here:
<ext-link ext-link-type="uri" xlink:href="https://rega.kuleuven.be/cev/viralmetagenomics/virus-classification/newgenotypes">https://rega.kuleuven.be/cev/viralmetagenomics/virus-classification/newgenotypes</ext-link>
</td>
</tr>
<tr>
<td align="left" rowspan="1" colspan="1">West Nile virus</td>
<td align="left" rowspan="1" colspan="1">NC_009942, NC_001563</td>
</tr>
<tr>
<td align="left" rowspan="1" colspan="1">Zika virus</td>
<td align="left" rowspan="1" colspan="1">NC_012532</td>
</tr>
</tbody>
</table>
</table-wrap>
<p>Once a sequence has been matched to a reference, one of three pipelines is employed to determine the span of gene and protein features and to assign standardized names to these features. The first pipeline uses a reference protein-guided approach based on the ProSplign tool, as described previously (
<xref rid="B8" ref-type="bibr">8</xref>
,
<xref rid="B13" ref-type="bibr">13</xref>
,
<xref rid="B18" ref-type="bibr">18</xref>
). Here, protein reference sequences are aligned with potential translations of the query sequence. The highest-scoring translation alignment to any protein reference is then chosen and parsed to check specific criteria: the presence of a start codon, exact matches to mature peptide cleavage sites or premature stop sites. The tool can account for post-transcriptional and translational exceptions by adjusting parameters and allowing multiple transitions between open reading frames to be assembled into a single alignment. One advantage of this approach is that new viruses can be incorporated by adding new reference protein sequences and adjusting the criteria used for validating a particular translation. Such was the case for Zika virus annotation, where the existing dengue virus pipeline was updated with new Zika virus reference sequences (see Table
<xref rid="tbl3" ref-type="table">3</xref>
).</p>
<p>A second approach to gene and protein annotation was implemented in the Ebola virus and MERS coronavirus rapid deployment modules. Here, there was a need to quickly develop a pipeline that could validate the annotation on GenBank records and assign consistent gene and protein names so that these could be accurately used as search criteria. To accomplish this, a BLAST-based pipeline was developed that compares genes and proteins as annotated on GenBank records to reference proteins derived from the best reference nucleotide match. If a protein matches the reference sequence with >70% identity as measured by BLASTP, the protein is recorded as present. Genes are validated in the same manner using BLASTN and reference nucleotide sequences. Sequences with genes and proteins that cannot be validated are pushed to the curation interface where they can be manually examined. Ultimately these approaches support both search and analysis functionality but are not capable of generating standardized annotation across all sequences belonging to a particular virus.</p>
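<p>A minimal sketch of the identity check described above is given here. It assumes that BLASTP results are available in the standard 12-column tabular layout (BLAST+ output format 6) and that the best hit per query is chosen by bit score; the identifiers and numbers in the example are invented, and only the 70% cut-off comes from the text.</p>

```python
# Default column order of BLAST+ tabular output (outfmt 6).
OUTFMT6_FIELDS = [
    "qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
    "qstart", "qend", "sstart", "send", "evalue", "bitscore",
]
IDENTITY_CUTOFF = 70.0  # percent identity threshold stated in the text


def parse_outfmt6(text):
    """Parse tab-separated BLAST rows into dictionaries with numeric scores."""
    rows = []
    for line in text.strip().splitlines():
        row = dict(zip(OUTFMT6_FIELDS, line.split("\t")))
        row["pident"] = float(row["pident"])
        row["bitscore"] = float(row["bitscore"])
        rows.append(row)
    return rows


def validate_proteins(rows, annotated_ids):
    """Split annotated protein IDs into validated ones and ones needing curation."""
    best = {}
    for row in rows:
        current = best.get(row["qseqid"])
        if current is None or row["bitscore"] > current["bitscore"]:
            best[row["qseqid"]] = row
    validated, curation = {}, []
    for protein_id in annotated_ids:
        row = best.get(protein_id)
        if row is not None and row["pident"] > IDENTITY_CUTOFF:
            validated[protein_id] = row["sseqid"]  # standardized reference name
        else:
            curation.append(protein_id)
    return validated, curation


# Invented example rows: prot1 passes, prot2 is below the cut-off, prot3 has no hit.
example = "\n".join([
    "\t".join(["prot1", "ref_L", "98.2", "2212", "40", "0", "1", "2212", "1", "2212", "0.0", "4500"]),
    "\t".join(["prot2", "ref_NP", "65.0", "700", "245", "2", "1", "700", "1", "739", "1e-150", "900"]),
])
validated, curation = validate_proteins(parse_outfmt6(example), ["prot1", "prot2", "prot3"])
```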
<p>Our experience has emphasized the importance of accurate annotation pipelines that can be applied to new viruses rapidly in response to emergent needs. Though our current pipelines are effective, they are also very specific to particular viruses, and application to new viruses requires much work developing reference sequences, defining processing parameters and manually reviewing annotation results. With that in mind, we are now implementing a new, third approach to annotation that can be adapted rapidly when needed and is scalable to multiple virus groups. This new approach is built around two important considerations. First, it uses annotations contained within the so-called Reference Sequence records (
<xref rid="B19" ref-type="bibr">19</xref>
) that are created by our group to represent important taxonomic and sequence space groups. The nucleotide and protein sequences within these records can be invaluable for the unambiguous assignment of sequences to defined groups and can also serve as repositories of reference sequence feature annotation maintained by in-house curation efforts often in collaboration with other scientists (
<xref rid="B20" ref-type="bibr">20</xref>–
<xref rid="B24" ref-type="bibr">24</xref>
). Second, this approach includes a comprehensive list of error flags that provide extensive information about sequences and can provide warnings about potential problems. This error coding not only allows staff to quickly sort through thousands of annotations during the development of new pipelines, but also provides potential criteria for the selection or filtering of sequences to resource users.</p>
<p>This new approach was used to annotate polyprotein and mature peptide genomic intervals in West Nile virus (WNV), and this annotation will be available soon through the Virus Variation Resource. These annotations were calculated as follows: First, GenBank West Nile sequences were classified as one of the two common lineages of WNV (lineage 1 or lineage 2) using a combination of BLASTN (
<xref rid="B25" ref-type="bibr">25</xref>
) against the two RefSeq sequences and expert knowledge. The principal characteristic that distinguishes lineage 1 from lineage 2 is that the additional protein WARF4 occurs only in lineage 1 WNV genomes and is believed to occur in most of them (
<xref rid="B7" ref-type="bibr">7</xref>
). There is some evidence that a small proportion of WNV genomes do not fit neatly into lineage 1 or lineage 2 (
<xref rid="B7" ref-type="bibr">7</xref>
), but these were classified as lineage 2 in our annotations. Second, the annotation pipeline built a covariance model (CM) for each of 16 mature peptides present in the NC_009942 RefSeq annotation and for the 15 mature peptides in the NC_001563 RefSeq. The CMs are built using the cmbuild program of the Infernal homology search software package (
<xref rid="B26" ref-type="bibr">26</xref>
). Infernal is typically used for modeling the sequence and secondary structure of RNAs, and because the sequences we are modeling lack structure (i.e. basepairs between positions), the CMs we created are effectively identical to sequence-only profile hidden Markov models. In the current version of our pipeline, each model was derived from the single RefSeq nucleotide sequence encoding each mature peptide. Third, for each genome, the CMs built from the RefSeq to which that genome was assigned were used to predict each mature peptide coding sequence using Infernal's cmscan program.</p>
<p>The annotation software then runs a variety of validation checks and produces error codes that assist in curation of sequences. For example, the pipeline checks for the existence of any in-frame stop codons within the predicted regions. If one or more is found, the prediction boundaries are modified to terminate at the 5′-most stop found. Coding sequence (CDS) coordinates are determined implicitly based on the predicted mature peptide coordinates. Lineage 1 (NC_009942) has three CDSs and lineage 2 (NC_001563) has two CDSs. For each CDS, the predictions for the corresponding mature peptides that make up each CDS are tested for consistency by ensuring that mature peptide coding sequences that are adjacent (separated by 0 nucleotides) in the RefSeq are also adjacent in the predictions. The start position of the first mature peptide and end position of the final mature peptide that comprise each CDS are then used as the start and stop position for that CDS. CDS annotations are not made if the mature peptide consistency check fails. In addition to checking for early stop codons and the adjacency of mature peptide coding sequences, the annotation pipeline identifies other unusual or unexpected features in each sequence and reports those as ‘error codes’. There are 17 possible error codes, which provide an easy way for users to gauge the quality of each sequence and its annotations, and should facilitate the selection of subsets of the sequence data that meet specific user-defined quality standards. A more detailed description of the new annotation pipeline and error flags will be provided in a separate manuscript, as well as in the help documents available at the Virus Variation Resource.</p>
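<p>Two of the checks described above, truncation at the 5′-most in-frame stop codon and the adjacency test used to derive CDS coordinates, can be sketched as below. Coordinates are 1-based and inclusive; the function names and the toy sequence are illustrative assumptions and do not reproduce the pipeline's code or its error codes.</p>

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def truncate_at_first_stop(genome, start, end):
    """Scan the predicted region codon by codon, in frame with its start.

    Returns (start, end); if an in-frame stop codon is found, the end is
    moved to the last base of the 5'-most stop codon.
    """
    for pos in range(start - 1, end - 2, 3):  # 0-based codon start positions
        if genome[pos:pos + 3].upper() in STOP_CODONS:
            return start, pos + 3
    return start, end


def cds_from_peptides(peptides):
    """Derive CDS coordinates from ordered mature peptide predictions.

    `peptides` lists (start, end) pairs that are adjacent in the reference.
    Returns None when any neighbouring pair is not separated by 0 nucleotides,
    mirroring the rule that no CDS is annotated if the consistency check fails.
    """
    for (_, prev_end), (next_start, _) in zip(peptides, peptides[1:]):
        if next_start != prev_end + 1:
            return None
    return peptides[0][0], peptides[-1][1]


# Toy sequence: ATG AAA CCC TAA GGG TTT (a stop codon at positions 10-12).
toy = "ATGAAACCCTAAGGGTTT"
truncated = truncate_at_first_stop(toy, 1, 18)
cds = cds_from_peptides([(1, 9), (10, 18)])
```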
</sec>
<sec id="SEC1-4">
<title>Source metadata processing</title>
<p>Another important aspect of sequence analysis is to place a given sequence within biological, temporal and geospatial contexts. Such associations can provide profound health policy and scientific insights, but unfortunately, descriptors that provide information about the source of nucleotide sequences are notoriously inconsistent. To resolve this issue, the Virus Variation database loading pipeline parses GenBank records, identifies important metadata terms, such as sample isolation host, date, country and source, and maps these to a standardized vocabulary using a hierarchical approach. For example, isolation host terms are sought first in the host field; if none is found there, the isolate or strain fields are searched, followed by isolation source, note and finally organism name.</p>
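<p>The hierarchical lookup described above might look like the following sketch. The field names, the order of isolate before strain and the small host vocabulary are assumptions for illustration only; simple substring matching stands in for whatever parsing the pipeline actually uses.</p>

```python
# Search order paraphrased from the text; the field names are assumptions.
FIELD_PRIORITY = ["host", "isolate", "strain", "isolation_source", "note", "organism"]

# Hypothetical slice of a controlled host vocabulary.
HOST_VOCABULARY = {
    "homo sapiens": "Human",
    "human": "Human",
    "camelus dromedarius": "Camel",
    "camel": "Camel",
}


def find_host(record):
    """Return (standard_host, field_used), or (None, None) if nothing matches."""
    # Try longer vocabulary keys first so that more specific names win.
    keys = sorted(HOST_VOCABULARY, key=len, reverse=True)
    for field in FIELD_PRIORITY:
        text = record.get(field, "").lower()
        for key in keys:
            if key in text:
                return HOST_VOCABULARY[key], field
    return None, None


from_host_field = find_host({"host": "Homo sapiens", "strain": "example/camel/2014"})
from_strain_field = find_host({"host": "", "strain": "example/camel/2014"})
```

<p>Returning the field that supplied the term makes it easy to audit how often the fallback fields are needed.</p>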
<p>This vocabulary mapping strategy follows the INSDC practice of separating isolation host from source. In this convention host refers to an organism—and hence has an organism's name that can be mapped to the NCBI taxonomy tree—and isolation source refers to a physical, environmental or local geographic location (
<xref rid="B1" ref-type="bibr">1</xref>
). For human pathogens isolation source often refers to a host tissue or bodily fluid, and the Virus Variation vocabulary mapping strategy attempts to combine similar clinical terms into biologically relevant groups. For example, the parsed terms ‘serum,’ ‘plasma’ and ‘lymphocytes’ are all mapped to the standardized vocabulary term ‘blood’. To support more efficient data retrieval, host terms are mapped in a hierarchy, and once a species term such as ‘
<italic>Accipiter cooperii</italic>
’ is identified, it is mapped to both the group name ‘Bird’ and the common name ‘Accipiter.’</p>
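<p>The grouping of clinical terms and the hierarchical host mapping can be illustrated with a short sketch. The dictionary contents are limited to the examples given in the text, and the function names are hypothetical.</p>

```python
# Terms taken from the examples in the text; a real vocabulary is far larger.
SOURCE_GROUPS = {"serum": "blood", "plasma": "blood", "lymphocytes": "blood"}

# species name (lower case) -> (group name, common name), as in the text's example
HOST_HIERARCHY = {"accipiter cooperii": ("Bird", "Accipiter")}


def map_source(term):
    """Map a parsed isolation source term to its standardized group, if known."""
    return SOURCE_GROUPS.get(term.strip().lower())


def host_search_terms(species):
    """Return every term under which a host species should be retrievable."""
    entry = HOST_HIERARCHY.get(species.strip().lower())
    if entry is None:
        return [species]
    group, common = entry
    return [group, common, species]


def matches_host(species, query):
    """True if a menu selection (group, common or species name) covers the host."""
    return query in host_search_terms(species)


terms = host_search_terms("Accipiter cooperii")
```

<p>Storing each host under several levels lets a broad query such as ‘Bird’ and a narrow one such as the species name retrieve the same record.</p>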
<p>Other metadata terms, such as those for disease associations and clinical/laboratory manipulations, are more difficult to parse. Laboratory isolates, vaccine strains and environmental samples are therefore identified by searching all fields for key terms, such as ‘tissue culture’ or ‘sewage’. Disease terms for dengue virus are found using a similar strategy. In all cases these strategies require extensive examination of sequence records and documentation of specific terms that can be accurately mapped to controlled vocabulary gleaned from established ontologies such as the Environmental Ontology (
<ext-link ext-link-type="uri" xlink:href="https://bioportal.bioontology.org/ontologies/ENVO">https://bioportal.bioontology.org/ontologies/ENVO</ext-link>
) and the Infectious Disease Ontology (
<ext-link ext-link-type="uri" xlink:href="https://bioportal.bioontology.org/ontologies/IDO">https://bioportal.bioontology.org/ontologies/IDO</ext-link>
). This process is supported by a curation interface that lists records where parsing fails to identify expected terms. These records are then curated manually, which leads to the identification of new terms, common misspellings and regional spelling differences, and to the manual incorporation of metadata from relevant literature into the Virus Variation database. In total, these vocabulary remapping strategies can have a profound impact on data usability as large numbers of parsed terms can be mapped to controlled vocabularies (Table
<xref rid="tbl4" ref-type="table">4</xref>
).</p>
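<p>The key-term search used to flag laboratory, vaccine and environmental samples could be sketched as follows. Only ‘tissue culture’ and ‘sewage’ are taken from the text; the ‘vaccine’ keyword, the flag names and the record layout are assumptions.</p>

```python
# 'tissue culture' and 'sewage' come from the text; 'vaccine' is an assumed keyword.
SAMPLE_KEYWORDS = {
    "laboratory": ["tissue culture"],
    "environmental": ["sewage"],
    "vaccine": ["vaccine"],
}


def sample_flags(record):
    """Search every field of a record for key terms and return sorted flags."""
    text = " ".join(str(value) for value in record.values()).lower()
    return sorted(
        flag
        for flag, keywords in SAMPLE_KEYWORDS.items()
        if any(keyword in text for keyword in keywords)
    )


flags = sample_flags({"note": "passaged in Tissue Culture", "isolation_source": "serum"})
```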
<table-wrap id="tbl4" orientation="portrait" position="float">
<label>Table 4.</label>
<caption>
<title>Number of GenBank sequences where non-standard metadata terms were mapped to standardized vocabulary</title>
</caption>
<table frame="hsides" rules="groups">
<thead>
<tr>
<th align="left" rowspan="1" colspan="1">Virus module</th>
<th align="right" rowspan="1" colspan="1">Total sequences processed</th>
<th align="right" rowspan="1" colspan="1">Isolation country</th>
<th align="right" rowspan="1" colspan="1">Isolation host</th>
<th align="right" rowspan="1" colspan="1">Isolation source</th>
</tr>
</thead>
<tbody>
<tr>
<td align="left" rowspan="1" colspan="1">Dengue virus</td>
<td align="right" rowspan="1" colspan="1">18 909</td>
<td align="right" rowspan="1" colspan="1">1321</td>
<td align="right" rowspan="1" colspan="1">6361</td>
<td align="right" rowspan="1" colspan="1">7402</td>
</tr>
<tr>
<td align="left" rowspan="1" colspan="1">Ebolavirus</td>
<td align="right" rowspan="1" colspan="1">1849</td>
<td align="right" rowspan="1" colspan="1">598</td>
<td align="right" rowspan="1" colspan="1">56</td>
<td align="right" rowspan="1" colspan="1">588</td>
</tr>
<tr>
<td align="left" rowspan="1" colspan="1">Influenza virus</td>
<td align="right" rowspan="1" colspan="1">472 050</td>
<td align="right" rowspan="1" colspan="1">267 955</td>
<td align="right" rowspan="1" colspan="1">380 384</td>
<td align="right" rowspan="1" colspan="1">n.a.</td>
</tr>
<tr>
<td align="left" rowspan="1" colspan="1">MERS coronavirus</td>
<td align="right" rowspan="1" colspan="1">730</td>
<td align="right" rowspan="1" colspan="1">5</td>
<td align="right" rowspan="1" colspan="1">95</td>
<td align="right" rowspan="1" colspan="1">327</td>
</tr>
<tr>
<td align="left" rowspan="1" colspan="1">Rotavirus</td>
<td align="right" rowspan="1" colspan="1">49 186</td>
<td align="right" rowspan="1" colspan="1">15 823</td>
<td align="right" rowspan="1" colspan="1">17 166</td>
<td align="right" rowspan="1" colspan="1">19 009</td>
</tr>
<tr>
<td align="left" rowspan="1" colspan="1">West Nile virus</td>
<td align="right" rowspan="1" colspan="1">4184</td>
<td align="right" rowspan="1" colspan="1">2143</td>
<td align="right" rowspan="1" colspan="1">1253</td>
<td align="right" rowspan="1" colspan="1">1329</td>
</tr>
<tr>
<td align="left" rowspan="1" colspan="1">Zika virus</td>
<td align="right" rowspan="1" colspan="1">386</td>
<td align="right" rowspan="1" colspan="1">86</td>
<td align="right" rowspan="1" colspan="1">127</td>
<td align="right" rowspan="1" colspan="1">148</td>
</tr>
</tbody>
</table>
</table-wrap>
</sec>
<sec id="SEC1-5">
<title>Search interface</title>
<p>The Virus Variation annotation and metadata mapping pipelines create standardized terms that can then be leveraged by the resource search interface. A link to this interface can be found on the home page of each virus module, which also includes links to help documents, other NCBI resources, and relevant external resources (for an example, please see
<ext-link ext-link-type="uri" xlink:href="http://www.ncbi.nlm.nih.gov/genome/viruses/variation/dengue/">http://www.ncbi.nlm.nih.gov/genome/viruses/variation/dengue/</ext-link>
). To access the search interface from the module home page, select the link to ‘Search nucleotide and protein sequences.’ Here, users can select between protein and nucleotide searches (see Figure
<xref ref-type="fig" rid="F1">1</xref>
). When searching protein sequences, selecting the ‘Full-length sequences only’ filter limits retrieved sequences to those with a complete coding region, as determined by comparison to the relevant reference. The same filter limits nucleotide searches to full-length genomes, where the completeness of a given genome is operationally determined by comparing the genes/proteins present on a given sequence to those on the relevant, full-length reference genome. Currently, non-coding, terminal regions are not included in this determination.</p>
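<p>The operational definition of a full-length genome given above amounts to a set comparison, sketched here. The seven Ebolavirus gene names serve as an illustrative reference list, and the function names are hypothetical; non-coding terminal regions are ignored, as in the text.</p>

```python
# Gene names used as an illustrative reference set for an Ebolavirus genome.
EBOLAVIRUS_REFERENCE_GENES = ("NP", "VP35", "VP40", "GP", "VP30", "VP24", "L")


def missing_genes(annotated_genes, reference_genes=EBOLAVIRUS_REFERENCE_GENES):
    """List reference genes that are absent from a sequence's annotation."""
    present = set(annotated_genes)
    return [gene for gene in reference_genes if gene not in present]


def is_full_length(annotated_genes, reference_genes=EBOLAVIRUS_REFERENCE_GENES):
    """A genome counts as full length when every reference gene is present."""
    return not missing_genes(annotated_genes, reference_genes)


complete = is_full_length(["NP", "VP35", "VP40", "GP", "VP30", "VP24", "L"])
partial_missing = missing_genes(["NP", "VP35", "GP"])
```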
<fig id="F1" orientation="portrait" position="float">
<label>Figure 1.</label>
<caption>
<p>Virus Variation Resource search interface page. (
<bold>A</bold>
) The Ebolavirus module search interface prior to selection of filters and hidden elements. (
<bold>B</bold>
) The Ebolavirus module search interface with all elements opened and several example searches displayed in the query builder. The search page is divided into three elements. The first element supports selection of protein or nucleotide sequences based on standardized metadata terms generated by processing pipelines described in the text. Menus support filtering of sequences based on gene or protein names, host, isolation country and isolation source, and collection and release date ranges can be set with text boxes. Additional filters are accessible with a drop-down arrow revealing options for environmental or laboratory isolates, vaccine strains, keyword or sequence string searches, and optional menus tailored to specific viruses. The second element supports searches based on GenBank accessions – either using the text box or by uploading a text file of accessions. The third element includes the query builder where the number of sequences retrieved from individual searches can be viewed by clicking one of the ‘Add query’ buttons. When multiple searches are added to the Query Builder, the total number of unique sequence records is also summed. A checkbox is provided that allows identical sequences to be collapsed and represented by the oldest sequence on the results table. Clicking the ‘Show results’ button opens a separate browser tab and displays all of the sequences meeting the criteria in each of the checked queries in the results interface.</p>
</caption>
<graphic xlink:href="gkw1065fig1"></graphic>
</fig>
<p>During both protein and nucleotide searches, users can explicitly define the genomic regions present on retrieved sequences using drop-down menus that support multiple selections. Additionally, sequences can be filtered using standardized source metadata terms for host, region/country and isolation source using similar pull-down menus. The host and country menus are arranged so that aggregate terms are listed in the top portion of the menu and more discrete terms below. In addition to these common filters, there are module-specific filters: species for Ebolaviruses, and types and disease for dengue virus. The influenza virus module also provides some module-specific search options. For example, a user can select ‘Full length only’ to include sequences with complete coding regions or ‘Full length plus’ to also include sequences whose coding regions are complete apart from a missing start and/or stop codon. Several other specific filters are also available on the influenza module search interface, such as H and N subtypes, minimum or maximum sequence lengths, and inclusion or exclusion of pandemic H1N1 viruses.</p>
<p>A second set of functions and filters is included within the ‘Additional filters’ menu. Here users can search for keywords in the GenBank record deflines or strings within sequences. There are also filters to include or exclude laboratory isolates, vaccine strains, and environmental isolates. One can also select specific rotavirus segment types based on assignment by the Rotavirus Classification Working Group (
<xref rid="B27" ref-type="bibr">27</xref>
,
<xref rid="B28" ref-type="bibr">28</xref>
), or select specific sequences by GenBank accession. Once the parameters for a specific search are selected, a user can choose to add the query to the query builder and define another search, or they can go directly to the results. Several searches can be run and added to the query builder, where the combination of filters and number of retrieved sequences is displayed for each search. The number of unique sequences can be displayed using the ‘collapse identical sequences’ checkbox. Individual searches can then be selected and/or combined and sent to the results page for further refinement and analysis.</p>
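<p>Collapsing identical sequences can be thought of as grouping records by their sequence string and keeping the oldest record as the representative. The sketch below illustrates that idea only; the record layout, the case-insensitive comparison, the choice of date used to decide which record is oldest, and the accessions shown are all assumptions made for the example rather than details of the resource.</p>
<preformat>
```python
from collections import defaultdict
from datetime import date


def collapse_identical(records):
    """Group records that share an identical sequence string.

    Each record is a tuple (accession, record_date, sequence). The result is
    a sorted list of (representative_accession, group_size) pairs, where the
    representative is the record with the earliest date in its group.
    """
    groups = defaultdict(list)
    for accession, record_date, sequence in records:
        # Assumption: sequences are compared without regard to letter case.
        groups[sequence.upper()].append((record_date, accession))

    collapsed = []
    for members in groups.values():
        oldest_date, oldest_accession = min(members)
        collapsed.append((oldest_accession, len(members)))
    return sorted(collapsed)


# Hypothetical accessions and dates, for illustration only.
records = [
    ("ACC0002", date(2015, 3, 1), "ATGC"),
    ("ACC0001", date(2014, 6, 1), "atgc"),
    ("ACC0003", date(2016, 1, 1), "ATGG"),
]

for accession, count in collapse_identical(records):
    print(accession, count)
```
</preformat>
<p>The group size corresponds to the count that the text describes in the ‘Identical sequences’ column of the results table.</p>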
</sec>
<sec id="SEC1-6">
<title>Results page</title>
<p>The results page supports selection of sequences from the search set for analysis or download. Search parameters are displayed at the top of the results page, and a table displays retrieved sequences and associated metadata. The individual columns within the table can be selected to display specific sets of metadata and hyperlinked GenBank and BioSample accessions (
<xref rid="B29" ref-type="bibr">29</xref>
). BioSample records store an extended set of sample descriptors and are linked to Sequence Read Archive (SRA) (
<xref rid="B30" ref-type="bibr">30</xref>
) records, allowing users to easily find sequence read data associated with retrieved GenBank sequences when available. One new feature is the ability to collapse identical retrieved sequences for all viruses as described in the preceding section. When identical sequences are collapsed on the query page, they will be represented by a single sequence on the results page with the number of collapsed sequences shown in the ‘Identical sequences’ column (see Figure
<xref ref-type="fig" rid="F2">2</xref>
). Clicking the arrows in the ‘Identical sequences’ column displays the individual sequences and makes them selectable. Users can now customize sequence titles, including the FASTA defline of downloaded sequences and tree labels, using the ‘Customize label’ tool. The defline can be modified to include various types of data such as the sequence accession number, calculated genomic region, host, isolation source, collection date or country, as well as field separators such as pipes or slashes. User-selected titles will also be displayed in multi-sequence alignments and trees as described in the following section.</p>
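<p>To make the label customization concrete, the following sketch joins user-chosen metadata fields into a FASTA defline with a chosen separator. It is an illustration only: the function, the field names, the placeholder used for missing values, and the example record are hypothetical and do not describe the ‘Customize label’ tool's internals.</p>
<preformat>
```python
def build_defline(record, fields, separator="|"):
    """Join the chosen metadata fields into a FASTA defline.

    Missing fields are written as "NA" (an assumption for this sketch).
    """
    values = [str(record.get(field, "NA")) for field in fields]
    return ">" + separator.join(values)


# Hypothetical record, for illustration only.
record = {
    "accession": "ACC0001",
    "host": "Homo sapiens",
    "country": "USA",
    "collection_date": "2014",
}

print(build_defline(record, ["accession", "host", "country"]))
print(build_defline(record, ["accession", "collection_date"], separator="/"))
```
</preformat>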
<fig id="F2" orientation="portrait" position="float">
<label>Figure 2.</label>
<caption>
<p>Virus Variation Resource results interface page. The results interface displays the search criteria at the top of the page and a table of retrieved sequences below. There is a row of functions directly above the table of retrieved sequences that supports a number of actions. For example, users can select the visible columns in the results table using the ‘Select columns’ link, or quickly display multiple sequence alignments of selected sequences using the ‘Build sequence alignment’ button. There is also an option to customize sequence labels before downloading them or building trees. Individual GenBank or BioSample records listed in the table can be reviewed by clicking the hyperlinked accessions. If identical sequences were collapsed, they can be expanded to view individual accessions by clicking the blue arrow in the ‘Identical sequences’ column.</p>
</caption>
<graphic xlink:href="gkw1065fig2"></graphic>
</fig>
</sec>
<sec id="SEC1-7">
<title>Analysis tools</title>
<p>Users can build multiple sequence alignments or trees from selected sequences, and these in turn can be downloaded in various formats. The influenza module uses previously described tools for these functions (
<xref rid="B8" ref-type="bibr">8</xref>
,
<xref rid="B13" ref-type="bibr">13</xref>
,
<xref rid="B31" ref-type="bibr">31</xref>
), but a new set of tools has been developed for other viruses. Multiple sequence alignments are constructed using an optimized version of MUSCLE, and rooted trees are generated using the Unweighted Pair Group Method with nucleotide or amino acid k-mers of length six (
<xref rid="B32" ref-type="bibr">32</xref>
) (see Figure
<xref ref-type="fig" rid="F3">3</xref>
). The multiple sequence alignment display includes a navigation map above the alignment, a variation histogram and a consensus sequence. Characters are colored to indicate variable positions. The alignment can be downloaded in FASTA, Clustal, Phylip, NEXUS, or ASN.1 formats. The tree display supports a variety of layouts, including rectangular and slanted cladograms, radial trees and circular trees. The image can be downloaded as a PDF, and the tree file can be downloaded in ASN.1 text or binary, Newick, or NEXUS formats. These options are accessible through the ‘Tools’ menu in the viewer. The data labels on multi-sequence alignments and trees can be customized from the results table before the tree is calculated using the ‘Customize label’ options, making it easier to identify the distribution of sample/sequence characteristics. When certain download formats are selected, customized labels will be included in the downloaded files (FASTA and ASN.1 for the multiple alignments, and all files for the trees). A URL is also provided to make sharing a tree easy.</p>
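<p>The general idea of building a rooted tree from k-mer content with the Unweighted Pair Group Method can be sketched in a few lines. The code below is a simplified, pure-Python illustration and should not be read as the resource's implementation: the distance measure (one minus the fraction of shared six-mer occurrences) is an assumption made for the example, the tree is returned as nested tuples without branch lengths, and the sequence names are hypothetical.</p>
<preformat>
```python
from collections import Counter
from itertools import combinations


def kmer_profile(sequence, k=6):
    """Count every overlapping k-mer in the sequence."""
    return Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))


def kmer_distance(a, b, k=6):
    """Return 1 minus the fraction of shared k-mer occurrences (assumed measure)."""
    pa, pb = kmer_profile(a, k), kmer_profile(b, k)
    shared = sum(min(count, pb[kmer]) for kmer, count in pa.items())
    total = max(sum(pa.values()), sum(pb.values()))
    return 1.0 - shared / total if total else 0.0


def upgma(names, sequences, k=6):
    """Cluster sequences by UPGMA and return the rooted tree as nested tuples."""
    # Each cluster id maps to (subtree, number of leaves in the subtree).
    clusters = {i: (names[i], 1) for i in range(len(names))}
    dist = {
        frozenset((i, j)): kmer_distance(sequences[i], sequences[j], k)
        for i, j in combinations(range(len(names)), 2)
    }
    next_id = len(names)
    while len(clusters) > 1:
        # Merge the closest pair of clusters.
        pair = min(dist, key=dist.get)
        i, j = sorted(pair)
        (tree_i, size_i), (tree_j, size_j) = clusters.pop(i), clusters.pop(j)
        del dist[pair]
        # Distance to the merged cluster is the size-weighted average.
        for other in clusters:
            d_i = dist.pop(frozenset((i, other)))
            d_j = dist.pop(frozenset((j, other)))
            merged = (d_i * size_i + d_j * size_j) / (size_i + size_j)
            dist[frozenset((next_id, other))] = merged
        clusters[next_id] = ((tree_i, tree_j), size_i + size_j)
        next_id += 1
    return next(iter(clusters.values()))[0]


# Hypothetical toy sequences, for illustration only.
names = ["seqA", "seqB", "seqC"]
sequences = ["ATGCATGCATGC", "ATGCATGCATGA", "TTTTTTTTTTTT"]
print(upgma(names, sequences))
```
</preformat>
<p>Because this approach works from k-mer counts rather than an alignment, it is fast, which is consistent with the text's later description of the current tree as a quick tree.</p>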
<fig id="F3" orientation="portrait" position="float">
<label>Figure 3.</label>
<caption>
<p>Virus Variation Resource tree and multi-sequence alignment displays. (
<bold>A</bold>
) A sample tree is shown depicting the use of standardized metadata terms as sequence labels. The tree was built from 31 West Nile virus complete polyprotein sequences collected since 2013. Sequence labels are based on GenBank accessions, host, country of isolation and isolation date. Left clicking a node highlights the lineage, and hovering over a node with the cursor displays a menu that includes descriptors for that particular sample, including GenBank accession and available standardized metadata terms for host, country, isolation source, etc. The menu also includes a function to reroot the tree around that sequence. (
<bold>B</bold>
) A multi-sequence alignment is shown for the same 31 West Nile polyprotein sequences. Individual GenBank accessions are listed to the left next to sequences. Left clicking the accession displays a menu that includes the standardized metadata label chosen in the results interface, a link to the sequence in GenBank, and a function to use that sequence as an anchor for the alignment. Differences between residues in a given sequence and the consensus are highlighted in red. A histogram above the alignment shows coverage in blue and the frequency of changes in red.</p>
</caption>
<graphic xlink:href="gkw1065fig3"></graphic>
</fig>
</sec>
</sec>
<sec id="SEC2">
<title>FUTURE DIRECTIONS</title>
<p>The Virus Variation Resource described here provides a number of features that improve the usability of archival sequence data. The resource now includes more than 20% of the GenBank sequences that are assigned viral taxonomy. Further improvement will be dependent on which viruses are added in the future and on updates to the various pipelines, interfaces and tools so that they can further support user needs. Our plan is to increase the pace at which new virus species are added to the Virus Variation Resource, and we are currently developing layers of data processing – the least transformative of which could be applied across all viral sequences but still provide basic information about a sequence. The search interface and data displays will be revised so that they better support user-required comparative genomic functions across a much larger number of viral species from the same query page. We also intend to support searches based on author names and more detailed sample information, such as clinical symptoms or laboratory handling. Though we will begin parsing the potentially rich metadata sets from BioSample records, the success of this effort will ultimately rest on improved community awareness and more consistent submission of metadata to public databases.</p>
<p>Given the unbridled growth and clear potential of nucleotide sequencing efforts, one must assume the current Virus Variation Resource is just scratching the surface of future bioinformatic needs. The current resource model is suited to viruses that have experimentally validated annotation, and similar modules are in development for additional viral species. However, the vast majority of viruses do not have strong experimental evidence for protein coding regions, making it difficult to build a Virus Variation module including an annotation pipeline. In these cases annotation will need to be inferred from related, experimentally studied viruses, requiring new approaches and better ways of standardizing gene and protein information across multiple groups of viruses. Our current annotation pipeline development is directed toward these goals, and we intend to extend public access to these pipelines beyond our current influenza virus module. We also intend to reveal resource-derived annotation as tracks on multiple sequence alignments, making annotated sequences available for download and improving access to our data sets. This will also enable users to limit downloads and multiple sequence alignments to selected mature peptides for polyprotein sequences, and trees to be built from selected genomic regions.</p>
<p>Finally, there are a variety of enhancements to our tools under development. We are developing improved tree visualizations that support better search and markup functions, similar to those currently used in the influenza virus module. Some limitations of the tree function will be addressed at a later time by giving the user the option of viewing either the quick tree that is currently offered or a more sophisticated combination of a MUSCLE multiple sequence alignment and a phylogenetic tree. We are also interested in supporting BLAST-based searches within our data sets to support more precise sequence associations. Ultimately, the presumed very large sequencing datasets of the future will require better ways to evaluate data retrieved from searches which, in turn, will require better integration of search functions with data visualizations such as trees.</p>
<p>Members of the scientific community are encouraged to contact the NCBI Help Desk (
<email>ncbi-help@ncbi.nlm.nih.gov</email>
) to make suggestions to improve the Virus Variation Resource, or to assist with establishing annotation or metadata standards.</p>
</sec>
</body>
<back>
<sec id="SEC3">
<title>FUNDING</title>
<p>Intramural Research Program of the National Institutes of Health; National Library of Medicine. Funding for open access charge: Intramural Research Program of the National Institutes of Health; National Library of Medicine.</p>
<p>
<italic>Conflict of interest statement</italic>
. None declared.</p>
</sec>
<ref-list>
<title>REFERENCES</title>
<ref id="B1">
<label>1.</label>
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name name-style="western">
<surname>Cochrane</surname>
<given-names>G.</given-names>
</name>
,
<name name-style="western">
<surname>Karsch-Mizrachi</surname>
<given-names>I.</given-names>
</name>
,
<name name-style="western">
<surname>Takagi</surname>
<given-names>T.</given-names>
</name>
,
<collab>International Nucleotide Sequence Database Collaboration</collab>
</person-group>
<article-title>The International Nucleotide Sequence Database Collaboration</article-title>
.
<source>Nucleic Acids Res.</source>
<year>2016</year>
;
<volume>44</volume>
:
<fpage>D48</fpage>
<lpage>D50</lpage>
.
<pub-id pub-id-type="pmid">26657633</pub-id>
</mixed-citation>
</ref>
<ref id="B2">
<label>2.</label>
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name name-style="western">
<surname>Benson</surname>
<given-names>D.A.</given-names>
</name>
,
<name name-style="western">
<surname>Clark</surname>
<given-names>K.</given-names>
</name>
,
<name name-style="western">
<surname>Karsch-Mizrachi</surname>
<given-names>I.</given-names>
</name>
,
<name name-style="western">
<surname>Lipman</surname>
<given-names>D.J.</given-names>
</name>
,
<name name-style="western">
<surname>Ostell</surname>
<given-names>J.</given-names>
</name>
,
<name name-style="western">
<surname>Sayers</surname>
<given-names>E.W.</given-names>
</name>
</person-group>
<article-title>GenBank</article-title>
.
<source>Nucleic Acids Res.</source>
<year>2015</year>
;
<volume>43</volume>
:
<fpage>D30</fpage>
<lpage>D35</lpage>
.
<pub-id pub-id-type="pmid">25414350</pub-id>
</mixed-citation>
</ref>
<ref id="B3">
<label>3.</label>
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name name-style="western">
<surname>Paul</surname>
<given-names>D.</given-names>
</name>
,
<name name-style="western">
<surname>Bartenschlager</surname>
<given-names>R.</given-names>
</name>
</person-group>
<article-title>Flaviviridae replication organelles: Oh, what a tangled web we weave</article-title>
.
<source>Annu. Rev. Virol.</source>
<year>2015</year>
;
<volume>2</volume>
:
<fpage>289</fpage>
<lpage>310</lpage>
.
<pub-id pub-id-type="pmid">26958917</pub-id>
</mixed-citation>
</ref>
<ref id="B4">
<label>4.</label>
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name name-style="western">
<surname>Lingala</surname>
<given-names>S.</given-names>
</name>
,
<name name-style="western">
<surname>Ghany</surname>
<given-names>M.G.</given-names>
</name>
</person-group>
<article-title>Natural history of Hepatitis C</article-title>
.
<source>Gastroenterol. Clin. North Am.</source>
<year>2015</year>
;
<volume>44</volume>
:
<fpage>717</fpage>
<lpage>734</lpage>
.
<pub-id pub-id-type="pmid">26600216</pub-id>
</mixed-citation>
</ref>
<ref id="B5">
<label>5.</label>
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name name-style="western">
<surname>McVey</surname>
<given-names>D.S.</given-names>
</name>
,
<name name-style="western">
<surname>Wilson</surname>
<given-names>W.C.</given-names>
</name>
,
<name name-style="western">
<surname>Gay</surname>
<given-names>C.G.</given-names>
</name>
</person-group>
<article-title>West Nile virus</article-title>
.
<source>Rev. Sci. Tech.</source>
<year>2015</year>
;
<volume>34</volume>
:
<fpage>431</fpage>
<lpage>439</lpage>
.
<pub-id pub-id-type="pmid">26601446</pub-id>
</mixed-citation>
</ref>
<ref id="B6">
<label>6.</label>
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name name-style="western">
<surname>Bavia</surname>
<given-names>L.</given-names>
</name>
,
<name name-style="western">
<surname>Mosimann</surname>
<given-names>A.L.</given-names>
</name>
,
<name name-style="western">
<surname>Aoki</surname>
<given-names>M.N.</given-names>
</name>
,
<name name-style="western">
<surname>Duarte Dos Santos</surname>
<given-names>C.N.</given-names>
</name>
</person-group>
<article-title>A glance at subgenomic flavivirus RNAs and microRNAs in flavivirus infections</article-title>
.
<source>Virol. J.</source>
<year>2016</year>
;
<volume>13</volume>
:
<fpage>84</fpage>
<lpage>103</lpage>
.
<pub-id pub-id-type="pmid">27233361</pub-id>
</mixed-citation>
</ref>
<ref id="B7">
<label>7.</label>
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name name-style="western">
<surname>Faggioni</surname>
<given-names>G.</given-names>
</name>
,
<name name-style="western">
<surname>Pomponi</surname>
<given-names>A.</given-names>
</name>
,
<name name-style="western">
<surname>De Santis</surname>
<given-names>R.</given-names>
</name>
,
<name name-style="western">
<surname>Masuelli</surname>
<given-names>L.</given-names>
</name>
,
<name name-style="western">
<surname>Ciammaruconi</surname>
<given-names>A.</given-names>
</name>
,
<name name-style="western">
<surname>Monaco</surname>
<given-names>F.</given-names>
</name>
,
<name name-style="western">
<surname>Di Gennaro</surname>
<given-names>A.</given-names>
</name>
,
<name name-style="western">
<surname>Marzocchella</surname>
<given-names>L.</given-names>
</name>
,
<name name-style="western">
<surname>Sambri</surname>
<given-names>V.</given-names>
</name>
,
<name name-style="western">
<surname>Lelli</surname>
<given-names>R.</given-names>
</name>
<etal></etal>
</person-group>
<article-title>West Nile alternative open reading frame (N-NS4B/WARF4) is produced in infected West Nile Virus (WNV) cells and induces humoral response in WNV infected individuals</article-title>
.
<source>Virol. J.</source>
<year>2012</year>
;
<volume>9</volume>
:
<fpage>283</fpage>
<lpage>296</lpage>
.
<pub-id pub-id-type="pmid">23173701</pub-id>
</mixed-citation>
</ref>
<ref id="B8">
<label>8.</label>
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name name-style="western">
<surname>Bao</surname>
<given-names>Y.</given-names>
</name>
,
<name name-style="western">
<surname>Bolotov</surname>
<given-names>P.</given-names>
</name>
,
<name name-style="western">
<surname>Dernovoy</surname>
<given-names>D.</given-names>
</name>
,
<name name-style="western">
<surname>Kiryutin</surname>
<given-names>B.</given-names>
</name>
,
<name name-style="western">
<surname>Zaslavsky</surname>
<given-names>L.</given-names>
</name>
,
<name name-style="western">
<surname>Tatusova</surname>
<given-names>T.</given-names>
</name>
,
<name name-style="western">
<surname>Ostell</surname>
<given-names>J.</given-names>
</name>
,
<name name-style="western">
<surname>Lipman</surname>
<given-names>D.</given-names>
</name>
</person-group>
<article-title>The influenza virus resource at the National Center for Biotechnology Information</article-title>
.
<source>J. Virol.</source>
<year>2008</year>
;
<volume>82</volume>
:
<fpage>596</fpage>
<lpage>601</lpage>
.
<pub-id pub-id-type="pmid">17942553</pub-id>
</mixed-citation>
</ref>
<ref id="B9">
<label>9.</label>
<mixed-citation publication-type="book">
<person-group person-group-type="author">
<name name-style="western">
<surname>Foley</surname>
<given-names>B.</given-names>
</name>
,
<name name-style="western">
<surname>Leitner</surname>
<given-names>T.</given-names>
</name>
,
<name name-style="western">
<surname>Apetrei</surname>
<given-names>C.</given-names>
</name>
,
<name name-style="western">
<surname>Hahn</surname>
<given-names>B.</given-names>
</name>
,
<name name-style="western">
<surname>Mizrachi</surname>
<given-names>I.</given-names>
</name>
,
<name name-style="western">
<surname>Mullins</surname>
<given-names>J.</given-names>
</name>
,
<name name-style="western">
<surname>Rambaut</surname>
<given-names>A.</given-names>
</name>
,
<name name-style="western">
<surname>Wolinsky</surname>
<given-names>S.</given-names>
</name>
,
<name name-style="western">
<surname>Korber</surname>
<given-names>B.</given-names>
</name>
</person-group>
<source>HIV Sequence Compendium 2013</source>
.
<year>2013</year>
;
<publisher-loc>New Mexico</publisher-loc>
:
<publisher-name>Theoretical Biology and Biophysics Group, Los Alamos National Laboratory</publisher-name>
<comment>LA-UR 13-26007</comment>
.</mixed-citation>
</ref>
<ref id="B10">
<label>10.</label>
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name name-style="western">
<surname>Greene</surname>
<given-names>J.M.</given-names>
</name>
,
<name name-style="western">
<surname>Collins</surname>
<given-names>F.</given-names>
</name>
,
<name name-style="western">
<surname>Lefkowitz</surname>
<given-names>E.J.</given-names>
</name>
,
<name name-style="western">
<surname>Roos</surname>
<given-names>D.</given-names>
</name>
,
<name name-style="western">
<surname>Scheuermann</surname>
<given-names>R.H.</given-names>
</name>
,
<name name-style="western">
<surname>Sobral</surname>
<given-names>B.</given-names>
</name>
,
<name name-style="western">
<surname>Stevens</surname>
<given-names>R.</given-names>
</name>
,
<name name-style="western">
<surname>White</surname>
<given-names>O.</given-names>
</name>
,
<name name-style="western">
<surname>Di Francesco</surname>
<given-names>V.</given-names>
</name>
</person-group>
<article-title>National Institute of Allergy and Infectious Diseases bioinformatics resource centers: new assets for pathogen informatics</article-title>
.
<source>Infect. Immun.</source>
<year>2007</year>
;
<volume>75</volume>
:
<fpage>3212</fpage>
<lpage>3219</lpage>
.
<pub-id pub-id-type="pmid">17420237</pub-id>
</mixed-citation>
</ref>
<ref id="B11">
<label>11.</label>
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name name-style="western">
<surname>Pickett</surname>
<given-names>B.E.</given-names>
</name>
,
<name name-style="western">
<surname>Sadat</surname>
<given-names>E.L.</given-names>
</name>
,
<name name-style="western">
<surname>Zhang</surname>
<given-names>Y.</given-names>
</name>
,
<name name-style="western">
<surname>Noronha</surname>
<given-names>J.M.</given-names>
</name>
,
<name name-style="western">
<surname>Squires</surname>
<given-names>R.B.</given-names>
</name>
,
<name name-style="western">
<surname>Hunt</surname>
<given-names>V.</given-names>
</name>
,
<name name-style="western">
<surname>Liu</surname>
<given-names>M.</given-names>
</name>
,
<name name-style="western">
<surname>Kumar</surname>
<given-names>S.</given-names>
</name>
,
<name name-style="western">
<surname>Zaremba</surname>
<given-names>S.</given-names>
</name>
,
<name name-style="western">
<surname>Gu</surname>
<given-names>Z.</given-names>
</name>
<etal></etal>
</person-group>
<article-title>ViPR: an open bioinformatics database and analysis resource for virology research</article-title>
.
<source>Nucleic Acids Res.</source>
<year>2012</year>
;
<volume>40</volume>
:
<fpage>D593</fpage>
<lpage>D598</lpage>
.
<pub-id pub-id-type="pmid">22006842</pub-id>
</mixed-citation>
</ref>
<ref id="B12">
<label>12.</label>
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name name-style="western">
<surname>Van Doorslaer</surname>
<given-names>K.</given-names>
</name>
,
<name name-style="western">
<surname>Tan</surname>
<given-names>Q.</given-names>
</name>
,
<name name-style="western">
<surname>Xirasagar</surname>
<given-names>S.</given-names>
</name>
,
<name name-style="western">
<surname>Bandaru</surname>
<given-names>S.</given-names>
</name>
,
<name name-style="western">
<surname>Gopalan</surname>
<given-names>V.</given-names>
</name>
,
<name name-style="western">
<surname>Mohamoud</surname>
<given-names>Y.</given-names>
</name>
,
<name name-style="western">
<surname>Huyen</surname>
<given-names>Y.</given-names>
</name>
,
<name name-style="western">
<surname>McBride</surname>
<given-names>A.A.</given-names>
</name>
</person-group>
<article-title>The Papillomavirus Episteme: a central resource for papillomavirus sequence data and analysis</article-title>
.
<source>Nucleic Acids Res.</source>
<year>2013</year>
;
<volume>41</volume>
:
<fpage>D571</fpage>
<lpage>D578</lpage>
.
<pub-id pub-id-type="pmid">23093593</pub-id>
</mixed-citation>
</ref>
<ref id="B13">
<label>13.</label>
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name name-style="western">
<surname>Resch</surname>
<given-names>W.</given-names>
</name>
,
<name name-style="western">
<surname>Zaslavsky</surname>
<given-names>L.</given-names>
</name>
,
<name name-style="western">
<surname>Kiryutin</surname>
<given-names>B.</given-names>
</name>
,
<name name-style="western">
<surname>Rozanov</surname>
<given-names>M.</given-names>
</name>
,
<name name-style="western">
<surname>Bao</surname>
<given-names>Y.</given-names>
</name>
,
<name name-style="western">
<surname>Tatusova</surname>
<given-names>T.A.</given-names>
</name>
</person-group>
<article-title>Virus variation resources at the National Center for Biotechnology Information: dengue virus</article-title>
.
<source>BMC Microbiol.</source>
<year>2009</year>
;
<volume>9</volume>
:
<fpage>65</fpage>
<lpage>71</lpage>
.
<pub-id pub-id-type="pmid">19341451</pub-id>
</mixed-citation>
</ref>
<ref id="B14">
<label>14.</label>
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name name-style="western">
<surname>Brister</surname>
<given-names>J.R.</given-names>
</name>
,
<name name-style="western">
<surname>Bao</surname>
<given-names>Y.</given-names>
</name>
,
<name name-style="western">
<surname>Zhdanov</surname>
<given-names>S.A.</given-names>
</name>
,
<name name-style="western">
<surname>Ostapchuck</surname>
<given-names>Y.</given-names>
</name>
,
<name name-style="western">
<surname>Chetvernin</surname>
<given-names>V.</given-names>
</name>
,
<name name-style="western">
<surname>Kiryutin</surname>
<given-names>B.</given-names>
</name>
,
<name name-style="western">
<surname>Zaslavsky</surname>
<given-names>L.</given-names>
</name>
,
<name name-style="western">
<surname>Kimelman</surname>
<given-names>M.</given-names>
</name>
,
<name name-style="western">
<surname>Tatusova</surname>
<given-names>T.A.</given-names>
</name>
</person-group>
<article-title>Virus Variation Resource–recent updates and future directions</article-title>
.
<source>Nucleic Acids Res.</source>
<year>2014</year>
;
<volume>42</volume>
:
<fpage>D660</fpage>
<lpage>D665</lpage>
.
<pub-id pub-id-type="pmid">24304891</pub-id>
</mixed-citation>
</ref>
<ref id="B15">
<label>15.</label>
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name name-style="western">
<surname>Klema</surname>
<given-names>V.J.</given-names>
</name>
,
<name name-style="western">
<surname>Ye</surname>
<given-names>M.</given-names>
</name>
,
<name name-style="western">
<surname>Hindupur</surname>
<given-names>A.</given-names>
</name>
,
<name name-style="western">
<surname>Teramoto</surname>
<given-names>T.</given-names>
</name>
,
<name name-style="western">
<surname>Gottipati</surname>
<given-names>K.</given-names>
</name>
,
<name name-style="western">
<surname>Padmanabhan</surname>
<given-names>R.</given-names>
</name>
,
<name name-style="western">
<surname>Choi</surname>
<given-names>K.H.</given-names>
</name>
</person-group>
<article-title>Dengue virus nonstructural protein 5 (NS5) assembles into a dimer with a unique methyltransferase and polymerase interface</article-title>
.
<source>PLoS Pathog.</source>
<year>2016</year>
;
<volume>12</volume>
:
<fpage>e1005451</fpage>
.
<pub-id pub-id-type="pmid">26895240</pub-id>
</mixed-citation>
</ref>
<ref id="B16">
<label>16.</label>
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name name-style="western">
<surname>Bell</surname>
<given-names>A.</given-names>
</name>
,
<name name-style="western">
<surname>Lewandowski</surname>
<given-names>K.</given-names>
</name>
,
<name name-style="western">
<surname>Myers</surname>
<given-names>R.</given-names>
</name>
,
<name name-style="western">
<surname>Wooldridge</surname>
<given-names>D.</given-names>
</name>
,
<name name-style="western">
<surname>Aarons</surname>
<given-names>E.</given-names>
</name>
,
<name name-style="western">
<surname>Simpson</surname>
<given-names>A.</given-names>
</name>
,
<name name-style="western">
<surname>Vipond</surname>
<given-names>R.</given-names>
</name>
,
<name name-style="western">
<surname>Jacobs</surname>
<given-names>M.</given-names>
</name>
,
<name name-style="western">
<surname>Gharbia</surname>
<given-names>S.</given-names>
</name>
,
<name name-style="western">
<surname>Zambon</surname>
<given-names>M.</given-names>
</name>
</person-group>
<article-title>Genome sequence analysis of Ebola virus in clinical samples from three British healthcare workers, August 2014 to March 2015</article-title>
.
<source>Euro Surveill.</source>
<year>2015</year>
;
<volume>20</volume>
:
<fpage>6</fpage>
<lpage>10</lpage>
.
<pub-id pub-id-type="pmid">26290487</pub-id>
</mixed-citation>
</ref>
<ref id="B17">
<label>17.</label>
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name name-style="western">
<surname>Agbemabiese</surname>
<given-names>C.A.</given-names>
</name>
,
<name name-style="western">
<surname>Nakagomi</surname>
<given-names>T.</given-names>
</name>
,
<name name-style="western">
<surname>Doan</surname>
<given-names>Y.H.</given-names>
</name>
,
<name name-style="western">
<surname>Do</surname>
<given-names>L.P.</given-names>
</name>
,
<name name-style="western">
<surname>Damanka</surname>
<given-names>S.</given-names>
</name>
,
<name name-style="western">
<surname>Armah</surname>
<given-names>G.E.</given-names>
</name>
,
<name name-style="western">
<surname>Nakagomi</surname>
<given-names>O.</given-names>
</name>
</person-group>
<article-title>Genomic constellation and evolution of Ghanaian G2P[4] rotavirus strains from a global perspective</article-title>
.
<source>Infect. Genet. Evol.</source>
<year>2016</year>
;
<volume>45</volume>
:
<fpage>122</fpage>
<lpage>131</lpage>
.
<pub-id pub-id-type="pmid">27569866</pub-id>
</mixed-citation>
</ref>
<ref id="B18">
<label>18.</label>
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name name-style="western">
<surname>Bao</surname>
<given-names>Y.</given-names>
</name>
,
<name name-style="western">
<surname>Bolotov</surname>
<given-names>P.</given-names>
</name>
,
<name name-style="western">
<surname>Dernovoy</surname>
<given-names>D.</given-names>
</name>
,
<name name-style="western">
<surname>Kiryutin</surname>
<given-names>B.</given-names>
</name>
,
<name name-style="western">
<surname>Tatusova</surname>
<given-names>T.</given-names>
</name>
</person-group>
<article-title>FLAN: a web server for influenza virus genome annotation</article-title>
.
<source>Nucleic Acids Res.</source>
<year>2007</year>
;
<volume>35</volume>
:
<fpage>W280</fpage>
<lpage>W284</lpage>
.
<pub-id pub-id-type="pmid">17545199</pub-id>
</mixed-citation>
</ref>
<ref id="B19">
<label>19.</label>
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name name-style="western">
<surname>O'Leary</surname>
<given-names>N.A.</given-names>
</name>
,
<name name-style="western">
<surname>Wright</surname>
<given-names>M.W.</given-names>
</name>
,
<name name-style="western">
<surname>Brister</surname>
<given-names>J.R.</given-names>
</name>
,
<name name-style="western">
<surname>Ciufo</surname>
<given-names>S.</given-names>
</name>
,
<name name-style="western">
<surname>Haddad</surname>
<given-names>D.</given-names>
</name>
,
<name name-style="western">
<surname>McVeigh</surname>
<given-names>R.</given-names>
</name>
,
<name name-style="western">
<surname>Rajput</surname>
<given-names>B.</given-names>
</name>
,
<name name-style="western">
<surname>Robbertse</surname>
<given-names>B.</given-names>
</name>
,
<name name-style="western">
<surname>Smith-White</surname>
<given-names>B.</given-names>
</name>
,
<name name-style="western">
<surname>Ako-Adjei</surname>
<given-names>D.</given-names>
</name>
<etal></etal>
</person-group>
<article-title>Reference sequence (RefSeq) database at NCBI: current status, taxonomic expansion, and functional annotation</article-title>
.
<source>Nucleic Acids Res.</source>
<year>2016</year>
;
<volume>44</volume>
:
<fpage>D733</fpage>
<lpage>D745</lpage>
.
<pub-id pub-id-type="pmid">26553804</pub-id>
</mixed-citation>
</ref>
<ref id="B20">
<label>20.</label>
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name name-style="western">
<surname>Matthijnssens</surname>
<given-names>J.</given-names>
</name>
,
<name name-style="western">
<surname>Ciarlet</surname>
<given-names>M.</given-names>
</name>
,
<name name-style="western">
<surname>McDonald</surname>
<given-names>S.M.</given-names>
</name>
,
<name name-style="western">
<surname>Attoui</surname>
<given-names>H.</given-names>
</name>
,
<name name-style="western">
<surname>Banyai</surname>
<given-names>K.</given-names>
</name>
,
<name name-style="western">
<surname>Brister</surname>
<given-names>J.R.</given-names>
</name>
,
<name name-style="western">
<surname>Buesa</surname>
<given-names>J.</given-names>
</name>
,
<name name-style="western">
<surname>Esona</surname>
<given-names>M.D.</given-names>
</name>
,
<name name-style="western">
<surname>Estes</surname>
<given-names>M.K.</given-names>
</name>
,
<name name-style="western">
<surname>Gentsch</surname>
<given-names>J.R.</given-names>
</name>
<etal></etal>
</person-group>
<article-title>Uniformity of rotavirus strain nomenclature proposed by the Rotavirus Classification Working Group (RCWG)</article-title>
.
<source>Arch. Virol.</source>
<year>2011</year>
;
<volume>156</volume>
:
<fpage>1397</fpage>
<lpage>1413</lpage>
.
<pub-id pub-id-type="pmid">21597953</pub-id>
</mixed-citation>
</ref>
<ref id="B21">
<label>21.</label>
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name name-style="western">
<surname>Brister</surname>
<given-names>J.R.</given-names>
</name>
,
<name name-style="western">
<surname>Bao</surname>
<given-names>Y.</given-names>
</name>
,
<name name-style="western">
<surname>Kuiken</surname>
<given-names>C.</given-names>
</name>
,
<name name-style="western">
<surname>Lefkowitz</surname>
<given-names>E.J.</given-names>
</name>
,
<name name-style="western">
<surname>Le Mercier</surname>
<given-names>P.</given-names>
</name>
,
<name name-style="western">
<surname>Leplae</surname>
<given-names>R.</given-names>
</name>
,
<name name-style="western">
<surname>Madupu</surname>
<given-names>R.</given-names>
</name>
,
<name name-style="western">
<surname>Scheuermann</surname>
<given-names>R.H.</given-names>
</name>
,
<name name-style="western">
<surname>Schobel</surname>
<given-names>S.</given-names>
</name>
,
<name name-style="western">
<surname>Seto</surname>
<given-names>D.</given-names>
</name>
<etal></etal>
</person-group>
<article-title>Towards viral genome annotation standards, report from the 2010 NCBI annotation workshop</article-title>
.
<source>Viruses</source>
.
<year>2010</year>
;
<volume>2</volume>
:
<fpage>2258</fpage>
<lpage>2268</lpage>
.
<pub-id pub-id-type="pmid">21994619</pub-id>
</mixed-citation>
</ref>
<ref id="B22">
<label>22.</label>
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name name-style="western">
<surname>Brister</surname>
<given-names>J.R.</given-names>
</name>
,
<name name-style="western">
<surname>Le Mercier</surname>
<given-names>P.</given-names>
</name>
,
<name name-style="western">
<surname>Hu</surname>
<given-names>J.C.</given-names>
</name>
</person-group>
<article-title>Microbial virus genome annotation-mustering the troops to fight the sequence onslaught</article-title>
.
<source>Virology</source>
.
<year>2012</year>
;
<volume>434</volume>
:
<fpage>175</fpage>
<lpage>180</lpage>
.
<pub-id pub-id-type="pmid">23084289</pub-id>
</mixed-citation>
</ref>
<ref id="B23">
<label>23.</label>
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name name-style="western">
<surname>Kuhn</surname>
<given-names>J.H.</given-names>
</name>
,
<name name-style="western">
<surname>Andersen</surname>
<given-names>K.G.</given-names>
</name>
,
<name name-style="western">
<surname>Bao</surname>
<given-names>Y.</given-names>
</name>
,
<name name-style="western">
<surname>Bavari</surname>
<given-names>S.</given-names>
</name>
,
<name name-style="western">
<surname>Becker</surname>
<given-names>S.</given-names>
</name>
,
<name name-style="western">
<surname>Bennett</surname>
<given-names>R.S.</given-names>
</name>
,
<name name-style="western">
<surname>Bergman</surname>
<given-names>N.H.</given-names>
</name>
,
<name name-style="western">
<surname>Blinkova</surname>
<given-names>O.</given-names>
</name>
,
<name name-style="western">
<surname>Bradfute</surname>
<given-names>S.</given-names>
</name>
,
<name name-style="western">
<surname>Brister</surname>
<given-names>J.R.</given-names>
</name>
<etal></etal>
</person-group>
<article-title>Filovirus RefSeq entries: evaluation and selection of filovirus type variants, type sequences, and names</article-title>
.
<source>Viruses</source>
.
<year>2014</year>
;
<volume>6</volume>
:
<fpage>3663</fpage>
<lpage>3682</lpage>
.
<pub-id pub-id-type="pmid">25256396</pub-id>
</mixed-citation>
</ref>
<ref id="B24">
<label>24.</label>
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name name-style="western">
<surname>Brister</surname>
<given-names>J.R.</given-names>
</name>
,
<name name-style="western">
<surname>Ako-Adjei</surname>
<given-names>D.</given-names>
</name>
,
<name name-style="western">
<surname>Bao</surname>
<given-names>Y.</given-names>
</name>
,
<name name-style="western">
<surname>Blinkova</surname>
<given-names>O.</given-names>
</name>
</person-group>
<article-title>NCBI viral genomes resource</article-title>
.
<source>Nucleic Acids Res.</source>
<year>2015</year>
;
<volume>43</volume>
:
<fpage>D571</fpage>
<lpage>D577</lpage>
.
<pub-id pub-id-type="pmid">25428358</pub-id>
</mixed-citation>
</ref>
<ref id="B25">
<label>25.</label>
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name name-style="western">
<surname>Altschul</surname>
<given-names>S.F.</given-names>
</name>
,
<name name-style="western">
<surname>Madden</surname>
<given-names>T.L.</given-names>
</name>
,
<name name-style="western">
<surname>Schäffer</surname>
<given-names>A.A.</given-names>
</name>
,
<name name-style="western">
<surname>Zhang</surname>
<given-names>J.</given-names>
</name>
,
<name name-style="western">
<surname>Zhang</surname>
<given-names>Z.</given-names>
</name>
,
<name name-style="western">
<surname>Miller</surname>
<given-names>W.</given-names>
</name>
,
<name name-style="western">
<surname>Lipman</surname>
<given-names>D.J.</given-names>
</name>
</person-group>
<article-title>Gapped BLAST and PSI-BLAST: a new generation of protein database search programs</article-title>
.
<source>Nucleic Acids Res.</source>
<year>1997</year>
;
<volume>25</volume>
:
<fpage>3389</fpage>
<lpage>3402</lpage>
.
<pub-id pub-id-type="pmid">9254694</pub-id>
</mixed-citation>
</ref>
<ref id="B26">
<label>26.</label>
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name name-style="western">
<surname>Nawrocki</surname>
<given-names>E.P.</given-names>
</name>
,
<name name-style="western">
<surname>Eddy</surname>
<given-names>S.R.</given-names>
</name>
</person-group>
<article-title>Infernal 1.1: 100-fold faster RNA homology searches</article-title>
.
<source>Bioinformatics</source>
.
<year>2013</year>
;
<volume>29</volume>
:
<fpage>2933</fpage>
<lpage>2935</lpage>
.
<pub-id pub-id-type="pmid">24008419</pub-id>
</mixed-citation>
</ref>
<ref id="B27">
<label>27.</label>
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name name-style="western">
<surname>Matthijnssens</surname>
<given-names>J.</given-names>
</name>
,
<name name-style="western">
<surname>Ciarlet</surname>
<given-names>M.</given-names>
</name>
,
<name name-style="western">
<surname>Rahman</surname>
<given-names>M.</given-names>
</name>
,
<name name-style="western">
<surname>Attoui</surname>
<given-names>H.</given-names>
</name>
,
<name name-style="western">
<surname>Banyai</surname>
<given-names>K.</given-names>
</name>
,
<name name-style="western">
<surname>Estes</surname>
<given-names>M.K.</given-names>
</name>
,
<name name-style="western">
<surname>Gentsch</surname>
<given-names>J.R.</given-names>
</name>
,
<name name-style="western">
<surname>Iturriza-Gomara</surname>
<given-names>M.</given-names>
</name>
,
<name name-style="western">
<surname>Kirkwood</surname>
<given-names>C.D.</given-names>
</name>
,
<name name-style="western">
<surname>Martella</surname>
<given-names>V.</given-names>
</name>
<etal></etal>
</person-group>
<article-title>Recommendations for the classification of group A rotaviruses using all 11 genomic RNA segments</article-title>
.
<source>Arch. Virol.</source>
<year>2008</year>
;
<volume>153</volume>
:
<fpage>1621</fpage>
<lpage>1629</lpage>
.
<pub-id pub-id-type="pmid">18604469</pub-id>
</mixed-citation>
</ref>
<ref id="B28">
<label>28.</label>
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name name-style="western">
<surname>Maes</surname>
<given-names>P.</given-names>
</name>
,
<name name-style="western">
<surname>Matthijnssens</surname>
<given-names>J.</given-names>
</name>
,
<name name-style="western">
<surname>Rahman</surname>
<given-names>M.</given-names>
</name>
,
<name name-style="western">
<surname>Van Ranst</surname>
<given-names>M.</given-names>
</name>
</person-group>
<article-title>RotaC: a web-based tool for the complete genome classification of group A rotaviruses</article-title>
.
<source>BMC Microbiol.</source>
<year>2009</year>
;
<volume>9</volume>
:
<fpage>238</fpage>
<lpage>241</lpage>
.
<pub-id pub-id-type="pmid">19930627</pub-id>
</mixed-citation>
</ref>
<ref id="B29">
<label>29.</label>
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name name-style="western">
<surname>Barrett</surname>
<given-names>T.</given-names>
</name>
,
<name name-style="western">
<surname>Clark</surname>
<given-names>K.</given-names>
</name>
,
<name name-style="western">
<surname>Gevorgyan</surname>
<given-names>R.</given-names>
</name>
,
<name name-style="western">
<surname>Gorelenkov</surname>
<given-names>V.</given-names>
</name>
,
<name name-style="western">
<surname>Gribov</surname>
<given-names>E.</given-names>
</name>
,
<name name-style="western">
<surname>Karsch-Mizrachi</surname>
<given-names>I.</given-names>
</name>
,
<name name-style="western">
<surname>Kimelman</surname>
<given-names>M.</given-names>
</name>
,
<name name-style="western">
<surname>Pruitt</surname>
<given-names>K.D.</given-names>
</name>
,
<name name-style="western">
<surname>Resenchuk</surname>
<given-names>S.</given-names>
</name>
,
<name name-style="western">
<surname>Tatusova</surname>
<given-names>T.</given-names>
</name>
<etal></etal>
</person-group>
<article-title>BioProject and BioSample databases at NCBI: facilitating capture and organization of metadata</article-title>
.
<source>Nucleic Acids Res.</source>
<year>2012</year>
;
<volume>40</volume>
:
<fpage>D57</fpage>
<lpage>D63</lpage>
.
<pub-id pub-id-type="pmid">22139929</pub-id>
</mixed-citation>
</ref>
<ref id="B30">
<label>30.</label>
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name name-style="western">
<surname>Kodama</surname>
<given-names>Y.</given-names>
</name>
,
<name name-style="western">
<surname>Shumway</surname>
<given-names>M.</given-names>
</name>
,
<name name-style="western">
<surname>Leinonen</surname>
<given-names>R.</given-names>
</name>
,
<collab>International Nucleotide Sequence Database Collaboration</collab>
</person-group>
<article-title>The sequence read archive: explosive growth of sequencing data</article-title>
.
<source>Nucleic Acids Res.</source>
<year>2012</year>
;
<volume>40</volume>
:
<fpage>D54</fpage>
<lpage>D56</lpage>
.
<pub-id pub-id-type="pmid">22009675</pub-id>
</mixed-citation>
</ref>
<ref id="B31">
<label>31.</label>
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name name-style="western">
<surname>Zaslavsky</surname>
<given-names>L.</given-names>
</name>
,
<name name-style="western">
<surname>Bao</surname>
<given-names>Y.</given-names>
</name>
,
<name name-style="western">
<surname>Tatusova</surname>
<given-names>T.A.</given-names>
</name>
</person-group>
<article-title>Visualization of large influenza virus sequence datasets using adaptively aggregated trees with sampling-based subscale representation</article-title>
.
<source>BMC Bioinformatics</source>
.
<year>2008</year>
;
<volume>9</volume>
:
<fpage>237</fpage>
<lpage>243</lpage>
.
<pub-id pub-id-type="pmid">18485197</pub-id>
</mixed-citation>
</ref>
<ref id="B32">
<label>32.</label>
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name name-style="western">
<surname>Edgar</surname>
<given-names>R.C.</given-names>
</name>
</person-group>
<article-title>MUSCLE: multiple sequence alignment with high accuracy and high throughput</article-title>
.
<source>Nucleic Acids Res.</source>
<year>2004</year>
;
<volume>32</volume>
:
<fpage>1792</fpage>
<lpage>1797</lpage>
.
<pub-id pub-id-type="pmid">15034147</pub-id>
</mixed-citation>
</ref>
</ref-list>
</back>
</pmc>
</record>

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Sante/explor/MersV1/Data/Pmc/Corpus
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000F57  | SxmlIndent | more

Or

HfdSelect -h $EXPLOR_AREA/Data/Pmc/Corpus/biblio.hfd -nk 000F57  | SxmlIndent | more
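
The pipelines above print the indented XML record to the terminal. As a minimal sketch of what one might do with that output, the helper below pulls the PubMed identifiers out of the reference list. The function name `extract_pmids` is hypothetical and not part of Dilib, and the sketch assumes that each `<pub-id>` element sits on its own line, as it does in the record shown on this page.

```shell
#!/bin/sh
# Hypothetical helper (not part of Dilib): print every PMID found in
# <pub-id pub-id-type="pmid"> elements read from standard input.
# Assumption: each <pub-id> element is on its own line.
extract_pmids() {
    sed -n 's/.*<pub-id pub-id-type="pmid">\([0-9][0-9]*\)<\/pub-id>.*/\1/p'
}

# Demonstration on an inline sample copied from the record's reference list.
extract_pmids <<'EOF'
<ref id="B32">
<source>Nucleic Acids Res.</source>
<year>2004</year>
<pub-id pub-id-type="pmid">15034147</pub-id>
</ref>
EOF
```

On a machine with Dilib installed, the function could presumably replace `more` at the end of either pipeline, for example `HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000F57 | SxmlIndent | extract_pmids`, provided the indented output keeps one element per line.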

To add a link to this page on the Wicri network

{{Explor lien
   |wiki=    Sante
   |area=    MersV1
   |flux=    Pmc
   |étape=   Corpus
   |type=    RBID
   |clé=     
   |texte=   
}}

This area was generated with Dilib version V0.6.33.
Data generation: Mon Apr 20 23:26:43 2020. Site generation: Sat Mar 27 09:06:09 2021