H2N2 Exploration Server


Internal identifier: 0001589 (Pmc/Corpus); previous: 0001588; next: 0001590



The document in XML format

<record>
<TEI>
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en">Rule-based meta-analysis reveals the major role of PB2 in influencing influenza A virus virulence in mice</title>
<author>
<name sortKey="Ivan, Fransiskus Xaverius" sort="Ivan, Fransiskus Xaverius" uniqKey="Ivan F" first="Fransiskus Xaverius" last="Ivan">Fransiskus Xaverius Ivan</name>
<affiliation>
<nlm:aff id="Aff1"></nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Kwoh, Chee Keong" sort="Kwoh, Chee Keong" uniqKey="Kwoh C" first="Chee Keong" last="Kwoh">Chee Keong Kwoh</name>
<affiliation>
<nlm:aff id="Aff1"></nlm:aff>
</affiliation>
</author>
</titleStmt>
<publicationStmt>
<idno type="wicri:source">PMC</idno>
<idno type="pmid">31874643</idno>
<idno type="pmc">6929465</idno>
<idno type="url">http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6929465</idno>
<idno type="RBID">PMC:6929465</idno>
<idno type="doi">10.1186/s12864-019-6295-8</idno>
<date when="2019">2019</date>
<idno type="wicri:Area/Pmc/Corpus">000158</idno>
<idno type="wicri:explorRef" wicri:stream="Pmc" wicri:step="Corpus" wicri:corpus="PMC">000158</idno>
</publicationStmt>
<sourceDesc>
<biblStruct>
<analytic>
<title xml:lang="en" level="a" type="main">Rule-based meta-analysis reveals the major role of PB2 in influencing influenza A virus virulence in mice</title>
<author>
<name sortKey="Ivan, Fransiskus Xaverius" sort="Ivan, Fransiskus Xaverius" uniqKey="Ivan F" first="Fransiskus Xaverius" last="Ivan">Fransiskus Xaverius Ivan</name>
<affiliation>
<nlm:aff id="Aff1"></nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Kwoh, Chee Keong" sort="Kwoh, Chee Keong" uniqKey="Kwoh C" first="Chee Keong" last="Kwoh">Chee Keong Kwoh</name>
<affiliation>
<nlm:aff id="Aff1"></nlm:aff>
</affiliation>
</author>
</analytic>
<series>
<title level="j">BMC Genomics</title>
<idno type="eISSN">1471-2164</idno>
<imprint>
<date when="2019">2019</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
</fileDesc>
<profileDesc>
<textClass></textClass>
</profileDesc>
</teiHeader>
<front>
<div type="abstract" xml:lang="en">
<sec>
<title>Background</title>
<p id="Par1">Influenza A virus (IAV) poses a threat to human health and life. Many individual studies have been carried out in mice to uncover the viral factors responsible for the virulence of IAV infections. Nonetheless, a single study may not provide enough confidence about virulence factors, so combining several studies in a meta-analysis is desirable to give a more complete view. To this end, we documented more than 500 records of IAV infections in mice for which the viral proteins could be retrieved and for which the mouse lethal dose 50 or, alternatively, weight loss and/or survival data were available for virulence classification.</p>
</sec>
<sec>
<title>Results</title>
<p id="Par2">IAV virulence models were learned from various datasets containing aligned IAV proteins and the corresponding two virulence classes (avirulent and virulent) or three virulence classes (low, intermediate and high virulence). Three proven rule-based learning approaches, i.e., OneR, JRip and PART, and additionally random forest, were used for modelling. PART models achieved the best performance, with moderate average model accuracies ranging from 65.0 to 84.4% for the two-class problem and from 54.0 to 66.6% for the three-class problem. PART models were comparable to or even better than random forest models and should be preferred on the basis of Occam’s razor. Interestingly, the average accuracy of the models improved when host information was taken into account. For model interpretation, we observed that although many sites in HA were highly correlated with virulence, PART models based on sites in PB2 could compete with, and were often better than, PART models based on sites in HA. Moreover, PART showed a strong preference for including sites in PB2 when models were learned from datasets containing the concatenated alignments of all IAV proteins. Several sites with a known contribution to virulence were found among the top protein sites, and site pairs that may synergistically influence virulence were also uncovered.</p>
</sec>
<sec>
<title>Conclusion</title>
<p id="Par3">Modelling IAV virulence is a challenging problem. Rule-based models generated from viral proteins are useful because they are easy to interpret, but they achieve only moderate performance. More advanced approaches that learn models from features extracted from both viral and host proteins should be considered in future work.</p>
</sec>
</div>
</front>
</TEI>
<pmc article-type="research-article">
<pmc-dir>properties open_access</pmc-dir>
<front>
<journal-meta>
<journal-id journal-id-type="nlm-ta">BMC Genomics</journal-id>
<journal-id journal-id-type="iso-abbrev">BMC Genomics</journal-id>
<journal-title-group>
<journal-title>BMC Genomics</journal-title>
</journal-title-group>
<issn pub-type="epub">1471-2164</issn>
<publisher>
<publisher-name>BioMed Central</publisher-name>
<publisher-loc>London</publisher-loc>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="pmid">31874643</article-id>
<article-id pub-id-type="pmc">6929465</article-id>
<article-id pub-id-type="publisher-id">6295</article-id>
<article-id pub-id-type="doi">10.1186/s12864-019-6295-8</article-id>
<article-categories>
<subj-group subj-group-type="heading">
<subject>Research</subject>
</subj-group>
</article-categories>
<title-group>
<article-title>Rule-based meta-analysis reveals the major role of PB2 in influencing influenza A virus virulence in mice</article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author" corresp="yes">
<contrib-id contrib-id-type="orcid">http://orcid.org/0000-0001-6491-6358</contrib-id>
<name>
<surname>Ivan</surname>
<given-names>Fransiskus Xaverius</given-names>
</name>
<address>
<email>fivan@ntu.edu.sg</email>
</address>
<xref ref-type="aff" rid="Aff1"></xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Kwoh</surname>
<given-names>Chee Keong</given-names>
</name>
<address>
<email>asckkwoh@ntu.edu.sg</email>
</address>
<xref ref-type="aff" rid="Aff1"></xref>
</contrib>
<aff id="Aff1">
<institution-wrap>
<institution-id institution-id-type="ISNI">0000 0001 2224 0361</institution-id>
<institution-id institution-id-type="GRID">grid.59025.3b</institution-id>
<institution>Biomedical Informatics Lab, School of Computer Science and Engineering,</institution>
<institution>Nanyang Technological University,</institution>
</institution-wrap>
Singapore, Singapore</aff>
</contrib-group>
<pub-date pub-type="epub">
<day>24</day>
<month>12</month>
<year>2019</year>
</pub-date>
<pub-date pub-type="pmc-release">
<day>24</day>
<month>12</month>
<year>2019</year>
</pub-date>
<pub-date pub-type="collection">
<year>2019</year>
</pub-date>
<volume>20</volume>
<issue>Suppl 9</issue>
<elocation-id>973</elocation-id>
<history>
<date date-type="received">
<day>10</day>
<month>11</month>
<year>2019</year>
</date>
<date date-type="accepted">
<day>15</day>
<month>11</month>
<year>2019</year>
</date>
</history>
<permissions>
<copyright-statement>© The Author(s). 2019</copyright-statement>
<license license-type="OpenAccess">
<license-p>
<bold>Open Access</bold>
This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (
<ext-link ext-link-type="uri" xlink:href="http://creativecommons.org/licenses/by/4.0/">http://creativecommons.org/licenses/by/4.0/</ext-link>
), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (
<ext-link ext-link-type="uri" xlink:href="http://creativecommons.org/publicdomain/zero/1.0/">http://creativecommons.org/publicdomain/zero/1.0/</ext-link>
) applies to the data made available in this article, unless otherwise stated.</license-p>
</license>
</permissions>
<abstract id="Abs1">
<sec>
<title>Background</title>
<p id="Par1">Influenza A virus (IAV) poses threats to human health and life. Many individual studies have been carried out in mice to uncover the viral factors responsible for the virulence of IAV infections. Nonetheless, a single study may not provide enough confidence about virulence factors; hence, combining several studies in a meta-analysis is desirable to provide a better view. For this, we documented more than 500 records of IAV infections in mice for which the viral proteins could be retrieved and the mouse lethal dose 50 or, alternatively, weight loss and/or survival data were available for virulence classification.</p>
</sec>
<sec>
<title>Results</title>
<p id="Par2">IAV virulence models were learned from various datasets containing aligned IAV proteins and the corresponding two virulence classes (avirulent and virulent) or three virulence classes (low, intermediate and high virulence). Three proven rule-based learning approaches, i.e., OneR, JRip and PART, and additionally random forest were used for modelling. PART models achieved the best performance, with moderate average model accuracies ranging from 65.0 to 84.4% and from 54.0 to 66.6% for the two-class and three-class problems, respectively. PART models were comparable to or even better than random forest models and should be preferred based on the Occam’s razor principle. Interestingly, the average accuracy of the models improved when host information was taken into account. For model interpretation, we observed that although many sites in HA were highly correlated with virulence, PART models based on sites in PB2 could compete against and were often better than PART models based on sites in HA. Moreover, PART showed a strong preference for including sites in PB2 when models were learned from datasets containing the concatenated alignments of all IAV proteins. Several sites with a known contribution to virulence were found among the top protein sites, and site pairs that may synergistically influence virulence were also uncovered.</p>
</sec>
<sec>
<title>Conclusion</title>
<p id="Par3">Modelling IAV virulence is a challenging problem. Rule-based models generated using viral proteins are useful for their interpretability, but achieve only moderate performance. The development of more advanced approaches that learn models from features extracted from both viral and host proteins should be considered in future work.</p>
</sec>
</abstract>
<kwd-group xml:lang="en">
<title>Keywords</title>
<kwd>Influenza A virus</kwd>
<kwd>Mouse models</kwd>
<kwd>Virulence</kwd>
<kwd>Proteins</kwd>
<kwd>Meta-analysis</kwd>
<kwd>Rule-based classification</kwd>
<kwd>Random forest</kwd>
</kwd-group>
<funding-group>
<award-group>
<funding-source>
<institution>Ministry of Education, Singapore</institution>
</funding-source>
<award-id>MOE2014-T2-2-023</award-id>
<principal-award-recipient>
<name>
<surname>Kwoh</surname>
<given-names>Chee Keong</given-names>
</name>
</principal-award-recipient>
</award-group>
</funding-group>
<funding-group>
<award-group>
<funding-source>
<institution>A*STAR-NTU-SUTD AI Partnership Grant</institution>
</funding-source>
<award-id>RGANS1905</award-id>
<principal-award-recipient>
<name>
<surname>Kwoh</surname>
<given-names>Chee Keong</given-names>
</name>
</principal-award-recipient>
</award-group>
</funding-group>
<conference xlink:href="https://incob2019.org/">
<conf-name>International Conference on Bioinformatics (InCoB 2019)</conf-name>
<conf-acronym>InCoB 2019</conf-acronym>
<conf-loc>Jakarta, Indonesia</conf-loc>
<conf-date>10-12 September 2019</conf-date>
</conference>
<custom-meta-group>
<custom-meta>
<meta-name>issue-copyright-statement</meta-name>
<meta-value>© The Author(s) 2019</meta-value>
</custom-meta>
</custom-meta-group>
</article-meta>
</front>
<body>
<sec id="Sec1">
<title>Background</title>
<p id="Par37">Influenza A virus (IAV) is a member of the family
<italic>Orthomyxoviridae</italic>
that circulates in humans, mammals and birds. The genome of the virus consists of 8 single-stranded, negative-sense viral RNA segments encoding at least 12 proteins that make up its proteome [
<xref ref-type="bibr" rid="CR1">1</xref>
]. Segment 1 encodes the basic RNA polymerase 2 (PB2); segment 2 encodes the basic RNA polymerase 1 (PB1) and the non-essential PB1-F2 protein; segment 3 encodes the acidic RNA polymerase (PA) and the non-essential PA-X protein; segment 4 encodes the hemagglutinin (HA) membrane glycoprotein; segment 5 encodes the nucleocapsid protein (NP); segment 6 encodes the neuraminidase (NA) membrane glycoprotein; segment 7 encodes the matrix protein 1 (M1) and matrix protein 2 (M2; also referred to as ion channel protein); and segment 8 encodes the nonstructural protein 1 (NS1) and nonstructural protein 2 (NS2; also referred to as nuclear export protein).</p>
<p id="Par38">The HA and NA determine the subtype of IAV. To date, 18 HA (H1-H18) and 11 NA (N1-N11) subtypes have been identified. The H1N1, H2N2, and H3N2 subtypes have been responsible for five pandemics of severe human respiratory disease in the last 100 years, i.e., the 1918 Spanish Influenza (H1N1), 1957 Asian Influenza (H2N2), 1968 Hong Kong Influenza (H3N2), 1977 Russian Influenza (H1N1), and 2009 Swine-Origin Influenza (H1N1). The H1N1 and H3N2 subtypes also cause recurrent, seasonal epidemics. In the last few years, seasonal human IAVs have been dominated by the 1968 H3N2 and 2009 H1N1 strains. In addition to epidemic and pandemic strains, several other IAV subtypes have also infected humans, including the H5N1, H5N6, H6N1, H7N2, H7N3, H7N7, H7N9, H9N2, and H10N8 avian influenza viruses [
<xref ref-type="bibr" rid="CR2">2</xref>
,
<xref ref-type="bibr" rid="CR3">3</xref>
]. Among them, the H5N1 and H7N9 subtypes have raised major public health concern due to their ability to cause outbreaks with a high fatality rate (about 60% (
<ext-link ext-link-type="uri" xlink:href="http://www.who.int">www.who.int</ext-link>
) and 39% [
<xref ref-type="bibr" rid="CR4">4</xref>
], respectively). Overall, IAV poses a threat to human health and life, and therefore further understanding of the virus is needed for better surveillance and countermeasures against it.</p>
<p id="Par39">Many aspects of IAV and the disease it causes have been investigated in mice, since the animals are not only cost-effective and easy to handle, but also available in various inbred, transgenic, and knockout strains. Moreover, the genomes of various inbred mice have recently become available. Mice have also allowed us to uncover host and viral molecular determinants of IAV virulence. An early outcome of IAV studies in mice was the discovery of the protective role of the interferon-induced gene Mx1 against the virus [
<xref ref-type="bibr" rid="CR5">5</xref>
]. Recently, the gene has been shown to inhibit the assembly of functional viral ribonucleoprotein complex of IAV [
<xref ref-type="bibr" rid="CR6">6</xref>
]. In the last 50 years, the importance of many more host genes in influenza pathogenesis has been discovered through experiments in mice, including RIG-I, IFITM3, TNF and IL-1R genes (reviewed in [
<xref ref-type="bibr" rid="CR7">7</xref>
,
<xref ref-type="bibr" rid="CR8">8</xref>
]). Nonetheless, one limitation of the existing approaches to investigating host molecular determinants involved in IAV virulence is that they have not yet taken into account the contribution of allelic variation to differential host responses.</p>
<p id="Par40">In contrast, the influence of variations in viral genes on IAV virulence has been investigated in a number of ways. These include the generation of mouse-adapted IAVs through serial lung-to-lung passaging, and of recombinant IAVs harboring specific mutations using plasmid-based reverse genetic techniques combined with mutagenesis approaches. The application of these techniques has provided various insights into the viral mutations involved in IAV virulence. For example, the increased virulence of IAV during its adaptation in mice has been associated with mutations in the 190-helix, 220-loop and 130-loop regions, which surround the receptor-binding site in the HA protein (reviewed in [
<xref ref-type="bibr" rid="CR9">9</xref>
]). Mutations in PB2 have also been considered to play a significant role in the increased IAV virulence in mice; these include the mutations E627K and D701N, which are considered general markers for IAV virulence in mice [
<xref ref-type="bibr" rid="CR7">7</xref>
]. Interestingly, a single mutation N66S in the accessory protein PB1-F2 could also contribute to increased virulence [
<xref ref-type="bibr" rid="CR10">10</xref>
]. Mutations in multiple sites of a specific viral protein and mutations in multiple genes have also been shown to have a synergistic effect on IAV virulence in mice. For example, synergistic effect of dual mutations S224P and N383D in PA led to increased polymerase activity and has been considered as a hallmark for natural adaptation of H1N1 and H5N1 viruses to mammals [
<xref ref-type="bibr" rid="CR11">11</xref>
]. Another example is the synergistic action of two mutations, D222G and K163E, in HA and one mutation, F35L, in PA of pandemic 2009 influenza H1N1 virus that cause lethality in the infected mice [
<xref ref-type="bibr" rid="CR12">12</xref>
]. Furthermore, virulence may be encoded not only at the protein level, but also at the nucleotide and post-translational levels. In a very recent study, synonymous codons were shown to give rise to different virulence levels [
<xref ref-type="bibr" rid="CR13">13</xref>
]. On the other hand, the HA N-linked glycosylation is known to affect viral virulence by impacting the host immune response (reviewed in [
<xref ref-type="bibr" rid="CR14">14</xref>
]).</p>
<p id="Par41">Confidence in the contribution of viral protein sites to the virulence of influenza infections could be improved through a meta-analysis approach, which is a systematic amalgamation of results from individual studies. Such an approach, to our knowledge, has only been carried out using a Bayesian graphical model to investigate the viral protein sites important for virulence of influenza H5N1 in mammals [
<xref ref-type="bibr" rid="CR15">15</xref>
]. Nevertheless, a meta-analysis using a Naive Bayes approach at the viral nucleotide level has recently been carried out to demonstrate the contribution of synonymous nucleotide mutations to IAV virulence [
<xref ref-type="bibr" rid="CR13">13</xref>
]. In this paper we present a meta-analysis of viral protein sites that determine the virulence of infections with any subtype of IAV; however, instead of any mammal, we focus on infections in mice. Our meta-analysis approach utilized rule-based machine learning and random forest to predict IAV virulence from datasets we created. The creation of the datasets involved: (
<italic>i</italic>
) documentation of the virulence of infections involving particular IAV and mouse strains, (
<italic>ii</italic>
) classification of virulence levels, and (
<italic>iii</italic>
) collection and alignment of the corresponding IAV protein sequences. For learning IAV virulence models, each column of the alignments was considered as a feature vector and the virulence levels as a target vector. When host information was considered, the amino acids in the columns were tagged with a symbol representing the corresponding mouse strain. The models were developed using either all records in the datasets or records for a specific mouse strain or influenza subtype, and using either the concatenated alignments of all IAV proteins or the individual alignment of the PB2, PB1, PA, HA, NP, NA, M1, NS1, PB1-F2, PA-X, M2, or NS2 protein. Top protein sites and synergy between protein sites were then examined for biological interpretation.</p>
</sec>
<sec id="Sec2">
<title>Results</title>
<sec id="Sec3">
<title>Datasets for modelling IAV virulence</title>
<p id="Par42">The steps in creating benchmark datasets for modelling IAV virulence are summarized in Fig. 
<xref rid="Fig1" ref-type="fig">1</xref>
. Initially, a dataset containing 637 records of IAV infections in mice, for which the full or incomplete genomes of the IAVs could be retrieved from public sequence databases and the virulence class of the infection could be identified, was created according to information available in 84 journal publications (Additional file 
<xref rid="MOESM5" ref-type="media">5</xref>
: Table S1). Of those records, 502 had their MLD50 provided in the literature. Following RULE 6 (see
<xref rid="Sec9" ref-type="sec">Methods</xref>
), multiple records involving a specific IAV strain and mouse strain were reduced into a single record (Additional file 
<xref rid="MOESM6" ref-type="media">6</xref>
: Table S2). This produced a new dataset containing 555 records, named the Mouse-IAV Virulence (MIVir) dataset. Using the same rule, the MIVir dataset was further reduced to a dataset containing 489 records of IAV virulence across different mouse strains, named the IAV Virulence (IVir) dataset (Additional file 
<xref rid="MOESM7" ref-type="media">7</xref>
: Table S3).
<fig id="Fig1">
<label>Fig. 1</label>
<caption>
<p>Creation of benchmark datasets for IAV virulence prediction. The dataset containing initial virulence information can be found in Table S1 (Additional file
<xref rid="MOESM5" ref-type="media">5</xref>
), while the Mouse-IAV Virulence (MIVir) and IAV Virulence (IVir) datasets can be found in Tables S2 and S3 (Additional files
<xref rid="MOESM6" ref-type="media">6</xref>
and
<xref rid="MOESM7" ref-type="media">
<bold>7</bold>
</xref>
), respectively</p>
</caption>
<graphic xlink:href="12864_2019_6295_Fig1_HTML" id="MO1"></graphic>
</fig>
</p>
<p id="Par43">The MIVir and IVir datasets were then inner joined with another dataset containing the amino acids of the 12 IAV proteins at their aligned positions (named the IAV Proteins (IP) dataset), producing the MIVir ×
<sub>I</sub>
IP and IVir ×
<sub>I</sub>
IP datasets, respectively. The keys for joining the datasets were the IAV strains listed in the MIVir or IVir dataset. Once again, note that some virus strains were represented by multiple records in the IP dataset and some proteins were generated from extrapolated genomes. The breakdowns of the two joined datasets are shown in Fig.
<xref rid="Fig1" ref-type="fig">1</xref>
, and a more detailed breakdown of the MIVir ×
<sub>I</sub>
IP is shown in Table 
<xref rid="Tab1" ref-type="table">1</xref>
. As shown in the figure and table, the final datasets were dominated by experiments involving BALB/C and C57BL/6 mice and H1N1, H3N2 and H5N1 viruses. Far fewer records involved 129S1/SvImJ, 129S1/SvPasCrlVr, A/J, C3H, CAST/EiJ, CBA/J, CD-1, DBA/2, FVB/NJ, ICR, NOD/ShiLtJ, NZO/HILtJ, PWK/PhJ, SJL/JOrlCrl, and WSB/EiJ mice or H1N2, H3N8, H5N2, H5N5, H5N6, H5N8, H6N1, H7N1, H7N2, H7N3, H7N7, H7N9 and H9N2 viruses. Subsets of the MIVir ×
<sub>I</sub>
IP dataset used in this study included the dataset containing all records (named the MIV dataset) and datasets containing records of infections in BALB/C and C57BL/6 mice (the BALB/C and C57BL/6 datasets, respectively); while subsets of the IVir ×
<sub>I</sub>
IP dataset used in this study included the dataset containing all records (the IV dataset) and datasets containing infections with H1N1, H3N2 and H5N1 viruses (the H1N1, H3N2 and H5N1 datasets, respectively). For virulence modelling, we further considered subsets of the MIV, IV, BALB/C, C57BL/6, H1N1, H3N2 and H5N1 datasets containing either the concatenated IAV protein alignments or the individual alignment of the PB2, PB1, PA, HA, NP, NA, M1, NS1, PB1-F2, PA-X, M2 or NS2 protein.
<table-wrap id="Tab1">
<label>Table 1</label>
<caption>
<p>Cross-tabulation between mouse strains and IAV subtypes in the MIVir ×
<sub>I</sub>
IP (MIV) dataset. The number at the top of each cell is the number of records of the relevant infections, and its breakdown into high, intermediate and low virulence cases for the three-class classification problems is shown, in that order, in parentheses. The number of virulent cases for the two-class classification problems is the sum of the numbers of high and intermediate virulence cases, while the number of avirulent cases equals the number of low virulence cases</p>
</caption>
<table frame="hsides" rules="groups">
<thead>
<tr>
<th rowspan="2">Mouse strain</th>
<th colspan="5">IAV subtype</th>
</tr>
<tr>
<th>H1N1</th>
<th>H3N2</th>
<th>H5N1</th>
<th>Others</th>
<th>Total</th>
</tr>
</thead>
<tbody>
<tr>
<td>BALB/C</td>
<td>
<p>123</p>
<p>(35/40/48)</p>
</td>
<td>
<p>14</p>
<p>(4/2/8)</p>
</td>
<td>
<p>162</p>
<p>(69/40/53)</p>
</td>
<td>
<p>136</p>
<p>(39/49/48)</p>
</td>
<td>
<p>435</p>
<p>(147/131/157)</p>
</td>
</tr>
<tr>
<td>C57BL/6</td>
<td>
<p>61</p>
<p>(14/34/13)</p>
</td>
<td>
<p>17</p>
<p>(1/2/14)</p>
</td>
<td>
<p>6</p>
<p>(6/0/0)</p>
</td>
<td>
<p>26</p>
<p>(10/5/11)</p>
</td>
<td>
<p>110</p>
<p>(31/41/38)</p>
</td>
</tr>
<tr>
<td>CD-1</td>
<td>
<p>0</p>
<p>(0/0/0)</p>
</td>
<td>
<p>34</p>
<p>(5/16/13)</p>
</td>
<td>
<p>0</p>
<p>(0/0/0)</p>
</td>
<td>
<p>0</p>
<p>(0/0/0)</p>
</td>
<td>
<p>34</p>
<p>(5/16/13)</p>
</td>
</tr>
<tr>
<td>DBA/2</td>
<td>
<p>21</p>
<p>(14/5/2)</p>
</td>
<td>
<p>15</p>
<p>(2/5/8)</p>
</td>
<td>
<p>0</p>
<p>(0/0/0)</p>
</td>
<td>
<p>6</p>
<p>(2/2/2)</p>
</td>
<td>
<p>42</p>
<p>(18/12/12)</p>
</td>
</tr>
<tr>
<td>Others</td>
<td>
<p>19</p>
<p>(9/3/7)</p>
</td>
<td>
<p>7</p>
<p>(5/0/2)</p>
</td>
<td>
<p>1</p>
<p>(0/0/1)</p>
</td>
<td>
<p>1</p>
<p>(0/1/0)</p>
</td>
<td>
<p>28</p>
<p>(14/4/10)</p>
</td>
</tr>
<tr>
<td>Total</td>
<td>
<p>224</p>
<p>(72/82/70)</p>
</td>
<td>
<p>87</p>
<p>(17/25/45)</p>
</td>
<td>
<p>169</p>
<p>(75/40/54)</p>
</td>
<td>
<p>169</p>
<p>(51/57/61)</p>
</td>
<td>
<p>649</p>
<p>(215/204/230)</p>
</td>
</tr>
</tbody>
</table>
</table-wrap>
</p>
</sec>
<sec id="Sec4">
<title>Visualization of IV dataset</title>
<p id="Par44">For an initial view of the IAV sequences being used for virulence prediction, the 3D multidimensional scaling plot that visualizes the level of similarity between the concatenated alignments of all IAV proteins in the IV dataset is presented in Fig. 
<xref rid="Fig2" ref-type="fig">2</xref>
. While the clusters of dominant IAV subtypes can be easily observed in the plot, separation between virulence classes is lacking, which illustrates the challenge of the prediction.
<fig id="Fig2">
<label>Fig. 2</label>
<caption>
<p>Three-dimensional multidimensional scaling plot of the concatenated alignments of all IAV proteins. Each data point, which represents a record of concatenated aligned proteins of a particular IAV strain, is colored based on the subtype and three-class virulence label</p>
</caption>
<graphic xlink:href="12864_2019_6295_Fig2_HTML" id="MO2"></graphic>
</fig>
</p>
<p id="Par45">In addition, the correlation between each site and the target virulence class in the IV dataset was also measured using the Benjamini-Hochberg (BH) adjusted
<italic>p</italic>
-value of the chi-square test of independence. The line plots showing the –log (BH adjusted p-value) over the alignment sites of each IAV protein for the two-class and three-class datasets are given in Fig. 
<xref rid="Fig3" ref-type="fig">3</xref>
. Overall, HA had the most sites significantly correlated with the target virulence (BH adjusted p-value &lt; 0.05), i.e., 72 and 283 sites for the two-class and three-class datasets, respectively. On the other hand, M2 had the fewest significant sites, i.e., 1 and 4 for the two-class and three-class datasets, respectively. The numbers of significant sites for the other proteins, for the two-class and three-class datasets, respectively, are as follows: 26 and 44 for PB2, 6 and 30 for PB1, 14 and 33 for PA, 19 and 40 for NP, 19 and 167 for NA, 4 and 10 for M1, 18 and 32 for NS1, 3 and 30 for PB1-F2, 6 and 26 for PA-X, and 3 and 5 for NS2. Interestingly, while PB2, PA, NP, M1, NS1 and NS2 had about twice as many significant sites for the three-class dataset as for the two-class dataset, PB1, HA, NA, PB1-F2 and PA-X showed a much higher fold increase in the number of significant sites.
<fig id="Fig3">
<label>Fig. 3</label>
<caption>
<p>Line plots showing the correlations between sites in the IAV protein alignments and IAV virulence class in the two-class (on the left; subplots A-L) and three-class (on the right; subplots M-X) IV datasets. The correlations are measured using the negative log of the Benjamini-Hochberg (BH) adjusted
<italic>p</italic>
-values of the chi-square tests for independence between sites and IAV virulence. The red dashed horizontal line in each plot indicates the critical adjusted p-value based on the significance level of 0.05</p>
</caption>
<graphic xlink:href="12864_2019_6295_Fig3_HTML" id="MO3"></graphic>
</fig>
</p>
</sec>
<sec id="Sec5">
<title>Performance of rule-based models for IAV virulence</title>
<p id="Par46">Here we focus on the application of OneR, JRip and PART algorithms for developing rule-based models for IAV virulence from various datasets we created. Examples of the virulence models generated using the machine learning algorithms for the two-class and three-class MIV, IV, BALB/C, C57BL/6, H1N1, H3N2 and H5N1 datasets containing the concatenated protein alignments are provided in Tables S9-S15 (Additional files
<xref rid="MOESM13" ref-type="media">13</xref>
,
<xref rid="MOESM14" ref-type="media">14</xref>
,
<xref rid="MOESM15" ref-type="media">15</xref>
,
<xref rid="MOESM16" ref-type="media">16</xref>
,
<xref rid="MOESM17" ref-type="media">17</xref>
,
<xref rid="MOESM18" ref-type="media">18</xref>
and
<xref rid="MOESM19" ref-type="media">19</xref>
), respectively. For each of the two-class and three-class datasets, containing either the concatenated protein alignments or an individual protein alignment, 100 virulence models were generated for performance evaluation in this section and model characterization in the next section. Specifically, a three-way ANOVA (with interactions) model was built for each of the two-class and three-class dataset collections to evaluate the difference in accuracy between models. It revealed that the accuracy of the virulence models in both collections was influenced by the dataset, protein alignment and machine learning algorithm, as well as by the interactions among them. Following this, Tukey’s HSD post hoc tests for multiple comparisons between pairs of models were carried out, and some results are discussed here.</p>
<p id="Par47">Table 
<xref rid="Tab2" ref-type="table">2</xref>
highlights the performance of OneR, JRip and PART on the two-class and three-class datasets containing the concatenated IAV protein alignments. Overall, in terms of their average accuracy, precision and recall, PART models always outperformed OneR and JRip models, while JRip models were almost always better than OneR models (the only case in which OneR consistently outperformed JRip was the three-class H3N2 classification). However, statistically significant differences were mainly observed between PART and OneR/JRip models, and less frequently between OneR and JRip models (please see (Additional file 
<xref rid="MOESM3" ref-type="media">3</xref>
: Figure S3) for MIV and IV and (Additional file 
<xref rid="MOESM4" ref-type="media">4</xref>
: Figure S4) for BALB/C, C57BL/6, H1N1, H3N2 and H5N1). Nonetheless, PART models had many more rules than JRip and OneR models. For example, PART had on average 10.67 and 46.97 rules per model for the two-class and three-class IV datasets, respectively, while JRip had on average 3.89 and 4.55 rules, respectively, and OneR always had 1 rule.
<table-wrap id="Tab2">
<label>Table 2</label>
<caption>
<p>Average accuracy, precision and recall (standard deviations in parentheses) of the 100 OneR (1R), JRip (JR) and PART (PT) models learned independently from the two-class and three-class MIV, IV, BALB/C, C57BL/6, H1N1, H3N2 and H5N1 datasets containing the concatenated alignments of all IAV proteins</p>
</caption>
<table frame="hsides" rules="groups">
<thead>
<tr>
<th rowspan="2"></th>
<th colspan="3">Accuracy (%)</th>
<th colspan="3">Precision (%)</th>
<th colspan="3">Recall (%)</th>
</tr>
<tr>
<th>1R</th>
<th>JR</th>
<th>PT</th>
<th>1R</th>
<th>JR</th>
<th>PT</th>
<th>1R</th>
<th>JR</th>
<th>PT</th>
</tr>
</thead>
<tbody>
<tr>
<td colspan="10">Two-class datasets</td>
</tr>
<tr>
<td> MIV</td>
<td>
<p>58.6</p>
<p>(3.6)</p>
</td>
<td>
<p>58.8</p>
<p>(5.9)</p>
</td>
<td>
<p>
<bold>71.8</bold>
</p>
<p>(3.8)</p>
</td>
<td>
<p>59.1</p>
<p>(3.8)</p>
</td>
<td>
<p>59.9</p>
<p>(6.8)</p>
</td>
<td>
<p>
<bold>72.2</bold>
</p>
<p>(3.8)</p>
</td>
<td>
<p>58.6</p>
<p>(3.6)</p>
</td>
<td>
<p>58.8</p>
<p>(5.9)</p>
</td>
<td>
<p>
<bold>71.8</bold>
</p>
<p>(3.8)</p>
</td>
</tr>
<tr>
<td> IV</td>
<td>
<p>55.2</p>
<p>(4.0)</p>
</td>
<td>
<p>60.4</p>
<p>(6.1)</p>
</td>
<td>
<p>
<bold>72.4</bold>
</p>
<p>(4.0)</p>
</td>
<td>
<p>55.8</p>
<p>(4.4)</p>
</td>
<td>
<p>61.2</p>
<p>(6.5)</p>
</td>
<td>
<p>
<bold>72.8</bold>
</p>
<p>(4.1)</p>
</td>
<td>
<p>55.2</p>
<p>(4.0)</p>
</td>
<td>
<p>60.4</p>
<p>(6.1)</p>
</td>
<td>
<p>
<bold>72.4</bold>
</p>
<p>(4.0)</p>
</td>
</tr>
<tr>
<td> BALB/C</td>
<td>
<p>54.6</p>
<p>(3.8)</p>
</td>
<td>
<p>57.5</p>
<p>(5.5)</p>
</td>
<td>
<p>
<bold>70.6</bold>
</p>
<p>(4.8)</p>
</td>
<td>
<p>55.1</p>
<p>(4.3)</p>
</td>
<td>
<p>58.3</p>
<p>(6.4)</p>
</td>
<td>
<p>
<bold>71.0</bold>
</p>
<p>(4.9)</p>
</td>
<td>
<p>54.6</p>
<p>(3.8)</p>
</td>
<td>
<p>57.5</p>
<p>(5.5)</p>
</td>
<td>
<p>
<bold>70.6</bold>
</p>
<p>(4.8)</p>
</td>
</tr>
<tr>
<td> C57BL/6</td>
<td>
<p>70.7</p>
<p>(7.9)</p>
</td>
<td>
<p>73.4</p>
<p>(7.4)</p>
</td>
<td>
<p>
<bold>74.3</bold>
</p>
<p>(7.1)</p>
</td>
<td>
<p>72.6</p>
<p>(8.6)</p>
</td>
<td>
<p>75.0</p>
<p>(7.5)</p>
</td>
<td>
<p>
<bold>75.4</bold>
</p>
<p>(7.1)</p>
</td>
<td>
<p>70.7</p>
<p>(7.9)</p>
</td>
<td>
<p>73.4</p>
<p>(7.4)</p>
</td>
<td>
<p>
<bold>74.3</bold>
</p>
<p>(7.1)</p>
</td>
</tr>
<tr>
<td> H1N1</td>
<td>
<p>58.7</p>
<p>(6.0)</p>
</td>
<td>
<p>59.2</p>
<p>(6.3)</p>
</td>
<td>
<p>
<bold>65.0</bold>
</p>
<p>(7.5)</p>
</td>
<td>
<p>61.8</p>
<p>(8.0)</p>
</td>
<td>
<p>61.9</p>
<p>(8.1)</p>
</td>
<td>
<p>
<bold>65.8</bold>
</p>
<p>(7.6)</p>
</td>
<td>
<p>58.7</p>
<p>(6.0)</p>
</td>
<td>
<p>59.2</p>
<p>(6.3)</p>
</td>
<td>
<p>
<bold>65.0</bold>
</p>
<p>(7.5)</p>
</td>
</tr>
<tr>
<td> H3N2</td>
<td>
<p>72.1</p>
<p>(9.2)</p>
</td>
<td>
<p>80.7</p>
<p>(11.5)</p>
</td>
<td>
<p>
<bold>84.4</bold>
</p>
<p>(8.4)</p>
</td>
<td>
<p>79.4</p>
<p>(8.8)</p>
</td>
<td>
<p>84.1</p>
<p>(9.7)</p>
</td>
<td>
<p>
<bold>86.5</bold>
</p>
<p>(7.4)</p>
</td>
<td>
<p>72.1</p>
<p>(9.2)</p>
</td>
<td>
<p>80.7</p>
<p>(11.5)</p>
</td>
<td>
<p>
<bold>84.4</bold>
</p>
<p>(8.4)</p>
</td>
</tr>
<tr>
<td> H5N1</td>
<td>
<p>57.3</p>
<p>(6.4)</p>
</td>
<td>
<p>64.9</p>
<p>(8.1)</p>
</td>
<td>
<p>
<bold>72.4</bold>
</p>
<p>(6.9)</p>
</td>
<td>
<p>62.1</p>
<p>(10.6)</p>
</td>
<td>
<p>67.2</p>
<p>(8.8)</p>
</td>
<td>
<p>
<bold>73.3</bold>
</p>
<p>(7.3)</p>
</td>
<td>
<p>57.3</p>
<p>(6.4)</p>
</td>
<td>
<p>64.9</p>
<p>(8.1)</p>
</td>
<td>
<p>
<bold>72.4</bold>
</p>
<p>(6.9)</p>
</td>
</tr>
<tr>
<td colspan="10">Three-class datasets</td>
</tr>
<tr>
<td> MIV</td>
<td>
<p>45.7</p>
<p>(2.6)</p>
</td>
<td>
<p>44.5</p>
<p>(3.4)</p>
</td>
<td>
<p>
<bold>60.2</bold>
</p>
<p>(3.0)</p>
</td>
<td>
<p>46.6</p>
<p>(3.1)</p>
</td>
<td>
<p>52.8</p>
<p>(5.3)</p>
</td>
<td>
<p>
<bold>60.3</bold>
</p>
<p>(2.9)</p>
</td>
<td>
<p>45.7</p>
<p>(2.6)</p>
</td>
<td>
<p>44.5</p>
<p>(3.4)</p>
</td>
<td>
<p>
<bold>60.2</bold>
</p>
<p>(3.0)</p>
</td>
</tr>
<tr>
<td> IV</td>
<td>
<p>42.1</p>
<p>(3.2)</p>
</td>
<td>
<p>42.5</p>
<p>(3.3)</p>
</td>
<td>
<p>
<bold>56.3</bold>
</p>
<p>(3.5)</p>
</td>
<td>
<p>43.4</p>
<p>(4.4)</p>
</td>
<td>
<p>47.9</p>
<p>(6.5)</p>
</td>
<td>
<p>
<bold>56.6</bold>
</p>
<p>(3.5)</p>
</td>
<td>
<p>42.1</p>
<p>(3.2)</p>
</td>
<td>
<p>42.5</p>
<p>(3.3)</p>
</td>
<td>
<p>
<bold>56.3</bold>
</p>
<p>(3.5)</p>
</td>
</tr>
<tr>
<td> BALB/C</td>
<td>
<p>39.8</p>
<p>(3.5)</p>
</td>
<td>
<p>42.1</p>
<p>(4.2)</p>
</td>
<td>
<p>
<bold>55.4</bold>
</p>
<p>(3.5)</p>
</td>
<td>
<p>40.7</p>
<p>(4.8)</p>
</td>
<td>
<p>49.1</p>
<p>(6.9)</p>
</td>
<td>
<p>
<bold>55.5</bold>
</p>
<p>(3.5)</p>
</td>
<td>
<p>39.8</p>
<p>(3.5)</p>
</td>
<td>
<p>42.1</p>
<p>(4.2)</p>
</td>
<td>
<p>
<bold>55.4</bold>
</p>
<p>(3.5)</p>
</td>
</tr>
<tr>
<td> C57BL/6</td>
<td>
<p>60.4</p>
<p>(5.8)</p>
</td>
<td>
<p>61.9</p>
<p>(7.2)</p>
</td>
<td>
<p>
<bold>66.6</bold>
</p>
<p>(7.5)</p>
</td>
<td>
<p>65.6</p>
<p>(7.6)</p>
</td>
<td>
<p>66.3</p>
<p>(7.1)</p>
</td>
<td>
<p>
<bold>68.6</bold>
</p>
<p>(7.8)</p>
</td>
<td>
<p>60.4</p>
<p>(5.8)</p>
</td>
<td>
<p>61.9</p>
<p>(7.2)</p>
</td>
<td>
<p>
<bold>66.6</bold>
</p>
<p>(7.5)</p>
</td>
</tr>
<tr>
<td> H1N1</td>
<td>
<p>43.3</p>
<p>(5.0)</p>
</td>
<td>
<p>44.0</p>
<p>(7.1)</p>
</td>
<td>
<p>
<bold>54.6</bold>
</p>
<p>(6.6)</p>
</td>
<td>
<p>48.4</p>
<p>(8.2)</p>
</td>
<td>
<p>50.3</p>
<p>(9.7)</p>
</td>
<td>
<p>
<bold>55.5</bold>
</p>
<p>(7.0)</p>
</td>
<td>
<p>43.3</p>
<p>(5.0)</p>
</td>
<td>
<p>44.0</p>
<p>(7.1)</p>
</td>
<td>
<p>
<bold>54.6</bold>
</p>
<p>(6.6)</p>
</td>
</tr>
<tr>
<td> H3N2</td>
<td>
<p>47.9</p>
<p>(8.9)</p>
</td>
<td>
<p>43.0</p>
<p>(9.5)</p>
</td>
<td>
<p>
<bold>60.9</bold>
</p>
<p>(11.7)</p>
</td>
<td>
<p>61.4</p>
<p>(17.1)</p>
</td>
<td>
<p>59.3</p>
<p>(14.6)</p>
</td>
<td>
<p>
<bold>64.4</bold>
</p>
<p>(13.6)</p>
</td>
<td>
<p>47.9</p>
<p>(8.9)</p>
</td>
<td>
<p>43.0</p>
<p>(9.5)</p>
</td>
<td>
<p>
<bold>60.9</bold>
</p>
<p>(11.7)</p>
</td>
</tr>
<tr>
<td> H5N1</td>
<td>
<p>38.0</p>
<p>(5.8)</p>
</td>
<td>
<p>42.1</p>
<p>(6.9)</p>
</td>
<td>
<p>
<bold>54.0</bold>
</p>
<p>(7.5)</p>
</td>
<td>
<p>39.7</p>
<p>(8.6)</p>
</td>
<td>
<p>47.6</p>
<p>(10.6)</p>
</td>
<td>
<p>
<bold>55.1</bold>
</p>
<p>(7.8)</p>
</td>
<td>
<p>38.0</p>
<p>(5.8)</p>
</td>
<td>
<p>42.1</p>
<p>(6.9)</p>
</td>
<td>
<p>
<bold>54.0</bold>
</p>
<p>(7.5)</p>
</td>
</tr>
</tbody>
</table>
</table-wrap>
</p>
<p id="Par48">Table
<xref rid="Tab2" ref-type="table">2</xref>
also shows that incorporating host information improved the accuracy of the three-class virulence classification but not for the two-class virulence classification – the average accuracies of PART models on the three-class MIV and IV datasets were 60.2 and 56.3% (Tukey’s HSD adjusted
<italic>p</italic>
-value for the difference was < 0.05), respectively, but they were about the same for the two-class virulence classification, i.e., 71.8% for the MIV dataset and 72.4% for the IV dataset (Tukey’s HSD adjusted p-value for the difference was close to 1). Furthermore, when considering the host strains, the rule-based models were more accurate for the C57BL/6 datasets than for the BALB/C datasets (statistically significant (Tukey’s HSD adjusted p-value < 0.05) for the three-class problem but not for the two-class problem); and when considering the IAV subtypes, the rule-based models were more accurate for the H3N2 datasets than for the H1N1 and H5N1 datasets (statistically significant in all cases). However, it should be noted that the standard deviations for the C57BL/6 and H3N2 datasets were higher than the rest, and that aggregating all mouse and/or virus strains gave the smallest standard deviation while keeping accuracy competitive.</p>
<p id="Par49">The distributions of the accuracies of the 100 OneR/JRip/PART models learned from the two-class and three-class MIV and IV datasets containing either the concatenated protein alignments or an individual protein alignment are shown in Fig. 
<xref rid="Fig4" ref-type="fig">4</xref>
and those learned from the BALB/C, C57BL/6, H1N1, H3N2 and H5N1 datasets are shown in (Additional file
<xref rid="MOESM1" ref-type="media">1</xref>
: Figure S1). The results of the Tukey’s HSD post hoc test for multiple comparisons between pairs of models that appear in each plot in Fig.
<xref rid="Fig4" ref-type="fig">4</xref>
and Additional file
<xref rid="MOESM1" ref-type="media">1</xref>
: Figure S1 are given in Figures S3 and S4 (Additional files 
<xref rid="MOESM3" ref-type="media">3</xref>
and
<xref rid="MOESM4" ref-type="media">4</xref>
), respectively. Once again, PART usually outperformed OneR and JRip, but it was not unusual for OneR to outperform JRip. Of interest, PART models built on the datasets containing the concatenated protein alignments almost always achieved the highest average accuracy, except for the three-class H3N2 dataset. This average accuracy was usually significantly higher than that of the competing models. In many cases, a PART model based on the PB2 or HA alignment could compete with the PART model based on the concatenated protein alignments (no significant difference between their average accuracies; see Figures S3 and S4 (Additional files
<xref rid="MOESM3" ref-type="media">3</xref>
and
<xref rid="MOESM4" ref-type="media">4</xref>
)).
<fig id="Fig4">
<label>Fig. 4</label>
<caption>
<p>Accuracy distribution of 100 models learned independently from the two-class and three-class MIV (A and B, respectively) and IV (C and D, respectively) datasets using OneR (1R), JRip (JR) and PART (PT). The datasets contain either the concatenated alignments of all IAV proteins or individual alignment of PB2, PB1, PA, HA, NP, NA, M1, NS1, PB1-F2, PA-X, M2 or NS2 proteins. The red dashed horizontal line indicates the accuracy of zero rule learner, while the blue horizontal lines indicate significant difference (Tukey’s HSD adjusted p-value < 0.05) between two virulence models generated from the same protein alignment. Information about significant differences between all possible pairs of virulence models in each plot can be found in Figure S3 (Additional file
<xref rid="MOESM3" ref-type="media">3</xref>
)</p>
</caption>
<graphic xlink:href="12864_2019_6295_Fig4_HTML" id="MO4"></graphic>
</fig>
</p>
<p id="Par50">Finally, we noted that RF models did not outperform PART models. In about 50% of the cases, PART even gave significantly better accuracies than RF (see (Additional file 
<xref rid="MOESM2" ref-type="media">2</xref>
: Figure S2)). Nonetheless, the site importance ranking output by RF could provide valuable insights and hence, RF models were further explored.</p>
</sec>
<sec id="Sec6">
<title>Top sites and synergy between sites for IAV virulence</title>
<p id="Par51">As the performance of the models generated by a specific learning algorithm varied from one independent learning to another, the models themselves tended to vary a lot. This demonstrated the influence of selected training data. Hence, rather than inspecting the model one by one, it is more interesting to investigate individual sites that were frequently included in learned models or considered to have more impacts in the models. For this, the OneR’s single site model and RF’s site importance ranking naturally suit the purpose. For JRip and PART, we calculated the average contribution of each site to the accuracy of learned models. Table 
<xref rid="Tab3" ref-type="table">3</xref>
summarizes the sites selected by OneR (ordered by their frequency; sites that were selected once are not shown), top 20 sites by JRip and PART (ordered by their average contribution to the accuracy of learned models), and top 20 influential sites by RF (ordered by the average mean decrease in accuracy) following 100 independent learnings from the two-class and three-class IV datasets containing the concatenated protein alignments.
<table-wrap id="Tab3">
<label>Table 3</label>
<caption>
<p>Top sites for modelling IAV virulence based on the 100 models generated from the (A) two-class and (B) three-class IV datasets containing the concatenated alignments of all IAV proteins. For OneR (1R), the numbers in parentheses are the frequency of the corresponding site being selected in the models; for JRip (JR) and PART (PT), they are the average contribution of the corresponding site to accuracy (in percent); and for random forest (RF), they are the average mean decrease in accuracy attributed to the corresponding site. Each number was calculated following 100 independent learnings from the two-class or three-class IV dataset. For 1R, only sites with frequency > 1 are shown, while for JR, PT and RF, only the top 20 sites are shown</p>
</caption>
<table frame="hsides" rules="groups">
<tbody>
<tr>
<td colspan="6">(A) Two-class IV dataset</td>
</tr>
<tr>
<td rowspan="3"> 1R</td>
<td>HA-142 (28)</td>
<td>HA-188 (12)</td>
<td>HA-160 (7)</td>
<td>NA-46 (6)</td>
<td>HA-189 (4)</td>
</tr>
<tr>
<td>PA-X-213 (4)</td>
<td>HA-219 (3)</td>
<td>HA-285 (3)</td>
<td>HA-397 (3)</td>
<td>NA-79 (3)</td>
</tr>
<tr>
<td>NS1–171 (3)</td>
<td>NS1–95 (3)</td>
<td>HA-196 (2)</td>
<td>NA-86 (2)</td>
<td>NS1–226 (2)</td>
</tr>
<tr>
<td rowspan="4"> JR</td>
<td>PB2–627 (4.07)</td>
<td>PB2–701 (3.03)</td>
<td>PA-97 (1.40)</td>
<td>HA-297 (1.26)</td>
<td>HA-452 (0.96)</td>
</tr>
<tr>
<td>HA-218 (0.91)</td>
<td>NA-46 (0.89)</td>
<td>M1–227 (0.89)</td>
<td>NA-17 (0.71)</td>
<td>NA-164a (0.58)</td>
</tr>
<tr>
<td>NS1–95 (0.55)</td>
<td>NS1–226 (0.53)</td>
<td>M1–15 (0.52)</td>
<td>NS1–171 (0.51)</td>
<td>PB2–508 (0.48)</td>
</tr>
<tr>
<td>NA-151 (0.43)</td>
<td>PA-X-207 (0.43)</td>
<td>NA-29 (0.42)</td>
<td>NA-371 (0.40)</td>
<td>HA-278 (0.39)</td>
</tr>
<tr>
<td rowspan="4"> PT</td>
<td>NS1–42 (20.29)</td>
<td>PA-97 (20.20)</td>
<td>PB2–714 (18.28)</td>
<td>PB2–110 (16.72)</td>
<td>PB2–153 (13.26)</td>
</tr>
<tr>
<td>PB2–701 (11.53)</td>
<td>NA-276 (10.35)</td>
<td>NP-101 (10.19)</td>
<td>PA-556 (9.94)</td>
<td>PB2–318 (9.26)</td>
</tr>
<tr>
<td>NP-492 (9.16)</td>
<td>NP-133 (8.92)</td>
<td>PB2–80 (8.71)</td>
<td>M1–215 (8.20)</td>
<td>NS1–123 (7.58)</td>
</tr>
<tr>
<td>HA-485 (7.56)</td>
<td>PA-341 (6.67)</td>
<td>PB2–635 (6.23)</td>
<td>PB2–158 (6.08)</td>
<td>PB2–627 (5.83)</td>
</tr>
<tr>
<td rowspan="4"> RF</td>
<td>PA-97 (6.75)</td>
<td>PB2–701 (6.54)</td>
<td>PA-X-97 (6.25)</td>
<td>NS1–42 (5.87)</td>
<td>HA-218 (5.53)</td>
</tr>
<tr>
<td>PB2–355 (5.11)</td>
<td>NP-34 (4.83)</td>
<td>PB2–627 (4.76)</td>
<td>PB2–714 (4.55)</td>
<td>HA-186 (4.12)</td>
</tr>
<tr>
<td>HA-227 (3.88)</td>
<td>NP-101 (3.78)</td>
<td>PB2–699 (3.68)</td>
<td>HA-485 (3.66)</td>
<td>PB2–318 (3.62)</td>
</tr>
<tr>
<td>HA-142 (3.52)</td>
<td>M1–30 (3.49)</td>
<td>PB2–675 (3.46)</td>
<td>PB2–153 (3.43)</td>
<td>NA-46 (3.35)</td>
</tr>
<tr>
<td colspan="6">(B) Three-class IV dataset</td>
</tr>
<tr>
<td rowspan="2"> 1R</td>
<td>HA-188 (34)</td>
<td>NA-370 (16)</td>
<td>NA-16 (10)</td>
<td>HA-142 (9)</td>
<td>HA-53 (6)</td>
</tr>
<tr>
<td>HA-94 (4)</td>
<td>NA-164a (4)</td>
<td>HA-8 (3)</td>
<td>HA-173 (2)</td>
<td>HA-285 (2)</td>
</tr>
<tr>
<td rowspan="4"> JR</td>
<td>PB2–627 (4.98)</td>
<td>PB2–701 (1.73)</td>
<td>NA-151 (1.45)</td>
<td>NA-164a (1.37)</td>
<td>HA-218 (1.20)</td>
</tr>
<tr>
<td>HA-297 (1.02)</td>
<td>HA-225 (0.94)</td>
<td>HA-452 (0.93)</td>
<td>PB1-F2–28 (0.88)</td>
<td>HA-327b (0.85)</td>
</tr>
<tr>
<td>M2–28 (0.84)</td>
<td>HA-266 (0.74)</td>
<td>NS1–42 (0.71)</td>
<td>PA-97 (0.68)</td>
<td>NA-61 (0.68)</td>
</tr>
<tr>
<td>PA-X-213 (0.59)</td>
<td>HA-482 (0.58)</td>
<td>M2–93 (0.54)</td>
<td>HA-160 (0.52)</td>
<td>PB1-F2–49 (0.51)</td>
</tr>
<tr>
<td rowspan="4"> PT</td>
<td>PB2–158 (12.81)</td>
<td>PB2–110 (11.97)</td>
<td>NS1–42 (10.79)</td>
<td>PB2–153 (10.56)</td>
<td>NA-276 (10.31)</td>
</tr>
<tr>
<td>PB2–80 (9.21)</td>
<td>NS2–67 (8.46)</td>
<td>PB2–265 (8.23)</td>
<td>PB2–66 (7.92)</td>
<td>PB2–627 (7.62)</td>
</tr>
<tr>
<td>NA-441 (7.28)</td>
<td>NS1–28 (6.97)</td>
<td>M2–24 (6.87)</td>
<td>PB2–497 (6.54)</td>
<td>HA-294 (6.51)</td>
</tr>
<tr>
<td>PB1–578 (6.20)</td>
<td>PA-97 (6.19)</td>
<td>NP-101 (6.18)</td>
<td>PB2–76 (6.07)</td>
<td>M1–215 (6.06)</td>
</tr>
<tr>
<td rowspan="4"> RF</td>
<td>PB2–627 (6.69)</td>
<td>NS1–42 (6.49)</td>
<td>HA-225 (6.41)</td>
<td>PB2–701 (6.34)</td>
<td>PA-97 (5.90)</td>
</tr>
<tr>
<td>HA-218 (5.42)</td>
<td>PB2–355 (5.41)</td>
<td>PA-X-97 (5.26)</td>
<td>M1–215 (4.84)</td>
<td>PB2–699 (4.52)</td>
</tr>
<tr>
<td>NP-133 (4.51)</td>
<td>NP-101 (4.48)</td>
<td>PB2–153 (4.41)</td>
<td>M1–30 (4.35)</td>
<td>NP-34 (4.31)</td>
</tr>
<tr>
<td>HA-227 (4.22)</td>
<td>HA-156 (4.17)</td>
<td>PB2–714 (4.12)</td>
<td>HA-188 (4.12)</td>
<td>NA-49 (4.10)</td>
</tr>
</tbody>
</table>
</table-wrap>
</p>
<p id="Par52">Overall, for the top sites in Table
<xref rid="Tab3" ref-type="table">3</xref>
, OneR and JRip preferred sites in HA and NA, PART had a strong preference for sites in PB2, and RF indicated that more sites in PB2 and HA were important. In terms of their consistency in selecting sites for the two-class and three-class virulence models, RF was the most consistent (15 shared sites), followed by PART (10 shared sites), JRip (8 shared sites) and finally OneR (only 4 sites). Furthermore, no site was shared by all four learners for either the two-class or three-class virulence models, but a few sites were shared by three learners: PB2–627, PB2–701, PA-97 and NA-46 for the two-class models, and PB2–627, PA-97 and NS1–42 for the three-class models.</p>
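<p>The shared-site counts above can be reproduced from Table 3 with a simple set intersection. The sketch below is illustrative only: it transcribes the two JRip top-20 lists from Table 3 (with ASCII hyphens in the site labels), and the function name is an assumption, not part of the original analysis.</p>

```python
# Top-20 JRip sites transcribed from Table 3 (site labels use ASCII hyphens).
JRIP_TWO_CLASS = [
    "PB2-627", "PB2-701", "PA-97", "HA-297", "HA-452",
    "HA-218", "NA-46", "M1-227", "NA-17", "NA-164a",
    "NS1-95", "NS1-226", "M1-15", "NS1-171", "PB2-508",
    "NA-151", "PA-X-207", "NA-29", "NA-371", "HA-278",
]
JRIP_THREE_CLASS = [
    "PB2-627", "PB2-701", "NA-151", "NA-164a", "HA-218",
    "HA-297", "HA-225", "HA-452", "PB1-F2-28", "HA-327b",
    "M2-28", "HA-266", "NS1-42", "PA-97", "NA-61",
    "PA-X-213", "HA-482", "M2-93", "HA-160", "PB1-F2-49",
]


def shared_sites(first, second):
    """Return the sites present in both lists, sorted for stable output."""
    return sorted(set(first).intersection(second))


shared = shared_sites(JRIP_TWO_CLASS, JRIP_THREE_CLASS)
print(len(shared), shared)
```

<p>For the JRip lists this gives the 8 shared sites reported in the text; the same function can be applied to the PART and RF rows of Table 3.</p>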
<p id="Par53">In addition to analyzing individual sites, it is also interesting to investigate the synergy between sites that determine IAV virulence. The rule-based models given by JRip and PART serve this purpose, but here we limit the analysis to PART models, which usually gave the highest average accuracy. To this end, in a similar way to the identification of top individual sites, we extracted the average contribution to the overall accuracy of each pair of sites appearing together in a rule of the PART models. The synergistic networks arising from the top 50 site pairs in PART models learned from the two-class and three-class IV datasets containing concatenated protein alignments are shown in Fig. 
<xref rid="Fig5" ref-type="fig">5A and B</xref>
, respectively. As shown, the sites in both cases were, interestingly, fully connected and mainly involved sites in PB2. The top four sites by degree (number of connections) for the two-class virulence models were PB2–714 (degree = 14), PA-97 (13), NS1–42 (10) and PB2–701 (7), and, interestingly, the pairing between the top two sites, PB2–714 and PA-97, had the highest contribution to accuracy. On the other hand, the sites with high degree for the three-class virulence models included PB2–110 (15), PB2–158 (13), NS1–42 (10) and PB2–153 (9), and the pairing between PB2–153 and NS1–42 had the highest contribution to accuracy.
<fig id="Fig5">
<label>Fig. 5</label>
<caption>
<p>Synergistic graphs between IAV protein sites in determining virulence based on 100 PART models learned from the (A) two-class and (B) three-class IV datasets containing the concatenated alignments of all IAV proteins. Each node in the graphs represents an IAV protein site – the type of the protein is encoded by color and the site number is written above the node. Two sites are connected by an edge if they appear in the top 50 site pairs contributing to the accuracy of the corresponding PART models. The thickness of an edge indicates the level of contribution of the corresponding site pair to the accuracy</p>
</caption>
<graphic xlink:href="12864_2019_6295_Fig5_HTML" id="MO5"></graphic>
</fig>
</p>
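<p>The degree reported for each site is the number of distinct partners it has among the top site pairs. A minimal sketch of that count follows; the pairs are hypothetical and chosen for illustration only (they are not the pairs behind Fig. 5), and the function name is an assumption.</p>

```python
from collections import Counter


def node_degrees(site_pairs):
    """Count the distinct partners of each site among undirected site pairs."""
    # A frozenset makes (a, b) and (b, a) the same edge; self-pairs are skipped.
    edges = {frozenset(pair) for pair in site_pairs if pair[0] != pair[1]}
    degrees = Counter()
    for edge in edges:
        for site in edge:
            degrees[site] += 1
    return degrees


# Hypothetical pairs, for illustration only.
example_pairs = [
    ("PB2-714", "PA-97"),
    ("PB2-714", "NS1-42"),
    ("PA-97", "NS1-42"),
    ("PB2-714", "PB2-701"),
    ("PA-97", "PB2-714"),  # same edge as the first pair, in reverse order
]
degrees = node_degrees(example_pairs)
print(degrees.most_common())
```

<p>In this toy input, PB2-714 has three partners, PA-97 and NS1-42 have two each, and PB2-701 has one; the reversed duplicate does not inflate the count.</p>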
</sec>
</sec>
<sec id="Sec7">
<title>Discussion</title>
<p id="Par54">In this influenza study, we systematically and extensively searched the literature, collected infection records involving specific mouse and IAV strains, noted their virulence, classified the virulence level, and obtained the related IAV proteins in order to develop predictive virulence models of IAV infections. Furthermore, we proposed a number of procedures to handle various kinds of missing data. For virulence, the MLD50 value is the ultimate information we looked for; in its absence, weight loss and/or survival data of infected mice were used to infer the lower or upper bound of MLD50 and subsequently to label the virulence class. For IAV genomes that were incomplete or contained partial sequences, extrapolation was performed using the closest genome relative identified with BLAST. This pre-processing was done manually, and ambiguity occasionally occurred. Hence, caution must be taken when dealing with the datasets, and improvements to the pre-processing approach may be considered in future work. Alternatively, the research community should be encouraged to improve the way IAV virulence information is currently stored so that it is easier to reuse, e.g., by creating an online database that accepts submissions of IAV virulence-related data and can generate high-quality tables or figures of the input data (which can then be added to the related manuscript).</p>
<p id="Par55">Despite the limitations of the datasets due to the ways missing MLD50 values, partial sequences and incomplete genomes were handled, and also a recent critique of using LD50 as a virulence measure [
<xref ref-type="bibr" rid="CR16">16</xref>
], the models learned from the datasets could provide insights about IAV virulence across mouse and virus strains. Rule-based models were chosen since their output can be easily interpreted and is congruent with the current practice of investigating IAV virulence experimentally. Three rule-based learning approaches were employed: OneR, JRip and PART. The OneR approach outputs the single-site model that gives the highest accuracy [
<xref ref-type="bibr" rid="CR17">17</xref>
]; JRip and PART consider multiple sites and construct a set of decision rules using different strategies. While JRip mainly uses a separate-and-conquer algorithm [
<xref ref-type="bibr" rid="CR18">18</xref>
], PART combines a separate-and-conquer strategy with partial decision trees [
<xref ref-type="bibr" rid="CR19">19</xref>
]. For a comparison in the performance, we also explored the RF approach [
<xref ref-type="bibr" rid="CR20">20</xref>
] in modelling IAV virulence.</p>
<p id="Par56">For the models and their performance, we first noted that OneR mainly selected sites in HA and NA for its single site models, and the OneR models could give significantly better average accuracies than the zero rule model (in which the accuracy is calculated by assigning all records to the class label that has the most observations). Among the sites, some have known functions while some others are not yet characterized. For example, site 188 in HA is known to be located at the helix 190 that surrounds the receptor-binding site and thus it affects host specificity [
<xref ref-type="bibr" rid="CR21">21</xref>
], while site 142 in HA has not yet been well studied even though it was frequently selected as the top OneR classifier. On the other hand, JRip and PART generated multiple-site models, and while JRip usually did not outperform OneR, PART almost always outperformed both OneR and JRip. Of interest, PART also outperformed RF in about 50% of the tested cases. Moreover, a higher accuracy could generally be achieved by PART when considering either the concatenated protein alignments or individual protein alignments. These results demonstrate synergy between sites within a single protein and between sites in different proteins (in other words, the polygenic nature of IAV virulence in mice). This is consistent with the observations from various experimental studies, such as the ones that demonstrate intra-protein synergy in PB2 [
<xref ref-type="bibr" rid="CR22">22</xref>–
<xref ref-type="bibr" rid="CR27">27</xref>
], PA [
<xref ref-type="bibr" rid="CR11">11</xref>
], and NS1 [
<xref ref-type="bibr" rid="CR28">28</xref>
,
<xref ref-type="bibr" rid="CR29">29</xref>
], and inter-protein synergy that involves combinations of PB2, PB1, PA, HA or NA [
<xref ref-type="bibr" rid="CR12">12</xref>
,
<xref ref-type="bibr" rid="CR30">30</xref>–
<xref ref-type="bibr" rid="CR36">36</xref>
].</p>
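<p>The zero rule baseline mentioned above can be sketched in a few lines. This is a minimal illustration of the definition given in the text (always predict the most frequent class); the function name and the example labels are assumptions made for the sketch.</p>

```python
from collections import Counter


def zero_rule_accuracy(labels):
    """Accuracy obtained by assigning every record to the most frequent class."""
    if not labels:
        raise ValueError("labels must not be empty")
    most_common_count = Counter(labels).most_common(1)[0][1]
    return most_common_count / len(labels)


# Illustrative labels: a balanced two-class set and an imbalanced one.
balanced = ["avirulent"] * 50 + ["virulent"] * 50
imbalanced = ["avirulent"] * 30 + ["virulent"] * 70
print(zero_rule_accuracy(balanced), zero_rule_accuracy(imbalanced))
```

<p>With balanced classes the baseline is one divided by the number of classes, which is why a learner must exceed 50% (two-class) or about 33.3% (three-class) to be informative.</p>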
<p id="Par57">Further inspection of PART models across different IAV strains using the IV dataset revealed that although HA had many more sites correlated with virulence, PB2 seemed to play a more important role in determining IAV virulence. This was in agreement with the RF site importance ranking. In terms of accuracy, PART models based on PB2 alone were usually as good as or even better than PART models based on HA; the exception was the two-class H1N1 dataset, for which PART models based on HA were superior (see (Additional file
<xref rid="MOESM2" ref-type="media">2</xref>
: Figure S2)). Moreover, PART models based on the concatenated IAV protein alignments had a high preference towards sites in PB2, and many sites in PB2 were also considered as the most important features for RF models (Table
<xref rid="Tab3" ref-type="table">3</xref>
). Figure 
<xref rid="Fig5" ref-type="fig">5</xref>
, which shows the synergistic graphs for the two-class and three-class virulence models, further demonstrates this. Investigations of the MIV dataset and the datasets for specific IAV or mouse strains also revealed the dominance of PB2 in most cases (data not shown). When sites in PB2 did not dominate, sites in HA did, as in the case of the two-class H1N1 dataset.</p>
<p id="Par58">The critical role of PB2 in determining virulence in mice has indeed been highlighted for various strains, including H3N2 [
<xref ref-type="bibr" rid="CR34">34</xref>
,
<xref ref-type="bibr" rid="CR37">37</xref>
], H5N1 [
<xref ref-type="bibr" rid="CR22">22</xref>–
<xref ref-type="bibr" rid="CR24">24</xref>
,
<xref ref-type="bibr" rid="CR38">38</xref>
,
<xref ref-type="bibr" rid="CR39">39</xref>
], H5N8 [
<xref ref-type="bibr" rid="CR26">26</xref>
,
<xref ref-type="bibr" rid="CR40">40</xref>
], H7N9 [
<xref ref-type="bibr" rid="CR41">41</xref>–
<xref ref-type="bibr" rid="CR45">45</xref>
], H9N2 [
<xref ref-type="bibr" rid="CR25">25</xref>
,
<xref ref-type="bibr" rid="CR27">27</xref>
,
<xref ref-type="bibr" rid="CR45">45</xref>
,
<xref ref-type="bibr" rid="CR46">46</xref>
] and H10N8 [
<xref ref-type="bibr" rid="CR45">45</xref>
]. Among the top 20 sites in PB2 for PART models, sites 627 and 701 have been repeatedly shown to affect IAV virulence in mammals, including mice. Site 627 is considered critical for efficient replication, while site 701 influences polymerase activity via its interaction with the nuclear import factor importin α, which mediates the transport of proteins into the nucleus [
<xref ref-type="bibr" rid="CR47">47</xref>
]. Other top sites in PB2 are also known to contribute to virulence. For example, site 714 (top 20 for the two-class IV dataset) influences replication efficiency and IAV virulence in mice in combination with site 701 [
<xref ref-type="bibr" rid="CR23">23</xref>
,
<xref ref-type="bibr" rid="CR48">48</xref>
,
<xref ref-type="bibr" rid="CR49">49</xref>
]; site 66 (top 20 for the three-class IV dataset) sets a prerequisite for acquiring virulence [
<xref ref-type="bibr" rid="CR50">50</xref>
]; and site 158 (top 20 for the two-class and three-class IV dataset; specifically, top one for the three-class) strongly influences the virulence of both pandemic H1N1 and H5 influenza viruses in mice [
<xref ref-type="bibr" rid="CR51">51</xref>
]. To our knowledge, there is still no experimental evidence for the contribution of other top sites in PB2 to virulence, e.g., sites 80, 110 and 153. On the other hand, some sites not in the top list have been shown to play a role in dictating virulence, e.g., sites 147, 339 and 588, which can synergize to give rise to a higher level of virulence [
<xref ref-type="bibr" rid="CR24">24</xref>
].</p>
<p id="Par59">Next, the synergistic graph for the two-class virulence models interestingly showed the sites in PART virulence models clustering into two subgraphs, with sites PB2–714, PA-97 and NS1–42 acting as bottlenecks (nodes with high betweenness centrality, i.e., having many shortest paths going through them) connecting the two subgraphs. For the three-class models, the top site pairs in the synergistic graph were concentrated in, and expanded from, the subnetwork that included sites PB2–80, PB2–110, PB2–153, PB2–297, NA-300, NS1–42, and M1–215. This may indicate a greater role of these sites in sensitizing the virulence level of IAV infections. For example, site 42 within the RNA-binding domain of NS1 influences the capability of the protein to bind double-stranded RNA, and it determines the degree of pathogenicity in mice [
<xref ref-type="bibr" rid="CR52">52</xref>
]. This site also influences the activation of IRF3 and regulation of host interferon response, which subsequently influences the efficiency of viral replication [
<xref ref-type="bibr" rid="CR53">53</xref>
]. Another site that has been experimentally explored is site 215 in M1, which also contributes to the degree of IAV virulence [
<xref ref-type="bibr" rid="CR54">54</xref>
].</p>
<p id="Par60">Overall, PART, with its approach that combines a separate-and-conquer strategy with partial decision trees, has proved a suitable method for generating sequence-based virulence models that not only perform reasonably well but also provide interpretable information. Here, rather than relying on a single model developed from a single training dataset, the information was extracted from 100 models learned independently from different training datasets. While bias due to imbalanced classes was resolved by under-sampling to obtain balanced classes, the iterations might help reduce bias due to over-sampling of a particular mouse or IAV strain. Furthermore, we noted from the confusion matrix that PART models tended to misclassify avirulent (or less virulent) strains as virulent (or more virulent) ones rather than misclassify virulent (or more virulent) strains as avirulent (or less virulent) ones. In practice, this is preferred, since classifying virulent strains as avirulent is a worse decision that can cost lives. Moreover, we also investigated the effect of increasing the training size for learning PART models (data not shown). Using the two-class and three-class IV datasets containing the concatenated protein alignments, the mean accuracy of PART models trained on 80% or 90% of the total records was about 2–3% higher than that of PART models trained on 60%, but this came at the cost of a higher standard deviation (about 1.3–3.3% higher) and a larger average number of rules (3–6 more rules for two-class and 12–20 more rules for three-class; in other words, more complex models). Increasing the training size up to 99% of the total records led not only to much higher variance, but also to a drop in the mean accuracy.
Thus, given also that the top sites from PART models trained on 80% or 90% of the datasets overlapped highly with those from models trained on 60%, and that the top sites for models trained on 80% and 90% of the datasets were still dominated by sites in PB2, presenting results from models trained on 60% of the datasets is justifiable.</p>
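<p>The repeated learning described above (balance the classes by under-sampling, then train on a fraction of the records, 100 times) can be sketched as follows. This is a hedged illustration rather than the authors' pipeline: the function name, the record layout and the seeds are assumptions, and the sketch balances the classes before splitting, which this section does not specify. The split itself is a plain random split, so the training part is only approximately balanced.</p>

```python
import random
from collections import defaultdict


def balanced_train_test_split(records, label_of, train_fraction=0.6, seed=None):
    """Under-sample every class to the smallest class size, then split."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for record in records:
        by_class[label_of(record)].append(record)
    smallest = min(len(group) for group in by_class.values())
    balanced = []
    for group in by_class.values():
        balanced.extend(rng.sample(group, smallest))  # sampling without replacement
    rng.shuffle(balanced)
    cut = int(round(train_fraction * len(balanced)))
    return balanced[:cut], balanced[cut:]


# Hypothetical records: (strain name, virulence class).
records = [("strain%d" % i, "virulent") for i in range(70)]
records += [("strain%d" % i, "avirulent") for i in range(70, 100)]

# 100 independent splits, one per seed, as in the repeated learning.
splits = [balanced_train_test_split(records, lambda r: r[1], seed=s) for s in range(100)]
print(len(splits), len(splits[0][0]), len(splits[0][1]))
```

<p>With 70 and 30 records per class, each iteration keeps 30 of each class, giving 36 training and 24 test records at a 60% training fraction.</p>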
<p id="Par61">In terms of accuracy, PART models achieved moderate performance on the various datasets. The average accuracy over 100 models ranged between 65.0 and 84.4% (15.0–34.4 percentage points above baseline) for the two-class datasets that utilized all IAV proteins, and between 54.0 and 66.6% (20.7–33.3 percentage points above baseline) for the three-class datasets (see Table
<xref rid="Tab2" ref-type="table">2</xref>
). Learning from subsets of specific mouse or IAV strains revealed that some strains were easier to learn than others. Of interest, while the average accuracies were about the same for the two-class datasets regardless of whether host information was included, a significant improvement (a 3.9 percentage point increase in accuracy) was observed when incorporating host information for the three-class dataset. Thus, learning approaches that further incorporate host information should be encouraged, especially since several laboratory experiments have demonstrated the importance of host genetic backgrounds in determining IAV virulence [
<xref ref-type="bibr" rid="CR55">55</xref>–
<xref ref-type="bibr" rid="CR61">61</xref>
], even at a substrain level [
<xref ref-type="bibr" rid="CR62">62</xref>
]. In particular, with the availability of genomes and proteomes of various mouse strains, sophisticated methods that are based on host-pathogen protein-protein interactions might be of interest. If successful, such methods may be translated to human cases and other diseases to improve our understanding of disease mechanisms, establish a foundation for future personalized medicine, and provide better surveillance. Nevertheless, the development of these approaches will be more fruitful if there is a significant increase in the number of influenza experiments carried out with mouse and IAV strains that have so far been little studied.</p>
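<p>The "above baseline" figures quoted in this paragraph follow from simple arithmetic, sketched below. It assumes the baseline is the zero rule accuracy on balanced classes, i.e. 100 divided by the number of classes, which is consistent with the reported numbers; the function name is hypothetical.</p>

```python
def points_above_baseline(accuracy_percent, n_classes):
    """Percentage points by which an accuracy exceeds the balanced zero rule baseline."""
    baseline = 100.0 / n_classes  # 50% for two classes, about 33.3% for three
    return round(accuracy_percent - baseline, 1)


# Lowest and highest average PART accuracies from Table 2 (all IAV proteins).
print(points_above_baseline(65.0, 2), points_above_baseline(84.4, 2))
print(points_above_baseline(54.0, 3), points_above_baseline(66.6, 3))
```

<p>These calls recover the ranges given in the text: 15.0 to 34.4 for the two-class datasets and 20.7 to 33.3 for the three-class datasets.</p>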
</sec>
<sec id="Sec8">
<title>Conclusions</title>
<p id="Par62">In summary, we have developed benchmark datasets and explored rule-based and RF approaches for modelling IAV virulence. To our knowledge, the datasets are currently the largest aggregation of IAV infections in mice, and the number of infection records can still grow. The creation of these benchmark datasets will be beneficial for further understanding the molecular principles underlying influenza mechanisms, since mice have been a major animal model for influenza. In this study, we utilized the datasets to assess the predictability of IAV virulence for specific and across mouse and IAV strains, and to identify top protein sites and synergy between protein sites that contribute to IAV virulence. Overall, our study confirmed the polygenic nature of IAV virulence, with several sites in PB2 playing more dominant roles. We identified not only sites that are well known as IAV virulence markers, e.g. 627, 701 and 714, but also some other sites in PB2 not yet known to influence virulence. Nonetheless, modelling virulence is a very challenging problem due to the complex interactions that underlie the phenotype, which involve not only viral factors but also host factors. Hence, future work should incorporate more host information, especially the host proteomic data that are now widely available for various mouse strains. Applying different machine learning approaches and protein features, and posing virulence modelling as a regression problem that predicts MLD50, should also be considered.</p>
</sec>
<sec id="Sec9">
<title>Methods</title>
<sec id="Sec10">
<title>Collection of IAV infections in mice with virulence information</title>
<p id="Par63">Journal publications containing virulence information of IAV infections in non-transgenic and non-knock-out inbred mice were collected. They were searched for using the Google or PubMed search engines (with keywords that included influenza, infection, mouse, virulence, virus and LD50), found in the citations of retrieved articles, or recommended automatically by ScienceDirect. Each unique infection involving a specific IAV strain and a specific mouse strain with a known MLD50 value was recorded. Infections without MLD50 values, but whose weight loss and/or survival data of infected mice per infection dose could be estimated from the relevant figures, were also recorded and used to estimate the lower or upper bound of MLD50; a few of them were used to estimate the exact MLD50 using the Reed and Muench method [
<xref ref-type="bibr" rid="CR63">63</xref>
]. Various MLD50 units, which included the plaque forming unit (PFU), focus forming unit (FFU), egg infectious dose (EID50), tissue culture infectious dose (TCID50), and cell culture infectious dose (CCID50), were assumed to measure the same quantity.</p>
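<p>For readers unfamiliar with the Reed and Muench method, the sketch below follows its usual textbook formulation: deaths are accumulated from the lowest dose upwards, survivors from the highest dose downwards, and the 50% endpoint is interpolated between the two doses that bracket it. The function name, argument layout and example data are assumptions for illustration; how the authors applied the method to figure-derived data is not detailed here.</p>

```python
def reed_muench_log_ld50(log10_doses, deaths, group_sizes):
    """Estimate log10(LD50); doses must be sorted in ascending order."""
    survivors = [n - d for n, d in zip(group_sizes, deaths)]
    k = len(log10_doses)
    # An animal that died at a dose is assumed to die at every higher dose,
    # and one that survived a dose is assumed to survive every lower dose.
    cum_dead = [sum(deaths[: i + 1]) for i in range(k)]
    cum_alive = [sum(survivors[i:]) for i in range(k)]
    mortality = [d / (d + a) for d, a in zip(cum_dead, cum_alive)]
    for i in range(1, k):
        lo, hi = mortality[i - 1], mortality[i]
        if hi >= 0.5 > lo:
            # Proportionate distance between the two bracketing doses.
            distance = (0.5 - lo) / (hi - lo)
            step = log10_doses[i] - log10_doses[i - 1]
            return log10_doses[i - 1] + distance * step
    return None  # 50% mortality is not bracketed by the tested doses


# Hypothetical experiment: five mice per dose, doses of 10^1 to 10^4 units.
estimate = reed_muench_log_ld50([1.0, 2.0, 3.0, 4.0], [0, 1, 4, 5], [5, 5, 5, 5])
print(round(estimate, 2))
```

<p>In this example the cumulative mortalities are 0, 1/6, 5/6 and 1, so the endpoint lies halfway between the second and third doses, giving an MLD50 of about 10 to the power 2.5. When no pair of doses brackets 50% mortality the function returns None, which corresponds to the cases where only a lower or upper bound can be stated.</p>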
</sec>
<sec id="Sec11">
<title>Virulence classification</title>
<p id="Par64">In addition to the assumption on the equality of various MLD50 units, the MLD50 thresholds of 10
<sup>3.0</sup>
and 10
<sup>6.0</sup>
were used for virulence classification. These thresholds are used by the WHO when classifying influenza virulence in mice in EID50 units [
<xref ref-type="bibr" rid="CR64">64</xref>
]. In this regard, for the two-class problems, the levels of virulence were categorized into avirulent class if the MLD50 was > 10
<sup>6.0</sup>
and virulent class otherwise. When the class of an infection could not be determined from the lower or upper bound of MLD50, then the following rules were used:</p>
<sec id="Sec12">
<title>Rule 1</title>
<p id="Par65">An infection is avirulent if:
<list list-type="simple">
<list-item>
<label>(i)</label>
<p id="Par66">the infection dose between 10
<sup>4.0</sup>
and 10
<sup>6.0</sup>
leads to &lt; 15% average weight loss;</p>
</list-item>
<list-item>
<label>(ii)</label>
<p id="Par67">the infection dose ≥10
<sup>5.0</sup>
does not kill any mouse; or</p>
</list-item>
<list-item>
<label>(iii)</label>
<p id="Par68">the infection dose between 10
<sup>3.0</sup>
and 10
<sup>4.0</sup>
leads to ≤10% average weight loss.</p>
</list-item>
</list>
</p>
</sec>
<sec id="Sec13">
<title>Rule 2</title>
<p id="Par69">An infection is virulent if:
<list list-type="simple">
<list-item>
<label>(i)</label>
<p id="Par70">the infection dose ≤10
<sup>5.0</sup>
leads to ≥15% average weight loss;</p>
</list-item>
<list-item>
<label>(ii)</label>
<p id="Par71">the infection dose ≤10
<sup>3.0</sup>
leads to ≥10% average weight loss; or</p>
</list-item>
<list-item>
<label>(iii)</label>
<p id="Par72">the infection dose ≤10
<sup>3.5</sup>
kills ≥10% mice.</p>
</list-item>
</list>
</p>
<p id="Par73">For the three-class classification problems, the levels of virulence were categorized into low virulence if the MLD50 was > 10
<sup>6.0</sup>
, intermediate virulence if the MLD50 was ≤10
<sup>6.0</sup>
and > 10
<sup>3.0</sup>
, and high virulence otherwise. When the class of an infection could not be determined from the lower or upper bound of MLD50, then the following rules were used:</p>
</sec>
<sec id="Sec14">
<title>Rule 3</title>
<p id="Par74">An infection is low virulence if it is considered avirulent (as given in the two-class labelling).</p>
</sec>
<sec id="Sec15">
<title>Rule 4</title>
<p id="Par75">An infection is intermediate virulence if:
<list list-type="simple">
<list-item>
<label>(i)</label>
<p id="Par76">the infection dose &lt; 10
<sup>4.0</sup>
leads to ≥10% average weight loss;</p>
</list-item>
<list-item>
<label>(ii)</label>
<p id="Par77">the infection dose between 10
<sup>4.0</sup>
and 10
<sup>5.0</sup>
leads to ≥15% average weight loss; or</p>
</list-item>
<list-item>
<label>(iii)</label>
<p id="Par78">the infection dose between 10
<sup>5.0</sup>
and 10
<sup>6.0</sup>
leads to ≥20% average weight loss.</p>
</list-item>
</list>
</p>
</sec>
<sec id="Sec16">
<title>Rule 5</title>
<p id="Par79">An infection is high virulence if:
<list list-type="simple">
<list-item>
<label>(i)</label>
<p id="Par80">the infection dose ≤10
<sup>6.0</sup>
kills ≥80% mice or leads to ≥25% average weight loss; or</p>
</list-item>
<list-item>
<label>(ii)</label>
<p id="Par81">the infection dose ≤10
<sup>1.0</sup>
kills ≥20% mice.</p>
</list-item>
</list>
</p>
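<p>When an exact MLD50 is known, the threshold-based labelling described in this section reduces to two comparisons on the log10 scale. The Python sketch below illustrates this under the thresholds stated in the text (10^3.0 and 10^6.0); the function names and class label strings are hypothetical choices for illustration, and Rules 1 to 5 for infections with only bounded MLD50 values are not covered.</p>
<preformat><![CDATA[
```python
def two_class(log10_mld50):
    """Two-class label: avirulent if MLD50 exceeds 10^6.0, virulent otherwise."""
    return "avirulent" if log10_mld50 > 6.0 else "virulent"


def three_class(log10_mld50):
    """Three-class label using the 10^3.0 and 10^6.0 thresholds."""
    if log10_mld50 > 6.0:
        return "low"
    if log10_mld50 > 3.0:
        return "intermediate"
    return "high"


# Hypothetical log10(MLD50) values, including both boundary values.
for value in (2.5, 3.0, 4.7, 6.0, 6.5):
    print(value, two_class(value), three_class(value))
```
]]></preformat>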
<p id="Par82">The above procedure created the initial dataset for IAV infections in mice with virulence information for this study (Additional file
<xref rid="MOESM5" ref-type="media">5</xref>
: Table S1). Following this, multiple records of infections involving specific IAV and mouse strains were reduced into a single record (Additional file
<xref rid="MOESM6" ref-type="media">6</xref>
: Table S2) by the following procedure (termed as
<bold>RULE 6</bold>
):
<list list-type="simple">
<list-item>
<label>(i)</label>
<p id="Par83">Specify the majority class of the three-class virulence assignment for those records; when there is no majority, take the more (or most) virulent class.</p>
</list-item>
<list-item>
<label>(ii)</label>
<p id="Par84">Select the record with:
<list list-type="bullet">
<list-item>
<label></label>
<p id="Par85">the highest lower bound of MLD50 value when only the lower bound of MLD50 values is presented;</p>
</list-item>
<list-item>
<label></label>
<p id="Par86">the lowest exact or upper bound of MLD50 value when they are available; but when the highest lower bound of MLD50 value is lower than this value, then calculate the average of those two values and assign the virulence class as described previously.</p>
</list-item>
</list>
</p>
</list-item>
</list>
</p>
<p id="Par87">This procedure selected the record carrying the more (or most) virulent information among the records, except when only lower bounds of MLD50 were available, or alternatively the record with the majority class if it could be determined. Note that when applying this procedure, the recombinants of naturally occurring or wild-type IAV strains were considered to represent the wild-type version. In a similar fashion, we applied this procedure to reduce multiple records of infections of a specific IAV strain in different mouse strains into a single record (Additional file
<xref rid="MOESM7" ref-type="media">7</xref>
: Table S3).</p>
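<p>Step (i) of RULE 6 can be sketched in Python as follows. This is an illustrative reading rather than the authors' implementation: it assumes that "majority" means the most frequent class among the records and that ties are resolved towards the more virulent class. The name <monospace>consensus_class</monospace> and the label strings are hypothetical.</p>
<preformat><![CDATA[
```python
from collections import Counter

# Ordering of the three virulence classes, from least to most virulent.
SEVERITY = {"low": 0, "intermediate": 1, "high": 2}


def consensus_class(labels):
    """Return the most frequent class; break ties towards higher virulence."""
    counts = Counter(labels)
    top = max(counts.values())
    tied = [label for label, count in counts.items() if count == top]
    return max(tied, key=SEVERITY.__getitem__)


# Hypothetical class assignments for repeated records of one infection.
print(consensus_class(["low", "high", "high"]))
print(consensus_class(["low", "intermediate"]))
```
]]></preformat>
<p>Step (ii), which selects the record according to the available MLD50 bounds, is not covered by this sketch.</p>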
</sec>
</sec>
<sec id="Sec17">
<title>Collection of related genomes and main proteins</title>
<p id="Par88">The availability of the sequences of IAV strains in public databases, when not indicated in the literature, was checked online using the Google, GenBank and GISAID search engines, or offline by searching the genomeset.dat and influenza_na.dat files retrieved from the NCBI Influenza Virus Resource [
<xref ref-type="bibr" rid="CR65">65</xref>
]. The sequences of the viruses, if available, were collected from GenBank [
<xref ref-type="bibr" rid="CR66">66</xref>
] or GISAID [
<xref ref-type="bibr" rid="CR67">67</xref>
]. When the genome of a particular virus was incomplete, the HA and/or NA of the virus was BLASTed against the GenBank database of all influenza viruses, and the top virus hit whose complete genome was available was used to extrapolate the incomplete genome (Additional file 
<xref rid="MOESM11" ref-type="media">11</xref>
: Table S7). Considering the closeness between their collection year and name, the genomes of A/Turkey/15/2006(H5N1) and A/chicken/Shandong/L1/2007(H9N2) were used to represent the genomes of A/Turkey/13/2006(H5N1) and A/chicken/Shandong/lx1023/2007(H9N2), respectively, which were not available during this study. Furthermore, we extrapolated partial IAV sequences by using the closest complete IAV sequence identified by BLAST (Additional file 
<xref rid="MOESM12" ref-type="media">12</xref>
: Table S8). Following the collection of IAV genomes and their extrapolation, the 12 IAV proteins were obtained by identifying their coding sequence regions using the Influenza Virus Sequence Annotation Tool available at the NCBI Influenza Virus Resource [
<xref ref-type="bibr" rid="CR65">65</xref>
] and then translating them into proteins according to the standard genetic code. Some proteins, mainly for recombinant and/or mutant viruses, were generated from existing proteins according to the list of amino acid differences at various sites reported in the literature. Note that some IAVs were represented by different versions of genomes or sets of proteins, but the reassortant or mutant viruses were mainly reconstructed from one of the versions. The metadata for the IAV nucleotide sequences used in this study, the reconstructed recombinant and/or mutant IAVs generated from those sequences, and the acknowledgement of the source of the GISAID sequences are provided in Tables S4-S6 (Additional files
<xref rid="MOESM8" ref-type="media">8</xref>
,
<xref rid="MOESM9" ref-type="media">9</xref>
and
<xref rid="MOESM10" ref-type="media">10</xref>
), respectively. They were also made available in DR-NTU (Data) under the title “Virulence Information for Influenza Virus Infections (VI2VI) in Mice” [
<xref ref-type="bibr" rid="CR68">68</xref>
], and further updates will be available at 10.21979/N9/ILQBAB.</p>
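<p>Generating a mutant protein from an existing one, given a list of reported amino acid differences, can be sketched in Python as below. The notation (reference residue, 1-based position, new residue, e.g. E2K), the function name <monospace>apply_substitutions</monospace> and the toy sequence are assumptions made for illustration; the text does not specify how the authors encoded the differences.</p>
<preformat><![CDATA[
```python
import re


def apply_substitutions(protein, substitutions):
    """Apply substitutions such as 'E2K' (1-based positions) to a protein string."""
    residues = list(protein)
    for sub in substitutions:
        match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", sub)
        if match is None:
            raise ValueError(f"unrecognised substitution: {sub}")
        ref, pos, alt = match.group(1), int(match.group(2)), match.group(3)
        if pos == 0 or pos > len(residues):
            raise ValueError(f"position out of range: {sub}")
        # Checking the reference residue guards against numbering mismatches.
        if residues[pos - 1] != ref:
            raise ValueError(f"expected {ref} at {pos}, found {residues[pos - 1]}")
        residues[pos - 1] = alt
    return "".join(residues)


# Toy six-residue sequence, not a real IAV protein.
print(apply_substitutions("MERIKE", ["E2K"]))
```
]]></preformat>
<p>Verifying the reference residue before substituting is a design choice in this sketch: it surfaces cases where the reported site numbering does not match the sequence being edited.</p>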
</sec>
<sec id="Sec18">
<title>Machine learning approaches for IAV virulence prediction</title>
<p id="Par89">Three rule-based machine learning approaches, i.e., OneR, JRip and PART that are available in RWeka version 0.4.39 [
<xref ref-type="bibr" rid="CR69">69</xref>
], and random forest (RF) that is available in randomForest package version 4.6.14 for R software (R version 3.5.1 [
<xref ref-type="bibr" rid="CR70">70</xref>
] was used for all statistical and computational works in this study) were explored to develop predictive models for IAV virulence. Various input datasets were considered (see the first section of results), but in general, the input datasets consisted of IAV proteins that have been aligned with muscle package version 3.8.425 [
<xref ref-type="bibr" rid="CR71">71</xref>
] and their target virulence class. The datasets included either the concatenated alignments of all IAV proteins or an individual alignment of PB2, PB1, PA, HA, NP, NA, M1, NS1, PB1-F2, PA-X, M2 or NS2 proteins. Each column in the alignment that contained more than one symbol was considered as a single feature vector; H3 and N2 numberings were used to label the positions in the alignments of HA and NA, respectively. Input datasets that incorporated the host strain information, where each amino acid in the alignments was tagged with a symbol indicating the associated host strain, were also considered. For each input dataset, each learning algorithm and each of the two-class and three-class datasets, rule-based and RF models were learned independently 100 times. In each iteration, the dataset was balanced by reducing the size of the bigger (biggest) class to the size of the smaller (smallest) class through sampling without replacement. Unless stated otherwise, 60% of the records (rows of the alignment) from each virulence class were used as training data for learning a model, while the rest were used as test data. Performance metrics that included accuracy, (macro-average) precision and (macro-average) recall were calculated to evaluate the models.</p>
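<p>The data preparation described above (keeping alignment columns with more than one symbol, balancing classes by downsampling without replacement, and a 60/40 split within each class) can be sketched in Python as follows. The study itself used R; this sketch is only an illustration, and the function names, the seed handling and the toy data are hypothetical.</p>
<preformat><![CDATA[
```python
import random
from collections import defaultdict


def variable_columns(alignment):
    """Indices of alignment columns that contain more than one symbol."""
    return [j for j, column in enumerate(zip(*alignment)) if len(set(column)) > 1]


def balanced_split(labels, train_fraction=0.6, seed=0):
    """Downsample every class to the smallest class size, then split per class."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)
    smallest = min(len(indices) for indices in by_class.values())
    cut = round(train_fraction * smallest)
    train, test = [], []
    for label in sorted(by_class):
        # random.Random.sample draws without replacement.
        chosen = rng.sample(by_class[label], smallest)
        train.extend(chosen[:cut])
        test.extend(chosen[cut:])
    return train, test


# Toy alignment (rows are records) and toy class labels.
alignment = ["MKT", "MRT", "MKS"]
print(variable_columns(alignment))

labels = ["vir"] * 10 + ["avir"] * 5
train_rows, test_rows = balanced_split(labels)
print(len(train_rows), len(test_rows))
```
]]></preformat>
<p>Repeating <monospace>balanced_split</monospace> with 100 different seeds would mirror the 100 independent learning iterations, under the assumption that each iteration redraws both the balanced subset and the training rows.</p>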
</sec>
<sec id="Sec19">
<title>Visualization, statistical analyses and site rankings</title>
<p id="Par90">The concatenated alignments of all IAV proteins were visualized in 3D Cartesian coordinates. For this, a matrix of pairwise distances was computed from the concatenated protein alignments using the Fitch similarity matrix, and then Kruskal’s non-metric multidimensional scaling available in MASS package version 7.3.50 [
<xref ref-type="bibr" rid="CR72">72</xref>
] for R software was applied to place each record of the concatenated protein sequences in a 3D space.</p>
<p id="Par91">The correlations between sites in the alignment and the target virulence class were measured using the Benjamini-Hochberg adjusted
<italic>p</italic>
-values of the chi-square test of independence. The –log (adjusted p-value) of the test over the sites of each IAV protein was visualized with a line plot.</p>
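<p>The Benjamini-Hochberg adjustment and the negative-log transform used for the line plots can be sketched in Python as below. The raw p-values are made-up inputs standing in for per-site chi-square test results, the function names are hypothetical, and the logarithm base (10) is an assumption because the text does not state it.</p>
<preformat><![CDATA[
```python
import math


def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def neg_log10(p):
    """Negative base-10 logarithm of a p-value."""
    return -math.log10(p)


raw = [0.01, 0.04, 0.03, 0.005]  # illustrative per-site p-values
adjusted = benjamini_hochberg(raw)
print(adjusted)
print([round(neg_log10(p), 2) for p in adjusted])
```
]]></preformat>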
<p id="Par92">For each of the two-class and three-class datasets, a three-way ANOVA model (with interactions) was built to identify factors that influence the accuracy of the virulence models. The factors included the dataset (with 7 levels: MIV, IV, BALB/C, C57BL/6, H1N1, H3N2 and H5N1), protein alignment (with 13 levels: all proteins, PB2, PB1, PA, HA, NP, NA, M1, NS1, PB1-F2, PA-X, M2 and NS2) and machine learning algorithm (with 3 levels: OneR, JRip and PART). Tukey’s HSD post hoc test was then carried out to identify pairs of groups (virulence models) that were significantly different. The Wilcoxon signed-rank sum test was also used to test the null hypothesis that the median of the accuracy of the PART model learned from any dataset containing the concatenated protein alignments is greater than that of the corresponding RF model. The p-values of the tests were adjusted using the Bonferroni method.</p>
<p id="Par93">Following 100 independent learnings from the two-class and three-class IV datasets, the protein sites from models learned using each algorithm were ranked. For OneR, the sites were ranked according to their frequency of being selected for the models; for JRip and PART, the sites were ranked according to their average contribution to the accuracy of learned models; and for RF, the sites were ranked according to their contribution to the average mean decrease in accuracy. For PART models, we also ranked the site pairs according to their average contribution to the accuracy of learned models and visualized the synergistic graph that arises from the top 50 site pairs using igraph package version 1.2.2 [
<xref ref-type="bibr" rid="CR73">73</xref>
] for R software.</p>
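<p>The frequency-based ranking used for OneR can be sketched in Python as follows, assuming that each learned OneR model contributes exactly one selected site. The site labels and the number of runs are illustrative, and the function name is hypothetical; the rankings for JRip, PART and RF rely on accuracy contributions that are not reproduced here.</p>
<preformat><![CDATA[
```python
from collections import Counter


def rank_sites_by_frequency(selected_sites):
    """Rank sites by how often they were selected; ties are ordered by label."""
    counts = Counter(selected_sites)
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


# Illustrative selections from six hypothetical OneR runs.
runs = ["PB2:627", "PB2:701", "PB2:627", "PB2:714", "PB2:627", "PB2:701"]
for site, count in rank_sites_by_frequency(runs):
    print(site, count)
```
]]></preformat>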
</sec>
</sec>
<sec sec-type="supplementary-material">
<title>Supplementary information</title>
<sec id="Sec20">
<p>
<supplementary-material content-type="local-data" id="MOESM1">
<media xlink:href="12864_2019_6295_MOESM1_ESM.pptx">
<caption>
<p>
<bold>Additional file 1: Figure S1.</bold>
Accuracy distribution of 100 OneR/JRip/PART models learned independently from two-class and three-class BALB/C, C57BL/6, H1N1, H3N2, and H5N1 datasets containing either the concatenated alignments of all IAV proteins or an individual alignment of PB2, PB1, PA, HA, NP, NA, M1, NS1, PB1-F2, PA-X, M2 or NS2 proteins.</p>
</caption>
</media>
</supplementary-material>
<supplementary-material content-type="local-data" id="MOESM2">
<media xlink:href="12864_2019_6295_MOESM2_ESM.pptx">
<caption>
<p>
<bold>Additional file 2: Figure S2.</bold>
Accuracy distribution of 100 PART/random forest models learned independently from two-class and three-class datasets containing the concatenated alignments of IAV proteins.</p>
</caption>
</media>
</supplementary-material>
<supplementary-material content-type="local-data" id="MOESM3">
<media xlink:href="12864_2019_6295_MOESM3_ESM.pptx">
<caption>
<p>
<bold>Additional file 3: Figure S3.</bold>
Multiple comparisons between mean accuracies of OneR, JRip and PART models for IAV virulence based on two-class and three-class MIV and IV datasets containing either the concatenated alignment of all IAV proteins or an individual alignment of PB2, PB1, PA, HA, NP, NA, M1, NS1, PB1-F2, PA-X, M2 and NS2 proteins.</p>
</caption>
</media>
</supplementary-material>
<supplementary-material content-type="local-data" id="MOESM4">
<media xlink:href="12864_2019_6295_MOESM4_ESM.pptx">
<caption>
<p>
<bold>Additional file 4: Figure S4.</bold>
Multiple comparisons between mean accuracies of OneR, JRip and PART models for IAV virulence based on two-class and three-class BALB/C, C57BL/6, H1N1, H3N2 and H5N1 datasets containing either the concatenated alignment of all IAV proteins or an individual alignment of PB2, PB1, PA, HA, NP, NA, M1, NS1, PB1-F2, PA-X, M2 and NS2 proteins.</p>
</caption>
</media>
</supplementary-material>
<supplementary-material content-type="local-data" id="MOESM5">
<media xlink:href="12864_2019_6295_MOESM5_ESM.docx">
<caption>
<p>
<bold>Additional file 5: Table S1.</bold>
Initial dataset for IAV infections in mice with virulence information (with supplementary references).</p>
</caption>
</media>
</supplementary-material>
<supplementary-material content-type="local-data" id="MOESM6">
<media xlink:href="12864_2019_6295_MOESM6_ESM.docx">
<caption>
<p>
<bold>Additional file 6: Table S2.</bold>
Reduction of multiple records for infection involving specific IAV and mouse strains into a single record (with supplementary references).</p>
</caption>
</media>
</supplementary-material>
<supplementary-material content-type="local-data" id="MOESM7">
<media xlink:href="12864_2019_6295_MOESM7_ESM.docx">
<caption>
<p>
<bold>Additional file 7: Table S3.</bold>
Reduction of multiple records for infection of a specific IAV strain in different mouse strains into a single record (with supplementary references).</p>
</caption>
</media>
</supplementary-material>
<supplementary-material content-type="local-data" id="MOESM8">
<media xlink:href="12864_2019_6295_MOESM8_ESM.docx">
<caption>
<p>
<bold>Additional file 8: Table S4.</bold>
Metadata of IAV nucleotide sequences used in this study (with supplementary references).</p>
</caption>
</media>
</supplementary-material>
<supplementary-material content-type="local-data" id="MOESM9">
<media xlink:href="12864_2019_6295_MOESM9_ESM.docx">
<caption>
<p>
<bold>Additional file 9: Table S5.</bold>
Mutant and reassortant IAVs generated in this study.</p>
</caption>
</media>
</supplementary-material>
<supplementary-material content-type="local-data" id="MOESM10">
<media xlink:href="12864_2019_6295_MOESM10_ESM.docx">
<caption>
<p>
<bold>Additional file 10: Table S6.</bold>
GISAID acknowledgement table for sequences used in this study.</p>
</caption>
</media>
</supplementary-material>
<supplementary-material content-type="local-data" id="MOESM11">
<media xlink:href="12864_2019_6295_MOESM11_ESM.docx">
<caption>
<p>
<bold>Additional file 11: Table S7.</bold>
Extrapolated incomplete IAV genomes.</p>
</caption>
</media>
</supplementary-material>
<supplementary-material content-type="local-data" id="MOESM12">
<media xlink:href="12864_2019_6295_MOESM12_ESM.docx">
<caption>
<p>
<bold>Additional file 12: Table S8.</bold>
Extrapolated partial IAV segments.</p>
</caption>
</media>
</supplementary-material>
<supplementary-material content-type="local-data" id="MOESM13">
<media xlink:href="12864_2019_6295_MOESM13_ESM.docx">
<caption>
<p>
<bold>Additional file 13: Table S9.</bold>
Examples of rules generated by OneR, JRip and PART for two-class and three-class MIV datasets containing concatenated alignments of IAV proteins.</p>
</caption>
</media>
</supplementary-material>
<supplementary-material content-type="local-data" id="MOESM14">
<media xlink:href="12864_2019_6295_MOESM14_ESM.docx">
<caption>
<p>
<bold>Additional file 14: Table S10.</bold>
Examples of rules generated by OneR, JRip and PART for two-class and three-class IV datasets containing concatenated alignments of IAV proteins.</p>
</caption>
</media>
</supplementary-material>
<supplementary-material content-type="local-data" id="MOESM15">
<media xlink:href="12864_2019_6295_MOESM15_ESM.docx">
<caption>
<p>
<bold>Additional file 15: Table S11.</bold>
Examples of rules generated by OneR, JRip and PART for two-class and three-class BALB/C datasets containing concatenated alignments of IAV proteins.</p>
</caption>
</media>
</supplementary-material>
<supplementary-material content-type="local-data" id="MOESM16">
<media xlink:href="12864_2019_6295_MOESM16_ESM.docx">
<caption>
<p>
<bold>Additional file 16: Table S12.</bold>
Examples of rules generated by OneR, JRip and PART for two-class and three-class C57BL/6 datasets containing concatenated alignments of IAV proteins.</p>
</caption>
</media>
</supplementary-material>
<supplementary-material content-type="local-data" id="MOESM17">
<media xlink:href="12864_2019_6295_MOESM17_ESM.docx">
<caption>
<p>
<bold>Additional file 17: Table S13.</bold>
Examples of rules generated by OneR, JRip and PART for two-class and three-class H1N1 datasets containing concatenated alignments of IAV proteins.</p>
</caption>
</media>
</supplementary-material>
<supplementary-material content-type="local-data" id="MOESM18">
<media xlink:href="12864_2019_6295_MOESM18_ESM.docx">
<caption>
<p>
<bold>Additional file 18: Table S14.</bold>
Examples of rules generated by OneR, JRip and PART for two-class and three-class H3N2 datasets containing concatenated alignments of IAV proteins.</p>
</caption>
</media>
</supplementary-material>
<supplementary-material content-type="local-data" id="MOESM19">
<media xlink:href="12864_2019_6295_MOESM19_ESM.docx">
<caption>
<p>
<bold>Additional file 19: Table S15.</bold>
Examples of rules generated by OneR, JRip and PART for two-class and three-class H5N1 datasets containing concatenated alignments of IAV proteins.</p>
</caption>
</media>
</supplementary-material>
</p>
</sec>
</sec>
</body>
<back>
<glossary>
<title>Abbreviations</title>
<def-list>
<def-item>
<term>1R</term>
<def>
<p id="Par4">OneR</p>
</def>
</def-item>
<def-item>
<term>avir</term>
<def>
<p id="Par5">avirulent</p>
</def>
</def-item>
<def-item>
<term>CCID50</term>
<def>
<p id="Par6">cell culture infectious dose</p>
</def>
</def-item>
<def-item>
<term>EID50</term>
<def>
<p id="Par7">egg infectious dose</p>
</def>
</def-item>
<def-item>
<term>FFU</term>
<def>
<p id="Par8">focus forming unit</p>
</def>
</def-item>
<def-item>
<term>HA</term>
<def>
<p id="Par9">hemagglutinin</p>
</def>
</def-item>
<def-item>
<term>hi</term>
<def>
<p id="Par10">high virulence</p>
</def>
</def-item>
<def-item>
<term>IAV</term>
<def>
<p id="Par11">influenza A virus</p>
</def>
</def-item>
<def-item>
<term>int</term>
<def>
<p id="Par12">intermediate virulence</p>
</def>
</def-item>
<def-item>
<term>IP</term>
<def>
<p id="Par13">IAV protein dataset</p>
</def>
</def-item>
<def-item>
<term>IV</term>
<def>
<p id="Par14">IVir ×
<sub>I</sub>
IP dataset</p>
</def>
</def-item>
<def-item>
<term>IVir</term>
<def>
<p id="Par15">IAV virulence dataset (virulence dataset whose multiple IAV infection records across different mouse strains were reduced into a single record)</p>
</def>
</def-item>
<def-item>
<term>JR</term>
<def>
<p id="Par16">JRip</p>
</def>
</def-item>
<def-item>
<term>LD50</term>
<def>
<p id="Par17">lethal dose 50</p>
</def>
</def-item>
<def-item>
<term>lo</term>
<def>
<p id="Par18">low virulence</p>
</def>
</def-item>
<def-item>
<term>M1</term>
<def>
<p id="Par19">matrix protein 1</p>
</def>
</def-item>
<def-item>
<term>M2</term>
<def>
<p id="Par20">matrix protein 2</p>
</def>
</def-item>
<def-item>
<term>MIV</term>
<def>
<p id="Par21">MIVir ×
<sub>I</sub>
IP dataset</p>
</def>
</def-item>
<def-item>
<term>MIVir</term>
<def>
<p id="Par22">mouse-IAV virulence dataset (virulence dataset whose multiple records involving specific IAV and mouse strain were reduced into a single record)</p>
</def>
</def-item>
<def-item>
<term>MLD50</term>
<def>
<p id="Par23">mouse lethal dose 50</p>
</def>
</def-item>
<def-item>
<term>NA</term>
<def>
<p id="Par24">neuraminidase</p>
</def>
</def-item>
<def-item>
<term>NP</term>
<def>
<p id="Par25">nucleocapsid protein</p>
</def>
</def-item>
<def-item>
<term>NS1</term>
<def>
<p id="Par26">non-structural protein 1</p>
</def>
</def-item>
<def-item>
<term>NS2</term>
<def>
<p id="Par27">non-structural protein 2</p>
</def>
</def-item>
<def-item>
<term>PA</term>
<def>
<p id="Par28">acidic RNA polymerase</p>
</def>
</def-item>
<def-item>
<term>PB1</term>
<def>
<p id="Par29">basic RNA polymerase 1</p>
</def>
</def-item>
<def-item>
<term>PB1-F2</term>
<def>
<p id="Par30">PB1 frame 2</p>
</def>
</def-item>
<def-item>
<term>PB2</term>
<def>
<p id="Par31">basic RNA polymerase 2</p>
</def>
</def-item>
<def-item>
<term>PFU</term>
<def>
<p id="Par32">plaque forming unit</p>
</def>
</def-item>
<def-item>
<term>PT</term>
<def>
<p id="Par33">PART</p>
</def>
</def-item>
<def-item>
<term>RF</term>
<def>
<p id="Par34">random forest</p>
</def>
</def-item>
<def-item>
<term>TCID50</term>
<def>
<p id="Par35">tissue culture infectious dose</p>
</def>
</def-item>
<def-item>
<term>vir</term>
<def>
<p id="Par36">virulent</p>
</def>
</def-item>
</def-list>
</glossary>
<fn-group>
<fn>
<p>
<bold>Publisher’s Note</bold>
</p>
<p>Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.</p>
</fn>
</fn-group>
<sec>
<title>Supplementary information</title>
<p>
<bold>Supplementary information</bold>
accompanies this paper at 10.1186/s12864-019-6295-8.</p>
</sec>
<ack>
<title>Acknowledgements</title>
<p>We wish to thank all authors whose IAV infection and genomic data were used in this study.</p>
<sec id="FPar1">
<title>About this supplement</title>
<p id="Par94">This article has been published as part of BMC Genomics, Volume 20 Supplement 9, 2019: 18th International Conference on Bioinformatics. The full contents of the supplement are available at
<ext-link ext-link-type="uri" xlink:href="https://bmcgenomics.biomedcentral.com/articles/supplements/volume-20-supplement-9">https://bmcgenomics.biomedcentral.com/articles/supplements/volume-20-supplement-9</ext-link>
.</p>
</sec>
</ack>
<notes notes-type="author-contribution">
<title>Authors’ contributions</title>
<p>FXI designed and conducted all research procedures and analyses in the study, and wrote the manuscript. CKK provided supervision to the research and revised the manuscript. Both authors have read and approved the final manuscript.</p>
</notes>
<notes notes-type="funding-information">
<title>Funding</title>
<p>Publication of this supplement was funded by AcRF Tier 2 Grant MOE2014-T2-2-023, Ministry of Education, Singapore and A*STAR-NTU-SUTD AI Partnership Grant RGANS1905.</p>
</notes>
<notes notes-type="data-availability">
<title>Availability of data and materials</title>
<p>All figures and tables generated in this study are available in this article and its additional files. The sequences used in this study are available in GenBank or GISAID or can be requested from the corresponding author of related publications – the GenBank/GISAID accession number or reference for the sequences can be found in Table S4 (Additional file
<xref rid="MOESM8" ref-type="media">8</xref>
). The figures and tables, in addition to the processing and analysis scripts, are also available in DR-NTU (Data) repository 10.21979/N9/ILQBAB.</p>
</notes>
<notes>
<title>Ethics approval and consent to participate</title>
<p id="Par95">Not applicable.</p>
</notes>
<notes>
<title>Consent for publication</title>
<p id="Par96">Not applicable.</p>
</notes>
<notes notes-type="COI-statement">
<title>Competing interests</title>
<p id="Par97">Both authors declare that they have no competing interests.</p>
</notes>
<ref-list id="Bib1">
<title>References</title>
<ref id="CR1">
<label>1.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Muramoto</surname>
<given-names>Y</given-names>
</name>
<name>
<surname>Noda</surname>
<given-names>T</given-names>
</name>
<name>
<surname>Kawakami</surname>
<given-names>E</given-names>
</name>
<name>
<surname>Akkina</surname>
<given-names>R</given-names>
</name>
<name>
<surname>Kawaoka</surname>
<given-names>Y</given-names>
</name>
</person-group>
<article-title>Identification of novel influenza A virus proteins translated from PA mRNA</article-title>
<source>J Virol</source>
<year>2013</year>
<volume>87</volume>
<issue>5</issue>
<fpage>2455</fpage>
<lpage>2462</lpage>
<pub-id pub-id-type="doi">10.1128/JVI.02656-12</pub-id>
<pub-id pub-id-type="pmid">23236060</pub-id>
</element-citation>
</ref>
<ref id="CR2">
<label>2.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Poovorawan</surname>
<given-names>Y</given-names>
</name>
<name>
<surname>Pyungporn</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Prachayangprecha</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Makkoch</surname>
<given-names>J</given-names>
</name>
</person-group>
<article-title>Global alert to avian influenza virus infection: from H5N1 to H7N9</article-title>
<source>Pathog Glob Health</source>
<year>2013</year>
<volume>107</volume>
<issue>5</issue>
<fpage>217</fpage>
<lpage>223</lpage>
<pub-id pub-id-type="doi">10.1179/2047773213Y.0000000103</pub-id>
<pub-id pub-id-type="pmid">23916331</pub-id>
</element-citation>
</ref>
<ref id="CR3">
<label>3.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Su</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Bi</surname>
<given-names>Y</given-names>
</name>
<name>
<surname>Wong</surname>
<given-names>G</given-names>
</name>
<name>
<surname>Gray</surname>
<given-names>GC</given-names>
</name>
<name>
<surname>Gao</surname>
<given-names>GF</given-names>
</name>
<name>
<surname>Li</surname>
<given-names>S</given-names>
</name>
</person-group>
<article-title>Epidemiology, evolution, and recent outbreaks of avian influenza virus in China</article-title>
<source>J Virol</source>
<year>2015</year>
<volume>89</volume>
<issue>17</issue>
<fpage>8671</fpage>
<lpage>8676</lpage>
<pub-id pub-id-type="doi">10.1128/JVI.01034-15</pub-id>
<pub-id pub-id-type="pmid">26063419</pub-id>
</element-citation>
</ref>
<ref id="CR4">
<label>4.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Ma</surname>
<given-names>MJ</given-names>
</name>
<name>
<surname>Liu</surname>
<given-names>C</given-names>
</name>
<name>
<surname>Wu</surname>
<given-names>MN</given-names>
</name>
<name>
<surname>Zhao</surname>
<given-names>T</given-names>
</name>
<name>
<surname>Wang</surname>
<given-names>GL</given-names>
</name>
<name>
<surname>Yang</surname>
<given-names>Y</given-names>
</name>
<etal></etal>
</person-group>
<article-title>Influenza A(H7N9) virus antibody responses in survivors 1 year after infection, China, 2017</article-title>
<source>Emerg Infect Dis</source>
<year>2018</year>
<volume>24</volume>
<issue>4</issue>
<fpage>663</fpage>
<lpage>672</lpage>
<pub-id pub-id-type="doi">10.3201/eid2404.171995</pub-id>
<pub-id pub-id-type="pmid">29432091</pub-id>
</element-citation>
</ref>
<ref id="CR5">
<label>5.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Lindenmann</surname>
<given-names>J</given-names>
</name>
</person-group>
<article-title>Inheritance of resistance to influenza virus in mice</article-title>
<source>Proc Soc Exp Biol Med</source>
<year>1964</year>
<volume>116</volume>
<fpage>506</fpage>
<lpage>509</lpage>
<pub-id pub-id-type="doi">10.3181/00379727-116-29292</pub-id>
<pub-id pub-id-type="pmid">14193387</pub-id>
</element-citation>
</ref>
<ref id="CR6">
<label>6.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Verhelst</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Parthoens</surname>
<given-names>E</given-names>
</name>
<name>
<surname>Schepens</surname>
<given-names>B</given-names>
</name>
<name>
<surname>Fiers</surname>
<given-names>W</given-names>
</name>
<name>
<surname>Saelens</surname>
<given-names>X</given-names>
</name>
</person-group>
<article-title>Interferon-inducible protein Mx1 inhibits influenza virus by interfering with functional viral ribonucleoprotein complex assembly</article-title>
<source>J Virol</source>
<year>2012</year>
<volume>86</volume>
<issue>24</issue>
<fpage>13445</fpage>
<lpage>13455</lpage>
<pub-id pub-id-type="doi">10.1128/JVI.01682-12</pub-id>
<pub-id pub-id-type="pmid">23015724</pub-id>
</element-citation>
</ref>
<ref id="CR7">
<label>7.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Kamal</surname>
<given-names>RP</given-names>
</name>
<name>
<surname>Katz</surname>
<given-names>JM</given-names>
</name>
<name>
<surname>York</surname>
<given-names>IA</given-names>
</name>
</person-group>
<article-title>Molecular determinants of influenza virus pathogenesis in mice</article-title>
<source>Curr Top Microbiol Immunol</source>
<year>2014</year>
<volume>385</volume>
<fpage>243</fpage>
<lpage>274</lpage>
<pub-id pub-id-type="pmid">25038937</pub-id>
</element-citation>
</ref>
<ref id="CR8">
<label>8.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Medina</surname>
<given-names>RA</given-names>
</name>
<name>
<surname>Garcia-Sastre</surname>
<given-names>A</given-names>
</name>
</person-group>
<article-title>Influenza A viruses: new research developments</article-title>
<source>Nat Rev Microbiol</source>
<year>2011</year>
<volume>9</volume>
<issue>8</issue>
<fpage>590</fpage>
<lpage>603</lpage>
<pub-id pub-id-type="doi">10.1038/nrmicro2613</pub-id>
<pub-id pub-id-type="pmid">21747392</pub-id>
</element-citation>
</ref>
<ref id="CR9">
<label>9.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Imai</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Kawaoka</surname>
<given-names>Y</given-names>
</name>
</person-group>
<article-title>The role of receptor binding specificity in interspecies transmission of influenza viruses</article-title>
<source>Curr Opin Virol</source>
<year>2012</year>
<volume>2</volume>
<issue>2</issue>
<fpage>160</fpage>
<lpage>167</lpage>
<pub-id pub-id-type="doi">10.1016/j.coviro.2012.03.003</pub-id>
<pub-id pub-id-type="pmid">22445963</pub-id>
</element-citation>
</ref>
<ref id="CR10">
<label>10.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Conenello</surname>
<given-names>GM</given-names>
</name>
<name>
<surname>Zamarin</surname>
<given-names>D</given-names>
</name>
<name>
<surname>Perrone</surname>
<given-names>LA</given-names>
</name>
<name>
<surname>Tumpey</surname>
<given-names>T</given-names>
</name>
<name>
<surname>Palese</surname>
<given-names>P</given-names>
</name>
</person-group>
<article-title>A single mutation in the PB1-F2 of H5N1 (HK/97) and 1918 influenza A viruses contributes to increased virulence</article-title>
<source>PLoS Pathog</source>
<year>2007</year>
<volume>3</volume>
<issue>10</issue>
<fpage>1414</fpage>
<lpage>1421</lpage>
<pub-id pub-id-type="doi">10.1371/journal.ppat.0030141</pub-id>
<pub-id pub-id-type="pmid">17922571</pub-id>
</element-citation>
</ref>
<ref id="CR11">
<label>11.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Song</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Xu</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Shi</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Li</surname>
<given-names>Y</given-names>
</name>
<name>
<surname>Chen</surname>
<given-names>H</given-names>
</name>
</person-group>
<article-title>Synergistic effect of S224P and N383D substitutions in the PA of H5N1 avian influenza virus contributes to mammalian adaptation</article-title>
<source>Sci Rep</source>
<year>2015</year>
<volume>5</volume>
<fpage>10510</fpage>
<pub-id pub-id-type="doi">10.1038/srep10510</pub-id>
<pub-id pub-id-type="pmid">26000865</pub-id>
</element-citation>
</ref>
<ref id="CR12">
<label>12.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Seyer</surname>
<given-names>R</given-names>
</name>
<name>
<surname>Hrincius</surname>
<given-names>ER</given-names>
</name>
<name>
<surname>Ritzel</surname>
<given-names>D</given-names>
</name>
<name>
<surname>Abt</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Mellmann</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Marjuki</surname>
<given-names>H</given-names>
</name>
<etal></etal>
</person-group>
<article-title>Synergistic adaptive mutations in the hemagglutinin and polymerase acidic protein lead to increased virulence of pandemic 2009 H1N1 influenza A virus in mice</article-title>
<source>J Infect Dis</source>
<year>2012</year>
<volume>205</volume>
<issue>2</issue>
<fpage>262</fpage>
<lpage>271</lpage>
<pub-id pub-id-type="doi">10.1093/infdis/jir716</pub-id>
<pub-id pub-id-type="pmid">22102733</pub-id>
</element-citation>
</ref>
<ref id="CR13">
<label>13.</label>
<mixed-citation publication-type="other">Peng Y, Zhu W, Feng Z, Zhu Z, Zhang Z, Chen Y, et al. Identification of genome-wide nucleotide sites associated with mammalian virulence in influenza A viruses. bioRxiv. 2018;416586. 10.1101/416586.</mixed-citation>
</ref>
<ref id="CR14">
<label>14.</label>
<mixed-citation publication-type="other">York IA, Stevens J, Alymova IV. Influenza virus N-linked glycosylation and innate immunity. Biosci Rep. 2019;39(1):BSR20171505.</mixed-citation>
</ref>
<ref id="CR15">
<label>15.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Lycett</surname>
<given-names>SJ</given-names>
</name>
<name>
<surname>Ward</surname>
<given-names>MJ</given-names>
</name>
<name>
<surname>Lewis</surname>
<given-names>FI</given-names>
</name>
<name>
<surname>Poon</surname>
<given-names>AF</given-names>
</name>
<name>
<surname>Kosakovsky Pond</surname>
<given-names>SL</given-names>
</name>
<name>
<surname>Brown</surname>
<given-names>AJ</given-names>
</name>
</person-group>
<article-title>Detection of mammalian virulence determinants in highly pathogenic avian influenza H5N1 viruses: multivariate analysis of published data</article-title>
<source>J Virol</source>
<year>2009</year>
<volume>83</volume>
<issue>19</issue>
<fpage>9901</fpage>
<lpage>9910</lpage>
<pub-id pub-id-type="doi">10.1128/JVI.00608-09</pub-id>
<pub-id pub-id-type="pmid">19625397</pub-id>
</element-citation>
</ref>
<ref id="CR16">
<label>16.</label>
<mixed-citation publication-type="other">Casadevall A. The Pathogenic Potential of a Microbe. mSphere. 2017;2(1):e00015-17.</mixed-citation>
</ref>
<ref id="CR17">
<label>17.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Holte</surname>
<given-names>R</given-names>
</name>
</person-group>
<article-title>Very simple classification rules perform well on most commonly used datasets</article-title>
<source>Mach Learn</source>
<year>1993</year>
<volume>11</volume>
<fpage>63</fpage>
<lpage>91</lpage>
<pub-id pub-id-type="doi">10.1023/A:1022631118932</pub-id>
</element-citation>
</ref>
<ref id="CR18">
<label>18.</label>
<element-citation publication-type="book">
<person-group person-group-type="author">
<name>
<surname>Cohen</surname>
<given-names>WW</given-names>
</name>
</person-group>
<person-group person-group-type="editor">
<name>
<surname>Prieditis</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Russell</surname>
<given-names>S</given-names>
</name>
</person-group>
<article-title>Fast effective rule induction</article-title>
<source>Proceedings of the twelfth international conference on machine learning</source>
<year>1995</year>
<publisher-loc>San Francisco</publisher-loc>
<publisher-name>Morgan Kaufmann Publishers Inc.</publisher-name>
</element-citation>
</ref>
<ref id="CR19">
<label>19.</label>
<element-citation publication-type="book">
<person-group person-group-type="author">
<name>
<surname>Frank</surname>
<given-names>E</given-names>
</name>
<name>
<surname>Witten</surname>
<given-names>IH</given-names>
</name>
</person-group>
<person-group person-group-type="editor">
<name>
<surname>Shavlik</surname>
<given-names>J</given-names>
</name>
</person-group>
<article-title>Generating accurate rule sets without global optimization</article-title>
<source>ICML ’98 proceedings of the fifteenth international conference on machine learning</source>
<year>1998</year>
<publisher-loc>San Francisco</publisher-loc>
<publisher-name>Morgan Kaufmann Publishers Inc.</publisher-name>
</element-citation>
</ref>
<ref id="CR20">
<label>20.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Breiman</surname>
<given-names>L</given-names>
</name>
</person-group>
<article-title>Random forests</article-title>
<source>Mach Learn</source>
<year>2001</year>
<volume>45</volume>
<issue>1</issue>
<fpage>5</fpage>
<lpage>32</lpage>
<pub-id pub-id-type="doi">10.1023/A:1010933404324</pub-id>
</element-citation>
</ref>
<ref id="CR21">
<label>21.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Mair</surname>
<given-names>CM</given-names>
</name>
<name>
<surname>Ludwig</surname>
<given-names>K</given-names>
</name>
<name>
<surname>Herrmann</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Sieben</surname>
<given-names>C</given-names>
</name>
</person-group>
<article-title>Receptor binding and pH stability – how influenza A virus hemagglutinin affects host-specific virus infection</article-title>
<source>Biochim Biophys Acta</source>
<year>2014</year>
<volume>1838</volume>
<issue>4</issue>
<fpage>1153</fpage>
<lpage>1168</lpage>
<pub-id pub-id-type="doi">10.1016/j.bbamem.2013.10.004</pub-id>
<pub-id pub-id-type="pmid">24161712</pub-id>
</element-citation>
</ref>
<ref id="CR22">
<label>22.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Arai</surname>
<given-names>Y</given-names>
</name>
<name>
<surname>Kawashita</surname>
<given-names>N</given-names>
</name>
<name>
<surname>Hotta</surname>
<given-names>K</given-names>
</name>
<name>
<surname>Hoang</surname>
<given-names>PVM</given-names>
</name>
<name>
<surname>Nguyen</surname>
<given-names>HLK</given-names>
</name>
<name>
<surname>Nguyen</surname>
<given-names>TC</given-names>
</name>
<etal></etal>
</person-group>
<article-title>Multiple polymerase gene mutations for human adaptation occurring in Asian H5N1 influenza virus clinical isolates</article-title>
<source>Sci Rep</source>
<year>2018</year>
<volume>8</volume>
<issue>1</issue>
<fpage>13066</fpage>
<pub-id pub-id-type="doi">10.1038/s41598-018-31397-3</pub-id>
<pub-id pub-id-type="pmid">30166556</pub-id>
</element-citation>
</ref>
<ref id="CR23">
<label>23.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Czudai-Matwich</surname>
<given-names>V</given-names>
</name>
<name>
<surname>Otte</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Matrosovich</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Gabriel</surname>
<given-names>G</given-names>
</name>
<name>
<surname>Klenk</surname>
<given-names>HD</given-names>
</name>
</person-group>
<article-title>PB2 mutations D701N and S714R promote adaptation of an influenza H5N1 virus to a mammalian host</article-title>
<source>J Virol</source>
<year>2014</year>
<volume>88</volume>
<issue>16</issue>
<fpage>8735</fpage>
<lpage>8742</lpage>
<pub-id pub-id-type="doi">10.1128/JVI.00422-14</pub-id>
<pub-id pub-id-type="pmid">24899203</pub-id>
</element-citation>
</ref>
<ref id="CR24">
<label>24.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Fan</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Hatta</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Kim</surname>
<given-names>JH</given-names>
</name>
<name>
<surname>Halfmann</surname>
<given-names>P</given-names>
</name>
<name>
<surname>Imai</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Macken</surname>
<given-names>CA</given-names>
</name>
<etal></etal>
</person-group>
<article-title>Novel residues in avian influenza virus PB2 protein affect virulence in mammalian hosts</article-title>
<source>Nat Commun</source>
<year>2014</year>
<volume>5</volume>
<fpage>5021</fpage>
<pub-id pub-id-type="doi">10.1038/ncomms6021</pub-id>
<pub-id pub-id-type="pmid">25289523</pub-id>
</element-citation>
</ref>
<ref id="CR25">
<label>25.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Wang</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Sun</surname>
<given-names>Y</given-names>
</name>
<name>
<surname>Xu</surname>
<given-names>Q</given-names>
</name>
<name>
<surname>Tan</surname>
<given-names>Y</given-names>
</name>
<name>
<surname>Pu</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Yang</surname>
<given-names>H</given-names>
</name>
<etal></etal>
</person-group>
<article-title>Mouse-adapted H9N2 influenza A virus PB2 protein M147L and E627K mutations are critical for high virulence</article-title>
<source>PLoS One</source>
<year>2012</year>
<volume>7</volume>
<issue>7</issue>
<fpage>e40752</fpage>
<pub-id pub-id-type="doi">10.1371/journal.pone.0040752</pub-id>
<pub-id pub-id-type="pmid">22808250</pub-id>
</element-citation>
</ref>
<ref id="CR26">
<label>26.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Wang</surname>
<given-names>X</given-names>
</name>
<name>
<surname>Chen</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Wang</surname>
<given-names>D</given-names>
</name>
<name>
<surname>Zha</surname>
<given-names>X</given-names>
</name>
<name>
<surname>Zheng</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Qin</surname>
<given-names>T</given-names>
</name>
<etal></etal>
</person-group>
<article-title>Synergistic effect of PB2 283M and 526R contributes to enhanced virulence of H5N8 influenza viruses in mice</article-title>
<source>Vet Res</source>
<year>2017</year>
<volume>48</volume>
<issue>1</issue>
<fpage>67</fpage>
<pub-id pub-id-type="doi">10.1186/s13567-017-0471-0</pub-id>
<pub-id pub-id-type="pmid">29070059</pub-id>
</element-citation>
</ref>
<ref id="CR27">
<label>27.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Sediri</surname>
<given-names>H</given-names>
</name>
<name>
<surname>Thiele</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Schwalm</surname>
<given-names>F</given-names>
</name>
<name>
<surname>Gabriel</surname>
<given-names>G</given-names>
</name>
<name>
<surname>Klenk</surname>
<given-names>HD</given-names>
</name>
</person-group>
<article-title>PB2 subunit of avian influenza virus subtype H9N2: a pandemic risk factor</article-title>
<source>J Gen Virol</source>
<year>2016</year>
<volume>97</volume>
<issue>1</issue>
<fpage>39</fpage>
<lpage>48</lpage>
<pub-id pub-id-type="doi">10.1099/jgv.0.000333</pub-id>
<pub-id pub-id-type="pmid">26560088</pub-id>
</element-citation>
</ref>
<ref id="CR28">
<label>28.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Fan</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Macken</surname>
<given-names>CA</given-names>
</name>
<name>
<surname>Li</surname>
<given-names>C</given-names>
</name>
<name>
<surname>Ozawa</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Goto</surname>
<given-names>H</given-names>
</name>
<name>
<surname>Iswahyudi</surname>
<given-names>NF</given-names>
</name>
<etal></etal>
</person-group>
<article-title>Synergistic effect of the PDZ and p85beta-binding domains of the NS1 protein on virulence of an avian H5N1 influenza A virus</article-title>
<source>J Virol</source>
<year>2013</year>
<volume>87</volume>
<issue>9</issue>
<fpage>4861</fpage>
<lpage>4871</lpage>
<pub-id pub-id-type="doi">10.1128/JVI.02608-12</pub-id>
<pub-id pub-id-type="pmid">23408626</pub-id>
</element-citation>
</ref>
<ref id="CR29">
<label>29.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Pu</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Wang</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Zhang</surname>
<given-names>Y</given-names>
</name>
<name>
<surname>Fu</surname>
<given-names>G</given-names>
</name>
<name>
<surname>Bi</surname>
<given-names>Y</given-names>
</name>
<name>
<surname>Sun</surname>
<given-names>Y</given-names>
</name>
<etal></etal>
</person-group>
<article-title>Synergism of co-mutation of two amino acid residues in NS1 protein increases the pathogenicity of influenza virus in mice</article-title>
<source>Virus Res</source>
<year>2010</year>
<volume>151</volume>
<issue>2</issue>
<fpage>200</fpage>
<lpage>204</lpage>
<pub-id pub-id-type="doi">10.1016/j.virusres.2010.05.007</pub-id>
<pub-id pub-id-type="pmid">20546807</pub-id>
</element-citation>
</ref>
<ref id="CR30">
<label>30.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Chen</surname>
<given-names>H</given-names>
</name>
<name>
<surname>Bright</surname>
<given-names>RA</given-names>
</name>
<name>
<surname>Subbarao</surname>
<given-names>K</given-names>
</name>
<name>
<surname>Smith</surname>
<given-names>C</given-names>
</name>
<name>
<surname>Cox</surname>
<given-names>NJ</given-names>
</name>
<name>
<surname>Katz</surname>
<given-names>JM</given-names>
</name>
<etal></etal>
</person-group>
<article-title>Polygenic virulence factors involved in pathogenesis of 1997 Hong Kong H5N1 influenza viruses in mice</article-title>
<source>Virus Res</source>
<year>2007</year>
<volume>128</volume>
<issue>1–2</issue>
<fpage>159</fpage>
<lpage>163</lpage>
<pub-id pub-id-type="doi">10.1016/j.virusres.2007.04.017</pub-id>
<pub-id pub-id-type="pmid">17521765</pub-id>
</element-citation>
</ref>
<ref id="CR31">
<label>31.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Cheng</surname>
<given-names>K</given-names>
</name>
<name>
<surname>Yu</surname>
<given-names>Z</given-names>
</name>
<name>
<surname>Chai</surname>
<given-names>H</given-names>
</name>
<name>
<surname>Sun</surname>
<given-names>W</given-names>
</name>
<name>
<surname>Xin</surname>
<given-names>Y</given-names>
</name>
<name>
<surname>Zhang</surname>
<given-names>Q</given-names>
</name>
<etal></etal>
</person-group>
<article-title>PB2-E627K and PA-T97I substitutions enhance polymerase activity and confer a virulent phenotype to an H6N1 avian influenza virus in mice</article-title>
<source>Virology</source>
<year>2014</year>
<volume>468–470</volume>
<fpage>207</fpage>
<lpage>213</lpage>
<pub-id pub-id-type="doi">10.1016/j.virol.2014.08.010</pub-id>
<pub-id pub-id-type="pmid">25194918</pub-id>
</element-citation>
</ref>
<ref id="CR32">
<label>32.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Katz</surname>
<given-names>JM</given-names>
</name>
<name>
<surname>Lu</surname>
<given-names>X</given-names>
</name>
<name>
<surname>Tumpey</surname>
<given-names>TM</given-names>
</name>
<name>
<surname>Smith</surname>
<given-names>CB</given-names>
</name>
<name>
<surname>Shaw</surname>
<given-names>MW</given-names>
</name>
<name>
<surname>Subbarao</surname>
<given-names>K</given-names>
</name>
</person-group>
<article-title>Molecular correlates of influenza A H5N1 virus pathogenesis in mice</article-title>
<source>J Virol</source>
<year>2000</year>
<volume>74</volume>
<issue>22</issue>
<fpage>10807</fpage>
<lpage>10810</lpage>
<pub-id pub-id-type="doi">10.1128/JVI.74.22.10807-10810.2000</pub-id>
<pub-id pub-id-type="pmid">11044127</pub-id>
</element-citation>
</ref>
<ref id="CR33">
<label>33.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Li</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Li</surname>
<given-names>Y</given-names>
</name>
<name>
<surname>Hu</surname>
<given-names>Y</given-names>
</name>
<name>
<surname>Chang</surname>
<given-names>G</given-names>
</name>
<name>
<surname>Sun</surname>
<given-names>W</given-names>
</name>
<name>
<surname>Yang</surname>
<given-names>Y</given-names>
</name>
<etal></etal>
</person-group>
<article-title>PB1-mediated virulence attenuation of H5N1 influenza virus in mice is associated with PB2</article-title>
<source>J Gen Virol</source>
<year>2011</year>
<volume>92</volume>
<issue>Pt 6</issue>
<fpage>1435</fpage>
<lpage>1444</lpage>
<pub-id pub-id-type="doi">10.1099/vir.0.030718-0</pub-id>
<pub-id pub-id-type="pmid">21367983</pub-id>
</element-citation>
</ref>
<ref id="CR34">
<label>34.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Ping</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Dankar</surname>
<given-names>SK</given-names>
</name>
<name>
<surname>Forbes</surname>
<given-names>NE</given-names>
</name>
<name>
<surname>Keleta</surname>
<given-names>L</given-names>
</name>
<name>
<surname>Zhou</surname>
<given-names>Y</given-names>
</name>
<name>
<surname>Tyler</surname>
<given-names>S</given-names>
</name>
<etal></etal>
</person-group>
<article-title>PB2 and hemagglutinin mutations are major determinants of host range and virulence in mouse-adapted influenza A virus</article-title>
<source>J Virol</source>
<year>2010</year>
<volume>84</volume>
<issue>20</issue>
<fpage>10606</fpage>
<lpage>10618</lpage>
<pub-id pub-id-type="doi">10.1128/JVI.01187-10</pub-id>
<pub-id pub-id-type="pmid">20702632</pub-id>
</element-citation>
</ref>
<ref id="CR35">
<label>35.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Song</surname>
<given-names>MS</given-names>
</name>
<name>
<surname>Pascua</surname>
<given-names>PN</given-names>
</name>
<name>
<surname>Lee</surname>
<given-names>JH</given-names>
</name>
<name>
<surname>Baek</surname>
<given-names>YH</given-names>
</name>
<name>
<surname>Park</surname>
<given-names>KJ</given-names>
</name>
<name>
<surname>Kwon</surname>
<given-names>HI</given-names>
</name>
<etal></etal>
</person-group>
<article-title>Virulence and genetic compatibility of polymerase reassortant viruses derived from the pandemic (H1N1) 2009 influenza virus and circulating influenza A viruses</article-title>
<source>J Virol</source>
<year>2011</year>
<volume>85</volume>
<issue>13</issue>
<fpage>6275</fpage>
<lpage>6286</lpage>
<pub-id pub-id-type="doi">10.1128/JVI.02125-10</pub-id>
<pub-id pub-id-type="pmid">21507962</pub-id>
</element-citation>
</ref>
<ref id="CR36">
<label>36.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Zhang</surname>
<given-names>X</given-names>
</name>
<name>
<surname>Xu</surname>
<given-names>G</given-names>
</name>
<name>
<surname>Wang</surname>
<given-names>C</given-names>
</name>
<name>
<surname>Jiang</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Gao</surname>
<given-names>W</given-names>
</name>
<name>
<surname>Wang</surname>
<given-names>M</given-names>
</name>
<etal></etal>
</person-group>
<article-title>Enhanced pathogenicity and neurotropism of mouse-adapted H10N7 influenza virus are mediated by novel PB2 and NA mutations</article-title>
<source>J Gen Virol</source>
<year>2017</year>
<volume>98</volume>
<issue>6</issue>
<fpage>1185</fpage>
<lpage>1195</lpage>
<pub-id pub-id-type="doi">10.1099/jgv.0.000770</pub-id>
<pub-id pub-id-type="pmid">28597818</pub-id>
</element-citation>
</ref>
<ref id="CR37">
<label>37.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Bussey</surname>
<given-names>KA</given-names>
</name>
<name>
<surname>Bousse</surname>
<given-names>TL</given-names>
</name>
<name>
<surname>Desmet</surname>
<given-names>EA</given-names>
</name>
<name>
<surname>Kim</surname>
<given-names>B</given-names>
</name>
<name>
<surname>Takimoto</surname>
<given-names>T</given-names>
</name>
</person-group>
<article-title>PB2 residue 271 plays a key role in enhanced polymerase activity of influenza A viruses in mammalian host cells</article-title>
<source>J Virol</source>
<year>2010</year>
<volume>84</volume>
<issue>9</issue>
<fpage>4395</fpage>
<lpage>4406</lpage>
<pub-id pub-id-type="doi">10.1128/JVI.02642-09</pub-id>
<pub-id pub-id-type="pmid">20181719</pub-id>
</element-citation>
</ref>
<ref id="CR38">
<label>38.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Hatta</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Gao</surname>
<given-names>P</given-names>
</name>
<name>
<surname>Halfmann</surname>
<given-names>P</given-names>
</name>
<name>
<surname>Kawaoka</surname>
<given-names>Y</given-names>
</name>
</person-group>
<article-title>Molecular basis for high virulence of Hong Kong H5N1 influenza A viruses</article-title>
<source>Science (New York, NY)</source>
<year>2001</year>
<volume>293</volume>
<issue>5536</issue>
<fpage>1840</fpage>
<lpage>1842</lpage>
<pub-id pub-id-type="doi">10.1126/science.1062882</pub-id>
</element-citation>
</ref>
<ref id="CR39">
<label>39.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Sun</surname>
<given-names>H</given-names>
</name>
<name>
<surname>Cui</surname>
<given-names>P</given-names>
</name>
<name>
<surname>Song</surname>
<given-names>Y</given-names>
</name>
<name>
<surname>Qi</surname>
<given-names>Y</given-names>
</name>
<name>
<surname>Li</surname>
<given-names>X</given-names>
</name>
<name>
<surname>Qi</surname>
<given-names>W</given-names>
</name>
<etal></etal>
</person-group>
<article-title>PB2 segment promotes high-pathogenicity of H5N1 avian influenza viruses in mice</article-title>
<source>Front Microbiol</source>
<year>2015</year>
<volume>6</volume>
<fpage>73</fpage>
<pub-id pub-id-type="pmid">25713566</pub-id>
</element-citation>
</ref>
<ref id="CR40">
<label>40.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Park</surname>
<given-names>SJ</given-names>
</name>
<name>
<surname>Kim</surname>
<given-names>EH</given-names>
</name>
<name>
<surname>Kwon</surname>
<given-names>HI</given-names>
</name>
<name>
<surname>Song</surname>
<given-names>MS</given-names>
</name>
<name>
<surname>Kim</surname>
<given-names>SM</given-names>
</name>
<name>
<surname>Kim</surname>
<given-names>YI</given-names>
</name>
<etal></etal>
</person-group>
<article-title>Altered virulence of highly pathogenic avian influenza (HPAI) H5N8 reassortant viruses in mammalian models</article-title>
<source>Virulence</source>
<year>2018</year>
<volume>9</volume>
<issue>1</issue>
<fpage>133</fpage>
<lpage>148</lpage>
<pub-id pub-id-type="doi">10.1080/21505594.2017.1366408</pub-id>
<pub-id pub-id-type="pmid">28873012</pub-id>
</element-citation>
</ref>
<ref id="CR41">
<label>41.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Bi</surname>
<given-names>Y</given-names>
</name>
<name>
<surname>Xie</surname>
<given-names>Q</given-names>
</name>
<name>
<surname>Zhang</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Li</surname>
<given-names>Y</given-names>
</name>
<name>
<surname>Xiao</surname>
<given-names>H</given-names>
</name>
<name>
<surname>Jin</surname>
<given-names>T</given-names>
</name>
<etal></etal>
</person-group>
<article-title>Assessment of the internal genes of influenza A (H7N9) virus contributing to high pathogenicity in mice</article-title>
<source>J Virol</source>
<year>2015</year>
<volume>89</volume>
<issue>1</issue>
<fpage>2</fpage>
<lpage>13</lpage>
<pub-id pub-id-type="doi">10.1128/JVI.02390-14</pub-id>
<pub-id pub-id-type="pmid">25320305</pub-id>
</element-citation>
</ref>
<ref id="CR42">
<label>42.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Hu</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Yuan</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Zhang</surname>
<given-names>K</given-names>
</name>
<name>
<surname>Singh</surname>
<given-names>K</given-names>
</name>
<name>
<surname>Ma</surname>
<given-names>Q</given-names>
</name>
<name>
<surname>Zhou</surname>
<given-names>J</given-names>
</name>
<etal></etal>
</person-group>
<article-title>PB2 substitutions V598T/I increase the virulence of H7N9 influenza A virus in mammals</article-title>
<source>Virology</source>
<year>2017</year>
<volume>501</volume>
<fpage>92</fpage>
<lpage>101</lpage>
<pub-id pub-id-type="doi">10.1016/j.virol.2016.11.008</pub-id>
<pub-id pub-id-type="pmid">27889648</pub-id>
</element-citation>
</ref>
<ref id="CR43">
<label>43.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Li</surname>
<given-names>W</given-names>
</name>
<name>
<surname>Lee</surname>
<given-names>HHY</given-names>
</name>
<name>
<surname>Li</surname>
<given-names>RF</given-names>
</name>
<name>
<surname>Zhu</surname>
<given-names>HM</given-names>
</name>
<name>
<surname>Yi</surname>
<given-names>G</given-names>
</name>
<name>
<surname>Peiris</surname>
<given-names>JSM</given-names>
</name>
<etal></etal>
</person-group>
<article-title>The PB2 mutation with lysine at 627 enhances the pathogenicity of avian influenza (H7N9) virus which belongs to a non-zoonotic lineage</article-title>
<source>Sci Rep</source>
<year>2017</year>
<volume>7</volume>
<issue>1</issue>
<fpage>2352</fpage>
<pub-id pub-id-type="doi">10.1038/s41598-017-02598-z</pub-id>
<pub-id pub-id-type="pmid">28539661</pub-id>
</element-citation>
</ref>
<ref id="CR44">
<label>44.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Mok</surname>
<given-names>CK</given-names>
</name>
<name>
<surname>Lee</surname>
<given-names>HH</given-names>
</name>
<name>
<surname>Lestra</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Nicholls</surname>
<given-names>JM</given-names>
</name>
<name>
<surname>Chan</surname>
<given-names>MC</given-names>
</name>
<name>
<surname>Sia</surname>
<given-names>SF</given-names>
</name>
<etal></etal>
</person-group>
<article-title>Amino acid substitutions in polymerase basic protein 2 gene contribute to the pathogenicity of the novel A/H7N9 influenza virus in mammalian hosts</article-title>
<source>J Virol</source>
<year>2014</year>
<volume>88</volume>
<issue>6</issue>
<fpage>3568</fpage>
<lpage>3576</lpage>
<pub-id pub-id-type="doi">10.1128/JVI.02740-13</pub-id>
<pub-id pub-id-type="pmid">24403592</pub-id>
</element-citation>
</ref>
<ref id="CR45">
<label>45.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Xiao</surname>
<given-names>C</given-names>
</name>
<name>
<surname>Ma</surname>
<given-names>W</given-names>
</name>
<name>
<surname>Sun</surname>
<given-names>N</given-names>
</name>
<name>
<surname>Huang</surname>
<given-names>L</given-names>
</name>
<name>
<surname>Li</surname>
<given-names>Y</given-names>
</name>
<name>
<surname>Zeng</surname>
<given-names>Z</given-names>
</name>
<etal></etal>
</person-group>
<article-title>PB2-588 V promotes the mammalian adaptation of H10N8, H7N9 and H9N2 avian influenza viruses</article-title>
<source>Sci Rep</source>
<year>2016</year>
<volume>6</volume>
<fpage>19474</fpage>
<pub-id pub-id-type="doi">10.1038/srep19474</pub-id>
<pub-id pub-id-type="pmid">26782141</pub-id>
</element-citation>
</ref>
<ref id="CR46">
<label>46.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Wang</surname>
<given-names>C</given-names>
</name>
<name>
<surname>Lee</surname>
<given-names>HH</given-names>
</name>
<name>
<surname>Yang</surname>
<given-names>ZF</given-names>
</name>
<name>
<surname>Mok</surname>
<given-names>CK</given-names>
</name>
<name>
<surname>Zhang</surname>
<given-names>Z</given-names>
</name>
</person-group>
<article-title>PB2-Q591K mutation determines the pathogenicity of avian H9N2 influenza viruses for mammalian species</article-title>
<source>PLoS One</source>
<year>2016</year>
<volume>11</volume>
<issue>9</issue>
<fpage>e0162163</fpage>
<pub-id pub-id-type="doi">10.1371/journal.pone.0162163</pub-id>
<pub-id pub-id-type="pmid">27684944</pub-id>
</element-citation>
</ref>
<ref id="CR47">
<label>47.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Neumann</surname>
<given-names>G</given-names>
</name>
</person-group>
<article-title>H5N1 influenza virulence, pathogenicity and transmissibility: what do we know?</article-title>
<source>Future Virol</source>
<year>2015</year>
<volume>10</volume>
<issue>8</issue>
<fpage>971</fpage>
<lpage>980</lpage>
<pub-id pub-id-type="doi">10.2217/fvl.15.62</pub-id>
<pub-id pub-id-type="pmid">26617665</pub-id>
</element-citation>
</ref>
<ref id="CR48">
<label>48.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Boivin</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Hart</surname>
<given-names>DJ</given-names>
</name>
</person-group>
<article-title>Interaction of the influenza A virus polymerase PB2 C-terminal region with importin alpha isoforms provides insights into host adaptation and polymerase assembly</article-title>
<source>J Biol Chem</source>
<year>2011</year>
<volume>286</volume>
<issue>12</issue>
<fpage>10439</fpage>
<lpage>10448</lpage>
<pub-id pub-id-type="doi">10.1074/jbc.M110.182964</pub-id>
<pub-id pub-id-type="pmid">21216958</pub-id>
</element-citation>
</ref>
<ref id="CR49">
<label>49.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Gabriel</surname>
<given-names>G</given-names>
</name>
<name>
<surname>Dauber</surname>
<given-names>B</given-names>
</name>
<name>
<surname>Wolff</surname>
<given-names>T</given-names>
</name>
<name>
<surname>Planz</surname>
<given-names>O</given-names>
</name>
<name>
<surname>Klenk</surname>
<given-names>HD</given-names>
</name>
<name>
<surname>Stech</surname>
<given-names>J</given-names>
</name>
</person-group>
<article-title>The viral polymerase mediates adaptation of an avian influenza virus to a mammalian host</article-title>
<source>Proc Natl Acad Sci U S A</source>
<year>2005</year>
<volume>102</volume>
<issue>51</issue>
<fpage>18590</fpage>
<lpage>18595</lpage>
<pub-id pub-id-type="doi">10.1073/pnas.0507415102</pub-id>
<pub-id pub-id-type="pmid">16339318</pub-id>
</element-citation>
</ref>
<ref id="CR50">
<label>50.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Lee</surname>
<given-names>CY</given-names>
</name>
<name>
<surname>An</surname>
<given-names>SH</given-names>
</name>
<name>
<surname>Kim</surname>
<given-names>I</given-names>
</name>
<name>
<surname>Go</surname>
<given-names>DM</given-names>
</name>
<name>
<surname>Kim</surname>
<given-names>DY</given-names>
</name>
<name>
<surname>Choi</surname>
<given-names>JG</given-names>
</name>
<etal></etal>
</person-group>
<article-title>Prerequisites for the acquisition of mammalian pathogenicity by influenza A virus with a prototypic avian PB2 gene</article-title>
<source>Sci Rep</source>
<year>2017</year>
<volume>7</volume>
<issue>1</issue>
<fpage>10205</fpage>
<pub-id pub-id-type="doi">10.1038/s41598-017-09560-z</pub-id>
<pub-id pub-id-type="pmid">28860593</pub-id>
</element-citation>
</ref>
<ref id="CR51">
<label>51.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Zhou</surname>
<given-names>B</given-names>
</name>
<name>
<surname>Li</surname>
<given-names>Y</given-names>
</name>
<name>
<surname>Halpin</surname>
<given-names>R</given-names>
</name>
<name>
<surname>Hine</surname>
<given-names>E</given-names>
</name>
<name>
<surname>Spiro</surname>
<given-names>DJ</given-names>
</name>
<name>
<surname>Wentworth</surname>
<given-names>DE</given-names>
</name>
</person-group>
<article-title>PB2 residue 158 is a pathogenic determinant of pandemic H1N1 and H5 influenza A viruses in mice</article-title>
<source>J Virol</source>
<year>2011</year>
<volume>85</volume>
<issue>1</issue>
<fpage>357</fpage>
<lpage>365</lpage>
<pub-id pub-id-type="doi">10.1128/JVI.01694-10</pub-id>
<pub-id pub-id-type="pmid">20962098</pub-id>
</element-citation>
</ref>
<ref id="CR52">
<label>52.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Kato</surname>
<given-names>YS</given-names>
</name>
<name>
<surname>Fukui</surname>
<given-names>K</given-names>
</name>
<name>
<surname>Suzuki</surname>
<given-names>K</given-names>
</name>
</person-group>
<article-title>Mechanism of a mutation in non-structural protein 1 inducing high pathogenicity of avian influenza virus H5N1</article-title>
<source>Protein Pept Lett</source>
<year>2016</year>
<volume>23</volume>
<issue>4</issue>
<fpage>372</fpage>
<lpage>378</lpage>
<pub-id pub-id-type="doi">10.2174/0929866523666160204124406</pub-id>
<pub-id pub-id-type="pmid">26845765</pub-id>
</element-citation>
</ref>
<ref id="CR53">
<label>53.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Cheng</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Zhang</surname>
<given-names>C</given-names>
</name>
<name>
<surname>Tao</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Li</surname>
<given-names>B</given-names>
</name>
<name>
<surname>Shi</surname>
<given-names>Y</given-names>
</name>
<name>
<surname>Liu</surname>
<given-names>H</given-names>
</name>
</person-group>
<article-title>Effects of the S42 residue of the H1N1 swine influenza virus NS1 protein on interferon responses and virus replication</article-title>
<source>Virol J</source>
<year>2018</year>
<volume>15</volume>
<issue>1</issue>
<fpage>57</fpage>
<pub-id pub-id-type="doi">10.1186/s12985-018-0971-1</pub-id>
<pub-id pub-id-type="pmid">29587786</pub-id>
</element-citation>
</ref>
<ref id="CR54">
<label>54.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Fan</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Deng</surname>
<given-names>G</given-names>
</name>
<name>
<surname>Song</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Tian</surname>
<given-names>G</given-names>
</name>
<name>
<surname>Suo</surname>
<given-names>Y</given-names>
</name>
<name>
<surname>Jiang</surname>
<given-names>Y</given-names>
</name>
<etal></etal>
</person-group>
<article-title>Two amino acid residues in the matrix protein M1 contribute to the virulence difference of H5N1 avian influenza viruses in mice</article-title>
<source>Virology</source>
<year>2009</year>
<volume>384</volume>
<issue>1</issue>
<fpage>28</fpage>
<lpage>32</lpage>
<pub-id pub-id-type="doi">10.1016/j.virol.2008.11.044</pub-id>
<pub-id pub-id-type="pmid">19117585</pub-id>
</element-citation>
</ref>
<ref id="CR55">
<label>55.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Blazejewska</surname>
<given-names>P</given-names>
</name>
<name>
<surname>Koscinski</surname>
<given-names>L</given-names>
</name>
<name>
<surname>Viegas</surname>
<given-names>N</given-names>
</name>
<name>
<surname>Anhlan</surname>
<given-names>D</given-names>
</name>
<name>
<surname>Ludwig</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Schughart</surname>
<given-names>K</given-names>
</name>
</person-group>
<article-title>Pathogenicity of different PR8 influenza A virus variants in mice is determined by both viral and host factors</article-title>
<source>Virology</source>
<year>2011</year>
<volume>412</volume>
<issue>1</issue>
<fpage>36</fpage>
<lpage>45</lpage>
<pub-id pub-id-type="doi">10.1016/j.virol.2010.12.047</pub-id>
<pub-id pub-id-type="pmid">21256531</pub-id>
</element-citation>
</ref>
<ref id="CR56">
<label>56.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Boon</surname>
<given-names>AC</given-names>
</name>
<name>
<surname>de Beauchamp</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Hollmann</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Luke</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Kotb</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Rowe</surname>
<given-names>S</given-names>
</name>
<etal></etal>
</person-group>
<article-title>Host genetic variation affects resistance to infection with a highly pathogenic H5N1 influenza A virus in mice</article-title>
<source>J Virol</source>
<year>2009</year>
<volume>83</volume>
<issue>20</issue>
<fpage>10417</fpage>
<lpage>10426</lpage>
<pub-id pub-id-type="doi">10.1128/JVI.00514-09</pub-id>
<pub-id pub-id-type="pmid">19706712</pub-id>
</element-citation>
</ref>
<ref id="CR57">
<label>57.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Davidson</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Crotta</surname>
<given-names>S</given-names>
</name>
<name>
<surname>McCabe</surname>
<given-names>TM</given-names>
</name>
<name>
<surname>Wack</surname>
<given-names>A</given-names>
</name>
</person-group>
<article-title>Pathogenic potential of interferon αβ in acute influenza infection</article-title>
<source>Nat Commun</source>
<year>2014</year>
<volume>5</volume>
<fpage>3864</fpage>
<pub-id pub-id-type="doi">10.1038/ncomms4864</pub-id>
<pub-id pub-id-type="pmid">24844667</pub-id>
</element-citation>
</ref>
<ref id="CR58">
<label>58.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Pica</surname>
<given-names>N</given-names>
</name>
<name>
<surname>Iyer</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Ramos</surname>
<given-names>I</given-names>
</name>
<name>
<surname>Bouvier</surname>
<given-names>NM</given-names>
</name>
<name>
<surname>Fernandez-Sesma</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Garcia-Sastre</surname>
<given-names>A</given-names>
</name>
<etal></etal>
</person-group>
<article-title>The DBA.2 mouse is susceptible to disease following infection with a broad, but limited, range of influenza A and B viruses</article-title>
<source>J Virol</source>
<year>2011</year>
<volume>85</volume>
<issue>23</issue>
<fpage>12825</fpage>
<lpage>12829</lpage>
<pub-id pub-id-type="doi">10.1128/JVI.05930-11</pub-id>
<pub-id pub-id-type="pmid">21917963</pub-id>
</element-citation>
</ref>
<ref id="CR59">
<label>59.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Srivastava</surname>
<given-names>B</given-names>
</name>
<name>
<surname>Blazejewska</surname>
<given-names>P</given-names>
</name>
<name>
<surname>Hessmann</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Bruder</surname>
<given-names>D</given-names>
</name>
<name>
<surname>Geffers</surname>
<given-names>R</given-names>
</name>
<name>
<surname>Mauel</surname>
<given-names>S</given-names>
</name>
<etal></etal>
</person-group>
<article-title>Host genetic background strongly influences the response to influenza A virus infections</article-title>
<source>PLoS One</source>
<year>2009</year>
<volume>4</volume>
<issue>3</issue>
<fpage>e4857</fpage>
<pub-id pub-id-type="doi">10.1371/journal.pone.0004857</pub-id>
<pub-id pub-id-type="pmid">19293935</pub-id>
</element-citation>
</ref>
<ref id="CR60">
<label>60.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Ye</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Sorrell</surname>
<given-names>EM</given-names>
</name>
<name>
<surname>Cai</surname>
<given-names>Y</given-names>
</name>
<name>
<surname>Shao</surname>
<given-names>H</given-names>
</name>
<name>
<surname>Xu</surname>
<given-names>K</given-names>
</name>
<name>
<surname>Pena</surname>
<given-names>L</given-names>
</name>
<etal></etal>
</person-group>
<article-title>Variations in the hemagglutinin of the 2009 H1N1 pandemic virus: potential for strains with altered virulence phenotype?</article-title>
<source>PLoS Pathog</source>
<year>2010</year>
<volume>6</volume>
<issue>10</issue>
<fpage>e1001145</fpage>
<pub-id pub-id-type="doi">10.1371/journal.ppat.1001145</pub-id>
<pub-id pub-id-type="pmid">20976194</pub-id>
</element-citation>
</ref>
<ref id="CR61">
<label>61.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Zhou</surname>
<given-names>K</given-names>
</name>
<name>
<surname>Wang</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Li</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Zhao</surname>
<given-names>W</given-names>
</name>
<name>
<surname>Wang</surname>
<given-names>D</given-names>
</name>
<name>
<surname>Zhang</surname>
<given-names>W</given-names>
</name>
<etal></etal>
</person-group>
<article-title>Swift and strong NK cell responses protect 129 mice against high-dose influenza virus infection</article-title>
<source>J Immunol</source>
<year>2016</year>
<volume>196</volume>
<issue>4</issue>
<fpage>1842</fpage>
<lpage>1854</lpage>
<pub-id pub-id-type="doi">10.4049/jimmunol.1501486</pub-id>
<pub-id pub-id-type="pmid">26773146</pub-id>
</element-citation>
</ref>
<ref id="CR62">
<label>62.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Eisfeld</surname>
<given-names>AJ</given-names>
</name>
<name>
<surname>Gasper</surname>
<given-names>DJ</given-names>
</name>
<name>
<surname>Suresh</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Kawaoka</surname>
<given-names>Y</given-names>
</name>
</person-group>
<article-title>C57BL/6J and C57BL/6NJ mice are differentially susceptible to inflammation-associated disease caused by influenza A virus</article-title>
<source>Front Microbiol</source>
<year>2018</year>
<volume>9</volume>
<fpage>3307</fpage>
<pub-id pub-id-type="doi">10.3389/fmicb.2018.03307</pub-id>
<pub-id pub-id-type="pmid">30713529</pub-id>
</element-citation>
</ref>
<ref id="CR63">
<label>63.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Reed</surname>
<given-names>LJ</given-names>
</name>
<name>
<surname>Muench</surname>
<given-names>H</given-names>
</name>
</person-group>
<article-title>A simple method of estimating fifty percent endpoints</article-title>
<source>Am J Epidemiol</source>
<year>1938</year>
<volume>27</volume>
<issue>3</issue>
<fpage>493</fpage>
<lpage>497</lpage>
<pub-id pub-id-type="doi">10.1093/oxfordjournals.aje.a118408</pub-id>
</element-citation>
</ref>
<ref id="CR64">
<label>64.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<collab>World Health Organization</collab>
</person-group>
<article-title>Production of pilot lots of inactivated influenza vaccine in response to a pandemic threat: an interim biosafety risk assessment</article-title>
<source>Relevé épidémiologique hebdomadaire</source>
<year>2003</year>
<volume>78</volume>
<issue>47</issue>
<fpage>405</fpage>
<lpage>408</lpage>
<pub-id pub-id-type="pmid">14677515</pub-id>
</element-citation>
</ref>
<ref id="CR65">
<label>65.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Bao</surname>
<given-names>Y</given-names>
</name>
<name>
<surname>Bolotov</surname>
<given-names>P</given-names>
</name>
<name>
<surname>Dernovoy</surname>
<given-names>D</given-names>
</name>
<name>
<surname>Kiryutin</surname>
<given-names>B</given-names>
</name>
<name>
<surname>Zaslavsky</surname>
<given-names>L</given-names>
</name>
<name>
<surname>Tatusova</surname>
<given-names>T</given-names>
</name>
<etal></etal>
</person-group>
<article-title>The influenza virus resource at the National Center for Biotechnology Information</article-title>
<source>J Virol</source>
<year>2008</year>
<volume>82</volume>
<issue>2</issue>
<fpage>596</fpage>
<lpage>601</lpage>
<pub-id pub-id-type="doi">10.1128/JVI.02005-07</pub-id>
<pub-id pub-id-type="pmid">17942553</pub-id>
</element-citation>
</ref>
<ref id="CR66">
<label>66.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Sayers</surname>
<given-names>EW</given-names>
</name>
<name>
<surname>Cavanaugh</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Clark</surname>
<given-names>K</given-names>
</name>
<name>
<surname>Ostell</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Pruitt</surname>
<given-names>KD</given-names>
</name>
<name>
<surname>Karsch-Mizrachi</surname>
<given-names>I</given-names>
</name>
</person-group>
<article-title>GenBank</article-title>
<source>Nucleic Acids Res</source>
<year>2019</year>
<volume>47</volume>
<issue>D1</issue>
<fpage>D94</fpage>
<lpage>D99</lpage>
<pub-id pub-id-type="doi">10.1093/nar/gky989</pub-id>
<pub-id pub-id-type="pmid">30365038</pub-id>
</element-citation>
</ref>
<ref id="CR67">
<label>67.</label>
<mixed-citation publication-type="other">Shu Y, McCauley J. GISAID: global initiative on sharing all influenza data – from vision to reality. Euro Surveill. 2017;22(13).</mixed-citation>
</ref>
<ref id="CR68">
<label>68.</label>
<mixed-citation publication-type="other">Ivan FX. Virulence information for influenza virus infections (VI2VI) in mice. DR-NTU (Data); 2019.</mixed-citation>
</ref>
<ref id="CR69">
<label>69.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Hornik</surname>
<given-names>K</given-names>
</name>
<name>
<surname>Buchta</surname>
<given-names>C</given-names>
</name>
<name>
<surname>Zeileis</surname>
<given-names>A</given-names>
</name>
</person-group>
<article-title>Open-source machine learning: R meets Weka</article-title>
<source>Comput Stat</source>
<year>2009</year>
<volume>24</volume>
<issue>2</issue>
<fpage>225</fpage>
<lpage>232</lpage>
<pub-id pub-id-type="doi">10.1007/s00180-008-0119-7</pub-id>
</element-citation>
</ref>
<ref id="CR70">
<label>70.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Liaw</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Wiener</surname>
<given-names>M</given-names>
</name>
</person-group>
<article-title>Classification and regression by randomForest</article-title>
<source>R News</source>
<year>2002</year>
<volume>2</volume>
<issue>3</issue>
<fpage>18</fpage>
<lpage>22</lpage>
</element-citation>
</ref>
<ref id="CR71">
<label>71.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Edgar</surname>
<given-names>RC</given-names>
</name>
</person-group>
<article-title>MUSCLE: multiple sequence alignment with high accuracy and high throughput</article-title>
<source>Nucleic Acids Res</source>
<year>2004</year>
<volume>32</volume>
<issue>5</issue>
<fpage>1792</fpage>
<lpage>1797</lpage>
<pub-id pub-id-type="doi">10.1093/nar/gkh340</pub-id>
<pub-id pub-id-type="pmid">15034147</pub-id>
</element-citation>
</ref>
<ref id="CR72">
<label>72.</label>
<element-citation publication-type="book">
<person-group person-group-type="author">
<name>
<surname>Venables</surname>
<given-names>WN</given-names>
</name>
<name>
<surname>Ripley</surname>
<given-names>BD</given-names>
</name>
</person-group>
<source>Modern applied statistics with S</source>
<year>2002</year>
<edition>4</edition>
<publisher-loc>New York</publisher-loc>
<publisher-name>Springer</publisher-name>
</element-citation>
</ref>
<ref id="CR73">
<label>73.</label>
<mixed-citation publication-type="other">Csardi G, Nepusz T. The igraph software package for complex network research. InterJournal. 2006; Complex Systems:1695.</mixed-citation>
</ref>
</ref-list>
</back>
</pmc>
</record>

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Sante/explor/H2N2V1/Data/Pmc/Corpus
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 0001589 | SxmlIndent | more

Or

HfdSelect -h $EXPLOR_AREA/Data/Pmc/Corpus/biblio.hfd -nk 0001589 | SxmlIndent | more
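
Either command prints the record as indented XML. As a minimal sketch of what could be done with that output, the Python below pulls the structured `<element-citation>` entries out of a reference list and summarizes them. The `SAMPLE` fragment and the `summarize_refs` function name are illustrative assumptions, not part of Dilib; the sample is hand-made in the shape of one entry from the record above. A full record may need extra handling before it parses (for example, prefixed elements such as `nlm:aff` require their namespace to be declared).

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: one <ref> shaped like the entries in this record.
SAMPLE = """<ref-list>
<ref id="CR63">
<label>63.</label>
<element-citation publication-type="journal">
<person-group person-group-type="author">
<name><surname>Reed</surname><given-names>LJ</given-names></name>
<name><surname>Muench</surname><given-names>H</given-names></name>
</person-group>
<article-title>A simple method of estimating fifty percent endpoints</article-title>
<source>Am J Epidemiol</source>
<year>1938</year>
<pub-id pub-id-type="doi">10.1093/oxfordjournals.aje.a118408</pub-id>
</element-citation>
</ref>
</ref-list>"""


def summarize_refs(xml_text):
    """Return one dict per <ref> that contains an <element-citation>."""
    root = ET.fromstring(xml_text)
    rows = []
    for ref in root.iter("ref"):
        cit = ref.find("element-citation")
        if cit is None:
            # <mixed-citation> entries hold free text; this sketch skips them.
            continue
        authors = [
            f"{n.findtext('surname', '')} {n.findtext('given-names', '')}".strip()
            for n in cit.iter("name")
        ]
        # Map identifier type (doi, pmid) to its value.
        ids = {p.get("pub-id-type"): (p.text or "") for p in cit.iter("pub-id")}
        rows.append({
            "id": ref.get("id"),
            "authors": authors,
            "title": cit.findtext("article-title"),
            "source": cit.findtext("source"),
            "year": cit.findtext("year"),
            "doi": ids.get("doi"),
            "pmid": ids.get("pmid"),
        })
    return rows


if __name__ == "__main__":
    for row in summarize_refs(SAMPLE):
        print(row["id"], "; ".join(row["authors"]), row["year"], row["doi"])
```

Entries without a given identifier (the sample has no PMID) come back with `None` for that field, which makes missing DOIs or PMIDs easy to list.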

To link to this page within the Wicri network

{{Explor lien
   |wiki=    Sante
   |area=    H2N2V1
   |flux=    Pmc
   |étape=   Corpus
   |type=    RBID
   |clé=     
   |texte=   
}}

Wicri

This area was generated with Dilib version V0.6.33.
Data generation: Tue Apr 14 19:59:40 2020. Site generation: Thu Mar 25 15:38:26 2021