H2N2 exploration server

Using Non-Homogeneous Models of Nucleotide Substitution to Identify Host Shift Events: Application to the Origin of the 1918 ‘Spanish’ Influenza Pandemic Virus

Authors: Mario Dos Reis; Alan J. Hay; Richard A. Goldstein

RBID : PMC:2772961

Abstract

Nonhomogeneous Markov models of nucleotide substitution have received scant attention. Here we explore the possibility of using nonhomogeneous models to identify host shift nodes along phylogenetic trees of pathogens evolving in different hosts. It has been noticed that influenza viruses show marked differences in nucleotide composition in human and avian hosts. We take advantage of this fact to identify the host shift event that led to the 1918 ‘Spanish’ influenza. This disease killed over 50 million people worldwide, ranking it as the deadliest pandemic in recorded history. Our model suggests that the eight RNA segments which eventually became the 1918 viral genome were introduced into a mammalian host around 1882–1913. The viruses later diverged into the classical swine and human H1N1 influenza lineages around 1913–1915. The last common ancestor of human strains dates from February 1917 to April 1918. Because pigs are more readily infected with avian influenza viruses than humans, it would seem that they were the original recipient of the virus. This would suggest that the virus was introduced into humans sometime between 1913 and 1918.


DOI: 10.1007/s00239-009-9282-x
PubMed: 19787384
PubMed Central: 2772961

The document in XML format

<record>
<TEI>
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en">Using Non-Homogeneous Models of Nucleotide Substitution to Identify Host Shift Events: Application to the Origin of the 1918 ‘Spanish’ Influenza Pandemic Virus</title>
<author>
<name sortKey="Dos Reis, Mario" sort="Dos Reis, Mario" uniqKey="Dos Reis M" first="Mario" last="Dos Reis">Mario Dos Reis</name>
<affiliation>
<nlm:aff id="Aff1">The MRC National Institute for Medical Research, London, NW7 1AA UK</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Hay, Alan J" sort="Hay, Alan J" uniqKey="Hay A" first="Alan J." last="Hay">Alan J. Hay</name>
<affiliation>
<nlm:aff id="Aff1">The MRC National Institute for Medical Research, London, NW7 1AA UK</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Goldstein, Richard A" sort="Goldstein, Richard A" uniqKey="Goldstein R" first="Richard A." last="Goldstein">Richard A. Goldstein</name>
<affiliation>
<nlm:aff id="Aff1">The MRC National Institute for Medical Research, London, NW7 1AA UK</nlm:aff>
</affiliation>
</author>
</titleStmt>
<publicationStmt>
<idno type="wicri:source">PMC</idno>
<idno type="pmid">19787384</idno>
<idno type="pmc">2772961</idno>
<idno type="url">http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2772961</idno>
<idno type="RBID">PMC:2772961</idno>
<idno type="doi">10.1007/s00239-009-9282-x</idno>
<date when="2009">2009</date>
<idno type="wicri:Area/Pmc/Corpus">000472</idno>
<idno type="wicri:explorRef" wicri:stream="Pmc" wicri:step="Corpus" wicri:corpus="PMC">000472</idno>
</publicationStmt>
<sourceDesc>
<biblStruct>
<analytic>
<title xml:lang="en" level="a" type="main">Using Non-Homogeneous Models of Nucleotide Substitution to Identify Host Shift Events: Application to the Origin of the 1918 ‘Spanish’ Influenza Pandemic Virus</title>
<author>
<name sortKey="Dos Reis, Mario" sort="Dos Reis, Mario" uniqKey="Dos Reis M" first="Mario" last="Dos Reis">Mario Dos Reis</name>
<affiliation>
<nlm:aff id="Aff1">The MRC National Institute for Medical Research, London, NW7 1AA UK</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Hay, Alan J" sort="Hay, Alan J" uniqKey="Hay A" first="Alan J." last="Hay">Alan J. Hay</name>
<affiliation>
<nlm:aff id="Aff1">The MRC National Institute for Medical Research, London, NW7 1AA UK</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Goldstein, Richard A" sort="Goldstein, Richard A" uniqKey="Goldstein R" first="Richard A." last="Goldstein">Richard A. Goldstein</name>
<affiliation>
<nlm:aff id="Aff1">The MRC National Institute for Medical Research, London, NW7 1AA UK</nlm:aff>
</affiliation>
</author>
</analytic>
<series>
<title level="j">Journal of Molecular Evolution</title>
<idno type="ISSN">0022-2844</idno>
<idno type="eISSN">1432-1432</idno>
<imprint>
<date when="2009">2009</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
</fileDesc>
<profileDesc>
<textClass></textClass>
</profileDesc>
</teiHeader>
<front>
<div type="abstract" xml:lang="en">
<p>Nonhomogeneous Markov models of nucleotide substitution have received scant attention. Here we explore the possibility of using nonhomogeneous models to identify host shift nodes along phylogenetic trees of pathogens evolving in different hosts. It has been noticed that influenza viruses show marked differences in nucleotide composition in human and avian hosts. We take advantage of this fact to identify the host shift event that led to the 1918 ‘Spanish’ influenza. This disease killed over 50 million people worldwide, ranking it as the deadliest pandemic in recorded history. Our model suggests that the eight RNA segments which eventually became the 1918 viral genome were introduced into a mammalian host around 1882–1913. The viruses later diverged into the classical swine and human H1N1 influenza lineages around 1913–1915. The last common ancestor of human strains dates from February 1917 to April 1918. Because pigs are more readily infected with avian influenza viruses than humans, it would seem that they were the original recipient of the virus. This would suggest that the virus was introduced into humans sometime between 1913 and 1918.</p>
</div>
</front>
<back>
<div1 type="bibliography">
<listBibl>
</listBibl>
</div1>
</back>
</TEI>
<pmc xml:lang="EN" article-type="research-article">
<pmc-dir>properties open_access</pmc-dir>
<front>
<journal-meta>
<journal-id journal-id-type="nlm-ta">J Mol Evol</journal-id>
<journal-title>Journal of Molecular Evolution</journal-title>
<issn pub-type="ppub">0022-2844</issn>
<issn pub-type="epub">1432-1432</issn>
<publisher>
<publisher-name>Springer-Verlag</publisher-name>
<publisher-loc>New York</publisher-loc>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="pmid">19787384</article-id>
<article-id pub-id-type="pmc">2772961</article-id>
<article-id pub-id-type="publisher-id">9282</article-id>
<article-id pub-id-type="doi">10.1007/s00239-009-9282-x</article-id>
<article-categories>
<subj-group subj-group-type="heading">
<subject>Article</subject>
</subj-group>
</article-categories>
<title-group>
<article-title>Using Non-Homogeneous Models of Nucleotide Substitution to Identify Host Shift Events: Application to the Origin of the 1918 ‘Spanish’ Influenza Pandemic Virus</article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author" corresp="yes">
<name name-style="western">
<surname>dos Reis</surname>
<given-names>Mario</given-names>
</name>
<address>
<phone>+44-20-8816-2300</phone>
<fax>+44-20-8816-2460</fax>
<email>m.reis@mail.cryst.bbk.ac.uk</email>
</address>
<xref ref-type="aff" rid="Aff1"></xref>
</contrib>
<contrib contrib-type="author">
<name name-style="western">
<surname>Hay</surname>
<given-names>Alan J.</given-names>
</name>
<xref ref-type="aff" rid="Aff1"></xref>
</contrib>
<contrib contrib-type="author">
<name name-style="western">
<surname>Goldstein</surname>
<given-names>Richard A.</given-names>
</name>
<xref ref-type="aff" rid="Aff1"></xref>
</contrib>
<aff id="Aff1">The MRC National Institute for Medical Research, London, NW7 1AA UK</aff>
</contrib-group>
<pub-date pub-type="epub">
<day>29</day>
<month>9</month>
<year>2009</year>
</pub-date>
<pub-date pub-type="ppub">
<month>10</month>
<year>2009</year>
</pub-date>
<volume>69</volume>
<issue>4</issue>
<fpage>333</fpage>
<lpage>345</lpage>
<history>
<date date-type="received">
<day>5</day>
<month>6</month>
<year>2009</year>
</date>
<date date-type="accepted">
<day>15</day>
<month>9</month>
<year>2009</year>
</date>
</history>
<permissions>
<copyright-statement>© The Author(s) 2009</copyright-statement>
</permissions>
<abstract xml:lang="EN">
<p>Nonhomogeneous Markov models of nucleotide substitution have received scant attention. Here we explore the possibility of using nonhomogeneous models to identify host shift nodes along phylogenetic trees of pathogens evolving in different hosts. It has been noticed that influenza viruses show marked differences in nucleotide composition in human and avian hosts. We take advantage of this fact to identify the host shift event that led to the 1918 ‘Spanish’ influenza. This disease killed over 50 million people worldwide, ranking it as the deadliest pandemic in recorded history. Our model suggests that the eight RNA segments which eventually became the 1918 viral genome were introduced into a mammalian host around 1882–1913. The viruses later diverged into the classical swine and human H1N1 influenza lineages around 1913–1915. The last common ancestor of human strains dates from February 1917 to April 1918. Because pigs are more readily infected with avian influenza viruses than humans, it would seem that they were the original recipient of the virus. This would suggest that the virus was introduced into humans sometime between 1913 and 1918.</p>
</abstract>
<kwd-group>
<title>Keywords</title>
<kwd>Influenza</kwd>
<kwd>Spanish flu</kwd>
<kwd>Swine flu</kwd>
<kwd>H1N1</kwd>
<kwd>Non-homogeneous model</kwd>
<kwd>CG content</kwd>
<kwd>Molecular dating</kwd>
</kwd-group>
<custom-meta-wrap>
<custom-meta>
<meta-name>issue-copyright-statement</meta-name>
<meta-value>© Springer Science+Business Media, LLC 2009</meta-value>
</custom-meta>
</custom-meta-wrap>
</article-meta>
</front>
<body>
<sec id="Sec1" sec-type="introduction">
<title>Introduction</title>
<p>Markov models of nucleotide substitution have now become widely used in phylogenetic analysis (Yang
<xref ref-type="bibr" rid="CR59">2006</xref>
; Felsenstein
<xref ref-type="bibr" rid="CR14">2003</xref>
). Markov models are defined by a substitution matrix that describes the pattern of changes that occur in a sequence as it evolves along a phylogenetic tree. If the pattern of nucleotide substitution is independent of time (i.e., it is the same along the whole tree), the process is said to be time homogeneous. In a homogeneous process, as time approaches infinity, the distribution of nucleotide frequencies in a sequence approaches a stationary or equilibrium distribution (usually denoted
<bold>π</bold>
). Most Markov evolutionary models assume that forward and backward evolution along a tree branch are indistinguishable at equilibrium. This reversibility property is simply a restriction that facilitates the mathematical treatment of the models (Yang
<xref ref-type="bibr" rid="CR56">1994</xref>
). One of the important properties of a reversible process at equilibrium is the so-called ‘pulley effect’ (Felsenstein
<xref ref-type="bibr" rid="CR13">1981</xref>
) that prevents identification of the root of a stationary tree because the direction of evolution in such trees is not defined. Most models currently used in phylogenetic analysis assume homogeneity, stationarity, and reversibility.</p>
<p>The nucleotide frequencies of sequences belonging to distantly related species are generally quite different, a clear indicator that the homogeneity and stationarity assumptions are being violated (Yang and Roberts
<xref ref-type="bibr" rid="CR61">1995</xref>
). For trees including distantly related organisms, different models might be needed to describe the patterns of nucleotide substitution in different parts of the tree, and sometimes, even one model per branch might be needed to achieve a realistic representation of the evolutionary process (Yang and Roberts
<xref ref-type="bibr" rid="CR61">1995</xref>
). Such nonhomogeneous trees involve a large number of parameters that cannot be reliably estimated by maximum likelihood (ML) or that might become mathematically intractable. For this reason, despite their importance, relatively little work has been done on the use of nonhomogeneous models in phylogenetics (see for example Barry and Hartigan
<xref ref-type="bibr" rid="CR3">1987</xref>
; Boussau et al.
<xref ref-type="bibr" rid="CR5">2008</xref>
; Gu and Li
<xref ref-type="bibr" rid="CR23">1998</xref>
; Blanquart and Lartillot
<xref ref-type="bibr" rid="CR4">2008</xref>
; Yang and Roberts
<xref ref-type="bibr" rid="CR61">1995</xref>
; Galtier et al.
<xref ref-type="bibr" rid="CR16">1999</xref>
; Galtier and Gouy
<xref ref-type="bibr" rid="CR15">1998</xref>
; Lockhart et al.
<xref ref-type="bibr" rid="CR32">1994</xref>
). An interesting possibility that might lead to easily tractable nonhomogeneous models concerns the analysis of patterns of nucleotide substitution for viruses that have experienced well established host transfer events. If the intracellular environment of the new host is substantially different, this could lead to a shift in the substitution pattern of the virus in the new host (Fig. 
<xref rid="Fig1" ref-type="fig">1</xref>
). The nucleotide frequencies of the viral genome would then drift toward new equilibrium values. Trees accommodating viral sequences isolated from different hosts could then be analyzed by assuming just one set of evolutionary parameters for each host clade. If one of the hosts serves as a natural reservoir, viral evolution within this host would be stationary. The process would be nonstationary in the new hosts. Branches linking different host clades would contain host shift nodes, and the positions of these nodes could be determined by maximum likelihood.
<fig id="Fig1">
<label>Fig. 1</label>
<caption>
<p>The hypothetical evolution of a virus after a cross species jump (host shift). Evolution along the new host branches is non-stationary. The inset figure shows a computer simulation of the frequency of an arbitrary nucleotide
<italic>i</italic>
along evolutionary time (
<italic>d</italic>
) after a host shift. The equilibrium frequency in the reservoir host is
<italic>π</italic>
<sub>
<italic>i</italic>
</sub>
* and in the new host is π
<sub>
<italic>i</italic>
</sub>
</p>
</caption>
<graphic position="anchor" xlink:href="239_2009_9282_Fig1_HTML" id="MO1"></graphic>
</fig>
</p>
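<p>The relaxation sketched in the inset of Fig. 1 can be illustrated numerically. The following minimal Python sketch assumes the simpler F81 substitution model, not the model fitted in this study, because under F81 the expected frequencies have a closed form. The function name and both frequency vectors are hypothetical illustration values, not estimates from the data.</p>
<preformat preformat-type="code"><![CDATA[
```python
import math

# Hypothetical equilibrium frequencies, ordered T, C, A, G (assumed values).
RESERVOIR = [0.22, 0.20, 0.33, 0.25]   # pi* : equilibrium in the reservoir host
NEW_HOST = [0.26, 0.17, 0.36, 0.21]    # pi  : equilibrium in the new host


def f81_frequencies(start, target, d):
    """Expected nucleotide frequencies at distance d after a host shift.

    Assumes an F81 process whose equilibrium is `target`, started from
    `start`. Under F81, f(d) = target + (start - target) * exp(-beta * d).
    """
    # beta scales the rate matrix so that, at the new equilibrium,
    # one unit of d equals one expected substitution per site.
    beta = 1.0 / (1.0 - sum(p * p for p in target))
    decay = math.exp(-beta * d)
    return [t + (s - t) * decay for s, t in zip(start, target)]


if __name__ == "__main__":
    for d in (0.0, 0.1, 0.5, 2.0):
        freqs = f81_frequencies(RESERVOIR, NEW_HOST, d)
        gc = freqs[1] + freqs[3]  # C + G
        print(f"d={d:4.1f}  G+C={gc:.4f}")
```
]]></preformat>
<p>Under these assumptions every frequency moves exponentially from its reservoir value toward its new-host value, which is the qualitative behaviour a host shift is expected to produce. A model with a transition/transversion bias would change the shape of the curve but not its end points.</p>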
<p>If the G + C content of human, avian, and swine influenza virus sequences is plotted against the isolation year, a conspicuous pattern of G + C composition decay is seen in the mammalian viruses (Fig. 
<xref rid="Fig2" ref-type="fig">2</xref>
), indicating that different substitution patterns characterize the evolution of the viral segments in mammalian and avian hosts (Rabadan et al.
<xref ref-type="bibr" rid="CR37">2006</xref>
). The evolution of influenza viruses is therefore better represented by a nonhomogeneous Markov model where different substitution patterns would describe the evolution process in various parts of the virus phylogenetic tree. This raises the intriguing possibility that this change in substitution pattern might allow us to identify and study the point along the phylogenetic tree where host shifts have occurred.
<fig id="Fig2">
<label>Fig. 2</label>
<caption>
<p>Genome G + C content versus isolation year for influenza viruses.
<italic>Black dots</italic>
A/H1N1 waterfowl.
<italic>Red dots</italic>
A/H1N1 human. The
<italic>empty dots</italic>
are human viruses that reappeared after 1977; their isolation times have been corrected for the period of evolutionary stasis (
<italic>see text</italic>
).
<italic>Blue dots</italic>
A/H1N1 classical swine.
<italic>Gray dots</italic>
A/H5N1 human. These are avian-like sequences that have not spread within the human population, and thus retain the avian nucleotide content.
<italic>Green dots</italic>
Influenza B. These viruses mainly infect humans, and they may have evolved from an avian reservoir at an unknown remote date (Gammelin et al.
<xref ref-type="bibr" rid="CR17">1990</xref>
). (Color figure online)</p>
</caption>
<graphic position="anchor" xlink:href="239_2009_9282_Fig2_HTML" id="MO2"></graphic>
</fig>
</p>
<p>Influenza A is a negative-strand RNA virus with a segmented genome that causes annual epidemics of disease in humans and domestic animals. The natural reservoir of the influenza A virus is waterfowl, in which the virus replicates and spreads causing little or no disease (Webster et al.
<xref ref-type="bibr" rid="CR54">1992</xref>
). The eight negative-strand RNA segments that comprise the virus genome encode 11 proteins. Two of these, the hemagglutinin (HA) and neuraminidase (NA), are surface glycoproteins that interact with the host’s immune system. Influenza viruses are classified according to the antigenic properties of the HA and NA proteins. A total of 16 HA and 9 NA serotypes have been identified in wild waterfowl, whereas only three HA (H1, H2, and H3) and only two NA (N1 and N2) subtypes are known to have been involved in epidemic disease in humans.</p>
<p>Avian viruses usually do not infect humans as these viruses are not adapted to the human host. Periodically, however, human viruses might acquire gene segments from an avian source, perhaps through an intermediary host, resulting in global pandemics in immunologically naive human populations. Two of the three 20th-century flu pandemics were caused by this process. The 1957–1958 (H2N2, Asian flu) and 1968–1969 (H3N2, Hong Kong flu) pandemics, which caused substantial mortality in the human population, were the result of reassortant viruses that had acquired novel segments coding for HA or HA and NA, and a polymerase gene (PB1) from an avian-like source (reviewed in Hay et al.
<xref ref-type="bibr" rid="CR26">2001</xref>
). Whether the 1918–1919 pandemic (H1N1, ‘Spanish’ flu) was caused by a reassortant virus like the 1957 and 1968 viruses, or was the result of transfer of a whole virus from an avian reservoir has been hotly debated (Gorman et al.
<xref ref-type="bibr" rid="CR19">1990</xref>
; Gibbs and Gibbs
<xref ref-type="bibr" rid="CR18">2006</xref>
; Gammelin et al.
<xref ref-type="bibr" rid="CR17">1990</xref>
; Taubenberger et al.
<xref ref-type="bibr" rid="CR51">2006</xref>
; Reid et al.
<xref ref-type="bibr" rid="CR40">2004</xref>
; Taubenberger et al.
<xref ref-type="bibr" rid="CR50">2005</xref>
; Gorman et al.
<xref ref-type="bibr" rid="CR20">1991</xref>
; Antonovics et al.
<xref ref-type="bibr" rid="CR2">2006</xref>
). During each of these pandemics the preceding virus subtype became extinct and was replaced by the new reassortant. In 1977, the H1N1 virus subtype, which had become extinct in 1957, reappeared in the human population, infecting mainly young people (under 25 years) who had not been exposed to the H1N1 subtype circulating previously. Since then, both H1N1 and H3N2 viruses have co-circulated with influenza B in humans. A stable lineage of H1N1 influenza in North American pigs (classical swine) was noticed after the 1918 pandemic. It is thought that this classical swine lineage originated from the human ‘Spanish’ virus (Taubenberger
<xref ref-type="bibr" rid="CR49">2006</xref>
).</p>
<p>The 1918–1919 ‘Spanish’ flu has been the most devastating epidemic disease in recorded human history. It killed an estimated 50 million people worldwide (Johnson and Mueller
<xref ref-type="bibr" rid="CR28">2002</xref>
), many more than the number of deaths caused by the First World War. Given the constant threat of new zoonotic pandemics, much research has tried to understand the origin of the 1918 pandemic. The strongest evidence for an avian origin for the Spanish flu came from analysis of the genome sequence of the 1918 virus, obtained from lung tissue from a victim buried in the Alaskan permafrost (Taubenberger
<xref ref-type="bibr" rid="CR49">2006</xref>
; Reid et al.
<xref ref-type="bibr" rid="CR40">2004</xref>
; Taubenberger et al.
<xref ref-type="bibr" rid="CR50">2005</xref>
). Analysis of the consensus amino acid sequence of polymerase genes from avian viruses showed very few differences when compared to those from the 1918 virus (Taubenberger et al.
<xref ref-type="bibr" rid="CR50">2005</xref>
), while subsequent lineages of classical swine and human viruses had accumulated a substantial number of amino acid substitutions. This intuitively suggested that the introduction of the H1N1 virus into humans occurred in a relatively ‘short’ period (up to several years; Taubenberger et al.
<xref ref-type="bibr" rid="CR51">2006</xref>
) before the pandemic. A similar lack of adaptive evolution was also observed in other proteins of the 1918 virus (Reid et al.
<xref ref-type="bibr" rid="CR40">2004</xref>
) providing evidence for a single host shift event. Interestingly, on the nucleotide level, the 1918 virus was closer to other mammalian virus sequences than known avian virus consensus sequences, suggesting an early divergence between the current avian and 1918 virus lineages. This observation led Taubenberger et al. (
<xref ref-type="bibr" rid="CR50">2005</xref>
) to suggest that the donor of the 1918 virus was in evolutionary isolation from other known avian flu viruses. A number of authors have questioned this interpretation (Gibbs and Gibbs
<xref ref-type="bibr" rid="CR18">2006</xref>
; Antonovics et al.
<xref ref-type="bibr" rid="CR2">2006</xref>
). One issue is the reliance of Taubenberger et al. (
<xref ref-type="bibr" rid="CR50">2005</xref>
) on simplistic evolutionary models, and their focus on changes at the protein level, making the analysis susceptible to statistical noise and possible systematic biases. A rigorous phylogenetic study including the genome sequence of the 1918 virus, where the host shift event is clearly identified along the phylogenetic tree, and where modern molecular dating techniques are applied, has not yet been carried out.</p>
<p>As suggested by Fig. 
<xref rid="Fig2" ref-type="fig">2</xref>
, influenza is well suited for study as a nonhomogeneous evolutionary process. Here we explore the possibility of using such a nonhomogeneous model to study the evolution of H1N1 viruses in birds, pigs, and humans. We address the question of the origin of the 1918 virus and the time of the putative host shift event that led to the introduction of this virus from an avian into a mammalian host. Our results suggest that the segments that formed the 1918 virus were transmitted to a mammalian host some time within the interval 1882–1913, followed by subsequent divergence between the human and classical swine lineages around 1913–1915. The virus was likely introduced into the human population between 1913 and 1918. This suggests a minimum of 5 years of evolution in mammals prior to 1918, and that the classical swine lineage did not originate from the pandemic virus of 1918.</p>
</sec>
<sec id="Sec2" sec-type="methods">
<title>Methods</title>
<sec id="Sec3">
<title>Data and Tree Estimation</title>
<p>We analyzed 40 full genome sequences of H1N1 influenza viruses isolated from avian (15), human (15), and swine (10) hosts. The eight RNA segment sequences from each genome were concatenated into a super gene and aligned (Muscle v3.6; Edgar
<xref ref-type="bibr" rid="CR12">2004</xref>
). The alignment (13,140 sites) was edited manually. The tree topology was estimated by ML (HKY85 + dΓ
<sub>5</sub>
, PhyML v2.4.4; Guindon and Gascuel
<xref ref-type="bibr" rid="CR24">2003</xref>
), and the reliability of the tree topology was tested by bootstrapping 1,000 times. The virus strains analyzed and the consensus tree are shown in Fig. 
<xref rid="Fig3" ref-type="fig">3</xref>
. Currently, all full genome sequences of H1N1 waterfowl viruses available in GenBank have been isolated from North American birds. We repeated some of the analyses with waterfowl viruses from other parts of the world. The estimated evolutionary parameters (such as the equilibrium nucleotide frequencies) appear independent of the geographical origin. Thus, the results should not be affected if the virus from which the 1918 pandemic originated was of Eurasian, rather than American, origin.
<fig id="Fig3">
<label>Fig. 3</label>
<caption>
<p>Consensus tree for 1,000 bootstrap replicates. Support values for the mammalian virus clades are shown. The avian viruses are mostly from waterfowl except for a pigeon isolate. Estimating the tree under a Bayesian framework (MrBayes v3.1; Huelsenbeck et al.
<xref ref-type="bibr" rid="CR27">2001</xref>
) leads to essentially the same results. The tree is shown rooted for illustrative purposes only. The
<italic>black dot</italic>
indicates the position of the most recent common ancestor of the human clade (MRCAH)</p>
</caption>
<graphic position="anchor" xlink:href="239_2009_9282_Fig3_HTML" id="MO3"></graphic>
</fig>
</p>
</sec>
<sec id="Sec4">
<title>Nonhomogeneous Models of Influenza Evolution</title>
<p>We used the Hasegawa et al. (
<xref ref-type="bibr" rid="CR25">1985</xref>
) Markov model of nucleotide substitution (HKY85) to describe the local nucleotide substitution pattern along the branches of the avian and mammalian influenza virus tree. The evolutionary parameters (
<bold>π</bold>
 = {π
<sub>
<italic>i</italic>
</sub>
} and transition/transversion rate parameter κ) and the branch lengths (
<italic>d</italic>
<sub>
<italic>i</italic>
</sub>
) for a given tree topology were estimated by ML (Yang
<xref ref-type="bibr" rid="CR59">2006</xref>
). The HKY85 model offers a good compromise between accuracy, computational speed, and relatively low variance when compared to more general models of nucleotide substitution (Yang
<xref ref-type="bibr" rid="CR56">1994</xref>
).</p>
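<p>As a minimal sketch of the substitution model named above, the following Python code builds an unnormalised HKY85 rate matrix from a set of equilibrium frequencies and a transition/transversion parameter κ, then checks numerically that the frequencies are stationary and that the process is reversible. The state ordering (T, C, A, G), the function names, and the numerical values are assumptions made for illustration; this is not the PAML implementation used in the study.</p>
<preformat preformat-type="code"><![CDATA[
```python
# Index pairs that are transitions when states are ordered T, C, A, G.
TRANSITIONS = {(0, 1), (1, 0), (2, 3), (3, 2)}  # T<->C and A<->G


def hky85_rate_matrix(pi, kappa):
    """Return the unnormalised HKY85 rate matrix as a list of rows.

    Off-diagonal q[i][j] is kappa * pi[j] for transitions and pi[j] for
    transversions; each diagonal entry makes its row sum to zero.
    """
    n = len(pi)
    q = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                rate = kappa if (i, j) in TRANSITIONS else 1.0
                q[i][j] = rate * pi[j]
        q[i][i] = -sum(q[i])  # diagonal is still 0.0 when the sum is taken
    return q


def is_stationary(pi, q, tol=1e-12):
    """True if pi * Q = 0, i.e. pi is the equilibrium distribution of Q."""
    n = len(pi)
    return all(abs(sum(pi[i] * q[i][j] for i in range(n))) < tol
               for j in range(n))


def is_reversible(pi, q, tol=1e-12):
    """True if detailed balance pi_i * q_ij == pi_j * q_ji holds."""
    n = len(pi)
    return all(abs(pi[i] * q[i][j] - pi[j] * q[j][i]) < tol
               for i in range(n) for j in range(n))


if __name__ == "__main__":
    pi = [0.22, 0.20, 0.33, 0.25]  # hypothetical frequencies (T, C, A, G)
    q = hky85_rate_matrix(pi, kappa=5.0)  # hypothetical kappa
    print(is_stationary(pi, q), is_reversible(pi, q))
```
]]></preformat>
<p>In a nonhomogeneous analysis, one such matrix would be built per group of branches, each with its own frequency vector, while the construction itself stays the same.</p>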
<p>Using different sets of
<bold>π</bold>
values to describe the evolution along different branches of the tree implies time heterogeneity in the substitution pattern. In this work, we considered three models of evolution in the human–swine–avian tree (Fig. 
<xref rid="Fig4" ref-type="fig">4</xref>
). The first model (M
<sub>1</sub>
) assumed homogeneity and stationarity, with one set of equilibrium nucleotide frequencies describing the substitution process in all branches of the tree. The second model (M
<sub>2</sub>
) assumed that equilibrium nucleotide frequencies are different in mammalian and avian hosts. The third model (M
<sub>3</sub>
) assumed different sets of equilibrium nucleotide frequencies for avian, human, and swine hosts, with the initial avian-to-mammal host shift occurring either to swine (M
<sub>3s</sub>
) or to humans (M
<sub>3h</sub>
). In models M
<sub>2</sub>
and M
<sub>3</sub>
, evolution along the avian clade is stationary. Models M
<sub>1</sub>
, M
<sub>2</sub>
, and M
<sub>3</sub>
are nested, so their log-likelihoods can be compared with the likelihood ratio test (LRT) to select the best model. The three models described above assumed a single avian to mammal host shift event. A variation of the M
<sub>2</sub>
model was also tested that assumes that influenza was transmitted independently from birds to humans and from birds to swine following the divergence of these two lineages (M
<sub>2.2j</sub>
, Fig. 
<xref rid="Fig4" ref-type="fig">4</xref>
). This model is not nested with any of the other models so the LRT cannot be used to assess its adequacy; the Akaike Information Criterion (AIC) can be used instead (Akaike
<xref ref-type="bibr" rid="CR1">1974</xref>
). All the models were tested on the data above using a nonhomogeneous implementation of the HKY85 model (PAML v3.15; Yang
<xref ref-type="bibr" rid="CR58">1997</xref>
; Yang and Roberts
<xref ref-type="bibr" rid="CR61">1995</xref>
) that considers rate variation among sites as a discrete gamma distribution (Yang
<xref ref-type="bibr" rid="CR57">1996</xref>
). A single gamma shape parameter (α) was assumed for the whole tree. Accounting for rate variation is essential because nucleotide frequencies decay at different rates at different sites, and averaging over them would lead to underestimation of the length of the branch linking the mammalian clade with the host shift event.
<fig id="Fig4">
<label>Fig. 4</label>
<caption>
<p>Non-homogeneous models of influenza evolution. All model trees are unrooted. The real root is assumed to lie somewhere along the avian branches; however, its position is irrelevant since stationary evolution of the virus in the avian host is being assumed. Model M
<sub>1</sub>
is homogeneous and the host shift event (HSE) cannot be determined. In models M
<sub>2</sub>
and M
<sub>3</sub>
the HSE is assigned avian equilibrium frequencies. Different shadings indicate that different rate matrices (equilibrium nucleotide frequencies) are used to describe evolution along the corresponding branches. With current data it is not possible to distinguish whether the HSE was avian to human, or avian to swine, so model M
<sub>3</sub>
is in reality two models according to whether the branch linking the human–swine split (HSS) and the HSE is assigned human (M
<sub>3h</sub>
) or swine (M
<sub>3s</sub>
) equilibrium frequencies. Model M
<sub>2.2j</sub>
assumes two independent bird-to-mammal host shifts (
<italic>see text</italic>
)</p>
</caption>
<graphic position="anchor" xlink:href="239_2009_9282_Fig4_HTML" id="MO4"></graphic>
</fig>
</p>
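<p>The two model-selection criteria mentioned above can be sketched in a few lines of Python. The likelihood ratio statistic is twice the difference in log-likelihood between the nested models and is compared with a chi-square distribution whose degrees of freedom equal the number of extra free parameters; the AIC is 2k minus twice the log-likelihood, where k is the number of free parameters. The function names, log-likelihoods, and parameter counts below are hypothetical, and the chi-square tail is computed with a closed form that is valid only for an even number of degrees of freedom.</p>
<preformat preformat-type="code"><![CDATA[
```python
import math


def chi2_sf_even_df(x, df):
    """Upper tail P(X > x) of a chi-square variable with even df.

    Uses the closed form exp(-x/2) * sum_{j < df/2} (x/2)^j / j!.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this sketch supports positive even df only")
    half = x / 2.0
    term = 1.0
    total = 1.0
    for j in range(1, df // 2):
        term *= half / j
        total += term
    return math.exp(-half) * total


def likelihood_ratio_test(lnl_null, lnl_alt, df):
    """Return (statistic, p-value) for two nested models."""
    stat = max(0.0, 2.0 * (lnl_alt - lnl_null))
    return stat, chi2_sf_even_df(stat, df)


def aic(lnl, n_params):
    """Akaike Information Criterion; lower values indicate a better model."""
    return 2.0 * n_params - 2.0 * lnl


if __name__ == "__main__":
    # Hypothetical log-likelihoods and parameter counts, not study results.
    stat, p = likelihood_ratio_test(lnl_null=-1000.0, lnl_alt=-990.0, df=4)
    print(f"LRT statistic = {stat:.2f}, p = {p:.2e}")
    print(f"AIC (alternative model) = {aic(-990.0, 10):.1f}")
```
]]></preformat>
<p>The LRT applies only when one model is a special case of the other. For a non-nested comparison, the AIC values of the candidate models are compared directly.</p>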
</sec>
<sec id="Sec5">
<title>Molecular Dating</title>
<p>The tree fitted under the best nonhomogeneous model has branch lengths in substitutions per site. We time calibrated the tree using a fully relaxed clock model under a penalized likelihood scheme (r8s v1.71; Sanderson
<xref ref-type="bibr" rid="CR43">2003</xref>
; Langley and Fitch
<xref ref-type="bibr" rid="CR31">1974</xref>
). Nonhomogeneous model fitting and time calibration were repeated for each of the 1,000 bootstrapped trees and their corresponding alignments. Isolation dates for most of the sequences analyzed are available to within 1 year. To account for this uncertainty, the ages of the viruses in the bootstrap analysis were drawn from a uniform distribution over the corresponding interval; e.g., if a virus is reported as isolated in 1957, its age in each bootstrap replicate was sampled from the uniform distribution on [1957.0, 1958.0). Hence the uncertainties in tree topology, branch lengths, and virus isolation times were carried through the analyses. The earliest human isolate is dated November 1918. The bootstrap confidence intervals for the evolutionary parameters and the node ages were calculated as described elsewhere (Venables and Ripley
<xref ref-type="bibr" rid="CR53">2002</xref>
, p 136). Data manipulation and basic statistics were carried out with the R environment for statistical computing (
<ext-link ext-link-type="uri" xlink:href="http://www.r-project.org">www.r-project.org</ext-link>
). As an additional analysis, the third codon sites from the alignment (4,256 sites) were extracted, the tree topology was estimated, the best nonhomogeneous model was fitted, and the tree was time-calibrated. The results were essentially identical to those for the whole alignment, albeit with wider confidence intervals.</p>
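<p>The resampling of isolation dates described above can be sketched as follows. This is a minimal illustration and not the authors' R code; the strain labels, the function names, and the fixed seed are assumptions introduced here.</p>
<preformat>
```python
import random


def sample_isolation_date(reported_year, rng):
    """Draw a decimal date from the year-long interval starting at reported_year."""
    # rng.random() lies in [0.0, 1.0), so the draw falls in
    # [reported_year, reported_year + 1), up to floating-point rounding.
    return reported_year + rng.random()


def bootstrap_dates(reported_years, n_replicates, seed=0):
    """Return one dictionary of sampled decimal dates per bootstrap replicate."""
    rng = random.Random(seed)  # fixed seed (an assumption) for reproducibility
    return [
        {name: sample_isolation_date(year, rng)
         for name, year in reported_years.items()}
        for _ in range(n_replicates)
    ]


# Hypothetical strain labels; only the reported isolation year is known.
reported = {"strain_A": 1957, "strain_B": 1977}
replicates = bootstrap_dates(reported, n_replicates=1000)
```
</preformat>
<p>Each replicate would then be paired with one bootstrapped tree, so that date uncertainty is carried through together with the uncertainty in topology and branch lengths.</p>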
</sec>
</sec>
<sec id="Sec6" sec-type="results">
<title>Results</title>
<sec id="Sec7">
<title>ML Estimation of Branch Lengths and Evolutionary Parameters Under Models M
<sub>1</sub>
, M
<sub>2</sub>
, and M
<sub>3</sub>
</title>
<p>We used the consensus tree topology estimated above to fit by ML the three M models (M
<sub>1</sub>
, M
<sub>2</sub>
, and M
<sub>3</sub>
) and assess the suitability of the different hypotheses concerning the homogeneity of the evolution of influenza viruses. Assuming nonhomogeneous evolution of the virus gene segments significantly improves the model fit when compared to a fully homogeneous model (LRT, M
<sub>1</sub>
vs. M
<sub>2</sub>
,
<inline-formula id="IEq1">
<pmc-comment> Alternate image not processed: 239_2009_9282_Article_IEq1.gif </pmc-comment>
<tex-math id="M1">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$ \chi_{4}^{2} = 163.14 $$\end{document}</tex-math>
</inline-formula>
,
<italic>P</italic>
 ≪ 0.001, Table 
<xref rid="Tab1" ref-type="table">1</xref>
). Allowing for different substitution patterns in humans and swine does not significantly improve the model fit (LRT, M
<sub>2</sub>
vs. M
<sub>3h</sub>
 ≈ M
<sub>3s</sub>
,
<inline-formula id="IEq2">
<pmc-comment> Alternate image not processed: 239_2009_9282_Article_IEq2.gif </pmc-comment>
<tex-math id="M2">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$ \chi_{3}^{2} \approx 3.5 $$\end{document}</tex-math>
</inline-formula>
,
<italic>P</italic>
 ≤ 0.31, Table 
<xref rid="Tab1" ref-type="table">1</xref>
). This indicates that the shift in substitution patterns is a property of the evolution of the virus in mammalian hosts. The branch lengths for models M
<sub>1</sub>
and M
<sub>2</sub>
are highly correlated, but the homogeneous model slightly overestimates long branches (
<italic>d</italic>
<sub>M2</sub>
 = 0.96
<italic>d</italic>
<sub>M1</sub>
,
<italic>r</italic>
 > 0.999). Model M
<sub>2.2j</sub>
, which assumes two independent bird to mammal host shifts, has a lower likelihood than M
<sub>2</sub>
(Table 
<xref rid="Tab1" ref-type="table">1</xref>
). These two models are not nested, so the LRT cannot be used. The Akaike information criterion supports M
<sub>2</sub>
as the best model overall (AIC, Table 
<xref rid="Tab1" ref-type="table">1</xref>
). Our results, while not definitive, support a single jump from birds to mammals, a conclusion consistent with the observation that host shifts between mammalian species are more frequent than shifts between avian and mammalian species.
<table-wrap id="Tab1">
<label>Table 1</label>
<caption>
<p>Likelihoods and model comparison</p>
</caption>
<table frame="hsides" rules="groups">
<thead>
<tr>
<th align="left">Model</th>
<th align="left">lnℓ</th>
<th align="left">np</th>
<th align="left">
<italic>P</italic>
-value</th>
<th align="left">AIC</th>
</tr>
</thead>
<tbody>
<tr>
<td align="left">M
<sub>1</sub>
</td>
<td char="." align="char">−83,751</td>
<td char="." align="char">82</td>
<td char="." align="char"></td>
<td char="." align="char">167,668</td>
</tr>
<tr>
<td align="left">
<bold>M</bold>
<sub>
<bold>2</bold>
</sub>
</td>
<td char="." align="char">
<bold>−83,670</bold>
</td>
<td char="." align="char">
<bold>86</bold>
</td>
<td char="." align="char">
<bold>0.001</bold>
</td>
<td char="." align="char">
<bold>167,514</bold>
</td>
</tr>
<tr>
<td align="left">M
<sub>3</sub>
</td>
<td char="." align="char">−83,668</td>
<td char="." align="char">89</td>
<td char="." align="char">0.31</td>
<td char="." align="char">167,516</td>
</tr>
<tr>
<td align="left">M
<sub>2.2j</sub>
</td>
<td char="." align="char">−83,672</td>
<td char="." align="char">87</td>
<td char="." align="char"></td>
<td char="." align="char">167,520</td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<p>
<italic>np</italic>
Number of estimated parameters</p>
<p>Models M
<sub>3h</sub>
and M
<sub>3s</sub>
have essentially the same likelihood. The bold values highlight the statistically best model</p>
</table-wrap-foot>
</table-wrap>
</p>
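<p>The model comparisons above can be checked with a few lines of arithmetic. The sketch below is an illustration and not the authors' code: it uses the LRT statistics quoted in the text and the rounded log-likelihoods from Table 1, and it assumes the usual definition AIC = 2 np − 2 lnℓ. The helper names are introduced here. Because the tabulated log-likelihoods are rounded, the AIC values computed this way differ from those in Table 1 by about two units, but the ranking of the models is the same.</p>
<preformat>
```python
import math


def chi2_sf(x, df):
    """Upper-tail chi-square probability; closed forms for df = 3 and 4 only."""
    if df == 3:
        return (math.erfc(math.sqrt(x / 2))
                + math.sqrt(2 * x / math.pi) * math.exp(-x / 2))
    if df == 4:
        return math.exp(-x / 2) * (1 + x / 2)
    raise ValueError("only df = 3 or 4 are implemented in this sketch")


def aic(lnl, n_params):
    """Akaike information criterion from a log-likelihood and parameter count."""
    return 2 * n_params - 2 * lnl


# LRT statistics as quoted in the text
p_m1_vs_m2 = chi2_sf(163.14, 4)  # M1 vs. M2, 4 extra parameters
p_m2_vs_m3 = chi2_sf(3.5, 3)     # M2 vs. M3, 3 extra parameters

# (rounded lnL, np) pairs from Table 1
models = {
    "M1": (-83751, 82),
    "M2": (-83670, 86),
    "M3": (-83668, 89),
    "M2.2j": (-83672, 87),
}
aic_values = {name: aic(lnl, k) for name, (lnl, k) in models.items()}
best_model = min(aic_values, key=aic_values.get)
```
</preformat>
<p>With these inputs the first P-value is vanishingly small, the second is about 0.32 for the rounded statistic of 3.5, and M2 has the lowest AIC.</p>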
<p>Table 
<xref rid="Tab2" ref-type="table">2</xref>
shows the ML estimates of the evolutionary parameters for model M
<sub>2</sub>
and their 95% confidence intervals (CI) from the bootstrap analysis. It is clear that the relative rates of G → A and C → U transition substitutions are accelerated in mammalian
<inline-formula id="IEq3">
<pmc-comment> Alternate image not processed: 239_2009_9282_Article_IEq3.gif </pmc-comment>
<tex-math id="M3">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$ \left( {\hat{q}_{\text{GA}} = 4.99,\,\hat{q}_{\text{CU}} = 3.16} \right) $$\end{document}</tex-math>
</inline-formula>
when compared to avian
<inline-formula id="IEq4">
<pmc-comment> Alternate image not processed: 239_2009_9282_Article_IEq4.gif </pmc-comment>
<tex-math id="M4">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$ \left( {\hat{q}_{\text{GA}} = 4.11,\,\hat{q}_{\text{CU}} = 2.94} \right) $$\end{document}</tex-math>
</inline-formula>
viruses. This shift in G → A and C → U transition rates is responsible for the G + C composition decay observed in mammalian viruses (Fig. 
<xref rid="Fig2" ref-type="fig">2</xref>
). The reasons for this shift in substitution rates are not clear. A few hypotheses for how this substitution pattern might have arisen in human, as opposed to avian, hosts have been discussed (Greenbaum et al.
<xref ref-type="bibr" rid="CR21">2008</xref>
; Rabadan et al.
<xref ref-type="bibr" rid="CR37">2006</xref>
). Experimental work appears to be needed to address this issue. The ML method is, however, blind to the causes of the substitution shift and simply identifies the most likely location of the host shift. Here we are concerned with using this shift in substitution pattern to date the ancestor of human and swine H1N1 viruses, rather than with its causes.
<table-wrap id="Tab2">
<label>Table 2</label>
<caption>
<p>ML estimates of evolutionary parameters for the HKY85 M
<sub>2</sub>
model</p>
</caption>
<table frame="hsides" rules="groups">
<thead>
<tr>
<th align="left">Host</th>
<th align="left">Par</th>
<th align="left">Value (95% CI)</th>
</tr>
</thead>
<tbody>
<tr>
<td align="left" rowspan="2">All</td>
<td align="left">
<inline-formula id="IEq7">
<pmc-comment> Alternate image not processed: 239_2009_9282_Article_IEq7.gif </pmc-comment>
<tex-math id="M5">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$ \hat{\kappa } $$\end{document}</tex-math>
</inline-formula>
</td>
<td char="(" align="char">12.5 (11.8, 13.8)</td>
</tr>
<tr>
<td align="left">
<inline-formula id="IEq8">
<pmc-comment> Alternate image not processed: 239_2009_9282_Article_IEq8.gif </pmc-comment>
<tex-math id="M6">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$ \hat{\alpha } $$\end{document}</tex-math>
</inline-formula>
</td>
<td char="(" align="char">0.226 (0.216, 0.237)</td>
</tr>
<tr>
<td align="left" rowspan="4">Avian</td>
<td align="left">
<inline-formula id="IEq9">
<pmc-comment> Alternate image not processed: 239_2009_9282_Article_IEq9.gif </pmc-comment>
<tex-math id="M7">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$ \hat{\pi } $$\end{document}</tex-math>
</inline-formula>
<sub>U</sub>
</td>
<td char="(" align="char">0.235 (0.228, 0.242)</td>
</tr>
<tr>
<td align="left">
<inline-formula id="IEq10">
<pmc-comment> Alternate image not processed: 239_2009_9282_Article_IEq10.gif </pmc-comment>
<tex-math id="M8">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$ \hat{\pi } $$\end{document}</tex-math>
</inline-formula>
<sub>C</sub>
</td>
<td char="(" align="char">0.207 (0.200, 0.213)</td>
</tr>
<tr>
<td align="left">
<inline-formula id="IEq11">
<pmc-comment> Alternate image not processed: 239_2009_9282_Article_IEq11.gif </pmc-comment>
<tex-math id="M9">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$ \hat{\pi } $$\end{document}</tex-math>
</inline-formula>
<sub>A</sub>
</td>
<td char="(" align="char">0.329 (0.322, 0.337)</td>
</tr>
<tr>
<td align="left">
<inline-formula id="IEq12">
<pmc-comment> Alternate image not processed: 239_2009_9282_Article_IEq12.gif </pmc-comment>
<tex-math id="M10">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$ \hat{\pi } $$\end{document}</tex-math>
</inline-formula>
<sub>G</sub>
</td>
<td char="(" align="char">0.229 (0.222, 0.236)</td>
</tr>
<tr>
<td align="left" rowspan="4">Mammalian</td>
<td align="left">
<inline-formula id="IEq13">
<pmc-comment> Alternate image not processed: 239_2009_9282_Article_IEq13.gif </pmc-comment>
<tex-math id="M11">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$ \hat{\pi } $$\end{document}</tex-math>
</inline-formula>
<sub>U</sub>
</td>
<td char="(" align="char">0.253 (0.239, 0.267)</td>
</tr>
<tr>
<td align="left">
<inline-formula id="IEq14">
<pmc-comment> Alternate image not processed: 239_2009_9282_Article_IEq14.gif </pmc-comment>
<tex-math id="M12">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$ \hat{\pi } $$\end{document}</tex-math>
</inline-formula>
<sub>C</sub>
</td>
<td char="(" align="char">0.178 (0.167, 0.188)</td>
</tr>
<tr>
<td align="left">
<inline-formula id="IEq15">
<pmc-comment> Alternate image not processed: 239_2009_9282_Article_IEq15.gif </pmc-comment>
<tex-math id="M13">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$ \hat{\pi } $$\end{document}</tex-math>
</inline-formula>
<sub>A</sub>
</td>
<td char="(" align="char">0.399 (0.385, 0.415)</td>
</tr>
<tr>
<td align="left">
<inline-formula id="IEq16">
<pmc-comment> Alternate image not processed: 239_2009_9282_Article_IEq16.gif </pmc-comment>
<tex-math id="M14">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$ \hat{\pi } $$\end{document}</tex-math>
</inline-formula>
<sub>G</sub>
</td>
<td char="(" align="char">0.170 (0.159, 0.179)</td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<p>
<italic>Note</italic>
: the substitution rate from nucleotide
<italic>i</italic>
to
<italic>j</italic>
,
<italic>q</italic>
<sub>
<italic>ij</italic>
</sub>
, can be calculated from this table as
<italic>q</italic>
<sub>
<italic>ij</italic>
</sub>
 = 
<italic>c</italic>
κπ
<sub>
<italic>j</italic>
</sub>
for transitions and
<italic>q</italic>
<sub>
<italic>ij</italic>
</sub>
 = 
<italic>c</italic>
π
<sub>
<italic>j</italic>
</sub>
for transversions, where
<italic>c</italic>
is a proportionality constant (for details see chap 1 in Yang
<xref ref-type="bibr" rid="CR59">2006</xref>
)</p>
</table-wrap-foot>
</table-wrap>
</p>
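<p>As a worked example of the note to Table 2, the sketch below recomputes the relative rates quoted in the text from the tabulated estimates. It assumes a proportionality constant c = 1, which reproduces the quoted values of q to within rounding; the dictionary layout and function names are introduced here for illustration.</p>
<preformat>
```python
KAPPA = 12.5  # transition/transversion rate ratio shared by all hosts (Table 2)

# Equilibrium nucleotide frequencies from Table 2
PI = {
    "avian": {"U": 0.235, "C": 0.207, "A": 0.329, "G": 0.229},
    "mammalian": {"U": 0.253, "C": 0.178, "A": 0.399, "G": 0.170},
}
PURINES = {"A", "G"}


def is_transition(i, j):
    """True for purine-to-purine or pyrimidine-to-pyrimidine changes."""
    return (i in PURINES) == (j in PURINES)


def rate(host, i, j, c=1.0):
    """HKY85 rate q_ij: c*kappa*pi_j for transitions, c*pi_j for transversions."""
    if i == j:
        raise ValueError("i and j must differ")
    pi_j = PI[host][j]
    return c * KAPPA * pi_j if is_transition(i, j) else c * pi_j


for host in PI:
    q_ga = rate(host, "G", "A")
    q_cu = rate(host, "C", "U")
    gc = PI[host]["G"] + PI[host]["C"]
    print(f"{host}: q_GA = {q_ga:.4f}, q_CU = {q_cu:.4f}, equilibrium G+C = {gc:.3f}")
```
</preformat>
<p>The equilibrium G + C content implied by the same frequencies is lower for the mammalian estimates than for the avian ones, which is the compositional decay discussed in the text.</p>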
</sec>
<sec id="Sec8">
<title>Stability of the Host Shift Node</title>
<p>An important property of nonhomogeneous, nonstationary models is their theoretical ability to identify the position where changes in the substitution pattern have occurred. The drift in base frequencies towards different equilibrium values along the tree branches should give, in theory, enough information to the maximum likelihood method to be able to identify the position of those nodes. In our case, it should allow the identification of the location where the host shift occurred. Figure 
<xref rid="Fig5" ref-type="fig">5</xref>
shows the likelihood surface for the branch projecting from the host shift towards the mammalian clade (
<italic>d</italic>
<sub>ma</sub>
) versus the branch projecting from the host shift towards the waterfowl clade (
<italic>d</italic>
<sub>wf</sub>
). The likelihood surface appears highly correlated along the
<italic>d</italic>
<sub>ma</sub>
 + 
<italic>d</italic>
<sub>wf</sub>
line, as are the estimated branch lengths from the bootstrap analysis (Fig. 
<xref rid="Fig5" ref-type="fig">5</xref>
). The bootstrapping exercise is essentially equivalent to sampling trees from the likelihood surface (a parametric bootstrap gives essentially the same results). For comparison, Fig. 
<xref rid="Fig5" ref-type="fig">5</xref>
also shows the likelihood surface for the two branches projecting forward from the human–swine split (
<italic>d</italic>
<sub>hu</sub>
and
<italic>d</italic>
<sub>sw</sub>
). The estimation of these branches is far more accurate, and their estimates are uncorrelated (Fig. 
<xref rid="Fig5" ref-type="fig">5</xref>
). The correlation in the likelihood surface seen in Fig. 
<xref rid="Fig5" ref-type="fig">5</xref>
translates into wide confidence intervals for the lengths of the branches projecting from the host shift (e.g.,
<inline-formula id="IEq5">
<pmc-comment> Alternate image not processed: 239_2009_9282_Article_IEq5.gif </pmc-comment>
<tex-math id="M15">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$ \hat{d}_{\text{ma}} = 0.0341 $$\end{document}</tex-math>
</inline-formula>
, 95% CI: 0.0, 0.0626). It is interesting to note that the sum of these branches can be estimated much more reliably (
<inline-formula id="IEq6">
<pmc-comment> Alternate image not processed: 239_2009_9282_Article_IEq6.gif </pmc-comment>
<tex-math id="M16">\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$ \hat{d}_{\text{ma}} + \hat{d}_{\text{wf}} = 0.159 $$\end{document}</tex-math>
</inline-formula>
, 95% CI: 0.147, 0.175). The correlation observed between
<italic>d</italic>
<sub>ma</sub>
and
<italic>d</italic>
<sub>wf</sub>
is directly related to the pulley effect that precludes the identification of the root in a reversible, stationary tree (Felsenstein
<xref ref-type="bibr" rid="CR13">1981</xref>
).
<fig id="Fig5">
<label>Fig. 5</label>
<caption>
<p>Stability of the maximum likelihood estimates of branch lengths for model M
<sub>2</sub>
. The plot shows the log-likelihood profiles (
<italic>top</italic>
) and bootstrap sample estimates (
<italic>bottom</italic>
) for selected pairwise branch comparisons. The inset tree is the tree optimized under the HKY85 M
<sub>2</sub>
model, showing the waterfowl (
<italic>Wf</italic>
), human (
<italic>Hu</italic>
), and swine (
<italic>Sw</italic>
) clades, the host shift event (HSE) and the human–swine split (HSS). The two branches protruding from the host shift event are
<italic>d</italic>
<sub>wf</sub>
and
<italic>d</italic>
<sub>ma</sub>
, and the two branches protruding forward from the human–swine split are
<italic>d</italic>
<sub>sw</sub>
and
<italic>d</italic>
<sub>hu</sub>
</p>
</caption>
<graphic position="anchor" xlink:href="239_2009_9282_Fig5_HTML" id="MO5"></graphic>
</fig>
</p>
</sec>
<sec id="Sec9">
<title>Tree Calibration and the Origin of the 1918 Pandemic Virus</title>
<p>The HKY85 M
<sub>2</sub>
tree optimized by ML has branch lengths in substitutions per site, as substitution rate and real time are confounded factors that cannot be estimated independently without additional information (Yang
<xref ref-type="bibr" rid="CR56">1994</xref>
). To estimate the date of the host shift event we calibrated the tree using Langley and Fitch’s (LF) molecular clock model (Langley and Fitch
<xref ref-type="bibr" rid="CR31">1974</xref>
) and timed the nodes along the human–swine portion of the HKY85 M
<sub>2</sub>
optimized tree. We used an implementation that uses a negative binomial correction to account for rate heterogeneity among sites and that considers local variations in the clock rate (r8s; Sanderson
<xref ref-type="bibr" rid="CR42">2002</xref>
,
<xref ref-type="bibr" rid="CR43">2003</xref>
). Substitution rates for each branch (a fully relaxed clock) and the ages of internal nodes were then estimated by penalized likelihood under the corrected LF model. This procedure was repeated for each of the 1,000 bootstrap trees, so as to assess the variability of substitution rates and age estimates under variable branch lengths and tree topologies.</p>
<p>Before fitting the LF model to date the host shift event, two oddities concerning the data analyzed need to be addressed (Fig.
<xref rid="Fig6" ref-type="fig">6</xref>
). First, human viruses isolated between 1933 and 1957 have been passaged an undefined number of times in the laboratory before sequence determination, thus accumulating a substantial number of nucleotide substitutions (Bush et al.
<xref ref-type="bibr" rid="CR7">2000</xref>
). Including these lab-adapted virus sequences in the estimation of the tree topology above is, however, not expected to lead to any errors since only the corresponding tips in the tree are expected to be elongated. These sequences provide valuable information for estimation of the evolutionary parameters and help reduce the variance of estimated internal branch lengths. However, including these sequences in the tree calibration would certainly lead to overestimation of the substitution rate, so the eight human viruses isolated between 1933 and 1957 were not considered for the LF analysis. The 1918 Brevig Mission virus sequence was obtained directly from tissue of an Inuit woman buried in the Alaskan permafrost (Taubenberger
<xref ref-type="bibr" rid="CR49">2006</xref>
), and has no passage history, so it was included. The other oddity in the data is that the H1N1 viruses that reappeared in the human population in 1977 were very similar to the extinct strains circulating around 1950 (Nakajima et al.
<xref ref-type="bibr" rid="CR34">1978</xref>
). The reasons for this evolutionary stasis are not clear (Kilbourne
<xref ref-type="bibr" rid="CR30">2006</xref>
), prompting the speculation that these were the product of a lab accident, perhaps involving the release of a frozen strain (Palese
<xref ref-type="bibr" rid="CR36">2004</xref>
). We estimated the phylogenetic age of the modern H1N1 viruses by maximizing the likelihood of the LF model assuming variable intervals of evolutionary stasis. A time gap of 24.6 years is the most likely, indicating that the 1977 strain originated around 1953 (95% CI: 1948–1956) in agreement with previous studies (Nakajima et al.
<xref ref-type="bibr" rid="CR34">1978</xref>
; Raymond et al.
<xref ref-type="bibr" rid="CR39">1986</xref>
). The average branch substitution rate per site per year in human and classical swine viruses is 2.44 × 10
<sup>−3</sup>
year
<sup>−1</sup>
(95% CI: 2.29 × 10
<sup>−3</sup>
, 2.58 × 10
<sup>−3</sup>
).</p>
<p>The human and swine lineages are estimated to have diverged between March 1913 and October 1915 (Table 
<xref rid="Tab3" ref-type="table">3</xref>
). The divergence time of this node seems reliable as the likelihood surface is well developed (Fig. 
<xref rid="Fig5" ref-type="fig">5</xref>
). The most recent common ancestor of human viruses (MRCAH, Fig. 
<xref rid="Fig3" ref-type="fig">3</xref>
) dates back to between February 1917 and April 1918 (Table 
<xref rid="Tab3" ref-type="table">3</xref>
). The host shift is estimated to have happened around 1882–1912. This assumes that the virus evolved at the average mammalian rate just after the host shift. However, accelerations of up to 50% in rate have been observed in swine viruses of recent avian origin (Ludwig et al.
<xref ref-type="bibr" rid="CR33">1995</xref>
). Assuming such an increased substitution rate throughout the genome would place the host shift around 1893–1913. Because the estimates of the lengths of the two branches projecting from the host shift are correlated (Fig. 
<xref rid="Fig5" ref-type="fig">5</xref>
), a large CI for the host shift date cannot be avoided (Table 
<xref rid="Tab3" ref-type="table">3</xref>
).
<table-wrap id="Tab3">
<label>Table 3</label>
<caption>
<p>Estimated dates for the host shift, human–swine split, and MRCAH</p>
</caption>
<table frame="hsides" rules="groups">
<thead>
<tr>
<th align="left">Node</th>
<th align="left">Date (95% CI)</th>
</tr>
</thead>
<tbody>
<tr>
<td align="left">Host shift</td>
<td char="(" align="char">1901.1 (1882.8, 1912.2)</td>
</tr>
<tr>
<td align="left">Host shift
<sup>a</sup>
</td>
<td char="(" align="char">1905.8 (1893.2, 1913.0)</td>
</tr>
<tr>
<td align="left">Human–swine split</td>
<td char="(" align="char">1914.6 (1913.2, 1915.8)</td>
</tr>
<tr>
<td align="left">MRCAH</td>
<td char="(" align="char">1917.8 (1917.2, 1918.3)</td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<p>
<sup>a</sup>
Assuming an accelerated substitution rate, 1.5 times the average rate</p>
</table-wrap-foot>
</table-wrap>
<fig id="Fig6">
<label>Fig. 6</label>
<caption>
<p>Branch length versus year of isolation for human and swine H1N1 viruses. The total branch length from each tip to the human–swine split is plotted against the isolation year.
<italic>Red dots</italic>
human,
<italic>blue dots</italic>
classical swine. The
<italic>empty dots</italic>
show the corrected ages for the human viruses that reappeared in 1977. The regression slope approximates the substitution rate. Some of the human viruses isolated between 1933 and 1957 deviate from the regression line owing to extensive laboratory passaging. The effect is negligible for the early swine viruses (1931–1957). (Color figure online)</p>
</caption>
<graphic position="anchor" xlink:href="239_2009_9282_Fig6_HTML" id="MO6"></graphic>
</fig>
</p>
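<p>The effect of the assumed rate acceleration on the host shift date can be approximated with a back-of-the-envelope calculation. The sketch below is not the procedure used for Table 3: it takes only the point estimates from that table and assumes that, for a fixed branch length in substitutions per site, the elapsed time between the host shift and the human–swine split scales inversely with the rate on that branch. The function name is introduced here for illustration.</p>
<preformat>
```python
def rescale_node_date(descendant_date, node_date, rate_factor):
    """Re-date a node when the rate on the branch below it is multiplied by rate_factor.

    The branch length in substitutions per site is held fixed, so the elapsed
    time on the branch is divided by rate_factor.
    """
    elapsed = descendant_date - node_date
    return descendant_date - elapsed / rate_factor


# Point estimates from Table 3: human-swine split 1914.6, host shift 1901.1
host_shift_accelerated = rescale_node_date(
    descendant_date=1914.6, node_date=1901.1, rate_factor=1.5
)
print(round(host_shift_accelerated, 1))
```
</preformat>
<p>This simple rescaling gives a date near 1905.6, close to the 1905.8 reported in Table 3; a small difference is to be expected, since the tabulated value presumably comes from the full calibration rather than from this arithmetic.</p>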
</sec>
<sec id="Sec10">
<title>Reliability of the LF Local Clock Model Calibration</title>
<p>To test the reliability of the LF local clock calibration, we set the isolation date of the 1918 sequence as an unknown parameter and re-estimated it. We repeated this procedure for every sequence (except for the early, lab-adapted human isolates, 1933–1957). We recovered the isolation date to within −1.30 to +1.52 years for all sequences (mean error = 0.013 years, SD = 0.64 years). The pandemic virus, dating from November 1918, was dated as June 1918, a 5-month error. Because the tip ages are highly correlated with the ages of the corresponding subtending nodes, and the variances of the estimated tip ages are larger than the variances of the corresponding node ages, it seems that the LF relaxed clock gives a robust calibration of the tree. We also re-analyzed the third codon sites from the whole alignment. Using only these sites we were able to retrieve the tree topology, the evolutionary parameters under model M
<sub>2</sub>
, and all the node ages.</p>
<p>A limitation of the LF model is that it assumes the substitution process is Poissonian (or negative binomial when rate variation is considered). This is true under simple nucleotide substitution models such as Jukes and Cantor; however, for more complicated models like HKY85 the process is not Poissonian (Yang
<xref ref-type="bibr" rid="CR59">2006</xref>
), although the deviations do not seem important. Also, using the ML branch lengths as a proxy for the observed number of substitutions in the LF calibration, instead of re-estimating the branch lengths under a clock model and a full substitution matrix, implies a loss of information from the data. We used an implementation of the TipDate model (PAML; Yang
<xref ref-type="bibr" rid="CR58">1997</xref>
; Rambaut
<xref ref-type="bibr" rid="CR38">2000</xref>
) to re-estimate the ages of all internal nodes under the HKY85 model, which should address the concerns about the LF model above. The current TipDate implementation assumes stationarity; however, this does not seem to generate any noticeable discrepancies, as the estimated ages for the internal nodes are nearly identical for both methods (
<italic>r</italic>
 > 0.999).</p>
<p>There is a subtle but important point to the penalized likelihood and bootstrap approach used here. Although the bootstrap correctly accounts for uncertainties in branch length estimates, it does not take into account variations in the relaxed clock rates and divergence times, even if the branch lengths were perfectly known (Thorne and Kishino
<xref ref-type="bibr" rid="CR52">2005</xref>
). The result is that the uncertainties in divergence times are underestimated. Applying a Bayesian MCMC approach with an independent log-normal relaxed clock (Drummond and Rambaut
<xref ref-type="bibr" rid="CR10">2007</xref>
; Drummond et al.
<xref ref-type="bibr" rid="CR11">2006</xref>
), we find a divergence time for human and swine viruses of 1911.7–1916.1, and a date of 1916.3–1918.1 for the MRCAH. This approach assumes homogeneity and stationarity, so it cannot be used to date the host shift. Furthermore, the independence assumption is likely to overestimate the uncertainty in date estimates as it overlooks the different substitution rates in the human and swine lineages (Ludwig et al.
<xref ref-type="bibr" rid="CR33">1995</xref>
).</p>
</sec>
</sec>
<sec id="Sec11" sec-type="discussion">
<title>Discussion</title>
<p>Rabadan et al. (
<xref ref-type="bibr" rid="CR37">2006</xref>
) noticed the differences in nucleotide composition between avian and human influenza viruses. Here we show that these differences extend to classical swine viruses and that they can be modeled as a nonhomogeneous process along the waterfowl–mammalian phylogenetic tree. Analysis of the posterior site rates from the discrete gamma distribution (Yang and Kumar
<xref ref-type="bibr" rid="CR60">1996</xref>
) shows that the mostly synonymous third codon sites evolve over 5 times faster than first and second sites. Most of the G + C decay signal comes from these third sites. Moreover, when the whole analysis was repeated using third sites alone, essentially all results were reproduced. This would suggest that the G + C decay is the consequence of a selectively neutral process (although see Greenbaum et al.
<xref ref-type="bibr" rid="CR21">2008</xref>
). Rabadan et al. (
<xref ref-type="bibr" rid="CR37">2006</xref>
) used the increase in
<italic>U</italic>
frequency observed in two human strains (1918 and 1933) to calculate the earliest date for the introduction of the polymerase genes into a mammalian virus, estimating this at roughly 1910. This point estimate falls within our estimated CI for the host shift; however, we disagree with the conclusion of those authors that this is the earliest possible date for the host shift, as they neither considered the variance of their estimate, nor the effect of rate variation among sites.</p>
<p>Our analysis was performed on the concatenated set of gene segments. Is this approach justified? The estimated topology for the concatenated set of eight RNA segments for the mammalian part of the tree is fully resolved (Fig. 
<xref rid="Fig1" ref-type="fig">1</xref>
). However, this is not the case when the topology is estimated for each gene segment separately. Analysis of the individual segments shows similar results, with four of the segments (PA, HA, NP, NA) supporting this topology. The segments encoding the PB2 and PB1 proteins place the 1918 sequence at the bottom of the swine virus lineage with high, but inconclusive, bootstrap support (52 and 77%). The segments encoding the M and NS genes, the smallest of the eight segments, do not hold enough phylogenetic information to resolve the position of the 1918 sequence relative to the human–swine split node. The uncertainty in the position of the 1918 sequence for these segments is most likely an artifact of the long branches linking this sequence with the rest of the tree. The 1918 sequence itself is confidently placed at the bottom of the human branch when the full concatenated set is considered (100% bootstrap support). If we take the gene trees literally, the only possibility is that there were two different strains circulating in 1918 that reassorted to form the 1918 pandemic virus. This reassortant would have been replaced later by a non-reassortant some time before the earliest post-1918 human isolates of the 1930s. While this is an intriguing possibility, in the absence of more convincing statistical support we agree with Worobey’s (
<xref ref-type="bibr" rid="CR55">2008</xref>
) view that the 1918 sequence is much more reasonably placed on the human lineage. There does, however, seem to be reassortment occurring on the avian part of the tree, but the topology and timing of this part of the tree are not used in the analysis, and such reassortment does not affect the estimation of evolutionary parameters. Analysis of the individual genes gives similar values for the evolutionary parameters for all eight gene segments, as well as the concatenated gene set, especially in nucleotide frequencies, indicating that our values are robust to errors in tree topology in the avian part of the tree (Fig. 
<xref rid="Fig1" ref-type="fig">1</xref>
).</p>
<p>There still exists, however, the possibility that the segments that formed the 1918 virus were the product of sequential reassortment events involving avian-like viruses in a mammalian host before the split of the human and swine lineages. For example, a mammalian virus might have reassorted with an avian virus to produce a hybrid reassortant (such as in the 1957 and 1968 pandemics; Kawaoka et al.
<xref ref-type="bibr" rid="CR29">1989</xref>
); this hybrid might in turn have reassorted again one or more times, losing the original segments and resulting in an avian-like virus with different segments introduced at different times and showing different levels of nucleotide composition decay. We performed a similar analysis on each of the eight H1N1 RNA segments, and obtained individual host transfer dates for each segment varying from 1840 to 1912. In particular, the HA and NP segments seem to have been introduced earlier (pre-1890) than the polymerase genes (post-1900). We intentionally avoid giving specific ages to the individual segments, as the estimated lengths of the branches projecting from the host shift node are highly correlated (Fig. 
<xref rid="Fig7" ref-type="fig">7</xref>
), making the estimation of the individual host transfer dates highly uncertain. Concatenating the segments reduces the variance of date estimates, at the expense of assuming a single host shift event. The pulley effect, which precludes the identification of the root in a stationary tree, is still present and hampers the identification of the substitution pattern shift node along a nonhomogeneous tree. With the current data and analysis it is not possible to distinguish between a single host shift event and a successive series of host transfer/reassortment events. Disentangling the ages of the individual gene segments that formed the 1918 virus is difficult and will require further analysis.
<fig id="Fig7">
<label>Fig. 7</label>
<caption>
<p>Bootstrap distribution of the branches projecting from the host shift node (
<italic>d</italic>
<sub>ma</sub>
and
<italic>d</italic>
<sub>wf</sub>
) for the HA gene. Both branch parameters are highly correlated, making the estimation of the age of the HA gene in mammals unreliable.</p>
</caption>
<graphic position="anchor" xlink:href="239_2009_9282_Fig7_HTML" id="MO7"></graphic>
</fig>
</p>
<p>Even before the genome sequence of the 1918 virus became available, several authors had already suggested that the ancestor of the 1918 virus was of avian origin (Gorman et al.
<xref ref-type="bibr" rid="CR19">1990</xref>
,
<xref ref-type="bibr" rid="CR20">1991</xref>
; Gammelin et al.
<xref ref-type="bibr" rid="CR17">1990</xref>
). Gammelin et al. (
<xref ref-type="bibr" rid="CR17">1990</xref>
) cautiously suggested an origin for the mammalian virus around 1837. Because they used the divergence between mammalian and avian viruses as the reference point in the NP phylogenetic tree to propose their date, this should be regarded as the earliest possible date. Gorman et al. (
<xref ref-type="bibr" rid="CR20">1991</xref>
) also used a phylogenetic tree based on the NP segment. They noticed that the NP proteins from early human and classical swine viruses (~1930s) were very similar to those from avian viruses, and argued (similarly to Reid et al.
<xref ref-type="bibr" rid="CR40">2004</xref>
; Taubenberger et al.
<xref ref-type="bibr" rid="CR50">2005</xref>
) that the host shift event must have been roughly coincident with the divergence of these lineages, an event that they calculated as occurring around 1912–1913 (close to our estimate of the date of the human–swine split) or 1918 (after considering the possibility of an accelerated substitution rate between 1918 and the 1930s). The accelerated substitution rate was suggested to explain how the host shift event could have occurred in 1918, allowing the simultaneous epidemics in swine and humans to be caused by a single event. With the availability of the 1918 sequence, the phylogenetic tree becomes much more resolved and this possibility is eliminated. Both of these studies implicitly assumed that the host shift happened at internal bifurcating nodes in the tree. Here we show that this is not necessarily so, as the host shift is more likely to have occurred before the divergence of the human and swine lineages.</p>
<p>Previous work (Taubenberger et al.
<xref ref-type="bibr" rid="CR50">2005</xref>
; Gorman et al.
<xref ref-type="bibr" rid="CR20">1991</xref>
; Gammelin et al.
<xref ref-type="bibr" rid="CR17">1990</xref>
) has highlighted the difficulty in piecing together evolutionary scenarios based solely on phylogenetic trees. Ideally we would want an internal clock that starts to tick when the host shift event occurs. Previous researchers have used the amino acid substitutions that distinguish mammalian and avian influenza (Taubenberger et al.
<xref ref-type="bibr" rid="CR50">2005</xref>
; Gorman et al.
<xref ref-type="bibr" rid="CR20">1991</xref>
). There are numerous reasons to doubt the validity of such calculations, as amino acid substitutions are relatively few in number and subject to idiosyncratic timing caused both by substitutions that might influence the probabilities of host shifts and by the evolutionary pressure to accept these substitutions in the new host. In contrast, we have analyzed the changes that occur in nucleotide frequency, representing host-specific substitution rates rather than adaptive changes. For instance, when only the mostly synonymous, third codon sites from the concatenated alignment were used, we were still able to retrieve the tree topology, the evolutionary parameters, and all the node timings, including the host shift.</p>
<p>Because most nucleotide changes seem to be selectively neutral, and since they occur at numerous locations along the entire sequence, we were not only able to make a reasonable estimate of the host shift event, but we were also able to use sophisticated nonstationary evolutionary models and perform the type of rigorous statistical analysis that has been lacking in previous work. Our results are hence more likely to be robust to the different effects that occur at different locations under different degrees of selective pressure at the amino acid level in populations of varying size. The nonhomogeneous method we propose here should have wider applications beyond influenza.</p>
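<p>The idea of nucleotide composition acting as a clock that starts at the host shift can be illustrated with a minimal sketch. It assumes a simple F81-type process, which is a simplification chosen for illustration and not the nonhomogeneous model fitted in this study: after the shift, the expected base frequencies relax exponentially from the old composition toward the equilibrium of the new host. The function names, the rate parameter mu, and the two compositions are hypothetical and are not estimates from our data.</p>

```python
import math

# Illustrative compositions only; these are NOT estimates from the paper.
AVIAN_LIKE = {"A": 0.32, "C": 0.20, "G": 0.25, "U": 0.23}
HUMAN_LIKE = {"A": 0.35, "C": 0.18, "G": 0.22, "U": 0.25}


def composition_after_shift(pi_start, pi_target, mu, years):
    """Expected base frequencies `years` after a host shift.

    Assumption: F81-type process in which each site is redrawn from the
    new equilibrium `pi_target` at rate `mu` per year, so the fraction of
    sites still carrying the old composition is exp(-mu * years).
    """
    keep = math.exp(-mu * years)
    return {
        base: pi_target[base] + (pi_start[base] - pi_target[base]) * keep
        for base in pi_start
    }


def years_since_shift(pi_start, pi_target, pi_observed, mu, base="A"):
    """Invert the exponential decay for one base to recover elapsed time."""
    remaining = (pi_observed[base] - pi_target[base]) / (
        pi_start[base] - pi_target[base]
    )
    return -math.log(remaining) / mu


if __name__ == "__main__":
    mu = 0.005  # hypothetical rate per site per year
    observed = composition_after_shift(AVIAN_LIKE, HUMAN_LIKE, mu, years=30.0)
    print(observed)
    print(years_since_shift(AVIAN_LIKE, HUMAN_LIKE, observed, mu))
```

<p>In this toy setting the elapsed time is recovered exactly because the rate and both equilibrium compositions are known. In a real analysis these quantities must be estimated jointly with the branch lengths, which is where the correlation between the branches around the host shift node enters.</p>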
<p>It has been suggested that the H1N1 classical swine lineage of influenza originated from a human source during the 1918–1919 outbreak (Taubenberger
<xref ref-type="bibr" rid="CR49">2006</xref>
). Our results, however, strongly indicate that this lineage split from the human one about 4 years before the pandemic. There are at least three possible hypotheses concerning the origin of the human and classical swine lineages of influenza: (a) an avian virus infected an unknown mammal, where it evolved for several years before infecting humans. It then infected swine around 1918 (Taubenberger et al.
<xref ref-type="bibr" rid="CR51">2006</xref>
); (b) an avian virus infected a human population where it evolved for several years before diverging into the classical swine and human lineages around 1914. Sometime after this date, the virus was introduced into the swine population; (c) an avian virus was transmitted to a swine population (Ludwig et al.
<xref ref-type="bibr" rid="CR33">1995</xref>
) where it evolved for several years, and sometime after 1913, but before early 1918, it crossed into humans leading to the 1918 pandemic. The problem with the first hypothesis is that the molecular data strongly support a human–swine split between 1913 and 1916, inconsistent with the idea that classical swine originated from the 1918 human epidemic. The problem with the second hypothesis is that avian viruses are less well adapted to the human than to the swine host. Avian hemagglutinins (including avian H1) bind preferentially to SAα-2,3Gal type avian receptors (Rogers and Paulson
<xref ref-type="bibr" rid="CR41">1983</xref>
), whereas human-adapted viruses (H1N1, H3N2, H2N2) bind preferentially to SAα-2,6Gal type receptors expressed in the upper respiratory tract in humans. Thus, avian viruses (such as H5N1) that have infected humans directly have not spread in the human population (Subbarao and Katz
<xref ref-type="bibr" rid="CR48">2000</xref>
). On the other hand, pigs express both SAα-2,6Gal and SAα-2,3Gal receptors and can readily be infected with avian and mammalian influenza viruses. This characteristic of the swine host led to the proposal of swine as mixing vessels for the reassortment of avian and mammalian influenza viruses (e.g. Scholtissek et al.
<xref ref-type="bibr" rid="CR45">1985</xref>
). Avian H1N1 viruses that became established in pigs in Europe (Brown et al.
<xref ref-type="bibr" rid="CR6">1997</xref>
) have subsequently caused occasional infections in humans (Gregory et al.
<xref ref-type="bibr" rid="CR22">2003</xref>
). More significantly, the emerging 2009 H1N1 pandemic is due to a reassortant virus which acquired its eight genes from different swine virus lineages, some of which originated from avian and human hosts (Dawood et al.
<xref ref-type="bibr" rid="CR8">2009</xref>
). There is still the problem of explaining the nearly simultaneous epidemics in swine and humans in 1918, given that the classical swine and human lineages had diverged years earlier. One possible explanation is that the swine epidemic was not noted until a similar epidemic appeared in humans in 1918. Alternatively, it is possible that the outbreaks of disease observed in swine during 1918 (Taubenberger
<xref ref-type="bibr" rid="CR49">2006</xref>
) were not due to a virus of the classical swine lineage but were caused by the human pandemic virus. This scenario is supported by the observation of human H1N1 viruses occasionally infecting swine (e.g., Neumeier et al.
<xref ref-type="bibr" rid="CR35">1994</xref>
), and by the recent infection of pigs in Canada by the 2009 H1N1 virus from a human source.</p>
<p>It is apparent that avian H1N1 viruses have become established in swine, while no instances of avian H1N1 viruses becoming established directly in humans have been observed. Considering this, we suggest that an avian virus infected a swine host around 1883–1913, where it evolved for some time before acquiring the capacity to infect and spread in humans. This virus then entered the human population sometime after 1913 but before early 1918, when it initiated the pandemic. It is unlikely that the H1N1 virus was widespread in the human population before 1918. Seroarchaeological studies suggest that an H3 subtype was circulating worldwide at the time (Dowdle
<xref ref-type="bibr" rid="CR9">1999</xref>
). What happened to the virus during 1913–1918 is not clear; analysis of archaeoviral samples predating 1918 might shed some light on this issue. We might never get a definite answer to what happened during the years preceding 1918, but the possibility of potentially hazardous viruses smoldering in an isolated host population (whether human or swine) stresses the importance of extensive worldwide surveillance of influenza.</p>
<p>While the current article was in review, Smith et al. also concluded that the common ancestor of the classical swine and human H1N1 lineages likely existed a few years before the pandemic of 1918 (Smith et al.
<xref ref-type="bibr" rid="CR46">2009a</xref>
), inconsistent with the classical swine lineage originating from the human 1918 outbreak and consistent with the identification of swine as a possible intermediate host.</p>
<p>While this manuscript was in preparation, the emergent pandemic H1N1 2009 virus was identified (Dawood et al.
<xref ref-type="bibr" rid="CR8">2009</xref>
). This is the first example, with the possible exception of 1918, in which a virus of swine origin has become established in the human population to cause a pandemic. Certain parallels are apparent between the 1918 and 2009 pandemics, especially the possible role of swine as an intermediate host. The role of swine as a mixing vessel of different lineages, an important feature of the 2009 swine-origin virus (Smith et al.
<xref ref-type="bibr" rid="CR47">2009b</xref>
), is less clear with the ‘Spanish flu’ pandemic; while we find limited evidence that the 1918 human pandemic was the result of a human/swine reassortment, Scholtissek (
<xref ref-type="bibr" rid="CR44">2008</xref>
) and Smith et al. (
<xref ref-type="bibr" rid="CR46">2009a</xref>
) both argue that this might have occurred for some of the segments. The possibility that the 2009 pandemic virus might increase in pathogenicity emphasizes the importance of understanding how the 1918 virus emerged and the basis of its extreme pathogenicity.</p>
</sec>
</body>
<back>
<ack>
<p>Thanks to John McCauley, Ziheng Yang and Rod Daniels for helpful comments. Thanks to Seena Shah for programming advice. This work was supported by the Medical Research Council, UK, and the European Union FP6 FLUPOL project number 044263.</p>
<p>
<bold>Open Access</bold>
This article is distributed under the terms of the Creative Commons Attribution Noncommercial License which permits any noncommercial use, distribution, and reproduction in any medium, provided the original author(s) and source are credited.</p>
</ack>
<ref-list id="Bib1">
<title>References</title>
<ref id="CR1">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Akaike</surname>
<given-names>H</given-names>
</name>
</person-group>
<article-title>A new look at the statistical model identification</article-title>
<source>IEEE Trans Autom Control</source>
<year>1974</year>
<volume>19</volume>
<fpage>716</fpage>
<lpage>723</lpage>
<pub-id pub-id-type="doi">10.1109/TAC.1974.1100705</pub-id>
</citation>
<citation citation-type="display-unstructured">Akaike H (1974) A new look at the statistical model identification. IEEE Trans Autom Control 19:716–723 </citation>
</ref>
<ref id="CR2">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Antonovics</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Hood</surname>
<given-names>ME</given-names>
</name>
<name>
<surname>Baker</surname>
<given-names>CH</given-names>
</name>
</person-group>
<article-title>Molecular virology: was the 1918 flu avian in origin?</article-title>
<source>Nature</source>
<year>2006</year>
<volume>440</volume>
<fpage>E9</fpage>
<pub-id pub-id-type="doi">10.1038/nature04824</pub-id>
</citation>
<citation citation-type="display-unstructured">Antonovics J, Hood ME, Baker CH (2006) Molecular virology: was the 1918 flu avian in origin? Nature 440:E9 (discussion E9–10)
<pub-id pub-id-type="pmid">16641950</pub-id>
</citation>
</ref>
<ref id="CR3">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Barry</surname>
<given-names>D</given-names>
</name>
<name>
<surname>Hartigan</surname>
<given-names>JA</given-names>
</name>
</person-group>
<article-title>Statistical analysis of hominoid molecular evolution</article-title>
<source>Stat Sci</source>
<year>1987</year>
<volume>2</volume>
<fpage>191</fpage>
<lpage>210</lpage>
<pub-id pub-id-type="doi">10.1214/ss/1177013353</pub-id>
</citation>
<citation citation-type="display-unstructured">Barry D, Hartigan JA (1987) Statistical analysis of hominoid molecular evolution. Stat Sci 2:191–210 </citation>
</ref>
<ref id="CR4">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Blanquart</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Lartillot</surname>
<given-names>N</given-names>
</name>
</person-group>
<article-title>A site- and time-heterogeneous model of amino acid replacement</article-title>
<source>Mol Biol Evol</source>
<year>2008</year>
<volume>25</volume>
<fpage>842</fpage>
<lpage>858</lpage>
<pub-id pub-id-type="doi">10.1093/molbev/msn018</pub-id>
</citation>
<citation citation-type="display-unstructured">Blanquart S, Lartillot N (2008) A site- and time-heterogeneous model of amino acid replacement. Mol Biol Evol 25:842–858
<pub-id pub-id-type="pmid">18234708</pub-id>
</citation>
</ref>
<ref id="CR5">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Boussau</surname>
<given-names>B</given-names>
</name>
<name>
<surname>Blanquart</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Necsulea</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Lartillot</surname>
<given-names>N</given-names>
</name>
<name>
<surname>Gouy</surname>
<given-names>M</given-names>
</name>
</person-group>
<article-title>Parallel adaptations to high temperatures in the Archaean eon</article-title>
<source>Nature</source>
<year>2008</year>
<volume>456</volume>
<fpage>942</fpage>
<lpage>945</lpage>
<pub-id pub-id-type="doi">10.1038/nature07393</pub-id>
</citation>
<citation citation-type="display-unstructured">Boussau B, Blanquart S, Necsulea A, Lartillot N, Gouy M (2008) Parallel adaptations to high temperatures in the Archaean eon. Nature 456:942–945
<pub-id pub-id-type="pmid">19037246</pub-id>
</citation>
</ref>
<ref id="CR6">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Brown</surname>
<given-names>IH</given-names>
</name>
<name>
<surname>Ludwig</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Olsen</surname>
<given-names>CW</given-names>
</name>
<name>
<surname>Hannoun</surname>
<given-names>C</given-names>
</name>
<name>
<surname>Scholtissek</surname>
<given-names>C</given-names>
</name>
<name>
<surname>Hinshaw</surname>
<given-names>VS</given-names>
</name>
<name>
<surname>Harris</surname>
<given-names>PA</given-names>
</name>
<name>
<surname>McCauley</surname>
<given-names>JW</given-names>
</name>
<name>
<surname>Strong</surname>
<given-names>I</given-names>
</name>
<name>
<surname>Alexander</surname>
<given-names>DJ</given-names>
</name>
</person-group>
<article-title>Antigenic and genetic analyses of H1N1 influenza A viruses from European pigs</article-title>
<source>J Gen Virol</source>
<year>1997</year>
<volume>78</volume>
<fpage>553</fpage>
<lpage>562</lpage>
</citation>
<citation citation-type="display-unstructured">Brown IH, Ludwig S, Olsen CW, Hannoun C, Scholtissek C, Hinshaw VS, Harris PA, McCauley JW, Strong I, Alexander DJ (1997) Antigenic and genetic analyses of H1N1 influenza A viruses from European pigs. J Gen Virol 78:553–562
<pub-id pub-id-type="pmid">9049404</pub-id>
</citation>
</ref>
<ref id="CR7">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Bush</surname>
<given-names>RM</given-names>
</name>
<name>
<surname>Smith</surname>
<given-names>CB</given-names>
</name>
<name>
<surname>Cox</surname>
<given-names>NJ</given-names>
</name>
<name>
<surname>Fitch</surname>
<given-names>WM</given-names>
</name>
</person-group>
<article-title>Effects of passage history and sampling bias on phylogenetic reconstruction of human influenza A evolution</article-title>
<source>Proc Natl Acad Sci USA</source>
<year>2000</year>
<volume>97</volume>
<fpage>6974</fpage>
<lpage>6980</lpage>
<pub-id pub-id-type="doi">10.1073/pnas.97.13.6974</pub-id>
</citation>
<citation citation-type="display-unstructured">Bush RM, Smith CB, Cox NJ, Fitch WM (2000) Effects of passage history and sampling bias on phylogenetic reconstruction of human influenza A evolution. Proc Natl Acad Sci USA 97:6974–6980
<pub-id pub-id-type="pmid">10860959</pub-id>
</citation>
</ref>
<ref id="CR8">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Dawood</surname>
<given-names>FS</given-names>
</name>
<name>
<surname>Jain</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Finelli</surname>
<given-names>L</given-names>
</name>
<name>
<surname>Shaw</surname>
<given-names>MW</given-names>
</name>
<name>
<surname>Lindstrom</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Garten</surname>
<given-names>RJ</given-names>
</name>
<name>
<surname>Gubareva</surname>
<given-names>LV</given-names>
</name>
<name>
<surname>Xu</surname>
<given-names>X</given-names>
</name>
<name>
<surname>Bridges</surname>
<given-names>CB</given-names>
</name>
<name>
<surname>Uyeki</surname>
<given-names>TM</given-names>
</name>
</person-group>
<article-title>Emergence of a novel swine-origin influenza A (H1N1) virus in humans</article-title>
<source>N Engl J Med</source>
<year>2009</year>
<volume>360</volume>
<fpage>2605</fpage>
<lpage>2615</lpage>
<pub-id pub-id-type="doi">10.1056/NEJMoa0903810</pub-id>
</citation>
<citation citation-type="display-unstructured">Dawood FS, Jain S, Finelli L, Shaw MW, Lindstrom S, Garten RJ, Gubareva LV, Xu X, Bridges CB, Uyeki TM (2009) Emergence of a novel swine-origin influenza A (H1N1) virus in humans. N Engl J Med 360:2605–2615
<pub-id pub-id-type="pmid">19423869</pub-id>
</citation>
</ref>
<ref id="CR9">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Dowdle</surname>
<given-names>WR</given-names>
</name>
</person-group>
<article-title>Influenza A virus recycling revisited</article-title>
<source>Bull World Health Organ</source>
<year>1999</year>
<volume>77</volume>
<fpage>820</fpage>
<lpage>828</lpage>
</citation>
<citation citation-type="display-unstructured">Dowdle WR (1999) Influenza A virus recycling revisited. Bull World Health Organ 77:820–828
<pub-id pub-id-type="pmid">10593030</pub-id>
</citation>
</ref>
<ref id="CR10">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Drummond</surname>
<given-names>AJ</given-names>
</name>
<name>
<surname>Rambaut</surname>
<given-names>A</given-names>
</name>
</person-group>
<article-title>BEAST: Bayesian evolutionary analysis by sampling trees</article-title>
<source>BMC Evol Biol</source>
<year>2007</year>
<volume>7</volume>
<fpage>214</fpage>
<pub-id pub-id-type="doi">10.1186/1471-2148-7-214</pub-id>
</citation>
<citation citation-type="display-unstructured">Drummond AJ, Rambaut A (2007) BEAST: Bayesian evolutionary analysis by sampling trees. BMC Evol Biol 7:214
<pub-id pub-id-type="pmid">17996036</pub-id>
</citation>
</ref>
<ref id="CR11">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Drummond</surname>
<given-names>AJ</given-names>
</name>
<name>
<surname>Ho</surname>
<given-names>SY</given-names>
</name>
<name>
<surname>Phillips</surname>
<given-names>MJ</given-names>
</name>
<name>
<surname>Rambaut</surname>
<given-names>A</given-names>
</name>
</person-group>
<article-title>Relaxed phylogenetics and dating with confidence</article-title>
<source>PLoS Biol</source>
<year>2006</year>
<volume>4</volume>
<fpage>e88</fpage>
<pub-id pub-id-type="doi">10.1371/journal.pbio.0040088</pub-id>
</citation>
<citation citation-type="display-unstructured">Drummond AJ, Ho SY, Phillips MJ, Rambaut A (2006) Relaxed phylogenetics and dating with confidence. PLoS Biol 4:e88
<pub-id pub-id-type="pmid">16683862</pub-id>
</citation>
</ref>
<ref id="CR12">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Edgar</surname>
<given-names>RC</given-names>
</name>
</person-group>
<article-title>MUSCLE: multiple sequence alignment with high accuracy and high throughput</article-title>
<source>Nucleic Acids Res</source>
<year>2004</year>
<volume>32</volume>
<fpage>1792</fpage>
<lpage>1797</lpage>
<pub-id pub-id-type="doi">10.1093/nar/gkh340</pub-id>
</citation>
<citation citation-type="display-unstructured">Edgar RC (2004) MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Res 32:1792–1797
<pub-id pub-id-type="pmid">15034147</pub-id>
</citation>
</ref>
<ref id="CR13">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Felsenstein</surname>
<given-names>J</given-names>
</name>
</person-group>
<article-title>Evolutionary trees from DNA sequences: a maximum likelihood approach</article-title>
<source>J Mol Evol</source>
<year>1981</year>
<volume>17</volume>
<fpage>368</fpage>
<lpage>376</lpage>
<pub-id pub-id-type="doi">10.1007/BF01734359</pub-id>
</citation>
<citation citation-type="display-unstructured">Felsenstein J (1981) Evolutionary trees from DNA sequences: a maximum likelihood approach. J Mol Evol 17:368–376
<pub-id pub-id-type="pmid">7288891</pub-id>
</citation>
</ref>
<ref id="CR14">
<citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname>Felsenstein</surname>
<given-names>J</given-names>
</name>
</person-group>
<source>Inferring phylogenies</source>
<year>2003</year>
<publisher-loc>Sunderland, USA</publisher-loc>
<publisher-name>Sinauer Associates</publisher-name>
</citation>
<citation citation-type="display-unstructured">Felsenstein J (2003) Inferring phylogenies. Sinauer Associates, Sunderland, USA </citation>
</ref>
<ref id="CR15">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Galtier</surname>
<given-names>N</given-names>
</name>
<name>
<surname>Gouy</surname>
<given-names>M</given-names>
</name>
</person-group>
<article-title>Inferring pattern and process: maximum-likelihood implementation of a nonhomogeneous model of DNA sequence evolution for phylogenetic analysis</article-title>
<source>Mol Biol Evol</source>
<year>1998</year>
<volume>15</volume>
<fpage>871</fpage>
<lpage>879</lpage>
</citation>
<citation citation-type="display-unstructured">Galtier N, Gouy M (1998) Inferring pattern and process: maximum-likelihood implementation of a nonhomogeneous model of DNA sequence evolution for phylogenetic analysis. Mol Biol Evol 15:871–879
<pub-id pub-id-type="pmid">9656487</pub-id>
</citation>
</ref>
<ref id="CR16">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Galtier</surname>
<given-names>N</given-names>
</name>
<name>
<surname>Tourasse</surname>
<given-names>N</given-names>
</name>
<name>
<surname>Gouy</surname>
<given-names>M</given-names>
</name>
</person-group>
<article-title>A nonhyperthermophilic common ancestor to extant life forms</article-title>
<source>Science</source>
<year>1999</year>
<volume>283</volume>
<fpage>220</fpage>
<lpage>221</lpage>
<pub-id pub-id-type="doi">10.1126/science.283.5399.220</pub-id>
</citation>
<citation citation-type="display-unstructured">Galtier N, Tourasse N, Gouy M (1999) A nonhyperthermophilic common ancestor to extant life forms. Science 283:220–221
<pub-id pub-id-type="pmid">9880254</pub-id>
</citation>
</ref>
<ref id="CR17">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Gammelin</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Altmuller</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Reinhardt</surname>
<given-names>U</given-names>
</name>
<name>
<surname>Mandler</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Harley</surname>
<given-names>VR</given-names>
</name>
<name>
<surname>Hudson</surname>
<given-names>PJ</given-names>
</name>
<name>
<surname>Fitch</surname>
<given-names>WM</given-names>
</name>
<name>
<surname>Scholtissek</surname>
<given-names>C</given-names>
</name>
</person-group>
<article-title>Phylogenetic analysis of nucleoproteins suggests that human influenza A viruses emerged from a 19th-century avian ancestor</article-title>
<source>Mol Biol Evol</source>
<year>1990</year>
<volume>7</volume>
<fpage>194</fpage>
<lpage>200</lpage>
</citation>
<citation citation-type="display-unstructured">Gammelin M, Altmuller A, Reinhardt U, Mandler J, Harley VR, Hudson PJ, Fitch WM, Scholtissek C (1990) Phylogenetic analysis of nucleoproteins suggests that human influenza A viruses emerged from a 19th-century avian ancestor. Mol Biol Evol 7:194–200
<pub-id pub-id-type="pmid">2319943</pub-id>
</citation>
</ref>
<ref id="CR18">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Gibbs</surname>
<given-names>MJ</given-names>
</name>
<name>
<surname>Gibbs</surname>
<given-names>AJ</given-names>
</name>
</person-group>
<article-title>Molecular virology: was the 1918 pandemic caused by a bird flu?</article-title>
<source>Nature</source>
<year>2006</year>
<volume>440</volume>
<fpage>E8</fpage>
<pub-id pub-id-type="doi">10.1038/nature04823</pub-id>
</citation>
<citation citation-type="display-unstructured">Gibbs MJ, Gibbs AJ (2006) Molecular virology: was the 1918 pandemic caused by a bird flu? Nature 440:E8 (discussion E9–10)
<pub-id pub-id-type="pmid">16641948</pub-id>
</citation>
</ref>
<ref id="CR19">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Gorman</surname>
<given-names>OT</given-names>
</name>
<name>
<surname>Donis</surname>
<given-names>RO</given-names>
</name>
<name>
<surname>Kawaoka</surname>
<given-names>Y</given-names>
</name>
<name>
<surname>Webster</surname>
<given-names>RG</given-names>
</name>
</person-group>
<article-title>Evolution of influenza A virus PB2 genes: implications for evolution of the ribonucleoprotein complex and origin of human influenza A virus</article-title>
<source>J Virol</source>
<year>1990</year>
<volume>64</volume>
<fpage>4893</fpage>
<lpage>4902</lpage>
</citation>
<citation citation-type="display-unstructured">Gorman OT, Donis RO, Kawaoka Y, Webster RG (1990) Evolution of influenza A virus PB2 genes: implications for evolution of the ribonucleoprotein complex and origin of human influenza A virus. J Virol 64:4893–4902
<pub-id pub-id-type="pmid">2398532</pub-id>
</citation>
</ref>
<ref id="CR20">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Gorman</surname>
<given-names>OT</given-names>
</name>
<name>
<surname>Bean</surname>
<given-names>WJ</given-names>
</name>
<name>
<surname>Kawaoka</surname>
<given-names>Y</given-names>
</name>
<name>
<surname>Donatelli</surname>
<given-names>I</given-names>
</name>
<name>
<surname>Guo</surname>
<given-names>YJ</given-names>
</name>
<name>
<surname>Webster</surname>
<given-names>RG</given-names>
</name>
</person-group>
<article-title>Evolution of influenza A virus nucleoprotein genes: implications for the origins of H1N1 human and classical swine viruses</article-title>
<source>J Virol</source>
<year>1991</year>
<volume>65</volume>
<fpage>3704</fpage>
<lpage>3714</lpage>
</citation>
<citation citation-type="display-unstructured">Gorman OT, Bean WJ, Kawaoka Y, Donatelli I, Guo YJ, Webster RG (1991) Evolution of influenza A virus nucleoprotein genes: implications for the origins of H1N1 human and classical swine viruses. J Virol 65:3704–3714
<pub-id pub-id-type="pmid">2041090</pub-id>
</citation>
</ref>
<ref id="CR21">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Greenbaum</surname>
<given-names>BD</given-names>
</name>
<name>
<surname>Levine</surname>
<given-names>AJ</given-names>
</name>
<name>
<surname>Bhanot</surname>
<given-names>G</given-names>
</name>
<name>
<surname>Rabadan</surname>
<given-names>R</given-names>
</name>
</person-group>
<article-title>Patterns of evolution and host gene mimicry in influenza and other RNA viruses</article-title>
<source>PLoS Pathog</source>
<year>2008</year>
<volume>4</volume>
<fpage>e1000079</fpage>
<pub-id pub-id-type="doi">10.1371/journal.ppat.1000079</pub-id>
</citation>
<citation citation-type="display-unstructured">Greenbaum BD, Levine AJ, Bhanot G, Rabadan R (2008) Patterns of evolution and host gene mimicry in influenza and other RNA viruses. PLoS Pathog 4:e1000079
<pub-id pub-id-type="pmid">18535658</pub-id>
</citation>
</ref>
<ref id="CR22">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Gregory</surname>
<given-names>V</given-names>
</name>
<name>
<surname>Bennett</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Thomas</surname>
<given-names>Y</given-names>
</name>
<name>
<surname>Kaiser</surname>
<given-names>L</given-names>
</name>
<name>
<surname>Wunderli</surname>
<given-names>W</given-names>
</name>
<name>
<surname>Matter</surname>
<given-names>H</given-names>
</name>
<name>
<surname>Hay</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Lin</surname>
<given-names>YP</given-names>
</name>
</person-group>
<article-title>Human infection by a swine influenza A (H1N1) virus in Switzerland</article-title>
<source>Arch Virol</source>
<year>2003</year>
<volume>148</volume>
<fpage>793</fpage>
<lpage>802</lpage>
<pub-id pub-id-type="doi">10.1007/s00705-002-0953-9</pub-id>
</citation>
<citation citation-type="display-unstructured">Gregory V, Bennett M, Thomas Y, Kaiser L, Wunderli W, Matter H, Hay A, Lin YP (2003) Human infection by a swine influenza A (H1N1) virus in Switzerland. Arch Virol 148:793–802
<pub-id pub-id-type="pmid">12664301</pub-id>
</citation>
</ref>
<ref id="CR23">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Gu</surname>
<given-names>X</given-names>
</name>
<name>
<surname>Li</surname>
<given-names>WH</given-names>
</name>
</person-group>
<article-title>Estimation of evolutionary distances under stationary and nonstationary models of nucleotide substitution</article-title>
<source>Proc Natl Acad Sci USA</source>
<year>1998</year>
<volume>95</volume>
<fpage>5899</fpage>
<lpage>5905</lpage>
<pub-id pub-id-type="doi">10.1073/pnas.95.11.5899</pub-id>
</citation>
<citation citation-type="display-unstructured">Gu X, Li WH (1998) Estimation of evolutionary distances under stationary and nonstationary models of nucleotide substitution. Proc Natl Acad Sci USA 95:5899–5905
<pub-id pub-id-type="pmid">9600890</pub-id>
</citation>
</ref>
<ref id="CR24">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Guindon</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Gascuel</surname>
<given-names>O</given-names>
</name>
</person-group>
<article-title>A simple, fast, and accurate algorithm to estimate large phylogenies by maximum likelihood</article-title>
<source>Syst Biol</source>
<year>2003</year>
<volume>52</volume>
<fpage>696</fpage>
<lpage>704</lpage>
<pub-id pub-id-type="doi">10.1080/10635150390235520</pub-id>
</citation>
<citation citation-type="display-unstructured">Guindon S, Gascuel O (2003) A simple, fast, and accurate algorithm to estimate large phylogenies by maximum likelihood. Syst Biol 52:696–704
<pub-id pub-id-type="pmid">14530136</pub-id>
</citation>
</ref>
<ref id="CR25">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Hasegawa</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Kishino</surname>
<given-names>H</given-names>
</name>
<name>
<surname>Yano</surname>
<given-names>T</given-names>
</name>
</person-group>
<article-title>Dating of the human-ape splitting by a molecular clock of mitochondrial DNA</article-title>
<source>J Mol Evol</source>
<year>1985</year>
<volume>22</volume>
<fpage>160</fpage>
<lpage>174</lpage>
<pub-id pub-id-type="doi">10.1007/BF02101694</pub-id>
</citation>
<citation citation-type="display-unstructured">Hasegawa M, Kishino H, Yano T (1985) Dating of the human-ape splitting by a molecular clock of mitochondrial DNA. J Mol Evol 22:160–174
<pub-id pub-id-type="pmid">3934395</pub-id>
</citation>
</ref>
<ref id="CR26">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Hay</surname>
<given-names>AJ</given-names>
</name>
<name>
<surname>Gregory</surname>
<given-names>V</given-names>
</name>
<name>
<surname>Douglas</surname>
<given-names>AR</given-names>
</name>
<name>
<surname>Lin</surname>
<given-names>YP</given-names>
</name>
</person-group>
<article-title>The evolution of human influenza viruses</article-title>
<source>Philos Trans R Soc Lond B Biol Sci</source>
<year>2001</year>
<volume>356</volume>
<fpage>1861</fpage>
<lpage>1870</lpage>
<pub-id pub-id-type="doi">10.1098/rstb.2001.0999</pub-id>
</citation>
<citation citation-type="display-unstructured">Hay AJ, Gregory V, Douglas AR, Lin YP (2001) The evolution of human influenza viruses. Philos Trans R Soc Lond B Biol Sci 356:1861–1870
<pub-id pub-id-type="pmid">11779385</pub-id>
</citation>
</ref>
<ref id="CR27">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Huelsenbeck</surname>
<given-names>JP</given-names>
</name>
<name>
<surname>Ronquist</surname>
<given-names>F</given-names>
</name>
<name>
<surname>Nielsen</surname>
<given-names>R</given-names>
</name>
<name>
<surname>Bollback</surname>
<given-names>JP</given-names>
</name>
</person-group>
<article-title>Bayesian inference of phylogeny and its impact on evolutionary biology</article-title>
<source>Science</source>
<year>2001</year>
<volume>294</volume>
<fpage>2310</fpage>
<lpage>2314</lpage>
<pub-id pub-id-type="doi">10.1126/science.1065889</pub-id>
</citation>
<citation citation-type="display-unstructured">Huelsenbeck JP, Ronquist F, Nielsen R, Bollback JP (2001) Bayesian inference of phylogeny and its impact on evolutionary biology. Science 294:2310–2314
<pub-id pub-id-type="pmid">11743192</pub-id>
</citation>
</ref>
<ref id="CR28">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Johnson</surname>
<given-names>NP</given-names>
</name>
<name>
<surname>Mueller</surname>
<given-names>J</given-names>
</name>
</person-group>
<article-title>Updating the accounts: global mortality of the 1918–1920 “Spanish” influenza pandemic</article-title>
<source>Bull Hist Med</source>
<year>2002</year>
<volume>76</volume>
<fpage>105</fpage>
<lpage>115</lpage>
<pub-id pub-id-type="doi">10.1353/bhm.2002.0022</pub-id>
</citation>
<citation citation-type="display-unstructured">Johnson NP, Mueller J (2002) Updating the accounts: global mortality of the 1918–1920 “Spanish” influenza pandemic. Bull Hist Med 76:105–115
<pub-id pub-id-type="pmid">11875246</pub-id>
</citation>
</ref>
<ref id="CR29">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Kawaoka</surname>
<given-names>Y</given-names>
</name>
<name>
<surname>Krauss</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Webster</surname>
<given-names>RG</given-names>
</name>
</person-group>
<article-title>Avian-to-human transmission of the PB1 gene of influenza A viruses in the 1957 and 1968 pandemics</article-title>
<source>J Virol</source>
<year>1989</year>
<volume>63</volume>
<fpage>4603</fpage>
<lpage>4608</lpage>
</citation>
<citation citation-type="display-unstructured">Kawaoka Y, Krauss S, Webster RG (1989) Avian-to-human transmission of the PB1 gene of influenza A viruses in the 1957 and 1968 pandemics. J Virol 63:4603–4608
<pub-id pub-id-type="pmid">2795713</pub-id>
</citation>
</ref>
<ref id="CR30">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Kilbourne</surname>
<given-names>ED</given-names>
</name>
</person-group>
<article-title>Influenza pandemics of the 20th century</article-title>
<source>Emerg Infect Dis</source>
<year>2006</year>
<volume>12</volume>
<fpage>9</fpage>
<lpage>14</lpage>
</citation>
<citation citation-type="display-unstructured">Kilbourne ED (2006) Influenza pandemics of the 20th century. Emerg Infect Dis 12:9–14
<pub-id pub-id-type="pmid">16494710</pub-id>
</citation>
</ref>
<ref id="CR31">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Langley</surname>
<given-names>CH</given-names>
</name>
<name>
<surname>Fitch</surname>
<given-names>WM</given-names>
</name>
</person-group>
<article-title>An examination of the constancy of the rate of molecular evolution</article-title>
<source>J Mol Evol</source>
<year>1974</year>
<volume>3</volume>
<fpage>161</fpage>
<lpage>177</lpage>
<pub-id pub-id-type="doi">10.1007/BF01797451</pub-id>
</citation>
<citation citation-type="display-unstructured">Langley CH, Fitch WM (1974) An examination of the constancy of the rate of molecular evolution. J Mol Evol 3:161–177
<pub-id pub-id-type="pmid">4368400</pub-id>
</citation>
</ref>
<ref id="CR32">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Lockhart</surname>
<given-names>PJ</given-names>
</name>
<name>
<surname>Steel</surname>
<given-names>MA</given-names>
</name>
<name>
<surname>Hendy</surname>
<given-names>MD</given-names>
</name>
<name>
<surname>Penny</surname>
<given-names>D</given-names>
</name>
</person-group>
<article-title>Recovering evolutionary trees under a more realistic model of sequence evolution</article-title>
<source>Mol Biol Evol</source>
<year>1994</year>
<volume>11</volume>
<fpage>605</fpage>
<lpage>612</lpage>
</citation>
<citation citation-type="display-unstructured">Lockhart PJ, Steel MA, Hendy MD, Penny D (1994) Recovering evolutionary trees under a more realistic model of sequence evolution. Mol Biol Evol 11:605–612
<pub-id pub-id-type="pmid">19391266</pub-id>
</citation>
</ref>
<ref id="CR33">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Ludwig</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Stitz</surname>
<given-names>L</given-names>
</name>
<name>
<surname>Planz</surname>
<given-names>O</given-names>
</name>
<name>
<surname>Van</surname>
<given-names>H</given-names>
</name>
<name>
<surname>Fitch</surname>
<given-names>WM</given-names>
</name>
<name>
<surname>Scholtissek</surname>
<given-names>C</given-names>
</name>
</person-group>
<article-title>European swine virus as a possible source for the next influenza pandemic?</article-title>
<source>Virology</source>
<year>1995</year>
<volume>212</volume>
<fpage>555</fpage>
<lpage>561</lpage>
<pub-id pub-id-type="doi">10.1006/viro.1995.1513</pub-id>
</citation>
<citation citation-type="display-unstructured">Ludwig S, Stitz L, Planz O, Van H, Fitch WM, Scholtissek C (1995) European swine virus as a possible source for the next influenza pandemic? Virology 212:555–561
<pub-id pub-id-type="pmid">7571425</pub-id>
</citation>
</ref>
<ref id="CR34">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Nakajima</surname>
<given-names>K</given-names>
</name>
<name>
<surname>Desselberger</surname>
<given-names>U</given-names>
</name>
<name>
<surname>Palese</surname>
<given-names>P</given-names>
</name>
</person-group>
<article-title>Recent human influenza A (H1N1) viruses are closely related genetically to strains isolated in 1950</article-title>
<source>Nature</source>
<year>1978</year>
<volume>274</volume>
<fpage>334</fpage>
<lpage>339</lpage>
<pub-id pub-id-type="doi">10.1038/274334a0</pub-id>
</citation>
<citation citation-type="display-unstructured">Nakajima K, Desselberger U, Palese P (1978) Recent human influenza A (H1N1) viruses are closely related genetically to strains isolated in 1950. Nature 274:334–339
<pub-id pub-id-type="pmid">672956</pub-id>
</citation>
</ref>
<ref id="CR35">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Neumeier</surname>
<given-names>E</given-names>
</name>
<name>
<surname>Meier-Ewert</surname>
<given-names>H</given-names>
</name>
<name>
<surname>Cox</surname>
<given-names>NJ</given-names>
</name>
</person-group>
<article-title>Genetic relatedness between influenza A (H1N1) viruses isolated from humans and pigs</article-title>
<source>J Gen Virol</source>
<year>1994</year>
<volume>75</volume>
<issue>Pt 8</issue>
<fpage>2103</fpage>
<lpage>2107</lpage>
<pub-id pub-id-type="doi">10.1099/0022-1317-75-8-2103</pub-id>
</citation>
<citation citation-type="display-unstructured">Neumeier E, Meier-Ewert H, Cox NJ (1994) Genetic relatedness between influenza A (H1N1) viruses isolated from humans and pigs. J Gen Virol 75(Pt 8):2103–2107
<pub-id pub-id-type="pmid">8046416</pub-id>
</citation>
</ref>
<ref id="CR36">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Palese</surname>
<given-names>P</given-names>
</name>
</person-group>
<article-title>Influenza: old and new threats</article-title>
<source>Nat Med</source>
<year>2004</year>
<volume>10</volume>
<fpage>S82</fpage>
<lpage>S87</lpage>
<pub-id pub-id-type="doi">10.1038/nm1141</pub-id>
</citation>
<citation citation-type="display-unstructured">Palese P (2004) Influenza: old and new threats. Nat Med 10:S82–S87
<pub-id pub-id-type="pmid">15577936</pub-id>
</citation>
</ref>
<ref id="CR37">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Rabadan</surname>
<given-names>R</given-names>
</name>
<name>
<surname>Levine</surname>
<given-names>AJ</given-names>
</name>
<name>
<surname>Robins</surname>
<given-names>H</given-names>
</name>
</person-group>
<article-title>Comparison of avian and human influenza A viruses reveals a mutational bias on the viral genomes</article-title>
<source>J Virol</source>
<year>2006</year>
<volume>80</volume>
<fpage>11887</fpage>
<lpage>11891</lpage>
<pub-id pub-id-type="doi">10.1128/JVI.01414-06</pub-id>
</citation>
<citation citation-type="display-unstructured">Rabadan R, Levine AJ, Robins H (2006) Comparison of avian and human influenza A viruses reveals a mutational bias on the viral genomes. J Virol 80:11887–11891
<pub-id pub-id-type="pmid">16987977</pub-id>
</citation>
</ref>
<ref id="CR38">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Rambaut</surname>
<given-names>A</given-names>
</name>
</person-group>
<article-title>Estimating the rate of molecular evolution: incorporating non-contemporaneous sequences into maximum likelihood phylogenies</article-title>
<source>Bioinformatics</source>
<year>2000</year>
<volume>16</volume>
<fpage>395</fpage>
<lpage>399</lpage>
<pub-id pub-id-type="doi">10.1093/bioinformatics/16.4.395</pub-id>
</citation>
<citation citation-type="display-unstructured">Rambaut A (2000) Estimating the rate of molecular evolution: incorporating non-contemporaneous sequences into maximum likelihood phylogenies. Bioinformatics 16:395–399
<pub-id pub-id-type="pmid">10869038</pub-id>
</citation>
</ref>
<ref id="CR39">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Raymond</surname>
<given-names>FL</given-names>
</name>
<name>
<surname>Caton</surname>
<given-names>AJ</given-names>
</name>
<name>
<surname>Cox</surname>
<given-names>NJ</given-names>
</name>
<name>
<surname>Kendal</surname>
<given-names>AP</given-names>
</name>
<name>
<surname>Brownlee</surname>
<given-names>GG</given-names>
</name>
</person-group>
<article-title>The antigenicity and evolution of influenza H1 haemagglutinin, from 1950–1957 and 1977–1983: two pathways from one gene</article-title>
<source>Virology</source>
<year>1986</year>
<volume>148</volume>
<fpage>275</fpage>
<lpage>287</lpage>
<pub-id pub-id-type="doi">10.1016/0042-6822(86)90325-9</pub-id>
</citation>
<citation citation-type="display-unstructured">Raymond FL, Caton AJ, Cox NJ, Kendal AP, Brownlee GG (1986) The antigenicity and evolution of influenza H1 haemagglutinin, from 1950–1957 and 1977–1983: two pathways from one gene. Virology 148:275–287
<pub-id pub-id-type="pmid">3942036</pub-id>
</citation>
</ref>
<ref id="CR40">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Reid</surname>
<given-names>AH</given-names>
</name>
<name>
<surname>Taubenberger</surname>
<given-names>JK</given-names>
</name>
<name>
<surname>Fanning</surname>
<given-names>TG</given-names>
</name>
</person-group>
<article-title>Evidence of an absence: the genetic origins of the 1918 pandemic influenza virus</article-title>
<source>Nat Rev Microbiol</source>
<year>2004</year>
<volume>2</volume>
<fpage>909</fpage>
<lpage>914</lpage>
<pub-id pub-id-type="doi">10.1038/nrmicro1027</pub-id>
</citation>
<citation citation-type="display-unstructured">Reid AH, Taubenberger JK, Fanning TG (2004) Evidence of an absence: the genetic origins of the 1918 pandemic influenza virus. Nat Rev Microbiol 2:909–914
<pub-id pub-id-type="pmid">15494747</pub-id>
</citation>
</ref>
<ref id="CR41">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Rogers</surname>
<given-names>GN</given-names>
</name>
<name>
<surname>Paulson</surname>
<given-names>JC</given-names>
</name>
</person-group>
<article-title>Receptor determinants of human and animal influenza virus isolates: differences in receptor specificity of the H3 hemagglutinin based on species of origin</article-title>
<source>Virology</source>
<year>1983</year>
<volume>127</volume>
<fpage>361</fpage>
<lpage>373</lpage>
<pub-id pub-id-type="doi">10.1016/0042-6822(83)90150-2</pub-id>
</citation>
<citation citation-type="display-unstructured">Rogers GN, Paulson JC (1983) Receptor determinants of human and animal influenza virus isolates: differences in receptor specificity of the H3 hemagglutinin based on species of origin. Virology 127:361–373
<pub-id pub-id-type="pmid">6868370</pub-id>
</citation>
</ref>
<ref id="CR42">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Sanderson</surname>
<given-names>MJ</given-names>
</name>
</person-group>
<article-title>Estimating absolute rates of molecular evolution and divergence times: a penalized likelihood approach</article-title>
<source>Mol Biol Evol</source>
<year>2002</year>
<volume>19</volume>
<fpage>101</fpage>
<lpage>109</lpage>
</citation>
<citation citation-type="display-unstructured">Sanderson MJ (2002) Estimating absolute rates of molecular evolution and divergence times: a penalized likelihood approach. Mol Biol Evol 19:101–109
<pub-id pub-id-type="pmid">11752195</pub-id>
</citation>
</ref>
<ref id="CR43">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Sanderson</surname>
<given-names>MJ</given-names>
</name>
</person-group>
<article-title>r8s: inferring absolute rates of molecular evolution and divergence times in the absence of a molecular clock</article-title>
<source>Bioinformatics</source>
<year>2003</year>
<volume>19</volume>
<fpage>301</fpage>
<lpage>302</lpage>
<pub-id pub-id-type="doi">10.1093/bioinformatics/19.2.301</pub-id>
</citation>
<citation citation-type="display-unstructured">Sanderson MJ (2003) r8s: inferring absolute rates of molecular evolution and divergence times in the absence of a molecular clock. Bioinformatics 19:301–302
<pub-id pub-id-type="pmid">12538260</pub-id>
</citation>
</ref>
<ref id="CR44">
<citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname>Scholtissek</surname>
<given-names>C</given-names>
</name>
</person-group>
<person-group person-group-type="editor">
<name>
<surname>Klenk</surname>
<given-names>H-D</given-names>
</name>
<name>
<surname>Matrosovich</surname>
<given-names>MN</given-names>
</name>
<name>
<surname>Stech</surname>
<given-names>J</given-names>
</name>
</person-group>
<article-title>History of research on avian influenza</article-title>
<source>Avian influenza</source>
<year>2008</year>
<publisher-loc>Basel</publisher-loc>
<publisher-name>Karger</publisher-name>
<fpage>101</fpage>
<lpage>117</lpage>
</citation>
<citation citation-type="display-unstructured">Scholtissek C (2008) History of research on avian influenza. In: Klenk H-D, Matrosovich MN, Stech J (eds) Avian influenza. Karger, Basel, pp 101–117 </citation>
</ref>
<ref id="CR45">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Scholtissek</surname>
<given-names>C</given-names>
</name>
<name>
<surname>Burger</surname>
<given-names>H</given-names>
</name>
<name>
<surname>Kistner</surname>
<given-names>O</given-names>
</name>
<name>
<surname>Shortridge</surname>
<given-names>KF</given-names>
</name>
</person-group>
<article-title>The nucleoprotein as a possible major factor in determining host specificity of influenza H3N2 viruses</article-title>
<source>Virology</source>
<year>1985</year>
<volume>147</volume>
<fpage>287</fpage>
<lpage>294</lpage>
<pub-id pub-id-type="doi">10.1016/0042-6822(85)90131-X</pub-id>
</citation>
<citation citation-type="display-unstructured">Scholtissek C, Burger H, Kistner O, Shortridge KF (1985) The nucleoprotein as a possible major factor in determining host specificity of influenza H3N2 viruses. Virology 147:287–294
<pub-id pub-id-type="pmid">2416114</pub-id>
</citation>
</ref>
<ref id="CR46">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Smith</surname>
<given-names>GJD</given-names>
</name>
<name>
<surname>Bahl</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Vijaykrishna</surname>
<given-names>D</given-names>
</name>
<name>
<surname>Zhang</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Poon</surname>
<given-names>LLM</given-names>
</name>
<name>
<surname>Chen</surname>
<given-names>H</given-names>
</name>
<name>
<surname>Webster</surname>
<given-names>RG</given-names>
</name>
<name>
<surname>Malik Peiris</surname>
<given-names>JS</given-names>
</name>
<name>
<surname>Guan</surname>
<given-names>Y</given-names>
</name>
</person-group>
<article-title>Dating the emergence of pandemic influenza viruses</article-title>
<source>Proc Natl Acad Sci USA</source>
<year>2009</year>
<volume>106</volume>
<fpage>11709</fpage>
<lpage>11712</lpage>
<pub-id pub-id-type="doi">10.1073/pnas.0904991106</pub-id>
</citation>
<citation citation-type="display-unstructured">Smith GJD, Bahl J, Vijaykrishna D, Zhang J, Poon LLM, Chen H, Webster RG, Malik Peiris JS, Guan Y (2009a) Dating the emergence of pandemic influenza viruses. Proc Natl Acad Sci USA 106:11709–11712
<pub-id pub-id-type="pmid">19597152</pub-id>
</citation>
</ref>
<ref id="CR47">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Smith</surname>
<given-names>GJD</given-names>
</name>
<name>
<surname>Vijaykrishna</surname>
<given-names>D</given-names>
</name>
<name>
<surname>Bahl</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Lycett</surname>
<given-names>SJ</given-names>
</name>
<name>
<surname>Worobey</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Pybus</surname>
<given-names>OG</given-names>
</name>
<name>
<surname>Ma</surname>
<given-names>SK</given-names>
</name>
<name>
<surname>Cheung</surname>
<given-names>CL</given-names>
</name>
<name>
<surname>Raghwani</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Bhatt</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Malik Peiris</surname>
<given-names>JS</given-names>
</name>
<name>
<surname>Guan</surname>
<given-names>Y</given-names>
</name>
<name>
<surname>Rambaut</surname>
<given-names>A</given-names>
</name>
</person-group>
<article-title>Origins and evolutionary genomics of the 2009 swine-origin H1N1 influenza A epidemic</article-title>
<source>Nature</source>
<year>2009</year>
<volume>459</volume>
<fpage>1122</fpage>
<lpage>1125</lpage>
<pub-id pub-id-type="doi">10.1038/nature08182</pub-id>
</citation>
<citation citation-type="display-unstructured">Smith GJD, Vijaykrishna D, Bahl J, Lycett SJ, Worobey M, Pybus OG, Ma SK, Cheung CL, Raghwani J, Bhatt S, Malik Peiris JS, Guan Y, Rambaut A (2009b) Origins and evolutionary genomics of the 2009 swine-origin H1N1 influenza A epidemic. Nature 459:1122–1125
<pub-id pub-id-type="pmid">19516283</pub-id>
</citation>
</ref>
<ref id="CR48">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Subbarao</surname>
<given-names>K</given-names>
</name>
<name>
<surname>Katz</surname>
<given-names>J</given-names>
</name>
</person-group>
<article-title>Avian influenza viruses infecting humans</article-title>
<source>Cell Mol Life Sci</source>
<year>2000</year>
<volume>57</volume>
<fpage>1770</fpage>
<lpage>1784</lpage>
<pub-id pub-id-type="doi">10.1007/PL00000657</pub-id>
</citation>
<citation citation-type="display-unstructured">Subbarao K, Katz J (2000) Avian influenza viruses infecting humans. Cell Mol Life Sci 57:1770–1784
<pub-id pub-id-type="pmid">11130181</pub-id>
</citation>
</ref>
<ref id="CR49">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Taubenberger</surname>
<given-names>JK</given-names>
</name>
</person-group>
<article-title>The origin and virulence of the 1918 “Spanish” influenza virus</article-title>
<source>Proc Am Philos Soc</source>
<year>2006</year>
<volume>150</volume>
<fpage>86</fpage>
<lpage>112</lpage>
</citation>
<citation citation-type="display-unstructured">Taubenberger JK (2006) The origin and virulence of the 1918 “Spanish” influenza virus. Proc Am Philos Soc 150:86–112
<pub-id pub-id-type="pmid">17526158</pub-id>
</citation>
</ref>
<ref id="CR50">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Taubenberger</surname>
<given-names>JK</given-names>
</name>
<name>
<surname>Reid</surname>
<given-names>AH</given-names>
</name>
<name>
<surname>Lourens</surname>
<given-names>RM</given-names>
</name>
<name>
<surname>Wang</surname>
<given-names>R</given-names>
</name>
<name>
<surname>Jin</surname>
<given-names>G</given-names>
</name>
<name>
<surname>Fanning</surname>
<given-names>TG</given-names>
</name>
</person-group>
<article-title>Characterization of the 1918 influenza virus polymerase genes</article-title>
<source>Nature</source>
<year>2005</year>
<volume>437</volume>
<fpage>889</fpage>
<lpage>893</lpage>
<pub-id pub-id-type="doi">10.1038/nature04230</pub-id>
</citation>
<citation citation-type="display-unstructured">Taubenberger JK, Reid AH, Lourens RM, Wang R, Jin G, Fanning TG (2005) Characterization of the 1918 influenza virus polymerase genes. Nature 437:889–893
<pub-id pub-id-type="pmid">16208372</pub-id>
</citation>
</ref>
<ref id="CR51">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Taubenberger</surname>
<given-names>JK</given-names>
</name>
<name>
<surname>Reid</surname>
<given-names>AH</given-names>
</name>
<name>
<surname>Lourens</surname>
<given-names>RM</given-names>
</name>
<name>
<surname>Wang</surname>
<given-names>R</given-names>
</name>
<name>
<surname>Jin</surname>
<given-names>G</given-names>
</name>
<name>
<surname>Fanning</surname>
<given-names>TG</given-names>
</name>
</person-group>
<article-title>Molecular virology: was the 1918 pandemic caused by a bird flu? Was the 1918 flu avian in origin? (Reply)</article-title>
<source>Nature</source>
<year>2006</year>
<volume>440</volume>
<fpage>e9</fpage>
<lpage>e10</lpage>
<pub-id pub-id-type="doi">10.1038/nature04825</pub-id>
</citation>
<citation citation-type="display-unstructured">Taubenberger JK, Reid AH, Lourens RM, Wang R, Jin G, Fanning TG (2006) Molecular virology: was the 1918 pandemic caused by a bird flu? Was the 1918 flu avian in origin? (Reply). Nature 440:e9–e10
<pub-id pub-id-type="pmid">16641950</pub-id>
</citation>
</ref>
<ref id="CR52">
<citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname>Thorne</surname>
<given-names>JL</given-names>
</name>
<name>
<surname>Kishino</surname>
<given-names>H</given-names>
</name>
</person-group>
<person-group person-group-type="editor">
<name>
<surname>Nielsen</surname>
<given-names>R</given-names>
</name>
</person-group>
<article-title>Estimation of divergence times from molecular sequence data</article-title>
<source>Statistical methods in molecular evolution</source>
<year>2005</year>
<publisher-loc>New York, USA</publisher-loc>
<publisher-name>Springer</publisher-name>
<fpage>235</fpage>
<lpage>256</lpage>
</citation>
<citation citation-type="display-unstructured">Thorne JL, Kishino H (2005) Estimation of divergence times from molecular sequence data. In: Nielsen R (ed) Statistical methods in molecular evolution. Springer, New York, USA, pp 235–256 </citation>
</ref>
<ref id="CR53">
<citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname>Venables</surname>
<given-names>WN</given-names>
</name>
<name>
<surname>Ripley</surname>
<given-names>BD</given-names>
</name>
</person-group>
<source>Modern Applied Statistics with S</source>
<year>2002</year>
<publisher-loc>New York, USA</publisher-loc>
<publisher-name>Springer</publisher-name>
</citation>
<citation citation-type="display-unstructured">Venables WN, Ripley BD (2002) Modern Applied Statistics with S. Springer, New York, USA </citation>
</ref>
<ref id="CR54">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Webster</surname>
<given-names>RG</given-names>
</name>
<name>
<surname>Bean</surname>
<given-names>WJ</given-names>
</name>
<name>
<surname>Gorman</surname>
<given-names>OT</given-names>
</name>
<name>
<surname>Chambers</surname>
<given-names>TM</given-names>
</name>
<name>
<surname>Kawaoka</surname>
<given-names>Y</given-names>
</name>
</person-group>
<article-title>Evolution and ecology of influenza A viruses</article-title>
<source>Microbiol Rev</source>
<year>1992</year>
<volume>56</volume>
<fpage>152</fpage>
<lpage>179</lpage>
</citation>
<citation citation-type="display-unstructured">Webster RG, Bean WJ, Gorman OT, Chambers TM, Kawaoka Y (1992) Evolution and ecology of influenza A viruses. Microbiol Rev 56:152–179
<pub-id pub-id-type="pmid">1579108</pub-id>
</citation>
</ref>
<ref id="CR55">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Worobey</surname>
<given-names>M</given-names>
</name>
</person-group>
<article-title>Phylogenetic evidence against evolutionary stasis and natural abiotic reservoirs of influenza A virus</article-title>
<source>J Virol</source>
<year>2008</year>
<volume>82</volume>
<fpage>3769</fpage>
<lpage>3774</lpage>
<pub-id pub-id-type="doi">10.1128/JVI.02207-07</pub-id>
</citation>
<citation citation-type="display-unstructured">Worobey M (2008) Phylogenetic evidence against evolutionary stasis and natural abiotic reservoirs of influenza A virus. J Virol 82:3769–3774
<pub-id pub-id-type="pmid">18234791</pub-id>
</citation>
</ref>
<ref id="CR56">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Yang</surname>
<given-names>Z</given-names>
</name>
</person-group>
<article-title>Estimating the pattern of nucleotide substitution</article-title>
<source>J Mol Evol</source>
<year>1994</year>
<volume>39</volume>
<fpage>105</fpage>
<lpage>111</lpage>
</citation>
<citation citation-type="display-unstructured">Yang Z (1994) Estimating the pattern of nucleotide substitution. J Mol Evol 39:105–111
<pub-id pub-id-type="pmid">8064867</pub-id>
</citation>
</ref>
<ref id="CR57">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Yang</surname>
<given-names>Z</given-names>
</name>
</person-group>
<article-title>Among-site rate variation and its impact on phylogenetic analyses</article-title>
<source>Trends Ecol Evol</source>
<year>1996</year>
<volume>11</volume>
<fpage>367</fpage>
<lpage>372</lpage>
<pub-id pub-id-type="doi">10.1016/0169-5347(96)10041-0</pub-id>
</citation>
<citation citation-type="display-unstructured">Yang Z (1996) Among-site rate variation and its impact on phylogenetic analyses. Trends Ecol Evol 11:367–372 </citation>
</ref>
<ref id="CR58">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Yang</surname>
<given-names>Z</given-names>
</name>
</person-group>
<article-title>PAML: a program package for phylogenetic analysis by maximum likelihood</article-title>
<source>Comput Appl Biosci</source>
<year>1997</year>
<volume>13</volume>
<fpage>555</fpage>
<lpage>556</lpage>
</citation>
<citation citation-type="display-unstructured">Yang Z (1997) PAML: a program package for phylogenetic analysis by maximum likelihood. Comput Appl Biosci 13:555–556
<pub-id pub-id-type="pmid">9367129</pub-id>
</citation>
</ref>
<ref id="CR59">
<citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname>Yang</surname>
<given-names>Z</given-names>
</name>
</person-group>
<source>Computational molecular evolution</source>
<year>2006</year>
<publisher-loc>Oxford</publisher-loc>
<publisher-name>Oxford University Press</publisher-name>
</citation>
<citation citation-type="display-unstructured">Yang Z (2006) Computational molecular evolution. Oxford University Press, Oxford </citation>
</ref>
<ref id="CR60">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Yang</surname>
<given-names>Z</given-names>
</name>
<name>
<surname>Kumar</surname>
<given-names>S</given-names>
</name>
</person-group>
<article-title>Approximate methods for estimating the pattern of nucleotide substitution and the variation of substitution rates among sites</article-title>
<source>Mol Biol Evol</source>
<year>1996</year>
<volume>13</volume>
<fpage>650</fpage>
<lpage>659</lpage>
</citation>
<citation citation-type="display-unstructured">Yang Z, Kumar S (1996) Approximate methods for estimating the pattern of nucleotide substitution and the variation of substitution rates among sites. Mol Biol Evol 13:650–659
<pub-id pub-id-type="pmid">8676739</pub-id>
</citation>
</ref>
<ref id="CR61">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Yang</surname>
<given-names>Z</given-names>
</name>
<name>
<surname>Roberts</surname>
<given-names>D</given-names>
</name>
</person-group>
<article-title>On the use of nucleic acid sequences to infer early branchings in the tree of life</article-title>
<source>Mol Biol Evol</source>
<year>1995</year>
<volume>12</volume>
<fpage>451</fpage>
<lpage>458</lpage>
</citation>
<citation citation-type="display-unstructured">Yang Z, Roberts D (1995) On the use of nucleic acid sequences to infer early branchings in the tree of life. Mol Biol Evol 12:451–458
<pub-id pub-id-type="pmid">7739387</pub-id>
</citation>
</ref>
</ref-list>
</back>
</pmc>
</record>

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Sante/explor/H2N2V1/Data/Pmc/Corpus
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000472 | SxmlIndent | more

Or

HfdSelect -h $EXPLOR_AREA/Data/Pmc/Corpus/biblio.hfd -nk 000472 | SxmlIndent | more
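
As a complement to the Dilib commands above, here is a minimal sketch (not part of Dilib) of how the reference entries in a record like this one could be listed with Python's standard library. It assumes that the record XML uses no namespaces and that each `<ref>` holds a structured `<citation>` next to a `display-unstructured` one, as in the entries of this record. The `SAMPLE` string and the `list_references` function name are illustrative only.

```python
import xml.etree.ElementTree as ET

# Illustrative sample: two <ref> entries copied from this record and wrapped
# in a <ref-list> so that the fragment is well-formed on its own.
SAMPLE = """
<ref-list>
<ref id="CR59">
<citation citation-type="book">
<person-group person-group-type="author">
<name><surname>Yang</surname><given-names>Z</given-names></name>
</person-group>
<source>Computational molecular evolution</source>
<year>2006</year>
<publisher-loc>Oxford</publisher-loc>
<publisher-name>Oxford University Press</publisher-name>
</citation>
<citation citation-type="display-unstructured">Yang Z (2006) Computational molecular evolution. Oxford University Press, Oxford </citation>
</ref>
<ref id="CR60">
<citation citation-type="journal">
<person-group person-group-type="author">
<name><surname>Yang</surname><given-names>Z</given-names></name>
<name><surname>Kumar</surname><given-names>S</given-names></name>
</person-group>
<article-title>Approximate methods for estimating the pattern of nucleotide substitution and the variation of substitution rates among sites</article-title>
<source>Mol Biol Evol</source>
<year>1996</year>
<volume>13</volume>
<fpage>650</fpage>
<lpage>659</lpage>
</citation>
<citation citation-type="display-unstructured">Yang Z, Kumar S (1996) Approximate methods for estimating the pattern of nucleotide substitution and the variation of substitution rates among sites. Mol Biol Evol 13:650–659
<pub-id pub-id-type="pmid">8676739</pub-id>
</citation>
</ref>
</ref-list>
"""


def list_references(xml_text):
    """Return one dict per <ref>, built from its structured <citation>."""
    root = ET.fromstring(xml_text.strip())
    refs = []
    for ref in root.iter("ref"):
        # Pick the first citation that is not the flat display string.
        structured = next(
            (c for c in ref.findall("citation")
             if c.get("citation-type") != "display-unstructured"),
            None,
        )
        if structured is None:
            continue
        authors = [
            f"{n.findtext('surname', '')} {n.findtext('given-names', '')}".strip()
            for n in structured.iter("name")
        ]
        refs.append({
            "id": ref.get("id"),
            "type": structured.get("citation-type"),
            "authors": authors,
            "year": structured.findtext("year"),
            # Book entries in this record carry no <article-title>, only <source>.
            "title": structured.findtext("article-title")
                     or structured.findtext("source"),
        })
    return refs


if __name__ == "__main__":
    for entry in list_references(SAMPLE):
        print(entry["id"], entry["year"], ", ".join(entry["authors"]), "-", entry["title"])
```

To apply this to the full record, the output of the `HfdSelect` pipeline could be saved to a file and its contents passed to `list_references`. Whether that output is a single well-formed XML document is an assumption to check first.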

To link to this page within the Wicri network

{{Explor lien
   |wiki=    Sante
   |area=    H2N2V1
   |flux=    Pmc
   |étape=   Corpus
   |type=    RBID
   |clé=     PMC:2772961
   |texte=   Using Non-Homogeneous Models of Nucleotide Substitution to Identify Host Shift Events: Application to the Origin of the 1918 ‘Spanish’ Influenza Pandemic Virus
}}

To generate wiki pages

HfdIndexSelect -h $EXPLOR_AREA/Data/Pmc/Corpus/RBID.i   -Sk "pubmed:19787384" \
       | HfdSelect -Kh $EXPLOR_AREA/Data/Pmc/Corpus/biblio.hfd   \
       | NlmPubMed2Wicri -a H2N2V1

Wicri

This area was generated with Dilib version V0.6.33.
Data generation: Tue Apr 14 19:59:40 2020. Site generation: Thu Mar 25 15:38:26 2021