H2N2 exploration server

Warning: this site is under development!
Warning: this site is generated automatically from raw corpora.
The information has therefore not been validated.

PhyloMap: an algorithm for visualizing relationships of large sequence data sets and its application to the influenza A virus genome

Internal identifier: 000903 (Pmc/Corpus); previous: 000902; next: 000904

Authors: Jiajie Zhang; Amir Madany Mamlouk; Thomas Martinetz; Suhua Chang; Jing Wang; Rolf Hilgenfeld

Source: BMC Bioinformatics (2011)

RBID : PMC:3142226

Abstract

Background

Results of phylogenetic analysis are often visualized as phylogenetic trees. Such a tree can typically only include up to a few hundred sequences. When more than a few thousand sequences are to be included, analyzing the phylogenetic relationships among them becomes a challenging task. The recent frequent outbreaks of influenza A viruses have resulted in the rapid accumulation of corresponding genome sequences. Currently, there are more than 7500 influenza A virus genomes in the database. There are no efficient ways of representing this huge data set as a whole, thus preventing a further understanding of the diversity of the influenza A virus genome.

Results

Here we present a new algorithm, "PhyloMap", which combines ordination, vector quantization, and phylogenetic tree construction to give a compact representation of a large sequence data set. Applying PhyloMap to influenza A virus genome sequences reveals phylogenetic relationships of the internal genes that cannot be seen when only a subset of the sequences is analyzed.
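
To make the vector quantization idea concrete, the following is a minimal sketch, not the PhyloMap implementation (which the abstract says is written in Java). It assumes toy pre-aligned sequences, uses the simple p-distance (fraction of differing sites), and swaps in a plain k-medoids procedure to pick representative sequences. The ordination and tree-construction steps are omitted, and all names (`SEQS`, `p_distance`, `distance_matrix`, `k_medoids`) are hypothetical.

```python
from itertools import combinations

# Hypothetical toy alignment: two clearly separated groups of sequences.
SEQS = ["AAAAAAAA", "AAAAAAAT", "AAAAAATT",
        "GGGGGGGG", "GGGGGGGC", "GGGGGGCC"]


def p_distance(a, b):
    """Fraction of aligned positions at which two sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def distance_matrix(seqs):
    """Symmetric matrix of pairwise p-distances."""
    n = len(seqs)
    d = [[0.0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        d[i][j] = d[j][i] = p_distance(seqs[i], seqs[j])
    return d


def _assign(d, medoids):
    """Map each medoid index to the items whose nearest medoid it is."""
    clusters = {m: [] for m in medoids}
    for i in range(len(d)):
        clusters[min(medoids, key=lambda m: d[i][m])].append(i)
    return clusters


def k_medoids(d, k, max_iter=100):
    """Choose k representative items from a distance matrix.

    Seeding is deterministic: start at item 0, then repeatedly add the
    item farthest from its nearest already-chosen medoid.
    """
    medoids = [0]
    while len(medoids) < k:
        medoids.append(max(range(len(d)),
                           key=lambda i: min(d[i][m] for m in medoids)))
    for _ in range(max_iter):
        clusters = _assign(d, medoids)
        # New medoid = cluster member with the smallest total distance
        # to the other members; an empty cluster keeps its old medoid.
        new = [min(members, key=lambda c: sum(d[c][j] for j in members))
               if members else m
               for m, members in clusters.items()]
        if set(new) == set(medoids):
            break
        medoids = new
    return _assign(d, medoids)


if __name__ == "__main__":
    for medoid, members in k_medoids(distance_matrix(SEQS), 2).items():
        print(SEQS[medoid], "represents", [SEQS[i] for i in members])
```

In a pipeline of the kind the abstract describes, the representatives chosen this way would be the sequences passed on to tree construction, so that every input sequence still influences the result through the cluster it belongs to.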

Conclusions

The application of PhyloMap to influenza A virus genome data shows that it is a robust algorithm for analyzing large sequence data sets. It utilizes the entire data set, minimizes bias, and provides intuitive visualization. PhyloMap is implemented in JAVA, and the source code is freely available at http://www.biochem.uni-luebeck.de/public/software/phylomap.html


URL: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3142226
DOI: 10.1186/1471-2105-12-248
PubMed: 21689434
PubMed Central: 3142226

The document in XML format

<record>
<TEI>
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en">PhyloMap: an algorithm for visualizing relationships of large sequence data sets and its application to the influenza A virus genome</title>
<author>
<name sortKey="Zhang, Jiajie" sort="Zhang, Jiajie" uniqKey="Zhang J" first="Jiajie" last="Zhang">Jiajie Zhang</name>
<affiliation>
<nlm:aff id="I1">Institute of Biochemistry, Center for Structural and Cell Biology in Medicine, University of Lübeck, Ratzeburger Allee 160, 23538 Lübeck, Germany</nlm:aff>
</affiliation>
<affiliation>
<nlm:aff id="I2">Graduate School for Computing in Medicine and Life Sciences, University of Lübeck, Ratzeburger Allee 160, 23538 Lübeck, Germany</nlm:aff>
</affiliation>
<affiliation>
<nlm:aff id="I3">Institute for Neuro- and Bioinformatics, University of Lübeck, Ratzeburger Allee 160, 23538 Lübeck, Germany</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Mamlouk, Amir Madany" sort="Mamlouk, Amir Madany" uniqKey="Mamlouk A" first="Amir Madany" last="Mamlouk">Amir Madany Mamlouk</name>
<affiliation>
<nlm:aff id="I2">Graduate School for Computing in Medicine and Life Sciences, University of Lübeck, Ratzeburger Allee 160, 23538 Lübeck, Germany</nlm:aff>
</affiliation>
<affiliation>
<nlm:aff id="I3">Institute for Neuro- and Bioinformatics, University of Lübeck, Ratzeburger Allee 160, 23538 Lübeck, Germany</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Martinetz, Thomas" sort="Martinetz, Thomas" uniqKey="Martinetz T" first="Thomas" last="Martinetz">Thomas Martinetz</name>
<affiliation>
<nlm:aff id="I2">Graduate School for Computing in Medicine and Life Sciences, University of Lübeck, Ratzeburger Allee 160, 23538 Lübeck, Germany</nlm:aff>
</affiliation>
<affiliation>
<nlm:aff id="I3">Institute for Neuro- and Bioinformatics, University of Lübeck, Ratzeburger Allee 160, 23538 Lübeck, Germany</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Chang, Suhua" sort="Chang, Suhua" uniqKey="Chang S" first="Suhua" last="Chang">Suhua Chang</name>
<affiliation>
<nlm:aff id="I4">Key Laboratory of Mental Health, Institute of Psychology, Chinese Academy of Sciences, Beijing 100101, China</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Wang, Jing" sort="Wang, Jing" uniqKey="Wang J" first="Jing" last="Wang">Jing Wang</name>
<affiliation>
<nlm:aff id="I4">Key Laboratory of Mental Health, Institute of Psychology, Chinese Academy of Sciences, Beijing 100101, China</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Hilgenfeld, Rolf" sort="Hilgenfeld, Rolf" uniqKey="Hilgenfeld R" first="Rolf" last="Hilgenfeld">Rolf Hilgenfeld</name>
<affiliation>
<nlm:aff id="I1">Institute of Biochemistry, Center for Structural and Cell Biology in Medicine, University of Lübeck, Ratzeburger Allee 160, 23538 Lübeck, Germany</nlm:aff>
</affiliation>
<affiliation>
<nlm:aff id="I2">Graduate School for Computing in Medicine and Life Sciences, University of Lübeck, Ratzeburger Allee 160, 23538 Lübeck, Germany</nlm:aff>
</affiliation>
<affiliation>
<nlm:aff id="I5">Laboratory for Structural Biology of Infection and Inflammation, c/o DESY, Building 22a, Notkestr. 85, 22603 Hamburg, Germany</nlm:aff>
</affiliation>
<affiliation>
<nlm:aff id="I6">Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zu Chong Zhi Rd., Shanghai 201203, China</nlm:aff>
</affiliation>
</author>
</titleStmt>
<publicationStmt>
<idno type="wicri:source">PMC</idno>
<idno type="pmid">21689434</idno>
<idno type="pmc">3142226</idno>
<idno type="url">http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3142226</idno>
<idno type="RBID">PMC:3142226</idno>
<idno type="doi">10.1186/1471-2105-12-248</idno>
<date when="2011">2011</date>
<idno type="wicri:Area/Pmc/Corpus">000903</idno>
<idno type="wicri:explorRef" wicri:stream="Pmc" wicri:step="Corpus" wicri:corpus="PMC">000903</idno>
</publicationStmt>
<sourceDesc>
<biblStruct>
<analytic>
<title xml:lang="en" level="a" type="main">PhyloMap: an algorithm for visualizing relationships of large sequence data sets and its application to the influenza A virus genome</title>
<author>
<name sortKey="Zhang, Jiajie" sort="Zhang, Jiajie" uniqKey="Zhang J" first="Jiajie" last="Zhang">Jiajie Zhang</name>
<affiliation>
<nlm:aff id="I1">Institute of Biochemistry, Center for Structural and Cell Biology in Medicine, University of Lübeck, Ratzeburger Allee 160, 23538 Lübeck, Germany</nlm:aff>
</affiliation>
<affiliation>
<nlm:aff id="I2">Graduate School for Computing in Medicine and Life Sciences, University of Lübeck, Ratzeburger Allee 160, 23538 Lübeck, Germany</nlm:aff>
</affiliation>
<affiliation>
<nlm:aff id="I3">Institute for Neuro- and Bioinformatics, University of Lübeck, Ratzeburger Allee 160, 23538 Lübeck, Germany</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Mamlouk, Amir Madany" sort="Mamlouk, Amir Madany" uniqKey="Mamlouk A" first="Amir Madany" last="Mamlouk">Amir Madany Mamlouk</name>
<affiliation>
<nlm:aff id="I2">Graduate School for Computing in Medicine and Life Sciences, University of Lübeck, Ratzeburger Allee 160, 23538 Lübeck, Germany</nlm:aff>
</affiliation>
<affiliation>
<nlm:aff id="I3">Institute for Neuro- and Bioinformatics, University of Lübeck, Ratzeburger Allee 160, 23538 Lübeck, Germany</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Martinetz, Thomas" sort="Martinetz, Thomas" uniqKey="Martinetz T" first="Thomas" last="Martinetz">Thomas Martinetz</name>
<affiliation>
<nlm:aff id="I2">Graduate School for Computing in Medicine and Life Sciences, University of Lübeck, Ratzeburger Allee 160, 23538 Lübeck, Germany</nlm:aff>
</affiliation>
<affiliation>
<nlm:aff id="I3">Institute for Neuro- and Bioinformatics, University of Lübeck, Ratzeburger Allee 160, 23538 Lübeck, Germany</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Chang, Suhua" sort="Chang, Suhua" uniqKey="Chang S" first="Suhua" last="Chang">Suhua Chang</name>
<affiliation>
<nlm:aff id="I4">Key Laboratory of Mental Health, Institute of Psychology, Chinese Academy of Sciences, Beijing 100101, China</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Wang, Jing" sort="Wang, Jing" uniqKey="Wang J" first="Jing" last="Wang">Jing Wang</name>
<affiliation>
<nlm:aff id="I4">Key Laboratory of Mental Health, Institute of Psychology, Chinese Academy of Sciences, Beijing 100101, China</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Hilgenfeld, Rolf" sort="Hilgenfeld, Rolf" uniqKey="Hilgenfeld R" first="Rolf" last="Hilgenfeld">Rolf Hilgenfeld</name>
<affiliation>
<nlm:aff id="I1">Institute of Biochemistry, Center for Structural and Cell Biology in Medicine, University of Lübeck, Ratzeburger Allee 160, 23538 Lübeck, Germany</nlm:aff>
</affiliation>
<affiliation>
<nlm:aff id="I2">Graduate School for Computing in Medicine and Life Sciences, University of Lübeck, Ratzeburger Allee 160, 23538 Lübeck, Germany</nlm:aff>
</affiliation>
<affiliation>
<nlm:aff id="I5">Laboratory for Structural Biology of Infection and Inflammation, c/o DESY, Building 22a, Notkestr. 85, 22603 Hamburg, Germany</nlm:aff>
</affiliation>
<affiliation>
<nlm:aff id="I6">Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zu Chong Zhi Rd., Shanghai 201203, China</nlm:aff>
</affiliation>
</author>
</analytic>
<series>
<title level="j">BMC Bioinformatics</title>
<idno type="eISSN">1471-2105</idno>
<imprint>
<date when="2011">2011</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
</fileDesc>
<profileDesc>
<textClass></textClass>
</profileDesc>
</teiHeader>
<front>
<div type="abstract" xml:lang="en">
<sec>
<title>Background</title>
<p>Results of phylogenetic analysis are often visualized as phylogenetic trees. Such a tree can typically only include up to a few hundred sequences. When more than a few thousand sequences are to be included, analyzing the phylogenetic relationships among them becomes a challenging task. The recent frequent outbreaks of influenza A viruses have resulted in the rapid accumulation of corresponding genome sequences. Currently, there are more than 7500 influenza A virus genomes in the database. There are no efficient ways of representing this huge data set as a whole, thus preventing a further understanding of the diversity of the influenza A virus genome.</p>
</sec>
<sec>
<title>Results</title>
<p>Here we present a new algorithm, "PhyloMap", which combines ordination, vector quantization, and phylogenetic tree construction to give an elegant representation of a large sequence data set. The use of PhyloMap on influenza A virus genome sequences reveals the phylogenetic relationships of the internal genes that cannot be seen when only a subset of sequences are analyzed.</p>
</sec>
<sec>
<title>Conclusions</title>
<p>The application of PhyloMap to influenza A virus genome data shows that it is a robust algorithm for analyzing large sequence data sets. It utilizes the entire data set, minimizes bias, and provides intuitive visualization. PhyloMap is implemented in JAVA, and the source code is freely available at
<ext-link ext-link-type="uri" xlink:href="http://www.biochem.uni-luebeck.de/public/software/phylomap.html">http://www.biochem.uni-luebeck.de/public/software/phylomap.html</ext-link>
</p>
</sec>
</div>
</front>
<back>
<div1 type="bibliography">
<listBibl>
<biblStruct>
<analytic>
<author>
<name sortKey="Procter, Jb" uniqKey="Procter J">JB Procter</name>
</author>
<author>
<name sortKey="Thompson, J" uniqKey="Thompson J">J Thompson</name>
</author>
<author>
<name sortKey="Letunic, I" uniqKey="Letunic I">I Letunic</name>
</author>
<author>
<name sortKey="Creevey, C" uniqKey="Creevey C">C Creevey</name>
</author>
<author>
<name sortKey="Jossinet, F" uniqKey="Jossinet F">F Jossinet</name>
</author>
<author>
<name sortKey="Barton, Gj" uniqKey="Barton G">GJ Barton</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Pavlopoulos, Ga" uniqKey="Pavlopoulos G">GA Pavlopoulos</name>
</author>
<author>
<name sortKey="Soldatos, Tg" uniqKey="Soldatos T">TG Soldatos</name>
</author>
<author>
<name sortKey="Barbosa Silva, A" uniqKey="Barbosa Silva A">A Barbosa-Silva</name>
</author>
<author>
<name sortKey="Schneider, R" uniqKey="Schneider R">R Schneider</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Chen, Jm" uniqKey="Chen J">JM Chen</name>
</author>
<author>
<name sortKey="Sun, Yx" uniqKey="Sun Y">YX Sun</name>
</author>
<author>
<name sortKey="Chen, Jw" uniqKey="Chen J">JW Chen</name>
</author>
<author>
<name sortKey="Liu, S" uniqKey="Liu S">S Liu</name>
</author>
<author>
<name sortKey="Yu, Jm" uniqKey="Yu J">JM Yu</name>
</author>
<author>
<name sortKey="Shen, Cj" uniqKey="Shen C">CJ Shen</name>
</author>
<author>
<name sortKey="Sun, Xd" uniqKey="Sun X">XD Sun</name>
</author>
<author>
<name sortKey="Peng, D" uniqKey="Peng D">D Peng</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Garten, Rj" uniqKey="Garten R">RJ Garten</name>
</author>
<author>
<name sortKey="Davis, Ct" uniqKey="Davis C">CT Davis</name>
</author>
<author>
<name sortKey="Russell, Ca" uniqKey="Russell C">CA Russell</name>
</author>
<author>
<name sortKey="Shu, B" uniqKey="Shu B">B Shu</name>
</author>
<author>
<name sortKey="Lindstrom, S" uniqKey="Lindstrom S">S Lindstrom</name>
</author>
<author>
<name sortKey="Balish, A" uniqKey="Balish A">A Balish</name>
</author>
<author>
<name sortKey="Sessions, Wm" uniqKey="Sessions W">WM Sessions</name>
</author>
<author>
<name sortKey="Xu, X" uniqKey="Xu X">X Xu</name>
</author>
<author>
<name sortKey="Skepner, E" uniqKey="Skepner E">E Skepner</name>
</author>
<author>
<name sortKey="Deyde, V" uniqKey="Deyde V">V Deyde</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Smith, Gj" uniqKey="Smith G">GJ Smith</name>
</author>
<author>
<name sortKey="Vijaykrishna, D" uniqKey="Vijaykrishna D">D Vijaykrishna</name>
</author>
<author>
<name sortKey="Bahl, J" uniqKey="Bahl J">J Bahl</name>
</author>
<author>
<name sortKey="Lycett, Sj" uniqKey="Lycett S">SJ Lycett</name>
</author>
<author>
<name sortKey="Worobey, M" uniqKey="Worobey M">M Worobey</name>
</author>
<author>
<name sortKey="Pybus, Og" uniqKey="Pybus O">OG Pybus</name>
</author>
<author>
<name sortKey="Ma, Sk" uniqKey="Ma S">SK Ma</name>
</author>
<author>
<name sortKey="Cheung, Cl" uniqKey="Cheung C">CL Cheung</name>
</author>
<author>
<name sortKey="Raghwani, J" uniqKey="Raghwani J">J Raghwani</name>
</author>
<author>
<name sortKey="Bhatt, S" uniqKey="Bhatt S">S Bhatt</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Olsen, B" uniqKey="Olsen B">B Olsen</name>
</author>
<author>
<name sortKey="Munster, Vj" uniqKey="Munster V">VJ Munster</name>
</author>
<author>
<name sortKey="Wallensten, A" uniqKey="Wallensten A">A Wallensten</name>
</author>
<author>
<name sortKey="Waldenstrom, J" uniqKey="Waldenstrom J">J Waldenstrom</name>
</author>
<author>
<name sortKey="Osterhaus, Ad" uniqKey="Osterhaus A">AD Osterhaus</name>
</author>
<author>
<name sortKey="Fouchier, Ra" uniqKey="Fouchier R">RA Fouchier</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Nelson, Mi" uniqKey="Nelson M">MI Nelson</name>
</author>
<author>
<name sortKey="Viboud, C" uniqKey="Viboud C">C Viboud</name>
</author>
<author>
<name sortKey="Simonsen, L" uniqKey="Simonsen L">L Simonsen</name>
</author>
<author>
<name sortKey="Bennett, Rt" uniqKey="Bennett R">RT Bennett</name>
</author>
<author>
<name sortKey="Griesemer, Sb" uniqKey="Griesemer S">SB Griesemer</name>
</author>
<author>
<name sortKey="St George, K" uniqKey="St George K">K St George</name>
</author>
<author>
<name sortKey="Taylor, J" uniqKey="Taylor J">J Taylor</name>
</author>
<author>
<name sortKey="Spiro, Dj" uniqKey="Spiro D">DJ Spiro</name>
</author>
<author>
<name sortKey="Sengamalay, Na" uniqKey="Sengamalay N">NA Sengamalay</name>
</author>
<author>
<name sortKey="Ghedin, E" uniqKey="Ghedin E">E Ghedin</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Liu, S" uniqKey="Liu S">S Liu</name>
</author>
<author>
<name sortKey="Ji, K" uniqKey="Ji K">K Ji</name>
</author>
<author>
<name sortKey="Chen, J" uniqKey="Chen J">J Chen</name>
</author>
<author>
<name sortKey="Tai, D" uniqKey="Tai D">D Tai</name>
</author>
<author>
<name sortKey="Jiang, W" uniqKey="Jiang W">W Jiang</name>
</author>
<author>
<name sortKey="Hou, G" uniqKey="Hou G">G Hou</name>
</author>
<author>
<name sortKey="Li, J" uniqKey="Li J">J Li</name>
</author>
<author>
<name sortKey="Huang, B" uniqKey="Huang B">B Huang</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Higgins, Dg" uniqKey="Higgins D">DG Higgins</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Smith, Dj" uniqKey="Smith D">DJ Smith</name>
</author>
<author>
<name sortKey="Lapedes, As" uniqKey="Lapedes A">AS Lapedes</name>
</author>
<author>
<name sortKey="De Jong, Jc" uniqKey="De Jong J">JC de Jong</name>
</author>
<author>
<name sortKey="Bestebroer, Tm" uniqKey="Bestebroer T">TM Bestebroer</name>
</author>
<author>
<name sortKey="Rimmelzwaan, Gf" uniqKey="Rimmelzwaan G">GF Rimmelzwaan</name>
</author>
<author>
<name sortKey="Osterhaus, Ad" uniqKey="Osterhaus A">AD Osterhaus</name>
</author>
<author>
<name sortKey="Fouchier, Ra" uniqKey="Fouchier R">RA Fouchier</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Wong, Eh" uniqKey="Wong E">EH Wong</name>
</author>
<author>
<name sortKey="Smith, Dk" uniqKey="Smith D">DK Smith</name>
</author>
<author>
<name sortKey="Rabadan, R" uniqKey="Rabadan R">R Rabadan</name>
</author>
<author>
<name sortKey="Peiris, M" uniqKey="Peiris M">M Peiris</name>
</author>
<author>
<name sortKey="Poon, Ll" uniqKey="Poon L">LL Poon</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Martinetz, Tm" uniqKey="Martinetz T">TM Martinetz</name>
</author>
<author>
<name sortKey="Berkovich, Sg" uniqKey="Berkovich S">SG Berkovich</name>
</author>
<author>
<name sortKey="Schulten, Kj" uniqKey="Schulten K">KJ Schulten</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Dawood, Fs" uniqKey="Dawood F">FS Dawood</name>
</author>
<author>
<name sortKey="Jain, S" uniqKey="Jain S">S Jain</name>
</author>
<author>
<name sortKey="Finelli, L" uniqKey="Finelli L">L Finelli</name>
</author>
<author>
<name sortKey="Shaw, Mw" uniqKey="Shaw M">MW Shaw</name>
</author>
<author>
<name sortKey="Lindstrom, S" uniqKey="Lindstrom S">S Lindstrom</name>
</author>
<author>
<name sortKey="Garten, Rj" uniqKey="Garten R">RJ Garten</name>
</author>
<author>
<name sortKey="Gubareva, Lv" uniqKey="Gubareva L">LV Gubareva</name>
</author>
<author>
<name sortKey="Xu, X" uniqKey="Xu X">X Xu</name>
</author>
<author>
<name sortKey="Bridges, Cb" uniqKey="Bridges C">CB Bridges</name>
</author>
<author>
<name sortKey="Uyeki, Tm" uniqKey="Uyeki T">TM Uyeki</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Basler, Cf" uniqKey="Basler C">CF Basler</name>
</author>
<author>
<name sortKey="Reid, Ah" uniqKey="Reid A">AH Reid</name>
</author>
<author>
<name sortKey="Dybing, Jk" uniqKey="Dybing J">JK Dybing</name>
</author>
<author>
<name sortKey="Janczewski, Ta" uniqKey="Janczewski T">TA Janczewski</name>
</author>
<author>
<name sortKey="Fanning, Tg" uniqKey="Fanning T">TG Fanning</name>
</author>
<author>
<name sortKey="Zheng, H" uniqKey="Zheng H">H Zheng</name>
</author>
<author>
<name sortKey="Salvatore, M" uniqKey="Salvatore M">M Salvatore</name>
</author>
<author>
<name sortKey="Perdue, Ml" uniqKey="Perdue M">ML Perdue</name>
</author>
<author>
<name sortKey="Swayne, De" uniqKey="Swayne D">DE Swayne</name>
</author>
<author>
<name sortKey="Garcia Sastre, A" uniqKey="Garcia Sastre A">A Garcia-Sastre</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Reid, Ah" uniqKey="Reid A">AH Reid</name>
</author>
<author>
<name sortKey="Fanning, Tg" uniqKey="Fanning T">TG Fanning</name>
</author>
<author>
<name sortKey="Janczewski, Ta" uniqKey="Janczewski T">TA Janczewski</name>
</author>
<author>
<name sortKey="Lourens, Rm" uniqKey="Lourens R">RM Lourens</name>
</author>
<author>
<name sortKey="Taubenberger, Jk" uniqKey="Taubenberger J">JK Taubenberger</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Reid, Ah" uniqKey="Reid A">AH Reid</name>
</author>
<author>
<name sortKey="Fanning, Tg" uniqKey="Fanning T">TG Fanning</name>
</author>
<author>
<name sortKey="Janczewski, Ta" uniqKey="Janczewski T">TA Janczewski</name>
</author>
<author>
<name sortKey="Mccall, S" uniqKey="Mccall S">S McCall</name>
</author>
<author>
<name sortKey="Taubenberger, Jk" uniqKey="Taubenberger J">JK Taubenberger</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Reid, Ah" uniqKey="Reid A">AH Reid</name>
</author>
<author>
<name sortKey="Taubenberger, Jk" uniqKey="Taubenberger J">JK Taubenberger</name>
</author>
<author>
<name sortKey="Fanning, Tg" uniqKey="Fanning T">TG Fanning</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Taubenberger, Jk" uniqKey="Taubenberger J">JK Taubenberger</name>
</author>
<author>
<name sortKey="Reid, Ah" uniqKey="Reid A">AH Reid</name>
</author>
<author>
<name sortKey="Krafft, Ae" uniqKey="Krafft A">AE Krafft</name>
</author>
<author>
<name sortKey="Bijwaard, Ke" uniqKey="Bijwaard K">KE Bijwaard</name>
</author>
<author>
<name sortKey="Fanning, Tg" uniqKey="Fanning T">TG Fanning</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Taubenberger, Jk" uniqKey="Taubenberger J">JK Taubenberger</name>
</author>
<author>
<name sortKey="Reid, Ah" uniqKey="Reid A">AH Reid</name>
</author>
<author>
<name sortKey="Lourens, Rm" uniqKey="Lourens R">RM Lourens</name>
</author>
<author>
<name sortKey="Wang, R" uniqKey="Wang R">R Wang</name>
</author>
<author>
<name sortKey="Jin, G" uniqKey="Jin G">G Jin</name>
</author>
<author>
<name sortKey="Fanning, Tg" uniqKey="Fanning T">TG Fanning</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Tumpey, Tm" uniqKey="Tumpey T">TM Tumpey</name>
</author>
<author>
<name sortKey="Basler, Cf" uniqKey="Basler C">CF Basler</name>
</author>
<author>
<name sortKey="Aguilar, Pv" uniqKey="Aguilar P">PV Aguilar</name>
</author>
<author>
<name sortKey="Zeng, H" uniqKey="Zeng H">H Zeng</name>
</author>
<author>
<name sortKey="Solorzano, A" uniqKey="Solorzano A">A Solorzano</name>
</author>
<author>
<name sortKey="Swayne, De" uniqKey="Swayne D">DE Swayne</name>
</author>
<author>
<name sortKey="Cox, Nj" uniqKey="Cox N">NJ Cox</name>
</author>
<author>
<name sortKey="Katz, Jm" uniqKey="Katz J">JM Katz</name>
</author>
<author>
<name sortKey="Taubenberger, Jk" uniqKey="Taubenberger J">JK Taubenberger</name>
</author>
<author>
<name sortKey="Palese, P" uniqKey="Palese P">P Palese</name>
</author>
<author>
<name sortKey="Garcia Sastre, A" uniqKey="Garcia Sastre A">A Garcia-Sastre</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Lu, G" uniqKey="Lu G">G Lu</name>
</author>
<author>
<name sortKey="Rowley, T" uniqKey="Rowley T">T Rowley</name>
</author>
<author>
<name sortKey="Garten, R" uniqKey="Garten R">R Garten</name>
</author>
<author>
<name sortKey="Donis, Ro" uniqKey="Donis R">RO Donis</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Sammon, Jw" uniqKey="Sammon J">JW Sammon</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Felsenstein, J" uniqKey="Felsenstein J">J Felsenstein</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Smith, Rf" uniqKey="Smith R">RF Smith</name>
</author>
<author>
<name sortKey="Smith, Tf" uniqKey="Smith T">TF Smith</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Lio, P" uniqKey="Lio P">P Lio</name>
</author>
<author>
<name sortKey="Goldman, N" uniqKey="Goldman N">N Goldman</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Jones, Dt" uniqKey="Jones D">DT Jones</name>
</author>
<author>
<name sortKey="Taylor, Wr" uniqKey="Taylor W">WR Taylor</name>
</author>
<author>
<name sortKey="Thornton, Jm" uniqKey="Thornton J">JM Thornton</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Gower, Jc" uniqKey="Gower J">JC Gower</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Chaudhuri, Bb" uniqKey="Chaudhuri B">BB Chaudhuri</name>
</author>
<author>
<name sortKey="Dutta, S" uniqKey="Dutta S">S Dutta</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Bao, Y" uniqKey="Bao Y">Y Bao</name>
</author>
<author>
<name sortKey="Bolotov, P" uniqKey="Bolotov P">P Bolotov</name>
</author>
<author>
<name sortKey="Dernovoy, D" uniqKey="Dernovoy D">D Dernovoy</name>
</author>
<author>
<name sortKey="Kiryutin, B" uniqKey="Kiryutin B">B Kiryutin</name>
</author>
<author>
<name sortKey="Zaslavsky, L" uniqKey="Zaslavsky L">L Zaslavsky</name>
</author>
<author>
<name sortKey="Tatusova, T" uniqKey="Tatusova T">T Tatusova</name>
</author>
<author>
<name sortKey="Ostell, J" uniqKey="Ostell J">J Ostell</name>
</author>
<author>
<name sortKey="Lipman, D" uniqKey="Lipman D">D Lipman</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Chang, S" uniqKey="Chang S">S Chang</name>
</author>
<author>
<name sortKey="Zhang, J" uniqKey="Zhang J">J Zhang</name>
</author>
<author>
<name sortKey="Liao, X" uniqKey="Liao X">X Liao</name>
</author>
<author>
<name sortKey="Zhu, X" uniqKey="Zhu X">X Zhu</name>
</author>
<author>
<name sortKey="Wang, D" uniqKey="Wang D">D Wang</name>
</author>
<author>
<name sortKey="Zhu, J" uniqKey="Zhu J">J Zhu</name>
</author>
<author>
<name sortKey="Feng, T" uniqKey="Feng T">T Feng</name>
</author>
<author>
<name sortKey="Zhu, B" uniqKey="Zhu B">B Zhu</name>
</author>
<author>
<name sortKey="Gao, Gf" uniqKey="Gao G">GF Gao</name>
</author>
<author>
<name sortKey="Wang, J" uniqKey="Wang J">J Wang</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Edgar, Rc" uniqKey="Edgar R">RC Edgar</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Sheerar, Mg" uniqKey="Sheerar M">MG Sheerar</name>
</author>
<author>
<name sortKey="Easterday, Bc" uniqKey="Easterday B">BC Easterday</name>
</author>
<author>
<name sortKey="Hinshaw, Vs" uniqKey="Hinshaw V">VS Hinshaw</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Bachmaier, C" uniqKey="Bachmaier C">C Bachmaier</name>
</author>
<author>
<name sortKey="Brandes, U" uniqKey="Brandes U">U Brandes</name>
</author>
<author>
<name sortKey="Schlieper, B" uniqKey="Schlieper B">B Schlieper</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Webby, Rj" uniqKey="Webby R">RJ Webby</name>
</author>
<author>
<name sortKey="Swenson, Sl" uniqKey="Swenson S">SL Swenson</name>
</author>
<author>
<name sortKey="Krauss, Sl" uniqKey="Krauss S">SL Krauss</name>
</author>
<author>
<name sortKey="Gerrish, Pj" uniqKey="Gerrish P">PJ Gerrish</name>
</author>
<author>
<name sortKey="Goyal, Sm" uniqKey="Goyal S">SM Goyal</name>
</author>
<author>
<name sortKey="Webster, Rg" uniqKey="Webster R">RG Webster</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Zhou, Nn" uniqKey="Zhou N">NN Zhou</name>
</author>
<author>
<name sortKey="Senne, Da" uniqKey="Senne D">DA Senne</name>
</author>
<author>
<name sortKey="Landgraf, Js" uniqKey="Landgraf J">JS Landgraf</name>
</author>
<author>
<name sortKey="Swenson, Sl" uniqKey="Swenson S">SL Swenson</name>
</author>
<author>
<name sortKey="Erickson, G" uniqKey="Erickson G">G Erickson</name>
</author>
<author>
<name sortKey="Rossow, K" uniqKey="Rossow K">K Rossow</name>
</author>
<author>
<name sortKey="Liu, L" uniqKey="Liu L">L Liu</name>
</author>
<author>
<name sortKey="Yoon, K" uniqKey="Yoon K">K Yoon</name>
</author>
<author>
<name sortKey="Krauss, S" uniqKey="Krauss S">S Krauss</name>
</author>
<author>
<name sortKey="Webster, Rg" uniqKey="Webster R">RG Webster</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Kawaoka, Y" uniqKey="Kawaoka Y">Y Kawaoka</name>
</author>
<author>
<name sortKey="Gorman, Ot" uniqKey="Gorman O">OT Gorman</name>
</author>
<author>
<name sortKey="Ito, T" uniqKey="Ito T">T Ito</name>
</author>
<author>
<name sortKey="Wells, K" uniqKey="Wells K">K Wells</name>
</author>
<author>
<name sortKey="Donis, Ro" uniqKey="Donis R">RO Donis</name>
</author>
<author>
<name sortKey="Castrucci, Mr" uniqKey="Castrucci M">MR Castrucci</name>
</author>
<author>
<name sortKey="Donatelli, I" uniqKey="Donatelli I">I Donatelli</name>
</author>
<author>
<name sortKey="Webster, Rg" uniqKey="Webster R">RG Webster</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Kawaoka, Y" uniqKey="Kawaoka Y">Y Kawaoka</name>
</author>
<author>
<name sortKey="Krauss, S" uniqKey="Krauss S">S Krauss</name>
</author>
<author>
<name sortKey="Webster, Rg" uniqKey="Webster R">RG Webster</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Shortridge, Kf" uniqKey="Shortridge K">KF Shortridge</name>
</author>
<author>
<name sortKey="Webster, Rg" uniqKey="Webster R">RG Webster</name>
</author>
<author>
<name sortKey="Butterfield, Wk" uniqKey="Butterfield W">WK Butterfield</name>
</author>
<author>
<name sortKey="Campbell, Ch" uniqKey="Campbell C">CH Campbell</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Webster, Rg" uniqKey="Webster R">RG Webster</name>
</author>
<author>
<name sortKey="Bean, Wj" uniqKey="Bean W">WJ Bean</name>
</author>
<author>
<name sortKey="Gorman, Ot" uniqKey="Gorman O">OT Gorman</name>
</author>
<author>
<name sortKey="Chambers, Tm" uniqKey="Chambers T">TM Chambers</name>
</author>
<author>
<name sortKey="Kawaoka, Y" uniqKey="Kawaoka Y">Y Kawaoka</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Munzner, T" uniqKey="Munzner T">T Munzner</name>
</author>
<author>
<name sortKey="Guimbretiere, F" uniqKey="Guimbretiere F">F Guimbretiere</name>
</author>
<author>
<name sortKey="Tasiran, S" uniqKey="Tasiran S">S Tasiran</name>
</author>
<author>
<name sortKey="Zhang, L" uniqKey="Zhang L">L Zhang</name>
</author>
<author>
<name sortKey="Zhou, Yh" uniqKey="Zhou Y">YH Zhou</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Keim, D" uniqKey="Keim D">D Keim</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Santamaria, R" uniqKey="Santamaria R">R Santamaria</name>
</author>
<author>
<name sortKey="Theron, R" uniqKey="Theron R">R Theron</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Zaslavsky, L" uniqKey="Zaslavsky L">L Zaslavsky</name>
</author>
<author>
<name sortKey="Bao, Y" uniqKey="Bao Y">Y Bao</name>
</author>
<author>
<name sortKey="Tatusova, Ta" uniqKey="Tatusova T">TA Tatusova</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Guiller, A" uniqKey="Guiller A">A Guiller</name>
</author>
<author>
<name sortKey="Bellido, A" uniqKey="Bellido A">A Bellido</name>
</author>
<author>
<name sortKey="Madec, L" uniqKey="Madec L">L Madec</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Tenenbaum, Jb" uniqKey="Tenenbaum J">JB Tenenbaum</name>
</author>
<author>
<name sortKey="De Silva, V" uniqKey="De Silva V">V de Silva</name>
</author>
<author>
<name sortKey="Langford, Jc" uniqKey="Langford J">JC Langford</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Roweis, St" uniqKey="Roweis S">ST Roweis</name>
</author>
<author>
<name sortKey="Saul, Lk" uniqKey="Saul L">LK Saul</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Erdos, Pl" uniqKey="Erdos P">PL Erdös</name>
</author>
<author>
<name sortKey="Steel, Ma" uniqKey="Steel M">MA Steel</name>
</author>
<author>
<name sortKey="Szekely, La" uniqKey="Szekely L">LA Székely</name>
</author>
<author>
<name sortKey="Warnow, Tj" uniqKey="Warnow T">TJ Warnow</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Bininda Emonds, Or" uniqKey="Bininda Emonds O">OR Bininda-Emonds</name>
</author>
<author>
<name sortKey="Brady, Sg" uniqKey="Brady S">SG Brady</name>
</author>
<author>
<name sortKey="Kim, J" uniqKey="Kim J">J Kim</name>
</author>
<author>
<name sortKey="Sanderson, Mj" uniqKey="Sanderson M">MJ Sanderson</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Lecointre, G" uniqKey="Lecointre G">G Lecointre</name>
</author>
<author>
<name sortKey="Philippe, H" uniqKey="Philippe H">H Philippe</name>
</author>
<author>
<name sortKey="Van Le, Hl" uniqKey="Van Le H">HL Vân Lê</name>
</author>
<author>
<name sortKey="Le Guyader, H" uniqKey="Le Guyader H">H Le Guyader</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Rannala, B" uniqKey="Rannala B">B Rannala</name>
</author>
<author>
<name sortKey="Huelsenbeck, Jp" uniqKey="Huelsenbeck J">JP Huelsenbeck</name>
</author>
<author>
<name sortKey="Yang, Z" uniqKey="Yang Z">Z Yang</name>
</author>
<author>
<name sortKey="Nielsen, R" uniqKey="Nielsen R">R Nielsen</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Graybeal, A" uniqKey="Graybeal A">A Graybeal</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Wortley, Ah" uniqKey="Wortley A">AH Wortley</name>
</author>
<author>
<name sortKey="Rudall, Pj" uniqKey="Rudall P">PJ Rudall</name>
</author>
<author>
<name sortKey="Harris, Dj" uniqKey="Harris D">DJ Harris</name>
</author>
<author>
<name sortKey="Scotland, Rw" uniqKey="Scotland R">RW Scotland</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Hedtke, Sm" uniqKey="Hedtke S">SM Hedtke</name>
</author>
<author>
<name sortKey="Townsend, Tm" uniqKey="Townsend T">TM Townsend</name>
</author>
<author>
<name sortKey="Hillis, Dm" uniqKey="Hillis D">DM Hillis</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Furuse, Y" uniqKey="Furuse Y">Y Furuse</name>
</author>
<author>
<name sortKey="Suzuki, A" uniqKey="Suzuki A">A Suzuki</name>
</author>
<author>
<name sortKey="Kamigaki, T" uniqKey="Kamigaki T">T Kamigaki</name>
</author>
<author>
<name sortKey="Oshitani, H" uniqKey="Oshitani H">H Oshitani</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Macken, Ca" uniqKey="Macken C">CA Macken</name>
</author>
<author>
<name sortKey="Webby, Rj" uniqKey="Webby R">RJ Webby</name>
</author>
<author>
<name sortKey="Bruno, Wj" uniqKey="Bruno W">WJ Bruno</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Schweiger, B" uniqKey="Schweiger B">B Schweiger</name>
</author>
<author>
<name sortKey="Bruns, L" uniqKey="Bruns L">L Bruns</name>
</author>
<author>
<name sortKey="Meixenberger, K" uniqKey="Meixenberger K">K Meixenberger</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Chen, Lm" uniqKey="Chen L">LM Chen</name>
</author>
<author>
<name sortKey="Davis, Ct" uniqKey="Davis C">CT Davis</name>
</author>
<author>
<name sortKey="Zhou, H" uniqKey="Zhou H">H Zhou</name>
</author>
<author>
<name sortKey="Cox, Nj" uniqKey="Cox N">NJ Cox</name>
</author>
<author>
<name sortKey="Donis, Ro" uniqKey="Donis R">RO Donis</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Stover, Bc" uniqKey="Stover B">BC Stover</name>
</author>
<author>
<name sortKey="Muller, Kf" uniqKey="Muller K">KF Muller</name>
</author>
</analytic>
</biblStruct>
</listBibl>
</div1>
</back>
</TEI>
<pmc article-type="research-article">
<pmc-dir>properties open_access</pmc-dir>
<front>
<journal-meta>
<journal-id journal-id-type="nlm-ta">BMC Bioinformatics</journal-id>
<journal-title-group>
<journal-title>BMC Bioinformatics</journal-title>
</journal-title-group>
<issn pub-type="epub">1471-2105</issn>
<publisher>
<publisher-name>BioMed Central</publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="pmid">21689434</article-id>
<article-id pub-id-type="pmc">3142226</article-id>
<article-id pub-id-type="publisher-id">1471-2105-12-248</article-id>
<article-id pub-id-type="doi">10.1186/1471-2105-12-248</article-id>
<article-categories>
<subj-group subj-group-type="heading">
<subject>Research Article</subject>
</subj-group>
</article-categories>
<title-group>
<article-title>PhyloMap: an algorithm for visualizing relationships of large sequence data sets and its application to the influenza A virus genome</article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author" id="A1">
<name>
<surname>Zhang</surname>
<given-names>Jiajie</given-names>
</name>
<xref ref-type="aff" rid="I1">1</xref>
<xref ref-type="aff" rid="I2">2</xref>
<xref ref-type="aff" rid="I3">3</xref>
<email>zhangjiajie@biochem.uni-luebeck.de</email>
</contrib>
<contrib contrib-type="author" id="A2">
<name>
<surname>Mamlouk</surname>
<given-names>Amir Madany</given-names>
</name>
<xref ref-type="aff" rid="I2">2</xref>
<xref ref-type="aff" rid="I3">3</xref>
<email>madany@inb.uni-luebeck.de</email>
</contrib>
<contrib contrib-type="author" id="A3">
<name>
<surname>Martinetz</surname>
<given-names>Thomas</given-names>
</name>
<xref ref-type="aff" rid="I2">2</xref>
<xref ref-type="aff" rid="I3">3</xref>
<email>martinetz@inb.uni-luebeck.de</email>
</contrib>
<contrib contrib-type="author" id="A4">
<name>
<surname>Chang</surname>
<given-names>Suhua</given-names>
</name>
<xref ref-type="aff" rid="I4">4</xref>
<email>changsh@psych.ac.cn</email>
</contrib>
<contrib contrib-type="author" id="A5">
<name>
<surname>Wang</surname>
<given-names>Jing</given-names>
</name>
<xref ref-type="aff" rid="I4">4</xref>
<email>wangjing@psych.ac.cn</email>
</contrib>
<contrib contrib-type="author" corresp="yes" id="A6">
<name>
<surname>Hilgenfeld</surname>
<given-names>Rolf</given-names>
</name>
<xref ref-type="aff" rid="I1">1</xref>
<xref ref-type="aff" rid="I2">2</xref>
<xref ref-type="aff" rid="I5">5</xref>
<xref ref-type="aff" rid="I6">6</xref>
<email>hilgenfeld@biochem.uni-luebeck.de</email>
</contrib>
</contrib-group>
<aff id="I1">
<label>1</label>
Institute of Biochemistry, Center for Structural and Cell Biology in Medicine, University of Lübeck, Ratzeburger Allee 160, 23538 Lübeck, Germany</aff>
<aff id="I2">
<label>2</label>
Graduate School for Computing in Medicine and Life Sciences, University of Lübeck, Ratzeburger Allee 160, 23538 Lübeck, Germany</aff>
<aff id="I3">
<label>3</label>
Institute for Neuro- and Bioinformatics, University of Lübeck, Ratzeburger Allee 160, 23538 Lübeck, Germany</aff>
<aff id="I4">
<label>4</label>
Key Laboratory of Mental Health, Institute of Psychology, Chinese Academy of Sciences, Beijing 100101, China</aff>
<aff id="I5">
<label>5</label>
Laboratory for Structural Biology of Infection and Inflammation, c/o DESY, Building 22a, Notkestr. 85, 22603 Hamburg, Germany</aff>
<aff id="I6">
<label>6</label>
Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zu Chong Zhi Rd., Shanghai 201203, China</aff>
<pub-date pub-type="collection">
<year>2011</year>
</pub-date>
<pub-date pub-type="epub">
<day>20</day>
<month>6</month>
<year>2011</year>
</pub-date>
<volume>12</volume>
<fpage>248</fpage>
<lpage>248</lpage>
<history>
<date date-type="received">
<day>19</day>
<month>1</month>
<year>2011</year>
</date>
<date date-type="accepted">
<day>20</day>
<month>6</month>
<year>2011</year>
</date>
</history>
<permissions>
<copyright-statement>Copyright ©2011 Zhang et al; licensee BioMed Central Ltd.</copyright-statement>
<copyright-year>2011</copyright-year>
<copyright-holder>Zhang et al; licensee BioMed Central Ltd.</copyright-holder>
<license license-type="open-access" xlink:href="http://creativecommons.org/licenses/by/2.0">
<license-p>This is an Open Access article distributed under the terms of the Creative Commons Attribution License (
<ext-link ext-link-type="uri" xlink:href="http://creativecommons.org/licenses/by/2.0">http://creativecommons.org/licenses/by/2.0</ext-link>
), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</license-p>
</license>
</permissions>
<self-uri xlink:href="http://www.biomedcentral.com/1471-2105/12/248"></self-uri>
<abstract>
<sec>
<title>Background</title>
<p>Results of phylogenetic analysis are often visualized as phylogenetic trees. Such a tree can typically only include up to a few hundred sequences. When more than a few thousand sequences are to be included, analyzing the phylogenetic relationships among them becomes a challenging task. The recent frequent outbreaks of influenza A viruses have resulted in the rapid accumulation of corresponding genome sequences. Currently, there are more than 7500 influenza A virus genomes in the database. There are no efficient ways of representing this huge data set as a whole, thus preventing a further understanding of the diversity of the influenza A virus genome.</p>
</sec>
<sec>
<title>Results</title>
<p>Here we present a new algorithm, "PhyloMap", which combines ordination, vector quantization, and phylogenetic tree construction to give an elegant representation of a large sequence data set. The use of PhyloMap on influenza A virus genome sequences reveals the phylogenetic relationships of the internal genes that cannot be seen when only a subset of sequences are analyzed.</p>
</sec>
<sec>
<title>Conclusions</title>
<p>The application of PhyloMap to influenza A virus genome data shows that it is a robust algorithm for analyzing large sequence data sets. It utilizes the entire data set, minimizes bias, and provides intuitive visualization. PhyloMap is implemented in JAVA, and the source code is freely available at
<ext-link ext-link-type="uri" xlink:href="http://www.biochem.uni-luebeck.de/public/software/phylomap.html">http://www.biochem.uni-luebeck.de/public/software/phylomap.html</ext-link>
</p>
</sec>
</abstract>
</article-meta>
</front>
<body>
<sec>
<title>Background</title>
<p>Phylogenetic trees are commonly used as a visualization tool [
<xref ref-type="bibr" rid="B1">1</xref>
] to help reveal the relationships among homologous sequences. When the number of sequences is limited, the relationships can be clearly observed from the tree; however, when more than a few thousand sequences are to be included, not only does the accuracy of the inferred phylogenetic trees decrease, but it also becomes increasingly difficult to study the resulting trees and find patterns [
<xref ref-type="bibr" rid="B2">2</xref>
], and the computational demands of building a huge phylogenetic tree tend to be staggering. Researchers usually build a tree by sampling a small amount of data rather than constructing a complete tree using the entire dataset [
<xref ref-type="bibr" rid="B3">3</xref>
-
<xref ref-type="bibr" rid="B8">8</xref>
]. However, the sampling is generally done according to the experience of the researcher and is sometimes arbitrary. The conclusions drawn from such trees may be biased.</p>
<p>Higgins used Principal Coordinate Analysis (PCoA) [
<xref ref-type="bibr" rid="B9">9</xref>
] to visualize large sequence data sets, which are difficult to analyze using phylogenetic trees. He showed that PCoA can be considered complementary to phylogenetic tree analysis as it does not assume an underlying hierarchical structure in the data. A similar multidimensional scaling method was used by Smith et al. [
<xref ref-type="bibr" rid="B10">10</xref>
] to analyze the antigenic and genetic evolution of influenza A virus. Wong et al. [
<xref ref-type="bibr" rid="B11">11</xref>
] used correspondence analysis to show the codon usage biases of influenza A virus. Ordination (i.e. displaying a set of data points in two or three dimensions so as to make the relationships among the points in higher dimensional space visible) has proved to be a powerful tool to visualize large datasets with high dimensionalities; nevertheless, it only preserves the main trends in the data but most of the information on detail gets lost. When the intrinsic dimensions of the data set are high, the results can sometimes be misleading.</p>
<p>Here, we present a new method - Phylogenetic Map (PhyloMap) - that combines PCoA, vector quantization, and phylogenetic tree construction to give an elegant visualization of a large sequence data set using all the data while still trying to capture the accurate relationships among them. Compared to traditional phylogenetic tree analysis, which is practicable only with a maximum of a few hundred sequences, PhyloMap can handle thousands of sequences at one time. PhyloMap first uses PCoA to help depict the main trends and then uses the "Neural-Gas" approach [
<xref ref-type="bibr" rid="B12">12</xref>
] to obtain multiple data centers which best represent the data set. The resulting data centers will be used to build a phylogenetic tree. Finally, we map the tree onto the PCoA result by preserving the tree topology and the distances. As the two different visualizations are superimposed, the resulting plot can greatly reduce the risk of misinterpretation.</p>
<p>Influenza A viruses are commonly classified by serological differences in their hemagglutinin (HA) and neuraminidase (NA) proteins. The gene sequences of different HA or NA subtypes are also significantly divergent and can be easily classified by serological type. However, the recent emergence of the 2009 swine-origin human influenza A (H1N1) virus (S-OIV) [
<xref ref-type="bibr" rid="B13">13</xref>
] demonstrates that this classification has its limitations: "H1N1" is the designation for one of the two established seasonal subtypes as well as for the highly pathogenic 1918 virus that caused the "Spanish flu" pandemic [
<xref ref-type="bibr" rid="B14">14</xref>
-
<xref ref-type="bibr" rid="B20">20</xref>
], and for the currently spreading new swine-origin virus [
<xref ref-type="bibr" rid="B5">5</xref>
]. While a better classification is obviously needed [
<xref ref-type="bibr" rid="B3">3</xref>
,
<xref ref-type="bibr" rid="B21">21</xref>
], the cluster patterns of the internal genes (PB2, PB1, PA, NP, M1, M2, NS1, and NS2) of influenza A virus are less clear. We applied PhyloMap to influenza A virus internal genes, using all publicly available sequences. The results reveal patterns in those genes that cannot be seen when only a subset of sequences is analyzed, and can help us better characterize the diversity of influenza A virus genomes by considering not only the serological type differences but also the internal genes.</p>
</sec>
<sec sec-type="methods">
<title>Methods</title>
<sec>
<title>The PhyloMap algorithm</title>
<p>The input to PhyloMap is a set of aligned sequences, either amino acids or nucleotides. The algorithm involves five steps as shown in Figure
<xref ref-type="fig" rid="F1">1</xref>
. First, a distance matrix is calculated using the input alignment. This distance matrix will serve as the input to PCoA and Neural-Gas to get the principal coordinates of each sequence and
<italic>k </italic>
sequences as cluster centers, where
<italic>k </italic>
is defined by the user. Subsequently, the
<italic>k </italic>
sequences selected by the clustering algorithm will be used to build a phylogenetic tree. Finally, we adopted a multidimensional scaling technique similar to "Sammon's mapping" [
<xref ref-type="bibr" rid="B22">22</xref>
] to map the phylogenetic tree onto the first two axes of the principal coordinates. The results can then be plotted for inspection.</p>
<fig id="F1" position="float">
<label>Figure 1</label>
<caption>
<p>
<bold>Flow chart of the PhyloMap algorithm</bold>
.</p>
</caption>
<graphic xlink:href="1471-2105-12-248-1"></graphic>
</fig>
<sec>
<title>1. Distance Matrix</title>
<p>The idea of ordination is to map the input sequences onto a low-dimensional space so that the distances and relationships of the sequence set are preserved as much as possible. In order to do that, one has to calculate a distance matrix
<italic>D </italic>
which contains the distances between each pair of sequences. The distance matrix is calculated by the "Phylip" package [
<xref ref-type="bibr" rid="B23">23</xref>
] using a continuous-time Markov process. Higgins [
<xref ref-type="bibr" rid="B9">9</xref>
] suggested several ways of calculating distances that are guaranteed to be Euclidean, such as the simple P-distance or distances based on the Smith &amp; Smith matrix [
<xref ref-type="bibr" rid="B24">24</xref>
]. However, none of these measures can correct for multiple substitutions, and they do not follow any evolutionary model. The distances inferred by the continuous-time Markov process [
<xref ref-type="bibr" rid="B25">25</xref>
] are not Euclidean but are close to P-distance when the sequence divergence is small. As the purpose of PCoA is to find the main trends rather than accurately reconstruct the distances between sequences in the lower dimensional space, the effect of non-Euclidean distances can be neglected. For the influenza A internal protein sequences analyzed here, the Jones-Taylor-Thornton [
<xref ref-type="bibr" rid="B26">26</xref>
] model is used to infer the distances.</p>
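<p>As a minimal illustration of this step, the following Python sketch computes a pairwise distance matrix from an alignment using the simple P-distance (the fraction of differing sites). It is a simplified stand-in under stated assumptions, not the procedure used by PhyloMap, which obtains model-corrected distances from Phylip; the function name p_distance_matrix and the handling of gap characters are hypothetical choices made for this example.</p>
<preformat><![CDATA[
```python
def p_distance_matrix(aligned):
    """Return the symmetric P-distance matrix for equal-length aligned sequences."""
    n = len(aligned)
    length = len(aligned[0])
    if any(len(s) != length for s in aligned):
        raise ValueError("sequences must be aligned to the same length")
    dist = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            # Assumption: columns with a gap ("-") in either sequence are skipped.
            pairs = [(a, b) for a, b in zip(aligned[i], aligned[j])
                     if a != "-" and b != "-"]
            differing = sum(1 for a, b in pairs if a != b)
            d = differing / len(pairs) if pairs else 0.0
            dist[i][j] = dist[j][i] = d
    return dist
```
]]></preformat>
<p>For real data, model-corrected distances would replace these raw proportions, since the P-distance does not account for multiple substitutions.</p>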
</sec>
<sec>
<title>2. Principal coordinate analysis</title>
<p>PCoA was first described by Gower [
<xref ref-type="bibr" rid="B27">27</xref>
]. It starts by converting the
<italic>n </italic>
×
<italic>n </italic>
distance matrix
<italic>D</italic>
, which has elements
<italic>d</italic>
<sub>
<italic>ij</italic>
</sub>
, to the similarity matrix
<italic>E </italic>
with elements
<disp-formula id="bmcM1">
<label>(1)</label>
<graphic xlink:href="1471-2105-12-248-i1.gif"></graphic>
</disp-formula>
</p>
<p>
<italic>E </italic>
is then centered to give the matrix
<italic>F </italic>
with elements
<disp-formula id="bmcM2">
<label>(2)</label>
<graphic xlink:href="1471-2105-12-248-i2.gif"></graphic>
</disp-formula>
</p>
<p>where
<inline-formula>
<inline-graphic xlink:href="1471-2105-12-248-i3.gif"></inline-graphic>
</inline-formula>
is the mean of row
<italic>i</italic>
,
<inline-formula>
<inline-graphic xlink:href="1471-2105-12-248-i4.gif"></inline-graphic>
</inline-formula>
is the mean of column
<italic>j</italic>
, and
<inline-formula>
<inline-graphic xlink:href="1471-2105-12-248-i5.gif"></inline-graphic>
</inline-formula>
is the grand mean of the matrix
<italic>E</italic>
.</p>
<p>The eigenvectors and the eigenvalues of the matrix
<italic>F </italic>
are calculated. Each eigenvector is normalized so that its sum of squares equals the corresponding eigenvalue. The eigenvectors are ranked by eigenvalue in decreasing order. The first two eigenvectors are used as the two-dimensional coordinates of each sequence. The proportion of information (variation) preserved by the first two eigenvectors is the ratio of the sum of the first two eigenvalues to the sum of all eigenvalues.</p>
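<p>The PCoA steps described above can be sketched in plain Python as follows. This is an illustrative sketch, not the PhyloMap implementation. It assumes the standard Gower forms for the two formulas, i.e. e_ij = -d_ij^2 / 2 for the similarity matrix and f_ij = e_ij - (mean of row i) - (mean of column j) + (grand mean) for the centered matrix, and it uses power iteration with deflation as a simple substitute for a full eigendecomposition. Power iteration finds the eigenvalue of largest magnitude, so the sketch is only appropriate when the leading eigenvalues are positive. The function name pcoa_2d is hypothetical.</p>
<preformat><![CDATA[
```python
import math

def pcoa_2d(dist, iters=200):
    """Return 2-D principal coordinates and the fraction of variation they keep."""
    n = len(dist)
    # Similarity matrix E (assumed Gower form): e_ij = -d_ij^2 / 2
    e = [[-0.5 * dist[i][j] ** 2 for j in range(n)] for i in range(n)]
    row_mean = [sum(e[i]) / n for i in range(n)]
    col_mean = [sum(e[i][j] for i in range(n)) / n for j in range(n)]
    grand = sum(row_mean) / n
    # Centered matrix F: subtract row and column means, add the grand mean.
    f = [[e[i][j] - row_mean[i] - col_mean[j] + grand for j in range(n)]
         for i in range(n)]
    trace = sum(f[i][i] for i in range(n))  # equals the sum of all eigenvalues
    axes, eigenvalues = [], []
    for k in range(2):
        v = [math.sin(i + k + 1.0) for i in range(n)]  # deterministic start vector
        for _ in range(iters):
            w = [sum(f[i][j] * v[j] for j in range(n)) for i in range(n)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm < 1e-12:
                break  # nothing left to extract along this axis
            v = [x / norm for x in w]
        vnorm = math.sqrt(sum(x * x for x in v))
        v = [x / vnorm for x in v]
        # Rayleigh quotient gives the eigenvalue; clamp tiny negative values.
        lam = sum(v[i] * f[i][j] * v[j] for i in range(n) for j in range(n))
        lam = max(lam, 0.0)
        # Scale so that the sum of squares of the axis equals the eigenvalue.
        axes.append([x * math.sqrt(lam) for x in v])
        eigenvalues.append(lam)
        # Deflate F so that the next pass finds the next eigenvector.
        f = [[f[i][j] - lam * v[i] * v[j] for j in range(n)] for i in range(n)]
    coords = [(axes[0][i], axes[1][i]) for i in range(n)]
    explained = sum(eigenvalues) / trace if trace > 0 else 0.0
    return coords, explained
```
]]></preformat>
<p>For three points lying on a line, the first axis alone reproduces the input distances and the preserved variation is close to 1; for data sets of several thousand sequences a numerical linear algebra library would be the more practical choice.</p>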
</sec>
<sec>
<title>3. Vector quantization (Clustering)</title>
<p>The clustering algorithm we choose here is the "Neural-Gas" [
<xref ref-type="bibr" rid="B12">12</xref>
]. Neural-Gas proceeds similarly to k-means but has the useful property that its results hardly depend on the initialization. Therefore, a single run is sufficient, and the algorithm yields stable results when run multiple times. The output of the clustering algorithm is a set of
<italic>k </italic>
cluster centers, where
<italic>k </italic>
is defined by the user. Neural-Gas provides cluster centers, each of which minimizes the mean distance to the sequences it represents. However, we are not really searching for clusters. What we want is a set of sequences that best represent the data set. Therefore, as a final step, we replace each cluster center with its closest sequence. Neural-Gas also guarantees that the centers are evenly distributed across the entire data set. In this application of Neural-Gas, we consider the algorithm as a sampling rather than as a clustering method. When the resulting center sequences are used to build a phylogenetic tree, the tree will explore the variation of the data set without bias. For details of the algorithm, please refer to Martinetz
<italic>et al. </italic>
[
<xref ref-type="bibr" rid="B12">12</xref>
]. The number of sampling sequences might influence the accuracy of the inferred phylogenetic tree (see Discussion). For visualization purposes, it should not be too low, or else the sampled sequences would not be sufficient to represent the variation of the data. If it is chosen too high, the result of PhyloMap might be difficult to inspect visually. In practice, we found that a sampling tree with no more than 50 sequences can be shown clearly in PhyloMap.</p>
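<p>The sampling idea can be sketched as follows. This Python example is a simplified illustration rather than the PhyloMap code: it assumes that each sequence is already represented by a coordinate vector (for instance its principal coordinates), whereas PhyloMap passes the distance matrix to Neural-Gas. The learning-rate and neighborhood schedules, the parameter values, and the function name neural_gas_sample are assumptions chosen for the example. After training, every center is replaced by the index of its closest data point, as described above.</p>
<preformat><![CDATA[
```python
import math
import random

def neural_gas_sample(points, k, epochs=50, seed=0):
    """Return indices of up to k data points chosen as representatives."""
    rng = random.Random(seed)
    centers = [tuple(p) for p in rng.sample(points, k)]
    steps = epochs * len(points)
    eps_i, eps_f = 0.5, 0.005      # step size: initial and final (assumed)
    lam_i, lam_f = k / 2.0, 0.01   # neighborhood range: initial and final (assumed)
    t = 0
    for _ in range(epochs):
        order = list(points)
        rng.shuffle(order)
        for x in order:
            frac = t / steps
            eps = eps_i * (eps_f / eps_i) ** frac
            lam = lam_i * (lam_f / lam_i) ** frac
            # Rank all centers by their distance to the presented sample.
            ranked = sorted(range(k), key=lambda c: math.dist(centers[c], x))
            for rank, c in enumerate(ranked):
                h = eps * math.exp(-rank / lam)  # closer ranks move further
                centers[c] = tuple(w + h * (xi - w)
                                   for w, xi in zip(centers[c], x))
            t += 1
    chosen = []
    for c in centers:
        # Substitute each center by its closest data point.
        nearest = min(range(len(points)), key=lambda i: math.dist(points[i], c))
        if nearest not in chosen:
            chosen.append(nearest)
    return chosen
```
]]></preformat>
<p>Because the update depends on the rank of each center rather than on a fixed assignment, all centers are drawn toward the data early on, which is what makes the outcome largely independent of the initialization.</p>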
</sec>
<sec>
<title>4. Phylogenetic tree construction</title>
<p>Subsequently, we use the sequences selected by the Neural-Gas to build a phylogenetic tree. The Neighbor-joining (NJ) tree is used in PhyloMap with the same distance measurement used for calculating the distance matrix for PCoA. Other non-distance-based tree building methods can also be used (see the discussion below). The NJ tree is unrooted since we just want to find the major lineages of the sequences rather than to portray the exact evolutionary history.</p>
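<p>For readers unfamiliar with Neighbor-joining, a compact Python sketch of the standard algorithm is given below. It is an illustration only; PhyloMap relies on an established implementation, and the function name neighbor_joining, the naming of inner nodes, and the edge-list output format are hypothetical.</p>
<preformat><![CDATA[
```python
def neighbor_joining(dist, names):
    """Return the unrooted NJ tree as a list of (node_a, node_b, edge_length)."""
    d = {a: {b: dist[i][j] for j, b in enumerate(names)}
         for i, a in enumerate(names)}
    active = list(names)
    edges = []
    next_id = 0
    while len(active) > 2:
        n = len(active)
        r = {a: sum(d[a][b] for b in active) for a in active}
        # Pick the pair that minimizes the NJ criterion Q(a, b).
        best = None
        for i, a in enumerate(active):
            for b in active[i + 1:]:
                q = (n - 2) * d[a][b] - r[a] - r[b]
                if best is None or q < best[0]:
                    best = (q, a, b)
        _, a, b = best
        u = "inner%d" % next_id
        next_id += 1
        # Branch lengths from the joined pair to the new inner node.
        la = 0.5 * d[a][b] + (r[a] - r[b]) / (2.0 * (n - 2))
        lb = d[a][b] - la
        edges.append((a, u, la))
        edges.append((b, u, lb))
        # Distances from the new inner node to all remaining nodes.
        d[u] = {u: 0.0}
        for c in active:
            if c != a and c != b:
                d[u][c] = d[c][u] = 0.5 * (d[a][c] + d[b][c] - d[a][b])
        active = [c for c in active if c != a and c != b] + [u]
    a, b = active
    edges.append((a, b, d[a][b]))
    return edges
```
]]></preformat>
<p>For an additive distance matrix, the returned edge lengths reproduce the branch lengths of the underlying tree exactly.</p>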
</sec>
<sec>
<title>5. Mapping the phylogenetic tree onto the PCoA result</title>
<p>The core algorithm of PhyloMap is to map the phylogenetic tree onto the two-dimensional coordinates calculated by PCoA. We adopted a multidimensional scaling method (MDS) similar to "Sammon's mapping" [
<xref ref-type="bibr" rid="B22">22</xref>
], but a few changes have been made to fit our specific problem.</p>
<p>A phylogenetic tree has two types of nodes:</p>
<p>• Leaf nodes: nodes that do not have any children; each leaf node represents a sequence.</p>
<p>• Inner nodes: nodes that have child nodes and a parent node. The root node of the tree can be considered a special inner node that has no parent node.</p>
<p>Each leaf node corresponds to one point in the two-dimensional PCoA result. The positions of these points are fixed, which means the coordinates of the leaf nodes are predefined and cannot be changed when drawing the tree. If we want to preserve the edge length between nodes, only the inner nodes can be moved. Unlike other MDS problems where the distances of one data point to all other data points are known, in PhyloMap each inner node is only constrained by three other nodes: one parent node and two child nodes.</p>
<p>We first define an error function
<italic>E</italic>
<sub>
<italic>s </italic>
</sub>
similar to "Sammon's mapping":
<disp-formula id="bmcM3">
<label>(3)</label>
<graphic xlink:href="1471-2105-12-248-i6.gif"></graphic>
</disp-formula>
</p>
<p>where
<italic>s </italic>
is a scaling factor that compensates for the distance difference between the tree space and the PCoA space (if the same distance measurement is used both in PCoA and tree building, then
<italic>s </italic>
= 1),
<inline-formula>
<inline-graphic xlink:href="1471-2105-12-248-i7.gif"></inline-graphic>
</inline-formula>
is the edge length between node
<italic>i </italic>
and node
<italic>j </italic>
in the tree, and
<italic>d</italic>
<sub>
<italic>ij </italic>
</sub>
is the distance between node
<italic>i </italic>
and node
<italic>j </italic>
in the 2D PCoA result.</p>
<p>The algorithm will then employ gradient descent on the inner nodes to minimize
<italic> E</italic>
<sub>
<italic>s</italic>
</sub>
. The distance
<italic>d</italic>
<sub>
<italic>ij </italic>
</sub>
defined between node
<italic>i </italic>
and
<italic>j </italic>
is the straight-line distance. However, in our problem, the straight-line distance can only generate poor results, either large
<italic>E</italic>
<sub>
<italic>s </italic>
</sub>
or a plot that is difficult to inspect visually. This is because the leaf nodes cannot move and, hence, all the distance constraints have to be satisfied by the inner nodes. If the inner nodes only explore a small space, which will provide attractive visual results,
<italic> E</italic>
<sub>
<italic>s </italic>
</sub>
might be too large to accurately preserve the tree distances. To solve this problem, we use the Bezier curve [
<xref ref-type="bibr" rid="B28">28</xref>
] to compensate for the distances that are shorter than in the original tree. In this case, if the distances are shorter, they can be exactly preserved in the PhyloMap. Only the distances larger than in the original tree will contribute to the error (Figure
<xref ref-type="fig" rid="F2">2C</xref>
). So in the gradient descent procedure, we use a strategy that tries to keep most of the straight-line distances shorter than in the original tree by updating the longer distances more frequently than the shorter ones. The error function
<italic>E</italic>
<sub>
<italic>b </italic>
</sub>
after Bezier curve compensation is defined as:
<disp-formula id="bmcM4">
<label>(4)</label>
<graphic xlink:href="1471-2105-12-248-i8.gif"></graphic>
</disp-formula>
</p>
<fig id="F2" position="float">
<label>Figure 2</label>
<caption>
<p>
<bold>NP PhyloMap</bold>
. (
<italic>A</italic>
) The PhyloMap for 2984 NP protein sequences. Each spot in the plot corresponds to one sequence, and the first two dimensions represent 56.7% of the total variation. The phylogenetic tree mapped onto the plot is shown in (
<italic>B</italic>
). The mapping error is 0.00259. The strain names that stand for the numbers in the plot are shown in the phylogenetic tree in (
<italic>B</italic>
). (
<italic>B</italic>
) The NJ tree of NP protein sequences built using distances inferred by the JTT model; 40 sequences have been selected by PhyloMap as data centers, the other two sequences (in bold italics) have been added manually. This tree has been mapped onto the PCoA result as shown in (
<italic>A</italic>
). Bootstrap values (1000 replications) for key nodes are shown. The tree was annotated using "TreeGraph 2" [
<xref ref-type="bibr" rid="B58">58</xref>
]. (
<italic>C</italic>
) The relationships of the distances between nodes in the original phylogenetic tree (
<italic>B</italic>
) and in the PhyloMap after mapping (
<italic>A</italic>
). Correlation coefficient: 0.998. Errors before Bezier curve compensation: 0.0496, after Bezier curve compensation: 0.00259. The errors after Bezier curve compensation are caused by the distances that are longer in the PhyloMap than in the original tree.</p>
</caption>
<graphic xlink:href="1471-2105-12-248-2"></graphic>
</fig>
<p>where
<inline-formula>
<inline-graphic xlink:href="1471-2105-12-248-i9.gif"></inline-graphic>
</inline-formula>
is the length of the Bezier curve between node
<italic>i </italic>
and node
<italic>j</italic>
.</p>
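<p>The Bezier curve compensation can be illustrated with the following Python sketch. It assumes a quadratic Bezier curve with a single control point placed on the perpendicular bisector of the straight line between the two nodes, and it finds by bisection the offset at which the curve length matches a target length (here standing for the scaled tree edge length). The curve type, the placement of the control point, and the function names bezier_length and control_point_for_length are assumptions of this example; the PhyloMap implementation may differ.</p>
<preformat><![CDATA[
```python
import math

def bezier_length(p0, c, p1, segments=200):
    """Approximate the length of a quadratic Bezier curve by a polyline."""
    total, prev = 0.0, p0
    for k in range(1, segments + 1):
        t = k / segments
        x = (1 - t) ** 2 * p0[0] + 2 * (1 - t) * t * c[0] + t ** 2 * p1[0]
        y = (1 - t) ** 2 * p0[1] + 2 * (1 - t) * t * c[1] + t ** 2 * p1[1]
        total += math.hypot(x - prev[0], y - prev[1])
        prev = (x, y)
    return total

def control_point_for_length(p0, p1, target):
    """Return a control point so that the curve from p0 to p1 has length target."""
    chord = math.dist(p0, p1)
    mid = ((p0[0] + p1[0]) / 2.0, (p0[1] + p1[1]) / 2.0)
    if chord == 0.0 or target <= chord:
        # The straight line is already as long as (or longer than) the target,
        # so no compensation is possible; the curve degenerates to the chord.
        return mid
    # Unit normal to the chord; the control point moves along this direction.
    nx, ny = -(p1[1] - p0[1]) / chord, (p1[0] - p0[0]) / chord
    lo, hi = 0.0, target  # an offset equal to target always overshoots
    for _ in range(100):
        h = (lo + hi) / 2.0
        c = (mid[0] + h * nx, mid[1] + h * ny)
        if bezier_length(p0, c, p1) < target:
            lo = h
        else:
            hi = h
    return (mid[0] + hi * nx, mid[1] + hi * ny)
```
]]></preformat>
<p>Edges whose straight-line distance is already longer than the target are returned unchanged, which corresponds to the case in which the remaining error arises.</p>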
<p>The algorithm can be summarized as follows:</p>
<p>
<bold>Input</bold>
: tree:
<italic>T</italic>
; leaf-node coordinates:
<italic>C</italic>
<sub>
<italic>leaf</italic>
</sub>
; scaling factor:
<italic>s</italic>
; max. number of iterations:
<italic>maxiters</italic>
; error threshold
<italic>e</italic>
.</p>
<p>
<bold>Output</bold>
: all node coordinates
<italic>C</italic>
<sub>
<italic>node</italic>
</sub>
, corresponding Bezier curve control point
<italic>C</italic>
<sub>
<italic>bezier </italic>
</sub>
and error
<italic>E</italic>
<sub>
<italic>b </italic>
</sub>
after Bezier curve compensation.</p>
<p>1:
<italic>Du </italic>
:= calculate the desired distance matrix using all nodes in
<italic>T</italic>
.</p>
<p>2:
<italic>C</italic>
<sub>
<italic>node </italic>
</sub>
:= randomly initialize the coordinates of the inner nodes and append
<italic>C</italic>
<sub>
<italic>leaf</italic>
</sub>
.</p>
<p>3:
<italic>D</italic>
<sub>
<italic>s </italic>
</sub>
:= calculate the actual distance matrix using
<italic>C</italic>
<sub>
<italic>node</italic>
</sub>
.</p>
<p>4:
<bold>while </bold>
<italic>maxiters </italic>
is not reached and
<italic>e</italic>
<sub>
<italic>i </italic>
</sub>
&gt;
<italic>e</italic>
</p>
<p>5:
<bold>for each </bold>
inner node</p>
<p>6: update the coordinates of the inner node using gradient descent once every five iterations.</p>
<p>7: update the coordinates of the inner node using gradient descent only if there exists at least one edge connected to this node with
<inline-formula>
<inline-graphic xlink:href="1471-2105-12-248-i10.gif"></inline-graphic>
</inline-formula>
four times every five iterations.</p>
<p>8: update
<italic>D</italic>
<sub>
<italic>s </italic>
</sub>
using the new coordinates.</p>
<p>9:
<bold>end for each</bold>
</p>
<p>10:
<italic>e</italic>
<sub>
<italic>i </italic>
</sub>
:= calculate error using equation (3).</p>
<p>11:
<bold>end while</bold>
</p>
<p>12:
<bold>for each </bold>
<inline-formula>
<inline-graphic xlink:href="1471-2105-12-248-i11.gif"></inline-graphic>
</inline-formula>
</p>
<p>13:
<italic>C</italic>
<sub>
<italic>bezier </italic>
</sub>
:= calculate the Bezier curve control point so that
<inline-formula>
<inline-graphic xlink:href="1471-2105-12-248-i12.gif"></inline-graphic>
</inline-formula>
.</p>
<p>14:
<bold>end for each</bold>
</p>
<p>15:
<italic>E</italic>
<sub>
<italic>b </italic>
</sub>
:= calculate error using equation (4).</p>
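<p>A simplified Python sketch of the gradient descent part of this procedure (steps 2 to 11) is given below. It is not the PhyloMap implementation. As assumptions of this example, the error is taken to be the classical Sammon stress restricted to the tree edges, with the scaled edge length s times the tree length as the desired distance; the step size, the explicit starting positions of the inner nodes (instead of random initialization), and the names edge_error and map_tree are hypothetical. All edge lengths are assumed to be positive. Leaf nodes stay fixed, and inner nodes are updated in every fifth iteration unconditionally and otherwise only when at least one of their edges is drawn too long.</p>
<preformat><![CDATA[
```python
import math

def edge_error(coords, edges, s=1.0):
    """Sammon-style stress over tree edges given as (node_a, node_b, tree_length)."""
    total = sum(s * t for _, _, t in edges)
    err = sum((math.dist(coords[a], coords[b]) - s * t) ** 2 / (s * t)
              for a, b, t in edges)
    return err / total

def map_tree(leaf_coords, inner_init, edges, s=1.0, maxiters=1000, e=1e-6,
             rate=0.01):
    """Move inner nodes by gradient descent while leaf coordinates stay fixed."""
    coords = dict(leaf_coords)
    coords.update(inner_init)
    for it in range(maxiters):
        for node in inner_init:
            incident = [(b if a == node else a, t) for a, b, t in edges
                        if node in (a, b)]
            too_long = any(math.dist(coords[node], coords[o]) > s * t
                           for o, t in incident)
            # Every fifth iteration all inner nodes move; otherwise only those
            # with at least one edge longer than its desired length.
            if it % 5 != 0 and not too_long:
                continue
            x, y = coords[node]
            gx = gy = 0.0
            for o, t in incident:
                d = math.dist((x, y), coords[o])
                if d == 0.0:
                    continue  # gradient undefined for coincident points
                g = 2.0 * (d - s * t) / (s * t * d)
                gx += g * (x - coords[o][0])
                gy += g * (y - coords[o][1])
            coords[node] = (x - rate * gx, y - rate * gy)
        if edge_error(coords, edges, s) < e:
            break
    return coords
```
]]></preformat>
<p>In a complete implementation, the edges that remain shorter than desired after this loop would then be drawn as Bezier curves of the desired length, so that only the edges drawn too long contribute to the final error.</p>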
</sec>
</sec>
<sec>
<title>Influenza A virus genome data</title>
<p>We compiled a data set containing 74,309 sequences of influenza A virus internal proteins as available from the NCBI database [
<xref ref-type="bibr" rid="B29">29</xref>
] on 03-01-2010 (as summarized in Table
<xref ref-type="table" rid="T1">1</xref>
). We defined strict rules [
<xref ref-type="bibr" rid="B30">30</xref>
] for data validation to ensure a high quality of our dataset. Each sequence included in the data set is complete or nearly complete.</p>
<table-wrap id="T1" position="float">
<label>Table 1</label>
<caption>
<p>Number of protein sequences used in the data set</p>
</caption>
<table frame="hsides" rules="groups">
<thead>
<tr>
<th></th>
<th align="left">PB2</th>
<th align="left">PB1</th>
<th align="left">PA</th>
<th align="left">NP</th>
<th align="left">M1</th>
<th align="left">M2</th>
<th align="left">NS1</th>
<th align="left">NS2</th>
</tr>
</thead>
<tbody>
<tr>
<td align="left">No. of sequences</td>
<td align="left">8397</td>
<td align="left">8577</td>
<td align="left">8522</td>
<td align="left">8590</td>
<td align="left">11258</td>
<td align="left">10111</td>
<td align="left">9982</td>
<td align="left">8872</td>
</tr>
<tr>
<td align="left">No. of non-redundant sequences</td>
<td align="left">4384</td>
<td align="left">4022</td>
<td align="left">4173</td>
<td align="left">2984</td>
<td align="left">1496</td>
<td align="left">2016</td>
<td align="left">3734</td>
<td align="left">1650</td>
</tr>
</tbody>
</table>
</table-wrap>
<p>All eight gene products were aligned separately using MUSCLE [
<xref ref-type="bibr" rid="B31">31</xref>
], and the alignment results were curated manually to assure a high quality such that gaps were minimal. For calculating the distance matrix (described above), protein sequences were used. Protein rather than nucleotide sequences were used because two sequences may vary greatly at the nucleotide level and yet be very close at the amino-acid level due to functional constraints [
<xref ref-type="bibr" rid="B15">15</xref>
,
<xref ref-type="bibr" rid="B19">19</xref>
]; thus, the distance between two amino-acid sequences is more relevant for the assessment of their functional differences. For most of the internal genes, around half of the protein sequences are redundant. Hence, only one of a set of identical sequences was used to compose the data set as the input of the PhyloMap.</p>
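<p>The redundancy-removal step described above can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the function name <monospace>collapse_identical</monospace>, the case-insensitive comparison, and the choice of the first occurrence as the representative are all assumptions made for the example.</p>
<preformat><![CDATA[
```python
def collapse_identical(records):
    """Keep one representative per distinct sequence.

    `records` is an iterable of (name, sequence) pairs. Returns a pair
    (representatives, members): `representatives` lists the first record seen
    for each distinct sequence, and `members` maps each representative name to
    the names of all records that share its sequence.
    """
    first_seen = {}        # normalized sequence -> representative name
    representatives = []   # (name, sequence) pairs kept as PhyloMap input
    members = {}           # representative name -> names of identical records
    for name, seq in records:
        key = seq.upper()  # assumption: comparison ignores letter case
        if key not in first_seen:
            first_seen[key] = name
            representatives.append((name, seq))
            members[name] = [name]
        else:
            members[first_seen[key]].append(name)
    return representatives, members


# Hypothetical toy input: three records, two of them identical.
reps, members = collapse_identical(
    [("seq_a", "MKTII"), ("seq_b", "MKTII"), ("seq_c", "MKTIV")]
)
```
]]></preformat>
<p>Keeping the <monospace>members</monospace> mapping makes it possible to trace a point in the plot back to every strain that shares the representative's sequence.</p>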
</sec>
</sec>
<sec>
<title>Results</title>
<sec>
<title>PhyloMap reduces the risk of misinterpretation</title>
<p>We have generated the PhyloMap for all influenza A virus internal genes using their protein sequences, i.e. PB2, PB1, PA, NP, M1, M2, NS1, and NS2 (Figures
<xref ref-type="fig" rid="F2">2</xref>
,
<xref ref-type="fig" rid="F3">3</xref>
,
<xref ref-type="fig" rid="F4">4</xref>
,
<xref ref-type="fig" rid="F5">5</xref>
,
<xref ref-type="fig" rid="F6">6</xref>
,
<xref ref-type="fig" rid="F7">7</xref>
,
<xref ref-type="fig" rid="F8">8</xref>
,
<xref ref-type="fig" rid="F9">9</xref>
,
<xref ref-type="fig" rid="F10">10</xref>
and
<xref ref-type="fig" rid="F11">11</xref>
). Figure
<xref ref-type="fig" rid="F2">2A</xref>
illustrates the results for the example of the influenza A virus NP gene. The following major lineages can be easily identified:
<italic>(i)</italic>
, seasonal human H1N1 (as shown by the data points close to "12: A/Taiwan/5072/1999(H1N1)"),
<italic>(ii)</italic>
, seasonal human H3N2 (as shown by the data points close to "2: A/Waikato/122/2003(H3N2)"),
<italic>(iii)</italic>
, early human (as shown by the data points close to "15: A/United Kingdom/1/1933(H1N1)"),
<italic>(iv)</italic>
, classical swine [
<xref ref-type="bibr" rid="B32">32</xref>
] (as shown by the data points close to "26: A/Swine/Wisconsin/163/97(H1N1)", which includes S-OIV),
<italic>(v)</italic>
, equine (as shown by the data points close to "29: A/equine/Sao Paulo/4/1976(H7N7)"), and
<italic>(vi)</italic>
, avian (as shown by the data points close to "20: A/gray teal/Australia/2/1979(H4N4)"). PhyloMap has successfully captured all major lineages of the influenza A virus NP gene that were shown to exist in a previous study [
<xref ref-type="bibr" rid="B3">3</xref>
] using sequences sampled manually.</p>
<fig id="F3" position="float">
<label>Figure 3</label>
<caption>
<p>
<bold>PB2 PhyloMap</bold>
. (
<italic>A</italic>
) The PhyloMap for 4384 PB2 protein sequences. Each spot in the plot corresponds to one sequence, and the first two dimensions represent 49.6% of the total variation. The phylogenetic tree mapped onto the plot is shown in (
<italic>B</italic>
). The mapping error is 0.00527. The strain names that stand for the numbers in the plot are shown in the phylogenetic tree in (
<italic>B</italic>
). (
<italic>B</italic>
) The NJ tree of PB2 protein sequences, built using distances inferred by the JTT model; 40 sequences have been selected by PhyloMap as data centers, and the other 2 sequences (in bold italics) have been added manually. This tree has been mapped onto the PCoA result as shown in (
<italic>A</italic>
). Bootstrap values (1000 replications) for key nodes are shown.</p>
</caption>
<graphic xlink:href="1471-2105-12-248-3"></graphic>
</fig>
<fig id="F4" position="float">
<label>Figure 4</label>
<caption>
<p>
<bold>PB1 PhyloMap</bold>
. (
<italic>A</italic>
) The PhyloMap for 4022 PB1 protein sequences. Each spot in the plot corresponds to one sequence, and the first two dimensions represent 41.4% of the total variation. The phylogenetic tree mapped onto the plot is shown in (
<italic>B</italic>
). The mapping error is 0.00261. The strain names that stand for the numbers in the plot are shown in the phylogenetic tree in (
<italic>B</italic>
). (
<italic>B</italic>
) The NJ tree of PB1 protein sequences, built using distances inferred by the JTT model; 40 sequences have been selected by PhyloMap as data centers, and the other 2 sequences (in bold italics) have been added manually. This tree has been mapped onto the PCoA result as shown in (
<italic>A</italic>
). Bootstrap values (1000 replications) for key nodes are shown.</p>
</caption>
<graphic xlink:href="1471-2105-12-248-4"></graphic>
</fig>
<fig id="F5" position="float">
<label>Figure 5</label>
<caption>
<p>
<bold>PA PhyloMap</bold>
. (
<italic>A</italic>
) The PhyloMap for 4173 PA protein sequences. Each spot in the plot corresponds to one sequence, and the first two dimensions represent 47.6% of the total variation. The phylogenetic tree mapped onto the plot is shown in (
<italic>B</italic>
). The mapping error is 0.00253. The strain names that stand for the numbers in the plot are shown in the phylogenetic tree in (
<italic>B</italic>
). (
<italic>B</italic>
) The NJ tree of PA protein sequences, built using distances inferred by the JTT model; 40 sequences have been selected by PhyloMap as data centers, and the other 2 sequences (in bold italics) have been added manually. This tree has been mapped onto the PCoA result as shown in (
<italic>A</italic>
). Bootstrap values (1000 replications) for key nodes are shown.</p>
</caption>
<graphic xlink:href="1471-2105-12-248-5"></graphic>
</fig>
<fig id="F6" position="float">
<label>Figure 6</label>
<caption>
<p>
<bold>M1 PhyloMap</bold>
. (
<italic>A</italic>
) The PhyloMap for 1496 M1 protein sequences. Each spot in the plot corresponds to one sequence, and the first two dimensions represent 43.3% of the total variation. The phylogenetic tree mapped onto the plot is shown in (
<italic>B</italic>
). The mapping error is 0.00531. The strain names that stand for the numbers in the plot are shown in the phylogenetic tree in (
<italic>B</italic>
). (
<italic>B</italic>
) The NJ tree of M1 protein sequences, built using distances inferred by the JTT model; 40 sequences have been selected by PhyloMap as data centers, and the other 2 sequences (in bold italics) have been added manually. This tree has been mapped onto the PCoA result as shown in (
<italic>A</italic>
). Bootstrap values (1000 replications) for key nodes are shown.</p>
</caption>
<graphic xlink:href="1471-2105-12-248-6"></graphic>
</fig>
<fig id="F7" position="float">
<label>Figure 7</label>
<caption>
<p>
<bold>M2 PhyloMap</bold>
. (
<italic>A</italic>
) The PhyloMap for 2016 M2 protein sequences. Each spot in the plot corresponds to one sequence, and the first two dimensions represent 44.9% of the total variation. The phylogenetic tree mapped onto the plot is shown in (
<italic>B</italic>
). The mapping error is 0.013. The strain names that stand for the numbers in the plot are shown in the phylogenetic tree in (
<italic>B</italic>
). (
<italic>B</italic>
) The NJ tree of M2 protein sequences, built using distances inferred by the JTT model; 40 sequences have been selected by PhyloMap as data centers, and the other 2 sequences (in bold italics) have been added manually. This tree has been mapped onto the PCoA result as shown in (
<italic>A</italic>
). Bootstrap values (1000 replications) for key nodes are shown.</p>
</caption>
<graphic xlink:href="1471-2105-12-248-7"></graphic>
</fig>
<fig id="F8" position="float">
<label>Figure 8</label>
<caption>
<p>
<bold>NS1 PhyloMap</bold>
. (
<italic>A</italic>
) The PhyloMap for 3734 NS1 protein sequences. Each spot in the plot corresponds to one sequence, and the first two dimensions represent 60.1% of the total variation. The phylogenetic tree mapped onto the plot is shown in (
<italic>B</italic>
). The mapping error is 0.0023. The strain names that stand for the numbers in the plot are shown in the phylogenetic tree in (
<italic>B</italic>
). (
<italic>B</italic>
) The NJ tree of NS1 protein sequences, built using distances inferred by the JTT model; 40 sequences have been selected by PhyloMap as data centers, and the other 2 sequences (in bold italics) have been added manually. This tree has been mapped onto the PCoA result as shown in (
<italic>A</italic>
). Bootstrap values (1000 replications) for key nodes are shown.</p>
</caption>
<graphic xlink:href="1471-2105-12-248-8"></graphic>
</fig>
<fig id="F9" position="float">
<label>Figure 9</label>
<caption>
<p>
<bold>NS1 PhyloMap excluding Group B</bold>
. (
<italic>A</italic>
) The PhyloMap for 3283 NS1 protein sequences excluding Group B. Each spot in the plot corresponds to one sequence, and the first two dimensions represent 48.2% of the total variation. The phylogenetic tree mapped onto the plot is shown in (
<italic>B</italic>
). The mapping error is 0.00252. The strain names that stand for the numbers in the plot are shown in the phylogenetic tree in (
<italic>B</italic>
). (
<italic>B</italic>
) The NJ tree of NS1 protein sequences excluding Group B, built using distances inferred by the JTT model; 40 sequences have been selected by PhyloMap as data centers, and the other 2 sequences (in bold italics) have been added manually. This tree has been mapped onto the PCoA result as shown in (
<italic>A</italic>
). Bootstrap values (1000 replications) for key nodes are shown.</p>
</caption>
<graphic xlink:href="1471-2105-12-248-9"></graphic>
</fig>
<fig id="F10" position="float">
<label>Figure 10</label>
<caption>
<p>
<bold>NS2 PhyloMap</bold>
. (
<italic>A</italic>
) The PhyloMap for 1650 NS2 protein sequences. Each spot in the plot corresponds to one sequence, and the first two dimensions represent 52.7% of the total variation. The phylogenetic tree mapped onto the plot is shown in (
<italic>B</italic>
). The mapping error is 0.00485. The strain names that stand for the numbers in the plot are shown in the phylogenetic tree in (
<italic>B</italic>
). (
<italic>B</italic>
) The NJ tree of NS2 protein sequences, built using distances inferred by the JTT model; 40 sequences have been selected by PhyloMap as data centers, and the other 2 sequences (in bold italics) have been added manually. This tree has been mapped onto the PCoA result as shown in (
<italic>A</italic>
). Bootstrap values (1000 replications) for key nodes are shown.</p>
</caption>
<graphic xlink:href="1471-2105-12-248-10"></graphic>
</fig>
<fig id="F11" position="float">
<label>Figure 11</label>
<caption>
<p>
<bold>NS2 PhyloMap excluding Group B</bold>
. (
<italic>A</italic>
) The PhyloMap for 1471 NS2 protein sequences excluding Group B. Each spot in the plot corresponds to one sequence, and the first two dimensions represent 38.7% of the total variation. The phylogenetic tree mapped onto the plot is shown in (
<italic>B</italic>
). The mapping error is 0.00427. The strain names that stand for the numbers in the plot are shown in the phylogenetic tree in (
<italic>B</italic>
). (
<italic>B</italic>
) The NJ tree of NS2 protein sequences excluding Group B, built using distances inferred by the JTT model; 40 sequences have been selected by PhyloMap as data centers, and the other 2 sequences (in bold italics) have been added manually. This tree has been mapped onto the PCoA result as shown in (
<italic>A</italic>
). Bootstrap values (1000 replications) for key nodes are shown.</p>
</caption>
<graphic xlink:href="1471-2105-12-248-11"></graphic>
</fig>
<p>It is obvious that PCoA alone can already identify most of the major lineages; however, without the support of the mapping tree, it fails to portray the distances between some strains. The straight-line distance between "29: A/equine/Sao Paulo/4/1976(H7N7)" and "33: A/smew/Sweden/V820/2006(H5N1)" is short, but if we follow the tree, the distance is substantially longer. The real distance may need another dimension in the PCoA to be displayed. The tree here has served to add more dimensions to the 2D PCoA plot.</p>
<p>While the topology of the tree is defined, different tree-drawing algorithms can generate very different tree representations. The subtrees can be arbitrarily placed by the tree-drawing algorithms [
<xref ref-type="bibr" rid="B33">33</xref>
] and can be moved up and down with a certain degree of freedom. The relationships between taxa usually cannot be clearly observed without further manual adjustment of the tree. In PhyloMap, PCoA defines the positions of the leaf nodes, which intuitively provide clustering information and the scale of their divergences. In a phylogenetic tree, some intermediate sequences would be arbitrarily placed into one of the major lineages [
<xref ref-type="bibr" rid="B9">9</xref>
]; however, with the guidance of PCoA, the intermediate position of such sequences becomes apparent. For example, in Figure
<xref ref-type="fig" rid="F2">2B</xref>
, we might interpret the phylogenetic tree by putting the protein sequence of "9: A/Singapore/1-MA12B/1957(H2N2)" into the human H3N2 lineage if only the tree is present, but its obvious intermediate position can be clearly seen in the PhyloMap (Figure
<xref ref-type="fig" rid="F2">2A</xref>
). The low bootstrap value of that subtree also suggests caution should be applied when drawing conclusions from the phylogenetic tree.</p>
</sec>
<sec>
<title>The diversity of influenza A virus internal genes</title>
<p>Six distinct major lineages can be identified from the PhyloMap for all genes, i.e. seasonal human H3N2, seasonal human H1N1, early human, classical swine, equine, and avian viruses. The latter have been further separated into two sublineages (western hemisphere avian lineage and eastern hemisphere avian lineage) in a previous study [
<xref ref-type="bibr" rid="B3">3</xref>
] that used nucleotide sequences, but this cannot be unambiguously observed from the PhyloMap built with protein sequences. For PB2, the triple reassortment swine strains [
<xref ref-type="bibr" rid="B34">34</xref>
,
<xref ref-type="bibr" rid="B35">35</xref>
], which include the S-OIV, form a visually separable lineage in PhyloMap (Figure
<xref ref-type="fig" rid="F3">3</xref>
).</p>
<p>The PhyloMap shows similar patterns for PB2, PA, NP, M1, and M2 (Figures
<xref ref-type="fig" rid="F3">3</xref>
,
<xref ref-type="fig" rid="F5">5</xref>
,
<xref ref-type="fig" rid="F2">2</xref>
,
<xref ref-type="fig" rid="F6">6</xref>
and
<xref ref-type="fig" rid="F7">7</xref>
). The NS1 and NS2 genes are different from other genes by having a unique lineage called Group B. We can see from the PhyloMap plot (Figures
<xref ref-type="fig" rid="F8">8</xref>
and
<xref ref-type="fig" rid="F10">10</xref>
) that NS1 and NS2 Group B has a clear boundary and is far away from other sequences, which are collectively called Group A [
<xref ref-type="bibr" rid="B36">36</xref>
]. Because of Group B, the NS1 and NS2 PhyloMap looks very different from other genes. However, if we remove Group B sequences from the NS1 and NS2 data set and recalculate the plot (Figures
<xref ref-type="fig" rid="F9">9</xref>
and
<xref ref-type="fig" rid="F11">11</xref>
), we can see a topology similar to other genes (Figures
<xref ref-type="fig" rid="F2">2</xref>
,
<xref ref-type="fig" rid="F3">3</xref>
,
<xref ref-type="fig" rid="F4">4</xref>
,
<xref ref-type="fig" rid="F5">5</xref>
,
<xref ref-type="fig" rid="F6">6</xref>
and
<xref ref-type="fig" rid="F7">7</xref>
). NS1 and NS2 Group B is composed of a variety of subtypes that are mostly avian strains, with only a few human and swine cases. The sample time spans the years from 1949 to 2008. However, other internal genes in the strains that contain Group B NS1 and NS2 genes do not form a separate lineage and most of them fall into the lineage of avian viruses.</p>
<p>PB1 also shows a pattern very different from other genes. PB1 of human H3N2 was derived from avian strains in 1968 through reassortment [
<xref ref-type="bibr" rid="B3">3</xref>
,
<xref ref-type="bibr" rid="B37">37</xref>
]. We can see from the PhyloMap that the human H3N2 virus PB1 sequences are closer to avian strains than other H3N2 genes. Moreover, PB1 shows a more conservative evolution pattern, as the genetic distances between different lineages are much smaller than for other internal genes. Another recent study also suggested the conservation of PB1 [
<xref ref-type="bibr" rid="B19">19</xref>
]. This is easy to explain, as PB1 is the catalytic subunit of the viral RNA-dependent RNA polymerase and should have a stable function in any host. A single amino-acid exchange in the functional site may abolish protein function and interrupt the viral life cycle.</p>
<p>The swine influenza viruses spread throughout the entire PhyloMap, further supporting the idea of swine being a "mixing-vessel" [
<xref ref-type="bibr" rid="B38">38</xref>
,
<xref ref-type="bibr" rid="B39">39</xref>
]. We also observed from the currently sequenced samples that there are no avian strains containing internal gene segments from seasonal human strains. In contrast, there are many human strains carrying some internal gene segments from avian viruses. This observation, combined with the fact that the internal gene segments of seasonal human strains can be clearly separated from those of avian strains (except for PB1), suggests that once the internal gene segments were fully adapted to humans, they lost the ability to infect avian hosts.</p>
<p>By observing the first few dimensions of the PCoA results, one can tell which major forces cause the data to vary. We can see that the first dimension in our PCoA results on the internal genes generally reflects the host differences, and the second dimension reflects some of the subtype differences. The third dimension (not shown in the figures) further separates the swine and equine strains from others. The above observations show that the diversities of influenza A virus internal genes are mainly shaped by host differences and virus subtypes. However, using only subtype and host information is still not enough to distinguish major lineages among internal genes. For instance, the human H1N1 strains contain three major lineages: human seasonal H1N1, early human H1N1, and 2009 pandemic H1N1. These are highlighted in additional files (Additional files
<xref ref-type="supplementary-material" rid="S1">1</xref>
, Figure S1, Additional files
<xref ref-type="supplementary-material" rid="S2">2</xref>
, Figure S2, Additional files
<xref ref-type="supplementary-material" rid="S3">3</xref>
, Figure S3, Additional files
<xref ref-type="supplementary-material" rid="S4">4</xref>
, Figure S4, Additional files
<xref ref-type="supplementary-material" rid="S5">5</xref>
, Figure S5, Additional files
<xref ref-type="supplementary-material" rid="S6">6</xref>
, Figure S6, Additional files
<xref ref-type="supplementary-material" rid="S7">7</xref>
, Figure S7, Additional files
<xref ref-type="supplementary-material" rid="S8">8</xref>
, Figure S8 and Additional files
<xref ref-type="supplementary-material" rid="S9">9</xref>
).</p>
</sec>
<sec>
<title>PhyloMap helps locate the origin of emerging influenza A viruses</title>
<p>As the main patterns of influenza A internal genes can be clearly seen from the PhyloMap result, one can start to investigate the more subtle relationships of the data by zooming in onto certain clusters or adding sequences of interest into the sampling tree. The sequences of the sampling tree found by the Neural-Gas approach minimize the quadratic errors. As a result, they can well represent the diversity of the data set. When it comes to finding the origin of a new strain, the samplings can provide a good reference data set that would not miss important lineages. We have mapped the genes of 1918 "Spanish flu" ("
<bold>
<italic>A/Brevig Mission/1/1918(H1N1)</italic>
</bold>
") and S-OIV ("
<bold>
<italic>A/California/04/2009(H1N1)</italic>
</bold>
") into the PhyloMap in addition to the sampling sequences. In our sampling trees, the "Spanish flu" (internal genes) forms a separate branch and cannot be put into any major lineages. This orphan position of "Spanish flu" seems to support the previous notion that these gene segments may have been acquired from a reservoir of influenza virus that has not yet been sampled [
<xref ref-type="bibr" rid="B17">17</xref>
,
<xref ref-type="bibr" rid="B18">18</xref>
]. One can also easily identify the origin of every internal gene of S-OIV from PhyloMap: PB2, PA, M1, and M2 from avian strains; PB1 from human H3N2; NP, NS1, and NS2 from classical swine.</p>
</sec>
</sec>
<sec>
<title>Discussion</title>
<p>While phylogenetic tree inference methods are relatively well developed, their interpretation relies heavily on visual inspection [
<xref ref-type="bibr" rid="B40">40</xref>
]. The difficulties of analyzing a huge tree have been mainly tackled by developing sophisticated tree visualization software. Visual data exploration usually follows a three-step process [
<xref ref-type="bibr" rid="B41">41</xref>
]: overview, zoom and filter, and details-on-demand. Despite advances in the visualization software [
<xref ref-type="bibr" rid="B42">42</xref>
,
<xref ref-type="bibr" rid="B43">43</xref>
], it is very difficult to comprehend the entire tree during the overview stage. When the data set reaches a few thousand sequences, this way of phylogeny analysis becomes almost impossible. PhyloMap was developed specifically for the overview process by summarizing the main phylogeny information. Both PCoA and "Neural-Gas" can be considered data compression techniques suitable to preserve the most important information in the data. Once the main trends in the data set are identified, one can zoom in onto areas of interest, thus reducing the data set to a size that can be well visualized by traditional phylogenetic trees.</p>
<p>Other means of adding more information to ordination such as superimposing a minimal spanning tree and a relative neighborhood graph have been proposed by Guiller [
<xref ref-type="bibr" rid="B44">44</xref>
]. However, all those methods require using all the data points, thereby only generating unrecognizable results when the data set is large. Our proposed method can also serve as a general way of adding another layer of information to any ordination analysis of data relationships that can alternatively be described by using a tree structure.</p>
<p>The PCoA used here is a linear dimensionality-reduction technique [
<xref ref-type="bibr" rid="B45">45</xref>
,
<xref ref-type="bibr" rid="B46">46</xref>
]. Despite the recent advances in nonlinear dimensionality reduction, we find PCoA very suitable for PhyloMap. First, PCoA finds the greatest variance in the data set; in other words, it preserves the global pattern and this is one of the main purposes of PhyloMap. Other methods such as Isomap [
<xref ref-type="bibr" rid="B45">45</xref>
] using geodesic distance might not make too much sense in phylogenetic analysis. Methods such as LLE [
<xref ref-type="bibr" rid="B46">46</xref>
] are designed to preserve local properties, which makes them unsuitable for PhyloMap. Second, PCoA is robust in the sense that it does not depend on initialization and does not require other parameters. The well-established algorithm for solving PCoA is both computationally efficient and numerically stable. Although the phylogenetic distances inferred using some evolutionary models are not Euclidean, resulting in negative eigenvalues, in practice, those values are usually very small compared to the first few eigenvalues. Thus, they have only minor influence on the results and will not distort the main trends in the data.</p>
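<p>The role of the eigenvalues in this argument can be illustrated with a short sketch. It shows the standard double-centering step of PCoA and two simple summaries of an eigenvalue spectrum. The function names are hypothetical, the eigendecomposition itself is left to a linear-algebra library, and excluding negative eigenvalues from the total is one common convention assumed here, not necessarily the one used to compute the percentages reported in the figure captions.</p>
<preformat><![CDATA[
```python
def double_center(dist):
    """Return B = -1/2 * J * D^2 * J for a symmetric distance matrix.

    PCoA coordinates are obtained from the eigenvectors of B; the sum of the
    eigenvalues of B equals its trace.
    """
    n = len(dist)
    a = [[-0.5 * dist[i][j] ** 2 for j in range(n)] for i in range(n)]
    # Symmetry is assumed, so row means and column means coincide.
    row_mean = [sum(row) / n for row in a]
    grand_mean = sum(row_mean) / n
    return [
        [a[i][j] - row_mean[i] - row_mean[j] + grand_mean for j in range(n)]
        for i in range(n)
    ]


def explained_fraction(eigenvalues, k=2):
    """Share of the positive eigenvalue mass held by the k largest eigenvalues."""
    positive = sorted((v for v in eigenvalues if v > 0), reverse=True)
    total = sum(positive)
    if total == 0:
        raise ValueError("no positive eigenvalues")
    return sum(positive[:k]) / total


def negative_share(eigenvalues):
    """Magnitude of the negative eigenvalues relative to the positive ones."""
    positive = sum(v for v in eigenvalues if v > 0)
    negative = sum(-v for v in eigenvalues if v < 0)
    if positive == 0:
        raise ValueError("no positive eigenvalues")
    return negative / positive


# Toy example: three points on a line at positions 0, 1 and 3.
b = double_center([[0, 1, 3], [1, 0, 2], [3, 2, 0]])
# Hypothetical spectrum with one small negative eigenvalue.
spectrum = [6.0, 3.0, 1.0, -0.1]
first_two = explained_fraction(spectrum, k=2)
non_euclidean = negative_share(spectrum)
```
]]></preformat>
<p>A small value of <monospace>negative_share</monospace> corresponds to the situation described above, in which the non-Euclidean part of the distances has only a minor influence on the first few dimensions.</p>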
<p>In PhyloMap, we use distance-based methods to build the sampling tree. As the distances are measured in the same way both in PCoA and in the phylogenetic tree, when mapping the tree onto the PCoA result, the error can be minimized. However, the sampling tree can also be built with parsimony-based or maximum-likelihood based methods. But in such cases, the edge lengths in the tree and the 2D PCoA result might not be on the same scale. We need to estimate the scaling factor
<italic>s </italic>
in equation (4). It is very difficult to exactly estimate
<italic>s </italic>
before the mapping is made, so
<italic>s </italic>
can only be searched within a certain range (the ratio of the distance between the furthest cluster centers in the PCoA result to the corresponding length in the tree can be a good starting value). This problem does not exist in classical MDS, since all the data points during the mapping can move freely, but in PhyloMap, the leaf nodes are fixed.</p>
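<p>The starting value suggested above can be computed as in the following sketch. It is illustrative only: the function names, the dictionary-based inputs, and the geometric grid of candidate values around the starting point are assumptions, and whether the scaling factor multiplies tree lengths or plot distances depends on the form of equation (4).</p>
<preformat><![CDATA[
```python
import math
from itertools import combinations


def plot_distance(p, q):
    """Euclidean distance between two 2D points."""
    return math.hypot(p[0] - q[0], p[1] - q[1])


def initial_scale(centers_2d, tree_dist):
    """Ratio of plot distance to tree path length for the furthest centers.

    `centers_2d` maps a center name to its (x, y) position in the PCoA plot.
    `tree_dist` maps frozenset({a, b}) to the path length between leaves a
    and b in the sampling tree.
    """
    a, b = max(
        combinations(sorted(centers_2d), 2),
        key=lambda pair: plot_distance(centers_2d[pair[0]], centers_2d[pair[1]]),
    )
    tree_len = tree_dist[frozenset((a, b))]
    if tree_len <= 0:
        raise ValueError("tree path length must be positive")
    return plot_distance(centers_2d[a], centers_2d[b]) / tree_len


def candidate_scales(s0, spread=2.0, steps=9):
    """Geometric grid from s0 / spread to s0 * spread (requires steps >= 2)."""
    return [s0 * spread ** (2 * i / (steps - 1) - 1) for i in range(steps)]


# Hypothetical toy data: three cluster centers and their tree path lengths.
centers = {"A": (0.0, 0.0), "B": (3.0, 4.0), "C": (1.0, 0.0)}
paths = {
    frozenset(("A", "B")): 10.0,
    frozenset(("A", "C")): 2.0,
    frozenset(("B", "C")): 9.0,
}
s0 = initial_scale(centers, paths)
grid = candidate_scales(s0, spread=2.0, steps=3)
```
]]></preformat>
<p>Each candidate in <monospace>grid</monospace> would then be evaluated by running the mapping and keeping the value that yields the smallest mapping error.</p>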
<p>The accuracy of an inferred phylogenetic tree depends on many factors such as the number of sequences, number of characters (number of aligned positions), and substitution rate. In general, the accuracy of the inferred phylogenetic tree increases as more characters are used [
<xref ref-type="bibr" rid="B47">47</xref>
,
<xref ref-type="bibr" rid="B48">48</xref>
]. However, there is also much debate on whether to increase the number of sequences or the number of characters to improve the resolution of the phylogenetic analysis. When the number of characters available to build the phylogenetic tree is fixed, as for the internal genes of influenza A virus, one might choose a small number of sequences to derive the most reliable tree. There are two interesting questions connected with this approach: how to choose the sequences, i.e. which sampling methods to apply, and how many sequences are needed given the number of characters. As for the sampling, we believe that clustering methods such as Neural-Gas should be used to avoid the bias arising from manual sampling, although some criteria should be developed to further test the influence of different clustering methods on the accuracy of the inferred tree. An objective way of finding the optimal number of sequences is still lacking, and further theoretical and empirical studies are needed.</p>
</sec>
<sec>
<title>Conclusions</title>
<p>PhyloMap is a robust algorithm for analyzing phylogenetic relationships in large sequence data sets. It can utilize the entire data set and avoids the bias introduced by manual samplings. PhyloMap introduces two data compression techniques (dimensionality reduction and vector quantization) into phylogenetic studies to reduce the data without losing important information. The visualizations generated summarize the main phylogeny information and overcome the shortcomings of phylogenetic tree construction and ordination analysis when used alone.</p>
<p>There have been only a few studies targeting the phylogenetic diversity of the internal genes of influenza A virus [
<xref ref-type="bibr" rid="B3">3</xref>
,
<xref ref-type="bibr" rid="B8">8</xref>
,
<xref ref-type="bibr" rid="B54">54</xref>
]. However, the phylogenetic trees built in some of these studies only sampled a small portion of the data and therefore might not reflect the actual size and composition of the lineages, and the representative sequences might be biased [
<xref ref-type="bibr" rid="B3">3</xref>
]. PhyloMap gives a more comprehensive overall picture of the evolution of influenza A viruses and may further help define a new nomenclature system for influenza A viruses.</p>
<p>Research on influenza A viruses has suggested that they undergo frequent reassortment [
<xref ref-type="bibr" rid="B55">55</xref>
,
<xref ref-type="bibr" rid="B56">56</xref>
]. However, as the overall phylogenetic relationships of the internal genes have been largely unknown so far, few studies have addressed the scale of reassortment and the patterns of segment compatibility in cases where the reassortment occurred between distant lineages [
<xref ref-type="bibr" rid="B57">57</xref>
]. Furthermore, a robust way of identifying reassorted strains is lacking. When a new strain emerges, it is a tedious job for researchers to compare different topologies of various phylogenetic trees to find the reassortment patterns. We are confident that PhyloMap can help develop new insights into the relationships between the internal genes, in order to find new means of studying reassortment.</p>
<p>PhyloMap is implemented in JAVA, and the source code is freely available for download at
<ext-link ext-link-type="uri" xlink:href="http://www.biochem.uni-luebeck.de/public/software/phylomap.html">http://www.biochem.uni-luebeck.de/public/software/phylomap.html</ext-link>
. To visualize the results, some Matlab routines are also available from the above link.</p>
</sec>
<sec>
<title>Competing interests</title>
<p>The authors declare that they have no competing interests.</p>
</sec>
<sec>
<title>Authors' contributions</title>
<p>JZ designed and implemented the PhyloMap algorithm, analyzed the data and drafted the manuscript; AMM participated in designing the PhyloMap algorithm; TM evaluated the algorithm and drafted the manuscript; SC participated in cleaning and analyzing the data; JW participated in drafting the manuscript; RH designed the research, analyzed the data and drafted the manuscript. All authors read and approved the final manuscript.</p>
</sec>
<sec sec-type="supplementary-material">
<title>Supplementary Material</title>
<supplementary-material content-type="local-data" id="S1">
<caption>
<title>Additional file 1</title>
<p>
<bold>NP PhyloMap highlights human H1N1 influenza A virus</bold>
. The figure of NP PhyloMap highlights human H1N1 influenza A virus</p>
</caption>
<media xlink:href="1471-2105-12-248-S1.TIFF" mimetype="image" mime-subtype="tiff">
<caption>
<p>Click here for file</p>
</caption>
</media>
</supplementary-material>
<supplementary-material content-type="local-data" id="S2">
<caption>
<title>Additional file 2</title>
<p>
<bold>PB2 PhyloMap highlights human H1N1 influenza A virus</bold>
. The figure of PB2 PhyloMap highlights human H1N1 influenza A virus</p>
</caption>
<media xlink:href="1471-2105-12-248-S2.TIFF" mimetype="image" mime-subtype="tiff">
<caption>
<p>Click here for file</p>
</caption>
</media>
</supplementary-material>
<supplementary-material content-type="local-data" id="S3">
<caption>
<title>Additional file 3</title>
<p>
<bold>PB1 PhyloMap highlights human H1N1 influenza A virus</bold>
. The figure of PB1 PhyloMap highlights human H1N1 influenza A virus</p>
</caption>
<media xlink:href="1471-2105-12-248-S3.TIFF" mimetype="image" mime-subtype="tiff">
<caption>
<p>Click here for file</p>
</caption>
</media>
</supplementary-material>
<supplementary-material content-type="local-data" id="S4">
<caption>
<title>Additional file 4</title>
<p>
<bold>PA PhyloMap highlights human H1N1 influenza A virus</bold>
. The figure of PA PhyloMap highlights human H1N1 influenza A virus</p>
</caption>
<media xlink:href="1471-2105-12-248-S4.TIFF" mimetype="image" mime-subtype="tiff">
<caption>
<p>Click here for file</p>
</caption>
</media>
</supplementary-material>
<supplementary-material content-type="local-data" id="S5">
<caption>
<title>Additional file 5</title>
<p>
<bold>M1 PhyloMap highlights human H1N1 influenza A virus</bold>
. The figure of M1 PhyloMap highlights human H1N1 influenza A virus</p>
</caption>
<media xlink:href="1471-2105-12-248-S5.TIFF" mimetype="image" mime-subtype="tiff">
<caption>
<p>Click here for file</p>
</caption>
</media>
</supplementary-material>
<supplementary-material content-type="local-data" id="S6">
<caption>
<title>Additional file 6</title>
<p>
<bold>M2 PhyloMap highlights human H1N1 influenza A virus</bold>
. The figure of M2 PhyloMap highlights human H1N1 influenza A virus</p>
</caption>
<media xlink:href="1471-2105-12-248-S6.TIFF" mimetype="image" mime-subtype="tiff">
<caption>
<p>Click here for file</p>
</caption>
</media>
</supplementary-material>
<supplementary-material content-type="local-data" id="S7">
<caption>
<title>Additional file 7</title>
<p>
<bold>NS1 PhyloMap highlights human H1N1 influenza A virus</bold>
. Figure of the NS1 PhyloMap, with human H1N1 influenza A virus highlighted.</p>
</caption>
<media xlink:href="1471-2105-12-248-S7.TIFF" mimetype="image" mime-subtype="tiff">
<caption>
<p>Click here for file</p>
</caption>
</media>
</supplementary-material>
<supplementary-material content-type="local-data" id="S8">
<caption>
<title>Additional file 8</title>
<p>
<bold>NS2 PhyloMap highlights human H1N1 influenza A virus</bold>
. Figure of the NS2 PhyloMap, with human H1N1 influenza A virus highlighted.</p>
</caption>
<media xlink:href="1471-2105-12-248-S8.TIFF" mimetype="image" mime-subtype="tiff">
<caption>
<p>Click here for file</p>
</caption>
</media>
</supplementary-material>
<supplementary-material content-type="local-data" id="S9">
<caption>
<title>Additional file 9</title>
<p>
<bold>Figure legend</bold>
. The figure legend for Additional files
<xref ref-type="supplementary-material" rid="S1">1</xref>
,
<xref ref-type="supplementary-material" rid="S2">2</xref>
,
<xref ref-type="supplementary-material" rid="S3">3</xref>
,
<xref ref-type="supplementary-material" rid="S4">4</xref>
,
<xref ref-type="supplementary-material" rid="S5">5</xref>
,
<xref ref-type="supplementary-material" rid="S6">6</xref>
,
<xref ref-type="supplementary-material" rid="S7">7</xref>
and
<xref ref-type="supplementary-material" rid="S8">8</xref>
.</p>
</caption>
<media xlink:href="1471-2105-12-248-S9.DOC" mimetype="application" mime-subtype="msword">
<caption>
<p>Click here for file</p>
</caption>
</media>
</supplementary-material>
</sec>
</body>
<back>
<sec>
<title>Acknowledgements</title>
<p>We acknowledge support by the Graduate School for Computing in Medicine and Life Sciences, University of Lübeck.</p>
<p>Funding: Germany's Excellence Initiative [DFG GSC 235/1]; International Consortium on Antivirals (
<ext-link ext-link-type="uri" xlink:href="http://www.icav-citav.ca">http://www.icav-citav.ca</ext-link>
). RH is supported by a Chinese Academy of Sciences Visiting Professorship for Senior International Scientists, grant no. 2010T1S6, and by the Fonds der Chemischen Industrie.</p>
</sec>
<ref-list>
<ref id="B1">
<mixed-citation publication-type="journal">
<name>
<surname>Procter</surname>
<given-names>JB</given-names>
</name>
<name>
<surname>Thompson</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Letunic</surname>
<given-names>I</given-names>
</name>
<name>
<surname>Creevey</surname>
<given-names>C</given-names>
</name>
<name>
<surname>Jossinet</surname>
<given-names>F</given-names>
</name>
<name>
<surname>Barton</surname>
<given-names>GJ</given-names>
</name>
<article-title>Visualization of multiple alignments, phylogenies and gene family evolution</article-title>
<source>Nat Methods</source>
<year>2010</year>
<volume>7</volume>
<fpage>S16</fpage>
<lpage>25</lpage>
<pub-id pub-id-type="doi">10.1038/nmeth.1434</pub-id>
<pub-id pub-id-type="pmid">20195253</pub-id>
</mixed-citation>
</ref>
<ref id="B2">
<mixed-citation publication-type="journal">
<name>
<surname>Pavlopoulos</surname>
<given-names>GA</given-names>
</name>
<name>
<surname>Soldatos</surname>
<given-names>TG</given-names>
</name>
<name>
<surname>Barbosa-Silva</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Schneider</surname>
<given-names>R</given-names>
</name>
<article-title>A reference guide for tree analysis and visualization</article-title>
<source>BioData Min</source>
<year>2010</year>
<volume>3</volume>
<fpage>1</fpage>
<pub-id pub-id-type="doi">10.1186/1756-0381-3-1</pub-id>
<pub-id pub-id-type="pmid">20175922</pub-id>
</mixed-citation>
</ref>
<ref id="B3">
<mixed-citation publication-type="journal">
<name>
<surname>Chen</surname>
<given-names>JM</given-names>
</name>
<name>
<surname>Sun</surname>
<given-names>YX</given-names>
</name>
<name>
<surname>Chen</surname>
<given-names>JW</given-names>
</name>
<name>
<surname>Liu</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Yu</surname>
<given-names>JM</given-names>
</name>
<name>
<surname>Shen</surname>
<given-names>CJ</given-names>
</name>
<name>
<surname>Sun</surname>
<given-names>XD</given-names>
</name>
<name>
<surname>Peng</surname>
<given-names>D</given-names>
</name>
<article-title>Panorama phylogenetic diversity and distribution of type A influenza viruses based on their six internal gene sequences</article-title>
<source>Virol J</source>
<year>2009</year>
<volume>6</volume>
<fpage>137</fpage>
<pub-id pub-id-type="doi">10.1186/1743-422X-6-137</pub-id>
<pub-id pub-id-type="pmid">19737421</pub-id>
</mixed-citation>
</ref>
<ref id="B4">
<mixed-citation publication-type="journal">
<name>
<surname>Garten</surname>
<given-names>RJ</given-names>
</name>
<name>
<surname>Davis</surname>
<given-names>CT</given-names>
</name>
<name>
<surname>Russell</surname>
<given-names>CA</given-names>
</name>
<name>
<surname>Shu</surname>
<given-names>B</given-names>
</name>
<name>
<surname>Lindstrom</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Balish</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Sessions</surname>
<given-names>WM</given-names>
</name>
<name>
<surname>Xu</surname>
<given-names>X</given-names>
</name>
<name>
<surname>Skepner</surname>
<given-names>E</given-names>
</name>
<name>
<surname>Deyde</surname>
<given-names>V</given-names>
</name>
<etal></etal>
<article-title>Antigenic and genetic characteristics of swine-origin 2009 A(H1N1) influenza viruses circulating in humans</article-title>
<source>Science</source>
<year>2009</year>
<volume>325</volume>
<fpage>197</fpage>
<lpage>201</lpage>
<pub-id pub-id-type="doi">10.1126/science.1176225</pub-id>
<pub-id pub-id-type="pmid">19465683</pub-id>
</mixed-citation>
</ref>
<ref id="B5">
<mixed-citation publication-type="journal">
<name>
<surname>Smith</surname>
<given-names>GJ</given-names>
</name>
<name>
<surname>Vijaykrishna</surname>
<given-names>D</given-names>
</name>
<name>
<surname>Bahl</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Lycett</surname>
<given-names>SJ</given-names>
</name>
<name>
<surname>Worobey</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Pybus</surname>
<given-names>OG</given-names>
</name>
<name>
<surname>Ma</surname>
<given-names>SK</given-names>
</name>
<name>
<surname>Cheung</surname>
<given-names>CL</given-names>
</name>
<name>
<surname>Raghwani</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Bhatt</surname>
<given-names>S</given-names>
</name>
<etal></etal>
<article-title>Origins and evolutionary genomics of the 2009 swine-origin H1N1 influenza A epidemic</article-title>
<source>Nature</source>
<year>2009</year>
<volume>459</volume>
<fpage>1122</fpage>
<lpage>1125</lpage>
<pub-id pub-id-type="doi">10.1038/nature08182</pub-id>
<pub-id pub-id-type="pmid">19516283</pub-id>
</mixed-citation>
</ref>
<ref id="B6">
<mixed-citation publication-type="journal">
<name>
<surname>Olsen</surname>
<given-names>B</given-names>
</name>
<name>
<surname>Munster</surname>
<given-names>VJ</given-names>
</name>
<name>
<surname>Wallensten</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Waldenstrom</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Osterhaus</surname>
<given-names>AD</given-names>
</name>
<name>
<surname>Fouchier</surname>
<given-names>RA</given-names>
</name>
<article-title>Global patterns of influenza A virus in wild birds</article-title>
<source>Science</source>
<year>2006</year>
<volume>312</volume>
<fpage>384</fpage>
<lpage>388</lpage>
<pub-id pub-id-type="doi">10.1126/science.1122438</pub-id>
<pub-id pub-id-type="pmid">16627734</pub-id>
</mixed-citation>
</ref>
<ref id="B7">
<mixed-citation publication-type="journal">
<name>
<surname>Nelson</surname>
<given-names>MI</given-names>
</name>
<name>
<surname>Viboud</surname>
<given-names>C</given-names>
</name>
<name>
<surname>Simonsen</surname>
<given-names>L</given-names>
</name>
<name>
<surname>Bennett</surname>
<given-names>RT</given-names>
</name>
<name>
<surname>Griesemer</surname>
<given-names>SB</given-names>
</name>
<name>
<surname>St George</surname>
<given-names>K</given-names>
</name>
<name>
<surname>Taylor</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Spiro</surname>
<given-names>DJ</given-names>
</name>
<name>
<surname>Sengamalay</surname>
<given-names>NA</given-names>
</name>
<name>
<surname>Ghedin</surname>
<given-names>E</given-names>
</name>
<etal></etal>
<article-title>Multiple reassortment events in the evolutionary history of H1N1 influenza A virus since 1918</article-title>
<source>PLoS Pathog</source>
<year>2008</year>
<volume>4</volume>
<fpage>e1000012</fpage>
<pub-id pub-id-type="doi">10.1371/journal.ppat.1000012</pub-id>
<pub-id pub-id-type="pmid">18463694</pub-id>
</mixed-citation>
</ref>
<ref id="B8">
<mixed-citation publication-type="journal">
<name>
<surname>Liu</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Ji</surname>
<given-names>K</given-names>
</name>
<name>
<surname>Chen</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Tai</surname>
<given-names>D</given-names>
</name>
<name>
<surname>Jiang</surname>
<given-names>W</given-names>
</name>
<name>
<surname>Hou</surname>
<given-names>G</given-names>
</name>
<name>
<surname>Li</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Huang</surname>
<given-names>B</given-names>
</name>
<article-title>Panorama phylogenetic diversity and distribution of Type A influenza virus</article-title>
<source>PLoS One</source>
<year>2009</year>
<volume>4</volume>
<fpage>e5022</fpage>
<pub-id pub-id-type="doi">10.1371/journal.pone.0005022</pub-id>
<pub-id pub-id-type="pmid">19325912</pub-id>
</mixed-citation>
</ref>
<ref id="B9">
<mixed-citation publication-type="journal">
<name>
<surname>Higgins</surname>
<given-names>DG</given-names>
</name>
<article-title>Sequence ordinations: a multivariate analysis approach to analysing large sequence data sets</article-title>
<source>Comput Appl Biosci</source>
<year>1992</year>
<volume>8</volume>
<fpage>15</fpage>
<lpage>22</lpage>
<pub-id pub-id-type="pmid">1568121</pub-id>
</mixed-citation>
</ref>
<ref id="B10">
<mixed-citation publication-type="journal">
<name>
<surname>Smith</surname>
<given-names>DJ</given-names>
</name>
<name>
<surname>Lapedes</surname>
<given-names>AS</given-names>
</name>
<name>
<surname>de Jong</surname>
<given-names>JC</given-names>
</name>
<name>
<surname>Bestebroer</surname>
<given-names>TM</given-names>
</name>
<name>
<surname>Rimmelzwaan</surname>
<given-names>GF</given-names>
</name>
<name>
<surname>Osterhaus</surname>
<given-names>AD</given-names>
</name>
<name>
<surname>Fouchier</surname>
<given-names>RA</given-names>
</name>
<article-title>Mapping the antigenic and genetic evolution of influenza virus</article-title>
<source>Science</source>
<year>2004</year>
<volume>305</volume>
<fpage>371</fpage>
<lpage>376</lpage>
<pub-id pub-id-type="doi">10.1126/science.1097211</pub-id>
<pub-id pub-id-type="pmid">15218094</pub-id>
</mixed-citation>
</ref>
<ref id="B11">
<mixed-citation publication-type="journal">
<name>
<surname>Wong</surname>
<given-names>EH</given-names>
</name>
<name>
<surname>Smith</surname>
<given-names>DK</given-names>
</name>
<name>
<surname>Rabadan</surname>
<given-names>R</given-names>
</name>
<name>
<surname>Peiris</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Poon</surname>
<given-names>LL</given-names>
</name>
<article-title>Codon usage bias and the evolution of influenza A viruses</article-title>
<source>BMC Evol Biol</source>
<year>2010</year>
<volume>10</volume>
<fpage>253</fpage>
<pub-id pub-id-type="doi">10.1186/1471-2148-10-253</pub-id>
<pub-id pub-id-type="pmid">20723216</pub-id>
</mixed-citation>
</ref>
<ref id="B12">
<mixed-citation publication-type="journal">
<name>
<surname>Martinetz</surname>
<given-names>TM</given-names>
</name>
<name>
<surname>Berkovich</surname>
<given-names>SG</given-names>
</name>
<name>
<surname>Schulten</surname>
<given-names>KJ</given-names>
</name>
<article-title>Neural-Gas network for vector quantization and its application to time-series prediction</article-title>
<source>IEEE Trans Neural Networks</source>
<year>1993</year>
<volume>4</volume>
<fpage>558</fpage>
<lpage>569</lpage>
<pub-id pub-id-type="doi">10.1109/72.238311</pub-id>
</mixed-citation>
</ref>
<ref id="B13">
<mixed-citation publication-type="journal">
<name>
<surname>Dawood</surname>
<given-names>FS</given-names>
</name>
<name>
<surname>Jain</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Finelli</surname>
<given-names>L</given-names>
</name>
<name>
<surname>Shaw</surname>
<given-names>MW</given-names>
</name>
<name>
<surname>Lindstrom</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Garten</surname>
<given-names>RJ</given-names>
</name>
<name>
<surname>Gubareva</surname>
<given-names>LV</given-names>
</name>
<name>
<surname>Xu</surname>
<given-names>X</given-names>
</name>
<name>
<surname>Bridges</surname>
<given-names>CB</given-names>
</name>
<name>
<surname>Uyeki</surname>
<given-names>TM</given-names>
</name>
<article-title>Emergence of a novel swine-origin influenza A (H1N1) virus in humans</article-title>
<source>N Engl J Med</source>
<year>2009</year>
<volume>360</volume>
<fpage>2605</fpage>
<lpage>2615</lpage>
<pub-id pub-id-type="pmid">19423869</pub-id>
</mixed-citation>
</ref>
<ref id="B14">
<mixed-citation publication-type="journal">
<name>
<surname>Basler</surname>
<given-names>CF</given-names>
</name>
<name>
<surname>Reid</surname>
<given-names>AH</given-names>
</name>
<name>
<surname>Dybing</surname>
<given-names>JK</given-names>
</name>
<name>
<surname>Janczewski</surname>
<given-names>TA</given-names>
</name>
<name>
<surname>Fanning</surname>
<given-names>TG</given-names>
</name>
<name>
<surname>Zheng</surname>
<given-names>H</given-names>
</name>
<name>
<surname>Salvatore</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Perdue</surname>
<given-names>ML</given-names>
</name>
<name>
<surname>Swayne</surname>
<given-names>DE</given-names>
</name>
<name>
<surname>Garcia-Sastre</surname>
<given-names>A</given-names>
</name>
<etal></etal>
<article-title>Sequence of the 1918 pandemic influenza virus nonstructural gene (NS) segment and characterization of recombinant viruses bearing the 1918 NS genes</article-title>
<source>Proc Natl Acad Sci USA</source>
<year>2001</year>
<volume>98</volume>
<fpage>2746</fpage>
<lpage>2751</lpage>
<pub-id pub-id-type="doi">10.1073/pnas.031575198</pub-id>
<pub-id pub-id-type="pmid">11226311</pub-id>
</mixed-citation>
</ref>
<ref id="B15">
<mixed-citation publication-type="journal">
<name>
<surname>Reid</surname>
<given-names>AH</given-names>
</name>
<name>
<surname>Fanning</surname>
<given-names>TG</given-names>
</name>
<name>
<surname>Janczewski</surname>
<given-names>TA</given-names>
</name>
<name>
<surname>Lourens</surname>
<given-names>RM</given-names>
</name>
<name>
<surname>Taubenberger</surname>
<given-names>JK</given-names>
</name>
<article-title>Novel origin of the 1918 pandemic influenza virus nucleoprotein gene</article-title>
<source>J Virol</source>
<year>2004</year>
<volume>78</volume>
<fpage>12462</fpage>
<lpage>12470</lpage>
<pub-id pub-id-type="doi">10.1128/JVI.78.22.12462-12470.2004</pub-id>
<pub-id pub-id-type="pmid">15507633</pub-id>
</mixed-citation>
</ref>
<ref id="B16">
<mixed-citation publication-type="journal">
<name>
<surname>Reid</surname>
<given-names>AH</given-names>
</name>
<name>
<surname>Fanning</surname>
<given-names>TG</given-names>
</name>
<name>
<surname>Janczewski</surname>
<given-names>TA</given-names>
</name>
<name>
<surname>McCall</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Taubenberger</surname>
<given-names>JK</given-names>
</name>
<article-title>Characterization of the 1918 "Spanish" influenza virus matrix gene segment</article-title>
<source>J Virol</source>
<year>2002</year>
<volume>76</volume>
<fpage>10717</fpage>
<lpage>10723</lpage>
<pub-id pub-id-type="doi">10.1128/JVI.76.21.10717-10723.2002</pub-id>
<pub-id pub-id-type="pmid">12368314</pub-id>
</mixed-citation>
</ref>
<ref id="B17">
<mixed-citation publication-type="journal">
<name>
<surname>Reid</surname>
<given-names>AH</given-names>
</name>
<name>
<surname>Taubenberger</surname>
<given-names>JK</given-names>
</name>
<name>
<surname>Fanning</surname>
<given-names>TG</given-names>
</name>
<article-title>Evidence of an absence: the genetic origins of the 1918 pandemic influenza virus</article-title>
<source>Nat Rev Microbiol</source>
<year>2004</year>
<volume>2</volume>
<fpage>909</fpage>
<lpage>914</lpage>
<pub-id pub-id-type="doi">10.1038/nrmicro1027</pub-id>
<pub-id pub-id-type="pmid">15494747</pub-id>
</mixed-citation>
</ref>
<ref id="B18">
<mixed-citation publication-type="journal">
<name>
<surname>Taubenberger</surname>
<given-names>JK</given-names>
</name>
<name>
<surname>Reid</surname>
<given-names>AH</given-names>
</name>
<name>
<surname>Krafft</surname>
<given-names>AE</given-names>
</name>
<name>
<surname>Bijwaard</surname>
<given-names>KE</given-names>
</name>
<name>
<surname>Fanning</surname>
<given-names>TG</given-names>
</name>
<article-title>Initial genetic characterization of the 1918 "Spanish" influenza virus</article-title>
<source>Science</source>
<year>1997</year>
<volume>275</volume>
<fpage>1793</fpage>
<lpage>1796</lpage>
<pub-id pub-id-type="doi">10.1126/science.275.5307.1793</pub-id>
<pub-id pub-id-type="pmid">9065404</pub-id>
</mixed-citation>
</ref>
<ref id="B19">
<mixed-citation publication-type="journal">
<name>
<surname>Taubenberger</surname>
<given-names>JK</given-names>
</name>
<name>
<surname>Reid</surname>
<given-names>AH</given-names>
</name>
<name>
<surname>Lourens</surname>
<given-names>RM</given-names>
</name>
<name>
<surname>Wang</surname>
<given-names>R</given-names>
</name>
<name>
<surname>Jin</surname>
<given-names>G</given-names>
</name>
<name>
<surname>Fanning</surname>
<given-names>TG</given-names>
</name>
<article-title>Characterization of the 1918 influenza virus polymerase genes</article-title>
<source>Nature</source>
<year>2005</year>
<volume>437</volume>
<fpage>889</fpage>
<lpage>893</lpage>
<pub-id pub-id-type="doi">10.1038/nature04230</pub-id>
<pub-id pub-id-type="pmid">16208372</pub-id>
</mixed-citation>
</ref>
<ref id="B20">
<mixed-citation publication-type="journal">
<name>
<surname>Tumpey</surname>
<given-names>TM</given-names>
</name>
<name>
<surname>Basler</surname>
<given-names>CF</given-names>
</name>
<name>
<surname>Aguilar</surname>
<given-names>PV</given-names>
</name>
<name>
<surname>Zeng</surname>
<given-names>H</given-names>
</name>
<name>
<surname>Solorzano</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Swayne</surname>
<given-names>DE</given-names>
</name>
<name>
<surname>Cox</surname>
<given-names>NJ</given-names>
</name>
<name>
<surname>Katz</surname>
<given-names>JM</given-names>
</name>
<name>
<surname>Taubenberger</surname>
<given-names>JK</given-names>
</name>
<name>
<surname>Palese</surname>
<given-names>P</given-names>
</name>
<name>
<surname>Garcia-Sastre</surname>
<given-names>A</given-names>
</name>
<article-title>Characterization of the reconstructed 1918 Spanish influenza pandemic virus</article-title>
<source>Science</source>
<year>2005</year>
<volume>310</volume>
<fpage>77</fpage>
<lpage>80</lpage>
<pub-id pub-id-type="doi">10.1126/science.1119392</pub-id>
<pub-id pub-id-type="pmid">16210530</pub-id>
</mixed-citation>
</ref>
<ref id="B21">
<mixed-citation publication-type="journal">
<name>
<surname>Lu</surname>
<given-names>G</given-names>
</name>
<name>
<surname>Rowley</surname>
<given-names>T</given-names>
</name>
<name>
<surname>Garten</surname>
<given-names>R</given-names>
</name>
<name>
<surname>Donis</surname>
<given-names>RO</given-names>
</name>
<article-title>FluGenome: a web tool for genotyping influenza A virus</article-title>
<source>Nucleic Acids Res</source>
<year>2007</year>
<volume>35</volume>
<fpage>W275</fpage>
<lpage>279</lpage>
<pub-id pub-id-type="doi">10.1093/nar/gkm365</pub-id>
<pub-id pub-id-type="pmid">17537820</pub-id>
</mixed-citation>
</ref>
<ref id="B22">
<mixed-citation publication-type="journal">
<name>
<surname>Sammon</surname>
<given-names>JW</given-names>
</name>
<article-title>A nonlinear mapping for data structure analysis</article-title>
<source>IEEE Trans on Computers</source>
<year>1969</year>
<volume>C-18</volume>
<fpage>401</fpage>
<lpage>409</lpage>
</mixed-citation>
</ref>
<ref id="B23">
<mixed-citation publication-type="journal">
<name>
<surname>Felsenstein</surname>
<given-names>J</given-names>
</name>
<article-title>PHYLIP - Phylogeny Inference Package (Version 3.2)</article-title>
<source>Cladistics</source>
<year>1989</year>
<volume>5</volume>
<fpage>164</fpage>
<lpage>166</lpage>
</mixed-citation>
</ref>
<ref id="B24">
<mixed-citation publication-type="journal">
<name>
<surname>Smith</surname>
<given-names>RF</given-names>
</name>
<name>
<surname>Smith</surname>
<given-names>TF</given-names>
</name>
<article-title>Automatic generation of primary sequence patterns from sets of related protein sequences</article-title>
<source>Proc Natl Acad Sci USA</source>
<year>1990</year>
<volume>87</volume>
<fpage>118</fpage>
<lpage>122</lpage>
<pub-id pub-id-type="doi">10.1073/pnas.87.1.118</pub-id>
<pub-id pub-id-type="pmid">2296575</pub-id>
</mixed-citation>
</ref>
<ref id="B25">
<mixed-citation publication-type="journal">
<name>
<surname>Lio</surname>
<given-names>P</given-names>
</name>
<name>
<surname>Goldman</surname>
<given-names>N</given-names>
</name>
<article-title>Models of molecular evolution and phylogeny</article-title>
<source>Genome Res</source>
<year>1998</year>
<volume>8</volume>
<fpage>1233</fpage>
<lpage>1244</lpage>
<pub-id pub-id-type="pmid">9872979</pub-id>
</mixed-citation>
</ref>
<ref id="B26">
<mixed-citation publication-type="journal">
<name>
<surname>Jones</surname>
<given-names>DT</given-names>
</name>
<name>
<surname>Taylor</surname>
<given-names>WR</given-names>
</name>
<name>
<surname>Thornton</surname>
<given-names>JM</given-names>
</name>
<article-title>The rapid generation of mutation data matrices from protein sequences</article-title>
<source>Comput Appl Biosci</source>
<year>1992</year>
<volume>8</volume>
<fpage>275</fpage>
<lpage>282</lpage>
<pub-id pub-id-type="pmid">1633570</pub-id>
</mixed-citation>
</ref>
<ref id="B27">
<mixed-citation publication-type="journal">
<name>
<surname>Gower</surname>
<given-names>JC</given-names>
</name>
<article-title>Some distance properties of latent root and vector methods used in multivariate analysis</article-title>
<source>Biometrika</source>
<year>1966</year>
<volume>53</volume>
<fpage>325</fpage>
<lpage>338</lpage>
</mixed-citation>
</ref>
<ref id="B28">
<mixed-citation publication-type="journal">
<name>
<surname>Chaudhuri</surname>
<given-names>BB</given-names>
</name>
<name>
<surname>Dutta</surname>
<given-names>S</given-names>
</name>
<article-title>Interactive curve drawing by segmented Bezier approximation with a control parameter</article-title>
<source>Pattern Recogn Lett</source>
<year>1986</year>
<volume>4</volume>
<fpage>171</fpage>
<lpage>176</lpage>
<pub-id pub-id-type="doi">10.1016/0167-8655(86)90016-4</pub-id>
</mixed-citation>
</ref>
<ref id="B29">
<mixed-citation publication-type="journal">
<name>
<surname>Bao</surname>
<given-names>Y</given-names>
</name>
<name>
<surname>Bolotov</surname>
<given-names>P</given-names>
</name>
<name>
<surname>Dernovoy</surname>
<given-names>D</given-names>
</name>
<name>
<surname>Kiryutin</surname>
<given-names>B</given-names>
</name>
<name>
<surname>Zaslavsky</surname>
<given-names>L</given-names>
</name>
<name>
<surname>Tatusova</surname>
<given-names>T</given-names>
</name>
<name>
<surname>Ostell</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Lipman</surname>
<given-names>D</given-names>
</name>
<article-title>The influenza virus resource at the National Center for Biotechnology Information</article-title>
<source>J Virol</source>
<year>2008</year>
<volume>82</volume>
<fpage>596</fpage>
<lpage>601</lpage>
<pub-id pub-id-type="doi">10.1128/JVI.02005-07</pub-id>
<pub-id pub-id-type="pmid">17942553</pub-id>
</mixed-citation>
</ref>
<ref id="B30">
<mixed-citation publication-type="journal">
<name>
<surname>Chang</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Zhang</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Liao</surname>
<given-names>X</given-names>
</name>
<name>
<surname>Zhu</surname>
<given-names>X</given-names>
</name>
<name>
<surname>Wang</surname>
<given-names>D</given-names>
</name>
<name>
<surname>Zhu</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Feng</surname>
<given-names>T</given-names>
</name>
<name>
<surname>Zhu</surname>
<given-names>B</given-names>
</name>
<name>
<surname>Gao</surname>
<given-names>GF</given-names>
</name>
<name>
<surname>Wang</surname>
<given-names>J</given-names>
</name>
<etal></etal>
<article-title>Influenza Virus Database (IVDB): an integrated information resource and analysis platform for influenza virus research</article-title>
<source>Nucleic Acids Res</source>
<year>2007</year>
<volume>35</volume>
<fpage>D376</fpage>
<lpage>380</lpage>
<pub-id pub-id-type="doi">10.1093/nar/gkl779</pub-id>
<pub-id pub-id-type="pmid">17065465</pub-id>
</mixed-citation>
</ref>
<ref id="B31">
<mixed-citation publication-type="journal">
<name>
<surname>Edgar</surname>
<given-names>RC</given-names>
</name>
<article-title>MUSCLE: multiple sequence alignment with high accuracy and high throughput</article-title>
<source>Nucleic Acids Res</source>
<year>2004</year>
<volume>32</volume>
<fpage>1792</fpage>
<lpage>1797</lpage>
<pub-id pub-id-type="doi">10.1093/nar/gkh340</pub-id>
<pub-id pub-id-type="pmid">15034147</pub-id>
</mixed-citation>
</ref>
<ref id="B32">
<mixed-citation publication-type="journal">
<name>
<surname>Sheerar</surname>
<given-names>MG</given-names>
</name>
<name>
<surname>Easterday</surname>
<given-names>BC</given-names>
</name>
<name>
<surname>Hinshaw</surname>
<given-names>VS</given-names>
</name>
<article-title>Antigenic conservation of H1N1 swine influenza viruses</article-title>
<source>J Gen Virol</source>
<year>1989</year>
<volume>70</volume>
<issue>Pt 12</issue>
<fpage>3297</fpage>
<lpage>3303</lpage>
<pub-id pub-id-type="pmid">2558159</pub-id>
</mixed-citation>
</ref>
<ref id="B33">
<mixed-citation publication-type="journal">
<name>
<surname>Bachmaier</surname>
<given-names>C</given-names>
</name>
<name>
<surname>Brandes</surname>
<given-names>U</given-names>
</name>
<name>
<surname>Schlieper</surname>
<given-names>B</given-names>
</name>
<article-title>Drawing phylogenetic trees</article-title>
<source>Algorithms and Computation</source>
<year>2005</year>
<volume>3827</volume>
<fpage>1110</fpage>
<lpage>1121</lpage>
<pub-id pub-id-type="doi">10.1007/11602613_110</pub-id>
</mixed-citation>
</ref>
<ref id="B34">
<mixed-citation publication-type="journal">
<name>
<surname>Webby</surname>
<given-names>RJ</given-names>
</name>
<name>
<surname>Swenson</surname>
<given-names>SL</given-names>
</name>
<name>
<surname>Krauss</surname>
<given-names>SL</given-names>
</name>
<name>
<surname>Gerrish</surname>
<given-names>PJ</given-names>
</name>
<name>
<surname>Goyal</surname>
<given-names>SM</given-names>
</name>
<name>
<surname>Webster</surname>
<given-names>RG</given-names>
</name>
<article-title>Evolution of swine H3N2 influenza viruses in the United States</article-title>
<source>J Virol</source>
<year>2000</year>
<volume>74</volume>
<fpage>8243</fpage>
<lpage>8251</lpage>
<pub-id pub-id-type="doi">10.1128/JVI.74.18.8243-8251.2000</pub-id>
<pub-id pub-id-type="pmid">10954521</pub-id>
</mixed-citation>
</ref>
<ref id="B35">
<mixed-citation publication-type="journal">
<name>
<surname>Zhou</surname>
<given-names>NN</given-names>
</name>
<name>
<surname>Senne</surname>
<given-names>DA</given-names>
</name>
<name>
<surname>Landgraf</surname>
<given-names>JS</given-names>
</name>
<name>
<surname>Swenson</surname>
<given-names>SL</given-names>
</name>
<name>
<surname>Erickson</surname>
<given-names>G</given-names>
</name>
<name>
<surname>Rossow</surname>
<given-names>K</given-names>
</name>
<name>
<surname>Liu</surname>
<given-names>L</given-names>
</name>
<name>
<surname>Yoon</surname>
<given-names>K</given-names>
</name>
<name>
<surname>Krauss</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Webster</surname>
<given-names>RG</given-names>
</name>
<article-title>Genetic reassortment of avian, swine, and human influenza A viruses in American pigs</article-title>
<source>J Virol</source>
<year>1999</year>
<volume>73</volume>
<fpage>8851</fpage>
<lpage>8856</lpage>
<pub-id pub-id-type="pmid">10482643</pub-id>
</mixed-citation>
</ref>
<ref id="B36">
<mixed-citation publication-type="journal">
<name>
<surname>Kawaoka</surname>
<given-names>Y</given-names>
</name>
<name>
<surname>Gorman</surname>
<given-names>OT</given-names>
</name>
<name>
<surname>Ito</surname>
<given-names>T</given-names>
</name>
<name>
<surname>Wells</surname>
<given-names>K</given-names>
</name>
<name>
<surname>Donis</surname>
<given-names>RO</given-names>
</name>
<name>
<surname>Castrucci</surname>
<given-names>MR</given-names>
</name>
<name>
<surname>Donatelli</surname>
<given-names>I</given-names>
</name>
<name>
<surname>Webster</surname>
<given-names>RG</given-names>
</name>
<article-title>Influence of host species on the evolution of the nonstructural (NS) gene of influenza A viruses</article-title>
<source>Virus Res</source>
<year>1998</year>
<volume>55</volume>
<fpage>143</fpage>
<lpage>156</lpage>
<pub-id pub-id-type="doi">10.1016/S0168-1702(98)00038-0</pub-id>
<pub-id pub-id-type="pmid">9725667</pub-id>
</mixed-citation>
</ref>
<ref id="B37">
<mixed-citation publication-type="journal">
<name>
<surname>Kawaoka</surname>
<given-names>Y</given-names>
</name>
<name>
<surname>Krauss</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Webster</surname>
<given-names>RG</given-names>
</name>
<article-title>Avian-to-human transmission of the PB1 gene of influenza A viruses in the 1957 and 1968 pandemics</article-title>
<source>J Virol</source>
<year>1989</year>
<volume>63</volume>
<fpage>4603</fpage>
<lpage>4608</lpage>
<pub-id pub-id-type="pmid">2795713</pub-id>
</mixed-citation>
</ref>
<ref id="B38">
<mixed-citation publication-type="journal">
<name>
<surname>Shortridge</surname>
<given-names>KF</given-names>
</name>
<name>
<surname>Webster</surname>
<given-names>RG</given-names>
</name>
<name>
<surname>Butterfield</surname>
<given-names>WK</given-names>
</name>
<name>
<surname>Campbell</surname>
<given-names>CH</given-names>
</name>
<article-title>Persistence of Hong Kong influenza virus variants in pigs</article-title>
<source>Science</source>
<year>1977</year>
<volume>196</volume>
<fpage>1454</fpage>
<lpage>1455</lpage>
<pub-id pub-id-type="doi">10.1126/science.867041</pub-id>
<pub-id pub-id-type="pmid">867041</pub-id>
</mixed-citation>
</ref>
<ref id="B39">
<mixed-citation publication-type="journal">
<name>
<surname>Webster</surname>
<given-names>RG</given-names>
</name>
<name>
<surname>Bean</surname>
<given-names>WJ</given-names>
</name>
<name>
<surname>Gorman</surname>
<given-names>OT</given-names>
</name>
<name>
<surname>Chambers</surname>
<given-names>TM</given-names>
</name>
<name>
<surname>Kawaoka</surname>
<given-names>Y</given-names>
</name>
<article-title>Evolution and ecology of influenza A viruses</article-title>
<source>Microbiol Rev</source>
<year>1992</year>
<volume>56</volume>
<fpage>152</fpage>
<lpage>179</lpage>
<pub-id pub-id-type="pmid">1579108</pub-id>
</mixed-citation>
</ref>
<ref id="B40">
<mixed-citation publication-type="journal">
<name>
<surname>Munzner</surname>
<given-names>T</given-names>
</name>
<name>
<surname>Guimbretiere</surname>
<given-names>F</given-names>
</name>
<name>
<surname>Tasiran</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Zhang</surname>
<given-names>L</given-names>
</name>
<name>
<surname>Zhou</surname>
<given-names>YH</given-names>
</name>
<article-title>TreeJuxtaposer: Scalable tree comparison using Focus+Context with guaranteed visibility</article-title>
<source>ACM T Graphic</source>
<year>2003</year>
<volume>22</volume>
<fpage>453</fpage>
<lpage>462</lpage>
<pub-id pub-id-type="doi">10.1145/882262.882291</pub-id>
</mixed-citation>
</ref>
<ref id="B41">
<mixed-citation publication-type="journal">
<name>
<surname>Keim</surname>
<given-names>D</given-names>
</name>
<article-title>Visual exploration of large data sets</article-title>
<source>Commun ACM</source>
<year>2001</year>
<volume>44</volume>
<fpage>38</fpage>
<lpage>44</lpage>
</mixed-citation>
</ref>
<ref id="B42">
<mixed-citation publication-type="journal">
<name>
<surname>Santamaria</surname>
<given-names>R</given-names>
</name>
<name>
<surname>Theron</surname>
<given-names>R</given-names>
</name>
<article-title>Treevolution: visual analysis of phylogenetic trees</article-title>
<source>Bioinformatics</source>
<year>2009</year>
<volume>25</volume>
<fpage>1970</fpage>
<lpage>1971</lpage>
<pub-id pub-id-type="doi">10.1093/bioinformatics/btp333</pub-id>
<pub-id pub-id-type="pmid">19470585</pub-id>
</mixed-citation>
</ref>
<ref id="B43">
<mixed-citation publication-type="journal">
<name>
<surname>Zaslavsky</surname>
<given-names>L</given-names>
</name>
<name>
<surname>Bao</surname>
<given-names>Y</given-names>
</name>
<name>
<surname>Tatusova</surname>
<given-names>TA</given-names>
</name>
<article-title>Visualization of large influenza virus sequence datasets using adaptively aggregated trees with sampling-based subscale representation</article-title>
<source>BMC Bioinformatics</source>
<year>2008</year>
<volume>9</volume>
<fpage>237</fpage>
<pub-id pub-id-type="doi">10.1186/1471-2105-9-237</pub-id>
<pub-id pub-id-type="pmid">18485197</pub-id>
</mixed-citation>
</ref>
<ref id="B44">
<mixed-citation publication-type="journal">
<name>
<surname>Guiller</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Bellido</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Madec</surname>
<given-names>L</given-names>
</name>
<article-title>Genetic distances and ordination: the land snail <italic>Helix aspersa</italic> in north Africa as a test case</article-title>
<source>Syst Biol</source>
<year>1998</year>
<volume>47</volume>
<fpage>208</fpage>
<lpage>227</lpage>
<pub-id pub-id-type="doi">10.1080/106351598260888</pub-id>
<pub-id pub-id-type="pmid">12064227</pub-id>
</mixed-citation>
</ref>
<ref id="B45">
<mixed-citation publication-type="journal">
<name>
<surname>Tenenbaum</surname>
<given-names>JB</given-names>
</name>
<name>
<surname>de Silva</surname>
<given-names>V</given-names>
</name>
<name>
<surname>Langford</surname>
<given-names>JC</given-names>
</name>
<article-title>A global geometric framework for nonlinear dimensionality reduction</article-title>
<source>Science</source>
<year>2000</year>
<volume>290</volume>
<fpage>2319</fpage>
<lpage>2323</lpage>
<pub-id pub-id-type="doi">10.1126/science.290.5500.2319</pub-id>
<pub-id pub-id-type="pmid">11125149</pub-id>
</mixed-citation>
</ref>
<ref id="B46">
<mixed-citation publication-type="journal">
<name>
<surname>Roweis</surname>
<given-names>ST</given-names>
</name>
<name>
<surname>Saul</surname>
<given-names>LK</given-names>
</name>
<article-title>Nonlinear dimensionality reduction by locally linear embedding</article-title>
<source>Science</source>
<year>2000</year>
<volume>290</volume>
<fpage>2323</fpage>
<lpage>2326</lpage>
<pub-id pub-id-type="doi">10.1126/science.290.5500.2323</pub-id>
<pub-id pub-id-type="pmid">11125150</pub-id>
</mixed-citation>
</ref>
<ref id="B47">
<mixed-citation publication-type="journal">
<name>
<surname>Erdös</surname>
<given-names>PL</given-names>
</name>
<name>
<surname>Steel</surname>
<given-names>MA</given-names>
</name>
<name>
<surname>Székely</surname>
<given-names>LA</given-names>
</name>
<name>
<surname>Warnow</surname>
<given-names>TJ</given-names>
</name>
<article-title>A few logs suffice to build (almost) all trees (I)</article-title>
<source>Random Struct Alg</source>
<year>1999</year>
<volume>14</volume>
<fpage>153</fpage>
<lpage>184</lpage>
<pub-id pub-id-type="doi">10.1002/(SICI)1098-2418(199903)14:2&lt;153::AID-RSA3&gt;3.0.CO;2-R</pub-id>
</mixed-citation>
</ref>
<ref id="B48">
<mixed-citation publication-type="other">
<name>
<surname>Bininda-Emonds</surname>
<given-names>OR</given-names>
</name>
<name>
<surname>Brady</surname>
<given-names>SG</given-names>
</name>
<name>
<surname>Kim</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Sanderson</surname>
<given-names>MJ</given-names>
</name>
<article-title>Scaling of accuracy in extremely large phylogenetic trees</article-title>
<source>Pac Symp Biocomput</source>
<year>2001</year>
<fpage>547</fpage>
<lpage>558</lpage>
</mixed-citation>
</ref>
<ref id="B49">
<mixed-citation publication-type="journal">
<name>
<surname>Lecointre</surname>
<given-names>G</given-names>
</name>
<name>
<surname>Philippe</surname>
<given-names>H</given-names>
</name>
<name>
<surname>Vân Lê</surname>
<given-names>HL</given-names>
</name>
<name>
<surname>Le Guyader</surname>
<given-names>H</given-names>
</name>
<article-title>Species sampling has a major impact on phylogenetic inference</article-title>
<source>Mol Phylogenet Evol</source>
<year>1993</year>
<volume>2</volume>
<issue>3</issue>
<fpage>205</fpage>
<lpage>224</lpage>
<pub-id pub-id-type="doi">10.1006/mpev.1993.1021</pub-id>
<pub-id pub-id-type="pmid">8136922</pub-id>
</mixed-citation>
</ref>
<ref id="B50">
<mixed-citation publication-type="journal">
<name>
<surname>Rannala</surname>
<given-names>B</given-names>
</name>
<name>
<surname>Huelsenbeck</surname>
<given-names>JP</given-names>
</name>
<name>
<surname>Yang</surname>
<given-names>Z</given-names>
</name>
<name>
<surname>Nielsen</surname>
<given-names>R</given-names>
</name>
<article-title>Taxon sampling and the accuracy of large phylogenies</article-title>
<source>Syst Biol</source>
<year>1998</year>
<volume>47</volume>
<issue>4</issue>
<fpage>702</fpage>
<lpage>710</lpage>
<pub-id pub-id-type="doi">10.1080/106351598260680</pub-id>
<pub-id pub-id-type="pmid">12066312</pub-id>
</mixed-citation>
</ref>
<ref id="B51">
<mixed-citation publication-type="journal">
<name>
<surname>Graybeal</surname>
<given-names>A</given-names>
</name>
<article-title>Is it better to add taxa or characters to a difficult phylogenetic problem?</article-title>
<source>Syst Biol</source>
<year>1998</year>
<volume>47</volume>
<issue>1</issue>
<fpage>9</fpage>
<lpage>17</lpage>
<pub-id pub-id-type="doi">10.1080/106351598260996</pub-id>
<pub-id pub-id-type="pmid">12064243</pub-id>
</mixed-citation>
</ref>
<ref id="B52">
<mixed-citation publication-type="journal">
<name>
<surname>Wortley</surname>
<given-names>AH</given-names>
</name>
<name>
<surname>Rudall</surname>
<given-names>PJ</given-names>
</name>
<name>
<surname>Harris</surname>
<given-names>DJ</given-names>
</name>
<name>
<surname>Scotland</surname>
<given-names>RW</given-names>
</name>
<article-title>How much data are needed to resolve a difficult phylogeny?: case study in Lamiales</article-title>
<source>Syst Biol</source>
<year>2005</year>
<volume>54</volume>
<issue>5</issue>
<fpage>697</fpage>
<lpage>709</lpage>
<pub-id pub-id-type="doi">10.1080/10635150500221028</pub-id>
<pub-id pub-id-type="pmid">16195214</pub-id>
</mixed-citation>
</ref>
<ref id="B53">
<mixed-citation publication-type="journal">
<name>
<surname>Hedtke</surname>
<given-names>SM</given-names>
</name>
<name>
<surname>Townsend</surname>
<given-names>TM</given-names>
</name>
<name>
<surname>Hillis</surname>
<given-names>DM</given-names>
</name>
<article-title>Resolution of phylogenetic conflict in large data sets by increased taxon sampling</article-title>
<source>Syst Biol</source>
<year>2006</year>
<volume>55</volume>
<issue>3</issue>
<fpage>522</fpage>
<lpage>529</lpage>
<pub-id pub-id-type="doi">10.1080/10635150600697358</pub-id>
<pub-id pub-id-type="pmid">16861214</pub-id>
</mixed-citation>
</ref>
<ref id="B54">
<mixed-citation publication-type="journal">
<name>
<surname>Furuse</surname>
<given-names>Y</given-names>
</name>
<name>
<surname>Suzuki</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Kamigaki</surname>
<given-names>T</given-names>
</name>
<name>
<surname>Oshitani</surname>
<given-names>H</given-names>
</name>
<article-title>Evolution of the M gene of the influenza A virus in different host species: large-scale sequence analysis</article-title>
<source>Virol J</source>
<year>2009</year>
<volume>6</volume>
<fpage>67</fpage>
<pub-id pub-id-type="doi">10.1186/1743-422X-6-67</pub-id>
<pub-id pub-id-type="pmid">19476650</pub-id>
</mixed-citation>
</ref>
<ref id="B55">
<mixed-citation publication-type="journal">
<name>
<surname>Macken</surname>
<given-names>CA</given-names>
</name>
<name>
<surname>Webby</surname>
<given-names>RJ</given-names>
</name>
<name>
<surname>Bruno</surname>
<given-names>WJ</given-names>
</name>
<article-title>Genotype turnover by reassortment of replication complex genes from avian influenza A virus</article-title>
<source>J Gen Virol</source>
<year>2006</year>
<volume>87</volume>
<fpage>2803</fpage>
<lpage>2815</lpage>
<pub-id pub-id-type="doi">10.1099/vir.0.81454-0</pub-id>
<pub-id pub-id-type="pmid">16963738</pub-id>
</mixed-citation>
</ref>
<ref id="B56">
<mixed-citation publication-type="journal">
<name>
<surname>Schweiger</surname>
<given-names>B</given-names>
</name>
<name>
<surname>Bruns</surname>
<given-names>L</given-names>
</name>
<name>
<surname>Meixenberger</surname>
<given-names>K</given-names>
</name>
<article-title>Reassortment between human A(H3N2) viruses is an important evolutionary mechanism</article-title>
<source>Vaccine</source>
<year>2006</year>
<volume>24</volume>
<fpage>6683</fpage>
<lpage>6690</lpage>
<pub-id pub-id-type="doi">10.1016/j.vaccine.2006.05.105</pub-id>
<pub-id pub-id-type="pmid">17030498</pub-id>
</mixed-citation>
</ref>
<ref id="B57">
<mixed-citation publication-type="journal">
<name>
<surname>Chen</surname>
<given-names>LM</given-names>
</name>
<name>
<surname>Davis</surname>
<given-names>CT</given-names>
</name>
<name>
<surname>Zhou</surname>
<given-names>H</given-names>
</name>
<name>
<surname>Cox</surname>
<given-names>NJ</given-names>
</name>
<name>
<surname>Donis</surname>
<given-names>RO</given-names>
</name>
<article-title>Genetic compatibility and virulence of reassortants derived from contemporary avian H5N1 and human H3N2 influenza A viruses</article-title>
<source>PLoS Pathog</source>
<year>2008</year>
<volume>4</volume>
<fpage>e1000072</fpage>
<pub-id pub-id-type="doi">10.1371/journal.ppat.1000072</pub-id>
<pub-id pub-id-type="pmid">18497857</pub-id>
</mixed-citation>
</ref>
<ref id="B58">
<mixed-citation publication-type="journal">
<name>
<surname>Stover</surname>
<given-names>BC</given-names>
</name>
<name>
<surname>Muller</surname>
<given-names>KF</given-names>
</name>
<article-title>TreeGraph 2: combining and visualizing evidence from different phylogenetic analyses</article-title>
<source>BMC Bioinformatics</source>
<year>2010</year>
<volume>11</volume>
<fpage>7</fpage>
<pub-id pub-id-type="doi">10.1186/1471-2105-11-7</pub-id>
<pub-id pub-id-type="pmid">20051126</pub-id>
</mixed-citation>
</ref>
</ref-list>
</back>
</pmc>
</record>

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Sante/explor/H2N2V1/Data/Pmc/Corpus
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000903 | SxmlIndent | more

Or

HfdSelect -h $EXPLOR_AREA/Data/Pmc/Corpus/biblio.hfd -nk 000903 | SxmlIndent | more

To link to this page within the Wicri network

{{Explor lien
   |wiki=    Sante
   |area=    H2N2V1
   |flux=    Pmc
   |étape=   Corpus
   |type=    RBID
   |clé=     PMC:3142226
   |texte=   PhyloMap: an algorithm for visualizing relationships of large sequence data sets and its application to the influenza A virus genome
}}

To generate wiki pages

HfdIndexSelect -h $EXPLOR_AREA/Data/Pmc/Corpus/RBID.i   -Sk "pubmed:21689434" \
       | HfdSelect -Kh $EXPLOR_AREA/Data/Pmc/Corpus/biblio.hfd   \
       | NlmPubMed2Wicri -a H2N2V1 

This area was generated with Dilib version V0.6.33.
Data generation: Tue Apr 14 19:59:40 2020. Site generation: Thu Mar 25 15:38:26 2021