For each virus, we know its date of collection and its genetic makeup, allowing us to count the number of viruses belonging to a particular genetic lineage through time. We use these time series of presence/absence to estimate clade frequencies through time. This estimation smooths month-to-month sampling variation.
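To make the idea of smoothing presence/absence counts concrete, here is a minimal sketch in Python. It is not the estimation procedure used for this report; it swaps in a simple Gaussian kernel average, and the function name `smoothed_frequency`, the `bandwidth_days` parameter and the input layout (pairs of collection date and a clade-membership flag) are all assumptions made for illustration.

```python
import math
from datetime import date


def smoothed_frequency(samples, pivot, bandwidth_days=30.0):
    """Kernel-weighted fraction of samples near `pivot` that belong to a clade.

    `samples` is an iterable of (collection_date, in_clade) pairs, where
    `in_clade` is True if the virus carries the clade-defining substitutions.
    Returns None when there are no samples to average over.
    """
    weighted_in = 0.0   # summed weights of viruses inside the clade
    weighted_all = 0.0  # summed weights of all viruses
    for collected, in_clade in samples:
        offset = (collected - pivot).days
        # Samples collected far from the pivot date contribute little.
        weight = math.exp(-0.5 * (offset / bandwidth_days) ** 2)
        weighted_all += weight
        if in_clade:
            weighted_in += weight
    if weighted_all == 0.0:
        return None
    return weighted_in / weighted_all


# Hypothetical toy data: two viruses collected 20 days apart.
toy_samples = [(date(2015, 6, 1), True), (date(2015, 6, 21), False)]
midpoint_estimate = smoothed_frequency(toy_samples, date(2015, 6, 11))
```

Evaluating the function on a grid of pivot dates gives a frequency trajectory. A wider bandwidth averages over more months and so damps sampling noise, at the cost of blurring genuinely rapid changes.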
Throughout 2015, clade 3c2.a viruses predominated over 3c3.a and 3c3.b viruses. The only exception to this trend has been the recent reemergence of 3c3.a viruses in Europe. However, this clade of European 3c3.a viruses lacks recent amino acid substitutions relative to the 3c3.a parent. Thus it appears that the reemergence of 3c3.a viruses in Europe is due to epidemiological factors rather than genetic factors. In addition, we find that 3c2.a viruses still predominate over 3c3.a viruses in Europe despite the recent uptick in 3c3.a, and that 3c3.a isolates have been declining in recent weeks.
Within clade 3c2.a, multiple circulating amino acid variants have emerged, including HA1:114T, HA1:142K/197R and (HA1:171K, HA2:77V/155E). This last clade corresponds to HA1:171K/406V/484E. The substitution 171K has emerged twice within 3c3.a. The earlier substitution did not spread far. However, the recent 171K substitution has occurred alongside HA2:77V/155E and is present in more recent viral isolates. Within clade 142K/197R there is a subclade of significant size bearing 168V. The 114T variant peaked in frequency around July 2015. Similarly, the 142K variant has been around for more than a year with a low but even geographic distribution, although it has seen an uptick in frequency since October due to 168V. We expect neither of these variants to be under strong positive selection. The T160K substitution is observed scattered within clade 3c2.a and is overrepresented in sequences obtained from cultured virus.
We observe a recent rapid increase in the frequency of (HA1:171K, HA2:77V/155E), especially in Asia and North America. This has been the fastest growing variant within 3c2.a that we have observed. This clade also shows rapid phylogenetic branching, corroborating its rapid rise. Additionally, the subclade bearing HA1:142K/197R/168V has recently risen in frequency, although not as rapidly as the (HA1:171K, HA2:77V/155E) clade. We estimate that clade HA1:142K/197R/168V has gone from 1% global frequency in March 2015 to 19% global frequency today, while clade (HA1:171K, HA2:77V/155E) has gone from 1% global frequency in June 2015 to 29% global frequency today. For comparison, 3c3.a viruses increased in frequency from 1% to 31% in a similar span of 8 months.
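One way to put these rises on a common scale is to convert each pair of frequencies into a per-month growth rate on the logit scale. The sketch below assumes simple logistic growth between the two time points, which is an illustrative simplification and not a model fitted in this report. The function names are hypothetical, and the 8-month span for the (HA1:171K, HA2:77V/155E) clade is taken from the "similar span" wording above rather than from exact dates.

```python
import math


def logit(frequency):
    """Log-odds of a frequency strictly between 0 and 1."""
    return math.log(frequency / (1.0 - frequency))


def logistic_growth_rate(start_freq, end_freq, months):
    """Per-month change in log-odds, assuming logistic growth throughout."""
    return (logit(end_freq) - logit(start_freq)) / months


# Frequencies are those quoted in the text; 8 months is approximate
# for the 171K clade (assumption) and as stated for 3c3.a.
rate_171k = logistic_growth_rate(0.01, 0.29, 8)
rate_3c3a = logistic_growth_rate(0.01, 0.31, 8)
```

Under these assumptions the two rates come out close to each other, which is the sense in which the rise of the (HA1:171K, HA2:77V/155E) clade is comparable to the earlier rise of 3c3.a. The HA1:142K/197R/168V clade can be treated the same way once the elapsed time since March 2015 is supplied.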
Barring substantial changes in other clades, we predict the (HA1:171K, HA2:77V/155E) variant to dominate.
Due to poorly performing HI assays on 3c2.a samples and the lack of recent data, antigenic diversity within 3c2.a is hard to assess. The data available to us do not suggest significant antigenic evolution. Here we show recent antigenic data (through September 2015) using sera against A/HongKong/5738/2014 from the Crick Worldwide Influenza Centre alongside model fits. Cooler colors indicate greater antigenic similarity. Note that we lack HI data for (HA1:171K, HA2:77V/155E) viruses. Epitope mutation count would predict that HA1:142K/197R/168V is more drifted than (HA1:171K, HA2:77V/155E), with 2 epitope mutations vs 1 epitope mutation, although this requires confirmation through HI or other antigenic characterization. We would prioritize further characterization of viruses basal to the (HA1:171K, HA2:77V/155E) and HA1:142K/197R/168V clades. These include A/Washington/51/2015 and A/Laos/928/2015 for 171K and A/Bangladesh/3400/2015 and A/Roma/29/2015 for 168V.
Written by Trevor Bedford and Richard Neher. This work is made possible by the GISAID Initiative and the open sharing of genetic data by influenza research groups from all over the world. We gratefully acknowledge their contributions. Give us a shout at @trvrb or @richardneher with questions or comments.