Human norovirus GII.4 Hong Kong variant shares common ancestry with GII.4 Osaka and emerged in Thailand in 2016

Human norovirus is a leading cause of non-bacterial acute gastroenteritis, which affects all age groups and occurs globally. Infections are highly contagious and often occur as outbreaks. The periodic emergence of new strains is not uncommon, and novel variants are named after the place of the first reported nucleotide sequence. Here, we identified the human norovirus GII.4 Hong Kong variant in stool samples from Thai patients presenting with acute gastroenteritis. Comparison of amino acid residues deduced from the viral nucleotide sequence with those of historical and contemporary norovirus GII.4 strains revealed notable differences, which mapped to the defined antigenic sites of the viral major capsid protein. Time-scaled phylogenetic analysis suggests that GII.4 Hong Kong shares common ancestry with GII.4 Osaka, first reported in 2007, and, more importantly, did not evolve from the now-prevalent GII.4 Sydney lineage. As circulation of minor norovirus variants can lead to eventual widespread transmission in susceptible populations, this study underscores the potential emergence of the GII.4 Hong Kong variant, which warrants vigilant molecular epidemiological surveillance.


Introduction
Human noroviruses are the most common cause of epidemic and sporadic acute gastroenteritis in all age groups [1]. Most outbreaks occur in closed community settings including schools, childcare facilities, restaurants, and hospitals [2]. There are as many as ten norovirus genogroups (GI to GX) and 48 genotypes [3], of which the GII.4 genotype is most often detected in patients with viral gastroenteritis. Reinfection throughout one's lifetime is possible due to the emergence of new variants resulting from frequent viral mutation and genome recombination near the RNA-dependent RNA polymerase (RdRp) and the major capsid protein (VP1) genes [4]. Thus, the evolving genomic sequences of norovirus can potentially result in viral escape from pre-existing immunity, and newly emergent variants may lead to an increased incidence of norovirus infections worldwide [5]. VP1 is the major component of the capsid and the primary determinant of the viral structure [6,7]. It comprises the shell (S) and the protruding (P) domains. The P domain interacts with the histo-blood group antigens (attachment molecules on the surface of the host cell) and is subject to recognition by host neutralizing antibodies [8-11]. One study suggests that new variants circulate at low levels long before they emerge as pandemic strains and cause widespread outbreaks [12]. One such variant recently under surveillance, GII.4 Hong Kong, was first identified among hospitalized patients in the Philippines beginning in 2017 and subsequently detected elsewhere in Asia and in Europe [13]. This prompted us to re-evaluate several unusual GII.4 strains circulating in Bangkok within the past five years. Here, we report the identification and characterization of two strains of the GII.4 Hong Kong variant, which first emerged in 2016.

VP1 gene amplification and genotyping
Archived complementary DNA samples from our previous enteric virus studies in Thailand [14,15] were subjected to PCR to amplify the partial RdRp and VP1 regions using previously described conditions [16]. The complete VP1 genes were amplified using the norovirus-specific primers listed in S1 Table [17]. In this study, we sequenced two strains of the GII.4 Hong Kong variant (B2717 and B2793) from 2016 and 32 strains of GII.4 Sydney from 2017 to 2019. PCR products were resolved by agarose gel electrophoresis followed by gel-extraction purification (GeneAll Biotechnology, Seoul, Korea). After Sanger sequencing, nucleotide sequences were aligned using ClustalW, assembled using BioEdit version 7.2.0 [18], and submitted to an online norovirus genotyping tool (http://www.rivm.nl/mpf/norovirus/typingtool) [19]. Sequences were deposited in the GenBank database under the accession numbers MW521097-MW521130.

Phylogenetic and Bayesian evolutionary analysis
B2717 and B2793 were compared to different norovirus GII.4 variants whose nucleotide sequences were available in the GenBank database. A phylogenetic tree was constructed using the maximum-likelihood method with 1,000 bootstrap replicates implemented in MEGA7 software [20]. For the evolutionary analysis of the complete GII.4 VP1 gene, we performed time-scaled phylogenetic analysis using the Bayesian Markov Chain Monte Carlo (MCMC) method implemented in Bayesian Evolutionary Analysis Sampling Trees (BEAST) version 2.4.3 [21]. Nucleotide variations within and between clusters were examined by applying the maximum-likelihood method based on the Tamura-Nei 93 (TN93) nucleotide substitution model. The Bayesian coalescent skyline tree prior used in this study assumes that each parameter value occurs in the distribution in proportion to the probability of its occurring in the natural population. The algorithm constructs a distribution of parameters estimated from the selected simulations. The dataset was analyzed under this tree prior with a chain length of 300 million and the relaxed log-normal clock model, which allowed evolutionary rates to vary between clades. Effective sample size (ESS) values, all greater than 200, were calculated using Tracer version 1.6 (http://tree.bio.ed.ac.uk/software/tracer/). Analyzed plots were visualized with the 95% highest posterior density (HPD) intervals.
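To illustrate what the ESS threshold of 200 measures, the following is a minimal sketch of an ESS estimate for a single MCMC parameter trace. It is not Tracer's algorithm; it assumes the common estimator ESS = N / (1 + 2 Σ ρ_k), with the sum truncated at the first non-positive autocorrelation. The function name `effective_sample_size` and the example traces are hypothetical.

```python
def effective_sample_size(trace):
    """Estimate ESS as N / (1 + 2 * sum of positive autocorrelations).

    Illustrative estimator only (not Tracer's implementation): the sum
    over lags stops at the first non-positive autocorrelation.
    """
    n = len(trace)
    mean = sum(trace) / n
    centered = [x - mean for x in trace]
    var = sum(c * c for c in centered) / n
    if var == 0:
        raise ValueError("trace has zero variance; ESS is undefined")
    rho_sum = 0.0
    for lag in range(1, n):
        # Biased autocorrelation estimate at this lag.
        cov = sum(centered[i] * centered[i + lag] for i in range(n - lag)) / n
        rho = cov / var
        if rho <= 0:
            break
        rho_sum += rho
    return n / (1 + 2 * rho_sum)


# Hypothetical traces: a strongly autocorrelated chain yields far fewer
# effective samples than its nominal length.
drifting = [float(i) for i in range(100)]
alternating = [float(i % 2) for i in range(100)]
print(effective_sample_size(drifting), effective_sample_size(alternating))
```

A run whose parameters all exceed the chosen ESS threshold is generally taken to have sampled the posterior adequately; a low value suggests a longer chain or less frequent sampling is needed.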

Nucleotide, protein, and structural analysis
Pairwise nucleotide sequence similarity between B2717 and 14 reference GII.4 sequences available from the GenBank database was assessed using SimPlot version 3.5.1 [22]. The similarity plot was generated with a window size of 200 and a step size of 20. The Kimura 2-parameter distance model was used, and the ratio of transitions to transversions was determined empirically. The three-dimensional crystal structure of the dimeric P domain of norovirus GII.4 (TCH05, Protein Data Bank accession number 3SKB) [23] served as the template for visualizing amino acid residue changes identified in this study. Residues were mapped onto the structure using PyMOL version 1.3.
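The sliding-window comparison can be sketched as follows. This is an illustration of the general technique, not SimPlot's implementation: it assumes two pre-aligned nucleotide sequences of equal length, computes the Kimura 2-parameter distance d = -1/2 ln[(1 - 2P - Q) sqrt(1 - 2Q)] in each window (P and Q being the proportions of transitions and transversions), and reports similarity as 1 - d. The function names and the toy sequence are hypothetical.

```python
import math

TRANSITIONS = {("A", "G"), ("G", "A"), ("C", "T"), ("T", "C")}


def k2p_distance(a, b):
    """Kimura 2-parameter distance between two aligned sequences.

    Sites with gaps or ambiguous bases are skipped. math.log raises
    ValueError if the sequences are too divergent for the model.
    """
    pairs = [(x, y) for x, y in zip(a.upper(), b.upper())
             if x in "ACGT" and y in "ACGT"]
    n = len(pairs)
    if n == 0:
        raise ValueError("no comparable sites")
    p = sum(1 for pair in pairs if pair in TRANSITIONS) / n
    q = sum(1 for x, y in pairs if x != y and (x, y) not in TRANSITIONS) / n
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))


def sliding_similarity(query, reference, window=200, step=20):
    """Return [(window midpoint, 1 - K2P distance)] along the alignment."""
    results = []
    for start in range(0, len(query) - window + 1, step):
        d = k2p_distance(query[start:start + window],
                         reference[start:start + window])
        results.append((start + window // 2, 1.0 - d))
    return results


# Toy example: a 240-nt sequence compared with itself gives three windows.
toy = "ACGT" * 60
for midpoint, similarity in sliding_similarity(toy, toy):
    print(midpoint, round(similarity, 3))
```

Plotting the similarity values against the window midpoints for each reference sequence gives a similarity plot; an abrupt change in which reference is most similar can indicate a recombination breakpoint.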

Results
Due to the continual emergence of novel norovirus strains, which necessitated the recent reclassification of several norovirus genotypes [3], we initially re-examined the partial VP1 genes from several human norovirus GII.4 Sydney strains identified in Thailand from 2015 to 2017 [14]. Sequences were analyzed using the updated online norovirus genotyping tool, from which two Thai norovirus strains (designated B2717 and B2793) previously classified as GII.4 Sydney were reclassified as GII.4 Hong Kong variants. To confirm our findings, we amplified the complete VP1 gene from B2717 and B2793 and evaluated them together with 32 additional GII.4 Sydney strains identified since 2017 against the different GII.4 reference strains (Fig 1). To determine the evolutionary origin of the GII.4 Hong Kong variant, we next constructed a time-scaled phylogenetic tree using Bayesian Skyline Plot inference. The tree affirms that GII.4 Hong Kong was most closely related to the GII.4 Osaka lineage, which first emerged in 2007 (Fig 2). Results further suggest that GII.4 Hong Kong diverged relatively early from a common ancestor, which gave rise to various GII.4 progenies, and did not evolve from the commonly prevalent GII.4 Sydney cluster of recent years. The overall evolutionary rate of the GII.4 capsid sequences was estimated to be 4.5 × 10⁻³ substitutions/site/year (95% HPD interval, 4.0 × 10⁻³ to 5.1 × 10⁻³).
To identify prominent amino acid residues that differentiate GII.4 Hong Kong from other recent GII.4 strains, we aligned VP1 protein sequences from different GII.4 lineages with the amino acid residues deduced from the VP1 gene sequences of B2717 and B2793 (Fig 4A). All GII.4 Hong Kong strains, regardless of their origin, appeared to share common residues at several positions. Most of the residues exclusive to the GII.4 Hong Kong strains were located in the P2 domain. They were 290L, 298Q, 299F, 302H, 306P, 355A, 364N, 366R, 368G, 375L, 386I, 393E, 394N, 395P, 397F, 398S and 409T (residue position and one-letter code). Distinctive residues in the P1 domain were 447L, 449I, and 539V. In the S domain, only 144V set GII.4 Hong Kong apart from the other GII.4 strains.
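The search for lineage-exclusive residues in an alignment can be sketched as below. This is a minimal illustration, not the procedure used in the study: it assumes pre-aligned, equal-length protein sequences and reports columns where every sequence in the target group carries the same residue and no sequence in the comparison group carries it. The function name `exclusive_residues` and the toy sequences are hypothetical, and alignment columns correspond to VP1 residue numbers only when the alignment contains no gaps upstream.

```python
def exclusive_residues(target_seqs, other_seqs):
    """Return [(position, residue)] exclusive to the target group.

    A column qualifies when all target sequences share one residue and
    that residue is absent from every other sequence at that column.
    Positions are 1-based alignment columns; gap-only columns are skipped.
    """
    hits = []
    length = len(target_seqs[0])
    for i in range(length):
        target_column = {seq[i] for seq in target_seqs}
        if len(target_column) != 1:
            continue  # target group is not conserved at this column
        residue = next(iter(target_column))
        if residue == "-":
            continue
        if all(seq[i] != residue for seq in other_seqs):
            hits.append((i + 1, residue))
    return hits


# Toy alignment (not real VP1 sequences): only column 3 is exclusive.
target = ["MKLQ", "MKLQ"]
others = ["MKVQ", "MRVQ"]
print(exclusive_residues(target, others))
```

Applied to a full VP1 alignment, the output would take the same "position plus one-letter code" form as the residue list above.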
We next mapped these important residue differences onto the three-dimensional structure of an epidemic GII.4 (TCH05) dimeric P domain for which crystal structures have been solved (Fig 4B). Remarkably, all of the above residues unique to GII.4 Hong Kong mapped to the solvent-exposed surface on the P dimer. Furthermore, a significant number of residues mapped to known and novel antigenic epitopes (designated A through G) refined through a large-scale genomic study [24]. For example, 298Q and 368G were located on antigenic site A, while 375L appeared on antigenic site C. Four nearly contiguous residues mapped to antigenic site D (393E, 394N, 395P, and 397F). Finally, 355A and 364N affected the recently proposed antigenic site G. Taken together, the novelty of residues exclusive to GII.4 Hong Kong strains suggests potential changes in the antigenic characteristics that warrant further investigation.

Discussion
The circulation of GII.4 Hong Kong in Thailand in 2016 was unexpected and occurred earlier than previously reported in the literature [13]. The fact that the VP1 sequences of the two Thai strains were nearly identical to those of the other GII.4 Hong Kong strains analyzed points to this variant's relatively inconspicuous circulation prior to 2017, possibly due to its low frequency of detection.

Although GII.4 Hong Kong possesses numerous exclusive mutations and has already caused infection in Asia and Europe, its potential to become pandemic like GII.4 Sydney is uncertain. Its shared ancestry with the non-pandemic GII.4 Osaka, however, does not necessarily preclude its virulence, since the previously pandemic GII.4 New Orleans emerged from the same ancestral lineage as the non-pandemic GII.4 Apeldoorn. Moreover, unique residues on GII.4 Hong Kong affecting important antigenic sites, including the previously described residues at positions 352, 355, 368, and 378 [13], as well as additional residues described in this study, could enable possible evasion of pre-existing host immunity.
Interestingly, all GII.4 Hong Kong strains, including the Thai strains, possess the RdRp gene from GII.P31, which was commonly found in combination with GII.4 Sydney in Thai patients infected with norovirus in 2017-2018 [15]. The standard method of partial RdRp sequence amplification performed as part of minimal norovirus genotyping allowed us to deduce residues 348 to 510 of the RdRp in B2717 and B2793. Alignment of the amino acid sequences in this region of GII.P31 from this study, the prototypic strain (GenBank accession number AB434770), GII.4 Sydney (JX459908), and GII.4 Osaka (AB541319) revealed one unique and noteworthy difference in each variant. GII.P31 in Hong Kong, Sydney, and Osaka possessed P399I, S427F, and A360G, respectively. These changes are located in the palm and thumb subdomains of the norovirus RdRp, but none mapped to any RdRp motifs responsible for interaction with the viral RNA template and RNA synthesis [25]. There is increasing recognition that novel norovirus variants emerge not only from changes in the VP1 capsid protein, but also from the functional gain afforded by a new RdRp genotype acquired through recombination at the ORF1/ORF2 junction, by mutations in the RdRp, or both [26,27]. Sequencing of important RdRp regions, in addition to the entire VP1 gene, may therefore be needed as part of standard norovirus characterization in order to better identify changes in the RdRp that may influence nucleotide incorporation speed, processivity, fidelity, and ultimately viral transmissibility.
With the increased public health measures brought on by the recent global coronavirus pandemic, such as school closures, social distancing, and greater awareness of hygiene practices such as frequent handwashing, a number of countries have reported decreases in the number of seasonal viral respiratory infections such as influenza [28,29]. Even a decline in the incidence of norovirus outbreaks in several U.S. states has been associated with the implementation of coronavirus public health measures [30]. Therefore, modified human behaviors (such as better hygiene awareness, avoidance of crowded conditions, and other disease avoidance lifestyle changes) may potentially delay or reduce the magnitude of the next global pandemic norovirus outbreaks.
Surveillance and awareness of emergent human noroviruses, especially those belonging to the predominant GII.4 lineage, will require vigilance in molecular epidemiology and collaborative public health efforts. The global norovirus surveillance network for pediatric gastroenteritis, NoroSurv (https://www.norosurv.org), led by the U.S. Centers for Disease Control and Prevention, is one such example [31]. The collective norovirus sequence data could help monitor geographical trends in norovirus emergence and spread, as well as serve as a rich resource for developing the vaccine candidates most likely to protect against circulating global norovirus strains.

Conclusions
Similar to other RNA viruses, human norovirus variants periodically emerge to cause outbreaks worldwide. Here, we report the detection and characterization of the norovirus GII.4 Hong Kong variant, which first emerged in Thailand in 2016, earlier than previously thought. The retrospective analysis of viral sequence data can reveal unexpected findings and help contribute to a more accurate timeline of the temporal and geographical emergence of virus variants.
Supporting information S1