Genetic Analysis of Norovirus GII.4 Variant Strains Detected in Outbreaks of Gastroenteritis in Yokohama, Japan, from the 2006-2007 to the 2013-2014 Seasons

Noroviruses (NoVs) are the leading cause of acute gastroenteritis, both in sporadic cases and outbreaks. Since the 1990s, the emergence of several GII.4 variants has been reported worldwide. To investigate the epidemic status of NoV, 6,724 stool samples collected from outbreaks in Yokohama, Japan, from the 2006–2007 to 2013–2014 seasons were assessed for NoVs. We genotyped one specimen from each GII outbreak and conducted a sequence analysis of the VP1 gene for several GII.4 strains. Of the 947 NoV outbreaks during our study, GII was detected in 835, and GII.4 was the predominant genotype of GII. Five different GII.4 variants, Yerseke 2006a, Den Haag 2006b (2006b), Apeldoorn 2007, New Orleans 2009, and Sydney 2012, were detected. During this study period, the most prevalent variant of GII.4 was 2006b, and in each individual season, either 2006b or Sydney 2012 was the predominant variant. Out of the 16 detected 2006b strains, 12 had some amino acid substitutions in their blockade epitope, and these substitutions were concentrated in three residues. Two of the 2006b strains detected in the 2012–2013 season had an S368E substitution, which is consistent with the amino acid residue at the same site of NSW0514 (Sydney 2012 prototype). Among the 16 detected strains of Sydney 2012, a phylogenetic analysis showed that all five strains detected in Yokohama during the 2011–2012 season clustered away from the other Sydney 2012 strains that were detected in the 2012–2013 and 2013–2014 seasons. These five strains and other Sydney 2012 strains in Yokohama had a few amino acid differences in the blockade epitopes compared with NSW0514. The amino acid substitutions observed in this study provide informative data about the evolution of a novel GII.4 variant.


Introduction
Noroviruses (NoVs) are the most frequent cause of acute gastroenteritis worldwide among people of all ages [1,2]. They are single-stranded positive-sense RNA viruses in the family Caliciviridae. The NoV genome is 7.5 kb long and contains three open reading frames (ORFs): ORF1, which encodes the nonstructural proteins, and ORF2 and ORF3, which encode the structural proteins VP1 and VP2, respectively. Based on their VP1 gene, NoV strains can be classified into six genogroups (G), GI-GVI, of which GI, GII, and GIV infect humans [1]. NoV GI contains nine genotypes, and NoV GII contains 22 genotypes [3].
Globally, NoV GII strains are dominant, and GII.4 has been the predominant NoV genotype. Genetically distinct novel GII.4 variants have emerged every two to three years and spread rapidly around the world [4-7]. GII.4 variants US95/96, Farmington Hills 2002, Hunter 2004, Den Haag 2006b (2006b), New Orleans 2009, and Sydney 2012 are recognized as pandemic variants, while some variants, such as Asia 2003 and Yerseke 2006a, have been reported only in limited regional epidemics [3,4,6]. Moreover, it has been reported that GII.4 causes more severe gastroenteritis than other genotypes [8,9]. To determine the reason for the predominance of GII.4 and its increased disease severity, an analysis of the viral antigenicity and pathogenicity of GII.4 is needed. Unfortunately, an efficient culture system for human NoVs has not yet been developed.
Structural analyses show that VP1 can be divided into two distinct domains, the shell (S) domain and the protrusion (P) domain of the capsid. The P domain can be further divided into the P1 and P2 subdomains [10]. P2 is a hypervariable domain that contains the putative receptor-binding sites [11,12]. Using bioinformatic approaches, five antibody epitopes (epitopes A-E) on the surface of the GII.4 P2 subdomain were predicted, and the emergence of pandemic strains is often associated with alterations in these epitopes [13,14].
To our knowledge, longitudinal studies of GII.4 strains derived from outbreaks in Japan have not been reported, although similar studies have been reported for other countries [5,6,15-18]. In this study, to determine the trends of circulating NoV strains and to investigate the characteristics of GII.4 variant strains, we performed a genetic analysis of the strains detected in NoV outbreaks in Yokohama. Yokohama has a population of about 3,700,000 and is located in the center of Japan on the coastline of the Pacific Ocean. We believe this is the first study to conduct long-term monitoring of GII.4 variants in Yokohama, Japan.

Ethics Statement
All procedures in this study that involved human participants were performed in accordance with the ethical standards of the institutional research committee of Yokohama City Institute of Public Health, Kanagawa, Japan, and with the 1964 Helsinki Declaration and its later amendments or with comparable ethical standards. Ethical clearance by the institutional research committee of Yokohama City Institute of Public Health was not needed because this study was conducted as an outbreak investigation to identify the causative agent of gastroenteritis. The patient information was anonymized and de-identified prior to analysis.

... gastroenteritis occurring in a given setting, and a NoV season was defined as the 12-month period from September through August of each year.
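The season definition above (the 12-month period from September through August) can be expressed as a simple mapping from a calendar date to a season label. The following is a minimal sketch, not the authors' procedure; the function name `nov_season` and the label format are assumptions made for illustration.

```python
from datetime import date


def nov_season(d: date) -> str:
    """Return a season label such as '2006-2007' for a calendar date.

    Assumption: a season runs from 1 September of one year
    through 31 August of the following year.
    """
    # Dates from September onward belong to the season starting that year;
    # dates from January through August belong to the season that started
    # in the previous calendar year.
    start_year = d.year if d.month >= 9 else d.year - 1
    return f"{start_year}-{start_year + 1}"


if __name__ == "__main__":
    print(nov_season(date(2006, 9, 1)))   # first day of the study period
    print(nov_season(date(2012, 2, 15)))  # a date in February 2012
```

Under this mapping, a sample collected in February 2012 would fall in the 2011-2012 season.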

Detection of NoV gene
A 10% stool suspension was prepared by mixing each stool sample with phosphate-buffered saline, followed by centrifugation at 10,000 × g for 10 min at 4°C. Viral RNA was extracted from the supernatants with the RNeasy Mini Kit (Qiagen, Hilden, Germany). Real-time RT-PCR detection of NoV was performed with a Smart-Cycler II (Cepheid, Sunnyvale, CA, USA) using a QuantiTect Probe RT-PCR Kit (Qiagen) with separate reactions for NoV genogroups I and II. The primers and probes used to detect these viruses have been described in other reports [19,20].

RT-PCR for NoV GII genotyping and NoV GII.4 variant typing
One positive specimen selected randomly from each NoV GII outbreak was subjected to gene amplification of region C (the 5' end of the ORF2 gene) to determine the GII genotype and the GII.4 variant type using a web-based Norovirus Genotyping Tool Version 1.0 [21]. RT-PCR was performed with a TaKaRa One Step RNA PCR Kit (Takara Bio Inc., Shiga, Japan). The primers used for PCR have been described in other reports [22,23].

Sequence and phylogenetic analysis
The nucleotide sequences of the purified PCR products (QIAquick PCR Purification Kit, Qiagen) were determined using a BigDye Terminator Cycle Sequencing Kit (Applied Biosystems, Foster City, CA, USA) and a Genetic Analyzer 3130 (Applied Biosystems). The obtained data were used to construct a phylogenetic tree on the basis of amino acid sequences with the neighbor-joining method using MEGA 5 software (http://www.megasoftware.net/) with 1,000 bootstrap replicates. The sequences reported in this paper have been deposited in the DDBJ/GenBank/EMBL databases under accession numbers LC005704-LC005735.
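Neighbor-joining builds a tree from a matrix of pairwise distances between aligned sequences. As an illustration only, the sketch below computes the simplest such distance, the p-distance (the proportion of compared sites that differ). The text does not state which distance model was used in MEGA 5, so the choice of p-distance, the pairwise handling of gaps, and the toy sequences are all assumptions; the sequences are invented and are not VP1 data.

```python
from itertools import combinations


def p_distance(a: str, b: str) -> float:
    """Proportion of differing sites between two aligned sequences.

    Assumption: sites where either sequence has a gap ('-') are skipped.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(x != y for x, y in pairs) / len(pairs)


def distance_matrix(seqs: dict) -> dict:
    """Return {(name1, name2): p-distance} for every unordered pair."""
    return {
        (n1, n2): p_distance(seqs[n1], seqs[n2])
        for n1, n2 in combinations(sorted(seqs), 2)
    }


if __name__ == "__main__":
    # Invented toy alignment (hypothetical strain names and residues).
    toy = {"strainA": "MKVAS", "strainB": "MKVGS", "strainC": "MRVGS"}
    for pair, dist in distance_matrix(toy).items():
        print(pair, dist)
```

A tree-building method such as neighbor-joining would take a matrix like this as input; bootstrap support would then come from repeating the analysis on resampled alignment columns.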

NoV outbreaks
Of the 947 NoV outbreaks during eight consecutive 12-month periods starting in September 2006, 835 (88.2%) were caused by NoV GII, 76 (8.0%) were caused by NoV GI, and 36 (3.8%) were caused by a mixture of NoV GI and GII. A summary of the NoV outbreaks in each season is listed in Table
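The percentages above follow directly from the outbreak counts. A minimal sketch that recomputes them, using only the counts stated in the text (the dictionary keys and the function name are illustrative):

```python
def percentages(counts: dict) -> dict:
    """Return each count as a percentage of the total, rounded to one decimal."""
    total = sum(counts.values())
    return {name: round(100 * n / total, 1) for name, n in counts.items()}


if __name__ == "__main__":
    # Outbreak counts by genogroup, as reported in the text.
    outbreaks = {"GII": 835, "GI": 76, "GI and GII": 36}
    print(sum(outbreaks.values()))
    print(percentages(outbreaks))
```

The three counts sum to the 947 outbreaks reported, and the rounded shares match the 88.2%, 8.0%, and 3.8% given above.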

VP1 sequence and phylogenetic analysis of NoV GII.4 strains
To clarify the genetic characteristics of the 2006b and Sydney 2012 strains, which were predominant in Yokohama, we further analyzed strains of these two variant types. The VP1 gene of representative strains of the 2006b and Sydney 2012 variant types detected in Yokohama was analyzed by sequencing. The analyzed strains are listed in Table 2. The results of a phylogenetic analysis based on the amino acid sequences of VP1 are shown in Fig 2. We also investigated the amino acid variation occurring in the VP1 of these strains, especially in the predicted GII.4 blockade epitope sites A-E (Tables 2 and 3).

Analysis of 2006b strains
Sixteen of the 2006b strains that were detected in Yokohama were analyzed (Fig 2). When the amino acid variation of VP1 in the alignment of these 16 strains was investigated, 25 sites (4.6% of the 541 amino acids) were substituted during seven seasons (from the 2006-2007 season to the 2012-2013 season). Of these, 13 sites were located in the P2 subdomain, and the P2 subdomain had a higher percentage of substitutions than other domains (Table 3). Out of the 16 detected 2006b strains, 12 strains had some amino acid substitutions in the A-E epitopes compared with DenHaag89 (Table 2). These substitutions were largely concentrated in three residues (amino acid (aa) 368 in epitope A, aa333 in epitope B, and aa393 in epitope D). Both y12-V758-3 and y12-V836-3, which were identified from the two 2006b outbreaks in the 2012-2013 season, had three amino acid substitutions, from S to E at aa368 (S368E) of epitope A, V333M of epitope B, and S393G of epitope D, compared with DenHaag89. Notably, this substitution in epitope A was not observed in the other strains analyzed in this study from the seasons between 2006 and 2012, and these substituted amino acids of epitope A and epitope D were consistent with the residues at the same sites of the Sydney 2012 prototype, the NSW0514 strain (JX459908). Moreover, in a BLAST search, y12-V836-3 showed the highest nucleotide sequence identity (98.7%) in VP1 to the Taiwanese strain Taoyuan/CGMH55 (KC517366), which was detected in February 2012. This Taiwanese strain had the same amino acids as y12-V758-3 and y12-V836-3 in epitopes A-E (Table 2).
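Substitution labels such as S368E combine the reference residue, the position, and the residue in the compared strain. The sketch below shows how such labels could be derived from two aligned sequences. It is an illustration under stated assumptions: the function `substitutions`, its `start` parameter, and the five-residue fragments are hypothetical, and the fragments are invented rather than taken from DenHaag89 or any Yokohama strain.

```python
def substitutions(reference: str, query: str, start: int = 1) -> list:
    """List substitutions in 'S368E' style between two aligned sequences.

    `start` is the position number of the first residue, so that a short
    fragment can keep full-length numbering. Gap characters ('-') are
    skipped (an assumption; indels are not reported here).
    """
    if len(reference) != len(query):
        raise ValueError("sequences must be aligned to the same length")
    found = []
    for pos, (ref_aa, qry_aa) in enumerate(zip(reference, query), start=start):
        if ref_aa != qry_aa and "-" not in (ref_aa, qry_aa):
            found.append(f"{ref_aa}{pos}{qry_aa}")
    return found


if __name__ == "__main__":
    # Invented fragments covering positions 366-370 (not real VP1 data).
    ref_fragment = "NTSAD"
    qry_fragment = "NTEAD"
    print(substitutions(ref_fragment, qry_fragment, start=366))
```

Restricting the output to the positions assigned to epitopes A-E would give the kind of per-epitope comparison summarized in Table 2.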

Analysis of Sydney 2012 strains
Sixteen of the Sydney 2012 strains that were detected in Yokohama were analyzed. They showed 98.6-99.1% nucleotide sequence identity and 98.9-99.6% amino acid sequence identity with the Sydney 2012 prototype, NSW0514. A phylogenetic tree constructed using the amino acid sequences showed that all five strains detected in Yokohama in the 2011-2012 season clustered away from the other Sydney 2012 strains detected in Yokohama in the 2012-2013 and 2013-2014 seasons (Fig 2). Fourteen sites in VP1 (2.6% of the 541 amino acids) were substituted during three seasons (from the 2011-2012 season to the 2013-2014 season). Of these, seven sites (5.5% of the amino acids in the P2 subdomain) were located in the P2 subdomain. In contrast, no variable sites were detected in the shell domain (Table 3).
Compared with the prototype Sydney 2012 strain NSW0514, none of the 16 detected Sydney 2012 strains had any substitutions in epitopes B, C, or E (Table 2). It has also been reported that early Sydney 2012 strains, Canadian strain AlbertaEI337 (KF509947) and Italian strain PA363 (KF668567), were detected in the 2011-2012 season [5,25,26]. Compared with NSW0514, AlbertaEI337 had differences in epitope C and epitope D, and PA363 had differences in epitope B and epitope D. However, we did not observe any differences from NSW0514 in the blockade epitopes of the five strains that were detected in the 2011-2012 season (Table 2).
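Counts of variable sites, such as the 14 of 541 VP1 positions reported for the Sydney 2012 strains, come from scanning the columns of an alignment for positions where the strains do not all share the same residue. A minimal sketch follows; the function names and the toy alignment are assumptions for illustration, and only the site counts and the VP1 length of 541 are taken from the text.

```python
def variable_sites(alignment: list) -> list:
    """Return 1-based positions where aligned sequences are not all identical."""
    if len({len(seq) for seq in alignment}) != 1:
        raise ValueError("sequences must be aligned to the same length")
    return [
        pos
        for pos, column in enumerate(zip(*alignment), start=1)
        if len(set(column)) > 1
    ]


def percent_variable(n_variable: int, length: int) -> float:
    """Share of variable sites as a percentage, rounded to one decimal."""
    return round(100 * n_variable / length, 1)


if __name__ == "__main__":
    # Invented toy alignment (not real VP1 sequences).
    toy = ["MKVAS", "MKVGS", "MRVGS"]
    print(variable_sites(toy))
    # Site counts reported in the text for the 541-residue VP1.
    print(percent_variable(25, 541))  # 2006b strains
    print(percent_variable(14, 541))  # Sydney 2012 strains
```

Applied per domain (shell, P1, P2), the same calculation would give domain-level percentages of the kind listed in Table 3.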

Discussion
NoVs are the most common cause of gastrointestinal disease outbreaks [1]. They can spread through contaminated food or water or from person to person and are highly infectious [1,2]. Consequently, NoV is a public health problem worldwide, including in Japan. In this study, to better understand the trends of NoV strains circulating in the population of Yokohama, Japan, the strains associated with outbreaks in various settings as well as the epidemic variation in NoV GII.4 strains were investigated in the seasons between 2006 and 2014. GII was the predominant genogroup throughout the eight seasons in our study and was detected in 88.2% of the NoV-positive outbreaks. The proportion of GII was consistent with those found in outbreak studies performed in the United States of America and in Canada [5,17], whereas it was lower than those shown in studies of sporadic cases [27,28].
For GII genotypes, although GII.4 was the predominant genotype among the 22 different GII genotypes in all except the 2010-2011 season, the GII.4 proportion of GII differed by season. When the prevalence of GII genotypes other than GII.4 is considered, the endemic genotypes of the 2008-2009, 2009-2010, 2010-2011, 2011-2012, and 2013-2014 seasons were GII.6, GII.2, GII.2 and GII.3, GII.14, and GII.6, respectively. These four genotypes are mainly associated with sporadic infections in children in some countries, including Japan during this study period [7,27,29-31]. Increasing numbers of gastroenteritis samples have been tested in the past years [32]. Additionally, the number of reported outbreaks in primary schools and childcare facilities has increased during this period [32], resulting in an increase in the detection of GII.6, GII.  [4,5,27]. In contrast, several studies have reported that New Orleans 2009 was predominant in some seasons [5,6,17]. However, this variant was not predominant in Yokohama during any of the seasons in this study, although it was detected in three seasons, beginning in December 2009. This finding is consistent with the studies of other Asian countries [27,31] [4,24,28]. The duration of immunity to NoV has been estimated at 4.1 to 8.7 years in various models [34]. It is possible that people in Asia who had been infected with the Asia 2003 variant had antibodies that cross-react with New Orleans 2009. However, this hypothesis cannot be confirmed at present because no studies on antibodies to the Asia 2003 variant have been published. The Yerseke 2006a and Apeldoorn 2007 variants were rarely detected in this study, and this result agrees with the sporadic cases found in Asian and Russian studies that were conducted during the same period [7,27,35].
To date, 2006b has been one of the pandemic variants, and Sydney 2012 is currently still a pandemic variant around the world [5-7,17,27]. In Yokohama, either 2006b or Sydney 2012 was the predominant variant of each season in this study period. Because of their predominance, we used representative strains of 2006b and Sydney 2012 to investigate their genetic characteristics. A phylogenetic tree constructed using the amino acid sequences of VP1 revealed that the 2006b strains in Yokohama are distantly related to other GII.4 variants. The 2006b strains had an accumulation of mutations in the capsid P2 subdomain compared with the variants that emerged before 2006b [24,36]. Antigenic variation is an important factor contributing to NoV GII.4 evolution [13]. This large shift of amino acids in VP1, especially in the P2 subdomain, is believed to be one of the factors that allowed the 2006b variant to escape existing herd immunity. Additionally, it has been reported that, in Brazil, the 2006b variant strains were grouped into two clearly separate sub-clusters, one from samples collected during 2006-2008 and the other from samples collected during 2009-2011 [18]. However, our 2006b strains, which were collected from the 2006-2007 to the 2012-2013 seasons, did not separate into any obvious sub-clusters. The number of amino acid substitutions in the blockade epitopes of our 2006b strains compared with the prototype trended upward over the course of this study. Notably, two 2006b strains detected in the 2012-2013 season had glutamic acid at aa368 of epitope A and glycine at aa393 of epitope D, which are consistent with the residues at the same sites of the Sydney 2012 prototype. It is thought that epitope A contributes to antigenic differences and epitope D affects histo-blood group antigen (HBGA) binding [14]. Epitopes A and D may be the major drivers of escape from herd immunity in contemporary strains, and screening new strains for changes in these epitopes may provide a quick and valuable method for effective vaccine design and reformulation [37].
In Yokohama, the Sydney 2012 variant replaced the 2006b variant in the 2012-2013 season and expanded rapidly, although the first appearance of the Sydney 2012 variant was in the 2011-2012 season. Similarly, it has also been reported that the Sydney 2012 variant became predominant in the winter season following its first appearance in Italy and Canada [5,25,26]. These Italian and Canadian early Sydney 2012 strains had some mutations in the blockade epitopes compared with NSW0514, and these strains likely played a role in the evolution and adaptation of the novel pandemic variant before it spread worldwide [5,25]. However, our five early strains did not have these mutations in the blockade epitopes. Furthermore, our Sydney 2012 strains had little antigenic diversity, although all five strains detected in the 2011-2012 season were phylogenetically separate from the other Sydney 2012 strains detected in the 2012-2013 and 2013-2014 seasons. However, if the prevalence of Sydney 2012 continues on a path similar to that of 2006b, the Sydney 2012 strains will probably accumulate additional substitutions. Continuous surveillance of Sydney 2012 is needed to determine the magnitude and mechanism of the pandemic caused by this variant because it still causes epidemics.
In conclusion, we identified GII.4 variants from outbreaks in various settings that occurred in Yokohama, Japan, over eight NoV seasons. The prevalence of some of the GII.4 variant types differs between Asia and other parts of the world, and we found amino acid substitutions among GII.4 variants. We believe that, although our work is limited geographically to Yokohama, Japan, these data will lead to a better understanding of the requirement of NoV for constant changes in its host-binding site to escape the immune response. To our knowledge, this is the first investigation to conduct long-term monitoring of GII.4 variants in Yokohama, Japan. Many points about norovirus epidemics and evolution remain unknown. To elucidate these, steady surveillance and an accumulation of data are needed for future analyses.