The emerging GII.P16-GII.4 Sydney 2012 norovirus lineage is circulating worldwide, arose by late-2014 and contains polymerase changes that may increase virus transmission

Noroviruses are a leading cause of human gastroenteritis worldwide. The norovirus genotype GII.4 is the most prevalent genotype in the human population and has caused six pandemics since 1995. A novel norovirus lineage containing the GII.P16 polymerase and pandemic GII.4 Sydney 2012 capsid was recently detected in Asia and Germany. We demonstrate that this lineage is also circulating within the UK and USA and has been circulating since October 2014 or earlier. While the lineage does not contain unique substitutions in the capsid, it does contain polymerase substitutions close to positions known to influence polymerase function and virus transmission. These polymerase substitutions are shared with a GII.P16-GII.2 virus that dominated outbreaks in Germany in Winter 2016. We suggest that the substitutions in the polymerase may have resulted in a more transmissible virus and the combination of this polymerase and the pandemic GII.4 capsid may result in a highly transmissible virus. Further surveillance efforts will be required to determine whether the GII.P16-GII.4 Sydney 2012 lineage increases in frequency over the coming months.


PLOS ONE | https://doi.org/10.1371/journal.pone.0179572 | June 29, 2017

Introduction
Noroviruses are the leading cause of human gastroenteritis worldwide and are estimated to be responsible for 900,000 clinic visits amongst children in industrialized countries and up to 200,000 deaths of children in developing countries annually [1,2]. Noroviruses belong to the Caliciviridae family and their ~7.5 kb RNA genome contains three open reading frames (ORFs): ORF1 encodes a nonstructural polyprotein that is cleaved into six proteins including an RNA-dependent RNA polymerase (RdRp), ORF2 encodes the VP1 capsid protein and ORF3 encodes a minor structural protein, VP2. Recombination frequently occurs close to the junction between ORF1 and ORF2, necessitating independent genotyping of the RdRp and capsid [3]. While noroviruses are divided into seven genogroups and further into more than 30 genotypes based on capsid sequence, the majority of cases and outbreaks are caused by viruses associated with a single capsid genotype, GII.4, which has also caused six pandemics of gastroenteritis since 1995 [4,5]. Each pandemic has been caused by a distinct strain of GII.4. While the first five pandemic strains contained the GII.P4 RdRp, the most recent pandemic strain (Sydney 2012) circulated more commonly with the GII.Pe RdRp. Recent reports demonstrated circulation of the Sydney 2012 capsid with a GII.P16 RdRp in South Korea, Japan and Germany [6][7][8]. While the GII.P16 RdRp is not typically highly prevalent, a GII.P16-GII.2 virus was the dominant strain amongst a large peak of norovirus infections in Germany in Winter 2016 [8]. Here, we demonstrate using whole genome sequencing [9] and phylogenetic analyses that the GII.P16-GII.

Sample collection and sequencing
We identified noroviruses with the GII.P16 RdRp in ten stool samples collected as part of routine surveillance from South East and North West England between June 2015 and April 2016; samples were from both sporadic cases and outbreaks. Four of these faecal specimens were referred to the Virus Reference Department, Public Health England, as part of a sentinel norovirus strain surveillance programme, which collects norovirus-positive specimens from geographically disparate regions across England. The other six faecal specimens were collected from a tertiary referral paediatric hospital in London, UK. These six specimens were residual diagnostic specimens obtained from patients with confirmed norovirus infections. Specimens were collected as part of the FP7 PATHSEEK study and submitted to the UCL Infection DNA Bank. The samples were supplied to the study in an anonymised form; the use of these specimens for research was approved by the NRES Committee London-Fulham (REC reference: 12/LO/1089). Other specimens used were sent to the Enteric Virus Unit at Public Health England in the course of routine surveillance and diagnosis work. RNA was extracted and whole genome sequencing performed as described previously [9]. Sample genotypes were obtained using the norovirus genotyping tool, available at http://www.rivm.nl/mpf/norovirus/typingtool [10]. The GenBank accession numbers for viruses sequenced in this study are as follows: KY887597-KY887606.
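
For readers who wish to retrieve the sequences individually, the accession range quoted above can be expanded into separate accession numbers. The following is a minimal sketch, not part of the study's pipeline; the helper name expand_accession_range is hypothetical, and the code assumes a range written as a letter prefix followed by digits on both sides of a hyphen.

```python
import re


def expand_accession_range(accession_range):
    """Expand a range such as 'KY887597-KY887606' into individual accessions.

    Hypothetical helper for illustration; assumes both ends share the same
    letter prefix and the same number of digits.
    """
    start, end = accession_range.split("-")
    pattern = re.compile(r"([A-Z]+)(\d+)")
    start_match = pattern.fullmatch(start.strip())
    end_match = pattern.fullmatch(end.strip())
    if start_match is None or end_match is None:
        raise ValueError("expected accessions of the form PREFIX + digits")
    prefix, first_digits = start_match.groups()
    end_prefix, last_digits = end_match.groups()
    if prefix != end_prefix:
        raise ValueError("both ends of the range must share a prefix")
    width = len(first_digits)  # preserve zero padding, if any
    return [
        f"{prefix}{number:0{width}d}"
        for number in range(int(first_digits), int(last_digits) + 1)
    ]


accessions = expand_accession_range("KY887597-KY887606")
print(len(accessions), accessions[0], accessions[-1])
# prints: 10 KY887597 KY887606
```

The range expands to ten accessions, matching the ten samples sequenced in this study.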

Phylogenetic analyses
We combined our sequences with all GII.P16 ORF1 sequences and all GII.4 Sydney 2012 capsid and VP2 sequences available on GenBank. We reconstructed maximum likelihood trees using RAxML [11] and time trees using BEAST 2 [12]. GII.P16 dating analyses were carried out using the RdRp as there are many more GII.P16 sequences containing the RdRp (n = 165) compared with the complete ORF1 (n = 45), enabling estimation of more accurate dates. The GII.4 Sydney 2012 capsid maximum likelihood tree was used to identify a well-supported monophyletic clade (bootstrap support 81) containing 70 samples that includes all of the samples with the GII.P16 RdRp. The GII.4 Sydney 2012 time tree was reconstructed using the samples in this clade. Ancestral reconstruction to identify nonsynonymous changes occurring along particular branches was carried out using PAML [13].
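
To illustrate what identifying nonsynonymous changes along a branch involves, the sketch below compares a reconstructed ancestral coding sequence with a descendant sequence codon by codon and reports amino acid differences. This is an illustration only and not the PAML analysis used here: the function name nonsynonymous_changes and the toy sequences are hypothetical, and the code assumes the two sequences are already aligned, in frame and of equal length, and uses the standard genetic code.

```python
from itertools import product

# Standard genetic code in the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def nonsynonymous_changes(ancestor, descendant):
    """Return amino acid changes (e.g. 'K2R') between two aligned, in-frame
    coding sequences. Hypothetical helper for illustration only."""
    if len(ancestor) != len(descendant) or len(ancestor) % 3 != 0:
        raise ValueError("sequences must be aligned, in frame and equal length")
    changes = []
    for start in range(0, len(ancestor), 3):
        ancestral_aa = CODON_TABLE.get(ancestor[start:start + 3].upper())
        descendant_aa = CODON_TABLE.get(descendant[start:start + 3].upper())
        # Skip codons containing gaps or ambiguous bases.
        if ancestral_aa is None or descendant_aa is None:
            continue
        if ancestral_aa != descendant_aa:
            position = start // 3 + 1  # 1-based amino acid position
            changes.append(f"{ancestral_aa}{position}{descendant_aa}")
    return changes


# Toy sequences (not real norovirus data): the second codon changes
# AAA (K) to AGA (R); the third changes GTT to GTC, which is synonymous (V).
print(nonsynonymous_changes("ATGAAAGTT", "ATGAGAGTC"))
# prints: ['K2R']
```

Only the second codon is reported, because the third-codon change does not alter the encoded amino acid.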

Data availability
All alignments, phylogenetic trees and BEAST XML files are included as supporting information (S1-S6 Files).

Results
We identified ten viruses collected in routine surveillance in the UK containing the GII.P16 RdRp. Of these, seven were found with the GII.4 Sydney 2012 capsid and three were found with the GII.3 capsid. The ten RdRp sequences formed a well-supported monophyletic clade (Fig 1) that also contains GII.

Discussion
Here, we demonstrate that the emerging GII.P16-GII.4 Sydney 2012 norovirus lineage is circulating in the UK and USA, in addition to previous reports of circulation in Asia and Germany [6][7][8]. Analysis of available sequences suggests that this lineage has been circulating since October 2014 or earlier (Fig 2). The lack of unique amino acid substitutions in the capsid suggests that this lineage will not be able to escape the existing herd immunity generated against Sydney 2012 since its emergence as a pandemic strain in 2012. However, previous studies have implicated the RdRp as an important component of viral fitness and demonstrated that RdRp changes can influence viral transmission by modulating replication fidelity, and thus viral diversity [14,15]. Several of the changes in the RdRp are in the palm subunit that contains most of the catalytic residues (Fig 3). Little is currently known about whether changes in the other proteins