Whole-Genome Sequencing Analysis of Sapovirus Detected in South Korea

Sapovirus (SaV), a virus that resides in the intestines, is an important cause of gastroenteritis in humans. Human SaV genomes are classified into various genogroups and genotypes. Whole-genome and phylogenetic analyses of ROK62, a SaV isolated in South Korea, were carried out. The ROK62 genome of 7429 nucleotides contains 3 open reading frames (ORFs). The genotype of ROK62 is SaV GI-1, and its nucleotide sequence is 94% identical to those of other SaVs, namely Manchester and Mc114. Recently, SaV infection has been on the rise throughout the world, particularly in countries neighboring South Korea; however, very few studies have been conducted in South Korea. As the first whole-genome sequence analysis of SaV in South Korea, this research will provide a reference for the detection of recombination, tracking of epidemic spread, and development of diagnostic methods for SaV.


Introduction
Sapovirus (SaV) is one of the etiological agents of human gastroenteritis and is named after the Japanese city Sapporo, where it was first discovered [1]. It is an important cause of gastroenteritis in young children and adults, and can induce symptoms such as diarrhea, vomiting, and fever [2,3]. It is transmitted from person to person (fecal-oral route), through aerosols, or through contaminated water or food [4].
SaV is an RNA virus with a non-segmented, positive-sense, single-stranded RNA genome of approximately 7.3-7.5 kb. It belongs to the family Caliciviridae, which also includes norovirus [5,6]. Phylogenetic analysis based on capsid protein (VP1) nucleotide sequences divides this genus into 5 genogroups (GI-GV). Further analysis of the 4 human SaV genogroups has led to their subdivision into 16 genotypes (GI.1-GI.7, GII.1-GII.7, GIV, and GV) [5][6][7]. Genogroups GI, GII, GIV, and GV can cause severe infection in humans, while GIII infects pigs [8]. GII and GIII genogroups have 2 ORFs, and the others have 3 each [9][10][11]. ORF1 encodes nonstructural proteins and the capsid protein VP1, but the roles of ORF2- and ORF3-encoded proteins have not been clearly defined [6,12]. Because human SaV strains cannot be cultivated in cell culture, molecular studies, including characterization of the infectious cycle of the virus, have been limited. The detection system for SaV based on reverse transcription-polymerase chain reaction (RT-PCR) analysis needs to be highly sensitive and accurate [13][14][15]. The purpose of this study was to analyze and present, for the first time, the full-length genome sequence of a SaV in South Korea. Phylogenetic analysis was performed for comparison with genotypes that have already been reported. We expect the data acquired from whole-genome sequencing to be useful not only for research in molecular biology, but also for basic epidemiologic analyses such as tracking of international spread.

Ethics statement
The stool sample was provided by the Waterborne Virus Bank (WAVA). Because the exact records of the patient could not be traced through the donor hospital, informed consent from the parent of the child participant could not be obtained. The Institutional Review Board reviewed and approved the use of this sample for research purposes, as this study does not affect the patient. All of the experimental work and sample collections were supervised by the Catholic Medical Center Office of Human Research Protection Program (CMC OHRP) of South Korea (approval no. MC14SISI0096).

Sample preparation and viral RNA extraction
A SaV-positive stool sample from a female infant who presented with fever and diarrhea was obtained from the Waterborne Virus Bank (WAVA, Seoul, South Korea). The stool sample was stored at −70°C until RNA extraction. The frozen stool sample was thawed and diluted to 10% with phosphate-buffered saline (PBS), after which it was centrifuged. Viral RNA of SaV was extracted from 140 μL of supernatant using a QIAamp Viral RNA mini kit (Qiagen, Hilden, Germany) according to the manufacturer's instructions. Isolated RNA was stored at −70°C until further use.

Reverse transcription (RT) polymerase chain reaction
For the detection of SaV, RT-PCR was performed with the OneStep RT-PCR Kit (Qiagen) using SV-F11 and SV-R1 primers (Table 1). To analyze the whole-genome sequence of SaV, 10 more primer pairs were newly designed based on the Manchester strain (GenBank accession no. X86560). RT-PCR was performed with an S1000 thermal cycler (Bio-Rad, Hercules, CA, USA), and the steps comprised RT (50°C for 30 min), initial PCR activation (95°C for 15 min), 39 cycles of 3-step cycling (94°C for 30 s, 52°C-55°C for 30 s, and 72°C for 1 min), and final extension (72°C for 10 min). All RT-PCR products were examined by electrophoresis in ethidium bromide-stained 2% agarose gels.
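As a rough aid for planning, the nominal duration of the RT-PCR program described above can be tallied from its hold times. The following Python sketch is only illustrative: the constant and function names are hypothetical, and ramp times between temperatures, which depend on the instrument, are assumed to be zero.

```python
# Nominal hold times (minutes) taken from the RT-PCR program in the text.
# Assumption: ramp times between temperature steps are ignored.
RT_MIN = 30.0                      # reverse transcription, 50°C
ACTIVATION_MIN = 15.0              # initial PCR activation, 95°C
CYCLES = 39                        # number of 3-step cycles
CYCLE_STEPS_MIN = (0.5, 0.5, 1.0)  # 94°C, 52-55°C, 72°C holds
FINAL_EXTENSION_MIN = 10.0         # final extension, 72°C


def nominal_run_minutes():
    """Sum of all programmed hold times, in minutes."""
    cycling = CYCLES * sum(CYCLE_STEPS_MIN)
    return RT_MIN + ACTIVATION_MIN + cycling + FINAL_EXTENSION_MIN


if __name__ == "__main__":
    print(f"Nominal run time: {nominal_run_minutes():.0f} min")
```

Under these assumptions, the 39 cycles account for 78 min of the total; the actual run will be longer by the cumulative ramp time.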

Determination of the 5' and 3'-ends of the SaV genomic RNA
To determine the 5'-end of the SaV genomic RNA, rapid amplification of cDNA ends (RACE) was performed with the 5'-Full RACE Core Set Kit (Takara Bio Inc., Ohtsu, Japan). The first cDNA strand was synthesized through reverse transcription from target mRNA using a 5' end-phosphorylated RT primer (5'-SV-PR, Table 1), after which it was treated with RNase H to remove hybrid RNA and with RNA ligase to form circularized single-strand cDNA or concatemers. To amplify the product, the first PCR reaction was performed using 5'-SV-F1 and 5'-SV-R1 primers under the following conditions: 94°C for 3 min, followed by 25 cycles of 94°C for 30 sec, 56°C for 30 sec, and 72°C for 5 min. Then, the second PCR reaction was conducted with 5'-SV-F2 and 5'-SV-R2 primers through 30 cycles of 3-step cycling (94°C for 30 sec, 56°C for 30 sec, and 72°C for 5 min).
To obtain the exact sequence of the 3'-end of the SaV genomic RNA, cDNA was synthesized by an RT reaction performed with the 3'-end poly A tail-based 3'-Oligo (dT)-anchor primer (Table 1). The second PCR reaction was conducted using the SV-10F and 3'-anchor-R primers (Table 1) under the following conditions: 30 cycles of 3-step cycling (98°C for 10 sec, 56°C for 30 sec, and 72°C for 1 min) and 72°C for 7 min.

Cloning and sequencing of the complete genome
All PCR products obtained using 13 primer pairs were extracted from 2% agarose gels using HiYield Gel/PCR DNA Fragments Extraction Kit (RBC, Taipei, Taiwan) and were cloned into pGEM-T easy vectors (Promega, Madison, WI, USA). Transformed Escherichia coli DH5α-

Table 1. Primers used in this study.
[Table body not recoverable; columns: Primer, Sequence.] The primers were based on the Manchester strain (GenBank accession no. X86560) [13].

Phylogenetic analysis
Comparative sequence analysis, including sequence alignments and estimation of genetic distances, was performed with Clustal W in the Molecular Evolutionary Genetics Analysis software (MEGA, version 6.0) [16]. Phylogenetic trees were constructed using the neighbor-joining method in MEGA 6 [17].
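To make the notion of a genetic distance between aligned sequences concrete, the following Python sketch computes the uncorrected proportion of differing sites (p-distance) for each pair of sequences. It is an illustration only: the distance model used in MEGA for this study is not stated here, so the choice of p-distance is an assumption, and the short toy sequences and their names are hypothetical and are not taken from ROK62 or any reference strain.

```python
from itertools import combinations


def p_distance(seq_a, seq_b):
    """Proportion of differing sites between two aligned sequences.

    Columns in which either sequence has a gap ('-') are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differing = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue
        compared += 1
        if a != b:
            differing += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return differing / compared


# Hypothetical toy alignment (not real SaV data).
TOY_ALIGNMENT = {
    "strain_A": "ATGGCTTCCA",
    "strain_B": "ATGGCATCCA",
    "strain_C": "ATGACATC-A",
}

if __name__ == "__main__":
    for (name_a, seq_a), (name_b, seq_b) in combinations(TOY_ALIGNMENT.items(), 2):
        print(f"{name_a} vs {name_b}: {p_distance(seq_a, seq_b):.3f}")
```

A matrix of such pairwise distances is the input from which a neighbor-joining tree is built; percent identity over the compared columns is simply one minus the p-distance.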

Results
The SaV RNA was extracted from a stool sample collected and provided by the Waterborne Virus Bank (WAVA, Seoul, South Korea). The isolated SaV strain, designated as ROK62, had a total length of 7429 nucleotides (nt). The complete genome sequence of ROK62 was deposited in GenBank under accession no. KP298674. Its 5ʹ-UTR was 12 nt long, and its 3ʹ-UTR was 81 nt long. Its total length was found to be the same as that of the Mc114 virus and was 2 nt shorter than that of the Manchester virus. The locations and lengths of the ORFs and the VP1 region were found to be as follows: ORF1, 13-6855 (6843 nt); ORF2, 6852-7349 (498 nt); ORF3, 5180-5665 (486 nt); and VP1, 5170-6852 (1683 nt).
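The reported coordinates can be checked with simple arithmetic: for 1-based inclusive positions, the length of a region is end − start + 1, and a coding region should span a whole number of codons. The Python sketch below applies this to the coordinates listed above; the variable and function names are hypothetical, and the reading-frame offsets are computed relative to the ORF1 start purely for illustration.

```python
# Coordinates (1-based, inclusive) as reported for ROK62 in the text.
REGIONS = {
    "ORF1": (13, 6855),
    "ORF2": (6852, 7349),
    "ORF3": (5180, 5665),
    "VP1": (5170, 6852),
}


def region_length(start, end):
    """Length in nucleotides of a 1-based, inclusive interval."""
    return end - start + 1


def frame_offset(start, reference_start):
    """Reading-frame offset (0, 1, or 2) of a start position relative to a reference start."""
    return (start - reference_start) % 3


if __name__ == "__main__":
    orf1_start = REGIONS["ORF1"][0]
    for name, (start, end) in REGIONS.items():
        length = region_length(start, end)
        print(
            f"{name}: {length} nt, "
            f"whole codons: {length % 3 == 0}, "
            f"frame offset vs ORF1: {frame_offset(start, orf1_start)}"
        )
    # Nucleotides upstream of the ORF1 start form the 5' UTR.
    print(f"5' UTR length: {orf1_start - 1} nt")
```

With these coordinates, all four regions are multiples of 3 nt, VP1 lies in the same frame as ORF1, and ORF3 starts in a different frame within the VP1-coding region.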
In the phylogenetic analysis, ROK62 sequences were aligned and compared with other reported SaV sequences. In the phylogenetic tree, ROK62 was classified under the GI genogroup, closely resembling the Manchester virus and the Mc114 virus, which are SaV GI-1 members (Fig 1). Similarity with the Manchester strain (GenBank accession no. X86560) was confirmed using Basic Local Alignment Search Tool (BLAST) analysis, which revealed an identity of 94% (highest similarity). Max score, total score, and query coverage were also determined, and their values were 11271, 11271, and 100%, respectively, for ROK62 and the Manchester virus. ROK62 also showed 94% identity with Mc114. The identity was determined using whole-genome sequence BLAST (Table 2). All identity results were obtained at a query coverage greater than 99%.

Discussion
SaV is one of the important causal agents of acute gastroenteritis worldwide. It mostly infects children but can also infect adults [18], and infection can occur during any season [19,20].
Although the occurrence rate of SaV in South Korea reported in 2012 was not high (0.1%) [21], SaV infection has been increasing globally. For example, the rate of SaV-positive gastroenteritis outbreaks was reported to be as high as 8% according to studies in 2000-2012 in Japan [22]. Moreover, there have been steady occurrences of SaV infections in other Asian countries, including China, Thailand, Taiwan, and Hong Kong. SaV infections have also been reported in European countries such as Germany, Sweden, and the Netherlands, where the rates of SaV-positive gastroenteritis outbreaks were in the range of 1.3-4% [23]. This is the first study to determine the whole-genome sequence of SaV from a patient with acute gastroenteritis in South Korea. The SaV strain ROK62, which was detected in South Korea, belongs to GI-1 and showed no intra- or inter-genogroup recombination between the nonstructural protein-encoding region and the VP1-encoding region. ROK62 is very similar to the Sapporo (Hu/GI/Sapporo/MT-2010/1982, HM002617) strain, the prototype strain first reported from an outbreak in Sapporo, Japan, in 1982 [24][25][26][27]. Phylogenetic analysis showed that the strain most closely resembling ROK62 is the Manchester strain (Sapporo virus-Manchester/UK, X86560), which was detected in the United Kingdom in 1993 and was the first SaV to have its complete genome sequenced [28,29]. The genomic organization of ROK62, including the location and length of ORFs, VP1, and VP2, was the same as that of the Manchester strain. The Mc114 (Sapovirus Mc114/JPN, AY237422) and N21 (Sapovirus N21/THA, AY237423) strains were also very similar. Periodic monitoring of SaV is needed to keep track of the dynamic changes in genogroups and genotypes, as diverse predominant genogroups and genotypes have been reported within the same geographical area [30][31][32].
Phylogenetic analysis of the currently circulating SaVs is necessary in order to remain updated regarding the rapid evolution of SaV strains. Around 2007, GIV.1 was the predominant SaV strain detected in Japan, Canada, the United States, and Europe, and therefore surveillance was considered important not only at the national level but also at the international level [32][33][34][35][36]. This underscores the importance of international cooperation in the form of information exchange among nations, in addition to national surveillance, for the prevention of epidemics. Through comparative analysis using more whole-genome sequencing data from South Korea and neighboring countries, detection kits for identifying the currently predominant strains and for predicting future predominant strains can be developed. Therefore, we surmise that this study will prove valuable not only for basic epidemiological research but also for the promotion of public health.