Genetic characterization and phylogenetic analysis of porcine epidemic diarrhea virus in Guangdong, China, between 2018 and 2019

Porcine epidemic diarrhea virus (PEDV), a leading cause of piglet diarrhea outbreaks, poses a significant danger to the swine industry. The aim of this study was to investigate the epidemic characteristics of PEDV that was circulating in Guangdong province, one of China’s major pig producing provinces. Clinical samples were collected from eight pig farms in Guangdong province between 2018 and 2019 and tested for the major porcine enteric pathogens, including PEDV, transmissible gastroenteritis virus (TGEV), Swine enteric coronavirus (SeCoV), Swine acute diarrhea syndrome coronavirus (SADS-CoV), porcine deltacoronavirus (PDCoV), and porcine rotavirus (RV). As a result, only PEDV and RV were detected at a rate of 47.0% (16/34) and 18.6% (8/34), respectively. Coinfectoin with PEDV and RV occurred at a rate of PEDV 12.5% (2/16). Subsequently, the full-length S gene sequences of 13 PEDV strains were obtained, and phylogenetic analysis suggested the presence of GII-c group PEDV strains in this region (non-S-INDEL). Two novel common amino acid insertions (55T/IG56 and 551L) and one novel glycosylation site (1199G+) were detected when the CV777 and ZJ08 vaccine strains were compared. Furthermore, intragroup recombination events in the S gene regions 51–548 and 2478–4208 were observed in the PEDV strains studied. In summary, the observations provide current information on the incidence of viral agents causing swine diarrhea in southern China and detailed the genetic characteristics and evolutionary history of the dominant PEDV field strains. Our findings will aid in the development of an updated vaccine for the prevention and control of PEDV variant strains.


Introduction
Porcine epidemic diarrhea, which is characterized by vomiting, acute watery diarrhea, dehydration, and weight loss in pigs, is a highly contagious and acute infectious disease caused by the viral agent prevalence in piglets suffering from acute diarrhea in Guangdong, China, and to provide updated information for genetic characterization of PEDV field strains.

Ethics statement
This study was approved by the Research Ethics Committee of the College of Life Science and Engineering, Foshan University. Written informed consent was obtained from all owners whose animals were used in the study.

Sample collection and cDNA synthesis
During 2018-2019, 19 feces and 15 fecal swabs were collected from diarrheal pigs in eight Guangdong province swine farms ( Table 1). The pigs did not receive vaccines against PEDV, TGEV or RV. The samples were suspended in phosphate-buffered saline (PBS, pH = 7.4) and then clarified for 10 min at 4,000 rpm using centrifugation. The viral RNA was extracted using the Body Fluid Viral DNA/RNA Miniprep kit (Axygen, China) as directed by the manufacturer. Maxima H Minus First Strand cDNA Synthesis Kit was used to reverse transcribe viral RNA (Thermo Fisher Scientific, USA).

Virus detection
The presence of swine alphacoronaviruses, such as PEDV, Transmissible gastroenteritis virus (TGEV), Swine enteric coronavirus (SeCoV), and Swine acute diarrhea syndrome coronavirus (SADS-CoV), was tested using a universal primer (Swine Cov F: 5'-AAACTGGAAYTTCASM TGG-3'; Swine Cov R: 5'-ACATARWAAGCCCAWC) designed by this study. Furthermore, the porcine deltacoronavirus (PDCoV) was detected using a primer set described previously by Sun et al. [26]. RV VP6 F: 5'-GAAACGGAATAGCTCCACAAT-3' and RV VP6 R: 5'-GAAT AATCAAATCCAGCCACC-3' are primers. The presence of porcine rotavirus was detected by targeting the VP6 gene with an expected size of 271 bp. Thermo Fisher Scientific's Phusion Hot Start II High-Fidelity PCR Master Mix (Thermo Fisher Scientific, USA) was used to amplify the M gene of swine alphacoronavirus and the S gene of PDCoV, which have expected sizes of 547 bp and 1763 bp, respectively. Table 1 lists the primers used in this study. The PCR products were purified using the Gene JET Extraction Kit from Thermo Fisher Scientific (Thermo Fisher Scientific, USA) and then subcloned into the pMD-18T vector (Takara, Japan). Sanger sequencing (Sangon Biotech, China) was used to obtain the M gene sequences, which were then BLAST searched against the GenBank database.

PEDV S gene sequencing
The M gene positive samples were subjected to obtain the full-length sequence of the S gene. Two primers sets (PEDV-S1F, 5'-ATGACGCCATTTGTGGTTTTTC-3' PEDV-S1R, 5'-GCCAGACTGAGATGGGACG-3'; PEDV-S2F:5'-TGGCAGTATTGGCTACGTCC-3' PEDV-S2 R:5'-TGACGACTGTGTCAATCGTGT-3') ( Table 2) based on the conserved region of PEDV genome were designed to amplify the full-length sequence of S gene of PEDV. The S gene of PEDV was amplified using the Phusion Hot Start II High-Fidelity PCR Master Mix (Thermo Fisher Scientific, USAPEDV). Pre-denaturation at 98˚C for 30 s, followed by 35 cycles of denaturation at 98˚C for 10 s, annealing at 53˚C for 30s, and extension at 72˚C for 2 min, followed by extension fully at 72˚C for 10 min. Table 2 lists the primers available. Gene JET Gel Extraction Kit (Thermo Fisher Scientific, USA) was used to purify the amplified products, which were then subcloned into the pMD-18T vector (Takara, Japan) and sequenced using the Sanger method (Sangon Biotech, China). The sequences were assembled using MEGA. 7(Version 7.0.26).

Phylogenetic analysis
The S gene sequences of 98 PEDV representative strains (Table 3) were extracted from the GenBank database for phylogenetic analysis to understand the evolution of the most common PEDV strains in Guangdong, China. The majority of the reference sequences were found in Asia, Europe, and North America. The MAFFT (Multiple Alignment using Fast Fourier Transform) embedded in the UGENE software was used to align multiple sequences (Version 36.0). The nucleotide (nt) and amino acid (aa) homology of S genes among the 13 strains were calculated using the Geneious software (Version 11.0.9) after alignment. The phylogenetic trees of S genes were constructed using the maximum likelihood (ML) method with 1,000 bootstrap replicates in IQ-TREE (Version 1.6.12) based on representative PEDV strains deposited in the GenBank. The phylogenetic tree was further annotated by FigTree (Version 1.4.3).

Recombination analysis and N-linked glycosylation prediction
The evolution of coronaviruses, including PEDV, was aided by genome recombination. The recombination detection program (RDP v5) was used to determine the recombination event in the PEDV S gene, which included ninedetection algorithms (RDP, GENECONV, Bootscan, Maxchi, Chimaera, SiSscan, PhylPro, LARD, and 3Seq) [27]. The detection of potential  recombinants was done using a P<0.01 threshold. As previously stated, N-linked glycosylation was predicted [28].

Sample screening and sequencing
Sixteen of the thirty-four field samples (47.0%) were found to be positive for PEDV. In the fecal samples or fecal swabs, the TGEV and PDCoV were not found. From fecal samples and fecal swabs, seven and nine PEDV positive samples were identified, respectively. Based on the RT-PCR results, eight samples (23.5%) were tested to be porcine rotavirus (RV) positive. The co-infection rate of PEDV and RV was 12.5% (2/16) ( Table 1). The complete S gene and a portion of the M gene were amplified and sequenced using Sanger sequencing. GDsg01-GDsg13 are the names of the 13 S genes that were discovered. The S genes from the five swine farms ranged in length from 4158 to 4164 nucleotides (nt), with nt and amino acid (aa) homology of 97.09%-99.95% and 96.77% -99.79%, respectively (S1 Fig). The sequences were determined using Sanger sequencing and deposited in GenBank with accession No. MW478760-MW478772 respectively.

Phylogenetic analysis of S gene
A phylogenetic tree was constructed using full-length S genes from the 98 reference PEDV strains available in GenBank, and the 13 S-gene sequences obtained in this study to understand the phylogenetic relationship of these PEDV strains. PEDV strains were divided into two categories: traditional G1 and variant G2. The CV777 and SM98 strains were in the G1-a group. The attenuated vaccine strains were found in the G1-b group (CV777 and DR13). The 13 PEDV strains reported in this study had a strong association with the GII-c subgroup, according to our phylogenetic analysis (Fig 1).

Sequence comparative analysis of S gene
The S gene sequences were compared with classic strains and vaccine strains to further investigate the genetic characteristics of the 13 detected strains. As a result, the 13 PEDV strains shared homology with CV777 (KT323979), ZJ08, and AJ1102 (JX188454) of 93.4-93.8%, 93.3-93.6%, and 96.9-97.8%, respectively (Table 4). Compared with ZJ08, a total of 104 aa mutations were observed in the 13 PEDV strains of this study. S1 proteins exhibited 72.1% (75/104) of aa mutations and were majorly distributed in the S1-NTD and S1-CTD domains. The 13 strains shared a common aa deletion ( 163 DI 164 ) and five common aa insertions ( 55 T/IG 56 , 59 QGVN 62 , 136 N, 153 H, and 551 L) (Fig 2A) compared to the reference strains CV777 and ZJ08. In addition, a high polymorphism (>30% in a specific aa position) of mutations (N = 19) such as 62 Y! 66 H/Y, 135 N! 135 N/D was observed in the S1 protein, especially the S1-CTD region (Fig 2B). In addition, two novel common mutations, including A520S and G612V, were identified in this study. The detailed list of aa mutations in the S protein of the 13 PEDV strains in comparison with vaccine strain ZJ08 is shown in Table 5.

PLOS ONE
Genetic characterization and phylogenetic analysis of PEDV

S gene recombination analysis
We performed a recombination analysis based on the 13 PEDV strains collected in this study as well as the 98 reference strains described above to understand the recombination events that occurred during the evolution of the PEDV strains circulating in Guangdong, China. The position 2478-4208 of the S gene of GDsg12 strain was predicted as a recombinant between KP765609 (major parent, GII-a) and KM609205 (minor parent, GII-a) (Fig 3A), which was supported by 6 detection methods (RDP, P-values � 1.

N-linked glycosylation prediction
The N-linked glycosylation sites were predicted based on the consensus N-X-S/T glycosylation motif. As a result, the 13 PEDV strains had 28-29 predicted N-glycosylation sites (G + ) and showed a more similar pattern to CV777 than ZJ08 (Fig 4). Two (131 G-, 235 G-) and one common loss of glycosylation sites (235 G-) were observed when compared with the reference strains ZJ08 and CV777, respectively. In contrast, the PEDV strains reported in this study gained three (302 G+ , 1199 G+ , and 1264 G+ ) and one (1199 G+ ) potential glycosylation sites compared with the ZJ08 and CV777, respectively. In addition, 61.5% (8/13) of PEDV strains reported in this study lost the glycosylation site at residue 725.

Discussion
Several coronaviruses, including PEDV, TGEV, SADS-CoV, SeCoV, and PDCoV, have been identified in piglets suffering from acute diarrhea and vomiting [29][30][31][32]. CoV entry is mediated by the S glycoprotein, which is a critical factor in determining the virus's tissue tropism and antigenicity. Additionally, the S gene of the coronavirus is prone to change through recombination or accumulation of point mutations, which can result in the virus losing its antigenicity and even the vaccine failing to work [33][34][35]. The molecular characterization of PEDVs is a major focus of swine diarrhea virus research due to the virus's high prevalence. During 2018-2019, we investigated the presence of swine diarrhea virus in eight pig farms in Guangdong, China, and then focused on the genetic characterization of PEDV. Our findings supported previous research [36], which indicated that PEDV is a leading cause of swine acute diarrhea in south China. We reported a detection rate of 47.0% for PDEV strains in fecal samples and fecal swabs collected from diarrheal pigs in eightswine farms. The reduction in TGEV reported in that study [36] is in agreement with our results. The gradual disappearance of TGEV has been attributed to the spread of porcine respiratory coronavirus, which conserved the majority of antigenic sites and caused a cross-protection against TGEV [37]. PDCoV, a newly identified member of the viral agents that cause swine diarrhea, was previously considered to be the second most prevalent swine diarrhea pathogen, following PEDV [38]. Sun et al. MUSCLE was used to align the sequences, and Geneious software (Version 11.0.9) was used to visualize them. (A) The common insertions (red box) and deletions (blue box) of amino acid (aa) mutations compared with the reference strains ZJ08 and CV777. (B). The regions in the S1 protein with a relative high polymorphism of mutations. (C) The predicted Nlinked glycosylation sites of reference strain (CV777 and ZJ08), and 13 PEDV strains collected in this study. MUSCLE and the Geneious software were used to align the sequences (Version 11.0.9). The purple arrow represents the predicted N-linked glycosylation site based on the consensus N-X-S/T (X can be any amino acid except proline) glycosylation motif.
https://doi.org/10.1371/journal.pone.0253622.g002 previously reported the genetic characterization of PDCoV in Shandong province during the same time period as this study and found a 50% co-infection rate of PDEV and PDCoV [26]. However, PDCoV was not detected in our study. Those observations indicated that the molecular epidemiology of the swine diarrhea virus varies greatly across China. Rotaviruses (RV), which belong to the Reoviridae family, have been identified as a major cause of viral gastroenteritis in young animals, such as piglets [39,40]. We found an 18.6% detection rate for RV and a 12.5% co-infection rate for PEDV and RV. Surprisingly, a recent survey of pig farms in Brazil found porcine rotavirus B to be the primary viral agent (71.1%) in newborn piglets with acute viral gastroenteritis. In contrast, another study found that RT-PCR had a low detection rate of porcine RV (3%). The detection rate of porcine RV; however, varies depending on the detection method [41]. In this study, all PEDV stains were found to be clustered into GII-c (non-S-INDEL) subgroups. In a recent study, Tian et al. found that PEDV strains with the S gene of the GII-c subgroup were the most prevalent in Sichuan [25]. Furthermore, the nt identity of the PEDV strains in this study was similar to vaccine strains CV777 and ZJ08, but slightly   The sequences were aligned using MUSCLE with the Geneious software (Version 11.0.9). The purple arrow represents the predicted N-linked glycosylation site based on the consensus N-X-S/T (X can be any amino acid except proline) glycosylation motif.
https://doi.org/10.1371/journal.pone.0253622.g004 higher than AJ1102 to. A genomic variation hotspot was also discovered in the S1-CTD region. It's been suggested that the surface of PEDV's S protein PEDV contains four major epitopes [42,43]. Our findings revealed a number of novel common mutations, including A520S and G612V. These aa changes may have an effect on the virus's antigenicity, leading to immune escape and vaccine failure. It's worth noting that all 13 PEDV strains have the same newly discovered B-cell epitope ( 722 SSTFNSTREL 731 ) [42] of S protein PEDV. In addition to the previously reported common aa deletion ( 163 DI 164 ) and aa insertions ( 59 QGVN 62 , 136 N, and 153 H), we report two new common aa insertions for PEDV strains circulating in south China: 55 T/ IG 56 and 551 L. Nevertheless, more research is needed into the impact of those common mutations on virus characterization, such as antigenicity and pathogenicity.
In addition to the accumulation of point mutations, homologous recombination among members of the same genus is a common way for the genetic evolution of coronaviruses. The recent SARS-CoV-2 outbreak was thought to be the result of cross-species recombination between bat and pangolin coronaviruses [44,45]. Similarly, PEDV is intensively recombined between other members of the Alphacoronavirus, such as TGEV. In 2016, Italy [37], Germany [46] and Slovakia [47] reported the discovery of a novel swine enteric coronavirus with a backbone derived from TGEV and the S gene derived from PEDV [47]. Notably, a recent retrospective study indicated that the recombinant SeCoV circulating in Spain may have been misidentified as PEDV using S-protein or S-gene assays [48]. Our studies indicated that recombination occurred in both the S1 and S2 regions of GII-c PEDV strains, as supported by at least six highly reliable methods. The recombination events could be a result of pigs being transported and traded frequently in China.
In conclusion, our findings revealed that PEDV and porcine RV were the two main viral agents responsible for the outbreak of diarrhea on swine farms in China's Guangdong province. A number of novel mutations were discovered, including common insertions like 55 T/ IG 56 and 5 51 L. In addition, when compared to the vaccine strain, one common loss of glycosylation site (235 G-) was observed. Intragroup recombination events were discovered in the S gene of the PEDV strains studied. Our findings highlight the critical need for the development of novel vaccines to combat recent new PEDV variants.