Immunoinformatics approach for predicting epitopes in HN and F proteins of Porcine rubulavirus

Porcine rubulavirus (PRV), which belongs to the family Paramyxoviridae, causes blue eye disease in pigs, characterized by encephalitis and reproductive failure in newborn and adult pigs, respectively. There is no effective treatment against PRV and no information on the effectiveness of the available vaccines. Continuous outbreaks have occurred in Mexico since the early 1980s, causing serious economic losses to pig producers. Vaccination can be used to control this disease. Searching for effective antigen candidates against PRV, we first sequenced the PAC1 F protein and then used various immunoinformatics tools to predict B-cell and T-cell antigenic determinants in the two glycoproteins of the virus (HN and F proteins). Finally, we used AutoDock Vina to determine the binding energies. We obtained the F gene sequence of a PRV strain collected in the early 1990s in Mexico and compared its amino acid profile with previous and more recent strains, obtaining an identity of 97.78 to 99.26%. For the F proteins, seven linear B-cell epitopes, six conformational B-cell epitopes and twenty-nine T-cell MHC class I epitopes were predicted. For the HN proteins, sixteen linear B-cell epitopes, seven conformational B-cell epitopes and thirty-four T-cell MHC class I epitopes were predicted. The ATRSETDYY and AAYTTTTCF epitopes of the HN protein might be important for neutralizing the viral infection. We determined the in silico binding energy between the predicted epitopes on the F and HN proteins and swine MHC-I molecules. The binding energy of these epitopes ranged from -5.8 to -7.8 kcal/mol. The present study aimed to assess the use of HN and F proteins as antigens, either as recombinant proteins or as a series of peptides that could activate different responses of the immune system. This may help identify relevant immunogens, saving time and costs in the development of new vaccines or diagnostic tools.


Introduction
The present study determined the F gene sequence of a PRV isolate from the early 1990s and compared this sequence with that of other isolates. We analyzed the F and HN sequences in order to assess the conservation of predicted antigenic determinants for B-cells (humoral immune system) and T-cells (cell-mediated immune system). In addition, predicted T-cell epitopes were docked with a pig MHC-I molecule (SLA-1*04:01) in order to estimate the binding energy between the peptide and the MHC-I molecule.

Virus culture
African green monkey kidney cell line (Vero, ATCC CCL-81) was cultured in Eagle's Minimum Essential Medium (MEM) supplemented with 5% fetal bovine serum, 100 U/ml penicillin and 100 μg/ml streptomycin. The Porcine rubulavirus strain PAC-1 (Michoacan, Mexico, 1990) was inoculated into confluent cell cultures and incubated for 1 h at 37˚C. The supernatant was discarded and the cells were washed with PBS. Fresh DMEM was added to the cells, which were then incubated for 72 h until cytopathic effects were apparent. Supernatants were clarified by centrifugation at 3,200 rpm for 30 min. Total RNA was extracted from the infected supernatants using TRIzol Reagent (Thermo Fisher Scientific) according to the manufacturer's instructions.

Informatic analysis of the structural proteins of PRV
The protein sequences of the PRV strains were obtained from GenBank and used for different analyses. The sequence of the LPMV virus (1984, Accession: Y10803) was used as reference to analyze the antigenic and structural properties of the proteins under study. The physicochemical properties of the structural proteins of PRV, such as molecular weight, aliphatic index, extinction coefficient, theoretical pI, hydropathy and amino acid composition, were determined using the ProtParam server (https://web.expasy.org/protparam/). The antigenicity of the proteins was determined using the VaxiJen v2.0 server (http://www.ddg-pharmfac.net/vaxijen/VaxiJen/VaxiJen.html), which makes an alignment-independent prediction based on the physicochemical properties of the proteins [28]. The target organism selected in the software was "virus", with a default threshold of 0.4. This prediction led to the selection of antigenic proteins present in PRV for further analysis.
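Two of the ProtParam quantities listed above can be reproduced by hand, which helps when checking server output. The sketch below is not the ProtParam implementation; it assumes the standard Kyte-Doolittle hydropathy scale for the grand average of hydropathy (GRAVY) and the usual aliphatic index formula based on the mole percentages of Ala, Val, Ile and Leu. The function names are hypothetical, and the peptide is used only as toy input.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(seq):
    """Grand average of hydropathy: mean Kyte-Doolittle value per residue."""
    return sum(KYTE_DOOLITTLE[aa] for aa in seq) / len(seq)


def aliphatic_index(seq):
    """Aliphatic index from the mole percentages of Ala, Val, Ile and Leu."""
    n = len(seq)

    def mole_percent(aa):
        return 100.0 * seq.count(aa) / n

    return (mole_percent("A")
            + 2.9 * mole_percent("V")
            + 3.9 * (mole_percent("I") + mole_percent("L")))


# Toy input: an F-protein epitope quoted in this article, not a full protein.
peptide = "NRTYGPPAYVPPDNIIQS"
print(round(gravy(peptide), 3), round(aliphatic_index(peptide), 2))
```

For full-length proteins the same functions apply unchanged; non-standard residue codes would raise a `KeyError` in `gravy`, which is a deliberate simplification.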

Prediction of cytotoxic T-cell epitopes
Cytotoxic T-cell epitopes were predicted using the NetMHCpan 4.0 server (http://www.cbs.dtu.dk/services/NetMHCpan/) [36], which predicts the binding of peptides to any known MHC molecule using artificial neural networks (ANNs). The method is trained on a combination of more than 180,000 quantitative binding data and mass spectrometry-derived MHC eluted ligands. We used the swine leukocyte antigen alleles SLA-1:0101, SLA-1:0401 and SLA-1:0801, which are widely distributed in swine populations [37]. Peptide length was set to nonamers for all the selected epitopes. The rank threshold for a strong binder was 0.5%; for a weak binder, 2%. All peptides predicted by the NetMHCpan 4.0 server were analyzed using ToxinPred (https://webs.iiitd.edu.in/raghava/toxinpred/design.php), a tool that predicts whether a peptide is toxic and could damage cells [38]. For the docking simulation study, we used the crystal structure of the SLA-1:0401 molecule (PDB ID: 3QQ3) [39]. The influenza epitope, which was complexed in the binding groove of SLA-1:0401, was removed using AutoDockTools. Prior to the docking study, the nonamers predicted by the NetMHCpan 4.0 server were optimized using PEP-FOLD 3.5 [40]. The docking simulation was carried out using AutoDock Vina [41].
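The nonamer scan and the binder thresholds described above can be sketched as follows. This is only an illustration of the bookkeeping, not of NetMHCpan itself: the function names are hypothetical, the rank values would come from the server output, and treating the thresholds as strict upper bounds is an assumption.

```python
def nonamers(protein, k=9):
    """Return (1-based start position, peptide) for every k-mer in the protein."""
    return [(i + 1, protein[i:i + k]) for i in range(len(protein) - k + 1)]


def classify_binder(percent_rank, strong=0.5, weak=2.0):
    """Label a peptide from its percentile rank (lower rank = stronger binder).

    Assumption: thresholds are applied as strict upper bounds.
    """
    if percent_rank < strong:
        return "strong"
    if percent_rank < weak:
        return "weak"
    return "non-binder"


# Toy sequence and made-up ranks, for illustration only.
for start, peptide in nonamers("MKTAYIAKQRQ"):
    print(start, peptide)
print(classify_binder(0.3), classify_binder(1.2), classify_binder(7.5))
```

A protein of length n yields n - 8 overlapping nonamers, so the 541-residue F protein would give 533 candidate peptides per allele.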

Phylogenetic analysis of the F protein of PRV
The complete coding sequence of the F protein of PAC1 (541 aa) was amplified by RT-PCR and cloned into the plasmid pJETForf. This sequence was deposited in the NCBI GenBank (MK984607) and compared to twelve other F sequences in GenBank. The PAC1 F amino acid sequence has 97.78 to 99.26% identity with other PRV F sequences and is clustered into the group of the reference strain LPMV/1984 (Fig 1), with which it has an identity of 99.26%. Four amino acid changes were observed when compared with the reference sequence of LPMV. The F protein of the Michoacan/2013 isolate had the lowest identity with PAC1 (97.78%).
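The identity values above follow directly from counting matching positions: four changes in a 541-residue protein leave 537 identical positions, and 537/541 is 99.26%. A minimal sketch of that calculation is shown below. It assumes the two sequences are already aligned and of equal length, with no gaps; the function name and the toy sequences are hypothetical.

```python
def percent_identity(seq_a, seq_b):
    """Percentage of identical positions between two pre-aligned sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return 100.0 * matches / len(seq_a)


# Toy stand-ins for a 541-residue protein that differs at four positions.
reference = "A" * 541
variant = "G" * 4 + "A" * 537
print(round(percent_identity(reference, variant), 2))  # 99.26
```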
The sequence analysis of PAC1 in SignalP-5.0 found a signal peptide with a cleavage site between positions 22 and 23. The NetNGlyc 1.0 Server showed five potential N-glycosylation sites (four of them above the threshold of 0.5). The SMART server found two putative transmembrane domains at positions 106-128 and 490-512; the first one corresponded to the fusion peptide present in paramyxoviral F proteins [42].

Immunogenic and physicochemical characterization of the structural proteins of PRV
The antigenicity of the viral proteins was determined using the VaxiJen v2.0 server, selecting "virus" as target organism and a threshold of 0.4 (default) [28]. This server predicts an overall antigenicity score for each sequence (S1 Table). We used the prototype strain LPMV/1984 as a model to present the analysis results. Of all the sequences evaluated, only the P protein was not predicted to be an antigen (0.3560). The HN and F membrane glycoproteins were predicted as the most antigenic proteins (0.5271 and 0.5154, respectively); these sequences were subjected to further analysis.
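The selection step above is a simple threshold filter on the overall antigenicity score. The sketch below uses only the three scores quoted in this section; the function name is hypothetical, and treating a score equal to the threshold as antigenic is an assumption.

```python
VAXIJEN_THRESHOLD = 0.4  # default threshold for the "virus" model, as stated above

# Scores quoted in the text for the LPMV/1984 strain.
scores = {"HN": 0.5271, "F": 0.5154, "P": 0.3560}


def probable_antigens(scores, threshold=VAXIJEN_THRESHOLD):
    """Return protein names at or above the threshold, highest score first."""
    kept = {name: s for name, s in scores.items() if s >= threshold}
    return sorted(kept, key=kept.get, reverse=True)


print(probable_antigens(scores))  # ['HN', 'F']
```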

Prediction of continuous B-cell epitopes for PRV HN and F proteins
Continuous B-cell epitopes were predicted with the BepiPred 1.0 server (IEDB) using a threshold of 0.350. B-cell epitopes have variable length; in the present study, we focused on linear peptides with a minimum length of 5 residues (Table 1). Eight unique linear epitopes, with 5 residues or more, were predicted for the F protein (LPMV strain), while sixteen linear epitopes were predicted for the HN protein (LPMV strain). These epitopes were evaluated using the Chou & Fasman beta-turn prediction tool, Emini surface accessibility, the Kolaskar & Tongaonkar antigenicity tool, Parker's hydrophilicity index and epitope conservancy analysis, with a threshold of 1.0. For the F protein, three epitopes were above the threshold level of 1.0, with a conservancy of 80% (LASPDQS, PQLTNPAL and NRTYGPPAYVPPDNIIQS). For the HN protein, only one epitope was above the threshold level of 1.0, with a conservancy of 80% (PQFSQRAAASY). The conservancy results of the B-cell epitopes showed that most F epitopes were not affected by the mutations present in the F protein; in contrast, the HN epitopes were affected by these mutations, with some epitopes present in only 30.43% of the HN sequences.
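The length filter described above amounts to finding runs of consecutive residues whose per-residue score reaches the threshold and keeping runs of at least five residues. The sketch below illustrates that step under stated assumptions: the per-residue scores would come from the prediction server, the function name is hypothetical, and counting a score equal to the threshold as positive is an assumption.

```python
def linear_epitopes(sequence, scores, threshold=0.35, min_len=5):
    """Return (1-based start, peptide) for runs scoring >= threshold."""
    epitopes = []
    start = None
    # A sentinel score below any threshold closes a run that reaches the end.
    for i, score in enumerate(list(scores) + [float("-inf")]):
        if score >= threshold and start is None:
            start = i
        elif score < threshold and start is not None:
            if i - start >= min_len:
                epitopes.append((start + 1, sequence[start:i]))
            start = None
    return epitopes


# Toy sequence and made-up scores: one run of five and one run of four.
toy_sequence = "ABCDEFGHIJ"
toy_scores = [0.5] * 5 + [0.1] + [0.5] * 4
print(linear_epitopes(toy_sequence, toy_scores))  # [(1, 'ABCDE')]
```

The four-residue run at the end of the toy input is discarded, which mirrors the minimum-length criterion used for Table 1.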
The epitope sequences correspond to the amino acids of the LPMV/1984 strain. Conservancy (%) is defined as the fraction of protein sequences, among all PRV strains, that contained the epitope (http://tools.iedb.org/conservancy/).
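Conservancy as defined here can be computed by checking each strain's sequence for the epitope. The sketch below assumes exact matching, that is, an epitope counts as present only if it occurs as an identical substring; the function name and the toy sequences are hypothetical. A value of 30.43% would be consistent with, for example, 7 of 23 sequences, although the actual counts are not stated in this section.

```python
def conservancy(epitope, sequences):
    """Percentage of sequences that contain the epitope as an exact substring."""
    hits = sum(epitope in seq for seq in sequences)
    return 100.0 * hits / len(sequences)


# Toy data: 7 sequences carry the motif and 16 carry a single substitution.
toy_sequences = ["MKPQFSQRA"] * 7 + ["MKPQLSQRA"] * 16
print(round(conservancy("PQFSQ", toy_sequences), 2))  # 30.43
```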

Prediction of discontinuous B-cell epitopes for the HN and F proteins of PRV
The epitope sequences correspond to the amino acids of the LPMV/1984 strain. The ElliPro score of each epitope is defined as a protrusion index value averaged over epitope residues; values ≥0.5 are considered significant for a continuous epitope [35]. The structures of the HN and F proteins were predicted by homology with MODELLER and PHYRE2, using ab initio structure prediction algorithms for the transmembrane domains. The two models were validated using the RAMPAGE server. In the HN protein, 93.9% of the residues were in favored regions, 4.5% in allowed regions and 1.6% in outlier regions. In the F protein, 93.1% of the residues were in favored regions, 5.9% in allowed regions and 0.9% in outlier regions.
Discontinuous B-cell epitopes were predicted using the ElliPro server. Six epitopes were predicted for the F protein (S2 Table); the epitopes with the highest score (0.872) were located in the stalk of the F protein (Fig 2). Seven epitopes were predicted for the HN protein, four of them localized in the head domain of the protein.

Prediction of cytotoxic T-cell epitopes for the HN and F proteins of PRV
For cytotoxic T-cell epitopes, we used the NetMHCpan 4.0 server (DTU Bioinformatics), which predicts the binding of peptides to MHC-I molecules using artificial neural networks. The MHC-I alleles used for the analysis were SLA-1*01:01, SLA-1*04:01 and SLA-1*08:01. These alleles are widely distributed in the pig population [37,43]. S3 Table shows the peptides that were strong binders to the selected MHC-I molecules (the threshold for a strong binder was 0.5%; for a weak binder, 2%). Twenty-nine cytotoxic epitopes were obtained for the F protein and 34 for the HN protein. The putative immunogenicity of the peptides was assessed using the Class I Immunogenicity server (IEDB). Sixteen peptides from the F protein and twenty-two peptides from the HN protein were predicted to be immunogenic. ToxinPred is an in silico method that predicts whether a peptide is a toxin that can cause damage to cells; it uses a support vector machine (SVM) to predict toxicity along with mutations [38]. All peptides were predicted to be non-toxic. We selected the peptides that were recognized by two alleles, had positive immunogenicity and had a conservancy of 100% (S3 Table). The highest number of peptides was found in the F1 peptide fragment, which may thus be proposed as the immunogenic region of the protein. In the docking simulation, the box center coordinates of the binding groove of SLA-1:0401 were X = 18.213, Y = 2.001 and Z = 41.238, and the grid box size was X = 30, Y = 30 and Z = 30. Table 2 shows the binding energy values of the predicted epitopes for the SLA-1:0401 receptor. These values suggest that all the F protein epitopes fit into the binding groove of the SLA molecule (Fig 3). AQATAAVAL has a binding energy of -7.0 kcal/mol. The five HN epitopes also fit into the binding groove. FSQRAAASY has a binding energy of -7.8 kcal/mol.

PLOS ONE
Prediction of epitopes in HN and F proteins of Porcine rubulavirus
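The docking box reported in this section maps directly onto an AutoDock Vina configuration file. The sketch below writes such a file from Python, as one possible way to reproduce the setup rather than the authors' actual script. The file names are hypothetical placeholders, and only the box center and size are taken from the text; any other settings (for example exhaustiveness) are not reported here and are left at the Vina defaults.

```python
def vina_config(receptor, ligand, center, size, out):
    """Build the text of an AutoDock Vina configuration file."""
    lines = [
        f"receptor = {receptor}",
        f"ligand = {ligand}",
        f"center_x = {center[0]}",
        f"center_y = {center[1]}",
        f"center_z = {center[2]}",
        f"size_x = {size[0]}",
        f"size_y = {size[1]}",
        f"size_z = {size[2]}",
        f"out = {out}",
    ]
    return "\n".join(lines) + "\n"


config = vina_config(
    receptor="sla1_0401_receptor.pdbqt",  # hypothetical file name
    ligand="epitope.pdbqt",               # hypothetical file name
    center=(18.213, 2.001, 41.238),       # box center reported in the text
    size=(30, 30, 30),                    # box size reported in the text
    out="epitope_docked.pdbqt",           # hypothetical file name
)
print(config)
```

Saved to a file such as `conf.txt`, the configuration could then be run with `vina --config conf.txt`, once per peptide, and the best-scoring pose for each epitope would give the binding energy in kcal/mol.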

Discussion
Many emerging diseases have appeared in recent years. Zoonoses can be a major problem, causing previously unreported infections in humans. This is why it is so important to develop vaccines against viral diseases affecting animals in close contact with humans. Pigs are already known to transmit diseases such as swine influenza; thus, attention should be paid to other viruses that affect pigs, such as PRV. In fact, it has been speculated that PRV may originate from bats, like other paramyxoviruses that infect animals and humans [5]. Due to the persistence of blue eye disease (BED) outbreaks in Mexico and the lack of data on the effectiveness of existing vaccines, PRV infection is a major problem for pig farms in Mexico. For these reasons, we need to understand the behavior of antigenic determinants in the structural proteins of the virus and how mutations affect these determinants, in order to develop effective prevention strategies. The F protein is a surface protein with a high percentage of identity among all reported PRV strains (97.78 to 99.26%). We propose to consider the F protein as a possible antigen that could provide protection against different strains of PRV. No substitutions were found in the F protein cleavage site (HRKKR). Some reports suggest that the cleavage site in the F protein participates in the virulence of PRV [5,9]. There are been pathogens such as Ebola virus, Zika virus or the Oropouche virus [45][46][47]. The antigenic capacity of the structural proteins of the PRV reference strain LPMV was evaluated using the VaxiJen v2.0 server. F, HN and NP were predicted as the most antigenic structural proteins, with a score > 0.5. Other studies have used the VaxiJen server to select promising antigens against the Human Immunodeficiency Virus (HIV) and the Hepatitis C virus, choosing structural proteins with a VaxiJen score > 0.4 [48,49].
Antigenic determinants recognized by B-cells are important because they can induce memory protection, mediated by antibodies that act quickly against reinfection by PRV. In some paramyxoviruses, the presence of these antibodies is enough to neutralize the infection. There are reports of antibodies with neutralizing activity that can recognize the HN and F proteins of mumps virus (MuV), Newcastle disease virus, Measles virus or Nipah virus (F protein) [19,50,51]. The antibody memory response in pigs that survive a PRV infection appears to be long-lasting; that is why determining possible antigenic regions that can activate the B-cell response is so important [52]. The results of the present study predict that the HN protein contains more antigenic determinants for B-cells (16 epitopes) than the F protein (8 epitopes). BepiPred is a reliable tool that has already been used in immunoinformatics for the possible design of vaccines against emerging viruses such as SARS-CoV-2 [53]. For HN, only PQFSQRAAASY, and for the F protein, only LASPDQS, PQLTNPAL and NRTYGPPAYVPPDNIIQS met the criteria based on the evaluated parameters, that is, Kolaskar and Tongaonkar antigenicity, the Chou & Fasman method and Parker's hydrophilicity. These parameters have been used not only to predict which region of a protein is antigenic, but also to determine how some mutations can affect the antigenicity of an epitope [54]. The ATRSETDYY and AAYTTTTCF epitopes of the HN protein are peptides with an important immunogenic capacity, since they are recognized as antigenic determinants for B cells. Furthermore, it has already been reported in vitro that these epitopes are recognized by antibodies generated during PRV infection. Zenteno et al. (2007) studied some peptides that share sequences with two of the epitopes that we propose in the present study (ATRSETDYY and AAYTTTTCF). Those peptides were able to induce antibodies in mice and one of them was able to inhibit the hemagglutinating activity of PRV, which suggests that these peptides may be involved in the recognition of carbohydrates that are part of the receptor for the virus [55]. These findings suggest that antibodies directed against these epitopes could neutralize the infection.
In the selection of epitopes, priority was given to those with high conservation among the PRV strains. The antigenic determinants in the F protein are conserved compared to other known PRV sequences, which suggests that the F protein epitopes used as antigens could be very useful targets against different PRV strains. The relatively low conservation (30.43% was the lowest value) of some antigenic determinants of the HN protein (Table 1) may be related to the antigenic diversity reported in other studies on PRV [12,13]. As far as we know, this is the first report on how mutations in the HN protein sequence can directly affect the predicted antigenic determinants. Although there are few mutations in the HN protein [5], they affect the region comprising residues 435-509. The globular region of the HN protein contains most of the continuous and discontinuous antigenic determinants.
MHC-I antigenic determinants activate a cellular immune response. This type of response normally activates cytotoxic cells that lyse the cells infected by PRV. The occurrence of this response depends on the MHC alleles present in the host organism. One way to predict the behavior of these antigenic determinants is to evaluate their binding capacity to MHC-I and to evaluate their immunogenic capacity by molecular docking [47,56]. In pigs, the major histocompatibility complex, known as the swine leukocyte antigen (SLA), has an important role in mediating cellular immunity, which can eliminate viruses and recognize other antigens. However, the SLA is highly polymorphic, which can make it difficult to detect epitopes suitable for a vaccine. That is why the SLA-1*04:01 and SLA-1*08:01 alleles, which have been reported to be widely distributed in pigs, were used for the analysis of antigenic determinants [57,58]. The NetMHCpan 4.0 and Class I Immunogenicity servers are widely used tools for the design of vaccines that may activate the host cellular response mediated by MHC-I. There are a few reports of the use of these tools to predict epitopes for Hepatitis C virus or Herpes simplex virus vaccines [59,60]. These tools predicted sixteen peptides for the F protein and twenty-two peptides for the HN protein to be immunogenic. We only analyzed epitopes with 100% conservancy (4 and 5 epitopes for the F protein and the HN protein, respectively). The binding energies for these epitopes ranged from -5.8 to -7.8 kcal/mol. The binding energies of all predicted peptides were within a range similar to those reported in docking analyses with MHC-I molecules for other viruses, including the herpes simplex virus or the Saint Louis encephalitis virus [61,62].
An important result of the present work is that none of the predicted peptides was predicted to be toxic, which means that any of the predicted epitopes could be used. Prediction of peptide toxicity is based on the recognition of motifs that are present in proteins or peptides that are experimentally known to be toxic. In similar studies, some peptides with immunogenic capacity have been shown to be toxic and had to be discarded [63], but this feature is apparently not present in the selected HN and F peptides [14,55].
The NRTYGPPAYVPPDNIIQS peptide in protein F may also be important because, in addition to being considered an antigenic determinant for B cells and MHCI T cells, it has the theoretical capacity of inducing an immune response.
This type of study provides an overview of the possible elements that could be used to generate a PRV vaccine. One possible strategy is to generate a chimeric protein that contains the epitopes of the HN and F proteins in a single antigen. There are examples of chimeric proteins that are intended to be preventive treatments against HIV and the Influenza virus [64,65]. Zenteno et al. reported that it is possible to use peptides to induce a response that could affect proteins that participate in the infectious process of PRV. They also mentioned that, besides vaccines, immunogenic peptides can be used as rapid diagnostic tools for PRV [55]. Another approach is to generate recombinant HN and F proteins, either by expressing the whole protein or only the regions that concentrate the greatest number of B- and T-cell antigenic determinants, and thereby generate a divalent vaccine. In a previous work, our group showed that a recombinant HN protein (PAC1 strain) was capable of inducing the production of neutralizing antibodies in a murine model [14]. Some examples of this kind of vaccine are those used against the Human papillomavirus or the Ebola virus [66,67].

Conclusion
The need to prevent blue eye disease in pigs led us to study proteins or peptides with immunogenic capacity against PRV. Bioinformatics analysis allows the behavior of a protein in vivo to be estimated. In the present study, we sequenced the F protein of the PAC1 strain, phylogenetically analyzed the F protein of PRV and predicted potential antigenic determinants of the HN and F proteins of PRV. Antigenic determinants recognized by B cells and T cells were predicted, and the structures of the F and HN proteins were determined by homology modelling, which allowed us to predict conformational epitopes. This study could lead to the use of these two proteins as antigens, either as recombinant proteins or as peptides only, that could activate different responses in the immune system, which in turn could help reduce the time and costs involved in the development of new vaccines or diagnostic tools.
Supporting information S1