Association of IL-17F rs2397084 (E126G), rs11465553 (V155I) and rs763780 (H161R) variants with rheumatoid arthritis and their effects on the stability of protein

Interleukin-17F (IL-17F), a pro-inflammatory cytokine, has been shown to contribute to skeletal tissue degradation and hence chronic inflammation in rheumatoid arthritis (RA). In this study we used bioinformatics tools to analyze the effect of three exonic SNPs (rs2397084, rs11465553, and rs763780) on the structure and function of the IL-17F gene, and evaluated their association with RA in Pakistani patients. The predicted deleterious and damaging effects of the variants were assessed with PROVEAN, SNPs&GO, SIFT, and PolyPhen2. Their structural and functional effects on the protein were evaluated with I-Mutant, MutPred, and ConSurf. Three-dimensional (3D) models of the wild-type and mutant proteins were built with I-TASSER and compared using TM-align. A total of 500 individuals (250 cases and 250 controls) were genotyped by the Tri-ARMS-PCR method, and the resulting data were statistically analyzed under various inheritance models. Our bioinformatics analysis showed significant structural differences between the wild-type and mutant proteins (TM-scores and RMSD values were 0.85934 and 2.34 for rs2397084 (E126G), 0.87388 and 2.49 for rs11465553 (V155I), and 0.86572 and 0.86572 for rs763780 (H161R)), with decreased stability for the latter. Overall, these tools predicted that these variants are crucial in causing disease phenotypes. We further tested each of these single nucleotide variants for association with RA. Our analysis revealed a strong positive association between rs763780 and the risk of developing RA at both the genotypic and allelic levels.
The genotypic association was statistically significant [χ2 = 111.8; P value < 0.0001], as was the allelic association [OR 3.444 (2.539–4.672); P value 0.0008]. These findings suggest that the presence of this variant may increase susceptibility to RA. Similarly, we observed a significant difference in the distribution of rs11465553 at the genotypic level [χ2 = 25.24; P value = 0.0001]; however, this variant did not show a significant association with RA at the allelic level [OR = 1.194 (0.930–1.531); P value = 0.183]. The distribution of rs2397084 was more or less random across our sample, with no significant association at either the genotypic or the allelic level. Taken together, our association study and the in silico prediction of decreased IL-17F protein stability confirmed that two SNPs, rs11465553 and rs763780, are crucial to the susceptibility to RA in Pakistani patients.


Introduction
Rheumatoid arthritis (RA) is an inflammatory autoimmune disorder that mainly affects the joints and is associated with high levels of rheumatoid factor (RF) and anti-citrullinated protein antibody (ACPA) [1]. RA is a devastating disease that causes bone and cartilage deformities and systemic injuries if left untreated and unmanaged [2]. Its manifestations range from mild joint swelling to severe polyarthritis with erosion of bone or cartilage [3]. The prevalence of RA is relatively consistent across populations worldwide, at 0.5 to 1.0% [4]. Although its pathogenesis is still unclear, many genetic, environmental, hormonal, and infectious factors have significant roles in RA [5]. Approximately 50-60% of RA cases can be attributed to genetic factors [6]. These genetic factors include polymorphisms in cytokine receptors as well as in genes of other functional pathways. Several association studies and meta-analyses have linked about 150 genes/loci with RA in various populations [7][8][9][10][11]. Genetic variations in Th17 cytokines could affect the transcriptional regulation of these genes, which in turn could increase the susceptibility of patients to disease. For example, polymorphisms in Th17-related genes are associated with different complex diseases, including RA, Crohn's disease, multiple sclerosis, and juvenile idiopathic arthritis [12]. Nearly 20 genes are known to be closely associated with the development and activity of Th17 cells [13].
The IL-17 family of proinflammatory cytokines is made up of six distinct members: IL-17A, IL-17B, IL-17C, IL-17D, IL-17E, and IL-17F [14]. Whereas the genes encoding IL-17A and IL-17F co-localize on chromosome 6p12, the other members of the IL-17 family are distributed across different chromosomes [15][16][17]. The expression of IL-17A and IL-17F stimulates the generation of additional cytokines, chemokines, and antimicrobial peptides, leading to the degradation of skeletal tissues and chronic inflammation in affected individuals [15]. Both IL-17A and IL-17F are detectable in synovial tissue, and their presence has been implicated in the pathogenesis of rheumatoid arthritis (RA) [18]. In vitro evidence indicates that IL-17F may be involved in the regulation of angiogenesis, as well as in the production of specific cytokines from both endothelial cells (including CXCL1, ICAM1, IL-6, and IL-8) and epithelial cells (such as G-CSF) [19][20][21][22].
Recent evidence has revealed that the H161R variant functions as a natural antagonist of wild-type IL-17F. It binds to the IL-17F receptor without inducing downstream signaling, effectively blocking the induction of IL-8 [23,24]. IL-17F has been identified as a promising candidate gene in several inflammatory diseases, including rheumatoid arthritis, inflammatory bowel disease, asthma, Graves' disease, ulcerative colitis, and cancer. These findings highlight the potential involvement of IL-17F in the pathogenesis of these conditions and suggest that targeting this cytokine may hold therapeutic promise [25][26][27][28][29][30][31][32]. Numerous studies have demonstrated a clear association between IL-17 levels in serum and synovial fluid and markers of inflammation, including rheumatoid factor (RF), C-reactive protein (CRP), erythrocyte sedimentation rate (ESR), and the Disease Activity Score 28 (DAS-28), in patients with RA [33].
Because IL-17F is an important component of the inflammatory pathway, its single nucleotide variants have been the subject of association studies, and a link with RA has been reported for the rs2397084, rs11465553 and rs763780 variants in various populations [14][15][16][17][18][19]. Each of these variants lies in an exonic region of the IL-17F gene and results in a non-synonymous change in the encoded protein.
The epigenome encompasses a variety of mechanisms that modulate gene expression patterns and phenotypic outcomes in response to environmental factors, such as nutrition, pathogens, and climate. These mechanisms include DNA methylation, chromatin remodeling, histone tail modifications, microRNAs, and long noncoding RNAs. The complex interplay between the epigenome and environmental cues highlights the crucial role of epigenetic regulation in shaping biological processes and underscores its potential as a therapeutic target for a broad range of diseases [20,21]. The environment, genome and epigenome might be involved in multi-level interactions [22]. Additionally, epigenome variation appears to have an impact on health and productivity [23][24][25]. In eukaryotes, gene expression is regulated through complex, temporal, and multidimensional mechanisms [26]. In each type of tissue, only a small fraction of the genome is expressed, and gene expression varies with developmental stage; therefore, eukaryotes express genes differently depending on the tissue [27,28]. The expression of a gene is influenced by the amount of its gene products in the tissue where it is expressed, as well as in other tissues that contribute to the formation of the gene product [29]. The analysis of genes and proteins associated with diseases and important traits entails study at the cellular or chromosomal level [30].
In this study we applied various bioinformatics tools to determine the effect of rs2397084 (E126G), rs11465553 (V155I) and rs763780 (H161R) on the structure, stability and hence function of the IL-17F protein. Apart from the in silico analysis, we also performed a case-control study to determine the association between these three SNPs and RA in the Pakistani population.

Methodology
A workflow for the complete methodology is given in Fig 1.

Genetic analysis
For genetic analysis, 250 seropositive RA patients (187 females, 63 males) with a mean age of 43.5 (±14.5) years were recruited from the Department of Rheumatology, Lady Reading Hospital (LRH), Peshawar, Pakistan. The diagnoses were made by experienced rheumatologists following the guidelines of the American College of Rheumatology (ACR) [42]. Each patient was examined for the number of joints involved, extra-articular manifestations, and clinical features such as rheumatoid factor (RF), erythrocyte sedimentation rate (ESR) and ACPA. A control group of 250 healthy individuals (181 females, 69 males) was included for comparison. The mean age of the control group was 42 years (standard deviation ±12.6). Individuals with any other autoimmune complex disease or a family history of such illnesses were excluded. Prior to the study, written informed consent was obtained from all participants, and the study was conducted in accordance with the ethical guidelines of both Abdul Wali Khan University Mardan and LRH Peshawar, Pakistan. The clinical characteristics of the RA patients are detailed in Table 1.
A blood sample (5 mL) was taken from each participant and DNA was isolated by the phenol-chloroform method [43]. Three IL-17F SNPs (rs2397084, rs11465553, and rs763780) were genotyped using the Amplification Refractory Mutation System-Polymerase Chain Reaction (ARMS-PCR) with allele-specific primers (two forward and one reverse); details are provided in Table 2. The primer sets were as follows: 1) for IL-17F rs2397084 (T/C), F1: 5'-TCCGGACGACCAGGGTCC-3'; F2: 5'-CTCCGGACGACCAGGGTCT-3'; R: 5'-CCAGGCTGTGTGGCTCCAGAA-3'; 2) for IL-17F rs11465553 (A/G), F1: 5'-TGACTGTTGGCTGCACCTGCA-3'; F2: 5'-ACTGTTGGCTGCACCTGCG-3'; R: 5'-CTGTTTCCATCCGTGCAGGTC-3'; and 3) for IL-17F rs763780 (C/T), F1: 5'-ATATGCACCTCTTACTGCACAC-3'; F2: 5'-GATATGCACCTCTTACTGCACAT-3'; R: 5'-TACCCCTCGGAAGTTGTACAG-3'. PCR consisted of an initial denaturation at 94˚C for 5 minutes, followed by 35 cycles of denaturation at 94˚C for 30 seconds, annealing at 57-60˚C for 30 seconds, and extension at 72˚C for 1 minute, and a final extension at 72˚C for 7 minutes. The amplified products were separated on a 2% agarose gel, and the genotype of each sample was called by comparing its band pattern with known standards.
The data were subjected to a quality-control check using Hardy-Weinberg equilibrium (HWE), and the association of IL-17F rs2397084, rs11465553, and rs763780 with RA was tested using the statistical models. Statistical analyses were performed with both the Chi-square (χ2) and Fisher's exact tests, with a 95% confidence interval (95% CI). The Chi-square test was used for larger sample sizes to compare observed and expected frequencies, while Fisher's exact test was used for smaller sample sizes to calculate the exact probability of the observed frequency distribution under the assumption of independence. A P-value less than 0.05 was considered statistically significant in all analyses.
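The Hardy-Weinberg quality-control check described above can be illustrated with a minimal Python sketch. It assumes a biallelic SNP and a one-degree-of-freedom Chi-square goodness-of-fit test; the function name `hwe_chi_square` and the genotype counts in the usage lines are hypothetical and are not data from this study.

```python
import math

def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square goodness-of-fit test (1 df) for Hardy-Weinberg equilibrium.

    n_aa, n_ab, n_bb are observed counts of the two homozygous genotypes
    and the heterozygous genotype (hypothetical argument names).
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)      # frequency of allele A
    q = 1.0 - p                          # frequency of allele B
    expected = (p * p * n, 2 * p * q * n, q * q * n)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of the chi-square distribution with 1 df
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value

# Hypothetical control-group counts, for illustration only
chi2, p_value = hwe_chi_square(120, 100, 30)
print(f"chi2 = {chi2:.3f}, P = {p_value:.4f}")
```

A P-value below 0.05 in the control group would suggest a departure from HWE, which is commonly treated as a flag for possible genotyping error.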

Bioinformatics analysis
The results of the bioinformatics analyses are listed in Table 3. In PhD-SNP and SNPs&GO, SNPs with a prediction score >0.5 were considered deleterious. The scores for all three variants (E126G, V155I, and H161R) were less than 0.5, so they were evaluated as neutral. In SIFT, the cut-off value for the tolerance index (TI) was set to 0.05. V155I (0.12) was found to be tolerated, while E126G (0.02) and H161R (0.05) were evaluated as deleterious. PROVEAN (threshold value -2.5) predicted V155I (-0.926) as neutral and E126G (-5.786) and H161R (-2.620) as deleterious. PolyPhen2 (score range 0-1) predicted V155I (1) and E126G (0.999) as probably damaging and H161R (0.048) as benign. Furthermore, I-Mutant predicted that all three variants decreased the stability of the IL-17F protein [E126G (RI = 9); V155I (RI = 4); H161R (RI = 5)]. MutPred predicted structural and functional consequences including gain of intrinsic disorder, loss of allosteric site, creation of glycosylation and catalytic sites, and altered membrane protein, among others. Only H161R showed loss of strand, altered transmembrane protein, and altered protein stability. ConSurf predicted that E126G and V155I affect highly conserved, exposed and functional residues, while H161R affects a highly conserved and exposed but not functionally active residue. The 3D protein structures were generated with I-TASSER and subjected to TM-align analysis. Significant structural changes were determined from the TM-scores and RMSD values of the variants, which were 0.85934 and 2.34 for E126G, 0.87388 and 2.49 for V155I, and 0.86572 and 0.86572 for H161R, respectively. Finally, the protein structures were viewed and characterized in Chimera 1.11 and are shown in Fig 2.
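The SIFT and PROVEAN threshold rules quoted above can be applied mechanically, as in the following Python sketch. The scores and cut-offs are those reported in this section; treating a score equal to the cut-off as deleterious is an assumption inferred from H161R (SIFT 0.05) being reported as deleterious, and the function names are hypothetical.

```python
SIFT_CUTOFF = 0.05       # tolerance index cut-off stated in the text
PROVEAN_CUTOFF = -2.5    # PROVEAN threshold stated in the text

def sift_call(score, cutoff=SIFT_CUTOFF):
    # Assumption: scores at or below the cut-off are called deleterious
    return "deleterious" if score <= cutoff else "tolerated"

def provean_call(score, cutoff=PROVEAN_CUTOFF):
    # Assumption: scores at or below the threshold are called deleterious
    return "deleterious" if score <= cutoff else "neutral"

# Scores as reported in this section
scores = {
    "E126G": {"sift": 0.02, "provean": -5.786},
    "V155I": {"sift": 0.12, "provean": -0.926},
    "H161R": {"sift": 0.05, "provean": -2.620},
}

calls = {
    variant: (sift_call(s["sift"]), provean_call(s["provean"]))
    for variant, s in scores.items()
}

for variant, (sift, provean) in calls.items():
    print(f"{variant}: SIFT={sift}, PROVEAN={provean}")
```

Under these rules the two tools agree for all three variants, with only V155I called tolerated/neutral.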

Association analysis
The findings of the association analyses are given in Table 4, and the gel electrophoresis patterns are shown in Fig 3.
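For readers unfamiliar with how an allelic odds ratio (OR) and its 95% confidence interval are derived from a 2×2 allele-count table, a minimal Python sketch follows. The interval method used in this study is not stated; the sketch assumes the common Woolf (logit) method, and the function name and the allele counts are hypothetical.

```python
import math

def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio with a Woolf (logit) confidence interval.

    a: risk-allele count in cases      b: other-allele count in cases
    c: risk-allele count in controls   d: other-allele count in controls
    z: normal quantile (1.96 gives an approximate 95% interval)
    """
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper

# Hypothetical allele counts, for illustration only
or_value, lower, upper = odds_ratio_ci(20, 80, 10, 90)
print(f"OR = {or_value:.3f} ({lower:.3f}-{upper:.3f})")
```

An interval that excludes 1 corresponds to a significant allelic association at the chosen level; an interval that spans 1, as reported for rs11465553, does not.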
In-silico and computational methods have identified several variants that significantly affect the structure and function of certain proteins [52][53][54][55]. Therefore, we used multiple in-silico tools to assess the deleterious impact of the variants on the structure and function of the IL-17F protein. A CADD score of 30 places a variant within the top 0.1% of the most deleterious single nucleotide substitutions in the human genome; similarly, a CADD score of 20 places it within the top 1%. The CADD score of the E126G variant was 27, followed by V155I and H161R at 23 and 21, respectively. The MutPred 1.2 server predicted that the E126G variant had the highest P-value, 0.677, followed by V155I and H161R (0.252 and 0.088, respectively). These results suggested that these variants may structurally and functionally affect the IL-17F protein. Furthermore, I-Mutant predicted that these variants decreased protein stability, as listed in Table 3. To check the reliability of these findings, we compared the I-Mutant results with those generated by the CUPSAT server (http://cupsat.tu-bs.de/), which provided further support for our initial findings. ConSurf analysis assessed the evolutionary conservation of the protein; the most highly conserved amino acids are those most likely to be essential to protein structure and function [38]. Furthermore, highly conserved residues are very important for protein-protein interactions. According to Miller and Kumar, nsSNPs at highly conserved positions are the most damaging ones [33].
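The CADD percentages quoted above follow from the PHRED-like scaling of the score, in which a scaled score s corresponds to the top 10^(-s/10) fraction of ranked substitutions. The short Python sketch below applies that relation to the three scores reported in this section; the function name `cadd_top_fraction` is hypothetical.

```python
def cadd_top_fraction(phred_score):
    """Fraction of ranked substitutions at or above a PHRED-scaled CADD score."""
    return 10 ** (-phred_score / 10)

# CADD scores as reported in this section
reported_scores = {"E126G": 27, "V155I": 23, "H161R": 21}

for variant, score in reported_scores.items():
    fraction = cadd_top_fraction(score)
    print(f"{variant}: CADD {score} -> top {fraction:.2%} of substitutions")
```

On this scale all three variants fall within the top 1% but outside the top 0.1%.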
In this study, we analyzed three variants (rs763780, rs11465553 and rs2397084) of the IL-17F gene for their association with RA and found a significant association for two of them (rs763780 and rs11465553) in the Pakistani population. Contrary to our findings, a case-control study in a Polish population reported no association of IL-17F rs763780 and rs2397084 with RA. However, a significant association was observed for rs763780 upon stratification of individuals by joint tenderness, HAQ score or DAS-28-CRP level. Furthermore, rs2397084 was associated with longer disease activity in RA cases [14]. Similarly, another Polish study reported IL-17F rs763780 as a risk factor for RA [15]. A group of investigators genotyped IL-17F (rs763780, rs11465553, rs2397084) using the RFLP method and found no significant difference in distribution in a Turkish population [16]. The literature shows differing observations even within the same population. A comparatively larger case-control study of individuals of Polish descent (422 RA cases and 337 controls) found no significant association of IL-17F (rs763780, rs11465553, rs2397084) using TaqMan assays [17]. Marwa et al. performed a case-control study in a Tunisian population using the RFLP method and showed a significant association of IL-17F rs2397084 and rs763780 with RA [19]. A meta-analysis of 7,474 patients and 10,628 controls from 25 different studies established that IL-17F rs763780 was significantly associated with increased RA risk [18]. To our knowledge, genetic factors have rarely been studied in Pakistanis. Few studies have replicated the known SNPs/variants in Pakistani RA patients [11][58][59][60][61].
Recently, the IL-17F gene was sequenced in 50 RA cases and 50 controls of Pakistani origin, and a significant association of rs763780 was established [62]. Furthermore, no recent, authoritative data are available on the prevalence of RA in the Pakistani population. However, the trend in disease progression with respect to gender differences has been shown [63], with a prevalence rate of 0.14% [64].
This study has a limitation: we did not perform haplotype analyses of the three SNPs, and information on linkage disequilibrium is not available. Although our study provides valuable insights, ethnic variability may exist, highlighting the need for large-scale, multi-ethnic population studies to fully elucidate the pathogenic mechanisms and evolutionary background of the genetic factors implicated.

Conclusions
We conclude that IL-17F rs763780 (H161R) and rs11465553 (V155I) are risk factors for RA in Pakistani patients. Furthermore, these are important coding variants that alter highly conserved amino acids. These variants are predicted to decrease the stability of the IL-17F protein and thereby cause structural changes. On the basis of these findings, IL-17F can be considered a potential therapeutic target.

Table 4. Statistical models used in association analysis.
Bioinformatics tools predicted a deleterious effect of the variants, and the findings were cross-checked with CADD, REVEL, Mutation Assessor and MetaLR in Ensembl genome browser 96 (accessed: 25th January, 2022).