Common Variants in CLDN2 and MORC4 Genes Confer Disease Susceptibility in Patients with Chronic Pancreatitis

A recent genome-wide association study (GWAS) identified association of variants at the X-linked CLDN2 and MORC4 loci and the PRSS1-PRSS2 locus with chronic pancreatitis (CP) in North American patients of European ancestry. We selected 9 variants from the reported GWAS and replicated the association with CP in Indian patients by genotyping 1807 unrelated Indians of Indo-European ethnicity, including 519 patients with CP and 1288 controls. The etiology of CP was idiopathic in 83.62% and alcoholic in 16.38% of the 519 patients. Our study confirmed a significant association of 2 variants in the CLDN2 gene (rs4409525: OR 1.71, P = 1.38 × 10⁻⁹; rs12008279: OR 1.56, P = 1.53 × 10⁻⁴) and 2 variants in the MORC4 gene (rs12688220: OR 1.72, P = 9.20 × 10⁻⁹; rs6622126: OR 1.75, P = 4.04 × 10⁻⁵) in Indian patients with CP. We also found significant association at the PRSS1-PRSS2 locus (OR 0.60; P = 9.92 × 10⁻⁶) and at SAMD12-TNFRSF11B (OR 0.49, 95% CI [0.31–0.78], P = 0.0027). A variant in the MORC4 gene (rs12688220) showed significant interaction with alcohol (OR 14.62 and 1.51 for homozygous and heterozygous carriers of the risk allele, respectively; P = 0.0068), suggesting gene-environment interaction. A combined analysis of the CLDN2 and MORC4 genes based on an effective risk allele score revealed a higher percentage of individuals homozygous for the risk allele among CP cases, with a 5.09-fold enhanced risk in individuals with 7 or more effective risk alleles compared with individuals with 3 or fewer risk alleles (P = 1.88 × 10⁻¹⁴). Genetic variants in the CLDN2 and MORC4 genes were associated with CP in Indian patients.


Introduction
Chronic pancreatitis (CP) is a progressive and irreversible inflammatory disorder of the pancreas leading to abdominal pain, diabetes and exocrine insufficiency. The prevalence of CP varies from 10 to 125/100000 individuals in different countries and is reported to be considerably higher in the Indian population [1]. Various risk factors for CP are known, such as alcohol, smoking, hereditary and metabolic factors, but the etiology is unknown in the majority of patients, who are termed as having idiopathic chronic pancreatitis (ICP) [2,3]. Even in those with a known risk factor, the disease is not fully explained by the injurious agent such as alcohol, as most other individuals with the same risk factor do not develop CP. Thus, genetic susceptibility has been suggested to play an important role in the etiopathogenesis of CP [4].
Indeed, starting with cationic trypsinogen gene (PRSS1) mutations in hereditary pancreatitis, mutations in many other genes, predominantly SPINK1, CFTR, chymotrypsin C, CPA1 and CEL among others, have been reported in patients with CP [5][6][7][8][9]. However, the risk susceptibility for CP cannot be fully explained by known genetic mutations. In this regard, a genome-wide association study (GWAS) on pancreatitis identified polymorphisms at the X-linked CLDN2 locus as robustly associated with recurrent acute pancreatitis and predominantly alcohol-related chronic pancreatitis in North American patients of European ancestry [10].
Studies have reported various genetic variations in the SPINK1, CFTR, Cathepsin B and chymotrypsin C genes as being associated with CP in Indian patients [11][12][13][14]. However, these associations, particularly of the SPINK1 gene, are found in only up to 30-40% of patients and do not explain the disease risk in the majority of patients.
Various studies have revealed that the genetic constitution of the Indian population differs from that of other major ethnic populations of the world [15]. Therefore, genes associated with a disorder in other populations need to be evaluated for their role in the Indian population. In the present study, we investigated recently identified GWAS variants associated with pancreatitis [10] in 1807 unrelated Indian individuals of Indo-European ethnicity for their association with CP.

Ethics Statement
Prior informed written consent was obtained from both patient and healthy control participants. The study was approved by the Human Ethics Committee of CSIR-Institute of Genomics and Integrative Biology, All  Midha et al., 2009 [12]. Briefly, the diagnosis was made in the appropriate clinical setting if there was evidence of pancreatic duct dilatation and irregularity and/or pancreatic calcification on imaging studies that included ultrasonography, endoscopic retrograde cholangiopancreatography (ERCP), contrast-enhanced computed tomography (CECT) scan, and/or magnetic resonance cholangiopancreatography (MRCP). The etiology of CP was determined as follows: (a) Alcoholic CP: if a patient was drinking >60 g (females) or >80 g (males) of alcohol per day for >5 years; (b) Hereditary CP: if >2 first-degree relatives were suffering from CP with an autosomal dominant pattern of inheritance, while those without autosomal dominant inheritance were labeled as familial chronic pancreatitis; (c) Metabolic: if there was evidence of hyperparathyroidism or hypertriglyceridemia; (d) Traumatic: if there was a history of definite abdominal trauma with imaging evidence of pancreatic injury and subsequent ductal dilatation; and (e) Idiopathic: if no definite cause of CP was identified.
All control subjects of the current study are part of the INDICO (Indian Diabetes Consortium) [16]. The inclusion and exclusion criteria for control subjects were as described earlier [15]: controls were healthy persons aged ≥40 years with no family history of diabetes and no visible symptoms of pancreatitis. They had a fasting glucose level <110 mg/dl and an HbA1c level ≤6%. All patients underwent a detailed questionnaire-based evaluation and extensive diagnostic work-up. In particular, data regarding age at onset of pancreatitis, age at diagnosis, duration of disease, family history of pancreatitis, known risk factors for chronic pancreatitis such as alcohol and smoking, and complications of CP such as diabetes were recorded.

Genotyping
Out of the 11 SNPs (S1 Table) (P value < 10⁻⁷ for the stage 1, stage 2 or combined analysis) reported by Whitcomb et al. [10], 9 were taken for genotyping, including two proxy SNPs. Two SNPs (rs7057398 and rs5917027) were not included in the study because they could not be multiplexed. Two proxies (rs12012022, proxy for rs12014762, and rs1985888/rs2855983, proxy for rs10273639) were used in this study because the original SNPs could not be multiplexed. Two SNPs (rs1985888 and rs2855983) were in LD with each other. Finally, we worked with 9 SNPs for all analyses. Genotyping of the cases was done using the Sequenom MassARRAY system (iPLEX GOLD) (Sequenom, San Diego, CA, USA) following the manufacturer's protocol, and controls were genotyped on Illumina Human610-Quad Bead Chips (Illumina Inc., San Diego, CA). The quality control of genotype data was as reported earlier [15]. Some of the control samples (~3%) were also genotyped using Sequenom MassARRAY to determine consistency between genotyping platforms, and >95% genotype consistency was observed between the two platforms. The genotype results of each plate were accepted only if a concordance rate of >99% was observed among duplicates. Quality control criteria for the SNPs to qualify for further analysis included: minor allele frequency (MAF) ≥0.05, missingness per SNP <5% and no significant deviation from Hardy-Weinberg equilibrium (HWE).
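The three per-SNP quality control criteria listed above can be illustrated with a minimal Python sketch. The function name `snp_qc`, the 0/1/2 genotype coding with `None` for failed calls, and the HWE significance cut-off of 0.05 are assumptions made for illustration (the text states only "no significant deviation"); this is not the pipeline used in the study. For X-linked SNPs, HWE is typically assessed in females only, so the input would be restricted accordingly.

```python
import math

def snp_qc(genotypes, maf_min=0.05, miss_max=0.05, hwe_alpha=0.05):
    """Check one SNP against MAF, missingness and HWE criteria.

    genotypes: one entry per sample, the count of a chosen allele
    (0, 1 or 2), or None for a failed call (hypothetical coding).
    hwe_alpha is an assumed cut-off, not a value given in the text.
    """
    called = [g for g in genotypes if g is not None]
    n = len(called)
    if n == 0:
        raise ValueError("no called genotypes")
    missing = 1 - n / len(genotypes)

    observed = [called.count(0), called.count(1), called.count(2)]
    q = (observed[1] + 2 * observed[2]) / (2 * n)  # counted-allele frequency
    p = 1 - q
    maf = min(p, q)

    # Pearson chi-square against Hardy-Weinberg expected counts (1 df).
    expected = [n * p * p, 2 * n * p * q, n * q * q]
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    hwe_p = math.erfc(math.sqrt(chi2 / 2))  # chi-square survival function, 1 df

    passed = maf >= maf_min and missing < miss_max and hwe_p >= hwe_alpha
    return {"maf": maf, "missing": missing, "hwe_p": hwe_p, "passed": passed}
```

A SNP whose genotype counts sit exactly at Hardy-Weinberg proportions (for example 49, 42 and 9 samples with 0, 1 and 2 copies) passes all three checks, while the same SNP with roughly 9% failed calls is rejected on missingness.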

Statistical analysis
Association of the SNPs with CP was investigated for the whole group of CP patients and also separately for ICP and ACP. Genotype distribution for all SNPs was analyzed for deviation from HWE using χ² analysis. Logistic regression analysis, assuming a log-additive model, was performed to determine the association between SNPs and the risk for CP. The associations were adjusted for sex and age as appropriate, and ORs with 95% CIs are presented with respect to the allele reported in the initial study [10]. A P value of <0.0055 (α = 0.05/9) for 9 independent SNPs was considered significant for association with CP after Bonferroni correction for multiple testing. Association analysis was performed using the following categories: ACP and ICP combined, ICP alone and ACP alone versus controls. For analysis of sex chromosome SNPs, male hemizygous genotypes were assumed to be equivalent to female homozygous genotypes and coded as 0 and 2 for computational ease. As PLINK sets the count of minor alleles in males as 0 and 1, and includes a sex effect, using the above coding has no impact on association tests based on logistic regression [17]. SNPs with OR <1 are considered protective while those with OR >1 are considered as conferring risk for the disease. The combined effect of the SNPs on the risk of CP was assessed by computing the effective number of risk alleles, as described earlier [18]. Briefly, only SNPs with P < 0.0055 (corrected P value) were used to calculate the effective number of risk alleles. The weighted risk given by $\sum_i w_i X_i / \sum_i w_i$ (where $X_i$ is the genotype coding for the $i$th SNP as provided above and $w_i$ is the logarithm of the odds ratio corresponding to that SNP) was multiplied by 14 (maximum possible number of risk alleles corresponding to the 7 SNPs found significant in the present study) to obtain the 'effective' number of risk alleles.
The combined risk analysis for the CLDN2 and MORC4 genes was based on 4 SNPs and hence the effective number of risk alleles was obtained by multiplying the weighted risk by 8. We note that while the maximum number of risk alleles at these 4 SNPs can only be 4 for males, the coding of 0 or 2 for hemizygous male genotypes allows the effective number of risk alleles to be computed as above. The analysis included only those individuals whose genotypes at all 7 SNPs (or 4 for the CLDN2 and MORC4 locus) were available. Considering the individuals with <5 risk alleles as the reference (or <3 for the CLDN2 and MORC4 locus), ORs and P values for the increase in disease risk with each 'effective' risk allele were calculated after adjusting for age and sex. With adjustment for sex, the results of the combined risk analysis would be identical to those obtained when the effective number of risk alleles is computed based on the actual number of risk alleles at the 4 SNPs (that is, by coding male hemizygous genotypes as 0 or 1). The genotype information of samples for the studied SNPs is given in S4 Table. To further rule out the possibility of spurious association arising from coding male hemizygotes as homozygotes, sex-matched case-control analysis was performed for the combined ACP and ICP samples and for ICP samples separately, and meta-analysis of the summary statistics was done.
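The 'effective' number of risk alleles for the 4-SNP CLDN2 and MORC4 locus can be sketched in Python as follows. The weights are the logarithms of the odds ratios reported in the abstract; the function names and the dictionary layout are hypothetical. One assumption is made loudly: the weighted mean of the 0/1/2 genotype codes is scaled so that a carrier of every risk allele scores the stated maximum (8 for four SNPs), which is how the normalisation in the text is read here.

```python
import math

# Odds ratios reported in the abstract for the four CLDN2/MORC4 SNPs.
ODDS_RATIOS = {
    "rs4409525": 1.71,
    "rs12008279": 1.56,
    "rs12688220": 1.72,
    "rs6622126": 1.75,
}

def code_male_hemizygote(carries_risk_allele):
    """Code a male X-linked genotype as 0 or 2, as described in the text."""
    return 2 if carries_risk_allele else 0

def effective_risk_alleles(genotypes, odds_ratios=ODDS_RATIOS):
    """Weighted risk-allele score scaled to the range 0 .. 2 * n_snps.

    genotypes: dict mapping SNP id to risk-allele count (0, 1 or 2).
    Assumption: the weighted mean of the codes is multiplied by the
    number of SNPs so that the maximum is 2 * n_snps (8 here).
    """
    weights = {snp: math.log(or_value) for snp, or_value in odds_ratios.items()}
    weighted_mean = (
        sum(weights[snp] * genotypes[snp] for snp in weights)
        / sum(weights.values())
    )
    return weighted_mean * len(weights)
```

Under this reading, an individual carrying risk alleles only at rs4409525 (coded 2) scores slightly above 2, because that SNP carries a little more than one quarter of the total weight.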
Gene-environment interaction of alcohol and smoking with CP was estimated using generalized linear models (additive model using the CGEN package in R). For 3 SNPs (rs11988997, rs12008279, rs6622126), the above model did not fit, and data for interaction with alcohol were analyzed under a recessive mode of inheritance and an additive risk model comprising the main effects of genotype, alcohol consumption and their interaction effect. Additionally, the interaction of the SNP rs11988997 with smoking was analyzed using a similar additive risk model, and the χ² test was used to determine the significance of the combined effect of SNPs at the CLDN2 and MORC4 locus with CP. Statistical analyses were performed using R (version 3.0.1) and PLINK (http://pngu.mgh.harvard.edu/purcell/plink/).
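To make the idea of a genotype-by-alcohol interaction concrete, the following Python sketch computes unadjusted interaction measures from a 2 × 2 × 2 table under recessive genotype coding. It is not the CGEN procedure used in the study: it simply reports the stratum odds ratios, their ratio (interaction on the multiplicative scale) and the relative excess risk due to interaction (RERI, a standard additive-scale measure). All function names are hypothetical, and any counts passed in would be toy values, not study data.

```python
def recessive(genotype):
    """Recessive coding: 1 only for two copies of the risk allele."""
    return 1 if genotype == 2 else 0

def odds_ratio(cases, controls, ref_cases, ref_controls):
    """Odds ratio of a stratum relative to the reference stratum."""
    return (cases * ref_controls) / (controls * ref_cases)

def interaction_measures(counts):
    """counts[(g, e)] = (cases, controls) for genotype g and exposure e in {0, 1}.

    Returns unadjusted ORs relative to the (g=0, e=0) stratum, the
    multiplicative interaction ratio and the RERI.
    """
    ref = counts[(0, 0)]
    or_g = odds_ratio(*counts[(1, 0)], *ref)   # genotype only
    or_e = odds_ratio(*counts[(0, 1)], *ref)   # exposure only
    or_ge = odds_ratio(*counts[(1, 1)], *ref)  # genotype and exposure
    return {
        "OR_G": or_g,
        "OR_E": or_e,
        "OR_GE": or_ge,
        "multiplicative": or_ge / (or_g * or_e),
        "RERI": or_ge - or_g - or_e + 1,
    }
```

A multiplicative ratio above 1, or a RERI above 0, indicates that the joint effect of genotype and exposure exceeds what the two main effects would predict on the respective scale.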

Study demographics
Of the 519 patients with CP, 434 had ICP and 85 had ACP as shown in Table 1. Patients with ICP developed disease at a significantly younger age (mean age at onset 23.42 years) compared with ACP patients (mean age 37.58 years; p<0.0001). Of the patients with ICP, 63.36% were males while all ACP patients were males, compared to 53.42% males in the controls (Table 1).
Association analysis for CP (ACP and ICP combined) vs. controls. Comparing CP vs. controls, 4 genetic variations in the X-linked CLDN2 and MORC4 genes showed significant association with CP and remained so even after adjustment for age and sex. The CLDN2 variant rs12008279 (OR 1.56, P = 1.53 × 10⁻⁴) also conferred significant risk. Furthermore, another variant, rs379742 at MUM1L1-CXorf57, was also found to be significantly associated with CP (OR 1.32, 95% CI [1.11-1.58]; P = 0.0023), consistent with earlier reports.
We also found significant association at the PRSS1-PRSS2 locus, with an OR of 0.60 (95% CI [0.48-0.76], P = 9.92 × 10⁻⁶) indicating a protective effect against CP. In addition, another variant, rs11988997 at SAMD12-TNFRSF11B, was also found to be protective (OR 0.49, 95% CI [0.31-0.78]; P = 0.0027). Sex-stratified analysis also showed similar results, as shown in S2 Table.
We also found that the variant rs2855983 at the PRSS1-PRSS2 locus was significantly associated in ICP patients, similar to its status in the combined analysis (OR 0.63, 95% CI [0.49-0.82]; P = 5.78 × 10⁻⁴). Sex-stratified analysis also showed similar results, as shown in S3 Table.
Association analysis for ACP vs. controls. In the ACP patients, all 5 variants in the X-linked CLDN2 and MORC4 genes showed significant association with ACP even after adjustment for age and sex (Table 4). Similar to the combined analysis, the CLDN2 and MORC4 locus conferred susceptibility in the ACP group, with ORs ranging from 1.48 to 2.00 and P values from 0.003 to 1.89 × 10⁻⁶ (Table 4).
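As a reading aid, a reported odds ratio, its 95% confidence interval and its P value can be cross-checked against one another with the usual Wald approximation on the log-odds scale. The Python sketch below assumes a symmetric interval on the log scale and a normal test statistic; the function name is hypothetical and the check is approximate because the published figures are rounded.

```python
import math

def wald_p_from_ci(odds_ratio, ci_low, ci_high, z_crit=1.959964):
    """Recover the standard error, z statistic and two-sided P value
    from an odds ratio and its 95% CI (Wald approximation)."""
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = math.log(odds_ratio) / se
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail probability
    return se, z, p

# SAMD12-TNFRSF11B variant rs11988997 as reported in the text.
se, z, p = wald_p_from_ci(0.49, 0.31, 0.78)
```

For rs11988997 this gives a P value between 0.002 and 0.003, in line with the reported 0.0027 given rounding of the interval.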

Combined risk analysis
The analysis based on the effective number of risk alleles of all 7 significant SNPs (at P < 0.0055) revealed a significantly enhanced risk of CP, by 1.2-fold with each unit increase in 'effective' risk alleles (P = 7.3 × 10⁻⁸) (Fig 1). Individuals with 13 or more risk alleles carried a 7.17-fold increased risk for CP compared with individuals with 5 or fewer risk alleles (P = 7.51 × 10⁻¹²).

Assessment of variation at CLDN2 and MORC4 locus
As the CLDN2 and MORC4 genes are positioned closely and showed the strongest association with risk of CP, a detailed analysis of SNPs within these genes, termed the CLDN2 and MORC4 locus, was performed. Individuals homozygous for risk alleles were more prevalent among cases than controls (Table 5). Having either one or more variation within the locus increased the
* Effect sizes of the current study are presented with respect to the reported allele (A1) in the source study; for proxy SNPs the allele on the same strand as that of the reported allele has been used. Significance achieved at P = 0.05/9 (α = 0.0055) after Bonferroni correction.

Gene-Environment interaction analysis
Interaction of genetic variants with environmental factors showed a significant effect of rs12688220 (MORC4) with alcohol (OR 14.62 for interaction of homozygous and OR 1.51 for interaction of heterozygous risk allele, P = 0.0068), hence amplifying the risk of CP (Table 6). No significant interaction of any other risk allele was observed with either alcohol or smoking in CP patients.

Discussion
In the present study, we examined the association of 9 SNPs, identified in a recent GWAS on pancreatitis [10], in 1807 Indian subjects. Whitcomb et al. reported the CLDN2 and PRSS1-PRSS2 loci to be associated with recurrent acute pancreatitis and CP (predominantly alcohol-related chronic pancreatitis) in North American subjects of European ancestry. This work confirms the association of variants in the X-linked CLDN2 and MORC4 genes with risk susceptibility in Indian patients with CP. The association might be more robust for CP since we included only well-characterized patients with CP and not those with recurrent acute pancreatitis. Following the GWAS by Whitcomb et al., association of CLDN2 gene variants with CP was recently reported in Chinese, European, Japanese and Indian patients [19][20][21][22]. Interestingly, our study confirms the same in Indian CP patients of Indo-European ethnicity. The CLDN2 gene codes for the protein claudin-2, which lies in tight junctions, and immunohistochemical staining showed weak but membrane-bound expression in acinar, ductal and islet cells in normal human pancreatic tissue [23]. However, premalignant pancreatic cystic lesions showed a more robust expression of claudin-2, suggesting that its aberrant form may directly interfere with tight junction formation and function. Whether the variants in the gene might be related to impaired permeability, affecting either acinar or ductal cell function, remains to be explored. Additionally, the CLDN2 promoter includes a nuclear factor κB (NF-κB) binding site [24] and a variant CLDN2 might have a role in NF-κB activation.
MORC4 shows widespread expression in normal tissues, including a weak but positive signal in human pancreas [25]. Liggins et al. identified its protein to contain a highly conserved N-terminus, an ATPase domain that may indicate a functional requirement for ATP hydrolysis, a zinc finger motif that may promote protein-protein or protein-nucleic acid interactions, and a CW zinc finger with a probable role in interacting with methylated DNA or chromatin [25]. A Swedish study reported a significant association between risk of Crohn's disease and a SNP in MORC4 (rs6622126, also significantly associated with CP in the present study) using a case-control approach [26]. Our findings suggest an increased risk of CP with variants in the MORC4 gene, in addition to a high risk coupled with alcohol intake, suggesting a strong gene-environment interaction. Polymorphisms of alcohol-metabolizing genes have not explained the susceptibility of some individuals to develop pancreatitis. If proven, such a gene-environment interaction could explain individual susceptibility and possibly organ specificity. The downstream signaling of MORC4 activation in pancreatitis needs exploration using experimental models, but a protein-protein interaction database (Protein Interaction Network Analysis; PINA) [27] suggests its interaction with AMPKα1, GEMIN4, HECW2, SKIL, STAT3 and UBC, and downstream regulation of any or all of these might be pathophysiologically relevant. It seems likely that the CLDN2 and/or MORC4 genes play an important pathophysiological role in CP. Association of X-linked genetic variants with CP might be an important reason for the male predominance among patients with CP, as has also been suggested by Whitcomb et al. [10].
Although gain-of-function mutations in the gene for cationic trypsinogen (PRSS1) are well established in hereditary pancreatitis in patients of different ethnicities [28,29], previous studies in Indian patients with CP did not identify any significant mutations/polymorphisms in the PRSS1 gene [11,12,30,31]. Interestingly, a loss-of-function mutation (p.G191R) in the anionic trypsinogen gene (PRSS2) has been reported to be protective against pancreatitis in European, Japanese and Indian populations [32][33][34][35]. A recent investigation identified a functional promoter variant, rs4726576C/A, near rs10273639 that affects the expression of the PRSS1 gene in pancreatic tissue in mice [36]. The significant association of the protective variant at the PRSS1-PRSS2 locus in all three categorical analyses, i.e. ICP and ACP combined as well as ICP alone and ACP alone versus controls, suggests a role of the PRSS1-PRSS2 locus in Indian patients.
We also found a significant association of a variant (rs11988997) at SAMD12-TNFRSF11B, conferring protection against CP in our population. Proteins containing SAM domains, such as SAMD12 (Sterile Alpha Motif Domain containing 12), show varied protein-protein interaction nodes, though a conclusive function has not been assigned to this gene [37]. TNFRSF11B (the gene for osteoprotegerin, OPG) is a member of the TNF-receptor superfamily and its expression was shown to correlate with the aggressiveness of pancreatic ductal cell carcinoma [38]. Using a rat beta cell line and human primary pancreatic islets, OPG was shown to prevent cytokine-induced p38 MAPK phosphorylation and subsequent beta cell death, and was suggested to act as an autocrine or paracrine survival factor for beta cells [39]. On the contrary, a recent study demonstrated a role of serum OPG in promoting beta cell dysfunction in vivo in mice [40], and therefore its role in pancreatic structure and function needs further investigation. There are no other reports showing association of this gene with CP.

Conclusion
We replicated the association of common variants at the CLDN2, MORC4, PRSS1-PRSS2 and SAMD12-TNFRSF11B loci with CP in Indian subjects of Indo-European origin and showed significant association particularly with variants in the MORC4/CLDN2 locus. Further studies through deep sequencing are required to identify the actual polymorphism/mutation associated with CP, as are functional studies to elucidate the mechanism of injury.
Supporting Information
S1 Table. List of SNPs selected in the present study from source study. The R² value has been given on the basis of Human Genome version 17. (DOCX)
S2 Table. Association analysis and meta-analysis result of (ACP+ICP) versus control samples stratified by sex. Meta-analysis was done using PLINK for summary statistics of male and female. * Effect sizes of the current study are presented with respect to the reported allele (A1) in the source study; for proxy SNPs the allele on the same strand as that of the reported allele has been used. Significance achieved at P = 0.05/9 (α = 0.0055) after Bonferroni correction. SNP = single nucleotide polymorphism, OR f = odds ratio for fixed-effects meta-analysis, P f = P value for fixed-effects meta-analysis, CI = confidence interval, Q = P value for Cochran's Q statistic. (DOCX)
S3 Table. Association analysis and meta-analysis result of ICP versus control samples stratified by sex. Meta-analysis was done using PLINK for summary statistics of male and female. * Effect sizes of the current study are presented with respect to the reported allele (A1) in the source study; for proxy SNPs the allele on the same strand as that of the reported allele has been used. Significance achieved at P = 0.05/9 (α = 0.0055) after Bonferroni correction. SNP = single nucleotide polymorphism, OR f = odds ratio for fixed-effects meta-analysis, P f = P value for fixed-effects meta-analysis, CI = confidence interval, Q = P value for Cochran's Q statistic. (DOCX)
S4