Oncogenic CagA Promotes Gastric Cancer Risk via Activating ERK Signaling Pathways: A Nested Case-Control Study

Background. CagA cellular interaction via activation of the ERK signaling pathway may be a starting point in the development of gastric cancer. This study aimed to evaluate whether genes involved in ERK downstream signaling pathways activated by CagA are genetic susceptibility markers for gastric cancer.
Methods. In the discovery phase, a total of 580 SNPs within +/−5 kbp of 30 candidate genes were genotyped to examine their association with gastric cancer risk in the Korean Multi-center Cancer Cohort (100 incident gastric cancer case-control sets). The most significant SNPs (raw or permutated p value<0.02) identified in the discovery analysis were re-evaluated in the extension phase using an unconditional logistic regression model (400 gastric cancer case-control sets). Combined analyses, including pooled analysis and meta-analysis, were conducted to summarize all the results.
Results. Twenty-four SNPs in eight genes (ERK, Dock180, C3G, Rap1, Src, CrkL, Mek and Crk) were significantly associated with gastric cancer risk in the individual SNP analyses in the discovery phase (p<0.05). In the extension analyses, ERK rs5999749, Dock180 rs4635002 and C3G rs7853122 showed marginally significant gene-dose effects for gastric cancer. Consistently, the final combined analysis showed that these SNPs were significantly associated with gastric cancer risk (OR = 1.56, [95% CI: 1.19–2.06], OR = 0.61, [95% CI: 0.43–0.87], OR = 0.59, [95% CI: 0.54–0.76], respectively).
Conclusions. Our findings suggest that ERK rs5999749, Dock180 rs4635002 and C3G rs7853122 are genetic determinants in gastric carcinogenesis.

Among several downstream pathways activated by CagA, the extracellular signal-regulated kinase (ERK) cascade is a core pathway, as it plays an important role in gastric carcinogenesis. CagA cellular interactions with Src, SHP2, Crk, CrkL or GRB2 are significantly associated with ERK activation, and other diverse proteins involved in CagA signaling are intimately connected to ERK signal pathways [10,11,12,13]. Proteins can be regulated by their host genes; therefore, genes encoding proteins related to the CagA and ERK signaling processes may be important for gastric carcinogenesis. However, few studies have focused on polymorphisms in these genes.
CagA oncogenicity may be a starting point in the development of gastric cancer via activation of the ERK signal pathway. This signal transduction appears to be modified by host genetic variants. Thus, we hypothesized that genes involved in the ERK downstream signaling pathways activated by CagA may be genetic susceptibility markers for gastric cancer. To evaluate this hypothesis, we conducted a multi-stage genetic association study that included 1) a discovery phase: a candidate gene analysis that focused on 30 genes involved in the CagA and ERK signal transduction pathway (Crk, CrkL, Csk, GRB2, c-Met, NFATC4, PTPN11, SMS, SOS1, Src, ERK, FAK, PLCγ, KRAS, NRAS, BRAF, RAF1, MAP2K1, MAP2K3, MAP2K4, MAP2K5, MAP2K6, p21, Dock180, RAC1, RAP1, WAVE, Arp2, Arp3 and C3G), and 2) an extension phase that examined the most significant SNPs identified in the discovery analysis.

Study population
Discovery phase. The discovery candidate gene analysis was a population-based nested case-control study within the Korean Multi-Center Cancer Cohort (KMCC). From 1993 to 2004, the KMCC recruited a total of 19,688 participants from four urban and rural areas in Korea. After giving informed consent, all participants completed detailed standardized questionnaires by personal interview. Blood and urine samples were also collected. All participants were passively followed up through record linkage to the national death certificate database, the health insurance medical records databases and the national cancer registry, and newly diagnosed cases were ascertained. Detailed information about the KMCC is described elsewhere [14]. In December of 2005, 249 gastric cancer cases defined according to the International Statistical Classification of Diseases and Related Health Problems 10th Revision (ICD-10, C16) were identified. Cases diagnosed before recruitment (n = 52), cases with no blood sample (n = 35), and cases with a DNA concentration below 50 ng/ml (n = 62) were excluded. Cancer-free controls were matched by age (±5 years), sex, residential district, and enrollment year. Finally, 100 sets of gastric cancer cases and matched controls were defined.
Extension phase. 1) In December 2008, 95 new gastric cancer cases were additionally ascertained from the KMCC. These cases and 116 cases that had been excluded from the discovery phase due to prevalence status or inadequate DNA concentration were included in the extension phase (n = 211). Using the same matching method as in the discovery phase, 211 controls were selected. 2) Gastric cancer cases were obtained from two university hospitals in Korea, Chungnam University Hospital and Hanyang University Guri Hospital. From March 2002 to September 2006, a total of 490 gastric cancer patients were newly diagnosed at these hospitals. Their epidemiological data and venous whole blood samples were collected at the time of diagnosis or prior to gastric cancer surgery. Among them, 189 cases with sufficient DNA samples and informed consent were also included in the extension set. In addition, 189 community-based controls, matched by age (±5 years) and sex, were selected from the KMCC subjects enrolled after 2000.

Ethics Statement
The study protocols for the KMCC study, the hospital-based study and the current nested case-control study were approved by the institutional review boards (IRB) of Seoul National University Hospital and the National Cancer Center of Korea.

Genotyping
Discovery phase. Genotyping was performed using the genome-wide human SNP Array 5.0 according to the standard protocol recommended by the manufacturer [15]. Before genotyping, concentrations of genomic DNA for all study subjects were measured using a spectrophotometer (NanoDrop ND-1000, NanoDrop Technologies). For each individual assayed, 250 ng of genomic DNA was digested with a restriction enzyme (Nsp I or Sty I). Although we screened a total of 580 SNPs within +/−5 kbp of the 30 target gene locations, 103 polymorphisms were excluded due to a SNP call rate of less than 95% or a HWE p value less than 0.0001. Because the genome-wide human SNP Array 5.0 was manufactured based on a Caucasian population, 115 SNPs had a MAF<0.05 in Asians and were also excluded. Additionally, 20 cases and 14 controls were excluded due to insufficient genomic DNA (<250 ng), sex discordance or poor genotyping (<90%). Finally, 362 SNPs in 30 genes (genotyping rate of 99.5%) were genotyped in 81 cases and 85 controls. The cluster images of signal intensity were reviewed for all SNPs.
Extension phase. Seven SNPs with raw or permutated p value<0.02 (rs5999749, rs9418677, rs4635002, rs10901081, rs7853122, rs530801, rs747182) identified in the discovery analysis were genotyped using the Illumina VeraCode GoldenGate Assay with BeadXpress according to the manufacturer's protocol (Illumina, San Diego, CA, USA) [16]. To ensure the reliability of the two genotyping methods, 135 samples (59 cases and 76 controls) were genotyped by both the genome-wide human SNP Array 5.0 and the Illumina VeraCode GoldenGate Assay, and the concordance rate was >98.2%. Of the 7 SNPs, rs9418677 was excluded due to a SNP call rate <95%. Two cases and 40 controls with low DNA availability (n = 15) or genotyping call rate <90% (n = 27) were also excluded from the analysis. Finally, six SNPs in five genes (genotyping rate of 99.1%) were analyzed in 398 cases and 360 controls in the extension phase.
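The marker-level quality filters described in this section (SNP call rate of at least 95%, HWE p value of at least 0.0001, MAF of at least 0.05) can be illustrated with a minimal Python sketch. The record layout, the function names and the example records are hypothetical and are shown only to make the filtering logic explicit; they do not reproduce the software pipeline actually used in the study.

```python
from dataclasses import dataclass

@dataclass
class SnpStats:
    rsid: str
    call_rate: float  # fraction of samples with a genotype call
    hwe_p: float      # HWE p value in controls
    maf: float        # minor allele frequency

# Thresholds as stated in the text.
MIN_CALL_RATE = 0.95
MIN_HWE_P = 0.0001
MIN_MAF = 0.05

def passes_qc(snp: SnpStats) -> bool:
    """Return True if the SNP survives all three marker-level filters."""
    return (snp.call_rate >= MIN_CALL_RATE
            and snp.hwe_p >= MIN_HWE_P
            and snp.maf >= MIN_MAF)

def filter_snps(snps):
    """Keep only the SNPs that pass quality control."""
    return [s for s in snps if passes_qc(s)]

# Made-up illustrative records, NOT study data.
example = [
    SnpStats("snp_a", 0.99, 0.40, 0.21),     # passes
    SnpStats("snp_b", 0.93, 0.40, 0.21),     # fails call rate
    SnpStats("snp_c", 0.99, 0.00001, 0.21),  # fails HWE
    SnpStats("snp_d", 0.99, 0.40, 0.02),     # fails MAF
]
kept = filter_snps(example)

# Bookkeeping reported in the text: 580 screened, 103 and 115 excluded.
remaining = 580 - 103 - 115
```

The final line simply checks the counts reported above: 580 screened SNPs minus 103 and 115 exclusions leaves the 362 SNPs analyzed in the discovery phase.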

H. pylori and CagA detection
H. pylori infection status and CagA seropositivity were evaluated using an immunoblot assay, Helico Blot 2.1 (MP Biomedicals Asia Pacific, Singapore). Helico Blot 2.1 kits have a reported sensitivity of 99% for both H. pylori and CagA seropositivity and a specificity of 98% for H. pylori and 90% for CagA seropositivity [17].

Statistical analysis
The Hardy-Weinberg equilibrium (HWE) in the control group was evaluated by Fisher's exact test with a cut-off level of HWE p<0.0001.
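As an illustration of how an exact test of HWE can be computed from genotype counts, the following Python sketch enumerates all heterozygote counts compatible with the observed allele counts and sums the probabilities of configurations no more likely than the observed one. This is a generic textbook formulation assumed here for illustration; the function name is hypothetical and the code is not the routine used in the study.

```python
import math

def hwe_exact_p(n_aa: int, n_ab: int, n_bb: int) -> float:
    """Exact HWE p value for a biallelic SNP from genotype counts."""
    n = n_aa + n_ab + n_bb   # number of individuals
    n_a = 2 * n_aa + n_ab    # copies of allele A
    n_b = 2 * n_bb + n_ab    # copies of allele B

    def log_prob(het: int) -> float:
        # log P(het heterozygotes | n individuals, n_a copies of A)
        hom_a = (n_a - het) // 2
        hom_b = (n_b - het) // 2
        return (het * math.log(2.0)
                + math.lgamma(n + 1)
                - math.lgamma(hom_a + 1)
                - math.lgamma(het + 1)
                - math.lgamma(hom_b + 1)
                + math.lgamma(n_a + 1)
                + math.lgamma(n_b + 1)
                - math.lgamma(2 * n + 1))

    # Heterozygote counts must share the parity of n_a.
    hets = range(n_a % 2, min(n_a, n_b) + 1, 2)
    probs = [math.exp(log_prob(h)) for h in hets]
    p_obs = math.exp(log_prob(n_ab))
    # Sum probabilities of outcomes at most as likely as the observed one.
    p_value = sum(p for p in probs if p <= p_obs * (1 + 1e-9))
    return min(1.0, p_value)

# Made-up counts for illustration, NOT study data.
p_balanced = hwe_exact_p(25, 50, 25)  # close to HWE proportions
p_no_hets = hwe_exact_p(50, 0, 50)    # extreme heterozygote deficit
flagged = p_no_hets < 0.0001          # would fail the stated cut-off
```

Under the cut-off stated above, a SNP would be flagged only when the exact p value falls below 0.0001, as in the second made-up example.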
In the primary scan in the discovery set, the association between individual SNPs and gastric cancer risk was evaluated based on both raw and permutated p-values using the likelihood ratio test (LRT) with 1 degree of freedom in the trend (additive) model. The trend test assumes a dose-response effect with an increasing number of variant alleles. Permutated p-values were estimated by 100,000 permutation tests. Based on the additive model, gastric cancer risk was calculated as odds ratios (ORs) and 95% confidence intervals (CIs) using an unconditional logistic regression model adjusted for smoking status (ever vs. never), H. pylori infection (positive vs. negative) and CagA seropositivity (positive vs. negative). Benjamini-Hochberg false discovery rate (BH-FDR) corrected p-values were computed for each SNP to guard against spurious, false-positive associations [18].
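The Benjamini-Hochberg correction mentioned above can be sketched in a few lines of Python. The function name and the example p-values are hypothetical and serve only to show the step-up adjustment; they are not results from this study.

```python
def bh_adjust(pvals):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvals)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Made-up raw p-values for illustration, NOT study data.
raw = [0.01, 0.04, 0.03, 0.005]
fdr = bh_adjust(raw)
```

Each raw p-value is multiplied by the number of tests divided by its rank, and the running minimum keeps the adjusted values monotone, so a SNP is declared significant at a given FDR level when its adjusted value falls below that level.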
In the extension phase, the most significant SNPs with raw or permutated p value<0.02 identified in the discovery phase were re-evaluated. Based on the additive model, gastric cancer risk was estimated as ORs and 95% CIs using an unconditional logistic regression model adjusted for smoking status (ever vs. never), H. pylori infection (positive vs. negative) and CagA seropositivity (positive vs. negative). To summarize the results from the discovery and the extension analyses, data pooling and meta-analysis were conducted. The summary ORs and 95% CIs were calculated using a fixed-effect model, and heterogeneity was evaluated by Cochran's Q statistic [19].
All statistical analyses were performed using SAS software version 9.1 (SAS Institute, Cary, North Carolina) and PLINK software version 1.06 (http://pngu.mgh.harvard.edu/purcell/plink) [20]. Meta-analyses were conducted using STATA version 10 (Stata, College Station, TX).
Table 1 summarizes basic characteristics of the study participants in each phase. Gastric cancer cases showed significantly higher rates of CagA seropositivity (p = 0.03 in discovery phase, p = 0.09 in extension phase and p = 0.02 among total subjects). H. pylori infection, VacA seropositivity and smoking status were also higher among gastric cancer cases (Table 1).
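A fixed-effect summary of the kind described above is commonly obtained by inverse-variance weighting of the log odds ratios, with Cochran's Q computed from the weighted squared deviations. The Python sketch below assumes this standard formulation; the function names are hypothetical, the standard errors are back-calculated from the 95% CIs, and the second study entry is a made-up placeholder rather than a reported result.

```python
import math

Z95 = 1.959963984540054  # two-sided 95% normal quantile

def log_or_and_se(odds_ratio, ci_low, ci_high):
    """Convert an OR and its 95% CI to (log OR, standard error)."""
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * Z95)
    return math.log(odds_ratio), se

def fixed_effect(estimates):
    """Inverse-variance pooled log OR, its SE, and Cochran's Q."""
    weights = [1.0 / se ** 2 for _, se in estimates]
    total = sum(weights)
    pooled = sum(w * b for w, (b, _) in zip(weights, estimates)) / total
    se_pooled = math.sqrt(1.0 / total)
    q = sum(w * (b - pooled) ** 2 for w, (b, _) in zip(weights, estimates))
    return pooled, se_pooled, q

def summarize(estimates):
    """Return (summary OR, CI low, CI high, Q, degrees of freedom)."""
    pooled, se, q = fixed_effect(estimates)
    return (math.exp(pooled),
            math.exp(pooled - Z95 * se),
            math.exp(pooled + Z95 * se),
            q,
            len(estimates) - 1)

studies = [
    log_or_and_se(2.83, 1.42, 5.65),  # discovery-phase OR for ERK rs5999749
    log_or_and_se(1.40, 1.00, 1.96),  # HYPOTHETICAL second estimate
]
summary_or, ci_low, ci_high, q_stat, q_df = summarize(studies)
```

Q is referred to a chi-square distribution with one fewer degree of freedom than the number of studies; a large Q relative to that distribution would argue against the fixed-effect assumption of a common underlying OR.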

Results
In the primary scan of the discovery set, of the 362 SNPs selected from CagA-related genes in the signal transduction pathway, 24 SNPs in eight genes, ERK, Dock180, C3G, Rap1, Src, CrkL, Mek and Crk, were significantly associated with gastric cancer risk in the single SNP analysis (p<0.05). According to the 100,000 permutation test, ERK rs5999749 and Dock180 rs9418677 presented a stronger association with gastric cancer (p<0.01). These SNPs showed significant gene-dose effects in the linear trend test and were significantly associated with an increased risk of gastric cancer (OR = 2.83, [95% CI: 1.42-5.65] for ERK rs5999749; OR = 1.90, [95% CI: 1.18-3.05] for Dock180 rs9418677). Except for Dock180 rs9418677 and Rap1 rs17028287, most SNPs downstream from CagA-Crk signaling (Dock180, C3G, Rap1 and Mek) were significantly associated with a reduced risk of gastric cancer (Table 2).
In the extension phase, all associations between the selected SNPs and gastric cancer risk were relatively attenuated. ERK rs5999749, Dock180 rs4635002 and C3G rs7853122 showed

Discussion
In our multi-stage genetic analysis, three SNPs, ERK rs5999749, Dock180 rs4635002 and C3G rs7853122, showed strong associations with gastric cancer and may be important regulatory factors in the CagA signal transduction pathway.
Extracellular signal-regulated kinase (ERK), also known as mitogen-activated protein kinase (MAPK), is an important integration point for multiple cellular signals that regulate various oncogenic responses. The ERK signal pathway interacts with a considerable number of substrates, including protein kinases, phosphatases, cytoskeletal components, apoptosis regulators, and a range of other signaling-related molecules [21]. CagA signaling is one of the streams involved in the ERK signal cascade. Numerous studies have reported that CagA-positive H. pylori can activate ERK in gastric epithelial cells to promote inappropriate cellular functions [11,21,22,23]. Moreover, interaction between CagA and signal transduction proteins promotes ERK signaling in conjunction with Ras, Mek and NF-κB, inducing gastric carcinogenesis. ERK thus appears to be involved in a wide variety of cellular processes, and the host gene of the ERK protein may be more important in determining protein expression and capacity. The results of this study demonstrate that ERK rs5999749 was selected in the primary SNP-based analysis and retained its strong association with gastric cancer in the final combined analyses. This supports the view that its genetic effect can play a critical role in gastric carcinogenesis equal to that of its protein activity level at the cellular stage.
Dock180, also known as dedicator of cytokinesis 1, is a 180 kDa protein that binds downstream of Crk and acts as an up-regulator of Rac1 [24]. It modulates various functions, including cell spreading, cell migration, and actin cytoskeletal organization, through activation of Rac1 [24,25,26,27,28]. This protein is one of the Crk-downstream proteins involved in the cascade of CagA and Crk signaling through the Crk-Dock180-ELMO pathway [9]. Similarly, C3G, also known as Rap guanine nucleotide exchange factor (GEF) 1 (RAPGEF1), interacts with Crk [29]. Previous studies demonstrated that Crk-C3G-Rap1 signaling can activate the ERK cascade and induce apoptosis, cell growth, migration, adhesion and mortality [30,31,32]. In human carcinogenesis, the C3G gene appears to play a crucial role by itself. Alteration of C3G genetic activity via amplification or methylation is associated with several cancers, such as lung, gastrointestinal and gynecological cancers [33,34]. Although it is not well established whether genetic variants of the Dock180 and C3G genes are linked to gastric carcinogenesis, our study suggests that several SNPs in these genes, especially rs4635002 and rs7853122, are significantly associated with risk of gastric cancer, and thus these genes may be susceptibility genes in the development of gastric cancer.
Based on the present results and a review of cellular mechanisms [3,9,11,21,22,23,24,30], CagA oncogenicity induced by activation of the ERK signal pathway can be inferred (Figure 1). The CagA interaction with binding molecules such as Src, Crk, GRB2 and SHP-2 stimulates the downstream signals in the ERK cascade linked to aberrant cellular functions that lead to gastric carcinogenesis. During this process, the genetic effects of ERK, Dock180 and C3G can play critical roles equal to those of their protein activities. These results provide support for the genetic and cellular importance of those molecules.
Our genetic analysis presented plausible evidence on genetic variants of the ERK signal transduction pathway activated by CagA, but several limitations should be noted. First, the number of study subjects was insufficient to ensure adequate statistical power to assess the exact association between the selected SNPs and gastric cancer risk. Second, due to the small sample size and the lack of cardia gastric cancer patients (less than 5%), we could not conduct a stratified analysis according to gastric cancer type (cardia vs. non-cardia). Therefore, the results should be interpreted with caution.
This study indicates that genes involved in the ERK signal transduction pathway activated by CagA can modify risk of gastric cancer. ERK, Dock180 and C3G genes may play important roles in the development of gastric cancer. Replication studies in other populations will allow us to elucidate gastric cancer pathological mechanisms. Further biological studies focused on these genes can clarify their roles in gastric carcinogenesis.