Genetic Variations in the Regulator of G-Protein Signaling Genes Are Associated with Survival in Late-Stage Non-Small Cell Lung Cancer

The regulator of G-protein signaling (RGS) pathway plays an important role in signal transduction, cellular activities, and carcinogenesis. We hypothesized that genetic variations in the RGS gene family may be associated with the response of late-stage non-small cell lung cancer (NSCLC) patients to chemotherapy or chemoradiotherapy. We selected 95 tagging single nucleotide polymorphisms (SNPs) in 17 RGS genes and genotyped them in 598 late-stage NSCLC patients. Thirteen SNPs were significantly associated with overall survival. Among them, rs2749786 of RGS12 was the most significant. Stratified analysis by chemotherapy or chemoradiation further identified SNPs that were associated with overall survival in subgroups. Rs2816312 of RGS1 and rs6689169 of RGS7 were the most significant in the chemotherapy group and the chemoradiotherapy group, respectively. A significant cumulative effect was observed when these SNPs were combined. Survival tree analyses identified potential interactions between rs944343, rs2816312, and rs1122794 in affecting survival time in patients treated with chemotherapy, while the genotype of rs6429264 affected survival in chemoradiation-treated patients. To our knowledge, this is the first study to reveal the importance of the RGS gene family in the survival of late-stage NSCLC patients.


Introduction
Non-small cell lung cancer (NSCLC) is the leading cause of cancer mortality worldwide [1]. Over 45% of NSCLC patients present with unresectable late-stage (stage IIIA/B or stage IV) disease in the United States [2]. Combined modality therapy is the current standard of care for patients with stage III NSCLC and good performance status (performance score 0 or 1). Numerous clinical trials have shown that concurrent chemoradiation offers a significant survival advantage over sequential chemoradiation [3]. Although concurrent chemoradiotherapy significantly improves the survival of patients with locally advanced disease, the majority of patients still die within 5 years because of locoregional or distant disease progression [4]. Stage IV patients are usually offered palliative chemotherapy and supportive care [5]. There is wide variability in patients' response to chemoradiation, and clinicopathological variables alone do not provide satisfactory guidance for choosing a treatment strategy. The application of pharmacogenomics may improve the prediction of response and help clinicians select cancer treatments for individual NSCLC patients according to their unique genetic background. Therefore, in this study, we aimed to identify genetic predictors of clinical outcomes in late-stage NSCLC patients.
G proteins (guanine nucleotide-binding proteins) are important cellular signal transduction molecules that are expressed in all human cells [6,7]. They are activated by G protein-coupled receptors (GPCRs) and thereby transduce extracellular signals into the interior of a cell [8]. GPCRs are a family of seven-transmembrane domain receptors. When a GPCR transduces a signal into the cell, its extracellular domain first binds to the signal molecule, and its intracellular domain then activates a heterotrimeric G protein. The heterotrimeric G protein functions as a "molecular switch" and can activate a cascade of signaling factors and downstream targets [7]. This G protein-coupled biological process requires fine-tuning through accessory molecules such as the regulator of G-protein signaling (RGS) [9]. RGS proteins are a large family of over 30 intracellular proteins [10], which can negatively modulate GPCR signaling pathways [11,12]. RGS proteins are multi-functional, GTPase-accelerating proteins that promote GTP hydrolysis by the alpha subunit of heterotrimeric G proteins, thereby inactivating the G protein and rapidly switching off GPCR signaling pathways [11]. All RGS proteins contain an RGS domain (also referred to as the "RGS-box"), which is required for their activities [13], and these RGS domains mediate the interaction with other signaling proteins, allowing RGS proteins to serve as signaling scaffolds [8]. Malfunctions of RGS proteins have been reported to be related to the pathogenesis of many common human diseases and drug addiction [14,15,16,17]. Multiple RGS proteins were found differentially expressed in a variety of solid and hematological malignancies [18,19,20,21 The single nucleotide polymorphisms (SNPs) of RGS have been associated with several human diseases, suggesting that genetic variation in the RGS pathway may play a significant role in these diseases' pathogenesis [37,38].
Recently, RGS SNPs have also been reported to play important roles in lung cancer. For instance, SNPs in RGS17 on chromosome 6q23-25 were associated with familial lung cancer susceptibility [39]. SNPs in RGS2 and RGS6 may modulate the risks of bladder and lung cancers [37,40]. Whether genetic variants in the RGS pathway could influence clinical outcomes in patients with NSCLC remains unknown. In this study, we tested the hypothesis that genetic variations in RGS genes are associated with the survival of late-stage NSCLC patients receiving chemotherapy or chemoradiation.

Results
We included 598 NSCLC patients in this study, with a mean age of 59.7 years (Table 1). Of the 598 patients, 456 had died and 142 were alive. We found no significant difference in age (P = 0.884), ethnicity (P = 0.937), smoking status (P = 0.860), or pack-years of smoking (P = 0.926) between the two groups of patients (Table 1). However, we observed a significant difference in mortality status by gender (P = 0.002), clinical stage (P = 0.004), and performance status (P = 0.002) (Table 1).

Associations between SNPs and overall survival in late-stage NSCLC patients
A total of 13 SNPs in 6 genes were significantly associated with the risk of death at P < 0.05 (Table 2). Among them, the variant alleles of four SNPs, rs7549021 and rs1056515 of RGS5, rs944343 of RGS3, and rs2749786 of RGS12, were associated with decreased risks of death, with adjusted HRs of 0.42 (95% CI, 0.22 to 0.83), 0.72 (95% CI, 0.54 to 0.97), 0.80 (95% CI, 0.67 to 0.95), and 0.58 (95% CI, 0.40 to 0.85), respectively. The other SNPs conferred increased risks of death. All SNPs in the RGS1 gene were in linkage disequilibrium (r² > 0.8) and had similar HRs in a dominant model.
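As a reading aid for the hazard ratios reported above, the relation between a Cox regression coefficient and the reported HR with its 95% CI can be written out. The symbols \hat{\beta} (the estimated coefficient for the SNP term) and SE (its standard error) are notation introduced here for illustration, and the interval is the usual Wald form, which is assumed rather than confirmed to be what the software reported.

```latex
\mathrm{HR} = \exp(\hat{\beta}), \qquad
95\%\ \mathrm{CI} = \left[\, \exp\!\big(\hat{\beta} - 1.96\,\mathrm{SE}(\hat{\beta})\big),\ \exp\!\big(\hat{\beta} + 1.96\,\mathrm{SE}(\hat{\beta})\big) \,\right]
```

Under this form, an HR below 1 corresponds to a negative coefficient, that is, a decreased risk of death for carriers of the coded genotype, and a CI that excludes 1 corresponds to P < 0.05 for the Wald test.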

Associations between SNPs and risk of death stratified by treatment
We then performed a stratified analysis by treatment modality, chemotherapy or chemoradiation (Tables 3 and 4). Nine SNPs were associated with overall survival in patients who received chemotherapy only, 5 of which had bootstrap P values < 0.05 at least 70 times out of 100 times ( were significantly associated with altered median survival time (MST) (log-rank P value < 0.05) (Table 3).
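To make the MST concrete, the following is a minimal sketch of how a median survival time can be read off a Kaplan-Meier estimate. It is not the authors' Stata procedure: the function name `km_median` is hypothetical, the inputs are plain lists of follow-up times and event indicators, and the median is taken as the smallest observed time at which the estimated survival probability falls to 0.5 or below, which is one common convention.

```python
from fractions import Fraction


def km_median(times, events):
    """Return the Kaplan-Meier median survival time, or None if not reached.

    times  -- follow-up durations (e.g., months)
    events -- 1 if the patient died at that time, 0 if censored
    Convention assumed here: smallest time t with S(t) <= 0.5.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = Fraction(1)  # exact arithmetic avoids rounding issues at 0.5
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Gather every subject whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths:
            survival *= Fraction(at_risk - deaths, at_risk)
            if survival <= Fraction(1, 2):
                return t
        at_risk -= leaving
    return None  # survival never dropped to 0.5 (heavy censoring)
```

Calling `km_median` separately on the patients carrying each genotype gives the per-genotype MSTs that a log-rank test then compares; the log-rank test itself is not implemented in this sketch.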

Cumulative effects of the unfavorable genotypes on survival
We further assessed the cumulative effects of the unfavorable genotypes in each treatment group using the SNPs with bootstrap P values < 0.05 at least 70 times out of 100 times in that group (Table 5). There were significant gene-dose effects in both treatment groups (Table 5). In those patients receiving chemotherapy only and taking the low-risk reference   Table 5 and Figure 1).
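The cumulative analysis described above amounts to counting, for each patient, how many of the validated SNPs carry an unfavorable genotype and then grouping patients by that count. A minimal sketch follows, under stated assumptions: the SNP labels (`snp_a` and so on), the definition of which genotypes are unfavorable, and the group cutoffs are all hypothetical placeholders, not values taken from Table 5.

```python
def count_unfavorable(genotypes, unfavorable):
    """Count the SNPs at which a patient carries an unfavorable genotype.

    genotypes   -- dict: SNP id -> number of variant alleles (0, 1 or 2)
    unfavorable -- dict: SNP id -> set of allele counts deemed unfavorable
    SNPs missing from `genotypes` are treated as not unfavorable.
    """
    count = 0
    for snp, bad_counts in unfavorable.items():
        if genotypes.get(snp) in bad_counts:
            count += 1
    return count


def risk_group(n_unfavorable, cutoffs=(1, 3)):
    """Map a count to a risk group; the cutoffs are illustrative only."""
    low_max, medium_max = cutoffs
    if n_unfavorable <= low_max:
        return "low"
    if n_unfavorable <= medium_max:
        return "medium"
    return "high"


# Hypothetical example: a dominant-model risk SNP, a recessive-model risk
# SNP, and a SNP whose wild-type genotype is the unfavorable one.
unfavorable = {"snp_a": {1, 2}, "snp_b": {2}, "snp_c": {0}}
patient = {"snp_a": 1, "snp_b": 1, "snp_c": 0}
n_bad = count_unfavorable(patient, unfavorable)
group = risk_group(n_bad)
```

In an analysis of this kind, the resulting group label would enter the Cox model as the exposure variable, with the low-risk group as the reference.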

Higher-order gene-gene interactions
The results of the STREE program analysis of interactions among the 10 bootstrap-validated significant SNPs (the SNPs that had bootstrap P values < 0.05 at least 70% of the time in Tables 3 and 4) in the stratified analysis are presented in Figure 2. Survival tree analysis resulted in four terminal nodes in the chemotherapy group and two terminal nodes in the chemoradiation group (Figure 2A). In the chemotherapy group, the initial split was on rs944343 (RGS3), and subsequent splits were on rs2816312 (RGS1) and rs1122794 (RGS11). The terminal nodes differed in their percentages of death events. To assess the risk associated with each of the terminal nodes, node 1 in the chemotherapy branch was taken as the reference group, composed of individuals with the heterozygous and the homozygous variant genotypes of rs944343 (RGS3) and the homozygous wild-type genotype of rs1122794 (RGS11  Figure 2B and Table 6). In the chemoradiation group, there was only one additional split. Compared to the patients with the homozygous wild-type genotype of rs6429264 (RGS7) (node 5), who had an MST of 19.28 months, the patients carrying variant-containing genotypes of rs6429264 (RGS7) exhibited a 1.89-fold increased risk of death (95% CI, 1.06 to 3.38), with an MST of 12.37 months (Figure 2C and Table 6).

Discussion
In this study, we found that genetic variations in RGS genes were associated with overall survival in late-stage NSCLC patients. Our findings also reinforced the importance of evaluating the cumulative and interaction effects of genetic variations when predicting clinical outcomes of patients with NSCLC.
NSCLC patients are mostly treated with platinum-based chemotherapy, often in combination with radiation therapy. The response to platinum-based chemotherapy may involve several cellular pathways, such as the DNA damage/repair, cell cycle control, and apoptosis pathways [41]. However, no study has reported that RGS is involved in the pathways related to platinum-based chemotherapy.
NSCLC cells can invade adjacent tissues and metastasize to nonadjacent organs and tissues, processes that may be attributed to altered cellular signaling pathways [42,43]. Oncogenic transformation is often the direct result of mutations in the signaling molecules that constitute these pathways. In this study, 5 SNPs were associated with the overall risk of death with bootstrap P values < 0.05 at least 90 times out of 100 times. Three of these 5 SNPs, rs6678136 (RGS4), rs3820487 (RGS5), and rs2749786 (RGS12), conferred significantly different MSTs in Kaplan-Meier analysis (Table 4). Previous studies reported that RGS4 gene expression was associated with invasion in several cancers [36,44]. In addition, RGS4 protein acts as an inhibitor of epithelial and endothelial cell tubulogenesis by regulating mitogen-activated protein kinases and vascular endothelial growth factor signaling, thereby inhibiting cell proliferation, migration, and invasion [45]. Xiao et al. reported that multiple SNPs in RGS5, in combination, may confer risk for hypertension in a Chinese population [46]. RGS5 was reported to be a key modulator of tumor pericyte maturation and to play a pivotal role in tumor neovascularization [9]. RGS5 knockout mice showed larger tumor burden and earlier death, which may be caused by pericyte maturation and vascular normalization [33]. RGS5 has also been identified as a broadly expressed tumor antigen in multiple types of cancer [47]. RGS12 is a large RGS protein with multiple functional domains such as PDZ, PTB (phospho-tyrosine binding), and Rap binding domains [48]. The PDZ domain of RGS12 interacts with a GPCR, CXCR2, and thereby contributes to the GAP action of RGS12 on CXCR2-mediated G-protein signals [49]. Therefore, it is biologically plausible that RGS4, RGS5, and RGS12 are associated with lung cancer survival. The functions of the significant SNPs in these genes are not clear, since they are most likely tagging SNPs. Future studies are needed to find the causal SNPs.
In stratified analyses, 5 SNPs in the chemotherapy group and another 5 in the chemoradiation group were associated with the risk of death with bootstrap P values < 0.05 at least 70 times out of 100 times. The genotypes of four SNPs, rs2816312 (RGS1), rs944343 (RGS3), rs1051013 (RGS3), and rs1122794 (RGS11), were found to be significantly associated with MST in the chemotherapy group. The most significant one was rs944343 (log-rank P = 0.0009), a tagging SNP located in the 3′ flanking region of RGS3. Increased RGS3 expression has been used as a diagnostic marker for soft tissue sarcoma and was associated with resistance to chemotherapy in breast cancer [50,51]. In addition, RGS3 has been reported to modulate glioma cell mobility [36]. The other host genes of the SNPs in the chemotherapy group, RGS1 and RGS11, have also been associated with the etiology and prognosis of cancer. Rangel et al. reported that RGS1 may be a prognostic marker in melanoma progression and that its expression was associated with survival in melanoma patients [20]. Martinez-Cardus et al. reported that RGS11 expression was significantly associated with resistance to platinum therapy in colorectal cancer [52]. These studies support a role for RGS1, RGS3, and RGS11 in lung cancer prognosis. Only two SNPs in the chemoradiation group, rs6429264 and rs6689169, were significantly associated with MST (log-rank P = 0.0055 and 0.0441, respectively), both of which are tagging SNPs located in RGS7. Several studies demonstrated that tumor necrosis factor-α, a major inflammatory cytokine that plays an important role in many human cancers, can rapidly increase the expression level of RGS7 [53,54]. The mechanisms by which these genotypes determine their phenotypes and affect the outcome of NSCLC are not clear. Further studies are warranted to identify the causal variants and the biologic mechanisms underlying our observed associations.
We also observed cumulative effects of RGS SNPs on the survival of late-stage NSCLC patients. In addition, we used survival tree analysis to identify interactions among these SNPs. These gene-gene interactions resulted in four terminal nodes with different risks of death in the chemotherapy-only group and two terminal nodes with different risks of death in the chemoradiation group. The cumulative-effects analysis and survival tree analysis may allow us to identify more powerful prognostic or predictive markers and signatures based on the combination of each patient's genetic variations. It should be noted that these types of analyses were exploratory, and the results need to be validated in independent studies.
There are a few strengths to our study. First, our pathway-based approach is a logical extension of the candidate gene approach and avoids the much larger sample size required by a genome-wide association study. Second, we collected a relatively large population of NSCLC patients from a single institution. The uniform standard operating procedures for cancer identification, pathological staging, and treatment strategy determination make our findings more consistent and applicable to future clinical studies. Third, we performed internal statistical validation by a bootstrap resampling procedure to minimize false discoveries. Fourth, we performed exploratory gene-gene interaction analysis to establish a novel combination of SNPs for predicting the outcome of therapy in NSCLC patients, which could help clinicians determine the optimal personalized treatment and improve the quality of care.
To the best of our knowledge, this is the first study investigating the association of genetic variations in the RGS family with survival in NSCLC. Our results provide not only a SNP-based analysis but also a more comprehensive pathway-based approach to clinical outcome prediction for NSCLC patients who underwent chemotherapy or chemoradiation. Independent validation in larger populations and detailed functional assays are necessary before these findings can be translated to the clinic.

Ethics Statement
All patients signed a written informed consent and this study has been reviewed and approved by the Institutional Review Board (IRB) of MD Anderson.

Study population and collection of epidemiologic and clinical data
A total of 598 patients with late-stage NSCLC, including stages IIIA, IIIB (Dry), IIIB (Wet), and IV, were recruited between 1995 and 2007 from an epidemiological lung cancer study being conducted at The University of Texas MD Anderson Cancer Center. None of them had been treated by surgery, chemotherapy, and/or radiotherapy before enrollment into the study. All participants had completed a risk factor questionnaire that collected data on demographic characteristics, tobacco use history, occupational and environmental exposures, prior medical history, and any history of cancer in first-degree relatives, and had also donated a 40-ml blood sample for genotyping. For all the analyses, we extracted clinical information from the patients' medical records on co-morbid conditions, tumor size, clinical stage, pathologic stage, histological type, tumor grade, treatment type, tumor recurrence, survival, and tumor progression. The median follow-up time was 11.8 months.

SNP selection and genotyping
A comprehensive panel of cancer-related genes, including the RGS gene family, was identified, and the genes were classified into specific pathways according to their major reported functions. Seventeen genes in the RGS family (RGS 1-5, 7-14, 16, 18, 20, and 22) were selected for this panel. The detailed procedure for compiling the panel of genes and SNPs was reported previously [55]. Genomic DNA was extracted from the peripheral blood lymphocytes of the patients' blood samples, and all genotyping was performed according to the standard protocol provided by Illumina Inc. The genotyping results were generated automatically by Illumina's BeadStudio software. Finally, 95 SNPs in the RGS pathway were selected and successfully genotyped in these patients, as shown in Table S1 in the Supporting Information.

Statistical analysis
STATA statistical software, version 10.2 (StataCorp LP, College Station, TX), was used to estimate hazard ratios (HRs), P values, median survival times (MSTs), log-rank P values, and Kaplan-Meier survival estimates. The χ² test (for categorical variables) and Student's t-test (for continuous variables) were used to assess differences in variables between patients who had died and those who were alive. For each SNP, the risk of death was estimated as an HR with its 95% confidence interval (CI) using the Cox proportional hazards regression model, with multivariate adjustment to control for potential confounding factors (age, gender, ethnicity, smoking status and pack-years, performance status, clinical stage, and treatment). Each SNP was assessed under three genetic models (dominant, recessive, and additive), and the model with the smallest P value was selected as the best-fitting model [56]. To validate the results internally, the bootstrap resampling method was used: 100 bootstrap samples were drawn from the original data set, and in each bootstrap sample we obtained the P value for each SNP under the dominant, recessive, and additive models. The cumulative effects of different genotypes were calculated by summing up the individual effects of significant SNPs, that is, SNPs that showed a significant association in single-SNP analysis and also had a bootstrap P value < 0.05 at least 70 times. We used the Cox proportional hazards regression model to estimate the HRs and 95% CIs. The Kaplan-Meier method and the log-rank test were used to estimate the effects of these SNPs on survival duration. Finally, the STREE program (http://masal.med.yale.edu/stree/) was used to perform survival tree analysis of the higher-order gene-gene interactions of the SNPs. For these analyses, we only included SNPs that had been validated internally by bootstrapping. A two-sided P < 0.05 was considered statistically significant.
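The genotype coding and bootstrap validation described above can be sketched as follows. This is a minimal illustration in Python rather than the Stata workflow used in the study. The function names are hypothetical, genotypes are assumed to be coded as the number of variant alleles (0, 1, or 2), and the model fit is abstracted into a caller-supplied `p_value_fn`, which stands in for the adjusted Cox regression that this sketch does not implement.

```python
import random


def code_genotype(variant_allele_count, model):
    """Encode a genotype (0, 1 or 2 variant alleles) under a genetic model."""
    if variant_allele_count not in (0, 1, 2):
        raise ValueError("genotype must be 0, 1 or 2 variant alleles")
    if model == "dominant":   # any variant allele vs. homozygous wild-type
        return 1 if variant_allele_count >= 1 else 0
    if model == "recessive":  # homozygous variant vs. all other genotypes
        return 1 if variant_allele_count == 2 else 0
    if model == "additive":   # per-allele trend
        return variant_allele_count
    raise ValueError("unknown genetic model: " + repr(model))


def bootstrap_hits(patients, p_value_fn, n_boot=100, alpha=0.05, seed=0):
    """Count the bootstrap samples in which a SNP reaches P < alpha.

    patients   -- list of patient records (any structure)
    p_value_fn -- hypothetical callable: resampled list -> P value for the SNP
    Each bootstrap sample has the same size as the original data set and is
    drawn with replacement.
    """
    rng = random.Random(seed)
    n = len(patients)
    hits = 0
    for _ in range(n_boot):
        sample = [patients[rng.randrange(n)] for _ in range(n)]
        if p_value_fn(sample) < alpha:
            hits += 1
    return hits
```

With `n_boot=100`, a SNP would meet the validation criterion used in this study when `bootstrap_hits(...)` returns 70 or more. Fixing the seed makes the resampling reproducible, which is a design choice of this sketch and not something stated in the source.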

Author Contributions
Conceived and designed the experiments: XW. Performed the experiments: JD JG DC. Analyzed the data: JL. Contributed reagents/materials/ analysis tools: CL DS JAR XW. Wrote the paper: JD XW.