The MKK7 p.Glu116Lys Rare Variant Serves as a Predictor for Lung Cancer Risk and Prognosis in Chinese

Accumulated evidence indicates that rare variants exert a vital role on predisposition and progression of human diseases, which provides neoteric insights into disease etiology. In the current study, based on three independently retrospective studies of 5,016 lung cancer patients and 5,181 controls, we analyzed the associations between five rare polymorphisms (i.e., p.Glu116Lys, p.Asn118Ser, p.Arg138Cys, p.Ala195Thr and p.Leu259Phe) in MKK7 and lung cancer risk and prognosis. To decipher the precise mechanisms of MKK7 rare variants on lung cancer, a series of biological experiments was further performed. We found that the MKK7 p.Glu116Lys rare polymorphism was significantly associated with lung cancer risk, progression and prognosis. Compared with Glu/Glu common genotype, the 116Lys rare variants (Lys/Glu/+ Lys/Lys) presented an adverse effect on lung cancer susceptibility (odds ratio [OR] = 3.29, 95% confidence interval [CI] = 2.70–4.01). These rare variants strengthened patients’ clinical progression that patients with 116Lys variants had a significantly higher metastasis rate and advanced N, M stages at diagnosis. In addition, the patients with 116Lys variants also contributed to worse cancer prognosis than those carriers with Glu/Glu genotype (hazard ratio [HR] = 1.53, 95% CI = 1.32–1.78). Functional experiments further verified that the MKK7 p.116Lys variants altered the expression of several cancer-related genes and thus affected lung cancer cells proliferation, tumor growth and metastasis in vivo and in vitro. Taken together, our findings proposed that the MKK7 p.Glu116Lys rare polymorphism incurred a pernicious impact on lung cancer risk and prognosis through modulating expressions of a serial of cancer-related genes.


Introduction
Ever-increasing epidemiological studies, especially the genome-wide association studies (GWAS), have extensively identified numerous genetic variants, including single-nucleotide polymorphisms (SNPs), to be associated with risk and progression of various human malignancies [1][2][3]. Despite these discoveries, much of the genetic contributions to complex diseases remains unclearly illuminated because of the fact that only a small proportion of cancer heritability can be explained by those common SNPs, typically with minor allele frequency (MAF) >5%, reflecting that some 'missing heritability' existed [4,5]. Recently, accumulating evidence revealed that rare variants (MAF<1%) could decipher accessional disease risk or trait variability [6][7][8]. An example is that the rare variants located in proto-oncogenes or tumor suppressor genes may contribute to phenotypic variations through modifying their biological functions or genes expression, and thus play an important role in cancer initiations and progressions [9,10]. These findings provide novel approaches for the exploration of cancer mechanism.
Human mitogen-activated protein kinase kinase 7(MKK7, also known as MAP2K7, MIM: 603014) belongs to the MAP kinase kinase family, and is identified as a tumor suppressor gene [11]. Evidence has demonstrated that MKK7 serves as a critical signal transducer involved in several cancer-related signaling pathways and genes, and thus participates in regulating cells proliferation, differentiation and apoptosis [12][13][14]. MKK7 deletion in mice caused distinct phenotypic abnormalities [15], whereas expression of MKK7 could inhibit lung cancer cells development [16]. In addition, several studies also indicated that MKK7 acts as a suppressor in tumors migration, invasion and metastasis [17][18][19].
Human MKK7 gene is located at chromosome 19p13.3-p13.2, a region spanning over a fragile site associated with various human diseases [20,21]. A study reported that the somatic mutations and loss of heterozygosity at 19p13.2 commonly existed in lung cancer [22]. Furthermore, another study showed that several non-synonymous somatic mutations of the MKK7 gene also occurred and were associated with colorectal cancer predisposition [23]. Nevertheless, it is still molecularly unexplained how these rare variants implicated in cancer initiation and development. Therefore, in the current study, we test the hypothesis that the rare variants in MKK7 might be associated with lung cancer risk and prognosis by disturbing the biological functions of MKK7.
Based on three independent case-control studies, we genotyped five rare SNPs in MKK7 (i.e., rs28395770G>A: p.Glu116Lys, rs56316660A>G: p.Asn118Ser, rs56106612C>T: p. Arg138Cys, rs55800262G>A: p.Ala195Thr and rs1053566 C>T: p.Leu259Phe) and investigated their associations with lung cancer risk, metastasis and prognosis. The biological effects of those promising rare variants on lung cancer were further assessed by a series of functional experiments.

Characteristics of the study populations
The demographic distributions of the three study populations are described in Table 1. Consistently, no significant deviations were observed in distributions of age, sex, drinking and family cancer history from the cases to controls in all the studied sets (P >0.05 for all), except for smoking status (P < 0.05). These variables were further adjusted in the multivariate logistic regression model to control possible confounding on the main effects of the studied polymorphisms. The histological types and clinical stages of the cases were also enumerated in Table 1. In addition, we recalculated the samples size based on population sources. There were 3005 cases and 3013 healthy controls in Guangzhou area, 2011 cases and 2168 cancer-free controls in Suzhou area.
Associations between the MKK7 rare SNPs and lung cancer risk Table 2 summarized the genotype distributions of the studied MKK7 rare SNPs and their associations with lung cancer risk. In the discovery set, we found a significant frequency deviation between the cases and controls (exact P = 4.12×10 −12 ) in p.Glu116Lys rare polymorphism. Compared to individuals with 116Glu/Glu genotype, the carriers with Lys/Glu heterozygote harbored a 3.33-fold increased risk of lung cancer (odd ratio [OR] = 3.33, 95% confidence interval [CI] = 2.29-4.86), and carriers with Lys/Lys variant genotype exerted a much higher cancer risk (OR = 3.94, 95% CI = 1.09-14.3). When combined with variant genotypes, they (Lys/Glu+Lys/Lys) also contributed a pernicious impact on lung cancer risk (OR = 3.38, 95% CI = 2. 35-4.85), conforming to the fitted genetic model with the smallest akaike information criterion (AIC = 4415.3). However, we did not receive any association between other rare SNPs and lung cancer risk.
We further confirmed the above associations in another two validation sets, and obtained consistent results. The p.116Lys variants genotypes (Lys/Glu+Lys/Lys) exerted a 3.52-fold increased risk of lung cancer (OR = 3.52, 95% CI = 2.54-4.89) in validation set I, and a 2.87-fold increased risk of lung cancer (OR = 2.87, 95% CI = 2.04-4.04) in validation set II. Because the homogeneity test showed that the association in the above three sets was homogeneous (P = 0.711), we then merged the three populations to increase the study power, and found that the compared with Glu/Glu common genotype, the carrier with Lys/Glu or Lys/Lys had a remarkably adverse effects on lung cancer risk (OR = 3.23, 95% CI = 2.62-3.98; OR = 3.75, 95% CI = 2.09-6.71; respectively). Similarly, the Lys (Lys/Glu+Lys/Lys) variants also had a 3.29-fold increased risk of lung cancer under the dominant model (OR = 3.29, 95% CI = 2.70-4.01). The heritability test indicated that the p.Glu116Lys rare variant could explain about 2.16% of lung cancer heritability.
Stratified analysis of association between MKK7 p.Glu116Lys and lung cancer risk further evaluated the relationships between MKK7 p.Glu116Lys and lung cancer progression, and found that p.Glu116Lys was significantly associated with pejorative clinical stages (P <0.001, shown in S1 Table). As is revealed in S1 Table, the patients with 116Lys variants had increased probability of progressing to IV stage (OR = 1.69, 95% CI = 1.28-2.24). Likewise,  Associations between the combinations of MKK7 rare SNPs and lung cancer risk We further evaluated the associations between the combined types of those selected rare SNPs and lung cancer risk. As is shown in S2 Table, the individuals with only p.Glu116Lys variant was associated with lung cancer susceptibility (exact P = 1.18×10 −33 ), accompanying by a 3.24-fold increased cancer risk (OR = 3.24, 95% CI = 2.64-3.97), which was best fitted for the heredity model (AIC value = 13840.2). It achieved 100% study power and yielded a value of 0.000 with a 0.001 prior probability lower than the preset FPRP-level criterion 0.20, suggesting that this finding is noteworthy. Individuals with a combination of p.Glu116Lys and p.Asn118-Ser variant genotypes also had an increased risk of lung cancer (OR = 3.16, 95% CI = 1.02-9.76), but it achieved only 66.7% moderate power and a 0.985 FPRP value at a 0.001 prior probability, which is higher than the preset criterion 0.20. Furthermore, we also used the SKAT method to test combined genotypes associated with lung cancer risk, and found that only those combinations containing the p.Glu116Lys rare variation had prominent relevancies with lung cancer risk(P <0.01 for all). All these results indicated that among all of the MKK7 five rare polymorphisms, the p.Glu116Lys contributed the main effect on lung cancer risk. A serial of experiments was further conducted to decipher the biological mechanisms of p.Glu116Lys on lung cancer.

Associations of the MKK7 rare SNPs with lung cancer prognosis
The distributions of demographic and clinical characteristics in the three datasets are presented in S3 Table. The Log-rank test and univariate Cox analysis revealed that patients with characteristics including 60, smoking or advanced stage had a significantly shorter median survival time (MST) and an increased death risk (P <0.05 for all). In contrast, the female patients, and those patients suffering from surgical operations, chemotherapy or radiotherapy prolonged survival time and had a more benignant prognosis (shown in S3 Table).

Stratified analysis of the MKK7 p.Glu116Lys and lung cancer survival
As is revealed in Table 5, although the strength of relevance represented by the HR values between the p.116Lys variants and lung cancer prognosis were different across a plurality of stratums, the homogeneity test showed that the difference was only significant in subgroups of

Associations between the combinations of MKK7 rare SNPs and lung cancer prognosis
We further analyzed the associations between the combinational genotypes of MKK7 rare SNPs and lung cancer prognosis. As is presented in S4 Table, combined-type of only p. Glu116Lys was significantly associated with cancer outcome. Patients only with Lys variants genotypes showed a shorter MST (9 months vs. 14 months, Log-rank test P = 4.01×10 −8 ) and a higher death risk (HR = 1.58, 95% CI = 1.35-1.85) when compared with patients without those genotypes. This noticeable result achieved 100% study power and yielded a value of 0.000 at a 0.001 prior probability, which is lower than the preset FPRP-level criterion 0.20. Although the combination of the p.Glu116Lys and p.Asn118Ser was significantly associated with survival time using Log-rank test (P = 0.025), but it obliterated the relevance in the Cox regression analysis (HR = 1.74, 95% CI = 0.99-3.08) and obtained a FPRP value of 0.991 at a 0.001 prior probability higher than the preset criterion 0.20, suggesting that this result was likely to be untrustworthy.

Effects of MKK7 p.Glu116Lys on cellular proliferation, apoptosis, migration and invasion
To explore the effects of the MKK7 p.Glu116Lys rare variant on cell biological behaviors, multitudinously functional experiments were further executed. The proliferation test showed that cells with over-expressing MKK7-116Lys displayed a higher proliferation potential than cells with over-expressing MKK7-116Glu (Fig 2A, ANOVA test P<0.001). Cells highly expressing MKK7-116Lys also had strikingly promoted abilities of colony formation in common plate, as well as in soft-agar, compared to the cells with MKK7-116Glu (Fig 2B and 2C). In addition, we further performed flow cytometry to evaluate the influence of p.Glu116Lys variants on cells cycle and apoptosis. We found that the over-expressing MKK7-116Lys in A549 cells induced a significantly reduction in the G0/G1 phase (12.9% decreased, P = 0.015) and a corresponding increase in the G2/M phase (7.4% increased, P = 0.038), while compared with the cells stably expressing MKK7-116Glu ( Fig 2D). Notably, the A549 cells with MKK7-116Lys also had decreased apoptosis rate than cells with MKK7-116Glu (Fig 2E, P = 0.043). Furthermore, cells with highly expression of MKK7-116Lys showed remarkably promoted migration and invasion capabilities in comparison to cells with over-expressing MKK7-116Glu (Fig 2F and 2G). These arresting results also occurred in the L78 cells with stably over-expressing MKK7-116Lys. All these findings suggested that the MKK7-116Lys variant had a detrimental impact on promoting cell proliferation, invasion and immigration.

Effects of MKK7 p.Glu116Lys on tumor growth and metastasis
To further determine the effect of p.Glu116Lys on tumor growth and metastasis in vivo, cells with stably over-expressing MKK7-116Glu or MKK7-116Lys were injected into nude mice subcutaneously (both for A549 and L78 cell lines), and intravenously (for A549 cell line only), respectively. As is shown in Fig 3A, the injection of MKK7-116Lys cells resulted in tumor formation began 4 days earlier compared to the results from injection of MKK7-116Glu cells. The tumor grew faster, and after 4 weeks, the tumor size in the former group was larger than the latter group (For A549: 1246.3±102.3 mm 3 vs. 846.3±78.5 mm 3 , P <0.001, Fig 3A; for L78: 1474.5±99.4 mm 3 vs. 921.1±88.4 mm 3 , P <0.001, Fig 3B). Moreover, we used the MRI and histology examination to determine whether the MKK7 p.Glu116Lys could cause tumor metastases, and found that all the mice injected with A549 cells over-expressing MKK7-116Lys suffered from pulmonary metastasis, while the mice group injected with MKK7-116Glu A549 cells did not (Fig 3C, 3D and 3E). These findings demonstrated that MKK7-116Lys variant enhanced lung tumor growth and metastasis in vivo.

Effects of MKK7 with p.Glu116Lys rare variant on the gene expression profiles
To decipher the potential mechanisms behind the MKK7 with p.Glu116Lys rare variant induced lung cancer risk and progression, we further performed DGE sequencing to compare gene profiles between A549-MKK7-116Glu cells and A549-MKK7-116Lys cells. We found that compared to cells with stably over-expressing MKK7-116Glu, cells with MKK7-116Lys had 192 genes differentially expression with a q value of <0.001 (S1 File). Among these genes, 128 genes were up-expressed, and 64 genes were down-expressed. We further validated the DGE results using qRT-PCR assay, with detecting differently expressed genes including 5 up-expression genes(STC2, SLC1A3, MSMO1, BCL10 and HMGCR) and 5 down-expression genes (SAA1, SBK2, CDH5, COL4A2 and BCL9L). The results were in concordance with those findings through DGE sequencing. Furthermore, we conducted Gene Ontology (GO) analysis using these differentially expressed genes. The GO results indicated that these 192 differentially expression genes were annotated to be associated with cell cycle process, cell proliferation, apoptosis, tissue development, tumor invasion, and metastasis et al. Above results suggests that alteration from 116Glu to Lys in MKK7 might influence its downstream targets expression and thus facilitate lung cancer initiation and development.

Discussion
In the current study conducted among southern and eastern Chinese with a total of 5,016 lung cancer patients and 5,181 controls, we estimated the relationships between rare variants in MKK7 gene and lung cancer risk and prognosis, and found that the p.Glu116Lys rare variant was significantly associated with an increased lung cancer risk, progression and prognosis. The individuals with 116Lys variants had promotional cancer risk and higher probability of metastasis at diagnosis. The harmful role of the 116Lys variants also resulted in a poorer lung cancer prognosis in the patients than in the patients with Glu/Glu genotype. Further functional assays demonstrated that lung cancer cells with p.116Lys variant accelerated cell growth, proliferation, colony formation, migration and invasion. They also promoted the xenograft growth and metastasis of nude mice in vivo through regulating a serial of cancer-related genes. However, no conspicuous evidence was obtained to prove any significant association between other rare SNPs and lung cancer risk and prognosis.
To the best of our knowledge, this is the first study to investigate the associations between the genetic rare variants in MKK7 and lung cancer risk, as well as metastasis and prognosis. Accumulating evidence indicated that 'missing heritability' in complex human diseases caused increasing attention over the past a few years because the findings from the GWAS and other epidemiological studies did not completely explained the genetic heritability [4,5,24]. With rapid advances in high-throughput sequencing technologies, unnoticed genetic components such as low-frequency (1%MAF< 5%) and rare genetic variants (MAF< 1%) are being thoroughly assessed and investigated for their associations with complex human diseases [6,7,25]. These approaches highlight an unparalleled opportunity to decipher unexplained genetic contributions in forming complex traits [26,27], especially in human malignancies [10].
MKK7 has been identified to be a tumor suppressor gene constitutive activation of JNK signaling pathway to induce cell apoptosis [28,29]. Recently, a study reported that tissue-specific inactivation of the stress signaling kinase MKK7 in ras-driven lung carcinomas and NeuTdriven mammary tumors markedly accelerates tumor onset and reduces overall survival through directly coupling oncogenic and genotoxic stress to the p53 stability [11]. Lin HJ et al. identified that MKK7 could negatively regulate the expressions of MMP-2 and MMP-9 and thus inhibited cancer cell migration and invasion [17]. In addition, another report showed that ectopic expression of the MKK7 suppresses the formation of overt metastases by inhibiting the ability of disseminated cells to colonize the lung [14]. Furthermore, several studies display an intimate linkage with germline mutations in MKK7 and cancer onset and progression [30][31][32].
In the present study, we found that p.Glu116Lys rare variant in MKK7 contributed a pernicious impact on lung cancer risk and prognosis. We also observed a remarkable interaction between clinical stage and the rare variant on cancer survival. The p.Glu116Lys variant located at kinase activity domain of the MKK7 gene, which might influence the structure and functions of the MKK7 based on the bioinformatics analysis (http://snpinfo.niehs.nih.gov/). Our biological assays demonstrated that the 116 locus alteration from Glu to Lys in MKK7 could promote cells proliferation, migration, invasion, and reduce cells apoptosis in vitro; the adverse role of 116Lys variant was also found to facilitate the xenograft growth and metastasis in vivo. The 116Lys variant further altered the expression of downstream genes modulated by MKK7 as the DGE results showed, which might be closely related with lung cancer initiation and development. Among these differentially expressed genes, there were ones annotated as cell-regulated genes, cell apoptosis, cancer-related genes, tumor invasion and metastasis. For example, as the DGE results had indicated, STC2, YEATS4 and SLC1A3 genes were up-expressed in the cells with stably over-expressing MKK7-116Lys compared with the cells with MKK7-116Glu. A study had reported higher mRNA and protein expressions of STC2 in lung cancer tissues compared to the adjacent normal tissue. Knockdown of STC2 slowed down lung cancer cell growth progression, colony formation and metastasis [33]. Another article showed the proof that overexpression of YEATS4 abrogated senescence in human bronchial epithelial cells, while RNAimediated attenuation of YEATS4 could conversely reduce lung cancer cells proliferation and tumor growth, impair colony formation, and induce cellular senescence [34]. In addition, several genes such as CDH5 and UBA7 (also known as UBE1L) were significantly down-regulated in the cells with MKK7-116Lys. A previous study had reported a downregulation of CDH5 in Bulgarian patients with early-stage non-small cell lung cancer [35]. Loss of UBE1L is a common event in lung carcinogenesis, and the UBE1L gene suppressed lung cancer growth by preferentially inhibiting cyclin D1 [36,37]. All the above public evidence was in accordance with our findings in the current study, which convincingly supported our ultimatums that MKK7 p. Glu116Lys rare variant exerted adverse effects on lung cancer risk, progression and prognosis by modulating a number of cancer-related genes.
Our study has several strengths and limitations. Based on three independent case-control studies, we have obtained consistent results of the association between the MKK7 p.Glu116Lys rare variant and lung cancer risk and prognosis, with a compellingly strong study power of 100% (twosided test, α = 0.05) to detect an OR of 3.29 for the 116Lys variant genotypes (which occurred at a frequency of 2.7% in the controls), and with a 100% statistical power for HR with a value to 1.53, while compared with the 116Glu wild-genotype. A serial of functional experiments further sustained the results that the p.116Lys variants conferred noxious effects on lung cancer risk, progression and prognosis. However, there are also some limitations. The selection bias is unavoidable on account of the hospital-based retrospective studies. Also, with restriction to a Chinese Han population, it is uncertain whether our findings could be generalized to other populations. Furthermore, due to the technological limitations, we did not promulgate any direct target genes of MKK7 with respect to the p.Glu116Lys rare polymorphism, which might help us to understand the precisely molecular mechanism of this rare SNP on influencing cancer risk and progression.
In conclusion, our findings indicated that the p.Glu116Lys rare variant of MKK7 was associated with an increased lung cancer risk and worsened prognosis in Chinese, which was likely to be related to modulation of a serial of cancer-related genes. These results suggested that the MKK7 p.Glu116Lys may be a useful predictive biomarker for lung cancer susceptibility and prognosis. Validations through larger population-based studies in different ethnic groups, and functional assay to reveal target gene of the p.Glu116Lys rare SNP in MKK7 are warranted.

Ethics statement
Each participant was scheduled for an interview to collect individual information on smoking status, alcohol use, and other selected factors, and to obtain a donated 5 mL of peripheral venous blood under his or her informed consent. The study was approved by the institutional review boards of Guangzhou Medical University (Ethics Committee of Guangzhou Medical University: GZMC2007-07-0676) and Soochow University (Ethics Committee of Soochow University: SZUM2008031233). All experiments and procedures involving animals were conducted in accordance with guidelines approved by the Laboratory Animal Center of Guangzhou Medical University.

Study population and follow-up
In this study, three two-stage independently retrospective studies with a total of 5,015 lung cancer patients and 5,181 healthy controls were performed in southern and eastern Chinese populations. In brief, 1,559 lung cancer cases and 1,679 cancer-free controls as the discovery set, which included southern Chinese with 1,056 primary lung cancer cases and 1,056 healthy subjects recruited from the Guangzhou city, and eastern Chinese with 503 patients and 623 controls enrolled from Suzhou city, have been previously described [1,38,39]. In the validation set I, 1,949 lung cancer patients that were continuously recruited from Guangzhou between April 2009 and June 2014 with a 90% response rate and 1,957 sex and age (± 5 years) frequency matched cancer-free controls who were randomly selected from about 3,000 individuals participating in health community programs with an 83% response rate were used. Moreover, the other population from Suzhou city was used as validation set II, in which 1,508 lung cancer cases were enrolled between December 2009 and March 2014 with an 85% response rate and 1,545 controls were randomly selected from 8000 participators in the annual healthy checkup programs with a response rate of 91%. All the participants were genetically unrelated ethnic Han Chinese and none had blood transfusion in the last six months. Definitions of smoking status, pack-years smoked, drink status, family history of cancer and family history of lung cancer have been previously described [38,39].
As was previously reported, clinical information and characteristics of patients were also collected [40]. Patient follow-ups were performed through telephone calls every three months from time of enrollment to the last scheduled follow-up or death. Survival time was calculated starting from the day the patients first received confirmed diagnoses to the date of the last follow-up or death, and dates of death were acquired from medical records or information provided by family members through telephone follow-ups. Patients that were lost to follow-ups or had no accurate data on clinical information were excluded. In the finalized study, 908 patients from the discovery set, 1027 patients from validation set I and 971 patients from validation set II that have completed the follow-up and had intact survival data were included in this study. In addition, to eliminate the bias in patient selection, we analyzed the differences in clinical features, as well as in survival data, between the included and excluded groups, and no deviated results were observed.

SNP selection and genotype determination
Because no published data reveal potentially functional variants in MKK7, we only selected those exon variants in gene coding region causing amino acid change that are supposed to be with most functional potential. Through the strategy of searching for the rare polymorphisms located in the MKK7 gene exons region based on the public dbSNP database (http://www.ncbi.nlm.nih. gov/snp/, access to 1/1/2014), we found that 5 SNPs of MKK7 gene (i.e., rs28395770G>A: p. Glu116Lys, rs56316660A>G: p.Asn118Ser, rs56106612C>T: p.Arg138Cys, rs55800262G>A: p. Ala195Thr and rs1053566C>T: p.Leu259Phe) were rare with MAF<1% in Chinese population. We then re-sequenced the whole cDNA of MKK7 in 100 normal Chinese Hans randomly picked from the controls, and no newfound rare variants outside of those 5 SNPs were obtained. Therefore, we chose these above rare SNPs in the current study.
Genomic DNA was extracted from 2 mL peripheral blood using the routine method. Genotypes of all the selected SNPs were determined by direct DNA sequencing. A fragment of a total of 1,102 bp from the whole genomic DNA templates with the forward primer 5 0 -CCCA GCATTGAGATTGACCAGA-3 0 and reverse primer 5 0 -TGCCATGTAGGCGGCACA-3 0 , which comprises the 5 studied SNPs was amplified. The PCR program for the amplification was as follows: 95°C for 5 minutes and then 40 cycles of denaturation (95°C for 45 seconds), annealing (61°C for 1 minute), and extension (72°C for 1 minute and 30 seconds), and a final polymerization step at 72°C for 7 minutes. The products were then separated by a 1% agarose gel and extracted. Finally, the PCR products were sequenced by an automated sequencing system (ABI Prism 3730 Genetic Analyzer; Applied Biosystems, Foster City, USA) operating according to the manufactures' protocols (S1 Fig).

Plasmids construction, lentivirus package and cell transfection
The cDNA sequence of human MKK7 gene with a wild-type (p.116Glu) was synthesized by the Sangon Biotech Company (Shanghai, China) and cloned into pLVX-IRES-neo expression vector (Clontech Laboratories Inc., San Francisco, CA, USA). The mutated pLV-MKK7-116Lys plasmid was induced by site-directed mutagenesis using the Quick Change XL site-directed mutagenesis kit (Stratagene, La Jolla, CA, USA). The resulting constructs were verified by direct sequencing. The lentiviral production and transduction were performed abiding by protocol described elsewhere [40]. In brief, replication-defective VSV-G pseudotyped viral particles were packaged using a 3-plasmid transient cotransfection method (Lenti-T HT packaging mix, Clonetech, San Francisco, CA, USA). Viruses were then harvested and concentrated. For transfection, two human lung cancer cell lines, A549 (a human lung adenocarcinoma cell line) and L78 (a human lung squamous carcinoma cell line) were infected with control lentivirus (an "empty" vector without the MKK7 fragment inserted), pLV-MKK7-116Glu lentivirus and the pLV-MKK7-116Lys lentivirus, respectively. The cells were stably selected with G418 at 100 μg/ml (Gibco, Lyon, France), and the drug-resistant cells were confirmed by qRT-PCR and western blotting assays (S2 Fig).

Cell viability assay
Cells infected with different allele lentivirus (pLV-MKK7-116Glu and pLV-MKK7-116 Lys) were seeded into 96-well flat-bottomed plates. 1,000 cells per 100 μl of cell suspension were used to add in each well. After a certain time of cultivation, cell viability was measured by MTT assay as is previously described [40]. In brief, 20 μl MTT solutions (5 mg/mL, Sigma, USA) per well were added for 4 h before the end of the experiment. After that, the supernatant fluid was removed and 150 μl of DMSO was added to each well. The absorbance was then measured at 490 nm wavelength using a Plate Reader (Bio-Tec Instruments, Inc.) after shaking the plate for 15 min at room temperature.

Flow cytometry analysis of cell cycle and apoptosis
For cell cycle analysis, cells with stably expressing MKK7-116Glu or MKK7-116Lys were collected, washed with PBS and fixed by 70% ethanol for at least 1 h. Subsequently, the cells were stained with 0.5 mL propidium iodide (PI) staining solution, and cellular DNA content was analyzed using a flow cytometry (BD Biosciences, CA, USA). For cell apoptosis, an annexin vfluoresce-in isothiocyanate (V-FITC)/PI double staining assay was conducted according to the manufacturer instructions. In brief, the cells were harvested and stained with annexin V-FITC and PI for 20 min at room temperature in the dark. The cells were then washed twice with PBS, and the fluorescence of the cells was measured by flow cytometry.

Colony formation assay
Cells with stably over-expressing MKK7-116Glu or MKK7-116Lys were seeded into a 6-well plate (100 cell/well) with RPMI 1640 medium supplemented with 10% fetal bovine serum (FBS), and allowed to grow until visible colonies formed (approximately 2 weeks). After washing with PBS, the cell colonies were fixed with 4% paraformaldehyde and stained with crystal violet (Invitrogen) for 30 min, then washed, air dried, photographed and counted. Furthermore, colony formation assay in soft-agar was also executed to detect the effect of MKK7 Glu116Lys rare variant on cell malignant transformation. The detailed procedures were previously described [40]. Briefly, cells suspended with DMEM medium containing a concentration of 0.35% soft agar were poured onto 6-cm tissue culture dishes coated with 5 ml of 0.75% bottom agar. At the end of the experiment, the colonies were then stained, photographed and counted.

Transwell migration assay and matrigel invasion assay
Cell migration and invasion abilities were appraised by Corning transwell insert chambers (8-uM pore size; Costar, USA) and BD BioCoat Matrigel Invasion Chamber (Becton Dickinson Biosciences, USA), respectively. 2×10 4 (migration assay) or 2×10 5 (invasion assay) transfected cells in 200μl serum-free RPMI 1640 medium were seeded in the upper chamber, and 800 μl medium with 10% FBS were added to the lower compartment. After 24 h for migration assays or 48h for invasion assays at 37°C in a 5% CO 2 humidified atmosphere, cells in the upper chamber were carefully scraped off using a cotton swab, and the cells that had migrated to or invaded the lower surfaces of the membrane were fixed with 4% paraformaldehyde solution and stained with crystal violet (Invitrogen), imaged and counted. Assays were independently conducted for three times.

Xenografts in mice
Female BALB/c nude mice that were 4-5 weeks of age were purchased from the Laboratory Animal Center of Guangdong province (Guangzhou, China). Cells with MKK7-116Glu or MKK7-116Lys were diluted to a concentration of 5×10 7 /ml in physiological saline. 0.1 ml of the cells suspension was injected subcutaneously into the dorsal flank of mice to construct tumor growth model (both for A549 and L78 cell lines), or injected intravenously into the caudal vein of mice to construct tumor metastasis model (for A549 cell line only). Six nude mice were used for each group. When a tumor was palpable in the growth model, tumor size was measured every other day using a caliper along two perpendicular axes and calculated according to the following formula: Volume = 1/2×length×width 2 . The tumor metastases were evaluated by magnetic resonance imaging (MRI) and histology examination.

Magnetic resonance imaging
MRI was performed proximately 10 weeks post-injection using Philips Gyroscan Intera 1.5T ultraconducted MRI scanner (Netherlands) and incorporating a removable gradient coil insert. The details of MRI imaging were conducted as suggested by the public literature [41]. In brief, mice were placed prone on an MR-compatible sled within a carrier tube and positioned in the magnet. Induction and maintenance of anesthesia during imaging was achieved through inhalation of 10% chloral hydrate. MRI examination of coronal T2-weighted (T2WI) scanning was conducted with the following variables: Repetition time (TR) = 4000ms, echo time (TE) = 111ms, field-of-view (FOV) = 3 cm, number of slices = 20, slice thickness = 1.0 mm, matrix = 256×256. Following image acquisition, raw image sets were transferred to a processing workstation and processed using the medical imaging software. Tumor metastatic burden were calculated from manually traced regions-of-interest (ROI).

Histological analysis
The animals were euthanized and their tumor masses were harvested and fixed with 10% neutral formalin solution, embedded in paraffin, and sectioned at 5 μm. The sections were then stained with hematoxylin-eosin (HE) staining and examined by light microscopy at 20× magnification.

RNA extraction and genes profiling sequencing
The total RNAs from different A549 transfectant cells were extracted using the TRIzol reagent (Invitrogen) in accordance with the manufacturer instructions. RNA quantity and quality were assessed using a NanoDrop 2000 spectrophotometer (Thermo Scientific, MA, USA). The gene expression profiling both in A549-MKK7-116Glu cells and A549-MKK7-116Lys cells were conducted using Illumina NlaIII digital gene expression (DGE) sequencing. Analyses were performed according to the manufacturer recommendations [42]. Briefly, DGE sequence libraries were sequenced using Illumina HiSeq 2000 platform. Differentially expressed genes between the two groups of cells were identified using the reads per kilobase of transcript per million mapped reads (RPKM) method. The q value 0.001 and the absolute value of log2 ratio 1 were as the threshold to judge the significance of gene expression differences.

Quantitative real-time PCR analysis
On the basis of genes profiling sequencing results, the expression levels of 10 selected genes (includes 5 up-expression genes and 5 down-expression genes) in the A549-MKK7-116Glu cells and A549-MKK7-116Lys cells were verified by the quantitative real time PCR (qRT-PCR) assay described elsewhere [39]. The relative levels of RNA were detected using the ABI Prism 7900HT sequence detection system (Applied Biosystems) and with the SYBRPremix Ex Taq (Perfect Real Time, TaKaRa, China) and β-actin as the internal reference. Each assay was performed in triplicate and independently repeated three times. All the primers used for PCR amplification are listed in S5 Table. Statistical analysis The chi-square test was used to assess differences in the distributions of demographic characteristics between cases and controls. The distributions of genotypes between cases and controls were analyzed with Fisher's exact test. Unconditional logistic regression model with or without adjustment for surrounding factors was used to evaluate the associations between the MKK7 rare SNPs and lung cancer risk and metastasis. The correlations between MKK7 rare genotypes and lung cancer clinical features were tested using Spearman rank correlation. The sequence kernel association test (SKAT) was used to estimate the combined effect of multiple variants in MKK7 and lung cancer risk using R software (version 3.0.2; The R Foundation for Statistical Computing) with the SKAT package [43]. The REML model was used to assess the heritability explained by the genetic variants [44]. Breslow-Day test was used to test the homogeneity between the subgroups. The statistical power was calculated using the PS Software. The falsepositive report probability (FPRP) test was applied to detect false-positive association findings [45]. The associations between clinical variables, as well as genotypes, and overall survival time were estimated using the Kaplan-Meier method and Log-rank test. The Cox proportional hazards regression model with or without adjustment for confounders was used to evaluate the effect of rare polymorphisms on lung cancer prognosis. Multiplicative interactions were assessed by logistic regression or Cox regression [38]. The differences in gene expression, colonies number levels, and cells' ability to invade and migrate were analyzed using the student's ttest. Repeated measure ANOVA test was performed to analyze the deviation of cell proliferation and tumor growth in different groups. All tests were two-sided using the SAS software (version 9. 3; SAS Institute) and P <0.05 was considered statistically significant.   Table. Sequence of primers used in real time RT-PCR analysis. (DOC) S1 File. Differentially expressed genes between A549-MKK7-116Lys cells and A549-MKK7-116Glu cells.