Functional Short Tandem Repeat Polymorphism of PTPN11 and Susceptibility to Hepatocellular Carcinoma in Chinese Populations

Background
PTPN11, which encodes the tyrosine phosphatase Shp2, is a critical gene mediating cellular responses to hormones and cytokines. Loss of Shp2 promotes hepatocellular carcinoma (HCC), suggesting that PTPN11 functions as a tumor suppressor in HCC tumorigenesis. The aim of this study was to evaluate the effects of the short tandem repeat (STR) polymorphism (rs199618935) within the 3′UTR of PTPN11 on HCC susceptibility in Chinese populations.
Methodology/Principal Findings
We analyzed the associations in 400 patients from Jiangsu province of China and validated the findings in an additional 305 patients from Shanghai, China. Unconditional logistic regression was used to analyze the association between rs199618935 and HCC risk. Additional biochemical investigations and in-silico studies were used to evaluate the possible functional significance of this polymorphism. Logistic regression analysis showed that, compared with individuals carrying shorter alleles (11 and 12 repeats), subjects carrying longer alleles (13 and 14 repeats) had a significantly decreased risk of HCC [adjusted odds ratio (OR) = 0.63, 95% confidence interval (CI) = 0.53–0.76, P = 2.00×10⁻⁷], with the risk decreased even further in those carrying alleles 15 and 16 (adjusted OR = 0.46, 95% CI = 0.34–0.62, P = 1.00×10⁻⁷). Biochemical investigations showed that longer alleles of rs199618935 conferred higher PTPN11 expression in vivo and in vitro. The altered luciferase activities in the reporter gene system suggested that STR regulation of PTPN11 expression could be a transcriptional event. Finally, in-silico prediction revealed that different alleles of rs199618935 could alter the local structure of PTPN11 mRNA.
Conclusions/Significance
Taken together, our findings suggest that the STR polymorphism within PTPN11 contributes to hepatocarcinogenesis, possibly by affecting PTPN11 expression through a structure-dependent mechanism. Replication of our studies and further functional studies are needed to validate our findings.


Introduction
Hepatocellular carcinoma (HCC) is one of the most common malignancies and the third leading cause of cancer death [1]. Approximately 80% of HCCs occur in developing countries where hepatitis B virus (HBV) infection is endemic, with the highest incidences in the Asia-Pacific region and sub-Saharan Africa [2]. In addition, chronic alcoholism and long-term exposure to aflatoxin B1 are well-established risk factors for HCC [3]. The molecular biology of carcinogenesis and tumor progression in HCC has been increasingly elucidated by intense research in recent years. However, the molecular and cellular mechanisms of HCC pathogenesis are still poorly understood. Compelling evidence suggests the involvement of host genetic factors in HCC carcinogenesis, and genome-wide association studies (GWAS) have greatly contributed to the identification of common genetic variants related to HCC [4,5]. Identifying HCC-related genetic variations is therefore of particular interest, because it may improve the prediction of HCC risk and inform approaches to preventing HCC development.
Protein tyrosine phosphatase, non-receptor type 11 (PTPN11) encodes the non-receptor protein tyrosine phosphatase SHP2, which is critical for RAS/ERK pathway activation in most receptor tyrosine kinase, cytokine receptor, and integrin signaling pathways [6]. SHP2 is widely expressed in most tissues and plays a regulatory role in various cell signaling events that are important for a diversity of cell functions, such as mitogenic activation, metabolic control, transcriptional regulation, and cell migration. Activating mutations in PTPN11 have been shown to be directly associated with the pathogenesis of Noonan syndrome and childhood leukemias [7,8]. Several lines of evidence indicate that PTPN11 is involved in the development of multiple cancers [9][10][11][12], including HCC. PTPN11 was first identified as a proto-oncogene in leukemia [13]. However, more recent findings suggest an unexpected tumor suppressor role of PTPN11 in HCC [14,15], implying a dual role in tumorigenesis.
Previous studies have reported that genetic variation within PTPN11, either dependent on or independent of interaction with Helicobacter pylori, is associated with the risk of gastric cancer and/or the atrophic gastritis that precedes carcinoma [16]. However, the contribution of PTPN11 polymorphisms to HCC susceptibility has not been investigated. Considering the important roles of PTPN11 in HCC, we hypothesized that genetic variations in PTPN11 may modulate its expression and thus be involved in HCC carcinogenesis. In the current study, we selected one trinucleotide short tandem repeat (STR) polymorphism (rs199618935) and conducted a two-stage case-control study to analyze the genetic effect of the polymorphism on susceptibility to HCC in Chinese populations. Additional experimental and in-silico studies were used to evaluate the possible functional significance of this polymorphism.

Ethics Statement
This study was approved by the Ethical Committee of Soochow University. Written informed consent was obtained from each participant before investigation.

Study Populations
The case-control study was performed on genomic DNA extracted from the peripheral blood of 705 newly diagnosed incident HCC cases together with 723 controls after obtaining informed consent. All subjects recruited were unrelated ethnic Han Chinese. For the Jiangsu case-control study (Panel I), the case series comprised 400 HCC patients diagnosed, hospitalized and treated in the affiliated hospitals of Soochow University from 2007 to 2011. For the Shanghai case-control study (Panel II), 305 HCC patients were recruited at Nantong Tumor Hospital from 2003 to 2005. Controls were cancer-free individuals selected from a community nutritional survey that was conducted in the same regions during the same period as the recruitment of cancer patients. Controls without clinical evidence of liver disease were frequency matched for age (±5 years) and sex to each set of HCC individuals. The diagnosis of the cases, the inclusion and exclusion criteria for the cases and controls, and the definitions of smokers and drinkers have been described previously in detail [17,18]. Briefly, the diagnosis of these patients was confirmed by a pathological examination combined with positive imaging (magnetic resonance imaging and/or computerized tomography). Tumor stages were assigned according to a modified American Joint Committee on Cancer (AJCC) and International Union Against Cancer (UICC) standard. "Current smokers" were individuals who had smoked almost every day for more than one year up to the time of interview; "former smokers" were those who had smoked to the same degree as current smokers but had stopped smoking at least one year prior to the interview; "non-smokers" were those who had never or seldom smoked. Subjects were considered "light drinkers" if they consumed 1-2 alcoholic drinks per week for more than one year. Those who consumed more than 2 alcoholic drinks per week for more than one year were categorized as "heavy drinkers". "Non-drinkers" were those who had never or seldom drunk alcohol.
An additional 48 tumor tissues and adjacent non-HCC tissues from patients with a diagnosis of HCC were collected from the Department of General Surgery, the First Affiliated Hospital of Soochow University, from 2011 to 2012. All cases had histological confirmation of their tumor diagnosis and none of these patients had received preoperative chemotherapy or radiotherapy. After surgical resection, the fresh tissues were immediately stored at −80 °C until DNA/RNA extraction.

DNA Extraction and Genotyping
Genomic DNA from peripheral blood samples, tissue samples and HCC cell lines was isolated using a genomic DNA purification kit (Qiagen). DNA fragments containing rs199618935 were amplified with a pair of genotyping primers (forward primer: 5′-GTGTCCCTTCTACTTCCCTCT-3′; reverse primer: 5′-GCTGGGCTTGTGACTTGTTT-3′). The PCR products were analyzed by 7% non-denaturing polyacrylamide gel electrophoresis (PAGE) and visualized by silver staining [19]. For the six different alleles we observed, direct sequencing was used to determine the number of repeat motifs. Allele nomenclature followed the recommendations of the DNA commission of the International Society for Forensic Haemogenetics [20]. The genotypes of all samples were analyzed using a homemade allelic ladder, a mixture of all six different alleles. Approximately 10% of the samples were randomly selected and examined in blind duplicates by independent researchers, and the reproducibility was 100%.
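As an illustration of how the repeat number could be read from a sequenced allele, the following minimal Python sketch counts the longest uninterrupted run of a trinucleotide motif. It assumes the "TCA" motif named later in this paper and a read given in the same strand orientation; the function name and the flanking bases in the example are hypothetical and are not taken from the PTPN11 sequence.

```python
import re


def count_tandem_repeats(sequence: str, motif: str = "TCA") -> int:
    """Return the number of units in the longest uninterrupted run of `motif`."""
    seq = sequence.upper()
    # One or more back-to-back copies of the motif.
    pattern = re.compile(f"(?:{re.escape(motif.upper())})+")
    longest = 0
    for match in pattern.finditer(seq):
        units = len(match.group(0)) // len(motif)
        longest = max(longest, units)
    return longest


# Hypothetical read: invented flanks around 12 TCA units (not real PTPN11 sequence).
example_read = "GGCTA" + "TCA" * 12 + "GATTC"
print(count_tandem_repeats(example_read))
```

Under the nomenclature used here, a read with 12 uninterrupted units would be called allele 12; interrupted or compound repeats would need a more careful rule than this sketch applies.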

Real-Time RT-PCR Analysis
The Hep-G2, Hep3B and Huh7 hepatoma cell lines were obtained directly from the Shanghai Cell Bank of the Chinese Academy of Sciences. Cells were cultured in Dulbecco's Modified Eagle's Medium (DMEM) supplemented with 10% fetal bovine serum (FBS) and 1% penicillin-streptomycin at 37 °C in a humidified 5% CO2 incubator before RNA extraction. Total RNA was isolated from hepatoma cell lines and tissue samples using an RNA isolation kit (Qiagen). cDNA was generated using random primers and Superscript II reverse transcriptase (Invitrogen). SYBR Green real-time PCR was performed on a Roche LightCycler 480 to quantify relative PTPN11 expression in these samples. GAPDH was chosen as the internal control. Primer sequences used for PTPN11 and GAPDH were as follows: PTPN11-F: 5′-TCAGCACAGAAATAGATGTG-3′, PTPN11-R: 5′-TGCTTATCAAAAGGTAGTCA-3′, GAPDH-F: 5′-CTCTCTGCTCCTCCTGTTCGAC-3′, GAPDH-R: 5′-TGAGCGATGTGGCTCGGCT-3′. The 25 μl final reaction mixture consisted of 1 μM of each primer, 12.5 μl of Master Mix (Applied Biosystems), and 50-100 ng of cDNA. Negative control experiments were performed with distilled H2O as template. The expression levels of target genes were normalized to GAPDH using the 2^−ΔΔCT method [21]. In addition, melting curve analysis was performed on the PCR products to verify primer specificity.
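For readers unfamiliar with the 2^−ΔΔCT calculation, a minimal Python sketch of the arithmetic is given below. The function name and the Ct values are hypothetical and serve only to illustrate the formula; they are not data from this study.

```python
def relative_expression(ct_target: float, ct_reference: float,
                        ct_target_cal: float, ct_reference_cal: float) -> float:
    """Fold change of a target gene by the 2^-ddCt method.

    ct_target / ct_reference: Ct of the target (e.g. PTPN11) and the internal
    control (e.g. GAPDH) in the sample of interest.
    ct_target_cal / ct_reference_cal: the same two Ct values in the calibrator
    sample against which the fold change is expressed.
    """
    delta_ct_sample = ct_target - ct_reference
    delta_ct_calibrator = ct_target_cal - ct_reference_cal
    delta_delta_ct = delta_ct_sample - delta_ct_calibrator
    return 2.0 ** (-delta_delta_ct)


# Hypothetical Ct values: the sample crosses threshold one cycle earlier than
# the calibrator after normalization, i.e. a two-fold higher expression.
fold = relative_expression(ct_target=24.0, ct_reference=18.0,
                           ct_target_cal=25.0, ct_reference_cal=18.0)
print(fold)
```

The method assumes that the target and reference amplicons amplify with similar, near-100% efficiency; where that does not hold, an efficiency-corrected model would be more appropriate.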

Construction of Reporter Plasmids and Luciferase Assays
Partial sequences (~570 bp) of the human PTPN11 3′UTR containing alleles 12, 14 and 15 of rs199618935 were amplified with forward primer 5′-GATCTCTAGACCCCAACTGTTAGTCAATCTGAGC-3′ and reverse primer 5′-CATGGATCCTTGTCCCAGCTACTGTAAGCAGC-3′ from three homozygous human genomic DNA samples. The PCR products were separated on an agarose gel, extracted, purified, and cloned with a TA cloning kit (Promega). The repeat numbers of the different alleles were confirmed by sequencing. Finally, the 3′UTR of Renilla luciferase in the vector pRL-SV40 (Promega) was replaced with the cloned 3′UTR of PTPN11 using the restriction enzymes XbaI and BamHI. The resulting constructs were verified by sequencing. The Hep-G2, Hep3B, sk-Hep-1 and Huh7 hepatoma cell lines were seeded at 1×10⁵ cells per well in 24-well plates (BD Biosciences). Twenty-four hours after plating, cells were transfected with Lipofectamine 2000 according to the manufacturer's manual. In each well, 500 ng of the constructed pRL-SV40 vector and 50 ng of pGL3 control vector were cotransfected. Six replicates were performed for each group and each experiment was repeated at least three times. Twenty-four hours after transfection, cells were harvested by the addition of 100 μl passive lysis buffer. Renilla luciferase activities in the cell lysates were measured with the Dual Luciferase assay system (Promega) in a TD-20/20 luminometer (Turner Biosystems) and were normalized to the firefly luciferase activities.
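The normalization described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and luminometer readings are hypothetical, and expressing each construct relative to a reference construct is one common presentation, not necessarily the one used for the figures in this study.

```python
from statistics import mean


def normalized_activity(renilla: list[float], firefly: list[float]) -> list[float]:
    """Per-well Renilla signal divided by the co-transfected firefly signal."""
    if len(renilla) != len(firefly):
        raise ValueError("each well needs one Renilla and one firefly reading")
    return [r / f for r, f in zip(renilla, firefly)]


def relative_to_reference(sample_ratios: list[float],
                          reference_ratios: list[float]) -> float:
    """Mean normalized activity of a construct relative to a reference construct."""
    return mean(sample_ratios) / mean(reference_ratios)


# Hypothetical readings for two constructs (arbitrary luminometer units).
allele_a = normalized_activity([200.0, 400.0], [100.0, 100.0])
allele_b = normalized_activity([100.0, 200.0], [100.0, 100.0])
print(relative_to_reference(allele_a, allele_b))
```

Dividing by the firefly signal corrects each well for differences in transfection efficiency and cell number before constructs are compared.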

In-silico Predicting Effects of STR Polymorphism on PTPN11 Folding Structures
Because certain conserved structures are more likely to serve important biological functions, a 60-bp region covering the polymorphism was analyzed with RNAfold, using default parameters, to predict the putative influence of different alleles on the local folding structure of PTPN11 mRNA [22].

Statistical Analysis
The genotype distribution was tested for Hardy-Weinberg equilibrium using the χ² test. Since rs199618935 is a multi-allele polymorphism, the genotypic frequencies were calculated by a specific counting method based on the different alleles: a sample was classified into a specific genotypic group if it carried one or two copies of the specific allele. Genotypic and allelic frequencies for each allele were compared between HCC patients and controls by the χ² test. To facilitate analysis, alleles with frequencies lower than 3% were combined with the adjacent alleles (i.e. alleles 11 and 12, alleles 13 and 14, alleles 15 and 16). Unconditional logistic regression was used to analyze the association between rs199618935 and HCC risk, adjusted for gender, age, smoking, drinking and HBV infection status. As HBV infection is one of the major risk factors, a stratified analysis by HBV infection status was performed for the overall population using a binary logistic regression model. Student's t test was used to examine the differences in luciferase reporter gene expression. The normalized expression values of PTPN11 in tissue samples and hepatoma cell lines were compared with Student's t test and one-way ANOVA, respectively. These statistical analyses were implemented in Statistical Analysis System software (version 8.0, SAS Institute).
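To make the allele grouping and the Hardy-Weinberg test concrete, the sketch below shows one textbook way to carry them out in Python. It is an illustration only: the analyses in this study were run in SAS, the function names are hypothetical, the example genotypes are invented, and the goodness-of-fit statistic shown is the standard multi-allele form rather than a confirmed description of the counting method used here.

```python
from collections import Counter
from itertools import combinations_with_replacement

# Adjacent alleles pooled as described in the text.
BIN_OF = {11: "11/12", 12: "11/12", 13: "13/14",
          14: "13/14", 15: "15/16", 16: "15/16"}


def bin_genotype(genotype):
    """Map a genotype such as (12, 15) to its pooled allele groups."""
    a, b = genotype
    return tuple(sorted((BIN_OF[a], BIN_OF[b])))


def hwe_chi_square(genotypes):
    """Chi-square goodness-of-fit statistic and degrees of freedom for HWE.

    genotypes: list of 2-tuples of allele labels, one tuple per individual.
    Expected counts are n*p_i^2 for homozygotes and 2*n*p_i*p_j for
    heterozygotes; df = k(k-1)/2 for k observed alleles.
    """
    n = len(genotypes)
    observed = Counter(tuple(sorted(g)) for g in genotypes)
    allele_counts = Counter(allele for g in genotypes for allele in g)
    alleles = sorted(allele_counts)
    freq = {a: allele_counts[a] / (2 * n) for a in alleles}
    chi2 = 0.0
    for a, b in combinations_with_replacement(alleles, 2):
        expected = n * (freq[a] ** 2 if a == b else 2 * freq[a] * freq[b])
        chi2 += (observed.get((a, b), 0) - expected) ** 2 / expected
    k = len(alleles)
    return chi2, k * (k - 1) // 2


# Invented genotypes for illustration only.
sample = [(12, 12), (12, 14), (14, 12), (14, 14)]
print(hwe_chi_square(sample))
print(bin_genotype((12, 15)))
```

The statistic would then be compared with the χ² distribution at the returned degrees of freedom; with sparse genotype classes, an exact test is usually preferred over this approximation.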

The Associations of STR Polymorphism with HCC Susceptibility
The demographic characteristics of the 705 HCC patients and 723 controls from the two independent case-control sets are summarized in Table 1. There were no statistically significant differences in the frequency distributions of sex, age, smoking and drinking status, suggesting that the frequency matching was adequate. Approximately 73.0% of the cases and 10.0% of the controls were HBsAg-positive, in accordance with the fact that HBV infection is a major risk factor for HCC. Example output from the sequencing and genotyping assays of the STR polymorphism is shown in Figure 1. The observed genotypic frequencies for rs199618935 were consistent with those expected under Hardy-Weinberg equilibrium in both cases and controls (all P values >0.05).
Six different alleles (11, 12, 13, 14, 15 and 16) were detected, corresponding to 11-16 repeats (the allele nomenclature rule is described in the Methods), and a total of 11 and 12 different genotypes were observed in the overall cases and controls, respectively. Genotypic and allelic frequencies of rs199618935, as well as its associations with HCC susceptibility, are presented in Table 2. Carriage of alleles 11 and 12 was significantly more common (72.5%) in HCC patients, whereas alleles 13, 14, 15 and 16 were more common (37.9%) in controls. In the Jiangsu case-control study, compared with the 11/12 genotype, subjects with the 13/14 or 15/16 genotypes of rs199618935 had a significantly decreased risk of HCC in a dose-dependent manner (adjusted OR = 0.76, 95% CI = 0.58-0.99; adjusted OR = 0.52, 95% CI = 0.35-0.79, respectively). Similar trends were observed in the Shanghai case-control study. Furthermore, in the analysis stratified by HBV status, no obvious difference was observed between the HBV-positive and HBV-negative populations (Table 3).
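As a reading aid for the odds ratios above, the following Python sketch back-calculates the log-odds coefficient, its approximate standard error, the Wald z statistic and a two-sided P value from a published OR and its 95% CI. It assumes a symmetric Wald interval on the log scale, which is the usual output of logistic regression software; the function name is hypothetical, and because the inputs are rounded published values the results are only approximate.

```python
import math
from statistics import NormalDist


def wald_from_or_ci(odds_ratio: float, ci_low: float, ci_high: float,
                    level: float = 0.95):
    """Recover beta, SE, z and two-sided P from an OR and its Wald CI."""
    z_crit = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for 95%
    beta = math.log(odds_ratio)
    # The CI spans beta +/- z_crit * SE on the log scale.
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = beta / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return beta, se, z, p_value


# Values reported above for the 13/14 group in the Jiangsu panel.
beta, se, z, p = wald_from_or_ci(0.76, 0.58, 0.99)
print(round(beta, 3), round(se, 3), round(z, 2), round(p, 3))
```

For the 13/14 group this gives a z statistic of about −2.0 and a two-sided P of roughly 0.04, in line with an upper confidence limit that lies just below 1.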

The Genotype-Phenotype Correlations Between the STR Polymorphism and PTPN11 Expression
To further explore the effect of rs199618935 on the expression of PTPN11, we examined PTPN11 expression in HCC tissue samples of different genotypes as well as in their adjacent non-tumor tissues. As shown in Figure 2A, q-PCR demonstrated that the PTPN11 expression level was significantly correlated with the genotype of the STR polymorphism. Compared with the 12-12 genotype, PTPN11 expression in the 14-14 or 14-15 genotypes was markedly increased in both HCC tissues and adjacent non-tumor tissues (fold change = 2.36 and 2.02, respectively, P<0.01). To validate our findings in tissue samples, we further examined the genotype-phenotype correlations in three common hepatoma cell lines (Huh7, Hep3B and Hep-G2). Compared with Hep-G2 cells, which carry the 12-15 genotype, the PTPN11 mRNA expression levels of Huh7 (14-14 genotype) and Hep3B (14-14 genotype) were significantly increased (Figure 2B). Thus, we observed a differential PTPN11 expression pattern in an STR genotype-dependent manner in vivo and in vitro. Finally, the expression level of PTPN11 in adjacent non-tumor tissues was 2.39-fold higher than that in HCC tumor tissues (Figure 3).

Effects of the STR Polymorphism on Transcriptional Activity
We further investigated the molecular mechanism underlying the correlation between the STR polymorphism and PTPN11 expression. Since rs199618935 is located within the 3′UTR of PTPN11, luciferase reporter gene constructs carrying different alleles were generated by PCR and used to transiently transfect HCC cell lines. As shown in Figure 4, the constructs containing allele 14 and allele 15 drove increased reporter expression compared with the construct containing allele 12 in all four hepatoma cell lines. Of note, allele 14 displayed the highest luciferase expression, which was significantly higher than that of allele 12 and allele 15.

In-silico Analysis of the STR Polymorphism on PTPN11 Folding
Because the STR polymorphism is located within the 3′UTR of PTPN11, it is plausible that different alleles affect the folding structure of PTPN11 mRNA, which in turn influences its expression through a structure-dependent mechanism. Using the RNAfold algorithm, we predicted the local structural changes of PTPN11 caused by different alleles. As shown in Figure 5, different numbers of the "TCA" repeat motif produced different local structures. Specifically, alleles 12 and 15 appeared to disrupt a highly base-paired region that could be formed by allele 14.

Discussion
We present here the first case-control study evaluating the association between the novel STR polymorphism within the 3′UTR of PTPN11 and HCC susceptibility. On the basis of our current findings, we propose a schematic model to illustrate the molecular mechanism and functional basis for polymorphism-associated hepatocarcinogenesis conferred by PTPN11 expression. Therefore, the novel STR polymorphism may serve as a potential marker for genetic susceptibility to HCC in Chinese populations.
Recent experimental data suggest that, in contrast to the proto-oncogenic effect of dominant-active mutants, PTPN11 may act as a tumor suppressor in hepatocarcinogenesis [14]. Further studies confirmed the tumor suppressor role of PTPN11 in HCC tumorigenesis, and decreased PTPN11 expression has been shown to be a prognostic marker in HCC [15]. Our results also demonstrated that PTPN11 expression was significantly higher in self-matched adjacent non-tumor tissues than in HCC tissues (P<0.01), which is consistent with a tumor-inhibiting effect of PTPN11 in HCC. Similarly, Ptpn11 has been shown to be a tumor suppressor in cartilage and to be involved in metachondromatosis by inducing hedgehog signaling [23]. Given the critical link between activation of protein-tyrosine kinases (PTKs) and oncogenesis, the opposing functions of PTPN11 in different cellular contexts remain to be fully elucidated.
To test whether the different alleles in the human PTPN11 3′UTR regulate mRNA levels, we transfected three constructs containing different alleles (12, 14 and 15) into Huh7, Hep3B, sk-Hep-1 and Hep-G2 cells and then assayed luciferase levels. Our data provide the first evidence that the construct containing allele 14 displayed the highest luciferase activity, which was significantly higher than that of the constructs containing alleles 12 and 15. The altered luciferase activities we observed in the reporter gene system suggest that STR regulation of PTPN11 expression may act at the transcript level, for example through altered RNA stability after transcription. Early studies have shown that STR polymorphisms in the 3′UTR form structural elements (stem-loops) and contribute to mRNA regulation [24]. Indeed, our in-silico studies showed that sequences containing different alleles form different structures (Figure 5). Based on the results of our current study, we hypothesize that different alleles of the STR may act as an enhancer or repressor to regulate PTPN11 gene expression. Coordinately controlled by PTKs and protein tyrosine phosphatases (PTPs), PTPN11 is a feature of many important signaling pathways that are involved in cell proliferation, adhesion, and migration [25,26]. Our results showed that longer alleles (14 and 15) conferred higher PTPN11 expression, which is consistent with the fact that longer alleles were associated with decreased HCC risk.
Deregulation of PTPN11 causes hyperactivation of ERK, leading to growth abnormality. Gain-of-function PTPN11 mutations have been found in various types of human cancer [27][28][29]. It has been shown that the activating SHP2 mutant promotes lung tumorigenesis [30]. Additional data suggest that targeting SHP2 may represent an effective strategy for treatment of epidermal growth factor receptor (EGFR) inhibitor resistant non-small cell lung cancer [31]. Therefore, the STR polymorphism identified in our study may serve as a potential marker for individualized diagnosis or treatment of cancers.
Although this is the first report of a possible association between the PTPN11 STR polymorphism and HCC risk, the significance of this finding is limited by the relatively small sample size used in this study. Nevertheless, the results of our genetic association analysis provide preliminary data and highlight the need for future studies with different or expanded case-control populations to confirm our observations. Furthermore, the molecular mechanisms underlying the relationship between the STR polymorphism and PTPN11 expression still need to be fully investigated at both the genetic and functional levels.
In summary, we have provided initial evidence that the length variation of the "TCA" repeats within the human PTPN11 3′UTR may play a functional role in regulating the expression of PTPN11 and subsequently affect the development of HCC. Therefore, PTPN11 may be a promising marker for personalized diagnosis and therapy of HCC.