Polymorphisms and a Haplotype in Heparanase Gene Associations with the Progression and Prognosis of Gastric Cancer in a Northern Chinese Population

Background: Human heparanase plays an important role in cancer development, and single nucleotide polymorphisms (SNPs) in the heparanase gene (HPSE) have been shown to be correlated with gastric cancer. The present study examined the associations between individual SNPs or haplotypes in HPSE and the susceptibility, clinicopathological parameters and prognosis of gastric cancer in a large sample of the Han population in northern China.
Methodology/Principal Findings: Genomic DNA was extracted from formalin-fixed, paraffin-embedded normal gastric tissue samples from 404 patients and from blood from 404 healthy controls. Six SNPs were genotyped by matrix-assisted laser desorption/ionization time-of-flight mass spectrometry. A chi-square (χ2) test and unconditional logistic regression were used to analyze the risk of gastric cancer; a log-rank test and a Cox proportional hazards model were used for survival analysis, and the Kaplan-Meier method was used to plot survival curves. The mean genotyping success rates were more than 99% in both groups. Haplotype CA in the block composed of rs11099592 and rs4693608 was more frequent in patients with Borrmann types 3 and 4 (P = 0.037) and in patients with a greater number of lymph node metastases (N3 vs N0 group, P = 0.046), and was also correlated with poor survival (CG vs CA: HR = 0.645, 95%CI: 0.421–0.989, P = 0.044). In addition, genotypes rs4693608 AA and rs4364254 TT were associated with poor survival (P = 0.030, HR = 1.527, 95%CI: 1.042–2.238 for rs4693608 AA; P = 0.013, HR = 1.546, 95%CI: 1.096–2.181 for rs4364254 TT). There were no correlations between individual SNPs or haplotypes and gastric cancer risk.
Conclusions/Significance: A functional haplotype in HPSE was found, which included the important SNP rs4693608. SNPs in HPSE play an important role in gastric cancer progression and survival, and may serve as molecular markers for prognosis and treatment.


Introduction
Gastric cancer is the fourth most common cancer worldwide and the second leading cause of cancer mortality [1]. Despite advances in diagnosis and treatment, the prognosis for patients with advanced gastric cancer remains dismal [2]. Furthermore, gastric cancer is a disease of gene-environment interactions, and genetic factors play an important role in tumorigenesis and progression [3]. Therefore, the discovery and application of biomarkers integrated with traditional cancer diagnosis, staging, and prognosis could be considered the best option for controlling this life-threatening disease [4].
Single nucleotide polymorphisms (SNPs) have been thought to be attractive biomarkers in cancer risk assessment, screening, staging, or grading [5]. Also, the human genome is composed of a series of 'haplotype blocks', which are nonrandom associations of alleles due to linkage disequilibrium (LD) and it is possible to exploit a vast amount of information considering these haplotype blocks [6,7]. Although the application of individual SNP analysis has been limited thus far, haplotype-based association study has been proposed as a powerful and comprehensive approach to identify causal genetic variation underlying complex diseases [8,9].
Heparanase is the only known mammalian enzyme that degrades heparan sulfate (HS) proteoglycans in basement membranes and the extracellular matrix [10]. This leads to disassembly of extracellular barriers, release of HS-bound bioactive factors and generation of HS fragments that promote growth factor-receptor binding and signaling [11,12]. Heparanase is strongly associated with cancer progression and metastasis, including cell survival, invasion, proliferation, neovascularization, and the creation of a growth-permissive microenvironment [13,14], and it has both prognostic and therapeutic applications [15]. The heparanase gene (HPSE), first cloned in 1999, is located on chromosome 4q21.3 [16]. There have been few studies on SNPs in the HPSE gene. Molecular epidemiologic studies have shown differences in the distribution of HPSE SNPs among various Israeli Jewish populations [17]. Associations with tumor susceptibility have also been reported, including for hematological malignancies and gastric cancer, but the results have not been consistent [18][19][20]. In addition, Ralph et al. [21] showed that an HPSE haplotype was correlated with stage in ovarian carcinoma, and Yue et al. [20] showed that SNPs were correlated with clinicopathological parameters and survival rate. Specifically, one study indicated that SNPs in HPSE were associated with heparanase expression levels and provided the basis for further studies on the associations between SNPs and disease [22]. However, these association studies were limited to small samples.
Recently, Hennig et al. [23] and Horn et al. [24] observed high genotyping detection rates (93.5% and 94-97%) and a perfect concordance rate of 100% with DNA extracted from normal formalin-fixed, paraffin-embedded tissues (FFPETs) compared to germline DNA using matrix-assisted laser desorption/ionization time-of-flight mass spectrometry (MALDI-TOF MS). Other reports have also demonstrated high genotyping detection rates and perfect concordance for FFPET-derived DNA, including DNA from decades-old blocks, compared to blood from the same individual using other methods, even in genome-wide genotyping [25][26][27][28]. These findings indicate that FFPET-derived DNA is sufficient for genetic polymorphism analysis. In the present study, we used a large collection of FFPET-derived DNA samples from patients and blood-derived DNA from controls with a MALDI-TOF MS method to genotype six SNPs (rs4693602, rs6856901, rs4364254, rs11099592, rs4693608 and rs4328905) and to study the potential associations between these SNPs or haplotypes in HPSE and tumor susceptibility, clinicopathological parameters, and survival of gastric cancer in a large sample of the Han population in northern China. Individual SNPs and a haplotype were found to be associated with the progression and prognosis of gastric cancer.

Subject characteristics
The average age was 56.67 ± 11.923 y and the percentage of males was 70.54% in the case group. The average age of the control group was 56.91 ± 11.477 y and the percentage of males was 70.54%. There was no distribution difference in sex and age between the patients and controls (P = 1.00 for both sex and age

Genotyping success rates
We used MassArray Typer Analyzer software 4.0.4.20 for automated spectra processing and genotype identification. Representative MALDI-TOF-MS profiles of each genotype of the six SNPs in HPSE are shown in Figure S1. All the SNPs were polymorphic with minor allele frequency >10%, and genotype distributions were all in agreement with Hardy-Weinberg equilibrium (data not shown). Success rates were high, ranging between 96.29% and 100% (mean: 99.09%) in the FFPETs group and between 99.50% and 100% (mean: 99.79%) in the control group (Table S1).
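The two quality checks mentioned above, minor allele frequency and agreement with Hardy-Weinberg equilibrium, can be illustrated with a one-degree-of-freedom chi-square goodness-of-fit test. The following is a minimal sketch, not the procedure implemented in the software used in this study; the function names and the genotype counts are hypothetical, since the actual counts are not shown.

```python
import math

def minor_allele_frequency(n_aa, n_ab, n_bb):
    """Minor allele frequency from the three genotype counts of a biallelic SNP."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    return min(p, 1.0 - p)

def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square test (1 df) of Hardy-Weinberg equilibrium.

    Returns (chi2, p_value). Assumes the SNP is polymorphic, so that
    no expected genotype count is zero.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)
    q = 1.0 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Upper tail of a chi-square distribution with 1 df: erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value

# Hypothetical genotype counts (AA, AB, BB), for illustration only.
maf = minor_allele_frequency(81, 18, 1)
chi2, p_value = hwe_chi_square(81, 18, 1)
print(f"MAF = {maf:.3f}, chi2 = {chi2:.3f}, P = {p_value:.3f}")
```

A SNP would pass both checks in this sketch when the minor allele frequency exceeds the chosen threshold and the P value does not indicate departure from equilibrium.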

Associations between individual SNPs and clinicopathological parameters and survival
Allelic frequencies and genotypic frequencies in the six SNPs were not significantly different between patients and controls (P>0.05 and P>0.05 after a permutation test for allelic frequencies; P>0.05 and P>0.05 after being adjusted for sex and age for genotypic frequencies; Tables S2 and S3).
SNPs were evaluated for associations with the clinicopathological parameters. rs4364254 genotypes were associated with histologic grade (P = 0.002; Table S4): genotype TT was correlated with well-differentiated histology compared to genotypes TC/CC (OR = 0.482; 95%CI: 0.300-0.774). Other SNPs had no significant correlations with clinicopathological parameters.
Associations between haplotypes in HPSE and gastric cancer clinicopathologic features at the time of diagnosis were evaluated. Haplotype CA in block 2 was more frequent in the group of Borrmann types 3 and 4 than the TG+CG haplotypes (P = 0.037; Table 3). Haplotype distribution differences were also observed in the pN category (P = 0.045; Table 3): relative to CG, CA was more frequent in the N3 group than in the N0 group (OR = 1.837, 95%CI: 1.010-3.341, P = 0.046), but there were no significant distribution differences between the N2 and N0 groups or between the N1 and N0 groups (P = 0.671 and P = 0.496, respectively).
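As a consistency check on a reported ratio estimate, an approximate P value can be recovered from the estimate and its 95% confidence interval. The sketch below assumes the interval is a Wald interval, symmetric on the log scale; this is an assumption, since the text does not state how the interval was computed. The function name is illustrative.

```python
import math

def p_from_ratio_ci(ratio, ci_low, ci_high, z_crit=1.959964):
    """Approximate two-sided P value from a ratio (OR or HR) and its 95% CI.

    Assumes a Wald interval on the log scale:
    log(ratio) +/- z_crit * SE.
    """
    se = (math.log(ci_high) - math.log(ci_low)) / (2.0 * z_crit)
    z = math.log(ratio) / se
    # Two-sided tail probability of a standard normal distribution.
    return math.erfc(abs(z) / math.sqrt(2.0))

# Reported N3 vs N0 comparison for haplotype CA relative to CG.
p_n3_vs_n0 = p_from_ratio_ci(1.837, 1.010, 3.341)
print(f"Approximate P = {p_n3_vs_n0:.3f}")
```

Under this assumption the recovered value is close to the reported P = 0.046, so the estimate, interval and P value for the N3 vs N0 comparison are mutually consistent.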
The haplotypes of block 2 indicated a significant difference in tumor-related survival. Patients carrying the CA haplotype had a poor gastric cancer-specific survival (CG vs CA: HR = 0.645, 95%CI: 0.421-0.989, P = 0.044; Table 4).
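The hazard ratio above is expressed with CG as the numerator (CG vs CA). Reading it with CA as the numerator only requires the reciprocal of the estimate and of each confidence bound, with the bounds swapped. A minimal sketch, with an illustrative function name:

```python
def invert_ratio(ratio, ci_low, ci_high):
    """Re-express a ratio estimate and its CI with the reference group swapped."""
    # Reciprocals reverse the order, so the old upper bound gives the new lower bound.
    return 1.0 / ratio, 1.0 / ci_high, 1.0 / ci_low

# Reported values for CG vs CA.
hr_ca, low_ca, high_ca = invert_ratio(0.645, 0.421, 0.989)
print(f"CA vs CG: HR = {hr_ca:.3f}, 95%CI: {low_ca:.3f}-{high_ca:.3f}")
```

This gives a CA vs CG hazard ratio of roughly 1.55 (95%CI roughly 1.01 to 2.38), which is the same result stated in the direction of the risk haplotype.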

Discussion
It is well known that FFPET-derived DNA has lower extraction efficiency and quality (fragmented DNA) than blood-derived DNA because of partial nucleic acid cross-linking and degradation. However, archived FFPETs provide an invaluable source for molecular genetic studies, with several advantages: they are (i) the only type of sample available for individuals who cannot otherwise provide a DNA sample, (ii) an excellent resource for large-scale retrospective biomarker studies, (iii) available in large numbers together with long-term clinical follow-up data, (iv) a valuable resource with diagnosis and histological identification, and (v) readily available from pathology archives. Recently, FFPET-extracted DNA has been reported to be adequate for genotyping and has been used in biomarker and functional genomics studies [23][24][25][26][27][28][29]. The MALDI-TOF MS method, which offers approximately 100% accuracy for SNP genotyping, is currently considered a gold standard [29,30]. Genotyping of FFPET-derived DNA by MALDI-TOF MS has been shown to be reliable and reproducible. Previous reports showed no allelic frequency differences between FFPET-derived DNA and blood-derived DNA from the same individual using several methods, including MALDI-TOF MS [24][25][26][28]. Our results showed high success rates, ranging between 96.29% and 100% (mean: 99.09%), in accordance with previously reported data [23,24,29].
SNPs are stably inherited, highly abundant and diverse within and among populations, and are therefore thought to be attractive biomarkers. However, the application of individual SNPs has been limited because they have low penetrance and their effects are relatively difficult to identify [5,31]. Therefore, haplotype information has become increasingly important for linking DNA sequence variation with disease [32]. Functional SNPs in HPSE have been reported to be associated with differences in heparanase expression, and heparanase has been shown to be closely involved in the pathological process, progression and outcome of disease [10,22]. However, how to incorporate SNPs into studies of gastric cancer predisposition and prognosis, and how to determine the true associations, remain challenging tasks.
There were no individual SNPs correlated to gastric cancer risk in our results. The associations between four individual SNPs (rs4328905, rs4693608, rs11099592 and rs6856901) and gastric cancer risk with 155 patients and 204 controls reported by Yue et al. [20] were in accordance with our results. Furthermore, all six common haplotypes had no significant differences in gastric cancer risk. This consistency showed SNPs in HPSE had no correlation to the incidence of gastric cancer in ethnic Han northern Chinese, not only from the perspective of individual SNPs, but also from the perspective of haplotypes.
In the present study, genotype rs4364254 TT was correlated with well-differentiated histology. In addition, Ostrovsky et al. [22] found that individuals with genotype TT had relatively high mRNA levels (P = 0.0029). However, conflicting results have been reported on the association between heparanase expression and histological differentiation. Endo et al. [33] found that histological differentiation was worse in heparanase mRNA-positive gastric cancer tissues (p<0.01). Chen et al. [34] found that histologic differentiation was not related to heparanase mRNA expression in gastric cancer (P = 1.000). Ohkawa et al. [35] demonstrated that heparanase expression was stronger in well-differentiated cells (P = 0.0277), a finding that is consistent with our results. Therefore, heparanase might be involved in cell differentiation, but the mechanisms are not clear at present.
In univariate analysis, patients carrying the rs4693608 AA genotype had poor survival (P = 0.049); in multivariate analysis, rs4693608 AA and rs4364254 TT were both significantly correlated with poor survival (P = 0.030 for rs4693608 AA and P = 0.013 for rs4364254 TT). The discrepancy between the univariate and multivariate analyses was possibly due to the weak effect of an individual SNP, which influenced prognosis only when it was considered together with the other SNP, Borrmann type, pT category, and pN category in multivariate analysis. There is ample evidence to suggest that genetic factors contribute to the disease process in common complex trait diseases, but the effect of a single variant is probably small [36]. In addition, Ostrovsky et al. [37] provided the first evidence of a correlation between the functional SNPs rs4693608 and rs4364254 and the risk of developing acute graft-versus-host disease (GVHD), with rs4693608 being the most important. Their results were in accordance with ours. Ostrovsky et al. [22] also reported that both the rs4364254 TT genotype and the rs4693608 AA genotype were correlated with a relatively high mRNA level (P = 0.0029 and 0.004, respectively), which might partially explain the worse prognosis of patients with rs4693608 AA or rs4364254 TT in the present study. These observations are biologically plausible because overexpression of HPSE is closely associated with greater invasiveness of gastric cancer [38][39][40]. The present results support our presumption that SNPs are involved in the regulation of heparanase expression, thereby affecting invasion ability and survival in gastric cancer.
Although neither genotype rs11099592 CC nor genotype rs4693608 AA showed a statistical difference in Borrmann type, haplotype CA, which is composed of these alleles, did show a significant difference. Haplotype CA was more frequent in the group of Borrmann types 3 and 4 (P = 0.037). Patients with haplotype CA may therefore be more likely to develop a poorer gross type. Haplotype CA was also more frequent in the N3 group (P = 0.046, compared to the N0 group). However, there were no significant distribution differences between the N2 and N0 groups or between the N1 and N0 groups. There may be an association between haplotype CA and a greater number of lymph node metastases, but this needs further study. Moreover, patients carrying the CA haplotype also showed poor gastric cancer-specific survival, which was consistent with the differences in Borrmann type and number of lymph node metastases. Furthermore, Ostrovsky et al. [22] reported that the rs11099592 CC genotype and the rs4693608 AA genotype were correlated with high mRNA expression (P = 0.0167 and P = 0.004, respectively), which is in accordance with our results for haplotype CA. The absolute risk associated with each SNP may be low, but combined haplotype analysis may be more helpful in identifying individuals at high risk of disease progression. Specific haplotypes may thus play a significant role in gastric cancer invasion and metastasis and thereby affect prognosis.
A functional haplotype block composed of rs11099592 and rs4693608 was found in our results, and it was associated with Borrmann type, pN category and prognosis of gastric cancer. On the one hand, SNP rs11099592 is an A-G substitution nonsynonymous SNP located in exon 8, and this alteration results in an arginine-to-lysine replacement at position 307, perhaps leading to a functional difference in the protein. On the other hand, SNP rs4693608 is located in intron 3 and showed a correlation with survival. Increasing evidence indicates that genomic variants in non-coding sequences might alter the expression of gene products by changing gene regulation, exon splicing, mRNA stability, activation of cryptic splice sites and so on, and can therefore cause disease phenotypes. In addition, haplotypes may provide more relevant information than individual SNPs [7,41]. Whether gene transcription, maintenance of cellular differentiation and induction of an invasive metastatic phenotype are due to the direct interaction of heparanase with DNA is yet to be demonstrated. Ostrovsky et al. [22] showed an important association between combined genotypes for the rs4693608 and rs4364254 SNPs and heparanase mRNA expression level. They divided all combined genotypes into three subgroups (LR-low expression, MR-intermediate expression, HR-high expression) according to the heparanase mRNA expression level of each genotype, and confirmed significant differences in mRNA levels between carriers of the three subgroups of combined genotypes. Using this subgroup analysis method in a subsequent study, Ostrovsky et al. [37] were the first to find correlations between combined genotypes for the rs4693608 and rs4364254 SNPs and the risk of developing acute GVHD. It is an important and valuable method, and it may also be useful for risk prediction based on the haplotype approach in future clinical practice.
Our future study, which will relate mRNA expression level to genotypes or haplotypes in HPSE in the Han population in northern China, will use this method.
Because the number of wild-type homozygotes was too small for stratified analysis on each genotype of all six SNPs investigated in our study, we could not show results of SNP analysis for each genotype separately; instead, we carried out the analysis with heterozygotes and wild-type homozygotes combined. This was a limitation of this study.
In conclusion, this study evaluated polymorphisms of the HPSE gene in gastric cancer with a MALDI-TOF MS method and archived FFPETs in a large northern Chinese case-controlled cohort. We found a functional haplotype block composed of rs11099592 and rs4693608, which was associated with Borrmann type, pN category and prognosis; and SNP rs4693608, which was included in the block, showed a correlation to survival. These results are supported by associations between SNPs in HPSE and mRNA expression levels reported previously by Ostrovsky et al. [22]. In addition, six individual SNPs and haplotypes were not correlated to gastric cancer risk. These results were consistent with our initial assumption that heparanase was involved in cancer invasion and metastasis and affected prognosis ultimately, but it was not involved in the incidence of cancer.

Sample collection
404 patients with histopathologically confirmed gastric cancer who had received radical surgery between January 1998 and December 2004 were consecutively selected. The patients were from northern China and were believed to be good representatives from this region. 404 normal gastric tissue samples were obtained from a segment of the resected specimens farthest from the tumor (>10 cm) and FFPETs were archived in the Surgical Oncology Department of the First Hospital of China Medical University in northern China. All samples were fixed and embedded under standard clinical histological conditions and were stored at room temperature. Paraffin sections of FFPETs were stained with hematoxylin and eosin (H&E) for pathological inspection to confirm the absence of tumorous tissue. The tumor histological grade was assessed according to World Health Organization criteria and tumors were staged using the 7th edition of the TNM staging of the International Union Against Cancer (UICC)/American Joint Committee on Cancer (AJCC) system (2010) based on postoperative pathologic examination of the specimens. Complete pathological data were obtained including age, gender, date of surgery, location of the primary tumor, histologic grade, venous invasion, lymphovascular invasion, depth of invasion, number of LNs retrieved, number of metastatic LNs, and number of tumor deposits retrieved. Those (i) with synchronous or metachronous malignant tumors, (ii) with distant metastasis found preoperatively, (iii) who underwent preoperative radiotherapy or chemotherapy, or (iv) with incomplete pathological data entries were excluded from this study. Follow-up was completed for the entire study population by January 2010. Two patients died in the postoperative period and 21 patients were lost during follow-up; therefore, 381 patients were included in survival analysis. Median and mean follow-up periods were 90.0 months and 93.3 ± 20.24 months (range: 61-136 months), respectively.
The following data were obtained for all patients: date of death (if applicable), cause of death (if applicable), and date of follow-up. The primary endpoint was cause-specific survival duration from the date of gastric cancer diagnosis to the date of death. The 5-year survival rate of the 404 patients was 54.2%. 404 blood samples were obtained from cancer-free individuals who were randomly selected based on physical examinations from December 2009 to August 2011, as the control group, and this group was believed to be a good representation of the population in the northern China region. The selection criteria were no individual history of cancer, frequency matching to cases on sex and age, and unrelated ethnic Han Chinese origin. The samples (ethylenediaminetetraacetic acid [EDTA] anticoagulated) were stored at −20°C within 30-40 minutes, and then moved to a freezer at −80°C within 2 or 3 days after collection.
The study was approved by the Research Ethics Committee of China Medical University, China. Written informed consent was obtained from all patients before participating in the study.

DNA extraction
Genomic DNA was extracted from FFPET samples in the case group. Sections with a thickness of 8 μm and a surface area of up to 250 mm² were prepared with a microtome and DNA was isolated from 6 to 12 sections, depending on the tissue size and cell counts. The microtome was cleaned and blades were changed to avoid intersample contamination. DNA extraction from FFPETs was performed with the QIAamp® DNA FFPE Tissue Kit (Qiagen, Hilden, Germany) [29], following the procedures described by the manufacturer, including (i) dissolve paraffin in xylene and remove, (ii) lyse sample under denaturing conditions with proteinase K, (iii) reverse the formalin crosslinking by incubation at 90°C, (iv) bind DNA to the membrane and allow contaminants to flow through, (v) wash residual contaminants, and (vi) elute pure and concentrated DNA from the membrane (with tris-EDTA buffer [TE]). About 2-10 μg of DNA was recovered in 50 μl final solution and was stored at −80°C. Genomic DNA was extracted from blood samples from the control group with the Universal Genomic DNA Extraction Kit Ver.3.0 (TAKARA) according to the manufacturer's instructions. About 2-6 μg of DNA was recovered in TE and was stored at −80°C.

Selection of SNPs and genotyping
The study included six SNPs in HPSE, which were taken from the NCBI SNPs database (http://www.ncbi.nlm.nih.gov/snp) and the HapMap database (the Phase III database) (http://hapmap.ncbi.nlm.nih.gov/index.html.zh). These SNPs were mapped in the HPSE gene (Figure 2). Among all coding region SNPs (cSNPs) in HPSE registered in the databases, rs11099592 was the only one that both had a minor allele frequency (MAF) >1% and was polymorphic in the Han China Beijing (HCB) population. The other five SNPs were located in intronic and 3'-UTR regions. Furthermore, other investigators have shown that rs11099592, rs4693608, and rs4364254 were correlated with heparanase mRNA expression [22]. Also, the associations between individual SNPs or haplotypes in HPSE and susceptibility, clinicopathological parameters and prognosis of tumors reported in these articles were complex, but mostly concentrated on the six SNPs we selected [17][18][19][20][21][22].
SNPs were genotyped using the MALDI-TOF MS system (MassARRAY; Sequenom, San Diego, CA, USA) with primers and probes (Table S6) as previously described [29,42]. To ensure the typing quality, 1% positive samples (YanHuang cell strain) were incorporated into every genotyping plate to validate the reliability of the primers and 1% negative samples (water with no DNA) to monitor contamination. 5% random samples were tested in duplicate by different persons and the reproducibility was 100%. The laboratory personnel were blinded to the sample arrangement during the process. There were six steps: PCR amplification, shrimp alkaline phosphatase treatment, base extension, salt removal with resin, SpectroCHIP dispensing (Sequenom, San Diego, CA, USA), and data acquisition with MALDI-TOF MS according to Justenhoven et al. [43]. Finally, data analysis was performed using MassArray Typer Analyzer software 4.0.4.20 (Sequenom, San Diego, CA) [44].

LD block determination and haplotype construction
Haploview 4.2 software was used to evaluate LD and construct haplotypes [31]. LD between the six SNPs used in haplotype analysis was measured by a pairwise D' statistic. The structure of the LD block was examined using the method of Gabriel et al. [45], using the 80% confidence bounds of D' to define sites of historical recombination between SNPs. Haplotypes were constructed from genotype data in the full-size case-control panel within blocks by using an accelerated expectation-maximization algorithm method [46]. Briefly, this method creates highly accurate population frequency estimates of the phased haplotypes based on the maximum likelihood as determined from unphased input [47].
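The idea behind expectation-maximization haplotype estimation and the pairwise D' statistic can be illustrated for the simplest case of two biallelic SNPs. The following is a simplified sketch and not the accelerated algorithm implemented in Haploview; it also omits the confidence bounds on D' used for block definition. Genotypes are coded as the number of copies (0, 1 or 2) of an arbitrarily chosen allele at each SNP, and the function names and example data are hypothetical.

```python
from collections import Counter

HAPLOTYPES = [(0, 0), (0, 1), (1, 0), (1, 1)]
ALLELES = {0: (0, 0), 1: (0, 1), 2: (1, 1)}  # genotype code -> two alleles

def em_haplotype_freqs(genotypes, max_iter=500, tol=1e-12):
    """Estimate two-SNP haplotype frequencies from unphased genotypes by EM."""
    counts = Counter(genotypes)
    n_chrom = 2 * len(genotypes)
    # Haplotypes whose phase is unambiguous (at most one heterozygous SNP).
    known = {h: 0.0 for h in HAPLOTYPES}
    for (g1, g2), c in counts.items():
        if (g1, g2) == (1, 1):
            continue  # double heterozygotes are handled in the E-step
        for hap in zip(ALLELES[g1], ALLELES[g2]):
            known[hap] += c
    n_double_het = counts[(1, 1)]
    freqs = {h: 0.25 for h in HAPLOTYPES}
    for _ in range(max_iter):
        # E-step: probability that a double heterozygote is 00/11 rather than 01/10.
        cis = freqs[(0, 0)] * freqs[(1, 1)]
        trans = freqs[(0, 1)] * freqs[(1, 0)]
        w = cis / (cis + trans) if cis + trans > 0 else 0.5
        # M-step: recount haplotypes using the expected phase assignments.
        new = dict(known)
        new[(0, 0)] += n_double_het * w
        new[(1, 1)] += n_double_het * w
        new[(0, 1)] += n_double_het * (1.0 - w)
        new[(1, 0)] += n_double_het * (1.0 - w)
        new = {h: v / n_chrom for h, v in new.items()}
        converged = max(abs(new[h] - freqs[h]) for h in HAPLOTYPES) < tol
        freqs = new
        if converged:
            break
    return freqs

def d_prime(freqs):
    """Absolute value of D' computed from two-SNP haplotype frequencies."""
    p1 = freqs[(1, 0)] + freqs[(1, 1)]  # allele 1 frequency at SNP 1
    q1 = freqs[(0, 1)] + freqs[(1, 1)]  # allele 1 frequency at SNP 2
    d = freqs[(1, 1)] - p1 * q1
    if d >= 0:
        d_max = min(p1 * (1.0 - q1), (1.0 - p1) * q1)
    else:
        d_max = min(p1 * q1, (1.0 - p1) * (1.0 - q1))
    return abs(d) / d_max if d_max > 0 else 0.0

# Hypothetical genotypes in which only haplotypes 00 and 11 are compatible
# with the homozygous individuals, so the two SNPs are in complete LD.
example = [(0, 0)] * 10 + [(1, 1)] * 20 + [(2, 2)] * 10
freqs = em_haplotype_freqs(example)
dp = d_prime(freqs)
print({h: round(f, 4) for h, f in freqs.items()}, round(dp, 4))
```

In this example the ambiguous double heterozygotes are resolved toward the two haplotypes that are already observed unambiguously, which is the behavior that makes EM estimates useful for unphased case-control data.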

Statistical analysis
Statistical analysis was undertaken using the PASW Statistics 18.0 software (SPSS, Inc., Somers, NY, USA). A two-sided chi-square (χ2) test was used to estimate population distribution characteristics, compare differences in allelic and genotypic frequencies between cases and controls and assess associations between individual SNPs and clinicopathological parameters. A permutation procedure (1,000 tests) was used to correct the P value of single-locus association results. Odds ratios (OR) and confidence intervals (CI; 95%) were calculated by unconditional logistic regression to analyze the association between genotype frequencies and gastric cancer risk, and were adjusted for sex and age. Univariate and multivariate survival analyses were done with the log-rank test and the Cox proportional hazards model using the clinicopathological parameters and SNPs. This resulted in the identification of covariates that significantly correlated with survival of the patients. Multivariate survival analysis was carried out by separately adding the SNP variables to all the clinicopathological parameters. The Kaplan-Meier method was used to plot survival curves. The Haploview 4.2 software package was used to estimate pairwise linkage disequilibrium (LD), detect departure from Hardy-Weinberg equilibrium, construct haplotypes, calculate haplotype frequencies and estimate associations between haplotypes and gastric cancer risk. The Haplo.stats software was used to assess associations between haplotypes and clinicopathologic features [31]. The THESIAS software, which performs haplotype-based association analysis by Cox proportional hazards survival regression using the Stochastic-EM algorithm, was used for survival analysis of haplotypes [48]. All tests were two-tailed and P<0.05 was considered statistically significant.
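The Kaplan-Meier method mentioned above estimates survival as a running product over the distinct event times, with censored patients leaving the risk set without contributing an event. The following is a minimal sketch of the product-limit estimator, not the implementation in the statistical package used here; the function name and the follow-up data are hypothetical.

```python
def kaplan_meier(times, events):
    """Product-limit survival estimate.

    times:  follow-up duration for each patient (e.g. in months)
    events: 1 if the patient died of the disease at that time, 0 if censored
    Returns a list of (time, survival probability) at each distinct event time.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all patients whose follow-up ends at the same time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths > 0:
            survival *= 1.0 - deaths / n_at_risk
            curve.append((t, survival))
        # Both deaths and censored patients leave the risk set after time t.
        n_at_risk -= leaving
    return curve

# Hypothetical follow-up times in months, for illustration only.
months = [5, 8, 12, 12, 20]
died = [1, 0, 1, 1, 0]
km_curve = kaplan_meier(months, died)
for t, s in km_curve:
    print(f"t = {t} months, S(t) = {s:.3f}")
```

Plotting the returned points as a step function gives the survival curve; curves for different genotype or haplotype groups would be obtained by calling the function on each group separately.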