Novel DNA Variants and Mutation Frequencies of hMLH1 and hMSH2 Genes in Colorectal Cancer in the Northeast China Population

Research on hMLH1 and hMSH2 mutations tends to focus on Lynch syndrome (LS) and LS-like colorectal cancer (CRC). No studies to date have assessed the role of hMLH1 and hMSH2 genes in mass sporadic CRC (without preselection by MSI or early age of onset). We aimed to identify novel hMLH1 and hMSH2 DNA variants, and to determine the mutation frequencies and sites in both sporadic and LS CRC and their relationships with clinicopathological characteristics of CRC in Northeast China. A total of 452 sporadic and 21 LS CRC patients were screened for germline and somatic mutations in hMLH1 and hMSH2 genes by PCR–SSCP and sequencing. We identified 11 hMLH1 and seven hMSH2 DNA variants in our study cohort. Six of them were novel: four in hMLH1 gene (IVS8-16 A>T, c.644 GAT>GTT, c.1529 CAG>CGG and c.1831 ATT>TTT) and two in hMSH2 gene (−39 C>T; insertion AACAACA at c.1127 and deletion AAG at c.1129). In sporadic CRC, germline and somatic mutation frequencies of hMLH1/hMSH2 gene were 15.59% and 17.54%, respectively (p = 0.52). Germline mutation frequencies in hMLH1 and hMSH2 genes were 5.28% and 10.78%, respectively (p<0.01). Somatic mutation frequencies in hMLH1 and hMSH2 genes were 6.73% and 11.70%, respectively (p = 0.02). In LS CRC, both germline and somatic mutation frequencies of hMLH1/hMSH2 gene were 28.57%. The most prevalent germline mutation site in hMSH2 gene was c.1168 CTT>TTT (3.90%), a polymorphism. Somatic mutation frequency of hMLH1/hMSH2 gene differed significantly among proximal colon, distal colon and rectal cancer (p = 0.03). Our findings elucidate the mutation spectrum and frequency of hMLH1 and hMSH2 genes in sporadic and LS CRC, and their relationships with clinicopathological characteristics of CRC.


Introduction
Colorectal cancer (CRC) is one of the most common malignancies globally, and ranks fifth among all cancers in China. The World Health Organization estimates that 220,000 new CRC cases occurred in China in 2008 (GLOBOCAN, 2008). The incidence of CRC increased by 5.73% per year between 1992 and 2005 (from 13.06 to 23.54 per 100,000) in Nangang District, Harbin, China [1].
One of the genetic pathways in the development of CRC is failure of the DNA mismatch repair (MMR) system [2], which contributes to the maintenance of genomic stability by recognizing and removing insertion/deletion mutations that occur during DNA replication [3]. The two main mismatch repair genes are hMLH1 and hMSH2, which map to chromosomes 3p21.3-23 [4] and 2p21-22 [5], respectively.
Since the first report of hMLH1 and hMSH2 gene mutations in Lynch syndrome (LS) CRC [4,5], a number of studies on hMLH1 and hMSH2 gene mutations have been published. However, the majority of the published papers focused on LS or LS-like CRC. In total, 30 studies with small sample sizes (n = 5-61, except for one of 315 patients) have screened germline mutations in hMLH1 and hMSH2 genes in sporadic CRC. Pathological mutations of hMLH1 and hMSH2 genes were more likely to be present in younger patients [6], and in those with microsatellite instability (MSI). In our analysis of these 30 studies, MSI or early-age onset (under the age of 40, 45, 50 or 55 years) was used to preselect patients for hMLH1 and hMSH2 gene mutations in sporadic CRC. However, no study aimed to detect mutation frequencies of hMLH1 and hMSH2 genes in mass sporadic CRC without MSI or age preselection. In China, four studies (n = 26-58) screened germline or somatic mutations of hMLH1 and hMSH2 genes in sporadic CRC with preselection by MSI [7,8,9,10]. Whether high frequencies of hMLH1 and hMSH2 gene mutations occur in sporadic CRC in China has not been elucidated. Moreover, strong evidence suggests that rare mutations of severe effect are responsible for a substantial portion of complex human cancer [11]. We therefore conducted this study to identify novel hMLH1 and hMSH2 DNA variants, to determine the mutation frequencies and sites in both sporadic and LS CRC, and to estimate the relationships between germline and somatic mutations of hMLH1/hMSH2 gene and clinicopathological characteristics of CRC in Northeast China.

Subjects
After obtaining informed consent from study subjects, and approval from Institutional Research Board of Harbin Medical University, we identified CRC patients who underwent surgery at the Cancer Hospital and the Second Affiliated Hospital of Harbin Medical University, without preselection and based on pathologic diagnosis alone. Patients with neuroendocrine carcinoma, malignant melanoma, non-Hodgkin's lymphoma, gastrointestinal stromal tumors, and metastatic colorectal carcinoma were excluded from the analysis. From June 1, 2004

DNA Extraction
DNA was successfully extracted from all 457 blood samples (436 sporadic CRC and 21 LS CRC) and 356 tumor tissues (342 sporadic and 14 LS) using the classical phenol-chloroform procedure [12].
In the collection of blood and tissue samples and DNA extraction, we could not obtain the tumor tissue DNA of 117 CRC patients (110 sporadic and 7 LS) due to that the tumor    (Table 1), including exon-intron boundaries, were synthesized for genomic PCR. PCR amplifications were performed using the following protocol for 35 cycles: denaturation for 30 s at 95°C, annealing for 30 s at 54°C to 64°C, extension for 30 s at 72°C, followed by a final extension for 5 min at 72°C (ABI 9700). PCR products were identified by 1% agarose electrophoresis (Biowest Agarose, Gene Company Ltd). PCR products were denatured at 98°C for 8 min and placed on ice. Electrophoresis was performed on 8% to 15% nondenaturing polyacrylamide gels. After electrophoresis, gels were stained with silver (Refined Chemical Plant, Shanghai, China). 15% of the samples were replicated in detecting mutations of every amplified PCR fragment in the PCR-SSCP analysis, with the concordance rate ranging from 99.1% to 100% for various amplified PCR fragments.
PCR products showing abnormal mobility under SSCP analysis were sequenced on an ABI 3730XL instrument. Sequencing results were analyzed for gene mutations with Chromas 2.22 software (Technelysium Pty. Ltd., QLD, Australia).

Assessment of Mutation Pathogenicity
For previously reported mutations, results of function verification were used to determine pathogenicity. If no function verification was reported, function prediction by any two of the PolyPhen/SIFT/MAPP-MMR results was used to determine their pathogenicity.
For the novel DNA variants, the pathogenicity of base substitutions in exons was predicted by the PolyPhen program [13] and MAPP-MMR [14]. Base insertions, deletions and substitutions in the promoter, introns or 3′UTR were assessed by published criteria to determine potential pathogenicity [15]. We also screened 100 healthy controls for the novel DNA variants to help determine potential pathogenicity.
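The two-of-three rule described above for previously reported mutations lacking functional verification can be sketched as a simple vote. This is a minimal illustration rather than the authors' actual procedure: the function name `classify_by_two_of_three`, the encoding of each tool's call as True/False/None, and the three output labels are all assumptions, and the source does not state how missing or discordant predictions were handled.

```python
from typing import Optional


def classify_by_two_of_three(polyphen: Optional[bool],
                             sift: Optional[bool],
                             mapp_mmr: Optional[bool]) -> str:
    """Combine three in silico predictions by a two-of-three vote.

    Each argument is True (predicted damaging), False (predicted benign)
    or None (no prediction available). The label strings and the
    treatment of None are assumptions made for illustration only.
    """
    calls = [polyphen, sift, mapp_mmr]
    damaging = sum(call is True for call in calls)
    benign = sum(call is False for call in calls)
    if damaging >= 2:
        return "pathogenic"
    if benign >= 2:
        return "non-pathogenic"
    # Fewer than two tools agree, e.g. because one prediction is missing.
    return "uncertain"


# Hypothetical inputs, not results from the study.
print(classify_by_two_of_three(True, True, False))
print(classify_by_two_of_three(True, None, False))
```

With three available predictions the vote always resolves one way or the other; the "uncertain" label only arises when a tool returns no prediction and the remaining two disagree.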

Mutations
We also identified two polymorphisms. c.655 ATC>GTC was reported to be a common polymorphism in Caucasians [23,24,25], while c.1151 GTT>GAT was reported to be more common in Asian populations [26]. Therefore, we did not categorize them as mutations in our study.
Mutations in hMSH2 gene. We identified seven hMSH2 DNA variants. Insertion AACAACA at c.1127 and deletion AAG at c.1129 was a somatic DNA variant; the other six DNA variants were both germline and somatic variants. Two DNA variants (−39 C>T; insertion AACAACA at c.1127 and deletion AAG at c.1129) were newly detected in this study (Figure 2 and Table 2). Neither of the two novel DNA variants was detected in 100 healthy controls. The pathogenicity of the two DNA variants was uncertain. Five other mutations (c.23 ACG>ATG, c.471 GGC>GGA, c.505 ATA>GTA, c.1168 CTT>TTT and c.1886 CAA>CGA) were previously reported in the InSiGHT database [14,27].
Two male patients carried somatic mutations in both hMLH1 and hMSH2 genes. Another male patient carried the c.1831 ATT>TTT mutation of the hMLH1 gene and the c.23 ACG>ATG mutation of the hMSH2 gene in both tumor tissues and blood.
Germline mutation frequency was not significantly different from somatic mutation frequency in either the hMLH1 or the hMSH2 gene (p = 0.49 and p = 0.69, respectively).
Mutation frequencies in LS CRC patients. Among 21 blood DNA samples of LS CRC patients, one (4.76%) patient carried a germline mutation of hMLH1 and five (23.81%) patients carried germline mutations in hMSH2. Overall, six (28.57%) patients exhibited germline mutations of the hMLH1/hMSH2 gene.
Tumor tissues were available for only 14 LS CRC patients; one (7.14%) patient carried a somatic mutation in hMLH1 and three

The Relationships between Germline and Somatic Mutations of hMLH1/hMSH2 Gene and Clinicopathological Characteristics of CRC
Somatic mutation frequency of hMLH1/hMSH2 gene was 22.7% (15/66) in proximal colon cancer, 17.7% (11/62) in distal  (Tables 4 and 5). Germline and somatic mutation frequencies of hMLH1/hMSH2 gene were not significantly different across the other clinicopathological characteristics (age, gender, BMI, Dukes stage, histotype, pathological type, degree of differentiation and tumor size) of CRC.
Because there were few LS CRC patients, we did not analyze the relationships between germline and somatic hMLH1/hMSH2 gene mutations and clinicopathological characteristics of LS CRC.

Discussion
Under the common disease-rare variant model [28,29], we screened for rare variants of hMLH1 and hMSH2 genes in sporadic and LS CRC. We identified 18 types of DNA variants in our study. Six were novel DNA variants and 12 have been previously reported. Of the six novel DNA variants, four were in hMLH1 and two in hMSH2.
Two of the four novel hMLH1 DNA variants, p.Asp235Val (c.644 GAT>GTT) and p.Gln510Arg (c.1529 CAG>CGG), both lead to amino acid polarity changes, which may affect the structure of the hMSH2 binding domain and the hPMS2/hPMS1 binding domain of the hMLH1 gene product, respectively, and cause dysfunction of the DNA MMR system. Another DNA variant, p.Ile611Phe (c.1831 ATT>TTT), leads to no amino acid polarity change in the hPMS2/hPMS1 binding domain of the hMLH1 gene product and may have no effect on the function of the DNA MMR system [30]. IVS8-16 A>T is predicted to have no effect on splicing of exon 9.
One of the two novel hMSH2 DNA variants, −39 C>T, was a variant in the 5′UTR, which may affect mRNA transcription. The other variant, c.1127 ins AACAACA and c.1129 del AAG, was a frameshift mutation, which may affect the hMSH6 binding domain and hMutL homolog interaction of the hMSH2 gene product and cause dysfunction of the DNA MMR system [30].
Failure of the DNA MMR system is one of the genetic pathways in the development of CRC [2]. However, according to the criteria for assessing mutation pathogenicity, one novel DNA variant, c.1529 CAG>CGG, was predicted to be non-pathogenic, and the pathogenicity of the other five novel DNA variants was uncertain. Therefore, we cannot elucidate the role of these novel DNA variants of hMLH1 and hMSH2 genes in the occurrence and development of CRC.
Because c.1168 CTT>TTT of hMSH2 was relatively prevalent in both LS (14.29%, 3/21) and sporadic (3.90%, 17/436) CRC, we screened for it in healthy controls. Its frequency in healthy controls was 4.16% (21/505), which was not significantly different from that in CRC (p = 0.84). This particular mutation was also reported as a polymorphism in Korea by Kim et al., who did not detect a significant difference between cases and controls [26].
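The case-control comparison for c.1168 can be illustrated with a 2x2 test on the counts quoted above. The source does not say which test produced p = 0.84, nor whether the LS cases were pooled with the sporadic cases, so this sketch makes two assumptions: an uncorrected Pearson chi-square test, and a comparison of sporadic CRC (17/436) against healthy controls (21/505). The function name `pearson_chi2_2x2` is introduced here for illustration only.

```python
import math


def pearson_chi2_2x2(a: int, b: int, c: int, d: int) -> tuple[float, float]:
    """Uncorrected Pearson chi-square test for the 2x2 table [[a, b], [c, d]].

    Returns (chi2, p). With one degree of freedom, the upper-tail
    probability of the chi-square distribution equals erfc(sqrt(chi2 / 2)).
    """
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    p = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p


# Rows: sporadic CRC, healthy controls; columns: carriers, non-carriers.
chi2, p_value = pearson_chi2_2x2(17, 436 - 17, 21, 505 - 21)
print(f"chi2 = {chi2:.3f}, p = {p_value:.2f}")
```

Under these assumptions the sketch gives p of about 0.84, in line with the reported value, which suggests (but does not confirm) that an uncorrected chi-square test on the sporadic cases was used.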
A significant association was observed only between somatic hMLH1/hMSH2 gene mutations and tumor location of sporadic CRC (p = 0.03). The somatic mutation frequency of hMLH1/hMSH2 gene was highest in rectal cancer, followed by proximal colon cancer, and lowest in distal colon cancer. The non-pathogenicity or uncertain pathogenicity of the variants may explain the non-significant association between hMLH1/hMSH2 gene mutations and other clinicopathological characteristics of sporadic CRC.
All the published studies detected germline or somatic mutations in sporadic CRC with preselection (MSI, early-onset age, or TGFβ RII mutation) [36], which could explain the higher mutation frequency in the published individual studies and meta-analyses of previously published studies. In addition, the small sample size in those published studies may also contribute to the inconsistent results.
Only one study in Asia detected somatic mutations of hMLH1 and hMSH2 genes in 31 sporadic CRC patients without preselection [37]. The largest study detecting germline mutations was of 315 European BG-CRC patients under the age of 55; the mutation frequency of hMSH2 was found to be 0.32% (1/315, uncertain pathogenicity), whereas no mutation in hMLH1 was detected [39].
Five Asian studies detected somatic mutations of hMLH1 or hMSH2 in LS CRC [7,41,42,43,44]. Upon meta-analysis, the pooled somatic mutation frequencies in hMLH1 and hMSH2 genes were 9.57% (95% CI: 1.36-44.73%) and 25.65% (95% CI: 10.30-50.89%), respectively. In our study, the somatic mutation frequency of hMSH2 in LS CRC was 14.29% (2/14) (excluding the polymorphic mutation, c.1168 CTT>TTT). However, no somatic mutations in hMLH1 exons were found in LS CRC, similar to the two Japanese studies [41,43]. The somatic mutation frequency of hMSH2 in LS CRC varied from 5.88% to 58.33% in the five published Asian studies. Small sample sizes may explain the variation in mutation frequency in LS CRC.
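The point about small samples can be made concrete by putting a confidence interval around the 2/14 estimate. The sketch below uses the Wilson score interval, which is a choice made here for illustration: the source reports no interval for this figure, and the pooled intervals quoted above were presumably obtained by a different (meta-analytic) method. The function name `wilson_interval` is likewise an assumption.

```python
import math
from statistics import NormalDist


def wilson_interval(successes: int, n: int,
                    confidence: float = 0.95) -> tuple[float, float]:
    """Wilson score confidence interval for a binomial proportion."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # about 1.96 for 95%
    p = successes / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return center - half_width, center + half_width


# Somatic hMSH2 mutations in LS CRC tumor tissue: 2 of 14 patients.
low, high = wilson_interval(2, 14)
print(f"14.29% (95% CI: {100 * low:.1f}%-{100 * high:.1f}%)")
```

With only 14 tumor samples the interval spans roughly 4% to 40%, so frequencies as different as those reported across the Asian studies are compatible with sampling variation alone.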
In conclusion, we identified six novel DNA variants (four in hMLH1 and two in hMSH2). In sporadic CRC, germline and somatic mutation frequencies of hMLH1/hMSH2 gene were 15.59% and 17.54%, respectively. The prevalence of germline mutations was 5.28% in hMLH1 and 10.78% in hMSH2. The somatic mutation frequencies in hMLH1 and hMSH2 genes were 6.43% and 11.70%, respectively. In LS CRC, both germline and somatic mutation frequencies of hMLH1/hMSH2 gene were 28.57%. The most prevalent germline mutation site in hMSH2 gene was c.1168 CTT>TTT (3.90%), a polymorphism. Somatic mutation frequency of hMLH1/hMSH2 gene was significantly different in proximal colon cancer, distal colon cancer and rectal cancer.
Our findings could help to elucidate the DNA variant spectrum and frequency of the hMLH1 and hMSH2 genes in CRC patients, especially sporadic CRC patients in China, and their relationships with clinicopathological characteristics of sporadic CRC. Functional studies to determine how these novel DNA variants affect protein function are required.