Variants in FAT1 and COL9A1 genes in male population with or without substance use to assess the risk factors for oral malignancy

Although a number of genetic variants have been suggested to be associated with oral malignancy, few have been replicated. The aim of this study was to identify significant variants that enhance personal risk prediction for oral malignancy. A total of 360 patients diagnosed with oral squamous cell carcinoma, 486 controls and 17 newly diagnosed patients with oral potentially malignant disorders (OPMD), including leukoplakia or oral submucous fibrosis, were recruited. Fifteen tagSNPs derived from somatic mutations were genotyped and examined for association with the occurrence of oral malignancy. Environmental variables together with the SNP data were used to develop risk prediction models for oral malignancy occurrence. Stepwise model analysis was conducted to fit the best model in an economically efficient way. Two tagSNPs, rs28647489 in the FAT1 gene and rs550675 in the COL9A1 gene, were significantly associated with the risk of oral malignancy. For predicting oral squamous cell carcinoma occurrence with the combined genetic variants, betel quid use, alcohol use and age, the sensitivity and specificity were 85.7% and 85.5%, respectively (area under the receiver operating characteristic curve (AUC), 0.91). The AUC for OPMD was only 0.69. The predictive probability of squamous cell carcinoma occurrence increased with the genetic risk score from 10% to 43% without substance use and from 73% to 92% with substance use. Genetic variants, with or without substance use, may enhance risk prediction for oral malignancy occurrence in the male population. The prediction model may be useful as a clinical index for oral malignancy occurrence and its risk assessment.


PLOS ONE | https://doi.org/10.1371/journal.pone.0210901 | January 18, 2019

Introduction
Oral squamous cell carcinoma (OSCC) is a growing public health problem worldwide [1]. Different anatomical sites of the oral cavity show different incidence rates of OSCC and call for different treatment strategies, with either a single treatment or a combination of surgery, radiotherapy and chemotherapy [2]. The disease continues to have a poor prognosis, with a 5-year survival rate of <50% [3]. Survival improves with early detection, which underlines the importance of early prevention of OSCC in reducing morbidity and mortality [4]. With increased understanding of the genetic and environmental risk factors in oral tumorigenesis, novel approaches have been developed for the prevention, early detection, risk stratification and treatment of OSCC. The major risk factors associated with the occurrence of oral malignancy are genetic risk factors [5,6] and environmental risk factors, including betel quid (BQ) chewing and cigarette and alcohol consumption [7,8]. Genetic and environmental risk factors have interactive effects on the occurrence of oral malignancy [9].
Oral potentially malignant disorders (OPMD) are early clinical lesions that are thought to undergo histopathological and molecular changes resulting in invasive oral cancer [10,11]. OPMD can be detected visually in the oral cavity and represent a well-established precancerous stage. Like OSCC, OPMD are primarily caused by risk exposures such as tobacco smoking, betel quid chewing and alcohol [12,13]. Inter-individual and inter-population differences in risk [11] could be partially explained by different distributions of genetic variants, because personal variation in the ability to metabolize carcinogens and to repair DNA damage effectively may be caused by genetic factors [14,15]. Identifying genetic factors that render individuals susceptible to OPMD could have practical significance in identifying potential biomarkers for earlier identification of OSCC [16]. Genetics is an important risk factor for oral malignancy occurrence [5,6], but whether genetic information can improve the prediction of OSCC risk remains unclear. Therefore, we identified significant variants and developed a model that integrates genetic profiling and environmental factors to detect groups at high risk of oral malignancy. Since OPMD require long-term monitoring to assess their risk of progressing to OSCC, the prediction model may be useful for identifying different risk groups, suggesting interventions and, increasingly, reducing the risk of OSCC.

Study population
A total of 360 newly diagnosed patients with OSCC and 486 controls were recruited from the Department of Dentistry and the Department of Otorhinolaryngology, Kaohsiung Medical University Hospital (KMUH) in southern Taiwan, and from Changhua Christian Hospital (CCH) in central Taiwan. The control samples included patients with eye problems (cataract and glaucoma), patients with bone fractures, and subjects undergoing physical checkups at KMUH and CCH. Another 161 subjects, comprising 17 newly diagnosed patients with OPMD (leukoplakia or oral submucous fibrosis) and 144 subjects without oral malignancy, were recruited from a cancer screening program run by China Medical University Hospital (CMUH). The cancer screening program is a community-based program for substance users that aims to detect oral potentially malignant disorders and oral cancer early. Data on sociodemographic factors, anthropometric parameters, cigarette smoking, alcohol drinking and BQ chewing habits, medical history, and current medications were obtained by interviewing the subjects. BQ chewers were defined as subjects who had consumed at least one quid of any type of betel or areca nut product per day for a minimum of 6 months in their lifetime, and current BQ chewers were defined as participants who had chewed these products within the year prior to the interview. Alcohol and cigarette use were defined using the same criteria as BQ use. An individual who did not use alcohol, BQ or cigarettes was defined as a nondrinker, nonchewer or nonsmoker, respectively. Detailed patterns of BQ, alcohol and tobacco use, comprising addiction type, age at initial use, daily consumption, frequency of usage, years of substance use and achievement of abstinence, were described in a previous paper [17]. This study was approved by the Institutional Review Boards of CMUH, KMUH and CCH, the informed consent committee on human subjects, and the biospecimen utilization committee.

Selection of tag SNPs of susceptibility candidate genes
Although a number of genetic variants have been suggested to be associated with OSCC by genome-wide association studies, few have been replicated across different populations. Attention has therefore turned to the somatic mutations reported in the Cancer Genome Atlas, which has revealed promising genes associated with the initiation and progression of OSCC [18]. We evaluated and nominated promising driver genes related to OSCC based on the Cancer Genome Atlas and on a mutation rate greater than 20% in next-generation sequencing studies [9,19-21]. These genes included TP53, CASP8, FAT1, NOTCH1 and COL9A1. Variants associated with cancer occurrence may be related to the acquired somatic mutations that drive cancer development. Based on linkage disequilibrium patterns in Han Chinese, fifteen SNPs located near somatic mutations in the 5 genes, each with a minor allele frequency greater than 5%, were selected from the HapMap database. SNPs were genotyped using the Sequenom MassARRAY System (Sequenom, San Diego, CA, USA) at the Academia Sinica National Genotyping Centre (Taipei, Taiwan). The SNPs significantly associated with OSCC were further genotyped and examined for association in the subjects recruited from the OPMD screening program. To keep the number of variables to a minimum while reaching high predictive power in the statistical models, we conducted stepwise model analysis to select the risk factors for the best-fit model.

Statistical analysis
Statistical analysis was performed using SAS 9.2 software. All genotype frequencies in the control population were tested for Hardy-Weinberg equilibrium by comparing the observed and expected numbers of each genotype with the χ² test. Hardy-Weinberg equilibrium was assumed for a P value greater than 0.05. Student's t-test was applied to compare the age difference, whereas the Pearson χ² test or Fisher's exact test was used to determine the differences in gender distribution and SNP genotype frequencies between case and control subjects. The false discovery rate (FDR) method was used to correct for multiple testing in the genetic association analyses. Logistic regression analysis [22] was performed to estimate the association between SNPs and OSCC. All tests were two-tailed, and a P value of <0.05 was considered statistically significant. The odds ratios (OR) and the corresponding 95% confidence intervals (95% CI) were estimated by logistic regression. We calculated genetic risk scores (GRS) using risk allele frequencies, assuming independence of additive risks. Genetic risk scores were calculated from FAT1 and COL9A1. A score of 1 was given to each T allele of COL9A1-rs550675 and each G allele of FAT1-rs28647489. A multinomial logistic regression model that included a GRS-by-BQ-use interaction term and was adjusted for covariates was applied to examine the association between OSCC risk and genetic variants. We employed ROC curves to compare the diagnostic ability of the models for OSCC occurrence. Because of genetic influences on lifespan, we used age as follow-up years. Risk scores for oral malignancy were developed using Cox proportional hazards models.
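The Hardy-Weinberg check described above can be illustrated with a minimal sketch. This is not the authors' SAS code; the function name `hwe_chi_square` and the genotype counts in the usage note are hypothetical, and the sketch assumes a biallelic SNP with both alleles present in the sample.

```python
import math

def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square goodness-of-fit test (1 df) for Hardy-Weinberg equilibrium.

    n_aa, n_ab, n_bb are observed genotype counts (hypothetical inputs).
    Assumes both alleles are present, so no expected count is zero.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)      # frequency of allele A
    q = 1.0 - p                          # frequency of allele B
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Upper tail of the chi-square distribution with 1 degree of freedom
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value
```

For example, `hwe_chi_square(25, 50, 25)` describes a sample in exact equilibrium, whereas a sample with no heterozygotes such as `hwe_chi_square(50, 0, 50)` departs strongly from it. Under the criterion stated in the text, equilibrium would be assumed when the returned P value exceeds 0.05.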

Results
The proportion of males was 95.3% in the OSCC case group and 97.4% in the control group; the difference was not significant (P = 0.1037). All participants in the OPMD screening group were male. The mean age was 54.2 years (SD, 10.5) in the OSCC cases and 51.7 years (SD, 13.4) in the controls (P = 0.01). Mean age did not differ significantly between the OPMD and normal groups (48.71 (9.31) vs. 47.12 (12.79) years; P = 0.7432).
Substance use was more common in the OSCC group than in the control group (all P < 0.0001). Of the OSCC cases, 80.6% chewed betel quid, 86.9% smoked cigarettes and 68.1% drank alcohol. Of the controls, 13.4% chewed betel quid, 51% smoked cigarettes and 26.9% drank alcohol (P < 0.0001; Table 1). The distribution of substance use in the OPMD screening program was similar to that in the OSCC group. The use of BQ, alcohol and cigarettes did not differ significantly between the OPMD and normal groups (right panel of Table 1). Table 2 and S1 Table list the genotype and allele frequencies of the individual SNPs in OSCC cases and control subjects. A total of 15 SNPs in 5 candidate genes were examined for association with the occurrence of OSCC. Two SNPs, rs28647489 in the FAT1 gene and rs550675 in the COL9A1 gene, were associated with a higher risk of OSCC. Compared with carriers of the wild-type genotype, patients carrying the variant GA and GG genotypes of rs28647489 had an increased risk of OSCC (OR = 2.06; 95% CI, 1.05-4.05 and OR = 2.85; 95% CI, 1.19-6.84, respectively). The CT and TT genotypes of rs550675 had additive effects on the risk of OSCC (OR = 1.31; 95% CI, 1.01-1.65 and OR = 2.63; 95% CI, 1.47-4.72, respectively; Table 2).
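As a rough illustration of how an odds ratio and its 95% confidence interval arise from genotype counts, the sketch below computes an unadjusted OR with a Wald interval from a 2x2 table. The study estimated ORs by logistic regression, so this is only the unadjusted analogue; the function name `odds_ratio_ci` and the counts in the usage note are hypothetical and are not taken from Table 2.

```python
import math

def odds_ratio_ci(case_exposed, case_unexposed, ctrl_exposed, ctrl_unexposed, z=1.96):
    """Unadjusted odds ratio with a Wald confidence interval.

    All four cell counts must be positive. z = 1.96 gives a 95% interval.
    """
    odds_ratio = (case_exposed * ctrl_unexposed) / (case_unexposed * ctrl_exposed)
    # Standard error of log(OR) from the four cell counts
    se = math.sqrt(1 / case_exposed + 1 / case_unexposed
                   + 1 / ctrl_exposed + 1 / ctrl_unexposed)
    lower = math.exp(math.log(odds_ratio) - z * se)
    upper = math.exp(math.log(odds_ratio) + z * se)
    return odds_ratio, lower, upper
```

With hypothetical counts, `odds_ratio_ci(20, 80, 10, 90)` compares variant carriers ("exposed") with wild-type carriers among cases and controls. An interval that excludes 1 corresponds to a significant association at the two-tailed 0.05 level.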
We constructed the GRS from the two significant SNPs in the FAT1 and COL9A1 genes. The univariate associations of OSCC risk with the environmental risk factors (use of alcohol, BQ and cigarettes) and with the GRS are shown in Table 3. The GRS was associated with the risk of OSCC (unadjusted OR ranged from 1.64 to 4.86). After adjustment for age and the use of alcohol, BQ and cigarettes, the GRS remained independently associated with the risk of OSCC (OR ranged from 1.68 to 6.12).
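The scoring rule for the GRS (one point per T allele of COL9A1-rs550675 and per G allele of FAT1-rs28647489) can be sketched as a simple allele count. The data layout, a mapping from SNP identifier to a two-letter genotype string, and the function name are assumptions made for illustration.

```python
# Risk alleles as stated in the statistical analysis section
RISK_ALLELES = {
    "rs550675": "T",    # COL9A1
    "rs28647489": "G",  # FAT1
}

def genetic_risk_score(genotypes):
    """Count risk alleles across the two SNPs (score ranges from 0 to 4).

    genotypes maps SNP id -> two-letter genotype such as "CT"
    (an assumed input format, not the study's data format).
    """
    return sum(genotypes[snp].count(allele)
               for snp, allele in RISK_ALLELES.items())
```

For instance, a subject with genotype CT at rs550675 and GG at rs28647489 carries three risk alleles, so `genetic_risk_score({"rs550675": "CT", "rs28647489": "GG"})` returns 3.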
Based on the results in Table 3, we conducted stepwise model analysis to select the significant factors for the best-fit prediction model of OSCC. The results are shown in S2 Table. FAT1, COL9A1, BQ use and alcohol consumption were included in the best-fit model (significance threshold, P < 0.05). The diagnostic ability of the GRS and the environmental factors was measured as the area under the ROC curve (AUC) in the prediction models. The AUC for the GRS was 0.61. The AUCs for BQ and alcohol were 0.77 and 0.73, respectively. The combination of genetic variants, BQ and alcohol significantly predicted the occurrence of OSCC. Adding the GRS to the model that included age, BQ and alcohol use significantly improved the AUC (to 0.91), with a sensitivity of 85.7% and a specificity of 85.5% for OSCC occurrence (Fig 1).
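To make the reported metrics concrete, the following sketch shows how sensitivity, specificity and the AUC can be computed from predicted scores and case/control labels. It is an illustration under stated assumptions and not the procedure used in SAS: the AUC is computed through its rank interpretation (the probability that a randomly chosen case scores higher than a randomly chosen control, with ties counted as one half), and all names and data are hypothetical.

```python
def sensitivity_specificity(labels, scores, threshold):
    """Classify a subject as a predicted case when score >= threshold."""
    tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= threshold)
    fn = sum(1 for y, s in zip(labels, scores) if y == 1 and s < threshold)
    tn = sum(1 for y, s in zip(labels, scores) if y == 0 and s < threshold)
    fp = sum(1 for y, s in zip(labels, scores) if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)

def auc(labels, scores):
    """AUC as the fraction of case/control pairs ranked correctly."""
    cases = [s for y, s in zip(labels, scores) if y == 1]
    controls = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if c > k else 0.5 if c == k else 0.0
               for c in cases for k in controls)
    return wins / (len(cases) * len(controls))

# Hypothetical labels (1 = case, 0 = control) and model scores
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2]
print(auc(labels, scores))
print(sensitivity_specificity(labels, scores, threshold=0.45))
```

The pairwise AUC is quadratic in the sample size, which is acceptable for a study of this size; a rank-based implementation would be preferable for much larger data sets.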
We also conducted stepwise model analysis to select the significant factors for the best-fit prediction model of OPMD. The results are shown in S3 Table. The best-fit model for OPMD screening included FAT1, COL9A1, BQ use and cigarette smoking. The AUC for the GRS was 0.61. The AUCs for BQ and cigarettes were 0.58 and 0.51, respectively. The AUC for the combination of genetic variants, BQ and cigarettes was 0.69. Based on the OSCC prediction model, we calculated the predictive probability of OSCC occurrence for each group defined by genetic risk score, BQ use and alcohol drinking. The predictive probability of OSCC occurrence increased with the genetic risk score from 0.10 to 0.43 for subjects without substance use and from 0.73 to 0.92 for subjects with substance use. Without genetic effects, the predictive probability of occurrence increased from 0.10 to 0.38 for BQ use and from 0.10 to 0.29 for alcohol drinking. Taken together, the synergistic effects of genetic risk score, BQ use and alcohol drinking increased the probability of occurrence from 0.10 to 0.92 (Table 4).
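A predictive probability of this kind is typically obtained by passing a linear combination of the risk factors through the logistic function. The sketch below assumes a logistic model with the GRS, BQ use and alcohol drinking as covariates. The function name is hypothetical and the coefficient values are placeholders chosen only for illustration; they are not the fitted estimates from this study, which are reported in the tables.

```python
import math

def predicted_probability(intercept, coefficients, covariates):
    """Logistic-model probability: 1 / (1 + exp(-(b0 + sum(b_i * x_i)))).

    coefficients and covariates map covariate names to values;
    covariates missing from the input are treated as 0.
    """
    linear = intercept + sum(beta * covariates.get(name, 0)
                             for name, beta in coefficients.items())
    return 1.0 / (1.0 + math.exp(-linear))

# HYPOTHETICAL placeholder coefficients, not estimates from the study
intercept = -2.0
coefficients = {"grs": 0.5, "bq": 1.5, "alcohol": 1.0}

# Probability rises with the GRS within a fixed exposure group
for grs in range(5):
    prob = predicted_probability(intercept, coefficients, {"grs": grs, "bq": 1})
    print(grs, round(prob, 3))
```

The loop illustrates the qualitative pattern reported above: within a fixed exposure group, the predicted probability increases monotonically with the number of risk alleles.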

Discussion
Although a number of genetic variants have been suggested to be associated with OSCC occurrence by association studies [23,24] and genome-wide association studies [25], few have shown a consistent association with OSCC across different populations. Even variants in the TP53 gene were not associated with OSCC occurrence in a meta-analysis comprising 2298 OSCC cases and 2111 controls [26]. Attention has therefore turned to the somatic mutations identified by next-generation sequencing and reported in the Cancer Genome Atlas, which has revealed promising genes associated with the initiation and progression of OSCC [25,27-29]. We found that tagSNPs in the FAT1 and COL9A1 genes, located near somatic mutations that drive cancer development, were associated with oral malignancy occurrence. COL9A1 encodes one of the three alpha chains of type IX collagen. Methylation levels of COL9A1 were decreased in breast tumor tissue [30], and COL9A1 variants have been associated with OSCC occurrence [9]. The FAT1 gene encodes a cadherin-like protein that is able to potently suppress cancer cell growth [31,32]. More recently, Morris et al. reported that mutation of FAT1 promotes Wnt signaling, which drives the development of many types of human malignancy [33]. A high mutation frequency in FAT1 was exclusively associated with HPV-negative head and neck squamous cell carcinoma (HNSCC) [18,34]. Although loss of function of FAT1 has been associated with HNSCC and OSCC in cell models, such mutations are rare, suggesting that variants with large effect sizes affect only a small proportion of the population and therefore cannot be used to screen general populations.
Using the 1000 Genomes database, we compared the allele frequencies of the two SNPs in the FAT1 and COL9A1 genes (A>G, rs28647489, and C>T, rs550675) with those of other ethnic populations and found that they differed from those of European, American and African populations but were similar to those of East Asian populations. These differences in allele frequency distribution could contribute to differences in the risk of oral cancer and OPMD among ethnic populations. We showed that the interaction between the additive genetic risk score of the FAT1 and COL9A1 genes and BQ chewing had a strong and graded association with the occurrence of OSCC in a case-control study, and we further applied it to the early detection of OPMD risk. OSCC is generally diagnosed at an advanced stage, resulting in poor survival rates. Early prevention or detection of oral malignancy is therefore imperative, because treatment at the preinvasive stage offers a better prognosis and even the chance of cure. Using genetic and environmental factors to identify high-risk individuals would be effective in reducing the incidence of and mortality from OSCC. Chuang et al. reported that early detection of OPMD led to a 21% reduction in stage III or IV oral cancer diagnoses and a 26% reduction in oral cancer mortality in a cancer screening program targeting Taiwanese cigarette smokers or betel quid chewers [35]. Although a cancer screening program can reduce the incidence and mortality of oral cancer, subjects without substance use and at low risk of OPMD do not attend screening programs, which may limit the effectiveness of screening in reducing the incidence and mortality of OSCC and in improving survival. We developed a new prediction model that integrates genetic information to improve the detectability of oral malignancy. We used two genetic variants for OPMD screening, the smallest number of variants needed to achieve the highest economic efficiency and screening power.
The predictive probability increased with the genetic risk score, ranging from 0.32 to 0.70 (S4 Table). By raising awareness in those with high genetic risk, our prediction model may help identify high-risk individuals with or without substance use and thereby improve the early detection of oral malignancy.
A linear equation was constructed to produce the risk scores (S3 Table and Table 3). Subjects were divided into three groups based on risk score; this grouping would assist in the screening of high-risk individuals and be helpful for the personalized prevention of OPMD. For predicting the occurrence of OSCC, several risk models or risk scores have been built using environmental factors, lifestyle and clinical data [36-39]. Our study integrated the GRS and environmental factors to detect the risk of oral malignancy occurrence early. When this prediction is used for risk assessment in asymptomatic individuals, those with a high risk score may consider several preventive strategies. The most common strategy is increased surveillance, which includes annual visual inspection by a clinician and intervention to support quitting BQ use. Recently, substance use, particularly BQ use, has been defined as addictive according to DSM-5 criteria, which may provide a strategy for the cessation of BQ use [40]. The prediction model may be useful for identifying different risk groups and suggesting interventions, and would be increasingly important in the prevention and diagnosis of OSCC.
We integrated environmental exposure information and the genetic risk score and used a Cox proportional hazards model to identify individuals in different risk groups for developing OPMD before the onset of clinical disease. Prediction models, risk factors and predictive methods may vary from study to study. There are some limitations to our study. Misclassification of OPMD in a high-risk group resulted in a very low AUC value and a lower predictive probability of OPMD occurrence (S4 Table) in the OPMD prediction model. Subjects without OPMD may still develop OSCC in the future.
We selected candidate genes that were reported to be highly significantly associated with OSCC in the male population. Because we could not include all possible genes in the prediction model, the AUC value for the GRS alone was very low. Our study included a small number of OPMD cases, which might have resulted in limited statistical power. This study provides evidence of the effects of germline variants on the development of OPMD. The nature of the risk factors for oral malignancy may differ by country and region. After stepwise analysis to select the risk factors for the best-fit model in an economically efficient way, the prediction models for OPMD and OSCC shared the same genetic factors and BQ use. The difference is that tobacco smoking is important for OPMD, whereas alcohol drinking is important for OSCC. The risk factors for OPMD in the prediction model need to be validated in studies with large sample sizes.

Conclusions
In summary, our study provided evidence that a GRS comprising variants in the FAT1 and COL9A1 genes was associated with oral malignancy occurrence in the male population. In the risk prediction models, the combination of the GRS and environmental factors was highly predictive of OSCC occurrence. This prediction model can be applied to identify high-risk subjects with habits of betel quid chewing and cigarette smoking, and further to provide appropriate interventions to reduce the risk of oral malignancy occurrence in the male population.
Supporting information S1