Association of the FAM46A Gene VNTRs and BAG6 rs3117582 SNP with Non-Small Cell Lung Cancer (NSCLC) in Croatian and Norwegian Populations

We tested whether a variable number of tandem repeats (VNTR) polymorphism in the Family with sequence similarity 46, member A (FAM46A) gene and a single nucleotide polymorphism (rs3117582) in the BCL2-Associated Athanogene 6 (BAG6) gene are associated with non-small cell lung cancer in Croatian and Norwegian subjects. A total of 503 non-small cell lung cancer patients (262 Croatian and 241 Norwegian) and 897 controls (568 Croatian and 329 Norwegian) were analyzed. The frequency of allele b (three VNTR repeats) of the FAM46A gene was significantly increased in the patients compared to the healthy controls in the Croatian subjects and in the combined Croatian and Norwegian subjects. Genotype frequencies of cd (four and five VNTR repeats) and cc (four VNTR repeats, homozygote) of the FAM46A gene were significantly decreased in the patients compared to the healthy controls in the Croatian and Norwegian subjects, respectively. Logistic regression analyses revealed FAM46A genotype cc to be an independent predictive factor for non-small cell lung cancer risk in the Norwegian subjects after adjustment for age, gender and smoking status. This is the first study to suggest an association between the FAM46A gene VNTR polymorphisms and non-small cell lung cancer. We also found that the BAG6 rs3117582 SNP was associated with non-small cell lung cancer in the Norwegian subjects and in the combined Croatian-Norwegian subjects, corroborating the earlier finding that the BAG6 rs3117582 SNP is associated with lung cancer in Europeans. Logistic regression analyses revealed that genotypes and alleles of BAG6 were independent predictive factors for non-small cell lung cancer risk in the Norwegian and combined Croatian-Norwegian subjects after adjustment for age and gender.


Introduction
Lung cancer is the leading cause of cancer death globally [1]. Epidemiological studies have established many environmental risk factors for lung cancer, including smoking, air pollution and industrial substances [2]. Although tobacco smoking is the major risk factor, genetic factors also affect lung cancer susceptibility [3][4][5]. Direct evidence for genetic predisposition to lung cancer comes from several genome-wide association studies (GWAS) [6][7][8][9][10][11].
Most genetic association studies of lung cancer use single nucleotide polymorphisms (SNPs) as markers. The variable number of tandem repeats (VNTR) is an understudied class of genomic variant, probably because of the complexity of VNTRs and the challenges in assaying them. These limitations do not favor the discovery of novel VNTRs as potential predictive and prognostic factors in lung cancer etiology. Predictive and prognostic factors are important in the diagnosis and treatment of lung cancer [12][13][14]. The positive long-term economic impact of robustly testing for predictive factors should not be underestimated. Such testing enhances the quality of medical care by significantly reducing false positives or negatives that may adversely affect the treatment outcome. For example, robust testing for epidermal growth factor receptor (EGFR) gene mutational status in non-small cell lung cancer (NSCLC) patients has defined this gene as an important predictive and prognostic factor in NSCLC diagnosis and treatment. This has also allowed for the therapeutic targeting of this genetic locus in NSCLC treatment [15;16].
VNTRs can modulate many biological processes such as gene transcription and protein function. They may also be responsible for many disorders in humans that involve unstable (genetic) repeat expansions [17;18]. There are few reports on the association between VNTRs and lung cancer susceptibility in case-control studies. These include VNTRs in the H-ras gene [19;20], the interleukin-1 receptor antagonist gene (IL1RN*2) [21;22] and the Mitogen-activated protein kinase 2 gene (MAPKAPK2) [23].
The Family-with-sequence-similarity 46, member A (FAM46A) gene [24] is located on chromosome 6q14.1. It harbors a VNTR within its coding sequence in exon 2. This VNTR may vary from two to seven repeats per chromosome [25;26]. The FAM46A polypeptide chain also contains the Domain of unknown function 1693 (DUF1693) [27], and no biological role has been assigned to the FAM46A gene to date [28;29]. The FAM46A protein interacts with the BCL2-Associated Athanogene 6 (BAG6) protein [28], which has been reported to modify the risk of lung cancer [30]. The BAG6 gene is located on chromosome 6p21.3 and regulates apoptosis and HSP70 [31]. The FAM46A protein also interacts with the zinc finger, FYVE domain-containing 9 (ZFYVE9) protein [28], which is involved in TGF-β signaling.
We previously reported that the mouse homologue of the FAM46A gene (Fam46a) is expressed in developing tooth buds. Because of its nuclear localization and its interaction with the human transcription factor ZFYVE9, we suggested that the FAM46A protein might be involved in cellular proliferation [32]. In addition, we have recently reported that the FAM46A gene VNTR is associated with increased risk of tuberculosis and osteoarthritis [33;34]. Data suggest that patients with tuberculosis have an increased risk of lung cancer [35]. Based on these facts, we hypothesized that the FAM46A gene may be involved in lung cancer and that this involvement may be through variations in the length of the FAM46A VNTR. In addition, we analyzed a BAG6 SNP to validate its previously reported association with lung cancer [30]. The associations were investigated in two different European populations.

Ethics Statement
The study was approved by the Medical Ethics Committees of the University Hospital, University of Rijeka, Croatia, and the Regional Committees for Medical and Health Research Ethics, Oslo, Norway. Written consent was obtained from all participants.

Subjects
The number of participants and the sex and age distributions of the Croatian and Norwegian subjects are described in Table 1. Blood samples were collected at the Clinical Institute for Transfusion Medicine, University Hospital Center Rijeka, Rijeka, Croatia, for the Croatian subjects and at the National Institute of Occupational Health, Oslo, Norway, for the Norwegian subjects. Ethnicity of participants was established by interviewing the patient or healthy individual and by consulting the admission documentation at the hospitals. Some clinical information was lacking for particular subjects, and therefore not all subjects were included in the characteristics estimates in Table 1. Subjects were not matched for possible confounding factors such as age, gender, and smoking status. Not all patients and controls were typed for both markers because particular samples were lacking.

Genomic DNA extraction
Genomic DNA from NSCLC patients and healthy controls was extracted from whole blood as described previously [36][37][38]. In brief, 200 μl of whole blood was mixed with 400 μl of sucrose buffer (0.32 M sucrose, 10 mM Tris-HCl, pH 7.5, 5 mM MgCl2, 1%, v/v, Triton X-100) and incubated for 1 min at room temperature. To collect white cell nuclei, samples were centrifuged 2 min at 5,000g. Precipitated nuclei were washed twice with 800 μl of sucrose buffer and centrifuged (2 min at 5,000g). After the second wash, the nuclei were re-suspended in 400 μl DNAzol (Invitrogen Corporation, Carlsbad, CA, USA) and incubated at room temperature for 5 min. Genomic DNA was precipitated with 200 μl of 100% ethanol and collected by centrifugation (2 min at 5,000g). The precipitate was washed with 1 ml of 75% ethanol and centrifuged for 1 min at 5,000g, twice. Genomic DNA was re-suspended in Tris-EDTA buffer solution (10 mM Tris-HCl, 1 mM disodium EDTA, pH 8.0; Sigma-Aldrich Chemie Gmbh, Munich, Germany).

Genotyping by DNA-sequencing Capillary Electrophoresis
DNA fragments of 647 base pairs (bp) in length from the FAM46A gene were amplified from human genomic DNA by PCR using a FAM-labeled forward primer, designated Gfam_VF (5′-AGGGTACTTCGCCATGTCTG-3′), in combination with an unlabeled reverse primer, designated GEX_R (5′-CTCGTGATGGCCACAGATT-3′), as previously described [33;34]. The 25 μL total volume PCR mixtures contained the following: 25 ng of genomic DNA, 0.2 μM each of the specific primers, and 1x Paq5000 Hotstart PCR master mix (Agilent Technologies, Inc., CA, USA). PCR was performed in a Peltier Thermal cycler (MJ Research, Massachusetts, USA). The Paq5000 polymerase was activated by an initial step at 95°C lasting 2 min, followed by 35 cycles of denaturing, annealing, and extension steps at 95°C for 20 s, 65°C for 20 s, and 72°C for 30 s, respectively, followed by a final extension step at 72°C for 5 min. Amplicons were resolved by 1% ethidium bromide-stained agarose gel electrophoresis and visualized by the Geldoc imaging system (Bio-Rad, Hercules, CA, USA). Amplicons (0.5 μl) were mixed with 0.5 μl GeneScan 1200 LIZ Size Standard (Life Technologies, NY, USA) and loaded onto a 3730 DNA Analyzer (Life Technologies, NY, USA) for allele separation. Separated alleles were analyzed by the Genemapper software (Life Technologies, NY, USA). Allele (VNTR) identity was confirmed by directly sequencing PCR amplicons from samples that were genotyped as various homozygotes (two each). To further confirm the genotyping results, 10% of the samples were re-genotyped, with 100% concordance. Also, amplicons from randomly selected samples that were genotyped as heterozygotes were sub-cloned into TOPO Zero Blunt Sequencing plasmids (Life Technologies, NY, USA) prior to sequencing.
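The genotype labels used throughout this study pair two allele letters, each letter standing for a VNTR repeat count (b for three, c for four and d for five repeats). The following minimal Python sketch illustrates that labeling. The letters a, e and f for two, six and seven repeats are an assumption extrapolated from the reported range of two to seven repeats, and the function name `genotype_label` is hypothetical; this is an illustration, not the Genemapper workflow used in the study.

```python
# Map VNTR repeat counts to allele letters and build a genotype label.
# b, c and d follow the text; a, e and f are ASSUMED by extrapolation
# over the reported range of two to seven repeats.
REPEATS_TO_ALLELE = {2: "a", 3: "b", 4: "c", 5: "d", 6: "e", 7: "f"}


def genotype_label(repeats_1, repeats_2):
    """Return a genotype label such as 'cd' from two repeat counts."""
    alleles = []
    for n in (repeats_1, repeats_2):
        if n not in REPEATS_TO_ALLELE:
            raise ValueError(f"unexpected repeat count: {n}")
        alleles.append(REPEATS_TO_ALLELE[n])
    # Sort so that the label does not depend on chromosome order.
    return "".join(sorted(alleles))


if __name__ == "__main__":
    print(genotype_label(5, 4))  # four and five repeats
    print(genotype_label(4, 4))  # four-repeat homozygote
```

Sorting the two letters gives one label per unordered allele pair, so that a subject carrying four and five repeats is always recorded as cd.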
The sequencing reaction was performed using the BigDye chemistry 3.1 (Life Technologies, NY, USA) with forward and reverse primers GVF (5′-AGGGTACTTCGCCATGTCTG-3′) and GEX_R (5′-CTCGTGATGGCCACAGATT-3′), respectively, and resolved by the ABI 3730 DNA analyzer (Life Technologies, NY, USA).

Genotyping of BAG6 rs3117582 Single Nucleotide Polymorphism (SNP)
The BCL2-Associated Athanogene 6 (BAG6) gene rs3117582 SNP was assessed in our NSCLC patient and control groups by probe-based real-time PCR assays as described by the kit manufacturer (Life Technologies, NY, USA). A Stratagene MX3005 real-time PCR cycler (Agilent Technologies, Santa Clara, CA, USA) was used for temperature cycling and signal quantification.

Statistical analysis
The differences in the distribution of categorical variables, including demographic characteristics, selected variables, and allelic and genotypic frequencies, were analyzed by the chi-square (Fisher two-tailed) method using 2-way contingency analysis. The FAM46A VNTRs and the BAG6 genotypes were in Hardy-Weinberg equilibrium (HWE). A difference was considered statistically significant at the 0.05 level. The associations between the FAM46A VNTR and BAG6 rs3117582 SNP polymorphisms and non-small cell lung cancer risk were estimated by computing the odds ratios (ORs) and their 95% confidence intervals (CIs). We performed logistic regression analysis to evaluate the influence of age, gender and smoking status as confounders for the association of the FAM46A VNTRs and the BAG6 rs3117582 SNP, respectively, with non-small cell lung cancer in the Croatian and Norwegian subjects. Smoking status could not be included as a confounder in the analyses of the Croatian subjects and of the combined Croatian-Norwegian subjects because information on the smoking status of the Croatian healthy control group was not available. Logistic regression analyses were carried out using SPSS (version 21; SPSS Inc., Chicago, IL, USA), and a p-value of 0.05 was set as the criterion for statistical significance.
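As an illustration of the quantities named above, the following Python sketch computes an odds ratio with a 95% confidence interval from a 2x2 table and a chi-square statistic for Hardy-Weinberg equilibrium at a biallelic locus. It assumes the Woolf (log-scale) method for the interval, which the study does not specify, and the function names and example counts are hypothetical rather than data from this study.

```python
import math


def odds_ratio_ci(case_exp, case_ref, ctrl_exp, ctrl_ref, z=1.96):
    """Odds ratio and Woolf (log-scale) CI for a 2x2 table.

    case_exp/ctrl_exp: cases/controls carrying the tested allele or genotype.
    case_ref/ctrl_ref: cases/controls carrying the reference.
    z=1.96 gives an approximate 95% interval.
    """
    if min(case_exp, case_ref, ctrl_exp, ctrl_ref) <= 0:
        raise ValueError("all four cells must be positive")
    odds_ratio = (case_exp * ctrl_ref) / (case_ref * ctrl_exp)
    # Standard error of log(OR) under the Woolf approximation.
    se = math.sqrt(1 / case_exp + 1 / case_ref + 1 / ctrl_exp + 1 / ctrl_ref)
    lower = math.exp(math.log(odds_ratio) - z * se)
    upper = math.exp(math.log(odds_ratio) + z * se)
    return odds_ratio, lower, upper


def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square statistic (1 degree of freedom) for HWE, biallelic locus."""
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("no genotypes supplied")
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p
    if p == 0.0 or q == 0.0:
        raise ValueError("locus is monomorphic; HWE test is undefined")
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


if __name__ == "__main__":
    # Hypothetical counts, NOT taken from the study's tables.
    print(odds_ratio_ci(20, 80, 10, 90))
    # A statistic above 3.841 would reject HWE at the 0.05 level (1 df).
    print(hwe_chi_square(25, 50, 25))
```

An interval that excludes 1 corresponds to a significant association at the chosen level. Adjustment for age, gender and smoking status requires a logistic regression model and is not covered by this sketch.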

Results
Allelic and genotypic frequencies of the FAM46A gene VNTR
The frequencies of VNTR genotypes and alleles of the FAM46A gene in the patients and healthy controls are listed in Tables 2 and 3. We found twenty genotypes comprising six alleles of the FAM46A gene, with different frequencies in the two populations. Therefore, in further analyses of genotypes we analyzed the results separately for each population, in addition to performing combined analyses. Analysis of the allelic frequencies, however, showed that the VNTR with 5 repeats (d allele in Table 3) was the most frequent allele, with a similar frequency in healthy controls in both the Croatian (42%) and Norwegian (40%) populations. We therefore chose the d allele as the reference allele in further analysis of the odds ratios. Tables 4 and 5 show the risk of lung cancer associated with each genotype (Table 4) or allele (Table 5), comparing the frequency of each genotype or allele in cases and controls with the most frequent genotype or allele as reference. The cd genotype in the Croatian subjects and the cc genotype in the Norwegian subjects were associated with a reduced risk of lung cancer, whereas in the combined groups only the cc genotype significantly reduced the cancer risk (Table 4). Regarding allele frequencies, only the b allele in the Croatian population was associated with a significantly increased risk of lung cancer (Table 5).
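The step from genotype counts to allele frequencies, and the choice of the most frequent allele as reference, can be sketched in Python as follows. The function names and the example counts are hypothetical and are not taken from Tables 2 and 3; the sketch only assumes that each two-letter genotype label contributes two alleles.

```python
from collections import Counter


def allele_frequencies(genotype_counts):
    """Derive allele frequencies from counts of two-letter genotype labels."""
    allele_counts = Counter()
    for genotype, n in genotype_counts.items():
        if len(genotype) != 2:
            raise ValueError(f"expected a two-letter genotype, got {genotype!r}")
        # Each subject contributes both alleles; a homozygote counts twice.
        for allele in genotype:
            allele_counts[allele] += n
    total = sum(allele_counts.values())
    if total == 0:
        raise ValueError("no genotypes supplied")
    return {allele: count / total for allele, count in allele_counts.items()}


def reference_allele(frequencies):
    """Pick the most frequent allele as the reference for odds ratios."""
    return max(frequencies, key=frequencies.get)


if __name__ == "__main__":
    # Hypothetical control counts, NOT the study's data.
    controls = {"dd": 30, "cd": 40, "cc": 10, "bd": 20}
    freqs = allele_frequencies(controls)
    print(freqs)
    print(reference_allele(freqs))
```

With six alleles there are 21 possible unordered genotypes, so observing twenty of them, as reported here, means nearly every combination occurred in the samples.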

Genotypic and Allelic frequencies of BAG6 rs3117582 SNP
For the BAG6 rs3117582 polymorphism, we found a gene-dosage-dependent increase in lung cancer risk associated with the C allele in the Norwegian subjects. Subjects carrying one C allele had a 1.70-fold increased lung cancer risk, and subjects with two variant alleles (CC) had an almost 7-fold increased risk. No such association was present in the Croatian subjects; however, when the subjects from both populations were combined, the associations remained significant and similar to those found in the Norwegian population (Table 6).

Discussion
The present study investigates the association of a VNTR in exon 2 of the FAM46A gene and of a SNP (rs3117582) in the BAG6 gene with the risk of NSCLC in two case-control studies from Croatia and Norway. We found that an allele of the FAM46A gene that carries three VNTR repeats (designated allele b) was associated with increased risk of lung cancer in the Croatian subjects. In addition, a genotype of the FAM46A gene with 4 and 5 VNTR repeats (designated cd) was associated with reduced risk of cancer in these subjects. However, another genotype (cc) conferred reduced risk of lung cancer in the Norwegian subjects. To date, this is the first report of an association between the FAM46A gene and NSCLC. Of particular interest is our observation that the cc and cd genotypes of the FAM46A gene confer reduced risk of NSCLC in the Norwegian and Croatian subjects, respectively. In addition, we found that the FAM46A bd and dd genotypes were the predominant genotypes in the Croatian and Norwegian subjects, respectively. These findings may suggest that the frequency of the predominant genotype of the FAM46A gene influences which genotype(s) associate with NSCLC risk in each population. There is the possibility that these polymorphisms could be markers for susceptibility or for reduced risk, for example binding sites for [25]. The DUF domain is present in many hypothetical proteins, including nematode prion-like proteins [27], and in nucleotidyltransferase superfamily genes with unknown function [27]. Our results suggest that the VNTR-encoded PS50315 domain of the FAM46A protein might have functional importance in NSCLC. An experimentally determined interacting partner of the FAM46A protein is the ZFYVE9 protein [28]. Because of this interaction, the FAM46A gene might have a role in TGF-β signaling, cell death (apoptosis) and/or inflammation.
The ZFYVE9 protein is involved in the recruitment of unphosphorylated forms of SMAD2/SMAD3 to the TGF-β receptor (TGF-βR) [39]. Phosphorylation of SMAD2/SMAD3 induces dissociation from ZFYVE9. This allows for the formation of SMAD2/SMAD4 complexes and their consequent translocation to the nucleus. Perhaps the FAM46A protein is involved in this cascade of events. TGF-β is a potent inhibitor of cell growth, and accumulating evidence suggests that perturbation of the TGF-β signaling pathway leads to tumorigenesis [40]. It is therefore tempting to speculate that the FAM46A protein may be involved in lung cancer etiology through its participation in the TGF-β signaling pathway. This may involve the VNTR-encoded PS50315 domain of the FAM46A protein. Our results may therefore encourage further studies to test this postulation.
We found that the BAG6 rs3117582 SNP is associated with NSCLC at both the genotypic and allelic levels in the Norwegian subjects and in the combined Croatian and Norwegian subjects. This is interesting since the BAG6 protein is an interacting partner of the FAM46A protein [28]. Previous reports have shown that SNPs in the BAG6 gene (rs1052486 and rs3117582) conferred susceptibility to lung cancer [8;30]. Our results support these previous reports and further suggest a role for the BAG6 gene in the etiology of non-small cell lung cancer. BAG6 polypeptides regulate a variety of cellular processes such as apoptosis [41], HLA class II expression [42], T cell responses [43], protein modification and gene expression [44]. All these processes may require that the BAG6 protein is constitutively expressed at steady-state levels for optimal functioning. In addition, the BAG6 rs3117582 SNP is located in the promoter region of the BAG6 gene (38 base pairs from the transcription start site). Mutations in the promoter regions of genes have been shown to influence the expression levels of the respective genes. This may concurrently attenuate the physiological efficiency of the genes in a dose-response manner [45;46]. Our results suggest that the C allele of the BAG6 rs3117582 SNP is associated with increased risk of NSCLC in the Norwegian subjects and in the combined Croatian and Norwegian subjects in a gene-dosage manner. This may also suggest that the presence of the minor C allele of the rs3117582 SNP in the BAG6 promoter perturbs BAG6 gene expression. This perturbation may lead to a significant increase in NSCLC risk in individuals who carry this allele. This postulation needs to be investigated in further studies to determine the possible underlying mechanism(s) of action.
In conclusion, our study suggests that the FAM46A gene VNTR and the BAG6 rs3117582 SNP are associated with NSCLC in both the Croatian and Norwegian populations. Our results also corroborate the previous findings that the BAG6 gene is associated with lung cancer and suggest a gene-dosage association of the BAG6 rs3117582 SNP with NSCLC. We further suggest that these loci could be potential research targets that might have therapeutic potential.