A Population-Based Study of Four Genes Associated with Heroin Addiction in Han Chinese

Recent studies have shown that variants in FAT atypical cadherin 3 (FAT3), kinectin 1 (KTN1), discs large homolog2 (DLG2) and deleted in colorectal cancer (DCC) genes influence the structure of the human mesolimbic reward system. We conducted a systematic analysis of the potential functional single nucleotide polymorphisms (SNPs) in these genes associated with heroin addiction. We scanned the functional regions of these genes and identified 20 SNPs for genotyping by using the SNaPshot method. A total of 1080 samples, comprising 523 cases and 557 controls, were analyzed. We observed that DCC rs16956878, rs12607853, and rs2292043 were associated with heroin addiction. The T alleles of rs16956878 (p = 0.0004) and rs12607853 (p = 0.002) were significantly enriched in the case group compared with the controls. A lower incidence of the C allele of rs2292043 (p = 0.002) was observed in the case group. In block 2 of DCC (rs2292043-rs12607853-rs16956878), the frequency of the T-T-T haplotype was significantly higher in the case group than in the control group (p = 0.024), and fewer C-C-C haplotypes (p = 0.006) were detected in the case group. DCC may be an important candidate gene in heroin addiction, and rs16956878, rs12607853, and rs2292043 may be risk factors, thereby providing a basis for further genetic and biological research.


Introduction
Heroin addiction is a chronic brain disease characterized by compulsive drug-seeking, drug abuse, physical dependence, tolerance, and relapse [1]. Heroin is one of the most commonly used drugs in China. At the end of 2014, a total of 2.955 million drug addicts were registered in China. Opioid drug addicts numbered 1.458 million, 49.3% of whom were heroin users. A total of 9.3 tons of heroin were seized in 2014. The direct economic losses resulting from drug addiction approach CNY 500 billion every year, representing a substantial economic burden to individuals and families. Similarly to other neuropsychiatric diseases, drug addiction results from a combination of genetic and environmental factors [2]. Family, adoption, and twin studies have suggested that genetic factors account for 30-60% of the overall variance in the risk of developing drug addiction [3][4][5].
Recently, FAT atypical cadherin 3 (FAT3), kinectin 1 (KTN1), discs large homolog 2 (DLG2) and deleted in colorectal cancer (DCC) have been reported to be associated with the function of the human mesolimbic reward system [6], which is the neurobiological basis of drug addiction. Cadherin, encoded by the FAT3 gene, regulates neuronal morphology by affecting cell interactions [7], a crucial mechanism of pathological memory formation during drug addiction [8,9]. The FAT3 gene affects the volume of the caudoputamen [6], which plays important roles in habit formation, motivation, and the mechanism of drug addiction [10]. KTN1 is responsible for organelle transport and localization [11], and this protein is also closely associated with the formation and quantity of dendritic spines [12], which form the common anatomical substrate of drug addiction [9]. Another biological function of KTN1 is facilitating vesicle binding with kinesin, this binding is followed by kinesin-driven vesicle fast anterograde transport in axons [13,14], suggesting that KTN1 is a promising candidate gene involved in drug addiction. Recently, the role of DLG2 has been investigated in a multitude of neuropsychiatric diseases. Genetic variants in DLG2 affect learning and cognitive flexibility [15]. Genetic mapping of habitual substance users has revealed that DLG2 is overexpressed at the neural synapse [16]. The DCC gene encodes netrin-1 receptor, which affects axon guidance and migration [17]. DCC has been widely studied in a multitude of neuropsychiatric diseases. Sensitizing amphetamine pretreatment regimens result in selective upregulation of the expression of DCC in the ventral tegmental area of adult rodents [18], and DCC haploinsufficiency decreases sensitivity to the cocaine mediated enhancement of reward seeking behavior [19]. Furthermore, DCC is a regulator of maladaptive responses, such as tolerance, dependence and opioid-induced hyperalgesia to chronic morphine administration [20]. On the basis of these findings, these four genes may be important mediators of drug addiction. To the best of our knowledge, the roles of these genes in heroin addiction have not previously been reported.
Variations in gene functional regions may represent the most direct molecular mechanisms of disease [21]. The exon sequence can be transcribed into the final mRNA. Variations in exon regions may change the amino acid sequences. The most prominent example is brain-derived neurotrophic factor (BDNF), whose rs6265 SNP is directly associated with the clinical phenotype of drug addiction [22,23]. Variations in promoter affect the efficiency of gene transcription. Variations in intron-exon borders may affect exon recognition and change the attributes of the alternative products [24,25]. 5'UTRs are DNA regulatory sequences located in the 5'termini of protein-coding genes. These sequences can be transcribed to mRNA, but cannot be translated to protein. 5'UTRs contain a variety of regulatory elements, including the 5'cap, secondary structure, alternative 5'UTRs, internal ribosome entry sites, and upstream open reading frames (uORFs), among others. In general, 5'UTRs primarily regulate transcriptional initiation [26]. 3'UTRs are DNA regulatory sequences located downstream of the protein coding sequences, and these sequences primarily regulate gene expression at the post-transcriptional level, including transcriptional stability and cleavage, and polyadenylation, among others [27]. Because determining associations between functional polymorphisms and heroin addiction would be meaningful, we used HapMap (Han Chinese population) HCB data to systematically scan the promoter, 5'UTR, 3'UTR, exon, and intron-exon border regions of FAT3, KTN1, DLG2 and DCC, and 20 SNPs were selected to do association analysis with heroin addiction.

Materials and Methods Subjects
A total of 1080 individuals were recruited for the present study. All of these individuals were biologically unrelated individuals of China Han ancestry. Among them, 523 individuals were heroin addiction patients (mean age 45.13±7.270years) recruited from the Methadone Maintenance Treatment (MMT) Program at the Xi'an Mental Health Center between October 2013 and May 2015. At least two senior psychiatrists independently interviewed all patients, and urine testing and the Diagnostic and Statistical Manual of Mental Disorders, fourth revision (DSM-IVR) diagnostic criteria were applied to diagnose opioid addiction. A case vignette was generated to assist with the diagnosis, using a semi-structured interview with questions regarding (a) the age of onset and the duration of heroin use, (b) the quantity of the drug used during this period, (c) the route of administration (i.e., nasal inhalation or injection), (d) whether other substances were used or abused, and (e) comorbidity with any other psychiatric disorder. Participants meeting DSM-IVR criteria for an additional Axis I disorder; with a history of cigarette, alcohol, amphetamine, barbiturate, or benzodiazepine dependence; exhibiting mental illness or neurological diseases; or a history of hematological diseases, seizures, or other chronic physical illnesses were excluded.
The control cohort comprised 557 healthy people (mean age 45.80±10.449 years) recruited from the health examination center at the First Hospital Affiliated with the Medical College of Xi'an Jiao Tong University. The selection criteria were: having no individual history of drug addiction or mental illness, and frequency matching to cases on the basis of gender and age.
All participants provided written informed consent. Our study protocol was approved by the Ethical Committee of Xi'an Mental Health Center, Xi'an, China and the methods were performed in accordance with the approved guidelines.

SNP selection
A total of 20 SNPs were selected on the basis of the following criteria: (1) location in functional region of the gene, including the promoter region, untranslated regions (UTRs), exons, and intron-exon borders, and (2) minor allele frequencies (MAF) greater than 0.05 on the basis of HapMap. The chromosomal positions of the six SNPs in KTN1 (rs10146870, rs1138345, rs10483647, rs1951890, rs17128657, and rs945270) were searched from 55554095 to 55706484bp. The chromosomal positions of the six SNPs in DCC (rs17753970, rs934345, rs2229080, rs16956878, rs12607853, and rs2292043) were searched from 52338192 to 53536381bp. The chromosomal positions of the four SNPs in FAT3 (rs10765565, rs4753069, rs2197678, and rs7927604) were searched from 92312328 to 92896960bp. The chromosomal positions of the four SNPs in DLG2 (rs575050, rs2512676, rs17145219, and rs2507850) were searched from 83454513 to 85629270bp. The databases were HapMap and dbSNP (HCB), and the positions of these SNPs are listed in Table 1.

Genotyping
Peripheral blood samples from the enrolled subjects were collected in EDTA-coated tubes. Genomic DNA was extracted from blood leukocytes by using E.Z.N.A.™ Blood DNA Midi Kit (Omega Bio-Tek, Norcross, GA, USA) according to the manufacturer's instructions. A total of 20 SNPs were genotyped by using SNaPshot SNP technology. A segment of DNA surrounding each SNP (151-368 bp) was amplified in a 10-μl PCR reaction containing 1XHotStarTaq buffer, 3.0 mM Mg 2+ , 0.3 mM dNTPs, 1 U of HotStarTaq polymerase (Qiagen Inc., USA), 1 μl of DNA and 1 μl of each PCR primer. The PCR program included an initial cycle at 95°C for To purify the PCR products, 5 U of shrimp alkaline phosphatase (SAP) enzyme and 2 U of Exonuclease I (Exo I) were mixed with 10 μl of the PCR product, incubated for 1 hour at 37°C and inactivated for 15 min at 75°C. The purified PCR products were used in a SNaPshot multiple single-base extension reaction. The extension reaction system (10 μl) contained 5 μl of the SNaPshot Multiplex Ready Reaction Mix (Applied Biosystems Co Ltd., CA, USA), 2 μl of the purified PCR product, 1 μl of the extension reaction primers, and 2 μl of ultrapure water. The PCR program initiated at 96°C for 1 min, and this was followed by 28 cycles of 96°C for 10 s, 55°C for 5 s, and 60°C for 30 s, and an indefinite hold 4°C. The products were purified after incubation with 1 U of SAP for 1 hour at 37°C, and this was followed by inactivation for 15 minutes at 75°C. Subsequently, 0.5 μl of the purified product was added to 0.5 μl of 120 Liz SIZE STANDARD (Applied Biosystems, Foster City, CA, USA) and 9 μl of Hi-Di (Applied Biosystems, Foster City, CA, USA), and this was followed by sequencing on an ABI3130XL Sequencer (Applied Biosystems, Foster City, CA, USA) after degeneration at 95°C for 5 minutes. The primary data were analyzed using GeneMapper 4.1 (AppliedBiosystems Co., Ltd., USA). The genotypes were determined on the basis of the nucleotide present at the SNP site, visualized as either one or two color peaks. For quality control, 5% of the subjects (54 subjects) were randomly selected and blinded researchers conducted genotyping again, with a reproducibility of 100%.

Expression quantitive trait locus analysis
The mRNA expression level and genotype data for significant SNPs were received from the SNPexp database (http://tinyurl.com/snpexp) [28]. The HapMap version for the genotype was HapMap2r23 unfiltered 3.96 million SNPs. The data form RNA expression levels were obtained from transcripts of lymphoblastoid cell lines from the same 45 unrelated Han Chinese individuals in Beijing. The correlations between the genotype and mRNA expression levels of significant SNPs were calculated by using linear regression and the Wald test.

Statistical analysis
The genotype and allele frequencies of each individual polymorphism and the Hardy-Weinberg equilibrium (HWE) of the control and case groups were calculated by using the chi-square test. The associations between polymorphisms or other categorical variables with heroin addiction were assessed by using Pearson's Chi-square test. Continuous variables, such as the dose of heroin used and the age of heroin addiction onset, were analyzed using a correlate test. P values were calculated on the basis of the codominance or dominance of the rare allele, or the heterosis and recessive models of rare allele inheritance.
We computed pairwise LD statistics (D' and r 2 ) and haplotype frequencies using Haploview 4.0 (Broad Institute of MIT and Harvard, Cambridge, MA). We constructed haplotype blocks based on the criteria of Gabriel et al [29]. When the frequency of the haplotype was less than 5%, this value was excluded from the statistic analysis. We used PHASE 2.1.1 [30] software to verify the composition and frequency of positive haplotypes and to conduct permutation tests.
We analyzed the gene-gene interaction using Multifactor Dimensionality Reduction(http:// sourceforge.net/projects/mdr/) which identifies high dimensional gene-gene interactions [31]. P values are presented as two-sided, and p<0.05 was considered statistically significant. We used Bonferroni's correction to adjust the test level, and the p value was multiplied by all 20 loci or the haplotypes of each gene. All statistical analyses were conducted using SPSS 20.0 software (SPSS Inc., Chicago, IL, USA).

Power analysis
A sufficient sample size was required in this genetics study [32]. Thus, we conducted a power analysis using a Power and Sample Size Program (http://biostat.mc.vanderbilt.edu/wiki/Main/ PowerSampleSize).

Results
No significant deviation from HWE was observed for any of the SNPs in the case group. In the control group, the rs17128657 SNP statistically deviated from HWE (p = 0.028). Five blocks were identified in the linkage disequilibrium (LD) analysis of the case and control data. For the KTN1 gene, block 1 contained five SNPs (rs17128657, rs1951890, rs10483647, rs1138345 and rs10146870). For the DLG2 gene, block 1 contained three SNPs (rs2507850, rs2512676 and rs17145219). For the FAT3 gene, block 1 contained two SNPs (rs10765565 and rs4753069). For the DCC gene, block 1 contained two SNPs (rs934345 and rs17753970), and block 2 contained three SNPs (rs2292043, rs12607853 and rs16956878) (Fig 1). The distributions, frequencies and statistical analyses of the genotype, allele, and haplotype are provided in Tables 1 and 2. For the DCC gene, the rs16956878 and rs12607853 genotypes were strongly associated with heroin addiction (p = 0.002, and p = 0.006, respectively). The T allele frequencies of rs16956878 (p = 0.0004, odds ratio [OR] = 1.447, 95% confidence interval[CI] = 1.221-1.715) and rs12607853 (p = 0.002, OR = 1.393, 95%CI = 1.175-1.650) were significantly higher in the case group than in the control subjects. A significant difference was also observed in the distribution of the genotype frequency for rs2292043 between the case and control groups (p = 0.020). Compared with the control group, the case group exhibited a lower frequency of the C allele (p = 0.002, OR = 1.405, 95% CI = 1.180-1.673). For the KTN1 gene, addiction cases had a significantly higher frequency of the C allele than the control group at rs945270, but was not significant after Bonferroni correction (p = 0.140, OR = 0.752, 95%CI = 0.612-0.925). The sample size showed a 79%-91% power to detect associations with heroin addiction, with a presumed OR of 1.5, alpha value of 5%, and MAF ranging from 0.190 to 0.497.
In block 2 of DCC (rs2292043, rs12607853 and rs16956878), the frequency of the T-T-T haplotype was significantly higher than that in the control group (p = 0.024, OR = 1.381, 95% CI = 1.086-1.754), and fewer C-C-C haplotypes (p = 0.006, OR = 0.679, 95% CI = 0.530-0.870) were observed in the case group. The frequencies of these haplotypes in block 2 were similar to those obtained using PHASE (S1 Table). The p value was adjusted by using the 1000 permutations test (p = 0.026). On the basis of the results, we selected DCC rs16956878 as a representative of rs2292043, rs12607853 and rs16956878 for subsequent analyses. We analyzed the mRNA expression level and genotype data for DCC rs16956878. The mRNA expression levels of subjects with TT and CC genotypes were similar, and showed no significant differences (S2 Table).
The demographic and addiction characteristics were analyzed with respect to DCC rs16956878 (Table 3). Compared with the CC genotype, the TT and TC genotypes were more likely to be associated with more varied routes of heroin administration. The TT and TC genotypes, compared with the CC genotype, were more likely to be associated with heroin use through sniffing, smoking, intravenous injection, or compound delivery methods.
The results of the gene-gene interaction using MDR are listed in Table 4. The testing balance accuracy and cross validation consistency were the highest in models of rs12607853, rs2229080, and rs934345 (S1 Fig). Because it had the highest cross validation consistency and testing balance accuracy, the three-locus model was considered to be the optimal model.

Discussion
Addiction is a disease resulting from interactions between genes and the environment [33]. The genetic susceptibility of heroin addiction primarily refers to the likelihood of individuals to use or become addicted to heroin because of differences in genetic factors [34][35][36]. Family, adoption, and twin studies have suggested that genetic factors account for 30-60% of the overall variance in the risk of developing drug addiction [3,4]. The aim of our research was to identify additional genetic markers of heroin addiction through a case-control study. In addition to genetics, the substances effect is also an important factor leading to heroin addiction [37]. Indeed, the effect is different among the same people receiving different doses of heroin in a certain range [38]. However, not all people exposed to heroin will become addicted to this drug [37]. Our heroin addicts were recruited from the MMT Program and were diagnosed with heroin addiction. Our healthy controls were never self-exposed to heroin. Thus, our results should indicate that some subjects are more likely to become addicted to heroin, despite the effects of environmental factors. Four Genes with Heroin Addiction DCC is involved in axon guidance pathways, and its genetic variants influence the structure of the human mesolimbic reward system [6], which plays a key role in drug addiction. Our study provides direct evidence that polymorphisms of the DCC gene are associated with heroin addiction in the Chinese Han population.
In the present study, we observed that the T alleles of the DCC SNPs rs16956878 and rs12607853 were strongly associated with an increased risk of heroin addiction, whereas the C allele of DCC rs2292043 was associated with a decreased risk of heroin addiction, and these variants are located within the DCC 3' UTR. Moreover, we observed a significant increase in the T-T-T haplotype (rs2292043-rs12607853-rs16956878) in heroin addicts compared with the members of the control group. These results suggest that the subjects carrying the T-T-T haplotype are more likely to become addicted to heroin. The 3' UTR of a gene contains a number of regulatory sequences that are targets of a variety of regulatory molecules, including RNA binding proteins (RBPs) and small noncoding RNAs (ncRNAs), which recognize small cis-elements present in the 3' UTRs and determine the stability, cellular localization, and translation of the encoded mRNA [39,40]. Among these regulatory molecules, microRNAs down-regulate genes and promote RNA cleavage through perfect base pairing with a target sequence [27]. By searching the MirSNP database [41], we observed that when rs12607853 allele changes from T to C, the mRNA of DCC can combine with hsa-miR-141-3p, and decrease DCC gene expression. When the rs2292043 allele changes from T to C, the mRNA of DCC can combine with hsa-miR-141-3689d, and decrease DCC gene expression. When the rs16956878 allele changes from T to C, the combined effect of the mRNA of DCC and hsa-miR-4666a-5p increases, thereby decreases DCC gene expression. Thus, in the heroin group, the higher frequency of T than C alleles, led to increased RNA expression. Our results are in agreement with the previous animal experimental conclusions. DCC haploinsufficiency mice showed blunted sensitivity to cocaine-mediated enhancement of reward seeking behavior [19]. We conducted an eQTL analysis of the mRNA expression level and examined the genotype data for DCC rs16956878 obtained from the SNPexp database. The mRNA expression levels of the subjects with TT and CC genotypes showed no significant differences. Because the RNA expression level and genotype data were obtained from only 45 unrelated individuals, the results may reflect the small sample size. Therefore, additional studies on the mRNA expression level of DCC and examination of the genotype data for rs16956878 with larger sample sizes are urgently needed. Grant et al. have reported an association between schizophrenia and the rs2270954 polymorphism in the 3' UTR of the DCC gene [42]. Peng et al have reported an association between schizophrenia and the rs2229080 polymorphism in the exon 3 of the DCC gene [43]. These results further support an important role for DCC in neuropsychiatric diseases. To confirm the link between the DCC gene and addiction, rs16956878 was analyzed, and the results suggested an association with the route of heroin administration. The rs16956878 TT and TC genotypes were associated with increased variance in the route of heroin administration and therefore might be associated with easier access to drugs. DCC is involved in axon guidance pathways and plays a critical functional role in the organization of brain development and in adult neuroplasticity [17,44]. These results suggest that the DCC gene may contribute to the genetic basis of individual differences in susceptibility to heroin addiction.
KTN1 encodes the protein kinectin, which is primarily found in the endoplasmic reticulum in the dendrites and thesoma of neurons [12]. KTN1 plays a critical role in the regulation of neuronal cell shape, spreading, and migration through kinectin-kinesin interactions [45]. Disrupting the kinectin-kinesin interaction results in a morphological change to a rounded cell shape and reduced cell spreading and migration [45], which decreases the polarization of the neuronal architecture and the cellular complexity essential for neuronal functions [46]. Therefore, KTN1 may affect the density or complexity of the dendritic spines in drug addicts, thereby causing brain region-specific changes in the density of these structures [47]. The rs945270 SNP is located 50 kb downstream of the KTN1 gene of 14q22.3. The C allele of rs945270 increases the expression of KTN1 in the frontal cortex and putamen [48,49]. In the present study, we identified a significantly higher C allele frequency in the heroin addiction group, although this result was not significant after correction. It has been suggested that subjects with the C allele in addicts might exhibit higher KTN1 expression in the frontal cortex and putamen. Interestingly, amphetamine, cocaine and nicotine increase the spine density on the apical dendrites of the medial prefrontal cortex [50][51][52] and morphine significantly increases dendritic spine density in the orbital frontal cortex of adult rats [47]. Thus, KTN1 may increase the density or complexity of dendritic spines in the frontal cortices of heroin addicts.
FAT3 is the human homolog of Drosophila FAT which inhibits Yokie through phosphorylation and subsequently activates the expanded-Hippo-Warts signaling cascade [53]. Phosphorylation of yes-associated protein 1(YAP1) in Hippo signaling inhibits the Wnt signaling cascade through interactions with β-catenin [54]. The cell polarity protein complex, Dlg/Lgl/Scrib affects the cell-cell contacts, thus leading to the deregulation of the actin cytoskeleton through interactions with Hippo pathways [55]. Netrin and Wnt signaling pathways play important roles in axon guidance [56]. Netrin signaling is primarily responsible for dorso-ventral (D/V)-graded distributions and Wnt signaling is primarily responsible for antero-posterior (A/P) distributions [57]. Kinesin-1 acts with DCC in sensory neuron position [58]. Thus, we speculated that these four genes might be involved in Hippo and/or Wnt signaling pathways. Studies have shown that the Wnt pathway regulates the susceptibility of chronic stress and addiction through the regulation of the differentiation of dopamine neurons in the mesolimbic reward system [59,60]. Unfortunately, we did not obtain direct evidence from the KEGG pathway and PATHWAY STUDIO databases. The SNPs in the optimal model of gene-gene interaction were rs12607853, rs2229080, and rs934345, all of which are located in the DCC gene and no gene-gene interactions were detected. Thus, a pathway study of these genes would be meaningful in the future.

Conclusion
To the best of our knowledge, this is the first report demonstrating an association between heroin addiction and functional polymorphisms within the DCC gene in a homogeneous sampling population. However, further replication or validation across populations should be considered in the future. Moreover, studies of these polymorphisms and their expression are warranted to further the understanding of how these variants influence the expression and induction of these genes. These studies should help to elucidate the pathogenesis of heroin addiction and may offer a basis for the diagnosis and treatment of addiction.
Supporting Information S1 Dataset. Genetic data in this study. (XLSX) S1 Fig. Optimal models determined by using MDR. Graphical model of rs12607853, rs2229080, and rs934345 (for SNP: 0 = no risk alleles, 1 = 1 risk allele, and 2 = 2 risk alleles). In each small square, the numbers represent the number of cases (left) and controls (right). Darkshading for each square represents a high risk of disease, whereas light shading indicates a low risk of disease. (TIF) S1