Genome-Wide Pharmacogenomic Study on Methadone Maintenance Treatment Identifies SNP rs17180299 and Multiple Haplotypes on CYP2B6, SPON1, and GSG1L Associated with Plasma Concentrations of Methadone R- and S-enantiomers in Heroin-Dependent Patients

Methadone maintenance treatment (MMT) is commonly used for controlling opioid dependence, preventing withdrawal symptoms, and improving the quality of life of heroin-dependent patients. A steady-state plasma concentration of methadone enantiomers, a measure of methadone metabolism, is an index of treatment response and efficacy of MMT. Although the methadone metabolism pathway has been partially revealed, no genome-wide pharmacogenomic study has been performed to identify genetic determinants and characterize genetic mechanisms for the plasma concentrations of methadone R- and S-enantiomers. This study was the first genome-wide pharmacogenomic study to identify genes associated with the plasma concentrations of methadone R- and S-enantiomers and their respective metabolites in a methadone maintenance cohort. After data quality control was ensured, a dataset of 344 heroin-dependent patients in the Han Chinese population of Taiwan who underwent MMT was analyzed. Genome-wide single-locus and haplotype-based association tests were performed to analyze four quantitative traits: the plasma concentrations of methadone R- and S-enantiomers and their respective metabolites. A significant single nucleotide polymorphism (SNP), rs17180299 (raw p = 2.24 × 10−8), was identified, accounting for 9.541% of the variation in the plasma concentration of the methadone R-enantiomer. In addition, 17 haplotypes were identified on SPON1, GSG1L, and CYP450 genes associated with the plasma concentration of methadone S-enantiomer. These haplotypes accounted for approximately one-fourth of the variation of the overall S-methadone plasma concentration. The association between the S-methadone plasma concentration and CYP2B6, SPON1, and GSG1L were replicated in another independent study. A gene expression experiment revealed that CYP2B6, SPON1, and GSG1L can be activated concomitantly through a constitutive androstane receptor (CAR) activation pathway. In conclusion, this study revealed new genes associated with the plasma concentration of methadone, providing insight into the genetic foundation of methadone metabolism. The results can be applied to predict treatment responses and methadone-related deaths for individualized MMTs.


Introduction
Heroin dependence is a severe psychiatric disorder characterized by a heroin craving behavior and the inability to stop using heroin. The World Health Organization reported that there were more than 9 million heroin users in 2014, and the prevalence of heroin dependence is increasing globally. Heroin abuse can incur medical complications and generate social problems; these problems have substantially increased the burden on social security and health insurance systems [1,2].
Replacement or maintenance therapy with an opioid analog is among the most commonly used treatments for heroin dependence, and reduces craving and withdrawal symptoms, increases treatment compliance, and improves the quality of life of patients [3]. Methadone, a synthetic opioid, is a commonly used medication for treating heroin dependence [4]. Methadone maintenance treatment (MMT) reportedly retains patients and decreases heroin use more effectively than no treatment [4][5][6].
The chemical structure of methadone has a chiral center that produces the racemic enantiomers of R-and S-form [7]. Methadone R-and S-enantiomers have different metabolic preferences for the liver cytochrome P-450 (CYP450) isozyme [8]: the CYP2C19 isozyme metabolizes methadone R-enantiomer [8,9] and CYP2B6 isozyme metabolizes methadone Senantiomer [10,11]. Several pharmacogenetic studies have found that the genetic variants located in the CYP2B6 region were associated with methadone [12] and could be observed to predict methadone-related deaths [13,14]. Through an interaction between CYP3A4 and CYP2C19, a genetic matrix exhibited a titration prediction effect for the required methadone dose [15].
The steady-state plasma concentration of methadone is an index for quantifying methadone metabolism and can serve as a surrogate marker for the treatment responses and efficacy of MMT. Previous pharmacogenetic studies have reported the association between CYP2B6 and the plasma concentration of methadone S-enantiomer [16][17][18]. We previously established a system used to measure methadone enantiomers and their metabolism product, metabolite 2-ethylidene-1,5-dimethyl-3,3-diphenyl pyrrolidine (EDDP) [9]. Pharmacogenetic studies on the CYP2B6 gene in a methadone maintenance cohort in Taiwan revealed the association between the plasma concentration of methadone S-enantiomer and single nucleotide polymorphisms (SNPs) located at the region of CYP2B6 gene [16]. In addition, the pregnane X receptor, a steroid and xenobiotic sensing nuclear receptor, possibly interacts with genetic variants in CYP2B6 to associate with the plasma concentration of methadone S-enantiomer [19]. In addition to genes in the CYP450 gene family, ABCB1 was reported to be associated with the plasma concentration of methadone [20]. However, the association remains unclear because the finding was not replicated in other studies [18,21]. Dennis et al. provided a systematic review on the effect of ABCB1 and CYP2B6 on methadone metabolism [18].
Although the methadone metabolism pathway has been partially revealed [22], there is still no report of genome-wide association analyses that characterize the plasma concentrations of the methadone R-and S-enantiomers. Moreover, the metabolism regulatory genes associated with the plasma concentration of methadone remain unclear. Previous pharmacogenetic studies on the plasma concentration of methadone enantiomer have used a candidate gene approach and single locus association analyses [16,17]. Because of the clinical importance of treatment responses to MMT and lack of results from genome-wide pharmacogenomic research, the aim of this study was to conduct the first genome-wide pharmacogenomic investigation of the plasma concentrations of methadone enantiomers and their metabolites. We recruited and genotyped a Han Chinese methadone maintenance cohort comprising 360 heroin-dependent patients in Taiwan by using the Axiom Genome-Wide CHB 1 Array. Genomewide single-locus and multilocus methods were applied to identify the susceptibility loci and their contribution in the variability of the methadone plasma concentration. Furthermore, an independent methadone maintenance cohort comprising 78 heroin-dependent patients under the same inclusion and exclusion criteria was recruited to replicate the susceptibility loci identified in the genome-wide pharmacogenomic study (discovery stage). In addition to identifying novel genes associated with the plasma concentration of methadone, this pharmacogenomic study was the first to verify the results of previous biological metabolic tests and the methadone pathways. In total, 16 individuals were removed from the subsequent association analyses. After the quality examination of SNP, 14,342 nonautosomal SNPs were removed, 7,926 SNPs with a GCR of <0.95 or a minor allele frequency (MAF) of <0.01 were removed, and 2,277 additional SNPs were removed because of a deviation from the Hardy-Weinberg equilibrium (HWE) (p < 8.1 × 10 −8 ). In this study, 615,216 valid SNPs (approximately 98% of the autosomal SNPs on the Axiom Genome-Wide CHB 1 Array) were analyzed.

Three covariates and four quantitative traits
We examined three covariates (age, sex, and body mass index [BMI]) and four quantitative traits (the plasma concentrations of the methadone R-and S-enantiomers and their metabolites). Based on the post-quality-control data, the distributions of the three covariates and four quantitative traits are summarized in Table 1. Among the male patients, the mean age and BMI were 39.3061 (standard deviation [SD] = 7.6587) and 23.9176 (SD = 3.4537), respectively. Among the female patients, the mean age and BMI were 33.0318 (SD = 5.4506) and 22.3980 (SD = 3.5792), respectively. The normality of the quantitative trait was a critical assumption required for the association analysis. The normality of the four quantitative traits was examined; Table 1 shows a summary of the results. Before any data transformation, the Kolmogorov-Smirnov goodness-of-fit tests for normality [23] revealed that the four quantitative traits

Genome-wide singe-locus association analysis
As shown in the Manhattan plot (Fig 2(A)) and Quantile-Quantile plot (Fig 2(B)), the genome-wide single-locus association analysis identified only SNP rs17180299 to be significantly associated with the plasma concentration of R-methadone after a multiple-test correction of a false discovery rate (raw p = 2.24 × 10 −8 ). Fig 2(C) depicts the regional association plot for the flanking genomic region of SNP rs17180299 on chromosome 9. Rs17180299 was an A/ G polymorphism; the frequency of minor allele G was 0.09. A trend of the dose-response effect of rs17180299 on the plasma concentration of R-methadone was observed (Fig 2(D)). Patients who had more allele G tended to have a higher plasma concentration of R-methadone. A SNP cluster plot showed that individuals with different genotypes were clearly separated, implying that the genotype calls were reliable. However, this SNP was located in an intergenic region.
For the other three traits, no single SNP was significant after a multiple-test correction of the false discovery rate. The Manhattan plots of genome-wide SNPs (in a scale of -log 10  , rs1448332 (raw p = 8.18 × 10 −7 ), and AX-13599782 (raw p = 2.08 × 10 −6 ), respectively. The SNP AX-16534452 was located on SPON1 on chromosome 11. The SNPs rs1448332 and AX-13599782 were located in an intergenic region on chromosome 3 and chromosome 21, respectively.

Genome-wide haplotype-based association analysis
We performed sliding-window haplotype-based association analyses with a window size of 2, 3, 4, 5, and 10 individually. For each window size, we separately performed haplotype-based association analyses for the four quantitative traits, adjusting for age, sex, and BMI. Only the Table 1. Summary statistics of covariates and the raw data and transformed data of quantitative traits by gender. The number of individuals (i.e., sample size), means ± standard deviations (SD) for covariates and quantitative traits are provided. The final column provides the p value of the Kolmogorov-Smirnov Good-of-Fit test for normality for the raw data and transformed data of the four quantitative traits. results on a basis of a window size of 5 were reported because of the following two reasons. First, the locations of significant haplotypes in the haplotype-based association analyses of different window sizes were quite consistent. Among the 25 most significant windows in each of the analyses of different window sizes, higher than 92.5% of the significant windows were overlapped. Second, in the study population, a window size of 5 provided a distribution of window width nearest to the distribution of the block size of the LD of the Asian population in the International HapMap II Project (S1 Table) [24,25].
The results for the plasma concentration of R-methadone are shown in the Manhattan plot (Fig 3(A)) and Quantile-Quantile plot (Fig 3(B)). The results for the plasma concentration of S-methadone are shown in the Manhattan plot (Fig 4(A)) and Quantile-Quantile plot (Fig 4  (B)). The results for the plasma concentration of R-EDDP (S7(A) Fig) and the plasma concentration of S-EDDP (S7(B) Fig) are also provided. After a multiple-test correction of the false discovery rate [26], we determined that four sliding windows were significantly associated with the plasma concentration of R-methadone (Fig 3(A)), and 23 sliding windows were associated with the plasma concentration of S-methadone (Fig 4(A)). No significant results were found in the haplotype-based association analyses of the plasma concentration of R-EDDP (S7 ( The four sliding windows associated with the plasma concentration of R-methadone are summarized in Table 2; the four windows consisted of one window located on chromosome 4 (Fig 3(C)) and three windows located in consecutive genomic regions on chromosome 9 (Fig 3  (D)). The significant window regions were expanded to encompass the flanking region on both sides of the significant windows. The expanded regions on chromosomes 4 and 9 contained two LD blocks (Fig 3(C)) and five LD blocks (Fig 3(D)), respectively. We further examined whether individual haplotypes in the LD blocks were associated with the plasma concentration of R-methadone. No significant individual haplotype was found in the significant region of chromosome 4 ( Table 2 and Fig 3(C)). Significant individual haplotypes were found in the first four LD blocks in the region of chromosome 9 ( Table 2 and Fig 3(D)). The fourth LD block contained the only significant SNP (rs17180299) identified during the genome-wide singlelocus association test. The slope estimates and their standard errors and the corresponding 95% confidence intervals of the 4 significant haplotypes are summarized in S2 Table. All of the SNPs were located in intergenic regions. Trends of dose-response effects exerted by the four significant haplotypes on the plasma concentration of R-methadone were found (Fig 3(E)). Patients with more minor haplotypes tended to exhibit a higher plasma concentration of Rmethadone.
The 23 sliding windows associated with the plasma concentration of S-methadone are summarized in Table 3; the 23 windows were located within three consecutive genomic regions (Fig 4(A)). They contained one window on chromosome 11 (Fig 4(C)), two windows on chromosome 16 (Fig 4(D)), and 20 windows on chromosome 19 (Fig 4(E)). An individual haplotype analysis in LD blocks revealed that 17 haplotypes in the 23 windows were significantly associated with the plasma concentration of S-methadone. The slope estimates and the corresponding 95% confidence intervals of the 17 significant haplotypes are summarized in S2 Table. The significant genomic region of chromosome 11 contained two significant haplotypes in LD blocks B2 and B3 on SPON1 (Table 3 and Fig 4(C)). The significant genomic region of chromosome 16 contained five significant haplotypes in LD blocks B1 and B2 on GSG1L (Table 3 and Fig 4(D)). Finally, the significant genomic region of chromosome 19 contained 10 significant haplotypes in LD blocks B2-B7 (Table 3 and Fig 4(E)). This genomic region contained multiple CYP450 genes. For all the 17 significant haplotypes, trends of dose-response effects exerted by the significant haplotypes on the plasma concentration of S-methadone were found; the dose-response effects of the 2, 5, 10 significant haplotypes on chromosomes 11, 16, and 19 are shown in Fig 5(A), 5(B) and 5(C), respectively. S3 and S4 Tables summarize all haplotypes with a minor haplotype frequency of >0.01 whatever they were significant or not in association tests of individual haplotypes for the plasma concentrations of methadone R-and S-enantiomers, respectively.  Table 4 shows the proportions of the variations in the plasma concentrations of methadone Rand S-enantiomers explained by significant haplotypes. Regarding the plasma concentration of R-methadone, one SNP (rs17180299) and four haplotypes on chromosome 9 were significant. Rs17180299 accounted for 9.541% of the variation in the plasma concentration of R-methadone. The most significant haplotype, C-G-G-C-G, was located in LD block B4 (Fig 3(D)). The SNP rs17180299, which was the only significant SNP identified during the genome-wide single-locus association test, was a tag SNP of the other four SNPs in this region. Therefore, this haplotype accounted for the same amount of variation as did rs17180299. According to the effect exerted by haplotype C-G-G-C-G, the remaining three significant haplotypes accounted for an extra variation of 1.809%. In total, 11.350% of the variation in the plasma concentration of R-methadone was accounted for by the identified SNPs and haplotypes. Regarding the plasma concentration of S-methadone, 17 haplotypes in the 23 windows was included into the final model sequentially; a haplotype which contributed a higher increment of model R 2 was included earlier in a sequential order. The 17 significant haplotypes accounted for 23.96% of the variation in the plasma concentration of S-methadone. The five most significant haplotypes, which included two significant haplotypes in CYP2B6, accounted for more than 20% of the variation in the plasma concentration of S-methadone.
Replication analysis of the significant SNP and haplotypes identified in the genome-wide pharmacogenomic study  (Table 1 and S5 Table). The results of association analysis are summarized in S6 Table. Association analysis showed that haplotypes and T-C-T-A-C-G-C-A-C (raw p = 2.60 × 10 −2 ) on CYP2B7P1 and CYP2A7P1 were associated with the plasma concentration of S-methadone. For clinical applications, we considered urine morphine test, which is an indicator for treatment response to MMT, as a stratified variable. In the test-positive group (i.e., poor-response group), T-T-A (raw p = 2.94 × 10 −2 ) on SPON1 was significant; in the testnegative group (i.e., better-response group), C-T-T-C-C-G-C-A-T (raw p = 3.97 × 10 −2 ), T-C-T-A-C-G-C-A-C (raw p = 3.83 × 10 −2 ), and T-A-A-T-C-G (raw p = 4.37 × 10 −2 ) on CYP2B7P1, CYP2A7P1, and CYP2B6 were significant. Other haplotypes had a p value of >0.05. The five replicated haplotypes (i.e., the significant haplotypes in the 1 st , 2 nd , 10 th , 12 th , and 16 th inclusion steps for the plasma concentration of S-methadone in Table 4) accounted association. (C) Regional association plot and LD plot on chromosome 4. In the regional association plot, the vertical axis is the raw p values of the association tests (in a scale of -log10) and the horizontal axis is physical position of initial SNP of a haplotype (in unit of kb). Raw p values of single SNPs (orange circle) and haplotypes which were significant after adjusting for false discovery rate [26] (symbol: green square with frame) and were not significant (symbol: green square without frame) are shown. Moreover, haplotypes which passed a multiple-test correction of a false discovery rate [26] in a LD block (purple triangle) are shown. In the LD plot, pairwise LD of SNPs was measured by D' [27]. SNPs with a strong LD were framed in a black inverse diamond block which was defined according to the confidence interval method [24]. (D) Regional association plot and LD plot on chromosome 9. (E) Distribution of plasma concentration of R-methadone for the haplotypes on chromosome 9 significantly associated with the plasma concentration of R-methadone.  (C) Regional association plot and LD plot on chromosome 11. In the regional association plot, the vertical axis is the raw p values of the association tests (in a scale of -log10) and the horizontal axis is physical position of initial SNP of a haplotype (in unit of kb). Raw p values of single SNPs (orange circle) and haplotypes which were significant after adjusting for false discovery rate [26] (symbol: green square with frame) and were not significant (symbol: green square without frame) are shown. Moreover, haplotypes which passed a multiple-test correction of a false discovery rate [26] in a LD block (purple triangle) are shown. In the LD plot, pairwise LD of SNPs was measured by D' [27]. SNPs with a strong LD were framed in a black inverse diamond block which was defined according to the confidence interval method [24]. (D) Regional association plot and LD plot on chromosome 16. (E) Regional association plot and LD plot on chromosome 19. for 17.75% and 19.51% of the variation in the plasma concentration of S-methadone at the discovery stage and replication stage, respectively. Regression coefficients of the replicated haplotypes at the discovery and replication stages were in the same direction. A meta-analysis by combining p-values at the discovery and replication stages showed that most of the haplotypes identified at the discovery stage were still statistically significant (S7 Table). However, we may have overlooked some important findings such as the significance of rs17180299 (raw p = 0.3578 in the replication study) and some important haplotypes (S6 Table) because of a small sample size in the replication study.

Discussion
R-and S-methadone can be metabolized through various enzymatic pathways. Results from a previous candidate gene study of CYP2B6 [16] and other pharmacogenetic studies [10,11] have shown that the methadone S-enantiomer was frequently metabolized by CYP2B6 isozymes. The present genome-wide association study confirmed that CYP2B6 is associated with Table 2. The significant haplotypes associated with plasma concentration of R-methadone. We list the chromosome, window, linkage disequilibrium (LD) block, and gene where the haplotypes are located. Haplotype frequencies and raw and adjusted p values of the significant haplotypes are provided in the final two columns.

Gene
Significant haplotype Haplotype frequency a The sliding-window haplotype-based association analysis using an omnibus association test identified 4 windows associated with the plasma concentrations of R-methadone. The number followed by "w" is the index of significant windows. Notation "-" indicates the upstream of a window when we expanded a significant window to encompass the flanking region on either side. b Notation "NS" indicates the haplotype was not significant. the plasma concentration of methadone S-enantiomer. The two significant haplotypes, T-A-A-T-C-G and T-C-C-T-T-T of rs8100458, rs7250601, rs7250991, rs11882424, rs8192719, and rs10853744, on CYP2B6 that were identified in this study accounted for 10.72% of the variation in the plasma concentration of S-methadone.
Haplotype T-C-C-T-T-T was found to be associated with a urine morphine test in a permutation-based logistic regression analysis with 100,000 random shuffles of test-positive and test-negative patients (empirical p = 4.04 × 10 −3 ). The results implied that the identified haplotypes were relevant to the response to MMT. 0.337 2.08×10 −6 (9.36×10 −5 ) a The sliding-window haplotype-based association analysis using an omnibus association test identified 23 windows associated with the plasma concentrations of S-methadone. The number followed by "w" is the index of significant windows. Notation "-" and "+" indicate the upstream and downstream of a window respectively when we expanded a significant window to encompass the flanking region on either side. We further examined the genetic contribution of CYP2B6 by performing an integrative analysis that involved seven SNPs (rs8100458, rs7250601, rs7250991, rs11882424, rs8192719, rs10853744, and rs1042389) on CYP2B6 from the present study and 10 SNPs (rs8100458, rs10500282, rs10403955, rs2279342, rs3745274, rs2279343, rs2279345, rs1038376, rs707265, and rs1042389) from a previous study of CYP2B6 [16]. Overall, the haplotype analysis involved 15 distinct SNPs, including two SNPs (rs8100458 and rs1042389) that were investigated in the two studies. LD blocks were constructed and haplotype association tests were conducted in each LD block by using the same association test described in the Methods section.  The results of the two haplotype analyses from the current study and the aforementioned combined analysis of 15 SNPs were compared. Two significant haplotypes of six SNPs on CYP2B6 identified in the current study were also in the LD block near the 5' end. The patterns of these two haplotypes were compatible with the three significant haplotypes identified in the combined analysis of 15 SNPs. The two significant haplotypes can be presented as T-A-A-X-X-X-X-X-X-T-X-C-G (raw p = 2.03 × 10 −7 , haplotype frequency = 0.311) and T-C-C-X-X-X-X-X-X-T-X-T-T (raw p = 9.04 × 10 −6 , haplotype frequency = 0.126), where X denotes an unknown allele caused by the unavailability of the SNPs in the Axiom Genome-Wide CHB 1 Array. The proportion of the variation in the plasma concentration of methadone S-enantiomers accounted for by the five haplotypes on CYP2B6 increased from 10.725% to 13.433% after the SNPs of the Axiom Genome-Wide CHB 1 Array and the SNPs from our previous candidate gene study of CYP2B6 were combined. Of the 15 SNPs, two were missense SNPs: rs3745274 (raw p = 8.08 × 10 −7 , minor allele frequency = 0.185) and rs2279343 (raw p = 3.62×10 −3 , minor allele frequency = 0.263). These two SNPs have been associated with methadone in heroin-dependent patients [12] and poor metabolism, plasma concentration, and efavirenz clearance in children with human immunodeficiency virus [28][29][30].
CYP2B6 was highly expressed in the liver (S10(A) Fig) and influenced methadone metabolism. We investigated whether the identified haplotypes near CYP2B6 can regulate the gene Table 4. The proportion of variation explained by the significant haplotypes we identified. In accordance with the inclusion steps of haplotypes in each analysis of plasma concentration of R-methadone and S-methadone, we list the chromosome and linkage disequilibrium (LD) block where the haplotypes are located. A haplotype which contributed a higher increment of model R 2 was included earlier in a sequential order. Univariate R 2 and model R 2 of the significant haplotypes are provided in the final two columns.

Transformed plasma concentration
Step  expression of CYP2B6 through the mechanism of cis-regulation. An RT-PCR experiment was conducted to measure the gene expression of CYP2B6 in a random sample of 55 patients. The results showed that none of the 10 haplotypes associated with the plasma concentration of methadone S-enantiomer in the third significant region of chromosome 19 (Fig 4(E)) achieved significance. Only haplotype A-G-C of rs12461727, rs73038469, and rs4803406 in LD block B2 in CYP2A7P1 and CYP2B7P1 showed a dose-response pattern and had a p value near a marginal significance level (raw p = 0.0644). More samples are needed to verify that haplotype A-G-C is an expression quantitative trait locus. In addition to the association between the CYP2B6 gene and the plasma concentration of methadone S-enantiomers, we discovered that the haplotype SNPs located in the gene regions of SPON1 and GSG1L were significantly associated with the plasma concentration of methadone S-enantiomers. To investigate the roles of SPON1 and GSG1L genes and their coexpression with CYP2B6 in a regulated plasma concentration of S-methadone in the liver, the cultured human liver cell line HepG2 was incubated with the CAR agonist CITCO, a CYP2B6 inducer [31]. The mRNA expression levels of CYP2B6, SPON1, and GSG1L in HepG2 cells were measured; the results revealed that the gene expression of CYP2B6, SPON1, and GSG1L can be activated concomitantly through the CAR activation pathway. Methadone reportedly activates the CAR [32], a phenomenon that partially explains why the three genes were significantly associated with the plasma concentrations of S-methadone. SPON1 was expressed in all the tissues, including the brain, lymph node, kidney, and intestine (S10(B) Fig). GSG1L was expressed mostly in connective tissue and the tissues of the brain, eye, and testis (S10(C) Fig). Both SPON1 and GSG1L were expressed in brain tissue. A previous study found that GSG1L was an α-amino-3-hydroxy-5-methyl-4-isoxazolepropionic acid (AMPA) glutamate receptor relevant to neurotransmission [33]. In the present study, a permutation-based association test with 100,000 random shuffles was conducted, revealing that haplotype C-T-G-C was associated with treatment emergent symptoms for adverse events related to MMT (empirical p = 1.82 × 10 −2 ). In addition, the preliminary interaction analysis suggested that there was an interactive effect between haplotype T-C in SPON1 and haplotype T-C-G-C-T in GSG1L (raw p = 7.55 × 10 −3 ). According to these findings, the roles of SPON1 and GSG1L in the brain and the methadone action pathway will be further investigated in future studies.
The methadone R-enantiomer reportedly tends to be metabolized by the CYP2C19 isozyme [8,9]. Our previous study found that a genotype combination of two functional SNPs (rs4986893 in exon 4 and rs4244285 in exon 5) on CYP2C19 were associated with the plasma concentration of methadone R-enantiomers (p = 0.007; the p values of rs4986893 and rs4244285 were 0.8 and 0.013, respectively). Moreover, the genotype combination can provide a gene dosage model for classifying poor, intermediate, and extensive metabolizers [34]. On the basis of the same dataset, the current analysis further revealed that rs4986893, rs4244285, and their genotype combination accounted for 0.198%, 2.014%, and 2.351% of the variation in the plasma concentration of methadone R-enantiomer, respectively.
This genome-wide single-locus association analysis did not find a single SNP on CYP2C19 that exceeded the genome-wide significance level in the association analysis of the plasma concentration of methadone R-enantiomer. In addition, rs4244285 on CYP2C19 was not designed in the Axiom CHB arrays. In contrast to the candidate gene study, this genome-wide singlelocus association analysis identified that only the SNP rs17180299 (raw p = 2.24 × 10 −8 ) had a strong signal of genetic association. Patients with more allele G of rs17180299 on average had higher plasma concentration of methadone R-enantiomer. Rs17180299, which accounted for 9.541% of the variation in the plasma concentration of methadone R-enantiomer, accounted for a substantially greater variation than did rs4986893, rs4244285, and their genotype combination. This SNP may be crucial for predicting the plasma concentration of methadone R-enantiomer and classifying heroin-dependent patients with different methadone metabolism responses. Permutation-based association tests with 100,000 random shuffles further revealed that haplotype C-G-G-C-G (Fig 3(D)), which contained rs17180299, was associated with the weight-adjusted methadone dosage (empirical p = 2.46 × 10 −2 ). Previous studies have suggested that methadone R-enantiomers induce an enhanced analgesic effect; however, it also exerts an adverse and toxic cardiac effect [35,36]. Therefore, patients with more allele G can be administered lower doses of methadone R-enantiomers to minimize the adverse effect.
Although rs17180299 is located in an intergenic region, the data from the Roadmap Epigenomics Projects revealed a relationship between rs17180299 and heterochromatic histone H3 lysine 9 trimethylation (H3K9me3) in the primary T regulatory cells from peripheral blood (pvalue signal score was 2.47354). Several studies reported the association of histone mark H3K9me3 and substance abuse. For example, cocaine dynamically regulates H3K9me3 [37] and H3K9me3 is a key player in the regulation of affective vulnerability and substance abuse [38]. The information suggests that rs17180299 may play a role in the regulation of plasma concentration of methadone R-enantiomer through epigenetic histone modification of H3K9me3.
We also imputed untyped SNPs using the SHAPEIT software [39] and IMPUTE2 software [40] and ran association analysis in a 2-Mb flanking region of rs17180299 on chromosome 9. Strong association signals were found in the almost complete LD region closest to rs17180299 (S11 Fig). Among the significant SNPs, the imputed SNP rs17083093 at 82,938,858 bp (Raw p = 6.80 ×10 −9 ) was even more significant than rs17180299 at 82,944,201 bp (Raw p = 2.24 ×10 −8 ). All of the significant SNPs were located in intergenic regions and only rs17180299 provides direct biological explanations for substance abuse.
In general, the available sample sizes in pharmacogenomic studies are smaller than the sample sizes in genome-wide association studies for complex diseases. In this MMT studies, most patients were not willing to come to the clinic in daily basis for more than 5 days to fulfil the requirement of the steady-state status in which the half-life of methadone was considered with an average of 24 hours. This study overcame the difficulty and recruited a moderate sample size of steady-state MMT patients through multicenter collaborations. Furthermore, we successfully replicated that CYP2B6, SPON1, and GSG1L are associated with the plasma concentration of methadone S-enantiomer. However, we may have overlooked some important findings such as the significance of rs17180299 and some important haplotypes because of a small sample size in the replication study. SNP rs17180299 cannot be replicated because of a low frequency of minor allele G in the replication study; several MMT patients with genotype GG were observed at the discovery stage but no patients with genotype GG were observed at the replication stage. We calculated the sample size required to replicate the association of significant individual haplotypes by using Quanto software [41]. For significant individual haplotypes with a stronger effect (e.g., the haplotypes on CYP2B6), on the basis of the current sample size, we had a power of 0.65 to replicate the results. For the haplotypes with a weaker effect (e.g., the haplotypes on GSG1L), on the basis of the current sample size, we had only a power of 0.25 to replicate the results. In the latter situation, we will need 330 replication samples in order to attain a power of 0.80. On the other hand, the replication of the association findings was mainly based on haplotype-specific association tests. Further investigation should be undertaken to prevent spurious association findings. We also evaluated the influence of "winner's curse" on the estimation of the proportion of variation explained by the five replicated haplotypes. The replicated haplotypes accounted for 17.75% and 19.51% of the variation in the plasma concentration of S-methadone at the discovery stage and replication stage, respectively. Because the two proportions of variation explained were close, this information suggested no winner's curse in this study. In this study, we examined samples from a single East-Asian population to reduce false-positive and false-negative results caused by genetic heterogeneity. Currently, no other genome-wide pharmacogenomic study on the plasma concentration of methadone in other populations exists for comparison.
In conclusion, this study was the first genome-wide pharmacogenomic study to identify genes associated with the plasma concentrations of methadone R-and S-enantiomers and their respective metabolites in a methadone maintenance cohort of heroin-dependent patients. A significant intergenic SNP (rs17180299) and four intergenic haplotypes on chromosome 9 accounted for 11.350% of the variation in the overall plasma concentration of methadone Renantiomers. Regarding the plasma concentration of methadone S-enantiomers, 17 significant haplotypes were identified: two haplotypes on SPON1, five haplotypes on GSG1L, and 10 haplotypes on CYP450 genes, including CYP2B6. These haplotypes accounted for approximately one-fourth of the variation in the plasma concentration of S-methadone. These results revealed that additional candidate genes are associated with the plasma concentration of methadone, affording insights into the mechanism of methadone metabolism, and facilitating further enhancement of MMT.

Study participants
This genome-wide pharmacogenomic study recruited 360 heroin-dependent patients who underwent the MMT. The inclusion and exclusion criteria for participation are described as follows. This study included patients who (1) resided in Taiwan and were Han Chinese; (2) were older than 18 years; (3) could participate in a clinical assessment in Chinese (including Mandarin and Taiwanese dialects); (4) were diagnosed as a patient with heroin dependence based on the Diagnostic and Statistical Manual of Mental Disorders, Fourth Edition; (5) underwent the MMT for at least 3 months; (6) received the MMT regularly in the past one week; (7) had no change of >10 mg in methadone dosage throughout the past one week; and (8) signed the informed consent form. This study excluded patients (1) who were pregnant and (2) who had severe cognitive impairment or severe comorbid mental disorders, including organic mental disorders and schizophrenia. For replication, we recruited another 78 independent samples under the same inclusion and exclusion criteria of the genome-wide pharmacogenomic study.

Methadone and its metabolites in the plasma
The plasma concentrations of methadones and its metabolite EDDP enantiomers were measured using high-performance liquid chromatography according to the settings described in our previous report [41]. A Chiral-AGP analytical column was used (5 μm, 100 × 3 mm; Chrom Tech, Cheshire, UK) with a Chiral-AGP guard column (10 × 3 mm) (Chrom Tech). Methadone, EDDP, and amitriptyline as an internal standard (40 ng) were extracted from the plasma samples by using a C18-E Strata solid-phase extraction column with a 100-mg/mL capacity (Phenomenex, Torrance, CA). In addition, 800-μL aliquots of each serum sample and 40 ng of the amitriptyline internal standard were added. The column was then washed and the remaining compounds were eluted. The collected eluent was then evaporated and the remaining residue was dissolved in 100 μL of mobile phase. A total sample volume of 50 μL was chromatographed. The intraday and interday coefficients of variation were 3.3% and 6.6% for Rmethadone, 2.5% and 5.6% for S-methadone, 1.6% and 3.9% for R-EDDP, and 2.8% and 5.5% for S-EDDP, respectively. The recovery rates for R-methadone, S-methadone, R-EDDP, and S-EDDP were 109.0 ± 7.6%, 96.7 ± 8.6%, 96.6 ± 6.6%, and 87.4 ± 3.2%, respectively. The recovery rate for the internal standard was 60.2 ± 4.8%. The details are described in our previous publications [42,43].

DNA samples and SNP genotyping experiments
The genomic DNA of 360 Han Chinese MMT patients at discovery stage and 78 Han Chinese MMT patients at replication stage were isolated from leukocytes by using a Puregene Blood kit C (QIAGEN Sciences, Germantown, Maryland). The DNA concentration was measured and adjusted to 15-20 ng/μL by using a NanoDrop ND-1000 Spectrophotometer (NanoDrop Technologies, DE). All samples were genotyped at the National Center of Genomic Medicine (Taipei, Taiwan) by using the Axiom Genome-Wide CHB 1 Array (Affymetrix, Inc., San Diego, CA) following the manufacturer's protocols (http://www.affymetrix.com). The genotype calling was performed using the Genotyping Console 4.0 with default parameters (http://www. affymetrix.com). This SNP genotyping platform optimized the genomic coverage of common SNPs having a minor allele frequency of >5% of the genomes of Han Chinese populations. In total, this array comprised 642,832 SNPs selected from the Axiom Genomic Database, which includes SNP probes from the International HapMap Project (http://hapmap.ncbi.nlm.nih. gov/), the 1,000 Genomes Project (http://www.1000genomes.org/), and the dbSNP database . The cultures were maintained in a humidified atmosphere with 5% CO 2 at 37°C, and the medium was refreshed every three or four days.
The HepG2 cells (Passages 98-100) were seeded on 10-cm petri dishes with a density of 2.5 × 10 5 cells/10 mL of culture medium supplemented with 10% FBS. After 24 h of overnight cellular attachment, the medium was replaced with a medium containing different concentrations of CITCO or a vehicle (0.1% DMSO). The drug was incubated with the cells for 24 h. The entire cellular RNA was isolated using a Trizol reagent according to the manufacturer's protocol (Life Technologies, Carlsbad, CA).
RT-PCR was conducted using a RevertAid H Minus First Strand cDNA Synthesis Kit (Fermentas, Waltham, MA) featuring a random hexamer and the RT-PCR operating on an ABI StepOne Plus System. The RT-PCR was performed for CYP2B6, SPON1, GSG1L, and a housekeeping gene, TATA-box binding protein (TBP), by using predesigned gene-specific TaqMan probes and primer sets (Hs03044634_m1 for CYP2B6, Hs01120488_m1 for SPON1, Hs00896151_m1 for GSG1L, and Hs00920497_m1 for TBP) purchased from Applied Biosystems (Applied Biosystems, Foster City, CA). Gene expression was quantified relative to TBP expression by using ABI StepOne Plus Software. The relative expression level of CYP2B6, SPON1, or GSG1L compared with that of TBP was defined as CT = [CT target gene CT TBP ], where CT is the cycle threshold. The ratio of the mRNA of a target gene to the mRNA of TBP was calculated by using 2 CT . The relative expression level of CYP2B6, SPON1, or GSG1L within various concentrations of the CITCO was calculated and compared according to an unpaired Student's t test by using GraphPAD Prism, Version 5 (GraphPad Software, Inc., La Jolla, CA). Data were represented as mean ± SD of triplicates of the experiment. Fig 1(A) depicts a flowchart of the overall statistical analyses. Quality of samples and SNPs were controlled, and genome-wide single-locus association tests and haplotype association tests were performed. The identified SNPs, haplotypes, and genes were further examined through a replication study. The details of analysis procedures are described below.

Statistical analyses
The quality of samples and SNPs was examined according to the procedures in the analysis protocol [44] in addition to examinations developed in this study. Fig 1(B) shows a summary of the procedures and results. Prior to examination, 3,071 control SNPs provided in the Axiom Genome-Wide CHB 1 Array were removed. Regarding sample quality control, first, we assessed the concordance of the sex information according to the self-reported gender and homozygosity pattern on X chromosomes; male participants with a homozygosity rate of <0.2 and female participants with a homozygosity rate of >0.8 were identified as having a sex error, and the homozygosity pattern of their X chromosomes was further examined. After the sex examination, we excluded 14,342 nonautosomal SNPs from the subsequent analyses. Second, we calculated the GCR; patients who had a GCR of <0.95 were identified as poor-quality samples. Third, the average genome-wide homozygosity rate was calculated and the 99.7% confidence interval was constructed; patients who had a genome-wide homozygosity rate outside the confidence interval were identified as outlying patients, and the pattern of their run of homozygosity was further examined. Fourth, we identified patients with unknown familial relationships; the IBD of each pair of individuals was estimated, and the pairs of individuals who had an estimated IBD of >0.1875 (i.e., greater than the average expected IBD values for second-and third-degree relatives) were identified as having cryptic relatedness. Finally, we identified individuals who showed evidence of divergent ancestry; the confidence bands of the first two principal components of allele frequency were constructed for each of five populations, including the Han Chinese population in Taiwan and four additional populations from the International HapMap II Project. The patients from Taiwan whose values of the first two principal components were located outside the confidence bands of the Han Chinese population in Taiwan were regarded as having divergent ancestry.
Regarding SNP quality, we first calculated the GCR of the SNPs; the SNPs that had a GCR of <0.95 were identified as poor-quality SNPs. Second, the MAF was estimated using an allele counting approach; the SNPs that had an MAF of <0.01 were identified as nonpolymorphic markers. We examined the HWE by using an exact HWE test [45]; SNPs that had a p value that was less than the Bonferroni genome-wide significance level were identified as deviations from the HWE. Finally, after the SNPs associated with the four quantitative traits of study were identified, we examined the SNP cluster plots to ensure that the genotype calls were reliable.
In the aforementioned quality control of samples and SNPs, the patterns of homozygosity on X chromosomes and excessive runs of homozygosity in the human genome were examined using LOHAS software [46]. The R programs in this study were used to construct principal components of allele frequency and their confidence bands and identify patients with divergent ancestry. Other statistical quality control procedures were performed using PLINK software [47].
Four quantitative traits (i.e., plasma concentrations of the methadone R-and S-enantiomers and their metabolites) were examined in the genetic association analysis. Before the genetic association analysis, data transformation to normality was performed as follows: the quantitative traits were transformed through a winsorization procedure with a threshold of 0.01, followed by a log transformation, and then standardized by subtracting the mean from the traits and dividing the results by the standard deviation. Normality of the transformed data was tested by a Kolmogorov-Smirnov goodness-of-fit test [23]. The aforementioned data transformation was completed using R programs.
The normalized quantitative traits were studied through genetic association analyses, including the genome-wide single-locus association test and haplotype association test. The genome-wide single-locus association test was performed according to a linear regression model by using PLINK software [47]. The four normalized quantitative traits were individually considered as dependent variables. The independent variables included a SNP variable coded as 0, 1, and 2 for genotypes AA, Aa, and aa, respectively, and three additional variables for covariate adjustments, including age, sex, and BMI. The p value of a one degree of freedom Wald test was obtained for each SNP to examine the additive genetic effect. During each analysis of the four quantitative traits, the p values on a scale of -log 10 obtained from genome-wise single-locus association tests were displayed in a Manhattan plot and a Quantile-Quantile plot. The false discovery rate [48] was determined for a multiple-test correction. Finally, the significant SNPs were depicted in a Regional Association Plot [49].
We performed genome-wide sliding-window haplotype-based association analyses with different window sizes individually. The human genome was partitioned into a series of sliding windows according to a specific window size. For each sliding window, a four-step haplotype analysis was performed as follows: First, haplotype configurations were inferred using a standard expectation-maximization algorithm [50]. Only haplotype configurations with a posterior probability of >0.01 and only haplotypes with a minor haplotype frequency of >0.01 were analyzed in the subsequent steps. Second, an omnibus haplotype association test with linear regression analysis with an adjustment for age, sex, and BMI was performed using PLINK software. The omnibus test, which involved a Wald test with a degree of freedom of h-1 (where h is the number of distinct haplotypes in the window), was used to examine whether all haplotypes in the study window were associated with an overall study trait.
Third, a window that was significant in the second step was expanded to encompass the flanking region on either side. The width of extension on each side was one half of the window width (i.e., the physical position of the last SNP minus the position of the first SNP in the window). If windows overlapped, they were combined to form a new window before the region was expanded. Finally, in each extended genomic region defined at the third step, LD blocks were constructed based on the Gabriel's method [24] by using HAPLOVIEW [51]. In each LD block, a linear regression analysis with an adjustment for age, sex, and BMI was conducted using PLINK to examine the association between the study traits and individual haplotypes versus that of other haplotypes. All haplotypes with a minor haplotype frequency of >0.01 were analyzed and the raw p values of individual haplotypes were adjusted for multiple testing by using the false discovery rate in the entire extended regions of the significant windows and also the Bonferroni correction by the extended regions of the significant windows. Only the haplotypes which passed both of the corrections were claimed to be significant.
For each of the studied quantitative traits, the proportion of variation accounted for by the significant SNPs and haplotypes were calculated as follows. Based on the variable(s) or covariate (s) in a regression model, the next SNP or haplotype was included if the SNP or haplotype produced the maximal increment of model R 2 . Using this variable inclusion procedure, we included all SNPs and haplotypes significantly associated with a quantitative trait in the regression model sequentially. Model R 2 revealed the coefficient of determination of a full regression model that contained one or more haplotypes. In addition, the marginal R 2 was calculated for each SNP or haplotype according to the regression model that contained only that SNP or haplotype.
In a replication study, we evaluated quality control of samples and SNPs using the same quality control procedures at the discovery stage. The identified SNPs and haplotypes in the genome-wide pharmacogenomic study were further examined using the same genetic association tests at the discovery stage. A meta-analysis by combining p-values at the discovery and replication stages was performed by using Fisher's product p-value method [52]. Supporting Information S1 Table. Distributions of windows with different sizes and distributions of LD blocks with different sizes. In terminology, the window width indicates the physical distance between the first and last SNPs in a window, and the window size indicates the number of SNPs in a window. For each window size of 2, 3, 4, 5, and 10 SNPs, the frequency of windows with an average width of 0-1 kb, 1-10 kb, 10-20 kb, 20-50 kb, and >50 kb in our study are provided. In the last column, the frequency of linkage disequilibrium (LD) blocks with an average width of 0-1 kb, 1-10 kb, 10-20 kb, 20-50 kb, and >50 kb in the HapMap Asian population are provided [24,25]. Table. Effect sizes of the significant haplotypes we identified. In accordance with the inclusion steps of haplotypes in each analysis of plasma concentration of R-methadone and Smethadone, we list the chromosome and linkage disequilibrium (LD) block where the haplotypes are located. Slope estimate and its standard error (s.e.) and the corresponding 95% confidence interval are provided in the final two columns. (DOCX) S3 Table. All haplotypes in association tests of individual haplotypes for plasma concentration of R-methadone. We list the chromosome, window, linkage disequilibrium (LD) block, and gene where the haplotypes are located. Haplotype frequencies and raw and adjusted p values of the significant haplotypes are provided in the final two columns. (DOCX) S4 Table. All haplotypes in association tests of individual haplotypes for plasma concentration of S-methadone. We list the chromosome, window, linkage disequilibrium (LD) block, and gene where the haplotypes are located. Haplotype frequencies and raw and adjusted p values of the significant haplotypes are provided in the final two columns. (DOCX) S5 Table. Descriptive statistics of three covariates and two quantitative traits. The number of individuals and means ± standard deviations (SD) of covariates and quantitative traits are provided by gender. In the final column, p values of the Kolmogorov-Smirnov Good-of-Fit tests for normality for the pre-and post-transformation data of four quantitative traits. (DOCX) S6 Table. Haplotype frequencies and p-values of the significant haplotypes, identified by our genome-wide pharmacogenomic study, at the discovery and replication stages. We list the chromosome (Chrom.), linkage disequilibrium (LD) block, and significant haplotypes followed by their haplotype frequencies (HF) and raw p-values (P) at discovery stage and replication stage. The results at the replication stage are further stratified according to the urine morphine test (UMT): UMT = All, Negative, and Positive. (DOCX) S7 Table. Results of a meta-analysis by combining p-values at the discovery and replication stages. We list the chromosome (Chrom.), linkage disequilibrium (LD) block, and significant haplotypes followed by their p-values according to the urine morphine test (UMT): UMT = All, Negative, and Positive. (DOCX) S1 Fig. Sex check plots. The homozygosity intensities of each individual are calculated and plotted based on 12,531 SNPs on chromosome X using software LOHAS. Five individuals exhibited a homozygosity pattern of X chromosome(s) inconsistent to their self-reported gender: (A) Individual 206-055: the self-report gender is "female" but the pattern of homozygosity intensity suggests that chromosome X is hemizygous or homozygous. (B) Individual 301-041: the self-report gender is "female" but the pattern of homozygosity intensity suggests that chromosome X is hemizygous or homozygous. (C) Individual 306-023: the self-report gender is "female" but the pattern of homozygosity intensity suggests that chromosome X is hemizygous or homozygous. (D) Individual 306-026: the self-report gender is "male" but the pattern of homozygosity intensity suggests that chromosome X is neither hemizygous nor homozygous. (E) Individual 401-023: the self-report gender is "male" but the pattern of homozygosity intensity suggests that chromosome X is neither hemizygous nor homozygous. (TIF) S2 Fig. Homozygosity check plot. The genome-wide homozygosity rate (i.e., one minus heterozygosity rate) and proportion of missing genotype (i.e., one minus genotyping call rate) of each individual are calculated and plotted based on autosomal SNPs. The two vertical reference lines are the mean homozygosity rate ± 3 standard deviations. Four individuals are located outside the lower and upper reference lines of homozygosity rate, including three samples (217-024, 301-156, and 306-013) with an over-high genome-wide homozygosity rate and one sample (206-059) with an over-low genome-wide homozygosity rate are found. The sample (206-059) was already removed because of a low genotyping call rate of <0.95. SNPs on CYP2B6 was measured by D' [27]. SNPs with a strong LD were framed in a black inverse diamond block which was defined according to Gabriel's confidence interval method [24]. (TIF) S10 Fig. Gene expression for CYP2B6, SPON1, and GSG1L in different tissues. The gene expression plots were generated from the Genotype-Tissue Expression (GTEx) Project [54] (http://www.gtexportal.org/home/). In each bar chart, the height of a bar indicates gene-level RPKM (i.e., reads per kilobase per million reads) values, from RNA sequencing experiments. (TIF) S11 Fig. Regional association plot and LD plot of the imputation region. Regional association plot and LD plot on the flanking region of rs17180299. In the regional association plot, the vertical axis is the raw p values of the association tests (in a scale of -log10) and the horizontal axis is physical position of SNPs (in unit of kb). Raw p values of single SNPs (orange circle) are shown. Rs17180299 is indicated by a square. In the LD plot, pairwise LD of SNPs was measured by D' [27]. SNPs with a strong LD were framed in a black inverse diamond block which was defined according to Gabriel's confidence interval method [24]. (TIF)