Metabolites of the Polycyclic Aromatic Hydrocarbon Phenanthrene in the Urine of Cigarette Smokers from Five Ethnic Groups with Differing Risks for Lung Cancer

Results from the Multiethnic Cohort Study demonstrated significant differences in lung cancer risk among cigarette smokers from five different ethnic/racial groups. For the same number of cigarettes smoked, and particularly among light smokers, African Americans and Native Hawaiians had the highest risk for lung cancer, Whites had intermediate risk, while Latinos and Japanese Americans had the lowest risk. We analyzed urine samples from 331–709 participants from each ethnic group in this study for metabolites of phenanthrene, a surrogate for carcinogenic polycyclic aromatic hydrocarbon exposure. Consistent with their lung cancer risk and our previous studies of several other carcinogens and toxicants of cigarette smoke, African Americans had significantly (p<0.0001) higher median levels of the two phenanthrene metabolites 3-hydroxyphenanthrene (3-PheOH, 0.931 pmol/ml) and phenanthrene tetraol (PheT, 1.13 pmol/ml) than Whites (3-PheOH, 0.697 pmol/ml; PheT, 0.853 pmol/ml) while Japanese-Americans had significantly (p = 0.002) lower levels of 3-PheOH (0.621 pmol/ml) than Whites. PheT levels (0.838 pmol/ml) in Japanese-Americans were not different from those of Whites. These results are mainly consistent with the lung cancer risk of these three groups, but the results for Native Hawaiians and Latinos were more complex. We also carried out a genome wide association study in search of factors that could influence PheT and 3-PheOH levels. Deletion of GSTT1 explained 2.2% of the variability in PheT, while the strongest association, rs5751777 (p = 1.8 × 10⁻⁶²) in the GSTT2 gene, explained 7.7% of the variability in PheT. These GWAS results suggested a possible protective effect of lower GSTT1 copy number variants on the diol epoxide pathway, which was an unexpected result.
Collectively, the results of this study provide further evidence that different patterns of cigarette smoking are responsible for the higher lung cancer risk of African Americans than of Whites and the lower lung cancer risk of Japanese Americans, while other factors appear to be involved in the differing risks of Native Hawaiians and Latinos.


Introduction
Lung cancer is the leading cause of cancer death in both men and women in the U.S., with more than 158,000 deaths in 2015, approximately 90% of which are caused by cigarette smoking [1]. Worldwide in 2012 there were 1,589,800 deaths from lung cancer, about 3 per minute [2]. Cigarette smoking accounts for 80% of the worldwide lung cancer burden in males and at least 50% in females [3]. Diminishing this horrible death toll by science-based tobacco control policies and related approaches to prevention of lung cancer is a critical goal of cancer research. An understanding of the mechanisms by which smoking causes lung cancer is one approach to reach this goal.
Results from the Multiethnic Cohort (MEC) study provide a potentially important lead for understanding mechanisms of smoking-induced lung cancer. This study demonstrated that, for the same number of cigarettes smoked, and particularly at low levels of smoking, African-Americans and Native Hawaiians had a significantly higher risk for lung cancer than Whites while the risks of Latinos and Japanese-Americans were significantly lower than those of Whites [4]. Our hypothesis is that there are phenotypic and genotypic differences among these groups that could account for their differing risks for lung cancer. Thus, we have analyzed urine samples from subjects in each ethnic group for metabolites of carcinogens and toxicants which are found in tobacco smoke and have also carried out genome wide association studies (GWAS) in search of common genetic variants that might predict differences in levels of these metabolites [5][6][7][8]. The results to date demonstrate important differences in levels of metabolites of nicotine, the tobacco-specific lung carcinogen 4-(methylnitrosamino)-1-(3-pyridyl)-1-butanone (NNK), acrolein, crotonaldehyde, and benzene among the ethnic groups which may contribute to the observed differences in lung cancer susceptibility in the MEC [5][6][7][8]. We have also observed strong relationships between polymorphisms in certain genes (CYP2A6, UGT2B10, and GSTT1) and concentrations of specific metabolites [5][6][7][8].
Polycyclic aromatic hydrocarbons (PAH) are a large group of structurally related compounds formed during the incomplete combustion and pyrolysis of organic matter. Common sources of exposure to PAH include polluted air, occupational exposures which occur in the aluminum and coke production industries and in roofing and paving with coal tar, consumption of broiled food, and cigarette smoking [9,10]. The planar PAH molecules, of which the most abundant have 3-5 aromatic rings, always occur together as mixtures. Many individual PAH demonstrate strong carcinogenic activity in the lung and other tissues of laboratory animals, and certain occupations associated with PAH exposure are considered carcinogenic to humans [9]. Benzo[a]pyrene, a widely studied prototypic and highly carcinogenic PAH, is also classified as carcinogenic to humans [9]. PAH are considered to be among the principal causes of lung cancer in cigarette smokers [11]. Phenanthrene, with 3 angular rings, is a typical member of the PAH class although it lacks significant carcinogenicity [9]. Phenanthrene metabolites have been widely used as dosimeters of PAH exposure because their levels vary with those of carcinogenic PAH metabolites and, due to the higher concentration of phenanthrene in PAH mixtures, phenanthrene metabolites are more readily quantified [12]. In the study reported here, we quantified 3-hydroxyphenanthrene (3-PheOH) and phenanthrene tetraol (PheT) (Fig 1) in the urine of smokers from the five different ethnic groups of the MEC, and carried out a GWAS to investigate genes associated with 3-PheOH and PheT levels in these smokers.

Subjects
Approval for this study, including the consent procedure, was obtained from the Institutional Review Boards of the University of Minnesota, the University of Hawaii, and the University of Southern California. Study participants provided written consent. IRB Code Number: 0912M75654. The subjects in this study are a subgroup of current smokers from the Multiethnic Cohort study (MEC). The MEC is a prospective cohort study investigating the association of genetic and lifestyle factors with chronic diseases in a population with diverse ethnic backgrounds [13]. The cohort consists of 215,251 men and women, ages 45 to 75 at baseline, belonging mainly to one of the following five ethnic/racial groups: African Americans, Native Hawaiians, Whites, Latinos, and Japanese Americans. Potential participants were identified and recruited between 1993 and 1996 in Hawaii and California (mainly Los Angeles County) through voter registration lists, drivers' license files, and Health Care Financing Administration data. Each participant completed a mail-in self-administered questionnaire, which included questions regarding demographic, dietary, lifestyle, smoking, and other exposure factors.
Approximately 10 years after their entry into the MEC, 2,393 current smokers participated in the MEC bio-specimen sub-cohort, in which a blood sample and either a first morning urine sample (subjects recruited in California) or an overnight urine sample (subjects recruited in Hawaii) were collected. Participants also completed an epidemiologic questionnaire, smoking history questionnaire and medication record. The overnight urine sample was collected starting between 5 pm and 9 pm (depending on the subject) and included all urine passed during the night as well as the first morning urine. All urine was kept on ice until processing. Aliquots were subsequently stored in a -80°C freezer until analysis.

Statistical methods
For this analysis, 2,310 subjects were retained. These subjects had total nicotine equivalents (TNE) >1.27 nmol/ml (4 times the limit of quantitation) [5] and had either PheT or 3-PheOH measured. Of the 2,310 subjects, 14 subjects were missing measures of PheT and 56 subjects were missing measures of 3-PheOH.
Additionally, among the subjects retained for this analysis, missing values of BMI or cigarettes per day (CPD) at the time of urine collection were imputed using the Markov Chain Monte Carlo method in PROC MI of the SAS v9.2 software (SAS Institute, Cary, NC). Details regarding the imputation have been published [5,7].
To examine the correlation between PheT, 3-PheOH and measures of smoking [cigarettes per day (CPD) and TNE], Pearson's partial correlation coefficients (r) were computed, adjusted for age, sex, race/ethnicity, and creatinine levels (natural log). The Wilcoxon Mann-Whitney test was used to compare the rank of PheT and 3-PheOH levels across race/ethnicity. Also, the covariate-adjusted geometric means for PheT and 3-PheOH were computed for each ethnic/racial group at the mean covariate vector. We used four multivariable linear models. The first adjusted for the following predictors: age at time of urine collection (continuous), sex (if applicable), race, BMI (natural log) and creatinine levels (natural log) and the second additionally adjusted for TNE. The remaining two models did not include creatinine levels. To better meet model assumptions, the measures of PheT and 3-PheOH were transformed by taking the natural log. For ease of interpretation, the values presented in the tables were back-transformed as geometric means to their natural scale.
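The log transformation and back-transformation described above can be illustrated with a minimal Python sketch. It covers only the unadjusted case: the geometric means in the tables come from covariate-adjusted linear models, which this sketch does not reproduce. The function name and the example concentrations are hypothetical and are not study data.

```python
import math

def geometric_mean(values):
    """Average the natural logs, then back-transform with exp()."""
    logs = [math.log(v) for v in values]  # natural-log transform
    return math.exp(sum(logs) / len(logs))

# Hypothetical metabolite concentrations (pmol/mL), for illustration only.
example_levels = [0.5, 1.0, 2.0]
gm = geometric_mean(example_levels)
```

Because the averaging is done on the log scale, a single very high value pulls the geometric mean upward much less than it would pull the arithmetic mean, which suits right-skewed biomarker data.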

Analysis of 3-PheOH and PheT
These analyses were carried out essentially as described previously [15,16]. Certain modifications were made in the 3-PheOH assay to accommodate the large number of samples. The assay was carried out on 350 μL samples of urine, using 96-well plates. The first solid-phase extraction was performed on an Isolute (Biotage) 400 mg supported liquid-liquid extraction plate with elution by toluene. The samples were further purified on 96-well plates containing 30 mg per well Copper Phthalocyanine Rayon, then silylated and analyzed by GC-electron impact-MS, essentially as previously described. Among blind duplicate samples, there was an inter-batch CV of 12.2% for PheT and 19.7% for 3-PheOH. The intra-batch CVs were 5.8% and 9.5% for the respective metabolites. Each batch contained approximately the same number of subjects from each sex and ethnic group.

Genetic Methods
We genotyped a total of 2,418 current smokers using the Illumina Human1M-Duo BeadChip (1,199,187 SNPs), as described previously [6]. We also imputed variants in the 1000 Genomes Project (http://www.1000genomes.org/) using SHAPEIT [17] and IMPUTE2 [18] with a cosmopolitan reference panel with all groups included. Post imputation, SNPs were filtered with an IMPUTE2 info score ≥0.30 and minor allele frequency (MAF) >1% in any MEC ethnic group. For PheT and 3-PheOH, a total of 2,225 and 2,185, respectively, study participants with complete genotype and phenotype data, and 11,892,802 SNPs/indels (1,131,426 genotyped and 10,761,376 imputed) were included in the final analyses. The GWAS analyses were adjusted for age, sex, BMI, TNE, race and principal components, with a p-value cut-off of 5 × 10⁻⁸ for genome-wide significance. Among regions with multiple globally significant associations we used conditional models to determine the leading loci. We also performed ethnic-specific analyses to discover population specific loci of importance.
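The post-imputation filter and the significance cut-off described above amount to two simple predicates, sketched below in Python. This is an illustration under stated assumptions, not the pipeline actually used: the function names, the dictionary layout, and the group labels are hypothetical, and the info-score comparison is assumed to be "greater than or equal to 0.30".

```python
MIN_INFO = 0.30        # IMPUTE2 info score threshold (assumed to be >=)
MIN_MAF = 0.01         # MAF must exceed 1% in at least one ethnic group
GENOME_WIDE_P = 5e-8   # genome-wide significance cut-off

def passes_imputation_qc(info_score, maf_by_group):
    """Keep a variant if it is well imputed and common in any one group."""
    common_somewhere = any(maf > MIN_MAF for maf in maf_by_group.values())
    return info_score >= MIN_INFO and common_somewhere

def is_genome_wide_significant(p_value):
    """True when the association p-value is below the cut-off."""
    return p_value < GENOME_WIDE_P

# Hypothetical variant: rare in one group, common in another.
example_maf = {"group_a": 0.005, "group_b": 0.02}
keep = passes_imputation_qc(0.85, example_maf)
```

Requiring MAF >1% in any one group, rather than in all groups, retains variants that are common in only a single population, which matters for the ethnic-specific analyses.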
A GSTT1 deletion polymorphism was successfully genotyped among 2,111 individuals via TaqMan, using a pre-designed TaqMan GSTT1 copy number assay (Hs00010004), and run on the 7900HT Fast Real-Time System (Life Technologies, Foster City, CA). Copy number counts were calculated with the Life Technologies CopyCaller v2.0 software.
For the smoking variables, least-square means (or geometric means) were estimated and compared between populations, in this case, copies of GSTT1. R-squared values were used to assess percentage of variability of PheT and 3-PheOH accounted for by the variants examined.
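As a hedged illustration of how an R-squared value expresses the percentage of variability explained, the Python sketch below computes R² for a simple one-predictor regression, for example log-transformed PheT on GSTT1 copy number. The study's models included additional covariates, so this unadjusted version is a simplification. The function name and the example numbers are hypothetical.

```python
def r_squared(x, y):
    """R^2 of a one-predictor least-squares fit (squared Pearson r)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return (sxy * sxy) / (sxx * syy)

# Hypothetical copy numbers (0, 1, or 2) and log-scale metabolite levels.
copies = [0, 0, 1, 1, 2, 2]
log_levels = [-0.4, -0.1, 0.0, 0.2, 0.1, 0.5]
explained = r_squared(copies, log_levels)  # fraction between 0 and 1
```

Multiplying the returned fraction by 100 gives the percentage of variability in the outcome accounted for by the predictor.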

Results
Characteristics of the study subjects whose urine was analyzed for 3-PheOH or PheT, and information on their cigarette consumption and TNE are summarized in Table 1. There were 331-709 subjects per ethnic group, with mean ages ranging from 60-65 years, and BMI from 24.4-26.9 kg/m². Creatinine ranged from a low of 54 mg/dL in Whites to a high of 89 mg/dL in African-Americans, and cigarettes per day from a low of 7.1 in Latinos to a high of 20 in Whites. TNE were lowest (27.2 nmol/mL) in Japanese Americans and highest (44.4 nmol/mL) in African Americans. All characteristics were significantly different among the groups (P<0.0001).
Males smoked significantly more cigarettes per day than females in all groups except African Americans (p = 0.08). TNE were significantly higher in males than in females among African Americans, Whites, and Japanese Americans (all p ≤ 0.01).
Correlations among 3-PheOH, PheT, TNE, total NNAL, SPMA, 3-HPMA, and HMPMA are summarized in Table 2, for the entire group and separately for males and females. All correlations were statistically significant (most with P<0.0001). The strongest correlations in the overall group were between 3-PheOH and PheT (Pearson's r = 0.49-0.59).
Because of the large differences in creatinine levels among the groups, metabolite levels were expressed per mL urine. Medians and interquartile ranges for 3-PheOH and PheT are summarized in Table 3. The highest levels of 3-PheOH, 0.931 pmol/mL urine, were found in African Americans, and these were significantly higher (P<0.0001) than those in Whites, 0.697 pmol/mL urine. The lowest levels of 3-PheOH, 0.621 pmol/mL urine, were found in Japanese Americans and these were significantly lower than in Whites (P = 4.50 × 10⁻³). High levels of PheT, 1.13 pmol/mL urine, were also found in African Americans, and were significantly higher than in Whites (0.853 pmol/mL urine, P = 0.0001). PheT levels in Japanese Americans were not significantly different from those in Whites. When dichotomized by sex, African Americans had higher levels of both 3-PheOH and PheT than Whites and Japanese Americans had lower levels than Whites, and most differences were significant.
Levels of 3-PheOH in the urine of Latinos were similar to those in Whites while levels in Native Hawaiians were significantly lower than in Whites (P = 0.0107). Levels of PheT in the urine of Latinos were significantly higher than in Whites (P<0.0001) while those of Native Hawaiians were similar to Whites. When dichotomized by sex, most of these comparisons were non-significant or borderline, except PheT in Latino females, which was significantly higher than in Whites (P = 7.00 × 10⁻⁴).
Geometric means of 3-PheOH and PheT, expressed per mL of urine, are presented in Table 4. In Model 1, adjusting for age, sex, and BMI, African Americans had the highest levels of both 3-PheOH and PheT and Japanese Americans the lowest; these levels were significantly different from those in Whites. In Model 2, additionally adjusting for TNE, similar results were obtained except that the Japanese Americans' 3-PheOH levels were not significantly different from those of Whites, and their PheT levels were significantly higher than Whites. Latinos had significantly higher levels of 3-PheOH and PheT than Whites in both models while Native Hawaiians had lower levels than Whites, and these were significant for 3-PheOH in Model 1. Additional adjustment for creatinine in Model 1 abolished the significant difference between African Americans and Whites in levels of 3-PheOH and PheT (data not shown), due to the relatively high creatinine levels in African Americans (Table 1).
In the GWAS analyses of 3-PheOH and PheT, we observed little evidence of inflation in the test statistic in the overall multiethnic sample (λ = 1.0; S1 and S2 Figs) or in any single ethnic group (0.96 ≤ λ ≤ 1.0) for either phenotype. In the overall analysis for PheT, there were 408 globally significant variants located between 24.2 and 24.4 Mb near the region of glutathione S-transferase T1 and T2 (GSTT1 and GSTT2) on chromosome 22q11 (S3 Fig, S1 Table). There were three other rare globally significant associations on chromosomes 1, 6 and 14. The chromosome 1 variant is located in the intronic region of an open reading frame of gene C1orf159, 109 Mb from the GSTM1 gene. The chromosome 6 variant is located within gene PPP1R14C, a protein phosphatase gene involved in protein synthesis and metabolism, and the chromosome 14 variant is near gene TTC6 / FOXA1. As these three variants are rare, we focus our presentation on the globally significant associations on chromosome 22 near genes GSTT1 and GSTT2. When the 408 globally significant associations were conditioned on the GSTT1 deletion polymorphism, there remained 262 globally significant variants; thus, the GSTT1 deletion does not fully explain the association with PheT (S2 Table). The GSTT1 deletion explains only 2.2% of the variability in PheT, compared to the strongest association, rs5751777 (p = 1.8 × 10⁻⁶²), which on its own explains 7.7% of the variability in PheT. The deletion allele, which is associated with significantly lower PheT levels in the overall group (p = 3.4 × 10⁻¹⁶), and in Native Hawaiians, Whites, and Latinos, varies in frequency across populations from 0.40 in Latinos to 0.66 in Japanese (Table 5). The highly significant associations observed at 22q11 are in part explained by the GSTT1 deletion (n = 2,097; Table 5).
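The inflation factor λ is conventionally defined as the median of the observed 1-degree-of-freedom chi-square test statistics divided by the median expected under the null hypothesis (about 0.455). The Python sketch below follows that general definition, starting from two-sided p-values. It is an illustration only, not necessarily the exact computation used in this study, and the function name and example p-values are hypothetical.

```python
from statistics import NormalDist, median

CHI2_1DF_MEDIAN = 0.4549364  # median of a chi-square distribution with 1 df

def genomic_inflation(p_values):
    """Lambda = median observed chi-square (1 df) / expected null median."""
    norm = NormalDist()
    # A two-sided p-value maps to a z-score via the lower tail p/2;
    # z squared is the corresponding 1-df chi-square statistic.
    chi2 = [norm.inv_cdf(p / 2.0) ** 2 for p in p_values]
    return median(chi2) / CHI2_1DF_MEDIAN

# Hypothetical p-values spread evenly over (0, 1), as expected under the null.
null_like = [(i + 0.5) / 1000 for i in range(1000)]
lam = genomic_inflation(null_like)
```

Values of λ close to 1.0, as reported here, indicate that the bulk of the test statistics follow the null distribution, so population stratification or other systematic bias is unlikely to be driving the associations.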
Through forward selection regression analysis (at a threshold of 1 × 10⁻³) of the 262 variants that remained significantly associated with PheT after adjustment for the GSTT1 deletion polymorphism, we identified 5 independent variants including our top association rs5751777 (Table 6). Together, these 5 variants explained 9.33% of the variability in PheT.
In ethnic-specific GWAS analyses, there were 16 globally significant variants on chromosome 22 near the GSTT2 gene for African Americans; for Latinos there were 141 globally significant variants; for Japanese Americans there were 206 globally significant variants; and for both the Native Hawaiians and Whites there were 59 globally significant variants near the GSTT2 and GSTTP1 genes (S4-S8 Figs). The GSTT1 deletion polymorphism explained from 0.1% of variability in PheT among African Americans to 5.7% of the variability in PheT among Latinos (Table 5). Increasing GSTT1 copy numbers were significantly associated with higher PheT levels (p = 3.4 × 10⁻¹⁶) in the overall group, and in Native Hawaiians, Whites, and Latinos (Table 5).
There were no globally significant variants for either the overall or the ethnic-specific analyses for 3-PheOH using our genomic threshold of p < 5 × 10⁻⁸ (S2 Fig).

Discussion
In previous analyses of tobacco smoke carcinogen and toxicant metabolites in urine samples from the MEC, we have observed that African Americans had the highest TNE levels and Japanese Americans the lowest, both being significantly different from the intermediate levels in Whites [5]. Consistent with these observations, African Americans also had the highest levels of NNAL and its glucuronides, metabolites of the tobacco-specific lung carcinogen NNK, while Japanese Americans had the lowest, and these were significantly different from those in Whites [7]. Further studies demonstrated completely analogous results for S-phenylmercapturic acid, a metabolite of the volatile toxicant and carcinogen benzene [14]. In the study reported here, we have extended these analyses to 3-PheOH and PheT, metabolites of phenanthrene, a representative PAH. Consistent with the previous studies, we find high levels of 3-PheOH and PheT in African Americans, and the lowest level of 3-PheOH in Japanese Americans, and these were significantly different from those in Whites. Overall, these results provide compelling evidence that exposure to representative tobacco smoke carcinogens (NNK, benzene, and PAH) is highest in African Americans, who are at highest risk for lung cancer in the MEC, and lowest in Japanese Americans, at lowest risk for lung cancer. Thus, differences in carcinogen exposure from cigarette smoke among these three groups can at least partially explain their differing risks for lung cancer. African Americans smoke cigarettes differently from Whites, extracting more nicotine and more carcinogens per cigarette (Table 1 and [5,7]). This is also consistent with the observation that, among African Americans, the time to first cigarette upon waking in the morning is significantly shorter than in Whites [19].
Japanese Americans have a higher number of CYP2A6 polymorphisms associated with lower nicotine metabolism than Whites, and consequently need to obtain less nicotine per cigarette and are therefore exposed to lower amounts of other toxicants and carcinogens than Whites [20], although that effect was not so strong in this study, perhaps due to other PAH exposure routes in Japanese Americans. Hydroxylated PAH such as urinary 3-PheOH and the closely related compound 1-hydroxypyrene are accepted biomarkers of PAH exposure [21]. These simple hydroxylated PAH are metabolically formed by a cytochrome P450-catalyzed epoxidation of the PAH ring followed by rearrangement of the epoxide ("the NIH shift") or by direct oxidation of the PAH ring [9,10]. These simple metabolic processes favor a direct relationship between PAH exposure and level of the hydroxylated metabolite in urine. Furthermore, in the case of phenanthrene, approximately 90% of its excreted metabolites are found in urine, based on studies in rats [22]. Thus urinary 3-PheOH is an excellent biomarker of exposure to phenanthrene, a representative PAH closely related to carcinogenic PAH such as benz[a]anthracene, chrysene, benzofluoranthenes, and benzo[a]pyrene which are simultaneously formed during cigarette smoking by combustion of tobacco. The NHANES study of 3-PheOH reported levels of this urinary metabolite in smokers that are similar to those in our study [12]. The metabolism pathway leading to PheT is more complex, and interpretation of the resulting data might not be as straightforward [23]. The initially formed epoxide is hydrated to a dihydrodiol, catalyzed by epoxide hydrolase. This dihydrodiol is then further oxidized to a dihydrodiol epoxide, which then undergoes hydrolysis to form PheT. As discussed below, the epoxides potentially can be detoxified by glutathione-S-transferases, which could influence PheT levels. 
For higher PAH such as benzo[a]pyrene, the same sequence of reactions resulting in PheT leads to a major ultimate carcinogen, benzo[a]pyrene-7,8-dihydrodiol-9,10-epoxide (BPDE) [10,24,25]. Thus, PheT can be considered a surrogate for PAH exposure plus metabolic activation, and levels of PheT in human urine correlate with those of the tetraol derived from BPDE [26]. In spite of these complexities, exposure to phenanthrene is a major driving force for PheT formation, as indicated by the correlation between 3-PheOH and PheT (Pearson's r = 0.49-0.59). Considerable evidence favors a significant role for PAH as causes of lung cancer in cigarette smokers [27]. Multiple PAH have been identified in cigarette smoke and some of these are powerful carcinogens, readily inducing tumors at the site of application, including the lung, as well as at distant sites [11,28,29]. Subfractions of cigarette smoke condensate enriched in PAH are tumor initiators on mouse skin, and the combination of cigarette smoke PAH and tumor promoters or co-carcinogens partially recapitulates the carcinogenic activity of cigarette smoke condensate in mouse skin experiments [28]. Total levels of established lung carcinogenic PAH in cigarette smoke are about 50-60 ng/cigarette, similar to levels of the lung carcinogen NNK [27]. DNA adducts of BPDE have been identified in lung tissue from some smokers, and reactions of BPDE and other PAH diol epoxide metabolites with the p53 tumor suppressor gene produce adducts at the same sites known to be mutational hot spots in human lung cancer (although the same pattern is also formed by acrolein) [30][31][32][33].
Levels of 3-PheOH were significantly lower in Native Hawaiians than in Whites, which is consistent with previous observations for TNE and total NNAL. These results suggest that intake of smoke constituents in Native Hawaiians is less than in Whites, in spite of the fact that they are at higher risk for lung cancer. We have reported however that urinary levels of the acrolein biomarker 3-HPMA were highest in Native Hawaiians compared to the other MEC groups, suggesting that endogenous processes related to lipid peroxidation perhaps play a role in the relatively high risk of Native Hawaiians for lung cancer. Levels of PheT were significantly higher in Latinos than in Whites, and this effect was confined to females. This is different from our observations with TNE and total NNAL, and requires further study. Collectively, these observations suggest that there are factors influencing lung cancer susceptibility in Native Hawaiians and Latinos that have not been recognized in our metabolic and genetic studies to date.
The GWAS demonstrated a signal associated with PheT on chromosome 22 near GSTT1 and GSTT2. While glutathione S-transferases M1 (GSTM1), GSTP1, and GSTA1 are known to catalyze the detoxification of diol epoxides formed from various PAH, there is scant evidence that GSTT1 or GSTT2 have activity in the detoxification of PAH epoxides or diol epoxides, including those formed from phenanthrene [34][35][36][37][38][39]. To the extent that such activity exists, we would have expected lower levels of PheT in individuals with both copies of the GSTT1 gene because the diol epoxide precursor to PheT would be intercepted (Fig 1), but instead we observed significantly higher levels of PheT in individuals with both copies of the gene (Table 5). There is evidence from some previous studies that the GSTT1 deletion is protective, which would be consistent with our surprising observation of lower PheT levels in the null individuals, but this requires further study [40]. Apparently, the relationship of GSTT1 and GSTT2 activity to PheT levels may be more complex than previously thought.
In summary, the results of this study demonstrate higher uptake of carcinogenic PAH, as demonstrated by urinary 3-PheOH and PheT levels, in African American smokers than in Whites and lower levels in Japanese American smokers. These observations are consistent with our previous studies demonstrating patterns of nicotine, tobacco-specific nitrosamine, and benzene uptake in these groups, consistent with their differing risks for lung cancer. The relationships of PAH uptake to lung cancer risk in Native Hawaiians and Latinos appear to be more complex. We also observed strong effects of GSTT1 deletion and a GSTT2 polymorphism on PheT levels, suggesting previously unknown factors influencing the PAH diol epoxide metabolic activation pathway.