A Candidate Gene Approach Identifies the CHRNA5-A3-B4 Region as a Risk Factor for Age-Dependent Nicotine Addiction

People who begin daily smoking at an early age are at greater risk of long-term nicotine addiction. We tested the hypothesis that associations between nicotinic acetylcholine receptor (nAChR) genetic variants and nicotine dependence assessed in adulthood will be stronger among smokers who began daily nicotine exposure during adolescence. We compared nicotine addiction, measured by the Fagerström Test of Nicotine Dependence, in three cohorts of long-term smokers recruited in Utah, Wisconsin, and by the NHLBI Lung Health Study, using a candidate-gene approach with the neuronal nAChR subunit genes. This SNP panel included common coding variants and haplotypes detected in eight α and three β nAChR subunit genes found in European American populations. In the 2,827 long-term smokers examined, common susceptibility and protective haplotypes at the CHRNA5-A3-B4 locus were associated with nicotine dependence severity (p = 2.0×10⁻⁵; odds ratio = 1.82; 95% confidence interval 1.39–2.39) in subjects who began daily smoking at or before the age of 16, an exposure period that results in a more severe form of adult nicotine dependence. A substantial shift in susceptibility versus protective diplotype frequency (AA versus BC = 17%, AA versus CC = 27%) was observed in the group that began smoking by age 16. This genetic effect was not observed in subjects who began daily nicotine use after the age of 16. These results establish a strong mechanistic link among early nicotine exposure, common CHRNA5-A3-B4 haplotypes, and adult nicotine addiction in three independent populations of European origins. The identification of an age-dependent susceptibility haplotype reinforces the importance of preventing early exposure to tobacco through public health policies.


Introduction
Nicotine addiction has profound clinical and public health consequences because it is associated with reduced ability to cease tobacco use [1,2], and tobacco use is the leading cause of preventable morbidity and mortality in developed countries [3]. Meta-analysis of numerous twin studies shows that both genes and environment play an important role in smoking-related behaviors [4]. Nicotine is the primary agent in tobacco smoke that leads to addiction, and while progress has been made in finding genes that contribute to nicotine addiction in humans [5][6][7][8][9], there is a great need for additional progress.
The neuronal nicotinic acetylcholine receptor genes (nAChRs) are likely candidates for harboring functional variants contributing to nicotine addiction since these ligand-gated ion channels are the initial physiological targets of nicotine in the central and peripheral nervous system. They have also been implicated in nicotine addiction in animals where chronic nicotine exposure leads to persistent changes in brain nAChRs [28,29], and where engineered mouse models support the crucial role of the α4β2 nAChRs in nicotine addiction [30,31]. Previous candidate gene association studies using six CHRNA4 SNPs found evidence for associations with measures of nicotine dependence in Chinese men [5], and females of European-American and African-American descent [6]. A wider survey of nAChR gene variants in a case-control study for nicotine dependence found evidence for nominally significant associations in CHRNA7, CHRNA9, CHRNA5 and CHRNB3 in young Israeli women [7]. Several recent genome-wide association studies (GWAS) using either nicotine dependent smokers as cases and non-dependent smokers as controls [8] or cigarettes per day as a quantitative trait [32] have failed to yield statistically significant findings at the genome-wide level. However, when these studies independently examined nAChR candidate genes [9,32], evidence was found for associations between common variants in CHRNB3 and the CHRNA5-A3-B4 gene cluster at 15q24 and their respective phenotypes. Most recently, and subsequent to submission of this article, three separate GWAS reports provide strong evidence for an association between SNP variation at 15q24 and lung cancer [33][34][35]. One study suggests that the effect of 15q24 variants on lung cancer is primarily mediated through smoking behavior [35], while the other studies failed to associate 15q24 variants with smoking behavior and suggest that the disease mechanism with lung cancer is not explained by an association with nicotine addiction [33,34].
This report describes results of a comprehensive haplotype discovery and nicotine addiction association study within nAChR genes across three European American populations of 2,827 long-term smokers. Moreover, it tests the a priori hypothesis that associations between nAChR genetic variants and nicotine dependence severity assessed in adulthood will be stronger among smokers who began daily smoking in adolescence than among those who did not. The results show significant dependence-haplotype associations in the CHRNA5-A3-B4 gene cluster occurring only in the early onset subjects, consistent with the hypothesis that the association of genetic risk variants for nicotine addiction may be influenced by the age of onset of daily smoking.

Study Populations
Nicotine dependence was assessed with the Fagerström Test of Nicotine Dependence (FTND) because it predicts important dependence attributes such as the likelihood of relapse back to tobacco use and biochemical measures of nicotine self-administration [2,36,37]. The FTND assesses a pattern of heavy, compulsive smoking and has been used in other genetic association studies [5][6][7][8][9]. Genetic variants were assessed in 2,827 subjects from three European American cohorts with a mean age of 49.6 years (SD = 9.5), 1155 (41%) of whom were females (Table S1). All participants were either current or previous daily cigarette smokers; 222 (8%) had not smoked for at least 2 years prior to participation in the study. One cohort comprised participants in a study of genetic risk factors for nicotine dependence and chronic obstructive pulmonary disease (COPD) recruited in Salt Lake City, Utah (N = 486, UT). Another cohort was made up of participants in randomized trials of smoking cessation interventions recruited in Madison and Milwaukee, Wisconsin (N = 398, WI). A final cohort was drawn from the Lung Health Study (N = 1943, LHS), a multisite longitudinal study of COPD sponsored by the Division of Lung Disease of the National Heart, Lung and Blood Institute [38]. All ex-smokers were in the UT cohort and responded to smoking related assessments based upon their prior smoking patterns [39]. Participants began daily smoking at a mean age of 17.3 (SD = 4.1), smoked a mean of 28.3 cigarettes per day (CPD, SD = 13.9), smoked for a mean 30.7 years (SD = 9.5), and had a mean FTND score of 5.7 (SD = 2.2). Consistent with a history of chronic, heavy smoking, most subjects in the UT and LHS cohorts (N = 2302, 81% of the total sample) had mild to moderate COPD as determined by pulmonary function testing; lung function testing was not performed on the WI subjects.
Age of onset of daily smoking was dichotomized into early onset (onset of daily smoking at age 16 or younger) vs. late onset (onset of daily smoking at age 17 or older). Previous studies have shown that 16 vs. 17 is an appropriate age range to differentiate the impact of early from late nicotine exposure on dependence [40]. We used a dichotomous variable to represent age because the effect was hypothesized to be nonlinear in nature due to the fact that nicotine exerts distinct effects when administered during adolescence vs. adulthood [24]. We attempted to reflect this underlying causal model in our analyses by creating an age cut-score that reflected this nonlinear effect. Also, the age 16 vs. 17 dichotomy approximated a median split, which yielded near equivalent statistical power for tests within the two age samples; 46% of the sample were thus classified as early onset smokers. Research suggests that age of smoking onset can be reliably assessed with retrospective assessments [41]. To compare distinct levels of nicotine dependence, FTND scores were dichotomized into low (FTND = 0-4) and high (FTND = 6-10) dependence [7]. An FTND score of 6 or higher identifies subjects with high nicotine dependence [42], while a score of "5" has an ambiguous relation with dependence [42][43][44]. Of the initial sample of 2827, 731 (26%) were assigned to the low dependence condition, 1556 (55%) were assigned to the high dependence condition and 540 (19%) had an intermediate score of 5.
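The two dichotomies described above can be sketched as small helper functions. This is only an illustration: the function names are hypothetical and are not taken from the study's analysis code, while the cut-points and group sizes are those stated in the text.

```python
# Illustrative helpers; function names are hypothetical, cut-points come from the text.

def classify_onset(age_daily_smoking):
    """Early onset = daily smoking at age 16 or younger; late onset = 17 or older."""
    return "early" if age_daily_smoking <= 16 else "late"


def classify_ftnd(score):
    """Low = 0-4, intermediate = 5, high = 6-10 (FTND total score)."""
    if not 0 <= score <= 10:
        raise ValueError("FTND total score must be between 0 and 10")
    if score <= 4:
        return "low"
    if score == 5:
        return "intermediate"
    return "high"


# Group sizes reported in the text (initial sample of 2827).
group_sizes = {"low": 731, "high": 1556, "intermediate": 540}
total = sum(group_sizes.values())
percentages = {name: round(100 * n / total) for name, n in group_sizes.items()}
```

Subjects in the intermediate group are the ones left out of the primary low vs. high contrast.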
Consistent with previous reports, age of onset of daily smoking was inversely related to level of dependence in the present sample. Age of onset and FTND score were correlated, r(2135) = −0.18, P < 0.001, and dichotomized age of onset and FTND were highly associated (χ² = 51.6, 1 df, P < 0.001). In early onset smokers, 24.2% had low FTND scores compared with 38.7% of late onset smokers.

Author Summary
Tobacco use is a global health care problem, and persistent smoking takes an enormous toll on individual health. The onset of daily smoking in adolescence is related to chronic use and severe nicotine dependence in adulthood. Since nicotine is the key addictive chemical in tobacco, we tested the hypothesis that genetic variants within nicotinic acetylcholine receptors will influence the severity of addiction measured in adulthood. Using genomic resequencing to define the patterns of variation found in these candidate genes, we observed that common haplotypes in the CHRNA5-A3-B4 gene cluster are associated with adult nicotine addiction, specifically among those who began daily smoking before age 17. We show that in populations of European origins, one haplotype is a risk factor for dependence, one is protective, and one is neutral. These observations suggest that genetic determinants expressed during human adolescence contribute to the risk of lifetime addiction severity produced from early onset of cigarette use. Because disease risk from the adverse health effects of tobacco smoke is related to lifetime tobacco exposure, the finding that an age-dependent effect of these haplotypes has a strong influence on lifetime smoking behavior reinforces the public health significance of delaying smoking onset.

SNP and Haplotype Discovery
As an initial step, we conducted an exhaustive SNP discovery survey surrounding the neuronal nAChR coding regions in a small sample selected to represent the most extreme heavy and light dependent smokers. We used a genomic resequencing strategy to generate a dense set of variants in α-like nAChR subunits (α2, α3, α4, α5, α6, α9, and α10) and β-like nAChR subunits (β2, β3, and β4) in 144 smokers and 48 population-matched non-smokers. This survey identified 262 SNPs, including 38 nonsynonymous, 35 synonymous, 57 UTR, 1 stop, and 1 frameshift (see Table S2 for location and allele frequency). The stop and frameshift alleles, both located in CHRNA6, were each observed as heterozygotes in single individuals, indicating that the depth of the resequencing survey reached the boundary of rare loss-of-function alleles. For a preliminary χ² analysis of association in the resequencing sample, we subdivided the 144 smokers into high (n = 72) and low-dependent (n = 72) categories. Even with this small selected sample set, a nominally significant association signal (P < 0.01) was observed for five SNPs located in the CHRNA5-A3-B4 cluster on chromosome 15q25: rs951266 (intronic A5), rs16969968 (nonsynonymous A5), rs8192482 (3′-UTR A5), rs4887067 (intergenic A5-A3) and rs17487223 (intronic B4). All five SNPs are in strong linkage disequilibrium. In order to extend this survey, we defined haplotype structures in all nAChR subunits based on inference from the unphased resequencing data, and derived a minimal set of tagging SNPs (n = 87) for genotyping in the larger three-cohort sample.
Genotyping of all 87 tagging SNPs across the neuronal nAChR α and β subunits was first carried out in the UT (n = 439) and WI cohorts (n = 339), and single marker allelic tests using χ² statistics were evaluated for association with FTND scores in early versus late onset samples (Table S3). The strongest association signals (P < 0.005) to dichotomized FTND by age of onset were observed only in the early onset sample for six SNPs within the CHRNA5-A3-B4 gene cluster with significant allele test P values ranging from 4.8×10⁻³ to 5.0×10⁻⁴ (Table 1). No other nAChR SNP had a P value less than 0.02 in either the early or late onset group. These six CHRNA5-A3-B4 SNPs stratified into two groups of significant linkage disequilibrium (LD), and examination of phased resequencing haplotypes (Figure 1) revealed that these association signals occur within a ~50 kb LD block-like structure spanning CHRNA5 and CHRNA3. This LD block in the resequencing samples is composed of four major haplotypes: Haplotype A (H_A) = 38%, Haplotype B (H_B) = 34%, Haplotype C (H_C) = 20%, Haplotype D (H_D) = 5%.
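A single-marker allelic test of the kind described above can be sketched as follows. This is a minimal illustration of a 1-df allelic χ² test without continuity correction, not the study's own pipeline; the function name and the genotype counts are hypothetical.

```python
import math


def allelic_chi_square(case_genotypes, control_genotypes):
    """Allelic 2x2 chi-square test (1 df, no continuity correction).

    Each argument is a tuple (n_AA, n_Aa, n_aa) of genotype counts.
    Returns (chi_square, p_value).
    """
    def allele_counts(genotypes):
        n_hom_major, n_het, n_hom_minor = genotypes
        return 2 * n_hom_major + n_het, 2 * n_hom_minor + n_het

    a, b = allele_counts(case_genotypes)      # high-dependence group: A, a alleles
    c, d = allele_counts(control_genotypes)   # low-dependence group: A, a alleles
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        raise ValueError("a row or column of the allele table is empty")
    chi2 = n * (a * d - b * c) ** 2 / denom
    # Upper tail of the chi-square distribution with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical genotype counts, for illustration only.
chi2, p = allelic_chi_square((20, 20, 10), (10, 20, 20))
```

In the stratified design used here, such a test would be run separately within the early onset and late onset samples.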

CHRNA5-A3-B4 Haplotype Effects
To evaluate the significance of these CHRNA5-CHRNA3 single marker associations and the underlying haplotypes, five tagging SNPs capable of distinguishing these four haplotypes were assessed for association to FTND by age of daily smoking onset (daily smoking onset by vs. after age 16) in the combined UT-WI-LHS cohorts (total N = 2,827). Observed frequencies of Haplotypes A-D, respectively, by cohort were as follows: LHS, 39%, 37%, 19%, and 5%; UT, 38%, 38%, 20%, and 4%; and WI, 41%, 38%, 17%, and 4%. A test of all four haplotypes showed a significant omnibus association P value of 2.6×10⁻⁴ with high vs. low FTND score within the early onset group (Table 2), but not within the late onset group (P = 0.444). This omnibus haplotype test was supported by single marker association P values ranging from 1.1×10⁻³ to 1.7×10⁻⁴ (Figure 1). The main haplotype effect can be partitioned into two significant haplotype-specific associations (Table 2) with both a susceptibility effect for high dependence in the early onset group for haplotype H_A (odds ratio (OR) 1.50, 95% CI 1.21 to 1.86, P = 1.3×10⁻⁴), and a protective effect for haplotype H_C (OR 0.66, 95% CI 0.52 to 0.85, P = 1.1×10⁻³). Importantly, haplotype H_B shows no significant association (OR 0.93, 95% CI 0.76 to 1.14, P = 0.50), suggesting that the two haplotype-specific associations are the result of distinct susceptibility (H_A) and protective (H_C) haplotype effects in the early onset group.
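For readers less familiar with how odds ratios and their intervals relate, the sketch below computes an odds ratio with a Wald 95% confidence interval from a 2×2 table and checks a simple property of the values reported above: a Wald interval is symmetric on the log scale, so the geometric mean of its bounds should recover the odds ratio. The function names and the example counts are hypothetical, and the Wald form is an assumption; the haplotype-specific estimates in the text come from the authors' own software.

```python
import math


def odds_ratio_wald_ci(a, b, c, d, z=1.96):
    """Odds ratio and Wald 95% CI from a 2x2 table of counts.

    a, b: target haplotype / other haplotypes in the high-dependence group
    c, d: target haplotype / other haplotypes in the low-dependence group
    """
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


def ci_log_midpoint(lower, upper):
    """Geometric mean of the CI bounds (midpoint on the log scale)."""
    return math.sqrt(lower * upper)


# Hypothetical chromosome counts, for illustration only.
example_or, example_lower, example_upper = odds_ratio_wald_ci(30, 20, 20, 30)

# Values reported in the text: (OR, lower bound, upper bound).
reported = {
    "H_A": (1.50, 1.21, 1.86),
    "H_C": (0.66, 0.52, 0.85),
    "H_B": (0.93, 0.76, 1.14),
}
midpoints = {name: ci_log_midpoint(lo, hi) for name, (_, lo, hi) in reported.items()}
```

For all three haplotypes the log-scale midpoint agrees with the reported odds ratio to two decimal places.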
An interaction between haplotype status and age of onset in predicting dependence severity was also obtained via logistic regression analyses (Table 3). A significant interaction effect (P = 0.006) was found, as well as a significant H_A vs. H_C contrast within the early onset condition (P = 2.0×10⁻⁵, OR = 1.82, 95% CI 1.39 to 2.39). H_C was associated with reduced likelihood of severe nicotine dependence relative to H_A, but only among those smoking daily at age 16 or younger. Gender was associated with age of onset in our cohorts (41% of females began daily smoking before age 17 compared with 49% of males). However, gender did not moderate the relation between dependence and haplotype status within age of onset groups and so was dropped from subsequent analyses. Based on these results for the entire sample, haplotype associations with the dichotomized FTND variable were tested by age of onset condition within each cohort separately (Table 4). There was a significant H_A vs. H_C contrast in the early onset group for each cohort (P values = 0.02-0.001, ORs = 1.5-3.0), but this contrast was not significant in any cohort in the late onset condition. Thus, the relative risk/protective effect of haplotypes A and C was found only in early onset smokers in three separate cohorts.
Diplotype analyses also suggest that age of onset moderates the relation of these variants with nicotine dependence (Figure 2). Logistic regression analysis in the combined sample showed a significant interaction between the diplotype status (AA vs. BC) and age of onset (P = 0.02, OR = 0.43, 95% CI 0.21 to 0.86). Analyses in the early onset group found significant differences between diplotype AA and all other tested diplotypes (P values ranging from 0.05 to 3.0×10⁻⁴). The contrast between diplotype AA vs. diplotype AC (P = 0.03, OR = 0.53, 95% CI 0.30 to 0.93) and diplotype AA vs. diplotype BC (P = 3.0×10⁻⁴, OR = 0.35, 95% CI 0.20 to 0.62) suggests a recessive effect of Haplotype A, and an incomplete dominance effect of Haplotype C (Figure 2A).
To ensure that the obtained results were not dependent upon the particular age of daily smoking onset or FTND dichotomy that was used, the effects for both phenotypes were investigated in more detail. First, the interaction between haplotype and age of onset as a continuous variable was tested based on the combined UT-WI-LHS cohorts and using FTND46 as the dependent variable. The H_A vs. H_C by daily age interaction effect was significant, p = 0.01, OR = 0.95 (95% CI = 0.91-0.99), but the H_B vs. H_C interaction effect was not significant, p = 0.24, OR = 0.98 (95% CI = 0.94-1.02). Next, subjects were divided into quartiles based on onset age of daily smoking and the haplotype effects within each quartile were tested separately. Table 5 shows that the logistic regression analysis for the H_A vs. H_C effect was significant in the two younger quartiles (p = 0.0003 for 15-16 years and p = 0.01 for <15 years), but not in the two older quartiles (p = 0.43 for 17-18 years and p = 0.63 for >18 years). Figure 3 shows the relative haplotype frequencies across each age group quartile.
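One way to read the continuous-age result: in a logistic model with age of onset entered as a continuous term, an interaction odds ratio is a per-year multiplier, so its effect compounds over several years. The short sketch below illustrates this arithmetic for the reported H_A vs. H_C interaction OR of 0.95. It assumes the interaction is log-linear in age and that age was coded in years; the function name is hypothetical.

```python
def interaction_or_over_years(or_per_year, years):
    """Multiplicative change in a contrast's odds ratio across a span of onset ages,
    assuming a log-linear (per-year) interaction term in a logistic model."""
    return or_per_year ** years


# Reported H_A vs. H_C by age interaction: OR = 0.95 per year of later onset.
four_year_change = interaction_or_over_years(0.95, 4)
```

Under these assumptions, the H_A vs. H_C odds ratio for high dependence would shrink by a factor of about 0.81 for each four-year delay in the onset of daily smoking.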
Prior research suggested that the FTND score of "5" constitutes an intermediate score that is ambiguous with regards to dependence level [42][43][44]. To explore this assumption we conducted a multinomial logistic regression with all subjects in which haplotype status was related to an FTND dependent variable that was split into three levels: "Low" (i.e., scores 0-4, N = 731), "Intermediate" (score 5, N = 540), and "High" (score > 5, N = 1556). This analysis showed significant haplotype effects only when individuals with Low vs. High FTND scores were contrasted with one another. That is, comparisons involving individuals with Intermediate scores yielded no significant main effects or interaction effects. An inspection of haplotype distributions showed that individuals with mid-range levels of dependence (FTND of "5") had haplotype distributions that were intermediate to those of subjects with Low and High scores. Therefore, as would be expected if haplotypes confer dependence vulnerability, when dependence levels became more extreme, so did the relative distributions of haplotypes. However, we also report secondary analyses with individuals with the full range of FTND scores, and with different score cutpoints (i.e., FTND scores 0-4 vs. 5-10 and 0-5 vs. 6-10), in order to demonstrate that the reported pattern of results does not depend upon exclusion of mid-range scores. With both divisions (FTND scores 0-4 vs. 5-10 and 0-5 vs. 6-10), the Haplotype A vs. C interaction with age of onset was significant, as was the Haplotype A vs. C contrast within early onset but not late onset smokers (Table S4). Further, diplotype analyses indicated a significant AA vs. BC interaction with age of onset for both of the additional FTND divisions and a significant AA vs. BC contrast within the early but not the late onset condition. Inspection of the relation between FTND deciles and the frequency of H_A relative to H_C in early onset smokers indicated nonlinearity, supporting the choice to relate genetic variants to dichotomous rather than continuous FTND scores. Finally, AA vs. BC diplotypes differed significantly for mean FTND scores in early (P = 5×10⁻⁴), but not late, onset smokers (Figure 2C and D).
The psychometric performance of the six-item FTND to assess physical dependence on tobacco smoking is well established, particularly items 1 and 4 which relate to heaviness of smoking [36,37]. The significance levels of analyses of the association between haplotypes and individual FTND items are shown in Table 6, and positive scores on most items show a trend towards higher frequency of haplotype H_A and a lower frequency of haplotype H_C (Tables S5 and S6). Nominally significant results were obtained only for FTND item 4 (cigarettes per day) and FTND item 5 (smoking frequently during the first hours after waking). H_A vs. H_C associations with FTND items were tested within age of onset condition using chi-square for dichotomous items. For items 1 and 4, which have 4 ordered response options, Cochran's Test of Linear Trend was used to test the hypothesis that haplotype proportions change linearly across the response options. The only nominally significant results were obtained in the early onset condition for cigarettes per day (Item 4), Cochran's Test of Linear Trend (1, N = 1499) = 16.6, p < 0.00005, and for early morning smoking (Item 5), χ²(1, N = 1142) = 4.01, p < 0.05. In sum, the pattern of findings suggests that multiple items contribute, albeit modestly in some cases, to the measurement of nicotine dependence as it is associated with haplotypes. Since the items of the FTND are not highly correlated with one another, the additive effects of the different items have the potential to yield orthogonal variance and a more comprehensive assessment of nicotine dependence than is available via only a subset of items [2].

Discussion
These results show that CHRNA5-A3-B4 haplotypes are consistently related to severity of nicotine dependence among long-term smokers of European-American descent who began daily smoking at or before age 16 but not among those who began smoking daily after age 16. The robustness of the genotype by age of onset interaction is supported by the fact that there was a significant interaction between the two variables in logistic regression analyses and by the fact that significant associations between genetic variants and dependence were specific to early onset smokers in all three cohorts. Both human and animal research shows that early vs. late smoking or nicotine exposure is associated with more severe nicotine dependence, or greater nicotine self-administration, manifested in adulthood [25][45][46][47] and that adolescence is a period of heightened sensitivity to nicotine reward as well as decreased sensitivity to nicotine's aversive actions [47][48][49][50]. Animal research suggests possible mechanisms for this effect related to persistent changes in brain structure and function. For instance, significantly greater high affinity nicotinic receptor binding is observed in the midbrain and striatum of adolescent versus adult onset nicotine-self-administering rats [51], indicating receptor up-regulation. Also, nicotine exerts greater differential mRNA expression effects on α5, α6 and β2 nAChR transcripts and on genes that influence neuroplasticity (e.g., arc) when it is administered to adolescent, as opposed to adult rats [25,52].
The most noteworthy coding SNP contained in the H_A LD structure is rs16969968, a nonsynonymous α5 coding variant (p.Asp398Asn) resulting in substitution of a negatively charged residue within the M3-M4 intracellular loop, a region thought to be involved with receptor trafficking. The local amino acid context surrounding the p.Asp398Asn variant also displays accelerated protein evolution (higher lineage-specific Ka/Ks) in primate lineages within a class of genes related to nervous system development [53]. In mouse brain, α5 is a widely distributed minor subunit within heteromeric brain nAChRs, and incorporation of an α5 subunit into brain nAChRs leads to changes in receptor-level function [54]. The association of H_A and the p.Asp398Asn variant with a nicotine dependence phenotype in humans suggests further research, such as an engineered mouse model, to explore the functional role and developmental expression of these variants within the process leading to chronic nicotine dependence. The SNPs rs16969968 and rs1051730, a synonymous α3 variant, are in nearly complete linkage disequilibrium and, along with the H_A LD structure, occur at significantly higher frequencies in European-American populations based on HapMap allele frequencies.
Table 6 notes: a. p-values for items 1 and 4 are from GLM tests in which the 4 levels of the item responses were taken to be an ordinal variable. For items 2, 3, 5 and 6, p-values are from logistic regression analyses. b. "Inter" = the interaction between haplotype and age of onset, "All" = all subjects, "Early" = early onset smokers only, and "Late" = late onset smokers only. For GLM analyses, the p-value was obtained in a test across H_A, H_B, and H_C. For logistic regression analyses, the p-value was for the H_A vs. H_C effect. doi:10.1371/journal.pgen.1000125.t006
In-depth haplotype analysis also revealed a protective effect of haplotype H_C. Multiple non-coding variants are contained within the H_C LD structure; therefore, potential functional variants may indirectly affect receptor function through developmental and/or cell-specific expression of α5, α3 or β4 levels. The risk versus protective effects of H_A and H_C reinforce an emerging theme in complex genetics that common and rare alleles can display a range of protective, neutral and susceptibility effects [55]. The magnitude of effects reported here (haplotype frequency shifts ~10%, haplotype-specific odds ratios = 1.5) falls within the range of other common variant effects influencing complex diseases, such as diabetes [56], coronary artery disease [56][57], psoriasis [58] and inflammatory bowel disease [59].
In our European-American population, we did not find evidence of significant associations (P < 0.05 in early or late onset smokers) among the four CHRNA4 SNPs in common with our study (rs2273504, rs2236196, rs1044396, rs1044397; Table S3) and previous candidate gene association studies of nicotine dependence in Chinese men [5], and females of European-American and African-American descent [6]. A follow-up candidate gene study [9] to a genome-wide association design [8], with a definition of no dependence (FTND = 0) in controls and an FTND > 3 for dependence in cases of European descent, identified significant associations within CHRNB3 and the CHRNA5-A3-B4 cluster. Although that study [9] and the current study employed tagging SNPs for similar LD bins, the current study did not find any association between nicotine dependence and CHRNB3 SNPs. However, similar LD bins within the CHRNA5-A3-B4 cluster showed a significant risk effect in both studies, including the nonsynonymous α5 rs16969968 SNP, even with the substantial difference in FTND criteria. Using similar criteria for low and high dependence as our study, a previous study [7] also reported a suggestive association (primary P = 0.08) of the CHRNA5-A3-B4 cluster in a small sample population of 242 Israeli women. A recent follow-up candidate gene analysis using genome-wide association data from three European populations suggested a risk effect of the CHRNA5-A3-B4 locus for cigarettes per day regularly smoked [32].
Subsequent to submission of this article, a GWA study using smoking quantity (cigarettes per day, CPD) as a measure of nicotine dependence observed an association of rs1051730 with CPD at genome-wide significance levels in a large Icelandic population.That study used FTND item 4 as their measure of cigarettes per day.Table S7 shows the comparison of our study and the Icelandic study using cigarettes per day as the one, in common, phenotype and rs1051730 as the one, in common, genotype.The trend in the frequency of the rs1051730 T allele is generally similar, except for the 1-10 cigarettes per day group in the Icelandic study.Clearly, their population of smokers differs from the UT-WI-LHS smokers, which is understandable since our study preferentially enrolled long-term heavy smokers who had lung disease (UT-LHS) or sought cessation treatment (WI).
Two other GWAS reports using lung cancer case-control designs reported that rs1051730 exceeded genome-wide significance levels for association with lung cancer but, having failed to detect significant associations with smoking behavior, concluded that the effect on lung cancer risk was independent of smoking behavior. Our analysis demonstrates that strong associations between CHRNA5-A3-B4 variants and nicotine dependence are seen only among smokers who began daily smoking relatively early in life. Coupled with the detailed molecular definition of the CHRNA5-A3-B4 haplotype structures generated from resequencing, this finding supports the hypothesis that the disease outcome effects of rs1051730, and other surrogate markers for Haplotype A, are mediated by nicotine addiction. In our opinion, the biological link between nicotinic receptor variants and smoking behavior is more plausible than a direct effect of these ion channels on lung cancer susceptibility. The association of rs1051730 with lung cancer may therefore be due to disease mortality related to long-term, persistent smoking caused by severe nicotine addiction [60,61].
In summary, we have demonstrated that major CHRNA5-A3-B4 haplotypes identify countervailing susceptibility (H_A) and protective (H_C) determinants for long-term nicotine dependence. A substantial shift in haplotype frequency (A vs. C = 10%) and diplotype frequency (AA vs. BC = 17%, AA vs. CC = 27%) is observed when age of exposure to nicotine is used to define an at-risk subpopulation. Identifying this interaction of a common genetic risk factor with age of daily smoking onset among the complexity of factors that influence nicotine addiction indicates how genetics can augment public health approaches to the problem of smoking-related illness, because the risk is amenable to intervention. Identification of genetically high-risk individuals who would benefit from proactive interventions, such as adolescent education and cessation clinics, may result in a population with a lower rate of adult nicotine addiction.

Subjects
The UT cohort was made up of respondents to community advertising for persons who had smoked more than 100 cigarettes lifetime plus a subset of the originally recruited Utah LHS cohort who had follow-up phenotyping and biosample collection in Salt Lake City, UT; these Utah participants were excluded from the LHS cohort. Participants were not drawn from a psychiatric treatment population, and no medical or behavioral treatments were offered as part of the study. UT volunteers were not excluded simply because they had a lifetime diagnosis of psychosis or Bipolar Disorder, but they were excluded if their current mental status made it impossible for them to complete the questionnaires or interviews.
WI participants were recruited by media advertisements and had to be current smokers who were motivated to quit smoking, smoked more than 9 CPD, and produced a breath sample with carbon monoxide (CO) > 9 ppm at baseline. Exclusion criteria included evidence of psychosis history based on the Prime-MD structured psychiatric interview [62].
Forward and reverse PCR primer sequences were chosen from the publicly available genomic sequence, and PCR amplification was carried out in 25 µl reaction volumes using standard techniques (primer sequences are available from the authors upon request). Amplification primers and unincorporated nucleotides were removed using ExoSAP-IT (USB, Cleveland, Ohio). For sequencing, internal primers were used, and cycle sequencing was carried out using Applied Biosystems BigDye terminator chemistry. Cycling was done with an initial denaturation at 96 °C for 30 sec, followed by 45 cycles of 96 °C for 10 sec, 50 °C for 5 sec, and 60 °C for 4 min. Upon completion of sequencing, 20 µl of 62.5% EtOH/1 M KOAc, pH 4.5 was added to each reaction and the sequence plates were centrifuged at 4000 rpm at 4 °C for 45 min. The samples were resuspended in 15 µl of formamide and electrophoresed on an ABI 3700 capillary instrument. Sequence trace files were evaluated using the Phred, Phrap and Consed programs, and potential variants were identified by using the PolyPhred program [66,67]. Single nucleotide polymorphisms (SNPs) were verified by manual evaluation of the individual forward and reverse sequence traces. In addition, all sequence assemblies were manually scanned for insertions, deletions and polymorphic positions undetected by PolyPhred. Tag SNPs were selected using the ldSelect algorithm [68] with an r² cutoff value of 0.64 on all SNPs with a minor allele frequency > 5%.
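To make the r² cutoff concrete, the sketch below computes the pairwise r² statistic between two biallelic SNPs from phased haplotypes and compares it with the 0.64 threshold. It illustrates only the pairwise statistic, not the ldSelect binning procedure itself; the function names, the 0/1 allele coding, the use of "at or above" for the threshold, and the example haplotypes are all assumptions made for illustration.

```python
R2_CUTOFF = 0.64  # cutoff stated in the text


def r_squared(haplotypes):
    """Pairwise LD r^2 between two biallelic SNPs.

    haplotypes: list of (allele_at_snp1, allele_at_snp2) pairs, alleles coded 0/1.
    """
    n = len(haplotypes)
    p1 = sum(h[0] for h in haplotypes) / n           # frequency of allele 1 at SNP 1
    p2 = sum(h[1] for h in haplotypes) / n           # frequency of allele 1 at SNP 2
    p12 = sum(1 for h in haplotypes if h[0] == 1 and h[1] == 1) / n
    d = p12 - p1 * p2                                # disequilibrium coefficient D
    denom = p1 * (1 - p1) * p2 * (1 - p2)
    if denom == 0:
        raise ValueError("both SNPs must be polymorphic in the sample")
    return d * d / denom


def passes_cutoff(haplotypes, cutoff=R2_CUTOFF):
    """True if the two SNPs are correlated at or above the cutoff."""
    return r_squared(haplotypes) >= cutoff


# Hypothetical phased haplotypes, for illustration only.
example = [(1, 1)] * 4 + [(1, 0)] + [(0, 0)] * 5
example_r2 = r_squared(example)
```

With these hypothetical haplotypes r² is about 0.67, so under this reading of the cutoff one of the two SNPs could serve as a tag for the other.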
Genotyping was carried out using one of two methods: SNPlex (Applied Biosystems), an oligonucleotide ligation/PCR assay for simultaneous genotyping of 48 SNPs in a multiplexed, 384-well plate format, and TaqMan (Applied Biosystems), a single-step homogeneous SNP genotyping method using a 5′-nuclease assay. The SNPlex assay, which pools 48 different SNPs into a single genotyping assay, was applied to DNA samples processed in 384-well plate format through an automated protocol that included SNP-specific oligonucleotide ligation, PCR amplification, immobilization, hybridization, elution, and separation and fluorescent detection on an ABI 3730xl capillary instrument. For all SNPs genotyped by SNPlex, the call rate on the 932 individuals in the UT and WI cohorts was 99.26% after 35 individuals were excluded for low genotyping rates (>10% missing genotypes). TaqMan genotyping was processed in 384-well plate format on the five SNPs in the combined UT-WI-LHS cohorts, and the call rate on 2,006 individuals was 99.42% after 29 individuals were excluded for missing genotypes. For the five SNPs genotyped by both the TaqMan and SNPlex methods in 236 individuals, there were three discordant calls, all in a single individual. For the 87 SNPs genotyped by both SNPlex and resequencing in the subset of 192 individuals, the concordance rate was 99.7%.
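The TaqMan versus SNPlex comparison can be put on the same footing as the reported concordance rate by simple counting. As a minimal sketch, assuming every one of the 236 individuals had a call for all five SNPs on both platforms (the text does not state this), three discordant calls out of 5 × 236 = 1,180 paired calls correspond to a concordance of about 99.75%. The function name `concordance_rate` is hypothetical.

```python
def concordance_rate(n_snps, n_individuals, n_discordant):
    """Fraction of paired genotype calls that agree between two methods.

    Assumes a complete grid of n_snps x n_individuals paired calls,
    which is an assumption of this sketch rather than a reported fact.
    """
    total_pairs = n_snps * n_individuals
    if total_pairs == 0:
        raise ValueError("no paired calls to compare")
    return 1 - n_discordant / total_pairs


# TaqMan versus SNPlex comparison described in the text
taqman_vs_snplex = concordance_rate(n_snps=5, n_individuals=236, n_discordant=3)
print(f"{100 * taqman_vs_snplex:.2f}%")  # 99.75%
```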

Statistical Analysis
SNP genotypes were evaluated for standard summary measures including genotyping rates, allele and genotype frequencies, and Hardy-Weinberg equilibrium tests. The association of SNPs with dichotomized FTND was tested with standard case/control allelic χ² analyses and with the Cochran-Armitage trend test, using the computer program PLINK [69]. Empirical significance levels of allelic tests were evaluated by phenotype-genotype permutation testing using the PLINK adaptive mode. The logistic regression analyses reported in Tables 3-4 and Figure 2 were computed using SYSTAT 10.2 (Richmond, CA: SYSTAT Software Inc.). Dummy coding was used for haplotype and sex, while ordinal coding was used for age of onset of daily smoking. Haplotype estimation and individual assignment were carried out on genotypic data using fastPHASE [70], and independently evaluated using the EM algorithm implemented in SNPHAP (http://www-gene.cimr.cam.ac.uk/clayton/software). Haplotype-based association analyses for omnibus and haplotype-specific tests were carried out using PLINK. To evaluate potential population stratification effects, the UT and WI cohorts were analyzed for population admixture using STRUCTURE [71] on 94 nonrelated loci, assuming two populations. No significant admixture was observed, supporting the European American self-report. Also, cohorts did not differ in CHRNA5-A3-B4 haplotype frequency (χ²(6, N = 5654) = 8.2, P = 0.22, see Table 2), suggesting they were drawn from the same population.
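The two single-SNP tests named above can be sketched in a few lines. This illustrates the standard textbook formulas and is not PLINK's implementation; the function names and the genotype counts are hypothetical, genotypes are assumed to be ordered by the number of copies of the tested allele (0, 1, 2), and "cases" stands for one level of the dichotomized FTND.

```python
import math


def chi2_1df_pvalue(chi2):
    """Upper-tail P value of a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2))


def allelic_test(cases, controls):
    """Allelic chi-square test (no continuity correction) and odds ratio.

    cases, controls: genotype counts for 0, 1 and 2 copies of the tested allele.
    Assumes no empty cell in the resulting 2 x 2 allele table.
    """
    a = cases[1] + 2 * cases[2]        # tested allele, cases
    b = 2 * cases[0] + cases[1]        # other allele, cases
    c = controls[1] + 2 * controls[2]  # tested allele, controls
    d = 2 * controls[0] + controls[1]  # other allele, controls
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    odds_ratio = (a * d) / (b * c)
    return chi2, chi2_1df_pvalue(chi2), odds_ratio


def trend_test(cases, controls, weights=(0, 1, 2)):
    """Cochran-Armitage trend test; the default weights encode an additive model."""
    totals = [r + s for r, s in zip(cases, controls)]
    n_cases, n_controls = sum(cases), sum(controls)
    n = n_cases + n_controls
    sum_wr = sum(w * r for w, r in zip(weights, cases))
    sum_wn = sum(w * m for w, m in zip(weights, totals))
    sum_w2n = sum(w * w * m for w, m in zip(weights, totals))
    t = n * sum_wr - n_cases * sum_wn
    chi2 = n * t ** 2 / (n_cases * n_controls * (n * sum_w2n - sum_wn ** 2))
    return chi2, chi2_1df_pvalue(chi2)


# Hypothetical genotype counts, for illustration only
high_ftnd = (10, 40, 50)
low_ftnd = (40, 40, 20)
print(allelic_test(high_ftnd, low_ftnd))
print(trend_test(high_ftnd, low_ftnd))
```

The allelic test counts chromosomes, whereas the trend test counts individuals and does not rely on Hardy-Weinberg equilibrium, which is one reason to report both.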

Figure 1. Haplotype Structure and Association Results in the Nicotinic α5, α3 and β4 Receptor Subunit Genes on Chromosome 15q24. (A) Genomic region of CHRNA5-A3-B4 transcription units on chr. 15 between 76,644,000 and 76,732,000 base pairs (NCBI Build 36). −log10(P value) plot of SNPs as a function of genomic position, with their P values (allele association values from chi-square tests) observed in the early onset condition. The symbols indicate haplotype assignment (HA, HB, HC and HD) of the individual markers; open symbols indicate P values in the UT-WI cohorts and shaded symbols indicate P values in the combined UT-WI-LHS cohorts. The five SNPs used to assign haplotype status in the UT-WI-LHS cohorts are colored by haplotype affiliation: red (HA), blue (HC) or beige (HB). (B) fastPHASE-inferred haplotype structure of the region from unphased resequencing and genotypic data from 384 chromosomes (resequencing sample cohort). Haplotypes were assigned to groups using the 5 SNPs genotyped in the UT-WI-LHS cohorts (rs680244, rs569207, rs16969968, rs578776, and rs1051730). Haplotype counts for HA, HB, HC, HD and not assigned (na) are shown on the left, and the positions of SNPs genotyped in the UT-WI cohorts are indicated above the haplotypes by blue indicators and in the UT-WI-LHS cohorts by green indicators. doi:10.1371/journal.pgen.1000125.g001

Figure 2. Low Nicotine Dependence (%) and Mean FTND Scores by Diplotype and Age of Onset of Daily Smoking in the Combined UT-WI-LHS Cohorts. (A), (B) Partitioning of diplotypes between FTND<5 and FTND>5 categories in early and late onset groups. The percentage of individuals in the FTND<5 category is shown by the dichotomous 'FTND cut' value (dashed line). (C), (D) Mean FTND score of diplotypes by early and late onset; error bars indicate the S.E.M. Sample size by diplotype within the early onset condition was: AA, 161; AB, 276; AC, 162; BB, 146; BC, 153; and CC, 34. Within the late onset condition, diplotype sample sizes were: AA, 203; AB, 384; AC, 186; BB, 160; BC, 190; and CC, 34. Diplotype counts are indicated by the width of each associated column within the plots. doi:10.1371/journal.pgen.1000125.g002

Figure 3. Low Nicotine Dependence (%) as a Function of Haplotypes A and C versus Age of Onset of Daily Smoking Quartiles in the UT-WI-LHS Cohorts. (A) Percentage of each haplotype in the FTND score <5 category as a function of age of onset quartiles. The percentage of individuals in the FTND<5 category for each quartile is shown by the dichotomous 'FTND cut' value. The age range for each quartile, with percentage of total subjects in parentheses, is as follows: <15 years (19%), 15-16 years (27%), 17-18 years (26%), and >18 years (29%). doi:10.1371/journal.pgen.1000125.g003

WI exclusion criteria also included significant alcohol abuse based on the Michigan Alcoholism Screening Test (MAST) [63] and clinically significant depression symptoms based on the CES-D [64]. DNA extraction for the Wisconsin subjects was performed by the Wisconsin State Laboratory of Hygiene. Study procedures were approved by the institutional review boards at the University of Wisconsin and the University of Utah. All LHS participants had

3647274::3648079, 3648702::3649607 nts.; CHRNB2 [Chr.1] 151353228::151353745, 151354953::151356224, 151356569::151357823, 151361153::151362332 nts.

Table 1. Association of CHRNA5-A3-B4 SNPs with Dichotomized FTND Scores in the UT and WI Cohorts.

Table 3. Logistic Regression Analyses in which Low and High Nicotine Dependence (FTND = 0-4 vs. FTND = 6-10) Was the Dependent Variable and Haplotype C Was the Reference Condition.

Table 4. Logistic Regression Analyses by Cohort in which Low and High Nicotine Dependence (FTND = 0-4 vs. FTND = 6-10) Was the Dependent Variable and Haplotype C Was the Reference Condition.

Table 5. Logistic Regression Analysis for the HA vs. HC Effect by Age of Daily Smoking Onset Quartile.

Table 6. Associations between Haplotypes and Individual FTND Items.
5. Do you smoke more frequently during the first hours after waking than during the rest of the day? 0.28 0.08 0.05 0.58
6. Do you smoke if you are so ill that you are in bed most of the day?