Development and Validation of a Risk Model for Prediction of Hazardous Alcohol Consumption in General Practice Attendees: The PredictAL Study

Background Little is known about the risk of progression to hazardous alcohol use in people currently drinking at safe limits. We aimed to develop a prediction model (predictAL) for the development of hazardous drinking in safe drinkers. Methods A prospective cohort study of adult general practice attendees in six European countries and Chile followed up over 6 months. We recruited 10,045 attendees between April 2003 and February 2005. 6193 European and 2462 Chilean attendees recorded AUDIT scores below 8 in men and 5 in women at recruitment and were used in modelling risk. 38 risk factors were measured to construct a risk model for the development of hazardous drinking using stepwise logistic regression. The model was corrected for overfitting and tested in an external population. The main outcome was hazardous drinking defined by an AUDIT score ≥8 in men and ≥5 in women. Results 69.0% of attendees were recruited, of whom 89.5% participated again after six months. The risk factors in the final predictAL model were sex, age, country, baseline AUDIT score, panic syndrome and lifetime alcohol problem. The predictAL model's average c-index across all six European countries was 0.839 (95% CI 0.805, 0.873). The Hedges' g effect size for the difference in log odds of predicted probability between safe drinkers in Europe who subsequently developed hazardous alcohol use and those who did not was 1.38 (95% CI 1.25, 1.51). External validation of the algorithm in Chilean safe drinkers resulted in a c-index of 0.781 (95% CI 0.717, 0.846) and Hedges' g of 0.68 (95% CI 0.57, 0.78). Conclusions The predictAL risk model for development of hazardous consumption in safe drinkers compares favourably with risk algorithms for disorders in other medical settings and can be a useful first step in prevention of alcohol misuse.


Introduction
Hazardous drinking, defined as alcohol consumption that places a person at risk of adverse health events, is a leading contributor to the global burden of disease [1]. Prevalence in some populations is as high as 29 per cent [2]. Hazardous drinking has been defined in terms of excessive consumption [21 drinks or more per week for men (or ≥7 drinks per occasion at least 3 times a week), and 14 drinks or more per week for women (or ≥5 drinks per occasion at least 3 times a week)] [2] or in terms of a score of 8 or over on the Alcohol Use Disorders Identification Test (AUDIT) [3]. More recent validation studies of the AUDIT, however, have recommended cut-off points tailored to gender. The suggested optimal cut-off for AUDIT scores is ≥8 in men and ≥5 in women [4].
Although we know a great deal about detection [5][6][7] and approaches to treatment [8,9] of hazardous or dependent drinking, we know much less about risk of progressing to hazardous use in people currently drinking at safe limits. In particular, although many risk factors are well recognised [10][11][12][13], effective prevention is hindered by lack of evidence about their combined effect. Our objectives were to develop a risk model (predictAL) for the future development of hazardous drinking in safe drinkers attending European general practices and test its predictive power in a non-European setting. We took the approach of risk models developed to predict onset of cardiovascular disease [14] and risk of major depression (predictD) [15], both of which provide a percentage risk estimate over a given time period.

Study setting and design
To develop the predictAL model we used data from a prospective cohort of general practice attenders which had been established to develop a risk model (predictD) for the development of major depression [15,16]. The research was approved in the lead centre (UK) by the South East Multi-centre Research Ethics Committee and by key ethical committees in each of the other centres. The study was conducted in six European countries: 1) 25 general practices in the Medical Research Council's General Practice Research Framework, in the United Kingdom; 2) nine large primary care centres in Andalucía, Spain; 3) 74 general practices nationwide in Slovenia; 4) 23 general practices nationwide in Estonia; 5) seven large general practice centres near Utrecht, The Netherlands; and 6) two large primary care centres in the Lisbon area of Portugal. We assessed the external validity of the risk model in patients attending 78 doctors in 10 general practice centres in Concepción and Talcahuano in the Eighth region of Chile. General practices covered urban and rural populations with considerable socio-economic variation.

Study participants
General practice attenders aged 18 to 75 were recruited in Europe between April 2003 and September 2004 and in Chile between October 2003 and February 2005. Exclusion criteria were an inability to understand one of the main languages involved, psychosis, dementia and incapacitating physical illness. Recruitment differed slightly in each country because of local service preferences. In the UK and the Netherlands, researchers spoke to patients directly while they waited to see practice staff. In the remaining European countries doctors introduced the study to patients before they saw the researchers. In Chile attenders were stratified on age and gender according to figures for the populations served by each health centre and participants were selected randomly within each stratum. Participants gave informed consent and undertook a research evaluation within two weeks. All assessments at baseline and both follow-up points were conducted by face-to-face interview at the practices or in respondents' homes.

Measurement of hazardous drinking and associated risk factors
Alcohol use in the preceding six months was assessed using the AUDIT [17], a tool for detection of alcohol use disorders in general practice [7]. It is a widely used and well validated instrument that contains 10 questions about use of, and attitudes to, alcohol consumption over the preceding six months. We defined hazardous drinking on AUDIT scores of 8 or more in men and 5 or more in women [4].
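The sex-specific outcome definition above can be written as a short function. This is a minimal sketch, not the study's code: the function name and the "male"/"female" labels are illustrative, and the 0 to 40 range follows from the AUDIT's ten items each being scored 0 to 4.

```python
# Sex-specific AUDIT cut-offs for hazardous drinking, as defined in the text [4].
HAZARDOUS_CUTOFF = {"male": 8, "female": 5}


def is_hazardous(audit_score: int, sex: str) -> bool:
    """Return True when the AUDIT total meets the sex-specific cut-off."""
    if not 0 <= audit_score <= 40:
        raise ValueError("AUDIT total score must lie between 0 and 40")
    return audit_score >= HAZARDOUS_CUTOFF[sex]


def is_safe_or_abstinent(audit_score: int, sex: str) -> bool:
    """Participants below the cut-off formed the population at risk."""
    return not is_hazardous(audit_score, sex)


print(is_hazardous(7, "male"))    # False
print(is_hazardous(5, "female"))  # True
```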
Few studies have attempted to measure key risk factors for the development of hazardous drinking in abstinent or safe drinkers. In our establishment of the predict cohort, we measured a wide range of risk factors that were known to be associated with the onset of major depression [15]. The fact that many of these medical, psychological and social factors are also known to be associated with alcohol misuse in the literature [10,11,11,18,19], also made it possible for us to model risk of hazardous drinking. Where possible, in the predict study we used standardised measures. Questions taken or adapted from published questionnaires or developed for the study were evaluated for test-retest reliability in 285 general practice attendees recruited equally across the European countries before the main study began [16]. Each instrument or question not available in the relevant languages was translated from English and back-translated by professional translators [16]. The 38 candidate risk factors are listed numerically as RF1-38. Those subjected to test-retest reliability are shown in italics; agreement was high [16].
• A DSM-IV diagnosis of major depression in the preceding six months was made using the Depression Section of the Composite International Diagnostic Interview (CIDI) (RF11) [20,21].
• Stress in paid and unpaid work in the preceding six months using questions from the job content instrument [22]. Participants were categorised as feeling in control in paid work (RF13) or unpaid work (RF14); as experiencing difficulties without support in paid or unpaid work (RF15); and experiencing distress without feeling respect for their paid or unpaid work (RF16).
• Financial strain using a question used in UK government social surveys (RF17) [23].
• Besides the 10 AUDIT questions we asked whether participants had ever had problems with drinking too much alcohol or had ever received treatment for an alcohol problem (RF18).
• AUDIT score at baseline (RF19). Binge drinking at baseline was taken from responses to question three of the AUDIT. Binge drinking was defined as "having six or more drinks on one occasion" at least monthly (RF20).
• Self-rated physical (RF21) and mental health (RF22) were assessed by the Short Form 12 [24]. The weights used to calculate scores are from version 1.
• Whether participants had ever used recreational drugs using adapted sections of the CIDI (RF23).
• We asked whether participants currently smoked cigarettes, cigars or a pipe (RF24). It was not possible to collect smoking data in the Netherlands and Estonia (see statistical analysis below).
• Questions on the quality of sexual (RF25) and emotional relationships (RF26) with partners or spouses [25].
• Presence of serious physical, psychological or substance misuse problems, or any serious disability, in people who were in close relationship to participants (RF27).
• Difficulties in getting on with people and maintaining close relationships (RF28) [26].
• Anxiety (RF34) and panic symptoms (RF35) in the previous six months using relevant sections of the Patient Health Questionnaire (PHQ) [30].
• Major life events in the preceding six months (RF36), using the List of Threatening Life Experiences Questionnaire [31].
• Experiences of discrimination (RF37) in the preceding six months on grounds of sex, age, ethnicity, appearance, disability or sexual orientation using questions from a European study [32].
• Adequacy of social support (RF38) from family and friends [33].

Main outcome
All participants were re-evaluated after six months using the AUDIT.

Statistical analysis
Data imputation. Missing data in all variables were imputed using the method of chained equations, implemented in the Stata command ice [34]. This involves using regression models to determine plausible values for the missing data, starting with variables that had the lowest percentage of missing data and continuing until all variables are imputed. Continuous variables were imputed using multiple linear regression. Dichotomous variables were imputed using logistic regression, and nominal variables such as employment status, education status, control in paid work, discrimination, problems with someone close and satisfaction with emotional relationship with spouse or partner were imputed using multinomial logistic regression. Number of life events was imputed using ordered logistic regression. This process was carried out ten times (cycles), resulting in one imputed dataset. The whole process was then repeated to give ten imputed datasets to allow for variability due to uncertainty about the exact values [35]. The final imputation model consisted of all variables listed above (with the exception of smoking status, the reasons for which are explained later) as well as the outcome, which was included as a continuous score and then dichotomised before analysis. Each imputed dataset was analysed separately and estimates were combined using Rubin's rules [36].
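The final combining step can be illustrated for a single coefficient. The sketch below applies Rubin's rules to one scalar estimate per imputed dataset; it is not the Stata routine used in the study, and the function name and the example numbers are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Combine one coefficient across m imputed datasets using Rubin's rules.

    estimates: the coefficient estimated in each imputed dataset
    variances: the squared standard error of that coefficient in each dataset
    Returns (pooled estimate, pooled standard error).
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need matching lists from at least two imputed datasets")
    pooled = mean(estimates)             # average of the m estimates
    within = mean(variances)             # average within-imputation variance
    between = variance(estimates)        # between-imputation variance (m - 1 denominator)
    total = within + (1 + 1 / m) * between
    return pooled, sqrt(total)


# Hypothetical log-odds coefficients and squared standard errors from three imputations.
estimate, se = pool_rubin([0.52, 0.48, 0.50], [0.010, 0.012, 0.011])
print(round(estimate, 3), round(se, 3))
```

The between-imputation term is what widens the standard error to reflect uncertainty about the missing values.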
Preliminary steps. Before building the multivariable model we undertook two preliminary steps. 1) Data on smoking history were collected as an additional part of the original PREDICT study. The cost incurred for this aspect of the data collection was covered by funds obtained independently by each participating centre but this was not possible in the Netherlands and Estonia. We hence first analysed data from the four European countries that were able to collect a smoking history as we believed this was very likely to be an important predictor variable. However, when current smoking (risk factor 24 above) showed no association with development of hazardous drinking we dropped this risk factor from the analysis. 2) A rule of thumb for estimating sample sizes for developing prognostic models is that there should be at least 10 events for each variable entered in the model [37]. Thus, given the event rate of hazardous drinking, we did not enter all 38 predictor variables into the model. Instead, we first conducted a series of univariable analyses to select out those variables that were not significant at the p<0.1 level. The remaining variables were then entered into the full multivariable model. These were AUDIT score at baseline; age at baseline; SF12 physical health score; SF12 mental health score; sex; professional status; educational status; marital status; employment status; living alone; lifetime alcohol problem; ever used recreational drugs; satisfaction with sex life; satisfaction with emotional relationship with spouse/partner; physical or emotional child abuse; religious/spiritual beliefs; presence of panic syndrome; binge drinking and country of residence of each participant.
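The events-per-variable rule in step 2 is simple arithmetic. The sketch below is illustrative only: it plugs in the 6193 European safe drinkers and the roughly 4% six-month incidence reported in the Results, and the function name is hypothetical.

```python
def max_candidate_predictors(n_participants, event_rate, events_per_variable=10):
    """Largest number of predictors allowed by an events-per-variable rule."""
    expected_events = n_participants * event_rate
    return int(expected_events // events_per_variable)


# About 248 expected events among 6193 safe drinkers at a 4% incidence.
allowed = max_candidate_predictors(6193, 0.04)
print(allowed)        # 24
print(allowed < 38)   # True: fewer than the 38 candidate risk factors
```

Under these assumptions about 24 variables could be supported, which is consistent with screening the 38 candidates down before multivariable modelling.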
Model building. We developed our multivariable predictAL model in the imputed data for safe or abstinent drinkers (AUDIT score <8 in men and <5 in women) by examining the 19 remaining predictor variables at baseline in a stepwise logistic regression with robust standard errors to adjust for general practice clustering. We used a conservative threshold for inclusion of p<0.01 in order to produce a stable model and minimise the degree of over-fitting. We retained age and sex in all regression models because of their well known associations with development of hazardous drinking [18]. We also retained country because of an a priori assumption of clustering within country. Multivariable fractional polynomial analysis was used to assess possible nonlinear effects of continuous predictors. Pairwise interactions between the variables in the model and sex were tested. The resulting predictAL score provides a predicted probability of hazardous alcohol consumption developing over six months.
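A fitted logistic model turns a patient's risk factors into a predicted probability through the inverse logit of the linear predictor. The sketch below shows that step only. The intercept and coefficients are placeholders invented for illustration and are NOT the values in table 3; age, sex and country terms are omitted for brevity.

```python
from math import exp

# Placeholder values for illustration only; these are NOT the table 3 coefficients.
EXAMPLE_INTERCEPT = -5.0
EXAMPLE_COEFFICIENTS = {
    "baseline_audit": 0.5,
    "lifetime_alcohol_problem": 1.0,
    "panic_syndrome": 0.8,
}


def predicted_probability(intercept, coefficients, predictors):
    """Inverse logit of intercept + sum(coefficient * predictor value)."""
    linear_predictor = intercept + sum(
        coefficients[name] * value for name, value in predictors.items()
    )
    return 1.0 / (1.0 + exp(-linear_predictor))


# Hypothetical patient: baseline AUDIT of 4, a lifetime alcohol problem, no panic syndrome.
patient = {"baseline_audit": 4, "lifetime_alcohol_problem": 1, "panic_syndrome": 0}
risk = predicted_probability(EXAMPLE_INTERCEPT, EXAMPLE_COEFFICIENTS, patient)
print(f"{risk:.3f}")  # 0.119 with these placeholder values
```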
Internal validation. We calculated the c-index [38] to estimate the discriminative power of the final predictAL model in each European country and all European countries combined. We adjusted for over-fitting of our model by computing a shrinkage factor based on the initial model including all 19 variables and applied it to the model coefficients [39]. We assessed the goodness of fit of the final predictAL model by grouping individuals into deciles of predicted risk and comparing the observed probability of hazardous drinking within these groups with the average predicted risk. We calculated effect sizes using Hedges' g [40] for the difference in log odds of predicted probability between patients who were later observed to be hazardous drinkers and those who were not. Finally we report the threshold values of risk score, and the associated sensitivity, for a range of specificity that would be practical (minimising false positives) when using the instrument in a clinical setting. We stress that these values are for the fitted European model (not the external population) so we might expect them to be worse in practice.
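For a binary outcome the c-index is the proportion of (event, non-event) pairs in which the person who developed the outcome had the higher predicted risk, with ties counted as half. The following is a minimal sketch of that definition, not the routine used in the study; the function name and the toy data are illustrative.

```python
def c_index(risks, outcomes):
    """Concordance index for predicted risks against binary outcomes (1 = event)."""
    events = [r for r, y in zip(risks, outcomes) if y == 1]
    non_events = [r for r, y in zip(risks, outcomes) if y == 0]
    if not events or not non_events:
        raise ValueError("need at least one event and one non-event")
    concordant = 0.0
    for event_risk in events:
        for non_event_risk in non_events:
            if event_risk > non_event_risk:
                concordant += 1.0    # correctly ranked pair
            elif event_risk == non_event_risk:
                concordant += 0.5    # tied pair counts as half
    return concordant / (len(events) * len(non_events))


# Toy data: three of the four event/non-event pairs are ranked correctly.
print(c_index([0.10, 0.40, 0.35, 0.80], [0, 0, 1, 1]))  # 0.75
```

A value of 0.5 corresponds to chance ranking and 1.0 to perfect discrimination.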
External validation. We used the c-index, Hedges' g and a comparison of predicted versus observed probability of hazardous drinking, to evaluate the performance of the predictAL model in the Chilean data.
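The effect size compares the log odds of predicted probability between those who did and did not become hazardous drinkers. The sketch below assumes the common form of Hedges' g (a pooled-standard-deviation standardised mean difference with the approximate small-sample correction); the exact variant used in the study is not stated here, and the function names and toy probabilities are hypothetical.

```python
from math import log, sqrt
from statistics import mean, variance


def logit(probability):
    """Log odds of a predicted probability strictly between 0 and 1."""
    return log(probability / (1.0 - probability))


def hedges_g(group_a, group_b):
    """Standardised mean difference (a minus b) with small-sample correction."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_variance = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    cohens_d = (mean(group_a) - mean(group_b)) / sqrt(pooled_variance)
    correction = 1.0 - 3.0 / (4.0 * (n_a + n_b) - 9.0)
    return cohens_d * correction


# Toy predicted probabilities for later hazardous drinkers and for the rest.
became_hazardous = [logit(p) for p in (0.10, 0.20, 0.30)]
stayed_safe = [logit(p) for p in (0.02, 0.04, 0.08)]
print(hedges_g(became_hazardous, stayed_safe) > 0)  # True
```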
All analyses and data imputation were performed using Stata release 11 [41].

Results
Response rates and missing data
15,205 people attending their general practitioners were approached, of whom 10,045 people (69%) took part in the seven countries [16]. Response to recruitment was high in Portugal (76%), Spain (87%), Estonia (80%), Slovenia (80%) and Chile (97%) but lower in the UK (44%) and the Netherlands (45%). Ethical considerations prevented the collection of data on nonresponders at baseline. Across all countries the response to the six months follow-up was 89.5%. 6193 European and 2462 Chilean attenders recorded AUDIT scores below 8 in men or below 5 in women at recruitment and thus were involved in the modelling of risk (figure 1, table 1).

Numbers in the modelling
Once current smoking was eliminated as a significant predictor of risk in the four European countries that collected those data (see analysis section), the predictAL algorithm was developed using data for the 6193 attenders in all six European countries that had AUDIT scores below 8 in men and 5 in women at recruitment. Validation was carried out using six-month outcome data on 2462 attenders in Chile who met the same criteria (figure 1, table 1). The amount of missing data in outcome and covariates is summarised in table 2. For all countries there were few outcome data missing at baseline, but this increased to 11% after six months in Europe. Taking the set of covariates as a whole, a large proportion of individuals were missing data in at least one covariate: 56% of participants in the six European countries and 67% in Chile. However, when the set of covariates is restricted to only those used in the final model, these proportions decrease to 1% and 0.1% respectively.

Onset of hazardous drinking
We estimated that the incidence of hazardous drinking over six months in Europe was 4.0% (95% CI: 3.4%, 4.5%) and in Chile was 2.7% (95% CI: 2.0%, 3.3%). The figures given here vary very slightly from table 1 as they are based on imputed data.

Development of the predictAL algorithm in Europe
Three variables (baseline AUDIT score, panic syndrome and lifetime alcohol problem) in addition to sex, age and country were retained at p<0.01 after the backwards elimination procedure (table 3). No interactions between sex and other variables in the model were significant. AUDIT score and lifetime alcohol problem were selected in each of the 10 imputed datasets. The additional variables to appear were panic syndrome in six imputed datasets, marital status in four and having ever used recreational drugs in one. Thus, the model was stable in terms of the variables selected.
The c-index for the predictAL model in all the European countries was 0.839 (95% CI 0.805, 0.873) (table 4). The effect size for the difference in log odds of predicted probability between attenders in Europe who subsequently developed hazardous alcohol use and those who did not was 1.38 (95% CI 1.25, 1.51) (table 5). The model discriminated best in the UK, the Netherlands and Spain and least well in Slovenia and Portugal.
To examine the fit of the predictAL model, we divided the European population into deciles of predicted probability of hazardous drinking. Within each decile we plotted mean risk score at recruitment against observed probability of hazardous drinking at six months (figure 2), using the model coefficients shown in table 3. The plot for Europe shows that onset of hazardous drinking in the highest decile of risk score in Europe was approximately 21% in contrast to the overall incidence of 4%.
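The decile comparison described above can be sketched as follows. This is an illustrative implementation with hypothetical names and toy data, not the study's code: individuals are sorted by predicted risk, split into equal-sized groups, and the mean predicted risk in each group is set beside the observed proportion who developed the outcome.

```python
def calibration_by_group(risks, outcomes, n_groups=10):
    """Return (mean predicted risk, observed proportion) for each risk group."""
    pairs = sorted(zip(risks, outcomes))  # ascending predicted risk
    n = len(pairs)
    table = []
    for g in range(n_groups):
        chunk = pairs[g * n // n_groups:(g + 1) * n // n_groups]
        if not chunk:
            continue  # fewer people than groups
        mean_predicted = sum(r for r, _ in chunk) / len(chunk)
        observed = sum(y for _, y in chunk) / len(chunk)
        table.append((mean_predicted, observed))
    return table


# Toy data split into two groups instead of ten.
for predicted, observed in calibration_by_group(
    [0.1, 0.2, 0.3, 0.4], [0, 0, 1, 1], n_groups=2
):
    print(round(predicted, 2), observed)
```

In a well-calibrated model the two columns track each other across groups.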
Estimates of sensitivity and specificity of the predictAL score in predicting the development of hazardous drinking over 6 months are shown in table 6. Examples of participants screening at increasing levels of predicted probability of hazardous alcohol use on the predictAL algorithm are shown in Box S1.
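Sensitivity and specificity at a chosen risk threshold follow directly from cross-tabulating the flagged status against the observed outcome. The sketch below uses hypothetical names and toy data, and assumes a person is flagged when the predicted risk is at or above the threshold.

```python
def sensitivity_specificity(risks, outcomes, threshold):
    """Sensitivity and specificity when flagging predicted risk >= threshold."""
    true_pos = false_neg = true_neg = false_pos = 0
    for risk, outcome in zip(risks, outcomes):
        flagged = risk >= threshold
        if outcome == 1:
            if flagged:
                true_pos += 1
            else:
                false_neg += 1
        else:
            if flagged:
                false_pos += 1
            else:
                true_neg += 1
    sensitivity = true_pos / (true_pos + false_neg)
    specificity = true_neg / (true_neg + false_pos)
    return sensitivity, specificity


# Toy data: six people, three of whom developed hazardous drinking.
toy_risks = [0.02, 0.05, 0.12, 0.20, 0.08, 0.15]
toy_outcomes = [0, 0, 0, 1, 1, 1]
print(sensitivity_specificity(toy_risks, toy_outcomes, threshold=0.107))
```

Raising the threshold trades sensitivity for specificity, which is the choice tabulated across thresholds in table 6.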

External validation of the predictAL algorithm in Chile
The predictAL model was validated in Chile using data provided by the 2462 attenders who were abstinent or safe drinkers at recruitment. In Chile 2% of such people reported hazardous drinking by the 6 months follow-up point. Predicted risks at six months for Chile were obtained using shrunk coefficients. Because country is included in the model, it was necessary to recalibrate the model in Chile. In Chile the c-index for the predictAL model was 0.781 (95% CI 0.717, 0.846) and Hedges' g was 0.68 (95% CI 0.57, 0.78) (tables 4 and 5).

Sensitivity analysis
The inclusion of country as a variable in the predictAL model accounts for variation between countries in the risk assessment. However, given the relatively lower recruitment rates in the UK and the Netherlands, and their somewhat higher incidence rates of hazardous drinking at 6 months (Table 1), we conducted a sensitivity analysis to see whether exclusion of participants from the UK and the Netherlands changed our prediction model. There were minimal changes in the coefficients for most variables in the model with the exception of country which was no longer significant (data available from the authors on request).

Discussion
PredictAL is a brief risk assessment for the development of hazardous drinking over six months, which was developed in general practice in Europe and validated in attenders in Chile. We emphasise that we were not attempting to provide a superior instrument for detection of current hazardous drinking; rather we have developed an algorithm to estimate future risk of hazardous drinking. It is accurate, with c-indices equal to or above those usually reported for risk prediction in medicine, such as for cardiovascular events [42]. The risk factors involved (sex, age, country, baseline AUDIT score, lifetime alcohol problem and the presence of panic syndrome) are not surprising. Our study was not a search for new risk factors; rather it was an attempt to gauge how they might most parsimoniously be combined to model risk in medical settings. The absence of what might safely be regarded as key risks, such as cigarette smoking, is also not surprising. Modelling risk in this way gives prominence to those risk factors that trump others. When the algorithm is applied in a country other than Chile or the six in Europe, we recommend using either the overall European coefficient (-0.710) or the coefficient for the country that most closely matches the six months incidence of hazardous drinking (if known) in the new setting (table 3). The coefficient for Chile was obtained by a recalibration of the predictAL model in that country.

Strengths and limitations
The main strength of our study is that we have developed the predictAL model in one continent and rigorously validated it in another. The c-index provides a standardised way of comparing the discriminative power of tests that use different measurement units in different settings [43] and shows that predictAL compares very favourably with risk instruments for other health problems. However, our study has a number of limitations. Lower recruitment rates in the UK and the Netherlands possibly occurred because the study was not so obviously introduced by the doctors. Nevertheless, response to follow-up in all countries was high and our sensitivity analysis excluding participants from these countries suggests responders were not a particular or unusual group. One strength of using data from a cohort that was established originally to develop a risk model for major depression [15], is that participants were unaware of the aim behind this risk modelling. Including the baseline AUDIT score as a covariate in the model takes account of the dependence between baseline and six month data. Although it might be argued that six months is a relatively short time over which to estimate risk, we believe that it is a pragmatic choice in general practice where longer term prediction may be less salient to patients and doctors. Using a two step process in which variables not likely to enter the model were first removed, reduced the impact of the low event rate of hazardous drinking on the power of our analysis. Finally, although a 3% incidence of hazardous drinking is low from the statistical point of view, this degree of conversion from normal to hazardous drinking over only six months presents a significant clinical risk. Until now we have had no tools whatsoever to predict normal drinkers who are at risk of future hazardous use and our efforts at prevention are also rudimentary. Hence, we believe our analysis adds valuable information to a field in need of innovation.

Application in clinical practice
Efforts to deal with the public health and social consequences of hazardous drinking must include a focus on prevention. The questions in predictAL are brief and risk scores can readily be calculated using the algorithm (appendix). Panic disorder is often established before the age of 20 and thus is an early predictor of alcohol misuse that is open to intervention [44]. Furthermore our work shows the potential for extending the AUDIT beyond its usual function of detecting current hazardous and dependent drinkers into the realm of predicting risk of hazardous drinking in so-called safe drinkers. Our results expressed by the c-indices and effect sizes demonstrate a clear difference in risk between safe drinkers who became hazardous drinkers six months later and those who did not. Thus when family doctors use the AUDIT to screen for hazardous alcohol use in their patients they might also consider adding in two extra pieces of information. The first is whether their patient has ever had problems drinking too much alcohol or has ever received treatment for an alcohol problem, and the second is a brief review of panic symptoms experienced in the previous six months (derived from the Patient Health Questionnaire [30]). This additional information will enable primary care clinicians to assess the future risk of hazardous drinking in men with AUDIT scores below 8 and in women with scores below 5.
In reporting a range of thresholds for sensitivity and specificity (table 6) we would recommend maximising specificity at the cost of reduced sensitivity to minimise the potential workload for family doctors engaging with false positives. For example, if primary care physicians were to use a European threshold for risk of 10.7% (i.e. specificity of 0.9 and sensitivity of 0.594) they could be sure that the number of patients falsely identified as at risk of hazardous drinking (false positives) would be kept to a minimum. Although this would be at the cost of missing some of those who would go on to develop hazardous drinking over six months, use of a high cut-off ensures that prevention efforts are less likely to be wasted on those not at risk of becoming hazardous drinkers. However, if prevention interventions require little input by way of physician time and effort (e.g. a web-based alcohol self-help prevention package), a lower cut-off of 6.1% might be considered, as the larger number of positives caught in the net could be offered the intervention without substantially increasing costs to the health service.
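The workload implied by a threshold can be estimated from sensitivity, specificity and incidence. The sketch below is an illustration under stated assumptions, not a result from the study: it takes the 10.7% threshold figures above (sensitivity 0.594, specificity 0.9) together with the 4% European six-month incidence, and applies them to a hypothetical list of 1,000 safe drinkers. The function name is illustrative.

```python
def expected_screening_counts(n_patients, incidence, sensitivity, specificity):
    """Expected true and false positives when screening n_patients."""
    will_develop = n_patients * incidence
    will_not_develop = n_patients - will_develop
    true_positives = will_develop * sensitivity
    false_positives = will_not_develop * (1.0 - specificity)
    flagged = true_positives + false_positives
    return {
        "true_positives": true_positives,
        "false_positives": false_positives,
        "missed_cases": will_develop - true_positives,
        "share_of_flagged_who_develop": true_positives / flagged,
    }


counts = expected_screening_counts(1000, 0.04, 0.594, 0.9)
for name, value in counts.items():
    print(name, round(value, 2))
```

Under these assumptions roughly 24 of the 40 future hazardous drinkers would be flagged alongside about 96 false positives, so about one in five flagged patients would go on to develop hazardous drinking.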
We acknowledge that many general practitioners have difficulty dealing with current hazardous use but this difficulty does not detract from efforts to predict hazardous use in advance. In fact, successful prediction may reduce the more challenging work that general practitioners are frequently called on to do with people already drinking unsafely. Patients identified as at risk on screening could be flagged on practice computers to alert practice staff when they attend. Recognition of those at risk may be helpful when it leads to watchful waiting or active support with advice on social and behavioural strategies they might use to reduce their risk. There is controlled trial evidence that providing information on coping with anxiety and the consequences of hazardous drinking may prevent alcohol misuse in young people [45].

Conclusions
This predictAL risk model for development of hazardous consumption in safe drinkers compares favourably with risk algorithms used in other medical settings and may be useful in prevention of alcohol disorders. We also suggest that this is an advance that takes the AUDIT beyond simply the detection of current hazardous use.

Supporting Information
Box S1 Examples of a range of predicted probabilities of hazardous drinking at baseline. AUDIT scores of 8 or more in men and 5 or more in women were defined as hazardous drinking. (DOCX)