The objective of this study was to evaluate the performance of risk scores (Framingham, ASSIGN and QRISK2) in predicting high cardiovascular disease (CVD) risk in individuals rather than populations.
Methods and findings
This study included 1.8 million persons without CVD and prior statin prescribing using the Clinical Practice Research Datalink. This contains electronic medical records of the general population registered with a UK general practice. Individual CVD risks were estimated using competing risk regression models. Individual differences in the 10-year CVD risks as predicted by risk scores and competing risk models were estimated; the population was divided into 20 subgroups based on predicted risk. CVD outcomes occurred in 69,870 persons. In the subgroup with lowest risks, risk predictions by QRISK2 were similar to individual risks predicted using our competing risk model (99.9% of people had differences of less than 2%); in the subgroup with highest risks, risk predictions varied greatly (only 13.3% of people had differences of less than 2%). Larger deviations between QRISK2 and our individual predicted risks occurred with calendar year, different ethnicities, diabetes mellitus and number of records for medical events in the electronic health records in the year before the index date. A QRISK2 estimate of low 10-year CVD risk (<15%) was confirmed by Framingham, ASSIGN and our individual predicted risks in 89.8% while an estimate of high 10-year CVD risk (≥20%) was confirmed in only 48.6% of people. The majority of cases occurred in people who had predicted 10-year CVD risk of less than 20%.
Application of existing CVD risk scores may result in considerable misclassification of high risk status. Current practice to use a constant threshold level for intervention for all patients, together with the use of different scoring methods, may inadvertently create an arbitrary classification of high CVD risk.
Citation: van Staa T-P, Gulliford M, Ng ES-W, Goldacre B, Smeeth L (2014) Prediction of Cardiovascular Risk Using Framingham, ASSIGN and QRISK2: How Well Do They Predict Individual Rather than Population Risk? PLoS ONE 9(10): e106455. https://doi.org/10.1371/journal.pone.0106455
Editor: Adrian V. Hernandez, Universidad Peruana de Ciencias Aplicadas (UPC), Peru
Received: December 26, 2013; Accepted: August 5, 2014; Published: October 1, 2014
Copyright: © 2014 van Staa et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Funding: This study was supported by the Wellcome Trust as part of a programme to evaluate the feasibility of conducting pragmatic randomised trials within CPRD; this work was done as part of the assessment of eligibility for a trial comparing simvastatin and atorvastatin. LS is supported by a Senior Clinical Fellowship from the Wellcome Trust. MG was supported by the National Institute for Health Research (NIHR) Biomedical Research Centre at Guy's and St Thomas' NHS Foundation Trust and King's College London. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript. CPRD is owned by the UK Department of Health and operates within the Medicines and Healthcare products Regulatory Agency (MHRA). CPRD has received funding from the MHRA, Wellcome Trust, Medical Research Council, NIHR Health Technology Assessment programme, Innovative Medicine Initiative, UK Department of Health, Technology Strategy Board, Seventh Framework Programme EU, various universities, contract research organisations and pharmaceutical companies. The department of Pharmacoepidemiology & Pharmacotherapy, Utrecht Institute for Pharmaceutical Sciences has received unrestricted funding for pharmacoepidemiological research from GlaxoSmithKline, Novo Nordisk, the private-public funded Top Institute Pharma (www.tipharma.nl, includes co-funding from universities, government, and industry), the Dutch Medicines Evaluation Board, and the Dutch Ministry of Health. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Competing interests: The authors have the following interests: Utrecht Institute for Pharmaceutical Sciences has received unrestricted funding for pharmacoepidemiological research from GlaxoSmithKline and Novo Nordisk. There are no patents, products in development or marketed products to declare. Two authors (TvS and EN) were previously employed by the Clinical Practice Research Datalink (CPRD). CPRD provides data and trial services on a commercial basis for both academic and pharmaceutical industry researchers. CPRD did not have any role in writing the report, or any input into the content of the report. This does not alter the authors' adherence to all the PLoS ONE policies on sharing data and materials.
Cardiovascular disease (CVD) is a major cause of mortality and morbidity worldwide. It causes impaired quality of life and accounts for a large share of health services utilization. Statins are widely used medications in the prevention of CVD. A recent Cochrane review reported that statins reduce the risk of mortality by 16% and CVD outcomes by 26% in people without a history of CVD. Most guidelines recommend that statins should only be used in primary prevention in people with a high absolute CVD risk. As an example, the National Collaborating Centre for Primary Care and Royal College of General Practitioners and the National Institute of Clinical Excellence (NICE) recommended in 2007 to use statins “…as part of the management strategy for the primary prevention of CVD for adults who have a 20% or greater 10-year CVD risk of developing CVD…”.
A large number of risk assessment tools have been developed to support clinicians in determining the long-term risks of CVD. The Framingham, ASSIGN and QRISK2 risk scores are widely used to predict 10-year CVD risk for primary prevention. The Framingham risk score is based on a US cohort recruited several decades ago. The ASSIGN risk score was derived from the Scottish Heart Health Extended Cohort and the QRISK risk score from a large primary care database in England and Wales. These scores were based on risk factors that can easily be measured in the general population. The Framingham, ASSIGN and QRISK2 risk scores have been validated by comparing observed to predicted risks in the overall population. There is no consensus about which risk score to use for CVD risk assessment, and guidelines for primary CVD prevention propose the use of any risk score. These three risk scores are currently being used in the UK to determine CVD risk.
A recent review of CVD risk prediction models recommended that claims of improved performance of new models over established models should be documented in several studies carried out by independent investigators. There is little evidence about how accurately these risk scores predict high CVD risk in individuals. A risk score could perform well in the overall population if it consistently predicts low rather than high risks, as those at high risk are typically only a minority. The objective of this study was to evaluate the validity and reliability of the Framingham, ASSIGN and QRISK2 scores in predicting individual CVD risk.
Material and Methods
This study used data from the Clinical Practice Research Datalink (CPRD) in the United Kingdom, previously known as the General Practice Research Database. CPRD comprises the computerised medical records maintained by general practitioners (GPs). Almost all people in the UK are registered with a general practice. GPs play a key role in the UK health care system, as they are responsible for primary health care and specialist referrals. The GPs are typically informed by hospitals of diagnoses made during outpatient consultations and hospitalisations. The data recorded in the CPRD since 1987 include demographic information, prescription details, clinical events, preventive care provided, specialist referrals, hospital admissions and their major outcomes. A recent review of validation studies found that medical data in the CPRD were generally of high quality. Fifty-five studies of the CPRD recording of diseases of the circulatory system reported a median percentage of cases confirmed of 85.3%. People in CPRD have now been linked individually and anonymously to the national registry of hospital admission (Hospital Episode Statistics [HES]) and death certificates. The linkages are performed using the patient's unique NHS number, date of birth, sex and postcode of residence. HES collect the dates of hospital admission and discharge and main diagnoses, as extracted from the medical records by coding staff in England. The death certificates list the date and causes of death. Linked data were available for 50% of the CPRD population as, at the time of the study, this only included practices in England willing to provide unique patient identifiers to the Trusted Third Party. The protocol of this study was approved by the CPRD Independent Scientific Advisory Committee.
The main study population consisted of people aged 35–74 years, using the November 2011 version of CPRD and drawn from CPRD practices that participated in the linkages. The start of follow-up was one year after start of the patient's CPRD data collection or 1 January 1998, whichever date came last. HES and death certificates data were available from 1998 onwards. The end of follow-up was the patient's end of CPRD data collection or death. The index date, at which the CVD risk assessment was conducted, was a randomly selected date during this period of follow-up. This approach was different from that used in the QRISK2 analysis, which set the index date to 1-1-1998 unless the patient's data collection started later (e.g. due to the patient newly registering). The use of a random index date was preferred in order to investigate changes in data recording (a newly registered patient may have different levels of e.g. missing data). The following persons were excluded: (i) those with CVD prior to the index date or with missing dates, (ii) those prescribed a statin prior to the index date or with missing dates, (iii) those temporarily registered with the practice. Follow-up was censored at the date of a first statin prescription.
The following incident CVD outcomes were included:
- CVD as recorded by the GPs (myocardial infarction, angina, coronary heart disease, stroke and transient ischemic attack).
- Hospitalisation due to CVD as recorded by the hospital in HES (either primary or secondary admission diagnostic ICD-10 codes): angina pectoris (I20); acute myocardial infarction (I21); complications following acute myocardial infarction (I23); other acute ischaemic heart disease (I24); chronic ischaemic heart disease (I25); cerebral infarction (I63); and stroke, not specified as haemorrhage or infarction (I64), as used for QRISK2. Additional codes included intracerebral haemorrhage (I61) and other nontraumatic intracranial haemorrhage (I62).
- Death due to CVD as reported on a death certificate (primary or secondary cause). The ICD-10 codes were similar to those used for hospitalisations.
Death due to causes other than CVD was also measured.
Imputation for missing variables
Missing values for smoking status, systolic blood pressure, ratio of total serum cholesterol and high density lipoprotein (HDL) cholesterol and BMI were imputed (using MI and MIANALYZE imputation procedures in SAS). The imputation regression models included the risk factors as listed in supplementary Table 1, CVD occurrence, death due to causes other than CVD, duration of follow-up and interactions between CVD occurrence and death and duration of follow-up. Five imputation datasets were created and the effect estimates were based on the combination of point and variance estimates from these five datasets. The same imputed values for each patient were used across the different risk scores.
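The combination of point and variance estimates across imputed datasets is commonly done with Rubin's rules. The sketch below is illustrative only: the function name `pool_rubin` and the input numbers are hypothetical, and it is not a reproduction of the SAS MIANALYZE output used in the study.

```python
from statistics import mean, variance

def pool_rubin(estimates, variances):
    """Pool per-imputation estimates and their variances using Rubin's rules."""
    m = len(estimates)
    pooled = mean(estimates)                # pooled point estimate
    within = mean(variances)                # average within-imputation variance
    between = variance(estimates)           # between-imputation variance (m - 1 denominator)
    total = within + (1 + 1 / m) * between  # total variance of the pooled estimate
    return pooled, total

# Hypothetical log relative rates and squared standard errors from five imputed datasets
log_rrs = [0.90, 0.95, 0.88, 0.93, 0.91]
squared_ses = [0.0025, 0.0024, 0.0026, 0.0025, 0.0025]

pooled_estimate, total_variance = pool_rubin(log_rrs, squared_ses)
print(f"pooled estimate = {pooled_estimate:.3f}, standard error = {total_variance ** 0.5:.4f}")
```

The total variance exceeds the average within-imputation variance whenever the five estimates differ, which is how uncertainty about the missing values enters the confidence intervals.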
CVD risk scores
Three risk scores were analysed: Framingham, ASSIGN and QRISK2. We did not analyse the Joint British Society 2 risk score given the similarity to the Framingham risk score. The 10-year CVD risks at the index date as predicted by Framingham and ASSIGN were estimated using the publicly available risk equations. The risks predicted by QRISK2 were calculated using the commercial software program as provided by CLINRISK Limited on a fee-paying licence using the 2012 version [http://qrisk.org/index.php]. The CVD risks as predicted by the risk scores were based on the risk factors measured at the index date. A previous study reported that lifestyle variables as recorded in CPRD (such as obesity and smoking) were important predictors for myocardial infarction.
CVD risks based on a competing risk regression model
We also estimated for each patient the individual long-term CVD risks as modelled by a competing risk Cox proportional hazards regression model. This was done to estimate as accurately as possible the actual CVD risks for each patient in the study population, which could then be compared to the risks as predicted by the risk scores. Competing risk regression was used because the standard Cox regression model has been reported to overestimate the 10-year risk of coronary heart disease. Accounting for the risks of competing events (such as death due to non-CVD causes) may be important in frail and older populations, as CVD occurrence may be precluded by the development of other diseases. Fractional polynomials were used to model non-linear risk relations with the continuous variables. The regression models were fitted separately by gender and three age groups.
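The effect of ignoring competing events can be illustrated with a small nonparametric example. The sketch below is not the Cox-based competing risk model used in the study; it is an Aalen-Johansen style cumulative incidence estimate on hypothetical data, compared with a naive one-minus-Kaplan-Meier estimate that treats non-CVD deaths as censoring. The function name and the event coding are assumptions made for illustration.

```python
def cumulative_incidence(times, events, cause=1):
    """Return (competing-risk cumulative incidence, naive 1 - KM) at the last time.

    events: 0 = censored, 1 = CVD event, 2 = death from non-CVD causes.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    event_free = 1.0  # probability of no event of any cause so far
    naive_km = 1.0    # Kaplan-Meier survival treating other causes as censored
    cif = 0.0
    i = 0
    while i < len(data):
        t = data[i][0]
        tied = [e for (tt, e) in data[i:] if tt == t]  # all records at this time
        d_cause = sum(1 for e in tied if e == cause)
        d_any = sum(1 for e in tied if e != 0)
        cif += event_free * d_cause / at_risk  # only event-free people can have CVD
        event_free *= 1 - d_any / at_risk
        naive_km *= 1 - d_cause / at_risk
        at_risk -= len(tied)
        i += len(tied)
    return cif, 1 - naive_km

# Hypothetical follow-up times (years) and event types for six people
times = [1, 2, 3, 4, 5, 6]
events = [2, 1, 2, 1, 0, 1]
cif, naive = cumulative_incidence(times, events)
print(f"competing-risk estimate = {cif:.3f}, naive estimate = {naive:.3f}")
```

In this toy example the naive estimate is larger because people who died of other causes are treated as if they could still go on to develop CVD.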
The validation of risk scores involves the measurement of calibration and discrimination. Calibration is the comparison of observed and predicted event rates. We assessed calibration by comparing observed (using competing risk life tables) and predicted event rates in subgroups as defined by the vigintiles of predicted risk (vigintiles are the values that divide the distribution of individuals into twenty groups of equal frequency). Discrimination is the extent to which a risk score is able to differentiate between those who develop the outcome and those who do not. Discrimination is typically assessed by estimating the c index. Rather than estimating this c index, which is a global measure and population average, we evaluated the predicted risks at the index date for those people who developed CVD during follow-up. Good discrimination would have occurred if CVD cases mostly developed in those with high predicted risks. External validation is typically recommended for models that need to be generalised to other populations. Our competing risk regression model was not intended to be generalised but only to estimate as accurately as possible the individual risks in our study population. We also assessed reclassification by evaluating the consistency in prediction between the different risk scores.
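A calibration check by vigintile can be sketched as follows. This is a simplified illustration on simulated data: the observed rate here is a plain event proportion, whereas the study used competing risk life tables, and the function name and the simulated risk distribution are assumptions.

```python
import random

def calibration_by_vigintile(predicted, observed, groups=20):
    """Split people into equal-sized groups by predicted risk and compare
    the mean predicted risk with the observed event proportion in each group."""
    order = sorted(range(len(predicted)), key=lambda i: predicted[i])
    size = len(order) // groups
    rows = []
    for g in range(groups):
        end = (g + 1) * size if g < groups - 1 else len(order)
        idx = order[g * size:end]
        mean_predicted = sum(predicted[i] for i in idx) / len(idx)
        observed_rate = sum(observed[i] for i in idx) / len(idx)
        rows.append((g + 1, mean_predicted, observed_rate))
    return rows

# Simulated population: skewed predicted risks, outcomes drawn from those risks
random.seed(0)
predicted = [random.betavariate(1, 12) for _ in range(20000)]
observed = [1 if random.random() < p else 0 for p in predicted]

table = calibration_by_vigintile(predicted, observed)
for group, mean_predicted, observed_rate in table:
    print(f"vigintile {group:2d}: predicted {mean_predicted:.3f}, observed {observed_rate:.3f}")
```

Because the simulated outcomes are drawn from the predicted risks themselves, predicted and observed rates track each other closely in every group; a miscalibrated score would show a systematic gap, typically most visible in the highest vigintiles.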
The main analysis consisted of a comparison of the predictions of CVD risk at the index date with the four risk scores for each individual patient. The intraclass correlation coefficients (ICCs) in individual risk prediction between the four risk scores were estimated. We report the ICCs rather than Pearson correlation coefficients because the former provides a measure of agreement between scores while the latter shows how well one score predicts the other. This distinction is important when a threshold (such as 20%) is recommended for deciding the course of clinical intervention.
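The difference between agreement and correlation can be seen in a small example. The sketch below assumes a two-way, absolute-agreement, single-measure ICC; the text does not state which ICC form was used, so this choice, the function names and the data are all illustrative.

```python
def pearson(x, y):
    """Pearson correlation coefficient of two equal-length lists."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

def icc_absolute_agreement(scores):
    """Two-way, absolute-agreement, single-measure ICC.

    scores: one list per risk score, each holding one value per person.
    """
    k, n = len(scores), len(scores[0])
    grand = sum(sum(s) for s in scores) / (n * k)
    person_means = [sum(s[i] for s in scores) / k for i in range(n)]
    score_means = [sum(s) / n for s in scores]
    ss_persons = k * sum((m - grand) ** 2 for m in person_means)
    ss_scores = n * sum((m - grand) ** 2 for m in score_means)
    ss_total = sum((s[i] - grand) ** 2 for s in scores for i in range(n))
    ms_persons = ss_persons / (n - 1)
    ms_scores = ss_scores / (k - 1)
    ms_error = (ss_total - ss_persons - ss_scores) / ((n - 1) * (k - 1))
    return (ms_persons - ms_error) / (
        ms_persons + (k - 1) * ms_error + k / n * (ms_scores - ms_error)
    )

# Hypothetical 10-year risks (%): score B is always 5 points higher than score A
score_a = [2.0, 5.0, 10.0, 20.0, 30.0]
score_b = [a + 5.0 for a in score_a]

r = pearson(score_a, score_b)
icc = icc_absolute_agreement([score_a, score_b])
print(f"Pearson r = {r:.3f}, ICC = {icc:.3f}")
```

A constant 5-point offset leaves the Pearson correlation at 1 but lowers the ICC, and it would move a person with a risk of 17% on one score across a 20% treatment threshold on the other.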
Two different analyses were conducted in order to evaluate bias with the risk scores. The first analysis concerned the secular trend in CVD incidence. CVD incidence has decreased over several decades. Thus, the risk scores may overestimate CVD risks in current practice. In order to estimate the potential effects of this secular trend, incidence rates were measured in each calendar year. The second bias analysis concerned multiple imputation as used in the QRISK2 estimation. This method assumes that the occurrence of missing data is random conditional on other observed patient characteristics. In UK general practice, risk factors are typically not recorded unless the patient visits the practice. People with certain conditions may also be more likely to be screened for risk factors which incur extra payments (Quality and Outcomes Framework). In order to evaluate the effects of imputation, Cox regression was used to compare the CVD incidence in people with imputed values (for BMI, systolic blood pressure, cholesterol and smoking status) and those with measured values. If the assumption behind multiple imputation is correct, it can be expected that the CVD rate is similar between those with recorded and imputed values (conditional on the other risk factors in the model). SAS version 9.2 was used for the analyses.
The study population included 1.8 million persons with an average follow-up of 3.3 years (Table 1). Ethnicity was not recorded for about half of the men and one-third of the women. About one-quarter of the study population had a follow-up after the random date of at least 5 years. Women were more likely to have information on smoking status, BMI and systolic blood pressure. The extent of missing data decreased sharply over calendar time. In 1998, BMI was missing in 47.3% of people, smoking status in 47.3%, systolic blood pressure in 33.4% and cholesterol/HDL ratio in 97.8%; in 2010, these figures were 32.7%, 14.3%, 22.3% and 72.2%, respectively.
CVD outcomes occurred in 69,870 persons. Major risk factors for CVD included number of cigarettes smoked per day of 21+ (relative rate [RR] = 2.77 [95% CI 2.54–3.03] in women and RR = 2.45 [95% CI 2.32–2.59] in men), unknown ethnicity (RR = 0.46 [95% CI 0.45–0.48] and RR = 0.42 [95% CI 0.42–0.43]) and 50+ records in CPRD in the year before (RR = 5.75 [95% CI 5.33–6.20] and RR = 4.31 [95% CI 4.02–4.62]).
The CVD incidence decreased over calendar time. The age- and sex-adjusted RR of CVD was 0.61 (95% CI 0.59–0.63) in 2010 compared to 1998. This RR was 0.94 (95% CI 0.89–1.00) for hospitalisations due to CVD (as recorded in HES) and 0.52 (95% CI 0.49–0.54) for GP-recorded CVD. Death due to CVD (as recorded on death certificates) also decreased over calendar time (RR of 0.58 [95% CI 0.52–0.64]).
The calibration of the competing risk model showed small differences on average with observed risks across vigintiles of risk. The largest difference between predicted and observed CVD risk occurred in the vigintile with highest risk (predicted 10-year risk of 35.9% compared to an observed risk of 34.9%). The differences between observed and predicted 10-year risks were on average less than 0.2% in 16 vigintiles with lowest risk.
Table 2 shows the distribution of 10-year CVD risks as estimated by competing risk regression. In people aged 50 years or older, 22.9% had a 10-year CVD risk of ≥20% and 51.5% of risk of ≥10%. The risks varied considerably in this age group: the 5th percentile of 10-year CVD risk was 1.4% and 95th percentile 34.6%.
The level of agreement in CVD risk prediction was best between ASSIGN and QRISK2 (intraclass correlation coefficient of 0.93) and lowest between Framingham and estimated risks in CPRD (0.77). The correlation was 0.91 between Framingham and ASSIGN, 0.87 between Framingham and QRISK2, 0.80 between ASSIGN and estimated risks in CPRD and 0.84 between QRISK2 and estimated risks in CPRD. As shown in Table 3, the difference in the predicted 10-year CVD risks between QRISK2 and the risks predicted based on a competing risk model was on average 0.4%, while the predicted 10-year CVD risk with Framingham was on average 2.3% higher and ASSIGN 1.4% higher compared to that predicted by the competing risk model. When analysing the concordance in estimates for individual persons, only 55.6% of persons had a small difference in the risks predicted by QRISK2 and the risks based on the competing risk model.
Table 4 shows the differences between Framingham, ASSIGN and QRISK2 compared to the estimated risks in CPRD stratified by the risk factors. The mean differences between QRISK2 predicted risks and estimated risks in CPRD increased by age. QRISK2 overestimated 10-year CVD risk by 2.2% in people aged ≥65 years compared to the risks estimated in CPRD while QRISK2 predicted and CPRD estimated risks were, on average, similar in younger people. The concordance between QRISK2 predicted risks and the estimated risks in CPRD changed over calendar time; QRISK2 underestimated 10-year CVD risk by 3.2% in 1998–2001 and overestimated risk by 2.2% in 2006–2010 compared to the estimated risks in CPRD. Larger deviations between the risks predicted by QRISK2 and risks estimated in CPRD occurred with different ethnicities, diabetes mellitus, left ventricular hypertrophy and number of records for medical events in the electronic health records in the year before the index date.
The differences in individual risk prediction between the risk scores were largest among people with higher CVD risks (Figure 1). In the lowest vigintile of risk, the risk predictions by QRISK2 were similar to the individual risks estimated in CPRD (absolute difference of less than 2%) for 99.9% of people; in the highest vigintile of risk, this was only 13.3%.
X-axis: Vigintiles of predicted risk. Y-axis: Percentage of persons.
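The per-vigintile agreement summarised above can be computed along these lines. This is a sketch on hypothetical data: it assumes the groups are formed by ranking people on one of the two predictions and that agreement means an absolute difference below 2 percentage points; the function name is illustrative.

```python
def share_within_tolerance(pred_a, pred_b, tolerance=2.0, groups=20):
    """For each group (ranked by pred_b), return the share of people whose
    two predicted risks differ by less than `tolerance` percentage points."""
    order = sorted(range(len(pred_b)), key=lambda i: pred_b[i])
    size = len(order) // groups
    shares = []
    for g in range(groups):
        end = (g + 1) * size if g < groups - 1 else len(order)
        idx = order[g * size:end]
        close = sum(1 for i in idx if abs(pred_a[i] - pred_b[i]) < tolerance)
        shares.append(close / len(idx))
    return shares

# Hypothetical risks (%): score A is 10% higher than score B in relative terms,
# so the absolute gap grows with the level of risk
pred_b = [0.4 * i for i in range(100)]
pred_a = [1.1 * b for b in pred_b]

shares = share_within_tolerance(pred_a, pred_b)
print(f"lowest vigintile: {shares[0]:.0%}, highest vigintile: {shares[-1]:.0%}")
```

Even a modest relative disagreement between two scores produces near-perfect absolute agreement at low risk and poor agreement at high risk, which is the pattern a fixed absolute tolerance will tend to show.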
The risk scores predicted low 10-year CVD risk fairly consistently (Table 5). A QRISK2 estimate of low 10-year CVD risk (<15%) was confirmed by Framingham, ASSIGN and the CPRD estimated risks in 89.8% of people. An estimate of high 10-year CVD risk (≥20%) by QRISK2 was confirmed in 48.6% of people.
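One way to compute such confirmation percentages is sketched below. The definition assumed here, that a classification by the reference score counts as confirmed only if every other score gives the same classification, is an interpretation for illustration; the function name and the data are hypothetical.

```python
def confirmation_rate(reference, others, is_flagged):
    """Among people flagged by the reference score, return the share who are
    also flagged by every one of the other scores."""
    flagged = [i for i, risk in enumerate(reference) if is_flagged(risk)]
    if not flagged:
        return float("nan")
    confirmed = sum(1 for i in flagged if all(is_flagged(o[i]) for o in others))
    return confirmed / len(flagged)

# Hypothetical 10-year CVD risks (%) for six people
qrisk2 = [5.0, 12.0, 22.0, 25.0, 31.0, 8.0]
framingham = [7.0, 13.0, 24.0, 18.0, 35.0, 9.0]
assign = [6.0, 16.0, 21.0, 23.0, 33.0, 10.0]

high = confirmation_rate(qrisk2, [framingham, assign], lambda r: r >= 20)
low = confirmation_rate(qrisk2, [framingham, assign], lambda r: r < 15)
print(f"high risk confirmed: {high:.1%}, low risk confirmed: {low:.1%}")
```

Under this definition, adding more comparison scores can only keep the confirmation rate the same or lower it, which is worth bearing in mind when comparing figures based on different numbers of scores.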
The majority of CVD cases occurred in people who had a predicted 10-year CVD risk of less than 20% (Table 6). Only 41.1% of the cases were predicted by QRISK2 to have a 10-year CVD risk of ≥20% and 27.5% of the cases a 10-year CVD risk of <10%.
We found that all three risk scores (Framingham, ASSIGN and QRISK2) predicted the presence of low CVD risk consistently in individual persons. However, predictions of high CVD risk for individuals varied substantively between the risk scores and treatment strategies could be different depending on which risk score is being used. Most CVD cases occurred in people deemed to be at low CVD risk.
Population averages can hide substantial variability in prediction among individual persons and poor prediction of ‘high risk’ status as these estimates are often determined by the large majority of low risk individuals. As succinctly stated by Rose, the ability to estimate the average risk for a group may not be matched by any corresponding ability to predict which individuals are going to fall ill soon. The present study confirmed Rose's observations for CVD prediction, with a considerable variability between risk scores in the prediction of high CVD risk and with most CVD cases occurring in people classified to have lower CVD risk.
The QRISK2 score was developed in a similar setting as the present study and the statistical methods were also broadly similar. As expected, we found that the averages of QRISK2 estimates and our competing risk predictions were reasonably consistent. Two validation studies of QRISK2 reported that the predicted and observed risks were on average similar and they concluded that QRISK2 was accurate in identifying a high risk population. Our analyses of averages support these studies. But we also conducted analyses of individual risk predictions and reached opposite conclusions. We found substantial deviations between the QRISK2 estimates and our competing risk predictions. This was related to the inclusion into the competing risk models of several risk factors, as pre-defined in the protocol, which were found to be strong predictors of risk (such as calendar year, number of GP visits, region and indicators of missing data including ethnicity). Our approach, although not commonly used, allows for an examination of the performance of risk scores in individual persons rather than testing averages across populations. This regression approach should provide, as long as the model is specified correctly, a close representation of the observed risks across multiple risk factors. The regression models also included risk factors not used by the published risk scores, such as number of GP visits (e.g. there was a five-fold difference in CVD risk between women with frequent and no GP attendance). If the risk scores are to be used for individual risk prediction, the evaluation of performance should go beyond population averages.
A recent meta-analysis reported that statins reduced CVD risk mortality in people with low CVD risk. It concluded that the threshold for statin treatment should be reduced to a 10-year CVD risk of 10%. A commentary on this study proposed that statins should be used by all by the age of 50 years as most people aged 50 years or older have higher risks. It stated, incorrectly, that 83% of the men older than 50 years and 56% of women older than 60 years have a 10-year CVD risk of ≥10%. It is questionable whether whole populations should be treated if individual risks vary greatly, as with CVD. Another question is how to deal with individuals who were not eligible for the trials (e.g. a 50-year old with normal LDL and C-reactive protein). There is no guarantee that the treatment effects as observed in trials can be generalised to populations different from those in the trials.
The strengths of this study were the large size and representativeness of the study population, the well-documented data quality of CPRD and the availability of linked hospital and death certificate data. There are several important limitations. Information on laboratory and physical measurements was missing for a large number of people. The extent of missing data decreased substantially over time. Reasons for this decrease include the availability of electronic communication between practices and laboratories and the incentivisation of practices in measuring and recording of data. We applied imputation techniques but found that people with imputed values had different CVD risks compared to those without missing data. This is not unexpected as healthy people are less likely to visit their practice. Another limitation of this study concerned the use of socioeconomic status in the evaluation of the ASSIGN score. This score used the Scottish IMD and its values cannot be generalised to other regions in the UK. Our approach of standardising English to Scottish IMD may have introduced bias, but this bias is likely to have led to an underestimation of differences as socioeconomic status is, on average, higher in England. The recording of ethnicity in CPRD also has limitations as there was a substantial discrepancy between CPRD and HES in the recording of ethnicity. Another limitation is that the coding by practices of CVD has changed over calendar time, which may explain part of the trend of lower CVD rates over time. However, a secular trend was also observed with hospitalisations recorded in HES and death certificates. The main analyses in this study concerned comparisons of predictions by the different risk scores, which are not affected by changes in CVD recording. Another consideration in this study was the use of a random index date rather than one based on the start of data collection. This approach reduced statistical power. However, our rationale for this was the objective to emulate the performance of risk scores in actual clinical practice, with assessments being done at arbitrary dates rather than at the start of data collection. There have also been major changes in the completeness of data recording over time: an imputation model that used an index date of 1-1-1998 did not converge due to high levels of missing cholesterol levels. The number of people with a follow-up exceeding 10 years was also larger than that in the studies for ASSIGN, Framingham and QRISK1.
In conclusion, the Framingham, ASSIGN and QRISK2 risk scores do not predict the presence of high CVD risk well and consistently. Current practice to use any risk score in conjunction with a constant threshold level has inadvertently created an arbitrary classification of high CVD risk. Risk prediction strategies should be based on statistical models that are transparent, derived from a similar population, with data collected recently and updated regularly.
Conceived and designed the experiments: TvS MG LS. Performed the experiments: TvS EN. Analyzed the data: TvS EN. Contributed reagents/materials/analysis tools: TvS EN. Wrote the paper: TvS MG LS BG.
- 1. Taylor F, Ward K, Moore TH, Burke M, Davey Smith G, et al. (2011) Statins for the primary prevention of cardiovascular disease. Cochrane database Syst Rev: CD004816. Available: http://www.ncbi.nlm.nih.gov/pubmed/21249663. Accessed 2013 December 26.
- 2. Cooper A, Nherera L, Calvert N, O'Flynn N, Turnbull N, et al. (2007) Clinical Guidelines and Evidence Review for Lipid Modification: cardiovascular risk assessment and the primary and secondary prevention of cardiovascular disease. Available: http://www.nice.org.uk/nicemedia/pdf/CG67fullguideline.pdf.
- 3. JBS 2: Joint British Societies' guidelines on prevention of cardiovascular disease in clinical practice (2005) Heart 91 Suppl 5: v1–52 Available: http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=1876394&tool=pmcentrez&rendertype=abstract. Accessed 2013 December 16.
- 4. Matheny M, McPheeters ML, Glasser A, Mercaldo N, Weaver RB, et al. (2011) Systematic review of cardiovascular disease risk assessment tools [Internet]. Rep No 11-05155-EF-1. Available: http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=21796824.
- 5. Anderson KM, Odell PM, Wilson PWF, Kannel WB (1991) Cardiovascular disease risk profiles. Am Heart J 121: 293–298.
- 6. Woodward M, Brindle P, Tunstall-Pedoe H (2007) Adding social deprivation and family history to cardiovascular risk assessment: the ASSIGN score from the Scottish Heart Health Extended Cohort (SHHEC). Heart 93: 172–176 Available: http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=1861393&tool=pmcentrez&rendertype=abstract. Accessed 2013 August 22.
- 7. Hippisley-Cox J, Coupland C, Vinogradova Y, Robson J, May M, et al. (2007) Derivation and validation of QRISK, a new cardiovascular disease risk score for the United Kingdom: prospective open cohort study. BMJ 335: 136 Available: http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=1925200&tool=pmcentrez&rendertype=abstract. Accessed 2013 August 19.
- 8. Hippisley-Cox J, Coupland C, Vinogradova Y, Robson J, Minhas R, et al. (2008) Predicting cardiovascular risk in England and Wales: prospective derivation and validation of QRISK2. BMJ 336: 1475–1482 Available: http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=2440904&tool=pmcentrez&rendertype=abstract. Accessed 2013 September 1.
- 9. Siontis GCM, Tzoulaki I, Ioannidis JPA (2012) Comparisons of established risk prediction models for cardiovascular disease: systematic review. BMJ 344: e3318.
- 10. Davies M, Khunti K, Webb D, Mostafa S, Gholap N, et al. (2012) Updated. The handbook for vascular risk assessment, risk reduction and risk management. Leicester.
- 11. Rose G (2008) Rose's strategy of preventive medicine. New York: Oxford University Press.
- 12. Williams T, van Staa TP, Puri S, Eaton S (2012) Recent advances in the utility and use of the General Practice Research Database as an example of a UK Primary Care Data resource. Ther Adv Drug Saf 3: 89–99.
- 13. Herrett E, Thomas SL, Schoonen WM, Smeeth L, Hall AJ (2010) Validation and validity of diagnoses in the General Practice Research Database: a systematic review. Br J Clin Pharmacol 69: 4–14 Available: http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=2805870&tool=pmcentrez&rendertype=abstract. Accessed 2013 September 2.
- 14. Rubin DB (1987) Multiple imputation for nonresponse in surveys. New York: John Wiley.
- 15. Delaney JAC, Daskalopoulou SS, Brophy JM, Steele RJ, Opatrny L, et al. (2007) Lifestyle variables and the risk of myocardial infarction in the general practice research database. BMC Cardiovasc Disord 7: 38 Available: http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=2241637&tool=pmcentrez&rendertype=abstract. Accessed 2014 June 20.
- 16. Cheng SC, Fine JP, Wei LJ (1998) Prediction of cumulative incidence function under the proportional hazards model. Biometrics 54: 219–228 Available: http://www.ncbi.nlm.nih.gov/pubmed/9544517.
- 17. Wolbers M, Koller MT, Witteman JCM, Steyerberg EW (2009) Prognostic models with competing risks: methods and application to coronary risk prediction. Epidemiology 20: 555–561 Available: http://www.ncbi.nlm.nih.gov/pubmed/19367167. Accessed 2013 August 12.
- 18. Royston P, Sauerbrei W (2008) Multivariable model-building: A pragmatic approach to regression analysis based on fractional polynomials for modelling continuous variables. Chichester: John Wiley & Sons, Ltd.
- 19. Altman DG, Vergouwe Y, Royston P, Moons KGM (2009) Prognosis and prognostic research: validating a prognostic model. BMJ 338: b605 Available: http://www.ncbi.nlm.nih.gov/pubmed/19477892. Accessed 2013 December 26.
- 20. McGraw K, Wong S (1996) Forming inferences about some intraclass correlation coefficients. Psychol Methods 1: 30–46.
- 21. Smolina K, Wright FL, Rayner M, Goldacre MJ (2012) Determinants of the decline in mortality from acute myocardial infarction in England between 2002 and 2010: linked national database study. BMJ 344: d8059 Available: http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=3266430&tool=pmcentrez&rendertype=abstract. Accessed 2013 December 26.
- 22. Collins GS, Altman DG (2012) Predicting the 10 year risk of cardiovascular disease in the United Kingdom: independent and external validation of an updated version of QRISK2. BMJ 344: e4181.
- 23. Collins GS, Altman DG (2010) An independent and external validation of QRISK2 cardiovascular disease risk score: a prospective open cohort study. BMJ 340: c2442 Available: http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=2869403&tool=pmcentrez&rendertype=abstract. Accessed 2013 December 26.
- 24. Mihaylova B, Emberson J, Blackwell L, Keech A, Simes J, et al. (2012) The effects of lowering LDL cholesterol with statin therapy in people at low risk of vascular disease: meta-analysis of individual data from 27 randomised trials. Lancet 380: 581–590 Available: http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=3437972&tool=pmcentrez&rendertype=abstract. Accessed 2013 August 9.
- 25. Ebrahim S, Casas JP (2012) Statins for all by the age of 50 years? Lancet 380: 545–547 Available: http://www.ncbi.nlm.nih.gov/pubmed/22607823. Accessed 2013 September 2.
- 26. Djulbegovic B, Paul A (2011) From efficacy to effectiveness in the face of uncertainty: indication creep and prevention creep. JAMA 305: 2005–2006 Available: http://www.ncbi.nlm.nih.gov/pubmed/21586716. Accessed 2013 December 26.
- 27. Gulliford MC, Charlton J, Ashworth M, Rudd AG, Toschke AM (2009) Selection of medical diagnostic codes for analysis of electronic patient records. Application to stroke in a primary care database. PLoS One 4: e7168 Available: http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=2744876&tool=pmcentrez&rendertype=abstract. Accessed 2013 September 2.