Lipoprotein Metabolism Indicators Improve Cardiovascular Risk Prediction

Background Cardiovascular disease risk increases when lipoprotein metabolism is dysfunctional. We have developed a computational model able to derive indicators of lipoprotein production, lipolysis, and uptake processes from a single lipoprotein profile measurement. This is the first study to investigate whether lipoprotein metabolism indicators can improve cardiovascular risk prediction and therapy management.
Methods and Results We calculated lipoprotein metabolism indicators for 1981 subjects (145 cases, 1836 controls) from the Framingham Heart Study offspring cohort in which NMR lipoprotein profiles were measured. We applied a statistical learning algorithm using a support vector machine to select conventional risk factors and lipoprotein metabolism indicators that contributed to predicting risk for general cardiovascular disease. Risk prediction was quantified by the change in the Area-Under-the-ROC-Curve (ΔAUC) and by risk reclassification (Net Reclassification Improvement (NRI) and Integrated Discrimination Improvement (IDI)). Two VLDL lipoprotein metabolism indicators (VLDL_E and VLDL_H) improved cardiovascular risk prediction. We added these indicators to a multivariate model with the best performing conventional risk markers. Our method significantly improved both CVD prediction and risk reclassification.
Conclusions Two calculated VLDL metabolism indicators significantly improved cardiovascular risk prediction. These indicators may help to reduce prescription of unnecessary cholesterol-lowering medication, reducing costs and possible side-effects. For clinical application, further validation is required.


Introduction
The Framingham Risk score predicts cardiovascular risk based on six variables: age, diabetes, smoking status, treated and untreated systolic blood pressure, total cholesterol, and HDL (High Density Lipoprotein) cholesterol [1]. Newer lipoprotein measurement methods have attempted to improve risk prediction by quantifying lipoprotein subclasses by size [2][3][4][5][6][7] or density [8] range. However, lipoprotein size information has so far been little used in clinical practice, because its relation to cardiovascular risk is unclear. Nevertheless, the size distribution contains implicit information about the lipoprotein metabolism that gives rise to it. This metabolic information may be relevant for the prediction of cardiovascular disease.
We have developed a computational model to analyze measured lipoprotein subclass profiles in terms of the underlying metabolic activity [9][10][11][12]. Briefly, lipoproteins transport lipids, mainly triglycerides and cholesterol, through the bloodstream. The model includes Apolipoprotein B (ApoB)-containing lipoprotein particles ranging from large Very Low Density Lipoprotein (VLDL) through the smaller Intermediate Density Lipoprotein (IDL) and Low Density Lipoprotein (LDL) particles. Lipoprotein particles are produced by the liver, they lose fat to different tissues and become smaller in the lipolysis process, and they are finally taken up by the liver again. The main proteins responsible for the lipolysis process are Hepatic Lipase (HL) in the liver and Lipoprotein Lipase (LPL) in other tissues. The model can calculate ratios of lipoprotein production, lipolysis, and uptake processes from a single lipoprotein profile measurement; we call these ratios 'lipoprotein metabolism indicators'.
ApoB-containing lipoproteins are proatherogenic because an accumulation of particles, especially small dense LDL particles, may lead to plaque formation in veins and arteries. Growing plaques may over time lead to CVD. Small dense LDL particles can form when the liver does not clear LDL particles from the bloodstream effectively. This is a metabolic disorder of the liver that will also affect VLDL, the metabolic precursor of LDL: when overloaded, the liver will also take up less VLDL, and perhaps produce more VLDL to lose excess fat. Therefore, we hypothesized that adding metabolic information in the form of lipoprotein metabolism indicators to conventional risk factors can improve cardiovascular risk prediction. We evaluated this hypothesis for subjects from the Framingham offspring cohort.

Study Sample and Risk Factors
In this study we used measured information from subjects studied in the 4th examination of the Framingham Heart Study Offspring cohort, as recorded in the database of Genotypes and Phenotypes (dbGaP) [13]. Subjects were included when they had no history of cardiovascular disease, gave written informed consent for general research use, had complete NMR lipoprotein profiles recorded, and had a complete record of conventional cardiovascular risk factors. Cardiovascular events were carefully recorded during the follow-up period for all subjects.

Computational Modeling
We applied the Particle Profiler computational model [9,10] to NMR lipoprotein profiles [14]. Profiles were based on the original NMR measurements, to which Liposcience's LP3 algorithm was applied. Slight modifications to the previously published Particle Profiler [12] fitting procedure can be found in Text S1 (Methods). We calculated ratios of all modeled processes (lipoprotein production, total lipoprotein lipolysis, HL lipolysis, LPL lipolysis, liver lipoprotein attachment, liver lipoprotein uptake) in each of three sets of lipoprotein size ranges (VLDL through LDL, VLDL only, IDL through LDL). The calculated ratios of modelled processes are lipoprotein metabolism indicators that serve as candidate diagnostics.

Outcomes
Subjects who experienced a general cardiovascular event, as defined by the Framingham Heart Study [14], within 10 years after the NMR measurements, were designated as 'cases', all others as 'controls'. The Framingham definition includes coronary death, myocardial infarction, coronary insufficiency, angina, ischemic stroke, hemorrhagic stroke, transient ischemic attack, peripheral artery disease, and heart failure.

Statistical Analysis
We used a statistical learning algorithm (a nonlinear L2-norm support vector machine [15,16]) to correlate predictor variables with the Cardiovascular Disease (CVD) outcome. This analysis was carried out in order to identify the most predictive 'lipoprotein metabolic indicator' diagnostics, and evaluate their performance.
We grouped the predictor variables into three datasets: 1. conventional cardiovascular risk parameters, without cholesterol (see Table 1); 2. conventional cholesterol parameters (see Table 1); and 3. lipoprotein metabolism indicators (see Text S1 (Methods)). A detailed explanation of the procedure we used for constructing the multivariate model is provided in Text S1 (Methods). In summary, in order to obtain a model similar to the Framingham Risk Score, we selected the six most predictive variables from dataset 1, the two most predictive markers from dataset 2, and further markers from dataset 3. In the first phase, using dataset 1, we included 'age' and 'gender' in the model. We then added in succession those variables that contributed most to improving predictive performance of the model, measured as the area under the Receiver Operating Characteristic (ROC) curve (or C-statistic) [17]. The area under the ROC curve is the conventional statistic used for comparing the predictive performance of diagnostics. A procedure that successively adds the best predicting variables is frequently referred to as 'forward variable selection' (see e.g. [18]). Having selected the biomarkers from dataset 1, we proceeded in a similar manner with datasets 2 and 3, consecutively adding the most predictive variables to the model. We added markers from dataset 3 that gave a substantial improvement in ROC prediction and that were not correlated with markers already in the model (r² < 0.25); this procedure led to inclusion of two additional markers from dataset 3. For comparison, we also included a dataset with the selected markers from dataset 1, plus total and HDL cholesterol. We used separate training and test sets for marker selection, but evaluated the final result using the complete dataset. All multivariate analyses were performed using Numerical Python.
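The forward variable selection described above can be sketched as follows. This is a minimal illustration, not the procedure from Text S1: the function names (`auc`, `forward_select`, `toy_evaluate`), the toy data, and the stand-in scoring rule (summing the selected columns instead of training a support vector machine) are all assumptions made for the example. The correlation filter (r² < 0.25) and the training/test split are omitted.

```python
from typing import Callable, List, Sequence, Tuple


def auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Area under the ROC curve via the rank-sum identity:
    the fraction of (case, control) pairs in which the case scores higher,
    counting ties as one half."""
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for c in cases:
        for k in controls:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5
    return wins / (len(cases) * len(controls))


def forward_select(
    candidates: Sequence[str],
    evaluate: Callable[[List[str]], float],
    start: Sequence[str] = (),
    min_gain: float = 0.0,
) -> Tuple[List[str], float]:
    """Greedily add the candidate that raises the AUC the most.
    Stops when no candidate improves the AUC by more than min_gain."""
    selected = list(start)
    best = evaluate(selected) if selected else 0.5
    remaining = [c for c in candidates if c not in selected]
    while remaining:
        scored = [(evaluate(selected + [c]), c) for c in remaining]
        top_score, top_var = max(scored, key=lambda t: t[0])
        if top_score - best <= min_gain:
            break
        selected.append(top_var)
        remaining.remove(top_var)
        best = top_score
    return selected, best


# Hypothetical toy data: 4 controls followed by 4 cases.
labels = [0, 0, 0, 0, 1, 1, 1, 1]
data = {
    "age":   [1, 2, 3, 5, 4, 6, 7, 8],
    "sbp":   [0, 0, 0, 0, 2, 0, 0, 0],
    "noise": [3, 1, 2, 0, 0, 1, 0, 2],
}


def toy_evaluate(subset: List[str]) -> float:
    # Stand-in for "train a model on these variables and score it".
    scores = [sum(data[v][i] for v in subset) for i in range(len(labels))]
    return auc(scores, labels)


selected, best_auc = forward_select(["sbp", "noise"], toy_evaluate, start=["age"])
```

In a real analysis, `evaluate` would fit the classifier on a training set and return the AUC on held-out data, so that the selection does not reward overfitting.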
The results of the multivariate analyses are various predictive models including different diagnostic markers. The CVD risk predictions of these models then need to be compared using suitable statistics. We compared the area under the ROC curve of the various models (ΔAUC) using the method by DeLong and a binomial exact test, calculated in MedCalc, version 11.5.1.0. Also, we used Platt's algorithm to transform the predictions computed by the SVM into class probabilities for computing reclassification statistics [19,20]. Reclassification was quantified using the 'Net Reclassification Improvement' (NRI), with 6% and 20% risk cutoffs for the 'medium' and 'high' risk classes, and the 'Integrated Discrimination Improvement' (IDI, a risk cutoff-independent method), as suggested by Pencina [21]. The ROC analysis is the classical comparison of diagnostic power. The reclassification comparison is more sensitive and gives more clinically relevant information, because it measures how people are redistributed over risk categories using the new diagnostics, and evaluates whether that change was correct.
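As an illustration of the two reclassification statistics, the sketch below computes the categorical NRI with the 6% and 20% cutoffs and the IDI from two lists of predicted probabilities. It follows the standard definitions of these statistics; the function names and the six-subject toy data are hypothetical, and significance testing is not included.

```python
from bisect import bisect_right
from typing import Sequence, Tuple

# Risk classes: low (< 6%), medium (6% to < 20%), high (>= 20%).
CUTOFFS = (0.06, 0.20)


def risk_class(p: float, cutoffs: Sequence[float] = CUTOFFS) -> int:
    """Return 0 (low), 1 (medium) or 2 (high) for a predicted risk p."""
    return bisect_right(cutoffs, p)


def nri(old: Sequence[float], new: Sequence[float],
        events: Sequence[int]) -> Tuple[float, float, float]:
    """Net Reclassification Improvement.
    Returns (total, event component, non-event component)."""
    up_e = down_e = up_n = down_n = 0
    n_e = sum(events)
    n_n = len(events) - n_e
    if n_e == 0 or n_n == 0:
        raise ValueError("need at least one event and one non-event")
    for p_old, p_new, e in zip(old, new, events):
        move = risk_class(p_new) - risk_class(p_old)
        if e:
            if move > 0:
                up_e += 1      # event moved to a higher class: correct
            elif move < 0:
                down_e += 1    # event moved to a lower class: incorrect
        else:
            if move < 0:
                down_n += 1    # non-event moved down: correct
            elif move > 0:
                up_n += 1      # non-event moved up: incorrect
    nri_events = (up_e - down_e) / n_e
    nri_nonevents = (down_n - up_n) / n_n
    return nri_events + nri_nonevents, nri_events, nri_nonevents


def idi(old: Sequence[float], new: Sequence[float],
        events: Sequence[int]) -> float:
    """Integrated Discrimination Improvement: change in the difference
    between mean predicted risk of events and of non-events."""
    def mean(xs):
        return sum(xs) / len(xs)
    old_e = [p for p, e in zip(old, events) if e]
    old_n = [p for p, e in zip(old, events) if not e]
    new_e = [p for p, e in zip(new, events) if e]
    new_n = [p for p, e in zip(new, events) if not e]
    return (mean(new_e) - mean(new_n)) - (mean(old_e) - mean(old_n))


# Hypothetical predicted 10-year risks for two events and four non-events.
events = [1, 1, 0, 0, 0, 0]
old_risk = [0.10, 0.25, 0.10, 0.10, 0.03, 0.25]
new_risk = [0.30, 0.25, 0.04, 0.10, 0.03, 0.10]

total_nri, nri_e, nri_n = nri(old_risk, new_risk, events)
idi_value = idi(old_risk, new_risk, events)
```

Returning the event and non-event components separately makes it possible to report them individually, as is done for the intermediate-risk group in the Results.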

Baseline Characteristics
Of the 2142 selected subjects, 145 cases and 1836 controls were found to have a complete record of all relevant parameters and thus were included in the analysis. Baseline characteristics of the subjects are shown in Table 2. The mean age was 49 ± 9 years, and 52% of subjects were female.

Multivariate Models
The variables included in the final multivariate models are shown in Table 3. The first selected lipoprotein metabolism indicator, which we call the 'VLDL Extrahepatic lipolysis indicator' or VLDL_E, is the ratio between the VLDL lipolysis rate related to lipoprotein lipase (LPL) and the influx of particles due to production in the liver and lipolysis of larger particles (J_in,VLDL). The second selected lipoprotein metabolism indicator, which we call the 'VLDL Hepatic turnover indicator' or VLDL_H, is the average of two ratios: that between the rate constant of hepatic VLDL lipolysis (k_hl^VLDL) and the VLDL particle production flux (J_prod,VLDL), and that between the rate of VLDL attachment to the liver (k_a,liver^VLDL) and the VLDL particle production flux. Further explanation of the mathematical notation of these indicators can be found in Text S1 (Methods).
Tables 4 and 5 show the results of a Receiver-Operating-Characteristic (ROC) analysis for general cardiovascular disease. Table 4 displays the area under the curve, its improvement over a predictor drawn at random, and a percentage incremental improvement of the last statistic. Results of the statistical analyses comparing the curves are shown in Table 5. Our method significantly improved CVD prediction over accepted risk markers, as measured by the change in the Area-Under-the-ROC-Curve (ΔAUC). The improvement of our model versus a model with classical Framingham risk markers, including total cholesterol and HDLc, was ΔAUC = 0.0177 with p = 0.0055. The improvement of our model versus a model including LDLc and HDLc was ΔAUC = 0.0150 with p = 0.0067. In comparison, the model including LDLc and HDLc did not significantly improve risk prediction over the model including total cholesterol and HDLc, with ΔAUC = 0.00268 and p = 0.6003. As expected, adding total and HDL cholesterol to other classical Framingham risk factors did significantly improve risk prediction, with ΔAUC = 0.0354 and p = 0.0003.
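The verbal definitions of the two selected indicators given in this section can be written compactly as below. This is a sketch based only on the wording in the text: the symbol used here for the LPL-related VLDL lipolysis rate (k_lpl^VLDL) is an assumed notation that does not appear in the main text, and any further transformation of the ratios described in Text S1 (Methods) is not reproduced.

```latex
\[
\mathrm{VLDL}_E \;=\; \frac{k^{\mathrm{VLDL}}_{lpl}}{J_{in,\mathrm{VLDL}}},
\qquad
\mathrm{VLDL}_H \;=\; \frac{1}{2}\left(
  \frac{k^{\mathrm{VLDL}}_{hl}}{J_{prod,\mathrm{VLDL}}}
  \;+\;
  \frac{k^{\mathrm{VLDL}}_{a,liver}}{J_{prod,\mathrm{VLDL}}}
\right)
\]
```

Both indicators are normalised by a particle flux, so they describe processing capacity per incoming or produced VLDL particle rather than absolute rates.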
The statistical test thus showed that adding lipoprotein metabolism indicators to a model that includes existing cardiovascular risk factors significantly improved the area under the ROC curve for this population, with respect to conventional risk markers. Table 6 shows the results of the reclassification analysis. Risk reclassification, using low, middle, and high risk classes, and also using the category-independent method, was significantly improved when including LDLc, HDLc, and VLDL metabolism indicators. The improvement of the model including VLDL metabolism indicators versus the model including classical Framingham risk markers was quantified as NRI = 0.090, with p = 0.014; for the category-independent method IDI = 0.051, with p < 0.0001. The improvement of the model including VLDL metabolism indicators versus the model including LDLc and HDLc was quantified as NRI = 0.0828, with p = 0.013; for the category-independent method IDI = 0.040, with p = 0.0004. In comparison, the difference between the model including total cholesterol and HDLc and that including LDLc and HDLc was nonsignificant, with NRI = 0.008 and IDI = 0.011. Adding total and HDL cholesterol to non-cholesterol Framingham risk factors did give a significant improvement, with NRI = 0.111 and p = 0.009; for the category-independent method IDI = 0.040, with p < 0.0001. Lipoprotein metabolism indicators therefore add reclassification power to the NMR lipoprotein profile.

Reclassification analysis
In addition, we calculated NRI reclassification statistics for subjects classified as at 'intermediate risk' when using Framingham risk markers (Table 7). These subjects would be eligible for drug treatment in primary prevention. The analysis shows how many subjects not experiencing events would not need treatment, and how many experiencing events would be put on more intensive treatment when using the new diagnostics. The NRI was 0.15 (p = 0.0481) when comparing these conventional markers to LDLc and HDLc, and 0.37 (p < 0.0001) when comparing them to the model including lipoprotein metabolism indicators. When looking at reclassified events separately (n = 48), the two methods improved classification by 6% and 13% respectively, but both improvements were non-significant. Importantly, there was a 9% reclassification improvement of non-events (n = 422) when including LDLc and HDLc, and a 25% reclassification improvement of non-events using lipoprotein metabolism indicators, both with p < 0.0001. The study therefore shows that 25% of the subjects whom conventional Framingham risk factors would needlessly include in the 'intermediate risk' category were reclassified to 'low risk' using lipoprotein metabolism indicators.

Discussion
This is the first study in which 'lipoprotein metabolism indicators' have been used for cardiovascular disease risk prediction. These diagnostics are ratios of lipoprotein production, lipolysis, and uptake processes derived from a single lipoprotein profile measurement using computational modelling. We demonstrate that incorporation of two lipoprotein metabolism indicators significantly improves CVD risk prediction as measured by the area under the ROC curve. Reclassification is also significantly improved over conventional risk markers. The most important predictor, the 'VLDL Extrahepatic lipolysis indicator' or VLDL_E, is a ratio between the VLDL lipolysis rate related to lipoprotein lipase (LPL) and the influx of particles due to production in the liver and lipolysis of larger particles. As LPL mainly acts extrahepatically, this ratio gives information about the capacity of extrahepatic tissue to absorb triglycerides from VLDL particles in the fasting state. The second indicator, which we call the 'VLDL Hepatic turnover indicator' or VLDL_H, is the average of two ratios: that between hepatic VLDL lipolysis and VLDL production, and that between VLDL attachment to the liver and VLDL production. This combined ratio relates to the capacity of the liver to process VLDL particles, both through lipolysis and particle attachment to the liver. Inspection of the risk model (see Text S1, Results) shows that LDLc remains the most important lipoprotein-related predictor of CVD events. HDLc is an important risk modifier, especially when no blood pressure medication is used. In subjects using blood pressure medication, VLDL_E becomes important: the lower this indicator, the more slowly incoming VLDL particles are lipolysed extrahepatically and the higher the risk. VLDL_H is most important for determining the border between low and medium risk, especially for men and for subjects not using blood pressure medication: the lower VLDL_H, the less hepatic VLDL turnover per produced particle and the higher the risk. These interpretations show that the new risk prediction can be understood in relation to lipoprotein pathophysiology and genetic variation (in LPL and other genes pertinent to VLDL processes).
Table 6. Reclassification analysis: Net Reclassification Improvement (NRI) and Integrated Discrimination Improvement (IDI) when comparing the cross-validated multivariate models using all subjects in the dataset.
Examining the reclassification of subjects who were classified as at 'intermediate risk' by Framingham risk factors is of special clinical significance. The intermediate risk group consists of those individuals who should be treated according to international guidelines [21]. Subjects who are reclassified move to either the high risk (more intensive treatment) or the low risk (no treatment) group. Our results show that a net 25% of the subjects in this group who did not develop cardiovascular disease within 10 years are moved to the low risk group. The reclassification of people with events to the high risk group was not significant, probably due to the low number of cases in this group (n = 48). In other words, there is a group of people who are classified as at 'intermediate risk' using the Framingham risk factors, but of whom we know with hindsight that they did not suffer a cardiovascular event. When performing a diagnosis using VLDL_E and VLDL_H, 25% of this subject group is reclassified to the low risk category, and these subjects would therefore not needlessly have to take the medication that the guidelines prescribe for the intermediate risk category.
Extrapolating these results directly to clinical practice is not straightforward, most importantly because treatment decisions are most often made based on one or two parameters (such as LDLc and HDLc) and not on a complete set of risk markers. However, because our multivariate model for the classical Framingham markers is already an improvement over the two-variable approach used in practice, a 25% improvement using our final risk model will most likely be an underestimate for a comparison with a two-variable approach used in the same population. Future studies will need to determine whether the 25% improvement can be validated in other populations, and whether a population with more CVD cases will also yield significant reclassification improvement for cases in the intermediate risk category. Our methodology can be readily applied to any past studies in which NMR lipoprotein profiles have been measured. Possible subjects of further investigation include determining risk in younger or older persons, differences between ethnic groups, and the benefits for secondary prevention. The Particle Profiler model can also derive lipoprotein metabolism indicators from other methods for measuring lipoprotein profiles [2][3][4][5][6][7]. Further investigations could compare the results of modelling the data from these methods.
The current study has one technical limitation that deserves mention: the NMR spectra were recorded with an older version of the technology than is currently available. This limitation does not affect the method used to derive lipoprotein metabolism indicators. With newer NMR methodology, the accuracy of lipoprotein metabolism indicators is expected to increase in future studies.
In summary, in a sample of 1981 subjects from the Framingham offspring cohort, we found two lipoprotein metabolism indicators that together significantly improved general cardiovascular risk prediction, as quantified by the area under the ROC curve and by reclassification statistics. These indicators may help to reduce the number of people who unnecessarily take cholesterol-lowering medication, reducing costs and possible side-effects. Clinical application will require further validation of these findings.

Supporting Information
Text S1 Additional information on methods and results. (DOC)