Risk Models to Predict Chronic Kidney Disease and Its Progression: A Systematic Review

A systematic review of risk prediction models conducted by Justin Echouffo-Tcheugui and Andre Kengne examines the evidence base for prediction of chronic kidney disease risk and its progression, and suitability of such models for clinical use.


Introduction
Chronic kidney disease (CKD) is increasingly common in the US and worldwide [1,2]. Related complications, including end-stage renal disease (ESRD) and cardiovascular disease (CVD), have major public health and economic implications [1][2][3]. Screening for CKD has been somewhat controversial in the absence of direct evidence from a randomized clinical trial [4]. However, early identification of individuals with CKD, especially targeting populations with a high risk for CKD and related adverse outcomes [5], followed by the implementation of evidence-based interventions can slow or prevent the progression to advanced stages of the disease, reduce the risk of CVD and other complications of decreased glomerular filtration rate (GFR), and improve survival and quality of life [6]. Nevertheless, large proportions of individuals with CKD remain undiagnosed and, as a consequence, are not benefiting from those interventions. For instance, in the US, awareness of CKD in the general population remains very low [1]. During the 1999-2004 period, the proportion of US adults with stage 3 CKD who reported being aware of their status was only 11.6% in men and 5.5% in women. Even among men with stage 3 CKD and elevated albuminuria, awareness of weak or failing kidneys was only 22.8%. Among those with stage 4 CKD, the corresponding percentage was 42% for both men and women [1]. In clinical settings, awareness levels are also low. Data from the US National Kidney Foundation's Kidney Early Evaluation Program, for the 2000-2009 period, indicate that only 9% of patients with CKD are aware of their diagnosis [7].
Strategies for early identification and treatment of people with CKD are therefore needed worldwide. The use of complex and potentially expensive detection strategies may prevent those at risk from deriving the benefits of preventative interventions, especially in settings where renal replacement therapy is not readily available. Several risk factors that are independently associated with the occurrence of CKD and easily assessable in routine clinical settings have been incorporated in model equations for predicting the occurrence of CKD or progression in people already diagnosed with CKD. These models have utility even in the context of automatic reporting of the estimated GFR (eGFR). Indeed, recent data indicate that referral to a nephrologist by primary care physicians as the result of making eGFR available mostly occurs for certain subgroups in the population (women and elderly), and a high proportion of referrals are inappropriate [8].
The use of risk models is very attractive and likely cost-effective for large-scale CKD risk stratification, and would allow the identification of all the segments of the population that would benefit the most from CKD detection. To this end, it is very important that existing models are not methodologically flawed, and that they provide accurate estimates of the CKD risk in different populations.
To date, there has been no effort, to our knowledge, to provide decision makers and healthcare providers with a balanced account of the performance of existing CKD risk models. We therefore systematically reviewed studies of risk equations to predict CKD or its progression, with the objectives of summarizing evidence on their performance and exploring methodological issues surrounding their development, validation, and application.

Methods
We performed literature searches to identify all risk models developed to predict the presence/occurrence of CKD, or to predict the progression of CKD in those with the disease. We also searched for all studies that applied existing CKD risk models either in the population from which the model was developed or in different populations, and, lastly, we searched for all impact studies and clinical practice guidelines that incorporated existing CKD risk models.

Model Development and Validation Studies
Data sources and search strategy. We searched the PubMed MEDLINE and Embase databases from 1 January 1980 to 20 June 2012, for English- or French-language studies of CKD risk prediction model development and/or validation. We used a combination of search terms related to CKD and prediction. The search strategies are provided in detail in Texts S2 and S3. In addition, we manually searched the reference lists of eligible studies and relevant reviews, and traced studies that had cited them through the ISI Web of Science to find additional published and unpublished data.
Study selection. Two evaluators (J. B. E. and A. P. K.) independently identified articles and sequentially screened them for inclusion (Figure 1). Where necessary, the full text of articles and/or supplemental materials (tables and appendices) was reviewed before deciding on inclusion. Disagreements were resolved by consensus between both authors.
Eligible articles had to report a risk assessment tool (equation and/or score) for predicting CKD or its progression, derived in adult human populations. Reporting of quantitative measures of the performance of tools was preferable, but not necessary for inclusion. The reported metrics of evaluation of predictive ability could be the area under the receiver operating characteristic curve (AUC) or C-statistic, reclassification percentage, net reclassification improvement (NRI), or integrated discrimination improvement index (IDI). These metrics are recognized and used for the assessment of prediction models [9,10]. We excluded studies that reported only measures of association between risk factors and CKD without information on the beta coefficients of variables included in a prediction equation, and simulation studies.
Data extraction and quality assessment. Two reviewers (J. B. E. and A. P. K.) independently conducted the data extraction and quality assessment. We did not use a particular framework for quality assessment, as there is no consensus on a quality assessment framework for risk prediction models. Consequently, we did not develop a formal protocol for the review (Text S1). From each study, we extracted data on study design, setting, population characteristics, the number of patients in the derivation and validation cohorts, the number of participants with the outcome of interest, the number of candidate variables tested as predictors, and the number and list of those variables included in the final model, as well as the type of statistical model used. For the discriminative performance of models, we extracted information on the AUC or C-statistic, which indicates the ability of a risk model to rank-order individuals' risks. To describe model calibration, we extracted data on the difference between the observed and predicted rates of CKD, as well as the p-value of the corresponding test statistic. Measures of calibration assess the ability of a risk prediction model to predict accurately the absolute level of risk that is subsequently observed.
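The two extracted quantities can be illustrated with a minimal sketch, assuming a binary outcome (1 = CKD, 0 = no CKD) and one predicted risk per participant. The function names `c_statistic` and `observed_vs_predicted` are hypothetical and are not taken from any of the reviewed studies; the C-statistic is computed as the proportion of event/non-event pairs in which the participant with the event received the higher predicted risk, with ties counted as one half.

```python
from itertools import product


def c_statistic(risks, outcomes):
    """Concordance (C-statistic / AUC) for a binary outcome.

    Proportion of (event, non-event) pairs in which the event has the
    higher predicted risk; tied predictions contribute 0.5.
    """
    events = [r for r, y in zip(risks, outcomes) if y == 1]
    non_events = [r for r, y in zip(risks, outcomes) if y == 0]
    if not events or not non_events:
        raise ValueError("need at least one event and one non-event")
    concordant = 0.0
    for e, n in product(events, non_events):
        if e > n:
            concordant += 1.0
        elif e == n:
            concordant += 0.5
    return concordant / (len(events) * len(non_events))


def observed_vs_predicted(risks, outcomes):
    """Crude calibration summary: observed event rate and mean predicted risk."""
    observed = sum(outcomes) / len(outcomes)
    predicted = sum(risks) / len(risks)
    return observed, predicted
```

A C-statistic of 0.5 corresponds to ranking no better than chance and 1.0 to perfect rank-ordering; a large gap between the observed rate and the mean predicted risk signals poor calibration even when discrimination is high.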
For the assessment of reclassification, we extracted the NRI and IDI values, and the accompanying 95% CIs and p-values, when available. Reclassification analyses generally indicate the proportion of individuals who are reclassified from one risk stratum (based on estimated risk provided from a first model) to a different risk stratum (based on estimated risk from a different model, or a model that has additional variables compared with the first model). The IDI measures the extent to which the use of a new risk marker correctly revises upward the predicted risk of individuals who experienced the event of interest and correctly revises downward the predicted risk of individuals who did not experience the event.
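The reclassification measures described above can be sketched as follows. This is an illustrative implementation under stated assumptions, not the code used in any reviewed study: `nri` is a hypothetical helper computing a category-based NRI for user-supplied risk cutoffs, and `idi` a hypothetical helper computing the IDI as the change in mean predicted risk among events minus the change among non-events. Standard errors, CIs, and p-values are deliberately omitted.

```python
from bisect import bisect_right


def _mean(values):
    return sum(values) / len(values)


def idi(old_risks, new_risks, outcomes):
    """Integrated discrimination improvement of the new model over the old."""
    rows = list(zip(old_risks, new_risks, outcomes))
    events = [(o, n) for o, n, y in rows if y == 1]
    non_events = [(o, n) for o, n, y in rows if y == 0]
    if not events or not non_events:
        raise ValueError("need at least one event and one non-event")
    gain_events = _mean([n for _, n in events]) - _mean([o for o, _ in events])
    gain_non_events = _mean([n for _, n in non_events]) - _mean([o for o, _ in non_events])
    return gain_events - gain_non_events


def nri(old_risks, new_risks, outcomes, cutoffs):
    """Category-based net reclassification improvement.

    `cutoffs` is a sorted list of risk thresholds defining the strata;
    a risk equal to a cutoff is placed in the higher stratum.
    """
    up_e = down_e = up_n = down_n = 0
    n_events = sum(outcomes)
    n_non_events = len(outcomes) - n_events
    if n_events == 0 or n_non_events == 0:
        raise ValueError("need at least one event and one non-event")
    for old, new, y in zip(old_risks, new_risks, outcomes):
        move = bisect_right(cutoffs, new) - bisect_right(cutoffs, old)
        if y == 1:
            up_e += move > 0
            down_e += move < 0
        else:
            up_n += move > 0
            down_n += move < 0
    return (up_e - down_e) / n_events + (down_n - up_n) / n_non_events
```

Both measures are positive when the new model moves participants with the event toward higher risk and participants without the event toward lower risk; the NRI additionally depends on the chosen cutoffs, which is one reason values are hard to compare across studies.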
Data synthesis. Given the wide range of metrics used for the assessment of the predictive ability of CKD risk models, and the heterogeneity in both the risk factors used for prediction and their number, as well as the study designs, we opted to conduct a narrative synthesis of the evidence instead of a meta-analysis.

Impact Studies and Implementation of Risk Models in Guidelines
Impact studies were captured by (1) scanning those publications identified through the search strategy for model development and validation, and (2) applying the search strategy for impact studies proposed by Reilly and Evans [11], which combines the model's acronym, name of the cohort, or first author with a specific search term (Text S3). We searched relevant clinical practice guidelines to investigate the implementation of CKD prediction models in countries in which such models have been developed. In the absence of validated strategies for these types of searches, we targeted guidelines (when available in English language) compiled by a selection of organizations known to be involved in issues relating to kidney diseases, including the American Society of Nephrology (http://www.asn-online.org), the US National Kidney Foundation [12], the UK National Institute for Health and Clinical Excellence [13], the International Society of Nephrology [14], the European Renal Association-European Dialysis and Transplant Association [15], the Canadian Society of Nephrology [16], Kidney Disease: Improving Global Outcomes [17], The Korean Society of Nephrology (http://www.ksn.or.kr/english/), the Japanese Society for Dialysis Therapy [18], The Japan Association of Chronic Kidney Disease Initiatives (J-CKDI) [19], and the Taiwan Society of Nephrology [20].

Results
Figure 1 describes the study selection process. Of the citations identified through searches, 210 abstracts were selected for in-depth evaluation, and 46 full-text publications were reviewed. After all exclusions, 26 articles, reporting on 30 CKD prediction risk scores and 17 CKD progression risk scores, met the eligibility criteria and were included in the review. Table 1 summarizes data from studies that developed CKD risk prediction models. Five of the 30 CKD risk prediction models were developed using cross-sectional data (thus, prevalent CKD) [21][22][23][24], and the remaining models were based on cohort studies.

CKD Prediction Risk Scores
Populations, outcomes, and risk factors. The majority of the 30 CKD risk models were developed from samples that mostly included white individuals, and only four studies included exclusively Asian participants [23][24][25][26]. The number of participants included in the studies ranged from 534 to 1.6 million, and their ages ranged from 18 to 90 y. The length of follow-up in the cohort studies ranged from 1 to 10 y.
The definition of CKD was fairly consistent across prediction models (eGFR < 60 ml/min/1.73 m²), although nine models focused on predicting diabetic nephropathy [22], and another on CKD prediction among HIV-positive individuals [26]. The included risk models used the Modification of Diet in Renal Disease (MDRD) Study equation to estimate GFR, with the exception of models from the ADVANCE study [27], which used estimates from the Chronic Kidney Disease Epidemiology Collaboration (CKD-EPI) equation. The original MDRD equation is $\mathrm{eGFR}\ (\mathrm{ml/min/1.73\ m^2}) = 175 \times [\text{standardized } S_{cr}\ (\mathrm{mg/dl})]^{-1.154} \times [\text{age (y)}]^{-0.203} \times 1.212\ (\text{if black}) \times 0.742\ (\text{if female})$.
Ten studies provided usable data on the numbers of candidate variables tested for inclusion in the models. This number ranged from one to 24, giving conservative estimates of the ratio of the number of observed events (outcome of interest) to the number of candidate variables ranging from six to 166. The predictors most commonly included in the final prediction models were age, sex, body mass index, diabetes status, systolic blood pressure, serum creatinine, a measure of proteinuria, and serum albumin or total protein (Table S1). Three studies used novel biomarkers or genetic or circulating factors [22,30,31]. Eighteen models were derived using logistic regressions, and three using Cox regressions. All studies reported the original model with beta coefficients, and five studies presented additional point-based scoring systems [21,27,32], or risk calculators [33,34].
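The eGFR-based outcome definition used by most models can be illustrated with a short sketch. The coefficients below are those of the widely published four-variable MDRD Study equation re-expressed for standardized serum creatinine; they are stated here as an assumption and should be checked against the original MDRD reference before any use. The function names `mdrd_egfr` and `below_ckd_threshold` are hypothetical, and the sketch is not intended for clinical decision making.

```python
def mdrd_egfr(scr_mg_dl, age_years, female, black):
    """Estimated GFR (ml/min/1.73 m^2) from the four-variable MDRD Study equation.

    Assumes standardized serum creatinine in mg/dl and age in years.
    """
    if scr_mg_dl <= 0 or age_years <= 0:
        raise ValueError("serum creatinine and age must be positive")
    egfr = 175.0 * scr_mg_dl ** -1.154 * age_years ** -0.203
    if female:
        egfr *= 0.742
    if black:
        egfr *= 1.212
    return egfr


def below_ckd_threshold(egfr, threshold=60.0):
    """True when eGFR falls below the threshold used to define CKD in most models."""
    return egfr < threshold
```

For example, `below_ckd_threshold(mdrd_egfr(1.5, 70, female=True, black=False))` evaluates the outcome definition for a 70-year-old woman with a serum creatinine of 1.5 mg/dl. Because the outcome itself depends on the estimating equation, a model derived against MDRD-defined CKD may need recalibration when CKD is defined with a different equation.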
Performance of risk prediction models. Table 1 shows the performance of the various CKD risk models. All the included studies reported a C-statistic ranging from 0.57 to 0.88, indicating a modest-to-good discriminatory performance. Nine risk scores were internally validated, through split-sample validation in four cases (three of these were also externally validated), and bootstrapping in five other studies. Twelve risk models had an estimate of calibration: Hosmer-Lemeshow test statistics in most cases, which generally indicated good calibration.
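The Hosmer-Lemeshow approach to calibration mentioned above can be sketched as follows, assuming participants are ranked by predicted risk and split into groups of near-equal size (ten by convention). The function name `hosmer_lemeshow` is hypothetical; the sketch returns only the test statistic, which would then be compared with a chi-square distribution (conventionally with the number of groups minus two degrees of freedom) using a statistics package.

```python
def hosmer_lemeshow(risks, outcomes, groups=10):
    """Hosmer-Lemeshow statistic for predicted risks and binary outcomes.

    Participants are sorted by predicted risk and split into `groups`
    near-equal groups; observed and expected event counts are compared
    within each group.
    """
    pairs = sorted(zip(risks, outcomes))
    n = len(pairs)
    if n < groups:
        raise ValueError("need at least one participant per group")
    statistic = 0.0
    for g in range(groups):
        chunk = pairs[g * n // groups:(g + 1) * n // groups]
        size = len(chunk)
        expected = sum(p for p, _ in chunk)
        observed = sum(y for _, y in chunk)
        variance = expected * (1.0 - expected / size)
        if variance > 0:  # skip groups whose predicted risks are all 0 or 1
            statistic += (observed - expected) ** 2 / variance
    return statistic
```

A small statistic (large p-value) is usually read as no evidence of miscalibration, but the test has limited power in small samples and depends on how the groups are formed, so a nonsignificant result does not by itself establish good calibration.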
CKD model improvement. Four studies assessed model improvement subsequent to adding extra variables. One study reported a significant improvement after adding circulating biomarkers (aldosterone and homocysteine) to traditional CKD risk factors [30]; the difference in AUC was 0.012 (p = 0.00233), NRI 6.9% (p = 0.0004), and IDI 0.013 (p = 0.004). The second study reported an AUC difference of 0.001 (p = 0.2) for adding genotypic information (16 single nucleotide polymorphisms) to known risk factors [31]. The third study reported no statistically significant improvement from adding uric acid, postprandial glucose, hemoglobin A1c, and proteinuria ≥ 100 mg/dl to traditional risk factors, with nonsignificant differences in AUC (-0.003), NRI (-0.0889), and IDI (0.0141) [25]. The last study found that a model for predicting major renal events using eGFR and albumin/creatinine ratio (ACR) (AUC: 0.818) was superior to models with either of the predictors alone (AUC: 0.779 for eGFR, and 0.752 for ACR); all three models were inferior to an expanded model with five additional variables (AUC: 0.847) (all p < 0.05 for AUC comparison) [27]. In the same study, the eGFR+ACR (AUC: 0.629) and ACR alone (AUC: 0.627) models had similar performance for predicting new-onset albuminuria; both were superior to the eGFR alone model (AUC: 0.543) (both p < 0.05), while all three were inferior to an extended model (AUC: 0.647) with six extra variables (all p < 0.05 for AUC comparison) [27].
Validation of CKD risk prediction models. Table 2 shows the results of the external validation of CKD risk models. Only eight of the models were externally validated. Of these, only four models were validated more than once: twice for three models [25,34-36] and three times for one model [21,37,38]. The AUC in validation studies (0.57 to 0.88) was generally lower than that in the derivation sample; the change in C-statistic relative to the value reported when the model was first derived ranged from -0.2 to +0.06 (Table 2), being negative or null except in two cases of validation of one score where it was positive [25], thus indicating a generally lower discrimination in validation populations. In the validation populations, the calibration was also poorer, though it was not assessed in most validation studies.

Risk Scores for Predicting Progression of CKD to ESRD
Table 3 shows the models for the prediction of progression to later stages among people with already established CKD. We found 17 CKD progression risk scores, developed from Cox regression models using data from clinical settings, mainly in white populations. Two of the CKD progression risk scores were developed from a cohort of people with type 2 diabetes and nephropathy [39,40], and three other scores used cohorts of people exclusively with IgA nephropathy [41][42][43]. The risk factors included in CKD progression risk models varied. The number of candidate variables tested for inclusion in the models ranged from ten to 24, corresponding to a ratio of number of observed events (outcome of interest) to number of candidate variables of four to 16. For one risk model, the performance in the derivation sample was not reported [39], although the performance of the score was later assessed in a validation study conducted in a different population. When evaluated, the C-statistic of these models ranged from 0.56 to 0.94, and calibration (reported for two models only) was good.
In addition to reporting beta coefficients for regression models, four studies also provided a point-based scoring system [42][43][44] or a risk calculator [45].
As shown in Table 4, five of the CKD progression risk models were externally validated (C-statistic: 0.83 to 0.91); the change in C-statistic from the original value when the model was first developed ranged from -0.1 to +0.03. This change was negative in all but one case, thus indicating a generally poorer discrimination.
Two studies investigated the improvement of three different CKD progression models [33,40] after adding biomarkers to traditional risk factors (serum bicarbonate and phosphate in one case [33], and troponin T plus brain natriuretic peptide in the two other cases [40]). The change in C-statistic or AUC varied from 0.01 to 0.02, and NRI from 16.9% to 26.7%.

Impact Studies and Incorporation of CKD Prediction Models in Clinical Practice Guidelines
We found no evidence in guidelines of recommendations for using CKD risk prediction models to estimate the risk in patients either in clinical or community settings. We also did not find any studies assessing the impact of adopting CKD (occurrence and progression) risk scores in clinical practice on the process of care and outcomes of patients.

Discussion
This systematic review shows that a sizeable number of renal risk prediction models have been developed, with, however, variation in their quality. Reasons for this may be specific to nephrology, where risk prediction is still in its infancy and the methodology for predictive research may be underappreciated. Despite the heterogeneity of CKD, with several specific forms, this review demonstrates the feasibility of defining individual renal risk using a combination of commonly assessed variables. Indeed, there was remarkable similarity between the variables that entered the prediction models (Tables S1 and S2), each developed in a distinct group of participants, sometimes with specific forms of CKD. The discriminative performance of existing models was generally acceptable-to-good on the derivation sample. However, when corrected for overfitting (internal validation) or tested in a new population (external validation), this discriminative performance was modest-to-acceptable. For CKD risk prediction, the SCORED model appears to be the most reliable, as it is the most externally validated model, with a reasonable discrimination [21]. Regarding CKD progression, no risk model has been extensively validated in different populations.

Potential Public Health and Clinical Applications of CKD Risk Models
Risk prediction models have potential applications in the prevention and management of CKD. Risk communication to patients may motivate them for lifestyle modification and adherence to prescribed therapies. Using models for predicting progression of CKD, clinicians may be able to tailor disease-modifying therapies as well as frequency of monitoring to individual risk. Indeed, therapies for controlling several variables included in CKD progression models (e.g., diabetes and hypertension) have been shown to delay CKD progression.
Furthermore, using CKD progression models to identify patients who are most likely to need renal replacement therapy would allow patient education on available therapeutic options. CKD risk scores may be useful in the assessment of novel technologies or biomarkers for risk prediction, or for patient recruitment in prevention trials. They can also serve in mass screening and public education initiatives. For all these applications, estimates of CKD risk from prediction models must be accurate and validated.

Development of Existing CKD Risk Prediction Models
The performance of prediction models is largely determined by the appropriateness of the methodological approaches used to develop them. Virtually none of the existing CKD models was developed using data specifically collected for risk modeling purposes. This may raise concerns about the quality of the predictors and outcomes tested/included in the models, as well as the completeness of measurements. Lessons learned from CVD prediction suggest that the source of data for model development matters less, provided that the ensuing model can reliably predict the outcome of interest in different populations [46]. Indeed, in practice, assembling data only for the purpose of modeling can be challenging, and researchers tend to rely on available data collected for other reasons [9]. At least four of the models were likely statistically underpowered, based on having a ratio of the number of outcomes to the number of candidate predictors of < 8 [24,26,40,41,47]. The performance of such models tends to drop substantially when the model is applied to different populations [24]. Other mistakes that affect model performance were present across studies, including dichotomization of continuous variables prior to modeling, linearity assumptions without formal testing, and exclusion of participants with missing values on predictor/outcome variables.

Internal Validation of Existing CKD Prediction Models
One model was published without indicators of performance during the derivation process [39]. Most models provided measures of performance, which were based on the direct application of the model to the derivation sample (apparent performance). This approach is optimistic (self-fulfilling prophecy). Some models provided performance measures from internal split-sample or bootstrap validation, which may provide the new user with an idea about what to expect when applying the model to different populations. When reported, discrimination was always good for CKD progression models, and acceptable-to-good for prevalent/incident CKD models, indicating that these models were able to differentiate participants with CKD from those without in the derivation sample. Calibration, a key property of model performance, was less commonly assessed during the derivation process. Whether calibration performance of a model in one population can inform its behavior in another population is still debated. However, there is a growing agreement that, because calibration is largely affected by the background risk, which varies across populations, models need to be updated through recalibration procedures to provide accurate estimates of the risk in new populations. There have been attempts to update some of the existing CKD models, but the procedures used (addition of extra variables) have focused on improvement in discriminatory performance [25,27,30,31], and only one study reported change in the calibration properties [27].
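One of the simplest recalibration procedures, often called recalibration-in-the-large, can be sketched as follows. It is offered as an illustration of the general idea and was not used in the reviewed studies: a single shift is added to every prediction on the logit scale so that the mean predicted risk in the new population equals its observed event rate. The function name `recalibrate_in_the_large` is hypothetical, and predicted risks are assumed to lie strictly between 0 and 1.

```python
import math


def _logit(p):
    return math.log(p / (1.0 - p))


def _expit(x):
    return 1.0 / (1.0 + math.exp(-x))


def recalibrate_in_the_large(risks, outcomes, tol=1e-10):
    """Shift all predictions on the logit scale to match the observed event count.

    Returns the intercept shift and the updated risks. The shift is found
    by bisection, since the total predicted risk increases with the shift.
    """
    events = sum(outcomes)
    if not 0 < events < len(outcomes):
        raise ValueError("need at least one event and one non-event")
    linear_predictors = [_logit(p) for p in risks]
    low, high = -20.0, 20.0
    while high - low > tol:
        mid = (low + high) / 2.0
        if sum(_expit(lp + mid) for lp in linear_predictors) < events:
            low = mid
        else:
            high = mid
    shift = (low + high) / 2.0
    return shift, [_expit(lp + shift) for lp in linear_predictors]
```

Because every participant receives the same shift, the rank order of predicted risks is unchanged: the C-statistic stays the same while the average predicted risk is corrected. This is consistent with the point made above that updates aimed at discrimination (adding variables) and updates aimed at calibration address different problems.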

External Validation of Existing CKD Risk Prediction Models
The demonstration of the performance of a model in new populations is an important step before recommending its widespread use. A limited number of existing CKD prediction models have been tested on different populations [21,22,25,32-35,45]. Validation studies have mainly been conducted by the same group of investigators who developed the models. This is methodologically inferior and quantitatively insufficient to provide good indicators of models' behavior in various populations. Hence, more validation studies of existing models are needed, ideally by different investigators, to guarantee their generalizability to a larger number of people. Instead of developing new models for their own setting, investigators in the field of CKD may consider integrating aspects of the validation of existing models into future studies. In addition to providing indicators of the performance of existing models in various settings, such an approach limits unnecessary development of new models.

Implementation of Existing CKD Prediction Models
CKD models have largely been published in the form of mathematical equations, with point-scoring systems [21,32,42-44] or calculators [33,34,45] for a few. The mathematical format may not be suitable for application in various settings, particularly by busy clinicians who may be less familiar with manipulating complex formulas. Translation efforts are therefore needed to convert accurate and validated CKD prediction equations into simple tools that can improve their uptake in various settings [33]. Some context-specific efforts may also be required to derive appropriate cutoffs for defining high-risk status when models are integrated in guidelines for screening. It is, however, important to confirm whether the implementation of CKD risk prediction models affects the behavior of healthcare providers and improves outcomes of care. At present, no implementation study of CKD risk prediction models has been conducted.
Published studies have relied on GFR estimated from the MDRD equation to define CKD [28]. The MDRD equation provides less accurate estimates of GFR in different ethnic groups, compared with estimates derived from the more recent CKD-EPI equation [29], resulting in "over-diagnosis" of CKD using the MDRD equation. There have been suggestions that this over-diagnosis may have little effect on estimates of the association between risk factors and CKD outcomes [24,32] and, accordingly, on discriminatory performance when models developed to predict the outcome of CKD based on the MDRD equation are applied to the outcome of CKD based on the CKD-EPI formula. However, the difference in prevalence/incidence of CKD based on the two formulas will invite recalibration of MDRD equation-based models to improve their applicability with the increasing international adoption of CKD-EPI estimates of GFR for CKD diagnosis.
Participants in the reviewed studies were overwhelmingly white. A homogeneous population does not allow researchers to probe into the whole scope of the variability in CKD risk. This is even more important for CKD than for other diseases, as some ethnic groups are particularly prone to CKD (e.g., African-Americans), and the use of risk stratification tools in these groups may be more warranted. Future studies should therefore incorporate more participants of different ethnic backgrounds.

Strengths and Limitations of the Review
The strengths of this review include the exclusion of studies that reported only effect estimates for independent association of risk factors with CKD. These measures alone provide no information on model calibration and global discriminative performance. The case for predictive testing depends not merely on the magnitude of the risk ratio, but also on the extent to which the test results are useful for improving prediction of disease when various risk factors are accounted for. This systematic review may also help policy makers decide whether to incorporate risk tools in guidelines for screening, routine evaluation, and management of CKD. Such an inclusion may be premature at this point in time, particularly in the absence of extensive external validation studies and impact analyses. We did not explicitly rank or categorize the quality of existing CKD risk models, mindful that there is no agreed-on scientific system for rating risk prediction model quality. Some will argue that minimizing risk for potential bias is of critical importance, while others might support the view that a risk score should be judged on its ability to perform accurately across diverse settings. Finally, our ability to assess publication bias was limited.

Conclusion
This review suggests that risk models for predicting CKD or its progression have a modest-to-acceptable discriminatory performance, but would need to be better calibrated and externally validated, and the impact of their use on outcomes assessed, before they are incorporated in guidelines. Their potential application for screening or management to identify CKD in a heterogeneous population will also depend on the context. In the US, for example, the adoption of the Kidney Disease Outcomes Quality Initiative guidelines has led to systematic reporting of eGFR by laboratories whenever serum creatinine is requested. Consequently, a certain degree of de facto opportunistic CKD screening is happening. In such a context, risk scores for predicting CKD progression or outcomes would be particularly useful for defining prognosis in identified people. However, an important fraction of the population at high risk of CKD without access to care could still be identified in the community using CKD risk prediction tools.

Supporting Information
Table S1 Factors included in models of risk prediction for chronic kidney disease. (DOC)

Editors' Summary
Background. Chronic kidney disease (CKD), the gradual loss of kidney function, is increasingly common worldwide. In the US, for example, about 26 million adults have CKD, and millions more are at risk of developing the condition. Throughout life, small structures called nephrons inside the kidneys filter waste products and excess water from the blood to make urine. If the nephrons stop working because of injury or disease, the rate of blood filtration decreases, and dangerous amounts of waste products such as creatinine build up in the blood. Symptoms of CKD, which rarely occur until the disease is very advanced, include tiredness, swollen feet and ankles, puffiness around the eyes, and frequent urination, especially at night. There is no cure for CKD, but progression of the disease can be slowed by controlling high blood pressure and diabetes, both of which cause CKD, and by adopting a healthy lifestyle. The same interventions also reduce the chances of CKD developing in the first place.
Why Was This Study Done? CKD is associated with an increased risk of end-stage renal disease, which is treated with dialysis or by kidney transplantation (renal replacement therapies), and of cardiovascular disease. These life-threatening complications are potentially preventable through early identification and treatment of CKD, but most people present with advanced disease. Early identification would be particularly useful in developing countries, where renal replacement therapies are not readily available and resources for treating cardiovascular problems are limited. One way to identify people at risk of a disease is to use a "risk model." Risk models are constructed by testing the ability of different combinations of risk factors that are associated with a specific disease to identify those individuals in a "derivation sample" who have the disease. The model is then validated on an independent group of people. In this systematic review (a study that uses predefined criteria to identify all the research on a given topic), the researchers critically assess the ability of existing CKD risk models to predict the occurrence of CKD and its progression, and evaluate their suitability for clinical use.
What Did the Researchers Do and Find? The researchers identified 26 publications reporting on 30 risk models for CKD occurrence and 17 risk models for CKD progression that met their predefined criteria. The risk factors most commonly included in these models were age, sex, body mass index, diabetes status, systolic blood pressure, serum creatinine, protein in the urine, and serum albumin or total protein. Nearly all the models had acceptable-to-good discriminatory performance (a measure of how well a model separates people who have a disease from people who do not have the disease) in the derivation sample. Not all the models had been calibrated (assessed for whether the average predicted risk within a group matched the proportion that actually developed the disease), but in those that had been assessed calibration was good. Only eight CKD occurrence and five CKD progression risk models had been externally validated; discrimination in the validation samples was modest-to-acceptable. Finally, very few studies had assessed whether adding extra variables to CKD risk models (for example, genetic markers) improved prediction, and none had assessed the impact of adopting CKD risk models on the clinical care and outcomes of patients.
What Do These Findings Mean? These findings suggest that the development and clinical application of CKD risk models is still in its infancy. Specifically, these findings indicate that the existing models need to be better calibrated and need to be externally validated in different populations (most of the models were tested only in predominantly white populations) before they are incorporated into guidelines. The impact of their use on clinical outcomes also needs to be assessed before their widespread use is recommended. Such research is worthwhile, however, because of the potential public health and clinical applications of well-designed risk models for CKD. Such models could be used to identify segments of the population that would benefit most from screening for CKD, for example. Moreover, risk communication to patients could motivate them to adopt a healthy lifestyle and to adhere to prescribed medications, and the use of models for predicting CKD progression could help clinicians tailor disease-modifying therapies to individual patient needs.