Bilirubin levels and kidney function decline: An analysis of clinical trial and real world data

Objective: To evaluate whether previously reported associations between low serum bilirubin concentration and kidney function decline are independent of hemoglobin and other key confounders.

Research design and methods: Clinical trial data from the SAVOR-TIMI 53 trial and UK primary care electronic healthcare records from the Clinical Practice Research Datalink (CPRD) were used to construct three cohorts of patients at risk of chronic kidney disease (CKD). The randomized clinical trial (RCT) cohort, a subset of the SAVOR-TIMI 53 trial, consisted of 10,555 type-2 diabetic patients with increased risk of cardiovascular disease. The two observational cohorts from CPRD consisted of 71,104 newly diagnosed type-2 diabetic patients (CPRD-DM2) and 82,065 newly diagnosed hypertensive patients without diabetes (CPRD-HT). Cohorts were stratified according to baseline circulating total bilirubin levels to determine the association with the primary end point of a 30% reduction from baseline in estimated glomerular filtration rate (eGFR) and the secondary end point of albuminuria.

Results: The confounder-adjusted hazard ratios of the subpopulation with lower than median bilirubin levels compared to the subpopulation with above median bilirubin levels were 1.18 (1.02–1.37), 1.12 (1.05–1.19) and 1.09 (1.01–1.17) for the primary end point, and 1.26 (1.06–1.52), 1.11 (1.01–1.21) and 1.18 (1.01–1.39) for the secondary end point, for SAVOR-TIMI 53, CPRD-DM2 and CPRD-HT, respectively.

Conclusion: Our findings are consistent across all cohorts and endpoints: lower serum bilirubin levels are associated with a greater kidney function decline independent of hemoglobin and other key confounders. This suggests that increased monitoring of kidney health in patients with lower bilirubin levels may be considered, especially for diabetic patients.


Introduction
Emerging treatment options for chronic kidney disease (CKD) require clinicians to decide which treatments to use and when to start them. This is particularly critical given that the stage of disease at diagnosis and the rate of kidney function decline can vary significantly between patients. Current approaches to identifying CKD patients early, especially those most likely to progress more quickly, are limited.
Bilirubin is a degradation product of hemoglobin that is excreted following conjugation by glucuronosyltransferase 1A1 (UGT1A1) into the bile duct and is used as a marker of liver function or hemolytic diseases. Recently, bilirubin has gained attention as an endocrine molecule rather than just a waste product of heme metabolism; besides well-known antioxidative effects, bilirubin has been shown to have antithrombotic, lipid lowering, immunomodulating, and insulin sensitizing properties [1,2]. Observational clinical studies have shown that subjects with mildly elevated circulating bilirubin levels have a reduced risk for cardiovascular disease, type 2 diabetes, and respiratory disease [3][4][5].
Similarly, other observational studies have shown associations between naturally elevated circulating bilirubin levels and a slower CKD progression in patients with and without diabetes [1,. This association has, however, been demonstrated mainly in Asian populations; hence, it is uncertain whether the results can be extended to non-Asian populations. To evaluate a potential association of circulating bilirubin levels with CKD progression and outcomes, it is important to account for known factors that affect bilirubin levels as well as CKD. For example, bilirubin and hemoglobin levels are highly associated; low bilirubin may therefore be a consequence of the anemia commonly seen in CKD. However, prior to this investigation, only Mashitani et al. accounted for hemoglobin when evaluating the association between circulating bilirubin levels and CKD, showing that the associations with CKD outcomes were then not statistically significant. Consequently, the question of the association of bilirubin versus hemoglobin levels with CKD outcomes, and of its applicability to non-Asian populations, remains unanswered.
To evaluate the association of circulating bilirubin levels with development of CKD in subjects with varying risks (established diabetes with additional cardiovascular risk, newly diagnosed diabetes, and newly diagnosed hypertension), we compiled the largest data set to date, consisting of 163 724 racially diverse patients using a mix of real-world data from primary care electronic health records in the United Kingdom (UK) and data from a well-controlled clinical trial (SAVOR-TIMI 53) [29].

[...] data for secondary research with use of a variety of analytical tools such as SAS, R, and other standards through a remote desktop solution. This environment allows AstraZeneca to fulfil legal and ethical obligations (under GDPR and other global privacy requirements) as a good data steward, protecting the data from download and potential misuse. The disclosure is based on consent and appropriate due diligence by the sponsor. Use of the data is restricted to the named users approved for the request and is made available to the requestor for one year from the date access was granted.

CPRD dataset: The data that support the findings of this study are available from CPRD but restrictions apply to the availability of these data, which were used under license for the current study, and so are not publicly available. Access to data from CPRD is subject to a full licence agreement containing detailed terms and conditions of (see https://cprd.com/ for access details).

PLOS ONE

Data source
We conducted a retrospective analysis of a cohort from the study of Saxagliptin Assessment of Vascular Outcomes Recorded in Patients with Diabetes Mellitus (SAVOR)-Thrombolysis in Myocardial Infarction (TIMI) 53 [29] and two cohorts selected from the Clinical Practice Research Datalink (CPRD). The SAVOR-TIMI 53 trial was a multicenter phase 4 clinical trial to evaluate the efficacy of saxagliptin on major adverse cardiovascular events (MACE) in patients with type 2 diabetes. The SAVOR-TIMI 53 trial included patients with established cardiovascular disease or multiple MACE risk factors. A subset of the dataset including both placebo and active arms was approved for this secondary analysis (excluding patients who did not sign the informed consent form for secondary use of data and patients from countries where secondary usage of a dataset is not permitted). The secondary usage of this dataset was approved by AstraZeneca (study code: INT-20190919-235).
The CPRD database contains coded and anonymized electronic health records from primary care practices. CPRD offers a quality-assured source of longitudinal and representative real-time UK population health data for epidemiological and pharmaco-epidemiological research. The CPRD is representative of the general UK population in terms of age, sex and ethnicity [30][31][32]. CPRD includes a UK-wide network of over 1,300 primary care practices and over 35 million patient records, of which 11 million are currently registered active patients, with at least 20 years of follow-up for 25% of the patients (from https://www.cprd.com/ accessed on April 4th, 2019). CPRD requires approval of the study protocol, including the statistical analysis plan, prior to the release of the data; hence all our analyses on CPRD are based on the prespecified, approved analysis plan. Ethical approval and study protocol approval were granted by the CPRD scientific committee and the National Information Governance Board of Ethics and Confidentiality Committee (Approved by Independent Scientific Advisory Committee for MHRA database research ID number 19_144, protocol attached as a S1 File).

Patients
We used the dataset of randomized patients in the SAVOR-TIMI 53 trial with consent for secondary data analyses (n = 15,816). The baseline was defined as the values recorded at the visit immediately prior to randomization. In total, 5,261 (33%) subjects were removed from the analysis dataset; the most common reason was anemia (n = 3,027, 19%), followed by liver abnormality (n = 770, 5%), lack of bilirubin measurement at baseline (n = 340, 2%) or other abnormalities at baseline (see Fig 1 for the complete patient disposition diagram with detailed exclusion criteria). Anemia is defined as hemoglobin less than 120 g/L for women and less than 130 g/L for men. Liver abnormality is defined as AST or ALT outside of the normal range (AST: 8-48 U/L, ALT: 7-55 U/L). In this paper we refer to this subset of the SAVOR-TIMI 53 trial dataset as the "SAVOR cohort".
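The two numeric exclusion rules quoted above can be written down as a short sketch. This is an illustration only, not the study's code: the function names, the string coding of sex, the inclusive treatment of the range boundaries and the order in which the checks are applied are all assumptions; Fig 1 gives the actual patient disposition.

```python
from typing import Optional

# Normal ranges quoted in the text (U/L). Treating the bounds as inclusive
# is an assumption of this sketch.
AST_NORMAL_U_PER_L = (8.0, 48.0)
ALT_NORMAL_U_PER_L = (7.0, 55.0)


def is_anemic(hemoglobin_g_per_l: float, sex: str) -> bool:
    """Anemia: hemoglobin below 120 g/L for women, below 130 g/L for men."""
    threshold = 120.0 if sex == "female" else 130.0
    return hemoglobin_g_per_l < threshold


def has_liver_abnormality(ast_u_per_l: float, alt_u_per_l: float) -> bool:
    """Liver abnormality: AST or ALT outside its normal range."""
    ast_low, ast_high = AST_NORMAL_U_PER_L
    alt_low, alt_high = ALT_NORMAL_U_PER_L
    ast_ok = ast_low <= ast_u_per_l <= ast_high
    alt_ok = alt_low <= alt_u_per_l <= alt_high
    return not (ast_ok and alt_ok)


def exclusion_reason(
    hemoglobin_g_per_l: float,
    sex: str,
    ast_u_per_l: float,
    alt_u_per_l: float,
    bilirubin_umol_per_l: Optional[float],
) -> Optional[str]:
    """Return the first matching exclusion reason, or None if the patient is kept.

    The order of the checks is hypothetical.
    """
    if is_anemic(hemoglobin_g_per_l, sex):
        return "anemia"
    if has_liver_abnormality(ast_u_per_l, alt_u_per_l):
        return "liver abnormality"
    if bilirubin_umol_per_l is None:
        return "no baseline bilirubin"
    return None
```

Note that the same hemoglobin value can lead to different decisions for men and women: 125 g/L is below the male threshold but above the female one.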
The CPRD dataset was used to construct two cohorts at risk for decreasing kidney function: a cohort of patients with newly diagnosed type 2 diabetes (CPRD-DM2) and a cohort of non-diabetic patients with newly diagnosed primary hypertension (CPRD-HT). In CPRD-DM2, all patients with a type 2 diabetes diagnosis date between January 2005 and January 2016 and with clinical records covering at least one year prior to and three years after the diagnosis date were included. The type 2 diabetes diagnosis was defined by the presence of a clinical record of Read code [33] subchapter C10, or two prescriptions of oral anti-diabetic medication, or one glycated hemoglobin (HbA1c) lab value greater than 6.5%. The baseline values for the continuous covariates are defined as the average of all the measurements available during the acute disease period (within two years following diagnosis). The baseline value used for urine protein is defined as the greatest value observed using dipstick during the acute disease period. Out of 246,034 patients with a type 2 diabetes diagnosis, 71,104 (29%) met the additional inclusion criteria for the CPRD-DM2 cohort. Lack of bilirubin measurements at baseline (n = 38,819, 16%), liver-related comorbidity (n = 35,527, 14%) and anemia (n = 32,789, 13%) were the most common reasons for not meeting the inclusion criteria. See Fig 1 for the complete patient disposition diagram and the attached study protocol for the technical definitions of these exclusion criteria.
In CPRD-HT, all patients with a first clinical record of primary hypertension, defined by Read code subchapter G2, between January 2005 and January 2016 were included. Of the available 427,297 patients with a date of first diagnosis of primary hypertension, 82,065 (19%) met the remaining inclusion criteria. The main reasons for not meeting the inclusion criteria were lack of sufficient bilirubin measurements at baseline (n = 149,534, 35%), a diagnosis of diabetes prior to the hypertension diagnosis (n = 48,567, 11%), insufficient follow-up (n = 40,944, 10%) and anemia (n = 32,769, 8%). See Fig 1 for the complete patient disposition diagram.

Measures
2.3.1. Exposure. Patients were divided into either high or low circulating total bilirubin groups using the population median levels at baseline. For the SAVOR cohort, the baseline measurements were defined as the lab measurements and clinical observations immediately before randomization. For CPRD the baseline serum bilirubin levels were determined as the average of all available measurements during the acute disease period (2 years after the rolling start of the study date, cf. S1 Fig). At least two bilirubin measurements were required to obtain the average. We refer to the group with greater than or equal to median circulating bilirubin levels (9μmol/L for SAVOR and 10μmol/L for CPRD-DM2 and CPRD-HT) as the 'high bilirubin group' and the group with less than median circulating total bilirubin levels as the 'low bilirubin group'.
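The exposure definition can be illustrated with a minimal sketch. The function names are hypothetical, and the toy values in the usage note are not study data; the only rules taken from the text are the requirement of at least two measurements for the CPRD average and the assignment of values equal to the median to the high bilirubin group.

```python
from statistics import mean, median
from typing import List, Optional, Tuple


def cprd_baseline_bilirubin(measurements: List[float]) -> Optional[float]:
    """Average of the available measurements; None if fewer than two exist."""
    if len(measurements) < 2:
        return None
    return mean(measurements)


def assign_bilirubin_groups(baseline_values: List[float]) -> Tuple[float, List[str]]:
    """Split a cohort at its own median; values equal to the median are 'high'."""
    cutoff = median(baseline_values)
    groups = ["high" if value >= cutoff else "low" for value in baseline_values]
    return cutoff, groups
```

For a toy cohort with baseline values 6, 8, 9, 10 and 14 μmol/L, the cutoff is 9 and the patient at exactly 9 μmol/L falls in the high bilirubin group.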

Outcome (UACR and eGFR outcome definitions).
The primary endpoint was the first observation of a decrease in estimated glomerular filtration rate (eGFR), calculated using the Chronic Kidney Disease Epidemiology Collaboration (CKD-EPI) equation [34], of greater than 30% from baseline. For the CPRD cohorts we required a sustained 30% reduction, defined as more than two consecutive 30% decreases from baseline. For the SAVOR cohort, we counted a single occurrence of a 30% decrease as an end point (due to the lower frequency of observations for each patient). Subjects without baseline eGFR (CKD-EPI) were excluded.
The secondary end point was the first observation of a urine albumin creatinine ratio (UACR) greater than 30 mg/mmol. Due to the frequency of available UACR measurements, we used a single occurrence of such an event as an end point. Patients without UACR records at baseline were included, whilst patients with a baseline UACR greater than 30 mg/mmol were excluded.
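The difference between the single-occurrence and sustained definitions of the primary end point can be sketched as follows. The function name and the `min_consecutive` parameter are hypothetical. Which count corresponds to "more than two consecutive" decreases, and which visit of a qualifying run is taken as the event time, are left open here; this sketch returns the visit that completes the run.

```python
from typing import Optional, Sequence


def first_decline_index(
    baseline_egfr: float,
    followup_egfr: Sequence[float],
    threshold: float = 0.30,
    min_consecutive: int = 1,
) -> Optional[int]:
    """Index of the follow-up visit at which the end point is reached, or None.

    A visit qualifies when eGFR has fallen by more than `threshold` relative
    to baseline. With min_consecutive=1 a single qualifying visit is an event;
    larger values require an unbroken run of qualifying visits.
    """
    limit = baseline_egfr * (1 - threshold)
    run_length = 0
    for index, value in enumerate(followup_egfr):
        if value < limit:
            run_length += 1
            if run_length >= min_consecutive:
                return index
        else:
            run_length = 0  # the run is broken by a non-qualifying visit
    return None
```

With a baseline of 100 and follow-up values 80, 65, 72, 60, 55, 50, the single-occurrence rule fires at the second visit, whereas a rule requiring three consecutive qualifying visits fires only at the last one, because the value of 72 interrupts the first run.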

Statistical analysis
Baseline variables were summarized for each cohort and high/low bilirubin group.
The standardized difference for each covariate between the high and low bilirubin groups was calculated to identify potentially imbalanced covariates [35]. A directed acyclic graph (DAG) analysis of covariates was conducted to identify confounders and mediators among the available baseline covariates [36,37]. In the DAG analysis, we first created nodes for all the covariates included in the analyses, and then joined the nodes with directed edges according to known or suspected causal effects (regardless of the effect size). Once the diagram was generated, each covariate was classified as a confounder or a mediator by applying the definitions: a covariate with causal effects on both bilirubin and the endpoint is a confounder, whereas a covariate that is causally affected by bilirubin and in turn has a causal effect on the endpoint is a mediator.
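The classification rule stated above can be expressed as a small sketch over a list of directed edges. The function name is hypothetical, only direct edges are considered (a simplification), and the example edges in the usage note are illustrative rather than a reproduction of the study's diagram in Fig 3.

```python
from typing import Dict, Iterable, Set, Tuple


def classify_covariates(
    edges: Iterable[Tuple[str, str]], exposure: str, outcome: str
) -> Dict[str, str]:
    """Label each covariate as 'confounder', 'mediator' or 'other'.

    Each edge is a (cause, effect) pair.
    """
    edge_list = list(edges)
    children: Dict[str, Set[str]] = {}
    for cause, effect in edge_list:
        children.setdefault(cause, set()).add(effect)

    covariates = {node for edge in edge_list for node in edge} - {exposure, outcome}
    roles: Dict[str, str] = {}
    for node in sorted(covariates):
        causes_exposure = exposure in children.get(node, set())
        causes_outcome = outcome in children.get(node, set())
        caused_by_exposure = node in children.get(exposure, set())
        if causes_exposure and causes_outcome:
            roles[node] = "confounder"
        elif caused_by_exposure and causes_outcome:
            roles[node] = "mediator"
        else:
            roles[node] = "other"
    return roles
```

For example, with the illustrative edges hemoglobin → bilirubin, hemoglobin → CKD, bilirubin → HbA1c, HbA1c → CKD and bilirubin → CKD, hemoglobin is labeled a confounder and HbA1c a mediator.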
A multivariable logistic regression model with confounders (and not mediators) as covariates was used to calculate the propensity scores. The propensity scores for membership of the high bilirubin group were then used to calculate stabilized inverse probability weights (IPW). More specifically, for an individual in the high bilirubin group, the IPW is the marginal probability of a patient being in the high bilirubin group divided by the propensity score (the probability of the patient being in the high bilirubin group given the covariates, estimated from the logistic regression). The IPW for an individual in the low bilirubin group is calculated analogously.
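A minimal sketch of the weight calculation follows, taking already-estimated propensity scores as input. The function name is hypothetical. The formula for the low bilirubin group is the usual stabilized form (marginal probability of being in the low group divided by one minus the propensity score), which is an assumption about what "analogously" means here.

```python
from typing import List, Sequence


def stabilized_weights(
    in_high_group: Sequence[bool], propensity: Sequence[float]
) -> List[float]:
    """Stabilized inverse probability weights.

    in_high_group[i] is True if patient i is in the high bilirubin group;
    propensity[i] is the estimated P(high group | covariates) for patient i.
    """
    n = len(in_high_group)
    p_high = sum(1 for flag in in_high_group if flag) / n  # marginal P(high group)
    weights = []
    for is_high, score in zip(in_high_group, propensity):
        if is_high:
            weights.append(p_high / score)
        else:
            weights.append((1 - p_high) / (1 - score))
    return weights
```

A patient whose propensity score equals the marginal probability receives a weight of 1; patients who are over-represented in their group given their covariates are down-weighted, and under-represented patients are up-weighted.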
The covariate balance between the low and high bilirubin groups after IPW was diagnosed by recalculating the standardized differences with IPW.
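As an illustration of this balance check, the sketch below computes a standardized difference, in percent, for a continuous covariate, with optional weights. It uses one common form (difference in means divided by the square root of the average of the two group variances); the exact estimator from [35] used in the study, the sign convention (high minus low) and the use of weight-normalized rather than sample variances are assumptions of this sketch.

```python
from math import sqrt
from typing import Optional, Sequence, Tuple


def weighted_mean_var(
    values: Sequence[float], weights: Sequence[float]
) -> Tuple[float, float]:
    """Weighted mean and weighted (population-style) variance."""
    total = sum(weights)
    mean = sum(w * v for v, w in zip(values, weights)) / total
    var = sum(w * (v - mean) ** 2 for v, w in zip(values, weights)) / total
    return mean, var


def standardized_difference(
    high_values: Sequence[float],
    low_values: Sequence[float],
    high_weights: Optional[Sequence[float]] = None,
    low_weights: Optional[Sequence[float]] = None,
) -> float:
    """Standardized difference in percent (high group minus low group)."""
    if high_weights is None:
        high_weights = [1.0] * len(high_values)  # unweighted case
    if low_weights is None:
        low_weights = [1.0] * len(low_values)
    mean_high, var_high = weighted_mean_var(high_values, high_weights)
    mean_low, var_low = weighted_mean_var(low_values, low_weights)
    pooled_sd = sqrt((var_high + var_low) / 2)
    return 100 * (mean_high - mean_low) / pooled_sd
```

Calling the function once without weights and once with the inverse probability weights gives the before and after values for a covariate.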
A multivariable Cox proportional hazards model with confounders (and not mediators) as covariates and a stabilized inverse probability weighting scheme [35] was used. We included the confounders identified by the DAG analysis in the final multivariable Cox model. Missing covariates were handled using the missing indicator method [38] so that the propensity score could be calculated.
To account for between-subject correlations, we applied a case sampling bootstrap method to compute the mean and variance of the HR and computed the confidence interval. All analyses were performed for each cohort separately.
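The missing indicator method can be sketched in a few lines: each covariate with missing values is replaced by a filled-in column plus a binary column that flags which entries were missing, and both columns enter the model. The function name and the fill value of 0 are assumptions of this sketch, not details taken from the study protocol.

```python
from typing import List, Optional, Sequence, Tuple


def add_missing_indicator(
    values: Sequence[Optional[float]], fill_value: float = 0.0
) -> Tuple[List[float], List[int]]:
    """Return (filled values, indicator) where indicator is 1 for missing entries."""
    filled = [fill_value if value is None else value for value in values]
    indicator = [1 if value is None else 0 for value in values]
    return filled, indicator
```

For a covariate recorded as 35.0, missing, 41.0, the result is the filled column 35.0, 0.0, 41.0 and the indicator column 0, 1, 0, so no patient has to be dropped when the propensity model is fitted.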
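The case-resampling step can be sketched generically. Whole patient records are resampled with replacement, the estimate is recomputed on each resample, and a normal distribution is fitted to the resulting estimates to obtain the confidence interval (the figure captions mention 1000 bootstrap samples). The function name, the `estimator` callable, the seed and the multiplier of 1.96 are assumptions; in the study's setting the estimator would refit the weighted Cox model and return the HR, which is not reproduced here.

```python
import random
from statistics import mean, stdev
from typing import Callable, Sequence, Tuple, TypeVar

Record = TypeVar("Record")


def bootstrap_normal_ci(
    records: Sequence[Record],
    estimator: Callable[[Sequence[Record]], float],
    n_boot: int = 1000,
    seed: int = 0,
    z: float = 1.96,
) -> Tuple[float, Tuple[float, float]]:
    """Bootstrap mean of the estimate and a normal-approximation 95% interval."""
    rng = random.Random(seed)
    n = len(records)
    estimates = []
    for _ in range(n_boot):
        # Resample whole cases (patients) with replacement.
        sample = [records[rng.randrange(n)] for _ in range(n)]
        estimates.append(estimator(sample))
    center = mean(estimates)
    spread = stdev(estimates)
    return center, (center - z * spread, center + z * spread)
```

Any function that maps a list of patient records to a single number can be passed as `estimator`, which keeps the resampling logic independent of the model being fitted.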

Sensitivity analyses
The analysis of the primary and secondary end points was conducted with and without use of IPW and adjustment for baseline risk factors. Sensitivity of the result to the definition of the high/low bilirubin groups was evaluated by dividing subjects into 5 or 10 bilirubin quantiles and computing the HR with the highest serum bilirubin level group as the reference. Prespecified subgroup analyses were performed to identify interactions between the subgroups and the hazard ratios. The predefined subgroups were: younger or older than 75 years, male or female, smoker or non-smoker, CKD stages, albuminuria stages, ACE inhibitor use, statin use, diuretic use, potassium-sparing diuretic use and race.

Patient characteristics
The baseline characteristics of each cohort and high/low bilirubin group are described in Table 1. The SAVOR cohort had the highest HbA1c level whilst CPRD-HT had the lowest. Mean eGFR was comparable across all three cohorts; however, UACR was significantly higher in the SAVOR cohort. Both CPRD cohorts had a higher proportion of females than the SAVOR cohort. The CPRD cohorts had a slightly greater proportion of smokers than the SAVOR cohort.
Baseline serum bilirubin levels were similarly distributed across all cohorts (Fig 2). Factors identified as confounders through the directed acyclic graph analysis were adjusted for, as the focus of this investigation was on the total association of bilirubin level with the risk of CKD (Fig 3). The identified confounders were hemoglobin, sex, age, body mass index (BMI), alanine transaminase (ALT), aspartate transaminase (AST), smoking and race, and they were used for the calculation of the IPW. HbA1c, blood pressure and baseline eGFR were identified as mediators. Fig 4 shows the standardized difference for the investigated factors between subjects in the high and the low bilirubin groups. An absolute standardized difference above 10% is an indication of a meaningful difference between groups [35]. Hemoglobin, male gender, height, AST, ALT, age, and Asian race had a standardized difference larger than 10% in all cohorts. Insulin therapy, HbA1c, BMI, smoking, and black race had a standardized difference below -10% in all cohorts. After inverse probability weighting, all covariates except ALT and serum uric acid in the CPRD cohorts were within plus or minus 10% standardized difference. However, ~80% of baseline ALT and ~95% of serum uric acid measurements are missing in CPRD, resulting in this imbalance. Fasting prior to baseline measurement is coded as a binary variable: 1 if yes and 0 otherwise (no or unknown).

Main outcome eGFR and UACR
The event rate for the primary end point of an eGFR decrease >30% differed between the three cohorts. As expected, subjects with established diabetes and additional CV risk factors in the SAVOR cohort had an event rate of 42.8 per 1000 patient years (kPY), almost 5 times the event rate observed in the CPRD-DM2 cohort (9.2 per kPY) and 8 times the event rate observed in the CPRD-HT cohort (5.1 per kPY). The event rates differed even more between the cohorts for the secondary end point of albuminuria (SAVOR had a 6-times and 26-times higher event rate compared to the CPRD-DM2 and CPRD-HT cohorts, respectively).
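The arithmetic behind these comparisons is simple enough to check directly. The sketch below uses only the rates quoted above; the function name and the counts in the usage note are hypothetical.

```python
def rate_per_1000_patient_years(events: int, patient_years: float) -> float:
    """Event rate expressed per 1000 patient years (kPY)."""
    return 1000 * events / patient_years


# Primary end point rates quoted in the text, per 1000 patient years.
primary_rates = {"SAVOR": 42.8, "CPRD-DM2": 9.2, "CPRD-HT": 5.1}

# Ratio of the SAVOR rate to each CPRD cohort's rate.
ratio_vs_dm2 = primary_rates["SAVOR"] / primary_rates["CPRD-DM2"]  # about 4.7
ratio_vs_ht = primary_rates["SAVOR"] / primary_rates["CPRD-HT"]    # about 8.4
```

As a purely illustrative example of the unit, 428 events over 10,000 patient years would correspond to 42.8 per kPY.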
The results of the primary analyses are summarized in Fig 5a. The multivariable adjusted inverse probability-weighted hazard ratios (95% confidence interval) for the primary end point for SAVOR, CPRD-DM2, CPRD-HT were 1. 18

Sensitivity analysis
The hazard ratios estimated from the multivariable Cox proportional hazard models were almost identical between the analyses with and without inverse probability weighting (S2 Fig). Without any confounder adjustment, all HRs were greater than one, and for all primary end points, they were statistically significant (Fig 5b). In Fig 7, the relationship between the hazard ratio and serum bilirubin concentration is plotted. The HRs were calculated relative to the highest serum bilirubin level group when the cohorts were divided into 5 or 10 quantiles. As can be seen in Fig 7, regardless of how the cohort is divided by bilirubin level, there is a consistent relationship between lower bilirubin concentration and higher risk for the primary and the secondary end point.

Fig 4. We consider an absolute standardized difference over 10% to indicate a significantly different baseline characteristic between the high bilirubin group and the low bilirubin group. Green dots and squares are the standardized differences after inverse probability weighting. Filled squares indicate the covariates (Sex, ALT, AST, Race, Age, Body Mass Index, Smoking) that are used to calculate the propensity score for the inverse probability weighting. After the inverse probability weighting, the differences in baseline characteristics between the high bilirubin and low bilirubin subpopulations become not significant, except for HOMA2 Beta Cell, HOMA2 Insulin Sensitivity (for SAVOR), Serum Uric Acid and ALT (for CPRD cohorts). We note that the proportions of missing baseline records for HOMA2 Beta Cell, HOMA2 Insulin Sensitivity (for SAVOR), Serum Uric Acid and ALT (for CPRD-DM2) are 56%, 56%, 93%, 85%, respectively. Protein Dipstick Test results are converted into numerical values in the order null, trace, +, ++ as 0, 1, 2, 3, respectively. Sex is coded 0 as female and 1 as male. isAsian, isPacific Islander, isWhite, isNative, isOther, isMix, isBlack are the binary coding of the racial category of the patient; for example, if the reported race of the patient is Asian, then isAsian is 1 and isPacific Islander, isWhite, isNative, isOther, isMix, and isBlack are 0. Smoking is coded as a binary variable where 1 indicates a current smoker and 0 otherwise. The time at which baseline measurements were taken was recorded on a 24 hr scale with minutes converted into a fraction of an hour. Other diabetic medication is the medication included in the British National Formulary (BNF) code chapter 'Other Antidiabetic Drugs'. https://doi.org/10.1371/journal.pone.0269970.g004
The pre-specified subgroup analyses identified only age-group in the CPRD cohorts as statistically significant (S3a-S3f Fig). HR ratio for the primary and the secondary end points were lower in subjects below the age of 75 years in the CPRD cohorts and similar to those in the SAVOR cohort. HR ratios (95% confidence interval) for the primary end point in subjects below 75 years in SAVOR, CPRD-DM2, CPRD-HT were 1.

Conclusion
To the best of our knowledge, the presented study is the most extensive retrospective study to date on the association between serum bilirubin and CKD. By using three distinct cohorts with different CKD risk, we have shown that low bilirubin level is an independent risk factor for kidney function decline in both diabetic and hypertensive patients.
This study was conducted using both registry and RCT data. The RCT data from the SAVOR cohort contributes a robust, well-controlled cohort with standardized data collection and quality control. For example, fasting and time of day are known factors [39,40] that influence bilirubin levels and that previous investigations were unable to adjust for or standardize. However, in SAVOR, 97.4% of the baseline measurements were collected after fasting, and 95.3% of the samples were collected in the morning, thus standardizing for the diurnal factors influencing bilirubin levels. In contrast, real-world data (RWD) offers a long-term view over a broad population in a less controlled manner.

Fig 5. Shown are the primary endpoint of the study (a decrease in estimated glomerular filtration rate, estimated by the CKD-EPI formula, of more than 30% from baseline) and the secondary endpoint of the study (first observation of albuminuria, defined by a urine albumin creatinine ratio greater than or equal to 30 mg/mmol), according to the study cohorts. High bilirubin denotes the patient subpopulation with baseline serum bilirubin concentration above or equal to the cohort median (9 μmol/L for SAVOR, 10 μmol/L for CPRD-DM2 and HT); Low bilirubin denotes the patient subpopulation with baseline serum bilirubin concentration below the cohort median. HR: hazard ratios; in panel a, inverse probability weighted and adjusted for confounding covariates (age, sex, race if available, baseline body mass index, baseline hemoglobin, baseline alanine transaminase, baseline aspartate transaminase, smoking); in panel b, without inverse probability weighting or adjustment. CI: confidence interval calculated by fitting a normal distribution to the 1000 bootstrap samples of the HR. https://doi.org/10.1371/journal.pone.0269970.g005
Similar to other studies, we found that lower bilirubin levels were associated with an increased risk for kidney function decline and, by extension, CKD. Importantly, this finding was consistent in both a well-controlled RCT setting and in RWD. Most of the prior analyses evaluating the association of bilirubin levels with CKD risk were conducted in Asian populations, with only one study, by Riphagen et al. [41], that included a considerable number of Caucasians. However, the study by Riphagen et al. [41] focused on the risk for progression in already established CKD, in contrast to our study, where we demonstrate the association of lower bilirubin levels across a wide range of CKD risk populations. Our study therefore provides an essential extension of the previously reported findings.
The study of Mashitani et al 20 was based on 957 subjects with microalbuminuria and demonstrated a reduced risk for CKD progression among subjects with elevated bilirubin levels [25]. However, after adjusting for baseline hemoglobin levels, the results were no longer statistically significant. Since bilirubin is a catabolic product of heme breakdown, it might be inferred that lower bilirubin levels reflect the lower hemoglobin levels associated with CKD progression. Hence, from their study it is not possible to conclude whether serum bilirubin level is independently associated with CKD risk, or whether the association merely reflects confounding by hemoglobin. Mashitani and colleagues' study was replicated in various other East Asian populations; none of these studies provided the answer to this question. Our study was therefore designed to provide evidence on whether the association between CKD risk and serum bilirubin is independent of the hemoglobin level.

Fig 6. Weighted cumulative incidence plots of the primary and secondary end points. The primary endpoint of the study was a decrease in estimated glomerular filtration rate (estimated by the CKD-EPI formula) of more than 30% from baseline (Panels a, b, c). The secondary endpoint of the study was the first observation of albuminuria (defined by a urine albumin creatinine ratio greater than or equal to 30 mg/mmol) (Panels d, e, f). The study cohorts are SAVOR (Panels a, d), CPRD-DM2 (Panels b, e) and CPRD-HT (Panels c, f). High bilirubin, plotted in red, is the patient subpopulation with baseline serum bilirubin concentration above or equal to the cohort median (9 μmol/L for SAVOR, 10 μmol/L for CPRD-DM2 and HT); Low bilirubin, plotted in blue, is the patient subpopulation with baseline serum bilirubin concentration below the cohort median. The cumulative incidences were estimated with the Kaplan-Meier method, where the events are weighted by the inverse probability weighting scheme with the propensity score calculated based on the confounding covariates (age, sex, race if available, baseline body mass index, baseline hemoglobin, baseline alanine transaminase, baseline aspartate transaminase, smoking). https://doi.org/10.1371/journal.pone.0269970.g006
The size of the estimated relative risk was largest in the SAVOR cohort, which included the patients at highest risk for CKD (18% higher risk), and smallest in the CPRD-HT cohort, with the lowest risk (8.7% higher risk). However, the difference between the estimated HRs was considerably smaller in the prespecified subgroup analysis of subjects less than 75 years of age (18% versus 14%). The concentration versus HR analysis of bilirubin (Fig 7) shows that, compared to subjects in the highest quantile, patients in the lowest quantile have ~30% higher risk for both the primary and the secondary end points.
Based on these results, we conclude that an association between low serum bilirubin levels and an elevated risk of CKD can be extended beyond Asian populations. All these results are statistically significant while being adjusted for the key confounders such as smoking and hemoglobin levels. We therefore conclude that low bilirubin levels are an independent CKD risk factor and not a consequence of lower hemoglobin levels associated with CKD.
Since our study is an observational study, there are several limitations. Firstly, the results presented in this study were adjusted for known and measured confounders. However, there is a possibility that the associations presented here could be due to other unavailable factors such as participation in regular physical exercise. Secondly, CPRD is a real-world data set [31] with a significant number of missing variables (cf. S1 Table). We did not assume the missing variables to be missing at random and have therefore applied the missing indicator method in our analysis. We also removed all subjects without baseline bilirubin measurements, which may have caused bias in the population. In addition, we have assumed that the absence of a UACR test at baseline indicates the physician's overall judgement of normal kidney function. Hence, for the CPRD cohorts, if UACR measurements are not available at baseline, we assumed that the event (UACR >30 mg/mmol) had not yet occurred. In the CPRD cohorts, a significant number of patients did not have UACR measurements at baseline (CPRD-DM2 about 43%, CPRD-HT about 91%). On the other hand, for both CPRD-DM2 and CPRD-HT, only about 2% of the patients with UACR measurements at baseline had UACR >30 mg/mmol; hence we believe the impact of the above assumption is negligible. The limitations related to RWD mentioned here are to some extent mitigated by conducting the same analyses using RCT data, where more precise measurements were performed on all subjects. Thirdly, genetic information that may explain individual variability of serum bilirubin level was not available in any of our datasets; had it been available, using the UGT1A1 polymorphism data and conducting Mendelian randomization would have been preferred to dividing and comparing patients by baseline serum bilirubin levels. Despite these limitations, the results are consistent across all the cohorts with a wide range of CKD risks.

Fig 7. Relationship between the hazard ratio and serum bilirubin concentration. The SAVOR cohort was subdivided into five quantiles by baseline serum bilirubin level and the hazard ratios were calculated with the highest quantile as the reference. The CPRD cohorts were subdivided into ten quantiles by baseline serum bilirubin level and the hazard ratios were calculated with the highest quantile as the reference. A multivariable Cox proportional hazard model was used without inverse probability weighting to calculate the hazard ratios. The 95% confidence intervals for the outcomes were calculated based on robust variances. https://doi.org/10.1371/journal.pone.0269970.g007
Furthermore, the results were robust against the sensitivity analyses; hence, we believe this finding is generalizable to a broad patient population in the real world.
Physicians treating diabetic and hypertensive patients should consider that bilirubin measurements may be useful for early identification of patients at risk for kidney function decline and development of CKD. For example, patients with bilirubin <7 μmol/L may benefit from more frequent kidney function monitoring in order to detect kidney function decline earlier.
Similarly, bilirubin may have utility in clinical trial patient enrichment in order to increase the chances of detecting a treatment effect [42]. Finally, as recently indicated by Vitek et al. [2] and Hull and Agarwal [43], induction of a mild hyperbilirubinemia may present a therapeutic opportunity for several indications including CKD. Our study further supports the view that elevation of bilirubin may be a therapeutic mechanism when seeking new treatments for CKD.
In conclusion, based on our retrospective analysis of the largest longitudinal cohort to date using RWD and RCT data, we suggest that the causality of low serum bilirubin levels to poor CKD outcomes should be investigated in future interventional studies.
Supporting information
S1 File. (DOCX)
S1   Fig. Primary and secondary endpoints, according to study cohorts, hazard ratio calculated without inverse probability weighting. Shown are the primary endpoint of the study (a decrease in estimated glomerular filtration rate, estimated by the CKD-EPI formula, of more than 30% from baseline) and the secondary endpoint of the study (first observation of albuminuria, defined by a urine albumin creatinine ratio greater than or equal to 30 mg/mmol), according to the study cohorts: SAVOR, a subset of The Saxagliptin Assessment of Vascular Outcomes Recorded in Patients with Diabetes Mellitus-Thrombolysis in Myocardial Infarction 53 trial; CPRD-DM2 (Type 2 diabetic cohort constructed from Clinical Practice Research Datalink); CPRD-HT (Hypertensive cohort constructed from Clinical Practice Research Datalink). High bilirubin denotes the patient subpopulation with baseline serum bilirubin concentration above or equal to the cohort median (9 μmol/L for SAVOR, 10 μmol/L for CPRD-DM2 and HT); Low bilirubin denotes the patient subpopulation with baseline serum bilirubin concentration below the cohort median. HR: hazard ratio adjusted for confounding covariates (age, sex, race if available, baseline body mass index, baseline hemoglobin, baseline alanine transaminase, baseline aspartate transaminase, smoking). CI: confidence interval calculated by fitting a normal distribution to the 1000 bootstrap samples of the HR.