Using time-varying models to estimate post-transplant survival in pediatric liver transplant recipients

Purpose. To distinguish clinical factors that have time-varying (as opposed to constant) impact upon patient and graft survival among pediatric liver transplant recipients. Methods. Using national data from 2002 through 2013, we examined potential clinical and demographic covariates using Gray’s piecewise constant time-varying coefficients (TVC) models. For both patient and graft survival, we estimated univariable and multivariable Gray’s TVC models, retaining significant covariates based on backward selection. We then estimated the same specification using traditional Cox proportional hazards (PH) models and compared our findings. Results. For patient survival, covariates included recipient diagnosis, age, race/ethnicity, ventilator support, encephalopathy, creatinine levels, use of living donor, and donor age. Only the effects of recipient diagnosis and donor age were constant; effects of other covariates varied over time. We retained identical covariates in the graft survival model but found several differences in their impact. Conclusion. The flexibility afforded by Gray’s TVC estimation methods identifies several covariates that do not satisfy the constant proportionality assumptions of the Cox PH model. Incorporating better survival estimates is critical for improving risk prediction tools used by the transplant community to inform organ allocation decisions.



Introduction
Liver allocation and transplantation policies in the US must balance the problems of donor organ scarcity with mandates for fair and effective allocation. In response, oversight of the allocation process has evolved to include development and maintenance of national registry data for patients who are listed and/or transplanted [1]; development of biologically-based models of disease progression for end-stage organ failure [2]; and use of simulation models that incorporate registry data with clinical data regarding disease progression in the absence of transplantation [3]. Simulation models for adults have been critical in providing timely, evidence-based data to the transplant community. They inform policy development and evaluate the potential impact of changes before policy adoption or implementation, by comparing survival and quality of life (with and without transplantation) under different recommendations and practices [3].
The Pediatric Acute Liver Failure (PALF) study funded by the National Institute of Diabetes and Digestive and Kidney Diseases (NIDDK) is a multicenter study focused on children diagnosed with PALF. Diagnoses associated with PALF differ significantly from those seen in adults; in particular, specific diagnoses are not established in up to 50% of pediatric cases [4]. Reliable models to predict outcomes with the native liver in PALF are not yet available. Previous attempts, including Kings College Criteria [5] and the Liver Injury Unit score [6], do not reliably predict death [7,8], but recent modeling efforts to predict outcomes in PALF have shown promise. A unique immune and inflammatory cytokine network associated with death has been identified, while an entirely different network occurred in subjects who survived with their native liver [9]. Additionally, serial assessments of clinical parameters (international normalized ratio (INR), total bilirubin, and clinical encephalopathy) differentiated subject outcomes using a growth mixture model [10]. These data inform natural history of PALF and could be used in developing predictive models for liver transplant decisions.
As collaborators with the PALF study, we are expanding prior work with a simulation model of liver transplant candidates originally calibrated for adults only [2]. Extending the model to include children improves outcome prediction for all recipients and better represents the competition for resources that are not confined within diagnosis, clinical condition, or age. The complexities of optimizing liver transplant decisions in PALF should not be underestimated. Clinical components must be integrated, such as predicting clinical deterioration associated with death or permanent organ damage; assessing the impact of allocation decisions for both acute and chronic conditions; and estimating quality outcomes following transplantation.
Integral to developing a liver transplant decision model for PALF, the focus of this report is to demonstrate optimal methods for predicting post-transplant outcomes in pediatric transplant recipients, including children with PALF, by comparing conventional estimation methods with those that are more flexible but also more challenging to employ.
Separate but analogous analyses were conducted for two primary outcomes (post-transplant patient survival and graft survival), and survival curves were generated to illustrate effects over time for transplant recipients with a representative set of covariates. The University of Pittsburgh Institutional Review Board approved this study.

Methods
Since February 2002, risk of death without transplantation for liver transplant candidates has been estimated using severity of illness calculations, specifically Pediatric End-Stage Liver Disease (PELD) scores for children less than 12 years old [11] and Model for End-Stage Liver Disease (MELD) for adults and children over 12 years old [12]. MELD/PELD calculations are widely accepted for predicting pre-transplant deaths, but there is no consensus for using MELD/PELD calculations to predict death after transplantation. Researchers are divided, with some studies reporting associations between pre-transplant MELD scores and post-transplant survival [13,14] and others finding no significance [15,16]. In one systematic review, most of the included studies found moderate associations, but overall findings indicated low levels of evidence for using MELD to estimate post-transplant survival [17].
Instead, post-transplant outcomes rely on survival models, including our earlier work [18,19]. The Cox proportional hazards (PH) model is most commonly used [20], not only in the context of organ transplantation but for myriad health conditions and time-to-event analyses. The Cox PH model is convenient because of widespread availability in statistical packages, but it imposes several strict assumptions, namely that the effect of a variable is both proportional and constant over all future time. When these assumptions are not satisfied, Cox PH models are inappropriate. For example, in adults listed for liver transplantation, use of life support at time of transplantation has an immediate effect on the likelihood of survival but later becomes irrelevant, while prior cytomegalovirus (CMV) infection has minimal impact initially but becomes increasingly important over time [2]; proportionality assumptions are also violated in children diagnosed with hepatocellular carcinoma [18]. In these circumstances, flexible models allowing for covariate effects to change over time would improve the accuracy of survival predictions and also provide insights for better outcomes through pre-transplant management.
Gray's piecewise-constant time-varying coefficients (TVC) model [21] is one such alternative that first tests the proportional hazards assumption and then, if it is not satisfied, allows but does not force the effect of a covariate (e.g., ventilator support) to vary over time. In essence, Gray's TVC model treats the covariate effect as a series of Cox PH models: as the name suggests, the effect is "piecewise constant" within a given time interval and may be adjusted up or down in subsequent intervals. Gray's TVC model captures important changes in covariate effects and is an extension of Cox, but estimation is more complicated and not easily available in software.
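As a notational sketch of the contrast described above (the symbols are illustrative and not taken from the source), let $h_0(t)$ be an unspecified baseline hazard, $x$ a single covariate, and $\tau_0 < \tau_1 < \dots < \tau_K$ hypothetical interval boundaries. The two models can then be written as:

```latex
% Cox PH: a single coefficient applies over all follow-up time
h(t \mid x) = h_0(t)\,\exp(\beta x)

% Piecewise-constant TVC: the coefficient may change between intervals
h(t \mid x) = h_0(t)\,\exp\{\beta(t)\,x\}, \qquad
\beta(t) = \sum_{k=1}^{K} \beta_k \,\mathbf{1}\{\tau_{k-1} \le t < \tau_k\}
```

When $\beta_1 = \dots = \beta_K$, the second form reduces to the first, so the Cox PH model is the special case in which the covariate effect does not change across intervals.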

Data sources
We obtained a standard de-identified data file of transplant candidates from United Network for Organ Sharing (UNOS), based on Organ Procurement and Transplantation Network (OPTN) data. The main file contains patient diagnoses (Table 1) and demographics for everyone placed on the transplant waiting list; it measures clinical characteristics at time of listing and (if applicable) time of transplantation. The file also maintains updated information about survival and liver function until candidates die, are removed from the waiting list, or receive a transplant. Changes in physiology are recorded using laboratory values affecting the candidate's MELD score (albumin, bilirubin, INR, and serum creatinine) or PELD score (albumin, bilirubin, INR, age, and growth failure), plus other comorbidities and laboratory values related to severity (ascites, encephalopathy, serum sodium).

Outcome variables and model covariates
We considered two outcome variables, post-transplant patient survival (number of days between primary liver transplant and death) and graft survival (number of days between primary liver transplant and death or re-transplantation).
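The two outcome definitions can be illustrated with a minimal Python sketch. The function name, argument names, and the choice to censor at the last follow-up date are assumptions made for illustration; the source does not describe how censoring was coded.

```python
from datetime import date
from typing import Optional, Tuple

Outcome = Tuple[int, bool]  # (days since primary transplant, event observed?)


def survival_outcomes(
    transplant: date,
    death: Optional[date],
    retransplant: Optional[date],
    last_followup: date,
) -> Tuple[Outcome, Outcome]:
    """Return (patient_outcome, graft_outcome) for one recipient."""
    # Patient survival: the event is death; otherwise censor at last follow-up
    # (censoring rule is an assumption of this sketch).
    patient_end = death if death is not None else last_followup
    patient = ((patient_end - transplant).days, death is not None)

    # Graft survival: the event is death or re-transplantation, whichever is first.
    graft_events = [d for d in (death, retransplant) if d is not None]
    graft_end = min(graft_events) if graft_events else last_followup
    graft = ((graft_end - transplant).days, bool(graft_events))
    return patient, graft
```

A recipient who is re-transplanted but still alive therefore contributes an event to the graft survival analysis and a censored observation to the patient survival analysis.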
We identified potential covariates based on prior work [18,19] and expertise of the PALF oversight committee. MELD/PELD score was excluded from consideration because, as noted above, it was developed for estimating survival without transplantation. For both outcomes, we considered the same initial covariates related to the recipient, the donated organ, and the transplant procedure (Tables 2 and 3).
Recipient characteristics. Attributes include demographics, namely age, gender, race/ethnicity, and primary source of payment. They also include six major disease categories, aggregated in consultation with our oversight committee, that are relevant in children: biliary atresia, metabolic disorder, cancer, and autoimmune disease, plus acute liver failure and a final category for miscellaneous other chronic diagnoses (Table 1). This final group combines heterogeneous diagnoses appearing in small numbers, with "unknown" (n = 279) and primary sclerosing cholangitis (n = 66) being the most prevalent.
Donor characteristics. Donor-specific covariates include demographics, blood type, whether the donor was deceased or living, and CMV and EBV status.
Transplant information. Information describing the transplant itself included year of transplant (a proxy for secular changes), transplant center region (defined by OPTN/UNOS), a binary indicator for use of partial/split organs, cold ischemia time, and transport distance of the donor organ. Except for geographic region, information about transplant centers was not available.

Statistical analysis
We summarized cohort characteristics using descriptive statistics, first for the entire sample and then separately for patients who remained alive (without re-transplant) for the study duration, died (without re-transplant), or were re-transplanted during the study period. We followed the same analytical procedure for both patient and graft survival, initially testing proportionality assumptions for each covariate and estimating univariable models using Gray's TVC model. Importantly, the Cox PH model is a special case of Gray's; when covariates satisfy proportionality assumptions, estimates for the two models are identical. Using univariable results, we retained covariates with a P-value < 0.15. This subset was combined into a multivariable Gray TVC model, using a backward stepwise procedure for variable selection to gain more power.
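The two-stage selection described above can be sketched as follows. This is not the authors' code: `fit_pvalues` is a hypothetical callback standing in for fitting a Gray TVC model and returning one p-value per covariate, and the retention threshold `alpha=0.05` in the backward step is an assumption, since the source states only the univariable cutoff of 0.15.

```python
from typing import Callable, Dict, List, Sequence

# Hypothetical interface: fit a model on the named covariates, return p-values.
PValueFn = Callable[[Sequence[str]], Dict[str, float]]


def univariable_screen(
    candidates: Sequence[str], fit_pvalues: PValueFn, threshold: float = 0.15
) -> List[str]:
    """Keep covariates whose single-covariate model has p < threshold."""
    return [c for c in candidates if fit_pvalues([c])[c] < threshold]


def backward_select(
    covariates: Sequence[str], fit_pvalues: PValueFn, alpha: float = 0.05
) -> List[str]:
    """Drop the least significant covariate until all remaining have p < alpha."""
    kept = list(covariates)
    while kept:
        pvals = fit_pvalues(kept)  # refit on the current covariate set
        worst = max(kept, key=lambda c: pvals[c])
        if pvals[worst] < alpha:
            break  # every remaining covariate is significant
        kept.remove(worst)
    return kept
```

Passing the fitting step in as a callback keeps the selection logic independent of whichever software is used to estimate the model.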
We encountered two types of missing data: dichotomous variables (usually indicating presence or absence of a symptom/condition) that are often overlooked when absent; and values where data were not captured. In the first case, we treated missing data as "absence" of a condition and tested this in sensitivity analyses; in the second case, data were treated as truly missing and observations were dropped from the estimation.
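The two missing-data rules can be expressed as a short Python sketch. The field names and the use of `None` to mark a missing value are assumptions made for illustration.

```python
from typing import Dict, List, Optional, Sequence

Record = Dict[str, Optional[float]]


def prepare_records(
    records: Sequence[Record],
    binary_fields: Sequence[str],
    other_fields: Sequence[str],
) -> List[Record]:
    """Recode missing binary indicators as 0; drop rows missing other fields."""
    prepared: List[Record] = []
    for rec in records:
        row = dict(rec)  # copy so the input is not modified
        for field in binary_fields:
            if row.get(field) is None:
                row[field] = 0  # missing indicator treated as "condition absent"
        if any(row.get(field) is None for field in other_fields):
            continue  # value not captured: drop the observation
        prepared.append(row)
    return prepared
```

Keeping the list of binary fields explicit makes it straightforward to rerun the estimation with a different rule for those fields, as in a sensitivity analysis.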
After finalizing our specification using Gray's TVC, we estimated the corresponding Cox PH model. The findings are summarized graphically, and we superimposed both hazard plots onto a common coordinate axis.

Definition of patient sample
The OPTN data file contained 5,718 children less than 18 years old who were listed for liver transplantation between February 2002 (when MELD/PELD was implemented) and June 2010. We excluded 1,183 children who were removed from the waiting list for myriad reasons, including death (n = 448), too sick (n = 172), too healthy (n = 445), refused transplant (n = 14), missing transplantation data (n = 3), and other/unknown reasons (n = 101). We also removed children still awaiting transplantation (n = 498). Finally, we excluded 574 children receiving multiple simultaneous organ transplants (e.g., combined kidney and liver) and 288 recipients with liver cancer, who were evaluated separately [18]. The current analysis includes 3,175 pediatric liver transplant recipients initially transplanted by June 2010, with post-transplant follow-up through June 2013 (Fig 1).
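The cohort derivation above can be checked with simple arithmetic. The counts are those reported in the text; the variable names are illustrative.

```python
# Counts reported in the text for children listed Feb 2002 through June 2010.
listed = 5718
removed_from_list = {
    "death": 448,
    "too sick": 172,
    "too healthy": 445,
    "refused transplant": 14,
    "missing transplantation data": 3,
    "other/unknown": 101,
}
still_waiting = 498
multi_organ = 574
liver_cancer = 288

removed_total = sum(removed_from_list.values())
analysis_cohort = listed - removed_total - still_waiting - multi_organ - liver_cancer
print(removed_total, analysis_cohort)  # 1183 3175
```

The itemized removal reasons sum to the stated 1,183, and the successive exclusions leave the 3,175 recipients analyzed.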

Description of patient sample
Transplant recipients are described in Table 2 (demographics and clinical characteristics) and Table 3 (donor- and allocation-related information). Unadjusted post-transplant survival, stratified by disease, is provided in Fig 2; in general, survival appears similar, except for recipients with acute liver failure and miscellaneous other diagnoses (logrank test, p < 0.001). An analogous figure for graft survival appears in supporting information (S1 Fig). As Table 2 illustrates, most recipients remained alive throughout the study period, so this group closely resembles the overall sample. Re-transplanted patients were similar to the overall sample, although they were more likely to be white and have private insurance. Autoimmune disease and metabolic disorders were also more common, and re-transplanted recipients appeared sicker at time of (primary) transplant, with higher rates of ascites and encephalopathy.
For those who died after transplantation, there are similarities regarding gender and blood type, but they stand out from the overall sample in several respects. A higher proportion of decedents were older (at least 12 years old) at transplant, black, and publicly insured. They were more likely to have acute liver failure or miscellaneous other diagnoses (Table 1); they appeared sicker at transplant, with worse bilirubin and higher rates of ascites, CMV, encephalopathy, and ventilator support. A higher proportion was Status 1 with worse MELD/PELD scores and shorter waiting times, and proportionately fewer were active exceptions at transplant.
Factors related to the donor or procedure itself did not vary greatly, although recipients who remained alive enlisted living donors more often than those who died or were re-transplanted, and re-transplanted recipients had higher rates of partial/split donor organs for their primary transplant (Table 3). Decedents received more organs procured regionally, which reflects allocation rules prioritizing the sickest candidates (Status 1) first, followed by candidates listed in the donor's region with highest MELD/PELD scores [23].

Patient survival
Model estimation. In estimating univariable Gray's TVC models, serum sodium was excluded because of missing values (n = 883). Of the remaining covariates, 13 were significantly associated with overall survival: recipient age, race/ethnicity, primary source of payment, liver disease, total bilirubin, creatinine, encephalopathy, ventilator support, donor age, use of a living donor, use of partial/split organs, type of organ procurement (local, regional, or national), and cold ischemia time.
Covariates were considered jointly for multivariable estimation using backward stepwise selection. The final Gray TVC model (Table 4, left-hand side) included five of the original covariates (recipient age, liver disease, creatinine, donor age, and use of living donors), plus three other covariates that were recoded and retained: race/ethnicity, which was recoded as dichotomous (black race vs. nonblack race) so that the model would converge; and encephalopathy and ventilator support, which were combined into a single recoded variable. After the corresponding Cox PH model was estimated (Table 4, right-hand side), results for both approaches were plotted and compared (Fig 3).
Comparison of Gray TVC and Cox PH models. Covariate effects for both models are summarized graphically (Fig 3). For each covariate, a horizontal line at 0 (black dashed line) represents no relationship between the covariate and patient survival. Estimated hazards from the Cox PH model are constant (solid red line). Estimated log hazards and 95% confidence bands of Gray's TVC model (solid black line, plus grey shading) are superimposed to illustrate covariate effects when hazards are not held constant.
Fig 3. In each graph, the effect of the covariate is graphed over time based on the Cox proportional hazards model (red line) and Gray's time-varying model (black line, with 95% confidence intervals in gray). The model estimates can be compared to a log hazard of 0 (no effect; black dashed line). Of note, when Cox's assumptions are maintained, the Gray and Cox estimates will be the same and the red and black lines may be indistinguishable in the graph. https://doi.org/10.1371/journal.pone.0198132.g003
Liver disease. Biliary atresia was the most common diagnosis (47%) and served as the referent category. In comparison, survival of recipients with acute liver failure, autoimmune diseases, and metabolic disorders was similar to that of recipients with biliary atresia (the dashed line at 0 falls within the shaded bands). Only recipients with miscellaneous other diagnoses (e.g., cystic fibrosis, Budd-Chiari, neonatal hepatitis) have significantly poorer survival outcomes after transplant.
Recipient age. Recipient age at time of transplant has no effect initially; after 3 years, children who were older at time of transplant have a higher risk of death than those transplanted at younger ages. "Higher risk" is a relative statement: between any two age groups, older recipients have higher risk of death 3 years after transplantation. Although this statement appears self-evident, it bears mention because the cohort consists exclusively of children.
Creatinine at transplant. In contrast to the Cox PH model, the estimated hazards from the Gray TVC model show that elevated creatinine levels increase the recipient's risk of post-transplant death during the first 90 days. This risk declines afterwards and later becomes nonsignificant.
Encephalopathy/Ventilator support. If encephalopathy is present and/or use of ventilation is required at transplant, patients experience a higher risk of death, particularly in the first 6-12 months.
Race (black vs. nonblack). The Cox PH model indicates poorer post-transplant survival for black recipients at all time points, whereas the time-varying estimates show no differences initially but increased risk for black race over the long term.
Use of living donors. The Cox PH model shows immediate benefit to use of living donors. In the Gray TVC model, the initial survival benefit is not statistically significant; however, the advantages become apparent after 6 months and remain significant.
Donor age. Both models show that older donor organs increase risk of death and the risk is constant over time.
Example: Patient survival for prototypical recipient. Our findings reveal additional insights from time-varying methods, but clinical significance is harder to interpret. This section describes a pediatric transplant recipient with a typical combination of covariates (prototype) and generates survival curves to highlight covariate effects over time.
We defined a prototype using median values for continuous covariates and modal outcomes for categorical variables. Only covariates for the final model (Table 4) were considered. Hence, the recipient is 4.8 years old, non-black, with biliary atresia. At transplant, the recipient has creatinine levels of 0.48 mg/dL, has no encephalopathy, uses no ventilator support, and receives a graft from a 15.5-year-old deceased donor.
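A minimal sketch of this prototype construction, using only the Python standard library, is shown below. The function and field names are illustrative assumptions, and any data passed to it would need to come from the analysis cohort.

```python
from statistics import median, mode
from typing import Any, Dict, Sequence


def build_prototype(
    rows: Sequence[Dict[str, Any]],
    continuous: Sequence[str],
    categorical: Sequence[str],
) -> Dict[str, Any]:
    """Median of each continuous covariate, most frequent level of each categorical one."""
    prototype: Dict[str, Any] = {}
    for name in continuous:
        prototype[name] = median([row[name] for row in rows])
    for name in categorical:
        prototype[name] = mode([row[name] for row in rows])
    return prototype
```

Because each covariate is summarized separately, the resulting prototype is a representative combination of values and need not correspond to any single recipient in the data.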
We generated new survival curves and varied each covariate in turn, holding everything else at baseline values (Figs 4, 5 and 6). (Throughout the graphs, the prototype's survival curve is the same and appears first in the legend.) For example, in the first graph (recipient age), the prototype is 4.8 years old. Among children who are older (e.g., 9 years old) at time of transplant, the survival curve lies slightly above the prototype's, indicating better survival. Curves for younger children fall slightly below, but in general survival does not vary much by age. Moving through other graphs, we see that black recipients have worse survival than nonblack recipients. Recipients with autoimmune diseases are similar to those with biliary atresia, but other diagnoses show slightly worse survival. Although creatinine appeared significant in model estimates, survival curves at various levels of creatinine show no differences. On the other hand, the recoded variable for encephalopathy and ventilator support shows marked differences: recipients with neither condition have the best survival, whereas survival for those on ventilator support (with or without encephalopathy) is substantially worse. Living donation yields uniformly better survival, and donor organs from younger donors also show survival benefits.
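To illustrate how a survival curve follows from a time-varying coefficient, the sketch below computes $S(t \mid x) = \exp\{-\int_0^t h_0(u)\,e^{\beta(u)x}\,du\}$ for one covariate. It is a simplification rather than the authors' procedure: it assumes a piecewise-constant baseline hazard on the same intervals as the coefficient, and every name and number in it is hypothetical.

```python
import math
from typing import Sequence


def survival_probability(
    t: float,
    cuts: Sequence[float],
    baseline_hazard: Sequence[float],
    beta: Sequence[float],
    x: float,
) -> float:
    """S(t | x) with piecewise-constant baseline hazard and coefficient.

    cuts: increasing interval start times beginning at 0, e.g. [0, 90, 365];
          the last interval is open-ended.
    baseline_hazard[k], beta[k]: values that apply on the k-th interval.
    """
    cumulative = 0.0
    for k, start in enumerate(cuts):
        end = cuts[k + 1] if k + 1 < len(cuts) else math.inf
        exposure = max(0.0, min(t, end) - start)  # time spent in interval k before t
        cumulative += baseline_hazard[k] * math.exp(beta[k] * x) * exposure
    return math.exp(-cumulative)
```

Evaluating the function over a grid of times for the prototype, and again with one covariate changed, produces a pair of curves that can be compared in the manner described above. A coefficient that is nonzero only in the first interval separates the curves early, after which they decline at the same rate.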

Graft survival
Tables and figures for graft survival are provided in supporting information.
Model estimation. When these steps were repeated to evaluate graft survival, the same covariates were significant in univariable models, except for primary source of payment, use of partial/split organs, and cold ischemia time. We also retained the original race/ethnicity categories because the model converged without recoding. Covariates were combined for multivariable estimation, and non-significant covariates were excluded. The final graft survival model included the same covariates retained in the patient survival model (S2 Fig, S1 Table).
Comparison of Gray TVC and Cox PH models. Graft survival for recipients with metabolic disease and acute liver failure was similar to biliary atresia, but autoimmune disorders and the miscellaneous other category both showed increasing risk of graft failure. The increased risk is observed late (after 4 years post-transplant) with autoimmune disorders but occurs early (after 6 months post-transplant) in miscellaneous other. The effect of creatinine is also different: higher creatinine levels at transplant increase the risk of graft failure, and the risk does not dissipate over time. Recipient age and race have the same effects on graft survival as they did with patient survival. Use of living donors is uniformly better and use of older donor organs increases risk for graft failure. As with patient survival, ventilator support dramatically increases risk of early graft failure, and risks associated with encephalopathy are immediate and persist over time.
Example: Graft survival for a prototypical recipient. The series of graft survival curves for a prototypical recipient is provided in S3, S4 and S5 Figs.
The figures show that older children experience better graft survival than younger patients; black recipients have worse graft survival than other recipients. There are slight differences based on liver disease and no variation across creatinine levels. The greatest risks for graft failure occur with encephalopathy or ventilator support, or with the use of deceased donors. Risk also increases with the age of the donor.

Discussion
The transplant community grapples with demand for services that far exceeds donor organ availability, and more than 2,200 individuals die on the waiting list annually [24]. The simple fact that not everyone who needs a transplant can receive one drives researchers and providers alike to evaluate processes routinely and ensure effective and fair use of scarce resources [25].
Defining goals and the best use of resources is part of the conversation. In renal transplantation, researchers have advocated for shifting organ allocation priorities from the sickest to those who will benefit the most [26,27]. Similarly in liver transplantation, policies have evolved and now integrate local preference with medical urgency when allocating donor livers. Known as Share 35 and Share 15, the policies give priority to candidates with the greatest need and those who will benefit the most, respectively, but initial preference is extended to local and regional candidates before organs are offered nationally [28].
When considering these or other potential policy changes, accurate and robust predictions of survival (with and without transplant) are needed. Decision models and risk prediction equations are increasingly important for anticipating the impact of potential changes a priori. The current paper demonstrates advantages of time-varying approaches for estimating risk of death/graft failure in children after transplant. Overall survival was related to recipient characteristics (age, race, liver disease), severity of illness (creatinine, encephalopathy, ventilator support), and donor attributes (age, living vs. deceased donor). Beyond their importance in constructing post-transplant outcome models, these findings identify opportunities to improve pre-transplant management and highlight the challenges associated with outcome disparities related to age and race. Pre-transplant management that includes judicious use of blood products and colloid infusions and avoidance of nephrotoxic medications in settings of acute or acute-on-chronic liver failure may impact the frequency of ventilator support and renal insufficiency [29]. Racial disparities, also identified in pediatric intestinal failure and transplantation, are likely a result of myriad factors including access to care, insufficient insurance, geographic disparities, and cultural differences that require a broader collaborative to address [30].
Gray's TVC model provided insights that were, in some cases, surprising. Though we might expect ventilator support and creatinine levels to have waning effects on survival after the initial peri-transplant period due to survivor bias (similarities between patients who survive short-term exacerbations and those who never had problems at all), it is surprising that the effect of encephalopathy, another measure of illness severity, persists for much longer. Cross-sectional neurocognitive outcomes among children with acute liver failure identified lasting impairment of motor skills, attention, and executive function associated with clinical encephalopathy [31]. Whether these deficits affect adherence to medications and to office and laboratory visits, and thereby contribute to poor post-transplant outcomes, requires further study. Similarly, long-term survival advantages of living donors (versus deceased donors) and non-black recipients (versus black recipients) were significant and suggest key areas for future study.
The same covariates matter in graft survival. Using older donor organs appears to increase the risk of graft failure later, but many other putative donor characteristics were not significant.
After adjusting for these covariates, structural characteristics of the transplantation network mattered less than historically believed. Specifically, differences in organ procurement (local vs. regional sources) and cold ischemia time did not affect graft survival significantly, which is somewhat surprising given previous findings. Year of transplant was not significant either, suggesting that overall survival in children has been relatively stable. Where disease etiology matters (autoimmune, other chronic), the effect is not apparent in the short term but becomes important over longer periods of time. The combined effect of these covariates was illustrated through the patient prototype rather than looking at each covariate in isolation.
We encountered several limitations in our analysis, largely related to the covariates that were considered and the retrospective nature of the analysis. First, we had to exclude several covariates that had a high proportion of missingness, especially serum sodium which was not collected in OPTN data until November 2004. Data collection has expanded and these variables should be considered in the future. Second, using a stepwise procedure with backwards selection excluded several surprising variables from the final model, particularly cold ischemia time, transplant center location, and laboratory values measured at time of transplant. However, backwards selection also guards against highly correlated covariates, which may explain why variables were excluded. We therefore tested these exclusions in sensitivity analyses before finalizing the model specification. Third, our estimates relied upon retrospective information and should be validated prospectively. We also do not know yet how using more flexible prediction equations will affect the overall simulation or whether the predicted outcomes for pediatric recipients will be substantially different. Survival estimates are only one part of a larger model that needs to be considered in combination with simulation, cytokine networks and growth mixture models described previously [9], as a way to inform potential changes to allocation policies. That said, knowing that proportionality assumptions are not satisfied in this context, there is merit to using more appropriate survival models for future simulations.
The strengths of the paper include use of comprehensive, national data for pediatric liver transplant recipients, suggesting that the findings here are generalizable. One group of recipients was systematically excluded (those with a primary diagnosis of cancer) because unadjusted survival differed substantially from other diagnoses and was examined separately [18]. Using two alternative estimation methods is also a strength, demonstrating Gray's TVC flexibility as well as its consistency with Cox when proportionality assumptions are not violated. Though the transplant community may not be surprised by the covariates found to be significant in predicting post-transplant outcomes, our results show changes in their impact over time. In this regard, there were surprising findings, such as the effect of encephalopathy, which had an effect not only during and immediately following transplantation as expected, but also remained important throughout the follow-up period. Without application of time-varying approaches, we cannot discern these and other differential effects on survival.

Conclusion
For both patient survival and graft survival, the effect of several variables varied over time, which is consistent with our understanding of pathophysiology. However, the Cox PH model must "average" these changing effects over time, which misrepresents the impact of covariates on survival and potentially results in faulty decision making.
Our analysis demonstrated that standard Cox PH models are not appropriate statistical methods for predicting post-transplant survival. Effects of several important patient characteristics are not constant over time, and time-varying effects cannot be captured by the Cox PH model. Whatever transplant outcomes policymakers hope to optimize in the face of limited resources, good risk prediction tools are important for demonstrating the policies that will achieve those goals.