Joint Modelling of Survival and Emergency Medical Care Usage in Spanish Insureds Aged 65+

Background We study the longevity and medical resource usage of a large sample of insureds aged 65 years or older drawn from a large health insurance dataset. Yearly counts of each subject's emergency room and ambulance service use and hospital admissions are made. Occurrence of mortality is also monitored. The study aims to capture the simultaneous dependence between their demand for healthcare and survival. Methods We demonstrate the benefits of taking a joint approach to modelling longitudinal and survival processes by using a large dataset from a Spanish medical mutual company. This contains historical insurance information for 39,137 policyholders aged 65+ (39.5% men and 60.5% women) across the eight-year window of the study. The joint model proposed incorporates information on longitudinal demand for care in a weighted cumulative effect that places greater emphasis on more recent than on past service demand. Results A strong significant and positive relationship between the exponentially weighted demand for emergency, ambulance and hospital services is found with risk of death (alpha = 1.462, p < 0.001). Alternative weighting specifications are tested, but in all cases they show that a joint approach indicates a close connection between health care demand and time-to-death. Additionally, the model allows us to predict individual survival curves dynamically as new information on demand for services becomes known. Conclusions The joint model fitted demonstrates the utility of analysing demand for medical services and survival simultaneously. Likewise, it allows the personalized prediction of survival in advanced age subjects.


Introduction
Rising rates of longevity are closely tied to the growth in demand for medical attention and long-term care services, the increasing usage of which extends longevity even further. Joint models of longitudinal and survival data enable us to estimate the simultaneous association between survival and the demand for care, and serve as a tool for predicting risk of death based on personalized medical records. Thus, as an individual's risk of death increases, the probability of their being hospitalized increases and having been admitted to hospital, their probability of death is also greater than that of a non-hospitalized individual.
The aim of this paper is to examine the use of joint modelling techniques for predicting survival. Our motivating dataset includes 39,137 Spanish insureds aged 65 and over, on which we conducted a longitudinal and time-to-death monitoring study from 1 January 2006 to 1 February 2014. Whereas classical survival prediction is based solely on patients' past medical records [1,2], joint modelling has the advantage that intensive care usage and survival are modelled simultaneously [3][4][5][6], by assuming a zero-mean latent Gaussian process that correlate the two settings [7][8][9]. Moreover, recent and current records, as opposed to information about medical usage that occurred many years ago, have a greater incidence on an individual's current health status, suggesting that information should be weighted with an appropriate decreasing function. Here, we investigate the role of the information contained in medical records and identify a fading effect, whereby more recent records have a greater influence than older medical records on the risk of death.
Medical emergency claim counts serve as a measure of annual medical care usage and, as such, of a patient's health status. Changes in these biomarker counts allow us to make dynamic subject-specific predictions, that is, personalized survival curves can be systematically updated as new measurements are collected. Thus, the joint modelling scheme proposed here is a powerful, dynamic tool, especially applicable to subjects afflicted by chronic processes [10][11][12]. In summary, joint models allow us to: (i) estimate both longitudinal and survival parameters without the bias introduced by a separate analysis; (ii) confirm that age and gender affect survival changes, even when controlling for hospitalization and emergency care unit visits; (iii) test the significance of the degree of association between care usage and survival; (iv) estimate subject-specific survival probabilities; and (v) update personalized survival estimations as additional longitudinal data are collected.

Health insurance dataset
Our study examines a large dataset provided by a Spanish medical mutual company, from which we collected a random sample of policyholders aged 65 and above. All patient records/ information was anonymized and de-identified prior to analysis. The UB Riskcenter's ethics committee approved this retrospective study.
The data contain historical information on claims reported between 1 January 2006 and 1 February 2014 by 39,137 policyholders (aged 65+), 39.5% of whom were men and 60.5% women. The time of origin for each policyholder's record is taken as their date of entry into the study (t = 0), and from that point on all subjects continued to be monitored until the end of the study period. The dataset provides reports on all the claims made by each policyholder in eight consecutive one-year windows. As our primary interest lies in non-routine care, usage counts only include hospitalizations, emergency service visits and the use of ambulance services. The data can be consulted at www.ub.edu/riskcenter/R/jm.
The mean age of policyholders entering the study is 73.2 yr for men and 74.2 yr for women. It is well documented that women have a longer average life expectancy and worse health status [13][14]. Data presented in Table 1 show the frequencies and row percentages for age-at-entry groups and by gender. Observed deaths are also presented. In the 65-to-74 age-at-entry group the number of deaths is higher for women than for men (261 vs. 207), but as the number of insured women (13,121) is much higher than the number of insured men (9,370), the death rate is lower for the former (2.0% vs. 2.2%).
Overall, the date of death was recorded for 3,129 individuals during the follow-up period, so that 92.0% of policyholders survived or were no longer in the sample at the end of the study. Although policyholders aged over 85 represented just 9.1% of the dataset, they accounted for 36.3% of deaths. As for the actual length of the follow-up, almost half of all policyholders were included throughout the observation window. As such, some of the right-censored profiles result from the administrative closure date, with some policyholders having left the insurance company. The censoring mechanism, other than that attributable to the end of the study, can therefore be assumed to be unrelated to the death-event, as it refers to insurance cancellations attributable to reasons of an external nature (primarily dissatisfaction with the service, a change of company or an unwillingness to pay).
We consider the demand for health claims in the year preceding the observation point as our longitudinal outcome, that is, the number of times a subject requests hospital, emergency room or ambulance services. This was introduced in the model using a logarithmic scale to allow for a log-linear specification [15]. In the survival approach, the baseline covariates are age0 (age at entry) and sex (man:0; woman:1). The former plays an essential role by providing a suitable survival characterization for late entries, i.e. policyholders who enter the study after the age 65, whereas the sex covariate allows us to distinguish different patterns of behaviour by gender.

Longitudinal approach
The observed longitudinal response included in the model summarizes a policyholder's past demand for hospital and emergency services and corresponds to an accumulation of past claims. Such observed data are transformed as the logarithm of one plus the number of services demanded during the year prior to the observation point. This is denoted by y ij = {y i (t ij ), j = 1,. . .,n i }, where i is the individual and n i is the number of measurements made of a particular person. The specific occasions when the policyholder is measured are denoted by t ij . A standard log-linear mixed model is used to explain the mathematical expectation of service usage, which we denote by m i (t). A Gaussian approximation is assumed in the response [16]. Table 1. Description of policyholders, deaths and death rate distributions by age-at-entry and gender. Number of subjects (frequencies and row percentages), deaths occurring prior to study completion (frequencies and row percentages) and percentage death rates are stratified by age-at-entry and gender.

Gender
Age at entry (years) In order to give greater weight to more recent claims than to older claims, F ðÁÞ is defined as the cumulative weighted area under the claim's historical path. Since we assume that the historical effects fade over time, we define where wðÁÞ is the weighting function [17,18]. Alternative weightings for capturing the fading effect of service demand on survival were tested. For the random effects structure, we firstly considered a random intercept model (a single random effect). Additionally, a random slope effect via a bivariate random effects structure has been also tested.

Survival approach
Given our particular interest in discerning gender survival differences among elderly policyholders, it is highly informative to know the distribution of the true event times for the i-th policyholder, T i Ã , defined in our study as a non-negative random variable that includes the time lag from the point a policyholder enters the study until their death-event. This approach involves the probabilistic distribution of the timing of death, considering the proportion of living subjects beyond time point t via the corresponding survival function,S(t) = Pr (T Ã > t). A corresponding potential right-censored time C i is assumed since some policyholders may still be alive when their coverage terminates, so that in practice we can only observe event times, T i ¼ min fT i Ã ; C i g. Under the above assumptions, the Cox proportional hazards model [19] can be defined as: where h 0 (t) is the baseline risk function, w i is the vector of baseline survival covariates and γ the vector that contains the corresponding regression parameters. Complete information of this nature can only be known at fixed time points. Although h 0 (Á) remains traditionally left unspecified, this condition was relaxed so that subject-specific predictions could be made more easily. Thus, the logarithm of the baseline function was approached using a cubic B-spline basis function [20].

Joint model with recency-weighted cumulative effects
The joint model is specified as both a longitudinal and a survival process simultaneously, where α is the parameter that captures the effect of the policyholder's background health status on her survival. Therefore, the joint model can be defined as where M i ðtÞ ¼ f m i ðsÞ; 0 s tg denotes the complete policyholder-specific history of the expected longitudinal response until time t and F ðm i ðtÞÞ are the recency-weighted cumulative effects of the longitudinal process. In practice, this captures the influence of past claims on the survival model. Moreover, for the intercept and slope random effects, b i = (b i0 , b i1 ) T , a bivariate normal distribution with zero mean and unknown unstructured variance-covariance matrix D is assumed, where The Akaike Information Criterion (AIC) was used as a global goodness-of-fit measure and its value served as the basis for ranking the fitted models. The lowest AIC value corresponded to exponentially weighted records, which points to the marked impact of medical services used in the last year of life (accounting for the overall weight) on the death hazard rate. All analyses were performed with R statistical software (www.rproject.org) version 3.2.2 [21], using the JM package [22]. Moreover, to compare the prediction performance we also used JMbayes package [23] to implement the Bayesian approach.

Individualized survival predictions
One of the key features of joint modelling techniques is that personalized predictions for the time-to-event outcome can be obtained from a consideration of all subject-specific longitudinal measurements. Joint modelling means that longitudinal outcomes preceding present time t are directly related to current mortality probability as well as to the probability of surviving at a later time u conditional on being alive at t. As such, an accurate knowledge of a subject's past health problems is essential for providing personalized survival predictions.
Let us consider a generic policyholder aged over 65 not included in the original dataset D n ¼ T i ; d i ; y i ðtÞ; i ¼ 1; . . . ; n f g ; but within the target population. This individual has a historical set of emergency medical care records Y k ðtÞ ¼ fyðsÞ; 0 s tg as well as a particular vector of survival baseline covariates w k . Thus, the prognosis task relies on estimating the survival probability according to the joint generalized linear model and subject-specific information. Let us assume that the subject is alive at time t, then our primary objective is to estimate the probability of her remaining alive at any future time u > t, By adopting a Bayesian strategy, the above expression can be estimated from the corresponding posterior as where the first term of the above integral is obtained through conditional independence between the longitudinal and survival responses, and the posterior distribution pðθ j D n Þ is derived from the asymptotic approximation θ j D n $ N ðθ; VðθÞÞ via Markov chain Monte Carlo (MCMC) sampling, whereθ is the maximum likelihood estimate for the true parameter vector θ. Finally, a Monte Carlo estimate of π k (u|t)is obtained by combining the previous assumptions.

Results
We first present the estimation obtained when applying the joint model and compare these results with those obtained under a separate estimation of the longitudinal and survival submodels that ignores the dependence between these components for each subject. The maximum likelihood estimates of the main parameters from both strategies are presented in Table 2.
The random effects in the longitudinal mixed model provide robust estimations for both separate and joint approach, with very little difference in the parameter estimates or in their corresponding standard errors. The estimates of the population-averaged parametersb 0 and b 1 are both positive and describe an overall upward trend. In the case of the survival submodel, in contrast, a difference in the estimation is detected in the case of the joint model specification. A marked improvement is also recorded in the significance of the baseline survival parameters, as indicated by the smaller p-values, and the readjustment of the estimated coefficients. However, the main result is the presence of a significant association between longitudinal demand and the event outcome, since parameter α is strongly significant and positive, α = 1.462 (p-value < 0.001). Thus, we infer an increasing relationship between the frequency of use of non-routine medical services and the risk of death. For the random effects structure, we firstly considered a random-intercept model (a single random effect). However, the posterior inclusion of a random slope effect indicated that the rate of change in the claiming evolution was significantly different form policyholder to policyholder, as reflected by the likelihood ratio test (p-value < 0.001).
As for the structure of the joint model, we assume that an integrated measure of the claims history data is the most appropriate way to incorporate the claims data "biomarker" in the joint modelling framework. Although this appears reasonable, we would also expect the rate of rise in the actual claims history, e.g. the number of new claims made in a given period of time, to also provide information. Policyholders presenting a more rapid rise in their claims process are likely to be in poorer health, otherwise they would not be seeking medical attention. Thus, this measure appears to account for the "fading" behaviour captured by recency-weighted information on claims. We have explored the sensitivity of these inferences to the derivative (slope) of the claims history process, further improvement is obtained from extending the standard joint model parametrization (that is, by considering the current biomarker value) to the rate parametrization (p-value = 0.094). In the case of policyholders with the same age at study entry, the estimated hazard ratio of death for females relative to males is exp(-0.722) = 0.486, i.e. (1-0.486)Á Table 2. Comparison of the estimates obtained from the separate analyses and from the joint model. Parameter estimates for the models applied to the health insurance dataset (n°of cases = 39,173). The longitudinal process captures the log-transformed demand for hospitalization, emergency room care and ambulance services in the year preceding observation and the event process (defined by an exponential weighting function) models survival. On the other hand, in the joint model when considering policyholders with the same level of claims and same age at study entry, the estimated hazard ratio of death for females relative to males is exp(-4.648) = 0.010, which is now significantly lower (p-value < 0.001). The four panels in Fig 1 depict an example of the annual updating at each measurement point in both the subject's biomarker evolution and her estimated survival curve. Taking the age of 88 as our threshold reference, the plots show a declining survival probability due to an increasing cumulative area under the logarithm of hospital/emergency health service demand. Thus, a more marked expansion of the shaded area reduces the probability of staying alive until the age of 88. The data structure assumes the absence of observed deaths after the given closing date, which impedes making predictions beyond that date. In spite of the short exposure considered here, the monitored period is long enough for us to infer once again a significant association between care usage and time-to-death. Note that the inclusion of new data will not only reduce the standard error of the estimates, but also extend the time horizon of the survival curves. To illustrate the difference between the joint and separate survival models, it can be seen in Table 3 that the woman's survival probability at age 88 is 0.790 by joint modelling techniques, but that the estimated value is 0.755 in the separate survival model and would remain the same regardless of information on current and past emergency claims. The performance of the joint model also differs from that of the separate survival model; thus, if we compare the prediction error of a joint model (at time 7, the estimated prediction error with absolute loss function is 0.064 when using information available up to time 5) with that of a separate survival model (at time 7, the corresponding estimated prediction error with absolute loss function is 0.070), we can confirm that the joint model performs better in terms of prediction because the error is smaller.
The size of our sample is large, although small in comparison to the health claims data typical of other contexts (e.g. the US, Canada or the NHS in the UK); therefore, computational time may be an issue when implementing joint models. Here, for instance, the model estimation takes about 21 minutes on a standard PC. The Bayesian approach requires considerably more time, which is why we address Bayesian illustrations with a smaller random sample (specifically, a 10% random fraction). For this smaller sample, the estimation time for the frequentist approach is 79.1 seconds, while it is 10.2 hours for the Bayesian approach. Thus, the Bayesian approach requires much more computational time on a standard PC than is required by the frequentist approach.
We report the model estimates from the Bayesian and the frequentist approaches in Table 4, for this smaller sample of 3,915 policyholders. In particular, we performed three MCMC chains with 35,000 iterations, of which 5,000 were used for the burn-in period. The results provided by the two approaches do not vary substantially. By switching to a Bayesian approach, besides estimating the model, we also calculate numerical measures evaluated by MCMC methods (see the documentation of the JMbayes package [23]). The prediction error for the smaller sample under the Bayesian approach is 0.065 by considering an absolute loss function, which is exactly the same as that obtained under the frequentist approach for the same sample size.

Discussion
An increasing number of statistical studies report the individualized monitoring of timedependent covariates prior to the occurrence of a particular event. Here, we have shown that the joint analysis of the individualized and of the true longitudinal evolution over the lifetime is not only indispensable for detecting the strength of association between two responses, but additionally helps improve parameter estimates considerably by circumventing the potential risk of bias caused by the incomplete follow-up of certain subjects. The key feature of the joint approach lies in the inclusion of longitudinal information within the standard survival model by means of a latent Gaussian process, which is specifically materialized in their shared random effects purpose. The number of events in our claims dataset is in fact low and most of the patients are actually still alive by the end of the observation period. This is the most likely explanation for the Table 4. Comparison of the estimates obtained from the separate analyses and from the joint model with the frequentist and the Bayesian approaches. Parameter estimates are based on the small health insurance dataset (n°of cases = 3,915). The longitudinal process captures the demand for hospitalization, emergency room care and ambulance services in the year preceding observation and the event process (defined by an exponential weighting function) models survival. numerical identity of the coefficients of the longitudinal process, when modelled both separately and jointly. The association between gender and the survival model is shown in the joint model and not in the separate model, and this is probably due to the variable use of heath care resources, as men usually report fewer claims than women. The mutual health insurance sector, which stores large portfolios of individual policyholders monitored over long periods of time, provides a benchmark example of the potential applications of joint modelling techniques that should broaden our understanding of the mechanisms underpinning health progression trends. Indeed, insights into the underlying trends at this individualized level should enable professionals to provide more accurate service demand predictions and to adjust the risk of mortality to better capture the baseline factors of age and gender. To date, joint modelling techniques have usually been used with quite small datasets (in the order of a few hundred individuals) and it is not easy to find applications to larger sets or discussions of the computational obstacles that need to be overcome.
This study has reported the challenging task of implementing a single joint model with a large data sample (policy holders aged 65 and over) and it has demonstrated a statistically significant dependence between a subject's past medical care usage (their use of ambulance and emergency services and admissions to hospital) and their current hazard of death. While we have no information about the number of days the subjects spent in hospital or in the emergency room, nor about the condition that required their seeking healthcare, we are able to provide a personalized survival prediction. Information on routine medical visits is specifically not used here.

Conclusion
The results reported here, as illustrated by a real case, demonstrate that likelihood inferences based on a joint procedure are efficient and less biased than those obtained from the classic separate approach. Thus, they present obvious benefits over standard survival techniques. The positive sign of the association parameter shows that relatively high cumulative demand for emergency room, hospitalization and ambulance services is simultaneously related to a deterioration in the subject's health status and, consequently, to lower probabilities of survival. Specifically, an increase in accumulated claims can be quantified as an increase in mortality risk, for an initial age and gender thus leading to personalized survival curve prediction.