Predicting Relapsing-remitting Dynamics in Multiple Sclerosis Using Discrete Distribution Models: a Population Approach

Background: Relapsing-remitting dynamics are a hallmark of autoimmune diseases such as Multiple Sclerosis (MS). A clinical relapse in MS reflects an acute focal inflammatory event in the central nervous system that affects signal conduction by damaging myelinated axons. Those events are evident in T1-weighted post-contrast magnetic resonance imaging (MRI) as contrast enhancing lesions (CEL). CEL dynamics are considered unpredictable and are characterized by high intra-and inter-patient variability. Here, a population approach (nonlinear mixed-effects models) was applied to analyse of CEL progression, aiming to propose a model that adequately captures CEL dynamics.


Introduction
Multiple sclerosis (MS) is a prototypic autoimmune disease that affects the central (CNS) with a relapsing-remitting (RR) disease progression [1]. Clinical relapses in MS, acute symptoms that appear in episodic periods, are considered to be the reflection of focal inflammatory events in the white matter that disrupts neural conduction by damaging axons [2]. Clinical relapses are used to categorize different forms of the disease, i.e. RR versus progressive MS, as a marker to define the disease's disease progression and to measure the success of new therapies [2].
Magnetic Resonance Image (MRI) is a useful tool for understanding and following the disease progression in patients with MS [3][4][5]. The focal inflammatory events of the CNS that accompany a clinical MS relapse are evident on MRI recordings as contrast enhancing lesions (CELs) on T1-weighted images [6]. This kind of MRIs shows CELs four to ten times more frequently compared with clinically defined relapses [7]. That is, clinical relapses may not occur even if a CEL is observed. Therefore, CELs are more informative biomarker for disease progression than the Expanded Disability Status Score (EDSS). The natural history of a CEL is highly variable both within and between patients ( Figure 1). In MS, CELs and associated clinical relapses generally last for a month with spontaneous partial or full recovery afterwards. The CEL distribution over time has not been associated with any specific pattern or cause to date [2,8]. However, in one third of cases, relapses are preceded by either a stressful events and/or infections [9,10].
The number of CELs measured every month is a discrete response variable that can take only non-negative integer values ( Figure 1). Modelling such count data has been applied to different processes including anticonvulsant responses [11,12], incontinence [13], neonatal apnea [14] and epileptic seizures [15,16]. Commonly the Poisson distribution (PS) model is used to describe the data. The mean counts in an arbitrary time interval for the PS model can be denoted as l which can be influenced by several factors as drug effect, covariates (sex, weight, age…), disease progression, etc. The PS model has two restrictions: the mean (l) is equal to the variance of the data and the numbers of events occurring in non-overlapping intervals of time are assumed independent. This is a significant challenge as many counting outcomes show (i) bigger or smaller variability than that predicted by the Poisson model, a phenomenon called over-dispersion or under-dispersion respectively and (ii) lack of independence in the counts observed in previous intervals. Therefore, discrete distribution models other that the Poisson should be explored to evaluate this heterogeneity along with the evaluation of Markovian elements to adjust for correlation in counts between intervals. Identification of models that better characterize the distribution of CELs is relevant for two reasons. First, it provides a predictive framework for the relapsing-remitting dynamic observed in patients with MS. Second, it can serve to inform the design of longitudinal studies of this disease. While several count distributions have been proposed to model this type of data, the negative binomial (NB) distribution has been consistently found to provide one of the better fits to the data [17][18][19][20]. Although the NB has already demonstrated a very good fitting with this kind of count data, it might be the case that the election of a smaller interval period for the MRI acquisitions had produced a different analysis outcome. The best scanning interval and analysis for this outcome is not clear since monthly scans generally provided more prediction power but they are more expensive [21]. In this study we analysed the distribution of CELs developed by nine MS patients whom underwent monthly MRI for 48 months. Here we used the unique high resolution of this dataset for the exploration of other distributions adding other factors that might effect changes in the disease dynamic. Concretely, Markovian effects on the model parameters have been explored and quantified. In addition, how corticosteroids affect the lesions counts was also included during the model exploration and development. The short interval between MRI acquisitions (one month) shows an adequate time resolution to capture the relapsing-remitting dynamics of this disease.

Results
Several models were evaluated, resulting in sixteen key structural models based on seven different probability distributions. Based on the number of model parameters, the objective function (table 1) and the precision of the parameters the best  fitting distribution was the negative binomial (equation 10) overcoming the other explored distributions like (Poisson model, Poisson model with mixture distribution, Zero-Inflated and Generalized Poisson models and the Zero-Inflated Negative Binomial model). When the variance and the mean of number of CELs were calculated from the raw data by subject, the variances were greater in magnitude than the means for all but 1 patient ( Figure 2). A statistically significant difference in the objective function value (OFV) was observed when a first order Markov parameter h PDV , which modified the mean counts (l) parameter based on the counts observed from the previous MRI, was incorporated (NB MAK2 model, equation 4). When a second order Markov parameter h PPDV was included (NB nested MAK2 model, equation 5) relating the current mean count to the observed count two months prior, the statistical difference also significant, but the magnitude of the second order effect was less (h PDV .h PPDV ). The same decreasing pattern, h PDV .h PPDV . h PPPDV , was observed although the fit improvement was not significant when a third order Markov parameter was also included (NB nested nested MAK2 model, equation 6). Therefore, the best model was the negative binomial model (equation 10) with first and second order Markov parameters (equation 5): NB nested MAK2 model. Table 1 shows the parameter differences among models as well as the corresponding objective function values. Selected model (NB nested MAK2) estimates are listed in table 2 with the corresponding relative standard error (RSE %). The fixed effects parameters were estimated with adequate precision, however the RSE associated with the random effects was high as we expected (N = 9 subjects). All parameters and random effects variables for the sixteen models are listed in Table 1 with the objective function values.
The goodness of fit of the models was evaluated using simulation-based methods (see methods). Figure 3 shows the results for visual numerical predictive checks (VNPC) of several dynamic descriptors: (i) probability of having of 0, 1, 2, 3, 4, 5, 6, 7 and .8 CELs ( Figure 3A In order to evaluate better the predictive capacity of the NB nested MAK2, the prediction intervals for the dynamic descriptors previously defined were calculated ( Figure 4). The model captures the observed percentiles for a majority of the descriptors reasonably well. The CEL count distributions for the raw and simulated data for the selected model were also compared for model evaluation ( Figure 5). Figure 6 shows the prediction interval for variance versus mean of number of CELs with the patient data. The model is able to capture the relationship between the mean number of CELs and the variance of those counts. Based on these visual model evaluations it was concluded that the NB nested MAK2 model describes the observed data and better than other explored distributions.
The model was also validated with data from a second cohort. Model simulations for the maximum, median and mean of the number of CELs during 6 months were compared with fourteen patients with RRMS whom underwent monthly MRIs during a 6month pre-therapy phase ( Figure S1). The model captured reasonably well the median and mean although slightly underpredicted ( Figure 7A). The maximum number of active lesion was clearly under-predicted ( Figure 7A). The predicted interval for variance versus mean of number of CELs for a 6 month time window was slightly under-predicted for smaller means ( Figure 7B). The disease activity in the group of patients used for the model validation was higher than the group used for the model building; for example, the number of CELs per patient per month in average was 4.08 and 3.26 for model validation and model building respectively.
During the study, six patients were treated with immunomodulatory or immunosuppressive drugs (intravenous methylprednisolone at 1 g/day for 3-5 days or oral prednisone) for alleviation of clinical relapses. The months in which these patients received steroids are indicated in figure 1. A parameter for the effect of steroids was evaluated on: l 0 , OVDP, h PDV and h PPDV . Although the use of immunosuppressive drugs occurred infrequently, a steroid administration effect on the CEL events was quantifiable. A statistically significant improvement in OFV was observed when the effect was included on h PDV instead in l 0 . This relationship was further evaluated utilizing a randomization test approach (see methods); figure 8 shows 99.3% of the randomized schemes resulted in a higher OFV, highlighting the impact of the steroid effect (p-value = 0.007). Table 3 shows the estimates with the corresponding relative standard error (RSE %) for the selected model with the steroid covariate effect. Comparing values, the parameter values for l 0 , h PDV and h PPDV changed slightly to: 21.80%; 3.80%; and 2.75% respectively. The OVPD parameter dropped 14.86%, indicating that part of the over dispersion observed in the data might be due to the immunomodulatory treatment. The parameter h PDV , is diminished 66.44% when the patient was treated with immunosuppressive drugs for that month (h PPDV_S ). This result suggests that the use of steroids contributes to the inflammatory resolution of persistent CELs but does not affect the new CELs generated that month. Other implications of this result are indicated in the discussion section. The steroid covariate inclusion into the model, that is a within subject effect, did not explain the ISV associated to l 0 and h PDV . All fixed effects parameters were estimated with adequate precision except h PPDV . The RSE associated with the random effects were similar to the selected model with no steroid effect. The steroid effect was also evaluated for the following month that the patient received the treatment; however no significant effect was identified.

Discussion
The disease progression of CEL dynamics is highly variable both within and between patients. In this study we identified a discrete distribution model from a pool of candidate models that best described the distribution of CELs in these patients. Although several models were able to describe the data reasonably well, the negative binomial resulted in the best fit. The identified overdispersion indicates the presence of greater intra-patient variability (variance) in the number of CELs during a period of time (48 month, 48 points) than what is expected based on the mean.
All of the models had significant improvements in fit when the information about what happened in the previous months was incorporated (i.e., Markovian elements). Nevertheless the importance of previous observations diminishes with increasing time from the observation. This may be attributable to the fact that the CEL counts noted every month were the total number of CELs, and thus, older lesions observed in previous months might persist in the current one. Working under this hypothesis, the results suggest that such persistent CELs may last up to 2 months. In other words, this indicates that although the clinical relapse (symptoms that appear in episodic acute periods) usually last less than a month, the active inflammatory event might persist for a longer time.
It is well known that the focal inflammatory events in the CNS that accompany a clinical multiple sclerosis relapse show complex dynamics. There is the potential for a great deal of insight to be generated if mechanistic elements, e.g. balance between effector and regulatory T cell, were incorporated to these kinds of probabilistic models [22]. The idea would be to identify latent variables that explain variations in the mean counts (l). Unfortunately, with data available, we were not able to develop a more mechanistic model.
The selected model (called NB nested MAK2) describes the observed data well and better than other explored distributions ( Figure 3, Figure 4). Although all the parameters were estimated with adequate precision, the RSE associated with the random effects were high. Therefore, the estimated values for the ISV of the parameters l 0 and, h PDV are not well determined. However, as shown in figure 4, the prediction intervals simulated from the model for all descriptors of CEL response captured the observed response percentiles well.
The model was externally validated with data from a second cohort. Model predictions were adequate for the median and mean numbers of events, but the maximum number of events was under-predicted. The predicted interval for variance versus mean of number of CELs was also slightly under-predicted for smaller means. These under-predictions are probably due to the disease being more active in the data set used for the validation than in the group of patients used for the model building (the average number of CELs per patient per month was 4.08 in the validation set and 3.26 in the model building set). In theory, the level of overdispersion of the 6 months simulated data should be identical to that use to simulate 48 months of data. However, when comparing both simulated data sets (6 and 48 months), we realized that this is not the case. The lower overdispersion found in 6 months simulated data is an effect of the number of months (or number of observations) within the subject. As a matter of fact, the same phenomenon also occurs if we select any 6 consecutive months of the 48 months in the observed data. Therefore, higher overdispersion is expected for both simulated and observed data when larger time windows are used for the calculation of the mean and corresponding variance of the number of active lesions.
During the course of the study, six of the nine patients were treated with corticosteroids for clinical relapses. We explored a steroid effect on all model parameters: l 0 , OVDP, h PDV and h PPDV . A significant improvement in fit was found when the effect was included on h PDV instead of l 0 . This result suggests that the use of steroids contributes to the inflammatory resolution of persistent CELs (older CELs) but that it is not affecting to the newer CELs generated in the month after administration. Specifically, the model suggests that the use of steroids would help to resolve approximately the 66% of the persistent CELs. This result was further evaluated utilizing a randomization test strategy. One thousand randomized ensembles of the dose event architecture (see methods) were simulated to test this. Figure 8 shows the histogram of the OF values calculated (22LL) from the simulations. 99.3% of the randomized schemes resulted in a higher OF, highlighting the statistical relevance of steroid effect (p,0.007). The steroid drug effect was evaluated not only for the month in which the patient received steroids, but also for the following month. Although a decrease for l 0 when patients received steroids in the previous month was identified, this result was not significant. These results reflect the utility of this modelling approach for drug effect evaluation, providing a quantitative framework that can support the informed design of future longitudinal studies and other clinical trials.
The best probabilistic model, fitting the distribution of CELs developed by nine MS patients undergoing monthly MRI evaluations over 48 months, was developed. The information that can be extracted from this kind of count data depends on the resolution with which data are collected as well as the coincidence of the measurement paradigm with the CEL cycle. Other approach/methodologies have been applied with relative similar purposes [17][18][19][20][21][23][24]. Although in general the number of patients analysed in those studies was larger; the recording timings for the MRIs was not of sufficient resolution for capturing the CELs dynamic. The short interval between MRI acquisitions (one month) provides an appealing time resolution to capture the relapsing-remitting complex dynamics of the CELs. In this data analysis, an additional step was taken by applying the nonlinear mixed effect modelling approach. This provides a quantitative analysis of the data allowing the incorporation and quantification of both fixed and random effects. This approach takes into account the information from all patients simultaneously, defining both the population tendency for each parameter and the interpatient variability in that parameter. This methodology has been widely applied and evaluated in research fields such as pharmacometrics. It is an approach that is especially well suited for the analysis of repeated measurements. The selected model was comprehensively evaluated (OFV comparisons, goodness of fit plots, visual/numerical predictive checks, parameter precision…) and externally validated using data from a second cohort.

Patients and MRI Scans
The study was performed at the National Institutes of Health in Bethesda, MD, USA. The Intramural Research Board of the National Institute of Neurological Disorders and Stroke approved the study. Informed written consent was obtained from each patient. The MRIs were performed on a 1.5-T magnet (General Electric Medical Systems, Milwaukee, Wisconsin) using a standard head coil as previously described. At each monthly MRI, the total number of contrast-enhancing lesions (CELs) on T1-weighted post-contrast scans was identified by experienced radiologists (Figure 1). Clinical and imaging details about this cohort are described in detail elsewhere [25]. Nine patients with MS were sequentially enrolled. Patients were enrolled in the study if they had never been treated with immunomodulatory or immunosuppressive drugs, except for intravenous methylprednisolone at 1 g/ day for 3-5 days, or oral prednisone taper for a clinical relapse. In addition, patients were required to be able to complete monthly MRI scans and to have been steroid-free for at least 1 month at study entry. After a complete neurological examination, including rating disability using the Expanded Disability Status Score (EDSS) and initial MRI scan at baseline, patients were subsequently examined and imaged monthly. The number of total CELs in each month was calculated as the sum of all the CELs that were enhancing at that month for the last time. Thus, each CEL considered for the analysis was counted only once.

Data Analysis
Data were analysed employing the population approach using the Laplacian integral approximation method implemented in the software NONMEM version VII (Icon Development Solutions, Hanover, Maryland).

Models for Count Data
ii. Poisson model with Markov elements (PMAK). The Markovian element provides for the dependence of events across iii. The Poisson model utilizing a mixture distribution for individual observations (PMIX) (equation 7) incorporates an additional parameter, the mixture probability (MP) for an observation to belong to one of the two mixture distribution (characterized by two different ls) within an individual [26]: iv. Zero-Inflated Poisson model (ZIP) is a mixture model adapted for data presenting an excess of zero values and therefore needing an adaptation of the distribution that is otherwise extremely skewed. ZIP is a special case of PMIX where l 1 is equal to 0. It is composed of two equations depending on whether the random variable is a zero or a greater value and they include the probability P 0 of the observation being zero (equation 8 vi. Negative Binomial model (NB) is used when there is over dispersion due to latent heterogeneity (equation 10). The NB model is a mixture of the Poisson distribution when the mean follows a Gamma distribution [28] and is a function of l and a parameter which accounts for the degree of over dispersion called here OVDP. The mean is still l, but the variance becomes l6(1-OVDP6l). As OVDP approaches zero the NB model approaches the PS model. OVPD is restricted to be positive. First (equation 4), second (equation 5) and third (equation 6) order Markov elements were explored for this distribution models affecting to the l parameter, called NB PMAK2, NB nested PMAK2 and NB nested nested PMAK2 respectively.
vii. Zero-Inflated Negative Binomial model (ZINB). Equation 11 corresponds to the zero-inflated negative binomial. The parameters P 0 and OVDP have the same meaning as in the ZIP model. The ZINB model reduces to the ZIP, NB or PS model when P 0 , OVDP or both approaches zero, respectively Once the basic structure of the model was identified, the effect of the steroid treatments on the model parameters was evaluated. This was done for every model parameter, i.e., l 0 , OVDP, h PDV and h PPDV . The drug effect was evaluated not only for the month in which the patient received steroids, but also for the following month to evaluate longer-term or delayed effects.

Model Development and Selection Criteria
The minimum value of the objective function (OFV) provided by NONMEM, which corresponds approximately to 226log(likelihood) [22LL], served as a criteria for model comparison during the model development process. A decrease in 22LL of 6.63 points for one additional parameter, was regarded as a significant model improvement corresponding to p-value of 0.01 for nested models. The Akaike information criteria (AIC), calculated as AIC = 22LL+26NP, where NP is the number of parameter in the model, was used for selection among non-nested models [29]. The choice of the final model was based also on the OFV value, the precision of parameter estimates, and the results from model predictive performance where raw data and data obtained from model simulations were compared.

Model Evaluation
Models were evaluated in more detail as follows: (i) Visual Numerical Predictive Checks (VNPC). One thousand individuals were simulated using the selected models and their model parameter estimates. For the observed data and each simulated dataset the following descriptors were calculated: probability of having of 0, 1, 2, 3, 4, 5, 6, 7 and .8 CELs, maximum and mean elapsed time without lesions during the four years and number of cumulative CELs per year. For every descriptor, the increasing percentiles from 5 th to 95 th were calculated. The results for the raw and simulated data with the different models were plotted where X axis represent the different percentiles ( Figure 3). (ii) Predicted interval of VNPC. One thousand studies with 9 individuals per study were simulated using the selected model. The same dynamic descriptors that were described for the VNPC were used here. For every descriptor, the increasing percentiles from 10 th to 90 th were Figure 8. Analysis of the significance of the steroid effect by randomizing the dose events. One thousand new data files were generated by randomizing the doses event architecture while preserving the total number of dose events and the patient observations. The histogram shows the distribution of the OF values obtained using the selected model with the steroid effect when drug administrations were randomly generated. The OF value of the selected model with no steroid effect is marked in green. The OF value of the selected model with the covariate steroid effect, using the real dose moments is highlighted in red. doi:10.1371/journal.pone.0073361.g008 (11)  calculated. The 95% prediction interval by model was plotted with the data (Figure 4). (iii) Probability CEL distribution; observed data was compared to the probability distribution of simulated data generated by the selected model ( Figure 5). (iv) Predicted interval for variance versus mean of number of CELs; one thousand simulated individuals with the selected model were generated. The individual mean CELs counts and the individual variance for every patient were computed from the raw data. The same computations were then made for each simulated individual and year; for a total of 1000. The results were divided into 20 intervals for the mean of CELs, with each interval containing 50 simulated subjects. For each interval, variances were binned, and the median and 5 th -95 th percentiles were calculated. Finally, the overall median and percentiles were represented graphically together with those corresponding to the raw data ( Figure 6).

Model Validation
Fourteen patients with relapsing-remitting multiple sclerosis underwent monthly MRIs during a 6-month pre-therapy phase. None of these patients were treated with any immunosuppressive therapy before the first scan. On each MRI, the total number of CELs was noted ( Figure S1). This open-label study was performed at the National Institutes of Health, Bethesda, Maryland, with approval from the institutional review board [30]. Three descriptors were used for model validation ( Figure 7A): (i) maximum number of CELs, (ii) median of the CEL counts and (iii) mean of the number of CELs collected during the 6 months pre-therapy phase. Moreover, the predicted interval for variance versus mean number of CELs for a 6 time window by the selected model was compared with the same values from the data set ( Figure 7B).

Randomization of the Steroid Dose Administrations
A randomization test to calibrate the false positive rate for the evaluation of the steroid effect was conducted. Specifically, one thousand new data files were generated by randomizing the dose event architecture while preserving the total number of dose events and the patient observations.