Structural and practical identifiability of contrast transport models for DCE-MRI

Contrast transport models are widely used to quantify blood flow and transport in dynamic contrast-enhanced magnetic resonance imaging. These models analyze the time course of the contrast agent concentration, providing diagnostic and prognostic value for many biological systems. Thus, ensuring accuracy and repeatability of the model parameter estimation is a fundamental concern. In this work, we analyze the structural and practical identifiability of a class of nested compartment models pervasively used in analysis of MRI data. We combine artificial and real data to study the role of noise in model parameter estimation. We observe that although all the models are structurally identifiable, practical identifiability strongly depends on the data characteristics. We analyze the impact of increasing data noise on parameter identifiability and show how the latter can be recovered with increased data quality. To complete the analysis, we show that the results do not depend on specific tissue characteristics or the type of enhancement patterns of contrast agent signal.


Introduction
Contrast agent transport models have been developed and used for decades in the analysis of concentration time course data derived from dynamic contrast-enhanced magnetic resonance imaging (DCE-MRI), and they are essential elements of many computational clinical applications [1][2][3][4][5][6][7][8][9].DCE-MRI transport models are based on linear ordinary differential equations with compartments describing the concentration of the contrast agent (CA) in tissue and vasculature.These models assume the compartments are well-mixed, i.e., contrast agent distributes evenly throughout the compartment instantaneously, such that contrast concentration is only a function of time, and not space, within an individual voxel.Several models have been formulated [10] based on different assumptions and simplifications to describe the blood-tissue exchanges of the administered contrast agent between compartments.The kinetic parameters derived from these models have been applied to a wide range of biological settings, including cancer, and have provided compelling diagnostic and prognostic value [11] as well as detecting early response to treatment [12].
Here we consider the structural and practical identifiability of four nested contrast transport compartmental models frequently used to analyze DCE-MRI data.Identifiability is a fundamental property to create models able to capture the dynamics shown in the data with welldetermined parameters [13,14].The issue of model identifiability revolves around the question of whether it is possible to use data to accurately and uniquely estimate parameters in the model.Generally, the larger the number of compartments in a model, the higher the accuracy, but at the cost of higher analysis complexity.Moreover, the more parameters are included, the higher the probability of having a non-unique combination of model parameters that can accurately characterize the dynamics, underpinning the concept of structural identifiability.Therefore, there is a need to investigate how the kinetic parameters involved in the class of nested models used in DCE-MRI analysis can be determined, and what affects their identifiability, even in simple cases.Model identifiability becomes especially important for biological systems due to the limited availability and quality of the data [15][16][17].Moreover, the amount and the quality of the data have a strong impact on the identifiability of some parameters and, thus, on model outcomes and predictions.
Structural identifiability determines if the model structure is well-defined, however it is not sufficient to ensure the robustness of model outcomes when dealing with experimental biological data to inform the model.In fact, it does not guarantee successful parameter estimation when, for instance, there is a limited amount or a quality of the experimental data, with large measurement noise.This is where the study of practical identifiability takes importance and a careful analysis of both types of identifiability becomes vital.In this work, we analyze the entire family of nested contrast transport models from both structural and practical viewpoints.To the best of our knowledge, few works in the literature have analyzed both structural and practical identifiability aspects of transport models used for DCE-MRI data, mainly focusing on the practical identifiability and only for a subset of these models [18][19][20][21][22][23].
This work is organized as follows.In Section 2, we describe the theoretical aspects of structural and practical identifiability, DCE-MRI data, and the class of considered models.Next, in Section 3, we provide structural and practical identifiability analysis of the most complex member of the family of models, the Leaky Tofts-Kety model, applied to glioblastoma brain cancer data.We comment on the outcomes and implications in Section 4. Finally, in S1-S3 Text, we provide results concerning the nested family of sub-models as well as additional breast cancer data.

Ethics statement
The study of publicly available data and retrospective analysis of City of Hope patient data were approved with a waiver of consent by the City of Hope Institutional Review Board, Protocol 15286.

Theory of nested DCE-MRI transport models
Compartmental models of contrast transport for DCE-MRI are constructed assuming that the total tissue CA concentration C t (t) can be modeled as the sum of the CA concentration in the plasma space (PS) (C p (t)) and in the extracellular extravascular space (EES) (C e (t)), i.e.
where v e and v p are the fractional EES and plasma volumes, respectively.The explicit description of the CA concentration in each of these compartments depends on the assumptions made to estimate both C e (t) and C p (t) and, thus, on the chosen model, as illustrated in Fig 1A .The CA concentration for the plasma compartment, C p (t), is assumed to be not affected by local transport and it is given by the empirical vascular input function (VIF): The dynamics of the VIF are characterized by a sharp uptake, followed by a peak value, and then a washout dynamic (Fig 1B).Several techniques have been proposed to measure the VIF [10,24].Here we consider two different estimation methods.First, we consider a populationbased analytical expression of the VIF, commonly referred to as the Parker VIF [25], and second we estimate the VIF directly from the MRI signal [26].The Parker VIF uses a mixture of 2 Gaussians plus an exponential modulated by a sigmoid function: Here A n , T n , and σ n are the scaling constants, centers, and widths of the n th Gaussian, α and β are the amplitude and decay constant of the exponential, and l and τ are the width and center of the sigmoid, respectively.For the second method, VIF estimation is performed by drawing a region of interest (ROI) on a major vessel in the tissue and directly quantifying the contrast agent concentration, assuming the CA bolus arrives simultaneously to the entire tissue of interest.Population-based VIFs are widely used in DCE-MRI due to their simplicity and the fact that they do not require additional measurements, though they may ignore inter-subject variability.In contrast, the accuracy of individual-based VIFs derived from estimations of the CA concentration in large vessels depends on MRI characteristics, which may contribute to identifiability issues, as will become apparent through our analysis.
For the evolution of the CA concentration in the EES compartment, and, thus, to derive the expression for C t (t), we analyze a class of nested compartmental models consisting of the Patlak model (PM), Tofts-Kety (TK) and its extended version (eTK), and the Leaky Tofts-Kety (LTK) model.The Patlak model, firstly introduced in [27], represents the simplest tissue uptake model and it assumes that CA diffuses from the PS to the EES at a rate governed by the forward transfer constant K trans .It neglects the reflux from the EES back into the plasma space due to the assumption on low permeability and short measuring time.Thus, the concentration into the EES compartment varies according to v e dC e dt ¼ K trans C p ð4Þ and, assuming that the initial concentration in the EES is zero (C e (0) = 0), from Eq (1), the tissue CA concentration is given by To overcome the low permeability hypothesis of the Patlak model the Tofts-Kety model was introduced in [28,29].The TK model accounts for bidirectional transfer of the contrast agent from the plasma to the EES compartment at rates governed by the forward transfer constant K trans and the reverse constant , respectively.However, it ignores the intravascular (plasma) compartment contribution (i.e., v p � 0).The dynamics of the CA concentration in the EES compartment are described by The transfer constant K trans now has a different physiologic interpretation, depending on the balance between capillary permeability and blood flow: in high-permeability situations, K trans is equal to the blood plasma flow per unit volume of tissue; in low permeability scenarios, K trans is equal to the permeability surface area product between blood plasma and the EES, per unit volume of tissue [29].Under the assumption that v p � 0 and setting the initial concentration in the EES to zero (C e (0) = 0), we can derive from Eq (1) the expression of the tissue CA concentration as The hypothesis of tissue being weakly vascularized (v p � 0) may be invalid in some situations, especially in the case of highly vascular tumors.Thus, a modification of the TK model was proposed in [30].Known as the extended Tofts-Kety model, it includes the vascular contribution v p C p (t) to the overall CA tissue concentration, maintaining the same dynamics shown in Eq (6) for the EES concentration C e (t): The TK and eTK models have experienced some problems to obtain reliable estimation for the involved parameters when the data depict a persistent uptake curve (Type I) [31].The PM model, with its assumption on unidirectional flux from the plasma to the EES compartment, is able to overcome this issue, but assuming a negligible reflux from the EES back into the plasma space is not realistic, especially in the high permeability regime [32].The Leaky Tofts-Kety (LTK) model was introduced to address this issue [31].The LTK model, also known as Leaky Tracer Kinetic model (LTKM), adds an additional compartment, the leakage compartment, to the already described plasma and permeable compartments.The permeable space and the leakage space are considered subcompartments of the EES region.The former assumes a bidirectional exchange between plasma and EES, while the latter considers a unidirectional flow from which the contrast, at a concentration C L (t), does not flow back into the vasculature.Thus, for the LTK model, the overall CA tissue concentration is given by The contrast agent in the permeable compartment C e (t) evolves following Eq (6), while the rate of contrast change in the leakage space can be read as Here, λ is the volume transfer constant from the plasma compartment to the leakage compartment.Assuming that the initial concentration in the EES and leakage compartments are zero (C L (0) = C e (0) = 0), then the CA tissue evolution is given by In summary, the main differences between these models are the directional flows for the CA exchange between the plasma space, the extravascular extracellular space, and eventually the leakage space.The Patlak model is a unidirectional model which assumes CA transfer from EES to PS is negligible due to low permeability and short measurement time [27].The Tofts-Kety model assumes that the CA diffuses from, and returns to the PS, and the vascular (plasma) compartment contribution is negligible [28].This assumption of tissue being weakly vascularized is overcome with the extended Tofts model, which includes the vascular contribution to the tissue concentration [30].Finally, the Leaky Tofts-Kety model extends the eTK, adding an additional leakage compartment with unidirectional flow for disease applications such as gliomas, where slow accumulation of CA from neighboring voxels is common [31].In all models where the vascular component is considered, it is assumed that the concentration of the contrast agent within the vasculature is not affected by local transport, (i.e. it is an empirical forcing function), and that the difference between the empirical vascular input function and the local tissue concentration is the primary driver of transport between tissue and vasculature.

Data
Two DCE-MRI scans were used in this study.A brain scan was obtained from a patient with pathology-confirmed diagnosis of glioblastoma who underwent MRI at City of Hope National Medical Center.The T1-weighted brain DCE-MRI scan was acquired as follows: TR = 5.1ms, TE = 2.1ms, variable flip angle, with field of view 240mm x 240mm, matrix size 128x128 and image size 256 x 256 with 12 slices with slice thickness 6mm, and a temporal resolution of 6.03 seconds with 32 dynamic phases and 3 baseline images prior to contrast administration.A breast DCE-MRI scan was obtained from the Quantitative Imaging Network (QIN) BREAST-02 study, UPN-01.The image acquisition details for the breast MRI can be found in the documentation for the BREAST-02 study [33,34].In both cases, a variable flip angle (VFA) scan was acquired for direct contrast agent quantification [35].The VFA scan was used to calculate the baseline T1 relaxation, T 10 , of the tissue, which is used for calculating the local contrast agent concentration [36].

Identifiability analysis
Identifiability analysis is a fundamental mathematical tool used to assess a model's capability to describe data with unique determined parameters.It focuses on whether it is possible to identify a unique vector of parameter values for a given model structure, or whether multiple parameter values will fit the data equally well.It is important to distinguish between structural identifiability, which concerns how the model structure itself affects the possible unique identification of the involved parameters, and practical identifiability, which is based on the analysis of the model's ability to estimate parameters from the data with adequate precision.
Structural non-identifiability implies practical non-identifiability: in fact, if a model is not structurally identifiable, then the quality of the data collected does not matter; it will not be possible to uniquely estimate parameters in practice.The converse, however, is not true.Structural identifiability implies practical identifiability only when there is availability of infinitely resolved data, with zero measurement noise.However, even if the model structure theoretically allows parameters to be estimated, one still needs to have the appropriate data to achieve practically identifiable parameters.
where xðtÞ 2 R n represents the state variables, yðtÞ 2 R m the measurable output (e.g., the data), h(�) the function that maps the states x to the observations y, uðtÞ 2 R r the external input function (in our case the vascular input function), and θ 2 R q the set of constant parameters.The dynamical system is structurally identifiable if each element θ i of the vector θ is structurally identifiable.This means that each of these elements can be uniquely determined from a given input u(t) and a measurable output y(t), i.e., yðxðtÞ; uðtÞ; θÞ ¼ yðxðtÞ; uðtÞ; θ 0 Þ ) y i ¼ y 0 i 8i ¼ 1; ::; q Alternatively, one element θ i of the parameter vector θ is structurally non-identifiable if varying its value does not necessarily alter the model trajectory y, as these changes can be compensated for by varying other parameters.In particular, a model is defined to be structurally identifiable if all of its parameters are structurally identifiable [37].A large variety of methods exist to assess structural identifiability of a system, from the Laplace transform [38], Taylor series expansion [39], seminumerical approaches [40,41], differential algebra [42][43][44], and numerical algebraic geometry [45], among others (for reviews of some of these approaches, see Refs.[16,46,47]).Due to the simple structure of the nested models considered in this analysis and described in Section 2.2, we apply a differential algebra approach.This approach is convenient for linear compartmental models because of the reduced number of state variables (i.e., C t (t)) and the direct relationship between state variables and observations (which actually coincides for this class of nested models).Further, this approach directly allows for testing identifiability, obtaining simple forms of possible identifiable combinations of the parameters, and identifying reparameterizations of the model when it is structurally non-identifiable.

Theory of practical identifiability.
As already briefly explained, a parameter that is structurally identifiable may still be practically non-identifiable.This issue arises frequently if the quantity and/or quality of the experimental data is insufficient.Assessing the practical identifiability of a model (or a parameter) is fundamental in obtaining reliable parameter estimations and, thus, for ensuring good model prediction capabilities.While the notion of practical identifiability is not uniquely defined in the literature, we consider a definition based on the concept of profile likelihoods and confidence intervals [48].Using this framework, we evaluate the agreement between the experimental data and the observable predicted by our set of nested models choosing as Maximum Likelihood Estimator (MLE) the objective (or cost) function defined as: Here, N is the number of time points t i available from the data, y D i are the CA data values at time t i , y M i ðxðt i Þ; uðt i Þ; θÞ is the CA value at time t i predicted by the model with the estimation θ of the parameter vector, and σ is the standard deviation of the noise of the data.In particular, this standard deviation is estimated as the inverse of the Signal-To-Noise (SNR) ratio of the data [21].The state variable x(t) represents the CA concentration C t (t) and u(t) is the vascular input function.The parameter vector θ used to evaluate the practical identifiability of the model is given by θ ¼ arg min½MLEðθÞ� : To determine the confidence interval of the estimated set of parameter θ, we consider the profile likelihood (PL) method.To build the profile likelihood plot, the idea is to choose one parameter θ i , vary its value around the maximum likelihood estimate ŷi , and re-optimize the remaining parameters.Thus, the profile likelihood is given by Then, the confidence intervals (CI) of estimated parameters are defined as where Δ α represents the α-th quantile of the chi squared (χ 2 ) distribution with k = 1 degrees of freedom for a point-wise confidence interval [37,49].Thus, given the optimal value ŷi of one parameter, defining the confidence interval to a confidence level α implies that the true value ŷi of the parameter is located within this interval with probability α.When this confidence interval is finite, then a parameter θ i is practically identifiable.Otherwise, if the confidence region is infinitely extended, although the numerical algorithm finds a unique minimum θ, then the parameter is practically non-identifiable.Precisely, we analyze the profiles of the MLEð θÞ versus each parameter θ i .The choice of a least squares function for the MLE is equivalent to working with an additive Gaussian noise measurement model with constant variance.We point out the reader that the assumption of additive Gaussian noise may not be the more appropriate under certain conditions, especially for data close to zero.We highlight that other choices of measurement error model may be considered in order to relax this assumption of additive Gaussian noise (see [50] for a recent review on the topic).In any case, the method of the profile likelihood analysis here implemented holds when different measurement models are used.
A structurally non-identifiable parameter is characterized by a flat profile likelihood, whereas the profile likelihood of a practically non-identifiable parameter may have a minimum, but does not have a well-defined confidence level α for increasing and/or decreasing values of θ i .Instead, for an identifiable parameter the profile likelihood exceeds such threshold in both directions of θ i .Fig 2 shows an illustrative example of likelihood profiles.Practical identifiability as determined by profile likelihoods methods are standard methods for determining the unique parametrization of a real-world model-data pairing.We refer the interested reader to [37] for a detailed primer on profile likelihoods and practical identifiability.Profile likelihood-based confidence intervals are often used in biological applications to ensure that there is sufficient data and data quality to assume the validity of the model [14,[51][52][53][54].With respect to other methods for determining confidence intervals, this particular method allows for asymmetric intervals that are invariant under re-parametrizations of the model.The analysis performed here refers to univariate likelihood profiles, where we are considering only one target parameter.For the reader convenience, it is worthy to mention that other options are available when pairs (or more) target parameters are considered.In particular, in this cases bivariate or higher-order profiles can be generated and studied [53].
In addition to the profile likelihood, we also analyze the compensating profiles of the model parameters.These are obtained by perturbing one parameter θ i around its maximum likelihood estimate ŷi and plotting the variation of the re-optimized parameters versus the perturbed θ i value.This analysis demonstrates how a non-identifiable parameter (whether structural or practical), may be compensated for by variations of other parameters.The small dimensionality of our systems allows us to use the described method for the analysis proposed in Section 3.2.We analyze the practical identifiability for the family of nested models described in Section 2.2 using a self-implemented code in Matlab based on the Particle Swarm algorithm for global optimization [55,56].The codes used for the analysis proposed in these notes are available in the corresponding GitHub repository: https://github.com/mconte93/Practical_Identifiability.git.

Results
Here we analyze the results concerning structural and practical identifiability of the most complex of the nested models, the LTK model described in Section 2.2.

Structural identifiability of the LTK model
Following the formalism of the differential algebra approach described in Section 2.4.1, model (10) can be rewritten as where y is the observable concentration of the contrast agent in the tissue C t (t), u(t) is the external input given by the concentration of contrast agent in the plasma compartment (i.e., C p (t) in ( 2)), x 1 and x 2 are the state variables for the CA concentration into the EES and leakage compartment, respectively.Differentiating the last equation in system (11) and plugging inside the first two equations of (11), we obtain the differential equation for y(t) of the form where v is defined as v ¼ R t 0 uðtÞdt and the coefficients a i are given by We observe that a 2 , a 3 , a 4 < 0, while a 1 > 0. Solving system (13) for the four parameters of the LTK model provides the following expressions: ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffiffi For v e to be well-defined we need a 2 � a 4 a 1 À a 1 , which always holds as the four parameters of the LTK model are positive.From this analysis, we can conclude that LTK model is structurally identifiable.The structural identifiability of the nested PM, TK, and eTK models has also been analyzed and details are provided in S1 Text.Moreover, for the reader's convenience, in S2 Text we provide an example of a structurally non-identifiable model obtained by modifying the LTK model structure and changing the connections among the compartments.

Practical identifiability of the LTK model
Practical identifiability is performed using the method described in Section 2. ; Type II-plateau curve-is characterized by an initial peak followed by a relatively constant enhancement; Type III-wash-out curve-refers to a sharp uptake followed by an enhancement decrease over time (Fig 1C, left plot).We consider two different tissues to study the role of the tissue characteristics on model identifiability.In both tissues, we replicate the analysis for the three different CA time-enhancement curves.We report results on the Glioblastoma multiforme (GBM) brain tumor data in the main text.
Results for the breast cancer data are presented in S3.1 Text.For practical identifiability analysis we distinguish between the following three cases of study (referring to the rows of Fig 3).
(AA).The first case is referred to as Artificial-Artificial.Here, we consider an artificial VIF given analytically by Eq (3) and three artificial data sets.These are generated by running forward simulations of the LTK model, given by Eq (10), with the parameter values listed in Table 1, and replicate the three CA time-enhancement curves.In particular, these artificial data sets represents noise-free experimental data.Thus, in this case, there is no noise in the VIF or in the data which could affect the parameter identifiability.For the VIF we set A 1 = 48.54mM × s, A 2 = 18.64 mM × s, σ 1 = 3.378 s, σ 2 = 7.92 s, T 1 = 10.2276 s, T 2 = 21.9 s, α = 1.05 mM, β = 0.0028 s −1 , l = 0.6346 s −1 , and τ = 28.98 s. (RA).The second case is referred to as Real-Artificial.Here, we consider a real estimation of the VIF directly from DCE-MRI data by drawing a ROI around a major vessel, and three artificial data sets.These are obtained by running forward simulations of the LTK model, given by Eq (10), with the parameter values listed in Table 2 and the real VIF.This real VIF includes noise, as compared to the analytic expression for the artificial VIF as in the (AA) case.
(RR).The last case is referred to as Real-Real.Here, we consider both a real estimation of the VIF and a real evolution of the contrast agent in the tissue.We analyze data from a GBM brain tumor and a breast cancer MRI dataset.
The three CA time-enhancement curves used in (AA), (RA), and (RR) cases are shown in Fig 3 (continuous black lines) together with the corresponding model fits obtained with the LTK model (10) (dashed red lines).The particle swarm (PS) method is used to estimate parameter values for fitting the three curve types and in the three cases of study (AA), (RA), and (RR).Values of fitted parameters are collected in Table A in S3 Text.
Recalling the methodology described in Section 2.4.2, we analyze profile likelihood and confidence levels for each of the parameters involved in the LTK model, in the three cases of study (AA), (RA), and (RR), and for the three types of CA time-enhancement curves.Here, we report the results for the Type I enhancement curve, as the LTK model was introduced with the aim of solving the issues of PM, TK, and eTK models concerning parameter estimation for persistent uptake CA profile.Results concerning Type II and Type III time-enhancement curves are collected in Figs A and B in S3 Text.Moreover, we show the results for K trans and λ parameters.K trans is the only parameter appearing in all nested models analyzed in this study, thereby allowing for a direct comparison of the results obtainable with the different models.λ is a novel parameter from the LTK model, thus it is important to analyze its identifiability.
Fig 4 shows the results concerning practical identifiability of the parameter K trans for the LTK model and Type I time-enhancement curve.The columns of Fig 4 refer to the three cases of study (AA), (RA), and (RR).From Section 3.1, we know that K trans is a structurally identifiable parameter, i.e., it is possible to uniquely determine its value from the given model structure.When we consider a completely artificial data set, obtained with the parameter values collected in Table 1 and the population-based analytical expression of the VIF given in (3), the complete absence of noise in the data and in the VIF allows us to accurately recover K trans .In the (AA) case, the profile likelihood shows a well-defined parabola, with a unique minimum in the optimal value K trans .Zooming around this minimum, we observe a finite confidence region around K trans for the 95% confidence level.Analyzing the (RA) case (second column), we notice that K trans represents a local minimum for the profile likelihood and it is not possible to identify a lower bound for the confidence regions for both 95%, 80%, and 68%, i.e., it is not possible to define a finite confidence interval.Thus, K trans is practically non-identifiable in this case.Moreover, from the compensating profiles in the region where the profile likelihood is almost flat (left side with respect to K trans ), we notice v e and λ compensating for varying K trans .In contrast, the optimal value estimated for v p does not significantly changes with respect to K trans variation.This is in line with the results in (13), where we can notice how v p does not have any relation with K trans .In the (RR) case, K trans is not practically identifiable, evident from the flat likelihood profile, which makes it not possible to identify a lower or upper bound of a confidence region with respect to the optimal value K trans .Similar observations can be made from the results of From the structural identifiability analysis of Section 3.1, we know that λ is a structurally identifiable parameter, thus, it is possible to uniquely identify its value from the model structure.For the completely artificial data set used in (AA), we are able to accurately identify λ.The profile likelihood in the first subplot of Fig 5 shows a well-defined parabola with a unique minimum in the optimal value l, Top row: profile likelihood and confidence levels at 68%, 80%, and 95% for the parameter K trans in the (AA), (RA), and (RR) case for the Type I enhancement curve.Inset in the first subplot shows a zoom of the region around the best-fitted value K trans (red marker).Bottom row: compensating profiles of the parameters v e , v p , and λ with respect to variation of K trans around its best-fitted value.Variation of ±50% around the optimal value of K trans are considered.Two colors are for two different y-axis: black curves refer the left y-axis and red curves to the right y-axis.Different line styles are used to distinguish curves referring to the same y-axis.For each curve, the name of the corresponding parameter is indicated above the line in the same color.https://doi.org/10.1371/journal.pcbi.1012106.g004around which a finite confidence region for the 95% confidence level can be determined (as shown in the zoomed inset).Analyzing the RA and RR cases (second and third columns), we notice that the profile likelihood is characterized by flat regions for which it is not possible to identify a lower or upper bound for 95%, 80%, and 68% confidence levels.Thus, λ is not practically identifiable in these more realistic cases.Looking at the corresponding compensating profiles, we observe that v p estimation is not influenced by λ variation, while v e and K trans show variability in response to λ.This result is reasonable in relation to the results in (14): the expression for v p is independent from the other parameters, whereas K trans , v e , and λ expressions depends on common coefficients (i.e., a 1 , a 2 , and a 4 ).
Observation 1.For reader convenience, we point out that it is reasonable to obtain different optimal estimations for K trans in the different analyzed cases (AA, RA, RR), as moving from AA to RA and, then, RR case, we are changing the VIF and the DCE data on which the inverse problem is performed.
To specifically analyze the influence of the noise in both the data and the individual-based estimation of the VIF function, derived directly from the DCE-MRI data, we perform two different tests, whose results are shown in Figs 6 and 7. We first analyze the impact of increasing the noise in the VIF function, mimicking the process that leads from the AA to the RA case.In fact, the individual-based estimations of the VIF are characterized by a predefined level of noise that is not present in the analytical VIF estimations.To this aim, we define three different amplitudes for the noise-only signal, corresponding to the 5%, 10%, and 15%, respectively, of the noise-free VIF at every element.These amplitudes allow to control the amount of noise.Then, we define the noise-only signal in the form a standard normal distribution.Finally, we add the noise-only signal adjusted with the corresponding amplitude to your original noisefree VIF.This creates the noisy VIF used in the study.More details can be found in the available code.Fig 6 shows the results of this analysis for the parameter K trans and Type I timeenhancement curve.Analogous results can be obtained for the other parameters and enhancement profiles.The first column, the no noise case, replicates the same results shown in the first column of Fig 4, i.e., the practical identifiability of K trans with a confidence level of 95%.Increasing the noise in the VIF, up to 5%, we observe that K trans is still practically identifiable, but with a lower confidence level (80%).Considering higher levels of noise of the VIF (10% or 15%), the practical identifiability of the parameter is no longer guaranteed, as the confidence region bounds become no longer finite for the considered confidence levels.This supports the conclusion that increasing noise in the VIF decreases the confidence level of parameter identifiability, justifying the differences in the results between (AA) and (RA) cases in Figs 4 and 5.
To analyze the influence of the noise in CA data on the parameter identifiability and, thus, mimicking the process that leads from the RR to the RA case, we perform a second test based on data smoothing.Starting from the CA time evolution data from the GBM dataset for a Type I time-enhancement curve, we use a moving average method with different values for the smoothing factor γ. Precisely, the method is based on the moving average technique that slides a window along the data, computes the mean of the points inside of each window, and replaces each data point with the average of the neighboring data points defined within the span.The window size is fixed and heuristically determined by the algorithm depending on the input data, while γ 2 [0, 1] adjusts the level of smoothing by scaling the window size.Values of γ near 0 produce smaller moving window sizes, resulting in less smoothing, while values near 1 produce larger moving window sizes, resulting in more smoothing.15, shows the practical identifiability of K trans can be recovered for the confidence level of 68%.We would not expect to obtain a higher confidence level for K trans .These results justify the conclusion that high levels of noise in the data decrease the confidence level of parameter identifiability, in line with the previous shown results for RA and RR case in Figs 4 and 5.

Discussion
In this work, we have carried out a formal and in-depth study of the structural and practical identifiability of a well-known class of transport models widely used for the analysis of DCE-MRI data, with the aim of showing the impact of specific data characteristics on intrinsic features of these models.We focused on the LTK model, which accounts for the different aspects that are separately included in other members of the nested family of compartmental models under investigation.However, analysis for the entire nested family has been also carried out and the corresponding results are shown in the S1 Text.
In Sections 3.1 and S1.1 Text, we analyzed structural model identifiability, showing how in all PM, TK, eTK, and LTK models it is possible to define a unique parametrization that identifies the parameters through the given inputs, i.e., CA concentration C t (t) and vascular input function VIF.As a matter of completeness, in S2 Text, we also described an additional transport model of the same class-namely the mLTK model-which is not structurally identifiable, but for which a structurally identifiable reparametrization can be defined.Then, in Section 3.2 and S1.2 Text, we used the method based on profile likelihood and confidence intervals to study practical identifiability of the nested family of models in three different scenarios, accounting for different levels of noise and construction of the data, and for different CA timeenhancement profiles.In fact, studying the practical identifiability of a model with respect to the data at hand is crucial for ensuring well-determined model predictions, as these are often used in systems biology research and quantitative image analysis.
The obtained results show how the entire class of transport models for DCE-MRI analysis, from PM to LTK, is structurally identifiable, i.e., they have an intrinsic mathematical structure that allows for a unique identification of the parameters and, thus, reproducible and accurate outcomes.However, practically this would happen only in the ideal case of dealing with a very large amount of data with zero noise.Our findings for both the LTK model (discussed in Section 3.2) and the nested PM, TK, and eTK models (discussed in S1.2 Text) reveal how practical parameter and, thus, model identifiability is affected by the quality of the data: the more noisy the data and/or the individual-based VIF function, the lower the confidence levels of identifiability of the different parameters.This is evident from the change in the shape of the profile likelihood transforming from a parabola-like shape in the (AA) case to a flat shape in the (RR) case, shown in These results support the observation that model identifiability is not attributed to a specific contrast agent dynamic, but rather to the characteristics of the data from which the enhancement curves are obtained.Moreover, the further analysis concerning the impact of noise and data smoothness on model identifiability, whose results are summarized in Figs 6 and 7, show how modifying the data directly affects on the profile likelihood shape (from a parabola to a flat profile and vice versa).In fact, results highlights how reducing the VIF noise or obtaining smoother data profiles (e.g. with higher temporal resolution) allows for recovery of parameter identifiability and, thus, increasing the reliability of the derived parameter values.Concerning limitations, all our results have been obtained using a global optimization procedure, which is based on a fixed discrete sampling of the parameter space and does not work perfectly with respect to real-world noisy data.Thus, possible extensions of this work might focus on comparing the performances of different optimizes, either in terms of parameter estimations, by varying the sampling frequency in parameter space, or algorithm efficiency.
The analysis proposed here is of significant importance considering the wide use of DCE-MRI data in research [10,57] and, thus, the need for ensuring reliability and reproducibility of transport model results [58].DCE-MRI has been shown to be associated with tumor angiogenesis and may be used to assess glioma grading [59][60][61][62][63][64], predict genetic mutation status of brain tumors [65][66][67], distinguish pseudoprogression from true progression in glioblastomas [68,69], and predict response to antiangiogenic treatment [70].The parameter K trans , in fact, as a marker for tumor microvascular permeability from DCE analysis, could help to predict treatment response in glioblastoma [71].Moreover, DCE-MRI parameters in a breast cancer study have been reported to be associated with microvessel density which is a marker for angiogenesis [72].However, variability in DCE acquisition (e.g. in scan duration) and imaging analysis with various transport models may produce diverging results, which makes repeatability a challenging issue and hinders its wider adaptation for both clinical practice and research [58,73].Prospective longitudinal clinical trials, preferably in a multicenter setting with standardized imaging techniques and quantification of DCE parameters, are needed to validate the utility of transport parameters as imaging biomarkers for treatment response and survival prediction.
In this light, understanding model and data issues and limitations allows for a more conscious use of the results obtainable with them.Our results provide a mathematical explanation for the lack of repeatability of DCE-MRI quantification between clinical sites, as it has been previously reported [74].As we have shown that practical identifiability improves with increasing SNR, we emphasize the need for image acquisition standards to increase the quality of imaging data and therefore reliability of parameter estimations such as the standards proposed by the Quantitative Imaging Biomarkers Alliance (QIBA) [75].As DCE-MRI acquisition protocols and quantification methods continue to rapidly develop [76], it is important to ensure rigor in model fitting and analysis.

Fig 1 .
Fig 1. Models and data.(A) Schematic illustrations of the four nested transport models from simple (Patlak) to complex (Leaky Tofts-Kety):the contrast agent concentration C t (t) is evaluated using the functions C p (t), the CA concentration in the plasma compartment, which is assumed to be given by the arterial input function, and C e (t), for the CA concentration in the EES space.The rate of forward and backward volume transfer and the fractional EES and plasma volumes are the quantities K trans , v e , v p , and λ.For each model, the involved parameters are listed in brackets.(B) Different enhancement patterns of CA signal: Type I-persistent curve-is a progressive increasing intensity signal; Type II -plateau curve-is characterized by an initial peak followed by a relatively constant enhancement; Type III-wash-out curve-refers to a sharp uptake followed by an enhancement decrease over time.(C) Examples of Type I (right plot) and Type III (left plot) enhancement curves obtained from breast and brain tumors, respectively.https://doi.org/10.1371/journal.pcbi.1012106.g001

4 . 2
and results are compared with the analytical results obtained in Section 3.1 regarding parameter structural identifiability.Precisely, we study the effect of varying the noise characteristics of the DCE-MRI data used for estimating the VIF and CA concentration on the accuracy of the identification of the model parameters.Each analysis is performed on the three characteristic enhancement patterns observed in CA evolution[10] (Fig 3).As shown in Fig 1B, CA timeenhancement curves are classified into three types: Type I-persistent curve-a progressive signal intensity increase (Fig 1C, right plot)

Fig 3 .
Fig 3. Best fitting of CA evolution signal obtained with the LTK model (10).The three types of CA time-enhancement curves (columns) for the three cases of study (AA), (RA), and (RR) (rows).The last row (RR) is CA time course data from the GBM DCE-MRI.Similar results for the breast cancer dataset are provided in Fig C in S3.1 Text.https://doi.org/10.1371/journal.pcbi.1012106.g003

Fig 5 ,
where practical identifiability of the parameter λ for the LTK model and Type I time-enhancement curve is analyzed.As for Fig 4, columns of Fig 5 refer to the three cases (AA), (RA), and (RR).

Fig 4 .
Fig 4. K trans practical identifiability for LTK model and Type I enhancement curve.Top row: profile likelihood and confidence levels at 68%, 80%, and 95% for the parameter K trans in the (AA), (RA), and (RR) case for the Type I enhancement curve.Inset in the first subplot shows a zoom of the region around the best-fitted value K trans (red marker).Bottom row: compensating profiles of the parameters v e , v p , and λ with respect to variation of K trans around its best-fitted value.Variation of ±50% around the optimal value of K trans are considered.Two colors are for two different y-axis: black curves refer the left y-axis and red curves to the right y-axis.Different line styles are used to distinguish curves referring to the same y-axis.For each curve, the name of the corresponding parameter is indicated above the line in the same color.

Fig 5 .Fig 6 .Fig 7 .
Fig 5. Leakage (λ) practical identifiability for LTK model and Type I enhancement curve.Top row: profile likelihood and confidence levels at 68%, 80%, and 95% for the parameter Kλ in the (AA), (RA), and (RR) case for the Type I enhancement curve.Inset in the first subplot shows a zoom of the region around the best-fitted value l (red marker).Bottom row: compensating profiles of the parameters K trans , v e , and v p with respect to variation of λ around its best-fitted value.Variation of ±50% around the optimal value of λ are considered.Two colors are for two different y-axis: black curves refer the left y-axis and red curves to the right y-axis.Different line styles are used to distinguish curves referring to the same y-axis.For each curve, the name of the corresponding parameter is indicated above the line in the same color.https://doi.org/10.1371/journal.pcbi.1012106.g005 Fig 7 summarizes the results of this analysis.The first column, referring to the original GBM data, replicates the same results shown in the last column of Fig 4, i.e., K trans is not practically identifiable, with an almost flat profile likelihood.However, moving from left to right in Fig 7, the higher the value of the smoothing factor γ, i.e., the smoother the data, the less flat the corresponding profile likelihood.In particular, the last column of Fig 7, referring to γ = 0.
Figs 4 and 5, and A (in S1.2 Text) for Type I contrast enhancement curve.The same changes in the profile likelihood shapes are observed for Type II (Fig A in S3 Text) and Type III (Fig B in S3 Text) contrast enhancement curves.