Model-based analysis of response and resistance factors of cetuximab treatment in gastric cancer cell lines

Targeted cancer therapies are powerful alternatives to chemotherapies or can be used complementary to these. Yet, the response to targeted treatments depends on a variety of factors, including mutations and expression levels, and therefore their outcome is difficult to predict. Here, we develop a mechanistic model of gastric cancer to study response and resistance factors for cetuximab treatment. The model captures the EGFR, ERK and AKT signaling pathways in two gastric cancer cell lines with different mutation patterns. We train the model using a comprehensive selection of time and dose response measurements, and provide an assessment of parameter and prediction uncertainties. We demonstrate that the proposed model facilitates the identification of causal differences between the cell lines. Furthermore, our study shows that the model provides predictions for the responses to different perturbations, such as knockdown and knockout experiments. Among other results, the model predicted the effect of MET mutations on cetuximab sensitivity. These predictive capabilities render the model a basis for the assessment of gastric cancer signaling and possibly for the development and discovery of predictive biomarkers.


Introduction
Gastric cancer is the fifth most common cancer and third leading cause of death from cancer worldwide [1]. Treatment options include surgery, chemo-and radiation therapy. However, the overall survival rate remains unsatisfactory due to molecular and clinical heterogeneity [2], therefore new treatment options are urgently required. Novel drugs targeting members of a family of receptor tyrosine kinases including the epidermal growth factor receptor (EGFR) have shown mixed success in clinical trials [3,4]. Among others, the EGFR antibody cetuximab did not improve patient survival in a phase III clinical trial [4].
Cetuximab is an antibody which binds EGFR [5]. EGFR is overepressed in many cancer types and activated by a variate of ligands [6], including besides EGF also the transforming growth factor-alpha (TGFA), heparin-binding EGF-like growth factor (HBEGF), betacellulin (BTC), amphiregulin (AREG) and epiregulin (EREG) and epigen (EPGN). Although all these ligands bind to EGFR in the abscence of cetuximab, they do not produce identical biological responses. These varying responses might be due to different ligand affinity, different ability to induce EGFR recycling or different ability to produce EGFR:ERBB2 heterodimers or EGFR: EGFR homodimers [7]. The binding of cetuximab to EGFR inhibits the interaction with its ligands, thereby reducing survival and proliferation signals. In addition, cetuximab induces antibody-dependent cellular cytotoxicity (ADCC) by provoking immune cells to attack cancer cells [8].
A potential reason for the failure of cetuximab is the molecular heterogeneity of gastric cancer [2]. Due to this heterogeneity, only a small subgroup of patients might benefit from the targeted therapy. However, a suitable biomarker for patient stratification is currently not available. Here, we aim to render a basis for understanding response and resistance mechanisms for cetuximab treatment in gastric cancer, to unravel likely causal differences between cetuximab responders and non-responders [9], and to possibly identify biomarkers for guiding targeted therapy by using a cell culture model.
Conceptually, biomarkers for responsive patient subgroups can be identified using statistical approaches characterizing responder and non-responder subgroups within different molecular high-throughput methods. Unfortunately, the necessary large-scale studies on the response of gastric cancer patients to cetuximab are missing. In addition, many proposed biomarkers from purely associative approaches have failed in clinical use [10]. Even large-scale cancer cell line projects, such as the Cancer Cell Line Encyclopedia (CCLE) [11] and the Genomics of Drug Sensitivity in Cancer (GDSC) [12] project, do not provide data for cetuximab response. Consequently, the limited amount of cell line and patient data prohibits the use of established statistical methods for biomarker development for cetuximab responsive patient subgroups.
In recent years, several studies showed that mechanistic dynamical models provide an alternative route to biomarker development [13]. [14] predicted the survival of neuroblastoma patients using a mechanistic model of the c-Jun N-terminal kinase (JNK) pathway. [6] predicted ligand dependence of solid tumors using a mechanistic multi-pathway model. [15] showed that large-scale mechanistic models facilitate the integration of large-scale data and enable the derivation and mechanistic interpretation of biomarkers. All these studies employ ordinary differential equation (ODE) models to describe the biochemical reaction networks involved in intracellular signal processing. The models integrate (i) prior knowledge on the pathway topology derived over the last decades and available in databases, such as KEGG [16], Reactome [17] and BioModels [18], with (ii) heterogeneous experimental data for the process of interest. The exploitation of prior knowledge constrains the search space and improves in many cases the predictive power. Furthermore, the chosen modeling approach facilitates, unlike most statistical approaches, the mechanistic interpretation of the findings.
In this study, we employed mechanistic mathematical modeling based on ODEs to understand response and resistance mechanisms for cetuximab treatment in gastric cancer cell lines. Building on previously published mechanistic ODE models [6,15,[19][20][21] and published logical models [22], we developed multiple candidate models for the EGFR, the Protein Kinase B (AKT) and the Extracellular-signal Regulated Kinase (ERK) signaling in cetuximab responder and non-responder cell lines. The most appropriate model based on the Akaike Information Criterion (AIC) [23] was selected, calibrated and validated using previous published and newly collected data. To analyze the dependence of the cellular response on gene expression levels and (somatic) mutations, the resulting model was interrogated using simulation studies and in silico knockdown and knockout experiments. This suggests several intervention points and potential model-based biomarkers.

Mathematical model of intracellular signaling in cetuximab responder and non-responder cell lines
We utilized the gastric cancer cell lines MKN1 and Hs746T as a model system to study response and resistance factors of cetuximab treatment. Based on previous results obtained by proliferation and motility analysis, the MKN1 cell line is characterized as a cetuximab responder, while Hs746T is characterized as a non-responder cell line [9,24]. Different factors might contribute to this difference, including a 2.5 fold higher EGFR expression in MKN1 cells compared to Hs746T [24].
Furthermore, MKN1 cells expresses the PI3K mutant PI3K p.E545K and in Hs746T cells MET is mutated to MET p.L982_D1028del and amplified [25]. PI3K p.E545K possesses an increased catalytic activity compared to wild-type PI3K [26] resulting in enhanced downstream signaling, while MET p.L982_D1028de is assumed to be constitutively active, i.e. independent of its ligand.
Cetuximab targets the EGFR signaling pathway which regulates growth, survival, proliferation, differentiation [27] and motility [9,28]. Upon ligand binding, EGFR homodimerizes and auto-phosphorylates promoting its catalytic activity. Phosphorylated EGFR (pEGFR) is internalized and degraded or recycled. Membrane-bound and internalized pEGFR catalyzes the activation of the G-protein RAS and the Phosphoinositid-3-Kinase (PI3K) by phosphorylation. While RAS activates the mitogen-activated (MAPK) signaling pathway resulting in the phosphorylation of ERK, PI3K activity results in the phosphorylation of AKT. Phosphorylated ERK (pERK) and phosphorylated AKT (pAKT) are important regulators of DNA transcription.
Cetuximab binds EGFR and blocks the binding of EGF or other EGFR ligands, such as amphiregulin (AREG) and epiregulin (EREG) [5]. This reduces-in the absence of resistance factors-the activity of EGFR and its downstream targets. Known resistance factors to EGFRtargeted therapies in other cancer types include mutations and overexpression of the receptor tyrosine kinases AXL [29] or MET [30].
A visualization of the EGFR signalling pathway accounting for common adaptor proteins, cetuximab as well as the aforementioned mutants of PI3K and MET is provided in Fig 1A. Based on this pathway information and the available literature, in particular the work of [19] and [21], we constructed a simplified pathway model. To limit the model complexity, adaptor proteins and multi-site phosphorylation of EGFR and ERK are disregarded, and the intermediate steps in the MAPK signaling pathway are lumped into a single reaction. The simplified pathway description is visualized in Fig 1B. The model description is provided in the S1 Text, Section 3.
We calibrated the pathway model using a comprehensive dataset obtained using quantitative immunoblotting. The dataset contains time and dose responses for EGFR, pEGFR, pERK and pAKT. In an iterative process, published data [9,24] were complemented by newly collected data to improve reliability of the parameter estimates and model predictions. In total, we considered 303 data points for the MKN1 cell line ( Fig 1C) and 312 data points for the Hs746T cell line (Fig 1D).
We determined the maximum likelihood estimates of the model parameters, e.g. rate constants, using multi-start local optimization [31,32] (see Materials and methods). The estimation was performed individually for MKN1 and Hs746T cells. For MKN1 cells, the optimizer converged reliably. This is reflected in the clear plateau in the waterfall plot (Fig 2A). For the optimal parameters we observed a good agreement of MKN1 data and model fit (Fig 2B-2D)  (Pearson correlation ρ = 0.95). For Hs746T cells, the results of the multi-start optimization do not show clear plateaus, suggesting that the objective function possesses a large number of local optima or flat regions ( Fig 3A). But even so, the best fit found provides an accurate description of the data (Fig 3B-3D) (Pearson correlation ρ = 0.91) and the best 100 fits achieve similar correlations (S7 Fig). Furthermore, the best 10 fits are statistically not distinguishable from the best fit [33]. A potential source of the convergence problems for Hs746T cells is the low signalto-noise ratio, which is caused by the limited response to EGF and cetuximab treatment (see, e.g., Fig 3C and 3D) since this is a non-responder cell line. Yet, despite the low signal-to-noise ratio several changes are statistical significant (see analysis by [9] for a part of the published data). As the model fits provided a good description of most data points, the parameter estimation confirmed that the proposed pathway model is a good candidate for further analysis.

Model-based cell line comparison predicts causal differences beyond mutations and expression patterns
The available experimental data (Figs 2 and 3, S3 and S4 Figs, and S1 Text, Section 2) show a pronounced difference in the response of the cell lines to EGF and cetuximab treatment. Potential sources of this behavior are differences in mutation patterns, protein expression levels/abundances and reaction fluxes (due to an influence of latent components between the cell lines). While we did not perform a comprehensive quantification of the protein abundances, selected measurements and transcriptome data indicate substantial differences between the cell lines.
To identify important differences in the reaction fluxes, we compared the parameter estimates obtained for the individual cell lines. These parameter estimates should reflect changes on the biochemical level between the cell lines. In the comparison, we only consider parameters associated with single model reactions summarizing multi-step signaling processes.
Examples include the indirect activation of ERK by RAS, which is described by one reaction in the considered pathway model. As the expression levels of the proteins involved in the intermediate reactions-in this case RAF and MEK ( Fig 1A)-can vary, the reaction rate constant linked to each multi-step process can easily differ between cell lines. In contrast, for direct reactions, such as the binding of EGF to EGFR, the reaction rate constant should be identical for both cell lines. For details on the classification for the reactions we refer to the S1 Text, Section 3.
For the comparison we considered the best 100 parameter vectors obtained by multi-start local optimization for MKN1 and Hs746T cell lines. Statistical testing for cell line specificity of parameters was performed using one-way-ANOVA suggesting significant (p < 0.05) differences for 12 out of 20 estimated kinetic rates ( Fig 4A). The mapping of the findings on the pathway visualization revealed potential differences in (i) RAS-MAPK signaling, (ii) PI3K-AKT signaling, and (iii) EGFR turnover including internalization, degradation and recycling (Fig 4B, S1 Table and S1 Text, Section 4). In addition, mutated MET (MMET) and mutated PI3K can cause differences between cell lines as they are each only present in one of the cell lines.
The comparison of the optimal parameter values found for the different cell lines provides a conservative evaluation. Indeed, not all differences between cell lines might be required to fit the data (see [34]). To assess the relevance of differences in modules (i)-(iii), we fitted the dataset for the MKN1 and Hs746T cell lines simultaneously. Therefore, a collection of mathematical models was developed accounting for cell line specific mutations and protein abundances, as well as all possible combinations of differences of parameters belonging to modules (i), (ii) and (iii). All remaining kinetic rates were assumed to be identical for the cell lines. In total, eight different models were constructed accounting for all possible combinations. For each candidate model, a multi-start local optimization run was performed using the complete dataset. The combination of molecular data for responder and non-responder cell lines provides addition constraints for the parameters and increases the number of data points per estimated parameter.
Model selection using AIC, indicated that the model including cell line specific differences only in the EGFR turnover dynamics provides the best qualitative description of the experimental data (Fig 4C). This agrees with previous findings reporting the presence of FBXW7 p. R465C mutation in MKN1 cells, but not in Hs746T cells. FBXW7 ubitiquinates EGFR, leading to changes in the turnover/degradation of EGFR between the cell lines [35]. The resulting model provides an accurate description of the experimental data for the responder (Fig 4D) and the non-responder (Fig 4E) cell lines. We evaluated if the fitting can be improved by including additional processes such as the negative feedback regulation of RAS activity by phosphorylated ERK [36]. However, the AIC value did not improve (see S8 Fig and S1 Text, Section 7). The subsequent analyzes are based on this model.

Integrated modeling of responder and non-responder cell lines yields reliable predictions
The selected pathway model describing the experimental data for the cell lines MKN1 and Hs746T possesses 230 parameters in total. 57 parameters are associated with the reaction kinetics and protein abundances (for wild-type and mutant proteins), and 173 are parameters related to the observations (i.e. scaling constants). To assess the identifiability of the kinetic parameters as well as differences between cell lines and media, we computed the profile likelihoods. We did not calculate the profiles for the observation parameters as they differ between single experiments and are not relevant for the model dynamical response.
Profile likelihoods provide a maximum projection of the likelihood on the individual parameters (Materials and methods). The analysis of the profile likelihoods revealed that 23 out of 57 parameters are practically identifiable, meaning that the 90% confidence intervals are finite (Fig 5A). Most of the identifiable parameters take part in the EGFR dynamics module, involving processes such as internalization, degradation, ligand binding and dimerization. The remaining parameters are practically non-identifiable, meaning that lower and/or upper bounds could not be found for the defined confidence level. In particular, parameters related to MET signaling are practically non-identifiable. This is not unexpected as MMET is not directly observed.
We complemented the profile likelihood calculation with an evaluation of the Fisher Information Matrix (FIM) at the maximum likelihood estimate. The eigenvalue spectrum of the FIM for the kinetic parameters and initial conditions spans many order of magnitude (Fig 5B) implying sloppiness [37].
Interestingly, there are 50 eigenvalues which differ substantially from numerical zero, meaning that the number of constraint directions in parameter space-given by the eigenvectors-is larger than the number of identifiable parameters. This can happen if individual parameters are non-identifiable but functions of several parameters (e.g. sums, differences or ratios) are identifiable. As the detailed analysis of this using profile likelihoods is computationally demanding, the analysis of the FIM provides a first glimpse.
While the estimates of many parameters appear reliable, the insights which can be obtained by studying the parameter values is limited. As many parameters in the model represent multistep processes, there is no clear biological counterpart. This is different for the state variables and outputs of the model, allowing for model-based predictions. To assess the predictive power of the model despite the sloppy eigenvalue spectrum and the non-identifiable parameters, we simulated the system in additional experimental conditions, i.e. in conditions not used for the fitting. Firstly, we evaluated whether the model can despite the non-identifiabilities in the MET signaling dynamics predict published experimental data for the response of Hs746T cells to selected MET inhibitors. Therefore, the inhibitor was implemented in the model, and simulation was performed using published parameters (for details on the model implementation refer to S1 Text, Section 3). We found that the model qualitatively predicts the reduction of pMMET and pAKT levels observed by [38] (Fig 5C). Secondly, we predicted the state of MKN1 cells beyond the first 240 min for which experimental data are available. We found that the model provides reasonable predictions for long-time response of EGFR, pEGFR and pERK (Fig 5D). The quality of these predictions profits from the model topology as well as the estimated parameter values (see S9 Fig and S1 Text, Section 8). Accordingly, our analysis showed that the model structure and parameter estimates are reasonable and that the model is able to predict the considered validation experiments.

Mathematical model reliably predicts response and resistance factors
The developed mathematical model provides a screening tool for the rapid assessment of potential response and resistance factors. Here, we demonstrate this by studying the validity of the predictions for several established factors (Fig 6A).
The evident differences between the responder cell line, MKN1, and the non-responder cell line, Hs746T, are the mutation patterns. The experimental data for MKN1 cells showed that PIK3CA p.E545K is not a resistance factor. However, our model predicts the association of PIK3CA overexpression (due to p.E545K mutation) with an insensitivity of pAKT to cetuximab treatment (S2 Fig). This is difficult to test experimentally, but consistent with the finding that PIK3CA mutation is a source of acquired cetuximab resistance in metastatic colorectal cancer patients [39]. The model predictions suggest the same for gastric cancer, although "PIK3CA mutations were not significantly associated with any clinical, epidemiologic, or pathologic characteristic" in gastric cancer patients obtaining non-targeted therapy [40].
The MET exon 14 juxtamembrane splice site mutation found in Hs746T cells inhibits the degradation of MET receptor, prolonging its oncogenic activity [41]. MET activation is an established resistance factor for cetuximab treatment in gastric cancer [24]. Indeed, our model predicts that a knockdown of mutant MET reduces Hs746T cell proliferation and survival signaling via ERK and AKT (Fig 6B). To validate the prediction, MET was silenced and quantitative immunoblotting was performed. We found that the model provides accurate quantitative predictions for EGFR, pEGFR, pERK and pAKT (Fig 6C and S9 Fig) (Pearson correlation ρ = 0.872) although the down-regulation of pERK is slightly underestimated. For detailed information about the modeling of MET silencing refer to S1 Text, Section 3.
Beyond mutations, amplifications and expression changes have been reported as response and resistance factors. In particular the abundance of EGFR has been reported to be associated with cetuximab response in gastric cancer [24,42]. Our model predicts that reducing EGFR expression levels in MKN1-a cell line overexpressing EGFR-decreases the level of pEGFR, pERK and pAKT ( Fig 6D). Interestingly, the effect on downstream signaling was predicted to be relatively small. We tested this prediction by silencing EGFR expression and found a good agreement with experimental data (Fig 6E and S9 Fig) (Pearson correlation ρ = 0.915). The result implies that the dependence of ERK and AKT activity on EGFR activity is limited. For detailed information about the modeling of EGFR silencing refer to S1 Text, Section 3.
Beyond the expression of EGFR, the abundance of the EGFR ligand AREG has been shown to correlate positively with cetuximab response [24]. As the biochemical processes underlying AREG-and EGF-induced activation of EGFR signaling are similar [43], we employed the same model for AREG and EGF treatment. Following the literature [44], we assumed that AREG has an EGFR affinity about 50 times lower than EGF. We neglected that AREG stimulation leads to EGFR recycling while EGF promotes EGFR degradation [45]. The resulting model predicted in the absence of MET activation, i.e. in MKN1 cells, that for higher AREG levels cetuximab achieves a higher reduction in EGFR and ERK phosphorylation (Fig 6F). As a consequence of PI3K mutation in MKN1 cells, AKT activation is insensitive to changes in the receptor signal. This model prediction is in line with the published experimental data in [24].

Conclusion
Mechanistic ODE models are widely used for the integration of experimental data and the analysis of causal relations. Furthermore, recent studies demonstrate their potential for the

PLOS COMPUTATIONAL BIOLOGY
Model-based analysis of cetuximab treatment in gastric cancer cell lines identification of biomarkers [6,14,15]. To render this approach available for gastric cancer, we developed a mechanistic model of signaling in gastric cancer. The model describes the dynamics of the EGFR, ERK and AKT signaling pathways in response to EGF and cetuximab treatment, accounting for mutation patterns and protein expression levels. The proposed model provides a more detailed description than the available logical model [22] by capturing individual biochemical reactions and encoding the wild-type and mutated genes. To the best of our knowledge, the proposed model is the first mechanistic model tailored to gastric cancer signaling.
To assess the predictive power of the model, we performed a broad spectrum of analyzes. First, we used the model to study causal differences between the cetuximab responder cell line, MKN1, and non-responder cell line, Hs746T. The analysis suggested the presence of a MET mutation as well as differences in receptor internalization and recycling as key factors for the response to cetuximab treatment. Second, we validated the predicted response to a MET inhibitor (KRC-00715) as well as to long-term EGF and EGF in combination with cetuximab treatment. Third, we employed the model to predict response and resistance factors. The predicted role of the EGFR abundance as well as MMET were experimentally confirmed by knockdown experiments. The array of successful validations suggests that the model has the potential to be used as a kickoff tool for guiding biomarker discovery.
While all tests were positive, the parameters and predictions of the proposed models are subject to uncertainties. Less than 50% of the parameters were practically identifiable, implying that addition of experimental data is required. In particular, measurements of MMET activity would be beneficial, as well as the absolute quantification of expression and phosphorylation levels. Furthermore, a detailed experimental analysis of the impact of gene expression on cetuximab response using isogenic cell lines would provide valuable information about signal processing. Complementary, the model provides only a crude description of the core pathways and extensions might become necessary. Specifically, the adaptor proteins, intermediate kinases and the effect of multi-site phosphorylation should be included. Furthermore, additional members of the HER family and additional receptor tyrosine kinases, such as AXL, could be included. A template for the refinement could be provided by the Atlas of Cancer signaling Networks [46] or the large-scale mechanistic model we recently introduced [15]. However, all model extensions should be counterbalance with acquisition of additional experimental data for gastric cancer cell lines.
For model-based patient stratification, the model also needs to be extended to account for antibody-dependent cellular cytotoxicity (ADCC). ADCC is-besides the binding of cetuximab to EGFR and the inhibition of receptor dimerization and intracellular signal transduction-the second mode of active of cetuximab. ADCC provokes immune cell-mediated killing of tumor cells [8]. Since the ADCC effect cannot be observed in vitro, patient material would be required to study this effect.
In conclusion, the proposed model provides a potentially valuable tool for gastric cancer research in the context of cetuximab treatment. It aggregates the information of a comprehensive list of publications and describes a large set of data points. In the future, it might be used to integrate additional data, e.g. from different cell lines and drugs.

Cell lines and cell culture conditions
For the establishment of the model, the human gastric cancer cell lines Hs746T and MKN1 were used and cultured as reported by [9].

Western blot analyzes
Western blot analyzes to set up the model were performed as reported earlier [9,24,35]. The cells were treated for indicated times with 0.05, 0.1, 1, 10 or 50 μg/ml cetuximab and/or 5, 30 or 100 ng/ml EGF.

Knockdown experiments
To generate a transient knockdown of EGFR in the cell line MKN1 and a knockdown of MET in Hs746T cells, siRNA was used. Cells were cultured overnight in rich medium, before the medium was exchanged for antibiotic-free medium. A specific FlexiTube GeneSolution (Qiagen, GS1956 for EGFR, GS4233 for MET) was used, containing four gene specific, preselected siRNAs. The AllStars Negative Control siRNA and AllStars Negative Control siRNA AF488 were used as negative controls. Cells were transfected with Lipofectamine 2000 according to the manufacturer's instruction. After a transfection time of 4 h (EGFR knockdown in MKN1 cells) or 24 h (MET knockdown in the cell line Hs746T), the medium was replaced with rich medium. The transfection was checked on protein level by western blot analyzes. Therefore, samples were collected 24 hours after the transfection. For analysis of EGFR, MAPK and AKT activation after EGFR knockdown, cells were treated for 5 minutes with 5 ng/ml EGF or the combination of 1 μg/ml cetuximab plus 5 ng/ml EGF.

Mechanistic modeling
The dynamics of the biochemical reaction networks were modelled using systems of ODEs, in which x denotes the concentration vector of the biochemical species, S denotes the stoichiometric matrix, v(x, θ) denotes the reaction flux vector and θ denotes the parameter vector. The parameters are, e.g., rate constants of synthesis or degradation reactions. As the cells start from an unperturbed state, the initial condition x 0 (θ) is the unperturbed steady state of the ODE.
To infer causal differences between responder and non-responder cell lines, we allowed for differences between the parameters of the cell lines. Following the work of [34], we modelled the difference by the additional parameters ξ i , The parameters ξ i denotes the log-fold change for the i-th parameter. If ξ i is zero, the parameters of the two cell lines are identical. We fixed ξ i to zero for all parameters which are associate with direct biochemical interactions, e.g. binding rates, which should be conserved between cell lines. Only for expression levels and indirect interactions, i.e. simplification of multi-step processes, we allowed for ξ i 6 ¼ 0 and therefore estimated along with θ ! (θ, ξ). Similar to the modeling of differences between cell lines, we modelled differences between cell culture media (rich and starvation medium). Here, only differences in the expression levels are allowed. Specific information about parameters are provided in the S1 Text, Section 5, and S2 Table. The concentration of biochemical species were linked to the observables y(t) by observation functions h j , in which j indexes the observable and k indexes the time point. We assumed independent and additive normally distributed measurement noise ∊ j � N ð0; s 2 j Þ. To adapt the scale of different replicates and standard deviation of the experimental data, we employed mixture modeling. The standard deviations were allowed to differ between observables but were assumed to be independent of time and treatment condition for each individual experimental setup (e.g. a set of Western blots). We fixed σ 2 to the estimated standard deviations and therefore reduced the complexity of the optimization problem [54]. Details are provided in the S1 Text, Section 1.

Parameter estimation
We performed maximum likelihood estimation by minimizing the negative log-likelihood function, in which x(t j ) denotes the solution of the ODE model for the parameter vector θ. Numerical optimization was conducted using multi-start local optimization. This approach has been shown to outperform most global optimization methods and achieves a performance comparable with sophisticated hybrid optimization methods [31,48]. For each fitting problem, initial parameters were generate using Latin hypercube sampling. The local optimization was performed by trust-region based algorithms implemented in the MATLAB function lsqnonlin. For the optimization, parameters were log 10 -transformed to improve numerical properties, e.g. improvement of convexity [49] and computational efficiency [48,50]. For detailed information about the optimization options refer to S1 Text, Section 6.
For the fitting of the individual cell lines, we generated 1000 starting points for local optimization. For the model selection, 100 starting points were used instead. The model selection was based on the AIC values for each model alternative. Following the work of [51], a difference of 10 was considered to be substantial.
For the assessment of differences between cell lines, we employed an iterative process instead of the regularization approach proposed by [34]. We also tested the latter, but the optimization did not converge and the penalization did not achieve a clear model reduction.

Uncertainty analysis
The uncertainty of the model parameters was assessed by analyzing the eigenvalue spectrum of the FIM at the maximum likelihood estimate. The parameter estimation results were considered sloppy if minimum and maximum eigenvalue differed by more than six orders of magnitude [37].
The local analysis provided by the FIM was complemented by computing the profile likelihoods [52]. From the profile likelihoods, we computed the confidence intervals for each individual parameter disregarding scaling constants. Parameters with unbounded confidence intervals for a significance level of 0.1 were considered as practically non-identifiable.

Software
The model formulation, parameter estimation and uncertainty analysis was performed in the MATLAB toolbox Data2Dynamics (https://github.com/Data2Dynamics/d2d, commit 9b1c3556) [32]. The parameter confidence intervals and the visualization of parameter uncertainties was carried out using the MATLAB toolbox PESTO (https://github.com/ICB-DCM/ PESTO, commit 8278a1a) [53]. For image quantification, we used ImageJ software [47]. A: Scatter plot of the likelihood ratios for the initial parameter guesses, θ initial , and their corresponding final optimized values, θ optimal . The likelihood ratio was calculated with respect to the best/maximum likelihood estimate found during the multi-start local optimization. The 20 best parameter sets are shown paired with their initial guesses. B-D: Boxplots of the agreement between the validation data and model simulation for the initial parameter guesses, θ initial , and their corresponding final optimized values, θ optimal . The agreement is shown in terms of Spearman's correlation coefficients. The 20 best parameter sets are shown paired with their initial guesses. The validation datasets shown are (B) MET inhibition, (C) long-term kinetic response in MKN1 cells, and (D) silencing of EGFR and MET expression. (TIF)  Table. Reaction rate candidates for cell-line specificity. The following kinetic rates were identified as significantly different between the cell lines by the ANOVA statistical test. All kinetic rates belonging to the same module were simultaneously tested (not individually) resulting in 8 different model candidates. (PDF) S2 Table. Literature values used as prior information in the parameter estimation. Prior mean values were converted to 10-logarithmic scale and used for the parameterization of the model. Note that the same kinetic rate can appear in different reactions, e.g., receptor endocytosis in R10, R22 and R34. (PDF)