Keratin Dynamics: Modeling the Interplay between Turnover and Transport

Keratin are among the most abundant proteins in epithelial cells. Functions of the keratin network in cells are shaped by their dynamical organization. Using a collection of experimentally-driven mathematical models, different hypotheses for the turnover and transport of the keratin material in epithelial cells are tested. The interplay between turnover and transport and their effects on the keratin organization in cells are hence investigated by combining mathematical modeling and experimental data. Amongst the collection of mathematical models considered, a best model strongly supported by experimental data is identified. Fundamental to this approach is the fact that optimal parameter values associated with the best fit for each model are established. The best candidate among the best fits is characterized by the disassembly of the assembled keratin material in the perinuclear region and an active transport of the assembled keratin. Our study shows that an active transport of the assembled keratin is required to explain the experimentally observed keratin organization.


Introduction
The epithelial cytoskeleton is characterized by abundant keratin intermediate filaments (Fig 1). The cytoplasmic keratin filament network is responsible for the mechanical stress resistance of epithelial cells and contributes significantly to epithelial stiffness [1,2]. The importance of keratins for epithelial tissue stability is reflected by a large group of genetic skin blistering diseases that are caused by point mutations in keratin-encoding genes [3,4]. The mechanical functions not only rely on static resilience but also necessitate a high degree of plasticity, for example in migrating cells during wound healing [5]. The current view is that keratins act as general stress absorbers protecting epithelial cells not only against mechanical insults but also against irradiation or osmotic and microbial challenges. Thus, keratins are involved in heat shock response, apoptosis and organelle homeostasis [6]. Furthermore, functions affecting processes such as proliferation, differentiation and inflammation are also dependent on keratins (see recent reviews in [4,7]).
All of these functions are tightly coupled to the keratin network dynamics (see S1 Video). Examination of cultured epithelial cells producing fluorescent keratins has provided evidence for the different mechanisms that are involved in the continuous renewal and reshaping of the keratin system [9]. On the basis of these observations, a biological model of the keratin cycle has been proposed by Leube and Windoffer et al. in [10,11]. This biological model takes into account the assembly/disassembly and transport of keratins. In this study, it was proposed that assembly and disassembly occur in topologically defined regions with assembly taking place predominantly in the cell periphery while disassembly takes place primarily in the perinuclear region. The biological model further postulates active transport of insoluble assembly stages of keratins toward the nucleus and rapid diffusion of soluble subunits throughout the cytoplasm. To the best of our knowledge, how these different processes are coupled and regulated is not yet known.
In a previous work we developed methods to examine and quantify the keratin transport and turnover in epithelial cells [12]; the spatial distribution of the assembled keratin material in epithelial cells was available at 24 hours and 48 hours after seeding. The current effort is to use the available qualitative and quantitative observations to derive, from first principles, experimentally-driven mathematical models that could yield hypothetical predictions testable in laboratories. Our approach is unique, it translates experimental observations and data into a series of alternative plausible mathematical models or scenarios to further advance our understanding of the critical parameters in keratin cycling. Fundamental to this approach is the fact that optimal parameter values for each scenario are established and out of this set, a single scenario is identified that best fits experimental observations and data.
Hence, a collection of mathematical models resulting from different assumptions of the keratin transport, assembly and disassembly is designed to investigate the effects of the interplay between turnover and transport on the keratin organization. The collection is built as a welldesigned scientific experiment by considering control and knockout of processes. To highlight and confirm the importance or existence of a given process, scenarios in which the process is absent are also considered and tested. Model responses are then compared to experimental observations and data published in [12] to identify optimal parameter values that yield the best fit of each of the models to the experimental data. Finally, we identify, using an information-theoretic approach, the best scenario or model given the data and candidate models under study.
By employing this reductionist phenomenological approach and systematic evaluation of the different scenarios we not only confirm the proposed transport features of the keratin cycle and the restricted disassembly in the perinuclear region but also find that the assembly throughout the cytoplasm fit best to the experimental data. Furthermore, our particular approach allows us to demonstrate that the inward motion experimentally observed is not an emergent behavior but is an inherent property of the keratin material organization and it is due to an active transport thereby confirming recent experimental observations [10,11].

Experimental data
In Moch et al. [12], the spatial distribution of the assembled keratin material in epithelial cells is measured for 15 minutes at 24 hours and 48 hours after seeding. The shape of each epithelial cell is normalized to fit a circle of fixed radius [13]. The average normalized spatial distribution is calculated over 50 cells at 24 hours (resp. 84 cells at 48 hours). The average speed and direction of the motion of the assembled keratin material are measured and determined at every location within the normalized cell. Finally, at every spatial location, the net assembly/ disassembly is calculated. Hence, regions with preferential assembly and disassembly are identified. We will refer to regions of assembly as Sources and regions of disassembly as Sinks. More details on the experimental data can be found in [12,14].
In the present work, cells are represented as one-dimensional cross-section domains. A diameter of the normalized cell is used as the spatial domain which is centered at the center of the cell and is of length 2L with L = 22.5μm. Moch et al. recorded the fluorescence intensity of fluorescent protein-labeled keratins in cells. Assuming a proportionality between fluorescences and concentrations, and knowing from [15] the mean concentration for keratin in keratinocytes, we convert fluorescence intensities to concentrations (μM) as follows: Concentration = Fluorescence × (Mean Concentration / Mean Fluorescence). In Fig 2, the average spatial distribution, the speed of the assembled keratin, regions of assembly (Sources) and regions of disassembly (Sinks) on the one-dimensional cross-section domain are displayed.

Mathematical models
To study its organization in cells, the keratin material is categorized into a soluble pool composed of the soluble keratin, and an insoluble pool representing the assembled keratin material. The state variables used to represent the soluble and insoluble pools are: • S(x, t) denotes the concentration of the soluble keratin material at position x at time t, • I(x, t) denotes the concentration of the assembled keratin material at position x at time t.
From here onwards, the one-dimensional spatial domain representing the cell is defined by where x = 0 is the center of the cell and x = ±L are the boundary positions of the plasma membrane ( Table 1). The general framework of the model, derived from first principles based on experimental observations, takes into account the turnover and transport for both soluble and insoluble pools and can be stated verbally as: ( Rate of change of S ¼ Transport of S À Assembly of S in I þ Disassembly of I in S; Rate of change of I ¼ Transport of I þ Assembly of S in I À Disassembly of I in SÁ Experimental data adapted from [12]. 2.1: The spatial distribution of the assembled keratin material. The mean concentration of the keratin material used in the conversion of fluorescence intensities to concentrations is estimated to be equal to 520μM in [15]. Circles represent raw experimental data; curves are fits of experimental data. Details on P(x) and f final (x) can be found in Appendix 4. 2.2: The speed of the assembled keratin material at 24 hours. 2.3: Regions of assembly and disassembly denoted Sources and Sinks respectively. In all figures, the cell is represented by a one-dimensional cross-section domain centered at the center of the cell; plasma membrane positions are at ±L = ±22.5μm, the nuclear envelope is located at ±7.5μm and the center of the cell is located at zero.
Hence, the generalized model has the following expression; stated mathematically as: ( @S @t ¼ T S ðSÞ À AðSÞ þ DðIÞ; where T S (Á) (resp. T I (Á)) is the functional term describing the transport of the soluble pool (resp. insoluble pool). For example, T S ðSÞ ¼ À @ @x J S ðx; tÞ (resp. T I ðIÞ ¼ À @ @x J I ðx; tÞ) where J S (x, t) (resp. J I (x, t)) describes the flux of the soluble pool (resp. insoluble pool) at position x from the left (x = −L) to the right (x = L) at time t. The function A(S) (resp. D(I)) is the assembly term (resp. disassembly term). To investigate the interplay between turnover and transport on the organization of the keratin material several assumptions are proposed for each of the functionals T S , T I , A and D.
Modes of transport. Molecules of the soluble pool are assumed to be subjected to the Brownian motion; the soluble pool is diffusible with a diffusion coefficient D S . The functional term of transport for the soluble pool is given by Passive transport for both pools is assumed in all models. Diffusion is assumed for the insoluble pool to describe the wiggling motion of the keratin filaments in cells. The diffusion coefficient of the insoluble pool is set to be much smaller than that of the soluble pool: 0 < D I 10 −3 D S . It is assumed that only the insoluble pool can be driven by an active transport. Experimental evidence show that the assembled intermediate filament proteins in the form of filament  [ 12] f 0 (x) Initial distribution of assembled keratin in cells at 24 hours Eq (17) (μM) [ 12] D S Diffusion coefficient of soluble pool 0.88±0.08 (μm 2 /s) [ 9] D I Diffusion coefficient of insoluble pool 10 −3 D S (μm 2 /s) u Speed of insoluble pool 0.002 − 0.008 (μm/s) [ 12,[16][17][18] v(x) Almost constant speed of active transport of insoluble pool Eq (12) (μm/s) [ 12,18] u(x) Variable speed of active transport of insoluble pool Eq (13) (μm/s) [ 12] k precursors, squiggles or filaments move along microtubules and actin filaments by interacting via molecular motors [11,[18][19][20][21]. An active transport (also called the inward drift) of the insoluble pool from the plasma membrane to the center of the cell is hypothesized based on reports of the motion of the assembled keratin material mostly towards the nucleus in epithelial cells [11,17,18]. The speed of the active transport is set to be almost constant v(x) everywhere or variable u(x) everywhere (Fig 3). Based on experimental observations, both speeds are assumed to decay around the nucleus towards the center of the cell. The variable speed u(x) is estimated using the profile of the average speeds measured in [12] (Fig 2.2). The magnitude of the almost constant speed v(x) is set to be the average value of the variable speed over the cell. Details on the derivation of the estimates of v(x) and u(x) are given in Appendix 1. Hence, the functional term for the transport of the insoluble pool can take three different forms: where v(x) is given in (12) and u(x) in (13); the two functions representing the speeds of the active transport are graphed in Fig 3. The function sgn(x) defined by describes the inward direction of the active transport at any location of the spatial domain O centered at zero. Combining the modes of transport for the soluble and insoluble pools, three types of transport are considered for the keratin material in the epithelial cell.
Expressions of assembly / disassembly reactions. In the present work, the turnover is composed of two reactions: the assembly / aggregation / polymerization of units of the soluble pool to grow the insoluble pool and the disassembly / solubilization / depolymerization of the insoluble pool into units of soluble pool. It is assumed that the assembly process is a function of the soluble pool only, whereas the disassembly process is a function of the insoluble pool only. The simplest case is to consider a linear model that assumes linear exchanges between the two pools. The linear model can be stated as follows: where k ass (Á) (resp. k dis (Á)) is the rate of assembly of the soluble pool (resp. disassembly of the insoluble pool). Both rates can either be constant, k ass and k dis , or space-dependent k ass (x) and k dis (x). In the second case, the turnover is assumed to depend on the enzymatic activities [22]. For instance, the solubilization of the insoluble pool into soluble proteins (disassembly) is triggered by a kinase activity and the assembly of insoluble pool is induced by the dephosphorylation of soluble proteins by a phosphatase [23]. The turnover term is assumed to be of the Michaelis-Menten form stated as:

AE
AðSÞ À DðIÞ ≔ AE k ass ðÁÞS where k ass (Á) (resp. k dis (Á)) is the maximum rate of assembly of the soluble pool (resp. disassembly of insoluble pool) and k S (resp. k I ) is the concentration at which the assembly (resp. disassembly) rate is half of k ass (Á) (resp. k dis (Á)). When Michealis-Menten dynamics are used, the model is called nonlinear. Both rates can either be constant or space-dependent describing the intracellular localization of the post-translational modification enzymes.
Profiles of assembly and disassembly rates. As previously mentioned, the assembly and disassembly rates, k ass (Á) and k dis (Á), used in the linear and nonlinear models can be constant or space-dependent functions. The profile (shape) of the space-dependent function k ass (x) is derived from the spatial profile of regions of assembly (Sources) measured in [12] (Fig 2.3). Details of the derivation of the Sources k ass (x) are given in Appendix 2.
Two types of shapes for k dis (x) are assumed to represent two types of localization of the disassembly in cells. First, similarly to the assembly rate, the profile of the disassembly rate is deduced from the experimental data published in [12] (Fig 2.3); the spatial profile of the disassembly regions, Sinks, is used to build the shape of the first space-dependent disassembly rate. This disassembly rate is called of type Sinks. Second, it is assumed that disassembly is localized around the nucleus; a mollified step-function is designed to describe this assumption. This second space-dependent disassembly rate is called of type Mollify. Details on the derivation of the two k dis (x) rates are given in Appendix 3. Profiles of the active transport speeds u(x) and v(x) for the insoluble pool both of which are derived from the experimental measurements in [12]. The function u(x) is derived from the profile of speeds and v(x) is approximated as the average value of these speeds. Details on the derivation of u(x) and v (x) are given in Appendix 1.
For the sake of illustration, the two profiles of k ass (Á) and the three profiles of k dis (Á) are shown in Fig 4. Accounting for the three modes of transport and considering the turnover described by either linear and nonlinear models with 6 possible combinations of the profiles for the assembly and disassembly rates, a collection of 36 scenarios (models) is defined; each scenario follows the general form stated by system (1). Details of the 36 scenarios are given in Fig 5. All scenarios are considered with the same initial conditions given by where f 0 (x), defined in (17), is chosen to be a mollified version of the profile of the assembled keratin at time t 0 = 24 hours averaged over all the normalized cells (Fig 6). Details of the derivation of f 0 (x) are given in Appendix 4. Initial conditions describe the observed fact that the soluble pool (resp. insoluble pool) represents 5% (resp. 95%) of the total keratin material [24]. All scenarios are considered with boundary conditions describing the impermeability of the plasma membrane for the keratin material where J S is the flux of the soluble pool and J I is the flux of the insoluble pool as defined below system (1). The model parameters used in all the scenarios are listed in Table 1.

Comparison between mathematical models and experimental data
Parameter estimation. Let p denote the set of all the model parameters for each scenario. To estimate the optimal set of parameter values p for each scenario, the solution of each Possible profiles for the assembly and disassembly rates. 4.1: k ass = 10 −3 s −1 when the linear model is used (resp. μM.s −1 when the nonlinear model is used) and k ass (x) is computed using (14). 4.2: k dis = 10 −3 s −1 when the linear model is used (resp. μM.s −1 when the nonlinear model is used) and k dis (x) are computed using (15) for Sinks and (16) for Mollify, respectively. Details on the derivation of k ass (x) and k dis (x) rates are given in Appendix 2 and Appendix 3, respectively. Parameter values in (14), (15) and (16) are chosen in such a way that their profiles give the same total amount of assembly / disassembly over the spatial domain Ω.  scenario is compared to experimental data using the following objective function (an estimation of the distance between the experimental data and the model response): where t final is equal to 48 hours. The experimental data at t final is represented by f final (x) and the model solution at t final of the considered scenario evaluated with the parameter values p is I(x, t final , p). To obtain the model solution for each scenario, the corresponding system is numerically integrated from t 0 = 24 hours to 50 hours using the MATLAB solver for partial differential equations, pdepe [25]. Solutions are computed at N x locations of the spatial domain O with N x = 200. To carry out computations, the raw data in which the concentration of the assembled keratin material is known at 623 spatial locations is approximated by f final (x) defined in Appendix 4 by (18); f final (x) is the fit of the average profile of the assembled keratin measured after 48 hours on the normalized cells (see the black profile in Fig 2.1). The estimation of the parameter values for the 36 scenarios, i.e. the determination of parameter valuesp that provide the best fit to the experimental data, is done by minimizing the objective function, C(p), such that Minimization of the objective function is done by a parallelizable genetic algorithm described in [26].
In this study, only constant parameter values of the turnover reactions are optimized. When the linear model is considered, for each scenario, two parameters k ass and k dis are estimated. When the nonlinear model is considered, four parameters, k ass , k S , k dis and k I , are estimated for each scenario. In all scenarios, the diffusion coefficients for the soluble and insoluble pools as well as the active transport speeds v(x) and u(x) are considered as fixed/known parameters or functions and are set to values measured in previous studies [9,11,12] (Table 1). The diffusion coefficients are taken as D S = 0.88μm 2 /s and D I = 9.5 × 10 −4 D S , and the "constant speed" u in v(x) as defined in (12) is set to be u = 0.0025μm/s [9,18].
Model selection. In order to select the best model out of the 36 scenarios, we use an information-theoretic approach. Specifically, we employ the Akaike information criterion (AIC) [27,28] to select from all the scenarios, the best model that best captures experimental observations given the collection of models considered in this study. Since we use the Least Squares principle to estimate the values of the constant free parameters (2 free parameters for linear models and 4 for nonlinear models), the Akaike criterion for Scenario i, AIC i , takes the following form: is the estimate of the variance with C i ðpÞ the residual (9) estimated for Scenario i and N x the size of the sample (N x = 200). K is the number of estimated parameters; i.e. the number of free parameters in Scenario i plus one for the estimate of the variance (K = 3 for linear model and K = 5 for nonlinear models). The scenario with the lowest AIC value is the best model. The AIC selects a model with the least number of parameters that best fits experimental observations. To rank and compare scenarios the Akaike weights w i are calculated and these are known as the weights of evidence in favor of Scenario i being the actual best model given the experimental data and the collection of scenarios considered. The Akaike weights are expressed as follows: where R represents the number of scenarios considered (R = 36) and Δ i being the difference in AIC with respect to the AIC of the preferred scenario min i AIC i = AIC min . It must be noted that the Akaike weights w i sum to 1 and are interpreted as the probability that Scenario i is the best model given the experimental data and the collection of scenarios considered. The models, ranked from the largest to the smallest Akaike weights, whose Akaike weights sum to 0.95 form the confidence set of the models that captures, more faithfully, experimental data. The ratio of the Akaike weights w i /w j (also known as the evidence ratio) is used to compare model pairs. Furthermore, the relative importance of a process can be estimated by summing the Akaike weights of all scenarios involving the process of interest. We will denote by w þ Ã the sum of the Akaike weights of the scenarios including the process of type Ã . This sum can be interpreted as the probability that the process of type Ã is the best type of the process given the experimental data and the collection of models considered.

Best scenario
We employ a two-step process in order to find the best scenario. First, the best fit for each scenario is found by minimizing the objective function (9). Secondly, the Akaike information criterion (10) is used to select the best of the best fits out of all the scenarios.
The best fit for each of the 36 scenarios is presented in Fig 7 in the following order: 1. Row-wise: • Scenarios in the first three rows (1 to 3) have linear terms for turnover.
• Scenarios in the last three rows (4 to 6) have Michaelis-Menten type turnover terms. • Rows 1 and 4 are scenarios with constant disassembly rate.
• Rows 2 and 5 are scenarios with localized disassembly rate of type Sinks.
• Rows 3 and 6 have scenarios with localized disassembly rate of type Mollify.
2. Column-wise: • The first two columns (1 and 2) include scenarios with diffusion alone without drift.
• Columns 3 and 4 display scenarios with drift with an almost constant speed.
• The last 2 columns (5 and 6) are scenarios with drift with variable speed.
• Even columns (2, 4 and 6) are scenarios with localized assembly rate of type Sources.
According to the AIC i values, the best model, the best of the best fits, is identified as being Scenario 21, AIC 21 = min i AIC i (3 rd column of Table 2). Since Scenario 21's Akaike weight is over 0.95 (5 th column of Table 2), it is the only model out of the set considered that satisfies the confidence criterion; Scenario 21 matches very well experimental observations and data. The second best model is Scenario 31. The evidence ratio of scenarios 21 and 31 w 21 /w 31 is equal to 36.67 × 10 6 ; i.e. Scenario 21 is about 36 millions times more likely than Scenario 31 to be the best model given the experimental data and the collection of models considered. We consider this to be strong evidence in support of Scenario 21. Scenario 21 is characterized by an inward drift with an almost constant speed for the insoluble pool, turnover terms of Michaelis-Menten type, a constant assembly rate and a disassembly rate of type Mollify. The profiles of assembly and disassembly rates are displayed in Fig 8. For this scenario, the estimated optimal parameter values are k ass = 9.3819μM/s, k S = 570.73μM, k dis = 0.9998μM/s leading to k max = 1.976μM/s in (16) and k I = 976.07μM. Based on the Michaelis-Menten constants for the assembly and disassembly processes, since k S < k I , an enzyme that would be involved in the solubilization of the assembled keratin material requires a higher substrate concentration to achieve a given reaction speed than an enzyme that would be involved in the assembly of soluble proteins. Snapshots of the soluble and insoluble pool profiles taken every 30 minutes from t 0 to t final are displayed in Fig 8. After only 2 hours, the solution stabilizes to its final profile. Scenario 21 preserves the repartition of the keratin material between the soluble and insoluble pools over time, about 95% of the keratin material is assembled to form the insoluble pool. Finally, the characteristic time scales of the passive transport (τ Diffusion ), active transport (τ Drift ) and turnover (τ Reaction ) for Scenario 21 are estimated by using an adimensionalization of system (1) corresponding to Scenario 21. Details on calculations are given in Appendix 5. The time scales of the processes included in Scenario 21 are ordered as follows: Using Peclet's number (Pe = τ Diffusion /τ Drift ) 1), it is found that the transport of the assembled keratin material by drift is faster than by diffusion, the dominant mode of transport is the active transport. Using Damköhler number (Da = τ Drift /τ Reaction ) 1), it is found that the active transport time scale is greater than the reactive time scale; overall, Scenario 21 is controlled by active transport. The complete ranking based on the Akaike weights ω i of all the scenarios is given in the 6 th column of Table 2. It is worthwhile remarking that differences in AIC i , Δ i , could have been used for ranking purposes; in this case, we would have obtained the same ranking. The rationale of using ω i is that it allows us to quantify how preferably each candidate model is via the normalization to 1.

Importance of the type of process
Using the sums of the Akaike weights, w þ Ã , we investigate several questions to evaluate the relative importance of the different types considered for a process (Tables 3 and 4). The set of the The numerical value 0 in the 5 th column denotes a value lower than 10 −12 . In the 6 th column, "Rank" denotes the ranking of scenarios, i.e. the descending order of Akaike weights w i . The use of weights allows the "quantitative" comparison of the adequacy of scenarios because of the normalization to 1.  Table 3. Importance of the type of process.
Type of Transport Type of Turnover Term Type of Assembly Type of Disassembly Combinations of assembly/disassembly rate profiles-Constant assembly and disassembly rates vs constant assembly rate and disassembly rate of type Sinks vs constant assembly rate and disassembly rate of type Mollify vs assembly rate of type Sources and constant disassembly rate vs assembly rate of type Sources and disassembly rate of type Sinks vs assembly rate of type Sources and disassembly rate of type Mollify. Top: The numerical value 1 denotes that the combination of processes of interest is in Scenario i. Bottom: w þ Ã denotes the sum of Akaike weights of scenarios including the process combination as indicated at the top. 36 scenarios is partitioned into categories with respect to the types of processes considered. For instance, considering the transport process, the collection of the 36 scenarios is partitioned into 3 categories (Fig 5): scenarios with no drift (Scenarios from 1 to 12), scenarios with an almost constant speed drift (Scenarios from 13 to 24) and scenarios with a variable speed drift (Scenarios from 25 to 36). The sum of the Akaike weights of each category is then calculated, compared and ordered to characterize which type is more likely to be present. In what follows the sign > denotes the relative importance, measured in probabilistic terms, of the type of process.
• What is the most likely type of transport for the keratin material in cells given the experimental data and the model collection considered? Is it that there is no drift of the insoluble pool (diffusion only for both pools), drift with an almost constant speed for the insoluble pool or drift with a variable speed for the insoluble pool? Each category includes 12 scenarios (see 3 rd to 5 th columns in Fig 5). From the Akaike weights, it is obvious that scenarios including diffusion only (with no drift) are not supported by experimental data since w þ NoDrift ¼ 0 ( Table 3). Diffusion alone is not enough to explain the experimental data. Given the data and the model collection, the type of transport can be ordered as follows (Table 3): Drift with almost constant speed > Drift with variable speed ) No drift: • What is the most likely type of turnover between the soluble and insoluble pools given the experimental data and the model collection considered? Is linear exchange or Michaelis-Menten type (underlying an enzyme activity) more likely? Each category includes 18 scenarios (see 6 th to 7 th columns in Fig 5). From Table 3, enzymatic activities are more likely to be present. Hence Nonlinear exchange ) Linear exchange: • What is the most likely rate profile of the assembly process of the keratin material in cells given the experimental data and the model collection considered? Is a constant assembly rate or assembly mainly localized at the cell membrane periphery more likely? Each category includes 18 scenarios (see 8 th to 9 th columns in Fig 5). From Table 3, the non-localized rate of assembly is preferred to the rate of the localized assembly; w þ CstAss > w þ Source . Hence Non-localized assembly ) Assembly localized at cell membrane periphery: • What is the most likely rate profile of the disassembly process of the keratin material in cells given the experimental data and the model collection considered? Is a constant disassembly, disassembly of type Sinks or disassembly localized around the nucleus more likely? Each category is composed of 12 scenarios (see 10 th to 12 th columns in Fig 5). From Table 3, none of the scenarios that include the disassembly rate of type Sinks is supported by experimental data.
Disassembly localized around the nucleus > Non-localized disassembly ) Sinks-type profile: • What is the most likely combination of the assembly and disassembly rate profiles given the experimental data and the model collection considered? Is a combination of constant assembly and disassembly rates, a constant assembly rate and disassembly rate of type Sinks, a constant assembly rate and disassembly rate of type Mollify, an assembly rate of type Sources and constant disassembly rate, an assembly rate of type Sources and disassembly rate of type Sinks or an assembly rate of type Sources and disassembly rate of type Mollify more likely? Now, interactions between the assembly and disassembly processes are investigated. Six categories are defined. In each category, there are 6 scenarios as listed in the top of Table 4.
Overall, as in Scenario 21, the combination of non-localized assembly and disassembly localized around the nucleus is preferred (Table 4).
Non À localized assembly and disassembly localized around the nucleus > Non-localized assembly and disassembly ) Any other combination:

Discussion
The primary aim of the present work is to investigate, through mathematical modeling driven by experimental observations, mechanisms contributing to the organization of the keratin material in epithelial cells and subsequently to test the biological model for the keratin dynamics proposed by Leube and Windoffer et al. in 2011 [10, 11]. From first principles, we formulate a collection of mathematical models capturing various combinations of biological processes describing the spatio-temporal dynamics of the keratin material in epithelial cells. By using techniques in parameter estimation, we find optimal reaction kinetic parameter values for each model that best captures experimental observations. We go one step further and employ an information-theoretic approach for model selection to determine the best model among all the best fit models that captures the key processes of the experimental data.
As previously highlighted, the model framework and the description of processes considered are driven by experimental data and observations. Experimental data used in this study provide the spatial distribution of the assembled keratin concentration at 24 and 48 hours. As only macroscopic information on the keratin organization is available, the models only describe the keratin material as soluble or assembled and consider exchanges between these soluble and insoluble pools combined with transport events. For instance, as experimental data identify the existence of regions with preferential assembly and disassembly (Fig 2.3), the assembly and/or disassembly processes are assumed to be localized in some scenarios. When observing the perpetual inward motion of the keratin network (see S1 Video), the "natural assumption" for the transport of the assembled keratin would be the existence of an inward active transport. However, we decide to consider scenarios with only passive transport (no active transport). Why? First, it is well known that combining appropriate reaction terms and diffusion (only) can lead to the emergence of complex behavior such as traveling waves and/or pattern formation (see some examples in [29]). Secondly, to reinforce the predictive power of our study. In our approach, we do not only design mathematical models but also calibrate models (parameter estimation) and then evaluate (model selection) how each model performs. To highlight and confirm the importance or existence of a given process, scenarios in which the process is absent must also be considered and evaluated. Our strategy of considering "unrealistic scenarios" and "realistic scenarios" in the collection of models to be evaluated mimics the biological experimental protocols such as knockout and controls. For instance, it has been shown that the keratin assembly / disassembly depend on post-translational modifications of keratins due to enzymatic activities [23]. In our study, a total of 36 scenarios that divide into 18 scenarios with a turnover of type linear (no enzymatic activities) and 18 scenarios with a turnover of type nonlinear (enzymatic activities) are investigated. There is a one-to-one correspondence between the 18 scenarios of the 2 groups. When model performances (Akaike weights) are compared, scenarios having a nonlinear turnover perform better than the corresponding ones with a linear turnover. Hence, the enzymatic activity is detected by the model selection as existing but also preferable. This outcome allows us to judge that our approach combining the mathematical modeling and model selection performs correctly, it gives us the confidence and trust in our conclusions.
Following this methodology, it turns out that Scenario 21 was the best of the best fits. Scenario 21 is in good agreement with the biological model proposed in [10,11]. Scenario 21 and the biological model share the following common key features: diffusion of the soluble pool, inward active transport of the insoluble pool and disassembly of the insoluble pool localized around the nucleus. Moreover, as experimentally observed, Scenario 21 preserves the repartition of the keratin material over time; the keratin material is mainly insoluble in epithelial cells. Both the proposed biological model and Scenario 21 hypothesize an inward active transport for the assembled keratin material. Examining the Akaike weights of the models with no active transport (Scenarios 1 to 12 in Table 2, 1 st and 2 nd columns in Fig 7 and 1 st to 3 rd columns in Table 3), we go one step further and show that the active transport is a requirement to explain the experimental data and that the experimentally observed inward motion of the assembled keratin is not an emergent phenomenon but it is due to an active transport. Furthermore, comparing the characteristic time scales of processes, we establish that the keratin dynamics are mainly controlled by the active transport in Scenario 21. Interestingly, Scenario 21 concurs with the proposed biological model about the existence of a disassembly localized in the perinuclear region. An experimental protocol must now be developed to further work out the details of this conclusion. It is worth pointing out that some characteristics of the proposed biological model were not tested due to the form of the mathematical models. For instance, the scenarios considered here describe the turnover in terms of the assembly of soluble proteins and disassembly of aggregated proteins. In the biological model, the nucleation of filaments, i.e. the initiation of filaments, is explicitly described and localized at the cell periphery. However, the nucleation process is not explicitly described in any of the present models. It is worth mentioning that at the beginning of this work a nucleation term was included in the models; the effect of this term did not change significantly the results; hence, to reduce the complexity of models and the number of parameters the nucleation term was subsequently dropped. Investigation, by mathematical modeling, of the nucleation process is currently being carried out in a separate and ongoing study.
When only scenarios with a variable drift are considered (Scenarios 25 to 36, 5 th and 6 th columns in Fig 7), the preferred combination of assembly and disassembly rate profiles is a constant rate of assembly and disassembly as in Scenario 31, which is the second best model. The best model, Scenario 21, and the second best model, Scenario 31, share the non-compartmentalization of the assembly process. However, in Scenario 21 the active transport has an almost constant speed whereas in Scenario 31 the speed is variable. For the disassembly, this process is non-localized in Scenario 31 whereas it is localized at the perinuclear region in Scenario 21. The spatial variability of the speed u(x) in Scenario 31, in particular, the almost-zero speed at the nucleus locations (see for x 2 [−7.5, 7.5] in Fig 3) "compensates" for the non-compartmentalization of the disassembly process. After comparing the features of Scenarios 21 and 31, we inspect their profiles. The profile obtained with Scenario 21 fits very well the experimental data on non-perinuclear locations (see for x = 2 [−7.5, 7.5] in Fig 7.33). On the other hand, the profile resulting from Scenario 31 fits very well experimental data on perinuclear locations (see for x 2 [−7.5, 7.5] in Fig 7.23). As an improvement of Scenario 21, we would expect that a wider decay in speeds around the nucleus modeled, for instance, with a smaller value of a in (12) would result in a better fit of the perinuclear region.
Scenarios 29 and 35 include the variable speed measured from experimental data, a localized assembly rate having the spatial profile deduced from Sources and a localized disassembly rate whose shape is derived from the profile of the Sinks. All the characteristics extracted from the experimental data are included in Scenario 29 (linear model) and Scenario 35 (nonlinear model). If one wanted the best model that takes into account all the experimental features, then Scenario 29 or 35 would be the best scenario. Scenario 29 is ranked 27 th and Scenario 35 is ranked 6 th (Table 2). Similarly to the general trend, for the same set of assumptions, the use of the Michaelis-Menten type turnover term generally provides a better representation of experimental data than the use of the linear turnover term. The evidence ratio of Scenarios 35 and 29 ω 35 /ω 29 is ridiculously large; Scenario 35 is much more adequate to represent the experimental data than Scenario 29. However, Scenario 35 is still only ranked 6 th . The failure/mismatch of Scenario 35 might be explained by the redundancy of the information existing in the variable speed and the net assembly/disassembly region profiles extracted from the experimental data. Furthermore, according to the Akaike weights, disassembly rates of type Sinks (the type deduced from experimental data) are less likely to occur given the experimental data and the collection of models considered. A similar conclusion is obtained for the assembly rate of type Sources. This could be a consequence of our too conservative interpretation of the regions of preferential assembly or disassembly. In our assumptions, in zones of preferential assembly, the disassembly rate is set to be about null and vice versa (Fig 2.3).
In summary, to model the keratin dynamics in epithelial cells we characterized the keratin material into two pools, the soluble and the insoluble pool; and, the events considered are the turnover and transport for both pools. The modeling assumptions used, for instance, the diffusion of the soluble pool, are based on biological observations and experimental data. The collection of the models considered in this study is designed to answer a set of questions such as "what is the mode of transport of the keratin material in epithelial cells?". After optimizing parameter values, model selection and evaluation methods applicable to non-nested models are used to discriminate between the candidate models, identify the best model and quantify how models under consideration are adequate to explain the experimental data. Note that the ranking of the models (scenarios) and the relative importance of the different types of processes (Tables 2-4) are only valid in the context of the experimental data and the set of the candidate models considered here. For instance, considering other hypotheses such as the non-negligence of the anterograde motion of the assembled keratin material or the stabilization (protection against disassembly) of the keratin filaments involved in the nuclear cage would have led to a different collection of scenarios in which our best scenario could have failed to be the best one. Furthermore, as some modeling assumptions are directly derived from experimental data, missing information in experimental data might have been prejudicial for the correct approximation / modeling of processes; for instance, missing information about the speed of assembled keratin in the cell periphery result in small values for the variable speed close to the plasma membrane (Fig 2.2). Keeping in mind the limitations of our approach, important conclusions have been reached such as • an active transport of the assembled keratin material is required, thereby confirming recent experimental observations [10,11], • enzymatic activities regulating the assembly / disassembly are more likely to occur, • the assembly process is more likely to be non-compartmentalized in cells, • last but not least, a unique best model strongly supported by experimental data is identified, this scenario is in good agreement with the biological model previously proposed in [10,11]. Interestingly, the best scenario supports the perinuclear localization of the disassembly process hypothesized in the biological model.
Mathematical models of the keratin intermediate filament organization in cells were previously proposed [30][31][32][33][34]; however, none of these models included the effects of transport of the assembled material on its organization. More importantly, only the behavior of models were validated qualitatively, no comparisons to experimental data were carried out. It is worth noting that other studies of the intermediate filaments dynamics combining mathematical modeling and experimental data approaches were carried out but on neurofilaments in neurons (see for example [35,36]).

Conclusion
Given the experimental data published in [12], through modeling and simulations, we investigate the effects of the interplay between turnover and transport on the keratin spatio-temporal organization in epithelial cells. Out of all the scenarios investigated, a scenario strongly supported by experimental data is found that best captures most of the hallmarks of the experimental observations. This scenario predicts the diffusion of soluble keratin, an inward active transport of the assembled keratin and the disassembly localized around the nucleus triggered by enzymatic activities as well as the assembly process that is non-compartmentalized over the cell. The value of our models is reflected in their predictive nature; first, the approach predicts the localized disassembly at the perinuclear region and second, that the experimentally observed inward motion is not an emergent behavior but that it is an inherent property of the organization of the keratin material in epithelial cells and it is due to an active transport.

Appendix 1
Based on experimental observations, the speed of the assembled keratin material is assumed to decay to almost zero around the nucleus. The nuclear envelope positions are at x = ±7.5μm. In order to describe the decay of the speed around the nucleus such that it is almost constant, the following function is used: with a = 0.05 and u being in the range given in Table 1. For numerical simulations, u is set to 0.0025μm/s which represents the average value of speeds measured in [12]. For the variable speed case, a symmetrical function over the spatial domain O is sought to describe the speed u(x). Averaging over the symmetrical spatial locations values of the average speed measured in [12] (Fig 2.2) and curve fitting with a sum of Gaussian functions of these values, an estimate of u(x) is obtained and expressed as follows: with the following coefficients: a 1 = 0.003372, b 1 = 17.39, c 1 = 7.577, a 2 = 0.003378, b 2 = −17.41 and c 2 = 7.546. The estimate u(x) is almost zero at the nucleus-locations and is symmetrical around the center of the cell (Fig 9). For numerical simulations, when needed, the following function with a = 1 is used as a "smooth analogue" of sgn(x).
Recall the parameter values obtained from the parameter estimation: k ass = 9.3819μM/s, k S = 570.73μM, k I = 976.07μM/s and k max = 1.976μM/s in (16) that correspond to a constant disassembly rate of level k dis = 0.9998μM/s. Hence, from the definition of k max , k dis (Az) can be approximated by k dis = 0.9998. The values of the fixed parameters are D S = 0.88μm 2 /s, D I = D S with = 9.5 × 10 −4 and u = 0.0025μm/s. The following estimates are then obtained: • Adimensional parameters: k dis (Az)/k ass = k dis /k ass = 0.1 and k I /k S = 1.7; • Diffusion time scale: τ Diffusion = ' 2 /D. To determine what mode of transport (passive vs active) and what process (transport vs reaction) dominate the dynamics of the keratin material, the Péclet and Damköhler numbers are used: • Pe = (u')/(D S ) % 135 ) 1. Diffusion is small compared to active transport for the insoluble pool, the transport of the assembled keratin material by drift is faster than by diffusion, active transport is the dominant mode of transport; • Da = ('k ass )/(uk S ) % 296 ) 1. Active transport time scale is greater than the reaction time scale; the overall process in controlled by active transport.
Supporting Information S1 Video. Dynamics of the keratin network in a cell. Time-lapse fluorescence microscopy of hepatocellular carcinoma-derived PLC clone PK18-5 stably expressing fluorescent fusion protein HK18-YFP [8] depicting the dynamic properties of the keratin filaments over a time period of 15 hours. Bar 10 μm. (AVI)