The role of the encapsulated cargo in microcompartment assembly

Bacterial microcompartments are large, roughly icosahedral shells that assemble around enzymes and reactants involved in certain metabolic pathways in bacteria. Motivated by microcompartment assembly, we use coarse-grained computational and theoretical modeling to study the factors that control the size and morphology of a protein shell assembling around hundreds to thousands of molecules. We perform dynamical simulations of shell assembly in the presence and absence of cargo over a range of interaction strengths, subunit and cargo stoichiometries, and the shell spontaneous curvature. Depending on these parameters, we find that the presence of a cargo can either increase or decrease the size of a shell relative to its intrinsic spontaneous curvature, as seen in recent experiments. These features are controlled by a balance of kinetic and thermodynamic effects, and the shell size is assembly pathway dependent. We discuss implications of these results for synthetic biology efforts to target new enzymes to microcompartment interiors.

and a 2 the area per shell subunit (measured at the inner surface of the shell), and ρ c the cargo density (which we assume is approximately its liquid density). Shells assemble from a solution of free pentamers, hexamers, and cargo molecules with concentrations ρ p , ρ h , and ρ c .
The total free energy density is given by where the index α runs over free pentamers (p), hexamers (h), and cargo molecules (c), v 0 = a 3 is a standard state volume, ρ(n) is the concentration of shells with n subunits, G(n) is the free energy in such a shell arising from shell-shell and shell-cargo interactions, and n min is the minimum shell size allowed by geometry (e.g. 12 pentamers). We then minimize f tot with respect to {ρ n }, subject to the constraint that the total concentrations of pentamer, hexamer, and cargo molecules ρ T p , ρ T h , and ρ T c are fixed: The minimization results in the law of mass action for concentrations of shells [B3], [42]: where Ω(n) is the excess free energy which includes the mixing entropy penalty associated with removing subunits and cargo particles from solution, with chemical potentials µ α = k B T ln (v 0 ρ α ) for α = {p, h, c}. We define the interaction free energy G(n) as: with ∆G p = n p (g ph + g pc ) (provided pentamers are present) and g ph and g hh as the shell shell-shell binding free energy per pentamer or hexamer (we assume the shells are large enough that there are no direct pentamer-pentamer interactions), g pc and g hc the shell-cargo interaction free energy strengths, µ liq c the chemical potential of the cargo subunits within the packaged globule, and τ the surface tension of the cargo globule. If there are no pentamers present, then ∆G p accounts for the 12 pentameric vacancies.
The term E elastic gives the elastic energy of the shell arising from bending and stretching deformations, including the contributions of the 12 disclinations required by topology. In the case of a fluid membrane, the bending energy is given by the ratio of its curvature radius R to its spontaneous curvature R 0 by the Helfrich energy, with κ and κ G the mean and Gaussian curvature moduli. The deformation energy for an elastic icosahedral shell without spontaneous curvature was derived by LMN [B2] and then approximately extended to include spontaneous curvature by NBG [B1]. The behavior depends on the dimensionless Föppl-von Kármán number (FvK), γ = Y R 2 /κ s with Y the 2D Youngs modulus, which gives the relative importance of bending and stretching. Stretching energy dominates over bending when γ > γ B ≈ 130, driving buckling of the shell [B1, B2]. Below the buckling threshold, the elastic energy is given by where γ 0 = Y R 2 0 /κ s is the FvK for a shell at its minimum energy size (R = R 0 ), and the first term gives the energy arising from the elastic interactions between the 12 disclinations for an icosahedral structure, with B ≈ π/3 a numerical constant [B2, B5]. The elastic energy from the defect interactions grows quadratically with shell size, until γ B when it becomes favorable to screen the interaction by buckling. Above this threshold, the elastic energy in the absence of spontaneous curvature is given by [B2] We omit the (lengthy) expression for the case of non-zero spontaneous curvature above buckling [B1], since in the present paper we focus on the sub-buckling case for simulations with spontaneous curvature. We will consider buckling of shells with spontaneous curvature in a future work. Mean shell size. The mean shell size can be obtained from Eqs. (S2.2) and (S2.3) as a function of the chemical potentials µ p , µ h , µ c using n = nρ(n)/ ρ(n). (S2.8) Alternatively, since the total concentrations of each species ρ T p , ρ T h , and ρ T c are the usual experimental control variables, it is convenient to numerically solve for the three unknown chemical potentials at fixed total concentrations. Shell size distribution for subunits with no spontaneous curvature, in the limit of excess of hexamers. In this section we calculate the shell size distribution corresponding to the simulation results on hexamer subunits without spontaneous curvature. Based on the simulation results, we restrict the calculation to spherical cargo globules and icosahedral shells, so the complete shell contains n h hexamers and 12 pentameric vacancies. We discuss the limits of this restriction below.
In the absence of cargo, and below the buckling threshold, the excess free energy in Eq. S2.3 is given by with G 0 = 8πκ + ∆G p and ∆µ h = ∆µ h + 6κB/n B with n B = 4πγ B κ s /Y a 2 the threshold buckling size. Eq. (S2.9) has the same form as the free energy of a system of fluid vesicles [B6] (for simplicity we are neglecting the renormalization of κ with shell size). However, allowing for equilibrium between assembled shells and free subunits gives the form of a cylindrical micelle [B3], with an exponential shell size distribution Thus shells are polydisperse, with the standard deviation of shell sizes equal to the mean. Significant assembly requires a total subunit concentration exceeding the 'critical concentration' [B7], [94] ρ * ≈ e β(g hh +6κB/nB) . (S2.11) As pointed out in NBG [B1], above the buckling threshold the free energy is unstable due to the presence of the log term in Eq. (S2.7), and the distribution is thus highly polydisperse.
In the presence of cargo, the excess free energy is given by with the chemical potential difference now including shell-cargo interactions, ∆µ h = g hh + g hc − µ h + τ a 2 + 6κB/n B , and the cargo chemical potential difference ∆µ c = µ liq c − µ c . Under conditions of limiting cargo, the system will equilibrate at concentrations of free shell subunits and cargo such that ∆µ h < 0 and ∆µ c > 0, with µ c = k B T log(ρ c v 0 ) and ρ c = ρ T c − ∞ n h =nminρ c n 3/2 h ρ n h accounting for the 'finite-pool' of cargo particles [B8]. The finite pool effect gives rise to a minimum in Eq. (S2.12) , and correspondingly a maximum in the shell size distribution ( Fig. S9). Note that in the thermodynamic limit, the condition ∆µ h < 0 should make the system unstable to other structures with a larger surface-to-area ratio, such as spherocylinders. Indeed, Cameron et al.
[39] observed elongated structures when pentamer proteins were knocked out and RuBisCO was overexpressed. We do not allow for these in the present calculation because we do not observe them in our simulations, either because the system size is not large enough or because the initial coalescence of cargo into a spherical droplet makes these geometries kinetically inaccessible.

Determination of parameter values
Comparing predictions of the equilibrium theory against BD simulation results requires a mapping between the interaction parameters of the theory (g hh , g hc , g ph , g pc , µ liq c , τ , and ρ c ) and simulations (ε HH , ε PH , ε SC , ε CC ). For this purpose, we use the mappings estimated in [42]. Note that these mappings are approximate, and we have not updated them for changes in ε angle (and in the case of flat subunits, the preferred subunit-subunit angle) between the two studies. Moreover, the estimates for subunit-subunit binding affinities (g hh (ε HH ) and g ph (ε PH )) are calculated for subunit dimerization reactions, and thus do not fully account for differences in the translational and rotational entropy of subunits within a complete shell compared to an a dimer. Thus, we can only qualitatively compare the equilibrium theory against the simulation results. However, the fitting parameters independently estimated for subunit-subunit interactions from our measurements of the shell bending modulus described next and in Fig. 6 agree reasonably well with the calculations from Ref. [42].
Estimating the shell bending modulus, κ s . We tune the bending modulus in our computational model by varying the parameter ε angle . However the angular dependence of the subunit-subunit interaction arises from a combination of nonlinear repulsive and attractive potentials, and has sufficient complexity that we could not directly calculate the bending modulus. We therefore obtained rough empirical estimates of the relationship κ s (ε angle ) by measuring the change in the average energy of assembled shells as a function of ε angle and/or shell size. Note that the dependence of κ s on ε angle ) differs for the two versions of the model (with and without spontaneous curvature).
For shells with T =3 spontaneous curvature, we extracted a complete shell containing 98 hexamers and 12 pentamers, along with cargo, which had assembled in a simulation with parameters ε HH =1.8, ε SC =9.0 and ε angle =1. We then performed a set of BD simulations, each at a different value of ε angle but with other parameters fixed, using the complete shell configuration as initial conditions. In each simulation we performed 10 5 time steps to allow relaxation under the new value of ε angle , followed by an additional 5 × 10 4 time steps during which we measured the total energy of the shell, U shell (ε angle ), including all shell-shell attractive and repulsive interactions (but not shell-cargo interactions since these were nearly independent of ε angle ). We then performed two regression analyses to fit the measured dependence of U shell on ε angle according to U shell (ε angle ) =U 0 + U bend (ε angle ) U bend (ε angle ) =C 1 ε p angle + C 2 (S2.13) with p = 1 (linear regression) or p = 1/2 and C 1 and C 2 fit parameters. The constant U 0 estimates the shell energy at ε angle = 0 and thus can be interpreted as the contribution from the attractive interactions and pentamers in their unperturbed configurations, U 0 = n p (g ph ) + (n − n p )(g hh ). The remainder of Eq. (S2.13) captures the variation of shell energy with ε angle , and thus can be interpreted as the bending energy arising from deviations from the shell spontaneous curvature. Fig. S10 shows the fits of Eq. (S2.13) to the simulation data.
We then estimate the bending modulus from U bend according to U bend (ε angle ) = 8πκ s 1 − n n 0 2 (S2.14) with n = 110 subunits in the simulated shell, and n 0 = 32 the number of subunits in a shell with radius equal to its spontaneous curvature R 0 . For ε angle =0.5 nonlinear and linear fits result in κ s =14.5 and κ s =6.2 respectively. Discriminating between these fits (or any other value of p) is challenging because they primarily differ near ε angle = 0 where we are unable to obtain simulation results. Moreover, the calculated κ s depends on the value obtained for U 0 . Thus, we set κ s = 10 ± 5k B T as an approximate average between the two fits. Our simulations of flat subunits explore a wider range of ε angle and shell sizes than those of simulations with spontaneous curvature. Consequently, we observed more significant nonlinear effects, and a higher-order dependence of elastic energy on shell size than accounted for in Eq. (S2.6). Note that these nonlinearities are not consistent with the expected renormalization of bending modulus with shell size [B6], but rather arise from the very large deviations from the preferred curvature R 0 = ∞ for the small shells considered. Therefore, for each shell size considered in Fig. 6, we measured the interaction energy U shell as a function of ε angle following the procedure described above, and then estimated an effective value of κ s from Eqs. (S2.13) and (S2.6) with n/n 0 = 0.
Our estimated bending modulus values are comparable to mechanical properties of carboxysomes measured by AFM. Using AFM nanoindentation experiments on β-carboxysomes, Faulkner et al.
[89] estimated a 3D Young's modulus of E = 0.6 MPa from a linear fit or E = 80 MPa from a Hertzian fit to the nanoindention profiles. These estimates lie below the range of Young's modulus values measured for viruses by nanoindention, E ∈ [100MPa, 2GPa] [B9-B12], thus suggesting that the carboxysome bending modulus lies below the range of corresponding bending modulus values for viruses, κ ∈ [30, 600]k B T .
A lower bound on the bending modulus can be estimated from the linear fit according to thin shell elasticity as [B13] κ s = Eh 3 12 (1 − ν) 2 (S2 .15) with h the thickness of the carboxysome shell and ν the Poisson's ratio. Using h ≈ 4.5 nm estimated from the carboxysome structure [89] and the typical Poisson's ratio for proteins ν = 0.3 [B13] results in κ s = 1.9k B T . This is a crude estimate since the nanoindention profile is better fit by the nonlinear Hertzian model and the effective thickness h typically corresponds to the minimum thickness of the shell rather than its mean thickness; for instance the effective thickness of virus shells has been estimated at h ≈ 2 [B13]. However, from a direct comparison of the estimated Young's modulus values for carboxysomes and viruses, it is reasonable to estimate that the carboxysome bending modulus falls in the range κ s ∈ [1, 25]k B T .