RNA length has a non-trivial effect in the stability of biomolecular condensates formed by RNA-binding proteins

Biomolecular condensates formed via liquid–liquid phase separation (LLPS) play a crucial role in the spatiotemporal organization of the cell material. Nucleic acids can act as critical modulators in the stability of these protein condensates. To unveil the role of RNA length in regulating the stability of RNA binding protein (RBP) condensates, we present a multiscale computational strategy that exploits the advantages of a sequence-dependent coarse-grained representation of proteins and a minimal coarse-grained model wherein proteins are described as patchy colloids. We find that for a constant nucleotide/protein ratio, the protein fused in sarcoma (FUS), which can phase separate on its own—i.e., via homotypic interactions—only exhibits a mild dependency on the RNA strand length. In contrast, the 25-repeat proline-arginine peptide (PR25), which does not undergo LLPS on its own at physiological conditions but instead exhibits complex coacervation with RNA—i.e., via heterotypic interactions—shows a strong dependence on the length of the RNA strands. Our minimal patchy particle simulations suggest that the strikingly different effect of RNA length on homotypic LLPS versus RBP–RNA complex coacervation is general. Phase separation is RNA-length dependent whenever the relative contribution of heterotypic interactions sustaining LLPS is comparable or higher than those stemming from protein homotypic interactions. Taken together, our results contribute to illuminate the intricate physicochemical mechanisms that influence the stability of RBP condensates through RNA inclusion.

where q is the charge (being -0.75e for the different nucleotides: A, C, G, U; and +0.75e for amino acids such as R and K, +0.375e for H, and -0.75e for D and E residues), ϵ r = 80 is the relative dielectric constant of water, ϵ 0 is the electric constant, κ −1 = 795 pm is the Debye screening length, and r ij is the distance separating particles i and j. For these interactions, a Coulomb cut-off of 3.5 nm is employed. The non-bonded interactions between protein/RNA beads are modelled via the Wang-Frenkel potential [3] where representing σ the molecular diameter of each residue/nucleotide and ϵ the interaction strength between distinct amino acids and nucleotides (i and j). µ ij and R ij are constant model parameters set to µ ij = 1 and R ij = 3σ ij for every interaction, while σ ij and ϵ ij are specified for each pair of interaction in Ref. [1]. Finally, bond energy is computed with an harmonic bond potential of the following form: where b is the total number of bonds, r b is the bond distance, k=8.03 Jmol −1 pm −2 is the spring constant and r 0 is the bond reference position, set to 381 pm and 500 pm for protein and RNA bonds respectively. For further details on this force field and the full list of the model parameters please see Ref. [1].

FUS Sequence and PDB of the structured domains
The Uniprot code of the sequence is K7DPS7. The following Protein Data Bank (PDB) codes were used to build the globular structured domains of FUS (residues from 285-371 (PDB code: 2LCW) and from 422-453 (PDB code: 6G99)).

Minimal protein/RNA model
For the minimal coarse-grained simulations, we employ a patchy particle model [4][5][6][7] in which proteins are described by a pseudo hard-sphere (PHS) potential [8] that accounts for their excluded volume: where λ a =49 and λ r =50 are the exponents of the attractive and repulsive terms respectively, and ε R accounts for the energy shift of the pseudo hard-sphere. On top of this, we add a continuous square-well (CSW) potential for modeling the different protein binding sites, therefore mimicking protein multivalency: where ϵ CSW is the depth of the potential energy well, r w the radius of the attractive well, and α controls the steepness of the well. We choose α = 0.005σ and r w = 0.12σ so that each binding site can only interact with another single one. RNA-protein interactions are modeled with a standard Lennard-Jones (LJ) potential [9]: where ϵ LJ measures the depth of potential and σ the excluded volume between proteins and RNA. The Lennard-Jones potential is employed between proteins and RNA beads, while RNA-RNA interactions 2/10 are modelled via the PHS potential [8]. In this way, we model RNA as a self-repulsive polymer of bonded hard spheres, and protein-RNA interactions via sites that are not the protein-protein binding sites, hence, if one protein is bonded to RNA, it can still bind to other proteins. The mass of each patch is a 5% of the central PHS particle mass, which is set to 3.32x10 −26 kg, despite being this choice irrelevant for equilibrium simulations. This 5% ratio fixes the moment of inertia of the patchy particles (our minimal proteins). The molecular diameter of the proteins, both scaffold and cognate proteins, as well as the RNA beads is σ = 0.3405 nm, and the value of ε R /k B is 119.81K. With this model, we express magnitudes in reduced units: reduced temperature is defined as T * = k B T /ϵ CSW , reduced density as ρ * = (N/V )σ 3 , reduced pressure as p * = pσ 3 /(k B T ), and reduced time as σ 2 m/(k B T ). In order to keep the PHS interaction as similar as possible to a pure HS interaction, we fix k B T /ε R at a value of 1.5 as suggested in Ref. [8] (fixing T = 179.71K). We then control the effective strength of the binding protein attraction by varying ϵ CSW such that the reduced temperature, T * = k B T /ϵ CSW , is of the order of O(0.1). The cut-off distance for the interactions in this model are 1.17σ for the PHS and CSW potentials and 5σ for the LJ interactions. The ϵ LJ /k B for LJ interactions is set to 152.5K. This model has been proven to qualitatively reproduce the effect of protein valency in LLPS [5], the enhancement of RNA-mediated LLPS with RNA-binding proteins [10] or the size conservation of condensates through interfacial free energy reduction [7].

Simulation details
Our Direct Coexistence simulations are performed in the NVT ensemble (i.e. constant number of particles (N), volume (V) and temperature (T)), for which we use a Nosé-Hoover thermostat [11,12] with a relaxation time of 5 ps for the Mpipi model simulations and 0.074 in reduced units for the patchy particle simulations. Since all our potentials are continuous and differentiable, we perform all our simulations using the LAMMPS Molecular Dynamics package [13]. Periodic boundary conditions are used in the three directions of space. The timestep chosen for the Verlet integration of the equations of motion is 10 fs for the Mpipi model and 3.7x10 −4 in reduced time units for the patchy particle model. In Table A, we summarise all the simulation details regarding the employed systems for both resolution models.

Computing phase diagrams via Direct Coexistence
To calculate the coexisting densities of the phase diagrams, we employ the Direct Coexistence method [14][15][16]. Within this scheme, the two coexisting phases are simulated by preparing periodically extended slabs of the two phases, the condensed and the diluted one, in the same simulation box. We use an implicit solvent model; accordingly, the diluted phase (protein-poor liquid phase) and the condensed phase (protein-rich liquid phase) are effectively a vapour and a liquid phase, respectively. Once our DC simulations have reached equilibrium, we compute the density profile along the long axis of the box, and thus, we extract the density of the two coexisting phases (as shown in the Supporting Material of Ref. [10]). From the plateau of the condensed phase and the diluted one, we measure the density (avoiding the interfaces between both phases). To estimate the critical point of the phase diagrams, we use the universal scaling law of coexistence densities near a critical point [17], and the law of rectilinear diameters [18]: and where ρ l and ρ v refer to the coexisting densities of the condensed and diluted phases respectively, while ρ c is the critical density, T c is the critical temperature, and d and s 2 are fitting parameters.  Table A. Summary of the simulation details of the employed systems: Total number of proteins (N P ), total number of RNA nucleotides (or RNA beads in the minimal model; N N ), total number of RNA chains (N RN A,chain ), length of the RNA chains (L RN A ), net charge of the system, box dimensions (in x/Å, y/Å, z/Å), and estimated critical temperature (T c in K for the high-resolution Mpipi model and in reduced units for the minimal CG model). Fig A. Snapshot of a pure PR 25 Direct coexistence simulation at T/T c,F U S =0.5. As it can be seen, in absence of RNA, PR 25 cannot undergo LLPS (even at low temperatures). The same colour code employed in Fig. 1 of the main text has been employed here.

Fig B.
Snapshot of a pure cognate system Direct Coexistence simulation at T/T c,Scaf f old =0.6. As it can be seen, in absence of RNA, the cognate protein cannot undergo LLPS (even at low temperatures). The same colour code employed in Fig. 1 of the main text has been employed here.

Results with the HPS & KH models for FUS and PR 25
Additionally to the results presented in the main text of this article, we perform simulations of FUS and PR 25 in presence of RNA with different lengths at a constant RNA/protein concentration employing the hydrophobicity scale (HPS) model for the FUS-polyU systems, and the Kim-Hummer (KH) model for those of PR 25 with polyU. Both models are detailed in Ref. [19]. In these models, the solvent is also implicit as in the Mpipi force field [1]. The simulation details are the same provided for the Mpipi simulations in Section 1D. Here, polyU RNA strands are mimicked as chains of glutamic acid (and so its mass) as shown in Ref. [20]. In Fig D (a) we show how the phase behaviour of FUS and polyU RNA deeply resembles the one observed in the main document using the Mpipi model (Fig 3 (a) of the main text): the critical temperature of FUS-RNA mixtures marginally depends on the length of the added polyU strands. However, for PR 25 -polyU systems (Fig D (b)), there is a remarkable difference of a factor of 2 (almost 200K) between the shortest (20-nucleotide long RNA chains) and the longest ones (800nucleotide long RNA chains) as observed in the main text for the Mpipi model (Fig 3 (b) of the main text).
Moreover, the same intermolecular contact analysis of Fig 4 of the main text was also performed for these simulations, providing similar results to those computed from the Mpipi model calculations.
In Fig E(a), it can be seen how FUS-FUS contacts are mostly responsible of holding the protein condensates (displaying about 4 times more contacts per nm 3 than the RNA-FUS contacts), while in the PR 25 -RNA condensates, heterotypic PR 25 -RNA interactions are predominant respect to those of PR 25 -PR 25 (Fig E (b)). Lastly, the length dependent behaviour of the critical temperature for PR 25 condensates is held as in the case of the Mpipi model (main document , Fig 4(d)), while for FUSpoly-U systems the critical temperature remains roughly constant independently of the RNA chain length.

Computing the number of intermolecular contacts
In Figs 4 and 6 of the main document, we provide the number of intermolecular contacts per unit of volume. To compute such magnitude, we consider two amino acids to be in contact under a distance of 7.7Å, which is the average of all the possible σ ij (Eq. 2) multiplied by a factor 1.2 (typical value in σ at which the attractive LJ/Wang-Frenkel interactions become mild). The RNA-protein interactions are considered as contacts when the distance between a nucleotide and an amino acid is lower than 8.95Å, which is the average σ ij between uridine and any other amino acid, multiplied also by a factor of 1.2. The number of total contacts is averaged throughout entire converged simulations of about 2 µs. Then, the contacts density is obtained by dividing the total number of contacts within the condensed phase by the condensate volume. The density of LLPS-stabilizing intermolecular contacts within the condensates as a function of temperature for FUS+polyU(400-nt) and PR 25 +polyU(400-nt) is provided in Fig. F (a) and (b) respectively. Moreover, the electrostatic vs. non-electrostatic contribution to the attractive potential interactions sustaining the phase-separated condensates as a function of RNA length for FUS+polyU condensates (filled circles) and PR 25 +polyU condensates (empty squares) is given in Fig