A structural dynamics model for how CPEB3 binding to SUMO2 can regulate translational control in dendritic spines

A prion-like RNA-binding protein, CPEB3, can regulate local translation in dendritic spines. CPEB3 monomers repress translation, whereas CPEB3 aggregates activate translation of its target mRNAs. However, because CPEB3 aggregates are long-lasting prions, they raise the problem of unregulated translational activation. Here, we propose a computational model of the structure of the complex between the CPEB3 RNA-binding domain (CPEB3-RBD) and small ubiquitin-like modifier protein 2 (SUMO2). Free energy calculations suggest that the allosteric effect of the CPEB3-RBD/SUMO2 interaction can amplify the RNA-binding affinity of CPEB3. Combined with previous experimental observations on the SUMOylation mode of CPEB3, this model suggests an equilibrium shift of mRNA from binding to deSUMOylated CPEB3 aggregates to binding to SUMOylated CPEB3 monomers in basal synapses. This work shows how a burst of local translation in synapses can be silenced following a stimulation pulse, and explores the CPEB3/SUMO2 interplay underlying the structural change of synapses and the formation of long-term memories.


Introduction to the AWSEM-3SPN2 force field
In this study, all simulations were carried out using the AWSEM(-3SPN2) force field. The AWSEM force field is a predictive coarse-grained model for proteins, whose Hamiltonian $V_{\mathrm{AWSEM}}$ is shown below:

$$V_{\mathrm{AWSEM}} = V_{\mathrm{backbone}} + V_{\mathrm{contact}} + V_{\mathrm{burial}} + V_{\mathrm{DH}} + V_{\mathrm{HB}} + V_{\mathrm{FM}}$$

where $V_{\mathrm{backbone}}$ is a peptide-like backbone term; $V_{\mathrm{contact}}$ is a tertiary residual interaction term; $V_{\mathrm{burial}}$ is a many-body burial term; $V_{\mathrm{DH}}$ is the Debye-Hückel electrostatic term; $V_{\mathrm{HB}}$ is a hydrogen bond term that favors beta sheets; and $V_{\mathrm{FM}}$ is the associative fragment memory term, a bioinformatic term guiding the local-in-sequence interactions. In all simulations, we used the PDB structure 2N1W as the only memory for the SUMO2 protein. We used the PDB structure 2MKJ to model the structure of the free-state CPEB3-RRMs and the PDB structure 2M13 to model the partial structure of CPEB3-ZnF via Modeller. These two structures were then used as the only memories for the RRMs and the partial ZnF. The fragment memories for the rest of the ZnF were obtained from the Protein Data Bank using a sequence alignment algorithm.
For simulations containing the RNA fragment, we used the 3SPN2 force field, a coarse-grained model originally developed for simulating DNA, as the model for the RNA by treating uracil (U) as the thymine (T) base particle in 3SPN2:

$$V_{\mathrm{3SPN2}} = V_{\mathrm{backbone}} + V_{\mathrm{stacking}} + V_{\mathrm{BP}} + V_{\mathrm{CS}} + V_{\mathrm{excl}} + V_{\mathrm{DH}}$$

where $V_{\mathrm{backbone}}$ is a bonded term connecting the three-bead nucleotide backbone; $V_{\mathrm{stacking}}$ is a stacking term between consecutive nucleotides; $V_{\mathrm{BP}}$ is a base pairing term between complementary nucleobases; $V_{\mathrm{CS}}$ is a cross-stacking term between neighboring base pairs; $V_{\mathrm{excl}}$ is an exclusion term; and $V_{\mathrm{DH}}$ is the Debye-Hückel electrostatic term. The interaction between the protein and the nucleotides is modelled by an exclusion term and the Debye-Hückel electrostatic term in the AWSEM-3SPN2 force field. A more detailed introduction to the AWSEM-3SPN2 force field can be found in previous papers. (1-4)

A Debye length of 10 Å is used for all Debye-Hückel electrostatic terms. This value corresponds to a typical physiological solution condition with temperature T = 300 K, ionic strength 0.1 M, and a dielectric constant of 80 for water. (5)

The original AWSEM-3SPN2 only assigns charges to the phosphate particle (-0.6) and four types of residues (Arg: +1, Lys: +1, Asp: -1, and Glu: -1). Here, additional charges were applied to the two zinc coordination sites, Zn1 (Cys654, Cys657, Cys681, Cys684) and Zn2 (Cys671, Cys676, His689, His697), to model the two zinc ions (+0.5 for each coordination residue, summing to +2 per site). The first nucleotide of a nucleic acid chain in the 3SPN2 model lacks the phosphate particle, which is a reasonable approximation for very long nucleic acid chains. However, our simulations used a short RNA chain of five nucleotides. Therefore, we assigned an additional charge of -0.6 to the sugar particle of the first nucleotide of our RNA chains. For the electrostatic interactions between protein and RNA, we set the effective charge of each nucleotide to -1.0.
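As a quick consistency check on the screening length quoted above, the Debye length can be recomputed from the stated temperature, ionic strength, and dielectric constant. The sketch below uses the standard Debye-Hückel expression for a 1:1 electrolyte; the function name is hypothetical and is not part of AWSEM-3SPN2.

```python
import math

# CODATA constants in SI units
EPS0 = 8.8541878128e-12      # vacuum permittivity, F/m
KB = 1.380649e-23            # Boltzmann constant, J/K
E_CHARGE = 1.602176634e-19   # elementary charge, C
N_A = 6.02214076e23          # Avogadro constant, 1/mol


def debye_length_angstrom(temperature_k, ionic_strength_molar, eps_r):
    """Debye screening length in angstroms (hypothetical helper).

    Uses lambda_D = sqrt(eps0 * eps_r * kB * T / (2 * N_A * e^2 * I)),
    with the ionic strength I converted from mol/L to mol/m^3.
    """
    ionic_strength_si = ionic_strength_molar * 1000.0  # mol/L -> mol/m^3
    lam_m = math.sqrt(
        EPS0 * eps_r * KB * temperature_k
        / (2.0 * N_A * E_CHARGE ** 2 * ionic_strength_si)
    )
    return lam_m * 1e10  # m -> angstrom


# Conditions stated in the text: T = 300 K, I = 0.1 M, dielectric constant 80
debye = debye_length_angstrom(300.0, 0.1, 80.0)
print(f"Debye length: {debye:.2f} angstrom")
```

Under these conditions the result falls just under 10 Å, consistent with the rounded value used in the simulations.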
Other parameters are set to the default values in AWSEM-3SPN2 (OpenMM version) and can be found at https://github.com/npschafer/openawsem and https://github.com/cabb99/open3spn2.

Binding energy for SIM/SUMO2 complex
The binding energy $E_{\mathrm{binding}}$ for the SIM/SUMO2 complex is calculated using the equation:

Additional potentials
To model the two zinc ions coordinated by the ZnF domain, we assigned positive charges to the coordination residues. However, this modification of charges cannot account for all the structural effects of the coordinated zinc ions. To confine the structure of the ZnF, we applied an AMH-Go potential ($V_{\mathrm{AMHGo}}$) in all simulations containing the ZnF domain, with $k_{\mathrm{AMHGo}} = 0.3$ kcal/mol and all other parameters the same as the corresponding parameters in the Q value definition. Here structure A represents the ZnF structure during simulations and structure B is the Modeller structure of the ZnF built from the PDB structure 2M13. We only considered residue pairs separated by less than 8 Å in the template structure.
To guide the formation of the intermolecular beta sheet between RRM1-SIM and SUMO2-β2, we applied a spring potential ($V_{\mathrm{rbias}}$) between these two groups of residues in a biased simulation, with $k_{\mathrm{rbias}} = 0.1$ kcal/mol/Å$^2$ and $r_0 = 15$ Å, where $r$ is the distance between the centers of mass of RRM1-SIM (P484-W489) and SUMO2-β2 (G27-R36).
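A spring restraint between two centers of mass can be sketched as follows. This is only an illustration: the harmonic form with a 1/2 prefactor is an assumption (the prefactor convention is not stated here), and the function names are hypothetical rather than taken from the simulation code.

```python
import math

K_RBIAS = 0.1   # kcal/mol/A^2, value from the text
R0 = 15.0       # A, value from the text


def center_of_mass(coords, masses):
    """Mass-weighted mean position of a group of beads."""
    total = sum(masses)
    return tuple(
        sum(m * c[axis] for c, m in zip(coords, masses)) / total
        for axis in range(3)
    )


def v_rbias(group_a, masses_a, group_b, masses_b, k=K_RBIAS, r0=R0):
    """Spring energy between two group centers of mass.

    ASSUMPTION: V = 0.5 * k * (r - r0)^2; the original equation may use
    a different prefactor convention.
    """
    r = math.dist(center_of_mass(group_a, masses_a),
                  center_of_mass(group_b, masses_b))
    return 0.5 * k * (r - r0) ** 2


# Toy example: two single-bead groups 25 A apart, i.e. 10 A beyond r0
energy = v_rbias([(0.0, 0.0, 0.0)], [1.0], [(25.0, 0.0, 0.0)], [1.0])
print(f"{energy:.2f} kcal/mol")
```

With such a soft spring constant, the restraint pulls the two groups together gently instead of forcing a specific strand pairing.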
To run the umbrella sampling simulations for studying the closure motion of the RRMs, we used a biasing potential ($V_{\mathrm{qbias}}$) on the reaction coordinate $Q_f$, with $k_{\mathrm{qbias}} = 10000$ kcal/mol, where $Q_0$ is the minimum of the biasing potential for a simulation window and $Q_f$ is the interdomain Q value of the current structure with respect to the NMR structure of free RRMs.
To simulate the specific binding between the CPEB3-RRMs and their target mRNA, we applied a sequence-specific contact term $V_{\mathrm{ssc}}$ between the protein and the RNA in all simulations containing RNA, with $k_{\mathrm{ssc}} = 0.8$ kcal/mol and $\delta = 1.5$ Å. Here $r_{ij}$ and $r^{B}_{ij}$ are the distances between residue $i$ and nucleotide $j$ in the current structure and in the NMR structure of RNA-bound RRMs, respectively. $\gamma_{ij}$ is the weight for each residue-nucleotide pair and equals $\ln(C_{ij})$, where $C_{ij}$ is the atomic contact number between residue $i$ and nucleotide $j$ in the NMR structure of RNA-bound RRMs. We used a distance threshold of 4.5 Å to define atomic contacts and only considered residue-nucleotide pairs having no less than [...] affinity of RRMs calculated from our free energy profile (∼8 kcal/mol) is comparable to the experimental value (6.4-6.9 kcal/mol).
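The weighting scheme for the sequence-specific contacts can be illustrated with a short sketch. The contact counting and the logarithmic weight follow the description above; the Gaussian well used for the pair energy is an assumption chosen because the text gives a strength and a width, and it is not necessarily the functional form used in the simulations. All function names are hypothetical.

```python
import math

CONTACT_CUTOFF = 4.5  # A, atomic contact threshold from the text
K_SSC = 0.8           # kcal/mol, from the text
DELTA = 1.5           # A, from the text


def atomic_contact_number(residue_atoms, nucleotide_atoms, cutoff=CONTACT_CUTOFF):
    """Count atom pairs within the cutoff (C_ij in the text).

    ASSUMPTION: a pair exactly at the cutoff is counted as a contact.
    """
    return sum(
        1
        for a in residue_atoms
        for b in nucleotide_atoms
        if math.dist(a, b) <= cutoff
    )


def gamma_weight(contact_number):
    """Pair weight gamma_ij = ln(C_ij), as defined in the text."""
    return math.log(contact_number)


def v_ssc_pair(r, r_native, gamma, k=K_SSC, delta=DELTA):
    """Energy of one residue-nucleotide pair.

    ASSUMPTION: a Gaussian well centered on the native distance,
    V = -k * gamma * exp(-(r - r_native)^2 / (2 * delta^2)).
    """
    return -k * gamma * math.exp(-((r - r_native) ** 2) / (2.0 * delta ** 2))


# Toy coordinates (A): two residue atoms and two nucleotide atoms
residue = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
nucleotide = [(0.0, 0.0, 3.0), (0.0, 0.0, 10.0)]
c_ij = atomic_contact_number(residue, nucleotide)
print(c_ij, v_ssc_pair(5.0, 5.0, gamma_weight(c_ij)))
```

Because $\ln(1) = 0$, a pair with a single atomic contact carries zero weight under this definition, so only pairs with several native atomic contacts contribute appreciably.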
To calculate the free energy profiles of RNA dissociation for the protein/RNA complex systems, we ran two-dimensional umbrella sampling for each system. The major reaction coordinate is the value of $V_{\mathrm{ssc}}$, with a corresponding biasing potential $V_{\mathrm{sscbias}}$, where $k_{\mathrm{sscbias}} = 1$ mol/kcal and $V_0$ is the minimum of the biasing potential for a simulation window. For simulation windows with $V_0$ values close to 0, an additional biasing potential $V_{\mathrm{Rbias}}$ was applied to enhance sampling along the order parameter $R$, the distance between the RNA and the RNA-binding pocket in the RRMs. Here $k_{\mathrm{Rbias}} = 0.1$ kcal/mol/Å$^2$, $R_0$ is the minimum of the biasing potential for a simulation window, and $R$ is the distance between the centers of mass of the RNA chain and the RNA-binding pocket (F462, G465, P490, Y504, F506, K541, Q544, F570, K609, K645).
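The umbrella windows for the biasing potentials in this section can be laid out with a small helper, using the window counts and ranges reported in the simulation protocol later in this document (33 windows in $V_0$, 9 in $R_0$, 71 in $Q_0$). The helper name is hypothetical, and treating both end points as window centers is an assumption.

```python
def evenly_spaced(start, stop, count):
    """Return `count` window centers from start to stop.

    ASSUMPTION: both end points are included as window centers.
    """
    step = (stop - start) / (count - 1)
    return [start + i * step for i in range(count)]


# V_ssc bias minima (kcal/mol), with the V_Rbias term turned off
v0_windows = evenly_spaced(-35.0, -3.0, 33)

# R bias minima (nm), with the V_Rbias term turned on
r0_windows_nm = evenly_spaced(1.0, 5.0, 9)

# Q_f bias minima for the RRM closure umbrella sampling
q0_windows = evenly_spaced(0.1, 0.8, 71)

for name, windows in [("V0", v0_windows), ("R0", r0_windows_nm), ("Q0", q0_windows)]:
    spacing = windows[1] - windows[0]
    print(f"{name}: {len(windows)} windows, spacing {spacing:.3g}")
```

Under the inclusive-endpoint assumption, the spacings come out as round numbers (1 kcal/mol, 0.5 nm, and 0.01), which suggests this is how the windows were placed.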

Simulation methods
All simulations were run on the OpenMM platform with the Langevin integrator at a constant temperature of 300 K. We used a simulation time step of 2 fs and a damping time of 1 ps. For all umbrella sampling simulations, we output one sampled structure every 5000 steps.
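The settings above translate into a few derived quantities that are useful when planning the analysis. The sketch below is simple bookkeeping. Treating the friction coefficient as the inverse of the damping time is an assumption about how the damping time maps onto the integrator input, and the times are nominal because the model is coarse-grained. The 8 × 10^6 step length and 1 × 10^6 step equilibration used in the example are the values reported for the umbrella sampling runs later in this document.

```python
TIME_STEP_FS = 2.0             # integration time step, from the text
DAMPING_TIME_PS = 1.0          # Langevin damping time, from the text
OUTPUT_INTERVAL_STEPS = 5000   # umbrella sampling output interval, from the text

# ASSUMPTION: friction coefficient = 1 / damping time
friction_per_ps = 1.0 / DAMPING_TIME_PS

# Nominal time between consecutive saved structures
frame_spacing_ps = OUTPUT_INTERVAL_STEPS * TIME_STEP_FS / 1000.0


def frames_saved(total_steps, discarded_steps=0, interval=OUTPUT_INTERVAL_STEPS):
    """Number of saved structures left after discarding equilibration steps."""
    return (total_steps - discarded_steps) // interval


production_frames = frames_saved(8_000_000, discarded_steps=1_000_000)
print(friction_per_ps, frame_spacing_ps, production_frames)
```

This gives a nominal 10 ps between saved structures and 1400 analysis frames per umbrella window after equilibration is discarded.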

Equilibrium simulation of free RRMs
We ran 20 independent equilibrium simulations of free RRMs, starting from the Modeller structure built from PDB structure 2MKJ. Each simulation was run for $6 \times 10^6$ steps, with one sampled structure output every 10000 steps. All sampled structures in the 20 trajectories were then used for the carte-

Structure prediction of the SUMO2/RBD complex
Using the NMR structure of the SUMO1/CBP-ZnF complex (PDB ID: 2N1A) as a template, we docked the SUMO2 protein to the ZnF of each of the 60 predicted structures of the full-length RBD, obtaining 60 initial structures for the SUMO2/RBD complex structure prediction. For each initial structure, a biased AWSEM simulation with the additional potential $V_{\mathrm{rbias}}$ was run for $6 \times 10^6$ steps. After turning the $V_{\mathrm{rbias}}$ term off, the last frame of the biased simulation was relaxed for $2 \times 10^7$ steps.
The final structures of the relaxed simulations were considered as candidates for further evaluation and selection.

Free energy profiles for RNA dissociation
For the RRMs/RNA system, we used the Modeller structure built from the NMR structure of RNA-bound RRMs (PDB ID: 2MKI) as the initial structure. The RNA sequence used in all simulations is the same as the sequence in PDB structure 2MKI: 5'-CUUUA-3'. For the RBD/RNA system, we attached the ZnF domain to the C-terminus of the RRMs in the RRMs/RNA Modeller structure to obtain the initial structure. For the SUMO2/RBD/RNA system, we first docked the RNA chain to the RNA-binding pocket in the predicted SUMO2/RBD complex. We then ran a biased equilibrium simulation for $8 \times 10^6$ steps with the $V_{\mathrm{sscbias}}$ term on. In this simulation, we set $k_{\mathrm{sscbias}} = 5$ mol/kcal and $V_0 = -35$ kcal/mol so that the sequence-specific protein/RNA contacts were correctly formed in the final structure, which was then used as the initial structure for the later umbrella sampling simulations.
For all these RNA-bound systems, we applied a 2D umbrella sampling method to achieve sufficient sampling, using both $V_{\mathrm{ssc}}$ and $R$ as biasing reaction coordinates. The biasing potentials $V_{\mathrm{sscbias}}$ and $V_{\mathrm{Rbias}}$ are described above. We first turned off the $V_{\mathrm{Rbias}}$ term and set 33 simulation windows with $V_0$ evenly distributed from -35 kcal/mol to -3 kcal/mol. We also turned on the $V_{\mathrm{Rbias}}$ term and set 9 simulation windows with $R_0$ evenly distributed from 1 nm to 5 nm when [...]

[...] For the other combinations, we simply removed unneeded components from that structure to build initial structures. The structural similarity to the NMR structure of free RRMs, $Q_f$, was used as a biasing reaction coordinate. The biasing potential $V_{\mathrm{qbias}}$ is described above. For each system, we set 71 simulation windows with $Q_0$ evenly distributed from 0.1 to 0.8. Each simulation lasts $8 \times 10^6$ steps. The first $1 \times 10^6$ steps are treated as equilibration and discarded in later analysis. Another order parameter, PC0, is used for 2D-WHAM analysis. [...]

Based on Western blot experiments on CPEB3 (7) and Orb2 (8) [...]