In Silico Insights into Protein-Protein Interactions and Folding Dynamics of the Saposin-Like Domain of Solanum tuberosum Aspartic Protease

The plant-specific insert is an approximately 100-residue domain found exclusively within the C-terminal lobe of some plant aspartic proteases. Structurally, this domain is a member of the saposin-like protein family, and is involved in plant pathogen defense as well as vacuolar targeting of the parent protease molecule. Similar to other members of the saposin-like protein family, most notably saposins A and C, the recently resolved crystal structure of potato (Solanum tuberosum) plant-specific insert has been shown to exist in a substrate-bound open conformation in which the plant-specific insert oligomerizes to form homodimers. In addition to the open structure, a closed conformation also exists having the classic saposin fold of the saposin-like protein family as observed in the crystal structure of barley (Hordeum vulgare L.) plant-specific insert. In the present study, the mechanisms of tertiary and quaternary conformation changes of potato plant-specific insert were investigated in silico as a function of pH. Umbrella sampling and determination of the free energy change of dissociation of the plant-specific insert homodimer revealed that increasing the pH of the system to near physiological levels reduced the free energy barrier to dissociation. Furthermore, principal component analysis was used to characterize conformational changes at both acidic and neutral pH. The results indicated that the plant-specific insert may adopt a tertiary structure similar to the characteristic saposin fold and suggest a potential new structural motif among saposin-like proteins. To our knowledge, this acidified PSI structure presents the first example of an alternative saposin-fold motif for any member of the large and diverse SAPLIP family.


Introduction
Pepsin-like aspartic proteases (APs) constitute a family of endopeptidases found in all kingdoms of life [1]. APs of plant origin are generally homologous to members of the A1 family of APs (http://merops.sanger.ac.uk) [2], sharing low pH optima as well as similar primary and bilobal tertiary structures wherein two lobes are separated by a large active site cleft containing two catalytic aspartic acid residues. However, plant APs are unique in that they frequently contain an extra 100-residue domain inserted in the C-terminal lobe, distinguishing them from their microbial and animal counterparts [3]. This extra domain, termed the plant-specific insert (PSI) or plant-specific sequence (PSS) [4][5][6][7], belongs to the saposin-like protein (SAPLIP) family and contains the Saposin B (Sap B) protein domain architecture [8]. Physiologically, SAPLIPs exhibit varied functionalities manifested primarily in their abilities to target, bind and/or perturb membranes, sometimes involving the ability to permeabilize and/or induce vesicle fusion [8][9][10][11]. Examples of SAPLIP function include sphingolipid degradation and antigen presentation [12], haemolytic activity (Na-SLP-1 and Ac-SLP-1) [13], antimicrobial and cytolytic activity (NK-lysin and granulysin) [14,15] and fusion of large unilamellar anionic vesicles in vitro (Sap C and recombinant PSI expressed without the parent AP) [9][10][11]. Recombinantly expressed free-form potato PSI has been shown to display potentially useful functionalities in vitro including antimicrobial activity against both plant and human pathogens [16], as well as anticancer activity against leukaemia cells without having lymphocyte toxicity [17].
Although SAPLIPs share low sequence identity and exhibit a multitude of functions in a variety of organisms, only two discrete SAPLIP conformations have been observed to date. With the exception of granulysin [18], all known SAPLIPs have a characteristic pattern of 6 cysteines that form 3 disulfide bridges. The predominant tertiary structure is a substrate-free closed form first elucidated by NMR structure determination of porcine NK-lysin, and subsequently observed for all known SAPLIPs (see References [13,14,18-22]). This closed form, the classic saposin fold, is distinguished by a compact globular structure consisting of a distorted bundle of 4 or 5 α-helices packed into an oblate spheroid.
By contrast, a second SAPLIP structural variant exists having an extended open conformation, first observed for Sap C bound to SDS micelles, that resembles two side-by-side boomerangs [19].
Unlike the compact structure seen in closed SAPLIPs, lipid-bound Sap C opens in a jackknife-like fashion thereby exposing the normally buried hydrophobic core to accommodate lipid interactions [23]. This V-shaped configuration has also been shown in Sap A bound to various amphiphiles [24]. The V-shaped SAPLIP configuration is also observed in Sap C homodimers in the absence of bound lipids [22]. As with Sap C bound to SDS micelles and Sap A lipoprotein discs, ligand-free Sap C jackknifes open at hinge points at the helix-helix junctions between the first two and last two helices. These hinge points allow Sap C monomers to open up and adopt an extended V-shape configuration forming domain-swapped homodimers. Hydrophobic regions are thus sequestered from their aqueous environment as the two interfaces come together yielding a hydrophobic core within the dimer. This jackknife opening mechanism serves to demonstrate the conformational flexibility of some SAPLIPs afforded by the helix-helix junctions. The ability to open and close allows for both membrane interactions and oligomerization [20,22,24].
Only two PSI structures (PDB IDs 1QDM and 3RFI), both resolved by X-ray crystallography, have been elucidated thus far: the inactive precursor structure of barley (Hordeum vulgare L.) phytepsin (HvAP) [25] and recombinant PSI of Solanum tuberosum (potato) AP [11]. Barley PSI was shown to have the archetypical compact saposin fold wherein the N- and C-termini remained attached to the C-terminal lobe of its parent phytepsin. By contrast, free form potato PSI adopts the less commonly observed open conformation as a homodimer analogous to that of ligand-free Sap C. Like their SAPLIP homologues, PSIs have been shown to induce vesicle leakage and fusion as well as having roles in plant vacuolar targeting [3,11,26,27].
To gain insight into the structural determinants of the PSI's pH-dependence for activity, the present study sought to elucidate and compare the protein dynamics and structural characteristics of the PSI in active and inactive pH conditions. Furthermore, the folding dynamics of the PSI open extended SAPLIP structure were investigated to clarify how the PSI tertiary structure relates to the typical closed SAPLIP fold.

Results
The PSI dimer forms a stable complex regardless of pH
Equilibrium molecular dynamics simulations of the dimer complex in acidic (pH 3.0) and neutral (pH 7.4) conditions revealed that the PSI dimer is stable regardless of pH, evidenced by low and relatively constant root-mean-square deviation (RMSD) values of the dimer trajectories when fitted to the initial coordinates of the crystal structure (Figure 1). As measured at the centre of mass (COM) of the individual monomers within the dimer complex, each monomer maintained steady contact at the dimer interface throughout the time course of each simulation (Figure 2). Further examination of the trajectories showed little fluctuation in the residues comprising helical regions. The Cα root-mean-square fluctuation (RMSF) for helices remained consistent through 100 ns and deviated little from the crystal structure (PDB ID 3RFI), providing further evidence of dimer stability (Figure 3).
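For readers unfamiliar with the stability metrics used above, the following is a minimal sketch of how an RMSD after least-squares superposition and a per-atom RMSF can be computed with NumPy. It is illustrative only and is not the analysis code of the present study, which used the GROMACS tools; the function names `kabsch_rmsd` and `rmsf` are hypothetical, coordinates are assumed to be arrays of Cα or backbone positions in nm, and `rmsf` assumes the frames have already been superposed on a common reference.

```python
import numpy as np


def kabsch_rmsd(mobile, reference):
    """RMSD between two (n_atoms, 3) coordinate sets after optimal superposition."""
    # Remove translation by centring both structures on their centroids.
    p = mobile - mobile.mean(axis=0)
    q = reference - reference.mean(axis=0)
    # Kabsch algorithm: SVD of the 3x3 cross-covariance matrix.
    u, _, vt = np.linalg.svd(p.T @ q)
    # Guard against an improper rotation (a reflection).
    d = np.sign(np.linalg.det(vt.T @ u.T))
    rotation = vt.T @ np.diag([1.0, 1.0, d]) @ u.T
    p_fitted = p @ rotation.T
    return float(np.sqrt(((p_fitted - q) ** 2).sum() / len(q)))


def rmsf(fitted_frames):
    """Per-atom RMSF from pre-fitted frames of shape (n_frames, n_atoms, 3)."""
    mean_structure = fitted_frames.mean(axis=0)
    # Squared displacement of every atom from its mean position, per frame.
    sq_dev = ((fitted_frames - mean_structure) ** 2).sum(axis=2)
    return np.sqrt(sq_dev.mean(axis=0))
```

A low, flat RMSD time series of this kind is what is meant above by a stable dimer; the RMSF resolves the same information per residue rather than per frame.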

Influence of pH on PSI dimer dissociation
Analogous to AFM pulling [28] and optical tweezers experiments [29], steered molecular dynamics (SMD) [30] simulations can be used to direct behaviour within a reduced number of degrees of freedom towards a particular state or phenomenon of interest. The efficiency of this technique can be exploited to study phenomena not normally accessible by conventional timescales due to the computational expense of traditional MD simulations. SMD simulations typically use pulling velocities that are orders of magnitude higher than those used in AFM pulling or optical tweezers experiments, resulting in comparatively higher pulling forces. SMD is useful in exploring underlying processes involved in the dissociation of a dimer complex as evidenced in previous studies on unbinding pathways of proteins and substrates [31][32][33][34][35].
As a function of the distance between two molecules, the 1D potential of mean force (PMF) along a desired reaction coordinate (ξ) can be calculated [36,37]. Of particular value is the ability of the PMF, or free energy profile, to quantitatively describe the free energy of dissociation (ΔG_dissociation) of a protein-ligand or dimer complex of interest. Although a number of ways of determining the PMF exist [38,39], the umbrella sampling method was chosen for its efficient sampling along the reaction coordinate [40]. Using the umbrella sampling method in the context of dimer dissociation, an umbrella biasing potential was applied to restrain one monomer at increasing distances from the second reference monomer as measured between the respective centres of mass. As opposed to conventional MD simulations, the use of the restraining potential allowed for increased sampling of conformational space at defined positions along the reaction coordinate, resulting in a series of biased histograms. The weighted histogram analysis method (WHAM) was then used to combine the individual distributions and extract the unbiased PMF in a manner similar to [41].
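The way WHAM combines biased histograms into a single unbiased PMF can be sketched as follows. This is a minimal NumPy illustration of the standard self-consistent WHAM equations for harmonic umbrella potentials, not the implementation used in the present study; the function name `wham_pmf`, its arguments and the convergence settings are assumptions made for illustration. Energies are in kJ mol⁻¹ and distances in nm.

```python
import numpy as np

GAS_CONSTANT = 8.314462618e-3  # kJ mol^-1 K^-1


def wham_pmf(hist, centers, bin_x, k_spring, kT, tol=1e-8, max_iter=20000):
    """Unbiased PMF from umbrella histograms of shape (n_windows, n_bins).

    centers  : restraint position of each window along the reaction coordinate
    bin_x    : bin centres of the shared histogram grid
    k_spring : harmonic force constant, bias = 0.5 * k * (x - centre)^2
    """
    hist = np.asarray(hist, dtype=float)
    n_samples = hist.sum(axis=1)          # samples per window
    bias = 0.5 * k_spring * (bin_x[None, :] - centers[:, None]) ** 2
    boltz = np.exp(-bias / kT)            # Boltzmann factor of each bias
    counts = hist.sum(axis=0)             # pooled counts per bin
    f = np.zeros(len(centers))            # window free-energy offsets

    def unbiased_density(f_now):
        weights = n_samples[:, None] * np.exp(f_now[:, None] / kT) * boltz
        return counts / weights.sum(axis=0)

    # Iterate the two coupled WHAM equations until the offsets stop changing.
    for _ in range(max_iter):
        p = unbiased_density(f)
        f_new = -kT * np.log((boltz * p[None, :]).sum(axis=1))
        f_new -= f_new[0]                 # fix the arbitrary additive constant
        shift = np.max(np.abs(f_new - f))
        f = f_new
        if shift < tol:
            break

    p = unbiased_density(f)
    with np.errstate(divide="ignore"):
        pmf = -kT * np.log(p)             # empty bins become +inf
    return pmf - pmf[np.isfinite(pmf)].min()
```

The returned profile is shifted so that its minimum is zero, which is the convention needed when a dissociation free energy is read off as the height of the plateau above the minimum.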
To assess the potential influence of pH on the dissociation of the PSI dimer and gain insight into the unbinding mechanisms of the dimer complex, SMD simulations were performed in combination with umbrella sampling and WHAM. Equilibrium MD simulation structures were used as starting configurations for SMD, and pulling simulations were performed for acidic (active; pH 3.0) and neutral (inactive; pH 7.4) conditions in which one monomer (chain B) was pulled away from an immobile reference peptide (chain A) along the z-axis such that the distance between their centres of mass increased as the two peptides were pulled apart from one another. Although the PSI is optimally active at pH 4.5, pH 3.0 was used here because the dimer at pH 4.5 is not sufficiently soluble for the ongoing monomer-dimer equilibrium experiments, preliminary data from which indicate that the PSI exists as a dimer under acidic conditions (unpublished data). The resultant trajectories were then used to generate the windows for umbrella sampling, and WHAM was used to extract the PMF associated with dimer dissociation (Figure 4). The PMF profiles indicated that a significant amount of free energy was required to instigate dissociation, with ΔG_dissociation values of 108.8 kJ mol⁻¹ at acidic pH and 95.7 kJ mol⁻¹ at neutral pH. The high free energy barrier to dissociation was suggestive of strong intermolecular protein-protein interactions. This was expected and reasonable as the probability of water contacting the hydrophobic undersides of the respective PSI monomers is minimized by maintaining strong contacts at the dimer interface [42], thereby sequestering hydrophobic residues from solvent and thus stabilizing dimer quaternary structure.
While the PMF profiles describing both the acidic and neutral pH dimer dissociations were similar, it should be noted that the ΔG_dissociation for the PSI at pH 3.0 was approximately 14% (13.1 kJ mol⁻¹) larger than at neutral pH. Although the two monomers maintained similar contact distances throughout the time course of the simulation (Figure 2), the difference in ΔG_dissociation may be explained by differences in ability to preserve contact at the hydrophobic interfaces. Compared to the pH 7.4 simulation in which all residues were in their standard state, histidine as well as all glutamic and aspartic acid residues were protonated in the acid simulation, resulting in charge neutralization and mitigation of electrostatic repulsion among the large number of negatively charged residues. This would result in the stabilization of the dimer as movement of the two monomers away from each other would be restricted by the dominant hydrophobic interactions at the dimer interface and the higher free energy requirement for dissociation.
The closed saposin-fold conformation is the dominant structure adopted by monomeric PSI
Principal component analysis (PCA) is a robust tool for identifying and separating the large-scale, and usually slowest, collective motions of atoms from the fast random internal motions, thereby revealing the largest contributors to atomic fluctuation of protein structures [43,44]. To examine tertiary structure dynamics of monomeric PSI, and assess potential influence of pH on protein folding, unrestrained MD simulations were performed on the extended PSI monomer in solution at both active and inactive pH values. PCA was then applied to the unrestrained MD simulations and conformational changes were examined. Monomer conformational stability was evaluated by calculating backbone RMSD after least-squares fitting of the MD trajectories onto the PSI crystal structure. Simulations at both active and inactive pH produced similar trends in the evolution of RMSD, remaining stable with RMSD fluctuations of approximately 0.2-0.8 nm until approximately 230 ns (pH 4.5) and 198 ns (pH 7.4), suggesting that the PSI deviated little from the crystal structure. At these times, a transition in tertiary structure occurred in which the RMSD abruptly increased 1.2-1.4 nm, after which the RMSD remained stable upon adopting a new conformation (Figure 5). The extended conformation closed in on itself and adopted the closed saposin fold characteristic of other SAPLIPs [13,14,18-25] irrespective of pH. Hence, the simulations essentially described a spontaneous tertiary structure transition from the open to closed state. As one might expect, the 1D mode described for the first PC in either simulation corresponded to the closing motion of the PSI, accounting for approximately 78.8% and 74.2% of the overall motions for simulations at active and inactive pH, respectively (Figure 6).
This closing motion corresponded to helices α1/α4 collapsing onto helices α2/α3, hinging at the flexible helix-helix junctions formed between α1/α2 and α3/α4 (Figure 7). At pH 7.4, the second PC was characterized by a slight twisting motion of the terminal helices (α1 and α4) relative to helices α2/α3 and was responsible for the characteristic distortion of the α-helix bundle typical for the saposin fold (Figure 8), a phenomenon that was not observed in the active pH simulation. Subsequent PCs showed diminished contributions of conformational changes to overall motions of the PSI.
Two-dimensional projections of the active and inactive trajectories onto their respective first and second PCs showed that the PSI explores a wide range of conformational space (Figure 9). The conformer plot for the inactive simulation (Figure 9B) revealed that the PSI transits through three distinct conformational states corresponding to three distinct minima, becoming trapped in the third and final state. The first conformational state corresponds to the extended open crystal structure. After sampling the essential subspace near the starting conformation, the monomer then transitions to a second, discrete state as the protein begins to jackknife. This intermediate conformation corresponds to a quasi-folded tertiary structure in which helices α1 and α4 begin to collapse onto helices α2 and α3, thus forming the beginnings of the characteristic 4-helix bundle observed for all known SAPLIPs [8]. The third and final cluster is the most densely populated and closely packed cluster, corresponding to a distorted helix bundle tertiary structure like that of the characteristic saposin-fold. Similarly, the active pH simulation showed a transition from the initial open extended structure to a quasi-folded, compact 4-helix bundle tertiary structure (Figure 9A). However, unique to the active pH simulation were several microstates sampled along the second PC before finally becoming trapped in the densely populated final cluster.

Radius of gyration measurements for investigating PSI compaction
To monitor compaction of the PSI monomer as indicated from the principal component analyses, the radius of gyration (R_g) was determined for hydrophobic residues located in helical regions (Figure 10). For both the active and inactive pH simulations, initial R_g corresponded to fluctuations in the PSI open conformation. At approximately 200 ns, a sharp decrease in R_g occurred corresponding to folding events related to the hydrophobic collapse of the concave face in which the stem formed between the N- and C-termini folded over onto helices α2/α3. This process corresponded to the large, abrupt changes in RMSD as well as the first PC observed at this time. Post-collapse, the lowered R_g values and the scarcity in R_g deviation throughout the remainder of the simulations were consistent with the adoption of a stable tertiary structure.
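As a reminder of what the R_g traces measure, a minimal sketch of the mass-weighted radius of gyration for a single frame is given below. The function name `radius_of_gyration` is hypothetical, and the selection of atoms (here, whichever hydrophobic helical residues are passed in) is left to the caller; the study itself used the GROMACS analysis tools.

```python
import numpy as np


def radius_of_gyration(coords, masses):
    """Mass-weighted radius of gyration of one frame.

    coords : (n_atoms, 3) positions in nm
    masses : (n_atoms,) atomic masses
    """
    coords = np.asarray(coords, dtype=float)
    masses = np.asarray(masses, dtype=float)
    # Centre of mass of the selected atoms.
    com = (masses[:, None] * coords).sum(axis=0) / masses.sum()
    # Mass-weighted mean squared distance from the centre of mass.
    sq_dist = ((coords - com) ** 2).sum(axis=1)
    return float(np.sqrt((masses * sq_dist).sum() / masses.sum()))
```

Evaluating this per frame gives a time series in which a hydrophobic collapse appears as a sharp drop, as described for the transition at approximately 200 ns.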
The adoption of the saposin fold-like tertiary structure for monomeric PSI is made possible by the hinges formed at the flexible helix-helix junctions between α1/α2 and α3/α4. Similar to the orthorhombic saposin crystal structure (PDB ID 2QYP) [22], extended PSI had obtuse opening angles of 109° and 110° for the active and inactive pH simulations, respectively, as measured at the Cα atoms of Pro66, Glu85 and Lys101, with the hinge between helices α3 and α4 opening about Glu85 (Figure 7A). Upon closing, the opening angles closed to approximately 23° and 33° for the pH 4.5 and 7.4 simulations, respectively, in agreement with the 34° opening angle measured at the Cα atoms of Pro67, Asn86 and Arg102 of the resolved portion of the closed HvAP PSI crystal structure [25]. Some α-helical secondary structure was lost at the helix-helix junctions during folding, transitioning to random coil to accommodate the movement of side chains towards the hydrophobic concave face of the PSI. The adoption of a saposin-like fold is further reinforced by the low RMSD between the resolved 4-helical bundle of the HvAP PSI and closed StAP PSI structures generated through MD (1.060 Å and 0.6477 Å for pH 4.5 and pH 7.4, respectively).
All data are available upon request.
The calculated PMFs describing the dissociation of the PSI dimer at pH 3.0 and pH 7.4 gave ΔG_dissociation values of 108.8 kJ mol⁻¹ and 95.7 kJ mol⁻¹, respectively. As expected, the large free energy requirement can be attributed to the need to sequester the hydrophobic concave face of the PSI from solution, thereby minimizing the entropic penalty associated with the exposure of hydrophobic residues. Though both dimers form stable conformations, it should be noted that greater binding between the monomers is achieved at acidic pH, and the major reason for this is likely charge neutralization of carboxylic acid groups on glutamic and aspartic acid residues.
It would be expected that electrostatic repulsion between monomers would thus be lowered allowing hydrophobic interactions at the dimer interface to dominate. An analogous phenomenon is seen with membrane-bound Sap C where neutralizing its negatively charged electrostatic surface removes membrane-protein charge-charge repulsion [19] thereby mitigating the unfavourable introduction of charges into the bilayer apolar hydrophobic environment. Furthermore, the calculated RMSD, RMSF and minimum distance maintained between the two monomers (see Figures 1-3) were consistent with the stability gained by folding the two monomers into a compact globular quaternary structure, limiting potential changes in tertiary structure.
Maintaining hydrophobic contact between the residues lining the concave face of the PSI is the driving force for preserving the quaternary structure and stability of the overall dimer, which may be related to the greater stability of the dimer at acidic pH relative to that at neutral pH. As the dimer experiences a lower free energy barrier to dissociation at neutral pH, the dimer quaternary
The larger free energy requirement for dissociation at acidic pH may be indicative of a physiological necessity. It has been previously established that the PSI is active against bilayers at acidic pH [11,25,27,45]. We hypothesise that dimer formation at acidic pH may represent a particular functional quaternary structure. Baoukina and Tieleman [46,47] previously concluded that covalently linked antiparallel lung surfactant protein B (SP-B) dimers, analogous to the StAP PSI dimer, mediate faster kinetics of monolayer folding. SP-B dimers promoted bilayer folding and eventual formation of hemifusion-like stalk connections similar to those observed in vesicle fusion [46,47]. It is theorised in the present study that PSI dimers may also function in a similar manner. Such a pH-dependence for a quaternary structure-function relationship is supported by the observed capacity of StAP PSI to induce fusion of large unilamellar vesicles (LUV) at acidic pH, causing both membrane disruption and fusion [11], and by previous research examining the roles of the PSI in vesicle disruption and membrane targeting [26,27,45]. The idea that the dimer serves a functional role in bilayer disruption and fusion is also consistent with the "clip-on" model for Sap C-mediated vesicle fusion, proposed by Wang et al. [10] and further extended by Rossmann et al. [22].
This model hypothesises that Sap C dimers can bind to two vesicles, interacting with the membrane in a similar fashion to Sap C monomers through domain swapping, and thereby bring adjacent bilayers close enough to mediate fusion. Considering the similarities in structure and in the pH-dependence of dimer stability, our proposed model suggests a possible commonality between the Sap C "clip-on" model and the PSI mode of membrane interaction.

Conformational flexibility and adoption of the saposin-fold
Insight into conformational changes can be gained by projecting the MD trajectory onto the subspace spanned by the two largest (typically the first and second) principal components [43,48]. In doing so, it is possible to characterize the transitions from the open, extended conformation of the PSI monomer to the closed saposin fold-like structure seen in the unrestrained simulations. As well, any possible intermediate structures that may be adopted during the opening-to-closing transition can be observed, providing a map of the overall structural variability of the PSI. The resultant conformer plots thus provide the means to interpret the conformational changes sampled by the unrestrained MD simulations and express the relationships between these conformers. Unrestrained MD simulations of the extended PSI monomer suggested that the PSI adopts a closed saposin-like conformation independent of pH.
Principal component analysis performed on the MD trajectories revealed that the first PC corresponds to the closing of the PSI, accounting for 78.8% and 74.2% of the overall motion of the protein for the active and inactive pH simulations, respectively. Analogous to the PSI dimer, it is postulated that the closing motion observed in monomeric PSI arises from the need to limit the entropically unfavourable exposure of hydrophobic residues to water. Two-dimensional projections of the first two PCs revealed that the PSI transitions from an extended state to one or more intermediates before finally closing in on itself. Although the 2D projections sampled similar conformational space, the conformer plots of the active and inactive pH simulations differed in that the active pH simulation sampled several microstates before settling into a minimum and adopting a saposin-like closed motif. This differed from the inactive pH simulation in which the PSI sampled only three distinct states corresponding to energy minima for the initial structure, a molten globular structure, and finally a closed saposin-like tertiary structure. It is in this final state that the concave face of the PSI has formed a hydrophobic core at its centre. These differences may be attributed to the differing electrostatic makeup of the two systems; negative charges on Glu and Asp are at least partially neutralized at active pH resulting in an overall positively charged protein, whereas both negative and positive residues exist
The conformational changes adopted by the PSI as it transits from extended to closed conformation were attributed to the high degree of conformational flexibility at the hinge-bending regions of the helix-helix junctions, similar to what is observed in other SAPLIPs.
Such flexibility is a common characteristic of saposin members of the SAPLIP family, which have been shown to have the capacity to exist both in substrate-free closed and in extended lipid- or peptide-bound conformations [22][23][24]. For the PSI, this flexibility is made possible in part by the local dynamics of side chains. Hydrophobic residues located in the helical regions orient themselves such that their side chains are involved in the formation of the tight dimer interface (as observed in the unrestrained dimer simulations), induced by the presence of inter-protein hydrophobic interactions. The helix orientations in this packing motif are mimicked by monomeric PSI as it closes in a domain-swapped fashion in that helices α3 and α4 twist about their helical axes, thereby maximizing intra-protein hydrophobic contacts. This folding process is marked by the hydrophobic regions of the four helices collapsing on themselves, thereby minimizing contact with the polar environment and concomitantly maximizing aqueous contact with the polar outer surfaces.
The dynamics of PSI closure suggest two possible structures for the PSI, and that pH influences these conformational differences. To date, the only SAPLIP pH-structure report has been for Sap C in which a reduction in pH from 6.8 to 5.4 did not result in observable conformational changes [23]. It should be noted, however, that Sap C acidification occurred with monomeric protein already folded to a local minimum having adopted the characteristic saposin fold. This is in contrast to the present study in which the open PSI structure was allowed to explore a large degree of conformational space as it closed to a local energy minimum. The similar structure for the neutral StAP PSI ensemble in the present study, and that for HvAP PSI [25], as well as the low RMSDs, may indicate that the classical saposin-fold is pH-dependent for at least some SAPLIP cases.
The neutral pH StAP PSI simulation and HvAP crystallography [25] used similar experimental parameters (i.e., 100 mM NaCl and neutral pH), suggesting that the acidic pH saposin-like fold observed in the present study likely represents a derivative of the classic saposin fold, one that would be expected to be adopted by other SAPLIPs having similar structures and pH-function dependencies.
The present study undertook a comprehensive analysis of the PSI to identify conformational changes due to differences in pH and to assess the potential impact that these changes may have on protein function. Free energy changes for PSI dimer dissociation at acidic and neutral pH were predicted by steered MD simulations in combination with umbrella sampling. These identified key differences in binding affinities indicating that the PSI has a preference for maintaining the dimer quaternary structure at acidic (active) pH due to the higher free energy requirement for dissociation. We postulate that the preference for dimerization may be indicative of a functional structure that plays a role in membrane binding and vesicle fusion. PCA of unrestrained MD simulations of the PSI monomer after separation from the dimer complex was then used to assess conformational changes adopted by the monomers. Although monomeric PSI folded to a closed conformation regardless of pH, the final closed structures differed in that the pH 7.4 PSI adopted a tertiary structure consistent with the characteristic saposin-fold whereas a distinct saposin-like fold was observed at pH 4.5. This acidified PSI structure presents the first example of an alternative saposin-fold motif for any member of the large and diverse SAPLIP family.
Figure 10. Radius of gyration (R_g) of the PSI over the time course of the simulations. R_g of the PSI at pH 4.5 (blue line) and pH 7.4 (orange line) as a function of time. In either case, the PSI was free to move in the extended state. Upon adoption of a saposin-like fold, the collapse of the hydrophobic concave face of the PSI onto itself limits movement, thereby restricting water access to the hydrophobic core. doi:10.1371/journal.pone.0104315.g010
Folding Dynamics of a Plant Aspartic Protease Saposin-Like Domain

Initial models
In the present study, the high resolution (1.9 Å) X-ray crystal structure of extended potato (Solanum tuberosum) PSI (PDB ID 3RFI) [11] was used as the template structure for SMD simulations. Chain A was used for the unrestrained MD simulations of the PSI monomer. The linker region (residues 40-63) connecting helices α1/α2 to α3/α4 was not resolved in the original crystal structure. As such, MODELLER 9v8 was used to build the missing linker region ab initio, modelled as random coil [49,50]. Hydrogen atoms were added for all titratable residues in accordance with their calculated protonation states as determined using the H++ web server [51][52][53] using an internal protein dielectric constant and solvent dielectric constant of 10 and 80, respectively, with sodium chloride added at 140 mM or 100 mM for the SMD dimer dissociation (pH 3.0 and 7.4) and unrestrained MD monomer (pH 4.5 and 7.4) simulations, respectively.

Unrestrained molecular dynamics system setup
All simulations and analyses were carried out using the GROMACS software suite, Version 4.5.5 [54][55][56] employing the Amber99sbnmr1-ILDN force field [57][58][59]. For each simulation, periodic boundary conditions were applied in all dimensions. The PSI was centred in a cubic box such that the protein was positioned at least 1.2 nm from the box edge and solvated using the TIP3P explicit water model [60]. Sodium and chloride counterions were added at 140 mM or 100 mM concentrations for the SMD and unrestrained MD simulations, respectively, to produce electroneutral systems. Short-range electrostatic interactions were cut off at 8 Å whilst long-range electrostatic interactions were calculated using the particle-mesh Ewald (PME) summation method [61] with fourth order B-spline interpolation and a maximum grid spacing of 1.2 Å. A twin-range van der Waals cut-off was employed (0.8/1.0 nm) and an integration time step of 2 fs was used with neighbour searching performed every 5 steps, with all bond lengths being constrained using the linear constraint solver (LINCS) algorithm [62].
Each simulation was prepared in 3 phases before production runs were performed. In the first phase, the protein was energy minimized using the steepest descent algorithm with position restraints placed on all heavy atoms (k_PR = 1000 kJ mol⁻¹ nm⁻²) until the maximum force converged to ≤500 kJ mol⁻¹ nm⁻¹. In the next phase, the system was equilibrated for 1 ns with position restraints placed on all heavy atoms in the canonical ensemble using the Berendsen weak coupling method [63] with temperature maintained at 303.15 K (τ_T = 0.1 ps). This equilibration was followed by another 1 ns position-restrained simulation in the isothermal-isobaric ensemble. Again, the Berendsen weak coupling method was used to maintain temperature at 303.15 K, with pressure isotropically coupled at 1 bar (τ_P = 1.0 ps). The isothermal compressibility of the system was set to 4.5 × 10⁻⁵ bar⁻¹. For the production unrestrained MD simulations, position restraints were removed. The velocity rescale (v-rescale) algorithm [64] was used to maintain the temperature of the system at 303.15 K (τ_T = 0.1 ps) and the pressure was again maintained at 1 bar using the Parrinello-Rahman [65,66] barostat (τ_P = 2.0 ps) in the isothermal-isobaric ensemble with long-range dispersion correction applied for both the energy and pressure terms. Production simulations were conducted for 500 ns.

Steered molecular dynamics
Equilibrated starting structures of the PSI dimer for the pH 3.0 and pH 7.4 SMD simulations were generated following the same procedure used for the unrestrained MD simulations with production MD conducted for 100 ns. The resultant structures were then used as the starting configuration for the corresponding SMD pulling simulations. The PSI dimer was placed in a rectangular box large enough to accommodate separation of the dimer along the z axis whilst satisfying the minimum image convention. The dimer was then subjected to energy minimization and equilibration in both the canonical and isothermal-isobaric ensembles, again as described above. For the SMD pulling simulations, position restraints were removed from chain B of the PSI dimer while heavy atoms of chain A were harmonically restrained (k_PR = 1000 kJ mol⁻¹ nm⁻²) in a similar fashion to that used in the equilibration phases. Chain A was used as an immobile reference for chain B pulling. The 1D reaction coordinate was chosen to be the distance along the z axis between the COMs of the two PSI monomers. Chain B was pulled away from chain A along the z axis for 1 ns with a constant velocity of 10 nm ns⁻¹ using an elastic spring (k = 1000 kJ mol⁻¹ nm⁻²) positioned at the COM of the peptide. Trajectories at slower pulling rates (5 ns nm⁻¹ and 1 ns nm⁻¹) were also tested to assess the influence of pulling forces on the structure as force is applied. These slower pulling rates resulted in similar force-time curves and similar overall trajectories (Figure 11) [67]. As such, the faster pulling rate was used for experimental SMD simulations to minimize usage of computational resources.
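The constant-velocity pulling described above can be summarised by the force that the moving spring exerts on the pulled group. The sketch below is a minimal illustration of that relationship and is not taken from the GROMACS pull code; the function name `smd_spring_force` and its argument names are hypothetical, while the default spring constant and velocity are the values stated in the text.

```python
# 1 kJ mol^-1 nm^-1 expressed in piconewtons, for comparison with AFM forces.
KJ_MOL_NM_TO_PN = 1.66054


def smd_spring_force(t_ns, com_distance_nm, start_distance_nm,
                     k_spring=1000.0, velocity_nm_per_ns=10.0):
    """Spring force (kJ mol^-1 nm^-1) on the pulled monomer at time t_ns.

    The spring anchor starts at the initial COM separation and moves at
    constant velocity; the force is proportional to how far the actual
    COM separation lags behind the anchor.
    """
    anchor = start_distance_nm + velocity_nm_per_ns * t_ns
    return k_spring * (anchor - com_distance_nm)
```

For example, if the COM separation started at 2.0 nm and has reached 2.9 nm after 0.1 ns, the anchor sits at 3.0 nm and the spring pulls with 100 kJ mol⁻¹ nm⁻¹, or about 166 pN once multiplied by `KJ_MOL_NM_TO_PN`. These numbers are purely illustrative and are not results of the study.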

Umbrella sampling and determination of PMF
The pH 3.0 and pH 7.4 trajectories from SMD pulling simulations were used to generate sampling windows along the reaction coordinate. Windows were spaced 0.5-2.0 Å apart for the first 2.5 nm followed by approximately 2.0 Å spacing until the overall distance between the COMs of chains A and B was approximately 6.0 nm. This resulted in 43 and 45 sampling windows being selected for the pH 3.0 and pH 7.4 simulations, respectively. MD was conducted for each window for 15 ns with a harmonic restraint (k_PR = 1000 kJ mol⁻¹ nm⁻²) applied to chain B to fix the peptide along the reaction coordinate, and then the PMF was constructed. The unbiased PMF was calculated using WHAM and the ΔG_dissociation was evaluated as the difference in energy between the plateau and energy minimum along the PMF curve [68].
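Reading a dissociation free energy off a PMF curve as "plateau minus minimum" can be sketched as below. This is a minimal illustration rather than the procedure of reference [68]; the function name `dissociation_free_energy` is hypothetical, and the choice of where the plateau begins (`plateau_start`) is an assumption that the analyst must make by inspecting the curve.

```python
import numpy as np


def dissociation_free_energy(xi, pmf, plateau_start):
    """Plateau height above the PMF minimum, in the units of `pmf`.

    xi            : reaction coordinate values (nm)
    pmf           : PMF values at those positions (kJ mol^-1)
    plateau_start : xi beyond which the monomers are treated as dissociated
    """
    xi = np.asarray(xi, dtype=float)
    pmf = np.asarray(pmf, dtype=float)
    plateau = pmf[xi >= plateau_start]
    if plateau.size == 0:
        raise ValueError("plateau_start lies beyond the sampled range")
    # Average over the plateau to reduce sensitivity to noise in single bins.
    return float(plateau.mean() - pmf.min())
```

Applied to the two reported values, the difference between the acidic and neutral profiles is 108.8 − 95.7 = 13.1 kJ mol⁻¹, about 13.7% of the neutral pH value.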

Principal component analysis
PCA was used to identify principal modes of motion sampled during the unrestrained MD simulations of the extended PSI monomer at pH 4.5 and pH 7.4. A covariance matrix of the backbone atoms in the monomer was constructed using the PSI trajectories. The matrices were then diagonalized, yielding eigenvectors and their corresponding eigenvalues, which reveal the directions and amplitudes of motion, respectively. Projection of the trajectories onto the two eigenvectors with the largest eigenvalues was then used to quantitatively compare the ability of each ensemble to sample varying regions of conformational space.
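The covariance, diagonalization and projection steps described above can be sketched in a few lines of NumPy. This is an illustrative outline only, not the GROMACS-based analysis used in the study; the function name `trajectory_pca` is hypothetical, and the input is assumed to be backbone coordinates that have already been least-squares fitted to a reference structure.

```python
import numpy as np


def trajectory_pca(fitted_frames):
    """PCA of a pre-fitted trajectory of shape (n_frames, n_atoms, 3).

    Returns eigenvalues (descending), eigenvectors (as columns), the
    projection of every frame onto the first two PCs, and the fraction
    of total variance captured by each PC.
    """
    n_frames = fitted_frames.shape[0]
    # Flatten each frame to a 3N-dimensional vector and remove the mean.
    x = fitted_frames.reshape(n_frames, -1)
    x_centered = x - x.mean(axis=0)
    # 3N x 3N covariance matrix of atomic positional fluctuations.
    cov = np.cov(x_centered, rowvar=False)
    # eigh returns ascending eigenvalues; reorder so PC1 comes first.
    evals, evecs = np.linalg.eigh(cov)
    order = np.argsort(evals)[::-1]
    evals, evecs = evals[order], evecs[:, order]
    projections = x_centered @ evecs[:, :2]
    explained = evals / evals.sum()
    return evals, evecs, projections, explained
```

The first entry of `explained` is the quantity reported in the Results as the percentage of overall motion accounted for by the first PC, and a scatter plot of the two columns of `projections` corresponds to the conformer plots.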