Computational Design of a pH Stable Enzyme: Understanding Molecular Mechanism of Penicillin Acylase's Adaptation to Alkaline Conditions

Protein stability provides advantageous development of novel properties and can be crucial in affording tolerance to mutations that introduce functionally preferential phenotypes. Consequently, understanding the determining factors for protein stability is important for the study of structure-function relationship and design of novel protein functions. Thermal stability has been extensively studied in connection with practical application of biocatalysts. However, little work has been done to explore the mechanism of pH-dependent inactivation. In this study, bioinformatic analysis of the Ntn-hydrolase superfamily was performed to identify functionally important subfamily-specific positions in protein structures. Furthermore, the involvement of these positions in pH-induced inactivation was studied. The conformational mobility of penicillin acylase in Escherichia coli was analyzed through molecular modeling in neutral and alkaline conditions. Two functionally important subfamily-specific residues, Gluβ482 and Aspβ484, were found. Ionization of these residues at alkaline pH promoted the collapse of a buried network of stabilizing interactions that consequently disrupted the functional protein conformation. The subfamily-specific position Aspβ484 was selected as a hotspot for mutation to engineer enzyme variant tolerant to alkaline medium. The corresponding Dβ484N mutant was produced and showed 9-fold increase in stability at alkaline conditions. Bioinformatic analysis of subfamily-specific positions can be further explored to study mechanisms of protein inactivation and to design more stable variants for the engineering of homologous Ntn-hydrolases with improved catalytic properties.


Introduction
To perform their functions, most proteins form compact native structures that are stabilized by complex networks of covalent bonds, non-covalent hydrophobic, electrostatic, van der Waals interactions and hydrogen bonds [1][2][3]. Structural stability, therefore, is largely necessary for the maintenance of functional conformations under adverse environmental conditions (temperature, pressure, pH, presence of solvents, and salts, etc.). Stability is a fundamental property that not only affects the structure and function of macromolecules but also determines biological fitness. Retention of the native fold is, in general, a prerequisite for the evolution of new functions [4,5]. Proteins that better tolerate functionally beneficial but destabilizing mutations have a higher chance to survive the selection pressure [6]. Consequently, extra stability provides a strong advantage in evolution and promotes evolvability [7,8]. Thus, exploring the mechanisms of protein stability appears to be important not only for studying enzyme evolution and understanding structure-function relationship, but also for the engineering of novel enzymes.
Last century has witnessed a rapid emergence of biocatalysis and its use in the adaptation of enzymes to new industrial applications for which they have not been evolved [9]. The unique catalytic functions of enzymes are determined by their complex three-dimensional structures. In particular, the organization of the active site residues must satisfy certain structural constraints, which can result in poor stability [10,11]. Consequently, stability emerged as a major limitation to the use of enzymes in nonnatural environments [12]. Owing primarily to industrial needs, enzymes were extensively studied to understand the tradeoff between activity and stability and to engineer more tolerant variants [13][14][15][16][17]. These studies focused mainly on the thermal stability of the biocatalysts as many industrial processes involve reactions at elevated temperatures for improved productivity and for exclusion of microbial contamination.
Penicillin acylases (EC 3.5.1.11) are a group of enzymes mainly known for their ability to preserve the labile b-lactam ring of penicillins and cephalosporins while catalyzing the selective hydrolysis and/or synthesis of their relatively stable amide bond of their side chains [18,19]. These enzymes are members of the Nterminal nucleophile (Ntn) hydrolase superfamily, which is characterized by a common catalytic N-terminal nucleophile [20]. They are also capable of effective and enantioselective acylation of amino compounds in aqueous medium and can be used for preparation of individual enantiomers of primary amines, amino alcohols, non-conventional amino acids and aminonitriles [21][22][23][24][25][26][27][28].
Taking into account their biotechnological potential, penicillin acylases have been immobilized to create robust biocatalysts with improved stability over a broad range of reaction conditions including ones in organic solvents. The major results were achieved using different immobilization techniques [29] incorporation of the enzyme into soluble-insoluble polyelectrolyte complexes [30] or lipid biocomposite [31], chemical cross-linking of enzyme crystals (CLEC) [32] or enzyme aggregates (CLEA) [33,34], covalent binding to epoxy-activated acrylic carriers [35], epoxy-Sepabeads [36,37], and adsorption on Celite rods [38]. Stabilization of penicillin acylase by addition of co-solvents (salts, sugars and polyols) has also been reported [39] outlining the important role of hydrophobic interactions in maintaining the protein structure. Immobilization techniques in general provided good operational stability of the biocatalysts but led to a loss of the catalytic activity of the native enzyme. While meeting the industrial requirements, these results tell little about the fundamental protein stabilization mechanisms and structure-function relationship of the penicillin acylase family. A rare example of engineering the native protein stability implemented random mutagenesis to study the influence of surface residues on the alkaline stability of E. coli penicillin acylase [40] and resulted in a two-fold increase of the enzyme's half-life period. In a different study a structure-driven computational approach was applied to design penicillin acylase mutants with up to three-fold increase of the thermal stability compared to the wild type [41].
The inactivation kinetics of penicillin acylase from E. coli has been studied experimentally in a wide pH range [42]. Very high stability in neutral solutions and rapid inactivation in acidic and alkaline environment was shown. Analysis of kinetic data suggested that the stable and functional protein conformation was maintained by several ionizable residues whose positions in the protein structure were not determined.
While little research has been performed to investigate the basis of pH-dependent denaturation, several stabilization strategies have been proposed for different enzymes. Random mutagenesis combined with screening and selection for the desired phenotype to mimic the Darwinian process (directed evolution) was implemented to produce protein variants with improved pHstability [43][44][45][46][47]. It was further suggested to apply the mutagenesis only to the charged residues located on the protein surface that seemed most likely to lead to the improvement [40,48]. However, the total number of possible mutations in a protein structure is astronomical and, therefore, stochastic approaches are highly resource-demanding and time-consuming. They require very large mutant libraries as well as efficient screening and selection techniques while still being able to scan only a small fraction of the sequence space. Furthermore, such stochastic approaches are hampered by a high number of deleterious mutations and low number of functionally beneficial phenotypes.
Rapid development of computational technologies as well as availability of quickly increasing genomic/structural databases have diminished the need for unguided evolutionary stochastic approaches and screening of large random libraries. Instead, remarkable progress was achieved by employing comparative structural analysis on M-proteases and alkaline cellulases K with different pH-optimum [49,50]. These studies concluded that alkaline adaptation is accompanied by a decreased number of negatively charged amino acids and lysine residues and an increased number of arginine and neutral hydrophilic residues (histidine, asparagine and glutamine). Electrostatic interactions of the charged residues were suggested as a key factor of adaptation to extreme pH. Consequently, substitution of basic by acidic residues was used to improve the charge balance and stability at low pH, and vice versa [51][52][53][54]. It was also established that asparagines and glutamines deamidate in alkaline medium leading to destabilization of the protein structure [55] and mutation of these residues was suggested to improve stability at extreme alkaline conditions [56,57]. Finally, alteration of pKa values of the key residues -either manually selected [58] or known to be catalytically important [59] -was implemented to design pHstability profiles. While all these empirical studies suggest a more rational way of engineering pH-stability, they do not provide a clear strategy for selection of the relevant residues to be mutated.
In an alternative approach, the free energies of unfolding of a wild-type protein and its mutants were compared under userdefined environmental conditions to perform wide-scale in silico redesign of protein stability. Such methods, however, perform with moderate accuracy and more reliable tools yet have to be developed [60,61].
The studies discussed above have shown that the pH-dependent protein stability can be modulated by introducing amino acid residues with ionizable side-chains. It is, however, still debated if charged residues on the surface or in the protein core are more important for stability. On the one hand, strong charge-charge interactions on the protein surface stabilize the folded state at neutral pH [62]. On the other hand, a number of mutagenic analyses have shown that electrostatic interactions on the protein surface contribute very little to protein stability suggesting that buried core residues are more important to maintain protein structure [63,64]. In any case deprotonation/protonation of the ionizable residues at high or low pH is an important factor for protein stability. Consequently, to design pH-stability in a rational way one should evaluate the role of the interactions of ionizable residues, especially those whose side-chains are buried.
In this work bioinformatic analysis was applied to identify the subfamily-specific positions responsible for the function and stability of Ntn-hydrolases. Molecular modeling was used to study the pH-induced unfolding of penicillin acylase from Escherichia coli in aqueous medium at alkaline conditions. A buried interaction network has been identified, the collapse of which at alkaline pH leads to destabilization of the native protein conformation. Furthermore, the bioinformatic analysis of homologous Ntnhydrolases with different pH stability was used to select the hotspot for mutagenesis that can lead to improved protein stability. The computationally predicted mutant was tested experimentally and showed more than 9-fold stabilization under alkaline conditions.

Molecular modeling of the pH-induced destabilization
Penicillin acylase (EC 3.5.1.11) from Escherichia coli (EcPA) is a heterodimer that contains two monomer chains referred to as a and b, consisting of 209 and 557 amino acid residues, respectively. The two chains are closely interlaced and form a pyramidal structure with the active site located at the bottom of a deep coneshaped cavity [65]. According to the CATH classification [66], the functional conformation of EcPA contains 5 structural domainstwo mainly alpha-helical domains of the a-chain (A1: a3-149 and A2: a150-179) and three domains of the b-chain. The latter include the catalytically active abba-core structure B1, which, in itself, consists of three sub-domains discretely placed in the sequence (B1-1: b1-72; B1-2: b146-290; and B1-3: b452-536), mainly beta-sheet B2 (b73-145) and mainly alpha-helical B3 (b291-451). The B1 domain is the central domain in the EcPA structure and represents the characteristic four-layer ''sandwich'' fold of all Ntn-hydrolases, which is composed of two antiparallel bsheets covered by a layer of a-helixes on each side [67]. This domain, which forms either covalent or non-covalent bonds with all other domains, constitutes the binding site cavity that contains catalytically important residues, and is highly important for the biological function of the enzyme.
A recently developed bioinformatic analysis method [68] was applied to the superfamily of Ntn-hydrolases to identify subfamilyspecific positions (SSPs) that are conserved within each functional subfamily, but are different between the subfamilies (see Methods, section 5). Such SSPs are supposed to be responsible for functional discrimination among homologs and thus can be used as hotspots for rational design of functional properties of the enzyme [69]. It has been recently shown that pH-induced denaturation of the penicillin acylase leads to irreversible unfolding of the protein dimer to disordered polypeptide chains [70]. The role of subfamily-specific positions in the pH-induced inactivation of EcPA in neutral and alkaline media was further studied using molecular modeling simulations [71][72][73][74][75].
The root mean square deviation (RMSD) of the protein backbone atoms from the initial structure was used as a measure of structural mobility (see Methods, section 1). At neutral pH no global conformational changes were observed in the EcPA. Domains A2, B2 and B3 showed the largest deviation from the initial structure, but did not have a tendency for further destabilization. Domains A1 and B1 remained more consistent during the simulations (Fig. 1). The root mean square fluctuation (RMSF) of C a atoms was used to characterize the structural flexibility and was in agreement with the RMSD data (Fig. S1). All domains with higher RMSF (A2, B2 and B3) also showed larger deviation from the initial conformation, as evidenced by higher RMSD, suggesting that structural flexibility contributes to conformational mobility of the EcPA. Residues within the domain B1 had the smallest average RMSF.
Further comparative analysis of the conformational mobility of EcPA revealed significant structural destabilization at alkaline pH. Under these conditions the domain B1 underwent the largest shift from its native structure, as indicated by the ascending RMSD trend ( Table 1). RMSF of C a atoms at alkaline pH showed only slight increment compared to the neutral conditions. Nevertheless, we can note that residues of the domain B1, especially the subdomain B1-3, showed the largest increase of fluctuation that can be explained by a collapse of crucial interactions (Table S1).
The domain B1 involved seven charged residues, which can be in different ionization states at neutral and alkaline environment. These residues were further analyzed to identify the cause of the observed structural changes. Five of these residues that were found to be exposed and the buried residue Aspb73 located in the Ca 2+binding site did not show significant pH-sensitive fluctuations during the MD simulations. In contrast, the core amino acid Glub482 -one of the most significant subfamily-specific positions ( Table 2) -appeared to be crucial in maintaining the active enzyme conformation.
Glub482 is located in a loop close to the strand b11 of the B1 domain (secondary structure numbering is given as in [67]) in close proximity to Pheb57 and Aspb484, which leads to an unusually high pKa value (see Methods, section 4). Molecular modeling has shown that at neutral pH the Glub482 is protonated, its side-chain oxygen is 2.5 Å away from the side-chain oxygen of Aspb484 (strand b11), and the two carboxyl groups form a hydrogen bond. The existence of carboxyl-carboxylate pairs and their role in the maintenance of the protein structure has been previously discussed [76,77]. At neutral pH one of the side-chains is protonated and the corresponding proton is shared between both carboxyl groups, which results in the formation of a hydrogen bond. In this case, at pH 7.5 the hydrogen bond between Glub482 and Aspb484 is preserved in 90-98% of the unconstrained runs of the molecular dynamic trajectories and participates in a network of stabilizing interactions ( Fig. 2A). Glub482 forms a second hydrogen bond with the side-chain of Asnb47 (strand b4), which in turn interacts with the side-chains of Thrb32 (b3) and Aspb501 (b12). At the same time Aspb484 forms a hydrogen bond with a buried water molecule Wat2436. Buried solvent molecules usually form functionally and structurally important hydrogen bonds [78]. In the penicillin acylase structure, Wat2436 is involved in three interactions: it acts as a hydrogen bond acceptor for Aspb484, a hydrogen bond donor to the side-chain of Asnb20 (b2) and to the main-chain of Trpb65 (b6). The side-chain of Trpb65, in turn, is involved in the formation of a hydrogen bond with the main-chain Ileb55 (a loop between b4 and b5).
At alkaline pH, both carboxylates -Glub482 and Aspb484become negatively charged, which causes a repulsion of the sidechains followed by a collapse of the observed hydrogen bond network. While the time-evolution analysis indicates that secondary structure remains intact in general ( Fig. S2, S3, S4, S5), the strands b4-b6 shift towards the underlying helices a1 and a2. Disordering of the side-chains leads to a less compact structure that is easily penetrated by solvent molecules -they fill newly formed cavities and expand them; the globule, therefore, starts to ''swell'' (Fig. 2B). Further development can lead to the formation of an intermediate between folded and unfolded protein states -a so-called ''molten globule'' [79].
The identified stabilizing interactions connect two antiparallel b-sheets of the abba-core domain B1 -b1-b2-b11-b12 and b6-b5-b4-b3. The hydrogen bond between Glub482 and Aspb484 at neutral pH is the cornerstone of this buried network whose collapse at alkaline pH launches detrimental changes in the structure and finally leads to the enzyme's inactivation. It can be noted that strands b1-6 belong to the B1-1 subdomain, while b11 and b12 belong to the B1-3 subdomain. This fact explains the destabilizing effect of the alkaline pH on the domain B1 that was observed during the molecular dynamics simulations (Fig.1).
The abba-core domain is common for the entire Ntn-hydrolase superfamily [67]. The key residues involved -Asnb20, Glub482 and Aspb484 -are among the most significant subfamily-specific positions in the Ntn-hydrolases (Table 2) and thus the closest evolutionary relatives of penicillin acylases were further studied to reveal whether variation of amino acid residues at selected subfamily-specific positions is related to different stability of these enzymes in alkaline medium.

Structural analysis of Ntn-hydrolases
EcPA was compared to other Ntn-hydrolases with known threedimensional structures -penicillin acylases from Providencia rettgeri (PrPA) and Alcaligenes faecalis (AfPA), glutarylamidases from Pseudomonas sp. (PsGA) and Brevundimonas diminuta (BdGA), and Nacyl homoserine lactone acylase PvdQ from Pseudomonas aeruginosa (PaAHLA). The subfamily-specific positions Asnb20, Glub482, and Aspb484 in EcPA and in the related enzymes were shown to stabilize the interactions between two antiparallel b-sheets of the abba-core domain. Such interactions typically involve a dense hydrogen bond network with buried solvent molecules that can have different localization in the structure. These results suggest that the outlined hydrogen bond network that involves subfamilyspecific positions is highly important for the stability of all Ntnhydrolases.
In PsGA the subfamily-specific positions of Asnb20, Glub482 and Aspb484 of EcPA are occupied by Glnb20, Trpb457, and Alab459, respectively (Fig. 3A). While some positions are substituted by residues with different physicochemical properties, the subfamily-specific positions nevertheless remain involved in a similar network of stabilizing interactions between the b1-b2-b11-b12 and b6-b5-b4-b3 sheets. A buried solvent molecule Wat2032 participates in multiple interactions -it can be a hydrogen bond donor to the main-chain of Asnb2 (strand b1) and Asnb68 (b6), and a hydrogen bond acceptor from the main-chain of Asnb21 (b2) and the side-chain of Thrb67 (b6). It should be noted that Wat2032 in PsGA has a different structural localization compared to Wat2436 in EcPA and a hydrogen bond between the layers is created by the side-chain of Glnb20 (b2) with the main-chain of Ileb66 (b6) and the side-chain of Trpb457 (b11).
In PaAHLA, the subfamily-specific positions Asnb20, Glub482, and Aspb484 of EcPA are occupied by Alab20, Tyrb481, and Glnb483, respectively (Fig. 3B). A buried Wat2053 is located in the structure similar to Wat2032 of glutarylamidases and forms three hydrogen bonds with Asnb2 (b1), Thrb67 (b6), and Hisb68 (b6). Another structural water molecule, Wat2305, was found to be positioned in the same manner as Wat2436 in EcPA. It is a hydrogen bond donor to the main-chain atoms of Leub18 (b2) and Glnb483 (b11) and can serve as a hydrogen bond acceptor from the side-chain of Glnb483, which, in turn, forms two more bonds with Glyb59 (b5) and Tyrb481 (b11).
The outlined subfamily-specific positions are conserved in PrPA and EcPA, whereas in the AfPA structure, there is a substitution of Aspb484 by Asnb480 (Fig. 3C). The side-chain amide group of Asnb480 in AfPA plays a similar role to the protonated carboxylic group of Glub482 in EcPA at neutral pH. However, in contrast to EcPA the hydrogen bond formed by the amide-carboxylate pair in AfPA can be preserved at both neutral and alkaline pH. The Glub482 and Asnb480 interaction, therefore, is a crucial element of the buried stabilizing interaction network in AfPA that makes this enzyme much more stable in alkaline medium than EcPA [80]. Next, we tested the ability of the corresponding mutation Db484N-EcPA to enhance the stability of EcPA in alkaline medium.

3.
In silico modeling of catalytic activity and stability of the wild type enzyme and the Db484N-EcPA variant The B1 domain is highly important for the catalytic function of penicillin acylases. In particular, strand b1 hosts the active site Serb1 while strand b6 has the oxyanion hole residue Alab69. The proposed mechanism of penicillin acylase catalysis consists of a nucleophilic attack of Serb1 on the substrate's amide bond and the corresponding tetrahedral intermediate is stabilized by interactions with the main-chain peptide bond of Alab69 and the side-chain amide of Asnb241 in the oxyanion hole [20]. As the proposed substitution Db484N in the B1 domain could jeopardize the catalytic activity, the organization of the active sites of the wild type and mutant enzymes was studied in silico. Enzyme-substrate complexes formed by the wild type EcPA and its Db484N variant were modeled and used as starting points of MD simulations. To evaluate the catalytic abilities of the enzyme variants the knowledge-based structural criteria were applied to characterize the productive substrate binding mode and the near-to-attack conformation of the substrate in the active site. Three criteria were used -distance between the c-oxygen of the catalytic Serb1 and the substrate carbonyl carbon and distances from the substrate carbonyl oxygen to the main-chain amide of Alab69 and to the side-chain amide of Asnb241 -all limited to at most 3.5 Å (see Methods, section 1). A productive substrate binding mode was observed in both enzyme-substrate complexes formed by the wild type EcPA and its Db484N mutant at both pH 7.5 and 10.0 (Table 3). This observation is in agreement with experimental data (see Results, section 4) and confirms that Db484N mutation does not influence substrate binding at the active site.
The stability of the Db484N-EcPA variant at different conditions was studied using molecular modeling. The formation of a hydrogen bond between Asnb484 and Glub482 prevailed in 94-99.5% of the unconstrained MD runs at both pH 7.5 and 10.0. No significant differences in the conformational mobility of the wild type enzyme and the Db484N variant were observed at pH 7.5. However, at pH 10.0, a significant stabilizing effect of the Db484N mutation was monitored in the abba-core B1 domain (Table 4). These MD studies have shown that in the Db484N-EcPA mutant, a buried hydrogen bond network is preserved in alkaline medium and the two antiparallel b-sheets stay connected within the respective domain preventing destabilization of the protein globule in general.

Experimental characterization of the Db484N variant
The Db484N-EcPA variant was purified and studied experimentally. The inactivation rate constants were determined at neutral and alkaline conditions. At pH 7.5 the wild type enzyme and the variant showed comparable inactivation rates, however, at pH 10.0 the Db484N mutant was much more stable compared to the wild type EcPA ( Table 5). The single mutation led to a 9-fold stabilization at alkaline conditions. The catalytic activity of both enzyme variants was also investigated in neutral and alkaline medium. The initial rates of the enzymatic reaction and the kinetic parameters of the wild type enzyme and the Db484N-EcPA mutant were comparable in both cases (pH 7.5 and 10.0). The experimental results were in a good agreement with the computational predictions and showed that the mutation Db484N has led to a significant stabilization of the EcPA in alkaline environment without compromising its catalytic function.

Discussion
Protein stability has been widely studied in the context of industrial needs. It is becoming increasingly evident that the protein ability to tolerate adverse environmental conditions can have important evolutionary implications for the development of new functions. Therefore, understanding the molecular mechanisms by which mutations affect protein stability is an emerging task in protein engineering. New protein functions evolve by the introduction of new phenotypic features by a small number of mutations [81]. As it has been suggested, mutations providing new functions can be destabilizing [7]; such functionally beneficial mutations have to be counterbalanced by additional stabilizing mutations [82,83]. Bioinformatic analysis of subfamily-specific positions can be used to study stabilization mechanisms in different protein families and develop more effective strategies to implement novel functions. In this work, bioinformatic analysis and molecular modeling have been used for the first time to identify amino acid residues that are crucial for the pH-stability of penicillin acylase from E. coli. A hydrogen bond between the subfamily-specific positions Glub482 and Aspb484 at neutral pH was identified to serve as a basis of the buried stabilizing interaction network connecting two antiparallel b-sheets of the Ntn-hydrolase abba-core fold. In alkaline medium, both carboxylates -Glub482 and Aspb484become negatively charged, which leads to the repulsion of their side-chains and the collapse of the observed hydrogen bond network. Our molecular modeling results show that a hydrogen bond between the mutated residue Db484N and Glub482 is formed both at neutral and alkaline pH and maintains the observed stabilizing interaction network at alkaline conditions. The Db484N-EcPA mutant was produced, studied experimentally and showed an increase of 9-fold in stability at pH 10.0.
Penicillin acylases are key biocatalysts in the industrial production of semisynthetic penicillins and cephalosporins. They are also capable of catalyzing many other potentially valuable reactions such as protection/deprotection of functional groups in peptide synthesis [84] and preparation of enantiomerically pure amines [21][22][23] that are important chiral building blocks for pharmaceutical and agrochemical industries [85]. As primary amines are strong basic compounds, their effective biocatalytic acylation is limited by the low stability and catalytic activity of most enzymes in alkaline solutions. Application of the more stable EcPA variant Db484N in alkaline medium and its use as catalyst for preparative enantioselective acylation of amino compounds is under investigation and will be reported elsewhere. The established Db484N mutation can also be used to construct templates at rational design of new biocatalysts with improved catalytic properties in the Ntn-hydrolase superfamily.

Modeling of enzyme stability and catalytic activity
Theoretical and computational approaches to study protein structure as a function of solution's acidity have a long history. A number of models have been proposed to perform MD simulations at constant pH taking into account dynamic protonation states of each titratable group in a macromolecule. These 'constant pH' methods are thought to allow for sampling over the biologically more meaningful ensemble of conformations at the defined pH. However these calculations are highly demanding due to large number of available protonation states and can still produce significant errors [86,87]. Therefore in this work we have implemented the traditional notion on pH in MD by setting a constant protonation state of each titratable group at the beginning of the simulation in order to evaluate the internal mobility of a penicillin acylase at different pH. This approach also has some well-known limitations [88], however is simple and relatively fast to implement. In this study it has helped to identify the regions where the unfolding of the protein is initiated at alkaline conditions and demonstrated good correspondence with the experimental data.
The conformational flexibility of the wild type and mutant EcPA were compared at different conditions using molecular modeling. Nanosecond scale MD simulations to study the pH-induced unfolding of EcPA were carried out at two pH values (7.5 and 10.0) and at 373 K temperature. It was shown earlier that increasing the temperature of a molecular dynamics system accelerates protein unfolding without changing its pathway [89]. For each model (free Table 2. Subfamily-specific positions in the chain B of the Ntn-hydrolase superfamily. Positions are ranked in a declined statistical significance (see [68] for details). PA -penicillin acylases, GAs -glutaryl-7-aminocephalosporanic acid acylases, AHLs (PvdQ) -N-acyl homoserine lactone acylases PvdQ. The most frequently occurring amino acids are shown for every subfamily. doi:10.1371/journal.pone.0100643.t002 enzyme at defined environmental conditions) three independent MD trajectories with different starting velocities were calculated to improve sampling and collect more data (see Methods, section 3). RMSD of the protein backbone atoms from the initial coordinates was calculated with VMD [90] and used as a measure of structural differences. Pairwise comparisons of last 10 ns of all obtained RMSD time series corresponding to any two different models were made to calculate average D RMSD and f values (see Methods, section 2). These values were further used to assess the different conformational mobility of EcPA at different conditions. RMSF of Ca atoms relative to the average structure over the last 10 ns was used as an indicator of structural flexibility. Time-evolution analysis of secondary structure was performed using the Timeline plugin in VMD [90]. The ability of the wild type EcPA and the Db484N-EcPA variant to form productive enzyme-substrate complexes was studied at two pH values (7.5 and 10.0) and at 300 K temperature. For each model (enzyme-substrate complex at defined environmental conditions) three independent MD trajectories with different starting velocities were calculated (see Methods, section 3). The last 5 ns of the MD trajectories were used to evaluate the near-to-attack conformation of the substrate in the active site using the geometry criteria of the catalytic activity as described earlier [69]. Each frame was used to calculate the distance between the coxygen of the catalytic Serb1 and the substrate carbonyl carbon as well as distances from the substrate carbonyl oxygen to the backbone nitrogen atom of Alab69 and the side-chain nitrogen of the amide group of Asnb241. The mode of the substrate binding was considered productive if all distances did not exceed 3.5 Å . The binding energy of the substrate in a productive complex was evaluated using the Autodock v.4.2.5.1 scoring function [91].

Comparison of RMSD time series
We used the following protocol to compare any two MD trajectories. First, the corresponding RMSD time series (2000 frames in each measured with a common frequency) were smoothed by a moving average with window size of 10 frames to reduce the noise [92]. The Dynamic Time Warping (DTW) algorithm [93,94] was then used to align the two RMSD series by the x-axis (trajectory time). DTW accommodates the cases when elements are similar but out of phase and thus makes one time series to resemble the other as much as possible. Time shifting windows of size 0 (no DTW), 100, 250, 500, 750, 900, and 1000 frames were used. The resulting alignments were used to calculate D RMSD and f values.
D RMSD was calculated for each frame of an alignment as RMSD MD2 -RMSD MD1 and averaged over all frames of the series. Thus, D RMSD between the two MD trajectories can take positive or negative values, depending on the input order. Expectedly, D RMSD calculated without DTW were larger in absolute values than those obtained with DTW. At the same time, the results obtained with DTW and various window sizes were not significantly different. By calculating D RMSD without DTW (at equal time frames) we, in fact, reduce the two time series to the comparison of only average values, thus losing all the information about particular patterns of MD trajectories. In contrast, DTW provides more accurate and detailed comparison of different trajectories by handling local time shifting. f = f(RMSD MD1 $ RMSD MD2 ) is the frequency of frames in aligned trajectories so that RMSD MD1 $ RMSD MD2 and describes to what extent the two curves overlap. f values calculated from DTW alignments with various window sizes were not significantly different. Consequently, D RMSD and f values in Tables 1 and 4 were calculated from DTW with window size of 500 frames and averaged over all pairwise comparisons of MDs between the corresponding models. In both Tables the largest

Molecular dynamics (MD) simulation protocol
Long-time simulations of the enzyme and the enzyme-substrate complexes were carried out using NAMD version 2.9 [95] according to a recently described MD protocol [69]. An input model of a free enzyme or enzyme-substrate complex was prepared as discussed (see Methods, section 4) and parameterized in the Amber force field with AmberTools package version 12 (Case DA, Darden TA, Cheatham TE, Simmerling CL, Wang J, et al., 2012, AMBER 12, University of California, San Francisco). The initial configuration was energy minimized, then all atoms of the protein were constrained using a harmonic energy function and the system was heated to the target temperature. Next, the system was equilibrated and constraints were gradually removed after which a free (unconstrained) run was carried for 50 ns (for free enzyme) or 25 ns (for enzyme-substrate complex). Coordinate snapshots were saved every 5 ps. The simulation of the penicillin acylase model system in a water box (approximately 108000 atoms) for 50.7 ns (equilibration + free run) took 2.7 days on 256 cores of Intel Xeon X5570/Intel Xeon 5670 CPUs. In total, 0.92 ms of MD trajectories were calculated.

Preparation of the enzyme structure and pKa calculations
The PDB databank entry 1GM9 of EcPA was used as a template for molecular modeling as it has been recently shown that this structure most adequately corresponds to the productive enzyme-substrate complex [96]. Crystallographic solvent molecules and calcium ion were preserved. PDB2PQR v 1.7 was used to add missing hydrogen atoms, resolve steric hindrances and optimize hydrogen bond network [97]. In particular, the program uses a built-in PROPKA heuristic method to calculate pKa values of ionizable residues [98]. Protons were added to the corresponding heavy atoms (OD for Asp, OE for Glu, NZ for Lys, NE for Arg, NE2 or ND1 for His, N for the N-terminal, and O for the Cterminal groups) based on the user-defined environmental pH value: only those residues were protonated that had higher pKa values than the target pH. In three cases, the protonation states of residues from the A1 and B1 domains of EcPA were revised manually. Glua152 was predicted to have an unusually high pKa value 15.8 as a result of interactions with other carboxylates. This residue is located in the Ca 2+ binding site and participates in the stabilization of the ion. Both protonation states of Glua152 were evaluated and no significant impact on the stability was observed (data not shown). The second case was residue Glub482. This residue was predicted to have an unusually high pKa 13.8 under the influence of a negatively charged neighboring residue (Aspb484) and a hydrophobic environment (Pheb57). To evaluate the impact of Glub482 on the protein stability both protonation states were tested and a significant difference was observed that was further investigated in the article. Consequently, in this work Glub482 was protonated at pH 7.5 and negatively charged at pH 10.0. Finally, the N-terminal backbone amino group of the active site residue Serb1 was deprotonated at both pH according to the suggested catalytic mechanism [20].
The PDB databank entry file of penicillin acylase from Alcaligenes faecalis (3K3W) contained surprisingly low amount of crystallographic water. Considering the high structural similarity to EcPA and the availability of space between Asnb20, Trpb65 and Asnb480, the water molecule Wat2436 was transferred from 1GM9 to 3K3W and the hydrogen bond network of 3K3W was optimized.
Hydrogen bonds were accepted if the interaction was within 3.5 Å donor-acceptor distance and 145u-180u donor-hydrogenacceptor angle.

Bioinformatic analysis of Ntn-hydrolases
We employed the SSM algorithm to query the crystallographic structure 1GM9 of EcPA against the PDB databank [99]. A nonredundant set of five template proteins was retrieved: penicillin acylases from Providencia rettgeri (PDB: 1CP9) and Alcaligenes faecalis (3K3W), glutarylamidases from Pseudomonas sp. (1GK0) and Brevundimonas diminuta (1JVZ), and N-acyl homoserine lactone acylase PvdQ from Pseudomonas aeruginosa (2WYE). Next, every template protein was independently used as a query for two iterations of PSI-BLAST search [100] against the 'nr' database. Sequences sharing less than 0.25 bits-score-per-column with the corresponding template protein were discarded from further analysis [101]. This step was necessary to retain proteins that are not too distantly related and thus should have the same function as the template. At the same time, only one sequence was retained from a cluster with more than 95% pairwise identity to remove redundant sequences. Then, template protein structures were aligned using Matt [102]. The sequence sets were aligned to the corresponding template proteins with T-coffee [103].
Bioinformatic analysis of the subfamily-specific positions responsible for functional diversity within the superfamily of Ntn-hydrolases was performed with Zebra [104]. It has been experimentally shown that different penicillin acylases display remarkable variations in their pH-stability profiles [42,80]. We analyzed the multiple alignment corresponding to the penicillin acylases with Zebra and determined four functional subfamilies corresponding to close evolutionary relatives of PAs from E.coli, A.faecalis, A.baumannii, and B.megaterium. These subfamilies were in a good agreement with the experimental data on protein stability. Next, we compared PAs to N-acyl homoserine lactone acylases and glutarylamidases. The list of subfamily-specific positions was ranked by decreasing statistical significance.

Protein engineering
Wild-type and Db484N plasmids were constructed using the modified pBR322 vector carrying the cat gene responsible for chloramphenicol resistance. The gene encoding wild type EcPA (analog ATCC 11105) was obtained as described earlier [105]. For cloning and expression of the wild-type and mutant enzymes, E. DNA amplification was done using High Fidelity PCR Enzyme Mix (Fermentas) by the following program: 1 cycle -3 min at 95uC; 16 cycles -0.5 min at 95uC, 0.75 min at 60uC, 12 min at 72uC; 1 cycle -5 min at 72uC;. Then, 10 ml of reaction mixture were incubated with 1 ml FastDigest DpnI restrictase at 37uC; for 30 min. Then 50 ml of competent E. coli TG-1 cells were transformed by 10 ml of restriction mixture and incubated on ice for 30 min. Heat shock was performed at 42uC; for 1.5 min with the described below incubation on ice for 2 min. A 250 ml LB medium was added to the cells and incubated at 37uC; and 180 rpm for 30 min. A 100 ml of culture fluid was spread on agar plates with 68 mg/l chloramphenicol. After a minimum of 24 hours, 3 clones were transferred into tubes containing 2 ml LB medium with 68 mg/l chloramphenicol and incubated overnight at 37uC and 200 rpm. After that overnight cultures were divided into two parts. The first one was assigned to plasmid
Kinetic characterization was carried out according to the guidelines established by the European Working Party on Biocatalysis [106]. The penicillin acylase activity was measured spectrophotometrically at 400 nm by determining the release of 2nitro-5-aminobenzoic acid in the course of NIPAB hydrolysis. The EcPA-catalyzed reaction was performed in a thermostatted reaction vessel at 25uC, 10 mM phosphate buffer pH 7.5, 0.1 M KCl. Absolute concentration of the enzyme's active sites was determined by EcPA titration with a highly specific irreversible inhibitor PMSF as described earlier [107]. The kinetic parameters of the enzymatic hydrolysis (K M and k cat ) were determined by initial rate analysis from the experimental dependence of the initial reaction rate on the substrate (NIPAB) concentration (0.5-50.0 mM). The values of k cat and K M were found by nonlinear regression.
The stability of the wild type EcPA and the Db484N-EcPA mutant were investigated at pH 10.0 (10 mM CAPS, 25uC) and pH 7.5 (10 mM KH 2 PO 4 , 50uC). Inactivation followed first-order rate kinetics. The values of the corresponding inactivation constants (k in ) were determined by interpolation of the experimental data according to the equation A t = A 0 *exp(-k in *t), where A t is the residual enzyme activity, A 0 is the initial activity and t is the time of incubation. Figure S1 Structural fluctuation (RMSF) of EcPA and its Db484N mutant at different pH. Each curve is averaged over three independent MD trajectories. Residue numbers are given in an absolute order. Dashed line separates chains a and b, consisting of 209 and 557 amino acid residues, respectively. Residues b482 and b484 are indicated by a red arrow. These positions had one of the lowest RMSF values (,0.5 Å ) which were the same in the wild type enzyme and its mutant at different conditions. (TIF) Figure S2 Secondary structure as a function of time, for wild type EcPA at pH 7.5. The plot is based on a single representative MD, selected from three independent MD runs for the given protein at given conditions. Residue numbers are given in an absolute order. Dashed line separates chains a and b, consisting of 209 and 557 amino acid residues, respectively. Residues b482 and b484 are indicated by a red arrow. (TIF) Figure S3 Secondary structure as a function of time, for wild type EcPA at pH 10.0. The plot is based on a single representative MD, selected from three independent MD runs for the given protein at given conditions. Residue numbers are given in an absolute order. Dashed line separates chains a and b, consisting of 209 and 557 amino acid residues, respectively. Residues b482 and b484 are indicated by a red arrow. (TIF)