Solution scattering study of the Bacillus subtilis PgdS enzyme involved in poly-γ-glutamic acids degradation

The PgdS enzyme is a poly-γ-glutamic acid (γ-PGA) hydrolase with potential application in the controllable degradation of γ-PGA by enzymatic depolymerization; however, the structure of PgdS is still unknown. Here, to study the full-length PgdS structure in detail, we analyze the low-resolution architecture of the PgdS hydrolase from Bacillus subtilis in solution using small angle X-ray scattering (SAXS). Combining SAXS with other methods, such as dynamic light scattering and mutagenesis analyses, we propose a model for the full-length structure and a possible substrate delivery route of PgdS. The results provide useful hints for future investigations into the mechanisms of γ-PGA degradation by the PgdS hydrolase and may provide valuable practical information.

... and the polydispersity decreases as a function of depolymerization time [9,13]. Therefore, the PgdS hydrolase has potential application for a controllable degradation of γ-PGA by enzymatic depolymerization.
To date, the structure of PgdS is still unknown. Here, we employ a hybrid approach that combines small angle X-ray scattering (SAXS) with secondary and tertiary structure prediction to detail the architecture of the PgdS hydrolase from B. subtilis in solution. Together with dynamic light scattering and mutagenesis analyses, these data allow us to propose a model for the structure and a possible substrate delivery route of PgdS. The results provide useful hints for future investigations into the mechanisms of γ-PGA degradation by the PgdS hydrolase.

Gene cloning, protein expression and purification
The pgdS gene of B. subtilis 168 (DSM 23778, DSMZ, Germany) was amplified by PCR from genomic DNA with 5'/3' specific primers. The primer design avoided cloning of the N-terminal signal peptide of 32 residues (predicted by the SignalP 4.1 server [14]). The amplified gene was cloned into vector pGEX-6P-1 and expressed in Escherichia coli DH5α with an N-terminal GST-tag. Cells were harvested by centrifugation, re-suspended in lysis buffer and sonicated on ice. Proteins were purified from the supernatant on a Glutathione Sepharose™ 4 Fast Flow column (GE Healthcare), and the GST-tag was removed by PreScission Protease (PPase) at 4˚C overnight. The eluted PgdS proteins were further purified by a combination of the Resource S anion-exchange column (GE Healthcare) and the Superdex 200 size-exclusion column (GE Healthcare), with a final buffer consisting of 50 mM MES (pH 6.0) and 100 mM NaCl. For the subsequent experiments, protein samples were then exchanged into a buffer containing either 50 mM citric acid-sodium citrate (pH 5.0) and 100 mM NaCl, or 50 mM Tris (pH 8.0) and 100 mM NaCl, using centrifugal filters (Amicon Ultracel, EMD Millipore).
All mutant PgdS proteins were generated according to the QuikChange mutagenesis protocol. All clones were verified by DNA sequencing. The mutants were purified in the same way as described above for the wild-type protein.

SAXS measurements and data processing
Synchrotron SAXS measurements on solutions of PgdS were performed on the BL19U2 beamline at NCPSS (Shanghai, China), equipped with a robotic sample changer and a PILATUS 1M detector [15]. All samples were centrifuged at 13,000 rpm for 10 min immediately before measurement to remove aggregates and sediment. DTT (2 mM) was added to the samples and buffers before measurement to limit radiation damage. The exposure time was one second per frame, and twenty successive frames were collected for each sample in order to monitor possible radiation damage. The scattering intensity I(s) was recorded in the momentum transfer range 0.02 < s < 0.4 Å⁻¹, where s = (4π sinθ)/λ, 2θ is the scattering angle, and λ = 1.54 Å is the X-ray wavelength. Because of the high experimental noise at s > 0.3 Å⁻¹, the most informative part of the scattering data, from 0.02 to 0.3 Å⁻¹, was used for structural analyses. To exclude concentration dependence, each sample was prepared and measured at several concentrations ranging between 1.1 and 7.2 mg/ml. Neither concentration dependence nor aggregation was observed during the measurements. The low-angle data collected at lower concentration were merged with the high-angle data from the highest concentration to yield the final composite scattering curve.
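The definition of the momentum transfer given above can be illustrated with a minimal Python sketch. The function names are hypothetical and chosen for illustration only; the default wavelength is the 1.54 Å value quoted in the text, and the conversion d = 2π/s is the usual nominal real-space length associated with a given s, not a statement about the resolution of the final models.

```python
import math


def momentum_transfer(two_theta_deg: float, wavelength_A: float = 1.54) -> float:
    """Return s = 4*pi*sin(theta)/lambda in 1/Å, where 2*theta is the scattering angle."""
    theta = math.radians(two_theta_deg) / 2.0
    return 4.0 * math.pi * math.sin(theta) / wavelength_A


def scattering_angle(s_inv_A: float, wavelength_A: float = 1.54) -> float:
    """Invert the relation: return 2*theta in degrees for a given s in 1/Å."""
    return 2.0 * math.degrees(math.asin(s_inv_A * wavelength_A / (4.0 * math.pi)))


def real_space_length(s_inv_A: float) -> float:
    """Nominal real-space length d = 2*pi/s (in Å) probed at momentum transfer s."""
    return 2.0 * math.pi / s_inv_A


if __name__ == "__main__":
    for s in (0.02, 0.3, 0.4):
        print(f"s = {s:.2f} 1/Å  2θ = {scattering_angle(s):.2f}°  d = {real_space_length(s):.1f} Å")
```

With these definitions, the upper limit of the range used for analysis, s = 0.3 Å⁻¹, corresponds to a nominal length of about 21 Å.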
All SAXS data were processed with the program package ATSAS [16]. The scattering of the buffers was subtracted from that of the samples, and the resulting curves were extrapolated to zero concentration using standard procedures and the program PRIMUS [17]. The resultant curves were used for all calculations and reconstructions. Low-resolution shapes of PgdS were reconstructed by the ab initio method DAMMIF [18]. Twenty models obtained from the program runs were compared and averaged using the program DAMAVER [19], and the most representative model was chosen as the typical model. Because the high-resolution X-ray structure of PgdS from B. subtilis has not been determined, a combination of secondary and tertiary structure modeling programs was applied to develop an atomistic representation of the PgdS subunits, and the program SASREF [20] was then used to determine the relative positions of the subunits. The program CORAL [21] was used to reconstruct missing fragments of the available high-resolution structures using the full amino acid sequences. To account for the flexibility of the protein, the program EOM [22] was also used to analyze the PgdS enzyme as an assembly of different conformers.
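The buffer subtraction and zero-concentration extrapolation described above can be sketched in a few lines of NumPy. This is an illustrative sketch only, not the PRIMUS implementation: it assumes that all curves share the same s grid and that the concentration-normalized intensity I(s)/c varies linearly with concentration c at each s. The function names are hypothetical.

```python
import numpy as np


def subtract_buffer(sample, buffer):
    """Subtract the buffer scattering from the sample scattering, point by point."""
    return np.asarray(sample, dtype=float) - np.asarray(buffer, dtype=float)


def extrapolate_to_zero_concentration(concentrations, intensities):
    """Extrapolate I(s)/c to c = 0 with an independent straight-line fit at each s.

    concentrations: shape (n_conc,), e.g. in mg/ml
    intensities:    shape (n_conc, n_s), buffer-subtracted curves on a common s grid
    Returns an array of shape (n_s,) holding the intercepts at c = 0.
    """
    c = np.asarray(concentrations, dtype=float)
    normalized = np.asarray(intensities, dtype=float) / c[:, None]
    # np.polyfit fits every column of a 2D y array separately;
    # row 0 of the result is the slope, row 1 the intercept.
    slope, intercept = np.polyfit(c, normalized, deg=1)
    return intercept


if __name__ == "__main__":
    c = np.array([1.1, 3.0, 7.2])            # concentrations in mg/ml
    ideal = np.array([10.0, 5.0, 1.0])       # synthetic I(s)/c at infinite dilution
    curves = c[:, None] * (ideal - 0.2 * c[:, None])  # synthetic concentration effect
    print(extrapolate_to_zero_concentration(c, curves))
```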

Homology structural modeling of the PgdS domains
The secondary structure prediction of PgdS was performed using the PsiPred server [23], and the 3D model was generated by SWISS-MODEL [24]. Structure validation and quality control were done with Procheck [25] and the WhatCheck module on the WhatIf server [26].

Enzyme assay
PgdS activity was assayed using γ-PGA as the substrate. γ-PGA was reagent grade and purchased from Sigma-Aldrich. γ-PGA (100 μg) was incubated with 2 μM enzyme in a 100 μl reaction volume containing 50 mM citric acid-sodium citrate, pH 6.0. Reactions were incubated at 37˚C for 2 hours and then stopped by heat treatment for 5 min at 95˚C. Products were separated on a 0.8% agarose gel. γ-PGA in the gel was visualized with methylene blue stain.

Dynamic light scattering
Dynamic light scattering (DLS) measurements were performed using a DynaPro NanoStar instrument (Wyatt Technology Europe GmbH, Germany) with a 50-μl cuvette. The protein concentration used was about 10 mg/ml. All the DLS measurements were performed at 25˚C and at an angle of 90˚. The data were analyzed with the Dynamics v7.0 software.
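For orientation, DLS reports a translational diffusion coefficient that is commonly converted to a hydrodynamic radius with the Stokes-Einstein relation, R_h = k_B T / (6πηD). The following Python sketch shows this conversion. It is an assumption that this is how the values were obtained here, since the text states only that the Dynamics software was used; the default viscosity is that of pure water at 25˚C (about 0.89 mPa·s), which ignores any buffer contribution, and the function name is hypothetical.

```python
import math

BOLTZMANN = 1.380649e-23  # Boltzmann constant in J/K


def hydrodynamic_radius_nm(diffusion_m2_s: float,
                           temperature_K: float = 298.15,
                           viscosity_Pa_s: float = 8.9e-4) -> float:
    """Stokes-Einstein relation: R_h = k_B*T / (6*pi*eta*D), returned in nm."""
    radius_m = BOLTZMANN * temperature_K / (6.0 * math.pi * viscosity_Pa_s * diffusion_m2_s)
    return radius_m * 1e9


if __name__ == "__main__":
    # Illustrative diffusion coefficient only, not a measured value from this study.
    print(f"R_h = {hydrodynamic_radius_nm(8.0e-11):.2f} nm")
```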

Models of the three PgdS NlpC/P60 domains
To date, the high-resolution X-ray crystal structure of PgdS from B. subtilis has not been determined, and our own attempts to determine the structure of the PgdS enzyme have been unsuccessful. PgdS belongs to the NlpC/P60 family and is characterized as a DL-endopeptidase [12]. Sequence analyses reveal that three tandem repeats of the NlpC/P60 module are present in PgdS, each with about 35% identity to the sequence of the NlpC/P60 family [12] (Fig 1). We therefore used a combination of well-established secondary and tertiary structure modeling programs to develop an atomistic representation of the three PgdS domains. The secondary structure prediction of PgdS shows an alternating pattern of α-helices and β-strands along the length of the sequence, in common with three tandem repeats of the NlpC/P60 fold (S1 Fig). Further, the tertiary structures of the three domains of PgdS were modeled using SWISS-MODEL [24]. The N-terminal domain 1 (residues 33-159) was predicted based on the structure of the NlpC/P60 domain in a putative cell wall hydrolase Tn916-like protein (PDB: 4HPE), which has 37.5% sequence identity to PgdS, whereas the middle domain 2 (residues 160-287) and the C-terminal domain 3 (residues 288-413) were both predicted based on the structure of a lipoprotein (PDB: 4FDY), with sequence identities of 42.2% and 39.3%, respectively.
All three models present a typical NlpC/P60 fold, made up of a central β-sheet of five antiparallel β-strands surrounded by four α-helices, and the three models are very similar, with root-mean-square deviation (r.m.s.d.) values ranging from 0.3 to 0.7 Å over all Cα atoms. Only small differences occur in the lengths of the secondary structure elements and in the loops that link them (Fig 2). The geometry of the models was further validated by Procheck [25] and WhatIf [26]. The resulting Ramachandran plot reveals that over 80% of the amino acids fall in the preferred ϕ/ψ backbone dihedral angle regions and that the models contain only 2% outliers in disallowed regions. The overall geometry and packing validation parameters calculated by WhatIf [26] correspond to a good-quality model.
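The r.m.s.d. comparison over Cα atoms mentioned above can be illustrated with a short NumPy sketch based on the Kabsch algorithm. This is not necessarily the procedure used in the study, which does not name the superposition program; the sketch assumes two equally sized, already paired sets of Cα coordinates, and the function name is hypothetical.

```python
import numpy as np


def kabsch_rmsd(P, Q):
    """R.m.s.d. between paired coordinate sets P and Q (N x 3) after optimal superposition."""
    P = np.asarray(P, dtype=float)
    Q = np.asarray(Q, dtype=float)
    # Remove the translation by centering both sets on their centroids.
    P_c = P - P.mean(axis=0)
    Q_c = Q - Q.mean(axis=0)
    # Covariance matrix and its singular value decomposition.
    H = P_c.T @ Q_c
    U, _, Vt = np.linalg.svd(H)
    # Guard against an improper rotation (reflection).
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    # Rotate P onto Q and measure the remaining deviation.
    P_rot = P_c @ R.T
    return float(np.sqrt(np.mean(np.sum((P_rot - Q_c) ** 2, axis=1))))


if __name__ == "__main__":
    P = np.array([[0, 0, 0], [1.5, 0, 0], [0, 2, 0], [0, 0, 3], [1, 1, 1]], dtype=float)
    a = np.radians(30.0)
    Rz = np.array([[np.cos(a), -np.sin(a), 0], [np.sin(a), np.cos(a), 0], [0, 0, 1]])
    Q = P @ Rz.T + np.array([5.0, -2.0, 1.0])  # same shape, rotated and shifted
    print(kabsch_rmsd(P, Q))  # close to zero for identical shapes
```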
The NlpC/P60 domain is responsible for the catalytic activity and is highly modular. Many NlpC/P60 proteins are characterized by a single catalytic NlpC/P60 domain associated with other components, such as LysM, SH3 and choline-binding domains, to form a multifunctional protein [12,29]. For instance, the B. subtilis autolysins LytF, LytE, and CwlS, each carrying multiple tandem repeats of LysM and a single NlpC/P60 domain [29], are localized at cell-separation sites during vegetative growth [30,31]. To date, several structures of NlpC/P60 proteins have been solved, either with their fused domains or as single catalytic NlpC/P60 domains [32-36]. Beyond that, RflaF_05439 from Ruminococcus flavefaciens is the only current example of a duplicated NlpC/P60 domain [37], and PgdS from B. subtilis even carries three copies of this domain. In the PgdS enzyme, it has been demonstrated that only the second repeat of the NlpC/P60 domain is functional [12,13], but no mention has been made of the function of the other two repeats. Next, SAXS combined with other techniques is used to study the full-length PgdS, which can give useful hints into the mechanisms of γ-PGA degradation by the PgdS hydrolase.

Fig 1. ... [27] and edited by hand to match the structural similarity where appropriate by using ALINE [28]. Identical and similar residues are highlighted in black and grey, respectively. The secondary structure elements are based on domain 2 of PgdS; α-helices and β-strands are marked by red pillars and blue arrows, respectively. The strictly conserved cysteine/histidine/glutamine (asparagine or histidine) catalytic triad is marked with red triangles. Three conserved residues that contribute to the formation of the catalytic core are marked with red circles.

... Table), which is consistent with size-exclusion chromatography results (data not shown). The distance distribution function p(r) for PgdS is shown in Fig 3D.
The profile of the p(r) function for PgdS in solution is characteristic of an elongated body with cross-sections of ~22 Å and a maximal particle dimension Dmax of 93 Å. To obtain more specific structural information, ab initio modeling was applied using the program DAMMIF [18]. Twenty independent models generated with the algorithm give reproducible results and are good approximations to the experimental data, with a discrepancy value χ² = 1.05 for PgdS (Fig 3A, green line). The final models display an elongated shape for PgdS, consistent with the p(r) function.
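The link between the p(r) function and the real-space radius of gyration can be made concrete with the standard relation Rg² = ∫r²p(r)dr / (2∫p(r)dr). The sketch below evaluates it by trapezoidal integration and checks it on a synthetic homogeneous sphere, for which Rg = √(3/5)·R; it is a generic illustration, not the routine used to obtain the values reported here, and the function names are hypothetical.

```python
import numpy as np


def trapezoid(y, x):
    """Trapezoidal rule, written out to avoid depending on a specific NumPy version."""
    y = np.asarray(y, dtype=float)
    x = np.asarray(x, dtype=float)
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))


def rg_from_pr(r, p):
    """Real-space radius of gyration: Rg^2 = int r^2 p(r) dr / (2 int p(r) dr)."""
    r = np.asarray(r, dtype=float)
    p = np.asarray(p, dtype=float)
    return float(np.sqrt(trapezoid(r ** 2 * p, r) / (2.0 * trapezoid(p, r))))


if __name__ == "__main__":
    # Synthetic test case: homogeneous sphere of radius 30 Å (diameter D = 60 Å).
    D = 60.0
    r = np.linspace(0.0, D, 2001)
    p = r ** 2 * (1.0 - 1.5 * r / D + 0.5 * (r / D) ** 3)
    print(rg_from_pr(r, p), np.sqrt(3.0 / 5.0) * 30.0)
```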
A more detailed model of the full-length structure of PgdS was generated by using the three domains built by SWISS-MODEL as rigid bodies for SASREF [20] modeling. The SASREF rigid-body model fits the experimental data very well (χ² = 1.15) (Fig 3A, blue line). The SASREF model reveals that the domains are arranged as a crescent-shaped body, with domain 1 slightly apart from the other two. To further refine the rigid-body model, the linker loops between the domains were restored by CORAL [21] using the SASREF model as a basis. Based on the PsiPred [23] results, two loop regions (residues 158-163 and 288-294) connecting the three domains were defined. The CORAL restorations also yield good fits to the experimental SAXS data (χ² = 1.09) (Fig 3A, cyan line). Importantly, the CORAL reconstructions are in good agreement with the DAMMIF models, as demonstrated in Fig 4A. Thus, two independent methods give consistent results, supporting the notion that the models presented here represent the solution structure.
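The discrepancy values χ² quoted above compare a model curve with the experimental data. A common definition, assumed here because the text does not spell it out, is χ² = 1/(N−1) Σ[(I_exp(s_j) − c·I_calc(s_j))/σ(s_j)]², with the scale factor c chosen by weighted least squares. The following sketch implements that definition; the function name is hypothetical and the numbers in the example are synthetic.

```python
import numpy as np


def scaled_chi2(i_exp, i_calc, sigma):
    """Return (chi2, scale) for a model curve fitted to data with a single scale factor.

    chi2  = 1/(N-1) * sum(((i_exp - scale*i_calc) / sigma)**2)
    scale = weighted least-squares factor that minimizes the sum above
    """
    i_exp = np.asarray(i_exp, dtype=float)
    i_calc = np.asarray(i_calc, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    weights = 1.0 / sigma ** 2
    scale = np.sum(weights * i_exp * i_calc) / np.sum(weights * i_calc ** 2)
    residuals = (i_exp - scale * i_calc) / sigma
    chi2 = np.sum(residuals ** 2) / (i_exp.size - 1)
    return float(chi2), float(scale)


if __name__ == "__main__":
    model = np.array([100.0, 50.0, 20.0, 5.0])
    data = 2.5 * model + np.array([0.4, -0.3, 0.2, -0.1])  # synthetic noisy data
    errors = np.full(model.shape, 0.5)
    print(scaled_chi2(data, model, errors))
```

Values close to 1 indicate that the model reproduces the data within the experimental errors, which is the sense in which the fits above are described as good.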

The pH effect on the conformation of PgdS
The solution structure of PgdS was also investigated at pH 5.0 and pH 8.0, in the same way as for PgdS at pH 6.0 (S2 Fig). The scattering patterns and the p(r) functions of PgdS are shown in Fig 3B-3D. The estimated molecular masses at both pH values also suggest a monomeric state for PgdS in solution and are consistent with the expected value (S1 Table). The real-space Rg and Dmax of PgdS at pH 5.0 are noticeably smaller than those of the protein at pH 8.0, with Rg of 25.8 Å and Dmax of ~87 Å at pH 5.0 and Rg of ~27.6 Å and Dmax of ~27.6 Å at pH 8.0 (S1 Table). The decrease in both Rg and Dmax indicates a more compact state of PgdS at pH 5.0 than at pH 8.0. To further confirm these results, dynamic light scattering measurements were performed. The hydrodynamic radius Rh of PgdS at pH 8.0 was 3.0 nm, whereas the corresponding Rh of PgdS at pH 5.0 and 6.0 were both 2.8 nm (S3 Fig). The Rh values from DLS are broadly in line with the Rg values from SAXS, which indicates a similar overall conformation of PgdS at the different pH values. We next performed ab initio shape reconstructions and rigid-body refinement on PgdS at pH 5.0 and pH 8.0, following the strategy described above for PgdS at pH 6.0. The models generated by DAMMIF [18], SASREF [20] and CORAL [21] are in good agreement, as shown in Fig 3B and 3C. The models of PgdS at both pH 5.0 and pH 8.0 exhibit crescent-shaped bodies, with domain 2 and domain 3 tightly associated, similar to the reconstructed model of PgdS at pH 6.0. However, the N-terminal domain 1 of PgdS at pH 5.0 lies closer to the other two domains than it does at pH 8.0, which may be the cause of the decrease in Rg and Dmax (Fig 4B).
To further validate these assumptions, EOM [22] was used to describe the PgdS proteins. With the program EOM, a large pool of 10,000 different conformations was generated to analyze the flexibility of the protein, and an optimized ensemble of 50 models that best describes the SAXS data was selected. For PgdS at pH 5.0 and pH 6.0, both the Rg and Dmax distribution functions have a single peak, at Rg around 27 Å and Dmax around 92 Å, respectively (Fig 5A and 5B), which is basically consistent with the overall structural parameters from the SAXS data. This implies that PgdS at pH 5.0 and pH 6.0 may exist mainly in a compact state. However, for PgdS at pH 8.0, both the Rg and Dmax distribution functions have a broadened peak, ranging from ~25 to 31 Å and from ~88 to 107 Å, respectively (Fig 5C). This means that the full-length protein has a degree of flexibility at pH 8.0 and probably undergoes continuous conformational changes in solution. Considering that the optimal pH of the PgdS enzyme is 5.0, it seems that the compact state of PgdS may facilitate the catalytic reaction. Therefore, our results indicate that PgdS adopts a more extended state as the pH increases, probably because the N-terminal domain 1 extends away from the other two domains. In contrast, domains 2 and 3 remain rigidly associated with limited flexibility, regardless of the environmental pH.
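The comparison between the SAXS and DLS results can be made quantitative through the ratio Rg/Rh, which equals √(3/5) ≈ 0.775 for a homogeneous sphere and is larger for elongated or extended particles. The short sketch below evaluates this ratio for the Rg and Rh values quoted in the text. The ratio is not reported in the study, and reading it as a shape indicator is an approximation offered here for illustration only; the function name is hypothetical.

```python
import math

SPHERE_RATIO = math.sqrt(3.0 / 5.0)  # Rg/Rh for a homogeneous sphere, about 0.775


def shape_factor(rg_angstrom: float, rh_nm: float) -> float:
    """Return Rg/Rh after converting Rh from nm to Å."""
    return rg_angstrom / (rh_nm * 10.0)


if __name__ == "__main__":
    # Rg (SAXS) and Rh (DLS) values as quoted in the text.
    for label, rg, rh in (("pH 5.0", 25.8, 2.8), ("pH 8.0", 27.6, 3.0)):
        print(f"{label}: Rg/Rh = {shape_factor(rg, rh):.2f} (sphere: {SPHERE_RATIO:.3f})")
```

Both ratios come out at about 0.92, above the value for a sphere, which is compatible with the elongated shape inferred from the p(r) function.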

Catalytic core and possible substrate delivery route
The PgdS protein carries three copies of the NlpC/P60 domain, of which only the second is functional [13]. The NlpC/P60 domain represents a family of papain-like cysteine peptidases with a strictly conserved cysteine/histidine/glutamine (asparagine or histidine) catalytic triad [12]. Indeed, the multiple sequence alignment of PgdS reveals that only domain 2 has the complete catalytic triad, Cys194-His247-Gln259, whereas domains 1 and 3 carry a proline and a threonine, respectively, in place of the cysteine (Fig 1). Overall, from the CORAL ... (Fig 3A). Domain 2 represents a typical NlpC/P60 catalytic domain with a strictly conserved catalytic core (Fig 1). Around the catalytic cysteine, an aspartate, a serine and a tyrosine are strictly conserved in domain 2 of PgdS (corresponding to residues Asp193, Ser195 and Tyr181), and these are presumed to be involved in specific substrate binding. In addition, the conserved phenylalanine and tyrosine are also present in domain 2 of PgdS (corresponding to residues Phe183 and Tyr241). Recent studies suggest that these residues likely act as a gate controlling access to the catalytic core, in which the side chain of the phenylalanine can switch to a different rotamer to expose the catalytic core for substrate binding or product release [34].
The electrostatic surface of PgdS obtained from the CORAL models is presented in Fig 6A. A ~20 Å positively charged surface is located on PgdS at the junction of domain 2 and domain 3. This positively charged surface runs along the interface from the inside to the outside of the crescent-shaped body and extends to the catalytic core of domain 2 through the gate formed by Phe183 and Tyr241 [34]. Several basic amino acids from the two domains, such as Lys359, Arg284, Lys223 and Lys242, reside in this region (Fig 6B). To investigate the possible function of the positively charged surface, three residues, Lys359, Arg284 and Lys242, were mutated. Interestingly, all three PgdS mutants displayed defects in their ability to degrade γ-PGA compared with the wild-type enzyme (Fig 6C). In a 2-hour reaction, the K359A and K242A mutants show a modest reduction in γ-PGA degradation, whereas the R284A mutant shows an obvious decrease. These results suggest that these residues are involved in the catalytic reaction, although all of them are far away from the catalytic core. PgdS is characterized as a DL-endopeptidase, which exclusively cleaves the γ-glutamyl bond between D- and L-glutamic acids [13]. In this context, the way the enzyme distinguishes the compatible γ-glutamyl bonds in the long γ-PGA polymer is very likely based on cooperation between the domains; therefore, this long positively charged area between domain 2 and domain 3 may serve as a substrate delivery route between the enzyme and γ-PGA.
A positively charged surface has also been observed around the active site of other NlpC/P60 fused proteins, such as the B. subtilis autolysin DL-endopeptidases LytF, LytE and CwlS [39], and the poly-γ-glutamate hydrolase P (PghP) from bacteriophage ФNIT1 [40]. Moreover, it has been demonstrated that an inhibitor protein, IseA, can lodge deep in the cleft of LytF and occlude the active site through interaction with the positively charged surface [39]. It must be noted that there is still only a limited understanding of how the enzyme anchors onto γ-PGA and how the substrate is delivered to the catalytic domain, and the mechanism of substrate delivery and recognition by PgdS is not firmly established in this study.

Conclusion
In summary, the study presented here gives the first depiction of the full-length PgdS protein. Although SAXS is a low-resolution method, it can provide useful overall structural information on PgdS, such as the crescent-shaped body of the full-length protein and the positively charged surface at the interface of domains 2 and 3, which may be relevant to its biological function. To fully understand the mechanisms of γ-PGA degradation by the PgdS hydrolase, the high-resolution structure of PgdS is still needed. In addition, the results of this study provide valuable practical information for the controllable degradation of γ-PGA by enzymatic depolymerization.

Table. Overall structural parameters of the PgdS proteins at various pH from SAXS data. (DOC)