Circularly permuted variants of two CG-specific prokaryotic DNA methyltransferases

The highly similar prokaryotic DNA (cytosine-5) methyltransferases (C5-MTases) M.MpeI and M.SssI share the specificity of eukaryotic C5-MTases (5’-CG), and can be useful research tools in the study of eukaryotic DNA methylation and epigenetic regulation. In an effort to improve the stability and solubility of complementing fragments of the two MTases, genes encoding circularly permuted (CP) variants of M.MpeI and M.SssI were created and cloned in a plasmid vector downstream of an arabinose-inducible promoter. MTase activity of the CP variants was tested by digestion of the plasmids with methylation-sensitive restriction enzymes. Eleven of the fourteen M.MpeI permutants and six of the seven M.SssI permutants had detectable MTase activity, as indicated by the full or partial protection of the plasmid carrying the cpMTase gene. Permutants cp62M.MpeI and cp58M.SssI, in which the new N-termini are located between conserved motifs II and III, had by far the highest activity. The activity of cp62M.MpeI was comparable to the activity of wild-type M.MpeI. Based on the location of the split sites, the permutants possessing MTase activity can be classified into ten types. Although most permutation sites were designed to fall outside of conserved motifs, and the MTase activity of the permutants measured in cell extracts was in most cases substantially lower than that of the wild-type enzyme, the high proportion of circular permutation topologies compatible with MTase activity is remarkable, and is new evidence for the structural plasticity of C5-MTases. A computer search of the REBASE database identified putative C5-MTases with a CP arrangement. Interestingly, all natural circularly permuted C5-MTases appear to represent only one of the ten types of permutation topology created in this work.


Introduction
DNA methylation plays important roles in several biological phenomena such as restriction-modification in prokaryotes, and genomic imprinting, X-chromosome inactivation and silencing of selfish genetic elements in eukaryotes. Biological DNA methylation is catalyzed by DNA methyltransferases (DNA MTases), which transfer a methyl group from the universal methyl [...] be, in lack of its natural environment, exposed to the solvent. We thought that this problem could be circumvented by using fragments obtained from circularly permuted variants of the enzymes, in which the C-terminal α-helix is covalently linked to the N-terminus of the enzyme. For proteins, circular permutation is a rearrangement of the amino acid sequence in which the original N- and C-termini are covalently linked and new ends are created by splitting the polypeptide chain somewhere else [33,34]. Construction of circularly permuted and enzymatically active M.MpeI and M.SssI variants appeared feasible because in the structural models the N- and C-termini were located close to each other [23,30].
Here we describe the construction and initial characterization of circularly permuted variants of M.MpeI and M.SssI. We show that the majority of the CP variants have detectable MTase activity. Moreover, we show that the most active CP variants are capable of fragment complementation and that some complementing fragments are more soluble than fragments derived from the wild-type enzyme.
[Figure legend fragment] [...] [23]. The "long linker" and the TRD are marked according to Choe et al. [24].
[...] controlled by the E. coli araBAD promoter and the AraC protein, and can be induced by arabinose and repressed by glucose [37,38].
To construct a tandemly duplicated M.MpeI gene, the M.MpeI coding sequence was PCR-amplified using pET-28a::MMpe as template and AK387 and AK388 as primers (S1 Table). The primers added XhoI sites to the ends of the amplified DNA fragment. The PCR product was digested with XhoI and cloned into the unique XhoI site located at the 3'-end of the M.MpeI gene in pET-28a::MMpe. The resulting plasmid (pET-tdM.MpeI) contains two tandemly arranged copies of the M.MpeI gene fused in frame.
A plasmid containing the duplicated M.SssI gene was constructed by PCR synthesis of the M.SssI coding sequence using pBNH-M.SssI as template and AK413 and AK414 as primers (S1 Table). The primers introduced an upstream NcoI site and a downstream XhoI site into the PCR product. The synthesized fragment was digested with NcoI and XhoI, and cloned between the unique NcoI and XhoI sites of pBNH-MSssI to create pB-tdM.SssI. In plasmid pB-tdM.SssI the two copies of the M.SssI gene are fused in frame.
Plasmids expressing circularly permuted variants of M.MpeI and M.SssI were constructed by PCR using pET-tdM.MpeI or pB-tdM.SssI as template, and the primers listed in Tables 1, 2 and S1. The forward primers contained an in-frame ATG codon, whereas the reverse primers contained the complement of an in-frame stop codon (S1 Table). In some cases a GGT triplet (Gly) was added after the start codon. To facilitate cloning of the PCR products, the PCR primers contained restriction sites as 5'-extensions (S1 Table). The PCR products were cloned in pBAD24 [37].
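The relationship between the tandemly duplicated gene and a circular permutant can be illustrated at the protein level with a short sketch. The function name `circular_permutant`, its parameters and the toy sequence in the usage note are hypothetical and serve only as illustration; the sketch also assumes that the new start Met (and, optionally, Gly) is simply prepended to the residue chosen as the new N-terminus, and that an optional linker joins the native termini.

```python
def circular_permutant(protein, new_start, linker="", add_gly=False):
    """Return the amino acid sequence of a circular permutant (illustrative).

    protein   : wild-type sequence in one-letter code
    new_start : 1-based residue number that becomes the new N-terminus
    linker    : optional peptide joining the native C- and N-termini
    add_gly   : prepend Gly after the new start Met (done for some variants)
    """
    if not 1 <= new_start <= len(protein):
        raise ValueError("new_start must lie within the protein")
    # Two copies fused in frame, mimicking the tandemly duplicated gene.
    tandem = protein + linker + protein
    begin = new_start - 1                      # 0-based index in the first copy
    end = begin + len(protein) + len(linker)   # one unit length plus linker
    body = tandem[begin:end]
    # Assumption: the start codon adds a Met in front of the new N-terminus.
    prefix = "M" + ("G" if add_gly else "")
    return prefix + body
```

For a toy sequence, `circular_permutant("ABCDEFGH", 4)` gives `"MDEFGHABC"`: the window is one unit length long and starts at residue 4 of the first copy, which is what PCR on a tandem template with appropriately placed primers achieves.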
To obtain the cp62M.MpeI-280 variant, which differs from cp62M.MpeI by a linker peptide (GGGSG) separating the native N- and C-termini, the AK280-AK281 duplex (S1 Table) was cloned into the XhoI site located between the two M.MpeI gene copies in pET-tdM.MpeI. The resulting plasmid (pET-tdM.MpeI-280) served as template for PCR synthesis of the cp62M. [...] (Table 3) were constructed by PCR synthesis of the corresponding gene segments and cloning the PCR products either in pBAD24 or pOK-BAD. The templates and primers used for the PCR synthesis are listed in S2 Table.

Enzymes and chemicals
Restriction endonucleases, T4 DNA ligase and Phusion DNA polymerase were purchased from Thermo Scientific or New England Biolabs. S-adenosyl-L-[methyl-3H]methionine ([methyl-3H]SAM) was purchased from PerkinElmer and unlabeled S-adenosyl-L-methionine from New England Biolabs.

Estimation of DNA methyltransferase activity
Methyltransferase activity in E. coli was routinely estimated by restriction protection assay. Cells with the plasmid encoding the MTase variant to be tested were grown in LB/Ap medium to OD600 ~0.5, then production of the MTase was induced by adding 0.1% arabinose, and growth was continued for 5 hours at 30˚C. Plasmid DNA extracted from the cultures was digested with the CG-specific methylation-sensitive restriction endonucleases Hin6I and/or Eco47I. Hin6I recognizes the 5'-GCGC-3' sequence but does not cut when the underlined cytosines are methylated: 5'-GCGC/5'-GCGC (Kazlauskiene et al., cited in REBASE [15]). Eco47I recognizes the sequence 5'-GGWCC. The hemimethylated recognition site (5'-GGWCC/5'-GGWCC) is resistant to cleavage (Kazlauskiene et al., cited in REBASE [15]).
To test complementation between split MTase fragments in vivo, segments of the M.MpeI and M.SssI genes were cloned in pBAD24 or pOK-BAD as described above. Pairs of plasmids carrying different parts of the same MTase gene in pBAD24 and pOK-BAD vector were cotransformed into E. coli DH10B. Induction of MTase fragment production by arabinose and analysis of the methylation status of the plasmid preparations were done as described above for the full-length enzymes.
Methyltransferase activity was measured in vitro in crude extracts using S-adenosyl-L-[methyl-3H]methionine. Cultures were grown to a density of OD600 ~0.5, then MTase production was induced by adding 0.1% arabinose, and growth was continued for 5 hours at 30˚C. Cells from 18 ml were harvested by centrifugation, resuspended in 2 ml of a buffer containing 50 mM Tris-HCl pH 8.0, 300 mM NaCl and 5% glycerol, and disrupted by sonication. Cell debris was removed by centrifugation in a bench-top centrifuge (13,000 rpm, 3 min at 4˚C). MTase activity was measured using λ phage DNA (Thermo Fisher) and [methyl-3H]SAM essentially as described previously [31].

SDS-polyacrylamide gel electrophoresis
Cell extracts were prepared as described above and proteins were analyzed by SDS-polyacrylamide gel electrophoresis using conventional SDS-polyacrylamide gels [36]. Solubility of the protein of interest was estimated by comparing Coomassie blue-stained samples from total cell extracts and from supernatants obtained after removal of the cell debris by centrifugation (13,000 rpm, 3 min at 4˚C).
Circularly permuted C5-MTases in the REBASE database [15] were identified by a computer-aided search of 20,677 entries using the criterion that motif X precedes motif I in the amino acid sequence. The search was implemented using the Biopython tools [43] (see S1 Appendix). Each sequence found in the search was checked manually to exclude false positives.
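The search criterion can be sketched in a few lines of Python. This is not the script of S1 Appendix: the two regular expressions are simplified, hypothetical stand-ins for motifs I and X, the function names are invented for illustration, and a minimal FASTA parser from the standard library is used so that the sketch runs without Biopython (with Biopython installed, `Bio.SeqIO.parse` could replace `parse_fasta`).

```python
import re

# Hypothetical, simplified motif patterns used only for illustration;
# the patterns actually used in the study are not given in this text.
MOTIF_I = re.compile(r"F.G.GG")
MOTIF_X = re.compile(r"GN[SA]V")

def parse_fasta(text):
    """Parse FASTA-formatted text into a {name: sequence} dictionary."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].split()[0]
            records[name] = []
        elif name is not None:
            records[name].append(line)
    return {n: "".join(parts) for n, parts in records.items()}

def motif_x_precedes_motif_i(seq):
    """True if the first motif X match starts before the first motif I match."""
    m_i = MOTIF_I.search(seq)
    m_x = MOTIF_X.search(seq)
    if m_i is None or m_x is None:
        return False
    return m_x.start() < m_i.start()

def find_cp_candidates(fasta_text):
    """Return names of sequences satisfying the motif-order criterion."""
    return [name for name, seq in parse_fasta(fasta_text).items()
            if motif_x_precedes_motif_i(seq)]
```

Because short patterns of this kind match by chance, a filter like this only produces candidates, which is consistent with the manual check for false positives described above.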

Design and construction of circularly permuted MTase variants
In the crystal structure of the M.MpeI-DNA-S-adenosyl-homocysteine complex (PDB: 4DKJ) as well as in the computational model of the M.SssI ternary complex, the N- and C-termini of the enzymes are located close to each other in space, which encouraged us to construct circularly permuted variants. Permutation sites were designed with the web-based tool Circular Permutation Site Predictor (CPred) [39]. CPred uses the 3D structure of the protein of interest as an input and assigns a Probability Estimate (P.E.) to each amino acid of the molecule. Residues with P.E. values above 0.5 are considered by the program to be viable permutation sites. The distribution of viable and non-viable cleavage sites along the peptide chain of M.MpeI is shown in Fig 3. Most of the predicted viable permutation sites are clustered in blocks which, as one would expect, coincide with non-conserved regions. Because of insufficient quality in some regions, the computational model of M.SssI was not accepted by CPred. To select permutation sites for M.SssI (Fig 4), we used the P.E. values calculated for M.MpeI, and identified the corresponding residues in M.SssI by sequence alignment. To minimize the possibility of disrupting the native structure, most split sites were designed to fall outside of conserved motifs and the target recognition domain (Figs 3 and 4).
[Figure legend fragment] The new N-termini of the active and inactive variants are indicated above the sequence by blue and outlined numbers, respectively. Conserved motifs are boxed, α-helices are marked under the sequence by rectangles and β-strands by arrows [30]. The most conserved residues are printed in bold. Regions predicted by the CPred program [39] to contain viable permutation sites are highlighted in yellow.
The amino acid sequences of the permuted MTases are shown in S1 and S2 Datasets.
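The two selection steps described here, thresholding the P.E. values at 0.5 and transferring a site from M.MpeI to M.SssI through a sequence alignment, can be sketched as follows. The function names and the toy inputs in the usage note are hypothetical; the sketch does not call CPred and assumes that the P.E. values and a gapped pairwise alignment are already available.

```python
def viable_blocks(pe_values, threshold=0.5):
    """Group residues with P.E. above the threshold into contiguous blocks.

    pe_values : list of P.E. values, index 0 corresponding to residue 1
    Returns a list of (first_residue, last_residue) tuples, 1-based inclusive.
    """
    blocks = []
    start = None
    for residue, pe in enumerate(pe_values, start=1):
        if pe > threshold:
            if start is None:
                start = residue          # a new block opens here
        elif start is not None:
            blocks.append((start, residue - 1))
            start = None
    if start is not None:                # block running to the C-terminus
        blocks.append((start, len(pe_values)))
    return blocks

def map_residue(aligned_ref, aligned_target, ref_residue):
    """Map a 1-based residue number of the reference onto the target.

    aligned_ref, aligned_target : equal-length alignment rows, '-' for gaps
    Returns the 1-based target residue number, or None if the reference
    residue is aligned to a gap or lies outside the alignment.
    """
    if len(aligned_ref) != len(aligned_target):
        raise ValueError("alignment rows must have equal length")
    ref_pos = target_pos = 0
    for r, t in zip(aligned_ref, aligned_target):
        if r != "-":
            ref_pos += 1
        if t != "-":
            target_pos += 1
        if r != "-" and ref_pos == ref_residue:
            return target_pos if t != "-" else None
    return None
```

With the toy alignment rows `"ABCDE-F"` and `"A--DEXF"`, reference residue 6 maps to target residue 5, which shows how equivalent permutation sites can carry different residue numbers in the two enzymes (as with cp62M.MpeI and cp58M.SssI).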
To explore the permutation potential of the MTases, variants with split sites distributed over the whole molecule were created. Plasmids encoding fourteen M.MpeI and seven M.SssI permutants were constructed (Tables 1 and 2). Collectively, the permutation sites in the two enzymes represent each segment separating the adjacent conserved motifs (Figs 3 and 4).

Methyltransferase activity of the circularly permuted MTases
MTase activity was first tested by a restriction protection assay as described in Materials and Methods. Plasmids encoding wild-type M.MpeI or M.SssI were almost completely protected against Hin6I digestion even if the plasmid was isolated from uninduced cells. Plasmids expressing the tandemly duplicated M.MpeI or M.SssI (pET-tdM.MpeI and pB-tdM.SssI) showed a similar level of resistance to the plasmids expressing wild-type M.MpeI or M.SssI (not shown), suggesting that the activity of the fused dimers was comparable to that of the wild-type enzymes. Previously, similar observations were made with the tandemly duplicated variant of another C5-MTase, M.HaeIII [14].
In some experiments the restriction enzyme Eco47I was used to detect CG-specific methylation. Eco47I recognizes GGWCC sites but cannot cleave when the underlined cytosine is methylated (Kazlauskiene et al., cited in REBASE [15]). At one of the Eco47I sites in the pcpM.MpeI and pcpM.SssI plasmids the 3'-C is followed by a G, creating an M.SssI/M.MpeI substrate site. Methylation of this CG site blocks Eco47I cleavage and yields a 1059 bp protected fragment. The results of Eco47I digestion were in agreement with the results obtained with Hin6I digestion: the 1059 bp fragment was detectable in the digests of all plasmids showing some level of protection against Hin6I digestion, but was missing from the digests of pcp222M.MpeI, pcp357M.MpeI and pcp377M.MpeI, which were fully digestible with Hin6I. Moreover, similarly to the Hin6I digestion, the plasmid pcp62M.MpeI showed slight resistance to Eco47I digestion even in the uninduced state (S3 Fig).
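How an Eco47I site comes to overlap a CG dinucleotide can be checked on a plasmid sequence with a short sketch. The function name and the toy sequences are hypothetical, and the actual plasmid sequences are not reproduced here. The sketch treats the plasmid as circular and, as an added assumption based on strand symmetry, also flags sites preceded by a C, where the CG overlap lies on the complementary strand.

```python
import re

# Lookahead pattern so that overlapping GGWCC sites (W = A or T) are all found.
SITE = re.compile(r"(?=GG[AT]CC)")

def eco47i_sites(plasmid):
    """List (position, cg_overlap) for every GGWCC site in a circular plasmid.

    position   : 0-based index of the first G of the site on the top strand
    cg_overlap : True if the 3'-terminal C of the site is followed by G on
                 the top strand (GGWCCG) or, by symmetry, on the bottom
                 strand (top strand reads CGGWCC)
    """
    seq = plasmid.upper()
    n = len(seq)
    # Append the first four bases so that sites spanning the origin are found.
    extended = seq + seq[:4]
    results = []
    for match in SITE.finditer(extended):
        pos = match.start()
        after = seq[(pos + 5) % n]    # base following the site, top strand
        before = seq[(pos - 1) % n]   # base preceding the site, top strand
        results.append((pos, after == "G" or before == "C"))
    return results
```

Only sites flagged `True` can report CG-specific methylation in this assay; the remaining Eco47I sites are cut regardless of M.MpeI or M.SssI activity, which is why a single protected fragment is diagnostic.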
Of the seven circularly permuted M.SssI variants, only cp308M.SssI was inactive (Fig 6). Plasmids encoding cp33M.SssI, cp58M.SssI, cp156M.SssI, cp173M.SssI, cp243M.SssI and cp357M.SssI were resistant to Hin6I digestion when they were purified from arabinose-induced cells, indicating that these CP variants had MTase activity. The plasmid expressing cp58M.SssI showed some protection even in the uninduced state (Figs 6 and S4). The high activity of cp58M.SssI in vivo was not surprising because its permutation site corresponded to that of cp62M.MpeI, which was the most active circular permutant of M.MpeI (Fig 1, Tables 1 and 2).
MTase activity of the CP variants was also estimated in cell extracts by a radioactive assay. The activity measured in the cp62M.MpeI extract was comparable to that of the WT enzyme, whereas the activities measured in the extracts of other cpM.MpeI and cpM.SssI variants were much lower (S12 and S13 Figs). The low activity of cp58M.SssI was especially unexpected because the methylation state of the plasmid encoding cp58M.SssI indicated high activity (see above).
The crystal structure of M.MpeI [30] as well as the computational model of M.SssI [23] suggested that in both MTases the N- and C-termini are close to each other. However, because a few terminal amino acids are missing from the M.MpeI X-ray model (the terminal residues in the model are D6 and N293), we could not determine the exact distance between the two ends of the molecule. Thus it was not possible to exclude that linking the ends would lead to structural perturbations affecting MTase activity. We inserted a flexible linker peptide (GGGSG) between the native N- and C-termini of cp62M.MpeI and cp58M.SssI (S1 and S2 Datasets). The plasmids pcp62M.MpeI-280 and pcp58M.SssI-280 expressing the linker-containing MTases showed the same level of resistance to Eco47I digestion as the respective parental plasmids (pcp62M.MpeI and pcp58M.SssI), suggesting that direct fusion of the native N- and C-termini had no adverse effect on the catalytic activity. The cpM.MpeI and cpM.SssI variants are listed in Tables 1 and 2, respectively. Based on the position of the permutation sites, the permutants possessing detectable MTase activity can be classified into ten types (A through J, Fig 7).

Complementation between fragments of circularly permuted MTases
This work was started on the hypothesis that the poor solubility of C-terminal fragments of M.MpeI and M.SssI was due to the exposure of the C-terminal α-helix to the solvent, and that the native fold of the split fragments could be better preserved by covalently linking the C-terminal α-helix to the large domain. Of all the permutants we constructed, Class J variants appeared to be the best starting material for creating complementing fragments, because splitting them between motif VIII and the TRD would create two polypeptides that approximately correspond to the two domains of the native enzyme (Fig 7). Consistent with the interdomain position of the split site, C5-MTases bisected naturally [45,46] or artificially [24,31,47] between motif VIII and the TRD showed efficient fragment complementation.
The [...] After the failure with Class J, we tried Class B permutants for complementing fragments. In Class B enzymes (cp62M.MpeI and cp58M.SssI, Fig 7) the new N-terminus is between conserved motifs II and III. Of all permutants, Class B variants showed the highest MTase activity. First we tested whether fragments created by splitting WT M.MpeI or M.SssI between motifs II and III have complementation capacity. We constructed the compatible plasmids pB-Mpe[1-61] and pOB-Mpe[62-395] expressing the indicated fragments from the E. coli araBAD promoter (Tables 3 and S2, Fig 8 and [...] Fig, Table 3).
The complementation capacity of cp58M.SssI, the Class B permutant version of M.SssI, was tested in similar experiments. The permutation site of cp58M.SssI exactly corresponds to that of cp62M.MpeI (Fig 1). Two bisection sites were tested; both were designed to fall between conserved motif VIII and the TRD. [...] Fig, Table 3).

Solubility of circularly permuted MTases and their fragments
Production and solubility of the CP variants that had detectable MTase activity in vivo were investigated by SDS-polyacrylamide gel electrophoresis of extracts prepared from uninduced and arabinose-induced E. coli cells. For most cpM.MpeI variants the amounts of the MTase detected by SDS gel electrophoresis correlated with the in vivo activities: WT M.MpeI, cp62M.MpeI, cp245M.MpeI and cp351M.MpeI, which were highly active in vivo (Fig 5), were produced upon induction in relatively large amounts and were soluble, whereas cp122M.MpeI, cp208M.MpeI and cp215M.MpeI, which had low activity, were not detectable on the gels (S7 [...]

Natural circularly permuted C5-MTases
Based on the linear order of conserved motifs, the CP variants showing MTase activity represented ten topological types (Fig 7). Type H, in which the new amino-end is between motif VIII and the TRD, has already been observed in five natural C5-MTases: M.BssHII [11,12], M.Alw26I, M2.Eco31IC, M.Esp3I [13] and M2.BsaI ([14] and observations by Zhu and Xu, cited in REBASE [15]). To explore whether there are natural circularly permuted C5-MTases with a permutation topology different from Type H, we performed a computer search of the C5-MTase sequences available in the REBASE database. The computer search found 27 C5-MTase sequences satisfying the criterion of motif X preceding motif I. The sequences [...]. Because the assignment of the TRD is somewhat arbitrary, it is not easy to decide whether the natural cpMTases fall in the H or in the J class of the proposed classification. Using the assignment of the TRD for M.Alw26I, M2.Eco31IC and M.Esp3I [13], the majority of natural circularly permuted C5-MTases appear to have the Type H arrangement (S5 Dataset).
Prompted by the wish to improve some physicochemical properties of the enzymes, we used a systematic approach to construct circularly permuted variants of the CG-specific C5-MTases M.MpeI and M.SssI. To our knowledge this is the first study describing the construction of designed circularly permuted variants of C5-MTases. Most CP variants created in this work (11 of 14 for M.MpeI and 6 of 7 for M.SssI) had detectable activity in E. coli. Although most permutation sites were designed to fall outside of conserved motifs, and the MTase activity of the permutants measured in cell extracts was in most cases substantially lower than that of the wild-type enzyme, the high proportion of CP variants with detectable activity is still remarkable, and is new evidence for the structural plasticity of C5-MTases. The observed phenotypes, i.e. the presence or absence of MTase activity of the CP variants, were in most cases in agreement with the CPred predictions (Fig 3).
In three of the four inactive CP variants created in this work the permutation sites are in regions with known function: for cp222M.MpeI in motif VIII, for cp377M.MpeI in motif X, and for cp308M.SssI in the TRD (Figs 3 and 4). In the fourth inactive permutant (cp357M.MpeI) the permutation site is in the middle of α-helix 9 (Figs 3 and S10). There are two other permutants (cp351M.MpeI and cp361M.MpeI) whose permutation sites are at the ends of the same α-helix, but these variants have MTase activity (Figs 3 and 5). Apparently, splitting α-helix 9 in the middle (cp357M.MpeI) perturbs the structure more than splitting the helix at the edges (Figs 3 and S10). These results show the importance of α-helix 9 for M.MpeI function and demonstrate the usefulness of CP variants in the identification of functionally important elements of enzymes. Unfortunately, the region corresponding to α-helix 9 of M.MpeI is represented in poor quality in the computational model of M.SssI (S11 Fig), making comparisons between the two MTases difficult. The equivalent permutants (cp361M.MpeI and cp357M.SssI) had similar MTase activities in vivo (Figs 5 and 6).
Of all permutants constructed in this work, Class B enzymes (cp62M.MpeI and cp58M.SssI) had the highest MTase activity (Figs 5 and 6, Tables 1 and 2), suggesting that the surface loop between conserved motifs II and III is rather tolerant to structural perturbations. This notion is also supported by the presence of the relatively long non-conserved region between motifs II and III in M.MpeI and M.SssI. This sequence is absent from many C5-MTases. Consistent with the assumed tolerance of the region between motifs II and III to structural changes, the complementation capacity of the [1-61] + [62-395] fragment pair was higher than that of any other M.MpeI fragment combination we tested.
Although the methylation status of the plasmids showed that the activity of cp58M.SssI was, similarly to the equivalent cp62M.MpeI, higher in vivo than that of the other M.SssI permutants (Fig 6), we could not detect elevated MTase activity in the crude extract of cp58M.SssI. We do not know the reason for this discrepancy; it is possible that cp58M.SssI loses activity during preparation of the cell extract. The observed difference between the activities of cp62M.MpeI and cp58M.SssI was consistent with the difference between the amounts of soluble cp62M.MpeI and cp58M.SssI in crude extracts (S7 and S8 Figs). In our hands M.MpeI and its CP derivatives had higher activity and were easier to work with than M.SssI and its CP variants. It must be noted that in this work the activity of the MTase variants was mainly estimated from the methylation state of plasmid DNA purified from E. coli cells producing the enzyme. The in vivo MTase activity can be influenced by factors such as the solubility and stability of the MTase and its interaction with other proteins; thus the methylation status of the plasmid DNA may not truly reflect the differences between the catalytic activities of the variants. Strict comparison of the specific activities awaits enzymological studies with purified cpMTase variants.
It is interesting to compare the designed CP variants of M.MpeI and M.SssI with the CP variants of M.HaeIII created previously by a directed evolution strategy [14]. Due to the random nature of the experimental approach, the M.HaeIII permutants were not unit-length molecules: they either lacked a few amino acids or contained shorter or longer redundant peptides. Based on the position of their N-termini, the enzymatically active cpM.HaeIII variants fell into three groups. Members of the first group started either between conserved motifs II and III, or in motif III, or between motifs III and IV [14]; thus they more or less corresponded to our Type B or Type C permutants. The N-termini of the second group were in the variable region between motif VIII and the TRD, hence these permutants can be classified as Type H. The permutation sites of the third group were in a more distal part of the variable region, within or very close to the TRD [14] (S14 Fig). The permutation topology of this last group does not match any of the ten types of active cpMTases constructed by us; instead it appears to correspond to cp308M.SssI, which was inactive (Figs 6 and S14). Although the two studies were done with different C5-MTases (M.HaeIII vs. M.MpeI/M.SssI), a comparison of the results (S14 Fig) shows that a much wider range of permutation topologies is compatible with C5-MTase activity (Fig 7) than was detected in the previous study [14].
This work was motivated by the wish to create complementing fragments of M.SssI and/or M.MpeI, which are more soluble than those obtained from the wild-type enzymes. We did find a fragment pair derived from a circularly permuted M.MpeI variant (Mpe  and Mpe[62-279]), which had the capacity of complementation, and were more soluble than any other complementing fragment pairs we have tested (S9 Fig). Although it will require further work to determine whether the Mpe  and Mpe[62-279] fragments can be purified in good yield, these results already show that circular permutation can be a useful approach in engineering C5-MTases.
The search of the REBASE database identified 22 new C5-MTases with circularly permuted amino acid sequence. Although most of these are putative enzymes, i.e. we do not know whether they are active, the relatively large number of circularly permuted C5-MTase-like sequences (biochemically characterized and putative enzymes) suggests that circular permutation occurred multiple times during evolution of this enzyme family. A mechanism involving gene duplication and subsequent bidirectional truncation was proposed to account for the evolution of circularly permuted DNA methyltransferases [52]. The permutation-by-duplication model later received experimental support [14]. The close distance of the N- and C-termini in the available structural models of C5-MTases [9,10,23,30] is consistent with circular permutation as a possible mechanism in C5-MTase evolution. It will be interesting to test whether the ability to tolerate circular permutation without major loss of activity is a general phenomenon of C5-MTases.
In all natural circularly permuted C5-MTases the permutation site falls in the variable region (S5 Dataset). In this work we showed for two C5-MTases that the activity of Type B permutants was comparable to that of the wild-type enzymes. Perhaps surprisingly, the Type B permutation pattern does not seem to occur in natural C5-MTases. It is possible that the Type B CP arrangement results in decreased SAM binding affinity, which could be a disadvantage in vivo at physiological SAM concentrations, and would explain the apparent lack of natural Type B circularly permuted C5-MTases. Under the in vitro assay conditions used in this work (crude extract, excess SAM) a moderate decrease in SAM binding affinity would not be detectable. To address this question we will determine the steady-state kinetic constants of purified wild-type M.MpeI and cp62M.MpeI.
The uniformity of the permutation patterns in natural circularly permuted C5-MTases suggests that evolutionary pathways leaving the two domains intact are favored by Nature.