Efficient method for site-directed mutagenesis in large plasmids without subcloning

Commonly used methods for site-directed DNA mutagenesis require copying the entire target plasmid. These methods allow relatively easy modification of DNA sequences in small plasmids but become less efficient and faithful for large plasmids, necessitating full sequence verification. Introduction of mutations in larger plasmids requires subcloning, a slow and labor-intensive process, especially for multiple mutations. We have developed an efficient DNA mutagenesis technique, UnRestricted Mutagenesis and Cloning (URMAC) that replaces subcloning steps with quick biochemical reactions. URMAC does not suffer from plasmid size constraints and allows simultaneous introduction of multiple mutations. URMAC involves manipulation of only the mutagenesis target site(s), not the entire plasmid being mutagenized, therefore only partial sequence verification is required. Basic URMAC requires two PCR reactions, each followed by a ligation reaction to circularize the product, with an optional third enrichment PCR step followed by a traditional cloning step that requires two restriction sites. Here, we demonstrate URMAC’s speed, accuracy, and efficiency through several examples, creating insertions, deletions or substitutions in plasmids ranging from 2.6 kb to 17 kb without subcloning.

Introduction
A number of DNA modification techniques involve rapid and efficient site-directed DNA mutagenesis (SDM) methods developed in the 1990s, soon after the invention of the polymerase chain reaction (PCR) [1]. Most SDM techniques make use of one version or another of inverse PCR mutagenesis. Inverse PCR was developed by Hemsley et al. [2] and later improved by the use of a proofreading DNA polymerase, such as Vent [3], and an enzymatic step to remove background template [4]. Another fast mutagenesis method developed by Papworth et al. [5] uses a primer extension design to copy the plasmid, generating staggered nicks that are repaired by bacteria after transformation. These and other SDM methods have been commercialized in DNA mutagenesis kits such as ExCite and QuikChange (Stratagene, La Jolla, CA), and Phusion Site-Directed Mutagenesis (New England BioLabs, Ipswich, MA). Because these techniques rely on copying the entire DNA plasmid with primers containing the desired mutation, they generally work best for plasmids under 8 kb in size [6]. This approach has inherent limitations, including the difficulty of copying large plasmids, an increased chance of encountering regions of very high or very low GC content that slow or stop PCR reactions, and the introduction of unwanted mutations in the plasmid due to polymerase errors. Introducing a mutation in plasmids larger than 8 kb usually requires subcloning a section of the plasmid containing the site of mutagenesis into a smaller cloning vector to make the SDM possible. Subcloning is an inherently slow process involving restriction enzyme digestion, ligation, transformation, colony formation and selection, DNA isolation, sequence verification, and excision of the mutated DNA sequence from the subclone and its insertion into the original plasmid. All but the final insertion step are avoided in URMAC.
URMAC employs a minimalistic approach in which PCR reactions are performed on the smallest possible portion of a large plasmid that contains the mutagenic target site flanked by unique restriction sites. This approach significantly improves the rate of PCR success and the quality of the product. In this study, we provide several examples of the applicability of URMAC for deletion, insertion, or substitution of DNA sequences in plasmids ranging in size from 2.6 kb to 17 kb.

Results
General description of the URMAC method
URMAC relies on the simple ability of DNA ligation to turn a linear PCR product generated from a plasmid template into a circular DNA. This circle can be opened at a second site by amplification with primers containing the desired mutation(s), circularized again by ligation, and amplified with the original primers to reproduce the original DNA containing the desired mutation(s). The same reaction steps can be used to insert, delete, or substitute any number of DNA nucleotides. Two sets of primers are used in URMAC, the Starter Primers and the Opener (Mutagenic) Primers. These primers are 5′-phosphorylated so that they can participate in the subsequent ligation step. The Starter Primers are first used to amplify the Modification Target sequence and again for the final enrichment PCR step. The Opener Primers are used to introduce the mutation of interest. The steps involved in URMAC are illustrated in Fig 1 and described below.
PCR #1. The Modification Target, a small portion of the plasmid containing the sequence to be modified, is selected for amplification by PCR. The positions of the Starter Primers (SP1 and SP2) are chosen such that the amplified Starter DNA includes the closest unique restriction sites (X and Y) in the plasmid that flank the site to be modified and are at least 150 bp apart. These restriction sites will be used in the final step for insertion of the Modified DNA into the parental plasmid.
Ligation #1. The Starter DNA is circularized by self-ligation to generate the Closed Starter DNA.
PCR #2. The Closed Starter DNA is opened at the site of the desired mutation by inverse PCR using the Opener (Mutagenic) Primers (OP1 and OP2) facing opposite directions from the opening site to generate the linear Intermediate DNA containing the mutation(s) at its termini. Sequences can be: deleted at the site of opening by moving the primers apart; inserted by adding nucleotides to the 5′ terminus of one or both of the primers; or mutated by changing one or more nucleotides in one or both primers. Any combination of deletion, insertion or substitution can be designed into the Opener Primers.
Ligation #2. The Intermediate DNA is circularized by self-ligation to generate the Closed Intermediate DNA. The desired mutation is now in place and the fragment can be excised and ligated into the original plasmid.
Optional enrichment PCR Step. The Closed Intermediate DNA can be amplified by the same pair of Starter Primers that were used to amplify the Starter DNA in the first step (SP1 and SP2) to generate the Linear Modified DNA. This PCR step increases the number of DNA molecules with desired modifications for ligation into the original plasmid.
Restriction digestion and insertion. Both the Linear Modified DNA and the original plasmid are digested with the unique restriction enzymes identified in the first step (X and Y in Fig 1) and purified. The digested fragment carrying the modified sequence is ligated into the parental plasmid to produce the final product containing the desired mutation(s). By the end of the final PCR enrichment step, the amount of Original DNA carried over from the first PCR reaction is insignificant because of the successive dilutions. Optionally, the addition of 1 unit of DpnI restriction enzyme to the first ligation step completely removes any potential carryover of the full Original DNA plasmid during the URMAC procedure. The success rate of obtaining the correct clones after the final cloning step is normally over 95%.
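The sequence of steps above can be followed on paper with a toy string model. The sketch below is only an illustration under stated assumptions: sequences are plain top-strand strings, the 6-nt "primers" and the 42-nt "plasmid" are hypothetical, Opener Primer tails are written as top-strand sequence, and the function names (`amplify`, `circularize`, `open_circle`) are invented for this example rather than taken from the method.

```python
# Toy model of the URMAC steps on plain strings (top strand only).
# Every sequence and name below is a hypothetical illustration.

def amplify(circle: str, sp1: str, sp2_top: str) -> str:
    """PCR on a circular template: return the linear product that starts at
    primer sp1 and ends at the top-strand footprint of primer sp2."""
    doubled = circle + circle          # lets the read run across the junction
    start = doubled.index(sp1)
    end = doubled.index(sp2_top, start) + len(sp2_top)
    return doubled[start:end]

def circularize(linear: str) -> str:
    """Self-ligation: a circle is stored as a string whose end joins its start."""
    return linear

def open_circle(circle: str, left: int, right: int,
                tail_left: str = "", tail_right: str = "") -> str:
    """Inverse PCR with Opener Primers facing away from the opening site.
    Bases circle[left:right] are skipped (deleted); the 5' tails of the two
    primers end up at the termini of the linear product."""
    return tail_right + circle[right:] + circle[:left] + tail_left

NDEI, MLUI = "CATATG", "ACGCGT"        # recognition sequences
SP1, SP2_TOP = "GGATCC", "GAATTC"      # hypothetical Starter Primer footprints
plasmid = "CTCTCTCT" + "GGATCCAAGT" + NDEI + "TCAAGAATTC" + "AGAGAGAG"

starter = amplify(plasmid, SP1, SP2_TOP)       # PCR #1
closed_starter = circularize(starter)          # Ligation #1
site = closed_starter.index(NDEI)

# (left, right, 5' tail) for each Opener Primer design
designs = {
    "insertion": (site + 6, site + 6, MLUI),   # add MluI next to NdeI
    "substitution": (site, site + 6, MLUI),    # replace NdeI with MluI
    "deletion": (site, site + 6, ""),          # remove NdeI
}

modified = {}
for name, (left, right, tail) in designs.items():
    intermediate = open_circle(closed_starter, left, right, tail_left=tail)  # PCR #2
    closed_intermediate = circularize(intermediate)                          # Ligation #2
    modified[name] = amplify(closed_intermediate, SP1, SP2_TOP)              # enrichment PCR
    print(name, modified[name])
```

In this model the enrichment PCR restores the original fragment ends, so each Linear Modified DNA differs from the Starter DNA only at the opening site.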

Validation of the method: Introducing insertion, deletion and substitution mutations
For validation of the URMAC method, we used pUC18, a widely available plasmid, as a target to test URMAC by either removing or adding a restriction site. We performed all three different types of DNA mutagenesis using the same Closed Starter DNA from the first PCR and ligation reactions as a template (Fig 2A).
We inserted the recognition sequence of the MluI restriction enzyme next to that of the native NdeI, substituted the NdeI recognition sequence with that of MluI (that is, simultaneous deletion of the NdeI and insertion of the MluI recognition sequences), or deleted the NdeI recognition sequence, all by altering the 5′ ends of the OP1 and OP2 primers or positioning them appropriately (Fig 2B and Table 1).
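The three designs can be told apart by which enzyme cuts the product. The following sketch predicts fragment lengths for a linear DNA digested with NdeI (CA^TATG) or MluI (A^CGCGT). The four short sequences are hypothetical stand-ins, not the pUC18 fragments, and only the top strand is scanned, which is sufficient here because both sites are palindromic.

```python
# Predict fragment sizes of a linear DNA after a single-enzyme digest.
# The test sequences are hypothetical; only the recognition sites are real.

ENZYMES = {
    "NdeI": ("CATATG", 2),  # cuts after CA
    "MluI": ("ACGCGT", 1),  # cuts after A
}

def digest(seq: str, enzyme: str) -> list:
    """Return top-strand fragment lengths, in order, for a linear molecule."""
    site, offset = ENZYMES[enzyme]
    cuts = []
    pos = seq.find(site)
    while pos != -1:
        cuts.append(pos + offset)
        pos = seq.find(site, pos + 1)
    bounds = [0] + cuts + [len(seq)]
    return [b - a for a, b in zip(bounds, bounds[1:])]

products = {
    "original": "GGATCCAAGTCATATGTCAAGAATTC",
    "insertion": "GGATCCAAGTCATATGACGCGTTCAAGAATTC",
    "substitution": "GGATCCAAGTACGCGTTCAAGAATTC",
    "deletion": "GGATCCAAGTTCAAGAATTC",
}

for name, seq in products.items():
    print(name, {enz: digest(seq, enz) for enz in ENZYMES})
```

A single-element list means the enzyme did not cut. In this toy case the original is cut only by NdeI, the insertion by both enzymes, the substitution only by MluI, and the deletion by neither.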
Since the Starter DNA was only 532 bp long (Fig 2A), roughly 20% of the full plasmid size, it was amplified in 60 minutes, more quickly than the whole plasmid would have been. The mutagenesis steps followed the basic URMAC steps illustrated in Fig 1. Successful mutagenesis was verified by restriction analysis (Fig 2C). The PCR-enriched Linear Modified DNA from each mutagenic reaction was digested with NdeI, which cuts once, [...]

Fig 1 (caption, partial). [...] its 5′ terminus. The Intermediate DNA is circularized with T4 DNA ligase. The SP1 and SP2 primers are used in the enrichment PCR step to amplify the Linear Modified DNA. The Linear Modified DNA and the original plasmid are digested with the restriction enzymes that cleave at the unique restriction sites, X and Y, and the appropriate fragments are ligated to produce the Modified Original DNA.
https://doi.org/10.1371/journal.pone.0177788.g001

URMAC mutagenesis in plasmids with complex GC regions
PCR reactions on GC-rich templates are problematic. Additives such as DMSO, glycerol, formamide, PEG and other organic compounds [7][8][9][10] can help to overcome some of the problems associated with PCR on GC-rich templates. However, GC complexity remains a problem that can be avoided by limiting the mutagenesis to smaller regions of the target DNA rather than amplifying the full plasmids. In this experiment, we aimed to introduce substitution mutations in two expression plasmids, pCG-H (6,669 bp) and pCG-F (6,411 bp), having GC-rich regions that had failed in previous inverse PCR mutagenesis attempts in our laboratory. At 40-nucleotide resolution scanning (see Methods), the plasmids contained a 79.5% GC region in pCG-H and an 85.4% GC region in pCG-F. By amplifying and manipulating only the region of interest by URMAC, we avoided the need to amplify these GC-rich regions, which would have been necessary if we had used inverse PCR mutagenesis.
The pCG-H and pCG-F plasmids contain the attachment (H) and fusion (F) glycoproteins of the Edmonston B strain of measles virus (MV) [11]. Together, these two viral glycoproteins enable the MV virion envelope to attach to and fuse with the target cell membrane to initiate infection. When the H and F proteins are co-expressed on the cell surface during infection or following transfection, they interact with cellular receptors on neighboring cells, causing cell-cell fusion and the formation of syncytia. Based on this fusion phenomenon, we tested the functional ability of mutant H and/or F with a modified integrin-binding Leu-Asp-Val (LDV) motif to induce syncytia in two cell lines, Vero and BHK-21. In Vero cells, our positive control, the MV glycoproteins can use the cell surface CD46 molecule as a receptor to initiate fusion independent of interaction with the LDV motif. BHK-21 hamster cells, however, do not express CD46, but they do express the integrins that interact with the LDV motifs on the viral H and F proteins. To determine whether or not the LDV motif in either the H or F glycoprotein is critical for fusion, we used URMAC to replace the central Asp (D) with the similar amino acid, Glu (E), to create an H protein variant, D79E, and an F protein variant, D461E.
To perform this mutagenesis, we designed a pair of Starter Primers (Table 1) to amplify the region encoding the LDV motif in the F gene in pCG-F, and another pair for the H gene in pCG-H, including appropriate native restriction sites in the amplicons. The substitution reaction steps were performed as described above for pUC18. After generating the final plasmids that carry the desired modifications, the DNA of 5 clones for each mutant was sequenced at the mutation sites. All 10 sequenced clones contained the correct mutations. The biological significance of the mutations was investigated by performing a fusion assay (data not shown).
In addition to the D-to-E mutations in pCG-H and pCG-F, we generated several other mutations in the LDV motif of both the F and H proteins using a single common Opener Primer in conjunction with a series of mutagenic Opener Primers on the same Closed Starter DNA template. Fig 3B illustrates this approach, which generated four different mutations, L78A, D79A, D79E, and V80A, using a single common Opener Primer (OP2) paired with a series of mutagenic Opener Primers (OP1). This approach easily enables any number of mutations to be built into a motif or region by changing the sequence of only one of the primers used in this step. Introducing mutations in pCG-H and pCG-F took one day for the URMAC biochemical reactions and an additional 3 days to clone the Linear Modified DNA into the original plasmids. URMAC did not require extensive optimization since the PCR was used to copy less than 10% of the plasmid in this case, avoiding regions of high GC content. We conclude from this experiment that URMAC can be easily used to introduce mutations in 6-7 kb plasmids containing regions of widely disparate GC content without resorting to conventional subcloning.

Table 1 (footnote). The ">" sign = forward and the "<" sign = reverse primer orientation. Underline: native NdeI recognition sequence. Bold, italics: inserted MluI recognition sequence. Bold and underlined G in OP1-MD reflects a substitution of A to G.
https://doi.org/10.1371/journal.pone.0177788.t001

Introduction of a point mutation into the 11 kb dystrophin cDNA by URMAC
Large plasmids present a challenge for PCR-based mutagenesis methods that require amplifying the full sequence while preventing the introduction of unwanted mutations caused by PCR infidelity. A fragment of the cDNA can be subcloned into a smaller plasmid, mutagenized and the fragment returned to the original plasmid, but this process is time consuming.
As an example, the cDNA for the muscle dystrophin open reading frame is approximately 11 kb [12]. Mutations in dystrophin are associated with a spectrum of clinical phenotypes, grouped as "dystrophinopathies", including Duchenne muscular dystrophy (DMD), Becker muscular dystrophy (BMD), and X-linked dilated cardiomyopathy (XLDCM) [13,14]. DMD and BMD are characterized by progressive skeletal muscle degeneration and development of cardiac disease leading to premature death, with DMD having an earlier childhood onset and a more severe disease progression than BMD. By contrast, XLDCM patients typically only develop severe cardiac disease and, as a result, are treated with a cardiac transplant. To study the differential effects of mutations on the function of dystrophin in striated muscles, it is critical to ensure that only the desired mutation is introduced during the mutagenesis process. Given the 11 kb size of the cDNA alone, introduction of mutations in dystrophin would require subcloning.
To determine whether URMAC could be a viable and fast substitute for subcloning in such a large plasmid, we sought to introduce a point mutation in the full length dystrophin cDNA. We chose the A1043G missense mutation in exon 9 that results in a Threonine to Alanine substitution at amino acid 279. This substitution is found in one family with very early onset and severe XLDCM [15,16].
We started with a commercially available 13.8 kb Gateway entry vector plasmid containing the full length human muscle dystrophin cDNA (11,061 bp). As shown in Fig 4a, the URMAC approach was used on an isolated 1,629 bp region containing exon 9 and two flanking unique restriction enzyme sites (NsiI and SphI).
This allowed us to modify this short fragment without copying the rest of the cDNA or plasmid sequences. The region to be altered was amplified using the Starter Primers (Table 1) to produce the Starter DNA. The Starter DNA was ligated and amplified with the mutagenic Opener Primers (Table 1) containing the A to G mutant sequence to generate the Intermediate DNA. Following circularization, the Starter Primer pair was used to generate the Linear Modified DNA. This entire procedure was completed in 6 hours. The Linear Modified DNA was digested with NsiI and SphI and inserted into the original plasmid. Sequencing of the cDNA region that was isolated for URMAC confirmed the A1043G point mutation (Fig 4B), and sequencing of the entire cDNA confirmed that this was the only mutation.
Despite the size of the human dystrophin cDNA, introducing a point mutation by URMAC was simple and fast, requiring minimal effort and materials. The entire process from beginning to full sequencing of the final mutagenized plasmid was completed in less than one week.

Deletion of open reading frames from a large plasmid containing multiple genes
In this example, we used URMAC to delete two open reading frames (ORF) from a 17 kb plasmid containing the human respiratory syncytial virus (RSV) genome from which its three glycoprotein genes had already been deleted and two foreign marker genes inserted by conventional subcloning methods. Transcription of this plasmid in mammalian cells, along with the expression of the 4 viral proteins involved in genome replication and mRNA transcription, initiates continuous intracellular replication of this RNA virus replicon [17,18]. The aim was to determine whether two of the RSV genes, M2-1 and M2-2, are required for RSV replication and survival.
At 17 kb, this plasmid was too large to be modified by current methods without subcloning. We used the URMAC method to individually delete the M2-1, M2-2 or both ORFs from the replicon cDNA by the scheme shown in Fig 5a. The Starter Primers were designed to flank the M2 gene and the unique XhoI and AarI sites, and used to produce a 2.7 kb Starter DNA. Following ligation, the Closed Starter DNA was subjected to the second round of amplification using various pairings of the 4 Opener Primers to generate Linear Intermediate DNAs with the deletions.
Following ligation, each of the three Closed Intermediate DNAs was enriched by a third PCR using the Starter Primers. The three resulting Linear Modified DNAs were digested with XhoI and AarI and inserted into the parental plasmid to generate three RSV replicon plasmids with: M2-1 ORF deleted (ΔM2-1), M2-2 ORF deleted (ΔM2-2), and both ORFs deleted (ΔM2).
Deletions were confirmed by the size of the PCR products amplified by the Starter Primers (Fig 5b) and by sequencing the DNA region between the XhoI and AarI restriction sites. URMAC enabled the rapid deletion of M2-1 and M2-2 from this large plasmid, allowing a quick test of the importance of the M2 genes for RSV genome replication and survival.

Discussion
The URMAC technique provides several advantages over other existing technologies for DNA mutagenesis. In URMAC, mutagenesis is minimalistic: only the smallest region targeted for mutagenesis in a given DNA sequence/plasmid is subjected to molecular manipulation. For example, in our MV H and F gene mutagenesis experiments, URMAC was applied to only 287 bp and 512 bp of the pCG-H and pCG-F plasmids instead of the entire 6,669 bp and 6,411 bp, leaving more than 90% of each plasmid untouched by the DNA polymerase. This lends several benefits to the URMAC technology: 1) PCR reactions have a higher success rate with small fragments than with full-length plasmids; 2) URMAC has a lower chance of introducing polymerase errors than the primer extension SDM method (QuikChange) and inverse PCR because the fragment being amplified is much smaller; 3) regions of any plasmid that are not a direct target for mutagenesis remain untouched by DNA polymerase and therefore do not require sequence verification, a critical consideration when dealing with plasmids containing a very large gene such as the dystrophin gene, or multiple genes such as the RSV replicon; 4) URMAC is very fast compared to conventional SDM requiring subcloning, with an average of a single day to complete URMAC and an additional 3 days to clone the final product into the original plasmid, compared to at least 3-4 weeks required for subcloning; 5) URMAC can dramatically reduce the challenge of high GC-containing plasmids by avoiding PCR amplification of those parts of a plasmid; and 6) URMAC requires less labor and materials and therefore costs less than subcloning.
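The "minimalistic" claim can be checked with simple arithmetic on the sizes reported in this study. The short sketch below computes the share of each plasmid that is actually copied by the polymerase. The pUC18 length of 2,686 bp is an assumption taken from the commonly cited size of that vector, and the dystrophin and RSV values use the rounded sizes given in the text (13.8 kb, 17 kb, 2.7 kb).

```python
# Share of each plasmid that passes through PCR in URMAC.
# (amplified bp, plasmid bp); the pUC18 total is an assumed value.
examples = {
    "pUC18": (532, 2686),
    "pCG-H": (287, 6669),
    "pCG-F": (512, 6411),
    "dystrophin entry vector": (1629, 13800),
    "RSV replicon": (2700, 17000),
}

fraction = {name: amplified / total for name, (amplified, total) in examples.items()}

for name, frac in fraction.items():
    print(f"{name}: {100 * frac:.1f}% amplified, {100 * (1 - frac):.1f}% untouched")
```

Under these figures every example copies less than one fifth of the plasmid, and the two measles virus plasmids less than one tenth.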
Furthermore, URMAC is versatile for handling any combination of insertions, deletions or substitutions. While QuikChange is fast in creating a single or double nucleotide mutation in a small plasmid, larger or multiple changes are more difficult. URMAC does not suffer from this limitation since the mutation is inserted at the 5′ end of a short primer rather than in the middle of longer mutagenic primers. In this way, URMAC is similar to the inverse PCR techniques, yet it does not suffer from the size limitation of inverse PCR. Like URMAC, another method, called splice overlap mutagenesis (SOM) [19], requires the availability of restriction sites, but SOM primers must be complementary to the target DNA in both orientations at the joining ends of the two PCR products, limiting control over the primer design and their effectiveness. Since URMAC does not require the joining of PCR products, URMAC does not suffer from this limitation. When individual mutations are required in adjacent regions of the plasmid to generate multiple separate mutants, the Closed Starter DNA can serve as a template for all of the mutations by designing different Opener Primers. The Closed Intermediate DNA can be recycled as Closed Starter DNA, once for each mutation.
Although URMAC mutagenesis is not affected by the size of the original plasmid because the actual mutagenesis is performed on only a small region, inserting the final PCR product into the original plasmid does depend on the availability of unique restriction sites. In plasmids of 30 kb or larger, the availability of such restriction sites becomes limited. In this case, an alternative recombineering strategy [20] could be incorporated into the URMAC technique to facilitate the insertion of the mutated sequences back into very large plasmids irrespective of the availability of restriction sites.
In summary, compared to conventional site-directed mutagenesis methods, including those that require subcloning, the URMAC technology is versatile, simple, efficient and cost-effective. It is particularly useful for large plasmids but also works well for small plasmids.

Mutations in pUC18 accession number L09136
All PCRs were carried out using 100 pg pUC18 as a template and 75 pmol of Starter Primers 1 and 2 (Table 1) in a total volume of 25 μl containing 1 unit Pfu DNA polymerase (Agilent Technologies, Santa Clara, CA) and the appropriate amplification buffer unless otherwise stated. PCRs were performed using the following thermocycling conditions: denaturation at 94˚C for 2 min followed by 25 cycles of 94˚C, 20 sec; 60˚C, 30 sec; 68˚C, 1 min, and a final extension at 68˚C for 5 min. DNA electrophoresis on a 1% agarose gel confirmed that the Starter DNA had been produced (Fig 2A). 2.5 μl of the Starter DNA, estimated at 50-100 ng, was circularized by self-ligation using 1 unit of T4 DNA ligase in a total reaction mix of 20 μl for 10 min at 20˚C. 1 μl of a 1:200 dilution of the Closed Starter DNA product in deionized H2O was used as a template for the second round of PCR, this time with various pairs of Opener Primers (Table 1). 2.5 μl of these products were self-ligated with T4 DNA ligase as above, and 1 μl of a 1:200 dilution of the Closed Intermediate DNA product was used as a template for the final PCR reaction to create the Linear Modified DNA. Successful mutagenesis was confirmed by restriction analysis using NdeI and MluI restriction enzymes (New England Biolabs, Ipswich, MA): 2 μl of Linear Modified DNA for each mutation type was incubated with or without 5 units of restriction enzyme for 30 min at 37˚C. 5 μl of each reaction was resolved by electrophoresis on a 1% agarose gel. For cloning the modified DNA into the parental DNA, a standard protocol was used. Briefly, both the fragment and the plasmid were digested separately with 5 units of PfoI and EcoRI. The appropriate fragments were gel-purified using the QIAquick Gel Extraction Kit (Qiagen, Hilden, Germany) and eluted with 30 μl TE buffer. DNA concentration was measured with a Nanodrop™ 2000C instrument (Thermofisher, Wilmington, DE).
15 fmol of plasmid backbone was mixed with 45 fmol of insert and incubated with T4 DNA ligase and ligation buffer in a total volume of 20 μl for 1 hr at 16˚C. The reaction was terminated by incubation at 65˚C for 10 minutes and then diluted 5-fold in pure H2O. 5 μl of the diluted reaction was used to transform 50 μl of chemically competent DB3.1 E. coli cells (Invitrogen). For sequence screening of final clones, the same Starter Primers were used for sequencing the full 500 nucleotide span of the modification target.
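Because the ligation is set up in molar amounts (15 fmol backbone, 45 fmol insert, a 1:3 ratio), the corresponding masses depend on fragment length. The sketch below converts femtomoles to nanograms using the common approximation of about 650 g/mol per base pair of double-stranded DNA. The fragment lengths (2,200 bp and 500 bp) are hypothetical placeholders, since the exact sizes of the digested fragments are not stated here.

```python
# Convert molar amounts of double-stranded DNA to mass.
# Assumption: average mass of ~650 g/mol per base pair.
AVG_BP_MASS = 650.0

def fmol_to_ng(fmol: float, length_bp: int) -> float:
    """Mass in ng of `fmol` femtomoles of a dsDNA fragment of length_bp."""
    grams = fmol * 1e-15 * length_bp * AVG_BP_MASS
    return grams * 1e9

# Hypothetical fragment lengths, for illustration only.
backbone_ng = fmol_to_ng(15, 2200)
insert_ng = fmol_to_ng(45, 500)

print(f"backbone: {backbone_ng:.2f} ng, insert: {insert_ng:.2f} ng")
```

With these placeholder lengths the 1:3 molar ratio corresponds to less insert than backbone by mass, which is why the amounts are specified in moles rather than nanograms.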

Introducing substitution mutations into the H and F glycoproteins of MV
Starter Primers (Table 1) were used to amplify the DNA sequences that include the LDV motifs (5′-CTA GAT GTA-3′ and 5′-TTG GAC GTA-3′) in the pCG-H and pCG-F [11] plasmids containing the H and F glycoprotein genes, respectively, and the flanking, unique restriction sites (NheI and BspEI for the H gene and KpnI and XcmI for the F gene). The Starter DNA for H was 287 bp and for F was 512 bp. The pCG-H and pCG-F plasmid sizes were 6,669 bp and 6,411 bp, respectively. PCR amplifications and ligations were performed as described for the pUC18 mutagenesis above.
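As a quick check of the motif sequences, the sketch below translates the two 9-nt stretches with the standard genetic code and lists the single-base changes of each Asp codon that would encode Glu. Which Glu codon was actually built into the Opener Primers is not stated here, so the listed codons are candidates only. The helper names are invented for this example.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = dict(zip((a + b + c for a in BASES for b in BASES for c in BASES), AMINO))

def translate(dna: str) -> str:
    """Translate an in-frame DNA string (length divisible by 3)."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))

def single_base_routes(codon: str, target_aa: str) -> list:
    """Codons one substitution away from `codon` that encode target_aa."""
    routes = set()
    for pos in range(3):
        for base in BASES:
            if base != codon[pos]:
                mutant = codon[:pos] + base + codon[pos + 1:]
                if CODON_TABLE[mutant] == target_aa:
                    routes.add(mutant)
    return sorted(routes)

H_MOTIF = "CTAGATGTA"  # from pCG-H, as given in the text
F_MOTIF = "TTGGACGTA"  # from pCG-F, as given in the text

for label, motif in (("H", H_MOTIF), ("F", F_MOTIF)):
    asp_codon = motif[3:6]
    print(label, translate(motif), asp_codon, "->", single_base_routes(asp_codon, "E"))
```

Both motifs translate to LDV, and in each case a change at the third codon position alone is enough to convert Asp to Glu.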
GC region scanning was performed using the BioAnnotator module of Vector NTI version 11 (Invitrogen, Carlsbad, CA). Briefly, the full DNA sequences of the pCG-H or pCG-F plasmids were loaded into the BioAnnotator program, then subjected to GC% analysis under the Analyze Selected Molecule tab. The GC% scanning window was left at the default of 40 nucleotides under the Select Window tab. The GC content (%) of each plasmid was noted at the highest peak of the histogram.
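For readers without access to that software, a comparable scan can be approximated with a plain sliding window. The sketch below is not the BioAnnotator algorithm: it treats the sequence as linear, uses an unweighted 40-nt window, and is run on a made-up test sequence. A simple window of 40 nt can only return multiples of 2.5%, so values such as 79.5% or 85.4% indicate that the program averages differently, and this sketch is not expected to reproduce them exactly.

```python
# Sliding-window GC scan: report the GC-richest window in a sequence.
# A simplified stand-in for a GC% histogram; not the Vector NTI algorithm.

def max_gc_window(seq: str, window: int = 40):
    """Return (highest GC percentage, 0-based start of that window)."""
    seq = seq.upper()
    if len(seq) < window:
        raise ValueError("sequence is shorter than the window")
    gc = sum(base in "GC" for base in seq[:window])
    best, best_start = gc, 0
    for start in range(1, len(seq) - window + 1):
        # Slide by one base: add the entering base, drop the leaving base.
        gc += (seq[start + window - 1] in "GC") - (seq[start - 1] in "GC")
        if gc > best:
            best, best_start = gc, start
    return 100.0 * best / window, best_start

# Hypothetical test sequence: AT-rich flanks around a 40-nt GC block.
toy = "AT" * 30 + "GC" * 20 + "AT" * 30
peak_percent, peak_start = max_gc_window(toy)
print(peak_percent, peak_start)
```

For a circular plasmid, appending the first 39 bases to the end of the sequence before scanning would also cover windows that span the origin.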

Mutagenesis of muscular dystrophy gene
A modified Gateway entry vector (pENTR223.1, Clone ID: 40080544) containing the full-length human muscle dystrophin cDNA (11.061 kb) was obtained from the ORFeome Collaboration-OCAB (http://www.orfeomecollaboration.org/html/index.shtml). The sequence is deposited in NCBI under accession number BC111587. The Pfu DNA Turbo polymerase (Stratagene) was used for all PCRs. Thermocycling conditions for the Starter Primers (Table 1) for the first and third PCR were as follows: denaturation at 94˚C for 2 min, followed by 25 cycles of 94˚C, 30 sec; 57.2˚C, 30 sec; and 68˚C, 3 min, followed by a final extension at 68˚C for 10 min. Thermocycling conditions for the Opener Primers (Table 1) in the second PCR were as follows: denaturation at 94˚C for 2 min, followed by 25 cycles of 94˚C, 30 sec; 53.1˚C, 30 sec; and 68˚C, 2 min, followed by a final extension at 68˚C for 10 min. PCR products (Starter and Intermediate DNA) were re-circularized by ligation with T4 DNA ligase for 15 min at room temperature followed by an optional enzyme inactivation at 65˚C for 10 min.
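The programmed hold times implied by these cycling conditions can be added up to see how they relate to the run times quoted in the Results (about 60 minutes for the pUC18 Starter DNA and about 6 hours for the dystrophin procedure). The sketch below counts hold times only. Ramp times between temperatures, ligations, and handling are not included, so real runs take somewhat longer than these totals.

```python
# Sum the programmed hold times of a three-step PCR program.
# Ramp times between temperatures are ignored (an underestimate).

def program_minutes(initial_s: int, cycles: int, step_seconds, final_s: int) -> float:
    """Total hold time in minutes: initial denaturation + cycles + final extension."""
    return (initial_s + cycles * sum(step_seconds) + final_s) / 60

# pUC18 program: 2 min; 25 x (20 s, 30 s, 1 min); 5 min
puc18 = program_minutes(120, 25, (20, 30, 60), 300)

# Dystrophin programs: Starter Primers (PCR 1 and 3) and Opener Primers (PCR 2)
dys_starter = program_minutes(120, 25, (30, 30, 180), 600)
dys_opener = program_minutes(120, 25, (30, 30, 120), 600)
dys_total_h = (2 * dys_starter + dys_opener) / 60

print(f"pUC18 PCR: {puc18:.1f} min")
print(f"dystrophin: {dys_starter:.0f} + {dys_opener:.0f} + {dys_starter:.0f} min "
      f"= {dys_total_h:.1f} h of cycling")
```

Adding the two 15-minute ligations to the dystrophin total brings the estimate close to the 6 hours reported for the full set of biochemical reactions.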
The final product (Linear Modified DNA) and the parent plasmid were digested with NsiI and SphI (New England Biolabs) and then ligated overnight at 14˚C.