Abstract
Recombinase enzymes are extremely efficient at integrating very large DNA fragments into target genomes. However, intrinsic sequence specificities restrict their use to DNA sequences with sufficient homology to endogenous target motifs. Extensive engineering is therefore required to broaden applicability and robustness. Here, we describe the directed evolution of novel lambda integrase variants capable of editing exogenous target sequences identified in the diatom Phaeodactylum tricornutum and the alga Nannochloropsis oceanica. These microorganisms hold great promise as conduits for green biomanufacturing and carbon sequestration. The evolved enzyme variants show >1000-fold switch in specificity towards the non-natural target sites when assayed in vitro. A single-copy target motif in the human genome with homology to the Nannochloropsis oceanica site can also be efficiently targeted using an engineered integrase, both in vitro and in human cells. The developed integrase variants represent useful additions to the DNA editing toolbox, with particular application for targeted genomic insertion of large DNA cargos.
Citation: Siau JW, Siddiqui AA, Lau SY, Kannan S, Peter S, Zeng Y, et al. (2024) Expanding the DNA editing toolbox: Novel lambda integrase variants targeting microalgal and human genome sequences. PLoS ONE 19(2): e0292479. https://doi.org/10.1371/journal.pone.0292479
Editor: Chandravanu Dash, Meharry Medical College, UNITED STATES
Received: September 20, 2023; Accepted: January 26, 2024; Published: February 13, 2024
Copyright: © 2024 Siau et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Data Availability: All relevant data are within the manuscript and its Supporting Information files.
Funding: This work was supported through a grant from the National Research Foundation-Competitive Research Programme, Singapore to PD (NRF-CRP21-2018-0002). The funder provided support in the form of salaries for authors, but did not have any additional role in the study design, data collection and analysis, decision to publish, or preparation of the manuscript. The specific roles of these authors are articulated in the ‘author contributions’ section. There was no additional external funding received for this study.
Competing interests: I have read the journal’s policy and the authors of this manuscript have the following competing interests: Peter Droge is a co-founder and shareholder of LambdaGen. This does not alter our adherence to PLOS ONE policies on sharing data and materials.
Introduction
Microalgae hold great promise as carbon-sequestering microbial cell factories for production of biofuels, nutraceuticals, pharmaceuticals, feed, and other biomolecules of commercial interest [1]. Conventional strain engineering techniques to increase yields of endogenous biomolecules in microalgae have met with limited success, often due to deleterious trade-offs between productivity and cell growth [2]. Genome engineering is therefore a viable alternative to circumvent these issues and facilitate production of novel biomolecules by incorporating genes encoding exogenous enzymatic pathways [1]. Targeted genome editing in microalgae via site-specific DNA cleavage has been shown utilizing clustered regularly interspaced short palindromic repeats (CRISPR/Cas9) [3], transcription activator-like effector nucleases (TALENs) [4], and zinc finger nucleases [5]. Error-prone repair of the introduced lesions results in corruption of native sequence context and perturbation of gene function. In the presence of exogenous donor DNA, homology-directed repair (HDR) can further result in targeted transgenesis [6, 7]. To fully leverage algae’s potential as robust microbial cell factories, additional editing tools are required, particularly those facilitating targeted integration of even larger DNA cargos encoding complex multi-component pathways.
Bacteriophage- and yeast-derived site-specific recombinase (SSR) enzymes have enabled transgenesis of large (up to 133 kb) exogenous genes into the genomes of a diverse range of organisms [8–11]. SSR-mediated gene insertion does not rely on potentially error-prone endogenous DNA repair pathways, minimizing deleterious off-target effects commonly seen with other editing approaches [12, 13]. However, the intrinsic site-specificities of SSRs limit their applicability to targeting either cognate sequence motifs pre-introduced into the genome of interest or highly homologous “pseudo” sites. SSRs have therefore been engineered extensively to program novel sequence specificities and broaden their utility [10, 14–20].
λ integrase (Int) facilitates bi-directional transfer of the bacteriophage λ genome into the 21 bp attB target site in bacteria via specific recombination with the 241 bp attP attachment site resident in its genome [21, 22]. Both integration and excision catalyzed by Int rely on bacterial DNA-bending factors (IHF, FIS and Xis) that bind to attP. Genomic editing in non-bacterial genomes is enabled by an engineered co-factor independent integrase variant (Int-h/218) [23–26]. Activity of Int-h/218 is also not compromised when used for in vivo recombination in E. coli [24].
Here, we describe engineering of λ integrase to target potential integration sites in the genomes of the diatom Phaeodactylum tricornutum and the alga Nannochloropsis oceanica, two microorganisms of significant interest for use as cell factories. The evolved integrase variants demonstrate a significant specificity switch towards these alternate sites, driven by mutation of residues both proximal and distal to bound DNA substrate. Using these engineered enzymes, we further demonstrate efficient targeted recombination into a single-copy sequence motif with homology to the Nannochloropsis oceanica integration site that is present in human cells.
Materials and methods
Library generation and selection protocol
The integrase library was generated from C3-INT-HA pET22b(+) with the GeneMorph II Random Mutagenesis kit using primers 8 and 9 (S1 Table) and then re-amplified using primers tem-INTinf-ndeF and tem-INFinf-ecoR. The library was ligated into pINT selection vectors comprising attPhae2 and attPhae2L sites after Nde1/EcoR1 restriction, followed by electroporation into TG1 cells and plating on selection plates containing increasing amounts of ampicillin (300–900 μg/ml). Plasmid DNA was isolated from colonies growing under the most stringent selection pressure (900 μg/ml ampicillin) and amplified with primers tem-INTinf-ndeF and tem-INFinf-ecoR. The library was ligated into the selection vector for the next round of selection under more stringent conditions (12–18 mg/ml ampicillin). C4 was selected from the first round with 900 μg/ml ampicillin while C5 was selected from the second round with 18 mg/ml ampicillin.
Recombination assay using in vitro translated proteins
Wild-type and variant integrase genes were amplified using INF-INT-nde1F and INF-INT-HA-ecoR1R (95°C 5 min, 95°C 5 s, 55°C 20 s, 72°C 1 min, 25 cycles), cut with Nde1 and EcoR1, and ligated into similarly cut pET22 vector. Constructs were then amplified with primers 8 and 9 (95°C 5 min, 95°C 5 s, 60°C 20 s, 72°C 1 min, 25 cycles). This created amplicons flanked with both T7 promoter and terminator sequences, which were used for in vitro transcription and translation (IVT). 20 ng of each integrase amplicon was used per reaction using the PURExpress® In Vitro Protein Synthesis Kit (NEB) in a total volume of 9 μl. Reactions were incubated at 30°C for 1 h. Intramolecular recombination was then assayed by adding 10 ng of the relevant plasmid substrate (pINTattB X attBL / pINTattNanno X attNannoL / pINTattHNanno X attHNannoL / pINTattPhae2 X attPhae2L) to a total volume of 10 μl. The mixture was incubated for 1.5 h at 30°C and 1 μl of the reaction was taken for PCR with primers 1 and 8 (95°C 5 min, 95°C 5 s, 55°C 20 s, 72°C 1 min, 30 cycles) or primers 3 and 4 (95°C 5 min, 95°C 5 s, 60°C 20 s, 72°C 30 s, 30 cycles). The relative positions of primers used are shown in Fig 2. Sequences are shown in S1 Table.
Recombination assay using purified proteins
Recombinant integrase (17 nM) was incubated with the relevant plasmid substrate (1.3 nM) (pINTattB X attBL, pINTattNanno X attNannoL, pINTattHNanno X attHNannoL sites) in a total volume of 20 μl in buffer comprising 150 mM NaCl and 10 mM Tris-HCl, pH 7.8, at 30°C for 1 h. The mixture was diluted 1/10 using water and 1 μl was used for real-time PCR quantification with 100 nM of primers 5 and 6. Ct values were converted to copy number of recombined product based on standard curve calculation. End-point PCR was carried out by taking 1 μl of undiluted mixture for amplification with primers 1 and 8.
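The conversion from Ct to copy number can be sketched as follows. This is a minimal illustration only, assuming a log-linear standard curve of the form Ct = slope × log10(copies) + intercept. The standard values, the function names and the way the 1/10 dilution is corrected for are hypothetical and are not taken from the study.

```python
import math


def fit_standard_curve(standards):
    """Least-squares fit of Ct against log10(copy number).

    standards: list of (copies, ct) pairs from a dilution series.
    Returns (slope, intercept) for Ct = slope * log10(copies) + intercept.
    """
    xs = [math.log10(copies) for copies, _ in standards]
    ys = [ct for _, ct in standards]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def copies_from_ct(ct, slope, intercept, dilution_factor=1.0):
    """Invert the standard curve, then scale back by the pre-PCR dilution."""
    copies_in_pcr = 10 ** ((ct - intercept) / slope)
    return copies_in_pcr * dilution_factor


# Hypothetical dilution series (copies, Ct); NOT data from the paper.
STANDARDS = [(1e3, 30.0), (1e4, 26.5), (1e5, 23.0), (1e6, 19.5)]

slope, intercept = fit_standard_curve(STANDARDS)
# A sample measured after the 1/10 dilution described in the text.
sample_copies = copies_from_ct(26.5, slope, intercept, dilution_factor=10)
print(f"slope={slope:.2f}, intercept={intercept:.2f}, copies={sample_copies:.3g}")
```

Fold activities such as those reported relative to C3 would then be ratios of copy numbers obtained this way on the same substrate pair.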
Cell culture
The HT1080 cell line was maintained in Dulbecco’s Modified Eagle Medium (DMEM) growth medium supplemented with 10% FBS, 1% L-glutamine and 100 Units/ml each of Penicillin and Streptomycin (Gibco, Life Technologies) at 37°C under 5% CO2 in a humidified atmosphere. For selection of puromycin-resistant recombinants, puromycin (Gibco, Life Technologies) was added to the growth medium (1 μg/ml final concentration). Trypsin-EDTA (Gibco, Life Technologies) was used for detaching the adherent cells for passaging.
Transfection
For transfections in HT1080, 3 × 10⁵ cells were seeded per well of a six-well plate (TPP, Switzerland) in DMEM growth medium a day before transfection to obtain 70–90% confluence. Transfections were carried out with Lipofectamine 2000 (Invitrogen, Life Technologies) with a DNA to Lipofectamine 2000 ratio of 1 μg:3 μl. For every transfection per well, DNA (1 μg targeting plasmid, 0.5 μg scIHF2 expression vector with or without 0.5 μg Int-C7 plasmid) and Lipofectamine 2000 were incubated separately in 50 μl of Opti-MEM medium (Life Technologies). The lipid complexes were prepared by mixing DNA and Lipofectamine 2000 reagent and incubating for 20 min at room temperature. The transfection mix was added dropwise onto the cells (under DMEM growth medium without antibiotics) and transfection was allowed to proceed overnight before cells were transferred to a 10 cm cell culture plate (TPP, Switzerland).
Antibiotic selection and screening for targeted cell clones
Forty-eight hours post-transfection, selection with puromycin in growth medium at the concentration indicated above was initiated. Selection medium was replaced every 2 days until colonies expanded to ∼0.3–0.4 cm in diameter. At this stage, the colonies were picked by carefully scraping patches of cells with a pipette tip and transferred to 24-well plates for clonal expansion. The clones were sequentially expanded from 24-well to six-well plates. Genomic DNA was extracted using the DNeasy Blood & Tissue Kit (Qiagen, GmbH) as per the manufacturer’s protocol for PCR screening.
Identification of recombination events by PCR screening
PCR was performed using GoTaq Flexi DNA polymerase (Promega) to amplify both junctions using primers listed in S1 Table and routinely 500 ng of genomic DNA from parental cells or each recombinant clone as template in 25 μl reactions. The thermal cycling parameters used for primary PCRs from transfected and puromycin-selected parental cells were as follows: initial denaturation at 95°C for 2 min, 35 cycles of denaturation at 95°C for 1 min, annealing at 57°C for 1 min and extension at 72°C for 1 min 10 sec (left junction) and 1 min 40 sec (right junction), and a final step of 72°C for 5 min. This was followed by a nested PCR with 1 μl of template from the product of the primary PCR as follows: 95°C for 2 min, 35 cycles of denaturation at 95°C for 30 sec, annealing at 57°C for 30 sec and extension at 72°C for 20 sec (left junction) and 1 min (right junction), and a final step of 72°C for 5 min. The PCR samples were analyzed by electrophoresis in 1% agarose gels in 0.5× TBE (Tris-boric acid-EDTA buffer) containing 0.5 μg/ml ethidium bromide, and PCR-amplified products were compared with DNA standard markers and digitally documented under UV illumination (Quantum Vilber Lourmat, Germany). PCR-amplified products were analyzed by sequencing.
Molecular dynamics simulations
The available crystal structure (PDB: 1Z1G [27]) of the lambda integrase tetramer bound to a DNA Holliday junction (excluding the integrase residues 1 to 74 and mutating Phe342 back to wildtype Tyr342) was used to generate several mutant models of lambda integrase–DNA complexes. Both the wildtype (WT) and mutant models were subjected to Molecular Dynamics (MD) simulations with the pmemd.CUDA module of the program Amber18 [28]. All-atom versions of the Amber force fields FF14SB [29] and FF99BSC0 [30] were used to model the protein and DNA respectively. The Xleap module was used to prepare the system for the MD simulations. Each simulation system was neutralized with an appropriate number of counterions. Each neutralized system was solvated in an octahedral box with TIP3P [31] water molecules, with at least a 10 Å boundary between the solute atoms and the borders of the box. During the simulations, Lennard-Jones and short-range electrostatic interactions were treated using a cut-off scheme and the long-range electrostatic interactions were treated with the particle mesh Ewald method [6] using a real space cut-off distance of 9 Å. The Settle [32] algorithm was used to constrain bond vibrations involving hydrogen atoms, which allowed a time step of 2 fs to be used during the simulations. Solvent molecules and counterions were initially relaxed using energy minimization with restraints on the protein and DNA. This was followed by unrestrained energy minimization to remove any steric clashes. Subsequently the system was gradually heated from 0 to 300 K using MD simulations with positional restraints (force constant: 50 kcal mol⁻¹ Å⁻²) on the protein and DNA over a period of 0.25 ns, allowing water molecules and ions to move freely. During an additional 0.25 ns, the positional restraints were gradually reduced, followed by a 2 ns unrestrained MD simulation to equilibrate all the atoms.
For each system, a 250 ns production MD at 300 K was carried out in triplicate (assigning different initial velocities to propagate each MD simulation). To enhance the conformational sampling, the systems were subjected to accelerated MD (aMD) [33] simulations as implemented in Amber18 [28]. aMD simulations were performed on all three systems using the “dual-boost” version [34]. The conventional MD simulations mentioned earlier were used to derive the aMD parameters (EthreshP, alphaP, EthreshD, alphaD). aMD simulations were carried out for 500 ns each. Simulation trajectories were visualized using VMD [35] and figures were generated using PyMOL [36].
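For orientation, the stage durations quoted above can be translated into integration step counts at the stated 2 fs time step. The short sketch below does only that arithmetic; the stage labels and helper name are hypothetical, and it is not an Amber input file.

```python
from fractions import Fraction

TIMESTEP_FS = 2          # 2 fs time step, as stated in the text
FS_PER_NS = 1_000_000    # femtoseconds per nanosecond

# Stage durations in ns, taken from the protocol description above.
# Kept as strings so that Fraction parses them exactly (no float rounding).
STAGES_NS = {
    "heating": "0.25",
    "restraint_release": "0.25",
    "equilibration": "2",
    "production_md": "250",
    "accelerated_md": "500",
}


def steps_for(duration_ns):
    """Number of MD steps needed to cover duration_ns at TIMESTEP_FS."""
    steps = Fraction(duration_ns) * FS_PER_NS / TIMESTEP_FS
    if steps.denominator != 1:
        raise ValueError("duration is not a whole number of time steps")
    return int(steps)


STEPS = {stage: steps_for(ns) for stage, ns in STAGES_NS.items()}

for stage, n_steps in STEPS.items():
    print(f"{stage:18s} {STAGES_NS[stage]:>5s} ns -> {n_steps:,} steps")
```

Because each production run was carried out in triplicate, the conventional MD sampling per system amounts to three times the production step count.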
Results
Selection of λ integrase variants with altered specificities
Potential λ integrase targeting sites in the genome of the marine diatom Phaeodactylum tricornutum were identified in silico using the canonical 21 bp bacterial attB sequence (Fig 1) to query sequence databases. A highly similar site, termed attPhae2, was identified with deviations from attB at 3 nucleotide positions in the core binding motifs flanking the highly variable overlap motif where reciprocal DNA strand exchange occurs (Fig 1). Recombination into attPhae2 was assessed using an in vitro plasmid-based assay, whereby intramolecular recombination between attPhae2 and attPhae2L (a modified attL sequence adjusted for the overlap and 5′ region found in attPhae2) was measured by real-time PCR (Fig 2). The parental integrase (Int-h/218) showed minimal activity for attPhae2 x attPhae2L recombination. In contrast, the previously described hypermorphic Int C3 variant [10] showed ~22-fold improved recombination of this substrate pair (Fig 3A). Int C3 was therefore used as starting point for further engineering using an in vivo directed evolution platform that couples correct recombination of plasmid-borne substrate pairs to survival of E. coli on selective media [10]. Random mutagenesis of Int C3 followed by 2 rounds of selection yielded two hits, C4 and C5 (Table 1) with potentially improved activity over Int C3 for recombination using attPhae2 x attPhae2L DNA substrates (Fig 3A).
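The kind of in silico search described at the start of this section can be sketched as a mismatch-tolerant scan that scores only the core binding arms and ignores the variable overlap. This is an illustrative sketch under stated assumptions, not the database query actually used: the query and genome strings are toy sequences (not attB or any real genome), the 7 bp overlap position is a parameter chosen for the example, and the reverse strand is not scanned.

```python
def find_pseudo_sites(genome, query, overlap_start, overlap_end, max_core_mismatches):
    """Scan the forward strand for query-like windows.

    Positions overlap_start..overlap_end-1 (0-based) are treated as the
    variable overlap and are not scored; all other positions count as
    core binding positions. Returns (start, window, core_mismatches) tuples.
    """
    site_len = len(query)
    core_positions = [
        i for i in range(site_len) if not (overlap_start <= i < overlap_end)
    ]
    hits = []
    for start in range(len(genome) - site_len + 1):
        window = genome[start:start + site_len]
        mismatches = sum(1 for i in core_positions if window[i] != query[i])
        if mismatches <= max_core_mismatches:
            hits.append((start, window, mismatches))
    return hits


# Toy 21-mer query: 7 bp left arm, 7 bp overlap, 7 bp right arm (hypothetical).
TOY_QUERY = "GATTACA" + "GGGGGGG" + "CATCGAT"

# Toy genome with two planted sites: one with identical arms and a different
# overlap, and one with two arm mismatches and a different overlap.
SITE_EXACT_ARMS = "GATTACA" + "CCCCCCC" + "CATCGAT"
SITE_TWO_MISMATCHES = "GCTTACA" + "TTTTTTT" + "CATCGTT"
TOY_GENOME = "TTTT" + SITE_EXACT_ARMS + "AAAA" + SITE_TWO_MISMATCHES + "GG"

HITS = find_pseudo_sites(TOY_GENOME, TOY_QUERY, 7, 14, max_core_mismatches=3)
for start, window, mismatches in HITS:
    print(start, window, mismatches)
```

A threshold of three core mismatches mirrors the three deviations reported for attPhae2; a real search would also need to cover the reverse complement and rank candidates by which positions deviate.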
The overlap sequences are in bold. Sequence deviations from the endogenous λ integrase attB sequence outside the overlap region are underlined. The boxed region denotes key cytosine to guanine change important for specificity determination.
Int activity on plasmid DNA comprising appropriate att sequences yields the indicated products. Both the deletion event and resultant minicircle are detected by PCR using the indicated primers. Table shows expected size of amplification products for indicated primers pairs in presence or absence of recombination.
(A) In vitro translated integrases were incubated with attPhae2 x attPhae2L plasmid substrate and recombination assessed by real-time PCR using primers 2 + 7 (black bars) or primers 3 + 4 (white bars). Activity is presented as fold increases over Int-h/218 (WT). Selectants 1 and 5 showing improved activity over C3 are respectively renamed Int C4 and C5. Blank corresponds to addition of in vitro translation extract only (no integrase expressed). (B) As in (A) using indicated integrase variants and attNanno1 X attNanno1L plasmid substrate. Top graph uses primer 2 + 7 pair. Bottom graph uses primer 3 + 4 pair. Inset shows expression levels of integrases by Western blot. NTC: in vitro translation mix only (no integrase expressed).
Int C4 and C5 were subsequently assayed for recombination of a putative target site identified in the genome of the microalga Nannochloropsis oceanica (attNanno1), differing from attPhae2 by two and six bases in the left arm of the core binding and overlap sequences respectively (Fig 1). At similar expression levels, both showed potentially improved activity over Int C3 for recombination of attNanno1 x attNanno1L (Fig 3B). These were tested again along with a further variant termed C6, comprising all the mutations present in C4 in addition to the H329R mutation identified in C5 (Table 1). As before, all variants potentially showed improved activity over C3 for recombination of attNanno1 substrates (Fig 4). C6 was potentially more active than either C4 or C5, highlighting a possible additive effect of combining mutations from C4 and C5. Notably, the C4 and C6 variants recombined endogenous attB x attBL with reduced efficiency compared to C3 and C5, indicated by lower yield of the 660 bp PCR product scoring for correct recombination (Fig 4B). Analysis of mutations in C4 and C6 highlighted N99D as potentially responsible for the observed specificity switch phenotype based on the proximity of N99 to the DNA substrate in the crystal structure and prior reports highlighting this residue as a specificity determinant [37–39]. We therefore generated a panel of constructs to interrogate the N99D mutation. These comprised Int-h/218 + N99D, Int C3 + N99D, Int C4 + D99N reversion, and Int C6 + D99N reversion. When tested using attNanno x attNannoL substrates, recombination by Int-h/218 + N99D and Int C3 + N99D was potentially improved over Int-h/218 and Int C3 (Fig 5). Conversely, the D99N reversion in Int C4 and Int C6 potentially led to reduced efficiency of recombination. These results highlight a potential key role of the N99D mutation for the observed specificity switch of the C4 and C6 Int variants.
(A) In vitro translated integrases were incubated with indicated DNA substrates and activity determined by real-time PCR using either primers 2 + 7 or primers 3 + 4. Blank corresponds to addition of in vitro translation extract only (no integrase expressed). (B) Same as (A), except that post incubation an aliquot of each reaction was used in end-point PCR using primer 1 + 8 pair (top panels) or primer 3 + 4 pair (bottom panels). Using the first primer pair, successful recombination of the plasmid substrate yields a 660 bp fragment (lower arrow), whilst non-recombined plasmid yields a 1203 bp fragment (upper arrow). Recombination measured by the primer 3 + 4 pair yields a 151 bp band (arrowed).
The indicated integrases were made using in vitro translation and incubated with attNanno1 x attNanno1L plasmid substrate. Scoring of recombination post incubation was carried out by end-point PCR using the primer 3 + 4 pair. Recombination measured by these primers yields a 151 bp band (arrowed). Repeat experiment shown in S1 Fig.
We next tested recombinantly expressed and purified Int C3, C4 and C6 protein variants for more quantitative evaluation of function using a real-time PCR assay. The confirmatory results in Fig 6 show enhanced recombination of attNanno1 substrates by C4 (10-fold improved over C3) and C6 (28-fold improved over C3). We also evaluated recombination of a sequence identified in the human genome with homology to attNanno1, termed attHNanno (Fig 1). Int C4 showed 91-fold improved recombination of attHNanno over Int C3. C6 showed 305-fold improved recombination of attHNanno over C3. As before, recombination of attB x attBL substrate was less efficient for C4 and C6 compared to C3. We further confirmed these results using purified proteins by end-point PCR assay (Fig 7).
(A) Purified integrases (C3, C4, C6) were incubated with indicated plasmid DNA substrates and activity (copy number of recombined product) determined by real-time PCR using the primer 5 + 6 pair. n = 2 ± SD. Significance (p-value) was determined using two-sided Student’s t-test. n.s: p < 0.1, *: p < 0.05, **: p < 0.005, ****: p < 0.00005. Values above each bar indicate fold activity relative to C3 integrase on the same substrate pair. (B) The real-time PCR reaction products analysed on agarose gel. Arrow indicates size of correct band (215 bp) indicating recombination event. CON: no enzyme in recombination reaction. Repeat end-point gel shown in S2 Fig.
Purified integrases (C3, C4, C6) were incubated with indicated plasmid DNA substrates and activity determined by end-point PCR using primers 1 and 8. The PCR products corresponding to unrecombined (1203 bp) and recombined (660 bp) template DNA are indicated by arrows. CON: no enzyme in recombination reaction. Duplicate results are shown.
Based on results highlighting a potential key role for the N99D mutation in the observed phenotype, we generated a construct termed Int C7 (Table 1) comprising the expected minimal set of mutations required for improved recombination of the novel attNanno1 and attHNanno substrates (N99D and H329R) in addition to the core C3 mutations. Expressed and purified C7 showed 345-fold improvement over C3 for recombination of attNanno1 substrates and 1097-fold improvement for recombination of attHNanno substrates (Fig 8A). The respective values for C6 were 314 and 405. C7 was 1.1 and 2.7-fold more efficient than C6 for respective recombination of attNanno1 and attHNanno recombination targets. Conversely, C6 and C7 respectively recombined attB x attBL 13 and 6.3-fold less efficiently than C3. This represents a 3925-fold change in specificity for C6 recombination of attNanno1 versus attB substrates and a 5063-fold change in specificity for recombination of attHNanno substrates. The corresponding values for C7 are 2156 and 6856 respectively. Repetition of this experiment using a 10-fold lower final concentration of purified integrase and DNA substrate (Fig 9) led to similar results, with a 943-fold change in specificity for C6 recombination of attNanno1 versus attB substrates and a 1057-fold change in specificity for recombination of attHNanno substrates. In the case of C7, a significant fold difference was only observed for the attHNanno substrate, with a 915-fold switch in specificity.
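The specificity-switch figures quoted above appear to be the product of the fold gain on the new site and the fold loss on attB, both relative to C3. The sketch below checks that reading against the reported numbers. The product definition is our interpretation rather than a formula stated in the text, the function name is hypothetical, and the inputs are the rounded fold values as printed (13 and 6.3), so agreement is only expected to within rounding.

```python
def specificity_switch(fold_gain_new_site, fold_loss_native_site):
    """Fold change in specificity: gain on the new site times loss on attB.

    Assumed definition; both arguments are fold values relative to C3.
    """
    return fold_gain_new_site * fold_loss_native_site


# (fold gain on new site, fold loss on attB, specificity change reported in text)
REPORTED = {
    ("C6", "attNanno1"): (314, 13, 3925),
    ("C6", "attHNanno"): (405, 13, 5063),
    ("C7", "attNanno1"): (345, 6.3, 2156),
    ("C7", "attHNanno"): (1097, 6.3, 6856),
}

DEVIATIONS = {}
for (variant, site), (gain, loss, reported) in REPORTED.items():
    estimate = specificity_switch(gain, loss)
    DEVIATIONS[(variant, site)] = abs(estimate - reported) / reported
    print(f"{variant} {site}: estimate {estimate:.0f} vs reported {reported} "
          f"({DEVIATIONS[(variant, site)]:.1%} apart)")
```

The small residual differences are consistent with the attB fold losses having been rounded for presentation.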
(A) Purified integrases (C3, C6, C7) were incubated with indicated plasmid DNA substrates and activity (copy number of recombined product) determined by real-time PCR using the primer 3 + 4 pair. Reactions comprised 17 nM integrase and 1.3 nM respective DNA substrate. n = 2 ± SD. Significance (p-value) was determined using two-sided Student’s t-test. n.s: p < 0.1, *: p < 0.05, **: p < 0.005, ****: p < 0.00005. Values above each bar indicate fold activity relative to C3 integrase on the same substrate pair. (B) PCR products from (A) resolved on agarose gel. Arrow indicates position of expected band indicating recombination. Repeat gel shown in S3 Fig.
(A) Purified integrases (C3, C6, C7) were incubated with indicated plasmid DNA substrates and activity (copy number of recombined product) determined by real-time PCR using the primer 3 + 4 pair. Reactions comprised 1.7 nM integrase and 0.13 nM respective DNA substrate. n = 2 ± SD. Significance (p-value) was determined using two-sided Student’s t-test. n.s: p < 0.1, *: p < 0.05, **: p < 0.005, ****: p < 0.00005. Values above each bar indicate fold activity relative to C3 integrase on the same substrate pair. (B) PCR products from (A) resolved on agarose gel. Arrow indicates position of expected band indicating recombination. Repeat gel shown in S3 Fig.
Molecular dynamics simulations
The roles of the selected N99D and H329R mutations were further investigated using molecular dynamics (MD) simulations. Structures of the wild type and mutant Int tetramer–DNA complexes remained stable throughout the simulations. For wild type Int bound to attB substrate DNA (attB-Int complex), the side chain of N99 interacts with the amide of guanine (DG8) for ~95% of the simulation (Figs 10 and S4). Note that DG8 base pairs to the attB cytosine boxed in Fig 1. Replacement of N99 with D99 (attB-IntC7 and attB-IntC3_N99D complexes) results in clear loss of the interaction with DG8, possibly explaining the lack of activity of variants comprising the N99D mutation on attB substrate (Fig 10, S4 and S5A Figs). The side chain of H329 was found to interact with D351 from the neighboring chain in the tetramer for ~25% of the simulation time in the attB-Int complex. However, the H329R mutation resulted in an R329–D351 salt bridge interaction that was stable for 60% of the simulation time for the attB-IntC3_H329R complex (S4 and S5B Figs). No difference was observed in the N99-DG8 interaction, which remained as stable as in the attB-Int complex. The combined H329R and N99D mutations present in C7 therefore result in loss of the N99-DG8 interaction but improved R329-D351 interaction (attB-IntC7 complex, Fig 10, S4 and S6A Figs). When DG8 in attB is replaced with cytosine (DC8) as present in attNanno, no interaction is observed between this base and N99 of Int (attBDC8-Int complex, Fig 10, S6B Fig). Note that DC8 base pairs with the guanine residue present in the variant att sequences (boxed in Fig 1). Furthermore, the H329-D351 interaction was reduced from 25% to only 10% of simulation time in this complex (S4 Fig). Similar interactions were observed for attBDC8-IntC3 complex simulations, again shedding light on the poor activity of Int/Int-C3 on the variant att sequences.
Replacement of N99 with D99 results in formation of a stable network of interactions between λ integrase and bound DNA in the attBDC8-IntC3_N99D and attBDC8-IntC7 complexes (Figs 10 and S4). This network engages D99 in interactions with both the amine of cytosine (DC8) and the K95 sidechain through formation of a salt bridge. Additional interactions are observed between the sidechain of K95 and the amine of adenine (DA21), the amide of guanine (DG22) and the nitrogen N3 of adenine (DA21). This large network of interactions was observed to be stable for > 75% of the simulation time and is likely a key determinant of the observed specificity switch. In addition to these interactions, the R329–D351 interaction was observed for ~60% of the simulation time in the attBDC8-IntC7 complex (S4 and S7A Figs). No significant differences were observed for the R329–D351 interaction in the attBDC8-IntC3_H329R complex compared to attB-IntC3_H329R (S7B Fig).
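Percentages such as "~95% of the simulation" are interaction occupancies, i.e. the fraction of trajectory frames in which a contact is present. A minimal sketch of that calculation is shown below. The 3.5 Å distance cutoff, the function name and the distance values are assumptions for illustration; the text does not state the criterion used to score an interaction.

```python
def occupancy(distances_angstrom, cutoff_angstrom=3.5):
    """Fraction of frames in which a donor-acceptor distance is within cutoff.

    distances_angstrom: one distance per trajectory frame (in Å).
    cutoff_angstrom: assumed contact criterion (hypothetical default of 3.5 Å).
    """
    if not distances_angstrom:
        raise ValueError("no frames supplied")
    in_contact = sum(1 for d in distances_angstrom if d <= cutoff_angstrom)
    return in_contact / len(distances_angstrom)


# Toy per-frame distances (Å) for a single residue-base pair; NOT simulation data.
TOY_DISTANCES = [2.9, 3.1, 3.4, 4.2, 3.0, 2.8, 5.1, 3.3]

toy_occupancy = occupancy(TOY_DISTANCES)
print(f"occupancy: {toy_occupancy:.0%}")
```

In practice the per-frame distances would be extracted from the trajectories with a trajectory analysis tool, and the same function applied to each interaction of interest.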
Tetrameric structure of Int shown in center with individual chains coloured differently. Boxed regions are expanded in the images above to show interactions of Int and Int C7 with attB or the variant attBDC8. Images below depict intra-chain interactions within Int and IntC3_H329R.
In situ targeting of genomic attHNanno by Int-C7 resulting in delivery of functional payloads
Int-C7 was next assayed for intermolecular recombination of a circular target vector carrying attHNannoL into the genomic 21 bp attHNanno site on human chromosome 11. We generated a eukaryotic expression vector for Int-C7, termed pEF-ss-Int-C7-CNLS, coding for Int-C7 with a C-terminal nuclear localization signal (NLS) derived from the Simian Virus 40 (SV40) large T-antigen. Transfection into either HEK293ft or HT1080 cells followed by Western blotting indicated stable Int C7 NLS expression at levels comparable to Int C3-NLS (Fig 11).
(A) Expression vectors for Int-C3 and Int-C7 were transfected into HEK293ft cells, and cell lysates were prepared by standard methods and analyzed by Western blotting using mouse anti Int-C3 antibodies. Both integrase variants are stably expressed, exhibiting slightly increased molecular weights due to the presence of a C-terminal nuclear localization signal (NLS) when compared with the corresponding purified proteins (from E. coli) loaded as controls, which do not have a tag. Endogenous actin protein was probed to control for loading of lysates. (B) Two different amounts of expression vector for Int C7, as indicated, were transfected into human fibrosarcoma HT1080 cells and analyzed by Western blotting as in (A), with lysate from non-transfected cells serving as negative control. See S8 Fig for uncropped blots.
In order to target the genomic attHNanno sequence for integration in situ, we chose the following strategy (Fig 12): A 6.5 kb target vector carrying attHNannoL plus a selection marker and a GFP expression cassette as payload was co-transfected with expression vectors for Int-C7 and co-factor scIHF2 [40] into human HT1080 cells, followed by puromycin selection at 48 hrs. Post-selection, genomic DNA from bulk selectant cultures was isolated and PCR employed to identify the left and right recombination junctions (Fig 13). The results showed that with 750ng and 1000ng of transfected target vector, both junctions can be generated by nested PCR, which was verified by PCR product sequencing (Fig 13).
See text for details. The primers used to detect successful intermolecular recombination events at the junctions are indicated.
See Fig 12 for corresponding positions of PCR primers. The sequencing results of products for left and right junction are shown at the bottom.
In this experimental set-up, intramolecular recombination in situ between attHNanno x attHNannoL on the target vector (Fig 12) can first result in a seamless vector carrying only attHNannoL, which then recombines into the locus on chromosome 11. However, PCR/sequencing analysis of the junctions revealed that unrecombined plasmid DNA has been inserted by Int-C7 into the genomic locus. This indicates that contrary to expectations on thermodynamic grounds, Int-C7 surprisingly catalyzed intermolecular recombination more efficiently than intramolecular recombination, at least under the chosen experimental condition and with this substrate.
We screened individual cell clones from co-transfections and identified positive lines such as CL6, which showed the predicted junction product in the primary PCR as exemplified in Fig 13. Southern analysis indicated single copy payload integration into attHNanno in 3 out of 4 CL6 subclones (Fig 14A and 14B). The identity of the products was verified by sequencing. Furthermore, transgene expression (GFP) from this targeted human locus was homogeneous and sustained in the three CL6 subclones, which could be important for future gene/cell therapy approaches (Fig 14C). Using the same strategy, we could also demonstrate successful targeting of the endogenous attHNanno site in bulk selectants of HEK293ExPi cells (not shown).
Fig 14. (A) Feature map highlighting restriction sites and predicted orientation of the GFP-expressing payload when integrated into the attHNanno site. (B) Southern blot indicating single-copy insertion of payload in subclones 2, 6 and 8, determined by presence of the expected 7.3 kb band (highlighted in A). ‘+’ corresponds to positive control DNA. (C) FACS analysis of HT1080 subclones 2, 6 and 8 showing increased GFP expression over control cells.
Bioinformatics analysis of the genomic attHNanno locus (chr11:110,828,911) shows the target sequence to be located in a gene desert, with a single exon of pseudogene HNRPA1P60 positioned about 40 kb upstream (Fig 15A). No enhancer regulatory elements appear to be present in the vicinity of this locus. Furthermore, in the human pluripotent embryonic stem cell line H1, this locus belongs to a topologically associated domain (TAD) (Fig 15B), with an epigenetic H4K20me1 mark about 1.5 kb downstream (Fig 15C). Taken together, these findings indicate that the locus can be considered a safe harbor site favorable for transgene expression.
Fig 15. (A) Analysis of the genomic attHNanno locus (chr11:110,828,911) shows the target sequence to be located in a gene desert. (B) No enhancer regulatory elements appear to be present in the vicinity of this locus. (C) An epigenetic H4K20me1 mark is located ~1.5 kb downstream.
Discussion
We have described the selection of λ integrase variants displaying notable specificity switches towards pseudosites identified in microalgal and human genomes. Attempts at targeting the attNanno1 and attPhae2 pseudosites in vivo in the host microalgae using the engineered Int variants have thus far been unsuccessful, most likely due to problems related to the efficiency of both foreign DNA uptake and transient expression of transgenes. The novel attHNanno site that we identified in the human genome was, however, successfully targeted. Bioinformatic analysis suggests attHNanno is a safe harbor that can now be targeted by the orthogonal Int-C7 enzyme.
Residues N99 and H329 of λ integrase have previously been replaced with the corresponding amino acids (D and R respectively) of the highly related bacteriophage HK022 integrase, enabling a specificity switch towards the HK022 attB site [38, 39]. Notably, the attB site targeted by HK022 shares the cytosine to guanine transversion present in the algal and human sequences targeted in this study (Fig 1). Selection of the identical substitutions in this study both further confirms their role in target site discrimination and illustrates how readily directed evolution can recapitulate natural selection. MD simulations provide additional insight into the orthogonal phenotypes observed in this study. A favourable network of interactions promotes binding and activity of Int variants comprising N99D in the DNA core binding domain towards substrates with the cytosine to guanine transversion. H329 in the Int catalytic domain is distal to bound DNA. Mutation to R329 potentially introduces up to four intra-subunit salt bridges that could stabilize tetramer binding to DNA and favour recombination of non-cognate sequences. Future structural studies will shed light on the roles of these amino acids in substrate discrimination.
Scarless gene curing can be achieved by sequential recombinase mediated cassette exchange (RMCE) [41, 42]. Disease-associated genes flanked by pseudo att sites recognized by Int are excised in a first reaction and then replaced with the correct wild-type gene in a second reaction. RMCE is limited by the availability of pseudosites and of cognate enzymes targeting them. The highly active C7 Int variant, along with its orthogonal att site, potentially raises the number of genes that can be targeted. The odds of identifying appropriate pseudosites will also be improved by searching for flanking motifs orthogonal to C3 and C7 and subsequently using these enzymes together in a dual RMCE reaction.
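The search for flanking pseudosites described above can be pictured with a small sketch. The Python below is illustrative only and is not the procedure used in this study: it scans one strand of a sequence for windows within a chosen number of mismatches of a query motif, then pairs hits for two motifs that flank a segment of bounded length, as a dual RMCE reaction would require. The function names, the mismatch threshold, the gap bounds and the toy sequences are all assumptions introduced for illustration; real core att sequences for C3 and C7 would have to be supplied, and a practical search would also scan the reverse strand.

```python
from typing import List, Tuple


def hamming(a: str, b: str) -> int:
    """Count mismatched positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def find_pseudo_sites(seq: str, motif: str, max_mismatches: int) -> List[Tuple[int, int]]:
    """Return (start, mismatches) for each forward-strand window of `seq`
    that differs from `motif` at no more than `max_mismatches` positions."""
    seq, motif = seq.upper(), motif.upper()
    k = len(motif)
    hits = []
    for start in range(len(seq) - k + 1):
        d = hamming(seq[start:start + k], motif)
        if d <= max_mismatches:
            hits.append((start, d))
    return hits


def flanking_pairs(
    hits_a: List[Tuple[int, int]],
    hits_b: List[Tuple[int, int]],
    len_a: int,
    min_gap: int,
    max_gap: int,
) -> List[Tuple[int, int]]:
    """Pair each site-A hit with every downstream site-B hit whose
    intervening segment length lies within [min_gap, max_gap]."""
    pairs = []
    for a_start, _ in hits_a:
        for b_start, _ in hits_b:
            gap = b_start - (a_start + len_a)
            if min_gap <= gap <= max_gap:
                pairs.append((a_start, b_start))
    return pairs


if __name__ == "__main__":
    # Toy data only: hypothetical motifs and sequence, not real att sites.
    motif_a = "ACGTTGCA"   # stand-in for a site recognized by one variant
    motif_b = "TTGACCAA"   # stand-in for a site recognized by the other
    toy_seq = "GGACGTTGCACCCCCCTTGACGAAGG"

    hits_a = find_pseudo_sites(toy_seq, motif_a, max_mismatches=1)
    hits_b = find_pseudo_sites(toy_seq, motif_b, max_mismatches=1)
    print(flanking_pairs(hits_a, hits_b, len(motif_a), min_gap=1, max_gap=10))
```

The exhaustive window scan is quadratic in the worst case when pairing hits and is meant for clarity rather than genome-scale use; an indexed aligner would be a more realistic choice for whole genomes.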
In conclusion, we have evolved highly active and orthogonal Int variants with potential applications in microbial and human genome engineering.
Supporting information
S1 Fig. N99 acts as a specificity determinant.
The indicated integrases were made using in vitro translation and incubated with attNanno1 X attNanno1L plasmid substrate. Scoring of recombination post incubation was carried out by end-point PCR using the primer 3 + 4 pair. Recombination measured by these primers yields a 151 bp band (arrowed).
https://doi.org/10.1371/journal.pone.0292479.s001
(TIF)
S2 Fig. Real-time PCR reaction products analysed on agarose gel.
Arrow indicates size of correct band (215 bp) indicating recombination event. CON: no enzyme in recombination reaction. NTC: no template control.
https://doi.org/10.1371/journal.pone.0292479.s002
(TIF)
S3 Fig. Activity assay for purified integrases.
(A) Purified integrases (C3, C6, C7) were incubated with the indicated plasmid DNA substrates and activity was determined by real-time PCR using the primer 5 + 6 pair. Reactions comprised 17 nM integrase and 1.3 nM of the respective DNA substrate. PCR products were subsequently resolved on the gels shown. Yellow boxed lanes denote a repeat experiment. Arrows show the position of the expected band indicating recombination. (B) As in (A), using 10-fold less enzyme and DNA substrate.
https://doi.org/10.1371/journal.pone.0292479.s003
(TIF)
S4 Fig. Interactions (%) between indicated bases/amino acids calculated by MD simulations.
https://doi.org/10.1371/journal.pone.0292479.s004
(TIF)
S5 Fig. Molecular dynamics simulations of indicated Int–DNA complexes.
Tetrameric structure of Int shown on left with individual chains coloured differently. Boxed regions are expanded in the images on the right to show interactions with attB or the variant attBDC8 and intra-chain interactions. (A) attB-Int (B) attB-IntC3-H329R.
https://doi.org/10.1371/journal.pone.0292479.s005
(TIF)
S6 Fig. Molecular dynamics simulations of indicated Int–DNA complexes.
Tetrameric structure of Int shown on left with individual chains coloured differently. Boxed regions are expanded in the images on the right to show interactions with attB or the variant attBDC8 and intra-chain interactions. (A) attB-IntC7 (B) attBDC8-Int, DNA_MUT: G8C, C22G.
https://doi.org/10.1371/journal.pone.0292479.s006
(TIF)
S7 Fig. Molecular dynamics simulations of indicated Int–DNA complexes.
Tetrameric structure of Int shown on left with individual chains coloured differently. Boxed regions are expanded in the images on the right to show interactions with attB or the variant attBDC8 and intra-chain interactions. (A) attBDC8-IntC7 (B) attBDC8-IntC3-H329R.
https://doi.org/10.1371/journal.pone.0292479.s007
(TIF)
References
- 1. Kumar G, Shekh A, Jakhu S, Sharma Y, Kapoor R, Sharma TR. Bioengineering of Microalgae: Recent Advances, Perspectives, and Regulatory Challenges for Industrial Application. Front Bioeng Biotechnol. 2020;8:914. pmid:33014997
- 2. Lenka SK, Carbonaro N, Park R, Miller SM, Thorpe I, Li Y. Current advances in molecular, biochemical, and computational modeling analysis of microalgal triacylglycerol biosynthesis. Biotechnol Adv. 2016;34(5):1046–63. pmid:27321475
- 3. Nymark M, Sharma AK, Sparstad T, Bones AM, Winge P. A CRISPR/Cas9 system adapted for gene editing in marine algae. Sci Rep. 2016;6:24951. pmid:27108533
- 4. Kasai Y, Oshima K, Ikeda F, Abe J, Yoshimitsu Y, Harayama S. Construction of a self-cloning system in the unicellular green alga Pseudochoricystis ellipsoidea. Biotechnol Biofuels. 2015;8:94. pmid:26140053
- 5. Greiner A, Kelterborn S, Evers H, Kreimer G, Sizova I, Hegemann P. Targeting of Photoreceptor Genes in Chlamydomonas reinhardtii via Zinc-Finger Nucleases and CRISPR/Cas9. Plant Cell. 2017;29(10):2498–518. pmid:28978758
- 6. Bibikova M, Beumer K, Trautman JK, Carroll D. Enhancing gene targeting with designed zinc finger nucleases. Science. 2003;300(5620):764. pmid:12730594
- 7. Naduthodi MIS, Sudfeld C, Avitzigiannis EK, Trevisan N, van Lith E, Alcaide Sancho J, et al. Comprehensive Genome Engineering Toolbox for Microalgae Nannochloropsis oceanica Based on CRISPR-Cas Systems. ACS Synth Biol. 2021;10(12):3369–78. pmid:34793143
- 8. Gottfried P, Lotan O, Kolot M, Maslenin L, Bendov R, Gorovits R, et al. Site-specific recombination in Arabidopsis plants promoted by the Integrase protein of coliphage HK022. Plant Mol Biol. 2005;57(3):435–44. pmid:15830132
- 9. Makhija H, Roy S, Hoon S, Ghadessy FJ, Wong D, Jaiswal R, et al. A novel lambda integrase-mediated seamless vector transgenesis platform for therapeutic protein expression. Nucleic Acids Res. 2018;46(16):e99.
- 10. Siau JW, Chee S, Makhija H, Wai CM, Chandra SH, Peter S, et al. Directed evolution of lambda integrase activity and specificity by genetic derepression. Protein Eng Des Sel. 2015;28(7):211–20.
- 11. Venken KJ, He Y, Hoskins RA, Bellen HJ. P[acman]: a BAC transgenic platform for targeted insertion of large DNA fragments in D. melanogaster. Science. 2006;314(5806):1747–51. pmid:17138868
- 12. Huang X, Yang D, Zhang J, Xu J, Chen YE. Recent Advances in Improving Gene-Editing Specificity through CRISPR-Cas9 Nuclease Engineering. Cells. 2022;11(14). pmid:35883629
- 13. Yee JK. Off-target effects of engineered nucleases. FEBS J. 2016;283(17):3239–48. pmid:27208701
- 14. Tay Y, Ho C, Droge P, Ghadessy FJ. Selection of bacteriophage lambda integrases with altered recombination specificity by in vitro compartmentalization. Nucleic Acids Res. 2010;38(4):e25. pmid:19966270
- 15. Buchholz F, Stewart AF. Alteration of Cre recombinase site specificity by substrate-linked protein evolution. Nat Biotechnol. 2001;19(11):1047–52. pmid:11689850
- 16. Bolusani S, Ma CH, Paek A, Konieczka JH, Jayaram M, Voziyanov Y. Evolution of variants of yeast site-specific recombinase Flp that utilize native genomic sequences as recombination target sites. Nucleic Acids Res. 2006;34(18):5259–69. pmid:17003057
- 17. Gaj T, Sirk SJ, Tingle RD, Mercer AC, Wallen MC, Barbas CF 3rd. Enhancing the specificity of recombinase-mediated genome engineering through dimer interface redesign. J Am Chem Soc. 2014;136(13):5047–56.
- 18. Voziyanova E, Anderson RP, Shah R, Li F, Voziyanov Y. Efficient Genome Manipulation by Variants of Site-Specific Recombinases R and TD. J Mol Biol. 2016;428(5 Pt B):990–1003. pmid:26555749
- 19. Voziyanova E, Li F, Shah R, Voziyanov Y. Genome targeting by hybrid Flp-TAL recombinases. Sci Rep. 2020;10(1):17479. pmid:33060660
- 20. Abi-Ghanem J, Chusainow J, Karimova M, Spiegel C, Hofmann-Sieber H, Hauber J, et al. Engineering of a target site-specific recombinase by a combined evolution- and structure-guided approach. Nucleic Acids Res. 2013;41(4):2394–403. pmid:23275541
- 21. Nash HA. Purification and properties of the bacteriophage lambda Int protein. Methods Enzymol. 1983;100:210–6. pmid:6225930
- 22. Landy A. Dynamic, structural, and regulatory aspects of lambda site-specific recombination. Annu Rev Biochem. 1989;58:913–49.
- 23. Lorbach E, Christ N, Schwikardi M, Droge P. Site-specific recombination in human cells catalyzed by phage lambda integrase mutants. J Mol Biol. 2000;296(5):1175–81. pmid:10698624
- 24. Christ N, Corona T, Droge P. Site-specific recombination in eukaryotic cells mediated by mutant lambda integrases: implications for synaptic complex formation and the reactivity of episomal DNA segments. J Mol Biol. 2002;319(2):305–14. pmid:12051908
- 25. Christ N, Droge P. Genetic manipulation of mouse embryonic stem cells by mutant lambda integrase. Genesis. 2002;32(3):203–8. pmid:11892009
- 26. Suttie JL, Chilyon M, Que Q. US patent no. 7351877. 2008.
- 27. Biswas T, Aihara H, Radman-Livaja M, Filman D, Landy A, Ellenberger T. A structural basis for allosteric control of DNA recombination by lambda integrase. Nature. 2005;435(7045):1059–66. pmid:15973401
- 28. Case DA, Ben-Shalom IY, Brozell SR, et al. Amber 2018. 2018:1–923.
- 29. Maier JA, Martinez C, Kasavajhala K, Wickstrom L, Hauser KE, Simmerling C. ff14SB: Improving the Accuracy of Protein Side Chain and Backbone Parameters from ff99SB. J Chem Theory Comput. 2015;11(8):3696–713. pmid:26574453
- 30. Perez A, Marchan I, Svozil D, Sponer J, Cheatham TE 3rd, Laughton CA, et al. Refinement of the AMBER force field for nucleic acids: improving the description of alpha/gamma conformers. Biophys J. 2007;92(11):3817–29. pmid:17351000
- 31. Jorgensen WL, Chandrasekhar J, Madura JD, Impey RW, Klein ML. Comparison of simple potential functions for simulating liquid water. J Chem Phys. 1983;79:926–35.
- 32. Miyamoto S, Kollman PA. Settle: An analytical version of the SHAKE and RATTLE algorithm for rigid water models. J Comput Chem. 1992.
- 33. Pierce LC, Salomon-Ferrer R, Augusto FdOC, McCammon JA, Walker RC. Routine Access to Millisecond Time Scale Events with Accelerated Molecular Dynamics. J Chem Theory Comput. 2012;8(9):2997–3002. pmid:22984356
- 34. Hamelberg D, de Oliveira CA, McCammon JA. Sampling of slow diffusive conformational transitions with accelerated molecular dynamics. J Chem Phys. 2007;127(15):155102. pmid:17949218
- 35. Humphrey W, Dalke A, Schulten K. VMD: visual molecular dynamics. J Mol Graph. 1996;14(1):33–8, 27–8. pmid:8744570
- 36. DeLano WL. The PyMOL molecular graphics system. San Carlos, CA, USA: DeLano Scientific. 2002.
- 37. Cheng Q, Swalla BM, Beck M, Alcaraz R Jr, Gumport RI, Gardner JF. Specificity determinants for bacteriophage Hong Kong 022 integrase: analysis of mutants with relaxed core-binding specificities. Mol Microbiol. 2000;36(2):424–36.
- 38. Dorgai L, Yagil E, Weisberg RA. Identifying determinants of recombination specificity: construction and characterization of mutant bacteriophage integrases. J Mol Biol. 1995;252(2):178–88. pmid:7674300
- 39. Yagil E, Dorgai L, Weisberg RA. Identifying determinants of recombination specificity: construction and characterization of chimeric bacteriophage integrases. J Mol Biol. 1995;252(2):163–77. pmid:7674299
- 40. Corona T, Bao Q, Christ N, Schwartz T, Li J, Droge P. Activation of site-specific DNA integration in human cells by a single chain integration host factor. Nucleic Acids Res. 2003;31(17):5140–8. pmid:12930965
- 41. Baer A, Bode J. Coping with kinetic and thermodynamic barriers: RMCE, an efficient strategy for the targeted integration of transgenes. Curr Opin Biotechnol. 2001;12(5):473–80. pmid:11604323
- 42. Elias A, Kassis H, Elkader SA, Gritsenko N, Nahmad A, Shir H, et al. HK022 bacteriophage Integrase mediated RMCE as a potential tool for human gene therapy. Nucleic Acids Res. 2020;48(22):12804–16. pmid:33270859