Circular Permutation of Red Fluorescent Proteins

Circular permutation of fluorescent proteins provides a substrate for the design of molecular sensors. Here we describe a systematic exploration of permutation sites for mCherry and mKate using a tandem fusion template approach. Circular permutants retaining more than 60% (mCherry) and 90% (mKate) of the brightness of the parent molecules are reported, as well as a quantitative evaluation of the fluorescence from neighboring mutations. Truncations of circular permutants indicated essential N- and C-terminal segments and substantial flexibility in the use of these molecules. Structural evaluation of two cp-mKate variants indicated no major conformational changes from the previously reported wild-type structure, and a cis conformation of the chromophores. Four cp-mKates were identified with over 80% of native fluorescence, providing important new building blocks for sensor and complementation experiments.


Introduction
Circular permutation of GFP and its variants has markedly expanded the utility of fluorescent proteins (FPs), enabling the development of genetically encoded sensors [1,2,3] and facilitating the use of FPs in fluorescence complementation assays [4,5]. The extension of this technology to red fluorescent proteins, while potentially quite useful, has been limited; bright and thermostable circular permutants or complementary peptides have been difficult to design [1,2], likely because the local chromophore environment is altered by the rearrangements attempted to date [6]. In an effort to address this limitation, we undertook a systematic effort to produce bright and stable circularly permutated variants of two naturally occurring proteins, mCherry, a monomeric variant of the Discosoma sp. coral protein DsRed [7], and mKate, a monomeric variant of the anemone Entacmaea quadricolor protein eqFP578 [8,9]. Both proteins have advantageous tissue imaging properties, including relative brightness and long wavelength emission characteristics; moreover, both are spectrally distinct from GFP and its derivatives, potentially expanding the color palette of genetically encoded sensors or complementation pairs that can be used simultaneously with GFP-based constructs.
Here we report the systematic evaluation of circular permutation sites in mCherry and mKate, and the development of highly efficient circularly permutated red fluorescent proteins (cp-RFPs).

Generation of cp-RFPs from Tandem Fusion Templates
To efficiently probe multiple permutation sites, we created tandem fusion templates [10] of mKate (pRSET-tdmKate) and mCherry (pRSET-tdmCherry). For the former, the mKate (231 AA) coding sequence was amplified from plasmid TagFP635 (Evrogen) using Phusion High-Fidelity DNA Polymerase (Finnzymes) with the copy 1 forward (F1) BamHI-mK-1F (atcaggatccatgtctgagctgattaaggaga) and reverse (R1) HindIII-mK-231R (ctacaagctttcatttgtgccccagtttgctagg) primers (restriction sites italicized). The amplified product was inserted into the BamHI/HindIII sites of pRSET-A (Invitrogen) to produce pRSET-mKate. The second copy of the mKate reading frame was amplified from the TagFP635 template with (F2) BamHI-mK-1F and (R2) BamHI-mK-231R-linker (ctacggatccgccggtaccgcctttgtgccccagtttgctagg) primers. Note the absence of a stop codon in the reverse primer (R2) and the underlined linker, such that the BamHI site provides the residues GS in the linker peptide GGTGGS. The PCR product was inserted into the BamHI site of pRSET-mKate and proper orientation was determined by PCR and DNA sequencing. The final plasmid (pRSET-tdmKate) contains two tandem mKate coding sequences separated by the coding sequence for the linker peptide GGTGGS, in a contiguous reading frame (Fig. 1A). A similar strategy was used to construct pRSET-tdmCherry using the mCherry (236 AA) coding sequence from pRSET-B mCherry [7] and primers (F1) BamHI-mC-1F (ctacggatccatggtgagcaagggcgaggagga), (R1) HindIII-mC-236R (ctacaagctttcacttgtacagctcgtccat), (F2) BamHI-mC-1F and (R2) BamHI-mC-236R-linker (ctacggatccgccggtaccgcccttgtacagctcgtcca). The tandem fusion templates were used as PCR templates to generate cp-RFPs by systematically varying the N- and C-terminal primers. To construct cp-mKate 156-155, the XhoI site was replaced with the NheI site in the forward primer and the amplified fragment was cloned into the NheI and EcoRI sites of pRSET-A to prevent recreation of the cp-mKate 154-155 variant (Fig. 1B,C).
Truncated cp-mKate variants were constructed from the tandem fusion mKate template using a similar PCR strategy.
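The logic of the tandem template can be illustrated at the protein level with a short sketch. It is not the authors' cloning workflow: the function names are hypothetical, vector-derived residues (start codon, tag, restriction-site residues) are ignored, and it assumes the naming convention in which "cp 154-153" denotes a permutant whose new N-terminus is parent residue 154 and whose new C-terminus is parent residue 153.

```python
LINKER = "GGTGGS"  # linker peptide of the tandem template described above


def tandem_template(parent: str, linker: str = LINKER) -> str:
    """Return copy 1 + linker + copy 2 of the parent protein sequence."""
    return parent + linker + parent


def circular_permutant(parent: str, new_n_term: int, linker: str = LINKER) -> str:
    """Return the permutant that starts at parent residue new_n_term (1-based).

    Assumed naming: "cp 154-153" starts at residue 154 and ends at residue 153.
    The permutant is a window of the tandem template that runs from residue
    new_n_term of copy 1, through the linker, to residue new_n_term - 1 of copy 2.
    """
    if not 2 <= new_n_term <= len(parent):
        raise ValueError("new_n_term must lie between 2 and len(parent)")
    tandem = tandem_template(parent, linker)
    start = new_n_term - 1  # 0-based index of the new N-terminus in copy 1
    # Exclusive end index: copy 2 begins at len(parent) + len(linker) and we
    # keep its first new_n_term - 1 residues.
    end = len(parent) + len(linker) + new_n_term - 1
    return tandem[start:end]


if __name__ == "__main__":
    # Toy sequence and linker, chosen only to make the rearrangement visible.
    print(circular_permutant("ABCDEFGHIJ", 4, linker="xx"))
```

Every permutant produced this way has the length of the parent plus the linker, so only the primer positions on the tandem template change from one variant to the next.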

Screening of cp-RFPs in E. coli
E. coli BL21 Star (DE3) pLysS (Invitrogen) cells were transformed with either pRSET-cp-mCherry or pRSET-cp-mKate variant plasmids and plated on LB agar plates supplemented with 100 µg/mL ampicillin. After incubation at 37°C for 24 h, colonies were screened for red fluorescence using a widefield macroimaging system (OV100, Olympus, Japan; 545 nm excitation/570-625 nm emission). Colonies were rescreened after 1-7 days at 2-8°C. Images were analyzed for brightness using ImageJ and normalized to mCherry- or mKate-transformed colonies; vector control colonies were used as the zero fluorescence background.
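The normalization described above can be written as a small calculation. This is a minimal sketch under the assumption that each colony is summarized by a single mean intensity value; the function name and the example intensities are hypothetical and not taken from the study.

```python
def relative_brightness(sample: float, parent: float, vector: float) -> float:
    """Colony brightness as a percentage of the parent fluorescent protein.

    sample: mean intensity of the cp-RFP colony
    parent: mean intensity of the mCherry or mKate colony
    vector: mean intensity of the vector control colony (zero-fluorescence background)
    """
    signal = parent - vector
    if signal <= 0:
        raise ValueError("parent intensity must exceed the vector background")
    return 100.0 * (sample - vector) / signal


if __name__ == "__main__":
    # Hypothetical intensities in arbitrary units.
    print(relative_brightness(sample=60.0, parent=110.0, vector=10.0))
```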

Protein Expression and Purification
Overnight seed cultures of transformed BL21 Star (DE3) pLysS cells were used to inoculate (1:40 dilution) 500 mL Terrific Broth containing 100 µg/mL ampicillin. The cultures were incubated at 37°C (250 rpm) until an OD600 of ~1.0; 0.5 mM IPTG was then added and the cultures were incubated for 16 h at 20°C (250 rpm). Cells were collected by centrifugation at 6000 × g for 10 minutes at 4°C and resuspended in 20 mM HEPES (pH 7.4) with 350 mM NaCl and 0.1% Triton X-100. Cells were sonicated and cell debris was removed by centrifugation at 20,000 × g for 30 minutes at 4°C. The cleared lysate was mixed with 3 mL of Profinity IMAC Ni-Charged Resin (Bio-Rad) and incubated on a rocker platform for 5 minutes. The resin was poured onto a 0.8 × 4 cm chromatography column (Bio-Rad), washed 3 times with 20 mM HEPES (pH 7.4) buffer containing 350 mM NaCl and 0.1% Triton X-100, and 3 times with 20 mM HEPES (pH 7.4) containing 350 mM NaCl; each wash was followed by low speed centrifugation at 800 × g for 30 seconds. Proteins were eluted with 20 mM HEPES (pH 7.8) containing 350 mM NaCl and 300 mM imidazole, and the pooled protein fractions were desalted and concentrated using an Amicon Ultra-15 device (Millipore) with 20 mM HEPES (pH 7.4), refilling 2-3 times. Protein purity was checked by SDS-PAGE and the concentration was measured (Pierce BCA Protein Assay).

Crystallization and Structure Determination
For crystallization, the coding regions of cp-mKate 154-153 and cp-mKate 168-167 were amplified and cloned into a modified pET28a expression plasmid (Novagen), yielding an N-terminal hexahistidine SUMO fusion protein, with the tagged moiety cleavable with the protease Ulp-1 from S. cerevisiae.
E. coli BL21 (DE3) cells (Novagen) were transformed with cp-mKate plasmids. Protein expression and purification were performed according to protocols described previously [6]. Crystals were obtained by hanging drop vapor diffusion by mixing equal volumes of protein (~35 mg/ml) and reservoir solution, followed by incubation at 20°C. Crystals were obtained with the reservoir solution containing 0.1 M Tris-HCl pH 7.4, 0.2 M MgCl2 and 18% PEG 3350 (for cp-mKate 154-153) or 0.1 M magnesium formate dihydrate and 14% PEG 3350 (for cp-mKate 168-167). All crystals were cryo-protected using crystallization solutions supplemented with 20% xylitol, frozen in liquid nitrogen, and kept at 100 K during data collection.
Data sets were collected using synchrotron radiation at the Cornell High Energy Synchrotron Source (CHESS, Ithaca). Data reduction was carried out with the software package HKL2000 [14]. Phases were obtained from molecular replacement using the software package PHENIX [15] with the available structure of mKate (1.8 Å, pH 2.0, PDB code: 3BX9) [16] as the search model. Manual refinement in COOT [17] and minimization using PHENIX [15] yielded the final models with good geometry.

Circular Permutation of mCherry
To efficiently scan mCherry for promising circular permutation sites, we created multiple cp-mCherry clones by PCR amplification of a tandem-mCherry template (pRSET-tdmCherry, Methods) and directly compared the red fluorescence of bacterial colonies with that of colonies transformed with pRSET-mCherry. Initial cp-mCherry constructs targeted the loops of mCherry from the 4th to the 10th β-strands (Fig. 2). None of the cp-mCherry variant colonies exhibited appreciable fluorescence after overnight incubation at 37°C when compared with colonies transformed with wild-type mCherry, whereas after further incubation for 24 h at either room temperature or 2-8°C, fluorescence was detected in cp-mCherry 158-157, cp-mCherry 175-174, and cp-mCherry 193-192 colonies. Other variant colonies did not exhibit appreciable fluorescence even after 7 days at 2-8°C (data not shown).
Initial screening identified three sites in distinct loops connecting β-strands. We probed the flanking sequences at each of these sites by constructing and analyzing additional cp-mCherry variants. The brightest colony (cp-mCherry  ) exhibited only 1% of mCherry fluorescence after 24 h incubation at 37°C, indicating that the circularly permutated proteins matured slowly in E. coli. Based on this observation, all screens were scored after 24 h at 37°C and a further 72 h at 2-8°C (Table 1). Colony fluorescence data indicated that permissive sites were grouped in three loops, between the 7th and 8th β-strands, the 8th and 9th β-strands, and the 9th and 10th β-strands, extending to the ends of the flanking β-strands in each region (Fig. 2). Validation of the bacterial colony fluorescence assay with fluorescence measurements of purified cp-mCherry proteins in 20 mM HEPES (pH 7.4) indicated that cp-mCherry  is the brightest variant among all of the screened circular permutations, displaying approximately 90% relative brightness on a chromophore basis and 60.6% brightness on a protein basis compared to mCherry (Table 2). The difference between these measurements indicates suboptimal protein folding, as discussed below. cp184-mCherry was previously reported to display 18% of native mCherry fluorescence when proteins are expressed in E. coli, and 37% of fluorescence when the isolated proteins are compared [18]. Evolution of cp193-mCherry to cp193g7 (corresponding to cp-mCherry  ) by random mutagenesis improved the brightness of this variant, achieving 61% of mCherry brightness on a protein basis [19].

Circular Permutation of mKate
We performed a similar circular permutation analysis of the monomeric far-red protein mKate, which derives from the sea anemone Entacmaea quadricolor and shares structural similarities with mCherry. Based on structural and sequence similarities, we first tested three cp-mKate variants that corresponded to the brightest cp-mCherry variants in each region of permissive sites. Fluorescence was detected in colonies transformed with cp-mKate 151-150, cp-mKate 167-166, and cp-mKate 189-188. As with the cp-mCherry evaluation, we also screened the sequences flanking the three identified circular permutation sites in mKate for additional tolerant sites. In contrast to the slow development of fluorescence in cp-mCherry colonies, most of the fluorescent cp-mKate variants exhibited appreciable fluorescence after 24 h incubation at 37°C (Table 3). cp-mKate 189-188 was the brightest circular permutation variant in E. coli. However, analysis of purified proteins in 20 mM HEPES (pH 7.4) indicated that the brightness of cp-mKate 149-148, cp-mKate 151-150, cp-mKate 167-166, and cp-mKate 168-167 was quite high, exceeding 80% of the fluorescence of native mKate, indicating highly efficient fluorescent configurations. The 442 nm absorption peak (Fig. 3B), which is associated with green fluorescence emission at 532 nm, revealed a slight augmentation of green fluorescence in cp-mKates. The ratio of absorption at 588 nm and 442 nm (Table 4) indicated a relatively high percentage of green components in cp-mKate 154-153, cp-mKate 168-167, cp-mKate 187-186, and cp-mKate 189-188, whereas cp-mKate 149-148 displays a fluorescence spectrum similar to that of native mKate. The quantum yield at 588 nm for the cp-mKates is only slightly lower than that of mKate, whereas the A588/A280 ratio indicated that a number of cp-mKate molecules (cp-mKate 154-153, cp-mKate 187-186, cp-mKate 189-188) have a substantially lower absorption at 588 nm at matched protein concentration (Table 4).

Truncations of cp-mKate
To explore the possibility that the fluorescence of individual circular permutations is influenced by N- or C-terminal residues, we next determined the sensitivity of fluorescence to deletions at the amino and carboxy termini of the fluorescent cp-mKate permutants. A series of variants with truncated amino or carboxy ends was constructed. In this regard an individual construct could [...]

Crystal Structure of cp-mKate
X-ray crystallography on cp-mKate 154-153 and cp-mKate 168-167 allowed us to resolve the structures of the two permutants at 3.0 Å and 1.7 Å resolution, respectively (Table 6 and Fig. 4). The resolved structures are quite similar to that previously reported for wild-type mKate [13] (root mean square deviations of 0.31 Å and 0.23 Å, respectively), revealing an elliptical β-barrel that is properly folded in cp-mKate 168-167 and indistinguishable in tertiary structure from that of wild-type mKate (Fig. 4A). The linker that connects the two reoriented components of the circularly permutated molecule appears in the loop structure and is located at one end of the β-barrel (Fig. 4A). Minor conformational changes in cp-mKate were restricted to the loop structures at both ends of the β-barrel and may account for the reported red-green shift. The chromophore resides entirely in the cis conformation, however, with hydrogen bonding to Trp90, Arg92 and Ser143 (Fig. 4B).

Table footnotes: Fluorescence of cp-mCherry relative to mCherry with fixed protein concentration (BCA assay). f Our data; the published data are 72,000 [7], 78,000 [9]. g Published values [18,19], which were based on protein quantification (absorption at 280 nm).

Discussion
The generation of large numbers of circular permutations by tandem template PCR and fluorescence screening of bacterial colonies is a highly efficient approach to exploring the potential permutation variants of fluorescent proteins. We have systematically examined mCherry and mKate by creating circular permutants at the loop regions in which the greatest flexibility would be expected. In attempting to develop a highly fluorescent circularly permutated red protein, design strategies aimed at structurally mimicking cp-EGFP, which is the basis for the successful GCaMP Ca2+ sensors [3,20,21], proved unsuccessful, as the analogous cp variants of both mCherry and mKate failed to show significant fluorescence. Structural analysis revealed three highly homologous loop regions in mCherry and mKate, and these areas along with flanking sites were systematically explored. Each region tolerated circular permutation, but fluorescence varied markedly, with the most efficient permutation sites tending to occur within the central loop regions.
As shown in Table 1 and Table 2, the brightest mCherry variant (cp-mCherry 194-193) retained 60.6% of the brightness of native mCherry on a protein basis, but strong fluorescence was also observed for circular permutations cp-mCherry 159-158, cp-mCherry 160-159, cp-mCherry 175-174, cp-mCherry 190-189, cp-mCherry  and cp-mCherry  . All of the cp-mCherry variants displayed absorption spectra similar to that of native mCherry (Fig. 3A). Moreover, the red chromophores in cp-mCherry variants and mCherry are functionally similar, having nearly equivalent extinction coefficient and quantum yield values. Thus, the marked difference in the relative brightness of cp-mCherry variants when evaluated on a chromophore basis versus an equivalent protein concentration basis indicates that the decreased brightness of the cp-mCherry proteins is due to incomplete folding of a significant fraction of the protein. This interpretation is further supported by the fact that the 6 amino acid mutations in cp193g7, which have been shown to improve protein folding efficiency, resulted in a much higher fluorescence than observed in cp193-mCherry [19], similar to the brightness effect of the superfolder green fluorescent protein [22].
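The folding argument above can be made concrete with a back-of-the-envelope estimate. This is not a calculation reported by the authors: it assumes, as a simplification, that brightness on a protein basis equals brightness on a chromophore basis multiplied by the fraction of protein carrying a mature red chromophore. The function name is hypothetical; the input values are the approximately 90% and 60.6% figures quoted in the text.

```python
def apparent_mature_fraction(protein_basis_pct: float, chromophore_basis_pct: float) -> float:
    """Estimate the fraction of protein that carries a mature chromophore.

    Assumption (not from the source): brightness per protein equals
    brightness per chromophore times the mature fraction, relative to the parent.
    """
    if chromophore_basis_pct <= 0:
        raise ValueError("chromophore-basis brightness must be positive")
    return protein_basis_pct / chromophore_basis_pct


if __name__ == "__main__":
    # Brightest cp-mCherry variant: ~90% on a chromophore basis, 60.6% on a protein basis.
    fraction = apparent_mature_fraction(60.6, 90.0)
    print(f"apparent mature fraction: {fraction:.2f}")
```

Under this assumption roughly two thirds of the purified protein would be correctly folded and matured relative to native mCherry, which is consistent with the qualitative interpretation given above.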
In general, the fluorescence of cp-mCherry permutants developed more slowly than that of native mCherry, and achieved much higher brightness when bacteria were further incubated at lower temperature. By contrast, cp-mKate variants developed significant fluorescence in bacteria by 24 h at 37°C (Table 3). The brightest variant in the bacterial assay was cp-mKate 189-188, with colonies achieving 55.2% of the brightness of native mKate after 72 h incubation. However, the brightness of bacterial colonies did not strictly correlate with protein fluorescence, and proteins isolated from only modestly fluorescent colonies were among the brightest proteins identified. Thus, although cp-mKate 189-188 demonstrated the brightest colony fluorescence, the purified protein displayed 36.3% of native mKate fluorescence, whereas the relative fluorescence of cp-mKate 149-148, cp-mKate 151-150, cp-mKate 167-166, and cp-mKate 168-167 exceeded 80% at pH 7.4. The higher fluorescence of these proteins may relate to cytosolic factors such as pH, resulting in a higher percentage of properly folded protein and mature red chromophore in the purified proteins.
Individual sites for the modification of mKate that tolerate circular permutation or splitting have been previously reported. A voltage probe used a cp-mKate(180) [23] that contained 3 duplicated amino acids at the N-terminus and otherwise corresponded to the   [24] belongs to the highly fluorescent Loop 7-8 region (Table 3). We also confirmed the tolerance of selected sites to peptide insertions, which was found to be robust (data not shown). Although mKate and its variants have been widely used as red fluorescent proteins, these proteins exhibit green fluorescence to a variable degree [25]. We found that circular permutation of mKate enhanced green fluorescence in several constructs, augmenting absorption at 442 nm. As any green fluorescence of mKate proteins will contribute to the total fluorescence after alkaline denaturation, a precise measurement of the extinction coefficient of the red chromophore in cp-mKates by this method is not valid. As with mCherry circular permutants, red fluorescence quantum yield is only slightly decreased in cp-mKates. In the case of mCherry, the extinction coefficients of circularly permutated and wild-type proteins are quite similar, and the decrease in the A587/A280 absorption ratio indicates that the contribution of poorly folded proteins is the major factor in loss of brightness (Table 2 and Fig. 3A). However, for mKate variants with lower brightness than the native protein, similar A588/A280 ratios between these forms reflect not only the effect of improperly folded protein, but also the extent to which permutation has resulted in a red-green shift. Interestingly, we found very bright cp-mKates with substantial red-green shifts, such as cp-mKate 168-167, and very bright variants without a substantial shift, such as cp-mKate 149-148 (Table 4 and Fig. 3B).
The overall brightness of these constructs suggests minimal effects of incomplete folding at pH 7.4, whereas loss of brightness in variants with similar red-green shifts, such as seen in a comparison of cp-mKate 168-167 and cp-mKate 187-186, indicates less efficient folding and chromophore stabilization in the latter variant (Table 4 and Fig. 3B). cp-mKate 149-148 was the brightest protein identified, with over 90% of native brightness, but 3 additional variants (cp-mKate 151-150, cp-mKate 167-166, and cp-mKate 168-167) exceed 80% of mKate fluorescence, constituting new permutants available for implementation as sensors or complementation pairs. Moreover, the discovery of bright sensors with substantial green fluorescence, such as cp-mKate 168-167, or a stabilized cp-mKate 154-153, may be useful in the development of green/red ratiometric sensors.
Truncation variants showed that in each region of cp-mKate, there are minimum native N-terminal and C-terminal fragments required to maintain fluorescence. For example, the minimum C-terminal and N-terminal fragments were 192-231 and 1-181, respectively, for Loop 9-10 region cp-mKate. The ability to truncate variants without significant loss of fluorescence provides considerable flexibility in the linkage of these permutants to other functional peptides.
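The Loop 9-10 example can be expressed as a simple check on a candidate construct. This is an illustrative sketch only: the class and function names are hypothetical, only the Loop 9-10 limits (192-231 and 1-181) come from the text, and passing the check does not guarantee fluorescence, since the minimum fragments are described as required rather than sufficient.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MinimalFragments:
    """Minimum native fragments required in a cp-mKate region (native numbering)."""
    c_fragment_start: int  # the leading fragment must start at or before this residue
    n_fragment_end: int    # the trailing fragment must end at or after this residue


# Loop 9-10 region of cp-mKate: C-terminal fragment 192-231, N-terminal fragment 1-181.
LOOP_9_10 = MinimalFragments(c_fragment_start=192, n_fragment_end=181)


def retains_minimal_fragments(first_residue: int, last_residue: int,
                              minimum: MinimalFragments = LOOP_9_10) -> bool:
    """Check that a permutant running first_residue..231, linker, 1..last_residue
    still contains both minimum native fragments."""
    return (first_residue <= minimum.c_fragment_start
            and last_residue >= minimum.n_fragment_end)


if __name__ == "__main__":
    print(retains_minimal_fragments(189, 188))  # untruncated cp-mKate 189-188
    print(retains_minimal_fragments(193, 188))  # leading fragment cut past residue 192
```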
Finally, crystallization and structural analysis of cp-mKate 154-153 and cp-mKate 168-167 revealed the expected tertiary structure previously reported for mKate [16], with only slight variations around the permutation point. Future studies will be directed at determining the structural basis for fluorescence variation in circular permutants of mKate.
In summary, we report several highly fluorescent circularly permutated variants of mCherry and mKate. These variants are grouped in 3 regions and constitute the brightest red circularly permutated proteins with native protein sequences reported to date. The reported bright circularly permutated mKate proteins, and further stabilized mCherry variants, should provide additional candidates for the construction of red sensors and complementation tools.

Accession Numbers
Atomic coordinates and structure factors have been deposited in the RCSB Protein Data Bank under ID codes 3rwt and 3rwa.