Artificial Environments for the Co-Translational Stabilization of Cell-Free Expressed Proteins

An approach for designing individual expression environments that reduce or prevent protein aggregation and precipitation is described. Inefficient folding of difficult proteins in unfavorable translation environments can cause significant losses of overexpressed proteins as precipitates or inclusion bodies. A number of chemical chaperones including alcohols, polyols, polyions or polymers are known to have positive effects on protein stability. However, conventional expression approaches can use such stabilizing agents only post-translationally during protein extraction and purification. Proteins that already precipitate inside of the producer cells cannot be addressed. The open nature of cell-free protein expression systems offers the option to include single chemicals or cocktails of stabilizing compounds already into the expression environment. We report an approach for systematic screening of stabilizers in order to improve the solubility and quality of overexpressed proteins co-translationally. A comprehensive list of representative protein stabilizers from the major groups of naturally occurring chemical chaperones has been analyzed and their concentration ranges tolerated by cell-free expression systems have been determined. As a proof of concept, we have applied the method to improve the yield of proteins showing instability and partial precipitation during cell-free synthesis. Stabilizers that co-translationally improve the solubility and functional folding of human glucosamine 6-phosphate N-acetyltransferase have been identified and cumulative effects of stabilizers have been studied.


Introduction
Newly synthesized proteins are at great risk of aberrant folding even inside the cellular environment. Formation of aggregates or inclusion bodies composed of denatured proteins is commonly observed, in particular during overexpression of proteins [1]. In addition, protein denaturation could result from degradation mechanisms such as deamidation or oxidation. While refolding can sometimes help to rescue proteins, often high amounts of sample are lost and not useful for further applications. Living cells can support the stability of proteins by a number of organic substances also known as chemical chaperones [2]. For recombinant protein production, such chemicals are unfortunately of only limited value, as access to the inner cell compartment in conventional cell-based expression systems is restricted. Increasing intracellular concentrations of stabilizers by e.g. inducing specific solute transporters requires strong impacts such as osmotic shocks, which could cause dramatic changes in cell physiology and expression patterns [3,4]. Stabilization strategies are therefore usually confined to manipulations of growth conditions or to attempts at post-translational stabilization during protein extraction, when significant protein precipitation might already have occurred. Cell-free (CF) expression systems offer the new option to support the stability of expressed proteins co-translationally with a wide and diverse range of additives, while on the other hand being relatively sensitive to manipulations of reaction conditions such as incubation temperature. The open nature of CF reactions allows any tolerated chemical to be supplied directly into the protein expression environment [5]. Production protocols for unstable and difficult proteins can therefore be individually designed and stabilizers or mixtures thereof can be adjusted according to specific requirements.
Protein stabilizing agents comprise a wide range of chemicals including alcohols and molecular crowding agents such as polyethylene glycols (PEG). Many organisms accumulate small organic molecules in stress situations, which are generally called osmolytes [6,7]. These solutes act as chemical chaperones in the cell by preventing protein unfolding and improving protein thermostability. Major groups of osmolytes are polyols, amino acids, polyions or urea [2]. Prominent examples are the synthesis of betaine or trehalose in E. coli, glycerol in Saccharomyces cerevisiae and generally a number of different polyols and amino acid derivatives in yeasts and plants [7]. Hyperthermophilic microorganisms accumulate organic solutes such as betaine, ectoine or trehalose in high concentrations in response to heat stress [8,9]. The intracellular concentration of some of these compounds can even reach molar levels depending on medium osmolality and growth conditions [10].
CF reactions are ideal for screening experiments and have been applied for the expression of target libraries [11-13], protein evolution [14] or drug screening [15]. We have established a process based on extracts of E. coli cells and on the batch configuration that allows the screening of chemical chaperones. The tolerated concentration ranges of all additives were determined in linear screening schemes and by using shifted green fluorescent protein (sGFP) as expression monitor. Additives showing positive effects on sGFP fluorescence were then further analyzed in linear or in correlated screening schemes for their effects on two unstable proteins. The screening process for co-translational protein stabilization was exemplified with the human glucosamine 6-phosphate N-acetyltransferase (GNA1) and with the halogenase domain of the fungal CurA polyketide synthetase [16]. Improved solubility of the two proteins was in particular monitored with choline and L-arginine and cumulative effects of selected compounds were analyzed in correlated screens. The established process could provide guidelines and options for the preparative scale production of unstable proteins as well as for exploiting the stabilizing role of osmolytes for biotechnology purposes.

Chemicals
PEG 6000 was obtained from Applichem (Darmstadt, Germany). All other chemicals were from Sigma-Aldrich (Taufkirchen, Germany) and obtained at highest purity.

DNA Templates
Shifted green fluorescent protein (sGFP) was cloned into the pIVEX 2.3d vector and expressed with a C-terminal poly(His)10 tag using restriction free cloning. The coding region of human GNA1 (GenBank accession code BC012179.1) was first cloned into the vector pET21a. A C-terminal fusion of sGFP to GNA1 was then constructed by restriction free cloning. The forward primer had a 24 base overlap complementary to the 5′ end of the desired insertion site of the vector, followed by a start codon and 20-25 bases of the 5′ end of the GNA1 coding sequence. The reverse primer annealed to the vector with 24 bases complementary to the 3′ end of the insertion site. A pair of primers was furthermore designed in order to fuse the TEV-sGFP gene sequence after the GNA1 gene sequence (Table 1). The CurA halogenase domain was cloned into the vector pET28b (Merck Bioscience, Darmstadt, Germany) and expressed with an N-terminal His6-tag. The native protein sequence covers the amino acids 1599 to 1930 of CurA according to the sequence accessible at NCBI (GenBank accession code: AAT70096.1). DNA templates used for CF expression were transformed into E. coli strain DH5α and isolated with standard plasmid purification kits (Macherey-Nagel, Düren, Germany).

Compound Screening
Batch reactions were pipetted with a Tecan Freedom EVO 200 device equipped with an eight channel liquid handling arm (4×1,000 µl and 4×50 µl syringes) and two transport arms (Tecan, Männedorf/Zürich, Switzerland). The pipetting range was between 300 nl and 800 µl. Stock solutions of chemicals (Sigma-Aldrich, Steinheim, Germany) were prepared in either H₂O or 500 mM HEPES-KOH buffer, pH 8.2, and kept on cooling carriers at 4°C during pipetting. All additives were adjusted to pH 8.2 prior to addition by titration with either 500 mM HEPES-KOH, pH 8.2, or with 100 mM L-glutamic acid.
Linear concentration screening of selected single compounds as well as correlated concentration screening of two compounds was programmed by the custom designed EYES software based on the Gemini operating system. In a first step, the final concentration of each reaction compound was calculated and liquid classes for proper pipetting were defined. A mastermix of common compounds was then prepared and the screening compounds were pipetted first into the individual cavities of 96-well microplates, followed by appropriate volumes of the mastermix. Processing time for calculation and pipetting was approximately 30-45 min per complete 96-well microplate screen. During pipetting, the microplate was chilled at 4°C and the reactions were started by addition of template DNA with subsequent incubation at 30°C on a shaker.
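The layout logic of a correlated screen can be illustrated with a short sketch. This is not the EYES software, whose internals are not described here; it is a minimal Python example that assumes one compound is varied along the 8 rows and the other along the 12 columns of a 96-well plate, that concentrations are evenly spaced, and that the reaction volume is 50 µl. All function names, the stock concentrations and the reaction volume are hypothetical.

```python
from itertools import product

ROWS = "ABCDEFGH"        # 8 rows of a 96-well microplate
COLS = range(1, 13)      # 12 columns


def linspace(start, stop, n):
    """Return n evenly spaced values from start to stop (inclusive)."""
    if n == 1:
        return [float(start)]
    step = (stop - start) / (n - 1)
    return [start + i * step for i in range(n)]


def stock_volume_ul(final_conc, stock_conc, reaction_ul):
    """Stock volume needed to reach final_conc, from C1 * V1 = C2 * V2."""
    return final_conc * reaction_ul / stock_conc


def correlated_screen(range_a, range_b, stock_a, stock_b, reaction_ul=50.0):
    """Map each well to the concentrations and stock volumes of two compounds.

    Compound A varies along the rows, compound B along the columns.
    Concentration units are arbitrary but must match the respective stock.
    reaction_ul = 50 is an assumed reaction volume, not a prescribed one.
    """
    conc_a = linspace(range_a[0], range_a[1], len(ROWS))
    conc_b = linspace(range_b[0], range_b[1], len(COLS))
    plate = {}
    for (row, ca), (col, cb) in product(zip(ROWS, conc_a), zip(COLS, conc_b)):
        va = stock_volume_ul(ca, stock_a, reaction_ul)
        vb = stock_volume_ul(cb, stock_b, reaction_ul)
        plate[f"{row}{col}"] = {
            "conc_a": ca,
            "conc_b": cb,
            "vol_a_ul": va,
            "vol_b_ul": vb,
            # volume left for mastermix, template DNA and water
            "remaining_ul": reaction_ul - va - vb,
        }
    return plate
```

A linear screen of a single compound corresponds to the special case in which the second range is held at one value, for example `range_b=(0, 0)`.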

Protein Quantification
Proteins containing red shifted sGFP fusions were quantified by fluorescence measurement with an excitation wavelength of 484 nm and emission wavelength of 510 nm [5]. Potential effects of the analyzed chemicals on sGFP were determined by fluorescence measurements after incubating aliquots of 300 µg/ml purified sGFP with corresponding chemicals at 30°C for 4 hrs.
Alternatively, immunoblotting with anti-His antibodies or labeling of proteins with ³⁵S-methionine was used for quantification. ³⁵S-methionine mixed with non-labeled amino acids in a ratio of 1:40,000 was added into the reaction. After expression, samples were transferred into reaction tubes, centrifuged at 22,000×g for 10 min and the supernatant was precipitated with 10% trichloroacetic acid. After washing, the pellet and the precipitated supernatant were measured for radioactivity. Control experiments without any DNA template were used as background value for the radioassay.
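The way the radioassay readings translate into a soluble fraction can be sketched as follows. This is an illustration only, assuming that radioactivity is recorded as counts per minute and that the no-template control is subtracted separately from the supernatant and pellet readings; the function name and parameters are hypothetical.

```python
def soluble_fraction(supernatant_cpm, pellet_cpm,
                     background_supernatant_cpm=0.0,
                     background_pellet_cpm=0.0):
    """Fraction of labeled protein found in the supernatant.

    Background values come from a control reaction without DNA template.
    Readings below background are clipped to zero.
    """
    soluble = max(supernatant_cpm - background_supernatant_cpm, 0.0)
    insoluble = max(pellet_cpm - background_pellet_cpm, 0.0)
    total = soluble + insoluble
    if total == 0:
        raise ValueError("no signal above background in either fraction")
    return soluble / total
```

The background-corrected sum of both fractions gives the total protein production, so the same two measurements separate an effect on overall yield from an effect on solubility.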

Activity Assay of GNA1-sGFP
The 50 µl reactions were transferred into D-tubes (Novagen, Darmstadt, Germany), diluted with 50 µl buffer (50 mM Tris-HCl, pH 8.0) and dialyzed against 500 ml buffer with stirring at 4°C for 2 hrs. Samples were then centrifuged at 22,000×g for 10 min and supernatants were used for the enzyme activity assay. The assay was performed in 50 µl buffer containing 500 µM D-glucosamine 6-phosphate (GlcN6P), 500 µM AcCoA, 50 mM Tris-HCl, pH 8.0, 5.0 mM MgCl₂ and 10% glycerol in 96-well flat bottom plates. Approximately 0.4 µg unpurified GNA1-sGFP (determined by fluorescence) was added to start the reaction. After incubation at 30°C for 5 min, the reaction was terminated by adding 50 µl of stop solution (50 mM Tris-HCl, pH 8.0, and 6.4 M guanidine hydrochloride) and then 50 µl of CR buffer (50 mM Tris-HCl, pH 8.0, 1 mM EDTA, and 200 µM 5,5′-dithiobis(2-nitrobenzoic acid) (DTNB)). The amount of CoA produced by GNA1 was determined by 4-nitrothiophenolate formation and measured at 412 nm in a microplate reader (Fisher Scientific, Schwerte, Germany). A blank reaction using CF reactions without GNA1-sGFP template was used as control. The amount of CoA produced was calculated using the extinction coefficient of DTNB at 30°C (13,800 M⁻¹ cm⁻¹).
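The conversion from absorbance to CoA amount follows the Beer-Lambert law, c = A / (ε · l). The sketch below is a hedged illustration: the extinction coefficient is the value stated above, whereas the optical path length of the well, the read volume of 150 µl (assay, stop solution and CR buffer combined) and the function names are assumptions that must be replaced by the values of the actual plate and reader.

```python
EPSILON_M_CM = 13_800.0  # extinction coefficient at 412 nm, in 1/(M*cm)


def coa_nmol(a412_sample, a412_blank, path_cm, volume_ul=150.0):
    """CoA in the well (nmol) from blank-corrected absorbance at 412 nm.

    path_cm is the optical path length in the well (plate dependent).
    volume_ul = 150 is an assumed total read volume.
    """
    delta_a = a412_sample - a412_blank
    conc_molar = delta_a / (EPSILON_M_CM * path_cm)   # Beer-Lambert law
    return conc_molar * volume_ul * 1e-6 * 1e9        # mol/l * l -> nmol


def specific_activity(nmol_product, minutes, protein_ug):
    """Specific activity in nmol product per min per microgram of enzyme."""
    return nmol_product / (minutes * protein_ug)
```

Dividing the product amount by the incubation time and by the fluorescence-derived enzyme amount normalizes the activity, so that differences between additives reflect the quality of the folded enzyme rather than its yield.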

Basic CF Reaction Set Up for Robotic Screening Applications
The production of fluorescent sGFP was used as a fast monitor for setting up the basic reaction protocol and for the subsequent evaluation of compound compatibility. In order to reduce pipetting time, a number of standard reaction compounds including salts, polyamines and some precursors were combined in a premix (Table 2). S30 extract, enzymes, unstable reagents and screening compounds were kept separately. The premix is stable at −80°C for at least one year and remains active after repeated freeze-thaw cycles [17]. Protein synthesis with the basic batch protocol is effective over 2 hrs and then reaches a plateau at production levels of approximately 0.5-0.8 mg sGFP per ml of batch reaction. Folding of sGFP is oxygen dependent and the plates were therefore further incubated for 2 hrs after the reaction prior to fluorescence determination.
Working lists for programming and pipetting were generated by the specific EYES software and optimal concentration ranges for several basic compounds were determined by linear or correlated concentration screening (Table 2). The S30 extract had a well-defined optimum at approximately 31% final concentration (Fig. 1A). Mg²⁺ ions are known to be critical for CF reactions and optimal concentration ranges were determined between 20 and 28 mM depending on the S30 extract preparation. Reducing conditions could become important depending on the nature of the synthesized target proteins. DTT as reducing agent is tolerated in the reaction at least up to 10 mM final concentration, while it could also be completely omitted without significant effects. NH₄⁺ ions were tolerated at least up to 30 mM final concentration (Fig. 1A). Protein expression increased with plasmid DNA template concentrations up to 2-4 ng/µl reaction and then remained at a relatively stable plateau. The DNA template concentration optimum appeared to be independent of the coding regions of sGFP or GNA1-sGFP (Fig. 1B).
Mg²⁺ ions could interact with other negatively charged compounds of the reaction such as NTPs or PEP and correlated optimal concentration ranges were analyzed (Fig. 2). With the combination of NTP mix and Mg²⁺, optimal efficiency was determined within the range of 1-2 fold NTP mix and 20-26 mM Mg²⁺ (Fig. 2A). With the combination of PEP and Mg²⁺, optimal concentrations ranged from 36-50 mM and 24-30 mM, respectively (Fig. 2B). After establishing reaction conditions, the protein production in the CF batch reaction could be scaled up to at least 1 ml reaction volumes without loss of efficiency.
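One possible way to read optimal concentration ranges from a correlated screen is to collect all wells whose signal lies within a chosen fraction of the best well and to report the span of their concentrations. The following sketch assumes this convention; the 90% threshold, the data structure and the function name are hypothetical and do not describe how the ranges above were derived.

```python
def optimal_window(readings, threshold=0.9):
    """Concentration ranges of wells reaching threshold * maximal signal.

    readings maps (conc_a, conc_b) tuples to a fluorescence value.
    Returns ((min_a, max_a), (min_b, max_b)), i.e. the bounding box of
    all wells above the cutoff, not a fitted optimum.
    """
    if not readings:
        raise ValueError("no readings supplied")
    cutoff = threshold * max(readings.values())
    hits = [concs for concs, signal in readings.items() if signal >= cutoff]
    a_values = [a for a, _ in hits]
    b_values = [b for _, b in hits]
    return (min(a_values), max(a_values)), (min(b_values), max(b_values))
```

Because the result is a bounding box, wells inside the reported ranges are not guaranteed to exceed the cutoff when the two compounds interact strongly, so inspecting the full grid remains advisable.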

PEG Derivatives as CF Additives
PEG derivatives are known to act as molecular crowding agents by binding water, thus making other reaction compounds more readily accessible. PEGs with increasing average molecular weights from 200 up to 8,000 Da were added and, with the exception of PEG 400, resulted in an increased sGFP fluorescence of 10-20% at final concentrations of 2-3% (Fig. 3A). The addition of PEG 10,000 resulted in an instant precipitation of reaction components, presumably due to protein denaturation. PEG and other molecular crowding agents have been used to condense reactants and to mimic cellular environments in CF systems based on wheat germ extracts [18,19]. A more detailed study revealed that PEG 8,000 resulted in increased CF transcription but rather reduced CF translation [18], and different effects on proteins correlated with the PEG molecular weight are also known [20]. However, a systematic analysis of PEGs with different molecular weights in CF systems has not been made yet.

Alcohols as CF Additives
Organic solvents are usually denaturing by disrupting hydrophobic contacts between the nonpolar side chains of amino acids. These effects are concentration dependent and some solvents such as alcohols or ketones can even act as protein stabilizers at lower concentrations, while they become denaturants at high concentrations [21]. A further important parameter for stabilizing effects is the chain length of alcohols. We have analyzed alcohols of chain lengths from one to six carbon atoms for their compatibility with our CF system and for their effects on sGFP fluorescence (Fig. 3B). With the exception of ethanol, all analyzed alcohols had concentration dependent negative effects on sGFP fluorescence, most likely due to inhibition of factors essential for the basic protein expression machinery [22]. With pentanol and hexanol, even the lowest supplied concentration resulted in almost complete inhibition of sGFP expression, and precipitate formation indicated substantial denaturation of proteins from the S30 extract. Addition of ethanol at 6-8% final concentration resulted in a 60% increase of sGFP fluorescence corresponding to an expression of approximately 800 µg/ml (Fig. 3B).
Our results are consistent with previous observations that denaturation effects of alcohols are correlated with their chain length and concentration. Low concentrations of ethanol in between 0.1-2.5% stabilized proteins and inhibited the mechanical denaturation of hemoglobin or the degradation of cytosolic proteins [23]. In the E. coli CF system, ethanol appears to be most promising in promoting protein expression as a result of either stabilizing the expression machinery and/or improving the folding of sGFP. Methanol, isopropanol and butanol had only minor positive effects but were tolerated to some extent up to 4-6% final concentration. Alcohols are frequently used in combination with detergents in order to stabilize hydrophobic membrane proteins in crystallization studies. The CF compatible alcohols might thus be considered as potential stabilizers of these protein types in future expression approaches.

Natural Cellular Stabilizers as CF Additives
Living cells can produce a number of small molecules in order to stabilize intracellular proteins in extreme environmental conditions [10]. The major classes of these compounds are (i) polyols/sugars, (ii) amino acids and (iii) polyions. Polyols can protect proteins against a variety of denaturation and degradation mechanisms including aggregation, thermal denaturation, deamidation and oxidation [24,25]. Further applications are preventing protein dehydration upon freeze-drying by serving as water substituent through hydrogen bonding. Sucrose and glycerol have become standard stabilizers for the long-term storage of protein samples. Protein protection by individual polyols can act in different ways and even mixtures might therefore be considered for optimal effects [26]. Amongst the most frequent polyols synthesized in various organisms are sucrose, glycerol, D-trehalose, D-mannose or D-sorbitol [27]. For lysozyme, D-mannitol was found to prevent aggregation, sucrose acted against deamidation and lactose reduced oxidation [28].
We have analyzed the compatibility of glycerol, sucrose, D-sorbitol, D-trehalose and D-mannose with our CF system by monitoring fluorescent sGFP expression (Table 3). D-sorbitol, D-trehalose and D-mannose were dose dependent inhibitors of fluorescent sGFP production starting already at 1% final concentration in the reaction (Fig. 4A). In contrast, sucrose and glycerol were tolerated up to 8% and 4% final concentration, respectively. Both compounds could thus be considered as potential CF additives in the determined tolerated concentration ranges.
Amino acids can have a dual role in CF expression systems as they primarily serve as substrates for translation, but could also help to stabilize the expression machinery and/or the synthesized target protein. Proteinogenic amino acids such as L-arginine and L-glutamic acid in addition to some non-proteinogenic amino acids such as trans-OH-L-proline, N-acetyl-L-lysine and L-carnitine are known as protein stabilizers in vitro [29] and the concentration ranges compatible with the CF system were determined by fluorescent sGFP monitoring (Fig. 4B). Overall, all tested amino acids showed beneficial effects with some 10-20% increased sGFP fluorescence. The concentration optima differed and ranged from 50-80 mM for glutamic acid, 20-90 mM for trans-OH-L-proline, 20-50 mM for L-arginine, 30-50 mM for N-acetyl-L-lysine, 30-50 mM for L-carnitine and 50-70 mM for sarcosine. In particular N-acetyl-L-lysine and L-carnitine rapidly inhibit sGFP expression above their optimal concentrations, while the concentration optima of the other amino acids have a more Gaussian appearance.
The polyions betaine, choline and ectoine are synthesized by organisms living in extremophile environments for the stabilization of cytoplasmic proteins. However, even E. coli is able to synthesize high amounts of betaine under some conditions [30]. Stabilizing effects have been shown with the inhibition of the in vitro insulin amyloid formation by ectoine or betaine [25]. For betaine and ectoine, a high tolerance of up to approximately 150 mM and 100 mM was determined in the CF system (Fig. 4C). However, neither compound had a positive effect on sGFP fluorescence. In contrast, an approximately 30% increased sGFP fluorescence was measured in presence of 4-14 mM choline. The general

Improving the Soluble CF Expression of Human GNA1 and of CurA Halogenase by Addition of Stabilizers
As a first proof of principle, we attempted to improve the CF expression of two targets known to partly precipitate as aggregates. The human glucosamine 6-phosphate N-acetyltransferase (GNA1) is required for the de novo synthesis of N-acetyl-D-glucosamine-6-phosphate, an essential precursor in UDP-GlcNAc biosynthesis [31]. The protein was synthesized with a C-terminal fusion to sGFP. The 40.5 kDa halogenase domain of the polyketide synthetase CurA from Lyngbya majuscula was synthesized with an N-terminal poly(His)6-tag [16]. Efficient CF expression protocols for both enzymes have been established with yields exceeding 1 mg/ml. However, solubility is limited and approximately 30-50% of the expressed proteins precipitate during the reaction.
Considering the screening results of the analyzed types of additives, only representative compounds shown to be tolerated by the CF system were analyzed for potential stabilizing effects on the two proteins. The addition of sucrose, D-sorbitol, ectoine or betaine in the tolerated concentration ranges had no effects on the soluble expression of GNA1-sGFP as monitored by sGFP fluorescence (data not shown). However, either 10 mM choline or 10 mM L-arginine increased the GNA1-sGFP fluorescence by approximately 20% (Fig. 5A). The addition of choline and L-arginine could either stabilize the general expression machinery resulting in higher yields, and/or they could stabilize the synthesized protein resulting in increased solubility. In order to investigate the reason for increased GNA1-sGFP fluorescence, the total protein production in the CF reaction was quantified by ³⁵S-Met incorporation measurements. In addition, CF sGFP expression was included as a second reference reaction and the specific enzymatic activity of GNA1-sGFP was furthermore determined. The total sGFP expression as determined by ³⁵S-Met incorporation was increased with either 10 mM L-arginine or 10 mM choline by 10% and 20%, respectively (Fig. 5A). In contrast, with GNA1-sGFP only a slight increase was detectable with 10 mM choline, while even a minor reduction of the total yield was measured with 10 mM L-arginine. Moreover, the increase in GNA1-sGFP fluorescence correlated with higher specific activity of the GNA1 enzyme upon addition of 10 mM choline into the CF reaction. Choline therefore appears to have multiple stabilizing effects in the CF expression reaction. The increased total protein production indicates a basic beneficial effect on the CF expression machinery that also at least partly contributes to the increased fluorescence of sGFP and GNA1-sGFP in the soluble protein fractions.
However, an additional stabilizing effect of choline on the synthesized proteins is measured by the observed increased specific activity of GNA1. Accordingly, also the effect of L-arginine on sGFP fluorescence appeared to be cumulative based on higher expression as well as on better solubility. This is in accordance with previous observations of better folding of GFP in presence of L-arginine [32]. Interestingly, L-arginine increased solubility of GNA1-sGFP but not its total expression or specific activity. Therefore, even basic beneficial effects of stabilizers on the CF expression machinery appear to be template dependent and might be determined by improved formation of e.g. specific translation initiation complexes.
Choline and L-arginine as individual additives improved the CF production of soluble GNA1-sGFP by some 10-20%. We therefore analyzed whether beneficial compounds could have synergistic effects if added in a cocktail. Surprisingly, the combination of choline with L-arginine in correlated concentration screens was not cumulative and even some reduction in solubility was observed (data not shown). However, correlated screening of further stabilizer combinations identified a synergistic effect of choline with PEG 8,000, resulting in 50-60% increased fluorescent GNA1-sGFP production when a concentration range of 8-16 mM choline and 2-3% PEG 8,000 was used (Fig. 5B). This result demonstrates that effects of stabilizer combinations are hard to predict and underlines the need for a systematic screening approach.
As a further target, the soluble CF expression of the halogenase domain of CurA was analyzed (Fig. 6). The reactions were supplemented with either 10 mM choline, 10 mM L-arginine or 6% D-trehalose and the protein in the supernatant was quantified after the reaction by immunoblotting. In accordance with the results obtained with sGFP, the addition of L-arginine and choline again resulted in 8% and 25% increased soluble expression, respectively, while the presence of D-trehalose was inhibitory.

Conclusions
Small molecules belonging to different groups of natural chemical chaperones can be added into CF expression reactions and act as general or specific stabilizers. This work has defined the working ranges in CF expression systems for a representative variety of the most commonly employed chemical chaperones. The concentrations of the supplied chemicals tolerated by the CF system differ from those reported for living organisms, and a number of compounds tolerated in vivo rapidly became inhibitory to the CF expression machinery. As the most promising stabilizing agents for the analyzed proteins we could define ethanol, PEG derivatives, amino acids and choline. However, additional polyols and polyions are also tolerated at relatively high concentrations and might therefore be useful in expression approaches with other target proteins. We could show that stabilizing effects can depend on the nature of the target protein as well as on the combination of several additives. Modes of action of the analyzed stabilizers include increased expression, better solubility as well as improved stability and could be exclusive or cumulative. We therefore propose and have established an empirical screening approach in order to define the optimal concentration balance of stabilizers in individual CF protein expression approaches. The presented CF screening platform will become accessible to the scientific community in the European INSTRUCT network (www.structuralbiology.eu).