Recombinant Production of the Amino Terminal Cytoplasmic Region of Dengue Virus Non-Structural Protein 4A for Structural Studies

Background. Dengue virus (DENV) is a mosquito-transmitted positive single strand RNA virus belonging to the Flaviviridae family. DENV causes dengue fever, currently the world's fastest-spreading tropical disease. Severe forms of the disease like dengue hemorrhagic fever and dengue shock syndrome are life-threatening. There is neither a specific treatment nor an anti-DENV vaccine. Our recent data suggest that the amino terminal cytoplasmic region of the dengue virus non-structural protein 4A (NS4A), comprising amino acid residues 1 to 48, forms an amphipathic helix in the presence of membranes. Its amphipathic character was shown to be essential for viral replication. NMR-based structure-function analysis of the NS4A amino terminal region depends on its milligram-scale production and labeling with NMR-active isotopes.
Methodology/Principal Findings. This report describes the optimization of a uniform procedure for the expression and purification of the wild type NS4A(1-48) peptide and a peptide derived from a replication-deficient mutant, NS4A(1-48; L6E, M10E), with disrupted amphipathic nature. A codon-optimized, synthetic gene for NS4A(1-48) was expressed as a fusion with a GST-GB1 dual tag in E. coli. Tobacco etch virus (TEV) protease-mediated cleavage generated NS4A(1-48) peptides without any artificial overhang. Using the described protocol, up to 4 milligrams of the wild type or up to 5 milligrams of the mutant peptide were obtained from a one-liter culture. Isotopic labeling of the peptides was achieved and initial NMR spectra were recorded.
Conclusions/Significance. Small molecules targeting amphipathic helices in the related hepatitis C virus were shown to inhibit viral replication, representing a new class of antiviral drugs. These findings highlight the need for an efficient procedure that provides large quantities of the amphipathic helix-containing NS4A peptides. The double tag strategy presented in this manuscript answers these needs, yielding amounts that are sufficient for comprehensive biophysical and structural studies, which might reveal new drug targets.


Introduction
Dengue fever is currently the fastest-spreading tropical disease in the world, with more than 2.5 billion people at risk. Dengue virus (DENV), the causative agent of this disease, is estimated to infect 390 million people across all continents each year [1]. DENV is transmitted by the bite of female mosquitoes. DENV causes flu-like symptoms in most infected patients, but severe forms of the disease like dengue hemorrhagic fever and dengue shock syndrome are life-threatening. There is neither a specific treatment nor an anti-DENV vaccine.
DENV is a positive single strand RNA virus of the Flaviviridae family. Its genome is translated into a single polyprotein, which is subsequently cleaved into three structural and seven non-structural (NS) proteins. DENV replicates its RNA genome in replication complexes (RCs), which are associated with modified intracellular membranes [2]. While the viral structural proteins compose the mature virion, the NS proteins together with the viral RNA and host factors generate the viral RC. NS4A is an endoplasmic reticulum (ER)-localized, 16 kDa transmembrane protein, which is an essential component of the viral RC. NS4A has been suggested to be involved in inducing host membrane alterations that resemble the virus-induced membrane structures [3]. A membrane remodeling function of NS4A was also reported in other flaviviruses [4,5].
Recently, we identified a conserved amphipathic helix (AH) in the cytoplasmic, amino terminal region of NS4A (amino acid residues 1-48) that is essential for viral replication [6]. AHs are helical protein regions in which one face of the helix is hydrophobic while the opposite face is hydrophilic [8]. In contrast to transmembrane domains, which span the membrane bilayer, AHs often serve as in-plane membrane anchors [7]. Peptides that are predicted to form amphipathic helices are frequently unstructured in buffer devoid of membranes but adopt a helical conformation upon association with membranes or in a membrane-mimicking environment. In addition, AHs can contribute to membrane curvature [8][9][10] or mediate protein-protein interactions. AHs in proteins of several positive strand RNA viruses were shown to be essential for the viral life cycle [11][12][13][14][15][16].
The use of direct-acting antivirals is one of the most important new therapeutic approaches for treating infections with hepatitis C virus (HCV), a close relative of DENV. Notably, AHs in HCV NS4B and NS5A were found to be amenable to pharmacological inhibition [11,17], indicating that AHs could serve as novel antiviral targets. Our recent data emphasize the significance of the amino terminal AH of NS4A in the DENV life cycle and mark it as a potential target for the design of novel antiviral therapies [6]. In this report, we describe a novel protocol for recombinant production of a peptide comprising the first 48 amino acids of NS4A, NS4A(1-48), which contains the above-mentioned AH. Production of milligram amounts of NS4A(1-48) is a crucial prerequisite for biophysical and in particular NMR experiments on NS4A(1-48). Such studies are urgently needed as a starting point for the rational design of new strategies to inhibit the activity of this NS4A region in the virus life cycle.
It is a common strategy to express short peptides as a fusion with another protein in order to avoid the well-known degradation of short peptides in bacterial cells. However, expression of fusion proteins is complex and does not always follow predictions. Hence, it is necessary to test several fusion strategies in search of a highly efficient protocol that works for a given peptide. Several different proteins have been described in the literature as fusion tags for peptide production [18,19]. We tested glutathione S-transferase (GST), the immunoglobulin-binding domain of streptococcal protein G (GB1), and yeast ubiquitin as fusion partners of NS4A(1-48) in our search for an effective production strategy for this peptide in E. coli. For DENV NS4A(1-48), a dual fusion tag in combination with a tobacco etch virus (TEV) protease cleavage site gave the highest peptide yields. Our aim was to establish a general protocol for production and purification of milligram amounts of wild type NS4A(1-48) as well as of various mutated forms of the peptide. Such mutants are useful for an in-depth structure-function analysis of proteins.

Construction of NS4A(1-48) expression vectors
NS4A(1-48) plasmid constructs with a single fusion tag. We first cloned NS4A(1-48) into a modified pTKK19ubi vector (see below), which codes for an amino terminal yeast ubiquitin fusion. The peptide bond between ubiquitin and a defined peptide insert in the expressed fusion protein can be cleaved by yeast ubiquitin hydrolase. This strategy enables the production of an NS4A(1-48) peptide without any artificial overhangs. In order to simplify cloning of inserts into the pTKK19ubi vector [20], we modified the original pTKK19ubi by introducing silent mutations that generate a unique SacII endonuclease cleavage site at the carboxy-terminal end of the ubiquitin coding sequence. In detail, the original codons for Leu 73, Arg 74 and Gly 75 in the ubiquitin coding sequence of pTKK19ubi are substituted with ctc, cgc, and ggc, respectively, in the new pTKK19ubi/SacII. Plasmid pUbi-NS4A(1-48) was obtained by cloning an optimized sequence coding for amino acids 1 to 48 of NS4A from dengue virus type 2 (GenBank: NP739588) into pTKK19ubi/SacII. Gene optimization was performed using the GeneOptimizer® software provided by GeneArt® [21]. To construct the plasmid, four complementary synthetic oligonucleotides (sequences 1 to 4, Table 1) were annealed, ligated and subjected to PCR amplification with two shorter primers (9 and 10, Table 1) containing the restriction sites for SacII and SalI. The purified fragment was cut with SacII and SalI and then ligated into dephosphorylated pTKK19ubi/SacII. Second, a plasmid for expression of the NS4A peptide as a protein fusion with the immunoglobulin-binding domain of streptococcal protein G (GB1) at the amino terminus was obtained by cloning the optimized sequence coding for NS4A(1-48) into the vector pGEV2 [22] using the four synthetic oligonucleotides 1 to 4 (Table 1). PCR amplification was performed with different primers (5 and 6) in order to introduce a nucleotide sequence that codes for the first six residues of a tobacco etch virus (TEV) protease cleavage site (ENLYFQ) in front of the NS4A coding sequence. The primers contained BamHI and XhoI restriction sites at their 5′ and 3′ ends, respectively. This fragment was ligated into dephosphorylated pGEV2 to yield pGEV-NS4A(1-48). As TEV protease recognizes E-X-X-Y-X-Q↓(G/S), and the first amino acid of the NS4A peptide is serine, an NS4A(1-48) peptide without any artificial overhang can be produced with this vector as well.
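The silent mutations described above for pTKK19ubi/SacII can be checked with a few lines of code. The following is a minimal sketch, not part of the original cloning workflow: it only confirms that the three replacement codons still encode Leu-Arg-Gly and that their concatenation contains the SacII recognition sequence CCGCGG. The variable and function names are illustrative assumptions.

```python
# Replacement codons reported in the text for Leu 73, Arg 74 and Gly 75 of ubiquitin
new_codons = ["ctc", "cgc", "ggc"]

# Minimal slice of the standard genetic code, sufficient for this check
CODON_TABLE = {"ctc": "L", "cgc": "R", "ggc": "G"}

# SacII recognition sequence (CCGCGG), written in lowercase like the codons
SACII_SITE = "ccgcgg"


def translate(codons):
    """Translate a list of codons using the minimal table above."""
    return "".join(CODON_TABLE[codon] for codon in codons)


# Concatenate the codons and look for the restriction site
stretch = "".join(new_codons)
site_position = stretch.find(SACII_SITE)  # -1 would mean "site not present"

print("encoded residues:", translate(new_codons))
print("SacII site found at offset:", site_position)
```

The site spans the codon boundaries, which is why it only appears once all three codons carry the chosen synonymous bases.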
NS4A(1-48) plasmid construct with a dual fusion tag. Efficient separation of NS4A(1-48) from GB1 after TEV protease cleavage was impossible when utilizing pGEV-NS4A(1-48). In order to avoid this problem, we extended the GB1-NS4A(1-48) fusion by an amino terminal GST affinity tag. For this purpose, the GB1-NS4A(1-48) sequence was amplified with a 5′-primer containing an EcoRI site (7, Table 1) and a commercially available T7-terminator primer (8, Table 1). The restricted insert was cloned into the pGEX4T-2 vector using the EcoRI and XhoI sites. The obtained vector pGEX-GB1-NS4A(1-48) was used to produce the GST-GB1-NS4A fusion protein in E. coli.
Tail-to-tail mutagenesis [23] was used to obtain the mutant NS4A peptide construct carrying the L6E and M10E substitutions, pGEX-GB1-NS4A(1-48; L6E, M10E), using KAPA HiFi DNA polymerase (Kapa Biosystems, Cambridge, MA, USA). The forward primer (11, Table 1) carried the mutations while the reverse primer (12, Table 1) annealed to the opposite strand, with their 5′-ends adjacent to each other. Subcloning into pGEX4T-2 was performed as described for the wild type peptide. The correct sequence of all constructs was verified by DNA sequencing (Seqlab, Göttingen, Germany).
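The substitution notation used for the mutant (L6E, M10E) can be applied to a peptide sequence programmatically, which is convenient when planning further mutants. The sketch below is an illustration only: the helper `apply_substitutions` and the 12-residue test sequence are hypothetical and are not the NS4A(1-48) sequence. The helper refuses to proceed if the stated wild type residue does not match, which guards against off-by-one numbering errors.

```python
import re


def apply_substitutions(sequence, substitutions):
    """Apply point substitutions such as 'L6E' (1-based positions) to a sequence."""
    residues = list(sequence)
    for substitution in substitutions:
        match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", substitution)
        if match is None:
            raise ValueError(f"cannot parse substitution {substitution!r}")
        wild_type, position, replacement = match.group(1), int(match.group(2)), match.group(3)
        if not 1 <= position <= len(residues):
            raise ValueError(f"position {position} is outside the sequence")
        if residues[position - 1] != wild_type:
            raise ValueError(
                f"expected {wild_type} at position {position}, "
                f"found {residues[position - 1]}"
            )
        residues[position - 1] = replacement
    return "".join(residues)


# Hypothetical toy peptide with Leu at position 6 and Met at position 10
toy_peptide = "AAAAALAAAMAA"
toy_mutant = apply_substitutions(toy_peptide, ["L6E", "M10E"])
print(toy_mutant)
```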

Purification of GST-GB1-NS4A(1-48) fusion proteins and their proteolytic cleavage
Pellets of pGEX-GB1-NS4A(1-48) or pGEX-GB1-NS4A(1-48; L6E, M10E) transformed cells harvested from 1 L of expression culture were thawed and resuspended on ice in 25 ml lysis buffer (50 mM Tris-HCl pH 8.0, 150 mM NaCl, 1 mM DTT, 0.5 mM EDTA) supplemented with protease inhibitors (Complete mini, Roche, Penzberg, Germany). Cells were lysed using 4-5 cycles in a Microfluidizer M-100P (Microfluidics, Worcestershire, UK). The crude lysate was clarified by centrifugation (50,000 × g, 4 °C, 30 min). Subsequent purification steps were performed at 22 °C. The supernatant was applied onto a gravity flow column (column volume (CV) of 5 ml) packed with GSH sepharose 4B (GE Healthcare) and pre-equilibrated with lysis buffer. Unbound material was removed by washing with 10 CV of lysis buffer. When utilizing the mutant plasmid, on-column cleavage of the GST-GB1-NS4A(1-48; L6E, M10E) fusion protein with 10 µM TEV protease (corresponding to a peptide to TEV ratio of about 100) was performed in standard buffer at 22 °C overnight. The flow-through as well as the wash fractions (5 CV) containing the free NS4A(1-48; L6E, M10E) peptide and TEV protease were pooled and concentrated to 5 ml using a Vivaspin 20 centrifugal concentrator (MWCO: 3 kDa, Sartorius, Göttingen, Germany). Separation of TEV protease from the NS4A mutant peptide was accomplished with a HiLoad 16/60 Superdex 75 prep grade column (GE Healthcare) on an ÄKTA purifier system at 22 °C with a flow rate of 1 ml/min. Interestingly, the GST-GB1-NS4A(1-48) wild type fusion protein could not reliably be digested under these standard conditions. Here, cell lysis had to be performed in lysis buffer supplemented with 0.5 M urea. GSH binding and on-column cleavage had to be done in the presence of 0.5 M urea and at a higher TEV concentration (50 µM, peptide to TEV ratio of about 10). Flow-through and wash fractions were collected after TEV digestion and concentrated using a Vivaspin 20 centrifugal concentrator. TEV protease and urea were removed from the concentrated peptide solution using a HiLoad 16/60 Superdex 75 prep grade column, equilibrated in 50 mM Tris-HCl

TEV protease expression
TEV protease was produced using the plasmid pRK793 (Addgene). This plasmid codes for TEV protease fused to the highly soluble maltose-binding protein; the fusion cleaves itself in vivo to generate a His-tagged TEV protease catalytic domain. Because of its S219V mutation, this enzyme does not show the autoinactivation typical of the wild type, which allows the production of a stable and highly active enzyme. Expression and purification of the protease were performed as described [24].

Mass spectrometric analysis
The identity of the NS4A peptides was confirmed using mass spectrometry. NS4A(1-48) peptide bands were excised from SDS polyacrylamide gels and subjected to tryptic digestion. Peptide fragment analysis was performed using an LC-MS/MS instrument (nanoUPLC-QTOF Premier, Waters Corp., Milford, MA, USA). These experiments were performed at the technology platform Integrated Functional Genomics of the Interdisciplinary Center for Clinical Research of the University of Münster, Germany.

Results and Discussion
Codon adaptation of the NS4A(1-48) coding sequence
Only negligible amounts of NS4A peptide could be obtained when the original DENV type 2 cDNA sequence encoding the first 48 residues of NS4A, NS4A(1-48), was used in a pGEX expression vector. This may be explained by the fact that the viral sequence encoding NS4A(1-48) is not optimal for expression in E. coli cells because of differences in codon usage. Specifically, several codons for leucine, isoleucine and arginine occurring in the viral gene sequence are "rare codons" in E. coli (Fig. 1). Rare codons are known to cause translational problems such as low protein expression and frame shifts [28]. Thus, the codons were optimized for expression in E. coli cells, and chemically synthesized oligonucleotides were used to create a synthetic NS4A(1-48) DNA cassette, which was then used to generate several NS4A(1-48) expression constructs.
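A rare-codon scan of the kind summarized in Fig. 1 can be sketched as follows. This is an illustration under stated assumptions: the codon set below is a commonly cited list of rare E. coli codons for arginine, isoleucine and leucine, not the exact set used for Fig. 1, and the short coding sequence is a hypothetical toy example, not the viral NS4A sequence.

```python
# Assumed set of rare E. coli codons for Arg, Ile and Leu (commonly cited list,
# not taken from the paper)
RARE_CODONS = {
    "AGA": "Arg", "AGG": "Arg", "CGA": "Arg", "CGG": "Arg",
    "ATA": "Ile",
    "CTA": "Leu",
}


def find_rare_codons(coding_sequence):
    """Return (codon number, codon, amino acid) for every rare codon in frame."""
    coding_sequence = coding_sequence.upper()
    if len(coding_sequence) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    hits = []
    for start in range(0, len(coding_sequence), 3):
        codon = coding_sequence[start:start + 3]
        if codon in RARE_CODONS:
            hits.append((start // 3 + 1, codon, RARE_CODONS[codon]))
    return hits


# Hypothetical toy sequence: ATG CTA ATA AGG GGC
toy_cds = "ATGCTAATAAGGGGC"
for number, codon, amino_acid in find_rare_codons(toy_cds):
    print(f"codon {number}: {codon} ({amino_acid})")
```

Replacing each reported codon with a frequently used synonymous codon is the simplest form of the optimization described above; dedicated software additionally weighs factors such as GC content and mRNA secondary structure.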

Expression yields of NS4A(1-48) strongly vary between different fusion tags
To identify the ideal conditions for NS4A(1-48) expression, the expression levels of codon-optimized NS4A(1-48) linked to three different fusion tags were tested in E. coli cells (Fig. 2A). E. coli BL21 cells harboring pGEX-TEV-NS4A(1-48) and BL21(DE3) cells harboring pUbi-NS4A(1-48) or pGEV-NS4A(1-48) were induced with IPTG for the indicated time. A band corresponding to the full-length NS4A(1-48) fusion protein was detected in the Coomassie-stained gel for the GST and the GB1 fusions. A full-length NS4A(1-48) fusion protein could not be detected for Ubi-NS4A(1-48) using Coomassie staining. However, western blot analysis confirmed its expression (data not shown). The highest expression levels of NS4A(1-48) were detected with the GB1 fusion tag. As all the plasmids contained the same optimized NS4A sequence, the GB1 domain seems to account for these high expression levels, presumably by stabilizing the NS4A(1-48) peptide. In addition, only the GB1-NS4A(1-48) fusion protein could be efficiently extracted to the soluble fraction after cell lysis without adding detergents. These findings are in line with previous observations [18,29]. The average yields for the three expression constructs are summarized in the upper part of Table 2. The highest amount of target peptide was produced with pGEV-NS4A(1-48). Subsequently, the GB1-NS4A(1-48) fusion was purified as described [18]; however, TEV protease was used instead of Factor Xa. The optimum recognition site for TEV is Glu-Asn-Leu-Tyr-Phe-Gln-(Gly/Ser) (ENLYFQ(G/S)) and cleavage occurs between the Gln and Gly/Ser residues [30]. DENV NS4A starts with serine and thus TEV cleavage allows a very specific and efficient removal of the fusion tag without any artificial overhang. Briefly, the GB1 fusion protein was extracted from the cell lysate by a simple heating step followed by size exclusion chromatography, yielding approximately 90% pure fusion protein (Fig. 2B). However, we were not able to efficiently separate the cleaved NS4A(1-48) peptide from the GB1 domain. Separation by size exclusion chromatography was hampered by the similar elution behavior of both molecules, which have similar molecular weights (5223.7 Da for NS4A(1-48) and 7585.3 Da for GB1). Thus, IgG sepharose affinity chromatography was used following an established protocol [22]. However, the GB1 domain could not be completely removed even when IgG sepharose was applied in large excess. This suggests that the heating step might have caused misfolding of a substantial fraction (up to 20%) of GB1, causing the observed leakage during affinity chromatography. Subjecting the digest reaction to diverse reverse phase chromatography resins did not yield satisfying results either. The target NS4A(1-48) peptide showed a strong tendency to remain on the column even at high acetonitrile concentrations, presumably due to its amphipathic nature (data not shown). Consequently, the highest purity that could be achieved for untagged NS4A(1-48) was about 80% with an average recovery of 60% (see Table 2).
Table 2. Average yields and purities of the studied NS4A(1-48) fusion proteins and of the resulting NS4A(1-48) target peptides obtained from 1 L of culture.
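The reason why TEV cleavage leaves no artificial overhang can be illustrated with a small pattern search. The sketch below assumes the recognition pattern E-X-X-Y-X-Q followed by Gly or Ser as stated in the text; the function `tev_cleave` and the short fusion sequences are hypothetical examples, not the actual GB1-NS4A(1-48) sequence. The lookahead keeps the Gly/Ser residue in the released peptide, mirroring cleavage between Gln and Gly/Ser.

```python
import re

# E-X-X-Y-X-Q followed by G or S; the lookahead leaves G/S with the released peptide
TEV_SITE = re.compile(r"E..Y.Q(?=[GS])")


def tev_cleave(fusion_sequence):
    """Split a fusion protein at the first TEV site; return (tag, peptide) or None."""
    match = TEV_SITE.search(fusion_sequence)
    if match is None:
        return None
    return fusion_sequence[:match.end()], fusion_sequence[match.end():]


# Hypothetical toy fusion: short tag + ENLYFQ + peptide starting with Ser
toy_fusion = "MTAGENLYFQSAAK"
tag_part, peptide_part = tev_cleave(toy_fusion)
print("tag:", tag_part)
print("released peptide:", peptide_part)

# A residue other than Gly/Ser after the Gln is not matched by this simple pattern
print(tev_cleave("MTAGENLYFQPAAK"))
```

Because the released peptide begins with the residue that follows Gln, a target that naturally starts with serine, as NS4A does, is obtained with its native amino terminus.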

Additional tagging of GB1-NS4A with GST allows a simplified separation of NS4A(1-48) from the fusion tag
Due to the difficulties in separating NS4A(1-48) from GB1 after TEV protease cleavage as described above, we decided to add an additional amino terminal GST tag to the protein (Fig. 3). The relative expression levels achieved with this construct and the respective mutant construct were evaluated using SDS-PAGE (Fig. 4A). The data demonstrate proper and stable expression of NS4A(1-48) fusion proteins. The GST-GB1 fusions could be easily purified using GSH sepharose. The average yields of the GST-GB1-NS4A peptide fusions are given in Table 2. TEV cleavage under standard conditions could efficiently remove the dual tag from the NS4A(1-48; L6E, M10E) mutant peptide. However, the GST-GB1-NS4A(1-48) wild type construct showed unexpectedly low cleavage efficiency (Fig. 4B).

Addition of urea improves the TEV cleavage of the wild type NS4A(1-48) peptide
GST is known to form dimers in solution [31][32][33]. Examples of GST-induced oligomerization of GST fusion proteins can be found in the literature [34][35][36]. Such a GST-induced oligomerization might also explain the observed resistance of a substantial fraction of wild type GST-GB1-NS4A(1-48) to TEV cleavage under standard conditions (Fig. 4B). Previous results from our laboratory indicate that the amino terminal AH of NS4A plays a role in the homo-oligomerization of NS4A [6]. We speculated that the purified GST-GB1-NS4A(1-48) peptides might self-associate and thus block the TEV recognition site (see Fig. 5A for explanation). The NS4A mutant peptide, which is expected to show reduced self-association, was easily cleaved, supporting this notion. We optimized the TEV digestion reaction conditions with the aim of decreasing the NS4A peptide self-association without reducing the proteolytic activity of TEV. Denaturing or chaotropic reagents like urea are typically used to break up such aggregates. TEV activity studies by Sun et al. suggested that a recombinant TEV protease can retain most of its activity at relatively high concentrations of denaturants such as 2 M urea [37]. In contrast, Waugh et al. reported a lower urea tolerance of 0.5 M for TEV [38]. Therefore, we first tested the GST-GB1-NS4A(1-48; L6E, M10E) mutant construct, which can easily be digested without additives, to assay the urea tolerance of TEV protease under our experimental conditions. At low urea concentrations (up to 0.5 M), no loss of TEV activity was observed (Fig. 5B). Higher urea concentrations (1-2 M) resulted in a significant activity loss of more than 50%, as described by Waugh [38]. Our results indicate that the NS4A peptide aggregation is perhaps a cooperative process. Therefore, addition of chaotropic molecules early on in the purification process might further improve the final yield of target protein.
Adding urea already to the lysis buffer required further optimization. It was important to ensure that GST binding to GSH sepharose would not be affected by the added urea. An earlier study indicated that GST binds to GSH sepharose in the presence of chaotropic reagents like 2-3 M guanidine hydrochloride or urea [39]. We independently assessed the binding properties of GST-GB1-NS4A(1-48) to GSH sepharose in the presence of increasing urea concentrations (Fig. 5C). Addition of 0.5 M urea did not alter the binding behavior of GST to GSH sepharose. However, higher urea concentrations resulted in a drastic shift of the GST-GB1-NS4A(1-48) peak from the elution to the flow-through fractions. Next, we assayed the TEV cleavage efficiency for the double-tagged wild type NS4A(1-48) fusion construct in the presence of increasing urea concentrations. Conveniently, 0.5 M urea was sufficient to initiate the removal of the GST-GB1 tag (Fig. 5D). Yet, TEV protease amounts had to be increased at least 5-fold compared to the cleavage reaction of the mutant peptide. Nevertheless, under these conditions, nearly complete removal of GST-GB1 from the wild type NS4A(1-48) peptide was achieved.

The GST-GB1 fusion allows a nearly identical purification protocol for both wild type and mutant NS4A peptides
Production of diverse mutant forms of NS4A(1-48) besides the wild type is necessary for an in-depth structure-function analysis. Our aim was to establish a general purification procedure applicable to various NS4A(1-48) mutant peptides. Therefore, purification steps that are sensitive to changes in peptide charge or hydrophobicity, such as ion exchange or reverse phase chromatography, were avoided. The feasibility of our simple purification procedure, which is based exclusively on GSH-affinity and size exclusion chromatography, is demonstrated for the wild type peptide and for the NS4A(1-48; L6E, M10E) mutant as an example. This NS4A(1-48) mutant carries glutamate substitutions at positions 6 and 10 instead of leucine and methionine, respectively, resulting in a shift of the isoelectric point from 6 to 5 and a considerably decreased hydrophobicity of the peptide. The purification progress for the wild type and the mutant NS4A(1-48) peptides is summarized in Fig. 6. In both cases, highly purified peptides could be obtained after size exclusion chromatography. The identity of the peptides was confirmed by in-gel tryptic digestion of the electrophoretically separated protein band and mass spectrometric peptide mapping. As summarized in Table 2, we could obtain up to 3 mg untagged wild type and up to 4 mg mutant peptide of very high purity (approx. 98%) from 1 liter of culture in rich medium. These values indicate an almost complete TEV cleavage as well as an efficient separation of the peptides from the dual tag, which was not possible with the GB1 single tag system. Thus, our dual tag strategy provides an efficient way of producing NS4A(1-48) in E. coli.
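To put the reported yields into perspective for NMR sample preparation, mass can be converted to molar amount and concentration. The following is a rough sketch: the molecular weight of 5223.7 Da for wild type NS4A(1-48) and the 3 mg yield are taken from the text, whereas the sample volume of 0.5 ml is an assumption chosen only for illustration.

```python
# Molecular weight of wild type NS4A(1-48) as stated in the text (g/mol)
MW_WILD_TYPE_DA = 5223.7


def mg_to_micromol(mass_mg, molecular_weight_da):
    """Convert a mass in mg to an amount in micromoles."""
    # mg / (g/mol) gives mmol; multiply by 1000 to obtain micromoles
    return mass_mg / molecular_weight_da * 1000.0


def concentration_millimolar(mass_mg, molecular_weight_da, volume_ml):
    """Concentration in mM when the given mass is dissolved in volume_ml."""
    # micromoles per ml is numerically equal to mmol per liter (mM)
    return mg_to_micromol(mass_mg, molecular_weight_da) / volume_ml


yield_mg = 3.0          # wild type yield per liter of rich medium (from the text)
sample_volume_ml = 0.5  # assumed sample volume, not from the text

amount = mg_to_micromol(yield_mg, MW_WILD_TYPE_DA)
concentration = concentration_millimolar(yield_mg, MW_WILD_TYPE_DA, sample_volume_ml)
print(f"{amount:.2f} micromol, {concentration:.2f} mM in {sample_volume_ml} ml")
```

Under these assumptions, the yield from one liter of culture corresponds to roughly 0.57 micromol of peptide, or a concentration slightly above 1 mM in the assumed volume.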

Peptide labeling with stable isotopes and initial nuclear magnetic resonance spectra
Peptides labeled with the stable isotopes 15N and 13C can be studied in great detail by nuclear magnetic resonance (NMR) spectroscopy. Solution NMR is the method of choice for the structural investigation of conformationally flexible peptides. Using the double-tagged GST-GB1-NS4A(1-48) fusion protein, we could easily produce even higher amounts of labeled NS4A(1-48) by growing the E. coli cells in minimal medium supplemented with [15N]ammonium chloride and either [12C6]- or [13C6]glucose as sole nitrogen and carbon sources (Table 2). Production of uniformly 15N-labeled NS4A(1-48) peptides in minimal medium resulted in peptide yields and purity very similar to those of the doubly labeled peptide. 2D (1H-15N)-BEST-TROSY spectra of NS4A wild type and mutant peptides in aqueous buffer are shown in Fig. 7. The position and low spectral dispersion of the observed amide 1H-15N cross signals indicate unstructured peptides. This is in good agreement with our earlier circular dichroism (CD) data on these two peptides in aqueous buffer [6]. These results indicate that the amount of highly pure isotope-labeled NS4A peptides produced using the described double tag strategy is sufficient for future multidimensional NMR experiments. These experiments are required for peptide resonance assignment and for gathering information regarding the structure of the peptides and their dynamics in the presence of membranes or membrane mimetic model systems.

Conclusion
Our GST-GB1 fusion approach represents a valuable tool for preparing milligram amounts of unlabeled or isotope-labeled NS4A(1-48) peptides. This is an important prerequisite for a detailed analysis of the structure-function relationship of the amino terminal region of NS4A, with the aim of elucidating the mechanism by which this region interacts with membranes. The results may lead to new antiviral strategies to fight DENV. In general, the presented production procedure may also aid the characterization of other aggregation-prone proteins, which are frequently encoded in viral genomes.