A Versatile System for USER Cloning-Based Assembly of Expression Vectors for Mammalian Cell Engineering

A new versatile mammalian vector system for protein production, cell biology analyses, and cell factory engineering was developed. The vector system applies the ligation-free, uracil-excision-based technique USER cloning to rapidly construct mammalian expression vectors from multiple DNA fragments, with maximum flexibility in the choice of both vector backbone and cargo. The vector system includes a set of basic vectors and a toolbox containing a multitude of DNA building blocks, including promoters, terminators, selectable marker and reporter genes, and sequences encoding an internal ribosome entry site, cellular localization signals, and epitope and purification tags. Building blocks in the toolbox can be easily combined because they contain defined and tested Flexible Assembly Sequence Tags (FASTs). USER cloning with FASTs allows rapid swaps of gene, promoter or selection marker in existing plasmids and simple construction of vectors encoding proteins that are fused to fluorescence, purification, localization, or epitope tags. The mammalian expression vector assembly platform currently allows the assembly of up to seven fragments in a single cloning step with correct directionality and with a cloning efficiency above 90%. The functionality of basic vectors for FAST assembly was tested and validated by transient expression of fluorescent model proteins in CHO, U-2-OS and HEK293 cell lines. In this test, we included many of the most common vector elements for heterologous gene expression in mammalian cells; in addition, the system is fully extendable by other users. The vector system is designed to facilitate high-throughput genome-scale studies of mammalian cells, such as the newly sequenced CHO cell lines, through the ability to rapidly generate high-fidelity assemblies of customizable gene expression vectors.


Introduction
The medical use of therapeutic proteins is growing rapidly and their full potential in health care is vast. More than 200 approved biopharmaceuticals are already on the market, with the most rapidly developing market being monoclonal antibody (mAb)-based products [1]. Accordingly, the interest in construction and development of mammalian cell factories for production of therapeutic proteins is increasing. This interest has been further stimulated by the recent publication of the first genome drafts of CHO cell lines [2][3], the primary host for expression of human proteins in general, and antibodies in particular [1]. These publications open new avenues for genetic engineering of these important cell factories [4].
Vector systems for these mammalian cell lines have so far mainly focused on efficient expression of recombinant proteins. This has primarily been achieved by relatively simple and inflexible vector systems containing a strong promoter (often viral) followed by a multiple cloning site (MCS), a terminator, and a selectable marker. The vectors are thus typically assembled by conventional methods based on the use of restriction endonucleases and ligases. Vector construction is therefore often hampered by limitations such as a shortage of restriction enzyme cut sites or cloning method incompatibilities between plasmid and desired DNA inserts. Moreover, it is next to impossible to assemble more than two DNA fragments in a single restriction/ligation step. For these reasons, restriction enzyme and/or ligase-independent techniques, e.g. In-Fusion cloning, Gibson assembly and USER cloning, are becoming increasingly popular in many other cell systems [5]. These techniques allow seamless and directed assembly of vector fragments and inserts, enabled by single-stranded DNA overhangs [6][7][8][9][10][11][12]. In the case of the USER cloning method, the overhangs are generated by substituting a single deoxythymidine nucleotide with a deoxyuridine nucleotide in the 5′ end of each primer designed to amplify the desired genetic target. Subsequently, the resulting PCR DNA fragment is treated with the USER enzyme mix (uracil DNA glycosylase and DNA glycosylase-lyase endo VIII), resulting in formation of unique 3′ single-stranded overhangs [9][10][11][12]. Importantly, in vivo fusions of DNA fragments are so efficient that several fragments can be combined in a single round of cloning. The method is PCR-based and can therefore easily be used to introduce modifications, e.g. point mutations and linkers, into a DNA fragment during the assembly process [10][11][12][13]. Furthermore, the method is well suited for high-throughput setups [8,12].
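The primer substitution and the resulting overhangs described above can be illustrated with a minimal Python sketch. It assumes a primer tail that ends in a T, which is the nucleotide replaced by U; the tail sequences and the function names are hypothetical and are not the FASTs of Table 2.

```python
# Sketch of USER primer design; sequences and names are illustrative only.
COMPLEMENT = str.maketrans("ACGTU", "TGCAA")


def reverse_complement(seq):
    """Reverse complement of a DNA sequence (a U is read as pairing with A)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def user_primer(tail, annealing):
    """Build a primer whose 5' tail carries a U in place of its final T."""
    tail = tail.upper()
    if not tail.endswith("T"):
        raise ValueError("this sketch assumes the tail ends in T")
    return tail[:-1] + "U" + annealing.upper()


def exposed_overhang(tail):
    """3' overhang left on the complementary strand after U excision.

    Removing the U releases the short 5' stretch of the primer strand,
    so the complementary strand is single stranded opposite the tail.
    """
    return reverse_complement(tail)


def can_fuse(tail_a, tail_b):
    """Two fragment ends anneal if their 3' overhangs are complementary."""
    return exposed_overhang(tail_a) == reverse_complement(exposed_overhang(tail_b))


if __name__ == "__main__":
    tail = "AGTCGGTGT"                    # hypothetical tail, not from Table 2
    partner = reverse_complement(tail)    # tail for the neighbouring fragment
    print(user_primer(tail, "atggtgagc"))
    print(partner, can_fuse(tail, partner))
```

Because the partner tail is the reverse complement of the first, a tail of the form A...T pairs with another tail of the form A...T, which is why both primers can carry the T-to-U substitution.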
For efficient high-throughput cloning and advanced genetic engineering, it is also important to have a modular vector system, especially if one wishes to have flexibility in the inserted components. Possibly the best known system of this type is the BioBrick standard, developed by the iGEM Foundation (www.igem.org). It provides a modular assembly standard and is non-commercial, but it is limited because BioBricks require restriction enzymes for plasmid integration. BioBricks are thus difficult to apply to mammalian systems, where genes are typically very long and contain multiple bacterial restriction enzyme cut sites. Combining several modules also becomes a multiple-step operation.
In the present study, we have developed and characterized a non-commercial comprehensive vector expression platform for mammalian cell engineering based on the DNA ligase-free, uracil-excision-based USER cloning method [9]. The platform contains a basic set of fixed shuttle vectors (pBASE) for high-throughput cloning of specific genes of interest (GOI). In addition, it contains a flexible multipurpose DNA fragment toolbox containing sequence building blocks that are equipped with Flexible Assembly Sequence Tags (FASTs), allowing them to be easily fused by USER fusion [11]. Using FASTs, the building blocks can be used for rapid construction of E. coli shuttle vectors with different mammalian selection markers, where GOIs can be equipped with a variety of promoters and terminators or combined with sequences encoding internal ribosome entry sites for bicistronic gene expression. Moreover, the toolbox contains FAST building blocks encoding reporter proteins and cellular localization sequences, as well as purification and epitope tags. To demonstrate the potential of our toolbox, we have successfully assembled vectors composed of up to seven building blocks in a single cloning step and provided proof of functionality in mammalian hosts. For example, we have made vectors for protein secretion and for bicistronic gene expression, as well as vectors encoding chimeric fluorescent proteins, which were successfully used as cytological markers in U-2-OS cells.

Strains, cell cultures and media
All standard cloning and plasmid propagation was performed in Escherichia coli strain DH5α, which was grown in standard Luria Broth (LB) medium supplemented with 100 µg/ml ampicillin.

Plasmids and primers
All plasmids used as PCR templates in this study are listed in Table 1. The pIRES-DHFR plasmid is a modified version of the IRES domain of the plasmid pIRES (Clontech, Palo Alto, CA, USA), which was linked to the dihydrofolate reductase (DHFR) gene in house. Oligonucleotides for PCR are listed in Table S1 and oligonucleotides for DNA hybridization of building blocks are listed in Table S2. Plasmids generated in this study are found in Table S3.

Construction of DNA building blocks
DNA building blocks were amplified by PCR using the proofreading polymerase PfuX7 [15]. All primers for amplification of the DNA building blocks were designed based on commercially available vectors (Table 1). Furthermore, each primer was extended by a specific FAST at the 5′-end (Table 2). PCRs were performed with 35 cycles in a final volume of 50 µl with addition of 1% MgCl2 (New England Biolabs, Ipswich, MA, USA). Targeting signals (TSs) of 9-25 bp were added by inclusion of the appropriate sequence in the primer for the PCR amplification of the GOI, as an extension between the FAST and the annealing part of the primer. TSs of 25-100 bp were made by DNA hybridization of two complementary single-stranded oligonucleotides. DNA hybridization was performed in Milli-Q water in a final volume of 80 µl with a concentration of 50 µM of each complementary oligo. The mixture was incubated for 5 min at 98 °C and left at room temperature overnight before storage at -20 °C. For TSs above 100 bp, fusion PCR (overlapping PCR) was performed on synthetic oligos covering the complete length, with 5 initial cycles with annealing at 54 °C followed by 30 cycles with annealing at 66 °C. PCR products of building blocks encoding vector backbones were subjected to DpnI (New England Biolabs) digestion (20 U, 37 °C, 1 h) and heat inactivation (80 °C for 20 min), and were then gel purified. All PCR products were isolated by agarose gel separation and purified using the illustra GFX PCR DNA and Gel Band Purification Kit (GE Healthcare, Buckinghamshire, United Kingdom).
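The length-dependent choice of construction method for targeting signals can be summarised in a short Python sketch. The function name is hypothetical, and the handling of the shared boundaries is an assumption: the text assigns 25 bp to both of the first two ranges, so the sketch arbitrarily treats exactly 25 bp as a primer extension and exactly 100 bp as oligo hybridization.

```python
def ts_construction_method(length_bp):
    """Suggest how a targeting signal (TS) of a given length is built.

    Boundary handling (25 bp and 100 bp) is an assumption of this sketch;
    lengths below 9 bp are not covered by the described protocol.
    """
    if length_bp < 9:
        raise ValueError("TS lengths below 9 bp are not described")
    if length_bp <= 25:
        return "primer extension between FAST and annealing sequence"
    if length_bp <= 100:
        return "hybridization of two complementary oligonucleotides"
    return "fusion PCR on overlapping synthetic oligonucleotides"


if __name__ == "__main__":
    for length in (12, 60, 180):
        print(length, "bp:", ts_construction_method(length))
```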

Construction of expression vectors
USER cloning and USER fusion were performed as previously described [9,11] with minor modifications. Vector backbones holding a PacI/Nt.BbvCI USER cloning compatible cassette were digested as previously described [13]. 1 µl of USER enzyme mix, 0.5 µl NEB Buffer 4, 0.5 µl 10x BSA (all purchased from New England Biolabs), and 0.1 pmol digested vector backbone were mixed in a 0.2 ml PCR tube. If the vector backbone was amplified by PCR, 1 µl of the purified DNA element was added instead. Finally, 7 µl of purified DNA elements were added to a total reaction volume of 10 µl. If more than one DNA element was to be inserted, all elements were added in equal volumes. The reaction mixture was incubated for 40 min at 37 °C, followed by 30 min at 25 °C. Subsequently, the 10 µl reaction mixture was used directly to transform chemically competent E. coli DH5α cells. Transformants were selected in Luria Broth (LB) medium supplemented with 100 µg/ml ampicillin. Three transformants were picked randomly for each construct and validated by DNA sequencing (Star SEQ, Mainz, Germany). For each construct, a validated expression vector was purified with the Plasmid Plus Maxi kit (Qiagen, Hilden, Germany) following the manufacturer's instructions and diluted to a concentration of 1 µg/µl DNA in Milli-Q water.
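As a worked example of the reaction setup, the following Python sketch computes the pipetting volumes for a given number of DNA elements. It reads the volume units of the protocol as microlitres and assumes that the vector backbone occupies 1 µl, which is the volume left over in a 10 µl reaction and the volume stated for PCR-amplified backbones; the function name and dictionary keys are hypothetical.

```python
def user_reaction_volumes(n_elements):
    """Volumes (in microlitres) for one USER fusion reaction.

    Assumption of this sketch: the backbone is added in 1 ul, and the
    7 ul reserved for purified DNA elements is split equally among them.
    """
    if n_elements < 1:
        raise ValueError("at least one DNA element is required")
    return {
        "USER enzyme mix": 1.0,
        "NEB Buffer 4": 0.5,
        "10x BSA": 0.5,
        "vector backbone": 1.0,
        "each DNA element": 7.0 / n_elements,
    }


def total_volume(volumes, n_elements):
    """Sum all components, counting the per-element volume n times."""
    fixed = sum(v for k, v in volumes.items() if k != "each DNA element")
    return fixed + volumes["each DNA element"] * n_elements


if __name__ == "__main__":
    n = 4
    mix = user_reaction_volumes(n)
    for component, volume in mix.items():
        print(f"{component}: {volume:.2f} ul")
    print(f"total: {total_volume(mix, n):.2f} ul")
```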

Cell cultivation and transfection
All cells were cultivated in a humidified incubator at 37 °C with 5% CO2. CHO-S cells were expanded in Erlenmeyer cell culture flasks (Corning, Sigma-Aldrich) and experiments were performed in uncoated 6-well plates (Falcon, BD Biosciences). Adherent cell lines were passaged by exposure to 0.05% trypsin-EDTA (Lonza) when fully confluent and were seeded to a confluence of 20% the day prior to transfection. CHO-S cells were transfected (Day 0) in a Nucleofector 2b using the Amaxa Cell Line Nucleofector Kit V (Lonza). A total of 2 × 10^6 cells were transfected with 2 µg plasmid. Plasmid transfections of adherent cells were performed using X-tremeGENE HP (Roche, Basel, Switzerland) in Gibco Opti-MEM (Life Technologies, Paisley, United Kingdom) medium.

Fluorescence microscopy
The day after transfection (Day 1) of CHO-S cells, 25,000 cells from each sample were stained with one drop of NucBlue Live Cell Stain (Hoechst 33342) for 20 min before analysis using a Celigo Imaging Cell Cytometer (Brooks Automation).
U-2-OS cells were grown on coverslips (VWR BDH Prolabo, Poole, United Kingdom) in 6-well plates (2 coverslips/well). The cells on coverslips were fixed in 4% formaldehyde solution (VWR) for 12 min at room temperature. Subsequently, the coverslips were washed twice in 1x PBS (Lonza) and stored in PBS at 4 °C in darkness. Prior to microscopy, the coverslips were mounted onto glass slides with DAPI-containing mounting medium (Vector Laboratories Inc., Burlingame, CA, USA) and preserved with clear nail polish. Confocal laser scanning microscopy was performed for visualization of fluorescent proteins on a Zeiss LSM 780 confocal with an Axio Observer with 63x/1.40 oil DIC Plan-Apochromat. Images were acquired through a Zeiss Zen 2010 digital camera.

Measurement of secreted eGFP
Transfected HEK293 cells were grown in p60 dishes (Nunc), and 500 µl of medium extract was sampled in triplicate approximately 72, 96, and 120 hours after transfection. The samples were centrifuged for 5 min at 12,000 × g and the supernatant was collected. 100 µl of supernatant was used for quantification of fluorescence intensity, determined in flat-bottomed microtiter plates (Nunc) on a Synergy 2 Microplate Reader (BioTek Instruments Inc., Burlington, VT, USA) for emission of a wavelength of 485 nm within a range of 20 nm.

SEAP assay
SEAP activity was measured as described by Durocher et al. [16]. Transiently transfected HEK293 cells were cultured in p100 dishes (Nunc) and culture medium was harvested approximately 48 hours after transfection. The samples were centrifuged for 5 min at 12,000 × g and the supernatant was collected. To inhibit endogenous phosphatase activities, samples were heat-inactivated at 60 °C for 30 minutes. 50 µl samples were transferred to a 96-well microtiter

Construction of a basic set of mammalian expression vectors
We have created a simple, rapid and robust vector system for inserting a GOI into any of six E. coli based vector backbones, pBASE1-6. In each case, the GOI is inserted by USER cloning into a PacI/Nt.BbvCI USER cassette [13] flanked by a promoter and a terminator, see Figure 1 and Figure S1. For mammalian gene expression, the vectors contain combinations of hygromycin (HygR) and neomycin (NeoR) selectable markers; SV40, PGK, and CMV promoters; and SV40 and BGH terminators, see Table S3. The functionality of the vector design was verified by inserting the gene encoding enhanced green fluorescent protein (eGFP) into pBASE2 for proof of concept. Similar to our previous experience using the PacI/Nt.BbvCI cassette for cloning [13], E. coli transformants were readily obtained (>200 colonies) and identical pBASE2-eGFP vectors were isolated from three of these colonies. One of the isolated pBASE2-eGFP plasmids was transfected into CHO-S cells for further analysis. A strong green fluorescent signal was detected in the cells (Figure 2). In contrast, CHO-S cells transfected with control plasmid did not show increased fluorescence. Thus, the vector system allows integration of a GOI into a fixed vector in a simple USER fusion reaction.

Design of a flexible multipurpose DNA fragment toolbox
In many cases, it is necessary to combine multiple DNA elements in a single vector to achieve the desired sequence. This task can easily be achieved with USER fusion, where assembly of up to four fragments has previously been demonstrated [10][11][12][13]. Exploiting this possibility, we have designed a DNA fragment toolbox where the individual DNA building blocks can be combined in a flexible manner for the construction of a multitude of vectors. The individual building blocks in the toolbox have been made either by PCR or by simple annealing of complementary oligonucleotides. The vectors are assembled by combining the individual building blocks by USER fusion via FASTs (Figure 3, Table 2). The FASTs noted in Table 2 have been tested and verified for functionality, but can in principle be changed to suit the needs of the researcher. Each building block is capped with two defined FASTs, one at either end, to allow for directionally controlled fusions to other building blocks in the toolbox. Vector assembly does not require USER cassettes as the vector backbone is generated by PCR. In the present version of our toolbox we have designed seven FASTs, which can mediate assembly of up to seven building blocks. Firstly, in the simplest scenario, it is possible to swap building blocks in the constructs that are based on the basic vector set described above, e.g. if another selectable marker or vector backbone is desirable (Figure 3A). Secondly, a GOI can be combined with a new set of building blocks to form vectors with a configuration similar to those in the basic expression vector set, but with compositions of promoters, terminators, markers and vector backbone that are not included in the set (Figure 3B). The building blocks harboring the promoter, terminator and marker have fixed positions in the vector relative to the backbone, and for that reason they are always equipped with the same FASTs.
In this way there is full flexibility to choose between the three promoters, two terminators, three mammalian markers and two vector backbones that are currently in the toolbox. In total this amounts to 36 vector combinations (3 × 2 × 3 × 2), a number that will expand as new promoters and terminators are added to the toolbox. The toolbox is not limited to these components, as one can freely add more components to suit specific projects. The only requirement is the addition of the defined FASTs to PCR amplification primers. Thirdly, the toolbox contains building blocks that allow for the construction of expression vectors where the GOI is fused to one or more sequences encoding relevant sorting signals, reporter proteins, and purification/epitope tags (Figure 3C). Currently, the toolbox contains building blocks encoding an ER signal peptide; ER retention and Golgi retention signals; mitochondrial, nuclear, peroxisomal, and plasma membrane localization signals; reporter proteins including eGFP, eCFP, eYFP, mCherry and secreted alkaline phosphatase (SEAP); and the His6, FLAG and cMyc tags, see Table 3. Building blocks coding for protein are fused with FASTs encoding three amino acid residues, which serve as linkers between the two protein-based components (Table 2). For each of these building blocks, variants exist with different FASTs (Figure S2). As a result, the composition of the FASTs and the relative positioning of building blocks are flexible. This part of the toolbox allows any GOI-encoded protein to be fused, N- or C-terminally, with any of the tags for localization, purification and visualization mentioned above. Lastly, vectors supporting mammalian bicistronic gene expression can be constructed, as one of the building blocks in the toolbox is an internal ribosome entry site, IRES. This mode of gene expression is desirable if stoichiometric transcription levels of the individual genes are required.
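The figure of 36 vector combinations quoted above is the product of the number of choices at each fixed position. A minimal Python sketch of the enumeration follows; the promoter, terminator and two marker names are those listed for the basic vector set and are assumed here to be the toolbox entries, while the third marker and both backbone names are placeholders because they are not named in the text.

```python
from itertools import product

# Names partly assumed: see the lead-in for which entries are placeholders.
promoters = ["SV40", "PGK", "CMV"]
terminators = ["SV40", "BGH"]
markers = ["HygR", "NeoR", "marker_3"]        # third marker: placeholder
backbones = ["backbone_1", "backbone_2"]      # placeholders


def vector_combinations():
    """All promoter/terminator/marker/backbone combinations."""
    return list(product(promoters, terminators, markers, backbones))


if __name__ == "__main__":
    combos = vector_combinations()
    print(len(combos), "combinations")
    print("first:", combos[0])
```

Adding one further promoter to the list would raise the count to 4 × 2 × 3 × 2 = 48, which shows how quickly the design space grows as building blocks are added.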

Efficiency of the versatile FAST vector assembly system
In order to benchmark the assembly efficiency of the FAST system and the functionality of the assembled vectors, a comprehensive set of mammalian vectors based on different numbers of PCR-derived building blocks (five, six or seven) was constructed (Table 4, Table S1). By testing vectors with 5-7 blocks, all FASTs are also validated for functionality. Nine, six, and eleven vectors were successfully made by fusing five, six and seven building blocks, respectively. Importantly, in all these 26 experiments, E. coli transformants were easily obtained. However, we noted that the number of colonies decreased as the number of building blocks was increased; hence, the lowest number of transformants was obtained for a vector that required fusion of seven building blocks (Table 4). For all experiments, three randomly picked colonies were analyzed for the quality of vector assembly. Like the number of transformants, the fusion fidelity also decreased as the number of building blocks was increased. Nevertheless, in the 11 attempts to fuse seven building blocks, 93% of the 33 tested colonies contained a correctly assembled vector (Table 4). Moreover, sequencing of all correctly assembled plasmids showed that the building blocks were fused in an error-free manner and that no mutations were introduced during PCR.
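The directional fusion that these multi-fragment assemblies rely on can be sketched in Python: each building block carries one FAST at its left end and one at its right end, and a circular vector forms only if every right-hand FAST has exactly one matching left-hand FAST. The FAST labels (F1, F2, ...), block names and function name below are hypothetical and do not correspond to the sequences in Table 2.

```python
def assembly_order(blocks):
    """Return the circular order of building blocks, or raise ValueError.

    blocks maps a block name to a (left_fast, right_fast) pair. The order
    starts at the first block given and follows right FAST -> left FAST.
    """
    if not blocks:
        raise ValueError("no building blocks given")
    by_left = {}
    for name, (left, _right) in blocks.items():
        if left in by_left:
            raise ValueError(f"FAST {left} is used at two left ends")
        by_left[left] = name

    start = next(iter(blocks))
    order, current = [start], start
    while True:
        right = blocks[current][1]
        nxt = by_left.get(right)
        if nxt is None:
            raise ValueError(f"no partner for FAST {right} of {current}")
        if nxt == start:
            break                      # the circle is closed
        if nxt in order:
            raise ValueError(f"FAST {right} is used at two right ends")
        order.append(nxt)
        current = nxt
    if len(order) != len(blocks):
        raise ValueError("circle closes without using every block")
    return order


if __name__ == "__main__":
    blocks = {                         # deliberately listed out of order
        "GOI": ("F2", "F3"),
        "backbone": ("F7", "F1"),
        "terminator": ("F3", "F7"),
        "promoter": ("F1", "F2"),
    }
    print(" -> ".join(assembly_order(blocks)))
```

Because each FAST occurs at only one junction, the order of the blocks is fixed by the tags alone, regardless of the order in which the fragments are mixed.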
Small building blocks in the toolbox, which contain sequences that are too short to be made by PCR, were formed by annealing two partly complementary oligonucleotides. In these cases, non-homologous extensions at the 3′-ends of the two oligonucleotides provide the FAST overhangs. These building blocks include the localization sequences for endoplasmic reticulum, mitochondria, and plasma membrane, as well as signals for retention in the medial- and trans-Golgi, and trans-Golgi network. To investigate whether these building blocks could be efficiently incorporated into vectors using the approach described above, they were mixed with five other building blocks in a number of vector assembly experiments. As a result, 18 different vectors were successfully assembled (Table 4). Compared to vector assembly based solely on PCR-derived building blocks, we note a minor decrease in the cloning efficiency. Even so, the number of colonies was still sufficient to achieve correctly assembled vectors in the first trial. Moreover, among the 54 vectors tested in total, 87% contained correctly fused building blocks (Table 4).
Table 3 (partial). DNA building blocks in the toolbox: element | template | source.
PTS1 (peroxisomal) | Synthetic | [18]
c-Ha-ras (plasma membrane) | Synthetic | [19][20]
COX-VIII (mitochondrial) | Synthetic | [21]
CRT (ER) | Synthetic | [22]
KDEL (ER retention signal) | Synthetic | [23]
GalNacT1 (medial-Golgi) | Synthetic | [24]
β-1,4 GT (trans-Golgi) | Synthetic | [25]
α-2,6 ST (TGN) | Synthetic | [26][27]
Secretion signals
hIFN-γ | Synthetic | [28][29]
Proteins
eGFP | peGFP-1 | Clontech
eYFP | peYFP-C1 | Clontech
eCFP | peCFP-C1 | Clontech
mCherry | pmCherry-N1 | Clontech
SEAP | pGEM-4Z/PLAP489 | [14]
Other elements
IRES | pIRES-DHFR | In house
Vector backbone | pU0002 | [13]
His-tag | Synthetic | [30]
FLAG

The vectors constructed above were examined for functionality in U-2-OS cells. Firstly, we tested vectors based on the HygR marker. After transfection, small foci of resistant cells appeared after 4 days of selection pressure. In contrast, no foci formed with cells transfected with empty vectors, see Figure S3. Secondly, we validated the functionality of building blocks forming expression cassettes for production of fluorescent proteins. Accordingly, U-2-OS cells transfected with plasmids expressing genes or gene fusions encoding eCFP, eGFP, eYFP, and mCherry were examined by fluorescence microscopy. In all cases, cells containing an easily detectable signal in the cytoplasm at the expected wavelengths were observed, see Figure 4A-E. The toolbox includes building blocks encoding cell sorting sequences containing the information to direct a protein containing no sorting signal to any of eight different locations. To test whether these building blocks could be functionally fused via FAST linkers to a GOI, we made 18 new vectors, see Table S3, each encoding a fluorescent protein fused to a specific cell sorting sequence. These plasmids were transfected into U-2-OS cells and subsequently examined by fluorescence microscopy (Figure 5).
In all experiments, cells containing a fluorescent signal were detected, and, as expected for functional fusions, the sorting sequences dictated the cellular locations of the tagged fluorescent proteins. For example, cells producing eYFP fused to the SV40 nuclear localization signal emitted yellow light that co-localized with DAPI-stained nuclei (Figure 5A). The distribution patterns of the remaining fusion proteins (Figure 5B-H) corresponded to those already presented in the literature for proteins using these signals for sorting [17][18][19][20][21][22][23][24][25][26][27]. We therefore conclude that all protein-sorting sequences in the toolbox are functional when fused to fluorescent proteins and that the FASTs did not interfere with the localization of the targeting signal.

Validation of protein secretion using the FAST system
Similar to intracellular targeting signals, the toolbox also includes a building block encoding a signal peptide that allows a protein to enter the secretory pathway. To test the functionality of this building block, it was fused to the gene encoding eGFP. The resulting construct and a construct coding for eGFP without the signal peptide were subsequently transfected into HEK293T cells. Transfected cells were propagated for four days and the growth medium was examined for the presence of eGFP. Relative fluorescence intensity (RFI) from medium extracted from cells expressing the secreted protein was 24% and 50% higher than from medium extracted from cells expressing the non-secreted eGFP and from medium that was not inoculated with cells, respectively (Table 5). We therefore conclude that the signal peptide for secretion is functional with our FAST linker.

FAST bicistronic protein production
To investigate whether our FAST toolbox could support bicistronic gene expression, the building block containing an internal ribosomal entry site (IRES) was tested in two separate setups. Firstly, IRES was inserted between eGFP and mCherry and cells expressing this bicistronic construct were examined by fluorescence microscopy. The resulting cells contained both eGFP and mCherry in the cytoplasm. An analogous construct where the order of the two genes was reversed gave a similar result. In both cases, the dual signal cannot be the result of the formation of a fusion protein, since the two genes are not in the same reading frame and both coding sequences terminate with a stop codon. The simultaneous presence of the two fluorescent proteins in the cytoplasm therefore strongly indicates that the ribosome was loaded at both the cap structure and the IRES sequence of the mRNA transcribed from the plasmid. Secondly, IRES was inserted between the secreted alkaline phosphatase (SEAP) and either mCherry or eGFP. Significant extracellular activity of SEAP was detected in both experiments. Similarly, the expected fluorescent protein, but not the other, was detected in each of the two cases (Table 6, Figure S4).

Conclusion
In this work, we have generated and validated a versatile vector assembly system for rapid generation of mammalian expression vectors. The system is based on FAST linker sequences and consists of two parts using this technology: the pBASE vectors, allowing rapid ligation-free insertion and expression of single gene expression cassettes, and the FAST-directed assembly (pFAST vector set), allowing assembly of up to seven PCR fragments in a single cloning step. As proof of concept of the versatility, we have developed a set of constructs encoding fluorescent proteins that can be used to visualize compartments. The localization signals encoded by these constructs were 3-62 amino acids long; all were functionally fused to fluorescent proteins via our FAST linkers in the cell lines HEK293, U-2-OS, CHO-K1, and CHO-S. We have in this setup tested assembly of up to seven fragments, but the fact that these constructs were easily obtained and showed cloning efficiencies above 90% indicates that even more complex vectors consisting of additional fragments can likely be constructed with this method. Furthermore, the FAST linkers make the system easily expandable to any components a user might wish to add. In summary, we provide a non-commercial validated method for one-step assembly of up to seven DNA fragments. The versatile vector assembly strategy we present here can therefore be adapted to a wide range of uses and broadly benefit the mammalian research community.