COSPLAY: An expandable toolbox for combinatorial and swift generation of expression plasmids in yeast

A large number of genetic studies in yeast rely on the use of expression vectors. To facilitate the experimental approach of these studies, several collections of expression vectors have been generated (YXplac, pRS series, etc.). Subsequently, these collections have been expanded by adding more diversity to many of the plasmid features, including new selection markers and new promoter sequences. However, the ever growing number of plasmid features makes it unrealistic for research labs to maintain an up-to-date collection of plasmids. Here, we developed the COSPLAY toolbox: a Golden Gate approach based on the scheme of a simple modular plasmid that recapitulates and completes all the properties of the pRS plasmids. The COSPLAY toolbox contains a basal collection of individual functional modules. Moreover, we standardized a simple and rapid, software-assisted protocol which facilitates the addition of new personalized modules. Finally, our toolbox includes the possibility to select a genomic target location and to perform a single copy integration of the expression vector.


Introduction
Over the last decade, the explosion of genome-wide analyses, as well as the rise of synthetic/systems biology, has profoundly transformed the methodology used to study model organisms [1]. Yeast has been a pioneering model in these emerging fields thanks to its versatility and the ease of performing genetic manipulations. High recombination rates and fast generation and crossing times are crucial features that allowed the creation of large collections of tagged and deleted strains [2][3][4], which proved to be very useful for systematic functional studies [5,6].
Whereas large strain collections are built using single-step PCR-based gene replacement techniques, the vast majority of classical genetic studies in yeast rely on the use of expression vectors, such as the YXplac [7] and pRS [8] series, which are based on the pUC19 and pBluescript backbones, respectively. These vectors feature a number of unique cloning sites and commonly used auxotrophy selection cassettes (HIS3, LEU2, TRP1, URA3, ADE2 in the pRS system), which make them convenient for genetic manipulations of standard lab strains. There are three types of such plasmids with different replication properties: integrating (stable chromosomal integration), centromeric (episomal low-copy plasmid), and 2-micron (high-copy episomal plasmid) [8].
Since the first release of the pRS series, several improvements have been brought to these expression vectors. First, the development of antibiotic resistance markers, such as the KanMX module conferring resistance to Geneticin (G418) [8,9], expanded their versatility, as no auxotrophic mutation is required in the targeted strain. Second, a set of pRS-based plasmids containing standard controllable promoter sequences (GAL1, MET25) was constructed to standardize the means of ectopically driving the expression of a gene of interest in a controlled manner [10]. This collection was later expanded to fine-tune expression levels in a synthetic biology context, by making the constitutive promoter sequences of the following genes available: TEF1, ADH1, TDH3, CYC1 [11]. Next, an updated version of the pRS series, called pRSII, was reported, which substantially increased the number of available selection markers (auxotrophic: HIS2 and ADE1; antibiotic: phleomycin, hygromycin B, nourseothricin, and bialaphos) [12].
Therefore, there are now a large number of potential combinations of plasmid features for ectopic gene expression. This ever-growing number makes it unrealistic for research labs to maintain an up-to-date collection of plasmids. Yet, the complexity of today's genetic studies, in which more than eight mutations or tags are routinely used within a strain [13], strongly pleads for a flexible expression system in which plasmid type (integrating, centromeric, etc.), selection marker, and gene promoter could be selected in a systematic and modular manner.
Since the late 1990s, several fast cloning strategies have been developed to ensure a quick and reliable assembly of DNA fragments within a destination vector. The first system released is known as the Gateway vector system (available from Life Technologies), which uses a proprietary recombination scheme to assemble up to 4 inserts into a destination vector following a 3-step procedure [14]. Interestingly, specific Gateway destination vectors with widely used selection cassettes, tags and promoters have been built for budding yeast (commercially available on addgene.com) [15]. However, a major drawback is that Gateway is a patented system which is not open to modifications and may involve expensive running costs. In addition, due to the large number of inserts available in yeast, it is still necessary to purchase a large collection of plasmids (288 plasmids) in order to avoid multiple sub-cloning steps, since combinatorial shuffling of inserts in one step is not possible.
As an alternative to Gateway, the BioBrick cloning format was developed as an open-source system intended to normalize plasmid construction for synthetic biology. It uses a DNA editing technique based on 4 restriction enzymes to iteratively assemble inserts in a standardized manner [16]. However, there are as many steps in the assembly process as the number of fragments to integrate in the ultimate destination vector. Therefore, swapping a single insert from a complex destination plasmid requires starting the assembly process from scratch.
To overcome these limitations, a novel strategy based on the exonuclease activity of polymerase (or using a dedicated exonuclease) was used to generate sticky ends in the destination vector as well as in the insert to be integrated. These methods, known as SLIC [17] or Gibson assembly [17,18], provide a powerful way to assemble multiple DNA fragments within one destination vector in a single step, and have been followed by others that used a different implementation with a similar outcome [19,20]. Yet, in all these techniques, the assembly process relies on the annealing of several sequence-dependent single-stranded DNA pieces, thus making the assembly efficiency somewhat variable.
Another strategy developed recently is the Golden Gate system, which is based on the use of a type IIS restriction enzyme (such as BsaI) that cuts outside of its recognition sequence [21,22]. Using an appropriate design of flanking sequences of the DNA fragments to be integrated, one can assemble a virtually unlimited number of modules in the right order and in one reaction. This technique has been further developed to allow large-scale construction of multi-genic vectors using a multi-level assembly scheme [23,24]. The Golden Gate system has recently been adapted to yeast [25] with the generation of a collection of modules (i.e. plasmid type, selection markers, promoters, etc.) that can be assembled in different combinations to produce expression vectors (up to 11 modules) as well as complex multi-gene assemblies for synthetic biology approaches. Here, we report the development of a simpler alternative plasmid architecture based on the Golden Gate system that allows both efficient assembly of individual modules into a destination expression vector and rapid generation of new custom modules using the MEGAWHOP technique. The resulting COSPLAY (COmbinatorial Swift PLasmid Assembly in Yeast) toolbox includes a collection of 26 modules that can be integrated into 6-module expression vectors that recapitulate and complete the properties of the pRS plasmid series. We have paid particular attention to making this toolbox straightforward to use and to expand, by developing custom Matlab software that automates the design of the tailed primers used to generate individual modules. In addition, unlike standard yeast integrating plasmids, our toolbox is designed to offer the possibility of ensuring single-copy integration of the expression vector at a target locus in the genome, independently of auxotrophic markers. The COSPLAY toolbox is available on Addgene as a package of 28 plasmids and the software can be downloaded from GitHub: https://github.com/gcharvin/cosplay.

Yeast strains and media
Yeast strains used in this study are listed in Table 1.
The strains YAP194 and YAP197 were obtained by single integrations of reporter cassettes at a locus situated in chromosome VI (position 260998 to 261148).
To induce the GAL1 promoter, cells were initially grown in SD medium containing 2% dextrose and then switched to SD medium containing 2% raffinose and 1.5% galactose.

Module library construction
To generate individual modules, we amplified the specific DNA sequence of interest by PCR using primers (Figs 1 and 2) that contain 25 bp of the multiple cloning site (MCS) of pUC57, followed by the BsaI restriction site and a specific 4 bp overhang (Fig 1 and Table 2) that determines the cloning position. This sequence is in turn followed by 15-25 bp of the target sequence of interest (Fig 2).
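The primer layout described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the COSPLAY software itself: the function name `tailed_primer`, the placeholder MCS tails and the example overhangs are hypothetical (the real sequences are those of pUC57 and Table 2), and the single spacer nucleotide between the BsaI site and the overhang is an assumption derived from the BsaI cleavage geometry (GGTCTC, cutting one nucleotide downstream on the top strand).

```python
BSAI_SITE = "GGTCTC"  # BsaI recognition sequence; cleavage occurs downstream of it


def revcomp(seq):
    """Reverse complement of a DNA sequence (A/C/G/T only)."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(comp[b] for b in reversed(seq.upper()))


def tailed_primer(mcs_tail, overhang, annealing, spacer="A"):
    """Build 5'-[25 bp MCS][BsaI site][spacer][4 nt overhang][annealing]-3'.

    The 1-nt spacer is an assumption of this sketch (see lead-in).
    """
    if len(mcs_tail) != 25:
        raise ValueError("MCS tail must be 25 bp")
    if len(overhang) != 4:
        raise ValueError("overhang must be 4 nt")
    if not 15 <= len(annealing) <= 25:
        raise ValueError("annealing region must be 15-25 bp")
    return (mcs_tail + BSAI_SITE + spacer + overhang + annealing).upper()


# Hypothetical placeholders: NOT the real pUC57 MCS or the Table 2 overhangs.
MCS_FWD = "ACGTACGTACGTACGTACGTACGTA"
MCS_REV = "TGCATGCATGCATGCATGCATGCAT"
target = "ATGTCTAAAGGTGAAGAATTATTCACTGGTGTTGTCCCAATTTTGGTTGAATTAGATGGT"

# Forward primer anneals to the start of the target.
fwd = tailed_primer(MCS_FWD, "AATG", target[:20])
# Reverse primer is written 5'->3' on the bottom strand, so both the
# downstream overhang and the annealing region are reverse-complemented.
rev = tailed_primer(MCS_REV, revcomp("GGAT"), revcomp(target[-20:]))
print(fwd)
print(rev)
```

The reverse primer is the part that is easiest to get wrong by hand, which is one reason to automate the tail design.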
Target sequences were amplified using the Q5 (NEB) high-fidelity polymerase following standard conditions. The product of this first PCR was then used as a megaprimer to insert the target sequence into the pUC-57 destination vector through a second PCR. To do this, 0.5 μl of the first PCR product is mixed with 50 ng of pUC-57 plasmid and a second PCR is carried out in a 25 μl reaction with the following conditions: 98˚C (30 s) / 30 cycles of 98˚C (10 s), 68˚C (30 s), 72˚C (30 s/kb) / 72˚C (2 min). The second PCR product is digested by the addition of 1 μl DpnI-FD (Thermo Fisher) for 1 hour at 37˚C to destroy the methylated pUC-57 template plasmid. After digestion, 10 μl of this reaction is transformed into DH5α or TOP10 bacterial strains and plated on LB plates supplemented with Ampicillin and X-Gal for Blue/White screening. If repetitive sequences are to be cloned, the use of the Stbl4 (Invitrogen) bacterial strain is recommended.
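As a small worked example of the cycling conditions above, the extension time at 30 s/kb can be computed from the length of the product. The sketch below assumes that the relevant length is the full plasmid plus insert; the function name and the 2710 bp figure used for pUC57 are illustrative assumptions, not values taken from this protocol.

```python
import math


def megaprimer_pcr_program(product_length_bp, seconds_per_kb=30, cycles=30):
    """Return the second-PCR program as (step, temperature in C, seconds, repeats)."""
    # Round the extension time up to the next whole second.
    extension_s = math.ceil(product_length_bp * seconds_per_kb / 1000)
    return [
        ("initial denaturation", 98, 30, 1),
        ("denaturation", 98, 10, cycles),
        ("annealing", 68, 30, cycles),
        ("extension", 72, extension_s, cycles),
        ("final extension", 72, 120, 1),
    ]


# Assumed example: a 2710 bp backbone carrying a 1000 bp insert.
program = megaprimer_pcr_program(2710 + 1000)
for step in program:
    print(step)
```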
Note, for primer design it is mandatory to check whether the target sequence of interest contains BsaI sites. If so, these have to be mutated by PCR.
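Checking a target sequence for internal BsaI sites is easy to automate. The following minimal sketch (the function name is hypothetical and it is not the code of the COSPLAY software) scans both strands, since a site on the bottom strand appears as the reverse complement GAGACC when reading the top strand.

```python
def find_bsai_sites(seq):
    """Return a sorted list of (0-based position, strand) for every BsaI site."""
    seq = seq.upper()
    hits = []
    # GGTCTC on the top strand; GAGACC is the same site on the bottom strand.
    for motif, strand in (("GGTCTC", "+"), ("GAGACC", "-")):
        start = seq.find(motif)
        while start != -1:
            hits.append((start, strand))
            start = seq.find(motif, start + 1)
    return sorted(hits)


print(find_bsai_sites("AAGGTCTCAATTGAGACCTT"))
```

An empty list means the sequence can be used as is; any hit has to be mutated before the module is built.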
Note, all individual modules were tested functionally by assembling expression vectors followed by transformation in yeast.

Expression vector assembly
A combination of 6 modules was assembled into a destination vector (modified version of pUC-57 that does not contain BsaI sites and carries a Chloramphenicol resistance cassette instead of Ampicillin) by an all-in-one reaction of digestion/ligation.

Yeast transformation
Yeast were transformed using methods involving lithium acetate, polyethylene glycol, denatured herring sperm DNA and sorbitol [26,27]. Transformants were selected by spinning down yeast cells after a heat shock and resuspending them in sterile water before plating on the appropriate dropout or drug-selection medium. For drug selection, the yeast were resuspended in YPD and allowed to recover before plating.

Microscope imaging
Cells were imaged using an inverted microscope (Zeiss Axio Observer Z1, or Nikon Tie). Wide-field epifluorescence illumination was achieved using an LED light source (precisExcite, CoolLed), and light was collected using a 100× N.A. 1.4 objective and an EM-CCD Luca-R camera (Andor).

Fluorescence quantification by flow cytometry
Cells were grown in SD media until 0.5 OD and then analyzed by flow cytometry using FACS Celesta (BD Biosciences, San Jose, CA). Cytometry data analysis was performed using custom Matlab 2017b scripts.

COSPLAY primers design software
The COSPLAY primer design software was developed in Matlab to simplify and automate the addition of new custom modules to the COSPLAY module collection. The software can be downloaded from GitHub: https://github.com/gcharvin/cosplay (as a Matlab application; requires Matlab version 2017b or higher).
The main functions of the COSPLAY software are:
1. Automated design of optimized primers for amplifying a specific target sequence. Several criteria (Primer Tm, length, self complementarity, 3' complementarity, matrix complementarity, 3' matrix complementarity, GC %, GC clamps, 3' stability, 3' GC%, primers pair Tm difference, cross complementarity, cross 3' complementarity) are used for this optimisation. The basic rules of optimisation are similar to those of the Primer3 software [28].
2. Addition of pUC57 MCS sequences, BsaI sites and specific overhangs depending on cloning position. Detection of BsaI sites in the target sequence and replacement with silent substitutions whenever it is possible (or suggestion of alternative strategies to discard the sites).
3. Estimation of the best PCR temperature conditions. The Tm of the primers is calculated using an accurate thermodynamic approach [29]. This Tm is optimised for Q5 polymerase, which is recommended in our protocol (if standard Taq polymerase is used, the Tm has to be lowered by 7˚C). The effect of the buffer salt concentration is also calculated [30].
4. Display of a full report about the quality of the primer pair when the entire tailed primers are automatically designed from a user-defined template, or indication of potential quality problems when only tails are added to user-defined primers.
See additional COSPLAY software user guide (S1 File).
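To illustrate the idea of removing an internal BsaI site by a silent substitution, here is a minimal Python sketch. It is not the algorithm implemented in the COSPLAY software: it assumes an in-frame coding sequence, uses the standard genetic code, and simply takes the first synonymous codon that reduces the number of sites. All function names are hypothetical.

```python
from itertools import product

BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code, codons enumerated in TCAG order.
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}
SITES = ("GGTCTC", "GAGACC")  # BsaI site on the top and bottom strand


def count_sites(seq):
    return sum(seq.count(m) for m in SITES)


def translate(orf):
    return "".join(CODON_TABLE[orf[i:i + 3]] for i in range(0, len(orf), 3))


def remove_bsai_silently(orf):
    """Replace codons overlapping BsaI sites by synonymous codons."""
    orf = orf.upper()
    if len(orf) % 3:
        raise ValueError("sequence must be an in-frame ORF")
    while count_sites(orf) > 0:
        pos = min(p for p in (orf.find(m) for m in SITES) if p != -1)
        before = count_sites(orf)
        # Try each codon that overlaps the 6 bp site.
        for codon_start in range(pos - pos % 3, pos + 6, 3):
            codon = orf[codon_start:codon_start + 3]
            for alt, aa in CODON_TABLE.items():
                if aa != CODON_TABLE[codon] or alt == codon:
                    continue
                candidate = orf[:codon_start] + alt + orf[codon_start + 3:]
                if count_sites(candidate) < before:
                    orf = candidate
                    break
            else:
                continue  # no synonymous codon helped; try the next codon
            break
        else:
            raise ValueError("no silent substitution removes the site at %d" % pos)
    return orf


fixed = remove_bsai_silently("ATGGGTCTCTAA")
print(fixed, translate(fixed))
```

A production tool would additionally weigh codon usage in yeast when choosing among synonymous codons; this sketch ignores that aspect.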

COSPLAY, a toolbox for rapid generation of yeast expression vectors
The aim of the COSPLAY toolbox is to make it easy, quick and efficient to generate new expression vectors containing different combinations of functional elements or modules. Each of these modules can be selected from a collection of ready-to-use plasmids, or can be easily generated from cDNA, as described below. The organization of a yeast expression vector generated with the COSPLAY toolbox is based on six functional modules assembled in a specific order into a destination vector (pUC57; Fig 1A). Each position is committed to a specific role. The first (Fig 1) determines the mode of plasmid replication (i.e. integrating, centromeric or 2μ). The second position carries the promoter and the third the sequence to be expressed. The fourth position holds an optional C-terminal element, which could be either a fluorescent protein or a degron to decrease the stability of the resulting protein. The fifth position consists of a transcriptional terminator sequence, followed by a selection marker (Position 6) used upon yeast transformation. The COSPLAY toolbox uses the Golden Gate cloning technology [21,22], which relies on the use of Type IIS restriction enzymes (e.g. BsaI, BpiI). These enzymes cleave outside of their recognition site (Fig 1B), leaving a 4-nt 5' overhang that can be freely designed (Fig 1B) and that dictates the compatibility between individual DNA fragments as well as the cloning order and orientation (Fig 1C). In addition, by placing the recognition sites at each extremity of the DNA fragment of interest (in opposing orientations), the restriction sites are lost upon ligation, thus allowing the restriction digestion and ligation reactions to be carried out simultaneously in a single tube (Fig 1C). The destination vector (pDV) has been engineered to contain a chloramphenicol resistance cassette (CmR) and carries only two BsaI restriction sites (compatible with the module library), which flank a LacZ cassette located upstream of an SV40 polyA sequence (Fig 1C).
This cassette is lost upon successful assembly of individual modules into the destination vector, which allows Blue/White screening to increase the cloning efficiency. The product of the restriction/ligation reaction is transformed into E. coli and grown on chloramphenicol plates. Since the destination vector and the backbone vector for the module library contain different antibiotic selection markers (Table 3), one can easily select for the destination vector only. Individual white colonies are grown in liquid culture, plasmid DNA is prepared, and vector integrity is verified by restriction digest using appropriate restriction enzymes and by sequencing. With the COSPLAY toolbox, any given construct can be assembled in 4 days with 100% cloning efficiency (see the cloning note in the Methods section).
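The way the 4-nt overhangs impose the order of the modules can be illustrated with a short Python sketch. Each module is reduced to its left and right overhang, and the assembly is resolved by chaining matching ends from one side of the destination vector to the other. The overhang sequences below are hypothetical placeholders (the actual ones are listed in Table 2), and the function is only an illustration of the principle, not a simulation of the reaction.

```python
def assembly_order(vector_ends, modules):
    """Return module names in the order imposed by their overhangs.

    vector_ends: (overhang on the upstream vector end, overhang on the downstream end)
    modules: dict mapping name -> (left overhang, right overhang)
    """
    start, end = vector_ends
    by_left = {}
    for name, (left, right) in modules.items():
        if left in by_left:
            raise ValueError("two modules compete for overhang " + left)
        by_left[left] = (name, right)
    order, current = [], start
    while current != end:
        if current not in by_left:
            raise ValueError("no module fits overhang " + current)
        name, current = by_left.pop(current)
        order.append(name)
    if by_left:
        raise ValueError("unused modules: " + ", ".join(n for n, _ in by_left.values()))
    return order


# Hypothetical overhangs, given in shuffled order on purpose.
modules = {
    "URA3": ("CGCT", "TGCC"),
    "sfGFP": ("AATG", "AGGT"),
    "CEN-ARS": ("GGAG", "TACT"),
    "ADH1t": ("GCTT", "CGCT"),
    "GAL1p": ("TACT", "AATG"),
    "STOP": ("AGGT", "GCTT"),
}
order = assembly_order(("GGAG", "TGCC"), modules)
print(order)
```

If a position is left empty, the chain cannot be closed and the function raises an error, which mirrors the fact that an incomplete set of modules cannot circularize with the destination vector.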

Combinatorial generation of a large number of expression vectors from a small collection of modules
We have generated a library of modules, cloned in specific positions that are readily available in the COSPLAY toolbox and which correspond to widely used sequences for yeast genetic engineering (Table 3 and S2 File).
Regarding position 1 (plasmid type), we have built modules that contain either a 2μ plasmid replication sequence (high copy number), a CEN-ARS sequence for transient extrachromosomal expression, or a nonfunctional short linker sequence to generate an integrating plasmid (in this case, homologous recombination leading to chromosomal integration is obtained by linearizing the plasmid within the selection marker). In position 2, we have generated widely used inducible or constitutive promoters, such as GAL1p, MET25p, ACT1p, CYC1p, TEF1p, ADH1p and TDH3p. Since classical expression vectors (pRS, pRSII, etc.) are often used to make fluorescent transcriptional reporters, we have generated modules in position 3 containing fluorescent proteins (i.e. superfolder GFP (sfGFP), mCherry, Venus). These sequences lack a stop codon and can thus either be destabilized by adding a CLN2-PEST degron cloned in position 4, or left stable if a stop codon is cloned in position 4. Furthermore, to enable the fusion of a C-terminal fluorescent tag to a protein of interest, we have generated additional modules in position 4 encoding sfGFP and mCherry (bearing a STOP codon). We have generated a module in position 3 carrying a nuclear localization signal (NLS) sequence based on three repeated sequences from the SV40 virus, in order to allow the expression of nuclear fluorescent reporters (when cloned with sfGFP or mCherry in position 4). In position 5, we have cloned a single module which carries the ADH1 transcriptional terminator. Finally, position 6 provides a large choice of universal yeast selection markers such as auxotrophy genes (i.e. TRP1, URA3, HIS3, LEU2) as well as drug resistance cassettes (i.e. KanMX, NatMX). All modules (Fig 1A and Table 3), in all positions, are compatible with each other within the framework of an expression vector containing 6 elements (Fig 1).
Therefore, despite the relatively small collection of functional modules, the COSPLAY toolbox offers extreme flexibility in the assembly of different expression vectors.
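The combinatorial gain can be made concrete with a quick count. The sketch below uses only the modules named in the paragraph above, which is an assumption of this example: Table 3 may list additional modules, and not every combination is necessarily of biological interest.

```python
from itertools import product
from math import prod

# Modules named in the text, grouped by position (illustrative subset).
library = {
    1: ["2micron", "CEN-ARS", "integrating linker"],
    2: ["GAL1p", "MET25p", "ACT1p", "CYC1p", "TEF1p", "ADH1p", "TDH3p"],
    3: ["sfGFP", "mCherry", "Venus", "NLS"],
    4: ["CLN2-PEST", "STOP", "sfGFP-STOP", "mCherry-STOP"],
    5: ["ADH1t"],
    6: ["TRP1", "URA3", "HIS3", "LEU2", "KanMX", "NatMX"],
}

n_modules = sum(len(m) for m in library.values())
n_vectors = prod(len(m) for m in library.values())
print(n_modules, "modules give", n_vectors, "possible 6-module vectors")

# The combinations themselves can be enumerated lazily if needed.
first = next(product(*(library[pos] for pos in sorted(library))))
print(first)
```

Adding a single new module at one position multiplies, rather than adds to, the number of accessible vectors.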

An efficient cloning method to generate new modules
In addition to the modules available in the COSPLAY library it is possible to easily generate new modules to meet user-specific needs (e.g. a specific cDNA, promoter, fluorescent protein, etc.). To this end, we have standardized a cloning-free two-step PCR process [37] whereby the sequence of interest is amplified by PCR using appropriate primers containing BsaI sites with appropriate overhangs and 25 bp of the pUC57 multiple cloning site (Fig 2A). The PCR product of this first reaction is then used as a mega-primer to amplify the pUC57 destination vector, leading to the direct integration of the target sequence into pUC57 (Fig 2A). The pUC57 plasmid contains a LacZ cassette, which is rendered nonfunctional when a DNA sequence is cloned in its multiple cloning site, thus allowing Blue/White screening after transformation in E. coli (Fig 2C).
In order to facilitate primer design to generate new modules, we have developed a custom program with a graphical user interface in Matlab (Fig 2B) that provides assistance in this process: the user either enters the sequence of primers homologous to the sequence to be integrated in the module or the complete sequence of the module. For each module position (1 to 6), the program adds the primer tails and runs a battery of tests to evaluate the quality of the primers. In addition, the program identifies potentially conflicting BsaI restriction sites in the module sequence and proposes strategies to mutate them through fusion PCR. A comprehensive documentation of the program is provided as supplementary material, and the program can be downloaded from GitHub: https://github.com/gcharvin/cosplay.

Generation of transcriptional reporters as an application of the COSPLAY toolbox
As a proof of principle, we generated transcriptional reporter plasmids for the GAL1 and CYC1 promoters using mCherry and sfGFP fluorescent proteins, respectively (Fig 3A and 3B). For the galactose-inducible promoter (GAL1p), cells were grown in SD medium supplemented with 2% dextrose, then transferred to SD medium supplemented with 2% raffinose and 1.5% galactose before observation under the microscope for fluorescent protein expression. As expected, when the SV40 nuclear localization sequence was cloned in position 3, the fluorescent signal was entirely nuclear (constitutive expression driven by ACT1p; Fig 3C and 3D).

Single copy integration of plasmids using the COSPLAY toolbox
Quantitative studies in which ectopic gene expression is used require precise control of plasmid copy number. Unlike centromeric and multicopy 2μ plasmids, integrative plasmids guarantee stable expression over time. However, multiple integration events leading to variations in plasmid copy number across selected clones are not uncommon [38]. This is due to the fact that the integration site is duplicated following the first homologous recombination, hence enabling subsequent integrations (S1 Fig). Accordingly, it has been shown that the standard integrative pRS plasmids have a strong tendency to integrate multiple times at the same genomic locus, while single integrations represent only 12.5% of events (when transformed with 1.5 μg of linearized plasmid) [38]. Therefore, the number of integrated copies must be assessed quantitatively, either by directly scoring integration events (e.g. using Southern blot) or by measuring the expression level of the gene of interest (e.g. using qPCR, western blot, cytometry, etc.).
To overcome this limitation, we developed a module in position 1 that ensures a single copy integration of the expression vector to a specific target region in the genome. This module is based on two inverted sequences (separated by the rare AscI and FseI restriction sites) which are homologous to a small genomic region (see Fig 4A and Methods). To facilitate module variant design, we have added an option in the software that takes this additional constraint into account for the generation of primers.
After AscI-driven linearization (or FseI if AscI is present in the inverted sequences), an assembled plasmid containing this module can be integrated at the homologous genomic locus of interest (Fig 4B). Subsequent plasmid integration is still possible, yet such an event will only replace the already integrated plasmid copy (see S1 Fig). Thus, only a single copy of a plasmid containing this particular module should be inserted in the genome, unlike chromosomal integration at the locus of an auxotrophic marker. Indeed, the fraction of single integration events has already been estimated to be close to 98% [38].
Fig 4. Principle of integration to a target chromosomal locus. A) A given locus is represented with its 5' (green) and 3' (yellow) domains. Primers are designed so that the two sequences are placed in reverse order (3' region upstream of 5'), and AscI and FseI restriction sites are added in between these regions; B) The module-carrying plasmid and the final assembled vector are generated according to the standard procedure described in the text; C) Following AscI (or alternatively, FseI) digestion, the linearized plasmid is integrated at the target locus.
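The layout of the single-copy integration module (3' region placed upstream of the 5' region, with AscI and FseI sites in between) and the choice of the linearizing enzyme can be sketched as follows. This is an illustration only: splitting the target locus at its midpoint and the function names are assumptions of this example, not the rule used by the COSPLAY software.

```python
ASCI = "GGCGCGCC"  # AscI recognition site (palindromic)
FSEI = "GGCCGGCC"  # FseI recognition site (palindromic)


def occurrences(seq, site):
    """Count occurrences of site in seq, including overlapping ones."""
    count, start = 0, seq.find(site)
    while start != -1:
        count += 1
        start = seq.find(site, start + 1)
    return count


def integration_module(locus_seq):
    """Return (module sequence, enzyme to use for linearization)."""
    locus = locus_seq.upper()
    half = len(locus) // 2  # assumption: split the locus at its midpoint
    five_prime, three_prime = locus[:half], locus[half:]
    # Inverted order: cutting between the arms yields a linear plasmid
    # with the 5' arm at one end and the 3' arm at the other.
    module = three_prime + ASCI + FSEI + five_prime
    for enzyme, site in (("AscI", ASCI), ("FseI", FSEI)):
        if occurrences(module, site) == 1:
            return module, enzyme
    raise ValueError("neither AscI nor FseI cuts the module exactly once")


locus = "ACGTTGCA" * 19  # placeholder sequence, not a real yeast locus
module, enzyme = integration_module(locus)
print(enzyme, len(module))
```

In a full design, the rest of the assembled plasmid would also have to be checked for the chosen site, since only the module itself is inspected here.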
To check the validity of this approach, we selected a target integration locus that fulfills the following criteria: 1) it must be far from telomeric and centromeric regions; 2) it must be in an ORF-free region of at least 1000 bp, flanked by ORFs that share neither the same regulation nor the same function; 3) it must be in a region of at least 200 bp devoid of protein binding sites and non-coding RNA sequences. We found such loci in chromosome VI (position 260998 to 261148) and in chromosome IX (position 301997 to 302147). We built the corresponding modules in position 1 using appropriate primer pairs as described above. Next, we generated plasmids that contain these sequences, as well as an sfGFP transcriptional reporter of the GAL1 promoter and a URA3 selection marker. These plasmids were linearized using either AscI (to target the specific loci on chromosome VI or IX) or EcoRV (to target the URA3 locus), and all linearized plasmids were independently transformed into yeast (Fig 5A).
To assess the number of plasmid copies integrated at the chromosome VI, chromosome IX or URA3 locus, we measured the protein expression level of more than 100 randomly selected clones of each type by flow cytometry (Fig 5B). As expected, recombinant clones exhibited a ~100-fold higher fluorescence level than the negative control (Fig 5B). We then quantified the fraction of recombinant clones for both URA3 integrations and integrations targeted to specific chromosomal loci (Fig 5C), and we calculated the median fluorescence level of each clone (Fig 5D). As expected, standard integration based on linearization within the URA3 auxotrophic marker occasionally led to multiple integrations (Fig 5D). In contrast, targeting specific loci on chromosome VI or IX led to single-copy integrations (Fig 5D). Therefore, these results indicate that designing a custom module in position 1 targeting a specific chromosomal locus provides a straightforward way to ensure a single-copy integration.
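The analysis described above (fraction of positive clones, then copy number inferred from the median fluorescence of each clone) can be sketched in Python as follows. The numbers are made up for illustration and are not the data of Fig 5; the 10-fold threshold over background and the assumption that fluorescence scales linearly with copy number are simplifications of this sketch, and the original analysis was carried out with custom Matlab scripts.

```python
def classify_clones(clone_medians, background, single_copy_level, threshold_fold=10):
    """Return (fraction of positive clones, estimated copy number per positive clone)."""
    # A clone is scored positive if it is well above the negative control.
    positive = [m for m in clone_medians if m > threshold_fold * background]
    fraction_positive = len(positive) / len(clone_medians)
    # Assumed linear relation between fluorescence and copy number.
    copies = [max(1, round(m / single_copy_level)) for m in positive]
    return fraction_positive, copies


# Hypothetical median fluorescence values (arbitrary units), one per clone.
medians = [1000, 1050, 2100, 12, 980, 3050]
fraction, copies = classify_clones(medians, background=10, single_copy_level=1000)
print(fraction, copies)
```

With real data, the single-copy level would be taken from the lowest peak of the distribution of clone medians rather than supplied by hand.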

Discussion
In this study, we have developed a new toolbox called COSPLAY that provides fast and easy access to a high number of plasmid variants by handling the combinatorial part of the expression vector design. Our toolbox is based on a ready-to-use collection of individual functional modules and a one-step protocol to rapidly assemble a combination of modules into a functional expression vector. Each module can be assembled with any of the other modules, leading to a large number of potential expression vectors.
A particular feature of the COSPLAY toolbox is that the collection of modules can be easily expanded using the MEGAWHOP protocol [37]. Here, specific sequences of interest (promoters, cDNAs, fluorescent reporters, epitope tags, etc.) are seamlessly cloned by PCR, thus producing the possibility to expand the flexibility of the COSPLAY toolbox at will. To streamline this process, we have developed a standardized protocol that makes the generation of new modules very straightforward.
A potential future expansion of the COSPLAY toolbox would incorporate some of the recent advances in the field of synthetic transcriptional regulation in Saccharomyces cerevisiae. While the COSPLAY toolbox already includes some widely used promoters/terminators, new sequences have been developed to improve gene expression robustness [39] [40], inducibility [41][42] [43], as well as mRNA stability [44]. These elements could be easily integrated within the COSPLAY toolbox to further increase the number of potential expression vectors.
While other combinatorial strategies for plasmid generation based on the Golden Gate system already exist for yeast [25], we made the choice to develop a toolbox that is simpler yet recapitulates all the features of the pRSII series plasmids. Indeed, our toolbox is built on an expression vector scheme containing only 6 module types (versus 8-11 for alternative strategies [25]). Moreover, we decided to use only one type IIS restriction enzyme (BsaI) for expression vector assembly. This further simplifies the toolbox, since a smaller number of modules needs to be assembled and there are fewer constraints regarding the presence of specific restriction sites in the module sequence. Last, our custom software facilitates the design of primers for the generation of new module variants, especially when cloned sequences feature unwanted BsaI restriction sites: in this case, the software assists the user by proposing potential design strategies to remove them. Altogether, this toolbox provides a simple and efficient expression system that might be of particular interest in labs with little experience in yeast molecular biology. The COSPLAY toolbox also allows for single integration using a single module that can be easily designed with our software, instead of the 3 modules required in alternative methods [25].
To conclude, the COSPLAY toolbox is extremely flexible and efficient, simple to use, easily expandable and complementary to alternative Golden Gate cloning-based systems for yeast genetic editing.
Fig 5. Single copy chromosomal integration. A) A sample plasmid containing the indicated modules is linearized by cutting either within the target locus (using AscI, see Fig 4 and corresponding text) or within the auxotrophic marker to compare the number of copies integrated at each locus. B) Flow cytometry analysis of GFP expression in yeast transformed with URA3, ChrVI or ChrIX integrating plasmids. Each colored line represents the distribution of fluorescence obtained with a given clone. The black line represents the control. C) Fraction of positive clones (i.e. selected clones with a fluorescence higher than background) for strains transformed at the URA3 locus versus a specific target locus; D) Distribution of median fluorescence of the positive clones (reported in C) selected after transformation at the URA3 locus versus the specific target loci (ChrVI-260998 or ChrIX-301997). In the case of URA3 (top plot), the histogram shows distinct peaks corresponding to 1- and 2-copy plasmid integrations as well as higher numbers of integrations. For the specific target loci (bottom plot), only the 1-copy peak is present.