CRISPR-induced indels and base editing using the Staphylococcus aureus Cas9 in potato

Genome editing is now widely used in plant science for both basic research and molecular crop breeding. The clustered regularly interspaced short palindromic repeats (CRISPR) technology, through its precision, high efficiency and versatility, allows for editing of many sites in plant genomes. This system has been highly successful to produce knock-out mutants through the introduction of frameshift mutations due to error-prone repair pathways. Nevertheless, recent new CRISPR-based technologies such as base editing and prime editing can generate precise and on demand nucleotide conversion, allowing for fine-tuning of protein function and generating gain-of-function mutants. However, genome editing through CRISPR systems still have some drawbacks and limitations, such as the PAM restriction and the need for more diversity in CRISPR tools to mediate different simultaneous catalytic activities. In this study, we successfully used the CRISPR-Cas9 system from Staphylococcus aureus (SaCas9) for the introduction of frameshift mutations in the tetraploid genome of the cultivated potato (Solanum tuberosum). We also developed a S. aureus-cytosine base editor that mediate nucleotide conversions, allowing for precise modification of specific residues or regulatory elements in potato. Our proof-of-concept in potato expand the plant dicot CRISPR toolbox for biotechnology and precision breeding applications.


Introduction
The recent and extensive development of plant genome editing in the last few years has opened new avenues and exciting perspectives for both basic research and crop breeding. The class 2 type II CRISPR-Cas9 genome editing system from Streptococcus pyogenes has been broadly adopted by the plant science community, and consists in a two-components complex made of the DNA endonuclease SpCas9 and a customizable single guide RNA (sgRNA) [1]. This complex scans the genome, searching for a 5'-NGG-3' protospacer adjacent motif (PAM), and triggers local DNA melting and interrogation of adjacent DNA sequence for complementarity with the customizable spacer sequence at the 5'end of the sgRNA, eventually resulting in double strand DNA break (DSB) about 3-bp upstream of the PAM by the concerted activity of HNH and RuvC nuclease domains [2]. Once a DSB is created, the error-prone non-homologous end-joining (NHEJ) DNA repair pathway is activated [3], leading potentially to random small insertions or deletions (indels) mutations at the breaking site and eventually to gene knockout through frameshift mutations. While most studies focused on the production of loss-of-function alleles so far, new CRISPR tools have been recently developed, such as the CRISPR-mediated base editing system that allows for precise base conversion without a donor DNA or the induction of a DSB [4]. So far, two kinds of base editors (BEs) have been developed: cytosine base editors (CBEs) [5] and adenine base editors (ABEs) [6], whose architectures consist in the fusion of a Cas9 with an impaired DNA cleavage activity, generally a nickase Cas9 (nCas9) for plant applications, and a catalytic domain involved in cytosine or adenine deamination, respectively. After fixation of the nCas9 to its genomic target, a small window of the non-targeted ssDNA can be a substrate for deaminase domains. While ABEs almost exclusively mediate A-to-G conversion [6], CBEs can result in C-to-T, C-to-G and C-to-A according to the architecture of the BE [7].
Although the CRISPR-SpCas9 system revolutionized plant functional genomics, several other Cas9 enzymes from diverse bacteria have been used as an alternative for genome editing in plants, including the Staphylococcus aureus Cas9 (SaCas9) [8,9]. Use of SaCas9 for plant genome editing presents different assets. First, because the PAM recognized by the SaCas9 (5'-NNGRRT-3') [10][11][12][13] is different from the canonical 5'-NGG-3' PAM from SpCas9 (where N is for any nucleotide while R can be A or G), its use expands the number of sites that can be targeted in a given genome. In addition, the fact that the PAM of SaCas9 is more sophisticated than the one of SpCas9 may allow an increase in the specificity of the system by limiting the off-target activity, in particular for highly conserved genomic regions that are frequent in polyploid species. Furthermore, SaCas9 has been reported to edit target sequences with efficiencies greater than or similar to SpCas9 in Arabidopsis [14], rice, tobacco [12] and citrus [13]. Finally, because SaCas9 is smaller than SpCas9 (1053 vs 1368 amino acids), the delivery into plant cells could be easier, especially for strategies involving virus vectors. To date, CRISPR-SaCas9 has been used in different plant species for both gene knockout and/or base editing applications, including tobacco [12], Arabidopsis [14], citrus [13] and rice [12,15,16].
The cultivated and tetraploid potato (Solanum tuberosum) has received a lot of attention for genome editing in recent years by several groups. These achievements allowed for the production of plants with new agronomic traits using gene knockout and/or base editing approaches, such as the production of tubers with low levels of amylose [17][18][19] or with improved resistance to harvest and post-harvest processes [20]. However, all the studies carried out so far on potato have used the native or engineered variants [21] of SpCas9, emphasizing the need to expand the CRISPR toolbox for this species which constitutes one of the most important crops for food production worldwide. In this study, we report on the successful use of the SaCas9 enzyme for both knockout and base editing applications in the tetraploid potato, confirming that the CRISPR-SaCas9 system constitutes a relevant alternative to the classical CRISPR-Sp-Cas9 technology for functional studies and plant breeding.

CRISPR-SaCas9-mediated gene editing of the potato genome
In order to evaluate the efficiency of the CRISPR-SaCas9 system in potato, we first designed two sets of two sgRNAs each based on the potato reference genome and on the Sanger sequencing of the targeted loci in the Desiree genome using a TA cloning strategy designed to assess the allelic variability. The first set targeted the StGBSSI (PGSC0003DMG400012111) and StDMR6-1 (PGSC0003DMG400000582) genes with spacers of 20-bp sequence length (sgRNA1 and 2), while the second set targeted the same loci but with spacers of 24-bp sequence length (sgRNA3 and 4) (Fig 1). All the spacer sequences were chosen upstream of a 5'-NNGGAT-3' PAM with a high specificity score according to the CRISPOR software (http://crispor.tefor. net/) [22], selecting spacer sequences harboring at least 4 mismatches with other loci in the genome. For expression of the CRISPR-SaCas9 system in potato cells, we cloned each set of sgRNA cassettes (Arabidopsis U6-26 promoter/spacer sequence/sgRNA scaffold) into the binary vector previously used in Arabidopsis [14], resulting into the pDeSaCas9/sgRNA1-2 and the pDeSaCas9/sgRNA3-4 plasmids (Fig 1). The delivery of the CRISPR components into potato cells, was performed via Agrobacterium-mediated stable transformation of potato explants. Genomic DNA from transgenic plants was extracted and the target loci were first analyzed by high resolution melting (HRM) analysis. A Sanger sequencing directly on PCR products was then performed for HRM positive plants to validate HRM results and identify the nature of the mutations. For the pDeSaCas9/ sgRNA1-2 condition, among the 33 transgenic plants, none of them was mutated at the StGBSSI locus (sgRNA1), while 11 plants (33% efficiency) were found to be mutated in the StDMR6-1 target sequence (sgRNA2) (Fig 2A). For the pDeSaCas9/sgRNA3-4 condition, among the 27 transgenic plants, none of them displayed mutations at the StGBSSI locus (sgRNA3), while 4 plants (15% efficiency) were found to be mutated at the StDMR6-1 target site (sgRNA4) (Fig 2A). These results indicate that the SaCas9 can be used for gene editing of the potato genome, with spacer sequences of up to 24-bp. The observation that no editing activity was detected at the StGBSSI target locus for both spacer lengths (sgRNA1 and 3) may be due to the presence of an inefficient motif in the spacer sequence. However, none of the two motifs identified as inefficient in a previous study [23] was present in our spacer sequences, suggesting that the lack of editing at this locus is due to another factor, such as the genomic context that may interfere with SaCas9 binding and cleavage. Therefore, these results underscore the need to test independent spacer sequences for a target gene in order to maximize the likelihood of successful editing. Based on manual analysis of the Sanger chromatograms obtained after direct sequencing of the PCR products for sgRNA2 and 4 (StDMR6-1 locus), we found that frameshift mutations mostly occurred about 4-bp upstream of the PAM sequence (Fig 2B), as previously reported for SaCas9 [13,14]. For each chromatogram, we found an unambiguous wild-type sequence trace (Fig 2B), indicating that CRISPR-induced indels did not occur for all the alleles. Intriguingly, Sanger sequencing analysis directly on PCR products from the StDMR6-1 targeted sequence in the wild-type (Desiree cultivar) identified one natural SNP (T/A) (Fig 2B), that is present at the 5'end of target sequence of sgRNA2 and sgRNA4 (position -19 from the PAM). This SNP was also identified in a recently released SNP map from the Desiree genome [24], which is predicted to be present on two alleles. These results suggest that a bias occurred  Table summarizing the efficiencies of editing of the StGBSSI and StDMR6-1 targeted loci using both HRM analysis and Sanger sequencing. B) Sanger chromatograms of some CRISPR-SaCas9-edited potato plants at the StDMR6-1 gene with the pDeSaCas9/sgRNA1-2 for mutants # 3, 9 and 10 and with the pDeSaCas9/sgRNA3-4 for mutants # 12 and 13. The PAM, which is located on the reverse strand, is indicated in red and the spacer sequence in purple. C) Table summarizing the results of Sanger sequencing for 4 mutants after cloning of individual PCR products through TA cloning. Numbers in brackets represent the number of sequencing reads for each mutation type. Mutants # 9 and 10 and mutants # 12 and 13 were edited using the pDeSaCas9/sgRNA1-2 and pDeSaCas9/sgRNA3-4 constructs, respectively.

PLOS ONE
https://doi.org/10.1371/journal.pone.0235942.g002 during the initial TA cloning analysis that aimed to capture the allelic diversity at the StDMR6-1 locus from Desiree, although the primers used in this study were located in totally conserved regions. Although this SNP was present at the distal end of the target sequence in the non-seed region [25], its presence may affect overall editing efficiency and prevent the production of tetra-allelic mutants.
To characterize in more details the SaCas9-mediated editing footprint at the StDMR6-1 target site, we sequenced individual PCR amplicon after a TA-cloning reaction for 4 independent mutated plants, being aware that we did not capture all the allelic diversity due to the bias observed above. Most of the mutations were small indels about 3/4-bp upstream of the PAM (Fig 2C and S1 Fig), confirming the results from PCR products sequencing. However, we also observed a 82-bp deletion for one plant, showing that large sequence rearrangement can occur at the target site (Fig 2C and S1 Fig). Interestingly, this large deletion may be the product of a microhomology-mediated end-joining (MMEJ) repair pathway [26], as suggested by the presence of a TCCA homology at both sides of the deleted fragment (S1 Fig). Furthermore, we observed four different mutated alleles in addition to the wild type sequences, indicating that this plant is mosaic. It is well established that Agrobacterium-mediated transformation often results in the production of mosaic plants, as previously reported in potato [19]. As a result, the mutations detected in the chromatograms could be explained by both mosaic and stable mutations. As an alternative method to increase the rate of stable mutations, protoplast-mediated transient transfection could be applied using plasmid DNA or ribonucleoproteins (RNPs), as previously shown in potato [17][18][19][20].
Taken together and compared to our previous work on genome editing in potato [19], our results show that SaCas9 constitutes an alternative to the classical SpCas9. As previous data showed that SaCas9 and SpCas9 could edit different plant genomes with a comparable or even higher efficiency [12][13][14], the efficiency of SaCas9 in potato needs to be further investigated by targeting several other loci before any conclusion can be made on the relative efficiency of SaCas9 compared to SpCas9 in this species.

CRISPR-SanCas9-mediated cytosine base editing of the potato genome
Because the introduction of precise nucleotide substitutions is of upmost importance for both functional genomics (e.g. protein domain characterization) and plant breeding (e.g. gain of function variants), and because knockout mutants can have growth penalties compared to functional allelic variants, we next decided to develop a CRISPR-SanCas9 cytosine base editor to mediate cytosine substitution. We first introduced a point mutation in the SaCas9 sequence to produce a SanCas9 (D10A) that we fused to a dicot codon-optimized fragment consisting in a cytosine deaminase (PmCDA1) and an uracil glycosylase inhibitor (UGI) domain. This fusion protein was cloned into a modified version of the pDe backbone [21,27,28], resulting in the pDeSanCas9_PmCDA1_UGI binary vector for expression in dicot species (Fig 3A). The four sgRNAs used for CRISPR-mediated indels were individually cloned into this CBE through Gateway cloning (Fig 3B), each spacer harboring two to five cytosines in the putative editing window (distal part of the spacer sequence from the PAM) for this CBE, based on previous studies using the PmCDA1 enzyme in plants with SpnCas9 or SanCas9 [15,19,21,29,30] ( Fig 3B).
The delivery of the CBEs into potato cells was performed via Agrobacterium-mediated stable transformation and potato explants were grown on kanamycin-containing medium for several weeks. For both constructs targeting the StGBSSI gene (sgRNA1 and sgRNA3), none of the transgenic plants displayed mutations according to HRM analysis, which indicates, together with the inability to induce indels at this locus with the SaCas9 nuclease, that the spacer sequences and/or the targeted locus display characteristics preventing an efficient fixation of the CRISPR complex. For the pDeSanCas9_PmCDA1_UGI/sgRNA2 construct harboring a 20-bp spacer sequence that targets StDMR6-1, we did not find any base edited plant, suggesting that cytosine conversion occurs with lower efficiency than indel production at this locus. However, we identified three mutated plants for the pDeSanCas9_PmCDA1_UGI/ sgRNA4 construct that harbors a 24-bp spacer sequence targeting StDMR6-1. One of these mutants (#16) experienced indel mutations at one or more targeted alleles, while two mutants (#17 and #18) correspond to plants that experienced targeted cytosine conversion without the production of unwanted indels (Fig 4). Although this 24-bp spacer sequence was less efficient than the corresponding 20-bp spacer sequence to induce indels mutations (Fig 2A), its higher efficiency for cytosine base editing may be due to the presence of additional cytosines in the editing window at the extended 5'end of the spacer (Fig 3B). Supporting this hypothesis, we found that base conversion only occurred at C -23 and C -22 (counting from the PAM) in the two base edited plants #17 and #18 (Fig 4B). Interestingly, despite the presence of an UGI domain, our CBE construct was able to mediate both transition (C-to-T) and transversion (C- to-G) mutations (Fig 4B), increasing the diversify of edits, albeit at the cost of indel formation that occurred at the 5'end of the target sequence for one plant. This observation suggests that cytosine deamination-associated DNA repair mechanisms are involved in the production of this by-product.
To summarize, the CRISPR-SanCas9 CBE permitted cytosine base conversion at distal location from the 5'-NNGGAT-3' PAM in the cultivated potato, which is to our knowledge the first report of such application in a dicot species. The CRISPR-SanCas9 CBE developed in this study represents a complementary tool to the previously described SpnCas9 based CBE and, thanks to its capacity to hybridize with 18-24-bp guide sequences [11], may be useful to efficiently target specific nucleotide(s) at the distal part of longer spacer sequences, as demonstrated here.  Table summarizing the base editing efficiencies at the StGBSSI and StDMR6-1 targeted loci using both HRM analysis and Sanger sequencing. B) Sanger chromatograms of the three CRISPR-SaCBE-edited potato plants at the StDMR6-1 gene with the pDeSanCas9_PmCDA1_UGI/sgRNA4. Because the PAM (in red) is located on the reverse strand, and in order to avoid confusion, we sequenced using a reverse primer to clearly identify the C conversion. The spacer sequence is represented in purple. https://doi.org/10.1371/journal.pone.0235942.g004

Concluding remarks
The CRISPR-SaCas9 tools used and developed in this study broaden the scope of genome editing applications for potato, but also for dicot species in general. While the use of SaCas9 that recognizes a sophisticated 5'-NNGRRT-3' PAM may be useful to limit off-target activity at conserved sequences, this enzyme suffers from a narrowed targeting scope for base editing experiments due to the low occurrence of the PAM and the necessity to place the targeted base (s) in small editing window. In order to unleash the base editing potential of SanCas9, a San-Cas9-KKH variant has been engineered, that recognizes the relaxed 5'-NNNRRT-3' PAM, and has been used successfully in rice for both adenine and cytosine conversion [15,16,31]. This SanCas9 variant could be of particular interest in dicot species. Finally, validation in potato of the use of SpCas9 and SaCas9, which are associated with distinct sgRNA scaffolds, allows their simultaneous use to perform different catalytic functions (e.g. gene knock out, base editing, prime editing, transcription regulation, epigenome modulation) in a single transformation step and extends the possibilities of genome engineering in this essential crop.

Plant material
The potato cultivar Desiree (ZPC, the Netherlands) was propagated in sterile conditions in 1X MS medium including vitamins at pH 5.8 (Duchefa, the Netherlands), 0.4 mg/L thiamine hydrochloride (Sigma-Aldrich, USA), 2.5% sucrose and 0.8% agar powder (VWR, USA). Plants were cultured in vitro in a growth chamber at 19˚C with a 16:8 h L/D photoperiod.

Cloning procedures
The entry plasmid pEn_Sa_Chimera for spacer cloning and the binary vector pDeSaCas9 were kindly provided by Holger Puchta [14]. For spacer cloning, the pEn_Sa_Chimera entry plasmid was digested by BbsI and annealed oligonucleotides bearing complementary overhangs were ligated through T4 DNA ligase (ThermoFisher Scientific, USA) (S1 Table). For multiplex editing using the pDeSaCas9/sgRNA1-2 and pDeSaCas9/sgRNA3-4, sgRNA1 and sgRNA3 were introduced into the pDeSaCas9 backbone through MluI restriction and T4 DNA ligation (ThermoFisher Scientific, USA), while sgRNA2 and sgRNA4 were then introduced through a LR Gateway reaction (ThermoFisher Scientific, USA). The resulting plasmids were checked by restriction digestion and Sanger sequencing (Fig 1 and S1 Table).
The pDeSanCas9_PmCDA1_UGI binary plasmid was produced as follow. The SanCas9 sequence was produced through PCR amplification with the Superfi DNA polymerase (Ther-moFisher Scientific, USA) using a forward primer bearing polymorphism for D10A amino acid shift (S1 Table), devoid of a STOP codon. The PCR fragment was cloned into an intermediate pTwist plasmid through MluI/EcoRI restriction followed by T4 DNA ligation (Thermo-Fisher Scientific, USA). A sequence encoding the PmCDA1 and UGI catalytic domains was previously dicot-codon optimized and synthesized (TwistBioscience, USA) [21], and cloned into the intermediate pTwist plasmid through EcoRI restriction and T4 DNA ligation (Ther-moFisher Scientific, USA), downstream of the SanCas9 coding sequence. The construct was checked by sanger sequencing (S1 Table). The SanCas9_PmCDA1_UGI sequence (S2 Fig) was then cloned into a modified pDeCas9 backbone [28] through AscI restriction and T4 DNA ligation (ThermoFisher Scientific, USA). The final pDeSanCas9_PmCDA1_UGI was checked by restriction ligation and Sanger sequencing (Fig 3A and S1 Table). Previously built sgRNA cassettes were then individually cloned into the Sa_CBE plasmid through a LR Gateway reaction (ThermoFisher Scientific, USA). The resulting plasmids were checked by restriction digestion and Sanger sequencing (Fig 3B and S1 Table).

Agrobacterium-mediated transformation and plant regeneration
Binary plasmids described above were transferred into Agrobacterium C58pMP90 strain by heat shock. Agrobacterium-mediated stable plant transformation and plant regeneration were performed on explants of the Desiree cultivar, as previously described [19]. Plant tissues were cultured on 50 mg/L kanamycin for 6-12 weeks and regenerated stems were then cut and individually grown on a kanamycin-free culture medium for few weeks. For pDeSaCas9/sgRNA1-2, pDeSaCas9/sgRNA3-4, pDeSanCas9_PmCDA1_UGI/sgRNA1 and pDeSanCas9_PmC-DA1_UGI/sgRNA3 constructs, the identification of transgenic plants was performed by a rooting test on a culture medium containing 50 mg/L kanamycin. For the pDeSanCas9_PmC-DA1_UGI/sgRNA2 and pDeSanCas9_PmCDA1_UGI/sgRNA4 constructs, transgenic plants were identified by testing for the presence of the T-DNA by PCR using primers matching the nptII gene (S1 Table).

Target site genotyping
Genomic DNA from control and regenerated plants was extracted using the NucleoSpin Plant II kit (Macherey-Nagel, Germany) according to the manufacturer's instructions. HRM analysis was performed using the High Resolution Melting Master (Roche Applied Science, Germany) on the LightCycler1 480 II system (Roche Applied Science, Germany) (S1 Table), as previously described [19]. Plants harboring a HRM mutated profile were then Sanger sequenced (Genoscreen, France) (S1 Table). Plants harboring mutations at the StDMR6-1 locus with the pDeSaCas9 constructs were further analyzed by cloning the PCR products (Superfi DNA polymerase, ThermoFisher Scientific, USA) into the pCR4-TOPO TA vector (ThermoFisher Scientific, USA), followed by Sanger sequencing (Genoscreen, France). The SanCas9 sequence is in blue, the two NLS sequences in purple, the PmCDA1 sequence in green and the UGI sequence in red. All the coding sequence is optimized for expression in dicot species. (DOCX) S1 Table. List of primers used in this study. (XLSX) for his efficient management of the GENIUS project. We acknowledge the BrACySol BRC (INRAE, Ploudaniel, France) for providing the potato plants used in this study.