Fig 1.
A GFP-reporter assay for Cas9 screening.
(A) Schematic diagram of the GFP-reporter assay. A lentiviral vector carries a CMV-driven GFP that is disrupted by the insertion of a target sequence followed by a 7-bp random sequence between the ATG and the GFP coding sequence. The library DNA is stably integrated into HEK293T cells. Genome editing induces GFP expression in a portion of the cells. (B) Phylogenetic tree of the Cas9 orthologs from different bacterial strains selected for activity screening. Seven validated Cas9 orthologs (green) are used as references. (C) Transfection of SauriCas9 with gRNA resulted in GFP expression, whereas transfection of SauriCas9 alone did not induce GFP expression. BF, bright field; CMV, cytomegalovirus; GFP, green fluorescent protein; gRNA, guide RNA; LTR, long terminal repeat; SauriCas9, Cas9 nuclease from S. auricularis.
Fig 2.
PAM sequence analysis for SauriCas9.
(A) Genetic locus of CRISPR/SauriCas9. (B) Twelve CRISPR array repeat sequences were identified in the CRISPR/SauriCas9 locus. Nucleotide mutations are shown in red. (C) Deep sequencing revealed that targets with an NNGG PAM can be efficiently edited in the GFP-reporter assay. The GFP sequence is shown in green; insertion mutations are shown in red; NNGG PAM sequences are highlighted in yellow; a GCG trinucleotide is used to fix the 7-bp random sequence. (D) WebLogo generated from deep-sequencing data. (E) PAM wheel generated from deep-sequencing data. GFP, green fluorescent protein; PAM, protospacer adjacent motif; SauriCas9, Cas9 nuclease from S. auricularis.
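As an illustration of the kind of tally that underlies a WebLogo or PAM wheel, the following minimal Python sketch counts per-position base frequencies across the 7-bp random sequences recovered from edited reads. It is not the analysis pipeline used for this figure. The function names (pam_position_frequencies, matches_nngg) and the example sequences are hypothetical, and the sketch assumes the 7-bp sequences have already been extracted from the reads and are all the same length.

```python
from collections import Counter


def pam_position_frequencies(pams):
    """Return, for each position, the fraction of A/C/G/T among the sequences.

    Assumes every sequence has the same length (e.g. the 7-bp random region).
    """
    if not pams:
        raise ValueError("no sequences given")
    length = len(pams[0])
    if any(len(p) != length for p in pams):
        raise ValueError("all sequences must have the same length")
    freqs = []
    for i in range(length):
        counts = Counter(p[i].upper() for p in pams)
        total = sum(counts.values())
        freqs.append({base: counts.get(base, 0) / total for base in "ACGT"})
    return freqs


def matches_nngg(seq):
    """True if positions 3 and 4 (1-based) are both G, i.e. the sequence starts with NNGG."""
    return len(seq) >= 4 and seq[2:4].upper() == "GG"


# Hypothetical 7-bp sequences, invented for illustration only (not data from the paper).
example = ["ATGGCAT", "CCGGTTA", "GAGGACC", "TTGGGCA"]
freqs = pam_position_frequencies(example)
nngg_fraction = sum(matches_nngg(s) for s in example) / len(example)
```

In this toy input, positions 3 and 4 are G in every sequence while positions 1 and 2 are evenly mixed, which is the pattern an NNGG PAM would produce in a sequence logo.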
Fig 3.
Genome editing capability of SauriCas9.
(A) A GFP-reporter construct is used to compare the activity of NNGG and NNNGG PAMs. Target sequences are shown below. PAM sequences are shown in blue. (B) Transfection of SauriCas9 with gRNA resulted in GFP expression. Quantification is shown on the right. Underlying data for all summary statistics can be found in S1 Data. (C) Genome editing with SauriCas9 and SaCas9 at 12 endogenous loci (n ≥ 2). PAMs are underlined. Underlying data for all summary statistics can be found in S1 Data. GFP, green fluorescent protein; gRNA, guide RNA; PAM, protospacer adjacent motif; SaCas9, S. aureus Cas9; SauriCas9, Cas9 nuclease from S. auricularis.
Fig 4.
(A) Schematic of SauriBE4max. (B) SauriBE4max induces C-to-T conversions at a panel of 9 genomic loci. Underlying data for all summary statistics can be found in S1 Data. “E1-C4” means C at position 4 of target E1. (C) Schematic of SauriABEmax. (D) SauriABEmax induces A-to-G conversions at a panel of 9 genomic loci (n = 2). Underlying data for all summary statistics can be found in S1 Data. “A2-A7” means A at position 7 of target A2. aa, amino acid; GFP, green fluorescent protein; NLS, nuclear localization signal; SauriABEmax, TadA-TadA*(evolved TadA)–SauriCas9n; SauriBE4max, APOBEC1–SauriCas9n–UGI; SauriCas9, Cas9 nuclease from S. auricularis; SauriCas9n, nickase form of SauriCas9.
Fig 5.
Analysis of SauriCas9 specificity.
(A) A target sequence is inserted between the ATG and the GFP coding sequence, disrupting GFP expression. Target cleavage induces GFP expression. Underlying data for all summary statistics can be found in S1 Data. A panel of gRNAs with dinucleotide mismatches (red) and the activity of each gRNA are shown below. (B) Two potential off-target sequences are selected for targeted deep-sequencing analysis. Mismatches are shown in red. (C) Indel sequences detected by deep sequencing. (D) Indel frequencies detected by targeted deep sequencing (n = 2). Underlying data for all summary statistics can be found in S1 Data. (E) Off-targets for the G1 locus are analyzed by GUIDE-seq. Read numbers for on- and off-targets are shown on the right. Mismatches compared with the on-target site are highlighted in color. GFP, green fluorescent protein; gRNA, guide RNA; GUIDE-seq, genome-wide, unbiased identification of double-strand breaks enabled by sequencing; indel, insertion/deletion; SaCas9, S. aureus Cas9; SauriCas9, Cas9 nuclease from S. auricularis.
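The two quantities reported in these panels, mismatch positions relative to the on-target site and indel frequency, can be sketched in a few lines of Python. This is a minimal illustration, not the analysis code used for the figure. The helper names and example sequences are hypothetical. Positions are assumed to be numbered from 1 in the order the sequence is written, and indel frequency is assumed to be indel-containing reads divided by total reads.

```python
def mismatch_positions(on_target, candidate):
    """Return 1-based positions at which candidate differs from on_target.

    Assumes both sequences are aligned and of equal length (no gaps).
    """
    if len(on_target) != len(candidate):
        raise ValueError("sequences must have the same length")
    pairs = zip(on_target.upper(), candidate.upper())
    return [i + 1 for i, (a, b) in enumerate(pairs) if a != b]


def indel_frequency(indel_reads, total_reads):
    """Fraction of reads carrying an indel (assumed definition: indel / total)."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return indel_reads / total_reads


# Hypothetical 20-nt sequences, invented for illustration only (not sites from the paper).
on_site = "ACGTACGTACGTACGTACGT"
off_site = "ACGTACGAACGTACGTTCGT"
positions = mismatch_positions(on_site, off_site)  # differs at positions 8 and 17
freq = indel_frequency(25, 1000)
```

A candidate off-target site with few mismatches and a nonzero indel frequency is the kind of site that panels (B) to (D) examine by targeted deep sequencing.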
Fig 6.
Characterization of chimeric eSa-SauriCas9.
(A) Schematic diagram of eSa-SauriCas9. (B, C) WebLogo and PAM wheel of eSa-SauriCas9 generated from deep-sequencing data. (D) Specificity of eSa-SauriCas9 is measured by the GFP-reporter assay. Underlying data for all summary statistics can be found in S1 Data. A panel of gRNAs with dinucleotide mismatches (red) is shown below. (E) eSa-SauriCas9 generates indels at a panel of 14 endogenous loci (n ≥ 2). Underlying data for all summary statistics can be found in S1 Data. eSa-SauriCas9, enhanced specificity SaCas9-SauriCas9; GFP, green fluorescent protein; indel, insertion/deletion; PAM, protospacer adjacent motif; PID, PAM-interacting domain; SaCas9, S. aureus Cas9; SauriCas9, Cas9 nuclease from S. auricularis.
Fig 7.
Characterization of SauriCas9-KKH.
(A) Schematic diagram of SauriCas9. The Q788K/Y973K/R1020H mutations are shown below. (B, C) WebLogo and PAM wheel of SauriCas9-KKH generated from deep-sequencing data. (D) Specificity of SauriCas9-KKH is measured by the GFP-reporter assay. Underlying data for all summary statistics can be found in S1 Data. A panel of gRNAs with dinucleotide mismatches (red) is shown below. (E) SauriCas9-KKH generates indels at a panel of 15 endogenous loci (n ≥ 2). Underlying data for all summary statistics can be found in S1 Data. GFP, green fluorescent protein; gRNA, guide RNA; indel, insertion/deletion; PAM, protospacer adjacent motif; PID, PAM-interacting domain; SauriCas9, Cas9 nuclease from S. auricularis; SauriCas9-KKH, SauriCas9 carrying the triple mutation Q788K/Y973K/R1020H.