Figure 1.
Major parameters to identify gRNA-Cas9 target and off-target sites with the offTargetAnalysis function of CRISPRseek.
A. Target sequences for CRISPR-Cas9-derived nucleases are composed of two components: the guide sequence (red), which matches the variable region in the guide RNA, and an associated PAM sequence, which is recognized by the Cas9 protein. The sequence shown is a possible target sequence for a nuclease based on the S. pyogenes CRISPR-Cas9 system, which has a 20 base pair guide sequence and a PAM sequence of NGG or NAG. B. offTargetAnalysis first identifies candidate guide sequences in an input sequence. Some of the parameters that control this step are indicated, including parameters for the guide sequence, the adjacent PAM sequence and other features such as overlapping restriction enzyme sites. Next, the specified genome is searched for potential off-target sites of each identified guide sequence. A variety of parameters can be adjusted to set the criteria used to identify and score off-target sites.
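The first step described in panel B, finding candidate guide sequences next to a PAM, can be sketched in a few lines. This is a minimal Python illustration under stated assumptions, not the CRISPRseek implementation: the names `find_target_sites`, `guide_len` and `pam` are hypothetical, only a small subset of IUPAC codes is handled, and reverse-strand positions are reported on the reverse-complemented sequence rather than converted back to forward coordinates.

```python
import re

# Subset of IUPAC codes, enough for PAMs such as NGG or NAG (illustrative only).
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "N": "[ACGT]", "R": "[AG]", "Y": "[CT]"}


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_target_sites(seq, guide_len=20, pam="NGG"):
    """List (strand, start, guide, pam) for every guide+PAM match.

    'start' is 1-based on the strand that was scanned: the input for
    strand 'f', its reverse complement for strand 'r'.
    """
    seq = seq.upper()
    pam_re = "".join(IUPAC[base] for base in pam.upper())
    # A lookahead lets overlapping sites be reported as separate matches.
    pattern = re.compile(rf"(?=([ACGT]{{{guide_len}}})({pam_re}))")
    sites = []
    for strand, strand_seq in (("f", seq), ("r", reverse_complement(seq))):
        for match in pattern.finditer(strand_seq):
            sites.append((strand, match.start() + 1,
                          match.group(1), match.group(2)))
    return sites
```

Calling `find_target_sites(sequence, pam="NAG")` scans for the alternative PAM mentioned in panel A. Off-target searching and scoring against a genome are deliberately left out of this sketch.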
Figure 2.
Major parameters to identify gRNA-Cas9 target sites that are shared or distinct in two related sequences with the compare2sequences function of CRISPRseek.
compare2sequences first identifies target sites in two input sequences using many of the same parameters available for the offTargetAnalysis function. Next, cleavage scores are determined for each target site in both input sequences.
Figure 3.
An example of using compare2sequences to identify gRNAs that target the C allele of a SNP in the human huntingtin gene.
A. Sequences for both alleles of the SNP RS362331 were used as inputs for compare2sequences. The input sequence used for the C allele is shown; the T allele is identical except at the position in bold. Positions of potential gRNA sequences for the C allele are visualized by importing the GenBank format output from compare2sequences into the “A plasmid Editor” (ApE) sequence analysis program and displaying a graphical map. B. A partial output of gRNAs and their cleavage scores in each input sequence is shown. gRNAs are numbered based on their orientation (f, forward; r, reverse complement), position and input sequence name. Target site sequences include the gRNA sequence and the adjacent PAM sequence. Cleavage scores are calculated for the gRNA in both input sequences and the name and position of each mismatch relative to the PAM sequence is shown. The cleavage score difference indicates the likelihood that a gRNA with a target site in one input sequence will also cleave the other input sequence. A score difference of “0” indicates that the gRNA is predicted to cleave both sequences with equal efficiency; “100” indicates that the gRNA is predicted to be specific for just one sequence. The gRNAs in bold are candidates to show greater activity for the C allele compared with the T allele.
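The mismatch positions reported in panel B can be illustrated with a short sketch. It assumes, as a convention chosen for this example rather than one confirmed by the legend, that the guide is written 5' to 3' with the PAM at its 3' end and that position 1 is the base immediately adjacent to the PAM. The function name `mismatches_from_pam` is hypothetical, and no cleavage score is computed because the scoring weights are not given here.

```python
def mismatches_from_pam(guide, site):
    """Return mismatch positions counted from the PAM-proximal end.

    Both strings are the guide-length portion of a target site, written
    5' to 3' without the PAM. Position 1 is the base next to the PAM
    (an assumed convention for this illustration).
    """
    if len(guide) != len(site):
        raise ValueError("guide and site must have the same length")
    length = len(guide)
    return [length - index
            for index, (a, b) in enumerate(zip(guide.upper(), site.upper()))
            if a != b]
```

For a gRNA designed against one allele of a SNP, comparing it with the corresponding site in the other allele returns a single position. An empty list means the two sites are identical, which corresponds to the case in which both sequences are expected to be cleaved equally.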