Overcoming CRISPR-Cas9 off-target prediction hurdles: A novel approach with ESB rebalancing strategy and CRISPR-MCA model

Fig 6

(A) Example of an encoded gRNA-target DNA sequence. The first five channels are base channels, which convert each base in the sequence into a unique one-hot vector: adenine is encoded as [1, 0, 0, 0, 0], guanine as [0, 1, 0, 0, 0], cytosine as [0, 0, 1, 0, 0], thymine as [0, 0, 0, 1, 0], and indels, indicated by underscores (_), as [0, 0, 0, 0, 1]. The last two channels are direction channels, which mark the direction of the bases at each position where a mismatch occurs. (B) Architecture of CRISPR-MCA. The input is a 24 × 7 matrix derived from the encoded gRNA-target DNA sequence. This matrix is processed by a multiscale convolutional layer that extracts sequence features. The output of the Multi-CNN layers undergoes positional encoding before being passed to the Multi-Head Self-Attention layer for further sequence analysis. The result is then merged with earlier inputs and passed through three dense layers of 256, 128, and 2 neurons, respectively. The final layer uses a softmax activation function to produce the binary classification output.
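
The seven-channel encoding in panel A can be illustrated with a minimal Python sketch. The one-hot channel order is taken from the caption; everything else is an assumption made for illustration and is not confirmed by the caption: the function names `one_hot` and `encode_pair` are hypothetical, the gRNA and DNA one-hot vectors are combined with an element-wise OR, and the direction channels are set to [1, 0] or [0, 1] depending on which strand carries the base with the lower channel index ([0, 0] on a match).

```python
# Channel order for the five base channels, as given in the caption.
BASE_CHANNELS = {"A": 0, "G": 1, "C": 2, "T": 3, "_": 4}


def one_hot(base):
    """Return the 5-element one-hot vector for a base or an indel ('_')."""
    vec = [0] * 5
    vec[BASE_CHANNELS[base]] = 1
    return vec


def encode_pair(grna, dna):
    """Encode aligned gRNA and target DNA strings into an L x 7 matrix.

    ASSUMPTIONS (not stated in the caption): the base channels are the
    element-wise OR of the two one-hot vectors; the direction channels are
    [1, 0] when the gRNA base has the lower channel index, [0, 1] when the
    DNA base does, and [0, 0] when the two bases match.
    """
    if len(grna) != len(dna):
        raise ValueError("sequences must be aligned to equal length")
    rows = []
    for g, d in zip(grna.upper(), dna.upper()):
        # Five base channels: union of the two one-hot vectors.
        base = [a | b for a, b in zip(one_hot(g), one_hot(d))]
        # Two direction channels: only non-zero at a mismatch or indel.
        if g == d:
            direction = [0, 0]
        elif BASE_CHANNELS[g] < BASE_CHANNELS[d]:
            direction = [1, 0]
        else:
            direction = [0, 1]
        rows.append(base + direction)
    return rows
```

Applied to an aligned pair of 24-character sequences, `encode_pair` returns 24 rows of 7 values, which matches the 24 × 7 input shape described for panel B.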

doi: https://doi.org/10.1371/journal.pcbi.1012340.g006