CRISPR-Analytics (CRISPR-A): A platform for precise analytics and simulations for gene editing
Fig 5
Enhanced precision with spikes, UMIs, and mock characterization.
A) Spike-in counts in the Illumina experiment. Counts of synthetic spike-in sequences with different deletion sizes. The same number of molecules of each spike-in was added to the edited samples. Left: linear regression of the mean percentage of each spike-in sequence among all spike-in sequences at 30 cycles of amplification and a low number of molecules. Right: spike-in counts in the original sample and after spike-based correction. B) Size bias correction using the spike-in model. Left: distribution of deletions by position in the edited sample after spike-based correction (blue) against the original distribution (gray). Bottom right: difference in counts between the original and corrected distributions shown on the left, as a function of deletion size. C) Noise reduction by UMI cluster filtering. Standardized distribution of deletions (left) and insertions (right) without taking UMIs into account (gray) and after clustering by UMIs with a minimal identity of 0.95 and filtering by UMI bin size (UBS) >50 and <130 (blue) in the Lama2 target. The red dashed line marks the cut site position. D) Mock-based noise correction. Samples with a lower editing percentage tend to receive a larger correction, since noise represents a higher proportion of their indel reads. The two plots comparing treated and mock files show that this subtraction is always specific, regardless of the editing percentage. E) Difference between the editing percentage reported by each of 8 tools and the 4 options of CRISPR-A, and the manually curated percentage, for 30 samples (10 different targets, 3 replicates each).
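The UMI bin size filter described in panel C can be illustrated with a minimal Python sketch. It is not CRISPR-A's implementation: it assumes that reads have already been clustered by UMI upstream (for example at the 0.95 identity stated in the caption) and that each read carries a cluster label. The function name `filter_umi_clusters` and the example labels are hypothetical; only the thresholds (UBS >50 and <130) come from the caption.

```python
from collections import Counter

# Thresholds taken from the caption: keep clusters with 50 < UBS < 130.
MIN_UBS = 50
MAX_UBS = 130


def filter_umi_clusters(cluster_sizes, min_ubs=MIN_UBS, max_ubs=MAX_UBS):
    """Keep only UMI clusters whose bin size lies strictly between the bounds.

    cluster_sizes maps a cluster label to its number of reads (the UBS).
    """
    return {
        umi: size
        for umi, size in cluster_sizes.items()
        if min_ubs < size < max_ubs
    }


# Hypothetical input: one UMI cluster label per read (clustering done upstream).
read_umis = ["AAAC"] * 60 + ["GGTT"] * 10 + ["CCAT"] * 200 + ["TTGA"] * 129

cluster_sizes = Counter(read_umis)  # UBS per cluster
kept = filter_umi_clusters(cluster_sizes)

print(sorted(kept))  # ['AAAC', 'TTGA']
```

Here the cluster with 10 reads and the cluster with 200 reads are both discarded, leaving only bins inside the stated size range. How the remaining clusters are then collapsed into the standardized distributions shown in the panel is not specified in the caption and is left out of the sketch.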