GO-CRISPR: A highly controlled workflow to discover gene essentiality in loss-of-function screens

  • Pirunthan Perampalam,

    Roles Conceptualization, Data curation, Formal analysis, Investigation, Methodology, Software, Validation, Visualization, Writing – original draft, Writing – review & editing

    Affiliations London Health Sciences Centre Research Institute, London Regional Cancer Program, London, ON, Canada, Department of Biochemistry, Western University, London, ON, Canada, Copoly.ai Inc., Ottawa, ON, Canada

  • James I. McDonald,

    Roles Data curation, Formal analysis, Investigation, Methodology, Validation, Visualization, Writing – review & editing

    Affiliations London Health Sciences Centre Research Institute, London Regional Cancer Program, London, ON, Canada, Department of Pathology and Laboratory Medicine, Western University, London, ON, Canada

  • Frederick A. Dick

    Roles Conceptualization, Funding acquisition, Project administration, Resources, Supervision, Writing – original draft, Writing – review & editing

    fdick@uwo.ca

    Affiliations London Health Sciences Centre Research Institute, London Regional Cancer Program, London, ON, Canada, Department of Pathology and Laboratory Medicine, Western University, London, ON, Canada, Children’s Health Research Institute, London, ON, Canada

Abstract

Genome-wide CRISPR screens are an effective discovery tool for genes that underlie diverse cellular mechanisms that can be scored through cell fitness. Loss-of-function screens are particularly challenging compared to gain-of-function because of the limited dynamic range of decreased sgRNA sequence detection. Here we describe Guide-Only control CRISPR (GO-CRISPR), an improved loss-of-function screening workflow, and its companion software package, Toolset for the Ranked Analysis of GO-CRISPR Screens (TRACS). We demonstrate a typical GO-CRISPR workflow in a non-proliferative 3D spheroid model of dormant high grade serous ovarian cancer and demonstrate superior performance to standard screening methods. The unique integration of the pooled sgRNA library quality and guide-only controls allows TRACS to identify novel molecular pathways that were previously unidentified in tumor dormancy and undetectable to analysis packages that lack the guide only controls. Together, GO-CRISPR and TRACS can robustly improve the discovery of essential genes in challenging biological scenarios such as growth arrested cells.

Introduction

Gene editing using CRISPR/Cas9 technology has seen widespread adoption across most biomedical disciplines, including cancer research [1,2]. In particular, the ability to multiplex CRISPR gene knockouts on a genome-wide scale has stimulated systematic interrogation of cell biology [3]. Pooled single guide RNA (sgRNA) libraries are used to create single-gene knockouts in individual cells and selective pressure is applied through culture conditions or drug treatment. Genetic deficiencies that produce resistance or susceptibility are quantitated using sgRNA coding sequences as barcodes to compare gene knockout abundance between the start and end of the experiment [4]. CRISPR screens can therefore discover functional roles for genes and pathways not suggested by more traditional hypothesis-driven research.

Gain-of-function genome-wide CRISPR screens can lead to several orders of magnitude change in sgRNA sequence abundance because of resistant cell proliferation, unequivocally identifying resistance genes [5–7]. Conversely, loss-of-function is more challenging to quantitate because complete disappearance of sgRNA sequences for a gene may represent technical failure of the screen design, or its execution [8]. In addition, knockout of an individual gene in the chosen culture condition may not cause lethality with complete penetrance [9]. Ultimately, identification of essential genes in loss-of-function screens has relied on prolonged periods of cell proliferation to separate changes in bystander sgRNA abundance from true deleterious changes [4]. For this reason, most CRISPR screens have utilized rapidly proliferating 2D cell culture conditions as in the case of DepMap screens [10]. Scenarios such as the tumor microenvironment, metastasis and tumor dormancy are better assessed in 3D culture models such as multicellular tumor spheroids or organoids [11–14]. However, the inability of organoids to quantitatively regenerate from individual cells upon subculture has prevented robust library representation [15], and in some cases this has been compensated by screening more compact, partial genome libraries [16]. Furthermore, most 3D spheroids exhibit slower growth kinetics due to hypoxia and necrosis which can further hamper detection of gene loss events [17]. All of these factors likely contribute to stochastic loss of guides which can confound loss-of-function studies since current methods cannot distinguish these Cas9-independent events from bona fide loss-of-function due to gene editing. For these reasons, the classification of gene ‘essentiality’ is highly challenging in 3D culture conditions.

This indicates a need for a screening method that can be adapted for a broad range of complex culture conditions that include low proliferation rates to identify essential genes. It motivated us to develop Guide-Only control CRISPR (GO-CRISPR). GO-CRISPR is a scalable loss-of-function screening method that can be used to discover essential genes in standard monolayer (2D) or complex 3D culture conditions such as dormant tumor spheroids that exhibit arrested cell proliferation [18]. To support broad usability, we also developed TRACS (Toolset for the Ranked Analysis of GO-CRISPR Screens) to automate the analysis of GO-CRISPR screens in an easy-to-use software package. Together, GO-CRISPR and TRACS allowed us to discover novel survival pathways in dormant ovarian cancer spheroids such as axon guidance and MAPK signaling [18]. In this report we provide detailed insight into the GO-CRISPR/TRACS workflow, including analysis of the same Cas9+ data using MAGeCK and BAGEL. This demonstrates that established CRISPR screening approaches using only Cas9+ gRNA abundance data were unable to find essential genes, whereas the guide-only controls allowed us to mine gene dependencies in this challenging culture condition. We expect that the GO-CRISPR approach can be broadly applied to genome-wide loss-of-function CRISPR screens in low proliferation biological contexts.

Materials and methods

Generation of Cas9-positive cells

High-grade serous ovarian cancer (HGSOC) iOvCa147 cells have previously been reported [19]. They were transduced with viral particles encoding a Cas9 expression cassette (lentiCas9-Blast, Addgene #52962) to generate cells constitutively expressing Cas9 (Cas9-positive cells). Cells were selected with blasticidin (20 μg/mL). Single-cell clones were isolated by limiting dilution. Lysates were collected from clones and western blots were performed to determine Cas9 expression (Cell Signaling #14697). Cas9 editing efficiency was determined by viability studies using sgRNAs targeting selected fitness genes (PSMD1, PSMB2, EIF3D) and a non-targeting control (LacZ) as previously reported [20]. See S1 Table for sgRNA sequences. Lentivirus producing culture media for these guides was used to infect Cas9 expressing candidates and infected cells were selected with 2 μg/mL puromycin for three days and viability was determined by Alamar blue. A single clone showing the most effective Cas9 activity was selected for all further studies.

GeCKO v2 library preparation

HEK293T cells were transfected with the combined A and B components of the GeCKO v2 (Addgene #1000000048, #1000000049) whole genome library (123,411 sgRNAs in total) along with plasmids encoding lentiviral packaging proteins. Media was collected 2–3 days later and any cells or debris were pelleted by centrifugation at 500 x g. Supernatant containing viral particles was filtered through a 0.45 μm filter and stored at -80°C with 1.1 g/100 mL BSA.

GO-CRISPR screen in iOvCa147 cells

iOvCa147 Cas9-positive or Cas9-negative cells were separately transduced with virus collected as described above at a multiplicity of infection of 0.3 and with a predicted library coverage of >1000-fold. Cells were grown in media containing 2 μg/mL puromycin (Sigma #P8833) to eliminate non-transduced cells. Cells were maintained in complete media containing puromycin in all following steps. A total of 1.1 x 10^9 cells were collected and split into three groups consisting of approximately 3.0 x 10^8 cells each and were cultured for an additional 2–3 days in complete media, then collected and counted. Triplicate samples of 6.2 x 10^7 cells were saved for sgRNA sequence quantitation at T0. The remaining cells (approximately 1.4 x 10^9/set) were plated at a density of 2.0 x 10^6 cells/mL in each of twenty 10 cm ULA plates (total of 60 ULA plates). Following 2 days of culture, media containing spheroids was transferred to ten 15 cm adherent tissue culture plates (total of 30 plates). The next day unattached spheroid cells were collected and re-plated onto additional 15 cm plates. This process was repeated for a total of 5 days at which point very few spheroids remained unattached. The attached cells were collected for DNA extraction and this population represents Tf. Complete media refers to DMEM/F12 media (Gibco #11320033) supplemented with 10% FBS (Wisent FBS Performance lot #185705), 1% penicillin-streptomycin glutamine (Wisent #450-202-EL) and 2 μg/mL puromycin (Sigma #P8833).
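
As a rough illustration of how the multiplicity of infection and the target coverage relate to the number of cells that must be exposed to virus, the sketch below estimates that number under a simple Poisson infection model. The function name and the Poisson assumption are ours and are not part of the published protocol; only the library size (123,411 sgRNAs), the multiplicity of infection (0.3) and the 1000-fold coverage target come from the text.

```python
import math

def cells_to_transduce(n_sgrnas: int, coverage: float, moi: float) -> float:
    """Estimate how many cells must be exposed to virus to reach a target coverage.

    Assumes integrations per cell follow a Poisson distribution with mean `moi`,
    so the fraction of cells carrying at least one sgRNA is 1 - exp(-moi).
    """
    infected_fraction = 1.0 - math.exp(-moi)
    return n_sgrnas * coverage / infected_fraction

# Values taken from the methods: GeCKO v2 library size, 1000-fold coverage, MOI 0.3
needed = cells_to_transduce(n_sgrnas=123_411, coverage=1000, moi=0.3)
print(f"Cells to expose to virus: {needed:.2e}")
```

Under these assumptions only about a quarter of the cells are transduced at this multiplicity of infection, which is why the starting population must be several times larger than the number of sgRNAs multiplied by the coverage.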

High-throughput next generation sequencing (NGS)

Cells were harvested and DNA was extracted using QIAamp Blood Maxi Kits (QIAGEN #51194). Genomic encoded sgRNA sequences were PCR amplified as previously described [21]. Two rounds of PCR were performed. The initial round serves to increase the abundance of the initial sgRNA population, while the second round inserts barcodes necessary for identification of group and replicate number (sample barcode). PCR products were gel purified, quantitated by Qubit (Invitrogen), pooled and sequenced using an Illumina NextSeq 550 75-cycle high output kit (#20024906). FASTQ files were obtained containing raw reads and were demultiplexed to obtain individual FASTQ files for each sample. FASTQ files were processed accordingly for downstream analysis with TRACS, MAGeCK, or BAGEL.

Analysis with MAGeCK

FASTQ files were trimmed with Cutadapt (1.15) to remove adapter sequences and sample barcode identifiers. The library reference file (CSV) for the GeCKOv2 library was used in Bowtie2 (2.3.4.1) to align the initial library read FASTQ file and generate a BAM file (Samtools 1.7) in order to increase the read depth of the initial library. This library BAM file and trimmed FASTQ files for all samples were then supplied to the MAGeCK (0.5.6) count function to generate read counts. Read counts were normalized in MAGeCK as previously described by Li et al. using the built-in median normalization method [22]. Following normalization, differences in sgRNA abundance were computed and guides were ranked using both the MAGeCK-RRA (robust rank aggregation) and MAGeCK-MLE (maximum likelihood estimation) methods. All plots and comparisons to TRACS were performed in R (3.6.2).

Analysis with BAGEL

BAGEL (0.91) was run using read counts generated by the MAGeCK (0.5.6) count function as described above. Standard non-essential and essential training gene sets were used as previously described [20]. Bayes factors (BFs) obtained by BAGEL were plotted in R (3.6.2).

Analysis with TRACS

The library reference file containing a list of all sgRNAs and their sequences (CSV file), raw reads for the pooled sgRNA library (L0; FASTQ files) and raw reads (FASTQ files) for all Initial (T0) and Final (Tf) replicates for Cas9-positive and Cas9-negative cells (12 replicates) were loaded into TRACS (https://github.com/developerpiru/TRACS). TRACS then automatically trimmed the reads using Cutadapt (1.15). TRACS then built a Bowtie2 (2.3.4.1) index and aligned the trimmed initial sgRNA library read file to generate a BAM file using Samtools 1.7. MAGeCK (0.5.6) was then used to generate read counts from this library BAM file and all the trimmed sample FASTQ files. Instead of dropping all reads below a certain threshold (e.g. <30 counts), all read counts were incremented by 1 to prevent zero counts and division by zero errors. The TRACS algorithm was then run using this read count file to determine the Library Enrichment Score (ES), Initial ES, Final ES and the Enrichment Ratio (ER) for each gene (see The TRACS algorithm section).
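
The pseudocount step described above can be sketched as follows. This is a minimal illustration and not the TRACS source code; the function name and the dictionary layout of the count table are assumptions made for the example.

```python
def add_pseudocount(counts, pseudocount=1):
    """Increment every read count so that no sgRNA has a count of zero.

    `counts` maps an sgRNA identifier to its list of read counts across samples
    (a hypothetical layout chosen for this example).
    """
    return {sgrna: [c + pseudocount for c in row] for sgrna, row in counts.items()}

# Hypothetical sgRNA identifiers and counts; the second sgRNA is undetected in one sample
raw_counts = {"sgRNA_1": [120, 95], "sgRNA_2": [0, 14]}
safe_counts = add_pseudocount(raw_counts)
print(safe_counts)
```

Keeping low-count sgRNAs with a pseudocount, rather than discarding them, preserves them for the later scoring steps, where poorly represented genes are handled through the Library ES.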

Data exploration using VisualizeTRACS

The VisualizeTRACS feature, part of the TRACS software suite, was then used to visualize and explore the data output from TRACS. Gene filtering (Library ES > 985, ER < 0, padj < 0.05 (paired t-test) for our example ovarian cancer workflow) was performed, figures were generated and the final table of essential genes that met these criteria was downloaded for further analysis.

The TRACS algorithm

After read count preprocessing, TRACS first determines a Gene Score, GS, for every gene in the supplied library reference file by calculating the log2-fold-change (LFC) from all sgRNAs for that gene for n replicates (minimum of 2 replicates required) of Cas9-positive and Cas9-negative samples:

Where s is the number of unique sgRNAs for a gene j. This is done for each replicate such that for n replicates, there are n gene scores, GS, for a gene j. For each of the n replicates, the GS for all genes are then ranked in ascending order from 1 to x, where x is the rank of the gene with the highest GS in each respective replicate. TRACS then determines the Enrichment Score, ESj, which is the average rank across all n replicates of a gene j, divided by the total number of sgRNAs, s, identified for that gene.
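
The scoring steps described above can be sketched in a few lines. This is an illustrative reading of the text, not the TRACS implementation: in particular, taking the Gene Score as the mean per-sgRNA log2 ratio of Cas9-positive to Cas9-negative counts is an assumption, ties in rank are not handled, and all function and variable names are hypothetical.

```python
import math
from statistics import mean

def gene_scores(pos_counts, neg_counts, guides_by_gene):
    """Gene Score for one replicate (assumed form): mean log2(Cas9+ / Cas9-) over a gene's sgRNAs."""
    return {
        gene: mean(math.log2(pos_counts[g] / neg_counts[g]) for g in guides)
        for gene, guides in guides_by_gene.items()
    }

def ascending_ranks(scores):
    """Rank genes from 1 (lowest score) to x (highest score)."""
    ordered = sorted(scores, key=scores.get)
    return {gene: rank for rank, gene in enumerate(ordered, start=1)}

def enrichment_scores(replicates, guides_by_gene):
    """Average rank across replicates divided by the number of sgRNAs for each gene.

    `replicates` is a list of (Cas9-positive counts, Cas9-negative counts) pairs.
    """
    if len(replicates) < 2:
        raise ValueError("at least 2 replicates are required")
    ranks = [ascending_ranks(gene_scores(p, n, guides_by_gene)) for p, n in replicates]
    return {
        gene: mean(r[gene] for r in ranks) / len(guides)
        for gene, guides in guides_by_gene.items()
    }

# Toy data: GENE_A is depleted in Cas9-positive cells, GENE_B is not
guides_by_gene = {"GENE_A": ["a1", "a2"], "GENE_B": ["b1", "b2"]}
neg = {"a1": 100, "a2": 100, "b1": 100, "b2": 100}
rep1 = ({"a1": 10, "a2": 20, "b1": 100, "b2": 120}, neg)
rep2 = ({"a1": 15, "a2": 25, "b1": 110, "b2": 100}, neg)
es = enrichment_scores([rep1, rep2], guides_by_gene)
print(es)
```

In this toy example the depleted gene receives the lower rank in both replicates and therefore the lower Enrichment Score.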

TRACS then determines the Enrichment Ratio, ER, for gene j by determining the LFC of the Final ES compared to the Initial ES.
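
Taking the ER as the log2-fold-change of the Final ES relative to the Initial ES, as described in the Results, the calculation and its sign convention can be sketched as below. The function name is hypothetical and the snippet is only an illustration of the arithmetic.

```python
import math

def enrichment_ratio(initial_es: float, final_es: float) -> float:
    """log2 fold change of the Final ES relative to the Initial ES."""
    return math.log2(final_es / initial_es)

# A gene whose score falls four-fold between T0 and Tf has a negative ER
er_depleted = enrichment_ratio(initial_es=4.0, final_es=1.0)
# A gene whose score is unchanged has an ER of zero
er_unchanged = enrichment_ratio(initial_es=2.0, final_es=2.0)
print(er_depleted, er_unchanged)
```

A negative ER therefore marks a gene whose sgRNAs became relatively less abundant under the selective pressure.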

TRACS calculates the p value for each gene using a paired t-test by pairing each of the n replicates together per gene per the initial (T0) condition and final (Tf) condition. The Benjamini–Hochberg procedure is used to control the false discovery rate (FDR) at the user-defined level (10% in our example workflow).
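
For readers unfamiliar with the Benjamini–Hochberg procedure, a minimal pure-Python sketch of the adjustment is given below. It takes per-gene p values as given (for example from a paired t-test) and is not taken from the TRACS source code; the function name is an assumption.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p values in the original input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])  # indices from smallest to largest p
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down so each adjusted value is monotone
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

padj = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
print(padj)
```

Genes whose adjusted p value falls below the chosen threshold are then retained for further filtering.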

After the ER is calculated, TRACS determines the distribution of Library ES values across all genes. The cutoff value for the Library ES was set to the first quartile for our example screen.
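
A sketch of the first-quartile cutoff combined with the ER and padj filters is shown below. The toy values, the dictionary layout, the function names and the choice of the "inclusive" quartile method are all assumptions made for illustration; TRACS may compute the quartile differently.

```python
from statistics import quantiles

def first_quartile(values):
    """First quartile of a collection of Library ES values (inclusive method assumed)."""
    q1, _q2, _q3 = quantiles(values, n=4, method="inclusive")
    return q1

def candidate_essential_genes(results, alpha=0.05):
    """Keep genes above the Library ES cutoff with a negative ER and padj below alpha."""
    cutoff = first_quartile([r["library_es"] for r in results.values()])
    return sorted(
        gene
        for gene, r in results.items()
        if r["library_es"] > cutoff and r["er"] < 0 and r["padj"] < alpha
    )

toy_results = {
    "GENE_A": {"library_es": 1.0, "er": -3.0, "padj": 0.001},  # poorly represented in the library
    "GENE_B": {"library_es": 2.0, "er": 0.8, "padj": 0.010},   # positive ER
    "GENE_C": {"library_es": 3.0, "er": -1.5, "padj": 0.200},  # not significant
    "GENE_D": {"library_es": 4.0, "er": -2.0, "padj": 0.010},
    "GENE_E": {"library_es": 5.0, "er": -1.0, "padj": 0.020},
}
hits = candidate_essential_genes(toy_results)
print(hits)
```

Note that GENE_A has the most negative ER in this toy table but is removed because its low Library ES makes that value unreliable.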

Pathway analysis

Using the final list of essential genes from TRACS, we performed gene ontology and pathway enrichment analysis using the ConsensusPathDB enrichment analysis test (Release 34 (15.01.2019)) for top-ranked genes of interest. padj values and ER values for each gene were used as inputs. The minimum required genes for enrichment was set to 45 and the FDR-corrected padj value cutoff was set to < 0.01. The Reactome pathway dataset was used as the reference. For each identified pathway, ConsensusPathDB provides the number of enriched genes and a q value (padj) for the enrichment. Scatter plots were generated in R (3.6.2) using these values to depict the significant pathways identified.

Generation of single-gene knockouts

Gibson Assembly (NEB #E2611) was used to clone a pool of three sgRNAs per gene (AGPS, SLC2A11, ZC3H7A, PDCD2L, NPM1, EPS15, hsa-mir-761, RPAP1, SYAP1, TRAF3IP1, and EGFP) into lentiCRISPR v2 (Addgene #52961). See S1 Table for sgRNA sequences. iOvCa147 cells were transduced with viral particles encoding Cas9 and sgRNA expression cassettes. Cells were selected for 2–3 days in media containing 2 μg/mL puromycin. Knockout cells were cultured for 72 hours in suspension conditions using ULA plasticware (2 x 10^6 cells per well) to induce spheroid formation. Spheroid cells were then collected and transferred to standard plasticware to facilitate reattachment for 24 hours. Reattached cells were fixed with fixing solution (25% methanol in 1x PBS) for 3 minutes. Fixed cells were incubated for 30 minutes with shaking in staining solution (0.5% crystal violet, 25% methanol in 1x PBS). Plates were carefully immersed in ddH2O to remove residual crystal violet. Plates were incubated with destaining solution (10% acetic acid in 1x PBS) for 1 hour with shaking to extract crystal violet from cells. Absorbance of crystal violet at 590 nm was measured using a microplate reader (Perkin Elmer Wallac 1420) for each knockout and normalized to the EGFP control. Percent survival is inferred from relative absorbance.

Statistics

All error bars in the bar graphs represent standard deviation. Statistical significances were determined using two-way ANOVA with Sidak’s multiple comparisons test. * denotes P < 0.05, *** denotes P < 0.001, **** denotes P < 0.0001 and ns denotes not significant (P > 0.05).

Results

The GO-CRISPR workflow

The challenges presented by genome wide CRISPR screening in growth arrested populations of cells motivated us to develop a new workflow that could reveal critical insights into mechanisms of survival in cancer cell dormancy [18]. We developed GO-CRISPR to overcome these challenges and its typical experimental workflow is illustrated in Fig 1A. CRISPR screens depend on high-level Cas9 expression to ensure maximum efficiency of gene disruption in Cas9-positive cells transduced with a pooled sgRNA library (L0) [4,23]. GO-CRISPR uniquely incorporates sequencing data from a parallel screen in which Cas9-negative cells are also transduced with the same pooled sgRNA library (L0), in an independent infection event. Both the Cas9-positive and Cas9-negative cells are treated in an identical manner. Following antibiotic selection for viral transduction and expansion into triplicate cultures, cells are harvested from the initial culture condition (T0). Next, both Cas9-positive and Cas9-negative populations are exposed to the desired selective pressure or culture conditions (Ps) and cells are harvested from the final culture condition (Tf). Next-generation sequencing (NGS) is then used to quantitate the abundance of PCR-amplified sgRNA sequences from these 12 samples, as well as the initial library preparation (L0).

Fig 1. Typical experimental workflow for GO-CRISPR screening and analysis using TRACS.

(A) iOvCa147 High-grade serous ovarian cancer (HGSOC) cells were transduced with lentivirus expressing Cas9. High efficiency Cas9-positive cells (top row) and Cas9-negative cells (bottom row) were separately transduced with the GeCKO v2 pooled sgRNA library (L0). After puromycin selection for 72 hours, both Cas9 positive and negative cells were split into triplicates (x3) and maintained in initial culture conditions (T0) before being transferred to suspension culture conditions in ULA plasticware (selective pressure, Ps) to induce spheroid formation and select for cell survival. Viable spheroid cells were then transferred to standard plasticware to facilitate reattachment through successive plating over five days in the final culture condition (Tf). The initial pooled sgRNA library (L0) and Cas9-positive and Cas9-negative cells were collected at T0 and Tf for sgRNA quantitation by NGS. TRACS was used to calculate Library, Initial and Final Enrichment Scores (ES) using read quantities from L0 and Cas9-positive and Cas9-negative samples. (B) 3D plot output from TRACS illustrating the Library ES, Initial ES and Final ES for each gene. Genes highlighted in dark blue have low Library ES (determined by calculating the first quartile value across all Library ES; < 985 in this experiment). (C) Euler diagram showing the distribution of retained (in red) and discarded genes based on the Library ES (16,284 genes had Library ES > 985). (D) 2D scatter plot output from TRACS showing the distribution of Initial ES and Final ES for all genes. Genes highlighted in light blue (6,717 genes) met the low Library ES cutoff and had a negative Enrichment Ratio (ER) and padj < 0.05 (paired t-test), indicating their sgRNA abundance decreases in Tf compared to T0.

https://doi.org/10.1371/journal.pone.0315923.g001

To evaluate GO-CRISPR screens, we developed the TRACS algorithm that integrates data from Cas9-positive and Cas9-negative populations to make gene essentiality predictions (S1 Fig). It is based on assigning gene enrichment scores similar to the single gene score previously described by Wang et al. [24]. However, TRACS differs by calculating three different enrichment scores for each gene (Fig 1A). These include a Library Enrichment Score (Library ES) that compares sgRNA read counts for each gene between Cas9-negative cells and the library (L0) to determine Cas9-independent non-gene-editing-related changes in abundance. This is an important consideration since pooled sgRNA library preparations do not uniformly represent all genes [25]. An Initial Enrichment Score (Initial ES) is calculated by comparing sgRNA abundances for each gene in Cas9-positive cells relative to their abundances in Cas9-negative cells where they cannot direct gene editing. Lastly, a Final Enrichment Score (Final ES) compares sgRNA abundance between Cas9-positive and Cas9-negative cells following the exposure of both populations to the desired selective pressure or culture condition (Ps). For each gene, the Library ES, Initial ES and Final ES are weighted according to the number of sgRNAs that are detected for that gene. Thus, a relatively low Initial ES or Final ES indicates reduced sgRNA abundance in the Cas9-positive population and these scores incorporate a penalty for undetected sgRNAs to emphasize the most reliable sgRNA measurements. Finally, TRACS calculates an Enrichment Ratio (ER) that is the log2-fold-change (LFC) value between the Final ES and Initial ES to reveal changes in relative abundance between T0 and Tf culture conditions to detect sgRNAs that were depleted under the selective pressure (Ps), thereby identifying gene essentiality. The ER informs researchers if a gene shows essentiality for fitness (negative ER) or is non-essential (positive ER) in the experimental condition.

Discovering ovarian cancer spheroid vulnerabilities

To demonstrate the value of the GO-CRISPR and TRACS workflow, we performed a genome-wide screen in iOvCa147 high-grade serous ovarian cancer (HGSOC) cells. HGSOC is a highly metastatic disease in which cells detach from primary tumors and aggregate to form 3D spheroids in the abdomen [26]. These spheroid cells are growth arrested and highly resistant to chemotherapy, emphasizing the need to discover their vulnerabilities to improve treatment [27]. We designed a GO-CRISPR screen experiment (Fig 1A) to elucidate the genes and pathways that are critical to spheroid cell survival using ultra-low attachment (ULA) plasticware to induce spheroid formation in vitro [19]. Ovarian cancer cells undergo significant cell death in suspension culture while spheroids form; therefore, after 48 hours we transferred cells back to standard plasticware to allow reattachment and purification of viable cells.

Following analysis with TRACS, we sought to discover genes that were most selectively required for survival in suspension conditions; these represent potential therapeutic targets for dormant ovarian cancer cell spheroids. Fig 1B displays each ES in a 3D plot that reveals the distribution of scores in each dimension and highlights genes with low Library ES in dark blue. A low Library ES means that a gene’s sgRNA sequences were poorly represented at T0 due to non-gene-editing events that occurred between viral transduction of the pooled sgRNA library and antibiotic selection. This is an important consideration because when a gene’s Library ES is low, its initial sgRNA abundance is also low, and relatively small changes in sgRNA abundance can lead to extreme enrichment scores at T0 (Initial ES) or Tf (Final ES) (S2A Fig). To avoid these false positives, we excluded the first quartile of Library ES from further analysis (Library ES < 985 in this experiment) (Figs 1C and S2B). To discover genes essential for spheroid cell survival, we focused our attention on those that had a negative ER. In Fig 1D, genes highlighted in light blue met the Library ES cutoff (> 985) and had ER < 0 and padj < 0.05 at a false discovery rate (FDR) of 10%. We found 6,717 genes that met these criteria and the top 10 genes with the most negative ER are shown in Table 1. This data suggests these are the ten most essential genes required for spheroid cell viability in iOvCa147 cells.

Table 1. Top 10 genes with the most negative Enrichment Ratio (ER) in TRACS.

https://doi.org/10.1371/journal.pone.0315923.t001

Validation of TRACS gene essentiality predictions

To determine the validity of gene essentiality predictions made by TRACS, we measured its ability to categorize the 1,000 non-targeting control (NTC) sgRNAs from the GeCKO v2 pooled library. These NTC sgRNAs target non-coding intergenic sequences and should rank as non-essential [25]. We computed a receiver operating characteristic curve (ROC) and determined the area under the curve (AUC) was 98.5%, demonstrating that TRACS correctly identified NTC sgRNAs as non-essential (Fig 2A). This is a critical control because amplified genome regions produce false essential calls among non-coding controls [28,29]. HGSOC is characterized by extensive amplifications and deletions [30] and this data demonstrates TRACS eliminates this potentially confounding interpretation. For added validation, we used CRISPR/Cas9 to disrupt the top five genes with the most negative ER (Table 1) in iOvCa147 cells. Independent knockout of each gene showed significant loss of viability under suspension culture conditions (Fig 2B). Conversely, knockout of the top five genes with the most positive ER did not compromise viability (Fig 2C), suggesting our GO-CRISPR screen approach reliably discovers loss-of-function events, as a positive ER score can have a number of potential explanations that are discussed later.
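
The AUC reported above summarizes how well a score separates two classes. As a generic illustration, the sketch below computes an AUC as the probability that a randomly chosen positive example scores higher than a randomly chosen negative one. How labels and scores were defined for the NTC analysis is not specified in the text, so the toy labels (1 = non-targeting control) and scores (for example, ER values) are assumptions, as is the function name.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison (ties count as half)."""
    pos = [s for label, s in zip(labels, scores) if label]
    neg = [s for label, s in zip(labels, scores) if not label]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy example: 1 marks a control expected to be non-essential, scores are hypothetical
labels = [1, 1, 0, 0]
scores = [0.9, 0.4, 0.5, 0.1]
auc = roc_auc(labels, scores)
print(auc)
```

An AUC of 1.0 would mean every control scores above every other gene, whereas 0.5 would indicate no separation. The pairwise loop is quadratic and is intended only for small illustrative inputs.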

Fig 2. Validation of TRACS gene essentiality predictions.

(A) The GeCKO v2 pooled library contains 1,000 non-targeting control (NTC) sgRNAs that should not elicit a change in cell fitness. We evaluated the ability of TRACS to classify these sgRNAs by computing a receiver operating characteristic curve (ROC). The area under the curve (AUC) was determined to be 98.5%, indicating TRACS accurately classifies these NTC sgRNAs as non-essential. (B) We evaluated the essentiality of the top five genes with the most negative ER: AGPS, SLC2A11, ZC3H7A, PDCD2L, NPM1 (see Table 1). CRISPR/Cas9 was used to disrupt each gene in iOvCa147 cells and pure single-gene knockout populations were assayed for spheroid cell viability in suspension culture conditions. Genes in bar graph are arranged from most negative ER to least negative ER. (C) We similarly knocked out the top five genes with the most positive ER (EPS15, hsa-mir-761, RPAP1, SYAP1, TRAF3IP1) and assayed for viability in suspension culture conditions. Disruption of these genes did not adversely affect viability. Genes in bar graph are arranged from smallest to largest ER. For B and C, each point represents a biological replicate (n = 6). Bars represent means and error bars represent standard deviation. Statistics were performed using two-way ANOVA; ** denotes p < 0.01; *** denotes p < 0.001; **** denotes p < 0.0001; ns denotes not significant (p > 0.05). (D) We performed gene ontology and pathway enrichment analysis with the 6,717 genes identified by TRACS to have negative ER and plotted the results. The minimum genes required for enrichment per category was set to 45 to ensure stringent selection of pathways. The dashed vertical line represents a padj value of 5 x 10^-9. Pathways to the right of this line have padj < 5 x 10^-9 after controlling for FDR at 10%. Pathways labelled in blue are previously undescribed in HGSOC.

https://doi.org/10.1371/journal.pone.0315923.g002

GO-CRISPR and TRACS identify novel pathways in HGSOC

To further explore the genes identified, we performed gene ontology and pathway enrichment analysis with genes that had a negative ER and padj < 0.05 and found 109 significantly enriched pathways (Fig 2D). Among these are cell cycle regulation [19], MAPK signaling [31] and TP53 signaling [30] which are known to be involved in HGSOC progression and metastasis, although not all were previously implicated in survival. Remarkably, our analysis also found novel pathways that have not yet been implicated in HGSOC including Rho GTPase signaling and interleukin signaling among others [18]. Together, these data demonstrate that GO-CRISPR and TRACS can robustly identify functionally connected genes to enable novel pathway discoveries.

Comparison of GO-CRISPR with conventional CRISPR screen workflows

Formation of growth arrested HGSOC spheroids in suspension culture is a stressful process in which many cells die without being incorporated into a spheroid. Moreover, the communal nature of spheroids further suggests that individual gene loss events in single cells may be masked in loss-of-function CRISPR screens through non-cell autonomous mechanisms. Thus GO-CRISPR and TRACS were born out of the desire to screen a significantly challenging biological scenario. To fully illustrate the advantages of GO-CRISPR and TRACS, we have analyzed the triplicate replicates of T0 and Tf solely in Cas9-expressing cells using MAGeCK-RRA, MAGeCK-MLE [32] and BAGEL [20] as this represents a commonly used CRISPR screen workflow that lacks guide only controls (S3 Fig). A basic premise for genome-wide CRISPR screens using only Cas9-expressing cells is that poorly represented sgRNAs, or stochastic changes unrelated to the experiment in question, will be removed through statistical cutoffs. Analysis of this data using MAGeCK-RRA/MLE did not detect any essential genes using standard statistical cutoffs (S3A–S3D Fig), including the genes with the most negative ER that were found to be essential by TRACS (Fig 3). BAGEL did not discover essential genes either (S3G and S3H Fig). We then removed statistical cutoffs in MAGeCK-RRA/MLE and found approximately 30% of top-ranked genes had low Library ES according to TRACS, reinforcing the previously described phenomenon of identifying false positives due to low initial sgRNA abundances (S3E, S3F and S4 Figs). Additionally, our computed ER discriminates essentiality of NTCs effectively (Fig 2A), whereas MAGeCK (without statistical cutoffs) frequently misclassifies NTCs as essential (S5A Fig). TRACS was also noticeably more reliable at identifying universally essential and non-essential gene sets [20] (S5B and S5C Fig). 
TRACS penalizes genes that have low sgRNA numbers and favors those with higher sgRNA values to further mitigate the effects of stochastic sgRNA loss and ensure that gene essentiality predictions are made using the largest possible sample size (S6A Fig). Without statistical cutoffs, many MAGeCK-MLE top-ranked gene decisions are based on single gRNAs (S6B Fig). Overall, integrating data from the pooled sgRNA library and Cas9-negative populations allows TRACS to outperform other methods to accurately predict gene essentiality in a challenging low proliferation, suspension culture scenario.

Fig 3. Gene scoring for top five TRACS genes using MAGeCK.

The Log2 fold change for the indicated genes was plotted using data from TRACS, MAGeCK-MLE, MAGeCK-RRA and their statistical confidence levels. Note that Log2 fold change for TRACS refers to its enrichment ratio, whereas MAGeCK calculates a fold change of sgRNA sequence read abundance. The dashed vertical line represents a typical LFC or ER cutoff of -1 and the dashed horizontal line represents a p value of 0.05. Genes above and to the left of these lines are significant.

https://doi.org/10.1371/journal.pone.0315923.g003
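The cutoffs drawn in Fig 3 can be expressed as a simple test on each gene. The sketch below is illustrative only: the function names and toy values are hypothetical, and it assumes the vertical axis of the plot is -log10(p), which the caption does not state.

```python
import math

def is_significant(score, p_value, score_cutoff=-1.0, p_cutoff=0.05):
    """True if a gene lies left of the score cutoff and above the p-value line.

    `score` is the ER for TRACS or the log2 fold change for MAGeCK.
    """
    return score < score_cutoff and p_value < p_cutoff

def volcano_coordinates(score, p_value):
    """(x, y) position of a gene, assuming the vertical axis is -log10(p)."""
    return score, -math.log10(p_value)

# Toy (score, p value) pairs, invented for illustration only.
toy_genes = {
    "GENE_A": (-2.3, 0.001),  # left of -1 and above p = 0.05
    "GENE_B": (-1.4, 0.20),   # left of -1 but not significant
    "GENE_C": (-0.4, 0.01),   # significant p value but weak effect
}
hits = [gene for gene, (s, p) in toy_genes.items() if is_significant(s, p)]
print(hits)
```

Only a gene that passes both thresholds is counted, which is why the same gene can be called essential from its TRACS ER yet fail the MAGeCK criteria.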

Discussion

The GO-CRISPR and TRACS workflow offers an important alternative to conventional genome-wide loss-of-function CRISPR screens. It rigorously identifies genes that contribute to survival and facilitates novel mechanistic discoveries in low proliferation culture conditions by controlling for the stochastic effects of Cas9-independent sgRNA loss. Our data indicate that failing to control for this loss is a weakness of conventional screening methods that lack guide-only controls.

We recently published results using GO-CRISPR and TRACS to determine gene essentiality in a panel of cell lines that model 3D ovarian cancer spheroids [18]. That study provided novel insights into ovarian cancer cell dormancy, specifically through the identification of Netrin signaling to MEK-ERK to support viability. Further investigation of Netrin signaling in HGSOC genomic data revealed previously unappreciated roles in disease progression [18]. Animal models of dormancy and metastasis further revealed mechanistic relevance for our screen findings [18]. In this report we emphasize the utility of GO-CRISPR/TRACS through a comparison with MAGeCK and BAGEL analyses of just the Cas9+ screen data. The two studies are complementary: the multiple screens that repeatedly identify the Netrin and MAPK pathways in survival provide independent validation of the methods described here. Our eLife paper also includes two additional sets of comparisons in which low-throughput tests of the essentiality of cell cycle control components and Netrin pathway genes for spheroid survival are compared with the screen findings to support their validity [18]. These data further emphasize the reliability of discoveries from GO-CRISPR and TRACS. This report also details how to avoid false positives related to poor library representation, which we identify with the library score (Library ES). This score flags candidate genes that have the potential to misrepresent gene dependencies so that they can be removed early in the analysis. In addition, comparisons with other analysis programs such as MAGeCK and BAGEL reveal the importance of the guide-only controls, as gene essentiality is undiscoverable without them.

We emphasize the importance of using GO-CRISPR/TRACS for loss-of-function discovery of gene dependencies in a complex, 3D cell culture model system. Data presented in this report suggest that it is not ideal for simultaneously discovering gene essentiality in standard culture akin to DepMap. Standard CRISPR screens identify essential genes through prolonged culture periods on the order of three or four weeks [20]. We deliberately produce CRISPR-edited cells and introduce suspension culture rapidly so that many genes that are essential in prolonged culture can still be interrogated in suspension. The discovery of MAPK signaling components exemplifies this, as these genes are essential for rapid proliferation but also support survival in suspension, a role that we discover with GO-CRISPR/TRACS [18]. Similarly, we urge caution in trying to uncover gain-of-function effects among genes with positive ER. Our data show that some, but not all, of the five genes with the highest ER have a statistically significant gain in viability when tested independently. A pooled screen inevitably creates competition among different gene loss effects in the population, and bystanders can appear to gain function due to comparison with those that genuinely lose function and compromise viability.

To accommodate guide-only controls, we needed a new analysis pipeline. CRISPR screen analysis pipelines generally require an understanding of programming or advanced Unix/Linux knowledge to set up and manipulate raw NGS read files. In contrast, the TRACS software suite (https://github.com/developerpiru/TRACS) presents researchers with an easy-to-use graphical environment for analysis and data exploration. TRACS fully automates the analysis process, from raw NGS files to output, which will significantly reduce the barrier for many researchers to use GO-CRISPR. Furthermore, TRACS is fully scalable and can be deployed on a local workstation or a multi-CPU platform such as Amazon Web Services, Google Cloud Platform, or Microsoft Azure. We also provide example workflows and documentation to use TRACS on these platforms, including Docker containers for Linux, Mac OS and Windows that will automate setup.

We used the GeCKO v2 pooled sgRNA library [25] in our screen. However, the modularity of GO-CRISPR and TRACS will allow for the use of any pooled sgRNA library as long as Cas9 expression is separate from sgRNA viral delivery. In addition, the flexibility of TRACS in terms of unrestricted replicates and sgRNA library size will support the use of validated libraries, such as GeCKO v2, or custom libraries to answer novel questions across biological systems of interest. Taken together, we anticipate GO-CRISPR and TRACS will open new opportunities for loss-of-function screens across diverse model systems and biological questions.

Supporting information

S1 Fig. Typical analysis workflow using TRACS to identify essential genes.

(A) The TRACS workflow is separated into five steps. Step 1: Experiment parameters are entered in the graphical user interface (GUI). Step 2: Library reference file (.csv format) and raw read files (.fastq format) for all Cas9-positive replicates are selected in the GUI. Step 3: Raw read files (.fastq format) for all Cas9-negative replicates are selected in the GUI. Step 4: Raw reads are trimmed and aligned to generate read counts, then the TRACS algorithm runs to calculate Library ES, Initial ES, Final ES and the ER for each gene. Step 5: TRACS saves the results with all scores in an output file, which can then be explored using the accompanying VisualizeTRACS data explorer. (B) Screenshot of the easy-to-use TRACS GUI asking the user to enter experimental parameters (Step 1). Subsequent displays provide a similar interface for selecting input data files for Steps 2–4. (C-D) Screenshots of the accompanying VisualizeTRACS data explorer that researchers can use to visualize and inspect their TRACS output files and generate publication-ready figures. Researchers can control all aspects of filtering and data manipulation (Library ES, Initial ES, Final ES, ER, padj) to fine-tune selection of genes.

https://doi.org/10.1371/journal.pone.0315923.s001

(PDF)

S2 Fig. Genes with low Library ES tend to have extreme Initial ES and/or Final ES.

(A) TRACS 3D plot illustrating the distribution of Library ES, Initial ES and Final ES in an extreme example screen that had very poor representation of sgRNAs at T0 in Cas9-negative cells. Genes that have low Library ES (genes that fall into the first quartile of all Library ES across all genes) are shown in dark blue. These genes also tend to have extreme values for Initial ES and/or Final ES, which can lead to potential false positives. This extreme example demonstrates how initial sgRNA abundances can be low due to non-gene-editing events and skew gene scores at T0 (Initial ES) and Tf (Final ES). (B) Histogram illustrating the distribution of Library ES across all genes in our GO-CRISPR experiment. To diminish the effects of poorly represented sgRNAs, TRACS determines the distribution of the Library ES across all genes and computes the cutoff value for the first quartile (the bottom 25% of all Library ES; highlighted in dark blue). TRACS then discards genes that have Library ES below this threshold (< 985 for our iOvCa147 screen); however, researchers can increase or decrease the threshold within the TRACS software suite for further fine-tuning.

https://doi.org/10.1371/journal.pone.0315923.s002

(PDF)
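The first-quartile filter described in S2 Fig can be sketched as follows. This is a minimal illustration and not the TRACS implementation: the function name `filter_low_library_es`, the input format and the toy scores are assumptions, and Python's `statistics.quantiles` (default "exclusive" method) may define the quartile differently from TRACS.

```python
from statistics import quantiles

def filter_low_library_es(library_es, cutoff=None):
    """Split genes into kept and discarded groups by Library ES.

    library_es: dict mapping gene name -> Library ES (hypothetical format).
    cutoff: optional manual threshold; if None, the first quartile is used.
    """
    if cutoff is None:
        # quantiles(..., n=4) returns the three quartile cut points; [0] is Q1.
        cutoff = quantiles(list(library_es.values()), n=4)[0]
    kept = {g: s for g, s in library_es.items() if s >= cutoff}
    dropped = {g: s for g, s in library_es.items() if s < cutoff}
    return cutoff, kept, dropped

# Toy scores, invented for illustration only.
toy_scores = {
    "GENE_A": 200, "GENE_B": 900, "GENE_C": 1500, "GENE_D": 2100,
    "GENE_E": 2600, "GENE_F": 3000, "GENE_G": 3400, "GENE_H": 4100,
}
cutoff, kept, dropped = filter_low_library_es(toy_scores)
print(cutoff, sorted(dropped))

# A manual threshold mimics adjusting the cutoff within the software.
_, kept_manual, dropped_manual = filter_low_library_es(toy_scores, cutoff=2000)
```

Passing an explicit `cutoff` corresponds to the manual fine-tuning mentioned in the caption, while the default reproduces the bottom-25% rule.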

S3 Fig. MAGeCK and BAGEL are unable to identify essential genes in our screen using Cas9 positive read data.

(A) We analyzed our screen data using Cas9-positive replicates from T0 and Tf using the MAGeCK-RRA (robust rank aggregation) method with a controlled FDR of 10%. The dashed horizontal line represents p < 0.05; any genes above this line are significant. Genes to the left of the dashed vertical line have log2-fold-change (LFC) < 0, indicating their sgRNA abundances decrease from T0 to Tf. We did not find any genes to be significant using these typical parameters for MAGeCK-RRA. (B) Removal of FDR control with MAGeCK-RRA revealed 932 genes (highlighted in purple) that had LFC < 0 and unadjusted p value < 0.05. Genes shown in grey did not meet these criteria. (C) We analyzed our screen data using Cas9-positive replicates from T0 and Tf using the MAGeCK-MLE (maximum likelihood estimation) method with a controlled FDR of 10%. Genes above the dashed horizontal line have p < 0.05 and are significant. Genes to the left of the dashed vertical line have LFC < -1, the typically used cutoff for gene essentiality using this method. We did not find any genes that met both of these criteria. (D) Removal of FDR control with MAGeCK-MLE and relaxing the LFC cutoff to < 0 revealed 1,918 genes (highlighted in green) that had LFC < 0 and p value < 0.05. Genes shown in grey did not meet these criteria. (E) Venn diagram showing overlap of the 932 genes (in purple) identified by MAGeCK-RRA with the genes identified by TRACS as having low Library ES (5,424 genes total). 259 genes overlap between the two sets (27.8%). (F) Venn diagram showing overlap of the 1,918 genes (in green) identified by MAGeCK-MLE with the genes identified by TRACS as having low Library ES. 499 genes overlap between the two sets (26.0%). (G) We analyzed our screen data using Cas9-positive replicates from T0 and Tf using BAGEL and plotted the Bayes factor output for each gene in relation to the gene ranking.
The Bayes factors for all genes were negative, indicating BAGEL did not discover any perturbations in sgRNA abundances between T0 and Tf. (H) A graphical representation of Bayes factors calculated by BAGEL for the top 15 genes with the highest integer value Bayes factors. All Bayes factors are < 0 indicating gene essentiality was not detected. Error bars show standard deviation for each gene as calculated by BAGEL.

https://doi.org/10.1371/journal.pone.0315923.s003

(PDF)
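The overlap percentages in panels E and F are the fraction of each method's gene calls that also fall in the low Library ES set. The sketch below shows the set logic on invented gene names and then repeats the arithmetic with the counts reported in the caption; `overlap_percent` is a hypothetical helper, not part of TRACS or MAGeCK.

```python
def overlap_percent(hits, low_library_es):
    """Percentage of `hits` that also appear in `low_library_es`."""
    hits, low_library_es = set(hits), set(low_library_es)
    return 100.0 * len(hits & low_library_es) / len(hits)

# Toy gene sets (invented names) to show the set logic.
toy_hits = {"GENE_A", "GENE_B", "GENE_C", "GENE_D"}
toy_low = {"GENE_B", "GENE_X", "GENE_Y"}
toy_percent = overlap_percent(toy_hits, toy_low)  # 1 of 4 hits overlaps
print(toy_percent)

# Counts reported in the caption: (overlapping genes, genes called by the method).
reported = {"MAGeCK-RRA": (259, 932), "MAGeCK-MLE": (499, 1918)}
percentages = {
    method: round(100.0 * overlap / total, 1)
    for method, (overlap, total) in reported.items()
}
print(percentages)
```

The denominator is the number of genes called by each MAGeCK method, not the 5,424 genes in the low Library ES set.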

S4 Fig. Top-ranked genes by MAGeCK have low representation in the T0 pool of cells.

The 3D plot highlights in dark blue the genes that TRACS determined to have low Library ES. The vertical axis represents Library ES. The volcano plots illustrate genes that were found to be essential by MAGeCK-RRA or MAGeCK-MLE (LFC < 0 and unadjusted p value < 0.05; no FDR cutoffs). Dark blue data points in volcano plots indicate genes that TRACS found to have low Library ES, demonstrating that removing the FDR cutoff selects for genes with poor sgRNA representation. In all three plots, genes in red have Library ES > 985 and genes in dark blue have Library ES < 985.

https://doi.org/10.1371/journal.pone.0315923.s004

(PDF)

S5 Fig. TRACS accurately classifies non-targeting controls and robustly classifies known essential and non-essential gene sets.

(A) We evaluated the ability of MAGeCK to classify the 1,000 NTC sgRNAs in the GeCKO v2 pooled library as non-essential and compared it to TRACS as shown in Fig 2A. The AUCs for MAGeCK-RRA (51.3%) and MAGeCK-MLE (56.3%) were considerably lower than that for TRACS (98.5%). (B) We evaluated the ability of TRACS and MAGeCK to classify the previously described Hart et al. gene set of universally non-essential genes. TRACS (AUC: 93.5%) outperformed MAGeCK-RRA (AUC: 86.9%) and MAGeCK-MLE (AUC: 85.8%), suggesting it can reliably identify these non-essential genes. (C) We also evaluated the ability of TRACS and MAGeCK to classify a known set of universally essential genes. TRACS (AUC: 92.6%) outperformed both MAGeCK-RRA (AUC: 78.1%) and MAGeCK-MLE (AUC: 85.9%), indicating it can robustly identify essential genes.

https://doi.org/10.1371/journal.pone.0315923.s005

(PDF)
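The AUC values in S5 Fig summarize how well a score separates two classes of genes. The caption does not state how the AUCs were computed, so the sketch below assumes a standard ROC AUC, calculated as the probability that a randomly chosen member of one class scores higher than a randomly chosen member of the other (ties count as one half). The function `roc_auc` and the toy scores are hypothetical.

```python
def roc_auc(higher_class_scores, lower_class_scores):
    """ROC AUC via pairwise comparison (equivalent to the Mann-Whitney U statistic).

    Returns the fraction of (higher, lower) pairs in which the member of the
    class expected to score higher actually does; ties contribute 0.5.
    """
    wins = 0.0
    for h in higher_class_scores:
        for l in lower_class_scores:
            if h > l:
                wins += 1.0
            elif h == l:
                wins += 0.5
    return wins / (len(higher_class_scores) * len(lower_class_scores))

# Toy scores, invented for illustration: non-targeting controls are expected
# to sit near zero, essential genes to be strongly negative.
ntc_scores = [0.1, -0.2, 0.05, 0.3]
essential_scores = [-1.8, -2.4, -0.1, -1.1]
auc = roc_auc(ntc_scores, essential_scores)
print(auc)
```

An AUC of 1.0 means the two classes are perfectly separated and 0.5 means the score is no better than chance, which is the context for the values near 50% reported for MAGeCK on the NTC set.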

S6 Fig. TRACS selects for essential genes based on the most sgRNAs.

(A) Bar plot showing the distribution of the number of sgRNAs per gene for the 6,717 genes that had ER < 0 and padj < 0.05 in TRACS. Light blue color corresponds to light blue data points shown in Fig 1D. (B) Bar plots showing the distribution of sgRNAs per gene discovered by MAGeCK-RRA and MAGeCK-MLE with LFC < 0 and unadjusted p value < 0.05. Purple and green colors correspond to the colored data points in the volcano plots in S3 Fig. Most top-ranked genes identified by MAGeCK-RRA had 6 sgRNAs per gene, although at reduced frequency, which is attributed to the smaller number of genes discovered by MAGeCK. MAGeCK-MLE had wider disparity across genes, as it made essentiality calls using as few as 1 sgRNA per gene. The peaks at 4 sgRNAs per gene in each of the three histograms represent miRNAs, which have a maximum of 4 sgRNAs instead of 6.

https://doi.org/10.1371/journal.pone.0315923.s006

(PDF)
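The bar plots in S6 Fig tally how many sgRNAs support each selected gene. A minimal sketch of that tally is given below; the function name, the sgRNA-to-gene mapping format and the toy identifiers are assumptions for illustration, and the mapping is taken to contain only the sgRNAs retained in the analysis.

```python
from collections import Counter

def sgrnas_per_gene_distribution(sgrna_to_gene, selected_genes):
    """Return a Counter mapping (sgRNAs per gene) -> (number of selected genes).

    sgrna_to_gene: dict mapping sgRNA identifier -> gene name (hypothetical format).
    selected_genes: genes passing the essentiality criteria.
    """
    selected_genes = set(selected_genes)
    per_gene = Counter(
        gene for gene in sgrna_to_gene.values() if gene in selected_genes
    )
    return Counter(per_gene.values())

# Toy mapping, invented for illustration only.
toy_mapping = {
    "sgA_1": "GENE_A", "sgA_2": "GENE_A", "sgA_3": "GENE_A",
    "sgB_1": "GENE_B",
    "sgC_1": "GENE_C", "sgC_2": "GENE_C", "sgC_3": "GENE_C",
    "sgD_1": "GENE_D",
}
distribution = sgrnas_per_gene_distribution(
    toy_mapping, {"GENE_A", "GENE_B", "GENE_C"}
)
print(sorted(distribution.items()))
```

A distribution weighted toward the maximum number of sgRNAs per gene indicates that calls rest on more independent guides, which is the property the figure highlights for TRACS.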

S1 Table. sgRNA sequences used in this study.

https://doi.org/10.1371/journal.pone.0315923.s007

(XLSX)

References

  1. Shalem O, Sanjana NE, Zhang F. High-throughput functional genomics using CRISPR-Cas9. Nat Rev Genet. 2015;16(5):299–311. pmid:25854182
  2. Lytle NK, Ferguson LP, Rajbhandari N, Gilroy K, Fox RG, Deshpande A, et al. A Multiscale Map of the Stem Cell State in Pancreatic Adenocarcinoma. Cell. 2019;177(3):572–86 e22. pmid:30955884
  3. Wang T, Wei JJ, Sabatini DM, Lander ES. Genetic screens in human cells using the CRISPR-Cas9 system. Science. 2014;343(6166):80–4. pmid:24336569
  4. Shalem O, Sanjana NE, Hartenian E, Shi X, Scott DA, Mikkelsen TS, et al. Genome-scale CRISPR-Cas9 knockout screening in human cells. Science. 2014;343(6166):84–7. pmid:24336571
  5. Parnas O, Jovanovic M, Eisenhaure TM, Herbst RH, Dixit A, Ye CJ, et al. A Genome-wide CRISPR Screen in Primary Immune Cells to Dissect Regulatory Networks. Cell. 2015;162(3):675–86. pmid:26189680
  6. Sanson KR, Hanna RE, Hegde M, Donovan KF, Strand C, Sullender ME, et al. Optimized libraries for CRISPR-Cas9 genetic screens with multiple modalities. Nat Commun. 2018;9(1):5416. pmid:30575746
  7. Cai MY, Dunn CE, Chen W, Kochupurakkal BS, Nguyen H, Moreau LA, et al. Cooperation of the ATM and Fanconi Anemia/BRCA Pathways in Double-Strand Break End Resection. Cell Rep. 2020;30(7):2402–15 e5. pmid:32075772
  8. Koike-Yusa H, Li Y, Tan EP, Velasco-Herrera Mdel C, Yusa K. Genome-wide recessive genetic screening in mammalian cells with a lentiviral CRISPR-guide RNA library. Nat Biotechnol. 2014;32(3):267–73. pmid:24535568
  9. Thyme SB, Akhmetova L, Montague TG, Valen E, Schier AF. Internal guide RNA interactions interfere with Cas9-mediated cleavage. Nat Commun. 2016;7:11750. pmid:27282953
  10. Tsherniak A, Vazquez F, Montgomery PG, Weir BA, Kryukov G, Cowley GS, et al. Defining a Cancer Dependency Map. Cell. 2017;170(3):564–76 e16. pmid:28753430
  11. Kenny HA, Lal-Nag M, White EA, Shen M, Chiang CY, Mitra AK, et al. Quantitative high throughput screening using a primary human three-dimensional organotypic culture predicts in vivo efficacy. Nat Commun. 2015;6:6220. pmid:25653139
  12. Jacob F, Salinas RD, Zhang DY, Nguyen PTT, Schnoll JG, Wong SZH, et al. A Patient-Derived Glioblastoma Organoid Model and Biobank Recapitulates Inter- and Intra-tumoral Heterogeneity. Cell. 2020;180(1):188–204 e22. pmid:31883794
  13. Fujii M, Shimokawa M, Date S, Takano A, Matano M, Nanki K, et al. A Colorectal Tumor Organoid Library Demonstrates Progressive Loss of Niche Factor Requirements during Tumorigenesis. Cell Stem Cell. 2016;18(6):827–38. pmid:27212702
  14. Vlachogiannis G, Hedayat S, Vatsiou A, Jamin Y, Fernandez-Mateos J, Khan K, et al. Patient-derived organoids model treatment response of metastatic gastrointestinal cancers. Science. 2018;359(6378):920–6. pmid:29472484
  15. Ringel T, Frey N, Ringnalda F, Janjuha S, Cherkaoui S, Butz S, et al. Genome-Scale CRISPR Screening in Human Intestinal Organoids Identifies Drivers of TGF-beta Resistance. Cell Stem Cell. 2020;26(3):431–40 e8.
  16. Planas-Paz L, Sun T, Pikiolek M, Cochran NR, Bergling S, Orsini V, et al. YAP, but Not RSPO-LGR4/5, Signaling in Biliary Epithelial Cells Promotes a Ductular Reaction in Response to Liver Injury. Cell Stem Cell. 2019;25(1):39–53 e10. pmid:31080135
  17. Zanoni M, Piccinini F, Arienti C, Zamagni A, Santi S, Polico R, et al. 3D tumor spheroid models for in vitro therapeutic screening: a systematic approach to enhance the biological relevance of data obtained. Sci Rep. 2016;6:19103. pmid:26752500
  18. Perampalam P, MacDonald JI, Zakirova K, Passos DT, Wasif S, Ramos-Valdes Y, et al. Netrin signaling mediates survival of dormant epithelial ovarian cancer cells. Elife. 2024;12. pmid:39023520
  19. MacDonald J, Ramos-Valdes Y, Perampalam P, Litovchick L, DiMattia GE, Dick FA. A Systematic Analysis of Negative Growth Control Implicates the DREAM Complex in Cancer Cell Dormancy. Mol Cancer Res. 2017;15(4):371–81. pmid:28031411
  20. Hart T, Chandrashekhar M, Aregger M, Steinhart Z, Brown KR, MacLeod G, et al. High-Resolution CRISPR Screens Reveal Fitness Genes and Genotype-Specific Cancer Liabilities. Cell. 2015;163(6):1515–26. pmid:26627737
  21. Joung J, Konermann S, Gootenberg JS, Abudayyeh OO, Platt RJ, Brigham MD, et al. Genome-scale CRISPR-Cas9 knockout and transcriptional activation screening. Nat Protoc. 2017;12(4):828–63. pmid:28333914
  22. Li W, Xu H, Xiao T, Cong L, Love MI, Zhang F, et al. MAGeCK enables robust identification of essential genes from genome-scale CRISPR/Cas9 knockout screens. Genome Biol. 2014;15(12):554. pmid:25476604
  23. Zhou Y, Zhu S, Cai C, Yuan P, Li C, Huang Y, et al. High-throughput screening of a CRISPR/Cas9 library for functional genomics in human cells. Nature. 2014;509(7501):487–91. pmid:24717434
  24. Wang T, Yu H, Hughes NW, Liu B, Kendirli A, Klein K, et al. Gene Essentiality Profiling Reveals Gene Networks and Synthetic Lethal Interactions with Oncogenic Ras. Cell. 2017;168(5):890–903 e15. pmid:28162770
  25. Sanjana NE, Shalem O, Zhang F. Improved vectors and genome-wide libraries for CRISPR screening. Nat Methods. 2014;11(8):783–4. pmid:25075903
  26. Matulonis UA, Sood AK, Fallowfield L, Howitt BE, Sehouli J, Karlan BY. Ovarian cancer. Nat Rev Dis Primers. 2016;2:16061. pmid:27558151
  27. Bowtell DD, Bohm S, Ahmed AA, Aspuria PJ, Bast RC Jr., Beral V, et al. Rethinking ovarian cancer II: reducing mortality from high-grade serous ovarian cancer. Nat Rev Cancer. 2015;15(11):668–79. pmid:26493647
  28. Aguirre AJ, Meyers RM, Weir BA, Vazquez F, Zhang CZ, Ben-David U, et al. Genomic Copy Number Dictates a Gene-Independent Cell Response to CRISPR/Cas9 Targeting. Cancer Discov. 2016;6(8):914–29. pmid:27260156
  29. Wang T, Birsoy K, Hughes NW, Krupczak KM, Post Y, Wei JJ, et al. Identification and characterization of essential genes in the human genome. Science. 2015;350(6264):1096–101. pmid:26472758
  30. Patch AM, Christie EL, Etemadmoghadam D, Garsed DW, George J, Fereday S, et al. Whole-genome characterization of chemoresistant ovarian cancer. Nature. 2015;521(7553):489–94. pmid:26017449
  31. Sun C, Fang Y, Yin J, Chen J, Ju Z, Zhang D, et al. Rational combination therapy with PARP and MEK inhibitors capitalizes on therapeutic liabilities in RAS mutant cancers. Sci Transl Med. 2017;9(392). pmid:28566428
  32. Wang B, Wang M, Zhang W, Xiao T, Chen CH, Wu A, et al. Integrative analysis of pooled CRISPR genetic screens using MAGeCKFlute. Nat Protoc. 2019;14(3):756–80. pmid:30710114