Table 1.
Ten molecular signatures from genomics data used for predicting novel RLR pathway components.
Fig 1.
Bayesian integration of ten molecular signatures of RLR pathway components from genomics data.
(A) Distributions of the 49 known RLR pathway components (RLR genes, green) and 5,818 genes unlikely to be part of the pathway (non-RLR genes, red) across the 10 molecular signature data sets we identified as predictive of the RLR system (see also Table 1). Data sets were binned into discrete intervals and fractions of (non-)RLR genes add up to one. Arrows indicate the behavior of RIG-I across the data. The top five signatures describe the relationship of RLR signaling with viruses; the bottom five describe properties of the pathway itself. (B) Boxplots of the genome-wide integrated RLR score (Bayesian posterior probability score). Genes were grouped into one of five classes: known RLR genes (green, see [A]), components of other PRR signaling pathways (‘TLR, CLR, NLR, cytDNA’; purple), genes functioning in other aspects of the innate immune response (‘innate immunity’; blue), and non-RLR genes (red, see [A]). The remaining genes are classified as ‘other’ (gray). (C) The 50 genes with the highest RLR scores. Representative RLR and other innate antiviral response genes are indicated. The pie chart shows the occurrences of the different gene classes in the top 354 RLR ranks. (D) Receiver operating characteristic (ROC) curve illustrating the performance of the integrated RLR score (solid black line) and the individual molecular signatures (black dots) for predicting known RLR versus non-RLR genes. Sensitivity and specificity were calculated at various score thresholds (for the RLR score), or at specific thresholds that include all bins with positive likelihood ratio scores (for the individual data sets; see (A)). The asterisk denotes the sensitivity and specificity corresponding to a false discovery rate (FDR) of 57% (top 354 genes). Note that, to avoid circularity, the predictive ability of the co-expression, protein domain and RLR pathway PPI data sets in (A) and (D) was assessed using the set of TLR, CLR, NLR, cytDNA genes instead of the RLR genes (see Methods).
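The integration summarized in (A) and (B) can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a naive Bayes scheme in which each binned data set contributes a log2 likelihood ratio (fraction of RLR genes in a bin divided by the fraction of non-RLR genes in that bin) and the contributions are summed per gene. The function names, the pseudocount, and the toy bin labels are hypothetical.

```python
import math
from collections import Counter


def bin_log_likelihood_ratios(rlr_bins, non_rlr_bins, pseudocount=1.0):
    """Return a log2 likelihood ratio for every bin of one data set.

    rlr_bins / non_rlr_bins: bin labels of the known RLR genes and of the
    non-RLR genes. The pseudocount (an assumption of this sketch) keeps
    empty bins from producing a division by zero or log(0).
    """
    labels = set(rlr_bins) | set(non_rlr_bins)
    pos, neg = Counter(rlr_bins), Counter(non_rlr_bins)
    n_pos = len(rlr_bins) + pseudocount * len(labels)
    n_neg = len(non_rlr_bins) + pseudocount * len(labels)
    return {
        label: math.log2(
            ((pos[label] + pseudocount) / n_pos)
            / ((neg[label] + pseudocount) / n_neg)
        )
        for label in labels
    }


def integrated_score(gene_bins, llr_tables):
    """Sum the per-data-set log likelihood ratios for one gene.

    gene_bins: {data set name: bin label of this gene}.
    llr_tables: {data set name: {bin label: log2 likelihood ratio}}.
    A data set in which the gene has no value contributes 0, that is,
    no evidence either way (an assumption of this sketch).
    """
    score = 0.0
    for name, table in llr_tables.items():
        score += table.get(gene_bins.get(name), 0.0)
    return score


# Toy usage with one hypothetical signature and made-up bin labels.
tables = {
    "toy_signature": bin_log_likelihood_ratios(
        ["high", "high", "high", "low"],  # known RLR genes
        ["low", "low", "low", "high"],    # non-RLR genes
    )
}
print(integrated_score({"toy_signature": "high"}, tables))
```

Summing log likelihood ratios assumes that the data sets are conditionally independent given the gene class. Adding the log of the prior odds to the sum would give posterior log odds.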
Table 2.
Overlap between innate (antiviral) response data sets and the top 354 RLR predictions excluding known RLR genes.
Fig 2.
RNAi screens validate a role for the novel RLR candidates in RIG-I-mediated IFNβ induction.
(A) Flow chart of the RNAi validation screens. 187 candidate RLR genes were screened for RIG-I pathway activity in three different RNAi screens. In screens 1 and 2, HeLa cells stably expressing an IFNβ promoter-controlled firefly luciferase (Fluc) reporter were stimulated with a 5’-ppp-containing RIG-I RNA ligand. The 57 hits (15 up, 42 down) with the largest effect on IFNβ induction upon siRNA knockdown in screen 1 (stringent Z-score <-2 or >2) were tested again in screen 2 with a different set of siRNAs. The 19 top hits from screen 2 were then picked for screen 3, which is similar to the first two screens except that it measures IFNβ mRNA levels using quantitative real-time RT-PCR (qRT-PCR). (B) Correlation between the negative control-based robust Z-scores of RNAi screens 1 and 2. The 57 top hits with Z-scores <-2 or >2 in screen 1 were tested again in screen 2 (purple data points). N.T., non-transfected; SCR, scrambled. (C) Overview of the 19 novel RIG-I pathway genes with the largest effects on IFNβ induction in screens 1 and 2 (Z-score <-2 in both screens). Black data points correspond to genes whose knockdown also causes a reduction in IFNβ mRNA levels in screen 3. (D) RNAi screen 3. 13 of the 19 top hits from screens 1 and 2 also reduce RIG-I-mediated IFNβ mRNA production (black bars). Experiments were performed in triplicate (n = 3). Bars (mean ± SEM) display the fold induction of IFNβ mRNA (corrected for actin mRNA levels) compared to the mock-treated control. Statistical significance was assessed by one-way analysis of variance (ANOVA) followed by Dunnett’s multiple comparison test, comparing the values for each of the 19 test genes to the combined negative control conditions (scrambled and LGP2, red bars). ** P < 0.01; *** P < 0.001. (E) Correlation between the in silico integrated RLR score and the probability of experimental confirmation in RNAi screen 1. The dark purple line represents all 94 hits with Z-score <-1.25 or >1.25; the light purple line represents the top 57 hits with Z-score <-2 or >2. The 187 experimentally tested genes were rank-ordered based on the RLR score and precision was calculated sequentially as the fraction of validated hits among all tested genes having a certain RLR score or higher.
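The sequential precision calculation described for (E) can be sketched as follows. This is an illustrative reading of the caption, not the authors' code; the function name and the input layout (one `(rlr_score, validated)` pair per tested gene) are assumptions, as is the choice to report tied scores as a single point.

```python
def precision_by_score(tested):
    """Cumulative precision over genes ranked by decreasing RLR score.

    tested: iterable of (rlr_score, validated) pairs, one per tested gene,
    where validated is True if the gene was a hit in the screen.
    Returns a list of (score, precision) pairs, where precision is the
    fraction of validated hits among all tested genes with that score
    or higher.
    """
    ranked = sorted(tested, key=lambda gene: gene[0], reverse=True)
    curve = []
    hits = 0
    for rank, (score, validated) in enumerate(ranked, start=1):
        if validated:
            hits += 1
        # Emit one point per distinct score, after the last gene of a tie,
        # so that "this score or higher" includes every tied gene.
        last_of_tie = rank == len(ranked) or ranked[rank][0] != score
        if last_of_tie:
            curve.append((score, hits / rank))
    return curve


# Toy usage with made-up scores and validation outcomes.
toy = [(5.0, True), (3.0, False), (4.0, True), (3.0, True)]
for score, precision in precision_by_score(toy):
    print(score, precision)
```

Under this reading, the two lines in (E) correspond to running the same calculation twice, once with hits defined by the Z-score cutoff of 1.25 and once with the stricter cutoff of 2.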
Table 3.
Validations of our predicted RLR candidates by independent studies.
Fig 3.
Human and viral protein interaction networks connecting the known RLR pathway with the newly identified RIG-I factors DDX17 and SNW1.
Human proteins are represented by circles, viral proteins by rounded rectangles (purple nodes). Green nodes represent known components of the RLR pathway. Orange nodes (DDX17 and SNW1) are novel RIG-I pathway components discovered in our study, which are connected to the RLR network through interactions with the green nodes. Edges between human proteins represent physical interactions (both low- and high-throughput) obtained from BioGRID Release 3.3 [54]. Interactions between human and viral proteins were obtained from the PHISTO database (29 Sep. 2014) [28]. See S1 Fig for a more complete representation of the RLR pathway containing the curated set of 49 known RLR genes. LaCV, La Crosse virus; EBV, Epstein-Barr virus; SFSV, Sandfly fever Sicilian virus; PRRSV, Porcine reproductive and respiratory syndrome virus; HPV, Human papillomavirus.