Table 1.
Gene Sets Analyzed.
Figure 1.
Overlap Analysis to Generate an Empirical p-Value.
In each panel the downward arrow indicates the number of genes overlapping between the experimental data sets, and the bars show the frequencies of overlaps in comparisons to random distributions. (A) Simulation of overlap between all of the genes called in genome-wide screens (lists 1–9) and the NCBI database of factors reported to be involved in HIV replication (list 10). One thousand pairs of gene sets were drawn randomly from the set of all human genes, 1,254 to simulate the set of all genes from genome-wide screens and 1,434 to simulate NCBI genes, and the overlap in each pair was plotted. The experimental overlap was 257 genes. (B) Simulation of overlapping genes between the König et al. and Zhou et al. siRNA screens. The experimental overlap was nine genes. The p-value calculated using the hypergeometric distribution was slightly lower (p = 0.014). (C) Simulation of expected overlap between screens given the measured error between replicates. The standard deviation of infectivity measurements was calculated from the König et al. siRNA screens, and then simulated datasets were generated containing the measured error. For simulations, either two replicates (pink) or ten replicates (yellow) were generated and the overlap quantified. y-axis: number of top-scoring genes considered in overlap analysis; x-axis: actual number of overlapping genes seen comparing simulated data sets. (D) Choices for toxicity threshold strongly influence the recovery of genes affecting HIV infection. The genes tested in the König et al. siRNA screen were ranked according to toxicity of knockdown, then sets containing 100% of genes, the least toxic 50%, or the least toxic 20% were extracted (top). From each of these, the 300 genes that when knocked down showed the strongest reduction in HIV infection were then selected, and the overlap between gene sets was calculated (bottom).
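The resampling in panel (A) and the hypergeometric comparison in panel (B) can be illustrated with a minimal Python sketch. This is not the authors' code. The gene-universe size `N_GENES = 20000` is an assumed placeholder, since the caption does not state how many human genes were in the sampling pool. The function names and the add-one correction in the empirical p-value are also illustrative choices. The set sizes (1,254 and 1,434), the number of random pairs (1,000), and the observed overlap (257) are taken from the caption.

```python
import random
from math import comb

N_GENES = 20000  # ASSUMPTION: size of the "all human genes" pool; not given in the caption


def simulate_overlaps(universe_size, size_a, size_b, n_trials=1000, seed=0):
    """Draw n_trials random pairs of gene sets and return each pair's overlap size."""
    rng = random.Random(seed)
    universe = range(universe_size)
    overlaps = []
    for _ in range(n_trials):
        set_a = set(rng.sample(universe, size_a))
        set_b = set(rng.sample(universe, size_b))
        overlaps.append(len(set_a & set_b))
    return overlaps


def empirical_p_value(overlaps, observed):
    """Fraction of simulated overlaps at least as large as the observed one.

    The add-one correction (an illustrative choice) keeps the estimate above zero.
    """
    hits = sum(1 for o in overlaps if o >= observed)
    return (hits + 1) / (len(overlaps) + 1)


def hypergeom_tail(universe_size, size_a, size_b, observed):
    """Exact P(overlap >= observed) when size_b genes are drawn without replacement
    from a universe that contains size_a 'marked' genes."""
    total = comb(universe_size, size_b)
    upper = min(size_a, size_b)
    numerator = sum(
        comb(size_a, k) * comb(universe_size - size_a, size_b - k)
        for k in range(observed, upper + 1)
    )
    return numerator / total


if __name__ == "__main__":
    # Panel (A): 1,254 screen genes vs. 1,434 NCBI genes, observed overlap 257.
    sims = simulate_overlaps(N_GENES, 1254, 1434, n_trials=1000)
    print("largest simulated overlap:", max(sims))
    print("empirical p-value:", empirical_p_value(sims, 257))
    print("hypergeometric tail:", hypergeom_tail(N_GENES, 1254, 1434, 257))
```

With 1,000 random pairs, the smallest empirical p-value this estimator can report is 1/1001, so the exact hypergeometric tail is the more informative number when the observed overlap lies far outside the simulated distribution.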
Table 2.
Statistical Analysis of Genes in Common between All Pairs of Genome-Wide Studies.
Table 3.
Genes Called in at Least Two siRNA Screens.
Figure 2.
Gene Ontology Analysis of GO Groups Enriched among Genes Called in Two or More siRNA Screens.
The color code indicates the number of genes in each functional group from each screen, derived using DAVID Functional Annotation Clustering. Annotations for each functional group were based on an assessment of the GO categories that make up each group, which can be found in Table S1.
Figure 3.
Gene Clusters, Generated Using PPI and MCODE Analysis, Derived from the Full Set of Genes Implicated in HIV Infection (Lists 1–10).
The size of each node is proportional to the number of screens in which the host cell gene was called. Gene identifiers are in Table S1. Diamonds indicate genes from the NCBI HIV interactions database. Color code: red = König et al., green = Brass et al., blue = Zhou et al., cyan = Fellay et al., magenta = Frankel interaction screens (unpublished data), yellow = HIV particle associated, and grey = Studamire and Goff integrase interactions. For genes that were called in multiple screens (larger symbols), a color was chosen arbitrarily from among the screens positive for that gene. Default parameters were used, specifically: Degree Cutoff: 2; Node Score Cutoff: 0.0; Haircut: true; Fluff: false; K-Core: 2; Maximum Depth From Seed: 100. (A) Proteasome; (B) Transcription/RNA Pol; (C) Mediator Complex; (D) Tat activation/Transcriptional elongation; (E) Immune response; (F) RNA Binding/Splicing; (G) HSPA5/BiP Chaperone; (H) CCT Chaperone; (I) t-RNA Synthase; (J) Transport; (K) Unknown.