Fig 1.
Results for selecting the optimal number of clusters using the elbow method (Panel A), the silhouette method (Panel B) and the gap statistic method (Panel C).
Selection is based on the largest local derivative of the within-groups sum of squares for the elbow method (Panel A, triangles), the maximum silhouette width for the silhouette method (Panel B) and the first non-negative value of $\mathrm{Gap}(k) - (\mathrm{Gap}(k+1) - sd_{k+1})$ for the gap statistic method (Panel C, triangles). These results indicate an optimal number of clusters in the 26–28 range. Panel D displays the GI50codebook dendrogram (Euclidean, Ward’s) with cuts at 28 clusters (red lines) and 7 clusters (green lines).
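The gap statistic selection rule in this legend can be illustrated with a minimal Python sketch. It assumes the Gap(k) values and their standard deviations sd_k have already been computed. The function name and the numeric values are hypothetical and are not taken from the study.

```python
from typing import Optional, Sequence


def first_k_by_gap_rule(
    gap: Sequence[float], sd: Sequence[float], k_min: int = 1
) -> Optional[int]:
    """Return the smallest k with Gap(k) - (Gap(k+1) - sd_{k+1}) >= 0.

    gap[i] and sd[i] hold Gap(k) and sd_k for k = k_min + i.
    Returns None when no k in the supplied range satisfies the rule.
    """
    if len(gap) != len(sd):
        raise ValueError("gap and sd must have the same length")
    for i in range(len(gap) - 1):
        # Criterion from the legend: first non-negative value.
        if gap[i] - (gap[i + 1] - sd[i + 1]) >= 0:
            return k_min + i
    return None


# Hypothetical values for k = 1..5 (illustration only, not study data).
gap_values = [0.10, 0.30, 0.55, 0.58, 0.57]
sd_values = [0.02, 0.02, 0.02, 0.05, 0.05]
best_k = first_k_by_gap_rule(gap_values, sd_values)
print(best_k)  # 3
```

With these illustrative numbers the criterion is negative for k = 1 and k = 2 and first becomes non-negative at k = 3.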
Fig 2.
Panel A displays a heatmap of GI50codebook, colored spectrally from green (chemoinsensitive) to red (chemosensitive) response.
The dendrogram at the left represents hierarchical clustering (Euclidean, Ward’s) of GI50codebooks (reproduced from Fig 1 Panel D). Panel B displays the SOMDTP colored according to hierarchical cutree [18] specified at the optimal number of 28 meta-clades. The 28 colors appear spectrally from meta-clade 1 (dark blue), at the bottom of the hierarchical dendrogram, to meta-clade 28 (dark red), at the top of the hierarchical dendrogram. The grayscale bar adjacent to the spectrally colored 28 meta-clade bar displays the 7 meta-clade groupings. The NCI60 tumor cell lines clustered in the heatmap are ordered, left to right, as: SK.OV.3.Ovarian, NCI.H322M.Lung, DU.145.Prostate, A549.ATCC.Lung, HOP.62.Lung, OVCAR.5.Ovarian, TK.10.Renal, EKVX.Lung, A498.Renal, NCI.H226.Lung, SK.MEL.28.Melanoma, SK.MEL.2.Melanoma, BT.549.Breast, UACC.257.Melanoma, MALME.3M.Melanoma, SN12C.Renal, OVCAR.8.Ovarian, NCI.H23.Lung, IGROV1.Ovarian, MDA.MB.231.ATCC.Breast, OVCAR.4.Ovarian, CAKI.1.Renal, ACHN.Renal, UO.31.Renal, HS.578T.Breast, RXF.393.Renal, T.47D.Breast, HOP.92.Lung, HCC.2998.Colon, HT29.Colon, COLO.205.Colon, NCI.H460.Lung, KM12.Colon, PC.3.Prostate, OVCAR.3.Ovarian, M14.Melanoma, MDA.MB.435.Breast, UACC.62.Melanoma, SK.MEL.5.Melanoma, SW.620.Colon, HCT.15.Colon, HCT.116.Colon, LOX.IMVI.Melanoma, MCF7.ATCC.Breast, MCF7.Breast, NCI.H522.Lung, K.562.Leukemia, RPMI.8226.Leukemia, HL.60.TB..Leukemia, SR.Leukemia, MOLT.4.Leukemia, CCRF.CEM.Leukemia.
Fig 3.
Panel A displays GI50codebook for SOM1,13, ordered from most to least chemosensitive.
The 5 tumor cell lines with the defective ABL1 gene appear as red bars. Panel B displays GI50component for the 5 tumor cell lines with defective ABL1. SOMDTP nodes are colored spectrally from highest chemosensitivity (red) to lowest chemosensitivity (blue).
Fig 4.
Panels A and B display significant chemosensitive SOMDTP nodes (projected as their t-statistic from a Student’s t-test; blue: least significant, red: most significant) for tumor cell lines with defective ABL1 and KRAS, respectively.
Panel C displays the 28 SOMDTP meta-clades.
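As a rough illustration of the per-node statistic shown in Panels A and B, the sketch below computes a two-sample Student's t-statistic with pooled variance. It assumes, without confirmation from the legend, that each node's GI50 codebook values for cell lines with the defective gene are compared against the values for the remaining cell lines. The function name and data are hypothetical.

```python
import math
from statistics import mean, variance
from typing import Sequence


def student_t(group_a: Sequence[float], group_b: Sequence[float]) -> float:
    """Two-sample Student's t-statistic using a pooled (equal) variance."""
    n_a, n_b = len(group_a), len(group_b)
    if n_a < 2 or n_b < 2:
        raise ValueError("each group needs at least two observations")
    # Pooled sample variance with n_a + n_b - 2 degrees of freedom.
    pooled = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    if pooled == 0:
        raise ValueError("pooled variance is zero; t is undefined")
    std_err = math.sqrt(pooled * (1 / n_a + 1 / n_b))
    return (mean(group_a) - mean(group_b)) / std_err


# Hypothetical codebook values for one node (illustration only).
defective_lines = [2.0, 4.0, 6.0]  # assumed: lines with the defective gene
other_lines = [1.0, 2.0, 3.0]      # assumed: remaining cell lines
t_node = student_t(defective_lines, other_lines)
print(round(t_node, 4))  # 1.5492
```

Repeating this calculation for every SOMDTP node would give one t-statistic per node, which could then be mapped to a color scale like the one described above.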
Fig 5.
Panel A: SOMDTP projections of FDA-approved compounds for the primary CellMiner-assigned MOAs.
Projections include the top 10th percentile of SOMDTP nodes for each compound. Panel B: histogram of the counts for these primary MOAs across SOM meta-clade groups. Primary MOAs appear color-coded in each vertical bar, with their heights corresponding to MOA counts in each meta-clade. Horizontal grayscale bar below Panel B indicates meta-clade groups A:G (reproduced from Fig 2 Panel A).
Table 1.
Most frequent primary MOA assignments within meta-clade groups A:G.
Fig 6.
Panel A displays the significant SOMDTP nodes for ABL1 (Ngene = 48).
Eleven FDA compounds are co-projected to Ngene, yielding 6 MOAs. The SOMDTP in Panel B displays the top 10th percentile of projections for FDA compounds sharing these MOAs (NFDA = 189). The intersection of Ngene and NFDA is 22, yielding a Fisher’s exact p-value of 1.958262e-09 (log(p-value) = -20.05).
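The contingency test summarized in this legend can be sketched as a one-sided Fisher's exact test computed from the hypergeometric tail. This is a minimal illustration, not the authors' implementation. The legend does not state whether the test was one-sided or two-sided, and it does not give the total number of SOMDTP nodes, so `N_TOTAL_NODES` below is a hypothetical placeholder and the resulting p-value is not expected to match the reported one.

```python
from math import comb, log


def fisher_exact_greater(
    overlap: int, n_gene: int, n_fda: int, n_total: int
) -> float:
    """One-sided Fisher's exact p-value, P(X >= overlap).

    X is the size of the overlap between a set of n_gene nodes and a
    set of n_fda nodes drawn from n_total nodes (hypergeometric model).
    """
    if not (0 <= n_gene <= n_total and 0 <= n_fda <= n_total):
        raise ValueError("set sizes must lie between 0 and n_total")
    upper = min(n_gene, n_fda)
    if not 0 <= overlap <= upper:
        raise ValueError("overlap is outside the feasible range")
    denom = comb(n_total, n_gene)
    tail = sum(
        comb(n_fda, x) * comb(n_total - n_fda, n_gene - x)
        for x in range(overlap, upper + 1)
    )
    return tail / denom


# Counts from the legend; the node total is an assumed placeholder.
N_TOTAL_NODES = 1000  # hypothetical, not stated in the legend
p_example = fisher_exact_greater(22, 48, 189, N_TOTAL_NODES)

# The reported log score is the natural logarithm of the p-value.
log_p_reported = log(1.958262e-09)
print(round(log_p_reported, 2))  # -20.05
```

The final lines show that the reported score of -20.05 corresponds to the natural logarithm of the reported p-value.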
Fig 7.
Fisher’s exact scores (log(p-value), p-value ≤ 0.05).
Results are based on classifications using up to the 10th best SOM projection nodes for FDA compounds. Forty-seven defective genes have significant Fisher’s exact scores when tested over the complete SOMDTP.
Fig 8.
In Panel A, SOMDTP is colored according to similarity of GI50codebooks, where the most similar node neighbors are displayed in deep red and the most dissimilar node neighbors appear in bright yellow (see vertical bar adjacent to SOMDTP). The 28 optimal meta-clade boundaries are displayed as a black line, with the boundaries of the 7 meta-clade groups superimposed as a white line. FDA-approved compounds are projected onto SOMDTP as blue hexagons, sized according to the number of FDA agents appearing in each node. Panel B displays the between-node GI50codebook Euclidean distances for nodes with FDA compound projections (top) and without (bottom). Panel C lists FDA compound names grouped by the 28 meta-clades. Panel D displays SOMDTP with FDA compounds (blue hexagons), meta-clade boundaries (solid lines) and meta-clade labels as numbers. Projections of FDA-approved compounds to SOMDTP nodes are listed in S5 master_appendix sheet appendix_Table_III.
Fig 9.
Panel A displays the contingency scores, ordered left to right, from most to least significant.
The horizontal dashed lines represent significance thresholds of p ≤ 0.05 (lower line) and p ≤ 0.1 (upper line). Panel B displays the SOMDTP co-projections of significant defective genes and MOAs for FDA compounds. Only co-occurrences for SOMDTP projections of FDA compounds are displayed. The SOMDTP region displayed in Panel B represents the boundary for meta-clades 1 through 6 (see the white border in Fig 8 Panel A). Panel C lists the counts for co-occurrence (see S6 master_appendix sheet gp_A). Panel D displays the tabular results in Panel C as a histogram. Node colors for defective genes correspond to the legend inserted into the upper left panel. The counts displayed in Panel C represent the top 10th percentile of SOMDTP co-projections for FDA compounds. A consistent coloring scheme is used for this and all subsequent figures, such that each defective gene presented in the RESULTS is assigned a unique color. S13 master_appendix sheet gp_A_FDA lists the counts for each FDA and MOA entry for these significant genes.
Fig 10.
Results for group B (meta-clades 7 through 9).
The SOMDTP region displayed in Panel B represents the boundary for meta-clades 7 through 9 (see the white border in Fig 8 Panel A). S7 master_appendix sheet gp_B lists the table in Panel C. See the legend of Fig 9 for details. S14 master_appendix sheet gp_B_FDA lists the FDA compounds associated with these defective genes.
Fig 11.
Results for group C (meta-clades 10 through 15).
The SOMDTP region displayed in Panel B represents the boundary for meta-clades 10 through 15 (see the white border in Fig 8 Panel A). S8 master_appendix sheet gp_C lists the table in Panel C. See the legend of Fig 9 for details. S15 master_appendix sheet gp_C_FDA lists the FDA compounds associated with these defective genes.
Fig 12.
Results for group D (meta-clades 16 through 18).
The SOMDTP region displayed in Panel B represents the boundary for meta-clades 16 through 18 (see the white border in Fig 8 Panel A). S9 master_appendix sheet gp_D lists the table in Panel C. See the legend of Fig 9 for additional details. S16 master_appendix sheet gp_D_FDA lists the FDA compounds associated with these defective genes.
Fig 13.
Results for group E (meta-clades 19 and 20).
The SOMDTP region displayed in Panel B represents the boundary for meta-clades 19 and 20 (see the white border in Fig 8 Panel A). S10 master_appendix sheet gp_E lists the table in Panel C. See the legend of Fig 9 for additional details. S17 master_appendix sheet gp_E_FDA lists the FDA compounds associated with these defective genes.
Fig 14.
Results for group F (meta-clades 21 through 24).
The SOMDTP region displayed in Panel B represents the boundary for meta-clades 21 through 24 (see the white border in Fig 8 Panel A). S11 master_appendix sheet gp_F lists the table in Panel C. See the legend of Fig 9 for additional details. S18 master_appendix sheet gp_F_FDA lists the FDA compounds associated with these defective genes.
Fig 15.
Results for group G (meta-clades 25 through 28).
The SOMDTP region displayed in Panel B represents the boundary for meta-clades 25 through 28 (see the white border in Fig 8 Panel A). S12 master_appendix sheet gp_G lists the table in Panel C. See the legend of Fig 9 for additional details. S19 master_appendix sheet gp_G_FDA lists the FDA compounds associated with these defective genes.