GPMelt: A hierarchical Gaussian process framework to explore the dark meltome of thermal proteome profiling experiments
Fig 6
Panels A to C present results from the Staurosporine 2014 dataset [1], panels D to F from the ATP 2019 dataset [19]. (A-B) Detection of outliers. (A) Subset of proteins with a BH-adjusted p-value greater than or equal to 0.8 according to both the NPARC and GPMelt methods, but an alternative-model posterior probability larger than 0.99 according to the Bayesian semi-parametric model. For each of these proteins, the plot shows, on a log scale, the estimated values of the output-scale parameters obtained from the three-level HGP model, with the shape corresponding to the replicate (r) and the color to the condition (c). The shaded area represents output-scale values larger than the 95th percentile, and the dotted line marks the 99th percentile. (B) Replicates with associated output-scale values
above the 95th percentile are likely to correspond to replicates presenting one or more outlier observations. (C-D) Area Between Curves as a new metric. To replace the previously used ΔTm as a measure of the discrepancy between the fitted curves, we propose a new metric, denoted the Area Between the Curves (ABC). The ABC can be computed by taking the median of the observations in each condition and computing the area between these medians. As a complement to this ABCmedian metric, we propose a refined computation of the ABC using the output of the HGP model, denoted ABCGPMelt (see Appendix D in S1 File). (C) A protein with at least one output-scale value
above q75 + 1.5 × IQR is defined as presenting at least one outlier replicate (q75 being the 75th percentile, and IQR the interquartile range). Comparing the ABC estimated using either ABCmedian or ABCGPMelt shows that ABCmedian likely overestimates the ABC for proteins presenting at least one outlier replicate (Wilcoxon signed-rank test). (D) ABCGPMelt as a valid metric to replace ΔTm: considering the solubility effects reported by Sridharan et al. [19], ABCGPMelt is correctly positive for desolubilized proteins and correctly negative for solubilized proteins (Wilcoxon signed-rank test). (E-F) Introduction of a new scaling factor. The widely used fold change, in which intensities in a replicate are scaled to the intensity at the lowest temperature, is compared to a newly proposed scaling, named the mean scaling. This scaling consists of scaling the intensities in a replicate to the mean intensity of that replicate. (E) Scaling comparison. Considering the differences in scaled abundances at each temperature between replicates of a condition, the x-axis represents the median difference and the y-axis the variance of the differences. Results are divided by condition (control and treatment) and by scaling (fold change vs. mean scaling). Each panel is split into four quadrants, with the bottom-left quadrant corresponding to reproducible observations between replicates of a condition. The three other quadrants reveal a lack of reproducibility between replicates. (F) Examples of proteins falling outside the bottom-left quadrants in panel (E). For these proteins, the results of the mean scaling and the fold change are compared.
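The quantities named in this caption can be illustrated with a minimal Python sketch. It is not the GPMelt implementation and it does not compute ABCGPMelt, whose definition is given in Appendix D in S1 File. All function names are hypothetical. The sketch assumes that intensities are ordered by increasing temperature, that the area is approximated with the trapezoidal rule, that the sign convention is treatment minus control, and that quartiles follow the "inclusive" method of Python's statistics module; none of these choices is specified in the caption.

```python
from statistics import mean, median, quantiles


def fold_change_scaling(intensities):
    """Scale one replicate to its intensity at the lowest temperature.

    Assumes the list is ordered by increasing temperature, so the
    reference intensity is the first entry.
    """
    ref = intensities[0]
    return [x / ref for x in intensities]


def mean_scaling(intensities):
    """Scale one replicate to its own mean intensity."""
    m = mean(intensities)
    return [x / m for x in intensities]


def abc_median(temps, control_reps, treatment_reps):
    """Signed area between the per-temperature medians of two conditions.

    control_reps and treatment_reps are lists of replicates, each a list
    of scaled intensities aligned with temps. The trapezoidal rule and
    the treatment-minus-control sign are assumptions of this sketch.
    """
    med_control = [median(col) for col in zip(*control_reps)]
    med_treatment = [median(col) for col in zip(*treatment_reps)]
    diff = [t - c for t, c in zip(med_treatment, med_control)]
    area = 0.0
    for i in range(len(temps) - 1):
        width = temps[i + 1] - temps[i]
        area += 0.5 * (diff[i] + diff[i + 1]) * width
    return area


def upper_fence(values):
    """Return q75 + 1.5 * IQR, the threshold used to flag outlier values."""
    q25, _, q75 = quantiles(values, n=4, method="inclusive")
    return q75 + 1.5 * (q75 - q25)


def has_outlier_value(values, reference_values):
    """True if any value lies above the upper fence of reference_values."""
    fence = upper_fence(reference_values)
    return any(v > fence for v in values)
```

After mean scaling, each replicate has a mean of 1 by construction, so a single noisy measurement at the lowest temperature no longer rescales the whole curve as it does under fold-change scaling.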