GPMelt: A hierarchical Gaussian process framework to explore the dark meltome of thermal proteome profiling experiments

doi:10.1371/journal.pcbi.1011632

Fig 6

Additional model features.

Panels A to C present results from the Staurosporine 2014 dataset [1], panels D to F from the ATP 2019 dataset [19].

(A-B) Detection of outliers. (A) Subset of proteins with a BH-adjusted p-value greater than or equal to 0.8 according to both the NPARC and GPMelt methods, but an alternative-model posterior probability larger than 0.99 according to the Bayesian semi-parametric model. For each of these proteins, the plot shows, on a log scale, the estimated values of the output-scale parameters obtained from the three-level HGP model, with the shape corresponding to the replicate (r) and the color to the condition (c). The shaded area represents output-scale values larger than the 95th percentile, and the dotted line marks the 99th percentile. (B) Replicates whose output-scale lies above the 95th percentile are likely to contain one or more outlier observations.

(C-D) Area Between Curves as a new metric. To replace the previously used ΔT_m as a measure of the discrepancy between the fitted curves, we propose a new metric, the Area Between the Curves (ABC). The ABC can be computed by taking the median of the observations in each condition and computing the area between these medians. As a complement to this ABC_median metric, we propose a refined computation of the ABC using the output of the HGP model, denoted ABC_GPMelt (see Appendix D in S1 File). (C) A protein with at least one output-scale value above q75 + 1.5 × IQR is defined as presenting at least one outlier replicate (q75 being the 75th percentile and IQR the interquartile range). Comparing the ABC estimated with ABC_median to that estimated with ABC_GPMelt shows that ABC_median likely overestimates the ABC for proteins presenting at least one outlier replicate (Wilcoxon signed-rank test). (D) ABC_GPMelt as a valid metric to replace ΔT_m: considering the solubility effects reported by Sridharan et al. [19], ABC_GPMelt is correctly positive for desolubilized proteins and correctly negative for solubilized proteins (Wilcoxon signed-rank test).

(E-F) Introduction of a new scaling factor. The widely used fold change, in which intensities in a replicate are scaled to the intensity at the lowest temperature, is compared to a newly proposed scaling, named the mean scaling, in which intensities in a replicate are scaled to the mean intensity of that replicate. (E) Scaling comparison. Considering the differences in scaled abundances at each temperature between replicates of a condition, the x-axis represents the median difference and the y-axis the variance of the differences. Results are divided by condition (control and treatment) and by scaling (fold change vs mean scaling). Each panel is split into four quadrants, with the bottom-left quadrant corresponding to reproducible observations between replicates of a condition. The three other quadrants reveal a lack of reproducibility between replicates. (F) Examples of proteins falling outside the bottom-left quadrants in panel (E). For these proteins, the results of the mean scaling and the fold change are compared.

doi: https://doi.org/10.1371/journal.pcbi.1011632.g006