Capturing cell heterogeneity in representations of cell populations for image-based profiling using contrastive learning
Fig 4
CytoSummaryNet profiles partially generalize to unseen compounds and do not generalize to out-of-distribution batch data (Stain5).
The box plots show the mAP of replicate retrieval for training and validation compounds on the test plates of Stain2, Stain3, Stain4, and Stain5; each data point is the average mAP of one plate. CytoSummaryNet profiles are shown in dark blue (training compounds) and dark green (validation compounds), and average profiles in cyan (training compounds) and light green (validation compounds). Note that (i) the box plots for validation compounds in the second panel (“test plates Stain3”) are the same as the box plots in the third panel of Fig 3, and (ii) although the boxes are labeled “training compounds” and “validation compounds”, all data shown comes from test plates, so none of it has been seen during training (see the description of stratification in Results for further details). Welch’s t-tests were used to compare the mean mAP scores of CytoSummaryNet and average profiles on corresponding data; the p-values are indicated as stars at the top of each plot (ns = not significant). Because mAP scores are averaged per plate, each comparison rests on a limited number of data points, which may affect the statistical significance of the comparisons.
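The comparison described in this caption can be illustrated with a minimal sketch of Welch’s t-test on per-plate average mAP scores. The function name `welch_t` and the input values are hypothetical and chosen only for illustration; they are not data or code from the study. The sketch computes the t statistic and the Welch-Satterthwaite degrees of freedom using only the Python standard library.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Return Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    # Squared standard error of each group mean (sample variance / n).
    va = variance(a) / len(a)
    vb = variance(b) / len(b)
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    # Welch-Satterthwaite approximation; does not assume equal variances.
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


# Hypothetical per-plate average mAP values (NOT data from the paper).
cytosummarynet_map = [0.5, 0.6, 0.7]
average_profile_map = [0.3, 0.4, 0.5]

t, df = welch_t(cytosummarynet_map, average_profile_map)
print(f"t = {t:.3f}, df = {df:.1f}")
```

A p-value would then follow from the t distribution with `df` degrees of freedom, for example via `scipy.stats.ttest_ind(a, b, equal_var=False)` if SciPy is available. With only a handful of plates per group, the degrees of freedom stay small, which is the limitation on significance that the caption notes.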