A benchmark of semi-supervised scRNA-seq integration methods in real-world scenarios
Fig 5
Scenario IV: Partially Annotated Batches.
(a) Bar plots of the performance of all methods on four datasets under this setting. Each bar indicates the overall weighted score of a method, while the triangle and circle represent the batch correction and bio-conservation scores, respectively. The vertical dashed lines separate the bars into four groups: the 30%, 50%, and 70% proportions, and the unsupervised methods. The five unsupervised counterparts are presented on the right, depicted with empty (unfilled) bars. (b) Scatter plot of the scaled batch correction score against the scaled bio-conservation score for each method under the partially annotated batches setting at different proportions, averaged across the four datasets. The scaled score for each dataset and missing proportion is calculated as the ratio of a method's overall bio-conservation or batch correction metric to the corresponding mean over the five unsupervised methods. The detailed scaling procedure can be found in Methods Section 2.2. Scaled scores for unsupervised methods are also included, shown as unfilled triangles. Colors indicate the methods, and marker size represents the missing batch proportion. The horizontal red dashed line represents the average bio-conservation score across all methods (both semi-supervised and unsupervised), while the vertical blue dotted line represents the average batch correction score. (c) Radar plots showing the performance of all methods on individual metrics for the human pancreas, macaque, human immune, and lung atlas datasets, averaged over the three proportions for semi-supervised methods. Metrics include biological conservation (red) and batch correction (blue). As scCRAFT achieved the highest overall performance among unsupervised methods, only its scores are shown for clarity; radar plots for the remaining methods are provided in Section 3 of S1 Text.
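The scaling used in panel (b) can be illustrated with a minimal sketch. It assumes only what the caption states: each method's score is divided by the mean score of the five unsupervised methods for the same dataset and proportion, and the result is then averaged across datasets. The function names, method labels, dataset labels, and all numeric values below are hypothetical placeholders, not results from the benchmark; the authoritative procedure is the one described in Methods Section 2.2.

```python
from statistics import mean


def scale_scores(method_scores, unsupervised_scores):
    """Divide each method's score by the mean of the unsupervised baselines.

    Both arguments map a method name to a single score (bio-conservation or
    batch correction) for one dataset and one missing proportion.
    """
    baseline = mean(unsupervised_scores.values())
    if baseline == 0:
        raise ValueError("mean of unsupervised scores is zero; ratio undefined")
    return {name: score / baseline for name, score in method_scores.items()}


def average_over_datasets(scaled_by_dataset):
    """Average each method's scaled score across datasets."""
    methods = next(iter(scaled_by_dataset.values())).keys()
    return {
        m: mean(scaled[m] for scaled in scaled_by_dataset.values())
        for m in methods
    }


# Hypothetical bio-conservation scores for two datasets at one proportion.
unsupervised = {
    "dataset_1": {"unsup_A": 0.60, "unsup_B": 0.70, "unsup_C": 0.65,
                  "unsup_D": 0.75, "unsup_E": 0.80},
    "dataset_2": {"unsup_A": 0.50, "unsup_B": 0.50, "unsup_C": 0.50,
                  "unsup_D": 0.50, "unsup_E": 0.50},
}
semi_supervised = {
    "dataset_1": {"semi_X": 0.84, "semi_Y": 0.63},
    "dataset_2": {"semi_X": 0.60, "semi_Y": 0.45},
}

scaled = {
    ds: scale_scores(semi_supervised[ds], unsupervised[ds])
    for ds in semi_supervised
}
averaged = average_over_datasets(scaled)
print(averaged)
```

Under this reading, a scaled score above 1 means the method scored higher than the unsupervised average on that metric, and a score below 1 means it scored lower.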