Inference of B cell clonal families using heavy/light chain pairing information
Fig 8
Pair info cleaning effectiveness on real data for a relatively well-paired sample (left, same sample as Fig 7) and a sample with substantial numbers of multiply-paired sequences (right, not yet published, but raw data included in https://doi.org/10.5281/zenodo.5860143).
The x axis shows the number of sequences paired with each sequence (“paired seqs per seq”) before and after application of our pair info cleaning algorithm. For example, a perfectly paired sample (with perfect allelic exclusion, no dropout, etc.) would have all sequences in the 1-bin (i.e. uniquely paired), while a sample with two cells in each droplet would have all sequences in the 3-bin, since each of the four sequences in such a droplet is paired with the other three. An ideal application of our algorithm would thus leave every sequence uniquely paired (all in the 1-bin), but in practice some are left unpaired (0-bin).
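To make the binning on the x axis concrete, the following minimal Python sketch counts "paired seqs per seq" from a droplet-to-sequence mapping. It is an illustration only, not the cleaning algorithm or code used to produce this figure. It assumes that, before cleaning, every sequence is treated as paired with all other sequences sharing its droplet; the function name `paired_seqs_per_seq`, the input layout, and the sequence ids are hypothetical.

```python
from collections import Counter


def paired_seqs_per_seq(droplets):
    """Histogram of the number of sequences paired with each sequence.

    droplets: dict mapping a droplet id to the list of sequence ids seen in it
    (hypothetical layout, chosen for illustration).
    Assumption: each sequence is paired with every other sequence in its droplet.
    Returns a Counter mapping bin (number of partners) to number of sequences.
    """
    hist = Counter()
    for seq_ids in droplets.values():
        n_partners = len(seq_ids) - 1  # every other sequence in the droplet
        hist[n_partners] += len(seq_ids)  # each sequence in the droplet lands in this bin
    return hist


# One cell per droplet: one heavy and one light chain each.
well_paired = {"d1": ["h1", "l1"], "d2": ["h2", "l2"]}
# Two cells in one droplet: four sequences, each with three partners.
two_cells = {"d1": ["h1", "l1", "h2", "l2"]}
# Dropout: the light chain was lost, leaving the heavy chain unpaired.
dropout = {"d1": ["h1"]}

hist_well_paired = paired_seqs_per_seq(well_paired)
hist_two_cells = paired_seqs_per_seq(two_cells)
hist_dropout = paired_seqs_per_seq(dropout)
```

Under these assumptions the well-paired toy sample puts all four sequences in the 1-bin, the two-cell droplet puts all four in the 3-bin, and the dropout case puts its single sequence in the 0-bin, matching the bins described in the caption.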