Interpreting and de-noising genetically engineered barcodes in a DNA virus
Fig 10
Overlap of clustered barcodes from the plasmid library, ligated virus genomes, and the virus library.
A. An L3 clustering distance and a 99% cumulative count cutoff were applied to the barcodes of the three libraries. Lower panel: the UpSet plot shows the number of barcodes from the virus library that intersect with barcodes from the plasmid library and/or the ligated virus genomes. The vast majority of barcodes from the virus library are also present in the plasmid library and the ligated virus genomes. Top panel: the UpSet plot shows the total read counts associated with barcodes in the virus library that intersect with the plasmid library and the ligated virus genomes. Barcodes present in all three libraries account for the overwhelming majority of read counts in the virus library.
B. Overlap of clustered barcodes shorter than 12 nucleotides from the plasmid library, ligated virus genomes, and the virus library. An L3 clustering distance and a 99% cumulative count cutoff were applied to the three libraries. Lower panel: the UpSet plot shows the number of short barcodes from the virus library that intersect with barcodes from the plasmid library and/or the ligated virus genomes. The majority of short (<12 nt) barcodes in the virus library are also present in the plasmid library and the ligated virus genomes. Top panel: the UpSet plot shows the total read counts associated with shorter barcodes in the virus library that intersect with the plasmid library and the ligated virus genomes. The shorter barcodes that overlap in all three libraries account for the majority of shorter barcodes in the virus library, suggesting that they did not arise during the course of infection.
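The two quantities summarized in each UpSet plot (number of virus-library barcodes per intersection, and the read counts they carry) can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names `cumulative_cutoff` and `upset_summary` are hypothetical, barcodes are assumed to be already clustered, and the cutoff is assumed to mean "keep the most abundant barcodes until they account for 99% of reads".

```python
def cumulative_cutoff(counts, fraction=0.99):
    """Keep the most abundant barcodes until they cover `fraction` of all reads.

    `counts` maps barcode sequence -> read count (assumed already clustered).
    """
    total = sum(counts.values())
    kept, running = {}, 0
    # Sort by descending count; break ties by sequence for reproducibility.
    for barcode, n in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0])):
        if running >= fraction * total:
            break
        kept[barcode] = n
        running += n
    return kept


def upset_summary(virus, plasmid, ligated):
    """Group virus-library barcodes by membership in the other two libraries.

    Returns {(in_plasmid, in_ligated): (num_barcodes, virus_read_count)}.
    """
    summary = {}
    for barcode, n in virus.items():
        key = (barcode in plasmid, barcode in ligated)
        num, reads = summary.get(key, (0, 0))
        summary[key] = (num + 1, reads + n)
    return summary


# Toy data (invented for illustration only).
virus = cumulative_cutoff({"AAAA": 900, "CCCC": 95, "GGGG": 5})
plasmid = {"AAAA": 10, "CCCC": 10, "TTTT": 1}
ligated = {"AAAA": 5, "TTTT": 2}
summary = upset_summary(virus, plasmid, ligated)
```

For a panel-B-style analysis, the same functions could be applied after restricting the virus library to short barcodes, for example `{b: n for b, n in virus.items() if len(b) < 12}`.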