Landscape of essential growth and fluconazole-resistance genes in the human fungal pathogen Cryptococcus neoformans
Fig 2
TN-seq enabled prediction of gene essentiality.
(A) 176 unique transposon insertions (orange vertical lines) are plotted along a region of Chromosome 1 centered on the known essential gene ERG11 and showing two flanking predicted nonessential genes. For nine of the displayed sites, we recovered transposon insertions in both orientations. (B) Flow chart depicting the random forest approach to classifying gene essentiality. (C) Schematic illustrating the parameters that describe each gene within the TN-seq data for machine learning. (D) Precision-recall curve showing the tradeoff between precision and recall for the random forest model. Each point is the mean of 100 replicates, in each of which the training data were randomly split into training and validation sets. For each split, the classification threshold was then varied from 0.01 to 0.99 in steps of 0.01. (E) The importance of each feature for the V2 model is plotted. Each importance value was calculated from the same 100 replicates of the training data. Error bars indicate standard deviation. (F) Histogram of the essentiality prediction score for the entire gene set of C. neoformans, based on the mean of 100 replicates. Data underlying A can be found in S1 Data at 10.5281/zenodo.15264486. Data underlying F can be found in S1 Table. Raw data underlying D–E can be found in the corresponding Excel sheets at 10.5281/zenodo.15264486.
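The threshold sweep described for panel D can be illustrated with a minimal Python sketch. It is not the authors' code. It assumes that each replicate yields a prediction score between 0 and 1 per validation gene and that genes scoring at or above the threshold are called essential. The function names (`precision_recall`, `mean_pr_curve`, `toy_replicate`) are hypothetical, and the scores are synthetic stand-ins rather than data from the study.

```python
import random

# Thresholds from 0.01 to 0.99 in steps of 0.01, as stated in the caption.
THRESHOLDS = [round(0.01 * k, 2) for k in range(1, 100)]

def precision_recall(scores, labels, threshold):
    """Precision and recall when scores >= threshold are called essential (assumed rule)."""
    pairs = list(zip(scores, labels))
    tp = sum(1 for s, y in pairs if s >= threshold and y == 1)
    fp = sum(1 for s, y in pairs if s >= threshold and y == 0)
    fn = sum(1 for s, y in pairs if s < threshold and y == 1)
    # Convention chosen here: precision is 1.0 when no gene is called essential.
    precision = tp / (tp + fp) if (tp + fp) else 1.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall

def mean_pr_curve(replicates):
    """Average precision and recall over replicates at every threshold.

    replicates: list of (scores, labels) pairs, one per random split.
    Returns a list of (threshold, mean_precision, mean_recall) tuples.
    """
    curve = []
    for t in THRESHOLDS:
        pr = [precision_recall(scores, labels, t) for scores, labels in replicates]
        mean_p = sum(p for p, _ in pr) / len(pr)
        mean_r = sum(r for _, r in pr) / len(pr)
        curve.append((t, mean_p, mean_r))
    return curve

def toy_replicate(rng, n=200):
    """Synthetic validation set: hypothetical labels and scores, not study data."""
    labels = [1 if rng.random() < 0.3 else 0 for _ in range(n)]
    scores = [min(1.0, max(0.0, rng.gauss(0.7 if y else 0.3, 0.15))) for y in labels]
    return scores, labels

rng = random.Random(0)
replicates = [toy_replicate(rng) for _ in range(100)]  # 100 replicates, as in the caption
curve = mean_pr_curve(replicates)
```

Each entry of `curve` corresponds to one point on a mean precision-recall curve. In a real analysis the scores would come from a model trained on the training portion of each split instead of from `toy_replicate`.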