Population genomics shows no distinction between pathogenic Candida krusei and environmental Pichia kudriavzevii: One species, four names
(A) Intron consensus sequence for P. kudriavzevii CBS573 genes, generated using MEME. Consensus sequences for the 5’ and 3’ splice sites (SS) and the branch point are shown, with the median distances (S1 and S2) between them, for all 205 introns in the nuclear genome. (B) Duplication and intron status of cytosolic ribosomal protein (RP) genes in P. kudriavzevii (Pkud) and S. cerevisiae (Scer). The four quadrants show the number of copies of each RP gene in the two species. Within each quadrant, the four columns show whether the gene contains an intron in both species (purple), in S. cerevisiae only (red), in P. kudriavzevii only (blue), or in neither species (black). Daggers (†) show cases where only 1 of the 2 copies of a gene in P. kudriavzevii contains an intron, and the asterisk (*) shows a case where only 1 of the 2 S. cerevisiae genes contains an intron. (C) Analysis of intron clustering. The distance, measured in genes, between each intron-containing gene and the next one (to its right in the genome) was calculated. The set of distances was then sorted so that introns close to other introns have low ranks. The plots show the running total of all distances up to a particular rank, for real introns (red points) and for 1000 simulated datasets in which introns were randomly assigned to genes (black points, with error bars ±1 s.d.). The plot on the left shows the result for all introns in the genome, and the plot on the right shows the result with ribosomal protein genes omitted.
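
The clustering analysis in panel C can be illustrated with a minimal Python sketch. It is not the authors' code. It assumes a single linear gene order (chromosome boundaries are ignored) and that each intron-containing gene is identified by its integer index in that order. The function names `running_totals` and `simulate` and the `seed` parameter are hypothetical, introduced only for this illustration.

```python
import random
import statistics
from itertools import accumulate


def running_totals(intron_gene_positions):
    """Running total of sorted gene-to-gene distances between neighbouring
    intron-containing genes (small distances, i.e. clustered introns, first)."""
    pos = sorted(intron_gene_positions)
    # Distance, in genes, from each intron-containing gene to the next one
    # to its right; the last gene has no right-hand neighbour.
    gaps = sorted(b - a for a, b in zip(pos, pos[1:]))
    return list(accumulate(gaps))


def simulate(n_genes, n_intron_genes, n_sims=1000, seed=0):
    """Mean and s.d. of the running totals at each rank when introns are
    assigned to genes at random (assumed null model: uniform sampling
    of genes without replacement)."""
    rng = random.Random(seed)
    curves = [
        running_totals(rng.sample(range(n_genes), n_intron_genes))
        for _ in range(n_sims)
    ]
    by_rank = list(zip(*curves))  # one tuple of simulated values per rank
    mean = [statistics.mean(col) for col in by_rank]
    sd = [statistics.stdev(col) for col in by_rank]
    return mean, sd
```

If the observed curve from `running_totals` lies below the simulated mean at low ranks by more than the simulated spread, the real introns sit closer together than expected by chance. Repeating the calculation after removing ribosomal protein genes from both the gene order and the intron list corresponds to the right-hand plot.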