Table 1.
Comparison of Z. bailii and Z. parabailii genome assemblies.
Fig 1.
Circos plot of relationships among the Z. parabailii ATCC60483 chromosomes.
In the outer arcs, purple and green coloring indicates A- and B-genes, respectively, on the Watson and Crick strands of each chromosome. Arcs in the center of the diagram link homeologous (A:B) gene pairs.
Fig 2.
(A) Histogram of the distribution of synonymous site divergence (KS) values for 10,087 Z. parabailii ATCC60483 genes compared to their closest Z. bailii CLIB213T homologs. (B) Pie chart showing the proportions of genes classified into each category. The 2 largest categories refer to A-genes and B-genes that are in A:B pairs. “N” means genes for which no Z. bailii homolog was found or KS to Z. bailii exceeded 0.25. “As” and “Bs” indicate other A-genes and B-genes, as analyzed in panel C. (C) Breakdown of the numbers of genes assigned to the A- or B-subgenomes that are not in A:B pairs. See S1 Data for category counts and KS values for each gene.
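The rule for category “N” in panel B can be written as a short check. This is only a minimal sketch: the function name and the use of `None` for a gene with no Z. bailii homolog are assumptions made for illustration, and only the 0.25 cutoff comes from the legend.

```python
from typing import Optional

# KS cutoff stated in the legend; above this a gene is placed in category "N".
KS_MAX = 0.25


def is_category_n(ks: Optional[float]) -> bool:
    """Return True if a gene belongs to category "N".

    ks is the synonymous divergence to the closest Z. bailii homolog,
    or None when no homolog was found (a hypothetical encoding).
    """
    return ks is None or ks > KS_MAX
```

For example, `is_category_n(None)` and `is_category_n(0.30)` both return `True`, whereas `is_category_n(0.07)` returns `False`. The criteria that separate A-genes from B-genes are not given in this legend and are therefore not modeled here.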
Table 2.
Z. parabailii ATCC60483 chromosomes and centromeres.
Fig 3.
Dot-matrix plot between Z. bailii CLIB213T scaffolds [38] and Z. parabailii ATCC60483 chromosomes.
Each dot is a protein-coding gene (purple, A-genes; green, B-genes). Red triangles indicate chromosome ends that appear unpaired due to break-induced replication (BIR). “M” and “m” indicate the active and broken MAT loci of Z. parabailii, respectively.
Fig 4.
Subgenome and duplication status of each Z. parabailii gene.
Each gene was classified into 1 of 7 categories and color-coded as shown in the legend. For each chromosome, 7 rows were then drawn, showing the locations of genes in each category (the 7 rows appear in the same order from top to bottom as in the legend). “R” shows the locations of ribosomal DNA (rDNA) clusters. “M” and “H” indicate the locations of MAT and HML/HMR loci. Circles with arrows mark the 3 chromosome ends where our sequence is incomplete due to break-induced replication (BIR); in each case, the missing sequence is apparently identical to the end of another chromosome, as shown. For example, we infer that at the right end of chromosome 14, our assembly artefactually lacks a second copy of the genes that are labeled as “A unique” on the right end of chromosome 9. The high sequence identity of the chromosome 9 and 14 copies of this region caused them to coassemble, and the coassembled contig was arbitrarily assigned to chromosome 9.
Fig 5.
(A) Organization of MAT, HML, and HMR loci in Z. parabailii ATCC60483. The genome contains 6 MAT-related regions, with 1 MAT, 1 HML, and 1 HMR locus derived from each of the A and B parents. Pink and green backgrounds indicate sequences from the A- and B-subgenomes, respectively. The MAT locus in the A-subgenome (position 294 kb on chromosome 7) is intact and expressed. The MAT locus of the B-subgenome has been broken into 2 parts by cleavage by HO endonuclease. All 6 copies of the X repeat region (654 bp) are identical in sequence, as are all 6 copies of the Z repeat region (266 bp). Gray triangles indicate the disruption of the splicing of intron 2 in MATα2 and HMLα2 of the B-subgenome. The binding sites for primers A–F used for PCR amplification are indicated by gray arrows. (B) Sequences at the MAT locus breakpoint. Red, MATα1-derived sequences. The HO cleavage site (CGCAGCA, giving a 4-nucleotide 3′ overhang) is highlighted in gray. Blue, the GDA1-YEF1 intergenic region from the equivalent region of Z. bailii CLIB213T and homologous sequences from the A-subgenome on Z. parabailii chromosomes (chrs.) 2 and 16. A 5-bp sequence (ACAAC) that became duplicated during the rearrangement is underlined. (C) Sequences of MATα2 intron 2 (lowercase) from the A- and B-subgenomes. An AG-to-AC mutation (red) at the 3′ end of the intron moved the splice site by 2 bp in the B-subgenome, causing a frameshift and premature translation termination. The splice sites in both genes were identified from RNA sequencing (RNA-Seq) data.
Fig 6.
(A,B) Ascospore formation in Z. parabailii ATCC60483. White arrows show conjugation tubes in dumbbell-shaped asci. Black arrows show budding vegetative cells. Scale bars, 10 μm. Cultures were grown on 5% malt extract agar for 6–10 days at 25°C. (C) Examples of PCR determination of MAT locus genotypes in tetrads. Pairs of PCR primers as shown in Fig 5A were used to amplify the MAT locus in colonies grown from spores after dissection of conjugated asci. PCR primer pairs AB and AE amplify the left side of the MAT locus, including the Z region (AB, 1,485-bp product from MATα; AE, 2,103-bp product from MATa). Primer pairs DF and DC amplify the right side of the MAT locus, including the X region (DF, 1,882-bp product from MATa; DC, 2,027-bp product from MATα). PCR products were sequenced to determine whether they originated from the A- or B-subgenome. (D) Summary of MAT genotypes in colonies grown from spores from 13 dissected tetrads. Magenta circles denote colonies with A-subgenome alleles (MATa_A or MATα_A), and green circles denote colonies with B-subgenome alleles (MATa_B or MATα_B). Half circles represent colonies that gave both MATa and MATα PCR products.
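The PCR genotyping logic of panel C amounts to a lookup from primer pair to MAT allele. The sketch below is illustrative only: the product sizes and allele assignments are taken from the legend, but the names `EXPECTED_PRODUCTS`, `mat_alleles`, and `is_mixed` are hypothetical, and the A- or B-subgenome origin (determined by sequencing the products) is not modeled.

```python
# Primer pair -> (side of MAT locus, allele, expected product size in bp),
# as listed in the legend.
EXPECTED_PRODUCTS = {
    "AB": ("left", "MATalpha", 1485),
    "AE": ("left", "MATa", 2103),
    "DF": ("right", "MATa", 1882),
    "DC": ("right", "MATalpha", 2027),
}


def mat_alleles(positive_pairs):
    """Return the sorted MAT alleles implied by the primer pairs that gave a product."""
    unknown = set(positive_pairs) - set(EXPECTED_PRODUCTS)
    if unknown:
        raise ValueError(f"unknown primer pair(s): {sorted(unknown)}")
    return sorted({EXPECTED_PRODUCTS[pair][1] for pair in positive_pairs})


def is_mixed(positive_pairs):
    """True if a colony gave both MATa and MATalpha products (a half circle in panel D)."""
    return len(mat_alleles(positive_pairs)) == 2
```

Under these assumptions, a colony positive for AB and DC would be scored as MATα only, whereas a colony positive for both AB and AE would be scored as mixed.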
Fig 7.
Cartoon of key steps in the origin of the Z. parabailii genome.
Chromosome regions (thick bars) are colored according to their location in Z. bailii (magenta outlines). The corresponding homeologous regions are scrambled in Parent B (green outlines). Circles represent centromeres. (i) Interspecies mating occurred between Parent A (Z. bailii) and Parent B. The genomes differed by about 34 rearrangement breakpoints and 7% nucleotide sequence divergence. The resulting zygote was unable to form viable spores because of the noncollinearity of its chromosomes. (ii) Expression of HO endonuclease in the zygote, due to the absence of a1-α2, resulted in cleavage of the B-copy of the MAT locus and ectopic recombination with the GDA1-YEF1 region of the A-subgenome, causing a reciprocal translocation. (iii) The resulting genome has only 1 functional MAT locus and behaves as a haploid. Recombinations and other exchanges between homeologous regions of the 2 subgenomes, such as those that exchanged the HML/HMR regions, occurred but are not shown here for simplicity. (iv) The current life cycle of Z. parabailii involves mating between 16-chromosome haploids to form 32-chromosome diploids, which immediately sporulate to regenerate 16-chromosome haploids. Z. parabailii is homothallic because it contains an intact HO gene, which allows interconversion between MATa and MATα haploids and hence autodiploidization. chrs., chromosomes.