Widespread Compensatory Evolution Conserves DNA-Encoded Nucleosome Organization in Yeast
A) Shown are the inferred C to T substitution rates for the S. cerevisiae lineage (X-axis) and for the other sensu stricto lineages (color coded, Y-axis). Each point represents the C to T substitution rate in one of the 4x4 flanking nucleotide contexts, which are defined by the 5′ and 3′ nucleotides depicted on top. The data reveal a four-fold variation in substitution rates across contexts, which is consistent among the lineages (as shown by the fit between the independently inferred substitution rates of S. cerevisiae and of the other lineages). Controlling for this variation is important when comparing substitution dynamics in A+T rich vs. A+T poor genomic regions, such as low and high occupancy sequences.
B) Different evolutionary dynamics in low and high occupancy loci. Shown are log-ratios of substitution rates in low vs. high occupancy sequences (Y-axis) plotted against the substitution rates in high occupancy sequences (X-axis). Each point represents the rate of one of four types of substitution (color coded) in loci flanked by the 5′ and 3′ nucleotides depicted inside the data point. Substitutions in reverse complementary contexts are averaged and shown only once. A/T-losing substitutions (red, pink) are ∼45% slower in low occupancy loci, an effect that is observed independently for transitions and transversions across the different flanking sequence contexts. A/T-gaining substitutions (blue, cyan) depend strongly on context: for the main group, rates are independent of nucleosome occupancy, whereas A/T gains in G/C flanking contexts are highly conserved in low occupancy sequences.
C) Averaged substitution trends. Shown are the overall rates of A/T-gaining and A/T-losing substitutions in high and low nucleosome occupancy (occ.) sequences, averaged over all contexts. The simplified divergence pattern is difficult to explain using standard models of selection, since different types of substitution are affected differently.
D) The S. cerevisiae lineage maintained the G+C content of low and high occupancy sequences. Shown is the average G+C content in the extant S. cerevisiae genome and in the inferred common ancestor of S. cerevisiae and S. paradoxus, depicted for 10 levels of S. cerevisiae nucleosome occupancy (Methods). The analysis suggests that the highly variable substitution rates shown in B do not drive divergence in net G+C content but instead take part in a conservative process.
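
The strand averaging and log-ratio comparison described for panel B can be illustrated with a minimal sketch. It is not the authors' pipeline: the key layout `(5′ flank, ancestral base, derived base, 3′ flank)`, the function names, the base-2 logarithm, and the toy rates are all assumptions made for illustration, and the numbers are not data from the study.

```python
import math
from collections import defaultdict

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def strand_partner(key):
    """Return the same substitution as read on the opposite strand.

    key = (five_prime, ancestral, derived, three_prime); this layout is an
    assumption of the sketch. Reading the other strand complements every
    base and swaps the two flanks.
    """
    five, ancestral, derived, three = key
    return (COMPLEMENT[three], COMPLEMENT[ancestral],
            COMPLEMENT[derived], COMPLEMENT[five])


def strand_averaged(rates):
    """Average each substitution with its reverse complementary partner,
    so that every pair is reported only once."""
    grouped = defaultdict(list)
    for key, rate in rates.items():
        canonical = min(key, strand_partner(key))  # one label per pair
        grouped[canonical].append(rate)
    return {key: sum(vals) / len(vals) for key, vals in grouped.items()}


def log_ratios(low, high):
    """log2(low / high) per context; base 2 is an assumed choice."""
    shared = low.keys() & high.keys()
    return {key: math.log2(low[key] / high[key])
            for key in shared if low[key] > 0 and high[key] > 0}


# Hypothetical rates: C->T flanked by A and G, and its partner G->A
# flanked by C and T on the opposite strand.
high_occ = {("A", "C", "T", "G"): 0.04, ("C", "G", "A", "T"): 0.06}
low_occ = {("A", "C", "T", "G"): 0.02, ("C", "G", "A", "T"): 0.03}

ratios = log_ratios(strand_averaged(low_occ), strand_averaged(high_occ))
for context, value in ratios.items():
    print(context, round(value, 3))
```

In this toy input the two entries collapse into a single context whose low occupancy rate is half the high occupancy rate. Under the same convention, a rate that is about 45% slower corresponds to a log2 ratio of roughly log2(0.55).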