Conserved Substitution Patterns around Nucleosome Footprints in Eukaryotes and Archaea Derive from Frequent Nucleosome Repositioning through Evolution

Nucleosomes, the basic repeat units of eukaryotic chromatin, have been suggested to influence the evolution of eukaryotic genomes, both by altering the propensity of DNA to mutate and by selection acting to maintain or exclude nucleosomes in particular locations. Contrary to the popular idea that nucleosomes are unique to eukaryotes, histone proteins have also been discovered in some archaeal genomes. Archaeal nucleosomes, however, are quite unlike their eukaryotic counterparts in many respects, including their assembly into tetramers (rather than octamers) from histone proteins that lack N- and C-terminal tails. Here, we show that despite these fundamental differences the association between nucleosome footprints and sequence evolution is strikingly conserved between humans and the model archaeon Haloferax volcanii. In light of this finding we examine whether selection or mutation can explain concordant substitution patterns in the two kingdoms. Unexpectedly, we find that neither the mutation nor the selection model is sufficient to explain the observed association between nucleosomes and sequence divergence. Instead, we demonstrate that nucleosome-associated substitution patterns are more consistent with a third model in which sequence divergence results in frequent repositioning of nucleosomes during evolution. Indeed, we show that nucleosome repositioning is both necessary and largely sufficient to explain the association between current nucleosome positions and biased substitution patterns. This finding highlights the importance of considering the direction of causality between genetic and epigenetic change.

We simulate 100,000 such units, mutate each independently (maximum one mutation per unit) and track a) whether a mutation occurred, b) in which ancestral context it occurred (in the linker or nucleosome state), and c) in which context it is now observed (i.e. a mutation might occur in the linker segment, but that segment might now be in a nucleosomal state if the nucleosome shifted). For each unit that we evolve, we have a matching partner (the "chimp unit"), which we evolve under the same parameters. Critically, we can now evaluate where mutations occurred in chimp relative to the observed human state (which, if the nucleosome moved, can be different from the chimp and/or ancestral state), exactly as we do empirically for the substitution data.
An example is illustrated in Figure S8 for a model where the probability of mutation is the same in nucleosome and linker state (P(mut)_nuc = P(mut)_link), but where a shift only occurs if the mutation falls in a nucleosomal region (much as we propose might frequently be the case for C to T substitutions). In our example, we consider two chromatin units in parallel (to illustrate equal mutation rates in nucleosome and linker). The first unit (unit A) mutates in the nucleosome state in humans, causing a shift, whereas the orthologous unit in chimp mutates in the linker region, causing no shift. The opposite happens for unit B. We can now ask, across units A and B, how many mutations we observe in the current linker and nucleosome state in humans (top left panel). More importantly, we can ask where mutations fall in chimp relative to the current human nucleosome state (grey arrows). In this example, we observe both chimp mutations in a nucleosome state, although one of them actually occurred in the linker state. Evidently, we might also observe cases where mutations fall in the same segment in both human and chimp (grey shaded box) so that we would end up observing similar trends in both species. On average (dashed box) we would observe a strong trend in humans but no trend in chimps.
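The scenario described above can be sketched as a small simulation. This is a minimal illustration under stated assumptions, not the implementation used for the supplementary figures: each unit is assumed to consist of exactly one nucleosome segment and one linker segment of equal size, a shift is assumed to swap the states of the two segments, and the function and parameter names (`evolve_lineage`, `simulate`, `p_mut_nuc`, `p_mut_link`, `p_shift_nuc`, `p_shift_link`) as well as the default values are hypothetical.

```python
import random


def evolve_lineage(rng, p_mut_nuc, p_mut_link, p_shift_nuc, p_shift_link):
    """Evolve one lineage of one unit (at most one mutation per unit).

    Returns (segment, shifted): `segment` is the ANCESTRAL state of the
    segment that mutated ("nuc", "link", or None if no mutation), and
    `shifted` says whether the mutation repositioned the nucleosome.
    Assumes p_mut_nuc + p_mut_link <= 1.
    """
    u = rng.random()
    if u < p_mut_nuc:
        segment, p_shift = "nuc", p_shift_nuc
    elif u < p_mut_nuc + p_mut_link:
        segment, p_shift = "link", p_shift_link
    else:
        return None, False
    return segment, rng.random() < p_shift


def current_state(ancestral_state, shifted):
    """Assumed shift rule: a shift swaps nucleosome and linker states."""
    if not shifted:
        return ancestral_state
    return "link" if ancestral_state == "nuc" else "nuc"


def simulate(n_units=100_000, p_mut_nuc=0.05, p_mut_link=0.05,
             p_shift_nuc=1.0, p_shift_link=0.0, seed=1):
    """Count mutations per lineage, classified by the CURRENT human state."""
    rng = random.Random(seed)
    params = (p_mut_nuc, p_mut_link, p_shift_nuc, p_shift_link)
    counts = {"human": {"nuc": 0, "link": 0},
              "chimp": {"nuc": 0, "link": 0}}
    for _ in range(n_units):
        h_seg, h_shift = evolve_lineage(rng, *params)
        # The chimp unit evolves under the same parameters; its own shift
        # is irrelevant here because we score against the human state.
        c_seg, _ = evolve_lineage(rng, *params)
        if h_seg is not None:
            counts["human"][current_state(h_seg, h_shift)] += 1
        if c_seg is not None:
            counts["chimp"][current_state(c_seg, h_shift)] += 1
    return counts


if __name__ == "__main__":
    print(simulate())
```

With equal mutation probabilities and shifting only after nucleosomal mutations (the defaults), every human mutation ends up in a segment that is currently linker, whereas chimp mutations split roughly evenly between the two current human states. This mirrors the averaged outcome described for Figure S8: a strong trend in human and a flat trend in chimp.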

Figure S8
Using different combinations of values for the four parameters, we then consider the following: if P(mut)_nuc = P(mut)_link, i.e. there is no nucleosome-associated mutation bias, can we replicate the array of trends we observe empirically simply by having differential rates of nucleosome shifting following mutation in the nucleosome and linker state?
The model reveals, as foreshadowed by the example given in Figure S8, that different rates of mutation-induced shifting are sufficient to generate patterns with strong trends in human but flat trends in chimp, where the direction and steepness of the human trend depend on the absolute and relative shifting frequencies. A representative sample of parameter combinations is given in Figure S9.
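The dependence on shifting frequencies can be made explicit with a short worked expectation. This is a sketch under simplifying assumptions that are not stated in the text: each unit has one nucleosome and one linker segment, a shift swaps their states, and the human and chimp lineages evolve independently. The symbols are introduced here for illustration only: p_n and p_l are the per-unit mutation probabilities in the ancestral nucleosome and linker segment, s_n and s_l are the probabilities of a shift following a mutation in each state, and q is the probability that the human nucleosome has shifted.

```latex
\begin{align}
E[\text{human, nucleosome}] &= p_n (1 - s_n) + p_l\, s_l \\
E[\text{human, linker}]     &= p_l (1 - s_l) + p_n\, s_n \\
q                           &= p_n s_n + p_l s_l \\
E[\text{chimp, nucleosome}] &= p_n (1 - q) + p_l\, q \\
E[\text{chimp, linker}]     &= p_l (1 - q) + p_n\, q
\end{align}
```

Setting p_n = p_l = p gives human expectations of p(1 - s_n + s_l) in the current nucleosome state and p(1 + s_n - s_l) in the current linker state, so the sign of the human trend follows the sign of s_n - s_l and its steepness follows the size of that difference. The chimp expectations both reduce to p, i.e. a flat trend. Under these particular assumptions the chimp trend is always flat in expectation, so this simplified version does not capture non-flat chimp trends, which would have to arise from features of the full model not reproduced here.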
In addition, under some parameter combinations we can also obtain non-flat trends in chimps even in the absence of mutation bias, both in the same direction as the human trend and, more strikingly, in the opposite direction (see Figure S10). For most parameter combinations (under P(mut)_nuc = P(mut)_link) we see flat trends in chimps, which is what we observe empirically. These results highlight that under a model without mutation bias, based solely on differential shifting, we largely expect to observe trends in human but no trends in chimp, although we might more rarely observe trends in chimp too. A full set of simulated substitution rates can be found in