Comparing DNA replication programs reveals large timing shifts at centromeres of endocycling cells in maize roots

Plant cells undergo two types of cell cycles–the mitotic cycle in which DNA replication is coupled to mitosis, and the endocycle in which DNA replication occurs in the absence of cell division. To investigate DNA replication programs in these two types of cell cycles, we pulse labeled intact root tips of maize (Zea mays) with 5-ethynyl-2’-deoxyuridine (EdU) and used flow sorting of nuclei to examine DNA replication timing (RT) during the transition from a mitotic cycle to an endocycle. Comparison of the sequence-based RT profiles showed that most regions of the maize genome replicate at the same time during S phase in mitotic and endocycling cells, despite the need to replicate twice as much DNA in the endocycle and the fact that endocycling is typically associated with cell differentiation. However, regions collectively corresponding to 2% of the genome displayed significant changes in timing between the two types of cell cycles. The majority of these regions are small with a median size of 135 kb, shift to a later RT in the endocycle, and are enriched for genes expressed in the root tip. We found larger regions that shifted RT in centromeres of seven of the ten maize chromosomes. These regions covered the majority of the previously defined functional centromere, which ranged between 1 and 2 Mb in size in the reference genome. They replicate mainly during mid S phase in mitotic cells but primarily in late S phase of the endocycle. In contrast, the immediately adjacent pericentromere sequences are primarily late replicating in both cell cycles. Analysis of CENH3 enrichment levels in 8C vs 2C nuclei suggested that there is only a partial replacement of CENH3 nucleosomes after endocycle replication is complete. The shift to later replication of centromeres and possible reduction in CENH3 enrichment after endocycle replication is consistent with a hypothesis that centromeres are inactivated when their function is no longer needed.

AUTHOR SUMMARY

In traditional cell division, or mitosis, a cell's genetic material is duplicated and then split between two daughter cells. In contrast, in some specialized cell types, the DNA is duplicated a second time without an intervening division step, resulting in cells that carry twice as much DNA, a phenomenon called an endocycle, which is common during plant development. At each step, DNA replication follows an ordered program, in which highly compacted DNA is unraveled and replicated in sections at different times during the synthesis (S) phase. In plants, it is unclear whether traditional and endocycle programs are the same. Using root tips of maize, we found a small portion of the genome whose replication in the endocycle is shifted in time, usually to later in S phase. Some of these regions are scattered around the genome, and mostly coincide with active genes.
However, the most prominent shifts occur in centromeres. This location is noteworthy because centromeres orchestrate the process of separating duplicated chromosomes into daughter cells, a function that is not needed in the endocycle. Our observation that [...]

[...] replication in atxr5 and atxr6 mutants [24].

In addition, there is as yet no information as to whether changes in RT programs are associated with endoreduplication or differentiation in plant systems. That such changes might occur in association with differentiation is supported by reports of extensive changes in RT between animal cell cultures representing different embryonic or differentiated cell types (e.g. [13,25-27]).

To address these questions in the maize root tip system, we carried out a detailed comparison of RT dynamics in mitotic and endocycling cells. To isolate endocycling nuclei, we focused on a root segment 1-3 mm from the apex where there is a higher proportion of endocycling cells and used flow cytometry to separate nuclei of higher ploidy. We found very little evidence for changes in copy number that would be associated with over- or under-replication, and the RT profiles for the vast majority of the genome are very similar. However, we found significant changes in timing for a number of loci that together correspond to 2% of the genome. Most notably, we found major changes in the RT of centromeres, which replicate mainly during mid S phase in mitotic cells, but primarily in late S phase of the endocycle.

Separating endocycling from mitotic nuclei

As reported previously and described in Methods, we used a 20-min pulse of the thymidine analog, EdU, to label newly replicated DNA in intact maize roots. This was followed by formaldehyde fixation and isolation of nuclei from defined segments of root tips (Fig 1A). Incorporated EdU was conjugated with Alexa Fluor 488 (AF-488) by "click" chemistry [28]. The nuclei were then stained with DAPI and fractionated by two-color fluorescence activated flow sorting to generate populations at different stages of the mitotic cell cycle or the endocycle [8,9]. Fig 1B and 1C show flow cytometry profiles obtained for root segments 0-1 mm and 1-3 mm from the tip, respectively. Fluorescent signals from nuclei that incorporated EdU during S phase of a normal mitosis form an "arc" between 2C and 4C DNA contents, while nuclei labeled during the endocycle S phase form a similar arc between 4C and 8C. As seen in Fig 1C, the endocycle arc is more prominent in nuclei preparations from 1-3 mm root segments. To analyze endocycle RT, which we will describe in detail below, we separated labeled nuclei representing early, mid, and late S-phase fractions using the sorting gates shown in Fig 1C, [...]

[...] animal systems, we investigated whether there are local copy number differences in the maize genome after endocycle replication. To do this, we used the non S-phase 2C, 4C, and 8C nuclei populations described above, and carried out whole genome paired-end sequencing. To gain a better representation of the copy number of repeat regions in the genome, reads that could not be uniquely mapped to a single location were included, but we retained only the primary alignment location for each read pair. These data were examined for regions in which normalized read frequencies in 5-kb windows differed between 8C and 4C or 4C and 2C nuclei, using procedures described by Yarosh et al.
([29]; S1 Text). We found about 5% of the 5-kb windows had ratio values that fell outside of two standard deviations of the mean ratio for 8C [...]

[...] varied between ~3 and 11× genome coverage per S-phase sample, so all samples were randomly downsampled to ~3× coverage to ensure comparable results (see Methods and S1 Spreadsheet).

We used the Repliscan analysis pipeline [14] to generate profiles of replication activity in early, mid and late fractions of each S phase. These profiles were generated by aggregating the Repli-seq read densities for each S-phase sample in 3-kb static windows, scaling the reads to 1× genome coverage, and then dividing by the scaled read counts from the unlabeled 2C reference data and smoothing by Haar wavelet transform (see Methods and [14]). Normalizing with the 2C reference corrected for differences in sequencing efficiencies and collapsed repeats that caused "spikes" in the data (illustrated for late replication in the endocycle in S3 Fig), producing an estimate of replication intensity or "signal" in each 3-kb window. We also excluded 3-kb windows with extremely low read coverage in the 2C reference sample (see Methods) from all analyses ("blacklist" windows, indicated by black tick marks in Fig 1E). [...]

Despite the global similarity of the RT programs of mitotic and endocycling cells, there are regions scattered around the maize genome that show a shift in RT. To identify timing differences, we first calculated the difference in normalized replication signal between the mitotic and endocycle data at each genomic location for the early, mid and late profiles separately (S1 Table; [...] where there was an equal and opposite timing difference in at least one other S-phase fraction (for example, regions in which a decrease in early replication signal in endocycling cells was associated with a corresponding increase in mid and/or late S-phase signal at the same location).
We allowed a gap distance of 6 kb when searching for regions with timing differences to account for small blacklist regions that break up larger regions of change. We found that 11% of the genome showed a difference in timing of at least 10% of the total difference range for a given profile (difference in replication signal ≥ 0.4; S1 Table), with an opposite timing difference at the same threshold criterion at the identical location in another S phase profile. Many of these regions are small, with the lower 50% of regions ranging in size from 3 kb to the median size of 33 kb (S2 Table), and it is not clear if such small alterations are biologically relevant.

To identify more robust differences, designated Regions of Altered Timing (RATs), we identified regions in which the difference in replication signal was ≥ 25% of the total difference range for a given profile (difference in replication signal ≥ 1.0; S1 Table), and which also met the criterion of having an opposite difference in at least one other profile. To highlight larger and contiguous regions of change, we included ≥ 10% regions that were adjacent to the original ≥ 25% regions. However, RATs had to have at least one core region where the timing change was at least 25% (S2 Table) to be included in our analysis. Representative [...]

[...] probably be long continuous RATs (see Fig 3).

Robust RATs fall into two categories, those where the strongest replication signal occurs later in the mitotic cycle than it does in the endocycle ("Later-to-Earlier" shift), and those in which the strongest signal occurs earlier in the mitotic cycle than in the endocycle ("Earlier-to-Later" shift). In addition, we separately characterized a subset of the Earlier-to-Later RATs that are located in functional centromeres ("Earlier-to-Later-CEN") using centromere ( [...]

[...] windows that contain a non-CEN Earlier-to-Later RAT that met our compensation criteria.
Timing differences between early and mid profiles are shown in S13 Fig.
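The signal normalization and difference calculation described in this section can be illustrated with a minimal Python sketch. It is not the Repliscan implementation: the scaling to 1× coverage is approximated here by dividing each profile by its mean window count, the Haar wavelet smoothing step is omitted, the sign convention (endocycle minus mitotic) is an assumption, and all read counts are hypothetical.

```python
import numpy as np

def scale_to_unit_coverage(counts):
    """Rescale window counts so the mean window equals 1 (stand-in for 1x scaling)."""
    counts = np.asarray(counts, dtype=float)
    return counts / counts.mean()

def replication_signal(sample_counts, reference_counts, blacklist):
    """Ratio of scaled S-phase counts to scaled 2C reference counts per window."""
    sample = scale_to_unit_coverage(sample_counts)
    reference = scale_to_unit_coverage(reference_counts)
    keep = ~np.asarray(blacklist, dtype=bool)
    signal = np.full(sample.shape, np.nan)  # blacklisted windows stay NaN
    signal[keep] = sample[keep] / reference[keep]
    return signal

# Hypothetical read counts for six 3-kb windows.
# Window 2 mimics a collapsed repeat ("spike"); window 3 has no reference coverage.
reference_2c = [10, 10, 40, 0, 10, 10]
blacklist = [False, False, False, True, False, False]
mitotic_early = [20, 20, 80, 0, 5, 5]
endo_early = [10, 10, 40, 0, 10, 10]

mitotic_signal = replication_signal(mitotic_early, reference_2c, blacklist)
endo_signal = replication_signal(endo_early, reference_2c, blacklist)

# Assumed sign convention: positive values mean more signal in the endocycle.
difference = endo_signal - mitotic_signal
print(np.round(difference, 2))
```

In this toy example the spike window receives the same mitotic signal as its neighbors after division by the 2C reference, which is the behavior the normalization is meant to achieve.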

Non-centromeric RATs
We analyzed the non-CEN RATs for the content of genes and TEs, as well as the presence of histone modifications and functional annotations related to the genes within RATs. To assess whether the percentage of RATs containing genes differed from random expectation, we randomly shuffled coordinates corresponding to the non-CEN Later-to-Earlier and Earlier-to-Later RATs around the genome 1000 times and calculated the percentage of regions that overlap genes in each set. We found that 93% and 96% of Later-to-Earlier and Earlier-to-Later RATs, respectively, contain at least one annotated gene and usually contain a small cluster of genes (Tables 1 and S3). Using root-tip RNA-seq data that are not specific to mitotic or endocycle cells, we found that although only 50% of the 682 genes found in non-CEN RATs are expressed at a meaningful level (FPKM ≥ 1; S3 Table), 83% and 91% of Later-to-Earlier and Earlier-to-Later RATs, respectively, contain at least one expressed gene (Table 1) [...]

[...] RATs. Given that these regions are shifting to a later RT in the endocycle, a decrease in gene expression would be expected [12]. Clearly, however, more work will be needed to confirm this hypothesis.

In the maize genome, genes are generally clustered in "islands" interspersed with blocks of transposable elements [41-43]. We used a permutation strategy similar to that used for the genes to estimate the significance of any differences in percent coverage of each TE superfamily in non-CEN RATs as compared to random expectation, estimated from 1000 randomly shuffled sets. The TE annotations were from the recent RefGen_v4 TEv2 disjoined annotation, where every bp is assigned to a single TE [39]. We found the coverage of the RLG/Gypsy superfamily in Earlier-to-Later RATs is significantly less than random expectation (permutation P value ≤ 0.001; S4 Table).
There are other, less significant, positive and negative associations with TE superfamilies in non-CEN RATs, including RLC/Copia, DTT/Tc1-Mariner, DTM/Mutator and DHH/Helitron (S4 Table). We also found that the percent AT content in RATs is similar to that of the genome as a whole, with median values of 55% and 56% for Later-to-Earlier and Earlier-to-Later RATs, respectively, and a median value of 55% for the whole genome (S10 Fig). [...]

[...] replication timing data, for example, we found that on average 45% of all reads that map to centromeres could be uniquely mapped to a single location (S11 Fig). Only these uniquely mapping reads were used for further analysis. In addition, most of the maize centromere assemblies are relatively intact, and functional centromeres have been located by mapping ChIP-seq reads for CENH3 [38]. When combined with our replication timing data, these features of the maize system create a unique opportunity to assess RT programs for centromeres.

Our analysis found large, robust RATs across seven of the ten centromeres (Figs 3C, 3D and S12), with replication occurring mainly in mid S in mitotic cells, but changing to primarily late S in endocycling cells. It is also noteworthy that though replication occurs mainly in mid S in mitotic cells, there are some distinct peaks of early replication inside or directly adjacent to the called centromere (indicated by black arrowheads in Fig 3 and S12) in all but one of the maize centromeres. These early peaks remain in the endocycle, though in some cases there is a reduction in early signal with a concomitant increase in mid signal at the same location. The seven centromeres that contain robust RATs (CEN 2, 3, 4, 5, 8, 9 and 10) were previously classified as "complex" because they contain a mixture of retrotransposons with some centromere satellite repeat arrays (CentC; [40,47]).
In the RefGen_v4 genome assembly, CEN 9 has two called CENH3-binding regions [38], which we refer to as CEN 9a and 9b (Fig 3C; black bars). Interestingly, we only found a robust RAT in the larger CEN 9a, with the smaller CEN 9b showing almost no timing shift.

The remaining three centromeres (CEN 1, 6, and 7) were previously characterized as [...]

[...] Table), suggesting many of these uncompensated differences may result from technical variation rather than from meaningful biological differences. In contrast, nearly all (85%) of the centromeric windows have compensated RT shifts. [...]

[...] so these families were grouped together in Fig 4C. When present in centromeres, all three major classes of elements (genes, CRM1/2, and CentC repeats) clearly replicate later during the endocycle than in the mitotic cycle (Fig 4). In contrast, genes and CRM elements in the pericentromere show little or no timing shifts. A full analysis of the replication times of CentC repeats in pericentromeres is hampered by the limited representation of this repeat class in the genome assembly (Fig 4D and S14E).

Chromatin features in centromeres
We also examined activating (H3K56ac and H3K4me3) and repressive (H3K27me3) histone post-translational modifications to look for epigenetic changes in centromeres after endocycle replication. It was previously reported that some H3K4me3 and H3K27me3 peaks of enrichment occur in the centromere, mainly associated with genes [50]. We asked whether genes that have these modifications continue to have them after mitotic and endocycle replication, and found very few changes in the number of genes with these modifications at each ploidy level (S15 Fig). There was also very little change in the fold enrichment of these histone marks in centromere genes when comparing 2C, 4C and 8C nuclei.

We also investigated the enrichment of dimethylated histone H3 lysine 9 (H3K9me2) in each centromere. Previous work indicated there is a depletion of H3K9me2 in centromeres relative to adjacent pericentromeres [51, 52], which we observed as well (S16 Fig). Traditional peak calling tools are not effective for H3K9me2 because of its even distribution across the maize genome. Instead, we estimated the fold enrichment by calculating the percent of total H3K9me2 ChIP reads in a given centromere region (using coordinates from [38]) and dividing by the percent of total input reads corresponding to that centromere in three biological replicates. We found a similar H3K9me2 average fold enrichment for all centromeres and for 2C, 4C and 8C nuclei, although values for 4C and 8C nuclei were consistently slightly higher than those for 2C nuclei (S16A Fig). [...]

[...] CENH3 binding in sorted non S-phase 2C, 4C, and 8C populations of nuclei. It is important to note that the 4C nuclei come from a mixture of cells, some of which will return to the mitotic cycle and others that will continue on to the endocycle (at least 13% of nuclei in the 1-3 mm region). We
We 528 asked whether the location or level of CENH3 enrichment changed after DNA replication in the 529 mitotic cycle or the endocycle. For visualization of CENH3 localization, ChIP-seq read counts 530 from three biological replicates for each ploidy level were aggregated in 3-kb windows and 531 normalized to the level of a uniform 1× genome coverage, so that corresponding windows in the 532 different ploidy level profiles were comparable. The normalized read count in each 3-kb window 533 was then divided by the corresponding normalized read count for the corresponding ploidy input 534 DNA to calculate a fold enrichment relative to DNA content value for CENH3 binding 535 sequences in that window. The spatial distribution of CENH3 enrichment across the centromeres 536 remained the same in 2C, 4C, and 8C cells. This is illustrated for CEN 9 and CEN 10 in Fig 5A  537 and 5B, and data for the rest of the centromeres are shown in S17 Fig. There are also a few small 538 spikes of CENH3 enrichment outside the called centromere (e.g. seen in Fig 5 and S17, but also 539 occasionally further out on the arms). These spikes also remain in the same location between 2C, 540 4C and 8C cells, some of which could be related to misassembly of the reference genome. 541 However, if real, these ectopic CENH3 peaks are less numerous and more persistent in G2 than 542 those recently observed in HeLa cells [61]. To compare total CENH3 content of entire centromeres at different ploidy levels, we 559 calculated the percent of total CENH3 reads found in a given centromere and made a ratio to the 560 percent of total reads from the corresponding input DNA in that centromere separately for each 561 biological replicate, as described above for H3K9me2. The CENH3 average fold enrichment 562 relative to total DNA content is similar for 2C and 4C nuclei in each of the complex centromeres 563 ( Fig 5C), with an average 4C/2C enrichment ratio of 1.1 (S7 Table). 
However, CENH3 enrichment decreases with the increase in ploidy from 4C to 8C (Fig 5C), with an average 8C to 4C enrichment ratio of only 0.7 (S7 Table). Average CENH3 enrichment values for simple centromeres were lower and slightly more variable, likely because of assembly issues. In both cases, however, the ratio of CENH3 enrichment in 8C cells to that in 4C cells is clearly higher than 0.5, which would be expected if there was no incorporation of new CENH3 after endocycle replication, but smaller than the 1.0 ratio expected if there was full replacement (S7 Table). It is worth noting that these data refer to post-replication 8C nuclei, which exited S phase prior to the time of analysis, and that post-replication 4C nuclei show no dilution of CENH3 relative to DNA content. Thus, our data are consistent with a model in which the CENH3 to DNA ratio is reduced as DNA replicates during the endocycle S phase, and only partially restored after completion of S phase. [...]

We generated whole genome Repli-seq data for root cell nuclei undergoing DNA replication in either the mitotic cycle or the endocycle, making use of in vivo EdU labeling of intact root tips and two-color fluorescence activated nuclei sorting. By doing so, we avoided potential artefacts caused by cell synchronization [65] and chromosome aberrations often found in plant and animal cell cultures (e.g. [66-68]). We present replication activity profiles for early, mid and late replication separately, instead of collapsing the data into an early:late ratio as many studies do. The rationale for this approach is that, for roughly one third of the maize genome, we previously found heterogeneity in mitotic RT, e.g. regions of the genome in which root tip cells exhibit significant replication activity in both early and mid S, or both mid and late S [12]. An
An 592 additional advantage to presenting the replication profiles separately is the ability to assess 593 whether there are concomitant or "compensated" changes in a region at multiple stages of S 594 phase. This compensation criterion helped us separate RT shifts that could be subject to technical 595 error, such as alterations in flow sorting gates, from shifts that are more likely to represent 596 meaningful changes in the population preference to replicate a replicon or cluster of replicons at 597 a particular time in S phase. 598

DISCUSSION
The current study sought to investigate whether the mitotic RT program is maintained in the first round of the endocycle in maize root cells, despite the need to replicate twice as much [...]

[...] were similar to those for biological replicates within each type. The high level of reproducibility is particularly noteworthy in the case of the early replication profiles, given that the flow sorting gate for early replicating nuclei in the endocycle had to be adjusted to minimize contamination from late replicating mitotic nuclei (Fig 1C). This overall conservation of RT programs suggests that the process of re-establishing the RT program must be similar for the two types of cell cycles in maize roots. In animal systems, re-establishment of the RT program has been shown to occur in G1 of each cell cycle at a "timing decision point" [...]

In the case of centromeres, it is easy to imagine that the large shifts to later replication are related specifically to endocycling, because endocycling cells no longer require functional centromeres. Though often broken by unmappable and multi-mapping ("blacklist") regions in the genome assembly, when combined, centromeric RATs are much larger in size than the non-centromeric RATs and cover the majority of each of the seven complex centromeres (S5 Table). [...]

The presence of distinct peaks of early replication in or adjacent to functional centromeres (arrowheads in Fig 3 and S12) is noteworthy because they signify a population preference for initiation in early S phase at these loci. This observation is of particular interest because yeast centromeres contain a replication origin that is the first to initiate on its respective chromosome and plays a role in centromere specification [80].
In maize, there is no evidence that these early regions in centromeres are the first to replicate on the entire chromosome, but they [...]

[...] centromeres, which showed that, as a group, these repeats consistently shift RT from mid to late. Another piece of evidence comes from our analysis of complex centromeres, which showed that the magnitude of the RT change tapers off toward the outer edges of the functional centromere. One can speculate that the simple centromere assemblies consist mostly of the sequences at the edges of the actual centromere, which would still be anchored to nonrepetitive regions in the genome assembly. As in complex centromeres, these edge sequences might have a smaller [...]

In our analysis of CENH3 enrichment relative to DNA content in maize root cells, the population of 4C nuclei appears to have a full complement of CENH3, which would be consistent with the previous results for plant species. This result suggests a model in which the sub-population of 4C cells entering the endocycle also carry a full complement of CENH3. If that model is correct, our data for 8C nuclei imply that CENH3 is only partially replaced after DNA replication in the endocycle. Because the population of 8C nuclei we analyzed likely represents a mixture of cells that recently exited endocycle S phase and others that exited some time ago, we cannot determine whether CENH3 will be fully restored in all cells at a later time. However, it is clear that the ratio of CENH3 to DNA is not immediately restored, and the lower ratio is widely distributed across all ten centromeres.

It is unlikely that endocycling cells will ever re-enter the mitotic cycle [1,96,97] [...]

[...] between 4C and 8C, corresponding to early, mid and late S phase of the endocycle. The early endocycle gate was shifted slightly to the right to exclude mitotic nuclei in late S phase (Fig 1C).
For each biological replicate, between 50,000 and 200,000 nuclei were sorted from each fraction of the endocycle S phase. A small sample of nuclei from each gate was sorted into CLB buffer containing DAPI and reanalyzed to determine the sort purity (S1 [...]

Trimming and quality control of 100-bp paired-end Repli-seq reads were carried out as described [...]

[...] resulting mitotic Repli-seq data were more than 3× the sequencing coverage of the endocycle Repli-seq data (S1 Spreadsheet). Repli-seq results are robust at various sequencing depths [14], but to ensure that the mitotic and endocycle data were comparable, the reads were downsampled by a uniform random process using a custom Python script incorporating the BEDTools suite [114] to a total of 65.7 million reads per sample, representing almost 3× genome coverage for each S-phase fraction (S1 Spreadsheet). We preferred this to normalization so that any possible sampling bias due to sequencing depth would be similar in all samples. [...]

[...] were used. Read densities were aggregated in 3-kb windows across the genome (parameter -w 3000). Additionally, we customized the cutoff for reducing type I errors, which excluded genomic windows with extremely low coverage in the 2C reference sample. To identify these low read mapping windows, which we labeled "blacklist", Repliscan log-transformed the read counts from the pre-replicative 2C reference sample and windows with read counts in the lower 2.5% tail of a fitted normal distribution were excluded from all samples (parameter --pcut 2.5-100).
The upper 2.5% tail containing extremely high coverage windows or "spikes" was not removed at this step, because we found that these data spikes were adequately normalized in the subsequent step of dividing each 3-kb window in the S-phase samples by the 2C reference data, which also normalized for sequencing biases and collapsed repeats (S3 Fig) [...]

The difference between normalized signal profiles of mitotic and endocycle Repli-seq data for early, mid, and late S was calculated in 3-kb windows, and the maximum negative and positive differences were then calculated for each chromosome and averaged. Regions showing a timing difference of ≥ 25% (difference in replication signal ≥ 1.0) or ≥ 10% (difference in replication signal ≥ 0.4) of the total range of differences in each profile were identified (S1 Table; [...] analysis only if their timing differences were "compensated" by opposite timing difference(s) of ≥ 25% or ≥ 10%, respectively, in one or both of the other two S-phase fractions. For example, a decrease in early replication signal in endocycling cells must be compensated by an increase in mid and/or late S-phase signal in the same cell population. Adjacent 3-kb windows with timing differences that met either the ≥ 10% or ≥ 25% threshold were merged, keeping the two files separate, using mergeBED in the BEDTools suite, and allowing a 6 kb gap distance (parameter -d 6000) [114]. This initial step resulted in many very small regions being identified (S2 Table). As a second step, if ≥ 10% regions were immediately adjacent to ≥ 25% regions, they were merged together using mergeBED to highlight larger regions of contiguous change (S2 Table). Only regions that contained at least one ≥ 25% region were kept for further analysis, and termed Regions of Altered Timing (RATs).
By requiring a ≥ 25% RT change core region to be included, all of the stand-alone, extremely small regions (< 24 kb) were effectively filtered out, without the requirement of an arbitrary size filter. RATs were categorized into three groups: 1) later in mitotic to earlier in endocycle (Later-to-Earlier), 2) earlier in mitotic to later in endocycle (Earlier-to-Later) and 3) a subset of the Earlier-to-Later RATs that were located in the previously identified functional centromeres (Earlier-to-Later-CEN) (coordinates from [38]). There were no Later-to-Earlier-CEN RATs. For a list of RAT regions, including genomic coordinates and genes within them, see S2 and S3 Spreadsheets.
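The RAT identification logic described above can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the pipeline used in the study: merging is done with a small hypothetical helper rather than mergeBED, the two-step merge is collapsed into one pass over the ≥ 10% windows (every ≥ 25% compensated window also passes the ≥ 10% test), differences are assumed to be endocycle minus mitotic signal, and the example values are invented.

```python
import numpy as np

WINDOW = 3000  # 3-kb windows, as in the text

def compensated(diffs, threshold):
    """Flag windows with a gain in one S-phase profile and a loss in another.

    diffs: array of shape (3, n_windows) holding early, mid and late
    differences (assumed here to be endocycle minus mitotic signal).
    """
    d = np.asarray(diffs, dtype=float)
    gain = (d >= threshold).any(axis=0)
    loss = (d <= -threshold).any(axis=0)
    return gain & loss

def merge_windows(flags, max_gap_bp=6000, window=WINDOW):
    """Merge flagged windows separated by at most max_gap_bp (like a -d 6000 merge)."""
    regions = []
    for i in map(int, np.flatnonzero(flags)):
        start, end = i * window, (i + 1) * window
        if regions and start - regions[-1][1] <= max_gap_bp:
            regions[-1][1] = end
        else:
            regions.append([start, end])
    return [tuple(r) for r in regions]

def call_rats(diffs, weak=0.4, strong=1.0):
    """Keep merged >= 10% regions that contain at least one >= 25% core window."""
    weak_regions = merge_windows(compensated(diffs, weak))
    core_starts = np.flatnonzero(compensated(diffs, strong)) * WINDOW
    return [(s, e) for s, e in weak_regions
            if ((core_starts >= s) & (core_starts < e)).any()]

# Hypothetical difference profiles for ten windows on one chromosome.
early = [-1.2, -0.5, 0, 0, -0.5, 0, 0, 0, -0.5, 0]
mid = [0] * 10
late = [1.1, 0.6, 0, 0, 0.5, 0, 0, 0, 0.4, 0]

rats = call_rats([early, mid, late])
print(rats)
```

In this example the first merged region is retained because it contains a core window passing the 1.0 threshold, whereas the isolated window near 24 kb passes only the 0.4 threshold and is dropped, mirroring how small stand-alone regions are filtered out without a size cutoff.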

ChIP-seq data analysis
ChIP-seq reads for H3K27me3, H3K4me3, H3K56ac (100-bp paired-end reads), H3K9me2 and CENH3 (150-bp paired-end reads) were trimmed, mapped to maize B73 RefGen_v4.33, and filtered to retain only properly-paired, uniquely-mapped reads (MAPQ score > 10) as described above for Repli-seq reads. The 2C ChIP and input data for H3K27me3, H3K4me3 and H3K56ac are from [12], while the 4C and 8C ChIP data were generated for this study (see S1 Spreadsheet). For details on peak calling and analysis for H3K27me3, H3K4me3 and H3K56ac, see S1 Text.

For visualization of CENH3 localization in 2C, 4C and 8C nuclei, read counts for individual biological replicates of CENH3 or input samples were scaled to 1× genome coverage using the reads per genomic content (RPGC) method. Biological replicate data had good agreement (Pearson's correlation coefficient values between biological replicates of 0.97-0.99; S1 Spreadsheet), and were merged and scaled again to 1× coverage so the samples would be comparable. CENH3 scaled read counts in each 3-kb window were divided by the scaled read counts from the input sample for the corresponding ploidy level, resulting in CENH3 fold enrichment values relative to input.

To compare CENH3 enrichment relative to DNA content in 2C, 4C and 8C cells over entire centromeres, we calculated the percent of total CENH3 reads found in a given centromere [...]
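The whole-centromere enrichment calculation described in the Results (percent of ChIP reads in a centromere divided by the percent of input reads in the same centromere, per replicate) can be sketched as below. The function names and all read counts are hypothetical and chosen only to illustrate the arithmetic and the interpretation of the 8C to 4C ratio.

```python
def region_fold_enrichment(chip_in_region, chip_total, input_in_region, input_total):
    """Percent of ChIP reads in a region divided by percent of input reads there."""
    chip_percent = 100.0 * chip_in_region / chip_total
    input_percent = 100.0 * input_in_region / input_total
    return chip_percent / input_percent

def mean_fold_enrichment(replicates):
    """Average the per-replicate fold enrichment values."""
    values = [region_fold_enrichment(*rep) for rep in replicates]
    return sum(values) / len(values)

# Hypothetical counts for one centromere, three replicates per ploidy level:
# (ChIP reads in centromere, total ChIP reads, input reads in centromere, total input reads)
reps_4c = [(2000, 1_000_000, 100, 1_000_000)] * 3
reps_8c = [(1400, 1_000_000, 100, 1_000_000)] * 3

enrichment_4c = mean_fold_enrichment(reps_4c)
enrichment_8c = mean_fold_enrichment(reps_8c)
ratio_8c_to_4c = enrichment_8c / enrichment_4c
print(round(ratio_8c_to_4c, 2))
```

Under the reasoning given in the Results, a ratio near 0.5 would indicate no incorporation of new CENH3 after endocycle replication and a ratio near 1.0 would indicate full replacement; intermediate values are consistent with partial replacement.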

Analysis of features in RATs and random permutation analysis
We tested the association of various genomic features with the non-CEN RAT categories by determining the overlap of a particular feature with each RAT type. The coordinates for genomic features (genes, expressed genes, TE superfamilies) were intersected with RAT coordinate intervals using intersectBED (parameters -wa -wb) in the BEDTools suite [114]. The percent of RATs containing a feature or the percent coverage of genes and TE superfamilies were computed and compared to values for the genome as a whole. The number of genes per RAT was also determined using intersectBED (parameter -u).

For comparison, the coordinates for the non-CEN Earlier-to-Later and Later-to-Earlier RAT sets were randomly shuffled around the genome, excluding functional centromeres, using BEDTools shuffle [114]. These random sets preserved the number of regions and region size of the original RAT sets, and are labeled "EtoL shuffle1" and "LtoE shuffle1" for the Earlier-to-Later and Later-to-Earlier RATs, respectively. When there appeared to be differences in the observed overlap values with genomic features between non-CEN RATs and their corresponding random shuffle sets, a permutation or feature randomization test, as described in [12], was used to assess the statistical significance of the observed value. To do so, the coordinates for the non-CEN RAT sets were randomly shuffled around the genome 1000 times, as described above. [...]

[...] CRM element spanned more than one of the 3-kb windows, the replication signals were averaged using mergeBED (parameter -o mean) to compute a single value for the entire gene or element.
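The permutation test described above can be illustrated with a small, self-contained Python sketch. It substitutes a simple random placement on a single hypothetical chromosome for BEDTools shuffle, does not exclude centromeres, and uses the common (k + 1) / (n + 1) empirical P value convention, which is an assumption rather than a detail taken from the text. All coordinates are invented.

```python
import random

def overlaps_any(region, features):
    """True if the half-open interval `region` overlaps any feature interval."""
    start, end = region
    return any(f_start < end and start < f_end for f_start, f_end in features)

def percent_overlapping(regions, features):
    """Percent of regions that contain or touch at least one feature."""
    hits = sum(overlaps_any(r, features) for r in regions)
    return 100.0 * hits / len(regions)

def shuffle_regions(regions, genome_length, rng):
    """Re-place each region at random, preserving region count and sizes."""
    shuffled = []
    for start, end in regions:
        size = end - start
        new_start = rng.randrange(0, genome_length - size + 1)
        shuffled.append((new_start, new_start + size))
    return shuffled

def permutation_p_value(regions, features, genome_length, n_perm=1000, seed=0):
    """One-sided test: is the observed overlap higher than random expectation?"""
    rng = random.Random(seed)
    observed = percent_overlapping(regions, features)
    as_extreme = sum(
        percent_overlapping(shuffle_regions(regions, genome_length, rng), features) >= observed
        for _ in range(n_perm)
    )
    return observed, (as_extreme + 1) / (n_perm + 1)

# Hypothetical 1-Mb chromosome with three genes and three RAT-like regions.
genes = [(100_000, 105_000), (500_000, 505_000), (900_000, 905_000)]
rat_regions = [(98_000, 110_000), (495_000, 507_000), (899_000, 911_000)]

observed, p_value = permutation_p_value(rat_regions, genes, genome_length=1_000_000)
print(observed, p_value)
```

For a depletion test, such as the lower-than-expected RLG/Gypsy coverage reported in the Results, the comparison inside `permutation_p_value` would be reversed to count shuffled sets with values at or below the observed one.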